If you don’t write it down, it never happened

August 5, 2009 | Technical

You’re trying to reconfigure a service to do something new. Digging into the config files, you see that everything’s been modified heavily, but it doesn’t seem to make a lot of sense. Everything’s currently working, so it must be right, but why was it done like this in the first place? It looks like this can be simplified, but… you’re not sure. What if it needs to be this complicated for a reason?

Or perhaps it’s 2am, and you’ve just been notified by the monitoring system that a critical system you don’t have a lot of experience with has gone down. Logging in to the server you thought the service was on, you realise that this isn’t the right place, so you waste precious time tracing the network through the load balancers and proxies to where the service really lives. Then you realise that you don’t know where any of the config or binaries are. By now you’ve been ferreting around for 20 minutes, your SLA is just about blown, and you still have no freaking idea what’s going on with any of this…

Maybe the problem is that you’ve got something on your network that’s causing problems from excessive broadcasts on some random IP address. All you’ve got is the MAC address, but what server does that map to? You’ve got no idea.

Then there’s the frustration of being asked by the boss to set up a new instance of some minor service that nobody’s touched in two years as a test platform for a new client. Nobody else in the office remembers how it was done last time, only that it wasn’t a whole lot of fun. Of course, the boss doesn’t see why you can’t whip this one up nice and quick, given that “we’ve already got one of these over here, why should it take too long to do another one?”

By now, you probably know what this is all about (if the post title didn’t give it all away to begin with). We’re talking Documentation. Everyone’s heard a hundred reasons for why you should write documentation. Here’s number 101:

A system really only survives as long as people understand how it works and how to maintain it. The moment that information is lost, the system is basically dead: sooner or later someone’s going to come along, tear it down and replace it with something else, something that they understand. Of course, the knowledge of how that replacement works will sooner or later be lost too, and the cycle will repeat.

If you want to maximise the life of the systems you build, then, you need to ensure that the crucial details about them aren’t lost. The only way to do that is to write things down. Your memory will fade, you won’t always be around to answer questions, and trying to figure things out after the fact is a pain in the arse.

Of course, it’s all well and good to blather on about how wonderful documentation is, but the difficult bit is how do you write good documentation? Well, I can’t say I’ve got all the answers, but in the next few installments of The Adventures of Project Starbug, I’ll describe how we laid out the documentation for this new project.