On Simplicity

Again and again I see designs that are just... wrong. Given the choice between doing it right and doing it simple, the choice is 'simple'. I'm not the only one to notice this.

Unix is a traditional example. It's as simple as possible. Look at the file API. Let's say you want to read some data: you read(2) it. If you're streaming the whole file in, a bit of prefetching would be nice. If you're reading regularly-sized records from all over the file, you want different behaviour. The API has no notion of record sizes or advance knowledge of access patterns, so any optimisation has to be heuristic.
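Here's a minimal sketch of what that looks like from the application side (the file name and record size are made up for illustration): the kernel sees nothing but a stream of read() calls, and has to guess the access pattern from the offsets alone.

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    #define RECORD_SIZE 512           /* arbitrary record size, for illustration */

    int main(void)
    {
        /* "records.dat" is a hypothetical file of fixed-size records. */
        int fd = open("records.dat", O_RDONLY);
        if (fd < 0) {
            perror("open");
            return 1;
        }

        char buf[RECORD_SIZE];
        ssize_t n;

        /* The kernel sees only a sequence of read() calls; whether we're
         * scanning the whole file or hopping between records, it has to
         * guess at prefetching from the offsets alone. */
        while ((n = read(fd, buf, sizeof buf)) > 0) {
            /* ... process one record ... */
        }
        if (n < 0)
            perror("read");

        close(fd);
        return 0;
    }

(Hints like posix_fadvise(2) got bolted on decades later, and they're still only advisory.)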

Perhaps worse is atomic file update. That is: write to a new file, hard-link the old file to a back-up file, move the new file over the old one. Moreover, there's magic in the semantics of the file system to make this work: the writes to the new file should be committed to disk before the move is, so that you don't get a partially-written file in the case of a crash. Ugly hacks to do a conceptually simple operation.
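Spelled out as a sketch (the file names are purely illustrative, and error handling is minimal), the dance looks something like this:

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    /* A sketch of the recipe for atomically replacing "config". */
    static int atomic_update(const char *data, size_t len)
    {
        int fd = open("config.new", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            return -1;

        /* Write the new contents and force them to disk *before* the rename,
         * or a crash could leave a renamed-but-empty file behind. */
        if (write(fd, data, len) != (ssize_t)len || fsync(fd) != 0) {
            close(fd);
            return -1;
        }
        close(fd);

        /* Keep the old version around as a back-up via a hard link
         * (ignoring failure if "config" doesn't exist yet). */
        unlink("config.bak");
        link("config", "config.bak");

        /* Atomically swap the new file into place: after a crash you see
         * either the old file or the new one, never a half-written mess. */
        return rename("config.new", "config");
    }

    int main(void)
    {
        const char *contents = "hello, world\n";
        return atomic_update(contents, strlen(contents)) == 0 ? 0 : 1;
    }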

Or look at TCP/IP. The internet's favourite way of distributing video. There's no quality of service built into basic TCP/IP, and a packet-switched network is just the wrong solution for that kind of problem.

Way back, PalmOS was another example (like MacOS before it) - a co-operatively multitasking OS. Simple to design and implement, great for lightweight hardware, able to beat its pre-emptively multitasked competitors, but you knew that in the end it would be running on more powerful hardware doing more complex things, where the original design would look horrendously underpowered.

This is really the key. A complex up-front design will be bad, since you don't understand all the corner cases, people don't want all the infrastructure yet, the current hardware can't really support it, and you'll be trounced in the marketplace by a simple design that works right now.

Really good design is one that can evolve. Unix started off life as an OS for an impressively under-spec'd machine. Amongst other limitations, it had one thread per process, no real-time guarantees, a uniprocessor model, and certainly no graphics or networking. Even after Unix had been extended, Linux started as an x86-specific (and uni-processor) monolithic-kernel Unix clone. Somehow, the design has been extended in a way that has not fundamentally compromised it. Sure, it's warty, but it's somehow scaled. TCP/IP's not dissimilar. Despite its grimness, video over TCP/IP is working out mostly ok.

Sometimes, even ignoring the evolution over time, doing the simple wrong thing is better than the right thing. When I heard about Erlang's error model (basically, 'if something goes wrong, explode, and have a hierarchy of processes that can deal with children exploding, rather than attempting to recover'), I was horrified. However, it sounds like the right thing, in the end. Error handling is tricky to write, tricky to test, and errors in error handling could make for a nice cascading failure. Playing it dumb may make it safer and more reliable than trying to think hard.
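Erlang expresses this with supervisors; the nearest Unix-flavoured analogue, sketched below in C rather than Erlang, is a parent process that simply restarts its child whenever it dies, instead of the child trying to limp on. The worker here is a placeholder that just crashes:

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    /* Placeholder worker: in the 'let it crash' style it makes no attempt
     * to recover from trouble; it just dies and lets the parent sort it out. */
    static void do_work(void)
    {
        sleep(1);
        exit(EXIT_FAILURE);           /* simulate something going wrong */
    }

    int main(void)
    {
        /* A crude one-level supervisor: fork a worker and restart it whenever
         * it dies, instead of writing recovery code inside the worker. */
        for (;;) {
            pid_t pid = fork();
            if (pid < 0) {
                perror("fork");
                return 1;
            }
            if (pid == 0) {
                do_work();
                _exit(EXIT_SUCCESS);
            }

            int status;
            waitpid(pid, &status, 0);
            fprintf(stderr, "worker %d died, restarting\n", (int)pid);
            sleep(1);                 /* back off a little between restarts */
        }
    }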

Einstein's maxim to make it as simple as possible, but no simpler, seems to have some relevance. The lesson is: make it simpler than you think 'possible' would be - you might be surprised. On the other hand, make it extensible, so that success and scope-creep don't induce long-term failure.

Posted 2012-02-27.