Musings on programming

There's a thing I've always wondered about programming, which is what exactly it takes to do it.

I know a great many people who self-identify as computer geeks, many of whom have a formal computer science education and/or a lot of self-acquired knowledge. A lot of them know very large amounts about computer languages, about algorithms and data structures, about principles of software architecture, about testing and debugging, error handling strategies, and so on. But as far as I'm aware, quite a few of those people have never actually put their money where their mouth is, sat down and written a big useful program from the ground up. A lot of people are competent to fix bugs in other people's programs, to submit small patches, to analyse, criticise and compare, to script and to sysadmin, and to write small utilities, but very few seem to actually write programs in any serious fashion. So I ask myself, what is the quality which separates those of us who do do that from those of us who don't?

Because it's not anything in the above list. I'm pretty good with the detailed corners of the C language, for example, but there are people who know more about them than I do, and yet I write big C programs and they don't. There are people whose formal education in program design, architecture and development methodology vastly outstrips mine, and who profess shock and horror at the idea that (for example) I might leave any significant aspect of a program's design undecided until I get to actually writing that part of the program; but I write big things that work, and to the best of my knowledge they don't. I am forced to the conclusion that the quality which separates me and my fellow big-program-authors from these people cannot be that we have more skill than them in any specific aspect of the art and craft of coding so far mentioned.

So what is it?

Desire is one obvious explanation: that it's not a question of whether people can do the job, but rather a question of whether they want to. Perhaps many geek types haven't written any big programs simply because they haven't happened to come across a problem which needs a big program to solve it, which hasn't been solved satisfactorily already, and which they want solved badly enough to go to all that considerable effort. Me, I enjoy writing big programs, so it doesn't take all that much desire for a particular problem to be solved to motivate me to do so.

Willpower is another possibility: determination and staying power. Perhaps many people start large jobs, but not everybody manages to get to the other end of a long and hard piece of work without losing confidence and/or motivation and/or energy, and giving up.

Neither of those hypotheses is at all specific to programming; either would apply just as well to other fields of lengthy creative endeavour such as writing a novel. (In fact, the second one rather reminds me of NaNoWriMo, which I've known many people to start and very few to complete.) So now I'll list some possible programming-related explanations.

Perfectionism is a thought brought to my mind precisely by the fact that many people who don't write big programs know more than me about the theory of how to do so. Perhaps there's a sweet spot for the amount of perfectionism that it's good to have if you're going to write large programs. Too much, and your rate of code output drops too far and you can't write anything big at all; too little, and your code is unreliable and buggy and it becomes impossible to add more features to it because every time you add a new thing all the existing bits break. Perhaps somewhere in between there's a happy medium, much like the theoretical optimum taxation rate, and people who are able to write big programs are those people who happen to hit just the right degree of perfectionism.

Design is something I've alluded to above. If you don't design at all before you start, you'll run into dead ends a lot of the time and have to go back and do a big rewrite, and every time you do that you're taking a motivation hit and putting further strain on your limited staying power. So doing enough design in advance to avoid most serious-rewrite incidents is likely to maximise your chance of getting the thing finished, but at the same time doing too much can easily mean (and I speak from experience) that you never even get started on the actual coding. Perhaps there's a sweet spot to be hit here as well.

Method might have effects on the willpower issue as well. When I write a big program, for example, I tend to start with a lot of foundation and infrastructure layers which don't have much in the way of visible or tangible functionality, so often I can sit there and write code for weeks before I see so much as a ‘hello, world’; that's a stage that I need a lot of willpower to persevere with. At some point I move into a stage where I'm writing code that adds clear and obvious functionality to the program. This stage typically goes very fast, because a lot of the functionality was already there in the infrastructure layers and merely needs enabling or invoking; that makes it particularly easy, and also particularly motivating and fun.

One of my contemporaries at school thought this was odd. ‘When I write a program,’ he said, ‘it starts off doing nothing and gradually does more and more as I keep typing. But your programs do nothing for weeks and weeks and then suddenly they do everything!’ I've learned through the years that my approach works well for me, but it seems likely that if that guy had tried the same approach he might well have lost motivation long before reaching the point of seeing any worthwhile results. So I wonder if picking an approach that works for you is important, which in turn hints that self-education might have a natural advantage over externally applied training.

Memory is the last of the options I've thought of. There are two things you have to be expert on when writing a big program. One is all the fixed stuff you learned in computer science (or wherever else you picked it up from): the details of the language and OS you're using, how to write efficient algorithms, not forgetting that testing is a good idea, that kind of thing. The other is just as big a body of knowledge, but you invent it all yourself. This is the knowledge of how your particular program fits together: what modules or objects exist, what their relationships are, how many copies of each one there might be, whether there are threads, how control flows … Anyone writing a large program has to essentially create an entire field of study, an entire thing that you can be an expert or a novice in, from the ground up. And if you don't have the mental capacity to be an expert in that field, you cannot do expert-level work on your own program; so this hypothesis would imply that there's a limit to how big a program any given person is usefully capable of writing, and that limit is set by the amount of state they can hold in their head about stuff they've just made up.

Of course, you can work to reduce the amount of state you have to maintain in your head; this is a large part of what things like modular design are all about. So perhaps by careful modular design you reduce the amount of stuff you have to keep track of to something sublinear, maybe even as little as O(log N), rather than O(N). It's still the case that a 20,000-line program could require you to store five or ten times as much mental state as a 200-line program; that's not nearly as bad as one hundred times as much, but it still leaves plenty of scope for going over your personal limit.
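
To put rough numbers on that comparison, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the names growth_ratio, MODELS and compare are hypothetical, and so is the idea that the state you hold in your head follows a clean linear, square-root or logarithmic curve in the number of lines. It only shows how the ratio between a 200-line and a 20,000-line program changes depending on which curve you pick.

```python
import math

# Hypothetical growth models: how much "state in your head" a program of
# n lines might demand. These curves are illustrative assumptions only.
MODELS = {
    "linear, O(N)": lambda n: float(n),
    "square root, O(sqrt N)": math.sqrt,
    "logarithmic, O(log N)": math.log,
}


def growth_ratio(small_lines, large_lines, model):
    """How many times more state the larger program needs under `model`."""
    return model(large_lines) / model(small_lines)


def compare(small_lines=200, large_lines=20_000):
    """Return the large-to-small ratio for each hypothetical model."""
    return {
        name: growth_ratio(small_lines, large_lines, model)
        for name, model in MODELS.items()
    }


if __name__ == "__main__":
    for name, ratio in compare().items():
        print(f"{name}: {ratio:.1f}x")
```

Run as a script, this prints roughly 100x for the linear curve, 10x for the square-root curve and a little under 2x for the logarithmic one. On these toy curves a literal logarithm would make the big program less than twice as demanding, so a figure of five or ten times corresponds to growth somewhere between logarithmic and square-root; either way it is a long way short of one hundred.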

So with this many explanations, it should be clear that my problem is not a shortage of possible answers (although if anyone has a plausible one I've omitted, I'd be interested to hear it). My problem is that I have so many of them that I don't know which is right! It could, of course, very well be that all of the things I've mentioned are true to some extent; but I'd really like to know whether that's the case, or whether one is more significant than the others, or whether half (or all) of the things I've listed are complete drivel.

Because after I find out the answer to that, what I want to know next is how you teach the ability to write a big program that works properly. It seems to me that knowledge of specific things like Unix or C is widespread, the ability to tinker with an existing code base is reasonably common, and the ability to talk endlessly about the theoretical basis of programming is in plentiful supply from the graduate pool of any half-decent university; but the skill and/or will to bring all those things together and get an entire job done is regrettably rare, and the world could do with more people who could do it.

I apologise, incidentally, for the somewhat self-congratulatory note in this entire piece of writing. I've hesitated to write any of it down for a while, because it smacks so much of ‘I'm great, and nobody else is’; but I think it is undeniably true that I am capable of doing a particular type of useful thing, and that many other people are not, and I am curious to know why. It doesn't make me or people like me better than everyone else, of course; many of those other people can do things I can't. (Some of them even manage to be attractive to women once in a while, for example. :-)

So I'm interested in comments from anyone who's still reading. If you consider yourself to be somebody who can and does originate large software projects which succeed, then do you have any idea what might be the key skill that makes you able to do that and other people not? If you've tried to do so and failed, what was difficult about it? If you've shied away from trying (for anything other than a really obvious reason like not actually knowing how to program at all), why?