Wednesday, December 26, 2007

80% of the solution in a fraction of the time

As can happen, I set out to write this piece once already, only to end up with a slightly different one. Here's another take, bringing Wikipedia into the picture.

First, let me say I like Wikipedia. A quick scan will show I refer to it all the time. I see it as a default starting point for information on a particular topic (as opposed to a narrowly-focused search for a given document or type of document). I don't see it as definitive, but I don't think that's really its job.

Wikipedia would seem a perfect test case for Eric S. Raymond's formulation of Linus's Law ("Given enough eyeballs, all bugs are shallow"). But -- as Wikipedia's page on Raymond dutifully reports -- Raymond himself has said, well, here's how it came out in a New Yorker article:
Even Eric Raymond, the open-source pioneer whose work inspired Wales, argues that “ ‘disaster’ is not too strong a word” for Wikipedia. In his view, the site is “infested with moonbats.” (Think hobgoblins of little minds, varsity division.) He has found his corrections to entries on science fiction dismantled by users who evidently felt that he was trespassing on their terrain. “The more you look at what some of the Wikipedia contributors have done, the better Britannica looks,” Raymond said. He believes that the open-source model is simply inapplicable to an encyclopedia. For software, there is an objective standard: either it works or it doesn’t. There is no such test for truth.
Let's start right there. Software doesn't simply either work or not. You can't even put it on some sort of linear, objective "goodness" scale. Even in cases where you'd think software is cut and dried, it isn't. Did you test that sort routine with all N! permutations of N elements? Of course you didn't. Did you rigorously prove its correctness? How do you know your correctness proof is correct? Don't laugh: Mathematicians routinely find holes in each other's proofs, in some cases even after publication.
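
To put a rough number on that, here is a minimal Python sketch of what "testing every ordering" would mean. Everything in it is made up for illustration: insertion_sort is just a stand-in for whatever routine you're testing, and check_all_orderings is a hypothetical helper, not anyone's real test suite.

```python
from itertools import permutations
from math import factorial


def insertion_sort(items):
    """Hypothetical routine under test: a plain insertion sort."""
    result = list(items)
    for i in range(1, len(result)):
        current = result[i]
        j = i - 1
        # Shift larger elements one slot right to make room for current.
        while j >= 0 and result[j] > current:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = current
    return result


def check_all_orderings(sort_fn, n):
    """Run sort_fn on every ordering of 0..n-1 and return how many were checked."""
    expected = list(range(n))
    checked = 0
    for ordering in permutations(range(n)):
        assert sort_fn(ordering) == expected, ordering
        checked += 1
    return checked


if __name__ == "__main__":
    # Exhaustive checking is fine for tiny inputs...
    print(check_all_orderings(insertion_sort, 6))
    # ...but the number of orderings grows as N!, so it is hopeless well before N = 20.
    print(factorial(20))
```

Six elements means 720 orderings, which is nothing. Twenty elements means about 2.4 quintillion, and that's still with distinct keys of a single type. So in practice "it works" means "it worked on the cases somebody thought to try."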

But most software is nowhere near this regime. Often we don't even know exactly what we're trying to write when we set out to write it (thus much of the emphasis on "agile" development techniques). In the case of something like a game, or even a website design, most of what we're after is a subjectively good experience, not something objectively testable (though ironically games seem to put a bigger premium on basic correctness, since bugs spoil the illusion).

It's not even completely clear when software doesn't work. If a piece of code is supposed to do X and Y, but in fact does Y and Z, does it work? It does if I need it to do Y or Z. What if it hangs when you try to do X, but there's an easy work-around? What if it hangs at random 10% of the time when you try to do X, but that's tolerable and nothing else does X at all? What if it does X if a coin flip comes up heads, but might not if it doesn't? I'm not making that one up. See this Wikipedia article (of course) for more info. What if it's an operating system and it just plain hangs some of the time? Not that that would ever happen.

All of this to say that I doubt that software and encyclopedia entries make such different demands on their development process. And as a corollary, I think the results are about the same. Namely, there are excellent results in some cases, reasonable but not excellent results in many cases, and occasional out-and-out garbage.

Here's what I think goes on, roughly, in both cases:
  • Someone comes up with an idea. That person may be an expert in the field, or may just have what looks like a neat idea.
  • The original person produces a first draft, or perhaps just a "stub", or an "enhancement request".
  • If no one with the expertise to take it further is persuaded to do so, it stays right there indefinitely, or may even be purged from the system (perhaps to re-appear later).
  • If the idea has legs, one or more people take it up and make improvements.
  • Typically, one of three stable states is reached:
    • It's perfect. Nothing more than minor cosmetic changes can be added. New ideas along the same lines typically become their own projects.
    • It's good enough for everyone currently involved. That may not be particularly good, but no one can be persuaded to go further. This may be the case right out of the gate, or after several rounds of fixes by a single originator.
    • It's not good enough, but there is no agreement on how to take it further. Work may grind to a halt as competing fixes go in and come out, or the project may split into two similar projects, with better or worse sharing of common material and effort.
Thinking it over, this process is not unique to open source. The magic of the open approach is that the bigger the pool of participants, the bigger the chance that an idea with legs will get supporters and get fleshed out, and the faster it will get to a stable state. In our imperfect world, that stable state is generally short of perfection. Put the two together and you have 80% of the solution in a fraction of the time.

That said, there are some differences between prose and software. I've argued above that software isn't hard and fast. It's soft, in other words. But prose is even softer. As a result, there is greater potential for disagreement on where to go, and in case of disagreement, there looks to be a better chance of thrashing back and forth with competing fixes, as opposed to moving forward but with separate (and to some extent redundant) solutions.

Wikipedia does seem to attract more vandals, but this is not necessarily because it's not software. It may also be because it openly invites frequent edits from a very large pool and changes are moderated after the fact. Open software projects, particularly critical pieces like kernels and basic tools, tend to require changes to pass by a small group of gatekeepers before being checked in. Conversely, some wikis are moderated.

As usual, this is all just my rough "figuring it out as I go along" guess, not anything with actual numbers behind it, but that's my story and I'm sticking to it for now.

1 comment:

David Hull said...

Note to self: "I'm finding this really annoying right now" and "it's bad" continue to be two different things.