Wednesday, June 30, 2010

Maybe I should promote myself

No, I'm not talking about self-promotion. I'm thinking I'll get myself a fancy title.

For what, writing a blog? Well, according to a snarky column on title inflation in the Economist, "Southwest Airlines has a chief Twitter officer. Coca-Cola and Marriott have chief blogging officers," and "Everybody you come across seems to be a chief or president of some variety."

In such an environment it should come as no surprise that there is no shortage of online job title generators (they're slightly amusing and easy to script, a surefire formula). Picking one more or less at random, I came up with "Senior Communications Analyst", so say hello to Field Notes' newly-minted Chief Senior Communications Analyst for Blogging and New Media.

My corner office awaits.

More random graph theory on the web

Picking up an earlier thread, I went looking for papers on non-uniform random graphs, that is to say, connected networks where some members have many more connections than others, but that's about all we know. I turned up an interesting one by Fan Chung and Linyuan Lu describing the characteristics of graphs more like ones you'll find in the context of the web.

In particular, social networks and others commonly encountered appear to follow a power law, meaning that, for some exponent b, the number of people with n connections is roughly proportional to 1/n^b. In social networks, b tends to be between 2 and 3, so for example with b near 3 there might be around 1000 members with just one connection, about 125 with two, about 37 with three, and so on down to only a handful having eight or more.

Chung and Lu show that under such conditions there will very likely be a core of closely connected people, and that two people chosen at random will very likely be close to each other, but that there will also be some significantly less-connected members. Connecting two people outside the core will generally take considerably more steps than connecting most pairs of people.

So: six degrees of separation (or whatever it really is) in most cases, but considerably more in a few cases.

Vague-sounding terms like "very likely," "some" and "significantly" have precise mathematical definitions. The details are in the paper if you're interested.
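If you'd like to poke at this yourself, here's a minimal sketch in Python (my own toy, not anything from the paper) using the networkx library, whose expected_degree_graph generator builds a random graph from a list of expected degrees in the same spirit as Chung and Lu's model. The size, exponent and scale below are arbitrary choices for illustration.

    # Rough sketch (not from the paper): a random graph whose expected degrees
    # follow a power law, built with networkx's Chung-Lu-style generator.
    import networkx as nx

    n = 2000      # number of nodes; arbitrary
    beta = 2.5    # power-law exponent, in the 2-to-3 range typical of social networks

    # Rank-based power-law weights: node i expects roughly c / (i+1)^(1/(beta-1))
    # connections.  The min() caps the biggest hubs so that no pairwise edge
    # probability exceeds 1.
    weights = [min(50.0, 300.0 / (i + 1) ** (1.0 / (beta - 1))) for i in range(n)]

    G = nx.expected_degree_graph(weights, seed=42, selfloops=False)

    # The distance results are about the well-connected part, so look at the
    # largest connected component.
    core = G.subgraph(max(nx.connected_components(G), key=len))
    print("nodes in largest component:", core.number_of_nodes())
    print("average distance:", round(nx.average_shortest_path_length(core), 2))
    print("worst-case distance (diameter):", nx.diameter(core))

On runs like this the average distance should come out quite small, while the worst case, dragged out by the least-connected stragglers, is noticeably larger: the paper's picture in miniature.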



On a recent plane flight, I looked at the route map to see if the airline's network behaved like a social networking graph. Airline networks are specifically designed to connect destination cities well but as cheaply as possible, keeping in mind that cost depends on a number of complex factors. In particular, you want to minimize the number of stops needed to get from point A to point B.

Did the airline's network follow a power law? It did, but only up to a point. Most cities had only a single connection, fewer had two, and fewer still, by about the same proportion, had three. And then there were the two hubs, each with dozens of connections. There was nothing in between. You could get from almost anywhere in the network to almost anywhere else by flying to a hub and then on to the final destination — not a big surprise.
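For contrast with the sketch above, here's an equally rough toy model of a two-hub network (made-up numbers, not the airline's actual route map), just to see how the degree counts and hop counts come out.

    # Toy two-hub airline network; the numbers are invented for illustration.
    import collections
    import networkx as nx

    G = nx.Graph()
    G.add_edge("HUB1", "HUB2")                   # the hubs connect to each other

    for i in range(60):                          # most cities: one connection, to a hub
        G.add_edge("city%d" % i, "HUB1" if i % 2 == 0 else "HUB2")
    for i in range(60, 66):                      # a few cities get service to both hubs
        G.add_edge("city%d" % i, "HUB1")
        G.add_edge("city%d" % i, "HUB2")

    degree_counts = collections.Counter(dict(G.degree()).values())
    print("connections -> number of cities:", dict(sorted(degree_counts.items())))
    print("average hops:", round(nx.average_shortest_path_length(G), 2))
    print("worst-case hops:", nx.diameter(G))

Every city reaches every other in at most three hops (city, hub, hub, city), and there's nothing between the lightly-connected cities and the hubs, which is roughly the shape described above.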

The interesting thing is that airline networks specifically don't behave like social networks, precisely because social networks tend to leave some members out in the cold, and in air travel that just won't do. It may be worthwhile in many cases to emulate some naturally arising structure, but it's not always the best choice. Sometimes you actually have to plan.

Wiki without the pedia

While tagging my previous post, I noticed that I had tags for both "Wikipedia" and "wiki". There are four articles (now five, of course) tagged "wiki," three of which are more or less to do with Wikipedia. The other is from the Baker's Dozen series, speculating about what role the wiki approach may play in the next generation of search engines.

What really stands out to me about wikis is that there's Wikipedia and then there's everything else.

Everybody's heard of Wikipedia by now and quite a few people have tried their hand at editing it. As a result, there is a well-known tool for editing Wikipedia (MediaWiki) along with a well-established culture and etiquette. There is also enough of a critical mass that, for the most part, articles tend to improve over time.

And then there's everything else. Don't get me wrong. There are some good wikis out there. But there are also an awful lot of half-baked ones. These tend to crop up when a small software shop or similar organization decides that it needs a wiki to, say, document its software architecture and development process. Well, why not? Wikipedia is pretty successful, and software shops are always looking for lightweight, dare I say "agile" ways of tracking what's going on.

In practice, there are several pitfalls:
  • Wikipedia has a lot of eyes. According to Wikipedia, Wal-Mart has about 2 million employees, while Wikipedia has close to 13 million registered users. Granted, Wikipedia claims only about 90,000 "active contributors", but that's still about the same headcount as Microsoft. Chances are, your company isn't that big.*
  • It used to be that every computer science undergrad wanted to invent and implement a programming language. Somewhere around the turn of the century that ambition seems to have shifted to writing a wiki engine (which typically has at least a toy programming language in it somewhere). There are so many to choose from and, even though approximately one of the choices has a huge user base and all that goes with it, the odds are that whoever set up your wiki chose something "better" than MediaWiki.
  • Wikis were designed for quickly throwing together webs of loosely structured text, and not for any of several other things they sometimes get used for. A wiki page generally doesn't know what role it has in a bigger picture. A wiki is not a bug tracker. It is not a release planning system. It doesn't know that feature X was promised to FooCorp for release 2.1 whose schedule has just slipped. No one told it any of that. Ah, but that's where the toy programming language comes in ...
  • Many shops are content to limit wikis to the smaller role of gathering together bits of wisdom that people tend to email each other as the occasion demands. "Why did you design it this way?" "Well ..." The problem is that this conversation tends to happen when, for any of myriad reasons, the design wasn't documented close to the code, so someone is now asking the author. Ideally, the original designer goes and documents the code and replies with a link to the new doc. Alternatively, if the conversation is taking place on an archived list, the answer will be in the archives for future generations. In either case, it's not clear that updating a wiki and replying with a link to that would be an improvement.
  • Wikis need gardening to combat various forms of rot. Typically there's even less time for this, particularly in a small shop, than there is for updating the wiki in the first place.
Wiki writing is not magically easier than any other kind of writing. Maintaining a wiki takes time and dedication. Wikipedia has a lot of dedicated contributors, including many who specialize in gardening and other less glamorous jobs. If your organization is not specifically in the business of producing wiki pages, chances are the wiki will reflect that.


* On the other hand, chances are your wiki is not going to be as big as Wikipedia. Nonetheless, (I claim) there are economies of scale that kick in as the user base gets larger. In a large community, people can specialize, for example in maintenance tasks.

[Wikipedia continues to dominate the world of Wiki, even neglecting its sister projects.  The one notable exception I can think of is TV Tropes.  I doubt it has anywhere near the readership of Wikipedia, but it's still the rare example of a publicly-edited non-Wikipedia wiki with a significant readership -- D.H. Dec 2015]

Wikipedia moved my food dish (slightly)

Wikipedia has recently undergone a facelift. Just as a casual user I've noticed approximately two things:
  • The buttons and stuff are shinier.
  • The search field is now up top instead of over to the side.
I was somewhat annoyed by that second item for a bit, but I'm already used to it now, and I can see the UX value in putting such a vital, high-volume element in a more prominent place.

What else did they do? The new features link mentions a couple of new editing widgets (which I may explore next time I edit a page), a new version of the logo (part of the general new shininess) and "improved search suggestions". They've also made it clearer whether you're reading or editing a page, but I've never had a lot of trouble with that distinction.

Of these, the improved search suggestions are the real winner. Search suggestions rock, and I'd say that even if I didn't work for Google.

The internet ate my brain. I think.

The Economist has a quick review of Nicholas Carr's The Shallows: What the Internet Is Doing to Our Brains. The gist is that the constant context-switching involved in web surfing is "already damaging the long-term memory consolidation that is the basis for true intelligence."

That "already" -- the reviewer's term and not necessarily Carr's -- is a telling bit of boiler plate, adding a bit of urgency in suggesting that this is the beginning of a long-term trend that will surely rot our brains completely before we know it.

Knee-jerk skepticism:
  • Just how much do we actually know about "the long-term memory consolidation that is the basis for true intelligence", or "true intelligence" for that matter?
  • Suppose we can show that, when web surfing, our brains behave in some sort of inattentive, scattered mode. Does that mean that we've lost the ability to think in any other mode, or just that that's how we think when we're surfing?
  • If the web is rotting our brains by changing our patterns of thought, is there a corresponding change in, say, the rate of technical innovation (by some reasonably objective measure)?
  • More subjectively, has there been a change in the culture? My understanding is that contemporary culture is vapid, cheap and degraded and that things were much better in our parents' day. If so, that represents exactly zero change from fifty, or a hundred, or a thousand years ago.
  • Returning to that "already" above, assuming that there is some sort of measurable effect, is it the beginning of a trend, the end of some sort of adjustment period, a temporary blip or what?
Of course, this is just an off-the-cuff reaction to someone's review of a book that I've not read a word of -- which possibly serves to support Carr's original point.


Many years ago I was sitting in the living room of a house I lived in when a roommate from Europe wandered in and, seeing I was flipping through channels, asked if anything was on. This was back when European TV typically had way fewer channels than a US cable setup. Without thinking, I started going through the channels in order, at a steady beat of one every couple of seconds, narrating as I went: "That's baseball ... that's just bad videos and commercials ... I've seen that episode already ... that guy's just obnoxious ... that's just Gilligan's Island." The roommate's jaw steadily dropped. "How can you know all that just from a second or two?"

A fair question, but the sad truth is that once you've been through a selection of several dozen channels a few times too many, it becomes all too familiar. Sometimes just the channel number is enough, sometimes it's easy to recognize a face or a setting.

To me, the disquieting bit was not that my brain could pick up cues from previous experience that quickly. That particular circuitry has obvious survival value and has doubtless been in our wiring in some form or another for a good long time. The disquieting bit was that I had the information in my head to retrieve in the first place. I'd obviously spent enough time planted in front of the TV to recognize Bob Denver on sight. With all due respect to the late Mr. Denver, that's not necessarily a happy realization.

Was TV changing the way I processed information, or had my TV watching skewed the information I had on hand to process? Maybe both?

And thus has a quick throwaway post morphed into a not-quite-so-quick rumination on the nature of memory. Am I supposed to have the attention span to still be writing and revising this, or was I supposed to have quit after the first 140 characters?

Friday, June 25, 2010

Webbys: who are these people?

Looking at the 2010 list of Webby award winners, I see three basic categories:
  • Old-media names that I've heard of (The New Yorker, The Economist, Roger Ebert, Amy Poehler, Zach Galifianakis, HBO, Sesame Street ...)
  • New-media names I've never heard of (wonga.com, Record Tripping, Love Letters to the Future, BITTER LAWYER, Mubi, Nawls ...) [and none of these come to mind re-reading in 2015]
  • New-media names I've heard of but don't really use (Twitter, Metacritic, Pandora at the moment)
  • Vint Cerf (file under "demigods")
So what does this mean? Probably some combination of
  • Old media are alive and well on the web
  • One purpose of awards like the Webbys is to bring worthy unknowns to the world's attention
  • I'm a stick-in-the-mud who doesn't even tweet (so why am I writing about the web, again?)

Football on the web

(and by "football" I mean FIFA, not NFL)

Other sources drive plenty of web traffic, but if you want lots of people on a site all at once, a sporting event is the way to go. The latest case in point, of course, is the World Cup, which, according to ESPN, was producing over 12 million hits per minute, or about half again the traffic for the 2008 US presidential election (the previous record holder).

Likewise, Twitter has been seeing upwards of 3,000 tweets per second, comparable to the Lakers-Celtics NBA final. Normal traffic is more like 700 tweets per second.

Lest talk of presidential elections and NBA finals give too much of a US-centric impression, ESPN cites a measurement of "total mentions in social media" for the month leading up to the Cup. Far and away the top entry, ahead of hosts South Africa and well ahead of the US, are England.

But then, England have a fair bit to talk about.

Wednesday, June 23, 2010

Vint Cerf's webby

A while ago I extolled TCP as a "gold standard" and now, it appears, it's finally getting the credit it deserves. Even Tim Berners-Lee has gotten into the act, making many of the same points, albeit with a bit more polish, in introducing internet pioneer, Webby recipient and Google fellow Vint Cerf. Cerf's response, a Webby-standard five-word acceptance speech, would be a cliche coming from most of us. Considering the source, it's startlingly audacious: "You ain't seen nothing yet."

So, whatever it is that could make an engineering masterpiece like TCP, and decades of other career highlights, look like "nothing", you can be assured Cerf is hard at work on it.

Thursday, June 10, 2010

Putting the "world-wide" in "world-wide web"

Here's a lovely piece of concept art: In 2006, video blogger Ze Frank challenged his viewers to construct an "earth sandwich". How do you make an earth sandwich? Just put two slices of bread at exactly opposite points on the earth (bow bow bow).

Within about a month, two contestants had placed half-baguettes in New Zealand and the Spanish countryside, accomplishing the feat. The two had (I assume) met online through Ze Frank's show, had doubtless exchanged email and, of course, produced their own online videos of the whole adventure.

In other words, it could only happen on the web, right? Well ... yes and no.

Certainly it could only have happened the way it happened on the web. That's almost tautological. But is there any part of the concept that couldn't have been done without the web? Certainly people in New Zealand and Spain have been able to communicate and share a common idea for much longer than the web has been around. Neither is the earth sandwich the first piece of concept art on a global scale. David Barr's Four Corners project, completed between 1976 and 1985 without benefit of the web or GPS, comes to mind.

On the other hand, I expect the web makes it much, much more likely that such things will happen, by providing a cheap and easy way to broadcast an idea to a global audience. It also provides a cheaper and faster way for participants to communicate with each other by providing both the time-shifting of mail (I don't have to read while you're writing) and the speed of the telephone (we don't have to wait for a message to be physically transported around the world). Time-shifting is particularly useful when the participants are twelve time zones apart.


My inner engineer questions the accuracy of both the earth sandwich and Four Corners projects, since the earth isn't perfectly round. It's definitely a problem for the Four Corners. Whether it's a problem for an earth sandwich would depend on the fine points of the GPS coordinate system, though at least the largest source of non-roundness, the equatorial bulge, shouldn't be a problem for points the same distance from the equator. On the earth sandwich page, Doc Searls doesn't even pretend to accuracy -- Cambridge is opposite the Indian Ocean, nowhere near Singapore.
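For what it's worth, the antipode arithmetic itself fits in a few lines. Here's a back-of-the-envelope sketch of my own, treating the earth as a perfect sphere (exactly the simplification being grumbled about) and assuming the Cambridge in question is Cambridge, MA:

    # Back-of-the-envelope antipode finder; treats the earth as a perfect sphere.
    def antipode(lat, lon):
        """Return the point diametrically opposite (lat, lon), in degrees."""
        anti_lat = -lat
        anti_lon = lon - 180.0 if lon > 0 else lon + 180.0
        return anti_lat, anti_lon

    # Roughly Cambridge, MA: 42.37 N, 71.11 W
    print(antipode(42.37, -71.11))
    # -> (-42.37, 108.89): open Indian Ocean, well southwest of Australia

That puts Cambridge's opposite point in open water, consistent with the eyeballing above.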

Wednesday, June 9, 2010

i.e.

Consider two prefixes: i and e, both in lowercase, e. e. cummings-style. Once they were emblematic of all things new and shiny and dot-com-y. Where are they now?

e- still has its webby connotations, quite possibly because e-mail is still prevalent. We still have eBay, eHarmony, esurance, Epinions, eFileCabinet and others, though perhaps not as many as one might expect.

i-, on the other hand, was blatantly hijacked by Apple. It used to mean "internet-" or something, but through some masterstroke of Steve Jobs's patented legerdemain, it now means "cool, shiny and Apple-y". In fact, according to Wikipedia, the name "iPod" was already trademarked, for internet kiosks, when freelance copywriter Vinnie Chieco decided the prototype reminded him of 2001: A Space Odyssey, particularly the phrase "Open the pod bay doors, HAL!" and proposed the name. How the initial i got attached is not clear, at least not to me.

While Jobs didn't come up with the name himself, he must have made the final call on going with it. The sleight-of-hand was being able to market something with no direct internet connectivity with such a name (the much webbier iTunes didn't come along for another couple of years).

Two other affixes from the era still seem to have life in them. The notion of calling the customized view of FooCorp "myFooCorp" lives on here and there, not to mention mySpace.

And, of course, .com has more or less become punctuation.

Finally, there's camelCase. When I was starting out, there were still widely-used programming languages with ridiculously short limits on names. Classic FORTRAN limited names to six characters (some compilers allowed a few more) and BASIC dialects varied but could be even worse. Single-case, conventionally ALL CAPS, was still prevalent as well.

[You got around these restrictions by dropping any letter you could -- "parameters" became PARMS, "first name index" might be FSTNMIDX.  Well-organized FORTRAN code typically built variable names up from abbreviated parts and had block comments in key places explaining what all the abbreviations meant.

Early versions of FORTRAN also had the convention that the first letter of the name indicated whether a variable was integer or floating point, so you'd get names like IRANK, since plain RANK would be floating point.  While that led to a lot of names starting with I, I doubt that's where the dot-com-era i- prefix comes from.  --D.H. October 2015]

Two popular languages were less restrictive: C and Pascal. C coding style called for all-lowercase names except for constants, with underscores serving as spaces: my_variable_name. Pascal, on the other hand, didn't allow underscores in names (or maybe they were just considered uncool?). Instead, Pascal code used capitals to break up long names: MyVariableName.

I really don't know how mixed case came to be the dominant style, but it has. I still remember a TA (who would later spend some years working for Apple) complaining that my C-style names_with_underscores hurt his eyes and why didn't I do things TheRightWay. Fast forward a few years and if you want to look web.hip you have to go camelCase. Spaces are so old economy.

The astute reader may notice the subtle distinction between camelCase (starting with lowercase) and PascalCase (starting with uppercase). Both are used in actual code. For example, Java conventions call for names of classes to start with a capital and most other names to start with lowercase. I suspect that dot-commers chose lowercase (for the most part) because it just looked less conventional.

Whatever the reasons, it seems to have caught on, more so, in fact, than any of the particular prefixes.



How much dot-com-y goodness will fit in one name? What's the equivalent of a tall double half-caf soy vanilla latte? My guess is it would be somewhere around "myENet.com", but I may have missed a step.

[A quick search reveals that "tall double half-caf soy vanilla latte" is small beans. The real bidding starts at "Venti, sugar-free, non-fat, vanilla soy, double shot, decaffinated, no foam, extra hot, Peppermint White Chocolate Peppermint Mocha with light whip, upside-down, 1 pump of peppermint, 1 and 3/8 pumps vanilla,180 degrees, heavy whip-cream, 3 ice cubes, 1/4 teaspoon Nutmeg sprinkled on top, with green sprinkles, lightly cinnamon dusted on, stirred, with no lid, double cupped, and a straw"]