Thursday, January 31, 2008

Brand and The Media Lab, 20 years after

Ah, here's the quote: Stewart Brand's The Media Lab, page 68. Media Lab's director, Nicholas Negroponte, has just described a half-gigabit channel as "effectively infinite" and asks what one would do with that much bandwidth:
The Media Lab's best and brightest stuttered. One volunteered you could have a combination of television immediacy and newspaper depth and detail available at any time through such a medium. Fine, that would occupy a teensy fraction of the bandwidth. What else? Uh, every house acquires its own broadcast capability. Uh, you could broadcast solids by having them fabricated at the receiving end. Uh, everything would be instantly available as super fax -- a daily National Geographic-quality New York Times with today's news rather than yesterday's, manufactured at the breakfast table.
That was written 20 years ago. It's tempting to take potshots at past speculation about what was then the future, but let's not. Predicting the future is hard. Myself, I'm hard put to predict the present.

On the other hand, some interesting questions do arise:

First question: What happened to the bandwidth? The Media Lab folks were stuttering 20 years ago because representatives of Bell (I mean Lucent ... I mean AT&T .... I mean Alcatel ... I mean ...) had just told them that they could run 500Mb fiber optic to everyone's home. That's around 60 megabytes/second, both ways.

There is no official definition of broadband, but current reports (such as this one from the OECD) use 256Kb as a working definition. By that measure Denmark and the Netherlands are leading the way (with Switzerland, Korea and Norway rounding out the top five). In Denmark, there is just over one broadband subscription per three inhabitants. If you factor in households with more than one inhabitant, this probably means that almost everyone who wants a broadband connection has one.

However, most of those subscribers have DSL (it doesn't say what flavor). If you're actually looking for fiber, Korea has about 9 subscriptions per 100 inhabitants, Japan about 8, Sweden about 5, Denmark about 3 and the Slovak Republic about 1. So much for fiber to every home. Most of us, it appears, are slogging along with DSL and cable. Your mileage may vary depending on a number of factors, but even the theoretical maximum listed here is nowhere close to 500Mb/s for either.

How did this (not) happen? That's an interesting story having more to do with economics than technology ...

Next question: Suppose we eventually all get a half-gigabit hookup. What could we possibly do with that? The answer seems very clear in hindsight: Blow it all on extra pixels. Late-'80s TV is one thing. HD-TV is quite another. Blu-Ray and HD-DVD play back at about 40Mb/s, easily handled by our infinite pipe, but they're not the ultimate.

Even HD-TV has visible artifacts if you look closely enough. Personally, I don't much care, but the true videophile is going to want more. How much more? Recall that IMAX is multiple gigabytes per second uncompressed, and (at a guess) around a gigabyte per second compressed. That's about eight gigabits per second, some 16 times our "infinite" capacity.

I doubt, though, that higher definition is why most people will want a bigger pipe. I suspect most of us who want a bigger pipe just don't like to wait. Sure, I could watch HDTV over a mere 50Mb/s connection, but who wants to wait for a program to download in real time? We want it now (because we're going to watch it later -- go figure). Now even the infinite pipe doesn't look so infinite. It's still going to take 10 minutes to download a feature film in high definition. Of course, if I can watch a movie while I'm downloading 10 more, that might take some of the sting out of it.
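For the curious, here is a rough sketch of where a figure like "10 minutes" comes from. It takes the 40Mb/s playback rate and the 500Mb/s pipe from the text. The two-hour running time is my own assumption, and the function name is just for illustration.

```python
# Back-of-the-envelope download time for a high-definition feature film.
# Assumptions (not measurements): 40 Mb/s playback rate, a 2-hour film,
# and a pipe that really does sustain 500 Mb/s.

def download_minutes(playback_mbps, film_hours, pipe_mbps):
    """Minutes needed to pull a whole film through the pipe."""
    film_megabits = playback_mbps * film_hours * 3600  # total size in megabits
    return film_megabits / pipe_mbps / 60

print(round(download_minutes(40, 2, 500), 1))  # 9.6
```

A shorter film or a lower bit rate brings the number down, but it stays in the range of minutes rather than seconds.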

If you buy the "no-waiting" theory then we don't want or need infinite sustained bandwidth, but we would really like infinite peak bandwidth. Give me a gigabit pipe, and I promise I'll hardly ever use it. Or maybe I will. Usage has a way of catching up with capacity, and it seems risky, at the least, to predict otherwise.

Finally, what about the speculation in the original quote? The broadcasting solids idea has an obvious weak point: you'd need a pretty special printer to go with that (modern machine tools can reproduce solids described digitally, by any of several processes, but they're not available for home use).

As to the others: We actually do have something like the "television immediacy and newspaper depth," at least by one plausible definition, in the blogosphere and in conventional news sources. The high-def New York Times is certainly feasible. You could even print it out if you really wanted to. It's another question, and an interesting one, why no one seems to be putting out that sort of content to the mass market.

Wednesday, January 30, 2008

Blobs, metadata and web content

Every time I price hard drives, the cost for a given size looks ridiculously cheap, even by comparison to the ridiculously cheap price from the last time I looked. If you spend $X on a disk drive every year, this year's drive will generally be able to swallow last year's whole without great effort.

And yet, we still manage to fill it all up.

Many years ago, at my first geek job, we splurged a somewhat scary amount of money and bought The Ten Megabyte Disk. It was about the size of a beer fridge and I remember wondering what we could possibly do with all that storage. I mean, it could hold the entire 128K expansion memory of the computer it was attached to (a box about the size of a large set-top cable box) almost 100 times over ...

So what's going to fill up today's terabyte-plus disks (100,000 times the beer fridge, if you're keeping score)? It would have to be video, I'd think. A terabyte will hold about 80 hours of mini-DV video, 250 single-sided DVDs, 125 double-sided DVDs, or 40 Blu-ray discs (assuming they're more-or-less full).

I previously estimated 3D IMAX at around 1GB per second, so if that's more or less right a terabyte will hold about 17 minutes of really high-definition video. There's no good way you could watch that in most people's houses right now, but give it time (I'm thinking some sort of VR glasses, not a humongous IMAX screen in every home).
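A quick sketch of the arithmetic behind a few of these figures, for anyone who wants to play with the numbers. It assumes decimal units (1TB = 10^12 bytes) and a 25GB single-layer Blu-ray disc, neither of which the text above spells out. The 1GB/second IMAX rate is the guess quoted above.

```python
# How much fits on a terabyte? Decimal units are assumed throughout.
TB = 10**12
GB = 10**9
MB = 10**6

def minutes_at_rate(capacity_bytes, bytes_per_second):
    """Minutes of video a disk holds at a given sustained data rate."""
    return capacity_bytes / bytes_per_second / 60

beer_fridges = TB // (10 * MB)              # ten-megabyte disks per terabyte
blu_rays = TB // (25 * GB)                  # assumed 25GB single-layer discs
imax_minutes = minutes_at_rate(TB, 1 * GB)  # guessed 1GB/second for 3D IMAX

print(beer_fridges, blu_rays, round(imax_minutes, 1))  # 100000 40 16.7
```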

What strikes me here is the vast difference between video and everything else.

A terabyte is about a million minutes of mp3 sound, or if you prefer, nearly 700 solid 24-hour days, or about twenty solid weeks if you prefer high-quality FLAC to mp3. It's hundreds of thousands of high-quality digital photos. It would hold all the actual software installed on my computer, including bitmap images, PDF versions of documentation and so forth, hundreds of times over. If you put every word and line of code I've ever typed in my entire life on a piece of the disk and painted that piece neon candy-apple red, you'd need a microscope to see it.

In the business, we call things like movies and songs BLOBs (Binary Large OBjects). Blobs become much more useful if you attach other information to them, for example the title of a movie, an index of scenes, cast and crew credits, and all the other kinds of stuff you'd find in IMDB. This sort of descriptive information about another piece of data is called metadata.

Metadata is very important. Imagine having 1000 songs and 100 DVDs stored on your disk, listed only by a 4-digit number. It's also absolutely tiny. The entire IMDB entry for a typical movie would fit into a small fraction of a single frame of that movie. You couldn't even use it as a cuing dot. It would flash by too quickly.

By comparison, typical web content is very rich in metadata. For example, this blog entry contains a couple of links and tags (that I put in) and some other indexing information (that Blogger puts in). It sits in a page full of other links and indexing information. Overall there is more text in this blog than metadata, but not thousands or millions of times more.

Many (but not all) web offerings are similarly metadata-rich. A typical social-networking site is all about the links. Google makes its money handing you piles of links. Even something like flickr adds value by categorizing blobs and otherwise making them easy to find.

All of which brings me back to my recurring theme of human bandwidth. We humans can consume vast amounts of visual information, large amounts of audio information, but only so much metadata. As a corollary, the amount of space per user on a web site will be tiny (from the computer's point of view), unless it happens to be rich in audio or video.

Sunday, January 27, 2008

IMAX 3D vs. broadband

Watching a concert film on IMAX 3D is an interesting experience. You get a much better view, for a stadium show, than anyone actually in the house -- sort of like you're somewhere close to the stage and really tall and able to float up level with or above the performers at random intervals. The sound is great, and perfectly synchronized, unlike sitting up in the nosebleeds with a slight-but-noticeable lag between the big screen and the speakers. You can hear the virtual crowd roar around you, even if the actual crowd around you is quiet.

The screen is big enough and close enough that you can't quite take it all in at once. If something darts forward off to one side, you have to turn (slightly) to see it in focus. You don't have to turn as far as you would in real life, another slight but unavoidable disconnect between being fully immersed and sitting in a theater with a huge screen and great speakers.

Note to directors: The 3-D cameras love the drum kit -- all those cylinders poised at various angles -- but don't overdo it. There's also another slight disconnect here. The drumsticks strobe noticeably since even IMAX is still 24 frames per second.

Now for the interesting question: How many bits?

IMAX film has a resolution of approximately 10,000 by 7,000. Assuming 32-bit color, 24 frames per second and 2 cameras, that comes out to about 13 gigabytes per second, uncompressed. There's ample room for compression, particularly in 3D since the two images are largely identical, but you're still talking on the order of a gigabyte per second. Picture throwing two blu-ray DVDs into the maw of the beast every minute and you're in the ballpark.
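The uncompressed figure is easy to reproduce. This sketch simply multiplies out the numbers given above (resolution, 32-bit color, 24 frames per second, two cameras). The function name is made up for the example.

```python
# Uncompressed data rate for 3D IMAX, using the figures quoted in the text.

def imax_bytes_per_second(width, height, bytes_per_pixel, fps, cameras):
    """Raw bytes per second with no compression at all."""
    return width * height * bytes_per_pixel * fps * cameras

rate = imax_bytes_per_second(10_000, 7_000, 4, 24, 2)
print(rate / 10**9)  # 13.44 (gigabytes per second)
```

Everything past that point (how well it compresses, in particular) is guesswork, which is why the gigabyte-per-second figure is only an order of magnitude.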

Leaving aside the small matter of installing an IMAX home theater, could you at least stream the bits into your house? If you happen to have 10-gigabit ethernet or better coming in, you're good to go. 75-year-old Sigbritt Löthberg of Karlstad, Sweden does, thanks to her son, Peter (see here for slightly more details). I don't, and you probably don't, either.

On the other hand, gigabytes are getting cheaper every day. Last I looked, hard drives were running around $0.30/GB. A 90-minute movie would require about 5TB of disk (5400GB at 1GB/second), over $1000 retail. That's probably viable for theaters now -- IMAX film reels are massive -- but not quite ready for home use.
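Here is the same storage arithmetic as a small sketch, so the price per gigabyte can be swapped for whatever it is this week. The 1GB/second rate and $0.30/GB price are the rough figures from the text, not anything more precise.

```python
# Disk space and retail cost for one movie at IMAX-like data rates.

def movie_storage_gb(minutes, gb_per_second):
    """Gigabytes needed to hold a movie of the given length."""
    return minutes * 60 * gb_per_second

def disk_cost(storage_gb, dollars_per_gb):
    """Retail cost of that much disk."""
    return storage_gb * dollars_per_gb

size = movie_storage_gb(90, 1)
cost = disk_cost(size, 0.30)
print(size, round(cost))  # 5400 1620
```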

Saturday, January 26, 2008

Virtually 1929

NOTE: I first ran across this story in the (print edition of the) Wall Street Journal. Unfortunately, the Journal keeps most of its material off-limits to non-subscribers (or at least those who don't care to swim in DMCA waters by scraping the material off the Journal's site as it comes out). However, the identical story also appeared elsewhere online, for example here in the Bloomington-Normal, Illinois Pantagraph. I have no idea how long it will be archived there, or what the point of this whole song-and-dance might be. Meanwhile, back at the ranch ...


It seems that banks had been opening in Second Life, where you could deposit your Linden dollars and earn a handsome rate of return -- 200% in one case. Given that Linden dollars are exchangeable for U.S. dollars, and 200% is well above the current risk-free rate of return, this is clearly no ordinary savings account.

In fact, the owners of the banks promising these rates used deposited funds to speculate in Second Life real estate, which is virtual space but ultimately valued in real money. Others were into gambling, again all ultimately based in real money. Effectively, you're not depositing into a savings account at all. You're buying a junk bond in a (virtual) real-estate speculation or (other) gambling concern.

Except you can't sell your bond on to someone else directly. You can only withdraw your money, that is, sell the bond back to the company. Since your de facto position is dressed up as a bank account, you have the illusion of liquidity, but if the bank's speculative activities hit a rough patch, that liquidity dries up in a hurry, as the unfortunate and possibly unknowing bondholders discovered when they came for their money en masse. At least one such bank eventually made the arrangement official by issuing bonds to depositors when it couldn't directly honor withdrawals. It then defaulted on the bonds.

All of which seems a somewhat harsh way to drive home a basic point: If there's real money involved, real laws of economics apply. Real financial regulations ought to apply as well, and indeed that's exactly what has come to pass.

Tuesday, January 22, 2008

Anonymity in a horror flick

I saw a TV ad today for Untraceable. The gimmick involves an "untraceable" web site run by a serial killer. It might be interesting to see Hollywood's take on web anonymity, beyond the obvious "bad guys can use it" message. From the blurbs, the film seems more interested in exploring what might draw people to such a site, but the obvious geek question is whether they pick up on a larger audience providing (potentially) better cover for the killer by what I've called the "I'm Spartacus" effect.

Hollywood's record for getting technical details right is, shall we say, a bit uneven. Knowing my schedule, I'll most likely catch it on demand in a few months, if at all.

Monday, January 21, 2008

Shared space in traffic planning

Honestly, I'm not sure exactly how this item relates to "field notes on the web", but it seems it ought to, somehow.

A few towns in Europe and elsewhere have implemented or plan to implement "shared space" in their downtown areas. Such a scheme replaces the usual assortment of sidewalks, curbs, traffic lights, zebra crossings and so forth with a plain, flat, unbroken, unmarked paved area.

It sounds like a recipe for chaos, but it appears to work in practice. In the absence of markings, motorists tend to slow down and look out. Instead of assuming that no one is coming since the light is green, you'll tend to make sure no one's coming and be prepared in case someone is.

Naturally, this approach is not intended for main thoroughfares reserved for fast-moving car traffic, but only for areas where motorists, pedestrians and cyclists are expected to co-exist.

There appear to be statistics to back all this up, though I tend to think it's early days yet. It seems possible, for example, that drivers will become complacent once the novelty wears off, and it's not clear how well the existing examples would transfer to a larger urban area or to different street layouts, or how much of the effect is due (at least in some cases) to replacing traffic lights with roundabouts. There have also been concerns from the blind and from cyclists.

Conversely, I'd want to have a look at areas of the world with less traffic apparatus and see what their statistics are like. Nonetheless, it's an interesting idea, and it has a vaguely webby feel to it (at least to me), so there you go ...

Friday, January 18, 2008

More on speech recognition

Speech Recognition has some interesting (and kind) comments on my recent post on speech recognition. A couple of points seem particularly worthy of mention:

First, I hadn't been aware of the sheer volume of texting (I personally hardly ever use it). Volume in the UK averages on the order of a billion messages a week, approaching 20 messages per week for every person in the country (mobile phone user or not). Again, that's a lot of messages, especially considering that the device in question is optimized for speech and more or less pessimized for text.

The post then goes on to mention gethuman.com. If you ever want to know what set of hoops to jump through to bypass a company's twisty maze of voice/touchtone menus all alike, give it a whirl.

Thursday, January 17, 2008

Hacking the ENIAC

On the way home today I heard a piece on Jean Bartik, nee Jennings, one of the original programmers of the ENIAC (along with Kay McNulty, Betty Snyder, Marlyn Wescoff, Fran Bilas and Ruth Lichterman).

In those days, they called programmers "computers". Before computers could use computers, they had to use calculators, slide rules and good old-fashioned pencil and paper. From that point of view the ENIAC was a major advance. Working by hand with a desk calculator, a computer needed about 40 hours for a single trajectory. Here's how Bartik describes the demonstration she and Snyder put together at the expense of much midnight oil:
The trajectory ran faster than it took the missle to trace the trajectory. We printed out the trajectory and gave copies to the attendees. It was fabulous. Everyone couldn't believe their eyes. They turned off the lights in the computer room and the attendees could see the accumulators computing the numbers (the tips of the tubes were visible through holes in the front panels of the ENIAC). It set the standard for years to come when a computer was working. Hollywood used the front panels of the ENIAC as the model.
(The full transcript of the chat this is taken from, part of NASA's Female Frontiers program for elementary school girls, can be found here.)

The ENIAC was a bear to program.
It was a parallel machine where we had to essentially build a central processor using program trays, digit trays, accumulators, multiplier, divider/square rooter, function tables, master programmer and I/O devices.
Bartik and her co-workers did this by flipping switches and fiddling with cables. A facility for executing (very short) stored programs was cut from the original design but later retrofitted. This was well before the first work on assembler language. Think about that next time you complain about your IDE!

Getting the thing to do anything at all, much less perform complex computations orders of magnitude faster than previously possible, is an impressive accomplishment and definitely a neat hack.

Note: Wikipedia's article on ENIAC lists the programmers by the names they used at the time. Their individual articles are listed under the names they use (or used) later in life. I've tagged this post under those names. The full list is:
  • Jean Bartik, nee Betty Jean Jennings ("Jean")
  • Kathleen Antonelli, nee Kathleen Rita McNulty ("Kay")
  • Betty Holberton, nee Frances Elizabeth Snyder ("Betty")
  • Marlyn Meltzer, nee Marlyn Wescoff
  • Frances Spence, nee Frances Bilas ("Fran")
  • Ruth Teitelbaum, nee Ruth Lichterman

What's a community, anyway?

Earl comments, regarding megacommunities:
You need a special and somewhat sloppy definition of "community" to avoid oxymoronity.
Or in other words, how can you call something with a million or more people in it a "community"?

Certainly the Los Angeles Department of Public Works has no trouble doing so, nor did the European Community, nor do people referring to the scientific community (one hopes that there are at least a million scientists in the world). On the other hand, a horde of 100 million Skype users doesn't exactly conjure up images of neighbors strolling through the park greeting each other with friendly hellos.

If the working definition of "the registered users (or paying customers) of a given service" misses out too much of the "people living and working together" aspect of community, what is it we're trying to capture here? I can think of three aspects:
  • Belonging: People will say, for example, "I belong to Facebook" (amusingly, people also appear to be saying "All your X are belong to Facebook", for various X. Ah, the classics.)
  • Self-identification: I could define a "community" of, say, people with last names with an even number of letters, but that doesn't make that group a community.
  • Interaction: This, I think, is what gives online communities of whatever size their best claim to "community", and why I didn't mention things like banks with millions of online customers as megacommunities.
The whole point of many of the megacommunities I mentioned, and a major point of at least most of them, is that someone belonging to such a community can easily contact another member. Even if they have never met before, they will have something in common by virtue of belonging to the same community.

Whether a given person will actually do this and whether anything will come of it are separate questions, but so are they in a "real" community.

Monday, January 14, 2008

Megacommunities

How many communities have more than a million members? I'll call such a community a megacommunity. I see that the word is already in use in a slightly different sense (see here for example), but all words have multiple senses and I can't think of anything better.

First, there are geopolitical communities such as nations, provinces, states or what-have-you and large cities. There are around 150 countries with population over 1 million and hundreds of urban areas that size (just how many depends on how you count). Numbers concerning religious affiliation are even more open to interpretation, but it seems safe to say that there are dozens but not hundreds of religious megacommunities.

What about corporations? WalMart now has 1.9 million employees. China National Petroleum has just over a million. And that's it.

Now what about web communities? If anyone knows a comprehensive list of online communities ranked by number of registered users -- and it's probably out there -- I'd be glad to see it [In other words, I did a very quick Google and punted]. Here's what I got just by listing sites off the top of my head or as I ran across references to them, and quickly checking first the site's home page (generally not too helpful), then Wikipedia, and then Google if neither of those seemed to work. Please take these figures with at least a grain of salt:
  • MySpace has about 300M registered users (Wikipedia)
  • Orkut 67M (possibly including at least one automagically-signed-up user who was just trying to find out how many users there were) (Wikipedia)
  • Facebook has about 60M "active" users (Wikipedia)
  • Photobucket claims 58M on its home page, Wikipedia quotes a recent Fortune article giving 36M.
  • Kodak Gallery (formerly Ofoto) has 20M users (Wikipedia)
  • LinkedIn 17M (web site)
  • Second Life claims 11M residents on its FAQ list
  • AOL around 10M subscribers and dropping rapidly (Computerworld article)
  • flickr had about 8M registered users in May 2007 (TechNewsWorld article)
  • Near misses
  • Incomplete information (OK, I punted again)
    • Technorati indexes gobs and gobs of blogs, but it's not clear how many registered users it has
    • Skype shows around 10M people online at any given time and I've seen 100M total users claimed.
  • Some I didn't get around to looking up
    • Blogger (!)
    • Amazon
    • EBay
    • Gmail
    • Yahoo!
    • Hotmail
    • Wikipedia
    • del.icio.us
    • StumbleUpon
    • digg
    • Netflix
    • jaxtr
    • Any MMORPGs
It's a big world out there. I hadn't even heard of one or two of the names mentioned until I started digging. There are certainly more online megacommunities than religious ones. There may be or may soon be more than there are national megacommunities. Certainly there are online communities that would rank high on the list of countries by size.

Does this mean that traditional structures are on the verge of being overrun by virtual ones? Well ... first, many of the figures above should be taken with a grain of salt. At the very least, one needs to distinguish between everyone who's ever registered and everyone who's actively participating (Shutterfly could and probably does claim a considerably higher number of registered users than it does "transacting users").

It's also very easy (and sometimes useful) to belong to multiple online communities. It's possible to practice more than one religion or be a citizen of more than one country, but it's not the norm and the practical limit is not much more than one.

Sunday, January 13, 2008

Paying for quality

Here's a music pricing scheme I've seen a few times: For one price, you get a piece of music in, say, mp3 format with a given bit rate. For a somewhat higher price, you get the same piece of music in, say, flac format with higher fidelity.

This seems a very interesting test case for the "information wants to be free" theory. Intuitively, better quality ought to be worth paying for, but on the other hand, copying either version is essentially free, so why should there be a difference?

Suppose the lo-fi version is free and the hi-fi version costs a zillion dollars. No one will buy the hi-fi version (and no one makes any money)

Suppose lo-fi is free and hi-fi costs a pretty penny. Most likely some copies will sell, but many more will circulate illegitimately. No one makes much money.

What if lo-fi costs a modest amount and hi-fi costs a bit more -- pretty much what we have in most cases? Quite possibly there will be some sort of equilibrium. Some people will pirate, but enough people will pay full price to keep the bits flowing.

My totally off-the-wall guess is that at the optimum, the proportions of pirated copies for the lo-fi and hi-fi versions will bear some simple relation to each other. On the other hand, it seems quite possible that no one will want to buy the lo-fi version no matter how cheap, if it's really bad, and conversely there's a point at which it's not worth paying for that last bit of signal quality.

The optimum price will depend on a number of factors, including local law and the cost of breaking it, demand for the particular piece of music, and (I tend to think) people's judgment as to how fair the price is and to what extent piracy hurts the record label as opposed to the artist.

When Stewart Brand said "Information wants to be free" he actually said:
On the one hand information wants to be expensive, because it's so valuable. The right information in the right place just changes your life. On the other hand, information wants to be free, because the cost of getting it out is getting lower and lower all the time. So you have these two fighting against each other. [See here for example]
We don't always remember both halves of that statement, but both are essential.

Busking for dollars

I've likened Radiohead's "No really, it's up to you" pricing scheme to a tip jar, and I think the comparison is a good one. They've basically put a ginormous virtual guitar case out on the virtual sidewalk and invited people to toss money in.

So this is an old, old business model, albeit on an unusually large scale. Neither is it new on the web. That PayPal donation button on your favorite project's web site is just another tip jar.

It must get some results, or people would most likely have stopped doing it, but on the other hand I wonder how many people (outside those in musical acts with established audiences of millions) make much off of it.

Saturday, January 12, 2008

Another rule of thumb

In these pages I've quoted Linus' law, Moore's law and probably others, so why not have a go at it myself? I don't believe I've heard this one put quite this way, though I'm sure it's been said before, particularly in other fields of engineering:
Development time depends largely on how quickly you can get accurate results. [Maybe a bit more snappily, on how quickly you can see what happened]
This can depend on any number of things, for example:
  • How long it takes to find out you've made a trivial mistake like misspelling a name. IDEs shorten this time by doing much of the bookkeeping before even officially compiling.
  • How long it takes to find out whether you've fixed a bug or properly implemented a new feature. Processes that put more testing in the developer's hands help shorten this.
  • How long it takes to verify that what worked in your sandbox also works for the official version. Tight build management and release processes help eliminate surprises in what should ideally be a quick step.
  • How long it takes to be sure that you've done what your customer wanted. Processes that put development teams in touch with customers frequently aim to shorten this.
At heart, this is just saying that good feedback loops require good feedback, but it still seems worth noting.

Friday, January 11, 2008

What is the bandwidth of one voice talking?

A side question to the previous post: About how much information does a voice convey per unit time? Let's neglect tone of voice here; it's clearly important in real speech, but my guess is that if you could quantify it, it would come out to not very many bits (Happy? yes/no. Angry? yes/no. Sarcastic? yes/no. etc., each changing fairly seldom). It's also not something that computers are terribly good at picking up, so it's not relevant to the particular case of speech recognition.

At an upper limit, how fast can people talk? Appearances can be deceiving here. That auctioneer rattling on a mile a minute is really just continually refreshing two 2- or 3-digit numbers (possibly with some zeros attached) that change every few seconds. That person zipping along in a foreign language isn't really talking significantly faster or slower than you would, but it sounds like a lot since you don't understand it. And of course, some people can say more with a word than others can say with a paragraph.

The record for speaking English appears to be around 10 words per second (yikes!). Mind, this is most likely someone spewing out a prepared spiel that they've practiced over and over again. Assuming about 10 bits per English word (estimates vary a bit), that's about 100 bits per second. My guess is that most of us, particularly those of us actually coming up with the words as we go along, would do well to hit half that. On the other hand, most of us type considerably more slowly than that.

So let's figure 50 bits/second for running speech, for example, if you're dictating a letter. What if you're just barking out commands from a set list? Interestingly, bandwidth drops considerably. For example, if it takes half a second to bark out one of 16 commands, that's 8 bits/second. Not exactly broadband.
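The estimates above can be written down as a small sketch. The 10 bits per word is the rough figure from the previous paragraph, and the command case assumes all 16 commands are equally likely (so each one carries log2(16) = 4 bits), which the text implies but doesn't state. Function names are only for illustration.

```python
import math

def running_speech_bps(words_per_second, bits_per_word):
    """Information rate of free-running speech."""
    return words_per_second * bits_per_word

def command_bps(num_commands, seconds_per_command):
    """Information rate when picking from a fixed, equally likely command set."""
    return math.log2(num_commands) / seconds_per_command

print(running_speech_bps(10, 10))  # 100 (the record-holder)
print(running_speech_bps(5, 10))   # 50  (the rest of us)
print(command_bps(16, 0.5))        # 8.0
```

If some commands are much more common than others, the real rate would be lower still.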

Tuesday, January 8, 2008

Some day you will be able to talk to your computer

Actually, you can do that now. The trick is getting it to listen.

I heard someone on the radio talking about how before too long, keyboards were going to be practically obsolete. Typing, they said, is just one way of communicating with a computer. Why should you type in, say, a search phrase when you could just say "Where's a good Thai restaurant in the area?" and have the computer google it up?

The "just another way of communicating" angle rang a bell, and now I remember which bell. It's the same reason text is supposed to be dying. I've already argued that, um, there must be something amiss with that position, but killing typing is not the same as killing text itself. One could imagine a world in which we still read text, in all its visually-tuned, random-access glory, but create it using voice recognition and a mouse (or a pen, or touchscreen, since mice are about to be obsolete as well, but let's stick to one UI device at a time).

We shall see. Speech recognition (technically, voice recognition means figuring out who is speaking as opposed to what they're saying) has been around for a while now, getting steadily better. There now appear to be dictation systems that can capture most people's normal, continuous speech with high accuracy. This is important, because having to speak ... slowly ... and ... distinctly ... for ... long ... periods ... of ... time or constantly correct monsieur's misheard words rapidly negates any advantage in ease-of-use.

Nonetheless, I'm not sure speech is going to take over as completely as one might think. Why do people send text with cell phones? One would be hard-pressed to imagine a more tortuous way of producing text than to thumb it in on a tiny numeric pad, especially before word recognition, but people did, and do, even when they could call, or leave a voice mail. Or for that matter, why do people IRC while on the phone?

Conversely, how often do you get an email with a voice attachment? I doubt bandwidth or storage are significant problems for that any more. Speech requires around 3 kHz of bandwidth and compresses quite a bit better than, say, classical music.

Personally I would prefer not to talk to my computer much of the time, either because the environment is noisy (playing havoc with accuracy), or because it's quiet (and I don't want to disturb anyone), or because there's someone else in the room I might like to talk to without confusing the UI. Occasionally I might even want to mutter some choice words at it without them showing up in the latest blog post for all the world to see.

In short, speech recognition is useful now (particularly if typing is difficult or impossible for a person) and will continue to become more useful as the technology continues to improve, but I don't see it taking over the world.

Postscript: Long ago I decided I was going to get rich by selling a computer that would run faster if you yelled at it. I could have done it, too. Just hook up a volume meter to the system clock. When nothing's coming in, the system runs at, say, half speed. As the volume increases, the clock comes back up to normal. Of course, you'd have to get people used to a computer that runs artificially slowly. But that's a solved problem ...

Sunday, January 6, 2008

AJAX and samsara

One of the many aspects of the software biz that never cease to amaze me is how many things I've learned over the years that Ivan Sutherland already knew before 1970. For example, from esr's Jargon File, aka The Hacker's Dictionary (v. 4.4.7):
wheel of reincarnation

[coined in a paper by T.H. Myer and I.E. Sutherland On the Design of Display Processors, Comm. ACM, Vol. 11, no. 6, June 1968)] Term used to refer to a well-known effect whereby function in a computing system family is migrated out to special-purpose peripheral hardware for speed, then the peripheral evolves toward more computing power as it does its job, then somebody notices that it is inefficient to support two asymmetrical processors in the architecture and folds the function back into the main CPU, at which point the cycle begins again.

Several iterations of this cycle have been observed in graphics-processor design, and at least one or two in communications and floating-point processors. Also known as the Wheel of Life, the Wheel of Samsara, and other variations of the basic Hindu/Buddhist theological idea.

(My understanding is that the graphics world is in the process of turning the wheel another notch with the GPGPU)

To this list we can add the thin client/fat client oscillation, and at the moment the pendulum is swinging toward thicker clients. Thus AJAX.

Clearly there isn't one inherently correct solution to these sorts of "where do I put the processing power" questions. The answer at any given moment depends on economics: How much does processing power cost (and how much can you gain by concentrating it in powerful servers)? How much does bandwidth cost? How much do the various models cost to develop, test, deploy and administer?

A related question: How much development effort goes into new products and solutions, and how much goes into adapting existing products and solutions to the pendulum's latest swing?

Sorting out AJAX

I've tagged a few posts here "Web 2.0", but I haven't said anything here (until just recently) about AJAX. So here's a quick take.

AJAX normally stands for three things:
  • A is for asynchronous, meaning that your browser doesn't (always) have to wait for the server it's talking to to get back to it.
  • J is for JavaScript (or JScript, or more technically ECMAScript), which is one way for your browser to run custom code on its own while it's not waiting for the server.
  • X is for XML, which is XML.
For my money, these are listed pretty much in order of their importance to the picture. If you have some means of decoupling the browser from the "send request - wait for response - display response" routine, then it matters less exactly how you're telling the browser, say, how to display a list of choices that all begin with the letters you typed. It matters even less how the names behind that list happened to look on the wire when the server sent them to the GUI code the browser is running.

In theory, the browser could use any language for its scripts. ECMAScript is the choice in practice because it's close to languages like Java and C# that lots of people know, and more importantly because the major browsers support it.

XML is even more a matter of choice. If you're building a web page, you can define your own format if you really really want to. Or you could use JSON, or anything else you can find a library for. Not to say that there aren't arguments for using XML, just that "it's the only game in town" isn't one of them.
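
To make the "matter of choice" point concrete, here is a rough sketch in Python (used here only because its standard library parses both formats). The payloads, element names and restaurant names are all invented for illustration. The point is just that the code consuming the list doesn't care which format came over the wire.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical payloads: the same list of suggestions in two wire formats.
XML_PAYLOAD = "<choices><choice>Thai Basil</choice><choice>Thai Orchid</choice></choices>"
JSON_PAYLOAD = '{"choices": ["Thai Basil", "Thai Orchid"]}'

def choices_from_xml(text: str) -> list:
    """Pull the text of each <choice> element out of the XML payload."""
    root = ET.fromstring(text)
    return [node.text or "" for node in root.findall("choice")]

def choices_from_json(text: str) -> list:
    """Pull the "choices" array out of the JSON payload."""
    return list(json.loads(text)["choices"])

# Either parser hands the display code the same list of names.
print(choices_from_xml(XML_PAYLOAD))
print(choices_from_json(JSON_PAYLOAD))
```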

But you can't do any of this fun stuff without a way for the browser to take on some of the processing, and in particular to be able to do that processing without waiting for a server response. The A for "asynchronous" is the essential "what" here. The J and the X are the vital but to at least some extent interchangeable "how".
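
Here is a toy sketch of the asynchronous part, written in Python's asyncio rather than browser script, so treat it as an analogy and not as how any browser does it. The `fetch_suggestions` function, its delay and its list of names are stand-ins I made up for a real server round trip.

```python
import asyncio

async def fetch_suggestions(prefix: str) -> list:
    """Stand-in for a server round trip; the delay and names are made up."""
    await asyncio.sleep(0.05)  # simulated network latency
    names = ["thai basil", "thai orchid", "tandoori hut"]
    return [n for n in names if n.startswith(prefix)]

async def main() -> list:
    events = []
    # Fire off the request without waiting for the answer.
    request = asyncio.create_task(fetch_suggestions("th"))
    # Meanwhile the "browser" keeps handling local work.
    for key in "ai":
        events.append(f"handled keypress {key!r}")
        await asyncio.sleep(0)  # yield control, as an event loop would
    # Only now do we wait for the server's reply.
    events.append(f"server replied: {await request}")
    return events

events = asyncio.run(main())
for line in events:
    print(line)
```

The keypresses get handled before the reply arrives, which is the whole trick. Swap in a different scripting language or a different wire format and that ordering stays the same.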

Wednesday, January 2, 2008

My vote for most annoying new web site feature of 2007 ...

... goes to those double-underlined links that pop an ad up in your face when you try to chase them. I can understand the desire to get some kind of revenue out of providing useful information, but this stuff makes me think of Digicrime (no double underlines on that one, but please read their disclaimer before you go playing around there).

Economic copy protection

Suppose I've created something really cool, say a screensaver that's so entrancing that when you see it on my screen, you've got to have it. I agree to give you, and only you, the binary for it, for a hefty sum. Bear with me -- it's a thought experiment.

You've got the binary. I've made no effort to copy protect it. You can copy it as much as you want and give it away to all your friends if you want. Will you?

My guess is no. If you've paid a bunch to have my screensaver on your screen and your screen only, why would you give it away? Will I copy it and give it to someone else? Not if I want to do business with you again.

In effect, the work is copy-protected, but by economic, not technical means.

Now suppose you grow tired of my creation. You could just give it away then, but why not try to get something out of it? Say, sell it to your six friends that have been drooling over it, with each paying 20% of the original price you paid me. They get the cool screensaver at a heavily discounted price (to make up for their not having had it hot off the press), and you get a cool 20% profit. Plus the use of the cool screensaver in the meantime.
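
The resale arithmetic, as a quick Python sketch. The price of 1000 is an arbitrary figure picked for illustration, and `resale_outcome` is a made-up helper name.

```python
def resale_outcome(price_paid: int, buyers: int, percent_each: int) -> tuple:
    """Return (total recovered, profit), in the same units as price_paid."""
    recovered = buyers * price_paid * percent_each // 100
    return recovered, recovered - price_paid

# Six friends each pay 20% of a hypothetical original price of 1000.
recovered, profit = resale_outcome(price_paid=1000, buyers=6, percent_each=20)
print(recovered, profit)
```

Six times 20% is 120% of the original price, which is where the 20% profit comes from.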

But if I'd known you were going to do that, I would have been better off just selling to your six friends to begin with, and maybe to you as well if you could tolerate not being the only one with the new work.

With more customers I can charge less per customer, but by charging less I will also pick up more customers who don't see any need to refrain from copying the work and giving it away. If you pay, say, $4 for something and get as much enjoyment out of it as you would from a fancy cup of coffee, your investment at that point is zero. So why not give it away (absent any technical or legal barrier)?