Field notes on the Web: March 2008

Monday, March 31, 2008

Necessary, but not sufficient

While I've just argued that bigger, faster and better-connected computers do not automatically give rise to qualities like intelligence or consciousness, and that some aspects of intelligence require very little computing power, that doesn't mean that every aspect of intelligence is trivial.

A demo like the Big Dog is possible not only because many people have worked very hard at understanding how animals move and how to explain that to a computer, but also because we now have absolutely sick amounts of computing power available compared to a few decades ago when the serious speculation started.

For example, the first computer chess program could just barely follow all the rules (or could it do even that?), but before long the main elements of computer chess as we know it today were in place. Deep Blue beat Kasparov partly due to constant incremental improvements in the underlying software, but mostly because the hardware came to be able to crunch out so many positions that it could see farther ahead and make "positional" moves out of sheer calculation.

Image processing has come a long way, to the point that cameras can more or less recognize faces (i.e., that these pixels are probably a face, not that it's your face). Some of this is due to better understanding and better algorithms, but even the best algorithm is not going to run in reasonable time without a high-octane pixel-smasher working behind it.

In short, progress is coming from painstaking research aimed at understanding the problems to be solved and getting the code to run, and it's coming from continual advances in hardware performance. Both are necessary. Neither is sufficient alone.

The intelligent web -- or not

It used to be a staple of science fiction and futurology that as computers got bigger and faster, they would get smarter. That hasn't exactly panned out in any clear-cut or linear way. Computers can certainly do all sorts of useful things that they couldn't before, including tasks that have at various times been considered part of "Artificial Intelligence", but there are certainly a whole lot of big, fast, dumb computers out there.

Well, maybe it's not so important for computers to be big and fast, so long as they're well enough connected. Perhaps, with ridiculous amounts of computing power attached to the web, a sort of distributed consciousness will emerge. This idea can be seen floating in the same general vicinity as the idea of a "semantic web", though I don't believe that's what the semantic web is about in any significant way.

In any case, web-connected computers these days seem to keep mostly to themselves. Generally my computer and most likely yours will act as a client to a server somewhere, communicating in a very carefully-delimited way to fetch some piece of information for a human user (the weather report, sport scores, email from other people). This information is generally either cached for a person to look at later, or simply thrown away.

Well what do you expect? It's people who are writing the applications and people who are using them, after all.

There are a few cases of large numbers of desktop computers working together to achieve a common goal. I'm thinking SETI@home or GIMPS, for example. In these cases, someone has written special software for a particular application, complete with hand-crafted communication code. It's not like my desktop computer got the idea that it and yours should go off and hunt for large primes. In fact, it's very much not like it. Pretty much the opposite.
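For a sense of how narrow such hand-written software is, here is a minimal sketch of the kind of calculation a GIMPS-style client grinds through: the Lucas-Lehmer test for whether 2^p - 1 is prime. The function names and the idea of a "work unit" as a plain list of exponents are my own illustrative assumptions, not the real client's design, which relies on far faster arithmetic.

```python
def is_mersenne_prime(p: int) -> bool:
    """Lucas-Lehmer test: is 2**p - 1 prime? Assumes p itself is prime."""
    if p == 2:
        return True  # 2**2 - 1 = 3, handled separately
    m = (1 << p) - 1          # the Mersenne number being tested
    s = 4
    for _ in range(p - 2):    # p - 2 squaring steps
        s = (s * s - 2) % m
    return s == 0


def work_unit(exponents):
    """Hypothetical work unit: test each assigned exponent, report the hits."""
    return [p for p in exponents if is_mersenne_prime(p)]


# Prime exponents up to 31; the machine just does what it was handed.
candidates = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
hits = work_unit(candidates)
print(hits)
```

Nothing in there decides what is worth computing. A person picked the problem, the algorithm and the list of exponents.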

Intelligence is a lot of things, ranging from well-defined tasks that computers are good at, like remembering what just happened, to less well-understood facilities like judging whether someone is lying or learning a natural language by immersion. Try to even define "consciousness" and you're in for a machete-worthy trip through the philosophical wilds.

Only a few of these things we call intelligence are directly applicable to, or useful in, the web as a whole. That's not to say they wouldn't be useful on a small scale. A truly smart SPAM filter would make someone very rich (maybe even the person who wrote it). But that's quite a different thing from "the web" being able to filter SPAM.

Put another way, we don't really interact with the web as a whole. Even when searching the web for knowledge, we interact with a search engine that treats the web as a huge pile of passive information. We don't ask the web for answers. We ask a search engine. A particular application can be smart, and I'd argue that it can be smart in a meaningful way without doing anything magical or particularly hard to explain, but that's an application, not the web.

Naughty Auties

Interacting with people via the web is much like interacting in the real world. To most of us, the differences are probably more interesting or annoying than anything else. E-mail doesn't convey tone of voice, so we invent smileys. Virtual worlds let us take on avatars and play with the way we present ourselves to others.

To those on the autism spectrum, however, the social cues most people take for granted in the real world are a confusing and arbitrary mess that must be painstakingly learned or otherwise dealt with. Having a smiley that says "that was a joke" can be more than just a convenience. The relatively pared-down and stylized social vocabulary of a virtual world can create a safer space than the cacophony of, say, a coffee shop or party.

At least, that's what I gather from this interesting CNN iReport on "Naughty Auties" and autism in Second Life. It's a beautiful thing to see people taking advantage of the quirks of the system to make life better. I wouldn't quite call it a "neat hack", but it's certainly what the old-fashioned hacker ethos (as opposed to the script-kiddie stuff that happens to have the "hacker" label attached to it) is all about.

Thursday, March 27, 2008

Big Dog

Except that it's been circulating on the web, this clip has very little to do with the web per se. But dang is it a neat hack. Or really, a whole bunch of painstakingly assembled neat hacks.

Keep in mind that the beast is carrying about 150 kilos (340 lb) for most of the clip.

MILD SPOILER:

If you're like me, you probably winced when the guy tried to kick the beast over, and cheered (at least inwardly) when it got back up. Empathy toward anything that shows goal-directed behavior appears to reside in yet another deep-seated brain circuit ...

The Attention Economy: What's new?

Speaking of video ...

There's something primal in the attractive power of a flickering TV screen, some circuit deep in the brain that says "I need to look at that. It might be important." We know that the circuit in question must be fairly inaccessible because that gut feeling that the pretty pictures might be important is practically always wrong, and yet we continue to gravitate.

Sometimes we watch the pretty pictures because they inform, more often because they entertain, but mostly, it would seem, because they keep us occupied until the real payload comes along: more pretty pictures meant to get us to buy cars, or beer, or beauty products or best of all, a new and bigger TV. To which we can then be even more strongly attracted, and in which we can be even more thoroughly immersed, than ever.

Don't get me wrong. I like TV, particularly when the NCAAs are on. I'm just wondering what exactly is the new part of the idea of grabbing people's attention and extracting money from them, which, if I understand it, is the basis of the "Attention economy".

Or is the attention economy about amassing and trading data about what people are paying attention to? That's surely a means to the end of extracting money from said attention. Granted, the web adds something new here: It's much easier to collect precise and copious data on which pages people are clicking on, as opposed to estimating what they're watching on the TV. But as far as analyzing that data in a widely-accepted and understood way, and actually "monetizing" it, the web still seems to be playing catch-up to Nielsen.

Maybe I'm being too glib here. Once again, I'm really just pushing back against the idea that The Web Changes Everything. The more I study the matter, the more I think that much of the low-hanging fruit was picked generations ago (at least). To paraphrase someone: The web is new and significant, but what's significant isn't new, and what's new isn't significant. That's a strawman position, and I don't really believe anything quite that strongly put, but I do find it a useful lens through which to examine what I run across.

Right now, I'd settle for email on-demand

I've been cleaning up one of my email accounts lately. Just now, I ran across a bunch of messages with large attachments. I wanted to move them to a different folder in the same mailbox. Judging by the network activity and the way my mail client sat comatose for many minutes, it would appear that the client was downloading every message from the old folder and uploading it to the new folder.

Does IMAP really require that? I thought that the whole point of IMAP is that the server has the definitive copy of everything and the client can fetch what it wants on demand. In which case moving a message should just mean changing a tag (or something similar) and should take practically no time at all. From an old-school information-theoretic point of view, there is very little new information: "It was there and now it's here" for a couple dozen messages.
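As far as I can tell, the protocol itself does not require the round trip. Base IMAP has a COPY command that works entirely on the server, and Python's standard imaplib exposes it. The sketch below is an illustration only: the function name is mine, and whether any particular mail client or server actually behaves this way is an assumption, not something I have checked.

```python
import imaplib


def move_on_server(conn: imaplib.IMAP4, message_set: str, dest_folder: str) -> None:
    """Move messages using only server-side IMAP commands.

    Assumes `conn` is already logged in with the source folder selected.
    No message bodies cross the network: just three short commands.
    """
    typ, _ = conn.copy(message_set, dest_folder)    # server duplicates the messages
    if typ != "OK":
        raise RuntimeError(f"COPY to {dest_folder!r} failed")
    conn.store(message_set, "+FLAGS", "\\Deleted")  # mark the originals
    conn.expunge()  # note: removes every message flagged \Deleted in the folder


# Hypothetical usage (host, credentials and folder names are placeholders):
# conn = imaplib.IMAP4_SSL("imap.example.com")
# conn.login("user", "password")
# conn.select("Inbox")
# move_on_server(conn, "1:24", "Big attachments")
```

The message set "1:24" stands in for the couple dozen messages in question. The whole request amounts to a few dozen bytes of "it was there and now it's here".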

Maybe the client thinks it needs to re-read the body of each message so it can re-run its spam filtering or search indexing? I could see that, maybe, particularly if the culprit is a plug-in that doesn't want to make any more assumptions than absolutely necessary. Ideally, though, moving a message would mean moving any already-derived data along with it, not re-deriving results that cannot have changed.

This is partly a case of one of Deutsch's fallacies popping up again. Guess what, kids. Bandwidth isn't infinite. It's all well and good to hypothesize about throwing mammoth hunks of video around, but if we can't handle a few megabytes of email attachments without the world grinding to a halt, perhaps this is premature.

Grumpily yours ...

Tuesday, March 25, 2008

We want the world and we want it reasonably soon

One of the nice features of digital TV is that you can decide when to watch a given program. This isn't a new feature, but digital delivery makes it a lot easier. Anyone remember VCR Plus?

In fact, you can time-shift in two different ways with a typical digital cable setup (at least, I think mine is fairly typical):

Record a program and watch it at your leisure
Get it on demand

In either case, you watch the show when you want. The difference is when the bits arrive. In the first case, they arrive when the broadcaster decides to send them. In the second case, they arrive when you ask for them.

What determines which way the bits get to you?

Storage space: Your DVR will only hold so much. If you try to record everything you might possibly be interested in, you're liable to run out of space. This factor is rapidly changing.
Broadcast bandwidth: There's only so much spectrum available. With infinite bandwidth, a provider could broadcast every piece of video/film ever made at 1,000,000x speed on an infinite loop and everyone could grab what they wanted as it came by [See this post for some implications of that].
Copy protection: A provider might prefer that you not store a copy of the bits. Instead, it would prefer to send encrypted bits to a box that decodes them, without offering a ready way to store the results. This factor is also subject to change as the whole copy protection issue shakes out.
Time sensitivity: A live event has to go out live. Even pre-recorded material can have more impact if everyone gets it at the same time -- the water-cooler effect.

The broadcast-and-record model doesn't require a data network. Traditional TV/Radio broadcasting, whether from towers or satellites, works just fine. VHF and UHF together comprise around 3GHz of bandwidth, readily available without building out a "last mile". At the moment, that's still quite a bit of bandwidth. Even if the broadcast is via a data network, there's no requirement that said network be connected to or behave like the internet.

The downside is that (for the most part) you don't get to choose when the bits are sent. But how much of a downside is that? It's not a problem for live content -- quite the opposite. It's not a problem for not-so-live content either, as long as you can still choose when you watch.

Again, the connection between receiving and watching has been loosening over time as storage gets cheaper and easier. Maybe this is just me, but if my favorite show comes in while I'm asleep and I can watch it whenever I want the next day, I've got no problem. On the other hand, if it's something that everyone just absolutely has to watch at a given time, that's essentially live content.

The upshot is this: Given ample cheap storage, it would appear worthwhile to broadcast anything new that lots of people want to see, regardless of whether they want to see it right at that moment.

I haven't paid a lot of attention to satellite TV lately, having had cable for the past few years, but I could imagine someone shipping out a set-top box pre-loaded with a huge video library and space for plenty more. Only new content gets broadcast.

Live content comes in as it happens. Pre-recorded content comes in whenever the bandwidth is available. Everything gets stored on the box, using double-secret heavy encryption mojo. You can have whatever you want, and reasonably soon. The result would be pretty much indistinguishable from a cable set-up.

You would even have on-demand viewing. The difference is that instead of demanding the bits, you're demanding authorization to decrypt the bits (for a while, at least). You might well obtain the decryption key via the web, in which case the web handles the transaction, but the heavy lifting of moving large hunks of video around can happen elsewhere if that makes sense.
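Here is a toy sketch of that split between shipping the bits and authorizing their use. Everything in it is an assumption made for illustration: the Provider and SetTopBox classes, the "paid" flag, and above all the cipher, which is a homemade hash-based keystream and nowhere near real "heavy encryption mojo". The point is only the shape of the exchange: the bulky encrypted blob arrives by broadcast ahead of time, and the on-demand request fetches nothing but a small key.

```python
import hashlib
import secrets


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key + counter. NOT secure, illustration only."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR with the keystream; the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


class Provider:
    """Hypothetical content provider: broadcasts ciphertext, sells keys."""

    def __init__(self):
        self._keys = {}

    def publish(self, title: str, video: bytes) -> bytes:
        key = secrets.token_bytes(32)
        self._keys[title] = key
        return xor_cipher(video, key)  # this is what goes out over the air

    def authorize(self, title: str, subscriber_paid: bool):
        # The "on-demand" step: a few bytes of key, not gigabytes of video.
        return self._keys[title] if subscriber_paid else None


class SetTopBox:
    """Hypothetical box: stores whatever is broadcast, decrypts when allowed."""

    def __init__(self):
        self.library = {}

    def receive_broadcast(self, title: str, blob: bytes) -> None:
        self.library[title] = blob

    def watch(self, title: str, key: bytes) -> bytes:
        return xor_cipher(self.library[title], key)
```

In use, the box would call receive_broadcast whenever bandwidth happens to be available, and watch only after authorize has handed over a key.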

Such a scheme would also have all of the usual data-protection problems, notably including analog conversion, but that comes with the territory. Whether it's less secure than an existing on-demand service depends on just how good the double-secret heavy encryption mojo is. I can see content providers being nervous about putting the keys to the kingdom directly in the hands of millions of subscribers, however well protected the data may be. But on the other hand, isn't the whole point of mass media to get the bits to as many people as possible?

Sunday, March 23, 2008

Digital vs. interactive vs. internet vs. broadcast

There's a lot I don't know about the brave new world of electronic media, but a few things are starting to stick in my brain:

Video is the 500lb gorilla. Everything else we currently do takes much less space and bandwidth.
Unless you have fiber or something similar, video is pushing the limits of your bandwidth (*)
Even if you do have fiber, you can use up large chunks of it with bigger and shinier video, or more channels, or better-than-real-time downloads.
TV in the US is set to go digital in the next couple of years. This frees up large chunks of bandwidth for other uses (**).

That last chunk of bandwidth is primo stuff. It was originally allocated to TV exactly because it's suited to transmitting video images into buildings and across wide swathes of rural land.

Now let's look at video in the internet world, as opposed to old-fashioned TV with rabbit ears (Dad, what are "rabbit ears"? Well, in the olden days TVs had antennas sticking up on top of them ...):

It's digital. That means, subject to various legal and technical restrictions, you can make exact copies of it, save it, view it and edit it with a plain old home computer.
It's available on demand (or at least, upon polite request with money attached). You don't have to wait for your show to be on.
It can be interactive. You can do video conferencing, for example.

The interesting thing is that none of these requires the internet. It's been possible for years to record video digitally with a camcorder, make your own DVDs, and so forth. You can (again subject to legal and technical restrictions) digitally record TV programs received via cable, satellite or over the air. Video on demand is a standard feature of cable service. Video conferencing can be done (and in some cases is done) via closed circuit.

This is not to say that the internet can't add value, for example by making video conferencing available to everyone instead of just to parties who happen to have the infrastructure for it. The point is that much of the goodness of new media is not strongly tied to the net, nor is it, as is sometimes implied, a direct consequence of the net being involved.

The internet is inherently point-to-point. Your basic internet packet has a single destination address. You can also specify broadcast and multicast addresses, but those require extra thought and machinery to handle efficiently. The internet is also symmetrical in structure. Anyone can send to anyone. The other party might not listen, but that's a different story.
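To make the three kinds of destination concrete, here is a small sketch using Python's standard ipaddress module. The classify function and the example LAN are my own assumptions, and it only considers IPv4, where a subnet's broadcast address is the last address in its range.

```python
import ipaddress


def classify(destination: str, network: str) -> str:
    """Label an IPv4 destination as unicast, broadcast or multicast.

    `network` is the local subnet the packet would be sent on.
    """
    addr = ipaddress.ip_address(destination)
    net = ipaddress.ip_network(network)
    if addr.is_multicast:              # 224.0.0.0/4 in IPv4
        return "multicast"
    if addr == net.broadcast_address:  # last address of the subnet
        return "broadcast"
    return "unicast"                   # the ordinary single-destination case


lan = "192.168.1.0/24"  # hypothetical home network
for dest in ("192.168.1.42", "192.168.1.255", "224.0.0.251"):
    print(dest, "->", classify(dest, lan))
```

Telling the address types apart is the easy part. The extra thought and machinery go into getting routers to deliver the last two efficiently beyond the local network.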

Broadcast TV, on the other hand, has a very different structure. One party broadcasts to many. It does so very efficiently and scalably. Someone puts up a tower. As many people as want to can buy receivers. This can be and is being done for digital video as it was for analog. Traditional cable TV -- everything but the on-demand part -- is somewhat less easily scalable, but it still has the advantage of having only one fixed route: from the broadcast center out to the viewers.

Why should we persist in using old distribution machinery when we've got the infinite flexibility of the net? For one thing, people remain interested in live TV coverage. If I want to watch the NCAAs and a million other people do as well, then why bother the internet backbone about it when the network can just beam the games up to a satellite or broadcast them over the cable network?

The one-way broadcast model still works just fine in cases where you might well want digital content, but don't care about interactivity. In fact, it should generally work better, and using it offloads traffic that would otherwise have to go through the backbone.

Which leaves me to ponder what really ought to end up happening to all that prime broadcast bandwidth.

(* A standard DVD plays back about 4 hours of video from about 8GB of data, or about half a megabyte per second. This is curiously close to the bandwidth of my cable internet connection, and in fact video full-screen comes across more or less OK, but not dazzlingly well. It's interesting to note that the same cable can also keep at least 3 tuners happy showing and/or recording TV shows with good fidelity. It's almost as though cable companies are in the TV business, but I digress ...)
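The arithmetic behind that "half a megabyte per second" figure, as a quick check. It assumes decimal units (1 GB = 10^9 bytes) and takes the round numbers above at face value.

```python
DVD_BYTES = 8e9              # about 8 GB of data (decimal gigabytes assumed)
PLAYTIME_SECONDS = 4 * 3600  # about 4 hours of video

bytes_per_second = DVD_BYTES / PLAYTIME_SECONDS
megabytes_per_second = bytes_per_second / 1e6
megabits_per_second = bytes_per_second * 8 / 1e6

print(f"{megabytes_per_second:.2f} MB/s, {megabits_per_second:.1f} Mbit/s")
# prints "0.56 MB/s, 4.4 Mbit/s"
```

So DVD-quality video needs roughly 4.4 megabits per second, sustained, per stream.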

(** Most of it was bought up by Verizon, with Google only putting in a minimum bid. More could be said ...)

Friday, March 21, 2008

Social networking: Business or feature?

The subhead of the Economist story "Everywhere and Nowhere" says it all: "Social networking will become a ubiquitous feature of online life. That does not mean it is a business."

It goes on to compare social networking with email, which began as a closely-guarded service within "walled gardens" (or "fiefdoms", if you prefer) and is now an open, free service that companies like Google and Microsoft use as a loss leader to support their real business.

Along with the laundry list of ways social networking is not working out quite as well as one might expect (or stock valuations might suggest), it points out an interesting twist: A lot of the social networking goodies -- whom you contact, how often and in what context -- are already hiding in plain sight, in your email database and address book.

It even mentions OpenID. Good stuff.

Friday, March 14, 2008

The Economist on the soul of Wikipedia

Lots of fun stuff in The Economist's technology quarterly, including a piece on "The Battle for Wikipedia's Soul".

Some time ago I came away from a debate with ~~an anarchist~~ a libertarian friend with the conclusion that, for better or worse, government is just something that people tend to do. Wikipedia seems a perfect case in point.

Wikipedia started like any wiki, lean and mean and (pretty much) free for all. Over time, however, it has developed rules, customs, social groupings and hierarchies just like any other society.

Also in common with governmental forms of other societies, these have taken on a life of their own. The article quotes a 2006 estimate that entries about governance and editorial policies were the fastest-growing segment and comprised around a quarter of the total content.

I'm curious as to how this was reckoned. Wikipedia claims over 2 million English articles, and even if most of these are rather small, it's hard to believe the WP: space is as big as half a million randomly chosen articles. I'm guessing the figure includes talk pages. In any case, the larger point stands: The Wikipedia community devotes significant resources to governing itself.

One major point of discussion, probably the major one, is what gets in and what stays out. There are two schools of thought. Inclusionists prefer to include as much as possible. Deletionists try to eliminate frivolous or badly-written material.

The heart of the problem is that there are no hard-and-fast rules for deciding what's worthy and what's not. Bad articles are like obscenity: you might not know what it is, but you know it when you see it. And different people see it differently. In the absence of consensus, judgment comes into play, and with that, the question of who does the judging. There's simply no way to decide that will leave everyone happy.

Is this a problem? Not necessarily. Such imperfection is part of every human system I'm aware of. The more important question is how to deal with that imperfection. If there's an epic battle between inclusionists and deletionists, as opposed to just a normal give-and-take, the question is not who will win, but what damage the battle will do to the system as a whole.

Wednesday, March 5, 2008

UI IQ

You need to talk to someone at FooCorp about an account of yours. You call the customer service number. The system asks you for a bunch of magic numbers. You give them. Listen to the lovely on-hold music. Someone answers. Turns out you need to talk to the Quux department. So they patch you through. The Quux department asks you for the same bunch of numbers you just gave ...

No one likes explaining the same thing twice. Whenever this happens, I always say "I just gave you that number." It never helps.

What we call "intelligence" (natural or otherwise) comprises a large number of different abilities, some of them better understood than others. One simple ability, and one that you'd expect computerized systems to be very good at, is remembering what just happened.

Web interfaces got a lot smarter when cookies came into use, because page B had a way of knowing what you said on page A. Not only do you not have to retype information that both pages happened to need, but, slightly more subtly, page B could be tailored based on what happened on page A, or the last time you accessed page B, or any of a number of similar bits of history (cookies aren't the only technique for doing this, but they're often a good one).
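A minimal sketch of that handoff, using Python's standard http.cookies module. The two page functions and the "phone" cookie name are made up for illustration, and there is no real web server or browser here, just the header strings they would exchange.

```python
from http.cookies import SimpleCookie


def page_a_response(phone: str) -> str:
    """Page A: remember what the user typed by setting a cookie."""
    cookie = SimpleCookie()
    cookie["phone"] = phone
    return cookie.output(header="Set-Cookie:")  # header line sent to the browser


def page_b_greeting(cookie_header: str) -> str:
    """Page B: read the cookie the browser sends back, if there is one."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    if "phone" in cookie:
        return f"We already have your number: {cookie['phone'].value}"
    return "Please enter your phone number."


# Simulate the browser: keep what page A set, hand it to page B.
set_cookie_line = page_a_response("555-0100")
sent_back = set_cookie_line.split(": ", 1)[1]  # strip the "Set-Cookie: " prefix
print(page_b_greeting(sent_back))
print(page_b_greeting(""))  # a visitor page A never saw
```

Page B never talks to page A directly. The browser carries the memory between them, which is why this works even when the two pages share nothing else.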

This doesn't always work perfectly, but there are some fairly non-infuriating sites out there. And some infuriating ones as well.

The phone system I described isn't dumb because it's a phone system and not a web system. It's dumb because the Quux department is a separate organizational unit from the main switchboard and they don't talk to each other.

The same integration problem crops up when page A and page B belong to different units, or different organizations. The (at least potential) advantage of the web approach is that you can integrate the two together from the outside without either of them knowing that they're being integrated. This can be as simple as a browser plug-in that knows your phone number and makes it easy to fill it into a field marked "phone number".

In the phone case, the analog would be an automated agent that listened in to the conversation and spewed back the numbers that it overheard (or already knew) so you didn't have to. And maybe stayed on hold for you and called you back when it had gotten through. Since the web is inherently in computer-digestible form, the web has a natural advantage for this sort of thing.

When this kind of integration works, the system becomes smarter. That's not just a metaphor. A system with a better memory is smart for the same reason that a dolphin is smarter than a goldfish (or for one of the reasons, anyway).

Sunday, March 2, 2008

Naming the social component

I've previously argued that what we call "social networking" comprises (at least) two major parts: A graph of network connections, with access control, and a social aspect, through which people identify themselves as belonging to various groups. Here are some possible names for the social aspect:

Social Aspect or Social Component: Why not? Not snappy, but descriptive.
Totem: Earl suggests this in a comment. Webster's defines it as an entity that watches over or assists a group of people. The term has spiritual origins, but so do daemon and avatar, for example.
Brand: If totem has more purely human connotations, brand is more purely commercial (notwithstanding that commerce is a very human activity). The term is already in use; commercial sites care very much about their brands.

There probably isn't any one correct term, and if there is one, it may not be any of the above. I can't think of many cases where someone would say "X is my brand", but on the other hand business people use the term freely and appropriately. For that matter, I'm not sure I can see people saying "X is my totem," but you never know what might catch on.
