Monday, June 29, 2009

Stealing the pot

This one is old news, but it touches on two themes of interest here. The story is the online poker cheating scandal of 2007.

Under not-so-disruptive technology, consider the very existence of online poker. Many people want to gamble. Governments tend to regulate gambling. There's a lot of money in gambling. Ergo, there is a strong incentive to exploit any available gray area to get a game going, for example by running it in an area where no particular government has clear jurisdiction. This has traditionally included rivers and the high seas, but the internet will do nicely.

Under anonymity, consider how the cheating happened and how it was detected. In online poker, as with message boards and other online games, you go by a handle. Someone named, say, Potripper could be anyone. Your next-door neighbor, a dentist in Saskatchewan, the prime minister of a G8 country, anyone. In this particular case, Potripper was someone with special privileges who could see all the cards on the table. This was of considerable help in knowing when to hold 'em and when to fold 'em.

Back under not-so-disruptive technology, the reason Potripper could get away with this is that the game was operating in a legal gray area, but players, at least once they'd become comfortable with the setup, assumed the game was on the level. Most of the time it was, but most of the time isn't all of the time. Again, this is not new with the internet. It's very, very old.

And finally, back under anonymity, how did the cheats get caught? They got greedy (strictly speaking, this should also go under not-so-disruptive). Players started to notice that some accounts were playing suspiciously. Experienced poker players soon learn to play conservatively most of the time. Potripper was making wild bets, bets that should have failed often enough to lose significant money. But Potripper's big, reckless bets never managed to lose.

You could see it on a scatterplot. Everyone else at the table was in the same general area. The good players were in positive territory, but not very far. The poorer players were in the negative territory, but not very far. Potripper was way, way off from the norm. About fifteen standard deviations off. The odds this would happen by chance were beyond minuscule.
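
To get a feel for what "standard deviations off" means, here is a minimal Python sketch. The handles and win rates are invented for illustration and are not data from the actual case. The measure used (distance from the mean of the other players, in units of their standard deviation) is only one simple assumption about how such a comparison might be made.

```python
from statistics import mean, stdev

def deviations_from_field(results, handle):
    """How many standard deviations `handle` sits from the other players' mean."""
    others = [value for name, value in results.items() if name != handle]
    return (results[handle] - mean(others)) / stdev(others)

# Hypothetical win rates (say, big blinds won per 100 hands); not real data.
win_rates = {
    "player_a": 4.0,
    "player_b": -3.0,
    "player_c": 1.0,
    "player_d": -2.0,
    "player_e": 0.0,
    "suspect": 40.0,
}

for name in win_rates:
    print(f"{name:10s} {deviations_from_field(win_rates, name):+7.2f}")
```

With numbers like these, the honest players all land within a fraction of a standard deviation of the field, while the one runaway account is off by well over ten.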

Anonymity requires cover. For a person to remain anonymous, there need to be plenty of people who could possibly be that person. Looking at a scatterplot, there was no way to single out a particular skillful or unskilled player. If the cheat had been content with winning a little here and a little there, sometimes losing a bit, it would have been much harder to detect that there was something amiss. But when there's one, and only one, data point in the "so unlikely it's not even funny" area, it's dead easy to identify Potripper with the cheat.

Matching the handle to the cheating was easy. That left matching the handle to the person, and for that the players investigating caught a break. In response to a complaint, the poker site sent out an exceptionally detailed history of the game play, one that included IP addresses of the participants. That linked up behavior, handle, and IP. The IP was owned by the poker site.

The claim was made that a consultant had cheated "to prove a point". Well ... white hats do routinely try to exploit systems with the intent of passing the information on to the interested parties, for example in the case of MD5 SSL certificates. Once they find the weakness, they make a concerted effort to ensure that it doesn't get exploited for ill. The cheats didn't exactly do this. They instead used a number of handles over the course of months or years to steal millions of dollars. So no, sorry. That's not proving a point. That's out-and-out fraud [I should point out that I take no position here as to who was defrauding whom, but clearly someone was defrauding somebody].

Had they been less greedy, they would probably never have been caught. Conversely, less greedy cheats (or the same ones, having toned it down) may still be at it. Caveat bettor.

Wednesday, June 24, 2009

Still not twittering

I recently said that Twitter was a crucial part, but not the only crucial part, of getting information out of Iran (and in similar situations). Since then, this has only been reconfirmed. Twitter, FaceBook, YouTube, Flickr, the blogosphere and other "new media" have played a central role in events.

So why don't I have a Twitter account? It's a simple case of the general vs. the particular. In general, Twitter has been highly useful (so far, the Iran story still seems to get more traffic than Celebrity du Jour). But I still don't see a particular need. I just don't have a pressing need to send short messages to an indeterminate group of interested people.

If I want to send a short message to a co-worker, I walk over and tell them, or send an email. In a previous job, not everyone was in the same office, so we used IM a lot. If I want to send a short message to a personal friend, I call them or email them. If I want to fire an arrow into the virtual air for anyone to catch (watch that sharp point -- this is what comes of mixing metaphors), I write a blog post. I find I have time to do that every few days (the baker's dozen was a bit of an anomaly).

Your mileage may vary. If you're a news provider, a minor celebrity, a street protestor, someone with an active online social life, or probably many other kinds of person, it does vary. So far, though, I haven't found myself in any of those groups.

[I don't recall if I titled that post before "tweet" was standard, or was too indifferent to know that it was standard, or was deliberately twitting the tweeters, but in any case I still don't tweet, for pretty much the same reasons as given here --D.H. Jan 2016].

Tuesday, June 23, 2009

Why are there blogs in print?

A while ago I asked "Why is there still print?". Asked, but didn't really answer. If books are a test case for going digital (again leaving aside that text is fundamentally digital), then a blog, which is already available online for free, has got to be the acid test. Why on earth would one put a blog in print?

In Conversational Reading, a blog in form if not name, Levi Stahl examines the role of self-publishing (that is, print self-publishing) in putting blogs in print and tells why he has bought not one, but two self-published books adapted from blogs. Essentially, there's something about a book.

It's nice to have an index in the print edition, though.

Wednesday, June 17, 2009

Your blog post published successfully! Now look at these ads.

Somewhere along the line, Blogger has started advertising to bloggers. Often when I publish a post now, I see a prominent box in what used to be a nice, empty space to the right of the page. In the box are a handful of Google ads driven by the content of the just-published post [The box has been there for quite a while, but it's only lately that I've seen ads in it. This may be because I happened to post on a couple of topics people are actually interested in. I promise it won't happen again.].

Google itself may be a brilliant example of "dumb is smarter", but Google AdWords are generally evidence that sometimes dumb is just dumb. That's probably not just due to AdWords as a technology, but to the business model of selling words to whoever will buy them up, and to some advertisers' practice of buying up anything and everything. Put it all together and Lucky's speech in Waiting for Godot can start to look strangely coherent.

I'm only slightly annoyed at seeing my nice pretty blank space sullied by ads. What strikes me more is that Google has determined that bloggers are a market. This seems obvious in retrospect, as there are millions of us, but it wasn't supposed to work that way, was it? Put up a blog, watch millions of people read it, clean up by selling ads. Wasn't that the story? Come to find out that in a world where there are hordes of writers and no guarantee of readers, going after the writer seems the more sensible approach. Ironic, dontcha think?

[Blogger stopped doing this long enough ago I don't remember.  Probably not too long after I posted this, not that I think that had anything to do with it --D.H. May 2015]

Spartacus in Iran

Anonymity requires cover -- people who could plausibly be the anonymous person, but aren't. I've called this the "I'm Spartacus" effect, and real researchers have studied it more rigorously.

One of the most-repeated tweets regarding the Iran election is a plea for everyone to change their profile, time zone, etc. to say that they're located in Tehran, in order to provide cover for people who really are.

I wouldn't expect this to be completely effective. I'm sure there are ways for the authorities to track down the source of tweets before they make it to the Twitter servers, just as there are countermeasures to avoid detection at the source. However, once the tweets reach the server and are redistributed, it probably does make the job somewhat harder for the authorities. At least they have to read through more tweets to decide which ones are likely to be home-grown.

If nothing else it provides a way of showing solidarity, on a par with tinting one's icon green.

Twitter in Iran

For obvious reasons, there has been tremendous interest in current events in Iran, events which, no matter how they play out, will surely be remembered for generations to come. As the regime has deported non-resident journalists and put resident journalists under strict controls, very little information is getting out through the major media outlets, leaving only good old fashioned samizdat. In days past, the mimeograph was a technology of choice. Today it's Twitter.

This being a geek blog, I'm not going to comment or speculate here on the events in question. However, I did want to comment on the experience of trying to follow the story via Twitter, since it's such a clear real-world test.

A while ago, trying to take on the universal question of "Just what is Twitter?" I concluded it was a lot like a headline crawl. Following every single tweet on a topic like #IranElection is essentially impossible. By the time you even finish the first dozen tweets, dozens more have come in. However, a lot of those are repeats of basically the same message. The net effect if you read a page, refresh and repeat is a list of mostly familiar messages interspersed with the occasional new one, much like the crawl at the bottom of a news channel's screen.

Now, the story all over the headlines, duly echoed back onto Twitter, is that Twitter has been one of the few ways of getting information out, to the point that the US State Department specifically asked Twitter to postpone scheduled maintenance so as not to interrupt the flow. Evidently State was relying on it as a significant source. Twitter as an information smuggling device is not news. It's been used for such purposes since early on. That level of government involvement and the sheer size of the story are newsworthy.

The slant on such stories, at least as filtered back through Twitter, is that new media such as Twitter (and YouTube, FaceBook, Flickr and others) have supplanted the old. You can't count on old media. New is the only way.

On the one hand, it's undeniable that Twitter and company are providing information not available via other means. But there's a risk here of conflating two independent factors: the news-gathering model and the technology actually used in gathering the news. The "out with the old, in with the new" story line is too simple. In fact, the traditional outlets have been right on top of this story, doing conventional news-gathering but using new technology.

Unfiltered Twitter is a mess. An exuberant and exciting mess, but a mess nonetheless. I've already noted that the feed from a big story is impossible to follow in real time. Compounding this, rumors run rampant and there's no way for a newcomer to tell whether a given tweet is likely to contain information, misinformation or even disinformation. For example, a great many people tweeted that the BBC had "gone green" in support of the Mousavi campaign. In fact, the site had always been themed in green (and it would have been completely out of character, to say the least, for the Beeb to take sides).

It's also nearly impossible to tell in real time whether a particular piece of information is current. For example, one video circulated, of a clash between demonstrators and police, turned out to be two years old. In neither of these examples do I have any reason to believe anything nefarious is going on. In the heat of the moment someone noticed something, passed it on, etc. etc. The self-filtering nature of wikis is not really helpful here. That takes a while to stabilize.  Amidst an ongoing chatter of claims and counterclaims in real time it's not a factor.

This is where the conventional media come in. An experienced reporter sifting through all the volume, collating, extracting, summarizing and verifying turns a mess into a coherent, reliable story. Several major outlets are doing just that, and I've checked several in the course of following this story.

Again, Twitter and company are a crucial part of the process here. My point, part of my general push back against the idea that technology changes everything, is not that they aren't, but that the basic game of government repression, information smuggling and established media outlets digesting the smuggled information is quite old. The technology has changed greatly, the game little.

Tuesday, June 16, 2009

Baker's dozen: Summing up (for now)

Trying to wrap up the Topic That Ate My Blog, I see that this is indeed the 13th post in this series. That ought to be enough for now. What have we learned, then?

I would categorize the search sites I've looked at roughly thus:
  • More of the same, possibly with shinier chrome. This would include Ask, Powerset, Bing, Cuil and, for that matter, good old Google, which has not been standing still.
  • Attempts at pure crowdsourcing (everyone uses it to some extent, if only by pulling in Wikipedia). This would include Wikianswers and WikiAnswers.com.
  • New approaches. This would be Alpha and True Knowledge (which also incorporates a significant crowdsourcing element)
I'm aware of at least a couple of others I mentioned only in passing and didn't investigate in depth: Kosmix and Freebase (I'm still getting over that name). Powerset appears to pull in Freebase when it can. That happened only once for the baker's dozen, though.

How well do they work? I was going to try to come up with some sort of numerical scale based on how much effort it took to get the answer from the results. A clear answer directly on the search result page would count highest, a link to a clearly relevant page fairly high, plausible links less so. Irrelevant pages would count slightly negative. Then I remembered that I have no methodology and inventing one after the fact would be cheating.

So here's a subjective rating, based on the baker's dozen (and not, for instance, on running a side-by-side comparison for a week):
  • Google and Ask are roughly interchangeable. Powerset/Bing seems to do about as well. They're all singles hitters. They generally connect enough to earn their keep.
  • Alpha is a power hitter. It strikes out a lot, but when it connects, it almost always hits one out of the park.
  • True Knowledge is interesting, but incomplete. It remains to be seen how well its knowledge base will fill up and to what extent it will remain reasonably self-consistent.
  • I can't use Cuil without thinking of Celebrity Jeopardy on SNL.
Considering that the baker's dozen were aimed away from the conventional engines' strengths, they performed remarkably well. Several years ago I used Google side-by-side with (I think) AltaVista and Yahoo! for a period of time to see if I might want to switch to it. Google clearly won on relevance of results. I'm not seeing enough clear difference now, either in relevance of results or overall user experience, to suggest doing a similar trial. As always, though, I may be missing something.

I do intend to keep Alpha in mind whenever I'm looking for a clear, quantitative answer, just as I go directly to Wikipedia if I know the specific topic I'm looking for.  [I still use Alpha, mostly for calculations --D.H. Jan 2016]

Finally, I think it's also worth reiterating that crowdsourcing content works well. Wikipedia turns up all over searches and in many cases you can skip the search and go directly there. Open content and a well-tuned search engine make a winning combination. A finely-tuned index to a proprietary database (not just Alpha, but any of the map/driving directions sites or any of many, many other specialized sites) also works. Trying to crowdsource the search itself seems less promising.

Baker's Dozen: How many cities?

While I was putting together the previous post, on crowdsourcing, I tried to look up how many cities there were in the US (for some reasonable definition of "city"). This seemed right up Wolfram Alpha's alley, so I tried "How many cities are there in the US?" Alpha did something I hadn't seen it do before. It answered the question, but not well.

But at least it's pretty clear where it went astray. For whatever reason, it assumed I was interested in the largest cities in the US and gave me the top five. There was a "more" link, but that just expanded the list to the top 10. Who knew only nine have over a million people? (San Jose is tenth with about 900,000)

Unlike True Knowledge, Alpha chooses not to clutter its display with a list of web hits. That's probably why I missed the "web search" button the first time around. Chasing that takes me to Google. The top hit is WikiAnswers. The answer states that "Because many towns are considered counties in some States it gets complex" and points me at City-Data.com, which is supposed to have all of them. Maybe it does, but it doesn't seem to answer the question directly.

OK, where did Alpha get its information then? Apparently from a variety of sources, including the CIA fact book and the US Census Bureau. So maybe look there. There's also a link to the Wikipedia article, and visually skimming that I see that "In 2006, 254 incorporated places had populations over 100,000."

That's a decent answer. 100,000 is a commonly used if somewhat arbitrary cutoff point for cityhood. It wasn't really what I was after, though. I was looking for places one might use in a query of the distance from place A to place B, and I would expect that plenty of places with under 100,000 people would qualify. But that'll have to wait.

So ... Alpha to Google to WikiAnswers to a dead end, Alpha to Wikipedia to a plausible answer; two pointers toward the US Census, which is where I would have gone looking if I hadn't had search engines to guide me.

Friday, June 12, 2009

Baker's dozen: Crowdsourcing

As we've seen, getting a computer to understand a simple English question is not necessarily easy. People, on the other hand, are reasonably good at the task. So instead of trying to get a computer to answer a question, why not use the computer purely as a means of communication in order to connect a question with someone's direct answer? Two efforts along those lines come to mind.

The creation of Wikipedia founder Jimmy Wales, Wikia Search officially folded its tent last month. Naturally, Wikipedia has an article on the topic, not all of which has quite made it into past tense. The Wikia search site now redirects to Wikianswers, not to be confused with WikiAnswers.com, which I'll get to.

The first question of the baker's dozen to get an answer other than "This question has not been answered." is number 6: Who starred in 2001? This gets a "Magic answer", presented in a curtained frame with black background and a magician's top hat in one corner. The answer is attributed to Yahoo! answers and begins "It is an excellent movie. I give it four stars out of 5." The title of the movie is nowhere mentioned, but it appears to have starred Nicole Kidman and have been set during "gee umm WWI or WWII". A couple of minutes on IMDB identifies the film as The Others. Curiously, the more specific question Who starred in 2001: a Space Odyssey? gets no answer.

I also got a magic answer from Yahoo! on Who invented the hammock? and this time it's relevant: the hammock "originated in Central America more than 1,000 years ago." There seem to be two schools of thought on this one: Central America and Amazon basin. I say it was Colonel Mustard in the library with a lead pipe.

WikiAnswers.com is much the same beast as Wikianswers but commercial and -- according to Wikipedia -- more heavily trafficked. The results are not particularly different from those of Wikianswers, but it does answer How far is it from Bangor to New York?

Going a bit further afield, what about using Twitter as a search engine? If you've got a question, send it out as a tweet and see what comes back. There has apparently been some buzz about this concept, and indeed it's one of the options Wikianswers (the first one, not WikiAnswers.com) gives if it can't answer a question. Farhad Manjoo offers a contrasting viewpoint on Slate.com. The gist, if I understand aright, is that in order to sort through the responses, you need a real search engine, so why not just hook Twitter up with an existing search engine and be done with it?

All in all, crowdsourcing doesn't seem to deliver great results here. Why would that be?

Crowdsourcing, at least the free and open Wiki-style variety, depends on each person being able to get more out than they put in. This is possible because information is not consumed, only used -- if you learn something from a source, that doesn't prevent someone else from learning something from it later. It's also possible because sharing knowledge can be its own reward, but I suspect that's a smaller factor.

The classic case is Wikipedia. If 10,000 people read an article, and only 1/10th edit it, and only 1/10th of those edit it in a substantially useful way, you've still got a hundred people working on the article. Naturally I'm making up those numbers, but real experience suggests something of the kind is at work.

Single, discrete answers are not the same as in-depth articles. For example, suppose there are 10,000 places of interest. There are then 100,000,000 questions of the form "How far is it from X to Y?" You can get rid of the 10,000 cases where X and Y are the same and half of the rest because it's just as far from X to Y as from Y to X, but that still leaves about 50,000,000 possible questions.
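
For anyone who wants to check the arithmetic, here is a short Python sketch of that count. The only input is the 10,000 figure from the paragraph above; the variable names are mine.

```python
from math import comb

places = 10_000

ordered_pairs = places * places    # every (X, Y), including X == Y
distinct = ordered_pairs - places  # drop the cases where X and Y are the same
unordered = distinct // 2          # X-to-Y is just as far as Y-to-X

print(ordered_pairs)  # 100000000
print(unordered)      # 49995000

# Same thing as "10,000 choose 2".
assert unordered == comb(places, 2)
```

The exact figure is 49,995,000, which is where "about 50,000,000" comes from.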

The odds of any particular question coming up more than once will depend on the prominence of the places. It's quite possible that many people will be interested in how far it is from LA to New York, but if I'm doing a tour from Schenectady to Poughkeepsie to Paducah to Tehachapi to Tonopah, I'm probably not going to find that someone else has already asked and had answered those particular combinations.

If I keep striking out asking questions, why should I go to any trouble to pass along the answers I finally do dig up elsewhere? The canonical answer is for the good of the wiki as a whole, and more selfishly to improve the odds I find my answer next time on the assumption everyone is doing likewise. But if I can generally find the answer without the wiki, why do I care whether the wiki can also answer it? Wikipedia wins because it gathers information that's not readily found in one place elsewhere.

On the other hand, a map database, once it's learned the 10,000 places and the routes between them, will gladly answer any and all distance queries with equal ease.
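
A rough Python sketch of why that works: store one small record per place, and any pair can be computed on demand. This is only an illustration under stated assumptions. The coordinates are approximate, the `PLACES` table and `distance_km` function are hypothetical names, and the haversine formula gives straight-line (great-circle) distance, not the driving distance a real map site would work out from its route data.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

# Approximate (latitude, longitude) in degrees; for illustration only.
PLACES = {
    "Bangor, ME": (44.80, -68.77),
    "New York, NY": (40.71, -74.01),
    "Dallas, TX": (32.78, -96.80),
}

def distance_km(a, b):
    """Great-circle distance between two named places (haversine formula)."""
    lat1, lon1 = map(radians, PLACES[a])
    lat2, lon2 = map(radians, PLACES[b])
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))

# Any pair works, whether or not anyone has asked about it before.
for a in PLACES:
    for b in PLACES:
        if a < b:
            print(f"{a} to {b}: about {distance_km(a, b):.0f} km")
```

Three stored places answer three distinct questions; 10,000 stored places would answer all of the roughly 50,000,000, with no one having to ask first.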

Not every potential question for a crowdsourced engine has the odds stacked so strongly against it. Probably lots of people want to know celebrity du jour's birthday. Unfortunately, that's just the kind of information that's fairly easy to track down with existing tools.

The True Knowledge experience showed another potential problem. Making information easy to find means indexing it, and indexing is a different beast from asking questions. Wikipedia, for example, provides two basic means of structuring information, as distinct from just typing it in: categorizing (tagging) it and organizing the body text into articles, sections, subsections etc. The results are not perfect, but they're very helpful and probably about as much as we can expect from the crowd. Trying to have the crowd too intimately involved in the mechanics of a search mechanism itself is probably not a good fit.

On the other hand, crowd-generated content is great. A large portion, though not 100%, of the web is crowd-generated. As a result, just searching Wikipedia often works well. I prefer it when the result I'm after is something like an encyclopedia article. Along with its take, Wikipedia will provide links to sources and if that's not enough I can still Google. I'll use Wikipedia's native index if I know the particular topic (or can get close). Otherwise I use Google and happily read any relevant Wikipedia articles that show up.

This seems a good division of labor. People write the content and machines search and collate.

Baker's dozen: Cuil

I don't know much about Cuil. I remember hearing about it a while ago. I remember hearing that it's pronounced like kewl, but my brain keeps wanting to parse it as French (maybe kweel?) or with the difficult-for-English-speakers-to-pronounce Dutch ui (putting it somewhere between cowl and Kyle) [Actually, it's cool, according to Wikipedia, which also goes to some length discussing the name.]

But can it answer questions?
  • How much energy does the US consume?
Right off the bat there are some noticeable differences. Google, Ask, Powerset and Bing all find a gazillion results and present the top few in a list. Cuil finds 195 results for this query and presents the top few in a page laid out more like a newspaper. It looks nice. There are images interspersed with the text, which livens things up nicely. The small result set is something one could reasonably expect at least to skim entirely if need be.

Unfortunately, it doesn't look likely that any of those results actually answers the question. The top two hits cover how much power a small transformer and a PC consume, and by the bottom of the page we're off into the weeds on O'Reilly publishing's O'Reilly Radar blog regarding, um, RNA research or something. Not promising.
  • How many cell phones are there in Africa?
OK ... I see stuff about Africa. I see logos for various cellular service providers all over the place. The first hit looks like it might be relevant, but it's really a blog. The second post on the blog page mentions Africa. The fifth mentions cell phones.

I'm not going to wade through the other 108 results. There's absolutely nothing to suggest, for example, that whoever wrote "How many? Who knows, but enough so that you can find the one that's right for you. To get your search started, here are six types of meditation you can try." is going to take a sudden side trip and answer my question.
  • When is the next Cal-Stanford game?
This time 156 results, and the front page, at least, seems much more focused. On the Big Game in general. Nothing in particular on when the next one might be. To be fair, no one else could answer this one either.
  • When is the next Cal game?
Just when you think you know what's going on ... This time there are 443,932 results. The right-hand side, previously empty, now contains a neatly categorized assortment of links. Hover over one and you see a summary of the results there, again elegantly formatted.

The categories are about baseball, with an emphasis on Cal Ripken. The results themselves mention Cal (that is, UC Berkeley) a fair bit, but nothing to tell me when the next game might be. Again, though, no one else got this one either.
  • Who starred in 2001?
This gets 5,937 results. Again there are categories on the right-hand side. My guess now is that the count of results includes those in the categories. No categories, low count.

The hits themselves are the now-familiar mishmash of stuff about movies and the year 2001.
  • Who starred in 2001: a Space Odyssey?
Everybody else got this. Cuil gives 10 results. One tells me that the late Roy Scheider starred in the sequel, 2010. One is a link to WikiAnswers. The word 2001 does not appear on the linked page. There's a nice-looking image of the space-suited astronaut right next to the snippet. Who was the actor in that scene, anyway? Hmm ... wish I had a search engine to answer that.

I was going to stop after the cell phones question, but I am now caught up in the horribly compelling spectacle. Onward ...
  • Who has covered "Ruby Tuesday"?
Everybody but Alpha got this on the first hit (allowing for True Knowledge's fallback to conventional search), and Alpha at least had the courtesy to tell me it didn't know. Cuil apparently lives in a parallel universe where "Albany Hotel Near Crossgates Mall" is an acceptable answer.
  • What kinds of trees give red fruit?
Top of the list is Wikipedia's article on the sweetgum (or redgum -- aha!), Liquidambar styraciflua. The fruit are shown. They do not appear to be red. The other five entries: Wikipedia's list of Emily Dickinson poems; Wikipedia on Robert Michael Ballantyne; a random article on a stinky plant in Curacao with red sap; an article on fast-growing apple trees ... Yes, that's it! We have a winner! Cuil helpfully illustrates it with a photo of some green apples. The list finishes with what appear to be class notes explaining why common names of organisms can be misleading.

I swear I am not making this up.

I have no idea why Cuil chooses this of all times to go all Wikipedia on me, but totally ignores it when it clearly gives the right answer.
  • Who invented the hammock?
There are 1,654 results and no sidebar. Another promising theory shot down. The top one links to a trivia site and asserts that "This, and several other sites, credits Alcibiades, a student of Socrates, as being the inventor of the hammock.". The trivia site itself wants me to register an ID before it will show me the answer. No, thanks.

ID or no, I have more confidence in Wikipedia's answer (legend says Alcibiades, but it's more likely some unknown person in the Amazon basin long ago). The second hit on the list at least agrees on the general land mass, but says it was on the Yucatan peninsula. That seems about as plausible as the Wikipedia version.

The rest of the page adds little.
  • Who played with Miles Davis on Kind of Blue?
Whaddya know. The very first hit gives details for the album, including the lineup.

Frankly, I'm curious to see what it does with the distance questions.
  • How far is it from Bangor to Leeds?
And I wasn't expecting: "No results were found for: How far is it from Bangor to Leeds? If you’ve checked your spelling, you could try using fewer or different keywords to broaden your search."

OK, I can respect that. It would have been a suitable answer several other places as well.

Bangor to New York gets the same response.
  • How far is it from Paris to Dallas?
This gets one (1) hit, from WikiAnswers. The snippet doesn't look promising ("How far is it driving from Sacramento CA to Salt Lake City UT? Popularity: 9. How long does it take to get from Denver Colorado to Paris? Popularity: 9.") but let's drill through. Yep, there are three questions relating to Paris, but nothing about Paris-Dallas.



What to say? Probably best just to let the results speak for themselves.

Thursday, June 11, 2009

Baker's dozen: True Knowledge

In his post on Wolfram Alpha, Mark Johnson mentions True Knowledge as a point of comparison, so naturally that seemed like a good place to try next. TK is supposed to be able to reason inferentially. For example, if you ask "Who are the Queen's grandchildren?" it will be able to find H.M.'s kids, and from them their kids, and thence the answer.

Game on.

TK wanted me to create a beta account, complete with re-CAPTCHA squiggly words and an email verification step, but it went ahead and logged me right in without verifying. A good thing, as I'd meant to give a different address.
  • How much energy does the US consume?
TK answers "Sorry, I don't understand that question." It then wonders if I might be interested in any of a number of recent, utterly unrelated queries, but it also offers a list of standard search engine hits. These don't appear to be the top few hits for the question itself, but rather (I'm guessing) the top hits for several similar questions. It certainly seemed heavier on "how much energy" than other lists I've seen. It's probably not googling for the question verbatim, quoted or not. Hmm ... maybe it's googling for the parts of the question it deems important, something like "how much" energy consume?
  • How many cell phones are there in Africa?
Again it's sorry, but the screen looks a little different. It tells me "It sounds like this is something True Knowledge doesn't know about yet. Most of what True Knowledge knows has been taught to it by people like you." and then goes on to paraphrase the question: "How many mobile phones (a telephone device) are geographically located (completely) within Africa (the continent)?" Interesting. Then follow the standard search engine results, probably based on the rephrased form.

But there's more. Right below the "Most of what True Knowledge has been taught ..." message is a button labeled "Teach True Knowledge about this". Sounds good, so I click the button and try to put in the answer from Wolfram Alpha. The tabs are intriguing, including a time period asking when the fact started being true and a couple that appear to provide a glimpse into the technical workings of the engine. Unfortunately, the "Add this fact" option appears to be grayed out, probably because I'm not a confirmed user.
  • When is the next Cal-Stanford game?
Overall TK seems a bit sluggish. This is the cost of actually thinking about what you're saying. After pondering a while, TK decides it doesn't understand. The answer is similar to the one for "How much energy ..."
  • When is the next Cal game?
Likewise.
  • Who starred in 2001?
Well, it gets partway. In particular, it is able to extract just the kind of information I had hoped it would. Here's what it said:
Sorry. I couldn't find a meaningful interpretation of that. The following objects matched your query, but none of them are recordable media (such as TV series, movies, or radio broadcasts)
  • the year 2001
  • the integer 2,001
  • the length of time 2,001 years
  • the age 2,001 years
You may be thinking of a particular recordable medium that isn't in the Knowledge Base yet, in which case you can help out by adding it
The "adding it" link was not marked in any way (say, by underlining it and putting it in blue, maybe?), but now that I see it pasted in here, I see there's a button for adding it. A couple of button-pressing guesses later, I get
2001 can also be used as a way of referring to 2001: A Space Odyssey, the 1968 science fiction film directed by Stanley Kubrick, written by Kubrick and Arthur C. Clarke. If this is actually the recordable medium you are adding, please click the button below.
Looking good ... and next I get
Here are the facts gathered from your information:
I click the "Add these facts" button. It thanks me.

I retry the question. Same result as before. Most likely the new facts are still rattling through the various caches, or perhaps someone's moderating the input. But if the search succeeds for you later, you'll know whom to thank.

OK, if it's still learning the 2001 -> 2001: a space odyssey link, it presumably knows about 2001 under its full name:
  • Who starred in 2001: a Space Odyssey?
And sure enough, there it is, with sources (Wikipedia) cited and even a chance to disagree with its findings.
  • Who has covered "Ruby Tuesday"?
TK doesn't understand, but it does provide the Wikipedia entry in its list of regular search results. It also appears someone has asked it "How tall is Barack Obama in nautical miles?"
  • What kinds of trees give red fruit?
Likewise (but with a different selection of random questions from other users). As always, the regular search hits are there, so I could always mine that for answers.
  • Who invented the hammock?
This time I am asked to confirm its translation of my question. Toward the bottom of a long list of amusing attempts, including "Who is believed by a significant number of people to be the inventor of Hammock Music, based in Nashville, Tennessee, the label imprint under which Hammock released its initial two recordings, Kenotic and the Stranded Under Endless Sky EP?" I see the more relevant "Who is a key person or group involved in the invention of hammock, a fabric sling used for sleeping or resting?"

This sort of thing is the bane of natural language processing. The more you know about it, the more you appreciate the Google approach's* brilliance in deliberately sidestepping it.

Chasing the link, I find that TK doesn't know, but could I tell it? I'm not going to try to educate it on this one.
  • Who played with Miles Davis on Kind of Blue?
No comprende.
  • How far is it from Bangor to Leeds?
After it asks me which of a long list of Bangors I mean, and I tell it I mean "Bangor, the city in Caernarfonshire, Wales, the United Kingdom," it tells me 182 km. If I add "in miles" to the query it tells me the answer to twelve decimal places. Perhaps it's impatient with me for asking so many questions and knows that spurious precision is a pet peeve of mine.
  • How far is it from Bangor to New York?
This time, instead of giving me a list of Bangors to choose from, it gives a long list of eye-watering rephrases (for example: "How far is it from Bangor, the large town in County Down, Northern Ireland, with a population of 76,403 people in the 2001 Census, making it the most populous town in Northern Ireland and the third most populous settlement in Northern Ireland to The State of New York, the state in the Mid-Atlantic and Northeastern regions of the United States and is the nation's third most populous?"). Fortunately, the one I want is at the top: "How far is it from Bangor International Airport, the public airport located 3 miles (5 km) west in the city of Bangor, in Penobscot County, Maine, United States to the US state of New York?" The answer given is 547 km, or (by my rough-n-ready calculation) about 340 miles.
  • How far is it from Paris to Dallas?
This time, fascinatingly, there is only one choice available: "How far is it from the French city of Paris to Dallas, the place in Dallas County, Texas, USA?" The answer given is 7928km, consistent with everyone else's answer to that particular form. Even more fascinatingly, it knows about Paris, TX. Asking "How far is it from Paris, Texas to Dallas" gives the rephrase "How far is it from Paris, the place in Lamar County, Texas, USA to Dallas, the place in Dallas County, Texas, USA?" and the answer 152km.
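For the curious, the distance answers above look like straight-line (great-circle) figures rather than driving distances. Here is a rough sketch of that calculation, along with the km-to-miles conversion I did by hand. The coordinates are my own approximate values for Bangor (Wales) and Leeds, and the haversine formula is my assumption about the kind of calculation involved, not something TK documents.

```python
import math

EARTH_RADIUS_KM = 6371.0   # mean Earth radius, an approximation
KM_PER_MILE = 1.609344     # international mile


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two lat/lon points, in km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def km_to_miles(km):
    return km / KM_PER_MILE


# Approximate coordinates (assumed, good to a couple of decimal places).
BANGOR_WALES = (53.23, -4.13)
LEEDS = (53.80, -1.55)

distance = great_circle_km(*BANGOR_WALES, *LEEDS)
print(f"Bangor (Wales) to Leeds: about {distance:.0f} km")
print(f"547 km is about {km_to_miles(547):.0f} miles")
```

With these inputs the first figure should land close to the 182 km TK reported, and rounding to whole units sidesteps the twelve-decimal-place problem nicely.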

Wow. That was ... certainly interesting.

Clearly, it's a work in progress. Clearly, it's doing a lot of interesting stuff. Clearly, a lot of thought and effort has gone into it. I certainly commend the team for putting the thing together and letting the public have at it. Considered as a journey, it was easily the most engaging of the sites I've visited so far. Considered as a destination, not so much.

If nothing else, if you're wondering why it's so darned hard to make a computer answer questions "the right way", and why dumb can be smarter, a little browsing around True Knowledge might provide some insights.


* Again, I acknowledge that "Google approach" glosses over a lot of other early work. Google is just the most prominent current exponent of the pure text-based approach as opposed to "semantic" approaches.

Baker's dozen: More on Powerset, Bing and search 2.0 in general

At this rate, there'll be a baker's dozen baker's dozen posts.

While trying to figure out what to try next, I ran across a blog by Mark Johnson, evidently one of the forces behind Bing. Among other things, he makes the point that there's more to evaluating a search engine than just throwing a few queries at it.

Fair enough. Several of the points are good ones, in particular the advice to try a prospective new search engine for a week or so with everyday queries instead of throwing two or three (or thirteen) contrived queries at it. Some, I don't buy so much. Johnson argues that people often make the mistake of just looking at the top result of a query. For my money this is a mistake the same way saying "irregardless" is a mistake. OK, you can call it a mistake, but it's what people do and they're unlikely to change.

In any case, I'm not aiming at the same kind of evaluation here that Johnson probably has in mind. I'm not looking for a document-turner-upper with nicer amenities, even though I fully understand that a collection of smallish amenities can make a major difference over an extended period of time.

An extended trial is a sensible approach if you're looking for something that's basically Google but better. I'm looking for something that's not Google, something that takes a fundamentally different approach and provides a fundamentally different experience. So far, Alpha is the only such engine I've found.

Nonetheless, I thought it was at least worth double-checking that the Bing that Powerset linked to was substantially the same as Bing in its own right, so I ran the baker's dozen past Bing itself. The search results were substantially the same, but not exactly. I'm not sure if that's because Bing searches differently on its own and Powerset was directing me to a page of Powerset results presented Bingishly, or just because web contents tend to shift in transit and things have changed in the last day or two.

Even from these brief encounters I can see that both Powerset and Bing have various UI amenities beyond the pretty formatting that might well be helpful for routine use. If you're looking for Google-but-better, you might give them a look and decide for yourself. I might do so myself, though the cynic in me wonders whether a one-week trial is meant to be long enough to establish sufficient inertia to keep one from bothering to switch back ...

Johnson also provides pointers to several other engines to try out. So on with the show ...

Tuesday, June 9, 2009

Baker's dozen: Wolfram Alpha

Now this was interesting.

With one exception, Alpha's responses fell into two categories:
  • It said it didn't know what to do with the input and possibly gave links to not-particularly-related stuff. Curiously, this included "How much energy does the US consume?"
  • It just totally blew away everything else I've seen (that'll make a nice pull quote), giving a nicely formatted and complete response, stating its assumptions and citing its sources.
The successes:
  • How many cell phones are there in Africa? Surprising, after it whiffed on the energy question, but I strongly suggest you try this one yourself. I literally found myself thinking "Now that's what I'm talking about!"
  • Who starred in 2001: a Space Odyssey? It couldn't grok 2001 by itself, but with the full title it gave a nicely formatted table of actors and roles. Much better than Google/Ask's/Powerset's second place.
  • How far is it from Bangor to Leeds? It started off by assuming I meant Bangor, Maine to Leeds, UK, but offered a drop down with which to change that. One click later and I had an answer, with both cities shown on a map, and their current local times, populations and elevations thrown in (further down the page) for good measure.
  • How far is it from Bangor to New York? Likewise, but this time the default of Maine works off the bat.
  • How far is it from Paris to Dallas? Same great display, but no option to switch cities from Paris, France and Dallas, TX. This is the "one exception".
Clearly Alpha is doing reasonably sophisticated analysis of the questions and delivering well-customized responses. To do this, they are most likely employing a variety of specialized techniques and databases. When the question falls outside the reach of those, you get nothing. But if it falls inside, the results are dramatic.

Even more encouraging, from my point of view, is that Alpha demonstrates a clearly different approach from the usual search, unlike Powerset/Bing. I can easily imagine Alpha incorporating more specialized knowledge over time and being able to answer a wider variety of questions with equally impressive results. For example, "When is the next ..." might well be amenable.

As I've said before, I don't see Alpha replacing Google, but I could definitely see it complementing a pure text-based approach, and becoming the tool of choice for a reasonably wide variety of questions with numeric or otherwise discrete answers. Which, as I understand it, is precisely what Wolfram is after here.

Unlike what I said before, I'm no longer skeptical that Wolfram is pursuing a useful approach.

[I still use Alpha routinely for a certain class of questions, but they tend to be more calculation-based than search-based. E.g., How many firkins in a cubic mile? (1.398 x 10^11), but I don't tend to think of it for something like How many cell phones are there in Africa?. Perhaps I should, though. The answer Alpha gives for that one is still outstanding. --D.H. May 2015]

Baker's dozen: What did I expect?

With three engines tested (or three and a half, if you include Bing), it's starting to look like another case of dumb is smarter. The pure text-based approach grinds away happily and, even if you don't try to cater to its whims and just give plain English questions, it almost always finds something relevant. Often you'll have to chase links, but all in all you can do quite well.

So far, the semantic approach seems to do no better and may do worse. This might be an artifact of the smaller text base (more or less Wikipedia), but the smaller, more select text base is deliberately part of the approach. If you're lucky, the answer you seek might arrive wrapped up in a bow, but usually not, just as with the pure text approach.

It's also interesting that Ask, which started life as a plain English, sophisticated alternative to Google, now seems to look and act pretty much like Google.

And yet ...

There is certainly information in the baker's dozen that could be exploited to give smarter results. The question is whether exploiting this requires magic, or just engineering. I'm not going to make a call on that, but here's what I see:
  • How much energy does the US consume?
"How much" says "look for a number" and "energy" says "look for a unit of energy" (BTU, joules, kWh, etc.). Anything that doesn't seem to give such a number in the context of "US" and "consumes" is probably not useful.
  • How many cell phones are there in Africa?
"How many" suggests a number. For bonus points, the number is probably going to be within an order of magnitude of the population.
  • When is the next Cal-Stanford game?
"When is the next ..." is a formula suggesting a schedule, timetable or such. A search for "Cal-Stanford game" will probably correlate highly with football (as opposed to, say, "Carolina-Duke game"). If so, that would suggest a football schedule.
  • When is the next Cal game?
I found this one interesting. Most Americans will know that "Cal" is one of the short forms of "California". The search engines know that, too, and it throws them off. "Cal" in the context of "game" almost certainly refers to the University of California (Berkeley) Golden Bears. And again, "When is the next ..." rules out the hits for "California Gaming". In this case, even dumb is too smart for its own good.
  • Who starred in 2001?
  • Who starred in 2001: a Space Odyssey?
"Who starred in X" indicates that X is a film, TV show or play. Powerset appears able to grok this.
  • Who has covered "Ruby Tuesday"?
"Who has covered" indicates a song, though not as strongly as "Who starred in ..." suggests an acting role.
  • What kinds of trees give red fruit?
"What kind(s) of X" indicates that the answers should be Xs.
  • Who invented the hammock?
"Who" indicates that the answer is a person or group of people, but in the case at hand the group is pretty abstract.
  • Who played with Miles Davis on Kind of Blue?
"Who played with X on Y" indicates that the answer is a person, probably a musician or member of a sports team ("Who played with Satchel Paige on the Monarchs?" -- another fairly impressive list).
  • How far is it from Bangor to Leeds?
  • How far is it from Bangor to New York?
  • How far is it from Paris to Dallas?
I wrote that "In these, of course, the implicit question is 'Which Bangor?' or 'Which Paris?'" Google Maps handles this well by simple heuristics based (I assume) mainly on size. The "how far is it from X to Y" formula indicates that X and Y are places and the answer is a distance.
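As a rough illustration of what "exploiting the formula" might mean, here is a minimal sketch that maps the opening words of a question to the kind of answer it calls for. The patterns and the answer-type labels are my own invention for this list of questions, and matching on prefixes is about the crudest approach imaginable. It's a toy, not a claim about how any of these engines work.

```python
import re

# Hypothetical (pattern, expected answer type) pairs, drawn from the
# baker's dozen above. More specific formulas come before the generic
# "who" so that they get the first chance to match.
FORMULAS = [
    (r"^how much\b", "quantity with a unit"),
    (r"^how many\b", "count"),
    (r"^when is the next\b", "date from a schedule"),
    (r"^who starred in\b", "performers in a film, show or play"),
    (r"^who has covered\b", "performers of a song"),
    (r"^who played with\b", "musicians or teammates"),
    (r"^what kinds? of\b", "list of kinds"),
    (r"^how far is it from\b.+\bto\b", "distance between two places"),
    (r"^who\b", "person or group"),
]


def expected_answer_type(question):
    """Return the answer type suggested by the question's formula."""
    q = question.strip().lower()
    for pattern, answer_type in FORMULAS:
        if re.search(pattern, q):
            return answer_type
    return "unknown"


for q in ["How many cell phones are there in Africa?",
          "Who starred in 2001?",
          "How far is it from Paris to Dallas?"]:
    print(q, "->", expected_answer_type(q))
```

Note what this doesn't do: it has no idea which Paris, or that 2001 is a film. Knowing the type of answer to look for is the easy part.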


Now let me be clear here that when I say that some formula indicates something about the answer, I'm not saying that a search engine ought to be able to exploit that information. I'm well aware it's not as simple as it might look. Rather, I'm saying that if search engines are really going to get smarter, this is the kind of information they'll need to be able to find and exploit.

As to the original question of what did I expect, I'll just say that nothing so far has been surprising, except perhaps that Google handles plain English questions better than I thought it might.

Monday, June 8, 2009

Baker's dozen: Powerset

[If you came here for a review of Powerset, you might also want to look into Wolfram Alpha]

Continuing the none-too-rigorous field test ...

When I first heard of Powerset, its big innovation seemed to be presenting not just raw results, but structured information in the form of "Factz". These were three-word sequences that were meant to summarize the information in an article. That was about a year ago. Since then, the Factz feature seems to have been toned down somewhat. The site itself looks slick, with various UI amenities and a custom style sheet for displaying articles. For whatever that's worth.

Powerset claims to answer questions posed in plain English, but it limits its scope to Wikipedia. As we've seen, this is not necessarily a great limitation, as a fair number of questions can be answered perfectly well by producing the relevant Wikipedia article. Powerset now also provides links to Bing. It's not often you see a search engine advertised on TV, but Microsoft is currently running a well-produced campaign for it.

Since the Powerset page links to Bing I'll have a look there, too. Between the two, there should be equivalent coverage to Google or Ask. This should be the first real test in this series of a search 2.0 engine with questions that, as far as I can tell, ought to be right in its wheelhouse. So here goes:
  • How much energy does the US consume?
The fourth snippet on the Powerset page gives the same figure cited elsewhere "100 quadrillion BTUs (105 exajoules, or 29000 TWh) in 2005". Bing seems to give largely the same list.
  • How many cell phones are there in Africa?
I'm not finding anything here. There's a button you can click on that brings up a pretty widget containing the Wikipedia page in question with a "relevant passage" highlighted. There's a button on that widget for navigating to the next relevant passage, but it doesn't seem to do anything. In any case, I didn't see any figure for cell phones in Africa. Clicking through to Bing again produced what looked to be the identical list, but (again) re-labeled as "Bing reference".
  • When is the next Cal-Stanford game?
As with the other engines, some hits on particular Big Games and some other random stuff, but nothing telling me when the next one is. Given that Bing once again seems to be just the same list, I'm not going to mention it any more unless it does something notably different from Powerset.
  • When is the next Cal game?
The main difference here is that Cal Ripken appears at the top of the list.
  • Who starred in 2001?
At the top of the Powerset results, but not the Bing results, is a row of posters from "Freebase" (Really? You called it "Freebase"? Really?) labeled "2001: A Space Odyssey (film) Performances" Several of them have actors' names below them. Not bad, though not quite as unmistakable as, say, an IMDB entry. The actual articles are roughly the same as for Google/Ask: mostly stars of films made in 2001.
  • Who starred in 2001: a Space Odyssey?
This ought to produce at least as good a result, and it does. Powerset gives a somewhat more concise set of posters and (along with Bing) a list of articles mostly relevant to the film. The top one mentions the names of the stars, not in a list, but buried in the text.
  • Who has covered "Ruby Tuesday"?
If Powerset is an index to Wikipedia, it had better find the article for this one, and it does. The second highlighted passage mentions a particular cover version. Again I can't navigate to it in the widget, but the widget also shows the table of contents of the article in a smaller pane to the right, with the "Cover versions" section prominently visible. Click on that and Bob's your uncle.
  • What kinds of trees give red fruit?
Not much different from previous tries, though several entries mention the "IUCN Red List." I can, however, now add "red huckleberry" and "red pitaya" to the list of red fruit. Except that further reading and link-chasing reveals that huckleberries grow on bushes and pitayas are cactus fruit.
  • Who invented the hammock?
Along with the Wikipedia article everyone has found, Powerset brings up a "Factz" (missing in Bing, of course) stating that "Inhabitants Invented Hammock". OK, thankz.
  • Who played with Miles Davis on Kind of Blue?
As expected, the Wikipedia article on the album pops up. Neither happens to make the personnel section easily visible, but once you get to the article it's, well, much like clicking on a link to Wikipedia. But at that point it's not hard to find the answer.
  • How far is it from Bangor to Leeds?
Stuff on Bangor, Leeds, Gaelic football and such, but no readily apparent answer to the question. At least it doesn't try to foist that Field Notes thingie onto the world.
  • How far is it from Bangor to New York?
Similarly, nothing helpful. But guess what? There's a Bangor, New York. Interesting that Google Maps chose Bangor, Maine (which I expected) over Bangor, NY (which is closer, though not as much closer as one might think).
  • How far is it from Paris to Dallas?
I see: An article on the TV series Dallas, one on the film Paris, Texas, articles on the town of Paris, Arkansas, and on Texas State Highway 24, a list of technology centers ... isn't this exactly the kind of mindless hash that the new search engines are supposed to avoid?

All in all, less than impressive.

In one case (2001), Powerset delivers an answer for which Google and Ask require a more specific query. In one case (cell phones), it delivers nothing where the others delivered a clear link to the answer. In one case (red fruit) it is somewhat less useful than the others. On the distance questions, where plain text search gave at least some moderately helpful answers and Google maps did the serviceable job you'd expect, Powerset completely whiffed. Bing looks like slightly less of the same.

But the style sheets look nice.

Up next (after another brief interlude): Wolfram Alpha.

Sunday, June 7, 2009

Baker's dozen: Ask.com

Continuing my completely unscientific survey, I wanted to stop by Ask.com before looking at the newer engines, because Ask.com (formerly Ask Jeeves in the US, still so known in the UK) is meant to answer questions phrased in plain English instead of requiring Google-ese. In theory, it should do better on a list of questions like the baker's dozen. Will it? Since the results here are (spoiler alert!) quite similar to Google's results, you might want to refer back to those.
  • How much energy does the US consume?
This turned up the same broken link Google came up with, the US Energy Information Administration's home page -- which doesn't seem to answer the question directly -- blog entries on several energy topics, and a bunch of ads. However, one or two of the blog entries gave US consumption in kilowatt-hours per day, and the figure matched up with the annual figure that Google turned up. The question doesn't specify units, so this ought to count as a partial success, the same as with Google. I'd still rate Google ahead here, since it found not just the EIA, but the right page on the EIA site, and did so as the second hit.
  • How many cell phones are there in Africa?
Ask turned up the same top two hits as Google, which satisfactorily answer the question.
  • When is the next Cal-Stanford game?
Ask turns up roughly the same hits as Google, which don't really answer the question.
  • When is the next Cal game?
Again roughly the same not-too-useful results ...
  • Who starred in 2001?
... and again ...
  • Who starred in 2001: a Space Odyssey?
... and again, this time with the same winning top hit ...
  • Who has covered "Ruby Tuesday"?
Same results, different formatting.
  • What kinds of trees give red fruit?
How many ways can I say it? But we can at least add pomegranate to the list.
  • Who invented the hammock?
Ask.com and Wikipedia also seem to make a fine flavor combination.
  • Who played with Miles Davis on Kind of Blue?
I continue to sense a certain ... sameness in the Ask and Google results.
  • How far is it from Bangor to Leeds?
Once again Field Notes is high on the list, but I don't see a UK distance calculator or any snippet suggesting a direct answer, only similar queries (e.g., Bangor to Conwy). In case Google or Ask landed you here and you wanted to know, it's about 140 miles (or around 230 km).
  • How far is it from Bangor to New York?
Aha! A clear win for Ask: "The distance between Bangor, ME and New York, NY is 387.0 miles(623.0 km)," right at the top of the page.
  • How far is it from Paris to Dallas?
And again the results are nearly identical to Google's. Dallas to Paris, France is given in the same top hit as Google's (5000 miles), no mention of Paris, TX.

So, what have we learned? Ask and Google produce largely identical results. I would say that in two cases Ask's answers were not quite as useful as Google's. In one, they were clearly better. In only one case did Ask appear to process a question as a question, rather than a collection of keywords. All in all, either seems useful. I don't see any great reason to switch, but I would expect a frequent Ask user would say likewise.

Next: On to the new generation.

Baker's dozen: Methodology? What methodology?

Before I go much further in subjecting search engines to my questioning, two overall disclaimers:
  • Again, there is absolutely no research or experimental protocol behind the questions. I wrote them up in about five minutes off the top of my head.
  • I aimed for the kind of questions Google would not do particularly well on but the latest generation might. Being a habitual Google user, I may well have missed.
And some finer points:

It's become very clear that it's not very clear how to measure partial success. For most of these particular closed-ended questions, full success is reasonably easy to judge: Look for a concise, correct answer presented in direct response to the question. But what about the typical Google answer of links and snippets that may well point you directly at the real answer? It's clearly not full success, but it's still quite useful in practice. What score should it get?

What about a question that's easily answered by a slightly different search question and a little link chasing? This is the status quo, and to some extent people are attuned to that. Is there anything wrong with having to learn how to use a tool? We have to learn how to use cell phones, music players, video web sites and so forth. Care and feeding of search engines is now taught somewhere around grade school or middle school (and often learned even earlier).

What would an actual experimental protocol look like? Where would you get your search questions? From a sampling of the population at large, likely biased by years of Google use? From people who don't routinely use Google (or other current-generation search engines)?

What are we trying to model? I can think of at least three possibilities, each with its own arguments for and against:
  • The current population
  • A population of people coming to the whole "search engine" thing completely cold
  • The "steady-state" condition in which everyone has had a chance to learn and get used to whatever search engine is being tested.
All of this is a long-winded way of saying "Hey, this is just tire-kicking. The results are meant to be grist for discussion and nothing more."

Saturday, June 6, 2009

Baker's dozen: Good ol' Google

To evaluate search 2.0, first we need a baseline from search 1.0 (leaving aside that there were search engines before Google).

Google doesn't even pretend to understand what you're asking it. If you ask it "How much energy does the US consume" it says something like "I heard 'much', 'energy', 'US' and 'consume' ... hmm ... here are some documents with those words in them."
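A caricature of that "I heard these words" behavior fits in a few lines. This is a deliberately dumb sketch: the stopword list is something I made up for the example, and real engines do vastly more (ranking, link analysis, stemming and so on). It just shows what treating a question as a bag of keywords amounts to.

```python
# A made-up stopword list, just big enough for this example.
STOPWORDS = {"how", "does", "the", "is", "it", "a", "an",
             "of", "to", "in", "are", "there"}

PUNCTUATION = "?.,!\"'"


def keywords(question):
    """Reduce a question to the words a keyword matcher would 'hear'."""
    words = [w.strip(PUNCTUATION) for w in question.lower().split()]
    return [w for w in words if w and w not in STOPWORDS]


def matches(document, question):
    """True if the document contains every keyword from the question."""
    doc_words = {w.strip(PUNCTUATION) for w in document.lower().split()}
    return all(k in doc_words for k in keywords(question))


question = "How much energy does the US consume"
print(keywords(question))
print(matches("Just how much energy did the US consume in 2005?", question))
print(matches("Energy drinks popular in the US", question))
```

No understanding anywhere in sight, and yet a document that passes this test has a decent chance of being about the right thing.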

Coming from a human research assistant, this would be totally unacceptable. We expect better. But since it's Google, and we've come to know what to expect from Google, we accept it, and it turns out to be quite useful. In that context, an acceptable answer from Google would be the standard 10-item first page containing links to enough documents to easily answer the question.

And so, even with fairly straightforward, objectively answerable questions, we're already veering off into the subjective. What does it mean to "easily" answer a question? With luck, we'll know it when we see it.

On with the show. In the following, I've given Google the question verbatim, without quotes.
  • How much energy does the US consume?
Google fared pretty well, by finding sites that asked and answered similar questions. Top hits:
  1. Population and Energy Consumption. This link appears broken.
  2. General Energy FAQs - Energy Information Administration is a FAQ from the US department of Energy. The second question is "Question: How much of the world’s energy does the United States use?" and the answer given is "[T]he United States primary energy consumption was 100.691 Quadrillion Btu, about 21.8% of the world total."
  3. WikiAnswers - How much energy does the United States use a year: "The United States is the largest energy consumer in terms of total use, using 100 quadrillion BTU (105 exajoules, or 29000 TWh) in 2005, equivalent to an (average) consumption rate of 3.3 TW." This matches the DOE figure, but that's probably because the author used the DOE as a source.
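Since the same figure keeps turning up in different units, here is a quick sanity check that the quoted numbers agree with each other. The conversion factors are standard approximations I've supplied (about 1055 joules per BTU, and a 365.25-day year); the 100 quadrillion BTU input is the figure quoted above.

```python
JOULES_PER_BTU = 1055.06               # approximate
JOULES_PER_TWH = 3.6e15                # 1 TWh = 10^12 W x 3600 s
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # average year length

annual_btu = 100e15                    # 100 quadrillion BTU, as quoted
annual_joules = annual_btu * JOULES_PER_BTU

exajoules = annual_joules / 1e18
terawatt_hours = annual_joules / JOULES_PER_TWH
average_terawatts = annual_joules / SECONDS_PER_YEAR / 1e12

print(f"{exajoules:.1f} EJ")
print(f"{terawatt_hours:.0f} TWh")
print(f"{average_terawatts:.2f} TW average")
```

The results come out within rounding of the 105 exajoules, 29000 TWh and 3.3 TW in the snippet, so at least the answer is consistent with itself.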
  • How many cell phones are there in Africa?
Google didn't appear to do quite so well on this one. Just from looking at the snippets of the articles found, it was hard to tell if any answered the question. However, skimming through the first hit, Cellphones give Africa's farmers a chance to set out their stall ..., I found "At the end of 2007 there were more than 280-million cellphone subscribers in Africa, representing a penetration rate of 30,4%."

The next hit references the African Mobile Factbook, well worth a browse and almost certainly the source of the 280 million figure.
  • When is the next Cal-Stanford game?
I wouldn't expect Google to do well on this one. It might find documents referencing the next game at the time the particular article was written, but how many will have mentioned the 2009 game together with the date? What we need is the Cal (or Stanford) football calendar, which this search is unlikely to turn up ... and sure enough, I see a couple of articles about The Play and about Big Games from several years, but nothing obviously pointing me at Saturday, November 21.

Which I found by googling Stanford Football Schedule 2009, of course.
  • When is the next Cal game?
The results here are even less helpful, as Google cleverly expands "Cal" to "California" and turns up several hits for "California Games," something else entirely. Again, you'd have to think to search for "Cal football schedule 2009" (or whatever sport you're actually interested in). Search 2.0 endeavors to do that for you.
  • Who starred in 2001?
This is not specific enough for Google to get its hooks in. It turns up hits for stars of movies made in 2001, but nothing about the Kubrick classic. Adding more words to a Google search rapidly hits diminishing returns, but this looks like a good place to try ...
  • Who starred in 2001: a Space Odyssey?
Ah, there we go. Didn't even need to quote "a Space Odyssey." The very first hit is 2001 A Space Odyssey starring Keir Dullea and Gary Lockwood ...
  • Who has covered "Ruby Tuesday"?
Naturally, quoting "Ruby Tuesday" turns up hits for the restaurant, but the very first hit is the Wikipedia article on the song, which contains a long list of covers.
  • What kinds of trees give red fruit?
I wasn't expecting much on this one. There was certainly nothing like an exhaustive list in plain sight. Drilling through, however, produced a few answers, such as Brazilian cherry, cocoa, curry leaf, miracle fruit, Malay apple, Kapoho solo, rambutan/lychee, Thai salak, Surinam cherry, Akee fruit, Shadblow serviceberry, Russian hawthorn, downy hawthorn, Toba, madrona and just plain cherry.

Mind, not all are considered good eating. More relevant to the point in question, I had to search through a number of different pages to come up with the colorful list above.
  • Who invented the hammock?
Again, Google and Wikipedia team up, this time for a thorough and nuanced answer, the gist of which is, we don't really know, but probably someone in the Amazon basin. I also turned up the bane of Google searches: sites asking, but not answering, the question you're interested in. Funny how these also tend to be chock full of garish ads.
  • Who played with Miles Davis on Kind of Blue?
Yet again, Wikipedia for the win. The relevant article appears as the first hit, and the personnel section gives the full (impressive) lineup.
  • How far is it from Bangor to Leeds?
Heh. Hit number two is some shady outfit called "Field Notes on the Web" asking the very same question. At the bottom, though, is a link to a UK distance calculator giving the distance as 174.06 miles. The figure is suspiciously precise, but plausible [but see below].
  • How far is it from Bangor to New York?
Hit one is WikiAnswers with the none-too-helpful answer "ma thi wo", but hit two is WikiAnswers to a slight rephrase of the question. The answer given is "From New York, New York to Bangor, Maine it is about 448 miles." Myself, I would have said "about 450 miles."
  • How far is it from Paris to Dallas?
Well, WikiAnswers has Paris, France to Dallas, TX as about 5000 miles, and they'd like to know how far it is from Paris, TX to Dallas, TX. Nothing else on the list looks particularly relevant.

But wait a second. For these last three there's clearly another option in the Google family: Google Maps. In all cases I'll simply type in the city names and see what pops out, then refine if that doesn't work.
  • We could not calculate directions between Bangor and Leeds.
  • Bangor, Wales to Leeds UK ("UK" was autofilled -- I was going to type "Leeds, England") gives 142 miles.
  • Bangor to New York turns up two routes, of 447 and 485 miles.
  • We could not calculate directions between Paris and Dallas.
  • Paris, TX to Dallas, TX turns up two routes, of 105 and 110 miles.
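Driving distances can be sanity-checked against straight-line distances, since a road route should never come out shorter than the great-circle distance. The following is a rough sketch using the haversine formula. The coordinates are hand-rounded approximations supplied for illustration, not values from any of the sites above, and the choice of candidate cities (including a third Bangor, in Northern Ireland) is likewise an assumption.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation

# Approximate (latitude, longitude) in degrees, rounded by hand.
CITIES = {
    "Bangor, Wales": (53.23, -4.13),
    "Bangor, Northern Ireland": (54.66, -5.67),
    "Bangor, Maine": (44.80, -68.77),
    "Leeds, England": (53.80, -1.55),
    "New York, NY": (40.71, -74.01),
    "Paris, France": (48.86, 2.35),
    "Paris, TX": (33.66, -95.56),
    "Dallas, TX": (32.78, -96.80),
}


def great_circle_miles(a, b):
    """Haversine great-circle distance in miles between two named cities."""
    lat1, lon1 = map(radians, CITIES[a])
    lat2, lon2 = map(radians, CITIES[b])
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(h))


# Each ambiguous name is tried against more than one candidate city.
PAIRS = [
    ("Bangor, Wales", "Leeds, England"),
    ("Bangor, Northern Ireland", "Leeds, England"),
    ("Bangor, Maine", "New York, NY"),
    ("Paris, France", "Dallas, TX"),
    ("Paris, TX", "Dallas, TX"),
]

for a, b in PAIRS:
    print(f"{a} to {b}: about {great_circle_miles(a, b):.0f} miles")
```

With these rounded coordinates, the Welsh Bangor comes out a little over 110 miles from Leeds in a straight line, comfortably under the 142-mile driving route, while the Northern Irish Bangor lands somewhat above 170 miles. That does not prove which Bangor the UK distance calculator had in mind, but it does show how much the answer depends on resolving the name first.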

So ... what have we learned?
  • Google and Wikipedia. Two great tastes that go great together. Wikipedia has done much of the heavy lifting of pulling together coherent results, and Google does a pretty good job of finding them. Three of the thirteen questions, and three of the ten non-mapping questions, went straight to Wikipedia.
  • It matters, at least to Google, how you ask. If you have a distance question, ask Google Maps, not Google search. Um, that doesn't seem like a big surprise. Be prepared to give country/state/province information in ambiguous cases. If you want to know when the next X happens, look for X schedule instead of asking directly.
  • Of the thirteen questions, Google gave a reasonable pointer to a good answer on eight of them on its first page of hits just by putting in the question and making no effort to be Google friendly. On two others (red fruit and Paris to Dallas) it gave links to at least some relevant information. On the remaining three (next Cal-Stanford game, next Cal game, who starred in 2001), you could find a good answer by recasting the question slightly.
In other words, an experienced Google user, which is to say a great many people by now, could have been expected to readily answer all thirteen questions. As far as I can tell, this leaves two main areas for improvement, at least as far as the narrow domain of answering research questions is concerned: finding the Google-friendly question from a more human-friendly one, and wrapping up the results neatly instead of requiring people to chase links.
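The first of those two improvements, recasting a human-friendly question as a Google-friendly one, can be illustrated with a toy rewriter. This is only a sketch of the two recastings that worked in the experiment above ("X schedule" for "when is the next X" and Maps for distance questions). The function name, the patterns and the returned (tool, query) pair are all made up for illustration and say nothing about how any real search engine works.

```python
import re


def google_friendly(question, year=2009):
    """Toy rewrite of a natural-language question into a (tool, query) pair.

    Hypothetical helper: it only knows the two recastings that helped in
    this experiment and passes everything else through unchanged.
    """
    q = question.strip().rstrip("?").strip()

    # "When is the next X game?" -> search for "X schedule <year>"
    m = re.fullmatch(r"when is the next (.+) game", q, flags=re.IGNORECASE)
    if m:
        return ("search", f"{m.group(1)} schedule {year}")

    # "How far is it from A to B?" -> hand "A to B" to a mapping tool
    m = re.fullmatch(r"how far is it from (.+?) to (.+)", q, flags=re.IGNORECASE)
    if m:
        return ("maps", f"{m.group(1)} to {m.group(2)}")

    # Anything else goes to ordinary search as typed
    return ("search", q)


for question in [
    "When is the next Cal game?",
    "How far is it from Bangor to New York?",
    "Who invented the hammock?",
]:
    print(question, "->", google_friendly(question))
```

Even this toy makes the limits plain: it still has no idea which sport "Cal" refers to, or which Bangor is meant. Filling in those gaps is the hard part.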

As I understand it, this is exactly what the current crop of prospective Google-killers is trying to do. Whether they can, and whether that's enough value added to make a difference to the general public, remains to be seen.

Up next: ask.com.

A baker's dozen for Search 2.0

[Re-reading several years later, it's clear that several things have changed.  I'm not going to comment on them specifically since, obviously, this relates to Google's core business.  I would, however, encourage anyone who finds this series interesting to retry the experiment with whatever's currently available.  You should also try the kinds of searches Joe Andrieu mentions in his comment on this post. --D.H. Jan 2016]

As mentioned before, there seem to be a number of next-generation search engines coming out which claim to go beyond merely looking for keywords and delivering up documents. I have no idea how these perform, and the only way to know is to find out. So here are 13 questions, off the top of my head with very little method behind them, which I intend to pose to various engines. I suspect many of them are not really fair questions for search engines, but I could be wrong. Again, that's why we run the experiment.
  1. How much energy does the US consume?
  2. How many cell phones are there in Africa?
  3. When is the next Cal-Stanford game?
  4. When is the next Cal game?
  5. Who starred in 2001?
  6. Who starred in 2001: a Space Odyssey?
  7. Who has covered "Ruby Tuesday"?
  8. What kinds of trees give red fruit?
  9. Who invented the hammock?
  10. Who played with Miles Davis on Kind of Blue?
In these next three, of course, the implicit question is "Which Bangor?" or "Which Paris?"
  11. How far is it from Bangor to Leeds?
  12. How far is it from Bangor to New York?
  13. How far is it from Paris to Dallas?