Tuesday, June 16, 2009

Baker's dozen: Summing up (for now)

Trying to wrap up the Topic That Ate My Blog, I see that this is indeed the 13th post in this series. That ought to be enough for now. What have we learned, then?

I would categorize the search sites I've looked at roughly thus:
  • More of the same, possibly with shinier chrome. This would include Ask, Powerset, Bing, Cuil and, for that matter, good old Google, which has not been standing still.
  • Attempts at pure crowdsourcing (everyone uses it to some extent, if only by pulling in Wikipedia). This would include wikianswers and WikiAnswers.com.
  • New approaches. This would be Alpha and True Knowledge (which also incorporates a significant crowdsourcing element).
I'm aware of at least a couple of others I mentioned only in passing and didn't investigate in depth: Kosmix and Freebase (I'm still getting over that name). Powerset appears to pull in Freebase when it can. That happened only once for the baker's dozen, though.

How well do they work? I was going to try to come up with some sort of numerical scale based on how much effort it took to get the answer from the results. A clear answer directly on the search result page would count highest, a link to a clearly relevant page fairly high, plausible links less so. Irrelevant pages would count slightly negative. Then I remembered that I have no methodology and inventing one after the fact would be cheating.
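Purely as an illustration, the scale described above might look something like the sketch below. The weights, the outcome labels and the `score_engine` name are all made up for the example; the post only gives the ordering (direct answer highest, irrelevant pages slightly negative), and no such scoring was actually applied to the baker's dozen.

```python
# Hypothetical weights: the post gives only the ordering, not the numbers.
WEIGHTS = {
    "direct_answer": 3.0,    # clear answer right on the results page
    "relevant_link": 2.0,    # link to a clearly relevant page
    "plausible_link": 1.0,   # link that might pan out
    "irrelevant": -0.5,      # irrelevant page, counted slightly negative
}


def score_engine(outcomes):
    """Average the weights over a list of per-query outcome labels."""
    if not outcomes:
        raise ValueError("no outcomes to score")
    return sum(WEIGHTS[label] for label in outcomes) / len(outcomes)
```

Averaging rather than summing keeps engines comparable if they were tried on different numbers of queries; any real comparison would need the weights fixed before looking at the results.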

So here's a subjective rating, based on the baker's dozen (and not, for instance, on running a side-by-side comparison for a week):
  • Google and Ask are roughly interchangeable. Powerset/Bing seems to do about as well. They're all singles hitters. They generally connect enough to earn their keep.
  • Alpha is a power hitter. It strikes out a lot, but when it connects, it almost always hits one out of the park.
  • True Knowledge is interesting, but incomplete. It remains to be seen how well its knowledge base will fill up and to what extent it will remain reasonably self-consistent.
  • I can't use Cuil without thinking of Celebrity Jeopardy on SNL.
Considering that the baker's dozen were aimed away from the conventional engines' strengths, they performed remarkably well. Several years ago I used Google side-by-side with (I think) AltaVista and Yahoo! for a period of time to see if I might want to switch to it. Google clearly won on relevance of results. I'm not seeing enough clear difference now, either in relevance of results or overall user experience, to suggest doing a similar trial. As always, though, I may be missing something.

I do intend to keep Alpha in mind whenever I'm looking for a clear, quantitative answer, just as I go directly to Wikipedia if I know the specific topic I'm looking for. [I still use Alpha, mostly for calculations --D.H. Jan 2016]

Finally, I think it's also worth reiterating that crowdsourcing content works well. Wikipedia turns up all over searches and in many cases you can skip the search and go directly there. Open content and a well-tuned search engine make a winning combination. A finely-tuned index to a proprietary database (not just Alpha, but any of the map/driving directions sites or any of many, many other specialized sites) also works. Trying to crowdsource the search itself seems less promising.
