Field notes on the Web: January 2009

Wednesday, January 28, 2009

More wacky UI design

Today I had to fill out a form online. The organization in question had helpfully provided a URL, which I duly typed into my browser (just like section 2.2 of the RFC suggests).

On the page was a nice big box labeled "FooForm Online". Pointing at the box was an arrow. At the other end of the arrow was the text "Click here," underlined.

What to do? Click on the underlined text, which looks for all the world like the usual "Click here" link, or ... click on the large box with the arrow pointing to it that says to click there? Or will either work?

It turns out that in this case it's the "click here" text (only) that works. It's easy enough to figure out, and pretty harmless in any case, but ... Huh? What's the point of decorating your "Click here" link with an arrow pointing to a large box that doesn't work?

Monday, January 26, 2009

Call or click today

It's hardly breaking news but ...

Back in days of yore, when the Ginsu™ Knife was king of the airwaves, no self-respecting late night TV pitch was complete without the tag line "Operators are standing by." These days (and even in those days, for that matter) there's no one actually operating the telephone equipment, so you talk to "service representatives" and such. That's assuming you even want to bother with the phone. Most of the time you can order your Chia Pet™ online in a fraction of the time. Thus the new tag line: "Call or click today."

Pop culture is one of the strongest indicators of whether a technology has made it. Can you get it at your local big-box store? Does it turn up on the TV news as backdrop for some other story? Does your non-technical uncle ask you about it? By that standard, e-commerce is definitely here. Has been for some time. Like I said, hardly breaking news, but perhaps it's noteworthy how unnoteworthy it is.

Pop culture notes:

Ah, the Ginsu knife. Unfortunately the current Wikipedia article on it is somewhat garbled and I haven't the energy at the moment to help fix it. To make matters worse, there appear to be at least two prominent web sites hawking what appear to be Ginsu products, one of which appears, by cross-reference with Wikipedia, to be the genuine article from Douglas Quikut. The other, despite its genuine-looking logo, seems to use "Ginsu" some places and "Ginzu" in others and doesn't seem to mention Douglas Quikut at all. So I won't reference it and I'll leave it to the lawyers to sort out who's which.

Ah, Chia Pets. From the days of my youth my reaction to Chia has been a profoundly visceral "Huh?" Someone must like them, though. They're still going strong.

Wednesday, January 21, 2009

Newsweek's signal/noise ratio

Yet another cool web thing that I just now learned about: While browsing Newsweek's web site, I ran across a feature called "Signal or Noise?" It's impressively straightforward: They suggest a topic and you can use a slider to indicate how relevant (signal) or irrelevant (noise) the topic is. They in turn will presumably use that information in deciding how far to run with the topic.

Reader feedback has been around forever, but having that level of feedback available in real time is something that's only practical on the web. I've long since decided I have no idea what Web 2.0 actually means, but if it's to do with taking advantage of the social nature of the web and letting information flow in all directions instead of just from the server to the browser, the signal/noise widget seems as much in that spirit as anything [and like so many web.experiments, it seems to be gone already, or at least the link is broken now --DH 27 April 2011].

Tuesday, January 20, 2009

Now one big googly family

I just moved the FeedBurner feed for this blog to my Google account, something all FeedBurner account holders will have to do by the end of next month. As I understand it, my vast crowd of loyal subscribers should notice nothing, but you never know.

Before the move, FeedBurner and Google Analytics would tell me ever-so-slightly different stories about who was dropping by the site. This has been resolved now, by the simple expedient of dropping FeedBurner's version of the story.

That is all.

Three mildly silly things

I realize that if you're the US Post Office, you have to aim for a broad, sort of lowest-common-denominator target, but as I understand it a server can get some idea of what kind of screen resolution the agent on the other end has, and perhaps serve up a map a bit bigger than the one on this page.
When you relinquish tickets on an online ticketing site -- you know, the kind that charges you several bucks in "because we can" fees, a.k.a. "convenience" fees -- to try for different ones, wouldn't it be nice if they actually gave you different tickets the next time?
Apparently some local TV stations have taken to broadcasting "live via broadband" low-grade video as part of their broadcast news. OK, I get it. The world of journalism is going to the web.dogs and you want to show that you're on board. But putting up 10 frames per second lo-res video with compression artifacts that make your reporter look like something Dalí would have painted on a bad day is not going to make the best tech-savvy impression. One is reminded of (now Senator [or maybe not]) Al Franken's infamous "one-man mobile uplink".

Ah well. I suppose that's what makes the web the web.

Note: I'm not sure how best to attribute the Frankenphoto. The particular flickr image I used is on Ross Mayfield's photo stream, but I assume NBC has a dog in the fight as well.

Wednesday, January 14, 2009

Yet another insidious form of spam

One of the hazards of holding the awesome power of reaching handfuls of blog readers every day is that people will try to piggyback on that to deliver their own message, for example by posting a comment plugging their product. I've had that happen twice or so. I suspect it would happen more often if more people read the blog (but don't let that stop you -- it's a risk I'm prepared to take).

This morning, however, I saw a new spin on the scheme. While looking through the traffic statistics, such as they are, I ran across the search string "Our fine product 3.2 will be released on some date." Apparently enough words in that sentence match something somewhere in these pages for it to register as a hit so their bot can click through and get their message onto my statistics. Clever, eh?

First, it gets the message to my eyeballs, in case I want to buy their fine product. Second, since I'm a highly-influential "thought leader" by virtue of writing a blog, I might even find it in my heart to tell my vast audience about their fine product. If only I could remember what it was called and who makes it ...

Of course, it could be that someone really was searching for just those keywords, in which case I apologize for the snarky tone.

But somehow, I doubt it.

Now if you'll excuse me, I'm off to google "Sorry, I won't plug your product on my blog".

Monday, January 12, 2009

More muddling over music

We may finally be figuring out how to pay for music in the digital age.

Actually, the likely answer has been clear for a while now. The news is that the major players seem to be warming to it: Buyers pay per song. Some of them cheat. Many of them don't. Musicians may sell directly, or they may go through a label.

Probably the strongest evidence is iTunes dropping DRM, or rather, the labels agreeing to drop DRM. The flip side of the deal is that iTunes will no longer charge a flat $0.99 per song. New songs by popular artists will cost more. Older pop songs will cost less. Other genres will follow their own traditions and customs.

It's an interesting question which of several factors have had how much influence in making all this happen. In no particular order:

Labels may be getting comfortable with the idea that while there's going to be "leakage", enough people will be willing to pay for them to keep the wheels turning. It probably helps that the marginal cost of selling songs online is near zero, but probably not as much as one might think. Pressing CDs is pretty cheap, too. I forget how cheap, but it's a small portion of the retail price. Some of the additional costs -- storefront costs for a brick-and-mortar music shop, for example -- go away online, but many of them, particularly marketing costs, don't. Someone has to pay for all those award shows and after-parties.

Labels may be getting scared by sales of physical media going through the floor. One might be tempted to gloat over yet another example of physical copy-protection yielding to the Mighty Web, except that CDs don't really offer any copy protection either.

Apple seems to have figured out that a flat pricing scheme looks completely screwy to labels, who have been selling music for a lot longer than it has. Charge the same price for the Billboard #1 as for a Johann Kropfgans lute concerto? You're kidding, right? Labels understand that different groups are willing to pay different prices for different music. Before any classical music fans out there pummel me, let me hasten to add that the Kropfgans recording I found costs a bit more than the current #1, Lady Gaga. If classical listeners weren't willing to pay more, there'd be no market.

Further, people will pay different amounts for the exact same music depending on whether they buy it while it's hot and all the cool kids are listening to it, or whether it's sitting in the virtual remainder bin with Herb Alpert (Before any Herb Alpert fans out there pummel me, let me hasten to admit that I own at least one Tijuana Brass album. Now the cool kids can pummel me instead.) It's called "walking down the demand curve," a concept I vividly and bitterly remember learning as an economics guinea pig in college when the Trader Joe's money I'd been counting on failed to materialise.
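For the arithmetically inclined, here is a rough Python sketch of why walking down the demand curve appeals to a seller. Everything in it is made up for illustration: the `revenue` function, the prices (in cents) and the list of what each buyer is willing to pay. It also assumes each buyer buys once, at the first price they can stomach, and that nobody strategically waits for the remainder bin.

```python
def revenue(prices_over_time, willingness_to_pay):
    """Total takings in cents when the price steps through prices_over_time.

    Each buyer purchases once, at the first price that does not exceed
    what they are willing to pay. Buyers never wait for a later, lower price.
    """
    total = 0
    for limit in willingness_to_pay:
        for price in prices_over_time:
            if price <= limit:
                total += price
                break
    return total


# Hypothetical buyers: two keen fans, three casual listeners, two bargain hunters.
buyers = [129, 129, 99, 99, 99, 69, 69]

flat = revenue([99], buyers)             # one price forever
stepped = revenue([129, 99, 69], buyers)  # hot, then ordinary, then remainder bin

print(flat, stepped)
```

With these invented numbers the flat price never reaches the bargain hunters and leaves money on the table with the keen fans, which is roughly the labels' complaint.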

All of this leads me to inaugurate a new tag (which I'll backfill at some point, as I've been beating this drum for a while now): not-so-disruptive technology. After a bumpy start, and a few false starts, we seem to be converging on a model that looks a lot like the old one.

Did online music Change Everything? No. It's changed some things, but so did the 45, the LP, the boombox, the walkman, the CD, the MP3 player and its cousins, heck, even 8-track. Did iTunes Change Everything? No. In the end the studios have lived to fight another day.

Probably.

Postscript: Some have argued that albums are going to fade away as the iTunes generation picks and chooses the songs it likes, but this may just as well be a pendulum swinging. One could argue whether albums are an artifact of the LP and CD, or whether people will continue to like their songs in that form, but the Billboard Hot 100 has been going since 1958, when 45rpm singles were the only real game in town.

Post postscript: There's a whole other interesting discussion to be had over whether file sharing Changed Everything. It should be no surprise that I don't think it did. The more interesting question is how much of that has to do with the labels breathing legal fire about it, and how much of it has to do with the economics of the free-rider problem.

Sunday, January 11, 2009

MD5, SSL, CAs etc.: Summary opinion

In the past few posts (this one, this one and this one) I've tried mostly to lay out the facts as I understood them. But this story is prominent enough that everyone seems to have an opinion on it. So here's mine, keeping in mind that when I'm not busy not being a security expert, I spend my spare time not being a lawyer.

First, the CAs [that is, the ones that persisted in using MD5] clearly deserve a good dose of public humiliation on this one, if only to remind everyone that bad press and possible loss of sales are an additional cost of not acting, even if no actual exploit occurs. They deserve it not because a weakness turned up -- that's pretty much inevitable -- or because they don't always respond to every conceivable threat -- that's a natural consequence of doing business and weighing costs against benefits. They deserve it because in this particular case the threat was clearly large, the fix was clearly not that hard and they had years of lead time to fix things quietly.

Verisign in particular argues that RapidSSL was an acquisition and they were just now able to get to know RapidSSL's code base. Well yeah, but ... Verisign chose to acquire RapidSSL and it would have taken very little due diligence to determine that they were using weak certificates. Depending on their negotiating position, Verisign might even have been able to make RapidSSL's fixing its certs a condition of the acquisition. But in any case, it's a problem Verisign took on voluntarily, and if they didn't know about it they should have.

However ...

It would be a grave mistake to focus completely on the CAs. When you (say) visit your bank's web site, you are trusting

The bank to keep your money safe to the best of its ability.
The bank to keep your private info safe to the best of its ability.
The banking system and government to clean up if the bank fails. [When I wrote this, I was thinking more of a security breach. Heh.]

And on the technical side:

The CAs signing your bank's certificate
The design of the certificate system (PKI)
The researchers that claim all this works the way they say it does
The implementation of SSL you're using
DNS (the system that figures out which actual server to contact when you ask for "foobank.com").
Your browser
Your operating system (a whole separate kettle o' fish)
Whatever else I didn't think of, and I'm sure I've left out several major factors.

Any of these can and does have problems from time to time. However, you're not counting on all of them to be perfect, always. You're counting on the system, in aggregate, to be safe enough. The wrong lesson to learn from all of this would be "The CAs are too lazy to protect our data". A better lesson would be "Every part of the system is imperfect. And so is the system. That's probably OK, but we really don't know."

Not exactly a reassuring story with clear good guys and bad guys, but it's the best I can come up with.

A bit more on MD5 cracking in practice

While we're on the topic, I should point out that Sotirov et al. were not the first to put the theoretical weakness of MD5 into practice. The timeline from my previous post is

A while ago, someone published a paper showing that MD5 was vulnerable.
Late in 2008, Sotirov et al. disclosed that they had forged a root certificate.
Soon thereafter (right about now), the CAs got serious about updating their root certificates.

All well and good, but there are a few missing pieces (and even then, this is just scratching the surface):

Rivest introduced MD5 in 1991
The first theoretical indication that MD5 was weak came in 1996, when Dobbertin published a paper on the subject.
In 2005 Wang and Yu published a paper with the straightforward title "How to Break MD5 and Other Hash Functions", including fully-worked examples.
Also in 2005, Lenstra, Wang and de Weger published a paper announcing that, using Wang's method, they had produced colliding X.509 certificates (the kind everyone uses). At this point, one could make a very strong argument that the cat was out of the bag as far as applications like SSL were concerned. A couple of key quotes:

"With this construction we show that MD5 collisions can be crafted easily in such a way that the principles underlying the trust in Public Key Infrastructure [the basis for SSL] are violated."
"Below is an example pair of colliding certificates in full detail (byte dump)."

In 2007, Lenstra and de Weger, along with Stevens, got some press by claiming to have predicted the outcome of the 2008 presidential election, and then, once they had your attention, explaining that they'd really just made multiple predictions with identical MD5 hashes.
Finally, in late 2008, we join our story currently in progress.

In other words, MD5 has been known to be weak in theory for more than a decade, and in practice for several years. As usual, Wikipedia has more background and pointers to original sources (several of which I used here).

Thursday, January 8, 2009

The Register's take on the MD5/SSL crack

Under the very appropriate rubric of "As usual, the truth is a little more complicated," The Register picks up on two points I glossed over in my previous post on the MD5/SSL crack. Before I get to them, let me quote myself from a different previous post:

The basic trust issues are clear enough, but the kind of mental ju-jitsu needed to think through all the various counter-measures and counter-counter-measures is hairy in the extreme. True black belts are relatively rare, and I'm not one of them.

Caveat lector.

Point 1. Recall that crackable root certificates are in the process of being replaced. In particular, Verisign subsidiary RapidSSL has replaced its tainted root certificate. The Register counters:

But there's nothing stopping anyone who might have used the attack before that date to masquerade as RapidSSL and issue counterfeit certificates for any website of their choosing (think Bank of America, HMRC, or any other sensitive online destination).

My understanding here is that there is just such a thing, namely that modern browsers don't rely on a fixed set of trusted root certificates. So, our enterprising cracker puts up a site spoofing my bank, using a bogus certificate, signed by an imposter of RapidSSL's root certificate:

My browser makes an HTTPS connection to my bank. As part of the SSL handshake, it asks the purported bank "Who are you?".
The site responds "I'm FooBank. Says right here on this certificate."
The browser takes the certificate and examines it. The certificate says it's signed by RapidSSL's old root certificate (or it says it's signed by some other certificate that's been signed by RapidSSL's etc., etc.).
Without the knowledge that the old RapidSSL root cert has been spoofed, my browser would say "RapidSSL root cert XYZ? Looks OK to me. Go ahead and serve me."
With the new information, the browser says "RapidSSL root cert XYZ? Don't know about that one. Sorry." My browser does of course know about RapidSSL root cert NewImprovedXYZ, but that's not the root certificate the cracker is claiming signed the site's certificate saying it's FooBank. Same CA (RapidSSL) but a different certificate.
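The exchange above boils down to a lookup in the browser's list of trusted roots. Here is a minimal Python sketch of just that step, assuming a toy model in which a certificate is nothing more than a (subject, issuer) pair of names. The function `browser_accepts` and the certificate names are invented for illustration. A real browser also checks signatures, validity dates, revocation and a good deal more.

```python
def browser_accepts(chain, trusted_roots):
    """Toy chain check. chain is a leaf-first list of (subject, issuer) names.

    Accept only if every certificate names the next one as its issuer and
    the last certificate in the chain is in the set of trusted roots.
    No cryptography is checked here; this models the bookkeeping only.
    """
    if not chain:
        return False
    for (_, issuer), (parent_subject, _) in zip(chain, chain[1:]):
        if issuer != parent_subject:
            return False  # a link in the chain does not line up
    root_subject, _ = chain[-1]
    return root_subject in trusted_roots


# The spoofed site's chain: "I'm FooBank, vouched for by the old root."
spoofed_chain = [
    ("FooBank", "RapidSSL root cert XYZ"),
    ("RapidSSL root cert XYZ", "RapidSSL root cert XYZ"),  # roots sign themselves
]

old_trust_store = {"RapidSSL root cert XYZ"}
new_trust_store = {"RapidSSL root cert NewImprovedXYZ"}

print(browser_accepts(spoofed_chain, old_trust_store))  # browser still trusts XYZ
print(browser_accepts(spoofed_chain, new_trust_store))  # browser has dropped XYZ
```

The point of the sketch is that the same chain gets a different answer depending only on which roots the browser is willing to trust.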

Microsoft's advisory on the subject states that "When visited, Web sites that use Extended Validation (EV) certificates show a green address bar in most modern browsers. These certificates are always signed using SHA-1 and as such are not affected by this newly reported research."

Firefox doesn't use a green bar, and Mozilla's own advisory on the subject (dated December 30 2008 and linking to Microsoft's) is a bit vague, but (without checking the code and getting further bogged down in details) it looks like Firefox has a couple of safeguards in place as well. It would be nice if they had a more definitive security statement, but the upshot is this: It is possible for browsers to reject certificates signed by fishy root certificates and only accept ones signed by root certs that use stronger hashing (e.g., SHA1) than MD5.

Further, the lead researcher in question, Alexander Sotirov, states, in the blog post I previously linked to

Only 5 hours after our presentation, Verisign stopped using MD5 for all new RapidSSL certificates, successfully eliminating this vulnerability [emphasis mine].

So it would appear that it's enough to revoke the offending root certificates, of which there are a known quantity, and the rest of the system will behave appropriately.

Point 2: The article also brings up a broader and more troubling concern:

More generally, what [Verisign product marketing VP Tim] Callan seems to gloss over is the truism VeriSign and the rest of the security community have repeated so many times that it's become a cliche: Hacking is no longer the province of script kiddies[*], but rather sophisticated and well-funded criminal enterprises. It's hard to imagine these groups wouldn't spend huge amounts of money to buy the credentials that would allow them to spoof any website in the world.

The particular concern driving this is that maybe someone else has already quietly duplicated the Sotirov team's substantial effort and has bogus certificates ready to go, or they're about to do so. That may be but, thanks to the recent efforts, the window for using them is rapidly closing. There's no particular evidence that someone beat the White Hats to the punch on this one. You would expect a massive spike in phishing attacks, and that hasn't happened. So far, at least.

As far as I can tell, no one is currently assuming that no Bad Guys have the resources to crack MD5 and forge root certificates. Sotirov's team tipped off everyone they could as soon as they had the goods (albeit carefully and indirectly in the case of the CAs). The CAs in turn appear to be acting with all deliberate speed to plug the holes. They are not currently assuming that they have a lot of time to act.

The more worrisome problem is that MD5 has been known to be vulnerable, and better alternatives have been available, for some time, but only now are the CAs actually ditching MD5. This indeed was based on the notion that it was highly unlikely that Bad Guys would be able to put the theoretical weakness of MD5 to practical use. More precisely, it was based on the calculation that the likelihood times the cost of a crack was more than [I meant "less than"] the cost of fixing the root certs. Now that Sotirov et al. have published, the likelihood has increased and the cost of not acting along with it.
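That calculation is just an expected-value comparison. Here is a sketch in Python with entirely made-up numbers: the probabilities and dollar figures are hypothetical, chosen only to show the inequality flipping once the likelihood goes up, and `should_fix` is a name I invented for the occasion.

```python
def should_fix(likelihood, cost_of_crack, cost_of_fix):
    """True when the expected loss from doing nothing exceeds the cost of fixing."""
    return likelihood * cost_of_crack > cost_of_fix


# Hypothetical figures, not anyone's real estimates.
cost_of_crack = 100_000_000   # damage if a forged cert is used in the wild
cost_of_fix = 1_000_000       # cost of reissuing root certs on a stronger hash

before_publication = should_fix(0.001, cost_of_crack, cost_of_fix)
after_publication = should_fix(0.5, cost_of_crack, cost_of_fix)

print(before_publication, after_publication)
```

With a likelihood of one in a thousand the expected loss is a tenth of the cost of the fix, so doing nothing looks rational. At even odds it is fifty times the cost of the fix, and the same arithmetic says to act.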

What if the CAs had been wrong? What if the Black Hats had won the race? In that case, there would likely have been a huge mess. Major banks would have been spoofed and potentially large numbers of customer accounts compromised before the banks took down their sites and breathed heavy fire at the CAs, who in turn scrambled to revoke the bad certs. The bank sites would then be down for however long it took for the CAs to fix their certs, plus however long it took for the CAs to convince the banks that, no, really, it's all safe now. Banking by phone and at, um, actual banks would continue. I'm reasonably sure that credit card transactions at stores would either not be affected or would be easier to fix since they don't use the public internet.

Parallel to that, the banks would have been scrambling to refund customers their fraudulent charges, reset the magic numbers on every account in sight and convince the customers that no, really, it's all safe now. The browser vendors, who appear already to have done what they could have, would face a similar PR nightmare anyway.

A huge mess, but I wouldn't quite call it "betting the internet". Nonetheless, I can't quite shake the nagging feeling that, one of these days, someone is going to bet wrong, or make a rational bet but still lose. That doesn't seem to have been the case this time, but who knows what comes next?

[*] I can't help including my reflexive grumble here: Hacking was never the province of script kiddies. The two are about as far apart as you can get and still have a computer involved. But I'm using "hacking" in its older sense here.

More on Chinese cell phones. A lot more.

China has just announced that it has issued licenses for 3G cell phone service, suitable for streaming video and internet access. The numbers involved are typically big, particularly since cell phones may be the primary or even only way many Chinese connect to the web.

Think hundreds of millions of web-enabled phones. Granted, they'll mostly be behind the Great Firewall of China, but that's still an awful lot of web-enabled phones coming online.

"Hackers crack SSL"

[You may also want to check out this followup, and in particular the disclaimer near the top]

Well, kind of. SSL -- the protocol you use to make sure that, say, you're actually talking to your bank and that no one's listening in -- still appears safe when used correctly. What's actually happened is that Alexander Sotirov et al. have described a way of using a long-known weakness in the MD5 cryptographic hash function to create a rogue Certificate Authority (CA) certificate. CA certificates are the "root certificates" used to vouch (directly or indirectly) for the certificates that servers use to convince your browser (and whoever else) that they are who they say they are. Certificate Authorities (CAs) are companies that -- very carefully -- issue server certificates cryptographically signed with their root certificates.

Although better alternatives (SHA1 and SHA2, for example) are known and widely available, some signing authorities were still using MD5 when Sotirov's team created their certificate. Anyone who trusted such a certificate, directly or indirectly, would be liable to be fooled by a forged certificate that looked the same as the real one as far as MD5 is concerned. Since at least one of the CAs that the major browsers trust by default still used MD5 at the time, a phisher could have used this certificate to spoof any site in the world.
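Why does a hash weakness matter so much? Because a signature covers the hash of a certificate, not the certificate itself, so two certificates with the same hash share a valid signature. Here is a toy Python sketch of that idea, under loud assumptions: `weak_digest` keeps only the first two bytes of MD5 so that a match can be found by brute force in well under a second, HMAC with a shared key stands in for the CA's real public-key signature, and every name and certificate string is invented. The real attack did nothing so crude. It relied on a far cleverer collision technique against full MD5.

```python
import hashlib
import hmac
from itertools import count


def weak_digest(data: bytes) -> bytes:
    """Toy stand-in for a broken hash: just the first 2 bytes of MD5."""
    return hashlib.md5(data).digest()[:2]


def ca_sign(key: bytes, cert_body: bytes) -> bytes:
    """Sign the digest of the body, not the body. HMAC is a stand-in for RSA."""
    return hmac.new(key, weak_digest(cert_body), hashlib.sha256).digest()


def ca_verify(key: bytes, cert_body: bytes, signature: bytes) -> bool:
    """Check that the signature matches the digest of this body."""
    return hmac.compare_digest(ca_sign(key, cert_body), signature)


def find_matching_body(target: bytes, prefix: bytes) -> bytes:
    """Brute-force some padding so prefix + padding has the target's toy digest."""
    wanted = weak_digest(target)
    for n in count():
        candidate = prefix + str(n).encode()
        if weak_digest(candidate) == wanted:
            return candidate


ca_key = b"hypothetical CA signing key"
honest = b"subject=example.test; is_ca=false"
signature = ca_sign(ca_key, honest)  # the CA signs an innocent-looking cert

rogue = find_matching_body(honest, b"subject=rogue; is_ca=true; pad=")

print(rogue != honest, ca_verify(ca_key, rogue, signature))
```

The CA never saw the rogue body, yet the signature it issued for the honest one checks out against both. Swap in a hash without known shortcuts and the search becomes hopeless, which is the whole argument for moving off MD5.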

Being White Hats, Sotirov and company took several steps to ensure that their particular certificate wouldn't be used that way and to give the Good Guys a chance to take preventative steps. This is standard practice. The general pattern is:

Someone publishes a theoretical paper saying that some security measure (in this case MD5) is vulnerable to attack.
Everyone nods thoughtfully and goes back to what they were doing.
Time passes. In some cases years.
Someone actually writes code to exploit the theoretical weakness. Ideally, it's a White Hat who's trying to get the major players off the dime. Less ideally, it's a Black Hat actually trying to use the exploit for ill. Quite often it's the White Hats because (we hope, and experience bears this out) the Black Hats are too busy scamming with the tools they already have to develop sophisticated new exploits.
If it's a White Hat, the next step is to tell the relevant major players "Remember that theoretical paper on (some vulnerability)? I'm going to present an exploit based on it at (some conference). Here's everything I know about the problem and what to do about it." This gives the major players lead time before the cat is out of the bag and the Black Hats have access.
In many cases, including the present one, the White Hats don't present the actual code they used, just the general technique and the results (in this case the bogus certificate) -- at least not until everyone's satisfied that the threat has been addressed. This means that anyone trying to do ill will actually have to have some programming skills. In the present case, it took a team of seven highly-skilled researchers six months to produce their result.

So ... it's highly unlikely that anyone will be able to use Sotirov & co.'s certificate to steal your bank details. It's only a matter of time before someone else forges a bogus certificate that could have been so used, had the CAs not taken proper steps, but by that time the MD5-based CA certificates will have been taken down. In particular, Verisign has already taken down the one under their direct control and has said that those controlled by resellers should be gone by the end of January.

There's an interesting wrinkle in Sotirov's blog post that I linked to. Thanks to the recent legal wrangling over a paper detailing an attack on the Mifare Classic subway card system, people have become more skittish about giving a heads-up to just anyone.

Most hackers (in the older sense) believe strongly that suppressing useful information is counterproductive. Mifare is a case in point, particularly since Mifare Classic had already been successfully attacked, and the paper Mifare was trying to suppress is publicly available right here. But I digress.

Back at the storyline, Sotirov's team was concerned enough about dealing directly with CAs that "did not have a significant track record of responding to public security vulnerabilities in their systems" and so might "overreact and attempt to stop or delay our presentation through legal or other means" that they took the extra precautions of getting non-disclosure agreements from the browser vendors and using Microsoft as an intermediary in talking to the CAs.

The net effect was the same: The team was able to alert the geeks at Verisign and elsewhere without giving any trigger-happy suits anything to go on, and Verisign in turn acted quickly to deal with the problem. Sotirov closes his post on an encouraging note:

Cryptographic algorithms can become broken overnight, so it is important for CAs to demonstrate the ability to react quickly to such issues. I'm happy with the response from Verisign and the other affected CAs. Based on our experience with them, I would not hesitate to work with them directly on any vulnerabilities I might discover in the future.

Tuesday, January 6, 2009

Cynicism 2.0

Actually, I'm not particularly cynical about the web. I like the web. I like blogging about it.

It's just that I'm congenitally skeptical. And on the web, there's just so much out there to be skeptical of.

Monday, January 5, 2009

Web publishing goes to the Dickens

Charles Dickens is known for many things; well, at least for A Christmas Carol and possibly Oliver Twist. He is perhaps somewhat less well-known for his use of unusual names (Ebenezer Scrooge, Martin Chuzzlewit, etc.) and for having published many of his works in serial form.

Doing its bit in the ongoing quest to Make Money Writing on this "Web" Thing, Underland Press has picked up the baton with the Wovel -- rhymes with "novel", looks like an inside-out vowel and is not to be confused with the Wovel, a completely different beast which would appear to rhyme with "shovel".

There are two gimmicks here: First, the Wovel appears in serial form. The author writes from Thursday to Sunday and posts the next installment in time for everyone to read it at work on Monday. During lunch or other company-approved hours, I might hasten to add. Then comes the other gimmick. Readers are invited to vote on what happens next. Votes are taken from Monday through Thursday, at which point our intrepid author starts it all over again.

Two cliffhangers for the price of one: What will your fellow readers vote for (see the results in real time), and where will the author take you from there? Always leave 'em wanting more ...

So, how do they make money off of this? As far as I can tell, they sell books.

[As far as I can tell, the Wovel wrapped up in January of 2010, about a year after this post was published. Underland looks like it's still in business, though. -- D. H.]

More AI kool-aid

Another in the occasional series of "not directly related to the web" posts -- if only to remind us that such topics still exist:

The latest edition of 60 Minutes features a piece on functional Magnetic Resonance Imaging (fMRI) entitled Reading Your Mind. Back in the Golden Age of Science Fiction (more on that later), it was a given that, since the brain had been discovered to give off electromagnetic signals, it would be possible to interpret those signals to determine what someone was thinking. Fast-forward several decades and throw in a whole lot of technology, and we can now do some interesting things. For example:

Determine which concrete noun, drawn from a limited set, a subject is thinking of (there's some nice stagecraft in the video version of that segment, but never mind).
Determine whether a subject is thinking of adding or of subtracting two given numbers.
Determine whether a subject has seen a particular view of a virtual environment before.
(They don't give as much detail on this one): Distinguish signatures for kindness, hypocrisy and love.

To go with that, a few grains of salt. All of the above is done under very artificial conditions. For starters, the subject is lying in an MRI machine.

The thoughts being identified are carefully and narrowly defined. In the first two cases, the possible responses are drawn from a small, specific set. This seems very much like the case of speech recognition. Currently it's not too hard to recognize words and phrases spoken by a variety of speakers so long as they're drawn from a small list ("technical support" or "order status", say).

In the third case, the system is really trying to spot a marker for recognition of a familiar scene. In the last case, who knows what they're doing? All in all, the subtitle, Incredible Research Lets Scientists Get A Glimpse At Your Thoughts, hits it about right. Great research, but the end result is a "glimpse" not a "reading".

There are also some significant ethical questions, many of which are raised in the piece by Paul Root Wolpe, director of the Center for Ethics at Emory University in Atlanta. Not the least of which is, to what extent should we believe any of this?

Granted, we are looking at objectively observable effects, reproducible under laboratory conditions. No one's doubting the basic data. But the further you get from controlled, concrete results like "when the subject saw a picture of a hammer, these sites lit up", the more room there is for interpretation. In one disturbing case in India, a woman was convicted of murder based at least in part on fMRI evidence that indicated she was familiar with the circumstances of her husband's death by poisoning. This seems like several more levels of extrapolation than the current state of the art warrants, particularly when lives are at stake.

I would say that the star of the show, CMU neuroscientist Marcel Just, is also guilty of unwarranted extrapolation, though in a mostly harmless way, in this exchange at the end:

"Do you think one day, who knows how far into the future, there'll be a machine that'll be able to read very complex thought like 'I hate so-and-so'? Or you know, 'I love the ballet because…'?" [Lesley] Stahl asked.

"Definitely. Definitely," Just said. "And not in 20 years. I think in three, five years."

"In three years?" Stahl asked.

"Well, five," Just replied with a smile.

Really? I'd like to try that. Five years from now (so January 2014) I'll gladly step into whatever test apparatus Just and CMU have set up at the time provided:

Everyone else involved in the experiment is willing to go through the same steps first (hey, I know people do MRIs all the time, but still ...)
I will have never seen or been seen/scanned/whatever by the equipment before the test.
The equipment will render its interpretations without human aid (naturally, there will be people setting up equipment, pushing buttons and so forth, but the machine's opinions must be its own).
In no case will I be shown images or words or receive other cues to respond to beyond those listed below (no cold reading techniques, please, however well-intentioned).
Directly before the test, I will submit a strongly encrypted message containing my answers to the questions. I will make every good faith effort to think of the same answers during the test (perhaps the CMU team will have some way of verifying this?). My pre-test answers will be decrypted directly after the test for comparison (perhaps the team can glean the relevant passphrase for bonus points?).
I assume any remaining details, such as "What's a major news outlet?" can be resolved in due time, and that in general everyone assumes good faith.
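For the sealed-answers condition, here is a minimal sketch of one way it could work. It swaps the encrypted message for a hash commitment, which is an assumption made for illustration, not anything prescribed above. I publish a SHA-256 digest of my answers plus a secret random nonce before the test, then reveal both afterward so anyone can check that the answers were not changed. The function names `commit` and `verify` are hypothetical.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def commit(answers: str) -> Tuple[str, bytes]:
    """Return (commitment, nonce). Publish the commitment; keep the nonce secret."""
    # A random nonce keeps anyone from guessing the answers by hashing candidates.
    nonce = secrets.token_bytes(32)
    digest = hashlib.sha256(nonce + answers.encode("utf-8")).hexdigest()
    return digest, nonce


def verify(commitment: str, nonce: bytes, answers: str) -> bool:
    """Check that the revealed nonce and answers match the published commitment."""
    expected = hashlib.sha256(nonce + answers.encode("utf-8")).hexdigest()
    return hmac.compare_digest(commitment, expected)


if __name__ == "__main__":
    # Before the test: publish `sealed`, keep `nonce` and the answers private.
    sealed, nonce = commit("art form: ballet")
    # After the test: reveal both and let anyone verify.
    print(verify(sealed, nonce, "art form: ballet"))
```

Unlike real encryption, there is no passphrase here for the team to glean; the nonce plays that role, and it only needs to stay secret until the reveal.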

Before I go on: there is, or at least ought to be, a definite James Randi cast to the conditions I'm describing. That's not an accident, but it's not because I doubt for a minute that there's real science involved here. Rather, I'm convinced there is real science going on and I'd like to see a nice rigorous test of it, to the best of my limited abilities to concoct one. So without further ado, here are some tasks I would like to see put to the machine:

On hearing the cue "person and feeling" I will think of a person whose name has appeared in the New York Times in the previous year, and consider my feelings about that person. The machine should identify the person and whether I regard the person favorably, unfavorably or neutrally.
On hearing the cue "art form and reasons", I will think of an art form and three reasons one might like it. The machine should identify the art form and the reasons. [There may need to be some negotiation over how specific the art form could be. Ballet should be fine, origami is probably OK, wrought-iron sculpture or Joycean wordplay might be a bit too specific, and odd-meter blues-based oboe improvisation in f-sharp minor, however artful, would be cutting way too thin -- though I'd be interested to hear one.]
On hearing the cue "U.S. county", I will think of a county (or parish, or borough as the case may be) somewhere in the United States. The machine should identify the county (including the state it is in).
On hearing the cue "celebrity", I will think of an image of a celebrity mentioned in a news item from a major outlet no more than three days before the test. The machine should identify the celebrity.
On hearing the cue "musical selection" I will think of a musical selection released as a track (or whatever we're calling them then) by a major label (or whatever is publishing music then). The machine should identify the selection.
On hearing the cue "Euclid", I will think of the proof of a proposition from Euclid's Elements (books I - XII). The machine should identify the proposition in question.
On hearing the cue "Shakespeare", I will think of a line from the 1609 printing of Shakespeare's Sonnets. The machine should identify the line.
On hearing the cue "hoops" I will think of an NCAA Division I basketball team. The machine should identify the team.
On hearing the cue "field note", I will think of a post from this blog. The machine should identify the post.
[While proofreading, I thought of one more. It would make an even ten, though I wanted an odd number to ensure that "majority" is well-defined: On hearing the cue "legal principle", I will think of a legal principle defined in Black's Law Dictionary. The machine should identify that principle.]

I will gladly publish the results, whomever they favor, to this blog or, if that's not available, to the web in some other form (ah .. so this is about the web after all).

At this writing, I'm confident that no machine will exist in five years, in CMU's lab or anyone else's, capable of identifying the majority of those thoughts. I would be impressed by a machine that could reliably identify any of them.

I suspect that the best shot would be to pick up on the short-term "phonological loop" by which we are thought to store audio "images", as when silently repeating a phone number in order to remember it. A couple of the items are aimed at thwarting that, and I would certainly feel free to think of, say, the sights and sounds unique to a particular county rather than the name of that county, particularly if it were one I'd visited.

I also suspect that broadening the range of possible answers significantly raises the bar, and that the next five years' research, however diligent, will not be enough to clear it.

But I'd be happy to be proved wrong.

P.S. Wolpe (the ethicist) says "I always tell my students that there is no science fiction anymore. All the science fiction I read in high school, we're doing." While I appreciate that many of the hypothetical ethical problems raised by science fiction are no longer hypothetical, it's not hard to come up with a list of science fiction staples that are nowhere near reality. [Maybe he just didn't read that much science fiction?] Since he's an ethicist and not, say, a physicist, I won't trot out anywhere near the full list that came to mind, but one could start with faster-than-light travel, or even interstellar travel at half light-speed, or even accelerating anything macroscopic to half light-speed, or a permanent moon base, or a Star Trek-style transporter, or ...

[Stunningly, no one contacted me to take me up on my carefully-crafted challenge. And, of course, five years down the line it doesn't look like anyone is claiming anything close to Marcel Just's prediction. A deeper question: What prompts people who, one would think, ought to know better than to make such predictions to make them anyway? --D.H. May 2015]
