Saturday, December 31, 2011

Rumours and tweets of rumours

Someone at the Guardian (aided by academics at several universities) put in a bunch of overtime analyzing the Twitter traffic from last summer's riots in England.  In all, they traced seven rumors: five that turned out to be false, one that turned out to be true, and one they classify as "unsubstantiated."  They then put together a nice interactive graphic of the results, including a graph of the volume of traffic over time and a sort of cloud diagram, color-coded to show support for, opposition to, questioning of and commentary on the rumor in question, with size indicating the "influence" of the tweet, based on the number of followers its originator had.

The results are fascinating.  You should probably have a look at them yourself (here's the link again) before going on.

There is a fairly widespread notion that the web corrects itself.  People may put up misinformation, whether deliberately or in good faith, but eventually the real story will come out and supplant it.  The lead-in to the Guardian interactive graphic says so in so many words: "... Twitter is adept at correcting misinformation ..."

I don't see a lot of support for this in the data presented.

In the self-correcting model, you would expect a false rumor to start with an initial wave of green (support), coming with the original misinformation, which is then steadily replaced by red (denial), with possibly some yellow (questioning) and gray (commentary) in between.  Following is what actually happened for the five rumors determined to be definitely false.
  • Rioters attack London Zoo and release animals:  Initially, green traffic grows.  After a while, red traffic comes in denying the rumor.  Hours later, there is influential red traffic, but the green traffic is still about as influential.  Traffic then dwindles, with the last bits being green, still supporting the rumor hours after it has been disputed.
  • Rioters cook their own food in McDonald's: This one was picked up early by the website of the Daily Mail, which stated that there had been reports of this happening.  In any case, the green traffic surges moderately twice, before peaking at high volume several hours later.  There is no red traffic to speak of.
  • London Eye set on fire:  This one actually does follow the predicted pattern.  The initial green is quickly joined by yellow and red.  The proportion of red steadily grows, and as traffic dies down it is almost entirely red.
  • Rioters attack a children's hospital in Birmingham:  In this case one source of denials was someone actually working at the hospital.  Again, a strong surge of green is gradually taken over by red, but not completely.  As traffic dies down, the rumor is still being circulated as true.  Late in the game it resurges, though once more there is a countersurge of denial.
  • Army deployed in Bank [I believe this refers to the area in London near the Bank of England and the Bank tube station]:  Traffic starts out yellow, as a question over a photo (which was actually a photo of tanks in Egypt).  Red traffic begins to grow, but so does green, and yellow continues to dominate.  Eventually everything dies down.  The last bits of traffic are yellow.
In summary: One of the five cases follows the "good information drives out bad" model.  One other more or less follows it.  Two are an inconclusive mix of support and denial.  One consists almost entirely of support for a false rumor.
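
For anyone who wants to try this kind of eyeballing on their own data, here is a minimal sketch of one way to turn the "green steadily replaced by red" expectation into a check.  It is not the Guardian's method.  The `Tweet` fields, the split of the timeline into thirds, and the 0.8 threshold are all my own assumptions, and the two toy timelines are made up purely for illustration.  The one idea taken from the graphic is using follower count as a stand-in for "influence."

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    hour: float     # hours since the rumor first appeared (hypothetical field)
    stance: str     # "support", "oppose", "question" or "comment"
    followers: int  # author's follower count, used here as "influence"


def oppose_share(tweets):
    """Influence-weighted share of opposing tweets among support + oppose.

    Returns None when there is no supporting or opposing traffic at all.
    """
    support = sum(t.followers for t in tweets if t.stance == "support")
    oppose = sum(t.followers for t in tweets if t.stance == "oppose")
    total = support + oppose
    return oppose / total if total else None


def looks_self_corrected(tweets, threshold=0.8):
    """True if denial dominates the last third of the timeline and its
    share is higher there than in the first third.

    The thirds and the threshold are arbitrary choices, not measured values.
    """
    if not tweets:
        return False
    ordered = sorted(tweets, key=lambda t: t.hour)
    start, end = ordered[0].hour, ordered[-1].hour
    span = end - start
    early = [t for t in ordered if t.hour <= start + span / 3]
    late = [t for t in ordered if t.hour >= end - span / 3]
    early_share = oppose_share(early)
    late_share = oppose_share(late)
    if late_share is None:
        return False
    rising = early_share is None or late_share > early_share
    return late_share >= threshold and rising


# Made-up timeline in which denial takes over by the end.
corrected = [
    Tweet(0, "support", 1000),
    Tweet(1, "support", 500),
    Tweet(2, "question", 300),
    Tweet(3, "oppose", 2000),
    Tweet(5, "oppose", 1500),
    Tweet(6, "oppose", 800),
]

# Made-up timeline in which support is still circulating at the end.
lingering = [
    Tweet(0, "support", 1000),
    Tweet(2, "oppose", 3000),
    Tweet(4, "support", 2500),
    Tweet(5, "oppose", 2000),
    Tweet(6, "support", 400),
]

print(looks_self_corrected(corrected))
print(looks_self_corrected(lingering))
```

A check like this only looks at support versus denial, so a rumor whose traffic is mostly questions would simply fail it rather than get a category of its own.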

This was in one of the world's most connected cities, with widespread access to the internet, cell phones, land lines, television, newspapers, live webcams and whatever else.  Only in the case where the rumor was trivial to refute (for example via this webcam) did Twitter appear to self-correct. 

One would be hard-pressed, I think, to distinguish between the actual true rumor (Miss Selfridge set on fire -- that's the name of a store, not a person) and the false rumor about McDonald's based solely on the volume and influence of tweets confirming and denying.  Likewise, the unsubstantiated rumor (Police 'beat 16-year-old girl') follows its own pattern, mostly surges of green, but interspersed with yellow.

This may seem like a lot of argumentation just to say "Take your tweets with a grain of salt", but pretty much everything tastes better with data.