Wednesday, November 4, 2009

60 Minutes and the MPAA: Part IV - Error bars

In the 60 Minutes piece I've been referencing, A-list director Steven Soderbergh drops the oft-quoted figure of $6.1 billion per year in industry losses. This figure comes from a 2006 study by consulting firm L.E.K. It's easy to find a summary of this report. Just google "video piracy costs" and up it comes. Depending on your browser settings, you may not even see the rest of the hits, but most of the top ones are repeats or otherwise derived from the L.E.K. study. And you didn't need to see anything else anyway, did you?

So ... $6.1 billion. Let's assume for the moment that the figure is relevant -- more on that in the next post. How accurate is it?

One of the handful of concepts I retained from high school physics, beyond Newton's laws, was that of significant digits, or "sig digs" as the teacher liked to call them. By convention, if I say "6.1 billion", I mean that I'm confident that it's more than 6.05 billion and less than 6.15 billion. If I'm not sure, I could say 6 billion (meaning more than 5.5 billion and less than 6.5 billion).
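To make the convention concrete, here is a minimal Python sketch that turns a figure, as written, into the range it implicitly claims. The function name `implied_range` is just something I made up for illustration, and the rule it applies (plus or minus half a unit in the last written digit) is the rough convention described above, not a formal statement of uncertainty.

```python
from decimal import Decimal


def implied_range(figure):
    """Return the (low, high) range implied by the significant-digits convention.

    `figure` is the number as written, e.g. "6.1" or "6". The convention
    assumed here is plus or minus half a unit in the last written digit.
    """
    value = Decimal(figure)
    # Exponent of the last written digit: "6.1" -> -1, "6" -> 0.
    exponent = value.as_tuple().exponent
    half_unit = Decimal(1).scaleb(exponent) / 2
    return value - half_unit, value + half_unit


for figure in ("6.1", "6"):
    low, high = implied_range(figure)
    print(f"{figure} billion means between {low} and {high} billion")
```

Passing the figure as a string matters: it preserves how many digits were actually written, which is the whole point of the convention.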

Significant digits are just a rough-and-ready convention. If you're serious about measurement you state the uncertainty explicitly, as "6.1 billion, +/- 300 million". My personal opinion is that even if you're not being that rigorous, it's a bad habit to claim more digits than you really know, and a good habit to question anything presented like it's known to an unlikely number of digits.

The point of all this is that precise results are rare in the real world. Much more often, the result is a range of values that we're more or less sure the real value lies in. For extra bonus points, you can say how sure, as "6.1 billion, plus or minus 300 million, with 95% confidence".

From what I can make out, L.E.K. is a reputable outfit and made a legitimate effort to produce meaningful results and explain them. In particular, they didn't just try to count up the number of illegal DVDs sold. If I buy an illegal DVD but go and see the movie anyway, or I never would have seen the movie at all if not for the DVD, it's hard to claim much harm. So L.E.K. tried to establish "how many of their pirated movies [viewers] would have purchased in stores or seen in theaters if they didn't have an unauthorized copy". They did this by surveying 17,000 consumers in 22 countries, conducting focus groups, and applying a regression model to estimate figures for countries they didn't survey. (This is from a Wall Street Journal article on L.E.K.'s web site and from the "methodology" section of the summary mentioned above.)

On average, they surveyed about 800 people per country, presumably more in larger countries and fewer in smaller. That's enough to do decent polling, but even an ideal poll typically has a statistical error of a few percent. This theoretical limit is closely approached in political polls in countries with frequent elections, because the polling is done over and over and the pollsters have detailed knowledge of the demographics and how they might affect results. They apply this knowledge to weight the raw results of their polling in order to compensate for their sample not being completely representative (for example, it's weighted towards people who will answer the phone when they call and are willing to answer intrusive questions).
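For a sense of what "a few percent" means, here is a rough sketch of the textbook margin-of-error calculation for an ideal poll. The assumptions are mine, not L.E.K.'s: a simple random sample, a yes/no question with the worst-case 50/50 split, a 95% confidence level (z = 1.96), and respondents spread evenly across countries. The function name `margin_of_error` is likewise just for illustration.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Textbook margin of error for a proportion from a simple random sample.

    Assumes an ideal poll: no weighting, no non-response bias, and a
    population much larger than the sample. z=1.96 corresponds to 95%
    confidence; proportion=0.5 is the worst case.
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


respondents = 17_000
countries = 22
# Assumed even split; the real per-country samples surely varied.
per_country = respondents / countries

moe = margin_of_error(per_country)
print(f"About {per_country:.0f} respondents per country")
print(f"Ideal-case margin of error: +/- {moe:.1%}")
```

With these assumptions the per-country figure comes out to roughly +/- 3.5%. That is the best case, before any of the real-world problems with sampling and honesty come into play.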

For international market research in a little-covered subject, none of this is available. So even if you have a reasonably large sample, you still have to estimate how well that sample represents the public at large. There are known techniques for this sort of thing, so it's not a total shot in the dark, but I don't see any way you can assume anything near the familiar "+/- 3%" margin. At a wild guess, maybe more like 10-20%, by which I mean you're measuring how the population at large would answer the question, and not what they would actually do, with an error of -- who knows but let's say -- 10-20%. More than the error you'd assume by just running the sample size and the population size through the textbook formula, anyway.

All of this is assuming that people won't lie to surveyors about illicit activity, and that they are able to accurately report what they might have done in some hypothetical situation. Add to that the uncertainties in the model for estimating countries not surveyed, and the nice, authoritative statement that "Piracy costs the studios $6.1 billion a year" comes out as "Based on surveys and other estimates done in 2006, we think that people who bought illegal DVDs might have spent -- I'm totally making this up here -- somewhere between $4 billion and $8 billion on legitimate fare that year instead, but who really knows?"

Now $4 billion, or whatever it might really be, is still serious cash. The L.E.K. study at the least makes a good case that people are spending significant amounts on pirated goods they might otherwise have bought from studios. I'm not disputing that at the moment. Rather, I'm objecting to a spurious air of precision and authority where very little such exists. More than that, I'm objecting to an investigative news program taking any such key figure at face value without examining the assumptions behind it or noting, for that matter, that it was commissioned by the same association claiming harm.

And again, this is still leaving aside the crucial question of relevance.
