Saturday, May 1, 2021

Please leave us a 5-star review

It's been long enough that I can't really say I remember for sure, and I can't be bothered to look it up, but as I recall, reviews were supposed to be one of the main ways for the web to correct itself.  I might advertise my business as the best ever, even if it's actually not so good, but not to worry.  The reviewers will keep me honest.  If you're searching for a business, you'll know to trust your friends, or you'll learn which reviewers are worth paying attention to.  Good information will drive out bad and everyone will be able to make well-informed decisions.

This is actually true, to an extent, but I think it's about the same extent as always.  Major publications try to develop a reputation for objective, reliable reviews, as do some personalities, but then, some also develop a reputation for less-than-objective reviews.  Some, even, may be so reliably un-objective that there's a bit of useful information in what they say after all.  And you can always just ask people you know.

But this is all outside the system of customer reviews that you find on web sites all over the place, whether provided by the business itself, or companies that specialize in reviews.  These, I personally don't find particularly useful or, if I were feeling geekly, I'd say the signal/noise ratio is pretty low.  It turns out there are a couple of built-in problems with online reviews that were not only predictable, but were predicted at the time.

First, there's the whole question of identity on the internet.  In some contexts, identity is an easy problem: an identity is an email address or a credit or debit account with a bank, or ownership of a particular phone, or something similar that's important to a person in the real world.  Email providers and banks take quite a bit of care to prevent those kinds of identities from being stolen, though of course it does still happen.

However, for the same reason, we tend to be a bit stingy with this kind of identity.  I try hard not to give out my credit card details unless I'm making an actual purchase from a reputable merchant, and if my credit card details do get stolen, that card will get closed and a new one opened for the same account.  Likewise, I try not to hand out my personal email or phone number to just anyone, for whatever good that does.

When it comes to reviews, though, there's no good way to know who's writing.  They might be an actual customer, or an employee of the business in question, or they might be several time zones away writing reviews for money, or they might even be a bot.   Platforms are aware of this, and many seem to do a good job of filtering out bogus reviews, but there's always that lingering doubt.  As with identities in general, the stakes matter.  If you're looking at a local business, the chances are probably good that everyone who's left a review has actually been there, though even then they might still have an axe to grind.  In other contexts, though, there's a lot more reason to try to game the system.

But even if everyone is on the up-and-up and leaving the most honest feedback they can, there are still a few pitfalls.  One is selection bias.  If I've had a reasonably good experience with a business, I'll try to thank the people involved and keep them in mind for future work, or mention them if someone asks, but I generally don't take time to write a glowing review -- and companies that do that kind of work often seem to get plenty of business anyway.

If someone does a really horrible job, or deals dishonestly, though, I might well be in much more of a mood to share my story.  Full disclosure: personally I actually don't tend to leave reviews at all, but it's human nature to be more likely to complain in the heat of the moment than to leave a thoughtful note about a decent experience, or even an excellent experience.  In other words, you're only seeing the opinions of a small portion of people.  That wouldn't be so bad if the portion was chosen randomly, but it's anything but.  You're mostly seeing the opinions of people with strong opinions, and particularly, strong negative opinions.

The result is that reviews tend to cluster toward one end or the other.  There are one-star "THIS PLACE IS TERRIBLE!!!" reviews, there are five-star "THIS PLACE IS THE MOST AWESOME EVER!!!" reviews, and not a lot in between.  A five-point scale with most of the action at the endpoints is really more of a two-point scale.  In effect, the overall rating is the weighted average of the two: the number of one-star reviews plus five times the number of five-star reviews, divided by the total number of reviews.  If the overall rating is close to five, then most of the reviews were 5-star.  If it's 3, it's much more likely that the good and the bad are half-and-half than that most of the reviews are 3-star.
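For anyone who wants to see the arithmetic spelled out, here is a minimal Python sketch.  It assumes the idealized case described above, where every review is either one star or five stars, which real review sites only approximate.  The function names are my own, made up for illustration, and are not taken from any review platform.

```python
def two_point_average(one_star: int, five_star: int) -> float:
    """Overall rating when every review is either 1 star or 5 stars.

    This is the weighted average described in the text: one-star count
    plus five times the five-star count, divided by the total.
    """
    total = one_star + five_star
    if total == 0:
        raise ValueError("need at least one review")
    return (one_star + 5 * five_star) / total


def implied_five_star_share(average: float) -> float:
    """Fraction of five-star reviews implied by an overall rating.

    Assumption: only 1-star and 5-star reviews exist, so the rating
    slides linearly from 1.0 (all pans) to 5.0 (all raves).
    """
    if not 1.0 <= average <= 5.0:
        raise ValueError("average must be between 1 and 5")
    return (average - 1.0) / 4.0


if __name__ == "__main__":
    # Half pans, half raves lands exactly in the middle of the scale.
    print(two_point_average(50, 50))
    # Under the same assumption, a 3.0 implies an even split.
    print(implied_five_star_share(3.0))
```

Under that assumption, the overall rating is just a rescaled count of raves versus pans, which is another way of saying the five-point scale has collapsed to two points.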

The reader is left to try to decide why the reviewers have such strong opinions.  Did the car wash do a bad job, or did the reviewer somehow expect them to change the oil and rotate the tires as well, and then get angry when they didn't?  Is the person praising a consultant's integrity actually just their cousin?  Does the person saying that a carpenter did a great job with their shelves actually know much about carpentry or did they just happen to like the carpenter's personality?  If the shelves collapse after a year and a half, are they really going to go back and update their review?  Should they, or should they maybe not store their collection of lead ingots from around the world on a set of wooden shelves?

Specifics can help, but people often don't provide much specific detail, particularly for positive reviews, and when they do, it's not always useful.  If all I see is three five-star reviews saying "So and so was courteous, professional and did great work", I'm not much better off than when I started.  If I see something that starts out with "Their representative was very rude.  They parked their truck in a place everyone in the neighborhood knows not to park.  The paint on the truck was chipped.  Very unprofessional!" I might take what follows with a grain of salt.

There's a difference, I think, between an opinion and a true review.  A true review is aimed at laying out the information that someone else might need to make a decision.  An opinion is just someone's general feeling about something.  If you just ask people to "leave a review", you're going to get a lot more personal impressions than carefully constructed analyses.  Carefully constructing an analysis is work, and no one's getting paid here.

Under the "wisdom of crowds" theory, enough general impressions will aggregate into a complete and accurate assessment.  A cynic would say that this is like hoping that if you put together enough raw eggs, you'll end up with a soufflé, but there are situations where it can actually work (for a crowd, that is, not for eggs).  The problem is that in many cases you don't even have a crowd.  You have a handful of people with their various experiences and opinions.

This all reaches its logical conclusion in the gig economy.  When ride share services first started, I used to think for a bit about what number to give a driver.  "They were pretty good, but I wish they had driven a bit less (or in some cases maybe more) aggressively".  "The car was pretty clean, but there was a bit of a funny smell" or whatever.

Then I started noticing that almost all drivers had 5-star ratings, or close.  The number before the decimal point doesn't really mean anything.  You're either looking at 5.0 or 4.something.  A 4.9 is still a pretty good rating, but a 4.0 rating is actually conspicuously low.  I don't know the exact mechanics behind this, but the numbers speak for themselves.

It's a separate question to what extent we should all be in the business of rating each other to begin with, but I'll let Black Mirror speak to that.

Following all this through, if I give someone a 4-star review for being perfectly fine but not outstanding, I may actually be putting a noticeable dent in their livelihood, and if I give someone 3 stars for being pretty much in the middle, that's probably equivalent to their getting a D on a test.  So anyone who's reasonably good gets five stars, and if they're not that good, well, maybe they were just having a bad day and I'll just skip the rating.  If someone actively put my life in danger, sure, they would get an actual bad rating and I'd see if I could talk to the company, but beyond that ... everyone is awesome.

Whatever the reasons, I think this is a fairly widespread phenomenon.  Reviews are either raves or pans, and anyone or anything with reviews much short of pure raves is operating at a real disadvantage.  Which leads me back to the title.

Podcasts that I listen to, if they mention reviews at all, don't ask "Please leave a review so we can tell what's working and what we might want to improve".  They ask "Please leave a 5-star review".  The implication is that anything less is going to be harmful to their chances of staying in business.  Or at least that's my guess, because I've heard this from science-oriented podcasts and general-interest shows that clearly take care to present their stories as objectively as they can, the kind of folks who might genuinely appreciate a four-star review with a short list of things to work on.

This is a shame.  A five-point scale is pretty crude to begin with, but when it devolves to a two-point scale of horrible/awesome, it's not providing much information at all, pretty much the opposite of the model that I'm still pretty sure people were talking about when the whole ratings thing first started.