Every discipline has its set of taboo behaviors. A poker player with a bad habit of peeking at other players’ cards will be asked to leave the table (if they’re lucky). An author who commits plagiarism will probably see their career go down in flames. (Probably! Not always.)

In data science, one such verboten behavior is cherry picking—the practice of starting with a desired conclusion, then searching for data to confirm it.

Cherry picking is frowned upon because it’s actually quite easy to massage data to tell the story you want people to hear [1]. Marketing directors report on things like “search volume” or “session duration” to make their website look successful, even if nobody’s buying. Enron did whatever they did to make their revenue look huge. Happens all the time. But it’s not a good thing to do!
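Here's a minimal sketch of how easy it is, in Python. Everything in it is made up for illustration: the metric names, the number of metrics, and the "last value minus first value" growth measure are all assumptions of mine, and the data is pure random noise. The point is only that if you track enough numbers and report the best one, you get a success story for free.

```python
import random


def make_noise_metrics(n_metrics=20, n_months=12, seed=0):
    """Build hypothetical metrics. Each one is a list of monthly values
    drawn from the same distribution, so there is no real trend in any of them."""
    rng = random.Random(seed)
    return {
        f"metric_{i}": [rng.gauss(100, 10) for _ in range(n_months)]
        for i in range(n_metrics)
    }


def change(series):
    """Crude 'growth' measure (an assumption): last value minus first value."""
    return series[-1] - series[0]


def cherry_pick(metrics):
    """Return the name of whichever metric happens to show the most growth."""
    return max(metrics, key=lambda name: change(metrics[name]))


metrics = make_noise_metrics()
best = cherry_pick(metrics)
average_change = sum(change(s) for s in metrics.values()) / len(metrics)

print(f"average change across all metrics: {average_change:+.1f}")
print(f"change in the one we chose to report ({best}): {change(metrics[best]):+.1f}")
```

The honest summary is the average across all the metrics. The cherry-picked one is whichever line of noise happened to end higher than it started.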

Check out this quote:

“I wanted to show the scale of the problems we are facing and, at the same time, show how the world is changing to highlight where the world is making progress. I thought it was important to highlight that we are making progress so that we can learn from our successes and encourage us today to tackle the problems we are facing.”

“I thought it was important to highlight that we are making progress.”

If you’ve already decided that we’re making progress, and you look for data to demonstrate that fact, will you find it? Yes.

If you do that, are you cherry picking? Also yes.

The above quote is from Max Roser, founder and Co-Director of Our World in Data, a website funded by the Bill and Melinda Gates Foundation [2].

The thing about cherry picking is it works both ways. So, inevitably, there are those who cite different data and draw different conclusions.

I personally find the minutiae of what constitutes a meaningful metric, or threshold, or an appropriate time scale for analyses of poverty to be pretty interesting stuff. But even if you don’t care, the point is that the data support multiple conclusions.
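To see how a threshold alone can flip the story, here's a small Python sketch. Every number in it is invented: the two income lists and the two poverty lines (2.0 and 8.0 dollars a day) are hypothetical stand-ins, not figures from Our World in Data or from anyone else's analysis.

```python
def poverty_rate(incomes, line):
    """Share of people whose daily income falls below the poverty line."""
    return sum(1 for income in incomes if income < line) / len(incomes)


def trend(then, now, line):
    """Describe how the poverty rate moved between two periods for a given line."""
    before, after = poverty_rate(then, line), poverty_rate(now, line)
    if after < before:
        return "falling"
    if after > before:
        return "rising"
    return "flat"


# Hypothetical daily incomes for the same (made-up) country in two periods.
incomes_then = [1.0, 1.5, 1.8, 3.0, 5.0, 9.0, 12.0, 20.0]
incomes_now = [2.2, 2.5, 2.8, 3.5, 6.0, 7.5, 12.0, 20.0, 4.0, 5.0]

# Two hypothetical poverty lines, chosen only to illustrate the effect.
for line in (2.0, 8.0):
    print(f"poverty line {line}: poverty is {trend(incomes_then, incomes_now, line)}")
```

Same people, same two snapshots. With the low line, everyone has cleared the bar and poverty is falling. With the high line, a larger share of the population sits below it and poverty is rising. Neither calculation is wrong. They just answer different questions.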

And that points to the one thing that seems certain here: Bill Gates wants you to believe things are getting better.

Why?

We can’t know, of course. It might be something personal—he’s just a cheery guy, looking to spread good vibes. Or more cynically: He’s looking to shift attention away from his legacy of shady business practices and other scandals.

My own pet theory, though, is that it’s not personal. It’s class loyalty. Imagine a (hypothetical) guy who’s worth hundreds of billions, and who cares more about the preservation of the political and economic system that allows the hoarding of such wealth than he does about starving orphans. (Though he may also care about starving orphans.)

For this guy, a couple million dollars’ investment in a “data-driven” PR operation on behalf of that political and economic system might make perfect sense. It’s a decision on the level of you or me adding an “extended protection plan” to our next phone purchase. For 1/100,000th of your net worth? Why not.

Just a pet theory, mind you. But it does have explanatory power over similar initiatives from, say, the Charles Koch-founded Cato Institute.

One thing I’ll say with confidence: We tell ourselves, and each other, stories. They’re not necessarily true. So it’s always worth asking why someone’s telling the story they’re telling, in the way they’re telling it, to the audience they’ve chosen.

  1. If you want to read a whole book on this topic, How to Lie with Statistics is short and surprisingly entertaining. 

  2. The foundation has made at least one substantial grant to OWID, though it doesn’t appear on the website’s “How we’re funded” page.