# How To Lie With Statistics

We are constantly being bombarded with statistics: “9 out of 10 doctors agree”, “gets your clothes 40% whiter”, “polls indicate voters will approve by a 3-2 margin”, and so on. Every advertiser, politician, spin doctor and pressure group seems to have reams of statistics to help sell their product or position. We, as consumers and citizens, make many of our decisions on what to buy or how to vote, based on the information we get from these sources. But do we really know how to evaluate all the data we’re fed?

For the most part, sadly, no. Though it has a heavy impact on everyone’s lives, information about statistics is rarely taught anywhere outside of college, and even there it’s mainly reserved for students of the hard sciences. If I were a conspiracy theorist, I’d say this was purposeful. It certainly makes the job of the spin doctor easier when people are ignorant about the tools of his trade.

So, what are statistics and how are they used?

Statistics are a mathematical tool designed to extrapolate information about a large population from a small sample of it. For example, if you wanted to find out how many Americans eat spaghetti, it would be very expensive and time-consuming to ask all of them. Instead, you could ask a small sample of them and generalize from their answers to the whole population. It’s also reasonable to assume that the larger the sample you take, the more confidence you can have that the statistic you’ve generated is accurate. This confidence can also be represented mathematically. When you see something like “accurate to within plus or minus 3%” after a statistic, that is a mathematical expression of the confidence the statistician has in it, based mainly on the size of the sample. As long as the population is much larger than the sample, the size of the population itself matters surprisingly little.
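To make the “plus or minus” figure concrete, here is a minimal sketch in Python of the usual textbook approximation for a yes/no question. It assumes a simple random sample, a population far larger than the sample, and a 95% confidence level (the `z = 1.96` value). The function name `margin_of_error` is my own, not something taken from a polling organization.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate margin of error for a proportion from a simple random sample.

    Assumes the population is much larger than the sample, which is why
    the population size does not appear in the formula. proportion=0.5 is
    the worst case and gives the widest margin.
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

# Larger samples give tighter margins, but with diminishing returns.
for n in (100, 400, 1067, 2500):
    print(f"sample of {n}: plus or minus {margin_of_error(n) * 100:.1f}%")
```

Under these assumptions, a sample of about 1,067 people is enough for the familiar “plus or minus 3%”, while quadrupling a sample from 100 to 400 only halves the margin, from roughly 9.8% to 4.9%.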

Since mathematics is an exact science, you’d think it would be hard to fudge the results. In fact, it’s ridiculously easy. Computer programmers have an appropriate expression: “GIGO”. It stands for “Garbage In, Garbage Out”. A sample that’s not representative of the population you’re looking at will skew the data quite nicely. In the case of the spaghetti question, for instance, if the sample is top-heavy with Italian-Americans, you’re likely to get a very different result than if it’s top-heavy with Eskimos or in proportion with the actual percentage of the population represented by both groups. This kind of error can creep in by accident as well as on purpose.
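The effect of an unrepresentative sample is easy to simulate. The sketch below uses entirely made-up numbers: a hypothetical `group_a` that is 10% of the population and mostly eats spaghetti, and a hypothetical `group_b` that makes up the rest and is split evenly. It then compares a poll that samples the groups in proportion with one that is top-heavy with `group_a`.

```python
import random

# Hypothetical figures, invented purely for illustration.
GROUPS = {
    "group_a": {"share": 0.10, "eats_spaghetti": 0.90},
    "group_b": {"share": 0.90, "eats_spaghetti": 0.50},
}

def true_rate():
    """Actual fraction of the whole population that eats spaghetti."""
    return sum(g["share"] * g["eats_spaghetti"] for g in GROUPS.values())

def poll(sample_size, share_of_group_a, rng):
    """Fraction of 'yes' answers when group_a makes up the given share of the sample."""
    yes = 0
    for _ in range(sample_size):
        group = "group_a" if rng.random() < share_of_group_a else "group_b"
        if rng.random() < GROUPS[group]["eats_spaghetti"]:
            yes += 1
    return yes / sample_size

rng = random.Random(42)  # fixed seed so the run is repeatable
representative = poll(2000, 0.10, rng)  # matches the real population mix
skewed = poll(2000, 0.60, rng)          # top-heavy with group_a

print(f"true rate:             {true_rate():.2f}")
print(f"representative sample: {representative:.2f}")
print(f"skewed sample:         {skewed:.2f}")
```

Both polls ask the same number of people, so both would be reported with the same margin of error. Only the representative one should land near the true rate; the arithmetic is flawless in both cases, and the garbage went in before any math was done.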

The way a question is presented can alter results too. If a pollster asks “You don’t eat that awful spaghetti stuff, do you?”, he’ll often get a different answer than if he had phrased it neutrally. More subtly, he can ask the question neutrally, but only after leading up to it with, say, other questions that give the impression that tomato sauce causes heartburn.

Even with these and other methods of skewing results available, statisticians and their employers don’t even need to lie. Public ignorance of what statistics do and do not measure is a much better weapon in their arsenal than an outright fib.

So far, we’ve just been examining polling results. Statistics are also great tools for figuring out correlation, that is to say, how closely two different things vary together. Cause and effect are not implied, just the existence and degree of the relationship. Of course, that never stops people from drawing a cause-and-effect conclusion, and that’s where the danger lies.

Here’s an example drawn from a 1997 Federal Bureau of Prisons analysis* that’s guaranteed to raise a lot of hackles.

Most polls show that, in America, non-believers (atheists, agnostics, and other non-religious people) represent about 10% of the population. However, in Federal prisons, they represent only about two-tenths of 1% of the inmate population. Does this mean that non-believers are more moral than believers? Not exactly. All that’s shown here is that these statistics exist in a certain relationship to each other. But correlation is not equivalent to causation. One thing has not been demonstrated to be the cause of the other. There are other factors that are equally good or better predictors of who goes or doesn’t go to prison and who believes or doesn’t believe. Educational and economic status also show high correlations with both populations. But does wealth create atheists and poverty, believers? Does a college education create morality and a grade school one, criminals? Such conclusions sound pretty silly, but if you don’t know how statistics work, how do you argue against them? This is a case where ignorance is definitely not bliss.
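A small simulation shows how two things can be clearly correlated without either causing the other. This is a sketch with invented data, not a model of the prison figures: a hypothetical `hidden_factor` feeds into two traits, `trait_a` and `trait_b`, which never influence each other directly.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Each trait is the hidden factor plus its own independent noise.
hidden_factor = [rng.gauss(0, 1) for _ in range(20000)]
trait_a = [h + rng.gauss(0, 1) for h in hidden_factor]
trait_b = [h + rng.gauss(0, 1) for h in hidden_factor]

overall = pearson(trait_a, trait_b)

# Compare only individuals whose hidden factor is nearly identical.
band = [i for i, h in enumerate(hidden_factor) if abs(h) < 0.1]
within_band = pearson([trait_a[i] for i in band], [trait_b[i] for i in band])

print(f"correlation across everyone:         {overall:.2f}")
print(f"correlation with hidden factor held: {within_band:.2f}")
```

Across the whole simulated population the two traits should correlate at around 0.5, yet among individuals with nearly the same hidden factor the relationship all but disappears. The correlation was real; the cause was somewhere else.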

Part of being a good citizen and a smart consumer is to have the knowledge of how to judge the validity of all the information that’s thrown your way by those who want to manipulate your buying habits or your vote. How do you think for yourself if you lack that knowledge? Learning how statistics work is part of it. Learning the tools of logic and rational thinking will help with the rest. With those in hand, you can cut those puppet strings and start thinking for yourself.