What Are Confidence Intervals in Statistics?

Coin flip — If you flip a coin, 10 tails in a row might be quite unlikely. But after 10 tosses, the probability of getting tails on the next flip will still be 50 percent. Monty Rakusen/Getty Images

Statistics is built on mathematics, and on probability theory in particular. The point of statistics is to describe processes you can observe out in the world, such as the height of oak trees or the likelihood that a vaccine will fend off disease, without having to measure every oak tree in the world or vaccinate every person before deciding how effective the vaccine is.

Because statistics works from samples, and samples involve chance, we have to accept that our picture of whatever process we're measuring will never be complete.

Why Use Statistics?

Suppose you flip a coin four times and get three heads and one tail. Taken at face value, that result suggests the probability of getting heads is 75 percent, when the real probability for a fair coin is 50 percent (odds of 1:1). If we did 40 coin flips instead, we would very likely get much closer to a 1:1 ratio of heads to tails, and a statistical analysis would reflect this.
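To see why more flips help, here is a minimal Python sketch that works out exact probabilities for a fair coin. The function name `prob_heads_between` and the cutoffs chosen (40 to 60 percent heads, and 75 percent heads or more) are illustrative assumptions, not something from the original discussion.

```python
from math import comb

def prob_heads_between(flips, min_heads, max_heads, p=0.5):
    """Exact chance of seeing between min_heads and max_heads (inclusive)
    heads in `flips` tosses of a coin that lands heads with probability p."""
    return sum(
        comb(flips, k) * p**k * (1 - p) ** (flips - k)
        for k in range(min_heads, max_heads + 1)
    )

# Four flips: a 40-60 percent share of heads can only mean exactly 2 heads.
near_half_4 = prob_heads_between(4, 2, 2)
# Forty flips: a 40-60 percent share of heads means 16 to 24 heads.
near_half_40 = prob_heads_between(40, 16, 24)

# Chance of a lopsided result: 75 percent heads or more.
lopsided_4 = prob_heads_between(4, 3, 4)
lopsided_40 = prob_heads_between(40, 30, 40)

print(f"4 flips:  near 50-50 {near_half_4:.3f}, 75%+ heads {lopsided_4:.4f}")
print(f"40 flips: near 50-50 {near_half_40:.3f}, 75%+ heads {lopsided_40:.4f}")
```

With four flips, a result of 75 percent heads or more is quite common. With 40 flips it becomes rare, though never impossible.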

"Much of statistics has to do with reasoning from a sample — the actual observations — to characteristics of the population — all possible observations," says John Drake, a research professor in the Center for the Ecology of Infectious Diseases at the University of Georgia, in an email. "For instance, we might be interested in the height of oak trees. We can't measure all oak trees in the world, but we can measure some. We can calculate the average height of oak trees in the sample, but this won't necessarily be the same as the average of all oak trees."

Confidence Intervals

Because we can't measure all the world's oak trees, statisticians come up with an estimated range of heights based on probability and all the data at their disposal. This range is called a confidence interval and it consists of two numbers: one that is probably smaller than the true value and one that is probably larger. The true value is probably somewhere between.
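As a rough illustration of how those two numbers might be computed, the Python sketch below builds a 95 percent interval for an average tree height. Everything specific here is an assumption made for the example: the 50 heights are simulated rather than measured, the population values of 20 meters and 4 meters are invented, and the interval uses a normal approximation, which is one common textbook method and reasonable only for samples that are not too small.

```python
import random
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for a population mean."""
    n = len(sample)
    # Critical value: about 1.96 for a 95 percent interval.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * stdev(sample) / n ** 0.5
    centre = mean(sample)
    return centre - margin, centre + margin

# Hypothetical data: 50 simulated oak heights in meters.
# The mean (20.0) and spread (4.0) are invented for illustration.
rng = random.Random(42)
heights = [rng.gauss(20.0, 4.0) for _ in range(50)]

low, high = mean_confidence_interval(heights)
print(f"sample mean: {mean(heights):.2f} m")
print(f"95% confidence interval: {low:.2f} m to {high:.2f} m")
```

The lower number is the sample average minus a margin of error, and the upper number is the sample average plus the same margin.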

"A '95 percent confidence interval' means that 95 out of 100 times that the confidence interval is constructed this way, the interval will include the true value," says Drake. "If we measured samples of oak trees 100 times, the confidence interval based on the data collected in 95 of those experiments would include the population mean, or the average height of all oak trees. Thus, a confidence interval is a measure of the precision of an estimate. The estimate gets more and more precise as you collect more data. This is why the confidence intervals get smaller as more data becomes available."

So, a confidence interval helps show how good or bad the estimate is. When we flip a coin just four times, our estimate of 75 percent has a wide confidence interval because our sample size is very small. Our estimate with 40 coin flips would have a much narrower confidence interval.
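The difference in width can be sketched in Python. This example uses the Wilson score interval, one of several standard ways to build an interval for a proportion, and a swapped-in choice here rather than a method named in the text. The count of 22 heads in 40 flips is an invented result used only for comparison.

```python
from math import sqrt
from statistics import NormalDist

def wilson_interval(heads, flips, confidence=0.95):
    """Wilson score confidence interval for the probability of heads."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95 percent
    p_hat = heads / flips                           # observed share of heads
    denom = 1 + z**2 / flips
    centre = (p_hat + z**2 / (2 * flips)) / denom
    half_width = (
        z * sqrt(p_hat * (1 - p_hat) / flips + z**2 / (4 * flips**2)) / denom
    )
    return centre - half_width, centre + half_width

low_4, high_4 = wilson_interval(3, 4)        # three heads in four flips
low_40, high_40 = wilson_interval(22, 40)    # hypothetical: 22 heads in 40 flips

print(f" 4 flips: {low_4:.2f} to {high_4:.2f} (width {high_4 - low_4:.2f})")
print(f"40 flips: {low_40:.2f} to {high_40:.2f} (width {high_40 - low_40:.2f})")
```

Under these assumptions, the four-flip interval runs from roughly 30 percent to roughly 95 percent, which is so wide that it says very little about the coin. The 40-flip interval is less than half as wide.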

The formal meaning of a confidence interval has to do with repeating an experiment over and over. Each repetition produces its own estimate and its own interval. For the coin flips, a 95 percent confidence interval means that if we repeated the whole experiment 100 times and built an interval each time, about 95 of those intervals would contain the true probability of getting heads.
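The repeated-experiment idea can be checked with a small simulation. The sketch below is written under stated assumptions: a fair coin, 40 flips per experiment, 10,000 repetitions, and the Wilson score interval as the construction method (one standard option, not one specified in the text). It counts how often the interval built from each experiment contains the true probability of 0.5.

```python
import random
from math import sqrt
from statistics import NormalDist

def wilson_interval(heads, flips, confidence=0.95):
    """Wilson score confidence interval for the probability of heads."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p_hat = heads / flips
    denom = 1 + z**2 / flips
    centre = (p_hat + z**2 / (2 * flips)) / denom
    half_width = (
        z * sqrt(p_hat * (1 - p_hat) / flips + z**2 / (4 * flips**2)) / denom
    )
    return centre - half_width, centre + half_width

def coverage(true_p, flips, experiments, seed=0):
    """Share of simulated experiments whose interval contains true_p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(experiments):
        # One experiment: flip the coin `flips` times and count heads.
        heads = sum(rng.random() < true_p for _ in range(flips))
        low, high = wilson_interval(heads, flips)
        if low <= true_p <= high:
            hits += 1
    return hits / experiments

share = coverage(true_p=0.5, flips=40, experiments=10_000)
print(f"intervals containing the true value: {share:.1%}")
```

The share should land close to 95 percent. It will not be exactly 95 percent, partly because of random variation and partly because the number of heads can only be a whole number.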

The Limits of Statistics

There are limits to statistics. You have to design a good study — statistics can't tell you anything you didn't ask.

Say you're studying the efficacy of a vaccine, but you didn't include children in your study. You can come up with a confidence interval based on the data you collected, but it won't tell you anything about how well the vaccine protects children.

"In addition to having enough data, the sample also needs to be representative," says Drake. "Usually, this means having a random sample or a stratified random sample. Assuming the 1,000 participants in your hypothetical vaccine trial are representative of the population, then it is reasonable to conclude that the true efficacy of the vaccine is within the reported confidence interval. If the sample is not representative — if it doesn't include children — then there is no statistical basis for drawing conclusions about the unrepresented part of the population."