10 Correlations That Are Not Causations

Big Data, Little Clarity
Target is just one of many companies combing through big data to boost sales. © John Gress/Corbis

Big data -- the process of looking for patterns in data sets so large they resist traditional methods of analysis -- rates big buzz in the boardroom these days [source: Arthur]. But is bigger always better?

It's a rule that's drummed into most researchers in their first stats class: When encountering a sea of data, resist the urge to go on a fishing expedition. Given enough data, patience and methodological leeway, finding correlations is almost inevitable, even if the practice is ethically dubious and the results are largely useless.
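To see why a fishing expedition almost always lands something, consider a rough simulation. The sketch below is purely illustrative: the function names, the 40 variables, the 20 observations and the 0.5 cutoff are all assumptions chosen for the demonstration, not figures from any study. It fills a table with random noise, so no column has anything to do with any other, and then counts how many pairs still look correlated.

```python
import random
from itertools import combinations


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def count_spurious(n_vars=40, n_obs=20, threshold=0.5, seed=0):
    """Count pairs of independent noise columns with |r| >= threshold.

    All defaults are hypothetical values picked for illustration.
    """
    rng = random.Random(seed)
    # Every column is independent Gaussian noise: no real relationships exist.
    columns = [[rng.gauss(0, 1) for _ in range(n_obs)] for _ in range(n_vars)]
    pairs = list(combinations(range(n_vars), 2))
    hits = [
        (i, j)
        for i, j in pairs
        if abs(pearson_r(columns[i], columns[j])) >= threshold
    ]
    return len(hits), len(pairs)


hits, total = count_spurious()
print(f"{hits} of {total} unrelated pairs have |r| >= 0.5")
```

Forty variables yield 780 possible pairings, and with only 20 observations apiece, some of those pairings will clear the cutoff by chance alone. Test enough pairs and a "finding" is nearly guaranteed.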

After all, a mere correlation between two variables does not imply causation; in many cases, it does not even point to much of a relationship. For one thing, researchers cannot apply statistical measures of correlation willy-nilly: Each carries certain assumptions and limitations that fishing expeditions too often ignore. Add to that the hidden variables, sampling problems and flaws in interpretation that can gum up a poorly designed study.
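One such assumption is easy to demonstrate. The most common measure, Pearson's correlation coefficient, captures only straight-line relationships. In the minimal sketch below (the data are invented for illustration, and the helper function is our own, not taken from any particular library), y is determined entirely by x, yet the coefficient comes out at zero because the relationship is a curve rather than a line.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data: y depends perfectly on x, but not linearly.
xs = list(range(-5, 6))      # -5, -4, ..., 5
ys = [x ** 2 for x in xs]    # a symmetric parabola

r_curve = pearson_r(xs, ys)
print(f"Pearson r for y = x^2: {r_curve:.3f}")

# For contrast, a straight-line relationship on the same x values.
r_line = pearson_r(xs, [3 * x + 1 for x in xs])
print(f"Pearson r for y = 3x + 1: {r_line:.3f}")
```

A researcher who ran only this test would conclude that x and y are unrelated, which is exactly backward. The measure did its job; it was simply asked a question it was never designed to answer.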

Granted, big data has its uses. Inventory control thrives on discovering purchasing patterns, however mysterious their underlying causes. To take a somewhat creepy example, Target has used purchasing patterns to identify pregnant customers and then send them targeted coupons [sources: Duhigg; Hill; Taylor]. So enjoy that rewards card -- and 10 percent off your prenatal vitamins -- but don't expect too much out of big data in the causality department.