Correlation does not always even imply correlation

11 Aug, 2014 at 07:48 | Posted in Statistics & Econometrics | 6 Comments

If people start sending you random pairs of variables that happen to be highly correlated, sure, there might well be a connection between them; for example, kids’ scores on math tests and language tests are correlated, and this tells us something. But if someone is looking for a particular pattern, and then selects two variables that are correlated, that’s another story. The great thing about causal identification is that it’s valid even if you’re looking to find a pattern. (Not completely: there’s p-hacking, and you can also run 100 experiments and only report the best one, etc., but that’s still less of an issue than the fact that pure correlation does not logically tell you anything about causation.) To put it another way, returning to Noah’s tweet: correlation is surely correlated with causation in an aggregate sense, but if you take the subset of correlations that a particular motivated researcher is looking for, then maybe not …

The expression “correlation does not imply causation” is popular, and I think it’s popular for a reason, that it does capture a truth about the world …

People see enough random correlations that they can pick them out and interpret them how they like.

So if I had to put something on a bumper sticker (or a tweet), it would be:

“Correlation does not even imply correlation”

That is, correlation in the data you happen to have (even if it happens to be “statistically significant”) does not necessarily imply correlation in the population of interest.

Andrew Gelman
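
Gelman’s bumper sticker is easy to illustrate with a small simulation. The sketch below is my own illustration, not anything from Gelman’s post: the function names `pearson` and `hunt_for_correlation`, the sample sizes, and the seed are all assumptions chosen for the example. It generates variables that are independent by construction (so every population correlation is exactly zero), plays the motivated researcher by picking the most strongly correlated pair in a small sample, and then draws a large fresh sample from that same population.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def hunt_for_correlation(n_vars=100, n_obs=20, n_fresh=2000, seed=0):
    """Pick the best-looking pair among independent variables, then re-test it.

    All values here are illustrative assumptions, not taken from the post.
    Returns (best_r, best_pair, fresh_r).
    """
    rng = random.Random(seed)

    # Every variable is independent standard normal noise, so the
    # population correlation of every pair is zero.
    data = [[rng.gauss(0, 1) for _ in range(n_obs)] for _ in range(n_vars)]

    # The "motivated researcher" step: search all pairs, keep the strongest.
    best_r, best_pair = 0.0, None
    for i in range(n_vars):
        for j in range(i + 1, n_vars):
            r = pearson(data[i], data[j])
            if abs(r) > abs(best_r):
                best_r, best_pair = r, (i, j)

    # A large fresh sample from the same population (two independent
    # standard normals): the selected "finding" should not survive.
    fresh_x = [rng.gauss(0, 1) for _ in range(n_fresh)]
    fresh_y = [rng.gauss(0, 1) for _ in range(n_fresh)]
    fresh_r = pearson(fresh_x, fresh_y)

    return best_r, best_pair, fresh_r


if __name__ == "__main__":
    best_r, best_pair, fresh_r = hunt_for_correlation()
    print(f"selected pair {best_pair}: r = {best_r:+.2f} in the small sample")
    print(f"same population, fresh data: r = {fresh_r:+.2f}")
```

With 100 variables there are 4,950 pairs to choose from, so a striking correlation in 20 observations is close to guaranteed even though nothing is related to anything. The selected pair would also look “statistically significant” if tested on its own, which is exactly the point: the correlation in the data you happen to have says little about the population once the pair was chosen for being correlated.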


  1. Correlations are only good for hypothesis formation, otherwise spurious.

  2. My comment:

    I was going to write something… But I’ll just quote the Wiki article…

    “Ordinarily, regressions reflect “mere” correlations, but Clive Granger argued that causality in economics could be reflected by measuring the ability of predicting the future values of a time series using past values of another time series. Since the question of “true causality” is deeply philosophical, econometricians assert that the Granger test finds only “predictive causality”.”

    Which means that “causation” in econometrics etc. is really just correlation pushed into the future (this is how the test works). That can change at the drop of a hat (hello, Taleb!).

    So, yeah, be careful. Smith is obviously a True Believer in this stuff. But let’s not have slightly more inquisitive minds accepting the premise that we can properly PROVE causation in economics.

  3. “Correlations are not explanations and besides, they can be as spurious as the high correlation in Finland between foxes killed and divorces.”
    Gunnar Myrdal

  4. Another strawman argument from economists who lack understanding of science. Nothing implies causation, nothing at all. Newton put it elegantly by saying that if one looks for a cause there is an infinite regression of causes that lead to an ultimate cause.

    Correlation just implies correlation. Those who like to impress people with their misunderstandings attack the strawman that correlation implies causation and bring up silly examples like the “high correlation in Finland between foxes killed and divorces”.

  5. Nor is absence of evidence evidence of absence.

  6. […] empirical proof is all but impossible. The graphic in the following post basically says it all […]
