## Econometric confusions

31 May, 2016 at 10:43 | Posted in Statistics & Econometrics | 11 Comments

In a recent issue of Real World Economics Review there was a rather interesting, if somewhat dense, article by Judea Pearl and Bryant Chen entitled Regression and Causation: A Critical Examination of Six Econometrics Textbooks …

The paper appears to turn on a single dichotomy. The authors point out that there is a substantial difference between what they refer to as the “conditional-based expectation” and “interventionist-based expectation”. The first is given the notation:

E[Y|X]

While the second is given the notation:

E[Y|do(X)]

The difference between these two relationships is enormous. The first notation — that is, the “conditional-based expectation” — basically means that the value Y is statistically dependent on the value X …

The second notation — that is, the “interventionist-based expectation” — refers to something else entirely. It means that the value Y is causally dependent on the value X …

Now, if we simply go out and take a statistical measure of earnings and expected performance we will find a certain relationship — this will be the conditional-based expectation and it will be purely a statistical relationship.

If, however, we take a group of employees and raise their earnings, X, by a given amount will we see the same increase in performance, Y, as we would expect from a study of the past statistics? Obviously not. This example, of course, is the interventionist-based expectation and is indicative of a causal relationship between the variables …
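The gap between the two expectations is easy to demonstrate with a small simulation (a toy model assumed here purely for illustration, not taken from the article): let an unobserved confounder, "ability", drive both earnings X and performance Y. The observational slope, E[Y|X], then overstates what an intervention on X, E[Y|do(X)], would actually deliver:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical data-generating process: unobserved "ability"
# raises both earnings (X) and performance (Y).
ability = rng.normal(size=n)
x_obs = ability + rng.normal(size=n)                # earnings
y_obs = 0.2 * x_obs + ability + rng.normal(size=n)  # performance

# E[Y|X]: slope from passively observed data, inflated by the confounder.
slope_obs = np.polyfit(x_obs, y_obs, 1)[0]

# E[Y|do(X)]: set X by intervention, severing its link to ability.
x_do = rng.normal(size=n)
y_do = 0.2 * x_do + ability + rng.normal(size=n)
slope_do = np.polyfit(x_do, y_do, 1)[0]

print(round(slope_obs, 1))  # ≈ 0.7: the statistical relationship
print(round(slope_do, 1))   # ≈ 0.2: the causal effect
```

Raising employees' earnings and expecting the observational slope to materialise would disappoint: only the interventional slope answers the policy question.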

In economics we are mainly interested in causal rather than statistical relationships. If we want to estimate, for example, the multiplier, it is from a causal rather than a statistical point-of-view. Yet the training that many students receive leads to confusion in this regard. Indeed, we may go one further and ask whether such a confusion also sits in the mind of the textbook writers themselves.

This confusion between statistical relationships and causal ones has long been a problem in econometrics. Keynes, for example, writing his criticism of the econometric method in his seminal paper Professor Tinbergen’s Method noted that Tinbergen had made precisely this error …

The question then arises: why, after over 70 years, are econometrics textbooks engaged in the same oversights and vaguenesses as some of the pioneering studies in the field? I think there is a simple explanation for this. Namely, that if econometricians were to be clear about the distinction between statistical and causal relations it would become obvious rather quickly that the discipline holds far less worth for economists than it is currently thought to possess.

Philip Pilkington

For my own take on the issues raised by Pearl and Chen see here.

## 11 Comments »

1. I’m not fond of these strawman attacks on economics. First, academic economists are probably the only academics with a solid grasp of the difference between a conditional correlation, E[Y|X], and a causal effect. Second, it is considered the single most important issue to address in any applied work (see the latest issue of any respectable economics journal). Third, it is probably true that introductory textbooks first teach what conditional correlations are and what they mean, and then move on to causal inference. The reason is simple: Knowledge of the former is required for an understanding of the latter.
.
See for instance: https://en.wikipedia.org/wiki/Instrumental_variable

• Re ‘straw man attacks’ it would perhaps be interesting to know this:
Honors and awards to Judea Pearl:
RCA Laboratories Achievement Award (1963); NATO Senior Fellowship in Science (1974); Pattern Recognition Society Award for an Outstanding Contribution (1978); Fellow, IEEE (1988); Fellow, American Association of Artificial Intelligence (1990); Named “The Most Published Scientist in the Artificial Intelligence Journal,” (1991); Member, National Academy of Engineering (1995); UCLA Faculty Research Lecturer of the Year (1996); IJCAI Research Excellence Award (1999); AAAI Classic Paper Award (2000); Lakatos Award, London School of Economics and Political Science (2001); Corresponding Member, Spanish Academy of Engineering (2002); Pekeris Memorial Lecture (2003); ACM Allen Newell Award (2003); Purpose Prize (2006); Honorary Doctorate, University of Toronto (2007); Honorary Doctorate, Chapman University (2008); Benjamin Franklin Medal in Computers and Cognitive Science (2008); Festschrift and Symposium in honor of Judea Pearl (2010); Rumelhart Prize Symposium in honor of Judea Pearl (2011); David E. Rumelhart Prize (2011); IEEE Intelligent Systems’ AI Hall of Fame (2011); ACM Turing Award (2011); Harvey Prize (2012); elected to National Academy of Sciences (2014).

• I’m happy for Prof. Pearl and all his achievements, but this is not what the issue is about. Nor should listing achievements be considered even an attempt at a real argument.
.
In any case, Pearl sets out 11 criteria and then evaluates whether or not 6 textbooks provide “ideal answers” — answers that adhere to the notion of causality held by, amongst others, Heckman and Leamer (NB. Heckman and Leamer are prominent economists). The answer is that no textbook provides ideal answers to all 11 criteria. In particular, they fail primarily on the criteria of whether or not they conform to Pearl’s own notation of causality, i.e. E[Y|do(X)] (criteria 10 and 11).
.
To draw from this the conclusion that economists intentionally or unintentionally (the article is very confusing on that point) conceal the difference between (conditional) correlation and causality is a stretch, to say the least.
.
Let’s not forget that this is, at bottom, an issue about correlation not being causation. I do believe that economists are the social scientists that are the least confused by this topic. This can also be seen in the (very) recent adoption by other social sciences of economists’ methods of teasing out causality from observational data.

2. First, academic economists are probably the only academics with a solid grasp of the difference between a conditional correlation, E[Y|X], and a causal effect.

I do believe that economists are the social scientists that are the least confused by this topic.

You people are so freaking pretentious.
.
Given that philosophers have been debating what causality is for a couple of millennia, and it is still one of the most fiercely contested topics around, I’m fascinated to hear that you folks have figured it all out.
.
What exactly is the decisive contribution of mainstream economics to these centuries’ worth of debate? “Granger causality”? “Superexogeneity”? Oh: “instrumental variables”?!
.
If economists know what causality is, perhaps they can let the rest of us in on the secret. Pearl’s own conception (which mostly relies on others’ work) is highly contested.

The reason is simple: Knowledge of the former is required for an understanding of the latter.

Says you.

• I’m of course talking about applied work, and not philosophical theorizing. And so are the textbooks. Let’s not move the goalpost.
.
The contributions of economics to applied work on causality are things like: Instrumental variables/2SLS/GMM, regression discontinuity design, differences-in-differences, long-run restrictions, synthetic controls, etc.
.
But yeah, I think instrumental variables was a major breakthrough.
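For concreteness, here is a minimal sketch of the instrumental-variables idea (a toy two-stage least squares on simulated data; the data-generating process and all coefficients are assumed purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Hypothetical setup: u confounds x and y; z is a valid instrument,
# i.e. it moves x but affects y only through x.
u = rng.normal(size=n)
z = rng.normal(size=n)
x = 0.8 * z + u + rng.normal(size=n)
y = 0.5 * x + u + rng.normal(size=n)  # true causal effect of x on y: 0.5

# Naive OLS slope: biased upward by the confounder u.
ols = np.cov(x, y)[0, 1] / np.var(x)

# Two-stage least squares: regress x on z, then y on the fitted values.
b1 = np.cov(z, x)[0, 1] / np.var(z)          # first stage
x_hat = b1 * z                               # exogenous part of x
iv = np.cov(x_hat, y)[0, 1] / np.var(x_hat)  # second stage

print(round(ols, 1))  # ≈ 0.9: biased
print(round(iv, 1))   # ≈ 0.5: recovers the causal effect
```

Note that the simulation makes z valid by construction; whether such exclusion and invariance assumptions hold in real data is exactly what is contested below.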

• Instrumental variables techniques presuppose, as do all other econometric techniques, heaps of invariance assumptions to draw causal conclusions out of data. And as mathematical statistician David Freedman writes in “Statistical Models — theory and practice”:
“Invariance assumptions need to be made in order to draw causal conclusions from non-experimental data: parameters are invariant to interventions, and so are errors or their distributions. Exogeneity is another concern. In a real example, as opposed to a hypothetical, real questions would have to be asked about these assumptions. Why are the equations “structural,” in the sense that the required invariance assumptions hold true? Applied papers seldom address such assumptions, or the narrower statistical assumptions: for instance, why are errors IID?

“The tension here is worth considering. We want to use regression to draw causal inferences from non-experimental data. To do that, we need to know that certain parameters and certain distributions would remain invariant if we were to intervene. Invariance can seldom be demonstrated experimentally. If it could, we probably wouldn’t be discussing invariance assumptions. What then is the source of the knowledge?

“‘Economic theory’ seems like a natural answer, but an incomplete one. Theory has to be anchored in reality. Sooner or later, invariance needs empirical demonstration, which is easier said than done.”

Conclusion: ‘Instrumental variables was a major breakthrough’? No!

• Well, the article you quoted claims that economists confuse correlation with causation in applied studies. I would claim they don’t, and I gave you a list of various methods that economists deploy precisely in order to differentiate between correlation and causation. Your response? IV relies on too many assumptions. Fine. But that means that you either do not find applied work on causal inference relevant, or that you know of some alternative method that is more appropriate than IV.
.
It would be interesting to know which one it is. In either case, it’s not relevant to whether or not economists understand the difference between correlation and causation (and, by the way, errors do not have to be iid).

• Move the goalpost? How exactly do you do “applied work” when you don’t know what the heck you’re “applying” and what you’re “applying” it to? There’s a reason why philosophers do their “theorising”. And “applied workers” do philosophical “theorising” too, only they don’t tend to bother to do it explicitly or even recognise that they’re doing it. Causation is actually an excellent case in point.
.
You claim that economists have the best understanding of what the difference is between a conditional correlation and a “causal effect”. That presupposes that you know what a “causal effect” is. (And that “instrumental variables” approaches capture “it”. And that “causal effects” are in fact a single, homogeneous category of phenomenon. And so on.) You talk about introductory textbooks starting with conditional correlations and moving on to “causal inference”. Causal inference is of course an equally massively contentious subject, not of course unrelated to the debates over what causation is in the first place, but such blithe references make it seem that everybody (certainly the textbook writers!) already knows what it is, how one does it and what makes it good or reliable or whatever.
.
Every method of causal inference makes assumptions about what causality is, what forms it takes, what symptoms it produces, how you can tell when you meet it in a dark alley, and so on. Methodological techniques like “instrumental variables” approaches are already known to make strong statistical assumptions about the phenomena being studied (assumptions that are typically just … assumed to be fulfilled, and many of which can’t be “tested” by purely statistical means). But many of the most important assumptions about causation that these techniques rely on are not even recognised in the first place.
.
And the assumption that knowledge of statistical correlation is a necessary condition of causal knowledge is just one very widespread example.

• Is this yet another postmodernist with the usual claims that “since we can’t know anything with 100% certainty, we can’t know anything at all!”? That’s pretty boring and just lazy.
.
In any case, economists share the notion of causality with most other sciences. E.g. an RCT establishes causality within the domain it is applied to: you have treatment and control groups, and the causal effect is the difference between the two. I am not sure which actual science would disagree with this notion. Perhaps you can enlighten me?
.
“And the assumption that knowledge of statistical correlation is a necessary condition of causal knowledge is just one very widespread example.”
.
Knowledge of estimating a conditional correlation is a prerequisite for the knowledge of estimating a causal relationship. You know, when we use actual data and are interested in actual answers. Or maybe you don’t know.
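The difference-in-means notion of a causal effect invoked above can be sketched in a few lines (toy data, assumed purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000

# Hypothetical population with heterogeneous baseline outcomes.
baseline = rng.normal(size=n)
treated = rng.random(n) < 0.5        # random assignment
outcome = baseline + 1.5 * treated   # true treatment effect: 1.5

# Randomisation makes the two groups comparable, so the causal
# effect is estimated by the difference between group means.
effect = outcome[treated].mean() - outcome[~treated].mean()
print(round(effect, 1))  # ≈ 1.5
```

It is the randomisation, not the arithmetic, that licenses the causal reading of the difference in means.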

3. Beware!
Pearl & Chen, Prof. Syll, their ilk and their acolytes, are trying to tempt us towards the outer darkness of negativity, nihilism and ignorance. The doctrines of these false prophets deny innate elements of human intelligence which evolved over millennia.
The Bible warns us:
“A person who has doubts is like a wave that is blown by the wind and tossed by the sea” (James 1:6-8).
“The fearful and unbelieving shall have their part in the lake which burneth with fire and brimstone” (Revelation 21:8).

Beware also the seductive heresies of econometricians. These proselytise salvation through good works, e.g. mind flagellation such as that applauded by commentator Pontus above: “Instrumental variables/2SLS/GMM, regression discontinuity design, differences-in-differences, long-run restrictions, synthetic controls, etc.”
Statistical alchemy is powerless against Satan and his demons, which lurk in the crevasses of every set of worldly data.
Again the Bible gives us clear warnings:
“The human mind is the most deceitful of all things. It is incurable. No one can understand how deceitful it is” (Jeremiah 17:9).
“They make a noise like a dog…they belch out with their mouth: swords are in their lips…but thou, O LORD, shalt laugh at them” (Psalm 59:5-8).
“Let the mischief of their own lips cover them. Let burning coals fall upon them: let them be cast into the fire; into deep pits, that they rise not up again” (Psalm 140:9-10 KJV).

Rejoice!! Salvation is at hand.
The Gospel of common sense is innate within most humans and other animals.
We are able to make estimates despite only imperfect data!
“The ultimate logic, or psychology, of these deliberations is obscure, a part of the scientifically unfathomable mystery of life and mind. We must simply fall back upon a “capacity” in the intelligent animal to form more or less correct judgments about things, an intuitive sense of values. We are so built that what seems to us reasonable is likely to be confirmed by experience, or we could not live in the world at all.” – Frank Knight: “Risk, Uncertainty and Profit”, 1921.
http://www.econlib.org/library/Knight/knRUP6.html#Pt.III,Ch.VII

Relatively simple reasoning and graphs are often an effective way of gaining understanding and forming estimates.
“Salt is good: but if the salt have lost his saltiness, wherewith will ye season it? Have salt in yourselves” (Mark 9:50 KJV, Matthew 5:13, Luke 14:34).

4. Oh, “postmodernist” … good one. When you have nothing to say, throw anything at the wall and hope it sticks. And invent things and attribute them to your interlocutor so you can then call down fire and brimstone (cf. Mr. Kingsley Lewis) on the figments of your own imagination. This is a pattern with you, BTW.

“since we can’t know anything with 100% certainty, we can’t know anything at all!”? That’s pretty boring and just lazy.

Please go read what I wrote and show me where I said this. I’m happy to engage the “uncertainty” debate if you wish, but it’s a sideshow here.
.
If you share the notion of causality with so many others, surely you can tell us what causality is.
.
Back to RCT, are we?
.
RCT claims to identify and measure causal effects. It doesn’t tell us what a causal effect is. It certainly doesn’t present any justification for assuming that the kinds of causal effects that it can successfully identify and measure are the only kinds of causal effects there are — that they are exhaustive of causality.
.
Of course, with RCTs there exists a well-developed conceptual and experimental apparatus, including “treatment” and “control groups”, “intervention”/”treatment”, etc. Transferring these causal and experimental concepts, and their associated licensed inferences, directly and unproblematically to supposedly analogous “observational studies” seems to be a pretty major leap. So too, hence, with the RCT concept of “causal effect”.
.
And of course, strictly speaking, RCT doesn’t establish causality, even on its own terms, and even in “the domain it is applied to”, because it can’t guarantee the elimination of statistical bias. Randomisation simply says that if the relevant assumptions are met by the phenomena under study, the chances are high that bias will be reduced. When and wherever clinical medicine relies only on RCTs to advance causal claims, I’d say, it’s on extremely shaky grounds. The most reliable forms of causal understanding in medicine (or any other science) are never based on statistical evidence alone. Try reading up, for instance, on how John Snow traced the causes and transmission mechanisms of cholera in London in the 1850s.
.
And, as in the rest of statistical causal inference, some of the assumptions required by RCT to make causal claims are huge assumptions indeed.

I am not sure which actual science would disagree with this notion. Perhaps you can enlighten me?

Doubt it is a “science”, but a discipline that indisputably deals with causality and causal inferences and attributions all the time is history. And somehow the majority of historians constantly make singular causal claims without the help of correlations, instrumental variables, RCTs or statistical inference. Which might be taken by some to suggest that there might be importantly different forms of causation, and extremely different ways of successfully identifying causation at work. But naturally, historians aren’t as rigorous as you economists.

Knowledge of estimating a conditional correlation is a prerequisite for the knowledge of estimating a causal relationship.

Again, you heap on more grist for the mill. Are you saying singular causation doesn’t exist? Are you saying that we can’t possibly know whether a singular causal claim is correct (well-established, etc.) unless we have conducted a statistical test? That we should just ignore all the causal claims of all non-“cliometric” historians because we “just don’t know” until the truly rigorous people have arrived on the scene to give their imprimatur of “science”?

You know, when we use actual data and are interested in actual answers. Or maybe you don’t know.

Unlike those foolish historians who spend years in archives and such. Do you really want to go here? Shall we get into the quality of the “actual data”, the inferences and the “actual answers” — you know, the actual record of success — of econometrics?