Econometrics and the ’empirical turn’ in economics

23 Nov, 2015 at 17:17 | Posted in Economics, Statistics & Econometrics | 7 Comments

The increasing use of natural and quasi-natural experiments in economics over the last couple of decades has led some economists to triumphantly declare it a major step on the road toward empirics: instead of being a deductive philosophy, economics is increasingly becoming an inductive science.

In their plea for this view, the work of Joshua Angrist and Jörn-Steffen Pischke is often invoked, so let's start with one of their more recent books and see if there is any real reason to share the optimism about this ’empirical turn’ in economics.

In their new book, Mastering ‘Metrics: The Path from Cause to Effect, Angrist and Pischke write:

Our first line of attack on the causality problem is a randomized experiment, often called a randomized trial. In a randomized trial, researchers change the causal variables of interest … for a group selected using something like a coin toss. By changing circumstances randomly, we make it highly likely that the variable of interest is unrelated to the many other factors determining the outcomes we want to study. Random assignment isn’t the same as holding everything else fixed, but it has the same effect. Random manipulation makes other things equal hold on average across the groups that did and did not experience manipulation. As we explain … ‘on average’ is usually good enough.

Angrist and Pischke may “dream of the trials we’d like to do” and consider “the notion of an ideal experiment” something that “disciplines our approach to econometric research,” but to maintain that ‘on average’ is “usually good enough” is a claim that in my view is rather unwarranted, for many reasons.

First of all it amounts to nothing but hand waving to simpliciter assume, without argumentation, that it is tenable to treat social agents and relations as homogeneous and interchangeable entities.

Randomization basically allows the econometrician to treat the population as consisting of interchangeable and homogeneous groups (‘treatment’ and ‘control’). The regression models one arrives at by using randomized trials tell us the average effect that variations in variable X have on the outcome variable Y, without having to explicitly control for the effects of other explanatory variables R, S, T, etc. Everything is assumed to be essentially equal except the values taken by variable X.

In a usual regression context one would apply an ordinary least squares estimator (OLS) in trying to get an unbiased and consistent estimate:

Y = α + βX + ε,

where α is a constant intercept, β a constant “structural” causal effect and ε an error term.

The problem here is that although we may get an estimate of the “true” average causal effect, this may “mask” important heterogeneous effects of a causal nature. Although we get the right answer, an average causal effect of 0, those who are “treated” (X = 1) may have causal effects equal to −100 and those “not treated” (X = 0) may have causal effects equal to 100. Contemplating being treated or not, most people would probably be interested in knowing about this underlying heterogeneity and would not consider the OLS average effect particularly enlightening.
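The ±100 example above can be made concrete with a small sketch (all numbers are invented for illustration). With a binary treatment X, the OLS slope in Y = α + βX + ε is just the difference in group means, so the estimate can come out exactly right on average while saying nothing about the individual effects underneath:

```python
# Potential outcomes for 4 units: (Y_i(0) if untreated, Y_i(1) if treated).
# Units 0 and 1 gain +100 from treatment; units 2 and 3 lose 100.
potential = [(0, 100), (0, 100), (100, 0), (100, 0)]

individual_effects = [y1 - y0 for (y0, y1) in potential]
ate = sum(individual_effects) / len(individual_effects)  # average causal effect

# A balanced "random" assignment: one unit of each type gets treated.
treated_idx, control_idx = [0, 2], [1, 3]
y_treated = [potential[i][1] for i in treated_idx]   # observed Y where X = 1
y_control = [potential[i][0] for i in control_idx]   # observed Y where X = 0

# OLS slope for a binary regressor = difference in group means.
beta_hat = sum(y_treated) / len(y_treated) - sum(y_control) / len(y_control)

print(ate)                 # 0.0  -- the "true" average causal effect
print(beta_hat)            # 0.0  -- OLS recovers it correctly ...
print(individual_effects)  # [100, 100, -100, -100] -- ... but masks this
```

The estimator is doing exactly what it promises; the point is that what it promises (an average) is not what a person deciding whether to be treated would want to know.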

Limiting model assumptions in economic science always have to be closely examined. If we are to show that the mechanisms or causes we isolate and handle in our models are stable, in the sense that they do not change when we “export” them to our “target systems”, we have to show that they hold not only under ceteris paribus conditions. Mechanisms that hold only ceteris paribus are a fortiori of limited value for our understanding, explanation and prediction of real economic systems.

Real world social systems are not governed by stable causal mechanisms or capacities. The kinds of “laws” and relations that econometrics has established are laws and relations about entities in models that presuppose causal mechanisms being atomistic and additive. When causal mechanisms operate in real world social target systems, they do so only in ever-changing and unstable combinations where the whole is more than a mechanical sum of parts. If economic regularities obtain, they do so (as a rule) only because we engineered them for that purpose. Outside man-made “nomological machines” they are rare, or even non-existent. Unfortunately that also makes most of the achievements of econometrics – as most of contemporary endeavours of mainstream economic theoretical modeling – rather useless.

Remember that a model is not the truth. It is a lie to help you get your point across. And in the case of modeling economic risk, your model is a lie about others, who are probably lying themselves. And what’s worse than a simple lie? A complicated lie.

Sam L. Savage, The Flaw of Averages

When Joshua Angrist and Jörn-Steffen Pischke in an earlier article of theirs [“The Credibility Revolution in Empirical Economics: How Better Research Design Is Taking the Con out of Econometrics,” Journal of Economic Perspectives, 2010] say that “anyone who makes a living out of data analysis probably believes that heterogeneity is limited enough that the well-understood past can be informative about the future,” I really think they underestimate the heterogeneity problem. It does not just turn up as an external validity problem when trying to “export” regression results to different times or different target populations. It is also often an internal problem to the millions of regression estimates that economists produce every year.

But when the randomization is purposeful, a whole new set of issues arises — experimental contamination — which is much more serious with human subjects in a social system than with chemicals mixed in beakers … Anyone who designs an experiment in economics would do well to anticipate the inevitable barrage of questions regarding the valid transference of things learned in the lab (one value of z) into the real world (a different value of z) …

Absent observation of the interactive compounding effects z, what is estimated is some kind of average treatment effect which is called by Imbens and Angrist (1994) a “Local Average Treatment Effect,” which is a little like the lawyer who explained that when he was a young man he lost many cases he should have won but as he grew older he won many that he should have lost, so that on the average justice was done. In other words, if you act as if the treatment effect is a random variable by substituting β_t for β_0 + β′z_t, the notation inappropriately relieves you of the heavy burden of considering what are the interactive confounders and finding some way to measure them …

If little thought has gone into identifying these possible confounders, it seems probable that little thought will be given to the limited applicability of the results in other settings.

Ed Leamer
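Leamer’s point about the Local Average Treatment Effect can be illustrated with a toy sketch (the types and numbers are invented). The IV/Wald estimand identifies the effect only for “compliers”, units whose treatment status responds to the instrument, and that can be far from the population-average effect:

```python
# Each unit: (type, individual causal effect). Compliers take the treatment
# iff encouraged by the instrument (Z = 1); never-takers never take it, so
# their (here much larger) effect is invisible to the estimator.
population = [("complier", 10), ("complier", 10),
              ("never-taker", 50), ("never-taker", 50)]
baseline = 0

def observed(z):
    """Mean outcome and mean treatment take-up when all units receive instrument z."""
    y, d = [], []
    for typ, effect in population:
        takes = (typ == "complier" and z == 1)
        d.append(1 if takes else 0)
        y.append(baseline + (effect if takes else 0))
    return sum(y) / len(y), sum(d) / len(d)

y1, d1 = observed(1)
y0, d0 = observed(0)
late = (y1 - y0) / (d1 - d0)  # Wald estimator: effect among compliers only
avg_effect = sum(e for _, e in population) / len(population)

print(late)        # 10.0 -- local average treatment effect (compliers)
print(avg_effect)  # 30.0 -- population-average causal effect
```

Nothing in the arithmetic is wrong; the question is whether the quantity it delivers is the one policy is actually interested in.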

Evidence-based theories and policies are highly valued nowadays. Randomization is supposed to control for bias from unknown confounders. The received opinion is that evidence based on randomized experiments therefore is the best.

More and more economists have also lately come to advocate randomization as the principal method for ensuring valid causal inferences.

I would however rather argue that randomization, just as econometrics, promises more than it can deliver, basically because it requires assumptions that in practice are not possible to maintain.

Especially when it comes to questions of causality, randomization is nowadays considered some kind of “gold standard”. Everything has to be evidence-based, and the evidence has to come from randomized experiments.

But just as econometrics, randomization is basically a deductive method. Given the assumptions (such as manipulability, transitivity, separability, additivity, linearity, etc.) these methods deliver deductive inferences. The problem, of course, is that we will never completely know when the assumptions are right. And although randomization may contribute to controlling for confounding, it does not guarantee it, since genuine randomness presupposes infinite experimentation and we know all real experimentation is finite. And even if randomization may help to establish average causal effects, it says nothing of individual effects unless homogeneity is added to the list of assumptions. Real target systems are seldom epistemically isomorphic to our axiomatic-deductive models/systems, and even if they were, we still have to argue for the external validity of the conclusions reached from within these epistemically convenient models/systems. Causal evidence generated by randomization procedures may be valid in “closed” models, but what we usually are interested in is causal evidence in the real target system we happen to live in.
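The point that randomization balances confounders only “on average” over repeated (in the limit, infinite) assignments, and not in any single finite trial, can be shown deterministically. A toy sketch with an invented confounder z over four units, two of whom are treated:

```python
from itertools import combinations

# Four units with a binary confounder z; exactly two get treated.
z = [0, 1, 1, 1]
units = range(4)

# Mean difference in z between treatment and control, for every one of the
# six equally likely random assignments.
imbalances = []
for treated in combinations(units, 2):
    control = [i for i in units if i not in treated]
    diff = sum(z[i] for i in treated) / 2 - sum(z[i] for i in control) / 2
    imbalances.append(diff)

# Averaged over all possible assignments, z is perfectly balanced ...
print(sum(imbalances) / len(imbalances))  # 0.0
# ... yet every single assignment leaves z imbalanced.
print(sorted(imbalances))  # [-0.5, -0.5, -0.5, 0.5, 0.5, 0.5]
```

The randomization procedure is unbiased in expectation, but the trial we actually run is one draw from this distribution, and here no draw delivers the balance the ‘on average’ argument invokes.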

When does a conclusion established in population X hold for target population Y? Only under very restrictive conditions!

Angrist’s and Pischke’s “ideally controlled experiments” tell us with certainty what causes what effects — but only given the right “closures”. Making appropriate extrapolations from (ideal, accidental, natural or quasi) experiments to different settings, populations or target systems, is not easy. “It works there” is no evidence for “it will work here”. Causes deduced in an experimental setting still have to show that they come with an export-warrant to the target population/system. The causal background assumptions made have to be justified, and without licenses to export, the value of “rigorous” and “precise” methods — and ‘on-average-knowledge’ — is despairingly small.

Like us, you want evidence that a policy will work here, where you are. Randomized controlled trials (RCTs) do not tell you that. They do not even tell you that a policy works. What they tell you is that a policy worked there, where the trial was carried out, in that population. Our argument is that the changes in tense – from “worked” to “work” – are not just a matter of grammatical detail. To move from one to the other requires hard intellectual and practical effort. The fact that it worked there is indeed fact. But for that fact to be evidence that it will work here, it needs to be relevant to that conclusion. To make RCTs relevant you need a lot more information and of a very different kind.


So, no, I find it hard to share the enthusiasm and optimism about the value of (quasi)natural experiments and all the statistical-econometric machinery that comes with them. Guess I’m still waiting for the export-warrant …


  1. […] regarding various statistical methods and the use of significance tests and the like (see examples here and […]

  2. Political economics and intellectual corruption
    Comment on ‘Econometrics and the ’empirical turn’ in economics’
    “As some one has said, it would seem that even the theorems of Euclid would be challenged and doubted if they should be appealed to by one political party as against another.” (Fisher, 1911, p. 6)
    Economics is a failed science. The main reason is that it had been hijacked from the very start by political economists and never could get rid of these folks.
    It is of utmost importance to distinguish between political and theoretical economics. The main differences are (i) the goal of political economics is to push an agenda, the goal of theoretical economics is to explain how the actual economy works; (ii) in political economics anything goes, in theoretical economics scientific standards are observed.
    The sole criterion of science is true/false and scientists tackle only those problems that can be clearly decided. The twilight zone between true and false is not denied but simply not of scientific interest. This no-go zone, though, is the natural ecological niche for non-scientists, anti-scientists, politicians, common sensers, gut feelers, storytellers, ideologues, believers, know-nothings, opinion-holders, confusers, ignoramuses, naives, cranks, etcetera.
    Let us call this zone where political economics is at home and where “nothing is clear and everything is possible” (Keynes, 1973, p. 292) the opinion zone. The common denominator of the inhabitants of this intellectual morass between true and false is that they have, for whatever reasons, no real interest in clear-cut answers and real knowledge.
    The curse of economics has been exactly spotted by Schumpeter: “It is only our inability to divorce research from politics, or our suspicion, all too often justified, that the other fellow cannot analyze with single-minded devotion to truth, which makes problems and party issues out of decisions that do not excite anyone in more fortunate fields of research.” (1994, p. 566)
    Theoretical economics is entirely different from political economics: “Research is in fact a continuous discussion of the consistency of theories: formal consistency insofar as the discussion relates to the logical cohesion of what is asserted in joint theories; material consistency insofar as the agreement of observations with theories is concerned.” (Klant, 1994, p. 31)
    Formal consistency is established by the axiomatic-deductive method. Material consistency is established by empirical testing, i.e. by correctly applying well-defined statistical methods. Both elements are required. There is no such thing as an opposition between theory and practice, or rigor and relevance, or deduction and induction.
    The answer to the question of why the scientific method apparently has not worked with overwhelming success in economics is pretty obvious: political economists are (i) scientifically incompetent, and (ii) content with the inconclusive status quo and not really interested in definitive true/false answers.
    “Yet most economists neither seek alternative theories nor believe that they can be found.” (Hausman, 1992, p. 248)
    To deny the very possibility of a true — formally and materially consistent — economic theory is the ultimate intellectual corruption of political economics.
    The interaction of the two indispensable elements of the scientific method will work their magic also in economics — but not until losers like Walrasians, Keynesians, Marxians, Austrians, etc. have been thrown out by genuine scientists.
    Egmont Kakarot-Handtke
    Fisher, I. (1911). The Purchasing Power of Money. Its Determination and Relation to Credit, Interest, and Crises. Library of Economics and Liberty. URL
    Hausman, D. M. (1992). The Inexact and Separate Science of Economics. Cambridge: Cambridge University Press.
    Keynes, J. M. (1973). The General Theory of Employment Interest and Money. The Collected Writings of John Maynard Keynes Vol. VII. London, Basingstoke: Macmillan.
    Klant, J. J. (1994). The Nature of Economic Thought. Aldershot, Brookfield, VT: Edward Elgar.
    Schumpeter, J. A. (1994). History of Economic Analysis. New York, NY: Oxford University Press.

  3. Thank you very much for this post!
    I am taking an econometrics class which is mostly based on Angrist’s and Pischke’s books. From the beginning of the course I had a feeling that there is something sloppy in the way the methods are used, and how the results are interpreted. The assumptions made in the lectures are not taken seriously enough, which I am used to in economic theory courses but not in statistics courses.

    Also what has disturbed me is that there is no clear distinction between statistical methods and theoretical assumptions. Actually this relationship is not discussed at all; regression models seem to fall from heaven, and then we just apply them to data and get the “treatment effect”.

    It is very important for me to read your critical texts about econometrics, because I feel quite alone with these doubts in my university.

  4. It’s funny that nothing, literally nothing is enough for Lars. Everything neoclassical economics has to offer is useless, this has been the message for 4 years, and now everything standard econometrics has to offer is useless as well. 😀 This is unbelievable really, especially given the rapidly increasing number of great econ/metrics (especially applied microecon) papers in even greater journals.

    • ‘Beauty is in the eye of the beholder’ and I think that goes for greatness too 🙂

    • Mary, Keynes identified econometrics as a non-starter at the very beginning. It does not matter how fancy the techniques become, you cannot remove its fundamental problem. The understanding they can provide will always be very limited.

      • Pontificating, on the other hand, is — as Keynes convincingly argued — a rock-solid method for reaching robust and sound conclusions, and should be practiced in every scientific discipline. (Provided, of course, that the conclusions support the blog author’s political agenda. But that goes without saying.)

