The econometric illusion

7 Jan, 2023 at 17:51 | Posted in Statistics & Econometrics | 11 Comments

What has always bothered me about the “experimentalist” school is the false sense of certainty it conveys. The basic idea is that if we have a “really good instrument” we can come up with “convincing” estimates of “causal effects” that are not “too sensitive to assumptions.” Elsewhere I have written an extensive critique of this experimentalist perspective, arguing it presents a false panacea, and that all statistical inference relies on some untestable assumptions …

Maimonides quote: “Teach thy tongue to say ‘I do not know,’ and thou shalt progress.”

Consider Angrist and Lavy (1999), who estimate the effect of class size on student performance by exploiting variation induced by legal limits. It works like this: Let’s say a law prevents class size from exceeding 30. Let’s further assume a particular school has student cohorts that average about 90, but that cohort size fluctuates between, say, 84 and 96. So, if cohort size is 91–96 we end up with four classrooms of size 22 to 24, while if cohort size is 85–90 we end up with three classrooms of size 28 to 30. By comparing test outcomes between students who are randomly assigned to the small vs. large classes (based on their exogenous birth timing), we obtain a credible estimate of the effect of class size on academic performance. Their answer is that a ten-student reduction raises scores by about 0.2 to 0.3 standard deviations.
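The arithmetic behind this class-size rule can be sketched in a few lines of Python. This is only an illustration, not code from Keane or from Angrist and Lavy: the function name `split_cohort` is hypothetical, and the cap of 30 is an assumption inferred from the class sizes quoted above (three classes of 28 to 30, four classes of 22 to 24).

```python
def split_cohort(cohort_size: int, cap: int = 30) -> tuple[int, float]:
    """Return (number of classes, average class size) under a hard cap.

    Assumption: the school opens the minimum number of classes needed
    to keep every class at or below `cap` and splits students evenly.
    """
    # Integer ceiling division: smallest n with n * cap >= cohort_size.
    n_classes = -(-cohort_size // cap)
    return n_classes, cohort_size / n_classes


# One extra student at the threshold (90 -> 91) forces a fourth class,
# so average class size drops sharply.
for cohort in (85, 90, 91, 96):
    n_classes, avg_size = split_cohort(cohort)
    print(cohort, n_classes, round(avg_size, 2))
```

The jump in average class size between a cohort of 90 and a cohort of 91 is the discontinuity the identification strategy relies on. Whether cohort size itself is as good as random is a separate question that this sketch does not address.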

This example shares a common characteristic of natural experiment studies, which I think accounts for much of their popularity: At first blush, the results do seem incredibly persuasive. But if you think for a while, you start to see they rest on a host of assumptions. For example, what if schools that perform well attract more students? In this case, incoming cohort sizes are not random, and the whole logic breaks down. What if parents who care most about education respond to large class sizes by sending their kids to a different school? What if teachers assigned to the extra classes offered in high enrollment years are not a random sample of all teachers?

Michael Keane

Keane’s critique of econometric ‘experimentalists’ gives a fair picture of some of the unfounded and exaggerated claims put forward in many econometric natural experiment studies. But — much of the critique really applies to econometrics in general, including the kind of ‘structural’ econometrics Keane himself favours!

The processes that generate socio-economic data in the real world cannot just be assumed to always be adequately captured by a probability measure. And, so, it cannot be maintained that it even should be mandatory to treat observations and data — whether cross-section, time series or panel data — as events generated by some probability model. The important activities of most economic agents do not usually include throwing dice or spinning roulette wheels. Data-generating processes — at least outside of nomological machines like dice and roulette wheels — are not self-evidently best modelled with probability measures.

When economists and econometricians — often uncritically and without arguments — simply assume that one can apply probability distributions from statistical theory to their own area of research, they are skating on thin ice. If you cannot show that data satisfies all the conditions of the probabilistic ‘nomological machine,’ then the statistical inferences made in mainstream economics lack sound foundations.

Statistical — and econometric — patterns should never be seen as anything other than possible clues to follow. Behind observable data, there are real structures and mechanisms operating, things that are — if we really want to understand, explain and (possibly) predict things in the real world — more important to get hold of than to simply correlate and regress observable variables.

Statistics cannot establish the truth value of a fact. Never has. Never will.


  1. There are indeed lots of examples of unsuccessful and bad econometrics.
    And there are lots of examples of good econometrics, especially those using graphical presentations of data.
    For example consider the relationship between indices of human development and GDP per capita:
    (Hint: Click on the line under the graph near “1870” to also see the human development indices for the individual countries over time as well as for 2015.)
    As you inspect the graph don’t kid yourself that you are not doing econometrics.
    If you can discern a general relationship between the index of human development and GDP per capita then you are essentially fitting a curve with the scatter and outliers attributed to omitted variables and allowed for by an implicit probability distribution.
    This suggests that Prof. Syll suffers from a delusion when he describes econometrics as an illusion.
    He fails to realise that he himself often does econometrics looking at graphs!
    Perhaps he does not fully understand that in order to survive humans evolved as innate econometricians able to spot patterns in otherwise meaningless observations.
    It is Prof. Syll who suffers from an illusion. At the end of this post he restates the illusion of realist philosophers that there exists a “deeper reality” where “behind observable data, there are real structures and mechanisms operating”. There is zero empirical evidence for this belief.
    Prof. Syll seems to believe that this is a “truth value of a fact”, but actually it is merely an unwarranted metaphysical assumption.
    illusion = a wrong or misinterpreted perception of a sensory experience
    delusion = an idiosyncratic belief or impression that is not in accordance with a generally accepted reality.

    • 《As you inspect the graph don’t kid yourself that you are not doing econometrics.》
      Why are there no error bars?
      When you build a bridge, do you use a safety factor of at least two before exporting your model to the real world? Where is the equivalent error propagation in econometrics?
      When you include error terms in textbooks but throw them out when you actually do econometrics, are you assuming ergodicity and error characteristics that bridge builders can’t get away with?
      If you tracked suicide deaths, would they too increase with GDP? Why did my brother voluntarily choose a life expectancy of 49 despite having a good corporate job, a condo, a sports car, etc.? Is it because econometrics leaves out spiritual development?
      Why use GDP anyway when total world capital is at least an order of magnitude greater?

      • @rsm
        You ask an important question: “Why are there no error bars?”.
        One could also ask “Why is there no algebraic formula for the relationship between X and Y?”
        The answer to these questions stems from the fundamental purpose of statistics such as those produced by mathematical econometricians.
        “…the object of statistical methods is the reduction of data. A quantity of data, which usually by its mere bulk is incapable of entering the mind, is to be replaced by relatively few quantities which shall adequately represent the whole, or which, in other words, shall contain as much as possible, ideally the whole, of the relevant information”
        – R.A. Fisher, “On the Mathematical Foundations of Theoretical Statistics” (1922)
        Of course, you can always specify your own curves and compute statistics if you so wish. This may be necessary when there are 3 or more important factors influencing a variable. Otherwise it is usually much better to communicate using graphs.
        Very often all the key data can be summarised and presented in graphs which can be readily understood by the intended audience. The reader is then able and will generally prefer to use his own innate econometric skills. He can inspect the data directly and draw his own conclusions regarding the existence of interesting patterns and reliability thereof.
        If so there is no need whatsoever for any algebraic formula, standard error computations or other statistics.

        • I read a news article the other day about a study of trends in human nutrition from the onset of the neolithic (and dawn of agriculture) thru the Iron Age. They had some genetic data and have tried to tease apart the differing contributions to height from genetics (and by extension migration and conquest supplanting aboriginal hunter-gatherers with farmers). Interesting topics.
          A scientist was quoted as saying that height was 80% heredity and 20% nutrition and other factors. This is the kind of mystifying nonsense that statistical analysis of variance promotes. Obviously lazy journalism plays a part, but these same kinds of meaningless generalizations using the vocabulary of statistics and probability destroy precarious knowledge in many fields.

          • Bruce,
            1. Please give a reference for the “study of trends in human nutrition” which you mention.
            2. Please explain how “precarious knowledge” can be improved without data and analysis thereof.

            • What can “height is 80% heredity and 20% nutrition” possibly mean? What “knowledge” does it communicate?

              • Bruce,
                Maybe this is relevant. It seems that “heritability” has a technical meaning for geneticists:
                “Heritability is defined as the proportion of the total variation in a given phenotype within a population that is attributable to genetic variance.”

        • Are you saying my brother’s suicide has no relevance to econometrics?
          Can I instead tell a story where my brother majored in Economics at Berkeley in the 1980s, because he believed that making money was the best way he could contribute to the world? But did money fail to buy him happiness?
          When economists summarize data do they find convenient ways to throw out data points representing my brother, my buddy (overdosed at age 47), and so many others, because their stories contradict the economic narrative that mainstream economists prefer to bloviate about?

          • Rsm,
            The UNDP agrees with you that GDP per capita is not the same thing as “happiness”.
            Their Human Development Index (and several other indices) were:
            “created to emphasize that people and their capabilities should be the ultimate criteria for assessing the development of a country, not economic growth alone.”

            • Why is economic growth (and jobs) still the goal of even self-styled heterodox economists?

  2. Real societies and social institutions are, arguably, information-generating, information-processing and (very importantly) information-governed phenomena. “Causality” in such contexts must be presumed to be mediated by and indeed to consist of “signal-processing” by and among intelligent actors, who are observing their own behavior and circumstances and acting strategically.
    Keane in his critique is only reminding us of the information-rich nature of the object of study by way of calling into question a method that deliberately abstracts away from and essentially ignores that richness.
    If we had a strong theory, supported by direct observation and measurement of teaching and learning processes, instead of a weak intuition for exactly why and how class-size matters to classroom learning, we might have reason to summarize the effects observed with descriptive statistics. To, in effect, argue for statistical magic that makes deep understanding of processes unnecessary seems like a form of agnotology.
