Why validating assumptions is so important in science

5 Nov, 2019 at 14:05 | Posted in Statistics & Econometrics | 3 Comments

An ongoing concern is that excessive focus on formal modeling and statistics can lead to neglect of practical issues and to overconfidence in formal results … Analysis interpretation depends on contextual judgments about how reality is to be mapped onto the model, and how the formal analysis results are to be mapped back into reality. But overconfidence in formal outputs is only to be expected when much labor has gone into deductive reasoning. First, there is a need to feel the labor was justified, and one way to do so is to believe the formal deduction produced important conclusions. Second, there seems to be a pervasive human aversion to uncertainty, and one way to reduce feelings of uncertainty is to invest faith in deduction as a sufficient guide to truth. Unfortunately, such faith is as logically unjustified as any religious creed, since a deduction produces certainty about the real world only when its assumptions about the real world are certain …

Unfortunately, assumption uncertainty reduces the status of deductions and statistical computations to exercises in hypothetical reasoning – they provide best-case scenarios of what we could infer from specific data (which are assumed to have only specific, known problems). Even more unfortunate, however, is that this exercise is deceptive to the extent it ignores or misrepresents available information, and makes hidden assumptions that are unsupported by data …

Despite assumption uncertainties, modelers often express only the uncertainties derived within their modeling assumptions, sometimes to disastrous consequences. Econometrics supplies dramatic cautionary examples in which complex modeling has failed miserably in important applications …

Sander Greenland

Yes, indeed, econometrics fails miserably over and over again. One reason it does is that the error term in the regression models used is thought of as representing the effect of the variables that were omitted from the models. The error term is somehow thought to be a ‘cover-all’ term representing omitted content in the model and necessary to include to ‘save’ the assumed deterministic relation between the other random variables included in the model. Error terms are usually assumed to be orthogonal (uncorrelated) to the explanatory variables. But since they are unobservable, the assumption is also impossible to test empirically. And without justification of the orthogonality assumption, there is as a rule nothing to ensure identifiability:

With enough math, an author can be confident that most readers will never figure out where a FWUTV (facts with unknown truth value) is buried. A discussant or referee cannot say that an identification assumption is not credible if they cannot figure out what it is and are too embarrassed to ask.

Distributional assumptions about error terms are a good place to bury things because hardly anyone pays attention to them. Moreover, if a critic does see that this is the identifying assumption, how can she win an argument about the true expected value of the level of aether? If the author can make up an imaginary variable, “because I say so” seems like a pretty convincing answer to any question about its properties.

Paul Romer
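
The point about untestable orthogonality can be made concrete with a small simulation. What follows is only a sketch under stated assumptions: the function name `simulate_omitted_variable`, the parameters `beta`, `gamma` and `rho`, and all numerical values are hypothetical and chosen for illustration; nothing here comes from Greenland or Romer. An omitted variable `z` is built to be correlated with the regressor `x`, and `y` is then regressed on `x` alone. The fitted residuals are uncorrelated with `x` by construction of least squares, so inspecting them cannot reveal that the true error term violates the assumption.

```python
import numpy as np


def simulate_omitted_variable(n=10_000, beta=1.0, gamma=2.0, rho=0.5, seed=0):
    """Regress y on x alone when an omitted variable z is correlated with x.

    All names and values are hypothetical, for illustration only.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    # z has unit variance and correlation rho with x (by construction)
    z = rho * x + np.sqrt(1.0 - rho**2) * rng.normal(size=n)
    noise = rng.normal(size=n)
    y = beta * x + gamma * z + noise

    # What the "error term" really contains once z is left out of the model
    true_error = gamma * z + noise

    # Ordinary least squares of y on a constant and x
    X = np.column_stack([np.ones(n), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coef

    return {
        "slope_estimate": coef[1],
        "corr_x_true_error": np.corrcoef(x, true_error)[0, 1],
        "corr_x_residuals": np.corrcoef(x, residuals)[0, 1],
    }


if __name__ == "__main__":
    for name, value in simulate_omitted_variable().items():
        print(f"{name}: {value:.4f}")
```

In this setup the estimated slope settles near `beta + gamma * rho` rather than `beta`, while the correlation between `x` and the fitted residuals is zero up to rounding. Only the simulation's author, who knows `z`, can see that the true error is correlated with `x`.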


  1. Like anyone else, I like to repeat my little bits of wisdom. I will repeat one now, believing it may be particularly relevant: the necessity of operational models. I take the term, “operational model”, from Popper, who was surprised to discover that physics was not preoccupied with theoretical analysis per se. Popper believed deeply in the truth as embodied in analytic modeling; those little nomological machines were the bestest, in his mind. But he found physics working on issues of measurement: in the mindspace where the scientist gazed at reality and wondered why it differed so much from imagination. And, tried to find out.
    The errors, the deviations of the world from idealized imagination, the messiness, ought to be the interesting part, not the part we earnestly try to “assume” away. The “validity” of a prior assumption should not be an issue at all. Projecting wishes onto error terms is just a strange way to never leave the closed room, the prison of analysis, to visit the great outdoors.
    Economists need to learn to do something else, to do operational modeling of closely observed social institutions. To measure specifics, rather than assume generality.

  2. They should at least carry error terms along and report them. GDP should have a plus-or-minus X% attached. Every time you scale something like unemployment, which has its own error term (being a survey), by GDP, you should multiply the error terms and illustrate the resulting wide confidence intervals in your figures. The reason economists don’t do this is that the error margins would be so wide, you could tell any story.

    • Isn’t that a feature rather than a bug?
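
As a rough illustration of what carrying error margins through a ratio could look like, here is a minimal sketch. It assumes the two margins are independent and small, in which case first-order propagation adds the relative errors in quadrature; that is one common convention, not necessarily what the commenter has in mind. The function name `ratio_with_uncertainty` and all figures are hypothetical.

```python
import math


def ratio_with_uncertainty(num, num_rel_err, den, den_rel_err):
    """Ratio num/den with a first-order relative error margin.

    Assumes independent, small relative errors (an assumption made
    for this sketch, not a claim about any official statistic).
    """
    ratio = num / den
    # Relative errors of independent factors combine in quadrature
    rel_err = math.hypot(num_rel_err, den_rel_err)
    low = ratio * (1.0 - rel_err)
    high = ratio * (1.0 + rel_err)
    return ratio, rel_err, (low, high)


if __name__ == "__main__":
    # Hypothetical figures: numerator known to +/-3%, denominator to +/-4%
    ratio, rel_err, (low, high) = ratio_with_uncertainty(100.0, 0.03, 2000.0, 0.04)
    print(f"ratio = {ratio:.4f} +/- {rel_err:.1%}  (range {low:.4f} to {high:.4f})")
```

Even under these generous assumptions the margin on the ratio is wider than on either input, which is the commenter's point: reported honestly, the bands widen every time one uncertain quantity is scaled by another.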
