Econometrics and the axiom of correct specification

20 Oct, 2016 at 17:22 | Posted in Statistics & Econometrics | 4 Comments

Most work in econometrics and regression analysis is — still — done on the assumption that the researcher has a theoretical model that is ‘true.’ On the strength of this belief in a correct specification for an econometric model or regression, one proceeds as if the only remaining problems have to do with measurement and observation.

When things sound too good to be true, they usually aren’t. And that goes for econometric wet dreams too. The snag, of course, is that there is precious little to support the perfect-specification assumption. Looking around in social science and economics, we don’t find a single regression or econometric model that lives up to the standards set by the ‘true’ theoretical model — and there is precious little reason to believe things will be different in the future.

To think that we are able to construct a model in which all relevant variables are included and the functional relationships between them are correctly specified is not only a belief without support, but a belief impossible to support.

The theories we work with when building our econometric regression models are insufficient. No matter what we study, there are always some variables missing, and we don’t know the correct way to functionally specify the relationships between the variables.

Every regression model constructed is misspecified. There is always an endless list of possible variables to include, and endless possible ways to specify the relationships between them. So every applied econometrician comes up with his own specification and ‘parameter’ estimates. The econometric Holy Grail of consistent and stable parameter values is nothing but a dream.
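The instability of estimated ‘parameters’ under misspecification is easy to see in a small simulation (a hedged sketch with invented numbers, not anyone’s actual study). The same regression, which omits a relevant variable, is estimated in ‘contexts’ that differ only in how the included regressor correlates with the omitted one — and it delivers systematically different ‘parameters’ each time:

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_slope(x, y):
    """OLS slope of y on x (with intercept), via the closed-form estimator."""
    xc = x - x.mean()
    return float(np.sum(xc * (y - y.mean())) / np.sum(xc**2))

def context(rho, n=10_000):
    """One 'spatio-temporal context': the true model is always
    y = 1.0*x1 + 1.0*x2 + noise, but corr(x1, x2) = rho differs by context.
    We estimate the misspecified regression of y on x1 alone."""
    x1 = rng.normal(size=n)
    x2 = rho * x1 + np.sqrt(1 - rho**2) * rng.normal(size=n)
    y = 1.0 * x1 + 1.0 * x2 + rng.normal(size=n)
    return fit_slope(x1, y)  # omitted-variable bias: roughly 1 + rho

for rho in (-0.5, 0.0, 0.5):
    print(f"corr(x1, x2) = {rho:+.1f}  ->  estimated 'parameter' ~ {context(rho):.2f}")
```

The ‘parameter’ on x1 is not a stable feature of the world at all; it shifts with the covariance structure of the context, exactly because the specification leaves x2 out.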

In order to draw inferences from data as described by econometric texts, it is necessary to make whimsical assumptions. The professional audience consequently and properly withholds belief until an inference is shown to be adequately insensitive to the choice of assumptions. The haphazard way we individually and collectively study the fragility of inferences leaves most of us unconvinced that any inference is believable. If we are to make effective use of our scarce data resource, it is therefore important that we study fragility in a much more systematic way. If it turns out that almost all inferences from economic data are fragile, I suppose we shall have to revert to our old methods …

Ed Leamer

A rigorous application of econometric methods in economics really presupposes that the phenomena of our real-world economies are ruled by stable causal relations between variables. Parameter values estimated in specific spatio-temporal contexts are presupposed to be exportable to totally different contexts. To warrant this assumption, however, one has to convincingly establish that the targeted acting causes are stable and invariant, so that they maintain their parametric status after the bridging. The endemic lack of predictive success of the econometric project indicates that this hope of finding fixed parameters is a hope for which there really is no other ground than hope itself.

That models should correspond to reality is, after all, a useful but not totally straightforward idea – with some history to it. Developing appropriate models is a serious problem in statistics; testing the connection to the phenomena is even more serious …

In our days, serious arguments have been made from data. Beautiful, delicate theorems have been proved, although the connection with data analysis often remains to be established. And an enormous amount of fiction has been produced, masquerading as rigorous science.

The theoretical conditions that have to be fulfilled for regression analysis and econometrics really to work are nowhere close to being met in reality. Making outlandish statistical assumptions does not provide a solid ground for doing relevant social science and economics. Although regression analysis and econometrics have become the most used quantitative methods in social sciences and economics today, it is still a fact that the inferences made from them are invalid.

Regression models have some serious weaknesses. Their ease of estimation tends to suppress attention to features of the data that matching techniques force researchers to consider, such as the potential heterogeneity of the causal effect and the alternative distributions of covariates across those exposed to different levels of the cause. Moreover, the traditional exogeneity assumption of regression … often befuddles applied researchers … As a result, regression practitioners can too easily accept their hope that the specification of plausible control variables generates as-if randomized experiment.

Econometrics — and regression analysis — is basically a deductive method. Given the assumptions (manipulability, transitivity, separability, additivity, linearity, etc.), it delivers deductive inferences. The problem, of course, is that we can never fully know whether the assumptions are right. Conclusions can only be as certain as their premises — and that applies to econometrics and regression analysis too.


  1. //Parameter-values estimated in specific spatio-temporal contexts are presupposed to be exportable to totally different contexts.//
    A lot of the problem here goes away if point parameter estimation is replaced by interval parameter estimation. For example:

    There are an awful lot of spatio-temporal contexts where you can export ±300bps with 70% confidence.
    (Yes, it’s a projection, not a parameter estimate, but I find the chart irresistible; also, the larger point holds: with interval estimation, the weaknesses of the model are surfaced and made explicit throughout.)
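    The commenter’s interval-estimation point can be sketched in a few lines of Python (all data here are invented for illustration): reporting the estimate together with its interval keeps the model’s imprecision visible instead of hiding it behind a bare point value.

```python
import numpy as np

rng = np.random.default_rng(1)

def slope_with_interval(x, y, z=1.96):
    """OLS slope of y on x plus an approximate 95% interval, so the
    uncertainty travels with the estimate rather than being dropped."""
    xc = x - x.mean()
    b = float(np.sum(xc * (y - y.mean())) / np.sum(xc**2))
    resid = (y - y.mean()) - b * xc
    se = float(np.sqrt(np.sum(resid**2) / (len(x) - 2) / np.sum(xc**2)))
    return b, (b - z * se, b + z * se)

# Noisy invented data: the wide interval is the honest part of the report.
x = rng.normal(size=200)
y = 0.8 * x + rng.normal(scale=2.0, size=200)
b, (lo, hi) = slope_with_interval(x, y)
print(f"point estimate {b:.2f}, interval ({lo:.2f}, {hi:.2f})")
```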

  2. People confuse specifications with models or programs in economics. In general, specifications describe “what” must hold, while models describe “how” it comes to hold. Formally, to prove the correctness of a model (i.e. that it meets its specification), we need to validate two properties of the model:

    (1) completeness: the model implies all allowable instances of the specification
    (2) consistency: the model contradicts no allowable instance of the specification

    Economic specifications are accounting identities based on income/spending. They are market-independent and hold as temporal statements for ALL time periods, past and future.

    Economic models are equilibrium and behavioural equations based on demand/supply. They are market-dependent and, at most, hold at SPECIFIC time periods.

    There is no way to provide an economic model that meets the economic specifications with both completeness and consistency!

    For example,

    1 Economic specification:

    For all time t:

    Current Account Balance(t)
    = (Household Saving(t) − Household Investment(t))
    + (Business Saving(t) − Business Investment(t))
    + (Public Saving(t) − Public Investment(t))

    Can any loanable funds model meet this specification with completeness and consistency? The answer is no!

    2 Economic specification:

    IS(r, t) = LM(r, t) * k * Q(r, t) / P(r, t)
    IS(r, t) = GDP(r, t) = P(r, t) * Q(r, t)
    LM(r, t) = Md(r, t) / P(r, t)

    Is the IS/LM model correct? Once specific functional forms for IS and LM are chosen, the model loses completeness and/or consistency!
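    The “what” vs “how” distinction above can be sketched in Python, using the sectoral-balances identity from the commenter’s first example (all numbers below are hypothetical): the accounting identity holds at every t by construction, no matter what behavioural model generated the components.

```python
import numpy as np

# Hypothetical sectoral saving (S) and investment (I) series over T periods.
rng = np.random.default_rng(2)
T = 8
hh_s,  hh_i  = rng.normal(5, 1, T), rng.normal(4, 1, T)   # households
bus_s, bus_i = rng.normal(3, 1, T), rng.normal(4, 1, T)   # business
pub_s, pub_i = rng.normal(1, 1, T), rng.normal(2, 1, T)   # public sector

# The specification ("what"): the identity defines the current account
# balance, so it holds at EVERY t by accounting, with no model of "how".
cab = (hh_s - hh_i) + (bus_s - bus_i) + (pub_s - pub_i)

assert np.allclose(cab, (hh_s + bus_s + pub_s) - (hh_i + bus_i + pub_i))
print("identity holds in all", T, "periods")
```

    A behavioural model, by contrast, would have to predict each component from demand/supply equations, and nothing guarantees that its predictions satisfy the identity in every period — which is the commenter’s completeness/consistency point.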

  3. Like my critique of DSGE modeling referred to on Oct. 19, the claim
    “Every regression model constructed is misspecified. There are always an endless list of possible variables to include, and endless possible ways to specify the relationships between them. So every applied econometrician comes up with his own specification and ‘parameter’ estimates. The econometric Holy Grail of consistent and stable parameter-values is nothing but a dream.”

    conflates statistical with substantive misspecification. In my statistical world, the Linear Regression (LR) model rests on 5 statistical assumptions pertaining to the conditional process {[Y(t) | X(t)=x(t)], t=1,2,…,n,…}: [1] Normality, [2] Linearity, [3] Homoskedasticity, [4] Independence, and [5] parameter constancy. These assumptions are easily testable vis-à-vis one’s data. When they have been thoroughly tested and their validity established, one can use the traditional estimators, tests and predictors as deductive frequentist sampling theory asserts, knowing that the optimality properties derived under these assumptions are approximately valid: when one applies a .05 significance test, the actual type I error probability is approximately .05, not .2 or .9 — which can easily be the case when some of the assumptions [1]–[5] are invalid.

    Omitted variables, false causal claims, etc. are not statistical but substantive assumptions, which can be probed after the statistical adequacy of the LR model is secured. Without securing statistical adequacy, one should not use statistical procedures for such probing “as if” these assumptions were valid! In this sense, substantive adequacy is just a goal (a dream), but one cannot even begin to move toward that goal before securing statistical adequacy. It is one thing to claim that one’s model is only a crude approximation of the reality it aims to explain, and quite another to be proud of having imposed invalid assumptions on the data. The latter can be easily tested and remedied; the former will remain an elusive objective for fields like economics.
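    The type I error point is easy to demonstrate by simulation (a sketch with invented data, not the commenter’s own example): with a persistent regressor and autocorrelated errors, assumption [4] fails, and a nominal 5% test rejects a true null far more often than 5% of the time.

```python
import numpy as np

rng = np.random.default_rng(3)

def rejects_h0(n=100, phi=0.9):
    """One dataset where the true slope is 0 but Independence [4] is
    violated: a random-walk regressor and AR(1) errors. Returns True if
    the conventional two-sided 5% t-test (wrongly) rejects H0: slope = 0."""
    x = np.cumsum(rng.normal(size=n))        # persistent, 'trending' regressor
    e = np.zeros(n)
    for t in range(1, n):
        e[t] = phi * e[t - 1] + rng.normal() # autocorrelated errors
    y = e                                    # y is unrelated to x
    xc = x - x.mean()
    b = np.sum(xc * (y - y.mean())) / np.sum(xc**2)
    resid = (y - y.mean()) - b * xc
    se = np.sqrt(np.sum(resid**2) / (n - 2) / np.sum(xc**2))
    return abs(b / se) > 1.96                # nominal 5% critical value

rate = np.mean([rejects_h0() for _ in range(500)])
print(f"actual type I error rate ~ {rate:.2f} (nominal 0.05)")
```

    The conventional standard error presumes independent errors; when that assumption fails, the actual rejection rate of a true null is several times the nominal .05 — exactly the gap the comment describes.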

  4. Aris, although I concur on the need for checking statistical adequacy, I’m still not convinced that will solve the ‘problem’. As our colleague Ed Leamer writes in ‘Macroeconomic Patterns and Stories’:

    “Statistical Science is not really very helpful for understanding or forecasting complex evolving self-healing organic ambiguous social systems – economies, in other words.

    A statistician may have done the programming, but when you press a button on a computer keyboard and ask the computer to find some good patterns, better get clear a sad fact: computers do not think. They do exactly what the programmer told them to do and nothing more. They look for the patterns that we tell them to look for, those and nothing more. When we turn to the computer for advice, we are only talking to ourselves …

    Mathematical analysis works great to decide which horse wins, if we are completely confident which horses are in the race, but it breaks down when we are not sure. In experimental settings, the set of alternative models can often be well agreed on, but with nonexperimental economics data, the set of models is subject to enormous disagreements. You disagree with your model made yesterday, and I disagree with your model today. Mathematics does not help much resolve our internal intellectual disagreements.”
