## What inferential leverage do statistical models provide?

18 December, 2016 at 14:19 | Posted in Statistics & Econometrics | 10 Comments

Experimental (and non-experimental) data are often analyzed using a regression model of the form

$$Y_i = a + bZ_i + W_i\beta + \varepsilon_i,$$

where $W_i$ is a vector of control variables for subject $i$, while $a$, $b$, and $\beta$ are parameters (if $W_i$ is $1 \times p$, then $\beta$ is $p \times 1$). The effect of treatment is measured by $b$. The disturbances $\varepsilon_i$ would be assumed independent across subjects, with expectation 0 and constant variance. The $Z_i$ and $W_i$ would also need to be independent of the disturbances (this is the exogeneity assumption).

Randomization guarantees that the $Z_i$ are independent of the $W_i$ and $\varepsilon_i$. But why are $W_i$ and $\varepsilon_i$ independent? Why are the $\varepsilon_i$ independent across subjects, with expectation 0 and constant variance? Replacing the indicator $Z_i$ for assignment by an indicator $X_i$ for treatment received makes the model less secure: why is choice of treatment independent of the disturbance term? With observational data, such questions are even thornier. Of course, there are models with assumptions that are more general and harder to fathom. But that only postpones the reckoning. More-complicated questions can in turn be asked about more-complicated models …
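The contrast between randomized assignment and self-selected treatment can be made concrete with a small simulation. What follows is only a sketch under stated assumptions, not something taken from the quoted text: the parameter values, the single control variable, and the self-selection rule (treatment is more likely when the disturbance is large) are all hypothetical choices made for illustration.

```python
import numpy as np


def ols(y, X):
    """Least-squares coefficients for y = X @ coef + error."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def simulate(n=20_000, a=1.0, b=2.0, beta=0.5, seed=0):
    """Return the estimate of b under randomized assignment and under
    a hypothetical self-selection rule for treatment received."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=n)      # one control variable (p = 1), an assumption
    eps = rng.normal(size=n)    # disturbance term

    # Randomized assignment: drawn independently of w and eps.
    z = rng.integers(0, 2, size=n).astype(float)

    # Hypothetical self-selection: treatment received depends on eps.
    x = (eps + rng.normal(size=n) > 0).astype(float)

    y_assigned = a + b * z + beta * w + eps
    y_received = a + b * x + beta * w + eps

    ones = np.ones(n)
    b_randomized = ols(y_assigned, np.column_stack([ones, z, w]))[1]
    b_selected = ols(y_received, np.column_stack([ones, x, w]))[1]
    return b_randomized, b_selected


if __name__ == "__main__":
    b_randomized, b_selected = simulate()
    print(f"true b = 2.0, randomized: {b_randomized:.3f}, self-selected: {b_selected:.3f}")
```

In this setup the regression on the randomized indicator recovers a value close to the true b, while the regression on the self-selected indicator overstates it, because treatment received is correlated with the disturbance. The data are identical in kind; only the exogeneity assumption has changed.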

With models, it is easy to lose track of three essential points: (i) results depend on assumptions, (ii) changing the assumptions in apparently innocuous ways can lead to drastic changes in conclusions, and (iii) familiarity with a model’s name is no guarantee of the model’s truth. Under the circumstances, it may be the assumptions behind the model that provide the leverage, not the data fed into the model. This is a danger with experiments, and even more so with observational studies.

## 10 Comments »

I have been following this blog for several months now, paying particular attention to the entries relating to econometric modeling and inference. As an active econometrician, I found myself having mixed feelings about the discussion on this blog.

On the one hand, I’m very pleased to see economists and econometricians raising important foundational issues that are invariably banned from “learned” journals. I’m also highly sympathetic to the issues raised by both the blogger and the discussants. I’m particularly pleased to see pieces from the late David Freedman featuring prominently on this blog. I did get to know David Freedman and appreciated his critical perspective on empirical modeling and inference, but I was also critical of him for not balancing his well-argued destructive comments with constructive advice on how one can alleviate the real problems he was raising.

On the other hand, I’m disappointed to discover that the people who are most worried about these crucial foundational problems in econometric modeling and inference seem to be totally unaware that there are active econometricians in the trenches who have spent their entire careers not just destructively criticizing the conventional wisdom, but offering real constructive solutions to most of these problems. These constructive solutions have been published, with a lot of effort and a huge publication lag, in traditional journals as well as books that are easily accessible to people who care about such issues. Speaking for myself, the main theme of my academic work has been to address the key issue of how a practitioner can ensure that the probabilistic assumptions he/she imposes on the data can be validated vis-à-vis the data, and if any of them are found wanting, how to respecify the original model to account for such departures. This is directly related to the theme of today’s blog entry from Freedman:

“With models, it is easy to lose track of three essential points: (i) results depend on assumptions, (ii) changing the assumptions in apparently innocuous ways can lead to drastic changes in conclusions, and (iii) familiarity with a model’s name is no guarantee of the model’s truth.”

Even as a graduate student at the LSE, I realized that the traditional way of specifying statistical models using probabilistic assumptions relating to the error term was partly to blame for the endemic statistical misspecification in applied econometrics. Hence, upon the completion of my Ph.D. I wrote a book entitled “Statistical Foundations of Econometric Modelling”, CUP, 1986, whose primary aim was to demonstrate that one can specify statistical models using probabilistic assumptions based exclusively on the observable stochastic process underlying the observed data, and not the unobservable error process! I went on to recast 80% of traditional econometric models from that perspective, showing which probabilistic assumptions are being invoked, how to test them, and what to do next if some are found wanting. Since then, I have published a second book (in 1999) and more than 80 papers in journals in three different fields (econometrics, statistics and philosophy of science), working out the details of my initial attempt and extending its scope to recast econometric modeling and inference on sounder foundations, where statistical adequacy occupies center stage. I wrote extensively about the devastating effects of minor misspecifications on the reliability of inference; the actual type I error probability in a significance test can easily be 1.0 instead of the nominal .05. Key among the issues I set out to address is that of statistical vs. substantive adequacy, which arguably undermines the trustworthiness of most current empirical results in econometrics. I keep hearing people invoking the slogan attributed to Box, “all models are wrong, but some are useful”, not realizing that they confuse statistical inadequacy (some of the probabilistic assumptions imposed on the data are invalid) with substantive inadequacy (the structural model is not realistic enough to capture the key features of the phenomenon of interest). These two are very different issues, and one does not need to have a substantively adequate model to learn something about the phenomenon of interest, as long as one has a statistically adequate model. By the way, I wrote the original paper on “Revisiting the omitted variables argument: Substantive vs. statistical adequacy” in 1982, but after numerous rounds in journals, it was finally published in 2006! Yes, Keynes had raised several important issues pertaining to the reliability of inference in applied econometrics, but almost all of them have been addressed or answered.
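The gap between nominal and actual type I error probabilities can be illustrated with a short Monte Carlo experiment. This is a minimal sketch and not an example from the commenter's papers: it assumes a t-test of a zero mean applied to AR(1) data, with the sample size, the autoregressive coefficient, and the normal-approximation critical value all chosen here purely for illustration.

```python
import numpy as np


def rejection_rate(rho, n=100, reps=2000, seed=0):
    """Share of replications in which a nominal 5% t-test of H0: mean = 0
    rejects, when the data are a zero-mean AR(1) process with coefficient rho.
    The null hypothesis is true in every replication."""
    rng = np.random.default_rng(seed)
    crit = 1.96  # two-sided 5% critical value, normal approximation
    rejections = 0
    for _ in range(reps):
        e = rng.normal(size=n)
        x = np.empty(n)
        x[0] = e[0] / np.sqrt(1.0 - rho**2)  # draw from the stationary distribution
        for t in range(1, n):
            x[t] = rho * x[t - 1] + e[t]
        # The usual t-statistic, whose standard error assumes independence.
        t_stat = x.mean() / (x.std(ddof=1) / np.sqrt(n))
        rejections += int(abs(t_stat) > crit)
    return rejections / reps


if __name__ == "__main__":
    for rho in (0.0, 0.5, 0.8):
        print(f"rho = {rho}: actual rejection rate = {rejection_rate(rho):.3f}")
```

With independent data (rho = 0) the rejection rate stays near the nominal 5%, whereas with positive dependence the same test rejects a true null far more often. The sketch does not reproduce the extreme figure of 1.0 mentioned above; it only shows the direction and rough size of the distortion in one simple case.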

Let me finish by expressing the hope that the people who care about these foundational issues and problems will make some effort to go beyond repeating them, and proceed to inform themselves by focusing on genuine attempts to resolve them in the published literature in econometrics and related fields.

Comment by Aris Spanos— 18 December, 2016 #

Aris, thanks for sharing your views on these important subjects with us. It’s especially pleasing since I have read most of your published work over the last 20 years and know it is of the highest quality. That said, there obviously are issues on which we disagree, some of which I have been discussing with Judea Pearl, and that seem to be related to the critique put forward by David Freedman (a critique that has profoundly influenced my own evaluation of statistical inference and econometrics since the 80s). I will be back with a post (and try to entice you especially to show in which way Keynes’ critique has been “answered”).

Comment by Lars Syll— 18 December, 2016 #

“Hence, upon the completion of my Ph.D. I wrote a book entitled “Statistical Foundations of Econometric Modelling”, CUP, 1986, whose primary aim was to demonstrate that one can specify statistical models using probabilistic assumptions based exclusively on the observable stochastic process underlying the observed data, and not the unobservable error process!”

Prof. Spanos, according to Google Books, your 1986 book doesn’t mention anywhere “sensitivity to initial conditions”, “Lorenz”, “logistic map”, etc., all topics which were “hot” in 1986, and which would seem on their face to have straightforward applicability to many questions in econometric modeling.

I’d be very curious to know to what extent you saw relevance then, and how your thinking may have evolved since.

Comment by Michael Robinson— 18 December, 2016 #

Michael, my aim at the time was not to write an encyclopedic book on empirical modeling and inference that includes all the latest developments.

It was just to recast the blueprint of traditional econometric textbooks so that statistical models are specified in terms of the observable stochastic process underlying the data, and not an unobservable error process. This would not only render the probabilistic assumptions testable but also provide a framework for what to do next if any of these assumptions are invalid for the particular data. After 700 pages I was not even finished with the traditional topics in econometric modeling, and I had to exclude statistical models for cross-section and panel data. Chaotic dynamics was not a priority topic for my research agenda, and besides, that topic did not mature enough until the mid 1990s to attempt its inclusion in a book on empirical modeling and inference.

Comment by Aris Spanos— 18 December, 2016 #

Thank you for the clarification about your motivations and the state of the art in the 1980s. I’m curious because I often encounter arguments in economics to the effect that, while the keys may not be under the street light, searching under the street light makes optimal use of the illumination available. There doesn’t appear to be much enthusiasm for exploring the possibility that the optimal use of the illumination available will provably not yield keys.

Somewhat related, consider this chart:

In your view, should we take this as evidence of a statistical modeling inadequacy, a substantive modeling inadequacy, or a political/institutional modeling inadequacy?

Comment by Michael Robinson— 19 December, 2016 #

//

In relation to his question “In your view, should we take this as evidence of a statistical modeling inadequacy, a substantive modeling inadequacy, or a political/institutional modeling inadequacy?”

My answer is all of the above, but to be able to separate the different sources of misspecification one needs to begin with a statistically adequate model so as to be able to use reliable statistical procedures to sort out the different sources.

//

Fair enough, but shouldn’t we expect the people specifying the statistical model to prefer models which obfuscate the contribution of political/institutional modeling inadequacy over models which clarify it?

Comment by Michael Robinson— 19 December, 2016 #

Aris: I’ve read and appreciated some of your work (albeit as a scientist/mathematician/engineer rather than econometrician).

Having said that, I still don’t understand how it is possible

“to demonstrate that one can specify statistical models using probabilistic assumptions based exclusively on the observable stochastic process underlying the observed data, and not the unobservable error process!”

For example, it seems that ‘underlying’ and ‘observable’ are immediately in tension.

Furthermore there seem to be numerous statistical models that can adequately capture the same ‘information’ in a given dataset, leaving the choices somewhat arbitrary.

Comment by omaclaren— 19 December, 2016 #

omaclaren: The basic idea is that a modeler begins with substantive subject matter information, which is framed into a structural model M_{ϕ}(z), that demarcates the crucial aspects of the phenomenon of interest by choosing the relevant data Z₀. The traditional perspective specifies the associated statistical model indirectly by assigning the probabilistic structure via structural error terms. By disentangling the statistical from the structural model, the proposed perspective suggests specifying the statistical model M_{θ}(z) directly by viewing it as a particular parameterization of the observable vector stochastic process {Z_{t}, t∈N} underlying the data Z₀. The specification of M_{θ}(z) has two primary aims in mind: (a) to account for the systematic statistical information in data Z₀, and (b) to adopt a parameterization for M_{θ}(z) that parametrically nests M_{ϕ}(z), so that one can pose the substantive questions of interest to data Z₀.

On Michael’s point about the traditional argument “while the keys may not be under the street light, searching under the street light makes optimal use of the illumination available”, the chance regularity patterns exhibited by the data provide a very different light on the form and probabilistic structure of the statistical model. They represent the systematic statistical information which the statistical model should account for in order to secure the reliability of any inference based on it.

In relation to his question “In your view, should we take this as evidence of a statistical modeling inadequacy, a substantive modeling inadequacy, or a political/institutional modeling inadequacy?”

My answer is all of the above, but to be able to separate the different sources of misspecification one needs to begin with a statistically adequate model so as to be able to use reliable statistical procedures to sort out the different sources.
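One way to see what probing statistical adequacy might look like in practice is an auxiliary regression that checks the independence assumption of a fitted linear regression. The sketch below is an illustration only and is not presented as the commenter's own procedure: the data-generating process, the AR(1) form of the dependence, and the helper names `lag_t_ratio` and `make_data` are all hypothetical.

```python
import numpy as np


def lag_t_ratio(y, x):
    """Fit y = c0 + c1*x + u by least squares, then regress the residuals on
    (1, x, lagged residual) and return the t-ratio of the lagged-residual term.
    A large absolute value suggests the independence assumption is invalid."""
    n = len(y)
    X = np.column_stack([np.ones(n), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    u = y - X @ coef

    # Auxiliary regression: current residual on regressors and lagged residual.
    A = np.column_stack([np.ones(n - 1), x[1:], u[:-1]])
    g, *_ = np.linalg.lstsq(A, u[1:], rcond=None)
    resid = u[1:] - A @ g
    sigma2 = resid @ resid / (A.shape[0] - A.shape[1])
    cov = sigma2 * np.linalg.inv(A.T @ A)
    return g[2] / np.sqrt(cov[2, 2])


def make_data(rho, n=500, seed=0):
    """Hypothetical data: y = 1 + 2x + u, with u following an AR(1) process."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    e = rng.normal(size=n)
    u = np.empty(n)
    u[0] = e[0]
    for t in range(1, n):
        u[t] = rho * u[t - 1] + e[t]
    return 1.0 + 2.0 * x + u, x


if __name__ == "__main__":
    for rho in (0.0, 0.7):
        y, x = make_data(rho)
        print(f"rho = {rho}: t-ratio on lagged residual = {lag_t_ratio(y, x):.2f}")
```

When the errors are independent the t-ratio on the lagged residual is unremarkable, and when they are strongly dependent it is far from zero. A check of this kind only probes one assumption in one direction; a fuller assessment would examine the other probabilistic assumptions as well.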

Comment by Aris Spanos— 19 December, 2016 #

Thanks for the clarification. I have further questions, but will leave them for another day/another blog!

Comment by omaclaren— 20 December, 2016 #

I have answered most of Keynes’s criticisms in several papers, but here is a short piece from my “Revisiting Haavelmo’s Structural Econometrics: Bridging the Gap between Theory and Data”, Journal of Economic Methodology 22 (2), 154-175, where I touch on several of the issues raised by Keynes, giving credit to Haavelmo:

“Keynes (1939) was highly critical of Tinbergen’s (1939) work on the business cycle. Among other issues (Hendry and Morgan, 1995, Boumans and Davis, 2010), he raised a number of questions about the appropriateness of statistical inference generally, and the use of linear regression, in economic modeling, including:

(i) whether a statistical test can prove or disprove a theory,

(ii) the need to include a complete list of all the relevant variables at the outset,

(iii) the descriptive vs. the inductive dimension of empirical inference results,

(iv) the applicability of statistical techniques to nonexperimental data that exhibit heterogeneity and dependence.

The conventional wisdom in economics at the time was that the statistical techniques pioneered by Fisher were only applicable to experimental data that satisfy the IID assumptions; see Frisch (1934), p. 6. Haavelmo’s (1943b) assessment at the time:

“Since the days when Yule (1926) ‘discovered’ that correlation between time series might be ‘nonsense’, very few economists have dared break the ban on time series as an object of statistical inference.” (p. 13)

Haavelmo went on to criticize Keynes by arguing that, instead of leveling legitimate criticisms against Tinbergen “… for his short cuts in statistical method, for his omission of rigorous formulation of the probability models involved and the statistical hypotheses to be tested,” his critics complained “that Tinbergen had tried to go too far in using statistical methods; that inference of this sort was inferior if not worthless, compared with the noble art of theoretical deductions based on ‘general economic considerations’.” (p. 13)

Haavelmo went on to make a case that the issues (i)-(iv) raised by Keynes can only be addressed by statistical methods in the context of rigorously formulated statistical models, built on the joint distribution of the observable stochastic processes underlying the data. He reiterated that more fully in his monograph:

“… it is not necessary that the observations should be independent and that they should follow the same one-dimensional probability law. It is sufficient to assume that the whole set of, say n, observations may be considered as one observation of n variables (or a “sample point”) following an n-dimensional joint probability law, the “existence” of which may be purely hypothetical.” (Haavelmo, 1944, pp. iii-iv)

It is interesting to note that, in addition to Haavelmo, the pioneers of that period like Frisch and Tinbergen, as well as the Cowles Commission group, were well informed about the new developments in both statistics and probability. Nevertheless, a closer look at their published work (Koopmans, 1950) reveals a tension in reconciling the intrinsic stochasticity (stemming from the process {Z_{t}, t∈N} underlying the data Z₀) of the F-N-P perspective, with the PET perspective, where the statistical premises are specified by attaching stochastic error terms onto deterministic theory models. This tension is clearly exemplified in Koopmans’s (1938) tentative attempt to reconcile Fisher’s and Frisch’s perspectives on linear regression.

By the 1960s, however, textbook econometrics appears to have largely settled the tension by reverting to the pre-Fisher modeling perspective of Gauss-Laplace curve-fitting, where the probabilistic structure enters the modeling in a non-intrinsic way through error terms that represent errors of measurement, errors of approximation, ‘random’ omitted effects or stochastic shocks. The publication of two highly influential textbooks by Johnston (1963) and Goldberger (1964) rendered the linear model, as it relates to the Gauss-Markov theorem, the cornerstone of modern textbook econometrics. By encouraging ‘weak’ probabilistic assumptions in terms of the error term, supplemented with substantive assumptions like ‘no omitted variables’, the textbook approach renders model validation more or less impossible.

Conflating substantive with statistical assumptions is not a new problem. A glance at the above list of criticisms leveled by Keynes (1939) reveals that items (i)-(ii) pertain to substantive and (iii)-(iv) to statistical adequacy. Haavelmo (1943b) replied to Keynes’ charge (ii) by articulating incisive intuition: “… it is legitimate to try out a regression equation in the data even if the equation should not contain a “complete list” of causes.” (p. 15).

Granted, he did not give a thorough answer by arguing that “no omitted variables” is a substantive and not a statistical assumption (see table 1), but his phrasing ‘it is legitimate to try out’ conveys the correct attitude.”

Comment by Aris Spanos— 18 December, 2016 #