Econometric beasts of bias27 October, 2015 at 11:26 | Posted in Statistics & Econometrics | 1 Comment
In an article posted earlier on this blog — What are the key assumptions of linear regression models? — yours truly tried to argue that since econometrics doesn’t content itself with only making “optimal” predictions,” but also aspires to explain things in terms of causes and effects, econometricians need loads of assumptions — and that most important of these are additivity and linearity.
Let me take the opportunity to elaborate a little more on why I find these assumptions of such paramount importance and ought to be much more argued for — on both epistemological and ontological grounds — if at all being used.
Limiting model assumptions in economic science always have to be closely examined since if we are going to be able to show that the mechanisms or causes that we isolate and handle in our models are stable in the sense that they do not change when we “export” them to our “target systems”, we have to be able to show that they do not only hold under ceteris paribus conditions and a fortiori only are of limited value to our understanding, explanations or predictions of real economic systems. As the always eminently quotable Keynes wrote (emphasis added) in Treatise on Probability (1921):
The kind of fundamental assumption about the character of material laws, on which scientists appear commonly to act, seems to me to be [that] the system of the material universe must consist of bodies … such that each of them exercises its own separate, independent, and invariable effect, a change of the total state being compounded of a number of separate changes each of which is solely due to a separate portion of the preceding state … Yet there might well be quite different laws for wholes of different degrees of complexity, and laws of connection between complexes which could not be stated in terms of laws connecting individual parts … If different wholes were subject to different laws qua wholes and not simply on account of and in proportion to the differences of their parts, knowledge of a part could not lead, it would seem, even to presumptive or probable knowledge as to its association with other parts … These considerations do not show us a way by which we can justify induction … /427 No one supposes that a good induction can be arrived at merely by counting cases. The business of strengthening the argument chiefly consists in determining whether the alleged association is stable, when accompanying conditions are varied … /468 In my judgment, the practical usefulness of those modes of inference … on which the boasted knowledge of modern science depends, can only exist … if the universe of phenomena does in fact present those peculiar characteristics of atomism and limited variety which appears more and more clearly as the ultimate result to which material science is tending.
Econometrics may be an informative tool for research. But if its practitioners do not investigate and make an effort of providing a justification for the credibility of the assumptions on which they erect their building, it will not fulfill its tasks. There is a gap between its aspirations and its accomplishments, and without more supportive evidence to substantiate its claims, critics will continue to consider its ultimate argument as a mixture of rather unhelpful metaphors and metaphysics. Maintaining that economics is a science in the “true knowledge” business, yours truly remains a skeptic of the pretences and aspirations of econometrics. So far, I cannot really see that it has yielded very much in terms of relevant, interesting economic knowledge.
The marginal return on its ever higher technical sophistication in no way makes up for the lack of serious under-labouring of its deeper philosophical and methodological foundations that already Keynes complained about. The rather one-sided emphasis of usefulness and its concomitant instrumentalist justification cannot hide that neither Haavelmo, nor the legions of probabilistic econometricians following in his footsteps, give supportive evidence for their considering it “fruitful to believe” in the possibility of treating unique economic data as the observable results of random drawings from an imaginary sampling of an imaginary population. After having analyzed some of its ontological and epistemological foundations, I cannot but conclude that econometrics on the whole has not delivered “truth”. And I doubt if it has ever been the intention of its main protagonists.
Our admiration for technical virtuosity should not blind us to the fact that we have to have a cautious attitude towards probabilistic inferences in economic contexts. Science should help us penetrate to “the true process of causation lying behind current events” and disclose “the causal forces behind the apparent facts” [Keynes 1971-89 vol XVII:427]. We should look out for causal relations, but econometrics can never be more than a starting point in that endeavour, since econometric (statistical) explanations are not explanations in terms of mechanisms, powers, capacities or causes. Firmly stuck in an empiricist tradition, econometrics is only concerned with the measurable aspects of reality, But there is always the possibility that there are other variables – of vital importance and although perhaps unobservable and non-additive, not necessarily epistemologically inaccessible – that were not considered for the model. Those who were can hence never be guaranteed to be more than potential causes, and not real causes. A rigorous application of econometric methods in economics really presupposes that the phenomena of our real world economies are ruled by stable causal relations between variables. A perusal of the leading econom(etr)ic journals shows that most econometricians still concentrate on fixed parameter models and that parameter-values estimated in specific spatio-temporal contexts are presupposed to be exportable to totally different contexts. To warrant this assumption one, however, has to convincingly establish that the targeted acting causes are stable and invariant so that they maintain their parametric status after the bridging. The endemic lack of predictive success of the econometric project indicates that this hope of finding fixed parameters is a hope for which there really is no other ground than hope itself.
Real world social systems are not governed by stable causal mechanisms or capacities. As Keynes wrote in his critique of econometrics and inferential statistics already in the 1920s (emphasis added):
The atomic hypothesis which has worked so splendidly in Physics breaks down in Psychics. We are faced at every turn with the problems of Organic Unity, of Discreteness, of Discontinuity – the whole is not equal to the sum of the parts, comparisons of quantity fails us, small changes produce large effects, the assumptions of a uniform and homogeneous continuum are not satisfied. Thus the results of Mathematical Psychics turn out to be derivative, not fundamental, indexes, not measurements, first approximations at the best; and fallible indexes, dubious approximations at that, with much doubt added as to what, if anything, they are indexes or approximations of.
The kinds of “laws” and relations that econometrics has established, are laws and relations about entities in models that presuppose causal mechanisms being atomistic and additive. When causal mechanisms operate in real world social target systems they only do it in ever-changing and unstable combinations where the whole is more than a mechanical sum of parts. If economic regularities obtain they do it (as a rule) only because we engineered them for that purpose. Outside man-made “nomological machines” they are rare, or even non-existant. Unfortunately that also makes most of the achievements of econometrics – as most of contemporary endeavours of mainstream economic theoretical modeling – rather useless.
Following our recent post on econometricians’ traditional privileging of unbiased estimates, there were a bunch of comments echoing the challenge of teaching this topic, as students as well as practitioners often seem to want the comfort of an absolute standard such as best linear unbiased estimate or whatever. Commenters also discussed the tradeoff between bias and variance, and the idea that unbiased estimates can overfit the data.
I agree with all these things but I just wanted to raise one more point: In realistic settings, unbiased estimates simply don’t exist. In the real world we have nonrandom samples, measurement error, nonadditivity, nonlinearity, etc etc etc.
So forget about it. We’re living in the real world …
It’s my impression that many practitioners in applied econometrics and statistics think of their estimation choice kinda like this:
1. The unbiased estimate. It’s the safe choice, maybe a bit boring and maybe not the most efficient use of the data, but you can trust it and it gets the job done.
2. A biased estimate. Something flashy, maybe Bayesian, maybe not, it might do better but it’s risky. In using the biased estimate, you’re stepping off base—the more the bias, the larger your lead—and you might well get picked off …
If you take the choice above and combine it with the unofficial rule that statistical significance is taken as proof of correctness (in econ, this would also require demonstrating that the result holds under some alternative model specifications, but “p less than .05″ is still key), then you get the following decision rule:
A. Go with the safe, unbiased estimate. If it’s statistically significant, run some robustness checks and, if the result doesn’t go away, stop.
B. If you don’t succeed with A, you can try something fancier. But . . . if you do that, everyone will know that you tried plan A and it didn’t work, so people won’t trust your finding.
So, in a sort of Gresham’s Law, all that remains is the unbiased estimate. But, hey, it’s safe, conservative, etc, right?
And that’s where the present post comes in. My point is that the unbiased estimate does not exist! There is no safe harbor. Just as we can never get our personal risks in life down to zero … there is no such thing as unbiasedness. And it’s a good thing, too: recognition of this point frees us to do better things with our data right away.