At least since the time of Keynes’s famous critique of Tinbergen’s econometric methods, those of us in the social science community who have been unpolite enough to dare questioning the preferred methods and models applied in quantitive research in general and economics more specifically, are as a rule met with disapproval. Although people seem to get very agitated and upset by the critique — just read the commentaries on this blog if you don’t believe me — defenders of ‘received theory’ always say that the critique is ‘nothing new,’ that they have always been ‘well aware’ of the problems, and so on, and so on.
So, for the benefit of all mindless practitioners of econometrics and statistics — who don’t want to be disturbed in their doings — eminent mathematical statistician David Freedman has put together a very practical list of vacuous responses to criticism that can be freely used to save your peace of mind:
We know all that. Nothing is perfect … The assumptions are reasonable. The assumptions don’t matter. The assumptions are conservative. You can’t prove the assumptions are wrong. The biases will cancel. We can model the biases. We’re only doing what evereybody else does. Now we use more sophisticated techniques. If we don’t do it, someone else will. What would you do? The decision-maker has to be better off with us than without us … The models aren’t totally useless. You have to do the best you can with the data. You have to make assumptions in order to make progress. You have to give the models the benefit of the doubt. Where’s the harm?
There have been over four decades of econometric research on business cycles … The formalization has undeniably improved the scientific strength of business cycle measures …
But the significance of the formalization becomes more difficult to identify when it is assessed from the applied perspective, especially when the success rate in ex-ante forecasts of recessions is used as a key criterion. The fact that the onset of the 2008 financial-crisis-triggered recession was predicted by only a few ‘Wise Owls’ … while missed by regular forecasters armed with various models serves us as the latest warning that the efficiency of the formalization might be far from optimal. Remarkably, not only has the performance of time-series data-driven econometric models been off the track this time, so has that of the whole bunch of theory-rich macro dynamic models developed in the wake of the rational expectations movement, which derived its fame mainly from exploiting the forecast failures of the macro-econometric models of the mid-1970s recession.
The limits of econometric forecasting has, as noted by Qin, been critically pointed out many times before.
Trygve Haavelmo — with the completion (in 1958) of the twenty-fifth volume of Econometrica — assessed the the role of econometrics in the advancement of economics, and although mainly positive of the “repair work” and “clearing-up work” done, Haavelmo also found some grounds for despair:
We have found certain general principles which would seem to make good sense. Essentially, these principles are based on the reasonable idea that, if an economic model is in fact “correct” or “true,” we can say something a priori about the way in which the data emerging from it must behave. We can say something, a priori, about whether it is theoretically possible to estimate the parameters involved. And we can decide, a priori, what the proper estimation procedure should be … But the concrete results of these efforts have often been a seemingly lower degree of accuracy of the would-be economic laws (i.e., larger residuals), or coefficients that seem a priori less reasonable than those obtained by using cruder or clearly inconsistent methods.
There is the possibility that the more stringent methods we have been striving to develop have actually opened our eyes to recognize a plain fact: viz., that the “laws” of economics are not very accurate in the sense of a close fit, and that we have been living in a dream-world of large but somewhat superficial or spurious correlations.
And as the quote below shows, even Ragnar Frisch shared some of Haavelmo’s — and Keynes’s — doubts on the applicability of econometrics:
I have personally always been skeptical of the possibility of making macroeconomic predictions about the development that will follow on the basis of given initial conditions … I have believed that the analytical work will give higher yields – now and in the near future – if they become applied in macroeconomic decision models where the line of thought is the following: “If this or that policy is made, and these conditions are met in the period under consideration, probably a tendency to go in this or that direction is created”.
Econometrics may be an informative tool for research. But if its practitioners do not investigate and make an effort of providing a justification for the credibility of the assumptions on which they erect their building, it will not fulfill its tasks. There is a gap between its aspirations and its accomplishments, and without more supportive evidence to substantiate its claims, critics will continue to consider its ultimate argument as a mixture of rather unhelpful metaphors and metaphysics. Maintaining that economics is a science in the “true knowledge” business, I remain a skeptic of the pretences and aspirations of econometrics. So far, I cannot really see that it has yielded very much in terms of relevant, interesting economic knowledge. And, more specifically, when it comes to forecasting activities, the results have been bleak indeed.
The marginal return on its ever higher technical sophistication in no way makes up for the lack of serious under-labouring of its deeper philosophical and methodological foundations that already Keynes complained about. The rather one-sided emphasis of usefulness and its concomitant instrumentalist justification cannot hide that the legions of probabilistic econometricians who give supportive evidence for their considering it “fruitful to believe” in the possibility of treating unique economic data as the observable results of random drawings from an imaginary sampling of an imaginary population, are scating on thin ice. After having analyzed some of its ontological and epistemological foundations, I cannot but conclude that econometrics on the whole has not delivered “truth,” nor robust forecasts. And I doubt if it has ever been the intention of its main protagonists.
Our admiration for technical virtuosity should not blind us to the fact that we have to have a more cautious attitude towards probabilistic inference of causality in economic contexts. Science should help us penetrate to — as Keynes put it — “the true process of causation lying behind current events” and disclose “the causal forces behind the apparent facts.” We should look out for causal relations, but econometrics can never be more than a starting point in that endeavour, since econometric (statistical) explanations are not explanations in terms of mechanisms, powers, capacities or causes. Firmly stuck in an empiricist tradition, econometrics is only concerned with the measurable aspects of reality, But there is always the possibility that there are other variables – of vital importance and although perhaps unobservable and non-additive not necessarily epistemologically inaccessible – that were not considered for the model. Those who were can hence never be guaranteed to be more than potential causes, and not real causes.
A rigorous application of econometric methods in economics really presupposes that the phenomena of our real world economies are ruled by stable causal relations between variables. A perusal of the leading econom(etr)ic journals shows that most econometricians still concentrate on fixed parameter models and that parameter-values estimated in specific spatio-temporal contexts are presupposed to be exportable to totally different contexts. To warrant this assumption one, however, has to convincingly establish that the targeted acting causes are stable and invariant so that they maintain their parametric status after the bridging. The endemic lack of predictive success of the econometric project indicates that this hope of finding fixed parameters is a hope for which there really is no other ground than hope itself.
This is a more fundamental and radical problem than the celebrated “Lucas critique” have suggested. This is not the question if deep parameters, absent on the macro-level, exist in “tastes” and “technology” on the micro-level. It goes deeper. Real world social systems are not governed by stable causal mechanisms or capacities. It is the criticism that Keynes — in Essays in Biography — first launched against econometrics and inferential statistics already in the 1920s:
The atomic hypothesis which has worked so splendidly in Physics breaks down in Psychics. We are faced at every turn with the problems of Organic Unity, of Discreteness, of Discontinuity – the whole is not equal to the sum of the parts, comparisons of quantity fails us, small changes produce large effects, the assumptions of a uniform and homogeneous continuum are not satisfied. Thus the results of Mathematical Psychics turn out to be derivative, not fundamental, indexes, not measurements, first approximations at the best; and fallible indexes, dubious approximations at that, with much doubt added as to what, if anything, they are indexes or approximations of.
The kinds of laws and relations that econom(etr)ics has established, are laws and relations about entities in models that presuppose causal mechanisms being atomistic and additive. When causal mechanisms operate in real world social target systems they only do it in ever-changing and unstable combinations where the whole is more than a mechanical sum of parts. If economic regularities obtain they do it (as a rule) only because we engineered them for that purpose. Outside man-made “nomological machines” they are rare, or even non-existant. Unfortunately that also makes most of the achievements of econometrics – as most of contemporary endeavours of economic theoretical modeling – rather useless.
Roland Fryer, an economics professor at Harvard University, recently published a working paper at NBER on the topic of racial bias in police use of force and police shootings. The paper gained substantial media attention – a write-up of it became the top viewed article on the New York Times website. The most notable part of the study was its finding that there was no evidence of racial bias in police shootings, which Fryer called “the most surprising result of [his] career”. In his analysis of shootings in Houston, Texas, black and Hispanic people were no more likely (and perhaps even less likely) to be shot relative to whites.
Fryer’s analysis is highly flawed, however … Fryer was not comparing rates of police shootings by race. Instead, his research asked whether these racial differences were the result of “racial bias” rather than merely “statistical discrimination”. Both terms have specific meanings in economics. Statistical discrimination occurs when an individual or institution treats people differently based on racial stereotypes that ‘truly’ reflect the average behavior of a racial group. For instance, if a city’s black drivers are 50% more likely to possess drugs than white drivers, and police officers are 50% more likely to pull over black drivers, economic theory would hold that this discriminatory policing is rational …
Once explained, it is possible to find the idea of “statistical discrimination” just as abhorrent as “racial bias”. One could point out that the drug laws police enforce were passed with racially discriminatory intent, that collectively punishing black people based on “average behavior” is wrong, or that – as a self-fulfilling prophecy – bias can turn into statistical discrimination (if black people’s cars are searched more thoroughly, for instance, it will appear that their rates of drug possession are higher) …
Even if one accepts the logic of statistical discrimination versus racial bias, it is an inappropriate choice for a study of police shootings. The method that Fryer employs has, for the most part, been used to study traffic stops and stop-and-frisk practices. In those cases, economic theory holds that police want to maximize the number of arrests for the possession of contraband (such as drugs or weapons) while expending the fewest resources. If they are acting in the most cost-efficient, rational manner, the officers may use racial stereotypes to increase the arrest rate per stop. This theory completely falls apart for police shootings, however, because officers are not trying to rationally maximize the number of shootings …
Economic theory aside, there is an even more fundamental problem with the Houston police shooting analysis. In a typical study, a researcher will start with a previously defined population where each individual is at risk of a particular outcome. For instance, a population of drivers stopped by police can have one of two outcomes: they can be arrested, or they can be sent on their way. Instead of following this standard approach, Fryer constructs a fictitious population of people who are shot by police and people who are arrested. The problem here is that these two groups (those shot and those arrested) are, in all likelihood, systematically different from one another in ways that cannot be controlled for statistically … Properly interpreted, the actual result from Fryer’s analysis is that the racial disparity in arrest rates is larger than the racial disparity in police shootings. This is an unsurprising finding, and proves neither a lack of bias nor a lack of systematic discrimination.
The assumption of additivity and linearity means that the outcome variable is, in reality, linearly related to any predictors … and that if you have several predictors then their combined effect is best described by adding their effects together …
This assumption is the most important because if it is not true then even if all other assumptions are met, your model is invalid because you have described it incorrectly. It’s a bit like calling your pet cat a dog: you can try to get it to go in a kennel, or to fetch sticks, or to sit when you tell it to, but don’t be surprised when its behaviour isn’t what you expect because even though you’ve called it a dog, it is in fact a cat. Similarly, if you have described your statistical model inaccurately it won’t behave itself and there’s no point in interpreting its parameter estimates or worrying about significance tests of confidence intervals: the model is wrong.
Econometricians usually think that the data generating process (DGP) always can be modelled properly using a probability measure. The argument is standardly based on the assumption that the right sampling procedure ensures there will always be an appropriate probability measure. But – as always – one really has to argue the case, and present warranted evidence that real-world features are correctly described by some probability measure.
There are no such things as free-standing probabilities – simply because probabilities are strictly seen only defined relative to chance set-ups – probabilistic nomological machines like flipping coins or roulette-wheels. And even these machines can be tricky to handle. Although prob(fair coin lands heads|I toss it) = prob(fair coin lands head & I toss it)|prob(fair coin lands heads) may be well-defined, it’s not certain we can use it, since we cannot define the probability that I will toss the coin given the fact that I am not a nomological machine producing coin tosses.
No nomological machine – no probability.
A chance set-up is a nomological machine for probabilistic laws, and our description of it is a model that works in the same way as a model for deterministic laws … A situation must be like the model both positively and negatively – it must have all the characteristics featured in the model and it must have no significant interventions to prevent it operating as envisaged – before we can expect repeated trials to give rise to events appropriately described by the corresponding probability …
Probabilities attach to the world via models, models that serve as blueprints for a chance set-up – i.e., for a probability-generating machine … Once we review how probabilities are associated with very special kinds of models before they are linked to the world, both in probability theory itself and in empirical theories like physics and economics, we will no longer be tempted to suppose that just any situation can be described by some probability distribution or other. It takes a very special kind of situation withe the arrangements set just right – and not interfered with – before a probabilistic law can arise …
Probabilities are generated by chance set-ups, and their characterisation necessarily refers back to the chance set-up that gives rise to them. We can make sense of probability of drawing two red balls in a row from an urn of a certain composition with replacement; but we cannot make sense of the probability of six per cent inflation in the United Kingdom next year without an implicit reference to a specific social and institutional structure that will serve as the chance set-up that generates this probability.
What is 0.999 …, really?
It appears to refer to a kind of sum:
.9 + + 0.09 + 0.009 + 0.0009 + …
But what does that mean? That pesky ellipsis is the real problem. There can be no controversy about what it means to add up two, or three, or a hundred numbers. But infinitely many? That’s a different story. In the real world, you can never have infinitely many heaps. What’s the numerical value of an infinite sum? It doesn’t have one — until we give it one. That was the great innovation of Augustin-Louis Cauchy, who introduced the notion of limit into calculus in the 1820s.
The British number theorist G. H. Hardy … explains it best: “It is broadly true to say that mathematicians before Cauchy asked not, ‘How shall we define 1 – 1 – 1 + 1 – 1 …’ but ‘What is 1 -1 + 1 – 1 + …?'”
No matter how tight a cordon we draw around the number 1, the sum will eventually, after some finite number of steps, penetrate it, and never leave. Under those circumstances, Cauchy said, we should simply define the value of the infinite sum to be 1.
I have no problem with solving problems in mathematics by defining them away. But how about the real world? Maybe that ought to be a question to consider even for economists all to fond of uncritically following the mathematical way when applying their models to the real world, where indeed ‘you can never have infinitely many heaps’ …
In econometrics we often run into the ‘Cauchy logic’ —the data is treated as if it were from a larger population, a ‘superpopulation’ where repeated realizations of the data are imagined. Just imagine there could be more worlds than the one we live in and the problem is fixed …
Accepting Haavelmo’s domain of probability theory and sample space of infinite populations – just as Fisher’s ‘hypothetical infinite population,’ of which the actual data are regarded as constituting a random sample, von Mises’s ‘collective’ or Gibbs’s ‘ensemble’ – also implies that judgments are made on the basis of observations that are actually never made!
Infinitely repeated trials or samplings never take place in the real world. So that cannot be a sound inductive basis for a science with aspirations of explaining real-world socio-economic processes, structures or events. It’s — just as the Cauchy mathematical logic of defining away problems — not tenable.
In economics it’s always wise to remember C. S. Peirce’s remark that universes are not as common as peanuts …
In order to make causal inferences from simple regression, it is now conventional to assume something like the setting in equation (1) … The equation makes very strong invariance assumptions, which cannot be tested from data on X and Y.
(1) Y = a + bx + δ
What happens without invariance? The answer will be obvious. If intervention changes the intercept a, the slope b, or the mean of the error distribution, the impact of the intervention becomes difficult to determine. If the variance of the error term is changed, the usual confidence intervals lose their meaning.
How would any of this be possible? Suppose, for instance, that — unbeknownst to the statistician — X and Y are both the effects of a common cause operating through linear statistical laws like (1). Suppose errors are independent and normal, while Nature randomizes the common cause to have a normal distribution. The scatter diagram will look lovely, a regression line is easily fitted, and the straightforward causal interpretation will be wrong.
Since econometrics doesn’t content itself with only making ‘optimal predictions’ but also aspires to explain things in terms of causes and effects, econometricians need loads of assumptions. And invariance is not the only limiting assumption that has to be made. Equally important are the ‘atomistic’ assumptions of additivity and linearity.
Limiting model assumptions in economic science always have to be closely examined since if we are going to be able to show that the mechanisms or causes that we isolate and handle in our models are stable in the sense that they do not change when we ‘export’ them to our ‘target systems,’ we have to be able to show that they do not only hold under ceteris paribus conditions and a fortiori only are of limited value to our understanding, explanations or predictions of real economic systems.
The kind of fundamental assumption about the character of material laws, on which scientists appear commonly to act, seems to me to be much less simple than the bare principle of uniformity. They appear to assume something much more like what mathematicians call the principle of the superposition of small effects, or, as I prefer to call it, in this connection, the atomic character of natural law. The system of the material universe must consist, if this kind of assumption is warranted, of bodies which we may term (without any implication as to their size being conveyed thereby) legal atoms, such that each of them exercises its own separate, independent, and invariable effect, a change of the total state being compounded of a number of separate changes each of which is solely due to a separate portion of the preceding state. We do not have an invariable relation between particular bodies, but nevertheless each has on the others its own separate and invariable effect, which does not change with changing circumstances, although, of course, the total effect may be changed to almost any extent if all the other accompanying causes are different. Each atom can, according to this theory, be treated as a separate cause and does not enter into different organic combinations in each of which it is regulated by different laws …
The scientist wishes, in fact, to assume that the occurrence of a phenomenon which has appeared as part of a more complex phenomenon, may be some reason for expecting it to be associated on another occasion with part of the same complex. Yet if different wholes were subject to laws qua wholes and not simply on account of and in proportion to the differences of their parts, knowledge of a part could not lead, it would seem, even to presumptive or probable knowledge as to its association with other parts. Given, on the other hand, a number of legally atomic units and the laws connecting them, it would be possible to deduce their effects pro tanto without an exhaustive knowledge of all the coexisting circumstances.
Econometrics may be an informative tool for research. But if its practitioners do not investigate and make an effort of providing a justification for the credibility of the assumptions on which they erect their building, it will not fulfill its tasks. There is a gap between its aspirations and its accomplishments, and without more supportive evidence to substantiate its claims, critics like Keynes — and yours truly — will continue to consider its ultimate argument as a mixture of rather unhelpful metaphors and metaphysics.
The marginal return on its ever higher technical sophistication in no way makes up for the lack of serious under-labouring of its deeper philosophical and methodological foundations that already Keynes complained about. Firmly stuck in an empiricist tradition, econometrics is only concerned with the measurable aspects of reality, and a rigorous application of econometric methods in economics really presupposes that the phenomena of our real world economies are ruled by stable causal relations.
Unfortunately, real world social systems are usually not governed by stable causal mechanisms or capacities. The kinds of ‘laws’ and relations that econometrics has established, are laws and relations about entities in models that presuppose causal mechanisms being invariant, atomistic and additive. But — when causal mechanisms operate in the real world they only do it in ever-changing and unstable combinations where the whole is more than a mechanical sum of parts. If economic regularities obtain they do it as a rule only because we engineered them for that purpose. Outside man-made ‘nomological machines’ they are rare, or even non-existant.
Although Bayesians think otherwise, to me there’s nothing magical about Bayes’ theorem. The important thing in science is for you to have strong evidence. If your evidence is strong, then applying Bayesian probability calculus is rather unproblematic. Otherwise — garbage in, garbage out. Applying Bayesian probability calculus to subjective beliefs founded on weak evidence is not a recipe for scientific akribi and progress.
Neoclassical economics nowadays usually assumes that agents that have to make choices under conditions of uncertainty behave according to Bayesian rules — that is, they maximize expected utility with respect to some subjective probability measure that is continually updated according to Bayes’ theorem.
Bayesianism reduces questions of rationality to questions of internal consistency (coherence) of beliefs, but – even granted this questionable reductionism – do rational agents really have to be Bayesian? As I have been arguing repeatedly over the years, there is no strong warrant for believing so.
In many of the situations that are relevant to economics one could argue that there is simply not enough of adequate and relevant information to ground beliefs of a probabilistic kind, and that in those situations it is not really possible, in any relevant way, to represent an individual’s beliefs in a single probability measure.
Bayesianism cannot distinguish between symmetry-based probabilities from information and symmetry-based probabilities from an absence of information. In these kinds of situations most of us would rather say that it is simply irrational to be a Bayesian and better instead to admit that we “simply do not know” or that we feel ambiguous and undecided. Arbitrary an ungrounded probability claims are more irrational than being undecided in face of genuine uncertainty, so if there is not sufficient information to ground a probability distribution it is better to acknowledge that simpliciter, rather than pretending to possess a certitude that we simply do not possess.
So, why then are so many scientists nowadays so fond of Bayesianism? I guess one strong reason is that Bayes’ theorem gives them a seemingly fast, simple and rigorous answer to their problems and hypotheses. But, as already Popper showed back in the 1950’s, the Bayesian probability (likelihood) version of confirmation theory is ‘absurd on both formal and intuitive grounds: it leads to self-contradiction.’
In a recent issue of Real World Economics Review there was a rather interesting, if somewhat dense, article by Judea Pearl and Bryant Chen entitled Regression and Causation: A Critical Examination of Six Econometrics Textbooks …
The paper appears to turn on a single dichotomy. The authors point out that there is a substantial difference between what they refer to as the “conditional-based expectation” and “interventionist-based expectation”. The first is given the notation:
While the second is given the notation:
The difference between these two relationships is enormous. The first notation — that is, the “conditional-based expectation” — basically means that the value Y is statistically dependent on the value X …
The second notation — that is, the “interventionist-based expectation” — refers to something else entirely. It means that the value Y is causally dependent on the value X …
Now, if we simply go out and take a statistical measure of earnings and expected performance we will find a certain relationship — this will be the conditional-based expectation and it will be purely a statistical relationship.
If, however, we take a group of employees and raise their earnings, X, by a given amount will we see the same increase in performance, Y, as we would expect from a study of the past statistics? Obviously not. This example, of course, is the interventionist-based expectation and is indicative of a causal relationship between the variables …
In economics we are mainly interested in causal rather than statistical relationships. If we want to estimate, for example, the multiplier, it is from a causal rather than a statistical point-of-view. Yet the training that many students receive leads to confusion in this regard. Indeed, we may go one further and ask whether such a confusion also sits in the mind of the textbook writers themselves.
This confusion between statistical relationships and causal ones has long been a problem in econometrics. Keynes, for example, writing his criticism of the econometric method in his seminal paper Professor Tinbergen’s Method noted that Tinbergen had made precisely this error …
The question then arises: why, after over 70 years, are econometrics textbooks engaged in the same oversights and vaguenesses as some of the pioneering studies in the field? I think there is a simple explanation for this. Namely, that if econometricians were to be clear about the distinction between statistical and causal relations it would become obvious rather quickly that the discipline holds far less worth for economists than it is currently thought to possess.
For my own take on the issues raised by Pearl and Chen see here.