Suppose you are concerned with determining what the most visited parks in a city are. One idea is to take a momentary snapshot: to see how many people are this moment in park A, how many are in park B and so on. Another idea is to look at one individual (or few of them) and to follow him for a certain period of time, e.g. a year. Then, you observe how often the individual is going to park A, how often he is going to park B and so on.
Thus, you obtain two different results: one statistical analysis over the entire ensemble of people at a certain moment in time, and one statistical analysis for one person over a certain period of time. The first one may not be representative for a longer period of time, while the second one may not be representative for all the people. The idea is that an ensemble is ergodic if the two types of statistics give the same result. Many ensembles, like the human populations, are not ergodic.
The importance of ergodicity becomes manifest when you think about how we all infer various things, how we draw some conclusion about something while having information about something else. For example, one goes once to a restaurant and likes the fish and next time he goes to the same restaurant and orders chicken, confident that the chicken will be good. Why is he confident? Or one observes that a newspaper has printed some inaccurate information at one point in time and infers that the newspaper is going to publish inaccurate information in the future. Why are these inferences ok, while others such as “more crimes are committed by black persons than by white persons, therefore each individual black person is not to be trusted” are not ok?
The answer is that the ensemble of articles published in a newspaper is more or less ergodic, while the ensemble of black people is not at all ergodic. If one searches how many mistakes appear in an entire newspaper in one issue, and then searches how many mistakes one news editor does over time, one finds the two results almost identical (not exactly, but nonetheless approximately equal). However, if one takes the number of crimes committed by black people in a certain day divided by the total number of black people, and then follows one random-picked black individual over his life, one would not find that, e.g. each month, this individual commits crimes at the same rate as the crime rate determined over the entire ensemble. Thus, one cannot use ensemble statistics to properly infer what is and what is not probable that a certain individual will do.
Structural econometrics aims to infer causes from probabilities, inferred from sample data generated in non-experimental settings. Arguably, it is the most ambitious part of econometrics. It aims to identify economic structures, robust parts of the economy to which interventions can be made to bring about desirable events. This part of econometrics is distinguished from forecasting econometrics in its attempt to capture something of the ‘real’ economy in the hope of allowing policy makers to act on and control events …
By making many strong background assumptions, the deductivist [the conventional logic of structural econometrics] reading of the regression model allows one — in principle — to support a structural reading of the equations and to support many rich causal claims as a result. Here, however, the difficulty is that of finding good evidence for many of the assumptions on which the approach rests. It seems difficult to believe, even in cases where we have good background economic knowledge, that the background information will be sufficiently to do the job that the deductivist asks of it. As a result, the deductivist approach may be difficult to sustain, at least in economics.
The difficulties in providing an evidence base for the deductive approach show just how difficult it is to warrant such strong causal claims. In short, as might be expected there is a trade-off between the strength of causal claims we would like to make from non-experimental data and the possibility of grounding these in evidence. If this conclusion is correct — and an appropriate elaboration were done to take into account the greater sophistication of actual structural econometric methods — then it suggests that if we want to do evidence-based structural econometrics, then we may need to be more modest in the causal knowledge we aim for. Or failing this, we should not act as if our causal claims — those that result from structural econometrics — are fully warranted by the evidence and we should acknowledge that they rest on contingent, conditional assumptions about the economy and the nature of causality.
Coupled with downright incompetence in statistics, we often find the syndrome that I have come to call statisticism: the notion that computing is synonymous with doing research, the naïve faith that statistics is a complete or sufficient basis for scientific methodology, the superstition that statistical formulas exist for evaluating such things as the relative merits of different substantive theories or the “importance” of the causes of a “dependent variable”; and the delusion that decomposing the covariations of some arbitrary and haphazardly assembled collection of variables can somehow justify not only a “causal model” but also, praise a mark, a “measurement model.” There would be no point in deploring such caricatures of the scientific enterprise if there were a clearly identifiable sector of social science research wherein such fallacies were clearly recognized and emphatically out of bounds.
Dudley Duncan: Notes on Social Measurement
Let’s say we have a stationary process. That does not guarantee that it is also ergodic. The long-run time average of a single output function of the stationary process may not converge to the expectation of the corresponding variables — and so the long-run time average may not equal the probabilistic (expectational) average.
Say we have two coins, where coin A has a probability 1/2 of coming up heads, and coin B has a probability of 1/4 of coming up heads. We pick either of these coins with a probability of 1/2 and then toss the chosen coin over and over again. Now let H1, H2, … be either one or zero as the coin comes up heads or tales. This “process” is obviously stationary, but the time averages — [H1 + ... + Hn]/n — converges to 1/2 if coin A is chosen, and 1/4 if coin B is chosen. Both these time averages has probability of 1/2 and so their expectational average is 1/2 x 1/2 + 1/2 x 1/4 = 3/8, which obviously is not equal to 1/2 or 1/4. The time averages depend on which coin you happen to choose, while the probabilistic (expectational) average is calculated for the whole “system” consisting of both coin A and coin B.
The wide conviction of the superiority of the methods of the science has converted the econometric community largely to a group of fundamentalist guards of mathematical rigour. It is often the case that mathemical rigour is held as the dominant goal and the criterion for research topic choice as well as research evaluation, so much so that the relevance of the research to business cycles is reduced to empirical illustrations. To that extent, probabilistic formalization has trapped econometric business cycle research in the pursuit of means at the expense of ends.
Once the formalization attempts have gone significantly astray from what is needed for analysing and forecasting the multi-faceted characteristics of business cycles, the research community should hopefully make appropriate ‘error corrctions’ of its overestimation of the power of a priori postulated models as well as its underestimation of the importance of the historical approach, or the ‘art’ dimewnsion of business cycle research.
Duo Qin A History of Econometrics (OUP 2013)
There have been over four decades of econometric research on business cycles … The formalization has undeniably improved the scientific strength of business cycle measures …
But the significance of the formalization becomes more difficult to identify when it is assessed from the applied perspective, especially when the success rate in ex-ante forecasts of recessions is used as a key criterion. The fact that the onset of the 2008 financial-crisis-triggered recession was predicted by only a few ‘Wise Owls’ … while missed by regular forecasters armed with various models serves us as the latest warning that the efficiency of the formalization might be far from optimal. Remarkably, not only has the performance of time-series data-driven econometric models been off the track this time, so has that of the whole bunch of theory-rich macro dynamic models developed in the wake of the rational expectations movement, which derived its fame mainly from exploiting the forecast failures of the macro-econometric models of the mid-1970s recession.
The limits of econometric forecasting has, as noted by Qin, been critically pointed out many times before.
Trygve Haavelmo — with the completion (in 1958) of the twenty-fifth volume of Econometrica – assessed the the role of econometrics in the advancement of economics, and although mainly positive of the “repair work” and “clearing-up work” done, Haavelmo also found some grounds for despair:
We have found certain general principles which would seem to make good sense. Essentially, these principles are based on the reasonable idea that, if an economic model is in fact “correct” or “true,” we can say something a priori about the way in which the data emerging from it must behave. We can say something, a priori, about whether it is theoretically possible to estimate the parameters involved. And we can decide, a priori, what the proper estimation procedure should be … But the concrete results of these efforts have often been a seemingly lower degree of accuracy of the would-be economic laws (i.e., larger residuals), or coefficients that seem a priori less reasonable than those obtained by using cruder or clearly inconsistent methods.
There is the possibility that the more stringent methods we have been striving to develop have actually opened our eyes to recognize a plain fact: viz., that the “laws” of economics are not very accurate in the sense of a close fit, and that we have been living in a dream-world of large but somewhat superficial or spurious correlations.
And as the quote below shows, even Ragnar Frisch shared some of Haavelmo’s — and Keynes’s — doubts on the applicability of econometrics:
I have personally always been skeptical of the possibility of making macroeconomic predictions about the development that will follow on the basis of given initial conditions … I have believed that the analytical work will give higher yields – now and in the near future – if they become applied in macroeconomic decision models where the line of thought is the following: “If this or that policy is made, and these conditions are met in the period under consideration, probably a tendency to go in this or that direction is created”.
Econometrics may be an informative tool for research. But if its practitioners do not investigate and make an effort of providing a justification for the credibility of the assumptions on which they erect their building, it will not fulfill its tasks. There is a gap between its aspirations and its accomplishments, and without more supportive evidence to substantiate its claims, critics will continue to consider its ultimate argument as a mixture of rather unhelpful metaphors and metaphysics. Maintaining that economics is a science in the “true knowledge” business, I remain a skeptic of the pretences and aspirations of econometrics. So far, I cannot really see that it has yielded very much in terms of relevant, interesting economic knowledge. And, more specifically, when it comes to forecasting activities, the results have been bleak indeed.
The marginal return on its ever higher technical sophistication in no way makes up for the lack of serious under-labouring of its deeper philosophical and methodological foundations that already Keynes complained about. The rather one-sided emphasis of usefulness and its concomitant instrumentalist justification cannot hide that the legions of probabilistic econometricians who give supportive evidence for their considering it “fruitful to believe” in the possibility of treating unique economic data as the observable results of random drawings from an imaginary sampling of an imaginary population, are scating on thin ice. After having analyzed some of its ontological and epistemological foundations, I cannot but conclude that econometrics on the whole has not delivered “truth,” nor robust forecasts. And I doubt if it has ever been the intention of its main protagonists.
Our admiration for technical virtuosity should not blind us to the fact that we have to have a more cautious attitude towards probabilistic inference of causality in economic contexts. Science should help us penetrate to — as Keynes put it — “the true process of causation lying behind current events” and disclose “the causal forces behind the apparent facts.” We should look out for causal relations, but econometrics can never be more than a starting point in that endeavour, since econometric (statistical) explanations are not explanations in terms of mechanisms, powers, capacities or causes. Firmly stuck in an empiricist tradition, econometrics is only concerned with the measurable aspects of reality, But there is always the possibility that there are other variables – of vital importance and although perhaps unobservable and non-additive not necessarily epistemologically inaccessible – that were not considered for the model. Those who were can hence never be guaranteed to be more than potential causes, and not real causes.
A rigorous application of econometric methods in economics really presupposes that the phenomena of our real world economies are ruled by stable causal relations between variables. A perusal of the leading econom(etr)ic journals shows that most econometricians still concentrate on fixed parameter models and that parameter-values estimated in specific spatio-temporal contexts are presupposed to be exportable to totally different contexts. To warrant this assumption one, however, has to convincingly establish that the targeted acting causes are stable and invariant so that they maintain their parametric status after the bridging. The endemic lack of predictive success of the econometric project indicates that this hope of finding fixed parameters is a hope for which there really is no other ground than hope itself.
This is a more fundamental and radical problem than the celebrated “Lucas critique” have suggested. This is not the question if deep parameters, absent on the macro-level, exist in “tastes” and “technology” on the micro-level. It goes deeper. Real world social systems are not governed by stable causal mechanisms or capacities. It is the criticism that Keynes — in Essays in Biography — first launched against econometrics and inferential statistics already in the 1920s:
The atomic hypothesis which has worked so splendidly in Physics breaks down in Psychics. We are faced at every turn with the problems of Organic Unity, of Discreteness, of Discontinuity – the whole is not equal to the sum of the parts, comparisons of quantity fails us, small changes produce large effects, the assumptions of a uniform and homogeneous continuum are not satisfied. Thus the results of Mathematical Psychics turn out to be derivative, not fundamental, indexes, not measurements, first approximations at the best; and fallible indexes, dubious approximations at that, with much doubt added as to what, if anything, they are indexes or approximations of.
The kinds of laws and relations that econom(etr)ics has established, are laws and relations about entities in models that presuppose causal mechanisms being atomistic and additive. When causal mechanisms operate in real world social target systems they only do it in ever-changing and unstable combinations where the whole is more than a mechanical sum of parts. If economic regularities obtain they do it (as a rule) only because we engineered them for that purpose. Outside man-made “nomological machines” they are rare, or even non-existant. Unfortunately that also makes most of the achievements of econometrics – as most of contemporary endeavours of economic theoretical modeling – rather useless.
In Andrew Gelman’s and Jennifer Hill’s Data Analysis Using Regression and Multilevel/Hierarchical Models, the authors list the assumptions of the linear regression model. On top of the list is validity and additivity/linearity, followed by different assumptions pertaining to error charateristics.
Yours truly can’t but concur, especially on the “decreasing order of importance” of the assumptions. But then, of course, one really has to wonder why econometrics textbooks — almost invariably — turn this order of importance upside-down and don’t have more thorough discussions on the overriding importance of Gelman/Hill’s two first points …
Since econometrics doesn’t content itself with only making “optimal predictions,” but also aspires to explain things in terms of causes and effects, econometricians need loads of assumptions — and most important of these are validity and additivity.
Let me take the opportunity to cite one of my favourite introductory statistics textbooks on one further reason these assumptions are made — and why they ought to be much more argued for on both epistemological and ontological grounds when used (emphasis added):
In a hypothesis test … the sample comes from an unknown population. If the population is really unknown, it would suggest that we do not know the standard deviation, and therefore, we cannot calculate the standard error. To solve this dilemma, we have made an assumption. Specifically, we assume that the standard deviation for the unknown population (after treatment) is the same as it was for the population before treatment.
Actually this assumption is the consequence of a more general assumption that is part of many statistical procedure. The general assumption states that the effect of the treatment is to add a constant amount to … every score in the population … You should also note that this assumption is a theoretical ideal. In actual experiments, a treatment generally does not show a perfect and consistent additive effect.
A standard view among econometricians is that their models — and the causality they may help us to detect — are only in the mind. From a realist point of view, this is rather untenable. The reason we as scientists are interested in causality is that it’s a part of the way the world works. We represent the workings of causality in the real world by means of models, but that doesn’t mean that causality isn’t a fact pertaining to relations and structures that exist in the real world. If it was only “in the mind,” most of us couldn’t care less.
The econometricians’ nominalist-positivist view of science and models, is the belief that science can only deal with observable regularity patterns of a more or less lawlike kind. Only data matters and trying to (ontologically) go beyond observed data in search of the underlying real factors and relations that generate the data is not admissable. All has to take place in the econometric mind’s model since the real factors and relations according to the econometric (epistemologically based) methodology are beyond reach since they allegedly are both unobservable and unmeasurable. This also means that instead of treating the model-based findings as interesting clues for digging deepeer into real structures and mechanisms, they are treated as the end points of the investigation.
The critique put forward here is in line with what mathematical statistician David Freedman writes in Statistical Models and Causal Inference (2010):
In my view, regression models are not a particularly good way of doing empirical work in the social sciences today, because the technique depends on knowledge that we do not have. Investigators who use the technique are not paying adequate attention to the connection – if any – between the models and the phenomena they are studying. Their conclusions may be valid for the computer code they have created, but the claims are hard to transfer from that microcosm to the larger world …
Given the limits to present knowledge, I doubt that models can be rescued by technical fixes. Arguments about the theoretical merit of regression or the asymptotic behavior of specification tests for picking one version of a model over another seem like the arguments about how to build desalination plants with cold fusion and the energy source. The concept may be admirable, the technical details may be fascinating, but thirsty people should look elsewhere …
Causal inference from observational data presents may difficulties, especially when underlying mechanisms are poorly understood. There is a natural desire to substitute intellectual capital for labor, and an equally natural preference for system and rigor over methods that seem more haphazard. These are possible explanations for the current popularity of statistical models.
Indeed, far-reaching claims have been made for the superiority of a quantitative template that depends on modeling – by those who manage to ignore the far-reaching assumptions behind the models. However, the assumptions often turn out to be unsupported by the data. If so, the rigor of advanced quantitative methods is a matter of appearance rather than substance.
Econometrics is basically a deductive method. Given the assumptions (such as manipulability, transitivity, separability, additivity, linearity etc) it delivers deductive inferences. The problem, of course, is that we will never completely know when the assumptions are right. Real target systems are seldom epistemically isomorphic to axiomatic-deductive models/systems, and even if they were, we still have to argue for the external validity of the conclusions reached from within these epistemically convenient models/systems. Causal evidence generated by statistical/econometric procedures like regression analysis may be valid in “closed” models, but what we usually are interested in, is causal evidence in the real target system we happen to live in.
Most advocates of econometrics and regression analysis want to have deductively automated answers to fundamental causal questions. Econometricians think – as David Hendry expressed it in Econometrics – alchemy or science? (1980) – they “have found their Philosophers’ Stone; it is called regression analysis and is used for transforming data into ‘significant results!'” But as David Freedman poignantly notes in Statistical Models: “Taking assumptions for granted is what makes statistical techniques into philosophers’ stones.” To apply “thin” methods we have to have “thick” background knowledge of what’s going on in the real world, and not in idealized models. Conclusions can only be as certain as their premises – and that also applies to the quest for causality in econometrics and regression analysis.
Without requirements of depth, explanations most often do not have practical significance. Only if we search for and find fundamental structural causes, can we hopefully also take effective measures to remedy problems like e.g. unemployment, poverty, discrimination and underdevelopment. A social science must try to establish what relations exist between different phenomena and the systematic forces that operate within the different realms of reality. If econometrics is to progress, it has to abandon its outdated nominalist-positivist view of science and the belief that science can only deal with observable regularity patterns of a more or less law-like kind. Scientific theories ought to do more than just describe event-regularities and patterns – they also have to analyze and describe the mechanisms, structures, and processes that give birth to these patterns and eventual regularities.
Limiting model assumptions in economic science always have to be closely examined since if we are going to be able to show that the mechanisms or causes that we isolate and handle in our models are stable in the sense that they do not change when we “export” them to our “target systems”, we have to be able to show that they do not only hold under ceteris paribus conditions and a fortiori only are of limited value to our understanding, explanations or predictions of real economic systems. As the always eminently quotable Keynes writes (emphasis added) in Treatise on Probability (1921):
The kind of fundamental assumption about the character of material laws, on which scientists appear commonly to act, seems to me to be [that] the system of the material universe must consist of bodies … such that each of them exercises its own separate, independent, and invariable effect, a change of the total state being compounded of a number of separate changes each of which is solely due to a separate portion of the preceding state … Yet there might well be quite different laws for wholes of different degrees of complexity, and laws of connection between complexes which could not be stated in terms of laws connecting individual parts … If different wholes were subject to different laws qua wholes and not simply on account of and in proportion to the differences of their parts, knowledge of a part could not lead, it would seem, even to presumptive or probable knowledge as to its association with other parts … These considerations do not show us a way by which we can justify induction … /427 No one supposes that a good induction can be arrived at merely by counting cases. The business of strengthening the argument chiefly consists in determining whether the alleged association is stable, when accompanying conditions are varied … /468 In my judgment, the practical usefulness of those modes of inference … on which the boasted knowledge of modern science depends, can only exist … if the universe of phenomena does in fact present those peculiar characteristics of atomism and limited variety which appears more and more clearly as the ultimate result to which material science is tending.
Econometrics may be an informative tool for research. But if its practitioners do not investigate and make an effort of providing a justification for the credibility of the assumptions on which they erect their building, it will not fulfill its tasks. There is a gap between its aspirations and its accomplishments, and without more supportive evidence to substantiate its claims, critics will continue to consider its ultimate argument as a mixture of rather unhelpful metaphors and metaphysics. Maintaining that economics is a science in the “true knowledge” business, yours truly remains a skeptic of the pretences and aspirations of econometrics. So far, I cannot really see that it has yielded very much in terms of relevant, interesting economic knowledge.
The marginal return on its ever higher technical sophistication in no way makes up for the lack of serious under-labouring of its deeper philosophical and methodological foundations that already Keynes complained about. The rather one-sided emphasis of usefulness and its concomitant instrumentalist justification cannot hide that neither Haavelmo, nor the legions of probabilistic econometricians following in his footsteps, give supportive evidence for their considering it “fruitful to believe” in the possibility of treating unique economic data as the observable results of random drawings from an imaginary sampling of an imaginary population. After having analyzed some of its ontological and epistemological foundations, I cannot but conclude that econometrics on the whole has not delivered “truth”. And I doubt if it has ever been the intention of its main protagonists.
Our admiration for technical virtuosity should not blind us to the fact that we have to have a cautious attitude towards probabilistic inferences in economic contexts. Science should help us penetrate to “the true process of causation lying behind current events” and disclose “the causal forces behind the apparent facts” [Keynes 1971-89 vol XVII:427]. We should look out for causal relations, but econometrics can never be more than a starting point in that endeavour, since econometric (statistical) explanations are not explanations in terms of mechanisms, powers, capacities or causes. Firmly stuck in an empiricist tradition, econometrics is only concerned with the measurable aspects of reality. But there is always the possibility that there are other variables – of vital importance and although perhaps unobservable and non-additive, not necessarily epistemologically inaccessible – that were not considered for the model. Those who were can hence never be guaranteed to be more than potential causes, and not real causes. A rigorous application of econometric methods in economics really presupposes that the phenomena of our real world economies are ruled by stable causal relations between variables. A perusal of the leading econom(etr)ic journals shows that most econometricians still concentrate on fixed parameter models and that parameter-values estimated in specific spatio-temporal contexts are presupposed to be exportable to totally different contexts. To warrant this assumption one, however, has to convincingly establish that the targeted acting causes are stable and invariant so that they maintain their parametric status after the bridging. The endemic lack of predictive success of the econometric project indicates that this hope of finding fixed parameters is a hope for which there really is no other ground than hope itself.
Real world social systems are not governed by stable causal mechanisms or capacities. The kinds of “laws” and relations that econometrics has established, are laws and relations about entities in models that presuppose causal mechanisms being atomistic and additive. When causal mechanisms operate in real world social target systems they only do it in ever-changing and unstable combinations where the whole is more than a mechanical sum of parts. If economic regularities obtain they do it (as a rule) only because we engineered them for that purpose. Outside man-made “nomological machines” they are rare, or even non-existant. Unfortunately that also makes most of the achievements of econometrics – as most of contemporary endeavours of mainstream economic theoretical modeling – rather useless.
In one of the best articles ever published on applied statistics, eminent statisticians David Freedman and Richard Berk share some of their incisive insights with us on two of the more prominently used fictions in modern statistics and econometrics — “random sampling” and “imaginary populations”:
Random sampling is hardly universal … More typically, perhaps, the data in hand are simply the data most readily available …
“Convenience samples” of this sort are not random samples. Still, researchers may quite properly be worried about replicability. The generic concern is the same as for random sampling: if the study were repeated, the results would be different. What, then, can be said about the results obtained? … The moment that conventional statistical inferences are made from convenience samples, substantive assumptions are made about how the social world operates. Conventional statistical inferences (e.g., formulas for the standard error of the mean, t-tests, etc.) depend on the assumption of random sampling. This is not a matter of debate or opinion; it is a matter of mathematical necessity. When applied to convenience samples, the random sampling assumption is not a mere technicality or a minor revision on the periphery; the assumption becomes an integral part of the theory …
What kinds of social processes are assumed by the application of conventional statistical techniques to convenience samples? Our answer will be that the assumptions are quite unrealistic. If so, probability calculations that depend on the assumptions must be viewed as unrealistic too …
[One] way to treat uncertainty is to deﬁne a real population and assume that the data can be treated as a random sample from that population … This “as-if”strategy would seem to set the stage for statistical business as usual. An explicit goal of the “as-if ” strategy is generalizing to a speciﬁc population.And one issue is this: are the data representative? For example, did each member of the speciﬁed population have the same probability of coming into the sample? If not, and the investigator fails to weight the data, inferences from the sample to the population will likely be wrong …
Another way to treat uncertainty is to create an imaginary population from which the data are assumed to be a random sample … With this approach, the investigator does not explicitly deﬁne a population that could in principle be studied, with unlimited resources of time and money. The investigator merely assumes that such a population exists in some ill-deﬁned sense. And there is a further assumption, that the dataset being analyzed can be treated as if it were based on a random sample from the assumed population. These are convenient ﬁctions. Convenience will not be denied; the source of the ﬁction is two-fold: (i) the population does not have any empirical existence of its own, and (ii) the sample was not in fact drawn at random …
Handwaving is inadequate … Nevertheless, reliance on imaginary populations is widespread. Indeed, regression models are commonly used to analyze convenience samples: … such analyses are often predicated on random sampling from imaginary populations. The rhetoric of imaginary populations is seductive precisely because it seems to free the investigator from the necessity of understanding how data were generated.
David Freedman & Richard Berk
Statistical Assumptions as Empirical Commitments (emphasis added)
And so, what’s the remedy for this wide-spread assumptions malady? As I’ve repeatedly argued, e. g. here, I think it’s absolutely necessary to apply some kind of real-world filter to models. As Paul Pleiderer has it:
Whereas some theoretical models can be immensely useful in developing intuitions, in essence a theoretical model is nothing more than an argument that a set of conclusions follows from a given set of assumptions. Being logically correct may earn a place for a theoretical model on the bookshelf, but when a theoretical model is taken off the shelf and applied to the real world, it is important to question whether the model’s assumptions are in accord with what we know about the world. Is the story behind the model one that captures what is important or is it a fiction that has little connection to what we see in practice? Have important factors been omitted? Are economic agents assumed to be doing things that we have serious doubts they are able to do? These questions and others like them allow us to filter out models that are ill suited to give us
genuine insights. To be taken seriously models should pass through the real world filter …
Although a model may be internally consistent, although it may be subtle and the analysis may be mathematically elegant, none of this carries any guarantee that it is applicable to the actual world. One might think that the applicability or “truth” of a theoretical model can always be established by formal empirical analysis that tests the model’s testable hypotheses, but this a bit of a fantasy. Formal empirical testing should, of course, be vigorously pursued, but lack of data and lack of natural experiments limit our ability in many cases to choose among competing models. In addition, even if we are able to formally test some hypotheses of these competing models, the results of these tests may only allow us to reject some of the models, leaving several survivors that have different implications on issues that we are not able to test. The real world filters will be critical in all these cases.
If people start sending you random pairs of variables that happen to be highly correlated, sure, there might well be a connection between them, for example kids’ scores on math tests and language tests are correlated, and this tells us something. But if someone is looking for a particular pattern, and then selects two variables that are correlated, that’s another story. The great thing about causal identification is that it’s valid even if you’re looking to find a pattern. (Not completely, there’s p-hacking and also you can run 100 experiments and only report the best one, etc., but that’s still less of an issue than the fact that pure correlation does not logically tell you anything about causation. To put it another way: returning to Noah’s tweet: Correlation is surely correlated with causation in an aggregate sense, but if you take the subset of correlations that a particular motivated researcher is looking for—then maybe not …
The expression “correlation does not imply causation” is popular, and I think it’s popular for a reason, that it does capture a truth about the world …
People see enough random correlations that they can pick them out and interpret them how they like.
So if I had to put something on a bumper sticker (or a tweet), it would be:
“Correlation does not even imply correlation”
That is, correlation in the data you happen to have (even if it happens to be “statistically significant”) does not necessarily imply correlation in the population of interest.
In a recent judgement the English Court of Appeal has denied that probability can be used as an expression of uncertainty for events that have either happened or not.
The case was a civil dispute about the cause of a fire, and concerned an appeal against a decision in the High Court by Judge Edwards-Stuart. Edwards-Stuart had essentially concluded that the fire had been started by a discarded cigarette, even though this seemed an unlikely event in itself, because the other two explanations were even more implausible. The Court of Appeal rejected this approach although still supported the overall judgement and disallowed the appeal …
But it’s the quotations from the judgement that are so interesting:
“Sometimes the ‘balance of probability’ standard is expressed mathematically as ’50 + % probability’, but this can carry with it a danger of pseudo-mathematics, as the argument in this case demonstrated. When judging whether a case for believing that an event was caused in a particular way is stronger that the case for not so believing, the process is not scientific (although it may obviously include evaluation of scientific evidence) and to express the probability of some event having happened in percentage terms is illusory.“
The idea that you can assign probabilities to events that have already occurred, but where we are ignorant of the result, forms the basis for the Bayesian view of probability. Put very broadly, the ‘classical’ view of probability is in terms of genuine unpredictability about future events, popularly known as ‘chance’ or ‘aleatory uncertainty’. The Bayesian interpretation allows probability also to be used to express our uncertainty due to our ignorance, known as ‘epistemic uncertainty’ …
The judges went on to say:
“The chances of something happening in the future may be expressed in terms of percentage. Epidemiological evidence may enable doctors to say that on average smokers increase their risk of lung cancer by X%. But you cannot properly say that there is a 25 per cent chance that something has happened … Either it has or it has not“ …
Anyway, I teach the Bayesian approach to post-graduate students attending my ‘Applied Bayesian Statistics’ course at Cambridge, and so I must now tell them that the entire philosophy behind their course has been declared illegal in the Court of Appeal. I hope they don’t mind.
David Siegelhalter should of course go on with his course, but maybe he also ought to contemplate the rather common fact that people — including scientists — often find it possible to believe things although they can’t always warrant or justify their beliefs. And — probabilistic nomological machines do not exist “out there” and so is extremely difficult to properly apply to idiosyncratic real world events (such as fires).
As I see it, Bayesian probabilistic reasoning in science reduces questions of rationality to questions of internal consistency (coherence) of beliefs, but — even granted this questionable reductionism — it’s not self-evident that rational agents really have to be probabilistically consistent. There is no strong warrant for believing so. Rather, there are strong evidence for us encountering huge problems if we let probabilistic reasoning become the dominant method for doing research in social sciences on problems that involve risk and uncertainty.
In many situations one could argue that there is simply not enough of adequate and relevant information to ground beliefs of a probabilistic kind, and that in those situations it is not really possible, in any relevant way, to represent an individual’s beliefs in a single probability measure.
Say you have come to learn (based on own experience and tons of data) that the probability of you becoming the next president in US is 1%. Having moved to Italy (where you have no own experience and no data) you have no information on the event and a fortiori nothing to help you construct any probability estimate on. A Bayesian would, however, argue that you would have to assign probabilities to the mutually exclusive alternative outcomes and that these have to add up to 1 — if you are rational. That is, in this case — and based on symmetry — a rational individual would have to assign probability 1% of becoming the next Italian president and 99% of not so.
That feels intuitively wrong though, and I guess most people would agree. Bayesianism cannot distinguish between symmetry-based probabilities from information and symmetry-based probabilities from an absence of information. In these kinds of situations most of us would rather say that it is simply irrational to be a Bayesian and better instead to admit that we “simply do not know” or that we feel ambiguous and undecided. Arbitrary and ungrounded probability claims are more irrational than being undecided in face of genuine uncertainty, so if there is not sufficient information to ground a probability distribution it is better to acknowledge that simpliciter, rather than pretending to possess a certitude that we simply do not possess.
I think this critique of Bayesianism is in accordance with the views of Keynes’s A Treatise on Probability (1921) and General Theory (1937). According to Keynes we live in a world permeated by unmeasurable uncertainty – not quantifiable stochastic risk – which often forces us to make decisions based on anything but rational expectations. Sometimes we “simply do not know.” Keynes would not have accepted the view of Bayesian economists, according to whom expectations “tend to be distributed, for the same information set, about the prediction of the theory.” Keynes, rather, thinks that we base our expectations on the confidence or “weight” we put on different events and alternatives. To Keynes expectations are a question of weighing probabilities by “degrees of belief”, beliefs that have preciously little to do with the kind of stochastic probabilistic calculations made by the rational agents modeled by probabilistically reasoning Bayesians.