Almost everything we do these days leaves some kind of data trace in some computer system somewhere. When such data is aggregated into huge databases it is called “Big Data”. It is claimed social science will be transformed by the application of computer processing and Big Data. The argument is that social science has, historically, been “theory rich” and “data poor” and now we will be able to apply the methods of “real science” to “social science” producing new validated and predictive theories which we can use to improve the world.
What’s wrong with this? … Firstly what is this “data” we are talking about? In it’s broadest sense it is some representation usually in a symbolic form that is machine readable and processable. And how will this data be processed? Using some form of machine learning or statistical analysis. But what will we find? Regularities or patterns … What do such patterns mean? Well that will depend on who is interpreting them …
Looking for “patterns or regularities” presupposes a definition of what a pattern is and that presupposes a hypothesis or model, i.e. a theory. Hence big data does not “get us away from theory” but rather requires theory before any project can commence.
What is the problem here? The problem is that a certain kind of approach is being propagated within the “big data” movement that claims to not be a priori committed to any theory or view of the world. The idea is that data is real and theory is not real. That theory should be induced from the data in a “scientific” way.
I think this is wrong and dangerous. Why? Because it is not clear or honest while appearing to be so. Any statistical test or machine learning algorithm expresses a view of what a pattern or regularity is and any data has been collected for a reason based on what is considered appropriate to measure. One algorithm will find one kind of pattern and another will find something else. One data set will evidence some patterns and not others. Selecting an appropriate test depends on what you are looking for. So the question posed by the thought experiment remains “what are you looking for, what is your question, what is your hypothesis?”
Ideas matter. Theory matters. Big data is not a theory-neutral way of circumventing the hard questions. In fact it brings these questions into sharp focus and it’s time we discuss them openly.
Eminent statistician David Salsburg is rightfully very critical of the way social scientists — including economists and econometricians — uncritically and without arguments have come to simply assume that they can apply probability distributions from statistical theory on their own area of research:
We assume there is an abstract space of elementary things called ‘events’ … If a measure on the abstract space of events fulfills certain axioms, then it is a probability. To use probability in real life, we have to identify this space of events and do so with sufficient specificity to allow us to actually calculate probability measurements on that space … Unless we can identify [this] abstract space, the probability statements that emerge from statistical analyses will have many different and sometimes contrary meanings …
Kolmogorov established the mathematical meaning of probability: Probability is a measure of sets in an abstract space of events. All the mathematical properties of probability can be derived from this definition. When we wish to apply probability to real life, we need to identify that abstract space of events for the particular problem at hand … It is not well established when statistical methods are used for observational studies … If we cannot identify the space of events that generate the probabilities being calculated, then one model is no more valid than another … As statistical models are used more and more for observational studies to assist in social decisions by government and advocacy groups, this fundamental failure to be able to derive probabilities without ambiguity will cast doubt on the usefulness of these methods.
Wise words well worth pondering on.
As long as economists and statisticians cannot really identify their statistical theories with real-world phenomena there is no real warrant for taking their statistical inferences seriously.
Just as there is no such thing as a ‘free lunch,’ there is no such thing as a ‘free probability.’ To be able at all to talk about probabilities, you have to specify a model. If there is no chance set-up or model that generates the probabilistic outcomes or events – in statistics one refers to any process where you observe or measure as an experiment (rolling a die) and the results obtained as the outcomes or events (number of points rolled with the die, being e. g. 3 or 5) of the experiment – there strictly seen is no event at all.
Probability is — as strongly argued by Keynes — a relational element. It always must come with a specification of the model from which it is calculated. And then to be of any empirical scientific value it has to be shown to coincide with (or at least converge to) real data generating processes or structures – something seldom or never done!
And this is the basic problem with economic data. If you have a fair roulette-wheel, you can arguably specify probabilities and probability density distributions. But how do you conceive of the analogous ‘nomological machines’ for prices, gross domestic product, income distribution etc? Only by a leap of faith. And that does not suffice. You have to come up with some really good arguments if you want to persuade people into believing in the existence of socio-economic structures that generate data with characteristics conceivable as stochastic events portrayed by probabilistic density distributions!
Regression to the men is nothing but the universal truth of the fact that whenever we have an imperfect correlation between two scores, we have regression to the mean.
- Always, but always, plot your data.
- Remember that data quality is at least as important as data quantity.
- Always ask yourself, “Do these results make economic/common sense”?
- Check whether your “statistically significant” results are also “numerically/economically significant”.
- Be sure that you know exactly what assumptions are used/needed to obtain the results relating to the properties of any estimator or test that you use.
- Just because someone else has used a particular approach to analyse a problem that looks like yours, that doesn’t mean they were right!
- “Test, test, test”! (David Hendry). But don’t forget that “pre-testing” raises some important issues of its own.
- Don’t assume that the computer code that someone gives to you is relevant for your application, or that it even produces correct results.
- Keep in mind that published results will represent only a fraction of the results that the author obtained, but is not publishing.
- Don’t forget that “peer-reviewed” does NOT mean “correct results”, or even “best practices were followed”.
Renowned ‘error-statistician’ Aris Spanos maintains — in a comment on this blog a couple of weeks ago — that Keynes’ critique of econometrics and the reliability of inferences made when it is applied, “have been addressed or answered.”
One could, of course, say that, but the valuation of the statement hinges completely on what we mean by a question or critique being ‘addressed’ or ‘answered’. As I will argue below, Keynes’ critique is still valid and unanswered in the sense that the problems he pointed at are still with us today and ‘unsolved.’ Ignoring them — the most common practice among applied econometricians — is not to solve them.
To apply statistical and mathematical methods to the real-world economy, the econometrician has to make some quite strong assumption. In a review of Tinbergen’s econometric work — published in The Economic Journal in 1939 — Keynes gave a comprehensive critique of Tinbergen’s work, focussing on the limiting and unreal character of the assumptions that econometric analyses build on:
Completeness: Where Tinbergen attempts to specify and quantify which different factors influence the business cycle, Keynes maintains there has to be a complete list of all the relevant factors to avoid misspecification and spurious causal claims. Usually this problem is ‘solved’ by econometricians assuming that they somehow have a ‘correct’ model specification. Keynes is, to put it mildly, unconvinced:
It will be remembered that the seventy translators of the Septuagint were shut up in seventy separate rooms with the Hebrew text and brought out with them, when they emerged, seventy identical translations. Would the same miracle be vouchsafed if seventy multiple correlators were shut up with the same statistical material? And anyhow, I suppose, if each had a different economist perched on his a priori, that would make a difference to the outcome.
Homogeneity: To make inductive inferences possible — and being able to apply econometrics — the system we try to analyse has to have a large degree of ‘homogeneity.’ According to Keynes most social and economic systems — especially from the perspective of real historical time — lack that ‘homogeneity.’ As he had argued already in Treatise on Probability (ch. 22), it wasn’t always possible to take repeated samples from a fixed population when we were analysing real-world economies. In many cases there simply are no reasons at all to assume the samples to be homogenous. Lack of ‘homogeneity’ makes the principle of ‘limited independent variety’ non-applicable, and hence makes inductive inferences, strictly seen, impossible since one its fundamental logical premisses are not satisfied. Without “much repetition and uniformity in our experience” there is no justification for placing “great confidence” in our inductions (TP ch. 8).
And then, of course, there is also the ‘reverse’ variability problem of non-excitation: factors that do not change significantly during the period analysed, can still very well be extremely important causal factors.
Stability: Tinbergen assumes there is a stable spatio-temporal relationship between the variables his econometric models analyze. But as Keynes had argued already in his Treatise on Probability it was not really possible to make inductive generalisations based on correlations in one sample. As later studies of ‘regime shifts’ and ‘structural breaks’ have shown us, it is exceedingly difficult to find and establish the existence of stable econometric parameters for anything but rather short time series.
Measurability: Tinbergen’s model assumes that all relevant factors are measurable. Keynes questions if it is possible to adequately quantify and measure things like expectations and political and psychological factors. And more than anything, he questioned — both on epistemological and ontological grounds — that it was always and everywhere possible to measure real-world uncertainty with the help of probabilistic risk measures. Thinking otherwise can, as Keynes wrote, “only lead to error and delusion.”
Independence: Tinbergen assumes that the variables he treats are independent (still a standard assumption in econometrics). Keynes argues that in such a complex, organic and evolutionary system as an economy, independence is a deeply unrealistic assumption to make. Building econometric models from that kind of simplistic and unrealistic assumptions risk to produce nothing but spurious correlations and causalities. Real-world economies are organic systems for which the statistical methods used in econometrics are ill-suited, or even, strictly seen, inapplicable. Mechanical probabilistic models have little leverage when applied to non-atomic evolving organic systems — such as economies.
Building econometric models can’t be a goal in itself. Good econometric models are means that make it possible for us to infer things about the real-world systems they ‘represent.’ If we can’t show that the mechanisms or causes that we isolate and handle in our econometric models are ‘exportable’ to the real-world, they are of limited value to our understanding, explanations or predictions of real-world economic systems.
The kind of fundamental assumption about the character of material laws, on which scientists appear commonly to act, seems to me to be much less simple than the bare principle of uniformity. They appear to assume something much more like what mathematicians call the principle of the superposition of small effects, or, as I prefer to call it, in this connection, the atomic character of natural law. The system of the material universe must consist, if this kind of assumption is warranted, of bodies which we may term (without any implication as to their size being conveyed thereby) legal atoms, such that each of them exercises its own separate, independent, and invariable effect, a change of the total state being compounded of a number of separate changes each of which is solely due to a separate portion of the preceding state. We do not have an invariable relation between particular bodies, but nevertheless each has on the others its own separate and invariable effect, which does not change with changing circumstances, although, of course, the total effect may be changed to almost any extent if all the other accompanying causes are different. Each atom can, according to this theory, be treated as a separate cause and does not enter into different organic combinations in each of which it is regulated by different laws …
The scientist wishes, in fact, to assume that the occurrence of a phenomenon which has appeared as part of a more complex phenomenon, may be some reason for expecting it to be associated on another occasion with part of the same complex. Yet if different wholes were subject to laws qua wholes and not simply on account of and in proportion to the differences of their parts, knowledge of a part could not lead, it would seem, even to presumptive or probable knowledge as to its association with other parts. Given, on the other hand, a number of legally atomic units and the laws connecting them, it would be possible to deduce their effects pro tanto without an exhaustive knowledge of all the coexisting circumstances.
Linearity: To make his models tractable, Tinbergen assumes the relationships between the variables he study to be linear. This is still standard procedure today, but as as Keynes writes:
It is a very drastic and usually improbable postulate to suppose that all economic forces are of this character, producing independent changes in the phenomenon under investigation which are directly proportional to the changes in themselves; indeed, it is ridiculous.
To Keynes it was a ‘fallacy of reification’ to assume that all quantities are additive (an assumption closely linked to independence and linearity).
The unpopularity of the principle of organic unities shows very clearly how great is the danger of the assumption of unproved additive formulas. The fallacy, of which ignorance of organic unity is a particular instance, may perhaps be mathematically represented thus: suppose f(x) is the goodness of x and f(y) is the goodness of y. It is then assumed that the goodness of x and y together is f(x) + f(y) when it is clearly f(x + y) and only in special cases will it be true that f(x + y) = f(x) + f(y). It is plain that it is never legitimate to assume this property in the case of any given function without proof.
J. M. Keynes “Ethics in Relation to Conduct” (1903)
And as even one of the founding fathers of modern econometrics — Trygve Haavelmo — wrote:
What is the use of testing, say, the significance of regression coefficients, when maybe, the whole assumption of the linear regression equation is wrong?
Real-world social systems are usually not governed by stable causal mechanisms or capacities. The kinds of ‘laws’ and relations that econometrics has established, are laws and relations about entities in models that presuppose causal mechanisms and variables — and the relationship between them — being linear, additive, homogenous, stable, invariant and atomistic. But — when causal mechanisms operate in the real world they only do it in ever-changing and unstable combinations where the whole is more than a mechanical sum of parts. Since statisticians and econometricians — as far as I can see — haven’t been able to convincingly warrant their assumptions of homogeneity, stability, invariance, independence, additivity as being ontologically isomorphic to real-world economic systems, Keynes’ critique is still valid . As long as — as Keynes writes in a letter to Frisch in 1935 — “nothing emerges at the end which has not been introduced expressively or tacitly at the beginning,” I remain doubtful of the scientific aspirations of econometrics.
In his critique of Tinbergen, Keynes points us to the fundamental logical, epistemological and ontological problems of applying statistical methods to a basically unpredictable, uncertain, complex, unstable, interdependent, and ever-changing social reality. Methods designed to analyse repeated sampling in controlled experiments under fixed conditions are not easily extended to an organic and non-atomistic world where time and history play decisive roles.
Econometric modeling should never be a substitute for thinking. From that perspective it is really depressing to see how much of Keynes’ critique of the pioneering econometrics in the 1930s-1940s is still relevant today.
The general line you take is interesting and useful. It is, of course, not exactly comparable with mine. I was raising the logical difficulties. You say in effect that, if one was to take these seriously, one would give up the ghost in the first lap, but that the method, used judiciously as an aid to more theoretical enquiries and as a means of suggesting possibilities and probabilities rather than anything else, taken with enough grains of salt and applied with superlative common sense, won’t do much harm. I should quite agree with that. That is how the method ought to be used.
Keynes, letter to E.J. Broster, December 19, 1939
A common goal of statistical analysis in the social sciences is to draw inferences about the relative contributions of different variables to some outcome variable. When regressing academic performance, political affiliation, or vocabulary growth on other variables, researchers often wish to determine which variables matter to the prediction and which do not—typically by considering whether each variable’s contribution remains statistically significant after statistically controlling for other predictors. When a predictor variable in a multiple regression has a coefficient that differs significantly from zero, researchers typically conclude that the variable makes a “unique” contribution to the outcome. And because measured variables are typically viewed as proxies for latent constructs of substantive interest—for example, two cognitive ability measures might be taken to index spatial versus verbal ability—it is natural to generalize the operational conclusion to the latent variable level; that is, to conclude that the latent construct measured by a given predictor variable itself has incremental validity in predicting the outcome, over and above other latent constructs that were examined.
Incremental validity claims pervade the social and biomedical sciences. In some fields, these claims are often explicit … More commonly, however, incremental validity claims are implicit—as when researchers claim that they have statistically “controlled” or “adjusted” for putative confounds—a practice that is exceedingly common in fields ranging from epidemiology to econometrics to behavioral neuroscience … The sheer ubiquity of such appeals might well give one the impression that such claims are unobjectionable, and if anything, represent a foundational tool for drawing meaningful scientific inferences.
Unfortunately, incremental validity claims can be deeply problematic. As we demonstrate below, even small amounts of error in measured predictor variables can result in extremely poorly calibrated Type 1 error probabilities. This basic problem has been discussed in a number of literatures—most extensively, in epidemiology and biostatistics, where concerns about incremental validity claims are often discussed under the heading of residual confounding, but also in fields ranging from psychology to education to econometrics. The common thread is that measurement unreliability and model misspecification will often have a deleterious and large effect on parameter estimates (and associated error rates) when covariates are entered into regression-based model. Consequently, under realistic assumptions, it can be shown that a large proportion of incremental validity claims in many disciplines are likely to be false …
In any given analysis, there is a simple fact of the matter as to whether or not the unique contribution of one or more variables in a regression is statistically significant when controlling for other variables; what room is there for inferential error? Trouble arises, however, when researchers behave as if statistical conclusions obtained at the level of observed measures can be automatically generalized to the level of latent constructs — a near-ubiquitous move, given that most scientists are not interested in prediction purely for prediction’s sake, and typically choose their measures precisely so as to stand in for latent constructs of interest. That is, researchers typically do not care to show that, say, school vouchers are associated with improved academic performance after controlling for a specific survey item asking about respondents’ income bracket; rather, the goal is to show that the vouchers may improve performance after accounting for the general construct of income (or, more generally, socioeconomic status).
The wide conviction of the superiority of the methods of the science has converted the econometric community largely to a group of fundamentalist guards of mathematical rigour. It is often the case that mathemical rigour is held as the dominant goal and the criterion for research topic choice as well as research evaluation, so much so that the relevance of the research to business cycles is reduced to empirical illustrations. To that extent, probabilistic formalization has trapped econometric business cycle research in the pursuit of means at the expense of ends.
Once the formalization attempts have gone significantly astray from what is needed for analysing and forecasting the multi-faceted characteristics of business cycles, the research community should hopefully make appropriate ‘error corrections’ of its overestimation of the power of a priori postulated models as well as its underestimation of the importance of the historical approach, or the ‘art’ dimension of business cycle research.
Duo Qin A History of Econometrics (OUP 2013)
The tool of statistical inference becomes available as the result of a self-imposed limitation of the universe of discourse. It is assumed that the available observations have been generated by a probability law or stochastic process about which some incomplete knowledge is available a priori …
It should be kept in mind that the sharpness and power of these remarkable tools of inductive reasoning are bought by willingness to adopt a specification of the universe in a form suitable for mathematical analysis.
Yes indeed — using statistics and econometrics to make inferences you have to make lots of (mathematical) tractability assumptions. And especially since econometrics aspires to explain things in terms of causes and effects, it needs loads of assumptions, such as e.g. invariance, additivity and linearity.
Limiting model assumptions in economic science always have to be closely examined since if we are going to be able to show that the mechanisms or causes that we isolate and handle in our models are stable in the sense that they do not change when we ‘export’ them to our ‘target systems,’ we have to be able to show that they do not only hold under ceteris paribus conditions. If not, they are of limited value to our explanations and predictions of real economic systems.
Unfortunately, real world social systems are usually not governed by stable causal mechanisms or capacities. The kinds of ‘laws’ and relations that econometrics has established, are laws and relations about entities in models that presuppose causal mechanisms being invariant, atomistic and additive. But — when causal mechanisms operate in the real world they mostly do it in ever-changing and unstable ways. If economic regularities obtain they do so as a rule only because we engineered them for that purpose. Outside man-made ‘nomological machines’ they are rare, or even non-existant.
So — if we want to explain and understand real-world economies we should perhaps be a little bit more cautious with using universe specifications ‘suitable for mathematical analysis’ …
Econometricians usually aim for unbiased estimates. And in econometrics textbooks you learn that if it’s not BLUE, it’s not good.
But if you really think about it, there is no real unbiased estimates. As soon as you weigh in the fact that in all econometric applications you always get your ‘unbiased’ estimates based on non-ideal randomized samples, measurement errors, non-additive and non-linear relationships, and so forth — well, then you realize there is no such a thing as ‘unbiasedness’ in the real-world.
And it’s even worse than this! ‘Randomistas’ are usually very keen to point out that their RCTs give results based on ‘unbiased’ estimator. But that doesn’t take us very far …
One should not jump to the conclusion that there is necessarily a substantive difference between drawing inferences from experimental as opposed to nonexperimental data …
In the experimental setting, the fertilizer treatment is “randomly” assigned to plots of land, whereas in the other case nature did the assignment … “Random” does not mean adequately mixed in every sample. It only means that on the average, the fertilizer treatments are adequately mixed …
Randomization implies that the least squares estimator is “unbiased,” but that definitely does not mean that for each sample the estimate is correct. Sometimes the estimate is too high, sometimes too low …
In particular, it is possible for the randomization to lead to exactly the same allocation as the nonrandom assignment … Many econometricians would insist that there is a difference, because the randomized experiment generates “unbiased” estimates. But all this means is that, if this particular experiment yields a gross overestimate, some other experiment yields a gross underestimate.
So — as soon as we realise that ‘unbiasedness’ is not the Holy Grail of econometrics, it’s easier to accept that it’s better get close to the truth with a biased estimator, than to be stuck with an ‘unbiased’ estimator that is typically not even close to truth.