To see what RCTs show, let me define the Cartesian product of X2, X3, . . . , Xn by Z. What RCTs show is that there exists some z ∈ Z, such that if we have the world in state (x, z) ∈ X instead of (y, z) ∈ X, the world in the next period will be in state a ∈ X instead of state b ∈ X. This is like saying, other things being the same (that is, z), if you vaccinate people, in the next period, there will be no influenza. But if you do not vaccinate them, there will be influenza. If we accept the determinist axiom, as many do, then this demonstration means that whenever we switch from (y, z) to (x, z), the world will switch in the next period from b to a. It is the “whenever” that makes this a causal claim. This is what I am referring to as “circumstantial causality”. Given a certain set of circumstances, changing y to x has a predictable consequence.
The discovery of circumstantial causal connections, as has happened with the rise of RCT studies, is valuable and, at the same time, of limited consequence, more so than the proponents believe. On the one hand, RCTs have given us numerous valuable descriptions of what happened in the past and numerous instances of causes in the past (provided of course that one is willing to accept the determinist axiom). On the other, what they show is very limited. This is because when they show that it was the switch from y to x that caused the switch from b to a, what they are saying is that this was true under certain historical conditions (z), but they cannot tell you what those historical conditions are. RCT discoveries never graduate from something “was a cause” of something else to something “is a cause”. RCTs give us no insight into universal causality because they cannot tell us what it was that was being held constant (z in the above example), when we switched some intervention b to a. For Bengal, in a certain period, electing a woman leader of the local government caused water provisioning to be better. This is no guide to the future because we do not fully know what Bengal in a certain period is like. Henceforth, a reference to causality without a qualifying epithet should be taken to be a reference to universal causality because for policy purposes, that is what is of essence.
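The gap between "was a cause" and "is a cause" can be made concrete with a toy deterministic world. This is only a sketch: the function `next_state`, the labels `z1` and `z2`, and the rule linking them are hypothetical and chosen purely for illustration. The trial is run under circumstances `z1` that nobody records, and the policy is later applied under `z2`.

```python
# Toy deterministic world (all names and rules are hypothetical).
# The next-period outcome depends on the intervention and on background
# circumstances z that the experimenter never observes.

def next_state(intervention: str, z: str) -> str:
    """Return the next-period outcome, 'a' or 'b'."""
    if z == "z1":
        # Under circumstances z1, switching from y to x switches b to a.
        return "a" if intervention == "x" else "b"
    # Under any other circumstances the intervention makes no difference.
    return "b"

# What the trial establishes: under the unrecorded z that held at the time,
# x leads to a and y leads to b.
trial_effect = (next_state("x", "z1"), next_state("y", "z1"))

# What policy needs: the effect under tomorrow's circumstances, here z2.
policy_effect = (next_state("x", "z2"), next_state("y", "z2"))

print(trial_effect)   # ('a', 'b')
print(policy_effect)  # ('b', 'b')
```

In this toy world the trial result is perfectly valid and perfectly deterministic, yet it says nothing about the second case unless one already knows which z is in force.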
If the centuries-old struggle with the problem of finding an analytical definition of probability has produced only endless controversies between the various doctrines, it is, in my opinion, because too little attention has been paid to the singular notion of random. For the dialectical root, in fact, lies in this notion: probability is only an arithmetical aspect of it.
Modern probabilistic econometrics relies on the notion of probability. To be amenable at all to econometric analysis, economic observations allegedly have to be conceived as random events.
But is it really necessary to model the economic system as a system where randomness can only be analyzed and understood when based on an a priori notion of probability?
In probabilistic econometrics, events and observations are as a rule interpreted as random variables as if generated by an underlying probability density function, and, a fortiori, since probability density functions are only definable in a probability context, consistent with a probability. As Haavelmo has it in ‘The probability approach in econometrics’ (1944):
For no tool developed in the theory of statistics has any meaning – except, perhaps for descriptive purposes – without being referred to some stochastic scheme.
When attempting to convince us of the necessity of founding empirical economic analysis on probability models, Haavelmo – building largely on the earlier Fisherian paradigm – actually forces econometrics to (implicitly) interpret events as random variables generated by an underlying probability density function.
This is at odds with reality. Randomness obviously is a fact of the real world. Probability, on the other hand, attaches to the world via intellectually constructed models, and a fortiori is only a fact of a probability generating machine or a well constructed experimental arrangement or “chance set-up”.
Just as there is no such thing as a “free lunch,” there is no such thing as a “free probability.” To be able to talk about probabilities at all, you have to specify a model. If there is no chance set-up or model that generates the probabilistic outcomes or events – in statistics one refers to any process where you observe or measure as an experiment (rolling a die) and the results obtained as the outcomes or events (number of points rolled with the die, being e.g. 3 or 5) of the experiment – there is, strictly speaking, no event at all.
Probability is a relational element. It always must come with a specification of the model from which it is calculated. And then to be of any empirical scientific value it has to be shown to coincide with (or at least converge to) real data generating processes or structures – something seldom or never done!
And this is the basic problem with economic data. If you have a fair roulette-wheel, you can arguably specify probabilities and probability density distributions. But how do you conceive of the analogous nomological machines for prices, gross domestic product, income distribution etc? Only by a leap of faith. And that does not suffice. You have to come up with some really good arguments if you want to persuade people to believe in the existence of socio-economic structures that generate data with characteristics conceivable as stochastic events portrayed by probability density distributions!
From a realistic point of view we really have to admit that the socio-economic states of nature that we talk of in most social sciences – and certainly in econometrics – are not amenable to analysis as probabilities, simply because in the real-world open systems that social sciences – including econometrics – analyze, there are no probabilities to be had!
The processes that generate socio-economic data in the real world cannot just be assumed to always be adequately captured by a probability measure. And, so, it cannot really be maintained – as in the Haavelmo paradigm of probabilistic econometrics – that it even should be mandatory to treat observations and data – whether cross-section, time series or panel data – as events generated by some probability model. The important activities of most economic agents do not usually include throwing dice or spinning roulette-wheels. Data generating processes – at least outside of nomological machines like dice and roulette-wheels – are not self-evidently best modeled with probability measures.
If we agree on this, we also have to admit that probabilistic econometrics lacks sound foundations. I would even go further and argue that there really is no justifiable rationale at all for this belief that all economically relevant data can be adequately captured by a probability measure. In most real world contexts one has to argue one’s case. And that is obviously something seldom or never done by practitioners of probabilistic econometrics.
Econometrics and probability are intermingled with randomness. But what is randomness?
In probabilistic econometrics it is often defined with the help of independent trials – two events are said to be independent if the occurrence or nonoccurrence of either one has no effect on the probability of the occurrence of the other – as drawing cards from a deck, picking balls from an urn, spinning a roulette wheel or tossing coins – trials which are only definable if somehow set in a probabilistic context.
But if we pick a sequence of prices – say 2, 4, 3, 8, 5, 6, 6 – that we want to use in an econometric regression analysis, how do we know that the sequence is random and, a fortiori, that it can be treated as generated by an underlying probability density function? How can we argue that the sequence is a sequence of probabilistically independent random prices? And are they really random in the sense that is most often applied in probabilistic econometrics – where X is called a random variable only if there is a sample space S with a probability measure and X is a real-valued function over the elements of S?
Bypassing the scientific challenge of going from describable randomness to calculable probability by just assuming it, is of course not an acceptable procedure. Since a probability density function is a “Gedanken” object that does not exist in a natural sense, it has to come with an export license to our real target system if it is to be considered usable.
Among those who at least honestly try to face the problem, the usual procedure is to refer to some artificial mechanism operating in some “games of chance” of the kind mentioned above and which generates the sequence. But then we still have to show that the real sequence somehow coincides with the ideal sequence that defines independence and randomness within our – to speak with philosopher of science Nancy Cartwright – “nomological machine”, our chance set-up, our probabilistic model.
As Rudolf Kalman, the originator of the Kalman filter, notes in ‘Randomness Reexamined’ (1994):
Not being able to test a sequence for ‘independent randomness’ (without being told how it was generated) is the same thing as accepting that reasoning about an “independent random sequence” is not operationally useful.
Probability is a property of the model we choose to use in our endeavour to understand and explain the world in which we live — but probability is not a property of that world.
So why should we define randomness with probability? If we do, we have to accept that to speak of randomness we also have to presuppose the existence of nomological probability machines, since probabilities cannot be spoken of – and actually, to be strict, do not at all exist – without specifying such system-contexts (how many sides do the dice have, are the cards unmarked, etc.).
If we do adhere to the Fisher-Haavelmo paradigm of probabilistic econometrics we also have to assume that all noise in our data is probabilistic and that errors are well-behaved, something that is hard to justifiably argue for as a real phenomenon, and not just an operationally and pragmatically tractable assumption.
Maybe Kalman’s verdict that
Haavelmo’s error that randomness = (conventional) probability is just another example of scientific prejudice
is, seen from this perspective, not far-fetched.
Accepting Haavelmo’s domain of probability theory and sample space of infinite populations – just as Fisher’s “hypothetical infinite population, of which the actual data are regarded as constituting a random sample”, von Mises’ “collective” or Gibbs’ “ensemble” – also implies that judgments are made on the basis of observations that are actually never made!
Infinitely repeated trials or samplings never take place in the real world. So that cannot be a sound inductive basis for a science with aspirations of explaining real-world socio-economic processes, structures or events. It’s not tenable.
This importantly also means that if you cannot show that data satisfies all the conditions of the probabilistic nomological machine – including randomness – then the statistical inferences used lack sound foundations.
And in the video below (in Swedish) yours truly shows how to perform a logit regression using Gretl.
Distinguished social psychologist Richard E. Nisbett has a somewhat atypical aversion to multiple regression analysis. In his Intelligence and How to Get It (Norton 2011) he wrote (p. 17):
Researchers often determine the individual’s contemporary IQ or IQ earlier in life, socioeconomic status of the family of origin, living circumstances when the individual was a child, number of siblings, whether the family had a library card, educational attainment of the individual, and other variables, and put all of them into a multiple-regression equation predicting adult socioeconomic status or income or social pathology or whatever. Researchers then report the magnitude of the contribution of each of the variables in the regression equation, net of all the others (that is, holding constant all the others). It always turns out that IQ, net of all the other variables, is important to outcomes. But … the independent variables pose a tangle of causality – with some causing others in goodness-knows-what ways and some being caused by unknown variables that have not even been measured. Higher socioeconomic status of parents is related to educational attainment of the child, but higher-socioeconomic-status parents have higher IQs, and this affects both the genes that the child has and the emphasis that the parents are likely to place on education and the quality of the parenting with respect to encouragement of intellectual skills and so on. So statements such as “IQ accounts for X percent of the variation in occupational attainment” are built on the shakiest of statistical foundations. What nature hath joined together, multiple regressions cannot put asunder.
And now he is back with a half-hour lecture – The Crusade Against Multiple Regression Analysis – posted on The Edge website a week ago.
Now, I think that what Nisbett says is right as far as it goes, although it would certainly have strengthened Nisbett’s argumentation if he had elaborated more on the methodological question around causality, or at least had given some mathematical-statistical-econometric references. Unfortunately, his alternative approach is not more convincing than regression analysis. Like so many other contemporary social scientists, Nisbett seems to think that randomization may solve the empirical problem. By randomizing we are getting different “populations” that are homogeneous with regard to all variables except the one we think is a genuine cause. In this way we are supposed to be able to do without actually knowing what all these other factors are.
If you succeed in performing an ideal randomization with different treatment groups and control groups, that is attainable. But it presupposes that you really have been able to establish – and not just assume – that all other causes but the putative one have the same probability distribution in the treatment and control groups, and that the probability of assignment to treatment or control groups is independent of all other possible causal variables.
Unfortunately, real experiments and real randomizations seldom or never achieve this. So, yes, we may do without knowing all causes, but it takes ideal experiments and ideal randomizations to do that, not real ones.
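The difference between an ideal and a real randomization can be sketched with a small simulation. It assumes, purely for illustration, a single unobserved covariate drawn from a standard normal distribution; the function names and sample sizes are hypothetical. Random assignment balances the covariate in expectation, but any single finite randomization leaves some imbalance, and the smaller the sample the larger that imbalance tends to be.

```python
import random
import statistics

def covariate_imbalance(n: int, rng: random.Random) -> float:
    """Randomly split n units into two equal groups and return the absolute
    difference in the group means of an unobserved covariate."""
    hidden = [rng.gauss(0.0, 1.0) for _ in range(n)]  # assumed covariate
    rng.shuffle(hidden)                               # random assignment
    treated, control = hidden[: n // 2], hidden[n // 2:]
    return abs(statistics.fmean(treated) - statistics.fmean(control))

def average_imbalance(n: int, reps: int = 200, seed: int = 0) -> float:
    """Average the imbalance over many independent randomizations."""
    rng = random.Random(seed)
    return statistics.fmean([covariate_imbalance(n, rng) for _ in range(reps)])

small = average_imbalance(20)
large = average_imbalance(2000)
print(f"mean |imbalance| with n=20:   {small:.3f}")
print(f"mean |imbalance| with n=2000: {large:.3f}")
```

The simulation is itself a chance set-up in which the probability model holds by construction. Even so, a single trial never delivers exact balance, which is why balance on unmeasured causes has to be argued for and cannot simply be read off from the fact that assignment was random.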
As I have argued elsewhere, that means that in practice we do have to have sufficient background knowledge to deduce causal knowledge. Without old knowledge, we can’t get new knowledge – and, no causes in, no causes out.
Nisbett is well worth reading and listening to, but on the issue of the shortcomings of multiple regression analysis, no one sums it up better than eminent mathematical statistician David Freedman in his Statistical Models and Causal Inference:
If the assumptions of a model are not derived from theory, and if predictions are not tested against reality, then deductions from the model must be quite shaky. However, without the model, the data cannot be used to answer the research question …
In my view, regression models are not a particularly good way of doing empirical work in the social sciences today, because the technique depends on knowledge that we do not have. Investigators who use the technique are not paying adequate attention to the connection – if any – between the models and the phenomena they are studying. Their conclusions may be valid for the computer code they have created, but the claims are hard to transfer from that microcosm to the larger world …
Regression models often seem to be used to compensate for problems in measurement, data collection, and study design. By the time the models are deployed, the scientific position is nearly hopeless. Reliance on models in such cases is Panglossian …
Given the limits to present knowledge, I doubt that models can be rescued by technical fixes. Arguments about the theoretical merit of regression or the asymptotic behavior of specification tests for picking one version of a model over another seem like the arguments about how to build desalination plants with cold fusion as the energy source. The concept may be admirable, the technical details may be fascinating, but thirsty people should look elsewhere …
Causal inference from observational data presents many difficulties, especially when underlying mechanisms are poorly understood. There is a natural desire to substitute intellectual capital for labor, and an equally natural preference for system and rigor over methods that seem more haphazard. These are possible explanations for the current popularity of statistical models.
Indeed, far-reaching claims have been made for the superiority of a quantitative template that depends on modeling – by those who manage to ignore the far-reaching assumptions behind the models. However, the assumptions often turn out to be unsupported by the data. If so, the rigor of advanced quantitative methods is a matter of appearance rather than substance.
There have been over four decades of econometric research on business cycles … The formalization has undeniably improved the scientific strength of business cycle measures …
But the significance of the formalization becomes more difficult to identify when it is assessed from the applied perspective, especially when the success rate in ex-ante forecasts of recessions is used as a key criterion. The fact that the onset of the 2008 financial-crisis-triggered recession was predicted by only a few ‘Wise Owls’ … while missed by regular forecasters armed with various models serves us as the latest warning that the efficiency of the formalization might be far from optimal. Remarkably, not only has the performance of time-series data-driven econometric models been off the track this time, so has that of the whole bunch of theory-rich macro dynamic models developed in the wake of the rational expectations movement, which derived its fame mainly from exploiting the forecast failures of the macro-econometric models of the mid-1970s recession.
The limits of econometric forecasting have, as noted by Qin, been critically pointed out many times before.
Trygve Haavelmo – with the completion (in 1958) of the twenty-fifth volume of Econometrica – assessed the role of econometrics in the advancement of economics, and although mainly positive about the “repair work” and “clearing-up work” done, Haavelmo also found some grounds for despair:
We have found certain general principles which would seem to make good sense. Essentially, these principles are based on the reasonable idea that, if an economic model is in fact “correct” or “true,” we can say something a priori about the way in which the data emerging from it must behave. We can say something, a priori, about whether it is theoretically possible to estimate the parameters involved. And we can decide, a priori, what the proper estimation procedure should be … But the concrete results of these efforts have often been a seemingly lower degree of accuracy of the would-be economic laws (i.e., larger residuals), or coefficients that seem a priori less reasonable than those obtained by using cruder or clearly inconsistent methods.
There is the possibility that the more stringent methods we have been striving to develop have actually opened our eyes to recognize a plain fact: viz., that the “laws” of economics are not very accurate in the sense of a close fit, and that we have been living in a dream-world of large but somewhat superficial or spurious correlations.
And as the quote below shows, even Ragnar Frisch shared some of Haavelmo’s — and Keynes’s — doubts on the applicability of econometrics:
I have personally always been skeptical of the possibility of making macroeconomic predictions about the development that will follow on the basis of given initial conditions … I have believed that the analytical work will give higher yields – now and in the near future – if they become applied in macroeconomic decision models where the line of thought is the following: “If this or that policy is made, and these conditions are met in the period under consideration, probably a tendency to go in this or that direction is created”.
Maintaining that economics is a science in the “true knowledge” business, I remain a skeptic of the pretences and aspirations of econometrics. So far, I cannot really see that it has yielded very much in terms of relevant, interesting economic knowledge. And, more specifically, when it comes to forecasting activities, the results have been bleak indeed.
Firmly stuck in an empiricist tradition, econometrics is only concerned with the measurable aspects of reality. But there is always the possibility that there are other variables – of vital importance and, although perhaps unobservable and non-additive, not necessarily epistemologically inaccessible – that were not considered for the model. Those that were can hence never be guaranteed to be more than potential causes, and not real causes.
A perusal of the leading econom(etr)ic journals shows that most econometricians still concentrate on fixed parameter models and that parameter-values estimated in specific spatio-temporal contexts are presupposed to be exportable to totally different contexts. To warrant this assumption one, however, has to convincingly establish that the targeted acting causes are stable and invariant so that they maintain their parametric status after the bridging. The endemic lack of predictive success of the econometric project indicates that this hope of finding fixed parameters is a hope for which there really is no other ground than hope itself.
When causal mechanisms operate in real-world social target systems they only do it in ever-changing and unstable combinations where the whole is more than a mechanical sum of parts. If economic regularities obtain they do it (as a rule) only because we engineered them for that purpose. Outside man-made “nomological machines” they are rare, or even non-existent. Unfortunately that also makes most of the achievements of econometric forecasting rather useless.
The initial choice of a prior probability distribution is not regulated in any way. The probabilities, called subjective or personal probabilities, reflect personal degrees of belief. From a Bayesian philosopher’s point of view, any prior distribution is as good as any other. Of course, from a Bayesian decision maker’s point of view, his own beliefs, as expressed in his prior distribution, may be better than any other beliefs, but Bayesianism provides no means of justifying this position. Bayesian rationality rests in the recipe alone, and the choice of the prior probability distribution is arbitrary as far as the issue of rationality is concerned. Thus, two rational persons with the same goals may adopt prior distributions that are wildly different …
Bayesian learning is completely inflexible after the initial choice of probabilities: all beliefs that result from new observations have been fixed in advance. This holds because the new probabilities are just equal to certain old conditional probabilities …
According to the Bayesian recipe, the initial choice of a prior probability distribution is arbitrary. But the probability calculus might still rule out some sequences of beliefs and thus prevent complete arbitrariness.
Actually, however, this is not the case: nothing is ruled out by the probability calculus …
Thus, anything goes … By adopting a suitable prior probability distribution, we can fix the consequences of any observations for our beliefs in any way we want. This result, which will be referred to as the anything-goes theorem, holds for arbitrarily complicated cases and any number of observations. It implies, among other consequences, that two rational persons with the same goals and experiences can, in all eternity, differ arbitrarily in their beliefs about future events …
From a Bayesian point of view, any beliefs and, consequently, any decisions are as rational or irrational as any other, no matter what our goals and experiences are. Bayesian rationality is just a probabilistic version of irrationalism. Bayesians might say that somebody is rational only if he actually rationalizes his actions in the Bayesian way. However, given that such a rationalization always exists, it seems a bit pedantic to insist that a decision maker should actually provide it.
Nice to see that the son of Hans Albert is keeping critical rationalism alive …
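The prior-dependence described in the quotation above can be illustrated with the simplest textbook case. This is a sketch of a much weaker point than the anything-goes theorem itself: it only shows that two agents who follow the same updating recipe on the same data can end up far apart. The Beta priors, the data, and the labels `optimist` and `pessimist` are assumptions chosen for illustration.

```python
from fractions import Fraction

def posterior_mean(alpha: int, beta: int, successes: int, failures: int) -> Fraction:
    """Posterior mean of a success probability under a Beta(alpha, beta)
    prior after observing the given binomial data (conjugate update)."""
    return Fraction(alpha + successes, alpha + beta + successes + failures)

# Both agents see exactly the same evidence: 7 successes and 3 failures.
successes, failures = 7, 3

optimist = posterior_mean(50, 1, successes, failures)   # prior mass near 1
pessimist = posterior_mean(1, 50, successes, failures)  # prior mass near 0

print(optimist, float(optimist))    # 57/61, roughly 0.93
print(pessimist, float(pessimist))  # 8/61, roughly 0.13
```

Both agents are equally "rational" by the Bayesian recipe, and nothing in the recipe says which prior was the right one to start from.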
Suppose you test a highly confirmed hypothesis, for example, that the price elasticity of demand is negative. What would you do if the computer were to spew out a positive coefficient? Surely you would not claim to have overthrown the law of demand … Instead, you would rerun many variants of your regression until the recalcitrant computer finally acknowledged the sovereignty of your theory …
Only the naive are shocked by such soft and gentle testing … Easy it is. But also wrong, when the purpose of the exercise is not to use a hypothesis, but to determine its validity …
Econometric tests are far from useless. They are worth doing, and their results do tell something … But many economists insist that economics can deliver more, much more, than merely, more or less, plausible knowledge, that it can reach its results with compelling demonstrations. By such a standard how should one describe our usual way of testing hypotheses? One possibility is to interpret it as Blaug [The Methodology of Economics, 1980, p. 256] does, as ‘playing tennis with the net down’ …
Perhaps my charge that econometric testing lacks seriousness of purpose is wrong … But regardless of the cause, it should be clear that most econometric testing is not rigorous. Combining such tests with formalized theoretical analysis or elaborate techniques is another instance of the principle of the strongest link. The car is sleek and elegant; too bad the wheels keep falling off.
And who said learning statistics can’t be fun?
[Actually the Ronald Fisher appearing in the video is a mixture of the real Ronald Fisher and Jerzy Neyman and Egon Pearson, but that’s for another blogpost.]
For my own critical view on the value of p values — see e. g. here.
Many thanks for sending me your article. I enjoyed it very much. I am sure these matters need discussing in that sort of way. There is one point, to which in practice I attach a great importance, you do not allude to. In many of these statistical researches, in order to get enough observations they have to be scattered over a lengthy period of time; and for a lengthy period of time it very seldom remains true that the environment is sufficiently stable. That is the dilemma of many of these enquiries, which they do not seem to me to face. Either they are dependent on too few observations, or they cannot rely on the stability of the environment. It is only rarely that this dilemma can be avoided.
Letter from J. M. Keynes to T. Koopmans, May 29, 1941
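Keynes's dilemma can be sketched with a deliberately noise-free example. The numbers and names are hypothetical: the relation y = b·x holds with b = 1 in an early regime and b = 3 in a later one. Pooling both periods to get more observations yields a slope that was true in neither.

```python
def slope_through_origin(xs, ys):
    """Least-squares slope for the model y = b * x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

xs = [1.0, 2.0, 3.0, 4.0] * 2      # same regressor values in both regimes
betas = [1.0] * 4 + [3.0] * 4      # the environment shifts halfway through
ys = [b * x for b, x in zip(betas, xs)]

early = slope_through_origin(xs[:4], ys[:4])
late = slope_through_origin(xs[4:], ys[4:])
pooled = slope_through_origin(xs, ys)

print(early, late, pooled)  # 1.0 3.0 2.0
```

Splitting the sample recovers each regime but halves the observations. Pooling doubles them but estimates a parameter that describes no actual period.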
Almost a hundred years after John Maynard Keynes wrote his seminal A Treatise on Probability (1921), it is still very difficult to find statistics books that seriously try to incorporate his far-reaching and incisive analysis of induction and evidential weight.
The standard view in statistics – and the axiomatic probability theory underlying it – is to a large extent based on the rather simplistic idea that “more is better.” But as Keynes argues – “more of the same” is not what is important when making inductive inferences. It’s rather a question of “more but different.”
Variation, not replication, is at the core of induction. Finding that p(x|y) = p(x|y & w) doesn’t make w “irrelevant.” Knowing that the probability is unchanged when w is present gives p(x|y & w) another evidential weight (“weight of argument”). Running 10 replicative experiments does not make you as “sure” of your inductions as when running 10 000 varied experiments – even if the probability values happen to be the same.
According to Keynes we live in a world permeated by unmeasurable uncertainty – not quantifiable stochastic risk – which often forces us to make decisions based on anything but “rational expectations.” Keynes rather thinks that we base our expectations on the confidence or “weight” we put on different events and alternatives. To Keynes, expectations are a question of weighing probabilities by “degrees of belief,” beliefs that often have precious little to do with the kind of stochastic probabilistic calculations made by the rational agents as modeled by “modern” social sciences. And often we “simply do not know.” As Keynes writes in Treatise:
The kind of fundamental assumption about the character of material laws, on which scientists appear commonly to act, seems to me to be [that] the system of the material universe must consist of bodies … such that each of them exercises its own separate, independent, and invariable effect, a change of the total state being compounded of a number of separate changes each of which is solely due to a separate portion of the preceding state … Yet there might well be quite different laws for wholes of different degrees of complexity, and laws of connection between complexes which could not be stated in terms of laws connecting individual parts … If different wholes were subject to different laws qua wholes and not simply on account of and in proportion to the differences of their parts, knowledge of a part could not lead, it would seem, even to presumptive or probable knowledge as to its association with other parts … These considerations do not show us a way by which we can justify induction … /427 No one supposes that a good induction can be arrived at merely by counting cases. The business of strengthening the argument chiefly consists in determining whether the alleged association is stable, when accompanying conditions are varied … /468 In my judgment, the practical usefulness of those modes of inference … on which the boasted knowledge of modern science depends, can only exist … if the universe of phenomena does in fact present those peculiar characteristics of atomism and limited variety which appears more and more clearly as the ultimate result to which material science is tending.
Science according to Keynes should help us penetrate to “the true process of causation lying behind current events” and disclose “the causal forces behind the apparent facts.” Models can never be more than a starting point in that endeavour. He further argued that it was inadmissible to project history on the future. Consequently we cannot presuppose that what has worked before, will continue to do so in the future. That statistical models can get hold of correlations between different “variables” is not enough. If they cannot get at the causal structure that generated the data, they are not really “identified.”
How strange that economists and other social scientists as a rule do not even touch upon these aspects of scientific methodology that seem to be so fundamental and important for anyone trying to understand how we learn and orient ourselves in an uncertain world. An educated guess on why this is so would be that Keynes’s concepts are not possible to squeeze into a single calculable numerical “probability.” In the quest for quantities one turns a blind eye to qualities and looks the other way – but Keynes’s ideas keep creeping out from under the statistics carpet.