Getting it right about the causal structure of a real system in front of us is often a matter of great importance. It is not appropriate to offer the authority of formalism over serious consideration of what are the best assumptions to make about the structure at hand …
Where we don’t know, we don’t know. When we have to proceed with little information we should make the best evaluation we can for the case at hand — and hedge our bets heavily; we should not proceed with false confidence having plumped either for or against some specific hypothesis … for how the given system works when we really have no idea.
Trying to get around this lack of knowledge, mainstream economists, in their quest for deductive certainty in their models, standardly assume things like ‘independence,’ ‘linearity,’ ‘additivity,’ ‘stability,’ ‘manipulability,’ ‘variation-free variables,’ ‘faithfulness,’ ‘invariance,’ ‘implementation neutrality,’ ‘superexogeneity,’ etc., etc.
This can’t be the right way to tackle real-world problems. If those conditions do not hold, almost everything in those models is lost. The price paid for deductive certainty is an exceedingly narrow scope. By this I do not mean to say that we have to discard all (causal) theories/laws building on ‘stability,’ ‘invariance,’ etc. But we have to acknowledge the fact that outside the systems that possibly fulfil these assumptions, they are of little substantial value. Running paper-and-pen experiments on artificial ‘analogue’ model economies is a sure way of ‘establishing’ (causal) economic laws or solving intricate problems — in the model world. But they are pure substitutes for the real thing and they don’t have much bearing on what goes on in real-world open social systems. Deductive systems are powerful. But one single false premise and all power is gone. Setting up convenient circumstances for conducting thought experiments may tell us a lot about what happens under those kinds of circumstances. But — few, if any, real-world social systems are ‘convenient.’ So most of those systems, theories and models are irrelevant for letting us know what we really want to know.
Limiting model assumptions in economic science always have to be closely examined. The results we get in models are only as sure as the assumptions on which they build — and if the economist doesn’t give any guidance on how to apply his models to real-world systems, he doesn’t deserve our attention. Of course one can always say — as James Heckman does — that it is relatively straightforward to define causality “when the causes can be independently varied.” But what good does that do when we know for a fact that real-world causes can almost never be independently varied?
Building models can’t be a goal in itself. Good models are means that make it possible for us to infer things about the real-world systems they ‘represent.’ If we can’t show that the mechanisms or causes that we isolate and handle in our models are ‘exportable’ to the real world, they are of limited value to our understanding, explanations or predictions of real economic systems.
The kind of fundamental assumption about the character of material laws, on which scientists appear commonly to act, seems to me to be much less simple than the bare principle of uniformity. They appear to assume something much more like what mathematicians call the principle of the superposition of small effects, or, as I prefer to call it, in this connection, the atomic character of natural law. The system of the material universe must consist, if this kind of assumption is warranted, of bodies which we may term (without any implication as to their size being conveyed thereby) legal atoms, such that each of them exercises its own separate, independent, and invariable effect, a change of the total state being compounded of a number of separate changes each of which is solely due to a separate portion of the preceding state. We do not have an invariable relation between particular bodies, but nevertheless each has on the others its own separate and invariable effect, which does not change with changing circumstances, although, of course, the total effect may be changed to almost any extent if all the other accompanying causes are different. Each atom can, according to this theory, be treated as a separate cause and does not enter into different organic combinations in each of which it is regulated by different laws …
The scientist wishes, in fact, to assume that the occurrence of a phenomenon which has appeared as part of a more complex phenomenon, may be some reason for expecting it to be associated on another occasion with part of the same complex. Yet if different wholes were subject to laws qua wholes and not simply on account of and in proportion to the differences of their parts, knowledge of a part could not lead, it would seem, even to presumptive or probable knowledge as to its association with other parts. Given, on the other hand, a number of legally atomic units and the laws connecting them, it would be possible to deduce their effects pro tanto without an exhaustive knowledge of all the coexisting circumstances.
Real-world social systems are usually not governed by stable causal mechanisms or capacities. The kinds of ‘laws’ and relations that e.g. econometrics has established are laws and relations about entities in models that presuppose causal mechanisms being invariant and atomistic. But — when causal mechanisms operate in the real world they do so only in ever-changing and unstable combinations where the whole is more than a mechanical sum of parts. If economic regularities obtain, they do so as a rule only because we engineered them for that purpose. Outside man-made ‘nomological machines’ they are rare, or even non-existent.
Since there is no absolutely certain knowledge at hand in social sciences — including economics — explicit argumentation and justification ought to play an extremely important role if the purported knowledge claims are to be sustainably warranted. Without careful supporting arguments, building ‘convenient’ analogue models of real-world phenomena accomplishes absolutely nothing.
So we better follow Cartwright’s advice:
Where we don’t know, we don’t know. When we have to proceed with little information we should make the best evaluation we can for the case at hand — and hedge our bets heavily.
When we cannot accept that the observations, along the time-series available to us, are independent, or cannot by some device be divided into groups that can be treated as independent, we get into much deeper water. For we have then, in strict logic, no more than one observation, all of the separate items having to be taken together. For the analysis of that the probability calculus is useless; it does not apply. We are left to use our judgement, making sense of what has happened as best we can, in the manner of the historian. Applied economics does then come back to history, after all.
I am bold enough to conclude, from these considerations that the usefulness of ‘statistical’ or ‘stochastic’ methods in economics is a good deal less than is now conventionally supposed. We have no business to turn to them automatically; we should always ask ourselves, before we apply them, whether they are appropriate to the problem at hand. Very often they are not. Thus it is not at all sensible to take a small number of observations (sometimes no more than a dozen observations) and to use the rules of probability to deduce from them a ‘significant’ general law. For we are assuming, if we do so, that the variations from one to another of the observations are random, so that if we had a larger sample (as we do not) they would by some averaging tend to disappear. But what nonsense this is when the observations are derived, as not infrequently happens, from different countries, or localities, or industries — entities about which we may well have relevant information, but which we have deliberately decided, by our procedure, to ignore. By all means let us plot the points on a chart, and try to explain them; but it does not help in explaining them to suppress their names. The probability calculus is no excuse for forgetfulness.
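Hicks’ point about deducing a ‘significant’ general law from a dozen pooled observations can be made concrete with a stylized sketch (all figures invented): within each of three ‘countries’ the variable y is unrelated to x, yet pooling the twelve observations and suppressing the country names manufactures a highly ‘significant’ slope.

```python
# A stylized illustration with made-up numbers: twelve observations from
# three 'countries'. Within each country y is unrelated to x, but the
# country-level differences line up with x. Pooling the data and
# 'suppressing the names' yields a highly 'significant' pooled slope.

x = list(range(1, 13))
y = [11, 9, 11, 9,      # country A: level 10, no within-country trend
     21, 19, 21, 19,    # country B: level 20
     31, 29, 31, 29]    # country C: level 30

n = len(x)
mx, my = sum(x) / n, sum(y) / n
sxx = sum((xi - mx) ** 2 for xi in x)
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
syy = sum((yi - my) ** 2 for yi in y)

slope = sxy / sxx                            # OLS slope on the pooled data
ssr = syy - slope * sxy                      # residual sum of squares
se = (ssr / (n - 2) / sxx) ** 0.5            # standard error of the slope
t_stat = slope / se                          # far above conventional thresholds

print(round(slope, 2), round(t_stat, 2))
```

The ‘law’ here is entirely an artefact of the suppressed country levels; nothing in the t-statistic warns us of that.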
John Hicks’ Causality in Economics is an absolute masterpiece. It ought to be on the reading list of every course in economic methodology.
One of the limitations of economics is the restricted possibility of performing experiments, forcing it to rely mainly on observational studies for knowledge of real-world economies.
But still — the idea of performing laboratory experiments holds a firm grip on our wish to discover (causal) relationships between economic ‘variables.’ If only we could isolate and manipulate variables in controlled environments, we would probably find ourselves in a situation where we could, with greater ‘rigour’ and ‘precision,’ describe, predict, or explain economic happenings in terms of ‘structural’ causes, ‘parameter’ values of relevant variables, and economic ‘laws.’
Galileo Galilei’s experiments are often held up as exemplary of how to perform experiments to learn something about the real world. According to Nancy Cartwright (Hunting Causes and Using Them, p. 223), Galileo’s experiments were
designed to find out what contribution the motion due to the pull of the earth will make, with the assumption that the contribution is stable across all the different kinds of situations falling bodies will get into … He eliminated (as far as possible) all other causes of motion on the bodies in his experiment so that he could see how they move when only the earth affects them. That is the contribution that the earth’s pull makes to their motion.
Galileo’s heavy balls dropping from the tower of Pisa confirmed that the distance an object falls is proportional to the square of time, and that this law (empirical regularity) of falling bodies could be applicable outside a vacuum tube when e.g. air resistance is negligible.
The big problem is to decide or find out exactly for which objects air resistance (and other potentially ‘confounding’ factors) is ‘negligible.’ In the case of heavy balls, air resistance is obviously negligible, but how about feathers or plastic bags?
One possibility is to take the all-encompassing-theory road and find out all about the possible disturbing/confounding factors — not only air resistance — influencing the fall, and build that into one great model delivering accurate predictions of what happens when the object that falls is not only a heavy ball but a feather or a plastic bag. This usually amounts to ultimately stating some kind of ceteris paribus interpretation of the ‘law.’
Another road to take would be to concentrate on the negligibility assumption and to specify the domain of applicability to be only heavy compact bodies. The price you have to pay for this is that (1) ‘negligibility’ may be hard to establish in open real-world systems, (2) the generalisation you can make from ‘sample’ to ‘population’ is heavily restricted, and (3) you actually have to use some ‘shoe leather’ and empirically try to find out how large the ‘reach’ of the ‘law’ is.
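What ‘negligible’ amounts to can be put in rough numbers. The sketch below — all parameter values purely illustrative, not measurements — integrates a fall with quadratic drag: a heavy compact ball lands almost exactly when the vacuum law t = √(2h/g) says it should, while a feather-like object takes many times longer.

```python
# Rough numerical sketch of when air resistance is 'negligible'
# (all parameter values illustrative). Fall from h = 50 m with
# quadratic drag: m*dv/dt = m*g - k*v^2, where k lumps together
# drag coefficient, air density, and cross-section.

def fall_time(mass, k, h=50.0, g=9.81, dt=1e-4):
    """Step-by-step (Euler) integration of a fall from height h;
    returns the time taken to reach the ground."""
    v, y, t = 0.0, 0.0, 0.0
    while y < h:
        a = g - (k / mass) * v * v   # gravity minus drag deceleration
        v += a * dt
        y += v * dt
        t += dt
    return t

t_vacuum = (2 * 50.0 / 9.81) ** 0.5        # the 'law': t = sqrt(2h/g)
t_ball = fall_time(mass=5.0, k=0.001)      # heavy compact ball (hypothetical values)
t_feather = fall_time(mass=0.005, k=0.01)  # feather-like object (hypothetical values)

print(round(t_vacuum, 2), round(t_ball, 2), round(t_feather, 2))
# the ball's fall time stays close to the vacuum prediction; the feather's does not
```

The same ‘law’ that is almost exact for the ball is off by a large factor for the feather — which is precisely why the domain of applicability has to be established empirically.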
In mainstream economics one has usually settled for the ‘theoretical’ road (and in case you think the present ‘natural experiments’ hype has changed anything, remember that to mimic real experiments, exceedingly stringent special conditions have to obtain).
In the end, it all boils down to one question — are there any heavy balls to be found in economics, so that we can indisputably establish the existence of economic laws operating in real-world economies?
As far as I can see there are some heavy balls out there, but not a single real economic law.
Economic factors/variables are more like feathers than heavy balls — non-negligible factors (like air resistance and chaotic turbulence) are hard to rule out as having no influence on the object studied.
Galilean experiments are hard to carry out in economics, and the theoretical ‘analogue’ models economists construct and in which they perform their ‘thought experiments’ build on assumptions that are far away from the kind of idealized conditions under which Galileo performed his experiments. The ‘nomological machines’ that Galileo and other scientists have been able to construct have no real analogues in economics. The stability, autonomy, modularity, and interventional invariance that we may find between entities in nature simply are not there in real-world economies. Those are real-world facts, and contrary to the beliefs of most mainstream economists, they won’t go away simply by applying deductive-axiomatic economic theory with tons of more or less unsubstantiated assumptions.
By this I do not mean to say that we have to discard all (causal) theories/laws building on modularity, stability, invariance, etc. But we have to acknowledge the fact that outside the systems that possibly fulfil these requirements/assumptions, they are of little substantial value. Running paper-and-pen experiments on artificial ‘analogue’ model economies is a sure way of ‘establishing’ (causal) economic laws or solving intricate econometric problems of autonomy, identification, invariance and structural stability — in the model world. But they are pure substitutes for the real thing and they don’t have much bearing on what goes on in real-world open social systems. Setting up convenient circumstances for conducting Galilean experiments may tell us a lot about what happens under those kinds of circumstances. But — few, if any, real-world social systems are ‘convenient.’ So most of those systems, theories and models, are irrelevant for letting us know what we really want to know.
To solve, understand, or explain real-world problems you actually have to know something about them — logic, pure mathematics, data simulations or deductive axiomatics don’t take you very far. Most econometrics and economic theories/models are splendid logic machines. But — applying them to the real world is a totally hopeless undertaking! The assumptions one has to make in order to successfully apply these deductive-axiomatic theories/models/machines are devastatingly restrictive and mostly empirically untestable — and hence make their real-world scope ridiculously narrow. To fruitfully analyse real-world phenomena with models and theories, you cannot build on assumptions that are patently absurd and known to be so.
No matter how much you would like the world to entirely consist of heavy balls, the world is not like that. The world also has its fair share of feathers and plastic bags.
In this paper we began by describing the position of those critical realists who are sceptical about multi-variate statistics … Some underlying assumptions of this sceptical argument were shown to be false. Then a positive case in favour of using analytical statistics as part of a mixed-methods methodology was developed. An example of the interpretation of logistic regression was used to show that the interpretation need not be atomistic or reductionist. However, we also argued that the data underlying such interpretations are ‘ficts’, i.e. are not true in themselves, and cannot be considered to be accurate or true descriptions of reality. Instead, the validity of the interpretations of such data are what social scientists should argue about. Therefore what matters is how warranted arguments are built by the researcher who uses statistics. Our argument supports seeking surprising findings; being aware of the caveat that demi-regularities do not necessarily reveal laws; and otherwise following advice given from the ‘sceptical’ school. However the capacity of multi-variate statistics to provide a grounding for warranted arguments implies that their use cannot be rejected out of hand by serious social researchers.
The bias toward the superficial and the response to extraneous influences on research are both examples of real harm done in contemporary social science by a roughly Bayesian paradigm of statistical inference as the epitome of empirical argument. For instance the dominant attitude toward the sources of black-white differential in United States unemployment rates (routinely the rates are in a two to one ratio) is “phenomenological.” The employment differences are traced to correlates in education, locale, occupational structure, and family background. The attitude toward further, underlying causes of those correlations is agnostic … Yet on reflection, common sense dictates that racist attitudes and institutional racism must play an important causal role. People do have beliefs that blacks are inferior in intelligence and morality, and they are surely influenced by these beliefs in hiring decisions … Thus, an overemphasis on Bayesian success in statistical inference discourages the elaboration of a type of account of racial disadvantages that almost certainly provides a large part of their explanation.
For all scholars seriously interested in the question of what makes up a good scientific explanation, Richard Miller’s Fact and Method is a must-read. His incisive critique of Bayesianism is still unsurpassed.
One of my favourite “problem situating lecture arguments” against Bayesianism goes something like this: Assume you’re a Bayesian turkey and hold a nonzero probability belief in the hypothesis H that “people are nice vegetarians that do not eat turkeys,” and that every day you see the sun rise confirms this belief. For every day you survive, you update your belief according to Bayes’ Rule
P(H|e) = [P(e|H)P(H)]/P(e),
where evidence e stands for “not being eaten” and P(e|H) = 1. Given that there do exist other hypotheses than H, P(e) is less than 1 and a fortiori P(H|e) is greater than P(H). Every day you survive increases your probability belief that you will not be eaten. This is totally rational according to the Bayesian definition of rationality. Unfortunately — as Bertrand Russell famously noticed — for every day that goes by, the traditional Christmas dinner also gets closer and closer …
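The turkey’s updating can be traced numerically. The sketch below uses made-up numbers: P(e|H) = 1 as in the text, and the daily survival probability under the alternative hypothesis is arbitrarily set to 0.95.

```python
# A minimal sketch of the Bayesian turkey's updating (illustrative numbers).
# H: "people are nice vegetarians that do not eat turkeys"
# e: "I was not eaten today", with P(e|H) = 1 by assumption.
# Under not-H we assume some daily chance of being eaten, so that
# P(e|not H) < 1 -- the value 0.95 below is purely hypothetical.

def update(prior, p_e_given_h=1.0, p_e_given_not_h=0.95):
    """One application of Bayes' Rule after surviving a day:
    P(H|e) = P(e|H)P(H) / P(e), with P(e) by total probability."""
    p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    return p_e_given_h * prior / p_e

belief = 0.5                    # the turkey's initial credence in H (hypothetical)
for day in range(100):
    belief = update(belief)     # every survived day pushes the belief upward

print(round(belief, 3))         # after 100 days the belief is close to 1
```

Each survived day multiplies the odds in favour of H by 1/P(e|not H), so the belief climbs monotonically toward 1 — right up until Christmas.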
For more on my own objections to Bayesianism:
Bayesianism — a patently absurd approach to science
Bayesianism — preposterous mumbo jumbo
One of the reasons I’m a Keynesian and not a Bayesian
The move from a structuralist account in which capital is understood to structure social relations in relatively homologous ways to a view of hegemony in which power relations are subject to repetition, convergence, and rearticulation brought the question of temporality into the thinking of structure, and marked a shift from a form of Althusserian theory that takes structural totalities as theoretical objects to one in which the insights into the contingent possibility of structure inaugurate a renewed conception of hegemony as bound up with the contingent sites and strategies of the rearticulation of power.
Friedman enters this scene arguing that all we need to do is predict successfully, that this can be done even without realistic theories, and that unrealistic theories are to be preferred to realistic ones, essentially because they can usually be more parsimonious.
The first thing to note about this response is that Friedman is attempting to turn inevitable failure into a virtue. In the context of economic modelling, the need to produce formulations in terms of systems of isolated atoms, where these are not characteristic of social reality, means that unrealistic formulations are more or less unavoidable. Arguing that they are to be preferred to realistic ones in this context belies the fact that there is not a choice …
My own response to Friedman’s intervention is that it was mostly an irrelevancy, but one that has been opportunistically grasped by some as a supposed defence of the profusion of unrealistic assumptions in economics. This would work if successful prediction were possible. But usually it is not.
If scientific progress in economics — as Robert Lucas and other latter-day followers of Milton Friedman seem to think — lies in our ability to tell ‘better and better stories,’ one would of course expect economics journals to be filled with articles supporting the stories with empirical evidence confirming the predictions. However, the journals still show a striking and embarrassing paucity of empirical studies that (try to) substantiate these predictive claims. Equally amazing is how little one has to say about the relationship between the model and real-world target systems. It is as though explicit discussion, argumentation and justification on the subject isn’t considered to be required.
If the ultimate criterion of success of a model is the extent to which it predicts and coheres with (parts of) reality, modern mainstream economics seems to be a hopeless misallocation of scientific resources. To focus scientific endeavours on proving things in models is a gross misapprehension of what an economic theory ought to be about. Deductivist models and methods disconnected from reality are not relevant for predicting, explaining or understanding real-world economies.
There can be no theory without assumptions since it is the assumptions embodied in a theory that provide, by way of reason and logic, the implications by which the subject matter of a scientific discipline can be understood and explained. These same assumptions provide, again, by way of reason and logic, the predictions that can be compared with empirical evidence to test the validity of a theory. It is a theory’s assumptions that are the premises in the logical arguments that give a theory’s explanations meaning, and to the extent those assumptions are false, the explanations the theory provides are meaningless no matter how logically powerful or mathematically sophisticated those explanations based on false assumptions may seem to be.
As yours truly has repeatedly argued on this blog (e.g. here, here and here), RCTs usually do not provide evidence that their results are exportable to other target systems. The almost religious belief with which many of their propagators portray them cannot hide the fact that RCTs cannot be taken for granted to give generalizable results. That something works somewhere is no warrant for it to work for us, or even that it works generally.
An extremely interesting systematic review article on the grand claims to external validity often raised by advocates of RCTs now confirms this view, showing that an RCT is not at all the “gold standard” it is portrayed as:
In theory there seems to be a consensus among empirical researchers that establishing external validity of a policy evaluation study is as important as establishing its internal validity. Against this background, this paper has systematically reviewed the existing RCT literature in order to examine the extent to which external validity concerns are addressed in the practice of conducting and publishing RCTs for policy evaluation purposes. We have identified all 92 papers based on RCTs that evaluate a policy intervention and that are published in the leading economic journals between 2009 and 2014. We reviewed them with respect to whether the published papers address the different hazards of external validity that we developed …
Many published RCTs do not provide a comprehensive presentation of how the experiment was implemented. More than half of the papers do not even provide the reader with information on whether the participants in the experiment are aware of being part of an experiment — which is crucial to assess whether Hawthorne or John-Henry effects could have codetermined the outcomes in the RCT …
Further, potential general equilibrium effects are only rarely addressed. This is above all worrisome in case outcomes involve price changes (e.g. labor market outcomes) with straightforward repercussions when the program is brought to scale …
In many of the studies we reviewed, the assumptions that the authors make in generalizing their results, as well as respective limitations to the inferences we can draw, are left behind a veil …
A more transparent reporting would also lead to a situation in which RCTs that properly accounted for the potential hazards to external validity receive more attention than those that did not … We therefore call for dedicating the same devotion to establishing external validity as is done to establish internal validity. It would be desirable if the peer review process at economics journals explicitly scrutinized design features of RCTs that are relevant for extrapolating the findings to other settings and the respective assumptions made by the authors … Given the trade-offs we all face during the laborious implementation of studies it is almost certain that external validity will often be sacrificed for other features to which the peer-review process currently pays more attention.
Blinding is rarely possible in economics or social science trials, and this is one of the major differences from most (although not all) RCTs in medicine, where blinding is standard, both for those receiving the treatment and those administering it … Subjects in social RCTs usually know whether they are receiving the treatment or not and so can react to their assignment in ways that can affect the outcome other than through the operation of the treatment; in econometric language, this is akin to a violation of exclusion restrictions, or a failure of exogeneity …
Note also that knowledge of their assignment may cause people to want to cross over from treatment to control, or vice versa, to drop out of the program, or to change their behavior in the trial depending on their assignment. In extreme cases, only those members of the trial sample who expect to benefit from the treatment will accept treatment. Consider, for example, a trial in which children are randomly allocated to two schools that teach in different languages, Russian or English, as happened during the breakup of the former Yugoslavia. The children (and their parents) know their allocation, and the more educated, wealthier, and less-ideologically committed parents whose children are assigned to the Russian-medium schools can (and did) remove their children to private English-medium schools. In a comparison of those who accepted their assignments, the effects of the language of instruction will be distorted in favor of the English schools by differences in family characteristics. This is a case where, even if the random number generator is fully functional, a later balance test will show systematic differences in observable background characteristics between the treatment and control groups; even if the balance test is passed, there may still be selection on unobservables for which we cannot test …
Various statistical corrections are available for a few of the selection problems non-blinding presents, but all rely on the kind of assumptions that, while common in observational studies, RCTs are designed to avoid. Our own view is that assumptions and the use of prior knowledge are what we need to make progress in any kind of analysis, including RCTs whose promise of assumption-free learning is always likely to be illusory …
This only confirms that ‘ideally controlled experiments’ tell us with certainty what causes what effects — but only given the right ‘closures.’ Making appropriate extrapolations from (ideal, accidental, natural or quasi) experiments to different settings, populations or target systems is not easy. “It works there” is no evidence for “it will work here.” Causes deduced in an experimental setting still have to show that they come with an export warrant to the target system. The causal background assumptions made have to be justified, and without licenses to export, the value of ‘rigorous’ methods and ‘on-average knowledge’ is despairingly small.
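The school example Deaton and Cartwright describe can be mimicked in a minimal simulation (all numbers hypothetical): assignment is perfectly random and the true effect of the language of instruction is zero, yet because advantaged families assigned to the Russian-medium school remove their children, a comparison of those who accept their assignment manufactures an ‘effect’ of English instruction.

```python
import random

# A minimal simulation of the non-blinding/selection problem (all numbers
# hypothetical). Outcomes depend only on family background; the true effect
# of the language of instruction is zero. Families with high background who
# are assigned to the Russian-medium school move their children to private
# English-medium schools, so they drop out of the observed Russian group.

random.seed(1)
n = 10_000
background = [random.gauss(0, 1) for _ in range(n)]        # family advantage
assigned_english = [random.random() < 0.5 for _ in range(n)]  # fair coin flip

english_scores, russian_scores = [], []
for b, eng in zip(background, assigned_english):
    score = 50 + 10 * b            # outcome driven by background alone
    if eng:
        english_scores.append(score)
    elif b < 1.0:                  # advantaged families leave the Russian school
        russian_scores.append(score)

gap = (sum(english_scores) / len(english_scores)
       - sum(russian_scores) / len(russian_scores))
print(round(gap, 1))  # a spurious 'effect' of English instruction appears
```

Even though the random number generator is fully functional, the comparison among those who accept their assignment is biased in favour of the English schools — exactly the distortion the quoted passage warns about.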