A passage from Stanley Lieberson’s classic book on the methodology of social research, Making It Count (1985), has always stuck with me. In it, he considers what a social scientist might conclude from a regression model predicting black and white earnings from various background characteristics, including education. Invariably the coefficient for schooling is strong, positive, and significant—the more education one has, the greater one’s earnings. Moreover, the apparent gap between black and white earnings is much smaller when schooling is included as a predictor in the equation than when it is left out. In this sense, the racial gap is “explained” by lower average levels of education among blacks compared with whites. Obviously, therefore, all one has to do to reduce the racial gap in earnings is to increase levels of black education. The social scientist thus recommends that policymakers design and implement programs to reduce black dropout rates and increase the odds of college attendance.
“Suppose we start with a radically different perspective on this question and see where it leads us. Let us hypothesize that racial or other interest groups will tend to take as much as they can for themselves and will give as little as necessary to maintain the system and avoid having it overturned. In this case, whites will give blacks as little as they can. Under such circumstances, one would assume that observed interrelations between income gaps and features such as education . . . describe . . . the current pathways leading from a specific causal force to the outcome of that force. If so, a complicated causal analysis of factors contributing to the racial gaps in income has not the causal value one might have assumed. It describes the given set of events at a given time; it describes what a black person might well follow as a get-ahead strategy if he or she can assume that not many other blacks will follow the same strategy and hence the basic [social] matrix will remain unaltered. But there is no assurance that this matrix will continue to operate—indeed, there is virtual certainty that the matrix will not continue to operate if some superficial factor that appears to cause the income gap is no longer relevant (for example, if the groups end up with the same educational distribution). In which case, new rules and regulations will operate; the other regression coefficients will change in value in order to maintain the existing system.” (pp. 191–92)
Simply put, Lieberson argues that if whites are selfishly motivated to discriminate against blacks to enhance their own material well-being, then when the government forces them to end a particular discriminatory practice, they will simply look for other means to maintain white privilege. If an older discriminatory mechanism based explicitly on race becomes impossible to sustain, whites will substitute new ones that are more subtly associated with race. The specific mechanisms by which racial stratification is achieved may thus be expected to change over time as practices shift in response to civil rights enforcement.
In my eyes, social orders are normally fragile and precarious; unpleasant surprises may turn up at any moment. I also think it wrong to demand that someone who identifies a problem should immediately offer a solution as well. I do not bow to such prescriptions … Problems may be such that there is no solution to them — or anyway, none achievable here and now. If someone were to ask me reproachfully where was ‘the positive,’ this would then indeed be a case where I could appeal to Adorno. For his reply, much better formulated, would doubtless have been: what if there is nothing positive?
[h/t Lord Keynes]
For more on my own objections to Bayesianism:
Bayesianism — a patently absurd approach to science
Bayesianism — preposterous mumbo jumbo
One of the reasons I’m a Keynesian and not a Bayesian
Keynes and Bayes in paradise
One of my favourite “problem situating lecture arguments” against Bayesianism goes something like this: Assume you’re a Bayesian turkey and hold a nonzero probability belief in the hypothesis H that “people are nice vegetarians that do not eat turkeys and that every day I see the sun rise confirms my belief.” For every day you survive, you update your belief according to Bayes’ Rule
P(H|e) = [P(e|H)P(H)]/P(e),
where evidence e stands for “not being eaten” and P(e|H) = 1. Given that there do exist other hypotheses than H, P(e) is less than 1 and a fortiori P(H|e) is greater than P(H). Every day you survive increases your probability belief that you will not be eaten. This is totally rational according to the Bayesian definition of rationality. Unfortunately — as Bertrand Russell famously noticed — for every day that goes by, the traditional Christmas dinner also gets closer and closer …
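The turkey's updating can be made concrete with a small numerical sketch. It assumes a prior of 0.5 on H and a single rival hypothesis under which the turkey still survives any given day with probability 0.99. Both numbers, and the function names, are illustrative assumptions and not part of the argument above.

```python
def update(prior_h, p_e_given_h=1.0, p_e_given_not_h=0.99):
    """One application of Bayes' rule: P(H|e) = P(e|H) P(H) / P(e)."""
    # P(e) by total probability over H and its rival (assumed single rival).
    p_e = p_e_given_h * prior_h + p_e_given_not_h * (1.0 - prior_h)
    return p_e_given_h * prior_h / p_e


def turkey_beliefs(prior_h=0.5, days=300, p_e_given_not_h=0.99):
    """Belief in H after each day survived, starting from the prior."""
    beliefs = [prior_h]
    for _ in range(days):
        beliefs.append(update(beliefs[-1], 1.0, p_e_given_not_h))
    return beliefs


beliefs = turkey_beliefs()
print(f"day 0: {beliefs[0]:.3f}, day {len(beliefs) - 1}: {beliefs[-1]:.3f}")
```

Under these assumptions the belief in H rises with every day survived, although nothing in the update tells the turkey anything about how close Christmas is.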
When applying deductivist thinking to economics, neoclassical economists usually set up “as if” models based on a set of tight axiomatic assumptions from which consistent and precise inferences are made. The beauty of this procedure is of course that if the axiomatic premises are true, the conclusions necessarily follow. The snag is that if the models are to be relevant, we also have to argue that their precision and rigour still hold when they are applied to real-world situations. They often don’t. When addressing real economies, the idealizations and abstractions necessary for the deductivist machinery to work simply don’t hold.
If the real world is fuzzy, vague and indeterminate, then why should our models build upon a desire to describe it as precise and predictable? The logic of idealization is a marvellous tool in mathematics and axiomatic-deductivist systems, but a poor guide for action in real-world systems, in which concepts and entities are without clear boundaries and continually interact and overlap.
Or as Hans Albert has it on the neoclassical style of thought:
In everyday situations, if, in answer to an inquiry about the weather forecast, one is told that the weather will remain the same as long as it does not change, then one does not normally go away with the impression of having been particularly well informed, although it cannot be denied that the answer refers to an interesting aspect of reality, and, beyond that, it is undoubtedly true …
We are not normally interested merely in the truth of a statement, nor merely in its relation to reality; we are fundamentally interested in what it says, that is, in the information that it contains …
Information can only be obtained by limiting logical possibilities; and this in principle entails the risk that the respective statement may be exposed as false. It is even possible to say that the risk of failure increases with the informational content, so that precisely those statements that are in some respects most interesting, the nomological statements of the theoretical hard sciences, are most subject to this risk. The certainty of statements is best obtained at the cost of informational content, for only an absolutely empty and thus uninformative statement can achieve the maximal logical probability …
The neoclassical style of thought – with its emphasis on thought experiments, reflection on the basis of illustrative examples and logically possible extreme cases, its use of model construction as the basis of plausible assumptions, as well as its tendency to decrease the level of abstraction, and similar procedures – appears to have had such a strong influence on economic methodology that even theoreticians who strongly value experience can only free themselves from this methodology with difficulty …
Science progresses through the gradual elimination of errors from a large offering of rivalling ideas, the truth of which no one can know from the outset. The question of which of the many theoretical schemes will finally prove to be especially productive and will be maintained after empirical investigation cannot be decided a priori. Yet to be useful at all, it is necessary that they are initially formulated so as to be subject to the risk of being revealed as errors. Thus one cannot attempt to preserve them from failure at every price. A theory is scientifically relevant first of all because of its possible explanatory power, its performance, which is coupled with its informational content …
The connections sketched out above are part of the general logic of the sciences and can thus be applied to the social sciences. Above all, with their help, it appears to be possible to illuminate a methodological peculiarity of neoclassical thought in economics, which probably stands in a certain relation to the isolation from sociological and social-psychological knowledge that has been cultivated in this discipline for some time: the model Platonism of pure economics, which comes to expression in attempts to immunize economic statements and sets of statements (models) from experience through the application of conventionalist strategies …
Clearly, it is possible to interpret the ‘presuppositions’ of a theoretical system … not as hypotheses, but simply as limitations to the area of application of the system in question. Since a relationship to reality is usually ensured by the language used in economic statements, in this case the impression is generated that a content-laden statement about reality is being made, although the system is fully immunized and thus without content. In my view that is often a source of self-deception in pure economic thought …
A further possibility for immunizing theories consists in simply leaving open the area of application of the constructed model so that it is impossible to refute it with counter examples. This of course is usually done without a complete knowledge of the fatal consequences of such methodological strategies for the usefulness of the theoretical conception in question, but with the view that this is a characteristic of especially highly developed economic procedures: the thinking in models, which, however, among those theoreticians who cultivate neoclassical thought, in essence amounts to a new form of Platonism.
One of the few statisticians that I have on my blogroll is Andrew Gelman. Although not sharing his Bayesian leanings, yours truly finds his open-minded, thought-provoking and non-dogmatic statistical thinking highly recommendable. The plaidoyer infra for “reverse causal questioning” is typical Gelmanian:
When statistical and econometric methodologists write about causal inference, they generally focus on forward causal questions. We are taught to answer questions of the type “What if?”, rather than “Why?” Following the work by Rubin (1977) causal questions are typically framed in terms of manipulations: if x were changed by one unit, how much would y be expected to change? But reverse causal questions are important too … In many ways, it is the reverse causal questions that motivate the research, including experiments and observational studies, that we use to answer the forward questions …
Reverse causal reasoning is different; it involves asking questions and searching for new variables that might not yet even be in our model. We can frame reverse causal questions as model checking. It goes like this: what we see is some pattern in the world that needs an explanation. What does it mean to “need an explanation”? It means that existing explanations — the existing model of the phenomenon — does not do the job …
By formalizing reverse causal reasoning within the process of data analysis, we hope to make a step toward connecting our statistical reasoning to the ways that we naturally think and talk about causality. This is consistent with views such as Cartwright (2007) that causal inference in reality is more complex than is captured in any theory of inference … What we are really suggesting is a way of talking about reverse causal questions in a way that is complementary to, rather than outside of, the mainstream formalisms of statistics and econometrics.
In a time when scientific relativism is expanding, it is important to keep up the claim for not reducing science to a pure discursive level. We have to maintain the Enlightenment tradition of thinking of reality as principally independent of our views of it and of the main task of science as studying the structure of this reality. Perhaps the most important contribution a researcher can make is to reveal what this reality that is the object of science actually looks like.
Science is made possible by the fact that there are structures that are durable and are independent of our knowledge or beliefs about them. There exists a reality beyond our theories and concepts of it. It is this independent reality that our theories in some way deal with. Contrary to positivism, I would as a critical realist argue that the main task of science is not to detect event-regularities between observed facts. Rather, that task must be conceived as identifying the underlying structure and forces that produce the observed events.
In Gelman’s essay there is no explicit argument for abduction — inference to the best explanation — but I would still argue that it is de facto nothing but a very strong argument for why scientific realism and inference to the best explanation are the best alternatives for explaining what’s going on in the world we live in. The focus on causality, model checking, anomalies and context-dependence — although here expressed in statistical terms — is as close to abductive reasoning as we get in statistics and econometrics today.
Yours truly and people like Tony Lawson have for many years been urging economists to pay attention to the ontological foundations of their assumptions and models. Sad to say, economists have not paid much attention — and so modern economics has become increasingly irrelevant to the understanding of the real world.
Within mainstream economics internal validity is still everything and external validity nothing. Why anyone should be interested in theories and models of that kind is beyond imagination. As long as mainstream economists do not come up with any export-licenses for their theories and models to the real world in which we live, they really should not be surprised if people say that this is not science, but autism!
Studying mathematics and logic is interesting and fun. It sharpens the mind. In pure mathematics and logic we do not have to worry about external validity. But economics is not pure mathematics or logic. It’s about society. The real world. Forgetting that, economics is really in dire straits.
Mathematical axiomatic systems lead to analytic truths, which do not require empirical verification, since they are true by virtue of definitions and logic. It is a startling discovery of the twentieth century that sufficiently complex axiomatic systems are undecidable and incomplete. That is, the system of theorem and proof can never lead to ALL the true sentences about the system, and ALWAYS contain statements which are undecidable – their truth values cannot be determined by proof techniques. More relevant to our current purpose is that applying an axiomatic hypothetico-deductive system to the real world can only be done by means of a mapping, which creates a model for the axiomatic system. These mappings then lead to assertions about the real world which require empirical verification. These assertions (which are proposed scientific laws) can NEVER be proven in the sense that mathematical theorems can be proven …
Many more arguments can be given to explain the difference between analytic and synthetic truths, which corresponds to the difference between mathematical and scientific truths. As I have explained in greater detail in my paper, the scientific method arose as a rejection of the axiomatic method used by the Greeks for scientific methodology. It was this rejection of axiomatics and logical certainty in favour of empirical and observational approach which led to dramatic progress in science. However, this did involve giving up the certainties of mathematical argumentation and learning to live with the uncertainties of induction. Economists need to do the same – abandon current methodology borrowed from science and develop a new methodology suited for the study of human beings and societies.
According to some people there’s really no need for heterodox theoretical critiques of mainstream neoclassical economics, but rather challenges to neoclassical economics “buttressed by good empirical work.” Out with “big-think theorizing” and in with “ordinary empiricism.”
Although thought-provoking, the view of empiricism and experiments on offer is too simplistic, for several reasons, but mostly because the kind of experimental empiricism it favours is largely untenable.
Experiments are actually very similar to theoretical models in many ways — they have, e.g., the same basic problem that they are built on rather artificial conditions and have difficulties with the “trade-off” between internal and external validity. The more artificial conditions, the more internal validity, but also less external validity. The more we rig experiments/models to avoid the “confounding factors”, the less the conditions are reminiscent of the real “target system”. The nodal issue is how economists using different isolation strategies in different “nomological machines” attempt to learn about causal relationships. I doubt the generalizability of both research strategies, because the probability is high that causal mechanisms are different in different contexts and that lack of homogeneity/stability/invariance doesn’t give us warranted export licenses to the “real” societies or economies.
If we see experiments as theory tests or models that ultimately aspire to say something about the real “target system”, then the problem of external validity is central.
Assume that you have examined how the work performance of Swedish workers A is affected by B (“treatment”). How can we extrapolate/generalize to new samples outside the original population (e.g. to the UK)? How do we know that any replication attempt “succeeds”? How do we know when these replicated experimental results can be said to justify inferences made in samples from the original population? If, for example, P(A|B) is the conditional density function for the original sample, and we are interested in doing an extrapolative prediction of E[P(A|B)], how can we know that the new sample’s density function is identical with the original? Unless we can give some really good argument for this being the case, inferences built on P(A|B) do not really say anything about the target system’s P'(A|B).
As I see it, this is the heart of the matter. External validity/extrapolation/generalization is founded on the assumption that we can make inferences based on P(A|B) that are exportable to other populations for which P'(A|B) applies. Sure, if one can convincingly show that P and P' are similar enough, the problems are perhaps not insurmountable. But arbitrarily just introducing functional specification restrictions of the type invariance/stability/homogeneity is, at least for an epistemological realist, far from satisfactory. And often it is — unfortunately — exactly this that we see when we look at neoclassical economists’ models/experiments.
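A toy simulation may help fix ideas. It is only a sketch: the two populations, the effect sizes (2.0 in "Sweden", 0.0 in "the UK") and the function name `run_experiment` are hypothetical choices made for illustration. Each experiment is perfectly randomized, so each estimate is internally valid for its own population, yet the first tells us nothing about the second unless we have independent grounds for assuming that the treatment effect is invariant.

```python
import random


def run_experiment(true_effect, n=20_000, seed=0):
    """Randomized experiment in one population; returns the
    difference in mean outcome between treated and control units."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        baseline = rng.gauss(10.0, 2.0)  # work performance without B
        if rng.random() < 0.5:           # coin-flip assignment to B
            treated.append(baseline + true_effect)
        else:
            control.append(baseline)
    return sum(treated) / len(treated) - sum(control) / len(control)


# Hypothetical populations: B has an effect in one but not in the other.
sweden_estimate = run_experiment(true_effect=2.0, seed=1)
uk_estimate = run_experiment(true_effect=0.0, seed=2)

print(f"estimated effect, Sweden: {sweden_estimate:.2f}")
print(f"estimated effect, UK:     {uk_estimate:.2f}")
```

The randomization does its job inside each population. What it cannot supply is the argument that P and P' are similar enough for the Swedish estimate to be exported.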
By this I do not mean to say that empirical methods per se are so problematic that they can never be used. On the contrary, I am basically — though not without reservations — in favour of the increased use of experiments within economics as an alternative to completely barren “bridge-less” axiomatic-deductive theory models. My criticism is more about aspiration levels and what we believe we can achieve with our mediational epistemological tools and methods in social sciences.
Many ‘experimentalists’ claim that it is easy to replicate experiments under different conditions and therefore a fortiori easy to test the robustness of experimental results. But is it really that easy? If in the example given above, we run a test and find that our predictions were not correct – what can we conclude? That B “works” in Sweden but not in the UK? Or that B “works” in a backward agrarian society, but not in a post-modern service society? That B “worked” in the field study conducted in year 2005 but not in year 2014? Population selection is almost never simple. Had the problem of external validity only been about inference from sample to population, this would be no critical problem. But the really interesting inferences are those we try to make from specific labs/experiments to specific real world situations/institutions/structures that we are interested in understanding or (causally) to explain. And then the population problem is more difficult to tackle.
Just like traditional neoclassical modelling, randomized experimentation is basically a deductive method. Given the assumptions (such as manipulability, transitivity, separability, additivity, linearity etc) these methods deliver deductive inferences. The problem, of course, is that we will never completely know when the assumptions are right. Real target systems are seldom epistemically isomorphic to our axiomatic-deductive models/systems, and even if they were, we still have to argue for the external validity of the conclusions reached from within these epistemically convenient models/systems. Causal evidence generated by randomization procedures may be valid in “closed” models, but what we usually are interested in, is causal evidence in the real target system we happen to live in.
Ideally controlled experiments (still the benchmark even for natural and quasi experiments) tell us with certainty what causes what effects – but only given the right “closures”. Making appropriate extrapolations from (ideal, accidental, natural or quasi) experiments to different settings, populations or target systems, is not easy. “It works there” is no evidence for “it will work here”. Causes deduced in an experimental setting still have to show that they come with an export-warrant to the target population/system. The causal background assumptions made have to be justified, and without licenses to export, the value of “rigorous” and “precise” methods is despairingly small.
Many advocates of randomization and experiments want to have deductively automated answers to fundamental causal questions. But to apply “thin” methods we have to have “thick” background knowledge of what’s going on in the real world, and not in (ideally controlled) experiments. Conclusions can only be as certain as their premises – and that also goes for methods based on randomized experiments.
The claimed strength of a social experiment, relatively to non-experimental methods, is that few assumptions are required to establish its internal validity in identifying a project’s impact. The identification is not assumption-free. People are (typically and thankfully) free agents who make purposive choices about whether or not they should take up an assigned intervention. As is well understood by the randomistas, one needs to correct for such selective compliance … The randomized assignment is assumed to only affect outcomes through treatment status (the “exclusion restriction”).
There is another, more troubling, assumption just under the surface. Inferences are muddied by the presence of some latent factor—unobserved by the evaluator but known to the participant—that influences the individual-specific impact of the program in question … Then the standard instrumental variable method for identifying [the average treatment effect on the treated] is no longer valid, even when the instrumental variable is a randomized assignment … Most social experiments in practice make the implicit and implausible assumption that the program has the same impact for everyone.
While internal validity … is the claimed strength of an experiment, its acknowledged weakness is external validity—the ability to learn from an evaluation about how the specific intervention will work in other settings and at larger scales. The randomistas see themselves as the guys with the lab coats—the scientists—while other types, the “policy analysts,” worry about things like external validity. Yet it is hard to argue that external validity is less important than internal validity when trying to enhance development effectiveness against poverty; nor is external validity any less legitimate as a topic for scientific inquiry.
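The point about a latent factor that drives the individual-specific impact can be illustrated with a small simulation. This is a sketch under loudly stated assumptions: individual gains are drawn from a normal distribution with mean 1, people take up the program only when their own gain exceeds a threshold, and assignment lowers that threshold from 2.0 to 0.5. None of these numbers or names comes from the quoted text. The Wald (instrumental variable) estimator uses the randomized assignment as the instrument.

```python
import random


def avg(values):
    values = list(values)
    return sum(values) / len(values)


def simulate(n=200_000, seed=0):
    """Rows of (assigned, took_up, outcome, gain) for n individuals."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        gain = rng.gauss(1.0, 1.0)         # impact known to the participant only
        assigned = rng.random() < 0.5      # randomized assignment (the instrument)
        threshold = 0.5 if assigned else 2.0   # hypothetical take-up hurdle
        took_up = 1.0 if gain > threshold else 0.0  # purposive choice
        outcome = rng.gauss(0.0, 1.0) + took_up * gain
        rows.append((assigned, took_up, outcome, gain))
    return rows


rows = simulate()
z1 = [r for r in rows if r[0]]
z0 = [r for r in rows if not r[0]]

# Wald estimator: intention-to-treat effect divided by the take-up gap.
itt = avg(r[2] for r in z1) - avg(r[2] for r in z0)
take_up_gap = avg(r[1] for r in z1) - avg(r[1] for r in z0)
wald = itt / take_up_gap

ate = avg(r[3] for r in rows)                    # mean gain, whole population
att = avg(r[3] for r in rows if r[1] == 1.0)     # mean gain among the treated
complier_gain = avg(r[3] for r in rows if 0.5 < r[3] <= 2.0)

print(f"Wald: {wald:.2f}  compliers: {complier_gain:.2f}  "
      f"ATT: {att:.2f}  ATE: {ate:.2f}")
```

In this setup the Wald estimate tracks the mean gain among those whose take-up is changed by the assignment. It lies above the population average and below the average among the treated, so reading it as "the" impact of the program amounts to assuming that the impact is the same for everyone.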
It is generally recognised that the Ricardian analysis was concerned with what we now call long-period equilibrium. Marshall’s contribution mainly consisted in grafting on to this the marginal principle and the principle of substitution, together with some discussion of the passage from one position of long-period equilibrium to another. But he assumed, as Ricardo did, that the amounts of the factors of production in use were given and that the problem was to determine the way in which they would be used and their relative rewards. Edgeworth and Professor Pigou and other later and contemporary writers have embroidered and improved this theory by considering how different peculiarities in the shapes of the supply functions of the factors of production would affect matters, what will happen in conditions of monopoly and imperfect competition, how far social and individual advantage coincide, what are the special problems of exchange in an open system and the like. But these more recent writers like their predecessors were still dealing with a system in which the amount of the factors employed was given and the other relevant facts were known more or less for certain. This does not mean that they were dealing with a system in which change was ruled out, or even one in which the disappointment of expectation was ruled out. But at any given time facts and expectations were assumed to be given in a definite and calculable form; and risks, of which, though admitted, not much notice was taken, were supposed to be capable of an exact actuarial computation. The calculus of probability, though mention of it was kept in the background, was supposed to be capable of reducing uncertainty to the same calculable status as that of certainty itself; just as in the Benthamite calculus of pains and pleasures or of advantage and disadvantage, by which the Benthamite philosophy assumed men to be influenced in their general ethical behaviour.
Actually, however, we have, as a rule, only the vaguest idea of any but the most direct consequences of our acts. Sometimes we are not much concerned with their remoter consequences, even though time and chance may make much of them. But sometimes we are intensely concerned with them, more so, occasionally, than with the immediate consequences. Now of all human activities which are affected by this remoter preoccupation, it happens that one of the most important is economic in character, namely, wealth. The whole object of the accumulation of wealth is to produce results, or potential results, at a comparatively distant, and sometimes indefinitely distant, date. Thus the fact that our knowledge of the future is fluctuating, vague and uncertain, renders wealth a peculiarly unsuitable subject for the methods of the classical economic theory. This theory might work very well in a world in which economic goods were necessarily consumed within a short interval of their being produced. But it requires, I suggest, considerable amendment if it is to be applied to a world in which the accumulation of wealth for an indefinitely postponed future is an important factor; and the greater the proportionate part played by such wealth accumulation the more essential does such amendment become.
By ‘uncertain’ knowledge, let me explain, I do not mean merely to distinguish what is known for certain from what is only probable. The game of roulette is not subject, in this sense, to uncertainty; nor is the prospect of a Victory bond being drawn. Or, again, the expectation of life is only slightly uncertain. Even the weather is only moderately uncertain. The sense in which I am using the term is that in which the prospect of a European war is uncertain, or the price of copper and the rate of interest twenty years hence, or the obsolescence of a new invention, or the position of private wealth-owners in the social system in 1970. About these matters there is no scientific basis on which to form any calculable probability whatever. We simply do not know.
J M Keynes “The General Theory of Employment” Quarterly Journal of Economics, February 1937.