The most expedient population and data generation model to adopt is one in which the population is regarded as a realization of an infinite superpopulation. This setup is the standard perspective in mathematical statistics, in which random variables are assumed to exist with fixed moments for an uncountable and unspecified universe of events …
This perspective is tantamount to assuming a population machine that spawns individuals forever (i.e., the analog to a coin that can be flipped forever). Each individual is born as a set of random draws from the distributions of Y¹, Y°, and additional variables collectively denoted by S …
Because of its expediency, we will usually write with the superpopulation model in the background, even though the notions of infinite superpopulations and sequences of sample sizes approaching infinity are manifestly unrealistic.
In econometrics one often gets the feeling that many of its practitioners think of it as a kind of automatic inferential machine: input data and out comes causal knowledge. This is like pulling a rabbit from a hat. Great, but first you have to put the rabbit in the hat. And this is where assumptions come into the picture.
The assumption of imaginary “superpopulations” is one of the many dubious assumptions used in modern econometrics.
As social scientists — and economists — we have to confront the all-important question of how to handle uncertainty and randomness. Should we define randomness with probability? If we do, we have to accept that to speak of randomness we also have to presuppose the existence of nomological probability machines, since probabilities cannot be spoken of – and actually, to be strict, do not at all exist – without specifying such system-contexts. Accepting a domain of probability theory and sample space of infinite populations also implies that judgments are made on the basis of observations that are actually never made!
Infinitely repeated trials or samplings never take place in the real world. So that cannot be a sound inductive basis for a science with aspirations of explaining real-world socio-economic processes, structures or events. It’s not tenable.
In his great book Statistical Models and Causal Inference: A Dialogue with the Social Sciences, David Freedman also touched on this fundamental problem, which arises when you try to apply statistical models outside overly simple nomological machines like coin tossing and roulette wheels:
Lurking behind the typical regression model will be found a host of such assumptions; without them, legitimate inferences cannot be drawn from the model. There are statistical procedures for testing some of these assumptions. However, the tests often lack the power to detect substantial failures. Furthermore, model testing may become circular; breakdowns in assumptions are detected, and the model is redefined to accommodate. In short, hiding the problems can become a major goal of model building.
Using models to make predictions of the future, or the results of interventions, would be a valuable corrective. Testing the model on a variety of data sets – rather than fitting refinements over and over again to the same data set – might be a good second-best … Built into the equation is a model for non-discriminatory behavior: the coefficient d vanishes. If the company discriminates, that part of the model cannot be validated at all.
Regression models are widely used by social scientists to make causal inferences; such models are now almost a routine way of demonstrating counterfactuals. However, the “demonstrations” generally turn out to depend on a series of untested, even unarticulated, technical assumptions. Under the circumstances, reliance on model outputs may be quite unjustified. Making the ideas of validation somewhat more precise is a serious problem in the philosophy of science. That models should correspond to reality is, after all, a useful but not totally straightforward idea – with some history to it. Developing appropriate models is a serious problem in statistics; testing the connection to the phenomena is even more serious …
In our days, serious arguments have been made from data. Beautiful, delicate theorems have been proved, although the connection with data analysis often remains to be established. And an enormous amount of fiction has been produced, masquerading as rigorous science.
And as if this wasn’t enough, one could — as we’ve seen — also seriously wonder what kind of “populations” these statistical and econometric models ultimately are based on. Why should we as social scientists — and not as pure mathematicians working with formal-axiomatic systems without the urge to confront our models with real target systems — unquestioningly accept models based on concepts like the “infinite superpopulations” used in e.g. the potential outcome framework that has become so popular lately in social sciences?
Of course one could treat observational or experimental data as random samples from real populations. I have no problem with that. But probabilistic econometrics does not content itself with that kind of population. Instead it creates imaginary populations of “parallel universes” and assumes that our data are random samples from that kind of “infinite superpopulations.”
But this is actually nothing else but hand-waving! And it is inadequate for real science. As David Freedman writes:
With this approach, the investigator does not explicitly define a population that could in principle be studied, with unlimited resources of time and money. The investigator merely assumes that such a population exists in some ill-defined sense. And there is a further assumption, that the data set being analyzed can be treated as if it were based on a random sample from the assumed population. These are convenient fictions … Nevertheless, reliance on imaginary populations is widespread. Indeed regression models are commonly used to analyze convenience samples … The rhetoric of imaginary populations is seductive because it seems to free the investigator from the necessity of understanding how data were generated.
In social sciences — including economics — it’s always wise to ponder C. S. Peirce’s remark that universes are not as common as peanuts …
Recall [Russell's famous] turkey problem. You look at the past and derive some rule about the future. Well, the problems in projecting from the past can be even worse than what we have already learned, because the same past data can confirm a theory and also its exact opposite …
For the technical version of this idea, consider a series of dots on a page representing a number through time … Let’s say your high school teacher asks you to extend the series of dots. With a linear model, that is, using a ruler, you can run only a single straight line from the past to the future. The linear model is unique. There is one and only one straight line that can project a series of points …
This is what philosopher Nelson Goodman called the riddle of induction: we project a straight line only because we have a linear model in our head — the fact that a number has risen for 1 000 days straight should make you more confident that it will rise in the future. But if you have a nonlinear model in your head, it might confirm that the number should decline on day 1 001 …
The severity of Goodman’s riddle of induction is as follows: if there is no longer even a single unique way to ‘generalize’ from what you see, to make an inference about the unknown, then how should you operate? The answer, clearly, will be that you should employ ‘common sense’.
And economists standardly — and without even the slightest justification — assume linearity in their models …
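To make the riddle concrete, here is a minimal sketch in Python. It assumes a made-up series that rises by one unit a day for 1 000 days, and a hypothetical rival model (the name `grue_model` and the switch day are my own illustration, not anything from the quoted text) that agrees with the straight line on every observed day and falls afterwards. Both models fit the past perfectly, yet they disagree about day 1 001.

```python
# Illustrative only: the series and both models are assumptions, not real data.

def linear_model(t):
    """The 'ruler' model: the number rises by one unit per day, forever."""
    return float(t)

def grue_model(t, switch_day=1000):
    """Hypothetical nonlinear rival: identical to the line up to switch_day,
    then declining by one unit per day."""
    if t <= switch_day:
        return float(t)
    return float(switch_day - (t - switch_day))

# The observed past: a number that has risen for 1 000 days straight.
days = range(1, 1001)
observed = [float(t) for t in days]

def sum_squared_error(model):
    """In-sample fit of a model against the observed series."""
    return sum((model(t) - y) ** 2 for t, y in zip(days, observed))

sse_linear = sum_squared_error(linear_model)
sse_grue = sum_squared_error(grue_model)

# Out-of-sample forecasts for the first unobserved day.
forecast_linear = linear_model(1001)
forecast_grue = grue_model(1001)

print("in-sample SSE:", sse_linear, sse_grue)
print("day 1001 forecasts:", forecast_linear, forecast_grue)
```

Nothing in the 1 000 observations can tell the two models apart. The choice between them has to come from somewhere other than the data.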
So far we have shown that for two prominent questions in the economics of education, experimental and non-experimental estimates appear to be in tension. Furthermore, experimental results across different contexts are often in tension with each other. The first tension presents policymakers with a trade-off between the internal validity of estimates from the “wrong” context, and the greater external validity of observational data analysis from the “right” context. The second tension, between equally well-identified results across contexts, suggests that the resolution of this trade-off is not trivial. There appears to be genuine heterogeneity in the true causal parameter across contexts.
Despite the fact that we have chosen to focus on extremely well-researched literatures, it is plausible that a development practitioner confronting questions related to class size, private schooling, or the labor-market returns to education would confront a dearth of well-identified, experimental or quasi-experimental evidence from the country or context in which they are working. They would instead be forced to choose between less internally valid OLS estimates, and more internally valid experimental estimates produced in a very different setting. For all five of the examples explored here, the literature provides a compelling case that policymakers interested in minimizing the error of their parameter estimates would do well to prioritize careful thinking about local evidence over rigorously-estimated causal effects from the wrong context.
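The trade-off in the passage above can be illustrated with a small back-of-the-envelope sketch in Python. All numbers and variable names are hypothetical assumptions of mine, not figures from the quoted study, and sampling variance is ignored for simplicity. The local OLS estimate is off by its selection bias. The imported experimental estimate is unbiased for its own context, but off by however much the true effect differs between contexts.

```python
# Hypothetical numbers for illustration only; none come from the quoted paper.

def squared_error(estimate, truth):
    """Squared distance between an estimate and the parameter we care about."""
    return (estimate - truth) ** 2

true_effect_local = 0.10      # assumed true causal effect in the policy context
ols_bias_local = 0.03         # assumed selection bias of the local OLS estimate
true_effect_elsewhere = 0.20  # assumed true effect where the experiment was run

# Local observational estimate: right context, but biased.
local_ols_estimate = true_effect_local + ols_bias_local

# Imported experimental estimate: unbiased, but for the wrong context.
imported_rct_estimate = true_effect_elsewhere

err_ols = squared_error(local_ols_estimate, true_effect_local)
err_rct = squared_error(imported_rct_estimate, true_effect_local)

# The local estimate wins whenever its bias is smaller than the
# cross-context difference in the true effect.
prefer_local = err_ols < err_rct
print(err_ols, err_rct, prefer_local)
```

With these made-up numbers the biased local estimate is the less wrong of the two. Reverse the magnitudes and the conclusion reverses as well, which is exactly why the choice cannot be settled by method alone.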
One thing that’s missing from Krugman’s treatment of useful economics is the explicit recognition of what Keynes, and before him Frank Knight, emphasized: the persistent presence of enormous uncertainty in the economy. Most people most of the time don’t just face quantifiable risks, to be tamed by statistics and probabilistic reasoning. We have to take decisions in the prospect of events–big and small–we can’t predict even with probabilities. Keynes famously argued that classical economics had no role for money just because it didn’t allow for uncertainty. Knight similarly noted that it made no room for the entrepreneur owing to the same reason. That to this day standard economic theory continues to rule out money and exclude entrepreneurs may strike the noneconomist as odd to say the least. But there it is. Why is uncertainty so important? Because the more of it there is in the economy the less scope for successful maximizing and the more unstable are the equilibria the economy exhibits, if it exhibits any at all. Uncertainty is just what the New Classicals neglected when they endorsed the efficient market hypothesis and the Black-Scholes formulae for pumping returns out of well-behaved risks.
If uncertainty is an ever present, pervasive feature of the economy, then we can be confident, along with Krugman, that New Classical models won’t be useful over the long haul. Even if people are perfectly rational, too many uncertain, “exogenous” events will divert each new equilibrium path before it can even get started.
There is a second feature of the economy that Krugman’s useful economics needs to reckon with, one that Keynes and after him George Soros, emphasized. Along with uncertainty, the economy exhibits pervasive reflexivity: expectations about the economic future tend to actually shift that future. This will be true whether those expectations are those of speculators, regulators, even garden-variety consumers and producers. Reflexiveness is everywhere in the economy, though it is only easily detectable when it goes to extremes, as in bubbles and busts, or regulatory capture …
When combined, uncertainty and reflexivity greatly limit the power of maximizing and equilibrium to do useful economics … Between them, they make the economy a moving target for the economist. Models get into people’s heads and change their behavior, usually in ways that undermine the model’s usefulness to predict.
Which models do this and how they work is not a matter of quantifiable risk, but radical uncertainty …
Between them reflexivity and uncertainty make economics into a retrospective, historical science, one whose models—simple or complex—are continually made obsolete by events, and so cannot be improved in the direction of greater predictive power, even by more complication. The way expectations reflexively drive future economic events, and are driven by past ones, is constantly being changed by the intervention of unexpected, uncertain, exogenous ones.
[h/t Jan Milch]
Ever since the Enlightenment various economists had been seeking to mathematise the study of the economy. In this, at least prior to the early years of the twentieth century, economists keen to mathematise their discipline felt constrained in numerous ways, and not least by pressures by (non-social) natural scientists and influential peers to conform to the ‘standards’ and procedures of (non-social) natural science, and thereby abandon any idea of constructing an autonomous tradition of mathematical economics. Especially influential, in due course, was the classical reductionist programme, the idea that all mathematical disciplines should be reduced to or based on the model of physics, in particular on the strictly deterministic approach of mechanics, with its emphasis on methods of infinitesimal calculus …
However, in the early part of the twentieth century changes occurred in the interpretation of the very nature of mathematics, changes that caused the classical reductionist programme itself to fall into disarray. With the development of relativity theory and especially quantum theory, the image of nature as continuous came to be re-examined in particular, and the role of infinitesimal calculus, which had previously been regarded as having almost ubiquitous relevance within physics, came to be re-examined even within that domain.
The outcome, in effect, was a switch away from the long-standing emphasis on mathematics as an attempt to apply the physics model, and specifically the mechanics metaphor, to an emphasis on mathematics for its own sake.
Mathematics, especially through the work of David Hilbert, became increasingly viewed as a discipline properly concerned with providing a pool of frameworks for possible realities. No longer was mathematics seen as the language of (non-social) nature, abstracted from the study of the latter. Rather, it was conceived as a practice concerned with formulating systems comprising sets of axioms and their deductive consequences, with these systems in effect taking on a life of their own. The task of finding applications was henceforth regarded as being of secondary importance at best, and not of immediate concern.
This emergence of the axiomatic method removed at a stroke various hitherto insurmountable constraints facing those who would mathematise the discipline of economics. Researchers involved with mathematical projects in economics could, for the time being at least, postpone the day of interpreting their preferred axioms and assumptions. There was no longer any need to seek the blessing of mathematicians and physicists or of other economists who might insist that the relevance of metaphors and analogies be established at the outset. In particular it was no longer regarded as necessary, or even relevant, to economic model construction to consider the nature of social reality, at least for the time being. Nor, it seemed, was it possible for anyone to insist with any legitimacy that the formulations of economists conform to any specific model already found to be successful elsewhere (such as the mechanics model in physics). Indeed, the very idea of fixed metaphors or even interpretations, came to be rejected by some economic ‘modellers’ (albeit never in any really plausible manner).
The result was that in due course deductivism in economics, through morphing into mathematical deductivism on the back of developments within the discipline of mathematics, came to acquire a new lease of life, with practitioners (once more) potentially oblivious to any inconsistency between the ontological presuppositions of adopting a mathematical modelling emphasis and the nature of social reality. The consequent rise of mathematical deductivism has culminated in the situation we find today.
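In the latest issue of Pedagogisk Forskning i Sverige (2-3 2014), the author of the article En pedagogisk relation mellan människa och häst. På väg mot en pedagogisk filosofisk utforskning av mellanrummet (“A pedagogical relation between human and horse. Towards a pedagogical-philosophical exploration of the in-between space”) offers the following interesting “mission statement”: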
With a posthumanist approach, I illuminate and reflect on how both human and horse transcend their beings, and how this opens up an in-between space with dimensions of subjectivity, embodiment, and reciprocity.
A 2005 governmental inquiry led to a trial period involving anonymous job applications in seven public sector workplaces during 2007. In doing so, the public sector aims to improve the recruitment process and to increase the ethnic diversity among its workforce. There is evidence to show that gender and ethnicity have an influence in the hiring process although this is considered as discrimination by current legislation …
The process of ‘depersonalising’ job applications is to make these applications anonymous. In the case of the Gothenburg trial, certain information about the applicant – such as name, sex, country of origin or other identifiable traits of ethnicity and gender – is hidden during the first phase of the job application procedure. The recruiting managers therefore do not see the full content of applications when deciding on whom to invite for interview. Once a candidate has been selected for interview, this information can then be seen.
The trial involving job applications of this nature in the city of Gothenburg is so far the most extensive in Sweden. For this reason, the Institute for Labour Market Policy Evaluation (IFAU) has carried out an evaluation of the impact of anonymous job applications in Gothenburg …
The data used in the IFAU study derive from three districts in Gothenburg … Information on the 3,529 job applicants and a total of 109 positions was collected from all three districts …
A difference-in-difference model was used to test the findings and to estimate the effects in the outcome variables: whether a difference emerges regarding an invitation to interview and job offers in relation to gender and ethnicity in the case of anonymous job applications compared with traditional application procedures.
For job openings where anonymous job applications were applied, the IFAU study reveals that gender and the ethnic origin of the applicant do not affect the probability of being invited for interview. As would be expected from previous research, these factors do have an impact when compared with recruitment processes using traditional application procedures where all the information on the applicant, such as name, sex, country of origin or other identifiable traits of ethnicity and gender, is visible during the first phase of the hiring process. As a result, anonymous applications are estimated to increase the probability of being interviewed regardless of gender and ethnic origin, showing an increase of about 8% for both non-western migrant workers and women.
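For readers unfamiliar with the difference-in-differences logic the evaluation refers to, here is a minimal sketch in Python. The counts, the group labels and the variable names are hypothetical assumptions chosen for illustration. They are not the IFAU data, and the actual study presumably uses a regression specification with controls rather than this raw comparison of four cell means.

```python
# Hypothetical (invited, applicants) counts per cell; NOT the IFAU data.
cells = {
    ("traditional", "majority"): (200, 1000),
    ("traditional", "minority"): (50, 500),
    ("anonymous", "majority"): (190, 1000),
    ("anonymous", "minority"): (90, 500),
}

def interview_rate(procedure, group):
    """Share of applicants in a cell who were invited to interview."""
    invited, applicants = cells[(procedure, group)]
    return invited / applicants

# First difference: the gap between groups under each procedure.
gap_traditional = (interview_rate("traditional", "minority")
                   - interview_rate("traditional", "majority"))
gap_anonymous = (interview_rate("anonymous", "minority")
                 - interview_rate("anonymous", "majority"))

# Second difference: how much the gap changes when applications are anonymous.
did_estimate = gap_anonymous - gap_traditional

print(f"gap under traditional procedure: {gap_traditional:+.3f}")
print(f"gap under anonymous procedure:   {gap_anonymous:+.3f}")
print(f"difference-in-differences:       {did_estimate:+.3f}")
```

The estimate only has a causal reading if the gap between the groups would have been the same under both procedures in the absence of anonymisation. That is an assumption about how the data were generated, not something the arithmetic can deliver.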
As yours truly has repeatedly argued on this blog, RCTs usually do not provide evidence that their results are exportable to other target systems. The almost religious belief with which their propagators portray them cannot hide the fact that RCTs cannot be taken for granted to give generalizable results. That something works somewhere is no warranty that it will work for us, or even that it works generally.
In an extremely interesting article on the grand claims to external validity often raised by advocates of RCTs, Lant Pritchett and Justin Sandefur now confirm this view and show that using an RCT is not at all the “gold standard” it is portrayed as:
Our point here is not to argue against any well-founded generalization of research findings, nor against the use of experimental methods. Both are central pillars of scientific research. As a means of quantifying the impact of a given development project, or measuring the underlying causal parameter of a clearly-specified economic model, field experiments provide unquestioned advantages over observational studies.
But the popularity of RCTs in development economics stems largely from the claim that they provide a guide to making “evidence-based” policy decisions. In the vast majority of cases, policy recommendations based on experimental results hinge not only on the internal validity of the treatment effect estimates, but also on their external validity across contexts.
Inasmuch as development economics is a worthwhile, independent field of study – rather than a purely parasitic form of regional studies, applying the lessons of rich-country economies to poorer settings – its central conceit is that development is different. The economic, social, and institutional systems of poor countries operate differently than in rich countries in ways that are sufficiently fundamental to require different models and different data.
It is difficult if not impossible to adjudicate the external validity of an individual experimental result in isolation. But experimental results do not exist in a vacuum. On many development policy questions, the literature as a whole (i.e., the combination of experimental and non-experimental results across multiple contexts) collectively invalidates any claim of external validity for any individual experimental result.