## Sloppy regression interpretations

14 Aug, 2019 at 16:57 | Posted in Statistics & Econometrics

In most econometrics textbooks the authors give an interpretation of a linear regression such as

Y = a + bX,

saying that a one-unit *increase* in X (years of education) will cause a b unit *increase* in Y (wages).

When dealing with time-series regressions this may well be OK. The problem is that this ‘dynamic’ interpretation of b is standardly also given as the ‘explanation’ of the slope coefficient for cross-sectional data. But in that case, the only increase that can generally come into question is the difference in the value of X (years of education) when going from one individual to another in the population. If we are interested, as we usually are, in saying something about the dynamics of an individual’s wages and education, we have to look elsewhere (unless we assume cross-unit and cross-time invariance, which, of course, would be utterly ridiculous from the perspective of relevance and realism).
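A small simulation may make the distinction concrete. The sketch below is only an illustration under invented assumptions: the data-generating process, the variable names and all the numbers are hypothetical. Every simulated individual gains exactly one wage unit per extra year of education, but an unobserved background factor raises both education and wages, so the cross-sectional slope says something quite different from what happens when any one individual gets more education.

```python
import random

def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical data-generating process (an assumption, not from the post):
# each individual's wage rises by WITHIN_EFFECT per extra year of education,
# while an unobserved background factor raises both education and wages.
WITHIN_EFFECT = 1.0

rng = random.Random(0)
education, wages = [], []
for _ in range(5000):
    background = rng.gauss(0, 1)
    years = 14 + 2 * background + rng.gauss(0, 1)
    wage = 10 + 3 * background + WITHIN_EFFECT * years + rng.gauss(0, 1)
    education.append(years)
    wages.append(wage)

cross_section_b = ols_slope(education, wages)
print(f"cross-sectional slope b:               {cross_section_b:.2f}")
print(f"effect of one more year, per individual: {WITHIN_EFFECT:.2f}")
```

Under these assumptions the estimated b comes out near 2.2, more than twice the individual effect of 1.0. Comparing different individuals is simply not the same thing as following one individual over time.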

## Understanding regression assumptions

12 Aug, 2019 at 20:45 | Posted in Statistics & Econometrics

Although most social scientists can recite the formal definitions of the various regression assumptions, many have little appreciation of the substantive meanings of these assumptions. And unless the meanings of these assumptions are understood, regression analysis almost inevitably will be a rigid exercise in which a handful of independent variables are cavalierly inserted into a standard linear additive regression and coefficients are estimated. Although such an exercise may occasionally produce results that are worth believing, it will do so only when an analyst is very lucky … This monograph was written to encourage students to … think of regression assumptions as a vital set of conditions the applicability of which must be explicitly analyzed each time regression analysis is utilized.

I thought this book was great when I first read it 25 years ago.

I still do.

## On the applicability of statistics in social sciences

8 Aug, 2019 at 13:32 | Posted in Statistics & Econometrics

Eminent statistician David Salsburg is rightfully very critical of the way social scientists (including economists and econometricians) have come to assume, uncritically and without argument, that they can apply probability distributions from statistical theory to their own area of research:

We assume there is an abstract space of elementary things called ‘events’ … If a measure on the abstract space of events fulfills certain axioms, then it is a probability. To use probability in real life, we have to identify this space of events and do so with sufficient specificity to allow us to actually calculate probability measurements on that space … Unless we can identify [this] abstract space, the probability statements that emerge from statistical analyses will have many different and sometimes contrary meanings …

Kolmogorov established the mathematical meaning of probability: Probability is a measure of sets in an abstract space of events. All the mathematical properties of probability can be derived from this definition. When we wish to apply probability to real life, we need to identify that abstract space of events for the particular problem at hand … It is not well established when statistical methods are used for observational studies … If we cannot identify the space of events that generate the probabilities being calculated, then one model is no more valid than another … As statistical models are used more and more for observational studies to assist in social decisions by government and advocacy groups, this fundamental failure to be able to derive probabilities without ambiguity will cast doubt on the usefulness of these methods.

Wise words well worth pondering on.

As long as economists and statisticians cannot really identify their statistical theories with real-world phenomena there is no real warrant for taking their statistical inferences seriously.

Just as there is no such thing as a ‘free lunch,’ there is no such thing as a ‘free probability.’ To be able to talk about probabilities at all, you have to specify a model. In statistics, any process in which you observe or measure is referred to as an experiment (rolling a die), and the results obtained as the outcomes or events of that experiment (the number of points rolled, e.g. 3 or 5). If there is no chance set-up or model that generates the probabilistic outcomes or events, then there is, strictly speaking, no event at all.

Probability is — as strongly argued by Keynes — a relational thing. It always must come with a specification of the model from which it is calculated. And then to be of any empirical scientific value it has to be shown to coincide with (or at least converge to or approximate) real data generating processes or structures — something seldom or never done!

And this is the basic problem with economic data. If you have a fair roulette-wheel, you can arguably specify probabilities and probability density distributions. But how do you conceive of the analogous ‘nomological machines’ for prices, gross domestic product, income distribution etc? Only by a leap of faith. And that does not suffice. You have to come up with some really good arguments if you want to persuade people into believing in the existence of socio-economic structures that generate data with characteristics conceivable as stochastic events portrayed by probabilistic density distributions!

## Econometric illusions

3 Aug, 2019 at 10:07 | Posted in Statistics & Econometrics

Because I was there when the economics department of my university got an IBM 360, I was very much caught up in the excitement of combining powerful computers with economic research. Unfortunately, I lost interest in econometrics almost as soon as I understood how it was done. My thinking went through four stages:

1. Holy shit! Do you see what you can do with a computer’s help.

2. Learning computer modeling puts you in a small class where only other members of the caste can truly understand you. This opens up huge avenues for fraud:

3. The main reason to learn stats is to prevent someone else from committing fraud against you.

4. More and more people will gain access to the power of statistical analysis. When that happens, the stratification of importance within the profession should be a matter of who asks the best questions.

Disillusionment began to set in. I began to suspect that all the really interesting economic questions were FAR beyond the ability to reduce them to mathematical formulas. Watching computers being applied to other pursuits than academic economic investigations over time only confirmed those suspicions.

1. Precision manufacture is an obvious application for computing. And for many applications, this worked magnificently. Any design that combined straight lines and circles could be easily described for computerized manufacture. Unfortunately, the really interesting design problems can NOT be reduced to formulas. A car’s fender, for example, cannot be described using formulas; it can only be described by specifying an assemblage of multiple points. If math formulas cannot describe something as common and uncomplicated as a car fender, how can they hope to describe human behavior?

2. When people started using computers for animation, it soon became apparent that human motion was almost impossible to model correctly. After a great deal of effort, the animators eventually put tracing balls on real humans and recorded that motion before transferring it to the animated character. Formulas failed to describe simple human behavior, like a toddler trying to walk.

Lately, I have discovered a Swedish economist who did NOT give up econometrics merely because it sounded so impossible. In fact, he still teaches the stuff. But for the rest of us, he systematically destroys the pretensions of those who think they can describe human behavior with some basic formulas.

Maintaining that economics is a science in the ‘true knowledge’ business, that Swedish economist remains a sceptic of the pretences and aspirations of econometrics. The marginal return on its ever higher technical sophistication in no way makes up for the lack of serious under-labouring of its deeper philosophical and methodological foundations that already Keynes complained about. The rather one-sided emphasis of usefulness and its concomitant instrumentalist justification cannot hide that the legions of probabilistic econometricians who give supportive evidence for their considering it ‘fruitful to believe’ in the possibility of treating unique economic data as the observable results of random drawings from an imaginary sampling of an imaginary population, are skating on thin ice.

A rigorous application of econometric methods in economics really presupposes that the phenomena of our real world economies are ruled by stable causal relations between variables. The endemic lack of both explanatory and predictive success of the econometric project indicates that this hope of finding fixed parameters is an incredible hope for which there, really, is no other ground than hope itself.

## Observational data and causal inference

2 Aug, 2019 at 11:18 | Posted in Statistics & Econometrics

Distinguished Professor of social psychology Richard E. Nisbett takes on the idea of intelligence and IQ testing in his **Intelligence and How to Get It** (Norton 2011). He also has some interesting thoughts on multiple-regression analysis and writes:

Researchers often determine the individual’s contemporary IQ or IQ earlier in life, socioeconomic status of the family of origin, living circumstances when the individual was a child, number of siblings, whether the family had a library card, educational attainment of the individual, and other variables, and put all of them into a multiple-regression equation predicting adult socioeconomic status or income or social pathology or whatever. Researchers then report the magnitude of the contribution of each of the variables in the regression equation, net of all the others (that is, holding constant all the others). It always turns out that IQ, net of all the other variables, is important to outcomes. But … the independent variables pose a tangle of causality – with some causing others in goodness-knows-what ways and some being caused by unknown variables that have not even been measured. Higher socioeconomic status of parents is related to educational attainment of the child, but higher-socioeconomic-status parents have higher IQs, and this affects both the genes that the child has and the emphasis that the parents are likely to place on education and the quality of the parenting with respect to encouragement of intellectual skills and so on. So statements such as “IQ accounts for X percent of the variation in occupational attainment” are built on the shakiest of statistical foundations. What nature hath joined together, multiple regressions cannot put asunder.

Now, I think this is right as far as it goes, although it would certainly have strengthened Nisbett’s argumentation if he had elaborated more on the methodological question around causality, or at least had given some mathematical-statistical-econometric references. Unfortunately, his alternative approach is not more convincing than regression analysis. Like so many other contemporary social scientists, Nisbett seems to think that randomization may solve the empirical problem. By randomizing we are getting different “populations” that are homogeneous with regard to all variables except the one we think is a genuine cause. In that way, we are supposedly able to do without actually knowing what all these other factors are.

If you succeed in performing *ideal* randomization with different treatment groups and control groups, that is attainable. *But* it presupposes that you really have been able to establish – and not just assume – that all other causes but the putative one have the same probability distribution in the treatment and control groups, and that the probability of assignment to treatment or control groups is independent of all other possible causal variables.

Unfortunately, *real* experiments and *real* randomizations seldom or never achieve this. So, yes, we may do without knowing *all* causes, but it takes *ideal* experiments and *ideal* randomizations to do that, not *real* ones.
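One way to see the gap between ideal and real randomization is to look at what a single, finite random assignment does to a background variable. The following is a minimal sketch under assumed conditions: 20 units, one standard-normal pre-treatment covariate, and 5,000 hypothetical repetitions (none of these numbers come from Nisbett or from this post). Balance holds on average across repetitions, but an actual experiment is run only once.

```python
import random
import statistics

def covariate_gap(rng, n_units=20):
    """Randomly split n_units into two equal groups and return the
    difference in group means of a pre-treatment covariate."""
    covariate = [rng.gauss(0, 1) for _ in range(n_units)]
    rng.shuffle(covariate)                 # the random assignment
    half = n_units // 2
    treated, control = covariate[:half], covariate[half:]
    return statistics.mean(treated) - statistics.mean(control)

rng = random.Random(1)
gaps = [covariate_gap(rng) for _ in range(5000)]

average_gap = statistics.mean(gaps)        # balance "in expectation"
share_large = sum(abs(g) > 0.5 for g in gaps) / len(gaps)

print(f"average gap over all randomizations: {average_gap:+.3f}")
print(f"share of single randomizations with |gap| > 0.5 sd: {share_large:.2f}")
```

Under these assumptions the average gap is close to zero, yet roughly a quarter of the individual randomizations leave the two groups more than half a standard deviation apart on the covariate. The same holds for causes nobody has measured.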

As yours truly has argued more than once on this blog, that means that in practice we do have to have sufficient background knowledge to deduce causal knowledge. Without old knowledge, we can’t get new knowledge, and — no causes in, no causes out.

## The biggest problem in science

25 Jul, 2019 at 22:14 | Posted in Statistics & Econometrics

There’s a huge debate going on in social science right now. The question is simple, and strikes near the heart of all research: What counts as solid evidence? …

Prominent statisticians, psychologists, economists, sociologists, political scientists, biomedical researchers, and others … argue that results should only be deemed “statistically significant” if they pass a higher threshold.

“We propose a change to P < 0.005,” the authors write. “This simple step would immediately improve the reproducibility of scientific research in many fields” …

There’s a critique of the proposal the authors whom I spoke to agree completely with: Changing the definition of statistical significance doesn’t address the real problem. And the real problem is the culture of science.

In 2016, Vox sent out a survey to more than 200 scientists, asking, “If you could change one thing about how science works today, what would it be and why?” One of the clear themes in the responses: The institutions of science need to get better at rewarding failure.

One young scientist told us, “I feel torn between asking questions that I know will lead to statistical significance and asking questions that matter.”

The biggest problem in science isn’t statistical significance. It’s the culture. She felt torn because young scientists need publications to get jobs. Under the status quo, in order to get publications, you need statistically significant results. Statistical significance alone didn’t lead to the replication crisis. The institutions of science incentivized the behaviors that allowed it to fester.

As shown over and over again when significance tests are applied, people have a tendency to read ‘not disconfirmed’ as ‘probably confirmed.’ Standard scientific methodology tells us that when there is only say a 10 % probability that pure sampling error could account for the observed difference between the data and the null hypothesis, it would be more ‘reasonable’ to conclude that we have a case of disconfirmation. Especially if we perform many independent tests of our hypothesis and they all give about the same 10 % result as our reported one, I guess most researchers would count the hypothesis as even more disconfirmed.
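The intuition about repeated tests can be made explicit. The sketch below uses Fisher's method for combining independent p-values. That choice is mine, as one standard way of formalizing the argument, and is not something the paragraph above commits to. The function name and the example of five tests are likewise illustrative assumptions.

```python
import math

def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    Under the null hypothesis, -2 * sum(ln p_i) follows a chi-square
    distribution with 2k degrees of freedom. For even degrees of freedom
    the upper tail has a closed form, so no statistics library is needed.
    """
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)   # chi-square statistic / 2
    tail = sum(half_stat ** j / math.factorial(j) for j in range(k))
    return math.exp(-half_stat) * tail

single = fisher_combined_p([0.10])       # one test: unchanged
five = fisher_combined_p([0.10] * 5)     # five independent tests at 10% each

print(f"one test at p = 0.10   -> combined p = {single:.3f}")
print(f"five tests at p = 0.10 -> combined p = {five:.4f}")
```

Five independent results at the 10% level combine to a p-value of about 0.011. None of them is ‘significant’ on its own, but taken together they count much more heavily against the null hypothesis, provided the tests really are independent and the underlying model is right.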

We should never forget that the underlying parameters we use when performing significance tests are *model constructions*. Our p-values mean nothing if the model is wrong. And most importantly — statistical significance tests DO NOT validate models!

In journal articles a typical regression equation will have an intercept and several explanatory variables. The regression output will usually include an F-test, with p – 1 degrees of freedom in the numerator and n – p in the denominator. The null hypothesis will not be stated. The missing null hypothesis is that all the coefficients vanish, except the intercept.

If F is significant, that is often thought to validate the model. Mistake. The F-test takes the model as given. Significance only means this:

*if* the model is right *and* the coefficients are 0, it is very unlikely to get such a big F-statistic. Logically, there are three possibilities on the table:

i) An unlikely event occurred.

ii) Or the model is right and some of the coefficients differ from 0.

iii) Or the model is wrong.

So?
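The third possibility is easy to produce on a computer. In the sketch below the data are generated, by assumption, from a quadratic relationship and then fitted with a straight line. The sample size, noise level and variable names are all hypothetical. The F-statistic is enormous although the fitted model is wrong by construction.

```python
import random

def simple_ols(xs, ys):
    """Fit y = a + b*x by least squares. Returns a, b and the F-statistic
    for the null hypothesis b = 0 (1 and n - 2 degrees of freedom)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    rss = sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))  # residual SS
    ess = b * b * sxx                                        # explained SS
    f_stat = ess / (rss / (n - 2))
    return a, b, f_stat

rng = random.Random(2)
xs = [rng.uniform(0, 3) for _ in range(200)]
ys = [x * x + rng.gauss(0, 0.5) for x in xs]   # the truth is quadratic

a, b, f_stat = simple_ols(xs, ys)

# A linear fit to a convex curve overshoots in the middle of the x range,
# so the residuals there are systematically negative.
residuals = [y - a - b * x for x, y in zip(xs, ys)]
middle = [r for x, r in zip(xs, residuals) if 1 <= x <= 2]
mean_middle_residual = sum(middle) / len(middle)

print(f"F-statistic: {f_stat:.0f}  (5% critical value is roughly 3.9)")
print(f"mean residual for 1 <= x <= 2: {mean_middle_residual:+.2f}")
```

The F-test rejects overwhelmingly, while the systematically negative residuals in the middle of the range show that the straight line is the wrong model. A significant F-test tells us nothing about which of the last two possibilities we are facing.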

## Improbability and the law of truly large numbers

21 Jul, 2019 at 12:30 | Posted in Statistics & Econometrics

## The validity of statistical induction

18 Jul, 2019 at 22:59 | Posted in Statistics & Econometrics

In my judgment, the practical usefulness of those modes of inference, here termed Universal and Statistical Induction, on the validity of which the boasted knowledge of modern science depends, can only exist (and I do not now pause to inquire again whether such an argument must be circular) if the universe of phenomena does in fact present those peculiar characteristics of *atomism* and *limited variety* which appear more and more clearly as the ultimate result to which material science is tending …

The physicists of the nineteenth century have reduced matter to the collisions and arrangements of particles, between which the ultimate qualitative differences are very few …

The validity of some current modes of inference may depend on the assumption that it is to material of this kind that we are applying them … Professors of probability have been often and justly derided for arguing as if nature were an urn containing black and white balls in fixed proportions. Quetelet once declared in so many words—“l’urne que nous interrogeons, c’est la nature.” But again in the history of science the methods of astrology may prove useful to the astronomer; and it may turn out to be true—reversing Quetelet’s expression—that “La nature que nous interrogeons, c’est une urne”.

Professors of probability and statistics, yes. And more or less every mainstream economist!

## Two must-read statistics books

17 Jul, 2019 at 16:36 | Posted in Statistics & Econometrics

Mathematical statistician **David Freedman**’s *Statistical Models and Causal Inference* (Cambridge University Press, 2010) and *Statistical Models: Theory and Practice* (Cambridge University Press, 2009) are marvellous books. They ought to be mandatory reading for every serious social scientist (including economists and econometricians) who doesn’t want to succumb to *ad hoc* assumptions and unsupported statistical conclusions!

How do we calibrate the uncertainty introduced by data collection? Nowadays, this question has become quite salient, and it is routinely answered using well-known methods of statistical inference, with standard errors, t-tests, and P-values … These conventional answers, however, turn out to depend critically on certain rather restrictive assumptions, for instance, random sampling …

Thus, investigators who use conventional statistical technique turn out to be making, explicitly or implicitly, quite restrictive behavioral assumptions about their data collection process … More typically, perhaps, the data in hand are simply the data most readily available …

The moment that conventional statistical inferences are made from convenience samples, substantive assumptions are made about how the social world operates … When applied to convenience samples, the random sampling assumption is not a mere technicality or a minor revision on the periphery; the assumption becomes an integral part of the theory …

In particular, regression and its elaborations … are now standard tools of the trade. Although rarely discussed, statistical assumptions have major impacts on analytic results obtained by such methods.

Consider the usual textbook exposition of least squares regression. We have $n$ observational units, indexed by $i = 1, \ldots, n$. There is a response variable $y_i$, conceptualized as $\mu_i + \epsilon_i$, where $\mu_i$ is the theoretical mean of $y_i$ while the disturbances or errors $\epsilon_i$ represent the impact of random variation (sometimes of omitted variables). The errors are assumed to be drawn independently from a common (gaussian) distribution with mean 0 and finite variance. Generally, the error distribution is not empirically identifiable outside the model; so it cannot be studied directly, even in principle, without the model. The error distribution is an imaginary population and the errors $\epsilon_i$ are treated as if they were a random sample from this imaginary population, a research strategy whose frailty was discussed earlier.

Usually, explanatory variables are introduced and $\mu_i$ is hypothesized to be a linear combination of such variables. The assumptions about the $\mu_i$ and $\epsilon_i$ are seldom justified or even made explicit, although minor correlations in the $\epsilon_i$ can create major bias in estimated standard errors for coefficients …

Why do $\mu_i$ and $\epsilon_i$ behave as assumed? To answer this question, investigators would have to consider, much more closely than is commonly done, the connection between social processes and statistical assumptions …

We have tried to demonstrate that statistical inference with convenience samples is a risky business. While there are better and worse ways to proceed with the data at hand, real progress depends on deeper understanding of the data-generation mechanism. In practice, statistical issues and substantive issues overlap. No amount of statistical maneuvering will get very far without some understanding of how the data were produced.

More generally, we are highly suspicious of efforts to develop empirical generalizations from any single dataset. Rather than ask what would happen in principle if the study were repeated, it makes sense to actually repeat the study. Indeed, it is probably impossible to predict the changes attendant on replication without doing replications. Similarly, it may be impossible to predict changes resulting from interventions without actually intervening.
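The remark that minor correlations in the errors can create major bias in estimated standard errors can be checked with a short simulation. The sketch below strips the regression down to its simplest case, estimating a mean (an intercept-only model). It assumes that any two errors in a sample share a correlation of just 0.05. The correlation, sample size and number of repetitions are illustrative assumptions, not figures from Freedman.

```python
import math
import random
import statistics

RHO = 0.05     # assumed correlation between any two errors in a sample
N = 100        # observations per sample
REPS = 1000    # number of simulated samples

rng = random.Random(3)
sample_means = []
naive_ses = []
for _ in range(REPS):
    shared = rng.gauss(0, 1)   # component common to all errors in this sample
    errors = [
        math.sqrt(RHO) * shared + math.sqrt(1 - RHO) * rng.gauss(0, 1)
        for _ in range(N)
    ]
    sample_means.append(statistics.mean(errors))
    # Textbook standard error of the mean, valid only for independent errors.
    naive_ses.append(statistics.stdev(errors) / math.sqrt(N))

actual_sd = statistics.stdev(sample_means)      # true sampling variability
typical_naive_se = statistics.mean(naive_ses)   # what the textbook formula reports

print(f"actual sd of the estimate:     {actual_sd:.3f}")
print(f"average reported standard error: {typical_naive_se:.3f}")
```

With these settings the textbook formula reports a standard error of about 0.10, while the estimate actually varies with a standard deviation of about 0.24. A correlation of only 0.05 makes the reported uncertainty too small by a factor of roughly 2.5, and nothing in a single sample reveals the problem.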

## Econometrics — a con art with no relevance whatsoever to real world economics

11 Jul, 2019 at 15:51 | Posted in Economics, Statistics & Econometrics

Econometrics looks “sciency”. Once in a seminar presentation I displayed two equations, one taken from *Econometrica* and the other from the *Journal of Theoretical and Experimental Physics* and challenged the audience to tell me which is which. No one volunteered to tell me which is which, including at least one hard-core econometrician. Economics is a social science where the behaviour of decision makers is not governed purely by economic considerations but also by social and psychological factors, which are not amenable to econometric testing. This is why no economic theory holds everywhere all the time. And this is why the results of empirical testing of economic theories are typically a mixed bag. And this is why econometricians use time-varying parametric estimation to account for changes in the values of estimated parameters over time (which means that the underlying relationship does not have the universality of a law). And this is why there are so many estimation methods that can be used to produce the desired results. In physics, on the other hand, a body falling under the force of gravity travels with an acceleration of 32 feet per second per second – this is true anywhere any time. In physics also, the boiling point of water under any level of atmospheric pressure can be predicted with accuracy.

Unlike physicists, econometricians are in a position to obtain the desired results, armed with the arsenal of tools produced by econometric theory. When an econometrician fails to obtain the desired results, he or she may try different functional forms, lag structures and estimation methods, and indulge in data mining until the desired results are obtained (torture produces a confession even when applied to data). If the empirical work is conducted for the purpose of writing an academic paper, the researcher seeks results that are “interesting” enough to warrant publication or results that confirm the view of the orthodoxy. And it is typically the case that the results cannot be replicated. Physicists do not have this luxury – it is unthinkable and easily verifiable that a physicist manipulates data (by using principal components or various econometric transformations) to obtain readings that refute Boyle’s law. Economists study the behaviour of consumers, firms and governments where expectations and uncertainties play key roles in the translation of economic theory into real world economics. These uncertainties mean that econometric modelling cannot produce accurate representation of the working of the economy.

Mainstream economists often hold the view that if you are critical of econometrics it can only be because you are a sadly misinformed and misguided person who dislikes and does not understand much of it.

As Moosa’s eminent article shows, this is, however, nothing but a gross misapprehension.

And just like Moosa, Keynes certainly did not misunderstand the crucial issues at stake in his critique of econometrics. Quite the contrary. He knew them all too well, and was not satisfied with the validity and philosophical underpinnings of the assumptions made for applying its methods.

Keynes’ critique is still valid and unanswered in the sense that the problems he pointed at are still with us today and ‘unsolved.’ Ignoring them — the most common practice among applied econometricians — is not to solve them.

To apply statistical and mathematical methods to the real-world economy, the econometrician has to make some quite strong assumptions. In a review of Tinbergen’s econometric work — published in *The Economic Journal* in 1939 — Keynes gave a comprehensive critique of Tinbergen’s work, focusing on the limiting and unreal character of the assumptions that econometric analyses build on:

**Completeness**: Where Tinbergen attempts to specify and quantify which different factors influence the business cycle, Keynes maintains there has to be a complete list of *all* the relevant factors to avoid misspecification and spurious causal claims. Usually, this problem is ‘solved’ by econometricians assuming that they somehow have a ‘correct’ model specification. Keynes is, to put it mildly, unconvinced:

It will be remembered that the seventy translators of the Septuagint were shut up in seventy separate rooms with the Hebrew text and brought out with them, when they emerged, seventy identical translations. Would the same miracle be vouchsafed if seventy multiple correlators were shut up with the same statistical material? And anyhow, I suppose, if each had a different economist perched on his *a priori*, that would make a difference to the outcome.

**Homogeneity**: To make inductive inferences possible, and to be able to apply econometrics, the system we try to analyse has to have a large degree of ‘homogeneity.’ According to Keynes most social and economic systems, especially from the perspective of real historical time, lack that ‘homogeneity.’ As he had argued already in *Treatise on Probability* (ch. 22), it wasn’t always possible to take repeated samples from a fixed population when we were analysing real-world economies. In many cases, there simply are no reasons at all to assume the samples to be homogenous. Lack of ‘homogeneity’ makes the principle of ‘limited independent variety’ non-applicable, and hence makes inductive inferences, strictly seen, impossible since one of their fundamental logical premises is not satisfied. Without “much repetition and uniformity in our experience” there is no justification for placing “great confidence” in our inductions (TP ch. 8).

And then, of course, there is also the ‘reverse’ variability problem of non-excitation: factors that do not change significantly during the period analysed, can still very well be extremely important causal factors.

**Stability:** Tinbergen assumes there is a stable spatio-temporal relationship between the variables his econometric models analyze. But as Keynes had argued already in his *Treatise on Probability* it was not really possible to make inductive generalisations based on correlations in one sample. As later studies of ‘regime shifts’ and ‘structural breaks’ have shown us, it is exceedingly difficult to find and establish the existence of stable econometric parameters for anything but rather short time series.

**Measurability:** Tinbergen’s model assumes that all relevant factors are measurable. Keynes questions if it is possible to adequately quantify and measure things like expectations and political and psychological factors. And more than anything, he questioned — both on epistemological and ontological grounds — that it was always and everywhere possible to measure real-world uncertainty with the help of probabilistic risk measures. Thinking otherwise can, as Keynes wrote, “only lead to error and delusion.”

**Independence**: Tinbergen assumes that the variables he treats are independent (still a standard assumption in econometrics). Keynes argues that in such a complex, organic and evolutionary system as an economy, independence is a deeply unrealistic assumption to make. Building econometric models from that kind of simplistic and unrealistic assumptions risks producing nothing but spurious correlations and causalities. Real-world economies are organic systems for which the statistical methods used in econometrics are ill-suited, or even, strictly seen, inapplicable. Mechanical probabilistic models have little leverage when applied to non-atomic evolving organic systems such as economies.

It is a great fault of symbolic pseudo-mathematical methods of formalising a system of economic analysis … that they expressly assume strict independence between the factors involved and lose all their cogency and authority if this hypothesis is disallowed; whereas, in ordinary discourse, where we are not blindly manipulating but know all the time what we are doing and what the words mean, we can keep “at the back of our heads” the necessary reserves and qualifications and the adjustments which we shall have to make later on, in a way in which we cannot keep complicated partial differentials “at the back” of several pages of algebra which assume that they all vanish.

Building econometric models can’t be a goal in itself. Good econometric models are means that make it possible for us to infer things about the real-world systems they ‘represent.’ If we can’t show that the mechanisms or causes that we isolate and handle in our econometric models are ‘exportable’ to the real world, they are of limited value to our understanding, explanations or predictions of real-world economic systems.

The kind of fundamental assumption about the character of material laws, on which scientists appear commonly to act, seems to me to be much less simple than the bare principle of uniformity. They appear to assume something much more like what mathematicians call the principle of the superposition of small effects, or, as I prefer to call it, in this connection, the *atomic* character of natural law. The system of the material universe must consist, if this kind of assumption is warranted, of bodies which we may term (without any implication as to their size being conveyed thereby) *legal atoms*, such that each of them exercises its own separate, independent, and invariable effect, a change of the total state being compounded of a number of separate changes each of which is solely due to a separate portion of the preceding state …

The scientist wishes, in fact, to assume that the occurrence of a phenomenon which has appeared as part of a more complex phenomenon, may be some reason for expecting it to be associated on another occasion with part of the same complex. Yet if different wholes were subject to laws *qua* wholes and not simply on account of and in proportion to the differences of their parts, knowledge of a part could not lead, it would seem, even to presumptive or probable knowledge as to its association with other parts.

**Linearity:** To make his models tractable, Tinbergen assumes the relationships between the variables he studies to be linear. This is still standard procedure today, but as Keynes writes:

It is a very drastic and usually improbable postulate to suppose that all economic forces are of this character, producing independent changes in the phenomenon under investigation which are directly proportional to the changes in themselves; indeed, it is ridiculous.

To Keynes, it was a ‘fallacy of reification’ to assume that all quantities are additive (an assumption closely linked to independence and linearity).

The unpopularity of the principle of organic unities shows very clearly how great is the danger of the assumption of unproved additive formulas. The fallacy, of which ignorance of organic unity is a particular instance, may perhaps be mathematically represented thus: suppose f(x) is the goodness of x and f(y) is the goodness of y. It is then assumed that the goodness of x and y together is f(x) + f(y) when it is clearly f(x + y) and only in special cases will it be true that f(x + y) = f(x) + f(y). It is plain that it is never legitimate to assume this property in the case of any given function without proof.

J. M. Keynes “Ethics in Relation to Conduct” (1903)
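A minimal Python sketch of the point about additive formulas. The two functions below are hypothetical stand-ins chosen purely for illustration (they are not Keynes' own examples): a linear function through the origin satisfies f(x + y) = f(x) + f(y), while a quadratic one does not, the gap being the interaction term 2xy.

```python
def is_additive(f, pairs, tol=1e-9):
    """Check whether f(x + y) == f(x) + f(y) holds on the given pairs."""
    return all(abs(f(x + y) - (f(x) + f(y))) <= tol for x, y in pairs)


# Hypothetical test points, chosen only for illustration
pairs = [(1.0, 2.0), (0.5, 4.0), (-3.0, 1.5)]


def linear(v):
    # Linear through the origin: the special case where additivity holds
    return 2.5 * v


def quadratic(v):
    # Nonlinear: f(x + y) = f(x) + f(y) + 2xy, so the parts interact
    return v ** 2


linear_ok = is_additive(linear, pairs)
quadratic_ok = is_additive(quadratic, pairs)

# Size of the interaction term for x = 1, y = 2
gap = quadratic(1.0 + 2.0) - (quadratic(1.0) + quadratic(2.0))

print("linear additive:   ", linear_ok)
print("quadratic additive:", quadratic_ok)
print("interaction gap:   ", gap)
```

The check only samples a few points, so it can refute additivity but never prove it, which is rather in the spirit of the quote: the property has to be established, not assumed.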

And as even one of the founding fathers of modern econometrics — Trygve Haavelmo — wrote:

What is the use of testing, say, the significance of regression coefficients, when maybe, the whole assumption of the linear regression equation is wrong?

Real-world social systems are usually not governed by stable causal mechanisms or capacities. The kinds of ‘laws’ and relations that econometrics has established, are laws and relations about entities in models that presuppose causal mechanisms and variables — and the relationship between them — being linear, additive, homogeneous, stable, invariant and atomistic. But when causal mechanisms operate in the real world, they do so only in ever-changing and unstable combinations where the whole is more than a mechanical sum of parts. Since statisticians and econometricians — as far as I can see — haven’t been able to convincingly warrant their assumptions of homogeneity, stability, invariance, independence, additivity as being ontologically isomorphic to real-world economic systems, Keynes’ critique is still valid. As long as — as Keynes writes in a letter to Frisch in 1935 — “nothing emerges at the end which has not been introduced expressly or tacitly at the beginning,” I remain doubtful of the scientific aspirations of econometrics.

In his critique of Tinbergen, Keynes points us to the fundamental logical, epistemological and ontological problems of applying statistical methods to a basically unpredictable, uncertain, complex, unstable, interdependent, and ever-changing social reality. Methods designed to analyse repeated sampling in controlled experiments under fixed conditions are not easily extended to an organic and non-atomistic world where time and history play decisive roles.

Econometric modelling should never be a substitute for thinking. From that perspective, it is really depressing to see how much of Keynes’ critique of the pioneering econometrics in the 1930s-1940s is still relevant today. And that is also a reason why we — as does Moosa — have to keep on criticizing it.

The general line you take is interesting and useful. It is, of course, not exactly comparable with mine. I was raising the logical difficulties. You say in effect that, if one was to take these seriously, one would give up the ghost in the first lap, but that the method, used judiciously as an aid to more theoretical enquiries and as a means of suggesting possibilities and probabilities rather than anything else, taken with enough grains of salt and applied with superlative common sense, won’t do much harm. I should quite agree with that. That is how the method ought to be used.

Keynes, letter to E.J. Broster, December 19, 1939

## Thresholds and statistical significance

30 Jun, 2019 at 09:48 | Posted in Statistics & Econometrics | 1 Comment

In an article on *Ekonomistas*, the economist Robert Östling argues that the solution to the much-discussed ‘replication crisis’ is to change the threshold for what counts as ‘statistically significant’ from 5% to 0.5%.

Commendable as this may be in itself, it is no solution. It is not enough to change arbitrary levels for what should or should not be considered ‘statistically significant’. That is not where the fundamental problem lies:

We recommend dropping the NHST [null hypothesis significance testing] paradigm — and the p-value thresholds associated with it — as the default statistical paradigm for research, publication, and discovery in the biomedical and social sciences. Specifically, rather than allowing statistical significance as determined by p < 0.05 (or some other statistical threshold) to serve as a lexicographic decision rule in scientific publication and statistical decision making more broadly as per the status quo, we propose that the p-value be demoted from its threshold screening role and instead, treated continuously, be considered along with the neglected factors [such factors as prior and related evidence, plausibility of mechanism, study design and data quality, real world costs and benefits, novelty of finding, and other factors that vary by research domain] as just one among many pieces of evidence.

We make this recommendation for three broad reasons. First, in the biomedical and social sciences, the sharp point null hypothesis of zero effect and zero systematic error used in the overwhelming majority of applications is generally not of interest because it is generally implausible. Second, the standard use of NHST — to take the rejection of this straw man sharp point null hypothesis as positive or even definitive evidence in favor of some preferred alternative hypothesis — is a logical fallacy that routinely results in erroneous scientific reasoning even by experienced scientists and statisticians. Third, p-value and other statistical thresholds encourage researchers to study and report single comparisons rather than focusing on the totality of their data and results.

We must never forget that the underlying parameters we use when performing significance tests are *model constructions*. Whatever p-values we obtain, they tell us nothing if the model is wrong. And above all, no matter how many significance tests we run and whatever thresholds we set, they *never* validate models!

In journal articles a typical regression equation will have an intercept and several explanatory variables. The regression output will usually include an F-test, with p – 1 degrees of freedom in the numerator and n – p in the denominator. The null hypothesis will not be stated. The missing null hypothesis is that all the coefficients vanish, except the intercept.

If F is significant, that is often thought to validate the model. Mistake. The F-test takes the model as given. Significance only means this: *if* the model is right *and* the coefficients are 0, it is very unlikely to get such a big F-statistic. Logically, there are three possibilities on the table:

i) An unlikely event occurred.

ii) Or the model is right and some of the coefficients differ from 0.

iii) Or the model is wrong.

So?
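Possibility (iii) is easy to illustrate with a small simulation. The sketch below is only that, a sketch under assumptions of my own choosing: the data are generated from a quadratic relation (y = x² plus noise), yet we fit the straight line y = a + bx and compute the usual F-statistic with p – 1 and n – p degrees of freedom (here p = 2). The sample size, noise level and seed are arbitrary.

```python
import random


def ols_f_test(x, y):
    """Fit y = a + b*x by least squares; return (a, b, F, residuals)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    sse = sum(r ** 2 for r in resid)                  # unexplained sum of squares
    ssr = sum((a + b * xi - my) ** 2 for xi in x)     # explained sum of squares
    p = 2                                             # intercept + one slope
    f_stat = (ssr / (p - 1)) / (sse / (n - p))
    return a, b, f_stat, resid


def mean(values):
    return sum(values) / len(values)


random.seed(0)
n = 100
x = [3 * i / (n - 1) for i in range(n)]               # evenly spaced on [0, 3]
y = [xi ** 2 + random.gauss(0, 0.3) for xi in x]      # the true relation is NOT linear

a, b, f_stat, resid = ols_f_test(x, y)

# Average residual in the lower, middle and upper third of the x-range
third = n // 3
low = mean(resid[:third])
mid = mean(resid[third:2 * third])
high = mean(resid[2 * third:])

print("F-statistic:", round(f_stat, 1))
print("mean residual (low, middle, high x):", round(low, 2), round(mid, 2), round(high, 2))
```

The F-statistic comes out far above any conventional critical value, while the residuals are systematically positive at both ends of the range and negative in the middle. The test is 'significant' and the linear model is wrong at the same time, which is exactly why significance cannot validate the model.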

## The difference between statistical and causal assumptions

24 Jun, 2019 at 19:57 | Posted in Statistics & Econometrics | Comments Off on The difference between statistical and causal assumptions

There are three fundamental differences between statistical and causal assumptions. First, statistical assumptions, even untested, are testable in principle, given sufficiently large sample and sufficiently fine measurements. Causal assumptions, in contrast, cannot be verified even in principle, unless one resorts to experimental control. This difference is especially accentuated in Bayesian analysis. Though the priors that Bayesians commonly assign to statistical parameters are untested quantities, the sensitivity to these priors tends to diminish with increasing sample size. In contrast, sensitivity to priors of causal parameters … remains non-zero regardless of (nonexperimental) sample size.

Second, statistical assumptions can be expressed in the familiar language of probability calculus, and thus assume an aura of scholarship and scientific respectability. Causal assumptions, as we have seen before, are deprived of that honor, and thus become immediate suspect of informal, anecdotal or metaphysical thinking. Again, this difference becomes illuminated among Bayesians, who are accustomed to accepting untested, judgmental assumptions, and should therefore invite causal assumptions with open arms—they don’t. A Bayesian is prepared to accept an expert’s judgment, however esoteric and untestable, so long as the judgment is wrapped in the safety blanket of a probability expression. Bayesians turn extremely suspicious when that same judgment is cast in plain English, as in “mud does not cause rain” …

The third resistance to causal (vis-a-vis statistical) assumptions stems from their intimidating clarity. Assumptions about abstract properties of density functions or about conditional independencies among variables are, cognitively speaking, rather opaque, hence they tend to be forgiven, rather than debated. In contrast, assumptions about how variables cause one another are shockingly transparent, and tend therefore to invite counter-arguments and counter-hypotheses.

Pearl’s seminal contributions to this research field are well known and indisputable. But on the ‘taming’ and ‘resolve’ of the issues, yours truly has to admit that (under the influence of especially David Freedman and Nancy Cartwright) I still have some doubts about the reach, especially in terms of realism and relevance, of his ‘do-calculus solutions’ for the social sciences in general and economics in particular. The distinction between the causal — ‘interventionist’ — E[Y|do(X)] and the more traditional statistical — ‘conditional expectationist’ — E[Y|X] is crucial, but Pearl and his associates, although they have fully explained why the first is so important, still have to convince us that it (in a relevant way) can be exported from ‘engineer’ contexts, where it arguably applies easily and universally, to socio-economic contexts where ‘manipulativity’ and ‘modularity’ are perhaps not so universally at hand.
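To see what the distinction between E[Y|X] and E[Y|do(X)] amounts to, here is a toy simulation. Everything in it is an assumption made for illustration: a single confounder Z, linear equations, Gaussian noise, a true causal effect of X on Y equal to 1.0, and, crucially, a world in which we are free to set X ourselves without disturbing anything else. It is, in other words, precisely the kind of ‘engineer’ context discussed above.

```python
import random


def slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def outcome(x, z, rng):
    # Hypothetical structural equation: the causal effect of X on Y is 1.0,
    # and the confounder Z affects Y with coefficient 2.0
    return 1.0 * x + 2.0 * z + rng.gauss(0, 1)


rng = random.Random(1)
n = 20000
z = [rng.gauss(0, 1) for _ in range(n)]

# Observational regime: X is partly driven by the confounder Z
x_obs = [zi + rng.gauss(0, 1) for zi in z]
y_obs = [outcome(xi, zi, rng) for xi, zi in zip(x_obs, z)]

# Interventional regime, do(X): X is set by us, independently of Z
x_do = [rng.gauss(0, 2 ** 0.5) for _ in range(n)]
y_do = [outcome(xi, zi, rng) for xi, zi in zip(x_do, z)]

obs_slope = slope(x_obs, y_obs)   # slope of E[Y|X], close to 2.0 in this setup
do_slope = slope(x_do, y_do)      # slope of E[Y|do(X)], close to 1.0 in this setup

print("slope of E[Y|X]:    ", round(obs_slope, 2))
print("slope of E[Y|do(X)]:", round(do_slope, 2))
```

The conditional expectation overstates the causal effect by a factor of about two, and only the intervention recovers it. But note what the code takes for granted: that X can be reset while the equation for Y stays exactly the same. Whether anything like that modularity holds in a real economy is the open question.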

## Why statistics does not give us causality

24 Jun, 2019 at 12:28 | Posted in Statistics & Econometrics | 4 Comments

If contributions made by statisticians to the understanding of causation are to be taken over with advantage in any specific field of inquiry, then what is crucial is that the right relationship should exist between statistical and subject-matter concerns …

Where the ultimate aim of research is not prediction *per se* but rather causal explanation, an idea of causation that is expressed in terms of predictive power — as, for example, ‘Granger’ causation — is likely to be found wanting. Causal explanations cannot be arrived at through statistical methodology alone: a subject-matter input is also required in the form of background knowledge and, crucially, theory …

Likewise, the idea of causation as consequential manipulation is apt to research that can be undertaken primarily through experimental methods and, especially to ‘practical science’ where the central concern is indeed with ‘the consequences of performing particular acts’. The development of this idea in the context of medical and agricultural research is as understandable as the development of that of causation as robust dependence within applied econometrics. However, the extension of the manipulative approach into sociology would not appear promising, other than in rather special circumstances … The more fundamental difficulty is that, under the — highly anthropocentric — principle of ‘no causation without manipulation’, the recognition that can be given to the action of individuals as having causal force is in fact peculiarly limited.

Causality in social sciences — and economics — can never solely be a question of statistical inference. Causality entails more than predictability, and explaining social phenomena in real depth requires theory. Analysis of variation — the foundation of all econometrics — can never in itself reveal how these variations are brought about. Only when we are able to tie actions, processes or structures to the statistical relations detected can we say that we are getting at relevant explanations of causation.

Most facts have many different, possible, alternative explanations, but we want to find the best of all contrastive (since all real explanation takes place relative to a set of alternatives) explanations. So which is the best explanation? Many scientists, influenced by statistical reasoning, think that the likeliest explanation is the best explanation. But the likelihood of x is not in itself a strong argument for thinking it explains y. I would rather argue that what makes one explanation better than another are things like aiming for and finding powerful, deep, causal features and mechanisms that we have warranted and justified reasons to believe in. Statistical — especially the variety based on a Bayesian epistemology — reasoning generally has no room for these kinds of explanatory considerations. The only thing that matters is the probabilistic relation between evidence and hypothesis. That is also one of the main reasons I find abduction — inference to the best explanation — a better description and account of what constitutes actual scientific reasoning and inferences.

For more on these issues — see the chapter “Capturing causality in economics and the limits of statistical inference” in my *On the use and misuse of theories and models in economics*.

In the social sciences … regression is used to discover relationships or to disentangle cause and effect. However, investigators have only vague ideas as to the relevant variables and their causal order; functional forms are chosen on the basis of convenience or familiarity; serious problems of measurement are often encountered.

Regression may offer useful ways of summarizing the data and making predictions. Investigators may be able to use summaries and predictions to draw substantive conclusions. However, I see no cases in which regression equations, let alone the more complex methods, have succeeded as engines for discovering causal relationships.

Some statisticians and data scientists think that algorithmic formalisms somehow give them access to causality. That is, however, simply not true. Assuming ‘convenient’ things like faithfulness or stability is not to give proofs. It’s to assume what has to be proven. Deductive-axiomatic methods used in statistics do not produce evidence for causal inferences. The real causality we are searching for is the one existing in the real world around us. If there is no warranted connection between axiomatically derived theorems and the real world, well, then we haven’t really obtained the causation we are looking for.

## Why attractive people you date tend to be jerks

19 Jun, 2019 at 00:12 | Posted in Statistics & Econometrics | 1 Comment

Have you ever noticed that, among the people you date, the attractive ones tend to be jerks? Instead of constructing elaborate psychosocial theories, consider a simpler explanation. Your choice of people to date depends on two factors, attractiveness and personality. You’ll take a chance on dating a mean attractive person or a nice unattractive person, and certainly a nice attractive person, but not a mean unattractive person … This creates a spurious negative correlation between attractiveness and personality. The sad truth is that unattractive people are just as mean as attractive people — but you’ll never realize it, because you’ll never date somebody who is both mean and unattractive.
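The selection effect described here is easy to reproduce. In the sketch below, attractiveness and niceness are drawn independently, so in the population they are uncorrelated by construction. The dating rule (date anyone whose attractiveness plus niceness is above zero) is my own hypothetical stand-in for ‘not a mean unattractive person’, and the scores, sample size and seed are arbitrary.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(42)
n = 20000

# Independent standardized scores: no relation at all in the population
attractive = [rng.gauss(0, 1) for _ in range(n)]
nice = [rng.gauss(0, 1) for _ in range(n)]

# Hypothetical dating rule: the combined score must be above zero
dated = [(a, p) for a, p in zip(attractive, nice) if a + p > 0]

r_all = pearson(attractive, nice)
r_dated = pearson([a for a, _ in dated], [p for _, p in dated])

print("correlation in the whole population:", round(r_all, 3))
print("correlation among the people dated: ", round(r_dated, 3))
```

In the full population the correlation is essentially zero. Among the people dated it is clearly negative, although nothing about attractiveness makes anyone meaner. The pattern is created entirely by how the sample was selected.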

## Substantive relevance — not ‘clever’ design — is what matters most in science

11 Jun, 2019 at 16:40 | Posted in Statistics & Econometrics | 1 Comment

If anything, Snow’s path-breaking research underlines how important it is not to equate science with statistical calculation, and that the value of ‘as-if’ random interventions and experiments ultimately depends on the degree to which they shed light on substantive and interesting scientific questions.

All science entails human judgement, and using statistical models doesn’t relieve us of that necessity. And we should never forget that the underlying parameters we use when performing statistical tests are *model constructions*. And if the model is wrong, the value of our calculations is nil. As ‘shoe-leather researcher’ David Freedman wrote in *Statistical Models and Causal Inference*:

I believe model validation to be a central issue. Of course, many of my colleagues will be found to disagree. For them, fitting models to data, computing standard errors, and performing significance tests is “informative,” even though the basic statistical assumptions (linearity, independence of errors, etc.) cannot be validated. This position seems indefensible, nor are the consequences trivial. Perhaps it is time to reconsider.
