## On the limits of ‘mediation analysis’ and ‘statistical causality’

23 June, 2018 at 23:18 | Posted in Statistics & Econometrics

“Mediation analysis” is this thing where you have a treatment and an outcome and you’re trying to model how the treatment works: how much does it directly affect the outcome, and how much is the effect “mediated” through intermediate variables …

In the real world, it’s my impression that almost all the mediation analyses that people actually fit in the social and medical sciences are misguided: lots of examples where the assumptions aren’t clear and where, in any case, coefficient estimates are hopelessly noisy and where confused people will over-interpret statistical significance …

More and more I’ve been coming to the conclusion that the standard causal inference paradigm is broken … So how to do it? I don’t think traditional path analysis or other multivariate methods of the throw-all-the-data-in-the-blender-and-let-God-sort-em-out variety will do the job. Instead we need some structure and some prior information.
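To see why mediation estimates tend to be so ‘hopelessly noisy’, consider a minimal simulation sketch (all numbers hypothetical, assuming only numpy): a simple linear mediation model in which the ‘indirect effect’ is estimated, as is common, by the product of two regression coefficients.

```python
import numpy as np

# A minimal sketch (hypothetical numbers): linear mediation model
#   M = a*T + noise,   Y = b*M + c*T + noise
# The "indirect effect" is commonly estimated as a_hat * b_hat.
rng = np.random.default_rng(0)
a, b, c, n = 0.2, 0.2, 0.2, 100     # small true effects, modest sample

estimates = []
for _ in range(2000):
    T = rng.normal(size=n)
    M = a * T + rng.normal(size=n)
    Y = b * M + c * T + rng.normal(size=n)
    a_hat = np.linalg.lstsq(np.column_stack([T, np.ones(n)]), M, rcond=None)[0][0]
    b_hat = np.linalg.lstsq(np.column_stack([M, T, np.ones(n)]), Y, rcond=None)[0][0]
    estimates.append(a_hat * b_hat)

estimates = np.array(estimates)
print("true indirect effect :", a * b)
print("mean of estimates    :", estimates.mean().round(3))
print("std dev of estimates :", estimates.std().round(3))
```

Even with everything linear, correctly specified and the treatment exogenous, the spread of the estimates is of the same order as the effect itself — and real mediation analyses rarely enjoy those luxuries.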

Causality in social sciences — and economics — can never solely be a question of statistical inference. Causality entails more than predictability, and to really explain social phenomena in depth requires theory. Analysis of variation — the foundation of all econometrics — can never in itself reveal how these variations are brought about. Only when we are able to tie actions, processes or structures to the statistical relations detected can we say that we are getting at relevant explanations of causation.

Most facts have many different, possible, alternative explanations, but we want to find the best of all contrastive explanations (since all real explanation takes place relative to a set of alternatives). So which is the best explanation? Many scientists, influenced by statistical reasoning, think that the likeliest explanation is the best explanation. But the likelihood of x is not in itself a strong argument for thinking it explains y. I would rather argue that what makes one explanation better than another are things like aiming for and finding powerful, deep, causal features and mechanisms that we have warranted and justified reasons to believe in. Statistical reasoning — especially the variety based on a Bayesian epistemology — generally has no room for these kinds of explanatory considerations. The only thing that matters is the probabilistic relation between evidence and hypothesis. That is also one of the main reasons I find abduction — inference to the best explanation — a better description and account of what constitutes actual scientific reasoning and inference.

In the social sciences … regression is used to discover relationships or to disentangle cause and effect. However, investigators have only vague ideas as to the relevant variables and their causal order; functional forms are chosen on the basis of convenience or familiarity; serious problems of measurement are often encountered.

Regression may offer useful ways of summarizing the data and making predictions. Investigators may be able to use summaries and predictions to draw substantive conclusions. However, I see no cases in which regression equations, let alone the more complex methods, have succeeded as engines for discovering causal relationships.

Some statisticians and data scientists think that algorithmic formalisms somehow give them access to causality. That is, however, simply not true. Assuming ‘convenient’ things like faithfulness or stability is not to give proofs. It is to assume what has to be proven. Deductive-axiomatic methods used in statistics do not produce evidence for causal inferences. The real causality we are searching for is the one existing in the real world around us. If there is no warranted connection between axiomatically derived theorems and the real world, then we have not really obtained the causation we are looking for.

If contributions made by statisticians to the understanding of causation are to be taken over with advantage in any specific field of inquiry, then what is crucial is that the right relationship should exist between statistical and subject-matter concerns …

The idea of causation as consequential manipulation is apt to research that can be undertaken primarily through experimental methods and, especially to ‘practical science’ where the central concern is indeed with ‘the consequences of performing particular acts’. The development of this idea in the context of medical and agricultural research is as understandable as the development of that of causation as robust dependence within applied econometrics. However, the extension of the manipulative approach into sociology would not appear promising, other than in rather special circumstances … The more fundamental difficulty is that under the — highly anthropocentric — principle of ‘no causation without manipulation’, the recognition that can be given to the action of individuals as having causal force is in fact peculiarly limited.

## Haavelmo and Frisch on the limited value of econometrics

21 June, 2018 at 12:12 | Posted in Statistics & Econometrics

For the sake of balancing the overly rosy picture of econometric achievements given in the usual econometrics textbooks today, it may be interesting to see how Trygve Haavelmo — on the completion (in 1958) of the twenty-fifth volume of *Econometrica* — assessed the role of econometrics in the advancement of economics. Although mainly positive about the “repair work” and “clearing-up work” done, Haavelmo also found some grounds for despair:

We have found certain general principles which would seem to make good sense. Essentially, these principles are based on the reasonable idea that, if an economic model is in fact “correct” or “true,” we can say something a priori about the way in which the data emerging from it must behave. We can say something, a priori, about whether it is theoretically possible to estimate the parameters involved. And we can decide, a priori, what the proper estimation procedure should be … But the concrete results of these efforts have often been a seemingly lower degree of accuracy of the would-be economic laws (i.e., larger residuals), or coefficients that seem a priori less reasonable than those obtained by using cruder or clearly inconsistent methods.

There is the possibility that the more stringent methods we have been striving to develop have actually opened our eyes to recognize a plain fact: viz., that the “laws” of economics are not very accurate in the sense of a close fit, and that we have been living in a dream-world of large but somewhat superficial or spurious correlations.

And as the quote below shows, Frisch also shared some of Haavelmo’s — and Keynes’s — doubts on the applicability of econometrics:

I have personally always been skeptical of the possibility of making macroeconomic predictions about the development that will follow on the basis of given initial conditions … I have believed that the analytical work will give higher yields – now and in the near future – if they become applied in macroeconomic decision models where the line of thought is the following: “If this or that policy is made, and these conditions are met in the period under consideration, probably a tendency to go in this or that direction is created”.

Real-world social systems are usually not governed by stable causal mechanisms or capacities. The kinds of ‘laws’ and relations that econometrics has established are laws and relations about entities in models that presuppose causal mechanisms and variables — and the relationships between them — being linear, additive, homogeneous, stable, invariant and atomistic. But when causal mechanisms operate in the real world, they only do so in ever-changing and unstable combinations where the whole is more than a mechanical sum of parts.

Since statisticians and econometricians have not been able to convincingly warrant their assumptions of homogeneity, stability, invariance, independence, additivity as being ontologically isomorphic to real-world economic systems, there are still strong reasons to be critical of the econometric project. There are deep epistemological and ontological problems of applying statistical methods to a basically unpredictable, uncertain, complex, unstable, interdependent, and ever-changing social reality. Methods designed to analyse repeated sampling in controlled experiments under fixed conditions are not easily extended to an organic and non-atomistic world where time and history play decisive roles.

Econometric modelling should never be a substitute for thinking.

The general line you take is interesting and useful. It is, of course, not exactly comparable with mine. I was raising the logical difficulties. You say in effect that, if one was to take these seriously, one would give up the ghost in the first lap, but that the method, used judiciously as an aid to more theoretical enquiries and as a means of suggesting possibilities and probabilities rather than anything else, taken with enough grains of salt and applied with superlative common sense, won’t do much harm. I should quite agree with that. That is how the method ought to be used.

Keynes, letter to E.J. Broster, December 19, 1939

## Statistics and econometrics are not very helpful for understanding economies

16 June, 2018 at 20:32 | Posted in Statistics & Econometrics

A statistician may have done the programming, but when you press a button on a computer keyboard and ask the computer to find some good patterns, better get clear a sad fact: computers do not think. They do exactly what the programmer told them to do and nothing more. They look for the patterns that we tell them to look for, those and nothing more. When we turn to the computer for advice, we are only talking to ourselves …

Mathematical analysis works great to decide which horse wins, if we are completely confident which horses are in the race, but it breaks down when we are not sure. In experimental settings, the set of alternative models can often be well agreed on, but with nonexperimental economics data, the set of models is subject to enormous disagreements. You disagree with your model made yesterday, and I disagree with your model today. Mathematics does not help much resolve our internal intellectual disagreements.

Indeed. As social researchers, we should never equate science with mathematics and statistical calculation. All science entails human judgement, and using mathematical and statistical models doesn’t relieve us of that necessity. They are no substitutes for thinking and doing real science. Or as a great German philosopher once famously wrote:

There is no royal road to science, and only those who do not dread the fatiguing climb of its steep paths have a chance of gaining its luminous summits.

## Econometric inconsistencies

16 June, 2018 at 10:17 | Posted in Statistics & Econometrics

In plain terms, it is evident that if what is really the same factor is appearing in several places under various disguises, a free choice of regression coefficients can lead to strange results. It becomes like those puzzles for children where you write down your age, multiply, add this and that, subtract something else, and eventually end up with the number of the Beast in Revelation.

Prof. Tinbergen explains that, generally speaking, he assumes that the correlations under investigation are linear … One would have liked to be told emphatically what is involved in the assumption of linearity. It means that the quantitative effect of any causal factor on the phenomenon under investigation is directly proportional to the factor’s own magnitude … But it is a very drastic and usually improbable postulate to suppose that all economic forces are of this character, producing independent changes in the phenomenon under investigation which are directly proportional to the changes in themselves; indeed, it is ridiculous. Yet this is what Prof. Tinbergen is throughout assuming …

Keynes’ comprehensive critique of econometrics and the assumptions it is built around — completeness, measurability, independence, homogeneity, and linearity — is still valid today.
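Keynes’ worry about the ‘same factor appearing under various disguises’ is easy to reproduce. Here is a minimal sketch (purely hypothetical data, assuming only numpy) in which two regressors are nearly identical: the individual coefficients swing wildly from sample to sample even though their sum is pinned down rather well.

```python
import numpy as np

# A minimal sketch (hypothetical data): two regressors that are nearly
# the "same factor under various disguises".  Individual coefficients
# swing wildly across samples even though their sum is well determined.
rng = np.random.default_rng(1)
n = 100
for rep in range(3):
    x1 = rng.normal(size=n)
    x2 = x1 + 0.05 * rng.normal(size=n)        # almost the same variable
    y = 1.0 * x1 + 1.0 * x2 + rng.normal(size=n)
    X = np.column_stack([x1, x2, np.ones(n)])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    print(f"sample {rep}: b1={beta[0]:+.2f}  b2={beta[1]:+.2f}  b1+b2={beta[0]+beta[1]:.2f}")
```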

Most work in econometrics is made on the assumption that the researcher has a theoretical model that is ‘true.’ But to think that we are able to construct a model in which all relevant variables are included and the functional relationships between them are correctly specified is not only a belief without support, it is a belief *impossible* to support.

The theories we work with when building our econometric regression models are insufficient. No matter what we study, there are always some variables missing, and we don’t know the correct way to functionally specify the relationships between the variables.

*Every* econometric model constructed is misspecified. There is always an endless list of possible variables to include, and endless possible ways to specify the relationships between them. So every applied econometrician comes up with his own specification and ‘parameter’ estimates. The econometric Holy Grail of consistent and stable parameter-values is nothing but a dream.

A rigorous application of econometric methods in economics really presupposes that the phenomena of our real world economies are ruled by stable causal relations between variables. Parameter-values estimated in specific spatio-temporal contexts are *presupposed* to be exportable to totally different contexts. To warrant this assumption one, however, has to convincingly establish that the targeted acting causes are stable and invariant so that they maintain their parametric status after the bridging. The endemic lack of predictive success of the econometric project indicates that this hope of finding fixed parameters is a hope for which there really is no other ground than hope itself.
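The point about exporting parameters can be made concrete with a small sketch (hypothetical data, assuming only numpy): a slope estimated in one ‘regime’ is carried over to another in which the underlying relationship has shifted, and prediction quality collapses accordingly.

```python
import numpy as np

# A minimal sketch (hypothetical data): the slope linking x and y changes
# between two "spatio-temporal contexts".  A parameter estimated in
# regime A and exported to regime B predicts badly.
rng = np.random.default_rng(2)
n = 200
x_a = rng.normal(size=n)
y_a = 2.0 * x_a + rng.normal(size=n)           # regime A: slope 2
x_b = rng.normal(size=n)
y_b = 0.5 * x_b + rng.normal(size=n)           # regime B: slope 0.5

slope_a = np.polyfit(x_a, y_a, 1)[0]           # estimated in regime A only
rmse_exported = np.sqrt(np.mean((y_b - slope_a * x_b) ** 2))
rmse_local = np.sqrt(np.mean((y_b - np.polyfit(x_b, y_b, 1)[0] * x_b) ** 2))
print(f"slope estimated in A            : {slope_a:.2f}")
print(f"RMSE in B using exported slope  : {rmse_exported:.2f}")
print(f"RMSE in B using B's own slope   : {rmse_local:.2f}")
```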

The theoretical conditions that have to be fulfilled for econometrics to really work are nowhere even closely met in reality. Making outlandish statistical assumptions does not provide a solid ground for doing relevant social science and economics. Although econometrics has become the most used quantitative method in economics today, it is still a fact that the inferences made from it are as a rule invalid.

Econometrics is basically a deductive method. Given the assumptions, it delivers deductive inferences. The problem, of course, is that we will never completely know when the assumptions are right. Conclusions can only be as certain as their premises — and that also applies to econometrics.

## Bayesian religion

12 June, 2018 at 14:19 | Posted in Statistics & Econometrics

There is a nice YouTube video with Tony O’Hagan interviewing Dennis Lindley. Of course, Dennis is a legend and his impact on the field of statistics is huge.

At one point, Tony points out that some people liken Bayesian inference to a religion. Dennis claims this is false. Bayesian inference, he correctly points out, starts with some basic axioms and then the rest follows by deduction. This is logic, not religion.

I agree that the mathematics of Bayesian inference is based on sound logic. But, with all due respect, I think Dennis misunderstood the question. When people say that “Bayesian inference is like a religion,” they are not referring to the logic of Bayesian inference. They are referring to how adherents of Bayesian inference behave.

(As an aside, detractors of Bayesian inference do not deny the correctness of the logic. They just don’t think the axioms are relevant for data analysis. For example, no one doubts the axioms of Peano arithmetic. But that doesn’t imply that arithmetic is the foundation of statistical inference. But I digress.)

The vast majority of Bayesians are pragmatic, reasonable people. But there is a sub-group of die-hard Bayesians who do treat Bayesian inference like a religion. By this I mean:

- They are very cliquish.
- They have a strong emotional attachment to Bayesian inference.
- They are overly sensitive to criticism.
- They are unwilling to entertain the idea that Bayesian inference might have flaws.
- When someone criticizes Bayes, they think that critic just “doesn’t get it.”
- They mock people with differing opinions …

No evidence you can provide would ever make the die-hards doubt their ideas. To them, Sir David Cox, Brad Efron and other giants in our field who have doubts about Bayesian inference, are not taken seriously because they “just don’t get it.”

So is Bayesian inference a religion? For most Bayesians: no. But for the thin-skinned, inflexible die-hards who have attached themselves so strongly to their approach to inference that they make fun of, or get mad at, critics: yes, it is a religion.

## On randomness and probability in economics

11 June, 2018 at 11:50 | Posted in Statistics & Econometrics

Modern mainstream economics relies to a large degree on the notion of probability. To be amenable to applied economic analysis at all, economic observations have to be conceived as random events that are analyzable within a probabilistic framework. But is it really necessary to model the economic system as a system where randomness can only be analyzed and understood when based on an *a priori* notion of probability?

When attempting to convince us of the necessity of founding empirical economic analysis on probability models, neoclassical economics actually forces us to (implicitly) interpret events as random variables generated by an underlying probability density function.

This is at odds with reality. Randomness obviously is a fact of the real world. Probability, on the other hand, attaches (if at all) to the world via intellectually constructed models, and *a fortiori* is only a fact of a probability generating (nomological) machine or a well constructed experimental arrangement or ‘chance set-up.’

Just as there is no such thing as a ‘free lunch,’ there is no such thing as a ‘free probability.’

To be able to talk about probabilities at all, you have to specify a model. If there is no chance set-up or model that generates the probabilistic outcomes or events — in statistics, any process where you observe or measure is referred to as an *experiment* (rolling a die) and the results obtained as the *outcomes* or *events* of the experiment (the number of points rolled with the die, e.g. 3 or 5) — then, strictly seen, there is no event at all.

Probability is a relational element. It always must come with a specification of the model from which it is calculated. And then to be of any empirical scientific value it has to be *shown* to coincide with (or at least converge to) real data generating processes or structures – something seldom or never done.
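For a genuine chance set-up that model-to-world agreement can actually be demonstrated, which is precisely what makes the contrast with economic data so stark. A minimal sketch (assuming only numpy):

```python
import numpy as np

# A minimal sketch: for a genuine chance set-up (a fair die), relative
# frequencies converge to the model's probabilities (1/6 each).  It is
# this kind of agreement between model and data-generating process that
# is rarely, if ever, demonstrated for economic data.
rng = np.random.default_rng(3)
for n in (100, 10_000, 1_000_000):
    rolls = rng.integers(1, 7, size=n)
    freq_three = np.mean(rolls == 3)
    print(f"n={n:>9,}: relative frequency of a 3 = {freq_three:.4f}  (model: {1/6:.4f})")
```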

And this is the basic problem with economic data. If you have a fair roulette-wheel, you can arguably specify probabilities and probability density distributions. But how do you conceive of the analogous nomological machines for prices, gross domestic product, income distribution, etc.? Only by a leap of faith. And that does not suffice. You have to come up with some really good arguments if you want to persuade people to believe in the existence of socio-economic structures that generate data with characteristics conceivable as stochastic events portrayed by probabilistic density distributions.

We simply have to admit that the socio-economic states of nature that we talk of in most social sciences — and certainly in economics — are not amenable to analysis as probabilities, simply because in real-world open systems there are no probabilities to be had!

The processes that generate socio-economic data in the real world cannot just be assumed to always be adequately captured by a probability measure. And, so, it cannot be maintained that it even should be mandatory to treat observations and data – whether cross-section, time series or panel data – as events generated by some probability model. The important activities of most economic agents do not usually include throwing dice or spinning roulette-wheels. Data generating processes – at least outside of nomological machines like dice and roulette-wheels – are not self-evidently best modelled with probability measures.

If we agree on this, we also have to admit that much of modern neoclassical economics lacks sound foundations.

When economists and econometricians — often uncritically and without argument — simply assume that one can apply probability distributions from statistical theory to their own area of research, they are really skating on thin ice.

This importantly also means that if you cannot show that data satisfies *all* the conditions of the probabilistic nomological machine, then the statistical inferences made in mainstream economics lack sound foundations!

## Living high and feeling low

6 June, 2018 at 11:44 | Posted in Statistics & Econometrics

High-altitude areas — particularly the US intermountain states — have increased rates of suicide and depression, suggests a review of research evidence in the Harvard Review of Psychiatry.

The increased suicide rates might be explained by low blood oxygen levels due to low atmospheric pressure, according to the article by Brent Michael Kious, MD, PhD, of University of Utah, Salt Lake City, and colleagues …

They analyzed 12 studies, most performed in the United States, including population-based data on the relationship between suicide or depression and altitude. While the studies used varying methods, most reported that higher-altitude areas had increased rates of depression and suicide. In general, the correlation was stronger for suicide than for depression.

The highest suicide rates were clustered in the intermountain states: Arizona, Colorado, Idaho, Montana, Nevada, New Mexico, Utah, and Wyoming …

Suicide rates were more strongly associated with altitude than with firearm ownership. Other factors linked to suicide rate included increased poverty rate, lower income, and smaller population ratios of white and divorced women. However, the studies could not account for all factors potentially affecting variations in suicide, such as substance abuse rates and cultural differences.

While more than 80 percent of US suicides occur in low-altitude areas, that’s because most of the population lives near sea level. Adjusted for population distribution, suicide rates per 100,000 population were 17.7 at high altitude, 11.9 at middle altitude, and 4.8 at low altitude.

## “Doctor, it hurts when I p”

3 June, 2018 at 10:44 | Posted in Statistics & Econometrics

A low-powered study is only going to be able to see a pretty big effect. But sometimes you know that the effect, if it exists, is small. In other words, a study that accurately measures the effect … is likely to be rejected as statistically insignificant, while any result that passes the p < .05 test is either a false positive or a true positive that massively overstates the … effect.

…

A conventional boundary, obeyed long enough, can be easily mistaken for an actual thing in the world. Imagine if we talked about the state of the economy this way! Economists have a formal definition of a 'recession,' which depends on arbitrary thresholds just as 'statistical significance' does. One doesn't say, 'I don't care about the unemployment rate, or housing starts, or the aggregate burden of student loans, or the federal deficit; if it's not a recession, we're not going to talk about it.' One would be nuts to say so. The critics — and there are more of them, and they are louder, each year — say that a great deal of scientific practice is nuts in just this way.
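The exaggeration effect described above is easy to demonstrate by simulation. A minimal sketch (all effect sizes hypothetical, assuming numpy and scipy are available): with a small true effect and a small sample, the few estimates that do clear p < .05 overstate the effect several times over, and some even get the sign wrong.

```python
import numpy as np
from scipy import stats

# A minimal sketch (hypothetical numbers): true effect 0.1 sd, n = 30 per
# group, so power is low.  Look at what the "significant" estimates say.
rng = np.random.default_rng(4)
true_effect, n, runs = 0.1, 30, 20_000
significant = []
for _ in range(runs):
    control = rng.normal(0.0, 1.0, size=n)
    treated = rng.normal(true_effect, 1.0, size=n)
    res = stats.ttest_ind(treated, control)
    if res.pvalue < 0.05:
        significant.append(treated.mean() - control.mean())

sig = np.array(significant)
print("true effect                      :", true_effect)
print("share of runs reaching p < .05   :", round(sig.size / runs, 3))
print("mean size of significant effects :", round(float(np.abs(sig).mean()), 2))
print("share with the wrong sign        :", round(float(np.mean(sig < 0)), 2))
```

The filter ‘only report what is significant’ does the damage: conditioning on p < .05 turns an honest but noisy estimator into a systematically exaggerating one.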

If anything, this underlines how important it is not to equate science with statistical calculation. All science entails human judgement, and using statistical models doesn’t relieve us of that necessity. When working with misspecified models, the scientific value of significance testing is actually zero — even though you’re making valid statistical inferences! Statistical models and concomitant significance tests are no substitutes for doing real science. Or as a noted German philosopher once famously wrote:

There is no royal road to science, and only those who do not dread the fatiguing climb of its steep paths have a chance of gaining its luminous summits.

Statistical significance doesn’t say that something is important or true. Since there already are far better and more relevant tests that can be done (see e.g. here and here), it is high time to consider what should be the proper function of what has now really become a statistical fetish. Given that it is anyway very unlikely that any population parameter is exactly zero, and that, contrary to assumption, most samples in social science and economics are not random or do not have the right distributional shape — why continue to press students and researchers to do null hypothesis significance testing, testing that relies on a weird backward logic that students and researchers usually don’t understand?

In its standard form, a significance test is not the kind of “severe test” that we are looking for in our search for being able to confirm or disconfirm empirical scientific hypotheses. This is problematic for many reasons, one being that there is a strong tendency to accept the null hypothesis as long as it can’t be rejected at the standard 5% significance level. In their standard form, significance tests bias against new hypotheses by making it hard to disconfirm the null hypothesis.

As shown over and over again when it is applied, people have a tendency to read “not disconfirmed” as “probably confirmed.” And — most importantly — we should of course never forget that the underlying parameters we use when performing significance tests are *model constructions*. Our p-values mean next to nothing if the model is wrong. As David Freedman writes in *Statistical Models and Causal Inference*:

I believe model validation to be a central issue. Of course, many of my colleagues will be found to disagree. For them, fitting models to data, computing standard errors, and performing significance tests is “informative,” even though the basic statistical assumptions (linearity, independence of errors, etc.) cannot be validated. This position seems indefensible, nor are the consequences trivial. Perhaps it is time to reconsider.

## Why the p-value is a poor substitute for scientific reasoning

24 May, 2018 at 15:07 | Posted in Statistics & Econometrics

A non-trivial part of teaching statistics is made up of teaching students to perform significance testing. A problem I have noticed repeatedly over the years, however, is that no matter how carefully you try to explicate what the probabilities generated by these statistical tests really are, most students still misinterpret them.

This is not to be blamed on students’ ignorance, but rather on significance testing not being particularly transparent (conditional probability inference is difficult even for those of us who teach and practice it). A lot of researchers fall prey to the same mistakes.

If anything, the above video underlines how important it is not to equate science with statistical calculation. All science entails human judgement, and using statistical models doesn’t relieve us of that necessity. When working with misspecified models, the scientific value of significance testing is actually zero — even though you’re making valid statistical inferences! Statistical models and concomitant significance tests are no substitutes for doing real science.

In its standard form, a significance test is not the kind of ‘severe test’ that we are looking for in our search for being able to confirm or disconfirm empirical scientific hypotheses. This is problematic for many reasons, one being that there is a strong tendency to accept the null hypothesis as long as it can’t be rejected at the standard 5% significance level. In their standard form, significance tests bias against new hypotheses by making it hard to disconfirm the null hypothesis.

And as shown over and over again when it is applied, people have a tendency to read “not disconfirmed” as “probably confirmed.” Standard scientific methodology tells us that when there is only, say, a 10% probability that pure sampling error could account for the observed difference between the data and the null hypothesis, it would be more “reasonable” to conclude that we have a case of disconfirmation. Especially if we perform many independent tests of our hypothesis and they all give the same 10% result as our reported one, I guess most researchers would count the hypothesis as even more disconfirmed.

Most importantly — we should never forget that the underlying parameters we use when performing significance tests are *model constructions*. Our p-values mean next to nothing if the model is wrong. Statistical significance tests DO NOT validate models!

In journal articles a typical regression equation will have an intercept and several explanatory variables. The regression output will usually include an F-test, with p-1 degrees of freedom in the numerator and n-p in the denominator. The null hypothesis will not be stated. The missing null hypothesis is that all the coefficients vanish, except the intercept.

If F is significant, that is often thought to validate the model. Mistake. The F-test takes the model as given. Significance only means this:

*if* the model is right *and* the coefficients are 0, it is very unlikely to get such a big F-statistic. Logically, there are three possibilities on the table:

i) An unlikely event occurred.

ii) Or the model is right and some of the coefficients differ from 0.

iii) Or the model is wrong.

So?
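Freedman’s point is easily illustrated. In the sketch below (hypothetical data, assuming numpy and scipy are available), the outcome depends on the regressor only through its square, yet the conventional F-test on a linear-in-x model is overwhelmingly ‘significant’ — the test takes the model as given and cannot tell us that the specification is wrong.

```python
import numpy as np
from scipy import stats

# A minimal sketch (hypothetical data): y depends on x only through x**2,
# yet a regression that is linear in x still produces an overwhelmingly
# "significant" F-statistic.  The F-test takes the model as given; it
# cannot tell us that the specification itself is wrong.
rng = np.random.default_rng(5)
n = 200
x = rng.uniform(1, 5, size=n)
y = x ** 2 + rng.normal(0, 1.0, size=n)        # true relation is nonlinear

X = np.column_stack([np.ones(n), x])           # misspecified: intercept + x
beta, rss, *_ = np.linalg.lstsq(X, y, rcond=None)
rss = float(rss[0])
tss = float(np.sum((y - y.mean()) ** 2))
p = X.shape[1]
F = ((tss - rss) / (p - 1)) / (rss / (n - p))
p_value = stats.f.sf(F, p - 1, n - p)
print(f"F = {F:.0f}, p-value = {p_value:.2e} -- and yet the model is wrong")
```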

## Regression to the mean

23 May, 2018 at 08:52 | Posted in Statistics & Econometrics

Regression to the mean is nothing but the universal truth of the fact that whenever we have an imperfect correlation between two scores, we will have regression to the mean.
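A minimal simulation sketch (hypothetical test scores, assuming only numpy) shows the phenomenon: when two scores are imperfectly correlated, the group that scores highest on the first test scores, on average, markedly closer to the mean on the second — with no causal mechanism doing any work.

```python
import numpy as np

# A minimal sketch (hypothetical scores): test1 and test2 share a common
# "ability" component but are imperfectly correlated.  Students in the
# top decile on test1 score, on average, closer to the mean on test2 --
# regression to the mean, with no causal mechanism involved.
rng = np.random.default_rng(6)
n = 100_000
ability = rng.normal(size=n)
test1 = ability + rng.normal(size=n)           # corr(test1, test2) ~ 0.5
test2 = ability + rng.normal(size=n)

top = test1 >= np.quantile(test1, 0.9)         # top decile on the first test
print("corr(test1, test2)          :", round(float(np.corrcoef(test1, test2)[0, 1]), 2))
print("top group's mean on test 1  :", round(float(test1[top].mean()), 2))
print("same group's mean on test 2 :", round(float(test2[top].mean()), 2))
```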
