On the limits of ‘mediation analysis’ and ‘statistical causality’

23 Jun, 2018 at 23:18 | Posted in Statistics & Econometrics | 5 Comments

“Mediation analysis” is this thing where you have a treatment and an outcome and you’re trying to model how the treatment works: how much does it directly affect the outcome, and how much is the effect “mediated” through intermediate variables …

In the real world, it’s my impression that almost all the mediation analyses that people actually fit in the social and medical sciences are misguided: lots of examples where the assumptions aren’t clear and where, in any case, coefficient estimates are hopelessly noisy and where confused people will over-interpret statistical significance …

More and more I’ve been coming to the conclusion that the standard causal inference paradigm is broken … So how to do it? I don’t think traditional path analysis or other multivariate methods of the throw-all-the-data-in-the-blender-and-let-God-sort-em-out variety will do the job. Instead we need some structure and some prior information.

Andrew Gelman
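The decomposition Gelman alludes to can be made concrete with a small simulation (all coefficients and the seed below are invented for illustration): when the mediation model's assumptions actually hold, the ‘indirect’ effect is the product of the treatment→mediator and mediator→outcome coefficients. The point of the quote, of course, is that in real data those assumptions rarely hold, and the product of two noisy estimates is noisier still.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical data-generating process: treatment x affects mediator m
# (path a), and both affect outcome y (paths b and c').
a_true, b_true, c_true = 0.5, 0.8, 0.3
x = rng.normal(size=n)
m = a_true * x + rng.normal(size=n)
y = c_true * x + b_true * m + rng.normal(size=n)

# Path a: regress m on x (with intercept).
a_hat = np.linalg.lstsq(np.column_stack([x, np.ones(n)]), m, rcond=None)[0][0]
# Paths c' and b: regress y on x and m jointly.
coefs = np.linalg.lstsq(np.column_stack([x, m, np.ones(n)]), y, rcond=None)[0]
c_hat, b_hat = coefs[0], coefs[1]

indirect = a_hat * b_hat   # effect mediated through m (true value: 0.4)
total = c_hat + indirect   # direct plus indirect effect (true value: 0.7)
```

Even in this best case, with the model exactly right and ten thousand observations, `indirect` inherits noise from two estimated coefficients at once — which is why, with real sample sizes and misspecified models, such estimates are so easily over-interpreted.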

Causality in the social sciences — and economics — can never be solely a question of statistical inference. Causality entails more than predictability, and really explaining social phenomena in depth requires theory. Analysis of variation — the foundation of all econometrics — can never in itself reveal how these variations are brought about. Only when we are able to tie actions, processes or structures to the statistical relations detected can we say that we are getting at relevant explanations of causation.

Most facts have many different possible explanations, but we want to find the best of all contrastive explanations (since all real explanation takes place relative to a set of alternatives). So which is the best explanation? Many scientists, influenced by statistical reasoning, think that the likeliest explanation is the best explanation. But the likelihood of x is not in itself a strong argument for thinking it explains y. I would rather argue that what makes one explanation better than another are things like aiming for and finding powerful, deep, causal features and mechanisms that we have warranted and justified reasons to believe in. Statistical reasoning — especially the variety based on a Bayesian epistemology — generally has no room for these kinds of explanatory considerations. The only thing that matters is the probabilistic relation between evidence and hypothesis. That is also one of the main reasons I find abduction — inference to the best explanation — a better description and account of what constitutes actual scientific reasoning and inference.

In the social sciences … regression is used to discover relationships or to disentangle cause and effect. However, investigators have only vague ideas as to the relevant variables and their causal order; functional forms are chosen on the basis of convenience or familiarity; serious problems of measurement are often encountered.

Regression may offer useful ways of summarizing the data and making predictions. Investigators may be able to use summaries and predictions to draw substantive conclusions. However, I see no cases in which regression equations, let alone the more complex methods, have succeeded as engines for discovering causal relationships.

David Freedman

Some statisticians and data scientists think that algorithmic formalisms somehow give them access to causality. That is, however, simply not true. Assuming ‘convenient’ things like faithfulness or stability is not to give proofs. It’s to assume what has to be proven. Deductive-axiomatic methods used in statistics do not produce evidence for causal inferences. The real causality we are searching for is the one existing in the real world around us. If there is no warranted connection between axiomatically derived theorems and the real world, well, then we haven’t really obtained the causation we are looking for.

If contributions made by statisticians to the understanding of causation are to be taken over with advantage in any specific field of inquiry, then what is crucial is that the right relationship should exist between statistical and subject-matter concerns …
The idea of causation as consequential manipulation is apt to research that can be undertaken primarily through experimental methods and, especially to ‘practical science’ where the central concern is indeed with ‘the consequences of performing particular acts’. The development of this idea in the context of medical and agricultural research is as understandable as the development of that of causation as robust dependence within applied econometrics. However, the extension of the manipulative approach into sociology would not appear promising, other than in rather special circumstances … The more fundamental difficulty is that under the — highly anthropocentric — principle of ‘no causation without manipulation’, the recognition that can be given to the action of individuals as having causal force is in fact peculiarly limited.

John H. Goldthorpe


  1. Mathematics is sometimes implicated in the mistakes of financiers and policy-makers, and often held to be impotent on issues of real importance. But it seems to me that mathematics as such has some important implications, if only we could find a way to explain them.

    Here – https://djmarsay.wordpress.com/notes/puzzles/petty-theft/ – is an attempt to give an example of why we have to be careful about interpreting causality claims that doesn’t involve any sophisticated (or otherwise) mathematics, statistics or economics. Does it help clarify the issues?

  2. “I would rather argue that what makes one explanation better than another are things like aiming for and finding powerful, deep, causal, features and mechanisms that we have warranted and justified reasons to believe in. Statistical — especially the variety based on a Bayesian epistemology — reasoning generally has no room for these kinds of explanatory considerations.”

    I agree with you about warranted and justified beliefs about causal features and mechanisms, but it seems to me that it is Bayesian probability that has the promise of making room for them in terms of prior distributions. OC, I am heavily influenced by Keynes. And I think that sometimes the evidence requires not just updating a probability distribution, but discarding it entirely.

    • Min, Re your final sentence: Can you give compelling examples of where Bayesian updating is not adequate? Keynes had many attempts and I attempt some more in my blog (as above), but alas they seem ‘water off a duck’s back’.

      • Consider the question of whether there is a human more than 90 feet tall. (OC, there isn’t, but suppose that we were going to test the hypothesis that there is.) We measure the height of “randomly” chosen people. (Bayesians do not require randomness, but it has its points.) Every person whose height we measure is less than 90 feet tall. We update our prior distribution accordingly against the hypothesis. Then, lo and behold, one man is 89 feet tall. (!) How do we update the distribution? Doesn’t it make sense to think that the existence of an 89 foot tall man is evidence in favor of a 90 foot tall man?

        OC, we all know about black swans. 🙂 Some years ago I was pondering Hempel’s Raven again. Are all ravens black? Even not counting albino ravens or ravens painted orange, I decided, with no new evidence about ravens at all, but based merely upon the theory of evolution, that non-black ravens were very likely to exist or to have existed. A brief internet search revealed that I was right. What I had done was to throw away my probability distribution with no evidentiary updating at all. Was I just lucky?

        • Sorry for the tardy response.

          Bayesians do not all claim that anyone’s updates will actually be consistent with their priors, only that they ‘should’ be. This is perfectly correct as applied to idealised sampling problems, which yours seem to be. (This is almost a tautology: an idealised sampling is one which satisfies Bayesian theory!) The issues that I have arise when there is no particular reason to think idealised sampling at all realistic.

          I see no difficulty in your height example. According to Bayesian dogma your priors are inconsistent with your updating. So what? I’m less clear about your ravens, but as long as your priors respect Cromwell’s law, I don’t think that any Bayesians would find this example convincing either. I have some examples on my blog (djmarsay.wordpress.com). I’d appreciate any comments or suggestions.
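The role Cromwell’s law plays in the exchange above can be sketched numerically (the likelihood ratio below is a made-up number, purely for illustration): if a prior assigns exactly zero probability to a hypothesis, no finite amount of evidence — not even an 89-foot man — can move the posterior off zero, whereas any strictly positive prior can be updated.

```python
def bayes_update(prior, lr):
    """Posterior P(H|E) from prior P(H) and likelihood ratio
    lr = P(E|H) / P(E|not H), via the odds form of Bayes' theorem."""
    if prior == 0.0:
        return 0.0  # dogmatic zero prior: posterior stays zero for any finite lr
    prior_odds = prior / (1.0 - prior)
    post_odds = prior_odds * lr
    return post_odds / (1.0 + post_odds)

# Hypothesis H: "a human taller than 90 feet exists".
# Suppose (hypothetically) that observing an 89-foot man is 1000 times
# likelier if H is true than if it is false.
lr = 1000.0

p_zero = bayes_update(0.0, lr)    # zero prior: unmoved at 0.0
p_open = bayes_update(1e-6, lr)   # Cromwell-respecting prior: rises to ~1e-3
```

This is why a prior that respects Cromwell’s law — reserving some probability, however small, for the surprising hypothesis — can absorb the 89-foot man without having to be discarded, while a dogmatic prior can only be thrown away.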
