Why the idea of causation cannot be a purely statistical one

23 Jun, 2021 at 15:32 | Posted in Statistics & Econometrics | 6 Comments

If contributions made by statisticians to the understanding of causation are to be taken over with advantage in any specific field of inquiry, then what is crucial is that the right relationship should exist between statistical and subject-matter concerns …

Where the ultimate aim of research is not prediction per se but rather causal explanation, an idea of causation that is expressed in terms of predictive power — as, for example, ‘Granger’ causation — is likely to be found wanting. Causal explanations cannot be arrived at through statistical methodology alone: a subject-matter input is also required in the form of background knowledge and, crucially, theory …

Likewise, the idea of causation as consequential manipulation is apt to research that can be undertaken primarily through experimental methods and, especially to ‘practical science’ where the central concern is indeed with ‘the consequences of performing particular acts’. The development of this idea in the context of medical and agricultural research is as understandable as the development of that of causation as robust dependence within applied econometrics. However, the extension of the manipulative approach into sociology would not appear promising, other than in rather special circumstances … The more fundamental difficulty is that, under the — highly anthropocentric — principle of ‘no causation without manipulation’, the recognition that can be given to the action of individuals as having causal force is in fact peculiarly limited.

John H. Goldthorpe

Causality in social sciences — and economics — can never solely be a question of statistical inference. Statistics and data often serve to suggest causal accounts, but causality entails more than predictability, and to really explain social phenomena in depth requires theory. Analysis of variation — the foundation of all econometrics — can never in itself reveal how these variations are brought about. Only when we are able to tie actions, processes or structures to the statistical relations detected can we say that we are getting at relevant explanations of causation.
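The point that analysis of variation cannot by itself reveal how variations are brought about can be illustrated with a small simulation. This is only a sketch under assumptions that are not in the post: a hypothetical unobserved factor `z` drives both `x` and `y`, and `x` has no effect on `y` at all. The names `ols_slope` and `simulate`, the coefficients, and the sample size are all illustrative choices.

```python
import random


def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def simulate(n=20000, seed=0):
    """Return (observational slope, slope when x is set independently of z)."""
    rng = random.Random(seed)
    # Assumed data-generating process: z is a common cause of x and y.
    z = [rng.gauss(0, 1) for _ in range(n)]
    x_obs = [zi + rng.gauss(0, 1) for zi in z]
    y_obs = [2 * zi + rng.gauss(0, 1) for zi in z]  # y never depends on x

    # Hypothetical intervention: x is assigned at random, cutting its tie to z.
    x_set = [rng.gauss(0, 1) for _ in range(n)]
    y_set = [2 * zi + rng.gauss(0, 1) for zi in z]

    return ols_slope(x_obs, y_obs), ols_slope(x_set, y_set)


if __name__ == "__main__":
    observed, intervened = simulate()
    print(f"slope in observational data: {observed:.2f}")
    print(f"slope when x is set at random: {intervened:.2f}")
```

In the observational data the regression slope comes out close to one, even though changing `x` would do nothing to `y`; when `x` is assigned at random the slope is close to zero. Nothing in the observational numbers distinguishes the two situations. Only knowledge of how the data were generated, which here is simply assumed, tells us which reading is right.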

Most facts have many different, possible, alternative explanations, but we want to find the best of all contrastive (since all real explanation takes place relative to a set of alternatives) explanations. So which is the best explanation? Many scientists, influenced by statistical reasoning, think that the likeliest explanation is the best explanation. But the likelihood of x is not in itself a strong argument for thinking it explains y. I would rather argue that what makes one explanation better than another are things like aiming for and finding powerful, deep, causal features and mechanisms that we have warranted and justified reasons to believe in. Statistical reasoning — especially the variety based on a Bayesian epistemology — generally has no room for these kinds of explanatory considerations. The only thing that matters is the probabilistic relation between evidence and hypothesis. That is also one of the main reasons I find abduction — inference to the best explanation — a better description and account of what constitutes actual scientific reasoning and inference.

In the social sciences … regression is used to discover relationships or to disentangle cause and effect. However, investigators have only vague ideas as to the relevant variables and their causal order; functional forms are chosen on the basis of convenience or familiarity; serious problems of measurement are often encountered.

Regression may offer useful ways of summarizing the data and making predictions. Investigators may be able to use summaries and predictions to draw substantive conclusions. However, I see no cases in which regression equations, let alone the more complex methods, have succeeded as engines for discovering causal relationships.

David Freedman

Some statisticians and data scientists think that algorithmic formalisms somehow give them access to causality. That is, however, simply not true. Assuming ‘convenient’ things like faithfulness or stability is not to give proofs. It’s to assume what has to be proven. Deductive-axiomatic methods used in statistics do not produce evidence for causal inferences. The real causality we are searching for is the one existing in the real world around us. If there is no warranted connection between axiomatically derived theorems and the real world, well, then we haven’t really obtained the causation we are looking for.


  1. Most of the ideas in this post were anticipated 70 years ago by Bertrand Russell in his book “Human Knowledge: Its Scope and Limits”, published in 1948 (part 6, chapters 9 & 10).
    Russell argued that:
    “Induction used without common sense leads more often to false conclusions than to true ones”.
    He tentatively suggested that “crude induction” could be replaced by five “premises of non-deductive inference” in order to “substitute something more precise and more effective”.
    Russell explained the origins of inductive inference:
    “Knowledge of connections between facts has its biological origin in animal expectations…It is biologically advantageous to have such expectations as will usually be verified; it is therefore not surprising if the psychological laws governing expectations are, in the main, in conformity with the objective laws governing expected occurrences.”

    For example: “The inference from smell to edibility is usually reliable…
    I think, therefore, that we may be said to “know” what is
    necessary for scientific inference…

    “As mankind have advanced in intelligence, their inferential habits
    have come gradually nearer to agreement with the laws of nature
    which have made these habits, throughout, more often a source
    of true expectations than of false ones. The forming of inferential
    habits which lead to true expectations is part of the adaptation to
    the environment upon which biological survival depends.”
    Russell maintained that all this was consistent with “strict
    adherence to a doctrine by which empiricist philosophy has been
    inspired: that all human knowledge is uncertain, inexact, and
    partial. To this doctrine we have not found any limitation

    • Doesn’t the problem of induction mean that tomorrow you can encounter smelly edible foods that confound your just-so story of biologically advantageous objective laws?

      • There are indeed exceptions to most general rules. For example, a few edible foods are considered to have a bad smell by many people, e.g. Durian fruit.
        However, as quoted above, Russell wrote that “The inference from smell to edibility is USUALLY reliable…”.
        He did NOT say that inductive inferences were always or 100% reliable.
        Russell emphasized: “all human knowledge is uncertain, inexact, and partial”.
        Exceptions do not disprove rules. All inductive inference is probabilistic: “The fact that things often fail to fulfil our expectations is no evidence that our expectations will not probably be fulfilled in a given case or a given class of cases” – Russell 1912.

        • “The fact that things often fail to fulfil our expectations is no evidence that our expectations will not probably be fulfilled in a given case or a given class of cases” – Russell 1912.
          Didn’t stock returns fail to fulfill expectations in early 2020, but the Fed acted by printing money liberally to fulfill the expectations anyway? Despite predictions that the Fed just didn’t have the money to repeat another 2008-style expansion?
          Can’t we guarantee financial expectations through central bank money printing? Isn’t that what Russell’s philosophy inexorably leads us to conclude?

          • Doesn’t the Fed expect that propping up Wall Street saves the real economy, because it knows that finance is the dog and the real economy nothing but its tail?
            Didn’t I have many arguments with mainstream economists before 2020 who said the Fed could not, in fact, print trillions again? But the expectations of Wall Street that they would were fulfilled, anyway?

        • Robert,
          “but the Fed acted by printing money liberally to fulfill the expectations anyway?”
          They printed money to keep the economy going, not to prop up Wall St.
          “the Fed just didn’t have the money to repeat another 2008-style expansion?”
          The Fed could print trillions if it wanted to.
