## Logistic regression (student stuff)

27 Feb, 2021 at 16:02 | Posted in Statistics & Econometrics | Leave a comment.

In the video below (in Swedish) yours truly shows how to perform a logit regression using Gretl.

## The leap of generalization

24 Feb, 2021 at 16:45 | Posted in Statistics & Econometrics | Leave a commentStatistician Andrew Gelman has an interesting blogpost up on what inference in science really means:

I like Don Rubin’s take on this, which is that if you want to go from association to causation, state very clearly what the assumptions are for this step to work. The clear statement of these assumptions can be helpful in moving forward …

Another way to say this is that all inference is about generalizing from sample to population, to predicting the outcomes of hypothetical interventions on new cases. You can’t escape the leap of generalization. Even a perfectly clean randomized experiment is typically of interest only to the extent that it generalizes to new people not included in the original study.

I agree — but that’s also why we so often fail (even when having the best intentions) when it comes to making generalizations in social sciences.

What strikes me again and again when taking part of the results of randomized experiments is that they really are very similar to theoretical models. They all have the same basic problem — they are built on rather artificial conditions and have difficulties with the ‘trade-off’ between internal and external validity. The more artificial conditions, the more internal validity, but also less external validity. The more we rig experiments/models to avoid the ‘confounding factors,’ the less the conditions are reminiscent of the real ‘target system.’ The nodal issue is basically about how scientists using different isolation strategies in different ‘nomological machines’ attempt to learn about causal relationships. I doubt the generalizability of the (randomized or not) experiment strategy because the probability is high that causal mechanisms are different in different contexts and that lack of homogeneity/stability/invariance don’t give us warranted export licenses to the ‘real’ societies.

Evidence-based theories and policies are highly valued nowadays. Randomization is supposed to best control for bias from unknown confounders. The received opinion — including Rubin and Gelman — is that evidence based on randomized experiments therefore is the best.

More and more economists have also lately come to advocate randomization as the principal method for ensuring being able to make valid causal inferences. Especially when it comes to questions of causality, randomization is nowadays considered some kind of “gold standard”. Everything has to be evidence-based, and the evidence has to come from randomized experiments.

But just as econometrics, randomization is basically a deductive method. Given the assumptions (such as manipulability, transitivity, Reichenbach probability principles, separability, additivity, linearity, etc., etc.) these methods deliver deductive inferences. The problem, of course, is that we will never completely know when the assumptions are right. [And although randomization may contribute to controlling for confounding, it does not guarantee it, since genuine ramdomness presupposes infinite experimentation and we know all real experimentation is finite. And even if randomization may help to establish average causal effects, it says nothing of individual effects unless homogeneity is added to the list of assumptions.] Real target systems are seldom epistemically isomorphic to our axiomatic-deductive models/systems, and even if they were, we still have to argue for the external validity of the conclusions reached from within these epistemically convenient models/systems. Causal evidence generated by randomization procedures may be valid in ‘closed’ models, but what we usually are interested in, is causal evidence in the real target system we happen to live in.

Many advocates of randomization want to have deductively automated answers to fundamental causal questions. But to apply ‘thin’ methods we have to have ‘thick’ background knowledge of what’s going on in the real world, and not in (ideally controlled) experiments. Conclusions can only be as certain as their premises — and that also goes for methods based on randomized experiments.

So yours truly agrees with Gelman that “all inference is about generalizing from sample to population.” But I don’t think randomized experiments — ideal or not — take us very far on that road. Randomized experiments in social sciences are far from being the ‘gold standard’ they so often are depicted as.

## Keynes on the methodology of econometrics

21 Feb, 2021 at 20:39 | Posted in Statistics & Econometrics | Leave a commentThere is first of all the central question of methodology — the logic of applying the method of multiple correlation to unanalysed economic material, which we know to be non-homogeneous through time. If we are dealing with the action of numerically measurable, independent forces, adequately analysed so that we were dealing with independent atomic factors and between them completely comprehensive, acting with fluctuating relative strength on material constant and homogeneous through time, we might be able to use the method of multiple correlation with some confidence for disentangling the laws of their action … In fact we know that every one of these conditions is far from being satisfied by the economic material under investigation.

Letter from John Maynard Keynes to Royall Tyler (1938)

## On the difference between econometrics and data science

20 Jan, 2021 at 22:42 | Posted in Statistics & Econometrics | 1 Comment.

Causality in social sciences can never solely be a question of statistical inference. Causality entails more than predictability, and to really in depth explain social phenomena require theory. The analysis of variation can never in itself reveal how these variations are brought about. First when we are able to tie actions, processes or structures to the statistical relations detected, can we say that we are getting at relevant explanations of causation.

Most facts have many different, possible, alternative explanations, but we want to find the best of all contrastive (since all real explanation takes place relative to a set of alternatives) explanations. So which is the best explanation? Many scientists, influenced by statistical reasoning, think that the likeliest explanation is the best explanation. But the likelihood of x is not in itself a strong argument for thinking it explains y. I would rather argue that what makes one explanation better than another are things like aiming for and finding powerful, deep, causal, features and mechanisms that we have warranted and justified reasons to believe in. Statistical — especially the variety based on a Bayesian epistemology — reasoning generally has no room for these kinds of explanatory considerations. The only thing that matters is the probabilistic relation between evidence and hypothesis. That is also one of the main reasons I find abduction — inference to the best explanation — a better description and account of what constitute actual scientific reasoning and inferences.

Some statisticians and data scientists think that algorithmic formalisms somehow give them access to causality. That is simply not true. Assuming ‘convenient’ things like faithfulness or stability is not to give proofs. It’s to assume what has to be proven. Deductive-axiomatic methods used in statistics do no produce evidence for causal inferences. The real causality we are searching for is the one existing in the real world around us. If there is no warranted connection between axiomatically derived theorems and the real-world, well, then we haven’t really obtained the causation we are looking for.

## Fooled by randomness

13 Jan, 2021 at 18:08 | Posted in Statistics & Econometrics | 6 CommentsA non-trivial part of teaching statistics to social science students is made up of teaching them to perform significance testing. A problem yours truly has noticed repeatedly over the years, however, is that no matter how careful you try to be in explicating what the probabilities generated by these statistical tests — p-values — really are, still most students misinterpret them.

A couple of years ago I gave a statistics course for the *Swedish National Research School in History*, and at the exam I asked the students to explain how one should correctly interpret p-values. Although the correct definition is p(data|null hypothesis), a majority of the students either misinterpreted the p-value as being the likelihood of a sampling error (which of course is wrong, since the very computation of the p-value is based on the assumption that sampling errors are what causes the sample statistics not coinciding with the null hypothesis) or that the p-value is the probability of the null hypothesis being true, given the data (which of course also is wrong, since it is p(null hypothesis|data) rather than the correct p(data|null hypothesis)).

This is not to blame on students’ ignorance, but rather on significance testing not being particularly transparent (conditional probability inference is difficult even to those of us who teach and practice it). A lot of researchers fall pray to the same mistakes. So – given that it anyway is very unlikely than any population parameter is exactly zero, and that contrary to assumption most samples in social science and economics are not random or having the right distributional shape – why continue to press students and researchers to do null hypothesis significance testing, testing that relies on weird backward logic that students and researchers usually don’t understand?

Let me just give a simple example to illustrate how slippery it is to deal with p-values – and how easy it is to impute causality to things that really are nothing but chance occurrences.

Say you have collected cross-country data on austerity policies and growth (and let’s assume that you have been able to “control” for possible confounders). You find that countries that have implemented austerity policies have on average increased their growth by say 2% more than the other countries. To really feel sure about the efficacy of the austerity policies you run a significance test – thereby actually assuming without argument that all the values you have come from the same probability distribution – and you get a p-value of less than 0.05. Heureka! You’ve got a statistically significant value. The probability is less than 1/20 that you got this value out of pure stochastic randomness.

But wait a minute. There is – as you may have guessed – a snag. If you test austerity policies in enough many countries you will get a statistically ‘significant’ result out of pure chance 5% of the time. So, really, there is nothing to get so excited about!

Statistical significance doesn’t say that something is important or true. And since there already are far better and more relevant testing that can be done (see e. g. here and here), it is high time to give up on this statistical fetish and not continue to be fooled by randomness.

## Big data truthiness

9 Jan, 2021 at 23:10 | Posted in Statistics & Econometrics | Comments Off on Big data truthinessAll of these examples exhibit the confusion that often accompanies the drawing of causal conclusions from observational data. The likelihood of such confusion is not diminished by increasing the amount of data, although the publicity given to ‘big data’ would have us believe so. Obviously the flawed causal connection between drowning and eating ice cream does not diminish if we increase the number of cases from a few dozen to a few million. The amateur carpenter’s complaint that ‘this board is too short, and even though I’ve cut it four more times, it is still too short,’ seems eerily appropriate.

## Econometrics and the challenge of regression specification

7 Jan, 2021 at 14:28 | Posted in Statistics & Econometrics | 1 CommentMost work in econometrics and regression analysis is — still — made on the assumption that the researcher has a theoretical model that is ‘true.’ Based on this belief of having a correct specification for an econometric model or running a regression, one proceeds as if the only problem remaining to solve have to do with measurement and observation.

When things sound too good to be true, they usually aren’t. And that goes for econometric wet dreams too. The snag is, of course, that there is pretty little to support the perfect specification assumption. Looking around in social science and economics we don’t find a single regression or econometric model that lives up to the standards set by the ‘true’ theoretical model — and there is pretty little that gives us reason to believe things will be different in the future.

To think that we are being able to construct a model where all relevant variables are included and correctly specify the functional relationships that exist between them, is not only a belief without support, but a belief *impossible* to support.

The theories we work with when building our econometric regression models are insufficient. No matter what we study, there are always some variables missing, and we don’t know the correct way to functionally specify the relationships between the variables.

*Every* regression model constructed is misspecified. There are always an endless list of possible variables to include, and endless possible ways to specify the relationships between them. So every applied econometrician comes up with his own specification and ‘parameter’ estimates. The econometric Holy Grail of consistent and stable parameter-values is nothing but a dream.

In order to draw inferences from data as described by econometric texts, it is necessary to make whimsical assumptions. The professional audience consequently and properly withholds belief until an inference is shown to be adequately insensitive to the choice of assumptions. The haphazard way we individually and collectively study the fragility of inferences leaves most of us unconvinced that any inference is believable. If we are to make effective use of our scarce data resource, it is therefore important that we study fragility in a much more systematic way. If it turns out that almost all inferences from economic data are fragile, I suppose we shall have to revert to our old methods …

A rigorous application of econometric methods in economics really presupposes that the phenomena of our real-world economies are ruled by stable causal relations between variables. Parameter-values estimated in specific spatio-temporal contexts are *presupposed* to be exportable to totally different contexts. To warrant this assumption one, however, has convincingly to establish that the targeted acting causes are stable and invariant so that they maintain their parametric status after the bridging. The endemic lack of predictive success of the econometric project indicates that this hope of finding fixed parameters is a hope for which there really is no other ground than hope itself.

## How scientists manipulate research

27 Dec, 2020 at 13:22 | Posted in Statistics & Econometrics | 1 Comment

*All* science entails human judgment, and using statistical models doesn’t relieve us of that necessity. Working with misspecified models, the scientific value of significance testing is actually zero — even though you’re making valid statistical inferences! Statistical models and concomitant significance tests are no substitutes for doing real science.

In its standard form, a significance test is not the kind of ‘severe test’ that we are looking for in our search for being able to confirm or disconfirm empirical scientific hypotheses. This is problematic for many reasons, one being that there is a strong tendency to accept the null hypothesis since they can’t be rejected at the standard 5% significance level. In their standard form, significance tests bias against new hypotheses by making it hard to disconfirm the null hypothesis.

And as shown over and over again when it is applied, people have a tendency to read “not disconfirmed” as “probably confirmed.” Standard scientific methodology tells us that when there is only say a 10 % probability that pure sampling error could account for the observed difference between the data and the null hypothesis, it would be more “reasonable” to conclude that we have a case of disconfirmation. Especially if we perform many independent tests of our hypothesis and they all give the same 10% result as our reported one, I guess most researchers would count the hypothesis as even more disconfirmed.

Statistics is no substitute for thinking. We should never forget that the underlying parameters we use when performing significance tests are model constructions. Our p-values mean next to nothing if the model is wrong. Statistical significance tests do not validate models!

In many social sciences, p-values and null hypothesis significance testing (NHST) is often used to draw far-reaching scientific conclusions — despite the fact that they are as a rule poorly understood and that there exist alternatives that are easier to understand and more informative.

Not the least using confidence intervals (CIs) and effect sizes are to be preferred to the Neyman-Pearson-Fisher mishmash approach that is so often practiced by applied researchers.

Running a Monte Carlo simulation with 100 replications of a fictitious sample having N = 20, confidence intervals of 95%, a normally distributed population with a mean = 10 and a standard deviation of 20, taking two-tailed p-values on a zero null hypothesis, we get varying CIs (since they are based on varying sample standard deviations), but with a minimum of 3.2 and a maximum of 26.1, we still get a clear picture of what would happen in an infinite limit sequence. On the other hand p-values (even though from a purely mathematical-statistical sense more or less equivalent to CIs) vary strongly from sample to sample, and jumping around between a minimum of 0.007 and a maximum of 0.999 doesn’t give you a clue of what will happen in an infinite limit sequence!

## Econometrics — the art of pulling a rabbit out of a hat

22 Dec, 2020 at 17:57 | Posted in Statistics & Econometrics | 5 CommentsIn econometrics one often gets the feeling that many of its practitioners think of it as a kind of automatic inferential machine: input data and out comes causal knowledge. This is — as Joan Robinson once had it — like pulling a rabbit from a hat. Great — but first you have to put the rabbit in the hat. And this is where assumptions come in to the picture.

The assumption of imaginary ‘superpopulations’ is one of the many dubious assumptions used in modern econometrics, and as Clint Ballinger highlights, this is a particularly questionable rabbit pulling assumption:

Inferential statistics are based on taking a random sample from a larger population … and attempting to draw conclusions about a) the larger population from that data and b) the probability that the relations between measured variables are consistent or are artifacts of the sampling procedure.

However, in political science, economics, development studies and related fields the data often represents as complete an amount of data as can be measured from the real world (an ‘apparent population’). It is not the result of a random sampling from a larger population. Nevertheless, social scientists treat such data as the result of random sampling.

Because there is no source of further cases a fiction is propagated—the data is treated as if it were from a larger population, a ‘superpopulation’ where repeated realizations of the data are imagined. Imagine there could be more worlds with more cases and the problem is fixed …

What ‘draw’ from this imaginary superpopulation does the real-world set of cases we have in hand represent? This is simply an unanswerable question. The current set of cases could be representative of the superpopulation, and it could be an extremely unrepresentative sample, a one in a million chance selection from it …

The problem is not one of statistics that need to be fixed. Rather, it is a problem of the misapplication of inferential statistics to non-inferential situations.

## Fuzzy RDD & IV (student stuff)

16 Dec, 2020 at 10:04 | Posted in Statistics & Econometrics | Comments Off on Fuzzy RDD & IV (student stuff).

## COVID19 and causal inference (wonkish)

14 Dec, 2020 at 19:06 | Posted in Statistics & Econometrics | 1 Comment.

## Natural experiments in the social sciences

5 Dec, 2020 at 14:46 | Posted in Statistics & Econometrics | Comments Off on Natural experiments in the social sciencesHow, then, can social scientists best make inferences about causal effects? One option is true experimentation … Random assignment ensures that any differences in outcomes between the groups are due either to chance error or to the causal effect … If the experiment were to be repeated over and over, the groups would not differ, on average, in the values of potential confounders. Thus, the average of the average difference of group outcomes, across these many experiments, would equal the true difference in outcomes … The key point is that randomization is powerful because it obviates confounding …

Thad Dunning’s book is a very useful guide for social scientists interested in research methodology in general and natural experiments in specific. Dunning argues that since random or as-if random assignment in natural experiments obviates the need for controlling potential confounders, this kind of “simple and transparent” design-based research method is preferable to more traditional multivariate regression analysis where the controlling only comes in* ex post* via statistical modelling.

But — there is always a but …

The point of making a randomized experiment is often said to be that it ‘ensures’ that any correlation between a supposed cause and effect indicates a causal relation. This is believed to hold since randomization (allegedly) ensures that a supposed causal variable does not correlate with other variables that may influence the effect.

The problem with that simplistic view on randomization is that the claims made are exaggerated and sometimes even false.

Since most real-world experiments and trials build on performing a finite amount of randomization, what *would* happen *if* you kept on randomizing forever, does not help you to ‘ensure’ or ‘guarantee’ that you do not make false causal conclusions in the one particular randomized experiment you actually do perform. It is indeed difficult to see why thinking about what you know you will never do, would make you happy about what you actually do.

In econometrics one often gets the feeling that many of its practitioners think of it as a kind of automatic inferential machine: input data and out comes causal knowledge. This is like pulling a rabbit from a hat. Great — but first you have to put the rabbit in the hat. And this is where assumptions come into the picture.

The assumption of imaginary ‘super populations’ is one of the many dubious assumptions used in modern econometrics.

As social scientists — and economists — we have to confront the all-important question of how to handle uncertainty and randomness. Should we define randomness with probability? If we do, we have to accept that to speak of randomness we also have to presuppose the existence of nomological probability machines, since probabilities cannot be spoken of — and actually, to be strict, do not at all exist — without specifying such system-contexts. Accepting a domain of probability theory and sample space of infinite populations also implies that judgments are made on the basis of observations that are actually never made!

Infinitely repeated trials or samplings never take place in the real world. So that cannot be a sound inductive basis for science with aspirations of explaining real-world socio-economic processes, structures or events. It’s not tenable. Why should we as social scientists — and not as pure mathematicians working with formal-axiomatic systems without the urge to confront our models with real target systems — unquestioningly accept models based on concepts like the ‘infinite super populations’ used in e.g. the ‘potential outcome’ framework that has become so popular lately in social sciences?

One could, of course, treat observational or experimental data as random samples from real populations. I have no problem with that (although it has to be noted that most ‘natural experiments’ are *not* based on random sampling from some underlying population — which, of course, means that the effect-estimators, strictly seen, only are unbiased for the specific groups studied). But probabilistic econometrics does not content itself with that kind of populations. Instead, it creates imaginary populations of ‘parallel universes’ and assume that our data are random samples from that kind of ‘infinite super populations.’

In social sciences — including economics — it’s always wise to ponder C. S. Peirce’s remark that universes are not as common as peanuts …

## Causality and analysis of variation

5 Dec, 2020 at 13:23 | Posted in Statistics & Econometrics | 1 CommentModern econometrics is fundamentally based on assuming — usually without any explicit justification — that we can gain causal knowledge by considering independent variables that may have an impact on the *variation* of a dependent variable. This is however, far from self-evident. Often the *fundamental* causes are *constant* forces that are not amenable to the kind of analysis econometrics supplies us with. As Stanley Lieberson has it in Making It Count:

One can always say whether, in a given empirical context, a given variable or theory accounts for more variation than another. But it is almost certain that the variation observed is not universal over time and place. Hence the use of such a criterion first requires a conclusion about the variation over time and place in the dependent variable. If such an analysis is not forthcoming, the theoretical conclusion is undermined by the absence of information …

Moreover, it is questionable whether one can draw much of a conclusion about causal forces from simple analysis of the observed variation … To wit, it is vital that one have an understanding, or at least a working hypothesis, about what is causing the event per se; variation in the magnitude of the event will not provide the answer to that question.

Trygve Haavelmo was making a somewhat similar point back in 1941, when criticizing the treatmeant of the interest variable in Tinbergen’s regression analyses. The regression coefficient of the interest rate variable being zero was according to Haavelmo not sufficient for inferring that “variations in the rate of interest play only a minor role, or no role at all, in the changes in investment activity.” Interest rates may very well play a decisive indirect role by influencing other causally effective variables. And:

the rate of interest may not have varied much during the statistical testing period, and for this reason the rate of interest would not “explain” very much of the variation in net profit (and thereby the variation in investment) which has actually taken place during this period. But one cannot conclude that the rate of influence would be inefficient as an autonomous regulator, which is, after all, the important point.

This problem of ‘nonexcitation’ — when there is too little variation in a variable to say anything about its *potential* importance, and we can’t identify the reason for the *factual *influence of the variable being ‘negligible’ — strongly confirms that causality in economics and other social sciences can never solely be a question of statistical inference. Causality entails more than predictability, and to really in depth explain social phenomena requires theory.

Analysis of variation — the foundation of all econometrics — can never in itself reveal *how* these variations are brought about. First when we are able to tie actions, processes or structures to the statistical relations detected, can we say that we are getting at relevant explanations of causation. Too much in love with axiomatic-deductive modeling, neoclassical economists especially tend to forget that accounting for causation — *how* causes bring about their effects — demands deep subject-matter knowledge and acquaintance with the intricate fabrics and contexts. As already Keynes argued in his *A Treatise on Probability*, statistics and econometrics should primarily be seen as means to describe patterns of associations and correlations, means that we may use as *suggestions* of possible causal realations.

## Covariance algebra (student stuff)

4 Dec, 2020 at 10:49 | Posted in Statistics & Econometrics | Comments Off on Covariance algebra (student stuff).

## Instrumental variables (student stuff)

2 Dec, 2020 at 18:43 | Posted in Statistics & Econometrics | Comments Off on Instrumental variables (student stuff).

Great presentation that also underlines the mistake many economists do when they use IV analysis and think that their basic identification assumption is empirically testable. It is not. And just swapping an assumption of residuals being uncorrelated with the independent variables with the assumption that the same residuals are uncorrelated with an instrument doesn’t solve the endogeneity problem or improve our causal analysis.

Blog at WordPress.com.

Entries and comments feeds.