Table 2 Fallacy (student stuff)

14 Jun, 2021 at 19:02 | Posted in Statistics & Econometrics | 2 Comments


Discrimination and the use of ‘statistical controls’

14 Jun, 2021 at 12:27 | Posted in Statistics & Econometrics | Leave a comment

The gender pay gap is a fact that, sad to say, to a non-negligible extent is the result of discrimination. And even though many women are not deliberately discriminated against, but rather self-select into lower-wage jobs, this in no way magically explains away the discrimination gap. As decades of socialization research have shown, women may be ‘structural’ victims of impersonal social mechanisms that in different ways aggrieve them. Wage discrimination is unacceptable. Wage discrimination is a shame.

You see it all the time in studies. “We controlled for…” And then the list starts … The more things you can control for, the stronger your study is — or, at least, the stronger your study seems. Controls give the feeling of specificity, of precision. But sometimes, you can control for too much. Sometimes you end up controlling for the thing you’re trying to measure …

An example is research around the gender wage gap, which tries to control for so many things that it ends up controlling for the thing it’s trying to measure …

Take hours worked, which is a standard control in some of the more sophisticated wage gap studies. Women tend to work fewer hours than men. If you control for hours worked, then some of the gender wage gap vanishes. As Yglesias wrote, it’s “silly to act like this is just some crazy coincidence. Women work shorter hours because as a society we hold women to a higher standard of housekeeping, and because they tend to be assigned the bulk of childcare responsibilities.”

Controlling for hours worked, in other words, is at least partly controlling for how gender works in our society. It’s controlling for the thing that you’re trying to isolate.

Ezra Klein
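Klein’s hours-worked point can be made concrete with a small simulation. Everything below is a stylized sketch with invented numbers, not an estimate of any real wage gap: gender causes hours worked, and wages depend on both hours and a direct penalty. Regressing wages on gender alone recovers the total gap, while adding hours as a control strips out the part of the gap that operates through hours, exactly the part Klein says we should not control away.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# gender dummy: 1 for women; hours worked is partly *caused* by gender
female = rng.integers(0, 2, n)
hours = 40 - 5 * female + rng.normal(0, 2, n)            # women average 5 fewer hours
wage = 2.0 * hours - 3.0 * female + rng.normal(0, 5, n)  # hours channel plus a direct penalty of 3

def ols(y, *regressors):
    """Least-squares coefficients with an intercept prepended."""
    X = np.column_stack((np.ones(len(y)),) + regressors)
    return np.linalg.lstsq(X, y, rcond=None)[0]

total = ols(wage, female)[1]          # total gap: -3 + 2 * (-5), about -13
direct = ols(wage, female, hours)[1]  # gap 'controlling for' hours: about -3
```

The second regression is not wrong as arithmetic; it simply answers a different question, since part of “how gender works” has been partialled out.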

Trying to reduce the risk of having established only ‘spurious relations’ when dealing with observational data, statisticians and econometricians standardly add control variables, in the hope that this will make more reliable causal inferences possible. But, as Keynes showed as early as the 1930s when criticizing statistical-econometric applications of regression analysis, if you do not manage to get hold of all potential confounding factors, the model risks producing estimates of the variable of interest that are even worse than those from models without any control variables at all. Conclusion: think twice before you simply include ‘control variables’ in your models!

When I present this argument … one or more scholars say, “But shouldn’t I control for everything I can in my regressions? If not, aren’t my coefficients biased due to excluded variables?” … The excluded variable argument only works if you are sure your specification is precisely correct with all variables included. But no one can know that with more than a handful of explanatory variables …

A preferable approach is to separate the observations into meaningful subsets—internally compatible statistical regimes … If this can’t be done, then statistical analysis can’t be done. A researcher claiming that nothing else but the big, messy regression is possible because, after all, some results have to be produced, is like a jury that says, “Well, the evidence was weak, but somebody had to be convicted.”

Christopher H. Achen

Kitchen-sink econometric models are often the result of researchers trying to control for confounding. But what they usually haven’t understood is that the confounder problem requires a causal solution, not statistical ‘control.’ Controlling for everything opens up the risk that we condition on ‘collider’ variables and thereby open ‘back-door paths’ that give us confounding that wasn’t there to begin with.
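What conditioning on a collider does can be shown with a toy simulation (the classic talent-and-looks story; the variable names are purely illustrative). Talent and looks are generated independently, but both raise the chance of becoming famous. In the full population they are uncorrelated; among the famous, a spurious negative association appears out of nowhere.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

talent = rng.normal(size=n)
looks = rng.normal(size=n)        # independent of talent by construction
famous = talent + looks > 1.0     # collider: both causes raise the chance of fame

r_population = np.corrcoef(talent, looks)[0, 1]                    # about 0
r_among_famous = np.corrcoef(talent[famous], looks[famous])[0, 1]  # clearly negative
```

Selecting on (or regressing with) the collider manufactures exactly the kind of confounding the control variable was supposed to remove.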

A light in the radio darkness

13 Jun, 2021 at 14:04 | Posted in Varia | Leave a comment

In these times, when the airwaves are drowned in the puerile drivel of commercial radio, one has almost given up.

But there is light in the darkness.

On the programme Text och musik med Eric Schüldt, broadcast on Sunday mornings on P2 between 11 a.m. and noon, you can listen to serious music and a host who really has something to say and does not just let his jaw flap. Hearing someone with intelligence and feeling talk about the things we all carry deep inside our souls, but almost never dare to speak of, is balm for the soul.

For several years now I have listened to Eric’s programme every Sunday. A weekend without his thought-provoking, often slightly melancholy reflections and wistful music has become unthinkable.

Today one could hear, among other things, film music by our own Stefan Nilsson, whose score for Ingmar Bergman’s and Bille August’s Den goda viljan is among the most beautiful and evocative film music ever written.

Me and Jane Austen in Karlsbad (personal)

13 Jun, 2021 at 09:37 | Posted in Varia | Leave a comment

Back in the ’80s yours truly had the pleasure of studying German in Vienna. A wonderful town, full of history and Kaffeehäuser.

A couple of years ago, I was invited to give a series of lectures at the University of Vienna and at the Vienna University of Economics and Business Administration. I spent an absolutely fabulous week with visits to Café Central, the Hofburg, the Vienna State Opera, the Belvedere, the Prater, etc.

Afterwards, yours truly, of course, could not resist the temptation to make a stopover in Karlsbad (Karlovy Vary). If you would like to walk right into a Jane Austen novel, and your wallet isn’t too thin, it is a highly recommended place. Hopefully, when the present pandemic is all over, I will get time off for a new visit.

The baccalauréat reform and Covid

12 Jun, 2021 at 14:18 | Posted in Education & School | Leave a comment


Extreme events and how to live with them

12 Jun, 2021 at 12:00 | Posted in Statistics & Econometrics | Leave a comment


John Maynard Keynes — life, ideas, legacy

12 Jun, 2021 at 11:47 | Posted in Economics | Leave a comment


On the limits of ‘mediation analysis’ and ‘statistical causality’

11 Jun, 2021 at 18:12 | Posted in Statistics & Econometrics | Leave a comment

“Mediation analysis” is this thing where you have a treatment and an outcome and you’re trying to model how the treatment works: how much does it directly affect the outcome, and how much is the effect “mediated” through intermediate variables …

In the real world, it’s my impression that almost all the mediation analyses that people actually fit in the social and medical sciences are misguided: lots of examples where the assumptions aren’t clear and where, in any case, coefficient estimates are hopelessly noisy and where confused people will over-interpret statistical significance …

More and more I’ve been coming to the conclusion that the standard causal inference paradigm is broken … So how to do it? I don’t think traditional path analysis or other multivariate methods of the throw-all-the-data-in-the-blender-and-let-God-sort-em-out variety will do the job. Instead we need some structure and some prior information.

Andrew Gelman

Causality in the social sciences, and in economics, can never be solely a question of statistical inference. Causality entails more than predictability, and really explaining social phenomena in depth requires theory. Analysis of variation, the foundation of all econometrics, can never in itself reveal how these variations are brought about. Only when we are able to tie actions, processes, or structures to the detected statistical relations can we say that we are getting at relevant causal explanations.

Most facts have many different possible alternative explanations, but we want to find the best of all contrastive explanations (since all real explanation takes place relative to a set of alternatives). So which is the best explanation? Many scientists, influenced by statistical reasoning, think that the likeliest explanation is the best explanation. But the likelihood of x is not in itself a strong argument for thinking it explains y. I would rather argue that what makes one explanation better than another are things like aiming for, and finding, powerful, deep causal features and mechanisms that we have warranted and justified reasons to believe in. Statistical reasoning, especially the variety based on a Bayesian epistemology, generally has no room for these kinds of explanatory considerations: the only thing that matters is the probabilistic relation between evidence and hypothesis. That is also one of the main reasons I find abduction, inference to the best explanation, a better description and account of what constitutes actual scientific reasoning and inference.

In the social sciences … regression is used to discover relationships or to disentangle cause and effect. However, investigators have only vague ideas as to the relevant variables and their causal order; functional forms are chosen on the basis of convenience or familiarity; serious problems of measurement are often encountered.

Regression may offer useful ways of summarizing the data and making predictions. Investigators may be able to use summaries and predictions to draw substantive conclusions. However, I see no cases in which regression equations, let alone the more complex methods, have succeeded as engines for discovering causal relationships.

David Freedman

Some statisticians and data scientists think that algorithmic formalisms somehow give them access to causality. That is, however, simply not true. Assuming ‘convenient’ things like faithfulness or stability is not to give proofs; it is to assume what has to be proven. The deductive-axiomatic methods used in statistics do not produce evidence for causal inferences. The real causality we are searching for is the one existing in the real world around us. If there is no warranted connection between axiomatically derived theorems and the real world, then we have not really obtained the causation we are looking for.

If contributions made by statisticians to the understanding of causation are to be taken over with advantage in any specific field of inquiry, then what is crucial is that the right relationship should exist between statistical and subject-matter concerns …
The idea of causation as consequential manipulation is apt to research that can be undertaken primarily through experimental methods and, especially, to ‘practical science’ where the central concern is indeed with ‘the consequences of performing particular acts’. The development of this idea in the context of medical and agricultural research is as understandable as the development of that of causation as robust dependence within applied econometrics. However, the extension of the manipulative approach into sociology would not appear promising, other than in rather special circumstances … The more fundamental difficulty is that under the — highly anthropocentric — principle of ‘no causation without manipulation’, the recognition that can be given to the action of individuals as having causal force is in fact peculiarly limited.

John H. Goldthorpe

Making money off the sick: a truly sick idea

11 Jun, 2021 at 09:15 | Posted in Economics | Leave a comment

Staff at Attendo’s care home Långbroberg in southern Stockholm have sounded the alarm about serious shortcomings, without feeling that they get a hearing from their managers. For Attendo, the business model is understaffing.

Around ten employees have now chosen to speak out about:

■ Unstaffed wards at night, where residents scream in pain and it takes a long time before they get help.

■ Watery soup, frozen meals and a shortage of breakfast food, leaving residents hungry and losing weight rapidly.

■ Residents who are put to bed for the night as early as 4 p.m. and have to eat their dinner in bed.

– I would rather throw myself in front of a train than grow old if this is what it’s like, says assistant nurse Maria Norstad Pantén, 60.

Karin Sörbring/Expressen

Many of those working in schools or the care sector have found it hard to understand the Social Democrats’ attitude to privatisation and profit extraction in the soft welfare sector. For some unfathomable reason, leading Social Democrats have for many years argued that profits should be allowed in schools and care companies. The argument has often been that the form of operation makes no difference. That is not the case. The form of operation, and allowing profit in welfare, does matter. And the effect is negative.

The Social Democrats are admittedly far from alone in their vacillation. From the other flank, Svenskt Näringsliv and the country’s leader writers deliver a steady stream of demands for more control, tougher scrutiny and inspections.

But wait a minute! When the systemic shift in the welfare sector began in the 1990s, wasn’t a common argument for privatisation precisely that we would escape the costs of bureaucratic logic, in the form of rules, controls and follow-ups? Competition, that panacea of market fundamentalism, was supposed to make operations more efficient and raise their quality. Market logic would force out the ‘bureaucratic’ and unwieldy public providers, leaving only the good companies that ‘freedom of choice’ had made possible.

And now, when the Panglossian privatisation wet dream turns out to be a nightmare, the very things we wanted to get rid of, rules and ‘bureaucratic’ supervision and control, are supposed to be the solution?

It makes you shake your head in disbelief, and for many reasons!

For if the proposed packages of measures are implemented, one has to wonder what becomes of the promised efficiency gains. Controls, contract specifications, inspections and the like cost money, so how much surplus do the privatisations yield once these costs are also entered into the cost-benefit analysis? And how much is that ‘freedom of choice’ worth when time and again it results only in operations where profit is generated through cost-cutting and lower quality?

Efficient use of resources can never be an end in itself. It can, however, be a necessary means of reaching stated goals. Whether or not we have a welfare state is not only a question of economic efficiency, but also of our conceptions of a dignified life, justice and equal treatment.

So the fundamental question is not whether tax-funded private companies should be allowed to extract profits, or whether tougher control and inspection are needed. The fundamental question is whether the logic of the market and privatisation should govern our welfare institutions, or whether this should happen through the logic of democracy and politics. The fundamental question is whether the common welfare sector should be governed by democracy and politics, or by the market.

Causality and the need to reform the teaching of statistics

10 Jun, 2021 at 16:18 | Posted in Statistics & Econometrics | Leave a comment

I will argue that realistic and thus scientifically relevant statistical theory is best viewed as a subdomain of causality theory, not a separate entity or an extension of probability. In particular, the application of statistics (and indeed most technology) must deal with causation if it is to represent adequately the underlying reality of how we came to observe what was seen … The network we deploy for analysis incorporates whatever time-order and independence assumptions we use for interpreting observed associations, whether those assumptions are derived from background (contextual) or design information … Statistics should integrate causal networks into its basic teachings and indeed into its entire theory, starting with the probability and bias models that are used to build up statistical methods and interpret their outputs. Every real data analysis has a causal component comprising the causal network assumed to have created the data set …

Thus, because statistical analyses need a causal skeleton to connect to the world, causality is not extra-statistical but instead is a logical antecedent of real-world inferences. Claims of random or “ignorable” or “unbiased” sampling or allocation are justified by causal actions to block (“control”) unwanted causal effects on the sample patterns. Without such actions of causal blocking, independence can only be treated as a subjective exchangeability assumption whose justification requires detailed contextual information about absence of factors capable of causally influencing both selection (including selection for treatment) and outcomes …

Given the absence of elaborated causality discussions in statistics textbooks and coursework, we should not be surprised at the widespread misuse and misinterpretation of statistical methods and results. This is why incorporation of causality into introductory statistics is needed as urgently as other far more modest yet equally resisted reforms involving shifts in labels and interpretations for p-values and interval estimates.

Sander Greenland

Seven lessons we need to learn from the pandemic

10 Jun, 2021 at 09:07 | Posted in Economics | Leave a comment


The elite illusion

9 Jun, 2021 at 13:13 | Posted in Statistics & Econometrics | Leave a comment


A great set of lectures — but yours truly still warns his students that regression-based averages are something we have reason to be cautious about.

Suppose we want to estimate the average causal effect of a dummy variable (T) on an observed outcome variable (O). In a usual regression context one would apply an ordinary least squares estimator (OLS) in trying to get an unbiased and consistent estimate:

O = α + βT + ε,

where α is a constant intercept, β a constant ‘structural’ causal effect and ε an error term.

The problem here is that although we may get an estimate of the ‘true’ average causal effect, this may ‘mask’ important heterogeneous effects of a causal nature. Although we get the right answer of the average causal effect being 0, those who are ‘treated’ (T=1) may have causal effects equal to -100 and those ‘not treated’ (T=0) may have causal effects equal to 100. Contemplating being treated or not, most people would probably be interested in knowing about this underlying heterogeneity and would not consider the OLS average effect particularly enlightening.

The heterogeneity problem does not just turn up as an external validity problem when trying to ‘export’ regression results to different times or different target populations. It is also often an internal problem for the millions of OLS estimates that economists produce every year.
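The masking can be made concrete with a quick simulation (the two latent ‘types’ below are hypothetical). Half the population has an individual causal effect of +100 and half of -100, treatment is randomized, and the OLS dummy-regression slope dutifully reports an average effect near zero, exactly the uninformative answer described above.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

# two latent types, half each, with opposite individual causal effects
group = rng.integers(0, 2, n)
effect = np.where(group == 1, -100.0, 100.0)

T = rng.integers(0, 2, n)                    # randomized treatment dummy
O = 50 + effect * T + rng.normal(0, 10, n)   # observed outcome

def dummy_slope(y, t):
    """OLS slope on a 0/1 regressor equals the difference in group means."""
    return y[t == 1].mean() - y[t == 0].mean()

ate = dummy_slope(O, T)                                # near 0: the average hides everything
eff_plus = dummy_slope(O[group == 0], T[group == 0])   # near +100
eff_minus = dummy_slope(O[group == 1], T[group == 1])  # near -100
```

Only by splitting the sample on the (here conveniently observable) type do the two opposite effects become visible; in practice the type is usually latent, which is the whole problem.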

Why data alone does not answer counterfactual questions

8 Jun, 2021 at 22:40 | Posted in Statistics & Econometrics | Leave a comment


Keynes’s only regret

7 Jun, 2021 at 22:30 | Posted in Varia | 1 Comment

Steve Keen (@ProfSteveKeen) | Twitter

Yours truly won’t have to share Keynes’s regret.

Champagne is one of the few things in life he never says no to.

What are the key assumptions of linear regression models?

7 Jun, 2021 at 22:08 | Posted in Statistics & Econometrics | 3 Comments

In Andrew Gelman’s and Jennifer Hill’s Data Analysis Using Regression and Multilevel/Hierarchical Models the authors list the assumptions of the linear regression model. The assumptions — in decreasing order of importance — are:

1. Validity. Most importantly, the data you are analyzing should map to the research question you are trying to answer. This sounds obvious but is often overlooked or ignored because it can be inconvenient. . . .

2. Additivity and linearity. The most important mathematical assumption of the regression model is that its deterministic component is a linear function of the separate predictors . . .

3. Independence of errors. . . .

4. Equal variance of errors. . . .

5. Normality of errors. . . .

Further assumptions are necessary if a regression coefficient is to be given a causal interpretation …

Yours truly can’t but concur, especially on the “decreasing order of importance” of the assumptions. But then, of course, one really has to wonder why econometrics textbooks almost invariably turn this order of importance upside down and fail to discuss more thoroughly the overriding importance of Gelman and Hill’s first two points …
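The ordering is easy to motivate with a small experiment (a sketch with invented data, nothing more). In the first regression the errors are heavy-tailed rather than normal, yet with a correctly specified linear form the slope estimate is essentially unharmed. In the second, the errors are perfectly normal but the true relation is quadratic, and the linear fit misses it completely: a failure of assumption 2 that no amount of error-distribution hygiene can repair.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50_000
x = rng.uniform(-2, 2, n)

# Heavy-tailed (Student-t) errors, correct linear form: slope near the true 2
y1 = 1.0 + 2.0 * x + rng.standard_t(df=3, size=n)
b1 = np.polyfit(x, y1, 1)[0]

# Perfectly normal errors, but the true relation is quadratic: linear slope near 0
y2 = 1.0 + x ** 2 + rng.normal(0, 0.1, n)
b2 = np.polyfit(x, y2, 1)[0]
```

Violating the low-ranked normality assumption costs almost nothing here, while violating linearity makes the fitted coefficient worthless, which is precisely the point of putting validity and linearity at the top of the list.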
