## ‘Overcontrolling’ in statistical studies

31 Jul, 2022 at 13:27 | Posted in Statistics & Econometrics | Comments Off on ‘Overcontrolling’ in statistical studies

You see it all the time in studies. “We controlled for…” And then the list starts … The more things you can control for, the stronger your study is — or, at least, the stronger your study seems. Controls give the feeling of specificity, of precision. But sometimes, you can control for too much. Sometimes you end up controlling for the thing you’re trying to measure …

An example is research around the gender wage gap, which tries to control for so many things that it ends up controlling for the thing it’s trying to measure …

Take hours worked, which is a standard control in some of the more sophisticated wage gap studies. Women tend to work fewer hours than men. If you control for hours worked, then some of the gender wage gap vanishes. As Yglesias wrote, it’s “silly to act like this is just some crazy coincidence. Women work shorter hours because as a society we hold women to a higher standard of housekeeping, and because they tend to be assigned the bulk of childcare responsibilities.”

Controlling for hours worked, in other words, is at least partly controlling for how gender works in our society. It’s controlling for the thing that you’re trying to isolate.
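What “controlling for the thing you’re trying to measure” does to an estimate can be sketched in a few lines of simulation. The data-generating process below is a pure illustration (every coefficient is an invented assumption, not an estimate of any actual wage gap): gender affects wages both directly and through hours worked, so conditioning on hours absorbs part of the very gap under study.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Toy data-generating process (all numbers are illustrative assumptions):
# being a woman lowers hours worked, and hours worked raise the wage.
woman = rng.integers(0, 2, n)
hours = 40 - 5 * woman + rng.normal(0, 2, n)
wage = 20 + 0.5 * hours - 2 * woman + rng.normal(0, 1, n)

# Raw gap = total effect of gender: direct (-2) plus via hours (0.5 * -5) = -4.5
raw_gap = wage[woman == 1].mean() - wage[woman == 0].mean()

# 'Controlling' for hours via OLS leaves only the direct effect (-2):
X = np.column_stack([np.ones(n), woman, hours])
beta, *_ = np.linalg.lstsq(X, wage, rcond=None)
adjusted_gap = beta[1]

print(round(raw_gap, 1), round(adjusted_gap, 1))  # ≈ -4.5  -2.0
```

More than half of the gap “vanishes” under the control, not because it isn’t there, but because hours worked is itself an outcome of gender in this toy world.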

To reduce the risk of having established only ‘spurious relations’ when dealing with observational data, statisticians and econometricians standardly add control variables, in the hope that this will allow more reliable causal inferences. But — as Keynes showed back in the 1930s when criticizing statistical-econometric applications of regression analysis — if you do not manage to get hold of *all* potential confounding factors, the model risks producing estimates of the variable of interest that are even worse than models without any control variables at all. Conclusion: think twice before you simply include ‘control variables’ in your models!

When I present this argument … one or more scholars say, “But shouldn’t I control for everything I can in my regressions? If not, aren’t my coefficients biased due to excluded variables?” … The excluded variable argument only works if you are sure your specification is precisely correct with all variables included. But no one can know that with more than a handful of explanatory variables …

A preferable approach is to separate the observations into meaningful subsets—internally compatible statistical regimes … If this can’t be done, then statistical analysis can’t be done. A researcher claiming that nothing else but the big, messy regression is possible because, after all, some results have to be produced, is like a jury that says, “Well, the evidence was weak, but somebody had to be convicted.”

Kitchen sink econometric models are often the result of researchers trying to control for confounding. But what they usually haven’t understood is that the confounder problem requires a *causal* solution and not *statistical* ‘control.’ Controlling for everything opens up the risk that we control for ‘collider’ variables and thereby open non-causal paths that give us confounding that wasn’t there to begin with.
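The collider problem is easy to demonstrate with a minimal sketch (the variables and numbers are invented for illustration): x and y are causally unrelated, both feed into a collider c, and conditioning on c manufactures an association out of nothing.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

x = rng.normal(size=n)             # causally unrelated to y
y = rng.normal(size=n)
c = x + y + rng.normal(0, 0.5, n)  # collider: caused by both x and y

# In the full sample there is no association at all:
r_all = np.corrcoef(x, y)[0, 1]

# 'Controlling' for the collider by conditioning on it (here: c > 1)
# induces a strong spurious negative correlation:
sel = c > 1
r_sel = np.corrcoef(x[sel], y[sel])[0, 1]

print(round(r_all, 2), round(r_sel, 2))  # first ≈ 0, second clearly negative
```

Among units selected on high c, a high x mechanically goes with a low y and vice versa, which is exactly the confounding-from-nowhere the text warns about.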

## Economics and the law of the hammer

30 Jul, 2022 at 21:04 | Posted in Economics | 2 Comments.

As yours truly has reported repeatedly during the last couple of years, university students all over the world are increasingly beginning to question whether the kind of economics they are taught — mainstream economics — really is of any value. Some have even started to question whether economics is a science.

My own take on the issue is that economics — and especially mainstream economics — has lost immensely in status and prestige in recent years. Not least because of its manifest inability to foresee the latest financial and economic crises — and its lack of constructive and sustainable policies to take us out of them.

We all know that many activities, relations, processes, and events are uncertain and that the data do not unequivocally single out one decision as the only “rational” one. Neither the economist nor the deciding individual can fully pre-specify how people will decide when facing uncertainties and ambiguities that are ontological facts of the way the world works.

Mainstream economists, however, have wanted to use their hammer, and so decided to pretend that the world looks like a nail. Pretending that uncertainty can be reduced to risk and constructing models on that assumption have only contributed to financial crises and economic havoc.

How do we put an end to this intellectual cataclysm? How do we re-establish credence and trust in economics as a science? Five changes are absolutely decisive.

(1) **Stop pretending that we have exact and rigorous answers on everything**. Because we don’t. We build models and theories and tell people that we can calculate and foresee the future. But we do this based on mathematical and statistical assumptions that often have little or nothing to do with reality. By pretending that there is no really important difference between model and reality we lull people into thinking that we have things under control. We haven’t! This false feeling of security was one of the factors that contributed to the financial crisis of 2008.

(2) **Stop the childish and exaggerated belief in mathematics giving answers to important economic questions**. Mathematics gives exact answers to exact questions. But the relevant and interesting questions we face in the economic realm are rarely of that kind. Questions like “Is 2 + 2 = 4?” are never posed in real economies. Instead of a fundamentally misplaced reliance on abstract mathematical-deductive-axiomatic models having anything of substance to contribute to our knowledge of real economies, it would be far better if we pursued “thicker” models and relevant empirical studies and observations. Mathematics cannot establish the truth value of a fact. Never has. Never will.

(3) **Stop pretending that there are laws in economics**. There are no universal laws in economics. Economies are not like planetary systems or physics labs. The most we can aspire to in real economies is establishing possible tendencies with varying degrees of generalizability.

(4) **Stop treating other social sciences as poor relations.** Economics has long suffered from hubris. A more broad-minded and multifarious science would enrich today’s economics and make it more relevant and realistic.

(5) **Stop building models and making forecasts of the future based on totally unreal micro-founded macro models with intertemporally optimizing robot-like representative actors equipped with rational expectations.** This is pure nonsense. We have to build our models on assumptions that are not so blatantly in contradiction to reality. Assuming that people are green and come from Mars is not a good — not even as a “successive approximation” — modeling strategy.

## Nights in white satin

30 Jul, 2022 at 16:35 | Posted in Varia | Comments Off on Nights in white satin.

On revient toujours à ses premières amours … (one always returns to one’s first loves)

## Perfect day (personal)

28 Jul, 2022 at 21:41 | Posted in Varia | Comments Off on Perfect day (personal).

Spending a beautiful summer afternoon with my lovely daughters, Linnea and Tora.


## On statistics and causality

28 Jul, 2022 at 15:56 | Posted in Statistics & Econometrics | 9 Comments

Ironically, the need for a theory of causation began to surface at the same time that statistics came into being … This was a critical moment in the history of science. The opportunity to equip causal questions with a language of their own came very close to being realized but was squandered. In the following years, these questions were declared unscientific and went underground. Despite heroic efforts by the geneticist Sewall Wright (1889-1988), causal vocabulary was virtually prohibited for more than half a century. And when you prohibit speech, you prohibit thought and stifle principles, methods, and tools.

Readers do not have to be scientists to witness this prohibition. In Statistics 101, every student learns to chant, “Correlation is not causation.” With good reason! The rooster’s crow is highly correlated with the sunrise; yet it does not cause the sunrise.

Unfortunately, statistics has fetishized this commonsense observation. It tells us that correlation is not causation, but it does not tell us what causation is. In vain will you search the index of a statistics textbook for an entry on “cause.” Students are not allowed to say that X is the cause of Y — only that X and Y are “related” or “associated.”

Statistical reasoning certainly seems paradoxical to most people.

Take for example the well-known Simpson’s paradox.

From a theoretical perspective, Simpson’s paradox importantly shows that causality can never be reduced to a question of statistics or probabilities unless you are — miraculously — able to keep constant *all* other factors that influence the probability of the outcome studied.

To understand causality we always have to relate it to a specific causal *structure*. Statistical correlations are *never* enough. No structure, no causality.

Simpson’s paradox is an interesting paradox in itself, but it can also highlight a deficiency in the traditional statistical/econometric approach to causality. Say you have 1000 observations on men and an equal number on women applying for admission to university studies, and that 70% of the men are admitted but only 30% of the women. Running a logistic regression to find the odds ratios (and probabilities) for admission, women seem to be in a less favourable position (‘discriminated’ against) compared to men: male odds are 2.33, female odds are 0.43, giving an odds ratio of 5.44. But once we find out that men and women apply to different departments, we may well get a Simpson’s paradox result where the men turn out to be ‘discriminated’ against. Say 800 men apply for economics studies (680 admitted) and 200 for physics studies (20 admitted), while 100 women apply for economics studies (90 admitted) and 900 for physics studies (210 admitted) — giving within-department odds ratios of 0.63 and 0.37.
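The odds ratios in the example can be checked directly with a few lines of arithmetic (the admission figures are the illustrative ones from the text):

```python
# (admitted, applied) per department, from the example above
men = {"economics": (680, 800), "physics": (20, 200)}
women = {"economics": (90, 100), "physics": (210, 900)}

def odds(admitted, applied):
    """Odds of admission = admitted / rejected."""
    return admitted / (applied - admitted)

# Aggregated: men 700/1000 admitted, women 300/1000
men_total = (sum(a for a, _ in men.values()), 1000)
women_total = (sum(a for a, _ in women.values()), 1000)
overall_or = odds(*men_total) / odds(*women_total)
print(round(overall_or, 2))  # 5.44: women look disadvantaged

# Disaggregated: the ratio flips below 1 in every single department
for dept in ("economics", "physics"):
    print(dept, round(odds(*men[dept]) / odds(*women[dept]), 2))
# economics 0.63, physics 0.37: men look disadvantaged
```

The aggregate comparison and the department-level comparisons answer different causal questions, which is why no amount of staring at the numbers alone resolves the paradox.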

Statistical — and econometric — patterns should never be seen as anything other than possible clues to follow. Behind the observable data there are real structures and mechanisms operating, and if we really want to understand, explain, and (possibly) predict things in the real world, getting hold of these is more important than simply correlating and regressing observable variables.

Statistics cannot establish the truth value of a fact. Never has. Never will.

## Should we save capitalism?

28 Jul, 2022 at 14:23 | Posted in Economics | Comments Off on Should we save capitalism?

## The insignificance of significance

27 Jul, 2022 at 10:38 | Posted in Statistics & Econometrics | Comments Off on The insignificance of significance

A significance test is a scientific instrument, and like any other instrument, it has a certain degree of precision. If you make the test more sensitive—by increasing the size of the studied population, for example—you enable yourself to see ever-smaller effects. That’s the power of the method, but also its danger. The truth is, the null hypothesis, if we take it literally, is probably just about always false. When you drop a powerful drug into a patient’s bloodstream, it’s hard to believe the intervention has exactly zero effect on the probability that the patient will develop esophageal cancer, or thrombosis, or bad breath …

If only we could go back in time to the dawn of statistical nomenclature and declare that a result passing Fisher’s test with a p-value of less than 0.05 was “statistically noticeable” or “statistically detectable” instead of “statistically significant”! That would be truer to the meaning of the method, which merely counsels us about the existence of an effect but is silent about its size or importance.
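The point above (sensitivity scales with sample size while importance does not) can be illustrated with a two-sided z-test on a fixed, substantively negligible effect. The effect size and the sample sizes below are arbitrary choices made for the sketch:

```python
from math import erfc, sqrt

def z_test_p(effect, sigma, n):
    """Two-sided p-value when the sample mean lands exactly at `effect`."""
    z = effect / (sigma / sqrt(n))
    return erfc(abs(z) / sqrt(2))  # normal tail probability, both sides

tiny = 0.01  # one hundredth of a standard deviation: practically nothing

for n in (100, 10_000, 1_000_000):
    print(n, z_test_p(tiny, 1.0, n))
# n = 100       -> p ≈ 0.92   (invisible)
# n = 10_000    -> p ≈ 0.32   (still not 'significant')
# n = 1_000_000 -> p < 1e-20  ('significant', though the effect is unchanged)
```

The effect never changes; only the instrument’s sensitivity does — which is exactly why “detectable” would be the more honest word.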

## Werden wir immer dümmer?

26 Jul, 2022 at 16:27 | Posted in Politics & Society | Comments Off on Werden wir immer dümmer?

## The trickle down scam

25 Jul, 2022 at 10:05 | Posted in Economics | Comments Off on The trickle down scam

The empirical literature on the impact of corporate taxes on economic growth reaches ambiguous conclusions: corporate tax cuts increase, reduce, or do not significantly affect growth. We apply meta-regression methods to a novel data set with 441 estimates from 42 primary studies. There is evidence for publication selectivity in favour of reporting growth-enhancing effects of corporate tax cuts. Correcting for this bias, we cannot reject the hypothesis of a zero effect of corporate taxes on growth. Several factors influence reported estimates, including researcher choices concerning the measurement of growth and corporate taxes, and controlling for other budgetary components.

## Why I am not a Bayesian

24 Jul, 2022 at 16:34 | Posted in Statistics & Econometrics | 1 Comment

Assume you’re a Bayesian turkey and hold a nonzero probability belief in the hypothesis H that “people are nice vegetarians who do not eat turkeys, and every day I see the sun rise confirms my belief.” For every day you survive, you update your belief according to Bayes’ Rule

P(H|e) = [P(e|H)P(H)]/P(e),

where evidence e stands for “not being eaten” and P(e|H) = 1. Given that there do exist other hypotheses than H, P(e) is less than 1 and so P(H|e) is greater than P(H). Every day you survive increases your probability belief that you will not be eaten. This is totally rational according to the Bayesian definition of rationality. Unfortunately — as Bertrand Russell famously noticed — for every day that goes by, the traditional Christmas dinner also gets closer and closer …
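The turkey’s arithmetic is easy to reproduce. The daily P(e|¬H) = 0.99 below is an invented number; any value below 1 produces the same monotone drift toward certainty:

```python
def update(prior, p_e_given_h=1.0, p_e_given_not_h=0.99):
    """One Bayes-rule update on the evidence e = 'not eaten today'."""
    p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
    return p_e_given_h * prior / p_e

belief = 0.5  # the turkey starts out agnostic about H
for day in range(364):
    belief = update(belief)

print(round(belief, 3))  # ≈ 0.975 on Christmas Eve
```

Each coherent update raises P(H), so the turkey is never more confident in human benevolence than on the day before dinner.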

Mainstream economics nowadays usually assumes that agents that have to make choices under conditions of uncertainty behave according to Bayesian rules — that is, they maximise expected utility with respect to some subjective probability measure that is continually updated according to Bayes’ theorem. If not, they are supposed to be irrational.

Bayesianism reduces questions of rationality to questions of internal consistency (coherence) of beliefs, but — even granted this questionable reductionism — do rational agents really have to be Bayesian?

The nodal point here is — of course — that although Bayes’ Rule is *mathematically* unquestionable, that doesn’t qualify it as indisputably applicable to *scientific* questions. As one of my favourite statistics bloggers — Andrew Gelman — puts it:

The fundamental objections to Bayesian methods are twofold: on one hand, Bayesian methods are presented as an automatic inference engine, and this raises suspicion in anyone with applied experience, who realizes that different methods work well in different settings … The second objection to Bayes comes from the opposite direction and addresses the subjective strand of Bayesian inference: the idea that prior and posterior distributions represent subjective states of knowledge …

Beyond these objections is a general impression of the shoddiness of some Bayesian analyses, combined with a feeling that Bayesian methods are being oversold as an all-purpose statistical solution to genuinely hard problems. Compared to classical inference, which focuses on how to extract the information available in data, Bayesian methods seem to quickly move to elaborate computation …

Bayesian inference is a coherent mathematical theory but I don’t trust it in scientific applications. Subjective prior distributions don’t transfer well from person to person, and there’s no good objective principle for choosing a noninformative prior (even if that concept were mathematically defined, which it’s not). Where do prior distributions come from, anyway? I don’t trust them and I see no reason to recommend that other people do, just so that I can have the warm feeling of philosophical coherence …

## What to do about the present inflation

20 Jul, 2022 at 14:40 | Posted in Economics | 1 Comment

To be sure, some normalization of interest rates would be a good thing. Interest rates are supposed to reflect the scarcity of capital, and the “correct” price of capital obviously is not zero or negative – as near-zero interest rates and very negative real (inflation-adjusted) interest rates would seem to imply. But there are substantial dangers in pushing rates too high, too fast.

For example, it is important to recognize that US wage growth has slowed sharply, from an annualized rate of over 6% in the fall of 2021 to just 4.4% in the most recent period … So much for the “wage-price spiral” that previously had everyone scared and fueled demands for rapid monetary-policy tightening.

It is also important to recognize that this development runs completely counter to the standard Phillips curve models, which assume an inverse relationship between inflation and unemployment over the short term. The slowing of wage growth has occurred at a time when the unemployment rate is under 4% – a level below anyone’s estimates of the “non-accelerating inflation rate of unemployment.” This phenomenon may owe something to the much lower level of unionization and worker power in today’s economy; but whatever the reason, the sharp slowdown in wage growth indicates that policymakers should think twice before generating further increases in unemployment to tame inflation …

Most importantly, we need to help those at the bottom and middle cope with the consequences of inflation. Because the US is close to being energy independent, the country as a whole is relatively unaffected by changing energy prices (gains to exporters are simply offset by importers’ losses). But there is a huge distribution problem. Oil and gas companies are raking in windfall gains while ordinary citizens struggle to make ends meet. An “inflation rebate,” financed by a windfall-profits tax on fossil-fuel corporations, would efficiently address these inequities.

## Heckman on where causality resides

17 Jul, 2022 at 13:53 | Posted in Statistics & Econometrics | 11 Comments

I make two main points that are firmly anchored in the econometric tradition. The first is that causality is a property of a model of hypotheticals. A fully articulated model of the phenomena being studied precisely defines hypothetical or counterfactual states. A definition of causality drops out of a fully articulated model as an automatic by-product. A model is a set of possible counterfactual worlds constructed under some rules. The rules may be the laws of physics, the consequences of utility maximization, or the rules governing social interactions, to take only three of many possible examples. A model is in the mind. As a consequence, causality is in the mind.

So, according to this ‘Nobel prize’-winning econometrician, “causality is in the mind.” But is that a tenable view? Yours truly thinks not. If economists or social scientists subscribed to that view, there would be little reason to be interested in questions of causality at all. And it surely doesn’t suffice just to say that all science is predicated on assumptions. To most of us, models are ‘vehicles’ or ‘instruments’ by which we represent causal processes and structures that exist and operate in the real world. As we all know, models often fail to represent or explain these processes and structures — but if we considered them nothing but figments of our minds, then maybe we ought to reconsider why we are in the science business at all …

The world as we know it has limited scope for certainty and perfect knowledge. Its intrinsic and almost unlimited complexity and the interrelatedness of its parts prevent the possibility of treating it as constituted by atoms with discretely distinct, separable and stable causal relations. Our knowledge accordingly has to be of a rather fallible kind. To search for deductive precision and rigour in such a world is self-defeating. The only way to defend such an endeavour is to restrict oneself to proving things in closed model worlds. Why we should care about these worlds rather than asking questions of relevance is hard to see. As scientists, we have to get our priorities right. Ontological under-labouring has to precede epistemology.

The value of getting at precise and rigorous conclusions about causality based on ‘tractability’ conditions that are seldom met in real life is difficult to assess. Testing and constructing models is one thing, but we also need guidelines for how to evaluate in which situations and contexts they are applicable. Formalism may help us a bit down the road, but we have to make sure it somehow also fits the world if it is going to be really helpful in navigating that world. In all of science, conclusions are never more certain than the assumptions on which they are founded. But most epistemically convenient methods and models that work in ‘well-behaved’ systems do not come with warrants that they will work in other (real-world) contexts.

## Postmodernism explained

16 Jul, 2022 at 10:30 | Posted in Politics & Society | Comments Off on Postmodernism explained

## Econometrics — nothing but a second-best explanatory practice

13 Jul, 2022 at 10:50 | Posted in Statistics & Econometrics | Comments Off on Econometrics — nothing but a second-best explanatory practice

Consider two elections, A and B. For each of them, identify the events that cause a given percentage of voters to turn out. Once we have thus explained the turnout in election A and the turnout in election B, the explanation of the difference (if any) follows automatically, as a by-product. As a bonus, we might be able to explain whether identical turnouts in A and B are accidental, that is, due to differences that exactly offset each other, or not. In practice, this procedure might be too demanding. The data or the available theories might not allow us to explain the phenomena “in and of themselves.” We should be aware, however, that if we do resort to explanation of variation, we are engaging in a second-best explanatory practice.

Modern econometrics is fundamentally based on assuming — usually without any explicit justification — that we can gain causal knowledge by considering independent variables that may have an impact on the *variation* of a dependent variable. This is, however, far from self-evident. Often the *fundamental* causes are *constant* forces that are not amenable to the kind of analysis econometrics supplies us with. As Stanley Lieberson has it in *Making It Count*:

One can always say whether, in a given empirical context, a given variable or theory accounts for more variation than another. But it is almost certain that the variation observed is not universal over time and place. Hence the use of such a criterion first requires a conclusion about the variation over time and place in the dependent variable. If such an analysis is not forthcoming, the theoretical conclusion is undermined by the absence of information …

Moreover, it is questionable whether one can draw much of a conclusion about causal forces from simple analysis of the observed variation … To wit, it is vital that one have an understanding, or at least a working hypothesis, about what is causing the event per se; variation in the magnitude of the event will not provide the answer to that question.

Trygve Haavelmo was making a somewhat similar point back in 1941 when criticizing the treatment of the interest variable in Tinbergen’s regression analyses. The regression coefficient of the interest rate variable being zero was, according to Haavelmo, not sufficient for inferring that “variations in the rate of interest play only a minor role, or no role at all, in the changes in investment activity.” Interest rates may very well play a decisive indirect role by influencing other causally effective variables. And:

the rate of interest may not have varied much during the statistical testing period, and for this reason the rate of interest would not “explain” very much of the variation in net profit (and thereby the variation in investment) which has actually taken place during this period. But one cannot conclude that the rate of interest would be inefficient as an autonomous regulator, which is, after all, the important point.

This problem of ‘nonexcitation’ — when there is too little variation in a variable to say anything about its *potential* importance, and we can’t identify why the *factual* influence of the variable is ‘negligible’ — strongly confirms that causality in economics and other social sciences can never be solely a question of statistical inference. Causality entails more than predictability, and really explaining social phenomena in depth requires theory.
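Haavelmo’s nonexcitation point can be simulated directly. In the toy model below (all coefficients are invented for illustration), the interest rate has a large causal effect per unit, but because it barely varies in the sample it ‘explains’ almost none of the observed variation in investment:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10_000

profit = rng.normal(0, 1, n)                # varies a lot in the sample
interest = 0.03 + rng.normal(0, 0.0005, n)  # barely varies at all
investment = 2 * profit - 50 * interest + rng.normal(0, 1, n)

# Share of the sample variance in investment attributable to each regressor:
share_profit = (2 * profit.std()) ** 2 / investment.var()
share_interest = (50 * interest.std()) ** 2 / investment.var()

print(round(share_profit, 2), round(share_interest, 4))
# profit 'explains' roughly 80% of the variation, interest essentially 0% --
# even though the causal coefficient on the rate is 25 times larger per unit.
```

Concluding from the tiny explained-variance share that the interest rate “doesn’t matter” would be exactly the inference Haavelmo warned against: it says nothing about what the rate would do as an autonomous regulator if it actually moved.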

Analysis of variation — the foundation of all econometrics — can never in itself reveal *how* these variations are brought about. Only when we are able to tie actions, processes, or structures to the detected statistical relations can we say that we are getting at relevant explanations of causation. Too much in love with axiomatic-deductive modeling, neoclassical economists especially tend to forget that accounting for causation — *how* causes bring about their effects — demands deep subject-matter knowledge and acquaintance with intricate fabrics and contexts. As Keynes already argued in his *A Treatise on Probability*, statistics (and econometrics) should primarily be seen as means of describing patterns of associations and correlations, which we may use as *suggestions* of possible causal relations. Forgetting that, economists will remain stuck with a second-best explanatory practice.

## Uber — the ugly truth

12 Jul, 2022 at 18:45 | Posted in Politics & Society | Comments Off on Uber — the ugly truth.


**Added July 13**: And Macron now tells us he’s “extremely proud” of his Uber maneuvering …
