On the difference between econometrics and data science

20 Jan, 2021 at 22:42 | Posted in Statistics & Econometrics | 1 Comment


Causality in social sciences can never solely be a question of statistical inference. Causality entails more than predictability, and really explaining social phenomena in depth requires theory. The analysis of variation can never in itself reveal how these variations are brought about. Only when we are able to tie actions, processes or structures to the statistical relations detected can we say that we are getting at relevant explanations of causation.

Most facts have many different, possible, alternative explanations, but we want to find the best of all contrastive explanations (since all real explanation takes place relative to a set of alternatives). So which is the best explanation? Many scientists, influenced by statistical reasoning, think that the likeliest explanation is the best explanation. But the likelihood of x is not in itself a strong argument for thinking it explains y. I would rather argue that what makes one explanation better than another are things like aiming for and finding powerful, deep causal features and mechanisms that we have warranted and justified reasons to believe in. Statistical reasoning — especially the variety based on a Bayesian epistemology — generally has no room for these kinds of explanatory considerations. The only thing that matters is the probabilistic relation between evidence and hypothesis. That is also one of the main reasons I find abduction — inference to the best explanation — a better description and account of what constitutes actual scientific reasoning and inference.

Some statisticians and data scientists think that algorithmic formalisms somehow give them access to causality. That is simply not true. Assuming ‘convenient’ things like faithfulness or stability is not to give proofs. It is to assume what has to be proven. Deductive-axiomatic methods used in statistics do not produce evidence for causal inferences. The real causality we are searching for is the one existing in the real world around us. If there is no warranted connection between axiomatically derived theorems and the real world, well, then we haven’t really obtained the causation we are looking for.

Leontief’s devastating critique of econom(etr)ics

18 Jan, 2021 at 17:57 | Posted in Economics | 2 Comments

Much of current academic teaching and research has been criticized for its lack of relevance, that is, of immediate practical impact … I submit that the consistently indifferent performance in practical applications is in fact a symptom of a fundamental imbalance in the present state of our discipline. The weak and all too slowly growing empirical foundation clearly cannot support the proliferating superstructure of pure, or should I say, speculative economic theory …

Uncritical enthusiasm for mathematical formulation tends often to conceal the ephemeral substantive content of the argument behind the formidable front of algebraic signs … In the presentation of a new model, attention nowadays is usually centered on a step-by-step derivation of its formal properties. But if the author — or at least the referee who recommended the manuscript for publication — is technically competent, such mathematical manipulations, however long and intricate, can even without further checking be accepted as correct. Nevertheless, they are usually spelled out at great length. By the time it comes to interpretation of the substantive conclusions, the assumptions on which the model has been based are easily forgotten. But it is precisely the empirical validity of these assumptions on which the usefulness of the entire exercise depends.

What is really needed, in most cases, is a very difficult and seldom very neat assessment and verification of these assumptions in terms of observed facts. Here mathematics cannot help and because of this, the interest and enthusiasm of the model builder suddenly begins to flag: “If you do not like my set of assumptions, give me another and I will gladly make you another model; have your pick.” …

But shouldn’t this harsh judgment be suspended in the face of the impressive volume of econometric work? The answer is decidedly no. This work can be in general characterized as an attempt to compensate for the glaring weakness of the data base available to us by the widest possible use of more and more sophisticated statistical techniques. Alongside the mounting pile of elaborate theoretical models we see a fast-growing stock of equally intricate statistical tools. These are intended to stretch to the limit the meager supply of facts … Like the economic models they are supposed to implement, the validity of these statistical tools depends itself on the acceptance of certain convenient assumptions pertaining to stochastic properties of the phenomena which the particular models are intended to explain; assumptions that can be seldom verified.

Wassily Leontief

A salient feature of modern mainstream economics is the idea of science advancing through the use of “successive approximations” whereby ‘small-world’ models become more and more relevant and applicable to the ‘large world’ in which we live. Is this really a feasible methodology? Yours truly thinks not.

Most models in science are representations of something else. Models “stand for” or “depict” specific parts of a “target system” (usually the real world). And all empirical sciences use simplifying or unrealistic assumptions in their modelling activities. That is not the issue — as long as the assumptions made are not unrealistic in the wrong way or for the wrong reasons.

Theories are difficult to directly confront with reality. Economists therefore build models of their theories. Those models are representations that are directly examined and manipulated to indirectly say something about the target systems.

But models do not only face theory. They also have to look to the world. Being able to model a “credible world,” a world that somehow could be considered real or similar to the real world, is not the same as investigating the real world. Even though all theories are false, since they simplify, they may still possibly serve our pursuit of truth. But then they cannot be unrealistic or false in just any way. The falsehood or unrealism has to be qualified.

If we cannot show that the mechanisms or causes we isolate and handle in our models are stable, in the sense that when we export them from our models to our target systems they do not change from one situation to another, then they only hold under ceteris paribus conditions and are a fortiori of limited value for our understanding, explanation and prediction of our real-world target system. No matter how many convoluted refinements of concepts are made in the model, if the “successive approximations” do not result in models similar to reality in the appropriate respects (such as structure, isomorphism etc.), the surrogate system becomes a substitute system that does not bridge to the world but rather misses its target.

So, I have to conclude that constructing “minimal economic models” — or using microfounded macroeconomic models as “stylized facts” or “stylized pictures” somehow “successively approximating” macroeconomic reality — is a rather unimpressive attempt at legitimizing the use of ‘small-world’ models and fictitious idealizations for reasons that have more to do with mathematical tractability than with a genuine interest in understanding and explaining features of real economies.

As noted by Leontief, there is no reason to suspend this harsh judgment when facing econometrics. When it comes to econometric modelling one could, of course, choose to treat observational or experimental data as random samples from real populations. I have no problem with that (although it has to be noted that most ‘natural experiments’ are not based on random sampling from some underlying population, which means that the effect estimators, strictly speaking, are only unbiased for the specific samples studied). But econometrics does not content itself with that kind of population. Instead, it creates imaginary populations of ‘parallel universes’ and assumes that our data are random samples from such ‘infinite superpopulations.’ This is actually nothing but hand-waving! And it is inadequate for real science. As David Freedman writes:

With this approach, the investigator does not explicitly define a population that could in principle be studied, with unlimited resources of time and money. The investigator merely assumes that such a population exists in some ill-defined sense. And there is a further assumption, that the data set being analyzed can be treated as if it were based on a random sample from the assumed population. These are convenient fictions … Nevertheless, reliance on imaginary populations is widespread. Indeed regression models are commonly used to analyze convenience samples … The rhetoric of imaginary populations is seductive because it seems to free the investigator from the necessity of understanding how data were generated.

School choice and segregation

17 Jan, 2021 at 23:09 | Posted in Economics | Comments Off on School choice and segregation

We examine how school segregation would, hypothetically, have developed if all pupils had attended the nearest municipal school and had no possibility of choosing a school. The black line in the figure shows this development. In the early 1990s almost all pupils attended the nearest school, so the difference between actual school segregation (blue line) and hypothetical school segregation (black line) is not that large. The development of the black line shows that school segregation would have increased considerably even without the possibility of choosing a school, an increase attributable to increasingly segregated housing. The difference between the blue and the black line shows the contribution of school choice to school segregation, that is, the part of school segregation that is due to pupils not attending the nearest municipal school. The figure shows that slightly more than a quarter of the increase in school segregation can be attributed to school choice. Even though increased residential segregation has mattered most for the increase in school segregation, the contribution of school choice is thus not marginal and must therefore be taken seriously.

Helena Holmlund & Anna Sjögren & Björn Öckert

Ulf Kristersson is absolutely right

15 Jan, 2021 at 22:20 | Posted in Politics & Society | 2 Comments

The Moderates today demand that the government increase the pressure on staff working in health care or with risk groups to get vaccinated against covid-19. Those who refrain from vaccination should consider changing jobs or be reassigned, the Moderates write in a press release.

One seldom has reason to agree with the Moderate party leader, but here, for once, he is absolutely right!

Fooled by randomness

13 Jan, 2021 at 18:08 | Posted in Statistics & Econometrics | 6 Comments

A non-trivial part of teaching statistics to social science students is made up of teaching them to perform significance testing. A problem yours truly has noticed repeatedly over the years, however, is that no matter how careful you try to be in explicating what the probabilities generated by these statistical tests — p-values — really are, still most students misinterpret them.

A couple of years ago I gave a statistics course for the Swedish National Research School in History, and at the exam I asked the students to explain how one should correctly interpret p-values. Although the correct definition is p(data|null hypothesis), a majority of the students misinterpreted the p-value either as the likelihood of a sampling error (which is wrong, since the very computation of the p-value is based on the assumption that sampling errors are what cause the sample statistics not to coincide with the null hypothesis) or as the probability of the null hypothesis being true given the data (which is also wrong, since that is p(null hypothesis|data) rather than the correct p(data|null hypothesis)).

This is not to be blamed on students’ ignorance, but rather on significance testing not being particularly transparent (conditional probability inference is difficult even for those of us who teach and practice it). A lot of researchers fall prey to the same mistakes. So, given that it is anyway very unlikely that any population parameter is exactly zero, and that, contrary to assumption, most samples in social science and economics are neither random nor of the right distributional shape, why continue to press students and researchers to do null hypothesis significance testing, testing that relies on weird backward logic that students and researchers usually don’t understand?
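To make the distinction concrete, here is a minimal simulation sketch. All the numbers are assumptions chosen purely for illustration: 10,000 studies, a null hypothesis that is true in 90% of them, an effect of 0.5 standard deviations when it is false, 30 observations per study, and a two-sided test at the 5% level. It shows how p(data|null hypothesis) and p(null hypothesis|data) can come apart:

```python
import numpy as np

rng = np.random.default_rng(42)
n_sims, n, effect = 10_000, 30, 0.5

# In 90% of the simulated studies the null hypothesis (mean = 0) is true;
# in the rest the true mean is 0.5 (all numbers are illustrative assumptions).
null_true = rng.random(n_sims) < 0.9
true_mean = np.where(null_true, 0.0, effect)

# Each study: n observations with known sd = 1, so the z-statistic is
# the sample mean divided by its standard error 1/sqrt(n).
samples = rng.normal(true_mean[:, None], 1.0, size=(n_sims, n))
z = samples.mean(axis=1) * np.sqrt(n)
significant = np.abs(z) > 1.96        # 'p < 0.05', two-sided

# p(data | null): among true-null studies, about 5% come out 'significant' ...
print(significant[null_true].mean())
# ... but p(null | data): among 'significant' studies, the share where the
# null is actually true is several times larger than 5%.
print(null_true[significant].mean())
```

In this sketch a sizeable share of the ‘significant’ results come from studies where the null hypothesis is in fact true, even though every single test was run at the 5% level. The exact share depends entirely on the assumed prior and power, which is precisely the information the p-value alone does not contain.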

Let me just give a simple example to illustrate how slippery it is to deal with p-values – and how easy it is to impute causality to things that really are nothing but chance occurrences.

Say you have collected cross-country data on austerity policies and growth (and let’s assume that you have been able to “control” for possible confounders). You find that countries that have implemented austerity policies have on average increased their growth by, say, 2% more than the other countries. To really feel sure about the efficacy of the austerity policies you run a significance test — thereby actually assuming without argument that all the values you have come from the same probability distribution — and you get a p-value of less than 0.05. Eureka! You’ve got a statistically significant value. The probability is less than 1/20 that you got this value out of pure stochastic randomness.

But wait a minute. There is — as you may have guessed — a snag. If you test austerity policies in enough countries, you will get a statistically ‘significant’ result out of pure chance about 5% of the time. So, really, there is nothing to get so excited about!
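The snag can be illustrated with a short simulation. The setup is invented for the purpose: 5,000 hypothetical country comparisons in which the true effect of austerity on growth is exactly zero everywhere:

```python
import numpy as np

rng = np.random.default_rng(7)
n_tests, n = 5_000, 50   # hypothetical comparisons, observations per comparison

# Growth differences are pure noise: the true austerity effect is zero.
growth_diffs = rng.normal(0.0, 1.0, size=(n_tests, n))

# Standard t-statistic for each comparison.
t = growth_diffs.mean(axis=1) / (growth_diffs.std(axis=1, ddof=1) / np.sqrt(n))
significant = np.abs(t) > 1.96        # roughly 'p < 0.05', two-sided

# With no real effect anywhere, about 5% of the tests still come out
# 'statistically significant' by pure chance.
print(significant.mean())
```

Run enough tests and chance alone reliably delivers a steady stream of ‘findings.’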

Statistical significance doesn’t say that something is important or true. And since there already is far better and more relevant testing that can be done (see e.g. here and here), it is high time to give up on this statistical fetish and stop being fooled by randomness.

My favourite French teacher

12 Jan, 2021 at 12:16 | Posted in Varia | 3 Comments


An MMT perspective on money and taxes

11 Jan, 2021 at 18:14 | Posted in Economics | 3 Comments

If states don’t need their citizens’ money at all, why do we pay taxes in the first place?

Stephanie Kelton: Suppose the American state were to abolish all taxes without at the same time cutting its spending. If I no longer have to pay any taxes I can of course spend more money; the problem is just that the economy and its total workforce only have a limited amount of extra goods and services to offer. Sooner or later capacity is exhausted, and then supply can no longer keep pace with growing demand. At that point goods and services become more expensive and my dollars lose value. If, in such a situation, the state does not take on the role of tax collector and suck the money out of the economy again, inflationary pressure rises.

So taxes have an inflation-dampening function?

Stephanie Kelton: Yes. In Modern Monetary Theory, inflation is always at the centre. The reason I fight against artificial constraints, such as budget discipline, is that I instead want to be able to focus on what really constrains our budget: the risk of inflation. I have worked for the budget committee of the American Senate, and during my whole time there I did not hear a single senator, or any of the senators’ staff, so much as utter the word ‘inflation.’ They don’t think about it at all. But you have to if you are throwing such astronomical sums around!

Does one really have to think that much about inflation? Surely a little common sense is enough to see that a currency loses value if you print more money?

Stephanie Kelton: It is more complicated than that. The relationship between money creation and inflation is far from as clear-cut as many believe. First, you can increase the money supply without it leading to inflation, a phenomenon many central banks have experienced lately. Regardless of whether we look at the eurozone, the US, or Japan: in all these places they have for several years been trying to reach the official inflation target of just under two per cent, without succeeding. Why this is so is unclear. And second, inflation can shoot up without a growing money supply being identifiable as the sole cause. A well-known example is the oil crisis of the 1970s, which led to rising prices. This is called cost-push inflation. Quite generally, we economists ought to show considerably more humility. Our knowledge of different types of inflation and their causes is still far too deficient. The level of scientific knowledge in this area is embarrassingly low.

Flamman / Die Zeit

January 6, 2021 — a day that will live forever in infamy in American history

11 Jan, 2021 at 17:01 | Posted in Politics & Society | 2 Comments

One of those who took part in the forced entry into the Capitol on January 6, here sitting in House Speaker Nancy Pelosi’s office.

Interview with Stephanie Kelton

11 Jan, 2021 at 11:50 | Posted in Economics | 6 Comments

Kelton: The notion that states only have a limited amount of money at their disposal comes from a time when the currency in most countries was, in one form or another, pegged to precious metals such as gold or silver. Today that is no longer the case. Money is simply printed, or more precisely: created in a computer. It can be multiplied at will.

ZEIT: That sounds as if you were telling a child: sweets don’t make you fat. Take as many as you want!

Kelton: No, no! There is a limit to government spending. But that limit is not determined by the level of debt, but by the rate of inflation.

ZEIT: What exactly do you mean by that? In Germany, when we think of inflation we normally think of mass unemployment and the state losing control.

Kelton: I mean something else. Inflation is also a side effect of economic activity. To stay with the image: it arises when the restaurants are no longer half empty but full, and people are crowding the shops. Because then, at some point, labour becomes scarce. The consequence: restaurant employees can push through higher wages, and restaurant owners raise their prices. In such a situation it would be wrong for the state to further stimulate the economy through public spending, because then it would overheat. But when many people are out of work, resources lie idle that the state can put to use. In most industrialized countries that is exactly the case at the moment.

Die Zeit

Big data truthiness

9 Jan, 2021 at 23:10 | Posted in Statistics & Econometrics | Comments Off on Big data truthiness

All of these examples exhibit the confusion that often accompanies the drawing of causal conclusions from observational data. The likelihood of such confusion is not diminished by increasing the amount of data, although the publicity given to ‘big data’ would have us believe so. Obviously the flawed causal connection between drowning and eating ice cream does not diminish if we increase the number of cases from a few dozen to a few million. The amateur carpenter’s complaint that ‘this board is too short, and even though I’ve cut it four more times, it is still too short,’ seems eerily appropriate.

Howard Wainer

Garbage-can econometrics

7 Jan, 2021 at 22:23 | Posted in Economics | 2 Comments

When no formal theory is available, as is often the case, then the analyst needs to justify statistical specifications by showing that they fit the data. That means more than just “running things.” It means careful graphical and crosstabular analysis …

When I present this argument … one or more scholars say, “But shouldn’t I control for everything I can? If not, aren’t my regression coefficients biased due to excluded variables?” But this argument is not as persuasive as it may seem initially.

First of all, if what you are doing is mis-specified already, then adding or excluding other variables has no tendency to make things consistently better or worse. The excluded variable argument only works if you are sure your specification is precisely correct with all variables included. But no one can know that with more than a handful of explanatory variables. 

Still more importantly, big, mushy regression and probit equations seem to need a great many control variables precisely because they are jamming together all sorts of observations that do not belong together. Countries, wars, religious preferences, education levels, and other variables that change people’s coefficients are “controlled” with dummy variables that are completely inadequate to modeling their effects. The result is a long list of independent variables, a jumbled bag of nearly unrelated observations, and often, a hopelessly bad specification with meaningless (but statistically significant with several asterisks!) results.

Christopher H. Achen

This article is one of my absolute favourites. Why? Because it reaffirms yours truly’s view that since there is no absolutely certain knowledge at hand in social sciences — including economics — explicit argumentation and justification ought to play an extremely important role if purported knowledge claims are to be sustainably warranted. As Achen puts it — without careful supporting arguments, “just dropping variables into SPSS, STATA, S or R programs accomplishes nothing.”
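Achen’s point about inadequate dummy-variable ‘controls’ can be illustrated with a small simulation (the numbers are made up). The effect of x on y genuinely differs across two groups, and ‘controlling’ for group membership with a dummy does nothing to rescue the pooled coefficient, which ends up a variance-weighted mush that is true of neither group:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

# Two groups in which x has a genuinely different effect on y
# (slopes 1 and 5), and a different spread of x as well.
group = rng.integers(0, 2, n)
x = rng.normal(size=n) * (1 + group)
slope = np.where(group == 0, 1.0, 5.0)
y = slope * x + rng.normal(size=n)

# 'Controlling' for group membership with a dummy variable.
X = np.column_stack([np.ones(n), x, group])
b = np.linalg.lstsq(X, y, rcond=None)[0]

# The pooled coefficient on x lands between the two true slopes,
# describing neither group correctly.
print(b[1])
```

The estimate comes out near 4.2 here, weighted towards the high-variance group; it is statistically precise and substantively meaningless, exactly the kind of result Achen warns about.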

Econometrics and the challenge of regression specification

7 Jan, 2021 at 14:28 | Posted in Statistics & Econometrics | 1 Comment

Most work in econometrics and regression analysis is — still — done on the assumption that the researcher has a theoretical model that is ‘true.’ Based on this belief of having a correct specification for an econometric model or running a regression, one proceeds as if the only problems remaining to solve have to do with measurement and observation.

When things sound too good to be true, they usually aren’t. And that goes for econometric wet dreams too. The snag is, of course, that there is precious little to support the perfect specification assumption. Looking around in social science and economics we don’t find a single regression or econometric model that lives up to the standards set by the ‘true’ theoretical model — and there is precious little that gives us reason to believe things will be different in the future.

To think that we are able to construct a model where all relevant variables are included and the functional relationships between them are correctly specified is not only a belief without support, but a belief impossible to support.

The theories we work with when building our econometric regression models are insufficient. No matter what we study, there are always some variables missing, and we don’t know the correct way to functionally specify the relationships between the variables.

Every regression model constructed is misspecified. There is always an endless list of possible variables to include, and endless possible ways to specify the relationships between them. So every applied econometrician comes up with his own specification and ‘parameter’ estimates. The econometric Holy Grail of consistent and stable parameter values is nothing but a dream.

In order to draw inferences from data as described by econometric texts, it is necessary to make whimsical assumptions. The professional audience consequently and properly withholds belief until an inference is shown to be adequately insensitive to the choice of assumptions. The haphazard way we individually and collectively study the fragility of inferences leaves most of us unconvinced that any inference is believable. If we are to make effective use of our scarce data resource, it is therefore important that we study fragility in a much more systematic way. If it turns out that almost all inferences from economic data are fragile, I suppose we shall have to revert to our old methods …

Ed Leamer

A rigorous application of econometric methods in economics really presupposes that the phenomena of our real-world economies are ruled by stable causal relations between variables. Parameter values estimated in specific spatio-temporal contexts are presupposed to be exportable to totally different contexts. To warrant this assumption, however, one has to convincingly establish that the targeted acting causes are stable and invariant, so that they maintain their parametric status after the bridging. The endemic lack of predictive success of the econometric project indicates that this hope of finding fixed parameters is a hope for which there really is no other ground than hope itself.
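What studying fragility systematically could look like can be sketched in the spirit of Leamer’s extreme-bounds analysis. In this invented example the true coefficient on x is 1, and we report the range of estimates obtained across every possible subset of three candidate control variables:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(3)
n = 5_000

# Three candidate controls that affect both x and y (invented coefficients).
controls = rng.normal(size=(n, 3))
x = controls @ np.array([0.8, -0.5, 0.3]) + rng.normal(size=n)
y = 1.0 * x + controls @ np.array([1.0, 1.0, -2.0]) + rng.normal(size=n)

def coef_on_x(included):
    """OLS coefficient on x with a given subset of controls included."""
    Z = np.column_stack([np.ones(n), x] + [controls[:, j] for j in included])
    return np.linalg.lstsq(Z, y, rcond=None)[0][1]

estimates = [coef_on_x(s) for r in range(4) for s in combinations(range(3), r)]

# The 'extreme bounds': the estimate swings widely around the true value 1
# depending on which controls happen to be included.
print(min(estimates), max(estimates))
```

Only the specification with all three controls recovers the true value; every other ‘reasonable’ specification delivers a confidently estimated wrong number, which is Leamer’s fragility in miniature.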

Overconfident economists

7 Jan, 2021 at 09:11 | Posted in Economics | 1 Comment

Worst of all, when we feel pumped up with our progress, a tectonic shift can occur, like the Panic of 2008, making it seem as though our long journey has left us disappointingly close to the State of Complete Ignorance whence we began …

It often takes years down the Path, but sooner or later, someone articulates the concerns that gnaw away in each of us and asks if the Assumptions are valid …

It would be much healthier for all of us if we could accept our fate, recognize that perfect knowledge will be forever beyond our reach and find happiness with what we have …

Can we economists agree that it is extremely hard work to squeeze truths from our data sets and what we genuinely understand will remain uncomfortably limited? We need words in our methodological vocabulary to express the limits … Those who think otherwise should be required to wear a scarlet-letter O around their necks, for “overconfidence.”

Ed Leamer

Many economists regularly pretend to know more than they do. Often this is a conscious strategy to promote their authority in politics and among policy makers. When economists present their models, it should be mandatory that the models carry warning labels alerting readers to the limited real-world relevance of models built on assumptions known to be absurdly unreal.

Economics may be an informative tool for research. But if its practitioners do not investigate and make an effort to provide a justification for the credibility of the assumptions on which they erect their building, it will not fulfil its task. There is a gap between its aspirations and its accomplishments, and without more supportive evidence to substantiate its claims, critics like yours truly will continue to consider its ultimate arguments a mixture of rather unhelpful metaphors and metaphysics.

Nowadays it has almost become a self-evident truism among economists that you cannot expect people to take your arguments seriously unless they are based on or backed up by advanced econometric modelling. So legions of mathematical-statistical theorems are proved, and heaps of fiction are produced, masquerading as science. Yet the rigour of the econometric modelling is frequently not matched by any empirical support for the far-reaching assumptions it is built on. This is a dire warning of the need to change the direction of economics.

How to learn languages

6 Jan, 2021 at 18:28 | Posted in Varia | Comments Off on How to learn languages


Master Class

5 Jan, 2021 at 17:56 | Posted in Varia | Comments Off on Master Class


Bille August’s and Ingmar Bergman’s masterpiece.

With breathtakingly beautiful music by Stefan Nilsson   

And it breaks my heart every time I watch it.
