Variable selection — not about having a ‘good fit’

17 Sep, 2015 at 11:08 | Posted in Statistics & Econometrics | Comments Off on Variable selection — not about having a ‘good fit’

Which independent variables should be included in the equation? The goal is a “good fit” … How can a good fit be recognized? A popular measure for the satisfactoriness of a regression is the coefficient of determination, R². If this number is large, it is said, the regression gives a good fit …

Nothing about R² supports these claims. This statistic is best regarded as characterizing the geometric shape of the regression points and not much more.

The central difficulty with R² for social scientists is that the independent variables are not subject to experimental manipulation. In some samples, they vary widely, producing large variance; in other cases, the observations are more tightly grouped and there is little dispersion. The variances are a function of the sample, not of the underlying relationship. Hence they cannot have any real connection to the “strength” of the relationship as social scientists ordinarily use the term, i.e., as a measure of how much effect a given change in independent variable has on the dependent variable …

Thus “maximizing R²” cannot be a reasonable procedure for arriving at a strong relationship. It neither measures causal power nor is comparable across samples … “Explaining variance” is not what social science is about.

Christopher Achen
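
Achen's point that R² reflects the dispersion of the sample rather than the strength of the relationship can be illustrated with a small simulation. What follows is only a sketch under stated assumptions: the true slope (2.0), the noise level, the two spreads of x and the helper names `ols_slope_and_r2` and `simulate` are all invented for illustration and do not come from the quoted text.

```python
import random


def ols_slope_and_r2(xs, ys):
    """Return the OLS slope and R^2 of a simple regression of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    r2 = sxy ** 2 / (sxx * syy)  # squared correlation = R^2 with one regressor
    return slope, r2


def simulate(x_sd, n=2000, true_slope=2.0, noise_sd=1.0, seed=0):
    """Hypothetical data: same causal effect, different spread of x."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, x_sd) for _ in range(n)]
    ys = [true_slope * x + rng.gauss(0.0, noise_sd) for x in xs]
    return ols_slope_and_r2(xs, ys)


# Same underlying relationship, observed in a tightly grouped sample
# and in a widely dispersed sample.
slope_narrow, r2_narrow = simulate(x_sd=0.5)
slope_wide, r2_wide = simulate(x_sd=5.0)

print(f"narrow x: slope={slope_narrow:.2f}, R2={r2_narrow:.2f}")
print(f"wide x:   slope={slope_wide:.2f}, R2={r2_wide:.2f}")
```

In both samples the estimated effect of a unit change in x is about the same, while R² differs sharply because only the variance of x has changed.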

Should we ‘control for’ everything when running regressions? No way!

16 Sep, 2015 at 12:55 | Posted in Statistics & Econometrics | 1 Comment

When I present this argument … one or more scholars say, “But shouldn’t I control for everything I can in my regressions? If not, aren’t my coefficients biased due to excluded variables?” This argument is not as persuasive as it may seem initially. First of all, if what you are doing is misspecified already, then adding or excluding other variables has no tendency to make things consistently better or worse … The excluded variable argument only works if you are sure your specification is precisely correct with all variables included. But no one can know that with more than a handful of explanatory variables.

Still more importantly, big, mushy linear regression and probit equations seem to need a great many control variables precisely because they are jamming together all sorts of observations that do not belong together. Countries, wars, racial categories, religious preferences, education levels, and other variables that change people’s coefficients are “controlled” with dummy variables that are completely inadequate to modeling their effects. The result is a long list of independent variables, a jumbled bag of nearly unrelated observations, and often a hopelessly bad specification with meaningless (but statistically significant with several asterisks!) results.

A preferable approach is to separate the observations into meaningful subsets—internally compatible statistical regimes … If this can’t be done, then statistical analysis can’t be done. A researcher claiming that nothing else but the big, messy regression is possible because, after all, some results have to be produced, is like a jury that says, “Well, the evidence was weak, but somebody had to be convicted.”

Christopher H. Achen

How economists argue

15 Sep, 2015 at 17:17 | Posted in Economics | 2 Comments

To a mainstream economist, theory means model, and model means ideas expressed in mathematical form. In learning how to “think like an economist,” students learn certain critical concepts and models, ideas which typically are taught initially through simple mathematical analyses. These models, students learn, are theory. In more advanced courses, economic theories are presented in more mathematically elaborate models. Mainstream economists believe proper models – good models – take a recognizable form: presentation in equations, with mathematically expressed definitions, assumptions, and theoretical developments clearly laid out. Students also learn how economists argue. They learn that the legitimate way to argue is with models and econometrically constructed forms of evidence …

Because all models are incomplete, students also learn that no model is perfect. Indeed, students learn that it is bad manners to engage in excessive questioning of simplifying assumptions. Claiming that a model is deficient is a minor feat – presumably anyone can do that. What is really valued is coming up with a better model, a better theory. And so, goes the accumulated wisdom of properly taught economists, those who criticize without coming up with better models are only pedestrian snipers.

Sherlock Holmes pays a visit to the world of higher education

15 Sep, 2015 at 15:09 | Posted in Varia | Comments Off on Sherlock Holmes pays a visit to the world of higher education

At one of the country’s largest university colleges

one could yesterday read on its website that

“A good study environment is positive for study results.”


Who could have guessed …

Simpson’s paradox and perspectival realism

15 Sep, 2015 at 13:23 | Posted in Economics | 1 Comment

Which causal relationships we see depend on which model we use and its conceptual/causal articulation; which model is best depends on our purposes and pragmatic interests.

Take the case of Simpson’s paradox, which can be described as the situation in which conditional probabilities (often related to causal relations) are opposite for subpopulations than for the whole population. Let academic salaries be higher for economists than for sociologists, and let salaries within each group be higher for women than for men. But let there be twice as many men as women in economics and twice as many women as men in sociology. By construction, the average salary of women is higher than that for men in each group; yet, for the right values of the different salaries, women are paid less on average, taking both groups together. [Example: Economics — 2 men earn 100$, 1 woman 101$; Sociology — 1 man earns 90$, 2 women 91$. Average female earning: (101 + 2×91)/3 = 94.3; Average male earning: (2×100 + 90)/3 = 96.7 — LPS]

An aggregate model leads to the conclusion that being female causes a lower salary. We might feel an uneasiness with such a model, since I have already filled in the details that show more precisely why the result comes about. The temptation is to say that the aggregate model shows that being female apparently causes lower salaries; but the more refined description of a disaggregated model shows that really being female causes higher salaries. A true paradox, however, is not a contradiction, but a seeming contradiction. Another way to look at it is to say that the aggregate model is really true at that level of aggregation and is useful for policy, and that the equally true, more disaggregated model gives an explanation of the mechanism behind the true aggregate model.

It is not wrong to take an aggregate perspective and to say that being female causes a lower salary. We may not have access to the refined description. Even if we do, we may as a matter of policy think (a) that the choice of field is not susceptible to useful policy intervention, and (b) that our goal is to equalize income by sex and not to enforce equality of rates of pay. That we may not believe the factual claim of (a) nor subscribe to the normative end of (b) is immaterial. The point is that they mark out a perspective in which the aggregate model suits both our purposes and the facts: it tells the truth as seen from a particular perspective.

Kevin Hoover
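
The salary example in the quotation above can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the numbers given in the bracketed example; the data layout and the helper name `average` are illustrative choices, not part of the source.

```python
# Salaries from the bracketed example: (field, sex) -> list of salaries.
salaries = {
    ("economics", "men"): [100, 100],
    ("economics", "women"): [101],
    ("sociology", "men"): [90],
    ("sociology", "women"): [91, 91],
}


def average(values):
    return sum(values) / len(values)


# Disaggregated view: compare women and men within each field.
women_ahead_within = {
    field: average(salaries[(field, "women")]) > average(salaries[(field, "men")])
    for field in ("economics", "sociology")
}

# Aggregate view: pool both fields and compare women and men overall.
all_women = salaries[("economics", "women")] + salaries[("sociology", "women")]
all_men = salaries[("economics", "men")] + salaries[("sociology", "men")]
avg_women = average(all_women)
avg_men = average(all_men)

print(women_ahead_within)
print(f"women overall: {avg_women:.2f}, men overall: {avg_men:.2f}")
```

Women earn more than men within each field, yet less when the two fields are pooled, because most women are in the lower-paid field and most men in the higher-paid one.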

What is a good model?

14 Sep, 2015 at 20:42 | Posted in Theory of Science & Methodology | 1 Comment

Whereas increasing the difference between a model and its target system may have the advantage that the model becomes easier to study, studying a model is ultimately aimed at learning something about the target system. Therefore, additional approximations come with the cost of making the correspondence between model and target system less straightforward. Ultimately, this makes the interpretation of results on the model in terms of the target system more problematic. We should keep in mind the advice of Whitehead: “Seek simplicity and distrust it.”

A ‘good model’ is to be understood as a model that achieves an equilibrium between being useful and not being too wrong. The usefulness of a model is clearly context-dependent; it may involve a combination of desired features such as being understandable (for students, researchers, or others), achieving computational tractability, and other criteria. ‘Not being too wrong’ is to be understood as ‘not being too different from reality’.

Sylvia Wenmackers & Danny Vanpoucke

An interesting article underlining the fact that all empirical sciences use simplifying or unrealistic assumptions in their modeling activities, and that that is not the issue – as long as the assumptions made are not unrealistic in the wrong way or for the wrong reasons.

Theories are difficult to directly confront with reality. Economists therefore build models of their theories. Those models are representations that are directly examined and manipulated to indirectly say something about the target systems.

But models do not only face theory. They also have to look to the world. Being able to model a “credible world,” a world that somehow could be considered real or similar to the real world, is not the same as investigating the real world. Even though all theories are false, since they simplify, they may still possibly serve our pursuit of truth. But then they cannot be unrealistic or false in just any way. The falsehood or unrealisticness has to be qualified.

Some of the standard assumptions made in neoclassical economic theory – on rationality, information handling and types of uncertainty – are not possible to make more realistic by “de-idealization” or “successive approximations” without altering the theory and its models fundamentally.

If we cannot show that the mechanisms or causes we isolate and handle in our models are stable, in the sense that when we export them from our models to our target systems they do not change from one situation to another, then they only hold under ceteris paribus conditions and a fortiori are of limited value for our understanding, explanation and prediction of our real world target system.

No matter how many convoluted refinements of concepts are made in the model, if the “successive approximations” do not result in models similar to reality in the appropriate respects (such as structure, isomorphism etc.), the surrogate system becomes a substitute system that does not bridge to the world but rather misses its target.

Sir David Hendry on the inadequacies of DSGE models

13 Sep, 2015 at 19:31 | Posted in Economics | Comments Off on Sir David Hendry on the inadequacies of DSGE models

In most aspects of their lives humans must plan forwards. They take decisions today that affect their future in complex interactions with the decisions of others. When taking such decisions, the available information is only ever a subset of the universe of past and present information, as no individual or group of individuals can be aware of all the relevant information. Hence, views or expectations about the future, relevant for their decisions, use a partial information set, formally expressed as a conditional expectation given the available information.

Moreover, all such views are predicated on there being no un-anticipated future changes in the environment pertinent to the decision. This is formally captured in the concept of ‘stationarity’. Without stationarity, good outcomes based on conditional expectations could not be achieved consistently. Fortunately, there are periods of stability when insights into the way that past events unfolded can assist in planning for the future.

The world, however, is far from completely stationary. Unanticipated events occur, and they cannot be dealt with using standard data-transformation techniques such as differencing, or by taking linear combinations, or ratios. In particular, ‘extrinsic unpredictability’ – unpredicted shifts of the distributions of economic variables at unanticipated times – is common. As we shall illustrate, extrinsic unpredictability has dramatic consequences for the standard macroeconomic forecasting models used by governments around the world – models known as ‘dynamic stochastic general equilibrium’ models – or DSGE models …

Many of the theoretical equations in DSGE models take a form in which a variable today, say incomes (denoted as y_t) depends inter alia on its ‘expected future value’ … For example, y_t may be the log-difference between a de-trended level and its steady-state value. Implicitly, such a formulation assumes some form of stationarity is achieved by de-trending.

Unfortunately, in most economies, the underlying distributions can shift unexpectedly. This vitiates any assumption of stationarity. The consequences for DSGEs are profound. As we explain below, the mathematical basis of a DSGE model fails when distributions shift … This would be like a fire station automatically burning down at every outbreak of a fire. Economic agents are affected by, and notice, such shifts. They consequently change their plans, and perhaps the way they form their expectations. When they do so, they violate the key assumptions on which DSGEs are built.

David Hendry & Grayham Mizon
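
The consequence of an unanticipated shift in a distribution can be sketched with a toy forecast. This is not the authors' model, only an illustration under invented assumptions: a series with mean 0 whose mean jumps to 5 at period 200, unit-variance noise, and a forecaster who uses the average of the first 150 observations as the expectation of every later value. All names and numbers are hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


rng = random.Random(42)
SHIFT_AT = 200     # hypothetical date of the unanticipated location shift
SHIFT_SIZE = 5.0   # hypothetical size of the shift in the mean

# Stable mean of 0 before the shift, mean of SHIFT_SIZE afterwards.
series = [
    (SHIFT_SIZE if t >= SHIFT_AT else 0.0) + rng.gauss(0.0, 1.0)
    for t in range(250)
]

# Expectation formed from past data only, under an assumption of stationarity.
forecast = mean(series[:150])

errors_stable = [y - forecast for y in series[150:SHIFT_AT]]
errors_shifted = [y - forecast for y in series[SHIFT_AT:]]
bias_stable = mean(errors_stable)
bias_shifted = mean(errors_shifted)

print(f"mean forecast error before the shift: {bias_stable:.2f}")
print(f"mean forecast error after the shift:  {bias_shifted:.2f}")
```

While the distribution is stable, the forecast errors average out close to zero; after the shift they are systematically off by roughly the size of the shift, and no amount of pre-shift data would have corrected this.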

A great article, confirming much of Keynes’s critique of econometrics and underlining that to understand real world “non-routine” decisions and unforeseeable changes in behaviour, stationary probability distributions are of no avail. In a world full of genuine uncertainty – where real historical time rules the roost – the probabilities that ruled the past are not those that will rule the future.

When we cannot accept that the observations, along the time-series available to us, are independent … we have, in strict logic, no more than one observation, all of the separate items having to be taken together. For the analysis of that the probability calculus is useless; it does not apply … I am bold enough to conclude, from these considerations that the usefulness of ‘statistical’ or ‘stochastic’ methods in economics is a good deal less than is now conventionally supposed … We should always ask ourselves, before we apply them, whether they are appropriate to the problem in hand. Very often they are not … The probability calculus is no excuse for forgetfulness.

John Hicks

Time is what prevents everything from happening at once. To simply assume that economic processes are stationary is not a sensible way for dealing with the kind of genuine uncertainty that permeates open systems such as economies.

Econometrics is basically a deductive method. Given the assumptions (such as manipulability, transitivity, Reichenbach probability principles, separability, additivity, linearity etc.) it delivers deductive inferences. The problem, of course, is that we will never completely know when the assumptions are right. Real target systems are seldom epistemically isomorphic to axiomatic-deductive models/systems, and even if they were, we still have to argue for the external validity of the conclusions reached from within these epistemically convenient models/systems. Causal evidence generated by statistical/econometric procedures may be valid in “closed” models, but what we usually are interested in is causal evidence in the real target system we happen to live in.

Advocates of econometrics want to have deductively automated answers to fundamental causal questions. But to apply “thin” methods we have to have “thick” background knowledge of what’s going on in the real world, and not in idealized models. Conclusions can only be as certain as their premises – and that also applies to the quest for causality and forecasting predictability in econometrics.

And the waltz goes on

13 Sep, 2015 at 11:25 | Posted in Varia | Comments Off on And the waltz goes on


(h/t Jeanette Meyer)
Absolutely fabulous!!

I’d pluck a fair rose for my love (private)

12 Sep, 2015 at 22:45 | Posted in Varia | Comments Off on I’d pluck a fair rose for my love (private)


Model selection and the reference class problem (wonkish)

12 Sep, 2015 at 16:20 | Posted in Theory of Science & Methodology | Comments Off on Model selection and the reference class problem (wonkish)

The reference class problem arises when we want to assign a probability to a single proposition, X, which may be classified in various ways, yet its probability can change depending on how it is classified. (X may correspond to a sentence, or event, or an individual’s instantiating a given property, or the outcome of a random experiment, or a set of possible worlds, or some other bearer of probability.) X may be classified as belonging to set S1, or to set S2, and so on. Qua member of S1, its probability is p1; qua member of S2, its probability is p2, where p1 ≠ p2; and so on. And perhaps qua member of some other set, its probability does not exist at all …

Now, the bad news. Giving primacy to conditional probabilities does not so much rid us of the epistemological reference class problem as give us another way of stating it. Which of the many conditional probabilities should guide us, should underpin our inductive reasonings and decisions? Our friend John Smith is still pondering his prospects of living at least eleven more years as he contemplates buying life insurance. It will not help him much to tell him of the many conditional probabilities that apply to him, each relativized to a different reference class: “conditional on your being an Englishman, your probability of living to 60 is x; conditional on your being consumptive, it is y; …”. (By analogy, when John Smith is pondering how far away is London, it will not help him much to tell him of the many distances that there are, each relative to a different reference frame.) If probability is to serve as a guide to life, it should in principle be possible to designate one of these conditional probabilities as the right one. To be sure, we could single out one conditional probability among them, and insist that that is the one that should guide him. But that is tantamount to singling out one reference class of the many to which he belongs, and claiming that we have solved the original reference class problem. Life, unfortunately, is not that easy—and neither is our guide to life.

Alan Hájek
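
How the probability assigned to one and the same individual changes with the reference class can be made concrete with a toy table. The counts below are entirely invented for illustration (they are not mortality data and do not come from Hájek); the helper `survival_probability` is likewise hypothetical.

```python
from fractions import Fraction

# Invented toy population: (english, consumptive) -> (people, survivors to 60).
population = {
    (True, False): (60, 54),
    (True, True): (10, 3),
    (False, True): (10, 2),
    (False, False): (20, 16),
}


def survival_probability(in_class):
    """Relative frequency of survival within the chosen reference class."""
    people = sum(n for key, (n, _) in population.items() if in_class(*key))
    survivors = sum(s for key, (_, s) in population.items() if in_class(*key))
    return Fraction(survivors, people)


# John Smith is both English and consumptive, so all three classes apply to him.
p_english = survival_probability(lambda english, consumptive: english)
p_consumptive = survival_probability(lambda english, consumptive: consumptive)
p_both = survival_probability(lambda english, consumptive: english and consumptive)

print(p_english, p_consumptive, p_both)
```

Each figure is a correct conditional probability for the toy population, and nothing in the arithmetic says which of them should guide John Smith's decision.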

When choosing which models to use in our analyses, we cannot get around the fact that the evaluation of our hypotheses, explanations, and predictions cannot be made without reference to a specific statistical model or framework. What Hájek so eloquently points at is that the probabilistic-statistical inferences we make from our samples decisively depend on what population we choose to refer to. The reference class problem shows that there usually are many such populations to choose from, and that the one we choose decides which probabilities we come up with and a fortiori which predictions we make. Not consciously contemplating the relativity effects this choice of “nomological-statistical machines” has is probably one of the reasons economists have a false sense of the amount of uncertainty that really afflicts their models.

Arvo Pärt

12 Sep, 2015 at 14:01 | Posted in Varia | 1 Comment



The world’s greatest composer of contemporary classical music, Arvo Pärt, was 80 yesterday.

A day without listening to his music would be unimaginable.

The ‘bad luck’ theory of unemployment

11 Sep, 2015 at 15:46 | Posted in Economics | 1 Comment

As is well-known, New Classical Economists have never accepted Keynes’s distinction between voluntary and involuntary unemployment. According to New Classical übereconomist Robert Lucas, an unemployed worker can always instantaneously find some job. No matter how miserable the work options are, “one can always choose to accept them,” according to Lucas:

KLAMER: My taxi driver here is driving a taxi, even though he is an accountant, because he can’t find a job …

LUCAS: I would describe him as a taxi driver [laughing], if what he is doing is driving a taxi.

KLAMER: But a frustrated taxi driver.

LUCAS: Well, we draw these things out of urns, and sometimes we get good draws, sometimes we get bad draws.

Arjo Klamer

In New Classical Economics unemployment is seen as a kind of leisure that workers optimally select.

This is, of course, only what you would expect of New Classical Chicago economists.

But sadly enough this extraterrestrial view of unemployment is actually shared by so-called New Keynesians, whose microfounded dynamic stochastic general equilibrium models cannot even incorporate such a basic fact of reality as involuntary unemployment!

Of course, working with microfounded representative agent models, this should come as no surprise. If one representative agent is employed, all representative agents are. The kind of unemployment that occurs is voluntary, since it is only adjustments of the hours of work that these optimizing agents make to maximize their utility.

In the basic DSGE models used by most ‘New Keynesians’, the labour market is always cleared – responding to a changing interest rate, expected life time incomes, or real wages, the representative agent maximizes the utility function by varying her labour supply, money holding and consumption over time. Most importantly – if the real wage somehow deviates from its “equilibrium value,” the representative agent adjusts her labour supply, so that when the real wage is higher than its “equilibrium value,” labour supply is increased, and when the real wage is below its “equilibrium value,” labour supply is decreased.

In this model world, unemployment is always an optimal response to changes in the labour market conditions. Hence, unemployment is totally voluntary. To be unemployed is something one optimally chooses to be.

It is extremely important to pose the question why mainstream economists choose to work with these kinds of models. It is not a harmless choice based solely on ‘internal’ scientific considerations. It is in fact also, and not to a trivial extent, a conscious choice motivated by ideology.

By employing these models one is actually to a significant degree absolving the structure of market economies from any responsibility in creating unemployment. Focussing on the choices of individuals, the unemployment ‘problem’ is reduced to being an individual ‘problem’, and not something that essentially has to do with the workings of market economies. A conscious methodological choice in this way comes to work as an apologetic device for not addressing or challenging given structures.

Not being able to explain unemployment, these models can’t help us to change the structures and institutions that produce the arguably greatest problem of our society.

Inequality and the poverty of atomistic reductionism

11 Sep, 2015 at 11:07 | Posted in Economics | 2 Comments

The essence of this critique of the market lies in insisting on the structural relations that hold among individuals. The classic conception of the market sees individuals atomistically and therefore maintains that an individual’s holding can be justified by looking only at that individual. This was the original appeal of the libertarian picture: that the validity of an agreement could be established by establishing A’s willingness, B’s willingness, and the fact that they are entitled to trade what they are trading. Justification could be carried out purely locally. But this is not the case … Whether or not A is being coerced into trading with B is a function, not just of the local properties of A and B, but of the overall distribution of holdings and the willingness of other traders to trade with A …

If what we are trying to explain is really a relational property, the process of explaining it individual by individual simply will not work. And most if not all of the interesting properties in social explanation are inherently relational: for example, the properties of being rich or poor, employed or unemployed …

For the liberal the problem of economic distribution is raised by a simple juxtaposition: some are poor while others are rich. These two states of affairs are compared, side by side, and then the utilitarian question of redistribution becomes relevant. We could say that the liberal critique of inequality is that some are poor while others are rich, but, by contrast, the radical critique is that some are poor because others are rich …

In weakly competitive situations individualistic explanations suffice, whereas they are inadequate to explain strongly competitive situations. If A defeats B in golf and the question arises Why did A win and B lose?, the answer is simply the logical sum of the two independent explanations of the score which A received and the score which B received. But if A defeats B in tennis there is no such thing as the independent explanations of why A defeated B on the one hand and why B lost to A on the other. There is only one, unified explanation of the outcome of the match …

If anything is clear it is that society is not weakly, but strongly competitive and the presence of strong competition ensures that there are internal relations among the individual destinies of the participants. Consequently, individualistic explanations will not suffice in such cases.

The limits of statistical inference

10 Sep, 2015 at 21:55 | Posted in Statistics & Econometrics, Theory of Science & Methodology | 1 Comment

Causality in social sciences — and economics — can never solely be a question of statistical inference. Causality entails more than predictability, and to really explain social phenomena in depth requires theory. Analysis of variation — the foundation of all econometrics — can never in itself reveal how these variations are brought about. Only when we are able to tie actions, processes or structures to the statistical relations detected can we say that we are getting at relevant explanations of causation.

Most facts have many different, possible, alternative explanations, but we want to find the best of all contrastive (since all real explanation takes place relative to a set of alternatives) explanations. So which is the best explanation? Many scientists, influenced by statistical reasoning, think that the likeliest explanation is the best explanation. But the likelihood of x is not in itself a strong argument for thinking it explains y. I would rather argue that what makes one explanation better than another are things like aiming for and finding powerful, deep, causal features and mechanisms that we have warranted and justified reasons to believe in. Statistical reasoning — especially the variety based on a Bayesian epistemology — generally has no room for these kinds of explanatory considerations. The only thing that matters is the probabilistic relation between evidence and hypothesis. That is also one of the main reasons I find abduction — inference to the best explanation — a better description and account of what constitutes actual scientific reasoning and inferences.

For more on these issues — see the chapter “Capturing causality in economics and the limits of statistical inference” in my On the use and misuse of theories and models in economics.

Critical realism and mathematics in economics

9 Sep, 2015 at 11:00 | Posted in Economics | 1 Comment

Interesting lecture, but I think just listening to what Tony Lawson or yours truly have to say, shows how unfounded and ridiculous is the idea that many mainstream economists have that because heterodox people often criticize the application of mathematics in mainstream economics, we are critical of math per se.


No, there is nothing wrong with mathematics per se.

No, there is nothing wrong with applying mathematics to economics.

Mathematics is one valuable tool among other valuable tools for understanding and explaining things in economics.

What is, however, totally wrong, are the utterly simplistic beliefs that

• “math is the only valid tool”

• “math is always and everywhere self-evidently applicable”

• “math is all that really counts”

• “if it’s not in math, it’s not really economics”

• “almost everything can be adequately understood and analyzed with math”

When it comes to the issue of mathematics in economics Roger Farmer also has some good advice well worth considering:

A common mistake amongst Ph.D. students is to place too much weight on the ability of mathematics to solve an economic problem. They take a model off the shelf and add a new twist. A model that began as an elegant piece of machinery designed to illustrate a particular economic issue, goes through five or six amendments from one paper to the next. By the time it reaches the n’th iteration it looks like a dog designed by committee.

Mathematics doesn’t solve economic problems. Economists solve economic problems. My advice: never formalize a problem with mathematics until you have already figured out the probable answer. Then write a model that formalizes your intuition and beat the mathematics into submission. That last part is where the fun begins because the language of mathematics forces you to make your intuition clear. Sometimes it turns out to be right. Sometimes you will realize your initial guess was mistaken. Always, it is a learning process.

And — of course — the always eminently quotable Keynes did also have some thoughts on the use of mathematics in economics …

But I am unfamiliar with the methods involved and it may be that my impression that nothing emerges at the end which has not been introduced expressly or tacitly at the beginning is quite wrong … It seems to me essential in an article of this sort to put in the fullest and most explicit manner at the beginning the assumptions which are made and the methods by which the price indexes are derived; and then to state at the end what substantially novel conclusions have been arrived at …


I cannot persuade myself that this sort of treatment of economic theory has anything significant to contribute. I suspect it of being nothing better than a contraption proceeding from premises which are not stated with precision to conclusions which have no clear application … [This creates] a mass of symbolism which covers up all kinds of unstated special assumptions.

Letter from Keynes to Frisch 28 November 1935
