## On the non-applicability of statistical models

12 December, 2017 at 14:45 | Posted in Statistics & Econometrics | 17 Comments

Eminent statistician David Salsburg is rightfully very critical of the way social scientists, including economists and econometricians, have come to simply assume, uncritically and without argument, that they can apply probability distributions from statistical theory to their own area of research:

We assume there is an abstract space of elementary things called ‘events’ … If a measure on the abstract space of events fulfils certain axioms, then it is a probability. To use probability in real life, we have to identify this space of events and do so with sufficient specificity to allow us to actually calculate probability measurements on that space … Unless we can identify [this] abstract space, the probability statements that emerge from statistical analyses will have many different and sometimes contrary meanings …

Kolmogorov established the mathematical meaning of probability: Probability is a measure of sets in an abstract space of events. All the mathematical properties of probability can be derived from this definition. When we wish to apply probability to real life, we need to identify that abstract space of events for the particular problem at hand … It is not well established when statistical methods are used for observational studies … If we cannot identify the space of events that generate the probabilities being calculated, then one model is no more valid than another … As statistical models are used more and more for observational studies to assist in social decisions by government and advocacy groups, this fundamental failure to be able to derive probabilities without ambiguity will cast doubt on the usefulness of these methods.

Wise words well worth pondering on.

As long as economists and statisticians cannot really identify their statistical models with real-world phenomena there is no real warrant for taking their statistical inferences seriously.

Just as there is no such thing as a ‘free lunch,’ there is no such thing as a ‘free probability.’ To be able at all to talk about probabilities, you have to specify a model. If there is no chance set-up or model that generates the probabilistic outcomes or events – in statistics one refers to any process where you observe or measure as an experiment (rolling a die) and the results obtained as the outcomes or events (number of points rolled with the die, being e.g. 3 or 5) of the experiment – there is, strictly speaking, no event at all.

Probability is a relational element. It always must come with a specification of the model from which it is calculated. And then to be of any empirical scientific value it has to be shown to coincide with (or at least converge to) real data generating processes or structures – something seldom or never done!
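To make the point about specifying a model concrete, here is a minimal sketch in Python. It assumes the simplest possible chance set-up, a single roll of a fair six-sided die; the names `OUTCOMES`, `all_events`, `prob` and `satisfies_kolmogorov` are illustrative and do not come from Salsburg. Probabilities are only defined for events inside the outcome space we have explicitly written down, and the Kolmogorov axioms can be checked only relative to that space.

```python
from fractions import Fraction
from itertools import chain, combinations

# Assumed chance set-up: one roll of a fair six-sided die.
OUTCOMES = frozenset(range(1, 7))


def all_events(outcomes):
    """Every subset of the outcome space (the event space of a finite model)."""
    items = sorted(outcomes)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]


def prob(event, outcomes=OUTCOMES):
    """Probability of an event under the assumed uniform (fair-die) model."""
    if not event <= outcomes:
        # Outside the specified model there is, strictly, no event at all.
        raise ValueError("event is not a subset of the specified outcome space")
    return Fraction(len(event), len(outcomes))


def satisfies_kolmogorov(outcomes=OUTCOMES):
    """Check non-negativity, unit total mass and finite additivity."""
    events = all_events(outcomes)
    non_negative = all(prob(e, outcomes) >= 0 for e in events)
    unit_mass = prob(outcomes, outcomes) == 1
    additive = all(
        prob(a | b, outcomes) == prob(a, outcomes) + prob(b, outcomes)
        for a in events
        for b in events
        if not (a & b)  # only disjoint pairs
    )
    return non_negative and unit_mass and additive
```

With this model, `prob(frozenset({3, 5}))` is one third, while asking for the probability of rolling a 7 raises an error because that event was never part of the specified space. For prices or GDP, the difficulty is that nobody can write down the analogue of `OUTCOMES` with comparable confidence.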

And this is the basic problem with economic data. If you have a fair roulette-wheel, you can arguably specify probabilities and probability density distributions. But how do you conceive of the analogous ‘nomological machines’ for prices, gross domestic product, income distribution etc? Only by a leap of faith. And that does not suffice. You have to come up with some really good arguments if you want to persuade people to believe in the existence of socio-economic structures that generate data with characteristics conceivable as stochastic events portrayed by probabilistic density distributions!

1. One of the big problems for those of us who haven’t bought into the mainstream view is that we often seem to disagree among ourselves, so people stick with the mainstream. I think the main thing to take away from Salsburg is that if we are going to apply probability theory we need some sort of ‘warrant’. There is nothing in mathematics or statistics or anywhere else that gives a general reason for supposing that whatever we are faced with is ‘stochastic’. I agree.

Some Bayesians note that we think in terms of subjective probabilities, and detail the consequences. They may be right about how ‘we’ think, but that doesn’t mean that the result of our thinking is correct.

I quibble, though, where Salsburg says: “To use probability in real life, we have to identify this space of events and do so with sufficient specificity to allow us to actually calculate probability measurements on that space.” The mere existence of probability distributions has some strong implications for how we ‘should’ think about social issues, including economics. It seems to me that it is not enough to argue that probability distributions can’t be calculated with any specificity: we need to argue that they don’t exist.

• I would argue that most of the time the assumed existence of probability distributions (in a strict mathematical-statistical sense) is unwarranted in a real-world context. And even if they did exist (and this is, I guess, Salsburg’s point) it’s far from certain that we could make precise measurements and calculations based on them (cf. Keynes’s argument on risk and genuine uncertainty in A Treatise on Probability).

2. “Just as there is no such thing as a ‘free lunch,’ there is no such thing as a ‘free probability.’ To be able at all to talk about probabilities, you have to specify a model.”

The universe itself is a free lunch; dark energy is a free lunch; the microwave drive is a free lunch. Money created by the Fed is a free lunch …

The model can be a default null hypothesis model. There is no model to explain why a particle collapses to one particular state or another with probabilities like 1/sqrt(5), or whatever. Why don’t particles collapse to states with equal probabilities? We do not know. Thus the complaint about economics is also a complaint against physics, and all sciences.

I think probabilities should not be subject to the law that says they must add to one. I think of probabilities more like points or scores which can accumulate. All probabilities can increase at once. If one probability goes up it does not mean another has to go down. Such is my probability story.

“how do you conceive of the analogous ‘nomological machines’ for prices, gross domestic product, income distribution etc?”

For prices, fintech has computer programs with perfect hedges and self-funding portfolios that lend long and borrow short and profit from various contrived and colluded arbitrage opportunities … as for the rest, who cares.

• Robert, you can use probability theory in something like the way you describe by including an option ‘something else’. The trick is knowing when the something else is becoming more probable, as it was through 2007/8. Unfortunately, this is not standard practice.

• Yes, I have been exposed to smoothing techniques in natural language processing. Since out-of-vocabulary words occur in real life, you have to assign a probability to them. The question I have is why do the probabilities have to sum to one? I want to treat probabilities more as scores, or points, which can sum to arbitrary numbers. I think that gives me more flexibility; instead of recalculating all existing probabilities I can simply raise one probability higher, or add a new option and give it a score.

I guess I am averse to laws in general. I have seen the Law of Total Probability cited in various classes I’ve taken, but I’ve remained unconvinced of the advantages of sticking to the law. I think there are easier ways to think about probability which do not necessarily involve them summing to one.
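The trade-off between free-form scores and probabilities that sum to one can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not the method of any commenter or of any particular NLP library: the toy sentence, the catch-all label `<OOV>` (standing in for the ‘something else’ option) and the functions `smoothed_scores` and `normalise` are all hypothetical.

```python
from collections import Counter


def smoothed_scores(tokens, alpha=1.0, oov="<OOV>"):
    """Additive (Laplace-style) scores: raw count plus alpha, with one extra
    'something else' entry for words never seen in the sample."""
    counts = Counter(tokens)
    scores = {word: count + alpha for word, count in counts.items()}
    scores[oov] = alpha  # hypothetical catch-all for unseen words
    return scores


def normalise(scores):
    """Turn non-negative scores into probabilities that sum to one."""
    total = sum(scores.values())
    return {word: s / total for word, s in scores.items()}


tokens = "the price of the bond and the price of the stock".split()
scores = smoothed_scores(tokens)
probs = normalise(scores)

# Raising one score leaves every other score untouched ...
scores["<OOV>"] += 5
new_probs = normalise(scores)
# ... but once the scores are normalised, every other option's share falls.
```

Working with raw scores is indeed more flexible, since one entry can be raised without touching the rest. The normalisation step is what makes the numbers comparable across models and usable in expectations: after the ‘something else’ score is raised, the normalised weight of `"the"` drops even though its score did not change.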

3. The trouble with statistics in forecasting is that we are using past behavior to tell us about something expected in the future. The chances of being accurate are poor because history repeats itself only after the same errors have been made.

• David, I agree. But from 2005 to now not everything changed, so I argue that looking at the past for ‘laws’ is useful, so long as you aren’t misled by them.

4. When timid philosophers like Salsburg and Prof. Syll try to justify their dithering impotence by abstruse intellectual excuses such as the lack of “real data generating processes”, “nomological machines”, etc., they commit the mistake of “not starting” identified by Gautama Buddha (c.480 BCE – c.400 BCE):
“There are only two mistakes one can make along the road to truth; not going all the way, and not starting.”

• Kingsley, if Syll et al have a positive message, it is not so obvious as their negative one. I have had similar experiences of many who are talking about a familiar topic using an unfamiliar language. Based on previous experience, I now tend to give them the benefit of the doubt, and suppose that they may have something worthwhile to say, if only I could understand it.

Keynes’ Treatise seems to me hugely incomprehensible, but wouldn’t you grant that there may be something in his view of probability and probability theory? I attempt to make sense of it in my blog, but have yet to find the ‘magic bullet’ that makes it accessible. As in so many areas, it seems that one has to ‘regress to progress’. That is, one has to recognise the limitations of the views of economists before one can appreciate the mathematics. Any ideas, anyone?

• My idea is that Professor Syll, Professor Mehrling, et al. are (somewhat timidly) heading towards the obvious conclusion that prices are arbitrary, the efficient market hypothesis is wrong, and we should stop imposing economic constraints on public policies.

5. Just because Kolmogorov (or anybody else for that matter) was able to give a formal abstract definition of probability and derive the mathematical laws of probability does not mean that real world phenomena need obey the laws of probability. Probability and the notion of randomness seem to be inextricably entwined. Who’s to say whether any phenomenon is random or not? There might be statistical measures which enable such questions to be, prima facie, resolved. But these statistical measures are based on the abstract laws of probability. Probability is purely an intellectual construct. Randomness is an intellectual construct. It seems to me that every phenomenon in the cosmos has a cause and obeys some law, no matter the scale of observation. For it to be otherwise, yields unimaginable consequences. The notions of probability and randomness are invoked because there are phenomena we cannot otherwise explain.

• “For it to be otherwise, yields unimaginable consequences.”

The universe is not only stranger than you imagine, but stranger than you can imagine … (a quotation variously attributed to Eddington, Haldane, etc.)

Someone mentioned Buddha in another comment, so I’ll throw out that in the Jain view, karma is the law of cause and effect and moksha (or nirvana) is the state of liberation from karma. Through knowledge, Jains hold that one can transcend ordinary laws of cause and effect. For a completely different formulation of a similar idea, see http://www.thelema101.com/intro “Do what thou wilt shall be the whole of the Law” …

• “Do what thou wilt shall be the whole of the Law” …

Crowley’s mother’s nickname for Crowley as a child was “The Beast”. Perhaps what he preached as an adult he must also have practiced as a child, much to the frustration of his mother. (Just making that up – I can’t remember why his mother called him so.) Putting aside the Law of Karma, the corollary to the laws of probability is that some “real” phenomena have no cause. For me, this is an absurdity.

• I understand your position, but what caused the first cause?

• Perhaps you should ask Mr. Crowley.

• No reply button appears under Henry Rech’s response to this comment, so I’ll use this space to reply to the comment: “Perhaps you should ask Mr. Crowley.”

Physicists such as Guth and Krauss have called the universe itself “the ultimate free lunch”. Krauss, I’ve heard in interviews, is also a strong believer that everything has a cause. I’m not sure how he reconciles the two. Maybe he thinks the universe popped into existence arbitrarily, then everything after that has a cause? Maybe he thinks there is a cause for the matter creation in the Big Bang? But these can only be hypotheses, beliefs.

When I heard Krauss in an interview express a similar point of view to yours, that everything has a cause, I thought that he was making a leap of faith. He has observed, in the lab, things causing other things, and then he uses induction to conclude that everything has a cause. But I’m sure you have heard of the problem of induction https://en.wikipedia.org/wiki/Problem_of_induction

6. Krauss may have based his conclusion on observation. And as you point out, the conclusion is only as good as the last observation allows it to be. I guess I’ve done the same thing. I look at the world around us and understand that science tells us that some phenomena are governed by law. Does that mean that all phenomena are governed by law? Who knows? But it seems to me that it makes more sense to say that all phenomena are governed by law. The notion of randomness implies that some phenomena are not governed by law. But randomness is an intellectual construct, as I see it, and should not be used to describe the condition of otherwise unexplainable phenomena. By phenomena, I mean physical phenomena. Lars’s comment is dealing with social/economic phenomena. It is difficult to discern cause and effect in social/economic phenomena. Perhaps the argument becomes circular here.