Fooled by randomness

13 Jan, 2021 at 18:08 | Posted in Statistics & Econometrics | 6 Comments

A non-trivial part of teaching statistics to social science students is made up of teaching them to perform significance testing. A problem yours truly has noticed repeatedly over the years, however, is that no matter how careful you try to be in explicating what the probabilities generated by these statistical tests — p-values — really are, still most students misinterpret them.

A couple of years ago I gave a statistics course for the Swedish National Research School in History, and at the exam I asked the students to explain how one should correctly interpret p-values. Although the correct definition is p(data|null hypothesis), a majority of the students either misinterpreted the p-value as the likelihood of a sampling error (which of course is wrong, since the very computation of the p-value is based on the assumption that sampling errors are what cause the sample statistics not to coincide with the null hypothesis) or took it to be the probability of the null hypothesis being true, given the data (which of course also is wrong, since that is p(null hypothesis|data) rather than the correct p(data|null hypothesis)).
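To see how far apart the two conditional probabilities can be, here is a minimal sketch using Bayes' rule. The function name, the prior probabilities of the null, and the assumed power of 0.8 are all hypothetical numbers chosen for illustration, and "data" is simplified to "a result significant at the 5% level".

```python
def prob_null_given_significant(prior_null, alpha, power):
    """P(null | significant result) via Bayes' rule.

    prior_null: assumed prior probability that the null is true (hypothetical)
    alpha:      P(significant | null), the significance level
    power:      P(significant | alternative), assumed for illustration
    """
    p_sig_and_null = alpha * prior_null
    p_sig_and_alt = power * (1 - prior_null)
    return p_sig_and_null / (p_sig_and_null + p_sig_and_alt)


# P(significant | null) is fixed at 0.05, but P(null | significant)
# moves with the prior, so the two cannot be the same quantity.
for prior in (0.5, 0.9):
    posterior = prob_null_given_significant(prior, alpha=0.05, power=0.8)
    print(f"prior P(null) = {prior:.1f} -> P(null | significant) = {posterior:.3f}")
```

Under these assumptions the probability of the null given a significant result depends heavily on the prior, whereas the significance level stays at 0.05 throughout.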

This is not to be blamed on students’ ignorance, but rather on significance testing not being particularly transparent (conditional probability inference is difficult even to those of us who teach and practice it). A lot of researchers fall prey to the same mistakes. So – given that it anyway is very unlikely that any population parameter is exactly zero, and that contrary to assumption most samples in social science and economics are neither random nor of the right distributional shape – why continue to press students and researchers to do null hypothesis significance testing, testing that relies on weird backward logic that students and researchers usually don’t understand?

Let me just give a simple example to illustrate how slippery it is to deal with p-values – and how easy it is to impute causality to things that really are nothing but chance occurrences.

Say you have collected cross-country data on austerity policies and growth (and let’s assume that you have been able to “control” for possible confounders). You find that countries that have implemented austerity policies have on average increased their growth by, say, 2% more than the other countries. To really feel sure about the efficacy of the austerity policies you run a significance test – thereby actually assuming without argument that all the values you have come from the same probability distribution – and you get a p-value of less than 0.05. Heureka! You’ve got a statistically significant value. The probability is less than 1/20 that you got this value out of pure stochastic randomness.

But wait a minute. There is – as you may have guessed – a snag. If you test austerity policies in enough countries you will get a statistically ‘significant’ result out of pure chance 5% of the time. So, really, there is nothing to get so excited about!
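The 5% figure can be checked with a small simulation. This is only a sketch under stated assumptions: both groups of countries are drawn from the same standard normal distribution (so there is no true austerity effect at all), the variance is treated as known so that a simple two-sided z-test suffices, and the function name, sample sizes and seed are hypothetical.

```python
import random
from statistics import NormalDist, mean


def false_positive_rate(n_studies=2000, n_per_group=30, alpha=0.05, seed=0):
    """Share of simulated studies with p < alpha when the true effect is zero."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    # Standard error of a difference in means, assuming known unit variance.
    se = (2 / n_per_group) ** 0.5
    hits = 0
    for _ in range(n_studies):
        # Both groups come from the same distribution: the null is true.
        austerity = [rng.gauss(0, 1) for _ in range(n_per_group)]
        others = [rng.gauss(0, 1) for _ in range(n_per_group)]
        z = (mean(austerity) - mean(others)) / se
        p_value = 2 * (1 - std_normal.cdf(abs(z)))  # two-sided z-test
        if p_value < alpha:
            hits += 1
    return hits / n_studies


rate = false_positive_rate()
print(f"share of 'significant' results with no true effect: {rate:.3f}")
```

With no effect built into the data, roughly one study in twenty should still come out "significant", which is all the 0.05 threshold guarantees.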

Statistical significance doesn’t say that something is important or true. And since there already are far better and more relevant tests that can be done (see e.g. here and here), it is high time to give up on this statistical fetish and not continue to be fooled by randomness.

1. The epistemology of our intuitions about what and how we learn from observation and interpretation needs closer examination. I am not sure why, but I find the approach taken by “best explanation” narratives oddly uncritical, and common-sense expectations about what we can learn from sequences of events founder easily. And yet geology has been highly successful as a science and diagnostic troubleshooting is a commonplace activity. Somehow there is a disconnect here.

• Geology denied clear evidence of continental drift for decades. Geology can’t predict earthquakes. Geologists come up with extremely hand-wavy explanations for banding. Geologists, repeating their mistake with Wegener, try to explain everything in terms of temperature and pressure but in another hundred years they will have to acknowledge there are more powerful forces at work …

Geology, like bridge-building, is only “successful” because they include safety factors that double their failure predictions. “My prediction might be 100% wrong, so build for that.”

• Wegener recognized a pattern, but it took a long time to accumulate evidence that confirmed that pattern corresponded to a model with the power to extend an interpretation to encompass and organize a great deal of data. Wegener did a good job with what he had, but what he had was thin and he was naive about a lot. His guesstimate for the speed of tectonic movement was two orders of magnitude off; that kind of thing left a lot of room for critics to scorn.

Pattern recognition and sudden leaps of insight are inherent parts of knowledge-building, as are critical doubt and skepticism. Both are likely to be wrong most of the time about what they do not know they do not know.

• Any kid can look at a globe and tell Africa and South America fit together. Geologists predicted peak oil; that was a lie. On reddit’s geology forum I delight in flummoxing the geologists by using the same nit-picky criticism techniques they used against Wegener, against their own hand-wavy fantastical explanations. Temperature and pressure are not able to describe banding (or they could recreate it in a lab). Geology is Cargo Cult Science …

• The development of the science of geology is an apt way to study the development of science. It begins with observation and measurement and then hypothesis formation and testing. And as observation and measurement instrumentation has become more sophisticated so has hypothesis formation.

Geology had in the first instance to overcome the notions of native creationism and religious belief – vestiges of these still prevail in some quarters. Economics has similarly had to overcome ideology and its influence on theorization. Ideology still plays a significant role in the progress of economic theory.

The science of geology developed through the stages of Neptunism, Plutonism and Uniformitarianism. The model of the geosyncline (development of the Earth’s crust vertically) came to explain basin development, orogeny and volcanism. With accumulating observational data this theory was supplanted by plate tectonics (development of the Earth’s crust horizontally). Plate tectonic processes are now recognized as far back as the Archean.

Fortunately for geology, the understanding of geologic processes is founded in the sciences of physics, chemistry and botany. These provide a firm foundation for theoretical development. Unfortunately for economics such a firm foundation is not available.

The science of geology has essentially developed in a linear fashion, whereas the development of economics has sputtered and spurted in all kinds of directions, with enduring parallel development of alternate theories. It is difficult to see the situation changing.

• Geology today ignores electricity. There is an arbitrary social consensus that only temperature and pressure are operating. They can’t explain the emergence of ordered banding from homogenous rock, because they are slaves to obsolete Thermodynamics …
