## Significance tests and the quest for the Holy Grail of “true” models

3 August, 2012 at 14:46 | Posted in Statistics & Econometrics | 5 Comments

After having mastered all the technicalities of regression analysis and econometrics, students often feel as though they are the masters of the universe. I usually cool them down with a required reading of **Christopher Achen**’s modern classic *Interpreting and Using Regression*.

It usually gets them back on track again, and they understand that

no increase in methodological sophistication … alter the fundamental nature of the subject. It remains a wondrous mixture of rigorous theory, experienced judgment, and inspired guesswork. And that, finally, is its charm.

After giving a statistics course, I usually – at the exam – ask students to explain how one should correctly interpret p-values. Although the correct definition is p(data|null hypothesis), that is, the probability of obtaining data at least as extreme as those observed if the null hypothesis is true, a majority of the students either misinterpret the p-value as the likelihood of a sampling error (which of course is wrong, since the very computation of the p-value is based on the assumption that sampling error is what causes the sample statistic not to coincide with the null hypothesis) or take it to be the probability of the null hypothesis being true, given the data (which of course also is wrong, since that is p(null hypothesis|data) rather than the correct p(data|null hypothesis)).
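The gap between p(data|null hypothesis) and p(null hypothesis|data) can be made concrete with a small coin-flip calculation. What follows is only a sketch under stated assumptions: the data (15 heads in 20 flips), the single alternative hypothesis (a heads probability of 0.75) and the two prior probabilities are all hypothetical numbers chosen for illustration, and the function names are my own.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def one_sided_p_value(k, n, p_null=0.5):
    """P(X >= k | null): data at least as extreme as observed, given the null."""
    return sum(binom_pmf(i, n, p_null) for i in range(k, n + 1))


def posterior_null(k, n, prior_null, p_null=0.5, p_alt=0.75):
    """P(null | data) via Bayes' rule against one hypothetical alternative."""
    like_null = binom_pmf(k, n, p_null)
    like_alt = binom_pmf(k, n, p_alt)
    evidence = prior_null * like_null + (1 - prior_null) * like_alt
    return prior_null * like_null / evidence


# Illustrative data (assumed): 15 heads in 20 flips, null = fair coin.
n_flips, n_heads = 20, 15

p_value = one_sided_p_value(n_heads, n_flips)
post_even = posterior_null(n_heads, n_flips, prior_null=0.5)
post_sceptic = posterior_null(n_heads, n_flips, prior_null=0.9)

print(f"p-value, P(data at least this extreme | null): {p_value:.4f}")
print(f"P(null | data) with prior 0.5:                 {post_even:.4f}")
print(f"P(null | data) with prior 0.9:                 {post_sceptic:.4f}")
```

The p-value is computed from the null hypothesis alone and stays at roughly 0.02 whatever one believed beforehand. The probability that the null is true moves from roughly 0.07 to roughly 0.40 as the assumed prior changes, and it cannot be computed at all without a prior and an explicit alternative. Reading the first number as if it were the second is exactly the mistake described above.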

This is not to be blamed on students’ ignorance, but rather on significance testing not being particularly transparent – conditional probability inference is difficult even for those of us who teach and practice it. A lot of researchers fall prey to the same mistakes. So – given that it is in any case very unlikely that any population parameter is exactly zero, and that, contrary to assumption, most samples in social science and economics are neither random nor of the right distributional shape – why continue to press students and researchers to do null hypothesis significance testing, testing that relies on a weird backward logic that students and researchers usually don’t understand? And as Achen writes:

Significance testing as a search for specification errors substitutes calculations for substantive thinking. Worse, it channels energy toward the hopeless search for functionally correct specifications and diverts attention from the real tasks, which are to formulate a manageable description of the data and to exclude competing ones.
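The earlier remark that a population parameter is very unlikely to be exactly zero has a practical consequence that is easy to sketch: with a large enough sample, even a substantively negligible effect becomes "significant". The example below is a rough illustration, not an analysis of real data. It assumes a two-sided z-test with known standard deviation, an idealised sample mean that equals the true effect exactly, and a hypothetical effect size of 0.01 standard deviations.

```python
from math import erfc, sqrt


def two_sided_p_value(effect, sigma, n):
    """Two-sided z-test p-value for H0: mean = 0 when the sample mean equals `effect`."""
    z = abs(effect) * sqrt(n) / sigma
    # erfc(z / sqrt(2)) equals 2 * (1 - Phi(z)) for the standard normal CDF Phi.
    return erfc(z / sqrt(2))


tiny_effect = 0.01  # hypothetical: one hundredth of a standard deviation
sigma = 1.0         # assumed known

for n in (100, 10_000, 1_000_000):
    print(f"n = {n:>9,}: p = {two_sided_p_value(tiny_effect, sigma, n):.3g}")
```

The effect is the same in every row; only the sample size changes. The p-value nevertheless falls from far above any conventional threshold to far below it, so "rejecting the null" here says more about the amount of data than about whether the effect matters.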

## 5 Comments

Achen is one of the better among us poliscifis.

Comment by Dwayne Woods— 3 August, 2012 #

And he gets better and better (read: more critical) over time!

Comment by Lars P Syll— 3 August, 2012 #

True, as Clarke and Primo pointed out in their recent book, he had some positivistic overhang that tended to creep in.

Comment by Dwayne Woods— 3 August, 2012 #

Here is a sample of Clarke and Primo

http://www.nytimes.com/2012/04/01/opinion/sunday/the-social-sciences-physics-envy.html?_r=1

Comment by Dwayne Woods— 4 August, 2012 #

Thanks. I’ll have a look at them!

Comment by Lars P Syll— 4 August, 2012 #