Monte Carlo simulation and p values (wonkish)28 January, 2013 at 11:45 | Posted in Statistics & Econometrics | 9 Comments
In many social sciences p values and null hypothesis significance testing (NHST) are often used to draw far-reaching scientific conclusions – despite the fact that they are as a rule poorly understood and that there exist altenatives that are easier to understand and more informative.
Not the least using confidence intervals (CIs) and effect sizes are to be preferred to the Neyman-Pearson-Fisher mishmash approach that is so often practised by applied researchers.
Running a Monte Carlo simulation with 100 replications of a fictitious sample having N = 20, confidence itervals of 95%, a normally distributed population with a mean = 10 and a standard deviation of 20, taking two-tailed p values on a zero null hypothesis, we get varying CIs (since they are based on varying sample standard deviations), but with a minimum of 3.2 and a maximum of 26.1 we still get a clear picture of what would happen in an infinite limit sequence. On the other hand p values (even though from a purely mathematical statistical sense more or less equivalent to CIs) vary strongly from sample to sample, and jumping around between a minimum of 0.007 and a maximum of 0.999 don’t give you a clue of what will happen in an infinite limit sequence! So, I can’t but agree with Geoff Cummings:
The problems are so severe we need to shift as much as possible from NHST … The first shift should be to estimation: report and interpret effect sizes and CIs … I suggest p should be given only a marginal role, its problem explained, and it should be interpreted primarily as an indicator of where the 95% CI falls in relation to a null hypothesised value.
[In case you want to do your own Monte Carlo simulation, here's an example using Gretl:
loop 100 --progressive
series y = normal(10,15)
scalar zs = (10-mean(y))/sd(y)
scalar df = $nobs-1
scalar ysd= sd(y)
scalar tstat = (ybar-10)/ybarsd
pvalue t df tstat
scalar lowb = mean(y) - critical(t,df,0.025)*ybarsd
scalar uppb = mean(y) + critical(t,df,0.025)*ybarsd
scalar pval = pvalue(t,df,tstat)
store E:\pvalcoeff.gdt lowb uppb pval