Post-model-selection inference problems (wonkish)

4 January, 2017 at 19:08 | Posted in Statistics & Econometrics

It has long been recognized by some that when any parameter estimates are discarded, the sampling distribution of the remaining parameter estimates can be distorted …

For example, suppose the model a researcher selects depends on the day of the week. On Mondays it’s model A, on Tuesdays it’s model B, and so on up to seven different models on seven different days. Each model, therefore, is the “final” model with a probability of 1/7th that has nothing to do with the values of the regression parameters. Then, if the data analysis happens to be done on a Thursday, say, it is the results from model D that are reported. All of the other model results that could have been reported are not. Those parameter estimates are summarily discarded …
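(An aside before the quotation continues.) The day-of-the-week example can be made concrete with a small simulation. This is only a sketch under assumptions that are not in the quoted text: the seven models are taken to be nested regressions that add one more control each, and the coefficient values, sample size, and the function name `simulate_reported_estimate` are all made up for illustration. The point it tries to show is that the reported coefficient on `x1` has a sampling distribution that mixes seven different distributions, each weighted 1/7.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_reported_estimate(n_reps=4000, n=100, n_models=7):
    """Return the reported coefficient on x1 and the chosen model index per replication.

    Hypothetical setup: model k regresses y on x1 and the first k of six controls.
    """
    reported = np.empty(n_reps)
    chosen = np.empty(n_reps, dtype=int)
    for r in range(n_reps):
        # Six controls that each share variation with x1 (assumed values).
        x1 = rng.normal(size=n)
        controls = 0.7 * x1[:, None] + rng.normal(size=(n, n_models - 1))
        y = 1.0 * x1 + controls @ np.full(n_models - 1, 0.5) + rng.normal(size=n)
        # "Day of the week": the model index is drawn independently of the data.
        k = rng.integers(n_models)  # 0..6, each with probability 1/7
        X = np.column_stack([np.ones(n), x1, controls[:, :k]])
        beta = np.linalg.lstsq(X, y, rcond=None)[0]
        reported[r] = beta[1]  # coefficient on x1 in whichever model was run
        chosen[r] = k
    return reported, chosen

if __name__ == "__main__":
    reported, chosen = simulate_reported_estimate()
    for k in range(7):
        print(f"model {k}: mean reported x1 coefficient = {reported[chosen == k].mean():.2f}")
    print(f"spread of the mixture: {reported.std():.2f}")
```

Under these assumed values, the average reported coefficient should move from roughly 3.1 (no controls) down to roughly 1.0 (all six controls), so the pooled distribution of "the" estimate is far wider than that of any single model, even though the selection rule never looks at the data.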

Model selection is a procedure by which some models are chosen over others. But model selection is subject to uncertainty. Because regression parameter estimates depend on the model in which they are embedded, there is in post-model-selection estimates additional uncertainty not present when a model is specified in advance. The uncertainty translates into sampling distributions that are a mixture of distributions, whose properties can differ dramatically from those required for conventional statistical inference.

Richard Berk, Lawrence Brown, Linda Zhao
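
When selection does depend on the data, conventional confidence intervals can go wrong as well. The following Python sketch is one possible illustration, not the authors' own analysis. It assumes a two-regressor linear model, a selection rule that drops `x2` whenever its t-statistic is below 1.96 in absolute value, and coverage measured against the coefficient on `x1` in the full data-generating model. The parameter values and the helper names `ols` and `coverage` are hypothetical.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients and their conventional standard errors."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - p)
    return beta, np.sqrt(sigma2 * np.diag(XtX_inv))

def coverage(n_reps=2000, n=100, b1=1.0, b2=0.2, rho=0.7, seed=0):
    """Share of nominal 95% intervals for b1 that contain b1.

    Returns (coverage with the full model fixed in advance,
             coverage after data-driven selection).
    """
    rng = np.random.default_rng(seed)
    z = 1.96  # normal critical value used for both the test and the interval
    hits_full = hits_selected = 0
    for _ in range(n_reps):
        x1 = rng.normal(size=n)
        x2 = rho * x1 + np.sqrt(1 - rho**2) * rng.normal(size=n)
        y = b1 * x1 + b2 * x2 + rng.normal(size=n)
        X_full = np.column_stack([np.ones(n), x1, x2])
        beta_f, se_f = ols(X_full, y)
        hits_full += abs(beta_f[1] - b1) <= z * se_f[1]
        # Selection step: drop x2 if it looks insignificant, then refit.
        if abs(beta_f[2] / se_f[2]) < z:
            beta_s, se_s = ols(X_full[:, :2], y)
        else:
            beta_s, se_s = beta_f, se_f
        hits_selected += abs(beta_s[1] - b1) <= z * se_s[1]
    return hits_full / n_reps, hits_selected / n_reps

if __name__ == "__main__":
    cov_full, cov_selected = coverage()
    print(f"pre-specified full model: {cov_full:.3f}")
    print(f"after model selection:    {cov_selected:.3f}")
```

With these assumed settings, the pre-specified model should cover at close to the nominal 95%, while the post-selection intervals should fall clearly short of it, because the reported estimate comes from a mixture of the full and reduced models while its standard error is computed as if the chosen model had been fixed in advance.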
