Statistical models and the assumptions on which they build

21 Sep, 2022 at 19:47 | Posted in Statistics & Econometrics | Leave a comment

Assumption / Validation Flowchart | by Tami Reiss | Product Ponderings |  MediumEvery method of statistical inference depends on a complex web of assumptions about how data were collected and analyzed, and how the analysis results were selected for presentation. The full set of assumptions is embodied in a statistical model that underpins the method … Many problems arise however because this statistical model often incorporates unrealistic or at best unjustified assumptions …

The difficulty of understanding and assessing underlying assumptions is exacerbated by the fact that the statistical model is usually presented in a highly compressed and abstract form—if presented at all. As a result, many assumptions go unremarked and are often unrecognized by users as well as consumers of statistics. Nonetheless, all statistical methods and interpretations are premised on the model assumptions; that is, on an assumption that the model provides a valid representation of the variation we would expect to see across data sets, faithfully reflecting the circumstances surrounding the study and phenomena
occurring within it.

Sander Greenland et al.

If anything, the common abuse of statistical tests underlines how important it is not to equate science with statistical calculation. All science entails human judgment, and using statistical models doesn’t relieve us of that necessity. Working with misspecified models, the scientific value of statistics is actually zero — even though you’re making valid statistical inferences! Statistical models are no substitutes for doing real science. Or as a famous German philosopher once famously wrote:

There is no royal road to science, and only those who do not dread the fatiguing climb of its steep paths have a chance of gaining its luminous summits.

We should never forget that the underlying parameters we use when performing statistical tests are model constructions. And if the model is wrong, the value of our calculations is nil. As ‘shoe-leather researcher’ David Freedman wrote in Statistical Models and Causal Inference:

I believe model validation to be a central issue. Of course, many of my colleagues will be found to disagree. For them, fitting models to data, computing standard errors, and performing significance tests is “informative,” even though the basic statistical assumptions (linearity, independence of errors, etc.) cannot be validated. This position seems indefensible, nor are the consequences trivial. Perhaps it is time to reconsider.

Leave a Comment »

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

Blog at WordPress.com.
Entries and Comments feeds.