Regression analysis and randomisation distract us from the real scientific issues

15 Nov, 2016 at 18:37 | Posted in Statistics & Econometrics | 1 Comment

In my view, regression models are not a particularly good way of doing empirical work in the social sciences today, because the technique depends on knowledge that we do not have. Investigators who use the technique are not paying adequate attention to the connection – if any – between the models and the phenomena they are studying. Their conclusions may be valid for the computer code they have created, but the claims are hard to transfer from that microcosm to the larger world …

Given the limits to present knowledge, I doubt that models can be rescued by technical fixes. Arguments about the theoretical merit of regression or the asymptotic behavior of specification tests for picking one version of a model over another seem like the arguments about how to build desalination plants with cold fusion as the energy source. The concept may be admirable, the technical details may be fascinating, but thirsty people should look elsewhere …

Causal inference from observational data presents many difficulties, especially when underlying mechanisms are poorly understood. There is a natural desire to substitute intellectual capital for labor, and an equally natural preference for system and rigor over methods that seem more haphazard. These are possible explanations for the current popularity of statistical models.

Indeed, far-reaching claims have been made for the superiority of a quantitative template that depends on modeling – by those who manage to ignore the far-reaching assumptions behind the models. However, the assumptions often turn out to be unsupported by the data. If so, the rigor of advanced quantitative methods is a matter of appearance rather than substance.

David Freedman

Freedman is absolutely spot on in his critique of how regression analysis has been applied in the social sciences.

But a growing number of social scientists today seem to think that randomization may somehow solve the causality problems surrounding regression analysis and econometrics. By randomizing we get different ‘populations’ (‘treatment’ and ‘control’ groups) that are homogeneous with regard to all variables except the one we think is a genuine cause. In this way we are supposed not to have to know what all these other factors are.

If you succeed in performing an ideal randomization with different treatment groups and control groups, that is attainable. But it presupposes that you really have been able to establish – and not just assume – that all other causes but the putative one have the same probability distribution in the ‘treatment’ and ‘control’ groups, and that the probability of assignment to the ‘treatment’ or ‘control’ group is independent of all other possible causal variables.

Unfortunately, real experiments and real randomizations seldom or never achieve this. So, yes, we may do without knowing all causes, but it takes ideal experiments and ideal randomizations to do that, not real ones.
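To make the gap between ideal and real randomizations a bit more concrete, here is a minimal simulation sketch. It is only an illustration under stated assumptions: a single unobserved covariate drawn from a standard normal distribution, an even split into two arms, and hypothetical function names (`covariate_gap`, `typical_gap`) that come from nowhere but this example. It covers only chance imbalance in one realized randomization, not the other ways real experiments fall short of the ideal.

```python
import random
import statistics


def covariate_gap(n, rng):
    """Randomly split n units into two equal arms and return the
    difference in arm means of a covariate nobody has measured."""
    # Assumption: the unobserved covariate is standard normal.
    covariate = [rng.gauss(0.0, 1.0) for _ in range(n)]
    indices = list(range(n))
    rng.shuffle(indices)  # the randomization itself
    half = n // 2
    treated = [covariate[i] for i in indices[:half]]
    control = [covariate[i] for i in indices[half:]]
    return statistics.mean(treated) - statistics.mean(control)


def typical_gap(n, trials=2000, seed=0):
    """Average absolute imbalance over many independent randomizations."""
    rng = random.Random(seed)
    return statistics.mean(abs(covariate_gap(n, rng)) for _ in range(trials))


if __name__ == "__main__":
    # Print the typical imbalance for small, medium and large samples.
    for n in (20, 200, 2000):
        print(n, round(typical_gap(n), 3))
```

Averaged over many hypothetical repetitions the gap is zero, but any single randomization leaves some imbalance in the unmeasured covariate. That imbalance shrinks with sample size and does not vanish in a finite experiment.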

That means that in practice we have to have sufficient background knowledge to deduce causal knowledge. Without old knowledge, we can’t get new knowledge. No causes in, no causes out.

Conclusion: neither regression analysis nor randomisation is a substitute for doing real science.

1 Comment

For confirmation of the strangeness of most regression stuff, check out this recent article in the British Journal of Political Science – Do We Really Know the WTO Cures Cancer?

Comment by DW, 16 Nov, 2016

LARS P. SYLL
