External validity and experiments — a Faustian bargain

3 Sep, 2019 at 09:54 | Posted in Economics | 1 Comment

Under controlled conditions bed nets have been shown to be highly effective in preventing malaria: households randomly “treated” with bed nets experience a reduction in malaria incidence relative to households randomly allocated to control conditions. These controlled experiments identify the “effects of causes”: in this case, that bed net use reduces malaria incidence. Based on this evidence, numerous programs have been implemented that freely distribute bed nets in areas of high malaria incidence. And yet, to date, the jury is still out as to the effectiveness of these interventions in reducing malaria incidence in the treated areas. Why? One answer is lack of compliance – providing a free bed net does not imply that it will be used appropriately.

This example illustrates the point that just because X can be shown to cause changes in Y, it does not follow that it explains any of the observed variance in Y in the real world, nor, indeed, that it can be an effective cause in the real world (a problem of external validity), where we may lack sufficient control to ensure full compliance. In other words, there are other factors (potentially unobserved) that moderate the causal effect in real applications … We gain understanding of potential causes for reducing malaria, but we may still end up with bad policy predictions.

Fernando M Garcia & Leonard Wantchekon

The problem many ‘randomistas’ end up with when they underestimate heterogeneity and interaction is not only an external validity problem that arises when trying to ‘export’ regression results to different times or different target populations. It is also often an internal validity problem for the millions of regression estimates that economists produce every year.
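The compliance mechanism in the bed-net example above can be made concrete with a small simulation. The numbers here are purely hypothetical: suppose malaria risk is 30% without a net and 10% with one actually in use. If only a fraction of ‘treated’ households use the net, the measured intention-to-treat effect shrinks proportionally, even though the causal effect among users is unchanged:

```python
import random

def simulate_itt(n=100_000, baseline_risk=0.30, risk_if_used=0.10,
                 compliance=1.0, seed=42):
    """Absolute risk reduction under intention-to-treat when only a
    fraction of treated households actually use the bed net.
    All parameter values are hypothetical illustrations."""
    rng = random.Random(seed)
    cases_treated = 0
    for _ in range(n):
        # A treated household uses the net only with probability `compliance`.
        uses_net = rng.random() < compliance
        risk = risk_if_used if uses_net else baseline_risk
        cases_treated += rng.random() < risk
    # Control households all face the baseline risk.
    cases_control = sum(rng.random() < baseline_risk for _ in range(n))
    return (cases_control - cases_treated) / n

full = simulate_itt(compliance=1.0)   # roughly 0.20: the 'effect of the cause'
partial = simulate_itt(compliance=0.4)  # roughly 0.08: the diluted field effect
print(f"full compliance: {full:.3f}, 40% compliance: {partial:.3f}")
```

With full compliance the experiment recovers the ~20-percentage-point risk reduction; at 40% compliance the very same causal mechanism yields only ~8 points in the field. This is exactly the gap between ‘X causes Y under controls’ and ‘distributing X is effective policy’.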

‘Ideally controlled experiments’ tell us with certainty what causes what effects — but only given the right ‘closures.’ Making appropriate extrapolations from (ideal, accidental, natural or quasi) experiments to different settings, populations or target systems, is not easy. ‘It works there’ is no evidence for ‘it will work here.’ Causes deduced in an experimental setting still have to show that they come with an export-warrant to the target population/system. The causal background assumptions made have to be justified, and without licenses to export, the value of ‘rigorous’ and ‘precise’ methods — and ‘on-average-knowledge’ — is despairingly small.

RCTs have very little reach beyond giving descriptions of what has happened in the past. From the perspective of the future and for policy purposes they are as a rule of limited value since they cannot tell us what background factors were held constant when the trial intervention was being made.

RCTs usually do not provide evidence that their results are exportable to other target systems, and they cannot be taken for granted to give generalizable results. That something works somewhere for someone is no warrant for believing that it will work for us here, or even that it works generally.

1 Comment »


  1. Are you saying that RCTs are subject to more problems than other forms of experiment that try to answer the same question? I would say that all the problems of internal and external validity to which RCTs are subject are also present, in spades, in other experimental designs, and in truckloads in observational studies.
    What do you propose as an alternative to statistical testing?

