Policy evaluation and the hazards to external validity

17 November, 2016 at 11:43 | Posted in Theory of Science & Methodology | 1 Comment

As yours truly has repeatedly argued on this blog (e.g. here here here), RCTs usually do not provide evidence that their results are exportable to other target systems. The almost religious belief with which many of their propagators portray them cannot hide the fact that RCTs cannot be taken for granted to give generalizable results. That something works somewhere is no warranty that it will work for us, or even that it works generally.

An extremely interesting systematic review article on the grand claims to external validity often raised by advocates of RCTs now confirms this view and shows that using an RCT is not at all the “gold standard” it is portrayed as:


In theory there seems to be a consensus among empirical researchers that establishing external validity of a policy evaluation study is as important as establishing its internal validity. Against this background, this paper has systematically reviewed the existing RCT literature in order to examine the extent to which external validity concerns are addressed in the practice of conducting and publishing RCTs for policy evaluation purposes. We have identified all 92 papers based on RCTs that evaluate a policy intervention and that are published in the leading economic journals between 2009 and 2014. We reviewed them with respect to whether the published papers address the different hazards of external validity that we developed …

Many published RCTs do not provide a comprehensive presentation of how the experiment was implemented. More than half of the papers do not even provide the reader with information on whether the participants in the experiment are aware of being part of an experiment – which is crucial to assess whether Hawthorne- or John-Henry-effects could have codetermined the outcomes in the RCT …

Further, potential general equilibrium effects are only rarely addressed. This is above all worrisome in case outcomes involve price changes (e.g. labor market outcomes) with straightforward repercussions when the program is brought to scale …

In many of the studies we reviewed, the assumptions that the authors make in generalizing their results, as well as respective limitations to the inferences we can draw, are left behind a veil …

A more transparent reporting would also lead to a situation in which RCTs that properly accounted for the potential hazards to external validity receive more attention than those that did not … We therefore call for dedicating the same devotion to establishing external validity as is done to establish internal validity. It would be desirable if the peer review process at economics journals explicitly scrutinized design features of RCTs that are relevant for extrapolating the findings to other settings and the respective assumptions made by the authors … Given the trade-offs we all face during the laborious implementation of studies it is almost certain that external validity will often be sacrificed for other features to which the peer-review process currently pays more attention.

Jörg Peters, Jörg Langbein & Gareth Roberts


1 Comment

  1. In your opening line, “As yours truly has repeatedly argued on this blog (e.g. here here here)”, the first two instances of the word “here” link to the same article.

    (Which, I suppose, means you repeated yourself here 🙂 I’ll get my hat.)
