## RCTs in the Garden of Eden

8 December, 2016 at 15:29 | Posted in Statistics & Econometrics | 3 Comments

Suppose researchers come to a town and do an RCT on the town population to check whether the injection of a green chemical improves memory and has adverse side effects. Suppose it is found that it has no side effects and improves memory greatly in 95% of cases. If the study is properly done and the random draw is truly random, it is likely to be treated as an important finding and will, in all likelihood, be published in a major scientific journal.

Now consider a particular woman called Eve who lives in this town and is keen to enhance her memory. Can she, on the basis of this scientific study, deduce that there is a probability of 0.95 that her memory will improve greatly if she takes this injection? The answer is no, because she is not a random draw of an individual from this town. All we do know from the law of large numbers is that for every randomly drawn person from this population the probability that the injection will enhance memory is 0.95. But this would not be true for a specially chosen person in the same way that this would not be true of someone chosen from another town or another time.

To see this more clearly, permit me to alter the scenario in a statistically neutral way. Suppose that what I called the town in the above example is actually the Garden of Eden, which is inhabited by snakes and other similar creatures, and Eve and Adam are the only human beings in this place. Suppose now the same experiment was carried out in the Garden of Eden. That is, randomisers came, drew a large random sample of creatures, and administered the green injection and got the same result as described above. It works in 95% of cases. Clearly, Eve will have little confidence, on the basis of this, to expect that this treatment will work on her. I am assuming that the random draw of creatures on which the injection was tested did not include Eve and Adam. Eve will in all likelihood flee from anyone trying to administer this injection to her because she would have plainly seen that what the RCT demonstrates is that it works in the case of snakes and other such creatures, and the fact that she is part of the population from which the random sample was drawn is in no way pertinent.

Indeed, and the importance of this will become evident later, suppose in a neighbouring garden, where all living creatures happen to be humans, there was a biased-sample (meaning non-random) trial of this injection, and it was found that the injection does not enhance memory and, in fact, gives a throbbing headache in a large proportion of cases, it is likely that Eve would be tempted to go along with this biased-sample study done on another population rather than the RCT conducted on her own population in drawing conclusions about what the injection might do to her. There is as little hard reason for Eve to reach this conclusion as it would be for her to conclude that the RCT result in her own Garden of Eden would work on her. I am merely pointing to a propensity of the human mind whereby certain biased trials may appear more relevant to us than certain perfectly controlled ones.

Kaushik Basu

Basu’s reasoning confirms what yours truly has repeatedly argued on this blog and in On the use and misuse of theories and models in mainstream economics, namely that RCTs usually do not provide evidence that the results are exportable to other target systems. The almost religious belief with which its propagators portray it cannot hide the fact that RCTs cannot be taken for granted to give generalizable results. That something works somewhere is no warranty for it to work for us or even that it works generally.
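A small arithmetic sketch may make the point concrete. If all we know is the overall success rate in the sampled population and the share of that population belonging to some subgroup, the law of total probability puts only loose bounds on the success rate inside the subgroup. The function name and the subgroup shares used below are hypothetical illustrations (Basu gives no head count for the Garden of Eden); only the 0.95 figure comes from the example.

```python
def subgroup_success_bounds(overall_rate, subgroup_share):
    """Bounds on a subgroup's success rate implied by the overall rate alone.

    Uses overall = share * p_sub + (1 - share) * p_rest, where the
    unknown rate in the rest of the population, p_rest, lies in [0, 1].
    """
    if not 0.0 <= overall_rate <= 1.0:
        raise ValueError("overall_rate must lie in [0, 1]")
    if not 0.0 < subgroup_share <= 1.0:
        raise ValueError("subgroup_share must lie in (0, 1]")

    # Lowest subgroup rate: the rest of the population succeeds every time.
    lower = max(0.0, (overall_rate - (1.0 - subgroup_share)) / subgroup_share)
    # Highest subgroup rate: the rest of the population never succeeds.
    upper = min(1.0, overall_rate / subgroup_share)
    return lower, upper


if __name__ == "__main__":
    # Assumed share: humans are 2% of the creatures in the garden.
    print(subgroup_success_bounds(0.95, 0.02))
    # Assumed share: the subgroup is half of the population.
    print(subgroup_success_bounds(0.95, 0.5))
```

With an assumed human share of 2%, the bounds run from 0 to 1: a 95% success rate for the garden as a whole is compatible with the injection never working on humans, and equally with it always working on them. The bounds tighten only as the subgroup comes to make up most of the population.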

1. As a mathematician, I don’t follow Basu’s argument. Imagine that Eve goes to the doctor with symptoms that match those of the RCT. Then the conclusions apply. But now the doctor notes that Eve is human and that humans are rare. Then the results of the RCT give us a prior to which we would like to apply Bayes’ rule. But we don’t have enough information. So while the mathematics is still valid, it doesn’t give us a very useful conclusion: An RCT finding for a whole population tells us nothing about a rare atypical subpopulation.

To do better, we would need to stratify by such factors as age and gender. Then we could draw useful conclusions, but only so long as the study actually considered all the factors which we now consider to be relevant.

As an example, suppose that you have a medical or family history which shows that you are atypical as far as the diagnosis is concerned. Then you should treat the common findings with more caution than would otherwise be the case. But a well-designed and interpreted RCT is still useful. (Or at least would withstand Basu’s critique.)

• Indeed. The garden of Eden analogy is a poor one. If Eve has not noticed that she is not a snake, she has no reason to prefer the biased study. If she has, there is no reason to consider the RCT.

• The problem with medicine in particular is that one can never rule out the possibility that one is a member of some as yet undiscovered small atypical sub-group, so one can never know that the RCT results ‘objectively’ apply. I have worked with many people who argue that one should nonetheless use the RCT result, providing various ‘reasons’. These include: it is the best there is, or it is ‘scientific’, or that civilisation as we know it would end if we failed to act ‘rationally’. So there are ‘reasons’. The challenge is to offer something better.

I tend to say that the best available RCT results should inform the decision, but we also need to judge and take account of their uncertainty. If someone insists that the RCT gives ‘the probability’ then I might not waste time arguing but seek to tease out what it is the probability of. For example, one could see it as the probability for me as seen from the perspective of someone who doesn’t know me or my medical or family history. Or I might just see it as the probability for the population as a whole. Either way there is clearly more to it.

Many of the people I deal with are, explicitly or implicitly, Bayesians of some sort. According to Bayes’ paper, the rationale for relying on his theory in this case would depend on observed homogeneity. But I observe heterogeneity.
