Does randomization control for ‘lack of balance’?

16 Mar, 2023 at 16:40 | Posted in Statistics & Econometrics | 2 Comments

Mike Clarke, the Director of the Cochrane Centre in the UK, for example, states on the Centre’s Web site: ‘In a randomized trial, the only difference between the two groups being compared is that of most interest: the intervention under investigation’.

This seems clearly to constitute a categorical assertion that by randomizing, all other factors — both known and unknown — are equalized between the experimental and control groups; hence the only remaining difference is exactly that one group has been given the treatment under test, while the other has been given either a placebo or conventional therapy; and hence any observed difference in outcome between the two groups in a randomized trial (but only in a randomized trial) must be the effect of the treatment under test.

Clarke’s claim is repeated many times elsewhere and is widely believed. It is admirably clear and sharp, but it is clearly unsustainable … Clearly the claim taken literally is quite trivially false: the experimental group contains Mrs Brown and not Mr Smith, whereas the control group contains Mr Smith and not Mrs Brown, etc. Some restriction on the range of differences being considered is obviously implicit here; and presumably the real claim is something like that the two groups have the same means and distributions of all the [causally?] relevant factors. Although this sounds like a meaningful claim, I am not sure whether it would remain so under analysis … And certainly, even with respect to a given (finite) list of potentially relevant factors, no one can really believe that it automatically holds in the case of any particular randomized division of the subjects involved in the study. Although many commentators often seem to make the claim … no one seriously thinking about the issues can hold that randomization is a sufficient condition for there to be no difference between the two groups that may turn out to be relevant …

In sum, despite what is often said and written, no one can seriously believe that having randomized is a sufficient condition for a trial result to be reasonably supposed to reflect the true effect of some treatment. Is randomizing a necessary condition for this? That is, is it true that we cannot have real evidence that a treatment is genuinely effective unless it has been validated in a properly randomized trial? Again, some people in medicine sometimes talk as if this were the case, but again no one can seriously believe it. Indeed, as pointed out earlier, modern medicine would be in a terrible state if it were true. As already noted, the overwhelming majority of all treatments regarded as unambiguously effective by modern medicine today — from aspirin for mild headache through diuretics in heart failure and on to many surgical procedures — were never (and now, let us hope, never will be) ‘validated’ in an RCT.

John Worrall
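
Worrall's point is easy to check numerically. Here is a minimal simulation sketch in Python (the sample size, the covariate, and the seed are illustrative assumptions of mine): randomize a small trial once and inspect the baseline gap that a single division leaves behind.

    import numpy as np

    rng = np.random.default_rng(1)

    n = 40                                # subjects in a typical small trial
    age = rng.normal(60, 10, size=n)      # one prognostic baseline covariate

    # One actual randomization: assign half of the subjects to treatment.
    treated = rng.permutation(n) < n // 2
    gap = age[treated].mean() - age[~treated].mean()
    print(f"baseline age gap after randomizing once: {gap:.1f} years")

A single draw can easily leave the groups a few years apart on this one covariate, and the same lottery runs simultaneously for every other measured and unmeasured factor; balance is guaranteed only in expectation, not in the division actually drawn.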

For more on the question of ‘balance’ in randomized experiments, this collection of papers in Social Science & Medicine gives many valuable insights.

2 Comments »


  1. I wonder where the balancing fables about randomization originated. True, we can expect convergence to some kind of balance on average as the sizes of the randomized groups increase; but most randomized experiments I see in health and medical sciences are not large enough for that result to be invoked and on close inspection may display large random imbalances that confound the results and call for covariate adjustment (Greenland S, Robins JM, Pearl J. Confounding and collapsibility in causal inference. Statistical Science 1999, vol. 14, p. 29–46; see p. 35).
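
    A minimal sketch of that confounding-and-adjustment point in Python (the sample size, effect sizes, and seed are illustrative assumptions): a small trial whose single randomization may leave a prognostic covariate imbalanced, distorting the unadjusted contrast that covariate adjustment then corrects.

        import numpy as np

        rng = np.random.default_rng(3)

        n = 30                               # a small trial
        x = rng.normal(0.0, 1.0, size=n)     # prognostic baseline covariate
        treated = rng.permutation(n) < n // 2
        y = 1.0 * treated + 2.0 * x + rng.normal(0.0, 1.0, size=n)  # true effect = 1.0

        # The unadjusted contrast absorbs whatever imbalance in x this
        # particular randomization happened to produce ...
        print("unadjusted contrast:", y[treated].mean() - y[~treated].mean())

        # ... while an ANCOVA-style least-squares fit conditions on it.
        X = np.column_stack([np.ones(n), treated.astype(float), x])
        beta = np.linalg.lstsq(X, y, rcond=None)[0]
        print("covariate-adjusted effect:", beta[1])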

    I am unaware of work in which Fisher, perhaps the most influential proponent of randomized designs, said that the purpose of randomization was to balance covariates. Instead, its purpose was to provide a deductive basis for the distributions of statistics contrasting outcomes among treatment groups under various hypotheses or models (e.g., as in ANOVA). This is a frequentist rationale free of claims (including the erroneous ones quoted by Worrall) about balancing covariates. It continues to apply even if we adjust for observed imbalances across treatment groups, albeit now the distributions are conditional on those adjustments (e.g., as in ANCOVA). See for example Senn SJ. Baseline balance and valid statistical analyses: common misunderstandings. Applied Clinical Trials, March 2005, p. 25–27.
    https://www.appliedclinicaltrialsonline.com/view/baseline-balance-and-valid-statistical-analyses-common-misunderstandings
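
    A minimal sketch of that Fisherian rationale (the outcome values and the number of re-randomizations are illustrative assumptions): under the sharp null, the randomization itself supplies the reference distribution of the treatment-control contrast, with no balancing claim needed.

        import numpy as np

        rng = np.random.default_rng(2)

        # Toy outcomes from a 20-subject trial (illustrative numbers).
        y_treat = np.array([5.1, 6.0, 4.8, 7.2, 5.9, 6.3, 5.5, 6.8, 5.0, 6.1])
        y_ctrl = np.array([4.9, 5.2, 4.1, 5.8, 5.0, 4.7, 5.3, 5.6, 4.4, 5.1])
        y = np.concatenate([y_treat, y_ctrl])
        observed = y_treat.mean() - y_ctrl.mean()

        # Under Fisher's sharp null every re-randomization of these same
        # outcomes is equally likely; that alone fixes the null distribution.
        diffs = []
        for _ in range(10_000):
            mask = rng.permutation(y.size) < y_treat.size
            diffs.append(y[mask].mean() - y[~mask].mean())

        p = (np.abs(diffs) >= abs(observed)).mean()
        print(f"observed contrast {observed:.2f}, randomization p = {p:.3f}")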

    A Bayesian rationale free of balancing claims was given by Cornfield (Recent methodological contributions to clinical trials. American Journal of Epidemiology 1976, vol. 104, p. 408–424), who described randomization as an act that created exchangeable baseline priors: If one believes treatment was randomized, then, absent further information, one should have priors for the baseline covariate distributions in treatment groups that are exchangeable across the groups. As with the Fisherian rationale, it continues to apply even if we adjust for observed imbalances across treatment groups, albeit again the prior distributions rendered exchangeable are now conditional on those adjustments.

    • Sander, you (rhetorically) wonder where the balancing fables about randomization originated. Martinez & Teira show that the ‘fable’ is at least older than Fisher … Noticing that “Worrall’s assessment of randomization is now a mainstream view among philosophers of medicine” they argue his argument “seems to presuppose a Millean conception of experimental balance: for causal inference in a comparative experiment to be sound, all the antecedent causal factors (covariates) have a similar value in both groups, so that the intervention is the sole explanans of any difference in the outcome … Ronald Fisher’s original argument for randomization parted ways with Mill, focusing instead on the analysis of variance. Randomization, for Fisher, did not control for unknown factors guaranteeing a balanced distribution … The crucial difference is that, for Fisher, the analysis of variance allowed solid causal conclusions even if there was no Millean balance between covariates … The upshot of our analysis is that Worrall is right in showing that randomization does not provide a good warrant of experimental balance in Mill’s sense. But for both frequentist and Bayesian statisticians such understanding of balance is not necessary for causal inference, while randomization is not so easy to dispense with.”
      Beyond that, after re-reading several of Worrall’s articles, it seems to me that his concerns about ‘lacking balance’ (which I think he shares with Deaton, Cartwright, and Freedman) have to do with the fact that a single randomization, unlike the hypothetical ones we re-randomize indefinitely in theory, guarantees nothing when it comes to causality. And even if we could run indefinitely many randomizations, confounder ‘balance’ would still only be ‘guaranteed’ (in a probabilistic sense) in the population we run our experiments on, not in the target population we usually ultimately aim for.
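
      A minimal sketch of that distinction (the covariate values and sample size are illustrative assumptions): re-randomizing the same subjects indefinitely balances a covariate on average, yet any single draw can land far out in a wide distribution; and even that average concerns only the study sample, not the target population.

          import numpy as np

          rng = np.random.default_rng(4)

          n = 40
          x = rng.normal(60, 10, size=n)    # fixed study sample, one covariate

          # The hypothetical exercise: re-randomize the same subjects many times.
          gaps = []
          for _ in range(10_000):
              g = rng.permutation(n) < n // 2
              gaps.append(x[g].mean() - x[~g].mean())
          gaps = np.asarray(gaps)

          print(f"average imbalance across re-randomizations: {gaps.mean():.2f}")
          print(f"typical imbalance of a single draw (sd):    {gaps.std():.2f}")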

