Let’s Take the Con Out of Randomized Control Trials in Development: The Puzzles and Paradoxes of External Validity, Empirically Illustrated
CID Faculty Working Paper No. 399
The enthusiasm for the potential of RCTs in development rests in part on the assumption that the use of the rigorous evidence that emerges from an RCT (or from a small set of studies identified as rigorous in a “systematic” review) leads to the adoption of more effective policies, programs, or projects. However, the supposed benefits of using rigorous evidence for “evidence based” policy making depend critically on the extent to which there is external validity. If estimates of causal impact or treatment effects that have internal validity (are unbiased) in one context (where the relevant “context” could be country, region, implementing organization, complementary policies, initial conditions, etc.) cannot be applied to another context, then applying evidence that is rigorous in one context may actually reduce predictive accuracy in other contexts relative to simple evidence from that context—even if that evidence is biased (Pritchett and Sandefur 2015).

Using empirical estimates from a large number of developing countries of the difference in student learning in public and private schools (just as one potential policy application), I show that commonly made assumptions about external validity are, in the face of the actual observed heterogeneity across contexts, both logically incoherent and empirically unhelpful. Logically incoherent, in that it is impossible to reconcile general claims about the external validity of rigorous estimates of causal impact with the heterogeneity of the raw facts about differentials. Empirically unhelpful, in that applying a single rigorous estimate (or a small set of them) to all other contexts actually leads to a larger root mean square error (RMSE) of prediction of the “true” causal impact across contexts than just using the estimates from non-experimental data from each country.
In the data on private and public schools, under plausible assumptions, exclusive reliance on the rigorous evidence yields an RMSE three times larger than that of simply using the biased OLS estimate from each context. In making policy decisions one needs to rely on an understanding of the relevant phenomena that encompasses all of the available evidence.
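The abstract's core comparison can be sketched with simulated data. This is a minimal illustration of the logic, not the paper's data or estimates: all distributions, the bias magnitude, and the heterogeneity of "true" effects are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumption: the true causal effect of private schooling varies widely
# across contexts (heterogeneous treatment effects).
n_contexts = 50
true_effects = rng.normal(loc=0.3, scale=0.5, size=n_contexts)

# Assumption: each context's OLS estimate is biased (e.g., by selection)
# but tracks the local truth closely.
bias = 0.2
ols_estimates = true_effects + bias + rng.normal(0, 0.05, size=n_contexts)

# Strategy A: take one internally valid (unbiased) estimate from a single
# context and extrapolate it to every other context.
rct_estimate = true_effects[0]  # unbiased in its own context
rmse_rct = np.sqrt(np.mean((rct_estimate - true_effects) ** 2))

# Strategy B: use each context's own biased OLS estimate.
rmse_ols = np.sqrt(np.mean((ols_estimates - true_effects) ** 2))

print(f"RMSE, single rigorous estimate extrapolated: {rmse_rct:.3f}")
print(f"RMSE, biased context-specific OLS:           {rmse_ols:.3f}")
```

Under these assumed parameters the extrapolated rigorous estimate produces a larger RMSE than the biased local estimates, because cross-context heterogeneity in the true effect swamps the (assumed uniform) OLS bias; the reverse would hold if effects were homogeneous and the bias large.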
Affiliated Program: Building State Capability