Data-driven decisions: the case for randomised policy trials

Speech by Andrew Leigh: “…In 1747, 31-year-old Scottish naval surgeon James Lind set about determining the most effective treatment for scurvy, a disease that was killing thousands of sailors around the world. Selecting 12 sailors suffering from scurvy, Lind divided them into six pairs. Each pair received a different treatment: cider; sulphuric acid; vinegar; seawater; a concoction of nutmeg, garlic and mustard; and two oranges and a lemon. In less than a week, the pair who had received oranges and lemons were back on active duty, while the others languished. Given that sulphuric acid was the British Navy’s main treatment for scurvy, this was a crucial finding.

The trial provided robust evidence for the powers of citrus because it created a credible counterfactual. The sailors didn’t choose their treatments, nor were they assigned based on the severity of their ailment. Instead, they were randomly allocated, making it likely that differences in their recovery were due to the treatment rather than other characteristics.

Lind’s randomised trial, one of the first in history, has attained legendary status. Yet because 1747 was so long ago, it is easy to imagine that the methods he used are no longer applicable. After all, Lind’s research was conducted at a time before electricity, cars and trains, an era when slavery was rampant and education was reserved for the elite. Surely, some argue, ideas from such an age have been superseded today.

In place of randomised trials, some put their faith in ‘big data’. Between large-scale surveys and extensive administrative datasets, the world is awash in data as never before. Each day, hundreds of exabytes of data are produced. Big data has improved the accuracy of weather forecasts, permitted researchers to study social interactions across racial and ethnic lines, enabled the analysis of income mobility at a fine geographic scale and much more…”