The Good Judgment Project: Harnessing the Wisdom of the Crowd to Forecast World Events


The Economist: “But then comes the challenge of generating real insight into forecasting accuracy. How can one compare forecasting ability?
The only reliable method is to conduct a forecasting tournament in which independent judges ask all participants to make the same forecasts in the same timeframes. And forecasts must be expressed numerically, so there can be no hiding behind vague verbiage. Words like “may” or “possible” can mean anything from probabilities as low as 0.001% to as high as 60% or 70%. But 80% always and only means 80%.
In the late 1980s one of us (Philip Tetlock) launched such a tournament. It involved 284 economists, political scientists, intelligence analysts and journalists and collected almost 28,000 predictions. The results were startling. The average expert did only slightly better than random guessing. Even more disconcerting, experts with the most inflated views of their own batting averages tended to attract the most media attention. Their more self-effacing colleagues, the ones we should be heeding, often don’t get on to our radar screens.
That project proved to be a pilot for a far more ambitious tournament currently sponsored by the Intelligence Advanced Research Projects Activity (IARPA), part of the American intelligence world. Over 5,000 forecasters have made more than 1m forecasts on more than 250 questions, from euro-zone exits to the Syrian civil war. Results are pouring in and they are revealing. We can discover who has better batting averages, not take it on faith; discover which methods of training promote accuracy, not just track the latest gurus and fads; and discover methods of distilling the wisdom of the crowd.
The big surprise has been the support for the unabashedly elitist “super-forecaster” hypothesis. The top 2% of forecasters in Year 1 showed that there is more than luck at play. If it were just luck, the “supers” would regress to the mean: yesterday’s champs would be today’s chumps. But they actually got better. When we randomly assigned “supers” into elite teams, they blew the lid off IARPA’s performance goals. They beat the unweighted average (wisdom-of-overall-crowd) by 65%; beat the best algorithms of four competitor institutions by 35-60%; and beat two prediction markets by 20-35%.
Over to you
To avoid slipping back to business as usual—believing we know things that we don’t—more tournaments in more fields are needed, and more forecasters. So we invite you, our readers, to join the 2014-15 round of the IARPA tournament. Current questions include: Will America and the EU reach a trade deal? Will Turkey get a new constitution? Will talks on North Korea’s nuclear programme resume? To volunteer, go to the tournament’s website at www.goodjudgmentproject.com. We predict with 80% confidence that at least 70% of you will enjoy it—and we are 90% confident that at least 50% of you will beat our dart-throwing chimps.”
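The regression-to-the-mean argument in the excerpt can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the tournament's actual analysis: the function `simulate`, the Gaussian "skill plus noise" model and the arbitrary score units are all hypothetical choices made for illustration. Each simulated forecaster gets a persistent skill level plus independent luck in each of two years; the top 2% are picked on Year 1 scores and then checked again in Year 2.

```python
import random
import statistics


def simulate(skill_sd, noise_sd=1.0, n_forecasters=5000, top_fraction=0.02, seed=0):
    """Return (year-1 mean, year-2 mean) score of the year-1 top group.

    Hypothetical model: score = persistent skill + yearly luck.
    Scores are in arbitrary units with a population mean of 0;
    higher means more accurate.
    """
    rng = random.Random(seed)
    # Persistent skill per forecaster; skill_sd=0 means "it is all luck".
    skills = [rng.gauss(0.0, skill_sd) for _ in range(n_forecasters)]
    # Each year adds fresh, independent luck to the same underlying skill.
    year1 = [s + rng.gauss(0.0, noise_sd) for s in skills]
    year2 = [s + rng.gauss(0.0, noise_sd) for s in skills]
    # Select the top fraction using year-1 scores only.
    n_top = max(1, int(n_forecasters * top_fraction))
    ranked = sorted(range(n_forecasters), key=lambda i: year1[i], reverse=True)
    top = ranked[:n_top]
    return (
        statistics.fmean(year1[i] for i in top),
        statistics.fmean(year2[i] for i in top),
    )


if __name__ == "__main__":
    luck_y1, luck_y2 = simulate(skill_sd=0.0)    # pure-luck world
    skill_y1, skill_y2 = simulate(skill_sd=1.0)  # skill matters
    print(f"pure luck:   year 1 = {luck_y1:.2f}, year 2 = {luck_y2:.2f}")
    print(f"with skill:  year 1 = {skill_y1:.2f}, year 2 = {skill_y2:.2f}")
```

In the pure-luck run the Year 1 champions fall back to roughly the population average in Year 2, while in the run with persistent skill they stay well above it. Note that even this skill model shows partial regression; the excerpt reports that the real "supers" actually improved, which this simple sketch does not attempt to reproduce.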
See also https://web.archive.org/web/2013/http://www.iarpa.gov/Programs/ia/ACE/ace.html