One Data Point Can Beat Big Data


Essay by Gerd Gigerenzer: “…In my research group at the Max Planck Institute for Human Development, we’ve studied simple algorithms (heuristics) that perform well under volatile conditions. One way to derive these rules is to rely on psychological AI: to investigate how the human brain deals with situations of disruption and change. Back in 1838, for instance, Thomas Brown formulated the Law of Recency, which states that recent experiences come to mind faster than those in the distant past and are often the sole information that guides human decision. Contemporary research indicates that people do not automatically rely on what they recently experienced, but only do so in unstable situations where the distant past is not a reliable guide for the future. In this spirit, my colleagues and I developed and tested the following “brain algorithm”:

Recency heuristic for predicting the flu: Predict that this week’s proportion of flu-related doctor visits will equal the proportion in the most recent data, from one week ago.
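The rule as stated can be written in a few lines. The following is only a minimal sketch, not the authors' code: the function name `recency_forecast` and the sample numbers are hypothetical, chosen for illustration, and the values are not real surveillance data.

```python
from typing import Sequence


def recency_forecast(history: Sequence[float]) -> float:
    """Predict this week's value as the most recently observed value.

    `history` holds past weekly proportions of flu-related doctor visits
    (in percent), ordered from oldest to newest.
    """
    if not history:
        raise ValueError("need at least one past observation")
    # The heuristic ignores everything except the last data point.
    return history[-1]


# Made-up weekly percentages for illustration only (not CDC data).
past_weeks = [1.2, 1.5, 2.1]
print(recency_forecast(past_weeks))  # 2.1
```

The whole "model" is a lookup of the last element, which is what makes the rule transparent: there is nothing to fit and nothing to tune.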

Unlike Google’s secret Flu Trends algorithm, this rule is transparent and can be easily applied by everyone. Its logic can be understood. It relies on a single data point only, which can be looked up on the website of the Centers for Disease Control and Prevention. And it dispenses with combing through 50 million search terms and trial-and-error testing of millions of algorithms. But how well does it actually predict the flu?

Three fellow researchers and I tested the recency rule using the same eight years of data on which the Google Flu Trends algorithm was tested, that is, weekly observations between March 2007 and August 2015. During that time, the proportion of flu-related visits among all doctor visits ranged between one percent and eight percent, with an average of 1.8 percent per week (Figure 1). This means that if every week you were to make the simple but false prediction that there are zero flu-related doctor visits, you would have a mean absolute error of 1.8 percentage points over that period. Google Flu Trends predicted much better than that, with a mean absolute error of 0.38 percentage points (Figure 2). The recency heuristic had a mean absolute error of only 0.20 percentage points, which is even better. If we exclude the period when the swine flu occurred, that is, before the first update of Google Flu Trends, the result remains essentially the same (0.38 and 0.19, respectively)…”
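To make the comparison concrete, here is a rough sketch of how a mean absolute error could be computed for the recency rule and for the "always predict zero" baseline. It is an illustration under stated assumptions, not the authors' evaluation: the helper names are hypothetical and the short `weekly` series is invented toy data, so the resulting numbers say nothing about the real results quoted above.

```python
from typing import List, Sequence


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of |actual - predicted| across all weeks."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def recency_predictions(series: Sequence[float]) -> List[float]:
    """Forecasts for weeks 2..n: each one is the previous week's value."""
    return list(series[:-1])


# Invented toy series of weekly percentages (NOT real flu data).
weekly = [1.0, 2.0, 4.0, 3.0, 2.0]

actual = weekly[1:]                    # weeks that have a forecast
recency = recency_predictions(weekly)  # last week's value, shifted forward
zero = [0.0] * len(actual)             # baseline: always predict zero

print(mean_absolute_error(actual, recency))  # 1.25
print(mean_absolute_error(actual, zero))     # 2.75
```

Note that the error of the zero baseline is simply the mean of the observed values, which is why an average of 1.8 percent translates directly into a mean absolute error of 1.8 percentage points in the passage above.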