An early warning approach to monitor COVID-19 activity with multiple digital traces in near real time


Paper by Nicole E. Kogan et al.: “We propose that several digital data sources may provide earlier indication of epidemic spread than traditional COVID-19 metrics such as confirmed cases or deaths. Six such sources are examined here: (i) Google Trends patterns for a suite of COVID-19–related terms; (ii) COVID-19–related Twitter activity; (iii) COVID-19–related clinician searches from UpToDate; (iv) predictions by the global epidemic and mobility model (GLEAM), a state-of-the-art metapopulation mechanistic model; (v) anonymized and aggregated human mobility data from smartphones; and (vi) Kinsa smart thermometer measurements.

We first evaluate each of these “proxies” of COVID-19 activity for their lead or lag relative to traditional measures of COVID-19 activity: confirmed cases, deaths attributed, and ILI. We then propose the use of a metric combining these data sources into a multiproxy estimate of the probability of an impending COVID-19 outbreak. Last, we develop probabilistic estimates of when such a COVID-19 outbreak will occur on the basis of multiproxy variability. These outbreak-timing predictions are made for two separate time periods: the first, a “training” period, from 1 March to 31 May 2020, and the second, a “validation” period, from 1 June to 30 September 2020. Consistent predictive behavior among proxies in both of these subsequent and nonoverlapping time periods would increase the confidence that they may capture future changes in the trajectory of COVID-19 activity…”
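To make the idea of a proxy's "lead or lag" concrete, here is a minimal sketch of one common way to estimate it: shift one series against the other and keep the shift with the highest Pearson correlation. This is not the authors' method, which the excerpt does not describe in detail. The function name `best_lead`, the 21-day search window, and the synthetic series (a bell-shaped "outbreak" with a built-in 7-day delay) are all assumptions made for illustration; no real COVID-19 data is used.

```python
import numpy as np

def best_lead(proxy, target, max_lag):
    """Return (lag, corr) maximizing the Pearson correlation between
    proxy[t] and target[t + lag]. A positive lag means the proxy moves
    first, i.e. it leads the target by `lag` time steps.
    Hypothetical helper, not taken from the paper."""
    proxy = np.asarray(proxy, dtype=float)
    target = np.asarray(target, dtype=float)
    best_lag, best_corr = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            p, t = proxy[:len(proxy) - lag], target[lag:]
        else:
            p, t = proxy[-lag:], target[:len(target) + lag]
        corr = np.corrcoef(p, t)[0, 1]
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return best_lag, best_corr

# Synthetic data (assumption): the "target" is the same bell-shaped
# curve as the "proxy", delayed by 7 days, each with small noise.
rng = np.random.default_rng(0)
days = np.arange(120)
signal = np.exp(-((days - 60) / 12.0) ** 2)
proxy = signal + 0.01 * rng.normal(size=days.size)
target = np.roll(signal, 7) + 0.01 * rng.normal(size=days.size)

lag, corr = best_lead(proxy, target, max_lag=21)
print(f"estimated lead: {lag} days (correlation {corr:.3f})")
```

On this synthetic input the estimated lead should land at or very near the 7-day delay that was built in. Real proxies are noisier and non-stationary, so a single best-correlation lag is only a rough first look rather than a probabilistic outbreak-timing estimate of the kind the paper proposes.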