Twitter Datastream Used to Predict Flu Outbreaks


arXivBlog: “The rate at which people post flu-related tweets could become a powerful tool in the battle to spot epidemics earlier, say computer scientists.

Back in 2008, Google launched its now famous flu trends website. It works on the hypothesis that people make more flu-related search queries when they are suffering from the illness than when they are healthy. So counting the number of flu-related search queries in a given country gives a good indication of how the virus is spreading.
The predictions are pretty good. The data generally closely matches that produced by government organisations such as the Centers for Disease Control and Prevention (CDC) in the US. Indeed, in some cases, it has been able to spot an incipient epidemic more than a week before the CDC.
That’s been hugely important. An early indication that the disease is spreading in a population gives governments a welcome head start in planning their response.
So an interesting question is whether other online services, in particular social media, can make similar or even better predictions. Today, we have an answer thanks to the work of Jiwei Li at Carnegie Mellon University in Pittsburgh, and Claire Cardie at Cornell University in New York State, who have been able to detect the early stages of an influenza outbreak using Twitter.
Their approach is in many ways similar to Google’s. They simply filter the Twitter datastream for flu-related tweets that are also geotagged. That allows them to create a map showing the distribution of these tweets and how it varies over time.
They also model the dynamics of the disease with some interesting subtleties. In the new model, a flu epidemic can be in one of four phases: a non-epidemic phase, a rising phase where numbers are increasing, a stationary phase and a declining phase where numbers are falling.
The new approach uses an algorithm that attempts to spot the switch from one phase to another as early as possible. Indeed, Li and Cardie test the effectiveness of their approach using a Twitter dataset of 3.6 million flu-related tweets from about 1 million people in the US between June 2008 and June 2010…
Ref: arxiv.org/abs/1309.7340: Early Stage Influenza Detection from Twitter”
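
The pipeline described in the excerpt (filter geotagged flu-related tweets, count them by place and time, then assign each period to one of four phases) can be sketched roughly as follows. This is an illustration only, not the method of Li and Cardie: the keyword list, the tweet fields (`text`, `state`, `week`), the function names and the thresholds are all assumptions made for the example, and the phase labelling is a simple threshold heuristic standing in for the paper's model.

```python
from collections import Counter

# Hypothetical keyword list; the excerpt does not say which terms the authors used.
FLU_KEYWORDS = ("flu", "influenza", "fever", "cough")


def is_flu_related(text):
    """Return True if any word in the tweet matches a flu keyword."""
    words = text.lower().split()
    return any(w.strip(".,!?#") in FLU_KEYWORDS for w in words)


def weekly_counts(tweets):
    """Count flu-related, geotagged tweets per (state, week).

    Each tweet is assumed to be a dict with 'text', 'state' and 'week' keys;
    tweets whose 'state' is None are treated as not geotagged and skipped.
    """
    counts = Counter()
    for tweet in tweets:
        if tweet.get("state") is None:
            continue
        if is_flu_related(tweet["text"]):
            counts[(tweet["state"], tweet["week"])] += 1
    return counts


def label_phases(series, baseline=2, tolerance=1):
    """Label each weekly count with one of the four phases.

    Assumed heuristic: counts at or below `baseline` are non-epidemic;
    above it, a week-on-week change larger than `tolerance` is rising or
    declining, and anything smaller is stationary.
    """
    phases = []
    prev = None
    for n in series:
        reference = baseline if prev is None else prev
        change = n - reference
        if n <= baseline:
            phases.append("non-epidemic")
        elif change > tolerance:
            phases.append("rising")
        elif change < -tolerance:
            phases.append("declining")
        else:
            phases.append("stationary")
        prev = n
    return phases
```

Calling `weekly_counts` on a list of tweet dicts gives a per-state, per-week tally that could feed a map, and `label_phases` applied to one state's weekly series marks where the label changes from one phase to the next. A real detector would need to handle noise and reporting delays, which this threshold rule ignores.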