How the USGS uses Twitter data to track earthquakes

Twitter Blog: “After the disastrous Sichuan earthquake in 2008, people turned to Twitter to share firsthand information about the earthquake. What amazed many was the impression that Twitter was faster at reporting the earthquake than the U.S. Geological Survey (USGS), the official government organization in charge of tracking such events.

This Twitter activity wasn’t a big surprise to the USGS. The USGS National Earthquake Information Center (NEIC) processes about 2,000 realtime earthquake sensors, with the majority based in the United States. That leaves a lot of empty space in the world with no sensors. On the other hand, there are hundreds of millions of people using Twitter who can report earthquakes. At first, the USGS staff was a bit skeptical that Twitter could be used as a detection system for earthquakes – but when they looked into it, they were surprised at the effectiveness of Twitter data for detection.

USGS staffers Paul Earle, a seismologist, and Michelle Guy, a software developer, teamed up to look at how Twitter data could be used for earthquake detection and verification. By using Twitter’s Public API, they decided to use the same time series event detection method they use when detecting earthquakes. This gave them a baseline for earthquake-related chatter, but they decided to dig in even further. They found that people Tweeting about actual earthquakes kept their Tweets really short, even just to ask, “earthquake?” Concluding that people who are experiencing earthquakes aren’t very chatty, they started filtering out Tweets with more than seven words. They also recognized that people sharing links or the size of the earthquake were significantly less likely to be offering firsthand reports, so they filtered out any Tweets sharing a link or a number. Ultimately, this filtered stream proved to be very significant at determining when earthquakes occurred globally.

USGS Modeling Twitter Data to Detect Earthquakes

While I was at the USGS office in Golden, Colo. interviewing Michelle and Paul, three earthquakes happened in a relatively short time. Using Twitter data, their system was able to pick up on an aftershock in Chile within one minute and 20 seconds – and it only took 14 Tweets from the filtered stream to trigger an email alert. The other two earthquakes, off Easter Island and Indonesia, weren’t picked up because they were not widely felt…..

The USGS monitors for earthquakes in many languages, and the words used can be a clue as to the magnitude and location of the earthquake. Chile has two words for earthquakes: terremotoand temblor; terremoto is used to indicate a bigger quake. This one in Chile started with people asking if it was a terremoto, but others realizing that it was a temblor.

As the USGS team notes, Twitter data augments their own detection work on felt earthquakes. If they’re getting reports of an earthquake in a populated area but no Tweets from there, that’s a good indicator to them that it’s a false alarm. It’s also very cost effective for the USGS, because they use Twitter’s Public API and open-source software such as Kibana and ElasticSearch to help determine when earthquakes occur….(More)”