Twitter Can Now Predict Crime, and This Raises Serious Questions


Motherboard: “Police departments in New York City may soon be using geo-tagged tweets to predict crime. It sounds like a far-fetched sci-fi scenario a la Minority Report, but when I contacted Dr. Matthew Greber, the University of Virginia researcher behind the technology, he explained that the system is far more mathematical than metaphysical.
The system Greber has devised is an amalgam of both old and new techniques. Currently, many police departments target hot spots for criminal activity based on actual occurrences of crime. This approach, called kernel density estimation (KDE), involves pairing a historical crime record with a geographic location and using a probability function to calculate the possibility of future crimes occurring in that area. While KDE is a serviceable approach to anticipating crime, it pales in comparison to the dynamism of Twitter’s real-time data stream, according to Dr. Gerber’s research paper “Predicting Crime Using Twitter and Kernel Density Estimation”.
Dr. Greber’s approach is similar to KDE, but deals in the ethereal realm of data and language, not paperwork. The system involves mapping the Twitter environment; much like how police currently map the physical environment with KDE. The big difference is that Greber is looking at what people are talking about in real time, as well as what they do after the fact, and seeing how well they match up. The algorithms look for certain language that is likely to indicate the imminent occurrence of a crime in the area, Greber says. “We might observe people talking about going out, getting drunk, going to bars, sporting events, and so on—we know that these sort of events correlate with crime, and that’s what the models are picking up on.”
Once this data is collected, the GPS tags in tweets allows Greber and his team to pin them to a virtual map and outline hot spots for potential crime. However, everyone who tweets about hitting the club later isn’t necessarily going to commit a crime. Greber tests the accuracy of his approach by comparing Twitter-based KDE predictions with traditional KDE predictions based on police data alone. The big question is, does it work? For Greber, the answer is a firm “sometimes.” “It helps for some, and it hurts for others,” he says.
According to the study’s results, Twitter-based KDE analysis yielded improvements in predictive accuracy over traditional KDE for stalking, criminal damage, and gambling. Arson, kidnapping, and intimidation, on the other hand, showed a decrease in accuracy from traditional KDE analysis. It’s not clear why these crimes are harder to predict using Twitter, but the study notes that the issue may lie with the kind of language used on Twitter, which is characterized by shorthand and informal language that can be difficult for algorithms to parse.
This kind of approach to high-tech crime prevention brings up the familiar debate over privacy and the use of users’ date for purposes they didn’t explicitly agree to. The case becomes especially sensitive when data will be used by police to track down criminals. On this point, though he acknowledges post-Snowden societal skepticism regarding data harvesting for state purposes, Greber is indifferent. “People sign up to have their tweets GPS tagged. It’s an opt-in thing, and if you don’t do it, your tweets won’t be collected in this way,” he says. “Twitter is a public service, and I think people are pretty aware of that.”…