Jonathan Blum, Principal Deputy Administrator, Centers for Medicare & Medicaid Services: “Today the Centers for Medicare & Medicaid Services (CMS) took a major step forward in making Medicare data more transparent and accessible, while maintaining the privacy of beneficiaries, by announcing the release of new data on medical services and procedures furnished to Medicare fee-for-service beneficiaries by physicians and other healthcare professionals (http://www.cms.gov/newsroom/newsroom-center.html). For too long, the only information on physicians readily available to consumers was physician name, address and phone number. This data will, for the first time, provide a better picture of how physicians practice in the Medicare program.
This new data set includes over nine million rows of data on more than 880,000 physicians and other healthcare professionals in all 50 states, DC and Puerto Rico providing care to Medicare beneficiaries in 2012. The data set presents key information on the provision of services by physicians and how much they are paid for those services, and is organized by provider (National Provider Identifier or NPI), type of service (Healthcare Common Procedure Coding System, or HCPCS) code, and whether the service was performed in a facility or office setting. This public data set includes the number of services, average submitted charges, average allowed amount, average Medicare payment, and a count of unique beneficiaries treated. CMS takes beneficiary privacy very seriously and we will protect patient-identifiable information by redacting any data in cases where it includes fewer than 11 beneficiaries.
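The row layout and the privacy rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names, the `ServiceRow` type and the `publishable_rows` helper are hypothetical and are not the column names or tooling CMS uses, and "redacting" is interpreted here as dropping the whole row, which the announcement does not specify. The only figure taken from the text is the threshold of 11 beneficiaries.

```python
from dataclasses import dataclass
from typing import List

# Threshold stated in the announcement: data covering fewer than
# 11 beneficiaries is redacted.
MIN_BENEFICIARIES = 11


@dataclass(frozen=True)
class ServiceRow:
    """One row: a provider, a service code and a place of service.

    All field names are hypothetical, chosen to mirror the description
    in the text rather than the published file.
    """
    npi: str                     # National Provider Identifier
    hcpcs_code: str              # type of service
    place_of_service: str        # "facility" or "office"
    service_count: int
    beneficiary_count: int       # unique beneficiaries treated
    avg_submitted_charge: float
    avg_allowed_amount: float
    avg_medicare_payment: float


def publishable_rows(rows: List[ServiceRow]) -> List[ServiceRow]:
    """Keep only rows that cover at least MIN_BENEFICIARIES patients.

    Assumption: redaction is modelled as removing the row entirely.
    """
    return [r for r in rows if r.beneficiary_count >= MIN_BENEFICIARIES]


# Made-up sample rows, for illustration only.
sample = [
    ServiceRow("1000000001", "99213", "office", 40, 10, 120.0, 70.0, 55.0),
    ServiceRow("1000000001", "99214", "office", 60, 11, 180.0, 105.0, 82.0),
    ServiceRow("1000000002", "99213", "facility", 300, 150, 110.0, 52.0, 40.0),
]

released = publishable_rows(sample)
```

In this sketch the first sample row, with 10 beneficiaries, is withheld, while the rows with 11 and 150 beneficiaries are kept.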
Previously, CMS could not release this information due to a permanent injunction issued by a court in 1979. However, in May 2013, the court vacated this injunction, causing a series of events that has led CMS to be able to make this information available for the first time.
Data to Fuel Research and Innovation
In addition to the public data release, CMS is making slight modifications to the process to request CMS data for research purposes. This will allow researchers to conduct important research at the physician level. As with the public release of information described above, CMS will continue to prohibit the release of patient-identifiable information. For more information about CMS’s disclosures to researchers, please contact the Research Data Assistance Center (ResDAC) at http://www.resdac.org/.
Unprecedented Data Access
This data release follows other CMS efforts to make more data available to the public. Since 2010, the agency has released an unprecedented amount of aggregated data in machine-readable form, with much of it available at http://www.healthdata.gov. These data range from previously unpublished statistics on Medicare spending, utilization, and quality at the state, hospital referral region, and county level, to detailed information on the quality performance of hospitals, nursing homes, and other providers.
In May 2013, CMS released information on the average charges for the 100 most common inpatient services at more than 3,000 hospitals nationwide http://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/Medicare-Provider-Charge-Data/Inpatient.html.
In June 2013, CMS released average charges for 30 selected outpatient procedures http://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/Medicare-Provider-Charge-Data/Outpatient.html.
We will continue to work toward harnessing the power of data to promote quality and value, and improve the health of our seniors and persons with disabilities.”
Medicare to Publish Trove of Data on Doctors
Louise Radnofsky in the Wall Street Journal: “The Obama administration said it would publish as early as next week data on what Medicare paid individual doctors in 2012, aiming to boost transparency and help root out fraud.
The move, which faced fierce resistance from doctors’ groups, would end a decadeslong block on making the information public.
Federal officials said they planned to release reimbursement information on April 9 or soon after that would show billing data for 880,000 health-care providers treating patients in the government-run insurance program for elderly and disabled people. It will include how many times the providers carried out a particular service or procedure, whether they carried it out in a medical facility or an office setting, the average amount they charged Medicare for it, the average amount they were paid for it, and the total number of people they treated.
The data set would show the names and addresses of the providers in connection with their reimbursement information, officials at the Centers for Medicare and Medicaid Services said. The agency hasn’t previously released such data.
Physicians’ organizations had sought to prevent the release of the data, citing concerns about physician privacy. But a federal judge last year lifted a long-standing injunction placed on the publication of the information by a federal court in Florida, in response to a challenge from Dow Jones & Co., The Wall Street Journal’s parent company.
Jonathan Blum, principal deputy administrator at CMS, informed the American Medical Association and Florida Medical Association in letters dated Wednesday that the agency would move to publish the data soon.
Ardis Dee Hoven, president of the American Medical Association, said the group remained concerned that CMS was taking a “broad approach” that could result in “unwarranted bias against physicians that can destroy careers.” Dr. Hoven said the AMA wanted doctors to be able to review and correct their information before the data set was published. The Florida Medical Association couldn’t immediately be reached.
Mr. Blum said that for privacy reasons, data related to subsets of fewer than 11 Medicare patients would be redacted.
In the letters, Mr. Blum said the agency believed that news organizations seeking the information—which include the Journal—would be able to use it to shed light on problems in the Medicare program. He also specifically cited earlier reporting by the Journal that had drawn on similar data.
“The Department concluded that the data to be released would assist the public’s understanding of Medicare fraud, waste, and abuse, as well as shed light on payments to physicians for services furnished to Medicare beneficiaries,” Mr. Blum wrote. “As an example, using similar payment information, The Wall Street Journal was able to identify and report on a number of instances of Medicare fraud, waste, and abuse, using Medicare payment data in its Secrets of the System series,” Mr. Blum wrote. That series was a finalist for a Pulitzer Prize in 2011.”
Politics and the Internet
Edited book by William H. Dutton (Routledge – 2014 – 1,888 pages): “It is commonplace to observe that the Internet, and the dizzying technologies and applications which it continues to spawn, has revolutionized human communications. But, while the medium’s impact has apparently been immense, the nature of its political implications remains highly contested. To give but a few examples, the impact of networked individuals and institutions has prompted serious scholarly debates in political science and related disciplines on: the evolution of ‘e-government’ and ‘e-politics’ (especially after recent US presidential campaigns); electronic voting and other citizen participation; activism; privacy and surveillance; and the regulation and governance of cyberspace.
As research in and around politics and the Internet flourishes as never before, this new four-volume collection from Routledge’s acclaimed Critical Concepts in Political Science series meets the need for an authoritative reference work to make sense of a rapidly growing—and ever more complex—corpus of literature. Edited by William H. Dutton, Director of the Oxford Internet Institute (OII), the collection gathers foundational and canonical work, together with innovative and cutting-edge applications and interventions.
With a full index and comprehensive bibliographies, together with a new introduction by the editor, which places the collected material in its historical and intellectual context, Politics and the Internet is an essential work of reference. The collection will be particularly useful as a database allowing scattered and often fugitive material to be easily located. It will also be welcomed as a crucial tool permitting rapid access to less familiar—and sometimes overlooked—texts. For researchers, students, practitioners, and policy-makers, it is a vital one-stop research and pedagogic resource.”
Smart cities are here today — and getting smarter
Computer World: “Smart cities aren’t a science fiction, far-off-in-the-future concept. They’re here today, with municipal governments already using technologies that include wireless networks, big data/analytics, mobile applications, Web portals, social media, sensors/tracking products and other tools.
These smart city efforts have lofty goals: Enhancing the quality of life for citizens, improving government processes and reducing energy consumption, among others. Indeed, cities are already seeing some tangible benefits.
But creating a smart city comes with daunting challenges, including the need to provide effective data security and privacy, and to ensure that myriad departments work in harmony.
What makes a city smart? As with any buzz term, the definition varies. But in general, it refers to using information and communications technologies to deliver sustainable economic development and a higher quality of life, while engaging citizens and effectively managing natural resources.
Making cities smarter will become increasingly important. For the first time ever, the majority of the world’s population resides in a city, and this proportion continues to grow, according to the World Health Organization, the coordinating authority for health within the United Nations.
A hundred years ago, two out of every 10 people lived in an urban area, the organization says. As recently as 1990, less than 40% of the global population lived in a city — but by 2010 more than half of all people lived in an urban area. By 2050, the proportion of city dwellers is expected to rise to 70%.
As many city populations continue to grow, here’s what five U.S. cities are doing to help manage it all:
Scottsdale, Ariz.
The city of Scottsdale, Ariz., has several initiatives underway.
One is MyScottsdale, a mobile application the city deployed in the summer of 2013 that allows citizens to report cracked sidewalks, broken street lights and traffic lights, road and sewer issues, graffiti and other problems in the community….”
Visualizing Health IT: A holistic overview
Andy Oram in O’Reilly Data: “There is no dearth of health reformers offering their visions for patient engagement, information exchange, better public health, and disruptive change to health industries. But they often accept too freely the promise of technology, without grasping how difficult the technical implementations of their reforms would be. Furthermore, no document I have found pulls together the various trends in technology and explores their interrelationships.
I have tried to fill this gap with a recently released report: The Information Technology Fix for Health: Barriers and Pathways to the Use of Information Technology for Better Health Care. This posting describes some of the issues it covers.
Take a basic example: fitness devices. Lots of health reformers would love to see these pulled into treatment plans to help people overcome hypertension and other serious conditions. It’s hard to understand the factors that make doctors reluctant to do so–blind conservatism is not the problem, but actual technical factors. To become part of treatment plans, the accuracy of devices would have to be validated, they would need to produce data in formats and units that are universally recognized, and electronic records would have to undergo major upgrades to store and process the data.
Another example is patient engagement, which doctors and hospitals are furiously pursuing. Not only are patients becoming choosier and rating their institutions publicly in Yelp-like fashion, but the clinicians have come to realize that engaged patients are more likely to participate in developing effective treatment plans, not to mention following through on them.
Engaging patients to improve their own outcomes directly affects the institutions’ bottom lines as insurers and the government move from paying for each procedure to pay-per-value (a fixed sum for handling a group of patients that share a health condition). But what data do we need to make pay-per-value fair and accurate? How do we get that data from one place to another, and–much more difficult–out of one ungainly proprietary format and possibly into others? The answer emerging among activists to these questions is: leave the data under the control of the patients, and let them share it as they find appropriate.
Collaboration may be touted even more than patient engagement as the way to better health. And who wouldn’t want his cardiologist to be consulting with his oncologist, nutritionist, and physical therapist? It doesn’t happen as much as it should, and while picking up the phone may be critical sometimes to making the right decisions, electronic media can also be of crucial value. Once again, we have to overcome technical barriers.
The Information Technology Fix for Health report divides these issues into four umbrella categories:
- Devices, sensors, and patient monitoring
- Using data: records, public data sets, and research
- Coordinated care: teams and telehealth
- Patient empowerment
Underlying all these as a kind of vast subterranean network of interconnected roots are electronic health records (EHRs). These must function well in order for devices to send output to the interested observers, researchers to collect data, and teams to coordinate care. The article delves into the messy and often ugly area of formats and information exchange, along with issues of privacy. I extol once again the virtue of patient control over records and suggest how we could overcome all barriers to make that happen.”
The GovLab Index: Privacy and Security
Please find below the latest installment in The GovLab Index series, inspired by the Harper’s Index. “The GovLab Index: Privacy and Security” examines the attitudes and concerns of American citizens regarding online privacy. Previous installments include Designing for Behavior Change, The Networked Public, Measuring Impact with Evidence, Open Data, The Data Universe, Participation and Civic Engagement and Trust in Institutions.
Globally
- Percentage of people who feel the Internet is eroding their personal privacy: 56%
- Internet users who feel comfortable sharing personal data with an app: 37%
- Number of users who consider it important to know when an app is gathering information about them: 70%
- How many people in the online world use privacy tools to disguise their identity or location: 28%, or 415 million people
- Country with the highest penetration of general anonymity tools among Internet users: Indonesia, where 42% of users surveyed use proxy servers
- Percentage of China’s online population that disguises their online location to bypass governmental filters: 34%
In the United States
Over the Years
- In 1996, percentage of the American public who were categorized as having “high privacy concerns”: 25%
- Those with “Medium privacy concerns”: 59%
- Those who were unconcerned with privacy: 16%
- In 1998, number of computer users concerned about threats to personal privacy: 87%
- In 2001, those who reported “medium to high” privacy concerns: 88%
- Individuals who are unconcerned about privacy: 18% in 1990, down to 10% in 2004
- How many online American adults are more concerned about their privacy in 2014 than they were a year ago, indicating rising privacy concerns: 64%
- Number of respondents in 2012 who believe they have control over their personal information: 35%, downward trend for 7 years
- How many respondents in 2012 continue to perceive privacy and the protection of their personal information as very important or important to the overall trust equation: 78%, upward trend for seven years
- How many consumers in 2013 trust that their bank is committed to ensuring the privacy of their personal information is protected: 35%, down from 48% in 2004
Privacy Concerns and Beliefs
- How many Internet users worry about their privacy online: 92%
- Those who report that their level of concern has increased from 2013 to 2014: 7 in 10
- How many are at least sometimes worried when shopping online: 93%, up from 89% in 2012
- Those who have some concerns when banking online: 90%, up from 86% in 2012
- Number of Internet users who are worried about the amount of personal information about them online: 50%, up from 33% in 2009
- Those who report that their photograph is available online: 66%
- Their birthdate: 50%
- Home address: 30%
- Cell number: 24%
- A video: 21%
- Political affiliation: 20%
- Consumers who are concerned about companies tracking their activities: 58%
- Those who are concerned about the government tracking their activities: 38%
- How many users surveyed felt that the National Security Agency (NSA) overstepped its bounds in light of recent NSA revelations: 44%
- Respondents who are comfortable with advertisers using their web browsing history to tailor advertisements as long as it is not tied to any other personally identifiable information: 36%, up from 29% in 2012
- Percentage of voters who do not want political campaigns to tailor their advertisements based on their interests: 86%
- Percentage of respondents who do not want news tailored to their interests: 56%
- Percentage of users who are worried that their information will be stolen by hackers: 75%
- Those who are worried about companies tracking their browsing history for targeted advertising: 54%
- How many consumers say they do not trust businesses with their personal information online: 54%
- Top 3 most trusted companies for privacy identified by consumers from across 25 different industries in 2012: American Express, Hewlett Packard and Amazon
- Most trusted industries for privacy: Healthcare, Consumer Products and Banking
- Least trusted industries for privacy: Internet and Social Media, Non-Profits and Toys
- Respondents who admit to sharing their personal information with companies they did not trust in 2012 for reasons such as convenience when making a purchase: 63%
- Percentage of users who say they prefer free online services supported by targeted ads: 61%
- Those who prefer paid online services without targeted ads: 33%
- How many Internet users believe that it is not possible to be completely anonymous online: 59%
- Those who believe complete online anonymity is still possible: 37%
- Those who say people should have the ability to use the Internet anonymously: 59%
- Percentage of Internet users who believe that current laws are not good enough in protecting people’s privacy online: 68%
- Those who believe current laws provide reasonable protection: 24%
FULL LIST at http://thegovlab.org/the-govlab-index-privacy-and-trust/
Artists Show How Anyone Can Fight the Man with Open Data
MotherBoard: “The UK’s Open Data Institute usually looks, as you’d probably expect, like an office full of people staring at screens. But visit at the moment and you might see a potato gun among the desks or a bunch of drone photos on the wall—all in the name of encouraging public discussion around and engagement with open data.
The ODI was set up by World Wide Web inventor Tim Berners-Lee and interdisciplinary researcher Nigel Shadbolt in London to push for an open data culture, and from Monday it will be hosting the second Data as Culture exhibition, which presents a more artistic take on questions surrounding the practicalities of open data. In doing so, it shows quite how the general public can (and probably really should) use data to inform their own lives and to engage with political issues.
All of the exhibits are based on freely available data, which is made a lot more animated and accessible than numbers in a spreadsheet. “I made the decision straight away to move away from anything screen-based,” curator Shiri Shalmy told me as she gave me a tour, winding through office workers tapping away on keyboards. “Everything had to be physical.”…
James Bridle’s work on drone warfare touches a similar theme, though in this case the data are not hidden: his images of military UAVs come from Google Maps. “They’re there for anybody to look at, they’re kind of secret but available,” said Shalmy, who added that with the data out there, we can’t pretend we don’t know what’s going on. “They can do things in secret as long as we pretend it’s a secret.”
We’ve looked at Bridle’s work before, from his Dronestagram photos to his chalk outlines of drones, and he’s been commissioned to do something new for the Data as Culture show: Shalmy has asked him to compare the open data on military drones against that of London’s financial centre. He’ll present what he digs up in summer.
From the series ‘Watching the Watchers.’ Image: James Bridle/ODI
Using this kind of government data, from local council expenses to military movements, shows quite how much information is available and how it can be used to hold politicians to account. In essence, anyone can do surveillance to some level. While activists including Berners-Lee push for more data to be made accessible, it’s only useful if we actually bother to engage with it, and work like Bridle’s poses the uneasy suggestion that sometimes it’s more comfortable to remain ignorant.
And in addition to reading data, we can collect it. Rather than delving into government files, a knitted banner by artist Sam Meech uses publicly generated data to make a political point. The banner bears the phrase “8 hour labour,” a reference to the eight-hour workday movement that sprang up in Britain’s Industrial Revolution. The idea was that people would have eight hours work, eight hours rest, and eight hours recreation.
A detail from Sam Meech’s Punchcard Economy. Image: Sam Meech/ODI
But the black-and-white pattern in the banner is made up of much less regular working hours: those logged by self-employed creatives, who can take part by entering their own timesheet data via virtual punchcards. Shalmy pointed out her own schedule in a week when she was setting up the exhibition: a 70-hour block woven into the knit. It’s an example of how individuals can use data to make a political point—the work is reminiscent of trade union banners and seems particularly relevant at a time when controversial zero hours contracts are on the rise.
Also garnering data from the public, artist collective Thickear are asking people to fill in data forms on their arrival, which they’ll file on an old-fashioned spike. I took one of the forms, only to be confronted with nonsensical bureaucratic-type boxes. “The data itself is not informative in any way,” said Shalmy. It’s more about the idea of who we trust to give our data to. How often do we accept privacy policies without even giving ourselves the chance to even blink at the small print?…”
Government Surveillance and Internet Search Behavior
New paper by Marthews, Alex and Tucker, Catherine: “This paper uses data from Google Trends on search terms from before and after the surveillance revelations of June 2013 to analyze whether Google users’ search behavior shifted as a result of an exogenous shock in information about how closely their internet searches were being monitored by the U.S. government. We use data from Google Trends on search volume for 282 search terms across eleven different countries. These search terms were independently rated for their degree of privacy-sensitivity along multiple dimensions. Using panel data, our results suggest that cross-nationally, users were less likely to search using search terms that they believed might get them in trouble with the U.S. government. In the U.S., this was the main subset of search terms that were affected. However, internationally there was also a drop in traffic for search terms that were rated as personally sensitive. These results have implications for policy makers in terms of understanding the actual effects on search behavior of disclosures relating to the scale of government surveillance on the Internet and their potential effects on international competitiveness.”
Index: Privacy and Security
The Living Library Index – inspired by the Harper’s Index – provides important statistics and highlights global trends in governance innovation. This installment focuses on privacy and security and was originally published in 2014.
Globally
- Percentage of people who feel the Internet is eroding their personal privacy: 56%
- Internet users who feel comfortable sharing personal data with an app: 37%
- Number of users who consider it important to know when an app is gathering information about them: 70%
- How many people in the online world use privacy tools to disguise their identity or location: 28%, or 415 million people
- Country with the highest penetration of general anonymity tools among Internet users: Indonesia, where 42% of users surveyed use proxy servers
- Percentage of China’s online population that disguises their online location to bypass governmental filters: 34%
In the United States
Over the Years
- In 1996, percentage of the American public who were categorized as having “high privacy concerns”: 25%
- Those with “Medium privacy concerns”: 59%
- Those who were unconcerned with privacy: 16%
- In 1998, number of computer users concerned about threats to personal privacy: 87%
- In 2001, those who reported “medium to high” privacy concerns: 88%
- Individuals who are unconcerned about privacy: 18% in 1990, down to 10% in 2004
- How many online American adults are more concerned about their privacy in 2014 than they were a year ago, indicating rising privacy concerns: 64%
- Number of respondents in 2012 who believe they have control over their personal information: 35%, downward trend for 7 years
- How many respondents in 2012 continue to perceive privacy and the protection of their personal information as very important or important to the overall trust equation: 78%, upward trend for seven years
- How many consumers in 2013 trust that their bank is committed to ensuring the privacy of their personal information is protected: 35%, down from 48% in 2004
Privacy Concerns and Beliefs
- How many Internet users worry about their privacy online: 92%
- Those who report that their level of concern has increased from 2013 to 2014: 7 in 10
- How many are at least sometimes worried when shopping online: 93%, up from 89% in 2012
- Those who have some concerns when banking online: 90%, up from 86% in 2012
- Number of Internet users who are worried about the amount of personal information about them online: 50%, up from 33% in 2009
- Those who report that their photograph is available online: 66%
- Their birthdate: 50%
- Home address: 30%
- Cell number: 24%
- A video: 21%
- Political affiliation: 20%
- Consumers who are concerned about companies tracking their activities: 58%
- Those who are concerned about the government tracking their activities: 38%
- How many users surveyed felt that the National Security Agency (NSA) overstepped its bounds in light of recent NSA revelations: 44%
- Respondents who are comfortable with advertisers using their web browsing history to tailor advertisements as long as it is not tied to any other personally identifiable information: 36%, up from 29% in 2012
- Percentage of voters who do not want political campaigns to tailor their advertisements based on their interests: 86%
- Percentage of respondents who do not want news tailored to their interests: 56%
- Percentage of users who are worried that their information will be stolen by hackers: 75%
- Those who are worried about companies tracking their browsing history for targeted advertising: 54%
- How many consumers say they do not trust businesses with their personal information online: 54%
- Top 3 most trusted companies for privacy identified by consumers from across 25 different industries in 2012: American Express, Hewlett Packard and Amazon
- Most trusted industries for privacy: Healthcare, Consumer Products and Banking
- Least trusted industries for privacy: Internet and Social Media, Non-Profits and Toys
- Respondents who admit to sharing their personal information with companies they did not trust in 2012 for reasons such as convenience when making a purchase: 63%
- Percentage of users who say they prefer free online services supported by targeted ads: 61%
- Those who prefer paid online services without targeted ads: 33%
- How many Internet users believe that it is not possible to be completely anonymous online: 59%
- Those who believe complete online anonymity is still possible: 37%
- Those who say people should have the ability to use the Internet anonymously: 59%
- Percentage of Internet users who believe that current laws are not good enough in protecting people’s privacy online: 68%
- Those who believe current laws provide reasonable protection: 24%
Security Related Issues
- How many have had an email or social networking account compromised or taken over without permission: 21%
- Those who have been stalked or harassed online: 12%
- Those who think the federal government should do more to act against identity theft: 74%
- Consumers who agree that they will avoid doing business with companies who they do not believe protect their privacy online: 89%
- Among 65+ year old consumers: 96%
Privacy-Related Behavior
- How many mobile phone users have decided not to install an app after discovering the amount of information it collects: 54%
- Number of Internet users who have taken steps to remove or mask their digital footprint (including clearing cookies, encrypting emails, and using virtual networks to mask their IP addresses): 86%
- Those who have set their browser to disable cookies: 65%
- Number of users who have not allowed a service to remember their credit card information: 73%
- Those who have chosen to block an app from accessing their location information: 53%
- How many have signed up for a two-step sign-in process: 57%
- Percentage of Gen-X (33-48 year olds) and Millennials (18-32 year olds) who say they never change their passwords or only change them when forced to: 41%
- How many report using a unique password for each site and service: 4 in 10
- Those who use the same password everywhere: 7%
Sources
- “2012 Most Trusted Companies for Privacy,” Ponemon Institute, January 28, 2013.
- “Fortinet 2014 Privacy Survey Reveals Gen-X and Millennial Attitudes Surrounding Passwords, Personal Data, Email Snooping and Online Marketing Practices” Fortinet, Feb 24, 2014.
- “Global Privacy Survey 2013,” MEFMobile, 2013.
- “Internet Security Priorities,” Computer and Communications Industry Association, December 20, 2013.
- Kiss, Jemima. “Privacy Tools used by 28% of the online world, research finds,” The Guardian, 21 January 2014.
- Kumaraguru, Ponnurangam, and Lorrie Faith Cranor. “Privacy indexes: a survey of Westin’s studies.” Institute for Software Research International, Carnegie Mellon University, and Carnegie Mellon CyLab. December 2005.
- “Most Trusted Retail Banks for Privacy,” Ponemon Institute, October 10, 2013.
- “TRUSTe 2014 US Consumer Confidence Privacy Report,” TRUSTe, 2014.
- Rainie, Lee, Kiesler, Sara, Kang, Ruogu, and Madden, Mary. “Anonymity, Privacy, and Security Online,” PewResearch Internet Project, September 05, 2013.
- Smith, Aaron and Mary Madden. “Privacy and Data Management on Mobile Devices,” PewResearch Internet Project, September 5, 2012.
- Turow, Joseph, Carpini, Michael, Draper, Nora, and Rowan Howard-Williams. “Americans Roundly Reject Tailored Political Advertising,” Annenberg School for Communication, University of Pennsylvania.
How Twitter Could Help Police Departments Predict Crime
Eric Jaffe in Atlantic Cities: “Initially, Matthew Gerber didn’t believe Twitter could help predict where crimes might occur. For one thing, Twitter’s 140-character limit leads to slang and abbreviations and neologisms that are hard to analyze from a linguistic perspective. Beyond that, while criminals occasionally taunt law enforcement via Twitter, few are dumb or bold enough to tweet their plans ahead of time. “My hypothesis was there was nothing there,” says Gerber.
But then, that’s why you run the data. Gerber, a systems engineer at the University of Virginia’s Predictive Technology Lab, did indeed find something there. He reports in a new research paper that public Twitter data improved the predictions for 19 of 25 crimes that occurred early last year in metropolitan Chicago, compared with predictions based on historical crime patterns alone. Predictions for stalking, criminal damage, and gambling saw the biggest bump…
Of course, the method says nothing about why Twitter data improved the predictions. Gerber speculates that people are tweeting about plans that correlate highly with illegal activity, as opposed to tweeting about crimes themselves.
Let’s use criminal damage as an example. The algorithm identified 700 Twitter topics related to criminal damage; of these, one topic involved the words “united center blackhawks bulls” and so on. Gather enough sports fans with similar tweets and some are bound to get drunk enough to damage public property after the game. Again this scenario extrapolates far more than the data tells, but it offers a possible window into the algorithm’s predictive power.
The map on the left shows predicted crime threat based on historical patterns; the one on the right includes Twitter data. (Via Decision Support Systems)
From a logistical standpoint, it wouldn’t be too difficult for police departments to use this method in their own predictions; both the Twitter data and modeling software Gerber used are freely available. The big question, he says, is whether a department used the same historical crime “hot spot” data as a baseline for comparison. If not, a new round of tests would have to be done to show that the addition of Twitter data still offered a predictive upgrade.
There’s also the matter of public acceptance. Data-driven crime prediction tends to raise any number of civil rights concerns. In 2012, privacy advocates criticized the FBI for a similar plan to use Twitter for crime predictions. In recent months the Chicago Police Department’s own methods have been knocked as a high-tech means of racial profiling. Gerber says his algorithms don’t target any individuals and only cull data posted voluntarily to a public account.”