What is “crowdsourcing public safety” and why are public safety agencies moving toward this trend?
Crowdsourcing—a term coined by our own assistant professor of journalism Jeff Howe—involves taking a task or job traditionally performed by a distinct agent, or employee, and having it executed by an “undefined, generally large group of people in an open call.” Crowdsourcing public safety involves engaging and enabling private citizens to assist public safety professionals in addressing natural disasters, terror attacks, organized crime incidents, and large-scale industrial accidents.
Public safety agencies have long recognized the need for citizen involvement. Tip lines and missing persons bulletins have engaged citizens for years, but advances in mobile applications and big data analytics now let public safety agencies receive, process, and make use of high volumes of tips and leads, making crowdsourced searches and investigations far more feasible. You saw this in the FBI’s web-based tip line for the Boston Marathon bombing. You see it in the “See Something Say Something” initiatives throughout the country. You see it in AMBER Alerts or even remote search and rescue efforts. You even see it in more routine instances like Washington State’s HERO program to reduce traffic violations.
Have these efforts been successful, and what challenges remain?
There are a number of issues to overcome with regard to crowdsourcing public safety—such as maintaining privacy rights, ensuring data quality, and improving trust between citizens and law enforcement officers. Controversies over the National Security Agency’s surveillance program and over neighborhood watch programs, particularly the shooting death of teenager Trayvon Martin by neighborhood watch captain George Zimmerman, reflect some of these challenges. Research has not yet established a precise set of success criteria, but the efforts that appear successful so far have tended to center on a particular crisis incident—such as a specific attack or missing person. As more crowdsourcing public safety mobile applications are developed, adoption and use are likely to increase. One trend to watch is whether national public safety programs are able to tap into the existing social networks of community-based responders like American Red Cross volunteers, Community Emergency Response Teams, and United Way mentors.
The move toward crowdsourcing public safety is part of an overall trend toward improving community resilience, which refers to a system’s ability to bounce back after a crisis or disturbance. Stephen Flynn and his colleagues at Northeastern’s George J. Kostas Research Institute for Homeland Security are playing a key role in driving a national conversation in this area. Community resilience is inherently multi-disciplinary, so you see research being done regarding transportation infrastructure, social media use after a crisis event, and designing sustainable urban environments. Northeastern is a place where use-inspired research is addressing real-world problems. It will take a village to improve community resilience capabilities, and our institution is a vital part of thought leadership for that village.
If big data is an atomic bomb, disarmament begins in Silicon Valley
Derrick Harris at GigaOM: “Big data is like atomic energy, according to scientist Albert-László Barabási in a Monday column on Politico. It’s very beneficial when used ethically, and downright destructive when turned into a weapon. He argues that scientists can help repair the damage done by government spying by embracing the principles of nuclear nonproliferation that helped bring an end to Cold War fears and distrust.
Barabási’s analogy is rather poetic:
“Powered by the right type of Big Data, data mining is a weapon. It can be just as harmful, with long-term toxicity, as an atomic bomb. It poisons trust, straining everything from human relations to political alliances and free trade. It may target combatants, but it cannot succeed without sifting through billions of data points scraped from innocent civilians. And when it is a weapon, it should be treated like a weapon.”
I think he’s right, but I think the fight to disarm the big data bomb begins in places like Silicon Valley and Madison Avenue. And it’s not just scientists; all citizens should have a role…
I write about big data and data mining for a living, and I think the underlying technologies and techniques are incredibly valuable, even if the applications aren’t always ideal. On the one hand, advances in machine learning from companies such as Google and Microsoft are fantastic. On the other hand, Facebook’s newly expanded Graph Search makes Europe’s proposed right-to-be-forgotten laws seem a lot more sensible.
But it’s all within the bounds of our user agreements, and beauty is in the eye of the beholder.
Perhaps the reason we don’t vote with our feet by moving to web platforms that embrace privacy, even though we suspect it’s being violated, is that we really don’t know what privacy means. Instead of regulating what companies can and can’t do, perhaps lawmakers can mandate a degree of transparency that actually lets users understand how data is being used, not just what data is being collected. Great, some company knows my age, race, ZIP code and web history: What I really need to know is how it’s using that information to target, discriminate against or otherwise serve me.
An intelligent national discussion about the role of the NSA is probably in order. For all anyone knows, it could even turn out we’re willing to put up with more snooping than the government might expect. But until we get a handle on privacy from the companies we choose to do business with, I don’t think most Americans have the stomach for such a difficult fight.”
The Value of Personal Data
The Digital Enlightenment Yearbook 2013 is dedicated this year to Personal Data: “The value of personal data has traditionally been understood in ethical terms as a safeguard for personality rights such as human dignity and privacy. However, we have entered an era where personal data are mined, traded and monetized in the process of creating added value – often in terms of free services including efficient search, support for social networking and personalized communications. This volume investigates whether the economic value of personal data can be realized without compromising privacy, fairness and contextual integrity. It brings scholars and scientists from the disciplines of computer science, law and social science together with policymakers, engineers and entrepreneurs with practical experience of implementing personal data management.
The resulting collection will be of interest to anyone concerned about privacy in our digital age, especially those working in the field of personal information management, whether academics, policymakers, or those working in the private sector.”
Riding the Waves or Caught in the Tide? Navigating the Evolving Information Environment
IFLA Trend Report: “In the global information environment, time moves quickly and there’s an abundance of commentators trying to keep up. With each new technological development, a new report emerges assessing its impact on different sectors of society. The IFLA Trend Report takes a broader approach and identifies five high-level trends shaping the information society, spanning access to education, privacy, civic engagement and transformation. Its findings reflect a year’s consultation with a range of experts and stakeholders from different disciplines to map broader societal changes occurring, or likely to occur, in the information environment.
The IFLA Trend Report is more than a single document – it is a selection of resources to help you understand where libraries fit into a changing society.
From Five Key Trends Which Will Change Our Information Environment:
Trend 1: New Technologies Will Both Expand and Limit Who Has Access to Information…
Trend 2: Online Education Will Democratise and Disrupt Global Learning…
Trend 3: The Boundaries of Privacy and Data Protection Will Be Redefined…
Trend 4: Hyper-Connected Societies Will Listen to and Empower New Voices and Groups… In hyper-connected societies more opportunities for collective action are being realised – enabling the rise of new voices and promoting the growth of single-issue movements at the expense of traditional political parties. Open government initiatives and access to public sector data are leading to more transparency and citizen-focused public services.
Trend 5: The Global Information Economy Will Be Transformed by New Technologies…”
A Theory of Creepy: Technology, Privacy and Shifting Social Norms
Omer Tene and Jules Polonetsky in Yale Journal of Law & Technology: “The rapid evolution of digital technologies has hurled to the forefront of public and legal discourse dense social and ethical dilemmas that we have hardly begun to map and understand. In the near past, general community norms helped guide a clear sense of ethical boundaries with respect to privacy. One does not peek into the window of a house even if it is left open. One does not hire a private detective to investigate a casual date or the social life of a prospective employee. Yet with technological innovation rapidly driving new models for business and inviting new types of personal socialization, we often have nothing more than a fleeting intuition as to what is right or wrong. Our intuition may suggest that it is responsible to investigate the driving record of the nanny who drives our child to school, since such tools are now readily available. But is it also acceptable to seek out the records of other parents in our child’s car pool or of a date who picks us up by car? Alas, intuitions and perceptions of “creepiness” are highly subjective and difficult to generalize as social norms are being strained by new technologies and capabilities. And businesses that seek to create revenue opportunities by leveraging newly available data sources face huge challenges trying to operationalize such subjective notions into coherent business and policy strategies.
This article presents a set of social and legal considerations to help individuals, engineers, businesses and policymakers navigate a world of new technologies and evolving social norms. These considerations revolve around concepts that we have explored in prior work, including enhanced transparency; accessibility to information in usable format; and the elusive principle of context.”
OECD's Revised Guidelines on Privacy
OECD: “Over many decades the OECD has played an important role in promoting respect for privacy as a fundamental value and a condition for the free flow of personal data across borders. The cornerstone of OECD work on privacy is its newly revised Guidelines on the Protection of Privacy and Transborder Flows of Personal Data (2013).
Another key component of work in this area aims to improve cross-border co-operation among privacy law enforcement authorities. This work produced an OECD Recommendation on Cross-border Co-operation in the Enforcement of Laws Protecting Privacy in 2007 and inspired the formation of the Global Privacy Enforcement Network, to which the OECD provides support.
Other projects have examined privacy notices, considered privacy in the context of horizontal issues such as radio frequency identification (RFID) and digital identity management, and looked at metrics to inform policy making in these areas. The important role of privacy is also addressed in the OECD Recommendation on Principles for Internet Policy Making (2011) and the Seoul Ministerial Declaration on the Future of the Internet Economy (2008).
Current work is examining privacy-related issues raised by large-scale data use and analytics. It is part of a broader project on data-driven innovation and growth, which has already produced a preliminary report identifying key issues.”
(Appropriate) Big Data for Climate Resilience?
Amy Luers at the Stanford Social Innovation Review: “The answer to whether big data can help communities build resilience to climate change is yes—there are huge opportunities, but there are also risks.
Opportunities
- Feedback: Strong negative feedback is core to resilience. A simple example is our body’s response to heat stress—sweating, which is a natural feedback to cool down our body. In social systems, feedbacks are also critical for maintaining functions under stress. For example, communication by affected communities after a hurricane provides feedback for how and where organizations and individuals can provide help. While this kind of feedback used to rely completely on traditional communication channels, now crowdsourcing and data mining projects, such as Ushahidi and Twitter Earthquake detector, enable faster and more-targeted relief.
- Diversity: Big data is enhancing diversity in a number of ways. Consider public health systems. Health officials are increasingly relying on digital detection methods, such as Google Flu Trends or Flu Near You, to augment and diversify traditional disease surveillance.
- Self-Organization: A central characteristic of resilient communities is the ability to self-organize. This characteristic must exist within a community (see the National Research Council Resilience Report); it is not something you can impose on it. However, social media and related data-mining tools (InfoAmazonia, Healthmap) can enhance situational awareness and facilitate collective action by helping people identify others with common interests, communicate with them, and coordinate efforts; the sketch after this list illustrates the basic mechanic.
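To make that mechanic concrete, here is a minimal sketch of the kind of report mining these tools rely on: a sliding-window keyword trigger in the spirit of the Twitter Earthquake detector mentioned above. The window size, threshold, and keywords are illustrative assumptions, not any real system’s configuration.

```python
from collections import deque
from datetime import datetime, timedelta

# Hypothetical sliding-window detector: flag a possible event when
# keyword-matching citizen reports in the last 10 minutes cross a
# threshold. Illustrative sketch only; all parameters are assumptions.
WINDOW = timedelta(minutes=10)
THRESHOLD = 25
KEYWORDS = ("earthquake", "shaking", "tremor")

recent_reports = deque()  # timestamps of keyword-matching reports

def ingest(text: str, timestamp: datetime) -> bool:
    """Record one report; return True if the window crosses the threshold."""
    if any(k in text.lower() for k in KEYWORDS):
        recent_reports.append(timestamp)
    # Drop reports that have aged out of the 10-minute window.
    while recent_reports and timestamp - recent_reports[0] > WINDOW:
        recent_reports.popleft()
    return len(recent_reports) >= THRESHOLD
```

The same pattern generalizes: swap the keywords for flu symptoms and you have a crude version of the digital disease detection described under Diversity.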
Risks
- Eroding trust: Trust is well established as a core feature of community resilience. Yet the NSA PRISM escapade made it clear that big data projects are raising privacy concerns and possibly eroding trust. And it is not just an issue in government. For example, Target analyzes shopping patterns and can fairly accurately guess if someone in your family is pregnant (which is awkward if they know your daughter is pregnant before you do). When our trust in government, business, and communities weakens, it can decrease a society’s resilience to climate stress.
- Mistaking correlation for causation: Data mining seeks meaning in patterns that are completely independent of theory (suggesting to some that theory is dead). This approach can lead to erroneous conclusions when correlation is mistakenly taken for causation. For example, one study demonstrated that data mining techniques could show a strong (however spurious) correlation between the changes in the S&P 500 stock index and butter production in Bangladesh. While interesting, a decision support system based on this correlation would likely prove misleading; the simulation after this list shows how easily such spurious correlations arise.
- Failing to see the big picture: One of the biggest challenges with big data mining for building climate resilience is its overemphasis on the hyper-local and hyper-now. While this hyper-local, hyper-now information may be critical for business decisions, without a broader understanding of the longer-term and more-systemic dynamism of social and biophysical systems, big data provides no ability to understand future trends or anticipate vulnerabilities. We must not let our obsession with the here and now divert us from slower-changing variables such as declining groundwater, loss of biodiversity, and melting ice caps—all of which may silently define our future. A related challenge is the fact that big data mining tends to overlook the most vulnerable populations. We must not let the lure of the big data microscope on the “well-to-do” populations of the world make us blind to the less well-off populations within cities and communities that have more limited access to smart phones and the Internet.”
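The butter-and-stocks example reflects a well-known statistical trap: two series that each trend over time will often show a strong sample correlation even when they are generated completely independently. A minimal simulation makes the point (the 0.5 cutoff and series lengths here are arbitrary illustrative choices):

```python
import numpy as np

# Two independent random walks: neither causes the other, yet their
# sample correlation is routinely large because both trend over time.
rng = np.random.default_rng(0)
n_trials, n_steps = 1_000, 250
strong = 0
for _ in range(n_trials):
    a = np.cumsum(rng.standard_normal(n_steps))  # e.g., a stock index
    b = np.cumsum(rng.standard_normal(n_steps))  # e.g., butter output
    if abs(np.corrcoef(a, b)[0, 1]) > 0.5:
        strong += 1
print(f"{strong / n_trials:.0%} of independent pairs show |r| > 0.5")
```

A large fraction of the pairs clear the cutoff despite having no causal link, which is exactly why a decision support system built on such a correlation would mislead.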
From Networked Publics to Issue Publics: Reconsidering the Public/Private Distinction in Web Science
New paper by Andreas Birkbak: “As an increasing part of everyday life becomes connected with the web in many areas of the globe, the question of how the web mediates political processes becomes still more urgent. Several scholars have started to address this question by thinking about the web in terms of a public space. In this paper, we aim to make a twofold contribution towards the development of the concept of publics in web science. First, we propose that although the notion of publics raises a variety of issues, two major concerns continue to be user privacy and democratic citizenship on the web. Well-known arguments hold that the complex connectivity of the web puts user privacy at risk and enables the enclosure of public debate in virtual echo chambers. Our first argument is that these concerns are united by a set of assumptions coming from liberal political philosophy that are rarely made explicit. As a second contribution, this paper points towards an alternative way to think about publics by proposing a pragmatist reorientation of the public/private distinction in web science, away from seeing two spheres that need to be kept separate, towards seeing the public and the private as continuously connected. The theoretical argument is illustrated by reference to a recently published case study of Facebook groups, and future research agendas for the study of web-mediated publics are proposed.”
Public Open Data: The Good, the Bad, the Future
Camille Crittenden at IDEALAB: “Some of the most powerful tools combine official public data with social media or other citizen input, such as the recent partnership between Yelp and the public health departments in New York and San Francisco for restaurant hygiene inspection ratings. In other contexts, such tools can help uncover and ultimately reduce corruption by making it easier to “follow the money.”
Despite the opportunities offered by “free data,” this trend also raises new challenges and concerns, among them, personal privacy and security. While attention has been devoted to the unsettling power of big data analysis and “predictive analytics” for corporate marketing, similar questions could be asked about the value of public data. Does it contribute to community cohesion that I can find out with a single query how much my neighbors paid for their house or (if employed by public agencies) their salaries? Indeed, some studies suggest that greater transparency leads not to greater trust in government but to resignation and apathy.
Exposing certain law enforcement data also increases the possibility of vigilantism. California law requires the registration and publication of the home addresses of known sex offenders, for instance. Or consider the controversy and online threats that erupted when, shortly after the Newtown tragedy, a newspaper in New York posted an interactive map of gun permit owners in nearby counties.
…Policymakers and officials must still mind the “big data gap.” So what does the future hold for open data? Publishing data is only one part of the information ecosystem. To be useful, tools must be developed for cleaning, sorting, analyzing and visualizing it as well. …
For-profit companies and non-profit watchdog organizations will continue to emerge and expand, building on the foundation of this data flood. Public-private partnerships such as those between San Francisco and Appallicious or Granicus, startups created by Code for America’s Incubator, and non-partisan organizations like the Sunlight Foundation and MapLight rely on public data repositories for their innovative applications and analysis.
Making public data more accessible is an important goal and offers enormous potential to increase civic engagement. To make the most effective and equitable use of this resource for the public good, cities and other government entities should invest in the personnel and equipment — hardware and software — to make it universally accessible. At the same time, Chief Data Officers (or equivalent roles) should also be alert to the often hidden challenges of equity, inclusion, privacy, and security.”
Index: The Data Universe
The Living Library Index – inspired by the Harper’s Index – provides important statistics and highlights global trends in governance innovation. This installment focuses on the data universe and was originally published in 2013.
- How much data exists in the digital universe as of 2012: 2.7 zettabytes*
- Increase in the quantity of Internet data from 2005 to 2012: +1,696%
- Percent of the world’s data created in the last two years: 90
- Number of exabytes (=1 billion gigabytes) created every day in 2012: 2.5; that number doubles every month
- Percent of the digital universe in 2005 created by the U.S. and western Europe vs. emerging markets: 48 vs. 20
- Percent of the digital universe in 2012 created by emerging markets: 36
- Percent of the digital universe in 2020 predicted to be created by China alone: 21
- How much information in the digital universe is created and consumed by consumers (video, social media, photos, etc.) in 2012: 68%
- Percent of that information for which enterprises have liability or responsibility (copyright, privacy, compliance with regulations, etc.): 80
- Amount included in the Obama Administration’s 2012 Big Data initiative: over $200 million
- Amount the Department of Defense is investing annually on Big Data projects as of 2012: over $250 million
- Data created per day in 2012: 2.5 quintillion bytes
- How many terabytes* of data collected by the U.S. Library of Congress as of April 2011: 235
- How many terabytes of data collected by Walmart per hour as of 2012: 2,560, or 2.5 petabytes*
- Projected growth in global data generated per year, as of 2011: 40%
- Number of IT jobs created globally by 2015 to support big data: 4.4 million (1.9 million in the U.S.)
- Potential shortage of data scientists in the U.S. alone predicted for 2018: 140,000-190,000, in addition to 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions
- Time needed to sequence the complete human genome (analyzing 3 billion base pairs) in 2003: ten years
- Time needed in 2013: one week
- The world’s annual effective capacity to exchange information through telecommunication networks in 1986, 2007, and (predicted) 2013: 281 petabytes, 65 exabytes, 667 exabytes
- Projected amount of digital information created annually that will either live in or pass through the cloud: 1/3
- Increase in data collection volume year-over-year in 2012: 400%
- Increase in number of individual data collectors from 2011 to 2012: nearly double (over 300 data collection parties in 2012)
*1 zettabyte = 1 billion terabytes | 1 petabyte = 1,000 terabytes | 1 terabyte = 1,000 gigabytes | 1 gigabyte = 1 billion bytes
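For readers who want to sanity-check the figures above against the footnote’s decimal conversions, a few lines of arithmetic suffice (an illustrative sketch; the numbers plugged in are the ones quoted in the index):

```python
# Decimal storage units, per the footnote above.
GB = 10**9          # 1 gigabyte = 1 billion bytes
TB = 1_000 * GB     # 1 terabyte = 1,000 gigabytes
PB = 1_000 * TB     # 1 petabyte = 1,000 terabytes
EB = 1_000 * PB     # 1 exabyte = 1 billion gigabytes
ZB = 10**9 * TB     # 1 zettabyte = 1 billion terabytes

print(2.7 * ZB / EB)      # 2012 digital universe: 2,700 exabytes
print(2_560 * TB / PB)    # Walmart per hour: 2.56 petabytes (~2.5 PB)
print(2.5 * EB)           # 2.5 exabytes/day = 2.5 quintillion bytes/day
```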
Sources
- “Open Data Sites,” data.gov, accessed August 19, 2013.
- “Obama Administration Unveils ‘Big Data’ Initiative: Announces $200 Million in New R&D Investments,” Office of Science and Technology Policy, March 29, 2012.
- John Gantz and David Reinsel, “The Digital Universe Decade – Are You Ready?” IDC iView, International Data Corporation, May 2010.
- Andrew McAfee and Erik Brynjolfsson, “Big Data: The Management Revolution,” Harvard Business Review, October 2012.
- “Big data: The next frontier for innovation, competition and productivity,” McKinsey Global Institute, June 2011.
- “Big Data, Official Statistics and Social Science Research,” The World Bank, December 12, 2012.
- “Data, data everywhere,” The Economist, February 25, 2010.
- Jimmy Daly, “18 Incredible Internet-Usage Statistics,” FedTech, June 12, 2013.
- Douglas Karr, “Infographic: Big Data Brings Marketing Big Numbers,” Marketing Tech Blog, May 9, 2012.
- “2012 Krux Cross-Industry Study,” Krux Research, June 12, 2012.
- “Gartner Says Big Data Creates Big Jobs,” Gartner, October 22, 2012.
- “Big Data Universe Beginning to Explode,” CSC, accessed August 19, 2013.
- “The Digital Universe in 2020,” IDC, December 2012.
- “Big Data, for Better or Worse: 90% of World’s Data Generated Over Last Two Years,” ScienceDaily, May 22, 2013.