What is “crowdsourcing public safety” and why are public safety agencies moving toward this trend?
Crowdsourcing, a term coined by our own assistant professor of journalism Jeff Howe, involves taking a task traditionally performed by a distinct agent, such as an employee, and outsourcing it to an “undefined, generally large group of people in an open call.” Crowdsourcing public safety involves engaging and enabling private citizens to assist public safety professionals in addressing natural disasters, terror attacks, organized crime incidents, and large-scale industrial accidents.
Public safety agencies have long recognized the need for citizen involvement. Tip lines and missing persons bulletins have been used to engage citizens for years, but advances in mobile applications and big data analytics now allow agencies to receive, process, and make use of high volumes of tips and leads, making crowdsourced searches and investigations far more feasible. You saw this in the FBI's web-based tip line for the Boston Marathon bombing. You see it in the “See Something, Say Something” initiatives throughout the country. You see it in AMBER Alerts and even in remote search and rescue efforts. You even see it in more routine instances like Washington State's HERO program to reduce traffic violations.
Have these efforts been successful, and what challenges remain?
There are a number of issues to overcome with regard to crowdsourcing public safety, such as maintaining privacy rights, ensuring data quality, and building trust between citizens and law enforcement officers. Controversies over the National Security Agency's surveillance program and over neighborhood watch programs (most notably the shooting death of teenager Trayvon Martin by neighborhood watch captain George Zimmerman) reflect some of these challenges. Research has not yet established a precise set of success criteria, but the efforts that appear successful so far have tended to center on a particular crisis incident, such as a specific attack or a missing person. As more crowdsourcing public safety mobile applications are developed, adoption and use are likely to increase. One trend to watch is whether national public safety programs are able to tap into the existing social networks of community-based responders like American Red Cross volunteers, Community Emergency Response Teams, and United Way mentors.
The move toward crowdsourcing public safety is part of an overall trend toward improving community resilience, which refers to a system's ability to bounce back after a crisis or disturbance. Stephen Flynn and his colleagues at Northeastern's George J. Kostas Research Institute for Homeland Security are playing a key role in driving a national conversation in this area. Community resilience is inherently multidisciplinary, so you see research on transportation infrastructure, on social media use after a crisis event, and on designing sustainable urban environments. Northeastern is a place where use-inspired research is addressing real-world problems. It will take a village to improve community resilience capabilities, and our institution is a vital part of thought leadership for that village.
Crowdfunding in the EU – exploring the added value of potential EU action
Press Release: “Following the Workshop on Crowdfunding organised on 3 June 2013 in Brussels, the European Commission has today launched a consultation inviting stakeholders to share their views about crowdfunding: its potential benefits, risks, and the design of an optimal policy framework to untap the potential of this new form of financing…
Whereas many crowdfunding campaigns are local in nature, others would benefit from easier access to financing within a single European market. But to make sure crowdfunding is not just a momentary trend that fades away, but rather a sustainable source of financing for new European projects, certain safeguards are needed, in particular to ensure people’s trust. The ultimate objective of this consultation is to gather data about the needs of market participants and to identify the areas in which there is a potential added value in EU action to encourage the growth of this new industry, either through facilitative, soft-law measures or legislative action.
The consultation covers all forms of crowdfunding, ranging from donations and rewards to financial investments. Everyone is invited to share their opinion and fill in the on-line questionnaire, including citizens who might contribute to crowdfunding campaigns and entrepreneurs who might launch such campaigns. National authorities and crowdfunding platforms are also particularly encouraged to reply. The consultation will run until 31 December 2013.
See also MEMO/13/847
The consultation is available at:
http://ec.europa.eu/internal_market/consultations/2013/crowdfunding/index_en.htm
Further information:
Workshop on Crowdfunding – 3 June 2013
http://ec.europa.eu/internal_market/conferences/2013/0603-crowdfunding-workshop/
Commissioner Barnier’s speech at the Workshop on Crowdfunding
SPEECH/13/492″
If big data is an atomic bomb, disarmament begins in Silicon Valley
Derrick Harris at GigaOM: “Big data is like atomic energy, according to scientist Albert-László Barabási in a Monday column on Politico. It’s very beneficial when used ethically, and downright destructive when turned into a weapon. He argues scientists can help resolve the damage done by government spying by embracing the principles of nuclear nonproliferation that helped bring an end to Cold War fears and distrust.
Barabási’s analogy is rather poetic:
“Powered by the right type of Big Data, data mining is a weapon. It can be just as harmful, with long-term toxicity, as an atomic bomb. It poisons trust, straining everything from human relations to political alliances and free trade. It may target combatants, but it cannot succeed without sifting through billions of data points scraped from innocent civilians. And when it is a weapon, it should be treated like a weapon.”
I think he’s right, but I think the fight to disarm the big data bomb begins in places like Silicon Valley and Madison Avenue. And it’s not just scientists; all citizens should have a role…
I write about big data and data mining for a living, and I think the underlying technologies and techniques are incredibly valuable, even if the applications aren’t always ideal. On the one hand, advances in machine learning from companies such as Google and Microsoft are fantastic. On the other hand, Facebook’s newly expanded Graph Search makes Europe’s proposed right-to-be-forgotten laws seem a lot more sensible.
But it’s all within the bounds of our user agreements and beauty is in the eye of the beholder.
Perhaps the reason we don’t vote with our feet by moving to web platforms that embrace privacy, even though we suspect it’s being violated, is that we really don’t know what privacy means. Instead of regulating what companies can and can’t do, perhaps lawmakers can mandate a degree of transparency that actually lets users understand how data is being used, not just what data is being collected. Great, some company knows my age, race, ZIP code and web history: What I really need to know is how it’s using that information to target, discriminate against or otherwise serve me.
An intelligent national discussion about the role of the NSA is probably in order. For all anyone knows, it could even turn out we're willing to put up with more snooping than the government might expect. But until we get a handle on privacy from the companies we choose to do business with, I don't think most Americans have the stomach for such a difficult fight.”
Global Open Data Initiative moving forward
- Provide a leading vision for how governments approach open data. Open data commitments are among the most popular commitments for countries participating in the Open Government Partnership. The Global Open Data Initiative's recommendations and resources will help guide governments and others as they seek to design and implement strong, effective open data initiatives and policies. They will also help civil society actors who will be evaluating government initiatives.
- Increase awareness of open data. The Global Open Data Initiative will work to advance understanding of open data issues, challenges, and resources by promoting best practices, engaging in online and offline dialogue, and supporting networking between organizations both new and familiar to the open data arena.
- Support the development of the global open data community, especially in civil society. Civil society organizations (CSOs) have a key role to play as suppliers, intermediaries, and users of open data, though at present relatively few organizations are engaging with open data and the opportunities it presents. Most CSOs lack the awareness, skills, and support needed to be active users and providers of open data in ways that help them meet their goals. The Global Open Data Initiative aims to help CSOs engage with and use open data in whatever area they work, be it climate change, democratic rights, land governance, or financial reform.
Our immediate focus is on two activities:
- To consult with members of the CSO community around the world about what they think is important in this area
- To develop, in collaboration with the CSO community, a set of principles to guide open government data policies and approaches and to help initiate, strengthen, and further elevate conversations between governments and civil society
Using Big Data to Ask Big Questions
Chase Davis in the SOURCE: “First, let’s dispense with the buzzwords. Big Data isn’t what you think it is: Every federal campaign contribution over the last 30-plus years amounts to several tens of millions of records. That’s not Big. Neither is a dataset of 50 million Medicare records. Or even 260 gigabytes of files related to offshore tax havens—at least not when Google counts its data in exabytes. No, the stuff we analyze in pursuit of journalism and app-building is downright tiny by comparison.
But you know what? That’s ok. Because while super-smart Silicon Valley PhDs are busy helping Facebook crunch through petabytes of user data, they’re also throwing off intellectual exhaust that we can benefit from in the journalism and civic data communities. Most notably: the ability to ask Big Questions.
Most of us who analyze public data for fun and profit are familiar with small questions. They’re focused, incisive, and often have the kind of black-and-white, definitive answers that end up in news stories: How much money did Barack Obama raise in 2012? Is the murder rate in my town going up or down?
Big Questions, on the other hand, are speculative, exploratory, and systemic. As the name implies, they are also answered at scale: Rather than distilling a small slice of a dataset into a concrete answer, Big Questions look at entire datasets and reveal small questions you wouldn’t have thought to ask.
Can we track individual campaign donor behavior over decades, and what does that tell us about their influence in politics? Which neighborhoods in my city are experiencing spikes in crime this week, and are police changing patrols accordingly?
Or, by way of example, how often do interest groups propose cookie-cutter bills in state legislatures?
Looking at Legislation
Even if you don’t follow politics, you probably won’t be shocked to learn that lawmakers don’t always write their own bills. In fact, interest groups sometimes write them word-for-word.
Sometimes those groups even try to push their bills in multiple states. The conservative American Legislative Exchange Council has gotten some press, but liberal groups, social and business interests, and even sororities and fraternities have done it too.
On its face, something about elected officials signing their names to cookie-cutter bills runs head-first against people's ideal of deliberative democracy, so it tends to make news. Those can be great stories, but they're often limited in scope to a particular bill, politician, or interest group. They're based on small questions.
Data science lets us expand our scope. Rather than focusing on one bill, or one interest group, or one state, why not ask: How many model bills were introduced in all 50 states, period, by anyone, during the last legislative session? No matter what they’re about. No matter who introduced them. No matter where they were introduced.
Now that’s a Big Question. And with some basic data science, it’s not particularly hard to answer—at least at a superficial level.
Analyze All the Things!
Just for kicks, I tried building a system to answer this question earlier this year. It was intended as an example, so I tried to choose methods that would make intuitive sense. But it also makes liberal use of techniques often applied to Big Data analysis: k-means clustering, matrices, graphs, and the like.
If you want to follow along, the code is here….
To make exploration a little easier, my code represents similar bills in graph space, shown at the top of this article. Each dot (known as a node) represents a bill, and a line connecting two bills (known as an edge) means they were sufficiently similar according to my criteria (a cosine similarity of 0.75 or above). Thrown into visualization software like Gephi, it's easy to click around the clusters and see what pops out. So what do we find?
There are 375 clusters in total. Because of the limitations of our data, many of them represent vague, subject-specific bills that just happen to have similar titles even though the legislation itself is probably very different (think things like “Budget Bill” and “Campaign Finance Reform”). This is where having full bill text would come in handy.
But mixed in with those bills are a handful of interesting nuggets. Several bills that appear to be modeled after legislation by the National Conference of Insurance Legislators appear in multiple states, among them: a bill related to limited lines travel insurance; another related to unclaimed insurance benefits; and one related to certificates of insurance.”
The Shutdown’s Data Blackout
Opinion piece by Katherine G. Abraham and John Haltiwanger in The New York Times: “Today, for the first time since 1996 and only the second time in modern memory, the Bureau of Labor Statistics will not issue its monthly jobs report, as a result of the shutdown of nonessential government services. This raises an important question: Are the B.L.S. report and other economic data that the government provides “nonessential”?
If we’re trying to understand how much damage the shutdown or sequestration cuts are doing to jobs or the fragile economic recovery, they are definitely essential. Without robust economic data from the federal government, we can speculate, but we won’t really know.
In the last two shutdowns, in 1995 and 1996, the Congressional Budget Office estimated the economic damage at around 0.5 percent of the gross domestic product. This time, Moody’s estimates that a three-to-four-week shutdown might subtract 1.4 percent (annualized) from gross domestic product growth this quarter and take $55 billion out of the economy. Democrats tend to play up such projections; Republicans tend to play them down. If the shutdown continues, though, we’ll all be less able to tell what impact it is having, because more reports like the B.L.S. jobs report will be delayed, while others may never be issued.
In fact, sequestration cuts that affected 2013 budgets are already leading federal statistics agencies to defer or discontinue dozens of reports on everything from income to overseas labor costs. The economic data these agencies produce are key to tracking G.D.P., earnings and jobs, and to informing the Federal Reserve, the executive branch and Congress on the state of the economy and the impact of economic policies. The data are also critical for decisions made by state and local policy makers, businesses and households.
The combined budget for all the federal statistics agencies totals less than 0.1 percent of the federal budget. Yet the same across-the-board-cut mentality that led to sequester and shutdown has shortsightedly cut statistics agencies, too, as if there were something “nonessential” about spending money on accurately assessing the economic effects of government actions and inactions. As a result, as we move through the shutdown, the debt-ceiling fight and beyond, reliable, essential data on the impact of policy decisions will be harder to come by.
Unless the sequester cuts are reversed, funding for economic data will shrink further in 2014, on top of a string of lean budget years. More data reports will be eliminated at the B.L.S., the Census Bureau, the Bureau of Economic Analysis and other agencies. Even more insidious damage will come from compromising the methods for producing the reports that still are paid for and from failing to prepare for the future.
To save money, survey sample sizes will be cut, reducing the reliability of national data and undermining local statistics. Fewer resources will be devoted to maintaining the listings used to draw business survey samples, running the risk that surveys based on those listings won’t do as good a job of capturing actual economic conditions. Hiring and training will be curtailed. Over time, the availability and quality of economic indicators will diminish.
That would be especially paradoxical and backward at a time when economic statistics can and should be advancing through technological innovation instead of marched backward by politics. Integrating survey data, administrative data and commercial data collected with scanners and other digital technologies could produce richer, more useful information with less of a burden on businesses and households.
Now more than ever, framing sound economic policy depends on timely and accurate information about the economy. Bad or ill-targeted data can lead to bad or ill-targeted decisions about taxes and spending. The tighter the budget and the more contentious the political debate around it, the more compelling the argument for investing in federal data that accurately show how government policies are affecting the economy, so we can target the most effective cuts or spending or other policies, and make ourselves accountable for their results. That’s why Congress should restore funding to the federal statistical agencies at a level that allows them to carry out their critical work.”
Best Practices for Government Crowdsourcing Programs
Anton Root: “Crowdsourcing helps communities connect and organize, so it makes sense that governments are increasingly making use of crowd-powered technologies and processes.
Just recently, for instance, we wrote about the Malaysian government's initiative to crowdsource the national budget. Closer to home, we've seen government agencies from USAID to NASA make use of the crowd.
Daren Brabham, professor at the University of Southern California, recently published a report titled “Using Crowdsourcing In Government” that introduces readers to the basics of crowdsourcing, highlights effective use cases, and establishes best practices when it comes to governments opening up to the crowd. Below, we take a look at a few of the suggestions Brabham makes to those considering crowdsourcing.
Brabham splits up his ten best practices into three phases: planning, implementation, and post-implementation. The first suggestion in the planning phase he makes may be the most critical of all: “Clearly define the problem and solution parameters.” If the community isn’t absolutely clear on what the problem is, the ideas and solutions that users submit will be equally vague and largely useless.
This applies not only to government agencies, but also to SMEs and large enterprises making use of crowdsourcing. At Massolution NYC 2013, for instance, we heard again and again the importance of meticulously defining a problem. And open innovation platform InnoCentive’s CEO Andy Zynga stressed the big role his company plays in helping organizations do away with the “curse of knowledge.”
Brabham also has advice for projects in their implementation phase, the key bit being: “Launch a promotional plan and a plan to grow and sustain the community.” Simply put, crowdsourcing cannot work without a crowd, so it’s important to build up the community before launching a campaign. It does take some balance, however, as a community that’s too large by the time a campaign launches can turn off newcomers who “may not feel welcome or may be unsure how to become initiated into the group or taken seriously.”
Brabham’s key advice for the post-implementation phase is: “Assess the project from many angles.” The author suggests tracking website traffic patterns, asking users to volunteer information about themselves when registering, and doing original research through surveys and interviews. The results of follow-up research can help to better understand the responses submitted, and also make it easier to show the successes of the crowdsourcing campaign. This is especially important for organizations partaking in ongoing crowdsourcing efforts.”
Imagining Data Without Division
Thomas Lin in Quanta Magazine: “As science dives into an ocean of data, the demands of large-scale interdisciplinary collaborations are growing increasingly acute…Seven years ago, when David Schimel was asked to design an ambitious data project called the National Ecological Observatory Network, it was little more than a National Science Foundation grant. There was no formal organization, no employees, no detailed science plan. Emboldened by advances in remote sensing, data storage and computing power, NEON sought answers to the biggest question in ecology: How do global climate change, land use and biodiversity influence natural and managed ecosystems and the biosphere as a whole?…
For projects like NEON, interpreting the data is a complicated business. Early on, the team realized that its data, while mid-size compared with the largest physics and biology projects, would be big in complexity. “NEON’s contribution to big data is not in its volume,” said Steve Berukoff, the project’s assistant director for data products. “It’s in the heterogeneity and spatial and temporal distribution of data.”
Unlike the roughly 20 critical measurements in climate science or the vast but relatively structured data in particle physics, NEON will have more than 500 quantities to keep track of, from temperature, soil and water measurements to insect, bird, mammal and microbial samples to remote sensing and aerial imaging. Much of the data is highly unstructured and difficult to parse — for example, taxonomic names and behavioral observations, which are sometimes subject to debate and revision.
And, as daunting as the looming data crush appears from a technical perspective, some of the greatest challenges are wholly nontechnical. Many researchers say the big science projects and analytical tools of the future can succeed only with the right mix of science, statistics, computer science, pure mathematics and deft leadership. In the big data age of distributed computing — in which enormously complex tasks are divided across a network of computers — the question remains: How should distributed science be conducted across a network of researchers?
Part of the adjustment involves embracing “open science” practices, including open-source platforms and data analysis tools, data sharing and open access to scientific publications, said Chris Mattmann, 32, who helped develop a precursor to Hadoop, a popular open-source data analysis framework that is used by tech giants like Yahoo, Amazon and Apple and that NEON is exploring. Without developing shared tools to analyze big, messy data sets, Mattmann said, each new project or lab will squander precious time and resources reinventing the same tools. Likewise, sharing data and published results will obviate redundant research.
To this end, international representatives from the newly formed Research Data Alliance met this month in Washington to map out their plans for a global open data infrastructure.”
Digital Participation – The Case of the Italian 'Dialogue with Citizens'
New paper by Gianluca Sgueo presented at Democracy and Technology – Europe in Tension from the 19th to the 21st Century – Sorbonne, Paris, 2013: “This paper focuses on the initiative named “Dialogue With Citizens” that the Italian Government introduced in 2012. The Dialogue was an entirely web-based experiment in participatory democracy aimed at, first, informing citizens through documents and in-depth analysis and, second, answering their questions and requests. During the year and a half the initiative ran, roughly 90,000 people wrote in (approximately 5,000 messages per month). Additionally, almost 200,000 people participated in a number of public online consultations that the government launched in conjunction with the adoption of crucial decisions (e.g. the national spending review program).
From this experiment in participatory democracy, three questions can be raised: (1) How can a public institution maximize the benefits of participation and minimize its costs? (2) How can public administrations manage the (growing) expectations of citizens once they become accustomed to participation? (3) Is online participatory democracy going to develop further, and why?
In order to fully answer such questions, the paper proceeds as follows: it will initially provide a general overview of online public participation at both the central and the local level. It will then discuss the “Dialogue with Citizens” and a selected number of online public consultations led by the Italian government in 2012. The conclusions will develop a theoretical framework for reflection on the peculiarities and problems of web participation.”
Mobile phone data are a treasure-trove for development
Paul van der Boor and Amy Wesolowski in SciDevNet: “Each of us generates streams of digital information — a digital ‘exhaust trail’ that provides real-time information to guide decisions that affect our lives. For example, Google informs us about traffic by using both its ‘My Location’ feature on mobile phones and third-party databases to aggregate location data. BBVA, one of Spain’s largest banks, analyses transactions such as credit card payments as well as ATM withdrawals to find out when and where peak spending occurs. This type of data harvest is of great value. But, often, there is so much data that its owners lack the know-how to process it and fail to realise its potential value to policymakers.
Meanwhile, many countries, particularly in the developing world, have a dearth of information. In resource-poor nations, the public sector often lives in an analogue world where piles of paper impede operations and policymakers are hindered by uncertainty about their own strengths and capabilities. Nonetheless, mobile phones have quickly pervaded the lives of even the poorest: 75 per cent of the world’s 5.5 billion mobile subscriptions are in emerging markets. These people are also generating digital trails of anything from their movements to mobile phone top-up patterns. It may seem that putting this information to use would take vast analytical capacity. But using relatively simple methods, researchers can analyse existing mobile phone data, especially in poor countries, to improve decision-making.
Think of existing, available data as low-hanging fruit that we — two graduate students — could analyse in less than a month. This is not a test of data-scientist prowess, but more a way of saying that anyone could do it.
There are three areas that should be ‘low-hanging fruit’ in terms of their potential to dramatically improve decision-making in information-poor countries: coupling healthcare data with mobile phone data to predict disease outbreaks; using mobile phone money transactions and top-up data to assess economic growth; and predicting travel patterns after a natural disaster using historical movement patterns from mobile phone data to design robust response programmes.
Another possibility is using call-data records to analyse urban movement to identify traffic congestion points. Nationally, this can be used to prioritise infrastructure projects such as road expansion and bridge building.
The information that these analyses could provide would be lifesaving, not merely informative or revenue-increasing like much of the similar work currently performed in developed countries.
But some work of high social value is being done. For example, different teams of European and US researchers are trying to estimate the links between mobile phone use and regional economic development. They are using various techniques, such as merging night-time satellite imagery from NASA with mobile phone data to create behavioural fingerprints. They have found that this may be a cost-effective way to understand a country’s economic activity and, potentially, guide government spending.
Another example is given by researchers (including one of this article’s authors) who have analysed call-data records from subscribers in Kenya to understand malaria transmission within the country and design better strategies for its elimination. [1]
In this study, published in Science, the location data of the mobile phones of more than 14 million Kenyan subscribers was combined with national malaria prevalence data. After identifying the sources and sinks of malaria parasites and overlaying these with phone movements, analysis was used to identify likely transmission corridors. UK scientists later used similar methods to create different epidemic scenarios for Côte d’Ivoire.”