Crowdsourcing Gun Violence Research


Penn Engineering: “Gun violence is often described as an epidemic, but as visible and shocking as shooting incidents are, epidemiologists who study that particular source of mortality have a hard time tracking them. The Centers for Disease Control is prohibited by federal law from conducting gun violence research, so there is little in the way of centralized infrastructure to monitor where, how, when, why and to whom shootings occur.

Chris Callison-Burch, Aravind K. Joshi Term Assistant Professor in Computer and Information Science, and graduate student Ellie Pavlick are working to solve this problem.

They have developed the Gun Violence Database, which combines machine learning and crowdsourcing techniques to produce a national registry of shooting incidents. Callison-Burch and Pavlick’s algorithm scans thousands of articles from local newspaper and television stations, determines which are about gun violence, then asks everyday people to pull out vital statistics from those articles, compiling that information into a unified, open database.

For natural language processing experts like Callison-Burch and Pavlick, the most exciting prospect of this effort is that it is training computer systems to do this kind of analysis automatically. They recently presented their work on that front at Bloomberg’s Data for Good Exchange conference.

The Gun Violence Database project started in 2014, when it became the centerpiece of Callison-Burch’s “Crowdsourcing and Human Computation” class. There, Pavlick developed a series of homework assignments that challenged undergraduates to develop a classifier that could tell whether a given news article was about a shooting incident.

“It allowed us to teach the things we want students to learn about data science and natural language processing, while giving them the motivation to do a project that could contribute to the greater good,” says Callison-Burch.

The articles students used to train their classifiers were sourced from “The Gun Report,” a daily blog from New York Times reporters that attempted to catalog shootings from around the country in the wake of the Sandy Hook massacre. Realizing that their algorithmic approach could be scaled up to automate what the Times’ reporters were attempting, the researchers began exploring how such a database could work. They consulted with Douglas Wiebe, an Associate Professor of Epidemiology in Biostatistics and Epidemiology in the Perelman School of Medicine, to learn more about what kind of information public health researchers needed to better study gun violence on a societal scale.

From there, the researchers enlisted people to annotate the articles their classifier found, connecting with them through Mechanical Turk, Amazon’s crowdsourcing platform, and their own website, http://gun-violence.org/…(More)”
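
The classification step described in the excerpt above (deciding whether a news article is about a shooting incident) can be illustrated with a very small text classifier. The sketch below is not the Penn researchers' method, which the excerpt does not specify; it assumes a plain multinomial Naive Bayes model with add-one smoothing, and the training headlines, labels, and all names (`NaiveBayesTextClassifier`, `TRAIN_TEXTS`, `TRAIN_LABELS`) are made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of token frequencies
        self.doc_counts = Counter()  # label -> number of training documents
        self.vocabulary = set()

    def fit(self, documents, labels):
        for text, label in zip(documents, labels):
            tokens = tokenize(text)
            self.doc_counts[label] += 1
            self.word_counts.setdefault(label, Counter()).update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Log prior plus the sum of smoothed log likelihoods per token.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(counts.values())
            for token in tokens:
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data: True means "about a shooting incident".
TRAIN_TEXTS = [
    "Man shot and wounded in downtown shooting, police say",
    "Two injured after gunfire erupts outside nightclub",
    "Police investigate fatal shooting of teenager",
    "City council votes to approve new school budget",
    "Local bakery wins award for best bread",
    "Heavy rain floods downtown streets after storm",
]
TRAIN_LABELS = [True, True, True, False, False, False]

classifier = NaiveBayesTextClassifier().fit(TRAIN_TEXTS, TRAIN_LABELS)
print(classifier.predict("Police say teenager was shot near nightclub"))
print(classifier.predict("Council to approve budget for new bakery"))
```

In a pipeline like the one described, articles the classifier flags would then be passed to human annotators; a real system would need far more training data and a held-out evaluation set.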

Crowdsourcing investigative journalism


Convoca in Peru: “…collaborative effort is the essence of Convoca. We are a team of journalists and programmers who work with professionals from different disciplines and generations to expose facts that are hidden by networks of power and affect the life of citizens. We bet on working in partnership to publish high-impact findings from Peru, where the Amazon survives in almost 60% of the country, amid oil and mineral exploitation and criminal activities such as logging, illegal mining and human trafficking. Fifty percent of social conflicts have as their epicenter natural-resource extraction areas, where the populations and communities with the highest poverty rates live.

Over one year and seven months, Convoca has uncovered facts of public relevance, such as patterns of corruption and secrecy, by networking with journalists from Latin America and the world. The series of reports with the BRIO platform revealed the cost overruns of highways and public works in Latin American countries in the hands of Brazilian companies financed by the National Bank of Economic and Social Development (BNDES), now under investigation in the most notorious corruption scandal in the region, ‘Lava Jato’. This research won the 2016 Journalistic Excellence Award granted by the Inter American Press Association (SIP). On a global scale, we dove into the 11.5 million files of the ‘Panama Papers’ with more than a hundred media outlets and organizations led by the International Consortium of Investigative Journalists (ICIJ), which allowed us to lay bare the world of tax havens where companies and individuals hide their fortunes.

Our work on extractive industries ‘Excesses unpunished’ won the most important award of data journalism in the world, the Data Journalism Awards 2016, and is a finalist for the Gabriel Garcia Marquez Award, which recognizes the best of journalism in Latin America. We invite you to be the voice of this effort to keep publishing new reports that allow citizens to make better decisions about their destinies and compel groups of power to come clean about their activities and fulfill their commitments. So join ConBoca: The Power of Citizens Call, our first fundraising campaign alongside our readers. We believe that journalism is a public service….(More)”

Remote Data Collection: Three Ways to Rethink How You Collect Data in the Field


Magpi: “As mobile devices have gotten less and less expensive – and as millions worldwide have climbed out of poverty – it’s become quite common to see a mobile phone in every person’s hand, or at least in every family, and this means that we can utilize additional approaches to data collection that were simply not possible before….

In our Remote Data Collection Guide, we discuss these new technologies and:

  • The key benefits of remote data collection in each of three different situations.
  • The direct impact of remote data collection on reducing the cost of your efforts.
  • How to start the process of choosing the right option for your needs….(More)”

When is the crowd wise or can the people ever be trusted?


Julie Simon at NESTA: “Democratic theory has tended to take a pretty dim view of people and their ability to make decisions. Many political philosophers believe that people are at best uninformed and at worst ignorant and incompetent. This view is a common justification for our system of representative democracy – people can’t be trusted to make decisions so this responsibility should fall to those who have the expertise, knowledge or intelligence to do so.

Think back to what Edmund Burke said on the subject in his speech to the Electors of Bristol in 1774, “Your representative owes you, not his industry only, but his judgement; and he betrays, instead of serving you, if he sacrifices it to your opinion.” He reminds us that “government and legislation are matters of reason and judgement, and not of inclination”. Others, like the journalist Charles Mackay, who wrote a book on economic bubbles and crashes, Extraordinary Popular Delusions and the Madness of Crowds, had an even more damning view of the crowd’s capacity to exercise either judgement or reason.

The thing is, if you believe that ‘the crowd’ isn’t wise then there isn’t much point in encouraging participation – these sorts of activities can only ever be tokenistic or a way of legitimising the decisions taken by others.

There are then those political philosophers who effectively argue that citizens’ incompetence doesn’t matter. They argue that the aggregation of views – through voting – eliminates ‘noise’ which enables you to arrive at optimal decisions. The larger the group, the better its decisions will be. The corollary of this view is that political decision making should involve mass participation and regular referenda – something akin to the Swiss model.
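
One standard way to make the aggregation argument concrete is Condorcet's jury theorem, which the excerpt does not name and is offered here only as an illustration. It assumes that each voter decides independently and is right with the same probability p; both assumptions, and the function name `majority_correct_probability`, are ours. Under them, the chance that a simple majority is right is a binomial tail sum, which grows with group size when p is above 0.5 and shrinks when p is below 0.5.

```python
from math import comb


def majority_correct_probability(n_voters, p_correct):
    """Probability that a strict majority of n independent voters is right.

    Assumes an odd number of voters (so there are no ties), each correct
    with the same probability p_correct, independently of the others.
    """
    if n_voters < 1 or n_voters % 2 == 0:
        raise ValueError("n_voters must be a positive odd number")
    needed = n_voters // 2 + 1  # smallest number of votes forming a majority
    return sum(
        comb(n_voters, k) * p_correct**k * (1 - p_correct) ** (n_voters - k)
        for k in range(needed, n_voters + 1)
    )


if __name__ == "__main__":
    # Slightly-better-than-chance voters: larger groups do better.
    for n in (1, 11, 101, 1001):
        print(n, round(majority_correct_probability(n, 0.55), 4))
    # Slightly-worse-than-chance voters: larger groups do worse.
    for n in (1, 11, 101, 1001):
        print(n, round(majority_correct_probability(n, 0.45), 4))
```

The second loop shows the caveat: the claim that larger groups decide better depends on individual voters being right more often than not, and on their errors being independent.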

Another standpoint is to say that there is wisdom within crowds – it’s just that it’s domain specific, unevenly distributed and quite hard to transfer. This idea was put forward by Friedrich Hayek in his seminal 1945 essay on The Use of Knowledge in Society in which he argues that:

“…the knowledge of the circumstances of which we must make use never exists in concentrated or integrated form, but solely as the dispersed bits of incomplete and frequently contradictory knowledge which all the separate individuals possess. The economic problem of society is thus not merely a problem of how to allocate ‘given’ resources……it is a problem of the utilization of knowledge not given to anyone in its totality”.

Hayek argued that it was for this reason that central planning couldn’t work since no central planner could ever aggregate all the knowledge distributed across society to make good decisions.

More recently, Eric von Hippel built on these foundations by introducing the concept of information stickiness; information is ‘sticky’ if it is costly to move from one place to another. One type of information that is frequently ‘sticky’ is information about users’ needs and preferences.[1] This helps to account for why manufacturers tend to develop innovations which are incremental – meeting already identified needs – and why so many organisations are engaging users in their innovation processes: if knowledge about needs and tools for developing new solutions can be co-located in the same place (i.e. the user) then the cost of transferring sticky information is eliminated….

There is growing evidence on how crowdsourcing can be used by governments to solve clearly defined technical, scientific or informational problems. Evidently there are significant needs and opportunities for governments to better engage citizens to solve these types of problems.

There’s also a growing body of evidence on how digital tools can be used to support and promote collective intelligence….

So, the critical task for public officials is to have greater clarity over the purpose of engagement – in order to better understand which methods of engagement should be used and what kinds of groups should be targeted.

At the same time, the central question for researchers is when and how to tap into collective intelligence: what tools and approaches can be used when we’re looking at arenas which are often sites of contestation? Should this input be limited to providing information and expertise to be used by public officials or representatives, or should these distributed experts exercise some decision making power too? And when we’re dealing with value-based judgements, when should we rely on large scale voting as a mechanism for making ‘smarter’ decisions and when are deliberative forms of engagement more appropriate? These are all issues we’re exploring as part of our ongoing programme of work on democratic innovations….(More)”

You Can Help Map the Accessibility of the World


Josh Cohen in Next City: “…using a web app called Project Sidewalk….The app, from a team at the University of Maryland’s Human-Computer Interaction Lab, crowdsources audit data in order to map urban accessibility. After taking a brief tutorial on what to look for and a how-to, participants “walk” the D.C. streets using Google Street View. The app provides a set of tools to mark curb ramps (or a lack of them), broken sidewalks, and obstacles in the sidewalk, and rank them on a scale of 1 to 5 for level of accessibility.

Project Sidewalk’s public beta launched on August 30. As of this writing, 212 people have participated and audited 377.5 miles of sidewalk in D.C.

“We’re starting in D.C. as a launch point because we know D.C., we live here, we can do physical audits to validate the data we’re getting,” says Jon Froehlich, a University of Maryland professor who is leading the project. “But we want to expand to 10 more cities in the next year or two.”

Project Sidewalk tutorial

Project Sidewalk wants to produce a few end products with their data too. The first is an accessibility-mapping tool that offers end-to-end route directions that take into account a person’s particular mobility challenges. Froehlich points out that barriers for someone in an electric wheelchair might be different from those for someone in a manual wheelchair or someone with vision impairment. The other product is an “access score” map that ranks a neighborhood’s accessibility and highlights problem areas.

Froehlich hopes departments of transportation might adopt the tool as well. “People tasked with improving infrastructure can start to use it to triage their work or verify their own data. A lot of cities don’t have money or time to go out and map the accessibility of their streets,” he says.

Crowdsourcing and using Street View to reduce the amount of labor required to conduct audits is an important first step for Project Sidewalk, but in order to expand to cities throughout the country, they need to automate the review process as much as possible. To do that, the team is experimenting with computer learning….(More)”.

Playful Cities: Crowdsourcing Urban Happiness with Web Games


Daniele Quercia in Built Environment: “It is well known that the layout and configuration of urban space plugs directly into our sense of community wellbeing. The twentieth-century city planner Kevin Lynch showed that a city’s dwellers create their own personal ‘mental maps’ of the city based on features such as the routes they use and the areas they visit. Maps that are easy to remember and navigate bring comfort and ultimately contribute to people’s wellbeing. Unfortunately, traditional social science experiments (including those used to capture mental maps) take time, are costly, and cannot be conducted at city scale. This paper describes how, starting in mid-2012, a team of researchers from a variety of disciplines set about tackling these issues. They were able to translate a few traditional experiments into 1-minute ‘web games with a purpose’. This article describes those games, the main insights they offer, their theoretical implications for urban planning, and their practical implications for improvements in navigation technologies….(More)”

Crowdsourcing Tolstoy


At the New Yorker: “When Leo Tolstoy’s great-great-granddaughter, the journalist Fyokla Tolstaya, announced that the Leo Tolstoy State Museum was looking for volunteers to proofread some forty-six thousand eight hundred pages of her relative’s writings, she hoped to generate enough interest to get the first round of corrections done in six months.

Within days, some three thousand Russians—engineers, I.T. workers, schoolteachers, retirees, a student pilot, a twenty-year-old waitress—signed on. “We were so happy and so surprised,” said Tolstaya. “They finished in fourteen days.”

Now, thanks largely to the efforts of these volunteers, nearly all of the great Russian writer’s massive body of work, including novels, diaries, letters, religious tracts, philosophical treatises, travelogues, and childhood memories, will soon be available online, in a form that can be easily downloaded, free of charge. “Of course we realized there are some novels on the Internet,” Tolstaya said. “But most [writings] are not. We in the museum decided this is not good.”…

The definitive, ninety-volume jubilee edition of Tolstoy’s works, compiled and published in Russia from the nineteen-twenties to the nineteen-fifties, had already been scanned by the Russian State Library. However, converting the PDFs into an easy-to-use digital format posed a challenge. For one thing, even after ABBYY, a company that specializes in translating printed documents into digital records, offered its services for free, proofreading costs were likely to be prohibitive. Charging readers to download the works was not an option. “At the end of his life, Tolstoy said, ‘I don’t need any money for my work. I want to give my work to the people,’” said Tolstaya. “It was important for us to make it free for everyone. It is his will.”

That was when they hit on the idea of crowdsourcing, Tolstaya said. “It’s according to Leo Tolstoy’s ideas, to do it with the help of all people around the world—vsem mirom—even the world’s hardest task can be done with the help of everyone.”…(More)”

Leveraging Mixed Expertise in Crowdsourcing


Dissertation by David Merritt: “Crowdsourcing systems promise to leverage the “wisdom of crowds” to help solve many kinds of problems that are difficult to solve using only computers. Although a crowd of people inherently represents a diversity of skill levels, knowledge, and opinions, crowdsourcing system designers typically view this diversity as noise and effectively cancel it out by aggregating responses. However, we believe that by embracing crowd workers’ diverse expertise levels, system designers can better leverage that knowledge to increase the wisdom of crowds. In this thesis, we propose solutions to a limitation of current crowdsourcing approaches: not accounting for a range of expertise levels in the crowd. The current body of work in crowdsourcing does not systematically examine this, suggesting that researchers may not believe the benefits of using mixed expertise warrant the complexities of supporting it. This thesis presents two systems, Escalier and Kurator, to show that leveraging mixed expertise is a worthwhile endeavor because it materially benefits system performance, at scale, for various types of problems. We also demonstrate an effective technique, called expertise layering, to incorporate mixed expertise into crowdsourcing systems. Finally, we show that leveraging mixed expertise enables researchers to use crowdsourcing to address new types of problems….(More)”

Crowdsourcing at Statistics Canada


Pilot project by Statistics Canada: “Our crowdsourcing pilot project will focus on mapping buildings across Canada.

If you live in Ottawa or Gatineau, you can be among the first to collaborate with us. If you live elsewhere, stay in touch! Your town or city could be next. We are very excited to work with communities across the country on this project.

As a project contributor, you can help create a free and open source of information on commercial, industrial, government and other buildings in Canada. We need your support to close this important data gap! Your work will improve your community’s knowledge of its buildings, and in turn inform policies and programs designed to help you.

An eye on the future

There are currently no accurate national-level statistics on buildings, and their attributes, that can be used to compare specific local areas. The information you submit will help to fill existing data gaps and provide new analytical opportunities that are important to data users.

This project will also teach us about the possibilities and limitations of crowdsourcing. Crowdsourcing data collection may become a way for Statistics Canada and other organizations around the world to collect much-needed information by reaching out to citizens.

What you can do

Using your knowledge of your neighbourhood, along with an online mapping tool called OpenStreetMap, you and other members of the public will be able to input the location, physical attributes and other features of buildings.


It all starts with you, on October 17, 2016

We will officially launch the crowdsourcing campaign for the pilot on October 17, 2016 and will provide further instructions and links to resources.

To subscribe to a distribution list for periodic updates on the project, send us an email at statcan.crowdsource.statcan@canada.ca. We will keep you posted!….(More)”

How Technology is Crowd-Sourcing the Fight Against Hunger


Beth Noveck at Media Planet: “There is more than enough food produced to feed everyone alive today. Yet access to nutritious food is a challenge everywhere and depends on getting every citizen involved, not just large organizations. Technology is helping to democratize and distribute the job of tackling the problem of hunger in America and around the world.

Real-time research

One of the hardest problems is the difficulty of gaining real-time insight into food prices and shortages. Enter technology. We no longer have to rely on professional inspectors slowly collecting information face-to-face. The UN World Food Programme, which provides food assistance to 80 million people each year, together with Nielsen is conducting mobile phone surveys in 15 countries (with plans to expand to 30), asking people by voice and text about what they are eating. Formerly blank maps are now filled in with information provided quickly and directly by the most affected people, making it easy to prioritize the allocation of resources.

Technology helps the information flow in both directions, enabling those in need to reach out, but also to become more effective at helping themselves. The Indian Ministry of Agriculture, in collaboration with Reuters Market Light, provides information services in nine Indian languages to 1.4 million registered farmers in 50,000 villages across 17 Indian states via text and voice messages.

Data to the people

New open data laws and policies that encourage more transparent publication of public information complement data collection and dissemination technologies such as phones and tablets. About 70 countries and hundreds of regions and cities have adopted open data policies, which guarantee that the information these public institutions collect be available for free use by the public. As a result, there are millions of open datasets now online on websites such as the Humanitarian Data Exchange, which hosts 4,000 datasets such as country-by-country stats on food prices and undernourishment around the world.

Companies are compiling and sharing data to combat food insecurity, too. Anyone can dig into the data on the Global Open Data for Agriculture and Nutrition platform, a data collaborative where 300 private and public partners are sharing information.

Importantly, this vast quantity of open data is available to anyone, not only to governments. As a result, large and small entrepreneurs are able to create new apps and programs to combat food insecurity, such as Plantwise, which uses government data to offer a knowledge bank and run “plant clinics” that help farmers lose less of what they grow to pests. Google uses open government data to show people the location of farmers markets near their homes.

Students, too, can learn to play a role. For the second summer in a row, the Governance Lab at New York University, in partnership with the United States Department of Agriculture (USDA), mounted a two-week open data summer camp for 40 middle and high school students. The next generation of problem solvers is learning new data science skills by working on food safety and other projects using USDA open data.

Enhancing connection

Ultimately, technology enables greater communication and collaboration among the public, social service organizations, restaurants, farmers and other food producers who must work together to avoid food crises. The European Food Safety Authority in Italy has begun exploring how to use internet-based collaboration (often called citizen science or crowdsourcing) to get more people involved in food and feed risk assessment.

In the United States, 40 percent of the food produced here is wasted, and yet 1 in 4 American children (and 1 in 6 adults) remain food insecure, according to the Rockefeller Foundation. Copia, a San Francisco-based smartphone app, facilitates donations and deliveries from those with excess food in six cities in the Bay Area. Zero Percent in Chicago similarly attacks the distribution problem by connecting restaurants to charities to donate their excess food. Full Harvest is a tech platform that facilitates the selling of surplus produce that otherwise would not have a market.

Mobilizing the world

Prize-backed challenges create the incentives for more people to collaborate online and get involved in the fight against hunger….(More)”