Selected Readings on Crowdsourcing Data


The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of crowdsourcing data was originally published in 2013.

As institutions seek to improve decision-making through data and put public data to use to improve the lives of citizens, new tools and projects are allowing citizens to play a role in both the collection and utilization of data. Participatory sensing and other citizen data collection initiatives, notably in the realm of disaster response, are allowing citizens to crowdsource important data, often using smartphones, that would be either impossible or burdensomely time-consuming for institutions to collect themselves. Civic hacking, often performed in hackathon events, on the other hand, is a growing trend in which governments encourage citizens to transform data from government and other sources into useful tools to benefit the public good.

Annotated Selected Reading List (in alphabetical order)

Baraniuk, Chris. “Power Politechs.” New Scientist 218, no. 2923 (June 29, 2013): 36–39. http://bit.ly/167ul3J.

  • In this article, Baraniuk discusses civic hackers, “an army of volunteer coders who are challenging preconceptions about hacking and changing the way your government operates. In a time of plummeting budgets and efficiency drives, those in power have realised they needn’t always rely on slow-moving, expensive outsourcing and development to improve public services. Instead, they can consider running a hackathon, at which tech-savvy members of the public come together to create apps and other digital tools that promise to enhance the provision of healthcare, schools or policing.”
  • While recognizing that “civic hacking has established a pedigree that demonstrates its potential for positive impact,” Baraniuk argues that a “more rigorous debate over how this activity should evolve, or how authorities ought to engage in it” is needed.

Barnett, Brandon, Muki Hansteen Izora, and Jose Sia. “Civic Hackathon Challenges Design Principles: Making Data Relevant and Useful for Individuals and Communities.” Hack for Change, https://bit.ly/2Ge6z09.

  • In this paper, researchers from Intel Labs offer “guiding principles to support the efforts of local civic hackathon organizers and participants as they seek to design actionable challenges and build useful solutions that will positively benefit their communities.”
  • The authors’ proposed design principles are:
    • Focus on the specific needs and concerns of people or institutions in the local community. Solve their problems and challenges by combining different kinds of data.
    • Seek out data far and wide (local, municipal, state, institutional, non-profits, companies) that is relevant to the concern or problem you are trying to solve.
    • Keep it simple! This can’t be overstated. Focus [on] making data easily understood and useful to those who will use your application or service.
    • Enable users to collaborate and form new communities and alliances around data.

Buhrmester, Michael, Tracy Kwang, and Samuel D. Gosling. “Amazon’s Mechanical Turk: A New Source of Inexpensive, Yet High-Quality, Data?” Perspectives on Psychological Science 6, no. 1 (January 1, 2011): 3–5. http://bit.ly/H56lER.

  • This article examines the capability of Amazon’s Mechanical Turk to act as a source of data for researchers, in addition to its traditional role as a microtasking platform.
  • The authors examine the demographics of MTurkers and find that “(a) MTurk participants are slightly more demographically diverse than are standard Internet samples and are significantly more diverse than typical American college samples; (b) participation is affected by compensation rate and task length, but participants can still be recruited rapidly and inexpensively; (c) realistic compensation rates do not affect data quality; and (d) the data obtained are at least as reliable as those obtained via traditional methods.”
  • The paper concludes that, just as MTurk can be a strong tool for crowdsourcing tasks, data derived from MTurk can be high quality while also being inexpensive and obtained rapidly.

Goodchild, Michael F., and J. Alan Glennon. “Crowdsourcing Geographic Information for Disaster Response: A Research Frontier.” International Journal of Digital Earth 3, no. 3 (2010): 231–241. http://bit.ly/17MBFPs.

  • This article examines issues of data quality in the face of the new phenomenon of geographic information being generated by citizens, in order to examine whether this data can play a role in emergency management.
  • The authors argue that “[d]ata quality is a major concern, since volunteered information is asserted and carries none of the assurances that lead to trust in officially created data.”
  • Because time is crucial during emergencies, the authors argue that “the risks associated with volunteered information are often outweighed by the benefits of its use.”
  • The paper examines four wildfires in Santa Barbara in 2007–2009 to discuss current challenges with volunteered geographical data, and concludes that further research is required to determine how citizen volunteers can provide effective assistance to emergency managers and responders.

Hudson-Smith, Andrew, Michael Batty, Andrew Crooks, and Richard Milton. “Mapping for the Masses: Accessing Web 2.0 Through Crowdsourcing.” Social Science Computer Review 27, no. 4 (November 1, 2009): 524–538. http://bit.ly/1c1eFQb.

  • This article describes the way in which “we are harnessing the power of web 2.0 technologies to create new approaches to collecting, mapping, and sharing geocoded data.”
  • The authors examine GMapCreator and MapTube, which allow users to perform a range of map-related functions, such as creating new maps, archiving existing maps, and sharing or producing bottom-up maps through crowdsourcing.
  • They conclude that “these tools are helping to define a neogeography that is essentially ‘mapping for the masses,’ while noting that there are many issues of quality, accuracy, copyright, and trust that will influence the impact of these tools on map-based communication.”

Kanhere, Salil S. “Participatory Sensing: Crowdsourcing Data from Mobile Smartphones in Urban Spaces.” In Distributed Computing and Internet Technology, edited by Chittaranjan Hota and Pradip K. Srimani, 19–26. Lecture Notes in Computer Science 7753. Springer Berlin Heidelberg, 2013. https://bit.ly/2zX8Szj.

  • This paper provides a comprehensive overview of participatory sensing — a “new paradigm for monitoring the urban landscape” in which “ordinary citizens can collect multi-modal data streams from the surrounding environment using their mobile devices and share the same using existing communications infrastructure.”
  • In addition to examining a number of innovative applications of participatory sensing, Kanhere outlines the following key research challenges:
    • Dealing with incomplete samples
    • Inferring user context
    • Protecting user privacy
    • Evaluating data trustworthiness
    • Conserving energy