Selected Readings on Crowdsourcing Data


The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of crowdsourcing data was originally published in 2013.

As institutions seek to improve decision-making through data and put public data to use to improve the lives of citizens, new tools and projects are allowing citizens to play a role in both the collection and utilization of data. Participatory sensing and other citizen data collection initiatives, notably in the realm of disaster response, are allowing citizens to crowdsource important data, often using smartphones, that would be either impossible or burdensomely time-consuming for institutions to collect themselves. Civic hacking, often performed in hackathon events, on the other hand, is a growing trend in which governments encourage citizens to transform data from government and other sources into useful tools to benefit the public good.

Annotated Selected Reading List (in alphabetical order)

Baraniuk, Chris. “Power Politechs.” New Scientist 218, no. 2923 (June 29, 2013): 36–39. http://bit.ly/167ul3J.

  • In this article, Baraniuk discusses civic hackers, “an army of volunteer coders who are challenging preconceptions about hacking and changing the way your government operates. In a time of plummeting budgets and efficiency drives, those in power have realised they needn’t always rely on slow-moving, expensive outsourcing and development to improve public services. Instead, they can consider running a hackathon, at which tech-savvy members of the public come together to create apps and other digital tools that promise to enhance the provision of healthcare, schools or policing.”
  • While recognizing that “civic hacking has established a pedigree that demonstrates its potential for positive impact,” Baraniuk argues that a “more rigorous debate over how this activity should evolve, or how authorities ought to engage in it” is needed.

Barnett, Brandon, Muki Hansteen Izora, and Jose Sia. “Civic Hackathon Challenges Design Principles: Making Data Relevant and Useful for Individuals and Communities.” Hack for Change, https://bit.ly/2Ge6z09.

  • In this paper, researchers from Intel Labs offer “guiding principles to support the efforts of local civic hackathon organizers and participants as they seek to design actionable challenges and build useful solutions that will positively benefit their communities.”
  • The authors’ proposed design principles are:
    • Focus on the specific needs and concerns of people or institutions in the local community. Solve their problems and challenges by combining different kinds of data.
    • Seek out data far and wide (local, municipal, state, institutional, non-profits, companies) that is relevant to the concern or problem you are trying to solve.
    • Keep it simple! This can’t be overstated. Focus [on] making data easily understood and useful to those who will use your application or service.
    • Enable users to collaborate and form new communities and alliances around data.

Buhrmester, Michael, Tracy Kwang, and Samuel D. Gosling. “Amazon’s Mechanical Turk: A New Source of Inexpensive, Yet High-Quality, Data?” Perspectives on Psychological Science 6, no. 1 (January 1, 2011): 3–5. http://bit.ly/H56lER.

  • This article examines the capability of Amazon’s Mechanical Turk to act as a source of data for researchers, in addition to its traditional role as a microtasking platform.
  • The authors examine the demographics of MTurkers and find that “(a) MTurk participants are slightly more demographically diverse than are standard Internet samples and are significantly more diverse than typical American college samples; (b) participation is affected by compensation rate and task length, but participants can still be recruited rapidly and inexpensively; (c) realistic compensation rates do not affect data quality; and (d) the data obtained are at least as reliable as those obtained via traditional methods.”
  • The paper concludes that, just as MTurk can be a strong tool for crowdsourcing tasks, data derived from MTurk can be high quality while also being inexpensive and obtained rapidly.

Goodchild, Michael F., and J. Alan Glennon. “Crowdsourcing Geographic Information for Disaster Response: a Research Frontier.” International Journal of Digital Earth 3, no. 3 (2010): 231–241. http://bit.ly/17MBFPs.

  • This article examines issues of data quality raised by the new phenomenon of citizen-generated geographic information, in order to assess whether this data can play a role in emergency management.
  • The authors argue that “[d]ata quality is a major concern, since volunteered information is asserted and carries none of the assurances that lead to trust in officially created data.”
  • Because time is crucial during emergencies, the authors argue that “the risks associated with volunteered information are often outweighed by the benefits of its use.”
  • The paper examines four wildfires in Santa Barbara in 2007-2009 to discuss current challenges with volunteered geographical data, and concludes that further research is required to answer how volunteer citizens can be used to provide effective assistance to emergency managers and responders.

Hudson-Smith, Andrew, Michael Batty, Andrew Crooks, and Richard Milton. “Mapping for the Masses: Accessing Web 2.0 Through Crowdsourcing.” Social Science Computer Review 27, no. 4 (November 1, 2009): 524–538. http://bit.ly/1c1eFQb.

  • This article describes the way in which “we are harnessing the power of web 2.0 technologies to create new approaches to collecting, mapping, and sharing geocoded data.”
  • The authors examine GMapCreator and MapTube, which allow users to perform a range of map-related functions, such as creating new maps, archiving existing maps, and sharing or producing bottom-up maps through crowdsourcing.
  • They conclude that “these tools are helping to define a neogeography that is essentially ‘mapping for the masses,’ while noting that there are many issues of quality, accuracy, copyright, and trust that will influence the impact of these tools on map-based communication.”

Kanhere, Salil S. “Participatory Sensing: Crowdsourcing Data from Mobile Smartphones in Urban Spaces.” In Distributed Computing and Internet Technology, edited by Chittaranjan Hota and Pradip K. Srimani, 19–26. Lecture Notes in Computer Science 7753. Springer Berlin Heidelberg, 2013. https://bit.ly/2zX8Szj.

  • This paper provides a comprehensive overview of participatory sensing — a “new paradigm for monitoring the urban landscape” in which “ordinary citizens can collect multi-modal data streams from the surrounding environment using their mobile devices and share the same using existing communications infrastructure.”
  • In addition to examining a number of innovative applications of participatory sensing, Kanhere outlines the following key research challenges:
    • Dealing with incomplete samples
    • Inferring user context
    • Protecting user privacy
    • Evaluating data trustworthiness
    • Conserving energy

Index: Measuring Impact with Evidence


The Living Library Index – inspired by the Harper’s Index – provides important statistics and highlights global trends in governance innovation. This installment focuses on measuring impact with evidence and was originally published in 2013.

United States

  • Amount per $100 of government spending that is backed by evidence that the money is being spent wisely: less than $1
  • Share of healthcare treatments delivered in the U.S. that lack evidence of effectiveness: more than half
  • How much of total U.S. healthcare expenditure is spent to determine what works: less than 0.1 percent
  • Number of major U.S. federal social programs evaluated since 1990 using randomized experiments and found to have “weak or no positive effects”: 9 out of 10
  • Year the Coalition for Evidence-Based Policy was set up to work with federal policymakers to advance evidence-based reforms in major U.S. social programs: 2001
  • Year the Program Assessment Rating Tool (PART) was introduced by President Bush’s Office of Management and Budget (OMB): 2002
    • Out of about 1,000 programs assessed, percentage found to be effective in 2008: 19%
    • Percentage of programs that could not be assessed due to insufficient data: 17%
    • Amount spent on the Even Start Family Literacy Program, rated ineffective by PART, over the life of the Bush administration: more than $1 billion
  • Year the Washington State legislature began using the Washington State Institute for Public Policy’s estimates on how “a portfolio of evidence-based and economically sound programs . . . could affect the state’s crime rate, the need to build more prisons, and total criminal-justice spending”: 2007
    • Amount invested by legislature in these programs: $48 million
    • Amount saved by the legislature: $250 million
  • Number of U.S. States in a pilot group working to adapt The Pew-MacArthur Results First Initiative, based on the Washington State model, to make performance-based policy decisions: 14
  • Net savings in health care expenditure by using the Transitional Care Model, which meets the Congressionally-based Top Tier Evidence Standard: $4,000 per patient
  • Number of states that conducted “at least some studies that evaluated multiple program or policy options for making smarter investments of public dollars” between 2008 and 2011: 29
  • Number of states that reported that their cost-benefit analysis influenced policy decisions or debate: 36
  • Date the Office of Management and Budget issued a memorandum proposing new evaluations and advising agencies to include details on determining effectiveness of their programs, link disbursement to evidence, and support evidence-based initiatives: 2007
  • Percentage increase in resources for innovation funds that use a tiered model for evidence, according to the President’s FY14 budget: 44%
  • Amount President Obama proposed in his FY 2013 budget to allocate in existing funding to Performance Partnerships “in which states and localities would be given the flexibility to propose better ways to combine federal resources in exchange for greater accountability for results”: $200 million
  • Amount of U.S. federal program funding that Harvard economist Jeffrey Liebman suggests be directed towards evaluations of outcomes: 1%
  • Amount of funding the City of New York has committed for evidence-based research and development initiatives through its Center for Economic Opportunity: $100 million a year

Internationally

  • How many of the 30 OECD countries in 2005–06 had a formal requirement by law that the benefits of regulation justify the costs: half
    • Number of the 30 OECD member countries in 2008 that reported quantifying benefits to regulations: 16
    • Those who reported quantifying costs: 24
  • How many members make up the Alliance for Useful Evidence, a network that “champion[s] evidence, the opening up of government data for interrogation and use, alongside the sophistication in research methods and their applications”: over 1,000
  • Date the UK government, the ESRC and the Big Lottery Fund announced plans to create a network of ‘What Works’ evidence centres: March 2013
  • Core funding for the What Works Centre for Local Economic Growth: £1m p.a. over an initial three-year term
  • How many SOLACE Summit members in 2012 were “very satisfied” with how Research and Intelligence resources support evidence-based decision-making: 4%
    • Number of areas they identified for improving evidence-based decision-making: 5
      • Evaluation of the impact of past decisions: 46% of respondents
      • Benchmarking data with other areas: 39%
      • Assessment of options available: 33%
      • How evidence is presented: 29%
      • Feedback on public engagement and consultation: 25%
  • Number of areas for improvement for Research and Intelligence staff development identified at the SOLACE Summit: 6
    • Strengthening customer insight and data analysis: 49%
    • Impact evaluation: 48%
    • Strategic/corporate thinking/awareness: 48%
    • Political acumen: 46%
    • Raising profile/reputation of the council for evidence-based decisions: 37%
    • Guidance/mentoring on use of research for other officers: 25%

Selected Readings on Smart Disclosure


The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of smart disclosure was originally published in 2013.

While much attention is paid to open data, data transparency need not be managed by a simple On/Off switch: It’s often desirable to make specific data available to the public or individuals in targeted ways. A prime example is the use of government data in Smart Disclosure, which provides consumers with data they need to make difficult marketplace choices in health care, financial services, and other important areas. Governments collect two kinds of data that can be used for Smart Disclosure: First, governments collect information on services of high interest to consumers, and are increasingly releasing this kind of data to the public. In the United States, for example, the Department of Health and Human Services collects and releases online data on health insurance options, while the Department of Education helps consumers understand the true cost (after financial aid) of different colleges. Second, state, local, or national governments hold information on consumers themselves that can be useful to them. In the U.S., for example, the Blue Button program was launched to help veterans easily access their own medical records.

Annotated Selected Reading List (in alphabetical order)

Better Choices: Better Deals Report on Progress in the Consumer Empowerment Strategy. Progress Report. Consumer Empowerment Strategy. United Kingdom: Department for Business Innovation & Skills, December 2012. http://bit.ly/17MqnL3.

  • The report details the progress made through the United Kingdom’s consumer empowerment strategy, Better Choices: Better Deals. The plan seeks to mitigate knowledge imbalances through information disclosure programs and targeted nudges.
  • The empowerment strategy’s four sections demonstrate the potential benefits of Smart Disclosure: 1. The power of information; 2. The power of the crowd; 3. Helping the vulnerable; and 4. A new approach to Government working with business.

Braunstein, Mark L. “Empowering the Patient.” In Health Informatics in the Cloud, 67–79. Springer Briefs in Computer Science. Springer New York Heidelberg Dordrecht London, 2013. https://bit.ly/2UB4jTU.

  • This book discusses the application of computing to healthcare delivery, public health and community-based clinical research.
  • Braunstein asks and seeks to answer critical questions such as: Who should make the case for smart disclosure when the needs of consumers are not being met? What role do non-profits play in the conversation on smart disclosure, especially when existing systems of information provision (or the lack thereof) do not work or are unsafe?

Brodi, Elisa. “Product-Attribute Information” and “Product-Use Information”: Smart Disclosure and New Policy Implications for Consumers’ Protection. SSRN Scholarly Paper. Rochester, NY: Social Science Research Network, September 4, 2012. http://bit.ly/17hssEK.

  • This paper from the Research Area of the Bank of Italy’s Law and Economics Department “surveys the literature on product use information and analyzes whether and to what extent Italian regulator is trying to ensure consumers’ awareness as to their use pattern.” Rather than focusing on the type of information governments can release to citizens, Brodi proposes that governments require private companies to provide valuable use pattern information to citizens to inform decision-making.
  • The form of regulation proposed by Brodi and other proponents “is based on a basic concept: consumers can be protected if companies are forced to disclose data on the customers’ consumption history through electronic files.”

National Science and Technology Council. Smart Disclosure and Consumer Decision Making: Report of the Task Force on Smart Disclosure. Task Force on Smart Disclosure: Information and Efficiency in Consumer Markets. Washington, DC: United States Government: Executive Office of the President, May 30, 2013. http://1.usa.gov/1aamyoT.

  • This inter-agency report is a comprehensive description of smart disclosure approaches being used across the Federal Government. The report highlights the importance of making data available not only to consumers but also to innovators, who can use it to build better options for consumers.
  • In addition to providing context about government policies that guide smart disclosure initiatives, the report raises questions about what parties have influence in this space.

“Policies in Practice: The Download Capability.” Markle Connecting for Health Work Group on Consumer Engagement, August 2010. http://bit.ly/HhMJyc.

  • This report from the Markle Connecting for Health Work Group on Consumer Engagement — the creator of the Blue Button system for downloading personal health records — features a “set of privacy and security practices to help people download their electronic health records.”
  • To help make health information easily accessible for all citizens, the report lists a number of important steps:
    • Make the download capability a common practice
    • Implement sound policies and practices to protect individuals and their information
    • Collaborate on sample data sets
    • Support the download capability as part of Meaningful Use and qualified or certified health IT
    • Include the download capability in procurement requirements.
  • The report also describes the rationale for the development of the Blue Button — perhaps the best known example of Smart Disclosure currently in existence — and the targeted release of health information in general:
    • Individual access to information is rooted in fair information principles and law
    • Patients need and want the information
    • The download capability would encourage innovation
    • A download capability frees data sources from having to make many decisions about the user interface
    • A download capability would hasten the path to standards and interoperability.

Sayogo, Djoko Sigit, and Theresa A. Pardo. “Understanding Smart Data Disclosure Policy Success: The Case of Green Button.” In Proceedings of the 14th Annual International Conference on Digital Government Research, 72–81. New York: ACM, 2013. http://bit.ly/1aanf1A.

  • This paper from the Proceedings of the 14th Annual International Conference on Digital Government Research explores the implementation of the Green Button Initiative, analyzing qualitative data from interviews with experts involved in Green Button development and implementation.
  • Moving beyond the specifics of the Green Button initiative, the authors raise questions on the motivations and success factors facilitating successful collaboration between public and private organizations to support smart disclosure policy.

Thaler, Richard H., and Will Tucker. “Smarter Information, Smarter Consumers.” Harvard Business Review, January–February 2013. The Big Idea. http://bit.ly/18gimxw.

  • In this article, Thaler and Tucker make three key observations regarding the challenges related to smart disclosure:
    • “We are constantly confronted with information that is highly important but extremely hard to navigate or understand.”
    • “Repeated attempts to improve disclosure, including efforts to translate complex contracts into ‘plain English,’ have met with only modest success.”
    • “There is a fundamental difficulty of explaining anything complex in simple terms. Most people find it difficult to write instructions explaining how to tie a pair of shoelaces.”