Selected Readings on Data Governance


Jos Berens (Centre for Innovation, Leiden University) and Stefaan G. Verhulst (GovLab)

The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of data governance was originally published in 2015.

Context
The field of Data Collaboratives is premised on the idea that sharing and opening up private sector datasets has great – and yet untapped – potential for promoting social good. At the same time, the potential of data collaboratives depends on the level of societal trust in the exchange, analysis and use of the data. Strong data governance frameworks are essential to ensure responsible data use. Without such governance regimes, the emergent data ecosystem will be hampered and the (perceived) risks will dominate the (perceived) benefits. Further, without adopting a human-centered approach to the design of data governance frameworks, including iterative prototyping and careful consideration of the experience, the responses may fail to be flexible and targeted to real needs.

Annotated Selected Readings List (in alphabetical order)

Better Place Lab, “Privacy, Transparency and Trust.” Mozilla, 2015. Available from: http://www.betterplace-lab.org/privacy-report.

  • This report looks specifically at the risks involved in the social sector having access to datasets, and the main risks development organizations should focus on to develop a responsible data use practice.
  • Focusing on five specific countries (Brazil, China, Germany, India and Indonesia), the report displays specific country profiles, followed by a comparative analysis centering around the topics of privacy, transparency, online behavior and trust.
  • Some of the key findings mentioned are:
    • A general concern about privacy, with cultural differences influencing conceptions of what privacy is.
    • Cultural differences determining how transparency is perceived, and how much value is attached to achieving it.
    • To build trust, individuals need to feel a personal connection or get a personal recommendation – it is hard to build trust regarding automated processes.

Montjoye, Yves Alexandre de; Kendall, Jake; and Kerry, Cameron F. “Enabling Humanitarian Use of Mobile Phone Data.” The Brookings Institution, 2015. Available from: http://www.brookings.edu/research/papers/2014/11/12-enabling-humanitarian-use-mobile-phone-data.

  • Focusing in particular on mobile phone data, this paper explores ways of mitigating the privacy harms involved in using call detail records for social good.
  • Key takeaways are the following recommendations for using data for social good:
    • Engaging companies, NGOs, researchers, privacy experts, and governments to agree on a set of best practices for new privacy-conscientious metadata sharing models.
    • Accepting that no framework for maximizing data for the public good will offer perfect protection for privacy, but there must be a balanced application of privacy concerns against the potential for social good.
    • Establishing systems and processes for recognizing trusted third-parties and systems to manage datasets, enable detailed audits, and control the use of data so as to combat the potential for data abuse and re-identification of anonymous data.
    • Simplifying the process among developing governments with regard to the collection and use of mobile phone metadata for research and public good purposes.

Center for Democracy and Technology, “Health Big Data in the Commercial Context.” Center for Democracy and Technology, 2015. Available from: https://cdt.org/insight/health-big-data-in-the-commercial-context/.

  • Focusing particularly on the privacy issues related to using data generated by individuals, this paper explores the overlap in privacy questions this field has with other data uses.
  • The authors note that although the Health Insurance Portability and Accountability Act (HIPAA) has proven a successful approach in ensuring accountability for health data, most of these standards do not apply to developers of the new technologies used to collect these new data sets.
  • For customer-facing technologies not covered by HIPAA, the paper proposes an alternative framework for considering privacy issues. The framework is based on the Fair Information Practice Principles and three rounds of stakeholder consultations.

Centre for Information Policy Leadership, “A Risk-based Approach to Privacy: Improving Effectiveness in Practice.” Centre for Information Policy Leadership, Hunton & Williams LLP, 2015. Available from: https://www.informationpolicycentre.com/uploads/5/7/1/0/57104281/white_paper_1-a_risk_based_approach_to_privacy_improving_effectiveness_in_practice.pdf.

  • This white paper is part of a project aiming to explain what is often referred to as a new, risk-based approach to privacy, and the development of a privacy risk framework and methodology.
  • With the pace of technological progress often outstripping the capabilities of privacy officers to keep up, this method aims to offer the ability to approach privacy matters in a structured way, assessing privacy implications from the perspective of possible negative impact on individuals.
  • With the intended outcomes of the project being “materials to help policy-makers and legislators to identify desired outcomes and shape rules for the future which are more effective and less burdensome”, insights from this paper might also feed into the development of innovative governance mechanisms aimed specifically at preventing individual harm.

Centre for Information Policy Leadership, “Data Governance for the Evolving Digital Market Place”, Centre for Information Policy Leadership, Hunton & Williams LLP, 2011. Available from: http://www.huntonfiles.com/files/webupload/CIPL_Centre_Accountability_Data_Governance_Paper_2011.pdf.

  • This paper argues that as a result of the proliferation of large scale data analytics, new models governing data inferred from society will shift responsibility to the side of organizations deriving and creating value from that data.
  • Noting the challenge corporations face in enabling agile and innovative data use, the paper states: “In exchange for increased corporate responsibility, accountability [and the governance models it mandates, ed.] allows for more flexible use of data.”
  • Proposed as a means to shift responsibility to the side of data-users, the accountability principle has been researched by a worldwide group of policymakers. Tracing the history of the accountability principle, the paper argues that it “(…) requires that companies implement programs that foster compliance with data protection principles, and be able to describe how those programs provide the required protections for individuals.”
  • The following essential elements of accountability are listed:
    • Organisation commitment to accountability and adoption of internal policies consistent with external criteria
    • Mechanisms to put privacy policies into effect, including tools, training and education
    • Systems for internal, ongoing oversight and assurance reviews and external verification
    • Transparency and mechanisms for individual participation
    • Means of remediation and external enforcement

Crawford, Kate; Schultz, Jason. “Big Data and Due Process: Toward a Framework to Redress Predictive Privacy Harm.” NYU School of Law, 2014. Available from: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2325784&download=yes.

  • Considering the privacy implications of large-scale analysis of numerous data sources, this paper proposes the implementation of a ‘procedural data due process’ mechanism to arm data subjects against potential privacy intrusions.
  • The authors acknowledge that some privacy protection structures already include similar mechanisms. However, due to the “inherent analytical assumptions and methodological biases” of big data systems, the authors argue for a more rigorous framework.

Letouze, Emmanuel; and Vinck, Patrick. “The Ethics and Politics of Call Data Analytics.” DataPop Alliance, 2015. Available from: http://static1.squarespace.com/static/531a2b4be4b009ca7e474c05/t/54b97f82e4b0ff9569874fe9/1421442946517/WhitePaperCDRsEthicFrameworkDec10-2014Draft-2.pdf.

  • Focusing on the use of Call Detail Records (CDRs) for social good in development contexts, this whitepaper explores both the potential of these datasets – in part by detailing recent successful efforts in the space – and political and ethical constraints to their use.
  • Drawing from the Menlo Report Ethical Principles Guiding ICT Research, the paper explores how these principles might be unpacked to inform an ethics framework for the analysis of CDRs.

Data for Development External Ethics Panel, “Report of the External Ethics Review Panel.” Orange, 2015. Available from: http://www.d4d.orange.com/fr/content/download/43823/426571/version/2/file/D4D_Challenge_DEEP_Report_IBE.pdf.

  • This report presents the findings of the external expert panel overseeing the Orange Data for Development Challenge.
  • Several types of issues faced by the panel are described, along with the various ways in which the panel dealt with those issues.

Federal Trade Commission Staff Report, “Mobile Privacy Disclosures: Building Trust Through Transparency.” Federal Trade Commission, 2013. Available from: www.ftc.gov/os/2013/02/130201mobileprivacyreport.pdf.

  • This report looks at ways to address privacy concerns regarding mobile phone data use. Specific advice is provided for the following actors:
    • Platforms, or operating system providers
    • App developers
    • Advertising networks and other third parties
    • App developer trade associations, along with academics, usability experts and privacy researchers

Mirani, Leo. “How to use mobile phone data for good without invading anyone’s privacy.” Quartz, 2015. Available from: http://qz.com/398257/how-to-use-mobile-phone-data-for-good-without-invading-anyones-privacy/.

  • This article considers the privacy implications of using call detail records for social good, and ways to mitigate risks of privacy intrusion.
  • Taking the example of the Orange D4D challenge and the anonymization strategy that was employed there, the article describes how classic ‘anonymization’ is often not enough. It then lists further measures that can be taken to ensure adequate privacy protection.

Bernholz, Lucy. “Several Examples of Digital Ethics and Proposed Practices.” Stanford Ethics of Data conference, 2014. Available from: http://www.scribd.com/doc/237527226/Several-Examples-of-Digital-Ethics-and-Proposed-Practices.

  • This list of readings prepared for Stanford’s Ethics of Data conference lists some of the leading available literature regarding ethical data use.

Abrams, Martin. “A Unified Ethical Frame for Big Data Analysis.” The Information Accountability Foundation, 2014. Available from: http://www.privacyconference2014.org/media/17388/Plenary5-Martin-Abrams-Ethics-Fundamental-Rights-and-BigData.pdf.

  • Going beyond privacy, this paper discusses the following elements as central to developing a broad framework for data analysis:
    • Beneficial
    • Progressive
    • Sustainable
    • Respectful
    • Fair

Lane, Julia; Stodden, Victoria; Bender, Stefan; and Nissenbaum, Helen. “Privacy, Big Data and the Public Good.” Cambridge University Press, 2014. Available from: http://www.dataprivacybook.org.

  • This book treats the privacy issues surrounding the use of big data for promoting the public good.
  • The questions being asked include the following:
    • What are the ethical and legal requirements for scientists and government officials seeking to serve the public good without harming individual citizens?
    • What are the rules of engagement?
    • What are the best ways to provide access while protecting confidentiality?
    • Are there reasonable mechanisms to compensate citizens for privacy loss?

Richards, Neil M.; and King, Jonathan H. “Big Data Ethics.” Wake Forest Law Review, 2014. Available from: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2384174.

  • This paper describes the growing impact of big data analytics on society, and argues that because of this impact, a set of ethical principles to guide data use is called for.
  • The four proposed themes are: privacy, confidentiality, transparency and identity.
  • Finally, the paper discusses how big data can be integrated into society, going into multiple facets of this integration, including the law, roles of institutions and ethical principles.

OECD, “OECD Guidelines on the Protection of Privacy and Transborder Flows of Personal Data”. Available from: http://www.oecd.org/sti/ieconomy/oecdguidelinesontheprotectionofprivacyandtransborderflowsofpersonaldata.htm.

  • A globally used set of principles to inform thought about handling personal data, the OECD privacy guidelines serve as one of the leading standards for informing privacy policies and data governance structures.
  • The basic principles of national application are the following:
    • Collection Limitation Principle
    • Data Quality Principle
    • Purpose Specification Principle
    • Use Limitation Principle
    • Security Safeguards Principle
    • Openness Principle
    • Individual Participation Principle
    • Accountability Principle

The White House Big Data and Privacy Working Group, “Big Data: Seizing Opportunities, Preserving Values”, White House, 2015. Available from: https://www.whitehouse.gov/sites/default/files/docs/big_data_privacy_report_5.1.14_final_print.pdf.

  • Documenting the findings of the White House big data and privacy working group, this report lists, among others, the following key recommendations regarding data governance:
    • Bringing greater transparency to the data services industry
    • Stimulating international conversation on big data, with multiple stakeholders
    • With regard to educational data: ensuring data is used for the purpose for which it was collected
    • Paying attention to the potential for big data to facilitate discrimination, and expanding technical understanding to stop discrimination

William Hoffman, “Pathways for Progress.” World Economic Forum, 2015. Available from: http://www3.weforum.org/docs/WEFUSA_DataDrivenDevelopment_Report2015.pdf.

  • This paper identifies, among other things, the lack of well-defined and balanced governance mechanisms as one of the key obstacles preventing data, particularly corporate sector data, from being shared in a controlled space.
  • An approach that balances the benefits against the risks of large scale data usage in a development context, building trust among all stakeholders in the data ecosystem, is viewed as key.
  • Furthermore, this whitepaper notes that new governance models are required not just by the growing amount of data and analytical capacity, and more refined methods for analysis. The current “super-structure” of information flows between institutions is also seen as one of the key reasons to develop alternatives to the current – outdated – approaches to data governance.

Selected Readings on Cities and Civic Technology


By Julia Root and Stefaan Verhulst

The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of civic innovation was originally published in 2014.

The last five years have seen a wave of new organizations, entrepreneurs and investment in cities and the field of civic innovation. Two subfields, Civic Tech and Government Innovation, are particularly aligned with GovLab’s interest in the ways in which technology is and can be deployed to redesign public institutions and re-imagine governance.

The emerging field of civic technology, or “Civic Tech,” champions new digital platforms, open data and collaboration tools for transforming government service delivery and engagement with citizens. Government Innovation, while not a new field, has seen in the last five years a proliferation of new structures (e.g. Mayor’s Office of New Urban Mechanics), roles (e.g. Chief Technology/Innovation Officer) and public/private investment (e.g. Innovation Delivery Teams and Code for America Fellows) that are building a world-wide movement for transforming how government thinks about and designs services for its citizens.

There is no set definition for “civic innovation.” However, broadly speaking, it is about improving our cities through the implementation of tools, ideas and engagement methods that strengthen the relationship between government and citizens. The civic innovation field encompasses diverse actors from across the public, private and nonprofit spectrums. These can include government leaders, nonprofit and foundation professionals, urbanists, technologists, researchers, business leaders and community organizers, each of whom may use the term in a different way, but ultimately are seeking to disrupt how cities and public institutions solve problems and invest in solutions.

Annotated Selected Readings (in alphabetical order)

Books

Goldsmith, Stephen, and Susan Crawford. The Responsive City: Engaging Communities Through Data-Smart Governance. 1st edition. San Francisco, CA: Jossey-Bass, 2014. http://bit.ly/1zvKOL0.

  • The Responsive City, a guide to civic engagement and governance in the digital age, is the culmination of research originating from the Data-Smart City Solutions initiative, an ongoing project at Harvard Kennedy School working to catalyze adoption of data projects on the city level.
  • The “data smart city” is one that is responsive to citizens, engages them in problem solving and finds new innovative solutions for dismantling entrenched bureaucracy.
  • The authors document case studies from New York City, Boston and Chicago to explore the following topics:
    • Building trust in the public sector and fostering a sustained, collective voice among communities;
    • Using data-smart governance to preempt and predict problems while improving quality of life;
    • Creating efficiencies and saving taxpayer money with digital tools; and
    • Spearheading these new approaches to government with innovative leadership.

Townsend, Anthony M. Smart Cities: Big Data, Civic Hackers, and the Quest for a New Utopia. 1st edition. New York: W. W. Norton & Company, 2013. http://bit.ly/17Y4G0R.

  • In this book, Townsend illustrates how “cities worldwide are deploying technology to address both the timeless challenges of government and the mounting problems posed by human settlements of previously unimaginable size and complexity.”
  • He also considers “the motivations, aspirations, and shortcomings” of the many stakeholders involved in the development of smart cities, and poses a new civics to guide these efforts.
  • He argues that smart cities are not made smart by the various soon-to-be-obsolete technologies built into their infrastructure; instead, it is how citizens use ever-changing, grassroots technologies to be “human-centered, inclusive and resilient” that will make cities ‘smart.’

Reports + Journal Articles

Black, Alissa, and Rachel Burstein. “The 2050 City – What Civic Innovation Looks Like Today and Tomorrow.” White Paper. New America Foundation – California Civic Innovation Project, June 2013. https://bit.ly/2GohMvw.

  • Through their interviews, the authors determine that civic innovation is not just a “compilation of projects” but that it can inspire institutional structural change.
  • Civic innovation projects that have a “technology focus can sound very different than process-related innovations”; however the outcomes are actually quite similar as they disrupt how citizens and government engage with one another.
  • Technology is viewed by some of the experts as an enabler of civic innovation – not necessarily the driver for innovation itself. What constitutes innovation is how new tools are implemented by government or by civic groups that changes the governing dynamic.

Patel, Mayur, Jon Sotsky, Sean Gourley, and Daniel Houghton. “Knight Foundation Report on Civic Technology.” Presentation. Knight Foundation, December 2013. http://slidesha.re/11UYgO0.

  • This report aims to advance the field of civic technology, which compared to the tech industry as a whole is relatively young. It maps the field, creating a starting place for understanding activity and investment in the sector.
  • It defines two themes, Open Government and Civic Action, and identifies 11 clusters of civic tech innovation that fall into the two themes. For each cluster, the authors describe the type of activities and highlight specific organizations.
  • The report identified more than $430 million of private and philanthropic investment directed to 102 civic tech organizations from January 2011 to May 2013.

Open Plans. “Field Scan on Civic Technology.” Living Cities, November 2012. http://bit.ly/1HGjGih.

  • Commissioned by Living Cities and authored by Open Plans, the Field Scan investigates the emergent field of civic technology and generates the first analysis of the potential impact for the field as well as a critique for how tools and new methods need to be more inclusive of low-income communities in their use and implementation.
  • Respondents generally agreed that the tools developed and in use in cities so far are demonstrations of the potential power of civic tech, but that these tools don’t yet go far enough.
  • Civic tech tools have the potential to improve the lives of low-income people in a number of ways. However, these tools often fail to reach the population they are intended to benefit. To better understand this challenge, civic tech for low-income people must be considered in the broader context of their interactions with technology and with government.
  • Although hackathons are popular, their approach to problem solving is not always driven by community needs, and hackathons often do not produce useful material for governments or citizens in need.

Goldberg, Jeremy M. “Riding the Second Wave of Civic Innovation.” Governing, August 28, 2014. http://bit.ly/1vOKnhJ.

  • In this piece, Goldberg argues that innovation and entrepreneurship in local government increasingly require mobilizing talent from many sectors and skill sets.

Black, Alissa, and Burstein, Rachel. “A Guide for Making Innovation Offices Work.” IBM Center for the Business of Government, October 2014. http://bit.ly/1vOFZP4.

  • In this report, Burstein and Black examine the recent trend toward the creation of innovation offices across the nation at all levels of government to understand the structural models now being used to stimulate innovation—both internally within an agency, and externally for the agency’s partners and communities.
  • The authors conducted interviews with leadership of innovation offices of cities that include Philadelphia, Austin, Kansas City, Chicago, Davis, Memphis and Los Angeles.
  • The report cites examples of offices, generates a typology for the field, links to projects and highlights success factors.

Mulholland, Jessica, and Noelle Knell. “Chief Innovation Officers in State and Local Government (Interactive Map).” Government Technology, March 28, 2014. http://bit.ly/1ycArvX.

  • This article provides an overview of how different cities structure their Chief Innovation Officer positions and provides links to offices, projects and additional editorial content.
  • Some innovation officers find their duties merged with traditional CIO responsibilities, as is the case in Chicago, Philadelphia and New York City. Others, like those in Louisville and Nashville, have titles that reveal a link to their jurisdiction’s economic development endeavors.

Toolkits

Bloomberg Philanthropies. January 2014. “Transform Your City through Innovation: The Innovation Delivery Model for Making It Happen.” New York: Bloomberg Philanthropies. http://bloombg.org/120VrKB.

  • In 2011, Bloomberg Philanthropies funded a three-year innovation capacity program in five major United States cities – Atlanta, Chicago, Louisville, Memphis, and New Orleans – in which cities could hire top-level staff to develop and see through the implementation of solutions to top mayoral priorities such as customer service, murder, homelessness, and economic development, using a sequence of steps.
  • The Innovation Delivery Team Playbook describes the Innovation Delivery Model and each aspect of it, from how to hire and structure the team to how to manage roundtables and run competitions.

Selected Readings on Economic Impact of Open Data


The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of open data was originally published in 2014.

Open data is publicly available data – often released by governments, scientists, and occasionally private companies – that is made available for anyone to use, in a machine-readable format, free of charge. Considerable attention has been devoted to the economic potential of open data for businesses and other organizations, and it is now widely accepted that open data plays an important role in spurring innovation, growth, and job creation. From new business models to innovation in local governance, open data is being quickly adopted as a valuable resource at many levels.

Measuring and analyzing the economic impact of open data in a systematic way is challenging, and governments as well as other providers of open data seek to provide access to the data in a standardized way. As governmental transparency increases and open data changes business models and activities in many economic sectors, it is important to understand best practices for releasing and using non-proprietary, public information. Costs, social challenges, and technical barriers also influence the economic impact of open data.

These selected readings are intended as a first step toward answering whether, and how, we can assess if opening data spurs economic impact.

Annotated Selected Reading List (in alphabetical order)

Bonina, Carla. New Business Models and the Values of Open Data: Definitions, Challenges, and Opportunities. NEMODE 3K – Small Grants Call 2013. http://bit.ly/1xGf9oe

  • In this paper, Dr. Carla Bonina provides an introduction to open data and open data business models, evaluating their potential economic value and identifying future challenges for the effectiveness of open data, such as personal data and privacy, the emerging data divide, and the costs of collecting, producing and releasing open (government) data.

Carpenter, John and Phil Watts. Assessing the Value of OS OpenData™ to the Economy of Great Britain – Synopsis. June 2013. Accessed July 25, 2014. http://bit.ly/1rTLVUE

  • John Carpenter and Phil Watts of Ordnance Survey undertook a study to examine the economic impact of open data on the economy of Great Britain. Using a variety of methods such as case studies, interviews, download analysis, adoption rates, impact calculation, and CGE modeling, the authors estimate that the OS OpenData initiative will deliver a net increase in GDP of £13 – 28.5 million for Great Britain in 2013.

Capgemini Consulting. The Open Data Economy: Unlocking Economic Value by Opening Government and Public Data. Capgemini Consulting. Accessed July 24, 2014. http://bit.ly/1n7MR02

  • This report explores how governments are leveraging open data for economic benefits. Using a comparative approach, the authors examine open data from organizational, technological, social and political perspectives. The study highlights the potential of open data to drive profit by increasing the effectiveness of benchmarking and other data-driven business strategies.

Deloitte. Open Growth: Stimulating Demand for Open Data in the UK. Deloitte Analytics. December 2012. Accessed July 24, 2014. http://bit.ly/1oeFhks

  • This early paper on open data by Deloitte uses case studies and statistical analysis on open government data to create models of businesses using open data. They also review the market supply and demand of open government data in emerging sectors of the economy.

Gruen, Nicholas, John Houghton and Richard Tooth. Open for Business: How Open Data Can Help Achieve the G20 Growth Target. Accessed July 24, 2014. http://bit.ly/UOmBRe

  • This report highlights the potential economic value of the open data agenda in Australia and the G20. The report provides an initial literature review on the economic value of open data, as well as a set of case studies on the economic value of open data, and a set of recommendations for how open data can help the G20 and Australia achieve target objectives in the areas of trade, finance, fiscal and monetary policy, anti-corruption, employment, energy, and infrastructure.

Heusser, Felipe I. Understanding Open Government Data and Addressing Its Impact (draft version). World Wide Web Foundation. http://bit.ly/1o9Egym

  • The World Wide Web Foundation, in collaboration with IDRC, has begun a research network to explore the impacts of open data in developing countries. In addition to the Web Foundation and IDRC, the network includes the Berkman Center for Internet and Society at Harvard, the Open Development Technology Alliance and Practical Participation.

Howard, Alex. San Francisco Looks to Tap Into the Open Data Economy. O’Reilly Radar: Insight, Analysis, and Reach about Emerging Technologies. October 19, 2012. Accessed July 24, 2014. http://oreil.ly/1qNRt3h

  • Alex Howard points to San Francisco as one of the first municipalities in the United States to embrace an open data platform. He outlines how open data has driven innovation in local governance. Moreover, he discusses the potential impact of open data on job creation and government technology infrastructure in the City and County of San Francisco.

Huijboom, Noor and Tijs Van den Broek. Open Data: An International Comparison of Strategies. European Journal of ePractice. March 2011. Accessed July 24, 2014. http://bit.ly/1AE24jq

  • This article examines five countries and their open data strategies, identifying key features, main barriers, and drivers of progress for open data programs. The authors outline the key challenges facing European and other national open data policies, highlighting the emerging role open data initiatives are playing in political and administrative agendas around the world.

Manyika, J., Michael Chui, Diana Farrell, Steve Van Kuiken, Peter Groves, and Elizabeth Almasi Doshi. Open Data: Unlocking Innovation and Performance with Liquid Information. McKinsey Global Institute. October 2013. Accessed July 24, 2014. http://bit.ly/1lgDX0v

  • This research focuses on quantifying the potential value of open data in seven “domains” in the global economy: education, transportation, consumer products, electricity, oil and gas, health care, and consumer finance.

Moore, Alida. Congressional Transparency Caucus: How Open Data Creates Jobs. April 2, 2014. Accessed July 30, 2014. Socrata. http://bit.ly/1n7OJpp

  • Socrata provides a summary of the March 24th briefing of the Congressional Transparency Caucus on the need to increase government transparency through adopting open data initiatives. They include key takeaways from the panel discussion, as well as their role in making open data available for businesses.

Stott, Andrew. Open Data for Economic Growth. The World Bank. June 25, 2014. Accessed July 24, 2014. http://bit.ly/1n7PRJF

  • In this report, The World Bank examines the evidence for the economic potential of open data, holding that the economic potential is quite large, despite variation in the published estimates and methodological difficulties in assessing it. The report provides five archetypes of businesses using open data and offers recommendations for governments trying to maximize economic growth from open data.

Selected Readings on Sentiment Analysis


The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of sentiment analysis was originally published in 2014.

Sentiment Analysis is a field of Computer Science that uses techniques from natural language processing, computational linguistics, and machine learning to predict subjective meaning from text. The term opinion mining is often used interchangeably with Sentiment Analysis, although it is technically a subfield focusing on the extraction of opinions (the umbrella under which sentiment, evaluation, appraisal, attitude, and emotion all lie).

The rise of Web 2.0 and increased information flow have led to growing interest in Sentiment Analysis, especially as applied to social networks and media. Events causing large spikes in media, such as the 2012 Presidential Election Debates, are especially ripe for analysis. Such analyses raise a variety of implications for the future of crowd participation, elections, and governance.

Selected Reading List (in alphabetical order)

Annotated Selected Reading List (in alphabetical order)

Choi, Eunsol et al. “Hedge detection as a lens on framing in the GMO debates: a position paper.” Proceedings of the Workshop on Extra-Propositional Aspects of Meaning in Computational Linguistics 13 Jul. 2012: 70-79. http://bit.ly/1wweftP

  • Understanding the ways in which participants in public discussions frame their arguments is important for understanding how public opinion is formed. This paper adopts the position that it is time for more computationally-oriented research on problems involving framing. In the interests of furthering that goal, the authors propose the following question: In the controversy regarding the use of genetically-modified organisms (GMOs) in agriculture, do pro- and anti-GMO articles differ in whether they choose to adopt a more “scientific” tone?
  • Prior work on the rhetoric and sociology of science suggests that hedging may distinguish popular-science text from text written by professional scientists for their colleagues. The paper proposes a detailed approach to studying whether hedge detection can be used to understand scientific framing in the GMO debates, and provides corpora to facilitate this study. Some of the preliminary analyses suggest that hedges occur less frequently in scientific discourse than in popular text, a finding that contradicts prior assertions in the literature.
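
The idea of comparing hedge frequency across texts can be illustrated with a minimal sketch. This is not the authors' method: the cue list `HEDGE_CUES` and the function `hedge_rate` are hypothetical names introduced here, and a real study would rely on a curated lexicon or a trained hedge detector rather than simple word matching.

```python
import re

# Hypothetical mini-lexicon of hedge cues (an assumption for illustration only).
HEDGE_CUES = {"may", "might", "could", "suggest", "suggests",
              "possibly", "appear", "appears", "likely"}


def hedge_rate(text: str) -> float:
    """Return the fraction of word tokens in `text` that are hedge cues."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hedges = sum(1 for token in tokens if token in HEDGE_CUES)
    return hedges / len(tokens)


# Two made-up sentences, used only to show how the rate is computed.
text_a = "This crop is dangerous and will harm consumers."
text_b = "These results suggest the crop may possibly affect some consumers."

rate_a = hedge_rate(text_a)
rate_b = hedge_rate(text_b)
```

Comparing such rates across a pro-GMO and an anti-GMO corpus is one simple way to ask whether one side adopts a more hedged tone.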

Michael, Christina, Francesca Toni, and Krysia Broda. “Sentiment analysis for debates.” (Unpublished MSc thesis). Department of Computing, Imperial College London (2013). http://bit.ly/Wi86Xv

  • This project aims to expand on existing solutions used for automatic sentiment analysis on text in order to capture support/opposition and agreement/disagreement in debates. In addition, it looks at visualizing the classification results for enhancing the ease of understanding the debates and for showing underlying trends. Finally, it evaluates proposed techniques on an existing debate system for social networking.

Murakami, Akiko, and Rudy Raymond. “Support or oppose?: classifying positions in online debates from reply activities and opinion expressions.” Proceedings of the 23rd International Conference on Computational Linguistics: Posters 23 Aug. 2010: 869-875. https://bit.ly/2Eicfnm

  • In this paper, the authors propose a method for the task of identifying the general positions of users in online debates, i.e., support or oppose the main topic of an online debate, by exploiting local information in their remarks within the debate. An online debate is a forum where each user posts an opinion on a particular topic while other users state their positions by posting their remarks within the debate. The supporting or opposing remarks are made by directly replying to the opinion, or indirectly to other remarks (to express local agreement or disagreement), which makes the task of identifying users’ general positions difficult.
  • A prior study has shown that a link-based method, which completely ignores the content of the remarks, can achieve higher accuracy for the identification task than methods based solely on the contents of the remarks. In this paper, it is shown that incorporating the textual content of the remarks into the link-based method can yield higher accuracy in the identification task.

Pang, Bo, and Lillian Lee. “Opinion mining and sentiment analysis.” Foundations and trends in information retrieval 2.1-2 (2008): 1-135. http://bit.ly/UaCBwD

  • This survey covers techniques and approaches that promise to directly enable opinion-oriented information-seeking systems. Its focus is on methods that seek to address the new challenges raised by sentiment-aware applications, as compared to those that are already present in more traditional fact-based analysis. It includes material on summarization of evaluative text and on broader issues regarding privacy, manipulation, and economic impact that the development of opinion-oriented information-access services gives rise to. To facilitate future work, a discussion of available resources, benchmark datasets, and evaluation campaigns is also provided.

Ranade, Sarvesh et al. “Online debate summarization using topic directed sentiment analysis.” Proceedings of the Second International Workshop on Issues of Sentiment Discovery and Opinion Mining 11 Aug. 2013: 7. http://bit.ly/1nbKtLn

  • Social networking sites provide users a virtual community interaction platform to share their thoughts, life experiences and opinions. An online debate forum is one such platform, where people can take a stance and argue in support of or in opposition to debate topics. An important feature of such forums is that they are dynamic and grow rapidly. In such situations, effective opinion summarization approaches are needed so that readers need not go through the entire debate.
  • This paper aims to summarize online debates by extracting highly topic-relevant and sentiment-rich sentences. The proposed approach takes into account topic-relevant, document-relevant and sentiment-based features to capture topic-opinionated sentences. ROUGE (Recall-Oriented Understudy for Gisting Evaluation, a set of metrics and a software package for comparing an automatically produced summary or translation against human-produced ones) scores are used to evaluate the system. This system significantly outperforms several baseline systems and shows improvement over the state-of-the-art opinion summarization system. The results verify that topic-directed sentiment features are most important to generate effective debate summaries.
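
As a rough illustration of what a ROUGE score measures, the sketch below computes ROUGE-N recall: the share of n-grams in a human-written reference summary that also appear in the system summary. It is a simplified sketch rather than the official ROUGE package; whitespace tokenization and the function names `ngrams` and `rouge_n_recall` are assumptions made here.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate: str, reference: str, n: int = 1) -> float:
    """Fraction of reference n-grams also found in the candidate (clipped counts)."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Each reference n-gram is matched at most as often as it occurs in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


reference_summary = "the debate favors the ban"
system_summary = "the debate supports the ban"

unigram_recall = rouge_n_recall(system_summary, reference_summary, n=1)
bigram_recall = rouge_n_recall(system_summary, reference_summary, n=2)
```

In this toy case four of the five reference unigrams and two of the four reference bigrams are recovered, so higher-order n-grams penalize the substituted word more heavily.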

Schneider, Jodi. “Automated argumentation mining to the rescue? Envisioning argumentation and decision-making support for debates in open online collaboration communities.” http://bit.ly/1mi7ztx

  • Argumentation mining, a relatively new area of discourse analysis, involves automatically identifying and structuring arguments. Following a basic introduction to argumentation, the author describes a new possible domain for argumentation mining: debates in open online collaboration communities.
  • Based on experience with manual annotation of arguments in debates, the author proposes argumentation mining as the basis for three kinds of support tools: for authoring more persuasive arguments, finding weaknesses in others’ arguments, and summarizing a debate’s overall conclusions.

Selected Readings on Crowdsourcing Expertise


The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of crowdsourcing was originally published in 2014.

Crowdsourcing enables leaders and citizens to work together to solve public problems in new and innovative ways. New tools and platforms enable citizens with differing levels of knowledge, expertise, experience and abilities to collaborate and solve problems together. Identifying experts, or individuals with specialized skills, knowledge or abilities with regard to a specific topic, and incentivizing their participation in crowdsourcing information, knowledge or experience to achieve a shared goal can enhance the efficiency and effectiveness of problem solving.

Selected Reading List (in alphabetical order)

Annotated Selected Reading List (in alphabetical order)

Börner, Katy, Michael Conlon, Jon Corson-Rikert, and Ying Ding. “VIVO: A Semantic Approach to Scholarly Networking and Discovery.” Synthesis Lectures on the Semantic Web: Theory and Technology 2, no. 1 (October 17, 2012): 1–178. http://bit.ly/17huggT.

  • This e-book “provides an introduction to VIVO…a tool for representing information about research and researchers — their scholarly works, research interests, and organizational relationships.”
  • VIVO is a response to the fact that, “Information for scholars — and about scholarly activity — has not kept pace with the increasing demands and expectations. Information remains siloed in legacy systems and behind various access controls that must be licensed or otherwise negotiated before access. Information representation is in its infancy. The raw material of scholarship — the data and information regarding previous work — is not available in common formats with common semantics.”
  • Providing access to structured information on the work and experience of a diversity of scholars enables improved expert finding: “identifying and engaging experts whose scholarly works is of value to one’s own. To find experts, one needs rich data regarding one’s own work and the work of potential related experts.” The authors argue that expert finding is of increasing importance since, “[m]ulti-disciplinary and inter-disciplinary investigation is increasingly required to address complex problems.”

Bozzon, Alessandro, Marco Brambilla, Stefano Ceri, Matteo Silvestri, and Giuliano Vesci. “Choosing the Right Crowd: Expert Finding in Social Networks.” In Proceedings of the 16th International Conference on Extending Database Technology, 637–648. EDBT ’13. New York, NY, USA: ACM, 2013. http://bit.ly/18QbtY5.

  • This paper explores the challenge of selecting experts within the population of social networks by considering the following problem: “given an expertise need (expressed for instance as a natural language query) and a set of social network members, who are the most knowledgeable people for addressing that need?”
  • The authors come to the following conclusions:
    • “profile information is generally less effective than information about resources that they directly create, own or annotate;
    • resources which are produced by others (resources appearing on the person’s Facebook wall or produced by people that she follows on Twitter) help increasing the assessment precision;
    • Twitter appears the most effective social network for expertise matching, as it very frequently outperforms all other social networks (either combined or alone);
    • Twitter appears as well very effective for matching expertise in domains such as computer engineering, science, sport, and technology & games, but Facebook is also very effective in fields such as locations, music, sport, and movies & tv;
    • surprisingly, LinkedIn appears less effective than other social networks in all domains (including computer science) and overall.”

Brabham, Daren C. “The Myth of Amateur Crowds.” Information, Communication & Society 15, no. 3 (2012): 394–410. http://bit.ly/1hdnGJV.

  • Unlike most of the related literature, this paper focuses on bringing attention to the expertise already being tapped by crowdsourcing efforts rather than determining ways to identify more dormant expertise to improve the results of crowdsourcing.
  • Brabham comes to two central conclusions: “(1) crowdsourcing is discussed in the popular press as a process driven by amateurs and hobbyists, yet empirical research on crowdsourcing indicates that crowds are largely self-selected professionals and experts who opt-in to crowdsourcing arrangements; and (2) the myth of the amateur in crowdsourcing ventures works to label crowds as mere hobbyists who see crowdsourcing ventures as opportunities for creative expression, as entertainment, or as opportunities to pass the time when bored. This amateur/hobbyist label then undermines the fact that large amounts of real work and expert knowledge are exerted by crowds for relatively little reward and to serve the profit motives of companies.”

Dutton, William H. Networking Distributed Public Expertise: Strategies for Citizen Sourcing Advice to Government. One of a Series of Occasional Papers in Science and Technology Policy, Science and Technology Policy Institute, Institute for Defense Analyses, February 23, 2011. http://bit.ly/1c1bpEB.

  • In this paper, a case is made for more structured and well-managed crowdsourcing efforts within government. Specifically, the paper “explains how collaborative networking can be used to harness the distributed expertise of citizens, as distinguished from citizen consultation, which seeks to engage citizens — each on an equal footing.” Instead of looking for answers from an undefined crowd, Dutton proposes “networking the public as advisors” by seeking to “involve experts on particular public issues and problems distributed anywhere in the world.”
  • Dutton argues that expert-based crowdsourcing can be successful for government for a number of reasons:
    • Direct communication with a diversity of independent experts
    • The convening power of government
    • Compatibility with open government and open innovation
    • Synergy with citizen consultation
    • Building on experience with paid consultants
    • Speed and urgency
    • Centrality of documents to policy and practice.
  • He also proposes a nine-step process for government to foster bottom-up collaboration networks:
    • Do not reinvent the technology
    • Focus on activities, not the tools
    • Start small, but capable of scaling up
    • Modularize
    • Be open and flexible in finding and going to communities of experts
    • Do not concentrate on one approach to all problems
    • Cultivate the bottom-up development of multiple projects
    • Experience networking and collaborating — be a networked individual
    • Capture, reward, and publicize success.

Goel, Gagan, Afshin Nikzad and Adish Singla. “Matching Workers with Tasks: Incentives in Heterogeneous Crowdsourcing Markets.” Under review by the International World Wide Web Conference (WWW). 2014. http://bit.ly/1qHBkdf

  • Combining the notions of crowdsourcing expertise and crowdsourcing tasks, this paper focuses on the challenge within platforms like Mechanical Turk related to intelligently matching tasks to workers.
  • The authors’ call for more strategic assignment of tasks in crowdsourcing markets is based on the understanding that “each worker has certain expertise and interests which define the set of tasks she can and is willing to do.”
  • Focusing on developing meaningful incentives based on varying levels of expertise, the authors sought to create a mechanism that, “i) is incentive compatible in the sense that it is truthful for agents to report their true cost, ii) picks a set of workers and assigns them to the tasks they are eligible for in order to maximize the utility of the requester, iii) makes sure total payments made to the workers doesn’t exceed the budget of the requester.”

Gubanov, D., N. Korgin, D. Novikov and A. Kalkov. E-Expertise: Modern Collective Intelligence. Springer, Studies in Computational Intelligence 558, 2014. http://bit.ly/U1sxX7

  • In this book, the authors focus on “organization and mechanisms of expert decision-making support using modern information and communication technologies, as well as information analysis and collective intelligence technologies (electronic expertise or simply e-expertise).”
  • The book, which “addresses a wide range of readers interested in management, decision-making and expert activity in political, economic, social and industrial spheres,” is broken into five chapters:
    • Chapter 1 (E-Expertise) discusses the role of e-expertise in decision-making processes. The procedures of e-expertise are classified, their benefits and shortcomings are identified, and the efficiency conditions are considered.
    • Chapter 2 (Expert Technologies and Principles) provides a comprehensive overview of modern expert technologies. A special emphasis is placed on the specifics of e-expertise. Moreover, the authors study the feasibility and reasonability of employing well-known methods and approaches in e-expertise.
    • Chapter 3 (E-Expertise: Organization and Technologies) describes some examples of up-to-date technologies to perform e-expertise.
    • Chapter 4 (Trust Networks and Competence Networks) deals with the problems of expert finding and grouping by information and communication technologies.
    • Chapter 5 (Active Expertise) treats the problem of expertise stability against any strategic manipulation by experts or coordinators pursuing individual goals.

Holst, Cathrine. “Expertise and Democracy.” ARENA Report No 1/14, Center for European Studies, University of Oslo. http://bit.ly/1nm3rh4

  • This report contains a set of 16 papers focused on the concept of “epistocracy,” meaning the “rule of knowers.” The papers inquire into the role of knowledge and expertise in modern democracies and especially in the European Union (EU). Major themes are: expert-rule and democratic legitimacy; the role of knowledge and expertise in EU governance; and the European Commission’s use of expertise.
    • Expert-rule and democratic legitimacy
      • Papers within this theme concentrate on issues such as the “implications of modern democracies’ knowledge and expertise dependence for political and democratic theory.” Topics include the accountability of experts, the legitimacy of expert arrangements within democracies, the role of evidence in policy-making, how expertise can be problematic in democratic contexts, and “ethical expertise” and its place in epistemic democracies.
    • The role of knowledge and expertise in EU governance
      • Papers within this theme concentrate on “general trends and developments in the EU with regard to the role of expertise and experts in political decision-making, the implications for the EU’s democratic legitimacy, and analytical strategies for studying expertise and democratic legitimacy in an EU context.”
    • The European Commission’s use of expertise
      • Papers within this theme concentrate on how the European Commission uses expertise and in particular the European Commission’s “expert group system.” Topics include the European Citizens’ Initiative, analytic-deliberative processes in EU food safety, the operation of EU environmental agencies, and the autonomy of various EU agencies.

King, Andrew and Karim R. Lakhani. “Using Open Innovation to Identify the Best Ideas.” MIT Sloan Management Review, September 11, 2013. http://bit.ly/HjVOpi.

  • In this paper, King and Lakhani examine different methods for opening innovation, where, “[i]nstead of doing everything in-house, companies can tap into the ideas cloud of external expertise to develop new products and services.”
  • The three types of open innovation discussed are: opening the idea-creation process, competitions where prizes are offered and designers bid with possible solutions; opening the idea-selection process, ‘approval contests’ in which outsiders vote to determine which entries should be pursued; and opening both idea generation and selection, an option used especially by organizations focused on quickly changing needs.

Long, Chengjiang, Gang Hua and Ashish Kapoor. Active Visual Recognition with Expertise Estimation in Crowdsourcing. 2013 IEEE International Conference on Computer Vision. December 2013. http://bit.ly/1lRWFur.

  • This paper is focused on improving the crowdsourced labeling of visual datasets from platforms like Mechanical Turk. The authors note that, “Although it is cheap to obtain large quantity of labels through crowdsourcing, it has been well known that the collected labels could be very noisy. So it is desirable to model the expertise level of the labelers to ensure the quality of the labels. The higher the expertise level a labeler is at, the lower the label noises he/she will produce.”
  • Based on the need for identifying expert labelers upfront, the authors developed an “active classifier learning system which determines which users to label which unlabeled examples” from collected visual datasets.
  • The researchers’ experiments in identifying expert visual dataset labelers led to findings demonstrating that the “active selection” of expert labelers is beneficial in cutting through the noise of crowdsourcing platforms.
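
To make concrete why modeling labeler expertise can cut through label noise, the sketch below aggregates crowd labels with an expertise-weighted vote. This is a deliberately simple stand-in, not the active learning system described in the paper; the function name `weighted_vote`, the expertise scores, and the default weight of 0.5 for unknown labelers are all assumptions made for illustration.

```python
from collections import defaultdict


def weighted_vote(labels: dict, expertise: dict, default_weight: float = 0.5) -> str:
    """Pick the label with the largest total expertise weight.

    labels maps labeler id -> proposed label; expertise maps labeler id ->
    an estimated reliability weight (hypothetical values in this sketch).
    """
    if not labels:
        raise ValueError("at least one label is required")
    scores = defaultdict(float)
    for labeler, label in labels.items():
        scores[label] += expertise.get(labeler, default_weight)
    return max(scores, key=scores.get)


# Two low-expertise labelers disagree with one high-expertise labeler.
crowd_labels = {"worker_a": "cat", "worker_b": "dog", "worker_c": "dog"}
estimated_expertise = {"worker_a": 0.95, "worker_b": 0.4, "worker_c": 0.4}

chosen = weighted_vote(crowd_labels, estimated_expertise)
```

A plain majority vote would pick "dog" here; weighting by estimated expertise lets the single reliable labeler outweigh two noisy ones.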

Noveck, Beth Simone. “’Peer to Patent’: Collective Intelligence, Open Review, and Patent Reform.” Harvard Journal of Law & Technology 20, no. 1 (Fall 2006): 123–162. http://bit.ly/HegzTT.

  • This law review article introduces the idea of crowdsourcing expertise to mitigate the challenge of patent processing. Noveck argues that, “access to information is the crux of the patent quality problem. Patent examiners currently make decisions about the grant of a patent that will shape an industry for a twenty-year period on the basis of a limited subset of available information. Examiners may neither consult the public, talk to experts, nor, in many cases, even use the Internet.”
  • Peer-to-Patent, which launched three years after this article, is based on the idea that, “The new generation of social software might not only make it easier to find friends but also to find expertise that can be applied to legal and policy decision-making. This way, we can improve upon the Constitutional promise to promote the progress of science and the useful arts in our democracy by ensuring that only worthy ideas receive that ‘odious monopoly’ of which Thomas Jefferson complained.”

Ober, Josiah. “Democracy’s Wisdom: An Aristotelian Middle Way for Collective Judgment.” American Political Science Review 107, no. 01 (2013): 104–122. http://bit.ly/1cgf857.

  • In this paper, Ober argues that, “A satisfactory model of decision-making in an epistemic democracy must respect democratic values, while advancing citizens’ interests, by taking account of relevant knowledge about the world.”
  • Ober describes an approach to decision-making that aggregates expertise across multiple domains. This “Relevant Expertise Aggregation (REA) enables a body of minimally competent voters to make superior choices among multiple options, on matters of common interest.”

Sims, Max H., Jeffrey Bigham, Henry Kautz and Marc W. Halterman. “Crowdsourcing medical expertise in near real time.” Journal of Hospital Medicine 9, no. 7, July 2014. http://bit.ly/1kAKvq7.

  • In this article, the authors discuss the development of a mobile application called DocCHIRP, which was created because, “although the Internet creates unprecedented access to information, gaps in the medical literature and inefficient searches often leave healthcare providers’ questions unanswered.”
  • The DocCHIRP pilot project used a “system of point-to-multipoint push notifications designed to help providers problem solve by crowdsourcing from their peers.”
  • Healthcare providers (HCPs) sought to gain intelligence from the crowd, which included 85 registered users, on questions related to medication, complex medical decision making, standard of care, administrative, testing and referrals.
  • The authors believe that, “if future iterations of the mobile crowdsourcing applications can address…adoption barriers and support the organic growth of the crowd of HCPs,” then “the approach could have a positive and transformative effect on how providers acquire relevant knowledge and care for patients.”

Spina, Alessandro. “Scientific Expertise and Open Government in the Digital Era: Some Reflections on EFSA and Other EU Agencies.” in Foundations of EU Food Law and Policy, eds. A. Alemanno and S. Gabbi. Ashgate, 2014. http://bit.ly/1k2EwdD.

  • In this paper, Spina “presents some reflections on how the collaborative and crowdsourcing practices of Open Government could be integrated in the activities of EFSA [European Food Safety Authority] and other EU agencies,” with a particular focus on “highlighting the benefits of the Open Government paradigm for expert regulatory bodies in the EU.”
  • Spina argues that the “crowdsourcing of expertise and the reconfiguration of the information flows between European agencies and the public could represent a concrete possibility of modernising the role of agencies with a new model that has a low financial burden and an almost immediate effect on the legal governance of agencies.”
  • He concludes that, “It is becoming evident that in order to guarantee that the best scientific expertise is provided to EU institutions and citizens, EFSA should strive to use the best organisational models to source science and expertise.”

Selected Readings on Crowdsourcing Tasks and Peer Production


The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of crowdsourcing was originally published in 2014.

Technological advances are creating a new paradigm by which institutions and organizations are increasingly outsourcing tasks to an open community, allocating specific needs to a flexible, willing and dispersed workforce. “Microtasking” platforms like Amazon’s Mechanical Turk are a burgeoning source of income for individuals who contribute their time, skills and knowledge on a per-task basis. In parallel, citizen science projects – task-based initiatives in which citizens of any background can help contribute to scientific research – like Galaxy Zoo are demonstrating the ability of lay and expert citizens alike to make small, useful contributions to aid large, complex undertakings. As governing institutions seek to do more with less, looking to the success of citizen science and microtasking initiatives could provide a blueprint for engaging citizens to help accomplish difficult, time-consuming objectives at little cost. Moreover, the incredible success of peer-production projects – best exemplified by Wikipedia – instills optimism regarding the public’s willingness and ability to complete relatively small tasks that feed into a greater whole and benefit the public good. You can learn more about this new wave of “collective intelligence” by following the MIT Center for Collective Intelligence and their annual Collective Intelligence Conference.

Selected Reading List (in alphabetical order)

Annotated Selected Reading List (in alphabetical order)

Benkler, Yochai. The Wealth of Networks: How Social Production Transforms Markets and Freedom. Yale University Press, 2006. http://bit.ly/1aaU7Yb.

  • In this book, Benkler “describes how patterns of information, knowledge, and cultural production are changing – and shows that the way information and knowledge are made available can either limit or enlarge the ways people can create and express themselves.”
  • In his discussion on Wikipedia – one of many paradigmatic examples of people collaborating without financial reward – he calls attention to the notable ongoing cooperation taking place among a diversity of individuals. He argues that, “The important point is that Wikipedia requires not only mechanical cooperation among people, but a commitment to a particular style of writing and describing concepts that is far from intuitive or natural to people. It requires self-discipline. It enforces the behavior it requires primarily through appeal to the common enterprise that the participants are engaged in…”

Brabham, Daren C. Using Crowdsourcing in Government. Collaborating Across Boundaries Series. IBM Center for The Business of Government, 2013. http://bit.ly/17gzBTA.

  • In this report, Brabham categorizes government crowdsourcing cases into a “four-part, problem-based typology, encouraging government leaders and public administrators to consider these open problem-solving techniques as a way to engage the public and tackle difficult policy and administrative tasks more effectively and efficiently using online communities.”
  • The proposed four-part typology describes the following types of crowdsourcing in government:
    • Knowledge Discovery and Management
    • Distributed Human Intelligence Tasking
    • Broadcast Search
    • Peer-Vetted Creative Production
  • In his discussion on Distributed Human Intelligence Tasking, Brabham argues that Amazon’s Mechanical Turk and other microtasking platforms could be useful in a number of governance scenarios, including:
    • Governments and scholars transcribing historical document scans
    • Public health departments translating health campaign materials into foreign languages to benefit constituents who do not speak the native language
    • Governments translating tax documents, school enrollment and immunization brochures, and other important materials into minority languages
    • Helping governments predict citizens’ behavior, “such as for predicting their use of public transit or other services or for predicting behaviors that could inform public health practitioners and environmental policy makers”

Boudreau, Kevin J., Patrick Gaule, Karim Lakhani, Christoph Riedl, Anita Williams Woolley. “From Crowds to Collaborators: Initiating Effort & Catalyzing Interactions Among Online Creative Workers.” Harvard Business School Technology & Operations Mgt. Unit Working Paper No. 14-060. January 23, 2014. https://bit.ly/2QVmGUu.

  • In this working paper, the authors explore the “conditions necessary for eliciting effort from those affecting the quality of interdependent teamwork” and “consider the role of incentives versus social processes in catalyzing collaboration.”
  • The paper’s findings are based on an experiment involving 260 individuals randomly assigned to 52 teams working toward solutions to a complex problem.
  • The authors determined that the level of effort in such collaborative undertakings is sensitive to cash incentives. However, collaboration among teams was driven more by the active participation of teammates than by any monetary reward.

Franzoni, Chiara, and Henry Sauermann. “Crowd Science: The Organization of Scientific Research in Open Collaborative Projects.” Research Policy (August 14, 2013). http://bit.ly/HihFyj.

  • In this paper, the authors explore the concept of crowd science, which they define based on two important features: “participation in a project is open to a wide base of potential contributors, and intermediate inputs such as data or problem solving algorithms are made openly available.” The rationale for their study and conceptual framework is the “growing attention from the scientific community, but also policy makers, funding agencies and managers who seek to evaluate its potential benefits and challenges. Based on the experiences of early crowd science projects, the opportunities are considerable.”
  • Based on the study of a number of crowd science projects – including governance-related initiatives like Patients Like Me – the authors identify a number of potential benefits in the following categories:
    • Knowledge-related benefits
    • Benefits from open participation
    • Benefits from the open disclosure of intermediate inputs
    • Motivational benefits
  • The authors also identify a number of challenges:
    • Organizational challenges
      • Matching projects and people
      • Division of labor and integration of contributions
      • Project leadership
    • Motivational challenges
      • Sustaining contributor involvement
      • Supporting a broader set of motivations
      • Reconciling conflicting motivations

Kittur, Aniket, Ed H. Chi, and Bongwon Suh. “Crowdsourcing User Studies with Mechanical Turk.” In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 453–456. CHI ’08. New York, NY, USA: ACM, 2008. http://bit.ly/1a3Op48.

  • In this paper, the authors examine “[m]icro-task markets, such as Amazon’s Mechanical Turk, [which] offer a potential paradigm for engaging a large number of users for low time and monetary costs. [They] investigate the utility of a micro-task market for collecting user measurements, and discuss design considerations for developing remote micro user evaluation tasks.”
  • The authors conclude that in addition to providing a means for crowdsourcing small, clearly defined, often non-skill-intensive tasks, “Micro-task markets such as Amazon’s Mechanical Turk are promising platforms for conducting a variety of user study tasks, ranging from surveys to rapid prototyping to quantitative measures. Hundreds of users can be recruited for highly interactive tasks for marginal costs within a timeframe of days or even minutes. However, special care must be taken in the design of the task, especially for user measurements that are subjective or qualitative.”

Kittur, Aniket, Jeffrey V. Nickerson, Michael S. Bernstein, Elizabeth M. Gerber, Aaron Shaw, John Zimmerman, Matthew Lease, and John J. Horton. “The Future of Crowd Work.” In 16th ACM Conference on Computer Supported Cooperative Work (CSCW 2013), 2012. http://bit.ly/1c1GJD3.

  • In this paper, the authors discuss paid crowd work, which “offers remarkable opportunities for improving productivity, social mobility, and the global economy by engaging a geographically distributed workforce to complete complex tasks on demand and at scale.” However, they caution that, “it is also possible that crowd work will fail to achieve its potential, focusing on assembly-line piecework.”
  • The authors argue that a number of key challenges must be met to ensure that crowd work processes evolve and reach their full potential, including:
    • Designing workflows
    • Assigning tasks
    • Supporting hierarchical structure
    • Enabling real-time crowd work
    • Supporting synchronous collaboration
    • Controlling quality

Madison, Michael J. “Commons at the Intersection of Peer Production, Citizen Science, and Big Data: Galaxy Zoo.” In Convening Cultural Commons, 2013. http://bit.ly/1ih9Xzm.

  • This paper explores a “case of commons governance grounded in research in modern astronomy. The case, Galaxy Zoo, is a leading example of at least three different contemporary phenomena. In the first place, Galaxy Zoo is a global citizen science project, in which volunteer non-scientists have been recruited to participate in large-scale data analysis on the Internet. In the second place, Galaxy Zoo is a highly successful example of peer production, sometimes known as crowdsourcing…In the third place, [Galaxy Zoo] is a highly visible example of data-intensive science, sometimes referred to as e-science or Big Data science, by which scientific researchers develop methods to grapple with the massive volumes of digital data now available to them via modern sensing and imaging technologies.”
  • Madison concludes that the success of Galaxy Zoo has not been the result of the “character of its information resources (scientific data) and rules regarding their usage,” but rather, the fact that the “community was guided from the outset by a vision of a specific organizational solution to a specific research problem in astronomy, initiated and governed, over time, by professional astronomers in collaboration with their expanding universe of volunteers.”

Malone, Thomas W., Robert Laubacher and Chrysanthos Dellarocas. “Harnessing Crowds: Mapping the Genome of Collective Intelligence.” MIT Sloan Research Paper. February 3, 2009. https://bit.ly/2SPjxTP.

  • In this article, the authors describe and map the phenomenon of collective intelligence – also referred to as “radical decentralization, crowd-sourcing, wisdom of crowds, peer production, and wikinomics” – which they broadly define as “groups of individuals doing things collectively that seem intelligent.”
  • The article is derived from the authors’ work at MIT’s Center for Collective Intelligence, where they gathered nearly 250 examples of Web-enabled collective intelligence. To map the building blocks or “genes” of collective intelligence, the authors used two pairs of related questions:
    • Who is performing the task? Why are they doing it?
    • What is being accomplished? How is it being done?
  • The authors concede that much work remains to be done “to identify all the different genes for collective intelligence, the conditions under which these genes are useful, and the constraints governing how they can be combined,” but they believe that their framework provides a useful start and gives managers and other institutional decisionmakers looking to take advantage of collective intelligence activities the ability to “systematically consider many possible combinations of answers to questions about Who, Why, What, and How.”

Mulgan, Geoff. “True Collective Intelligence? A Sketch of a Possible New Field.” Philosophy & Technology 27, no. 1. March 2014. http://bit.ly/1p3YSdd.

  • In this paper, Mulgan explores the concept of a collective intelligence, a “much talked about but…very underdeveloped” field.
  • With a particular focus on health knowledge, Mulgan “sets out some of the potential theoretical building blocks, suggests an experimental and research agenda, shows how it could be analysed within an organisation or business sector and points to possible intellectual barriers to progress.”
  • He concludes that the “central message that comes from observing real intelligence is that intelligence has to be for something,” and that “turning this simple insight – the stuff of so many science fiction stories – into new theories, new technologies and new applications looks set to be one of the most exciting prospects of the next few years and may help give shape to a new discipline that helps us to be collectively intelligent about our own collective intelligence.”

Sauermann, Henry and Chiara Franzoni. “Participation Dynamics in Crowd-Based Knowledge Production: The Scope and Sustainability of Interest-Based Motivation.” SSRN Working Papers Series. November 28, 2013. http://bit.ly/1o6YB7f.

  • In this paper, Sauermann and Franzoni explore the issue of interest-based motivation in crowd-based knowledge production – in particular the use of the crowd science platform Zooniverse – by drawing on “research in psychology to discuss important static and dynamic features of interest and deriv[ing] a number of research questions.”
  • The authors find that interest-based motivation is often tied to a “particular object (e.g., task, project, topic)” rather than based on a “general trait of the person or a general characteristic of the object.” As such, they find that “most members of the installed base of users on the platform do not sign up for multiple projects, and most of those who try out a project do not return.”
  • They conclude that “interest can be a powerful motivator of individuals’ contributions to crowd-based knowledge production…However, both the scope and sustainability of this interest appear to be rather limited for the large majority of contributors…At the same time, some individuals show a strong and more enduring interest to participate both within and across projects, and these contributors are ultimately responsible for much of what crowd science projects are able to accomplish.”

Schmitt-Sands, Catherine E. and Richard J. Smith. “Prospects for Online Crowdsourcing of Social Science Research Tasks: A Case Study Using Amazon Mechanical Turk.” SSRN Working Papers Series. January 9, 2014. http://bit.ly/1ugaYja.

  • In this paper, the authors describe an experiment involving the nascent use of Amazon’s Mechanical Turk as a social science research tool. “While researchers have used crowdsourcing to find research subjects or classify texts, [they] used Mechanical Turk to conduct a policy scan of local government websites.”
  • Schmitt-Sands and Smith found that “crowdsourcing worked well for conducting an online policy program and scan.” The microtasked workers were helpful in screening out local governments that either did not have websites or did not have the types of policies and services for which the researchers were looking. However, “if the task is complicated such that it requires ongoing supervision, then crowdsourcing is not the best solution.”

Shirky, Clay. Here Comes Everybody: The Power of Organizing Without Organizations. New York: Penguin Press, 2008. https://bit.ly/2QysNif.

  • In this book, Shirky explores our current era in which, “For the first time in history, the tools for cooperating on a global scale are not solely in the hands of governments or institutions. The spread of the Internet and mobile phones are changing how people come together and get things done.”
  • Discussing Wikipedia’s “spontaneous division of labor,” Shirky argues that “the process is more like creating a coral reef, the sum of millions of individual actions, than creating a car. And the key to creating those individual actions is to hand as much freedom as possible to the average user.”

Silvertown, Jonathan. “A New Dawn for Citizen Science.” Trends in Ecology & Evolution 24, no. 9 (September 2009): 467–471. http://bit.ly/1iha6CR.

  • This article discusses the move from “Science for the people,” a slogan adopted by activists in the 1970s, to “Science by the people,” which is “a more inclusive aim, and is becoming a distinctly 21st century phenomenon.”
  • Silvertown identifies three factors that are responsible for the explosion of activity in citizen science, each of which could be similarly related to the crowdsourcing of skills by governing institutions:
    • “First is the existence of easily available technical tools for disseminating information about products and gathering data from the public.
    • A second factor driving the growth of citizen science is the increasing realisation among professional scientists that the public represent a free source of labour, skills, computational power and even finance.
    • Third, citizen science is likely to benefit from the condition that research funders such as the National Science Foundation in the USA and the Natural Environment Research Council in the UK now impose upon every grantholder to undertake project-related science outreach. This is outreach as a form of public accountability.”

Szkuta, Katarzyna, Roberto Pizzicannella, David Osimo. “Collaborative approaches to public sector innovation: A scoping study.” Telecommunications Policy. 2014. http://bit.ly/1oBg9GY.

  • In this article, the authors explore cases where government collaboratively delivers online public services, with a focus on success factors and “incentives for services providers, citizens as users and public administration.”
  • The authors focus on the following types of collaborative governance projects:
    • Services initiated by government built on government data;
    • Services initiated by government and making use of citizens’ data;
    • Services initiated by civil society built on open government data;
    • Collaborative e-government services; and
    • Services run by civil society and based on citizen data.
  • The cases explored “are all designed in the way that effectively harnesses the citizens’ potential. Services susceptible to collaboration are those that require computing efforts, i.e. many non-complicated tasks (e.g. citizen science projects – Zooniverse) or citizens’ free time in general (e.g. time banks). Those services also profit from unique citizens’ skills and their propensity to share their competencies.”

Selected Readings on Behavioral Economics: Nudges


The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of behavioral economics was originally published in 2014.

The 2008 publication of Richard Thaler and Cass Sunstein’s Nudge ushered in a new era of behavioral economics, and since then, policy makers in the United States and elsewhere have been applying behavioral economics to the field of public policy. Like Smart Disclosure, behavioral economics can be used in the public sector to improve the decisionmaking ability of citizens without relying on regulatory interventions. In the six years since Nudge was published, the United Kingdom has created the Behavioural Insights Team (also known as the Nudge Unit), a cross-ministerial organization that uses behavioral economics to inform public policy, and the White House has recently followed suit by convening a team of behavioral economists to create a behavioral insights-driven team in the United States. Policymakers have been using behavioral insights to design more effective interventions in the fields of long-term unemployment, roadway safety, enrollment in retirement plans and enrollment in organ donation registries, to name some noteworthy examples. The literature of this nascent field provides a look at the growing optimism in the potential of applying behavioral insights in the public sector to improve people’s lives.

Selected Reading List (in alphabetical order)

  • John Beshears, James Choi, David Laibson and Brigitte C. Madrian – The Importance of Default Options for Retirement Savings Outcomes: Evidence from the United States – a paper examining the role default options play in encouraging intelligent retirement savings decisionmaking.
  • Cabinet Office and Behavioural Insights Team, United Kingdom – Applying Behavioural Insights to Health – a paper outlining some examples of behavioral economics being applied to the healthcare landscape using cost-efficient interventions.
  • Matthew Darling, Saugato Datta and Sendhil Mullainathan – The Nature of the BEast: What Behavioral Economics Is Not – a paper discussing why control and behavioral economics are not as closely aligned as some think, reiterating the fact that the field is politically agnostic.
  • Antoinette Schoar and Saugato Datta – The Power of Heuristics – a paper exploring the concept of “heuristics,” or rules of thumb, which can provide helpful guidelines for pushing people toward making “reasonably good” decisions without a full understanding of the complexity of a situation.
  • Richard H. Thaler and Cass R. Sunstein – Nudge: Improving Decisions About Health, Wealth, and Happiness – an influential book describing the many ways in which the principles of behavioral economics can be and have been used to influence choices and behavior through the development of new “choice architectures.” 
  • U.K. Parliament Science and Technology Committee – Behaviour Change – an exploration of the government’s attempts to influence the behaviour of its citizens through nudges, with a focus on comparing the effectiveness of nudges to that of regulatory interventions.

Annotated Selected Reading List (in alphabetical order)

Beshears, John, James Choi, David Laibson and Brigitte C. Madrian. “The Importance of Default Options for Retirement Savings Outcomes: Evidence from the United States.” In Jeffrey R. Brown, Jeffrey B. Liebman and David A. Wise, editors, Social Security Policy in a Changing Environment, Cambridge: National Bureau of Economic Research, 2009. http://bit.ly/LFmC5s.

  • This paper examines the role default options play in pushing people toward making intelligent decisions regarding long-term savings and retirement planning.
  • Importantly, the authors provide evidence that a strategically oriented default setting from the outset is likely not enough to fully nudge people toward the best possible decisions in retirement savings. They find that the default settings in every major dimension of the savings process (from deciding whether to participate in a 401(k) to how to withdraw money at retirement) have real and distinct effects on behavior.

Cabinet Office and Behavioural Insights Team, United Kingdom. “Applying Behavioural Insights to Health.” December 2010. http://bit.ly/1eFP16J.

  • In this report, the United Kingdom’s Behavioural Insights Team does not attempt to “suggest that behaviour change techniques are the silver bullet that can solve every problem.” Rather, they explore a variety of examples where local authorities, charities, government and the private sector are using behavioural interventions to encourage healthier behaviors.
  • The report features case studies regarding behavioral insights’ ability to affect the following public health issues:
    • Smoking
    • Organ donation
    • Teenage pregnancy
    • Alcohol
    • Diet and weight
    • Diabetes
    • Food hygiene
    • Physical activity
    • Social care
  • The report concludes with a call for more experimentation and knowledge gathering to determine when, where and how behavioural interventions can be most effective in helping the public become healthier.

Darling, Matthew, Saugato Datta and Sendhil Mullainathan. “The Nature of the BEast: What Behavioral Economics Is Not.” The Center for Global Development. October 2013. https://bit.ly/2QytRmf.

  • In this paper, Darling, Datta and Mullainathan outline the three most pervasive myths that abound within the literature about behavioral economics:
    • First, they dispel the supposed link between control and behavioral economics. Although tools used within behavioral economics can convince people to make certain choices, the goal is to nudge people to make the choices they want to make. For example, studies find that when retirement savings plans enroll workers by default, so that they must opt out rather than opt in, more workers set up 401(k) plans. This is an example of a nudge that guides people to make a choice that they already intend to make.
    • Second, they reiterate that the field is politically agnostic. Both liberals and conservatives have adopted behavioral economics, and its approach is neither liberal nor conservative. President Obama embraces behavioral economics, but so does the United Kingdom’s Conservative Party.
    • Third, the article highlights that irrationality actually has little to do with behavioral economics. Context is an important consideration in judging which behavior is rational and which is not. Rather than use the term “irrational” to describe human beings, the authors assert that humans are “infinitely complex” and that behavior often considered irrational is entirely situational.

Schoar, Antoinette and Saugato Datta. “The Power of Heuristics.” Ideas42. January 2014. https://bit.ly/2UDC5YK.

  • This paper explores the notion that being presented with a bevy of options can be desirable in many situations, but when making an intelligent decision requires a high-level understanding of the nuances of vastly different financial aid packages, for example, options can overwhelm. Heuristics (rules of thumb) provide helpful guidelines that “enable people to make ‘reasonably good’ decisions without needing to understand all the complex nuances of the situation.”
  • The underlying goal of heuristics in the policy space is to give people the type of “rules of thumb” that enable good decisionmaking regarding complex topics such as finance, healthcare and education. The authors point to the benefit of asking individuals to remember smaller pieces of knowledge by referencing a series of studies conducted by psychologists Beatty and Kahneman that showed people were better able to remember long strings of numbers when they were broken into smaller segments.
  • Schoar and Datta recommend these four rules when implementing heuristics:
    • Use heuristics where possible, particularly in complex situations;
    • Leverage new technology (such as text messages and Internet-based tools) to implement heuristics;
    • Determine where heuristics can be used in adult training programs and replace in-depth training programs with heuristics where possible; and
    • Consider how to apply heuristics in situations where the exception is the rule. The authors point to the example of savings and credit card debt. In most instances, saving a portion of one’s income is a good rule of thumb. However, when one has high credit card debt, paying off debt could be preferable to building one’s savings.

Thaler, Richard H. and Cass R. Sunstein. Nudge: Improving Decisions About Health, Wealth, and Happiness. Yale University Press, 2008. https://bit.ly/2kNXroe.

  • This book, likely the single piece of scholarship most responsible for bringing the concept of nudges into the public consciousness, explores how a strategic “choice architecture” can help people make the best decisions.
  • Thaler and Sunstein, while advocating for the wider and more targeted use of nudges to help improve people’s lives without resorting to overly paternal regulation, look to five common nudges for lessons and inspiration:
    • The design of menus gets you to eat (and spend) more;
    • “Flies” in urinals improve, well, aim;
    • Credit card minimum payments affect repayment schedules;
    • Automatic savings programs increase savings rate; and
    • “Defaults” can improve rates of organ donation.
  • In the simplest terms, the authors propose the wider deployment of choice architectures that follow “the golden rule of libertarian paternalism: offer nudges that are most likely to help and least likely to inflict harm.”

U.K. Parliament Science and Technology Committee. “Behaviour Change.” July 2011. http://bit.ly/1cbYv5j.

  • This report from the U.K.’s Science and Technology Committee explores the government’s attempts to influence the behavior of its citizens through nudges, with a focus on comparing the effectiveness of nudges to that of regulatory interventions.
  • The committee’s central conclusion is that “non-regulatory measures used in isolation, including ‘nudges,’ are less likely to be effective. Effective policies often use a range of interventions.”
  • The report’s other major findings and recommendations are:
    • Government must invest in gathering more evidence about what measures work to influence population behaviour change;
    • The Government should appoint an independent Chief Social Scientist to provide it with robust and independent scientific advice;
    • The Government should take steps to implement a traffic light system of nutritional labelling on all food packaging; and
    • Current voluntary agreements with businesses in relation to public health have major failings. They are not a proportionate response to the scale of the problem of obesity and do not reflect the evidence about what will work to reduce obesity. If effective agreements cannot be reached, or if they show minimal benefit, the Government should pursue regulation.

Selected Readings on Personal Data: Security and Use


The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of personal data was originally published in 2014.

Advances in technology have greatly increased the potential for policymakers to utilize the personal data of large populations for the public good. However, the proliferation of vast stores of useful data has also given rise to a variety of legislative, political, and ethical concerns surrounding the privacy and security of citizens’ personal information, both in terms of collection and usage. Challenges regarding the governance and regulation of personal data must be addressed in order to assuage individuals’ concerns regarding the privacy, security, and use of their personal information.

Annotated Selected Reading List (in alphabetical order)

Cavoukian, Ann. “Personal Data Ecosystem (PDE) – A Privacy by Design Approach to an Individual’s Pursuit of Radical Control.” Privacy by Design, October 15, 2013. https://bit.ly/2S00Yfu.

  • In this paper, Cavoukian describes the Personal Data Ecosystem (PDE), an “emerging landscape of companies and organizations that believe individuals should be in control of their personal data, and make available a growing number of tools and technologies to enable this control.” She argues that, “The right to privacy is highly compatible with the notion of PDE because it enables the individual to have a much greater degree of control – ‘Radical Control’ – over their personal information than is currently possible today.”
  • To ensure that the PDE reaches its privacy-protection potential, Cavoukian argues that it must practice The 7 Foundational Principles of Privacy by Design:
    • Proactive not Reactive; Preventative not Remedial
    • Privacy as the Default Setting
    • Privacy Embedded into Design
    • Full Functionality – Positive-Sum, not Zero-Sum
    • End-to-End Security – Full Lifecycle Protection
    • Visibility and Transparency – Keep it Open
    • Respect for User Privacy – Keep it User-Centric

Kirkham, T., S. Winfield, S. Ravet, and S. Kellomaki. “A Personal Data Store for an Internet of Subjects.” In 2011 International Conference on Information Society (i-Society). 92–97.  http://bit.ly/1alIGuT.

  • This paper examines various factors involved in the governance of personal data online, and argues for a shift from “current service-oriented applications where often the service provider is in control of the person’s data” to a person-centric architecture where the user is at the center of personal data control.
  • The paper delves into an “Internet of Subjects” concept of Personal Data Stores, and focuses on implementation of such a concept on personal data that can be characterized as either “By Me” or “About Me.”
  • The paper also presents examples of how a Personal Data Store model could allow users to both protect and present their personal data to external applications, affording them greater control.

OECD. The 2013 OECD Privacy Guidelines. 2013. http://bit.ly/166TxHy.

  • This report is indicative of the “important role in promoting respect for privacy as a fundamental value and a condition for the free flow of personal data across borders” played by the OECD for decades. The guidelines – revised in 2013 for the first time since being drafted in 1980 – are seen as “[t]he cornerstone of OECD work on privacy.”
  • The OECD framework is built around eight basic principles for personal data privacy and security:
    • Collection Limitation
    • Data Quality
    • Purpose Specification
    • Use Limitation
    • Security Safeguards
    • Openness
    • Individual Participation
    • Accountability

Ohm, Paul. “Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization.” UCLA Law Review 57, 1701 (2010). http://bit.ly/18Q5Mta.

  • This article explores the implications of the “astonishing ease” with which scientists have demonstrated the ability to “reidentify” or “deanonymize” supposedly anonymous personal information.
  • Rather than focusing exclusively on whether personal data is “anonymized,” Ohm offers five factors for governments and other data-handling bodies to use for assessing the risk of privacy harm: data-handling techniques, private versus public release, quantity, motive and trust.

Polonetsky, Jules and Omer Tene. “Privacy in the Age of Big Data: A Time for Big Decisions.” Stanford Law Review Online 64 (February 2, 2012): 63. http://bit.ly/1aeSbtG.

  • In this article, Tene and Polonetsky argue that, “The principles of privacy and data protection must be balanced against additional societal values such as public health, national security and law enforcement, environmental protection, and economic efficiency. A coherent framework would be based on a risk matrix, taking into account the value of different uses of data against the potential risks to individual autonomy and privacy.”
  • To achieve this balance, the authors believe that, “policymakers must address some of the most fundamental concepts of privacy law, including the definition of ‘personally identifiable information,’ the role of consent, and the principles of purpose limitation and data minimization.”

Shilton, Katie, Jeff Burke, Deborah Estrin, Ramesh Govindan, Mark Hansen, Jerry Kang, and Min Mun. “Designing the Personal Data Stream: Enabling Participatory Privacy in Mobile Personal Sensing.” TPRC, 2009. http://bit.ly/18gh8SN.

  • This article argues that the Codes of Fair Information Practice, which have served as a model for data privacy for decades, do not take into account a world of distributed data collection, nor the realities of data mining and easy, almost uncontrolled, dissemination.
  • The authors suggest “expanding the Codes of Fair Information Practice to protect privacy in this new data reality. An adapted understanding of the Codes of Fair Information Practice can promote individuals’ engagement with their own data, and apply not only to governments and corporations, but software developers creating the data collection programs of the 21st century.”
  • In order to achieve this change in approach, the paper discusses three foundational design principles: primacy of participants, data legibility, and engagement of participants throughout the data life cycle.

Selected Readings on Big Data


The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of big data was originally published in 2014.

Big Data refers to the wide-scale collection, aggregation, storage, analysis and use of data. Government is increasingly in control of a massive amount of raw data that, when analyzed and put to use, can lead to new insights on everything from public opinion to environmental concerns. The burgeoning literature on Big Data argues that it generates value by: creating transparency; enabling experimentation to discover needs, expose variability, and improve performance; segmenting populations to customize actions; replacing/supporting human decision making with automated algorithms; and innovating new business models, products and services. The insights drawn from data analysis can also be visualized in a manner that passes along relevant information, even to those without the tech savvy to understand the data on its own terms (see The GovLab Selected Readings on Data Visualization).

Annotated Selected Reading List (in alphabetical order)

Australian Government Information Management Office. The Australian Public Service Big Data Strategy: Improved Understanding through Enhanced Data-analytics Capability Strategy Report. August 2013. http://bit.ly/17hs2xY.

  • This Big Data Strategy, produced for Australian Government senior executives responsible for delivering services and developing policy, aims to impress upon government officials that the key to increasing the value of big data held by government is the effective use of analytics. Essentially, “the value of big data lies in [our] ability to extract insights and make better decisions.”
  • The strategy positions big data as a national asset that can be used to “streamline service delivery, create opportunities for innovation, identify new service and policy approaches as well as supporting the effective delivery of existing programs across a broad range of government operations.”

Bollier, David. The Promise and Peril of Big Data. The Aspen Institute, Communications and Society Program, 2010. http://bit.ly/1a3hBIA.

  • This report captures insights from the 2009 Roundtable exploring uses of Big Data in a number of important consumer behavior and policy contexts.
  • The report concludes that, “Big Data presents many exciting opportunities to improve modern society. There are incalculable opportunities to make scientific research more productive, and to accelerate discovery and innovation. People can use new tools to help improve their health and well-being, and medical care can be made more efficient and effective. Government, too, has a great stake in using large databases to improve the delivery of government services and to monitor for threats to national security.”
  • However, “Big Data also presents many formidable challenges to government and citizens precisely because data technologies are becoming so pervasive, intrusive and difficult to understand. How shall society protect itself against those who would misuse or abuse large databases? What new regulatory systems, private-law innovations or social practices will be capable of controlling anti-social behaviors–and how should we even define what is socially and legally acceptable when the practices enabled by Big Data are so novel and often arcane?”

Boyd, Danah and Kate Crawford. “Six Provocations for Big Data.” A Decade in Internet Time: Symposium on the Dynamics of the Internet and Society. September 2011. http://bit.ly/1jJstmz.

  • In this paper, Boyd and Crawford raise challenges to unchecked assumptions and biases regarding big data. The paper makes a number of assertions about the “computational culture” of big data and pushes back against those who consider big data to be a panacea.
  • The authors’ provocations for big data are:
    • Automating Research Changes the Definition of Knowledge
    • Claims to Objectivity and Accuracy are Misleading
    • Big Data is Not Always Better Data
    • Not All Data is Equivalent
    • Just Because it is Accessible Doesn’t Make it Ethical
    • Limited Access to Big Data Creates New Digital Divides

The Economist Intelligence Unit. Big Data and the Democratisation of Decisions. October 2012. http://bit.ly/17MpH8L.

  • This report from the Economist Intelligence Unit focuses on the positive impact of big data adoption in the private sector, but its insights can also be applied to the use of big data in governance.
  • The report argues that innovation can be spurred by democratizing access to data, allowing a diversity of stakeholders to “tap data, draw lessons and make business decisions,” which in turn helps companies and institutions respond to new trends and intelligence at varying levels of decision-making power.

Manyika, James, Michael Chui, Brad Brown, Jacques Bughin, Richard Dobbs, Charles Roxburgh, and Angela Hung Byers. Big Data: The Next Frontier for Innovation, Competition, and Productivity. McKinsey & Company. May 2011. http://bit.ly/18Q5CSl.

  • This report argues that big data “will become a key basis of competition, underpinning new waves of productivity growth, innovation, and consumer surplus,” and that “leaders in every sector will have to grapple with the implications of big data.”
  • The report offers five broad ways in which using big data can create value:
    • First, big data can unlock significant value by making information transparent and usable at much higher frequency.
    • Second, as organizations create and store more transactional data in digital form, they can collect more accurate and detailed performance information on everything from product inventories to sick days, and therefore expose variability and boost performance.
    • Third, big data allows ever-narrower segmentation of customers and therefore much more precisely tailored products or services.
    • Fourth, sophisticated analytics can substantially improve decision-making.
    • Finally, big data can be used to improve the development of the next generation of products and services.

The Partnership for Public Service and the IBM Center for The Business of Government. “From Data to Decisions II: Building an Analytics Culture.” October 17, 2012. https://bit.ly/2EbBTMg.

  • This report discusses strategies for better leveraging data analysis to aid decision-making. The authors argue that, “Organizations that are successful at launching or expanding analytics program…systematically examine their processes and activities to ensure that everything they do clearly connects to what they set out to achieve, and they use that examination to pinpoint weaknesses or areas for improvement.”
  • While the report features many strategies for government decision-makers, the central recommendation is that, “leaders incorporate analytics as a way of doing business, making data-driven decisions transparent and a fundamental approach to day-to-day management. When an analytics culture is built openly, and the lessons are applied routinely and shared widely, an agency can embed valuable management practices in its DNA, to the mutual benefit of the agency and the public it serves.”

TechAmerica Foundation’s Federal Big Data Commission. “Demystifying Big Data: A Practical Guide to Transforming the Business of Government.” 2013. http://bit.ly/1aalUrs.

  • This report presents key big data imperatives that government agencies must address, the challenges and opportunities posed by the growing volume of data, and the value Big Data can provide. The discussion touches on the value of big data to businesses and organizational missions, and presents case studies of big data applications along with their technical underpinnings and public policy applications.
  • The authors argue that new digital information, “effectively captured, managed and analyzed, has the power to change every industry including cyber security, healthcare, transportation, education, and the sciences.” To ensure that this opportunity is realized, the report proposes a detailed big data strategy framework with the following steps: define, assess, plan, execute and review.

World Economic Forum. “Big Data, Big Impact: New Possibilities for International Development.” 2012. http://bit.ly/17hrTKW.

  • This report examines the potential for channeling the “flood of data created every day by the interactions of billions of people using computers, GPS devices, cell phones, and medical devices” into “actionable information that can be used to identify needs, provide services, and predict and prevent crises for the benefit of low-income populations.”
  • The report argues that, “To realise the mutual benefits of creating an environment for sharing mobile-generated data, all ecosystem actors must commit to active and open participation. Governments can take the lead in setting policy and legal frameworks that protect individuals and require contractors to make their data public. Development organisations can continue supporting governments and demonstrating both the public good and the business value that data philanthropy can deliver. And the private sector can move faster to create mechanisms for the sharing data that can benefit the public.”

Selected Readings on Data Visualization


The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of data visualization was originally published in 2013.

Data visualization is a response to the ever-increasing amount of information in the world. With big data, informatics and predictive analytics, we have an unprecedented opportunity to revolutionize policy-making. Yet data by itself can be overwhelming. New tools and techniques for visualizing information can help policymakers clearly articulate insights drawn from data. Moreover, the rise of open data is enabling those outside of government to create informative and visually arresting representations of public information that can be used to support decision-making by those inside or outside governing institutions.

Annotated Selected Reading List (in alphabetical order)

Duke, D.J., K.W. Brodlie, D.A. Duce and I. Herman. “Do You See What I Mean? [Data Visualization].” IEEE Computer Graphics and Applications 25, no. 3 (2005): 6–9. http://bit.ly/1aeU6yA.

  • In this paper, the authors argue for a more systematic ontology for data visualization to ensure the successful communication of meaning. “Visualization begins when someone has data that they wish to explore and interpret; the data are encoded as input to a visualization system, which may in its turn interact with other systems to produce a representation. This is communicated back to the user(s), who have to assess this against their goals and knowledge, possibly leading to further cycles of activity. Each phase of this process involves communication between two parties. For this to succeed, those parties must share a common language with an agreed meaning.”
  • The authors “believe that now is the right time to consider an ontology for visualization,” and argue that “as visualization move[s] from just a private enterprise involving data and tools owned by a research team into a public activity using shared data repositories, computational grids, and distributed collaboration…[m]eaning becomes a shared responsibility and resource. Through the Semantic Web, there is both the means and motivation to develop a shared picture of what we see when we turn and look within our own field.”

Friendly, Michael. “A Brief History of Data Visualization.” In Handbook of Data Visualization, 15–56. Springer Handbooks of Computational Statistics. Springer Berlin Heidelberg, 2008. http://bit.ly/17fM1e9.

  • In this paper, Friendly explores the “deep roots” of modern data visualization. “These roots reach into the histories of the earliest map making and visual depiction, and later into thematic cartography, statistics and statistical graphics, medicine and other fields. Along the way, developments in technologies (printing, reproduction), mathematical theory and practice, and empirical observation and recording enabled the wider use of graphics and new advances in form and content.”
  • Just as the general visualization of data is far from a new practice, Friendly shows that the graphical representation of government information has a similarly long history. “The collection, organization and dissemination of official government statistics on population, trade and commerce, social, moral and political issues became widespread in most of the countries of Europe from about 1825 to 1870. Reports containing data graphics were published with some regularity in France, Germany, Hungary and Finland, and with tabular displays in Sweden, Holland, Italy and elsewhere.”

Graves, Alvaro and James Hendler. “Visualization Tools for Open Government Data.” In Proceedings of the 14th Annual International Conference on Digital Government Research, 136–145. Dg.o ’13. New York, NY, USA: ACM, 2013. http://bit.ly/1eNSoXQ.

  • In this paper, the authors argue that, “there is a gap between current Open Data initiatives and an important part of the stakeholders of the Open Government Data Ecosystem.” As it stands, “there is an important portion of the population who could benefit from the use of OGD but who cannot do so because they cannot perform the essential operations needed to collect, process, merge, and make sense of the data. The reasons behind these problems are multiple, the most critical one being a fundamental lack of expertise and technical knowledge. We propose the use of visualizations to alleviate this situation. Visualizations provide a simple mechanism to understand and communicate large amounts of data.”
  • The authors also describe a prototype of a tool to create visualizations based on OGD with the following capabilities:
    • Facilitate visualization creation
    • Exploratory mechanisms
    • Viralization and sharing
    • Repurpose of visualizations

Hidalgo, César A. “Graphical Statistical Methods for the Representation of the Human Development Index and Its Components.” United Nations Development Programme Human Development Reports, September 2010. http://bit.ly/166TKur.

  • In this paper for the United Nations Development Programme, Hidalgo argues that “graphical statistical methods could be used to help communicate complex data and concepts through universal cognitive channels that are heretofore underused in the development literature.”
  • To support his argument, Hidalgo provides representations that “show how graphical methods can be used to (i) compare changes in the level of development experienced by countries (ii) make it easier to understand how these changes are tied to each one of the components of the Human Development Index (iii) understand the evolution of the distribution of countries according to HDI and its components and (iv) teach and create awareness about human development by using iconographic representations that can be used to graphically narrate the story of countries and regions.”

Stowers, Genie. “The Use of Data Visualization in Government.” IBM Center for The Business of Government, Using Technology Series, 2013. http://bit.ly/1aame9K.

  • This report seeks “to help public sector managers understand one of the more important areas of data analysis today — data visualization. Data visualizations are more sophisticated, fuller graphic designs than the traditional spreadsheet charts, usually with more than two variables and, typically, incorporating interactive features.”
  • Stowers also offers numerous examples of “visualizations that include geographical and health data, or population and time data, or financial data represented in both absolute and relative terms — and each communicates more than simply the data that underpin it. In addition to these many examples of visualizations, the report discusses the history of this technique, and describes tools that can be used to create visualizations from many different kinds of data sets.”