Selected Readings on Data Portability


By Juliet McMurren, Andrew Young, and Stefaan G. Verhulst

As part of an ongoing effort to build a knowledge base for the field of improving governance through technology, The GovLab publishes a series of Selected Readings, which provide an annotated and curated collection of recommended works on themes such as open data, data collaboration, and civic technology.

In this edition, we explore selected literature on data portability.

To suggest additional readings on this or any other topic, please email info@thelivinglib.org. All our Selected Readings can be found here.

Context

Data today exists largely in silos, generating problems and inefficiencies for individuals, businesses, and society at large. These include:

  • difficulty moving data between competing service providers;
  • delays in sharing data for important societal research initiatives;
  • barriers for data innovators to reuse data that could generate insights to inform individuals’ decision making; and
  • obstacles to scaling data donation.

Data portability — the principle that individuals have a right to obtain, copy, and reuse their own personal data and to transfer it from one IT platform or service to another for their own purposes — is positioned as a solution to these problems. When fully implemented, it would make data liquid, giving individuals the ability to access their own data in a usable and transferable format, transfer it from one service provider to another, or donate data for research and enhanced data analysis by those working in the public interest.

Some companies, including Google, Apple, Twitter and Facebook, have sought to advance data portability through initiatives like the Data Transfer Project, an open source software project designed to facilitate data transfers between services. Newly enacted data protection legislation such as Europe’s General Data Protection Regulation (2018) and the California Consumer Privacy Act (2018) gives data subjects a right to data portability. However, despite these legal and technical advances, many questions remain about how to scale up data liquidity and portability responsibly and systematically. These new data rights have generated complex and as yet unanswered questions about the limits of data ownership; the implications for privacy, security and intellectual property rights; and the practicalities of how, when, and to whom data can be transferred.

In this edition of the GovLab’s Selected Readings series, we examine the emerging literature on data portability to provide a foundation for future work on the value proposition of data portability. Readings are listed in alphabetical order.

Selected readings

Cho, Daegon, Pedro Ferreira, and Rahul Telang, The Impact of Mobile Number Portability on Price and Consumer Welfare (2016)

  • In this paper, the authors analyze how Mobile Number Portability (MNP) — the ability for consumers to maintain their phone number when changing providers, thus reducing switching costs — affected the relationship between switching costs, market price and consumer surplus after it was introduced in most European countries in the early 2000s.
  • Theory holds that when switching costs are high, market leaders will enjoy a substantial advantage and are able to keep prices high. Policy makers will therefore attempt to decrease switching costs to intensify competition and reduce prices to consumers.
  • The study reviewed quarterly data from 47 wireless service providers in 15 EU countries between 1999 and 2006. The data showed that MNP simultaneously decreased market price by over four percent and increased consumer welfare by an average of at least €2.15 per person per quarter. This increase amounted to a total of €880 million per quarter across the 15 EU countries analyzed in this paper and accounted for 15 percent of the increase in consumer surplus observed over this time.
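
As a quick sanity check, the per-person and aggregate figures above imply a covered population of roughly 409 million, consistent with the EU-15 in the mid-2000s. The short sketch below simply re-derives that implied figure from the paper’s own numbers; the population figure is inferred, not stated in the paper.

```python
# Back-of-envelope check of the MNP welfare figures: dividing the total
# quarterly surplus by the per-person lower bound yields the implied
# number of covered consumers.
surplus_per_person = 2.15          # EUR per person per quarter (lower bound)
total_surplus = 880_000_000        # EUR per quarter, across 15 EU countries

implied_consumers = total_surplus / surplus_per_person
print(f"Implied covered consumers: {implied_consumers / 1e6:.0f} million")
# Roughly 409 million, in line with the EU-15 population at the time.
```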

CtrlShift, Data Mobility: The data portability growth opportunity for the UK economy (2018)

  • Commissioned by the UK Department for Digital, Culture, Media and Sport (DCMS), this study was intended to identify the potential of personal data portability for the UK economy.
  • Its scope went beyond the legal right to data portability envisaged by the GDPR, to encompass the current state of personal data portability and mobility, requirements for safe and secure data sharing, and the potential economic benefits through stimulation of innovation, productivity and competition.
  • The report concludes that increased personal data mobility has the potential to be a vital stimulus for the development of the digital economy, driving growth by empowering individuals to make use of their own data and consent to others using it to create new data-driven services and technologies.
  • However, the report concludes that there are significant challenges to be overcome, and new risks to be addressed, before the value of personal data can be realized. Much personal data remains locked in organizational silos, and systemic issues related to data security and governance and the uneven sharing of benefits need to be resolved.

DataGuidance and Future of Privacy Forum, Comparing Privacy Laws: GDPR v. CCPA (2018)

  • This paper compares the provisions of the GDPR with those of the California Consumer Privacy Act (2018).
  • Both article 20 of the GDPR and section 1798 of the CCPA recognize a right to data portability. Both also confer on data subjects the right to receive data from controllers free of charge upon request, and oblige controllers to create mechanisms to provide subjects with their data in portable and reusable form so that it can be transmitted to third parties for reuse.
  • In the CCPA, the right to data portability is an extension of the right to access: data subjects may only request data collected within the past 12 months and have it delivered to them. The GDPR imposes no time limit and allows data to be transferred from one data controller to another, but limits the right to personal data that the data subject has provided, where processing is carried out by automated means on the basis of consent or contract.
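
The difference in scope between the two rights can be sketched with a toy filter. The record fields, dates, and function names below are invented for illustration and are not drawn from either statute:

```python
from datetime import datetime, timedelta

# Illustrative records: each notes when it was collected and whether the
# data subject provided it directly (vs. derived by the controller).
records = [
    {"field": "email", "collected": datetime(2017, 6, 1), "provided_by_subject": True},
    {"field": "purchase_history", "collected": datetime(2018, 11, 2), "provided_by_subject": True},
    {"field": "inferred_interests", "collected": datetime(2018, 11, 2), "provided_by_subject": False},
]

def ccpa_portable(records, request_date):
    """CCPA-style scope: data collected in the 12 months before the request."""
    cutoff = request_date - timedelta(days=365)
    return [r for r in records if r["collected"] >= cutoff]

def gdpr_portable(records):
    """GDPR-style scope: no time limit, but only data the subject provided."""
    return [r for r in records if r["provided_by_subject"]]

request = datetime(2019, 1, 1)
print([r["field"] for r in ccpa_portable(records, request)])  # drops the old email
print([r["field"] for r in gdpr_portable(records)])           # drops the derived data
```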

Data Transfer Project, Data Transfer Project Overview and Fundamentals (2018)

  • The paper presents an overview of the goals, principles, architecture, and system components of the Data Transfer Project. The intent of the DTP is to increase the number of services offering data portability and provide users with the ability to transfer data directly in and out of participating providers through systems that are easy and intuitive to use, private and secure, reciprocal between services, and focused on user data. The project, which is supported by Microsoft, Google, Twitter and Facebook, is an open-source initiative that encourages the participation of other providers to reduce the infrastructure burden on providers and users.
  • In addition to benefits to innovation, competition, and user choice, the authors point to benefits to security, through allowing users to backup, organize, or archive their data, recover from account hijacking, and retrieve their data from deprecated services.
  • The DTP’s remit was to test concepts and feasibility for the transfer of specific types of user data between online services using a system of adapters to transfer proprietary formats into canonical formats that can be used to transfer data while allowing providers to maintain control over the security of their service. While not resolving all formatting or support issues, this approach would allow substantial data portability and encourage ecosystem sustainability.
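
The adapter-and-canonical-format architecture described above can be sketched roughly as follows. The class and field names are invented for illustration and do not reflect the DTP’s actual (Java-based) interfaces:

```python
from dataclasses import dataclass

# Canonical format: a provider-neutral representation that adapters
# translate to and from, so any exporter can pair with any importer.
@dataclass
class CanonicalPhoto:
    title: str
    url: str
    album: str

class ProviderAExporter:
    """Adapter: translates Provider A's proprietary records into canonical form."""
    def export(self, raw):
        return [CanonicalPhoto(title=r["name"], url=r["link"], album=r["folder"])
                for r in raw]

class ProviderBImporter:
    """Adapter: translates canonical records into Provider B's proprietary form."""
    def import_(self, photos):
        return [{"caption": p.title, "source": p.url, "collection": p.album}
                for p in photos]

# A transfer is export -> canonical -> import; neither provider needs to
# understand the other's format, only the shared canonical one.
raw_a = [{"name": "Sunset", "link": "https://example.org/1.jpg", "folder": "2018"}]
imported = ProviderBImporter().import_(ProviderAExporter().export(raw_a))
print(imported)
```

Adding a new provider then only requires writing one exporter and one importer against the canonical model, rather than a pairwise adapter for every other service.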

Deloitte, How to Flourish in an Uncertain Future: Open Banking (2017)

  • This report addresses the innovative and disruptive potential of open banking, in which data is shared between members of the banking ecosystem at the authorization of the customer, with the potential to increase competition and facilitate new products and services. In the resulting marketplace model, customers could use a single banking interface to access products from multiple players, from established banks to newcomers and fintechs.
  • The report’s authors identify significant threats to current banking models. Banks that failed to embrace open banking could be relegated to a secondary role as an infrastructure provider, while third parties — tech companies, fintech, and price comparison websites — take over the customer relationship.
  • The report identifies four overlapping operating models banks could adopt within an open banking model: full service providers, delivering proprietary products through their own interface with little or no third-party integration; utilities, which provide other players with infrastructure without customer-facing services; suppliers, which offer proprietary products through third-party interfaces; and interfaces, which provide distribution services through a marketplace interface. To retain market share, incumbents are likely to need to adopt a combination of these roles, offering their own products and services and those of third parties through their own and others’ interfaces.

Digital Competition Expert Panel, Unlocking Digital Competition (2019)

  • This report captures the findings of the UK Digital Competition Expert Panel, which was tasked in 2018 with considering opportunities and challenges the digital economy might pose for competition and competition policy and to recommend any necessary changes. The panel focused on the impact of big players within the sector, appropriate responses to mergers or anticompetitive practices, and the impact on consumers.
  • The panel found that the digital economy is creating many benefits, but that digital markets are subject to tipping, in which emerging winners can scoop much of the market. This concentration can give rise to substantial costs, especially to consumers, and cannot be solved by competition alone. However, government policy and regulatory solutions have limitations, including the slowness of policy change, uneven enforcement and profound informational asymmetries between companies and government.
  • The panel proposed the creation of a digital markets unit that would be tasked with developing a code of competitive conduct, enabling greater personal data mobility and systems designed with open standards, and advancing access to non-personal data to reduce barriers to market entry.
  • The panel’s model of data mobility goes beyond data portability, which involves consumers being able to request and transfer their own data from one provider to another. Instead, the panel recommended empowering consumers to instigate transfers of data between a business and a third party in order to access price information, compare goods and services, or access tailored advice and recommendations. They point to open banking as an example of how this could function in practice.
  • It also proposed updating merger policy to make it more forward-looking to better protect consumers and innovation and preserve the competitiveness of the market. It recommended the creation of antitrust policy that would enable the implementation of interim measures to limit damage to competition while antitrust cases are in process.

Egan, Erin, Charting a Way Forward: Data Portability and Privacy (2019)

  • This white paper by Facebook’s VP and Chief Privacy Officer, Policy, represents an attempt to advance the conversation about the relationship between data portability, privacy, and data protection. The author sets out five key questions about data portability: what it is, what data should be portable, whose data should be portable, how privacy should be protected in the context of portability, and where responsibility for data misuse or improper protection should lie.
  • The paper finds that definitions of data portability still remain imprecise, particularly with regard to the distinction between data portability and data transfer. In the interest of feasibility and a reasonable operational burden on providers, it proposes time limits on providers’ obligations to make observed data portable.
  • The paper concludes that there are strong arguments both for and against allowing users to port their social graph — the map of connections between that user and other users of the service — but that the key determinant should be a capacity to ensure the privacy of all users involved. Best-practice data portability protocols that would resolve current differences of approach as to what, how and by whom information should be made available would help promote broader portability, as would resolution of liability for misuse or data exposure.

Engels, Barbara, Data portability among online platforms (2016)

  • The article examines the effects on competition and innovation of data portability among online platforms such as search engines, online marketplaces, and social media, and how relations between users, data, and platform services change in an environment of data portability.
  • The paper finds that the benefits to competition and innovation of portability are greatest in two kinds of environments: first, where platforms offer complementary products and can realize synergistic benefits by sharing data; and second, where platforms offer substitute or rival products but the risk of anti-competitive behaviour is high, as for search engines.
  • It identifies privacy and security issues raised by data portability. Portability could, for example, allow an identity fraudster to misuse personal data across multiple platforms, compounding the harm they cause.
  • It also suggests that standards for data interoperability could act to reduce innovation in data technology, encouraging data controllers to continue to use outdated technologies in order to comply with inflexible, government-mandated standards.

Graef, Inge, Martin Husovec and Nadezhda Purtova, Data Portability and Data Control: Lessons for an Emerging Concept in EU Law (2018)

  • This paper situates the data portability right conferred by the GDPR within rights-based data protection law. The authors argue that the right to data portability should be seen as a new regulatory tool aimed at stimulating competition and innovation in data-driven markets.
  • The authors note the potential for conflicts between the right to data portability and the intellectual property rights of data holders, suggesting that the framers underestimated the potential impact of such conflicts on copyright, trade secrets and sui generis database law.
  • Given that the right to data portability is being replicated within consumer protection law and the regulation of non-personal data, the authors argue framers of these laws should consider the potential for conflict and the impact of such conflict on incentives to innovate.

Mohsen, Mona Omar and Hassan A. Aziz, The Blue Button Project: Engaging Patients in Healthcare by a Click of a Button (2015)

  • This paper provides a literature review on the Blue Button initiative, an early data portability project which allows Americans to access, view or download their health records in a variety of formats.
  • Originally launched through the Department of Veterans Affairs in 2010, the Blue Button initiative had expanded to more than 500 organizations by 2014, when the Department of Health and Human Services launched the Blue Button Connector to facilitate both patient access and development of new tools.
  • The Blue Button has enabled the development of tools such as the Harvard-developed Growth-Tastic app, which allows parents to check their child’s growth by submitting their downloaded pediatric health data. Pharmacies across the US have also adopted the Blue Button to provide patients with access to their prescription history.

More than Data and Mission: Smart, Got Data? The Value of Energy Data to Customers (2016)

  • This report outlines the public value of the Green Button, a data protocol that provides customers with private and secure access to their energy use data collected by smart meters.
  • The authors outline how the use of the Green Button can help states meet their energy and climate goals by enabling them to structure renewables and other distributed energy resources (DER) such as energy efficiency, demand response, and solar photovoltaics. Access to granular, near real time data can encourage innovation among DER providers, facilitating the development of applications like “virtual energy audits” that identify efficiency opportunities, allowing customers to reduce costs through time-of-use pricing, and enabling the optimization of photovoltaic systems to meet peak demand.
  • Energy efficiency receives the greatest boost from initiatives like the Green Button, with studies showing energy savings of up to 18 percent when customers have access to their meter data. In addition to improving energy conservation, access to meter data could improve the efficiency of appliances by allowing devices to trigger sleep modes in response to data on usage or price. However, at the time of writing, problems with data portability and interoperability were preventing these benefits from being realized, at a cost of tens of millions of dollars.
  • The authors recommend that commissions require utilities to make usage data available to customers or authorized third parties in standardized formats as part of basic utility service, and tariff data to developers for use in smart appliances.

MyData, Understanding Data Operators (2020)

  • MyData is a global movement of data users, activists and developers with a common goal to empower individuals with their personal data to enable them and their communities to develop knowledge, make informed decisions and interact more consciously and efficiently.
  • This introductory paper presents the state of knowledge about data operators, trusted data intermediaries that provide infrastructure for human-centric personal data management and governance, including data sharing and transfer. The operator model allows data controllers to outsource issues of legal compliance with data portability requirements, while offering individual users a transparent and intuitive way to manage the data transfer process.
  • The paper examines use cases from 48 “proto-operators” from 15 countries who fulfil some of the functions of an operator, albeit at an early level of maturity. The paper finds that operators offer management of identity authentication, data transaction permissions, connections between services, value exchange, data model management, personal data transfer and storage, governance support, and logging and accountability. At the heart of these functions is the need for minimum standards of data interoperability.
  • The paper reviews governance frameworks from the general (legislative) to the specific (operators), and explores proto-operator business models. In keeping with an emerging field, business models are currently unclear and potentially unsustainable, and one of a number of areas, including interoperability requirements and governance frameworks, that must still be developed.

National Science and Technology Council, Smart Disclosure and Consumer Decision Making: Report of the Task Force on Smart Disclosure (2013)

  • This report summarizes the work and findings of the 2011–2013 Task Force on Smart Disclosure: Information and Efficiency, an interagency body tasked with advancing smart disclosure, through which data is made more available and accessible to both consumers and innovators.
  • The Task Force recognized the capacity of smart disclosure to inform consumer choices, empower them through access to useful personal data, enable the creation of new tools, products and services, and promote efficiency and growth. It reviewed federal efforts to promote smart disclosure within sectors and in data types that crosscut sectors, such as location data, consumer feedback, enforcement and compliance data and unique identifiers. It also surveyed specific public-private partnerships on access to data, such as the Blue and Green Button and MyData initiatives in health, energy and education respectively.
  • The Task Force reviewed steps taken by the Federal Government to implement smart disclosure, including adoption of machine readable formats and standards for metadata, use of APIs, and making data available in an unstructured format rather than not releasing it at all. It also reviewed “choice engines” making use of the data to provide services to consumers across a range of sectors.
  • The Task Force recommended that smart disclosure should be a core component of efforts to institutionalize and operationalize open data practices, with agencies proactively identifying, tagging, and planning the release of candidate data. It also recommended that this be supported by a government-wide community of practice.

Nicholas, Gabriel, Taking It With You: Platform Barriers to Entry and the Limits of Data Portability (2020)

  • This paper considers whether, as is often claimed, data portability offers a genuine solution to the lack of competition within the tech sector.
  • It concludes that current regulatory approaches to data portability, which focus on reducing switching costs through technical solutions such as one-off exports and API interoperability, are not sufficient to generate increased competition. This is because they fail to address other barriers to entry, including network effects, unique data access, and economies of scale.
  • The author proposes an alternative approach, which he terms collective portability, which would allow groups of users to coordinate the transfer of their data to a new platform. This model raises questions about how such collectives would make decisions regarding portability, but would enable new entrants to successfully target specific user groups and scale rapidly without having to reach users one by one.

OECD, Enhancing Access to and Sharing of Data: Reconciling Risks and Benefits for Data Re-use across Societies (2019)

  • This background paper to a 2017 expert workshop on risks and benefits of data reuse considers data portability as one strategy within a data openness continuum that also includes open data, market-based B2B contractual agreements, and restricted data-sharing agreements within research and data for social good applications.
  • It considers four rationales offered for data portability. These include empowering individuals towards the “informational self-determination” to which the GDPR aspires; increasing competition within digital and other markets through reductions in information asymmetries between individuals and providers, switching costs, and barriers to market entry; and facilitating increased data flows.
  • The report highlights the need for both syntactic and semantic interoperability standards to ensure data can be reused across systems, both of which may be fostered by increased rights to data portability. Data intermediaries have an important role to play in the development of these standards, through initiatives like the Data Transfer Project, a collaboration which brought together Facebook, Google, Microsoft, and Twitter to create an open-source data portability platform.

Personal Data Protection Commission Singapore, Response to Feedback on the Public Consultation on Proposed Data Portability and Data Innovation Provisions (2020)

  • The report summarizes the findings of the 2019 PDPC public consultation on proposals to introduce provisions on data portability and data innovation in Singapore’s Personal Data Protection Act.
  • The proposed provision would oblige organizations to transmit an individual’s data to another organization in a commonly used machine-readable format, upon the individual’s request. The obligation does not extend to data intermediaries or organizations that do not have a presence in Singapore, although data holders may choose to honor those requests.
  • The obligation would apply to electronic data that is either provided by the individual or generated by the individual’s activities in using the organization’s service or product, but not derived data created by the processing of other data by the data holder. Respondents were concerned that including derived data could harm organizations’ competitiveness.
  • Respondents were concerned about how to honour data portability requests where the data of third parties was involved, as in the case of a joint account holder, for example. The PDPC opted for a “balanced, reasonable, and pragmatic approach,” allowing data involving third parties to be ported where it was under the requesting individual’s control, was to be used for domestic and personal purposes, and related only to the organization’s product or service.

Quinn, Paul, Is the GDPR and Its Right to Data Portability a Major Enabler of Citizen Science? (2018)

  • This article explores the potential of data portability to advance citizen science by enabling participants to port their personal data from one research project to another. Citizen science — the collection and contribution of large amounts of data by private individuals for scientific research — has grown rapidly in response to the development of new digital means to capture, store, organize, analyze and share data.
  • The GDPR right to data portability aids citizen science by requiring transfer of data in machine-readable format and allowing data subjects to request its transfer to another data controller. This requirement of interoperability does not amount to compatibility, however, and data thus transferred would probably still require cleaning to be usable, acting as a disincentive to reuse.
  • The GDPR’s limitation of transferability to personal data provided by the data subject excludes some forms of data that might possess significant scientific potential, such as secondary personal data derived from further processing or analysis.
  • The GDPR right to data portability also potentially limits citizen science by restricting the grounds for processing data to which the right applies to data obtained through a subject’s express consent or through the performance of a contract. This limitation excludes other forms of data processing described in the GDPR, such as data processing for preventive or occupational medicine, scientific research, or archiving for reasons of public or scientific interest. It is also not clear whether the GDPR compels data controllers to transfer data outside the European Union.

Wong, Janis and Tristan Henderson, How Portable is Portable? Exercising the GDPR’s Right to Data Portability (2018)

  • This paper presents the results of 230 real-world requests for data portability in order to assess how — and how well — the GDPR right to data portability is being implemented. The authors were interested in establishing the kinds of file formats that were returned in response to requests, and in identifying practical difficulties encountered in making and interpreting requests, over a three-month period beginning on the day the GDPR came into effect.
  • The findings revealed continuing problems around ensuring portability for both data controllers and data subjects. Of the 230 requests, only 163 were successfully completed.
  • Data controllers frequently had difficulty understanding the requirements of the GDPR, providing data in incomplete or inappropriate formats: only 40 percent of the files supplied were in a fully compliant format. Additionally, some data controllers confused the right to data portability with other rights conferred by the GDPR, such as the rights to access or erasure.
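
A format audit of this kind could be scripted along the following lines. The classification of extensions as compliant is an illustrative assumption for the sketch, not the coding scheme the authors used:

```python
# Classify files returned by portability requests as machine-readable
# (structured, reusable) or not, and compute the share in compliant formats.
# The extension list is an illustrative assumption, not the study's scheme.
MACHINE_READABLE = {".json", ".csv", ".xml"}

def compliance_rate(filenames):
    compliant = sum(
        1 for f in filenames
        if any(f.endswith(ext) for ext in MACHINE_READABLE)
    )
    return compliant / len(filenames)

# Hypothetical sample of files returned by five controllers.
returned = ["export.json", "history.csv", "statement.pdf", "profile.docx", "data.xml"]
print(f"{compliance_rate(returned):.0%} of files in a compliant format")
```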

The GovLab Selected Readings on Open Data for Developing Economies


By Andrew Young, Stefaan Verhulst, and Juliet McMurren

This edition of the GovLab Selected Readings was developed as part of the Open Data for Developing Economies research project (in collaboration with WebFoundation, USAID and fhi360). Special thanks to Maurice McNaughton, Francois van Schalkwyk, Fernando Perini, Michael Canares and David Opoku for their input on an early draft. Please contact Stefaan Verhulst (stefaan@thegovlab.org) for any additional input or suggestions.

Open data is increasingly seen as a tool for economic and social development. Across sectors and regions, policymakers, NGOs, researchers and practitioners are exploring the potential of open data to improve government effectiveness, create new economic opportunity, empower citizens and solve public problems in developing economies. Open data for development does not exist in a vacuum – rather it is a phenomenon that is relevant to and studied from different vantage points including Data4Development (D4D), Open Government, the United Nations’ Sustainable Development Goals (SDGs), and Open Development. The below selected readings provide a view of the current research and practice on the use of open data for development and its relationship to related interventions.

Annotated Selected Readings List (in alphabetical order)

Open Data and Open Government for Development

Benjamin, Solomon, R. Bhuvaneswari, P. Rajan, and Manjunatha, “Bhoomi: ‘E-Governance’, or, An Anti-Politics Machine Necessary to Globalize Bangalore?” CASUM-m Working Paper, January 2007, http://bit.ly/2aD3vZe

  • This paper explores the digitization of land titles and their effect on governance in Bangalore. The paper takes a critical view of digitization and transparency efforts, particularly as best practices that should be replicated in many contexts.
  • The authors point to the potential of centralized open data and land records databases as a means for further entrenching existing power structures. They found that the digitization of land records in Bangalore “led to increased corruption, much more bribes and substantially increased time taken for land transactions,” as well as allowing “very large players in the land markets to capture vast quantities of land when Bangalore experiences a boom in the land market.”
  • They argue for the need “to replace politically neutered concepts like ‘transparency’, ‘efficiency’, ‘governance’, and ‘best practice’ with conceptually more rigorous terms that reflect the uneven terrain of power and control that governance embodies.”

McGee, Rosie and Duncan Edwards, “Introduction: Opening Governance – Change, Continuity and Conceptual Ambiguity,” IDS Bulletin, January 24, 2016. http://bit.ly/2aJn1pq.  

  • This introduction to a special issue of the IDS Bulletin frames the research and practice of leveraging opening governance as part of a development agenda.
  • The piece primarily focuses on a number of “critical debates” that “have begun to lay bare how imprecise and overblown the expectations are in the transparency, accountability and openness ‘buzzfield’, and the problems this poses.”
  • A key finding on opening governance’s uptake and impact in the development space relates to political buy-in:
    • “Political will is generally a necessary but insufficient condition for governance processes and relationships to become more open, and is certainly a necessary but insufficient condition for tech-based approaches to open them up. In short, where there is a will, tech-for-T&A may be able to provide a way; where there isn’t a will, it won’t.”

Open Data and Data4Development

3rd International Open Data Conference (IODC), “Enabling the Data Revolution: An International Open Data Roadmap,” Conference Report, 2015, http://bit.ly/2asb2ei

  • This report, prepared by Open Data for Development, summarizes the proceedings of the third IODC in Ottawa, ON. It sets out an action plan for “harnessing open data for sustainable development”, with the following five priorities:
    1. Deliver shared principles for open data
    2. Develop and adopt good practices and open standards for data publication
    3. Build capacity to produce and use open data effectively
    4. Strengthen open data innovation networks
    5. Adopt common measurement and evaluation tools
  • The report draws on 70 impact accounts to present cross-sector evidence of “the promise and reality of open data,” and emphasizes the utility of open data in monitoring development goals, and the importance of “joined-up open data infrastructures,” ensuring wide accessibility, and grounding measurement in a clear understanding of citizen need, in order to realize the greatest benefits from open data.
  • Finally, the report sets out a draft International Open Data Charter and Action Plan for International Collaboration.

Hilbert, Martin, “Big Data for Development: A Review of Promises and Challenges,” Development Policy Review, December 13, 2015, http://bit.ly/2aoPtxL.

  • This article presents a conceptual framework based on the analysis of 180 articles on the opportunities and threats of big data for international development.
  • Open data, Hilbert argues, can be an incentive for those outside of government to leverage big data analytics: “If data from the public sector were to be openly available, around a quarter of existing data resources could be liberated for Big Data Analytics.”
  • Hilbert explores the misalignment between “the level of economic well-being and perceived transparency of a country” and the existence of an overarching open data policy. He points to low-income countries that are active in the open data effort, like Kenya, Russia and Brazil, in comparison to “other countries with traditionally high perceived transparency,” which are less active in releasing data, like Chile, Belgium and Sweden.

International Development Research Centre, World Wide Web Foundation, and Berkman Center at Harvard University, “Fostering a Critical Development Perspective on Open Government Data,” Workshop Report, 2012, http://bit.ly/2aJpyQq

  • This paper considers the need for a critical perspective on whether the expectations raised by open data programmes worldwide — as “a suitable remedy for challenges of good governance, economic growth, social inclusion, innovation, and participation” — have been met, and if so, under what circumstances.
  • Given the lack of empirical evidence underlying the implementation of open data initiatives to guide practice and policy formulation, particularly in developing countries, the paper discusses the implementation of a policy-oriented research agenda to ensure open data initiatives in the Global South “challenge democratic deficits, create economic value and foster inclusion.”
  • The report considers theories of the relationship between open data and impact, and the mediating factors affecting whether that impact is achieved. It takes a broad view of impact, including both demand- and supply-side economic impacts, social and environmental impact, and political impact.

Open Data for Development, “Open Data for Development: Building an Inclusive Data Revolution,” Annual Report, 2015, http://bit.ly/2aGbkz5

  • This report — the inaugural annual report for the Open Data for Development program — gives an overview of outcomes from the program for each of OD4D’s five program objectives:
  1. Setting a global open data for sustainable development agenda;
  2. Supporting governments in their open data initiatives;
  3. Scaling data solutions for sustainable development;
  4. Monitoring the availability, use and impact of open data around the world; and
  5. Building the institutional capacity and long-term sustainability of the Open Data for Development network.
  • The report identifies four barriers to impact in developing countries: the lack of capacity and leadership; the lack of evidence of what works; the lack of coordination between actors; and the lack of quality data.

Stuart, Elizabeth, Emma Samman, William Avis, Tom Berliner, “The Data Revolution: Finding the Missing Millions,” Open Data Institute Research Report, April 2015, http://bit.ly/2acnZtE.

  • This report examines the challenge of implementing successful development initiatives when many citizens are not known to their governments as they do not exist in official databases.
  • The authors argue that “good quality, relevant, accessible and timely data will allow willing governments to extend services into communities which until now have been blank spaces in planning processes, and to implement policies more efficiently.”
  • In addition to improvements to national statistical offices, the authors argue that “making better use of the data we already have” by increasing openness to certain datasets held by governments and international organizations could help to improve the situation.
  • They examine a number of open data efforts in developing countries, including Kenya and Mexico.
  • Finally, they argue that “the data revolution could play a role in changing the power dynamic between citizens, governments and the private sector, building on open data and freedom of information movements around the world. It has the potential to enable people to produce, access and understand information about their lives and to use this information to make changes.”

United Nations Independent Expert Advisory Group on a Data Revolution for Sustainable Development. “A World That Counts, Mobilizing the Data Revolution,” 2014, http://bit.ly/2am5K28.

  • This report focuses on the potential benefits and risks data holds for sustainable development. Included in this is a strategic framework for using and managing data for humanitarian purposes. It describes a need for a multinational consensus to be developed to ensure data is shared effectively and efficiently.
  • It suggests that “people who are counted”—i.e., those who are included in data collection processes—have better development outcomes and a better chance for humanitarian response in emergency or conflict situations.
  • In particular, “better and more open data” is described as having the potential to “save money and create economic, social and environmental value” toward sustainable development ends.

The World Bank, “Digital Dividends: World Development Report 2016,” http://bit.ly/2aG9Kx5

  • This report examines “digital dividends,” or the development benefits of using digital technologies.
  • The authors argue that: “To get the most out of the digital revolution, countries also need to work on the “analog complements”—by strengthening regulations that ensure competition among businesses, by adapting workers’ skills to the demands of the new economy, and by ensuring that institutions are accountable.”
  • The “data revolution,” which includes both big data and open data, is listed as one of four “digital enablers.”
  • Open data’s impacts are explored across a number of cases and developing countries and regions, including: Nepal, Mexico, Southern Africa, Kenya, Moldova and the Philippines.
  • Despite a number of success stories, the authors argue that: “sustained, impactful, scaled-up examples of big and open data in the developing world are still relatively rare,” and, in particular, “Open data has far to go.” They point to the high correlation of open data readiness, implementation and impact with GDP per capita as evidence of the room for improvement.

Open Data and Open Development

Reilly, Katherine and Juan P. Alperin, “Intermediation in Open Development: A Knowledge Stewardship Approach,” Global Media Journal (Canadian Edition), 2016, http://bit.ly/2atWyI8

  • This paper examines the intermediaries that “have emerged to facilitate open data and related knowledge production activities in development processes.”
  • In particular, the authors study the concept of “knowledge stewardship,” which “demands careful consideration of how—through what arrangements—open resources can best be provided, and how best to maximize the quality, sustainability, buy-in, and uptake of those resources.”
  • The authors describe five models of open data intermediation:
    • Decentralized
    • Arterial
    • Ecosystem
    • Bridging
    • Communities of practice

Reilly, Katherine and Rob McMahon, “Quality of openness: Evaluating the contributions of IDRC’s Information and Networks Program to open development.” International Development Research Centre, January 2015, http://bit.ly/2aD6h0U

  • This report describes the outcomes of IDRC’s Information and Networks (I&N) programme, focusing in particular on the “quality of openness” of its initiatives as well as their outcomes.
  • The research program explores “mechanisms that link open initiatives to human activities in ways that generate social innovations of significance to development. These include push factors such as data holders’ understanding of data usage, the preparedness or acceptance of user communities, institutional policies, and wider policies and regulations; as well as pull factors including the awareness, capacity and attitude of users. In other words, openly networked social processes rely on not just quality openness, but also on supportive environments that link open resources and the people who might leverage them to create improvements, whether in governance, education or knowledge production.”

Smith, M. and L. Elder, “Open ICT Ecosystems Transforming the Developing World,” Information Technologies and International Development, 2010, http://bit.ly/2au0qsW.

  • The paper seeks to examine the hypothesis that “open social arrangements, enabled by ICTs, can help to catalyze the development impacts of ICTs. In other words, open ICT ecosystems provide the space for the amplification and transformation of social activities that can be powerful drivers of development.”
  • While the focus is placed on a number of ICT interventions – with open data only directly referenced as it relates to the science community – the lessons learned and overarching framework are applicable to the open data for development space.
  • The authors argue for a new research focus on “the new social activities enabled by different configurations of ICT ecosystems and their connections with particular social outcomes.” They point in particular to “modules of social practices that can be applied to solve similar problems across different development domains,” including “massive participation, collaborative production of content, collaborative innovation, collective information validation, new ‘open’ organizational models, and standards and knowledge transfer.”

Smith, Matthew and Katherine M. A. Reilly (eds), “Open Development: Networked Innovations in International Development,” MIT Press, 2013, http://bit.ly/2atX2hu.

  • This edited volume considers the implications of the emergence of open networked models predicated on digital network technologies for development. In their introduction, the editors emphasize that openness is a means to support development, not an end, which is layered upon existing technological and social structures. While openness is often disruptive, it depends upon some measure of closedness and structure in order to function effectively.
  • Subsequent, separately authored chapters provide case studies of open development drawn from health, biotechnology, and education, and explore some of the political and structural barriers faced by open models.  

van den Broek, Tijs, Marijn Rijken, Sander van Oort, “Towards Open Development Data: A review of open development data from a NGO perspective,” 2012, http://bit.ly/2ap5E8a

  • In this paper, the authors seek to answer the question: “What is the status, potential and required next steps of open development data from the perspective of the NGOs?”
  • They argue that “the take-up of open development data by NGOs has shown limited progress in the last few years,” and offer “several steps to be taken before implementation” to increase the effectiveness of open data’s use by NGOs to improve development efforts:
    • Develop a vision on open development and open data
    • Develop a clear business case
    • Research the benefits and risks of open development data and raise organizational and political awareness and support
    • Develop an appealing business model for data intermediaries and end-users
    • Balance data quality and timeliness
    • Deal with data obesity
    • Enrich quantitative data to overcome a quantitative bias
    • Monitor implementation and share best practices.

Open Data and Development Goals

Berdou, Evangelia, “Mediating Voices and Communicating Realities: Using Information Crowdsourcing Tools, Open Data Initiatives and Digital Media to Support and Protect the Vulnerable and Marginalised,” Institute of Development Studies, 2011, http://bit.ly/2aqbycg.

  • This report examines how “open source information crowdsourcing platforms like Ushahidi, and open mapping and data initiatives like OpenStreetMap, are enabling citizens in developing countries to generate and disseminate information critical for their lives and livelihoods.”
  • The author focuses in particular on:
    • “the role of the open source social entrepreneur as a new development actor
    • the complexity of the architectures of participation supported by these platforms and the need to consider them in relation to the decision-making processes that they aim to support and the roles in which they cast citizens
    • the possibilities for cross-fertilisation of ideas and the development of new practices between development practitioners and technology actors committed to working with communities to improve lives and livelihoods.”
  • While the use of ICTs and open data poses numerous potential benefits for supporting and protecting the vulnerable and marginalised, the author calls for greater attention to:
    • challenges emerging from efforts to sustain participation and govern the new information commons in under-resourced and politically contested spaces
    • complications and risks emerging from the desire to share information freely in such contexts
    • gaps between information provision, transparency and accountability, and the slow materialisation of projects’ wider social benefits

Canares, Michael, Satyarupa Shekhar, “Open Data and Sub-national Governments: Lessons from Developing Countries,” 2015, http://bit.ly/2au2gu2

  • This synthesis paper seeks to gain a greater understanding of open data’s effects on local contexts – “where data is collected and stored, where there is strong feasibility that data will be published, and where data can generate the most use and impact” – through the examination of nine papers developed as part of the Open Data in Developing Countries research project.
  • The authors point to three central findings:
    • “There is substantial effort on the part of sub-national governments to proactively disclose data, however, the design delimits citizen participation, and eventually, use.”
    • “Context demands different roles for intermediaries and different types of initiatives to create an enabling environment for open data.”
    • “Data quality will remain a critical challenge for sub-national governments in developing countries and it will temper potential impact that open data will be able to generate.”

Davies, Tim, “Open Data in Developing Countries – Emerging Insights from Phase I,” ODDC, 2014, http://bit.ly/2aX55UW

  • This report synthesizes findings from the Exploring the Emerging Impacts of Open Data in Developing Countries (ODDC) research network and its study of open data initiatives in 13 countries.
  • Davies provides 15 initial insights across the supply, mediation, and use of open data, including:
    • Open data initiatives can create new spaces for civil society to pursue government accountability and effectiveness;
    • Intermediaries are vital to both the supply and the use of open data; and
    • Digital divides create data divides in both the supply and use of data.

Davies, Tim, Duncan Edwards, “Emerging Implications of Open and Linked Data for Knowledge Sharing in Development,” IDS Bulletin, 2012, http://bit.ly/2aLKFyI

  • This article explores “issues that development sector knowledge intermediaries may need to engage with to ensure the socio-technical innovations of open and linked data work in the interests of greater diversity and better development practice.”
  • The authors explore a number of case studies where open and linked data was used in a development context, including:
    • Open research: IDS and R4D meta-data
    • Open aid: International Aid Transparency Initiative
    • Open linked statistics: Young Lives
  • Based on lessons learned from these cases, the authors argue that “openness must serve the interests of marginalised and poor people. This is pertinent at three levels:
    • practices in the publication and communication of data
    • capacities for, and approaches to, the use of data
    • development and emergent structuring of open data ecosystems.”

Davies, Tim, Fernando Perini, and Jose Alonso, “Researching the Emerging Impacts of Open Data,” ODDC, 2013, http://bit.ly/2aqb6uP

  • This research report offers a conceptual framework for open data, with a particular focus on open data in developing countries.
  • The conceptual framework comprises three central elements:
    • Open Data
      • About government
      • About companies & markets
      • About citizens
    • Domains of governance
      • Political domains
      • Economic domains
      • Social domains
    • Emerging Outcomes
      • Transparency & accountability
      • Innovation & economic growth
      • Inclusion & empowerment
  • The authors describe three central theories of change related to open data’s impacts:
    • Open data will bring about greater transparency in government, which in turn brings about greater accountability of key actors to make decisions and apply rules in the public interest;
    • Open data will enable non-state innovators to improve public services or build innovative products and services with social and economic value; open data will shift certain decision making from the state into the market, making it more efficient;
    • Open data will remove power imbalances that resulted from asymmetric information, and will bring new stakeholders into policy debates, giving marginalised groups a greater say in the creation and application of rules and policy.

Montano, Elise and Diogo Silva, “Exploring the Emerging Impacts of Open Data in Developing Countries (ODDC): ODDC1 Follow-up Outcome Evaluation Report,” ODDC, 2016, http://bit.ly/2au65z7.

  • This report summarizes the findings of a two and a half year research-driven project sponsored by the World Wide Web Foundation to explore how open data improves governance in developing countries, and to build capacity in these countries to engage with open data. The research was conducted through 17 subgrants to partners from 12 countries.
  • Upon evaluation in 2014, partners reported increased capacity and expertise in dealing with open data; empowerment in influencing local and regional open data trends, particularly among CSOs; and increased understanding of open data among policy makers with whom the partners were in contact.

Smith, Fiona, William Gerry, Emma Truswell, “Supporting Sustainable Development with Open Data,” Open Data Institute, 2015, http://bit.ly/2aJwxsF

  • This report describes the potential benefits, challenges and next steps for leveraging open data to advance the Sustainable Development Goals.
  • The authors argue that the greatest potential impacts of open data on development are:
    • More effectively target aid money and improve development programmes
    • Track development progress and prevent corruption
    • Contribute to innovation, job creation and economic growth.
  • They note, however, that many challenges to such impact exist, including:
    • A weak enabling environment for open data publishing
    • Poor data quality
    • A mismatch between the demand for open data and the supply of appropriate datasets
    • A ‘digital divide’ between rich and poor, affecting both the supply and use of data
    • A general lack of quantifiable data and metrics.
  • The report articulates a number of ways that “governments, donors and (international) NGOs – with the support of researchers, civil society and industry – can apply open data to help make the SDGs a reality:
    • Reach global consensus around principles and standards, namely being ‘open by default’, using the Open Government Partnership’s Open Data Working Group as a global forum for discussion.
    • Embed open data into funding agreements, ensuring that relevant, high-quality data is collected to report against the SDGs. Funders should mandate that data relating to performance of services, and data produced as a result of funded activity, be released as open data.
    • Build a global partnership for sustainable open data, so that groups across the public and private sectors can work together to build sustainable supply and demand for data in the developing world.”

The World Bank, “Open Data for Sustainable Development,” Policy Note, August 2015, http://bit.ly/2aGjaJ4

  • This report from the World Bank seeks to describe open data’s potential for achieving the Sustainable Development Goals, and makes a number of recommendations toward that end.
  • The authors describe four key benefits of open data use for developing countries:
    • Foster economic growth and job creation
    • Improve efficiency, effectiveness and coverage of public services
    • Increase transparency, accountability, and citizen participation
    • Facilitate better information sharing within government
  • The paper concludes with a number of recommendations for improving open data programs, including:
    • Support Open Data use through legal and licensing frameworks.
    • Make data available for free online.
    • Publish data inventories for the government’s data resources.
    • Create feedback channels to government from current and potential data users.
    • Prioritize the datasets that users want.

Open Data and Developing Countries (National Case Studies)

Beghin, Nathalie and Carmela Zigoni, “Measuring Open Data’s Impact on Brazilian National and Sub-National Budget Transparency Websites and Its Impacts on People’s Rights,” 2014, http://bit.ly/2au3LaQ.

  • This report examines the impact of a Brazilian law requiring government entities to “provide real-time information on their budgets and spending through electronic means.” The authors explore “whether the national and state capitals are in fact using principles and practices of open data in their disclosures, and has evaluated the emerging impacts of open budget data disclosed through the national transparency portal.”
  • The report leveraged a “quantitative survey of budget and financial disclosures, and qualitative research with key stakeholders” to explore the “role of technical platforms and intermediaries in supporting the use of budget data by groups working in pursuit of social change and human rights.”
  • The survey found that:
    • The information provided is complete
    • In general, the data are not primary
    • Most governments do not provide timely information
    • Access to information is not ensured to all individuals
    • Advances were observed in terms of the availability of machine-processable data
    • Access is free, without discriminating users
    • The minority presents data in non-proprietary format
    • It is not known whether the data are under license

Boyera, S., C. Iglesias, “Open Data in Developing Countries: State of the Art,” Partnership for Open Data, 2014, http://bit.ly/2acBMR7

  • This report provides a summary of the State-of-the-Art study developed by SBC4D for the Partnership for Open Data (POD).
  • A series of interviews and responses to an online questionnaire yielded a number of findings, including:
    • “The number of actors interested in Open Data in Developing Countries is growing quickly. The study has identified 160+ organizations. It is important to note that a majority of them are just engaging in the domain and have little past experience. Most of these actors are focused on OD as an objective not a tool or means to increase impact or outcome.
    • Local actors are strong advocates of public data release. Lots of them are also promoting the re-use of existing data (through e.g. the organization of training, hackathons and alike). However, the study has not identified many actors practically using OD in their work or engaged in releasing their own data.
    • Traditional development sectors (health, education, agriculture, energy, transport) are not yet the target of many initiatives, and are clearly underdeveloped in terms of use-cases.
    • There is very little connection between horizontal (e.g. national OD initiatives) and vertical (sector-specific initiatives on e.g. extractive industry, or disaster management) activities”

Canares, M.P., J. de Guia, M. Narca, J. Arawiran, “Opening the Gates: Will Open Data Initiatives Make Local Governments in the Philippines More Transparent?” Open LGU Research Project, 2014, http://bit.ly/2au3Ond

  • This paper seeks to determine the impacts of the Department of Interior and Local Government of the Philippines’ Full Disclosure Policy, affecting financial and procurement data, on both data providers and data users.
  • The paper uncovered two key findings:
    • “On the supply side, incentivising openness is a critical aspect in ensuring that local governments have the interest to disclose financial data. While at this stage, local governments are still on compliance behaviour, it encourages the once reluctant LGUs to disclose financial information in the use of public funds, especially when technology and institutional arrangements are in place. However, LGUs do not make an effort to inform the public that information is available online and has not made data accessible in such a way that it can allow the public to perform computations and analysis. Currently, no data standards have been made yet by the Philippine national government in terms of format and level of detail.”
    • “On the demand side, there is limited awareness on the part of the public, and more particularly the intermediaries (e.g. business groups, civil society organizations, research institutions), on the availability of data, and thus, its limited use. As most of these data are financial in nature, it requires a certain degree of competence and expertise so that they will be able to make use of the data in demanding from government better services and accountability.”
  • The authors argue that “openness is not just about governments putting meaningful government data out into the public domain, but also about making the public meaningfully engage with governments through the use of open government data.” In order to do that, policies should “require observance of open government data standards and a capacity building process of ensuring that the public, to whom the data is intended, are aware and able to use the data in ensuring more transparent and accountable governance.”

Canares, M., M. Narca, and D. Marcial, “Enhancing Citizen Engagement Through Open Government Data,” ODDC, 2015, http://bit.ly/2aJMhfS

  • This research paper seeks to gain a greater understanding of how civil society organizations can increase or initiate their use of open data. The study is based on research conducted in “two provinces in the Philippines where civil society organizations in Negros Oriental province were trained, and in the Bohol province were mentored on accessing and using open data.”
  • The authors seek to answer three central research questions:
    • What do CSOs know about open government data? What do they know about government data that their local governments are publishing on the web?
    • What do CSOs have in terms of skills that would enable them to engage meaningfully with open government data?
    • How best can capacity building be delivered to civil society organizations to ensure that they learn to access and use open government data to improve governance?
  • They provide a number of key lessons, including:
    • Baseline condition should inform capacity building approach
    • Data use is dependent on data supply
    • Open data requires accessible and stable internet connection
    • Open data skills are important but insufficient
    • Outcomes, and not just outputs, prove capacity improvements

Chattapadhyay, Sumandro, “Opening Government Data through Mediation: Exploring the Roles, Practices and Strategies of Data Intermediary Organisations in India,” ODDC, 2014, http://bit.ly/2au3F37

  • This report seeks to gain a greater understanding of the current practice following the Government of India’s 2012 National Data Sharing and Accessibility Policy.
  • Chattapadhyay examines the open government data practices of “various (non-governmental) ‘data intermediary organisations’ on the one hand, and implementation challenges faced by managers of the Open Government Data Platform of India on the other.”
  • The report’s objectives are:
    • To undertake a provisional mapping of government data related activities across different sectors to understand the nature of the “open data community” in India,
    • To enrich government data/information policy discussion in India by gathering evidence and experience of (non­governmental) data intermediaries regarding their actual practices of accessing and sharing government data, and their utilisation of the provisions of NDSAP and RTI act, and
    • To critically reflect on the nature of open data practices in India.

Chiliswa, Zacharia, “Open Government Data for Effective Public Participation: Findings of a Case Study Research Investigating The Kenya’s Open Data Initiative in Urban Slums and Rural Settlements,” ODDC, April 2014, http://bit.ly/2au8E4s

  • This research report is the product of a study of two urban slums and a rural settlement in Nairobi, Mombasa and Isiolo County, respectively, aimed at gaining a better understanding of the awareness and use of Kenya’s open data.
  • The study had four organizing objectives:
    • “Investigate the impact of the Kenyan Government’s open data initiative and to see whether, and if so how, it is assisting marginalized communities and groups in accessing key social services and information such as health and education;
    • Understand the way people use the information provided by the Open Data Initiative;
    • Identify people’s trust in the information and how it can assist their day-to-day lives;
    • Examine ways in which the public wish for the open data initiative to improve, particularly in relation to governance and service delivery.”
  • The study uncovered four central findings about Kenya’s open data initiative:
    • “There is a mismatch between the data citizens want to have and the data the Kenya portal and other intermediaries have provided.
    • Most people go to local information intermediaries instead of going directly to the government data portals and that there are few connections between these intermediaries and the wider open data sources.
    • Currently the rural communities are much less likely to seek out government information.
    • The kinds of data needed to support service delivery in Kenya may be different from those needed in other places in the world.”

Lwanga-Ntale, Charles, Beatrice Mugambe, Bernard Sabiti, Peace Nganwa, “Understanding how open data could impact resource allocation for poverty eradication in Kenya and Uganda,” ODDC, 2014, http://bit.ly/2aHqYKi

  • This paper draws on case studies from Uganda and Kenya to explore an open data movement seeking to address “age-old” issues including “transparency, accountability, equity, and the relevance, effectiveness and efficiency of governance.”
  • The authors focus both on the role “emerging open data processes in the two countries may be playing in promoting citizen/public engagement and the allocation of resources,” and the “possible negative impacts that may emerge due to the ‘digital divide’ between those who have access to data (and technology) and those who do not.”
  • They offer a number of recommendations to the governments of Uganda and Kenya that could be more broadly applicable, including:
    • Promote sector and cross sector specific initiatives that enable collaboration and transparency through different e-transformation strategies across government sectors and agencies.
    • Develop and champion the capacity to drive transformation across government and to advance skills in its institutions and civil service.

Sapkota, Krishna, “Exploring the emerging impacts of open aid data and budget data in Nepal,” Freedom Forum, August 2014, http://bit.ly/2ap0z5G

  • This research report seeks to answer five key questions regarding the opening of aid and budget data in Nepal:
    • What is the context for open aid and budget data in Nepal?
    • What sorts of budget and aid information is being made available in Nepal?
    • What is the governance of open aid and budget data in Nepal?
    • How are relevant stakeholders making use of open aid and budget data in Nepal?
    • What are the emerging impacts of open aid and budget data in Nepal?
  • The study uncovered a number of findings, including:
    • “Information and data can play an important role in addressing key social issues, and that whilst some aid and budget data is increasingly available, including in open data formats, there is not yet a sustainable supply of open data direct from official sources that meet the needs of the different stakeholders we consulted.”
    • “Expectations amongst government, civil society, media and private sector actors that open data could be a useful resource in improving governance, and we found some evidence of media making use of data to drive stories more when they had the right skills, incentives and support.”
    • “The context of Nepal also highlights that a more critical perspective may be needed on the introduction of open data, understanding the specific opportunities and challenges for open data supply and use in a country that is currently undergoing a period of constitutional development, institution building and deepening democracy.”

Srivastava, Nidhi, Veena Agarwal, Anmol Soni, Souvik Bhattacharjya, Bibhu P. Nayak, Harsha Meenawat, Tarun Gopalakrishnan, “Open government data for regulation of energy resources in India,” ODDC, 2014, http://bit.ly/2au9oXf

  • This research paper examines “the availability, accessibility and use of open data in the extractive energy industries sector in India.”
  • The authors describe a number of challenges being faced by:
    • Data suppliers and intermediaries:
      • Lack of clarity on mandate
      • Agency specific issues
      • Resource challenges
      • Privacy issues of commercial data and contractual constraints
      • Formats for data collection
      • Challenges in providing timely data
      • Recovery of costs and pricing of data
    • Data users
      • Data available but inaccessible
      • Data accessible but not usable
      • Timeliness of data
  • They make a number of recommendations for addressing these challenges focusing on:
    • Policy measures
    • Improving data quality
    • Improving effectiveness of data portal

van Schalkwyk, François, Michael Caňares, Sumandro Chattapadhyay and Alexander Andrason, “Open Data Intermediaries in Developing Countries,” ODDC, 2015, http://bit.ly/2aJztWi

  • This paper seeks to provide “a more socially nuanced approach to open data intermediaries,” moving beyond the traditional approach wherein data intermediaries are “presented as single and simple linkages between open data supply and use.”
  • The study’s analysis draws on cases from the Emerging Impacts of Open Data in Developing Countries (ODDC) project.
  • The authors provide a working definition of open data intermediaries: An open data intermediary is an agent:
    • positioned at some point in a data supply chain that incorporates an open dataset,
    • positioned between two agents in the supply chain, and
    • facilitates the use of open data that may otherwise not have been the case.
  • One of the study’s key findings is that “Intermediation does not only consist of a single agent facilitating the flow of data in an open data supply chain; multiple intermediaries may operate in an open data supply chain, and the presence of multiple intermediaries may increase the probability of use (and impact) because no single intermediary is likely to possess all the types of capital required to unlock the full value of the transaction between the provider and the user in each of the fields in play.”

van Schalkwyk, François, Michelle Willmers and Tobias Schonwetter, “Embedding Open Data Practice,” ODDC, 2015, http://bit.ly/2aHt5xu

  • This research paper was developed as part of the ODDC Phase 2 project and seeks to address the “insufficient attention paid to the institutional dynamics within governments and how these may be impeding open data practice.”
  • The study focuses in particular on open data initiatives in South Africa and Kenya, leveraging a conceptual framework to allow for meaningful comparison between the two countries.
  • Focusing on South Africa and Kenya, as well as Africa as a whole, the authors seek to address four central research questions:
    • Is open data practice being embedded in African governments?
    • What are the possible indicators of open data practice being embedded?
    • What do the indicators reveal about resistance to or compliance with pressures to adopt open data practice?
    • What are the different effects of the multiple institutional domains that may be at play in government as an organisation?

van Schalkwyk, Francois, Michelle Willmers, and Laura Czerniewicz, “Case Study: Open Data in the Governance of South African Higher Education,” ODDC, 2014, http://bit.ly/2amgIFb

  • This research report uses the South African Centre for Higher Education Transformation (CHET) open data platform as a case study to examine “the supply of and demand for open data as well as the roles of intermediaries in the South African higher education governance ecosystem.”
  • The report’s findings include:
    • “There are concerns at both government and university levels about how data will be used and (mis)interpreted, and this may constrain future data supply. Education both at the level of supply (DHET) and at the level of use by the media in particular on how to improve the interpretability of data could go some way in countering current levels of mistrust. Similar initiatives may be necessary to address uneven levels of data use and trust apparent across university executives and councils.”
    • “Open data intermediaries increase the accessibility and utility of data. While there is a rich publicly-funded dataset on South African higher education, the data remains largely inaccessible and unusable to universities and researchers in higher education studies. Despite these constraints, the findings show that intermediaries in the ecosystem are playing a valuable role in making the data both available and useable.”
    • “Open data intermediaries provide both supply-side as well as demand-side value. CHET’s work on higher education performance indicators was intended not only to contribute to government’s steering mechanisms, but also to contribute to the governance capacity of South African universities. The findings support the use of CHET’s open data to build capacity within universities. Further research is required to confirm the use of CHET data in state-steering of the South African higher education system, although there is some evidence of CHET’s data being referenced in national policy documents.”

Verhulst, Stefaan and Andrew Young, “Open Data Impact: When Demand and Supply Meet,” The GovLab, 2016, http://bit.ly/1LHkQPO

Additional Resource:

World Bank Readiness Assessment Tool

Selected Readings on AI for Development


By Dominik Baumann, Jeremy Pesner, Alexandra Shaw, Stefaan Verhulst, Michelle Winowatan, Andrew Young, Andrew J. Zahuranec

As part of an ongoing effort to build a knowledge base for the field of improving governance through technology, The GovLab publishes a series of Selected Readings, which provide an annotated and curated collection of recommended works on themes such as open data, data collaboration, and civic technology. 

In this edition, we explore selected literature on AI and Development. This piece was developed in the context of The GovLab’s collaboration with Agence Française de Développement (AFD) on the use of emerging technology for development. To suggest additional readings on this or any other topic, please email info@thelivinglib.org. All our Selected Readings can be found here.

Context: In recent years, public discourse on artificial intelligence (AI) has focused on its potential for improving the way businesses, governments, and societies make (automated) decisions. Simultaneously, several AI initiatives have raised concerns about human rights, including the possibility of discrimination and privacy breaches. Between these two opposing perspectives is a discussion on how stakeholders can maximize the benefits of AI for society while minimizing the risks that might arise from the use of this technology.

While the majority of AI initiatives today come from the private sector, international development actors increasingly experiment with AI-enabled programs. These initiatives focus on, for example, climate modelling, urban mobility, and disease transmission. These early efforts demonstrate the promise of AI for supporting more efficient, targeted, and impactful development efforts. Yet, the intersection of AI and development remains nascent, and questions remain regarding how this emerging technology can deliver on its promise while mitigating risks to intended beneficiaries.

Readings are listed in alphabetical order.

2030Vision. AI and the Sustainable Development Goals: the State of Play

  • In broad language, this document from 2030Vision assesses AI research and initiatives against the Sustainable Development Goals (SDGs) to identify gaps and potential applications that could be further explored or scaled.
  • It specifically reviews the current applications of AI in two SDG sectors, food/agriculture and healthcare.
  • The paper recommends enhancing multi-sector collaboration among businesses, governments, civil society, academia and others to ensure technology can best address the world’s most pressing challenges.

Andersen, Lindsey. Artificial Intelligence in International Development: Avoiding Ethical Pitfalls. Journal of Public & International Affairs (2019). 

  • Investigating the ethical implications of AI in the international development sector, the author argues that the involvement of many different stakeholders and AI-technology providers results in ethical issues concerning fairness and inclusion, transparency, explainability and accountability, data limitations, and privacy and security.
  • The author recommends the information communication technology for development (ICT4D) community adopt the Principles for Digital Development to ensure the ethical implementation of AI in international development projects.
  • The Principles for Digital Development include: 1) design with the user; 2) understand the ecosystem; 3) design for scale; 4) build for sustainability; 5) be data driven; 6) use open standards, open data, open source, and open innovation; and 7) reuse and improve.

Arun, Chinmayi. AI and the Global South: Designing for Other Worlds in Markus D. Dubber, Frank Pasquale, and Sunit Das (eds.), The Oxford Handbook of Ethics of AI, Oxford University Press, Forthcoming (2019).

  • This chapter interrogates the impact of AI’s application in the Global South and raises concerns about such initiatives.
  • Arun argues AI’s deployment in the Global South may result in discrimination, bias, oppression, exclusion, and bad design. She further argues it can be especially harmful to vulnerable communities in places that do not have strong respect for human rights.
  • The paper concludes by outlining the international human rights laws that can mitigate these risks. It stresses the importance of a human rights-centric, inclusive, empowering context-driven approach in the use of AI in the Global South.

Best, Michael. Artificial Intelligence (AI) for Development Series: Module on AI, Ethics and Society. International Telecommunications Union (2018). 

  • This working paper is intended to help ICT policymakers or regulators consider the ethical challenges that emerge within AI applications.
  • The author identifies a four-pronged framework of analysis (risks, rewards, connections, and key questions to consider) that can guide policymaking in the fields of: 1) livelihood and work; 2) diversity, non-discrimination and freedoms from bias; 3) data privacy and minimization; and 4) peace and security.
  • The paper also includes a table of policies and initiatives undertaken by national governments and tech companies around AI, along with the set of values (mentioned above) explicitly considered.

International Development Innovation Alliance (2019). Artificial Intelligence and International Development: An Introduction

  • Results for Development, a nonprofit organization working in the international development sector, developed a report in collaboration with the AI and Development Working Group within the International Development Innovation Alliance (IDIA). The report provides a brief overview of AI and how this technology may impact the international development sector.
  • The report provides examples of AI-powered applications and initiatives that support the SDGs, including eradicating hunger, promoting gender equality, and encouraging climate action.
  • It also provides a collection of supporting resources and case studies for development practitioners interested in using AI.

Paul, Amy, Craig Jolley, and Aubra Anthony. Reflecting the Past, Shaping the Future: Making AI Work for International Development. United States Agency for International Development (2018). 

  • This report outlines the potential of machine learning (ML) and artificial intelligence in supporting development strategy. It also details some of the common risks that can arise from the use of these technologies.
  • The document contains examples of ML and AI applications to support the development sector and recommends good practices in handling such technologies. 
  • It concludes by recommending broad, shared governance, the use of fair and balanced data, and the continued involvement of local populations and development practitioners.

Pincet, Arnaud, Shu Okabe, and Martin Pawelczyk. Linking Aid to the Sustainable Development Goals – a machine learning approach. OECD Development Co-operation Working Papers (2019). 

  • The authors apply ML and semantic analysis to data sourced from the OECD’s Creditor Reporting System to map aid funding to particular SDGs.
  • The researchers find “Good Health and Well-Being” to be the most targeted SDG, which they call the “SDG darling.”
  • The authors find that mapping relationships between the system and SDGs can help to ensure equitable funding across different goals.

Quinn, John, Vanessa Frias-Martinez, and Lakshminarayan Subramanian. Computational Sustainability and Artificial Intelligence in the Developing World. Association for the Advancement of Artificial Intelligence (2014). 

  • These researchers suggest three different areas—health, food security, and transportation—in which AI applications can uniquely benefit the developing world. They argue that the lack of technological infrastructure in these regions makes AI especially useful and valuable, as it can efficiently analyze data and provide solutions.
  • It provides some examples of application within the three themes, including disease surveillance, identification of drought and agricultural trends, modeling of commuting patterns, and traffic congestion monitoring.

Smith, Matthew and Sujaya Neupane. Artificial intelligence and human development: toward a research agenda (2018).

  • The authors highlight potential beneficial applications for AI in a development context, including healthcare, agriculture, governance, education, and economic productivity.
  • They also discuss the risks and downsides of AI, which include the “black boxing” of algorithms, bias in decision making, potential for extreme surveillance, undermining democracy, potential for job and tax revenue loss, vulnerability to cybercrime, and unequal wealth gains towards the already-rich.
  • They recommend further research projects on these topics that are interdisciplinary, locally conducted, and designed to support practice and policy.

Tomašev, Nenad, et al. AI for social good: unlocking the opportunity for positive impact. Nature Communications (2020).

  • This paper takes stock of what the authors term the AI for Social Good movement (AI4SG), which “aims to establish interdisciplinary partnerships centred around AI applications towards SDGs.”  
  • Developed at a multidisciplinary expert seminar on the topic, the authors present 10 recommendations for creating successful AI4SG collaborations: “1) Expectations of what is possible with AI need to be well grounded. 2) There is value in simple solutions. 3) Applications of AI need to be inclusive and accessible, and reviewed at every stage for ethics and human rights compliance. 4) Goals and use cases should be clear and well-defined. 5) Deep, long-term partnerships are required to solve large problems successfully. 6) Planning needs to align incentives, and factor in the limitations of both communities. 7) Establishing and maintaining trust is key to overcoming organisational barriers. 8) Options for reducing the development cost of AI solutions should be explored. 9) Improving data readiness is key. 10) Data must be processed securely, with utmost respect for human rights and privacy.”

Vinuesa, Ricardo, et al. The role of artificial intelligence in achieving the Sustainable Development Goals. Nature Communications (2020).

  • This report analyzes how AI can meet the demands of some SDGs while also inhibiting progress toward others. It highlights a critical research gap regarding the extent to which AI impacts sustainable development in the medium and long term.
  • Through their analysis, the authors find that AI has the potential to positively impact the environment, society, and the economy, but that it can also hinder progress in each of these areas.
  • They recognize that although AI enables efficiency and productivity, it can also increase inequality and hinder achievement of the 2030 Agenda. Vinuesa and his co-authors suggest that adequate policy and regulation are needed to ensure the fast and equitable development of AI technologies that can address the SDGs.

United Nations Educational, Scientific and Cultural Organization (UNESCO) (2019). Artificial intelligence for Sustainable Development: Synthesis Report, Mobile Learning Week 2019

  • In this report, UNESCO assesses the findings from Mobile Learning Week (MLW) 2019. The three main conclusions were: 1) the world is facing a learning crisis; 2) education drives sustainable development; and 3) sustainable development can only be achieved if we harness the potential of AI. 
  • Questions around four major themes dominated the MLW 2019 sessions: 1) how to guarantee inclusive and equitable use of AI in education; 2) how to harness AI to improve learning; 3) how to increase skills development; and 4) how to ensure transparent and auditable use of education data. 
  • To move forward, UNESCO advocates for more international cooperation and stakeholder involvement, creation of education and AI standards, and development of national policies to address educational gaps and risks. 

Selected Readings on Data Responsibility, Refugees and Migration


By Kezia Paladina, Alexandra Shaw, Michelle Winowatan, Stefaan Verhulst, and Andrew Young

The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of Data Collaboration for Migration was originally published in 2018.

Special thanks to Paul Currion, whose data responsibility literature review gave us a headstart when developing the below. (Check out his article on refugee identity, listed below.)

The collection below is also meant to complement our article in the Stanford Social Innovation Review on Data Collaboration for Migration where we emphasize the need for a Data Responsibility Framework moving forward.

From climate change to politics to finance, there is growing recognition that some of the most intractable problems of our era are information problems. In recent years, the ongoing refugee crisis has increased the call for new data-driven approaches to address the many challenges and opportunities arising from migration. While data – including data from the private sector – holds significant potential value for informing analysis and targeted international and humanitarian response to (forced) migration, decision-makers often lack an actionable understanding of if, when and how data could be collected, processed, stored, analyzed, used, and shared in a responsible manner.

Data responsibility – including the responsibility to protect data and shield its subjects from harms, and the responsibility to leverage and share data when it can provide public value – is an emerging field seeking to go beyond just privacy concerns. The forced migration arena has a number of particularly important issues impacting responsible data approaches, including the risks of leveraging data regarding individuals fleeing a hostile or repressive government.

In this edition of the GovLab’s Selected Readings series, we examine the emerging literature on data responsibility approaches in the refugee and forced migration space – part of an ongoing series focused on Data Responsibility. The below reading list features annotated readings related to the Policy and Practice of data responsibility for refugees, and the specific responsibility challenges regarding Identity and Biometrics.

Data Responsibility and Refugees – Policy and Practice

International Organization for Migration (IOM) (2010) IOM Data Protection Manual. Geneva: IOM.

  • This IOM manual includes 13 data protection principles related to the following activities: lawful and fair collection, specified and legitimate purpose, data quality, consent, transfer to third parties, confidentiality, access and transparency, data security, retention and personal data, application of the principles, ownership of personal data, oversight, compliance and internal remedies (and exceptions).
  • For each principle, the IOM manual features targeted data protection guidelines, and templates and checklists are included to help foster practical application.

Norwegian Refugee Council (NRC) Internal Displacement Monitoring Centre / OCHA (eds.) (2008) Guidance on Profiling Internally Displaced Persons. Geneva: Inter-Agency Standing Committee.

  • This NRC document contains guidelines on gathering better data on Internally Displaced Persons (IDPs), based on country context.
  • An IDP profile is defined as including the number of displaced persons, their location, the causes and patterns of displacement, and their humanitarian needs, among other elements.
  • It further states that collecting data on IDPs is challenging, and that the current state of IDP data hampers assistance programs.
  • Chapter I of the document explores the rationale for IDP profiling. Chapter II describes the “who” aspect of profiling: who IDPs are and the common pitfalls in distinguishing them from other population groups. Chapter III describes the different methodologies that can be used in different contexts, suggesting some of the advantages and disadvantages of each, what kind of information is needed, and when it is appropriate to profile.

United Nations High Commissioner for Refugees (UNHCR). Model agreement on the sharing of personal data with Governments in the context of hand-over of the refugee status determination process. Geneva: UNHCR.

  • This document from UNHCR provides a template of agreement guiding the sharing of data between a national government and UNHCR. The model agreement’s guidance is aimed at protecting the privacy and confidentiality of individual data while promoting improvements to service delivery for refugees.

United Nations High Commissioner for Refugees (UNHCR) (2015). Policy on the Protection of Personal Data of Persons of Concern to UNHCR. Geneva: UNHCR.

  • This policy outlines the rules and principles governing the processing of personal data of persons of concern to UNHCR, with the purpose of ensuring that the practice is consistent with the UN General Assembly’s regulation of computerized personal data files, established to protect individuals’ data and privacy.
  • UNHCR requires its personnel to apply the following principles when processing personal data: (i) legitimate and fair processing; (ii) purpose specification; (iii) necessity and proportionality; (iv) accuracy; (v) respect for the rights of the data subject; (vi) confidentiality; (vii) security; and (viii) accountability and supervision.

United Nations High Commissioner for Refugees (UNHCR) (2015) Privacy Impact Assessment of UNHCR Cash Based Interventions.

  • This impact assessment focuses on privacy issues related to financial assistance for refugees in the form of cash transfers. For international organizations like UNHCR to determine eligibility for cash assistance, data “aggregation, profiling, and social sorting techniques” are often needed, creating the need for a responsible data approach.
  • The Privacy Impact Assessment (PIA) aims to identify the privacy risks posed by the program and to enhance the safeguards that can mitigate those risks.
  • Key issues raised in the PIA involve the challenge of ensuring that individuals’ data will not be used for purposes other than those initially specified.

Data Responsibility in Identity and Biometrics

Bohlin, A. (2008) “Protection at the Cost of Privacy? A Study of the Biometric Registration of Refugees.” Lund: Faculty of Law of the University of Lund.

  • This 2008 study focuses on the systematic biometric registration of refugees conducted by UNHCR in refugee camps around the world, asking whether enhanced registration mechanisms contribute to refugees’ protection and the guarantee of their human rights, or instead expose them to invasions of privacy.
  • Bohlin found that, at the time, UNHCR had failed to put proper safeguards in place for data dissemination, exposing refugees’ data to the risk of misuse. She goes on to suggest data protection regulations that could be put in place to protect refugees’ privacy.

Currion, Paul. (2018) “The Refugee Identity.” Medium.

  • Developed as part of a DFID-funded initiative, this essay considers Data Requirements for Service Delivery within Refugee Camps, with a particular focus on refugee identity.
  • Among other findings, Currion finds that since “the digitisation of aid has already begun…aid agencies must therefore pay more attention to the way in which identity systems affect the lives and livelihoods of the forcibly displaced, both positively and negatively.”
  • Currion argues that a Responsible Data approach, as opposed to a process defined by a Data Minimization principle, provides “useful guidelines,” but notes that data responsibility “still needs to be translated into organisational policy, then into institutional processes, and finally into operational practice.”

Farraj, A. (2010) “Refugees and the Biometric Future: The Impact of Biometrics on Refugees and Asylum Seekers.” Colum. Hum. Rts. L. Rev. 42 (2010): 891.

  • This article argues that biometrics help refugees and asylum seekers establish their identity, which is important for ensuring the protection of their rights and service delivery.
  • However, Farraj also describes several risks related to biometrics, such as misidentification and the misuse of data, pointing to the need for proper approaches to the collection, storage, and use of biometric information by governments, international organizations, and other parties.

GSMA (2017) Landscape Report: Mobile Money, Humanitarian Cash Transfers and Displaced Populations. London: GSMA.

  • This paper from GSMA seeks to evaluate how mobile technology can be helpful in refugee registration, cross-organizational data sharing, and service delivery processes.
  • One of its assessments is that the use of mobile money in a humanitarian context depends on a supportive regulatory environment to unlock its true potential. Examples include extending SIM dormancy periods to accommodate infrequent cash disbursements and ensuring that persons without identification are able to use mobile money services.
  • Additionally, GSMA argues that mobile money will be most successful when there is an ecosystem to support other financial services such as remittances, airtime top-ups, savings, and bill payments. These services will be especially helpful in including displaced populations in development.

GSMA (2017) Refugees and Identity: Considerations for mobile-enabled registration and aid delivery. London: GSMA.

  • This paper emphasizes the importance of registration in the context of humanitarian emergencies, because being registered, and holding a document that proves it, is key to acquiring services and assistance.
  • Studying the cases of Kenya and Iraq, the report concludes with three recommendations for improving mobile data collection and registration processes: 1) establish more flexible know-your-customer (KYC) requirements for mobile money where refugees are unable to meet existing ones; 2) encourage interoperability and data sharing to avoid fragmented and duplicative registration management; and 3) build partnership and collaboration among governments, humanitarian organizations, and multinational corporations.

Jacobsen, Katja Lindskov (2015) “Experimentation in Humanitarian Locations: UNHCR and Biometric Registration of Afghan Refugees.” Security Dialogue, Vol 46 No. 2: 144–164.

  • In this article, Jacobsen studies the biometric registration of Afghan refugees, and considers how “humanitarian refugee biometrics produces digital refugees at risk of exposure to new forms of intrusion and insecurity.”

Jacobsen, Katja Lindskov (2017) “On Humanitarian Refugee Biometrics and New Forms of Intervention.” Journal of Intervention and Statebuilding, 1–23.

  • This article traces the evolution of the use of biometrics at the Office of the United Nations High Commissioner for Refugees (UNHCR) – moving from a few early pilot projects (in the early-to-mid-2000s) to the emergence of a policy in which biometric registration is considered a ‘strategic decision’.

Manby, Bronwen (2016) “Identification in the Context of Forced Displacement.” Washington DC: World Bank Group. Accessed August 21, 2017.

  • In this paper, Manby describes the consequences of lacking an identity in situations of forced displacement. It prevents displaced populations from accessing various services and creates a higher risk of exploitation. It also lowers the effectiveness of humanitarian action, since a lack of identity prevents humanitarian organizations from delivering their services to displaced populations.
  • Lack of identity can be both a consequence and a cause of forced displacement. People without identity documents can be considered illegal and risk deportation, while the conflicts that lead to displacement can also result in the loss of documents during flight.
  • The paper identifies the different stakeholders and their interests in identity and forced displacement, and finds that the biggest challenge in providing identity to refugees is the politics of identification and nationality.
  • Manby concludes that addressing this challenge requires more effective coordination among governments, international organizations, and the private sector to develop alternative means of providing identification and services to displaced persons. She also argues that it is essential that national identification become a universal practice for states.

McClure, D. and Menchi, B. (2015). Challenges and the State of Play of Interoperability in Cash Transfer Programming. Geneva: UNHCR/World Vision International.

  • This report reviews the elements that contribute to interoperability design for Cash Transfer Programming (CTP). The design framework offered here maps out these various features and examines both the nature of the problem and the state of play through a variety of use cases.
  • The study provides insights into ways of addressing the multi-dimensionality of interoperability measures in increasingly complex ecosystems.

NRC / International Human Rights Clinic (2016). Securing Status: Syrian refugees and the documentation of legal status, identity, and family relationships in Jordan.

  • This report examines Syrian refugees’ attempts to obtain identity cards and other forms of legally recognized documentation (mainly, Ministry of Interior Service Cards, or “new MoI cards”) in Jordan through the state’s Urban Verification Exercise (“UVE”). These MoI cards are significant because they allow Syrians to live outside of refugee camps and move freely about Jordan.
  • The text reviews the application processes and the challenges and consequences refugees face when unable to obtain documentation, ranging from lack of access to basic services to arrest, detention, forced relocation to camps, and refoulement.
  • Seventy-two Syrian refugee families in Jordan were interviewed in 2016 for this report and their experiences with obtaining MoI cards varied widely.

Office of Internal Oversight Services (2015). Audit of the operations in Jordan for the Office of the United Nations High Commissioner for Refugees. Report 2015/049. New York: UN.

  • This report documents the January 1, 2012 – March 31, 2014 audit of Jordanian operations, which was intended to ensure the effectiveness of the UNHCR Representation in the state.
  • The main goals of the Regional Response Plan for Syrian refugees included relieving the pressure on Jordanian services and resources while still maintaining protection for refugees.
  • The audit concluded that the Representation’s performance was initially unsatisfactory, and OIOS issued several recommendations under two key controls, which the Representation accepted. These included:
    • Project management:
      • Providing training to staff involved in the financial verification of partners
      • Revising standard operating procedure on cash based interventions
      • Establishing ways to ensure that appropriate criteria for payment of all types of costs to partners’ staff are included in partnership agreements
    • Regulatory framework:
      • Preparing annual need-based procurement plan and establishing adequate management oversight processes
      • Creating procedures for the assessment of renovation work in progress and issuing written change orders
      • Protecting data and ensuring timely consultation with the UNHCR Division of Financial and Administrative Management

UNHCR/WFP (2015). Joint Inspection of the Biometrics Identification System for Food Distribution in Kenya. Geneva: UNHCR/WFP.

  • This report outlines the partnership between WFP and UNHCR to implement a biometric identity-checking system supporting food distribution in the Dadaab and Kakuma refugee camps in Kenya.
  • The two agencies conducted a joint inspection mission in March 2015, which found the system to be an effective tool and a model for other country operations.
  • Still, 11 recommendations are proposed and responded to in the text to further improve the system’s efficiency, including real-time evaluation of impact, automatic alerts, and documentation of best practices, among others.

Selected Readings on Data, Gender, and Mobility


By Michelle Winowatan, Andrew Young, and Stefaan Verhulst

The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of data, gender, and mobility was originally published in 2017.

This edition of the Selected Readings was developed as part of an ongoing project at the GovLab, supported by Data2X, in collaboration with UNICEF, DigitalGlobe, IDS (UDD/Telefonica R&D), and the ISI Foundation, to establish a data collaborative to analyze unequal access to urban transportation for women and girls in Chile. We thank all our partners for their suggestions for the curation below – in particular Leo Ferres at IDS, who got us started with this collection; Ciro Cattuto and Michele Tizzoni from the ISI Foundation; and Bapu Vaitla at Data2X, for their pointers to the growing data and mobility literature.

Introduction

Daily mobility is key to gender equity. Access to transportation contributes to women’s agency and independence, and the ability to move from place to place safely and efficiently allows women to access education, work, and other opportunities. Yet mobility is not just a means of reaching opportunities; it is also a means of entering the public domain.

Women’s mobility is a multi-layered challenge
Women’s daily mobility, however, is often hampered by social, cultural, infrastructural, and technical barriers. Cultural bias, for instance, limits women’s mobility by confining them to areas close to home, a product of society’s double standard that women should be homemakers. From an infrastructural perspective, public transportation mostly accommodates home-to-work trips, when in reality women often make more complex trips with stops at, for example, the market, school, or a healthcare provider – sometimes called “trip chaining.” From a safety perspective, women tend to avoid making trips in certain areas and/or at certain times, due to the constant risk of sexual harassment in public places. Safety concerns also push women toward more expensive transportation, such as taking a cab instead of a bus or train.

The growing importance of (new sources of) data
Researchers are increasingly experimenting with ways to address these interdependent problems through the analysis of diverse datasets, often collected by private sector businesses and other non-governmental entities. Gender-disaggregated mobile phone records, geospatial data, satellite imagery, and social media data, to name a few, are providing evidence-based insight into gender and mobility concerns. Such data collaboratives – the exchange of data across sectors to create public value – can help governments, international organizations, and other public sector entities in the move toward more inclusive urban and transportation planning, and the promotion of gender equity.
The curated set of readings below focuses on the following areas:

  1. Insights on how data can inform gender empowerment initiatives,
  2. Emergent research into the capacity of new data sources – like call detail records (CDRs) and satellite imagery – to increase our understanding of human mobility patterns, and
  3. Publications exploring data-driven policy for gender equity in mobility.

Readings are listed in alphabetical order.

We selected the readings based upon their focus (gender and/or mobility related); scope and representativeness (going beyond one project or context); type of data used (such as CDRs and satellite imagery); and date of publication.

Annotated Reading List

Data and Gender

Blumenstock, Joshua, and Nathan Eagle. Mobile Divides: Gender, Socioeconomic Status, and Mobile Phone Use in Rwanda. ACM Press, 2010.

  • Using traditional survey data and mobile phone operator data, this study analyzes gender and socioeconomic divides in mobile phone use in Rwanda, finding that mobile phone use is significantly more prevalent among men and those of higher socioeconomic status.
  • The study also shows differences in the way men and women use phones; for example, women are more likely than men to use a shared phone.
  • The authors frame their findings around gender and economic inequality in the country in order to provide pointers for government action.

Bosco, Claudio, et al. Mapping Indicators of Female Welfare at High Spatial Resolution. WorldPop and Flowminder, 2015.

  • This report focuses on early adolescence in girls, which often comes with a higher risk of violence, fewer economic opportunities, and restrictions on mobility. Significant data gaps, along with methodological and ethical issues surrounding data collection on girls, also make it harder for policymakers to craft evidence-based policy addressing these issues.
  • The authors analyze geolocated household survey data using statistical models and validation techniques, and create high-resolution maps of various sex-disaggregated indicators – such as nutrition level, access to contraception, and literacy – to better inform local policymaking.
  • Further, the report identifies the gender data gap and the issues surrounding gender data collection, and argues that comprehensive data can support better policy and contribute to the achievement of the Sustainable Development Goals (SDGs).

Buvinic, Mayra, Rebecca Furst-Nichols, and Gayatri Koolwal. Mapping Gender Data Gaps. Data2X, 2014.

  • This study identifies gaps in gender data in developing countries on health, education, economic opportunities, political participation, and human security.
  • It recommends ways to close the gender data gap through censuses and micro-level surveys and through service and administrative records, and emphasizes how “big data” in particular can supply the missing data needed to measure the progress of women’s and girls’ well-being. The authors argue that identifying these gaps is key to advancing gender equality and women’s empowerment, one of the SDGs.

Catalyzing Inclusive Financial System: Chile’s Commitment to Women’s Data. Data2X, 2014.

  • This article analyzes global and national data in the banking sector to fill the gap in sex-disaggregated data in Chile. The purpose of the study is to describe differences in spending behavior and priorities between women and men, identify the challenges women face in accessing financial services, and inform policies that promote women’s inclusion in Chile.

Ready to Measure: Twenty Indicators for Monitoring SDG Gender Targets. Open Data Watch and Data2X, 2016.

  • Using readily available data this study identifies 20 SDG indicators related to gender issues that can serve as a baseline measurement for advancing gender equality, such as percentage of women aged 20-24 who were married or in a union before age 18 (child marriage), proportion of seats held by women in national parliament, and share of women among mobile telephone owners, among others.

Ready to Measure Phase II: Indicators Available to Monitor SDG Gender Targets. Open Data Watch and Data2X, 2017.

  • The Phase II paper extends Ready to Measure Phase I above. Where Phase I identifies readily available data for measuring women’s and girls’ well-being, Phase II provides information on how to access this data and summarizes insights drawn from it.
  • Phase II elaborates on insights from the ready-to-measure indicators and finds that although the underlying data for measuring women’s and girls’ well-being is readily available in most cases, it is typically not sex-disaggregated.
  • Over one in five – 53 out of 232 – SDG indicators specifically refer to women and girls. However, further analysis reveals that at least 34 more indicators should be disaggregated by sex. For instance, there should be 15 more sex-disaggregated indicators for SDG 3: “Ensure healthy lives and promote well-being for all at all ages.”
  • The report recommends that national statistical agencies take the lead and make additional efforts to close the gender data gap for each of the SDGs, using tools such as statistical modeling.

Reed, Philip J., Muhammad Raza Khan, and Joshua Blumenstock. Observing gender dynamics and disparities with mobile phone metadata. International Conference on Information and Communication Technologies and Development (ICTD), 2016.

  • The study analyzes the mobile phone logs of millions of Pakistani residents to explore whether mobile phone usage behavior differs between men and women, and to determine the extent to which gender inequality is reflected in mobile phone usage.
  • It uses mobile phone data to analyze patterns of usage behavior by gender, and socioeconomic and demographic data obtained from the census and advocacy groups to assess the state of gender equality in each region of Pakistan.
  • One of its findings is a strong positive correlation between the proportion of female mobile phone users and education score.

Stehlé, Juliette, et al. Gender homophily from spatial behavior in a primary school: A sociometric study. 2013.

  • This paper seeks to understand homophily, the human tendency to interact with peers who share similarities ranging from “physical attributes to tastes or political opinions.” Further, it seeks to identify the magnitude of influence a given type of homophily has on social structures.
  • Focusing on gender interaction among primary-school-aged children in France, the paper draws on data from wearable sensors worn by 200 children over a two-day period, measuring the physical proximity and duration of interactions among the children in the playground.
  • It finds that interaction patterns are significantly determined by the grade and class structure of the school: children belonging to the same class interact the most, and lower grades usually do not interact with higher grades.
  • Through a gender lens, the study finds that mixed-gender interactions are shorter than same-gender interactions, and that interactions among girls last longer than interactions among boys. This indicates that the children tend to have stronger relationships within their own gender – what the study calls gender homophily – which is apparent in all classes.

Data and Mobility

Bengtsson, Linus, et al. Using Mobile Phone Data to Predict the Spatial Spread of Cholera. Flowminder, 2015.

  • This study seeks to predict the spread of the 2010 cholera epidemic in Haiti using data from 2.9 million anonymous mobile phone SIM cards and cholera cases reported by the Haitian Directorate of Health, analyzing 78 study areas over the period October 16 – December 16, 2010.
  • From this dataset, the study builds a mobility matrix of mobile phone movement from one study area to another and combines it with the number of reported cholera cases in each area to calculate the areas’ infectious pressure levels.
  • The main finding is that a study area’s outbreak risk correlates positively with its infectious pressure level: an infectious pressure above 22 resulted in an outbreak within 7 days. Further, the infectious pressure level can inform the sensitivity and specificity of outbreak predictions.
  • The approach aims to improve infectious disease containment by identifying the areas at highest risk of outbreaks.
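The mobility-matrix approach summarized above can be illustrated with a toy sketch. This is not the paper’s actual model – the matrix, case counts, and aggregation rule below are invented for illustration, and only the threshold of 22 is taken from the study:

```python
# Toy sketch (illustrative only): combine a mobility matrix with reported
# case counts to rank areas by "infectious pressure". All numbers are
# hypothetical; the aggregation rule is a simplification, not the
# paper's exact formula.

def infectious_pressure(mobility, cases):
    """Pressure on area j = sum over source areas i of cases[i] * trips i -> j."""
    n = len(cases)
    return [sum(cases[i] * mobility[i][j] for i in range(n)) for j in range(n)]

# Three hypothetical study areas; area 0 has an active outbreak.
mobility = [[0, 50, 10],   # mobility[i][j]: SIM movements from area i to area j
            [5,  0, 20],
            [2,  8,  0]]
cases = [30, 0, 0]         # reported cholera cases per area

pressure = infectious_pressure(mobility, cases)
at_risk = [j for j, p in enumerate(pressure) if p > 22]  # threshold of 22 from the study
print(pressure, at_risk)
```

In this toy example, areas receiving heavy travel inflow from the outbreak area accumulate the highest pressure, which is the intuition behind targeting containment resources there.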

Calabrese, Francesco, et al. Understanding Individual Mobility Patterns from Urban Sensing Data: A Mobile Phone Trace Example. SENSEable City Lab, MIT, 2012.

  • This study compares mobile phone data and odometer readings from annual safety inspections to characterize individual mobility and vehicular mobility in the Boston Metropolitan Area, measured by the average daily total trip length of mobile phone users and average daily Vehicular Kilometers Traveled (VKT).
  • The study found that, “accessibility to work and non-work destinations are the two most important factors in explaining the regional variations in individual and vehicular mobility, while the impacts of populations density and land use mix on both mobility measures are insignificant.” Further, “a well-connected street network is negatively associated with daily vehicular total trip length.”
  • This study demonstrates the potential for mobile phone data to provide useful and updatable information on individual mobility patterns to inform transportation and mobility research.

Campos-Cordobés, Sergio, et al. “Chapter 5 – Big Data in Road Transport and Mobility Research.” Intelligent Vehicles. Edited by Felipe Jiménez. Butterworth-Heinemann, 2018.

  • This study outlines a number of techniques and data sources – such as geolocation information, mobile phone data, and social network observation – that could be leveraged to predict human mobility.
  • The authors also provide a number of examples of real-world applications of big data to address transportation and mobility problems, such as transport demand modeling, short-term traffic prediction, and route planning.

Lin, Miao, and Wen-Jing Hsu. Mining GPS Data for Mobility Patterns: A Survey. Pervasive and Mobile Computing, vol. 12, 2014.

  • This study surveys the current field of research using high resolution positioning data (GPS) to capture mobility patterns.
  • The survey focuses on analyses related to frequently visited locations, modes of transportation, trajectory patterns, and placed-based activities. The authors find “high regularity” in human mobility patterns despite high levels of variation among the mobility areas covered by individuals.

Phithakkitnukoon, Santi, Zbigniew Smoreda, and Patrick Olivier. Socio-Geography of Human Mobility: A Study Using Longitudinal Mobile Phone Data. PLoS ONE, 2012.

  • This study used a year of call logs and location data from approximately one million mobile phone users in Portugal to analyze the association between individuals’ mobility and their social networks.
  • It measures and analyzes travel scope (locations visited) and geo-social radius (distance from friends, family, and acquaintances) to determine the association.
  • It finds that 80% of places visited are within 20 km of an individual’s nearest social ties’ location, rising to 90% within a 45 km radius. Further, as population density increases, the distance between individuals and their social networks decreases.
  • These findings demonstrate how mobile phone data can provide insights into “the socio-geography of human mobility.”

Semanjski, Ivana, and Sidharta Gautama. Crowdsourcing Mobility Insights – Reflection of Attitude Based Segments on High Resolution Mobility Behaviour Data. Transportation Research, vol. 71, 2016.

  • Using cellphone data, this study maps attitudinal segments that explain how age, gender, occupation, household size, income, and car ownership influence an individual’s mobility patterns. This type of segment analysis is seen as particularly useful for targeted messaging.
  • The authors argue that these time- and space-specific insights could also provide value for government officials and policymakers, by, for example, allowing for evidence-based transportation pricing options and public sector advertising campaign placement.

Silveira, Lucas M., et al. MobHet: Predicting Human Mobility using Heterogeneous Data Sources. Computer Communications, vol. 95, 2016.

  • This study explores the potential of using data from multiple sources (e.g., Twitter and Foursquare), in addition to GPS data, to provide more accurate predictions of human mobility. This heterogeneous data captures the popularity of different locations, the frequency of visits to those locations, and the relationships among people moving around the target area. The authors’ initial experimentation finds that combining these data sources yields more accurate identification of human mobility patterns.

Wilson, Robin, et al. Rapid and Near Real-Time Assessments of Population Displacement Using Mobile Phone Data Following Disasters: The 2015 Nepal Earthquake. PLOS Current Disasters, 2016.

  • Utilizing the call detail records of 12 million mobile phone users in Nepal, this study tracks the spatio-temporal movements of the population after the earthquake of April 25, 2015.
  • It seeks to address the problem of slow and ineffective disaster response by capturing near real-time displacement patterns from mobile phone call detail records, in order to inform humanitarian agencies about where to distribute assistance. The preliminary results of this study were available nine days after the earthquake.
  • The project relied on cooperation established with the mobile phone operator before the earthquake, which supplied the de-identified data from 12 million users.
  • The study finds that shortly after the earthquake there was an anomalous population movement out of the Kathmandu Valley, the most impacted area, to surrounding areas. It estimates that roughly 390,000 more people than normal left the valley.
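The notion of “anomalous” movement rests on comparing observed outflows against a pre-event baseline. A minimal sketch of how such an excess-departure estimate might be computed, with all figures invented for illustration:

```python
# Toy sketch of baseline-vs-observed displacement estimation.
# The daily outflow figures are hypothetical, not the study's data.

from statistics import mean

def excess_departures(pre_event_daily_outflow, post_event_outflow):
    """Estimate above-normal departures by comparing each observed day
    against the average of the pre-event baseline days."""
    baseline = mean(pre_event_daily_outflow)
    return sum(max(0, day - baseline) for day in post_event_outflow)

baseline_days = [10_000, 11_000, 9_000, 10_000]   # normal daily departures from an area
post_event    = [60_000, 45_000, 30_000]          # observed departures after the event

result = excess_departures(baseline_days, post_event)
print(result)
```

The same comparison, run per region on de-identified SIM movements, is the kind of calculation that can surface displacement hotspots within days of a disaster.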

Data, Gender and Mobility

Althoff, Tim, et al. “Large-Scale Physical Activity Data Reveal Worldwide Activity Inequality.” Nature, 2017.

  • This study’s analysis of worldwide physical activity is built on a dataset containing 68 million days of physical activity for 717,527 people, collected through their smartphones’ accelerometers.
  • The authors find a significant reduction in female activity levels in cities with high activity inequality, and that high activity inequality is associated with low city walkability – walkability indicators include pedestrian facilities (city block length, intersection density, etc.) and amenities (shops, parks, etc.).
  • Further, they find that high activity inequality is associated with high levels of inactivity-related health problems, such as obesity.

Borker, Girija. “Safety First: Street Harassment and Women’s Educational Choices in India.” Stop Street Harassment, 2017.

  • Using data collected from SafetiPin, an application that allows users to mark an area on a map as safe or unsafe, and Safecity, an application that lets users share their experiences of harassment in public places, the researcher analyzes the safety of travel routes around different colleges in India and its effect on women’s college choices.
  • The study finds that women are willing to attend a lower-ranked college in order to avoid a higher risk of street harassment. Women who choose the best college from their set of options spend an average of $250 more each year to access safer modes of transportation.

Frias-Martinez, Vanessa, Enrique Frias-Martinez, and Nuria Oliver. A Gender-Centric Analysis of Calling Behavior in a Developing Economy Using Call Detail Records. Association for the Advancement of Artificial Intelligence, 2010.

  • Using the encrypted Call Detail Records (CDRs) of 10,000 participants in a developing economy, this study analyzes behavioral, social, and mobility variables to determine the gender of a mobile phone user, and finds that behavioral and social variables in mobile phone use differ between women and men.
  • It finds that women make more calls, talk longer, and spend more on calls than men. Women also have larger social networks, meaning the number of unique phone numbers they contact or are contacted by is greater. The study finds no statistically significant difference between men and women in the distance covered between calls.
  • Frias-Martinez et al. recommend taking these findings into account when designing cellphone-based services.

Psylla, Ioanna, Piotr Sapiezynski, Enys Mones, Sune Lehmann. “The role of gender in social network organization.” PLoS ONE 12, December 20, 2017.

  • Using a large dataset of high-resolution data collected through mobile phones, as well as detailed questionnaires, this report studies gender differences in a large cohort. The researchers consider mobility behavior and individual personality traits among a group of more than 800 university students.
  • Analyzing the mobility data, they find that women visit more unique locations over time and distribute their time more homogeneously across those locations than men, indicating that women’s time commitments are more widely spread across places.

Vaitla, Bapu. Big Data and the Well-Being of Women and Girls: Applications on the Social Scientific Frontier. Data2X, Apr. 2017.

  • In this study, the researchers use geospatial data, credit card and cell phone information, and social media posts to identify problems facing women and girls in developing countries, such as malnutrition, limited education, poor access to healthcare, and mental health issues.
  • From the credit card and cell phone data in particular, the report finds that analyzing patterns of women’s spending and mobility can provide useful insight into Latin American women’s “economic lifestyles.”
  • Based on this analysis, Vaitla recommends that various non-traditional big data sources be used to fill gaps in conventional data sources and address the frequent invisibility of women and girls in institutional databases.

Selected Readings on CrowdLaw


By Beth Simone Noveck and Gabriella Capone

The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of CrowdLaw was published in 2018, and most recently updated on February 13, 2019.

Introduction

The public is beginning to demand — and governments are beginning to provide — new opportunities for the engagement of citizens on an ongoing basis as collaborators in public problem-solving rather than merely as voters. Nowhere is the explosion in citizen participation accelerating more than in the context of lawmaking, where legislators and regulators are turning to new technology to solicit both public opinion and know-how to improve the legitimacy and effectiveness of the legislative process.

Such participatory lawmaking, known as crowdlaw (also, CrowdLaw), is a tech-enabled approach for the collaborative drafting of legislation, policies or constitutions between governments and citizens. CrowdLaw is an alternative to the traditional method of lawmaking, which is typically done by the political elite — politicians, bureaucrats, and staff — working in legislatures behind closed doors, with little input from the people affected. Instead, this new form of inclusive lawmaking opens the legislative function of government to a broader array of actors.

From Brazil to Iceland to Libya, there is an explosion in new collaborative lawmaking experiments. Despite the growing movement, the field of participatory lawmaking requires further research and experimentation. Given the traditionally deep distrust of groups expressed in the social psychology literature on groupthink, which condemns the presumed tendency of groups to drift to extreme positions, it is not self-evident that crowdlaw practices are better and should be institutionalized. Also, depending on its design, crowdlaw has the potential to accomplish different normative goals, which are often viewed as being at odds, including: improving democratic legitimacy by giving more people a voice in the process, or creating better quality legislation by introducing greater expertise. There is a need to study crowdlaw practices and assess their impact.

To complement our evolving theoretical and empirical research on and case studies of crowdlaw, we have compiled these selected readings on public engagement in lawmaking and policymaking. For reasons of space, we do not include readings on citizen engagement or crowdsourcing and open innovation generally (see GovLab’s Selected Readings on Crowdsourcing Opinions and Ideas) but focus, instead, on engagement in these specific institutional contexts.

We invite you to visit Crowd.Law for additional resources, as well as:

CrowdLaw Design Recommendations

CrowdLaw Twitter List

CrowdLaw Unconferences

Annotated Readings

Aitamurto, Tanja – Collective Intelligence in Law Reforms: When the Logic of the Crowds and the Logic of Policymaking Collide (Paper, 10 pages, 2016)

  • This paper explores the risks of crowdsourcing for policymaking and the challenges that arise from a severe conflict between the logic of the crowds and the logic of policymaking. Aitamurto highlights the differences between traditional policymaking, done by a small group of experts, and crowdsourced policymaking, which draws on a large, anonymous crowd with mixed levels of expertise.
  • “By drawing on data from a crowdsourced law-making process in Finland, the paper shows how the logics of the crowds and policymaking collide in practice,” and thus how this conflict prevents governments from gathering valuable insights from the crowd’s input. Aitamurto then addresses how to resolve this conflict and overcome these challenges.

Atlee, Tom – vTaiwan (Blog series, 5 parts, 2018)

  • In this five-part blog series, Atlee describes in detail Taiwan’s citizen engagement platform vTaiwan and his takeaways after several months of research.
  • In order to cover what he deems “an inspiring beginning of a potentially profound evolutionary shift in all aspects of our collective governance,” Atlee divides his findings into the following sections:
    • The first post includes a quick introduction and overview of the platform.
    • The second delves deeper into its origins, process, and mechanics.
    • The third describes two real actions completed by vTaiwan and its associated g0v community.
    • The fourth provides a long list of useful sources discovered by Atlee.
    • The fifth and final post offers a high-level examination of vTaiwan and makes comments to provide lessons for other governments.

Capone, Gabriella and Beth Simone Noveck – “CrowdLaw”: Online Public Participation in Lawmaking, (Report, 71 pages, 2017)

  • Capone and Noveck provide recommendations for the thoughtful design of crowdlaw initiatives, a model legislative framework for institutionalizing legislative participation, and a summary of 25 citizen engagement case studies from around the world — all in an effort to acknowledge and promote best crowdlaw practices. The report, written to inform the public engagement strategy of the Autonomous Community of Madrid, can apply to crowdlaw initiatives across different contexts and jurisdictions.
  • CrowdLaw advocates for engagement opportunities that go beyond citizens suggesting ideas, inviting the integration of participation throughout the legislative life-cycle – from agenda-setting to the evaluation of implemented legislation. Additionally, Capone and Noveck highlight the importance of engaging with the recipient public institutions to ensure that participatory actions are useful and desired. Finally, they lay out a research and experimentation agenda for crowdlaw, noting that increased data capture and sharing, as well as the creation of empirical standards for evaluating initiatives, are integral to the progress and promise of crowdlaw.
  • The 25 case studies are organized by a six-part taxonomy of: (1) the participatory task requested, (2) the methods employed by the process, (3) the stages of the legislative process, (4) the platforms used, from mobile to in-person meetings, (5) the institutionalization or degree of legal formalization of the initiative, and (6) the mechanisms and metrics for ongoing evaluation of the initiative.

Faria, Cristiano Ferri Soares de – The open parliament in the age of the internet: can the people now collaborate with legislatures in lawmaking? (Book, 352 pages, 2013)

  • Faria explores the concept of participatory parliaments, and how participatory and deliberative democracy can complement existing systems of representative democracy. It is currently the first and only full-length book surveying citizen engagement in lawmaking.
  • As the World Bank’s Tiago Peixoto writes: “This is a text that brings the reader into contact with the main theories and arguments relating to issues of transparency, participation, actors’ strategies, and processes of institutional and technological innovation. […] Cristiano Faria captures the state of the art in electronic democracy experiences in the legislative at the beginning of the 21st century.”
  • Chapters 4 and 5 offer deep dives into two case studies: the Chilean Senate’s Virtual Senator project and the Brazilian House of Representatives’ e-Democracy project.

Johns, Melissa, and Valentina Saltane (World Bank Global Indicators Group) – Citizen Engagement in Rulemaking: Evidence on Regulatory Practices in 185 Countries (Report, 45 pages, 2016)

  • This report “presents a new database of indicators measuring the extent to which rulemaking processes are transparent and participatory across 185 countries. […] [It] presents a new global data set on citizen engagement in rulemaking and provides detailed descriptive statistics for the indicators. The paper then provides preliminary analysis on how the level of citizen engagement correlates with other social and economic outcomes. To support this analysis, we developed a composite citizen engagement in rulemaking score around the publication of proposed regulations, consultation on their content and the use of regulatory impact assessments.”
  • The authors outline the global landscape of regulatory processes and the extent to which citizens are kept privy to regulatory happenings and/or able to participate in them.
  • Findings include that in “30 of the sampled economies regulators voluntarily publish proposed regulations despite having no formal requirement to do so” and that “In 98 of the 185 countries surveyed for this paper, ministries and regulatory agencies do not conduct impact assessments of proposed regulations.” Also: “High-income countries tend to perform well on the citizen engagement in rulemaking score.”

Noveck, Beth Simone – The Electronic Revolution in Rulemaking (Journal article, 90 pages, 2004)

  • Noveck addresses the need for the design of effective practices, beyond the legal procedure that enables participation, in order to fully institutionalize the right to participate in e-rulemaking processes. At the time of writing, e-rulemaking practices failed to “do democracy,” which requires building a community of practice and taking advantage of enabling technology. The work, which focuses on public participation in informal rulemaking processes, explores “how the use of technology in rulemaking can promote more collaborative, less hierarchical, and more sustained forms of participation — in effect, myriad policy juries — where groups deliberate together.”
  • Noveck looks to reorient the field toward improving participatory practices that exploit new technologies: a design-centered approach, as opposed to a critique of the shortcomings of participation. Technology can be a critical tool in promoting meaningful, deliberative engagement between citizens and government. On this view, participation is not merely a procedural right but a set of technology-enabled practices supported by government.

Peña-López, Ismael – decidim.barcelona, Spain. Voice or chatter? Case studies (Report, 54 pages, 2017)

  • Peña-López analyzes the origins and impact of the open-source decidim.barcelona platform, a component of the city’s broader movement towards participatory democracy. The case is divided into “the institutionalization of the ethos of the 15M Spanish Indignados movement, the context building up to the decidim.barcelona initiative,” and then reviews “its design and philosophy […] in greater detail. […] In the final section, the results of the project are analyzed and the shifts of the initiative in meaning, norms and power, both from the government and the citizen end are discussed.”
  • A main finding is that “decidim.barcelona has increased the amount of information in the hands of the citizens, and gathered more citizens around key issues. There has been an increase in participation, with many citizen created proposals being widely supported, legitimated and accepted to be part of the municipality strategic plan. As pluralism has been enhanced without damaging the existing social capital, we can only think that the increase of participation has led to an improvement of democratic processes, especially in bolstering legitimacy around decision making.”

Simon, Julie, Theo Bass, Victoria Boelman, and Geoff Mulgan (Nesta) – Digital Democracy: The Tools Transforming Political Engagement (Report, 100 pages, 2017)

  • Reviews the origins, implementation, and outcomes of 13 case studies representing best practice in digital democracy. The report then identifies six key themes that underpin a “good digital democracy process.” Particularly instructive are the interviews with actors in each of the projects and their accounts of what contributed to their project’s success or failure, alongside the Nesta team’s analysis of why each initiative succeeded or fell short.

Suteu, Silvia – Constitutional Conventions in the Digital Era: Lessons from Iceland and Ireland (Journal article, 26 pages, 2015)

  • This piece from the Boston College International & Comparative Law Review “assesses whether the novelty in the means used in modern constitution-making translates further into novelty at a more substantive level, namely, in the quality of the constitution-making process and legitimacy of the end product. Additionally, this Essay analyzes standards of direct democratic engagements, which adequately fit these new developments, with a focus on the cases of Iceland and Ireland.”
  • It provides four motivations for focusing on constitution-making processes:
    • legitimacy: a good process can create a model for future political interactions,
    • the correlation between participatory constitution-making and the increased availability of popular involvement mechanisms,
    • the breadth of participation is a key factor to ensuring constitutional survival, and
    • democratic renewal.
  • Suteu traces the Icelandic and Irish processes of crowdsourcing their constitutions, the former being known as the first crowdsourced constitution, and the latter being known for its civil society-led We the Citizens initiative which spurred a constitutional convention and the adoption of a citizen assembly in the process.

Bernal, Carlos – How Constitutional Crowd-drafting can enhance Legitimacy in Constitution-Making (Paper, 27 pages, 2018)

  • Bernal examines the use of online engagement to facilitate citizen participation in constitutional drafting, a process he dubs “crowddrafting.” Highlighting examples from places such as Kenya, Iceland, and Egypt, he lays out the details of the process, including key players, methods, actions, and tools.
  • Bernal identifies three stages at which citizens can participate in constitutional crowddrafting: foundational, deliberation, and pre-ratification. Citing more examples, he concisely explains how each process works and states their expected outcomes. Although he acknowledges the challenges it may face, Bernal concludes by proposing that “constitutional crowddrafting is a strategy for strengthening the democratic legitimacy of constitution-making processes by enabling inclusive mechanisms of popular participation of individuals and groups in deliberations, expression of preferences, and decisions related to the content of the constitution.”
  • He suggests that crowddrafting can increase autonomy, transparency, and equality, and can engage groups or individuals that are often left out of deliberative processes. While it may create potential risks, Bernal explains how to mitigate those risks and achieve the full power of enhanced legitimacy from constitutional crowddrafting.

Finnbogadóttir, Vigdís & Gylfason, Thorvaldur – The New Icelandic Constitution: How did it come about? Where is it? (Book, 2016)

  • This book, co-authored by a former President of Iceland (also the world’s first democratically elected female president), tells the story of the crowdsourced Icelandic constitution as a powerful example of participatory democracy.
  • “In 2010 a nationally elected Constitutional Council met, and four months later a draft constitution was born. On the 20th of October 2012, the people of Iceland voted to tell their Parliament to ratify it as its new constitution.” Four years later, the book discusses the current state of the Icelandic constitution and explores whether Parliament is respecting the will of the people.

Mitozo, Isabele & Marques, Francisco Paulo Jamil – Context Matters! Looking Beyond Platform Structure to Understand Citizen Deliberation on Brazil’s Portal e‐Democracia (Article, 21 pages, 2019)

  • This article analyzes the Portal e‐Democracia participatory platform, sponsored by the Brazilian Chamber of Deputies. Since 2009, the online initiative has given legislators different opportunities to engage with constituents through methods such as surveys, forums, and collaborative wiki tools. The article then examines the participatory behavior of Brazilian citizens during four particular forums hosted on Portal e-Democracia.
  • The researchers confirmed their hypothesis (i.e., that debates with diverse characteristics can develop even under the same design structures) and drew several additional conclusions, suggesting that the issue at stake and its sociopolitical context may matter more in characterizing a debate than platform structure does.

Alsina, Victòria and Martí, José Luis – The Birth of the CrowdLaw Movement: Tech-Based Citizen Participation, Legitimacy and the Quality of Lawmaking

  • This paper introduces the idea of CrowdLaw, followed by a deep dive into its roots, its meaning, and the inspiration behind its launch.
  • The authors first distinguish CrowdLaw from other forms of political participation, setting the movement apart from others. They then restate and explain the CrowdLaw Manifesto, a set of 12 collaboratively written principles intended to bolster the design, implementation, and evaluation of new tech-enabled practices of public engagement in law and policymaking. Finally, the authors conclude by emphasizing the importance of certain qualities that are inherent to the concept of CrowdLaw.

Noveck, Beth Simone – Crowdlaw: Collective Intelligence and Lawmaking

  • In this essay, Noveck provides an all-encompassing and detailed description of the CrowdLaw concept. After establishing the value proposition for CrowdLaw methods, Noveck explores good practices for incorporating them into each stage of the law- and policymaking process.
  • Using illustrative examples of successful cases from around the world, Noveck affirms why CrowdLaw should become more widely adopted by highlighting its potential, while suggesting how interested institutions can implement CrowdLaw processes.

Selected Readings on Blockchain and Identity


By Hannah Pierce and Stefaan Verhulst

The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of blockchain and identity was originally published in 2017.

The potential of blockchain and other distributed ledger technologies to create positive social change has inspired enthusiasm, broad experimentation, and some skepticism. In this edition of the Selected Readings series, we explore and curate the literature on blockchain and how it impacts identity as a means to access services and rights. (In a previous edition we considered the Potential of Blockchain for Transforming Governance).

Introduction

In 2008, an unknown author writing under the name Satoshi Nakamoto released a paper, Bitcoin: A Peer-to-Peer Electronic Cash System, which introduced blockchain. Blockchain is a novel technology that uses a distributed ledger to record transactions and verify them without a central intermediary. Blockchain and other distributed ledger technologies (DLTs) rely on their ability to act as a vast, transparent, and secure public database.

Distributed ledger technologies (DLTs) have disruptive potential beyond innovation in products, services, revenue streams and operating systems within industry. By providing transparency and accountability in new and distributed ways, DLTs have the potential to positively empower underserved populations in myriad ways, including providing a means for establishing a trusted digital identity.

Consider the potential of DLTs for the roughly 2.4 billion people worldwide, about 1.5 billion of whom are over the age of 14, who are unable to prove their identity to the satisfaction of authorities and other organizations, and who are often excluded from property ownership, free movement, and social protection as a result. At the same time, the transition to a DLT-led system of ID management involves various risks that, if not understood and mitigated properly, could harm potential beneficiaries.
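
The core mechanism the readings below take for granted, a tamper-evident hash chain, can be sketched in a few lines of Python. This is an illustrative toy, not a real distributed system: it omits consensus, peer-to-peer replication, and proof-of-work entirely, and the function names are invented for this example.

```python
import hashlib
import json

def block_hash(block):
    # Hash the block's canonical JSON encoding with SHA-256.
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

def append_block(chain, data):
    # Each new block stores the hash of its predecessor, so altering
    # any earlier block invalidates every link after it.
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"index": len(chain), "data": data, "prev_hash": prev})
    return chain

def verify_chain(chain):
    # Recompute each predecessor's hash and compare it to the stored link.
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )

chain = []
for record in ["Alice pays Bob 5", "Bob pays Carol 2"]:
    append_block(chain, record)

assert verify_chain(chain)               # intact chain verifies
chain[0]["data"] = "Alice pays Bob 500"  # tamper with history
assert not verify_chain(chain)           # tampering is detected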

Annotated Selected Reading List

Governance

Cuomo, Jerry, Richard Nash, Veena Pureswaran, Alan Thurlow, Dave Zaharchuk. “Building trust in government: Exploring the potential of blockchains.” IBM Institute for Business Value. January 2017.

This paper from the IBM Institute for Business Value culls findings from surveys conducted with over 200 government leaders in 16 countries regarding their experiences and expectations for blockchain technology. The report also identifies “Trailblazers”, or governments that expect to have blockchain technology in place by the end of the year, and details the views and approaches that these early adopters are taking to ensure the success of blockchain in governance. These Trailblazers also believe that there will be high yields from utilizing blockchain in identity management and that citizen services, such as voting, tax collection and land registration, will become increasingly dependent upon decentralized and secure identity management systems. Additionally, some of the Trailblazers are exploring blockchain application in borderless services, like cross-province or state tax collection, because the technology removes the need for intermediaries like notaries or lawyers to verify identities and the authenticity of transactions.

Mattila, Juri. “The Blockchain Phenomenon: The Disruptive Potential of Distributed Consensus Architectures.” Berkeley Roundtable on the International Economy. May 2016.

This working paper gives a clear introduction to blockchain terminology, architecture, challenges, applications (including use cases), and implications for digital trust, disintermediation, democratizing the supply chain, an automated economy, and the reconfiguration of regulatory capacity. As far as identification management is concerned, Mattila argues that blockchain can remove the need to go through a trusted third party (such as a bank) to verify identity online. This could strengthen the security of personal data, as the move from a centralized intermediary to a decentralized network lowers the risk of a mass data security breach. In addition, using blockchain technology for identity verification allows for a more standardized documentation of identity which can be used across platforms and services. In light of these potential capabilities, Mattila addresses the disruptive power of blockchain technology on intermediary businesses and regulating bodies.

Identity Management Applications

Allen, Christopher. “The Path to Self-Sovereign Identity.” Coindesk. April 27, 2016.

In this Coindesk article, author Christopher Allen lays out the history of digital identities, then explains the concept of a “self-sovereign” identity, in which trust is enabled without compromising individual privacy. His ten principles for self-sovereign identity (Existence, Control, Access, Transparency, Persistence, Portability, Interoperability, Consent, Minimization, and Protection) lend themselves to administration on a blockchain. Although some actors are making moves toward establishing self-sovereign identity, several challenges face the widespread implementation of these tenets, including legal risks, confidentiality issues, immature technology, and a reluctance to change established processes.

Jacobovitz, Ori. “Blockchain for Identity Management.” Department of Computer Science, Ben-Gurion University. December 11, 2016.

This technical report discusses the advantages of blockchain technology in managing and authenticating identities online, such as the ability for individuals to create and manage their own online identities, which offers greater control over access to personal data. Using blockchain for identity verification could also provide “digital watermarks” assigned to each of an individual’s transactions, as well as eliminate the need for unique usernames and passwords online. After arguing that this decentralized model will allow individuals to manage data on their own terms, Jacobovitz provides a list of companies, projects, and movements that are using blockchain for identity management.

Mainelli, Michael. “Blockchain Will Help Us Prove Our Identities in a Digital World.” Harvard Business Review. March 16, 2017.

In this Harvard Business Review article, author Michael Mainelli highlights a solution to identity problems for rich and poor alike: mutual distributed ledgers (MDLs), or blockchain technology. These multi-organizational databases with unalterable ledgers and a “super audit trail” involve three parties in digital document exchanges: subjects (individuals or assets), certifiers (organizations that verify identity), and inquisitors (entities that conduct know-your-customer (KYC) checks on the subject). This system would allow for a low-cost, secure, and global method of proving identity. After outlining other benefits this technology may have in creating secure and easily auditable digital documents, such as the greater tolerance that comes from widely viewable public ledgers, Mainelli asks whether these capabilities will turn out to be a boon or a burden to bureaucracy and societal behavior.
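
Mainelli’s three-party model lends itself to a small sketch. The code below is a loose illustration of the subject/certifier/inquisitor roles, not anything drawn from the article: it uses an HMAC as a stand-in for the certifier’s digital signature, and the names `certify` and `kyc_check` are invented here. In a real MDL, the attestation would also be recorded on the shared ledger rather than passed around directly.

```python
import hashlib
import hmac
import json

# Stand-in for the certifier's private signing key (illustrative only;
# a real system would use public-key signatures, not a shared secret).
CERTIFIER_KEY = b"registry-signing-key"

def certify(claim):
    # A certifier (e.g. a passport office) attests to a subject's claim.
    payload = json.dumps(claim, sort_keys=True).encode()
    return hmac.new(CERTIFIER_KEY, payload, hashlib.sha256).hexdigest()

def kyc_check(claim, attestation):
    # An inquisitor verifies the attestation without re-collecting
    # the subject's underlying documents.
    return hmac.compare_digest(certify(claim), attestation)

claim = {"subject": "Alice", "attribute": "over_18", "value": True}
attestation = certify(claim)

assert kyc_check(claim, attestation)                          # valid claim passes
assert not kyc_check({**claim, "value": False}, attestation)  # altered claim fails
```

The point of the sketch is the division of labor: the inquisitor never contacts the certifier or sees the source documents, only the claim and its attestation, which is what makes repeated KYC checks cheap.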

Personal Data Security Applications

Banafa, Ahmed. “How to Secure the Internet of Things (IoT) with Blockchain.” Datafloq. August 15, 2016.

This article details the data security risks emerging as the Internet of Things continues to expand, and how blockchain technology can protect the personal data and identity information exchanged between devices. Banafa argues that, because the creation and collection of data is central to the functions of Internet of Things devices, there is an increasing need to better secure data that is largely confidential and often personally identifiable. Decentralizing IoT networks, then securing their communications with blockchain, can allow them to remain scalable, private, and reliable. Enabling blockchain’s peer-to-peer, trustless communication may also enable smart devices to initiate personal data exchanges like financial transactions, as centralized authorities or intermediaries will not be necessary.

Shrier, David, Weige Wu and Alex Pentland. “Blockchain & Infrastructure (Identity, Data Security).” Massachusetts Institute of Technology. May 17, 2016.

This paper, the third in a four-part series on potential blockchain applications, covers the potential of blockchains to change the status quo in identity authentication systems, privacy protection, transaction monitoring, ownership rights, and data security. The paper also posits that, as personal data becomes more and more valuable, we should move towards a “New Deal on Data” that provides individuals with data protection through blockchain technology, along with the option to contribute their data to aggregates that work towards the common good. To achieve this New Deal on Data, robust regulatory standards and financial incentives must be provided to entice individuals to share their data in ways that benefit society.

Selected Readings on Blockchain Technology and Its Potential for Transforming Governance


By Prianka Srinivasan, Robert Montano, Andrew Young, and Stefaan G. Verhulst

The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of blockchain and governance was originally published in 2017.

Introduction

In 2008, an unknown author writing under the name Satoshi Nakamoto released a paper, Bitcoin: A Peer-to-Peer Electronic Cash System, which introduced blockchain technology. Blockchain is a novel system that uses a distributed ledger to record transactions and verify them without a central intermediary. Blockchain technology relies on its ability to act as a vast, transparent, and secure public database.

It has since gained recognition as a tool to transform governance by creating a decentralized system to

  • manage and protect identity;
  • trace and track; and
  • incentivize smarter social and business contracts.

These applications cast blockchain as a tool to confront certain public problems in the digital age.

The readings below represent selected works on blockchain’s applications for governance. They have been categorized by theme: Governance Applications, Identity Protection and Management, Tracing and Tracking, and Smart Contracts.

Selected Reading List

Governance Applications

  • Atzori, Marcella – The Center for Blockchain Technologies (2015) Blockchain Technology and Decentralized Governance: Is the State Still Necessary? – Aims to investigate the political applications of blockchain, particularly in encouraging government decentralization, by considering to what extent blockchains can be viewed as “hyper-political tools.” The paper suggests that the domination of private bodies in blockchain systems highlights the continued need for the State to remain a central point of coordination.
  • Boucher, Philip. – European Parliamentary Research Service (2017) How blockchain technology could change our lives – This report, commissioned by the European Parliamentary Research Service, provides a thorough introduction to blockchain theory and its applications to society and political systems, with two-page briefings on currencies, digital content, patents, e-voting, smart contracts, supply chains, and blockchain states.
  • Boucher, Philip. – Euroscientist (2017) Are Blockchain Applications Guided by Social Values? – This report, by a policy analyst at the European Parliament’s Scientific Foresight Unit, evaluates the social and moral contours of blockchain technology, arguing that “all technologies have value and politics,” and that blockchain is no exception. It calls for greater scrutiny of the possibility that blockchain can act as a truly distributed and transparent system without a “middleman.”
  • Cheng, Steve; Daub, Matthew; Domeyer, Axel; and Lundqvist, Martin – McKinsey & Company (2017) Using Blockchain to Improve Data Management in the Public Sector – This essay considers the potential uses of blockchain technology in the public sector, both to improve the security of sensitive information collected by governments and to simplify communication with specialists.
  • De Filippi, Primavera; and Wright, Aaron – Paris University & Cardozo School of Law (2015) Decentralized Blockchain Technology and the Rise of Lex Cryptographia – Looks at how to regulate blockchain technology, particularly given its implications for governance and society. Argues that a new legal framework needs to emerge to take into account the applications of self-executing blockchain technology.
  • Liebenau, Jonathan and Elaluf-Calderwood, Silvia Monica. – London School of Economics & Florida International University (2016) Blockchain Innovation Beyond Bitcoin and Banking. A paper that explores the potential of blockchain technology in financial services and in broader digital applications, considers regulatory possibility and frameworks, and highlights the innovative potential of blockchain.
  • Prpić, John – Lulea University of Technology (2017) Unpacking Blockchains – This short paper provides a brief introduction to uses of blockchain beyond monetary purposes, breaking down its function as a digital ledger and transaction platform.
  • Stark, Josh – Ledger Labs (2016) Making Sense of Blockchain Governance Applications – This CoinDesk article discusses, in simple terms, how blockchain technology can be used to accomplish what are called “the three basic functions of governance.”
  • UK Government Chief Scientific Adviser (2016) Distributed Ledger Technology: Beyond Blockchain – A report from the UK Government that investigates the use of blockchain’s “distributed ledger” as a database for governments and other institutions to adopt.

Identity Protection and Management

  • Baars, D.S. – University of Twente (2016) Towards Self-Sovereign Identity Using Blockchain Technology. A study exploring self-sovereign identity, i.e. users’ ability to control their own digital identity, that led to the creation of a new architecture designed for users to manage their digital ID. Called the Decentralized Identity Management System, it is built on blockchain technology and based on the concept of claim-based identity.
  • Burger, Eric and Sullivan, Clare Linda. – Georgetown University (2016) E-Residency and Blockchain. A case study focused on an Estonian commercial initiative that allows for citizens of any nation to become an “Estonian E-Resident.” This paper explores the legal, policy, and technical implications of the program and considers its impact on the way identity information is controlled and authenticated.
  • Nathan, Oz; Pentland, Alex ‘Sandy’; and Zyskind, Guy – Security and Privacy Workshops (2015) Decentralizing Privacy: Using Blockchain to Protect Personal Data Describes the potential of blockchain technology to create a decentralized personal data management system, making third-party personal data collection redundant.
  • De Filippi, Primavera – Paris University (2016) The Interplay Between Decentralization and Privacy: The Case of Blockchain Technologies – A journal article that weighs the radical transparency of blockchain technology against privacy concerns for its users, finding that the apparent dichotomy is not as stark as it may first appear.

Tracing and Tracking

  • Barnes, Andrew; Brake, Christopher; and Perry, Thomas – Plymouth University (2016) Digital Voting with the use of Blockchain Technology – A report investigating the potential of blockchain technology to overcome issues surrounding digital voting, including voter fraud, data security, and defense against cyberattacks. Proposes a blockchain voting system that can safely and robustly manage these challenges.
  • The Economist (2015) “Blockchains: The Great Chain of Being Sure About Things.” An article exploring the potential usefulness of blockchain-based land registries in places like Honduras and Greece, transaction registries for trading stock, and the creation of smart contracts.
  • Lin, Wendy; McDonnell, Colin; and Yuan, Ben – Massachusetts Institute of Technology (2015) Blockchains and electronic health records – Suggests that the “durable transaction ledger” fundamental to blockchain has wide applicability in electronic medical record management, and evaluates some of the practical shortcomings of implementing such a system across the US health industry.

Smart Contracts

  • Iansiti, Marco; and Lakhani, Karim R. – Harvard Business Review (2017) The Truth about Blockchain – A Harvard Business Review article exploring how blockchain technology can create secure and transparent digital contracts, and what effect this may have on the economy and businesses.
  • Levy, Karen E.C. – Engaging Science, Technology, and Society (2017) Book-Smart, Not Street-Smart: Blockchain-Based Smart Contracts and The Social Workings of Law. Article exploring the concept of blockchain-based “smart contracts” – contracts that securely automate and execute obligations without a centralized authority – and discusses the tension between law, social norms, and contracts with an eye toward social equality and fairness.

Annotated Selected Reading List

Cheng, Steve, Matthias Daub, Axel Domeyer, and Martin Lundqvist. “Using blockchain to improve data management in the public sector.” McKinsey & Company. Web. 03 Apr. 2017. http://bit.ly/2nWgomw

  • An essay arguing that blockchain is useful beyond financial institutions, particularly for government agencies that store sensitive information such as birth and death dates or information about marital status, business licensing, property transfers, and criminal activity.
  • Blockchain technology would maintain the security of such sensitive information while also making it easier for agencies to use and access critical public-sector information.
  • Despite its potential, a significant drawback for use by government agencies is the speed with which blockchain has developed – there are no accepted standards for blockchain technologies or the networks that operate them; and because many providers are start-ups, agencies might struggle to find partners that will have lasting power. Additionally, government agencies will have to remain vigilant to ensure the security of data.
  • Although best practices will take some time to develop, this piece argues that the time is now for experimentation – and that governments would be wise to include blockchain in their strategies to learn what methods work best and uncover how to best unlock the potential of blockchain.

“The Great Chain of Being Sure About Things.” The Economist. The Economist Newspaper, 31 Oct. 2015. Web. 03 Apr. 2017. http://econ.st/1M3kLnr

  • This exploratory article from The Economist examines the various potential uses of blockchain technology beyond its initial focus on bitcoin:
    • It highlights the potential of blockchain-based land registries as a way to curb human rights abuses and insecurity in much of the world (it specifically cites examples in Greece and Honduras);
    • It also highlights the relative security of blockchain while noting its openness;
    • It serves as a primer on how blockchain functions, written for the non-specialist;
    • It discusses “smart contracts” (about which we have linked more research above);
    • It analyzes potential risks;
    • And it considers the potential future unlocked by blockchain.
  • The article is a particularly useful introduction for researchers who may not have detailed knowledge of the technology.

Iansiti, Marco and Lakhani, Karim R. “The Truth About Blockchain.” Harvard Business Review. N.p., 17 Feb. 2017. Web. 06 Apr. 2017. http://bit.ly/2hqo3FU

  • This Harvard Business Review article discusses blockchain’s ability to close the gap between emerging technological progress and the outdated ways in which bureaucracies handle and record contracts and transactions.
  • Blockchain, the authors argue, allows us to imagine a world in which “contracts are embedded in digital code and stored in transparent, shared databases, where they are protected from deletion, tampering, and revision”, allowing for the removal of intermediaries and facilitating direct interactions between individuals and institutions.
  • The authors compare the emergence of blockchain to other technologies that have had transformative power, such as TCP/IP, and consider the speed with which they have proliferated and become mainstream.
    • They argue that, like TCP/IP, blockchain is likely decades away from maximizing its potential, and they offer frameworks for the adoption of the technology spanning single-use, localization, substitution, and transformation.
    • Using these frameworks and comparisons, the authors present an investment strategy for those interested in blockchain.

IBM Global Business Services Public Sector Team. “Blockchain: The Chain of Trust and its Potential to Transform Healthcare – Our Point of View.” IBM. 2016. http://bit.ly/2oBJDLw

  • This enthusiastic business report from IBM suggests that the healthcare industry can adopt blockchain technology to “solve” challenges healthcare professionals face, primarily through blockchain’s ability to streamline transactions by establishing trust, accountability, and transparency.
  • Structured around so-called “pain points” in the healthcare industry and how blockchain can confront them, the paper looks at three concepts and their application in healthcare:
    • Bit-string cryptography: Improves privacy and security in healthcare by supporting data encryption and enforcing complex data-permission systems. This allows healthcare professionals to share data without risking patients’ privacy. It also streamlines data management systems, saving money and improving efficiency.
    • Transaction validity: Promotes the use of electronic prescriptions by allowing transactional trust and authenticated data exchange. Abuse is reduced, and abusers are more easily identified.
    • Smart contracts: Streamlines procurement and contracting in healthcare by reducing intermediaries, creating a more efficient and transparent healthcare system.
  • The paper goes on to signal the limitations of blockchain in certain use cases (particularly low-value, high-volume transactions) but highlights three use cases where blockchain can help address a business problem in the healthcare industry.
  • It is important to keep in mind that, since this paper approaches blockchain through the lens of IBM’s business investments, the problems are framed as business and transactional problems, and blockchain is presented primarily as improving efficiency rather than supporting patient outcomes.

Nathan, Oz; Pentland, Alex “Sandy”; and Zyskind, Guy. “Decentralizing Privacy: Using Blockchain to Protect Personal Data.” Security and Privacy Workshops (SPW), 2015. http://bit.ly/2nPo4r6

  • This technical paper suggests that anonymization and centralized systems can never provide complete security for personal data, and only blockchain technology, by creating a decentralized data management system, can overcome these privacy issues.
  • The authors identify three common privacy concerns that blockchain technology can address:
    • Data ownership: users want to own and control their personal data, and data management systems must acknowledge this.
    • Data transparency and auditability: users want to know what data is being collected and for what purpose.
    • Fine-grained access control: users want to be able to easily update and adapt their permission settings to control how and when third-party organizations access their data.
  • The authors propose their own system, designed for mobile phones, which integrates blockchain technology to store data in a reliable way. The system uses blockchain to store data, verifies users through a digital signature when they want to access data, and provides a user interface individuals can use to view their personal data.
  • Though much of the body of this paper consists of technical details on the setup of this blockchain data management system, it provides a strong case for how blockchain technology can be practically implemented to assuage privacy concerns among the public. The authors highlight that by using blockchain, “laws and regulations could be programmed into the blockchain itself, so that they are enforced automatically.” They ultimately conclude that using blockchain in a data protection system such as the one they propose is easier, safer, and more accountable.

Wright, Aaron, and Primavera De Filippi. “Decentralized blockchain technology and the rise of lex cryptographia.” 2015. Available at SSRN http://bit.ly/2oujvoG

  • This paper proposes that the emergence of blockchain technology, and its various applications (decentralized currencies, self-executing contracts, smart property etc.), will necessitate the creation of a new subset of laws, termed by the authors as “Lex Cryptographia.”
  • Given blockchain’s ability to “cut out the middleman,” the technology poses concrete challenges to law enforcement. These challenges stem from the very benefits of blockchain; for instance, the authors posit that a decentralized, autonomous blockchain system could act much like “a biological virus or an uncontrollable force of nature” if it were ill-intentioned. Though such systems can curb the corruption and hierarchy associated with traditional, centralized systems, their autonomy poses an obvious obstacle for law enforcement.
  • The paper goes on to detail the possible benefits and societal impacts of various applications of blockchain, finally suggesting a need to “rethink” traditional models of regulating society and individuals. The authors predict a rise in Lex Cryptographia “characterized by a set of rules administered through self-executing smart contracts and decentralized (and potentially autonomous) organizations.” Much of this regulation would need to supervise restrictions placed upon blockchain technology that may chill its application; for instance, corporations might purposely exclude blockchain-based applications from their search engines to stymie the adoption of the technology.

Selected Readings on Algorithmic Scrutiny


By Prianka Srinivasan, Andrew Young and Stefaan Verhulst

The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of algorithmic scrutiny was originally published in 2017.

Introduction

From government policy to criminal justice to our news feeds to business and consumer practices, the processes that shape our lives both online and off are increasingly driven by data and the complex algorithms used to form rulings or predictions. In most cases, these algorithms have created “black boxes” of decision making, where models remain inscrutable and inaccessible. It should therefore come as no surprise that several observers and policymakers are calling for more scrutiny of how algorithms are designed and work, particularly when their outcomes convey intrinsic biases or defy existing ethical standards.

While the concern about values in technology design is not new, recent developments in machine learning, artificial intelligence and the Internet of Things have increased the urgency to establish processes and develop tools to scrutinize algorithms.

In what follows, we have curated several readings covering the impact of algorithms on:

  • Information Intermediaries;
  • Governance;
  • Consumer Finance; and
  • Justice.

In addition, we have selected a few readings that provide insight into possible processes and tools to establish algorithmic scrutiny.

Selected Reading List

Information Intermediaries

Governance

Consumer Finance

Justice

Tools & Processes Toward Algorithmic Scrutiny

Annotated Selected Reading List

Information Intermediaries

Diakopoulos, Nicholas. “Algorithmic accountability: Journalistic investigation of computational power structures.” Digital Journalism 3.3 (2015): 398-415. http://bit.ly/.

  • This paper attempts to substantiate the notion of accountability for algorithms, particularly how they relate to media and journalism. It puts forward the notion of “algorithmic power,” analyzing the framework of influence such systems exert, and also introduces some of the challenges in the practice of algorithmic accountability, particularly for computational journalists.
  • Offers a basis for analyzing algorithms, built around the types of decisions algorithms make in prioritizing, classifying, associating, and filtering information.

Diakopoulos, Nicholas, and Michael Koliska. “Algorithmic transparency in the news media.” Digital Journalism (2016): 1-20. http://bit.ly/2hMvXdE.

  • This paper analyzes the increased use of “computational journalism,” and argues that though transparency remains a key tenet of journalism, the use of algorithms in gathering, producing and disseminating news undermines this principle.
  • It first analyzes what the ethical principle of transparency means to journalists and the media. It then highlights the findings from a focus-group study, where 50 participants from the news media and academia were invited to discuss three different case studies related to the use of algorithms in journalism.
  • They find two key barriers to algorithmic transparency in the media: “(1) a lack of business incentives for disclosure, and (2) the concern of overwhelming end-users with too much information.”
  • The study also finds a variety of opportunities for transparency across the “data, model, inference, and interface” components of an algorithmic system.

Napoli, Philip M. “The algorithm as institution: Toward a theoretical framework for automated media production and consumption.” Fordham University Schools of Business Research Paper (2013). http://bit.ly/2hKBHqo

  • This paper puts forward an analytical framework to discuss the algorithmic content creation of media and journalism in an attempt to “close the gap” on theory related to automated media production.
  • Borrowing concepts from institutional theory, the paper finds that algorithms are distinct forms of media institutions and explores the cultural and political implications of this interpretation.
  • It urges further study in the field of “media sociology” to further unpack the influence of algorithms, and their role in institutionalizing certain norms, cultures and ways of thinking.

Introna, Lucas D., and Helen Nissenbaum. “Shaping the Web: Why the politics of search engines matters.” The Information Society 16.3 (2000): 169-185. http://bit.ly/2ijzsrg.

  • This paper, published 16 years ago, provides an in-depth account of some of the risks related to search engine optimizations, and the biases and harms these can introduce, particularly on the nature of politics.
  • Suggests search engines can be designed to account for these political dimensions and better align with the ideal of the World Wide Web as an open, accessible, and democratic space.
  • According to the paper, policy (and not the free market) is the only way to spur change in this field, though existing technical solutions introduce further challenges.

Gillespie, Tarleton. “The Relevance of Algorithms.” Media Technologies: Essays on Communication, Materiality, and Society (2014): 167. http://bit.ly/2h6ASEu.

  • This paper suggests that algorithms that cut across many aspects of our lives (Gillespie calls these “public relevance algorithms”) are fundamentally “producing and certifying knowledge.” In this ability to create a particular “knowledge logic,” algorithms are a primary feature of our information ecosystem.
  • The paper goes on to map six dimensions of these public relevance algorithms:
    • Patterns of inclusion
    • Cycles of anticipation
    • The evaluation of relevance
    • The promise of algorithmic objectivity
    • Entanglement with practice
    • The production of calculated publics
  • The paper concludes by highlighting the need for a sociological inquiry into the function, implications and contexts of algorithms, and to “soberly recognize their flaws and fragilities,” despite the fact that much of their inner workings remain hidden.

Rainie, Lee and Janna Anderson. “Code-Dependent: Pros and Cons of the Algorithm Age.” Pew Research Center. February 8, 2017. http://bit.ly/2kwnvCo.

  • This Pew Research Center report examines the benefits and negative impacts of algorithms as they become more influential in different sectors and aspects of daily life.
  • Through a scan of the research and practice, with a particular focus on the research of experts in the field, Rainie and Anderson identify seven key themes of the burgeoning Algorithm Age:
    • Algorithms will continue to spread everywhere
    • Good things lie ahead
    • Humanity and human judgment are lost when data and predictive modeling become paramount
    • Biases exist in algorithmically-organized systems
    • Algorithmic categorizations deepen divides
    • Unemployment will rise; and
    • The need grows for algorithmic literacy, transparency and oversight

Tufekci, Zeynep. “Algorithmic harms beyond Facebook and Google: Emergent challenges of computational agency.” Journal on Telecommunications & High Technology Law 13 (2015): 203. http://bit.ly/1JdvCGo.

  • This paper establishes some of the risks and harms in regard to algorithmic computation, particularly in their filtering abilities as seen in Facebook and other social media algorithms.
  • Suggests that the editorial decisions performed by algorithms can have significant influence on our political and cultural realms, and categorizes the types of harms that algorithms may have on individuals and their society.
  • Takes two case studies–one from the social media coverage of the Ferguson protests, the other on how social media can influence election turnouts–to analyze the influence of algorithms. In doing so, this paper lays out the “tip of the iceberg” in terms of some of the challenges and ethical concerns introduced by algorithmic computing.

Mittelstadt, Brent, Patrick Allo, Mariarosaria Taddeo, Sandra Wachter, and Luciano Floridi. “The Ethics of Algorithms: Mapping the Debate.” Big Data & Society (2016): 3(2). http://bit.ly/2kWNwL6

  • This paper provides significant background and analysis of the ethical context of algorithmic decision-making. It primarily seeks to map the ethical consequences of algorithms, which have adopted the role of a mediator between data and action within societies.
  • Develops a conceptual map of six ethical concerns:
    • Inconclusive Evidence
    • Inscrutable Evidence
    • Misguided Evidence
    • Unfair Outcomes
    • Transformative Effects
    • Traceability
  • The paper then reviews existing literature, which together with the map creates a structure to inform future debate.

Governance

Janssen, Marijn, and George Kuk. “The challenges and limits of big data algorithms in technocratic governance.” Government Information Quarterly 33.3 (2016): 371-377. http://bit.ly/2hMq4z6.

  • Given the centrality of algorithms in enforcing policy and extending governance, this paper analyzes the “technocratic governance” that has emerged from the removal of humans from decision-making processes and the inclusion of algorithmic automation.
  • The paper argues that the belief that technocratic governance produces neutral and unbiased results, since its decision-making processes are supposedly uninfluenced by human thought, is at odds with studies that reveal the inherent discriminatory practices within algorithms.
  • Suggests that algorithms are still bound by the biases of designers and policy-makers, and that accountability is needed to improve the functioning of an algorithm. In order to do so, we must acknowledge the “intersecting dynamics of algorithm as a sociotechnical materiality system involving technologies, data and people using code to shape opinion and make certain actions more likely than others.”

Just, Natascha, and Michael Latzer. “Governance by algorithms: reality construction by algorithmic selection on the Internet.” Media, Culture & Society (2016): 0163443716643157. http://bit.ly/2h6B1Yv.

  • This paper provides a conceptual framework on how to assess the governance potential of algorithms, asking how technology and software governs individuals and societies.
  • By understanding algorithms as institutions, the paper suggests that algorithmic governance puts in place more evidence-based and data-driven systems than traditional governance methods. The result is a form of governance that cares more about effects than causes.
  • The paper concludes by suggesting that algorithmic selection on the Internet tends to shape individuals’ realities and social orders by “increasing individualization, commercialization, inequalities, deterritorialization, and decreasing transparency, controllability, predictability.”

Consumer Finance

Hildebrandt, Mireille. “The dawn of a critical transparency right for the profiling era.” Digital Enlightenment Yearbook 2012 (2012): 41-56. http://bit.ly/2igJcGM.

  • Analyzes the use of consumer profiling by online businesses to target marketing and services to consumers’ needs. By establishing how this profiling relates to identification, the author also outlines the threats these profiling algorithms pose to democracy and the right of autonomy.
  • The paper concludes by suggesting that cross-disciplinary transparency is necessary to design more accountable profiling techniques that can match the extension of “smart environments” that capture ever more data and information from users.

Reddix-Smalls, Brenda. “Credit Scoring and Trade Secrecy: An Algorithmic Quagmire or How the Lack of Transparency in Complex Financial Models Scuttled the Finance Market.” UC Davis Business Law Journal 12 (2011): 87. http://bit.ly/2he52ch

  • Analyzes the creation of predictive risk models in financial markets through algorithmic systems, particularly in regard to credit scoring. It suggests that these models were corrupted in order to maintain a competitive market advantage: “The lack of transparency and the legal environment led to the use of these risk models as predatory credit pricing instruments as opposed to accurate credit scoring predictive instruments.”
  • The paper suggests that without greater transparency of these financial risk models, and greater regulation of their abuse, another financial crisis like that of 2008 is highly likely.

Justice

Aas, Katja Franko. “Sentencing Transparency in the Information Age.” Journal of Scandinavian Studies in Criminology and Crime Prevention 5.1 (2004): 48-61. http://bit.ly/2igGssK.

  • This paper questions the use of predetermined sentencing in the US judicial system through the application of computer technology and sentencing information systems (SIS). By assessing the use of these systems in the English-speaking world and in Norway, the author suggests that such technological approaches to sentencing attempt to overcome accusations of mistrust, uncertainty and arbitrariness often leveled against the judicial system.
  • However, in their attempt to rebuild trust, such technological solutions can be seen as an attempt to remedy a flawed view of judges by the public. Therefore, the political and social climate must be taken into account when trying to reform these sentencing systems: “The use of the various sentencing technologies is not only, and not primarily, a matter of technological development. It is a matter of a political and cultural climate and the relations of trust in a society.”

Cui, Gregory. “Evidence-Based Sentencing and the Taint of Dangerousness.” Yale Law Journal Forum 125 (2016): 315. http://bit.ly/1XLAvhL.

  • This short essay, published in the Yale Law Journal Forum, calls for greater scrutiny of “evidence-based sentencing,” in which past data is computed and used to predict a defendant’s future criminal behavior. The author suggests that these risk models may undermine the Constitution’s prohibition of bills of attainder and may unlawfully inflict punishment without a judicial trial.

Tools & Processes Toward Algorithmic Scrutiny

Ananny, Mike and Crawford, Kate. “Seeing without knowing: Limitations of the transparency ideal and its application to algorithmic accountability.” New Media & Society. SAGE Publications. 2016. http://bit.ly/2hvKc5x.

  • This paper critically analyzes calls to improve the transparency of algorithms, asking how we can confront the historical limitations of the transparency ideal in computing.
  • By establishing “transparency as an ideal” the paper tracks the philosophical and historical lineage of this principle, attempting to establish what laws and provisions were put in place across the world to keep up with and enforce this ideal.
  • The paper goes on to detail the limits of transparency as an ideal, arguing, amongst other things, that it does not necessarily build trust, it privileges a certain function (seeing) over others (say, understanding) and that it has numerous technical limitations.
  • The paper ends by concluding that transparency is an inadequate way to govern algorithmic systems, and that accountability must acknowledge the ability to govern across systems.

Datta, Anupam, Shayak Sen, and Yair Zick. “Algorithmic Transparency via Quantitative Input Influence.” Proceedings of the 37th IEEE Symposium on Security and Privacy. 2016. http://bit.ly/2hgyLTp.

  • This paper develops a family of Quantitative Input Influence (QII) measures “that capture the degree of influence of inputs on outputs of systems.” The aim is a transparency report that accompanies any algorithmic decision, in order to explain the decision and detect algorithmic discrimination.
  • QII works by breaking “correlations between inputs to allow causal reasoning, and computes the marginal influence of inputs in situations where inputs cannot affect outcomes alone.”
  • Finds that these QII measures are useful in scrutinizing algorithms when “black box” access is available.
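The intervention-based intuition behind such influence measures can be loosely illustrated as follows; this is a simplified sketch, not the QII measures defined in the paper, and the model and data here are hypothetical. We estimate an input’s influence by resampling it from its marginal distribution (breaking its correlation with the other inputs) and measuring how often the black-box output flips:

```python
import random

def marginal_influence(model, rows, feature_idx, trials=1000, seed=0):
    """Estimate how often randomizing one input flips the model's output.

    Resampling the chosen feature from its marginal distribution mimics
    the causal 'intervention' at the heart of QII-style measures.
    """
    rng = random.Random(seed)
    pool = [row[feature_idx] for row in rows]  # marginal distribution of the feature
    flips = 0
    for _ in range(trials):
        row = list(rng.choice(rows))
        original = model(row)
        row[feature_idx] = rng.choice(pool)    # intervene on this one input
        if model(row) != original:
            flips += 1
    return flips / trials

# Toy black box: approves whenever income > 50, ignoring zip code.
model = lambda r: r[0] > 50
data = [(30, 101), (60, 102), (80, 101), (40, 103)]  # (income, zip code)

print(marginal_influence(model, data, 0))  # income: nonzero influence
print(marginal_influence(model, data, 1))  # zip code: zero influence
```

As the paper notes, this kind of probing only requires black-box access to the decision system, not its source code.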

Goodman, Bryce, and Seth Flaxman. “European Union regulations on algorithmic decision-making and a ‘right to explanation.’” arXiv preprint arXiv:1606.08813 (2016). http://bit.ly/2h6xpWi.

  • This paper analyzes the implications of a new EU law, to be enacted in 2018, that will “restrict automated individual decision-making (that is, algorithms that make decisions based on user-level predictors) which ‘significantly affect’ users.” The law will also provide a “right to explanation,” whereby users can ask for an explanation of automated decisions made about them.
  • The paper, while acknowledging the challenges in implementing such laws, suggests that such regulations can spur computer scientists to create algorithms and decision making systems that are more accountable, can provide explanations, and do not produce discriminatory results.
  • The paper concludes by stating algorithms and computer systems should not aim to be simply efficient, but also fair and accountable. It is optimistic about the ability to put in place interventions to account for and correct discrimination.

Kizilcec, René F. “How Much Information?: Effects of Transparency on Trust in an Algorithmic Interface.” Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems. ACM, 2016. http://bit.ly/2hMjFUR.

  • This paper studies how the transparency of algorithms affects our impression of trust by conducting an online field experiment in which participants enrolled in a MOOC were given different explanations for the computer-generated grade assigned in their class.
  • The study found that “Individuals whose expectations were violated (by receiving a lower grade than expected) trusted the system less, unless the grading algorithm was made more transparent through explanation. However, providing too much information eroded this trust.”
  • In conclusion, the study found that a balance of transparency was needed to maintain trust amongst the participants, suggesting that pure transparency of algorithmic processes and results may not correlate with high feelings of trust amongst users.

Kroll, Joshua A., et al. “Accountable Algorithms.” University of Pennsylvania Law Review 165 (2016). http://bit.ly/2i6ipcO.

  • This paper suggests that policy and legal standards need to be updated given the increased use of algorithms to perform tasks and make decisions that people once did. An “accountability mechanism” is lacking in many of these automated decision-making processes.
  • The paper argues that mere transparency through the disclosure of source code is inadequate when confronting questions of accountability. Rather, technology itself provides a key to creating algorithms and decision-making apparatuses more in line with our existing political and legal frameworks.
  • The paper assesses some computational techniques that may make it possible to create accountable software and reform specific cases of automated decision-making. For example, diversity and anti-discrimination orders can be built into technology to ensure fidelity to policy choices.

Selected Readings on Data Collaboratives


By Neil Britto, David Sangokoya, Iryna Susha, Stefaan Verhulst and Andrew Young

The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of data collaboratives was originally published in 2017.

The term data collaborative refers to a new form of collaboration, beyond the public-private partnership model, in which participants from different sectors (including private companies, research institutions, and government agencies) can exchange data to help solve public problems. Several of society’s greatest challenges — from addressing climate change to public health to job creation to improving the lives of children — require greater access to data, more collaboration between public- and private-sector entities, and an increased ability to analyze datasets. In the coming months and years, data collaboratives will be essential vehicles for harnessing the vast stores of privately held data toward the public good.

Selected Reading List (in alphabetical order)

Annotated Selected Readings List (in alphabetical order)

Agaba, G., Akindès, F., Bengtsson, L., Cowls, J., Ganesh, M., Hoffman, N., . . . Meissner, F. “Big Data and Positive Social Change in the Developing World: A White Paper for Practitioners and Researchers.” 2014. http://bit.ly/25RRC6N.

  • This white paper, produced by “a group of activists, researchers and data experts,” explores the potential of big data to improve development outcomes and spur positive social change in low- and middle-income countries. Using examples, the authors discuss four areas in which the use of big data can impact development efforts:
    • Advocating and facilitating by “open[ing] up new public spaces for discussion and awareness building”;
    • Describing and predicting through the detection of “new correlations and the surfac[ing] of new questions”;
    • Facilitating information exchange through “multiple feedback loops which feed into both research and action”; and
    • Promoting accountability and transparency, especially as a byproduct of crowdsourcing efforts aimed at “aggregat[ing] and analyz[ing] information in real time.”
  • The authors argue that in order to maximize the potential of big data’s use in development, “there is a case to be made for building a data commons for private/public data, and for setting up new and more appropriate ethical guidelines.”
  • They also identify a number of challenges, especially when leveraging data made accessible from a number of sources, including private sector entities, such as:
    • Lack of general data literacy;
    • Lack of open learning environments and repositories;
    • Lack of resources, capacity and access;
    • Challenges of sensitivity and risk perception with regard to using data;
    • Storage and computing capacity; and
    • Externally validating data sources for comparison and verification.

Ansell, C. and Gash, A. “Collaborative Governance in Theory and Practice.” Journal of Public Administration Research and Theory 18 (4), 2008. http://bit.ly/1RZgsI5.

  • This article describes collaborative arrangements that include public and private organizations working together and proposes a model for understanding an emergent form of public-private interaction informed by 137 diverse cases of collaborative governance.
  • The article suggests factors significant to successful partnering processes and outcomes include:
    • Shared understanding of challenges,
    • Trust building processes,
    • The importance of recognizing seemingly modest progress, and
    • Strong indicators of commitment to the partnership’s aspirations and process.
  • The authors provide a “contingency theory model” that specifies relationships between different variables that influence outcomes of collaborative governance initiatives. Three “core contingencies” for successful collaborative governance initiatives identified by the authors are:
    • Time (e.g., decision making time afforded to the collaboration);
    • Interdependence (e.g., a high degree of interdependence can mitigate negative effects of low trust); and
    • Trust (e.g. a higher level of trust indicates a higher probability of success).

Ballivian A, Hoffman W. “Public-Private Partnerships for Data: Issues Paper for Data Revolution Consultation.” World Bank, 2015. Available from: http://bit.ly/1ENvmRJ

  • This World Bank report provides a background document on forming public-private partnerships for data with the private sector, in order to inform the UN’s Independent Expert Advisory Group (IEAG) on sustaining a “data revolution” in sustainable development.
  • The report highlights the critical position of private companies within the data value chain and reflects on key elements of a sustainable data PPP: “common objectives across all impacted stakeholders, alignment of incentives, and sharing of risks.” In addition, the report describes the risks and incentives of public and private actors, and the principles needed to “build[ing] the legal, cultural, technological and economic infrastructures to enable the balancing of competing interests.” These principles include understanding; experimentation; adaptability; balance; persuasion and compulsion; risk management; and governance.
  • Examples of data collaboratives cited in the report include HP Earth Insights, Orange Data for Development Challenges, Amazon Web Services, IBM Smart Cities Initiative, and the Governance Lab’s Open Data 500.

Brack, Matthew, and Tito Castillo. “Data Sharing for Public Health: Key Lessons from Other Sectors.” Chatham House, Centre on Global Health Security. April 2015. Available from: http://bit.ly/1DHFGVl

  • The Chatham House report provides an overview on public health surveillance data sharing, highlighting the benefits and challenges of shared health data and the complexity in adapting technical solutions from other sectors for public health.
  • The report describes data sharing processes from several perspectives, including in-depth case studies of actual data sharing in practice at the individual, organizational and sector levels. Among the key lessons for public health data sharing, the report strongly highlights the need to harness momentum for action and maintain collaborative engagement: “Successful data sharing communities are highly collaborative. Collaboration holds the key to producing and abiding by community standards, and building and maintaining productive networks, and is by definition the essence of data sharing itself. Time should be invested in establishing and sustaining collaboration with all stakeholders concerned with public health surveillance data sharing.”
  • Examples of data collaboratives include H3Africa (a collaboration between NIH and Wellcome Trust) and NHS England’s care.data programme.

de Montjoye, Yves-Alexandre, Jake Kendall, and Cameron F. Kerry. “Enabling Humanitarian Use of Mobile Phone Data.” The Brookings Institution, Issues in Technology Innovation. November 2014. Available from: http://brook.gs/1JxVpxp

  • Using Ebola as a case study, the authors describe the value of using private telecom data for uncovering “valuable insights into understanding the spread of infectious diseases as well as strategies into micro-target outreach and driving uptake of health-seeking behavior.”
  • The authors highlight the absence of a common legal and standards framework for “sharing mobile phone data in privacy-conscientious ways” and recommend “engaging companies, NGOs, researchers, privacy experts, and governments to agree on a set of best practices for new privacy-conscientious metadata sharing models.”

Eckartz, Silja M., Hofman, Wout J., Van Veenstra, Anne Fleur. “A decision model for data sharing.” Vol. 8653 LNCS. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2014. http://bit.ly/21cGWfw.

  • This paper proposes a decision model for data sharing of public and private data based on literature review and three case studies in the logistics sector.
  • The authors identify five categories of the barriers to data sharing and offer a decision model for identifying potential interventions to overcome each barrier:
    • Ownership. Possible interventions likely require improving trust among those who own the data through, for example, involvement and support from higher management
    • Privacy. Interventions include “anonymization by filtering of sensitive information and aggregation of data,” and access control mechanisms built around identity management and regulated access.  
    • Economic. Interventions include a model where data is shared only with a few trusted organizations, and yield management mechanisms to ensure negative financial consequences are avoided.
    • Data quality. Interventions include identifying additional data sources that could improve the completeness of datasets, and efforts to improve metadata.
    • Technical. Interventions include making data available in structured formats and publishing data according to widely agreed upon data standards.
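
The five barrier categories and their paired interventions amount to a simple barrier-to-intervention lookup. The sketch below is illustrative only; the category keys and the `suggest_interventions` helper are shorthand for this summary, not code or terminology from the paper.

```python
# Illustrative only: the five barrier categories from Eckartz et al.,
# each mapped to the example interventions named in the summary above.
INTERVENTIONS = {
    "ownership": [
        "build trust via involvement and support from higher management",
    ],
    "privacy": [
        "anonymize by filtering sensitive information and aggregating data",
        "access control built on identity management and regulated access",
    ],
    "economic": [
        "share data only with a few trusted organizations",
        "use yield management to avoid negative financial consequences",
    ],
    "data quality": [
        "identify additional sources to improve dataset completeness",
        "improve metadata",
    ],
    "technical": [
        "make data available in structured formats",
        "publish data according to widely agreed-upon standards",
    ],
}

def suggest_interventions(barriers):
    """Return candidate interventions for the barriers found in a case study."""
    return {b: INTERVENTIONS[b] for b in barriers if b in INTERVENTIONS}
```

For example, a logistics case study that surfaced privacy and technical barriers would retrieve the two corresponding intervention lists.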

Hoffman, Sharona and Podgurski, Andy. “The Use and Misuse of Biomedical Data: Is Bigger Really Better?” American Journal of Law & Medicine 497, 2013. http://bit.ly/1syMS7J.

  • This journal article explores the benefits and, in particular, the risks related to large-scale biomedical databases bringing together health information from a diversity of sources across sectors. Some data collaboratives examined in the piece include:
    • MedMining – a company that extracts EHR data, de-identifies it, and offers it to researchers. The data sets that MedMining delivers to its customers include ‘lab results, vital signs, medications, procedures, diagnoses, lifestyle data, and detailed costs’ from inpatient and outpatient facilities.
    • Explorys has formed a large healthcare database derived from financial, administrative, and medical records. It has partnered with major healthcare organizations such as the Cleveland Clinic Foundation and Summa Health System to aggregate and standardize health information from ten million patients and over thirty billion clinical events.
  • Hoffman and Podgurski note that biomedical databases have many potential uses, with those likely to benefit including: “researchers, regulators, public health officials, commercial entities, lawyers,” as well as “healthcare providers who conduct quality assessment and improvement activities,” regulatory monitoring entities like the FDA, and “litigants in tort cases to develop evidence concerning causation and harm.”
  • They argue, however, that risks arise because:
    • The data contained in biomedical databases is surprisingly likely to be incorrect or incomplete;
    • Systemic biases, arising from both the nature of the data and the preconceptions of investigators, are serious threats to the validity of research results, especially in answering causal questions; and
    • Data mining of biomedical databases makes it easier for individuals with political, social, or economic agendas to generate ostensibly scientific but misleading research findings for the purpose of manipulating public opinion and swaying policymakers.

Krumholz, Harlan M., et al. “Sea Change in Open Science and Data Sharing Leadership by Industry.” Circulation: Cardiovascular Quality and Outcomes 7.4. 2014. 499-504. http://1.usa.gov/1J6q7KJ

  • This article provides a comprehensive overview of industry-led efforts and cross-sector collaborations in data sharing by pharmaceutical companies to inform clinical practice.
  • The article details the types of data being shared and the early activities of GlaxoSmithKline (“in coordination with other companies such as Roche and ViiV”); Medtronic and the Yale University Open Data Access Project; and Janssen Pharmaceuticals (Johnson & Johnson). The article also describes the range of involvement in data sharing among pharmaceutical companies including Pfizer, Novartis, Bayer, AbbVie, Eli Lilly, AstraZeneca, and Bristol-Myers Squibb.

Mann, Gideon. “Private Data and the Public Good.” Medium. May 17, 2016. http://bit.ly/1OgOY68.

  • This Medium post from Gideon Mann, the Head of Data Science at Bloomberg, shares his prepared remarks given at a lecture at the City College of New York. Mann argues for the potential benefits of increasing access to private sector data, both to improve research and academic inquiry and also to help solve practical, real-world problems. He also describes a number of initiatives underway at Bloomberg along these lines.
  • Mann argues that data generated at private companies “could enable amazing discoveries and research,” but is often inaccessible to those who could put it to those uses. Beyond research, he notes that corporate data could, for instance, benefit:
    • Public health – including suicide prevention, addiction counseling and mental health monitoring.
    • Legal and ethical questions – especially as they relate to “the role algorithms have in decisions about our lives,” such as credit checks and resume screening.
  • Mann recognizes the privacy challenges inherent in private sector data sharing, but argues that it is a common misconception that the only two choices are “complete privacy or complete disclosure.” He believes that flexible frameworks for differential privacy could open up new opportunities for responsibly leveraging data collaboratives.

Pastor Escuredo, D., Morales-Guzmán, A., et al. “Flooding through the Lens of Mobile Phone Activity.” IEEE Global Humanitarian Technology Conference, GHTC 2014. Available from: http://bit.ly/1OzK2bK

  • This report describes the use of mobile data to understand the impact of disasters and improve disaster management. The underlying study was conducted in the Mexican state of Tabasco in 2009 by a multidisciplinary, multi-stakeholder consortium involving the UN World Food Programme (WFP), Telefonica Research, the Technical University of Madrid (UPM), the Digital Strategy Coordination Office of the President of Mexico, and UN Global Pulse.
  • Telefonica Research, a division of the major Latin American telecommunications company, provided call detail records covering flood-affected areas for nine months. This data was combined with “remote sensing data (satellite images), rainfall data, census and civil protection data.” The results of the data demonstrated that “analysing mobile activity during floods could be used to potentially locate damaged areas, efficiently assess needs and allocate resources (for example, sending supplies to affected areas).”
  • In addition to the results, the study highlighted “the value of a public-private partnership on using mobile data to accurately indicate flooding impacts in Tabasco, thus improving early warning and crisis management.”

Perkmann, M. and Schildt, H. “Open data partnerships between firms and universities: The role of boundary organizations.” Research Policy, 44(5), 2015. http://bit.ly/25RRJ2c

  • This paper discusses the concept of a “boundary organization” in relation to industry-academic partnerships driven by data. Boundary organizations perform mediated revealing, allowing firms to disclose their research problems to a broad audience of innovators and simultaneously minimize the risk that this information would be adversely used by competitors.
  • The authors identify two especially important challenges for private firms to enter open data or participate in data collaboratives with the academic research community that could be addressed through more involvement from boundary organizations:
    • First is the challenge of maintaining competitive advantage. The authors note that “the more a firm attempts to align the efforts in an open data research programme with its R&D priorities, the more it will have to reveal about the problems it is addressing within its proprietary R&D.”
    • Second is the misalignment of incentives between the private and academic fields. Perkmann and Schildt argue that a firm seeking to build collaborations around its opened data “will have to provide suitable incentives that are aligned with academic scientists’ desire to be rewarded for their work within their respective communities.”

Robin, N., Klein, T., & Jütting, J. “Public-Private Partnerships for Statistics: Lessons Learned, Future Steps.” OECD. 2016. http://bit.ly/24FLYlD.

  • This working paper acknowledges the growing body of work on how different types of data (e.g., telecom data, social media, sensors and geospatial data) can address data gaps relevant to National Statistical Offices (NSOs).
  • Four models of public-private interaction for statistics are described: in-house production of statistics by a data provider for an NSO; transfer of datasets to NSOs from private entities; transfer of data to a third-party provider that manages the NSO and private-entity data; and the outsourcing of NSO functions.
  • The paper highlights challenges to public-private partnerships involving data (e.g., technical challenges, data confidentiality, risks, limited incentives for participation); suggests that deliberate, highly structured approaches to such partnerships require enforceable contracts; and emphasizes both the trade-off between data specificity and accessibility and the importance of pricing mechanisms that reflect the capacity and capability of national statistical offices.
  • Case studies referenced in the paper include:
    • In-house analysis of call detail records by the MNO Telefonica;
    • A third-party data provider and steward of travel statistics (Positium);
    • The Data for Development (D4D) challenge organized by MNO Orange; and
    • Statistics Netherlands’ use of social media to predict consumer confidence.

Stuart, Elizabeth, Samman, Emma, Avis, William, Berliner, Tom. “The data revolution: finding the missing millions.” Overseas Development Institute, 2015. Available from: http://bit.ly/1bPKOjw

  • The authors of this report highlight the need for good quality, relevant, accessible and timely data for governments to extend services into underrepresented communities and implement policies towards a sustainable “data revolution.”
  • The solutions proposed in this report from the Overseas Development Institute focus on capacity-building activities of national statistical offices (NSOs), alternative sources of data (including shared corporate data) to address gaps, and building strong data management systems.

Taylor, L., & Schroeder, R. “Is bigger better? The emergence of big data as a tool for international development policy.” GeoJournal, 80(4). 2015. 503-518. http://bit.ly/1RZgSy4.

  • This journal article describes how privately held data – namely “digital traces” of consumer activity – “are becoming seen by policymakers and researchers as a potential solution to the lack of reliable statistical data on lower-income countries.”
  • They focus especially on three categories of data collaborative use cases:
    • Mobile data as a predictive tool for issues such as human mobility and economic activity;
    • Use of mobile data to inform humanitarian response to crises; and
    • Use of born-digital web data as a tool for predicting economic trends, and the implications these have for LMICs.
  • They note, however, that a number of challenges and drawbacks exist for these types of use cases, including:
    • Access to private data sources often must be negotiated or bought, “which potentially means substituting negotiations with corporations for those with national statistical offices;”
    • The meaning of such data is not always simple or stable, and local knowledge is needed to understand how people are using the technologies in question;
    • Bias in proprietary data can be hard to understand and quantify;
    • Lack of privacy frameworks; and
    • Power asymmetries, wherein “LMIC citizens are unwittingly placed in a panopticon staffed by international researchers, with no way out and no legal recourse.”

van Panhuis, Willem G., Proma Paul, Claudia Emerson, John Grefenstette, Richard Wilder, Abraham J. Herbst, David Heymann, and Donald S. Burke. “A systematic review of barriers to data sharing in public health.” BMC public health 14, no. 1 (2014): 1144. Available from: http://bit.ly/1JOBruO

  • The authors of this report provide a “systematic literature review of potential barriers to public health data sharing.” These twenty potential barriers are classified in six categories: “technical, motivational, economic, political, legal and ethical.” In this taxonomy, “the first three categories are deeply rooted in well-known challenges of health information systems for which structural solutions have yet to be found; the last three have solutions that lie in an international dialogue aimed at generating consensus on policies and instruments for data sharing.”
  • The authors suggest the need for a “systematic framework of barriers to data sharing in public health” in order to accelerate access and use of data for public good.

Verhulst, Stefaan and Sangokoya, David. “Mapping the Next Frontier of Open Data: Corporate Data Sharing.” In: Gasser, Urs and Zittrain, Jonathan and Faris, Robert and Heacock Jones, Rebekah, “Internet Monitor 2014: Reflections on the Digital World: Platforms, Policy, Privacy, and Public Discourse (December 15, 2014).” Berkman Center Research Publication No. 2014-17. http://bit.ly/1GC12a2

  • This essay describes a taxonomy of current corporate data sharing practices for public good: research partnerships; prizes and challenges; trusted intermediaries; application programming interfaces (APIs); intelligence products; and corporate data cooperatives or pooling.
  • Examples of data collaboratives include: Yelp Dataset Challenge, the Digital Ecologies Research Partnership, BBVA Innova Challenge, Telecom Italia’s Big Data Challenge, NIH’s Accelerating Medicines Partnership and the White House’s Climate Data Partnerships.
  • The authors highlight important questions to consider towards a more comprehensive mapping of these activities.

Verhulst, Stefaan and Sangokoya, David, 2015. “Data Collaboratives: Exchanging Data to Improve People’s Lives.” Medium. Available from: http://bit.ly/1JOBDdy

  • The essay refers to data collaboratives as a new form of collaboration involving participants from different sectors exchanging data to help solve public problems. These forms of collaborations can improve people’s lives through data-driven decision-making; information exchange and coordination; and shared standards and frameworks for multi-actor, multi-sector participation.
  • The essay cites four activities that are critical to accelerating data collaboratives: documenting value and measuring impact; matching public demand and corporate supply of data in a trusted way; training and convening data providers and users; experimenting and scaling existing initiatives.
  • Examples of data collaboratives include NIH’s Precision Medicine Initiative; the Mobile Data, Environmental Extremes and Population (MDEEP) Project; and Twitter-MIT’s Laboratory for Social Machines.

Verhulst, Stefaan, Susha, Iryna, Kostura, Alexander. “Data Collaboratives: matching Supply of (Corporate) Data to Solve Public Problems.” Medium. February 24, 2016. http://bit.ly/1ZEp2Sr.

  • This piece articulates a set of key lessons learned during a session at the International Data Responsibility Conference focused on identifying emerging practices, opportunities and challenges confronting data collaboratives.
  • The authors list a number of privately held data sources that could create positive public impacts if made more accessible in a collaborative manner, including:
    • Data for early warning systems to help mitigate the effects of natural disasters;
    • Data to help understand human behavior as it relates to nutrition and livelihoods in developing countries;
    • Data to monitor compliance with weapons treaties;
    • Data to more accurately measure progress related to the UN Sustainable Development Goals.
  • To the end of identifying and expanding on emerging practice in the space, the authors describe a number of current data collaborative experiments, including:
    • Trusted Intermediaries: Statistics Netherlands partnered with Vodafone to analyze mobile call data records in order to better understand mobility patterns and inform urban planning.
    • Prizes and Challenges: Orange Telecom, which has been a leader in this type of Data Collaboration, provided several examples of the company’s initiatives, such as the use of call data records to track the spread of malaria as well as their experience with Challenge 4 Development.
    • Research partnerships: The Data for Climate Action project is an ongoing large-scale initiative incentivizing companies to share their data to help researchers answer particular scientific questions related to climate change and adaptation.
    • Sharing intelligence products: JPMorgan Chase shares macroeconomic insights gained from its data through the newly established JPMorgan Chase Institute.
  • In order to capitalize on the opportunities provided by data collaboratives, a number of needs were identified:
    • A responsible data framework;
    • Increased insight into different business models that may facilitate the sharing of data;
    • Capacity to tap into the potential value of data;
    • Transparent stock of available data supply; and
    • Mapping emerging practices and models of sharing.

Vogel, N., Theisen, C., Leidig, J. P., Scripps, J., Graham, D. H., & Wolffe, G. “Mining mobile datasets to enable the fine-grained stochastic simulation of Ebola diffusion.” Paper presented at the Procedia Computer Science. 2015. http://bit.ly/1TZDroF.

  • The paper presents a research study conducted on the basis of the mobile calls records shared with researchers in the framework of the Data for Development Challenge by the mobile operator Orange.
  • The study discusses the data analysis approach in relation to developing a simulation of Ebola diffusion built around “the interactions of multi-scale models, including viral loads (at the cellular level), disease progression (at the individual person level), disease propagation (at the workplace and family level), societal changes in migration and travel movements (at the population level), and mitigating interventions (at the abstract government policy level).”
  • The authors argue that the use of their population, mobility, and simulation models provide more accurate simulation details in comparison to high-level analytical predictions and that the D4D mobile datasets provide high-resolution information useful for modeling developing regions and hard to reach locations.

Welle Donker, F., van Loenen, B., & Bregt, A. K. “Open Data and Beyond.” ISPRS International Journal of Geo-Information, 5(4). 2016. http://bit.ly/22YtugY.

  • This research develops a monitoring framework to assess the effects of open (private) data, using a case study of Liander, a Dutch energy network administrator.
  • Focusing on the potential impacts of open private energy data – beyond ‘smart disclosure’ where citizens are given information only about their own energy usage – the authors identify three attainable strategic goals:
    • Continuously optimize performance on services, security of supply, and costs;
    • Improve management of energy flows and insight into energy consumption;
    • Help customers save energy and switch over to renewable energy sources.
  • The authors propose a seven-step framework for assessing the impacts of Liander data, in particular, and open private data more generally:
    • Develop a performance framework that describes what the program is about, including the organization’s mission and strategic goals;
    • Identify the most important elements, or key performance areas, that are most critical to understanding and assessing your program’s success;
    • Select the most appropriate performance measures;
    • Determine the gaps between what information you need and what is available;
    • Develop and implement a measurement strategy to address the gaps;
    • Develop a performance report which highlights what you have accomplished and what you have learned;
    • Learn from your experiences and refine your approach as required.
  • While the authors note that the true impacts of this open private data will likely not come into view in the short term, they argue that, “Liander has successfully demonstrated that private energy companies can release open data, and has successfully championed the other Dutch network administrators to follow suit.”

World Economic Forum, 2015. “Data-driven development: pathways for progress.” Geneva: World Economic Forum. http://bit.ly/1JOBS8u

  • This report captures an overview of the existing data deficit and the value and impact of big data for sustainable development.
  • The authors of the report focus on four main priorities towards a sustainable data revolution: commercial incentives and trusted agreements with public- and private-sector actors; the development of shared policy frameworks, legal protections and impact assessments; capacity building activities at the institutional, community, local and individual level; and lastly, recognizing individuals as both producers and consumers of data.