Selected Readings on Indigenous Data Sovereignty


By Juliet McMurren

As part of an ongoing effort to build a knowledge base for the field of improving governance through data and technology, The GovLab publishes a series of Selected Readings, which provide an annotated and curated collection of recommended works on themes such as open data, data collaboration, and civic technology.

In this edition, to recognize and honor Indigenous Peoples’ Day, we have curated below a selection of literature on Indigenous data sovereignty (IDS), the principle that Indigenous peoples should be able to control the data collected by and about them, to determine how and by whom it is accessed, stored, and used, and to develop data practices and methodologies that reflect their lived experiences, cultures, and worldviews. The selection complements previously released readings on Personal DataData GovernanceAlgorithmic Scrutiny — among other Data related topics.

To suggest additional readings on this or any other topic, please email info@thelivinglib.org. All our Selected Readings can be found here.

Selected Readings (in alphabetical order)

Principles

Kukutai, Tahu and John Taylor (eds) Indigenous Data Sovereignty: Towards an Agenda (2016)

  • The foundational work in the field, this edited volume brings together Māori, Australian Aboriginal, Native American, and First Nations academics, researchers and data practitioners to set out the case for Indigenous data sovereignty.
  • Organized in four parts, the book begins by providing a historical account of colonialist statistics and the origins of the concept of Indigenous data sovereignty. In the second part, the authors set out an Indigenous critique of official statistics as a colonialist practice primarily intended to serve settler governments through the control of Indigenous peoples. As a result, population statistics from these societies are imbued with colonialist norms that both ignore indicators significant to Indigenous peoples and reduce them to what contributor Maggie Walter calls 5D data: disparity, deprivation, disadvantage, dysfunction, and difference.
  • The authors outline how Indigenous data sovereignty would work, setting out an agenda in which Indigenous people would control who should be counted among them, and establish collection priorities reflective of their cultural norms, interests, values and priorities. This could include a move away from data about individuals as single indicators used to compare, rank and drive “improvement” to a more nuanced and complex view of data that focuses on social groupings beyond the household. They would also control who would have access to the data gathered, with culturally appropriate rules and protocols for consents to access and use data. These principles are encapsulated in the First Nations OCAP® data model, through which they assert their ownership, control over collection, use and disclosure; access to, and possession of all First Nations’ data.
  • The third part of the book provides examples of Indigenous data sovereignty in practice, from the perspective of both data practitioners and data users. A case study of data sovereignty among the Yawuru of Western Australia outlines a methodology for developing data collection rooted in self-determination and community values of mabu buru (knowledge of the land) and mabu liyan (relational or community wellbeing). Another examines the work of a Māori primary health care organization, National Hauora Coalition, which conducted a rapid response campaign to reduce high rates of acute rheumatic fever among Māori children in Auckland. Stewardship and analysis of their own community data enabled targeted interventions that reduced positive Group A strep rates among children by 75 percent and rates of rheumatic fever by 33 percent.
  • The final section of the book outlines the emerging efforts of the New Zealand and Australian Government to engage with Indigenous peoples’ desire for data sovereignty through their statistical practices.

Lovett, Raymond et al Good Data Practices for Indigenous Data Sovereignty and Governance (2019)

  • This multi-authored chapter is the first in a volume the editors describe as born of frustration with dystopian “bad data” practices and devoted to the exploration of how data could be used “productively and justly to further social, economic, cultural and political goals.”
  • The chapter sets out the context for the emergence of IDS movements worldwide, and gives a survey of IDS networks and their foundational principles, such as OCAP® (above) and the Māori principles of rangatiratanga(right to own, access, control and possess), manaakitanga (ethical use to further wellbeing) and kaitiakitanga (sustainable stewardship) as they apply to data about or from themselves and their environs.
  • The article defines and differentiates IDS — the management of information in alignment with the laws, practices and customs of the nation-state in which it is located — and indigenous data governance (IDG), or power and authority over the design, ownership, access to and use of data. It situates IDS movements alongside broader movements for Indigenous sovereignty informed by the rights laid out in the UN Declaration on the Rights of Indigenous Peoples.

Rainie, Stephanie Carroll, Tahu Kukutai, Maggie Walter, Oscar Figueroa-Rodriguez, Jennifer Walker, and Per Axelsson (2019) Issues in Open Data — Indigenous Data Sovereignty. In T. Davies, S. Walker, M. Rubinstein, & F. Perini (Eds.), The State of Open Data: Histories and Horizons.

  • Part of a wider report about the state of open data, this chapter discusses the tension between the principles of open data and IDS. Describing open data for Indigenous Peoples as a double-edged sword, the authors note the potential of open data to help deliver on Indigenous aspirations for sustainable development. At the same time, open data perpetuates data challenges born of colonization, including assumptions about a single nation-state and that open data is both useful and benign.
  • The authors observe that there is a widespread lack of understanding about IDS within the open data movement, and that open data policy and discussions have been largely framed to address the needs and interests of nation-states, with minimal engagement with Indigenous peoples. The authors provide a critique of the ways in which the Open Data Charter overlooks the issue of IDS in its principles on open by default, citizen engagement, and inclusive development. They note that the Open Data Charter’s commitment to free use, reuse, and redistribution by anyone, at any time, and anywhere, for example, is in direct conflict with the rights of Indigenous Peoples to govern their own data and control how and by whom it is accessed.
  • Opening state data that is unreliable, inaccurate, and designed, collected and processed according to the norms of state agencies poses additional problems for Indigenous peoples. Statistics about Indigenous peoples based on colonialist norms frequently perpetuate a narrative of inequality. Data infrastructures may be distorted by cultural assumptions, such as those about naming conventions, that misrepresent Indigenous people. In addition, the concept of open data has led to instances of cooptation and theft of Indigenous knowledge, when researchers have collected Indigenous knowledge about the environment, digitized it and shared it with Indigenous consent or oversight.
  • Drawing on the experience of Indigenous data networks worldwide, the authors propose three steps forward for the open data community in its relationship to Indigenous peoples. First, it needs to engage with Indigenous peoples as partners and knowledge holders to inform stewardship of Indigenous data. Secondly, IDS networks, with contacts in Indigenous communities and the world of data, should act as intermediaries for this engagement. Finally, they call for a broader adoption of principles on the governance and stewardship of Indigenous data within research and administration.

Research Data Alliance CARE Principles for Indigenous Data Governance(2019)

  • The RDA’s CARE principles propose an additional set of criteria that should be applied to open data in order to ensure that it respects Indigenous rights to self-determination. It argues the existing FAIR principles — that open data should be findable, accessible, interoperable and reusable — focus on data characteristics that facilitate increased sharing while ignoring historical context and power differentials.
  • To supplement FAIR, they propose the addition of CARE: that open data should be for collective benefit, recognize Indigenous peoples’ authority to control their own data, carry a responsibility to demonstrate how they benefit self-determination, and have embedded ethics prioritizing the rights and wellbeing of Indigenous people.
  • The principle of collective benefit asserts that data ecosystems should be designed in ways that Indigenous peoples can derive benefit from them. This includes active government support for Indigenous data use and reuse, using data to reduce information asymmetries between government and Indigenous communities, and the use of any value created from Indigenous data to benefit Indigenous communities. Authority to control recognizes the rights and interests of Indigenous peoples in their knowledge and data, and to govern and control how it is collected, accessed, used and stored.
  • Those working with Indigenous data have a responsibility to demonstrate how they are using it to benefit Indigenous communities. This involves fostering relationships of partnership and trust, working to build capability and capacity within Indigenous communities, and grounding data in the experiences, languages and worldviews of those communities. Finally, the rights and wellbeing of Indigenous peoples must be the primary concern. This requires data design and collection practices that do not stigmatize Indigenous people and that align with Indigenous ethical practices, that address imbalances in power and resources, and that are mindful of the potential for future use and potential harms.

Applications and Case Studies

Carroll, Stephanie Russo, Desi Rodriguez-Lonebear and Andrew Martinez, Indigenous Data Governance: Strategies from United States Native Nations(2019)

  • This article reviews IDS strategies from Native nations in the United States, connecting IDS and IDG to the rebuilding of Native nations and providing case studies of IDG occurring within tribal and non-tribal entities.
  • The article leads with a definition of key terms, including data dependency, “a paradox of scarcity and abundance: extensive data are collected about Indigenous peoples and nations, but rarely by or for Indigenous peoples’ and nations’ purposes.” It proposes IDG as a method by which the aspiration of IDS can be achieved, through a self-reinforcing cycle: governance of data leads to data rebuilding, providing data for governance that in turn leads to nation rebuilding.
  • The article offers three tribal, two non-tribal, and three urban, inter- and supra-tribal case studies of IDG in practice. The National Congress of American Indians Tribal Data Capacity Project, for example, was a pilot project to build tribal data capacity with five US tribes. Its outputs included a successful census conducted by the Pueblo of Laguna and the University of New Mexico, on tribal terms with tribal money for tribal purposes, and resulting in the development of proprietary software that remains the property of the tribe and that can be reused for subsequent collections.
  • The article concludes with a set of recommendations for tribal rights holders and stakeholders. It recommends that tribal rights holders develop tribe-specific data governance principles, policies, and procedures, and generate resources for IDG. Stakeholders are called on to acknowledge, support and promote IDS and embed it in data collection practices by building frameworks specifying how IDS is to be enacted in data processes, investing in intertribal institutions and recruiting and training Indigenous data professionals, among other measures.

Chaney, Christopher Data Sovereignty and the Tribal Law and Order Act (2018)

  • This article surveys the relationship between data sovereignty and the provision of criminal justice services, a key aspect of tribal sovereignty. The Tribal Law and Order Act (TLOA) 2010 addressed tribal data by mandating federal justice and law enforcement agencies to coordinate and consult with tribes over data collection, and providing tribal criminal justice agencies meeting federal and state requirements with access to national crime databases to enter and retrieve data.
  • TLOA has resulted in broad and extensive opportunities for federally recognized tribes to submit and retrieve data. Subject to federal law, tribes have the right to determine what information they will submit and access, putting the tribe in control of its own data. It also greatly facilitates the administration of tribal law enforcement and justice by enabling access to federal databases on property, such as vehicles and firearms, and people, including those on fugitives, sex offenders, and missing persons. The author suggests that TLOA implementation could serve as a model for other federal agencies working towards tribal data sovereignty arrangements.

[Dewar, J.] First Nations’ Data Sovereignty in Canada (2019).

  • This paper provides an overview of First Nations experiences of Canadian efforts to identify First Nations individuals, communities, and Nations in official statistics and data, and of the development of First Nations Data Sovereignty efforts over the previous two decades.
  • The paper surveys the ways in which early legislation constructed “Indians” and indian status within Canada counter to First Nations norms, harming traditional gender roles, leadership structures and governance, severing many First Nations women who “married out” from the culture and lands, and forcing First Nations people to choose between “enfranchisement” through education or employment and their Indian status and culture.
  • The paper then surveys the current First Nations statistical context, noting its numerous deficiencies. Sources of information and data, including the national census, were created with little or no Indigenous involvement or input, creating inconsistencies in the accuracy, reliability, usefulness, and comparability of the data. Even where the data is useful, it is not routinely used in planning and advocacy for the benefit of First Nations communities. First Nations are also required to meet onerous reporting requirements in order to access federal funding, but the resulting data — and other data from and about First Nations — are not effectively analyzed, used, or shared with First Nations.
  • The paper provides examples of effective instances of national and regional First Nations data sovereignty using OCAP® principles. These include the First Nations Information Governance Centre’s own survey work on health, childhood, education, labor and employment, but also similar provincial initiatives. The FNIGC is currently at work with regional partners to develop a National Data Governance Strategy to advance First Nations Data Sovereignty.

Garrison, Nanibaa’ et al Genomic Research Through An Indigenous Lens: Understanding the Expectations (2019)

  • This multi-authored study compares research guidelines for genomic research among Indigenous peoples in Canada, New Zealand, Australia, and the United States.
  • It notes that while there is a dearth of genomic research about Indigenous peoples, Indigenous communities have been the subject of western science in ways that have been intrusive, disrespectful and unethical, leading to community harms and mistrust. Lack of community engagement and informed consent for secondary use of data, and past experiences of harmful and negative representation in publications, have reduced the willingness of Indigenous peoples to engage with genetic research.
  • Canada, New Zealand, Australia, and the United States each have guidelines on scientific research among Indigenous peoples. The authors compare the provisions of these guidelines and the Indigenous Research Protection Act, a draft instrument developed by the Indigenous Peoples Council on Biocolonialism with the goal of protecting Indigenous peoples in research, across four principles: community engagement, rights and interests, institutional responsibilities, and ethical/legal oversight. They observe that while many of the policies provide for protection of Indigenous peoples relating to sample collection, secondary uses of data, benefits, and withdrawal from research, there is less consistency regarding cultural rights and interests, particularly in US instruments.
  • The authors examine ways Indigenous peoples have sought to “bridge the gap” between the benefits of genomic research and the protection of Indigenous peoples. Community protocol development, Indigenous-led genomics initiatives, and consent procedures that draw on UNDRIP have increased community engagement in some countries and fostered greater trust. Concrete progress has also been made in initiatives to preserve Indigenous rights and interests over biospecimens, including protocols that allow for the return of samples, biobanking, and Indigenous governance of resulting data.

Gifford, Heather and Kirikowhai Mikaere Te Kete Tū Ātea: Towards claiming Rangitīkei iwi data sovereignty (2019)

  • This article gives an outline of the Te Kete Tū Ātea research project, an four-year, two phase participatory research initiative by the Rangitīkei Iwi Collective to establish iwi data sovereignty. The first phase resulted in the development of an iwi data needs analysis and comprehensive iwi information framework, which identified potential data sources, gaps in current information, and strategies to address those gaps. The second phase led to the prioritization of a key information gathering domain, economic data, and a statistical evaluation of current iwi data holdings. The project adopted a Kaupapa Māori approach: it was “Māori led, Māori controlled, privileged a Māori worldview, and was framed around questions identified by Māori as of relevance to Māori.”
  • The first phase of the study identified a five domain framework to guide iwi data gathering. Collectively, these five domains — cultural, social, peoples, environmental and economic — make up Te Kete Tū Ātea, informed by three goal dimensions: kaitiakitanga, strengthening identity and connection, and empowerment and enablement.
  • The study identified challenges in assessing the wellbeing of iwi, including statistical capacity within iwi and the availability of data, but the authors suggest that the approach itself could be borrowed and applied by other iwi nationwide.

Johnson-Jennings, Michelle, Derek Jennings, and Meg Little Indigenous data sovereignty in action: The Food Wisdom Repository (2019)

  • This article arose from the experience of the authors at the Research for Indigenous Community Health (RICH) Center. Observing that while Indigenous health and nutrition information is available, it is dispersed and difficult to access, they proposed the development of a Food Wisdom Repository to gather meaningful data and information on Indigenous health practices and efforts. The result, supported by the Shakopee Mdewakanton Sioux Community, is an online digital repository of wise food practices grounded in Indigenous knowledge and IDS.
  • The project draws on Indigenous worldviews, knowledge and ways of knowing, beliefs, and forms of power. In particular, it is framed around the idea of wise practices — pragmatic, flexible and sustainable practices rooted in a given local context and the wisdom of community members — rather than the objective, hierarchical, hegemonic and acontextual “best practices” of Western science.

Montenegro, Maria Subverting the universality of metadata standards: The TK labels as a tool to promote Indigenous data sovereignty (2019)

  • This paper explores how metadata standards, and in particular the widely used Dublin Core, reinforce colonial legal property frameworks and disenfranchise Indigenous people, and how they could be used (or subverted) to exercise and promote IDS.
  • The author notes that the rights and creator fields of DC are in direct conflict with Indigenous epistemologies and protocols on the access, circulation, and use of traditional Indigenous knowledge (TK). The rights field is embedded in western legal practice designed to recognize and protect new creations or inventions, and require a designated individual author and original work in order to offer any protection. This emphasis on originality and individuality is at odds with Indigenous knowledge that emphasizes collective and cumulative knowledge acquired over generations. Similarly, both western IP law and the creator field within DC recognize the individual who records the lifestyles, languages and cultural practices of Indigenous people in film, audio, or image as the legal author, rather than the communities from which the content arose. As subjects but not authors, Indigenous people have no control over these recordings of their cultural practices, or how they are stored, accessed or reused. Indeed, they are even legally required to seek permission from the author to reuse these materials that document their lives and culture.
  • Developed in collaboration with Indigenous peoples, the TK labels are a set of digital tags that can be included as associated metadata in various digital information contexts such as CMSs, online catalogs and databases, finding aids and online platforms. These tags are intended to increase awareness of culturally appropriate circulation, access and use of Indigenous cultural materials. Designed to be used where communities are unable to assert legal control over materials, they provide important information about culturally appropriate use and stewardship. A Seasonal tag developed by the Penobscot Nation, for example, proscribes access to some content outside a given time of year, while an Attribution label, the most widely used, allows Indigenous communities to assert that they are the TK holders of the content and should be acknowledged as such.
  • While the TK labels represent a welcome advance in capturing and asserting Indigenous metadata standards, they are voluntary, and therefore only function if non-tribal collecting institutions recognize the IDS of the tribes.

McMahon, Rob, Tim LaHache, and Tim Whiteduck Digital Data Management as Indigenous Resurgence in Kahnawà:ke (2015)

  • This article documents IDG experiences within the Kahnawà:ke Mohawk (Quebec) community as it set up and used ICT systems to manage community data on research, education, finance, health, membership, housing, lands, and resources. Their research followed the implementation of a customized digital data management system, and sought to find out employees of community service organizations, chiefly in education, conceived of and used data, and the role of data management as part of self-government and Indigenous resurgence.
  • The authors describe the initiative as an act of “everyday community resurgence,” but one that was accompanied by significant internal tensions and challenges. They note the need to avoid technological determinism in IDS, since the use of ICTs has the potential to exacerbate the effects of settler colonialism, concentrating and centralizing power.
  • The article describes the rollout, architecture, and governance of the Kahnawà:ke data management system. One of the challenges faced by the community was in data sharing, with a lack of trust between community organizations leading to data hugging and silos. This tension, which has been identified in other research cited by the authors, points to the need for trust-building in order to promote more holistic data sharing and optimal data use.

Rainie, Stephanie Carroll et al Data as a Strategic Resource: Self-determination, Governance, and the Data Challenge for Indigenous Nations in the United States (2017)

  • Despite the need of Indigenous nations for data to help identify problems and find solutions, US Indigenous nations encounter a data landscape characterized by “sparse, inconsistent, and irrelevant information complicated by limited access and utility” that does not serve to address tribally defined needs. Because much of this data is collected and controlled by others for their own purposes, mistrust in data collection is high.
  • This article documents two cases studies in tribal data sovereignty and data governance, among the Ysleta del Sur Pueblo and Cheyenne River Sioux Tribe. It lays out the data priorities, agendas and challenges faced by each, and the resulting data initiatives, protocols and uses. The article also discusses how this data governance contributed to the tribes’ self determination.
  • As part of a development strategy, in 2008 the Ysleta del Sur began to collect socioeconomic and demographic data annually from its citizens as part of its enrolment process. Implementing a census approach that incorporated cultural and local knowledge and western epistemologies, the project yielded data about population, poverty rates, household incomes, educational attainment, workforce and unemployment that was more complete than US census data. Strong community engagement yielded a 90 percent response rate, and the results inspired other data initiatives to support community strategic decision making. The socioeconomic data was also used to support successful applications for federal funding.
  • Identifying high levels of poverty and unemployment as a problem, the Cheyenne River Sioux Tribe sought a comprehensive plan to address these problems, for which they needed timelier, more granular, and more culturally and locally relevant data than that available through the federal government. With academic partners, it developed a survey and data collection process to collect baseline demographic and socioeconomic data from a sample of residents. The survey was able to quantify unemployment rates among people living on the reservation, but also captured employment categories missed by federal data collection, such as the arts microenterprise sector. The results were shared back to the community, and used to foster microenterprises and write grant applications.

Walter, Maggie and Michelle Suina Indigenous data, indigenous methodologies and indigenous data sovereignty (2019)

  • In this article, Walter and Suina propose that there is a dearth of Indigenous quantitative methodologies, driven by a longstanding mistrust of positivist research that positions Indigenous peoples within a deficit discourse. What the authors call “quantitative avoidance” leads to lived consequences for Indigenous peoples: since the statistics produced by quantitative methods form the primary evidence base for policy within the colonial societies, failing to engage with them removes Indigenous people from a critical part of the policy debate.
  • The authors make a case for developing Indigenous quantitative methodologies. They evidence the Albuquerque Area Southwest Tribal Epidemiology Center, whose mission is to collaborate with the 27 tribes of their health area to provide high quality, culturally congruent epidemiology, capacity development, program evaluation, and health promotion. Committed to honoring tribal sovereignty, AASTEC is committed to building capacity that enables tribes to control data design, collection, and management at all stages of the process. This requires not merely adapting western survey instruments, but redesigning them to incorporate the values and definitions of health of the communities they serve.
  • The authors close with three recommendations for communities and stakeholders interested in building Indigenous quantitative methodologies. First, communities need to cultivate technical skills for survey development, data collection, analysis and reporting. Secondly, they need to build comfort and understand about research methods among tribal partners in order to undo decades of mistrust; the authors describe simulation exercises that help to demonstrate how worldviews shape expectations and perceptions around data that they have used successfully with Indigenous and non-Indigenous participants. Finally, they should pursue advocacy of IDS and an exchange of ideas that allows successful Indigenous research methodologies to be promulgated.

Selected Readings on Data Portability


By Juliet McMurren, Andrew Young, and Stefaan G. Verhulst

As part of an ongoing effort to build a knowledge base for the field of improving governance through technology, The GovLab publishes a series of Selected Readings, which provide an annotated and curated collection of recommended works on themes such as open data, data collaboration, and civic technology.

In this edition, we explore selected literature on data portability.

To suggest additional readings on this or any other topic, please email info@thelivinglib.org. All our Selected Readings can be found here.

Context

Data today exists largely in silos, generating problems and inefficiencies for the individual, business and society at large. These include:

  • difficulty switching (data) between competitive service providers;
  • delays in sharing data for important societal research initiatives;
  • barriers for data innovators to reuse data that could generate insights to inform individuals’ decision making; and
  • inhibitions to scale data donation.

Data portability — the principle that individuals have a right to obtain, copy, and reuse their own personal data and to transfer it from one IT platform or service to another for their own purposes — is positioned as a solution to these problems. When fully implemented, it would make data liquid, giving individuals the ability to access their own data in a usable and transferable format, transfer it from one service provider to another, or donate data for research and enhanced data analysis by those working in the public interest.

Some companies, including Google, Apple, Twitter and Facebook, have sought to advance data portability through initiatives like the Data Transfer Project, an open source software project designed to facilitate data transmittals. Newly enacted data protection legislation such as Europe’s General Data Protection Regulation (2018) and the California Consumer Privacy Act (2018) give data holders a right to data portability. However, despite the legal and technical advances made, many questions toward scaling up data liquidity and portability responsibly and systematically remain. These new data rights have generated complex and as yet unanswered questions about the limits of data ownership, the implications for privacy, security and intellectual property rights, and the practicalities of how, when, and to whom data can be transferred.

In this edition of the GovLab’s Selected Readings series, we examine the emerging literature on data portability to provide a foundation for future work on the value proposition of data portability. Readings are listed in alphabetical order.

Selected readings

Cho, Daegon, Pedro Ferreira, and Rahul Telang, The Impact of Mobile Number Portability on Price and Consumer Welfare (2016)

  • In this paper, the authors analyze how Mobile Number Portability (MNP) — the ability for consumers to maintain their phone number when changing providers, thus reducing switching costs — affected the relationship between switching costs, market price and consumer surplus after it was introduced in most European countries in the early 2000s.
  • Theory holds that when switching costs are high, market leaders will enjoy a substantial advantage and are able to keep prices high. Policy makers will therefore attempt to decrease switching costs to intensify competition and reduce prices to consumers.
  • The study reviewed quarterly data from 47 wireless service providers in 15 EU countries between 1999 and 2006. The data showed that MNP simultaneously decreased market price by over four percent and increased consumer welfare by an average of at least €2.15 per person per quarter. This increase amounted to a total of €880 million per quarter across the 15 EU countries analyzed in this paper and accounted for 15 percent of the increase in consumer surplus observed over this time.

CtrlShift, Data Mobility: The data portability growth opportunity for the UK economy (2018)

  • Commissioned by the UK Department of Digital, Culture, Media and Sport (DCMS), this study was intended to identify the potential of personal data portability for the UK economy.
  • Its scope went beyond the legal right to data portability envisaged by the GDPR, to encompass the current state of personal data portability and mobility, requirements for safe and secure data sharing, and the potential economic benefits through stimulation of innovation, productivity and competition.
  • The report concludes that increased personal data mobility has the potential to be a vital stimulus for the development of the digital economy, driving growth by empowering individuals to make use of their own data and consent to others using it to create new data-driven services and technologies.
  • However, the report concludes that there are significant challenges to be overcome, and new risks to be addressed, before the value of personal data can be realized. Much personal data remains locked in organizational silos, and systemic issues related to data security and governance and the uneven sharing of benefits need to be resolved.

Data Guidance and Future of Privacy Forum, Comparing Privacy Laws: GDPR v. CCPA (2018)

  • This paper compares the provisions of the GDPR with those of the California Consumer Privacy Act (2018).
  • Both article 20 of the GDPR and section 1798 of the CCPA recognize a right to data portability. Both also confer on data subjects the right to receive data from controllers free of charge upon request, and oblige controllers to create mechanisms to provide subjects with their data in portable and reusable form so that it can be transmitted to third parties for reuse.
  • In the CCPA, the right to data portability is an extension of the right to access, and only confers on data subjects the right to apply for data collected within the past 12 months and have it delivered to them. The GDPR does not impose a time limit, and allows data to be transferred from one data controller to another, but limits the right to automatically collected personal data provided by the data subject themselves through consent or contract.

Data Transfer Project, Data Transfer Project Overview and Fundamentals (2018)

  • The paper presents an overview of the goals, principles, architecture, and system components of the Data Transfer Project. The intent of the DTP is to increase the number of services offering data portability and provide users with the ability to transfer data directly in and out of participating providers through systems that are easy and intuitive to use, private and secure, reciprocal between services, and focused on user data. The project, which is supported by Microsoft, Google, Twitter and Facebook, is an open-source initiative that encourages the participation of other providers to reduce the infrastructure burden on providers and users.
  • In addition to benefits to innovation, competition, and user choice, the authors point to benefits to security, through allowing users to backup, organize, or archive their data, recover from account hijacking, and retrieve their data from deprecated services.
  • The DTP’s remit was to test concepts and feasibility for the transfer of specific types of user data between online services using a system of adapters to transfer proprietary formats into canonical formats that can be used to transfer data while allowing providers to maintain control over the security of their service. While not resolving all formatting or support issues, this approach would allow substantial data portability and encourage ecosystem sustainability.

Deloitte, How to Flourish in an Uncertain Future: Open Banking(2017)

  • This report addresses the innovative and disruptive potential of open banking, in which data is shared between members of the banking ecosystem at the authorization of the customer, with the potential to increase competition and facilitate new products and services. In the resulting marketplace model, customers could use a single banking interface to access products from multiple players, from established banks to newcomers and fintechs.
  • The report’s authors identify significant threats to current banking models. Banks that failed to embrace open banking could be relegated to a secondary role as an infrastructure provider, while third parties — tech companies, fintech, and price comparison websites — take over the customer relationship.
  • The report identifies four overlapping operating models banks could adopt within an open banking model: full service providers, delivering proprietary products through their own interface with little or no third-party integration; utilities, which provide other players with infrastructure without customer-facing services; suppliers, which offer proprietary products through third-party interfaces; and interfaces,which provide distribution services through a marketplace interface. To retain market share, incumbents are likely to need to adopt a combination of these roles, offering their own products and services and those of third parties through their own and others’ interfaces.

Digital Competition Expert Panel Unlocking Digital Competition(2019)

  • This report captures the findings of the UK Digital Competition Expert Panel, which was tasked in 2018 with considering opportunities and challenges the digital economy might pose for competition and competition policy and to recommend any necessary changes. The panel focused on the impact of big players within the sector, appropriate responses to mergers or anticompetitive practices, and the impact on consumers.
  • The panel found that the digital economy is creating many benefits, but that digital markets are subject to tipping, in which emerging winners can scoop much of the market. This concentration can give rise to substantial costs, especially to consumers, and cannot be solved by competition alone. However, government policy and regulatory solutions have limitations, including the slowness of policy change, uneven enforcement and profound informational asymmetries between companies and government.
  • The panel proposed the creation of a digital markets unit that would be tasked with developing a code of competitive conduct, enabling greater personal data mobility and systems designed with open standards, and advancing access to non-personal data to reduce barriers to market entry.
  • The panel’s model of data mobility goes beyond data portability, which involves consumers being able to request and transfer their own data from one provider to another. Instead, the panel recommended empowering consumers to instigate transfers of data between a business and a third party in order to access price information, compare goods and services, or access tailored advice and recommendations. They point to open banking as an example of how this could function in practice.
  • It also proposed updating merger policy to make it more forward-looking to better protect consumers and innovation and preserve the competitiveness of the market. It recommended the creation of antitrust policy that would enable the implementation of interim measures to limit damage to competition while antitrust cases are in process.

Egan, Erin, Charting a Way Forward: Data Portability and Privacy(2019)

  • This white paper by Facebook’s VP and Chief Privacy Officer, Policy, represents an attempt to advance the conversation about the relationship between data portability, privacy, and data protection. The author sets out five key questions about data portability: what is it, whose and what data should be portable, how privacy should be protected in the context of portability, and where responsibility for data misuse or improper protection should lie.
  • The paper finds that definitions of data portability still remain imprecise, particularly with regard to the distinction between data portability and data transfer. In the interest of feasibility and a reasonable operational burden on providers, it proposes time limits on providers’ obligations to make observed data portable.
  • The paper concludes that there are strong arguments both for and against allowing users to port their social graph — the map of connections between that user and other users of the service — but that the key determinant should be a capacity to ensure the privacy of all users involved. Best-practice data portability protocols that would resolve current differences of approach as to what, how and by whom information should be made available would help promote broader portability, as would resolution of liability for misuse or data exposure.

Engels, Barbara, Data portability among online platforms (2016)

  • The article examines the effects on competition and innovation of data portability among online platforms such as search engines, online marketplaces, and social media, and how relations between users, data, and platform services change in an environment of data portability.
  • The paper finds that the benefits to competition and innovation of portability are greatest in two kinds of environments: first, where platforms offer complementary products and can realize synergistic benefits by sharing data; and secondly, where platforms offer substitute or rival products but the risk of anti-competitive behaviour is high, as for search engines.
  • It identifies privacy and security issues raised by data portability. Portability could, for example, allow an identity fraudster to misuse personal data across multiple platforms, compounding the harm they cause.
  • It also suggests that standards for data interoperability could act to reduce innovation in data technology, encouraging data controllers to continue to use outdated technologies in order to comply with inflexible, government-mandated standards.

Graef, Inge, Martin Husovec and Nadezhda Purtova, Data Portability and Data Control: Lessons for an Emerging Concept in EU Law (2018)

  • This paper situates the data portability right conferred by the GDPR within rights-based data protection law. The authors argue that the right to data portability should be seen as a new regulatory tool aimed at stimulating competition and innovation in data-driven markets.
  • The authors note the potential for conflicts between the right to data portability and the intellectual property rights of data holders, suggesting that the framers underestimated the potential impact of such conflicts on copyright, trade secrets and sui generis database law.
  • Given that the right to data portability is being replicated within consumer protection law and the regulation of non-personal data, the authors argue framers of these laws should consider the potential for conflict and the impact of such conflict on incentives to innovate.

Mohsen, Mona Omar and Hassan A. Aziz The Blue Button Project: Engaging Patients in Healthcare by a Click of a Button (2015)

  • This paper provides a literature review on the Blue Button initiative, an early data portability project which allows Americans to access, view or download their health records in a variety of formats.
  • Originally launched through the Department of Veterans’ Affairs in 2010, the Blue Button initiative had expanded to more than 500 organizations by 2014, when the Department of Health and Human Services launched the Blue Button Connector to facilitate both patient access and development of new tools.
  • The Blue Button has enabled the development of tools such as the Harvard-developed Growth-Tastic app, which allows parents to check their child’s growth by submitting their downloaded pediatric health data. Pharmacies across the US have also adopted the Blue Button to provide patients with access to their prescription history.

More than Data and Mission: Smart, Got Data? The Value of Energy Data to Customers (2016)

  • This report outlines the public value of the Green Button, a data protocol that provides customers with private and secure access to their energy use data collected by smart meters.
  • The authors outline how the use of the Green Button can help states meet their energy and climate goals by enabling them to structure renewables and other distributed energy resources (DER) such as energy efficiency, demand response, and solar photovoltaics. Access to granular, near real time data can encourage innovation among DER providers, facilitating the development of applications like “virtual energy audits” that identify efficiency opportunities, allowing customers to reduce costs through time-of-use pricing, and enabling the optimization of photovoltaic systems to meet peak demand.
  • Energy efficiency receives the greatest boost from initiatives like the Green Button, with studies showing energy savings of up to 18 percent when customers have access to their meter data. In addition to improving energy conservation, access to meter data could improve the efficiency of appliances by allowing devices to trigger sleep modes in response to data on usage or price. However, at the time of writing, problems with data portability and interoperability were preventing these benefits from being realized, at a cost of tens of millions of dollars.
  • The authors recommend that commissions require utilities to make usage data available to customers or authorized third parties in standardized formats as part of basic utility service, and tariff data to developers for use in smart appliances.

MyData, Understanding Data Operators (2020)

  • MyData is a global movement of data users, activists and developers with a common goal to empower individuals with their personal data to enable them and their communities to develop knowledge, make informed decisions and interact more consciously and efficiently.
  • This introductory paper presents the state of knowledge about data operators, trusted data intermediaries that provide infrastructure for human-centric personal data management and governance, including data sharing and transfer. The operator model allows data controllers to outsource issues of legal compliance with data portability requirements, while offering individual users a transparent and intuitive way to manage the data transfer process.
  • The paper examines use cases from 48 “proto-operators” from 15 countries who fulfil some of the functions of an operator, albeit at an early level of maturity. The paper finds that operators offer management of identity authentication, data transaction permissions, connections between services, value exchange, data model management, personal data transfer and storage, governance support, and logging and accountability. At the heart of these functions is the need for minimum standards of data interoperability.
  • The paper reviews governance frameworks from the general (legislative) to the specific (operators), and explores proto-operator business models. In keeping with an emerging field, business models are currently unclear and potentially unsustainable, and one of a number of areas, including interoperability requirements and governance frameworks, that must still be developed.

National Science and Technology Council Smart Disclosure and Consumer Decision Making: Report of the Task Force on Smart Disclosure (2013)

  • This report summarizes the work and findings of the 2011–2013 Task Force on Smart Disclosure: Information and Efficiency, an interagency body tasked with advancing smart disclosure, through which data is made more available and accessible to both consumers and innovators.
  • The Task Force recognized the capacity of smart disclosure to inform consumer choices, empower them through access to useful personal data, enable the creation of new tools, products and services, and promote efficiency and growth. It reviewed federal efforts to promote smart disclosure within sectors and in data types that crosscut sectors, such as location data, consumer feedback, enforcement and compliance data and unique identifiers. It also surveyed specific public-private partnerships on access to data, such as the Blue and Green Button and MyData initiatives in health, energy and education respectively.
  • The Task Force reviewed steps taken by the Federal Government to implement smart disclosure, including adoption of machine readable formats and standards for metadata, use of APIs, and making data available in an unstructured format rather than not releasing it at all. It also reviewed “choice engines” making use of the data to provide services to consumers across a range of sectors.
  • The Task Force recommended that smart disclosure should be a core component of efforts to institutionalize and operationalize open data practices, with agencies proactively identifying, tagging, and planning the release of candidate data. It also recommended that this be supported by a government-wide community of practice.

Nicholas, Gabriel Taking It With You: Platform Barriers to Entry and the Limits of Data Portability (2020)

  • This paper considers whether, as is often claimed, data portability offers a genuine solution to the lack of competition within the tech sector.
  • It concludes that current regulatory approaches to data portability, which focus on reducing switching costs through technical solutions such as one-off exports and API interoperability, are not sufficient to generate increased competition. This is because they fail to address other barriers to entry, including network effects, unique data access, and economies of scale.
  • The author proposes an alternative approach, which he terms collective portability, which would allow groups of users to coordinate the transfer of their data to a new platform. This model raises questions about how such collectives would make decisions regarding portability, but would enable new entrants to successfully target specific user groups and scale rapidly without having to reach users one by one.

OECD, Enhancing Access to and Sharing of Data: Reconciling Risks and Benefits for Data Re-use across Societies (2019)

  • This background paper to a 2017 expert workshop on risks and benefits of data reuse considers data portability as one strategy within a data openness continuum that also includes open data, market-based B2B contractual agreements, and restricted data-sharing agreements within research and data for social good applications.
  • It considers four rationales offered for data portability. These include empowering individuals towards the “informational self-determination” aspired to by GDPR, increased competition within digital and other markets through reductions in information asymmetries between individuals and providers, switching costs, and barriers to market entry; and facilitating increased data flows.
  • The report highlights the need for both syntactic and semantic interoperability standards to ensure data can be reused across systems, both of which may be fostered by increased rights to data portability. Data intermediaries have an important role to play in the development of these standards, through initiatives like the Data Transfer Project, a collaboration which brought together Facebook, Google, Microsoft, and Twitter to create an open-source data portability platform.

Personal Data Protection Commission Singapore Response to Feedback on the Public Consultation on Proposed Data Portability and Data Innovation Provisions (2020)

  • The report summarizes the findings of the 2019 PDPC public consultation on proposals to introduce provisions on data portability and data innovation in Singapore’s Personal Data Protection Act.
  • The proposed provision would oblige organizations to transmit an individual’s data to another organization in a commonly used machine-readable format, upon the individual’s request. The obligation does not extend to data intermediaries or organizations that do not have a presence in Singapore, although data holders may choose to honor those requests.
  • The obligation would apply to electronic data that is either provided by the individual or generated by the individual’s activities in using the organization’s service or product, but not derived data created by the processing of other data by the data holder. Respondents were concerned that including derived data could harm organizations’ competitiveness.
  • Respondents were concerned about how to honour data portability requests where the data of third parties was involved, as in the case of a joint account holder, for example. The PDPC opted for a “balanced, reasonable, and pragmatic approach,” allowing data involving third parties to be ported where it was under the requesting individual’s control, was to be used for domestic and personal purposes, and related only to the organization’s product or service.

Quinn, Paul Is the GDPR and Its Right to Data Portability a Major Enabler of Citizen Science? (2018)

  • This article explores the potential of data portability to advance citizen science by enabling participants to port their personal data from one research project to another. Citizen science — the collection and contribution of large amounts of data by private individuals for scientific research — has grown rapidly in response to the development of new digital means to capture, store, organize, analyze and share data.
  • The GDPR right to data portability aids citizen science by requiring transfer of data in machine-readable format and allowing data subjects to request its transfer to another data controller. This requirement of interoperability does not amount to compatibility, however, and data thus transferred would probably still require cleaning to be usable, acting as a disincentive to reuse.
  • The GDPR’s limitation of transferability to personal data provided by the data subject excludes some forms of data that might possess significant scientific potential, such as secondary personal data derived from further processing or analysis.
  • The GDPR right to data portability also potentially limits citizen science by restricting the grounds for processing data to which the right applies to data obtained through a subject’s express consent or through the performance of a contract. This limitation excludes other forms of data processing described in the GDPR, such as data processing for preventive or occupational medicine, scientific research, or archiving for reasons of public or scientific interest. It is also not clear whether the GDPR compels data controllers to transfer data outside the European Union.

Wong, Janis and Tristan Henderson, How Portable is Portable? Exercising the GDPR’s Right to Data Portability (2018)

  • This paper presents the results of 230 real-world requests for data portability in order to assess how — and how well — the GDPR right to data portability is being implemented. The authors were interested in establishing the kinds of file formats that were returned in response to requests, and to identify practical difficulties encountered in making and interpreting requests, over a three month period beginning on the day the GDPR came into effect.
  • The findings revealed continuing problems around ensuring portability for both data controllers and data subjects. Of the 230 requests, only 163 were successfully completed.
  • Data controllers frequently had difficulty understanding the requirements of GDPR, providing data in incomplete or inappropriate formats: only 40 percent of the files supplied were in a fully compliant format. Additionally, some data controllers were confused between the right to data portability and other rights conferred by the GDPR, such as the right to access or erasure.

Selected Readings on AI for Development


By Dominik Baumann, Jeremy Pesner, Alexandra Shaw, Stefaan Verhulst, Michelle Winowatan, Andrew Young, Andrew J. Zahuranec

As part of an ongoing effort to build a knowledge base for the field of improving governance through technology, The GovLab publishes a series of Selected Readings, which provide an annotated and curated collection of recommended works on themes such as open data, data collaboration, and civic technology. 

In this edition, we explore selected literature on AI and Development. This piece was developed in the context of The GovLab’s collaboration with Agence Française de Développement (AFD) on the use of emerging technology for development. To suggest additional readings on this or any other topic, please email info@thelivinglib.org. All our Selected Readings can be found here.

Context: In recent years, public discourse on artificial intelligence (AI) has focused on its potential for improving the way businesses, governments, and societies make (automated) decisions. Simultaneously, several AI initiatives have raised concerns about human rights, including the possibility of discrimination and privacy breaches. Between these two opposing perspectives is a discussion on how stakeholders can maximize the benefits of AI for society while minimizing the risks that might arise from the use of this technology.

While the majority of AI initiatives today come from the private sector, international development actors increasingly experiment with AI-enabled programs. These initiatives focus on, for example, climate modelling, urban mobility, and disease transmission. These early efforts demonstrate the promise of AI for supporting more efficient, targeted, and impactful development efforts. Yet, the intersection of AI and development remains nascent, and questions remain regarding how this emerging technology can deliver on its promise while mitigating risks to intended beneficiaries.

Readings are listed in alphabetical order.

2030Vision. AI and the Sustainable Development Goals: the State of Play

  • In broad language, this document for 2030Vision assesses AI research and initiatives and the Sustainable Development Goals (SDGs) to determine gaps and potential that can be further explored or scaled. 
  • It specifically reviews the current applications of AI in two SDG sectors, food/agriculture and healthcare.
  • The paper recommends enhancing multi-sector collaboration among businesses, governments, civil society, academia and others to ensure technology can best address the world’s most pressing challenges.

Andersen, Lindsey. Artificial Intelligence in International Development: Avoiding Ethical Pitfalls. Journal of Public & International Affairs (2019). 

  • Investigating the ethical implications of AI in the international development sector, the author argues that the involvement of many different stakeholders and AI-technology providers results in ethical issues concerning fairness and inclusion, transparency, explainability and accountability, data limitations, and privacy and security.
  • The author recommends the information communication technology for development (ICT4D) community adopt the Principles for Digital Development to ensure the ethical implementation of AI in international development projects.
  • The Principles of Digital Development include: 1) design with the user; 2) understand the ecosystem; 3) design for scale; 4) build for sustainability; 5) be data driven; 6) use open standards, open data, open source, and open innovation; and 7) reuse and improve.

Arun, Chinmayi. AI and the Global South: Designing for Other Worlds in Markus D. Dubber, Frank Pasquale, and Sunit Das (eds.), The Oxford Handbook of Ethics of AI, Oxford University Press, Forthcoming (2019).

  • This chapter interrogates the impact of AI’s application in the Global South and raises concerns about such initiatives.
  • Arun argues AI’s deployment in the Global South may result in discrimination, bias, oppression, exclusion, and bad design. She further argues it can be especially harmful to vulnerable communities in places that do not have strong respect for human rights.
  • The paper concludes by outlining the international human rights laws that can mitigate these risks. It stresses the importance of a human rights-centric, inclusive, empowering context-driven approach in the use of AI in the Global South.

Best, Michael. Artificial Intelligence (AI) for Development Series: Module on AI, Ethics and Society. International Telecommunications Union (2018). 

  • This working paper is intended to help ICT policymakers or regulators consider the ethical challenges that emerge within AI applications.
  • The author identifies a four-pronged framework of analysis (risks, rewards, connections, and key questions to consider) that can guide policymaking in the fields of: 1) livelihood and work; 2) diversity, non-discrimination and freedoms from bias; 3) data privacy and minimization; and 4) peace and security.
  • The paper also includes a table of policies and initiatives undertaken by national governments and tech companies around AI, along with the set of values (mentioned above) explicitly considered.

International Development Innovation Alliance (2019). Artificial Intelligence and International Development: An Introduction

  • Results for Development, a nonprofit organization working in the international development sector, developed a report in collaboration with the AI and Development Working Group within the International Development Innovation Alliance (IDIA). The report provides a brief overview of AI and how this technology may impact the international development sector.
  • The report provides examples of AI-powered applications and initiatives that support the SDGs, including eradicating hunger, promoting gender equality, and encouraging climate action.
  • It also provides a collection of supporting resources and case studies for development practitioners interested in using AI.

Paul, Amy, Craig Jolley, and Aubra Anthony. Reflecting the Past, Shaping the Future: Making AI Work for International Development. United States Agency for International Development (2018). 

  • This report outlines the potential of machine learning (ML) and artificial intelligence in supporting development strategy. It also details some of the common risks that can arise from the use of these technologies.
  • The document contains examples of ML and AI applications to support the development sector and recommends good practices in handling such technologies. 
  • It concludes by recommending broad, shared governance, using fair and balanced data, and ensuring local population and development practitioners remain involved in it.

Pincet, Arnaud, Shu Okabe, and Martin Pawelczyk. Linking Aid to the Sustainable Development Goals – a machine learning approach. OECD Development Co-operation Working Papers (2019). 

  • The authors apply ML and semantic analysis to data sourced from the OECD’s Creditor Reporting System to map aid funding to particular SDGs.
  • The researchers find “Good Health and Well-Being” as the most targeted SDG, what the researchers call the “SDG darling.”
  • The authors find that mapping relationships between the system and SDGs can help to ensure equitable funding across different goals.

Quinn, John, Vanessa Frias-Martinez, and Lakshminarayan Subramanian. Computational Sustainability and Artificial Intelligence in the Developing World. Association for the Advancement of Artificial Intelligence (2014). 

  • These researchers suggest three different areas—health, food security, and transportation—in which AI applications can uniquely benefit the developing world. The researchers argue the lack of technological infrastructure in these regions make AI especially useful and valuable, as it can efficiently analyze data and provide solutions.
  • It provides some examples of application within the three themes, including disease surveillance, identification of drought and agricultural trends, modeling of commuting patterns, and traffic congestion monitoring.

Smith, Matthew and Sujaya Neupane. Artificial intelligence and human development: toward a research agenda (2018).

  • The authors highlight potential beneficial applications for AI in a development context, including healthcare, agriculture, governance, education, and economic productivity.
  • They also discuss the risks and downsides of AI, which include the “black boxing” of algorithms, bias in decision making, potential for extreme surveillance, undermining democracy, potential for job and tax revenue loss, vulnerability to cybercrime, and unequal wealth gains towards the already-rich.
  • They recommend further research projects on these topics that are interdisciplinary, locally conducted, and designed to support practice and policy.

Tomašev, Nenad, et al. AI for social good: unlocking the opportunity for positive impact. Nature Communications (2020).

  • This paper takes stock of what the authors term the AI for Social Good movement (AI4SG), which “aims to establish interdisciplinary partnerships centred around AI applications towards SDGs.”  
  • Developed at a multidisciplinary expert seminar on the topic, the authors present 10 recommendations for creating successful AI4SG collaborations: “1) Expectations of what is possible with AI need to be well grounded. 2) There is value in simple solutions. 3) Applications of AI need to be inclusive and accessible, and reviewed at every stage for ethics and human rights compliance. 4) Goals and use cases should be clear and well-defined. 5) Deep, long-term partnerships are required to solve large problem successfully. 6) Planning needs to align incentives, and factor in the limitations of both communities. 7) Establishing and maintaining trust is key to overcoming organisational barriers. 8) Options for reducing the development cost of AI solutions should be explored. 9) Improving data readiness is key. 10) Data must be processed securely, with utmost respect for human rights and privacy.”

Vinuesa, Ricardo, et al. The role of artificial intelligence in achieving the Sustainable Development Goals. 

  • This report analyzes how AI can meet both the demands of some SDGs and also inhibit progress toward others. It highlights a critical research gap about the extent to which AI impacts sustainable development in the medium and long term. 
  • Through his analysis, Vinuesa claims AI has the potential to positively impact the environment, society, and the economy. However, AI can hinder these groups.
  • The authors recognize that although AI enables efficiency and productivity, it can also increase inequality and hinder achievements of the 2030 Agenda. Vinuesa and his co-authors suggest adequate policy formation and regulation are needed to ensure fast and equitable development of AI technologies that can address the SDGs. 

United Nations Education, Scientific and Cultural Organization (UNESCO) (2019). Artificial intelligence for Sustainable Development: Synthesis Report, Mobile Learning Week 2019

  • In this report, UNESCO assesses the findings from Mobile Learning Week (MLW) 2019. The three main conclusions were: 1) the world is facing a learning crisis; 2) education drives sustainable development; and 3) sustainable development can only be achieved if we harness the potential of AI. 
  • Questions around four major themes dominated the MLW 2019 sessions: 1) how to guarantee inclusive and equitable use of AI in education; 2) how to harness AI to improve learning; 3) how to increase skills development; and 4) how to ensure transparent and auditable use of education data. 
  • To move forward, UNESCO advocates for more international cooperation and stakeholder involvement, creation of education and AI standards, and development of national policies to address educational gaps and risks. 

Index: Open Data


By Alexandra Shaw, Michelle Winowatan, Andrew Young, and Stefaan Verhulst

The Living Library Index – inspired by the Harper’s Index – provides important statistics and highlights global trends in governance innovation. This installment focuses on open data and was originally published in 2018.

Value and Impact

  • The projected year at which all 28+ EU member countries will have a fully operating open data portal: 2020

  • Between 2016 and 2020, the market size of open data in Europe is expected to increase by 36.9%, and reach this value by 2020: EUR 75.7 billion

Public Views on and Use of Open Government Data

  • Number of Americans who do not trust the federal government or social media sites to protect their data: Approximately 50%

  • Key findings from The Economist Intelligence Unit report on Open Government Data Demand:

    • Percentage of respondents who say the key reason why governments open up their data is to create greater trust between the government and citizens: 70%

    • Percentage of respondents who say OGD plays an important role in improving lives of citizens: 78%

    • Percentage of respondents who say OGD helps with daily decision making especially for transportation, education, environment: 53%

    • Percentage of respondents who cite lack of awareness about OGD and its potential use and benefits as the greatest barrier to usage: 50%

    • Percentage of respondents who say they lack access to usable and relevant data: 31%

    • Percentage of respondents who think they don’t have sufficient technical skills to use open government data: 25%

    • Percentage of respondents who feel the number of OGD apps available is insufficient, indicating an opportunity for app developers: 20%

    • Percentage of respondents who say OGD has the potential to generate economic value and new business opportunity: 61%

    • Percentage of respondents who say they don’t trust governments to keep data safe, protected, and anonymized: 19%

Efforts and Involvement

  • Time that’s passed since open government advocates convened to create a set of principles for open government data – the instance that started the open data government movement: 10 years

  • Countries participating in the Open Government Partnership today: 79 OGP participating countries and 20 subnational governments

  • Percentage of “open data readiness” in Europe according to European Data Portal: 72%

    • Open data readiness consists of four indicators which are presence of policy, national coordination, licensing norms, and use of data.

  • Number of U.S. cities with Open Data portals: 27

  • Number of governments who have adopted the International Open Data Charter: 62

  • Number of non-state organizations endorsing the International Open Data Charter: 57

  • Number of countries analyzed by the Open Data Index: 94

  • Number of Latin American countries that do not have open data portals as of 2017: 4 total – Belize, Guatemala, Honduras and Nicaragua

  • Number of cities participating in the Open Data Census: 39

Demand for Open Data

  • Open data demand measured by frequency of open government data use according to The Economist Intelligence Unit report:

    • Australia

      • Monthly: 15% of respondents

      • Quarterly: 22% of respondents

      • Annually: 10% of respondents

    • Finland

      • Monthly: 28% of respondents

      • Quarterly: 18% of respondents

      • Annually: 20% of respondents

    •  France

      • Monthly: 27% of respondents

      • Quarterly: 17% of respondents

      • Annually: 19% of respondents

        •  
    • India

      • Monthly: 29% of respondents

      • Quarterly: 20% of respondents

      • Annually: 10% of respondents

    • Singapore

      • Monthly: 28% of respondents

      • Quarterly: 15% of respondents

      • Annually: 17% of respondents 

    • UK

      • Monthly: 23% of respondents

      • Quarterly: 21% of respondents

      • Annually: 15% of respondents

    • US

      • Monthly: 16% of respondents

      • Quarterly: 15% of respondents

      • Annually: 20% of respondents

  • Number of FOIA requests received in the US for fiscal year 2017: 818,271

  • Number of FOIA request processed in the US for fiscal year 2017: 823,222

  • Distribution of FOIA requests in 2017 among top 5 agencies with highest number of request:

    • DHS: 45%

    • DOJ: 10%

    • NARA: 7%

    • DOD: 7%

    • HHS: 4%

Examining Datasets

  • Country with highest index score according to ODB Leaders Edition: Canada (76 out of 100)

  • Country with lowest index score according to ODB Leaders Edition: Sierra Leone (22 out of 100)

  • Number of datasets open in the top 30 governments according to ODB Leaders Edition: Fewer than 1 in 5

  • Average percentage of datasets that are open in the top 30 open data governments according to ODB Leaders Edition: 19%

  • Average percentage of datasets that are open in the top 30 open data governments according to ODB Leaders Edition by sector/subject:

    • Budget: 30%

    • Companies: 13%

    • Contracts: 27%

    • Crime: 17%

    • Education: 13%

    • Elections: 17%

    • Environment: 20%

    • Health: 17%

    • Land: 7%

    • Legislation: 13%

    • Maps: 20%

    • Spending: 13%

    • Statistics: 27%

    • Trade: 23%

    • Transport: 30%

  • Percentage of countries that release data on government spending according to ODB Leaders Edition: 13%

  • Percentage of government data that is updated at regular intervals according to ODB Leaders Edition: 74%

  • Number of datasets available through:

  • Number of datasets classed as “open” in 94 places worldwide analyzed by the Open Data Index: 11%

  • Percentage of open datasets in the Caribbean, according to Open Data Census: 7%

  • Number of companies whose data is available through OpenCorporates: 158,589,950

City Open Data

  • New York City

  • Singapore

    • Number of datasets published in Singapore: 1,480

    • Percentage of datasets with standardized format: 35%

    • Percentage of datasets made as raw as possible: 25%

  • Barcelona

    • Number of datasets published in Barcelona: 443

    • Open data demand in Barcelona measured by:

      • Number of unique sessions in the month of September 2018: 5,401

    • Quality of datasets published in Barcelona according to Tim Berners Lee 5-star Open Data: 3 stars

  • London

    • Number of datasets published in London: 762

    • Number of data requests since October 2014: 325

  • Bandung

    • Number of datasets published in Bandung: 1,417

  • Buenos Aires

    • Number of datasets published in Buenos Aires: 216

  • Dubai

    • Number of datasets published in Dubai: 267

  • Melbourne

    • Number of datasets published in Melbourne: 199

Sources

  • About OGP, Open Government Partnership. 2018.  

The GovLab Selected Readings on Blockchain Technologies and the Governance of Extractives


Curation by Andrew Young, Anders Pedersen, and Stefaan G. Verhulst

Readings developed together with NRGI, within the context of our joint project on Blockchain technologies and the Governance of Extractives. Thanks to Joyce Zhang and Michelle Winowatan for research support.

We need your help! Please share any additional readings on the use of Blockchain Technologies in the Extractives Sector with blockchange@thegovlab.org.  

Introduction

By providing new ways to securely identify individuals and organizations, and record transactions of various types in a distributed manner, blockchain technologies have been heralded as a new tool to address information asymmetries, establish trust and improve governance – particularly around the extraction of oil, gas and other natural resources. At the same time, blockchain technologies are been experimented with to optimize certain parts of the extractives value chain – potentially decreasing transparency and accountability while making governance harder to implement.

Across the expansive and complex extractives sector, blockchain technologies are believed to have particular potential for improving governance in three key areas:  

  • Beneficial ownership and illicit flows screening: The identity of those who benefit, through ownership, from companies that extract natural resources is often hidden – potentially contributing to tax evasion, challenges to global sanction regimes, corruption and money laundering.
  • Land registration, licensing and contracting transparency: To ensure companies extract resources responsibly and comply with rules and fee requirements, effective governance and a process to determine who has the rights to extract natural resources, under what conditions, and who is entitled to the land is essential.
  • Commodity trading and supply chain transparency: The commodity trading sector is facing substantive challenges in assessing and verifying the authenticity of for example oil trades. Costly time is spent by commodity traders reviewing documentation of often poor quality. The expectation of the sector is firstly to eliminate time spent verifying the authenticity of traded goods and secondly to reduce the risk premium on trades. Transactions from resources and commodities trades are often opaque and secretive, allowing for governments and companies to conceal how much money they receive from trading, and leading to corruption and evasion of taxation.

In the below we provide a selection of the nascent but growing literature on Blockchain Technologies and Extractives across six categories:

Selected Readings 

Blockchain Technologies and Extractives – Promise and Current Potential

Adams, Richard, Beth Kewell, Glenn Parry. “Blockchain for Good? Digital Ledger Technology and Sustainable Development Goals.” Handbook of Sustainability and Social Science Research. October 27, 2017.

  • This chapter in the Handbook of Sustainability and Social Science Research seeks to reflect and explore the different ways Blockchain for Good (B4G) projects can provide social and environmental benefits under the UN’s Sustainable Goals framework
  • The authors describe the main categories in which blockchain can achieve social impact: mining/consensus algorithms that reward good behavior, benefits linked to currency use in the form of “colored coins,” innovations in supply chain, innovations in government, enabling the sharing economy, and fostering financial inclusion.
  • The chapter concludes that with B4G there is also inevitably “Blockchain for Bad.” There is already critique and failures of DLTs such as the DAO, and more research must be done to identify whether DLTs can provide a more decentralized, egalitarian society, or if they will ultimately be another tool for control and surveillance by organizations and government.

Cullinane, Bernadette, and Randy Wilson. “Transforming the Oil and Gas Industry through Blockchain.” Official Journal of the Australian Institute of Energy News, p 9-10, December 2017.

  • In this article, Cullinane and Wilson explore blockchain’s application in the oil and gas industry “presents a particularly compelling opportunity…due to the high transactional values, associated risks and relentless pressure to reduce costs.”
  • The authors elaborate four areas where blockchain can benefit play a role in transforming the oil and gas industry:
    • Supply chain management
    • Smart contracts
    • Record management
    • Cross-border payments

Da Silva, Filipe M., and Ankita Jaitly. “Blockchain in Natural Resources: Hedging Against Volatile Prices.” Tata Consultancy Services Ltd., 2018.

  • The authors of this white paper assess the readiness of natural resources industries for blockchain technology application, identify areas where blockchain can add value, and outline a strategic plan for its adoption.
  • In particular, they highlight the potential for blockchain in the oil and gas industry to simplify payments, where for example, gas can be delivered directly to consumer homes using a blockchain smart contracting application.

Halford-Thompson, Guy. “Powered by Blockchain: Reinventing Information Management in the Energy Space.” BTL, May 12, 2017.

  • According to Halford-Thompson, “oil and gas companies are exploring blockchain’s promise to revamp inefficient internal processes and achieve significant reductions in operating costs through the automation of record keeping and messaging, the digitization of the supply chain information flow, and the elimination of reconciliation, among many other data management use cases.”
  • The data reconciliation process, for one, is complex and can require significant time for completion. Blockchain technology could not only remove the need for some steps in the information reconciliation process, but also eliminate the need for reconciliation altogether in some instances.

Blockchain Technologies and the Governance of Extractives

(See also: Selected Readings of Blockchain Technologies and its Potential to Transform Governance)

Koeppen, Mark, David Shrier, and Morgan Bazilian. “Is Blockchain’s Future in Oil and Gas Transformative Or Transient? Deloitte, 2017.

  • In this report, the authors propose four areas that blockchain can improve for the oil and gas industry, which are:
    • Transparency and compliance: Employment of blockchain is predicted to significantly reduce cost related to compliance, since it securely makes information available to all parties involved in the supply chain.
    • Cyber threats and security: The industry faces constant digital security threat and blockchain provides a solution to address this issue.
    • Mid-volume trading/third party impacts: They argue that the “boundaries between asset classes will blur as cash, energy products and other commodities, from industrial components to apples could all become digital assets trading interoperably.”
    • Smart contract: Since the “sheer size and volume of contracts and transactions to execute capital projects in oil and gas have historically caused significant reconciliation and tracking issues among contractors, sub-contractors, and suppliers,” blockchain-enabled smart contracts could improve the process by executing automatically after all requirements are met, and boosting contract efficiency and protecting each party from volatile pricing.

Mawet, Pierre, and Michael Insogna. “Unlocking the Potential of Blockchain in Oil and Gas Supply Chains.” Accenture Energy Blog, November 21, 2016.

  • The authors propose three ways blockchain technology can boost productivity and efficiency in oil and gas industry:
    • “Greater process efficiency. Smart contracts, for example, can be held in a blockchain transaction with party compliance confirmed through follow-on transactions, reducing third-party supervision and paper-based contracting, thus helping reduce cost and overhead.”
    • “Compliance. Visibility is essential to improve supply chain performance. The immutable record of transactions can aid in product traceability and asset tracking.”
    • “Data transfer from IoT sensors. Blockchain could be used to track the unique history of a device, with the distributed ledger recording data transfer from multiple sensors. Data security in devices could be safeguarded by unique blockchain characteristics.”

Som, Indranil. “Blockchain: Radically Changing the Mining Paradigm.” Digitalist, September 27, 2017.

  • In this article, Som proposes three ways that the blockchain technology can “support leaner organizations and increased security” in the mining industry: improving cybersecurity, increasing transparency through smart contracts, and providing visibility into the supply chain.

Identity: Beneficial Ownership and Illicit Flows

(See also: Selected Readings on Blockchain Technologies and Identity).

de Jong, Julia, Alexander Meyer, and Jeffrey Owens. “Using blockchain for transparent beneficial ownership registers. International Tax Review, June 2017.

  • This paper discusses the features of blockchain and distributed ledger technology that can improve collection and distribution of information on beneficial ownership.
  • The FATF and OECD Global Forum regimes have identified a number of common problems related to beneficial ownership information across all jurisdictions, including:
    • “Insufficient accuracy and accessibility of company identification and ownership information;
    • Less rigorous implementation of customer due-diligence (CDD) measures by key gatekeepers such as lawyers, accountants, and trust and company service providers; and
    • Obstacles to information sharing such as data protection and privacy laws, which impede competent authorities from receiving timely access to adequate, accurate and up-to-date information on basic legal and beneficial ownership.”
  • The authors argue that the transparency, immutability, and security offered by blockchain makes it ideally suited for record-keeping, particularly with regards to the ownership of assets. Thus, blockchain can address many of the shortcomings in the current system as identified by the FATF and the OECD.
  • They go on to suggest that a global registry of beneficial ownership using blockchain technology would offer the following benefits:
    • Ensuring real-time accuracy and verification of ownership information
    • Increasing security and control over sensitive personal and commercial information
    • Enhancing audit transparency
    • Creating the potential for globally-linked registries
    • Reducing corruption and fraud, and increasing trust
    • Reducing compliance burden for regulate entities

Herian, Robert. “Trusteeship in a Post-Trust World: Property, Trusts Law and the Blockchain.” The Open University, 2016.

  • This working paper discusses the often overlooked topic of trusteeship and trusts law and the implications of blockchain technology in the space. 
  • “Smart trusts” on the blockchain will distribute trusteeship across a network and, in theory, remove the need for continuous human intervention in trust fund investments thus resolving key issues around accountability and the potential for any breach of trust.
  • Smart trusts can also increase efficiency and security of transactions, which could improve the overall performance of the investment strategy, thereby creating higher returns for beneficiaries.

Karsten, Jack and Darrell M. West (2018): “Venezuela’s “petro” undermines other cryptocurrencies – and international sanctions.” Brookings, Friday, March 9 2018,

  • This article discusses the Venezuelan government’s cryptocurrency, “petro,” which was launched as a solution to the country’s economic crisis and near-worthless currency, “bolívar”
  • Unlike the volatility of other cryptocurrencies such as Bitcoin and Litecoin, one petro’s price is pegged to the price of one barrel of Venezuelan oil – roughly $60
  • And rather than decentralizing control like most blockchain applications, the petro is subject to arbitrary discount factor adjustment, fluctuating oil prices, and a corrupt government known for manipulating its currency
  • The authors warn the petro will not stabilize the Venezuelan economy since only foreign investors funded the presale, yet (from the White Paper) only Venezuelan citizens can use the cryptocurrency to pay taxes, fees, and other expenses. Rather, they argue, the petro represents an attempt to create foreign capital out of “thin air,” which is not subject to traditional economic sanctions.  

Land Registration, Licensing and Contracting Transparency

Michael Graglia and Christopher Mellon. “Blockchain and Property in 2018: At the End of the Beginning.” 2018 World Bank Conference on Land and Poverty, March 19-23, 2018.

  • This paper claims “blockchain makes sense for real estate” because real estate transactions depend on a number of relationships, processes, and intermediaries that must reconcile all transactions and documents for an action to occur. Blockchain and smart contracts can reduce the time and cost of transactions while ensuring secure and transparent record-keeping systems.
  • The ease, efficiency, and security of transactions can also create an “international market for small real estate” in which individuals who cannot afford an entire plot of land can invest small amounts and receive their portion of rental payments automatically through smart contracts.
  • The authors describe seven prerequisites that land registries must fulfill before blockchain can be introduced successfully: accurate data, digitized records, an identity solution, multi-sig wallets, a private or hybrid blockchain, connectivity and a tech aware population, and a trained professional community
  • To achieve the goal of an efficient and secure property registry, the authors propose an 8-level progressive framework through which registries slowly integrate blockchain due to legal complexity of land administration, resulting inertia of existing processes, and high implementation costs.  
    • Level 0 – No Integration
    • Level 1 – Blockchain Recording
    • Level 2 – Smart Workflow
    • Level 3 – Smart Escrow
    • Level 4 – Blockchain Registry
    • Level 5 – Disaggregated Rights
    • Level 6 – Fractional Rights
    • Level 7 – Peer-to-Peer Transactions
    • Level 8 – Interoperability

Thomas, Rod. “Blockchain’s Incompatibility for Use as a Land Registry: Issues of Definition, Feasibility and Risk. European Property Law Journal, vol. 6, no. 3, May 2017.

  • Thomas argues that blockchain, as it is currently understood and defined, is unsuited for the transfer of real property rights because it fails to address the need for independent verification and control.
  • Under a blockchain-based system, coin holders would be in complete control of the recordation of the title interests of their land, and thus, it would be unlikely that they would report competing or contested claims.
  • Since land remains in the public domain, the risk of third party possessory title claims are likely to occur; and over time, these risks will only increase exponentially.
  • A blockchain-based land title represents interlinking and sequential transactions over many hundreds, if not thousands, of years, so given the misinformation that would compound over time, it would be difficult to trust the current title holder has a correctly recorded title
  • The author concludes that supporters of blockchain for land registries frequently overlook a registry’s primary function to provide an independent verification of the provenance of stored data.

Vos, Jacob, Christiaan Lemmen, and Bert Beentjes. “Blockchain-Based Land Registry: Panacea, Illusion or Something In Between? 2017 World Bank Conference on Land and Poverty, March 20-24, 2017.

  • The authors propose that blockchain is best suited for the following steps in land administration:
    • The issuance of titles
    • The archiving of transactions – specifically in countries that do not have a reliable electronic system of transfer of ownership
  • The step in between issuing titles and archiving transactions is the most complex – the registration of the transaction. This step includes complex relationships between the “triple” of land administration: rights (right in rem and/or personal rights), object (spatial unit), and subject (title holder). For the most part, this step is done manually by registrars, and it is questionable whether blockchain technology, in the form of smart contracts, will be able to process these complex transactions.
  • The authors conclude that one should not underestimate the complexity of the legal system related to land administration. The standardization of processes may be the threshold to success of blockchain-based land administration. The authors suggest instead of seeking to eliminate one party from the process, technologists should cooperate with legal and geodetic professionals to create a system of checks and balances to successfully implement blockchain for land administration.  
  • This paper also outlines five blockchain-based land administration projects launched in Ghana, Honduras, Sweden, Georgia, and Cook County, Illinois.

Commodity Trading and Supply Chain Transparency

Ahmed, Shabir. “Leveraging Blockchain to Revolutionise the Mining Industry.” SAP News, February 27, 2018.

  • In this article, Ahmed identifies seven key use cases for blockchain in the mining industry:
    • Automation of ore acquisition and transfer;
    • Automatic registration of mineral rights and IP;
    • Visibility of ore inventory at ports;
    • Automatic cargo hire process;
    • Process and secure large amounts of IoT data;
    • Reconciling amount produced and sent for processing;
    • Automatically execute procurement and other contracts.

Brooks, Michael. “Blockchain and the Fight Against Illicit Financial Flows.” The Policy Corner, February 19, 2018.

  • In this article, Brooks argues that, “Because of the inherent decentralization and immutability of data within blockchains, it offers a unique opportunity to bypass traditional tracking and transparency initiatives that require strong central governance and low levels of corruption. It could, to a significant extent, bypass the persistent issues of authority and corruption by democratizing information around data consensus, rather than official channels and occasional studies based off limited and often manipulated information. Within the framework of a coherent policy initiative that integrates all relevant stakeholders (states, transnational organizations, businesses, NGOs, other monitors and oversight bodies), a international supply chains supported by blockchain would decrease the ease with which resources can be hidden, numbers altered, and trade misinvoiced.”

Conflict Free Natural Resources.” Global Opportunity Report 2017. Global Opportunity Network, 2017.

  • In this entry from the Global Opportunity Report, and specifically toward the end of ensuring conflict-free natural resources, Blockchain is labeled as “well-suited for tracking objects and transactions, making it possible for virtually anything of value to be traced. This opportunity is about creating transparency and product traceability in supply chains.

Blockchain for Traceability in Minerals and Metals Supply Chains: Opportunities and Challenges.” RCS Global and ICMM, 2017.

  • This report is based on insights generated during the Materials Stewardship Round Table on the potential of BCTs for tracking and tracing metals and minerals supply chains, which subsequently informed an RCS Global research initiative on the topic.
  • Insight into two key areas is increasingly desired by downstream manufacturing companies from upstream producers of metals and minerals: provenance and production methods
  • In particular, the report offers five key potential advantages of using Blockchain for mineral and metal supply chain activities:
    • “Builds consensus and trust around responsible production standards between downstream and upstream companies.
    • The immutability of and decentralized control over a blockchain system minimizes the risk of fraud.
    • Defined datasets can be made accessible in real time to any third party, including downstream buyers, auditors, investors, etc. but at the same time encrypted so as to share a proof of fact rather than confidential information.
    • A blockchain system can be easily scaled to include other producers and supply chains beyond those initially involved.
    • Cost reduction due to the paperless nature of a blockchain-enabled CoC [Chain of Custody] system, the potential reduction of audits, and reduction in transaction costs.”

Van Bockstael, Steve. “The emergence of conflict-free, ethical, and Fair Trade mineral supply chain certification systems: A brief introduction.” The Extractives Industries and Society, vol. 5, issue 1, January 2018.

  • This introduction to a special section considers the emerging field of “‘conflict-free’, ‘fair’ and ‘transparently sourced and traded’ minerals” in global industry supply chains.
  • Van Bockstael describes three areas of practice aimed at increasing supply chain transparency:
    • “Initiatives that explicitly try to sever the links between mining or minerals trading and armed conflict of the funding thereof.”
    • “Initiatives, limited in number yet growing, that are explicitly linked to the internationally recognized ‘Fair Trade’ movement and whose aim it is to source artisanally mined minerals for the Western jewellry industry.”
    • “Initiatives that aim to provide consumers or consumer-facing industries with more ethical, transparent and fair supply chains (often using those concepts in fuzzy and interchangeable ways) that are not linked to the established Fair Trade movement” – including, among others, initiatives using Blockchain technology “to create tamper-proof supply chains.”

Global Governance, Standards and Disclosure Practices

Lafarre, Anne and Christoph Van der Elst. “Blockchain Technology for Corporate Governance and Shareholder Activism.” European Corporate Governance Institute (ECGI) – Law Working Paper No. 390/2018, March 8, 2018.

  • This working paper focuses on the potential benefits of leveraging Blockchain during functions involving shareholder and company decision making. Lafarre and Van der Elst argue that “Blockchain technology can lower shareholder voting costs and the organization costs for companies substantially. Moreover, blockchain technology can increase the speed of decision-making, facilitate fast and efficient involvement of shareholders.”
  • The authors argue that in the field of corporate governance, Blockchain offers two important elements: “transparency – via the verifiable way of recording transactions – and trust – via the immutability of these transactions.”
  • Smart contracting, in particular, is seen as a potential avenue for facilitating the ‘agency relationship’ between board members and the shareholders they represent in corporate decision-making processes.

Myung, San Jun. “Blockchain government – a next for of infrastructure for the twenty-first century.” Journal of Open Innovation: Technology, Market, and Complexity, December 2018.

  • This paper argues the idea that Blockchain represents a new form of infrastructure that, given its core consensus mechanism, could replace existing social apparatuses including bureaucracy.
  • Indeed, Myung argues that blockchain and bureaucracy share a number of attributes:
    • “First, both of them are defined by the rules and execute predetermined rules.
    • Second, both of them work as information processing machines for society.
    • Third, both of them work as trust machines for society.”  
  • The piece concludes with five principles for replacing bureaucracy with blockchain for social organization: “1) introducing Blockchain Statute law; 2) transparent disclosure of data and source code; 3) implementing autonomous executing administration; 4) building a governance system based on direct democracy; and 5) making Distributed Autonomous Government (DAG).  

Peters, Gareth and Vishnia, Guy (2016): “Blockchain Architectures for Electronic Exchange Reporting Requirements: EMIR, Dodd Frank, MiFID I/II, MiFIR, REMIT, Reg NMS and T2S.” University College London, August 31, 2016.

  • This paper offers a solution based on blockchain architectures to the regulations of financial exchanges around the world for trade processing and reporting for execution and clearing. In particular, the authors give a detailed overview of EMIR, Dodd Frank, MiFID I/II, MiFIR, REMIT, Reg NMS and T2S.
  • The authors suggest the increasing amount of data from transaction reporting start to be incorporated on a blockchain ledger in order to harness the built-in security and immutability features of the blockchain to support key regulatory features.
  • Specifically, the authors suggest 1) a permissioned blockchain controlled by a regulator or a consortium of market participants for the maintenance of identity data from market participants and 2) blockchain frameworks such as Enigma to be used to facilitate required transparency and reporting aspects related to identities when performing pre- and post-trade reporting as well as for auditing.

Blockchain Technology and Competition Policy – Issues paper by the Secretariat,” OECD, June 8, 2018.

  • This OECD issues paper poses two key questions about how blockchain technology might increase the relevance of new disclosures practices:
    • “Should competition agencies be given permission to access blockchains? This might enable them to monitor trading prices in real-time, spot suspicious trends, and, when investigating a merger, conduct or market have immediate access to the necessary data without needing to impose burdensome information requests on parties.”
    • “Similarly, easy access to the information on a blockchain for a firm’s owners and head offices would potentially improve the effectiveness of its oversight on its own subsidiaries and foreign holdings. Competition agencies may assume such oversight already exists, but by making it easier and cheaper, a blockchain might make it more effective, which might allow for more effective centralised compliance programmes.”

Michael Pisa and Matt Juden. “Blockchain and Economic Development: Hype vs. Reality.” Center for Global Development Policy Paper, 2017.

  • In this Center for Global Development Policy Paper, the authors examine blockchain’s potential to address four major development challenges: (1) facilitating faster and cheaper international payments, (2) providing a secure digital infrastructure for verifying identity, (3) securing property rights, and (4) making aid disbursement more secure and transparent.
  • The authors conclude that while blockchain may be well suited for certain use cases, the majority of constraints in blockchain-based projects fall outside the scope of technology. Common constraints such as data collection and privacy, governance, and operational resiliency must be addressed before blockchain can be successfully implemented as a solution.

Industry-Specific Case Studies

Chohan, Usman. “Blockchain and the Extractive Industries: Cobalt Case Study,” University of New South Wales, Canberra Discussion Paper Series: Notes on the 21st Century, 2018.

  • In this discussion paper, the author studies the pilot use of blockchain in cobalt mining industry in the Democratic Republic of Congo (DRC). The project tracked the movement of cobalt from artisanal mines through its installation in devices such as smartphones and electric cars.
  • The project records cobalt attributes – weights, dates, times, images, etc. – into the digital ledger to help ensure that cobalt purchases are not contributing to forced child labor or conflict minerals. 

Chohan, Usman. “Blockchain and the Extractive Industries #2: Diamonds Case Study,” University of New South Wales, Canberra Discussion Paper Series: Notes on the 21st Century, 2018.

  • The second case study from Chohan investigates the application of blockchain technology in the extractive industry by studying Anglo-American (AAL) diamond DeBeer’s unit and Everledger’s blockchain projects. 
  • In this study, the author finds that AAL uses blockchain to track gems (carat, color, certificate numbers), starting from extraction and onwards, including when the gems change hands in trade transaction.
  • Like the cobalt pilot, the AAL initiative aims to help avoid supporting conflicts and forced labor, and to improve trading accountability and transparency more generally.

Selected Readings on Data Responsibility, Refugees and Migration


By Kezia Paladina, Alexandra Shaw, Michelle Winowatan, Stefaan Verhulst, and Andrew Young

The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of Data Collaboration for Migration was originally published in 2018.

Special thanks to Paul Currion whose data responsibility literature review gave us a headstart when developing the below. (Check out his article listed below on Refugee Identity)

The collection below is also meant to complement our article in the Stanford Social Innovation Review on Data Collaboration for Migration where we emphasize the need for a Data Responsibility Framework moving forward.

From climate change to politics to finance, there is growing recognition that some of the most intractable problems of our era are information problems. In recent years, the ongoing refugee crisis has increased the call for new data-driven approaches to address the many challenges and opportunities arising from migration. While data – including data from the private sector – holds significant potential value for informing analysis and targeted international and humanitarian response to (forced) migration, decision-makers often lack an actionable understanding of if, when and how data could be collected, processed, stored, analyzed, used, and shared in a responsible manner.

Data responsibility – including the responsibility to protect data and shield its subjects from harms, and the responsibility to leverage and share data when it can provide public value – is an emerging field seeking to go beyond just privacy concerns. The forced migration arena has a number of particularly important issues impacting responsible data approaches, including the risks of leveraging data regarding individuals fleeing a hostile or repressive government.

In this edition of the GovLab’s Selected Readings series, we examine the emerging literature on the data responsibility approaches in the refugee and forced migration space – part of an ongoing series focused on Data Responsibiltiy. The below reading list features annotated readings related to the Policy and Practice of data responsibility for refugees, and the specific responsibility challenges regarding Identity and Biometrics.

Data Responsibility and Refugees – Policy and Practice

International Organization for Migration (IOM) (2010) IOM Data Protection Manual. Geneva: IOM.

  • This IOM manual includes 13 data protection principles related to the following activities: lawful and fair collection, specified and legitimate purpose, data quality, consent, transfer to third parties, confidentiality, access and transparency, data security, retention and personal data, application of the principles, ownership of personal data, oversight, compliance and internal remedies (and exceptions).
  • For each principle, the IOM manual features targeted data protection guidelines, and templates and checklists are included to help foster practical application.

Norwegian Refugee Council (NRC) Internal Displacement Monitoring Centre / OCHA (eds.) (2008) Guidance on Profiling Internally Displaced Persons. Geneva: Inter-Agency Standing Committee.

  • This NRC document contains guidelines on gathering better data on Internally Displaced Persons (IDPs), based on country context.
  • IDP profile is defined as number of displaced persons, location, causes of displacement, patterns of displacement, and humanitarian needs among others.
  • It further states that collecting IDPs data is challenging and the current condition of IDPs data are hampering assistance programs.
  • Chapter I of the document explores the rationale for IDP profiling. Chapter II describes the who aspect of profiling: who IDPs are and common pitfalls in distinguishing them from other population groups. Chapter III describes the different methodologies that can be used in different contexts and suggesting some of the advantages and disadvantages of each, what kind of information is needed and when it is appropriate to profile.

United Nations High Commissioner for Refugees (UNHCR). Model agreement on the sharing of personal data with Governments in the context of hand-over of the refugee status determination process. Geneva: UNHCR.

  • This document from UNHCR provides a template of agreement guiding the sharing of data between a national government and UNHCR. The model agreement’s guidance is aimed at protecting the privacy and confidentiality of individual data while promoting improvements to service delivery for refugees.

United Nations High Commissioner for Refugees (UNHCR) (2015). Policy on the Protection of Personal Data of Persons of Concern to UNHCR. Geneva: UNHCR.

  • This policy outlines the rules and principles regarding the processing of personal data of persons engaged by UNHCR with the purpose of ensuring that the practice is consistent with UNGA’s regulation of computerized personal data files that was established to protect individuals’ data and privacy.
  • UNHCR require its personnel to apply the following principles when processing personal data: (i) Legitimate and fair processing (ii) Purpose specification (iii) Necessity and proportionality (iv) Accuracy (v) Respect for the rights of the data subject (vi) Confidentiality (vii) Security (viii) Accountability and supervision.

United Nations High Commissioner for Refugees (UNHCR) (2015) Privacy Impact Assessment of UNHCR Cash Based Interventions.

  • This impact assessment focuses on privacy issues related to financial assistance for refugees in the form of cash transfers. For international organizations like UNHCR to determine eligibility for cash assistance, data “aggregation, profiling, and social sorting techniques,” are often needed, leading a need for a responsible data approach.
  • This Privacy Impact Assessment (PIA) aims to identify the privacy risks posed by their program and seek to enhance safeguards that can mitigate those risks.
  • Key issues raised in the PIA involves the challenge of ensuring that individuals’ data will not be used for purposes other than those initially specified.

Data Responsibility in Identity and Biometrics

Bohlin, A. (2008) “Protection at the Cost of Privacy? A Study of the Biometric Registration of Refugees.” Lund: Faculty of Law of the University of Lund.

  • This 2008 study focuses on the systematic biometric registration of refugees conducted by UNHCR in refugee camps around the world, to understand whether enhancing the registration mechanism of refugees contributes to their protection and guarantee of human rights, or whether refugee registration exposes people to invasions of privacy.
  • Bohlin found that, at the time, UNHCR failed to put a proper safeguards in the case of data dissemination, exposing the refugees data to the risk of being misused. She goes on to suggest data protection regulations that could be put in place in order to protect refugees’ privacy.

Currion, Paul. (2018) “The Refugee Identity.” Medium.

  • Developed as part of a DFID-funded initiative, this essay considers Data Requirements for Service Delivery within Refugee Camps, with a particular focus on refugee identity.
  • Among other findings, Currion finds that since “the digitisation of aid has already begun…aid agencies must therefore pay more attention to the way in which identity systems affect the lives and livelihoods of the forcibly displaced, both positively and negatively.”
  • Currion argues that a Responsible Data approach, as opposed to a process defined by a Data Minimization principle, provides “useful guidelines,” but notes that data responsibility “still needs to be translated into organisational policy, then into institutional processes, and finally into operational practice.”

Farraj, A. (2010) “Refugees and the Biometric Future: The Impact of Biometrics on Refugees and Asylum Seekers.” Colum. Hum. Rts. L. Rev. 42 (2010): 891.

  • This article argues that biometrics help refugees and asylum seekers establish their identity, which is important for ensuring the protection of their rights and service delivery.
  • However, Farraj also describes several risks related to biometrics, such as, misidentification and misuse of data, leading to a need for proper approaches for the collection, storage, and utilization of the biometric information by government, international organizations, or other parties.  

GSMA (2017) Landscape Report: Mobile Money, Humanitarian Cash Transfers and Displaced Populations. London: GSMA.

  • This paper from GSMA seeks to evaluate how mobile technology can be helpful in refugee registration, cross-organizational data sharing, and service delivery processes.
  • One of its assessments is that the use of mobile money in a humanitarian context depends on the supporting regulatory environment that contributes to unlocking the true potential of mobile money. The examples include extension of SIM dormancy period to anticipate infrequent cash disbursements, ensuring that persons without identification are able to use the mobile money services, and so on.
  • Additionally, GMSA argues that mobile money will be most successful when there is an ecosystem to support other financial services such as remittances, airtime top-ups, savings, and bill payments. These services will be especially helpful in including displaced populations in development.

GSMA (2017) Refugees and Identity: Considerations for mobile-enabled registration and aid delivery. London: GSMA.

  • This paper emphasizes the importance of registration in the context of humanitarian emergency, because being registered and having a document that proves this registration is key in acquiring services and assistance.
  • Studying cases of Kenya and Iraq, the report concludes by providing three recommendations to improve mobile data collection and registration processes: 1) establish more flexible KYC for mobile money because where refugees are not able to meet existing requirements; 2) encourage interoperability and data sharing to avoid fragmented and duplicative registration management; and 3) build partnership and collaboration among governments, humanitarian organizations, and multinational corporations.

Jacobsen, Katja Lindskov (2015) “Experimentation in Humanitarian Locations: UNHCR and Biometric Registration of Afghan Refugees.” Security Dialogue, Vol 46 No. 2: 144–164.

  • In this article, Jacobsen studies the biometric registration of Afghan refugees, and considers how “humanitarian refugee biometrics produces digital refugees at risk of exposure to new forms of intrusion and insecurity.”

Jacobsen, Katja Lindskov (2017) “On Humanitarian Refugee Biometrics and New Forms of Intervention.” Journal of Intervention and Statebuilding, 1–23.

  • This article traces the evolution of the use of biometrics at the Office of the United Nations High Commissioner for Refugees (UNHCR) – moving from a few early pilot projects (in the early-to-mid-2000s) to the emergence of a policy in which biometric registration is considered a ‘strategic decision’.

Manby, Bronwen (2016) “Identification in the Context of Forced Displacement.” Washington DC: World Bank Group. Accessed August 21, 2017.

  • In this paper, Bronwen describes the consequences of not having an identity in a situation of forced displacement. It prevents displaced population from getting various services and creates higher chance of exploitation. It also lowers the effectiveness of humanitarian actions, as lacking identity prevents humanitarian organizations from delivering their services to the displaced populations.
  • Lack of identity can be both the consequence and and cause of forced displacement. People who have no identity can be considered illegal and risk being deported. At the same time, conflicts that lead to displacement can also result in loss of ID during travel.
  • The paper identifies different stakeholders and their interest in the case of identity and forced displacement, and finds that the biggest challenge for providing identity to refugees is the politics of identification and nationality.
  • Manby concludes that in order to address this challenge, there needs to be more effective coordination among governments, international organizations, and the private sector to come up with an alternative of providing identification and services to the displaced persons. She also argues that it is essential to ensure that national identification becomes a universal practice for states.

McClure, D. and Menchi, B. (2015). Challenges and the State of Play of Interoperability in Cash Transfer Programming. Geneva: UNHCR/World Vision International.

  • This report reviews the elements that contribute to the interoperability design for Cash Transfer Programming (CTP). The design framework offered here maps out these various features and also looks at the state of the problem and the state of play through a variety of use cases.
  • The study considers the current state of play and provides insights about the ways to address the multi-dimensionality of interoperability measures in increasingly complex ecosystems.     

NRC / International Human Rights Clinic (2016). Securing Status: Syrian refugees and the documentation of legal status, identity, and family relationships in Jordan.

  • This report examines Syrian refugees’ attempts to obtain identity cards and other forms of legally recognized documentation (mainly, Ministry of Interior Service Cards, or “new MoI cards”) in Jordan through the state’s Urban Verification Exercise (“UVE”). These MoI cards are significant because they allow Syrians to live outside of refugee camps and move freely about Jordan.
  • The text reviews the acquirement processes and the subsequent challenges and consequences that refugees face when unable to obtain documentation. Refugees can encounter issues ranging from lack of access to basic services to arrest, detention, forced relocation to camps and refoulement.  
  • Seventy-two Syrian refugee families in Jordan were interviewed in 2016 for this report and their experiences with obtaining MoI cards varied widely.

Office of Internal Oversight Services (2015). Audit of the operations in Jordan for the Office of the United Nations High Commissioner for Refugees. Report 2015/049. New York: UN.

  • This report documents the January 1, 2012 – March 31, 2014 audit of Jordanian operations, which is intended to ensure the effectiveness of the UNHCR Representation in the state.
  • The main goals of the Regional Response Plan for Syrian refugees included relieving the pressure on Jordanian services and resources while still maintaining protection for refugees.
  • The audit results concluded that the Representation was initially unsatisfactory, and the OIOS suggested several recommendations according to the two key controls which the Representation acknowledged. Those recommendations included:
    • Project management:
      • Providing training to staff involved in financial verification of partners supervise management
      • Revising standard operating procedure on cash based interventions
      • Establishing ways to ensure that appropriate criteria for payment of all types of costs to partners’ staff are included in partnership agreements
    • Regulatory framework:
      • Preparing annual need-based procurement plan and establishing adequate management oversight processes
      • Creating procedures for the assessment of renovation work in progress and issuing written change orders
      • Protecting data and ensuring timely consultation with the UNHCR Division of Financial and Administrative Management

UNHCR/WFP (2015). Joint Inspection of the Biometrics Identification System for Food Distribution in Kenya. Geneva: UNHCR/WFP.

  • This report outlines the partnership between the WFP and UNHCR in its effort to promote its biometric identification checking system to support food distribution in the Dadaab and Kakuma refugee camps in Kenya.
  • Both entities conducted a joint inspection mission in March 2015 and was considered an effective tool and a model for other country operations.
  • Still, 11 recommendations are proposed and responded to in this text to further improve the efficiency of the biometric system, including real-time evaluation of impact, need for automatic alerts, documentation of best practices, among others.

Open Data for Developing Economies


By Andrew Young, Stefaan Verhulst, and Juliet McMurren
This edition of the GovLab Selected Readings was developed as part of the Open Data for Developing Economies research project (in collaboration with WebFoundation, USAID and fhi360). Special thanks to Maurice McNaughton, Francois van Schalkwyk, Fernando Perini, Michael Canares and David Opoku for their input on an early draft. Please contact Stefaan Verhulst (stefaan@thegovlab.org) for any additional input or suggestions.
Data-and-its-uses-for-Governance-1024x491
Open data is increasingly seen as a tool for economic and social development. Across sectors and regions, policymakers, NGOs, researchers and practitioners are exploring the potential of open data to improve government effectiveness, create new economic opportunity, empower citizens and solve public problems in developing economies. Open data for development does not exist in a vacuum – rather it is a phenomenon that is relevant to and studied from different vantage points including Data4Development (D4D), Open Government, the United Nations’ Sustainable Development Goals (SDGs), and Open Development. The below selected readings provide a view of the current research and practice on the use of open data for development and its relationship to related interventions.
Selected Reading List (in alphabetical order)

Annotated Selected Readings List (in alphabetical order)

Open Data and Open Government for Development

Benjamin, Solomon, R. Bhuvaneswari, P. Rajan, Manjunatha, “Bhoomi: ‘E-Governance’, or, An Anti-Politics Machine Necessary to Globalize Bangalore?” CASUM-m Working Paper, January 2007, http://bit.ly/2aD3vZe

  • This paper explores the digitization of land titles and their effect on governance in Bangalore. The paper takes a critical view of digitization and transparency efforts, particularly as best practices that should be replicated in many contexts.
  • The authors point to the potential of centralized open data and land records databases as a means for further entrenching existing power structures. They found that the digitization of land records in Bangalore “led to increased corruption, much more bribes and substantially increased time taken for land transactions,” as well allowing “very large players in the land markets to capture vast quantities of land when Bangalore experiences a boom in the land market.”
  • They argue for the need “to replace politically neutered concepts like ‘transparency’, ‘efficiency’, ‘governance’, and ‘best practice’ conceptually more rigorous terms that reflect the uneven terrain of power and control that governance embodies.

McGee, Rosie and Duncan Edwards, “Introduction: Opening Governance – Change, Continuity and Conceptual Ambiguity,” IDS Bulletin, January 24, 2016. http://bit.ly/2aJn1pq.  

  • This introduction to a special issue of the IDS Bulletin frames the research and practice of leveraging opening governance as part of a development agenda.
  • The piece primarily focuses on a number of “critical debates” that “have begun to lay bare how imprecise and overblown the expectations are in the transparency, accountability and openness ‘buzzfield’, and the problems this poses.”
  • A key finding on opening governance’s uptake and impact in the development space relates to political buy-in:
    • “Political will is generally a necessary but insu cient condition for governance processes and relationships to become more open, and is certainly a necessary but insu cient condition for tech-based approaches to open them up. In short, where there is a will, tech-for-T&A may be able to provide a way; where there isn’t a will, it won’t.”

Open Data and Data 4 Development

3rd International Open Data Conference (IODC), “Enabling the Data Revolution: An International Open Data Roadmap,” Conference Report, 2015, http://bit.ly/2asb2ei

  • This report, prepared by Open Data for Development, summarizes the proceedings of the third IODC in Ottawa, ON. It sets out an action plan for “harnessing open data for sustainable development”, with the following five priorities:
    1. Deliver shared principles for open data
    2. Develop and adopt good practices and open standards for data publication
    3. Build capacity to produce and use open data effectively
    4. Strengthen open data innovation networks
    5. Adopt common measurement and evaluation tools
  • The report draws on 70 impact accounts to present cross-sector evidence of “the promise and reality of open data,” and emphasizes the utility of open data in monitoring development goals, and the importance of “joined-up open data infrastructures,” ensuring wide accessibility, and grounding measurement in a clear understanding of citizen need, in order to realize the greatest benefits from open data.
  • Finally, the report sets out a draft International Open Data Charter and Action Plan for International Collaboration.

Hilbert, Martin, “Big Data for Development: A Review of Promises and Challenges,” Development Policy Review, December 13, 2015, http://bit.ly/2aoPtxL.

  • This article presents a conceptual framework based on the analysis of 180 articles on the opportunities and threats of big data for international development.
  • Open data, Hilbert argues, can be an incentive for those outside of government to leverage big data analytics: “If data from the public sector were to be openly available, around a quarter of existing data resources could be liberated for Big Data Analytics.”
  • Hilbert explores the misalignment between “the level of economic well-being and perceived transparency of a country” and the existence of an overarching open data policy. He points to low-income countries that are active in the open data effort, like Kenya, Russia and Brazil, in comparison to “other countries with traditionally high perceived transparency,” which are less active in releasing data, like Chile, Belgium and Sweden.

International Development Research Centre, World Wide Web Foundation, and Berkman Center at Harvard University, “Fostering a Critical Development Perspective on Open Government Data,” Workshop Report, 2012, http://bit.ly/2aJpyQq

  • This paper considers the need for a critical perspective on whether the expectations raised by open data programmes worldwide — as “a suitable remedy for challenges of good governance, economic growth, social inclusion, innovation, and participation” — have been met, and if so, under what circumstances.
  • Given the lack of empirical evidence underlying the implementation of Open Data initiative to guide practice and policy formulation, particularly in developing countries, the paper discusses the implementation of a policy-oriented research agenda to ensure open data initiatives in the Global South “challenge democratic deficits, create economic value and foster inclusion.”
  • The report considers theories of the relationship between open data and impact, and the mediating factors affecting whether that impact is achieved. It takes a broad view of impact, including both demand- and supply-side economic impacts, social and environmental impact, and political impact.

Open Data for Development, “Open Data for Development: Building an Inclusive Data Revolution,” Annual Report, 2015, http://bit.ly/2aGbkz5

  • This report — the inaugural annual report for the Open Data for Development program — gives an overview of outcomes from the program for each of OD4D’s five program objectives:
    1. Setting a global open data for sustainable development agenda;
    2. Supporting governments in their open data initiatives;
    3. Scaling data solutions for sustainable development;
    4. Monitoring the availability, use and impact of open data around the world; and
    5. Building the institutional capacity and long-term sustainability of the Open Data for Development network.
  • The report identifies four barriers to impact in developing countries: the lack of capacity and leadership; the lack of evidence of what works; the lack of coordination between actors; and the lack of quality data.

Stuart, Elizabeth, Emma Samman, William Avis, Tom Berliner, “The Data Revolution: Finding the Missing Millions,” Open Data Institute Research Report, April 2015, http://bit.ly/2acnZtE.

  • This report examines the challenge of implementing successful development initiatives when many citizens are not known to their governments as they do not exist in official databases.
  • The authors argue that “good quality, relevant, accessible and timely data will allow willing governments to extend services into communities which until now have been blank spaces in planning processes, and to implement policies more efficiently.”
  • In addition to improvements to national statistical offices, the authors argue that “making better use of the data we already have” by increasing openness to certain datasets held by governments and international organizations could help to improve the situation.
  • They examine a number of open data efforts in developing countries, including Kenya and Mexico.
  • Finally, they argue that “the data revolution could play a role in changing the power dynamic between citizens, governments and the private sector, building on open data and freedom of information movements around the world. It has the potential to enable people to produce, access and understand information about their lives and to use this information to make changes.”

United Nations Independent Expert Advisory Group on a Data Revolution for Sustainable Development. “A World That Counts, Mobilizing the Data Revolution,” 2014, http://bit.ly/2am5K28.

  • This report focuses on the potential benefits and risks data holds for sustainable development. Included in this is a strategic framework for using and managing data for humanitarian purposes. It describes a need for a multinational consensus to be developed to ensure data is shared effectively and efficiently.
  • It suggests that “people who are counted”—i.e., those who are included in data collection processes—have better development outcomes and a better chance for humanitarian response in emergency or conflict situations.
  • In particular, “better and more open data” is described as having the potential to “save money and create economic, social and environmental value” toward sustainable development ends.

The World Bank, “Digital Dividends: World Development Report 2016.” http://bit.ly/2aG9Kx5

  • This report examines “digital dividends” or the development benefits of using digital technologies in the space.
  • The authors argue that: “To get the most out of the digital revolution, countries also need to work on the “analog complements”—by strengthening regulations that ensure competition among businesses, by adapting workers’ skills to the demands of the new economy, and by ensuring that institutions are accountable.”
  • The “data revolution,” which includes both big data and open data is listed as one of four “digital enablers.”
  • Open data’s impacts are explored across a number of cases and developing countries and regions, including: Nepal, Mexico, Southern Africa, Kenya, Moldova and the Philippines.
  • Despite a number of success stories, the authors argue that: “sustained, impactful, scaled-up examples of big and open data in the developing world are still relatively rare,” and, in particular, “Open data has far to go.” They point to the high correlation between readiness, implementation and impact of open data to GDP per capita as evidence of the room for improvement.

Open Data and Open Development

Reilly, Katherine and Juan P. Alperin, “Intermediation in Open Development: A Knowledge Stewardship Approach,” Global Media Journal (Canadian Edition), 2016, http://bit.ly/2atWyI8

  • This paper examines the intermediaries that “have emerged to facilitate open data and related knowledge production activities in development processes.”
  • In particular, they study the concept of “knowledge stewardship,” which “demands careful consideration of how—through what arrangements—open resources can best be provided, and how best to maximize the quality, sustainability, buy-in, and uptake of those resources.”
  • The authors describe five models of open data intermediation:
    • Decentralized
    • Arterial
    • Ecosystem
    • Bridging
    • Communities of practice

Reilly, Katherine and Rob McMahon, “Quality of openness: Evaluating the contributions of IDRC’s Information and Networks Program to open development.” International Development Research Centre, January 2015, http://bit.ly/2aD6h0U

  • This reports describes the outcomes of IRDC’s Information and Networks (I&N) programme, focusing, in particular, those related to “quality of openness” of initiatives as well as their outcomes.
  • The research program explores “mechanisms that link open initiatives to human activities in ways that generate social innovations of significance to development. These include push factors such as data holders’ understanding of data usage, the preparedness or acceptance of user communities, institutional policies, and wider policies and regulations; as well as pull factors including the awareness, capacity and attitude of users. In other words, openly networked social processes rely on not just quality openness, but also on supportive environments that link open resources and the people who might leverage them to create improvements, whether in governance, education or knowledge production.”

Smith, M. and L. Elder, “Open ICT Ecosystems Transforming the Developing World,” Information Technologies and International Development, 2010, http://bit.ly/2au0qsW.

  • The paper seeks to examine the hypothesis that “open social arrangements, enabled by ICTs, can help to catalyze the development impacts of ICTs. In other words, open ICT ecosystems provide the space for the amplification and transformation of social activities that can be powerful drivers of development.”
  • While the focus is placed on a number of ICT interventions – with open data only directly referenced as it relates to the science community – the lessons learned and overarching framework are applicable to the open data for development space.
  • The authors argue for a new research focus on “the new social activities enabled by different configurations of ICT ecosystems and their connections with particular social outcomes.” They point in particular to “modules of social practices that can be applied to solve similar problems across different development domains,” including “massive participation, collaborative production of content, collaborative innovation, collective information validation, new ‘open’ organizational models, and standards and knowledge transfer.”

Smith, Matthew and Katherine M. A. Reilly (eds), “Open Development: Networked Innovations in International Development,” MIT Press, 2013, http://bit.ly/2atX2hu.

  • This edited volume considers the implications of the emergence of open networked models predicated on digital network technologies for development. In their introduction, the editors emphasize that openness is a means to support development, not an end, which is layered upon existing technological and social structures. While openness is often disruptive, it depends upon some measure of closedness and structure in order to function effectively.
  • Subsequent, separately authored chapters provide case studies of open development drawn from health, biotechnology, and education, and explore some of the political and structural barriers faced by open models.  

van den Broek, Tijs, Marijn Rijken, Sander van Oort, “Towards Open Development Data: A review of open development data from a NGO perspective,” 2012, http://bit.ly/2ap5E8a

  • In this paper, the authors seek to answer the question: “What is the status, potential and required next steps of open development data from the perspective of the NGOs?”
  • They argue that “the take-up of open development data by NGOs has shown limited progress in the last few years,” and, offer “several steps to be taken before implementation” to increase the effectiveness of open data’s use by NGOs to improve development efforts:
    • Develop a vision on open development and open data
    • Develop a clear business case
    • Research the benefits and risks of open development data and raise organizational and political awareness and support
    • Develop an appealing business model for data intermediaries and end-users
    • Balance data quality and timeliness
    • Dealing with the data obesity
    • Enrich quantitative data to overcome a quantitative bias
    • Monitor implementation and share best practices.

Open Data and Development Goals

Berdou, Evangelia, “Mediating Voices and Communicating Realities: Using Information Crowdsourcing Tools, Open Data Initiatives and Digital Media to Support and Protect the Vulnerable and Marginalised,” Institute of Development Studies, 2011, http://bit.ly/2aqbycg.

  • This report examines the potential of “open source information crowdsourcing platforms like Ushahidi, and open mapping and data initiatives like OpenStreetMap, are enabling citizens in developing countries to generate and disseminate information critical for their lives and livelihoods.”
  • The authors focus in particular on:
    • “the role of the open source social entrepreneur as a new development actor
    • the complexity of the architectures of participation supported by these platforms and the need to consider them in relation to the decision-making processes that they aim to support and the roles in which they cast citizens
    • the possibilities for cross-fertilisation of ideas and the development of new practices between development practitioners and technology actors committed to working with communities to improve lives and livelihoods.”
  • While the use of ICTs and open data pose numerous potential benefits for supporting and protecting the vulnerable and marginalised, the authors call for greater attention to:
    • challenges emerging from efforts to sustain participation and govern the new information commons in under-resourced and politically contested spaces
    • complications and risks emerging from the desire to share information freely in such contexts
    • gaps between information provision, transparency and accountability, and the slow materialisation of projects’ wider social benefits

Canares, Michael, Satyarupa Shekhar, “Open Data and Sub-national Governments: Lessons from Developing Countries,”  2015, http://bit.ly/2au2gu2

  • This synthesis paper seeks to gain a greater understanding of open data’s effects on local contexts – ”where data is collected and stored, where there is strong feasibility that data will be published, and where data can generate the most use and impact” – through the examination of nine papers developed as part of the Open Data in Developing Countries research project.
  • The authors point to three central findings:
    • “There is substantial effort on the part of sub-national governments to proactively disclose data, however, the design delimits citizen participation, and eventually, use.”
    • Context demands different roles for intermediaries and different types of initiatives to create an enabling environment for open data.”
    • “Data quality will remain a critical challenge for sub-national governments in developing countries and it will temper potential impact that open data will be able to generate.

Davies, Tim, “Open Data in Developing Countries – Emerging Insights from Phase I,” ODDC, 2014, http://bit.ly/2aX55UW

  • This report synthesizes findings from the Exploring the Emerging Impacts of Open Data in Developing Countries (ODDC) research network and its study of open data initiatives in 13 countries.
  • Davies provides 15 initial insights across the supply, mediation, and use of open data, including:
    • Open data initiatives can create new spaces for civil society to pursue government accountability and effectiveness;
    • Intermediaries are vital to both the supply and the use of open data; and
    • Digital divides create data divides in both the supply and use of data.

Davies, Tim, Duncan Edwards, “Emerging Implications of Open and Linked Data for Knowledge Sharing Development,” IDS Bulletin, 2012, http://bit.ly/2aLKFyI

  • This article explores “issues that development sector knowledge intermediaries may need to engage with to ensure the socio-technical innovations of open and linked data work in the interests of greater diversity and better development practice.”
  • The authors explore a number of case studies where open and linked data was used in a development context, including:
    • Open research: IDS and R4D meta-data
    • Open aid: International Aid Transparency Initiative
    • Open linked statistics: Young Lives
  • Based on lessons learned from these cases, the authors argue that “openness must serve the interests of marginalised and poor people. This is pertinent at three levels:
    • practices in the publication and communication of data
    • capacities for, and approaches to, the use of data
    • development and emergent structuring of open data ecosystems.

Davies, Tim, Fernando Perini, and Jose Alonso, “Researching the Emerging Impacts of Open Data,” ODDC, 2013, http://bit.ly/2aqb6uP

  • This research report offers a conceptual framework for open data, with a particular focus on open data in developing countries.
  • The conceptual framework comprises three central elements:
    • Open Data
      • About government
      • About companies & markets
      • About citizens
    • Domains of governance
      • Political domains
      • Economic domains
      • Social domains
    • Emerging Outcomes
      • Transparency & accountability
      • Innovation & economic growth
      • Inclusion & empowerment
  • The authors describe three central theories of change related to open data’s impacts:
    • Open data will bring about greater transparency in government, which in turn brings about greater accountability of key actors to make decisions and apply rules in the public interest;
    • Open data will enable non-state innovators to improve public services or build innovative products and services with social and economic value; open data will shift certain decision making from the state into the market, making it more efficient;
    • Open data will remove power imbalances that resulted from asymmetric information, and will bring new stakeholders into policy debates, giving marginalised groups a greater say in the creation and application of rules and policy.

Montano, Elise and Diogo Silva, “Exploring the Emerging Impacts of Open Data in Developing Countries (ODDC): ODDC1 Follow-up Outcome Evaluation Report,” ODDC, 2016, http://bit.ly/2au65z7.

  • This report summarizes the findings of a two and a half year research-driven project sponsored by the World Wide Web Foundation to explore how open data improves governance in developing countries, and build capacity in these countries to engage with open data. The research was conducted through 17 subgrants to partners from 12 countries.
  • Upon evaluation in 2014, partners reported increased capacity and expertise in dealing with open data; empowerment in influencing local and regional open data trends, particularly among CSOs; and increased understanding of open data among policy makers with whom the partners were in contact.

Smith, Fiona, William Gerry, Emma Truswell, “Supporting Sustainable Development with Open Data,” Open Data Institute, 2015, http://bit.ly/2aJwxsF

  • This report describes the potential benefits, challenges and next steps for leveraging open data to advance the Sustainable Development Goals.
  • The authors argue that the greatest potential impacts of open data on development are:
    • More effectively target aid money and improve development programmes
    • Track development progress and prevent corruption
    • Contribute to innovation, job creation and economic growth.
  • They note, however, that many challenges to such impact exist, including:
    • A weak enabling environment for open data publishing
    • Poor data quality
    • A mismatch between the demand for open data and the supply of appropriate datasets
    • A ‘digital divide’ between rich and poor, affecting both the supply and use of data
    • A general lack of quantifiable data and metrics.
  • The report articulates a number of ways that “governments, donors and (international) NGOs – with the support of researchers, civil society and industry – can apply open data to help make the SDGs a reality:
    • Reach global consensus around principles and standards, namely being ‘open by default’, using the Open Government Partnership’s Open Data Working Group as a global forum for discussion.
    • Embed open data into funding agreements, ensuring that relevant, high-quality data is collected to report against the SDGs. Funders should mandate that data relating to performance of services, and data produced as a result of funded activity, be released as open data.
    • Build a global partnership for sustainable open data, so that groups across the public and private sectors can work together to build sustainable supply and demand for data in the developing world.”

The World Bank, “Open Data for Sustainable Development,” Policy Note, August 2015, http://bit.ly/2aGjaJ4

  • This report from the World Bank seeks to describe open data’s potential for achieving the Sustainable Development Goals, and makes a number of recommendations toward that end.
  • The authors describe four key benefits of open data use for developing countries:
    • Foster economic growth and job creation
    • Improve efficiency, effectiveness and coverage of public services
    • Increase transparency, accountability, and citizen participation
    • Facilitate better information sharing within government
  • The paper concludes with a number of recommendations for improving open data programs, including:
    • Support Open Data use through legal and licensing frameworks.
    • Make data available for free online.
    • Publish data inventories for the government’s data resources.
    • Create feedback channels to government from current and potential data users.
    • Prioritize the datasets that users want.

Open Data and Developing Countries (National Case Studies)

Beghin, Nathalie and Carmela Zigoni, “Measuring Open Data’s Impact on Brazilian National and Sub-National Budget Transparency Websites and Its Impacts on People’s Rights,” 2014, http://bit.ly/2au3LaQ.

  • This report examines the impact of a Brazilian law requiring government entities to “provide real-time information on their budgets and spending through electronic means.” The authors explore “whether the national and state capitals are in fact using principles and practices of open data in their disclosures, and has evaluated the emerging impacts of open budget data disclosed through the national transparency portal.”
  • The report leveraged a “quantitative survey of budget and financial disclosures, and qualitative research with key stakeholders” to explore the “role of technical platforms and intermediaries in supporting the use of budget data by groups working in pursuit of social change and human rights.”
  • The survey found that:
    • The information provided is complete
    • In general, the data are not primary
    • Most governments do not provide timely information
    • Access to information is not ensured to all individuals
    • Advances were observed in terms of the availability of machine-processable data
    • Access is free, without discriminating users
    • The minority presents data in non-proprietary format
    • It is not known whether the data are under license

Boyera, S., C. Iglesias, “Open Data in Developing Countries: State of the Art,” Partnership for Open Data, 2014, http://bit.ly/2acBMR7

  • This report provides a summary of the State-of-the-Art study developed by SBC4D for the Partnership for Open Data (POD).
  • A series of interviews and responses to an online questionnaire yielded a number of findings, including:
    • “The number of actors interested in Open Data in Developing Countries is growing quickly. The study has identified 160+ organizations. It is important to note that a majority of them are just engaging in the domain and have little past experience. Most of these actors are focused on OD as an objective not a tool or means to increase impact or outcome.
    • Local actors are strong advocates of public data release. Lots of them are also promoting the re-use of existing data (through e.g. the organization of training, hackathons and alike). However, the study has not identified many actors practically using OD in their work or engaged in releasing their own data.
    • Traditional development sectors (health, education, agriculture, energy, transport) are not yet the target of many initiatives, and are clearly underdeveloped in terms of use-cases.
    • There is very little connection between horizontal (e.g. national OD initiatives) and vertical (sector-specific initiatives on e.g. extractive industry, or disaster management) activities”

Canares, M.P., J. de Guia, M. Narca, J. Arawiran, “Opening the Gates: Will Open Data Initiatives Make Local Governments in the Philippines More Transparent?” Open LGU Research Project, 2014, http://bit.ly/2au3Ond

  • This paper seeks to determine the impacts of the Department of Interior and Local Government of the Philippines’ Full Disclosure Policy, affecting financial and procurement data, on both data providers and data users.
  • The paper uncovered two key findings:
    • “On the supply side, incentivising openness is a critical aspect in ensuring that local governments have the interest to disclose financial data. While at this stage, local governments are still on compliance behaviour, it encourages the once reluctant LGUs to disclose financial information in the use of public funds, especially when technology and institutional arrangements are in place. However, LGUs do not make an effort to inform the public that information is available online and has not made data accessible in such a way that it can allow the public to perform computations and analysis. Currently, no data standards have been made yet by the Philippine national government in terms of format and level of detail.”
    • “On the demand side, there is limited awareness on the part of the public, and more particularly the intermediaries (e.g. business groups, civil society organizations, research institutions), on the availability of data, and thus, its limited use. As most of these data are financial in nature, it requires a certain degree of competence and expertise so that they will be able to make use of the data in demanding from government better services and accountability.”
  • The authors argue that “openness is not just about governments putting meaningful government data out into the public domain, but also about making the public meaningfully engage with governments through the use of open government data.” In order to do that, policies should “require observance of open government data standards and a capacity building process of ensuring that the public, to whom the data is intended, are aware and able to use the data in ensuring more transparent and accountable governance.”

Canares, M., M. Narca, and D. Marcial, “Enhancing Citizen Engagement Through Open Government Data,” ODDC, 2015, http://bit.ly/2aJMhfS

  • This research paper seeks to gain a greater understanding of how civil society organizations can increase or initiate their use of open data. The study is based on research conducted in “two provinces in the Philippines where civil society organizations in Negros Oriental province were trained, and in the Bohol province were mentored on accessing and using open data.
  • The authors seek to answer three central research questions:
    • What do CSOs know about open government data? What do they know about government data that their local governments are publishing in the web?
    • What do CSOs have in terms of skills that would enable them to engage meaningfully with open government data?
    • How best can capacity building be delivered to civil society organizations to ensure that they learn to access and use open government data to improve governance?
  • They provide a number of key lessons, including:
    • Baseline condition should inform capacity building approach
    • Data use is dependent on data supply
    • Open data requires accessible and stable internet connection
    • Open data skills are important but insufficient
    • Outcomes, and not just outputs, prove capacity improvements

Chattapadhyay, Sumandro, “Opening Government Data through Mediation: Exploring the Roles, Practices and Strategies of Data Intermediary Organisations in India,ODDC, 2014, http://bit.ly/2au3F37

  • This report seeks to gain a greater understanding of the current practice following the Government of India’s 2012 National Data Sharing and Accessibility Policy.
  • Cattapadhyay examines the open government data practices of “various (non-governmental) ‘data intermediary organisations’ on the one hand, and implementation challenges faced by managers of the Open Government Data Platform of India on the other.
  • The report’s objectives are:
    • To undertake a provisional mapping of government data related activities across different sectors to understand the nature of the “open data community” in India,
    • To enrich government data/information policy discussion in India by gathering evidence and experience of (non­governmental) data intermediaries regarding their actual practices of accessing and sharing government data, and their utilisation of the provisions of NDSAP and RTI act, and
    • To critically reflect on the nature of open data practices in India.

Chiliswa, Zacharia, “Open Government Data for Effective Public Participation: Findings of a Case Study Research Investigating The Kenya’s Open Data Initiative in Urban Slums and Rural Settlements,” ODDC, April 2014, http://bit.ly/2au8E4s

  • This research report is the product of a study of two urban slums and a rural settlement in Nairobi, Mobasa and Isiolo County, respectively, aimed at gaining a better understanding of the awareness and use of Kenya’s open data.
  • The study had four organizing objectives:
    • “Investigate the impact of the Kenyan Government’s open data initiative and to see whether, and if so how, it is assisting marginalized communities and groups in accessing key social services and information such as health and education;
    • Understand the way people use the information provided by the Open Data Initiative;
    • Identify people’s trust in the information and how it can assist their day-to-day lives;
    • Examine ways in which the public wish for the open data initiative to improve, particularly in relation to governance and service delivery.”
  • The study uncovered four central findings about Kenya’s open data initiative:
    • “There is a mismatch between the data citizens want to have and the data the Kenya portal and other intermediaries have provided.
    • Most people go to local information intermediaries instead of going directly to the government data portals and that there are few connections between these intermediaries and the wider open data sources.
    • Currently the rural communities are much less likely to seek out government information.
    • The kinds of data needed to support service delivery in Kenya may be different from those needed in other places in the world.”

Lwanga-Ntale, Charles, Beatrice Mugambe, Bernard Sabiti, Peace Nganwa, “Understanding how open data could impact resource allocation for poverty eradication in Kenya and Uganda,” ODDC, 2014, http://bit.ly/2aHqYKi

  • This paper explores case studies from Uganda and Kenya to explore an open data movement seeking to address “age-old” issues including “transparency, accountability, equity, and the relevance, effectiveness and efficiency of governance.”
  • The authors focus both on the role “emerging open data processes in the two countries may be playing in promoting citizen/public engagement and the allocation of resources,” and the “possible negative impacts that may emerge due to the ‘digital divide’ between those who have access to data (and technology) and those who do not.
  • They offer a number of recommendations to the government of Uganda and Kenya that could be more broadly applicable, including:
    • Promote sector and cross sector specific initiatives that enable collaboration and transparency through different e-transformation strategies across government sectors and agencies.
    • Develop and champion the capacity to drive transformation across government and to advance skills in its institutions and civil service.

Sapkota, Krishna, “Exploring the emerging impacts of open aid data and budget data in Nepal,” Freedom Forum, August 2014, http://bit.ly/2ap0z5G

  • This research report seeks to answer a five key questions regarding the opening of aid and budget data in Nepal:
    • What is the context for open aid and budget data in Nepal?
    • What sorts of budget and aid information is being made available in Nepal?
    • What is the governance of open aid and budget data in Nepal?
    • How are relevant stakeholders making use of open aid and budget data in Nepal?
    • What are the emerging impacts of open aid and budget data in Nepal?
  • The study uncovered a number of findings, including
    • “Information and data can play an important role in addressing key social issues, and that whilst some aid and budget data is increasingly available, including in open data formats, there is not yet a sustainable supply of open data direct from official sources that meet the needs of the different stakeholders we consulted.”
    • “Expectations amongst government, civil society, media and private sector actors that open data could be a useful resource in improving governance, and we found some evidence of media making use of data to drive stories more when they had the right skills, incentives and support.”
    • “The context of Nepal also highlights that a more critical perspective may be needed on the introduction of open data, understanding the specific opportunities and challenges for open data supply and use in a country that is currently undergoing a period of constitutional development, institution building and deepening democracy.”

Srivastava, Nidhi, Veena Agarwal, Anmol Soni, Souvik Bhattacharjya, Bibhu P. Nayak, Harsha Meenawat, Tarun Gopalakrishnan, “Open government data for regulation of energy resources in India,”ODDC, 2014, http://bit.ly/2au9oXf

  • This research paper examines “the availability, accessibility and use of open data in the extractive energy industries sector in India.”
  • The authors describe a number of challenges being faced by:
    • Data suppliers and intermediaries:
      • Lack of clarity on mandate
      • Agency specific issues
      • Resource challenges
      • Privacy issues of commercial data and contractual constraints
      • Formats for data collection
      • Challenges in providing timely data
      • Recovery of costs and pricing of data
    • Data users
      • Data available but inaccessible
      • Data accessible but not usable
      • Timeliness of data
  • They make a number of recommendations for addressing these challenges focusing on:
    • Policy measures
    • Improving data quality
    • Improving effectiveness of data portal

van Schalkwyk, François, Michael Caňares, Sumandro Chattapadhyay and Alexander Andrason “Open Data Intermediaries in Developing Countries,” ODDC, 2015, http://bit.ly/2aJztWi

  • This paper seeks to provide “a more socially nuanced approach to open data intermediaries,” moving beyond the traditional approach wherein data intermediaries are “presented as single and simple linkages between open data supply and use.”
  • The study’s analysis draws on cases from the Emerging Impacts of Open Data in Developing Countries (ODDC) project.
  • The authors provide a working definition of open data intermediaries: An open data intermediary is an agent:
    • positioned at some point in a data supply chain that incorporates an open dataset,
    • positioned between two agents in the supply chain, and
    • facilitates the use of open data that may otherwise not have been the case.
  • One of the studies key findings is that, “Intermediation does not only consist of a single agent facilitating the flow of data in an open data supply chain; multiple intermediaries may operate in an open data supply chain, and the presence of multiple intermediaries may increase the probability of use (and impact) because no single intermediary is likely to possess all the types of capital required to unlock the full value of the transaction between the provider and the user in each of the fields in play.”

van Schalkwyk, François, Michelle Willmers and Tobias Schonwetter, “Embedding Open Data Practice,” ODDC, 2015, http://bit.ly/2aHt5xu

  • This research paper was developed as part of the ODDC Phase 2 project and seeks to address the “insufficient attention paid to the institutional dynamics within governments and how these may be impeding open data practice.”
  • The study focuses in particular on open data initiatives in South Africa and Kenya, leveraging a conceptual framework to allow for meaningful comparison between the two countries.
  • Focusing on South Africa and Kenya, as well as Africa as a whole, the authors seek to address four central research questions:
    • Is open data practice being embedded in African governments?
    • What are the possible indicators of open data practice being embedded?
    • What do the indicators reveal about resistance to or compliance with pressures to adopt open data practice?
    • What are different effects of multiple institutional domains that may be at play in government as an organisation?

van Schalkwyk, Francois, Michelle Willmers, and Laura Czerniewicz, “Case Study: Open Data in the Governance of South African Higher Education,” ODDC, 2014, http://bit.ly/2amgIFb

  • This research report uses the South African Centre for Higher Education Transformation (CHET) open data platform as a case study to examine “the supply of and demand for open data as well as the roles of intermediaries in the South African higher education governance ecosystem.
  • The report’s findings include:
    • “There are concerns at both government and university levels about how data will be used and (mis)interpreted, and this may constrain future data supply. Education both at the level of supply (DHET) and at the level of use by the media in particular on how to improve the interpretability of data could go some way in countering current levels of mistrust. Similar initiatives may be necessary to address uneven levels of data use and trust apparent across university executives and councils.”
    • “Open data intermediaries increase the accessibility and utility of data. While there is a rich publicly-funded dataset on South African higher education, the data remains largely inaccessible and unusable to universities and researchers in higher education studies. Despite these constraints, the findings show that intermediaries in the ecosystem are playing a valuable role in making the data both available and useable.”
    • “Open data intermediaries provide both supply-side as well as demand-side value. CHET’s work on higher education performance indicators was intended not only to contribute to government’s steering mechanisms, but also to contribute to the governance capacity of South African universities. The findings support the use of CHET’s open data to build capacity within universities. Further research is required to confirm the use of CHET data in state-steering of the South African higher education system, although there is some evidence of CHET’s data being referenced in national policy documents.”

Verhulst, Stefaan and Andrew Young, “Open Data Impact: When Demand Supply Meet,” The GovLab, 2016, http://bit.ly/1LHkQPO

  • This report provides a taxonomy of the impacts open data is having on a number of countries around the world, comprising:
    • Improving Government
    • Empowering Citizens
    • Creating Opportunity
    • Solving Public Problems
  • The authors describe four key enabling conditions for creating impactful open data initiatives:
    • Partnerships
    • Public Infrastructure
    • Policies and Performance Metrics
    • Problem Definition

Additional Resource:
World Bank Readiness Assessment Tool

  • To aid in the assessment “of the readiness of a government or individual agency to evaluate, design and implement an Open Data initiative,” the World Bank’s Open Government Data Working Group developed an openly accessible Open Data Readiness Assessment (ODRA) tool.

Selected Readings on Data Collaboratives


By Neil Britto, David Sangokoya, Iryna Susha, Stefaan Verhulst and Andrew Young

The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of data collaboratives was originally published in 2017.

The term data collaborative refers to a new form of collaboration, beyond the public-private partnership model, in which participants from different sectors (including private companies, research institutions, and government agencies ) can exchange data to help solve public problems. Several of society’s greatest challenges — from addressing climate change to public health to job creation to improving the lives of children — require greater access to data, more collaboration between public – and private-sector entities, and an increased ability to analyze datasets. In the coming months and years, data collaboratives will be essential vehicles for harnessing the vast stores of privately held data toward the public good.

Selected Reading List (in alphabetical order)

Annotated Selected Readings List (in alphabetical order)

Agaba, G., Akindès, F., Bengtsson, L., Cowls, J., Ganesh, M., Hoffman, N., . . . Meissner, F. “Big Data and Positive Social Change in the Developing World: A White Paper for Practitioners and Researchers.” 2014. http://bit.ly/25RRC6N.

  • This white paper, produced by “a group of activists, researchers and data experts” explores the potential of big data to improve development outcomes and spur positive social change in low- and middle-income countries. Using examples, the authors discuss four areas in which the use of big data can impact development efforts:
    • Advocating and facilitating by “opening[ing] up new public spaces for discussion and awareness building;
    • Describing and predicting through the detection of “new correlations and the surfac[ing] of new questions;
    • Facilitating information exchange through “multiple feedback loops which feed into both research and action,” and
    • Promoting accountability and transparency, especially as a byproduct of crowdsourcing efforts aimed at “aggregat[ing] and analyz[ing] information in real time.
  • The authors argue that in order to maximize the potential of big data’s use in development, “there is a case to be made for building a data commons for private/public data, and for setting up new and more appropriate ethical guidelines.”
  • They also identify a number of challenges, especially when leveraging data made accessible from a number of sources, including private sector entities, such as:
    • Lack of general data literacy;
    • Lack of open learning environments and repositories;
    • Lack of resources, capacity and access;
    • Challenges of sensitivity and risk perception with regard to using data;
    • Storage and computing capacity; and
    • Externally validating data sources for comparison and verification.

Ansell, C. and Gash, A. “Collaborative Governance in Theory and Practice.” Journal of Public Administration Research and  Theory 18 (4), 2008. http://bit.ly/1RZgsI5.

  • This article describes collaborative arrangements that include public and private organizations working together and proposes a model for understanding an emergent form of public-private interaction informed by 137 diverse cases of collaborative governance.
  • The article suggests factors significant to successful partnering processes and outcomes include:
    • Shared understanding of challenges,
    • Trust building processes,
    • The importance of recognizing seemingly modest progress, and
    • Strong indicators of commitment to the partnership’s aspirations and process.
  • The authors provide a ‘’contingency theory model’’ that specifies relationships between different variables that influence outcomes of collaborative governance initiatives. Three “core contingencies’’ for successful collaborative governance initiatives identified by the authors are:
    • Time (e.g., decision making time afforded to the collaboration);
    • Interdependence (e.g., a high degree of interdependence can mitigate negative effects of low trust); and
    • Trust (e.g. a higher level of trust indicates a higher probability of success).

Ballivian A, Hoffman W. “Public-Private Partnerships for Data: Issues Paper for Data Revolution Consultation.” World Bank, 2015. Available from: http://bit.ly/1ENvmRJ

  • This World Bank report provides a background document on forming public-prviate partnerships for data with the private sector in order to inform the UN’s Independent Expert Advisory Group (IEAG) on sustaining a “data revolution” in sustainable development.
  • The report highlights the critical position of private companies within the data value chain and reflects on key elements of a sustainable data PPP: “common objectives across all impacted stakeholders, alignment of incentives, and sharing of risks.” In addition, the report describes the risks and incentives of public and private actors, and the principles needed to “build[ing] the legal, cultural, technological and economic infrastructures to enable the balancing of competing interests.” These principles include understanding; experimentation; adaptability; balance; persuasion and compulsion; risk management; and governance.
  • Examples of data collaboratives cited in the report include HP Earth Insights, Orange Data for Development Challenges, Amazon Web Services, IBM Smart Cities Initiative, and the Governance Lab’s Open Data 500.

Brack, Matthew, and Tito Castillo. “Data Sharing for Public Health: Key Lessons from Other Sectors.” Chatham House, Centre on Global Health Security. April 2015. Available from: http://bit.ly/1DHFGVl

  • The Chatham House report provides an overview on public health surveillance data sharing, highlighting the benefits and challenges of shared health data and the complexity in adapting technical solutions from other sectors for public health.
  • The report describes data sharing processes from several perspectives, including in-depth case studies of actual data sharing in practice at the individual, organizational and sector levels. Among the key lessons for public health data sharing, the report strongly highlights the need to harness momentum for action and maintain collaborative engagement: “Successful data sharing communities are highly collaborative. Collaboration holds the key to producing and abiding by community standards, and building and maintaining productive networks, and is by definition the essence of data sharing itself. Time should be invested in establishing and sustaining collaboration with all stakeholders concerned with public health surveillance data sharing.”
  • Examples of data collaboratives include H3Africa (a collaboration between NIH and Wellcome Trust) and NHS England’s care.data programme.

de Montjoye, Yves-Alexandre, Jake Kendall, and Cameron F. Kerry. “Enabling Humanitarian Use of Mobile Phone Data.” The Brookings Institution, Issues in Technology Innovation. November 2014. Available from: http://brook.gs/1JxVpxp

  • Using Ebola as a case study, the authors describe the value of using private telecom data for uncovering “valuable insights into understanding the spread of infectious diseases as well as strategies into micro-target outreach and driving update of health-seeking behavior.”
  • The authors highlight the absence of a common legal and standards framework for “sharing mobile phone data in privacy-conscientious ways” and recommend “engaging companies, NGOs, researchers, privacy experts, and governments to agree on a set of best practices for new privacy-conscientious metadata sharing models.”

Eckartz, Silja M., Hofman, Wout J., Van Veenstra, Anne Fleur. “A decision model for data sharing.” Vol. 8653 LNCS. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2014. http://bit.ly/21cGWfw.

  • This paper proposes a decision model for data sharing of public and private data based on literature review and three case studies in the logistics sector.
  • The authors identify five categories of the barriers to data sharing and offer a decision model for identifying potential interventions to overcome each barrier:
    • Ownership. Possible interventions likely require improving trust among those who own the data through, for example, involvement and support from higher management
    • Privacy. Interventions include “anonymization by filtering of sensitive information and aggregation of data,” and access control mechanisms built around identity management and regulated access.  
    • Economic. Interventions include a model where data is shared only with a few trusted organizations, and yield management mechanisms to ensure negative financial consequences are avoided.
    • Data quality. Interventions include identifying additional data sources that could improve the completeness of datasets, and efforts to improve metadata.
    • Technical. Interventions include making data available in structured formats and publishing data according to widely agreed upon data standards.

Hoffman, Sharona and Podgurski, Andy. “The Use and Misuse of Biomedical Data: Is Bigger Really Better?” American Journal of Law & Medicine 497, 2013. http://bit.ly/1syMS7J.

  • This journal articles explores the benefits and, in particular, the risks related to large-scale biomedical databases bringing together health information from a diversity of sources across sectors. Some data collaboratives examined in the piece include:
    • MedMining – a company that extracts EHR data, de-identifies it, and offers it to researchers. The data sets that MedMining delivers to its customers include ‘lab results, vital signs, medications, procedures, diagnoses, lifestyle data, and detailed costs’ from inpatient and outpatient facilities.
    • Explorys has formed a large healthcare database derived from financial, administrative, and medical records. It has partnered with major healthcare organizations such as the Cleveland Clinic Foundation and Summa Health System to aggregate and standardize health information from ten million patients and over thirty billion clinical events.
  • Hoffman and Podgurski note that biomedical databases populated have many potential uses, with those likely to benefit including: “researchers, regulators, public health officials, commercial entities, lawyers,” as well as “healthcare providers who conduct quality assessment and improvement activities,” regulatory monitoring entities like the FDA, and “litigants in tort cases to develop evidence concerning causation and harm.”
  • They argue, however, that risks arise based on:
    • The data contained in biomedical databases is surprisingly likely to be incorrect or incomplete;
    • Systemic biases, arising from both the nature of the data and the preconceptions of investigators are serious threats the validity of research results, especially in answering causal questions;
  • Data mining of biomedical databases makes it easier for individuals with political, social, or economic agendas to generate ostensibly scientific but misleading research findings for the purpose of manipulating public opinion and swaying policymakers.

Krumholz, Harlan M., et al. “Sea Change in Open Science and Data Sharing Leadership by Industry.” Circulation: Cardiovascular Quality and Outcomes 7.4. 2014. 499-504. http://1.usa.gov/1J6q7KJ

  • This article provides a comprehensive overview of industry-led efforts and cross-sector collaborations in data sharing by pharmaceutical companies to inform clinical practice.
  • The article details the types of data being shared and the early activities of GlaxoSmithKline (“in coordination with other companies such as Roche and ViiV”); Medtronic and the Yale University Open Data Access Project; and Janssen Pharmaceuticals (Johnson & Johnson). The article also describes the range of involvement in data sharing among pharmaceutical companies including Pfizer, Novartis, Bayer, AbbVie, Eli Llly, AstraZeneca, and Bristol-Myers Squibb.

Mann, Gideon. “Private Data and the Public Good.” Medium. May 17, 2016. http://bit.ly/1OgOY68.

    • This Medium post from Gideon Mann, the Head of Data Science at Bloomberg, shares his prepared remarks given at a lecture at the City College of New York. Mann argues for the potential benefits of increasing access to private sector data, both to improve research and academic inquiry and also to help solve practical, real-world problems. He also describes a number of initiatives underway at Bloomberg along these lines.    
  • Mann argues that data generated at private companies “could enable amazing discoveries and research,” but is often inaccessible to those who could put it to those uses. Beyond research, he notes that corporate data could, for instance, benefit:
      • Public health – including suicide prevention, addiction counseling and mental health monitoring.
    • Legal and ethical questions – especially as they relate to “the role algorithms have in decisions about our lives,” such as credit checks and resume screening.
  • Mann recognizes the privacy challenges inherent in private sector data sharing, but argues that it is a common misconception that the only two choices are “complete privacy or complete disclosure.” He believes that flexible frameworks for differential privacy could open up new opportunities for responsibly leveraging data collaboratives.

Pastor Escuredo, D., Morales-Guzmán, A. et al, “Flooding through the Lens of Mobile Phone Activity.” IEEE Global Humanitarian Technology Conference, GHTC 2014. Available from: http://bit.ly/1OzK2bK

  • This report describes the impact of using mobile data in order to understand the impact of disasters and improve disaster management. The report was conducted in the Mexican state of Tabasco in 2009 as a multidisciplinary, multi-stakeholder consortium involving the UN World Food Programme (WFP), Telefonica Research, Technical University of Madrid (UPM), Digital Strategy Coordination Office of the President of Mexico, and UN Global Pulse.
  • Telefonica Research, a division of the major Latin American telecommunications company, provided call detail records covering flood-affected areas for nine months. This data was combined with “remote sensing data (satellite images), rainfall data, census and civil protection data.” The results of the data demonstrated that “analysing mobile activity during floods could be used to potentially locate damaged areas, efficiently assess needs and allocate resources (for example, sending supplies to affected areas).”
  • In addition to the results, the study highlighted “the value of a public-private partnership on using mobile data to accurately indicate flooding impacts in Tabasco, thus improving early warning and crisis management.”

* Perkmann, M. and Schildt, H. “Open data partnerships between firms and universities: The role of boundary organizations.” Research Policy, 44(5), 2015. http://bit.ly/25RRJ2c

  • This paper discusses the concept of a “boundary organization” in relation to industry-academic partnerships driven by data. Boundary organizations perform mediated revealing, allowing firms to disclose their research problems to a broad audience of innovators and simultaneously minimize the risk that this information would be adversely used by competitors.
  • The authors identify two especially important challenges for private firms to enter open data or participate in data collaboratives with the academic research community that could be addressed through more involvement from boundary organizations:
    • First is a challenge of maintaining competitive advantage. The authors note that, “the more a firm attempts to align the efforts in an open data research programme with its R&D priorities, the more it will have to reveal about the problems it is addressing within its proprietary R&D.”
    • Second, involves the misalignment of incentives between the private and academic field. Perkmann and Schildt argue that, a firm seeking to build collaborations around its opened data “will have to provide suitable incentives that are aligned with academic scientists’ desire to be rewarded for their work within their respective communities.”

Robin, N., Klein, T., & Jütting, J. “Public-Private Partnerships for Statistics: Lessons Learned, Future Steps.” OECD. 2016. http://bit.ly/24FLYlD.

  • This working paper acknowledges the growing body of work on how different types of data (e.g, telecom data, social media, sensors and geospatial data, etc.) can address data gaps relevant to National Statistical Offices (NSOs).
  • Four models of public-private interaction for statistics are describe: in-house production of statistics by a data-provider for a national statistics office (NSO), transfer of data-sets to NSOs from private entities, transfer of data to a third party provider to manage the NSO and private entity data, and the outsourcing of NSO functions.
  • The paper highlights challenges to public-private partnerships involving data (e.g., technical challenges, data confidentiality, risks, limited incentives for participation), suggests deliberate and highly structured approaches to public-private partnerships involving data require enforceable contracts, emphasizes the trade-off between data specificity and accessibility of such data, and the importance of pricing mechanisms that reflect the capacity and capability of national statistic offices.
  • Case studies referenced in the paper include:
    • A mobile network operator’s (MNO Telefonica) in house analysis of call detail records;
    • A third-party data provider and steward of travel statistics (Positium);
    • The Data for Development (D4D) challenge organized by MNO Orange; and
    • Statistics Netherlands use of social media to predict consumer confidence.

Stuart, Elizabeth, Samman, Emma, Avis, William, Berliner, Tom. “The data revolution: finding the missing millions.” Overseas Development Institute, 2015. Available from: http://bit.ly/1bPKOjw

  • The authors of this report highlight the need for good quality, relevant, accessible and timely data for governments to extend services into underrepresented communities and implement policies towards a sustainable “data revolution.”
  • The solutions focused on this recent report from the Overseas Development Institute focus on capacity-building activities of national statistical offices (NSOs), alternative sources of data (including shared corporate data) to address gaps, and building strong data management systems.

Taylor, L., & Schroeder, R. “Is bigger better? The emergence of big data as a tool for international development policy.” GeoJournal, 80(4). 2015. 503-518. http://bit.ly/1RZgSy4.

  • This journal article describes how privately held data – namely “digital traces” of consumer activity – “are becoming seen by policymakers and researchers as a potential solution to the lack of reliable statistical data on lower-income countries.
  • They focus especially on three categories of data collaborative use cases:
    • Mobile data as a predictive tool for issues such as human mobility and economic activity;
    • Use of mobile data to inform humanitarian response to crises; and
    • Use of born-digital web data as a tool for predicting economic trends, and the implications these have for LMICs.
  • They note, however, that a number of challenges and drawbacks exist for these types of use cases, including:
    • Access to private data sources often must be negotiated or bought, “which potentially means substituting negotiations with corporations for those with national statistical offices;”
    • The meaning of such data is not always simple or stable, and local knowledge is needed to understand how people are using the technologies in question
    • Bias in proprietary data can be hard to understand and quantify;
    • Lack of privacy frameworks; and
    • Power asymmetries, wherein “LMIC citizens are unwittingly placed in a panopticon staffed by international researchers, with no way out and no legal recourse.”

van Panhuis, Willem G., Proma Paul, Claudia Emerson, John Grefenstette, Richard Wilder, Abraham J. Herbst, David Heymann, and Donald S. Burke. “A systematic review of barriers to data sharing in public health.” BMC public health 14, no. 1 (2014): 1144. Available from: http://bit.ly/1JOBruO

  • The authors of this report provide a “systematic literature of potential barriers to public health data sharing.” These twenty potential barriers are classified in six categories: “technical, motivational, economic, political, legal and ethical.” In this taxonomy, “the first three categories are deeply rooted in well-known challenges of health information systems for which structural solutions have yet to be found; the last three have solutions that lie in an international dialogue aimed at generating consensus on policies and instruments for data sharing.”
  • The authors suggest the need for a “systematic framework of barriers to data sharing in public health” in order to accelerate access and use of data for public good.

Verhulst, Stefaan and Sangokoya, David. “Mapping the Next Frontier of Open Data: Corporate Data Sharing.” In: Gasser, Urs and Zittrain, Jonathan and Faris, Robert and Heacock Jones, Rebekah, “Internet Monitor 2014: Reflections on the Digital World: Platforms, Policy, Privacy, and Public Discourse (December 15, 2014).” Berkman Center Research Publication No. 2014-17. http://bit.ly/1GC12a2

  • This essay describe a taxonomy of current corporate data sharing practices for public good: research partnerships; prizes and challenges; trusted intermediaries; application programming interfaces (APIs); intelligence products; and corporate data cooperatives or pooling.
  • Examples of data collaboratives include: Yelp Dataset Challenge, the Digital Ecologies Research Partnerhsip, BBVA Innova Challenge, Telecom Italia’s Big Data Challenge, NIH’s Accelerating Medicines Partnership and the White House’s Climate Data Partnerships.
  • The authors highlight important questions to consider towards a more comprehensive mapping of these activities.

Verhulst, Stefaan and Sangokoya, David, 2015. “Data Collaboratives: Exchanging Data to Improve People’s Lives.” Medium. Available from: http://bit.ly/1JOBDdy

  • The essay refers to data collaboratives as a new form of collaboration involving participants from different sectors exchanging data to help solve public problems. These forms of collaborations can improve people’s lives through data-driven decision-making; information exchange and coordination; and shared standards and frameworks for multi-actor, multi-sector participation.
  • The essay cites four activities that are critical to accelerating data collaboratives: documenting value and measuring impact; matching public demand and corporate supply of data in a trusted way; training and convening data providers and users; experimenting and scaling existing initiatives.
  • Examples of data collaboratives include NIH’s Precision Medicine Initiative; the Mobile Data, Environmental Extremes and Population (MDEEP) Project; and Twitter-MIT’s Laboratory for Social Machines.

Verhulst, Stefaan, Susha, Iryna, Kostura, Alexander. “Data Collaboratives: matching Supply of (Corporate) Data to Solve Public Problems.” Medium. February 24, 2016. http://bit.ly/1ZEp2Sr.

  • This piece articulates a set of key lessons learned during a session at the International Data Responsibility Conference focused on identifying emerging practices, opportunities and challenges confronting data collaboratives.
  • The authors list a number of privately held data sources that could create positive public impacts if made more accessible in a collaborative manner, including:
    • Data for early warning systems to help mitigate the effects of natural disasters;
    • Data to help understand human behavior as it relates to nutrition and livelihoods in developing countries;
    • Data to monitor compliance with weapons treaties;
    • Data to more accurately measure progress related to the UN Sustainable Development Goals.
  • To the end of identifying and expanding on emerging practice in the space, the authors describe a number of current data collaborative experiments, including:
    • Trusted Intermediaries: Statistics Netherlands partnered with Vodafone to analyze mobile call data records in order to better understand mobility patterns and inform urban planning.
    • Prizes and Challenges: Orange Telecom, which has been a leader in this type of Data Collaboration, provided several examples of the company’s initiatives, such as the use of call data records to track the spread of malaria as well as their experience with Challenge 4 Development.
    • Research partnerships: The Data for Climate Action project is an ongoing large-scale initiative incentivizing companies to share their data to help researchers answer particular scientific questions related to climate change and adaptation.
    • Sharing intelligence products: JPMorgan Chase shares macro economic insights they gained leveraging their data through the newly established JPMorgan Chase Institute.
  • In order to capitalize on the opportunities provided by data collaboratives, a number of needs were identified:
    • A responsible data framework;
    • Increased insight into different business models that may facilitate the sharing of data;
    • Capacity to tap into the potential value of data;
    • Transparent stock of available data supply; and
    • Mapping emerging practices and models of sharing.

Vogel, N., Theisen, C., Leidig, J. P., Scripps, J., Graham, D. H., & Wolffe, G. “Mining mobile datasets to enable the fine-grained stochastic simulation of Ebola diffusion.” Paper presented at the Procedia Computer Science. 2015. http://bit.ly/1TZDroF.

  • The paper presents a research study conducted on the basis of the mobile calls records shared with researchers in the framework of the Data for Development Challenge by the mobile operator Orange.
  • The study discusses the data analysis approach in relation to developing a situation of Ebola diffusion built around “the interactions of multi-scale models, including viral loads (at the cellular level), disease progression (at the individual person level), disease propagation (at the workplace and family level), societal changes in migration and travel movements (at the population level), and mitigating interventions (at the abstract government policy level).”
  • The authors argue that the use of their population, mobility, and simulation models provide more accurate simulation details in comparison to high-level analytical predictions and that the D4D mobile datasets provide high-resolution information useful for modeling developing regions and hard to reach locations.

Welle Donker, F., van Loenen, B., & Bregt, A. K. “Open Data and Beyond.” ISPRS International Journal of Geo-Information, 5(4). 2016. http://bit.ly/22YtugY.

  • This research has developed a monitoring framework to assess the effects of open (private) data using a case study of a Dutch energy network administrator Liander.
  • Focusing on the potential impacts of open private energy data – beyond ‘smart disclosure’ where citizens are given information only about their own energy usage – the authors identify three attainable strategic goals:
    • Continuously optimize performance on services, security of supply, and costs;
    • Improve management of energy flows and insight into energy consumption;
    • Help customers save energy and switch over to renewable energy sources.
  • The authors propose a seven-step framework for assessing the impacts of Liander data, in particular, and open private data more generally:
    • Develop a performance framework to describe what the program is about, description of the organization’s mission and strategic goals;
    • Identify the most important elements, or key performance areas which are most critical to understanding and assessing your program’s success;
    • Select the most appropriate performance measures;
    • Determine the gaps between what information you need and what is available;
    • Develop and implement a measurement strategy to address the gaps;
    • Develop a performance report which highlights what you have accomplished and what you have learned;
    • Learn from your experiences and refine your approach as required.
  • While the authors note that the true impacts of this open private data will likely not come into view in the short term, they argue that, “Liander has successfully demonstrated that private energy companies can release open data, and has successfully championed the other Dutch network administrators to follow suit.”

World Economic Forum, 2015. “Data-driven development: pathways for progress.” Geneva: World Economic Forum. http://bit.ly/1JOBS8u

  • This report captures an overview of the existing data deficit and the value and impact of big data for sustainable development.
  • The authors of the report focus on four main priorities towards a sustainable data revolution: commercial incentives and trusted agreements with public- and private-sector actors; the development of shared policy frameworks, legal protections and impact assessments; capacity building activities at the institutional, community, local and individual level; and lastly, recognizing individuals as both produces and consumers of data.

Selected Readings on Crowdsourcing Expertise


The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of crowdsourcing was originally published in 2014.

Crowdsourcing enables leaders and citizens to work together to solve public problems in new and innovative ways. New tools and platforms enable citizens with differing levels of knowledge, expertise, experience and abilities to collaborate and solve problems together. Identifying experts, or individuals with specialized skills, knowledge or abilities with regard to a specific topic, and incentivizing their participation in crowdsourcing information, knowledge or experience to achieve a shared goal can enhance the efficiency and effectiveness of problem solving.

Selected Reading List (in alphabetical order)

Annotated Selected Reading List (in alphabetical order)

Börner, Katy, Michael Conlon, Jon Corson-Rikert, and Ying Ding. “VIVO: A Semantic Approach to Scholarly Networking and Discovery.” Synthesis Lectures on the Semantic Web: Theory and Technology 2, no. 1 (October 17, 2012): 1–178. http://bit.ly/17huggT.

  • This e-book “provides an introduction to VIVO…a tool for representing information about research and researchers — their scholarly works, research interests, and organizational relationships.”
  • VIVO is a response to the fact that, “Information for scholars — and about scholarly activity — has not kept pace with the increasing demands and expectations. Information remains siloed in legacy systems and behind various access controls that must be licensed or otherwise negotiated before access. Information representation is in its infancy. The raw material of scholarship — the data and information regarding previous work — is not available in common formats with common semantics.”
  • Providing access to structured information on the work and experience of a diversity of scholars enables improved expert finding — “identifying and engaging experts whose scholarly works is of value to one’s own. To find experts, one needs rich data regarding one’s own work and the work of potential related experts. The authors argue that expert finding is of increasing importance since, “[m]ulti-disciplinary and inter-disciplinary investigation is increasingly required to address complex problems. 

Bozzon, Alessandro, Marco Brambilla, Stefano Ceri, Matteo Silvestri, and Giuliano Vesci. “Choosing the Right Crowd: Expert Finding in Social Networks.” In Proceedings of the 16th International Conference on Extending Database Technology, 637–648. EDBT  ’13. New York, NY, USA: ACM, 2013. http://bit.ly/18QbtY5.

  • This paper explores the challenge of selecting experts within the population of social networks by considering the following problem: “given an expertise need (expressed for instance as a natural language query) and a set of social network members, who are the most knowledgeable people for addressing that need?”
  • The authors come to the following conclusions:
    • “profile information is generally less effective than information about resources that they directly create, own or annotate;
    • resources which are produced by others (resources appearing on the person’s Facebook wall or produced by people that she follows on Twitter) help increasing the assessment precision;
    • Twitter appears the most effective social network for expertise matching, as it very frequently outperforms all other social networks (either combined or alone);
    • Twitter appears as well very effective for matching expertise in domains such as computer engineering, science, sport, and technology & games, but Facebook is also very effective in fields such as locations, music, sport, and movies & tv;
    • surprisingly, LinkedIn appears less effective than other social networks in all domains (including computer science) and overall.”

Brabham, Daren C. “The Myth of Amateur Crowds.” Information, Communication & Society 15, no. 3 (2012): 394–410. http://bit.ly/1hdnGJV.

  • Unlike most of the related literature, this paper focuses on bringing attention to the expertise already being tapped by crowdsourcing efforts rather than determining ways to identify more dormant expertise to improve the results of crowdsourcing.
  • Brabham comes to two central conclusions: “(1) crowdsourcing is discussed in the popular press as a process driven by amateurs and hobbyists, yet empirical research on crowdsourcing indicates that crowds are largely self-selected professionals and experts who opt-in to crowdsourcing arrangements; and (2) the myth of the amateur in crowdsourcing ventures works to label crowds as mere hobbyists who see crowdsourcing ventures as opportunities for creative expression, as entertainment, or as opportunities to pass the time when bored. This amateur/hobbyist label then undermines the fact that large amounts of real work and expert knowledge are exerted by crowds for relatively little reward and to serve the profit motives of companies. 

Dutton, William H. Networking Distributed Public Expertise: Strategies for Citizen Sourcing Advice to Government. One of a Series of Occasional Papers in Science and Technology Policy, Science and Technology Policy Institute, Institute for Defense Analyses, February 23, 2011. http://bit.ly/1c1bpEB.

  • In this paper, a case is made for more structured and well-managed crowdsourcing efforts within government. Specifically, the paper “explains how collaborative networking can be used to harness the distributed expertise of citizens, as distinguished from citizen consultation, which seeks to engage citizens — each on an equal footing.” Instead of looking for answers from an undefined crowd, Dutton proposes “networking the public as advisors” by seeking to “involve experts on particular public issues and problems distributed anywhere in the world.”
  • Dutton argues that expert-based crowdsourcing can be successfully for government for a number of reasons:
    • Direct communication with a diversity of independent experts
    • The convening power of government
    • Compatibility with open government and open innovation
    • Synergy with citizen consultation
    • Building on experience with paid consultants
    • Speed and urgency
    • Centrality of documents to policy and practice.
  • He also proposes a nine-step process for government to foster bottom-up collaboration networks:
    • Do not reinvent the technology
    • Focus on activities, not the tools
    • Start small, but capable of scaling up
    • Modularize
    • Be open and flexible in finding and going to communities of experts
    • Do not concentrate on one approach to all problems
    • Cultivate the bottom-up development of multiple projects
    • Experience networking and collaborating — be a networked individual
    • Capture, reward, and publicize success.

Goel, Gagan, Afshin Nikzad and Adish Singla. “Matching Workers with Tasks: Incentives in Heterogeneous Crowdsourcing Markets.” Under review by the International World Wide Web Conference (WWW). 2014. http://bit.ly/1qHBkdf

  • Combining the notions of crowdsourcing expertise and crowdsourcing tasks, this paper focuses on the challenge within platforms like Mechanical Turk related to intelligently matching tasks to workers.
  • The authors’ call for more strategic assignment of tasks in crowdsourcing markets is based on the understanding that “each worker has certain expertise and interests which define the set of tasks she can and is willing to do.”
  • Focusing on developing meaningful incentives based on varying levels of expertise, the authors sought to create a mechanism that, “i) is incentive compatible in the sense that it is truthful for agents to report their true cost, ii) picks a set of workers and assigns them to the tasks they are eligible for in order to maximize the utility of the requester, iii) makes sure total payments made to the workers doesn’t exceed the budget of the requester.

Gubanov, D., N. Korgin, D. Novikov and A. Kalkov. E-Expertise: Modern Collective Intelligence. Springer, Studies in Computational Intelligence 558, 2014. http://bit.ly/U1sxX7

  • In this book, the authors focus on “organization and mechanisms of expert decision-making support using modern information and communication technologies, as well as information analysis and collective intelligence technologies (electronic expertise or simply e-expertise).”
  • The book, which “addresses a wide range of readers interested in management, decision-making and expert activity in political, economic, social and industrial spheres, is broken into five chapters:
    • Chapter 1 (E-Expertise) discusses the role of e-expertise in decision-making processes. The procedures of e-expertise are classified, their benefits and shortcomings are identified, and the efficiency conditions are considered.
    • Chapter 2 (Expert Technologies and Principles) provides a comprehensive overview of modern expert technologies. A special emphasis is placed on the specifics of e-expertise. Moreover, the authors study the feasibility and reasonability of employing well-known methods and approaches in e-expertise.
    • Chapter 3 (E-Expertise: Organization and Technologies) describes some examples of up-to-date technologies to perform e-expertise.
    • Chapter 4 (Trust Networks and Competence Networks) deals with the problems of expert finding and grouping by information and communication technologies.
    • Chapter 5 (Active Expertise) treats the problem of expertise stability against any strategic manipulation by experts or coordinators pursuing individual goals.

Holst, Cathrine. “Expertise and Democracy.” ARENA Report No 1/14, Center for European Studies, University of Oslo. http://bit.ly/1nm3rh4

  • This report contains a set of 16 papers focused on the concept of “epistocracy,” meaning the “rule of knowers.” The papers inquire into the role of knowledge and expertise in modern democracies and especially in the European Union (EU). Major themes are: expert-rule and democratic legitimacy; the role of knowledge and expertise in EU governance; and the European Commission’s use of expertise.
    • Expert-rule and democratic legitimacy
      • Papers within this theme concentrate on issues such as the “implications of modern democracies’ knowledge and expertise dependence for political and democratic theory.” Topics include the accountability of experts, the legitimacy of expert arrangements within democracies, the role of evidence in policy-making, how expertise can be problematic in democratic contexts, and “ethical expertise” and its place in epistemic democracies.
    • The role of knowledge and expertise in EU governance
      • Papers within this theme concentrate on “general trends and developments in the EU with regard to the role of expertise and experts in political decision-making, the implications for the EU’s democratic legitimacy, and analytical strategies for studying expertise and democratic legitimacy in an EU context.”
    • The European Commission’s use of expertise
      • Papers within this theme concentrate on how the European Commission uses expertise and in particular the European Commission’s “expertgroup system.” Topics include the European Citizen’s Initiative, analytic-deliberative processes in EU food safety, the operation of EU environmental agencies, and the autonomy of various EU agencies.

King, Andrew and Karim R. Lakhani. “Using Open Innovation to Identify the Best Ideas.” MIT Sloan Management Review, September 11, 2013. http://bit.ly/HjVOpi.

  • In this paper, King and Lakhani examine different methods for opening innovation, where, “[i]nstead of doing everything in-house, companies can tap into the ideas cloud of external expertise to develop new products and services.”
  • The three types of open innovation discussed are: opening the idea-creation process, competitions where prizes are offered and designers bid with possible solutions; opening the idea-selection process, ‘approval contests’ in which outsiders vote to determine which entries should be pursued; and opening both idea generation and selection, an option used especially by organizations focused on quickly changing needs.

Long, Chengjiang, Gang Hua and Ashish Kapoor. Active Visual Recognition with Expertise Estimation in Crowdsourcing. 2013 IEEE International Conference on Computer Vision. December 2013. http://bit.ly/1lRWFur.

  • This paper is focused on improving the crowdsourced labeling of visual datasets from platforms like Mechanical Turk. The authors note that, “Although it is cheap to obtain large quantity of labels through crowdsourcing, it has been well known that the collected labels could be very noisy. So it is desirable to model the expertise level of the labelers to ensure the quality of the labels. The higher the expertise level a labeler is at, the lower the label noises he/she will produce.”
  • Based on the need for identifying expert labelers upfront, the authors developed an “active classifier learning system which determines which users to label which unlabeled examples” from collected visual datasets.
  • The researchers’ experiments in identifying expert visual dataset labelers led to findings demonstrating that the “active selection” of expert labelers is beneficial in cutting through the noise of crowdsourcing platforms.

Noveck, Beth Simone. “’Peer to Patent’: Collective Intelligence, Open Review, and Patent Reform.” Harvard Journal of Law & Technology 20, no. 1 (Fall 2006): 123–162. http://bit.ly/HegzTT.

  • This law review article introduces the idea of crowdsourcing expertise to mitigate the challenge of patent processing. Noveck argues that, “access to information is the crux of the patent quality problem. Patent examiners currently make decisions about the grant of a patent that will shape an industry for a twenty-year period on the basis of a limited subset of available information. Examiners may neither consult the public, talk to experts, nor, in many cases, even use the Internet.”
  • Peer-to-Patent, which launched three years after this article, is based on the idea that, “The new generation of social software might not only make it easier to find friends but also to find expertise that can be applied to legal and policy decision-making. This way, we can improve upon the Constitutional promise to promote the progress of science and the useful arts in our democracy by ensuring that only worth ideas receive that ‘odious monopoly’ of which Thomas Jefferson complained.”

Ober, Josiah. “Democracy’s Wisdom: An Aristotelian Middle Way for Collective Judgment.” American Political Science Review 107, no. 01 (2013): 104–122. http://bit.ly/1cgf857.

  • In this paper, Ober argues that, “A satisfactory model of decision-making in an epistemic democracy must respect democratic values, while advancing citizens’ interests, by taking account of relevant knowledge about the world.”
  • Ober describes an approach to decision-making that aggregates expertise across multiple domains. This “Relevant Expertise Aggregation (REA) enables a body of minimally competent voters to make superior choices among multiple options, on matters of common interest.”

Sims, Max H., Jeffrey Bigham, Henry Kautz and Marc W. Halterman. Crowdsourcing medical expertise in near real time.” Journal of Hospital Medicine 9, no. 7, July 2014. http://bit.ly/1kAKvq7.

  • In this article, the authors discuss the develoment of a mobile application called DocCHIRP, which was developed due to the fact that, “although the Internet creates unprecedented access to information, gaps in the medical literature and inefficient searches often leave healthcare providers’ questions unanswered.”
  • The DocCHIRP pilot project used a “system of point-to-multipoint push notifications designed to help providers problem solve by crowdsourcing from their peers.”
  • Healthcare providers (HCPs) sought to gain intelligence from the crowd, which included 85 registered users, on questions related to medication, complex medical decision making, standard of care, administrative, testing and referrals.
  • The authors believe that, “if future iterations of the mobile crowdsourcing applications can address…adoption barriers and support the organic growth of the crowd of HCPs,” then “the approach could have a positive and transformative effect on how providers acquire relevant knowledge and care for patients.”

Spina, Alessandro. “Scientific Expertise and Open Government in the Digital Era: Some Reflections on EFSA and Other EU Agencies.” in Foundations of EU Food Law and Policy, eds. A. Alemmano and S. Gabbi. Ashgate, 2014. http://bit.ly/1k2EwdD.

  • In this paper, Spina “presents some reflections on how the collaborative and crowdsourcing practices of Open Government could be integrated in the activities of EFSA [European Food Safety Authority] and other EU agencies,” with a particular focus on “highlighting the benefits of the Open Government paradigm for expert regulatory bodies in the EU.”
  • Spina argues that the “crowdsourcing of expertise and the reconfiguration of the information flows between European agencies and teh public could represent a concrete possibility of modernising the role of agencies with a new model that has a low financial burden and an almost immediate effect on the legal governance of agencies.”
  • He concludes that, “It is becoming evident that in order to guarantee that the best scientific expertise is provided to EU institutions and citizens, EFSA should strive to use the best organisational models to source science and expertise.”

Urban Analytics (Updated and Expanded)


As part of an ongoing effort to build a knowledge base for the field of opening governance by organizing and disseminating its learnings, the GovLab Selected Readings series provides an annotated and curated collection of recommended works on key opening governance topics. In this edition, we explore the literature on Urban Analytics. To suggest additional readings on this or any other topic, please email biblio@thegovlab.org.

Data and its uses for Governance

Urban Analytics places better information in the hands of citizens as well as government officials to empower people to make more informed choices. Today, we are able to gather real-time information about traffic, pollution, noise, and environmental and safety conditions by culling data from a range of tools: from the low-cost sensors in mobile phones to more robust monitoring tools installed in our environment. With data collected and combined from the built, natural and human environments, we can develop more robust predictive models and use those models to make policy smarter.

With the computing power to transmit and store the data from these sensors, and the tools to translate raw data into meaningful visualizations, we can identify problems as they happen, design new strategies for city management, and target the application of scarce resources where they are most needed.

Selected Reading List (in alphabetical order)

Annotated Selected Reading List (in alphabetical order)
Amini, L., E. Bouillet, F. Calabrese, L. Gasparini, and O. Verscheure. “Challenges and Results in City-scale Sensing.” In IEEE Sensors, 59–61, 2011. http://bit.ly/1doodZm.

  • This paper examines “how city requirements map to research challenges in machine learning, optimization, control, visualization, and semantic analysis.”
  • The authors raises several research challenges including how to extract accurate information when the data is noisy and sparse; how to represent findings from digital pervasive technologies; and how people interact with one another and their environment.

Batty, M., K. W. Axhausen, F. Giannotti, A. Pozdnoukhov, A. Bazzani, M. Wachowicz, G. Ouzounis, and Y. Portugali. “Smart Cities of the Future.The European Physical Journal Special Topics 214, no. 1 (November 1, 2012): 481–518. http://bit.ly/HefbjZ.

  • This paper explores the goals and research challenges involved in the development of smart cities that merge ICT with traditional infrastructures through digital technologies.
  • The authors put forth several research objectives, including: 1) to explore the notion of the city as a laboratory for innovation; 2) to develop technologies that ensure equity, fairness and realize a better quality of city life; and 3) to develop technologies that ensure informed participation and create shared knowledge for democratic city governance.
  • The paper also examines several contemporary smart city initiatives, expected paradigm shifts in the field, benefits, risks and impacts.

Budde, Paul. “Smart Cities of Tomorrow.” In Cities for Smart Environmental and Energy Futures, edited by Stamatina Th Rassia and Panos M. Pardalos, 9–20. Energy Systems. Springer Berlin Heidelberg, 2014. http://bit.ly/17MqPZW.

  • This paper examines the components and strategies involved in the creation of smart cities featuring “cohesive and open telecommunication and software architecture.”
  • In their study of smart cities, the authors examine smart and renewable energy; next-generation networks; smart buildings; smart transport; and smart government.
  • They conclude that for the development of smart cities, information and communication technology (ICT) is needed to build more horizontal collaborative structures, useful data must be analyzed in real time and people and/or machines must be able to make instant decisions related to social and urban life.

Cardone, G., L. Foschini, P. Bellavista, A. Corradi, C. Borcea, M. Talasila, and R. Curtmola. “Fostering Participaction in Smart Cities: a Geo-social Crowdsensing Platform.” IEEE Communications
Magazine 51, no. 6 (2013): 112–119. http://bit.ly/17iJ0vZ.

  • This article examines “how and to what extent the power of collective although imprecise intelligence can be employed in smart cities.”
  • To tackle problems of managing the crowdsensing process, this article proposes a “crowdsensing platform with three main original technical aspects: an innovative geo-social model to profile users along different variables, such as time, location, social interaction, service usage, and human activities; a matching algorithm to autonomously choose people to involve in participActions and to quantify the performance of their sensing; and a new Android-based platform to collect sensing data from smart phones, automatically or with user help, and to deliver sensing/actuation tasks to users.”

Chen, Chien-Chu. “The Trend towards ‘Smart Cities.’” International Journal of Automation and Smart Technology. June 1, 2014. http://bit.ly/1jOOaAg.

  • In this study, Chen explores the ambitions, prevalence and outcomes of a variety of smart cities, organized into five categories:
    • Transportation-focused smart cities
    • Energy-focused smart cities
    • Building-focused smart cities
    • Water-resources-focused smart cities
    • Governance-focused smart cities
  • The study finds that the “Asia Pacific region accounts for the largest share of all smart city development plans worldwide, with 51% of the global total. Smart city development plans in the Asia Pacific region tend to be energy-focused smart city initiatives, aimed at easing the pressure on energy resources that will be caused by continuing rapid urbanization in the future.”
  • North America, on the other hand is generally more geared toward energy-focused smart city development plans. “In North America, there has been a major drive to introduce smart meters and smart electric power grids, integrating the electric power sector with information and communications technology (ICT) and replacing obsolete electric power infrastructure, so as to make cities’ electric power systems more reliable (which in turn can help to boost private-sector investment, stimulate the growth of the ‘green energy’ industry, and create more job opportunities).”
  • Looking to Taiwan as an example, Chen argues that, “Cities in different parts of the world face different problems and challenges when it comes to urban development, making it necessary to utilize technology applications from different fields to solve the unique problems that each individual city has to overcome; the emphasis here is on the development of customized solutions for smart city development.”

Domingo, A., B. Bellalta, M. Palacin, M. Oliver and E. Almirall. “Public Open Sensor Data: Revolutionizing Smart Cities.” Technology and Society Magazine, IEEE 32, No. 4. Winter 2013. http://bit.ly/1iH6ekU.

  • In this article, the authors explore the “enormous amount of information collected by sensor devices” that allows for “the automation of several real-time services to improve city management by using intelligent traffic-light patterns during rush hour, reducing water consumption in parks, or efficiently routing garbage collection trucks throughout the city.”
  • They argue that, “To achieve the goal of sharing and open data to the public, some technical expertise on the part of citizens will be required. A real environment – or platform – will be needed to achieve this goal.” They go on to introduce a variety of “technical challenges and considerations involved in building an Open Sensor Data platform,” including:
    • Scalability
    • Reliability
    • Low latency
    • Standardized formats
    • Standardized connectivity
  • The authors conclude that, despite incredible advancements in urban analytics and open sensing in recent years, “Today, we can only imagine the revolution in Open Data as an introduction to a real-time world mashup with temperature, humidity, CO2 emission, transport, tourism attractions, events, water and gas consumption, politics decisions, emergencies, etc., and all of this interacting with us to help improve the future decisions we make in our public and private lives.”

Harrison, C., B. Eckman, R. Hamilton, P. Hartswick, J. Kalagnanam, J. Paraszczak, and P. Williams. “Foundations for Smarter Cities.” IBM Journal of Research and Development 54, no. 4 (2010): 1–16. http://bit.ly/1iha6CR.

  • This paper describes the information technology (IT) foundation and principles for Smarter Cities.
  • The authors introduce three foundational concepts of smarter cities: instrumented, interconnected and intelligent.
  • They also describe some of the major needs of contemporary cities, and concludes that Creating the Smarter City implies capturing and accelerating flows of information both vertically and horizontally.

Hernández-Muñoz, José M., Jesús Bernat Vercher, Luis Muñoz, José A. Galache, Mirko Presser, Luis A. Hernández Gómez, and Jan Pettersson. “Smart Cities at the Forefront of the Future Internet.” In The Future Internet, edited by John Domingue, Alex Galis, Anastasius Gavras, Theodore Zahariadis, Dave Lambert, Frances Cleary, Petros Daras, et al., 447–462. Lecture Notes in Computer Science 6656. Springer Berlin Heidelberg, 2011. http://bit.ly/HhNbMX.

  • This paper explores how the “Internet of Things (IoT) and Internet of Services (IoS), can become building blocks to progress towards a unified urban-scale ICT platform transforming a Smart City into an open innovation platform.”
  • The authors examine the SmartSantander project to argue that, “the different stakeholders involved in the smart city business is so big that many non-technical constraints must be considered (users, public administrations, vendors, etc.).”
  • The authors also discuss the need for infrastructures at the, for instance, European level for realistic large-scale experimentally-driven research.

Hoon-Lee, Jung, Marguerite Gong Hancock, Mei-Chih Hu. “Towards an effective framework for building smart cities: Lessons from Seoul and San Francisco.” Technological Forecasting and Social Change. Ocotober 3, 2013. http://bit.ly/1rzID5v.

  • In this study, the authors aim to “shed light on the process of building an effective smart city by integrating various practical perspectives with a consideration of smart city characteristics taken from the literature.”
  • They propose a conceptual framework based on case studies from Seoul and San Francisco built around the following dimensions:
    • Urban openness
    • Service innovation
    • Partnerships formation
    • Urban proactiveness
    • Smart city infrastructure integration
    • Smart city governance
  • The authors conclude with a summary of research findings featuring “8 stylized facts”:
    • Movement towards more interactive services engaging citizens;
    • Open data movement facilitates open innovation;
    • Diversifying service development: exploit or explore?
    • How to accelerate adoption: top-down public driven vs. bottom-up market driven partnerships;
    • Advanced intelligent technology supports new value-added smart city services;
    • Smart city services combined with robust incentive systems empower engagement;
    • Multiple device & network accessibility can create network effects for smart city services;
    • Centralized leadership implementing a comprehensive strategy boosts smart initiatives.

Kamel Boulos, Maged N. and Najeeb M. Al-Shorbaji. “On the Internet of Things, smart cities and the WHO Healthy Cities.” International Journal of Health Geographics 13, No. 10. 2014. http://bit.ly/Tkt9GA.

  • In this article, the authors give a “brief overview of the Internet of Things (IoT) for cities, offering examples of IoT-powered 21st century smart cities, including the experience of the Spanish city of Barcelona in implementing its own IoT-driven services to improve the quality of life of its people through measures that promote an eco-friendly, sustainable environment.”
  • The authors argue that one of the central needs for harnessing the power of the IoT and urban analytics is for cities to “involve and engage its stakeholders from a very early stage (city officials at all levels, as well as citizens), and to secure their support by raising awareness and educating them about smart city technologies, the associated benefits, and the likely challenges that will need to be overcome (such as privacy issues).”
  • They conclude that, “The Internet of Things is rapidly gaining a central place as key enabler of the smarter cities of today and the future. Such cities also stand better chances of becoming healthier cities.”

Keller, Sallie Ann, Steven E. Koonin, and Stephanie Shipp. “Big Data and City Living – What Can It Do for Us?Significance 9, no. 4 (2012): 4–7. http://bit.ly/166W3NP.

  • This article provides a short introduction to Big Data, its importance, and the ways in which it is transforming cities. After an overview of the social benefits of big data in an urban context, the article examines its challenges, such as privacy concerns and institutional barriers.
  • The authors recommend that new approaches to making data available for research are needed that do not violate the privacy of entities included in the datasets. They believe that balancing privacy and accessibility issues will require new government regulations and incentives.

Kitchin, Rob. “The Real-Time City? Big Data and Smart Urbanism.” SSRN Scholarly Paper. Rochester, NY: Social Science Research Network, July 3, 2013. http://bit.ly/1aamZj2.

  • This paper focuses on “how cities are being instrumented with digital devices and infrastructure that produce ‘big data’ which enable real-time analysis of city life, new modes of technocratic urban governance, and a re-imagining of cities.”
  • The authors provide “a number of projects that seek to produce a real-time analysis of the city and provides a critical reflection on the implications of big data and smart urbanism.”

Mostashari, A., F. Arnold, M. Maurer, and J. Wade. “Citizens as Sensors: The Cognitive City Paradigm.” In 2011 8th International Conference Expo on Emerging Technologies for a Smarter World (CEWIT), 1–5, 2011. http://bit.ly/1fYe9an.

  • This paper argues that. “implementing sensor networks are a necessary but not sufficient approach to improving urban living.”
  • The authors introduce the concept of the “Cognitive City” – a city that can not only operate more efficiently due to networked architecture, but can also learn to improve its service conditions, by planning, deciding and acting on perceived conditions.
  • Based on this conceptualization of a smart city as a cognitive city, the authors propose “an architectural process approach that allows city decision-makers and service providers to integrate cognition into urban processes.”

Oliver, M., M. Palacin, A. Domingo, and V. Valls. “Sensor Information Fueling Open Data.” In Computer Software and Applications Conference Workshops (COMPSACW), 2012 IEEE 36th Annual, 116–121, 2012. http://bit.ly/HjV4jS.

  • This paper introduces the concept of sensor networks as a key component in the smart cities framework, and shows how real-time data provided by different city network sensors enrich Open Data portals and require a new architecture to deal with massive amounts of continuously flowing information.
  • The authors’ main conclusion is that by providing a framework to build new applications and services using public static and dynamic data that promote innovation, a real-time open sensor network data platform can have several positive effects for citizens.

Perera, Charith, Arkady Zaslavsky, Peter Christen and Dimitrios Georgakopoulos. “Sensing as a service model for smart cities supported by Internet of Things.” Transactions on Emerging Telecommunications Technologies 25, Issue 1. January 2014. http://bit.ly/1qJLDP9.

  • This paper looks into the “enormous pressure towards efficient city management” that has “triggered various Smart City initiatives by both government and private sector businesses to invest in information and communication technologies to find sustainable solutions to the growing issues.”
  • The authors explore the parallel advancement of the Internet of Things (IoT), which “envisions to connect billions of sensors to the Internet and expects to use them for efficient and effective resource management in Smart Cities.”
  • The paper proposes the sensing as a service model “as a solution based on IoT infrastructure.” The sensing as a service model consists of four conceptual layers: “(i) sensors and sensor owners; (ii) sensor publishers (SPs); (iii) extended service providers (ESPs); and (iv) sensor data consumers. They go on to describe how this model would work in the areas of waste management, smart agriculture and environmental management.

Privacy, Big Data, and the Public Good: Frameworks for Engagement. Edited by Julia Lane, Victoria Stodden, Stefan Bender, and Helen Nissenbaum; Cambridge University Press, 2014. http://bit.ly/UoGRca.

  • This book focuses on the legal, practical, and statistical approaches for maximizing the use of massive datasets while minimizing information risk.
  • “Big data” is more than a straightforward change in technology.  It poses deep challenges to our traditions of notice and consent as tools for managing privacy.  Because our new tools of data science can make it all but impossible to guarantee anonymity in the future, the authors question whether it possible to truly give informed consent, when we cannot, by definition, know what the risks are from revealing personal data either for individuals or for society as a whole.
  • Based on their experience building large data collections, authors discuss some of the best practical ways to provide access while protecting confidentiality.  What have we learned about effective engineered controls?  About effective access policies?  About designing data systems that reinforce – rather than counter – access policies?  They also explore the business, legal, and technical standards necessary for a new deal on data.
  • Since the data generating process or the data collection process is not necessarily well understood for big data streams, authors discuss what statistics can tell us about how to make greatest scientific use of this data. They also explore the shortcomings of current disclosure limitation approaches and whether we can quantify the extent of privacy loss.

Schaffers, Hans, Nicos Komninos, Marc Pallot, Brigitte Trousse, Michael Nilsson, and Alvaro Oliveira. “Smart Cities and the Future Internet: Towards Cooperation Frameworks for Open Innovation.” In The Future Internet, edited by John Domingue, Alex Galis, Anastasius Gavras, Theodore Zahariadis, Dave Lambert, Frances Cleary, Petros Daras, et al., 431–446. Lecture Notes in Computer Science 6656. Springer Berlin Heidelberg, 2011. http://bit.ly/16ytKoT.

  • This paper “explores ‘smart cities’ as environments of open and user-driven innovation for experimenting and validating Future Internet-enabled services.”
  • The authors examine several smart city projects to illustrate the central role of users in defining smart services and the importance of participation. They argue that, “Two different layers of collaboration can be distinguished. The first layer is collaboration within the innovation process. The second layer concerns collaboration at the territorial level, driven by urban and regional development policies aiming at strengthening the urban innovation systems through creating effective conditions for sustainable innovation.”

Suciu, G., A. Vulpe, S. Halunga, O. Fratu, G. Todoran, and V. Suciu. “Smart Cities Built on Resilient Cloud Computing and Secure Internet of Things.” In 2013 19th International Conference on Control Systems and Computer Science (CSCS), 513–518, 2013. http://bit.ly/16wfNgv.

  • This paper proposes “a new platform for using cloud computing capacities for provision and support of ubiquitous connectivity and real-time applications and services for smart cities’ needs.”
  • The authors present a “framework for data procured from highly distributed, heterogeneous, decentralized, real and virtual devices (sensors, actuators, smart devices) that can be automatically managed, analyzed and controlled by distributed cloud-based services.”

Townsend, Anthony. Smart Cities: Big Data, Civic Hackers, and the Quest for a New Utopia. W. W. Norton & Company, 2013.

  • In this book, Townsend illustrates how “cities worldwide are deploying technology to address both the timeless challenges of government and the mounting problems posed by human settlements of previously unimaginable size and complexity.”
  • He also considers “the motivations, aspirations, and shortcomings” of the many stakeholders involved in the development of smart cities, and poses a new civics to guide these efforts.
  • He argues that smart cities are not made smart by various, soon-to-be-obsolete technologies built into its infrastructure, but how citizens use these ever-changing technologies to be “human-centered, inclusive and resilient.”

To stay current on recent writings and developments on Urban Analytics, please subscribe to the GovLab Digest.
Did we miss anything? Please submit reading recommendations to biblio@thegovlab.org or in the comments below.