Index: Open Data


By Alexandra Shaw, Michelle Winowatan, Andrew Young, and Stefaan Verhulst

The Living Library Index – inspired by the Harper’s Index – provides important statistics and highlights global trends in governance innovation. This installment focuses on open data and was originally published in 2018.

Value and Impact

  • The projected year by which all EU28+ countries will have a fully operating open data portal: 2020

  • Expected growth in the market size of open data in Europe between 2016 and 2020: 36.9%, reaching EUR 75.7 billion by 2020

Public Views on and Use of Open Government Data

  • Number of Americans who do not trust the federal government or social media sites to protect their data: Approximately 50%

  • Key findings from The Economist Intelligence Unit report on Open Government Data Demand:

    • Percentage of respondents who say the key reason why governments open up their data is to create greater trust between the government and citizens: 70%

    • Percentage of respondents who say OGD plays an important role in improving lives of citizens: 78%

    • Percentage of respondents who say OGD helps with daily decision making especially for transportation, education, environment: 53%

    • Percentage of respondents who cite lack of awareness about OGD and its potential use and benefits as the greatest barrier to usage: 50%

    • Percentage of respondents who say they lack access to usable and relevant data: 31%

    • Percentage of respondents who think they don’t have sufficient technical skills to use open government data: 25%

    • Percentage of respondents who feel the number of OGD apps available is insufficient, indicating an opportunity for app developers: 20%

    • Percentage of respondents who say OGD has the potential to generate economic value and new business opportunity: 61%

    • Percentage of respondents who say they don’t trust governments to keep data safe, protected, and anonymized: 19%

Efforts and Involvement

  • Time since open government advocates convened to create a set of principles for open government data – the event that launched the open government data movement: 10 years

  • Participants in the Open Government Partnership today: 79 countries and 20 subnational governments

  • Percentage of “open data readiness” in Europe according to European Data Portal: 72%

    • Open data readiness comprises four indicators: presence of policy, national coordination, licensing norms, and use of data.

  • Number of U.S. cities with Open Data portals: 27

  • Number of governments who have adopted the International Open Data Charter: 62

  • Number of non-state organizations endorsing the International Open Data Charter: 57

  • Number of countries analyzed by the Open Data Index: 94

  • Number of Latin American countries that do not have open data portals as of 2017: 4 total – Belize, Guatemala, Honduras and Nicaragua

  • Number of cities participating in the Open Data Census: 39

Demand for Open Data

  • Open data demand measured by frequency of open government data use according to The Economist Intelligence Unit report:

    • Australia

      • Monthly: 15% of respondents

      • Quarterly: 22% of respondents

      • Annually: 10% of respondents

    • Finland

      • Monthly: 28% of respondents

      • Quarterly: 18% of respondents

      • Annually: 20% of respondents

    • France

      • Monthly: 27% of respondents

      • Quarterly: 17% of respondents

      • Annually: 19% of respondents

    • India

      • Monthly: 29% of respondents

      • Quarterly: 20% of respondents

      • Annually: 10% of respondents

    • Singapore

      • Monthly: 28% of respondents

      • Quarterly: 15% of respondents

      • Annually: 17% of respondents 

    • UK

      • Monthly: 23% of respondents

      • Quarterly: 21% of respondents

      • Annually: 15% of respondents

    • US

      • Monthly: 16% of respondents

      • Quarterly: 15% of respondents

      • Annually: 20% of respondents

  • Number of FOIA requests received in the US for fiscal year 2017: 818,271

  • Number of FOIA requests processed in the US for fiscal year 2017: 823,222

  • Distribution of FOIA requests in 2017 among the top 5 agencies with the highest number of requests:

    • DHS: 45%

    • DOJ: 10%

    • NARA: 7%

    • DOD: 7%

    • HHS: 4%

Examining Datasets

  • Country with highest index score according to ODB Leaders Edition: Canada (76 out of 100)

  • Country with lowest index score according to ODB Leaders Edition: Sierra Leone (22 out of 100)

  • Share of datasets that are open in the top 30 governments according to ODB Leaders Edition: Fewer than 1 in 5

  • Average percentage of datasets that are open in the top 30 open data governments according to ODB Leaders Edition: 19%

  • Average percentage of datasets that are open in the top 30 open data governments according to ODB Leaders Edition by sector/subject:

    • Budget: 30%

    • Companies: 13%

    • Contracts: 27%

    • Crime: 17%

    • Education: 13%

    • Elections: 17%

    • Environment: 20%

    • Health: 17%

    • Land: 7%

    • Legislation: 13%

    • Maps: 20%

    • Spending: 13%

    • Statistics: 27%

    • Trade: 23%

    • Transport: 30%

  • Percentage of countries that release data on government spending according to ODB Leaders Edition: 13%

  • Percentage of government data that is updated at regular intervals according to ODB Leaders Edition: 74%

  • Percentage of datasets classed as “open” in the 94 places worldwide analyzed by the Open Data Index: 11%

  • Percentage of open datasets in the Caribbean, according to Open Data Census: 7%

  • Number of companies whose data is available through OpenCorporates: 158,589,950

City Open Data

  • New York City

  • Singapore

    • Number of datasets published in Singapore: 1,480

    • Percentage of datasets with standardized format: 35%

    • Percentage of datasets made as raw as possible: 25%

  • Barcelona

    • Number of datasets published in Barcelona: 443

    • Open data demand in Barcelona measured by:

      • Number of unique sessions in the month of September 2018: 5,401

    • Quality of datasets published in Barcelona according to Tim Berners-Lee’s 5-star Open Data scheme: 3 stars

  • London

    • Number of datasets published in London: 762

    • Number of data requests since October 2014: 325

  • Bandung

    • Number of datasets published in Bandung: 1,417

  • Buenos Aires

    • Number of datasets published in Buenos Aires: 216

  • Dubai

    • Number of datasets published in Dubai: 267

  • Melbourne

    • Number of datasets published in Melbourne: 199

Sources

  • About OGP, Open Government Partnership. 2018.  

The GovLab Selected Readings on Blockchain Technologies and the Governance of Extractives


Curation by Andrew Young, Anders Pedersen, and Stefaan G. Verhulst

Readings developed together with NRGI, within the context of our joint project on Blockchain technologies and the Governance of Extractives. Thanks to Joyce Zhang and Michelle Winowatan for research support.

We need your help! Please share any additional readings on the use of Blockchain Technologies in the Extractives Sector with blockchange@thegovlab.org.  

Introduction

By providing new ways to securely identify individuals and organizations, and to record transactions of various types in a distributed manner, blockchain technologies have been heralded as a new tool to address information asymmetries, establish trust and improve governance – particularly around the extraction of oil, gas and other natural resources. At the same time, blockchain technologies are being experimented with to optimize certain parts of the extractives value chain – potentially decreasing transparency and accountability while making governance harder to implement.

Across the expansive and complex extractives sector, blockchain technologies are believed to have particular potential for improving governance in three key areas:  

  • Beneficial ownership and illicit flows screening: The identity of those who benefit, through ownership, from companies that extract natural resources is often hidden – potentially contributing to tax evasion, challenges to global sanction regimes, corruption and money laundering.
  • Land registration, licensing and contracting transparency: To ensure companies extract resources responsibly and comply with rules and fee requirements, effective governance is essential, including a process to determine who has the right to extract natural resources, under what conditions, and who is entitled to the land.
  • Commodity trading and supply chain transparency: The commodity trading sector faces substantive challenges in assessing and verifying the authenticity of, for example, oil trades. Commodity traders spend costly time reviewing documentation of often poor quality. The sector expects, firstly, to eliminate time spent verifying the authenticity of traded goods and, secondly, to reduce the risk premium on trades. Transactions from resources and commodities trades are often opaque and secretive, allowing governments and companies to conceal how much money they receive from trading, and leading to corruption and tax evasion.

Below we provide a selection of the nascent but growing literature on Blockchain Technologies and Extractives across six categories:

Selected Readings 

Blockchain Technologies and Extractives – Promise and Current Potential

Adams, Richard, Beth Kewell, Glenn Parry. “Blockchain for Good? Digital Ledger Technology and Sustainable Development Goals.” Handbook of Sustainability and Social Science Research. October 27, 2017.

  • This chapter in the Handbook of Sustainability and Social Science Research reflects on and explores the different ways Blockchain for Good (B4G) projects can provide social and environmental benefits under the UN’s Sustainable Development Goals framework.
  • The authors describe the main categories in which blockchain can achieve social impact: mining/consensus algorithms that reward good behavior, benefits linked to currency use in the form of “colored coins,” innovations in supply chain, innovations in government, enabling the sharing economy, and fostering financial inclusion.
  • The chapter concludes that with B4G there is also inevitably “Blockchain for Bad.” There have already been critiques and failures of DLTs such as the DAO, and more research must be done to identify whether DLTs can deliver a more decentralized, egalitarian society, or whether they will ultimately be another tool for control and surveillance by organizations and governments.

Cullinane, Bernadette, and Randy Wilson. “Transforming the Oil and Gas Industry through Blockchain.” Official Journal of the Australian Institute of Energy News, p 9-10, December 2017.

  • In this article, Cullinane and Wilson explore how blockchain’s application in the oil and gas industry “presents a particularly compelling opportunity…due to the high transactional values, associated risks and relentless pressure to reduce costs.”
  • The authors elaborate four areas where blockchain can play a role in transforming the oil and gas industry:
    • Supply chain management
    • Smart contracts
    • Record management
    • Cross-border payments

Da Silva, Filipe M., and Ankita Jaitly. “Blockchain in Natural Resources: Hedging Against Volatile Prices.” Tata Consultancy Services Ltd., 2018.

  • The authors of this white paper assess the readiness of natural resources industries for blockchain technology application, identify areas where blockchain can add value, and outline a strategic plan for its adoption.
  • In particular, they highlight the potential for blockchain in the oil and gas industry to simplify payments, where, for example, gas can be delivered directly to consumer homes using a blockchain smart-contracting application.

Halford-Thompson, Guy. “Powered by Blockchain: Reinventing Information Management in the Energy Space.” BTL, May 12, 2017.

  • According to Halford-Thompson, “oil and gas companies are exploring blockchain’s promise to revamp inefficient internal processes and achieve significant reductions in operating costs through the automation of record keeping and messaging, the digitization of the supply chain information flow, and the elimination of reconciliation, among many other data management use cases.”
  • The data reconciliation process, for one, is complex and can require significant time for completion. Blockchain technology could not only remove the need for some steps in the information reconciliation process, but also eliminate the need for reconciliation altogether in some instances.

Blockchain Technologies and the Governance of Extractives

(See also: Selected Readings of Blockchain Technologies and its Potential to Transform Governance)

Koeppen, Mark, David Shrier, and Morgan Bazilian. “Is Blockchain’s Future in Oil and Gas Transformative Or Transient?” Deloitte, 2017.

  • In this report, the authors propose four areas that blockchain can improve for the oil and gas industry, which are:
    • Transparency and compliance: Employment of blockchain is predicted to significantly reduce cost related to compliance, since it securely makes information available to all parties involved in the supply chain.
    • Cyber threats and security: The industry faces constant digital security threat and blockchain provides a solution to address this issue.
    • Mid-volume trading/third party impacts: They argue that the “boundaries between asset classes will blur as cash, energy products and other commodities, from industrial components to apples could all become digital assets trading interoperably.”
    • Smart contract: Since the “sheer size and volume of contracts and transactions to execute capital projects in oil and gas have historically caused significant reconciliation and tracking issues among contractors, sub-contractors, and suppliers,” blockchain-enabled smart contracts could improve the process by executing automatically after all requirements are met, and boosting contract efficiency and protecting each party from volatile pricing.
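The smart-contract pattern described in the last bullet – execution triggered automatically once every requirement is met – can be sketched in plain Python. This is an illustrative toy, not any real smart-contract platform; all party names, conditions, and amounts are hypothetical:

```python
# Toy "smart contract" sketch: payment is released automatically once
# all delivery conditions are confirmed, protecting both parties from
# manual reconciliation delays. Purely illustrative.

class ToyEscrowContract:
    def __init__(self, payer, payee, amount, required_conditions):
        self.payer = payer
        self.payee = payee
        self.amount = amount
        # Every condition starts unmet; each must be confirmed before payout.
        self.conditions = {c: False for c in required_conditions}
        self.executed = False

    def confirm(self, condition):
        if condition not in self.conditions:
            raise KeyError(f"unknown condition: {condition}")
        self.conditions[condition] = True
        return self._try_execute()

    def _try_execute(self):
        # Execute automatically as soon as all requirements are satisfied.
        if not self.executed and all(self.conditions.values()):
            self.executed = True
            return f"release {self.amount} from {self.payer} to {self.payee}"
        return None


contract = ToyEscrowContract("operator", "subcontractor", 50000,
                             ["delivery_confirmed", "inspection_passed"])
assert contract.confirm("delivery_confirmed") is None  # not yet complete
print(contract.confirm("inspection_passed"))
# -> release 50000 from operator to subcontractor
```

On a real platform the conditions would be attested on-chain (e.g. by oracles) rather than by direct method calls, but the control flow – no human step between “requirements met” and “payment released” – is the same.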

Mawet, Pierre, and Michael Insogna. “Unlocking the Potential of Blockchain in Oil and Gas Supply Chains.” Accenture Energy Blog, November 21, 2016.

  • The authors propose three ways blockchain technology can boost productivity and efficiency in oil and gas industry:
    • “Greater process efficiency. Smart contracts, for example, can be held in a blockchain transaction with party compliance confirmed through follow-on transactions, reducing third-party supervision and paper-based contracting, thus helping reduce cost and overhead.”
    • “Compliance. Visibility is essential to improve supply chain performance. The immutable record of transactions can aid in product traceability and asset tracking.”
    • “Data transfer from IoT sensors. Blockchain could be used to track the unique history of a device, with the distributed ledger recording data transfer from multiple sensors. Data security in devices could be safeguarded by unique blockchain characteristics.”

Som, Indranil. “Blockchain: Radically Changing the Mining Paradigm.” Digitalist, September 27, 2017.

  • In this article, Som proposes three ways that the blockchain technology can “support leaner organizations and increased security” in the mining industry: improving cybersecurity, increasing transparency through smart contracts, and providing visibility into the supply chain.

Identity: Beneficial Ownership and Illicit Flows

(See also: Selected Readings on Blockchain Technologies and Identity).

de Jong, Julia, Alexander Meyer, and Jeffrey Owens. “Using blockchain for transparent beneficial ownership registers.” International Tax Review, June 2017.

  • This paper discusses the features of blockchain and distributed ledger technology that can improve collection and distribution of information on beneficial ownership.
  • The FATF and OECD Global Forum regimes have identified a number of common problems related to beneficial ownership information across all jurisdictions, including:
    • “Insufficient accuracy and accessibility of company identification and ownership information;
    • Less rigorous implementation of customer due-diligence (CDD) measures by key gatekeepers such as lawyers, accountants, and trust and company service providers; and
    • Obstacles to information sharing such as data protection and privacy laws, which impede competent authorities from receiving timely access to adequate, accurate and up-to-date information on basic legal and beneficial ownership.”
  • The authors argue that the transparency, immutability, and security offered by blockchain makes it ideally suited for record-keeping, particularly with regards to the ownership of assets. Thus, blockchain can address many of the shortcomings in the current system as identified by the FATF and the OECD.
  • They go on to suggest that a global registry of beneficial ownership using blockchain technology would offer the following benefits:
    • Ensuring real-time accuracy and verification of ownership information
    • Increasing security and control over sensitive personal and commercial information
    • Enhancing audit transparency
    • Creating the potential for globally-linked registries
    • Reducing corruption and fraud, and increasing trust
    • Reducing compliance burden for regulated entities
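The record-keeping properties the authors rely on – immutability and verifiability – come from hash-chaining entries so that any retroactive edit is detectable. A minimal Python sketch of a hypothetical append-only beneficial-ownership register (company names and owners are invented for illustration):

```python
import hashlib
import json

# Minimal hash-chained register sketch: each entry commits to the
# previous entry's hash, so tampering with any past record breaks
# verification of the whole chain. Hypothetical data model.

def entry_hash(payload):
    blob = json.dumps(payload, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()

def append_record(chain, record):
    prev = chain[-1]["hash"] if chain else "0" * 64
    entry = {"record": record, "prev": prev}
    entry["hash"] = entry_hash({"record": record, "prev": prev})
    chain.append(entry)

def verify(chain):
    prev = "0" * 64
    for e in chain:
        recomputed = entry_hash({"record": e["record"], "prev": e["prev"]})
        if e["prev"] != prev or e["hash"] != recomputed:
            return False
        prev = e["hash"]
    return True


registry = []
append_record(registry, {"company": "ExampleCo", "owner": "A. Person", "share": 0.6})
append_record(registry, {"company": "ExampleCo", "owner": "B. Person", "share": 0.4})
assert verify(registry)

registry[0]["record"]["owner"] = "Hidden Owner"  # attempted retroactive edit
assert not verify(registry)                      # tampering is detected
```

A distributed ledger adds replication and consensus on top of this structure; the sketch shows only why a hash chain makes ownership history auditable.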

Herian, Robert. “Trusteeship in a Post-Trust World: Property, Trusts Law and the Blockchain.” The Open University, 2016.

  • This working paper discusses the often overlooked topic of trusteeship and trusts law and the implications of blockchain technology in the space. 
  • “Smart trusts” on the blockchain will distribute trusteeship across a network and, in theory, remove the need for continuous human intervention in trust fund investments thus resolving key issues around accountability and the potential for any breach of trust.
  • Smart trusts can also increase efficiency and security of transactions, which could improve the overall performance of the investment strategy, thereby creating higher returns for beneficiaries.

Karsten, Jack, and Darrell M. West. “Venezuela’s ‘petro’ undermines other cryptocurrencies – and international sanctions.” Brookings, March 9, 2018.

  • This article discusses the Venezuelan government’s cryptocurrency, the “petro,” which was launched as a solution to the country’s economic crisis and its near-worthless currency, the “bolívar.”
  • Unlike the volatility of other cryptocurrencies such as Bitcoin and Litecoin, one petro’s price is pegged to the price of one barrel of Venezuelan oil – roughly $60
  • And rather than decentralizing control like most blockchain applications, the petro is subject to arbitrary discount factor adjustment, fluctuating oil prices, and a corrupt government known for manipulating its currency
  • The authors warn the petro will not stabilize the Venezuelan economy since only foreign investors funded the presale, yet (from the White Paper) only Venezuelan citizens can use the cryptocurrency to pay taxes, fees, and other expenses. Rather, they argue, the petro represents an attempt to create foreign capital out of “thin air,” which is not subject to traditional economic sanctions.  

Land Registration, Licensing and Contracting Transparency

Graglia, Michael, and Christopher Mellon. “Blockchain and Property in 2018: At the End of the Beginning.” 2018 World Bank Conference on Land and Poverty, March 19-23, 2018.

  • This paper claims “blockchain makes sense for real estate” because real estate transactions depend on a number of relationships, processes, and intermediaries that must reconcile all transactions and documents for an action to occur. Blockchain and smart contracts can reduce the time and cost of transactions while ensuring secure and transparent record-keeping systems.
  • The ease, efficiency, and security of transactions can also create an “international market for small real estate” in which individuals who cannot afford an entire plot of land can invest small amounts and receive their portion of rental payments automatically through smart contracts.
  • The authors describe seven prerequisites that land registries must fulfill before blockchain can be introduced successfully: accurate data, digitized records, an identity solution, multi-sig wallets, a private or hybrid blockchain, connectivity and a tech-aware population, and a trained professional community.
  • To achieve the goal of an efficient and secure property registry, the authors propose an 8-level progressive framework through which registries slowly integrate blockchain due to legal complexity of land administration, resulting inertia of existing processes, and high implementation costs.  
    • Level 0 – No Integration
    • Level 1 – Blockchain Recording
    • Level 2 – Smart Workflow
    • Level 3 – Smart Escrow
    • Level 4 – Blockchain Registry
    • Level 5 – Disaggregated Rights
    • Level 6 – Fractional Rights
    • Level 7 – Peer-to-Peer Transactions
    • Level 8 – Interoperability

Thomas, Rod. “Blockchain’s Incompatibility for Use as a Land Registry: Issues of Definition, Feasibility and Risk.” European Property Law Journal, vol. 6, no. 3, May 2017.

  • Thomas argues that blockchain, as it is currently understood and defined, is unsuited for the transfer of real property rights because it fails to address the need for independent verification and control.
  • Under a blockchain-based system, coin holders would be in complete control of the recordation of the title interests of their land, and thus, it would be unlikely that they would report competing or contested claims.
  • Since land remains in the public domain, the risk of third party possessory title claims are likely to occur; and over time, these risks will only increase exponentially.
  • A blockchain-based land title would represent interlinking and sequential transactions over many hundreds, if not thousands, of years; given the misinformation that would compound over time, it would be difficult to trust that the current title holder has a correctly recorded title.
  • The author concludes that supporters of blockchain for land registries frequently overlook a registry’s primary function to provide an independent verification of the provenance of stored data.

Vos, Jacob, Christiaan Lemmen, and Bert Beentjes. “Blockchain-Based Land Registry: Panacea, Illusion or Something In Between?” 2017 World Bank Conference on Land and Poverty, March 20-24, 2017.

  • The authors propose that blockchain is best suited for the following steps in land administration:
    • The issuance of titles
    • The archiving of transactions – specifically in countries that do not have a reliable electronic system of transfer of ownership
  • The step in between issuing titles and archiving transactions is the most complex – the registration of the transaction. This step includes complex relationships between the “triple” of land administration: rights (right in rem and/or personal rights), object (spatial unit), and subject (title holder). For the most part, this step is done manually by registrars, and it is questionable whether blockchain technology, in the form of smart contracts, will be able to process these complex transactions.
  • The authors conclude that one should not underestimate the complexity of the legal system related to land administration. The standardization of processes may be the key to the success of blockchain-based land administration. They suggest that instead of seeking to eliminate one party from the process, technologists should cooperate with legal and geodetic professionals to create a system of checks and balances to successfully implement blockchain for land administration.
  • This paper also outlines five blockchain-based land administration projects launched in Ghana, Honduras, Sweden, Georgia, and Cook County, Illinois.

Commodity Trading and Supply Chain Transparency

Ahmed, Shabir. “Leveraging Blockchain to Revolutionise the Mining Industry.” SAP News, February 27, 2018.

  • In this article, Ahmed identifies seven key use cases for blockchain in the mining industry:
    • Automation of ore acquisition and transfer;
    • Automatic registration of mineral rights and IP;
    • Visibility of ore inventory at ports;
    • Automatic cargo hire process;
    • Process and secure large amounts of IoT data;
    • Reconciling amount produced and sent for processing;
    • Automatically execute procurement and other contracts.

Brooks, Michael. “Blockchain and the Fight Against Illicit Financial Flows.” The Policy Corner, February 19, 2018.

  • In this article, Brooks argues that, “Because of the inherent decentralization and immutability of data within blockchains, it offers a unique opportunity to bypass traditional tracking and transparency initiatives that require strong central governance and low levels of corruption. It could, to a significant extent, bypass the persistent issues of authority and corruption by democratizing information around data consensus, rather than official channels and occasional studies based off limited and often manipulated information. Within the framework of a coherent policy initiative that integrates all relevant stakeholders (states, transnational organizations, businesses, NGOs, other monitors and oversight bodies), international supply chains supported by blockchain would decrease the ease with which resources can be hidden, numbers altered, and trade misinvoiced.”

“Conflict Free Natural Resources.” Global Opportunity Report 2017. Global Opportunity Network, 2017.

  • In this entry from the Global Opportunity Report, aimed at ensuring conflict-free natural resources, Blockchain is labeled as “well-suited for tracking objects and transactions, making it possible for virtually anything of value to be traced. This opportunity is about creating transparency and product traceability in supply chains.”

“Blockchain for Traceability in Minerals and Metals Supply Chains: Opportunities and Challenges.” RCS Global and ICMM, 2017.

  • This report is based on insights generated during the Materials Stewardship Round Table on the potential of BCTs for tracking and tracing metals and minerals supply chains, which subsequently informed an RCS Global research initiative on the topic.
  • Insight into two key areas is increasingly desired by downstream manufacturing companies from upstream producers of metals and minerals: provenance and production methods.
  • In particular, the report offers five key potential advantages of using Blockchain for mineral and metal supply chain activities:
    • “Builds consensus and trust around responsible production standards between downstream and upstream companies.
    • The immutability of and decentralized control over a blockchain system minimizes the risk of fraud.
    • Defined datasets can be made accessible in real time to any third party, including downstream buyers, auditors, investors, etc. but at the same time encrypted so as to share a proof of fact rather than confidential information.
    • A blockchain system can be easily scaled to include other producers and supply chains beyond those initially involved.
    • Cost reduction due to the paperless nature of a blockchain-enabled CoC [Chain of Custody] system, the potential reduction of audits, and reduction in transaction costs.”
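The report’s point about sharing “a proof of fact rather than confidential information” can be illustrated with a salted hash commitment – a simplified stand-in for the encrypted attestations it describes. The scheme and all data below are hypothetical sketches, not drawn from the report:

```python
import hashlib
import os

# Salted hash commitment sketch: a producer publishes only the digest
# (e.g. on-chain); the underlying certificate stays confidential until
# the producer chooses to disclose (fact, salt) to a buyer or auditor.

def commit(fact: bytes):
    salt = os.urandom(16)  # random salt prevents guessing low-entropy facts
    digest = hashlib.sha256(salt + fact).hexdigest()
    return digest, salt    # digest is public; salt is kept private

def verify(digest: str, salt: bytes, claimed_fact: bytes) -> bool:
    # Anyone holding the public digest can check a disclosed fact.
    return hashlib.sha256(salt + claimed_fact).hexdigest() == digest


fact = b"lot 4217: cobalt certified conflict-free by auditor X"
digest, salt = commit(fact)

# Later, selective disclosure to a chosen third party:
assert verify(digest, salt, fact)                       # fact checks out
assert not verify(digest, salt, b"lot 4217: uncertified")  # forgery fails
```

This captures the asymmetry the report highlights: the chain proves a fact was committed to at a point in time without itself revealing confidential commercial information.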

Van Bockstael, Steve. “The emergence of conflict-free, ethical, and Fair Trade mineral supply chain certification systems: A brief introduction.” The Extractives Industries and Society, vol. 5, issue 1, January 2018.

  • This introduction to a special section considers the emerging field of “‘conflict-free’, ‘fair’ and ‘transparently sourced and traded’ minerals” in global industry supply chains.
  • Van Bockstael describes three areas of practice aimed at increasing supply chain transparency:
    • “Initiatives that explicitly try to sever the links between mining or minerals trading and armed conflict or the funding thereof.”
    • “Initiatives, limited in number yet growing, that are explicitly linked to the internationally recognized ‘Fair Trade’ movement and whose aim it is to source artisanally mined minerals for the Western jewellery industry.”
    • “Initiatives that aim to provide consumers or consumer-facing industries with more ethical, transparent and fair supply chains (often using those concepts in fuzzy and interchangeable ways) that are not linked to the established Fair Trade movement” – including, among others, initiatives using Blockchain technology “to create tamper-proof supply chains.”

Global Governance, Standards and Disclosure Practices

Lafarre, Anne and Christoph Van der Elst. “Blockchain Technology for Corporate Governance and Shareholder Activism.” European Corporate Governance Institute (ECGI) – Law Working Paper No. 390/2018, March 8, 2018.

  • This working paper focuses on the potential benefits of leveraging Blockchain during functions involving shareholder and company decision making. Lafarre and Van der Elst argue that “Blockchain technology can lower shareholder voting costs and the organization costs for companies substantially. Moreover, blockchain technology can increase the speed of decision-making, facilitate fast and efficient involvement of shareholders.”
  • The authors argue that in the field of corporate governance, Blockchain offers two important elements: “transparency – via the verifiable way of recording transactions – and trust – via the immutability of these transactions.”
  • Smart contracting, in particular, is seen as a potential avenue for facilitating the ‘agency relationship’ between board members and the shareholders they represent in corporate decision-making processes.

Myung, San Jun. “Blockchain government – a next form of infrastructure for the twenty-first century.” Journal of Open Innovation: Technology, Market, and Complexity, December 2018.

  • This paper advances the idea that Blockchain represents a new form of infrastructure that, given its core consensus mechanism, could replace existing social apparatuses, including bureaucracy.
  • Indeed, Myung argues that blockchain and bureaucracy share a number of attributes:
    • “First, both of them are defined by the rules and execute predetermined rules.
    • Second, both of them work as information processing machines for society.
    • Third, both of them work as trust machines for society.”  
  • The piece concludes with five principles for replacing bureaucracy with blockchain for social organization: “1) introducing Blockchain Statute law; 2) transparent disclosure of data and source code; 3) implementing autonomous executing administration; 4) building a governance system based on direct democracy; and 5) making Distributed Autonomous Government (DAG).”

Peters, Gareth, and Guy Vishnia. “Blockchain Architectures for Electronic Exchange Reporting Requirements: EMIR, Dodd Frank, MiFID I/II, MiFIR, REMIT, Reg NMS and T2S.” University College London, August 31, 2016.

  • This paper offers a solution based on blockchain architectures to the regulations of financial exchanges around the world for trade processing and reporting for execution and clearing. In particular, the authors give a detailed overview of EMIR, Dodd Frank, MiFID I/II, MiFIR, REMIT, Reg NMS and T2S.
  • The authors suggest that the growing volume of transaction-reporting data be incorporated on a blockchain ledger in order to harness the blockchain’s built-in security and immutability features to support key regulatory requirements.
  • Specifically, the authors suggest 1) a permissioned blockchain controlled by a regulator or a consortium of market participants for the maintenance of identity data from market participants and 2) blockchain frameworks such as Enigma to be used to facilitate required transparency and reporting aspects related to identities when performing pre- and post-trade reporting as well as for auditing.

“Blockchain Technology and Competition Policy – Issues Paper by the Secretariat,” OECD, June 8, 2018.

  • This OECD issues paper poses two key questions about how blockchain technology might increase the relevance of new disclosure practices:
    • “Should competition agencies be given permission to access blockchains? This might enable them to monitor trading prices in real-time, spot suspicious trends, and, when investigating a merger, conduct or market, have immediate access to the necessary data without needing to impose burdensome information requests on parties.”
    • “Similarly, easy access to the information on a blockchain for a firm’s owners and head offices would potentially improve the effectiveness of its oversight on its own subsidiaries and foreign holdings. Competition agencies may assume such oversight already exists, but by making it easier and cheaper, a blockchain might make it more effective, which might allow for more effective centralised compliance programmes.”

Pisa, Michael and Juden, Matt. “Blockchain and Economic Development: Hype vs. Reality.” Center for Global Development Policy Paper, 2017.

  • In this Center for Global Development Policy Paper, the authors examine blockchain’s potential to address four major development challenges: (1) facilitating faster and cheaper international payments, (2) providing a secure digital infrastructure for verifying identity, (3) securing property rights, and (4) making aid disbursement more secure and transparent.
  • The authors conclude that while blockchain may be well suited for certain use cases, the majority of constraints in blockchain-based projects fall outside the scope of technology. Common constraints such as data collection and privacy, governance, and operational resiliency must be addressed before blockchain can be successfully implemented as a solution.

Industry-Specific Case Studies

Chohan, Usman. “Blockchain and the Extractive Industries: Cobalt Case Study,” University of New South Wales, Canberra Discussion Paper Series: Notes on the 21st Century, 2018.

  • In this discussion paper, the author studies the pilot use of blockchain in the cobalt mining industry in the Democratic Republic of Congo (DRC). The project tracked the movement of cobalt from artisanal mines through to its installation in devices such as smartphones and electric cars.
  • The project recorded cobalt attributes – weights, dates, times, images, etc. – in a digital ledger to help ensure that cobalt purchases do not contribute to forced child labor or trade in conflict minerals.

Chohan, Usman. “Blockchain and the Extractive Industries #2: Diamonds Case Study,” University of New South Wales, Canberra Discussion Paper Series: Notes on the 21st Century, 2018.

  • The second case study from Chohan investigates the application of blockchain technology in the extractive industries by studying the blockchain projects of Anglo American’s (AAL) diamond unit De Beers and of Everledger.
  • In this study, the author finds that AAL uses blockchain to track gems (carat, color, certificate numbers) from extraction onwards, including when the gems change hands in trade transactions.
  • Like the cobalt pilot, the AAL initiative aims to help avoid supporting conflicts and forced labor, and to improve trading accountability and transparency more generally.

Selected Readings on Data Responsibility, Refugees and Migration


By Kezia Paladina, Alexandra Shaw, Michelle Winowatan, Stefaan Verhulst, and Andrew Young

The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of Data Collaboration for Migration was originally published in 2018.

Special thanks to Paul Currion, whose data responsibility literature review gave us a head start in developing the list below. (His article on refugee identity is included below.)

The collection below is also meant to complement our article in the Stanford Social Innovation Review on Data Collaboration for Migration where we emphasize the need for a Data Responsibility Framework moving forward.

From climate change to politics to finance, there is growing recognition that some of the most intractable problems of our era are information problems. In recent years, the ongoing refugee crisis has increased the call for new data-driven approaches to address the many challenges and opportunities arising from migration. While data – including data from the private sector – holds significant potential value for informing analysis and targeted international and humanitarian response to (forced) migration, decision-makers often lack an actionable understanding of if, when and how data could be collected, processed, stored, analyzed, used, and shared in a responsible manner.

Data responsibility – including the responsibility to protect data and shield its subjects from harm, and the responsibility to leverage and share data when it can provide public value – is an emerging field that seeks to go beyond privacy concerns alone. The forced migration arena raises a number of particularly important issues for responsible data approaches, including the risks of leveraging data on individuals fleeing a hostile or repressive government.

In this edition of the GovLab’s Selected Readings series, we examine the emerging literature on data responsibility approaches in the refugee and forced migration space – part of an ongoing series focused on Data Responsibility. The reading list below features annotated readings related to the Policy and Practice of data responsibility for refugees, and the specific responsibility challenges regarding Identity and Biometrics.

Data Responsibility and Refugees – Policy and Practice

International Organization for Migration (IOM) (2010) IOM Data Protection Manual. Geneva: IOM.

  • This IOM manual includes 13 data protection principles related to the following activities: lawful and fair collection, specified and legitimate purpose, data quality, consent, transfer to third parties, confidentiality, access and transparency, data security, retention of personal data, application of the principles, ownership of personal data, oversight, compliance and internal remedies (and exceptions).
  • For each principle, the IOM manual features targeted data protection guidelines, and templates and checklists are included to help foster practical application.

Norwegian Refugee Council (NRC) Internal Displacement Monitoring Centre / OCHA (eds.) (2008) Guidance on Profiling Internally Displaced Persons. Geneva: Inter-Agency Standing Committee.

  • This NRC document contains guidelines on gathering better data on Internally Displaced Persons (IDPs), adapted to country context.
  • An IDP profile is defined to include the number of displaced persons, their location, the causes and patterns of displacement, and humanitarian needs, among other elements.
  • It further states that collecting IDP data is challenging, and that the current state of IDP data hampers assistance programs.
  • Chapter I of the document explores the rationale for IDP profiling. Chapter II describes the “who” aspect of profiling: who IDPs are and the common pitfalls in distinguishing them from other population groups. Chapter III describes the methodologies that can be used in different contexts, suggests some of the advantages and disadvantages of each, and discusses what kind of information is needed and when it is appropriate to profile.

United Nations High Commissioner for Refugees (UNHCR). Model agreement on the sharing of personal data with Governments in the context of hand-over of the refugee status determination process. Geneva: UNHCR.

  • This document from UNHCR provides a template of agreement guiding the sharing of data between a national government and UNHCR. The model agreement’s guidance is aimed at protecting the privacy and confidentiality of individual data while promoting improvements to service delivery for refugees.

United Nations High Commissioner for Refugees (UNHCR) (2015). Policy on the Protection of Personal Data of Persons of Concern to UNHCR. Geneva: UNHCR.

  • This policy outlines the rules and principles governing the processing of personal data of persons of concern to UNHCR, with the aim of ensuring that the practice is consistent with the UN General Assembly’s regulation of computerized personal data files, established to protect individuals’ data and privacy.
  • UNHCR requires its personnel to apply the following principles when processing personal data: (i) legitimate and fair processing; (ii) purpose specification; (iii) necessity and proportionality; (iv) accuracy; (v) respect for the rights of the data subject; (vi) confidentiality; (vii) security; (viii) accountability and supervision.

United Nations High Commissioner for Refugees (UNHCR) (2015) Privacy Impact Assessment of UNHCR Cash Based Interventions.

  • This impact assessment focuses on privacy issues related to financial assistance for refugees in the form of cash transfers. For international organizations like UNHCR to determine eligibility for cash assistance, data “aggregation, profiling, and social sorting techniques” are often needed, leading to a need for a responsible data approach.
  • This Privacy Impact Assessment (PIA) aims to identify the privacy risks posed by the program and to enhance the safeguards that can mitigate those risks.
  • Key issues raised in the PIA involve the challenge of ensuring that individuals’ data will not be used for purposes other than those initially specified.

Data Responsibility in Identity and Biometrics

Bohlin, A. (2008) “Protection at the Cost of Privacy? A Study of the Biometric Registration of Refugees.” Lund: Faculty of Law of the University of Lund.

  • This 2008 study focuses on the systematic biometric registration of refugees conducted by UNHCR in refugee camps around the world, asking whether enhanced registration mechanisms contribute to refugees’ protection and the guarantee of their human rights, or instead expose them to invasions of privacy.
  • Bohlin found that, at the time, UNHCR had failed to put proper safeguards in place for data dissemination, exposing refugees’ data to the risk of misuse. She goes on to suggest data protection regulations that could be put in place to protect refugees’ privacy.

Currion, Paul. (2018) “The Refugee Identity.” Medium.

  • Developed as part of a DFID-funded initiative, this essay considers Data Requirements for Service Delivery within Refugee Camps, with a particular focus on refugee identity.
  • Among other findings, Currion finds that since “the digitisation of aid has already begun…aid agencies must therefore pay more attention to the way in which identity systems affect the lives and livelihoods of the forcibly displaced, both positively and negatively.”
  • Currion argues that a Responsible Data approach, as opposed to a process defined by a Data Minimization principle, provides “useful guidelines,” but notes that data responsibility “still needs to be translated into organisational policy, then into institutional processes, and finally into operational practice.”

Farraj, A. (2010) “Refugees and the Biometric Future: The Impact of Biometrics on Refugees and Asylum Seekers.” Colum. Hum. Rts. L. Rev. 42 (2010): 891.

  • This article argues that biometrics help refugees and asylum seekers establish their identity, which is important for ensuring the protection of their rights and service delivery.
  • However, Farraj also describes several risks related to biometrics, such as misidentification and misuse of data, pointing to the need for proper approaches to the collection, storage, and use of biometric information by governments, international organizations, and other parties.

GSMA (2017) Landscape Report: Mobile Money, Humanitarian Cash Transfers and Displaced Populations. London: GSMA.

  • This paper from GSMA seeks to evaluate how mobile technology can be helpful in refugee registration, cross-organizational data sharing, and service delivery processes.
  • One of its assessments is that the use of mobile money in a humanitarian context depends on a supportive regulatory environment to unlock its true potential. Examples include extending SIM dormancy periods to accommodate infrequent cash disbursements and ensuring that persons without identification are able to use mobile money services.
  • Additionally, GSMA argues that mobile money will be most successful where there is an ecosystem to support other financial services such as remittances, airtime top-ups, savings, and bill payments. These services will be especially helpful in including displaced populations in development.

GSMA (2017) Refugees and Identity: Considerations for mobile-enabled registration and aid delivery. London: GSMA.

  • This paper emphasizes the importance of registration in the context of a humanitarian emergency, because being registered – and having a document proving that registration – is key to acquiring services and assistance.
  • Studying the cases of Kenya and Iraq, the report concludes with three recommendations to improve mobile data collection and registration processes: 1) establish more flexible KYC requirements for mobile money for cases where refugees are unable to meet existing requirements; 2) encourage interoperability and data sharing to avoid fragmented and duplicative registration management; and 3) build partnerships and collaboration among governments, humanitarian organizations, and multinational corporations.

Jacobsen, Katja Lindskov (2015) “Experimentation in Humanitarian Locations: UNHCR and Biometric Registration of Afghan Refugees.” Security Dialogue, Vol 46 No. 2: 144–164.

  • In this article, Jacobsen studies the biometric registration of Afghan refugees, and considers how “humanitarian refugee biometrics produces digital refugees at risk of exposure to new forms of intrusion and insecurity.”

Jacobsen, Katja Lindskov (2017) “On Humanitarian Refugee Biometrics and New Forms of Intervention.” Journal of Intervention and Statebuilding, 1–23.

  • This article traces the evolution of the use of biometrics at the Office of the United Nations High Commissioner for Refugees (UNHCR) – moving from a few early pilot projects (in the early-to-mid-2000s) to the emergence of a policy in which biometric registration is considered a ‘strategic decision’.

Manby, Bronwen (2016) “Identification in the Context of Forced Displacement.” Washington DC: World Bank Group. Accessed August 21, 2017.

  • In this paper, Manby describes the consequences of lacking an identity in a situation of forced displacement. It prevents displaced populations from accessing various services and creates a higher risk of exploitation. It also lowers the effectiveness of humanitarian action, as lacking identity prevents humanitarian organizations from delivering services to displaced populations.
  • Lack of identity can be both a consequence and a cause of forced displacement. People who have no identity documents can be considered illegal and risk being deported. At the same time, the conflicts that lead to displacement can also result in the loss of ID during travel.
  • The paper identifies the different stakeholders and their interests in identity and forced displacement, and finds that the biggest challenge in providing identity to refugees is the politics of identification and nationality.
  • Manby concludes that addressing this challenge requires more effective coordination among governments, international organizations, and the private sector to develop alternative means of providing identification and services to displaced persons. She also argues that it is essential for national identification to become a universal practice among states.

McClure, D. and Menchi, B. (2015). Challenges and the State of Play of Interoperability in Cash Transfer Programming. Geneva: UNHCR/World Vision International.

  • This report reviews the elements that contribute to interoperability design for Cash Transfer Programming (CTP). The design framework offered here maps out these various features and examines both the state of the problem and the state of play through a variety of use cases.
  • The study provides insights into ways of addressing the multi-dimensionality of interoperability measures in increasingly complex ecosystems.

NRC / International Human Rights Clinic (2016). Securing Status: Syrian refugees and the documentation of legal status, identity, and family relationships in Jordan.

  • This report examines Syrian refugees’ attempts to obtain identity cards and other forms of legally recognized documentation (mainly, Ministry of Interior Service Cards, or “new MoI cards”) in Jordan through the state’s Urban Verification Exercise (“UVE”). These MoI cards are significant because they allow Syrians to live outside of refugee camps and move freely about Jordan.
  • The text reviews the acquisition process and the challenges and consequences that refugees face when unable to obtain documentation. Refugees can encounter issues ranging from lack of access to basic services to arrest, detention, forced relocation to camps, and refoulement.
  • Seventy-two Syrian refugee families in Jordan were interviewed in 2016 for this report and their experiences with obtaining MoI cards varied widely.

Office of Internal Oversight Services (2015). Audit of the operations in Jordan for the Office of the United Nations High Commissioner for Refugees. Report 2015/049. New York: UN.

  • This report documents the January 1, 2012 – March 31, 2014 audit of operations in Jordan, intended to assess the effectiveness of the UNHCR Representation in the country.
  • The main goals of the Regional Response Plan for Syrian refugees included relieving the pressure on Jordanian services and resources while still maintaining protection for refugees.
  • The audit concluded that the Representation’s performance was initially unsatisfactory, and OIOS offered several recommendations under the two key controls, which the Representation acknowledged. Those recommendations included:
    • Project management:
      • Providing training to staff involved in the financial verification and supervision of partners
      • Revising standard operating procedure on cash based interventions
      • Establishing ways to ensure that appropriate criteria for payment of all types of costs to partners’ staff are included in partnership agreements
    • Regulatory framework:
      • Preparing an annual needs-based procurement plan and establishing adequate management oversight processes
      • Creating procedures for the assessment of renovation work in progress and issuing written change orders
      • Protecting data and ensuring timely consultation with the UNHCR Division of Financial and Administrative Management

UNHCR/WFP (2015). Joint Inspection of the Biometrics Identification System for Food Distribution in Kenya. Geneva: UNHCR/WFP.

  • This report outlines the partnership between WFP and UNHCR in their effort to deploy a biometric identification checking system to support food distribution in the Dadaab and Kakuma refugee camps in Kenya.
  • The two agencies conducted a joint inspection mission in March 2015, which found the system to be an effective tool and a model for other country operations.
  • Still, the report proposes and responds to 11 recommendations to further improve the efficiency of the biometric system, including real-time evaluation of impact, automatic alerts, and documentation of best practices, among others.

Open Data for Developing Economies


By Andrew Young, Stefaan Verhulst, and Juliet McMurren
This edition of the GovLab Selected Readings was developed as part of the Open Data for Developing Economies research project (in collaboration with WebFoundation, USAID and fhi360). Special thanks to Maurice McNaughton, Francois van Schalkwyk, Fernando Perini, Michael Canares and David Opoku for their input on an early draft. Please contact Stefaan Verhulst (stefaan@thegovlab.org) for any additional input or suggestions.
Open data is increasingly seen as a tool for economic and social development. Across sectors and regions, policymakers, NGOs, researchers and practitioners are exploring the potential of open data to improve government effectiveness, create new economic opportunity, empower citizens and solve public problems in developing economies. Open data for development does not exist in a vacuum – rather it is a phenomenon that is relevant to and studied from different vantage points including Data4Development (D4D), Open Government, the United Nations’ Sustainable Development Goals (SDGs), and Open Development. The below selected readings provide a view of the current research and practice on the use of open data for development and its relationship to related interventions.
Annotated Selected Readings List (in alphabetical order)

Open Data and Open Government for Development

Benjamin, Solomon, R. Bhuvaneswari, P. Rajan, Manjunatha, “Bhoomi: ‘E-Governance’, or, An Anti-Politics Machine Necessary to Globalize Bangalore?” CASUM-m Working Paper, January 2007, http://bit.ly/2aD3vZe

  • This paper explores the digitization of land titles and its effect on governance in Bangalore. The paper takes a critical view of digitization and transparency efforts, particularly of their framing as best practices to be replicated across contexts.
  • The authors point to the potential of centralized open data and land records databases as a means for further entrenching existing power structures. They found that the digitization of land records in Bangalore “led to increased corruption, much more bribes and substantially increased time taken for land transactions,” as well allowing “very large players in the land markets to capture vast quantities of land when Bangalore experiences a boom in the land market.”
  • They argue for the need “to replace politically neutered concepts like ‘transparency’, ‘efficiency’, ‘governance’, and ‘best practice’ with conceptually more rigorous terms that reflect the uneven terrain of power and control that governance embodies.”

McGee, Rosie and Duncan Edwards, “Introduction: Opening Governance – Change, Continuity and Conceptual Ambiguity,” IDS Bulletin, January 24, 2016. http://bit.ly/2aJn1pq.  

  • This introduction to a special issue of the IDS Bulletin frames the research and practice of leveraging opening governance as part of a development agenda.
  • The piece primarily focuses on a number of “critical debates” that “have begun to lay bare how imprecise and overblown the expectations are in the transparency, accountability and openness ‘buzzfield’, and the problems this poses.”
  • A key finding on opening governance’s uptake and impact in the development space relates to political buy-in:
    • “Political will is generally a necessary but insufficient condition for governance processes and relationships to become more open, and is certainly a necessary but insufficient condition for tech-based approaches to open them up. In short, where there is a will, tech-for-T&A may be able to provide a way; where there isn’t a will, it won’t.”

Open Data and Data 4 Development

3rd International Open Data Conference (IODC), “Enabling the Data Revolution: An International Open Data Roadmap,” Conference Report, 2015, http://bit.ly/2asb2ei

  • This report, prepared by Open Data for Development, summarizes the proceedings of the third IODC in Ottawa, ON. It sets out an action plan for “harnessing open data for sustainable development”, with the following five priorities:
    1. Deliver shared principles for open data
    2. Develop and adopt good practices and open standards for data publication
    3. Build capacity to produce and use open data effectively
    4. Strengthen open data innovation networks
    5. Adopt common measurement and evaluation tools
  • The report draws on 70 impact accounts to present cross-sector evidence of “the promise and reality of open data,” and emphasizes the utility of open data in monitoring development goals, and the importance of “joined-up open data infrastructures,” ensuring wide accessibility, and grounding measurement in a clear understanding of citizen need, in order to realize the greatest benefits from open data.
  • Finally, the report sets out a draft International Open Data Charter and Action Plan for International Collaboration.

Hilbert, Martin, “Big Data for Development: A Review of Promises and Challenges,” Development Policy Review, December 13, 2015, http://bit.ly/2aoPtxL.

  • This article presents a conceptual framework based on the analysis of 180 articles on the opportunities and threats of big data for international development.
  • Open data, Hilbert argues, can be an incentive for those outside of government to leverage big data analytics: “If data from the public sector were to be openly available, around a quarter of existing data resources could be liberated for Big Data Analytics.”
  • Hilbert explores the misalignment between “the level of economic well-being and perceived transparency of a country” and the existence of an overarching open data policy. He points to low-income countries that are active in the open data effort, like Kenya, Russia and Brazil, in comparison to “other countries with traditionally high perceived transparency,” which are less active in releasing data, like Chile, Belgium and Sweden.

International Development Research Centre, World Wide Web Foundation, and Berkman Center at Harvard University, “Fostering a Critical Development Perspective on Open Government Data,” Workshop Report, 2012, http://bit.ly/2aJpyQq

  • This paper considers the need for a critical perspective on whether the expectations raised by open data programmes worldwide — as “a suitable remedy for challenges of good governance, economic growth, social inclusion, innovation, and participation” — have been met, and if so, under what circumstances.
  • Given the lack of empirical evidence on the implementation of open data initiatives to guide practice and policy formulation, particularly in developing countries, the paper discusses the implementation of a policy-oriented research agenda to ensure open data initiatives in the Global South “challenge democratic deficits, create economic value and foster inclusion.”
  • The report considers theories of the relationship between open data and impact, and the mediating factors affecting whether that impact is achieved. It takes a broad view of impact, including both demand- and supply-side economic impacts, social and environmental impact, and political impact.

Open Data for Development, “Open Data for Development: Building an Inclusive Data Revolution,” Annual Report, 2015, http://bit.ly/2aGbkz5

  • This report — the inaugural annual report for the Open Data for Development program — gives an overview of outcomes from the program for each of OD4D’s five program objectives:
    1. Setting a global open data for sustainable development agenda;
    2. Supporting governments in their open data initiatives;
    3. Scaling data solutions for sustainable development;
    4. Monitoring the availability, use and impact of open data around the world; and
    5. Building the institutional capacity and long-term sustainability of the Open Data for Development network.
  • The report identifies four barriers to impact in developing countries: the lack of capacity and leadership; the lack of evidence of what works; the lack of coordination between actors; and the lack of quality data.

Stuart, Elizabeth, Emma Samman, William Avis, Tom Berliner, “The Data Revolution: Finding the Missing Millions,” Open Data Institute Research Report, April 2015, http://bit.ly/2acnZtE.

  • This report examines the challenge of implementing successful development initiatives when many citizens are not known to their governments as they do not exist in official databases.
  • The authors argue that “good quality, relevant, accessible and timely data will allow willing governments to extend services into communities which until now have been blank spaces in planning processes, and to implement policies more efficiently.”
  • In addition to improvements to national statistical offices, the authors argue that “making better use of the data we already have” by increasing openness to certain datasets held by governments and international organizations could help to improve the situation.
  • They examine a number of open data efforts in developing countries, including Kenya and Mexico.
  • Finally, they argue that “the data revolution could play a role in changing the power dynamic between citizens, governments and the private sector, building on open data and freedom of information movements around the world. It has the potential to enable people to produce, access and understand information about their lives and to use this information to make changes.”

United Nations Independent Expert Advisory Group on a Data Revolution for Sustainable Development. “A World That Counts, Mobilizing the Data Revolution,” 2014, http://bit.ly/2am5K28.

  • This report focuses on the potential benefits and risks data holds for sustainable development. Included in this is a strategic framework for using and managing data for humanitarian purposes. It describes a need for a multinational consensus to be developed to ensure data is shared effectively and efficiently.
  • It suggests that “people who are counted”—i.e., those who are included in data collection processes—have better development outcomes and a better chance for humanitarian response in emergency or conflict situations.
  • In particular, “better and more open data” is described as having the potential to “save money and create economic, social and environmental value” toward sustainable development ends.

The World Bank, “Digital Dividends: World Development Report 2016.” http://bit.ly/2aG9Kx5

  • This report examines “digital dividends,” or the development benefits of using digital technologies in the development space.
  • The authors argue that: “To get the most out of the digital revolution, countries also need to work on the “analog complements”—by strengthening regulations that ensure competition among businesses, by adapting workers’ skills to the demands of the new economy, and by ensuring that institutions are accountable.”
  • The “data revolution,” which includes both big data and open data, is listed as one of four “digital enablers.”
  • Open data’s impacts are explored across a number of cases and developing countries and regions, including: Nepal, Mexico, Southern Africa, Kenya, Moldova and the Philippines.
  • Despite a number of success stories, the authors argue that “sustained, impactful, scaled-up examples of big and open data in the developing world are still relatively rare,” and, in particular, that “Open data has far to go.” They point to the high correlation between countries’ readiness, implementation, and impact of open data and their GDP per capita as evidence of the room for improvement.

Open Data and Open Development

Reilly, Katherine and Juan P. Alperin, “Intermediation in Open Development: A Knowledge Stewardship Approach,” Global Media Journal (Canadian Edition), 2016, http://bit.ly/2atWyI8

  • This paper examines the intermediaries that “have emerged to facilitate open data and related knowledge production activities in development processes.”
  • In particular, the authors study the concept of “knowledge stewardship,” which “demands careful consideration of how—through what arrangements—open resources can best be provided, and how best to maximize the quality, sustainability, buy-in, and uptake of those resources.”
  • The authors describe five models of open data intermediation:
    • Decentralized
    • Arterial
    • Ecosystem
    • Bridging
    • Communities of practice

Reilly, Katherine and Rob McMahon, “Quality of openness: Evaluating the contributions of IDRC’s Information and Networks Program to open development.” International Development Research Centre, January 2015, http://bit.ly/2aD6h0U

  • This report describes the outcomes of IDRC’s Information and Networks (I&N) programme, focusing in particular on those related to the “quality of openness” of initiatives as well as their outcomes.
  • The research program explores “mechanisms that link open initiatives to human activities in ways that generate social innovations of significance to development. These include push factors such as data holders’ understanding of data usage, the preparedness or acceptance of user communities, institutional policies, and wider policies and regulations; as well as pull factors including the awareness, capacity and attitude of users. In other words, openly networked social processes rely on not just quality openness, but also on supportive environments that link open resources and the people who might leverage them to create improvements, whether in governance, education or knowledge production.”

Smith, M. and L. Elder, “Open ICT Ecosystems Transforming the Developing World,” Information Technologies and International Development, 2010, http://bit.ly/2au0qsW.

  • The paper seeks to examine the hypothesis that “open social arrangements, enabled by ICTs, can help to catalyze the development impacts of ICTs. In other words, open ICT ecosystems provide the space for the amplification and transformation of social activities that can be powerful drivers of development.”
  • While the focus is placed on a number of ICT interventions – with open data only directly referenced as it relates to the science community – the lessons learned and overarching framework are applicable to the open data for development space.
  • The authors argue for a new research focus on “the new social activities enabled by different configurations of ICT ecosystems and their connections with particular social outcomes.” They point in particular to “modules of social practices that can be applied to solve similar problems across different development domains,” including “massive participation, collaborative production of content, collaborative innovation, collective information validation, new ‘open’ organizational models, and standards and knowledge transfer.”

Smith, Matthew and Katherine M. A. Reilly (eds), “Open Development: Networked Innovations in International Development,” MIT Press, 2013, http://bit.ly/2atX2hu.

  • This edited volume considers the implications of the emergence of open networked models predicated on digital network technologies for development. In their introduction, the editors emphasize that openness is a means to support development, not an end, which is layered upon existing technological and social structures. While openness is often disruptive, it depends upon some measure of closedness and structure in order to function effectively.
  • Subsequent, separately authored chapters provide case studies of open development drawn from health, biotechnology, and education, and explore some of the political and structural barriers faced by open models.  

van den Broek, Tijs, Marijn Rijken, Sander van Oort, “Towards Open Development Data: A review of open development data from a NGO perspective,” 2012, http://bit.ly/2ap5E8a

  • In this paper, the authors seek to answer the question: “What is the status, potential and required next steps of open development data from the perspective of the NGOs?”
  • They argue that “the take-up of open development data by NGOs has shown limited progress in the last few years,” and offer “several steps to be taken before implementation” to increase the effectiveness of NGOs’ use of open data to improve development efforts:
    • Develop a vision on open development and open data
    • Develop a clear business case
    • Research the benefits and risks of open development data and raise organizational and political awareness and support
    • Develop an appealing business model for data intermediaries and end-users
    • Balance data quality and timeliness
    • Deal with data obesity
    • Enrich quantitative data to overcome a quantitative bias
    • Monitor implementation and share best practices.

Open Data and Development Goals

Berdou, Evangelia, “Mediating Voices and Communicating Realities: Using Information Crowdsourcing Tools, Open Data Initiatives and Digital Media to Support and Protect the Vulnerable and Marginalised,” Institute of Development Studies, 2011, http://bit.ly/2aqbycg.

  • This report examines how “open source information crowdsourcing platforms like Ushahidi, and open mapping and data initiatives like OpenStreetMap, are enabling citizens in developing countries to generate and disseminate information critical for their lives and livelihoods.”
  • The authors focus in particular on:
    • “the role of the open source social entrepreneur as a new development actor
    • the complexity of the architectures of participation supported by these platforms and the need to consider them in relation to the decision-making processes that they aim to support and the roles in which they cast citizens
    • the possibilities for cross-fertilisation of ideas and the development of new practices between development practitioners and technology actors committed to working with communities to improve lives and livelihoods.”
  • While the use of ICTs and open data pose numerous potential benefits for supporting and protecting the vulnerable and marginalised, the authors call for greater attention to:
    • challenges emerging from efforts to sustain participation and govern the new information commons in under-resourced and politically contested spaces
    • complications and risks emerging from the desire to share information freely in such contexts
    • gaps between information provision, transparency and accountability, and the slow materialisation of projects’ wider social benefits

Canares, Michael, Satyarupa Shekhar, “Open Data and Sub-national Governments: Lessons from Developing Countries,” 2015, http://bit.ly/2au2gu2

  • This synthesis paper seeks to gain a greater understanding of open data’s effects on local contexts – “where data is collected and stored, where there is strong feasibility that data will be published, and where data can generate the most use and impact” – through the examination of nine papers developed as part of the Open Data in Developing Countries research project.
  • The authors point to three central findings:
    • “There is substantial effort on the part of sub-national governments to proactively disclose data, however, the design delimits citizen participation, and eventually, use.”
    • “Context demands different roles for intermediaries and different types of initiatives to create an enabling environment for open data.”
    • “Data quality will remain a critical challenge for sub-national governments in developing countries and it will temper potential impact that open data will be able to generate.”

Davies, Tim, “Open Data in Developing Countries – Emerging Insights from Phase I,” ODDC, 2014, http://bit.ly/2aX55UW

  • This report synthesizes findings from the Exploring the Emerging Impacts of Open Data in Developing Countries (ODDC) research network and its study of open data initiatives in 13 countries.
  • Davies provides 15 initial insights across the supply, mediation, and use of open data, including:
    • Open data initiatives can create new spaces for civil society to pursue government accountability and effectiveness;
    • Intermediaries are vital to both the supply and the use of open data; and
    • Digital divides create data divides in both the supply and use of data.

Davies, Tim, Duncan Edwards, “Emerging Implications of Open and Linked Data for Knowledge Sharing Development,” IDS Bulletin, 2012, http://bit.ly/2aLKFyI

  • This article explores “issues that development sector knowledge intermediaries may need to engage with to ensure the socio-technical innovations of open and linked data work in the interests of greater diversity and better development practice.”
  • The authors explore a number of case studies where open and linked data was used in a development context, including:
    • Open research: IDS and R4D meta-data
    • Open aid: International Aid Transparency Initiative
    • Open linked statistics: Young Lives
  • Based on lessons learned from these cases, the authors argue that “openness must serve the interests of marginalised and poor people. This is pertinent at three levels:
    • practices in the publication and communication of data
    • capacities for, and approaches to, the use of data
    • development and emergent structuring of open data ecosystems.

Davies, Tim, Fernando Perini, and Jose Alonso, “Researching the Emerging Impacts of Open Data,” ODDC, 2013, http://bit.ly/2aqb6uP

  • This research report offers a conceptual framework for open data, with a particular focus on open data in developing countries.
  • The conceptual framework comprises three central elements:
    • Open Data
      • About government
      • About companies & markets
      • About citizens
    • Domains of governance
      • Political domains
      • Economic domains
      • Social domains
    • Emerging Outcomes
      • Transparency & accountability
      • Innovation & economic growth
      • Inclusion & empowerment
  • The authors describe three central theories of change related to open data’s impacts:
    • Open data will bring about greater transparency in government, which in turn brings about greater accountability of key actors to make decisions and apply rules in the public interest;
    • Open data will enable non-state innovators to improve public services or build innovative products and services with social and economic value; open data will shift certain decision making from the state into the market, making it more efficient;
    • Open data will remove power imbalances that resulted from asymmetric information, and will bring new stakeholders into policy debates, giving marginalised groups a greater say in the creation and application of rules and policy.

Montano, Elise and Diogo Silva, “Exploring the Emerging Impacts of Open Data in Developing Countries (ODDC): ODDC1 Follow-up Outcome Evaluation Report,” ODDC, 2016, http://bit.ly/2au65z7.

  • This report summarizes the findings of a two-and-a-half-year research-driven project sponsored by the World Wide Web Foundation to explore how open data improves governance in developing countries and to build capacity in these countries to engage with open data. The research was conducted through 17 subgrants to partners from 12 countries.
  • Upon evaluation in 2014, partners reported increased capacity and expertise in dealing with open data; empowerment in influencing local and regional open data trends, particularly among CSOs; and increased understanding of open data among policy makers with whom the partners were in contact.

Smith, Fiona, William Gerry, Emma Truswell, “Supporting Sustainable Development with Open Data,” Open Data Institute, 2015, http://bit.ly/2aJwxsF

  • This report describes the potential benefits, challenges and next steps for leveraging open data to advance the Sustainable Development Goals.
  • The authors argue that open data’s greatest potential contributions to development are to:
    • More effectively target aid money and improve development programmes
    • Track development progress and prevent corruption
    • Contribute to innovation, job creation and economic growth.
  • They note, however, that many challenges to such impact exist, including:
    • A weak enabling environment for open data publishing
    • Poor data quality
    • A mismatch between the demand for open data and the supply of appropriate datasets
    • A ‘digital divide’ between rich and poor, affecting both the supply and use of data
    • A general lack of quantifiable data and metrics.
  • The report articulates a number of ways that “governments, donors and (international) NGOs – with the support of researchers, civil society and industry – can apply open data to help make the SDGs a reality:
    • Reach global consensus around principles and standards, namely being ‘open by default’, using the Open Government Partnership’s Open Data Working Group as a global forum for discussion.
    • Embed open data into funding agreements, ensuring that relevant, high-quality data is collected to report against the SDGs. Funders should mandate that data relating to performance of services, and data produced as a result of funded activity, be released as open data.
    • Build a global partnership for sustainable open data, so that groups across the public and private sectors can work together to build sustainable supply and demand for data in the developing world.”

The World Bank, “Open Data for Sustainable Development,” Policy Note, August 2015, http://bit.ly/2aGjaJ4

  • This report from the World Bank seeks to describe open data’s potential for achieving the Sustainable Development Goals, and makes a number of recommendations toward that end.
  • The authors describe four key benefits of open data use for developing countries:
    • Foster economic growth and job creation
    • Improve efficiency, effectiveness and coverage of public services
    • Increase transparency, accountability, and citizen participation
    • Facilitate better information sharing within government
  • The paper concludes with a number of recommendations for improving open data programs, including:
    • Support Open Data use through legal and licensing frameworks.
    • Make data available for free online.
    • Publish data inventories for the government’s data resources.
    • Create feedback channels to government from current and potential data users.
    • Prioritize the datasets that users want.

Open Data and Developing Countries (National Case Studies)

Beghin, Nathalie and Carmela Zigoni, “Measuring Open Data’s Impact on Brazilian National and Sub-National Budget Transparency Websites and Its Impacts on People’s Rights,” 2014, http://bit.ly/2au3LaQ.

  • This report examines the impact of a Brazilian law requiring government entities to “provide real-time information on their budgets and spending through electronic means.” The authors explore “whether the national and state capitals are in fact using principles and practices of open data in their disclosures” and evaluate “the emerging impacts of open budget data disclosed through the national transparency portal.”
  • The report leveraged a “quantitative survey of budget and financial disclosures, and qualitative research with key stakeholders” to explore the “role of technical platforms and intermediaries in supporting the use of budget data by groups working in pursuit of social change and human rights.”
  • The survey found that:
    • The information provided is complete
    • In general, the data are not primary
    • Most governments do not provide timely information
    • Access to information is not ensured to all individuals
    • Advances were observed in terms of the availability of machine-processable data
    • Access is free, without discriminating users
    • The minority presents data in non-proprietary format
    • It is not known whether the data are under license

Boyera, S., C. Iglesias, “Open Data in Developing Countries: State of the Art,” Partnership for Open Data, 2014, http://bit.ly/2acBMR7

  • This report provides a summary of the State-of-the-Art study developed by SBC4D for the Partnership for Open Data (POD).
  • A series of interviews and responses to an online questionnaire yielded a number of findings, including:
    • “The number of actors interested in Open Data in Developing Countries is growing quickly. The study has identified 160+ organizations. It is important to note that a majority of them are just engaging in the domain and have little past experience. Most of these actors are focused on OD as an objective not a tool or means to increase impact or outcome.
    • Local actors are strong advocates of public data release. Lots of them are also promoting the re-use of existing data (through e.g. the organization of training, hackathons and alike). However, the study has not identified many actors practically using OD in their work or engaged in releasing their own data.
    • Traditional development sectors (health, education, agriculture, energy, transport) are not yet the target of many initiatives, and are clearly underdeveloped in terms of use-cases.
    • There is very little connection between horizontal (e.g. national OD initiatives) and vertical (sector-specific initiatives on e.g. extractive industry, or disaster management) activities”

Canares, M.P., J. de Guia, M. Narca, J. Arawiran, “Opening the Gates: Will Open Data Initiatives Make Local Governments in the Philippines More Transparent?” Open LGU Research Project, 2014, http://bit.ly/2au3Ond

  • This paper seeks to determine the impacts of the Department of Interior and Local Government of the Philippines’ Full Disclosure Policy, affecting financial and procurement data, on both data providers and data users.
  • The paper uncovered two key findings:
    • “On the supply side, incentivising openness is a critical aspect in ensuring that local governments have the interest to disclose financial data. While at this stage, local governments are still on compliance behaviour, it encourages the once reluctant LGUs to disclose financial information in the use of public funds, especially when technology and institutional arrangements are in place. However, LGUs do not make an effort to inform the public that information is available online and has not made data accessible in such a way that it can allow the public to perform computations and analysis. Currently, no data standards have been made yet by the Philippine national government in terms of format and level of detail.”
    • “On the demand side, there is limited awareness on the part of the public, and more particularly the intermediaries (e.g. business groups, civil society organizations, research institutions), on the availability of data, and thus, its limited use. As most of these data are financial in nature, it requires a certain degree of competence and expertise so that they will be able to make use of the data in demanding from government better services and accountability.”
  • The authors argue that “openness is not just about governments putting meaningful government data out into the public domain, but also about making the public meaningfully engage with governments through the use of open government data.” In order to do that, policies should “require observance of open government data standards and a capacity building process of ensuring that the public, to whom the data is intended, are aware and able to use the data in ensuring more transparent and accountable governance.”

Canares, M., M. Narca, and D. Marcial, “Enhancing Citizen Engagement Through Open Government Data,” ODDC, 2015, http://bit.ly/2aJMhfS

  • This research paper seeks to gain a greater understanding of how civil society organizations can increase or initiate their use of open data. The study is based on research conducted in “two provinces in the Philippines where civil society organizations in Negros Oriental province were trained, and in the Bohol province were mentored on accessing and using open data.”
  • The authors seek to answer three central research questions:
    • What do CSOs know about open government data? What do they know about government data that their local governments are publishing in the web?
    • What do CSOs have in terms of skills that would enable them to engage meaningfully with open government data?
    • How best can capacity building be delivered to civil society organizations to ensure that they learn to access and use open government data to improve governance?
  • They provide a number of key lessons, including:
    • Baseline condition should inform capacity building approach
    • Data use is dependent on data supply
    • Open data requires accessible and stable internet connection
    • Open data skills are important but insufficient
    • Outcomes, and not just outputs, prove capacity improvements

Chattapadhyay, Sumandro, “Opening Government Data through Mediation: Exploring the Roles, Practices and Strategies of Data Intermediary Organisations in India,” ODDC, 2014, http://bit.ly/2au3F37

  • This report seeks to gain a greater understanding of the current practice following the Government of India’s 2012 National Data Sharing and Accessibility Policy.
  • Chattapadhyay examines the open government data practices of “various (non-governmental) ‘data intermediary organisations’ on the one hand, and implementation challenges faced by managers of the Open Government Data Platform of India on the other.”
  • The report’s objectives are:
    • To undertake a provisional mapping of government data related activities across different sectors to understand the nature of the “open data community” in India,
    • To enrich government data/information policy discussion in India by gathering evidence and experience of (non-governmental) data intermediaries regarding their actual practices of accessing and sharing government data, and their utilisation of the provisions of the NDSAP and the RTI Act, and
    • To critically reflect on the nature of open data practices in India.

Chiliswa, Zacharia, “Open Government Data for Effective Public Participation: Findings of a Case Study Research Investigating Kenya’s Open Data Initiative in Urban Slums and Rural Settlements,” ODDC, April 2014, http://bit.ly/2au8E4s

  • This research report is the product of a study of two urban slums and a rural settlement in Nairobi, Mombasa and Isiolo County, respectively, aimed at gaining a better understanding of the awareness and use of Kenya’s open data.
  • The study had four organizing objectives:
    • “Investigate the impact of the Kenyan Government’s open data initiative and to see whether, and if so how, it is assisting marginalized communities and groups in accessing key social services and information such as health and education;
    • Understand the way people use the information provided by the Open Data Initiative;
    • Identify people’s trust in the information and how it can assist their day-to-day lives;
    • Examine ways in which the public wish for the open data initiative to improve, particularly in relation to governance and service delivery.”
  • The study uncovered four central findings about Kenya’s open data initiative:
    • “There is a mismatch between the data citizens want to have and the data the Kenya portal and other intermediaries have provided.
    • Most people go to local information intermediaries instead of going directly to the government data portals and that there are few connections between these intermediaries and the wider open data sources.
    • Currently the rural communities are much less likely to seek out government information.
    • The kinds of data needed to support service delivery in Kenya may be different from those needed in other places in the world.”

Lwanga-Ntale, Charles, Beatrice Mugambe, Bernard Sabiti, Peace Nganwa, “Understanding how open data could impact resource allocation for poverty eradication in Kenya and Uganda,” ODDC, 2014, http://bit.ly/2aHqYKi

  • This paper draws on case studies from Uganda and Kenya to explore an open data movement seeking to address “age-old” issues including “transparency, accountability, equity, and the relevance, effectiveness and efficiency of governance.”
  • The authors focus both on the role that “emerging open data processes in the two countries may be playing in promoting citizen/public engagement and the allocation of resources,” and on the “possible negative impacts that may emerge due to the ‘digital divide’ between those who have access to data (and technology) and those who do not.”
  • They offer a number of recommendations to the government of Uganda and Kenya that could be more broadly applicable, including:
    • Promote sector and cross sector specific initiatives that enable collaboration and transparency through different e-transformation strategies across government sectors and agencies.
    • Develop and champion the capacity to drive transformation across government and to advance skills in its institutions and civil service.

Sapkota, Krishna, “Exploring the emerging impacts of open aid data and budget data in Nepal,” Freedom Forum, August 2014, http://bit.ly/2ap0z5G

  • This research report seeks to answer five key questions regarding the opening of aid and budget data in Nepal:
    • What is the context for open aid and budget data in Nepal?
    • What sorts of budget and aid information is being made available in Nepal?
    • What is the governance of open aid and budget data in Nepal?
    • How are relevant stakeholders making use of open aid and budget data in Nepal?
    • What are the emerging impacts of open aid and budget data in Nepal?
  • The study uncovered a number of findings, including:
    • “Information and data can play an important role in addressing key social issues, and that whilst some aid and budget data is increasingly available, including in open data formats, there is not yet a sustainable supply of open data direct from official sources that meet the needs of the different stakeholders we consulted.”
    • “Expectations amongst government, civil society, media and private sector actors that open data could be a useful resource in improving governance, and we found some evidence of media making use of data to drive stories more when they had the right skills, incentives and support.”
    • “The context of Nepal also highlights that a more critical perspective may be needed on the introduction of open data, understanding the specific opportunities and challenges for open data supply and use in a country that is currently undergoing a period of constitutional development, institution building and deepening democracy.”

Srivastava, Nidhi, Veena Agarwal, Anmol Soni, Souvik Bhattacharjya, Bibhu P. Nayak, Harsha Meenawat, Tarun Gopalakrishnan, “Open government data for regulation of energy resources in India,” ODDC, 2014, http://bit.ly/2au9oXf

  • This research paper examines “the availability, accessibility and use of open data in the extractive energy industries sector in India.”
  • The authors describe a number of challenges being faced by:
    • Data suppliers and intermediaries:
      • Lack of clarity on mandate
      • Agency specific issues
      • Resource challenges
      • Privacy issues of commercial data and contractual constraints
      • Formats for data collection
      • Challenges in providing timely data
      • Recovery of costs and pricing of data
    • Data users
      • Data available but inaccessible
      • Data accessible but not usable
      • Timeliness of data
  • They make a number of recommendations for addressing these challenges focusing on:
    • Policy measures
    • Improving data quality
    • Improving effectiveness of data portal

van Schalkwyk, François, Michael Cañares, Sumandro Chattapadhyay and Alexander Andrason, “Open Data Intermediaries in Developing Countries,” ODDC, 2015, http://bit.ly/2aJztWi

  • This paper seeks to provide “a more socially nuanced approach to open data intermediaries,” moving beyond the traditional approach wherein data intermediaries are “presented as single and simple linkages between open data supply and use.”
  • The study’s analysis draws on cases from the Emerging Impacts of Open Data in Developing Countries (ODDC) project.
  • The authors provide a working definition of open data intermediaries: An open data intermediary is an agent:
    • positioned at some point in a data supply chain that incorporates an open dataset,
    • positioned between two agents in the supply chain, and
    • facilitates the use of open data that may otherwise not have been the case.
  • One of the study’s key findings is that “Intermediation does not only consist of a single agent facilitating the flow of data in an open data supply chain; multiple intermediaries may operate in an open data supply chain, and the presence of multiple intermediaries may increase the probability of use (and impact) because no single intermediary is likely to possess all the types of capital required to unlock the full value of the transaction between the provider and the user in each of the fields in play.”

van Schalkwyk, François, Michelle Willmers and Tobias Schonwetter, “Embedding Open Data Practice,” ODDC, 2015, http://bit.ly/2aHt5xu

  • This research paper was developed as part of the ODDC Phase 2 project and seeks to address the “insufficient attention paid to the institutional dynamics within governments and how these may be impeding open data practice.”
  • The study focuses in particular on open data initiatives in South Africa and Kenya, leveraging a conceptual framework to allow for meaningful comparison between the two countries.
  • Focusing on South Africa and Kenya, as well as Africa as a whole, the authors seek to address four central research questions:
    • Is open data practice being embedded in African governments?
    • What are the possible indicators of open data practice being embedded?
    • What do the indicators reveal about resistance to or compliance with pressures to adopt open data practice?
    • What are different effects of multiple institutional domains that may be at play in government as an organisation?

van Schalkwyk, François, Michelle Willmers, and Laura Czerniewicz, “Case Study: Open Data in the Governance of South African Higher Education,” ODDC, 2014, http://bit.ly/2amgIFb

  • This research report uses the South African Centre for Higher Education Transformation (CHET) open data platform as a case study to examine “the supply of and demand for open data as well as the roles of intermediaries in the South African higher education governance ecosystem.”
  • The report’s findings include:
    • “There are concerns at both government and university levels about how data will be used and (mis)interpreted, and this may constrain future data supply. Education both at the level of supply (DHET) and at the level of use by the media in particular on how to improve the interpretability of data could go some way in countering current levels of mistrust. Similar initiatives may be necessary to address uneven levels of data use and trust apparent across university executives and councils.”
    • “Open data intermediaries increase the accessibility and utility of data. While there is a rich publicly-funded dataset on South African higher education, the data remains largely inaccessible and unusable to universities and researchers in higher education studies. Despite these constraints, the findings show that intermediaries in the ecosystem are playing a valuable role in making the data both available and useable.”
    • “Open data intermediaries provide both supply-side as well as demand-side value. CHET’s work on higher education performance indicators was intended not only to contribute to government’s steering mechanisms, but also to contribute to the governance capacity of South African universities. The findings support the use of CHET’s open data to build capacity within universities. Further research is required to confirm the use of CHET data in state-steering of the South African higher education system, although there is some evidence of CHET’s data being referenced in national policy documents.”

Verhulst, Stefaan and Andrew Young, “Open Data Impact: When Demand and Supply Meet,” The GovLab, 2016, http://bit.ly/1LHkQPO

  • This report provides a taxonomy of the impacts open data is having on a number of countries around the world, comprising:
    • Improving Government
    • Empowering Citizens
    • Creating Opportunity
    • Solving Public Problems
  • The authors describe four key enabling conditions for creating impactful open data initiatives:
    • Partnerships
    • Public Infrastructure
    • Policies and Performance Metrics
    • Problem Definition

Additional Resource:
World Bank Readiness Assessment Tool

  • To aid in the assessment “of the readiness of a government or individual agency to evaluate, design and implement an Open Data initiative,” the World Bank’s Open Government Data Working Group developed an openly accessible Open Data Readiness Assessment (ODRA) tool.

Selected Readings on Data Collaboratives


By Neil Britto, David Sangokoya, Iryna Susha, Stefaan Verhulst and Andrew Young

The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of data collaboratives was originally published in 2017.

The term data collaborative refers to a new form of collaboration, beyond the public-private partnership model, in which participants from different sectors (including private companies, research institutions, and government agencies) can exchange data to help solve public problems. Several of society’s greatest challenges — from addressing climate change to public health to job creation to improving the lives of children — require greater access to data, more collaboration between public- and private-sector entities, and an increased ability to analyze datasets. In the coming months and years, data collaboratives will be essential vehicles for harnessing the vast stores of privately held data toward the public good.

Annotated Selected Readings List (in alphabetical order)

Agaba, G., Akindès, F., Bengtsson, L., Cowls, J., Ganesh, M., Hoffman, N., . . . Meissner, F. “Big Data and Positive Social Change in the Developing World: A White Paper for Practitioners and Researchers.” 2014. http://bit.ly/25RRC6N.

  • This white paper, produced by “a group of activists, researchers and data experts,” explores the potential of big data to improve development outcomes and spur positive social change in low- and middle-income countries. Using examples, the authors discuss four areas in which the use of big data can impact development efforts:
    • Advocating and facilitating by “open[ing] up new public spaces for discussion and awareness building”;
    • Describing and predicting through the detection of “new correlations and the surfac[ing] of new questions”;
    • Facilitating information exchange through “multiple feedback loops which feed into both research and action,” and
    • Promoting accountability and transparency, especially as a byproduct of crowdsourcing efforts aimed at “aggregat[ing] and analyz[ing] information in real time.”
  • The authors argue that in order to maximize the potential of big data’s use in development, “there is a case to be made for building a data commons for private/public data, and for setting up new and more appropriate ethical guidelines.”
  • They also identify a number of challenges, especially when leveraging data made accessible from a number of sources, including private sector entities, such as:
    • Lack of general data literacy;
    • Lack of open learning environments and repositories;
    • Lack of resources, capacity and access;
    • Challenges of sensitivity and risk perception with regard to using data;
    • Storage and computing capacity; and
    • Externally validating data sources for comparison and verification.

Ansell, C. and Gash, A. “Collaborative Governance in Theory and Practice.” Journal of Public Administration Research and Theory 18 (4), 2008. http://bit.ly/1RZgsI5.

  • This article describes collaborative arrangements that include public and private organizations working together and proposes a model for understanding an emergent form of public-private interaction informed by 137 diverse cases of collaborative governance.
  • The article suggests factors significant to successful partnering processes and outcomes include:
    • Shared understanding of challenges,
    • Trust building processes,
    • The importance of recognizing seemingly modest progress, and
    • Strong indicators of commitment to the partnership’s aspirations and process.
  • The authors provide a “contingency theory model” that specifies relationships between different variables that influence outcomes of collaborative governance initiatives. Three “core contingencies” for successful collaborative governance initiatives identified by the authors are:
    • Time (e.g., decision making time afforded to the collaboration);
    • Interdependence (e.g., a high degree of interdependence can mitigate negative effects of low trust); and
    • Trust (e.g., a higher level of trust indicates a higher probability of success).

Ballivian A, Hoffman W. “Public-Private Partnerships for Data: Issues Paper for Data Revolution Consultation.” World Bank, 2015. Available from: http://bit.ly/1ENvmRJ

  • This World Bank report provides a background document on forming public-private partnerships for data with the private sector in order to inform the UN’s Independent Expert Advisory Group (IEAG) on sustaining a “data revolution” in sustainable development.
  • The report highlights the critical position of private companies within the data value chain and reflects on key elements of a sustainable data PPP: “common objectives across all impacted stakeholders, alignment of incentives, and sharing of risks.” In addition, the report describes the risks and incentives of public and private actors, and the principles needed to “build[ing] the legal, cultural, technological and economic infrastructures to enable the balancing of competing interests.” These principles include understanding; experimentation; adaptability; balance; persuasion and compulsion; risk management; and governance.
  • Examples of data collaboratives cited in the report include HP Earth Insights, Orange Data for Development Challenges, Amazon Web Services, IBM Smart Cities Initiative, and the Governance Lab’s Open Data 500.

Brack, Matthew, and Tito Castillo. “Data Sharing for Public Health: Key Lessons from Other Sectors.” Chatham House, Centre on Global Health Security. April 2015. Available from: http://bit.ly/1DHFGVl

  • The Chatham House report provides an overview of public health surveillance data sharing, highlighting the benefits and challenges of shared health data and the complexity of adapting technical solutions from other sectors for public health.
  • The report describes data sharing processes from several perspectives, including in-depth case studies of actual data sharing in practice at the individual, organizational and sector levels. Among the key lessons for public health data sharing, the report strongly highlights the need to harness momentum for action and maintain collaborative engagement: “Successful data sharing communities are highly collaborative. Collaboration holds the key to producing and abiding by community standards, and building and maintaining productive networks, and is by definition the essence of data sharing itself. Time should be invested in establishing and sustaining collaboration with all stakeholders concerned with public health surveillance data sharing.”
  • Examples of data collaboratives include H3Africa (a collaboration between NIH and Wellcome Trust) and NHS England’s care.data programme.

de Montjoye, Yves-Alexandre, Jake Kendall, and Cameron F. Kerry. “Enabling Humanitarian Use of Mobile Phone Data.” The Brookings Institution, Issues in Technology Innovation. November 2014. Available from: http://brook.gs/1JxVpxp

  • Using Ebola as a case study, the authors describe the value of using private telecom data for uncovering “valuable insights into understanding the spread of infectious diseases as well as strategies into micro-target outreach and driving uptake of health-seeking behavior.”
  • The authors highlight the absence of a common legal and standards framework for “sharing mobile phone data in privacy-conscientious ways” and recommend “engaging companies, NGOs, researchers, privacy experts, and governments to agree on a set of best practices for new privacy-conscientious metadata sharing models.”

Eckartz, Silja M., Hofman, Wout J., Van Veenstra, Anne Fleur. “A decision model for data sharing.” Vol. 8653 LNCS. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2014. http://bit.ly/21cGWfw.

  • This paper proposes a decision model for data sharing of public and private data based on literature review and three case studies in the logistics sector.
  • The authors identify five categories of the barriers to data sharing and offer a decision model for identifying potential interventions to overcome each barrier:
    • Ownership. Possible interventions likely require improving trust among those who own the data through, for example, involvement and support from higher management.
    • Privacy. Interventions include “anonymization by filtering of sensitive information and aggregation of data,” and access control mechanisms built around identity management and regulated access.  
    • Economic. Interventions include a model where data is shared only with a few trusted organizations, and yield management mechanisms to ensure negative financial consequences are avoided.
    • Data quality. Interventions include identifying additional data sources that could improve the completeness of datasets, and efforts to improve metadata.
    • Technical. Interventions include making data available in structured formats and publishing data according to widely agreed upon data standards.

Hoffman, Sharona and Podgurski, Andy. “The Use and Misuse of Biomedical Data: Is Bigger Really Better?” American Journal of Law & Medicine 497, 2013. http://bit.ly/1syMS7J.

  • This journal article explores the benefits and, in particular, the risks related to large-scale biomedical databases bringing together health information from a diversity of sources across sectors. Some data collaboratives examined in the piece include:
    • MedMining – a company that extracts EHR data, de-identifies it, and offers it to researchers. The data sets that MedMining delivers to its customers include ‘lab results, vital signs, medications, procedures, diagnoses, lifestyle data, and detailed costs’ from inpatient and outpatient facilities.
    • Explorys has formed a large healthcare database derived from financial, administrative, and medical records. It has partnered with major healthcare organizations such as the Cleveland Clinic Foundation and Summa Health System to aggregate and standardize health information from ten million patients and over thirty billion clinical events.
  • Hoffman and Podgurski note that biomedical databases have many potential uses, with those likely to benefit including: “researchers, regulators, public health officials, commercial entities, lawyers,” as well as “healthcare providers who conduct quality assessment and improvement activities,” regulatory monitoring entities like the FDA, and “litigants in tort cases to develop evidence concerning causation and harm.”
  • They argue, however, that risks arise because:
    • The data contained in biomedical databases is surprisingly likely to be incorrect or incomplete;
    • Systemic biases, arising from both the nature of the data and the preconceptions of investigators, are serious threats to the validity of research results, especially in answering causal questions; and
    • Data mining of biomedical databases makes it easier for individuals with political, social, or economic agendas to generate ostensibly scientific but misleading research findings for the purpose of manipulating public opinion and swaying policymakers.

Krumholz, Harlan M., et al. “Sea Change in Open Science and Data Sharing Leadership by Industry.” Circulation: Cardiovascular Quality and Outcomes 7.4. 2014. 499-504. http://1.usa.gov/1J6q7KJ

  • This article provides a comprehensive overview of industry-led efforts and cross-sector collaborations in data sharing by pharmaceutical companies to inform clinical practice.
  • The article details the types of data being shared and the early activities of GlaxoSmithKline (“in coordination with other companies such as Roche and ViiV”); Medtronic and the Yale University Open Data Access Project; and Janssen Pharmaceuticals (Johnson & Johnson). The article also describes the range of involvement in data sharing among pharmaceutical companies including Pfizer, Novartis, Bayer, AbbVie, Eli Lilly, AstraZeneca, and Bristol-Myers Squibb.

Mann, Gideon. “Private Data and the Public Good.” Medium. May 17, 2016. http://bit.ly/1OgOY68.

  • This Medium post from Gideon Mann, the Head of Data Science at Bloomberg, shares his prepared remarks given at a lecture at the City College of New York. Mann argues for the potential benefits of increasing access to private sector data, both to improve research and academic inquiry and also to help solve practical, real-world problems. He also describes a number of initiatives underway at Bloomberg along these lines.
  • Mann argues that data generated at private companies “could enable amazing discoveries and research,” but is often inaccessible to those who could put it to those uses. Beyond research, he notes that corporate data could, for instance, benefit:
    • Public health – including suicide prevention, addiction counseling and mental health monitoring.
    • Legal and ethical questions – especially as they relate to “the role algorithms have in decisions about our lives,” such as credit checks and resume screening.
  • Mann recognizes the privacy challenges inherent in private sector data sharing, but argues that it is a common misconception that the only two choices are “complete privacy or complete disclosure.” He believes that flexible frameworks for differential privacy could open up new opportunities for responsibly leveraging data collaboratives.

Pastor Escuredo, D., Morales-Guzmán, A., et al. “Flooding through the Lens of Mobile Phone Activity.” IEEE Global Humanitarian Technology Conference, GHTC 2014. Available from: http://bit.ly/1OzK2bK

  • This report describes the use of mobile data to understand and manage the impact of disasters. The underlying study examined 2009 flooding in the Mexican state of Tabasco and was conducted by a multidisciplinary, multi-stakeholder consortium involving the UN World Food Programme (WFP), Telefonica Research, the Technical University of Madrid (UPM), the Digital Strategy Coordination Office of the President of Mexico, and UN Global Pulse.
  • Telefonica Research, a division of the major Latin American telecommunications company, provided call detail records covering flood-affected areas for nine months. This data was combined with “remote sensing data (satellite images), rainfall data, census and civil protection data.” The analysis demonstrated that “analysing mobile activity during floods could be used to potentially locate damaged areas, efficiently assess needs and allocate resources (for example, sending supplies to affected areas).”
  • In addition to the results, the study highlighted “the value of a public-private partnership on using mobile data to accurately indicate flooding impacts in Tabasco, thus improving early warning and crisis management.”

Perkmann, M. and Schildt, H. “Open data partnerships between firms and universities: The role of boundary organizations.” Research Policy, 44(5), 2015. http://bit.ly/25RRJ2c

  • This paper discusses the concept of a “boundary organization” in relation to industry-academic partnerships driven by data. Boundary organizations perform mediated revealing, allowing firms to disclose their research problems to a broad audience of innovators and simultaneously minimize the risk that this information would be adversely used by competitors.
  • The authors identify two especially important challenges for private firms to enter open data or participate in data collaboratives with the academic research community that could be addressed through more involvement from boundary organizations:
    • First is a challenge of maintaining competitive advantage. The authors note that, “the more a firm attempts to align the efforts in an open data research programme with its R&D priorities, the more it will have to reveal about the problems it is addressing within its proprietary R&D.”
    • Second is the misalignment of incentives between the private and academic fields. Perkmann and Schildt argue that a firm seeking to build collaborations around its opened data “will have to provide suitable incentives that are aligned with academic scientists’ desire to be rewarded for their work within their respective communities.”

Robin, N., Klein, T., & Jütting, J. “Public-Private Partnerships for Statistics: Lessons Learned, Future Steps.” OECD. 2016. http://bit.ly/24FLYlD.

  • This working paper acknowledges the growing body of work on how different types of data (e.g., telecom data, social media, sensors and geospatial data, etc.) can address data gaps relevant to National Statistical Offices (NSOs).
  • Four models of public-private interaction for statistics are described: in-house production of statistics by a data provider for a national statistics office (NSO); transfer of datasets from private entities to NSOs; transfer of data to a third-party provider that manages both NSO and private entity data; and the outsourcing of NSO functions.
  • The paper highlights challenges to public-private partnerships involving data (e.g., technical challenges, data confidentiality, risks, limited incentives for participation); suggests that deliberate and highly structured approaches to such partnerships require enforceable contracts; and emphasizes both the trade-off between data specificity and accessibility and the importance of pricing mechanisms that reflect the capacity and capability of national statistical offices.
  • Case studies referenced in the paper include:
    • A mobile network operator’s (MNO Telefonica) in house analysis of call detail records;
    • A third-party data provider and steward of travel statistics (Positium);
    • The Data for Development (D4D) challenge organized by MNO Orange; and
    • Statistics Netherlands’ use of social media to predict consumer confidence.

Stuart, Elizabeth, Samman, Emma, Avis, William, Berliner, Tom. “The data revolution: finding the missing millions.” Overseas Development Institute, 2015. Available from: http://bit.ly/1bPKOjw

  • The authors of this report highlight the need for good quality, relevant, accessible and timely data for governments to extend services into underrepresented communities and implement policies towards a sustainable “data revolution.”
  • The solutions in this Overseas Development Institute report focus on capacity-building activities of national statistical offices (NSOs), alternative sources of data (including shared corporate data) to address gaps, and building strong data management systems.

Taylor, L., & Schroeder, R. “Is bigger better? The emergence of big data as a tool for international development policy.” GeoJournal, 80(4). 2015. 503-518. http://bit.ly/1RZgSy4.

  • This journal article describes how privately held data – namely “digital traces” of consumer activity – “are becoming seen by policymakers and researchers as a potential solution to the lack of reliable statistical data on lower-income countries.”
  • They focus especially on three categories of data collaborative use cases:
    • Mobile data as a predictive tool for issues such as human mobility and economic activity;
    • Use of mobile data to inform humanitarian response to crises; and
    • Use of born-digital web data as a tool for predicting economic trends, and the implications these have for LMICs.
  • They note, however, that a number of challenges and drawbacks exist for these types of use cases, including:
    • Access to private data sources often must be negotiated or bought, “which potentially means substituting negotiations with corporations for those with national statistical offices;”
    • The meaning of such data is not always simple or stable, and local knowledge is needed to understand how people are using the technologies in question;
    • Bias in proprietary data can be hard to understand and quantify;
    • Lack of privacy frameworks; and
    • Power asymmetries, wherein “LMIC citizens are unwittingly placed in a panopticon staffed by international researchers, with no way out and no legal recourse.”

van Panhuis, Willem G., Proma Paul, Claudia Emerson, John Grefenstette, Richard Wilder, Abraham J. Herbst, David Heymann, and Donald S. Burke. “A systematic review of barriers to data sharing in public health.” BMC public health 14, no. 1 (2014): 1144. Available from: http://bit.ly/1JOBruO

  • The authors of this report provide a “systematic literature review of potential barriers to public health data sharing.” These twenty potential barriers are classified in six categories: “technical, motivational, economic, political, legal and ethical.” In this taxonomy, “the first three categories are deeply rooted in well-known challenges of health information systems for which structural solutions have yet to be found; the last three have solutions that lie in an international dialogue aimed at generating consensus on policies and instruments for data sharing.”
  • The authors suggest the need for a “systematic framework of barriers to data sharing in public health” in order to accelerate access and use of data for public good.

Verhulst, Stefaan and Sangokoya, David. “Mapping the Next Frontier of Open Data: Corporate Data Sharing.” In: Gasser, Urs and Zittrain, Jonathan and Faris, Robert and Heacock Jones, Rebekah, “Internet Monitor 2014: Reflections on the Digital World: Platforms, Policy, Privacy, and Public Discourse (December 15, 2014).” Berkman Center Research Publication No. 2014-17. http://bit.ly/1GC12a2

  • This essay describes a taxonomy of current corporate data sharing practices for public good: research partnerships; prizes and challenges; trusted intermediaries; application programming interfaces (APIs); intelligence products; and corporate data cooperatives or pooling.
  • Examples of data collaboratives include: Yelp Dataset Challenge, the Digital Ecologies Research Partnership, BBVA Innova Challenge, Telecom Italia’s Big Data Challenge, NIH’s Accelerating Medicines Partnership and the White House’s Climate Data Partnerships.
  • The authors highlight important questions to consider towards a more comprehensive mapping of these activities.

Verhulst, Stefaan and Sangokoya, David, 2015. “Data Collaboratives: Exchanging Data to Improve People’s Lives.” Medium. Available from: http://bit.ly/1JOBDdy

  • The essay refers to data collaboratives as a new form of collaboration involving participants from different sectors exchanging data to help solve public problems. These forms of collaborations can improve people’s lives through data-driven decision-making; information exchange and coordination; and shared standards and frameworks for multi-actor, multi-sector participation.
  • The essay cites four activities that are critical to accelerating data collaboratives: documenting value and measuring impact; matching public demand and corporate supply of data in a trusted way; training and convening data providers and users; experimenting and scaling existing initiatives.
  • Examples of data collaboratives include NIH’s Precision Medicine Initiative; the Mobile Data, Environmental Extremes and Population (MDEEP) Project; and Twitter-MIT’s Laboratory for Social Machines.

Verhulst, Stefaan, Susha, Iryna, Kostura, Alexander. “Data Collaboratives: Matching Supply of (Corporate) Data to Solve Public Problems.” Medium. February 24, 2016. http://bit.ly/1ZEp2Sr.

  • This piece articulates a set of key lessons learned during a session at the International Data Responsibility Conference focused on identifying emerging practices, opportunities and challenges confronting data collaboratives.
  • The authors list a number of privately held data sources that could create positive public impacts if made more accessible in a collaborative manner, including:
    • Data for early warning systems to help mitigate the effects of natural disasters;
    • Data to help understand human behavior as it relates to nutrition and livelihoods in developing countries;
    • Data to monitor compliance with weapons treaties;
    • Data to more accurately measure progress related to the UN Sustainable Development Goals.
  • To the end of identifying and expanding on emerging practice in the space, the authors describe a number of current data collaborative experiments, including:
    • Trusted Intermediaries: Statistics Netherlands partnered with Vodafone to analyze mobile call data records in order to better understand mobility patterns and inform urban planning.
    • Prizes and Challenges: Orange Telecom, which has been a leader in this type of Data Collaboration, provided several examples of the company’s initiatives, such as the use of call data records to track the spread of malaria as well as their experience with Challenge 4 Development.
    • Research partnerships: The Data for Climate Action project is an ongoing large-scale initiative incentivizing companies to share their data to help researchers answer particular scientific questions related to climate change and adaptation.
    • Sharing intelligence products: JPMorgan Chase shares macroeconomic insights gained from its data through the newly established JPMorgan Chase Institute.
  • In order to capitalize on the opportunities provided by data collaboratives, a number of needs were identified:
    • A responsible data framework;
    • Increased insight into different business models that may facilitate the sharing of data;
    • Capacity to tap into the potential value of data;
    • Transparent stock of available data supply; and
    • Mapping emerging practices and models of sharing.

Vogel, N., Theisen, C., Leidig, J. P., Scripps, J., Graham, D. H., & Wolffe, G. “Mining mobile datasets to enable the fine-grained stochastic simulation of Ebola diffusion.” Paper presented at the Procedia Computer Science. 2015. http://bit.ly/1TZDroF.

  • The paper presents a research study conducted on the basis of the mobile call records shared with researchers in the framework of the Data for Development Challenge by the mobile operator Orange.
  • The study discusses the data analysis approach in relation to developing a simulation of Ebola diffusion built around “the interactions of multi-scale models, including viral loads (at the cellular level), disease progression (at the individual person level), disease propagation (at the workplace and family level), societal changes in migration and travel movements (at the population level), and mitigating interventions (at the abstract government policy level).”
  • The authors argue that the use of their population, mobility, and simulation models provide more accurate simulation details in comparison to high-level analytical predictions and that the D4D mobile datasets provide high-resolution information useful for modeling developing regions and hard to reach locations.

Welle Donker, F., van Loenen, B., & Bregt, A. K. “Open Data and Beyond.” ISPRS International Journal of Geo-Information, 5(4). 2016. http://bit.ly/22YtugY.

  • This research developed a monitoring framework to assess the effects of open (private) data, using a case study of Liander, a Dutch energy network administrator.
  • Focusing on the potential impacts of open private energy data – beyond ‘smart disclosure’ where citizens are given information only about their own energy usage – the authors identify three attainable strategic goals:
    • Continuously optimize performance on services, security of supply, and costs;
    • Improve management of energy flows and insight into energy consumption;
    • Help customers save energy and switch over to renewable energy sources.
  • The authors propose a seven-step framework for assessing the impacts of Liander data, in particular, and open private data more generally:
    • Develop a performance framework that describes what the program is about, including the organization’s mission and strategic goals;
    • Identify the most important elements, or key performance areas which are most critical to understanding and assessing your program’s success;
    • Select the most appropriate performance measures;
    • Determine the gaps between what information you need and what is available;
    • Develop and implement a measurement strategy to address the gaps;
    • Develop a performance report which highlights what you have accomplished and what you have learned;
    • Learn from your experiences and refine your approach as required.
  • While the authors note that the true impacts of this open private data will likely not come into view in the short term, they argue that, “Liander has successfully demonstrated that private energy companies can release open data, and has successfully championed the other Dutch network administrators to follow suit.”

World Economic Forum, 2015. “Data-driven development: pathways for progress.” Geneva: World Economic Forum. http://bit.ly/1JOBS8u

  • This report captures an overview of the existing data deficit and the value and impact of big data for sustainable development.
  • The authors of the report focus on four main priorities towards a sustainable data revolution: commercial incentives and trusted agreements with public- and private-sector actors; the development of shared policy frameworks, legal protections and impact assessments; capacity building activities at the institutional, community, local and individual level; and lastly, recognizing individuals as both producers and consumers of data.

Selected Readings on Crowdsourcing Expertise


The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of crowdsourcing was originally published in 2014.

Crowdsourcing enables leaders and citizens to work together to solve public problems in new and innovative ways. New tools and platforms enable citizens with differing levels of knowledge, expertise, experience and abilities to collaborate and solve problems together. Identifying experts, or individuals with specialized skills, knowledge or abilities with regard to a specific topic, and incentivizing their participation in crowdsourcing information, knowledge or experience to achieve a shared goal can enhance the efficiency and effectiveness of problem solving.

Annotated Selected Reading List (in alphabetical order)

Börner, Katy, Michael Conlon, Jon Corson-Rikert, and Ying Ding. “VIVO: A Semantic Approach to Scholarly Networking and Discovery.” Synthesis Lectures on the Semantic Web: Theory and Technology 2, no. 1 (October 17, 2012): 1–178. http://bit.ly/17huggT.

  • This e-book “provides an introduction to VIVO…a tool for representing information about research and researchers — their scholarly works, research interests, and organizational relationships.”
  • VIVO is a response to the fact that, “Information for scholars — and about scholarly activity — has not kept pace with the increasing demands and expectations. Information remains siloed in legacy systems and behind various access controls that must be licensed or otherwise negotiated before access. Information representation is in its infancy. The raw material of scholarship — the data and information regarding previous work — is not available in common formats with common semantics.”
  • Providing access to structured information on the work and experience of a diversity of scholars enables improved expert finding — “identifying and engaging experts whose scholarly works is of value to one’s own. To find experts, one needs rich data regarding one’s own work and the work of potential related experts.” The authors argue that expert finding is of increasing importance since “[m]ulti-disciplinary and inter-disciplinary investigation is increasingly required to address complex problems.”

Bozzon, Alessandro, Marco Brambilla, Stefano Ceri, Matteo Silvestri, and Giuliano Vesci. “Choosing the Right Crowd: Expert Finding in Social Networks.” In Proceedings of the 16th International Conference on Extending Database Technology, 637–648. EDBT  ’13. New York, NY, USA: ACM, 2013. http://bit.ly/18QbtY5.

  • This paper explores the challenge of selecting experts within the population of social networks by considering the following problem: “given an expertise need (expressed for instance as a natural language query) and a set of social network members, who are the most knowledgeable people for addressing that need?”
  • The authors come to the following conclusions:
    • “profile information is generally less effective than information about resources that they directly create, own or annotate;
    • resources which are produced by others (resources appearing on the person’s Facebook wall or produced by people that she follows on Twitter) help increasing the assessment precision;
    • Twitter appears the most effective social network for expertise matching, as it very frequently outperforms all other social networks (either combined or alone);
    • Twitter appears as well very effective for matching expertise in domains such as computer engineering, science, sport, and technology & games, but Facebook is also very effective in fields such as locations, music, sport, and movies & tv;
    • surprisingly, LinkedIn appears less effective than other social networks in all domains (including computer science) and overall.”
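The resource-centric matching the authors evaluate can be sketched as a toy term-overlap scorer: rank each member by how often the query’s terms appear in the resources that member created, owns, or annotates. The member data and scoring below are purely illustrative assumptions, not the paper’s actual ranking model:

```python
from collections import Counter

def expertise_scores(query, members):
    """Rank members of a network for a textual expertise need by how
    often the query's terms occur in resources they create or annotate."""
    query_terms = set(query.lower().split())
    ranked = {}
    for name, resources in members.items():
        terms = Counter()
        for text in resources:
            terms.update(text.lower().split())
        # Score = total occurrences of query terms in the member's resources.
        ranked[name] = sum(terms[t] for t in query_terms)
    return sorted(ranked.items(), key=lambda kv: -kv[1])

# Illustrative member profiles built from resource text, not real data.
members = {
    "alice": ["deep learning tutorial", "neural network notes"],
    "bob": ["marathon training log", "sports nutrition tips"],
}
print(expertise_scores("neural network learning", members))
```

The paper’s finding that self-created resources beat profile information corresponds here to scoring only over resource text rather than self-declared skills.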

Brabham, Daren C. “The Myth of Amateur Crowds.” Information, Communication & Society 15, no. 3 (2012): 394–410. http://bit.ly/1hdnGJV.

  • Unlike most of the related literature, this paper focuses on bringing attention to the expertise already being tapped by crowdsourcing efforts rather than determining ways to identify more dormant expertise to improve the results of crowdsourcing.
  • Brabham comes to two central conclusions: “(1) crowdsourcing is discussed in the popular press as a process driven by amateurs and hobbyists, yet empirical research on crowdsourcing indicates that crowds are largely self-selected professionals and experts who opt-in to crowdsourcing arrangements; and (2) the myth of the amateur in crowdsourcing ventures works to label crowds as mere hobbyists who see crowdsourcing ventures as opportunities for creative expression, as entertainment, or as opportunities to pass the time when bored. This amateur/hobbyist label then undermines the fact that large amounts of real work and expert knowledge are exerted by crowds for relatively little reward and to serve the profit motives of companies.”

Dutton, William H. Networking Distributed Public Expertise: Strategies for Citizen Sourcing Advice to Government. One of a Series of Occasional Papers in Science and Technology Policy, Science and Technology Policy Institute, Institute for Defense Analyses, February 23, 2011. http://bit.ly/1c1bpEB.

  • In this paper, a case is made for more structured and well-managed crowdsourcing efforts within government. Specifically, the paper “explains how collaborative networking can be used to harness the distributed expertise of citizens, as distinguished from citizen consultation, which seeks to engage citizens — each on an equal footing.” Instead of looking for answers from an undefined crowd, Dutton proposes “networking the public as advisors” by seeking to “involve experts on particular public issues and problems distributed anywhere in the world.”
  • Dutton argues that expert-based crowdsourcing can be successful for government for a number of reasons:
    • Direct communication with a diversity of independent experts
    • The convening power of government
    • Compatibility with open government and open innovation
    • Synergy with citizen consultation
    • Building on experience with paid consultants
    • Speed and urgency
    • Centrality of documents to policy and practice.
  • He also proposes a nine-step process for government to foster bottom-up collaboration networks:
    • Do not reinvent the technology
    • Focus on activities, not the tools
    • Start small, but capable of scaling up
    • Modularize
    • Be open and flexible in finding and going to communities of experts
    • Do not concentrate on one approach to all problems
    • Cultivate the bottom-up development of multiple projects
    • Experience networking and collaborating — be a networked individual
    • Capture, reward, and publicize success.

Goel, Gagan, Afshin Nikzad and Adish Singla. “Matching Workers with Tasks: Incentives in Heterogeneous Crowdsourcing Markets.” Under review by the International World Wide Web Conference (WWW). 2014. http://bit.ly/1qHBkdf

  • Combining the notions of crowdsourcing expertise and crowdsourcing tasks, this paper focuses on the challenge within platforms like Mechanical Turk related to intelligently matching tasks to workers.
  • The authors’ call for more strategic assignment of tasks in crowdsourcing markets is based on the understanding that “each worker has certain expertise and interests which define the set of tasks she can and is willing to do.”
  • Focusing on developing meaningful incentives based on varying levels of expertise, the authors sought to create a mechanism that, “i) is incentive compatible in the sense that it is truthful for agents to report their true cost, ii) picks a set of workers and assigns them to the tasks they are eligible for in order to maximize the utility of the requester, iii) makes sure total payments made to the workers doesn’t exceed the budget of the requester.”
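The budget constraint in (iii) can be illustrated with a simple greedy selection that picks workers in decreasing value-per-cost order until the requester’s budget runs out. This knapsack-style heuristic is for exposition only; it is not the paper’s truthful, incentive-compatible mechanism, and the worker values and costs are invented:

```python
def select_workers(workers, budget):
    """Greedy budget-constrained selection: take workers in decreasing
    value-per-cost order while the requester's budget allows.
    (Exposition only; not the paper's truthful mechanism.)"""
    chosen, spent = [], 0.0
    for name, value, cost in sorted(workers, key=lambda w: w[1] / w[2], reverse=True):
        if spent + cost <= budget:
            chosen.append(name)
            spent += cost
    return chosen, spent

# Invented (name, value-to-requester, reported-cost) triples.
workers = [("ann", 10, 4), ("ben", 6, 3), ("cal", 8, 5)]
print(select_workers(workers, budget=8))
```

A truthful mechanism must additionally ensure no worker can profit by misreporting cost, which is what distinguishes the paper’s design from this naive greedy pass.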

Gubanov, D., N. Korgin, D. Novikov and A. Chkhartishvili. E-Expertise: Modern Collective Intelligence. Springer, Studies in Computational Intelligence 558, 2014. http://bit.ly/U1sxX7

  • In this book, the authors focus on “organization and mechanisms of expert decision-making support using modern information and communication technologies, as well as information analysis and collective intelligence technologies (electronic expertise or simply e-expertise).”
  • The book, which “addresses a wide range of readers interested in management, decision-making and expert activity in political, economic, social and industrial spheres,” is broken into five chapters:
    • Chapter 1 (E-Expertise) discusses the role of e-expertise in decision-making processes. The procedures of e-expertise are classified, their benefits and shortcomings are identified, and the efficiency conditions are considered.
    • Chapter 2 (Expert Technologies and Principles) provides a comprehensive overview of modern expert technologies. A special emphasis is placed on the specifics of e-expertise. Moreover, the authors study the feasibility and reasonability of employing well-known methods and approaches in e-expertise.
    • Chapter 3 (E-Expertise: Organization and Technologies) describes some examples of up-to-date technologies to perform e-expertise.
    • Chapter 4 (Trust Networks and Competence Networks) deals with the problems of expert finding and grouping by information and communication technologies.
    • Chapter 5 (Active Expertise) treats the problem of expertise stability against any strategic manipulation by experts or coordinators pursuing individual goals.

Holst, Cathrine. “Expertise and Democracy.” ARENA Report No 1/14, Center for European Studies, University of Oslo. http://bit.ly/1nm3rh4

  • This report contains a set of 16 papers focused on the concept of “epistocracy,” meaning the “rule of knowers.” The papers inquire into the role of knowledge and expertise in modern democracies and especially in the European Union (EU). Major themes are: expert-rule and democratic legitimacy; the role of knowledge and expertise in EU governance; and the European Commission’s use of expertise.
    • Expert-rule and democratic legitimacy
      • Papers within this theme concentrate on issues such as the “implications of modern democracies’ knowledge and expertise dependence for political and democratic theory.” Topics include the accountability of experts, the legitimacy of expert arrangements within democracies, the role of evidence in policy-making, how expertise can be problematic in democratic contexts, and “ethical expertise” and its place in epistemic democracies.
    • The role of knowledge and expertise in EU governance
      • Papers within this theme concentrate on “general trends and developments in the EU with regard to the role of expertise and experts in political decision-making, the implications for the EU’s democratic legitimacy, and analytical strategies for studying expertise and democratic legitimacy in an EU context.”
    • The European Commission’s use of expertise
      • Papers within this theme concentrate on how the European Commission uses expertise and in particular the European Commission’s “expert group system.” Topics include the European Citizens’ Initiative, analytic-deliberative processes in EU food safety, the operation of EU environmental agencies, and the autonomy of various EU agencies.

King, Andrew and Karim R. Lakhani. “Using Open Innovation to Identify the Best Ideas.” MIT Sloan Management Review, September 11, 2013. http://bit.ly/HjVOpi.

  • In this paper, King and Lakhani examine different methods for opening innovation, where, “[i]nstead of doing everything in-house, companies can tap into the ideas cloud of external expertise to develop new products and services.”
  • The three types of open innovation discussed are: opening the idea-creation process (competitions where prizes are offered and designers bid with possible solutions); opening the idea-selection process (“approval contests” in which outsiders vote to determine which entries should be pursued); and opening both idea generation and selection (an option used especially by organizations focused on quickly changing needs).

Long, Chengjiang, Gang Hua and Ashish Kapoor. Active Visual Recognition with Expertise Estimation in Crowdsourcing. 2013 IEEE International Conference on Computer Vision. December 2013. http://bit.ly/1lRWFur.

  • This paper is focused on improving the crowdsourced labeling of visual datasets from platforms like Mechanical Turk. The authors note that, “Although it is cheap to obtain large quantity of labels through crowdsourcing, it has been well known that the collected labels could be very noisy. So it is desirable to model the expertise level of the labelers to ensure the quality of the labels. The higher the expertise level a labeler is at, the lower the label noises he/she will produce.”
  • Based on the need for identifying expert labelers upfront, the authors developed an “active classifier learning system which determines which users to label which unlabeled examples” from collected visual datasets.
  • The researchers’ experiments in identifying expert visual dataset labelers led to findings demonstrating that the “active selection” of expert labelers is beneficial in cutting through the noise of crowdsourcing platforms.
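The intuition that higher-expertise labelers produce lower label noise can be sketched as an expertise-weighted vote, where each labeler’s vote counts in proportion to an estimated accuracy. This is a minimal illustration with invented accuracy estimates, not the authors’ active classifier learning system:

```python
from collections import defaultdict

def weighted_vote(labels, accuracy):
    """Aggregate noisy crowd labels for one item, weighting each labeler
    by an estimated expertise level (accuracy in [0, 1])."""
    tally = defaultdict(float)
    for labeler, label in labels:
        tally[label] += accuracy.get(labeler, 0.5)  # unknown labelers count as chance
    return max(tally, key=tally.get)

# Invented labels and accuracy estimates for a single image.
labels = [("u1", "cat"), ("u2", "dog"), ("u3", "cat"), ("u4", "dog")]
accuracy = {"u1": 0.9, "u2": 0.6, "u3": 0.55, "u4": 0.7}
print(weighted_vote(labels, accuracy))
```

The authors’ active-selection step goes further by choosing *which* labelers to ask in the first place, rather than only reweighting answers after the fact.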

Noveck, Beth Simone. “’Peer to Patent’: Collective Intelligence, Open Review, and Patent Reform.” Harvard Journal of Law & Technology 20, no. 1 (Fall 2006): 123–162. http://bit.ly/HegzTT.

  • This law review article introduces the idea of crowdsourcing expertise to mitigate the challenge of patent processing. Noveck argues that, “access to information is the crux of the patent quality problem. Patent examiners currently make decisions about the grant of a patent that will shape an industry for a twenty-year period on the basis of a limited subset of available information. Examiners may neither consult the public, talk to experts, nor, in many cases, even use the Internet.”
  • Peer-to-Patent, which launched three years after this article, is based on the idea that, “The new generation of social software might not only make it easier to find friends but also to find expertise that can be applied to legal and policy decision-making. This way, we can improve upon the Constitutional promise to promote the progress of science and the useful arts in our democracy by ensuring that only worthy ideas receive that ‘odious monopoly’ of which Thomas Jefferson complained.”

Ober, Josiah. “Democracy’s Wisdom: An Aristotelian Middle Way for Collective Judgment.” American Political Science Review 107, no. 01 (2013): 104–122. http://bit.ly/1cgf857.

  • In this paper, Ober argues that, “A satisfactory model of decision-making in an epistemic democracy must respect democratic values, while advancing citizens’ interests, by taking account of relevant knowledge about the world.”
  • Ober describes an approach to decision-making that aggregates expertise across multiple domains. This “Relevant Expertise Aggregation (REA) enables a body of minimally competent voters to make superior choices among multiple options, on matters of common interest.”
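One way to picture REA is a weighted tally in which each voter’s judgment of an option counts within each relevant domain, scaled by that voter’s expertise in the domain. The sketch below is an illustrative reading with invented numbers, not Ober’s formal model:

```python
def rea_choice(scores, expertise):
    """Relevant Expertise Aggregation, sketched: each voter judges each
    option within each relevant domain, and that judgment is weighted by
    the voter's expertise in the domain."""
    totals = {}
    for (voter, domain, option), judgment in scores.items():
        weight = expertise.get((voter, domain), 0.0)
        totals[option] = totals.get(option, 0.0) + weight * judgment
    return max(totals, key=totals.get)

# Two minimally competent voters, two domains, two options (A and B).
scores = {
    ("v1", "health", "A"): 1, ("v1", "cost", "A"): 0,
    ("v2", "health", "A"): 0, ("v2", "cost", "A"): 1,
    ("v1", "health", "B"): 0, ("v1", "cost", "B"): 1,
    ("v2", "health", "B"): 1, ("v2", "cost", "B"): 0,
}
expertise = {("v1", "health"): 0.9, ("v1", "cost"): 0.2,
             ("v2", "health"): 0.3, ("v2", "cost"): 0.8}
print(rea_choice(scores, expertise))
```

Option A wins here because each voter’s support for it falls in the domain where that voter knows most, which is the “middle way” intuition between pure majority rule and rule by a single expert.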

Sims, Max H., Jeffrey Bigham, Henry Kautz and Marc W. Halterman. “Crowdsourcing medical expertise in near real time.” Journal of Hospital Medicine 9, no. 7, July 2014. http://bit.ly/1kAKvq7.

  • In this article, the authors discuss DocCHIRP, a mobile application developed because, “although the Internet creates unprecedented access to information, gaps in the medical literature and inefficient searches often leave healthcare providers’ questions unanswered.”
  • The DocCHIRP pilot project used a “system of point-to-multipoint push notifications designed to help providers problem solve by crowdsourcing from their peers.”
  • Healthcare providers (HCPs) sought to gain intelligence from the crowd, which included 85 registered users, on questions related to medication, complex medical decision making, standard of care, administrative issues, testing, and referrals.
  • The authors believe that, “if future iterations of the mobile crowdsourcing applications can address…adoption barriers and support the organic growth of the crowd of HCPs,” then “the approach could have a positive and transformative effect on how providers acquire relevant knowledge and care for patients.”

Spina, Alessandro. “Scientific Expertise and Open Government in the Digital Era: Some Reflections on EFSA and Other EU Agencies.” in Foundations of EU Food Law and Policy, eds. A. Alemmano and S. Gabbi. Ashgate, 2014. http://bit.ly/1k2EwdD.

  • In this paper, Spina “presents some reflections on how the collaborative and crowdsourcing practices of Open Government could be integrated in the activities of EFSA [European Food Safety Authority] and other EU agencies,” with a particular focus on “highlighting the benefits of the Open Government paradigm for expert regulatory bodies in the EU.”
  • Spina argues that the “crowdsourcing of expertise and the reconfiguration of the information flows between European agencies and the public could represent a concrete possibility of modernising the role of agencies with a new model that has a low financial burden and an almost immediate effect on the legal governance of agencies.”
  • He concludes that, “It is becoming evident that in order to guarantee that the best scientific expertise is provided to EU institutions and citizens, EFSA should strive to use the best organisational models to source science and expertise.”

Urban Analytics (Updated and Expanded)


As part of an ongoing effort to build a knowledge base for the field of opening governance by organizing and disseminating its learnings, the GovLab Selected Readings series provides an annotated and curated collection of recommended works on key opening governance topics. In this edition, we explore the literature on Urban Analytics. To suggest additional readings on this or any other topic, please email biblio@thegovlab.org.

Data and its uses for Governance

Urban Analytics places better information in the hands of citizens as well as government officials to empower people to make more informed choices. Today, we are able to gather real-time information about traffic, pollution, noise, and environmental and safety conditions by culling data from a range of tools: from the low-cost sensors in mobile phones to more robust monitoring tools installed in our environment. With data collected and combined from the built, natural and human environments, we can develop more robust predictive models and use those models to make policy smarter.

With the computing power to transmit and store the data from these sensors, and the tools to translate raw data into meaningful visualizations, we can identify problems as they happen, design new strategies for city management, and target the application of scarce resources where they are most needed.

Selected Reading List (in alphabetical order)

Annotated Selected Reading List (in alphabetical order)
Amini, L., E. Bouillet, F. Calabrese, L. Gasparini, and O. Verscheure. “Challenges and Results in City-scale Sensing.” In IEEE Sensors, 59–61, 2011. http://bit.ly/1doodZm.

  • This paper examines “how city requirements map to research challenges in machine learning, optimization, control, visualization, and semantic analysis.”
  • The authors raise several research challenges, including how to extract accurate information when the data is noisy and sparse; how to represent findings from digital pervasive technologies; and how to capture the ways people interact with one another and their environment.

Batty, M., K. W. Axhausen, F. Giannotti, A. Pozdnoukhov, A. Bazzani, M. Wachowicz, G. Ouzounis, and Y. Portugali. “Smart Cities of the Future.” The European Physical Journal Special Topics 214, no. 1 (November 1, 2012): 481–518. http://bit.ly/HefbjZ.

  • This paper explores the goals and research challenges involved in the development of smart cities that merge ICT with traditional infrastructures through digital technologies.
  • The authors put forth several research objectives, including: 1) to explore the notion of the city as a laboratory for innovation; 2) to develop technologies that ensure equity and fairness and realize a better quality of city life; and 3) to develop technologies that ensure informed participation and create shared knowledge for democratic city governance.
  • The paper also examines several contemporary smart city initiatives, expected paradigm shifts in the field, benefits, risks and impacts.

Budde, Paul. “Smart Cities of Tomorrow.” In Cities for Smart Environmental and Energy Futures, edited by Stamatina Th Rassia and Panos M. Pardalos, 9–20. Energy Systems. Springer Berlin Heidelberg, 2014. http://bit.ly/17MqPZW.

  • This paper examines the components and strategies involved in the creation of smart cities featuring “cohesive and open telecommunication and software architecture.”
  • In their study of smart cities, the authors examine smart and renewable energy; next-generation networks; smart buildings; smart transport; and smart government.
  • They conclude that, for the development of smart cities, information and communication technology (ICT) is needed to build more horizontal collaborative structures; useful data must be analyzed in real time; and people and/or machines must be able to make instant decisions related to social and urban life.

Cardone, G., L. Foschini, P. Bellavista, A. Corradi, C. Borcea, M. Talasila, and R. Curtmola. “Fostering Participaction in Smart Cities: a Geo-social Crowdsensing Platform.” IEEE Communications Magazine 51, no. 6 (2013): 112–119. http://bit.ly/17iJ0vZ.

  • This article examines “how and to what extent the power of collective although imprecise intelligence can be employed in smart cities.”
  • To tackle problems of managing the crowdsensing process, this article proposes a “crowdsensing platform with three main original technical aspects: an innovative geo-social model to profile users along different variables, such as time, location, social interaction, service usage, and human activities; a matching algorithm to autonomously choose people to involve in participActions and to quantify the performance of their sensing; and a new Android-based platform to collect sensing data from smart phones, automatically or with user help, and to deliver sensing/actuation tasks to users.”
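The matching step, choosing nearby people for a participAction, can be sketched as a simple geographic filter. The real platform also profiles time, social interaction, service usage, and human activities; the coordinates, user names, and radius below are invented:

```python
import math

def match_users(task, users, radius_km=1.0):
    """Choose users whose last known position lies within radius_km of a
    sensing task. (Illustrative filter only; the platform's matching
    algorithm also weighs time, social ties, service usage, and activities.)"""
    def dist_km(a, b):
        # Equirectangular approximation; accurate enough at city scale.
        x = math.radians(b[1] - a[1]) * math.cos(math.radians((a[0] + b[0]) / 2))
        y = math.radians(b[0] - a[0])
        return 6371 * math.hypot(x, y)
    return [u for u, pos in users.items() if dist_km(pos, task) <= radius_km]

# Invented (lat, lon) positions around central Milan.
users = {"u1": (45.4642, 9.1900), "u2": (45.4800, 9.2300)}
print(match_users((45.4640, 9.1905), users))
```

Quantifying each user’s sensing performance, the other half of the matching algorithm, would then be layered on top of this candidate set.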

Chen, Chien-Chu. “The Trend towards ‘Smart Cities.’” International Journal of Automation and Smart Technology. June 1, 2014. http://bit.ly/1jOOaAg.

  • In this study, Chen explores the ambitions, prevalence and outcomes of a variety of smart cities, organized into five categories:
    • Transportation-focused smart cities
    • Energy-focused smart cities
    • Building-focused smart cities
    • Water-resources-focused smart cities
    • Governance-focused smart cities
  • The study finds that the “Asia Pacific region accounts for the largest share of all smart city development plans worldwide, with 51% of the global total. Smart city development plans in the Asia Pacific region tend to be energy-focused smart city initiatives, aimed at easing the pressure on energy resources that will be caused by continuing rapid urbanization in the future.”
  • North America is likewise geared mainly toward energy-focused smart city development plans. “In North America, there has been a major drive to introduce smart meters and smart electric power grids, integrating the electric power sector with information and communications technology (ICT) and replacing obsolete electric power infrastructure, so as to make cities’ electric power systems more reliable (which in turn can help to boost private-sector investment, stimulate the growth of the ‘green energy’ industry, and create more job opportunities).”
  • Looking to Taiwan as an example, Chen argues that, “Cities in different parts of the world face different problems and challenges when it comes to urban development, making it necessary to utilize technology applications from different fields to solve the unique problems that each individual city has to overcome; the emphasis here is on the development of customized solutions for smart city development.”

Domingo, A., B. Bellalta, M. Palacin, M. Oliver and E. Almirall. “Public Open Sensor Data: Revolutionizing Smart Cities.” Technology and Society Magazine, IEEE 32, No. 4. Winter 2013. http://bit.ly/1iH6ekU.

  • In this article, the authors explore the “enormous amount of information collected by sensor devices” that allows for “the automation of several real-time services to improve city management by using intelligent traffic-light patterns during rush hour, reducing water consumption in parks, or efficiently routing garbage collection trucks throughout the city.”
  • They argue that, “To achieve the goal of sharing and open data to the public, some technical expertise on the part of citizens will be required. A real environment – or platform – will be needed to achieve this goal.” They go on to introduce a variety of “technical challenges and considerations involved in building an Open Sensor Data platform,” including:
    • Scalability
    • Reliability
    • Low latency
    • Standardized formats
    • Standardized connectivity
  • The authors conclude that, despite incredible advancements in urban analytics and open sensing in recent years, “Today, we can only imagine the revolution in Open Data as an introduction to a real-time world mashup with temperature, humidity, CO2 emission, transport, tourism attractions, events, water and gas consumption, politics decisions, emergencies, etc., and all of this interacting with us to help improve the future decisions we make in our public and private lives.”

Harrison, C., B. Eckman, R. Hamilton, P. Hartswick, J. Kalagnanam, J. Paraszczak, and P. Williams. “Foundations for Smarter Cities.” IBM Journal of Research and Development 54, no. 4 (2010): 1–16. http://bit.ly/1iha6CR.

  • This paper describes the information technology (IT) foundation and principles for Smarter Cities.
  • The authors introduce three foundational concepts of smarter cities: instrumented, interconnected and intelligent.
  • They also describe some of the major needs of contemporary cities and conclude that creating the Smarter City implies capturing and accelerating flows of information both vertically and horizontally.

Hernández-Muñoz, José M., Jesús Bernat Vercher, Luis Muñoz, José A. Galache, Mirko Presser, Luis A. Hernández Gómez, and Jan Pettersson. “Smart Cities at the Forefront of the Future Internet.” In The Future Internet, edited by John Domingue, Alex Galis, Anastasius Gavras, Theodore Zahariadis, Dave Lambert, Frances Cleary, Petros Daras, et al., 447–462. Lecture Notes in Computer Science 6656. Springer Berlin Heidelberg, 2011. http://bit.ly/HhNbMX.

  • This paper explores how the “Internet of Things (IoT) and Internet of Services (IoS), can become building blocks to progress towards a unified urban-scale ICT platform transforming a Smart City into an open innovation platform.”
  • The authors examine the SmartSantander project to argue that, “the [number of] different stakeholders involved in the smart city business is so big that many non-technical constraints must be considered (users, public administrations, vendors, etc.).”
  • The authors also discuss the need for infrastructures at, for instance, the European level to enable realistic large-scale experimentally driven research.

Hoon-Lee, Jung, Marguerite Gong Hancock, and Mei-Chih Hu. “Towards an effective framework for building smart cities: Lessons from Seoul and San Francisco.” Technological Forecasting and Social Change. October 3, 2013. http://bit.ly/1rzID5v.

  • In this study, the authors aim to “shed light on the process of building an effective smart city by integrating various practical perspectives with a consideration of smart city characteristics taken from the literature.”
  • They propose a conceptual framework based on case studies from Seoul and San Francisco built around the following dimensions:
    • Urban openness
    • Service innovation
    • Partnerships formation
    • Urban proactiveness
    • Smart city infrastructure integration
    • Smart city governance
  • The authors conclude with a summary of research findings featuring “8 stylized facts”:
    • Movement towards more interactive services engaging citizens;
    • Open data movement facilitates open innovation;
    • Diversifying service development: exploit or explore?
    • How to accelerate adoption: top-down public driven vs. bottom-up market driven partnerships;
    • Advanced intelligent technology supports new value-added smart city services;
    • Smart city services combined with robust incentive systems empower engagement;
    • Multiple device & network accessibility can create network effects for smart city services;
    • Centralized leadership implementing a comprehensive strategy boosts smart initiatives.

Kamel Boulos, Maged N. and Najeeb M. Al-Shorbaji. “On the Internet of Things, smart cities and the WHO Healthy Cities.” International Journal of Health Geographics 13, No. 10. 2014. http://bit.ly/Tkt9GA.

  • In this article, the authors give a “brief overview of the Internet of Things (IoT) for cities, offering examples of IoT-powered 21st century smart cities, including the experience of the Spanish city of Barcelona in implementing its own IoT-driven services to improve the quality of life of its people through measures that promote an eco-friendly, sustainable environment.”
  • The authors argue that one of the central needs for harnessing the power of the IoT and urban analytics is for cities to “involve and engage its stakeholders from a very early stage (city officials at all levels, as well as citizens), and to secure their support by raising awareness and educating them about smart city technologies, the associated benefits, and the likely challenges that will need to be overcome (such as privacy issues).”
  • They conclude that, “The Internet of Things is rapidly gaining a central place as key enabler of the smarter cities of today and the future. Such cities also stand better chances of becoming healthier cities.”

Keller, Sallie Ann, Steven E. Koonin, and Stephanie Shipp. “Big Data and City Living – What Can It Do for Us?” Significance 9, no. 4 (2012): 4–7. http://bit.ly/166W3NP.

  • This article provides a short introduction to Big Data, its importance, and the ways in which it is transforming cities. After an overview of the social benefits of big data in an urban context, the article examines its challenges, such as privacy concerns and institutional barriers.
  • The authors recommend that new approaches to making data available for research are needed that do not violate the privacy of entities included in the datasets. They believe that balancing privacy and accessibility issues will require new government regulations and incentives.

Kitchin, Rob. “The Real-Time City? Big Data and Smart Urbanism.” SSRN Scholarly Paper. Rochester, NY: Social Science Research Network, July 3, 2013. http://bit.ly/1aamZj2.

  • This paper focuses on “how cities are being instrumented with digital devices and infrastructure that produce ‘big data’ which enable real-time analysis of city life, new modes of technocratic urban governance, and a re-imagining of cities.”
  • Kitchin describes a number of projects that seek to produce real-time analyses of the city and provides a critical reflection on the implications of big data and smart urbanism.

Mostashari, A., F. Arnold, M. Maurer, and J. Wade. “Citizens as Sensors: The Cognitive City Paradigm.” In 2011 8th International Conference Expo on Emerging Technologies for a Smarter World (CEWIT), 1–5, 2011. http://bit.ly/1fYe9an.

  • This paper argues that “implementing sensor networks are a necessary but not sufficient approach to improving urban living.”
  • The authors introduce the concept of the “Cognitive City” – a city that can not only operate more efficiently due to networked architecture, but can also learn to improve its service conditions, by planning, deciding and acting on perceived conditions.
  • Based on this conceptualization of a smart city as a cognitive city, the authors propose “an architectural process approach that allows city decision-makers and service providers to integrate cognition into urban processes.”

Oliver, M., M. Palacin, A. Domingo, and V. Valls. “Sensor Information Fueling Open Data.” In Computer Software and Applications Conference Workshops (COMPSACW), 2012 IEEE 36th Annual, 116–121, 2012. http://bit.ly/HjV4jS.

  • This paper introduces the concept of sensor networks as a key component in the smart cities framework, and shows how real-time data provided by different city network sensors enrich Open Data portals and require a new architecture to deal with massive amounts of continuously flowing information.
  • The authors’ main conclusion is that, by providing a framework for building new, innovation-promoting applications and services from public static and dynamic data, a real-time open sensor network data platform can have several positive effects for citizens.

Perera, Charith, Arkady Zaslavsky, Peter Christen and Dimitrios Georgakopoulos. “Sensing as a service model for smart cities supported by Internet of Things.” Transactions on Emerging Telecommunications Technologies 25, Issue 1. January 2014. http://bit.ly/1qJLDP9.

  • This paper looks into the “enormous pressure towards efficient city management” that has “triggered various Smart City initiatives by both government and private sector businesses to invest in information and communication technologies to find sustainable solutions to the growing issues.”
  • The authors explore the parallel advancement of the Internet of Things (IoT), which “envisions to connect billions of sensors to the Internet and expects to use them for efficient and effective resource management in Smart Cities.”
  • The paper proposes the sensing as a service model “as a solution based on IoT infrastructure.” The model consists of four conceptual layers: “(i) sensors and sensor owners; (ii) sensor publishers (SPs); (iii) extended service providers (ESPs); and (iv) sensor data consumers.” The authors go on to describe how this model would work in the areas of waste management, smart agriculture and environmental management.
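The four-layer data flow described above can be rendered as a toy sketch. All class names, method names, and sensor identifiers below are our own illustration, not from the paper:

```python
# A toy rendering of the four layers: a sensor owner pushes readings
# through a sensor publisher, an extended service provider adds value
# (here, averaging), and a consumer queries the result. Names and
# values are illustrative only.
class SensorPublisher:
    """Layer (ii): collects readings published by sensor owners."""
    def __init__(self):
        self.readings = []  # (sensor_id, value) pairs

    def publish(self, sensor_id, value):
        self.readings.append((sensor_id, value))

class ExtendedServiceProvider:
    """Layer (iii): builds services on top of raw published data."""
    def __init__(self, publisher):
        self.publisher = publisher

    def average(self, sensor_id):
        values = [v for s, v in self.publisher.readings if s == sensor_id]
        return sum(values) / len(values) if values else None

# Layer (i): a hypothetical smart waste bin reports fill levels (percent).
pub = SensorPublisher()
pub.publish("bin-42-fill-level", 20)
pub.publish("bin-42-fill-level", 60)

# Layer (iv): a consumer (say, a waste-collection planner) queries the ESP.
esp = ExtendedServiceProvider(pub)
print(esp.average("bin-42-fill-level"))  # 40.0
```

The waste-management scenario mirrors one of the paper's own application areas, though the interfaces here are simplified far below anything deployable.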

Privacy, Big Data, and the Public Good: Frameworks for Engagement. Edited by Julia Lane, Victoria Stodden, Stefan Bender, and Helen Nissenbaum; Cambridge University Press, 2014. http://bit.ly/UoGRca.

  • This book focuses on the legal, practical, and statistical approaches for maximizing the use of massive datasets while minimizing information risk.
  • “Big data” is more than a straightforward change in technology. It poses deep challenges to our traditions of notice and consent as tools for managing privacy. Because our new tools of data science can make it all but impossible to guarantee anonymity in the future, the authors question whether it is possible to truly give informed consent when we cannot, by definition, know what the risks of revealing personal data are, either for individuals or for society as a whole.
  • Based on their experience building large data collections, the authors discuss some of the best practical ways to provide access while protecting confidentiality. What have we learned about effective engineered controls? About effective access policies? About designing data systems that reinforce – rather than counter – access policies? They also explore the business, legal, and technical standards necessary for a new deal on data.
  • Since the data generating process or the data collection process is not necessarily well understood for big data streams, the authors discuss what statistics can tell us about how to make the greatest scientific use of this data. They also explore the shortcomings of current disclosure limitation approaches and whether we can quantify the extent of privacy loss.

Schaffers, Hans, Nicos Komninos, Marc Pallot, Brigitte Trousse, Michael Nilsson, and Alvaro Oliveira. “Smart Cities and the Future Internet: Towards Cooperation Frameworks for Open Innovation.” In The Future Internet, edited by John Domingue, Alex Galis, Anastasius Gavras, Theodore Zahariadis, Dave Lambert, Frances Cleary, Petros Daras, et al., 431–446. Lecture Notes in Computer Science 6656. Springer Berlin Heidelberg, 2011. http://bit.ly/16ytKoT.

  • This paper “explores ‘smart cities’ as environments of open and user-driven innovation for experimenting and validating Future Internet-enabled services.”
  • The authors examine several smart city projects to illustrate the central role of users in defining smart services and the importance of participation. They argue that, “Two different layers of collaboration can be distinguished. The first layer is collaboration within the innovation process. The second layer concerns collaboration at the territorial level, driven by urban and regional development policies aiming at strengthening the urban innovation systems through creating effective conditions for sustainable innovation.”

Suciu, G., A. Vulpe, S. Halunga, O. Fratu, G. Todoran, and V. Suciu. “Smart Cities Built on Resilient Cloud Computing and Secure Internet of Things.” In 2013 19th International Conference on Control Systems and Computer Science (CSCS), 513–518, 2013. http://bit.ly/16wfNgv.

  • This paper proposes “a new platform for using cloud computing capacities for provision and support of ubiquitous connectivity and real-time applications and services for smart cities’ needs.”
  • The authors present a “framework for data procured from highly distributed, heterogeneous, decentralized, real and virtual devices (sensors, actuators, smart devices) that can be automatically managed, analyzed and controlled by distributed cloud-based services.”

Townsend, Anthony. Smart Cities: Big Data, Civic Hackers, and the Quest for a New Utopia. W. W. Norton & Company, 2013.

  • In this book, Townsend illustrates how “cities worldwide are deploying technology to address both the timeless challenges of government and the mounting problems posed by human settlements of previously unimaginable size and complexity.”
  • He also considers “the motivations, aspirations, and shortcomings” of the many stakeholders involved in the development of smart cities, and poses a new civics to guide these efforts.
  • He argues that cities are not made smart by the various, soon-to-be-obsolete technologies built into their infrastructure, but by how citizens use these ever-changing technologies to be “human-centered, inclusive and resilient.”

To stay current on recent writings and developments on Urban Analytics, please subscribe to the GovLab Digest.
Did we miss anything? Please submit reading recommendations to biblio@thegovlab.org or in the comments below.

Selected Readings on Crowdsourcing Tasks and Peer Production


The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of crowdsourcing was originally published in 2014.

Technological advances are creating a new paradigm by which institutions and organizations are increasingly outsourcing tasks to an open community, allocating specific needs to a flexible, willing and dispersed workforce. “Microtasking” platforms like Amazon’s Mechanical Turk are a burgeoning source of income for individuals who contribute their time, skills and knowledge on a per-task basis. In parallel, citizen science projects – task-based initiatives in which citizens of any background can help contribute to scientific research – like Galaxy Zoo are demonstrating the ability of lay and expert citizens alike to make small, useful contributions to aid large, complex undertakings. As governing institutions seek to do more with less, looking to the success of citizen science and microtasking initiatives could provide a blueprint for engaging citizens to help accomplish difficult, time-consuming objectives at little cost. Moreover, the incredible success of peer-production projects – best exemplified by Wikipedia – instills optimism regarding the public’s willingness and ability to complete relatively small tasks that feed into a greater whole and benefit the public good. You can learn more about this new wave of “collective intelligence” by following the MIT Center for Collective Intelligence and their annual Collective Intelligence Conference.

Selected Reading List (in alphabetical order)

Annotated Selected Reading List (in alphabetical order)

Benkler, Yochai. The Wealth of Networks: How Social Production Transforms Markets and Freedom. Yale University Press, 2006. http://bit.ly/1aaU7Yb.

  • In this book, Benkler “describes how patterns of information, knowledge, and cultural production are changing – and shows that the way information and knowledge are made available can either limit or enlarge the ways people can create and express themselves.”
  • In his discussion on Wikipedia – one of many paradigmatic examples of people collaborating without financial reward – he calls attention to the notable ongoing cooperation taking place among a diversity of individuals. He argues that, “The important point is that Wikipedia requires not only mechanical cooperation among people, but a commitment to a particular style of writing and describing concepts that is far from intuitive or natural to people. It requires self-discipline. It enforces the behavior it requires primarily through appeal to the common enterprise that the participants are engaged in…”

Brabham, Daren C. Using Crowdsourcing in Government. Collaborating Across Boundaries Series. IBM Center for The Business of Government, 2013. http://bit.ly/17gzBTA.

  • In this report, Brabham categorizes government crowdsourcing cases into a “four-part, problem-based typology, encouraging government leaders and public administrators to consider these open problem-solving techniques as a way to engage the public and tackle difficult policy and administrative tasks more effectively and efficiently using online communities.”
  • The proposed four-part typology describes the following types of crowdsourcing in government:
    • Knowledge Discovery and Management
    • Distributed Human Intelligence Tasking
    • Broadcast Search
    • Peer-Vetted Creative Production
  • In his discussion on Distributed Human Intelligence Tasking, Brabham argues that Amazon’s Mechanical Turk and other microtasking platforms could be useful in a number of governance scenarios, including:
    • Governments and scholars transcribing historical document scans
    • Public health departments translating health campaign materials into foreign languages to benefit constituents who do not speak the native language
    • Governments translating tax documents, school enrollment and immunization brochures, and other important materials into minority languages
    • Helping governments predict citizens’ behavior, “such as for predicting their use of public transit or other services or for predicting behaviors that could inform public health practitioners and environmental policy makers”

Boudreau, Kevin J., Patrick Gaule, Karim Lakhani, Christoph Riedl, and Anita Williams Woolley. “From Crowds to Collaborators: Initiating Effort & Catalyzing Interactions Among Online Creative Workers.” Harvard Business School Technology & Operations Mgt. Unit Working Paper No. 14-060. January 23, 2014. https://bit.ly/2QVmGUu.

  • In this working paper, the authors explore the “conditions necessary for eliciting effort from those affecting the quality of interdependent teamwork” and “consider the role of incentives versus social processes in catalyzing collaboration.”
  • The paper’s findings are based on an experiment involving 260 individuals randomly assigned to 52 teams working toward solutions to a complex problem.
  • The authors determined that the level of effort in such collaborative undertakings is sensitive to cash incentives. However, collaboration among teams was driven more by the active participation of teammates than by any monetary reward.

Franzoni, Chiara, and Henry Sauermann. “Crowd Science: The Organization of Scientific Research in Open Collaborative Projects.” Research Policy (August 14, 2013). http://bit.ly/HihFyj.

  • In this paper, the authors explore the concept of crowd science, which they define based on two important features: “participation in a project is open to a wide base of potential contributors, and intermediate inputs such as data or problem solving algorithms are made openly available.” The rationale for their study and conceptual framework is the “growing attention from the scientific community, but also policy makers, funding agencies and managers who seek to evaluate its potential benefits and challenges. Based on the experiences of early crowd science projects, the opportunities are considerable.”
  • Based on the study of a number of crowd science projects – including governance-related initiatives like Patients Like Me – the authors identify a number of potential benefits in the following categories:
    • Knowledge-related benefits
    • Benefits from open participation
    • Benefits from the open disclosure of intermediate inputs
    • Motivational benefits
  • The authors also identify a number of challenges:
    • Organizational challenges
    • Matching projects and people
    • Division of labor and integration of contributions
    • Project leadership
    • Motivational challenges
    • Sustaining contributor involvement
    • Supporting a broader set of motivations
    • Reconciling conflicting motivations

Kittur, Aniket, Ed H. Chi, and Bongwon Suh. “Crowdsourcing User Studies with Mechanical Turk.” In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 453–456. CHI ’08. New York, NY, USA: ACM, 2008. http://bit.ly/1a3Op48.

  • In this paper, the authors examine “[m]icro-task markets, such as Amazon’s Mechanical Turk, [which] offer a potential paradigm for engaging a large number of users for low time and monetary costs. [They] investigate the utility of a micro-task market for collecting user measurements, and discuss design considerations for developing remote micro user evaluation tasks.”
  • The authors conclude that in addition to providing a means for crowdsourcing small, clearly defined, often non-skill-intensive tasks, “Micro-task markets such as Amazon’s Mechanical Turk are promising platforms for conducting a variety of user study tasks, ranging from surveys to rapid prototyping to quantitative measures. Hundreds of users can be recruited for highly interactive tasks for marginal costs within a timeframe of days or even minutes. However, special care must be taken in the design of the task, especially for user measurements that are subjective or qualitative.”

Kittur, Aniket, Jeffrey V. Nickerson, Michael S. Bernstein, Elizabeth M. Gerber, Aaron Shaw, John Zimmerman, Matthew Lease, and John J. Horton. “The Future of Crowd Work.” In 16th ACM Conference on Computer Supported Cooperative Work (CSCW 2013), 2012. http://bit.ly/1c1GJD3.

  • In this paper, the authors discuss paid crowd work, which “offers remarkable opportunities for improving productivity, social mobility, and the global economy by engaging a geographically distributed workforce to complete complex tasks on demand and at scale.” However, they caution that, “it is also possible that crowd work will fail to achieve its potential, focusing on assembly-line piecework.”
  • The authors argue that a number of key challenges must be met to ensure that crowd work processes evolve and reach their full potential, including:
    • Designing workflows
    • Assigning tasks
    • Supporting hierarchical structure
    • Enabling real-time crowd work
    • Supporting synchronous collaboration
    • Controlling quality

Madison, Michael J. “Commons at the Intersection of Peer Production, Citizen Science, and Big Data: Galaxy Zoo.” In Convening Cultural Commons, 2013. http://bit.ly/1ih9Xzm.

  • This paper explores a “case of commons governance grounded in research in modern astronomy. The case, Galaxy Zoo, is a leading example of at least three different contemporary phenomena. In the first place, Galaxy Zoo is a global citizen science project, in which volunteer non-scientists have been recruited to participate in large-scale data analysis on the Internet. In the second place, Galaxy Zoo is a highly successful example of peer production, sometimes known as crowdsourcing…In the third place, [it] is a highly visible example of data-intensive science, sometimes referred to as e-science or Big Data science, by which scientific researchers develop methods to grapple with the massive volumes of digital data now available to them via modern sensing and imaging technologies.”
  • Madison concludes that the success of Galaxy Zoo has not been the result of the “character of its information resources (scientific data) and rules regarding their usage,” but rather, the fact that the “community was guided from the outset by a vision of a specific organizational solution to a specific research problem in astronomy, initiated and governed, over time, by professional astronomers in collaboration with their expanding universe of volunteers.”

Malone, Thomas W., Robert Laubacher and Chrysanthos Dellarocas. “Harnessing Crowds: Mapping the Genome of Collective Intelligence.” MIT Sloan Research Paper. February 3, 2009. https://bit.ly/2SPjxTP.

  • In this article, the authors describe and map the phenomenon of collective intelligence – also referred to as “radical decentralization, crowd-sourcing, wisdom of crowds, peer production, and wikinomics” – which they broadly define as “groups of individuals doing things collectively that seem intelligent.”
  • The article is derived from the authors’ work at MIT’s Center for Collective Intelligence, where they gathered nearly 250 examples of Web-enabled collective intelligence. To map the building blocks or “genes” of collective intelligence, the authors used two pairs of related questions:
    • Who is performing the task? Why are they doing it?
    • What is being accomplished? How is it being done?
  • The authors concede that much work remains to be done “to identify all the different genes for collective intelligence, the conditions under which these genes are useful, and the constraints governing how they can be combined,” but they believe that their framework provides a useful start and gives managers and other institutional decisionmakers looking to take advantage of collective intelligence activities the ability to “systematically consider many possible combinations of answers to questions about Who, Why, What, and How.”

Mulgan, Geoff. “True Collective Intelligence? A Sketch of a Possible New Field.” Philosophy & Technology 27, no. 1. March 2014. http://bit.ly/1p3YSdd.

  • In this paper, Mulgan explores the concept of collective intelligence, a “much talked about but…very underdeveloped” field.
  • With a particular focus on health knowledge, Mulgan “sets out some of the potential theoretical building blocks, suggests an experimental and research agenda, shows how it could be analysed within an organisation or business sector and points to possible intellectual barriers to progress.”
  • He concludes that the “central message that comes from observing real intelligence is that intelligence has to be for something,” and that “turning this simple insight – the stuff of so many science fiction stories – into new theories, new technologies and new applications looks set to be one of the most exciting prospects of the next few years and may help give shape to a new discipline that helps us to be collectively intelligent about our own collective intelligence.”

Sauermann, Henry and Chiara Franzoni. “Participation Dynamics in Crowd-Based Knowledge Production: The Scope and Sustainability of Interest-Based Motivation.” SSRN Working Papers Series. November 28, 2013. http://bit.ly/1o6YB7f.

  • In this paper, Sauermann and Franzoni explore the issue of interest-based motivation in crowd-based knowledge production – in particular the use of the crowd science platform Zooniverse – by drawing on “research in psychology to discuss important static and dynamic features of interest and deriv[ing] a number of research questions.”
  • The authors find that interest-based motivation is often tied to a “particular object (e.g., task, project, topic)” rather than being a “general trait of the person or a general characteristic of the object.” As such, they find that “most members of the installed base of users on the platform do not sign up for multiple projects, and most of those who try out a project do not return.”
  • They conclude that “interest can be a powerful motivator of individuals’ contributions to crowd-based knowledge production…However, both the scope and sustainability of this interest appear to be rather limited for the large majority of contributors…At the same time, some individuals show a strong and more enduring interest to participate both within and across projects, and these contributors are ultimately responsible for much of what crowd science projects are able to accomplish.”

Schmitt-Sands, Catherine E. and Richard J. Smith. “Prospects for Online Crowdsourcing of Social Science Research Tasks: A Case Study Using Amazon Mechanical Turk.” SSRN Working Papers Series. January 9, 2014. http://bit.ly/1ugaYja.

  • In this paper, the authors describe an experiment involving the nascent use of Amazon’s Mechanical Turk as a social science research tool. “While researchers have used crowdsourcing to find research subjects or classify texts, [they] used Mechanical Turk to conduct a policy scan of local government websites.”
  • Schmitt-Sands and Smith found that “crowdsourcing worked well for conducting an online policy program and scan.” The microtasked workers were helpful in screening out local governments that either did not have websites or did not have the types of policies and services for which the researchers were looking. However, “if the task is complicated such that it requires ongoing supervision, then crowdsourcing is not the best solution.”

Shirky, Clay. Here Comes Everybody: The Power of Organizing Without Organizations. New York: Penguin Press, 2008. https://bit.ly/2QysNif.

  • In this book, Shirky explores our current era in which, “For the first time in history, the tools for cooperating on a global scale are not solely in the hands of governments or institutions. The spread of the Internet and mobile phones are changing how people come together and get things done.”
  • Discussing Wikipedia’s “spontaneous division of labor,” Shirky argues that “the process is more like creating a coral reef, the sum of millions of individual actions, than creating a car. And the key to creating those individual actions is to hand as much freedom as possible to the average user.”

Silvertown, Jonathan. “A New Dawn for Citizen Science.” Trends in Ecology & Evolution 24, no. 9 (September 2009): 467–471. http://bit.ly/1iha6CR.

  • This article discusses the move from “Science for the people,” a slogan adopted by activists in the 1970s, to “Science by the people,” which is “a more inclusive aim, and is becoming a distinctly 21st century phenomenon.”
  • Silvertown identifies three factors that are responsible for the explosion of activity in citizen science, each of which could be similarly related to the crowdsourcing of skills by governing institutions:
    • “First is the existence of easily available technical tools for disseminating information about products and gathering data from the public.
    • A second factor driving the growth of citizen science is the increasing realisation among professional scientists that the public represent a free source of labour, skills, computational power and even finance.
    • Third, citizen science is likely to benefit from the condition that research funders such as the National Science Foundation in the USA and the Natural Environment Research Council in the UK now impose upon every grantholder to undertake project-related science outreach. This is outreach as a form of public accountability.”

Szkuta, Katarzyna, Roberto Pizzicannella, David Osimo. “Collaborative approaches to public sector innovation: A scoping study.” Telecommunications Policy. 2014. http://bit.ly/1oBg9GY.

  • In this article, the authors explore cases where government collaboratively delivers online public services, with a focus on success factors and “incentives for services providers, citizens as users and public administration.”
  • The authors focus on the following types of collaborative governance projects:
    • Services initiated by government built on government data;
    • Services initiated by government and making use of citizens’ data;
    • Services initiated by civil society built on open government data;
    • Collaborative e-government services; and
    • Services run by civil society and based on citizen data.
  • The cases explored “are all designed in the way that effectively harnesses the citizens’ potential. Services susceptible to collaboration are those that require computing efforts, i.e. many non-complicated tasks (e.g. citizen science projects – Zooniverse) or citizens’ free time in general (e.g. time banks). Those services also profit from unique citizens’ skills and their propensity to share their competencies.”

Open Data (Updated and Expanded)


As part of an ongoing effort to build a knowledge base for the field of opening governance by organizing and disseminating its learnings, the GovLab Selected Readings series provides an annotated and curated collection of recommended works on key opening governance topics. We start our series with a focus on Open Data. To suggest additional readings on this or any other topic, please email biblio@thegovlab.org.

Data and its uses for Governance

Open data refers to data that is publicly available for anyone to use and which is licensed in a way that allows for its re-use. The common requirement that open data be machine-readable not only means that data is distributed via the Internet in a digitized form, but can also be processed by computers through automation, ensuring both wide dissemination and ease of re-use. Much of the focus of the open data advocacy community is on government data and government-supported research data. For example, in May 2013, the US Open Data Policy defined open data as publicly available data structured in a way that enables the data to be fully discoverable and usable by end users, and consistent with a number of principles focused on availability, accessibility and reusability.
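The machine-readability requirement can be made concrete with a short sketch: a program aggregating a CSV export from an open data portal with no human intervention. The dataset, columns, and values below are invented for illustration:

```python
import csv
import io

# A machine-readable open dataset is typically just structured text
# (CSV or JSON) that software can parse directly. The rows below stand
# in for a download from a hypothetical city open data portal; the
# columns and figures are illustrative, not real.
RAW_CSV = """station,year,ridership
Central,2016,1200450
Central,2017,1311200
Harbor,2016,404210
Harbor,2017,398775
"""

def ridership_by_year(raw):
    """Aggregate total ridership per year from a CSV export."""
    totals = {}
    for row in csv.DictReader(io.StringIO(raw)):
        year = int(row["year"])
        totals[year] = totals.get(year, 0) + int(row["ridership"])
    return totals

print(ridership_by_year(RAW_CSV))  # {2016: 1604660, 2017: 1709975}
```

A dataset published only as a scanned PDF would defeat exactly this kind of automated re-use, which is why the machine-readable requirement matters.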

Selected Reading List (in alphabetical order)

Annotated Selected Reading List (in alphabetical order)
Fox, Mark S. “City Data: Big, Open and Linked.” Working Paper, Enterprise Integration Laboratory (2013). http://bit.ly/1bFr7oL.

  • This paper examines concepts that underlie Big City Data, using data from multiple cities as examples. It begins by explaining the concepts of Open, Unified, Linked, and Grounded data, which are central to the Semantic Web. Fox then explores Big Data as an extension of Data Analytics, and provides case examples of good data analytics in cities.
  • Fox concludes that we can develop the tools that will enable anyone to analyze data, both big and small, by adopting the principles of the Semantic Web:
    • Data being openly available over the internet,
    • Data being unifiable using common vocabularies,
    • Data being linkable using Internationalized Resource Identifiers (IRIs),
    • Data being accessible using a common data structure, namely triples,
    • Data being semantically grounded using Ontologies.

Foulonneau, Muriel, Sébastien Martin, and Slim Turki. “How Open Data Are Turned into Services?” In Exploring Services Science, edited by Mehdi Snene and Michel Leonard, 31–39. Lecture Notes in Business Information Processing 169. Springer International Publishing, 2014. http://bit.ly/1fltUmR.

  • In this chapter, the authors argue that, considering the important role the development of new services plays as a motivation for open data policies, the impact of new services created through open data should play a more central role in evaluating the success of open data initiatives.
  • Foulonneau, Martin and Turki argue that the following metrics should be considered when evaluating the success of open data initiatives: “the usage, audience, and uniqueness of the services, according to the changes it has entailed in the public institutions that have open[ed] their data…the business opportunity it has created, the citizen perception of the city…the modification to particular markets it has entailed…the sustainability of the services created, or even the new dialog created with citizens.”

Goldstein, Brett, and Lauren Dyson. Beyond Transparency: Open Data and the Future of Civic Innovation. 1st edition. Code for America Press, 2013. http://bit.ly/15OAxgF.

  • This “cross-disciplinary survey of the open data landscape” features stories from practitioners in the open data space — including Michael Flowers, Brett Goldstein, Emer Coleman and many others — discussing what they’ve accomplished with open civic data. The book “seeks to move beyond the rhetoric of transparency for transparency’s sake and towards action and problem solving.”
  • The book’s editors seek to accomplish the following objectives:
    • Help local governments learn how to start an open data program
    • Spark discussion on where open data will go next
    • Help community members outside of government better engage with the process of governance
    • Lend a voice to many aspects of the open data community.
  • The book is broken into five sections: Opening Government Data, Building on Open Data, Understanding Open Data, Driving Decisions with Data and Looking Ahead.

Granickas, Karolis. “Understanding the Impact of Releasing and Re-using Open Government Data.” European Public Sector Information Platform, ePSIplatform Topic Report No. 2013/08, (2013). http://bit.ly/GU0Nx4.

  • This paper examines the impact of open government data by exploring the latest research in the field, with an eye toward enabling an environment for open data, as well as identifying the benefits of open government data and its political, social, and economic impacts.
  • Granickas concludes that to maximize the benefits of open government data: a) further research is required that structures and measures the potential benefits of open government data; b) “government should pay more attention to creating feedback mechanisms between policy implementers, data providers and data-re-users”; c) “finding a balance between demand and supply requires mechanisms of shaping demand from data re-users and also demonstration of data inventory that governments possess”; and lastly, d) “open data policies require regular monitoring.”

Gurin, Joel. Open Data Now: The Secret to Hot Startups, Smart Investing, Savvy Marketing, and Fast Innovation, (New York: McGraw-Hill, 2014). http://amzn.to/1flubWR.

  • In this book, GovLab Senior Advisor and Open Data 500 director Joel Gurin explores the broad realized and potential benefit of Open Data, and how, “unlike Big Data, Open Data is transparent, accessible, and reusable in ways that give it the power to transform business, government, and society.”
  • The book provides “an essential guide to understanding all kinds of open databases – business, government, science, technology, retail, social media, and more – and using those resources to your best advantage.”
  • In particular, Gurin discusses a number of applications of Open Data with very real potential benefits:
    • “Hot Startups: turn government data into profitable ventures;
    • Savvy Marketing: understanding how reputational data drives your brand;
    • Data-Driven Investing: apply new tools for business analysis;
    • Consumer Information: connect with your customers using smart disclosure;
    • Green Business: use data to bet on sustainable companies;
    • Fast R&D: turn the online world into your research lab;
    • New Opportunities: explore open fields for new businesses.”

Jetzek, Thorhildur, Michel Avital, and Niels Bjørn-Andersen. “Generating Value from Open Government Data.” Thirty Fourth International Conference on Information Systems, 5. General IS Topics 2013. http://bit.ly/1gCbQqL.

  • In this paper, the authors “developed a conceptual model portraying how data as a resource can be transformed to value.”
  • Jetzek, Avital and Bjørn-Andersen propose a conceptual model featuring four Enabling Factors (openness, resource governance, capabilities and technical connectivity) acting on four Value Generating Mechanisms (efficiency, innovation, transparency and participation) leading to the impacts of Economic and Social Value.
  • The authors argue that their research supports that “all four of the identified mechanisms positively influence value, reflected in the level of education, health and wellbeing, as well as the monetary value of GDP and environmental factors.”

Kassen, Maxat. “A promising phenomenon of open data: A case study of the Chicago open data project.” Government Information Quarterly (2013). http://bit.ly/1ewIZnk.

  • This paper uses the Chicago open data project to explore the “empowering potential of an open data phenomenon at the local level as a platform useful for promotion of civic engagement projects and provide a framework for future research and hypothesis testing.”
  • Kassen argues that “open data-driven projects offer a new platform for proactive civic engagement”: by harnessing “the collective wisdom of the local communities, their knowledge and visions of the local challenges, governments could react and meet citizens’ needs in a more productive and cost-efficient manner.”
  • The paper highlights the need for independent IT developers to network in order for this trend to continue, as well as the importance of the private sector in “overall diffusion of the open data concept.”

Keen, Justin, Radu Calinescu, Richard Paige, John Rooksby. “Big data + politics = open data: The case of health care data in England.” Policy and Internet 5 (2), (2013): 228–243. http://bit.ly/1i231WS.

  • This paper examines the assumptions regarding open datasets, technological infrastructure and access, using healthcare systems as a case study.
  • The authors specifically address two assumptions surrounding enthusiasm about Big Data in healthcare: the assumption that healthcare datasets and technological infrastructure are up to task, and the assumption of access to this data from outside the healthcare system.
  • By using the National Health Service in England as an example, the authors identify data, technology, and information governance challenges. They argue that “public acceptability of third party access to detailed health care datasets is, at best, unclear,” and that the prospects of Open Data depend on Open Data policies, which are inherently political, and the government’s assertion of property rights over large datasets. Thus, they argue that the “success or failure of Open Data in the NHS may turn on the question of trust in institutions.”

Kulk, Stefan and Bastiaan Van Loenen. “Brave New Open Data World?” International Journal of Spatial Data Infrastructures Research, May 14, 2012. http://bit.ly/15OAUYR.

  • This paper examines the evolving tension between the open data movement and the European Union’s privacy regulations, especially the Data Protection Directive.
  • The authors argue, “Technological developments and the increasing amount of publicly available data are…blurring the lines between non-personal and personal data. Open data may not seem to be personal data on first glance especially when it is anonymised or aggregated. However, it may become personal by combining it with other publicly available data or when it is de-anonymised.”

Kundra, Vivek. “Digital Fuel of the 21st Century: Innovation through Open Data and the Network Effect.” Joan Shorenstein Center on the Press, Politics and Public Policy, Harvard College: Discussion Paper Series, January 2012, http://hvrd.me/1fIwsjR.

  • In this paper, Vivek Kundra, the first Chief Information Officer of the United States, explores the growing impact of open data, and argues that, “In the information economy, data is power and we face a choice between democratizing it and holding on to it for an asymmetrical advantage.”
  • Kundra offers four specific recommendations to maximize the impact of open data:
    • Citizens and NGOs must demand open data in order to fight government corruption, improve accountability and improve government services;
    • Governments must enact legislation to change the default setting of government to open, transparent and participatory;
    • The press must harness the power of the network effect through strategic partnerships and crowdsourcing to cut costs and provide better insights; and
    • Venture capitalists should invest in startups focused on building companies based on public sector data.

Noveck, Beth Simone and Daniel L. Goroff. “Information for Impact: Liberating Nonprofit Sector Data.” The Aspen Institute Philanthropy & Social Innovation Publication Number 13-004. 2013. http://bit.ly/WDxd7p.

  • This report is focused on “obtaining better, more usable data about the nonprofit sector,” which encompasses, as of 2010, “1.5 million tax-exempt organizations in the United States with $1.51 trillion in revenues.”
  • Toward that goal, the authors propose liberating data from the Form 990, an Internal Revenue Service form that “gathers and publishes a large amount of information about tax-exempt organizations,” including information related to “governance, investments, and other factors not directly related to an organization’s tax calculations or qualifications for tax exemption.”
  • The authors recommend a two-track strategy: “Pursuing the longer-term goal of legislation that would mandate electronic filing to create open 990 data, and pursuing a shorter-term strategy of developing a third party platform that can demonstrate benefits more immediately.”

Robinson, David G., Harlan Yu, William P. Zeller, and Edward W. Felten. “Government Data and the Invisible Hand.” Yale Journal of Law & Technology 11 (2009), http://bit.ly/1c2aDLr.

  • This paper proposes a new approach to online government data that “leverages both the American tradition of entrepreneurial self-reliance and the remarkable low-cost flexibility of contemporary digital technology.”
  • “In order for public data to benefit from the same innovation and dynamism that characterize private parties’ use of the Internet, the federal government must reimagine its role as an information provider. Rather than struggling, as it currently does, to design sites that meet each end-user need, it should focus on creating a simple, reliable and publicly accessible infrastructure that ‘exposes’ the underlying data.”

Ubaldi, Barbara. “Open Government Data: Towards Empirical Analysis of Open Government Data Initiatives.” OECD Working Papers on Public Governance. Paris: Organisation for Economic Co-operation and Development, May 27, 2013. http://bit.ly/15OB6qP.

  • This working paper from the OECD seeks to provide an all-encompassing look at the principles, concepts and criteria framing open government data (OGD) initiatives.
  • Ubaldi also analyzes a variety of challenges to implementing OGD initiatives, including policy, technical, economic and financial, organizational, cultural and legal impediments.
  • The paper also proposes a methodological framework for evaluating OGD Initiatives in OECD countries, with the intention of eventually “developing a common set of metrics to consistently assess impact and value creation within and across countries.”

Worthy, Ben. “David Cameron’s Transparency Revolution? The Impact of Open Data in the UK.” SSRN Scholarly Paper. Rochester, NY: Social Science Research Network, November 29, 2013. http://bit.ly/NIrN6y.

  • In this article, Worthy “examines the impact of the UK Government’s Transparency agenda, focusing on the publication of spending data at local government level. It measures the democratic impact in terms of creating transparency and accountability, public participation and everyday information.”
  • Worthy’s findings, based on surveys of local authorities, interviews and FOI requests, are disappointing. He finds that:
    • Open spending data has led to some government accountability, but largely from those already monitoring government, not regular citizens.
    • Open Data has not led to increased participation, “as it lacks the narrative or accountability instruments to fully bring such effects.”
    • It has also not “created a new stream of information to underpin citizen choice, though new innovations offer this possibility. The evidence points to third party innovations as the key.”
  • Despite these initial findings, “Interviewees pointed out that Open Data holds tremendous opportunities for policy-making. Joined up data could significantly alter how policy is made and resources targeted. From small scale issues e.g. saving money through prescriptions to targeting homelessness or health resources, it can have a transformative impact.”

Zuiderwijk, Anneke, Marijn Janssen, Sunil Choenni, Ronald Meijer and Roexsana Sheikh Alibaks. “Socio-technical Impediments of Open Data.” Electronic Journal of e-Government 10, no. 2 (2012). http://bit.ly/17yf4pM.

  • This paper seeks to identify the socio-technical impediments to open data impact based on a review of the open data literature, as well as workshops and interviews.
  • The authors discovered 118 impediments across ten categories: 1) availability and access; 2) find-ability; 3) usability; 4) understandability; 5) quality; 6) linking and combining data; 7) comparability and compatibility; 8) metadata; 9) interaction with the data provider; and 10) opening and uploading.

Zuiderwijk, Anneke and Marijn Janssen. “Open Data Policies, Their Implementation and Impact: A Framework for Comparison.” Government Information Quarterly 31, no. 1 (January 2014): 17–29. http://bit.ly/1bQVmYT.

  • In this article, Zuiderwijk and Janssen argue that “currently there is a multiplicity of open data policies at various levels of government, whereas very little systematic and structured research [is] done on the issues that are covered by open data policies, their intent and actual impact.”
  • With this evaluation deficit in mind, the authors propose a new framework for comparing open data policies at different government levels using the following elements for comparison:
    • Policy environment and context, such as level of government organization and policy objectives;
    • Policy content (input), such as types of data not publicized and technical standards;
    • Performance indicators (output), such as benefits and risks of publicized data; and
    • Public values (impact).

To stay current on recent writings and developments on Open Data, please subscribe to the GovLab Digest.
Did we miss anything? Please submit reading recommendations to biblio@thegovlab.org or in the comments below.

Selected Readings on Behavioral Economics: Nudges


The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of behavioral economics was originally published in 2014.

The 2008 publication of Richard Thaler and Cass Sunstein’s Nudge ushered in a new era of behavioral economics, and since then, policymakers in the United States and elsewhere have been applying behavioral economics to the field of public policy. Like Smart Disclosure, behavioral economics can be used in the public sector to improve the decisionmaking ability of citizens without relying on regulatory interventions. In the six years since Nudge was published, the United Kingdom has created the Behavioural Insights Team (also known as the Nudge Unit), a cross-ministerial organization that uses behavioral economics to inform public policy, and the White House has recently followed suit by convening a team of behavioral economists to create a similar behavioral insights unit in the United States. Policymakers have been using behavioral insights to design more effective interventions in the fields of long-term unemployment; roadway safety; enrollment in retirement plans; and enrollment in organ donation registries, to name some noteworthy examples. The literature of this nascent field provides a look at the growing optimism in the potential of applying behavioral insights in the public sector to improve people’s lives.

Selected Reading List (in alphabetical order)

  • John Beshears, James Choi, David Laibson and Brigitte C. Madrian – The Importance of Default Options for Retirement Savings Outcomes: Evidence from the United States – a paper examining the role default options play in encouraging intelligent retirement savings decisionmaking.
  • Cabinet Office and Behavioural Insights Team, United Kingdom – Applying Behavioural Insights to Health – a paper outlining some examples of behavioral economics being applied to the healthcare landscape using cost-efficient interventions.
  • Matthew Darling, Saugato Datta and Sendhil Mullainathan – The Nature of the BEast: What Behavioral Economics Is Not – a paper dispelling common myths about behavioral economics, including the notion that it is about controlling people’s choices, and reiterating the fact that the field is politically agnostic.
  • Antoinette Schoar and Saugato Datta – The Power of Heuristics – a paper exploring the concept of “heuristics,” or rules of thumb, which can provide helpful guidelines for pushing people toward making “reasonably good” decisions without a full understanding of the complexity of a situation.
  • Richard H. Thaler and Cass R. Sunstein – Nudge: Improving Decisions About Health, Wealth, and Happiness – an influential book describing the many ways in which the principles of behavioral economics can be and have been used to influence choices and behavior through the development of new “choice architectures.” 
  • U.K. Parliament Science and Technology Committee – Behaviour Change – an exploration of the government’s attempts to influence the behaviour of its citizens through nudges, with a focus on comparing the effectiveness of nudges to that of regulatory interventions.

Annotated Selected Reading List (in alphabetical order)

Beshears, John, James Choi, David Laibson and Brigitte C. Madrian. “The Importance of Default Options for Retirement Savings Outcomes: Evidence from the United States.” In Jeffrey R. Brown, Jeffrey B. Liebman and David A. Wise, editors, Social Security Policy in a Changing Environment, Cambridge: National Bureau of Economic Research, 2009. http://bit.ly/LFmC5s.

  • This paper examines the role default options play in pushing people toward making intelligent decisions regarding long-term savings and retirement planning.
  • Importantly, the authors provide evidence that a strategically oriented default setting from the outset is likely not enough to fully nudge people toward the best possible decisions in retirement savings. They find that the default settings in every major dimension of the savings process (from deciding whether to participate in a 401(k) to how to withdraw money at retirement) have real and distinct effects on behavior.

Cabinet Office and Behavioural Insights Team, United Kingdom. “Applying Behavioural Insights to Health.” December 2010. http://bit.ly/1eFP16J.

  • In this report, the United Kingdom’s Behavioural Insights Team does not attempt to “suggest that behaviour change techniques are the silver bullet that can solve every problem.” Rather, they explore a variety of examples where local authorities, charities, government and the private sector are using behavioural interventions to encourage healthier behaviors.
  • The report features case studies on the ability of behavioral insights to affect the following public health issues:
    • Smoking
    • Organ donation
    • Teenage pregnancy
    • Alcohol
    • Diet and weight
    • Diabetes
    • Food hygiene
    • Physical activity
    • Social care
  • The report concludes with a call for more experimentation and knowledge gathering to determine when, where and how behavioural interventions can be most effective in helping the public become healthier.

Darling, Matthew, Saugato Datta and Sendhil Mullainathan. “The Nature of the BEast: What Behavioral Economics Is Not.” The Center for Global Development. October 2013. https://bit.ly/2QytRmf.

  • In this paper, Darling, Datta and Mullainathan outline the three most pervasive myths that abound within the literature about behavioral economics:
    • First, they dispel the relationship between control and behavioral economics. Although tools used within behavioral economics can convince people to make certain choices, the goal is to nudge people to make the choices they want to make. For example, studies find that when retirement savings plans enroll workers by default, requiring them to opt out rather than opt in, more workers end up with 401(k) plans. This is an example of a nudge that guides people to make a choice that they already intend to make.
    • Second, they reiterate that the field is politically agnostic. Both liberals and conservatives have adopted behavioral economics, and its approach is neither liberal nor conservative. President Obama embraces behavioral economics, but the United Kingdom’s Conservative Party does, too.
    • And thirdly, the article highlights that irrationality actually has little to do with behavioral economics. Context is an important consideration when one considers what behavior is rational and what behavior is not. Rather than use the term “irrational” to describe human beings, the authors assert that humans are “infinitely complex” and behavior that is often considered irrational is entirely situational.

Schoar, Antoinette and Saugato Datta. “The Power of Heuristics.” Ideas42. January 2014. https://bit.ly/2UDC5YK.

  • This paper explores the notion that being presented with a bevy of options can be desirable in many situations, but when making an intelligent decision requires a high-level understanding of the nuances of vastly different financial aid packages, for example, options can overwhelm. Heuristics (rules of thumb) provide helpful guidelines that “enable people to make ‘reasonably good’ decisions without needing to understand all the complex nuances of the situation.”
  • The underlying goal of heuristics in the policy space is to give people the kind of “rules of thumb” that enable good decisionmaking regarding complex topics such as finance, healthcare and education. The authors point to the benefit of asking individuals to remember smaller pieces of knowledge by referencing a series of studies conducted by psychologists Beatty and Kahneman that showed people were better able to remember long strings of numbers when they were broken into smaller segments.
  • Schoar and Datta recommend these four rules when implementing heuristics:
    • Use heuristics where possible, particularly in complex situations;
    • Leverage new technology (such as text messages and Internet-based tools) to implement heuristics;
    • Determine where heuristics can be used in adult training programs and replace in-depth training programs with heuristics where possible; and
    • Consider how to apply heuristics in situations where the exception is the rule. The authors point to the example of savings and credit card debt. In most instances, saving a portion of one’s income is a good rule of thumb. However, when one has high credit card debt, paying off debt could be preferable to building one’s savings.

Thaler, Richard H. and Cass R. Sunstein. Nudge: Improving Decisions About Health, Wealth, and Happiness. Yale University Press, 2008. https://bit.ly/2kNXroe.

  • This book, likely the single piece of scholarship most responsible for bringing the concept of nudges into the public consciousness, explores how a strategic “choice architecture” can help people make the best decisions.
  • Thaler and Sunstein, while advocating for the wider and more targeted use of nudges to help improve people’s lives without resorting to overly paternal regulation, look to five common nudges for lessons and inspiration:
    • The design of menus gets you to eat (and spend) more;
    • “Flies” in urinals improve, well, aim;
    • Credit card minimum payments affect repayment schedules;
    • Automatic savings programs increase savings rate; and
    • “Defaults” can improve rates of organ donation.
  • In the simplest terms, the authors propose the wider deployment of choice architectures that follow “the golden rule of libertarian paternalism: offer nudges that are most likely to help and least likely to inflict harm.”

U.K. Parliament Science and Technology Committee. “Behaviour Change.” July 2011. http://bit.ly/1cbYv5j.

  • This report from the U.K.’s Science and Technology Committee explores the government’s attempts to influence the behavior of its citizens through nudges, with a focus on comparing the effectiveness of nudges to that of regulatory interventions.
  • The report’s central conclusion is that, “non-regulatory measures used in isolation, including ‘nudges,’ are less likely to be effective. Effective policies often use a range of interventions.”
  • The report’s other major findings and recommendations are:
    • Government must invest in gathering more evidence about what measures work to influence population behaviour change;
    • They should appoint an independent Chief Social Scientist to provide them with robust and independent scientific advice;
    • The Government should take steps to implement a traffic light system of nutritional labelling on all food packaging; and
    • Current voluntary agreements with businesses in relation to public health have major failings. They are not a proportionate response to the scale of the problem of obesity and do not reflect the evidence about what will work to reduce obesity. If effective agreements cannot be reached, or if they show minimal benefit, the Government should pursue regulation.