Government wants access to personal data while it pushes privacy


Sara Fischer and Scott Rosenberg at Axios: “Over the past two years, the U.S. government has tried to rein in how major tech companies use the personal data they’ve gathered on their customers. At the same time, government agencies are themselves seeking to harness those troves of data.

Why it matters: Tech platforms use personal information to target ads, whereas the government can use it to prevent and solve crimes, deliver benefits to citizens — or (illegally) target political dissent.

Driving the news: A new report from the Wall Street Journal details the ways in which the FBI pressures family DNA testing sites like FamilyTreeDNA to hand over customer data to help solve criminal cases.

  • The trend has privacy experts worried about the potential implications of the government having access to large pools of genetic data, even though many people whose data is included never agreed to its use for that purpose.

The FBI has particular interest in data from genetic and social media sites, because it could help solve crimes and protect the public.

  • For example, the FBI is “soliciting proposals from outside vendors for a contract to pull vast quantities of public data” from Facebook, Twitter Inc. and other social media companies, the Wall Street Journal reports.
  • The request is meant to help the agency surveil social behavior to “mitigate multifaceted threats, while ensuring all privacy and civil liberties compliance requirements are met.”
  • Meanwhile, the Trump administration has also urged social media platforms to cooperate with the government in efforts to flag individual users as potential mass shooters.

Other agencies have their eyes on big data troves as well.

  • Earlier this year, settlement talks between Facebook and the Department of Housing and Urban Development broke down over an advertising discrimination lawsuit when, according to a Facebook spokesperson, HUD “insisted on access to sensitive information — like user data — without adequate safeguards.”
  • HUD presumably wanted access to the data to ensure advertising discrimination wasn’t occurring on the platform, but it’s unclear whether the agency needed user data to be able to support that investigation….(More)”.

Future Studies and Counterfactual Analysis


Book by Theodore J. Gordon and Mariana Todorova: “In this volume, the authors contribute to futures research by placing the counterfactual question in the future tense. They explore possible outcomes of the future, and consider how future decisions can serve as turning points that produce different global outcomes. This book focuses on a dozen or so intractable issues that span politics, religion, and technology, each addressed in individual chapters. Until now, most scenarios written by futurists have been built on cause-and-effect narratives or have depended on numerical models derived from historical relationships. In contrast, many of the scenarios written for this book are point descriptions of future discontinuities, a form that allows more thought-provoking presentations. Ultimately, this book demonstrates that counterfactual thinking and point scenarios of discontinuities are new, groundbreaking tools for futurists….(More)”.

Index: Open Data


By Alexandra Shaw, Michelle Winowatan, Andrew Young, and Stefaan Verhulst

The Living Library Index – inspired by the Harper’s Index – provides important statistics and highlights global trends in governance innovation. This installment focuses on open data and was originally published in 2018.

Value and Impact

  • The projected year by which all EU28+ member countries will have a fully operating open data portal: 2020

  • Expected growth in the market size of open data in Europe between 2016 and 2020: 36.9%, reaching EUR 75.7 billion by 2020

Public Views on and Use of Open Government Data

  • Percentage of Americans who do not trust the federal government or social media sites to protect their data: approximately 50%

  • Key findings from The Economist Intelligence Unit report on Open Government Data Demand:

    • Percentage of respondents who say the key reason why governments open up their data is to create greater trust between the government and citizens: 70%

    • Percentage of respondents who say OGD plays an important role in improving lives of citizens: 78%

    • Percentage of respondents who say OGD helps with daily decision making especially for transportation, education, environment: 53%

    • Percentage of respondents who cite lack of awareness about OGD and its potential use and benefits as the greatest barrier to usage: 50%

    • Percentage of respondents who say they lack access to usable and relevant data: 31%

    • Percentage of respondents who think they don’t have sufficient technical skills to use open government data: 25%

    • Percentage of respondents who feel the number of OGD apps available is insufficient, indicating an opportunity for app developers: 20%

    • Percentage of respondents who say OGD has the potential to generate economic value and new business opportunity: 61%

    • Percentage of respondents who say they don’t trust governments to keep data safe, protected, and anonymized: 19%

Efforts and Involvement

  • Time since open government advocates convened to create a set of principles for open government data – the meeting that started the open government data movement: 10 years

  • Countries participating in the Open Government Partnership today: 79 OGP participating countries and 20 subnational governments

  • Level of “open data readiness” in Europe according to the European Data Portal: 72%

    • Open data readiness consists of four indicators: presence of policy, national coordination, licensing norms, and use of data.

  • Number of U.S. cities with Open Data portals: 27

  • Number of governments who have adopted the International Open Data Charter: 62

  • Number of non-state organizations endorsing the International Open Data Charter: 57

  • Number of countries analyzed by the Open Data Index: 94

  • Number of Latin American countries that do not have open data portals as of 2017: 4 total – Belize, Guatemala, Honduras and Nicaragua

  • Number of cities participating in the Open Data Census: 39

Demand for Open Data

  • Open data demand measured by frequency of open government data use according to The Economist Intelligence Unit report:

    • Australia

      • Monthly: 15% of respondents

      • Quarterly: 22% of respondents

      • Annually: 10% of respondents

    • Finland

      • Monthly: 28% of respondents

      • Quarterly: 18% of respondents

      • Annually: 20% of respondents

    • France

      • Monthly: 27% of respondents

      • Quarterly: 17% of respondents

      • Annually: 19% of respondents

    • India

      • Monthly: 29% of respondents

      • Quarterly: 20% of respondents

      • Annually: 10% of respondents

    • Singapore

      • Monthly: 28% of respondents

      • Quarterly: 15% of respondents

      • Annually: 17% of respondents 

    • UK

      • Monthly: 23% of respondents

      • Quarterly: 21% of respondents

      • Annually: 15% of respondents

    • US

      • Monthly: 16% of respondents

      • Quarterly: 15% of respondents

      • Annually: 20% of respondents

  • Number of FOIA requests received in the US for fiscal year 2017: 818,271

  • Number of FOIA requests processed in the US for fiscal year 2017: 823,222

  • Distribution of FOIA requests in 2017 among the top 5 agencies with the highest number of requests:

    • DHS: 45%

    • DOJ: 10%

    • NARA: 7%

    • DOD: 7%

    • HHS: 4%

Examining Datasets

  • Country with highest index score according to ODB Leaders Edition: Canada (76 out of 100)

  • Country with lowest index score according to ODB Leaders Edition: Sierra Leone (22 out of 100)

  • Proportion of datasets that are open in the top 30 governments according to ODB Leaders Edition: fewer than 1 in 5

  • Average percentage of datasets that are open in the top 30 open data governments according to ODB Leaders Edition: 19%

  • Average percentage of datasets that are open in the top 30 open data governments according to ODB Leaders Edition by sector/subject:

    • Budget: 30%

    • Companies: 13%

    • Contracts: 27%

    • Crime: 17%

    • Education: 13%

    • Elections: 17%

    • Environment: 20%

    • Health: 17%

    • Land: 7%

    • Legislation: 13%

    • Maps: 20%

    • Spending: 13%

    • Statistics: 27%

    • Trade: 23%

    • Transport: 30%

  • Percentage of countries that release data on government spending according to ODB Leaders Edition: 13%

  • Percentage of government data that is updated at regular intervals according to ODB Leaders Edition: 74%

  • Percentage of datasets classed as “open” in 94 places worldwide analyzed by the Open Data Index: 11%

  • Percentage of open datasets in the Caribbean, according to Open Data Census: 7%

  • Number of companies whose data is available through OpenCorporates: 158,589,950

City Open Data

  • New York City

  • Singapore

    • Number of datasets published in Singapore: 1,480

    • Percentage of datasets with standardized format: 35%

    • Percentage of datasets made as raw as possible: 25%

  • Barcelona

    • Number of datasets published in Barcelona: 443

    • Open data demand in Barcelona measured by:

      • Number of unique sessions in the month of September 2018: 5,401

    • Quality of datasets published in Barcelona according to Tim Berners-Lee’s 5-star Open Data scheme: 3 stars

  • London

    • Number of datasets published in London: 762

    • Number of data requests since October 2014: 325

  • Bandung

    • Number of datasets published in Bandung: 1,417

  • Buenos Aires

    • Number of datasets published in Buenos Aires: 216

  • Dubai

    • Number of datasets published in Dubai: 267

  • Melbourne

    • Number of datasets published in Melbourne: 199

Sources

  • About OGP, Open Government Partnership. 2018.  

Selected Readings on Data Responsibility, Refugees and Migration


By Kezia Paladina, Alexandra Shaw, Michelle Winowatan, Stefaan Verhulst, and Andrew Young

The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of Data Collaboration for Migration was originally published in 2018.

Special thanks to Paul Currion, whose data responsibility literature review gave us a head start when developing the below. (Check out his article on Refugee Identity, listed below.)

The collection below is also meant to complement our article in the Stanford Social Innovation Review on Data Collaboration for Migration where we emphasize the need for a Data Responsibility Framework moving forward.

From climate change to politics to finance, there is growing recognition that some of the most intractable problems of our era are information problems. In recent years, the ongoing refugee crisis has increased the call for new data-driven approaches to address the many challenges and opportunities arising from migration. While data – including data from the private sector – holds significant potential value for informing analysis and targeted international and humanitarian response to (forced) migration, decision-makers often lack an actionable understanding of if, when and how data could be collected, processed, stored, analyzed, used, and shared in a responsible manner.

Data responsibility – including the responsibility to protect data and shield its subjects from harms, and the responsibility to leverage and share data when it can provide public value – is an emerging field seeking to go beyond just privacy concerns. The forced migration arena has a number of particularly important issues impacting responsible data approaches, including the risks of leveraging data regarding individuals fleeing a hostile or repressive government.

In this edition of the GovLab’s Selected Readings series, we examine the emerging literature on data responsibility approaches in the refugee and forced migration space – part of an ongoing series focused on Data Responsibility. The reading list below features annotated readings related to the Policy and Practice of data responsibility for refugees, and the specific responsibility challenges regarding Identity and Biometrics.

Data Responsibility and Refugees – Policy and Practice

International Organization for Migration (IOM) (2010) IOM Data Protection Manual. Geneva: IOM.

  • This IOM manual includes 13 data protection principles related to the following activities: lawful and fair collection, specified and legitimate purpose, data quality, consent, transfer to third parties, confidentiality, access and transparency, data security, retention of personal data, application of the principles, ownership of personal data, oversight, compliance and internal remedies (and exceptions).
  • For each principle, the IOM manual features targeted data protection guidelines, and templates and checklists are included to help foster practical application.

Norwegian Refugee Council (NRC) Internal Displacement Monitoring Centre / OCHA (eds.) (2008) Guidance on Profiling Internally Displaced Persons. Geneva: Inter-Agency Standing Committee.

  • This NRC document contains guidelines on gathering better data on Internally Displaced Persons (IDPs), based on country context.
  • An IDP profile is defined as including the number of displaced persons, their location, the causes and patterns of displacement, and humanitarian needs, among other elements.
  • It further states that collecting IDP data is challenging and that the current state of IDP data hampers assistance programs.
  • Chapter I of the document explores the rationale for IDP profiling. Chapter II describes the “who” aspect of profiling: who IDPs are and common pitfalls in distinguishing them from other population groups. Chapter III describes the different methodologies that can be used in different contexts, suggesting some of the advantages and disadvantages of each, what kind of information is needed, and when it is appropriate to profile.

United Nations High Commissioner for Refugees (UNHCR). Model agreement on the sharing of personal data with Governments in the context of hand-over of the refugee status determination process. Geneva: UNHCR.

  • This document from UNHCR provides a template of agreement guiding the sharing of data between a national government and UNHCR. The model agreement’s guidance is aimed at protecting the privacy and confidentiality of individual data while promoting improvements to service delivery for refugees.

United Nations High Commissioner for Refugees (UNHCR) (2015). Policy on the Protection of Personal Data of Persons of Concern to UNHCR. Geneva: UNHCR.

  • This policy outlines the rules and principles governing the processing of personal data of persons of concern to UNHCR, with the purpose of ensuring that the practice is consistent with the UN General Assembly’s regulation of computerized personal data files, established to protect individuals’ data and privacy.
  • UNHCR requires its personnel to apply the following principles when processing personal data: (i) legitimate and fair processing; (ii) purpose specification; (iii) necessity and proportionality; (iv) accuracy; (v) respect for the rights of the data subject; (vi) confidentiality; (vii) security; (viii) accountability and supervision.

United Nations High Commissioner for Refugees (UNHCR) (2015) Privacy Impact Assessment of UNHCR Cash Based Interventions.

  • This impact assessment focuses on privacy issues related to financial assistance for refugees in the form of cash transfers. For international organizations like UNHCR to determine eligibility for cash assistance, data “aggregation, profiling, and social sorting techniques” are often needed, creating a need for a responsible data approach.
  • This Privacy Impact Assessment (PIA) aims to identify the privacy risks posed by the program and to enhance the safeguards that can mitigate those risks.
  • A key issue raised in the PIA is the challenge of ensuring that individuals’ data will not be used for purposes other than those initially specified.

Data Responsibility in Identity and Biometrics

Bohlin, A. (2008) “Protection at the Cost of Privacy? A Study of the Biometric Registration of Refugees.” Lund: Faculty of Law of the University of Lund.

  • This 2008 study focuses on the systematic biometric registration of refugees conducted by UNHCR in refugee camps around the world, asking whether enhancing the registration mechanism contributes to refugees’ protection and the guarantee of their human rights, or instead exposes them to invasions of privacy.
  • Bohlin found that, at the time, UNHCR had failed to put proper safeguards in place for data dissemination, exposing refugees’ data to the risk of misuse. She goes on to suggest data protection regulations that could be put in place in order to protect refugees’ privacy.

Currion, Paul. (2018) “The Refugee Identity.” Medium.

  • Developed as part of a DFID-funded initiative, this essay considers Data Requirements for Service Delivery within Refugee Camps, with a particular focus on refugee identity.
  • Among other findings, Currion finds that since “the digitisation of aid has already begun…aid agencies must therefore pay more attention to the way in which identity systems affect the lives and livelihoods of the forcibly displaced, both positively and negatively.”
  • Currion argues that a Responsible Data approach, as opposed to a process defined by a Data Minimization principle, provides “useful guidelines,” but notes that data responsibility “still needs to be translated into organisational policy, then into institutional processes, and finally into operational practice.”

Farraj, A. (2010) “Refugees and the Biometric Future: The Impact of Biometrics on Refugees and Asylum Seekers.” Colum. Hum. Rts. L. Rev. 42 (2010): 891.

  • This article argues that biometrics help refugees and asylum seekers establish their identity, which is important for ensuring the protection of their rights and service delivery.
  • However, Farraj also describes several risks related to biometrics, such as misidentification and misuse of data, leading to a need for proper approaches to the collection, storage, and use of biometric information by governments, international organizations, and other parties.

GSMA (2017) Landscape Report: Mobile Money, Humanitarian Cash Transfers and Displaced Populations. London: GSMA.

  • This paper from GSMA seeks to evaluate how mobile technology can be helpful in refugee registration, cross-organizational data sharing, and service delivery processes.
  • One of its assessments is that the use of mobile money in a humanitarian context depends on a supportive regulatory environment to unlock mobile money’s true potential. Examples include extending SIM dormancy periods to accommodate infrequent cash disbursements and ensuring that persons without identification are able to use mobile money services.
  • Additionally, GSMA argues that mobile money will be most successful where an ecosystem supports other financial services such as remittances, airtime top-ups, savings, and bill payments. These services are especially helpful in including displaced populations in development.

GSMA (2017) Refugees and Identity: Considerations for mobile-enabled registration and aid delivery. London: GSMA.

  • This paper emphasizes the importance of registration in the context of humanitarian emergency, because being registered and having a document that proves this registration is key in acquiring services and assistance.
  • Studying the cases of Kenya and Iraq, the report concludes by providing three recommendations to improve mobile data collection and registration processes: 1) establish more flexible know-your-customer (KYC) requirements for mobile money in cases where refugees cannot meet existing requirements; 2) encourage interoperability and data sharing to avoid fragmented and duplicative registration management; and 3) build partnerships and collaboration among governments, humanitarian organizations, and multinational corporations.

Jacobsen, Katja Lindskov (2015) “Experimentation in Humanitarian Locations: UNHCR and Biometric Registration of Afghan Refugees.” Security Dialogue, Vol 46 No. 2: 144–164.

  • In this article, Jacobsen studies the biometric registration of Afghan refugees, and considers how “humanitarian refugee biometrics produces digital refugees at risk of exposure to new forms of intrusion and insecurity.”

Jacobsen, Katja Lindskov (2017) “On Humanitarian Refugee Biometrics and New Forms of Intervention.” Journal of Intervention and Statebuilding, 1–23.

  • This article traces the evolution of the use of biometrics at the Office of the United Nations High Commissioner for Refugees (UNHCR) – moving from a few early pilot projects (in the early-to-mid-2000s) to the emergence of a policy in which biometric registration is considered a ‘strategic decision’.

Manby, Bronwen (2016) “Identification in the Context of Forced Displacement.” Washington DC: World Bank Group. Accessed August 21, 2017.

  • In this paper, Manby describes the consequences of lacking identity documents in situations of forced displacement: it prevents displaced populations from accessing various services and creates a higher risk of exploitation. It also lowers the effectiveness of humanitarian action, as lack of identity prevents humanitarian organizations from delivering services to displaced populations.
  • Lack of identity can be both a consequence and a cause of forced displacement. People who have no identity documents can be considered illegal and risk deportation, while the conflicts that drive displacement can also result in the loss of IDs during flight.
  • The paper identifies different stakeholders and their interests in identity and forced displacement, and finds that the biggest challenge in providing identity to refugees is the politics of identification and nationality.
  • Manby concludes that addressing this challenge requires more effective coordination among governments, international organizations, and the private sector to provide alternative means of identification and services to displaced persons. She also argues that it is essential that national identification become a universal practice for states.

McClure, D. and Menchi, B. (2015). Challenges and the State of Play of Interoperability in Cash Transfer Programming. Geneva: UNHCR/World Vision International.

  • This report reviews the elements that contribute to interoperability design for Cash Transfer Programming (CTP). The design framework offered here maps out these various features and examines both the state of the problem and the state of play through a variety of use cases.
  • The study provides insights into ways of addressing the multi-dimensionality of interoperability measures in increasingly complex ecosystems.

NRC / International Human Rights Clinic (2016). Securing Status: Syrian refugees and the documentation of legal status, identity, and family relationships in Jordan.

  • This report examines Syrian refugees’ attempts to obtain identity cards and other forms of legally recognized documentation (mainly, Ministry of Interior Service Cards, or “new MoI cards”) in Jordan through the state’s Urban Verification Exercise (“UVE”). These MoI cards are significant because they allow Syrians to live outside of refugee camps and move freely about Jordan.
  • The text reviews the acquisition process and the subsequent challenges and consequences that refugees face when unable to obtain documentation. These issues range from lack of access to basic services to arrest, detention, forced relocation to camps, and refoulement.
  • Seventy-two Syrian refugee families in Jordan were interviewed in 2016 for this report and their experiences with obtaining MoI cards varied widely.

Office of Internal Oversight Services (2015). Audit of the operations in Jordan for the Office of the United Nations High Commissioner for Refugees. Report 2015/049. New York: UN.

  • This report documents the January 1, 2012 – March 31, 2014 audit of Jordanian operations, intended to ensure the effectiveness of the UNHCR Representation in the country.
  • The main goals of the Regional Response Plan for Syrian refugees included relieving the pressure on Jordanian services and resources while still maintaining protection for refugees.
  • The audit initially rated the Representation as unsatisfactory, and OIOS issued several recommendations under the two key controls, which the Representation accepted. Those recommendations included:
    • Project management:
      • Providing training to staff involved in the financial verification and supervision of partners
      • Revising standard operating procedure on cash based interventions
      • Establishing ways to ensure that appropriate criteria for payment of all types of costs to partners’ staff are included in partnership agreements
    • Regulatory framework:
      • Preparing an annual needs-based procurement plan and establishing adequate management oversight processes
      • Creating procedures for the assessment of renovation work in progress and issuing written change orders
      • Protecting data and ensuring timely consultation with the UNHCR Division of Financial and Administrative Management

UNHCR/WFP (2015). Joint Inspection of the Biometrics Identification System for Food Distribution in Kenya. Geneva: UNHCR/WFP.

  • This report outlines the partnership between WFP and UNHCR in their effort to use a biometric identity-checking system to support food distribution in the Dadaab and Kakuma refugee camps in Kenya.
  • The two agencies conducted a joint inspection mission in March 2015, which found the system to be an effective tool and a model for other country operations.
  • Still, the report proposes (and responds to) 11 recommendations to further improve the efficiency of the biometric system, including real-time evaluation of impact, automatic alerts, and documentation of best practices, among others.

Selected Readings on Data, Gender, and Mobility


By Michelle Winowatan, Andrew Young, and Stefaan Verhulst

The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of data, gender, and mobility was originally published in 2017.

This edition of the Selected Readings was  developed as part of an ongoing project at the GovLab, supported by Data2X, in collaboration with UNICEF, DigitalGlobe, IDS (UDD/Telefonica R&D), and the ISI Foundation, to establish a data collaborative to analyze unequal access to urban transportation for women and girls in Chile. We thank all our partners for their suggestions to the below curation – in particular Leo Ferres at IDS who got us started with this collection; Ciro Cattuto and Michele Tizzoni from the ISI Foundation; and Bapu Vaitla at Data2X for their pointers to the growing data and mobility literature. 

Introduction

Daily mobility is key for gender equity. Access to transportation contributes to women’s agency and independence. The ability to move from place to place safely and efficiently allows women to access education, work, and public life more generally. Yet mobility is not just a means of accessing various opportunities; it is also a means of entering the public domain.

Women’s mobility is a multi-layered challenge
Women’s daily mobility, however, is often hampered by social, cultural, infrastructural, and technical barriers. Cultural bias, for instance, can confine women to areas in close proximity to their homes, reflecting society’s double standard that expects women to be homemakers. From an infrastructural perspective, public transportation mostly accommodates home-to-work trips, when in reality women often make more complex trips with stops at, for example, the market, school, or a healthcare provider – sometimes called “trip chaining.” From a safety perspective, women tend to avoid making trips in certain areas or at certain times because of the constant risk of sexual harassment in public places. Safety concerns also push women toward more expensive modes of transportation, such as taking a cab instead of a bus or train.

The growing importance of (new sources of) data
Researchers are increasingly experimenting with ways to address these interdependent problems through the analysis of diverse datasets, often collected by private sector businesses and other non-governmental entities. Gender-disaggregated mobile phone records, geospatial data, satellite imagery, and social media data, to name a few, are providing evidence-based insight into gender and mobility concerns. Such data collaboratives – the exchange of data across sectors to create public value – can help governments, international organizations, and other public sector entities in the move toward more inclusive urban and transportation planning, and the promotion of gender equity.
The curated set of readings below focuses on the following areas:

  1. Insights on how data can inform gender empowerment initiatives,
  2. Emergent research into the capacity of new data sources – like call detail records (CDRs) and satellite imagery – to increase our understanding of human mobility patterns, and
  3. Publications exploring data-driven policy for gender equity in mobility.

Readings are listed in alphabetical order.

We selected the readings based upon their focus (gender and/or mobility related); scope and representativeness (going beyond one project or context); type of data used (such as CDRs and satellite imagery); and date of publication.

Annotated Reading List

Data and Gender

Blumenstock, Joshua, and Nathan Eagle. Mobile Divides: Gender, Socioeconomic Status, and Mobile Phone Use in Rwanda. ACM Press, 2010.

  • Using traditional survey data and mobile phone operator data, this study analyzes gender and socioeconomic divides in mobile phone use in Rwanda, finding that mobile phone use is significantly more prevalent among men and those of higher socioeconomic status.
  • The study also shows the differences in the way men and women use phones, for example: women are more likely to use a shared phone than men.
  • The authors frame their findings around gender and economic inequality in the country in order to provide pointers for government action.

Bosco, Claudio, et al. Mapping Indicators of Female Welfare at High Spatial Resolution. WorldPop and Flowminder, 2015.

  • This report focuses on early adolescence in girls, which often comes with a higher risk of violence, fewer economic opportunities, and restrictions on mobility. Significant data gaps, along with methodological and ethical issues surrounding data collection for girls, also make it harder for policymakers to craft evidence-based policy to address those issues.
  • The authors analyze geolocated household survey data using statistical models and validation techniques, and create high-resolution maps of various sex-disaggregated indicators – such as nutrition level, access to contraception, and literacy – to better inform local policymaking processes.
  • Further, the report identifies the gender data gap and issues surrounding gender data collection, and argues that comprehensive data can help create better policy and contribute to achieving the Sustainable Development Goals (SDGs).

Buvinic, Mayra, Rebecca Furst-Nichols, and Gayatri Koolwal. Mapping Gender Data Gaps. Data2X, 2014.

  • This study identifies gaps in gender data in developing countries on health, education, economic opportunities, political participation, and human security issues.
  • It recommends ways to close the gender data gap through censuses and micro-level surveys as well as service and administrative records, and emphasizes how “big data” in particular can fill in the missing data needed to measure the progress of women’s and girls’ well-being. The authors argue that identifying these gaps is key to advancing gender equality and women’s empowerment, one of the SDGs.

Catalyzing Inclusive Financial Systems: Chile’s Commitment to Women’s Data. Data2X, 2014.

  • This article analyzes global and national data in the banking sector to fill the gap in sex-disaggregated data in Chile. The purpose of the study is to describe differences in spending behavior and priorities between women and men, identify the challenges women face in accessing financial services, and inform policies that promote women’s inclusion in Chile.

Ready to Measure: Twenty Indicators for Monitoring SDG Gender Targets. Open Data Watch and Data2X, 2016.

  • Using readily available data, this study identifies 20 SDG indicators related to gender issues that can serve as a baseline for measuring progress toward gender equality, such as the percentage of women aged 20-24 who were married or in a union before age 18 (child marriage), the proportion of seats held by women in national parliaments, and the share of women among mobile telephone owners, among others.

Ready to Measure Phase II: Indicators Available to Monitor SDG Gender Targets. Open Data Watch and Data2X, 2017.

  • The Phase II paper is an extension of Ready to Measure Phase I, above. Where Phase I identifies readily available data for measuring women’s and girls’ well-being, Phase II provides information on how to access this data and summarizes insights from it.
  • Phase II finds that although the underlying data needed to measure indicators of women’s and girls’ well-being is readily available in most cases, it is typically not sex-disaggregated.
  • Over one in five – 53 out of 232 – SDG indicators specifically refer to women and girls. However, further analysis from this study reveals that at least 34 more indicators should be disaggregated by sex. For instance, there should be 15 more sex-disaggregated indicators for SDG number 3: “Ensure healthy lives and promote well-being for all at all ages.”
  • The report recommends that national statistical agencies take the lead and make additional efforts to fill the current gender data gap for each of the SDGs, using tools such as statistical modeling.

Reed, Philip J., Muhammad Raza Khan, and Joshua Blumenstock. Observing gender dynamics and disparities with mobile phone metadata. International Conference on Information and Communication Technologies and Development (ICTD), 2016.

  • The study analyzes the mobile phone logs of millions of Pakistani residents to explore whether mobile phone usage behavior differs between men and women, and to determine the extent to which gender inequality is reflected in mobile phone usage.
  • It uses mobile phone data to analyze patterns of usage behavior across genders, and socioeconomic and demographic data obtained from censuses and advocacy groups to assess the state of gender equality in each region of Pakistan.
  • One of its findings is a strong positive correlation between the proportion of female mobile phone users and education score.

Stehlé, Juliette, et al. Gender homophily from spatial behavior in a primary school: A sociometric study. 2013.

  • This paper seeks to understand homophily, the human tendency to interact with peers who share similarities ranging from “physical attributes to tastes or political opinions.” It further seeks to identify the magnitude of influence this kind of homophily has on social structures.
  • Focusing on gender interactions among primary-school-aged children in France, the paper draws on data from wearable devices worn by 200 children over two days, measuring the physical proximity and duration of interactions among those children in the playground.
  • It finds that interaction patterns are significantly determined by the grade and class structure of the school, meaning that children belonging to the same class have the most interactions, and that lower grades usually do not interact with higher grades.
  • Through a gender lens, the study finds that mixed-gender interactions are shorter than same-gender interactions, and that interactions among girls last longer than interactions among boys. These patterns indicate that the children in this school tend to have stronger relationships within their own gender – what the study calls gender homophily – which is apparent in all classes.

Data and Mobility

Bengtsson, Linus, et al. Using Mobile Phone Data to Predict the Spatial Spread of Cholera. Flowminder, 2015.

  • This study seeks to predict the spatial spread of the 2010 cholera epidemic in Haiti using data from 2.9 million anonymous mobile phone SIM cards combined with reported cholera cases from the Haitian Directorate of Health, covering 78 study areas over the period October 16 – December 16, 2010.
  • From this dataset, the study builds a mobility matrix indicating mobile phone movement from one study area to another and combines it with the number of reported cholera cases in each area to calculate the “infectious pressure” on each area (a minimal sketch of this calculation follows this list).
  • The main finding is that the outbreak risk of a study area correlates positively with its infectious pressure level: an infectious pressure above 22 resulted in an outbreak within 7 days. Further, the infectious pressure level can inform the sensitivity and specificity of outbreak predictions.
  • The authors hope to improve infectious disease containment by identifying the areas at highest risk of outbreaks.
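
To make the mobility-matrix approach concrete, the sketch below (Python) illustrates the kind of calculation described above. The data is synthetic, and the variable names and exact weighting are our assumptions rather than the study’s published method; only the outbreak threshold of 22 comes from the study.

```python
import numpy as np

rng = np.random.default_rng(0)

n_areas = 5  # the study analyzed 78 study areas; we use 5 toy areas
# mobility[i, j]: number of SIM cards observed moving from area i to area j
mobility = rng.integers(0, 500, size=(n_areas, n_areas))
# reported cholera cases per origin area over the same window (toy numbers)
cases = rng.integers(0, 50, size=n_areas)

# Share of travelers leaving area i assumed to be infectious, proxied here
# by the local case count scaled by outbound travel volume (an assumption).
outflow = mobility.sum(axis=1)
infection_rate = np.divide(cases, outflow,
                           out=np.zeros(n_areas), where=outflow > 0)

# Infectious pressure on destination area j: expected number of infectious
# arrivals, summed over all origin areas.
pressure = infection_rate @ mobility

for j, p in enumerate(pressure):
    status = "elevated outbreak risk" if p > 22 else "below threshold"
    print(f"area {j}: infectious pressure {p:.1f} ({status})")
```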

Calabrese, Francesco, et al. Understanding Individual Mobility Patterns from Urban Sensing Data: A Mobile Phone Trace Example. SENSEable City Lab, MIT, 2012.

  • This study compares mobile phone data and odometer readings from annual safety inspections to characterize individual mobility and vehicular mobility in the Boston Metropolitan Area, measured by the average daily total trip length of mobile phone users and average daily Vehicular Kilometers Traveled (VKT).
  • The study found that “accessibility to work and non-work destinations are the two most important factors in explaining the regional variations in individual and vehicular mobility, while the impacts of population density and land use mix on both mobility measures are insignificant.” Further, “a well-connected street network is negatively associated with daily vehicular total trip length.”
  • This study demonstrates the potential for mobile phone data to provide useful and updatable information on individual mobility patterns to inform transportation and mobility research.

Campos-Cordobés, Sergio, et al. “Chapter 5 – Big Data in Road Transport and Mobility Research.” Intelligent Vehicles. Edited by Felipe Jiménez. Butterworth-Heinemann, 2018.

  • This study outlines a number of techniques and data sources – such as geolocation information, mobile phone data, and social network observation – that could be leveraged to predict human mobility.
  • The authors also provide a number of examples of real-world applications of big data to address transportation and mobility problems, such as transport demand modeling, short-term traffic prediction, and route planning.

Lin, Miao, and Wen-Jing Hsu. Mining GPS Data for Mobility Patterns: A Survey. Pervasive and Mobile Computing, vol. 12, 2014.

  • This study surveys the current field of research using high resolution positioning data (GPS) to capture mobility patterns.
  • The survey focuses on analyses related to frequently visited locations, modes of transportation, trajectory patterns, and placed-based activities. The authors find “high regularity” in human mobility patterns despite high levels of variation among the mobility areas covered by individuals.

Phithakkitnukoon, Santi, Zbigniew Smoreda, and Patrick Olivier. Socio-Geography of Human Mobility: A Study Using Longitudinal Mobile Phone Data. PLoS ONE, 2012.

  • This study used a year’s call logs and location data of approximately one million mobile phone users in Portugal to analyze the association between individuals’ mobility and their social networks.
  • It measures and analyzes travel scope (locations visited) and geo-social radius (distance from friends, family, and acquaintances) to determine this association.
  • It finds that 80% of places visited are within 20 km of an individual’s nearest social ties’ location, rising to 90% within a 45 km radius. Further, as population density increases, the distance between individuals and their social networks decreases.
  • These findings demonstrate how mobile phone data can provide insight into “the socio-geography of human mobility.”

Semanjski, Ivana, and Sidharta Gautama. Crowdsourcing Mobility Insights – Reflection of Attitude Based Segments on High Resolution Mobility Behaviour Data. Transportation Research, vol. 71, 2016.

  • Using cellphone data, this study maps attitudinal segments that explain how age, gender, occupation, household size, income, and car ownership influence an individual’s mobility patterns. This type of segment analysis is seen as particularly useful for targeted messaging.
  • The authors argue that these time- and space-specific insights could also provide value for government officials and policymakers, by, for example, allowing for evidence-based transportation pricing options and public sector advertising campaign placement.

Silveira, Lucas M., et al. MobHet: Predicting Human Mobility using Heterogeneous Data Sources. Computer Communications, vol. 95, 2016.

  • This study explores the potential of using data from multiple sources (e.g., Twitter and Foursquare), in addition to GPS data, to provide more accurate predictions of human mobility. This heterogeneous data captures the popularity of different locations, the frequency of visits to those locations, and the relationships among people moving around the target area. The authors’ initial experiments find that combining these data sources yields more accurate identification of human mobility patterns.

Wilson, Robin, et al. Rapid and Near Real-Time Assessments of Population Displacement Using Mobile Phone Data Following Disasters: The 2015 Nepal Earthquake. PLOS Current Disasters, 2016.

  • Utilizing the call detail records of 12 million mobile phone users in Nepal, this study tracks the spatio-temporal movements of the population after the earthquake of April 25, 2015.
  • It seeks to address the problem of slow and ineffective disaster response by capturing near real-time displacement patterns from mobile phone call detail records, in order to inform humanitarian agencies about where to distribute assistance. Preliminary results of this study were available nine days after the earthquake.
  • The project relied on cooperation with a mobile phone operator, established before the earthquake, which supplied the de-identified data of the 12 million users.
  • The study finds that shortly after the earthquake there was an anomalous population movement out of the Kathmandu Valley, the most impacted area, to surrounding areas, estimating that 390,000 more people than normal had left the valley.

Data, Gender and Mobility

Althoff, Tim, et al. “Large-Scale Physical Activity Data Reveal Worldwide Activity Inequality.” Nature, 2017.

  • This study’s analysis of worldwide physical activity is built on a dataset containing 68 million days of physical activity of 717,527 people collected through their smartphone accelerometers.
  • The authors find a significant reduction in female activity levels in cities with high activity inequality, and high activity inequality is in turn associated with low city walkability – walkability indicators include pedestrian facilities (city block length, intersection density, etc.) and amenities (shops, parks, etc.).
  • Further, they find that high activity inequality is associated with high levels of inactivity-related health problems, such as obesity.

Borker, Girija. “Safety First: Street Harassment and Women’s Educational Choices in India.” Stop Street Harassment, 2017.

  • Using data collected from SafetiPin, an application that allows users to mark an area on a map as safe or not, and Safecity, another application that lets users share their experiences of harassment in public places, the researcher analyzes the safety of travel routes surrounding different colleges in India and their effect on women’s college choices.
  • The study finds that women are willing to attend a lower-ranked college in order to avoid a higher risk of street harassment. Women who choose the best college from their set of options spend an average of $250 more each year to access safer modes of transportation.

Frias-Martinez, Vanessa, Enrique Frias-Martinez, and Nuria Oliver. A Gender-Centric Analysis of Calling Behavior in a Developing Economy Using Call Detail Records. Association for the Advancement of Artificial Intelligence, 2010.

  • Using encrypted Call Detail Records (CDRs) from 10,000 participants in a developing economy, this study analyzes behavioral, social, and mobility variables to determine the gender of a mobile phone user, and finds that behavioral and social variables in mobile phone use differ between women and men.
  • It finds that women make greater use of phones than men in terms of number of calls made, call duration, and call expenses. Women also have bigger social networks, meaning that the set of unique phone numbers they contact or are contacted by is larger. The study finds no statistically significant gender difference in the distance covered between calls.
  • Frias-Martinez et al. recommend taking these findings into consideration when designing cellphone-based services (the sketch below illustrates the kinds of variables involved).
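
As a rough illustration of the kinds of variables involved, the sketch below derives per-user behavioral and social features from raw call records. The record layout, feature names, and data are hypothetical; the study itself worked with encrypted CDRs and additional variables (such as call expenses and mobility) not reproduced here.

```python
import random
from collections import defaultdict

random.seed(0)

# Synthetic call detail records: (caller_id, callee_id, duration_seconds)
cdrs = [(random.randrange(50), random.randrange(50), random.randrange(30, 600))
        for _ in range(2000)]

# Per-user aggregates mirroring the behavioral and social variables above:
# number of calls, total call duration, and social network size (unique
# contacts). In the study, features like these feed a model that predicts
# the gender of the phone's user.
stats = defaultdict(lambda: {"n_calls": 0, "duration": 0, "contacts": set()})
for caller, callee, duration in cdrs:
    s = stats[caller]
    s["n_calls"] += 1
    s["duration"] += duration
    s["contacts"].add(callee)

for user in sorted(stats)[:5]:
    s = stats[user]
    print(f"user {user}: {s['n_calls']} calls, {s['duration']} s total, "
          f"{len(s['contacts'])} unique contacts")
```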

Psylla, Ioanna, Piotr Sapiezynski, Enys Mones, Sune Lehmann. “The role of gender in social network organization.” PLoS ONE 12, December 20, 2017.

  • Using a large dataset of high resolution data collected through mobile phones, as well as detailed questionnaires, this report studies gender differences in a large cohort. The researchers consider mobility behavior and individual personality traits among a group of more than 800 university students.
  • Analyzing mobility data, they find both that women visit more unique locations over time and that women distribute their time more homogeneously across their visited locations than men, indicating that women’s time commitments are more widely spread across places.

Vaitla, Bapu. Big Data and the Well-Being of Women and Girls: Applications on the Social Scientific Frontier. Data2X, Apr. 2017.

  • In this study, the researchers use geospatial data, credit card and cell phone information, and social media posts to identify problems facing women and girls in developing countries, such as malnutrition, barriers to education and healthcare, and mental health issues.
  • From the credit card and cell phone data in particular, the report finds that analyzing patterns of women’s spending and mobility can provide useful insight into Latin American women’s “economic lifestyles.”
  • Based on this analysis, Vaitla recommends that various non-traditional big data sources be used to fill gaps in conventional data sources, addressing the common invisibility of women and girls in institutional databases.

Selected Readings on Blockchain and Identity


By Hannah Pierce and Stefaan Verhulst

The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of blockchain and identity was originally published in 2017.

The potential of blockchain and other distributed ledger technologies to create positive social change has inspired enthusiasm, broad experimentation, and some skepticism. In this edition of the Selected Readings series, we explore and curate the literature on blockchain and how it impacts identity as a means to access services and rights. (In a previous edition we considered the Potential of Blockchain for Transforming Governance).

Introduction

In 2008, an unknown source calling itself Satoshi Nakamoto released a paper named Bitcoin: A Peer-to-Peer Electronic Cash System, which introduced blockchain. Blockchain is a novel technology that uses a distributed ledger to record transactions and ensure compliance. Blockchain and other distributed ledger technologies (DLTs) rely on an ability to act as a vast, transparent, and secure public database.

Distributed ledger technologies (DLTs) have disruptive potential beyond innovation in products, services, revenue streams and operating systems within industry. By providing transparency and accountability in new and distributed ways, DLTs have the potential to positively empower underserved populations in myriad ways, including providing a means for establishing a trusted digital identity.

Consider the potential of DLTs for the 2.4 billion people worldwide – about 1.5 billion of whom are over the age of 14 – who are unable to prove their identity to the satisfaction of authorities and other organizations, and who are often excluded from property ownership, free movement, and social protection as a result. At the same time, the transition to a DLT-led system of ID management involves various risks that, if not understood and mitigated properly, could harm potential beneficiaries.
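
Before turning to the readings, a minimal sketch may help ground the “distributed ledger” idea. The Python below is illustrative only – a toy hash chain, not how Bitcoin or any production DLT works – showing why a ledger whose blocks commit to their predecessors is tamper-evident:

```python
import hashlib
import json
import time

def make_block(prev_hash: str, transactions: list) -> dict:
    """Create a block that commits to its predecessor via that block's hash."""
    block = {"timestamp": time.time(),
             "transactions": transactions,
             "prev_hash": prev_hash}
    payload = json.dumps(block, sort_keys=True).encode()
    block["hash"] = hashlib.sha256(payload).hexdigest()
    return block

def verify(chain: list) -> bool:
    """Recompute every hash and check each block's link to its predecessor."""
    for prev, curr in zip(chain, chain[1:]):
        body = {k: v for k, v in curr.items() if k != "hash"}
        payload = json.dumps(body, sort_keys=True).encode()
        if hashlib.sha256(payload).hexdigest() != curr["hash"]:
            return False
        if curr["prev_hash"] != prev["hash"]:
            return False
    return True

genesis = make_block("0" * 64, ["genesis"])
b1 = make_block(genesis["hash"], ["alice sends bob 5"])
b2 = make_block(b1["hash"], ["bob sends carol 2"])
chain = [genesis, b1, b2]

print(verify(chain))                          # True
b1["transactions"] = ["alice sends bob 500"]  # rewrite history
print(verify(chain))                          # False: tampering is evident
```

What a real DLT adds on top of this integrity layer is a consensus mechanism, so that many mutually distrusting parties can agree on which chain is authoritative.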

Annotated Selected Reading List

Governance

Cuomo, Jerry, Richard Nash, Veena Pureswaran, Alan Thurlow, Dave Zaharchuk. “Building trust in government: Exploring the potential of blockchains.” IBM Institute for Business Value. January 2017.

This paper from the IBM Institute for Business Value culls findings from surveys conducted with over 200 government leaders in 16 countries regarding their experiences and expectations for blockchain technology. The report also identifies “Trailblazers”, or governments that expect to have blockchain technology in place by the end of the year, and details the views and approaches that these early adopters are taking to ensure the success of blockchain in governance. These Trailblazers also believe that there will be high yields from utilizing blockchain in identity management and that citizen services, such as voting, tax collection and land registration, will become increasingly dependent upon decentralized and secure identity management systems. Additionally, some of the Trailblazers are exploring blockchain application in borderless services, like cross-province or state tax collection, because the technology removes the need for intermediaries like notaries or lawyers to verify identities and the authenticity of transactions.

Mattila, Juri. “The Blockchain Phenomenon: The Disruptive Potential of Distributed Consensus Architectures.” Berkeley Roundtable on the International Economy. May 2016.

This working paper gives a clear introduction to blockchain terminology, architecture, challenges, applications (including use cases), and implications for digital trust, disintermediation, democratizing the supply chain, an automated economy, and the reconfiguration of regulatory capacity. As far as identification management is concerned, Mattila argues that blockchain can remove the need to go through a trusted third party (such as a bank) to verify identity online. This could strengthen the security of personal data, as the move from a centralized intermediary to a decentralized network lowers the risk of a mass data security breach. In addition, using blockchain technology for identity verification allows for a more standardized documentation of identity which can be used across platforms and services. In light of these potential capabilities, Mattila addresses the disruptive power of blockchain technology on intermediary businesses and regulating bodies.

Identity Management Applications

Allen, Christopher.  “The Path to Self-Sovereign Identity.” Coindesk. April 27, 2016.

In this Coindesk article, author Christopher Allen lays out the history of digital identities, then explains a concept of a “self-sovereign” identity, where trust is enabled without compromising individual privacy. His ten principles for self-sovereign identity (Existence, Control, Access, Transparency, Persistence, Portability, Interoperability, Consent, Minimization, and Protection) lend themselves to blockchain technology for administration. Although there are actors making moves toward the establishment of self-sovereign identity, there are a few challenges that face the widespread implementation of these tenets, including legal risks, confidentiality issues, immature technology, and a reluctance to change established processes.

Jacobovitz, Ori. “Blockchain for Identity Management.” Department of Computer Science, Ben-Gurion University. December 11, 2016.

This technical report discusses the advantages of blockchain technology in managing and authenticating identities online, such as the ability for individuals to create and manage their own online identities, which offers greater control over access to personal data. Using blockchain for identity verification can also afford the potential of “digital watermarks” assigned to each of an individual’s transactions, and can eliminate the need to create unique usernames and passwords online. After arguing that this decentralized model will allow individuals to manage data on their own terms, Jacobovitz provides a list of companies, projects, and movements that are using blockchain for identity management.

Mainelli, Michael. “Blockchain Will Help Us Prove Our Identities in a Digital World.” Harvard Business Review. March 16, 2017.

In this Harvard Business Review article, author Michael Mainelli highlights a solution to identity problems for rich and poor alike – mutual distributed ledgers (MDLs), or blockchain technology. These multi-organizational databases with unalterable ledgers and a “super audit trail” involve three parties in digital document exchanges: subjects are individuals or assets, certifiers are organizations that verify identity, and inquisitors are entities that conduct know-your-customer (KYC) checks on the subject. This system would allow for a low-cost, secure, and global method of proving identity. After outlining some of the other benefits this technology may have in creating secure and easily auditable digital documents, such as the greater tolerance that comes from viewing widely public ledgers, Mainelli asks whether these capabilities will turn out to be a boon or a burden to bureaucracy and societal behavior.

Personal Data Security Applications

Banafa, Ahmed. “How to Secure the Internet of Things (IoT) with Blockchain.” Datafloq. August 15, 2016.

This article details the data security risks emerging as the Internet of Things continues to expand, and how blockchain technology can protect the personal data and identity information exchanged between devices. Banafa argues that, because the creation and collection of data is central to the functions of Internet of Things devices, there is an increasing need to better secure data that is largely confidential and often personally identifiable. Decentralizing IoT networks and then securing their communications with blockchain can allow those networks to remain scalable, private, and reliable. Enabling blockchain’s peer-to-peer, trustless communication may also enable smart devices to initiate personal data exchanges like financial transactions, as centralized authorities or intermediaries will not be necessary.

Shrier, David, Weige Wu and Alex Pentland. “Blockchain & Infrastructure (Identity, Data Security).” Massachusetts Institute of Technology. May 17, 2016.

This paper, the third in a four-part series on potential blockchain applications, covers the potential of blockchains to change the status quo in identity authentication systems, privacy protection, transaction monitoring, ownership rights, and data security. The paper also posits that, as personal data becomes more and more valuable, we should move toward a “New Deal on Data” which provides individuals data protection – through blockchain technology – and the option to contribute their data to aggregates that work toward the common good. To achieve this New Deal on Data, robust regulatory standards and financial incentives must be provided to entice individuals to share their data for the benefit of society.

Selected Readings on Blockchain Technology and Its Potential for Transforming Governance


By Prianka Srinivasan, Robert Montano, Andrew Young, and Stefaan G. Verhulst

The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of blockchain and governance was originally published in 2017.

Introduction

In 2008, an unknown source calling itself Satoshi Nakamoto released a paper, “Bitcoin: A Peer-to-Peer Electronic Cash System,” which introduced blockchain technology. Blockchain is a novel system that uses a distributed ledger to record transactions and ensure compliance. Its value relies on its ability to act as a vast, transparent, and secure public database.

It has since gained recognition as a tool to transform governance by creating a decentralized system to

  • manage and protect identity;
  • trace and track; and
  • incentivize smarter social and business contracts.

These applications cast blockchain as a tool to confront certain public problems in the digital age.
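
Before turning to the readings, a toy example may help fix ideas. The Python sketch below shows only the hash-linking that makes a ledger tamper-evident; it is our own illustration rather than any particular implementation, and real blockchains add digital signatures, consensus among many peers, and full replication of the ledger.

```python
from hashlib import sha256
import json

def block_hash(block: dict) -> str:
    # Hash the full block contents, including the pointer to the previous block
    return sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

def add_block(chain: list, transactions: list) -> None:
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"tx": transactions, "prev": prev})

def verify(chain: list) -> bool:
    # Every block must reference the hash of its predecessor
    return all(chain[i]["prev"] == block_hash(chain[i - 1]) for i in range(1, len(chain)))

chain = []
add_block(chain, ["alice pays bob 5"])
add_block(chain, ["bob pays carol 2"])
print(verify(chain))   # True
chain[0]["tx"] = ["alice pays mallory 500"]
print(verify(chain))   # False: editing history breaks every later link
```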

The readings below focus on blockchain’s applications for governance. They have been categorized by theme – Governance Applications, Identity Protection and Management, Tracing and Tracking, and Smart Contracts.

Selected Reading List

Governance Applications

  • Atzori, Marcella – The Center for Blockchain Technologies (2015) Blockchain Technology and Decentralized Governance: Is the State Still Necessary? – Aims to investigate the political applications of blockchain, particularly in encouraging government decentralization, by considering to what extent blockchains can be viewed as “hyper-political tools.” The paper suggests that the domination of private bodies in blockchain systems highlights the continued need for the State to remain a central point of coordination.
  • Boucher, Philip – European Parliamentary Research Service (2017) How blockchain technology could change our lives – This report commissioned by the European Parliamentary Research Service provides a deep introduction to blockchain theory and its applications to society and political systems, providing two-page briefings on currencies, digital content, patents, e-voting, smart contracts, supply chains, and blockchain states.
  • Boucher, Philip – Euroscientist (2017) Are Blockchain Applications Guided by Social Values? – This report by a policy analyst at the European Parliament’s Scientific Foresight Unit evaluates the social and moral contours of blockchain technology, arguing that “all technologies have value and politics,” and blockchain is no exception. It calls for greater scrutiny of blockchain’s promise to act as a truly distributed and transparent system without a “middleman.”
  • Cheng, Steve; Daub, Matthias; Domeyer, Axel; and Lundqvist, Martin – McKinsey & Company (2017) Using Blockchain to Improve Data Management in the Public Sector – This essay considers the potential uses of blockchain technology for the public sector, both to improve the security of sensitive information collected by governments and to simplify communication with specialists.
  • De Filippi, Primavera; and Wright, Aaron – Paris University & Cardozo School of Law (2015) Decentralized Blockchain Technology and the Rise of Lex Cryptographia – Looks at how to regulate blockchain technology, particularly given its implications for governance and society. Argues that a new legal framework needs to emerge to take into account the applications of self-executing blockchain technology.
  • Liebenau, Jonathan and Elaluf-Calderwood, Silvia Monica – London School of Economics & Florida International University (2016) Blockchain Innovation Beyond Bitcoin and Banking – A paper that explores the potential of blockchain technology in financial services and broader digital applications, considers regulatory possibilities and frameworks, and highlights the innovative potential of blockchain.
  • Prpić, John – Lulea University of Technology (2017) Unpacking Blockchains – This short paper provides a brief introduction to the use of Blockchain outside monetary purposes, breaking down its function as a digital ledger and transaction platform.
  • Stark, Josh – Ledger Labs (2016) Making Sense of Blockchain Governance Applications – This CoinDesk article discusses, in simple terms, how blockchain technology can be used to accomplish what it calls “the three basic functions of governance.”
  • UK Government Chief Scientific Adviser (2016) Distributed Ledger Technology: Beyond Blockchain – A report from the UK Government that investigates the use of blockchain’s “distributed ledger” as a database for governments and other institutions to adopt.

Identity Protection and Management

  • Baars, D.S. – University of Twente (2016) Towards Self-Sovereign Identity Using Blockchain Technology – A study exploring self-sovereign identity – i.e., the ability of users to control their own digital identity – that led to the creation of a new architecture designed for users to manage their digital ID. Called the Decentralized Identity Management System, it is built on blockchain technology and is based on the concept of claim-based identity.
  • Burger, Eric and Sullivan, Clare Linda. – Georgetown University (2016) E-Residency and Blockchain. A case study focused on an Estonian commercial initiative that allows for citizens of any nation to become an “Estonian E-Resident.” This paper explores the legal, policy, and technical implications of the program and considers its impact on the way identity information is controlled and authenticated.
  • Nathan, Oz; Pentland, Alex ‘Sandy’; and Zyskind, Guy – Security and Privacy Workshops (2015) Decentralizing Privacy: Using Blockchain to Protect Personal Data – Describes the potential of blockchain technology to create a decentralized personal data management system, making third-party personal data collection redundant.
  • De Filippi, Primavera – Paris University (2016) The Interplay Between Decentralization and Privacy: The Case of Blockchain Technologies – A journal article that weighs the radical transparency of blockchain technology against privacy concerns for its users, finding that the apparent dichotomy is not as much in conflict as it may first appear.

Tracing and Tracking

  • Barnes, Andrew; Brake, Christopher; and Perry, Thomas – Plymouth University (2016) Digital Voting with the use of Blockchain Technology – A report investigating the potential of blockchain technology to overcome issues surrounding digital voting, from voter fraud to data security and defense against cyberattacks. Proposes a blockchain voting system that can safely and robustly manage these challenges.
  • The Economist (2015) “The Great Chain of Being Sure About Things” – An article that explores the potential usefulness of blockchain-based land registries in places like Honduras and Greece, transaction registries for trading stock, and the creation of smart contracts.
  • Lin, Wendy; McDonnell, Colin; and Yuan, Ben – Massachusetts Institute of Technology (2015) Blockchains and Electronic Health Records – Suggests the “durable transaction ledger” fundamental to blockchain has wide applicability in electronic medical record management. Also evaluates some of the practical shortcomings of implementing the system across the US health industry.

Smart Contracts

  • Iansiti, Marco; and Lakhani, Karim R. – Harvard Business Review (2017) The Truth about Blockchain – A Harvard Business Review article exploring how blockchain technology can create secure and transparent digital contracts, and what effect this may have on the economy and businesses.
  • Levy, Karen E.C. – Engaging Science, Technology, and Society (2017) Book-Smart, Not Street-Smart: Blockchain-Based Smart Contracts and The Social Workings of Law. Article exploring the concept of blockchain-based “smart contracts” – contracts that securely automate and execute obligations without a centralized authority – and discusses the tension between law, social norms, and contracts with an eye toward social equality and fairness.

Annotated Selected Reading List

Cheng, Steve, Matthias Daub, Axel Domeyer, and Martin Lundqvist. “Using blockchain to improve data management in the public sector.” McKinsey & Company. Web. 03 Apr. 2017. http://bit.ly/2nWgomw

  • An essay arguing that blockchain is useful outside of financial institutions for government agencies, particularly those that store sensitive information such as birth and death dates or information about marital status, business licensing, property transfers, and criminal activity.
  • Blockchain technology would maintain the security of such sensitive information while also making it easier for agencies to use and access critical public-sector information.
  • Despite its potential, a significant drawback for use by government agencies is the speed with which blockchain has developed – there are no accepted standards for blockchain technologies or the networks that operate them; and because many providers are start-ups, agencies might struggle to find partners that will have lasting power. Additionally, government agencies will have to remain vigilant to ensure the security of data.
  • Although best practices will take some time to develop, this piece argues that the time is now for experimentation – and that governments would be wise to include blockchain in their strategies to learn what methods work best and uncover how to best unlock the potential of blockchain.

“The Great Chain of Being Sure About Things.” The Economist. The Economist Newspaper, 31 Oct. 2015. Web. 03 Apr. 2017. http://econ.st/1M3kLnr

  • This is an exploratory article written in The Economist that examines the various potential uses of blockchain technology beyond its initial focus on bitcoin:
    • It highlights the potential of blockchain-based land registries as a way to curb human rights abuses and insecurity in much of the world (it specifically cites examples in Greece and Honduras);
    • It also highlights the relative security of blockchain while noting its openness;
    • It is useful as a primer for non-specialists on how blockchain functions as a tool;
    • Discusses “smart contracts” (about which we have linked more research above);
    • Analyzes potential risks;
    • And considers the potential future unlocked by blockchain
  • This article is particularly useful as a primer on the various capabilities and potential of blockchain for interested researchers who may not have a detailed knowledge of the technology, or for those seeking an introduction.

Iansiti, Marco and Lakhani, Karim R. “The Truth About Blockchain.” Harvard Business Review. N.p., 17 Feb. 2017. Web. 06 Apr. 2017. http://bit.ly/2hqo3FU

  • This Harvard Business Review article discusses blockchain’s ability to close the gap between emerging technological progress and the outdated ways in which bureaucracies handle and record contracts and transactions.
  • Blockchain, the authors argue, allows us to imagine a world in which “contracts are embedded in digital code and stored in transparent, shared databases, where they are protected from deletion, tampering, and revision,” allowing for the removal of intermediaries and facilitating direct interactions between individuals and institutions (a toy illustration of such a self-executing contract follows this entry).
  • The authors compare the emergence of blockchain to other technologies that have had transformative power, such as TCP/IP, and consider the speed with which they have proliferated and become mainstream.
    • They argue that, like TCP/IP, blockchain is likely decades away from maximizing its potential and offer frameworks for the adoption of the technology spanning single-use, localization, substitution, and transformation.
    • Using these frameworks and comparisons, the authors present an investment strategy for those interested in blockchain.
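
To illustrate what “contracts embedded in digital code” can mean in practice, here is a deliberately simplified escrow example. It simulates in plain Python what smart-contract platforms express in dedicated contract languages; the parties, conditions, and log are invented for illustration and are not drawn from the article.

```python
# A toy, in-memory escrow "contract": code that releases payment only when
# agreed conditions are met, with every state change appended to a shared log.
class EscrowContract:
    def __init__(self, buyer, seller, amount):
        self.buyer, self.seller, self.amount = buyer, seller, amount
        self.funded = False
        self.delivered = False
        self.log = []  # stands in for the transparent, shared ledger

    def fund(self, party):
        if party == self.buyer and not self.funded:
            self.funded = True
            self.log.append(f"{party} deposited {self.amount}")

    def confirm_delivery(self, party):
        if party == self.buyer:
            self.delivered = True
            self.log.append(f"{party} confirmed delivery")
        self._settle()

    def _settle(self):
        # The "self-executing" step: no intermediary decides; the code does
        if self.funded and self.delivered:
            self.log.append(f"released {self.amount} to {self.seller}")

deal = EscrowContract("alice", "bob", 100)
deal.fund("alice")
deal.confirm_delivery("alice")
print(deal.log)
```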

IBM Global Business Services Public Sector Team. “Blockchain: The Chain of Trust and its Potential to Transform Healthcare – Our Point of View.” IBM. 2016. http://bit.ly/2oBJDLw

  • This enthusiastic business report from IBM suggests that blockchain technology can be adopted by the healthcare industry to “solve” challenges healthcare professionals face. This is primarily achieved by blockchain’s ability to streamline transactions by establishing trust, accountability, and transparency.
  • Structured around so-called “pain-points” in the healthcare industry, and how blockchain can confront them, the paper looks at 3 concepts and their application in the healthcare industry:
    • Bit-string cryptography: Addresses privacy and security concerns in healthcare by supporting data encryption and enforcing complex data-permission systems. This allows healthcare professionals to share data without risking the privacy of patients. It also streamlines data management systems, saving money and improving efficiency.
    • Transaction Validity: This feature promotes the use of electronic prescriptions by allowing transactional trust and authenticated data exchange. Abuse is reduced, and abusers are more easily identified.
    • Smart contracts: Streamlines procurement and contracting in healthcare by reducing intermediaries, creating a more efficient and transparent healthcare system.
  • The paper goes on to signal the limitations of blockchain in certain use cases (particularly in low-value, high-volume transactions) but highlights 3 use cases where blockchain can help address a business problem in the healthcare industry.
  • It is important to keep in mind that, since this paper is geared toward business applications of blockchain through the lens of IBM’s investments, the problems are framed as business/transactional problems, where blockchain primarily improves efficiency rather than supporting patient outcomes.

Nathan, Oz; Pentland, Alex ‘Sandy’; and Zyskind, Guy “Decentralizing Privacy: Using Blockchain to Protect Personal Data” Security and Privacy Workshops (SPW). 2015. http://bit.ly/2nPo4r6

  • This technical paper suggests that anonymization and centralized systems can never provide complete security for personal data, and only blockchain technology, by creating a decentralized data management system, can overcome these privacy issues.
  • The authors identify 3 common privacy concerns that blockchain technology can address:
    • Data ownership: users want to own and control their personal data, and data management systems must acknowledge this.
    • Data transparency and auditability: users want to know what data is being collected and for what purpose.
    • Fine-grained access control: users want to be able to easily update and adapt their permission settings to control how and when third-party organizations access their data.
  • The authors propose their own system, designed for mobile phones, which integrates blockchain technology to store data in a reliable way. The system uses blockchain to store data, verifies users through a digital signature when they want to access data, and provides a user interface through which individuals can view their personal data.
  • Though much of the body of this paper consists of technical details on the setup of this blockchain data management system, it provides a strong case for how blockchain technology can be practically implemented to assuage public privacy concerns. The authors highlight that by using blockchain, “laws and regulations could be programmed into the blockchain itself, so that they are enforced automatically.” They ultimately conclude that using blockchain in a data protection system such as the one they propose is easier, safer, and more accountable. (A toy sketch of the permission-check pattern they describe follows this entry.)
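
The sketch below illustrates, in simplified Python, the access pattern described above: a user-maintained permissions table consulted on every authenticated request. We use an HMAC as a stand-in for the paper’s public-key digital signatures, and the services, fields, and keys are invented for illustration.

```python
import hmac, hashlib

# User-set permissions (in the paper's design, stored on the blockchain and auditable)
permissions = {("alice", "health_app"): {"steps", "heart_rate"}}
keys = {"health_app": b"health_app_secret"}  # HMAC key: a stand-in for a signing key

def sign(party: str, message: str) -> str:
    return hmac.new(keys[party], message.encode(), hashlib.sha256).hexdigest()

def request_access(user: str, party: str, field: str, signature: str) -> bool:
    # 1. Authenticate the requester (nodes would verify a real digital signature)
    if not hmac.compare_digest(signature, sign(party, f"{user}:{field}")):
        return False
    # 2. Enforce the user's own permission entry
    return field in permissions.get((user, party), set())

print(request_access("alice", "health_app", "steps",
                     sign("health_app", "alice:steps")))     # True: permitted
print(request_access("alice", "health_app", "location",
                     sign("health_app", "alice:location")))  # False: never granted
permissions[("alice", "health_app")].discard("steps")        # alice revokes access
print(request_access("alice", "health_app", "steps",
                     sign("health_app", "alice:steps")))     # False after revocation
```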

Wright, Aaron, and Primavera De Filippi. “Decentralized blockchain technology and the rise of lex cryptographia.” 2015. Available at SSRN http://bit.ly/2oujvoG

  • This paper proposes that the emergence of blockchain technology, and its various applications (decentralized currencies, self-executing contracts, smart property etc.), will necessitate the creation of a new subset of laws, termed by the authors as “Lex Cryptographia.”
  • Given blockchain’s ability to “cut out the middleman,” the technology poses concrete challenges to law enforcement. These challenges stem from the very benefits of blockchain; for instance, the authors posit that a decentralized, autonomous blockchain system could act much like “a biological virus or an uncontrollable force of nature” if it were ill-intentioned. Though such systems can curb the corruption and hierarchy associated with traditional, centralized systems, their autonomy poses an obvious obstacle for law enforcement.
  • The paper goes on to detail the possible benefits and societal impacts of various applications of blockchain, finally suggesting a need to “rethink” traditional models of regulating society and individuals. The authors predict a rise in Lex Cryptographia “characterized by a set of rules administered through self-executing smart contracts and decentralized (and potentially autonomous) organizations.” Much of this regulation depends on supervising restrictions placed upon blockchain technology that may chill its application – for instance, corporations that purposefully exclude blockchain-based applications from their search engines so as to stymie the adoption of the technology.

Selected Readings on Algorithmic Scrutiny


By Prianka Srinivasan, Andrew Young and Stefaan Verhulst

The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of algorithmic scrutiny was originally published in 2017.

Introduction

From government policy to criminal justice to our news feeds to business and consumer practices, the processes that shape our lives both online and off are more and more driven by data and the complex algorithms used to form rulings or predictions. In most cases, these algorithms have created “black boxes” of decision making, where models remain inscrutable and inaccessible. It should therefore come as no surprise that several observers and policymakers are calling for more scrutiny of how algorithms are designed and work, particularly when their outcomes convey intrinsic biases or defy existing ethical standards.

While the concern about values in technology design is not new, recent developments in machine learning, artificial intelligence and the Internet of Things have increased the urgency to establish processes and develop tools to scrutinize algorithms.

In what follows, we have curated several readings covering the impact of algorithms on:

  • Information Intermediaries;
  • Governance;
  • Finance; and
  • Justice.

In addition we have selected a few readings that provide insight on possible processes and tools to establish algorithmic scrutiny.

Selected Reading List

Information Intermediaries

Governance

Consumer Finance

Justice

Tools & Processes Toward Algorithmic Scrutiny

Annotated Selected Reading List

Information Intermediaries

Diakopoulos, Nicholas. “Algorithmic accountability: Journalistic investigation of computational power structures.” Digital Journalism 3.3 (2015): 398-415.

  • This paper attempts to substantiate the notion of accountability for algorithms, particularly how they relate to media and journalism. It puts forward the notion of “algorithmic power,” analyzing the framework of influence such systems exert, and also introduces some of the challenges in the practice of algorithmic accountability, particularly for computational journalists.
  • Offers a basis for how algorithms can be analyzed, based on the types of decisions algorithms make in prioritizing, classifying, associating, and filtering information.

Diakopoulos, Nicholas, and Michael Koliska. “Algorithmic transparency in the news media.” Digital Journalism (2016): 1-20. http://bit.ly/2hMvXdE.

  • This paper analyzes the increased use of “computational journalism,” and argues that though transparency remains a key tenet of journalism, the use of algorithms in gathering, producing and disseminating news undermines this principle.
  • It first analyzes what the ethical principle of transparency means to journalists and the media. It then highlights the findings from a focus-group study, where 50 participants from the news media and academia were invited to discuss three different case studies related to the use of algorithms in journalism.
  • They find two key barriers to algorithmic transparency in the media: “(1) a lack of business incentives for disclosure, and (2) the concern of overwhelming end-users with too much information.”
  • The study also finds a variety of opportunities for transparency across the “data, model, inference, and interface” components of an algorithmic system.

Napoli, Philip M. “The algorithm as institution: Toward a theoretical framework for automated media production and consumption.” Fordham University Schools of Business Research Paper (2013). http://bit.ly/2hKBHqo

  • This paper puts forward an analytical framework for discussing the algorithmic creation of media and journalism content, in an attempt to “close the gap” on theory related to automated media production.
  • By borrowing concepts from institutional theory, the paper finds that algorithms are distinct forms of media institutions, and considers the cultural and political implications of this interpretation.
  • It urges further study in the field of “media sociology” to further unpack the influence of algorithms, and their role in institutionalizing certain norms, cultures and ways of thinking.

Introna, Lucas D., and Helen Nissenbaum. “Shaping the Web: Why the politics of search engines matters.” The Information Society 16.3 (2000): 169-185. http://bit.ly/2ijzsrg.

  • This paper, published 16 years ago, provides an in-depth account of some of the risks related to search engine optimizations, and the biases and harms these can introduce, particularly on the nature of politics.
  • Suggests search engines can be designed to account for these political dimensions, and better correlate with the ideal of the World Wide Web as being a place that is open, accessible and democratic.
  • According to the paper, policy (and not the free market) is the only way to spur change in this field, though the current technical solutions we have introduce further challenges.

Gillespie, Tarleton. “The Relevance of Algorithms.” Media technologies: Essays on communication, materiality, and society (2014): 167. http://bit.ly/2h6ASEu.

  • This paper suggests that algorithms, to the extent that they now shape many aspects of our lives (Gillespie calls these “public relevance algorithms”), are fundamentally “producing and certifying knowledge.” In this ability to create a particular “knowledge logic,” algorithms are a primary feature of our information ecosystem.
  • The paper goes on to map 6 dimensions of these public relevance algorithms:
    • Patterns of inclusion
    • Cycles of anticipation
    • The evaluation of relevance
    • The promise of algorithmic objectivity
    • Entanglement with practice
    • The production of calculated publics
  • The paper concludes by highlighting the need for a sociological inquiry into the function, implications and contexts of algorithms, and to “soberly recognize their flaws and fragilities,” despite the fact that much of their inner workings remain hidden.

Rainie, Lee and Janna Anderson. “Code-Dependent: Pros and Cons of the Algorithm Age.” Pew Research Center. February 8, 2017. http://bit.ly/2kwnvCo.

  • This Pew Research Center report examines the benefits and negative impacts of algorithms as they become more influential in different sectors and aspects of daily life.
  • Through a scan of the research and practice, with a particular focus on the research of experts in the field, Rainie and Anderson identify seven key themes of the burgeoning Algorithm Age:
    • Algorithms will continue to spread everywhere
    • Good things lie ahead
    • Humanity and human judgment are lost when data and predictive modeling become paramount
    • Biases exist in algorithmically-organized systems
    • Algorithmic categorizations deepen divides
    • Unemployment will rise; and
    • The need grows for algorithmic literacy, transparency and oversight

Tufekci, Zeynep. “Algorithmic harms beyond Facebook and Google: Emergent challenges of computational agency.” Journal on Telecommunications & High Technology Law 13 (2015): 203. http://bit.ly/1JdvCGo.

  • This paper establishes some of the risks and harms in regard to algorithmic computation, particularly in their filtering abilities as seen in Facebook and other social media algorithms.
  • Suggests that the editorial decisions performed by algorithms can have significant influence on our political and cultural realms, and categorizes the types of harms that algorithms may have on individuals and their society.
  • Takes two case studies – one on the social media coverage of the Ferguson protests, the other on how social media can influence election turnout – to analyze the influence of algorithms. In doing so, this paper lays out the “tip of the iceberg” in terms of some of the challenges and ethical concerns introduced by algorithmic computing.

Mittelstadt, Brent, Patrick Allo, Mariarosaria Taddeo, Sandra Wachter, and Luciano Floridi. “The Ethics of Algorithms: Mapping the Debate.” Big Data & Society (2016): 3(2). http://bit.ly/2kWNwL6

  • This paper provides significant background and analysis of the ethical context of algorithmic decision-making. It primarily seeks to map the ethical consequences of algorithms, which have adopted the role of a mediator between data and action within societies.
  • Develops a conceptual map of 6 ethical concerns:
    • Inconclusive Evidence
    • Inscrutable Evidence
    • Misguided Evidence
    • Unfair Outcomes
    • Transformative Effects
    • Traceability
  • The paper then reviews existing literature, which together with the map creates a structure to inform future debate.

Governance

Janssen, Marijn, and George Kuk. “The challenges and limits of big data algorithms in technocratic governance.” Government Information Quarterly 33.3 (2016): 371-377. http://bit.ly/2hMq4z6.

  • Considering the centrality of algorithms in enforcing policy and extending governance, this paper analyzes the “technocratic governance” that has emerged from the removal of humans from decision-making processes and the inclusion of algorithmic automation.
  • The paper argues that the belief that technocratic governance produces neutral and unbiased results, since its decision-making processes are uninfluenced by human judgment, is at odds with studies that reveal the inherent discriminatory practices that exist within algorithms.
  • Suggests that algorithms are still bound by the biases of designers and policy-makers, and that accountability is needed to improve the functioning of an algorithm. In order to do so, we must acknowledge the “intersecting dynamics of algorithm as a sociotechnical materiality system involving technologies, data and people using code to shape opinion and make certain actions more likely than others.”

Just, Natascha, and Michael Latzer. “Governance by algorithms: reality construction by algorithmic selection on the Internet.” Media, Culture & Society (2016): 0163443716643157. http://bit.ly/2h6B1Yv.

  • This paper provides a conceptual framework on how to assess the governance potential of algorithms, asking how technology and software governs individuals and societies.
  • By understanding algorithms as institutions, the paper suggests that algorithmic governance puts in place more evidence-based and data-driven systems than traditional governance methods. The result is a form of governance that cares more about effects than causes.
  • The paper concludes by suggesting that algorithmic selection on the Internet tends to shape individuals’ realities and social orders by “increasing individualization, commercialization, inequalities, deterritorialization, and decreasing transparency, controllability, predictability.”

Consumer Finance

Hildebrandt, Mireille. “The dawn of a critical transparency right for the profiling era.” Digital Enlightenment Yearbook 2012 (2012): 41-56. http://bit.ly/2igJcGM.

  • Analyzes the use of consumer profiling by online businesses in order to target marketing and services to their needs. By establishing how this profiling relates to identification, the author also offers some of the threats to democracy and the right of autonomy posed by these profiling algorithms.
  • The paper concludes by suggesting that cross-disciplinary transparency is necessary to design more accountable profiling techniques that can match the extension of “smart environments” that capture ever more data and information from users.

Reddix-Smalls, Brenda. “Credit Scoring and Trade Secrecy: An Algorithmic Quagmire or How the Lack of Transparency in Complex Financial Models Scuttled the Finance Market.” UC Davis Business Law Journal 12 (2011): 87. http://bit.ly/2he52ch

  • Analyzes the creation of predictive risk models in financial markets through algorithmic systems, particularly in regard to credit scoring. It suggests that these models were corrupted in order to maintain a competitive market advantage: “The lack of transparency and the legal environment led to the use of these risk models as predatory credit pricing instruments as opposed to accurate credit scoring predictive instruments.”
  • The paper suggests that without greater transparency of these financial risk models, and greater regulation of their use, another financial crisis like that of 2008 is highly likely.

Justice

Aas, Katja Franko. “Sentencing Transparency in the Information Age.” Journal of Scandinavian Studies in Criminology and Crime Prevention 5.1 (2004): 48-61. http://bit.ly/2igGssK.

  • This paper questions the use of predetermined sentencing in the US judicial system through the application of computer technology and sentencing information systems (SIS). By assessing the use of these systems in the English-speaking world and Norway, the author suggests that such technological approaches to sentencing attempt to overcome accusations of mistrust, uncertainty and arbitrariness often leveled against the judicial system.
  • However, in their attempt to rebuild trust, such technological solutions can be seen as an attempt to remedy a flawed view of judges by the public. Therefore, the political and social climate must be taken into account when trying to reform these sentencing systems: “The use of the various sentencing technologies is not only, and not primarily, a matter of technological development. It is a matter of a political and cultural climate and the relations of trust in a society.”

Cui, Gregory. “Evidence-Based Sentencing and the Taint of Dangerousness.” Yale Law Journal Forum 125 (2016): 315. http://bit.ly/1XLAvhL.

  • This short essay, published in the Yale Law Journal Forum, calls for greater scrutiny of “evidence-based sentencing,” where past data is computed and used to predict the future criminal behavior of a defendant. The author suggests that these risk models may run afoul of the Constitution’s prohibition of bills of attainder, which bars inflicting punishment without a judicial trial.

Tools & Processes Toward Algorithmic Scrutiny

Ananny, Mike and Crawford, Kate. “Seeing without knowing: Limitations of the transparency ideal and its application to algorithmic accountability.” New Media & Society. SAGE Publications. 2016. http://bit.ly/2hvKc5x.

  • This paper critically analyzes calls to improve the transparency of algorithms, asking how we can confront the historical limitations of the transparency ideal in computing.
  • By establishing “transparency as an ideal” the paper tracks the philosophical and historical lineage of this principle, attempting to establish what laws and provisions were put in place across the world to keep up with and enforce this ideal.
  • The paper goes on to detail the limits of transparency as an ideal, arguing, amongst other things, that it does not necessarily build trust, it privileges a certain function (seeing) over others (say, understanding) and that it has numerous technical limitations.
  • The paper ends by concluding that transparency is an inadequate way to govern algorithmic systems, and that accountability must acknowledge the ability to govern across systems.

Datta, Anupam, Shayak Sen, and Yair Zick. “Algorithmic Transparency via Quantitative Input Influence.” Proceedings of the 37th IEEE Symposium on Security and Privacy. 2016. http://bit.ly/2hgyLTp.

  • This paper develops a family of Quantitative Input Influence (QII) measures “that capture the degree of influence of inputs on outputs of systems.” The aim is to design transparency reports that accompany algorithmic decisions, in order to explain those decisions and detect algorithmic discrimination.
  • QII works by breaking “correlations between inputs to allow causal reasoning, and computes the marginal influence of inputs in situations where inputs cannot affect outcomes alone.”
  • Finds that these QII measures are useful in scrutinizing algorithms when “black box” access is available. (A toy version of the input-randomization idea is sketched below.)
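
To convey the intuition (not the paper’s full formalism), the Python sketch below estimates a unary, QII-style influence score by replacing one input with a random draw from its marginal distribution and counting how often the decision flips. The model, population, and feature names are invented for illustration.

```python
import random

def model(age, zip_code, income):
    # A stand-in "black box" that, unknown to the auditor, uses only income
    return income > 50_000

def marginal_influence(predict, samples, feature_index, n=2000):
    flips = 0
    for _ in range(n):
        x = list(random.choice(samples))
        baseline = predict(*x)
        # Break correlations: resample one feature from its marginal distribution
        x[feature_index] = random.choice(samples)[feature_index]
        flips += predict(*x) != baseline
    return flips / n

population = [(random.randint(20, 70), random.randint(10_000, 99_999),
               random.randint(20_000, 120_000)) for _ in range(500)]
for i, name in enumerate(["age", "zip_code", "income"]):
    print(name, round(marginal_influence(model, population, i), 2))
# Only income shows nonzero influence, exposing what the black box relies on
```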

Goodman, Bryce, and Seth Flaxman. “European Union regulations on algorithmic decision-making and a right to explanation.” arXiv preprint arXiv:1606.08813 (2016). http://bit.ly/2h6xpWi.

  • This paper analyzes the implications of a new EU law, taking effect in 2018, that will “restrict automated individual decision-making (that is, algorithms that make decisions based on user level predictors) which ‘significantly affect’ users.” The law will also provide a “right to explanation,” whereby users can ask for an explanation of automated decisions made about them.
  • The paper, while acknowledging the challenges in implementing such laws, suggests that such regulations can spur computer scientists to create algorithms and decision making systems that are more accountable, can provide explanations, and do not produce discriminatory results.
  • The paper concludes by stating algorithms and computer systems should not aim to be simply efficient, but also fair and accountable. It is optimistic about the ability to put in place interventions to account for and correct discrimination.

Kizilcec, René F. “How Much Information?: Effects of Transparency on Trust in an Algorithmic Interface.” Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems. ACM, 2016. http://bit.ly/2hMjFUR.

  • This paper studies how the transparency of algorithms affects our impression of trust by conducting an online field experiment, in which participants enrolled in a MOOC were given different explanations for the computer-generated grades in their class.
  • The study found that “Individuals whose expectations were violated (by receiving a lower grade than expected) trusted the system less, unless the grading algorithm was made more transparent through explanation. However, providing too much information eroded this trust.”
  • In conclusion, the study found that a balance of transparency was needed to maintain trust amongst the participants, suggesting that pure transparency of algorithmic processes and results may not correlate with high feelings of trust amongst users.

Kroll, Joshua A., et al. “Accountable Algorithms.” University of Pennsylvania Law Review 165 (2016). http://bit.ly/2i6ipcO.

  • This paper suggests that policy and legal standards need to be updated, given that algorithms increasingly perform tasks and make decisions that people once did. An “accountability mechanism” is lacking in many of these automated decision-making processes.
  • The paper argues that mere transparency through the disclosure of source code is inadequate when confronting questions of accountability. Rather, technology itself provides a key to creating algorithms and decision-making apparatuses more in line with our existing political and legal frameworks.
  • The paper assesses some computational techniques that may make it possible to create accountable software and reform specific cases of automated decision-making. For example, diversity and anti-discrimination orders can be built into technology to ensure fidelity to policy choices.

Selected Readings on Data Collaboratives


By Neil Britto, David Sangokoya, Iryna Susha, Stefaan Verhulst and Andrew Young

The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of data collaboratives was originally published in 2017.

The term data collaborative refers to a new form of collaboration, beyond the public-private partnership model, in which participants from different sectors (including private companies, research institutions, and government agencies) can exchange data to help solve public problems. Several of society’s greatest challenges – from addressing climate change to public health to job creation to improving the lives of children – require greater access to data, more collaboration between public- and private-sector entities, and an increased ability to analyze datasets. In the coming months and years, data collaboratives will be essential vehicles for harnessing the vast stores of privately held data toward the public good.

Selected Reading List (in alphabetical order)

Annotated Selected Readings List (in alphabetical order)

Agaba, G., Akindès, F., Bengtsson, L., Cowls, J., Ganesh, M., Hoffman, N., . . . Meissner, F. “Big Data and Positive Social Change in the Developing World: A White Paper for Practitioners and Researchers.” 2014. http://bit.ly/25RRC6N.

  • This white paper, produced by “a group of activists, researchers and data experts,” explores the potential of big data to improve development outcomes and spur positive social change in low- and middle-income countries. Using examples, the authors discuss four areas in which the use of big data can impact development efforts:
    • Advocating and facilitating by “open[ing] up new public spaces for discussion and awareness building”;
    • Describing and predicting through the detection of “new correlations and the surfac[ing] of new questions”;
    • Facilitating information exchange through “multiple feedback loops which feed into both research and action”; and
    • Promoting accountability and transparency, especially as a byproduct of crowdsourcing efforts aimed at “aggregat[ing] and analyz[ing] information in real time.”
  • The authors argue that in order to maximize the potential of big data’s use in development, “there is a case to be made for building a data commons for private/public data, and for setting up new and more appropriate ethical guidelines.”
  • They also identify a number of challenges, especially when leveraging data made accessible from a number of sources, including private sector entities, such as:
    • Lack of general data literacy;
    • Lack of open learning environments and repositories;
    • Lack of resources, capacity and access;
    • Challenges of sensitivity and risk perception with regard to using data;
    • Storage and computing capacity; and
    • Externally validating data sources for comparison and verification.

Ansell, C. and Gash, A. “Collaborative Governance in Theory and Practice.” Journal of Public Administration Research and Theory 18 (4), 2008. http://bit.ly/1RZgsI5.

  • This article describes collaborative arrangements that include public and private organizations working together and proposes a model for understanding an emergent form of public-private interaction informed by 137 diverse cases of collaborative governance.
  • The article suggests factors significant to successful partnering processes and outcomes include:
    • Shared understanding of challenges,
    • Trust building processes,
    • The importance of recognizing seemingly modest progress, and
    • Strong indicators of commitment to the partnership’s aspirations and process.
  • The authors provide a “contingency theory model” that specifies relationships between different variables that influence outcomes of collaborative governance initiatives. Three “core contingencies” for successful collaborative governance initiatives identified by the authors are:
    • Time (e.g., decision making time afforded to the collaboration);
    • Interdependence (e.g., a high degree of interdependence can mitigate negative effects of low trust); and
    • Trust (e.g. a higher level of trust indicates a higher probability of success).

Ballivian A, Hoffman W. “Public-Private Partnerships for Data: Issues Paper for Data Revolution Consultation.” World Bank, 2015. Available from: http://bit.ly/1ENvmRJ

  • This World Bank report provides a background document on forming public-private data partnerships in order to inform the UN’s Independent Expert Advisory Group (IEAG) on sustaining a “data revolution” in sustainable development.
  • The report highlights the critical position of private companies within the data value chain and reflects on key elements of a sustainable data PPP: “common objectives across all impacted stakeholders, alignment of incentives, and sharing of risks.” In addition, the report describes the risks and incentives of public and private actors, and the principles needed to “build the legal, cultural, technological and economic infrastructures to enable the balancing of competing interests.” These principles include understanding; experimentation; adaptability; balance; persuasion and compulsion; risk management; and governance.
  • Examples of data collaboratives cited in the report include HP Earth Insights, Orange Data for Development Challenges, Amazon Web Services, IBM Smart Cities Initiative, and the Governance Lab’s Open Data 500.

Brack, Matthew, and Tito Castillo. “Data Sharing for Public Health: Key Lessons from Other Sectors.” Chatham House, Centre on Global Health Security. April 2015. Available from: http://bit.ly/1DHFGVl

  • The Chatham House report provides an overview on public health surveillance data sharing, highlighting the benefits and challenges of shared health data and the complexity in adapting technical solutions from other sectors for public health.
  • The report describes data sharing processes from several perspectives, including in-depth case studies of actual data sharing in practice at the individual, organizational and sector levels. Among the key lessons for public health data sharing, the report strongly highlights the need to harness momentum for action and maintain collaborative engagement: “Successful data sharing communities are highly collaborative. Collaboration holds the key to producing and abiding by community standards, and building and maintaining productive networks, and is by definition the essence of data sharing itself. Time should be invested in establishing and sustaining collaboration with all stakeholders concerned with public health surveillance data sharing.”
  • Examples of data collaboratives include H3Africa (a collaboration between NIH and Wellcome Trust) and NHS England’s care.data programme.

de Montjoye, Yves-Alexandre, Jake Kendall, and Cameron F. Kerry. “Enabling Humanitarian Use of Mobile Phone Data.” The Brookings Institution, Issues in Technology Innovation. November 2014. Available from: http://brook.gs/1JxVpxp

  • Using Ebola as a case study, the authors describe the value of using private telecom data to uncover “valuable insights into understanding the spread of infectious diseases as well as strategies to micro-target outreach and drive uptake of health-seeking behavior.”
  • The authors highlight the absence of a common legal and standards framework for “sharing mobile phone data in privacy-conscientious ways” and recommend “engaging companies, NGOs, researchers, privacy experts, and governments to agree on a set of best practices for new privacy-conscientious metadata sharing models.”

Eckartz, Silja M., Hofman, Wout J., Van Veenstra, Anne Fleur. “A decision model for data sharing.” Vol. 8653 LNCS. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2014. http://bit.ly/21cGWfw.

  • This paper proposes a decision model for data sharing of public and private data based on literature review and three case studies in the logistics sector.
  • The authors identify five categories of barriers to data sharing and offer a decision model for identifying potential interventions to overcome each barrier:
    • Ownership. Possible interventions likely require improving trust among those who own the data through, for example, involvement and support from higher management.
    • Privacy. Interventions include “anonymization by filtering of sensitive information and aggregation of data,” and access control mechanisms built around identity management and regulated access (a minimal sketch of this intervention follows this list).
    • Economic. Interventions include a model where data is shared only with a few trusted organizations, and yield management mechanisms to ensure negative financial consequences are avoided.
    • Data quality. Interventions include identifying additional data sources that could improve the completeness of datasets, and efforts to improve metadata.
    • Technical. Interventions include making data available in structured formats and publishing data according to widely agreed upon data standards.
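
As a rough illustration of the privacy interventions listed above, the Python sketch below drops a direct identifier and releases only aggregates for groups with at least k contributors (a k-anonymity-style floor). The records, field names, and threshold are invented, not drawn from the paper’s logistics cases.

```python
from collections import defaultdict

# Invented shipment records; 'customer' is the sensitive identifier to filter out
records = [
    {"customer": "ACME",   "route": "NL-DE", "weight_kg": 120},
    {"customer": "Birch",  "route": "NL-DE", "weight_kg": 80},
    {"customer": "Cobalt", "route": "NL-BE", "weight_kg": 60},
]

def anonymized_totals(rows, k=2):
    """Filter out identifiers, aggregate by route, and release only groups
    with at least k contributors so no single party is exposed."""
    groups = defaultdict(list)
    for r in rows:
        groups[r["route"]].append(r["weight_kg"])  # 'customer' never leaves
    return {route: sum(ws) for route, ws in groups.items() if len(ws) >= k}

print(anonymized_totals(records))  # {'NL-DE': 200}; NL-BE suppressed (one contributor)
```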

Hoffman, Sharona and Podgurski, Andy. “The Use and Misuse of Biomedical Data: Is Bigger Really Better?” American Journal of Law & Medicine 497, 2013. http://bit.ly/1syMS7J.

  • This journal article explores the benefits and, in particular, the risks related to large-scale biomedical databases that bring together health information from a diversity of sources across sectors. Some data collaboratives examined in the piece include:
    • MedMining – a company that extracts EHR data, de-identifies it, and offers it to researchers. The data sets that MedMining delivers to its customers include ‘lab results, vital signs, medications, procedures, diagnoses, lifestyle data, and detailed costs’ from inpatient and outpatient facilities.
    • Explorys has formed a large healthcare database derived from financial, administrative, and medical records. It has partnered with major healthcare organizations such as the Cleveland Clinic Foundation and Summa Health System to aggregate and standardize health information from ten million patients and over thirty billion clinical events.
  • Hoffman and Podgurski note that biomedical databases have many potential uses, with those likely to benefit including: “researchers, regulators, public health officials, commercial entities, lawyers,” as well as “healthcare providers who conduct quality assessment and improvement activities,” regulatory monitoring entities like the FDA, and “litigants in tort cases to develop evidence concerning causation and harm.”
  • They argue, however, that risks arise based on:
    • The data contained in biomedical databases is surprisingly likely to be incorrect or incomplete;
    • Systemic biases, arising from both the nature of the data and the preconceptions of investigators, are serious threats to the validity of research results, especially in answering causal questions; and
    • Data mining of biomedical databases makes it easier for individuals with political, social, or economic agendas to generate ostensibly scientific but misleading research findings for the purpose of manipulating public opinion and swaying policymakers.

Krumholz, Harlan M., et al. “Sea Change in Open Science and Data Sharing Leadership by Industry.” Circulation: Cardiovascular Quality and Outcomes 7.4. 2014. 499-504. http://1.usa.gov/1J6q7KJ

  • This article provides a comprehensive overview of industry-led efforts and cross-sector collaborations in data sharing by pharmaceutical companies to inform clinical practice.
  • The article details the types of data being shared and the early activities of GlaxoSmithKline (“in coordination with other companies such as Roche and ViiV”); Medtronic and the Yale University Open Data Access Project; and Janssen Pharmaceuticals (Johnson & Johnson). The article also describes the range of involvement in data sharing among pharmaceutical companies including Pfizer, Novartis, Bayer, AbbVie, Eli Lilly, AstraZeneca, and Bristol-Myers Squibb.

Mann, Gideon. “Private Data and the Public Good.” Medium. May 17, 2016. http://bit.ly/1OgOY68.

  • This Medium post from Gideon Mann, the Head of Data Science at Bloomberg, shares his prepared remarks from a lecture at the City College of New York. Mann argues for the potential benefits of increasing access to private sector data, both to improve research and academic inquiry and to help solve practical, real-world problems. He also describes a number of initiatives underway at Bloomberg along these lines.
  • Mann argues that data generated at private companies “could enable amazing discoveries and research,” but is often inaccessible to those who could put it to those uses. Beyond research, he notes that corporate data could, for instance, benefit:
    • Public health – including suicide prevention, addiction counseling and mental health monitoring.
    • Legal and ethical questions – especially as they relate to “the role algorithms have in decisions about our lives,” such as credit checks and resume screening.
  • Mann recognizes the privacy challenges inherent in private sector data sharing, but argues that it is a common misconception that the only two choices are “complete privacy or complete disclosure.” He believes that flexible frameworks for differential privacy could open up new opportunities for responsibly leveraging data collaboratives.

Pastor Escuredo, D., Morales-Guzmán, A. et al, “Flooding through the Lens of Mobile Phone Activity.” IEEE Global Humanitarian Technology Conference, GHTC 2014. Available from: http://bit.ly/1OzK2bK

  • This report describes the use of mobile data to understand the impact of disasters and improve disaster management. The study was conducted in the Mexican state of Tabasco in 2009 as a multidisciplinary, multi-stakeholder consortium involving the UN World Food Programme (WFP), Telefonica Research, Technical University of Madrid (UPM), the Digital Strategy Coordination Office of the President of Mexico, and UN Global Pulse.
  • Telefonica Research, a division of the major Latin American telecommunications company, provided call detail records covering flood-affected areas for nine months. This data was combined with “remote sensing data (satellite images), rainfall data, census and civil protection data.” The results demonstrated that “analysing mobile activity during floods could be used to potentially locate damaged areas, efficiently assess needs and allocate resources (for example, sending supplies to affected areas).”
  • In addition to the results, the study highlighted “the value of a public-private partnership on using mobile data to accurately indicate flooding impacts in Tabasco, thus improving early warning and crisis management.”

Perkmann, M. and Schildt, H. “Open data partnerships between firms and universities: The role of boundary organizations.” Research Policy, 44(5), 2015. http://bit.ly/25RRJ2c

  • This paper discusses the concept of a “boundary organization” in relation to industry-academic partnerships driven by data. Boundary organizations perform mediated revealing, allowing firms to disclose their research problems to a broad audience of innovators and simultaneously minimize the risk that this information would be adversely used by competitors.
  • The authors identify two especially important challenges for private firms to enter open data or participate in data collaboratives with the academic research community that could be addressed through more involvement from boundary organizations:
    • First is a challenge of maintaining competitive advantage. The authors note that, “the more a firm attempts to align the efforts in an open data research programme with its R&D priorities, the more it will have to reveal about the problems it is addressing within its proprietary R&D.”
    • Second involves the misalignment of incentives between the private and academic fields. Perkmann and Schildt argue that a firm seeking to build collaborations around its opened data “will have to provide suitable incentives that are aligned with academic scientists’ desire to be rewarded for their work within their respective communities.”

Robin, N., Klein, T., & Jütting, J. “Public-Private Partnerships for Statistics: Lessons Learned, Future Steps.” OECD. 2016. http://bit.ly/24FLYlD.

  • This working paper acknowledges the growing body of work on how different types of data (e.g., telecom data, social media, sensors, and geospatial data) can address data gaps relevant to National Statistical Offices (NSOs).
  • Four models of public-private interaction for statistics are described: in-house production of statistics by a data provider for a national statistics office (NSO); transfer of datasets from private entities to NSOs; transfer of data to a third-party provider that manages both the NSO’s and the private entity’s data; and the outsourcing of NSO functions.
  • The paper highlights challenges to public-private partnerships involving data (e.g., technical challenges, data confidentiality, risks, limited incentives for participation); suggests that deliberate and highly structured approaches to such partnerships require enforceable contracts; and emphasizes the trade-off between data specificity and accessibility, as well as the importance of pricing mechanisms that reflect the capacity and capability of national statistical offices.
  • Case studies referenced in the paper include:
    • A mobile network operator’s (MNO Telefonica) in house analysis of call detail records;
    • A third-party data provider and steward of travel statistics (Positium);
    • The Data for Development (D4D) challenge organized by MNO Orange; and
    • Statistics Netherlands’ use of social media to predict consumer confidence.

Stuart, Elizabeth, Samman, Emma, Avis, William, Berliner, Tom. “The data revolution: finding the missing millions.” Overseas Development Institute, 2015. Available from: http://bit.ly/1bPKOjw

  • The authors of this report highlight the need for good quality, relevant, accessible and timely data for governments to extend services into underrepresented communities and implement policies towards a sustainable “data revolution.”
  • The solutions proposed in this report from the Overseas Development Institute focus on the capacity-building activities of national statistical offices (NSOs), alternative sources of data (including shared corporate data) to address gaps, and building strong data management systems.

Taylor, L., & Schroeder, R. “Is bigger better? The emergence of big data as a tool for international development policy.” GeoJournal, 80(4). 2015. 503-518. http://bit.ly/1RZgSy4.

  • This journal article describes how privately held data – namely, “digital traces” of consumer activity – “are becoming seen by policymakers and researchers as a potential solution to the lack of reliable statistical data on lower-income countries.”
  • They focus especially on three categories of data collaborative use cases:
    • Mobile data as a predictive tool for issues such as human mobility and economic activity;
    • Use of mobile data to inform humanitarian response to crises; and
    • Use of born-digital web data as a tool for predicting economic trends, and the implications these have for LMICs.
  • They note, however, that a number of challenges and drawbacks exist for these types of use cases, including:
    • Access to private data sources often must be negotiated or bought, “which potentially means substituting negotiations with corporations for those with national statistical offices;”
    • The meaning of such data is not always simple or stable, and local knowledge is needed to understand how people are using the technologies in question;
    • Bias in proprietary data can be hard to understand and quantify;
    • Lack of privacy frameworks; and
    • Power asymmetries, wherein “LMIC citizens are unwittingly placed in a panopticon staffed by international researchers, with no way out and no legal recourse.”
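  • As a toy illustration of the first use case above (aggregating mobile data into human mobility estimates), the following is a minimal, hypothetical Python sketch that turns call detail records (CDRs) into an origin-destination mobility matrix. It is not drawn from the paper; the subscribers, regions, and column names are invented.

```python
import pandas as pd

# Toy call detail records (CDRs): one row per call, tagged with the
# region of the cell tower that handled it. All values are invented.
cdrs = pd.DataFrame({
    "subscriber": ["u1", "u1", "u1", "u2", "u2", "u3", "u3"],
    "timestamp": pd.to_datetime([
        "2015-03-01 08:10", "2015-03-01 12:40", "2015-03-01 19:05",
        "2015-03-01 09:00", "2015-03-01 21:30",
        "2015-03-01 07:45", "2015-03-01 18:20",
    ]),
    "tower_region": ["A", "B", "A", "B", "B", "A", "C"],
})

# Order each subscriber's calls in time, then pair each tower region
# with the next one observed to get origin -> destination transitions.
cdrs = cdrs.sort_values(["subscriber", "timestamp"])
cdrs["next_region"] = cdrs.groupby("subscriber")["tower_region"].shift(-1)
moves = cdrs.dropna(subset=["next_region"])
moves = moves[moves["tower_region"] != moves["next_region"]]

# Origin-destination matrix: counts of observed inter-region moves.
od_matrix = pd.crosstab(moves["tower_region"], moves["next_region"])
print(od_matrix)
```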

van Panhuis, Willem G., Proma Paul, Claudia Emerson, John Grefenstette, Richard Wilder, Abraham J. Herbst, David Heymann, and Donald S. Burke. “A systematic review of barriers to data sharing in public health.” BMC public health 14, no. 1 (2014): 1144. Available from: http://bit.ly/1JOBruO

  • The authors provide a systematic review of the literature on potential barriers to public health data sharing. Twenty potential barriers are classified in six categories: “technical, motivational, economic, political, legal and ethical.” In this taxonomy, “the first three categories are deeply rooted in well-known challenges of health information systems for which structural solutions have yet to be found; the last three have solutions that lie in an international dialogue aimed at generating consensus on policies and instruments for data sharing.”
  • The authors suggest the need for a “systematic framework of barriers to data sharing in public health” in order to accelerate access and use of data for public good.

Verhulst, Stefaan and Sangokoya, David. “Mapping the Next Frontier of Open Data: Corporate Data Sharing.” In: Gasser, Urs and Zittrain, Jonathan and Faris, Robert and Heacock Jones, Rebekah, “Internet Monitor 2014: Reflections on the Digital World: Platforms, Policy, Privacy, and Public Discourse (December 15, 2014).” Berkman Center Research Publication No. 2014-17. http://bit.ly/1GC12a2

  • This essay describes a taxonomy of current corporate data sharing practices for public good: research partnerships; prizes and challenges; trusted intermediaries; application programming interfaces (APIs); intelligence products; and corporate data cooperatives or pooling.
  • Examples of data collaboratives include: the Yelp Dataset Challenge, the Digital Ecologies Research Partnership, the BBVA Innova Challenge, Telecom Italia’s Big Data Challenge, NIH’s Accelerating Medicines Partnership, and the White House’s Climate Data Partnerships.
  • The authors highlight important questions to consider towards a more comprehensive mapping of these activities.

Verhulst, Stefaan and Sangokoya, David, 2015. “Data Collaboratives: Exchanging Data to Improve People’s Lives.” Medium. Available from: http://bit.ly/1JOBDdy

  • The essay refers to data collaboratives as a new form of collaboration in which participants from different sectors exchange data to help solve public problems. These forms of collaboration can improve people’s lives through data-driven decision-making; information exchange and coordination; and shared standards and frameworks for multi-actor, multi-sector participation.
  • The essay cites four activities that are critical to accelerating data collaboratives: documenting value and measuring impact; matching public demand and corporate supply of data in a trusted way; training and convening data providers and users; and experimenting with and scaling existing initiatives.
  • Examples of data collaboratives include NIH’s Precision Medicine Initiative; the Mobile Data, Environmental Extremes and Population (MDEEP) Project; and Twitter-MIT’s Laboratory for Social Machines.

Verhulst, Stefaan, Susha, Iryna, Kostura, Alexander. “Data Collaboratives: Matching Supply of (Corporate) Data to Solve Public Problems.” Medium. February 24, 2016. http://bit.ly/1ZEp2Sr.

  • This piece articulates a set of key lessons learned during a session at the International Data Responsibility Conference focused on identifying emerging practices, opportunities and challenges confronting data collaboratives.
  • The authors list a number of privately held data sources that could create positive public impacts if made more accessible in a collaborative manner, including:
    • Data for early warning systems to help mitigate the effects of natural disasters;
    • Data to help understand human behavior as it relates to nutrition and livelihoods in developing countries;
    • Data to monitor compliance with weapons treaties; and
    • Data to more accurately measure progress related to the UN Sustainable Development Goals.
  • With the aim of identifying and expanding on emerging practice in the space, the authors describe a number of current data collaborative experiments, including:
    • Trusted Intermediaries: Statistics Netherlands partnered with Vodafone to analyze mobile call data records in order to better understand mobility patterns and inform urban planning.
    • Prizes and Challenges: Orange Telecom, which has been a leader in this type of data collaboration, provided several examples of the company’s initiatives, such as the use of call data records to track the spread of malaria, as well as its experience with the Data for Development (D4D) challenge.
    • Research partnerships: The Data for Climate Action project is an ongoing large-scale initiative incentivizing companies to share their data to help researchers answer particular scientific questions related to climate change and adaptation.
    • Sharing intelligence products: JPMorgan Chase shares macroeconomic insights derived from its data through the newly established JPMorgan Chase Institute.
  • In order to capitalize on the opportunities provided by data collaboratives, a number of needs were identified:
    • A responsible data framework;
    • Increased insight into different business models that may facilitate the sharing of data;
    • Capacity to tap into the potential value of data;
    • Transparent stock of available data supply; and
    • Mapping emerging practices and models of sharing.

Vogel, N., Theisen, C., Leidig, J. P., Scripps, J., Graham, D. H., & Wolffe, G. “Mining mobile datasets to enable the fine-grained stochastic simulation of Ebola diffusion.” Procedia Computer Science. 2015. http://bit.ly/1TZDroF.

  • The paper presents a research study conducted using the mobile call records shared with researchers through the Data for Development (D4D) Challenge organized by the mobile operator Orange.
  • The study describes a data analysis approach for developing a simulation of Ebola diffusion built around “the interactions of multi-scale models, including viral loads (at the cellular level), disease progression (at the individual person level), disease propagation (at the workplace and family level), societal changes in migration and travel movements (at the population level), and mitigating interventions (at the abstract government policy level).”
  • The authors argue that their population, mobility, and simulation models provide more accurate simulation detail than high-level analytical predictions, and that the D4D mobile datasets provide high-resolution information useful for modeling developing regions and hard-to-reach locations; a minimal illustrative sketch of this style of simulation follows below.
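  • Below is a minimal, purely illustrative Python sketch of the general technique the paper builds on: a stochastic compartmental (SIR) simulation in which an inter-region mobility matrix, of the kind derivable from call detail records, couples disease spread across locations. It is not the authors’ model; the regions, populations, mobility values, and rate parameters are all invented.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical regions and a toy mobility matrix: entry (i, j) is the
# expected number of daily travelers from region i to region j,
# standing in for flows estimated from call detail records.
mobility = np.array([
    [0.0, 50.0, 10.0],
    [40.0, 0.0, 5.0],
    [8.0, 4.0, 0.0],
])

N = np.array([100_000, 50_000, 20_000], dtype=float)  # population per region
I = np.array([10.0, 0.0, 0.0])                        # seed infections in region 0
S = N - I                                             # susceptible
R = np.zeros(3)                                       # recovered

beta, gamma = 0.3, 0.1  # illustrative transmission and recovery rates

for day in range(120):
    # Local stochastic SIR dynamics within each region.
    p_inf = 1.0 - np.exp(-beta * I / N)
    new_inf = rng.binomial(S.astype(int), p_inf)
    new_rec = rng.binomial(I.astype(int), 1.0 - np.exp(-gamma))
    S -= new_inf
    I += new_inf - new_rec
    R += new_rec

    # Mobility coupling: infected travelers move between regions in
    # proportion to the prevalence of their origin region.
    # (Migration of susceptible/recovered people is omitted for brevity.)
    inf_travelers = rng.poisson(mobility * (I / N)[:, None])
    I += inf_travelers.sum(axis=0) - inf_travelers.sum(axis=1)
    I = np.clip(I, 0.0, None)

print("Cumulative recovered per region:", R.astype(int))
```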

Welle Donker, F., van Loenen, B., & Bregt, A. K. “Open Data and Beyond.” ISPRS International Journal of Geo-Information, 5(4). 2016. http://bit.ly/22YtugY.

  • This research developed a monitoring framework to assess the effects of open (private) data, using a case study of the Dutch energy network administrator Liander.
  • Focusing on the potential impacts of open private energy data – beyond ‘smart disclosure’ where citizens are given information only about their own energy usage – the authors identify three attainable strategic goals:
    • Continuously optimize performance on services, security of supply, and costs;
    • Improve management of energy flows and insight into energy consumption;
    • Help customers save energy and switch over to renewable energy sources.
  • The authors propose a seven-step framework for assessing the impacts of Liander data, in particular, and open private data more generally:
    • Develop a performance framework that describes what the program is about, including the organization’s mission and strategic goals;
    • Identify the most important elements, or key performance areas which are most critical to understanding and assessing your program’s success;
    • Select the most appropriate performance measures;
    • Determine the gaps between what information you need and what is available;
    • Develop and implement a measurement strategy to address the gaps;
    • Develop a performance report which highlights what you have accomplished and what you have learned;
    • Learn from your experiences and refine your approach as required.
  • While the authors note that the true impacts of this open private data will likely not come into view in the short term, they argue that, “Liander has successfully demonstrated that private energy companies can release open data, and has successfully championed the other Dutch network administrators to follow suit.”

World Economic Forum, 2015. “Data-driven development: pathways for progress.” Geneva: World Economic Forum. http://bit.ly/1JOBS8u

  • This report provides an overview of the existing data deficit and of the value and impact of big data for sustainable development.
  • The authors focus on four main priorities for a sustainable data revolution: commercial incentives and trusted agreements among public- and private-sector actors; the development of shared policy frameworks, legal protections and impact assessments; capacity building at the institutional, community, local and individual levels; and recognition of individuals as both producers and consumers of data.

Selected Readings on Data and Humanitarian Response


By Prianka Srinivasan and Stefaan G. Verhulst *

The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of data and humanitarian response was originally published in 2016.

Data, when used well and in a trusted manner, allows humanitarian organizations to innovate how they respond to emergency events, including better coordination of post-disaster relief efforts, the ability to harness local knowledge to create more targeted relief strategies, and tools to predict and monitor disasters in real time. Consequently, in recent years both multinational groups and community-based advocates have begun to integrate data collection and evaluation strategies into their humanitarian operations, to respond to emergencies better and more quickly. However, this movement poses a number of challenges. Compared to the private sector, humanitarian organizations are often less equipped to successfully analyze and manage big data, which poses a number of risks to the security of victims’ data. Furthermore, the complex power dynamics that exist within humanitarian spaces may be exacerbated by the introduction of new technologies and big data collection mechanisms. Below we share:

  • Selected Reading List (summaries and hyperlinks)
  • Annotated Selected Reading List
  • Additional Readings

Selected Reading List (summaries in alphabetical order)

Data and Humanitarian Response

Risks of Using Big Data in Humanitarian Context

Annotated Selected Reading List (in alphabetical order)

Karlsrud, John. “Peacekeeping 4.0: Harnessing the Potential of Big Data, Social Media, and Cyber Technologies.” Cyberspace and International Relations, 2013. http://bit.ly/235Qb3e

  • This chapter from the book “Cyberspace and International Relations” suggests that advances in big data give humanitarian organizations unprecedented opportunities to prevent and mitigate natural disasters and humanitarian crises. However, the sheer amount of unstructured data necessitates effective “data mining” strategies if multinational organizations are to make the most of this data.
  • By profiling civil-society organizations that use big data in their peacekeeping efforts, Karlsrud suggests that these community-focused initiatives are leading the movement toward analyzing and using big data in countries vulnerable to crisis.
  • The chapter concludes by offering ten recommendations to UN peacekeeping forces to best realize the potential of big data and new technology in supporting their operations.

Mancini, Francesco. “New Technology and the Prevention of Violence and Conflict.” International Peace Institute, 2013. http://bit.ly/1ltLfNV

  • This report from the International Peace Institute examines five case studies to assess how information and communications technologies (ICTs) can help prevent humanitarian conflicts and violence. The findings suggest that context has a significant impact on the effectiveness of ICTs for conflict prevention, and that any strategy must take into account the specific contingencies of the region to be successful.
  • Among the lessons gleaned from the five case studies:
    • New technologies are just one in a variety of tools to combat violence. Consequently, organizations must investigate a variety of complementary strategies to prevent conflicts, and not simply rely on ICTs.
    • Not every community or social group will have the same relationship to technology, and their ability to adopt new technologies is similarly influenced by their context. Therefore, a detailed needs assessment must take place before any new technologies are implemented.
    • New technologies may be co-opted by violent groups seeking to maintain conflict in the region. Consequently, humanitarian groups must be sensitive to existing political actors and be aware of possible negative consequences these new technologies may spark.
    • Local input is integral to support conflict prevention measures, and there exists need for collaboration and awareness-raising with communities to ensure new technologies are sustainable and effective.
    • Information shared among civil-society groups has greater potential to support early-warning systems. This horizontal distribution of information can also help communities hold local leaders accountable.

Meier, Patrick. “Digital Humanitarians: How Big Data Is Changing the Face of Humanitarian Response.” CRC Press, 2015. http://amzn.to/1RQ4ozc

  • This book traces the emergence of “Digital Humanitarians”—people who harness new digital tools and technologies to support humanitarian action. Meier suggests that this has created a “nervous system” to connect people from disparate parts of the world, revolutionizing the way we respond to humanitarian crises.
  • Meier argues that such technology is reconfiguring the structure of the humanitarian space, where victims are no longer simply passive recipients of aid but can contribute to the response alongside other global citizens. This, in turn, makes us more humane and engaged people.

Robertson, Andrew and Olson, Steve. “Using Data Sharing to Improve Coordination in Peacebuilding.” United States Institute of Peace, 2012. http://bit.ly/235QuLm

  • This report provides an overview of a roundtable workshop on Technology, Science and Peace Building held at the United States Institute of Peace. The workshop aimed to investigate how data-sharing techniques can be developed for use in peacebuilding or conflict management.
  • Four main themes emerged from discussions during the workshop:
    • “Data sharing requires working across a technology-culture divide”—Data sharing needs the foundation of a strong relationship, which can depend on sociocultural, rather than technological, factors.
    • “Information sharing requires building and maintaining trust”—These relationships are often built on trust, which can include both technological and social perspectives.
    • “Information sharing requires linking civilian-military policy discussions to technology”—Even when sophisticated data-sharing technologies exist, continuous engagement between different stakeholders is necessary. Therefore, procedures used to maintain civil-military engagement should be broadened to include technology.
    • “Collaboration software needs to be aligned with user needs”—Technology providers need to keep in mind the needs of their users, in this case peacebuilders, in order to ensure sustainability.

United Nations Independent Expert Advisory Group on a Data Revolution for Sustainable Development. “A World That Counts, Mobilizing the Data Revolution.” 2014. https://bit.ly/2Cb3lXq

  • This report focuses on the potential benefits and risks data holds for sustainable development. Included in this is a strategic framework for using and managing data for humanitarian purposes. It describes a need for a multinational consensus to be developed to ensure data is shared effectively and efficiently.
  • It suggests that “people who are counted”—i.e., those who are included in data collection processes—have better development outcomes and a better chance for humanitarian response in emergency or conflict situations.

Whipkey, Katie and Verity, Andrej. “Guidance for Incorporating Big Data into Humanitarian Operations.” Digital Humanitarian Network, 2015. http://bit.ly/1Y2BMkQ

  • This report from the Digital Humanitarian Network provides an overview of big data and of how humanitarian organizations can integrate it into their humanitarian response. It functions primarily as a guide, offering concise outlines of what big data is and how it can benefit humanitarian groups.
  • The report puts forward four main benefits acquired through the use of big data by humanitarian organizations: 1) the ability to leverage real-time information; 2) the ability to make more informed decisions; 3) the ability to learn new insights; 4) the ability for organizations to be more prepared.
  • It goes on to assess challenges that big data poses for humanitarian organizations, including: 1) geography, and the unequal access to technology across regions; 2) the potential for user error when processing data; 3) limited technology; 4) questionable validity of data; 5) underdeveloped policies and ethics relating to data management; and 6) limitations relating to staff knowledge.

Risks of Using Big Data in Humanitarian Context

Crawford, Kate, and Megan Finn. “The limits of crisis data: analytical and ethical challenges of using social and mobile data to understand disasters.” GeoJournal 80.4, 2015. http://bit.ly/1X0F7AI

  • Crawford & Finn present a critical analysis of the use of big data in disaster management, taking a more skeptical tone toward the data revolution in humanitarian response.
  • They argue that though social and mobile data analysis can yield important insights and tools in crisis events, it also has a number of limitations that can lead to oversights by researchers or humanitarian response teams.
  • Crawford & Finn explore the ethical concerns the use of big data in disaster events introduces, including issues of power, privacy, and consent.
  • The paper concludes by recommending that critical data studies, such as those presented in the paper, be integrated into crisis event research in order to analyze some of the assumptions which underlie mobile and social data.

Jacobsen, Katja Lindskov. “Making Design Safe for Citizens: A Hidden History of Humanitarian Experimentation.” Citizenship Studies 14.1 (2010): 89-103. http://bit.ly/1YaRTwG

  • This paper explores the phenomenon of “humanitarian experimentation,” where victims of disaster or conflict are the subjects of experiments to test the application of technologies before they are administered in greater civilian populations.
  • By analyzing the use of iris recognition technology during the repatriation of Afghan refugees from Pakistan between 2002 and 2007, Jacobsen suggests that this “humanitarian experimentation” compromised the security of already vulnerable refugees in order to better deliver biometric products to the rest of the world.

Responsible Data Forum. “Responsible Data Reflection Stories: An Overview.” http://bit.ly/1Rszrz1

  • This piece from the Responsible Data Forum is primarily a compilation of crowdsourced “war stories” that trace some of the challenges of using big data for social good. Drawing on these cases, the Forum presents an overview with key recommendations for overcoming some of the challenges big data poses for humanitarian organizations.
  • It finds that most of these challenges arise when organizations are ill-equipped to manage data and new technologies, or are unaware of the different ways different groups interact in digital spaces.

Sandvik, Kristin Bergtora. “The humanitarian cyberspace: shrinking space or an expanding frontier?” Third World Quarterly 37:1, 17-32, 2016. http://bit.ly/1PIiACK

  • This paper analyzes the shift toward technology-driven humanitarian work, which increasingly takes place online in cyberspace, reshaping the definition and application of aid. This has occurred alongside what many suggest is a shrinking of the humanitarian space.
  • Sandvik provides three interpretations of this phenomenon:
    • First, traditional threats remain in the humanitarian space, which are both modified and reinforced by technology.
    • Second, new threats are introduced by the increasing use of technology in humanitarianism, and consequently the humanitarian space may be broadening, not shrinking.
    • Finally, if the shrinking humanitarian space theory holds, cyberspace offers one example of this, where the increasing use of digital technology to manage disasters leads to a contraction of space through the proliferation of remote services.

Additional Readings on Data and Humanitarian Response

* Thanks to: Kristin B. Sandvik; Zara Rahman; Jennifer Schulte; Sean McDonald; Paul Currion; Dinorah Cantú-Pedraza and the Responsible Data listserv for valuable input.