Selected Readings on Data Collaboratives


By Neil Britto, David Sangokoya, Iryna Susha, Stefaan Verhulst and Andrew Young

The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of data collaboratives was originally published in 2017.

The term data collaborative refers to a new form of collaboration, beyond the public-private partnership model, in which participants from different sectors (including private companies, research institutions, and government agencies) can exchange data to help solve public problems. Several of society’s greatest challenges (from addressing climate change to public health to job creation to improving the lives of children) require greater access to data, more collaboration between public- and private-sector entities, and an increased ability to analyze datasets. In the coming months and years, data collaboratives will be essential vehicles for harnessing the vast stores of privately held data toward the public good.

Annotated Selected Readings List (in alphabetical order)

Agaba, G., Akindès, F., Bengtsson, L., Cowls, J., Ganesh, M., Hoffman, N., . . . Meissner, F. “Big Data and Positive Social Change in the Developing World: A White Paper for Practitioners and Researchers.” 2014. http://bit.ly/25RRC6N.

  • This white paper, produced by “a group of activists, researchers and data experts,” explores the potential of big data to improve development outcomes and spur positive social change in low- and middle-income countries. Using examples, the authors discuss four areas in which the use of big data can impact development efforts:
    • Advocating and facilitating by “open[ing] up new public spaces for discussion and awareness building”;
    • Describing and predicting through the detection of “new correlations and the surfac[ing] of new questions”;
    • Facilitating information exchange through “multiple feedback loops which feed into both research and action”; and
    • Promoting accountability and transparency, especially as a byproduct of crowdsourcing efforts aimed at “aggregat[ing] and analyz[ing] information in real time.”
  • The authors argue that in order to maximize the potential of big data’s use in development, “there is a case to be made for building a data commons for private/public data, and for setting up new and more appropriate ethical guidelines.”
  • They also identify a number of challenges, especially when leveraging data made accessible from a number of sources, including private-sector entities. These challenges include:
    • Lack of general data literacy;
    • Lack of open learning environments and repositories;
    • Lack of resources, capacity and access;
    • Challenges of sensitivity and risk perception with regard to using data;
    • Storage and computing capacity; and
    • Externally validating data sources for comparison and verification.

Ansell, C. and Gash, A. “Collaborative Governance in Theory and Practice.” Journal of Public Administration Research and Theory 18 (4), 2008. http://bit.ly/1RZgsI5.

  • This article describes collaborative arrangements in which public and private organizations work together, and proposes a model for understanding an emergent form of public-private interaction informed by 137 diverse cases of collaborative governance.
  • The article suggests that factors significant to successful partnering processes and outcomes include:
    • Shared understanding of challenges,
    • Trust building processes,
    • The importance of recognizing seemingly modest progress, and
    • Strong indicators of commitment to the partnership’s aspirations and process.
  • The authors provide a “contingency theory model” that specifies relationships between different variables that influence outcomes of collaborative governance initiatives. Three “core contingencies” for successful collaborative governance initiatives identified by the authors are:
    • Time (e.g., decision making time afforded to the collaboration);
    • Interdependence (e.g., a high degree of interdependence can mitigate negative effects of low trust); and
    • Trust (e.g., a higher level of trust indicates a higher probability of success).

Ballivian A, Hoffman W. “Public-Private Partnerships for Data: Issues Paper for Data Revolution Consultation.” World Bank, 2015. Available from: http://bit.ly/1ENvmRJ

  • This World Bank report provides a background document on forming public-private partnerships for data with the private sector in order to inform the UN’s Independent Expert Advisory Group (IEAG) on sustaining a “data revolution” in sustainable development.
  • The report highlights the critical position of private companies within the data value chain and reflects on key elements of a sustainable data PPP: “common objectives across all impacted stakeholders, alignment of incentives, and sharing of risks.” In addition, the report describes the risks and incentives of public and private actors, and the principles needed to “build[ing] the legal, cultural, technological and economic infrastructures to enable the balancing of competing interests.” These principles include understanding; experimentation; adaptability; balance; persuasion and compulsion; risk management; and governance.
  • Examples of data collaboratives cited in the report include HP Earth Insights, Orange Data for Development Challenges, Amazon Web Services, IBM Smart Cities Initiative, and the Governance Lab’s Open Data 500.

Brack, Matthew, and Tito Castillo. “Data Sharing for Public Health: Key Lessons from Other Sectors.” Chatham House, Centre on Global Health Security. April 2015. Available from: http://bit.ly/1DHFGVl

  • The Chatham House report provides an overview of public health surveillance data sharing, highlighting the benefits and challenges of shared health data and the complexity of adapting technical solutions from other sectors for public health.
  • The report describes data sharing processes from several perspectives, including in-depth case studies of actual data sharing in practice at the individual, organizational and sector levels. Among the key lessons for public health data sharing, the report strongly highlights the need to harness momentum for action and maintain collaborative engagement: “Successful data sharing communities are highly collaborative. Collaboration holds the key to producing and abiding by community standards, and building and maintaining productive networks, and is by definition the essence of data sharing itself. Time should be invested in establishing and sustaining collaboration with all stakeholders concerned with public health surveillance data sharing.”
  • Examples of data collaboratives include H3Africa (a collaboration between NIH and Wellcome Trust) and NHS England’s care.data programme.

de Montjoye, Yves-Alexandre, Jake Kendall, and Cameron F. Kerry. “Enabling Humanitarian Use of Mobile Phone Data.” The Brookings Institution, Issues in Technology Innovation. November 2014. Available from: http://brook.gs/1JxVpxp

  • Using Ebola as a case study, the authors describe the value of using private telecom data for uncovering “valuable insights into understanding the spread of infectious diseases as well as strategies into micro-target outreach and driving uptake of health-seeking behavior.”
  • The authors highlight the absence of a common legal and standards framework for “sharing mobile phone data in privacy-conscientious ways” and recommend “engaging companies, NGOs, researchers, privacy experts, and governments to agree on a set of best practices for new privacy-conscientious metadata sharing models.”

Eckartz, Silja M., Hofman, Wout J., Van Veenstra, Anne Fleur. “A decision model for data sharing.” Vol. 8653 LNCS. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2014. http://bit.ly/21cGWfw.

  • This paper proposes a decision model for sharing public and private data, based on a literature review and three case studies in the logistics sector.
  • The authors identify five categories of barriers to data sharing and offer a decision model for identifying potential interventions to overcome each barrier:
    • Ownership. Possible interventions likely require improving trust among those who own the data through, for example, involvement and support from higher management.
    • Privacy. Interventions include “anonymization by filtering of sensitive information and aggregation of data,” and access control mechanisms built around identity management and regulated access.
    • Economic. Interventions include a model where data is shared only with a few trusted organizations, and yield management mechanisms to ensure negative financial consequences are avoided.
    • Data quality. Interventions include identifying additional data sources that could improve the completeness of datasets, and efforts to improve metadata.
    • Technical. Interventions include making data available in structured formats and publishing data according to widely agreed upon data standards.
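
As a rough illustration of the privacy interventions listed above (“filtering of sensitive information and aggregation of data”), the following Python sketch drops identifying fields from a few made-up logistics records and then shares only per-region totals. It is not taken from the paper: the records, the field names, and the helper functions `filter_sensitive` and `aggregate_by` are all hypothetical, and filtering plus aggregation alone does not guarantee anonymity.

```python
from collections import defaultdict

# Hypothetical shipment records; every field name here is an illustrative assumption.
RECORDS = [
    {"driver_name": "A. Jansen", "license_plate": "12-ABC-3", "region": "Rotterdam", "weight_kg": 1200},
    {"driver_name": "B. de Vries", "license_plate": "45-DEF-6", "region": "Rotterdam", "weight_kg": 800},
    {"driver_name": "C. Bakker", "license_plate": "78-GHI-9", "region": "Utrecht", "weight_kg": 500},
]

# Fields assumed to be sensitive for this example.
SENSITIVE_FIELDS = {"driver_name", "license_plate"}


def filter_sensitive(record, sensitive=SENSITIVE_FIELDS):
    """Return a copy of the record without the sensitive fields."""
    return {k: v for k, v in record.items() if k not in sensitive}


def aggregate_by(records, key, value):
    """Sum `value` per `key`, so that only group totals are shared."""
    totals = defaultdict(int)
    for record in records:
        totals[record[key]] += record[value]
    return dict(totals)


filtered = [filter_sensitive(r) for r in RECORDS]
totals = aggregate_by(filtered, "region", "weight_kg")
print(totals)  # {'Rotterdam': 2000, 'Utrecht': 500}
```

In practice, small groups can still reveal individuals, so a data holder would likely combine this with minimum group sizes or the access controls the authors mention.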

Hoffman, Sharona and Podgurski, Andy. “The Use and Misuse of Biomedical Data: Is Bigger Really Better?” American Journal of Law & Medicine 497, 2013. http://bit.ly/1syMS7J.

  • This journal article explores the benefits and, in particular, the risks related to large-scale biomedical databases bringing together health information from a diversity of sources across sectors. Some data collaboratives examined in the piece include:
    • MedMining – a company that extracts EHR data, de-identifies it, and offers it to researchers. The data sets that MedMining delivers to its customers include ‘lab results, vital signs, medications, procedures, diagnoses, lifestyle data, and detailed costs’ from inpatient and outpatient facilities.
    • Explorys has formed a large healthcare database derived from financial, administrative, and medical records. It has partnered with major healthcare organizations such as the Cleveland Clinic Foundation and Summa Health System to aggregate and standardize health information from ten million patients and over thirty billion clinical events.
  • Hoffman and Podgurski note that biomedical databases have many potential uses, with those likely to benefit including: “researchers, regulators, public health officials, commercial entities, lawyers,” as well as “healthcare providers who conduct quality assessment and improvement activities,” regulatory monitoring entities like the FDA, and “litigants in tort cases to develop evidence concerning causation and harm.”
  • They argue, however, that risks arise because:
    • The data contained in biomedical databases is surprisingly likely to be incorrect or incomplete;
    • Systemic biases, arising from both the nature of the data and the preconceptions of investigators, are serious threats to the validity of research results, especially in answering causal questions; and
    • Data mining of biomedical databases makes it easier for individuals with political, social, or economic agendas to generate ostensibly scientific but misleading research findings for the purpose of manipulating public opinion and swaying policymakers.

Krumholz, Harlan M., et al. “Sea Change in Open Science and Data Sharing Leadership by Industry.” Circulation: Cardiovascular Quality and Outcomes 7.4. 2014. 499-504. http://1.usa.gov/1J6q7KJ

  • This article provides a comprehensive overview of industry-led efforts and cross-sector collaborations in data sharing by pharmaceutical companies to inform clinical practice.
  • The article details the types of data being shared and the early activities of GlaxoSmithKline (“in coordination with other companies such as Roche and ViiV”); Medtronic and the Yale University Open Data Access Project; and Janssen Pharmaceuticals (Johnson & Johnson). The article also describes the range of involvement in data sharing among pharmaceutical companies including Pfizer, Novartis, Bayer, AbbVie, Eli Lilly, AstraZeneca, and Bristol-Myers Squibb.

Mann, Gideon. “Private Data and the Public Good.” Medium. May 17, 2016. http://bit.ly/1OgOY68.

  • This Medium post from Gideon Mann, the Head of Data Science at Bloomberg, shares his prepared remarks given at a lecture at the City College of New York. Mann argues for the potential benefits of increasing access to private sector data, both to improve research and academic inquiry and also to help solve practical, real-world problems. He also describes a number of initiatives underway at Bloomberg along these lines.
  • Mann argues that data generated at private companies “could enable amazing discoveries and research,” but is often inaccessible to those who could put it to those uses. Beyond research, he notes that corporate data could, for instance, benefit:
    • Public health – including suicide prevention, addiction counseling and mental health monitoring.
    • Legal and ethical questions – especially as they relate to “the role algorithms have in decisions about our lives,” such as credit checks and resume screening.
  • Mann recognizes the privacy challenges inherent in private sector data sharing, but argues that it is a common misconception that the only two choices are “complete privacy or complete disclosure.” He believes that flexible frameworks for differential privacy could open up new opportunities for responsibly leveraging data collaboratives.
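
To make the idea of a middle ground between “complete privacy or complete disclosure” concrete, here is a minimal Python sketch of the Laplace mechanism, a standard building block of differential privacy. It is an illustration only, not a description of Mann’s or Bloomberg’s approach: the function names `laplace_noise` and `private_count`, the example data, and the choice of epsilon are all assumptions made for this sketch.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(values, predicate, epsilon, rng=None):
    """Return a noisy count of the items matching `predicate`.

    A counting query has sensitivity 1 (adding or removing one person
    changes the count by at most 1), so the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical usage: how many of 100 made-up user ages are under 30?
ages = list(range(100))
noisy = private_count(ages, lambda age: age < 30, epsilon=1.0)
print(f"noisy count: {noisy:.2f}")  # varies per run; the true count is 30
```

A smaller epsilon adds more noise and gives stronger privacy, while a larger epsilon gives more accurate answers; choosing that trade-off is the kind of flexibility the post alludes to.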

Pastor Escuredo, D., Morales-Guzmán, A. et al, “Flooding through the Lens of Mobile Phone Activity.” IEEE Global Humanitarian Technology Conference, GHTC 2014. Available from: http://bit.ly/1OzK2bK

  • This report describes how mobile data can be used to understand the impact of disasters and improve disaster management. The study was conducted in the Mexican state of Tabasco in 2009 by a multidisciplinary, multi-stakeholder consortium involving the UN World Food Programme (WFP), Telefonica Research, Technical University of Madrid (UPM), Digital Strategy Coordination Office of the President of Mexico, and UN Global Pulse.
  • Telefonica Research, a division of the major Latin American telecommunications company, provided call detail records covering flood-affected areas for nine months. This data was combined with “remote sensing data (satellite images), rainfall data, census and civil protection data.” The results demonstrated that “analysing mobile activity during floods could be used to potentially locate damaged areas, efficiently assess needs and allocate resources (for example, sending supplies to affected areas).”
  • In addition to the results, the study highlighted “the value of a public-private partnership on using mobile data to accurately indicate flooding impacts in Tabasco, thus improving early warning and crisis management.”

Perkmann, M. and Schildt, H. “Open data partnerships between firms and universities: The role of boundary organizations.” Research Policy, 44(5), 2015. http://bit.ly/25RRJ2c

  • This paper discusses the concept of a “boundary organization” in relation to industry-academic partnerships driven by data. Boundary organizations perform mediated revealing, allowing firms to disclose their research problems to a broad audience of innovators and simultaneously minimize the risk that this information would be adversely used by competitors.
  • The authors identify two especially important challenges facing private firms that want to open their data or participate in data collaboratives with the academic research community, both of which could be addressed through more involvement from boundary organizations:
    • The first is the challenge of maintaining competitive advantage. The authors note that “the more a firm attempts to align the efforts in an open data research programme with its R&D priorities, the more it will have to reveal about the problems it is addressing within its proprietary R&D.”
    • The second is the misalignment of incentives between the private and academic fields. Perkmann and Schildt argue that a firm seeking to build collaborations around its opened data “will have to provide suitable incentives that are aligned with academic scientists’ desire to be rewarded for their work within their respective communities.”

Robin, N., Klein, T., & Jütting, J. “Public-Private Partnerships for Statistics: Lessons Learned, Future Steps.” OECD. 2016. http://bit.ly/24FLYlD.

  • This working paper acknowledges the growing body of work on how different types of data (e.g., telecom data, social media, sensors and geospatial data) can address data gaps relevant to National Statistical Offices (NSOs).
  • Four models of public-private interaction for statistics are described: in-house production of statistics by a data provider for an NSO, transfer of datasets to NSOs from private entities, transfer of data to a third-party provider to manage the NSO and private entity data, and the outsourcing of NSO functions.
  • The paper highlights challenges to public-private partnerships involving data (e.g., technical challenges, data confidentiality, risks, and limited incentives for participation), suggests that deliberate and highly structured approaches to such partnerships require enforceable contracts, and emphasizes both the trade-off between data specificity and accessibility and the importance of pricing mechanisms that reflect the capacity and capability of national statistical offices.
  • Case studies referenced in the paper include:
    • A mobile network operator’s (MNO Telefonica) in-house analysis of call detail records;
    • A third-party data provider and steward of travel statistics (Positium);
    • The Data for Development (D4D) challenge organized by MNO Orange; and
    • Statistics Netherlands’ use of social media to predict consumer confidence.

Stuart, Elizabeth, Samman, Emma, Avis, William, Berliner, Tom. “The data revolution: finding the missing millions.” Overseas Development Institute, 2015. Available from: http://bit.ly/1bPKOjw

  • The authors of this report highlight the need for good quality, relevant, accessible and timely data for governments to extend services into underrepresented communities and implement policies towards a sustainable “data revolution.”
  • The solutions proposed in this report from the Overseas Development Institute focus on capacity-building activities of national statistical offices (NSOs), alternative sources of data (including shared corporate data) to address gaps, and building strong data management systems.

Taylor, L., & Schroeder, R. “Is bigger better? The emergence of big data as a tool for international development policy.” GeoJournal, 80(4). 2015. 503-518. http://bit.ly/1RZgSy4.

  • This journal article describes how privately held data – namely “digital traces” of consumer activity – “are becoming seen by policymakers and researchers as a potential solution to the lack of reliable statistical data on lower-income countries.”
  • They focus especially on three categories of data collaborative use cases:
    • Mobile data as a predictive tool for issues such as human mobility and economic activity;
    • Use of mobile data to inform humanitarian response to crises; and
    • Use of born-digital web data as a tool for predicting economic trends, and the implications these have for LMICs.
  • They note, however, that a number of challenges and drawbacks exist for these types of use cases, including:
    • Access to private data sources often must be negotiated or bought, “which potentially means substituting negotiations with corporations for those with national statistical offices;”
    • The meaning of such data is not always simple or stable, and local knowledge is needed to understand how people are using the technologies in question;
    • Bias in proprietary data can be hard to understand and quantify;
    • Lack of privacy frameworks; and
    • Power asymmetries, wherein “LMIC citizens are unwittingly placed in a panopticon staffed by international researchers, with no way out and no legal recourse.”

van Panhuis, Willem G., Proma Paul, Claudia Emerson, John Grefenstette, Richard Wilder, Abraham J. Herbst, David Heymann, and Donald S. Burke. “A systematic review of barriers to data sharing in public health.” BMC public health 14, no. 1 (2014): 1144. Available from: http://bit.ly/1JOBruO

  • The authors of this report provide a “systematic literature review of potential barriers to public health data sharing.” These twenty potential barriers are classified in six categories: “technical, motivational, economic, political, legal and ethical.” In this taxonomy, “the first three categories are deeply rooted in well-known challenges of health information systems for which structural solutions have yet to be found; the last three have solutions that lie in an international dialogue aimed at generating consensus on policies and instruments for data sharing.”
  • The authors suggest the need for a “systematic framework of barriers to data sharing in public health” in order to accelerate access and use of data for public good.

Verhulst, Stefaan and Sangokoya, David. “Mapping the Next Frontier of Open Data: Corporate Data Sharing.” In: Gasser, Urs and Zittrain, Jonathan and Faris, Robert and Heacock Jones, Rebekah, “Internet Monitor 2014: Reflections on the Digital World: Platforms, Policy, Privacy, and Public Discourse (December 15, 2014).” Berkman Center Research Publication No. 2014-17. http://bit.ly/1GC12a2

  • This essay describes a taxonomy of current corporate data sharing practices for public good: research partnerships; prizes and challenges; trusted intermediaries; application programming interfaces (APIs); intelligence products; and corporate data cooperatives or pooling.
  • Examples of data collaboratives include: Yelp Dataset Challenge, the Digital Ecologies Research Partnership, BBVA Innova Challenge, Telecom Italia’s Big Data Challenge, NIH’s Accelerating Medicines Partnership and the White House’s Climate Data Partnerships.
  • The authors highlight important questions to consider towards a more comprehensive mapping of these activities.

Verhulst, Stefaan and Sangokoya, David, 2015. “Data Collaboratives: Exchanging Data to Improve People’s Lives.” Medium. Available from: http://bit.ly/1JOBDdy

  • The essay refers to data collaboratives as a new form of collaboration involving participants from different sectors exchanging data to help solve public problems. These forms of collaborations can improve people’s lives through data-driven decision-making; information exchange and coordination; and shared standards and frameworks for multi-actor, multi-sector participation.
  • The essay cites four activities that are critical to accelerating data collaboratives: documenting value and measuring impact; matching public demand and corporate supply of data in a trusted way; training and convening data providers and users; experimenting and scaling existing initiatives.
  • Examples of data collaboratives include NIH’s Precision Medicine Initiative; the Mobile Data, Environmental Extremes and Population (MDEEP) Project; and Twitter-MIT’s Laboratory for Social Machines.

Verhulst, Stefaan, Susha, Iryna, Kostura, Alexander. “Data Collaboratives: matching Supply of (Corporate) Data to Solve Public Problems.” Medium. February 24, 2016. http://bit.ly/1ZEp2Sr.

  • This piece articulates a set of key lessons learned during a session at the International Data Responsibility Conference focused on identifying emerging practices, opportunities and challenges confronting data collaboratives.
  • The authors list a number of privately held data sources that could create positive public impacts if made more accessible in a collaborative manner, including:
    • Data for early warning systems to help mitigate the effects of natural disasters;
    • Data to help understand human behavior as it relates to nutrition and livelihoods in developing countries;
    • Data to monitor compliance with weapons treaties;
    • Data to more accurately measure progress related to the UN Sustainable Development Goals.
  • To the end of identifying and expanding on emerging practice in the space, the authors describe a number of current data collaborative experiments, including:
    • Trusted Intermediaries: Statistics Netherlands partnered with Vodafone to analyze mobile call data records in order to better understand mobility patterns and inform urban planning.
    • Prizes and Challenges: Orange Telecom, which has been a leader in this type of Data Collaboration, provided several examples of the company’s initiatives, such as the use of call data records to track the spread of malaria as well as their experience with Challenge 4 Development.
    • Research partnerships: The Data for Climate Action project is an ongoing large-scale initiative incentivizing companies to share their data to help researchers answer particular scientific questions related to climate change and adaptation.
    • Sharing intelligence products: JPMorgan Chase shares macroeconomic insights gained by leveraging its data through the newly established JPMorgan Chase Institute.
  • In order to capitalize on the opportunities provided by data collaboratives, a number of needs were identified:
    • A responsible data framework;
    • Increased insight into different business models that may facilitate the sharing of data;
    • Capacity to tap into the potential value of data;
    • Transparent stock of available data supply; and
    • Mapping emerging practices and models of sharing.

Vogel, N., Theisen, C., Leidig, J. P., Scripps, J., Graham, D. H., & Wolffe, G. “Mining mobile datasets to enable the fine-grained stochastic simulation of Ebola diffusion.” Paper presented at the Procedia Computer Science. 2015. http://bit.ly/1TZDroF.

  • The paper presents a research study based on mobile call records that the mobile operator Orange shared with researchers through the Data for Development Challenge.
  • The study discusses the data analysis approach in relation to developing a simulation of Ebola diffusion built around “the interactions of multi-scale models, including viral loads (at the cellular level), disease progression (at the individual person level), disease propagation (at the workplace and family level), societal changes in migration and travel movements (at the population level), and mitigating interventions (at the abstract government policy level).”
  • The authors argue that the use of their population, mobility, and simulation models provides more accurate simulation details in comparison to high-level analytical predictions and that the D4D mobile datasets provide high-resolution information useful for modeling developing regions and hard to reach locations.

Welle Donker, F., van Loenen, B., & Bregt, A. K. “Open Data and Beyond.” ISPRS International Journal of Geo-Information, 5(4). 2016. http://bit.ly/22YtugY.

  • This research develops a monitoring framework to assess the effects of open (private) data, using a case study of the Dutch energy network administrator Liander.
  • Focusing on the potential impacts of open private energy data – beyond ‘smart disclosure’ where citizens are given information only about their own energy usage – the authors identify three attainable strategic goals:
    • Continuously optimize performance on services, security of supply, and costs;
    • Improve management of energy flows and insight into energy consumption;
    • Help customers save energy and switch over to renewable energy sources.
  • The authors propose a seven-step framework for assessing the impacts of Liander data, in particular, and open private data more generally:
    • Develop a performance framework to describe what the program is about, including a description of the organization’s mission and strategic goals;
    • Identify the most important elements, or key performance areas which are most critical to understanding and assessing your program’s success;
    • Select the most appropriate performance measures;
    • Determine the gaps between what information you need and what is available;
    • Develop and implement a measurement strategy to address the gaps;
    • Develop a performance report which highlights what you have accomplished and what you have learned;
    • Learn from your experiences and refine your approach as required.
  • While the authors note that the true impacts of this open private data will likely not come into view in the short term, they argue that, “Liander has successfully demonstrated that private energy companies can release open data, and has successfully championed the other Dutch network administrators to follow suit.”

World Economic Forum, 2015. “Data-driven development: pathways for progress.” Geneva: World Economic Forum. http://bit.ly/1JOBS8u

  • This report provides an overview of the existing data deficit and the value and impact of big data for sustainable development.
  • The authors of the report focus on four main priorities towards a sustainable data revolution: commercial incentives and trusted agreements with public- and private-sector actors; the development of shared policy frameworks, legal protections and impact assessments; capacity building activities at the institutional, community, local and individual level; and lastly, recognizing individuals as both producers and consumers of data.

Do Open Comment Processes Increase Regulatory Compliance? Evidence from a Public Goods Experiment


Stephen N. Morgan, Nicole M. Mason and Robert S. Shupp at EconPapers: “Agri-environmental programs often incorporate stakeholder participation elements in an effort to increase community ownership of policies designed to protect environmental resources (Hajer 1995; Fischer 2000). Participation – acting through increased levels of ownership – is then expected to increase individual rates of compliance with regulatory policies. Utilizing a novel lab experiment, this research leverages a public goods contribution game to test the effects of a specific type of stakeholder participation scheme on individual compliance outcomes. We find significant evidence that the implemented type of non-voting participation mechanism reduces the probability that an individual will engage in noncompliant behavior and reduces the level of noncompliance. At the same time, exposure to the open comment treatment also increases individual contributions to a public good. Additionally, we find evidence that exposure to participation schemes results in a faster decay in individual compliance over time suggesting that the impacts of this type of participation mechanism may be transitory….(More)”

An App to Save Syria’s Lost Generation? What Technology Can and Can’t Do


in Foreign Affairs: “In January this year, when the refugee and migrant crisis in Europe had hit its peak—more than a million had crossed into Europe over the course of 2015—the U.S. State Department and Google hosted a forum of over 100 technology experts. The goal was to “bridge the education gap for Syrian refugee children.” Speaking to the group assembled at Stanford University, Deputy Secretary of State Antony Blinken announced a $1.7 million prize “to develop a smartphone app that can help Syrian children learn how to read and improve their wellbeing.” The competition, known as EduApp4Syria, is being run by the Norwegian Agency for Development Cooperation (Norad) and is supported by the Australian government and the French mobile company Orange.

Less than a month later, a group called Techfugees brought together over 100 technologists for a daylong brainstorm in New York City focused exclusively on education solutions. “We are facing the largest refugee crisis since World War II,” said U.S. Ambassador to the United Nations Samantha Power to open the conference. “It is a twenty-first-century crisis and we need a twenty-first-century solution.” Among the more promising, according to Power, were apps that enable “refugees to access critical services,” new “web platforms connecting refugees with one another,” and “education programs that teach refugees how to code.”

For example, the nonprofit PeaceGeeks created the Services Advisor app for the UN Refugee Agency, which maps the location of shelters, food distribution centers, and financial services in Jordan….(More)”

Teenage scientists enlisted to fight Zika


ShareAmerica: “A mosquito’s a mosquito, right? Not when it comes to Zika and other mosquito-borne diseases.

Only two of the estimated 3,000 species of mosquitoes are capable of carrying the Zika virus in the United States, but estimates of their precise range remain hazy, according to the U.S. Centers for Disease Control and Prevention.

Scientists could start getting better information about these pesky, but important, insects with the help of plastic cups, brown paper towels and teenage biology students.

As part of the Invasive Mosquito Project from the U.S. Department of Agriculture, secondary-school students nationwide are learning about mosquito populations and helping fill the knowledge gaps.

Simple experiment, complex problem

The experiment works like this: First, students line the cups with paper, then fill two-thirds of the cups with water. Students place the plastic cups outside, and after a week, the paper is dotted with what looks like specks of dirt. These dirt particles are actually mosquito eggs, which the students can identify and classify.

Students then upload their findings to a national crowdsourced database. Crowdsourcing uses the collective intelligence of online communities to “distribute” problem solving across a massive network.

Entomologist Lee Cohnstaedt of the U.S. Department of Agriculture coordinates the program, and he’s already thinking about expansion. He said he hopes to have one-fifth of U.S. schools participate in the mosquito species census. He also plans to adapt lesson plans for middle schools, Scouting troops and gardening clubs.

Already, crowdsourcing has “collected better data than we could have working alone,” he told the Associated Press….

In addition to mosquito tracking, crowdsourcing has been used to develop innovative responses to a number of complex challenges, from climate change to archaeology to protein modeling….(More)”

Big data: big power shifts?


Special issue of Internet Policy Review: “Facing general conceptions of the power effects of big data, this thematic edition is interested in studies that scrutinise big data and power in concrete fields of application. It brings together scholars from different disciplines who analyse the fields agriculture, education, border control and consumer policy. As will be made explicit in the following, each of the articles tells us something about firstly, what big data is and how it relates to power. They secondly also shed light on how we should shape “the big data society” and what research questions need to be answered to be able to do so….

The ethics of big data in big agriculture
Isabelle M. Carbonell, University of California, Santa Cruz

Regulating “big data education” in Europe: lessons learned from the US
Yoni Har Carmel, University of Haifa

The borders, they are a-changin’! The emergence of socio-digital borders in the EU
Magdalena König, Maastricht University

Beyond consent: improving data protection through consumer protection law
Michiel Rhoen, Leiden University…

(More)”

Health care data as a public utility: how do we get there?


Mohit Kaushal and Margaret Darling at Brookings: “Forty-six million Americans use mobile fitness and health apps. Over half of providers serving Medicare or Medicaid patients are using electronic health records (EHRs). Despite such advances and proliferation of health data and its collection, we are not yet on an inevitable path to unleashing the often-promised “power of data” because data remain proprietary and fragmented among insurers, providers, health record companies, government agencies, and researchers.

Despite the technological integration seen in banking and other industries, health care data has remained scattered and inaccessible. EHRs remain fragmented among 861 distinct ambulatory vendors and 277 inpatient vendors as of 2013. Similarly, insurance claims are stored in the databases of insurers, and information about public health—including information about the social determinants of health, such as housing, food security, safety, and education—is often kept in databases belonging to various governmental agencies. These silos wouldn’t necessarily be a problem, except for the lack of interoperability that has long plagued the health care industry.

For this reason, many are reconsidering if health care data is a public good, provided to all members of the public without profit. This idea is not new. In fact, the Institute of Medicine established the Roundtable on Value and Science-Driven Healthcare, citing that:

“A significant challenge to progress resides in the barriers and restrictions that derive from the treatment of medical care data as a proprietary commodity by the organizations involved. Even clinical research and medical care data developed with public funds are often not available for broader analysis and insights. Broader access and use of healthcare data for new insights require not only fostering data system reliability and interoperability but also addressing the matter of individual data ownership and the extent to which data central to progress in health and health care should constitute a public good.”

Indeed, publicly available health care data holds the potential to unlock many innovations, much like what public goods have done in other industries. As publicly available weather data has shown, the public utility of open access information is not only good for consumers, it is good for businesses…(More)”

BeMyEye: Crowdsourcing is making it easier to gather data fast


Jack Torrance at Management Today: “The era of big data is upon us. Dozens of well-funded start-ups have sprung up of late claiming to be able to turn raw data into ‘actionable insights’ that would have been unimaginable a few years ago. But the process of actually collecting data is still not always straightforward….

London-based start-up BeMyEye (not to be confused with Be My Eyes, an iPhone app that claims to ‘help the blind see’) has built an army of casual data gatherers that report back via their phones. ‘For companies that sell their product to high street retailers or supermarkets, being able to verify the presence of their product, the adequacy of the promotions, the positioning in relation to competitors, this is all invaluable intelligence,’ CEO Luca Pagano tells MT. ‘Our crowd is able to observe and feed this information back to these brands very, very quickly.’…

They can do more than check prices in shops. Some of its clients (which include Heineken, Illy and Three) have used the service to check billboards they are paying for have actually been put up correctly. ‘We realised the level of [billboard] compliance is actually below 90%,’ says Pagano. It can also be used to generate sales leads….

BeMyEye isn’t the only company that’s exploring this business model. San Francisco company Premise is using a similar network of data gatherers to monitor food prices and other metrics in developing countries for NGOs and governments as well as commercial organisations. It’s not hard to see why they would be an attractive proposition for clients, but the challenge for both of these businesses will be ensuring they can find enough reliable and effective data gatherers to keep the information flowing in at a high enough quality….(More)”

Building Data Responsibility into Humanitarian Action


Stefaan Verhulst at The GovLab: “Next Monday, May 23rd, governments, non-profit organizations and citizen groups will gather in Istanbul at the first World Humanitarian Summit. A range of important issues will be on the agenda, not least of which the refugee crisis confronting the Middle East and Europe. Also on the agenda will be an issue of growing importance and relevance, even if it does not generate front-page headlines: the increasing potential (and use) of data in the humanitarian context.

To explore this topic, a new paper, “Building Data Responsibility into Humanitarian Action,” is being released today, and will be presented tomorrow at the Understanding Risk Forum. This paper is the result of a collaboration between the United Nations Office for the Coordination of Humanitarian Affairs (OCHA), The GovLab (NYU Tandon School of Engineering), the Harvard Humanitarian Initiative, and Leiden University Centre for Innovation. It seeks to identify the potential benefits and risks of using data in the humanitarian context, and begins to outline an initial framework for the responsible use of data in humanitarian settings.

Both anecdotal and more rigorously researched evidence points to the growing use of data to address a variety of humanitarian crises. The paper discusses a number of data risk case studies, including the use of call data to fight Malaria in Africa; satellite imagery to identify security threats on the border between Sudan and South Sudan; and transaction data to increase the efficiency of food delivery in Lebanon. These early examples (along with a few others discussed in the paper) have begun to show the opportunities offered by data and information. More importantly, they also help us better understand the risks, including and especially those posed to privacy and security.

One of the broader goals of the paper is to integrate the specific and the theoretical, in the process building a bridge between the deep, contextual knowledge offered by initiatives like those discussed above and the broader needs of the humanitarian community. To that end, the paper builds on its discussion of case studies to begin establishing a framework for the responsible use of data in humanitarian contexts. It identifies four “Minimum Humanitarian standards for the Responsible use of Data” and four “Characteristics of Humanitarian Organizations that use Data Responsibly.” Together, these eight attributes can serve as a roadmap or blueprint for humanitarian groups seeking to use data. In addition, the paper also provides a four-step practical guide for a data responsibility framework (see also earlier blog)….(More)” Full Paper: Building Data Responsibility into Humanitarian Action

Society’s biggest problems need more than a nudge


At The Conversation: “So-called “nudge units” are popping up in governments all around the world.

The best-known examples include the U.K.’s Behavioural Insights Team, created in 2010, and the White House-based Social and Behavioral Sciences Team, introduced by the Obama administration in 2014. Their mission is to leverage findings from behavioral science so that people’s decisions can be nudged in the direction of their best intentions without curtailing their ability to make choices that don’t align with their priorities.

Overall, these – and other – governments have made important strides when it comes to using behavioral science to nudge their constituents into better choices.

Yet, the same governments have done little to improve their own decision-making processes. Consider big missteps like the Flint water crisis. How could officials in Michigan decide to place an essential service – safe water – and almost 100,000 people at risk in order to save US$100 per day for three months? No defensible decision-making process should have allowed this call to be made.

When it comes to many of the big decisions faced by governments – and the private sector – behavioral science has more to offer than simple nudges.

Behavioral scientists who study decision-making processes could also help policy-makers understand why things went wrong in Flint, and how to get their arms around a wide array of society’s biggest problems – from energy transitions to how to best approach the refugee crisis in Syria.

When nudges are enough

The idea of nudging people in the direction of decisions that are in their own best interest has been around for a while. But it was popularized in 2008 with the publication of the bestseller “Nudge” by Richard Thaler of the University of Chicago and Cass Sunstein of Harvard.

A common nudge goes something like this: if we want to eat better but are having a hard time doing it, choice architects can reengineer the environment in which we make our food choices so that healthier options are intuitively easier to select, without making it unrealistically difficult to eat junk food if that’s what we’d rather do. So, for example, we can shelve healthy foods at eye level in supermarkets, with less-healthy options relegated to the shelves nearer to the floor….

Sometimes a nudge isn’t enough

Nudges work for a wide array of choices, from ones we face every day to those that we face infrequently. Likewise, nudges are particularly well-suited to decisions that are complex with lots of different alternatives to choose from. And, they are advocated in situations where the outcomes of our decisions are delayed far enough into the future that they feel uncertain or abstract. This describes many of the big decisions policy-makers face, so it makes sense to think the solution must be more nudge units.

But herein lies the rub. For every context where a nudge seems like a realistic option, there’s at least another context where the application of passive decision support would either be impossible – or, worse, a mistake.

Take, for example, the question of energy transitions. These transitions are often characterized by the move from infrastructure based on fossil fuels to renewables to address all manner of risks, including those from climate change. These are decisions that society makes infrequently. They are complex. And, the outcomes – which are based on our ability to meet conflicting economic, social and environmental objectives – will be delayed.

But, absent regulation that would place severe restrictions on the kinds of options we could choose from – and which, incidentally, would violate the freedom-of-choice tenet of choice architecture – there’s no way to put renewable infrastructure options at proverbial eye level for state or federal decision-makers, or their stakeholders.

Simply put, a nudge for a decision like this would be impossible. In these cases, decisions have to be made the old-fashioned way: with a heavy lift instead of a nudge.

Complex policy decisions like this require what we call active decision support….(More)”

Fifty Shades of Open


Jeffrey Pomerantz and Robin Peek at First Monday: “Open source. Open access. Open society. Open knowledge. Open government. Even open food. Until quite recently, the word “open” had a fairly constant meaning. The over-use of the word “open” has led to its meaning becoming increasingly ambiguous. This presents a critical problem for this important word, as ambiguity leads to misinterpretation.

“Open” has been applied to a wide variety of words to create new terms, some of which make sense, and some not so much. When we started writing this essay, we thought our working title was simply amusing. But the working title became the actual title, as we found that there are at least 50 different terms in which the word “open” is used, encompassing nearly as many different criteria for openness. In this essay we will attempt to make sense of this open season on the word “open.”

Opening the door on open

The word “open” is, perhaps unsurprisingly, a very old one in the English language, harking back to Early Old English. Unlike some words in English, the definition of “open” has changed very little in the intervening thousand-plus years: the earliest recorded uses of the word are completely consistent with its modern usage as an adjective, indicating a passage through or an access into something (Oxford English Dictionary, 2016).

This meaning leads to the development in the fifteenth century of the phrases “open house,” meaning an establishment in which all are welcome, and “open air,” meaning unenclosed outdoor spaces. One such unenclosed outdoor space that figured large in the fifteenth century, and continues to do so today, is the Commons (Hardin, 1968): land or other resources that are not privately owned, but are available for use to all members of a community. The word “open” in these phrases indicates that all have access to a shared resource. All are welcome to visit an open house, but not to move in; all are welcome to walk in the open air or graze their sheep on the Commons, but not to fence the Commons as part of their backyard. (And the moment at which Commons land ceases to be open is precisely the moment it is fenced by an owner, which is in fact what happened in Great Britain during the Enclosure movement of the sixteenth through eighteenth centuries.)

Running against the grain of this cultural movement to enclosure, the nineteenth century saw the circulating library become the norm — rather than libraries in which massive tomes were literally chained to desks. The interpretation of the word “open” to mean a shared resource to which all had access, fit neatly into the philosophy of the modern library movement of the nineteenth century. The phrases “open shelves” and “open stacks” emerged at this time, referring to resources that were directly available to library users, without necessarily requiring intervention by a librarian. Naturally, however, not all library resources were made openly available, nor are they even today. Furthermore, resources are made openly available with the understanding that, like Commons land, they must be shared: library resources have a due date.

The twentieth century saw an increase in the use of the word “open,” as well as a hint of the confusion that was to come about the interpretation of the word. The term “open society” was coined prior to World War I, to indicate a society tolerant of religious diversity. The “open skies” policy enables a nation to allow other nations’ commercial aviation to fly through its airspace — though, importantly, without giving up control of its airspace. The Open University was founded in the United Kingdom in 1969, to provide a university education to all, with no formal entry requirements. The meaning of the word “open” is quite different across these three terms — or perhaps it would be more accurate to say that these terms use different shadings of the word.

But it has been the twenty-first century that has seen the most dramatic increase in the number of terms that use “open.” The story of this explosion in the use of the word “open” begins, however, with a different word entirely: the word “free.”….

Introduction
Opening the door on open
Speech, beer, and puppies
Open means rights
Open means access
Open means use
Open means transparent
Open means participatory
Open means enabling openness
Open means philosophically aligned with open principles
Openwashing and its discontents
Conclusion