Are we too obsessed with data?


Lauren Woodman of NetHope: “Data: Everyone’s talking about it, everyone wants more of it….

Still, I’d posit that we’re too obsessed with data. Not just us in the humanitarian space, of course, but everyone. How many likes did that Facebook post get? How many airline miles did I fly last year? How many hours of sleep did I get last week?…

The problem is that data by itself isn’t that helpful: information is.

We need to develop a new obsession, around making sure that data is actionable, that it is relevant in the context in which we work, and on making sure that we’re using the data as effectively as we are collecting it.

In my talk at ICT4D, I referenced the example of 7-Eleven in Japan. In the 1970s, 7-Eleven in Japan became independent from its parent, Southland Corporation. The CEO had to build a viable business in a tough economy. Every month, each store manager would receive reams of data, but it wasn’t effective until the CEO stripped out the noise and provided just four critical data points that had the greatest relevance to drive the local purchasing that each store was empowered to do on their own.

Those points – what sold the day before, what sold the same day a year ago, what sold the last time the weather was the same, and what other stores sold the day before – were transformative. Within a year, 7-Eleven had turned a corner, and for 30 years, remained the most profitable retailer in Japan. It wasn’t about the Big Data; it was figuring out what data was relevant, actionable and empowered local managers to make nimble decisions.

For our sector to get there, we need to do the front-end work that transforms our data into information that we can use. That, after all, is where the magic happens.

A few examples provide more clarity as to why this is so critical.

We know that adaptive decision-making requires access to real-time data. By knowing what is happening in real-time, or near-real-time, we can adjust our approaches and interventions to be most impactful. But to do so, our data has to be accessible to those who are empowered to make decisions. To achieve that, we have to make investments in training, infrastructure, and capacity-building at the organizational level. But in the nonprofit sector, such investments are rarely supported by donors and fall beyond the limited unrestricted funding available to most organizations. As a result, the sector has, so far, been able to take only limited steps towards effective data usage, hampering our ability to transform the massive amounts of data we have into useful information.

Another big question about data, and particularly in the humanitarian space, is whether it should be open, closed or somewhere in between. Privacy is certainly paramount, and for some types of data, the need for close protection is very clear. For many other types of data, however, the rules are far less clear. Every country has its own rules about how data can and cannot be used or shared, and more work is needed to provide clarity and predictability so that appropriate data-sharing can evolve.

And perhaps more importantly, we need to think about not just the data, but the use cases.  Most of us would agree, for example, that sharing information during a crisis situation can be hugely beneficial to the people and the communities we serve – but in a world where rules are unclear, that ambiguity limits what we can do with the data we have. Here again, the context in which data will be used is critically important.

Finally, all of us in the sector have to realize that the journey to transforming data into information is one we’re on together. We have to be willing to give and take. Having data is great; sharing information is better. Sometimes, we have to co-create that basis to ensure we all benefit….(More)”

Selected Readings on Data Collaboratives


By Neil Britto, David Sangokoya, Iryna Susha, Stefaan Verhulst and Andrew Young

The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of data collaboratives was originally published in 2017.

The term data collaborative refers to a new form of collaboration, beyond the public-private partnership model, in which participants from different sectors (including private companies, research institutions, and government agencies) can exchange data to help solve public problems. Several of society’s greatest challenges — from addressing climate change to public health to job creation to improving the lives of children — require greater access to data, more collaboration between public- and private-sector entities, and an increased ability to analyze datasets. In the coming months and years, data collaboratives will be essential vehicles for harnessing the vast stores of privately held data toward the public good.

Selected Reading List (in alphabetical order)

Annotated Selected Readings List (in alphabetical order)

Agaba, G., Akindès, F., Bengtsson, L., Cowls, J., Ganesh, M., Hoffman, N., . . . Meissner, F. “Big Data and Positive Social Change in the Developing World: A White Paper for Practitioners and Researchers.” 2014. http://bit.ly/25RRC6N.

  • This white paper, produced by “a group of activists, researchers and data experts,” explores the potential of big data to improve development outcomes and spur positive social change in low- and middle-income countries. Using examples, the authors discuss four areas in which the use of big data can impact development efforts:
    • Advocating and facilitating by “open[ing] up new public spaces for discussion and awareness building”;
    • Describing and predicting through the detection of “new correlations and the surfac[ing] of new questions”;
    • Facilitating information exchange through “multiple feedback loops which feed into both research and action,” and
    • Promoting accountability and transparency, especially as a byproduct of crowdsourcing efforts aimed at “aggregat[ing] and analyz[ing] information in real time.”
  • The authors argue that in order to maximize the potential of big data’s use in development, “there is a case to be made for building a data commons for private/public data, and for setting up new and more appropriate ethical guidelines.”
  • They also identify a number of challenges, especially when leveraging data made accessible from a number of sources, including private sector entities, such as:
    • Lack of general data literacy;
    • Lack of open learning environments and repositories;
    • Lack of resources, capacity and access;
    • Challenges of sensitivity and risk perception with regard to using data;
    • Storage and computing capacity; and
    • Externally validating data sources for comparison and verification.

Ansell, C. and Gash, A. “Collaborative Governance in Theory and Practice.” Journal of Public Administration Research and Theory 18 (4), 2008. http://bit.ly/1RZgsI5.

  • This article describes collaborative arrangements that include public and private organizations working together and proposes a model for understanding an emergent form of public-private interaction informed by 137 diverse cases of collaborative governance.
  • The article suggests factors significant to successful partnering processes and outcomes include:
    • Shared understanding of challenges,
    • Trust building processes,
    • The importance of recognizing seemingly modest progress, and
    • Strong indicators of commitment to the partnership’s aspirations and process.
  • The authors provide a “contingency theory model” that specifies relationships between different variables that influence outcomes of collaborative governance initiatives. Three “core contingencies” for successful collaborative governance initiatives identified by the authors are:
    • Time (e.g., decision making time afforded to the collaboration);
    • Interdependence (e.g., a high degree of interdependence can mitigate negative effects of low trust); and
    • Trust (e.g., a higher level of trust indicates a higher probability of success).

Ballivian, A. and Hoffman, W. “Public-Private Partnerships for Data: Issues Paper for Data Revolution Consultation.” World Bank, 2015. Available from: http://bit.ly/1ENvmRJ

  • This World Bank report provides a background document on forming public-private partnerships for data in order to inform the UN’s Independent Expert Advisory Group (IEAG) on sustaining a “data revolution” in sustainable development.
  • The report highlights the critical position of private companies within the data value chain and reflects on key elements of a sustainable data PPP: “common objectives across all impacted stakeholders, alignment of incentives, and sharing of risks.” In addition, the report describes the risks and incentives of public and private actors, and the principles needed to “build[ing] the legal, cultural, technological and economic infrastructures to enable the balancing of competing interests.” These principles include understanding; experimentation; adaptability; balance; persuasion and compulsion; risk management; and governance.
  • Examples of data collaboratives cited in the report include HP Earth Insights, Orange Data for Development Challenges, Amazon Web Services, IBM Smart Cities Initiative, and the Governance Lab’s Open Data 500.

Brack, Matthew, and Tito Castillo. “Data Sharing for Public Health: Key Lessons from Other Sectors.” Chatham House, Centre on Global Health Security. April 2015. Available from: http://bit.ly/1DHFGVl

  • The Chatham House report provides an overview on public health surveillance data sharing, highlighting the benefits and challenges of shared health data and the complexity in adapting technical solutions from other sectors for public health.
  • The report describes data sharing processes from several perspectives, including in-depth case studies of actual data sharing in practice at the individual, organizational and sector levels. Among the key lessons for public health data sharing, the report strongly highlights the need to harness momentum for action and maintain collaborative engagement: “Successful data sharing communities are highly collaborative. Collaboration holds the key to producing and abiding by community standards, and building and maintaining productive networks, and is by definition the essence of data sharing itself. Time should be invested in establishing and sustaining collaboration with all stakeholders concerned with public health surveillance data sharing.”
  • Examples of data collaboratives include H3Africa (a collaboration between NIH and Wellcome Trust) and NHS England’s care.data programme.

de Montjoye, Yves-Alexandre, Jake Kendall, and Cameron F. Kerry. “Enabling Humanitarian Use of Mobile Phone Data.” The Brookings Institution, Issues in Technology Innovation. November 2014. Available from: http://brook.gs/1JxVpxp

  • Using Ebola as a case study, the authors describe the value of using private telecom data for uncovering “valuable insights into understanding the spread of infectious diseases as well as strategies into micro-target outreach and driving uptake of health-seeking behavior.”
  • The authors highlight the absence of a common legal and standards framework for “sharing mobile phone data in privacy-conscientious ways” and recommend “engaging companies, NGOs, researchers, privacy experts, and governments to agree on a set of best practices for new privacy-conscientious metadata sharing models.”

Eckartz, Silja M., Hofman, Wout J., Van Veenstra, Anne Fleur. “A decision model for data sharing.” Vol. 8653 LNCS. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2014. http://bit.ly/21cGWfw.

  • This paper proposes a decision model for data sharing of public and private data based on literature review and three case studies in the logistics sector.
  • The authors identify five categories of the barriers to data sharing and offer a decision model for identifying potential interventions to overcome each barrier:
    • Ownership. Possible interventions likely require improving trust among those who own the data through, for example, involvement and support from higher management.
    • Privacy. Interventions include “anonymization by filtering of sensitive information and aggregation of data,” and access control mechanisms built around identity management and regulated access.  
    • Economic. Interventions include a model where data is shared only with a few trusted organizations, and yield management mechanisms to ensure negative financial consequences are avoided.
    • Data quality. Interventions include identifying additional data sources that could improve the completeness of datasets, and efforts to improve metadata.
    • Technical. Interventions include making data available in structured formats and publishing data according to widely agreed upon data standards.
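The privacy intervention the authors describe, filtering out sensitive fields and aggregating the rest, can be sketched in a few lines of Python. This is a toy illustration in the paper's logistics setting, not the authors' implementation; the field names and the minimum group size are assumptions:

```python
from collections import defaultdict

def aggregate_shipments(records, min_group_size=5):
    """Drop the sensitive field (shipper identity) and aggregate by route,
    suppressing groups too small to share safely (a k-anonymity-style rule)."""
    groups = defaultdict(list)
    for r in records:
        # Filter: keep only the coarse route, discard the shipper's identity.
        groups[(r["origin"], r["destination"])].append(r["weight_kg"])
    shared = []
    for (origin, destination), weights in sorted(groups.items()):
        if len(weights) >= min_group_size:  # suppress small, re-identifiable groups
            shared.append({
                "origin": origin,
                "destination": destination,
                "shipments": len(weights),
                "avg_weight_kg": sum(weights) / len(weights),
            })
    return shared

# Hypothetical shipment records: six firms on one route, one firm alone on another.
records = [
    {"shipper": f"Firm{i}", "origin": "RTM", "destination": "AMS", "weight_kg": 100 + i}
    for i in range(6)
] + [{"shipper": "FirmX", "origin": "RTM", "destination": "EIN", "weight_kg": 50}]

print(aggregate_shipments(records))
```

The singleton RTM–EIN route is withheld entirely, since publishing it would reveal FirmX's activity; only the six-shipment RTM–AMS aggregate is shared.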

Hoffman, Sharona and Podgurski, Andy. “The Use and Misuse of Biomedical Data: Is Bigger Really Better?” American Journal of Law & Medicine 497, 2013. http://bit.ly/1syMS7J.

  • This journal article explores the benefits and, in particular, the risks related to large-scale biomedical databases bringing together health information from a diversity of sources across sectors. Some data collaboratives examined in the piece include:
    • MedMining – a company that extracts EHR data, de-identifies it, and offers it to researchers. The data sets that MedMining delivers to its customers include ‘lab results, vital signs, medications, procedures, diagnoses, lifestyle data, and detailed costs’ from inpatient and outpatient facilities.
    • Explorys has formed a large healthcare database derived from financial, administrative, and medical records. It has partnered with major healthcare organizations such as the Cleveland Clinic Foundation and Summa Health System to aggregate and standardize health information from ten million patients and over thirty billion clinical events.
  • Hoffman and Podgurski note that biomedical databases have many potential uses, with those likely to benefit including: “researchers, regulators, public health officials, commercial entities, lawyers,” as well as “healthcare providers who conduct quality assessment and improvement activities,” regulatory monitoring entities like the FDA, and “litigants in tort cases to develop evidence concerning causation and harm.”
  • They argue, however, that risks arise based on:
    • The data contained in biomedical databases is surprisingly likely to be incorrect or incomplete;
    • Systemic biases, arising from both the nature of the data and the preconceptions of investigators, are serious threats to the validity of research results, especially in answering causal questions; and
    • Data mining of biomedical databases makes it easier for individuals with political, social, or economic agendas to generate ostensibly scientific but misleading research findings for the purpose of manipulating public opinion and swaying policymakers.

Krumholz, Harlan M., et al. “Sea Change in Open Science and Data Sharing Leadership by Industry.” Circulation: Cardiovascular Quality and Outcomes 7.4. 2014. 499-504. http://1.usa.gov/1J6q7KJ

  • This article provides a comprehensive overview of industry-led efforts and cross-sector collaborations in data sharing by pharmaceutical companies to inform clinical practice.
  • The article details the types of data being shared and the early activities of GlaxoSmithKline (“in coordination with other companies such as Roche and ViiV”); Medtronic and the Yale University Open Data Access Project; and Janssen Pharmaceuticals (Johnson & Johnson). The article also describes the range of involvement in data sharing among pharmaceutical companies including Pfizer, Novartis, Bayer, AbbVie, Eli Lilly, AstraZeneca, and Bristol-Myers Squibb.

Mann, Gideon. “Private Data and the Public Good.” Medium. May 17, 2016. http://bit.ly/1OgOY68.

  • This Medium post from Gideon Mann, the Head of Data Science at Bloomberg, shares his prepared remarks given at a lecture at the City College of New York. Mann argues for the potential benefits of increasing access to private sector data, both to improve research and academic inquiry and also to help solve practical, real-world problems. He also describes a number of initiatives underway at Bloomberg along these lines.
  • Mann argues that data generated at private companies “could enable amazing discoveries and research,” but is often inaccessible to those who could put it to those uses. Beyond research, he notes that corporate data could, for instance, benefit:
    • Public health – including suicide prevention, addiction counseling and mental health monitoring.
    • Legal and ethical questions – especially as they relate to “the role algorithms have in decisions about our lives,” such as credit checks and resume screening.
  • Mann recognizes the privacy challenges inherent in private sector data sharing, but argues that it is a common misconception that the only two choices are “complete privacy or complete disclosure.” He believes that flexible frameworks for differential privacy could open up new opportunities for responsibly leveraging data collaboratives.
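The differential privacy frameworks Mann points to rest on a simple mechanism: adding calibrated random noise to aggregate answers so that no single individual's record can be inferred, while the aggregate remains useful. A minimal Python sketch of the standard Laplace mechanism for a counting query follows; the call-count data and epsilon value are invented for illustration, and production systems would use a vetted library rather than this hand-rolled sampler:

```python
import math
import random

def laplace_noise(scale):
    """Sample from a Laplace(0, scale) distribution via inverse transform."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def private_count(values, predicate, epsilon):
    """Differentially private counting query: the true count plus
    Laplace(1/epsilon) noise, since adding or removing one person
    changes a count by at most 1 (the query's 'sensitivity')."""
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon)

# Example: how many subscribers placed more than 10 calls? (toy data)
calls_per_user = [3, 12, 7, 25, 1, 14, 9]
noisy = private_count(calls_per_user, lambda c: c > 10, epsilon=0.5)
print(round(noisy, 2))  # near the true count of 3, plus noise
```

Smaller epsilon gives stronger privacy but noisier answers, which is the flexible trade-off Mann has in mind between "complete privacy" and "complete disclosure."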

Pastor Escuredo, D., Morales-Guzmán, A., et al. “Flooding through the Lens of Mobile Phone Activity.” IEEE Global Humanitarian Technology Conference, GHTC 2014. Available from: http://bit.ly/1OzK2bK

  • This report describes how mobile data can be used to understand the impact of disasters and improve disaster management. The study was conducted in the Mexican state of Tabasco in 2009 as a multidisciplinary, multi-stakeholder consortium involving the UN World Food Programme (WFP), Telefonica Research, Technical University of Madrid (UPM), Digital Strategy Coordination Office of the President of Mexico, and UN Global Pulse.
  • Telefonica Research, a division of the major Latin American telecommunications company, provided call detail records covering flood-affected areas for nine months. This data was combined with “remote sensing data (satellite images), rainfall data, census and civil protection data.” The results of the data demonstrated that “analysing mobile activity during floods could be used to potentially locate damaged areas, efficiently assess needs and allocate resources (for example, sending supplies to affected areas).”
  • In addition to the results, the study highlighted “the value of a public-private partnership on using mobile data to accurately indicate flooding impacts in Tabasco, thus improving early warning and crisis management.”

Perkmann, M. and Schildt, H. “Open data partnerships between firms and universities: The role of boundary organizations.” Research Policy, 44(5), 2015. http://bit.ly/25RRJ2c

  • This paper discusses the concept of a “boundary organization” in relation to industry-academic partnerships driven by data. Boundary organizations perform mediated revealing, allowing firms to disclose their research problems to a broad audience of innovators and simultaneously minimize the risk that this information would be adversely used by competitors.
  • The authors identify two especially important challenges that private firms face in opening data or participating in data collaboratives with the academic research community, both of which could be addressed through greater involvement from boundary organizations:
    • First is a challenge of maintaining competitive advantage. The authors note that, “the more a firm attempts to align the efforts in an open data research programme with its R&D priorities, the more it will have to reveal about the problems it is addressing within its proprietary R&D.”
    • Second is the misalignment of incentives between the private and academic fields. Perkmann and Schildt argue that a firm seeking to build collaborations around its opened data “will have to provide suitable incentives that are aligned with academic scientists’ desire to be rewarded for their work within their respective communities.”

Robin, N., Klein, T., & Jütting, J. “Public-Private Partnerships for Statistics: Lessons Learned, Future Steps.” OECD. 2016. http://bit.ly/24FLYlD.

  • This working paper acknowledges the growing body of work on how different types of data (e.g., telecom data, social media, sensors and geospatial data) can address data gaps relevant to National Statistical Offices (NSOs).
  • Four models of public-private interaction for statistics are described: in-house production of statistics by a data provider for a national statistical office (NSO); transfer of datasets from private entities to NSOs; transfer of data to a third-party provider that manages the data for both the NSO and the private entity; and the outsourcing of NSO functions.
  • The paper highlights challenges to public-private partnerships involving data (e.g., technical challenges, data confidentiality, risks, limited incentives for participation), suggests that deliberate and highly structured approaches to such partnerships require enforceable contracts, and emphasizes both the trade-off between data specificity and accessibility and the importance of pricing mechanisms that reflect the capacity and capability of national statistical offices.
  • Case studies referenced in the paper include:
    • A mobile network operator’s (MNO Telefonica) in house analysis of call detail records;
    • A third-party data provider and steward of travel statistics (Positium);
    • The Data for Development (D4D) challenge organized by MNO Orange; and
    • Statistics Netherlands’ use of social media to predict consumer confidence.

Stuart, Elizabeth, Samman, Emma, Avis, William, Berliner, Tom. “The data revolution: finding the missing millions.” Overseas Development Institute, 2015. Available from: http://bit.ly/1bPKOjw

  • The authors of this report highlight the need for good quality, relevant, accessible and timely data for governments to extend services into underrepresented communities and implement policies towards a sustainable “data revolution.”
  • The solutions proposed in the report focus on capacity-building activities of national statistical offices (NSOs), alternative sources of data (including shared corporate data) to address gaps, and building strong data management systems.

Taylor, L., & Schroeder, R. “Is bigger better? The emergence of big data as a tool for international development policy.” GeoJournal, 80(4). 2015. 503-518. http://bit.ly/1RZgSy4.

  • This journal article describes how privately held data – namely “digital traces” of consumer activity – “are becoming seen by policymakers and researchers as a potential solution to the lack of reliable statistical data on lower-income countries.”
  • They focus especially on three categories of data collaborative use cases:
    • Mobile data as a predictive tool for issues such as human mobility and economic activity;
    • Use of mobile data to inform humanitarian response to crises; and
    • Use of born-digital web data as a tool for predicting economic trends, and the implications these have for LMICs.
  • They note, however, that a number of challenges and drawbacks exist for these types of use cases, including:
    • Access to private data sources often must be negotiated or bought, “which potentially means substituting negotiations with corporations for those with national statistical offices;”
    • The meaning of such data is not always simple or stable, and local knowledge is needed to understand how people are using the technologies in question;
    • Bias in proprietary data can be hard to understand and quantify;
    • Lack of privacy frameworks; and
    • Power asymmetries, wherein “LMIC citizens are unwittingly placed in a panopticon staffed by international researchers, with no way out and no legal recourse.”

van Panhuis, Willem G., Proma Paul, Claudia Emerson, John Grefenstette, Richard Wilder, Abraham J. Herbst, David Heymann, and Donald S. Burke. “A systematic review of barriers to data sharing in public health.” BMC public health 14, no. 1 (2014): 1144. Available from: http://bit.ly/1JOBruO

  • The authors of this report provide a “systematic literature review of potential barriers to public health data sharing.” These twenty potential barriers are classified in six categories: “technical, motivational, economic, political, legal and ethical.” In this taxonomy, “the first three categories are deeply rooted in well-known challenges of health information systems for which structural solutions have yet to be found; the last three have solutions that lie in an international dialogue aimed at generating consensus on policies and instruments for data sharing.”
  • The authors suggest the need for a “systematic framework of barriers to data sharing in public health” in order to accelerate access and use of data for public good.

Verhulst, Stefaan and Sangokoya, David. “Mapping the Next Frontier of Open Data: Corporate Data Sharing.” In: Gasser, Urs and Zittrain, Jonathan and Faris, Robert and Heacock Jones, Rebekah, “Internet Monitor 2014: Reflections on the Digital World: Platforms, Policy, Privacy, and Public Discourse (December 15, 2014).” Berkman Center Research Publication No. 2014-17. http://bit.ly/1GC12a2

  • This essay describes a taxonomy of current corporate data sharing practices for public good: research partnerships; prizes and challenges; trusted intermediaries; application programming interfaces (APIs); intelligence products; and corporate data cooperatives or pooling.
  • Examples of data collaboratives include: Yelp Dataset Challenge, the Digital Ecologies Research Partnership, BBVA Innova Challenge, Telecom Italia’s Big Data Challenge, NIH’s Accelerating Medicines Partnership and the White House’s Climate Data Partnerships.
  • The authors highlight important questions to consider towards a more comprehensive mapping of these activities.

Verhulst, Stefaan and Sangokoya, David, 2015. “Data Collaboratives: Exchanging Data to Improve People’s Lives.” Medium. Available from: http://bit.ly/1JOBDdy

  • The essay refers to data collaboratives as a new form of collaboration involving participants from different sectors exchanging data to help solve public problems. These forms of collaborations can improve people’s lives through data-driven decision-making; information exchange and coordination; and shared standards and frameworks for multi-actor, multi-sector participation.
  • The essay cites four activities that are critical to accelerating data collaboratives: documenting value and measuring impact; matching public demand and corporate supply of data in a trusted way; training and convening data providers and users; experimenting and scaling existing initiatives.
  • Examples of data collaboratives include NIH’s Precision Medicine Initiative; the Mobile Data, Environmental Extremes and Population (MDEEP) Project; and Twitter-MIT’s Laboratory for Social Machines.

Verhulst, Stefaan, Susha, Iryna, Kostura, Alexander. “Data Collaboratives: Matching Supply of (Corporate) Data to Solve Public Problems.” Medium. February 24, 2016. http://bit.ly/1ZEp2Sr.

  • This piece articulates a set of key lessons learned during a session at the International Data Responsibility Conference focused on identifying emerging practices, opportunities and challenges confronting data collaboratives.
  • The authors list a number of privately held data sources that could create positive public impacts if made more accessible in a collaborative manner, including:
    • Data for early warning systems to help mitigate the effects of natural disasters;
    • Data to help understand human behavior as it relates to nutrition and livelihoods in developing countries;
    • Data to monitor compliance with weapons treaties; and
    • Data to more accurately measure progress related to the UN Sustainable Development Goals.
  • To the end of identifying and expanding on emerging practice in the space, the authors describe a number of current data collaborative experiments, including:
    • Trusted Intermediaries: Statistics Netherlands partnered with Vodafone to analyze mobile call data records in order to better understand mobility patterns and inform urban planning.
    • Prizes and Challenges: Orange Telecom, which has been a leader in this type of Data Collaboration, provided several examples of the company’s initiatives, such as the use of call data records to track the spread of malaria as well as their experience with Challenge 4 Development.
    • Research partnerships: The Data for Climate Action project is an ongoing large-scale initiative incentivizing companies to share their data to help researchers answer particular scientific questions related to climate change and adaptation.
    • Sharing intelligence products: JPMorgan Chase shares macroeconomic insights gained from its data through the newly established JPMorgan Chase Institute.
  • In order to capitalize on the opportunities provided by data collaboratives, a number of needs were identified:
    • A responsible data framework;
    • Increased insight into different business models that may facilitate the sharing of data;
    • Capacity to tap into the potential value of data;
    • Transparent stock of available data supply; and
    • Mapping emerging practices and models of sharing.

Vogel, N., Theisen, C., Leidig, J. P., Scripps, J., Graham, D. H., & Wolffe, G. “Mining mobile datasets to enable the fine-grained stochastic simulation of Ebola diffusion.” Paper presented at the Procedia Computer Science. 2015. http://bit.ly/1TZDroF.

  • The paper presents a research study conducted on the basis of the mobile call records shared with researchers in the framework of the Data for Development Challenge by the mobile operator Orange.
  • The study discusses the data analysis approach in relation to developing a simulation of Ebola diffusion built around “the interactions of multi-scale models, including viral loads (at the cellular level), disease progression (at the individual person level), disease propagation (at the workplace and family level), societal changes in migration and travel movements (at the population level), and mitigating interventions (at the abstract government policy level).”
  • The authors argue that the use of their population, mobility, and simulation models provides more accurate simulation details in comparison to high-level analytical predictions, and that the D4D mobile datasets provide high-resolution information useful for modeling developing regions and hard-to-reach locations.

Welle Donker, F., van Loenen, B., & Bregt, A. K. “Open Data and Beyond.” ISPRS International Journal of Geo-Information, 5(4). 2016. http://bit.ly/22YtugY.

  • This research developed a monitoring framework to assess the effects of open (private) data, using a case study of Liander, a Dutch energy network administrator.
  • Focusing on the potential impacts of open private energy data – beyond ‘smart disclosure’ where citizens are given information only about their own energy usage – the authors identify three attainable strategic goals:
    • Continuously optimize performance on services, security of supply, and costs;
    • Improve management of energy flows and insight into energy consumption;
    • Help customers save energy and switch over to renewable energy sources.
  • The authors propose a seven-step framework for assessing the impacts of Liander data, in particular, and open private data more generally:
    • Develop a performance framework that describes what the program is about, including the organization’s mission and strategic goals;
    • Identify the most important elements, or key performance areas which are most critical to understanding and assessing your program’s success;
    • Select the most appropriate performance measures;
    • Determine the gaps between what information you need and what is available;
    • Develop and implement a measurement strategy to address the gaps;
    • Develop a performance report which highlights what you have accomplished and what you have learned;
    • Learn from your experiences and refine your approach as required.
  • While the authors note that the true impacts of this open private data will likely not come into view in the short term, they argue that, “Liander has successfully demonstrated that private energy companies can release open data, and has successfully championed the other Dutch network administrators to follow suit.”

World Economic Forum. “Data-Driven Development: Pathways for Progress.” Geneva: World Economic Forum. 2015. http://bit.ly/1JOBS8u.

  • This report provides an overview of the existing data deficit and of the value and impact of big data for sustainable development.
  • The authors focus on four main priorities for a sustainable data revolution: commercial incentives and trusted agreements among public- and private-sector actors; the development of shared policy frameworks, legal protections, and impact assessments; capacity-building activities at the institutional, community, local, and individual levels; and, lastly, recognizing individuals as both producers and consumers of data.

Three Things Great Data Storytellers Do Differently


Jake Porway at Stanford Social Innovation Review: “…At DataKind, we use data science and algorithms in the service of humanity, and we believe that communicating about our work using data for social impact is just as important as the work itself. There’s nothing worse than findings gathering dust in an unread report.

We also believe our projects should always start with a question. It’s clear from the questions above and others that the art of data storytelling needs some demystifying. But rather than answering each question individually, I’d like to pose a broader question that can help us get at some of the essentials: What do great data storytellers do differently and what can we learn from them?

1. They answer the most important question: So what?

Knowing how to compel your audience with data is more of an art than a science. Most people still have negative associations with numbers and statistics—unpleasant memories of boring math classes, intimidating technical concepts, or dry accounting. That’s a shame, because the message behind the numbers can be so enriching and enlightening.

The solution? Help your audience understand the “so what,” not the numbers. Ask: Why should someone care about your findings? How does this information impact them? My strong opinion is that most people actually don’t want to look at data. They need to trust that your methods are sound and that you’re reasoning from data, but ultimately they just want to know what it all means for them and what they should do next.

A great example of going straight to the “so what” is this beautiful, interactive visualization by Periscopic about gun deaths. It uses data sparingly but still evokes a very clear anti-gun message….

2. They inspire us to ask more questions.

The best data visualization helps people investigate a topic further, instead of drawing a conclusion for them or persuading them to believe something new.

For example, the nonprofit DC Action for Children was interested in leveraging publicly available data from government agencies and the US Census, as well as DC Action for Children’s own databases, to help policymakers, parents, and community members understand the conditions influencing child well-being in Washington, DC. We helped create a tool that could bring together data in a multitude of forms, and present it in a way that allowed people to delve into the topic themselves and uncover surprising truths, such as the fact that one out of every three kids in DC lives in a neighborhood without a grocery store….

3. They use rigorous analysis instead of just putting numbers on a page.

Data visualization isn’t an end goal; it’s a process. It’s often the final step in a long manufacturing chain, along which we poke, prod, and mold data to create that pretty graph.

Years ago, the New York City Department of Parks & Recreation (NYC Parks) approached us—armed with data about every single tree in the city, including when it was planted and how it was pruned—and wanted to know: Does pruning trees in one year reduce the number of hazardous tree conditions in the following year? This is one of the first things our volunteer data scientists came up with:

Visualization of NYC Parks data showing tree density in New York City.

This is a visualization of tree density in New York—and it was met with oohs and aahs. It was interactive! You could see where different types of trees lived! It was engaging! But another finding that came out of this work arguably had a greater impact. Brian D’Alessandro, one of our volunteer data scientists, used statistical modeling to help NYC Parks calculate a number: 22 percent. It turns out that if you prune trees in New York, there are 22 percent fewer emergencies on those blocks than on the blocks where you didn’t prune. This number is helping the city become more effective by understanding how to best allocate its resources, and now other urban forestry programs are asking New York how they can do the same thing. There was no sexy visualization, no interactivity—just a rigorous statistical model of the world that’s shaping how cities protect their citizens….(More)”
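The 22 percent figure is the kind of number that falls out of comparing rates between treated and untreated groups. The sketch below is not D’Alessandro’s model (which involved full statistical modeling and controls); it only shows the arithmetic of a percent-reduction estimate, with made-up block counts chosen to land near 22 percent.

```python
def percent_reduction(treated_rate: float, untreated_rate: float) -> float:
    """Percent fewer emergencies in the treated group relative to the untreated one."""
    return 100 * (untreated_rate - treated_rate) / untreated_rate

# Hypothetical counts: (emergencies, block-years observed) per group.
pruned = (39, 500)     # blocks that were pruned
unpruned = (50, 500)   # blocks that were not

reduction = percent_reduction(pruned[0] / pruned[1], unpruned[0] / unpruned[1])
print(round(reduction, 1))  # 22.0
```

A real analysis would also have to control for confounders (tree age, species, neighborhood), which is exactly why rigorous modeling, not just a ratio, was needed.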

The Perils of Experimentation


Paper by Michael A. Livermore: “More than eighty years after Justice Brandeis coined the phrase “laboratories of democracy,” the concept of policy experimentation retains its currency as a leading justification for decentralized governance. This Article examines the downsides of experimentation, and in particular the potential for decentralization to lead to the production of information that exacerbates public choice failures. Standard accounts of experimentation and policy learning focus on information concerning the social welfare effects of alternative policies. But learning can also occur along a political dimension as information about ideological preferences, campaign techniques, and electoral incentives is revealed. Both types of information can be put to use in the policy arena by a host of individual and institutional actors that have a wide range of motives, from public-spirited concern for the general welfare to a desire to maximize personal financial returns. In this complex environment, there is no guarantee that the information that is generated by experimentation will lead to social benefits. This Article applies this insight to prior models of federalism developed in the legal and political science literature to show that decentralization can lead to the over-production of socially harmful information. As a consequence, policy makers undertaking a decentralization calculation should seek a level of decentralization that best balances the costs and benefits of information production. To illustrate the legal and policy implications of the arguments developed here, this Article examines two contemporary environmental rulemakings of substantial political, legal, and economic significance: a rule to define the jurisdictional reach of the Clean Water Act; and a rule to limit greenhouse gas emissions from the electricity generating sector….(More)”.

 

Foundation Transparency: Game Over?


Brad Smith at Glass Pockets (Foundation Center): “The tranquil world of America’s foundations is about to be shaken, but if you read the Center for Effective Philanthropy’s (CEP) new study — Sharing What Matters, Foundation Transparency — you would never know it.

Don’t get me wrong. That study, like everything CEP produces, is carefully researched, insightful and thoroughly professional. But it misses the single biggest change in foundation transparency in decades: the imminent release by the Internal Revenue Service of foundation 990-PF (and 990) tax returns as machine-readable open data.

Clara Miller, President of the Heron Foundation, writes eloquently in her manifesto, Building a Foundation for the 21st Century: “…the private foundation model was designed to be protective and separate, much like a terrarium.”

Terrariums, of course, are highly “curated” environments over which their creators have complete control. The CEP study proves that point: much of it consists of interviews with foundation leaders and reviews of their websites, as if transparency were an optional endeavor in which foundations may choose whether, and to what degree, to participate.

To be fair, CEP also interviewed the grantees of various foundations (sometimes referred to as “partners”), which helps convey the reality that foundations have stakeholders beyond their four walls. However, the terrarium metaphor is about to become far more relevant as the release of 990 tax returns as open data will literally make it possible for anyone to look right through those glass walls to the curated foundation world within.

What Is Open Data?

It is safe to say that most foundation leaders and a fair majority of their staff do not understand what open data really is. Open data is free, yes, but more importantly it is digital and machine-readable. This means it can be consumed in enormous volumes at lightning speed, directly by computers.

Once consumed, open data can be tagged, sorted, indexed and searched using statistical methods to make obvious comparisons while discovering previously undetected correlations. Anyone with a computer, some coding skills and a hard drive or cloud storage can access open data. In today’s world, a lot of people meet those requirements, and they are free to do whatever they please with your information once it is, as open data enthusiasts like to say, “in the wild.”

What is the Internal Revenue Service Releasing?

Thanks to the Aspen Institute’s leadership of a joint effort – funded by foundations and including Foundation Center, GuideStar, the National Center for Charitable Statistics, the Johns Hopkins Center for Civil Society Studies, and others – the IRS has started to make some 1,000,000 Form 990s and 40,000 Form 990-PFs available as machine-readable open data.

Previously, all Form 990s had been released as image (TIFF) files, essentially pictures, making it both time-consuming and expensive to extract useful data from them. Credit where credit is due: a kick in the butt in the form of a lawsuit from open data crusader Carl Malamud helped speed the process along.

The current test phase includes only those tax returns that were digitally filed by nonprofits and community foundations (990s) and private foundations (990-PFs). Over time, the IRS will phase in a mandatory digital filing requirement for all Form 990s, and the intent is to release them all as open data. In other words, that which is born digital will be opened up to the public in digital form. Because of variations in the 990 forms, getting the information from them into a database will still require some technical expertise, but will be far more feasible and faster than ever before.
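To make concrete what “machine-readable” buys you: once a filing is structured data rather than a TIFF image, extracting fields into a database takes a few lines of code. The element names below are simplified placeholders for illustration, not the actual IRS e-file schema.

```python
import xml.etree.ElementTree as ET

# A toy filing in a simplified, made-up structure (not the real IRS schema).
filing_xml = """
<Return>
  <Filer><Name>Example Foundation</Name><EIN>12-3456789</EIN></Filer>
  <TotalAssets>5000000</TotalAssets>
  <GrantsPaid>250000</GrantsPaid>
</Return>
"""

root = ET.fromstring(filing_xml)
record = {
    "name": root.findtext("Filer/Name"),
    "ein": root.findtext("Filer/EIN"),
    "total_assets": int(root.findtext("TotalAssets")),
    "grants_paid": int(root.findtext("GrantsPaid")),
}
print(record["name"], record["grants_paid"])  # Example Foundation 250000
```

Run across a million filings, the same pattern yields a queryable dataset rather than a pile of pictures, which is the shift the article describes.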

The Good

The work of organizations like Foundation Center—which have built expensive infrastructure to turn years of 990 tax returns into information that can be used by nonprofits looking for funding, by researchers trying to understand the role of foundations, and by foundations themselves seeking to benchmark against their peers—will be transformed.

Work will shift away from the mechanics of capturing and processing the data to higher level analysis and visualization to stimulate the generation and sharing of new insights and knowledge. This will fuel greater collaboration between peer organizations, innovation, the merging of previous disparate bodies of data, better philanthropy, and a stronger social sector… (more)

 

These Online Platforms Make Direct Democracy Possible


Tom Ladendorf in In These Times: “…Around the world, organizations from political parties to cooperatives are experimenting with new modes of direct democracy made possible by the internet.

“The world has gone through extraordinary technological innovation,” says Agustín Frizzera of Argentina’s Net Party. “But governments and political institutions haven’t innovated enough.”

The founders of the four-year-old party have also built an online platform, DemocracyOS, that lets users discuss and vote on proposals being considered by their legislators.

Anyone can adopt the technology, but the Net Party uses it to let Buenos Aires residents debate City Council measures. A 2013 thread, for example, concerned a plan to require bars and restaurants to make bathrooms free and open to the public.

“I recognize the need for freely available facilities, but it is the state who should be offering this service,” reads the top comment, voted most helpful by users. Others argued that private bathrooms open the door to discrimination. Ultimately, 56.9 percent of participants supported the proposal, while 35.3 percent voted against and 7.8 percent abstained….
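Tallies like these are straightforward to reproduce from raw vote counts; the counts below are invented to match the reported percentages.

```python
# Hypothetical raw counts for the bathroom-access proposal (illustrative only).
votes = {"support": 569, "against": 353, "abstain": 78}

total = sum(votes.values())
shares = {option: round(100 * count / total, 1) for option, count in votes.items()}
print(shares)  # {'support': 56.9, 'against': 35.3, 'abstain': 7.8}
```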

A U.S. company called PlaceAVote, launched in 2014, takes what it calls a more pragmatic approach. According to cofounder Job Melton, PlaceAVote’s goal is to “work within the system we have now and fix it from the inside out” instead of attempting the unlikely feat of building a third U.S. party.

Like the Net Party and its brethren, PlaceAVote offers an online tool that lets voters participate in decision making. Right now, the technology is in public beta at PlaceAVote.com, allowing users nationwide to weigh in on legislation before Congress….

But digital democracy has applications that extend beyond electoral politics. A wide range of groups are using web-based decision-making tools internally. The Mexican government, for example, has used DemocracyOS to gather citizen feedback on a data-protection law, and Brazilian civil society organizations are using it to encourage engagement with federal and municipal policy-making.

Another direct-democracy tool in wide use is Loomio, developed by a cooperative in New Zealand. Ben Knight, one of Loomio’s cofounders, sums up his experience with Occupy as one of “seeing massive potential of collective decision making, and then realizing how difficult it could be in person.” After failing to find an online tool to facilitate the process, the Loomio team created a platform that enables online discussion with a personal element: Votes are by name and voters can choose to “disagree” with or even “block” proposals. Provo, Utah, uses Loomio for public consultation, and a number of political parties use Loomio for local decision making, including the Brazilian Pirate Party, several regional U.K. Green Party chapters and Spain’s Podemos. Podemos has enthusiastically embraced digital-democracy tools for everything from its selection of European Parliament candidates to the creation of its party platform….(More)”

How innovation agencies work


Kirsten Bound and Alex Glennie at NESTA: “This report considers how governments can get better at designing and running innovation agencies, drawing on examples from around the world.

Key findings

  • There is no single model for a ‘successful’ innovation agency. Although there is much to learn from other countries about best practice in institution and programme design, attempts to directly replicate organisational models that operate in very different contexts are likely to fail.
  • There are a variety of roles that innovation agencies can play. From our case studies, we have identified a number of different approaches that an innovation agency might take, depending on the specific nature of a country’s innovation system, the priorities of policymakers, and available resources.
  • Innovation agencies need a clear mission, but an ability to adapt and experiment. Working towards many different objectives at once or constantly changing strategic direction can make it difficult for an innovation agency to deliver impactful innovation support for businesses. However, a long-term vision of what success looks like should not prevent innovation agencies from experimenting with new approaches, and responding to new needs and opportunities.
  • Innovation agencies should be assessed both quantitatively and qualitatively. Evaluations tend to focus on the financial return they generate, but our research suggests that more effort needs to be put into assessing some of the more qualitative aspects of their role, including the quality of their management, their ability to take (and learn from) strategic risks, and the skill with which they design and implement their programmes.
  • Governments should be both ambitious and realistic about what they expect an innovation agency to achieve. An innovation agency’s role will inevitably be affected by shifts in government priorities. Understanding how innovation agencies shape (and are shaped by) the broader political environment around innovation is a necessary part of ensuring that they are able to deliver on their potential.

Governments around the world are looking for ways to nurture innovative businesses, as a way of solving some of their most urgent economic and societal challenges. Many seek to do this by setting up national innovation agencies: institutions that provide financial and other support to catalyse or drive private sector innovation. Yet we still know relatively little about the range of approaches that these agencies take, what programmes and instruments are likely to work best in a given context, and how to assess their long-term impact.

We have been investigating these questions by studying a diverse selection of innovation agencies in ten different countries. Our aim has been to improve understanding of the range of existing institutional models and to learn more about their design, evolution and effectiveness. In doing so, we have developed a broad framework to help policymakers think about the set of choices and options they face in the design and management of an innovation agency….(More)”

An App to Save Syria’s Lost Generation? What Technology Can and Can’t Do


In Foreign Affairs: “In January this year, when the refugee and migrant crisis in Europe had hit its peak—more than a million people had crossed into Europe over the course of 2015—the U.S. State Department and Google hosted a forum of over 100 technology experts. The goal was to “bridge the education gap for Syrian refugee children.” Speaking to the group assembled at Stanford University, Deputy Secretary of State Antony Blinken announced a $1.7 million prize “to develop a smartphone app that can help Syrian children learn how to read and improve their wellbeing.” The competition, known as EduApp4Syria, is being run by the Norwegian Agency for Development Cooperation (Norad) and is supported by the Australian government and the French mobile company Orange.

Less than a month later, a group called Techfugees brought together over 100 technologists for a daylong brainstorm in New York City focused exclusively on education solutions. “We are facing the largest refugee crisis since World War II,” said U.S. Ambassador to the United Nations Samantha Power to open the conference. “It is a twenty-first-century crisis and we need a twenty-first-century solution.” Among the more promising, according to Power, were apps that enable “refugees to access critical services,” new “web platforms connecting refugees with one another,” and “education programs that teach refugees how to code.”

For example, the nonprofit PeaceGeeks created the Services Advisor app for the UN Refugee Agency, which maps the location of shelters, food distribution centers, and financial services in Jordan….(More)”

Scientists Are Just as Confused About the Ethics of Big-Data Research as You


Sarah Zhang at Wired: “When a rogue researcher last week released 70,000 OkCupid profiles, complete with usernames and sexual preferences, people were pissed. When Facebook researchers manipulated stories appearing in Newsfeeds for a mood contagion study in 2014, people were really pissed. OkCupid filed a copyright claim to take down the dataset; the journal that published Facebook’s study issued an “expression of concern.” Outrage has a way of shaping ethical boundaries. We learn from mistakes.

Shockingly, though, the researchers behind both of those big data blowups never anticipated public outrage. (The OkCupid research does not seem to have gone through any kind of ethical review process, and a Cornell ethics review board approved the Facebook experiment.) And that shows just how untested the ethics of this new field of research is. Unlike medical research, which has been shaped by decades of clinical trials, the risks—and rewards—of analyzing big, semi-public databases are just beginning to become clear.

And the patchwork of review boards responsible for overseeing those risks is only slowly inching into the 21st century. Under the Common Rule in the US, federally funded research has to go through ethical review. Rather than one unified system, though, every single university has its own institutional review board, or IRB. Most IRB members are researchers at the university, most often in the biomedical sciences. Few are professional ethicists.

Even fewer have computer science or security expertise, which may be necessary to protect participants in this new kind of research. “The IRB may make very different decisions based on who is on the board, what university it is, and what they’re feeling that day,” says Kelsey Finch, policy counsel at the Future of Privacy Forum. There are hundreds of these IRBs in the US—and they’re grappling with research ethics in the digital age largely on their own….

Or maybe other institutions, like the open science repositories asking researchers to share data, should be picking up the slack on ethical issues. “Someone needs to provide oversight, but the optimal body is unlikely to be an IRB, which usually lacks subject matter expertise in de-identification and re-identification techniques,” Michelle Meyer, a bioethicist at Mount Sinai, writes in an email.

Even among Internet researchers familiar with the power of big data, attitudes vary. When Katie Shilton, an information technology researcher at the University of Maryland, interviewed 20 online data researchers, she found “significant disagreement” over issues like the ethics of ignoring Terms of Service and obtaining informed consent. Surprisingly, the researchers also said that ethical review boards had never challenged the ethics of their work—but peer reviewers and colleagues had. Various groups like the Association of Internet Researchers and the Center for Applied Internet Data Analysis have issued guidelines, but the people who actually have power—those on institutional review boards—are only just catching up….

Outside of academia, companies like Microsoft have started to institute their own ethical review processes. In December, Finch at the Future of Privacy Forum organized a workshop called Beyond IRBs to consider processes for ethical review outside of federally funded research. After all, modern tech companies like Facebook, OkCupid, Snapchat, and Netflix sit atop a trove of data that 20th-century social scientists could only have dreamed of.

Of course, companies experiment on us all the time, whether it’s websites A/B testing headlines or grocery stores changing the configuration of their checkout line. But as these companies hire more data scientists out of PhD programs, academics are seeing an opportunity to bridge the divide and use that data to contribute to public knowledge. Maybe updated ethical guidelines can be forged out of those collaborations. Or it just might be a mess for a while….(More)”

Twelve principles for open innovation 2.0


Martin Curley in Nature: “A new mode of innovation is emerging that blurs the lines between universities, industry, governments and communities. It exploits disruptive technologies — such as cloud computing, the Internet of Things and big data — to solve societal challenges sustainably and profitably, and more quickly and ably than before. It is called open innovation 2.0 (ref. 1).

Such innovations are being tested in ‘living labs’ in hundreds of cities. In Dublin, for example, the city council has partnered with my company, the technology firm Intel (of which I am a vice-president), to install a pilot network of sensors to improve flood management by measuring local rain fall and river levels, and detecting blocked drains. Eindhoven in the Netherlands is working with electronics firm Philips and others to develop intelligent street lighting. Communications-technology firm Ericsson, the KTH Royal Institute of Technology, IBM and others are collaborating to test self-driving buses in Kista, Sweden.

Yet many institutions and companies remain unaware of this radical shift. They often confuse invention and innovation. Invention is the creation of a technology or method. Innovation concerns the use of that technology or method to create value. The agile approaches needed for open innovation 2.0 conflict with the ‘command and control’ organizations of the industrial age (see ‘How innovation modes have evolved’). Institutional or societal cultures can inhibit user and citizen involvement. Intellectual-property (IP) models may inhibit collaboration. Government funders can stifle the emergence of ideas by requiring that detailed descriptions of proposed work are specified before research can begin. Measures of success, such as citations, discount innovation and impact. Policymaking lags behind the market place….

Keys to collaborative innovation

  1. Purpose. Efforts and intellects aligned through commitment rather than compliance deliver an impact greater than the sum of their parts. A great example is former US President John F. Kennedy’s vision of putting a man on the Moon. Articulating a shared value that can be created is important. A win–win scenario is more sustainable than a win–lose outcome.
  2. Partner. The ‘quadruple helix’ of government, industry, academia and citizens joining forces aligns goals, amplifies resources, attenuates risk and accelerates progress. A collaboration between Intel, University College London, Imperial College London and Innovate UK’s Future Cities Catapult is working in the Intel Collaborative Research Institute to improve people’s well-being in cities, for example to enable reduction of air pollution.
  3. Platform. An environment for collaboration is a basic requirement. Platforms should be integrated and modular, allowing a plug-and-play approach. They must be open to ensure low barriers to use, catalysing the evolution of a community. Challenges in security, standards, trust and privacy need to be addressed. For example, the Open Connectivity Foundation is securing interoperability for the Internet of Things.
  4. Possibilities. Returns may not come from a product but from the business model that enabled it, a better process or a new user experience. Strategic tools are available, such as industrial designer Larry Keeley’s breakdown of innovations into ten types in four categories: finance, process, offerings and delivery.
  5. Plan. Adoption and scale should be the focus of innovation efforts, not product creation. Around 20% of value is created when an innovation is established; more than 80% comes when it is widely adopted (ref. 7). Focus on the ‘four Us’: utility (value to the user); usability; user experience; and ubiquity (designing in network effects).
  6. Pyramid. Enable users to drive innovation. They inspired two-thirds of innovations in semiconductors and printed circuit boards, for example. Lego Ideas encourages children and others to submit product proposals — submitters must get 10,000 supporters for their idea to be reviewed. Successful inventors get 1% of royalties.
  7. Problem. Most innovations come from a stated need. Ethnographic research with users, customers or the environment can identify problems and support brainstorming of solutions. Create a road map to ensure the shortest path to a solution.
  8. Prototype. Solutions need to be tested and improved through rapid experimentation with users and citizens. Prototyping shows how applicable a solution is, reduces the risks of failures and can reveal pain points. ‘Hackathons’, where developers come together to rapidly try things, are increasingly common.
  9. Pilot. Projects need to be implemented in the real world on small scales first. The Intel Collaborative Research Institute runs research projects in London’s parks, neighbourhoods and schools. Barcelona’s Laboratori — which involves the quadruple helix — is pioneering open ‘living lab’ methods in the city to boost culture, knowledge, creativity and innovation.
  10. Product. Prototypes need to be converted into viable commercial products or services through scaling up and new infrastructure globally. Cloud computing allows even small start-ups to scale with volume, velocity and resilience.
  11. Product service systems. Organizations need to move from just delivering products to also delivering related services that improve sustainability as well as profitability. Rolls-Royce sells ‘power by the hour’ — hours of flight time rather than jet engines — enabled by advanced telemetry. The ultimate goal of open innovation 2.0 is a circular or performance economy, focused on services and reuse rather than consumption and waste.
  12. Process. Innovation is a team sport. Organizations, ecosystems and communities should measure, manage and improve their innovation processes to deliver results that are predictable, probable and profitable. Agile methods supported by automation shorten the time from idea to implementation….(More)”