Harnessing the Data Revolution for Sustainable Development

US State Department Fact Sheet on “U.S. Government Commitments and Collaboration with the Global Partnership for Sustainable Development Data”: “On September 27, 2015, the member states of the United Nations agreed to a set of Sustainable Development Goals (Global Goals) that define a common agenda to achieve inclusive growth, end poverty, and protect the environment by 2030. The Global Goals build on tremendous development gains made over the past decade, particularly in low- and middle-income countries, and set actionable steps with measureable indicators to drive progress. The availability and use of high quality data is essential to measuring and achieving the Global Goals. By harnessing the power of technology, mobilizing new and open data sources, and partnering across sectors, we will achieve these goals faster and make their progress more transparent.

Harnessing the data revolution is a critical enabler of the global goals—not only to monitor progress, but also to inclusively engage stakeholders at all levels – local, regional, national, global—to advance evidence-based policies and programs to reach those who need it most. Data can show us where girls are at greatest risk of violence so we can better prevent it; where forests are being destroyed in real-time so we can protect them; and where HIV/AIDS is enduring so we can focus our efforts and finish the fight. Data can catalyze private investment; build modern and inclusive economies; and support transparent and effective investment of resources for social good…..

The Global Partnership for Sustainable Development Data (Global Data Partnership), launched on the sidelines of the 70th United Nations General Assembly, is mobilizing a range of data producers and users—including governments, companies, civil society, data scientists, and international organizations—to harness the data revolution to achieve and measure the Global Goals. Working together, signatories to the Global Data Partnership will address the barriers to accessing and using development data, delivering outcomes that no single stakeholder can achieve working alone….The United States, through the U.S. President’s Emergency Plan for AIDS Relief (PEPFAR), is joining a consortium of funders to seed this initiative. The U.S. Government has many initiatives that are harnessing the data revolution for impact domestically and internationally. Highlights of our international efforts are found below:

Health and Gender

Country Data Collaboratives for Local Impact – PEPFAR and the Millennium Challenge Corporation(MCC) are partnering to invest $21.8 million in Country Data Collaboratives for Local Impact in sub-Saharan Africa that will use data on HIV/AIDS, global health, gender equality, and economic growth to improve programs and policies. Initially, the Country Data Collaboratives will align with and support the objectives of DREAMS, a PEPFAR, Bill & Melinda Gates Foundation, and Girl Effect partnership to reduce new HIV infections among adolescent girls and young women in high-burden areas.

Measurement and Accountability for Results in Health (MA4Health) Collaborative – USAID is partnering with the World Health Organization, the World Bank, and over 20 other agencies, countries, and civil society organizations to establish the MA4Health Collaborative, a multi-stakeholder partnership focused on reducing fragmentation and better aligning support to country health-system performance and accountability. The Collaborative will provide a vehicle to strengthen country-led health information platforms and accountability systems by improving data and increasing capacity for better decision-making; facilitating greater technical collaboration and joint investments; and developing international standards and tools for better information and accountability. In September 2015, partners agreed to a set of common strategic and operational principles, including a strong focus on 3–4 pathfinder countries where all partners will initially come together to support country-led monitoring and accountability platforms. Global actions will focus on promoting open data, establishing common norms and standards, and monitoring progress on data and accountability for the Global Goals. A more detailed operational plan will be developed through the end of the year, and implementation will start on January 1, 2016.

Data2X: Closing the Gender GapData2X is a platform for partners to work together to identify innovative sources of data, including “big data,” that can provide an evidence base to guide development policy and investment on gender data. As part of its commitment to Data2X—an initiative of the United Nations Foundation, Hewlett Foundation, Clinton Foundation, and Bill & Melinda Gates Foundation—PEPFAR and the Millennium Challenge Corporation (MCC) are working with partners to sponsor an open data challenge to incentivize the use of gender data to improve gender policy and practice….(More)”

See also: Data matters: the Global Partnership for Sustainable Development Data. Speech by UK International Development Secretary Justine Greening at the launch of the Global Partnership for Sustainable Development Data.

Data Collaboratives: Sharing Public Data in Private Hands for Social Good

Beth Simone Noveck (The GovLab) in Forbes: “Sensor-rich consumer electronics such as mobile phones, wearable devices, commercial cameras and even cars are collecting zettabytes of data about the environment and about us. According to one McKinsey study, the volume of data is growing at fifty percent a year. No one needs convincing that these private storehouses of information represent a goldmine for business, but these data can do double duty as rich social assets—if they are shared wisely.

Think about a couple of recent examples: Sharing data held by businesses and corporations (i.e. public data in private hands) can help to improve policy interventions. California planners make water allocation decisions based upon expertise, data and analytical tools from public and private sources, including Intel, the Earth Research Institute at the University of California at Santa Barbara, and the World Food Center at the University of California at Davis.

In Europe, several phone companies have made anonymized datasets available, making it possible for researchers to track calling and commuting patterns and gain better insight into social problems from unemployment to mental health. In the United States, LinkedIn is providing free data about demand for IT jobs in different markets which, when combined with open data from the Department of Labor, helps communities target efforts around training….

Despite the promise of data sharing, these kind of data collaboratives remain relatively new. There is a need toaccelerate their use by giving companies strong tax incentives for sharing data for public good. There’s a need for more study to identify models for data sharing in ways that respect personal privacy and security and enable companies to do well by doing good. My colleagues at The GovLab together with UN Global Pulse and the University of Leiden, for example, published this initial analysis of terms and conditions used when exchanging data as part of a prize-backed challenge. We also need philanthropy to start putting money into “meta research;” it’s not going to be enough to just open up databases: we need to know if the data is good.

After years of growing disenchantment with closed-door institutions, the push for greater use of data in governing can be seen as both a response and as a mirror to the Big Data revolution in business. Although more than 1,000,000 government datasets about everything from air quality to farmers markets are openly available online in downloadable formats, much of the data about environmental, biometric, epidemiological, and physical conditions rest in private hands. Governing better requires a new empiricism for developing solutions together. That will depend on access to these private, not just public data….(More)”

Interactive app lets constituents help balance their city’s budget

Springwise: “In this era of information, political spending and municipal budgets are still often shrouded in confusion and mystery. But a new web app called Balancing Act hopes to change that, by enabling US citizens to see the breakdown of their city’s budget via adjustable, comprehensive pie charts.

Created by Colorado-based consultants Engaged Public, Balancing Act not only shows citizens the current budget breakdown, it also enables them to experiment with hypothetical future budgets, adjusting spending and taxes to suit their own priorities. The project aims to engage and inform citizens about the money that their mayors and governments assign on their behalf and allow them to have more of a say in the future of their city. The resource has already been utilized by Pedro Segarra, Mayor of Hartford, Connecticut, who asked his citizens for their input on how best to balance the USD 49 million.

The system can be used to help governments understand the wants and needs of their constituents, as well as enable citizens to see the bigger picture when it comes to tough or unappealing policies. Eventually it can even be used to create the world’s first crowdsourced budget, giving the public the power to make their preferences heard in a clear, comprehensible way…(More)”

Selected Readings on Data Governance

Jos Berens (Centre for Innovation, Leiden University) and Stefaan G. Verhulst (GovLab)

The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of data governance was originally published in 2015.

The field of Data Collaboratives is premised on the idea that sharing and opening-up private sector datasets has great – and yet untapped – potential for promoting social good. At the same time, the potential of data collaboratives depends on the level of societal trust in the exchange, analysis and use of the data exchanged. Strong data governance frameworks are essential to ensure responsible data use. Without such governance regimes, the emergent data ecosystem will be hampered and the (perceived) risks will dominate the (perceived) benefits. Further, without adopting a human-centered approach to the design of data governance frameworks, including iterative prototyping and careful consideration of the experience, the responses may fail to be flexible and targeted to real needs.

Selected Readings List (in alphabetical order)

Annotated Selected Readings List (in alphabetical order)

Better Place Lab, “Privacy, Transparency and Trust.” Mozilla, 2015. Available from: http://www.betterplace-lab.org/privacy-report.

  • This report looks specifically at the risks involved in the social sector having access to datasets, and the main risks development organizations should focus on to develop a responsible data use practice.
  • Focusing on five specific countries (Brazil, China, Germany, India and Indonesia), the report displays specific country profiles, followed by a comparative analysis centering around the topics of privacy, transparency, online behavior and trust.
  • Some of the key findings mentioned are:
    • A general concern on the importance of privacy, with cultural differences influencing conception of what privacy is.
    • Cultural differences determining how transparency is perceived, and how much value is attached to achieving it.
    • To build trust, individuals need to feel a personal connection or get a personal recommendation – it is hard to build trust regarding automated processes.

Montjoye, Yves Alexandre de; Kendall, Jake and; Kerry, Cameron F. “Enabling Humanitarian Use of Mobile Phone Data.” The Brookings Institution, 2015. Available from: http://www.brookings.edu/research/papers/2014/11/12-enabling-humanitarian-use-mobile-phone-data.

  • Focussing in particular on mobile phone data, this paper explores ways of mitigating privacy harms involved in using call detail records for social good.
  • Key takeaways are the following recommendations for using data for social good:
    • Engaging companies, NGOs, researchers, privacy experts, and governments to agree on a set of best practices for new privacy-conscientious metadata sharing models.
    • Accepting that no framework for maximizing data for the public good will offer perfect protection for privacy, but there must be a balanced application of privacy concerns against the potential for social good.
    • Establishing systems and processes for recognizing trusted third-parties and systems to manage datasets, enable detailed audits, and control the use of data so as to combat the potential for data abuse and re-identification of anonymous data.
    • Simplifying the process among developing governments in regards to the collection and use of mobile phone metadata data for research and public good purposes.

Centre for Democracy and Technology, “Health Big Data in the Commercial Context.” Centre for Democracy and Technology, 2015. Available from: https://cdt.org/insight/health-big-data-in-the-commercial-context/.

  • Focusing particularly on the privacy issues related to using data generated by individuals, this paper explores the overlap in privacy questions this field has with other data uses.
  • The authors note that although the Health Insurance Portability and Accountability Act (HIPAA) has proven a successful approach in ensuring accountability for health data, most of these standards do not apply to developers of the new technologies used to collect these new data sets.
  • For non-HIPAA covered, customer facing technologies, the paper bases an alternative framework for consideration of privacy issues. The framework is based on the Fair Information Practice Principles, and three rounds of stakeholder consultations.

Center for Information Policy Leadership, “A Risk-based Approach to Privacy: Improving Effectiveness in Practice.” Centre for Information Policy Leadership, Hunton & Williams LLP, 2015. Available from: https://www.informationpolicycentre.com/uploads/5/7/1/0/57104281/white_paper_1-a_risk_based_approach_to_privacy_improving_effectiveness_in_practice.pdf.

  • This white paper is part of a project aiming to explain what is often referred to as a new, risk-based approach to privacy, and the development of a privacy risk framework and methodology.
  • With the pace of technological progress often outstripping the capabilities of privacy officers to keep up, this method aims to offer the ability to approach privacy matters in a structured way, assessing privacy implications from the perspective of possible negative impact on individuals.
  • With the intended outcomes of the project being “materials to help policy-makers and legislators to identify desired outcomes and shape rules for the future which are more effective and less burdensome”, insights from this paper might also feed into the development of innovative governance mechanisms aimed specifically at preventing individual harm.

Centre for Information Policy Leadership, “Data Governance for the Evolving Digital Market Place”, Centre for Information Policy Leadership, Hunton & Williams LLP, 2011. Available from: http://www.huntonfiles.com/files/webupload/CIPL_Centre_Accountability_Data_Governance_Paper_2011.pdf.

  • This paper argues that as a result of the proliferation of large scale data analytics, new models governing data inferred from society will shift responsibility to the side of organizations deriving and creating value from that data.
  • It is noted that, with the reality of the challenge corporations face of enabling agile and innovative data use “In exchange for increased corporate responsibility, accountability [and the governance models it mandates, ed.] allows for more flexible use of data.”
  • Proposed as a means to shift responsibility to the side of data-users, the accountability principle has been researched by a worldwide group of policymakers. Tailing the history of the accountability principle, the paper argues that it “(…) requires that companies implement programs that foster compliance with data protection principles, and be able to describe how those programs provide the required protections for individuals.”
  • The following essential elements of accountability are listed:
    • Organisation commitment to accountability and adoption of internal policies consistent with external criteria
    • Mechanisms to put privacy policies into effect, including tools, training and education
    • Systems for internal, ongoing oversight and assurance reviews and external verification
    • Transparency and mechanisms for individual participation
    • Means of remediation and external enforcement

Crawford, Kate; Schulz, Jason. “Big Data and Due Process: Toward a Framework to Redress Predictive Privacy Harm.” NYU School of Law, 2014. Available from: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2325784&download=yes.

  • Considering the privacy implications of large-scale analysis of numerous data sources, this paper proposes the implementation of a ‘procedural data due process’ mechanism to arm data subjects against potential privacy intrusions.
  • The authors acknowledge that some privacy protection structures already know similar mechanisms. However, due to the “inherent analytical assumptions and methodological biases” of big data systems, the authors argue for a more rigorous framework.

Letouze, Emmanuel, and; Vinck, Patrick. “The Ethics and Politics of Call Data Analytics”, DataPop Alliance, 2015. Available from: http://static1.squarespace.com/static/531a2b4be4b009ca7e474c05/t/54b97f82e4b0ff9569874fe9/1421442946517/WhitePaperCDRsEthicFrameworkDec10-2014Draft-2.pdf.

  • Focusing on the use of Call Detail Records (CDRs) for social good in development contexts, this whitepaper explores both the potential of these datasets – in part by detailing recent successful efforts in the space – and political and ethical constraints to their use.
  • Drawing from the Menlo Report Ethical Principles Guiding ICT Research, the paper explores how these principles might be unpacked to inform an ethics framework for the analysis of CDRs.

Data for Development External Ethics Panel, “Report of the External Ethics Review Panel.” Orange, 2015. Available from: http://www.d4d.orange.com/fr/content/download/43823/426571/version/2/file/D4D_Challenge_DEEP_Report_IBE.pdf.

  • This report presents the findings of the external expert panel overseeing the Orange Data for Development Challenge.
  • Several types of issues faced by the panel are described, along with the various ways in which the panel dealt with those issues.

Federal Trade Commission Staff Report, “Mobile Privacy Disclosures: Building Trust Through Transparency.” Federal Trade Commission, 2013. Available from: www.ftc.gov/os/2013/02/130201mobileprivacyreport.pdf.

  • This report looks at ways to address privacy concerns regarding mobile phone data use. Specific advise is provided for the following actors:
    • Platforms, or operating systems providers
    • App developers
    • Advertising networks and other third parties
    • App developer trade associations, along with academics, usability experts and privacy researchers

Mirani, Leo. “How to use mobile phone data for good without invading anyone’s privacy.” Quartz, 2015. Available from: http://qz.com/398257/how-to-use-mobile-phone-data-for-good-without-invading-anyones-privacy/.

  • This paper considers the privacy implications of using call detail records for social good, and ways to mitigate risks of privacy intrusion.
  • Taking example of the Orange D4D challenge and the anonymization strategy that was employed there, the paper describes how classic ‘anonymization’ is often not enough. The paper then lists further measures that can be taken to ensure adequate privacy protection.

Bernholz, Lucy. “Several Examples of Digital Ethics and Proposed Practices” Stanford Ethics of Data conference, 2014, Available from: http://www.scribd.com/doc/237527226/Several-Examples-of-Digital-Ethics-and-Proposed-Practices.

  • This list of readings prepared for Stanford’s Ethics of Data conference lists some of the leading available literature regarding ethical data use.

Abrams, Martin. “A Unified Ethical Frame for Big Data Analysis.” The Information Accountability Foundation, 2014. Available from: http://www.privacyconference2014.org/media/17388/Plenary5-Martin-Abrams-Ethics-Fundamental-Rights-and-BigData.pdf.

  • Going beyond privacy, this paper discusses the following elements as central to developing a broad framework for data analysis:
    • Beneficial
    • Progressive
    • Sustainable
    • Respectful
    • Fair

Lane, Julia; Stodden, Victoria; Bender, Stefan, and; Nissenbaum, Helen, “Privacy, Big Data and the Public Good”, Cambridge University Press, 2014. Available from: http://www.dataprivacybook.org.

  • This book treats the privacy issues surrounding the use of big data for promoting the public good.
  • The questions being asked include the following:
    • What are the ethical and legal requirements for scientists and government officials seeking to serve the public good without harming individual citizens?
    • What are the rules of engagement?
    • What are the best ways to provide access while protecting confidentiality?
    • Are there reasonable mechanisms to compensate citizens for privacy loss?

Richards, Neil M, and; King, Jonathan H. “Big Data Ethics”. Wake Forest Law Review, 2014. Available from: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2384174.

  • This paper describes the growing impact of big data analytics on society, and argues that because of this impact, a set of ethical principles to guide data use is called for.
  • The four proposed themes are: privacy, confidentiality, transparency and identity.
  • Finally, the paper discusses how big data can be integrated into society, going into multiple facets of this integration, including the law, roles of institutions and ethical principles.

OECD, “OECD Guidelines on the Protection of Privacy and Transborder Flows of Personal Data”. Available from: http://www.oecd.org/sti/ieconomy/oecdguidelinesontheprotectionofprivacyandtransborderflowsofpersonaldata.htm.

  • A globally used set of principles to inform thought about handling personal data, the OECD privacy guidelines serve as one the leading standards for informing privacy policies and data governance structures.
  • The basic principles of national application are the following:
    • Collection Limitation Principle
    • Data Quality Principle
    • Purpose Specification Principle
    • Use Limitation Principle
    • Security Safeguards Principle
    • Openness Principle
    • Individual Participation Principle
    • Accountability Principle

The White House Big Data and Privacy Working Group, “Big Data: Seizing Opportunities, Preserving Values”, White House, 2015. Available from: https://www.whitehouse.gov/sites/default/files/docs/big_data_privacy_report_5.1.14_final_print.pdf.

  • Documenting the findings of the White House big data and privacy working group, this report lists i.a. the following key recommendations regarding data governance:
    • Bringing greater transparency to the data services industry
    • Stimulating international conversation on big data, with multiple stakeholders
    • With regard to educational data: ensuring data is used for the purpose it is collected for
    • Paying attention to the potential for big data to facilitate discrimination, and expanding technical understanding to stop discrimination

William Hoffman, “Pathways for Progress” World Economic Forum, 2015. Available from: http://www3.weforum.org/docs/WEFUSA_DataDrivenDevelopment_Report2015.pdf.

  • This paper treats i.a. the lack of well-defined and balanced governance mechanisms as one of the key obstacles preventing particularly corporate sector data from being shared in a controlled space.
  • An approach that balances the benefits against the risks of large scale data usage in a development context, building trust among all stake holders in the data ecosystem, is viewed as key.
  • Furthermore, this whitepaper notes that new governance models are required not just by the growing amount of data and analytical capacity, and more refined methods for analysis. The current “super-structure” of information flows between institutions is also seen as one of the key reasons to develop alternatives to the current – outdated – approaches to data governance.

Mapping the Next Frontier of Open Data: Corporate Data Sharing

Stefaan Verhulst at the GovLab (cross-posted at the UN Global Pulse Blog): “When it comes to data, we are living in the Cambrian Age. About ninety percent of the data that exists today has been generated within the last two years. We create 2.5 quintillion bytes of data on a daily basis—equivalent to a “new Google every four days.”
All of this means that we are certain to witness a rapid intensification in the process of “datafication”– already well underway. Use of data will grow increasingly critical. Data will confer strategic advantages; it will become essential to addressing many of our most important social, economic and political challenges.
This explains–at least in large part–why the Open Data movement has grown so rapidly in recent years. More and more, it has become evident that questions surrounding data access and use are emerging as one of the transformational opportunities of our time.
Today, it is estimated that over one million datasets have been made open or public. The vast majority of this open data is government data—information collected by agencies and departments in countries as varied as India, Uganda and the United States. But what of the terabyte after terabyte of data that is collected and stored by corporations? This data is also quite valuable, but it has been harder to access.
The topic of private sector data sharing was the focus of a recent conference organized by the Responsible Data Forum, Data and Society Research Institute and Global Pulse (see event summary). Participants at the conference, which was hosted by The Rockefeller Foundation in New York City, included representatives from a variety of sectors who converged to discuss ways to improve access to private data; the data held by private entities and corporations. The purpose for that access was rooted in a broad recognition that private data has the potential to foster much public good. At the same time, a variety of constraints—notably privacy and security, but also proprietary interests and data protectionism on the part of some companies—hold back this potential.
The framing for issues surrounding sharing private data has been broadly referred to under the rubric of “corporate data philanthropy.” The term refers to an emerging trend whereby companies have started sharing anonymized and aggregated data with third-party users who can then look for patterns or otherwise analyze the data in ways that lead to policy insights and other public good. The term was coined at the World Economic Forum meeting in Davos, in 2011, and has gained wider currency through Global Pulse, a United Nations data project that has popularized the notion of a global “data commons.”
Although still far from prevalent, some examples of corporate data sharing exist….

Help us map the field

A more comprehensive mapping of the field of corporate data sharing would draw on a wide range of case studies and examples to identify opportunities and gaps, and to inspire more corporations to allow access to their data (consider, for instance, the GovLab Open Data 500 mapping for open government data) . From a research point of view, the following questions would be important to ask:

  • What types of data sharing have proven most successful, and which ones least?
  • Who are the users of corporate shared data, and for what purposes?
  • What conditions encourage companies to share, and what are the concerns that prevent sharing?
  • What incentives can be created (economic, regulatory, etc.) to encourage corporate data philanthropy?
  • What differences (if any) exist between shared government data and shared private sector data?
  • What steps need to be taken to minimize potential harms (e.g., to privacy and security) when sharing data?
  • What’s the value created from using shared private data?

We (the GovLab; Global Pulse; and Data & Society) welcome your input to add to this list of questions, or to help us answer them by providing case studies and examples of corporate data philanthropy. Please add your examples below, use our Google Form or email them to us at corporatedata@thegovlab.org”

Sharing Data Is a Form of Corporate Philanthropy

Matt Stempeck in HBR Blog:  “Ever since the International Charter on Space and Major Disasters was signed in 1999, satellite companies like DMC International Imaging have had a clear protocol with which to provide valuable imagery to public actors in times of crisis. In a single week this February, DMCii tasked its fleet of satellites on flooding in the United Kingdom, fires in India, floods in Zimbabwe, and snow in South Korea. Official crisis response departments and relevant UN departments can request on-demand access to the visuals captured by these “eyes in the sky” to better assess damage and coordinate relief efforts.

DMCii is a private company, yet it provides enormous value to the public and social sectors simply by periodically sharing its data.
Back on Earth, companies create, collect, and mine data in their day-to-day business. This data has quickly emerged as one of this century’s most vital assets. Public sector and social good organizations may not have access to the same amount, quality, or frequency of data. This imbalance has inspired a new category of corporate giving foreshadowed by the 1999 Space Charter: data philanthropy.
The satellite imagery example is an area of obvious societal value, but data philanthropy holds even stronger potential closer to home, where a wide range of private companies could give back in meaningful ways by contributing data to public actors. Consider two promising contexts for data philanthropy: responsive cities and academic research.
The centralized institutions of the 20th century allowed for the most sophisticated economic and urban planning to date. But in recent decades, the information revolution has helped the private sector speed ahead in data aggregation, analysis, and applications. It’s well known that there’s enormous value in real-time usage of data in the private sector, but there are similarly huge gains to be won in the application of real-time data to mitigate common challenges.
What if sharing economy companies shared their real-time housing, transit, and economic data with city governments or public interest groups? For example, Uber maintains a “God’s Eye view” of every driver on the road in a city:
Imagine combining this single data feed with an entire portfolio of real-time information. An early leader in this space is the City of Chicago’s urban data dashboard, WindyGrid. The dashboard aggregates an ever-growing variety of public datasets to allow for more intelligent urban management.
Over time, we could design responsive cities that react to this data. A responsive city is one where services, infrastructure, and even policies can flexibly respond to the rhythms of its denizens in real-time. Private sector data contributions could greatly accelerate these nascent efforts.
Data philanthropy could similarly benefit academia. Access to data remains an unfortunate barrier to entry for many researchers. The result is that only researchers with access to certain data, such as full-volume social media streams, can analyze and produce knowledge from this compelling information. Twitter, for example, sells access to a range of real-time APIs to marketing platforms, but the price point often exceeds researchers’ budgets. To accelerate the pursuit of knowledge, Twitter has piloted a program called Data Grants offering access to segments of their real-time global trove to select groups of researchers. With this program, academics and other researchers can apply to receive access to relevant bulk data downloads, such as an period of time before and after an election, or a certain geographic area.
Humanitarian response, urban planning, and academia are just three sectors within which private data can be donated to improve the public condition. There are many more possible applications possible, but few examples to date. For companies looking to expand their corporate social responsibility initiatives, sharing data should be part of the conversation…
Companies considering data philanthropy can take the following steps:

  • Inventory the information your company produces, collects, and analyzes. Consider which data would be easy to share and which data will require long-term effort.
  • Think who could benefit from this information. Who in your community doesn’t have access to this information?
  • Who could be harmed by the release of this data? If the datasets are about people, have they consented to its release? (i.e. don’t pull a Facebook emotional manipulation experiment).
  • Begin conversations with relevant public agencies and nonprofit partners to get a sense of the sort of information they might find valuable and their capacity to work with the formats you might eventually make available.
  • If you expect an onslaught of interest, an application process can help qualify partnership opportunities to maximize positive impact relative to time invested in the program.
  • Consider how you’ll handle distribution of the data to partners. Even if you don’t have the resources to set up an API, regular releases of bulk data could still provide enormous value to organizations used to relying on less-frequently updated government indices.
  • Consider your needs regarding privacy and anonymization. Strip the data of anything remotely resembling personally identifiable information (here are some guidelines).
  • If you’re making data available to researchers, plan to allow researchers to publish their results without obstruction. You might also require them to share the findings with the world under Open Access terms….”

Believe the hype: Big data can have a big social impact

Annika Small at the Guardian: “Given all the hype around so called big data at the moment, it would be easy to dismiss it as nothing more than the latest technology buzzword. This would be a mistake, given that the application and interpretation of huge – often publicly available – data sets is already supporting new models of creativity, innovation and engagement.
To date, stories of big data’s progress and successes have tended to come from government and the private sector, but we’ve heard little about its relevance to social organisations. Yet big data can fuel big social change.
It’s already playing a vital role in the charitable sector. Some social organisations are using existing open government data to better target their services, to improve advocacy and fundraising, and to support knowledge sharing and collaboration between different charities and agencies. Crowdsourcing of open data also offers a new way for not-for-profits to gather intelligence, and there is a wide range of freely available online tools to help them analyse the information.
However, realising the potential of big and open data presents a number of technical and organisational challenges for social organisations. Many don’t have the required skills, awareness and investment to turn big data to their advantage. They also tend to lack the access to examples that might help demystify the technicalities and focus on achievable results.
Overcoming these challenges can be surprisingly simple: Keyfund, for example, gained insight into what made for a successful application to their scheme through using a free, online tool to create word clouds out of all the text in their application forms. Many social organisations could use this same technique to better understand the large volume of unstructured text that they accumulate – in doing so, they would be “doing big data” (albeit in a small way). At the other end of the scale, Global Giving has developed its own sophisticated set of analytical tools to better understand the 57,000+ “stories” gathered from its network.
Innovation often happens when different disciplines collide and it’s becoming apparent that most value – certainly most social value – is likely to be created at the intersection of government, private and social sector data. That could be the combination of data from different sectors, or better “data collaboration” within sectors.
The Housing Association Charitable Trust (HACT) has produced two original tools that demonstrate this. Its Community Insight tool combines data from different sectors, allowing housing providers easily to match information about their stock to a large store of well-maintained open government figures. Meanwhile, its Housing Big Data programme is building a huge dataset by combining stats from 16 different housing providers across the UK. While Community Insight allows each organisation to gain better individual understanding of their communities (measuring well-being and deprivation levels, tracking changes over time, identifying hotspots of acute need), Housing Big Data is making progress towards a much richer network of understanding, providing a foundation for the sector to collaboratively identify challenges and quantify the impact of their interventions.
Alongside this specific initiative from HACT, it’s also exciting to see programmes such as 360giving, which forge connections between a range of private and social enterprises, and lays foundations for UK social investors to be a significant source of information over the next decade. Certainly, The Big Lottery Fund’s publication of open data late last year is a milestone which also highlights how far we have to travel as a sector before we are truly “data-rich”.
At Nominet Trust, we have produced the Social Tech Guide to demonstrate the scale and diversity of social value being generated internationally – much of which is achieved through harnessing the power of big data. From Knewton creating personally tailored learning programmes, to Cellslider using the power of the crowd to advance cancer research, there is no shortage of inspiration. The UN’s Global Pulse programme is another great example, with its focus on how we can combine private and public sources to pin down the size and shape of a social challenge, and calibrate our collective response.
These examples of data-driven social change demonstrate the huge opportunities for social enterprises to harness technology to generate insights, to drive more effective action and to fuel social change. If we are to realise this potential, we need to continue to stretch ourselves as social enterprises and social investors.”

Selected Readings on Big Data

The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of big data was originally published in 2014.

Big Data refers to the wide-scale collection, aggregation, storage, analysis and use of data. Government is increasingly in control of a massive amount of raw data that, when analyzed and put to use, can lead to new insights on everything from public opinion to environmental concerns. The burgeoning literature on Big Data argues that it generates value by: creating transparency; enabling experimentation to discover needs, expose variability, and improve performance; segmenting populations to customize actions; replacing/supporting human decision making with automated algorithms; and innovating new business models, products and services. The insights drawn from data analysis can also be visualized in a manner that passes along relevant information, even to those without the tech savvy to understand the data on its own terms (see The GovLab Selected Readings on Data Visualization).

Selected Reading List (in alphabetical order)

Annotated Selected Reading List (in alphabetical order)

Australian Government Information Management Office. The Australian Public Service Big Data Strategy: Improved Understanding through Enhanced Data-analytics Capability Strategy Report. August 2013. http://bit.ly/17hs2xY.

  • This Big Data Strategy produced for Australian Government senior executives with responsibility for delivering services and developing policy is aimed at ingraining in government officials that the key to increasing the value of big data held by government is the effective use of analytics. Essentially, “the value of big data lies in [our] ability to extract insights and make better decisions.”
  • This positions big data as a national asset that can be used to “streamline service delivery, create opportunities for innovation, identify new service and policy approaches as well as supporting the effective delivery of existing programs across a broad range of government operations.”

Bollier, David. The Promise and Peril of Big Data. The Aspen Institute, Communications and Society Program, 2010. http://bit.ly/1a3hBIA.

  • This report captures insights from the 2009 Roundtable exploring uses of Big Data within a number of important consumer behavior and policy implication contexts.
  • The report concludes that, “Big Data presents many exciting opportunities to improve modern society. There are incalculable opportunities to make scientific research more productive, and to accelerate discovery and innovation. People can use new tools to help improve their health and well-being, and medical care can be made more efficient and effective. Government, too, has a great stake in using large databases to improve the delivery of government services and to monitor for threats to national security.
  • However, “Big Data also presents many formidable challenges to government and citizens precisely because data technologies are becoming so pervasive, intrusive and difficult to understand. How shall society protect itself against those who would misuse or abuse large databases? What new regulatory systems, private-law innovations or social practices will be capable of controlling anti-social behaviors–and how should we even define what is socially and legally acceptable when the practices enabled by Big Data are so novel and often arcane?”

Boyd, Danah and Kate Crawford. “Six Provocations for Big Data.” A Decade in Internet Time: Symposium on the Dynamics of the Internet and Society. September 2011http://bit.ly/1jJstmz.

  • In this paper, Boyd and Crawford raise challenges to unchecked assumptions and biases regarding big data. The paper makes a number of assertions about the “computational culture” of big data and pushes back against those who consider big data to be a panacea.
  • The authors’ provocations for big data are:
    • Automating Research Changes the Definition of Knowledge
    • Claims to Objectivity and Accuracy are Misleading
    • Big Data is not always Better Data
    • Not all Data is Equivalent
    • Just Because it is accessible doesn’t make it ethical
    • Limited Access to Big Data creates New Digital Divide

The Economist Intelligence Unit. Big Data and the Democratisation of Decisions. October 2012. http://bit.ly/17MpH8L.

  • This report from the Economist Intelligence Unit focuses on the positive impact of big data adoption in the private sector, but its insights can also be applied to the use of big data in governance.
  • The report argues that innovation can be spurred by democratizing access to data, allowing a diversity of stakeholders to “tap data, draw lessons and make business decisions,” which in turn helps companies and institutions respond to new trends and intelligence at varying levels of decision-making power.

Manyika, James, Michael Chui, Brad Brown, Jacques Bughin, Richard Dobbs, Charles Roxburgh, and Angela Hung Byers. Big Data: The Next Frontier for Innovation, Competition, and Productivity.  McKinsey & Company. May 2011. http://bit.ly/18Q5CSl.

  • This report argues that big data “will become a key basis of competition, underpinning new waves of productivity growth, innovation, and consumer surplus, and that “leaders in every sector will have to grapple with the implications of big data.” 
  • The report offers five broad ways in which using big data can create value:
    • First, big data can unlock significant value by making information transparent and usable at much higher frequency.
    • Second, as organizations create and store more transactional data in digital form, they can collect more accurate and detailed performance information on everything from product inventories to sick days, and therefore expose variability and boost performance.
    • Third, big data allows ever-narrower segmentation of customers and therefore much more precisely tailored products or services.
    • Fourth, big sophisticated analytics can substantially improve decision-making.
    • Finally, big data can be used to improve the development of the next generation of products and services.

The Partnership for Public Service and the IBM Center for The Business of Government. “From Data to Decisions II: Building an Analytics Culture.” October 17, 2012. https://bit.ly/2EbBTMg.

  • This report discusses strategies for better leveraging data analysis to aid decision-making. The authors argue that, “Organizations that are successful at launching or expanding analytics program…systematically examine their processes and activities to ensure that everything they do clearly connects to what they set out to achieve, and they use that examination to pinpoint weaknesses or areas for improvement.”
  • While the report features many strategies for government decisions-makers, the central recommendation is that, “leaders incorporate analytics as a way of doing business, making data-driven decisions transparent and a fundamental approach to day-to-day management. When an analytics culture is built openly, and the lessons are applied routinely and shared widely, an agency can embed valuable management practices in its DNA, to the mutual benet of the agency and the public it serves.”

TechAmerica Foundation’s Federal Big Data Commission. “Demystifying Big Data: A Practical Guide to Transforming the Business of Government.” 2013. http://bit.ly/1aalUrs.

  • This report presents key big data imperatives that government agencies must address, the challenges and the opportunities posed by the growing volume of data and the value Big Data can provide. The discussion touches on the value of big data to businesses and organizational mission, presents case study examples of big data applications, technical underpinnings and public policy applications.
  • The authors argue that new digital information, “effectively captured, managed and analyzed, has the power to change every industry including cyber security, healthcare, transportation, education, and the sciences.” To ensure that this opportunity is realized, the report proposes a detailed big data strategy framework with the following steps: define, assess, plan, execute and review.

World Economic Forum. “Big Data, Big Impact: New Possibilities for International Development.” 2012. http://bit.ly/17hrTKW.

  • This report examines the potential for channeling the “flood of data created every day by the interactions of billions of people using computers, GPS devices, cell phones, and medical devices” into “actionable information that can be used to identify needs, provide services, and predict and prevent crises for the benefit of low-income populations”
  • The report argues that, “To realise the mutual benefits of creating an environment for sharing mobile-generated data, all ecosystem actors must commit to active and open participation. Governments can take the lead in setting policy and legal frameworks that protect individuals and require contractors to make their data public. Development organisations can continue supporting governments and demonstrating both the public good and the business value that data philanthropy can deliver. And the private sector can move faster to create mechanisms for the sharing data that can benefit the public.”

White House Unveils Big Data Projects, Round Two

Information Week: “The White House Office of Science and Technology Policy (OSTP) and Networking and Information Technology R&D program (NITRD) on Tuesday introduced a slew of new big-data collaboration projects aimed at stimulating private-sector interest in federal data. The initiatives, announced at the White House-sponsored “Data to Knowledge to Action” event, are targeted at fields as varied as medical research, geointelligence, economics, and linguistics.
The new projects are a continuation of the Obama Administration’s Big Data Initiative, announced in March 2012, when the first round of big-data projects was presented.
Thomas Kalil, OSTP’s deputy director for technology and innovation, said that “dozens of new partnerships — more than 90 organizations,” are pursuing these new collaborative projects, including many of the best-known American technology, pharmaceutical, and research companies.
Among the initiatives, Amazon Web Services (AWS) and NASA have set up the NASA Earth eXchange, or NEX, a collaborative network to provide space-based data about our planet to researchers in Earth science. AWS will host much of NASA’s Earth-observation data as an AWS Public Data Set, making it possible, for instance, to crowdsource research projects.
An estimated 4.4 million jobs are being created between now and 2015 to support big-data projects. Employers, educational institutions, and government agencies are working to build the educational infrastructure to provide students with the skills they need to fill those jobs.
To help train new workers, IBM, for instance, has created a new assessment tool that gives university students feedback on their readiness for number-crunching careers in both the public and private sector. Eight universities that have a big data and analytics curriculum — Fordham, George Washington, Illinois Institute of Technology, University of Massachusetts-Boston, Northwestern, Ohio State, Southern Methodist, and the University of Virginia — will receive the assessment tool.
OSTP is organizing an initiative to create a “weather service” for pandemics, Kalil said, a way to use big data to identify and predict pandemics as early as possible in order to plan and prepare for — and hopefully mitigate — their effects.
The National Institutes of Health (NIH), meanwhile, is undertaking its ” Big Data to Knowledge” (BD2K) initiative to develop a range of standards, tools, software, and other approaches to make use of massive amounts of data being generated by the health and medical research community….”
See also:
November 12, 2013 – Fact Sheet: Progress by Federal Agencies: Data to Knowledge to Action
November 12, 2013 – Fact Sheet: New Announcements: Data to Knowledge to Action
November 12, 2013 – Press Release: Data to Knowledge to Action Event