Developing a Data Reuse Strategy for Solving Public Problems


The Data Stewards Academy…A self-directed learning program from the Open Data Policy Lab (The GovLab): “Communities across the world face unprecedented challenges. Strained by climate change, crumbling infrastructure, growing economic inequality, and the continued costs of the COVID-19 pandemic, institutions need new ways of solving public problems and improving how they operate.

In recent years, data has been increasingly used to inform policies and interventions targeted at these issues. Yet, many of these data projects, data collaboratives, and open data initiatives remain scattered. As we enter into a new age of data use and re-use, a third wave of open data, it is more important than ever to be strategic and purposeful, to find new ways to connect the demand for data with its supply to meet institutional objectives in a socially responsible way.

This self-directed learning program, adapted from a selective executive education course, will help data stewards (and aspiring data stewards) develop a data re-use strategy to solve public problems. Noting the ways data resources can inform their day-to-day and strategic decision-making, the course provides learners with ways they can use data to improve how they operate and pursue goals in the public’s interests. By working differently—using agile methods and data analytics—public, private, and civil sector leaders can promote data re-use and reduce data access inequities in ways that advance their institution’s goals.

In this self-directed learning program, we will teach participants how to develop a 21st century data strategy. Participants will learn:

  1. Why It Matters: A discussion of the three waves of open data and how data re-use has proven to be transformative;
  2. The Current State of Play: Current practice around data re-use, including deficits of current approaches and the need to shift from ad hoc engagements to more systematic, sustainable, and responsible models;
  3. Defining Demand: Methodologies for how organizations can formulate questions that data can answer and make data collaboratives more purposeful;
  4. Mapping Supply: Methods for organizations to discover and assess the open and private data that may be available to them and that are needed to answer the questions at hand;
  5. Matching Supply with Demand: Operational models for connecting and meeting the needs of supply- and demand-side actors in a sustainable way;
  6. Identifying Risks: Overview of the risks that can emerge in the course of data re-use;
  7. Mitigating Risks and Other Considerations: Technical, legal and contractual issues that can be leveraged or may arise in the course of data collaboration and other data work; and
  8. Institutionalizing Data Re-use: Suggestions for how organizations can incorporate data re-use into their organizational structure and foster future collaboration and data stewardship.

The Data Stewardship Executive Education Course was designed and implemented by program leads Stefaan Verhulst, co-founder and chief research development officer at the GovLab, and Andrew Young, The GovLab’s knowledge director, in close collaboration with a global network of expert faculty and advisors. It aims to….(More)”.

Data Stewards Academy Canvas

WHO, Germany launch new global hub for pandemic and epidemic intelligence


Press Release: “The World Health Organization (WHO) and the Federal Republic of Germany will establish a new global hub for pandemic and epidemic intelligence, data, surveillance and analytics innovation. The Hub, based in Berlin and working with partners around the world, will lead innovations in data analytics across the largest network of global data to predict, prevent, detect, prepare for and respond to pandemic and epidemic risks worldwide.

H.E. German Federal Chancellor Dr Angela Merkel said: “The current COVID-19 pandemic has taught us that we can only fight pandemics and epidemics together. The new WHO Hub will be a global platform for pandemic prevention, bringing together various governmental, academic and private sector institutions. I am delighted that WHO chose Berlin as its location and invite partners from all around the world to contribute to the WHO Hub.”

The WHO Hub for Pandemic and Epidemic Intelligence is part of WHO’s Health Emergencies Programme and will be a new collaboration of countries and partners worldwide, driving innovations to increase availability and linkage of diverse data; develop tools and predictive models for risk analysis; and to monitor disease control measures, community acceptance and infodemics. Critically, the WHO Hub will support the work of public health experts and policy-makers in all countries with insights so they can take rapid decisions to prevent and respond to future public health emergencies.

“We need to identify pandemic and epidemic risks as quickly as possible, wherever they occur in the world. For that aim, we need to strengthen the global early warning surveillance system with improved collection of health-related data and inter-disciplinary risk analysis,” said Jens Spahn, German Minister of Health. “Germany has consistently been committed to support WHO’s work in preparing for and responding to health emergencies, and the WHO Hub is a concrete initiative that will make the world safer.”

Working with partners globally, the WHO Hub will drive a scale-up in innovation for existing forecasting and early warning capacities in WHO and Member States. At the same time, the WHO Hub will accelerate global collaborations across public and private sector organizations, academia, and international partner networks. It will help them to collaborate and co-create the necessary tools for managing and analyzing data for early warning surveillance. It will also promote greater access to data and information….(More)”.

Responsible Data Science


Book by Peter Bruce and Grant Fleming: “The increasing popularity of data science has resulted in numerous well-publicized cases of bias, injustice, and discrimination. The widespread deployment of “Black box” algorithms that are difficult or impossible to understand and explain, even for their developers, is a primary source of these unanticipated harms, making modern techniques and methods for manipulating large data sets seem sinister, even dangerous. When put in the hands of authoritarian governments, these algorithms have enabled suppression of political dissent and persecution of minorities. To prevent these harms, data scientists everywhere must come to understand how the algorithms that they build and deploy may harm certain groups or be unfair.

Responsible Data Science delivers a comprehensive, practical treatment of how to implement data science solutions in an even-handed and ethical manner that minimizes the risk of undue harm to vulnerable members of society. Both data science practitioners and managers of analytics teams will learn how to:

  • Improve model transparency, even for black box models
  • Diagnose bias and unfairness within models using multiple metrics
  • Audit projects to ensure fairness and minimize the possibility of unintended harm…(More)”
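
As a rough illustration of what diagnosing bias "using multiple metrics" can look like in practice, the sketch below computes two widely used group-fairness measures for a binary classifier: the demographic parity difference (gap in positive-prediction rates) and the equal opportunity difference (gap in true positive rates). The function names, group labels, and toy data are assumptions made for illustration and are not taken from the book.

```python
from typing import Dict, List, Sequence, Tuple


def selection_rate(preds: Sequence[int]) -> float:
    """Share of cases that received the positive prediction (1)."""
    return sum(preds) / len(preds)


def true_positive_rate(labels: Sequence[int], preds: Sequence[int]) -> float:
    """Share of truly positive cases (label 1) that were predicted positive."""
    preds_for_positives = [p for y, p in zip(labels, preds) if y == 1]
    return sum(preds_for_positives) / len(preds_for_positives)


def fairness_gaps(
    labels: Sequence[int],
    preds: Sequence[int],
    groups: Sequence[str],
    group_a: str,
    group_b: str,
) -> Dict[str, float]:
    """Return group_a minus group_b gaps for two fairness metrics."""

    def subset(name: str) -> Tuple[List[int], List[int]]:
        ys = [y for y, g in zip(labels, groups) if g == name]
        ps = [p for p, g in zip(preds, groups) if g == name]
        return ys, ps

    ya, pa = subset(group_a)
    yb, pb = subset(group_b)
    return {
        "demographic_parity_diff": selection_rate(pa) - selection_rate(pb),
        "equal_opportunity_diff": true_positive_rate(ya, pa)
        - true_positive_rate(yb, pb),
    }


# Toy data, invented for this example only.
labels = [1, 1, 0, 0, 1, 1, 0, 0]
preds = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

gaps = fairness_gaps(labels, preds, groups, "a", "b")
print(gaps)
```

A gap near zero on one metric does not imply a gap near zero on the other, which is one reason to look at several metrics side by side rather than a single score.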

Mapping the United Nations Fundamental Principles of Official Statistics against new and big data sources


Paper by Dominik Rozkrut, Olga Świerkot-Strużewska, and Gemma Van Halderen: “Never has there been a more exciting time to be an official statistician. The data revolution is responding to the demands of the COVID-19 pandemic and a complex sustainable development agenda to improve how data is produced and used, to close data gaps to prevent discrimination, to build capacity and data literacy, to modernize data collection systems and to liberate data to promote transparency and accountability. But can all data be liberated in the production and communication of official statistics? This paper explores the UN Fundamental Principles of Official Statistics in the context of eight new and big data sources. The paper concludes each data source can be used for the production of official statistics in adherence with the Fundamental Principles and argues these data sources should be used if National Statistical Systems are to adhere to the first Fundamental Principle of compiling and making available official statistics that honor citizens’ entitlement to public information….(More)”.

Principles and Practices for a Federal Statistical Agency


Book by the National Academies of Sciences, Engineering, and Medicine: “Government statistics are widely used to inform decisions by policymakers, program administrators, businesses and other organizations as well as households and the general public. Principles and Practices for a Federal Statistical Agency, Seventh Edition will assist statistical agencies and units, as well as other agencies engaged in statistical activities, to carry out their responsibilities to provide accurate, timely, relevant, and objective information for public and policy use. This report will also inform legislative and executive branch decision makers, data users, and others about the characteristics of statistical agencies that enable them to serve the public good….(More)”

Building on a year of open data: progress and promise


Jennifer Yokoyama at Microsoft: “…The biggest takeaway from our work this past year – and the one thing I hope any reader of this post will take away – is that data collaboration is a spectrum. From the presence (or absence) of data to how open that data is to the trust level of the collaboration participants, these factors may necessarily lead to different configurations and different goals, but they can all lead to more open data and innovative insights and discoveries.

Here are a few other lessons we have learned over the last year:

  1. Principles set the foundation for stakeholder collaboration: When we launched the Open Data Campaign, we adopted five principles that guide our contributions and commitments to trusted data collaborations: Open, Usable, Empowering, Secure and Private. These principles underpin our participation, but importantly, organizations can build on them to establish responsible ways to share and collaborate around their data. The London Data Commission, for example, established a set of data sharing principles for public- and private-sector organizations to ensure alignment and to guide the participating groups in how they share data.
  2. There is value in pilot projects: Traditionally, data collaborations with several stakeholders require time – often including a long runway for building the collaboration, plus the time needed to execute on the project and learn from it. However, our learnings show short-term projects that experiment and test data collaborations can provide valuable insights. The London Data Commission did exactly that with the launch of four short-term pilot projects. Due to the success of the pilots, the partners are exploring how they can be expanded upon.
  3. Open data doesn’t require new data: Identifying data to share does not always mean it must be newly shared data; sometimes the data was narrowly shared, but can be shared more broadly, made more accessible or analyzed for a different purpose. Microsoft’s environmental indicator data is an example of data that was already disclosed in certain venues, but was then made available to the Linux Foundation’s OS-Climate Initiative to be consumed through analytics, thereby extending its reach and impact…

To get started, we suggest that emerging data collaborations make use of the wealth of existing resources. When embarking on data collaborations, we leveraged many of the definitions, toolkits and guides from leading organizations in this space. As examples, resources such as the Open Data Institute’s Data Ethics Canvas are extremely useful as a framework to develop ethical guidance. Additionally, The GovLab’s Open Data Policy Lab and Executive Course on Data Stewardship, both supported by Microsoft, highlight important case studies, governance considerations and frameworks when sharing data. If you want to learn more about the exciting work our partners are doing, check out the latest posts from the Open Data Institute and GovLab…(More)”. See also Open Data Policy Lab.

Resetting Data Governance: Authorized Public Purpose Access and Society Criteria for Implementation of APPA Principles


Paper by the WEF Japan: “In January 2020, our first publication presented Authorized Public Purpose Access (APPA), a new data governance model that aims to strike a balance between individual rights and the interests of data holders and the public interest. It is proposed that the use of personal data for public-health purposes, including fighting pandemics, be subject to appropriate and balanced governance mechanisms such as those set out in the APPA approach. The same approach could be extended to the use of data for non-medical public-interest purposes, such as achieving the United Nations Sustainable Development Goals (SDGs). This publication proposes a systematic approach to implementing APPA and to pursuing public-interest goals through data use. The approach values practicality, broad social agreement on appropriate goals and methods, and the valid interests of all stakeholders….(More)”.

Tracking Economic Activity in Response to the COVID-19 Crisis Using Nighttime Lights


Paper by Mark Roberts: “Over the last decade, nighttime lights – artificial lighting at night that is associated with human activity and can be detected by satellite sensors – have become a proxy for monitoring economic activity. To examine how the COVID-19 crisis has affected economic activity in Morocco, we calculated monthly lights estimates for both the country overall and at a sub-national level. By examining the intensity of Morocco’s lights in comparison with the quarterly GDP data at the national level, we are also able to confirm that nighttime lights are able to track movements in real economic activity for Morocco….(More)”.
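
To make the comparison described above more concrete, here is a minimal sketch of one way to check whether monthly lights estimates move with quarterly GDP: average the monthly values into quarters, convert both series to quarter-on-quarter growth rates, and compute a Pearson correlation. This is not the paper's method. The helper names and every number below are synthetic assumptions chosen only to illustrate the idea.

```python
from math import sqrt
from typing import List, Sequence


def quarterly_means(monthly: Sequence[float]) -> List[float]:
    """Average consecutive groups of three monthly values into quarters."""
    if len(monthly) % 3 != 0:
        raise ValueError("expected a whole number of quarters")
    return [sum(monthly[i:i + 3]) / 3 for i in range(0, len(monthly), 3)]


def pct_change(series: Sequence[float]) -> List[float]:
    """Period-on-period growth rate for each consecutive pair of values."""
    return [(b - a) / a for a, b in zip(series, series[1:])]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# SYNTHETIC values for illustration only; these are not Moroccan data.
monthly_lights = [10.0, 10.2, 10.1, 9.0, 7.5, 8.0,
                  9.2, 9.6, 9.8, 10.0, 10.1, 10.3]
quarterly_gdp = [100.0, 85.0, 96.0, 101.0]

lights_q = quarterly_means(monthly_lights)
r = pearson(pct_change(lights_q), pct_change(quarterly_gdp))
print(f"correlation of quarterly growth rates: {r:.3f}")
```

Comparing growth rates rather than levels avoids having to put radiance and GDP on a common scale. With only a handful of quarters, a high correlation is suggestive rather than conclusive.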

What Is Mobility Data? Where Is It Used?


Brief by Andrew J. Zahuranec, Stefaan Verhulst, Andrew Young, Aditi Ramesh, and Brennan Lake: “Mobility data is data about the geographic location of a device passively produced through normal activity. Throughout the pandemic, public health experts and public officials have used mobility data to understand patterns of COVID-19’s spread and the impact of disease control measures. However, privacy advocates and others have questioned the need for this data and raised concerns about the capacity of such data-driven tools to facilitate surveillance, improper data use, and other exploitative practices.

In April, The GovLab, Cuebiq, and the Open Data Institute released The Use of Mobility Data for Responding to the COVID-19 Pandemic, which relied on several case studies to look at the opportunities, risks, and challenges associated with mobility data. Today, we hope to supplement that report with a new resource: a brief on what mobility data is and the different types of data it can include. The piece is a one-pager to allow decision-makers to easily read it. It provides real-world examples from the report to illustrate how different data types can be used in a responsible way….(More)”.

How we mapped billions of trees in West Africa using satellites, supercomputers and AI


Martin Brandt and Kjeld Rasmussen in The Conversation: “The possibility that vegetation cover in semi-arid and arid areas was retreating has long been an issue of international concern. In the 1930s it was first theorized that the Sahara was expanding and woody vegetation was on the retreat. In the 1970s, spurred by the “Sahel drought”, focus was on the threat of “desertification”, caused by human overuse and/or climate change. In recent decades, the potential impact of climate change on the vegetation has been the main concern, along with the feedback of vegetation on the climate, associated with the role of the vegetation in the global carbon cycle.

Using high-resolution satellite data and machine-learning techniques at supercomputing facilities, we have now been able to map billions of individual trees and shrubs in West Africa. The goal is to better understand the real state of vegetation coverage and evolution in arid and semi-arid areas.

Finding a shrub in the desert – from space

Since the 1970s, satellite data have been used extensively to map and monitor vegetation in semi-arid areas worldwide. Images are available in “high” spatial resolution (with NASA’s satellites Landsat MSS and TM, and ESA’s satellites Spot and Sentinel) and “medium or low” spatial resolution (NOAA AVHRR and MODIS).

To accurately analyse vegetation cover at continental or global scale, it is necessary to use the highest-resolution images available – with a resolution of 1 metre or less – and up until now the costs of acquiring and analysing the data have been prohibitive. Consequently, most studies have relied on moderate- to low-resolution data. This has not allowed for the identification of individual trees, and therefore these studies only yield aggregate estimates of vegetation cover and productivity, mixing herbaceous and woody vegetation.

In a new study covering a large part of the semi-arid Sahara-Sahel-Sudanian zone of West Africa, published in Nature in October 2020, an international group of researchers was able to overcome these limitations. By combining an immense amount of high-resolution satellite data, advanced computing capacities, machine-learning techniques and extensive field data gathered over decades, we were able to identify individual trees and shrubs with a crown area of more than 3 m² with great accuracy. The result is a database of 1.8 billion trees in the region studied, available to all interested….(More)”
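
The resolution argument in the excerpt can be made concrete with a little arithmetic: the number of pixels a tree crown covers is roughly its area divided by the area of one pixel. The sketch below applies this to the 3 m² crown threshold mentioned above. The 0.5 m and 1 m pixel sizes follow the "1 metre or less" figure in the text, while the 10 m and 30 m values are illustrative assumptions standing in for coarser imagery. The function name is hypothetical.

```python
def pixels_per_crown(crown_area_m2: float, pixel_size_m: float) -> float:
    """Approximate number of square pixels covered by a crown of given area."""
    return crown_area_m2 / (pixel_size_m ** 2)


MIN_CROWN_AREA_M2 = 3.0  # crown-area threshold quoted in the excerpt

# Pixel edge lengths in metres; 10 m and 30 m are assumed examples of
# coarser imagery and are not taken from the study.
for pixel_size_m in (0.5, 1.0, 10.0, 30.0):
    n = pixels_per_crown(MIN_CROWN_AREA_M2, pixel_size_m)
    print(f"{pixel_size_m:>4} m pixels: {n:.4f} pixels per crown")
```

Once a crown covers only a small fraction of a single pixel, its signal is blended with the surrounding ground and grass, which is consistent with the excerpt's point that coarser imagery yields aggregate cover estimates rather than counts of individual trees.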

Supercomputing, machine learning, satellite data and field assessments make it possible to map billions of individual trees in West Africa. Image: Martin Brandt, author provided.