Can national statistical offices shape the data revolution?


Article by Juan Daniel Oviedo, Katharina Fenz, François Fonteneau, and Simon Riedl: “In recent years, breakthrough technologies in artificial intelligence (AI) and the use of satellite imagery have made it possible to disrupt the way we collect, process, and analyze data. Facilitated by the intersection of new statistical techniques and the availability of (big) data, it is now possible to create hypergranular estimates.

National statistical offices (NSOs) could be at the forefront of this change. Conventional tasks of statistical offices, such as the coordination of household surveys and censuses, will remain at the core of their work. However, just as AI can enhance the capabilities of doctors, it also has the potential to make statistical offices better, faster, and eventually cheaper.

Still, many countries struggle to make this happen. In a COVID-19 world marked by constrained financial and statistical capacities, making innovation work for statistical offices is of prime importance to create better lives for all…

In the case of Colombia, this novel method facilitated a scale-up from existing poverty estimates built on 1,123 data points to 78,000 data points, a roughly 70-fold increase. The result is much more granular estimates that highlight Colombia’s heterogeneity between and within municipalities (see Figure 1).

Figure 1. Poverty shares (%) in Colombia, 2018

Traditional methods don’t allow for cost-efficient hypergranular estimation, but they serve as a reference point thanks to their ground-truthing capacity. Hence, we have combined existing data with novel AI techniques to produce granular estimates at a resolution of up to 4×4 kilometers. In particular, we have trained an algorithm to connect daytime and nighttime satellite images….(More)”.
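
The excerpt names the technique — training an algorithm to connect daytime and nighttime satellite images — without showing code. As a hedged illustration only, the sketch below shows the transfer-learning pattern that description suggests, common in this literature; the network choice, luminosity bins, tensors, and hyperparameters are stand-in assumptions, not the authors’ actual pipeline.

```python
# Illustrative sketch (not the authors' code): learn features that link
# daytime satellite tiles to nighttime light intensity, then reuse those
# features for poverty estimation. All inputs here are stand-in tensors.
import torch
import torch.nn as nn
from torchvision import models

# Step 1: a CNN learns to predict nighttime-light intensity (binned into
# low/medium/high) from daytime tiles of roughly 4x4 km.
backbone = models.resnet18(weights=None)
backbone.fc = nn.Linear(backbone.fc.in_features, 3)  # 3 luminosity bins

optimizer = torch.optim.Adam(backbone.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

daytime_tiles = torch.randn(8, 3, 224, 224)  # stand-in for real imagery
night_bins = torch.randint(0, 3, (8,))       # stand-in luminosity labels

backbone.train()
loss = loss_fn(backbone(daytime_tiles), night_bins)
loss.backward()
optimizer.step()

# Step 2: reuse the learned features and regress survey-based poverty
# shares on them, so estimates extend to tiles without survey coverage.
backbone.fc = nn.Identity()          # expose the 512-d feature vector
with torch.no_grad():
    features = backbone(daytime_tiles)
poverty_head = nn.Linear(512, 1)     # trained against ground-truth shares
```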

The Innovation Project: Can advanced data science methods be a game-changer for data sharing?


Report by JIPS (Joint Internal Displacement Profiling Service): “Much has changed in the humanitarian data landscape in the last decade, and not primarily because of the arrival of big data and artificial intelligence. Mostly, the changes are due to increased capacity and resources to collect more data more quickly, leading to the professionalisation of information management as a domain of work. Larger amounts of data are becoming available in a more predictable way. We believe that as the field has progressed in filling critical data gaps, the problem is no longer the availability of data, but the curation and sharing of that data between actors, as well as its use to its full potential.

In 2018, JIPS embarked on an innovation journey to explore the potential of state-of-the-art technologies to incentivise data sharing and collaboration. This report covers the first phase of the innovation project and launches a series of articles in which we will share more about the innovation journey itself, discuss safe data sharing and collaboration, and look at the prototype we developed – made possible by the UNHCR Innovation Fund.

We argue that making data and insights safe and secure to share between stakeholders will allow for more efficient use of available data, reduce the resources needed to collect new data, strengthen collaboration, and foster a culture of trust in the evidence-informed protection of people in displacement and crises.

The paper first defines the problem and outlines the processes through which data is currently shared among the humanitarian community. It explores questions such as: what are the existing data sharing methods and technologies? Which ones constitute a feasible option for humanitarian and development organisations? How can different actors share and collaborate on datasets without impairing confidentiality and exposing them to disclosure threats?…(More)”.
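
The excerpt poses the disclosure question without answering it here. As one hedged illustration of the kind of safeguard at stake, the sketch below checks k-anonymity — that every combination of quasi-identifier values is shared by at least k records — before a dataset is released. The column names, records, and threshold are invented and do not come from the JIPS prototype.

```python
# Illustrative k-anonymity check before sharing a dataset; all names and
# values below are hypothetical, not drawn from the report.
import pandas as pd

def is_k_anonymous(df: pd.DataFrame, quasi_identifiers: list[str], k: int = 5) -> bool:
    """True if every combination of quasi-identifier values is shared
    by at least k records, so no individual stands out."""
    group_sizes = df.groupby(quasi_identifiers).size()
    return bool((group_sizes >= k).all())

records = pd.DataFrame({
    "age_band": ["20-29", "20-29", "30-39", "30-39", "30-39"],
    "district": ["north", "north", "south", "south", "south"],
    "status":   ["IDP", "IDP", "host", "host", "IDP"],
})

# With k=2 both quasi-identifier groups pass (sizes 2 and 3); with k=3
# the northern group would fail, signalling that further aggregation is
# needed before the data can be shared safely.
print(is_k_anonymous(records, ["age_band", "district"], k=2))  # True
```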

Building a Responsible Open Data Ecosystem: Mobility Data & COVID-19


Blog by Anna Livaccari: “Over the last year and a half, COVID-19 has changed the way people move, work, shop, and live. The pandemic has necessitated new data-sharing initiatives to understand new patterns of movement, analyze the spread of COVID-19, and inform research and decision-making. Earlier this year, Cuebiq collaborated with the Open Data Institute (ODI) and NYU’s The GovLab to explore the efficacy of these new initiatives. 

The ODI is a non-profit organization that brings together commercial and non-commercial organizations and governments to address global issues and to advise on how data can be used for positive social good. As part of a larger project titled “COVID-19: Building an open and trustworthy data ecosystem,” the ODI published a new report with Cuebiq and The GovLab, an action research center at NYU’s Tandon School of Engineering that has pioneered the concept of data collaboratives and runs the Data Stewards Network, among other initiatives to advance data-driven decision-making in the public interest. This report, “The Use of Mobility Data for Responding to the COVID-19 Pandemic,” specifically addresses key enablers of and obstacles to the successful sharing of mobility data between public and private organizations during the pandemic….

Since early 2020, researchers and policy makers have been eager to understand the impact of COVID-19. With the help of mobility data, organizations from different sectors were able to answer some of the most pressing questions regarding the pandemic: questions about policy decisions, mass-communication strategies, and overall socioeconomic impact. Mobility data can be applied to specific use cases and can help answer complex questions, a fact that The GovLab discusses in its short-form mobility data brief. Understanding exactly how organizations employ mobility data can also improve how institutions operate post-pandemic and make data collaboration as a whole more responsible, sustainable, and systemic.

Cuebiq and The GovLab identified 51 projects where mobility data was used for pandemic response, and then selected five case studies to analyze further. The report defines mobility data, examines the ethics surrounding it, and draws lessons for the future….(More)”.

The Mobility Data Sharing Assessment


New Tool from the Mobility Data Collaborative (MDC): “…released a set of resources to support transparent and accountable decision making about how and when to share mobility data between organizations. …The Mobility Data Sharing Assessment (MDSA) is a practical and customizable assessment that provides operational guidance to support an organization’s existing processes when sharing or receiving mobility data. It consists of a collection of resources:

  1. A Tool that provides a practical, customizable, and open-source framework for organizations to conduct a self-assessment.
  2. An Operator’s Manual that provides detailed instructions, guidance, and additional resources to assist organizations as they complete the Tool.
  3. An Infographic that provides a visual overview of the MDSA process.

“We were excited to work with the MDC to create a practical set of resources to support mobility data sharing between organizations,” said Chelsey Colbert, policy counsel at FPF. “Through collaboration, we designed version one of a technology-neutral tool, which is consistent and interoperable with leading industry frameworks. The MDSA was designed to be a flexible and scalable approach that enables mobility data sharing initiatives by encouraging organizations of all sizes to assess the legal, privacy, and ethical considerations.”

New mobility options, such as shared cars and e-scooters, have rapidly emerged in cities over the past decade. Data generated by these mobility services offers an exciting opportunity to provide valuable and timely insight to effectively develop transportation policy and infrastructure. As the world becomes more data-driven, tools like the MDSA help remove barriers to safe data sharing without compromising consumer trust….(More)”.

Data in Crisis — Rethinking Disaster Preparedness in the United States


Paper by Satchit Balsari, Mathew V. Kiang, and Caroline O. Buckee: “…In recent years, large-scale streams of digital data on medical needs, population vulnerabilities, physical and medical infrastructure, human mobility, and environmental conditions have become available in near-real time. Sophisticated analytic methods for combining them meaningfully are being developed and are rapidly evolving. However, the translation of these data and methods into improved disaster response faces substantial challenges. The data exist but are not readily accessible to hospitals and response agencies. The analytic pipelines to rapidly translate them into policy-relevant insights are lacking, and there is no clear designation of responsibility or mandate to integrate them into disaster-mitigation or disaster-response strategies. Building these integrated translational pipelines that use data rapidly and effectively to address the health effects of natural disasters will require substantial investments, and these investments will, in turn, rely on clear evidence of which approaches actually improve outcomes. Public health institutions face some ongoing barriers to achieving this goal, but promising solutions are available….(More)”

WHO, Germany open Hub for Pandemic and Epidemic Intelligence in Berlin


Press Release: “To better prepare and protect the world from global disease threats, H.E. German Federal Chancellor Dr Angela Merkel and Dr Tedros Adhanom Ghebreyesus, World Health Organization Director-General, will today inaugurate the new WHO Hub for Pandemic and Epidemic Intelligence, based in Berlin. 

“The world needs to be able to detect new events with pandemic potential and to monitor disease control measures on a real-time basis to create effective pandemic and epidemic risk management,” said Dr Tedros. “This Hub will be key to that effort, leveraging innovations in data science for public health surveillance and response, and creating systems whereby we can share and expand expertise in this area globally.” 

The WHO Hub, which is receiving an initial investment of US$ 100 million from the Federal Republic of Germany, will harness broad and diverse partnerships across many professional disciplines, and the latest technology, to link the data, tools and communities of practice so that actionable data and intelligence are shared for the common good.

The WHO Hub is part of WHO’s Health Emergencies Programme and will be a new collaboration of countries and partners worldwide, driving innovations to increase availability of key data; develop state-of-the-art analytic tools and predictive models for risk analysis; and link communities of practice around the world. Critically, the WHO Hub will support the work of public health experts and policy-makers in all countries with the tools needed to forecast, detect and assess epidemic and pandemic risks so they can take rapid decisions to prevent and respond to future public health emergencies.

“Despite decades of investment, COVID-19 has revealed the great gaps that exist in the world’s ability to forecast, detect, assess and respond to outbreaks that threaten people worldwide,” said Dr Michael Ryan, Executive Director of WHO’s Health Emergencies Programme. “The WHO Hub for Pandemic and Epidemic Intelligence is designed to develop the data access, analytic tools and communities of practice to fill these very gaps, promote collaboration and sharing, and protect the world from such crises in the future.”

The Hub will work to:

  • Enhance methods for access to multiple data sources vital to generating signals and insights on disease emergence, evolution and impact;
  • Develop state-of-the-art tools to process, analyze and model data for detection, assessment and response (a minimal illustrative sketch follows this list);
  • Provide WHO, our Member States, and partners with these tools to underpin better, faster decisions on how to address outbreak signals and events; and
  • Connect and catalyze institutions and networks developing disease outbreak solutions for the present and future.
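
As the second bullet anticipates, here is a minimal, hedged sketch of one simple family of such detection tools: flagging a week of case counts that sits far above a rolling historical baseline. It is loosely inspired by basic aberration-detection methods and is not a WHO algorithm; the counts and threshold are invented.

```python
# Illustrative outbreak-signal check: flag the latest week if it exceeds
# the rolling baseline by more than `threshold` standard deviations.
# Data and parameters are hypothetical.
import statistics

weekly_cases = [12, 9, 14, 11, 10, 13, 12, 41]  # invented weekly counts

def outbreak_signal(counts: list[int], baseline_weeks: int = 7,
                    threshold: float = 3.0) -> bool:
    """Signal if the latest week sits more than `threshold` standard
    deviations above the mean of the preceding baseline weeks."""
    baseline = counts[-(baseline_weeks + 1):-1]
    mean = statistics.mean(baseline)
    sd = statistics.stdev(baseline) or 1.0  # guard against a flat baseline
    return (counts[-1] - mean) / sd > threshold

print(outbreak_signal(weekly_cases))  # True: 41 is far above ~11.6/week
```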

Dr Chikwe Ihekweazu, currently Director-General of the Nigeria Centre for Disease Control, has been appointed to lead the WHO Hub….(More)” 

The Open-Source Movement Comes to Medical Datasets


Blog by Edmund L. Andrews: “In a move to democratize research on artificial intelligence and medicine, Stanford’s Center for Artificial Intelligence in Medicine and Imaging (AIMI) is dramatically expanding what is already the world’s largest free repository of AI-ready annotated medical imaging datasets.

Artificial intelligence has become an increasingly pervasive tool for interpreting medical images, from detecting tumors in mammograms and brain scans to analyzing ultrasound videos of a person’s pumping heart.

Many AI-powered devices now rival the accuracy of human doctors. Beyond simply spotting a likely tumor or bone fracture, some systems predict the course of a patient’s illness and make recommendations.

But AI tools have to be trained on expensive datasets of images that have been meticulously annotated by human experts. Because those datasets can cost millions of dollars to acquire or create, much of the research is being funded by big corporations that don’t necessarily share their data with the public.

“What drives this technology, whether you’re a surgeon or an obstetrician, is data,” says Matthew Lungren, co-director of AIMI and an assistant professor of radiology at Stanford. “We want to double down on the idea that medical data is a public good, and that it should be open to the talents of researchers anywhere in the world.”

Launched two years ago, AIMI has already acquired annotated datasets for more than 1 million images, many of them from the Stanford University Medical Center. Researchers can download those datasets at no cost and use them to train AI models that recommend certain kinds of action.
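
To make concrete what expert annotations enable, here is a hypothetical sketch of the benchmarking step they make possible: a model’s outputs on held-out studies are scored against the human labels. The file names, labels, and probabilities are invented, and this is not AIMI’s actual distribution format or evaluation harness.

```python
# Hypothetical sketch: expert annotations on a held-out set serve as the
# ground truth a trained model is scored against.
import pandas as pd
from sklearn.metrics import roc_auc_score

held_out = pd.DataFrame({
    "image":      ["img_0006.png", "img_0007.png", "img_0008.png", "img_0009.png"],
    "finding":    [0, 1, 0, 1],              # expert-annotated ground truth
    "model_prob": [0.15, 0.82, 0.40, 0.67],  # stand-in model outputs
})

# Here every positive study is ranked above every negative one, so the
# area under the ROC curve is 1.0; real models are compared on exactly
# this kind of annotated benchmark.
print(roc_auc_score(held_out["finding"], held_out["model_prob"]))  # 1.0
```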

Now, AIMI has teamed up with Microsoft’s AI for Health program to launch a new platform that will be more automated, accessible, and visible. It will be capable of hosting and organizing scores of additional images from institutions around the world. Part of the idea is to create an open and global repository. The platform will also provide a hub for sharing research, making it easier to refine different models and identify differences between population groups. The platform can even offer cloud-based computing power so researchers don’t have to worry about building local, resource-intensive clinical machine-learning infrastructure….(More)”.

The “Onion Model”: A Layered Approach to Documenting How the Third Wave of Open Data Can Provide Societal Value


Blog post by Andrew Zahuranec, Andrew Young and Stefaan Verhulst: “There’s a lot that goes into data-driven decision-making. Behind the datasets, platforms, and analysts is a complex series of processes that inform what kinds of insight data can produce and what kinds of ends it can achieve. These individual processes can be hard to understand when viewed together but, by separating the stages out, we can not only track how data leads to decisions but also promote better and more impactful data management.

Earlier this year, The Open Data Policy Lab published the Third Wave of Open Data Toolkit to explore the elements of data re-use. At the center of this toolkit was an abstraction that we call the Open Data Framework. Divided into individual, onion-like layers, the framework shows all the processes that go into capitalizing on data in the third wave, starting with the creation of a dataset through data collaboration, creating insights, and using those insights to produce value.

This blog reiterates what’s included in each layer of this data “onion model” and demonstrates how organizations can create societal value by making their data available for re-use by other parties….(More)”.
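
As a rough, unofficial illustration of that layering, the stages can be written down as an ordered structure; the layer names below paraphrase the blog’s one-sentence summary and are not a schema from the toolkit itself.

```python
# Unofficial paraphrase of the onion model's layers as an ordered enum;
# each outer layer presupposes the ones inside it.
from enum import IntEnum

class OnionLayer(IntEnum):
    DATASET_CREATION   = 1  # a dataset is generated or collected
    DATA_COLLABORATION = 2  # the data is made available for re-use
    INSIGHT_GENERATION = 3  # analysis turns shared data into insight
    SOCIETAL_VALUE     = 4  # insights inform decisions that create value

def prerequisites(layer: OnionLayer) -> list[OnionLayer]:
    """Everything that must already be in place to reach a given layer."""
    return [inner for inner in OnionLayer if inner < layer]

print(prerequisites(OnionLayer.SOCIETAL_VALUE))  # the three inner layers
```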

Innovative Data for Urban Planning: The Opportunities and Challenges of Public-Private Data Partnerships


GSMA Report: “Rapid urbanisation will be one of the most pressing and complex challenges in low-and-middle income countries (LMICs) for the next several decades. With cities in Africa and Asia expected to add more than one billion people, urban populations will represent two-thirds of the world population by 2050. This presents LMICs with both an opportunity and a challenge, as rapid urbanisation can contribute either to economic growth or to deepening poverty.

The rapid pace and unequal character of urbanisation in LMICs has meant that not enough data has been generated to support urban planning solutions and the effective provision of urban utility services. Data-sharing partnerships between the public and private sector can bridge this data gap and open up an opportunity for governments to address urbanisation challenges with data-driven decisions. Innovative data sources such as mobile network operator data, remote sensing data, utility services data and other digital services data, can be applied to a range of critical urban planning and service provision use cases.

This report identifies challenges and enablers for public-private data-sharing partnerships (PPPs) that relate to the partnership engagement model, data and technology, regulation and ethics frameworks and evaluation and sustainability….(More)”

Remove obstacles to sharing health data with researchers outside of the European Union


Heidi Beate Bentzen et al in Nature: “International sharing of pseudonymized personal data among researchers is key to the advancement of health research and is an essential prerequisite for studies of rare diseases or subgroups of common diseases to obtain adequate statistical power.

Pseudonymized personal data are data on which identifiers such as names are replaced by codes. Research institutions keep the ‘code key’ that can link an individual person to the data securely and separately from the research data and thereby protect privacy while preserving the usefulness of data for research. Pseudonymized data are still considered personal data under the General Data Protection Regulation (GDPR) 2016/679 of the European Union (EU) and, therefore, international transfers of such data need to comply with GDPR requirements. Although the GDPR does not apply to transfers of anonymized data, the threshold for anonymity under the GDPR is very high; hence, rendering data anonymous to the level required for exemption from the GDPR can diminish the usefulness of the data for research and is often not even possible.
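
The paragraph above describes pseudonymization operationally; a minimal sketch, assuming invented names and fields, makes the separation of code key and research data concrete.

```python
# Minimal pseudonymization sketch: direct identifiers are replaced with
# random codes, and the code key is stored separately from the research
# data. Field names and records are illustrative.
import secrets

patients = [
    {"name": "Anna Berg", "diagnosis": "rare_disease_x"},
    {"name": "Jon Low",   "diagnosis": "rare_disease_y"},
]

code_key = {}       # kept securely by the research institution, NOT shared
research_data = []  # pseudonymized records suitable for analysis

for person in patients:
    code = secrets.token_hex(8)      # random identifier, meaningless alone
    code_key[code] = person["name"]  # only this table can re-link to a person
    research_data.append({"id": code, "diagnosis": person["diagnosis"]})

# `research_data` travels to collaborators; `code_key` never does. Under
# the GDPR the records remain personal data, because re-identification is
# possible for whoever holds the key.
print(research_data[0]["id"] != patients[0]["name"])  # True
```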

The GDPR requires that transfers of personal data to international organizations or countries outside the European Economic Area (EEA)—which comprises the EU Member States plus Iceland, Liechtenstein and Norway—be adequately protected. Over the past two years, it has become apparent that challenges emerge for the sharing of data with public-sector researchers in a majority of countries outside of the EEA, as only a few decisions stating that a country offers an adequate level of data protection have so far been issued by the European Commission. This is a problem, for example, with researchers at federal research institutions in the United States. Transfers to international organizations such as the World Health Organization are similarly affected. Because these obstacles ultimately affect patients as beneficiaries of research, solutions are urgently needed. The European scientific academies have recently published a report explaining the consequences of stalled data transfers and pushing for responsible solutions…(More)”.