Combating COVID-19 with Data: What Role for National Statistical Systems?


Press Release: “As part of its ongoing response to the COVID-19 crisis, PARIS21 today released a policy brief at the intersection of statistics and policy making to help inform the measures taken to address the pandemic.


The COVID-19 pandemic has brought data to the centre of policy making and public attention. A diverse ecosystem of data producers, both private and public, report rates of infection, fatality and recovery on a daily basis. However, a proliferation of data, which is at times contradictory, can also lead to confusion and mistrust among data users.

Meanwhile, policymakers, development partners and citizens need to take quick, informed actions to design interventions that reach the most vulnerable and leave no one behind. As countries comply with lockdowns and other containment measures, national statistical systems (NSSs) face a dual effect of growing data demand and constrained supply. This in turn may squeeze NSSs beyond their institutional capacity.

At the same time, alternative data sources such as mobile phone or satellite data are in abundance. These data could potentially complement traditional sources such as censuses, surveys and administrative systems. However, with scant governance frameworks to scale and sustain their use, policy action is not yet based on a convergence of evidence.

This policy brief introduces a conceptual framework that describes the adverse effects of the crisis on NSSs in developing countries. Moreover, it suggests short- and medium-term actions to mitigate the negative effects by:

1. Focusing data production on priority economic, social and demographic data.
2. Communicating proactively with citizens, academia, private sector and policy makers.
3. Positioning the national statistical office (NSO) as an advisor and knowledge bank for national governments.

NSSs contribute significantly to robust policy responses in a crisis. The brief thus calls on national statistical offices to assume a central role as coordinators of the NSSs and chart the way toward improved data ecosystem governance for informing policies during and after COVID-19….(More)”.

Open Covid Pledge


Pledge: “Immediate action is required to halt the COVID-19 pandemic and treat those it has affected. It is a practical and moral imperative that every tool we have at our disposal be applied to develop and deploy technologies on a massive scale without impediment.

We therefore pledge to make our intellectual property available free of charge for use in ending the COVID-19 pandemic and minimizing the impact of the disease.

We will implement this pledge through a license that details the terms and conditions under which our intellectual property is made available.

How to make the Pledge

The first step for organizations wishing to make the Pledge is to publicly commit to making intellectual property relevant to COVID-19 freely available, by:

  • Posting a public statement on their website that the organization is making the Pledge.
  • Issuing an official press release.

Organizations should then send us a link to this statement, a point of contact in the organization, and, at the organization’s discretion, a copy of their logo to display on this site.

How to implement the Pledge

The next step for organizations that have made the Pledge is to implement it via a license detailing the terms and conditions under which their intellectual property is made available. There are three options for doing so:

  • Adopt the Open COVID License, created by our legal team for organizations that wish to implement the Pledge simply and immediately on terms shared by many other organizations.
  • Create a custom license that accomplishes the intent of the Pledge.
  • Identify existing license(s) that accomplish the goals of the Pledge.

As with making the Pledge, send us links to the license or licenses, a point of contact in the organization, and, at the organization’s discretion, a copy of their logo to display on this site….(More)”.

Coronavirus: country comparisons are pointless unless we account for these biases in testing


Norman Fenton, Magda Osman, Martin Neil, and Scott McLachlan at The Conversation: “Suppose we wanted to estimate how many car owners there are in the UK and how many of those own a Ford Fiesta, but we only have data on those people who visited Ford car showrooms in the last year. If 10% of the showroom visitors owned a Fiesta, then, because of the bias in the sample, this would certainly overestimate the proportion of Ford Fiesta owners in the country.

Estimating death rates for people with COVID-19 is currently undertaken largely along the same lines. In the UK, for example, almost all testing for COVID-19 is performed on people already hospitalised with COVID-19 symptoms. At the time of writing, there are 29,474 confirmed COVID-19 cases (analogous to car owners visiting a showroom), of whom 2,352 have died (Ford Fiesta owners who visited a showroom). But this sample misses all the people with mild or no symptoms.

Concluding that the death rate from COVID-19 is on average 8% (2,352 out of 29,474) ignores the many people with COVID-19 who are not hospitalised and have not died (analogous to car owners who did not visit a Ford showroom and who do not own a Ford Fiesta). It is therefore equivalent to making the mistake of concluding that 10% of all car owners own a Fiesta.
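To make the analogy concrete, here is a minimal sketch with made-up illustrative numbers (not from the article and not real surveillance data) of how testing only hospitalised cases inflates the apparent death rate far above the true infection fatality rate:

```python
# Toy numbers chosen for illustration only; they are not real surveillance data.
infected = 1_000_000        # true (unobserved) number of infections
true_ifr = 0.005            # assumed true infection fatality rate: 0.5%
tested_fraction = 0.06      # assume only severe, hospitalised cases get tested

deaths = int(infected * true_ifr)                   # 5,000 deaths overall
confirmed_cases = int(infected * tested_fraction)   # 60,000 confirmed via testing

# Simplifying assumption: every death occurs among the hospitalised, tested cases,
# so all deaths appear in the confirmed-case statistics.
naive_death_rate = deaths / confirmed_cases   # what headline country comparisons report
true_death_rate = deaths / infected           # what we actually want to know

print(f"Naive 'death rate' among confirmed cases: {naive_death_rate:.1%}")  # ~8.3%
print(f"Assumed true infection fatality rate:     {true_death_rate:.1%}")   # 0.5%
```

Under these assumptions the naive figure comes out at roughly 8%, an order of magnitude above the assumed true rate, purely because of who gets tested.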

There are many prominent examples of this sort of conclusion. The Oxford COVID-19 Evidence Service have undertaken a thorough statistical analysis. They acknowledge potential selection bias, and add confidence intervals showing how big the error may be for the (potentially highly misleading) proportion of deaths among confirmed COVID-19 patients.

They note various factors that can result in wide national differences – for example the UK’s 8% (mean) “death rate” is very high compared to Germany’s 0.74%. These factors include different demographics, for example the number of elderly in a population, as well as how deaths are reported. For example, in some countries everybody who dies after having been diagnosed with COVID-19 is recorded as a COVID-19 death, even if the disease was not the actual cause, while other people may die from the virus without actually having been diagnosed with COVID-19.

However, these models fail to incorporate explicit causal explanations that might enable us to make more meaningful inferences from the available data, including data on virus testing.

[Figure: what a causal model would look like]

We have developed an initial prototype “causal model” whose structure is shown in the figure above. The links between the named variables in a model like this show how they are dependent on each other. These links, along with other unknown variables, are captured as probabilities. As data are entered for specific, known variables, all of the unknown variable probabilities are updated using a method called Bayesian inference. The model shows that the COVID-19 death rate is as much a function of sampling methods, testing and reporting, as it is determined by the underlying rate of infection in a vulnerable population….(More)”
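The article does not reproduce the model itself, so the sketch below is our own toy illustration rather than the authors' prototype: it places a flat prior over a single unknown variable, the population infection fatality rate (IFR), and updates it by Bayes' rule once the observed deaths among tested cases are entered, under an assumed "severity multiplier" that links tested, hospitalised cases to the general population:

```python
from math import exp, lgamma, log

# One unknown variable: the population infection fatality rate (IFR).
# A flat prior over a handful of candidate values (an assumption for illustration).
ifr_hypotheses = [0.001, 0.005, 0.01, 0.02, 0.05]
prior = {ifr: 1 / len(ifr_hypotheses) for ifr in ifr_hypotheses}

# Observed (known) variables: deaths among tested cases (the UK figures quoted above).
tested, deaths = 29_474, 2_352

# Assumed link between the unknown and the observation: tested cases are selected
# for severity, so their risk of death is taken to be 10x the population IFR.
severity_multiplier = 10

def log_likelihood(ifr: float) -> float:
    """Log of the binomial probability of the observed deaths under one IFR hypothesis."""
    p = min(ifr * severity_multiplier, 0.999)
    log_binom = lgamma(tested + 1) - lgamma(deaths + 1) - lgamma(tested - deaths + 1)
    return log_binom + deaths * log(p) + (tested - deaths) * log(1 - p)

# Bayes' rule, computed in log space to avoid numerical underflow:
# the posterior is proportional to prior times likelihood.
log_weights = {ifr: log(prior[ifr]) + log_likelihood(ifr) for ifr in ifr_hypotheses}
shift = max(log_weights.values())
weights = {ifr: exp(lw - shift) for ifr, lw in log_weights.items()}
total = sum(weights.values())
posterior = {ifr: w / total for ifr, w in weights.items()}

for ifr, p in sorted(posterior.items()):
    print(f"IFR hypothesis {ifr:.3f}: posterior probability {p:.3f}")
```

With a sample this large the posterior collapses onto a single hypothesis; the point of the illustration is simply that what the data imply about the unknown rate depends entirely on the assumed link between who is tested and who is infected.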

A guide to healthy skepticism of artificial intelligence and coronavirus


Alex Engler at Brookings: “The COVID-19 outbreak has spurred considerable news coverage about the ways artificial intelligence (AI) can combat the pandemic’s spread. Unfortunately, much of it has failed to be appropriately skeptical about the claims of AI’s value. Like many tools, AI has a role to play, but its effect on the outbreak is probably small. While this may change in the future, technologies like data reporting, telemedicine, and conventional diagnostic tools are currently far more impactful than AI.

Still, various news articles have dramatized the role AI is playing in the pandemic by overstating what tasks it can perform, inflating its effectiveness and scale, neglecting the level of human involvement, and being careless in consideration of related risks. In fact, the COVID-19 AI-hype has been diverse enough to cover the greatest hits of exaggerated claims around AI. And so, framed around examples from the COVID-19 outbreak, here are eight considerations for a skeptic’s approach to AI claims….(More)”.

The Rules of Contagion: Why Things Spread–And Why They Stop


Book by Adam Kucharski: “From ideas and infections to financial crises and “fake news,” why the science of outbreaks is the science of modern life.


These days, whenever anything spreads, whether it’s a YouTube fad or a political rumor, we say it went viral. But how does virality actually work? In The Rules of Contagion, epidemiologist Adam Kucharski explores topics including gun violence, online manipulation, and, of course, outbreaks of disease to show how much we get wrong about contagion, and how astonishing the real science is.
Why did the president retweet a Mussolini quote as his own? Why do financial bubbles take off so quickly? Why are disinformation campaigns so effective? And what makes the emergence of new illnesses – such as MERS, SARS, or the coronavirus disease COVID-19 – so challenging? By uncovering the crucial factors driving outbreaks, we can see how things really spread — and what we can do about it….(More)”.

Mobile phone data and COVID-19: Missing an opportunity?


Paper by Nuria Oliver, et al: “This paper describes how mobile phone data can guide government and public health authorities in determining the best course of action to control the COVID-19 pandemic and in assessing the effectiveness of control measures such as physical distancing. It identifies key gaps and reasons why this kind of data is still only scarcely used, although its value in similar epidemics has been proven in a number of use cases. It presents ways to overcome these gaps and key recommendations for urgent action, most notably the establishment of mixed expert groups at national and regional levels, and the inclusion and support of governments and public authorities early on. It is authored by a group of experienced data scientists, epidemiologists, demographers and representatives of mobile network operators who jointly put their work at the service of the global effort to combat the COVID-19 pandemic….(More)”.

Collective Intelligence at EU Level – Social and Democratic Dimensions


Paper by Nora Milotay and Gianluca Sgueo: “Humans are among the many living species capable of collaborative and imaginative thinking. While it is widely agreed among scholars that this capacity has contributed to making humans the dominant species, other crucial questions remain open to debate. Is it possible to encourage large groups of people to engage in collective thinking? Is it possible to coordinate citizens to find solutions to address global challenges? Some scholars claim that large groups of independent, motivated, and well-informed people can, collectively, make better decisions than isolated individuals can – what is known as ‘collective intelligence.’

The social dimension of collective intelligence mainly relates to social aspects of the economy and of innovation. It shows that a holistic approach to innovation – one that includes not only technological but also social aspects – can greatly contribute to the EU’s goal of promoting a just transition for everyone to a sustainable and green economy in the digital age. The EU has been taking concrete action to promote social innovation by supporting the development of its theory and practice. Mainly through funding programmes, it helps to seek new types of partners and build new capacity – and thus shape the future of local and national innovations aimed at societal needs.

The democratic dimension suggests that the power of the collective can be leveraged so as to improve public decision-making systems. Supported by technology, policy-makers can harness the ‘civic surplus’ of citizens – thus providing smarter solutions to regulatory challenges. This is particularly relevant at EU level in view of the planned Conference on the Future of Europe, aimed at engaging communities at large and making EU decision-making more inclusive and participatory.

The current coronavirus crisis is likely to change society and our economy in ways it is as yet too early to predict, but recovery after the crisis will require new ways of thinking and acting to overcome common challenges, making the use of our collective intelligence more urgent than ever. In the longer term, in order to mobilise collective intelligence across the EU and to fully exploit its innovative potential, the EU needs to strengthen its education policies and promote a shared understanding of a holistic approach to innovation and of collective intelligence – and thus become a ‘global brain,’ with a solid institutional set-up at the centre of a subsidised experimentation process that meets the challenges imposed by modern-day transformations…(More)”.

A Closer Look at Location Data: Privacy and Pandemics


Assessment by Stacey Gray: “In light of COVID-19, there is heightened global interest in harnessing location data held by major tech companies to track individuals affected by the virus, better understand the effectiveness of social distancing, or send alerts to individuals who might be affected based on their previous proximity to known cases. Governments around the world are considering whether and how to use mobile location data to help contain the virus: Israel’s government passed emergency regulations to address the crisis using cell phone location data; the European Commission requested that mobile carriers provide anonymized and aggregate mobile location data; and South Korea has created a publicly available map of location data from individuals who have tested positive. 

Public health agencies and epidemiologists have long been interested in analyzing device location data to track diseases. In general, the movement of devices effectively mirrors movement of people (with some exceptions discussed below). However, its use comes with a range of ethical and privacy concerns. 

In order to help policymakers address these concerns, we provide below a brief explainer guide of the basics: (1) what is location data, (2) who holds it, and (3) how is it collected? Finally, we discuss some preliminary ethical and privacy considerations for processing location data. Researchers and agencies should consider: how and in what context the location data was collected; the fact and reasoning behind location data being classified as legally “sensitive” in most jurisdictions; the challenges of effective “anonymization”; the representativeness of the location dataset (taking into account potential bias and the exclusion of low-income and elderly subpopulations who do not own phones); and the unique importance of purpose limitation, i.e. not re-using location data for other civil or law enforcement purposes after the pandemic is over….(More)”.
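To make one of these mitigations concrete: a common way to produce “anonymized and aggregate” mobility statistics of the kind the European Commission requested is to coarsen raw pings in space and time and to suppress small counts, because sparsely populated cells are an easy route to re-identification. The sketch below is our own illustration of that idea, not any carrier's actual pipeline; the grid size and the suppression threshold are assumptions:

```python
from collections import defaultdict

MIN_DEVICES = 10   # suppression threshold; an assumption here, real thresholds vary

def aggregate(pings, cell_size=0.05):
    """pings: iterable of (device_id, lat, lon, hour). Returns coarse, suppressed counts."""
    cells = defaultdict(set)
    for device_id, lat, lon, hour in pings:
        # Snap coordinates to a coarse grid cell (roughly 5 km at this cell size).
        cell = (round(lat / cell_size) * cell_size,
                round(lon / cell_size) * cell_size,
                hour)
        cells[cell].add(device_id)
    # Release only distinct-device counts, and drop cells with too few devices,
    # since very small counts are a common re-identification risk.
    return {cell: len(devices)
            for cell, devices in cells.items()
            if len(devices) >= MIN_DEVICES}

if __name__ == "__main__":
    import random
    random.seed(1)
    # Synthetic pings from 500 hypothetical devices in a small area over one day.
    pings = [(f"dev{i % 500}",
              52.0 + random.random() * 0.2,
              13.3 + random.random() * 0.2,
              random.randrange(24))
             for i in range(5_000)]
    released = aggregate(pings)
    print(f"{len(released)} grid-hour cells released; sparser cells suppressed")
```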

Human migration: the big data perspective


Alina Sîrbu et al at the International Journal of Data Science and Analytics: “How can big data help to understand the migration phenomenon? In this paper, we try to answer this question through an analysis of various phases of migration, comparing traditional and novel data sources and models at each phase. We concentrate on three phases of migration, at each phase describing the state of the art and recent developments and ideas. The first phase includes the journey, and we study migration flows and stocks, providing examples where big data can have an impact. The second phase discusses the stay, i.e. migrant integration in the destination country. We explore various data sets and models that can be used to quantify and understand migrant integration, with the final aim of providing the basis for the construction of a novel multi-level integration index. The last phase is related to the effects of migration on the source countries and the return of migrants….(More)”.

A controlled trial for reproducibility


Marc P. Raphael, Paul E. Sheehan & Gary J. Vora at Nature: “In 2016, the US Defense Advanced Research Projects Agency (DARPA) told eight research groups that their proposals had made it through the review gauntlet and would soon get a few million dollars from its Biological Technologies Office (BTO). Along with congratulations, the teams received a reminder that their award came with an unusual requirement — an independent shadow team of scientists tasked with reproducing their results.

Thus began an intense, multi-year controlled trial in reproducibility. Each shadow team consists of three to five researchers, who visit the ‘performer’ team’s laboratory and often host visits themselves. Between 3% and 8% of the programme’s total funds go to this independent validation and verification (IV&V) work. But DARPA has the flexibility and resources for such herculean efforts to assess essential techniques. In one unusual instance, an IV&V laboratory needed a sophisticated US$200,000 microscopy and microfluidic set-up to make an accurate assessment.

These costs are high, but we think they are an essential investment to avoid wasting taxpayers’ money and to advance fundamental research towards beneficial applications. Here, we outline what we’ve learnt from implementing this programme, and how it could be applied more broadly….(More)”.