Coronavirus: country comparisons are pointless unless we account for these biases in testing


Norman Fenton, Magda Osman, Martin Neil, and Scott McLachlan at The Conversation: “Suppose we wanted to estimate how many car owners there are in the UK and how many of those own a Ford Fiesta, but we only have data on those people who visited Ford car showrooms in the last year. If 10% of the showroom visitors owned a Fiesta, then, because of the bias in the sample, this would certainly overestimate the proportion of Ford Fiesta owners in the country.

Estimating death rates for people with COVID-19 is currently undertaken largely along the same lines. In the UK, for example, almost all testing of COVID-19 is performed on people already hospitalised with COVID-19 symptoms. At the time of writing, there are 29,474 confirmed COVID-19 cases (analogous to car owners visiting a showroom) of whom 2,352 have died (Ford Fiesta owners who visited a showroom). But it misses out all the people with mild or no symptoms.

Concluding that the death rate from COVID-19 is on average 8% (2,352 out of 29,474) ignores the many people with COVID-19 who are not hospitalised and have not died (analogous to car owners who did not visit a Ford showroom and who do not own a Ford Fiesta). It is therefore equivalent to making the mistake of concluding that 10% of all car owners own a Fiesta.
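The arithmetic behind the 8% figure, and how quickly it shrinks once undetected infections enter the denominator, can be sketched as follows. The case and death counts are the ones quoted above; the multipliers for undetected infections are purely hypothetical values chosen for illustration, and the function names are ours, not the authors'.

```python
# Counts quoted in the article (UK, at the time of writing).
CONFIRMED_CASES = 29_474
DEATHS = 2_352


def naive_death_rate(deaths: int, confirmed: int) -> float:
    """Deaths divided by confirmed (mostly hospitalised) cases."""
    return deaths / confirmed


def adjusted_death_rate(deaths: int, confirmed: int,
                        undetected_per_confirmed: float) -> float:
    """Deaths divided by all infections, detected or not.

    ASSUMPTION: every undetected infection is a survivor, and
    undetected_per_confirmed is an illustrative multiplier, not data.
    """
    total_infections = confirmed * (1 + undetected_per_confirmed)
    return deaths / total_infections


print(f"Naive rate: {naive_death_rate(DEATHS, CONFIRMED_CASES):.1%}")
# Hypothetical scenarios: 0, 4, 9 or 49 undetected infections per confirmed case.
for multiplier in (0, 4, 9, 49):
    rate = adjusted_death_rate(DEATHS, CONFIRMED_CASES, multiplier)
    print(f"{multiplier:>2} undetected per confirmed case: {rate:.2%}")
```

The point of the sketch is only that the headline percentage depends heavily on who gets tested; it does not estimate the true number of undetected infections.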

There are many prominent examples of this sort of conclusion. The Oxford COVID-19 Evidence Service have undertaken a thorough statistical analysis. They acknowledge potential selection bias, and add confidence intervals showing how big the error may be for the (potentially highly misleading) proportion of deaths among confirmed COVID-19 patients.

They note various factors that can result in wide national differences – for example the UK’s 8% (mean) “death rate” is very high compared to Germany’s 0.74%. These factors include different demographics, for example the number of elderly in a population, as well as how deaths are reported. For example, in some countries everybody who dies after having been diagnosed with COVID-19 is recorded as a COVID-19 death, even if the disease was not the actual cause, while other people may die from the virus without actually having been diagnosed with COVID-19.

However, the models fail to incorporate explicit causal explanations in their modelling that might enable us to make more meaningful inferences from the available data, including data on virus testing.

[Figure: What a causal model would look like. Author provided.]

We have developed an initial prototype “causal model” whose structure is shown in the figure above. The links between the named variables in a model like this show how they are dependent on each other. These links, along with other unknown variables, are captured as probabilities. As data are entered for specific, known variables, all of the unknown variable probabilities are updated using a method called Bayesian inference. The model shows that the COVID-19 death rate is as much a function of sampling methods, testing and reporting, as it is determined by the underlying rate of infection in a vulnerable population….(More)”
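To give a feel for the Bayesian updating step described above, here is a minimal sketch for a single unknown variable. It is not the authors' causal model: it assumes one unknown (the population infection rate), a uniform prior over a grid of candidate rates, and a hypothetical random sample of 100 tests with 5 positives. All names and numbers are illustrative.

```python
from math import comb


def posterior_over_rates(rates, prior, positives, tested):
    """Update beliefs about the infection rate after seeing test results.

    Applies Bayes' rule on a grid: posterior is proportional to
    prior times the binomial likelihood of the observed positives.
    ASSUMPTION: the tested people are a random sample of the population.
    """
    unnormalised = [
        p * comb(tested, positives) * r**positives * (1 - r) ** (tested - positives)
        for r, p in zip(rates, prior)
    ]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


# Candidate infection rates from 1% to 50%, all equally plausible a priori.
rates = [i / 100 for i in range(1, 51)]
prior = [1 / len(rates)] * len(rates)

# Hypothetical data: 5 positives among 100 randomly chosen people.
posterior = posterior_over_rates(rates, prior, positives=5, tested=100)
most_likely = max(zip(posterior, rates))[1]
print(f"Most probable infection rate on the grid: {most_likely:.0%}")
```

If the sample were drawn only from hospitalised patients, the random-sample assumption would fail and the same update would be misleading, which is the kind of bias a causal model is meant to make explicit.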

A guide to healthy skepticism of artificial intelligence and coronavirus


Alex Engler at Brookings: “The COVID-19 outbreak has spurred considerable news coverage about the ways artificial intelligence (AI) can combat the pandemic’s spread. Unfortunately, much of it has failed to be appropriately skeptical about the claims of AI’s value. Like many tools, AI has a role to play, but its effect on the outbreak is probably small. While this may change in the future, technologies like data reporting, telemedicine, and conventional diagnostic tools are currently far more impactful than AI.

Still, various news articles have dramatized the role AI is playing in the pandemic by overstating what tasks it can perform, inflating its effectiveness and scale, neglecting the level of human involvement, and being careless in consideration of related risks. In fact, the COVID-19 AI-hype has been diverse enough to cover the greatest hits of exaggerated claims around AI. And so, framed around examples from the COVID-19 outbreak, here are eight considerations for a skeptic’s approach to AI claims….(More)”.

The War on Coronavirus Is Also a War on Paperwork


Article by Cass Sunstein: “As part of the war on coronavirus, U.S. regulators are taking aggressive steps against “sludge” – paperwork burdens and bureaucratic obstacles. This new battle front is aimed at eliminating frictions, or administrative barriers, that have been badly hurting doctors, nurses, hospitals, patients, and beneficiaries of essential public and private programs. 

Increasingly used in behavioral science, the term sludge refers to everything from form-filling requirements to time spent waiting in line to rules mandating in-person interviews imposed by both private and public sectors. Sometimes those burdens are justified – as, for example, when the Social Security Administration takes steps to ensure that those who receive benefits actually qualify for them. But far too often, sludge is imposed with little thought about its potentially devastating impact.

The coronavirus pandemic is concentrating the bureaucratic mind – and leading to impressive and brisk reforms. Consider a few examples. 

Under the Supplemental Nutrition Assistance Program (formerly known as food stamps), would-be beneficiaries have had to complete interviews before they are approved for benefits. In late March, the Department of Agriculture waived that requirement – and now gives states “blanket approval” to give out benefits to people who are entitled to them.

Early last week, the Internal Revenue Service announced that in order to qualify for payments under the Families First Coronavirus Response Act, people would have to file tax returns – even if they are Social Security recipients who typically don’t do that. The sludge would have ensured that many people never got money to which they were legally entitled. Under public pressure, the Department of Treasury reversed course – and said that Social Security recipients would receive the money automatically.

Some of the most aggressive sludge reduction efforts have come from the Department of Health and Human Services. Paperwork, reporting and auditing requirements are being eliminated. Importantly, dozens of medical services can now be provided through “telehealth.” 

In the department’s own words, the government “is allowing telehealth to fulfill many face-to-face visit requirements for clinicians to see their patients in inpatient rehabilitation facilities, hospice and home health.” 

In addition, Medicare will now pay laboratory technicians to travel to people’s homes to collect specimens for testing – thus eliminating the need for people to travel to health-care facilities for tests (and risk exposure to themselves or others). There are many other examples….(More)”.

Experts warn of privacy risk as US uses GPS to fight coronavirus spread


Alex Hern at The Guardian: “A transatlantic divide on how to use location data to fight coronavirus risks highlights the lack of safeguards for Americans’ personal data, academics and data scientists have warned.

The US Centers for Disease Control and Prevention (CDC) has turned to data provided by the mobile advertising industry to analyse population movements in the midst of the pandemic.

Owing to a lack of systematic privacy protections in the US, data collected by advertising companies is often extremely detailed: companies with access to GPS location data, such as weather apps or some e-commerce sites, have been known to sell that data on for ad targeting purposes. That data provides much more granular information on the location and movement of individuals than the mobile network data received by the UK government from carriers including O2 and BT.

While both datasets track individuals at the collection level, GPS data is accurate to within five metres, according to Yves-Alexandre de Montjoye, a data scientist at Imperial College, while mobile network data is accurate to 0.1km² in city centres and much less in less dense areas – the difference between locating an individual to their street and to a specific room in their home…

But, warns de Montjoye, such data is never truly anonymous. “The original data is pseudonymised, yet it is quite easy to reidentify someone. Knowing where someone was is enough to reidentify them 95% of the time, using mobile phone data. So there’s the privacy concern: you need to process the pseudonymised data, but the pseudonymised data can be reidentified. Most of the time, if done properly, the aggregates are aggregated, and cannot be de-anonymised.”

The data scientist points to successful attempts to use location data in tracking outbreaks of malaria in Kenya or dengue in Pakistan as proof that location data has use in these situations, but warns that trust will be hurt if data collected for modelling purposes is then “surreptitiously used to crack down on individuals not respecting quarantines or kept and used for unrelated purposes”….(More)”.

Mobile phone data and COVID-19: Missing an opportunity?


Paper by Nuria Oliver, et al: “This paper describes how mobile phone data can guide government and public health authorities in determining the best course of action to control the COVID-19 pandemic and in assessing the effectiveness of control measures such as physical distancing. It identifies key gaps and reasons why this kind of data is only scarcely used, although its value in similar epidemics has been proven in a number of use cases. It presents ways to overcome these gaps and key recommendations for urgent action, most notably the establishment of mixed expert groups at national and regional level, and the inclusion and support of governments and public authorities early on. It is authored by a group of experienced data scientists, epidemiologists, demographers and representatives of mobile network operators who jointly put their work at the service of the global effort to combat the COVID-19 pandemic….(More)”.

Data Protection under SARS-CoV-2


GDPR Hub: “The sudden outbreak of cases of COVID-19-afflictions (“Corona-Virus”), which was declared a pandemic by the WHO, affects data protection in various ways. Different data protection authorities published guidelines for employers and other parties involved in the processing of data related to the Corona-Virus (read more below).

The Corona-Virus has also given cause to the use of different technologies based on data collection and other data processing activities by the EU/EEA member states and private companies. These processing activities mostly focus on preventing and slowing the further spreading of the Corona-Virus and on monitoring the citizens’ abidance with governmental measures such as quarantine. Some of them are based on anonymous or anonymized data (like for statistics or movement patterns), but some proposals also revolved around personalized tracking.

At the moment, it is not easy to figure out which processing activities are actually supposed to be conducted and which are only rumors. This page will therefore be adapted once certain processing activities have been confirmed. For now, this article does not assess the lawfulness of particular processing activities, but rather outlines the general conditions for data processing in connection with the Corona-Virus.

It must be noted that several activities – such as monitoring whether citizens comply with quarantine and stay indoors by looking at mobile phone locations – can be done without having to use personal data under Article 4(1) GDPR, if all necessary information can be derived from anonymised data. The GDPR does not apply to activities that only rely on anonymised data….(More)”.

Why isn’t the government publishing more data about coronavirus deaths?


Article by Jeni Tennison: “Studying the past is futile in an unprecedented crisis. Science is the answer – and open-source information is paramount…Data is a necessary ingredient in day-to-day decision-making – but in this rapidly evolving situation, it’s especially vital. Everything has changed, almost overnight. Demands for food, transport, and energy have been overhauled as more people stop travelling and work from home. Jobs have been lost in some sectors, and workers are desperately needed in others. Historic experience can no longer tell us how our society or economy is working. Past models hold little predictive power in an unprecedented situation. To know what is happening right now, we need up-to-date information….

This data is also crucial for scientists, who can use it to replicate and build upon each other’s work. Yet no open data has been published alongside the evidence for the UK government’s coronavirus response. While a model that informed the US government’s response is freely available as a Google spreadsheet, the Imperial College London model that prompted the current lockdown has still not been published as open-source code. Making data open – publishing it on the web, in spreadsheets, without restrictions on access – is the best way to ensure it can be used by the people who need it most.

There is currently no open data available on UK hospitalisation rates; no regional, age or gender breakdown of daily deaths. The more granular breakdown of registered deaths provided by the Office for National Statistics is only published on a weekly basis, and with a delay. It is hard to tell whether this data does not exist or the NHS has prioritised creating dashboards for government decision makers rather than informing the rest of the country. But the UK is making progress with regard to data: potential Covid-19 cases identified through online and call-centre triage are now being published daily by NHS Digital.

Of course, not all data should be open. Singapore has been publishing detailed data about every infected person, including their age, gender, workplace, where they have visited and whether they had contact with other infected people. This can both harm the people who are documented and incentivise others to lie to authorities, undermining the quality of data.

When people are concerned about how data about them is handled, they demand transparency. To retain our trust, governments need to be open about how data is collected and used, how it’s being shared, with whom, and for what purpose. Openness about the use of personal data to help tackle the Covid-19 crisis will become more pressing as governments seek to develop contact tracing apps and immunity passports….(More)”.

Urgently Needed for Policy Guidance: An Operational Tool for Monitoring the COVID-19 Pandemic


Paper by Stephane Luchini et al: “The radical uncertainty around the current COVID-19 pandemic requires that governments around the world should be able to track in real time not only how the virus spreads but, most importantly, what policies are effective in keeping the spread of the disease in check. To improve the quality of health decision-making, we argue that it is necessary to monitor and compare acceleration/deceleration of confirmed cases over health policy responses, across countries. To do so, we provide a simple mathematical tool to estimate the convexity/concavity of trends in epidemiological surveillance data. Had it been applied at the onset of the crisis, it would have offered more opportunities to measure the impact of the policies undertaken in different Asian countries, and to allow European and North-American governments to draw quicker lessons from these Asian experiences when making policy decisions. Our tool can be especially useful as the epidemic is currently extending to lower-income African and South American countries, some of which have weaker health systems….(More)”.
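One simple way to read "acceleration/deceleration" in a series of cumulative confirmed cases is through second differences: a positive second difference means daily new cases are rising (a convex trend), a negative one means they are falling (a concave trend). The sketch below illustrates that general idea only; it is not the authors' tool, and the two case series are invented for the example.

```python
def second_differences(cumulative_cases):
    """Return day-over-day changes in daily new cases."""
    if len(cumulative_cases) < 3:
        raise ValueError("need at least three observations")
    new_cases = [b - a for a, b in zip(cumulative_cases, cumulative_cases[1:])]
    return [b - a for a, b in zip(new_cases, new_cases[1:])]


def classify_trend(cumulative_cases):
    """Label a series by the average sign of its second differences."""
    accel = second_differences(cumulative_cases)
    mean_accel = sum(accel) / len(accel)
    if mean_accel > 0:
        return "accelerating (convex)"
    if mean_accel < 0:
        return "decelerating (concave)"
    return "steady (linear)"


# Hypothetical cumulative case counts, not real surveillance data.
early_outbreak = [1, 2, 4, 8, 16, 32]
after_measures = [10, 30, 45, 55, 60, 62]

print(classify_trend(early_outbreak))
print(classify_trend(after_measures))
```

Averaging the second differences is a crude smoothing choice made for brevity; real surveillance data are noisy and affected by reporting delays, so a practical tool would need something more robust.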

Privacy Protection Key for Using Patient Data to Develop AI Tools


Article by Jessica Kent: “Clinical data should be treated as a public good when used for research or artificial intelligence algorithm development, so long as patients’ privacy is protected, according to a report from the Radiological Society of North America (RSNA).

As artificial intelligence and machine learning are increasingly applied to medical imaging, bringing the potential for streamlined analysis and faster diagnoses, the industry still lacks a broad consensus on an ethical framework for sharing this data.

“Now that we have electronic access to clinical data and the data processing tools, we can dramatically accelerate our ability to gain understanding and develop new applications that can benefit patients and populations,” said study lead author David B. Larson, MD, MBA, from the Stanford University School of Medicine. “But unsettled questions regarding the ethical use of the data often preclude the sharing of that information.”

To offer solutions around data sharing for AI development, RSNA developed a framework that highlights how to ethically use patient data for secondary purposes.

“Medical data, which are simply recorded observations, are acquired for the purposes of providing patient care,” Larson said….(More)”

Coronavirus Innovation Map


The Coronavirus Innovation Map is a platform of hundreds of innovations and solutions from around the world that help people cope and adapt to life amid the coronavirus pandemic, and to connect innovators.

The Coronavirus Innovation Map is a visualized global database that is mapping the innovations related to tackling coronavirus in various fields such as diagnostics, treatment, lifestyle changes, etc., on a geographical scale….

Our goal with the Coronavirus Innovation Map is to build a crowdsourced resource that maps hundreds of innovations and solutions globally that help people cope and adapt to life amid the coronavirus, and to connect innovators.

This platform is a database for innovators to know who the other players are and where the projects or startups are located, allowing them to connect and create solutions in this field. Policymakers will also be able to efficiently look for viable solutions in one place.

You may use the map to browse initiatives in specific locations (type a city or country in the search box), or choose a category wherein you would like to find a solution….(More)”