Mind the app – considerations on the ethical risks of COVID-19 apps


Blog by Luciano Floridi: “There is a lot of talk about apps to deal with the pandemic. Some of the best solutions use the Bluetooth connection of mobile phones to detect contact between people and therefore estimate the probability of contagion.

In theory, it’s simple. In practice, it is a minefield of ethical problems, not only technical ones. To understand them, it is useful to distinguish between the validation and the verification of a system. 
The validation of a system answers the question: “are we building the right system?”. The answer is no if the app

  • is illegal;
  • is unnecessary, for example, there are better solutions; 
  • is a disproportionate solution to the problem, for example, there are only a few cases in the country; 
  • goes beyond the purpose for which it was designed, for example, it is used to discriminate against people; 
  • continues to be used even after the end of the emergency.

Assuming the app passes the validation stage, then it needs to be verified.
The verification of a system answers the question: “are we building the system in the right way?”. Here too the difficulties are considerable. I have become increasingly aware of them as I collaborate, as an advisor on ethical implications, with two national projects developing a coronavirus app.
For once, the difficult problem is not privacy. Of course, it is trivially true that there are, and there might always be, privacy issues. The point is that, in this case, they can be made much less pressing than other issues. However, once (or if you prefer, even if) privacy is taken care of, other difficulties appear to remain intractable. A Bluetooth-based app can use anonymous data, recorded only on the mobile phone and used exclusively to send alerts in case of contact with infected people. It is not easy, but it is feasible, as demonstrated by the approach adopted by the Pan-European Privacy-Preserving Proximity Tracing initiative (PEPP-PT). The apparently intractable problems are the effectiveness and fairness of the app.
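To make the mechanism concrete, here is a minimal sketch of a decentralised, Bluetooth-style proximity-tracing flow in the spirit Floridi describes: random identifiers, contacts stored only on the phone, and local matching against identifiers published by infected users. It is an illustration under those assumptions, not PEPP-PT's actual specification.

```python
import secrets


class ProximityTracer:
    """Illustrative decentralised contact tracing: contact data stays on the device."""

    def __init__(self):
        self.my_ephemeral_ids = []   # random IDs this phone has broadcast
        self.observed_ids = set()    # IDs overheard from nearby phones via Bluetooth

    def new_broadcast_id(self) -> bytes:
        """Generate a fresh random identifier, unlinkable to the user."""
        eid = secrets.token_bytes(16)
        self.my_ephemeral_ids.append(eid)
        return eid

    def record_contact(self, overheard_id: bytes) -> None:
        """Store an identifier overheard nearby; it never leaves the phone."""
        self.observed_ids.add(overheard_id)

    def check_exposure(self, published_infected_ids: set) -> bool:
        """Match local contacts against IDs voluntarily published by users who
        tested positive, and alert if there is any overlap."""
        return bool(self.observed_ids & published_infected_ids)


# Example: phone B overhears phone A; A later tests positive and publishes its IDs.
alice, bob = ProximityTracer(), ProximityTracer()
bob.record_contact(alice.new_broadcast_id())
infected_ids = set(alice.my_ephemeral_ids)   # uploaded only with Alice's consent
print(bob.check_exposure(infected_ids))      # True -> Bob receives an alert
```

Because matching happens on the device against a list of anonymous identifiers, no central register of who met whom is ever created, which is why the privacy issues can be made less pressing than the effectiveness and fairness ones discussed next.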

To be effective, an app must be adopted by many people. In Britain, I was told that it would be useless if used by less than 20% of the population. According to the PEPP-PT, real effectiveness seems to be reached around the threshold of 60% of the whole population. This means that in Italy, for example, the app should be consistently and correctly used by somewhere between 11m and 33m people, out of a population of 55m. Consider that in 2019 Facebook Messenger was used by 23m Italians. Even the often-mentioned app TraceTogether has been downloaded by an insufficient number of people in Singapore.
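The arithmetic behind those figures is straightforward to check; a quick sketch using the population figure cited in the post:

```python
population = 55_000_000  # Italian population figure cited in the post

thresholds = {"minimum useful adoption (UK estimate)": 0.20,
              "real effectiveness (PEPP-PT estimate)": 0.60}

for label, rate in thresholds.items():
    print(f"{label}: {int(population * rate):,} consistent, correct users")
# minimum useful adoption (UK estimate): 11,000,000 consistent, correct users
# real effectiveness (PEPP-PT estimate): 33,000,000 consistent, correct users
```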


Given that the app is unlikely to be adopted so extensively on a purely voluntary basis, out of social responsibility, and that governments are reluctant to make it mandatory (and rightly so, for it would be unfair, see below), it is clear that its use will need to be encouraged, but this only shifts the problem…

Therefore, one should avoid the risk of transforming the production of the app into a signalling exercise. To do so, verification should not be severed from validation but must feed back into it. This means that if verification fails, so should validation, and the whole project ought to be reconsidered. It follows that a clear deadline by when (and by whom) the whole project (validation + verification) may be assessed, and if necessary terminated, improved, or simply renewed as it is, is essential. At least this level of transparency and accountability should be in place.

An app will not save us. And the wrong app will be worse than useless, as it will cause ethical problems and potentially exacerbate health-related risks, e.g. by generating a false sense of security, or deepening the digital divide. A good app must be part of a wider strategy, and it needs to be designed to support a fair future. If this is not possible, better do something else, avoid its positive, negative and opportunity costs, and not play the political game of merely signalling that something (indeed anything) has been tried…(More)”.

Mapping how data can help address COVID-19


Blog by Andrew J. Zahuranec and Stefaan G. Verhulst: “The novel coronavirus disease (COVID-19) is a global health crisis the likes of which the modern world has never seen. Amid calls to action from the United Nations Secretary-General, the World Health Organization, and many national governments, there has been a proliferation of initiatives using data to address some facet of the pandemic. In March, The GovLab at NYU put out its own call to action, which identifies key steps organizations and decision-makers can take to build the data infrastructure needed to tackle pandemics. This call has been signed by over 400 data leaders from around the world in the public and private sector and in civil society.

But questions remain as to how many of these initiatives are useful for decision-makers. While The GovLab’s living repository contains over 160 data collaboratives, data competitions, and other innovative work, many of these examples take a data supply-side approach to the COVID-19 response. Given the urgency of the situation, some organizations create projects that align with the available data instead of trying to understand what insights those responding to the crisis actually want, including issues that may not be directly related to public health.

We need to identify and ask better questions to use data effectively in the current crisis. Part of that work means understanding what topics can be addressed through enhanced data access and analysis.

Using The GovLab’s rapid-research methodology, we’ve compiled a list of 12 topic areas related to COVID-19 where data and analysis are needed. …(More)”.

How can digital tools support deliberation?


Claudia Chwalisz at the OECD: “As part of our work on Innovative Citizen Participation, we’ve launched a series of articles to open a discussion and gather evidence on the use of digital tools and practices in representative deliberative processes. …The current context is obliging policy makers and practitioners to think outside the box and adapt to the fact that physical deliberation is not possible. How can digital tools allow planned or ongoing processes like Citizens’ Assemblies to continue, ensuring that policy makers can still garner informed citizen recommendations to inform their decision making? New experiments are getting underway, and the evidence gathered could also be applied to other situations where face-to-face deliberation is not possible or is more difficult, such as international processes or any other situation that prevents physical gathering.

This series will cover the core phases that a representative deliberative process should follow, as established in the forthcoming OECD report: learning, deliberation, decision making, and collective recommendations. Due to the different nature of conducting a process online, we will additionally consider a phase required before learning: skills training. The articles will explore the use of digital tools at each phase, covering questions about the appropriate tools, methods, evidence, and limitations.

They will also consider how the use of certain digital tools could enhance good practice principles such as impact, transparency, and evaluation:

  • Impact: Digital tools can help participants and the public to better monitor the status of the proposed recommendations and the impact they had on final decision-making. A parallel can be drawn with the extensive use of this methodology by the United Nations for the monitoring and evaluation of the impact of the Sustainable Development Goals (SDGs).
  • Transparency: Digital tools can facilitate transparency across the process. The use of collaborative tools allows for transparency regarding who wrote the final outcome of the process (the ability to trace the contributors to the document and its different versions). Publishing the code and the algorithms applied for the random selection (sortition) process, together with the data or statistics used for the stratification, could give full transparency on how participants are selected (a minimal illustration follows this list).
  • Evaluation: Data collection and analysis can help researchers and policy makers assess the process (e.g., deliberation quality, participant surveys, opinion evolution). Publishing this data in a structured and open format can allow for a broader evaluation and contribute to research. Over the course of the next year, the OECD will be preparing evaluation guidelines in accordance with the good practice principles to enable comparative data analysis.
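As an illustration of what publishing sortition code could look like, here is a minimal, hypothetical stratified random selection sketch; the candidate pool, strata, quotas, and seed are invented for the example and do not come from any real process.

```python
import random

random.seed(2020)  # publishing the seed makes the draw reproducible by anyone

# Hypothetical candidate pool, tagged with the stratification variables the
# process cares about (here: region and age group, both illustrative).
candidates = [
    {"id": i,
     "region": random.choice(["north", "centre", "south"]),
     "age_group": random.choice(["18-34", "35-54", "55+"])}
    for i in range(10_000)
]

# Hypothetical quotas, derived from published population statistics so that
# anyone can re-run the selection and verify how the assembly was composed.
quotas = {
    ("north", "18-34"): 4, ("north", "35-54"): 5, ("north", "55+"): 5,
    ("centre", "18-34"): 3, ("centre", "35-54"): 4, ("centre", "55+"): 4,
    ("south", "18-34"): 4, ("south", "35-54"): 5, ("south", "55+"): 6,
}

selected = []
for (region, age_group), n in quotas.items():
    pool = [c for c in candidates
            if c["region"] == region and c["age_group"] == age_group]
    selected.extend(random.sample(pool, n))

print(len(selected), "participants drawn to match the published quotas")
```

Publishing something this small alongside the quota tables and the seed would let any observer reproduce the draw exactly.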

The series will also consider how the use of emerging technologies and digital tools could complement face-to-face processes, for instance:

  • Artificial intelligence (AI) and text-based technologies (i.e. natural language processing, NLP): Could the use of AI-based tools enrich deliberative processes? For example: mapping opinion clusters (a minimal sketch follows this list), consensus building, and analysis of massive inputs from external participants in the early stage of stakeholder input. Could NLP allow for simultaneous translation into other languages, sentiment analysis, and automated transcription? These possibilities already exist, but they raise pertinent questions around reliability and user experience. How could they be connected to human analysis, discussion, and decision making?
  • Virtual/Augmented reality: Could the development of these emerging technologies allow participants to be immersed in virtual environments and thereby simulate face-to-face deliberation or experiences that enable and build empathy with possible futures or others’ lived experiences?…(More)”.
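A minimal sketch of the opinion-cluster mapping mentioned above, using off-the-shelf text clustering (TF-IDF vectors grouped with k-means); the comments, cluster count, and library choice are illustrative assumptions rather than a recommendation of any specific tool.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical free-text contributions from external participants.
comments = [
    "Invest in cycling lanes and public transport",
    "Lower public transport fares for commuters",
    "Expand the tram network to the suburbs",
    "Protect local parks and plant more trees",
    "More green spaces in the city centre",
    "Reduce car traffic to cut air pollution",
]

# Represent each comment as a TF-IDF vector and group similar ones together.
vectors = TfidfVectorizer(stop_words="english").fit_transform(comments)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)

for cluster in sorted(set(labels)):
    print(f"Opinion cluster {cluster}:")
    for comment, label in zip(comments, labels):
        if label == cluster:
            print("  -", comment)
```

A clustering like this can only surface candidate groupings for facilitators; it does not replace the human analysis, discussion, and decision making the OECD piece asks about.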

Epistemic Humility—Knowing Your Limits in a Pandemic


Essay by Erik Angner: “Ignorance,” wrote Charles Darwin in 1871, “more frequently begets confidence than does knowledge.”

Darwin’s insight is worth keeping in mind when dealing with the current coronavirus crisis. That includes those of us who are behavioral scientists. Overconfidence—and a lack of epistemic humility more broadly—can cause real harm.

In the middle of a pandemic, knowledge is in short supply. We don’t know how many people are infected, or how many people will be. We have much to learn about how to treat the people who are sick—and how to help prevent infection in those who aren’t. There’s reasonable disagreement on the best policies to pursue, whether about health care, economics, or supply distribution. Although scientists worldwide are working hard and in concert to address these questions, final answers are some ways away.

Another thing that’s in short supply is the realization of how little we know. Even a quick glance at social or traditional media will reveal many people who express themselves with way more confidence than they should…

Frequent expressions of supreme confidence might seem odd in light of our obvious and inevitable ignorance about a new threat. The thing about overconfidence, though, is that it afflicts most of us much of the time. That’s according to cognitive psychologists, who’ve studied the phenomenon systematically for half a century. Overconfidence has been called “the mother of all psychological biases.” The research has led to findings that are at the same time hilarious and depressing. In one classic study, for example, 93 percent of U.S. drivers claimed to be more skillful than the median—which is not possible.

“But surely,” you might object, “overconfidence is only for amateurs—experts would not behave like this.” Sadly, being an expert in some domain does not protect against overconfidence. Some research suggests that the more knowledgeable are more prone to overconfidence. In a famous study of clinical psychologists and psychology students, researchers asked a series of questions about a real person described in psychological literature. As the participants received more and more information about the case, their confidence in their judgment grew—but the quality of their judgment did not. And psychologists with a Ph.D. did no better than the students….(More)”.

A widening data divide: COVID-19 and the Global South


Essay by Stefania Milan and Emiliano Treré at Data & Policy: “If numbers are the conditions of existence of the COVID-19 problem, we ought to pay attention to the actual (in)ability of many countries in the South to test their population for the virus, and to produce reliable population statistics more generally, let alone to adequately care for them. It is a matter of a “data gap” as well as of data quality, which even in “normal” times hinders “evidence-based policy making, tracking progress and development, and increasing government accountability” (Chen et al., 2013). And while the World Health Organization issues warnings about the “dramatic situation” concerning the spread of COVID-19 on the African continent, to name just one of the blind spots in our datasets of the global pandemic, the World Economic Forum calls for “flattening the curve” in developing countries. Progress has been made following the revision of the United Nations’ Millennium Development Goals in 2005, with countries in the Global South having been invited (and supported) to devise National Strategies for the Development of Statistics. Yet a cursory look at the NYU GovLab’s valuable repository of “data collaboratives” addressing the COVID-19 pandemic reveals the virtual absence of data collection and monitoring projects in the Global South. The next obvious step is the dangerous equation “no data = no problem”.

Disease and “whiteness”

Epidemiology and pharmacogenetics (i.e., the study of the genetic basis of how people respond to pharmaceuticals), to name but two of the life sciences concerned, are largely based on the “inclusion of white/Caucasians in studies and the exclusion of other ethnic groups” (Tutton, 2007). In other words, models of disease evolution and the related solutions are based on datasets that take into account primarily, and in fact almost exclusively, the Caucasian population. This is a known problem in the field, which derives from the “assumption that a Black person could be thought of as being White”, dismissing specificities and differences. It has been linked to the “lack of social theory development, due mainly to the reluctance of epidemiologists to think about social mechanisms (e.g., racial exploitation)” (Muntaner, 1999, p. 121). While COVID-19 represents a slight variation on this trend, having been first identified in China, the problem remains at the large scale. And in times of a health emergency as global as this one, it risks being reinforced and perpetuated.

A lucrative market for the industry

In the absence of national testing capacity, the developing world might fall prey to the booming industry of genetic and disease testing, on the one hand, and of telecom-enabled population monitoring on the other. Private companies might be able to fill the gap left by the state, mapping populations at risk while monetizing their data. The case of 23andMe is symptomatic of this rise of industry-led testing, which constitutes a double-edged sword. On the one hand, private actors might supply key services that resource-poor or failing states are unable to provide. On the other hand, the distorted and often hidden agendas of profit-led players reveal their shortcomings and dangers. If we look at the telecom industry, we note how it has contributed to tracking disease propagation in a number of health emergencies such as Ebola. And while the global open data community has called for smoother data exchange between the private and the public sectors to collectively address the spread of the virus, in the absence of adequate regulatory frameworks in the Global South, for example in the field of privacy and data retention, local authorities might fall prey to outside interventions of a dubious nature….(More)”.

Data & Policy


Data & Policy, an open-access journal exploring the potential of data science for governance and public decision-making, published its first cluster of peer-reviewed articles last week.

The articles include three contributions specifically concerned with data protection by design:

·       Gefion Thuermer and colleagues (University of Southampton) distinguish between data trusts and other data sharing mechanisms and discuss the need for workflows with data protection at their core;

·       Swee Leng Harris (King’s College London) explores Data Protection Impact Assessments as a framework for helping us know whether government use of data is legal, transparent and upholds human rights;

·       Giorgia Bincoletto’s (University of Bologna) study investigates data protection concerns arising from cross-border interoperability of Electronic Health Record systems in the European Union.

Also published: research by Jacqueline Lam and colleagues (University of Cambridge; Hong Kong University) on how fine-grained data from satellites and other sources can help us understand environmental inequality and socio-economic disparities in China, which also reflects on the importance of safeguarding data privacy and security. See also this week’s blogs on the potential of Data Collaboratives for COVID-19 by Editor-in-Chief Stefaan Verhulst (The GovLab) and on how COVID-19 exposes a widening data divide for the Global South, by Stefania Milan (University of Amsterdam) and Emiliano Treré (Cardiff University).

Data & Policy is an open access, peer-reviewed venue for contributions that consider how systems of policy and data relate to one another. Read the 5 ways you can contribute to Data & Policy and contact [email protected] with any questions….(More)”.

Citizen input matters in the fight against COVID-19


Britt Lake at FeedbackLabs: “When the Ebola crisis hit West Africa in 2015, one of the first responses was to build large field hospitals to treat the rapidly growing number of Ebola patients. As Paul Richards explains, “These were seen as the safest option. But they were shunned by families, because so few patients came out alive.” Aid workers vocally opposed local customs like burial rituals that contributed to the spread of the virus, which caused tension with communities. Ebola-affected communities insisted that some of their methods had proven effective in lowering case numbers before outside help arrived. When government and aid agencies came in and delivered their own messages, locals felt that their expertise had been ignored. Distrust spread, as did a sense that the response pitted local knowledge against global experts. And the virus continued to spread. 

The same is true now. Today there are more than 1 million confirmed cases of COVID-19 worldwide. The virus has spread to every country and territory in the world, leaving virtually no one unaffected. The pandemic is exacerbating inequities in employment, education, access to healthcare and food, and workers’ rights even as it raises new challenges. Everyone is looking for answers to address their needs and anxieties while also collectively realizing that this pandemic and our responses to it will irrevocably shape the future.

It would be easy for us in the public sector to turn inwards for solutions on how to respond effectively to the pandemic and its aftermath. It’s comfortable to focus on perspectives from our own teams when we feel a heightened sense of urgency, and decisions must be made on a dime. However, it would be a mistake not to consider input from the communities we serve – alongside expert knowledge – when determining how we support them through this crisis. 

COVID-19 affects everyone on earth, and it won’t be possible to craft equitable responses that meet people’s needs around the globe unless we listen to what would work best to address those challenges and support homegrown solutions that are already working. Effective communication of public health information, for instance, is central to controlling the spread of COVID-19. By listening to communities, we can better understand what communication methods work for them and can do a better job getting those messages across in a way that resonates with diverse communities. And to face the looming economic crisis that COVID-19 is precipitating, we will need to engage in real dialogue with people about their priorities and the way they want to see society rebuilt….(More)”.

Synthetic data offers advanced privacy for the Census Bureau, business


Kate Kaye at IAPP: “In the early 2000s, internet accessibility made risks of exposing individuals from population demographic data more likely than ever. So, the U.S. Census Bureau turned to an emerging privacy approach: synthetic data.

Some argue the algorithmic techniques used to develop privacy-secure synthetic datasets go beyond traditional deidentification methods. Today, along with the Census Bureau, clinical researchers, autonomous vehicle system developers and banks use these fake datasets that mimic statistically valid data.

In many cases, synthetic data is built from existing data by filtering it through machine learning models. Real data representing real individuals flows in, and fake data mimicking individuals with corresponding characteristics flows out.
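A minimal sketch of that in-and-out flow, using a deliberately simple statistical model (a multivariate normal fitted to hypothetical records) in place of the machine learning models real systems use; nothing here reflects the Census Bureau's actual pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "real" microdata: age and income for 1,000 individuals.
real_age = rng.normal(45, 12, 1_000).clip(18, 90)
real_income = (30_000 + 400 * real_age + rng.normal(0, 8_000, 1_000)).clip(0, None)
real = np.column_stack([real_age, real_income])

# Fit a simple model of the joint distribution (here just a mean and covariance).
mean = real.mean(axis=0)
cov = np.cov(real, rowvar=False)

# Sample synthetic individuals: statistically similar in aggregate, but no row
# corresponds to any real person.
synthetic = rng.multivariate_normal(mean, cov, size=1_000)

print("real means:     ", real.mean(axis=0).round(0))
print("synthetic means:", synthetic.mean(axis=0).round(0))
```

Real deployments replace the toy model with far richer generative models and add formal disclosure-avoidance checks, but the flow is the same: real records in, a fitted model in the middle, fake yet statistically useful records out.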

When data scientists at the Census Bureau began exploring synthetic data methods, adoption of the internet had made deidentified, open-source data on U.S. residents, their households and businesses more accessible than in the past.

Especially concerning, census-block-level information was now widely available. Because in rural areas, a census block could represent data associated with as few as one house, simply stripping names, addresses and phone numbers from that information might not be enough to prevent exposure of individuals.
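To see why stripping direct identifiers can fall short, consider a small, entirely hypothetical illustration: any block that contains a single household leaks that household's attributes to anyone who knows who lives there.

```python
from collections import Counter

# Hypothetical "deidentified" records: names, addresses, and phone numbers
# removed, but the census block code retained.
records = [
    {"block": "360470101001", "household_income": 41_000, "children": 2},
    {"block": "360470101001", "household_income": 63_000, "children": 0},
    {"block": "460710940002", "household_income": 29_000, "children": 3},  # rural block with one house
]

households_per_block = Counter(r["block"] for r in records)
for r in records:
    if households_per_block[r["block"]] == 1:
        print(f"Block {r['block']} has a single household; its attributes "
              f"{r} are effectively re-identifiable.")
```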

“There was pretty widespread angst” among statisticians, said John Abowd, the bureau’s associate director for research and methodology and chief scientist. The hand-wringing led to a “gradual awakening” that prompted the agency to begin developing synthetic data methods, he said.

Synthetic data built from the real data preserves privacy while providing information that is still relevant for research purposes, Abowd said: “The basic idea is to try to get a model that accurately produces an image of the confidential data.”

The plan for the 2020 census is to produce a synthetic image of that original data. The bureau also produces On the Map, a web-based mapping and reporting application that provides synthetic data showing where workers are employed and where they live along with reports on age, earnings, industry distributions, race, ethnicity, educational attainment and sex.

Of course, the real census data is still locked away, too, Abowd said: “We have a copy and the national archives have a copy of the confidential microdata.”…(More)”.

The potential of Data Collaboratives for COVID19


Blog post by Stefaan Verhulst: “We live in almost unimaginable times. The spread of COVID-19 is a human tragedy and global crisis that will impact our communities for many years to come. The social and economic costs are huge and mounting, and they are already contributing to a global slowdown. Every day, the emerging pandemic reveals new vulnerabilities in various aspects of our economic, political and social lives. These include our vastly overstretched public health services, our dysfunctional political climate, and our fragile global supply chains and financial markets.

The unfolding crisis is also making shortcomings clear in another area: the way we re-use data responsibly. Although this aspect of the crisis has been less remarked upon than other, more obvious failures, those who work with data—and who have seen its potential to impact the public good—understand that we have failed to create the necessary governance and institutional structures that would allow us to harness data responsibly to halt or at least limit this pandemic. A recent article in Stat, an online journal dedicated to health news, characterized the COVID-19 outbreak as “a once-in-a-century evidence fiasco.” The article continues: 

“At a time when everyone needs better information, […] we lack reliable evidence on how many people have been infected with SARS-CoV-2 or who continue to become infected. Better information is needed to guide decisions and actions of monumental significance and to monitor their impact.” 

It doesn’t have to be this way, and these data challenges are not an excuse for inaction. As we explain in what follows, there is ample evidence that the re-use of data can help mitigate health pandemics. A robust (if somewhat unsystematized) body of knowledge could direct policymakers and others in their efforts. In the second part of this article, we outline eight steps that key stakeholders can and should take to better re-use data in the fight against COVID-19. In particular, we argue that more responsible data stewardship and increased use of data collaboratives are critical….(More)”. 

Unpredictable Residency during the COVID-19 Pandemic Spells Trouble for the 2020 Census Count


Blog by Diana Elliott and Robert Santos: “Social distancing measures to curtail the community spread of COVID-19 have upended daily life. Just before lockdowns were implemented across the country, there was tremendous movement and migration of people relocating to different residences to shelter in place. This makes sense for the people involved but could be disastrous for the communities they fled and the final 2020 Census counts.

Pandemic-based migration undermines an accurate count

The 2020 Census, like most data collected by the US Census Bureau, is residence based. In the years leading up to 2020, the US Census Bureau worked diligently on the quality of the Master Address File, or the catalog of all residential addresses in the country. Staff account for newly built housing developments and buildings, apartment units or accessory dwelling units that are used as permanent residences, and the demolition of homes and apartments in the past decade. Census materials are sent to an address, rather than a person.

Most residences across America have already received their 2020 Census invitation. Whether completed online, by paper, by phone, or in person, the first official question on the 2020 Census questionnaire is “How many people were living or staying in this house, apartment, or mobile home on April 1, 2020?” Households are expected to answer this based on the concept of “usual residence,” or the place where a person lives and sleeps most of the time.

Despite written guidance provided on the 2020 Census on how to answer this question, doing so may be fraught with complexities and nuance arising from the pandemic.

First, research reveals that respondents do not often read questionnaire instructions; they dive in and start answering. With many people scrambling to other counties, cities, and states to hunker down for the long haul with loved ones, this will lead to incorrect counts when people are counted at temporary addresses.

Second, for many, the concept of “usual residence” has little relevance in the uncertainty unfolding during the COVID-19 pandemic. What if your temporary address becomes your permanent address? What does “usual residence” mean during a global epidemic that could stretch for 18 months or more? And perhaps more importantly, what should it mean?

Finally, there is the added complication of census operational delays (PDF). Self-response to the 2020 Census has been extended into August, as have the nonresponse follow-up efforts, when enumerators knock on the doors of those who haven’t yet answered the census. Additional delays seem unavoidable. The longer the delay, the more time there is for people who have not yet completed a census form to realize their temporary plan has evolved into a state of permanence….(More)”.