Data for Policy: Junk-Food Diet or Technological Frontier?


Blog by Ed Humpherson at Data & Policy: “At the Office for Statistics Regulation, thinking about these questions is our day job. We set the standards for Government statistics and data through our Code of Practice for Statistics. And we review how Government departments are living up to these standards when they publish data and statistics. We routinely look at how Government statistics are used in public debate.

Based on this, I would propose four factors that ensure that new data sources and tools serve the public good. They do so when:

1. When data quality is properly tested and understood:

As my colleague Penny Babb wrote recently in a blog: “Don’t trust the data. If you’ve found something interesting, something has probably gone wrong!” People who work routinely with data develop a sort of innate scepticism, which Penny’s blog captures neatly. Understanding the limitations of both the data and the inferences you make about the data is the starting point for any appropriate role for data and policy. Accepting results and insights from new data at face value is a mistake. Much better to test the quality, explore the risks of mistakes, and only then to share findings and conclusions.

2. When the risks of misleadingness are considered:

At OSR, we have an approach to misleadingness that focuses on whether a misuse of data might lead a listener to a wrong conclusion. In fact, by “wrong” we don’t mean in some absolute sense of objective truth; more that if they received the data presented in a different and more faithful way, they would change their mind. Here’s a really simple example: someone might hear that, of two neighbouring countries, one has a much lower fatality rate, when comparing deaths to positive tests for Covid-19. …

3. When the data fill gaps

Data gaps come in several forms. One gap, highlighted by the interest in real-time economic indicators, is timing. Economic statistics don’t really tell us what’s going on right now. Figures like GDP, trade and inflation tell us about some point in the (admittedly quite) recent past. This is the attraction of the real-time economic indicators, which the Bank of England have drawn on in their decisions during the pandemic. They give policymakers a much more real-time feel by filling in this timing gap.

Other gaps are not about time but about coverage….

4. When the data are available

Perhaps the most important thing for data and policy is to democratise the notion of who the data are for. Data (and policy itself) are not just for decision-making elites. They are a tool to help people make sense of their world, what is going on in their community, helping frame and guide the choices they make.

For this reason, I often instinctively recoil at narratives of data that focus on the usefulness of data to decision-makers. Of course, we are all decision-makers of one kind or another, and data can help us all. But I always suspect that the “data for decision-makers” narrative harbours an assumption that decisions are made by senior, central, expert people, who make decisions on behalf of society; people who are, in the words of the musical Hamilton, in the room where it happens. It’s this implication that I find uncomfortable.

That’s why, during the pandemic, our work at the Office for Statistics Regulation has repeatedly argued that data should be made available. We have published a statement that any management information referred to by a decision maker should be published clearly and openly. We call this equality of access.

We fight for equality of access. We have secured the publication of lots of data — on positive Covid-19 cases in England’s Local Authorities, on Covid-19 in prisons, on antibody testing in Scotland…. and several others.

Data and policy are a powerful mix. They offer huge benefits to society in terms of defining, understanding and solving problems, and thereby in improving lives. We should be pleased that the coming together of data and policy is being sped up by the pandemic.

But to secure these benefits, we need to focus on four things: quality, misleadingness, gaps, and public availability….(More)”

Digital in the Time of the Coronavirus: Data Science and Technology as a Force for Inclusion


Blog by Aleem Walji: “Crises do not create inequity and fault lines in society, they expose them. The systems and structures that give rise to inequality and inequity are deep-rooted and powerful. In recent months, we have seen the coronavirus bring into high relief many social and economic vulnerabilities across the world. It is now clear that Hispanics and Blacks are even more vulnerable to Covid-19 because of underlying health conditions, more frequent exposure to the virus, and broken social safety nets. This trend will only accelerate as the virus gains a foothold in Africa, parts of Asia, and Latin America.

The impact of the virus in places where health systems are weak, poverty is high, and large numbers of people are immunocompromised could be devastating. How do we mitigate the medium-term and second-order effects of a pandemic that will shrink economic growth and exacerbate inequality? This year alone, more than 500 million people are expected to fall into poverty, mostly in Africa and Asia. To defeat a virus that does not respect geographic boundaries, it is urgent for public and private actors, philanthropies, and global development institutions to use every tool available to alleviate a global humanitarian emergency and attendant economic collapse.

Technology, data science, and digital readiness are crucial elements for an effective emergency response and foundational to sustaining a long-term recovery. Already, scientists and researchers across the world are leveraging data and digital platforms to accelerate the development of a vaccine, fast-track clinical trials, and carry out contact tracing using mobile-enabled tools. Sensors are collecting huge amounts of data, and machine learning algorithms are helping policymakers decide when to relax physical distancing and where to open the economy and for how long.

Access to reliable information for decisionmaking, however, is not evenly spread. High frequency, granular, and anonymized datasets are essential for public-health officials and community health workers to target interventions and reach vulnerable populations faster and at a lower cost. Equipped with reliable data, civic technologists can leverage tools like artificial intelligence and machine learning to flatten the curve of Covid-19 and also the curve of inequity and unequal access to services and support.

This will not happen on its own. Preventing a much deeper digital divide will require forward-leaning policymakers, far-sighted investors and grant makers, civic-minded tech innovators and businesses, and a robust, digitally savvy civil society to work collaboratively for social and economic inclusion. It will require political will and improved data governance to deploy digital platforms to serve populations furthest behind. It is in our collective interest to ensure the health and well-being of every segment of society. Digital inclusion is part of the solution.

There are certain pathways public, private and social actors can follow to leverage data science, digital tools, and platforms today….(More)”.

Rethinking citizen engagement for an inclusive energy transition


Urban Futures Studio: “In July 2020, we published our new essay ‘What, How and Who? Designing inclusive interactions in the energy transition’ (Bronsvoort, Hoffman and Hajer, 2020). In this essay, we argue that how the interactions between citizens and governments are shaped and enacted has a large influence on who gets involved and to what extent people feel heard. To apply this approach to cases, we distinguish between three dimensions of interaction:

  • What (the defined object or issue at hand)
  • How (the setting and staging of the interaction)
  • Who (the target groups and protagonists of the process)

Focusing on the issue of form, we argue that processes for interaction between citizens and governments should be designed in a way that is more future oriented, organized over the long term, in closer proximity to citizens and with attention to the powerful role of ‘in-betweeners’ and ‘in-between’ places such as community houses, where people can meet to deliberate on the wide range of possible futures for their neighbourhood. 

Towards a multiplicity of future visions for sustainable cities
The energy transition has major consequences for the way we live, work, move and consume. For such complex transitions, governments need to engage and collaborate with citizens and other stakeholders. Their engagement enriches existing visions of future neighbourhoods, informs local policies and stimulates change. But how do you shape and organize such a participatory process? While governments use a wide range of public participation methods, many researchers have emphasized the limitations of these conventional methods with regard to the inclusion of diverse groups of citizens and in bridging discrepancies between government approaches and people’s lived experiences.

Rethinking citizen engagement for an inclusive energy transition
To help rethink citizen engagement, the Urban Futures Studio investigates existing and new approaches to citizen engagement and how they are practised by governments and societal actors. Following our essay research, our next project on citizen engagement includes a study on its relation to experimentation as a novel mode of governance. The goal of this research is to show insights into how citizen engagement manifests itself in the context of experimental governance on the neighbourhood level. By investigating the interactions between citizens, governments and other stakeholders in different types of participatory projects, we aim to gain a better understanding of how citizens are engaged and included in energy transition experiments and how we can improve its level of inclusion.

We use a relational approach to citizen engagement, by which we view participatory processes as collective practices that both shape and are shaped by their ‘matter of concern’, their public and their setting and staging. This view places emphasis on the form and conditions under which the interaction takes place. For example, the initiative of Places of Hope showed that engagement can be organised in diverse ways and can create new collectives….(More)”.

The Coronavirus and Innovation


Essay by Scott E. Page: “The total impact of the coronavirus pandemic (the loss of life and the economic, social, and psychological costs arising from both the pandemic itself and the policies implemented to prevent its spread) defies any characterization. Though the pandemic continues to unsettle, disrupt, and challenge communities, we might take a moment to appreciate and applaud the diversity, breadth, and scope of our responses, from individual actions to national policies, and even more important, to reflect on how they will produce a post–Covid-19 world far better than the world that preceded it.

In this brief essay, I describe how our adaptive responses to the coronavirus will lead to beneficial policy innovations. I do so from the perspective of a many-model thinker. By that I mean that I will use several formal models to theoretically elucidate the potential pathways to creating a better world. I offer this with the intent that it instills optimism that our current efforts to confront this tragic and difficult challenge will do more than combat the virus now and teach us how to combat future viruses. They will, in the long run, result in an enormous number of innovations in policy, business practices, and our daily lives….(More)”.

Differential Privacy for Privacy-Preserving Data Analysis


Introduction to a Special Blog Series by NIST: “…How can we use data to learn about a population, without learning about specific individuals within the population? Consider these two questions:

  1. “How many people live in Vermont?”
  2. “How many people named Joe Near live in Vermont?”

The first reveals a property of the whole population, while the second reveals information about one person. We need to be able to learn about trends in the population while preventing the ability to learn anything new about a particular individual. This is the goal of many statistical analyses of data, such as the statistics published by the U.S. Census Bureau, and machine learning more broadly. In each of these settings, models are intended to reveal trends in populations, not reflect information about any single individual.

But how can we answer the first question, “How many people live in Vermont?” (which we’ll refer to as a query), while preventing the second question, “How many people named Joe Near live in Vermont?”, from being answered? The most widely used solution is called de-identification (or anonymization), which removes identifying information from the dataset. (We’ll generally assume a dataset contains information collected from many individuals.) Another option is to allow only aggregate queries, such as an average over the data. Unfortunately, we now understand that neither approach actually provides strong privacy protection. De-identified datasets are subject to database-linkage attacks. Aggregation only protects privacy if the groups being aggregated are sufficiently large, and even then, privacy attacks are still possible [1, 2, 3, 4].
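To make the database-linkage risk concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the records are invented toy data, and the field names (`zip`, `birth_year`, `sex`, `diagnosis`) and the `link` helper are hypothetical. It shows how a dataset with names removed can still be joined to a public list on shared attributes.

```python
# Toy, made-up records for illustration only (not real data).
# The "de-identified" table has no names, but keeps quasi-identifiers.
deidentified = [
    {"zip": "05401", "birth_year": 1985, "sex": "M", "diagnosis": "asthma"},
    {"zip": "05401", "birth_year": 1990, "sex": "F", "diagnosis": "diabetes"},
    {"zip": "05602", "birth_year": 1985, "sex": "M", "diagnosis": "flu"},
]

# A hypothetical public list (for example, a directory) that includes names.
public = [
    {"name": "Person A", "zip": "05401", "birth_year": 1985, "sex": "M"},
    {"name": "Person B", "zip": "05602", "birth_year": 1972, "sex": "F"},
]

QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def link(deid_rows, public_rows, keys=QUASI_IDENTIFIERS):
    """Return {name: diagnosis} for people whose quasi-identifiers
    match exactly one row of the de-identified table."""
    index = {}
    for row in deid_rows:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)

    matches = {}
    for person in public_rows:
        candidates = index.get(tuple(person[k] for k in keys), [])
        if len(candidates) == 1:  # a unique match re-identifies the record
            matches[person["name"]] = candidates[0]["diagnosis"]
    return matches


if __name__ == "__main__":
    print(link(deidentified, public))
```

In this toy case one person is re-identified because their combination of attributes is unique in the de-identified table, even though no name was ever stored there.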

Differential Privacy

Differential privacy [5, 6] is a mathematical definition of what it means to have privacy. It is not a specific process like de-identification, but a property that a process can have. For example, it is possible to prove that a specific algorithm “satisfies” differential privacy.

Informally, differential privacy guarantees the following for each individual who contributes data for analysis: the output of a differentially private analysis will be roughly the same, whether or not you contribute your data. A differentially private analysis is often called a mechanism, and we denote it ℳ.

Figure 1: Informal Definition of Differential Privacy

Figure 1 illustrates this principle. Answer “A” is computed without Joe’s data, while answer “B” is computed with Joe’s data. Differential privacy says that the two answers should be indistinguishable. This implies that whoever sees the output won’t be able to tell whether or not Joe’s data was used, or what Joe’s data contained.

We control the strength of the privacy guarantee by tuning the privacy parameter ε, also called a privacy loss or privacy budget. The lower the value of the ε parameter, the more indistinguishable the results, and therefore the more each individual’s data is protected.

Figure 2: Formal Definition of Differential Privacy
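The figure itself is not reproduced in this excerpt. As a sketch in conventional notation (an assumption about presentation, not necessarily the exact content of Figure 2), the standard formal statement is that a mechanism ℳ satisfies ε-differential privacy if, for every pair of neighboring datasets D and D′ (datasets that differ in one individual's data) and every set S of possible outputs:

```latex
\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[\mathcal{M}(D') \in S]
```

Because the roles of D and D′ can be swapped, the bound holds in both directions. When ε is close to 0, the factor e^ε is close to 1, so the two output distributions are nearly identical.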

We can often answer a query with differential privacy by adding some random noise to the query’s answer. The challenge lies in determining where to add the noise and how much to add. One of the most commonly used mechanisms for adding noise is the Laplace mechanism [5, 7]. 
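As a rough illustration of the Laplace mechanism applied to a counting query, consider the sketch below. The function names, the toy records, and the choice of a counting query with sensitivity 1 are assumptions made for this example, not code from the NIST series. It uses only the Python standard library and is not hardened for production use (for instance, it ignores floating-point subtleties that careful implementations address).

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from a Laplace distribution centred at 0.

    The difference of two independent exponential draws with rate
    1/scale follows a Laplace(0, scale) distribution.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Answer "how many records satisfy predicate?" with Laplace noise.

    Assumption: adding or removing one person's record changes the true
    count by at most 1, so the sensitivity of the query is 1.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(2020)  # seeded only so the demo is repeatable
    # Hypothetical toy data: state of residence for 1,000 people.
    residents = ["Vermont"] * 600 + ["Maine"] * 400
    noisy = private_count(residents, lambda s: s == "Vermont", epsilon=0.5, rng=rng)
    print(f"noisy count: {noisy:.1f}")
```

Each call returns a different noisy answer. Averaged over many calls the noise cancels out, which is why repeated queries against the same data have to be accounted for in the privacy budget.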

Queries with higher sensitivity require adding more noise in order to satisfy a particular ε quantity of differential privacy, and this extra noise has the potential to make results less useful. We will describe sensitivity and this tradeoff between privacy and usefulness in more detail in future blog posts….(More)”.
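The relationship between sensitivity, ε, and noise can be sketched numerically. For the Laplace mechanism the noise scale is the sensitivity divided by ε, and the expected absolute error equals that scale. The helper names below are hypothetical. The clipped-sum example assumes that neighboring datasets differ by adding or removing one person's record, and the age bounds are an arbitrary illustration.

```python
def laplace_scale(sensitivity: float, epsilon: float) -> float:
    """Noise scale b of the Laplace mechanism: b = sensitivity / epsilon.

    For Laplace(0, b) noise, the expected absolute error is also b.
    """
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    return sensitivity / epsilon


def clipped_sum_sensitivity(lower: float, upper: float) -> float:
    """Sensitivity of a sum when each value is first clipped to [lower, upper].

    Assumption: neighboring datasets differ by adding or removing one
    record, so one person can move the sum by at most max(|lower|, |upper|).
    """
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    return max(abs(lower), abs(upper))


if __name__ == "__main__":
    count_sensitivity = 1.0                                # counting query
    age_sum_sensitivity = clipped_sum_sensitivity(0, 100)  # sum of ages clipped to [0, 100]
    for epsilon in (0.1, 0.5, 1.0):
        print(
            f"epsilon={epsilon}: "
            f"count noise scale={laplace_scale(count_sensitivity, epsilon):.1f}, "
            f"age-sum noise scale={laplace_scale(age_sum_sensitivity, epsilon):.1f}"
        )
```

At the same ε, the clipped sum needs 100 times the noise scale of the count, and halving ε doubles the noise for both queries.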

Trade-offs and considerations for the future: Innovation and the COVID-19 response


Essay by Benjamin Kumpf: “…Here are some of the relevant trade-offs I identified. 

Rigour vs. Speed

How to best balance high-quality rigorous research and the need to gain actionable insights rapidly?  

Responding to a pandemic requires working at pace, while investing in ongoing research and the cross-fertilization of disciplines. In our response, we witness the importance of strong networks with academia and DFID’s focus on high-quality research. In parallel, we invest in supporting partners with rapid data collection through methods such as phone surveys, field visits, onsite interviews where possible as well as big data analysis and more. For example, through the International Growth Centre, DFID has supported a Sierra Leone COVID-19 dashboard, providing real-time data on current economic conditions and trends from phone-based surveys from 195 towns and villages across Sierra Leone….

Breadth vs. depth

How to best balance providing services to large proportions of populations in need, while addressing challenges of specific communities?  

We are seeing emerging evidence that the virus and measures to prevent spread are disproportionately impacting marginalized communities and minorities. For example, indigenous people are disproportionately affected by the virus in Brazil, and Dalits are among the worst affected in India. In development and humanitarian contexts, it is paramount to guide innovation efforts with explicit values, including on the trade-off between scale and addressing last-mile challenges to leave no one behind. For example, to facilitate behaviour-change and embed insights from behavioural science and adaptive practices, DFID is supporting the Hygiene Hub, hosted at the London School of Hygiene & Tropical Medicine. The Hub provides free-of-charge advisory services to governments and non-governmental organizations working on COVID-19 related challenges in low and medium-income countries, balancing the need to reach large audiences and to design bespoke interventions for specific communities.

Exploration vs. adaptation

How to best diversify innovation efforts and investments between searching for local solutions and adapting proven approaches?

Adaptive vs. locally-led

How to best learn and adapt, while providing ownership to local players?

Single-point solutions vs. systems-practices

How to advance specific tech and non-tech innovations that address urgent needs, while further improving existing systems? 

Supporting domestic innovators vs. strengthening local solutions and ecosystems

We need explicit conversations to ensure better transparency about this trade-off in innovation investments generally.…(More)”.

What Ever Happened to Digital Contact Tracing?


Chas Kissick, Elliot Setzer, and Jacob Schulz at Lawfare: “In May of this year, Prime Minister Boris Johnson pledged the United Kingdom would develop a “world beating” track and trace system by June 1 to stop the spread of the novel coronavirus. But on June 18, the government quietly abandoned its coronavirus contact-tracing app, a key piece of the “world beating” strategy, and instead promised to switch to a model designed by Apple and Google. The delayed app will not be ready until winter, and the U.K.’s Junior Health Minister told reporters that “it isn’t a priority for us at the moment.” When Johnson came under fire in Parliament for the abrupt U-turn, he replied: “I wonder whether the right honorable and learned Gentleman can name a single country in the world that has a functional contact tracing app—there isn’t one.”

Johnson’s rebuttal is perhaps a bit reductive, but he’s not that far off.

You probably remember the idea of contact-tracing apps: the technological intervention that seemed to have the potential to save lives while enabling a hamstrung economy to safely inch back open; it was a fixation of many public health and privacy advocates; it was the thing that was going to help us get out of this mess if we could manage the risks.

Yet nearly three months after Google and Apple announced with great fanfare their partnership to build a contact-tracing API, contact-tracing apps have made an unceremonious exit from the front pages of American newspapers. Countries, states and localities continue to try to develop effective digital tracing strategies. But as Jonathan Zittrain puts it, the “bigger picture momentum appears to have waned.”

What’s behind contact-tracing apps’ departure from the spotlight? For one, there’s the onset of a larger pandemic apathy in the U.S.; many politicians and Americans seem to have thrown up their hands or put all their hopes in the speedy development of a vaccine. Yet, the apps haven’t even made much of a splash in countries that have taken the pandemic more seriously. Anxieties about privacy persist. But technical shortcomings in the apps deserve the lion’s share of the blame. Countries have struggled to get bespoke apps developed by government technicians to work on Apple phones. The functionality of some Bluetooth-enabled models varies widely depending on small changes in phone positioning. And most countries have only convinced a small fraction of their populace to use national tracing apps.

Maybe it’s still possible that contact-tracing apps will make a miraculous comeback and approach the level of efficacy observers once anticipated.

But even if technical issues implausibly subside, the apps are operating in a world of unknowns.

Most centrally, researchers still have no real idea what level of adoption is required for the apps to actually serve their function. Some estimates suggest that 80 percent of current smartphone owners in a given area would need to use an app and follow its recommendations for digital contact tracing to be effective. But other researchers have noted that the apps could slow the rate of infections even if little more than 10 percent of a population used a tracing app. It will be an uphill battle even to hit the 10 percent mark in America, though. Survey data show that fewer than three in 10 Americans intend to use contact-tracing apps if they become available…(More)”.

Adolescent Mental Health: Using A Participatory Mapping Methodology to Identify Key Priorities for Data Collaboration


Blog by Alexandra Shaw, Andrew J. Zahuranec, Andrew Young, Stefaan G. Verhulst, Jennifer Requejo, Liliana Carvajal: “Adolescence is a unique stage of life. The brain undergoes rapid development; individuals face new experiences, relationships, and environments. These events can be exciting, but they can also be a source of instability and hardship. Half of all mental conditions manifest by early adolescence and between 10 and 20 percent of all children and adolescents report mental health conditions. Despite the increased risks and concerns for adolescents’ well-being, there remain significant gaps in availability of data at the country level for policymakers to address these issues.

In June, The GovLab partnered with colleagues at UNICEF’s Health and HIV team in the Division of Data, Analysis, Planning & Monitoring and the Data for Children Collaborative — a collaboration between UNICEF, the Scottish Government, and the University of Edinburgh — to design and apply a new methodology of participatory mapping and prioritization of key topics and issues associated with adolescent mental health that could be addressed through enhanced data collaboration….

The event led to three main takeaways. First, the topic mapping allows participants to deliberate and prioritize the various aspects of adolescent mental health in a more holistic manner. Unlike the “blind men and the elephant” parable, a topic map allows the participants to see and discuss the interrelated parts of the topic, including those which they might be less familiar with.

Second, the workshops demonstrated the importance of tapping into distributed expertise via participatory processes. While the topic map provided a starting point, the inclusion of various experts allowed the findings of the document to be reviewed in a rapid, legitimate fashion. The diverse inputs helped ensure the individual aspects could be prioritized without a perspective being ignored.

Lastly, the approach showed the importance of data initiatives being driven and validated by those individuals representing the demand. By soliciting the input of those who would actually use the data, the methodology ensured data initiatives focused on the aspects thought to be most relevant and of greatest importance….(More)”

Citizen initiatives facing COVID-19: Due to spontaneous generation or the product of social capital in Mexico City?


UNDP: “Since the COVID-19 pandemic arrived in Mexico, multiple citizen responses[1] have emerged to tackle its impacts: digital aid platforms such as Frena la Curva (Stop the curve) and México Covid19;  groups of makers that design medical and protective equipment; Zapotec indigenous women that teach how to make hand sanitizer at home; public buses that turn into mobile markets; and much more.

There are initiatives that aid 10, 20, 3,000 or more people; initiatives that operate inside a housing unit, a municipality or across the City. Some responses have come from civil society organizations; others from collectives of practitioners or from groups of friends and family; there are even those made out of groups of strangers that the pandemic turned into partners working for the same goal….

Within the plurality of initiatives that have emerged, sometimes seemingly in a spontaneous way, there is a common denominator: people are reacting in a collaborative way to the crisis to solve the needs that the pandemic is leaving behind.

Different research studies connect social cohesion and social capital with a community’s capacity to respond in situations of crisis and natural disasters, and with its subsequent recovery. These concepts, derived from sociology, include aspects such as the level of union; relationships and networks; and interaction between people in a community.

This can be seen when, for example, a group of people in a neighborhood gets together to buy groceries for neighbors who have lost their incomes. Also, when a collective of professionals react to the shortages of protective equipment for health workers by creating low-cost prototypes, or when a civil organization collaborates with local authorities to bring water to households that lack access to clean water.

It would seem that a high rate of social capital and social cohesion might ease the rise of the citizen initiatives that aim to tackle the challenges that ensue from the pandemic. These do not come out of nowhere….(More)”.

Covid-19 data is a public good. The US government must start treating it like one.


Ryan Panchadsaram at MIT Technology Review: “…When the Trump administration stripped the Centers for Disease Control and Prevention (CDC) of control over coronavirus data, it also took that information away from the public….

This is also an opportunity for HHS to make this data machine readable and thereby more accessible to data scientists and data journalists. The Open Government Data Act, signed into law by President Trump, treats data as a strategic asset and makes it open by default. This act builds upon the Open Data Executive Order, which recognized that the data sets collected by the government are paid for by taxpayers and must be made available to them. 

As a country, the United States has lagged behind in so many dimensions of response to this crisis, from the availability of PPE to testing to statewide mask orders. Its treatment of data has lagged as well. On March 7, as this crisis was unfolding, there was no national testing data. Alexis Madrigal, Jeff Hammerbacher, and a group of volunteers started the COVID Tracking Project to aggregate coronavirus information from all 50 state websites into a single Google spreadsheet. For two months, until the CDC began to share data through its own dashboard, this volunteer project was the sole national public source of information on cases and testing. 

With more than 150 volunteers contributing to the effort, the COVID Tracking Project sets the bar for how to treat data as an asset. I serve on the advisory board and am awed by what this group has accomplished. With daily updates, an API, and multiple download formats, they’ve made their data extraordinarily useful. Where the CDC’s data is cited 30 times in Google Scholar and approximately 10,000 times in Google search results, the COVID Tracking Project data is cited 299 times in Google Scholar and roughly 2 million times in Google search results.

Sharing reliable data is one of the most economical and effective interventions the United States has to confront this pandemic. With the Coronavirus Task Force daily briefings a thing of the past, it’s more necessary than ever for all covid-related data to be shared with the public. The effort required to defeat the pandemic is not just a federal response. It is a federal, state, local, and community response. Everyone needs to work from the same trusted source of facts about the situation on the ground. Data is not a partisan affair or a bureaucratic preserve. It is a public trust—and a public resource….(More)”.