Report by the National Academies of Sciences, Engineering, and Medicine: “In December 2019, new cases of severe pneumonia were first detected in Wuhan, China, and the cause was determined to be a novel beta coronavirus related to the severe acute respiratory syndrome (SARS) coronavirus that emerged from a bat reservoir in 2002. Within six months, this new virus—SARS coronavirus 2 (SARS-CoV-2)—has spread worldwide, infecting at least 10 million people with an estimated 500,000 deaths. COVID-19, the disease caused by SARS-CoV-2, was declared a public health emergency of international concern on January 30, 2020 by the World Health Organization (WHO) and a pandemic on March 11, 2020. To date, there is no approved effective treatment or vaccine for COVID-19, and it continues to spread in many countries.
Genomic Epidemiology Data Infrastructure Needs for SARS-CoV-2: Modernizing Pandemic Response Strategies lays out a framework to define and describe the data needs for a system to track and correlate viral genome sequences with clinical and epidemiological data. Such a system would help ensure the integration of data on viral evolution with detection, diagnostic, and countermeasure efforts. This report also explores data collection mechanisms to ensure a representative global sample set of all relevant extant sequences and considers challenges and opportunities for coordination across existing domestic, global, and regional data sources….(More)”.
Essay by Peter Strauss: “This essay has been written to set the context for a future issue of Daedalus, the quarterly of the American Academy of Arts and Sciences, addressing the prospects of American administrative law in the Twenty-first Century. It recounts the growth of American government over the centuries since its founding, in response to the profound changes in the technology, economy, and scientific understandings it must deal with, under a Constitution written for the governance of a dispersed agrarian population operating with hand tools in a localized economy. It then suggests profound challenges of the present day facing administrative law’s development: the transition from processes of the paper age to those of the digital age; the steadily growing centralization of decision in an opaque, political presidency, displacing the focused knowledge and expertise of agencies Congress created to pursue particular governmental ends; the thickening, as well, of the political layer within agencies themselves, threatening similar displacements; and the revival in the courts of highly formalized analytic techniques inviting a return to the forms of government those who wrote the Constitution might themselves have imagined. The essay will not be published until months after the November election. While President Trump’s first term in office has sharply illustrated an imbalance in American governance between politics and law, reason and unreason, that imbalance is hardly new; it has been growing for decades. There lie the challenges….(More)”
Blog by Georgina Bourke: “The Mayor of London listed cycling and walking as key population health indicators in the London Health Inequalities Strategy. The pandemic has only amplified the need for people to use cycling as a safer and healthier mode of transport. Yet as the majority of cyclists are white, Black communities are less likely to get the health benefits that cycling provides. Groups like Transport for London (TfL) should monitor how different communities cycle and who is excluded. Organisations like the London Office of Technology and Innovation (LOTI) could help boroughs procure privacy preserving technology to help their efforts.
But at the moment, it’s difficult for public organisations to access mobility data held by private companies. One reason is because mobility data is sensitive. Even if you remove identifiers like name and address, there’s still a risk you can reidentify someone by linking different data sets together. This means you could track how an individual moved around a city. I wrote more about the privacy risks with mobility data in a previous blog post. The industry’s awareness of privacy issues in using and sharing mobility data is rising. In the case of the Los Angeles Department of Transportation’s (LADOT) Mobility Data Specification, Uber is concerned about sharing anonymised data because of the privacy risk. Both organisations are now involved in a legal battle to see which has the rights to the data. This might have been avoided if Uber had applied privacy preserving techniques….
Privacy preserving techniques can help mobility providers share important insights with authorities without compromising peoples’ privacy.
Instead of requiring access to all customer trip data, authorities could ask specific questions like, where are the least popular places to cycle? If mobility providers apply techniques like randomised response, an individual’s identity is obscured by the noise added to the data. This means it’s highly unlikely that someone could be reidentified later on. And because this technique requires authorities to ask very specific questions – for randomised response to work, the answer has to be binary, ie Yes or No – authorities will also be practicing data minimisation by default.
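As an illustration, here is a minimal Python sketch of the classic coin-flip form of randomised response (the blog does not specify a variant, so this is an assumed example, not any provider’s actual implementation):

```python
import random

def randomised_response(true_answer: bool) -> bool:
    """Answer a yes/no question with plausible deniability:
    flip a coin; on heads answer honestly, on tails answer
    with a second, independent coin flip."""
    if random.random() < 0.5:
        return true_answer            # honest answer
    return random.random() < 0.5      # random answer

# Any single reported answer is deniable, but an aggregator can still
# estimate the true "Yes" rate across many respondents, since
#   E[reported yes rate] = 0.5 * true_yes_rate + 0.25.
responses = [randomised_response(True) for _ in range(10_000)]
reported_yes_rate = sum(responses) / len(responses)
estimated_true_yes_rate = 2 * (reported_yes_rate - 0.25)
print(f"Estimated true 'Yes' rate: {estimated_true_yes_rate:.2f}")
```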
It’s easy to imagine transport authorities like TfL combining privacy preserved mobility data from multiple mobility providers to compare insights and measure service provision. They could cross reference the privacy preserved bike trip data with demographic data in the local area to learn how different communities cycle. The first step to addressing inequality is being able to measure it….(More)”.
Nicolás Rivero at Quartz: “In 2017, Mark McKinney decided enough was enough. The head of the Jackson County Rural Electric Membership Corporation in southern Indiana, a co-op that provides electricity to a rural community of 20,000 members, McKinney was still living without a reliable internet connection. No internet service provider would build the infrastructure to get him or his neighbors online.
“We realized no one was interested due to the capital expense and limited number of members per mile,” says McKinney, “so the board made the decision to go at it on our own.”
The coronavirus pandemic quickly proved the wisdom of their decision: Thanks to their new fiber optic connection, McKinney and his wife were able to self-quarantine without missing work after they were exposed to the virus. Their son finished the spring semester at home after his university shut down in March. “We could not have done that without this connection,” he said.
Across the rural US, more than 100 cooperatives, first launched to provide electric and telephone services as far back as the 1930s, are now laying miles of fiber optic cable to connect their members to high speed internet. Many started building their own networks after failing to convince established internet service providers to cover their communities.
But while rural fiber optic networks have spread swiftly over the past five years, their progress has been uneven. In North Dakota, for example, fiber optic co-ops cover 82% of the state’s landmass, while Nevada has just one co-op. And in the states where the utilities do exist, they tend to serve the whitest communities….(More)”.
Paper by Saira Ghafur, Jackie Van Dael, Melanie Leis, Ara Darzi, and Aziz Sheikh: “Data science and artificial intelligence (AI) have the potential to transform the delivery of health care. Health care as a sector, with all of the longitudinal data it holds on patients across their lifetimes, is positioned to take advantage of what data science and AI have to offer. The current COVID-19 pandemic has shown the benefits of sharing data globally to permit a data-driven response through rapid data collection, analysis, modelling, and timely reporting.
Despite its obvious advantages, data sharing is a controversial subject, with researchers and members of the public justifiably concerned about how and why health data are shared. The most common concern is privacy; even when data are (pseudo-)anonymised, there remains a risk that a malicious hacker could, using only a few datapoints, re-identify individuals. For many, it is often unclear whether the risks of data sharing outweigh the benefits.
A series of surveys over recent years indicate that the public holds a range of views about data sharing. Over the past few years, there have been several important data breaches and cyberattacks. This has resulted in patients and the public questioning the safety of their data, including the prospect or risk of their health data being shared with unauthorised third parties.
We surveyed people across the UK and the USA to examine public attitude towards data sharing, data access, and the use of AI in health care. These two countries were chosen as comparators as both are high-income countries that have had substantial national investments in health information technology (IT) with established track records of using data to support health-care planning, delivery, and research. The UK and USA, however, have sharply contrasting models of health-care delivery, making it interesting to observe if these differences affect public attitudes.
Willingness to share anonymised personal health information varied across receiving bodies. The more commercial the purpose of the receiving institution (eg, for an insurance or tech company), the less often respondents were willing to share their anonymised personal health information in both the UK and the USA. Older respondents (≥35 years) in both countries were generally less likely to trust any organisation with their anonymised personal health information than younger respondents (<35 years)…
Despite the benefits of big data and technology in health care, our findings suggest that the rapid development of novel technologies has been received with concern. Growing commodification of patient data has increased awareness of the risks involved in data sharing. There is a need for public standards that secure regulation and transparency of data use and sharing and support patient understanding of how data are used and for what purposes….(More)”.
Press Release: “The U.S. Food and Drug Administration today launched Project Patient Voice, an initiative of the FDA’s Oncology Center of Excellence (OCE). Through a new website, Project Patient Voice creates a consistent source of publicly available information describing patient-reported symptoms from cancer trials for marketed treatments. While this patient-reported data has historically been analyzed by the FDA during the drug approval process, it is rarely included in product labeling and, therefore, is largely inaccessible to the public.
“Project Patient Voice has been initiated by the Oncology Center of Excellence to give patients and health care professionals unique information on symptomatic side effects to better inform their treatment choices,” said FDA Principal Deputy Commissioner Amy Abernethy, M.D., Ph.D. “The Project Patient Voice pilot is a significant step in advancing a patient-centered approach to oncology drug development. Where patient-reported symptom information is collected rigorously, this information should be readily available to patients.”
Patient-reported outcome (PRO) data is collected using questionnaires that patients complete during clinical trials. These questionnaires are designed to capture important information about disease- or treatment-related symptoms. This includes how severe or how often a symptom or side effect occurs.
Patient-reported data can provide additional, complementary information for health care professionals to discuss with patients, specifically when discussing the potential side effects of a particular cancer treatment. In contrast to the clinician-reported safety data in product labeling, the data in Project Patient Voice is obtained directly from patients and can show symptoms before treatment starts and at multiple time points while receiving cancer treatment.
The Project Patient Voice website will include a list of cancer clinical trials that have available patient-reported symptom data. Each trial will include a table of the patient-reported symptoms collected. Each patient-reported symptom can be selected to display a series of bar and pie charts describing the patient-reported symptom at baseline (before treatment starts) and over the first 6 months of treatment. This information provides insights into side effects not currently available in standard FDA safety tables, including existing symptoms before the start of treatment, symptoms over time, and the subset of patients who did not have a particular symptom prior to starting treatment….(More)”.
Electronic Frontier Foundation: “Law enforcement surveillance isn’t always secret. These technologies can be discovered in news articles and government meeting agendas, in company press releases and social media posts. It just hasn’t been aggregated before.
That’s the starting point for the Atlas of Surveillance, a collaborative effort between the Electronic Frontier Foundation and the University of Nevada, Reno Reynolds School of Journalism. Through a combination of crowdsourcing and data journalism, we are creating the largest-ever repository of information on which law enforcement agencies are using what surveillance technologies. The aim is to generate a resource for journalists, academics, and, most importantly, members of the public to check what’s been purchased locally and how technologies are spreading across the country.
We specifically focused on the most pervasive technologies, including drones, body-worn cameras, face recognition, cell-site simulators, automated license plate readers, predictive policing, camera registries, and gunshot detection. Although we have amassed more than 5,000 datapoints in 3,000 jurisdictions, our research only reveals the tip of the iceberg and underlines the need for journalists and members of the public to continue demanding transparency from criminal justice agencies….(More)”.
Courtney Linder at Popular Mechanics: “Several prominent academic mathematicians want to sever ties with police departments across the U.S., according to a letter submitted to Notices of the American Mathematical Society on June 15. The letter arrived weeks after widespread protests against police brutality, and has inspired over 1,500 other researchers to join the boycott.
These mathematicians are urging fellow researchers to stop all work related to predictive policing software, which broadly includes any data analytics tools that use historical data to help forecast future crime, potential offenders, and victims. The technology is supposed to use probability to help police departments tailor their neighborhood coverage so it puts officers in the right place at the right time….
Figure: RAND’s four-part predictive policing cycle
According to a 2013 research briefing from the RAND Corporation, a nonprofit think tank in Santa Monica, California, predictive policing is made up of a four-part cycle (shown above). In the first two steps, researchers collect and analyze data on crimes, incidents, and offenders to come up with predictions. From there, police intervene based on the predictions, usually taking the form of an increase in resources at certain sites at certain times. The fourth step is, ideally, reducing crime.
“Law enforcement agencies should assess the immediate effects of the intervention to ensure that there are no immediately visible problems,” the authors note. “Agencies should also track longer-term changes by examining collected data, performing additional analysis, and modifying operations as needed.”
In many cases, predictive policing software was meant to be a tool to augment police departments that are facing budget crises with fewer officers to cover a region. If cops can target certain geographical areas at certain times, then they can get ahead of the 911 calls and maybe even reduce the rate of crime.
But in practice, the accuracy of the technology has been contested—and it’s even been called racist….(More)”.
Introduction to a Special Blog Series by NIST: “…How can we use data to learn about a population, without learning about specific individuals within the population? Consider these two questions:
“How many people live in Vermont?”
“How many people named Joe Near live in Vermont?”
The first reveals a property of the whole population, while the second reveals information about one person. We need to be able to learn about trends in the population while preventing the ability to learn anything new about a particular individual. This is the goal of many statistical analyses of data, such as the statistics published by the U.S. Census Bureau, and machine learning more broadly. In each of these settings, models are intended to reveal trends in populations, not reflect information about any single individual.
But how can we answer the first question (“How many people live in Vermont?”) — which we’ll refer to as a query — while preventing the second question (“How many people named Joe Near live in Vermont?”) from being answered? The most widely used solution is called de-identification (or anonymization), which removes identifying information from the dataset. (We’ll generally assume a dataset contains information collected from many individuals.) Another option is to allow only aggregate queries, such as an average over the data. Unfortunately, we now understand that neither approach actually provides strong privacy protection. De-identified datasets are subject to database-linkage attacks. Aggregation only protects privacy if the groups being aggregated are sufficiently large, and even then, privacy attacks are still possible [1, 2, 3, 4].
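To make the linkage risk concrete, here is a toy Python sketch (the records and the pandas-based join are illustrative inventions, not from the NIST post) of how two individually “harmless” datasets can be joined on shared quasi-identifiers:

```python
import pandas as pd

# A "de-identified" health dataset: names removed, but
# quasi-identifiers (ZIP code, birth date, sex) remain.
health = pd.DataFrame({
    "zip": ["05401", "05602"],
    "birth_date": ["1984-03-02", "1975-11-17"],
    "sex": ["M", "F"],
    "diagnosis": ["hypertension", "diabetes"],
})

# A public dataset (e.g., a voter roll) that shares those quasi-identifiers.
voters = pd.DataFrame({
    "name": ["Joe Near", "Jane Doe"],
    "zip": ["05401", "05602"],
    "birth_date": ["1984-03-02", "1975-11-17"],
    "sex": ["M", "F"],
})

# Joining the two datasets re-identifies the "anonymous" health records.
linked = health.merge(voters, on=["zip", "birth_date", "sex"])
print(linked[["name", "diagnosis"]])
```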
Differential Privacy
Differential privacy [5, 6] is a mathematical definition of what it means to have privacy. It is not a specific process like de-identification, but a property that a process can have. For example, it is possible to prove that a specific algorithm “satisfies” differential privacy.
Informally, differential privacy guarantees the following for each individual who contributes data for analysis: the output of a differentially private analysis will be roughly the same, whether or not you contribute your data. A differentially private analysis is often called a mechanism, and we denote it ℳ.
Figure 1: Informal Definition of Differential Privacy
Figure 1 illustrates this principle. Answer “A” is computed without Joe’s data, while answer “B” is computed with Joe’s data. Differential privacy says that the two answers should be indistinguishable. This implies that whoever sees the output won’t be able to tell whether or not Joe’s data was used, or what Joe’s data contained.
We control the strength of the privacy guarantee by tuning the privacy parameter ε, also called a privacy loss or privacy budget. The lower the value of the ε parameter, the more indistinguishable the results, and therefore the more each individual’s data is protected.
Figure 2: Formal Definition of Differential Privacy
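For reference, the standard formal definition (presumably what Figure 2 depicts) is: a mechanism ℳ satisfies ε-differential privacy if, for every pair of datasets D₁ and D₂ that differ in the data of a single individual, and for every set S of possible outputs,

$$\Pr[\mathcal{M}(D_1) \in S] \;\le\; e^{\varepsilon} \cdot \Pr[\mathcal{M}(D_2) \in S]$$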
We can often answer a query with differential privacy by adding some random noise to the query’s answer. The challenge lies in determining where to add the noise and how much to add. One of the most commonly used mechanisms for adding noise is the Laplace mechanism [5, 7].
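As a minimal Python sketch (an illustration, not NIST’s reference implementation), the Laplace mechanism answers a numeric query by adding noise drawn from the Laplace distribution, with scale equal to the query’s sensitivity divided by ε:

```python
import numpy as np

def laplace_mechanism(true_answer: float, sensitivity: float, epsilon: float) -> float:
    """Return a differentially private answer: the true answer plus
    Laplace noise with scale = sensitivity / epsilon."""
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_answer + noise

# A counting query ("How many people live in Vermont?") has sensitivity 1,
# because adding or removing any one person changes the count by at most 1.
# (The count below is illustrative, not an official figure.)
private_count = laplace_mechanism(true_answer=623_989, sensitivity=1.0, epsilon=0.1)
print(f"Private count: {private_count:.0f}")
```

Smaller values of ε produce a larger noise scale, matching the intuition above: stronger privacy, noisier answers.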
Queries with higher sensitivity require adding more noise in order to satisfy a particular level (ε) of differential privacy, and this extra noise has the potential to make results less useful. We will describe sensitivity and this tradeoff between privacy and usefulness in more detail in future blog posts….(More)”.
Chas Kissick, Elliot Setzer, and Jacob Schulz at Lawfare: “In May of this year, Prime Minister Boris Johnson pledged the United Kingdom would develop a “world beating” track and trace system by June 1 to stop the spread of the novel coronavirus. But on June 18, the government quietly abandoned its coronavirus contact-tracing app, a key piece of the “world beating” strategy, and instead promised to switch to a model designed by Apple and Google. The delayed app will not be ready until winter, and the U.K.’s Junior Health Minister told reporters that “it isn’t a priority for us at the moment.” When Johnson came under fire in Parliament for the abrupt U-turn, he replied: “I wonder whether the right honorable and learned Gentleman can name a single country in the world that has a functional contact tracing app—there isn’t one.”
Johnson’s rebuttal is perhaps a bit reductive, but he’s not that far off.
You probably remember the idea of contact-tracing apps: the technological intervention that seemed to have the potential to save lives while enabling a hamstrung economy to safely inch back open; it was a fixation of many public health and privacy advocates; it was the thing that was going to help us get out of this mess if we could manage the risks.
Yet nearly three months after Google and Apple announced with great fanfare their partnership to build a contact-tracing API, contact-tracing apps have made an unceremonious exit from the front pages of American newspapers. Countries, states and localities continue to try to develop effective digital tracing strategies. But as Jonathan Zittrain puts it, the “bigger picture momentum appears to have waned.”
What’s behind contact-tracing apps’ departure from the spotlight? For one, there’s the onset of a larger pandemic apathy in the U.S.; many politicians and Americans seem to have thrown up their hands or put all their hopes in the speedy development of a vaccine. Yet the apps haven’t even made much of a splash in countries that have taken the pandemic more seriously. Anxieties about privacy persist. But technical shortcomings in the apps deserve the lion’s share of the blame. Countries have struggled to get bespoke apps developed by government technicians to work on Apple phones. The functionality of some Bluetooth-enabled models varies widely depending on small changes in phone positioning. And most countries have only convinced a small fraction of their populace to use national tracing apps.
Maybe it’s still possible that contact-tracing apps will make a miraculous comeback and approach the level of efficacy observers once anticipated.
But even if technical issues implausibly subside, the apps are operating in a world of unknowns.
Most centrally, researchers still have no real idea what level of adoption is required for the apps to actually serve their function. Some estimates suggest that 80 percent of current smartphone owners in a given area would need to use an app and follow its recommendations for digital contact tracing to be effective. But other researchers have noted that the apps could slow the rate of infections even if little more than 10 percent of a population used a tracing app. It will be an uphill battle even to hit the 10 percent mark in America, though. Survey data show that fewer than three in 10 Americans intend to use contact-tracing apps if they become available…(More).