Nicolás Rivero at Quartz: “In 2017, Mark McKinney decided enough was enough. The head of the Jackson County Rural Electric Membership Corporation in southern Indiana, a co-op that provides electricity to a rural community of 20,000 members, McKinney was still living without a reliable internet connection. No internet service provider would build the infrastructure to get him or his neighbors online.
“We realized no one was interested due to the capital expense and limited number of members per mile,” says McKinney, “so the board made the decision to go at it on our own.”
The coronavirus pandemic quickly proved the wisdom of their decision: Thanks to their new fiber optic connection, McKinney and his wife were able to self-quarantine without missing work after they were exposed to the virus. Their son finished the spring semester at home after his university shut down in March. “We could not have done that without this connection,” he said.
Across the rural US, more than 100 cooperatives, first launched to provide electric and telephone services as far back as the 1930s, are now laying miles of fiber optic cable to connect their members to high speed internet. Many started building their own networks after failing to convince established internet service providers to cover their communities.
But while rural fiber optic networks have spread swiftly over the past five years, their progress has been uneven. In North Dakota, for example, fiber optic co-ops cover 82% of the state’s landmass, while Nevada has just one co-op. And in the states where the utilities do exist, they tend to serve the whitest communities….(More)”.
Paper by Saira Ghafur, Jackie Van Dael, Melanie Leis, Ara Darzi, and Aziz Sheikh: “Data science and artificial intelligence (AI) have the potential to transform the delivery of health care. Health care as a sector, with all of the longitudinal data it holds on patients across their lifetimes, is positioned to take advantage of what data science and AI have to offer. The current COVID-19 pandemic has shown the benefits of sharing data globally to permit a data-driven response through rapid data collection, analysis, modelling, and timely reporting.
Despite its obvious advantages, data sharing is a controversial subject, with researchers and members of the public justifiably concerned about how and why health data are shared. The most common concern is privacy; even when data are (pseudo-)anonymised, there remains a risk that a malicious hacker could, using only a few datapoints, re-identify individuals. For many, it is often unclear whether the risks of data sharing outweigh the benefits.
A series of surveys over recent years indicate that the public holds a range of views about data sharing. Over the past few years, there have been several important data breaches and cyberattacks. This has resulted in patients and the public questioning the safety of their data, including the prospect or risk of their health data being shared with unauthorised third parties.
We surveyed people across the UK and the USA to examine public attitudes towards data sharing, data access, and the use of AI in health care. These two countries were chosen as comparators as both are high-income countries that have made substantial national investments in health information technology (IT) and have established track records of using data to support health-care planning, delivery, and research. The UK and USA, however, have sharply contrasting models of health-care delivery, making it interesting to observe whether these differences affect public attitudes.
Willingness to share anonymised personal health information varied across receiving bodies (figure). The more commercial the purpose of the receiving institution (eg, for an insurance or tech company), the less often respondents were willing to share their anonymised personal health information in both the UK and the USA. Older respondents (≥35 years) in both countries were generally less likely to trust any organisation with their anonymised personal health information than younger respondents (<35 years)…
Despite the benefits of big data and technology in health care, our findings suggest that the rapid development of novel technologies has been received with concern. Growing commodification of patient data has increased awareness of the risks involved in data sharing. There is a need for public standards that secure regulation and transparency of data use and sharing and support patient understanding of how data are used and for what purposes….(More)”.
Press Release: “The U.S. Food and Drug Administration today launched Project Patient Voice, an initiative of the FDA’s Oncology Center of Excellence (OCE). Through a new website, Project Patient Voice creates a consistent source of publicly available information describing patient-reported symptoms from cancer trials for marketed treatments. While this patient-reported data has historically been analyzed by the FDA during the drug approval process, it is rarely included in product labeling and, therefore, is largely inaccessible to the public.
“Project Patient Voice has been initiated by the Oncology Center of Excellence to give patients and health care professionals unique information on symptomatic side effects to better inform their treatment choices,” said FDA Principal Deputy Commissioner Amy Abernethy, M.D., Ph.D. “The Project Patient Voice pilot is a significant step in advancing a patient-centered approach to oncology drug development. Where patient-reported symptom information is collected rigorously, this information should be readily available to patients.”
Patient-reported outcome (PRO) data is collected using questionnaires that patients complete during clinical trials. These questionnaires are designed to capture important information about disease- or treatment-related symptoms, including how severe a symptom or side effect is and how often it occurs.
Patient-reported data can provide additional, complementary information for health care professionals to discuss with patients, specifically when discussing the potential side effects of a particular cancer treatment. In contrast to the clinician-reported safety data in product labeling, the data in Project Patient Voice is obtained directly from patients and can show symptoms before treatment starts and at multiple time points while receiving cancer treatment.
The Project Patient Voice website will include a list of cancer clinical trials that have available patient-reported symptom data. Each trial will include a table of the patient-reported symptoms collected. Each patient-reported symptom can be selected to display a series of bar and pie charts describing the patient-reported symptom at baseline (before treatment starts) and over the first 6 months of treatment. This information provides insights into side effects not currently available in standard FDA safety tables, including existing symptoms before the start of treatment, symptoms over time, and the subset of patients who did not have a particular symptom prior to starting treatment….(More)”.
Electronic Frontier Foundation: “Law enforcement surveillance isn’t always secret. These technologies can be discovered in news articles and government meeting agendas, in company press releases and social media posts. It just hasn’t been aggregated before.
That’s the starting point for the Atlas of Surveillance, a collaborative effort between the Electronic Frontier Foundation and the University of Nevada, Reno Reynolds School of Journalism. Through a combination of crowdsourcing and data journalism, we are creating the largest-ever repository of information on which law enforcement agencies are using what surveillance technologies. The aim is to generate a resource for journalists, academics, and, most importantly, members of the public to check what’s been purchased locally and how technologies are spreading across the country.
We specifically focused on the most pervasive technologies, including drones, body-worn cameras, face recognition, cell-site simulators, automated license plate readers, predictive policing, camera registries, and gunshot detection. Although we have amassed more than 5,000 datapoints in 3,000 jurisdictions, our research only reveals the tip of the iceberg and underlines the need for journalists and members of the public to continue demanding transparency from criminal justice agencies….(More)”.
Courtney Linder at Popular Mechanics: “Several prominent academic mathematicians want to sever ties with police departments across the U.S., according to a letter submitted to Notices of the American Mathematical Society on June 15. The letter arrived weeks after widespread protests against police brutality, and has inspired over 1,500 other researchers to join the boycott.
These mathematicians are urging fellow researchers to stop all work related to predictive policing software, which broadly includes any data analytics tools that use historical data to help forecast future crime, potential offenders, and victims. The technology is supposed to use probability to help police departments tailor their neighborhood coverage so it puts officers in the right place at the right time….
Figure: The four-part predictive policing cycle (RAND Corporation)
According to a 2013 research briefing from the RAND Corporation, a nonprofit think tank in Santa Monica, California, predictive policing is made up of a four-part cycle (shown above). In the first two steps, researchers collect and analyze data on crimes, incidents, and offenders to come up with predictions. From there, police intervene based on the predictions, usually taking the form of an increase in resources at certain sites at certain times. The fourth step is, ideally, reducing crime.
“Law enforcement agencies should assess the immediate effects of the intervention to ensure that there are no immediately visible problems,” the authors note. “Agencies should also track longer-term changes by examining collected data, performing additional analysis, and modifying operations as needed.”
In many cases, predictive policing software was meant to be a tool to augment police departments facing budget crises, with fewer officers to cover a region. If cops can target certain geographical areas at certain times, then they can get ahead of the 911 calls and maybe even reduce the rate of crime.
But in practice, the accuracy of the technology has been contested—and it’s even been called racist….(More)”.
Introduction to a Special Blog Series by NIST: “…How can we use data to learn about a population, without learning about specific individuals within the population? Consider these two questions:
“How many people live in Vermont?”
“How many people named Joe Near live in Vermont?”
The first reveals a property of the whole population, while the second reveals information about one person. We need to be able to learn about trends in the population while preventing the ability to learn anything new about a particular individual. This is the goal of many statistical analyses of data, such as the statistics published by the U.S. Census Bureau, and machine learning more broadly. In each of these settings, models are intended to reveal trends in populations, not reflect information about any single individual.
But how can we answer the first question, “How many people live in Vermont?” (which we’ll refer to as a query), while preventing the second question, “How many people named Joe Near live in Vermont?”, from being answered? The most widely used solution is called de-identification (or anonymization), which removes identifying information from the dataset. (We’ll generally assume a dataset contains information collected from many individuals.) Another option is to allow only aggregate queries, such as an average over the data. Unfortunately, we now understand that neither approach actually provides strong privacy protection. De-identified datasets are subject to database-linkage attacks. Aggregation only protects privacy if the groups being aggregated are sufficiently large, and even then, privacy attacks are still possible [1, 2, 3, 4].
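To see why aggregate queries alone can fail, consider a differencing attack: two individually innocuous aggregate queries can be combined to reveal one person’s data. The sketch below is purely illustrative (the dataset and names are hypothetical, not from the NIST series):

```python
# Differencing attack: two "safe" aggregate queries, taken together,
# reveal one individual's private value.
ages = {"Alice": 34, "Bob": 51, "Carol": 29, "Dave": 62, "Joe": 45}

# Query 1: an aggregate over the whole population.
total_all = sum(ages.values())

# Query 2: the same aggregate, excluding one person -- still "aggregate".
total_without_joe = sum(age for name, age in ages.items() if name != "Joe")

# Subtracting the two answers exposes Joe's exact age.
print(total_all - total_without_joe)  # 45
```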
Differential Privacy
Differential privacy [5, 6] is a mathematical definition of what it means to have privacy. It is not a specific process like de-identification, but a property that a process can have. For example, it is possible to prove that a specific algorithm “satisfies” differential privacy.
Informally, differential privacy guarantees the following for each individual who contributes data for analysis: the output of a differentially private analysis will be roughly the same, whether or not you contribute your data. A differentially private analysis is often called a mechanism, and we denote it ℳ.
Figure 1: Informal Definition of Differential Privacy
Figure 1 illustrates this principle. Answer “A” is computed without Joe’s data, while answer “B” is computed with Joe’s data. Differential privacy says that the two answers should be indistinguishable. This implies that whoever sees the output won’t be able to tell whether or not Joe’s data was used, or what Joe’s data contained.
We control the strength of the privacy guarantee by tuning the privacy parameter ε, also called a privacy loss or privacy budget. The lower the value of the ε parameter, the more indistinguishable the results, and therefore the more each individual’s data is protected.
Figure 2: Formal Definition of Differential Privacy
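The figure itself is not reproduced in this excerpt. For reference, the standard formal statement from the differential privacy literature [5, 6], which the figure illustrates, is: a mechanism ℳ satisfies ε-differential privacy if, for all pairs of neighboring datasets D and D′ (differing in the data of a single individual) and all sets S of possible outputs,

```latex
\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon} \cdot \Pr[\mathcal{M}(D') \in S]
```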
We can often answer a query with differential privacy by adding some random noise to the query’s answer. The challenge lies in determining where to add the noise and how much to add. One of the most commonly used mechanisms for adding noise is the Laplace mechanism [5, 7].
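As a minimal sketch of the Laplace mechanism (illustrative code, not from the NIST series; the population figure is made up), a counting query can be privatized by adding Laplace noise with scale sensitivity/ε:

```python
import numpy as np

def laplace_mechanism(true_answer, sensitivity, epsilon):
    """Answer a numeric query with epsilon-differential privacy.

    The noise scale is sensitivity / epsilon: a lower epsilon
    (stronger privacy) or a higher sensitivity means more noise.
    """
    return true_answer + np.random.laplace(loc=0.0, scale=sensitivity / epsilon)

# "How many people live in Vermont?" is a counting query: adding or
# removing one person changes the answer by at most 1, so sensitivity = 1.
true_count = 623_989  # hypothetical true count
print(round(laplace_mechanism(true_count, sensitivity=1, epsilon=0.1)))
```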
Queries with higher sensitivity require adding more noise in order to satisfy a particular ε of differential privacy, and this extra noise has the potential to make results less useful. We will describe sensitivity and this tradeoff between privacy and usefulness in more detail in future blog posts….(More)”.
Chas Kissick, Elliot Setzer, and Jacob Schulz at Lawfare: “In May of this year, Prime Minister Boris Johnson pledged the United Kingdom would develop a “world beating” track and trace system by June 1 to stop the spread of the novel coronavirus. But on June 18, the government quietly abandoned its coronavirus contact-tracing app, a key piece of the “world beating” strategy, and instead promised to switch to a model designed by Apple and Google. The delayed app will not be ready until winter, and the U.K.’s Junior Health Minister told reporters that “it isn’t a priority for us at the moment.” When Johnson came under fire in Parliament for the abrupt U-turn, he replied: “I wonder whether the right honorable and learned Gentleman can name a single country in the world that has a functional contact tracing app—there isn’t one.”
Johnson’s rebuttal is perhaps a bit reductive, but he’s not that far off.
You probably remember the idea of contact-tracing apps: the technological intervention that seemed to have the potential to save lives while enabling a hamstrung economy to safely inch back open; it was a fixation of many public health and privacy advocates; it was the thing that was going to help us get out of this mess if we could manage the risks.
Yet nearly three months after Google and Apple announced with great fanfare their partnership to build a contact-tracing API, contact-tracing apps have made an unceremonious exit from the front pages of American newspapers. Countries, states and localities continue to try to develop effective digital tracing strategies. But as Jonathan Zittrain puts it, the “bigger picture momentum appears to have waned.”
What’s behind contact-tracing apps’ departure from the spotlight? For one, there’s the onset of a larger pandemic apathy in the U.S.; many politicians and Americans seem to have thrown up their hands or put all their hopes in the speedy development of a vaccine. Yet the apps haven’t even made much of a splash in countries that have taken the pandemic more seriously. Anxieties about privacy persist. But technical shortcomings in the apps deserve the lion’s share of the blame. Countries have struggled to get bespoke apps developed by government technicians to work on Apple phones. The functionality of some Bluetooth-enabled models varies widely depending on small changes in phone positioning. And most countries have convinced only a small fraction of their populace to use national tracing apps.
Maybe it’s still possible that contact-tracing apps will make a miraculous comeback and approach the level of efficacy observers once anticipated.
But even in the unlikely event that the technical issues subside, the apps are operating in a world of unknowns.
Most centrally, researchers still have no real idea what level of adoption is required for the apps to actually serve their function. Some estimates suggest that 80 percent of current smartphone owners in a given area would need to use an app and follow its recommendations for digital contact tracing to be effective. But other researchers have noted that the apps could slow the rate of infections even if little more than 10 percent of a population used a tracing app. It will be an uphill battle even to hit the 10 percent mark in America, though. Survey data show that fewer than three in 10 Americans intend to use contact-tracing apps if they become available…(More).
Paper by Alasdair S. Roberts: “The first two decades of this century have shown there is no simple formula for governing well. Leaders must make difficult choices about national priorities and the broad lines of policy – that is, about the substance of their strategy for governing. These strategic choices have important implications for public administration. Scholars in this field should study the processes by which strategy is formulated and executed more closely than they have over the last thirty years. A new agenda for public administration should emphasize processes of top-level decision-making, mechanisms to improve foresight and the management of societal risks, and problems of large-scale reorganization and inter-governmental coordination, among other topics. Many of these themes have been examined more closely by researchers in Canada than by those abroad. This difference should be recognized as an advantage rather than a liability….(More)”.
This is also an opportunity for HHS to make this data machine readable and thereby more accessible to data scientists and data journalists. The Open Government Data Act, signed into law by President Trump, treats data as a strategic asset and makes it open by default. This act builds upon the Open Data Executive Order, which recognized that the data sets collected by the government are paid for by taxpayers and must be made available to them.
As a country, the United States has lagged behind in so many dimensions of response to this crisis, from the availability of PPE to testing to statewide mask orders. Its treatment of data has lagged as well. On March 7, as this crisis was unfolding, there was no national testing data. Alexis Madrigal, Jeff Hammerbacher, and a group of volunteers started the COVID Tracking Project to aggregate coronavirus information from all 50 state websites into a single Google spreadsheet. For two months, until the CDC began to share data through its own dashboard, this volunteer project was the sole national public source of information on cases and testing.
With more than 150 volunteers contributing to the effort, the COVID Tracking Project sets the bar for how to treat data as an asset. I serve on the advisory board and am awed by what this group has accomplished. With daily updates, an API, and multiple download formats, they’ve made their data extraordinarily useful. Where the CDC’s data is cited 30 times in Google Scholar and approximately 10,000 times in Google search results, the COVID Tracking Project data is cited 299 times in Google Scholar and roughly 2 million times in Google search results.
Sharing reliable data is one of the most economical and effective interventions the United States has to confront this pandemic. With the Coronavirus Task Force daily briefings a thing of the past, it’s more necessary than ever for all covid-related data to be shared with the public. The effort required to defeat the pandemic is not just a federal response. It is a federal, state, local, and community response. Everyone needs to work from the same trusted source of facts about the situation on the ground. Data is not a partisan affair or a bureaucratic preserve. It is a public trust—and a public resource….(More)”.
Paper by Cass Sunstein: “Do people benefit from food labels? When? By how much? Public officials face persistent challenges in answering these questions. In various nations, they use four different approaches: they refuse to do so on the ground that quantification is not feasible; they engage in breakeven analysis; they project end-states, such as economic savings or health outcomes; and they estimate willingness-to-pay for the relevant information. Each of these approaches runs into strong objections. In principle, the willingness-to-pay question has important advantages. But for those who ask that question, there is a serious problem. In practice, people often lack enough information to give a sensible answer to the question of how much they would be willing to pay for (more) information. People might also suffer from behavioral biases (including present bias and optimistic bias). And when preferences are labile or endogenous, even an informed and unbiased answer to the willingness-to-pay question may fail to capture the welfare consequences, because people may develop new tastes and values as a result of information….(More)”.