Applying new models of data stewardship to health and care data


Report by the Open Data Institute: “The outbreak of the coronavirus (Covid-19) has amplified and accelerated the need for an effective technology ecosystem that benefits everyone’s health.

The pandemic has been accompanied by a marked increase in the use of digital technology, including introduction of remote consultation in general practice, new data flows to support the distribution of food and other essentials, and applications to support digital contact tracing.

This report explores models of ‘data stewardship’ (the collection, maintenance and sharing of data) required to enable better evaluation. It argues everybody involved in technology has a shared responsibility to enable evaluation, whether that means innovators sharing data for evaluation purposes, or healthcare providers being clearer, from the outset, about what data is needed to support effective evaluation.

This report re-envisages the role of evaluators as data stewards, who could use their positions as intermediaries to encourage stakeholders to share data, and help increase access to data for public benefit…(More)”.

Internet Searches for Acute Anxiety During the Early Stages of the COVID-19 Pandemic


Paper by John W. Ayers et al: “There is widespread concern that the coronavirus disease 2019 (COVID-19) pandemic may harm population mental health, chiefly owing to anxiety about the disease and its societal fallout. But traditional population mental health surveillance (eg, telephone surveys, medical records) is time consuming, expensive, and may miss persons who do not participate or seek care. To evaluate the association of COVID-19 with anxiety on a population basis, we examined internet searches indicative of acute anxiety during the early stages of the COVID-19 pandemic.

Methods

The analysis relied on nonidentifiable, aggregate, public data and was exempted by the University of California San Diego Human Research Protections Program. Acute anxiety, including what are colloquially called anxiety attacks or panic attacks, was monitored because of its higher prevalence relative to other mental health problems. It can lead to other mental health problems (including depression), it is triggered by outside stressors, and it is socially contagious. Using Google Trends (https://trends.google.com/trends) we monitored the daily fraction of all internet searches (thereby adjusting the results for any change in total queries) that included the terms anxiety or panic in combination with attack (including panic attack, signs of anxiety attack, anxiety attack symptoms) that originated from the US from January 1, 2004, through May 4, 2020. Raw search counts were inferred using Comscore estimates (comscore.com).

We compared search volumes after President Trump declared a national COVID-19 emergency on March 13, 2020, with expected search volumes if COVID-19 had not occurred, thereby taking into account the historical trend and periodicity in the data. Expected volumes were computed using an autoregressive integrated moving average model, based on historical trends from January 1, 2004 to March 12, 2020, to predict counterfactual trends for March 13, 2020 to May 9, 2020. The expected volumes with prediction intervals (PIs) and ratio of observed and expected volumes with bootstrap CIs were computed using R statistical software (version 3.5.3, R Foundation). The results were similar if we varied our interruption date plus or minus 1 week….(More)”.
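The observed-versus-expected comparison described in this excerpt can be illustrated with a minimal Python sketch. It is not the authors' code (they used R), and it makes several assumptions: the expected series is taken as already forecast by some model fitted to pre-interruption data, the function names `excess_ratio` and `bootstrap_ci` are hypothetical, the numbers are made up for illustration, and the bootstrap resamples days independently, which ignores the autocorrelation a real analysis would need to handle.

```python
import random

def excess_ratio(observed, expected):
    """Ratio of total observed to total expected volume over the period."""
    return sum(observed) / sum(expected)

def bootstrap_ci(observed, expected, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the ratio.

    Days are resampled with replacement as (observed, expected) pairs.
    This is a simplification: it treats days as independent.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    days = list(zip(observed, expected))
    ratios = []
    for _ in range(n_boot):
        sample = [rng.choice(days) for _ in days]
        ratios.append(sum(o for o, _ in sample) / sum(e for _, e in sample))
    ratios.sort()
    lower = ratios[int((alpha / 2) * n_boot)]
    upper = ratios[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Illustrative values only (not data from the study):
# daily search fractions after the interruption date, and the
# counterfactual fractions a pre-interruption model would have predicted.
observed = [1.10, 1.25, 1.18, 1.30, 1.22, 1.15, 1.28]
expected = [1.00, 1.02, 0.99, 1.01, 1.00, 0.98, 1.03]

ratio = excess_ratio(observed, expected)
lo, hi = bootstrap_ci(observed, expected)
print(f"observed/expected = {ratio:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

A ratio above 1 with an interval that excludes 1 would suggest search volumes higher than the counterfactual trend predicts. A time-series-aware resampling scheme would be a more faithful choice for autocorrelated daily data.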

Data Collaboratives


Andrew Young and Stefaan Verhulst at The Palgrave Encyclopedia of Interest Groups, Lobbying and Public Affairs: “The rise of the open data movement means that a growing amount of data is today being broken out of information silos and released or shared with third parties. Yet despite the growing accessibility of data, there continues to exist a mismatch between the supply of, and demand for, data (Verhulst & Young, 2018). This is because supply and demand are often widely dispersed – spread across government, the private sector, and civil society – meaning that those who need data do not know where to find it, and those who release data do not know how to effectively target it at those who can most effectively use it (Susha, Janssen, & Verhulst, 2017). While much commentary on the data era’s shortcomings focuses on issues such as data glut (Buchanan & Kock, 2001), misuse of data (Solove & Citron, 2017), or algorithmic bias (Hajian, Bonchi, & Castillo, 2016), this mismatch between supply and demand is at least equally problematic, resulting in tremendous inefficiencies and lost potential.

Data collaboratives, when designed responsibly (Alemanno, 2018), can help to address such shortcomings. They draw together otherwise siloed data – such as, for example, telecom data, satellite imagery, social media data, financial data – and a dispersed range of expertise. In the process, they help match supply and demand, and ensure that the appropriate institutions and individuals are using and analyzing data in ways that maximize the possibility of new, innovative social solutions (de Montjoye, Gambs, Blondel, et al., 2018)….(More)”.

Mapping socioeconomic indicators using social media advertising data


Paper by Ingmar Weber et al: “The United Nations Sustainable Development Goals (SDGs) are a global consensus on the world’s most pressing challenges. They come with a set of 232 indicators against which countries should regularly monitor their progress, ensuring that everyone is represented in up-to-date data that can be used to make decisions to improve people’s lives. However, existing data sources to measure progress on the SDGs are often outdated or lacking appropriate disaggregation. We evaluate the value that anonymous, publicly accessible advertising data from Facebook can provide in mapping socio-economic development in two low- and middle-income countries, the Philippines and India. Concretely, we show that audience estimates of how many Facebook users in a given location use particular device types, such as Android vs. iOS devices, or particular connection types, such as 2G vs. 4G, provide strong signals for modeling regional variation in the Wealth Index (WI), derived from the Demographic and Health Survey (DHS). We further show that, surprisingly, the predictive power of these digital connectivity features is roughly equal at both the high and low ends of the WI spectrum. Finally, we show how such data can be used to create gender-disaggregated predictions, but that these predictions only appear plausible in contexts with gender-equal Facebook usage, such as the Philippines, but not in contexts with large gender Facebook gaps, such as India….(More)”.
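To make the idea of using device-type shares as a signal for the Wealth Index concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's model: the helper names `ios_share` and `fit_line` are hypothetical, the regional figures are invented, and a one-variable least-squares line stands in for whatever modeling approach the authors actually used.

```python
def ios_share(ios_users, android_users):
    """Fraction of a region's estimated audience that is on iOS."""
    return ios_users / (ios_users + android_users)

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical regions (not data from the paper):
# (iOS audience estimate, Android audience estimate, survey-based Wealth Index)
regions = [
    (5_000, 95_000, -0.8),
    (12_000, 88_000, -0.3),
    (20_000, 80_000, 0.2),
    (35_000, 65_000, 0.9),
]

shares = [ios_share(ios, android) for ios, android, _ in regions]
wealth = [wi for _, _, wi in regions]
slope, intercept = fit_line(shares, wealth)

# Predict the Wealth Index for a region where only audience estimates exist.
new_share = ios_share(15_000, 85_000)
print(f"predicted WI at iOS share {new_share:.2f}: {slope * new_share + intercept:.2f}")
```

In practice several connectivity features (device types, connection types) would be combined, and predictions would be validated against held-out survey regions.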

The EU is launching a market for personal data. Here’s what that means for privacy.


Anna Artyushina at MIT Tech Review: “The European Union has long been a trendsetter in privacy regulation. Its General Data Protection Regulation (GDPR) and stringent antitrust laws have inspired new legislation around the world. For decades, the EU has codified protections on personal data and fought against what it viewed as commercial exploitation of private information, proudly positioning its regulations in contrast to the light-touch privacy policies in the United States.

The new European data governance strategy takes a fundamentally different approach. With it, the EU will become an active player in facilitating the use and monetization of its citizens’ personal data. Unveiled by the European Commission in February 2020, the strategy outlines policy measures and investments to be rolled out in the next five years.

This new strategy represents a radical shift in the EU’s focus, from protecting individual privacy to promoting data sharing as a civic duty. Specifically, it will create a pan-European market for personal data through a mechanism called a data trust. A data trust is a steward that manages people’s data on their behalf and has fiduciary duties toward its clients.

The EU’s new plan considers personal data to be a key asset for Europe. However, this approach raises some questions. First, the EU’s intent to profit from the personal data it collects puts European governments in a weak position to regulate the industry. Second, the improper use of data trusts can actually deprive citizens of their rights to their own data.

The Trusts Project, the first initiative put forth by the new EU policies, will be implemented by 2022. With a €7 million budget, it will set up a pan-European pool of personal and nonpersonal information that should become a one-stop shop for businesses and governments looking to access citizens’ information.

Global technology companies will not be allowed to store or move Europeans’ data. Instead, they will be required to access it via the trusts. Citizens will collect “data dividends,” which haven’t been clearly defined but could include monetary or nonmonetary payments from companies that use their personal data. With the EU’s roughly 500 million citizens poised to become data sources, the trusts will create the world’s largest data market.

For citizens, this means the data created by them and about them will be held in public servers and managed by data trusts. The European Commission envisions the trusts as a way to help European businesses and governments reuse and extract value from the massive amounts of data produced across the region, and to help European citizens benefit from their information. The project documentation, however, does not specify how individuals will be compensated.

Data trusts were first proposed by internet pioneer Sir Tim Berners-Lee in 2018, and the concept has drawn considerable interest since then. Just like the trusts used to manage one’s property, data trusts may serve different purposes: they can be for-profit enterprises, or they can be set up for data storage and protection, or to work for a charitable cause.

IBM and Mastercard have built a data trust to manage the financial information of their European clients in Ireland; the UK and Canada have employed data trusts to stimulate the growth of the AI industries there; and recently, India announced plans to establish its own public data trust to spur the growth of technology companies.

The new EU project is modeled on Austria’s digital system, which keeps track of information produced by and about its citizens by assigning them unique identifiers and storing the data in public repositories.

Unfortunately, data trusts do not guarantee more transparency. The trust is governed by a charter created by the trust’s settlor, and its rules can be made to prioritize someone’s interests. The trust is run by a board of directors, which means a party that has more seats gains significant control.

The Trusts Project is bound to face some governance issues of its own. Public and private actors often do not see eye to eye when it comes to running critical infrastructure or managing valuable assets. Technology companies tend to favor policies that create opportunity for their own products and services. Caught in a conflict of interest, Europe may overlook the question of privacy….(More)”.

From Desert Battlefields To Coral Reefs, Private Satellites Revolutionize The View


NPR Story: “As the U.S. military and its allies attacked the last Islamic State holdouts last year, it wasn’t clear how many civilians were still in the besieged desert town of Baghouz, Syria.

So Human Rights Watch asked a private satellite company, Planet, for its regular daily photos and also made a special request for video.

“That live video actually was instrumental in convincing us that there were thousands of civilians trapped in this pocket,” said Josh Lyons of Human Rights Watch. “Therefore, the coalition forces absolutely had an obligation to stop and to avoid bombardment of that pocket at that time.”

Which they did until the civilians fled.

Lyons, who’s based in Geneva, has a job title you wouldn’t expect at a human rights group: director of geospatial analysis. He says satellite imagery is increasingly a crucial component of human rights investigations, bolstering traditional eyewitness accounts, especially in areas where it’s too dangerous to send researchers.

“Then we have this magical sort of fusion of data between open-source, eyewitness testimony and data from space. And that becomes essentially a new gold standard for investigations,” he said.

‘A string of pearls’

Satellite photos used to be restricted to the U.S. government and a handful of other nations. Now such imagery is available to everyone, creating a new world of possibilities for human rights groups, environmentalists and researchers who monitor nuclear programs.

They get those images from a handful of private, commercial satellite companies, like Planet and Maxar….(More)”.

The Risks and Rewards of Data Sharing for Smart Cities


Study by Massimo Russo and Tian Feng: “…To develop innovative solutions to problems old and new, many cities are aggregating and sharing more and more data, establishing platforms to facilitate private-sector participation, and holding “hackathons” and other digital events to invite public help. But digital solutions carry their own complications. Technology-led innovation often depends on access to data from a wide variety of sources to derive correlations and insights. Questions regarding data ownership, amalgamation, compensation, and privacy can be flashing red lights.

Smart cities are on the leading edge of the trend toward greater data sharing. They are also complex generators and users of data. Companies, industries, governments, and others are following in their wake, sharing more data in order to foster innovation and address such macro-level challenges as public health and welfare and climate change. Smart cities thus provide a constructive laboratory for studying the challenges and benefits of data sharing.

WHY CITIES SHARE DATA

BCG examined some 75 smart-city applications that use data from a variety of sources, including connected equipment (that is, the Internet of Things, or IoT). Nearly half the applications require data sourced from multiple industries or platforms. (See Exhibit 1.) For example, a parking reservation app assembles garage occupancy data, historical traffic data, current weather data, and information on upcoming public events to determine real-time parking costs. We also looked at a broader set of potential future applications and found that an additional 40% will likewise require cross-industry data aggregation.

Because today’s smart solutions are often sponsored by individual municipal departments, many IoT-enabled applications rely on limited, siloed data. But given the potential value of applications that require aggregation across sources, it’s no surprise that many cities are pursuing partnerships with tech providers to develop platforms and other initiatives that integrate data from multiple sources….(More)”.

Genomic Epidemiology Data Infrastructure Needs for SARS-CoV-2


Report by the National Academies of Sciences, Engineering, and Medicine: “In December 2019, new cases of severe pneumonia were first detected in Wuhan, China, and the cause was determined to be a novel beta coronavirus related to the severe acute respiratory syndrome (SARS) coronavirus that emerged from a bat reservoir in 2002. Within six months, this new virus—SARS coronavirus 2 (SARS-CoV-2)—has spread worldwide, infecting at least 10 million people with an estimated 500,000 deaths. COVID-19, the disease caused by SARS-CoV-2, was declared a public health emergency of international concern on January 30, 2020 by the World Health Organization (WHO) and a pandemic on March 11, 2020. To date, there is no approved effective treatment or vaccine for COVID-19, and it continues to spread in many countries.

Genomic Epidemiology Data Infrastructure Needs for SARS-CoV-2: Modernizing Pandemic Response Strategies lays out a framework to define and describe the data needs for a system to track and correlate viral genome sequences with clinical and epidemiological data. Such a system would help ensure the integration of data on viral evolution with detection, diagnostic, and countermeasure efforts. This report also explores data collection mechanisms to ensure a representative global sample set of all relevant extant sequences and considers challenges and opportunities for coordination across existing domestic, global, and regional data sources….(More)”.

Public perceptions on data sharing: key insights from the UK and the USA


Paper by Saira Ghafur, Jackie Van Dael, Melanie Leis, Ara Darzi, and Aziz Sheikh: “Data science and artificial intelligence (AI) have the potential to transform the delivery of health care. Health care as a sector, with all of the longitudinal data it holds on patients across their lifetimes, is positioned to take advantage of what data science and AI have to offer. The current COVID-19 pandemic has shown the benefits of sharing data globally to permit a data-driven response through rapid data collection, analysis, modelling, and timely reporting.

Despite its obvious advantages, data sharing is a controversial subject, with researchers and members of the public justifiably concerned about how and why health data are shared. The most common concern is privacy; even when data are (pseudo-)anonymised, there remains a risk that a malicious hacker could, using only a few datapoints, re-identify individuals. For many, it is often unclear whether the risks of data sharing outweigh the benefits.

A series of surveys over recent years indicate that the public holds a range of views about data sharing. Over the past few years, there have been several important data breaches and cyberattacks. This has resulted in patients and the public questioning the safety of their data, including the prospect or risk of their health data being shared with unauthorised third parties.

We surveyed people across the UK and the USA to examine public attitudes towards data sharing, data access, and the use of AI in health care. These two countries were chosen as comparators as both are high-income countries that have had substantial national investments in health information technology (IT) with established track records of using data to support health-care planning, delivery, and research. The UK and USA, however, have sharply contrasting models of health-care delivery, making it interesting to observe if these differences affect public attitudes.

Willingness to share anonymised personal health information varied across receiving bodies (figure). The more commercial the purpose of the receiving institution (eg, for an insurance or tech company), the less often respondents were willing to share their anonymised personal health information in both the UK and the USA. Older respondents (≥35 years) in both countries were generally less likely to trust any organisation with their anonymised personal health information than younger respondents (<35 years)…

Despite the benefits of big data and technology in health care, our findings suggest that the rapid development of novel technologies has been received with concern. Growing commodification of patient data has increased awareness of the risks involved in data sharing. There is a need for public standards that secure regulation and transparency of data use and sharing and support patient understanding of how data are used and for what purposes….(More)”.

Project Patient Voice


Press Release: “The U.S. Food and Drug Administration today launched Project Patient Voice, an initiative of the FDA’s Oncology Center of Excellence (OCE). Through a new website, Project Patient Voice creates a consistent source of publicly available information describing patient-reported symptoms from cancer trials for marketed treatments. While this patient-reported data has historically been analyzed by the FDA during the drug approval process, it is rarely included in product labeling and, therefore, is largely inaccessible to the public.

“Project Patient Voice has been initiated by the Oncology Center of Excellence to give patients and health care professionals unique information on symptomatic side effects to better inform their treatment choices,” said FDA Principal Deputy Commissioner Amy Abernethy, M.D., Ph.D. “The Project Patient Voice pilot is a significant step in advancing a patient-centered approach to oncology drug development. Where patient-reported symptom information is collected rigorously, this information should be readily available to patients.” 

Patient-reported outcome (PRO) data is collected using questionnaires that patients complete during clinical trials. These questionnaires are designed to capture important information about disease- or treatment-related symptoms. This includes how severe or how often a symptom or side effect occurs.

Patient-reported data can provide additional, complementary information for health care professionals to discuss with patients, specifically when discussing the potential side effects of a particular cancer treatment. In contrast to the clinician-reported safety data in product labeling, the data in Project Patient Voice is obtained directly from patients and can show symptoms before treatment starts and at multiple time points while receiving cancer treatment. 

The Project Patient Voice website will include a list of cancer clinical trials that have available patient-reported symptom data. Each trial will include a table of the patient-reported symptoms collected. Each patient-reported symptom can be selected to display a series of bar and pie charts describing the patient-reported symptom at baseline (before treatment starts) and over the first 6 months of treatment. This information provides insights into side effects not currently available in standard FDA safety tables, including existing symptoms before the start of treatment, symptoms over time, and the subset of patients who did not have a particular symptom prior to starting treatment….(More)”.