Paging Dr. Google: How the Tech Giant Is Laying Claim to Health Data


Wall Street Journal: “Roughly a year ago, Google offered health-data company Cerner Corp. an unusually rich proposal.

Cerner was interviewing Silicon Valley giants to pick a storage provider for 250 million health records, one of the largest collections of U.S. patient data. Google dispatched former chief executive Eric Schmidt to personally pitch Cerner over several phone calls and offered around $250 million in discounts and incentives, people familiar with the matter say. 

Google had a bigger goal in pushing for the deal than dollars and cents: a way to expand its effort to collect, analyze and aggregate health data on millions of Americans. Google representatives were vague in answering questions about how Cerner’s data would be used, making the health-care company’s executives wary, the people say. Eventually, Cerner struck a storage deal with Amazon.com Inc. instead.

The failed Cerner deal reveals an emerging challenge to Google’s move into health care: gaining the trust of health care partners and the public. So far, that has hardly slowed the search giant.

Google has struck partnerships with some of the country’s largest hospital systems and most-renowned health-care providers, many of them vast in scope and few of their details previously reported. In just a few years, the company has achieved the ability to view or analyze tens of millions of patient health records in at least three-quarters of U.S. states, according to a Wall Street Journal analysis of contractual agreements. 

In certain instances, the deals allow Google to access personally identifiable health information without the knowledge of patients or doctors. The company can review complete health records, including names, dates of birth, medications and other ailments, according to people familiar with the deals.

The prospect of tech giants’ amassing huge troves of health records has raised concerns among lawmakers, patients and doctors, who fear such intimate data could be used without individuals’ knowledge or permission, or in ways they might not anticipate. 

Google is developing a search tool, similar to its flagship search engine, in which patient information is stored, collated and analyzed by the company’s engineers, on its own servers. The portal is designed for use by doctors and nurses, and eventually perhaps patients themselves, though some Google staffers would have access sooner. 

Google executives and some health systems say that detailed data sharing has the potential to improve health outcomes. Large troves of data help fuel algorithms Google is creating to detect lung cancer, eye disease and kidney injuries. Hospital executives have long sought better electronic record systems to reduce error rates and cut down on paperwork….

Legally, the information gathered by Google can be used for purposes beyond diagnosing illnesses, under laws enacted during the dial-up era. U.S. federal privacy laws make it possible for health-care providers, with little or no input from patients, to share data with certain outside companies. That applies to partners, like Google, with significant presences outside health care. The company says its intentions in health are unconnected with its advertising business, which depends largely on data it has collected on users of its many services, including email and maps.

Medical information is perhaps the last bounty of personal data yet to be scooped up by technology companies. The health data-gathering efforts of other tech giants such as Amazon and International Business Machines Corp. face skepticism from physician and patient advocates. But Google’s push in particular has set off alarm bells in the industry, including over privacy concerns. U.S. senators, as well as health-industry executives, are questioning Google’s expansion and its potential for commercializing personal data….(More)”.

Towards adaptive governance in big data health research: implementing regulatory principles


Chapter by Alessandro Blasimme and Effy Vayena: “While data-enabled health care systems are in their infancy, biomedical research is rapidly adopting the big data paradigm. Digital epidemiology, for example, already employs data generated outside the public health care system – that is, data generated without the intent of using them for epidemiological research – to understand and prevent patterns of diseases in populations (Salathé 2018). Precision medicine – pooling together genomic, environmental and lifestyle data – also represents a prominent example of how data integration can drive both fundamental and translational research in important medical domains such as oncology (D. C. Collins et al. 2017). All of this requires the collection, storage, analysis and distribution of massive amounts of personal information as well as the use of state-of-the-art data analytics tools to uncover health- and disease-related patterns.


The realization of the potential of big data in health evokes a necessary commitment to a sense of “continuity” articulated in three distinct ways: a) from data generation to use (as in the data-enabled learning health care); b) from research to clinical practice, e.g. discovery of new mutations in the context of diagnostics; c) from strictly speaking health data (Vayena and Gasser 2016), e.g. clinical records, to less so, e.g. tweets used in digital epidemiology. These continuities face the challenge of regulatory and governance approaches that were designed for clear data taxonomies, for a less blurred boundary between research and clinical practice, and for rules that focused mostly on data generation and less on their eventual and multiple uses.

The result is significant uncertainty about how responsible use of such large amounts of sensitive personal data could be fostered. In this chapter we focus on the uncertainties surrounding the use of biomedical big data in the context of health research. Are new criteria needed to review biomedical big data research projects? Do current mechanisms, such as informed consent, offer sufficient protection to research participants’ autonomy and privacy in this new context? Do existing oversight mechanisms ensure transparency and accountability in data access and sharing? What monitoring tools are available to assess how personal data are used over time? Is the equitable distribution of benefits accruing from such data uses considered, or can it be ensured? How is the public being involved – if at all – with decisions about creating and using large data repositories for research purposes? What is the role that IT (information technology) players, and especially big ones, acquire in research? And what regulatory instruments do we have to ensure that such players do not undermine the independence of research?…(More)”.

Responsible data sharing in a big data-driven translational research platform: lessons learned


Paper by S. Kalkman et al: “The sharing of clinical research data is increasingly viewed as a moral duty [1]. Particularly in the context of making clinical trial data widely available, editors of international medical journals have labeled data sharing a highly efficient way to advance scientific knowledge [2,3,4]. The combination of even larger datasets into so-called “Big Data” is considered to offer even greater benefits for science, medicine and society [5]. Several international consortia have now promised to build grand-scale, Big Data-driven translational research platforms to generate better scientific evidence regarding disease etiology, diagnosis, treatment and prognosis across various disease areas [6,7,8].

Despite anticipated benefits, large-scale sharing of health data is charged with ethical questions. Stakeholders have been urged to consider how to manage privacy and confidentiality issues, ensure valid informed consent, and determine who gets to decide about data access [9]. More fundamentally, new data sharing activities prompt questions about social justice and public trust [10]. To balance potential benefits and ethical considerations, data sharing platforms require guidance for the processes of interaction and decision-making. In the European Union (EU), legal norms specified for the sharing of personal data for health research, most notably those set out in the General Data Protection Regulation (GDPR) (EU 2016/679), remain open to interpretation and offer limited practical guidance to researchers [11,12,13]. Striking in this regard is that the GDPR itself stresses the importance of adherence to ethical standards, when broad consent is put forward as a legal basis for the processing of personal data. For example, Recital 33 of the GDPR states that data subjects should be allowed to give “consent to certain areas of scientific research when in keeping with recognised ethical standards for scientific research” [14]. In fact, the GDPR actually encourages data controllers to establish self-regulating mechanisms, such as a code of conduct. To foster responsible and sustainable data sharing in translational research platforms, ethical guidance and governance are therefore necessary. Here, we define governance as ‘the processes of interaction and decision-making among the different stakeholders that are involved in a collective problem that lead to the creation, reinforcement, or reproduction of social norms and institutions’…(More)”.

Biased Algorithms Are Easier to Fix Than Biased People


Sendhil Mullainathan in The New York Times: “In one study published 15 years ago, two people applied for a job. Their résumés were about as similar as two résumés can be. One person was named Jamal, the other Brendan.

In a study published this year, two patients sought medical care. Both were grappling with diabetes and high blood pressure. One patient was black, the other was white.

Both studies documented racial injustice: In the first, the applicant with a black-sounding name got fewer job interviews. In the second, the black patient received worse care.

But they differed in one crucial respect. In the first, hiring managers made biased decisions. In the second, the culprit was a computer program.

As a co-author of both studies, I see them as a lesson in contrasts. Side by side, they show the stark differences between two types of bias: human and algorithmic.

Marianne Bertrand, an economist at the University of Chicago, and I conducted the first study: We responded to actual job listings with fictitious résumés, half of which were randomly assigned a distinctively black name.

The study was: “Are Emily and Greg more employable than Lakisha and Jamal?”

The answer: Yes, and by a lot. Simply having a white name increased callbacks for job interviews by 50 percent.

I published the other study in the journal “Science” in late October with my co-authors: Ziad Obermeyer, a professor of health policy at University of California at Berkeley; Brian Powers, a clinical fellow at Brigham and Women’s Hospital; and Christine Vogeli, a professor of medicine at Harvard Medical School. We focused on an algorithm that is widely used in allocating health care services, and has affected roughly a hundred million people in the United States.

To better target care and provide help, health care systems are turning to voluminous data and elaborately constructed algorithms to identify the sickest patients.

We found these algorithms have a built-in racial bias. At similar levels of sickness, black patients were deemed to be at lower risk than white patients. The magnitude of the distortion was immense: Eliminating the algorithmic bias would more than double the number of black patients who would receive extra help. The problem lay in a subtle engineering choice: to measure “sickness,” they used the most readily available data, health care expenditures. But because society spends less on black patients than equally sick white ones, the algorithm understated the black patients’ true needs.

One difference between these studies is the work needed to uncover bias…(More)”.
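
The engineering choice described in the excerpt, ranking patients by spending rather than by sickness, can be illustrated with a small toy model. This is a hedged sketch and not the algorithm the study audited: the groups "A" and "B", the per-condition spending figures, the patient counts and the cutoff `k` are all invented for illustration. The only assumption carried over from the excerpt is that equally sick patients in one group generate less spending than those in the other.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    group: str       # hypothetical group label, "A" or "B"
    conditions: int  # stand-in for true sickness (number of chronic conditions)
    cost: float      # observed health care spending, used as the proxy label


# Invented figures: each condition generates less spending for group B,
# mirroring the excerpt's point that less is spent on equally sick patients.
SPEND_PER_CONDITION = {"A": 1000.0, "B": 600.0}


def make_patients():
    """Build ten patients per group with 1 to 10 conditions each."""
    return [
        Patient(group, conditions, conditions * SPEND_PER_CONDITION[group])
        for group in ("A", "B")
        for conditions in range(1, 11)
    ]


def select_top(patients, key, k):
    """Return the k patients ranked highest by the given score."""
    return sorted(patients, key=key, reverse=True)[:k]


def count_group(selected, group):
    """Count how many selected patients belong to the given group."""
    return sum(1 for p in selected if p.group == group)


patients = make_patients()
k = 6  # hypothetical number of slots in an extra-help program

by_cost = select_top(patients, key=lambda p: p.cost, k=k)        # proxy label
by_need = select_top(patients, key=lambda p: p.conditions, k=k)  # true sickness

b_by_cost = count_group(by_cost, "B")
b_by_need = count_group(by_need, "B")

print(f"Group B patients selected when ranking by cost: {b_by_cost}")
print(f"Group B patients selected when ranking by need: {b_by_need}")
```

In this toy setup both groups are equally sick by construction, yet ranking on cost selects fewer group B patients than ranking on the number of conditions. The gap comes entirely from the choice of label, not from any difference in need.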

Accelerating Medicines Partnership (AMP): Improving Drug Research Efficiency through Biomarker Data Sharing


Data Collaborative Case Study by Michelle Winowatan, Andrew Young, and Stefaan Verhulst: “Accelerating Medicines Partnership (AMP) is a cross-sector data-sharing partnership in the United States between the National Institutes of Health (NIH), the Food and Drug Administration (FDA), multiple biopharmaceutical and life science companies, as well as non-profit organizations that seeks to improve the efficiency of developing new diagnostics and treatments for several types of disease. To achieve this goal, the partnership created a pre-competitive collaborative ecosystem where the biomedical community can pool data and resources that are relevant to the prioritized disease areas. A key component of the partnership is to make biomarkers data available to the medical research community through online portals.

Data Collaboratives Model: Based on our typology of data collaborative models, AMP is an example of the data pooling model of data collaboration, specifically a public data pool. Public data pools co-mingle data assets from multiple data holders — in this case pharmaceutical companies — and make those shared assets available on the web. Pools often limit contributions to approved partners (as public data pools are not crowdsourcing efforts), but access to the shared assets is open, enabling independent re-uses.

Data Stewardship Approach: Data stewardship is built into the partnership through the establishment of an executive committee, which governs the entire partnership, and a steering committee for each disease area, which governs each of the sub-projects within AMP. These committees consist of representatives from the institutional partners involved in AMP and perform data steward functions, including enabling inter-institutional engagement as well as intra-institutional coordination, data audit and assessment of value and risk, communication of findings, and nurturing the collaboration to sustainability….(Full Case Study)”.

The Role of Crowdsourcing in the Healthcare Industry


Chapter by Kabir C. Sen: “The twenty-first century has seen the advent of technical advances in storage, transmission and analysis of information. This has had a profound impact on the field of medicine. However, notwithstanding these advances, various obstacles remain in the world regarding the improvement of human lives through the provision of better health care. The obstacles emanate from the demand (i.e., the problem) as well as the supply (i.e., the solution) side. In some cases, the nature of the problems might not have been correctly identified. In others, a solution to a problem could be known only to a small niche of the global population. Thus, from the demand perspective, the variety of health care issues can range from the quest for a cure for a rare illness to the inability to successfully implement verifiable preventive measures for a disease that affects pockets of the global population. Alternatively, from the supply perspective, the approach to a host of health issues might vary because of fundamental differences in both medical philosophies and organizational policies.

In many instances, effective solutions to health care problems are lacking because of inadequate global knowledge about the particular disease. Alternatively, in other cases, a solution might exist but the relevant knowledge about it might only be available to selected pockets of the global medical community. Sometimes, the barriers to the transfer of knowledge might have their root causes in ignorance or prejudice about the initiator of the cure or solution. However, the advent of information technology has now provided an opportunity for individuals located at different geographical locations to collaborate on solutions to various problems. These crowdsourcing projects now have the potential to extract the “wisdom of crowds” for tackling problems which previously could not be solved by a group of experts (Surowiecki, 2014). Anecdotal evidence suggests that crowdsourcing has achieved some success in providing solutions for a rare medical disease (Arnold, 2014). This chapter discusses crowdsourcing’s potential to solve medical problems by designing a framework to evaluate its promises and suggest recommended future paths of actions….(More)”.

Engaging citizens in determining the appropriate conditions and purposes for re-using Health Data


Beth Noveck at The GovLab: “…The term, big health data, refers to the ability to gather and analyze vast quantities of online information about health, wellness and lifestyle. It includes not only our medical records but data from apps that track what we buy, how often we exercise and how well we sleep, among many other things. It provides an ocean of information about how healthy or ill we are, and unsurprisingly, doctors, medical researchers, healthcare organizations, insurance companies and governments are keen to get access to it. Should they be allowed to?

It’s a huge question, and AARP is partnering with GovLab to learn what older Americans think about it. AARP is a non-profit organization — the largest in the nation and the world — dedicated to empowering Americans to choose how they live as they age. In 2018 it had more than 38 million members. It is a key voice in policymaking in the United States, because it represents the views of people aged over 50 in this country.

From today, AARP and the GovLab are using the Internet to capture what AARP members feel are the most urgent issues confronting them to try to discover what worries people most: the use of big health data or the failure to use it.

The answers are not simple. On the one hand, increasing the use and sharing of data could enable doctors to make better diagnoses and interventions to prevent disease and make us healthier. It could lead medical researchers to find cures faster, while the creation of health data businesses could strengthen the economy.

On the other hand, the collection, sharing, and use of big health data could reveal sensitive personal information over which we have little control. This data could be sold without our consent, and be used by entities for surveillance or discrimination, rather than to promote well-being….(More)”.

How Data Can Help in the Fight Against the Opioid Epidemic in the United States


Report by Joshua New: “The United States is in the midst of an opioid epidemic 20 years in the making….

One of the most pernicious obstacles in the fight against the opioid epidemic is that, until relatively recently, it was difficult to measure the epidemic in any comprehensive capacity beyond such high-level statistics. A lack of granular data and authorities’ inability to use data to inform response efforts allowed the epidemic to grow to devastating proportions. The maxim “you can’t manage what you can’t measure” has never been so relevant, and this failure to effectively leverage data has undoubtedly cost many lives and caused severe social and economic damage to communities ravaged by opioid addiction, with authorities limited in their ability to fight back.

Many factors contributed to the opioid epidemic, including healthcare providers not fully understanding the potential ramifications of prescribing opioids, socioeconomic conditions that make addiction more likely, and drug distributors turning a blind eye to likely criminal behavior, such as pharmacy workers illegally selling opioids on the black market. Data will not be able to solve these problems, but it can make public health officials and other stakeholders more effective at responding to them. Fortunately, recent efforts to better leverage data in the fight against the opioid epidemic have demonstrated the potential for data to be an invaluable and effective tool to inform decision-making and guide response efforts. Policymakers should aggressively pursue more data-driven strategies to combat the opioid epidemic while learning from past mistakes that helped contribute to the epidemic to prevent similar situations in the future.

The scope of this paper is limited to opportunities to better leverage data to help address problems primarily related to the abuse of prescription opioids, rather than the abuse of illicitly manufactured opioids such as heroin and fentanyl. While these issues may overlap, such as when a person develops an opioid use disorder from prescribed opioids and then seeks heroin when they are unable to obtain more from their doctor, the opportunities to address the abuse of prescription opioids are more clear-cut….(More)”.

Unregulated Health Research Using Mobile Devices: Ethical Considerations and Policy Recommendations


Paper by Mark A. Rothstein et al: “Mobile devices with health apps, direct-to-consumer genetic testing, crowd-sourced information, and other data sources have enabled research by new classes of researchers. Independent researchers, citizen scientists, patient-directed researchers, self-experimenters, and others are not covered by federal research regulations because they are not recipients of federal financial assistance or conducting research in anticipation of a submission to the FDA for approval of a new drug or medical device. This article addresses the difficult policy challenge of promoting the welfare and interests of research participants, as well as the public, in the absence of regulatory requirements and without discouraging independent, innovative scientific inquiry. The article recommends a series of measures, including education, consultation, transparency, self-governance, and regulation to strike the appropriate balance….(More)”.

Google’s ‘Project Nightingale’ Gathers Personal Health Data on Millions of Americans


Rob Copeland at Wall Street Journal: “Google is engaged with one of the U.S.’s largest health-care systems on a project to collect and crunch the detailed personal-health information of millions of people across 21 states.

The initiative, code-named “Project Nightingale,” appears to be the biggest effort yet by a Silicon Valley giant to gain a toehold in the health-care industry through the handling of patients’ medical data. Amazon.com Inc., Apple Inc. and Microsoft Corp. are also aggressively pushing into health care, though they haven’t yet struck deals of this scope.

Google began Project Nightingale in secret last year with St. Louis-based Ascension, a Catholic chain of 2,600 hospitals, doctors’ offices and other facilities, with the data sharing accelerating since summer, according to internal documents.

The data involved in the initiative encompasses lab results, doctor diagnoses and hospitalization records, among other categories, and amounts to a complete health history, including patient names and dates of birth….

Neither patients nor doctors have been notified. At least 150 Google employees already have access to much of the data on tens of millions of patients, according to a person familiar with the matter and the documents.

In a news release issued after The Wall Street Journal reported on Project Nightingale on Monday, the companies said the initiative is compliant with federal health law and includes robust protections for patient data….(More)”.