Barriers to Working With National Health Service England’s Open Data


Paper by Ben Goldacre and Seb Bacon: “Open data is information made freely available to third parties in structured formats without restrictive licensing conditions, permitting commercial and noncommercial organizations to innovate. In the context of National Health Service (NHS) data, this is intended to improve patient outcomes and efficiency. EBM DataLab is a research group with a focus on online tools which turn our research findings into actionable monthly outputs. We regularly import and process more than 15 different NHS open datasets to deliver OpenPrescribing.net, one of the most high-impact use cases for NHS England’s open data, with over 15,000 unique users each month. In this paper, we have described the many breaches of best practices around NHS open data that we have encountered. Examples include datasets that repeatedly change location without warning or forwarding; datasets that are needlessly behind a “CAPTCHA” and so cannot be automatically downloaded; longitudinal datasets that change their structure without warning or documentation; near-duplicate datasets with unexplained differences; datasets that are impossible to locate, and thus may or may not exist; poor or absent documentation; and withholding of data for dubious reasons. We propose new open ways of working that will support better analytics for all users of the NHS. These include better curation, better documentation, and systems for better dialogue with technical teams….(More)”.

Hospitals Give Tech Giants Access to Detailed Medical Records


Melanie Evans at the Wall Street Journal: “Hospitals have granted Microsoft Corp., International Business Machines and Amazon.com Inc. the ability to access identifiable patient information under deals to crunch millions of health records, the latest examples of hospitals’ growing influence in the data economy.

The breadth of access wasn’t always spelled out by hospitals and tech giants when the deals were struck.

The scope of data sharing in these and other recently reported agreements reveals a powerful new role that hospitals play—as brokers to technology companies racing into the $3 trillion health-care sector. Rapid digitization of health records and privacy laws enabling companies to swap patient data have positioned hospitals as a primary arbiter of how such sensitive data is shared. 

“Hospitals are massive containers of patient data,” said Lisa Bari, a consultant and former lead for health information technology for the Centers for Medicare and Medicaid Services Innovation Center. 

Hospitals can share patient data as long as they follow federal privacy laws, which contain limited consumer protections, she said. “The data belongs to whoever has it.”…

Digitizing patients’ medical histories, laboratory results and diagnoses has created a booming market in which tech giants are looking to store and crunch data, with potential for groundbreaking discoveries and lucrative products.

There is no indication of wrongdoing in the deals. Officials at the companies and hospitals say they have safeguards to protect patients. Hospitals control data, with privacy training and close tracking of tech employees with access, they said. Health data can’t be combined independently with other data by tech companies….(More)”.

Social media firms 'should hand over data amid suicide risk'


Denis Campbell at the Guardian: “Social media firms such as Facebook and Instagram should be forced to hand over data about who their users are and why they use the sites to reduce suicide among children and young people, psychiatrists have said.

The call from the Royal College of Psychiatrists comes as ministers finalise plans to crack down on issues caused by people viewing unsavoury material and messages online.

The college, which represents the UK’s 18,000 psychiatrists, wants the government to make social media platforms hand over the data to academics so that they can study what sort of content users are viewing.

“We will never understand the risks and benefits of social media use unless the likes of Twitter, Facebook and Instagram share their data with researchers,” said Dr Bernadka Dubicka, chair of the college’s child and adolescent mental health faculty. “Their research will help shine a light on how young people are interacting with social media, not just how much time they spend online.”

Data passed to academics would show the type of material viewed and how long users were spending on such platforms but would be anonymous, the college said.

The government plans to set up a new online safety regulator and the college says it should be given the power to compel firms to hand over data. It is also calling for the forthcoming 2% “turnover tax” on social media companies’ income to be extended so that it includes their turnover internationally, not from just the UK.

“Self-regulation is not working. It is time for government to step up and take decisive action to hold social media companies to account for escalating harmful content to vulnerable children and young people,” said Dubicka.

The college’s demands come amid growing concern that young people are being harmed by material that, for example, encourages self-harm, suicide and eating disorders. They are included in a new position statement on technology use and the mental health of children and young people.

NHS England challenged firms to hand over the sort of information that the college is suggesting. Claire Murdoch, its national director for mental health, said that action was needed “to rein in potentially misleading or harmful online content and behaviours”.

She said: “If these tech giants really want to be a force for good, put a premium on users’ wellbeing and take their responsibilities seriously, then they should do all they can to help researchers better understand how they operate and the risks posed. Until then, they cannot confidently say whether the good outweighs the bad.”

The demands have also been backed by Ian Russell, who has become a campaigner against social media harm since his 14-year-old daughter Molly killed herself in November 2017….(More)”.

The future is intelligent: Harnessing the potential of artificial intelligence in Africa


Youssef Travaly and Kevin Muvunyi at Brookings: “…AI in particular presents countless avenues for both the public and private sectors to optimize solutions to the most crucial problems facing the continent today, especially for struggling industries. For example, in health care, AI solutions can help scarce personnel and facilities do more with less by speeding initial processing, triage, diagnosis, and post-care follow up. Furthermore, AI-based pharmacogenomics applications, which focus on the likely response of an individual to therapeutic drugs based on certain genetic markers, can be used to tailor treatments. Considering the genetic diversity found on the African continent, it is highly likely that the application of these technologies in Africa will result in considerable advancement in medical treatment on a global level.

In agriculture, Abdoulaye Baniré Diallo, co-founder and chief scientific officer of the AI startup My Intelligent Machines, is working with advanced algorithms and machine learning methods to leverage genomic precision in livestock production models. With genomic precision, it is possible to build intelligent breeding programs that minimize the ecological footprint, address changing consumer demands, and contribute to the well-being of people and animals alike through the selection of good genetic characteristics at an early stage of the livestock production process. These are just a few examples that illustrate the transformative potential of AI technology in Africa.

However, a number of structural challenges undermine rapid adoption and implementation of AI on the continent. Inadequate basic and digital infrastructure seriously erodes efforts to activate AI-powered solutions as it reduces crucial connectivity. (For more on strategies to improve Africa’s digital infrastructure, see the viewpoint on page 67 of the full report). A lack of flexible and dynamic regulatory systems also frustrates the growth of a digital ecosystem that favors AI technology, especially as tech leaders want to scale across borders. Furthermore, lack of relevant technical skills, particularly for young people, is a growing threat. This skills gap means that those who would have otherwise been at the forefront of building AI are left out, preventing the continent from harnessing the full potential of transformative technologies and industries.

Similarly, the lack of adequate investments in research and development is an important obstacle. Africa must develop innovative financial instruments and public-private partnerships to fund human capital development, including a focus on industrial research and innovation hubs that bridge the gap between higher education institutions and the private sector to ensure the transition of AI products from lab to market….(More)”.

Paging Dr. Google: How the Tech Giant Is Laying Claim to Health Data


Wall Street Journal: “Roughly a year ago, Google offered health-data company Cerner Corp. an unusually rich proposal.

Cerner was interviewing Silicon Valley giants to pick a storage provider for 250 million health records, one of the largest collections of U.S. patient data. Google dispatched former chief executive Eric Schmidt to personally pitch Cerner over several phone calls and offered around $250 million in discounts and incentives, people familiar with the matter say. 

Google had a bigger goal in pushing for the deal than dollars and cents: a way to expand its effort to collect, analyze and aggregate health data on millions of Americans. Google representatives were vague in answering questions about how Cerner’s data would be used, making the health-care company’s executives wary, the people say. Eventually, Cerner struck a storage deal with Amazon.com Inc. instead.

The failed Cerner deal reveals an emerging challenge to Google’s move into health care: gaining the trust of health care partners and the public. So far, that has hardly slowed the search giant.

Google has struck partnerships with some of the country’s largest hospital systems and most-renowned health-care providers, many of them vast in scope and few of their details previously reported. In just a few years, the company has achieved the ability to view or analyze tens of millions of patient health records in at least three-quarters of U.S. states, according to a Wall Street Journal analysis of contractual agreements. 

In certain instances, the deals allow Google to access personally identifiable health information without the knowledge of patients or doctors. The company can review complete health records, including names, dates of birth, medications and other ailments, according to people familiar with the deals.

The prospect of tech giants’ amassing huge troves of health records has raised concerns among lawmakers, patients and doctors, who fear such intimate data could be used without individuals’ knowledge or permission, or in ways they might not anticipate. 

Google is developing a search tool, similar to its flagship search engine, in which patient information is stored, collated and analyzed by the company’s engineers, on its own servers. The portal is designed for use by doctors and nurses, and eventually perhaps patients themselves, though some Google staffers would have access sooner. 

Google executives and some health systems say that detailed data sharing has the potential to improve health outcomes. Large troves of data help fuel algorithms Google is creating to detect lung cancer, eye disease and kidney injuries. Hospital executives have long sought better electronic record systems to reduce error rates and cut down on paperwork….

Legally, the information gathered by Google can be used for purposes beyond diagnosing illnesses, under laws enacted during the dial-up era. U.S. federal privacy laws make it possible for health-care providers, with little or no input from patients, to share data with certain outside companies. That applies to partners, like Google, with significant presences outside health care. The company says its intentions in health are unconnected with its advertising business, which depends largely on data it has collected on users of its many services, including email and maps.

Medical information is perhaps the last bounty of personal data yet to be scooped up by technology companies. The health data-gathering efforts of other tech giants such as Amazon and International Business Machines Corp. face skepticism from physician and patient advocates. But Google’s push in particular has set off alarm bells in the industry, including over privacy concerns. U.S. senators, as well as health-industry executives, are questioning Google’s expansion and its potential for commercializing personal data….(More)”.

Towards adaptive governance in big data health research: implementing regulatory principles


Chapter by Alessandro Blasimme and Effy Vayena: “While data-enabled health care systems are in their infancy, biomedical research is rapidly adopting the big data paradigm. Digital epidemiology, for example, already employs data generated outside the public health care system – that is, data generated without the intent of using them for epidemiological research – to understand and prevent patterns of diseases in populations (Salathé 2018). Precision medicine – pooling together genomic, environmental and lifestyle data – also represents a prominent example of how data integration can drive both fundamental and translational research in important medical domains such as oncology (D. C. Collins et al. 2017). All of this requires the collection, storage, analysis and distribution of massive amounts of personal information as well as the use of state-of-the-art data analytics tools to uncover health and disease related patterns.


The realization of the potential of big data in health evokes a necessary commitment to a sense of “continuity” articulated in three distinct ways: a) from data generation to use (as in the data enabled learning health care); b) from research to clinical practice e.g. discovery of new mutations in the context of diagnostics; c) from strictly speaking health data (Vayena and Gasser 2016) e.g. clinical records, to less so e.g. tweets used in digital epidemiology. These continuities face the challenge of regulatory and governance approaches that were designed for clear data taxonomies, for a less blurred boundary between research and clinical practice, and for rules that focused mostly on data generation and less on their eventual and multiple uses.

The result is significant uncertainty about how responsible use of such large amounts of sensitive personal data could be fostered. In this chapter we focus on the uncertainties surrounding the use of biomedical big data in the context of health research. Are new criteria needed to review biomedical big data research projects? Do current mechanisms, such as informed consent, offer sufficient protection to research participants’ autonomy and privacy in this new context? Do existing oversight mechanisms ensure transparency and accountability in data access and sharing? What monitoring tools are available to assess how personal data are used over time? Is the equitable distribution of benefits accruing from such data uses considered, or can it be ensured? How is the public being involved – if at all – with decisions about creating and using large data repositories for research purposes? What is the role that IT (information technology) players, and especially big ones, acquire in research? And what regulatory instruments do we have to ensure that such players do not undermine the independence of research?…(More)”.

Responsible data sharing in a big data-driven translational research platform: lessons learned


Paper by S. Kalkman et al: “The sharing of clinical research data is increasingly viewed as a moral duty [1]. Particularly in the context of making clinical trial data widely available, editors of international medical journals have labeled data sharing a highly efficient way to advance scientific knowledge [2,3,4]. The combination of even larger datasets into so-called “Big Data” is considered to offer even greater benefits for science, medicine and society [5]. Several international consortia have now promised to build grand-scale, Big Data-driven translational research platforms to generate better scientific evidence regarding disease etiology, diagnosis, treatment and prognosis across various disease areas [6,7,8].

Despite anticipated benefits, large-scale sharing of health data is charged with ethical questions. Stakeholders have been urged to consider how to manage privacy and confidentiality issues, ensure valid informed consent, and determine who gets to decide about data access [9]. More fundamentally, new data sharing activities prompt questions about social justice and public trust [10]. To balance potential benefits and ethical considerations, data sharing platforms require guidance for the processes of interaction and decision-making. In the European Union (EU), legal norms specified for the sharing of personal data for health research, most notably those set out in the General Data Protection Regulation (GDPR) (EU 2016/679), remain open to interpretation and offer limited practical guidance to researchers [12,12,13]. Striking in this regard is that the GDPR itself stresses the importance of adherence to ethical standards, when broad consent is put forward as a legal basis for the processing of personal data. For example, Recital 33 of the GDPR states that data subjects should be allowed to give “consent to certain areas of scientific research when in keeping with recognised ethical standards for scientific research” [14]. In fact, the GDPR actually encourages data controllers to establish self-regulating mechanisms, such as a code of conduct. To foster responsible and sustainable data sharing in translational research platforms, ethical guidance and governance is therefore necessary. Here, we define governance as ‘the processes of interaction and decision-making among the different stakeholders that are involved in a collective problem that lead to the creation, reinforcement, or reproduction of social norms and institutions’…(More)”.

Biased Algorithms Are Easier to Fix Than Biased People


Sendhil Mullainathan in The New York Times: “In one study published 15 years ago, two people applied for a job. Their résumés were about as similar as two résumés can be. One person was named Jamal, the other Brendan.

In a study published this year, two patients sought medical care. Both were grappling with diabetes and high blood pressure. One patient was black, the other was white.

Both studies documented racial injustice: In the first, the applicant with a black-sounding name got fewer job interviews. In the second, the black patient received worse care.

But they differed in one crucial respect. In the first, hiring managers made biased decisions. In the second, the culprit was a computer program.

As a co-author of both studies, I see them as a lesson in contrasts. Side by side, they show the stark differences between two types of bias: human and algorithmic.

Marianne Bertrand, an economist at the University of Chicago, and I conducted the first study: We responded to actual job listings with fictitious résumés, half of which were randomly assigned a distinctively black name.

The study was: “Are Emily and Greg more employable than Lakisha and Jamal?”

The answer: Yes, and by a lot. Simply having a white name increased callbacks for job interviews by 50 percent.

I published the other study in the journal “Science” in late October with my co-authors: Ziad Obermeyer, a professor of health policy at University of California at Berkeley; Brian Powers, a clinical fellow at Brigham and Women’s Hospital; and Christine Vogeli, a professor of medicine at Harvard Medical School. We focused on an algorithm that is widely used in allocating health care services, and has affected roughly a hundred million people in the United States.

To better target care and provide help, health care systems are turning to voluminous data and elaborately constructed algorithms to identify the sickest patients.

We found these algorithms have a built-in racial bias. At similar levels of sickness, black patients were deemed to be at lower risk than white patients. The magnitude of the distortion was immense: Eliminating the algorithmic bias would more than double the number of black patients who would receive extra help. The problem lay in a subtle engineering choice: to measure “sickness,” they used the most readily available data, health care expenditures. But because society spends less on black patients than equally sick white ones, the algorithm understated the black patients’ true needs.

One difference between these studies is the work needed to uncover bias…(More)”.

Accelerating Medicines Partnership (AMP): Improving Drug Research Efficiency through Biomarker Data Sharing


Data Collaborative Case Study by Michelle Winowatan, Andrew Young, and Stefaan Verhulst: “Accelerating Medicines Partnership (AMP) is a cross-sector data-sharing partnership in the United States between the National Institutes of Health (NIH), the Food and Drug Administration (FDA), multiple biopharmaceutical and life science companies, as well as non-profit organizations that seeks to improve the efficiency of developing new diagnostics and treatments for several types of disease. To achieve this goal, the partnership created a pre-competitive collaborative ecosystem where the biomedical community can pool data and resources that are relevant to the prioritized disease areas. A key component of the partnership is to make biomarkers data available to the medical research community through online portals.

Data Collaboratives Model: Based on our typology of data collaborative models, AMP is an example of the data pooling model of data collaboration, specifically a public data pool. Public data pools co-mingle data assets from multiple data holders — in this case pharmaceutical companies — and make those shared assets available on the web. Pools often limit contributions to approved partners (as public data pools are not crowdsourcing efforts), but access to the shared assets is open, enabling independent re-uses.

Data Stewardship Approach: Data stewardship is built into the partnership through the establishment of an executive committee, which governs the entire partnership, and a steering committee for each disease area, which governs each of the sub-projects within AMP. These committees consist of representatives from the institutional partners involved in AMP and perform data steward functions, including enabling inter-institutional engagement as well as intra-institutional coordination, data audit and assessment of value and risk, communication of findings, and nurturing the collaboration to sustainability….(Full Case Study)”.

The Role of Crowdsourcing in the Healthcare Industry


Chapter by Kabir C. Sen: “The twenty-first century has seen the advent of technical advances in storage, transmission and analysis of information. This has had a profound impact on the field of medicine. However, notwithstanding these advances, various obstacles remain in the world regarding the improvement of human lives through the provision of better health care. The obstacles emanate from the demand (i.e., the problem) as well as the supply (i.e., the solution) side. In some cases, the nature of the problems might not have been correctly identified. In others, a solution to a problem could be known only to a small niche of the global population. Thus, from the demand perspective, the variety of health care issues can range from the quest for a cure for a rare illness to the inability to successfully implement verifiable preventive measures for a disease that affects pockets of the global population. Alternatively, from the supply perspective, the approach to a host of health issues might vary because of fundamental differences in both medical philosophies and organizational policies.

In many instances, effective solutions to health care problems are lacking because of inadequate global knowledge about the particular disease. Alternatively, in other cases, a solution might exist but the relevant knowledge about it might only be available to selected pockets of the global medical community. Sometimes, the barriers to the transfer of knowledge might have their root causes in ignorance or prejudice about the initiator of the cure or solution. However, the advent of information technology has now provided an opportunity for individuals located at different geographical locations to collaborate on solutions to various problems. These crowdsourcing projects now have the potential to extract the “wisdom of crowds” for tackling problems which previously could not be solved by a group of experts (Surowiecki, 2014). Anecdotal evidence suggests that crowdsourcing has achieved some success in providing solutions for a rare medical disease (Arnold, 2014). This chapter discusses crowdsourcing’s potential to solve medical problems by designing a framework to evaluate its promises and suggest recommended future paths of actions….(More)”.