Trove of unique health data sets could help AI predict medical conditions earlier


Madhumita Murgia at the Financial Times: “…Ziad Obermeyer, a physician and machine learning scientist at the University of California, Berkeley, launched Nightingale Open Science last month — a treasure trove of unique medical data sets, each curated around an unsolved medical mystery that artificial intelligence could help to solve.

The data sets, released after the project received $2m of funding from former Google chief executive Eric Schmidt, could help to train computer algorithms to predict medical conditions earlier, triage better and save lives.

The data include 40 terabytes of medical imagery, such as X-rays, electrocardiogram waveforms and pathology specimens, from patients with a range of conditions, including high-risk breast cancer, sudden cardiac arrest, fractures and Covid-19. Each image is labelled with the patient’s medical outcomes, such as the stage of breast cancer and whether it resulted in death, or whether a Covid patient needed a ventilator.

Obermeyer has made the data sets free to use and mainly worked with hospitals in the US and Taiwan to build them over two years. He plans to expand this to Kenya and Lebanon in the coming months to reflect as much medical diversity as possible.

“Nothing exists like it,” said Obermeyer, who announced the new project in December alongside colleagues at NeurIPS, the global academic conference for artificial intelligence. “What sets this apart from anything available online is the data sets are labelled with the ‘ground truth’, which means with what really happened to a patient and not just a doctor’s opinion.”…

The Nightingale data sets were among dozens proposed this year at NeurIPS.

Other projects included a speech data set of Mandarin and eight subdialects recorded by 27,000 speakers in 34 cities in China; the largest audio data set of Covid respiratory sounds, such as breathing, coughing and voice recordings, from more than 36,000 participants to help screen for the disease; and a data set of satellite images covering the entire country of South Africa from 2006 to 2017, divided and labelled by neighbourhood, to study the social effects of spatial apartheid.

Elaine Nsoesie, a computational epidemiologist at the Boston University School of Public Health, said new types of data could also help with studying the spread of diseases in diverse locations, as people from different cultures react differently to illnesses.

She said her grandmother in Cameroon, for example, might think differently than Americans do about health. “If someone had an influenza-like illness in Cameroon, they may be looking for traditional, herbal treatments or home remedies, compared to drugs or different home remedies in the US.”

Computer scientists Serena Yeung and Joaquin Vanschoren, who proposed that research to build new data sets should be exchanged at NeurIPS, pointed out that the vast majority of the AI community still cannot find good data sets to evaluate their algorithms. This meant that AI researchers were still turning to data that were potentially “plagued with bias”, they said. “There are no good models without good data.”…(More)”.

Deliberate Ignorance: Choosing Not to Know


Book edited by Ralph Hertwig and Christoph Engel: “The history of intellectual thought abounds with claims that knowledge is valued and sought, yet individuals and groups often choose not to know. We call the conscious choice not to seek or use knowledge (or information) deliberate ignorance. When is this a virtue, when is it a vice, and what can be learned from formally modeling the underlying motives? On which normative grounds can it be judged? Which institutional interventions can promote or prevent it? In this book, psychologists, economists, historians, computer scientists, sociologists, philosophers, and legal scholars explore the scope of deliberate ignorance.

Drawing from multiple examples (including the right not to know in genetic testing, collective amnesia in transformational societies, blind orchestral auditions, and “don’t ask don’t tell” policies), the contributors offer novel insights and outline avenues for future research into this elusive yet fascinating aspect of human nature…(More)”.

Data trust and data privacy in the COVID-19 period


Paper by Nicholas Biddle et al: “In this article, we focus on data trust and data privacy, and how attitudes may be changing during the COVID-19 period. On balance, it appears that Australians are more trusting of organizations with regards to data privacy and less concerned about their own personal information and data than they were prior to the spread of COVID-19. The major determinant of this change in trust with regards to data was changes in general confidence in government institutions. Despite this improvement in trust with regards to data privacy, trust levels are still low….(More)”.

Nudges: Four reasons to doubt popular technique to shape people’s behavior


Article by Magda Osman: “Throughout the pandemic, many governments have had to rely on people doing the right thing to reduce the spread of the coronavirus – ranging from social distancing to handwashing. Many enlisted the help of psychologists for advice on how to “nudge” the public to do what was deemed appropriate.

Nudges have been around since the 1940s and originally were referred to as behavioural engineering. They are a set of techniques developed by psychologists to promote “better” behaviour through “soft” interventions rather than “hard” ones (mandates, bans, fines). In other words, people aren’t punished if they fail to follow them. The nudges are based on psychological and behavioural economic research into human behaviour and cognition.

The nudges can involve subtle as well as obvious methods. Authorities may set a “better” choice, such as donating your organs, as a default – so people have to opt out of a register rather than opt in. Or they could make a healthy option more attractive through food labelling.

But, despite the soft approach, many people aren’t keen on being nudged. During the pandemic, for example, scientists examined people’s attitudes to nudging in social and news media in the UK, and discovered that half of the sentiments expressed in social media posts were negative…(More)”.

A data-based participatory approach for health equity and digital inclusion: prioritizing stakeholders


Paper by Aristea Fotopoulou, Harriet Barratt, and Elodie Marandet: “This article starts from the premise that projects informed by data science can address social concerns, beyond prioritizing the design of efficient products or services. How can we bring the stakeholders and their situated realities back into the picture? It is argued that data-based, participatory interventions can improve health equity and digital inclusion while avoiding the pitfalls of top-down, technocratic methods. A participatory framework puts users, patients and citizens as stakeholders at the centre of the process, and can offer complex, sustainable benefits, which go beyond simply the experience of participation or the development of an innovative design solution. A significant benefit for example is the development of skills, which should not be seen as a by-product of the participatory processes, but a central element of empowering marginalized or excluded communities to participate in public life. By drawing from different examples in various domains, the article discusses what can be learnt from implementations of schemes using data science for social good, human-centric design, arts and wellbeing, to argue for a data-centric, creative and participatory approach to address health equity and digital inclusion in tandem…(More)”.

Quarantined Data? The impact, scope & challenges of open data during COVID


Chapter by Álvaro V. Ramírez-Alujas: “How do rates of COVID19 infection increase? How do populations respond to lockdown measures? How is the pandemic affecting the economic and social activity of communities beyond health? What can we do to mitigate risks and support families in this context? The answer to these and other key questions is part of the intense global public debate on the management of the health crisis and how appropriate public policy measures have been taken in order to combat the impact and effects of COVID19 around the world. The common ground to all of them? The availability and use of public data and information. This chapter reflects on the relevance of public information and the availability, processing and use of open data as the primary hub and key ingredient in the responsiveness of governments and public institutions to the COVID19 pandemic and its multiple impacts on society. Discussions are underway concerning the scope, paradoxes, lessons learned, and visible challenges with respect to the available evidence and comparative analysis of government strategies in the region, incorporating the urgent need to shift towards a more robust, sustainable data infrastructure anchored in a logic of strengthening the ecosystem of actors (public and private sectors, civil society and the scientific community) to shape a framework of governance, and a strong, emerging institutional architecture based on data management for sustainable development on a human scale…(More)”.

Biases in human mobility data impact epidemic modeling


Paper by Frank Schlosser, Vedran Sekara, Dirk Brockmann, and Manuel Garcia-Herranz: “Large-scale human mobility data is a key resource in data-driven policy making and across many scientific fields. Most recently, mobility data was extensively used during the COVID-19 pandemic to study the effects of governmental policies and to inform epidemic models. Large-scale mobility is often measured using digital tools such as mobile phones. However, it remains an open question how truthfully these digital proxies represent the actual travel behavior of the general population. Here, we examine mobility datasets from multiple countries and identify two fundamentally different types of bias caused by unequal access to, and unequal usage of mobile phones. We introduce the concept of data generation bias, a previously overlooked type of bias, which is present when the amount of data that an individual produces influences their representation in the dataset. We find evidence for data generation bias in all examined datasets in that high-wealth individuals are overrepresented, with the richest 20% contributing over 50% of all recorded trips, substantially skewing the datasets. This inequality is consequential, as we find mobility patterns of different wealth groups to be structurally different, where the mobility networks of high-wealth users are denser and contain more long-range connections. To mitigate the skew, we present a framework to debias data and show how simple techniques can be used to increase representativeness. Using our approach we show how biases can severely impact outcomes of dynamic processes such as epidemic simulations, where biased data incorrectly estimates the severity and speed of disease transmission. Overall, we show that a failure to account for biases can have detrimental effects on the results of studies and urge researchers and practitioners to account for data-fairness in all future studies of human mobility…(More)”.
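To make the idea of data generation bias concrete, here is a minimal sketch on a hypothetical toy data set. It is not the authors' debiasing framework; it only illustrates one simple technique, weighting each trip by the inverse of its user's trip count so that every user contributes equally. The user IDs, wealth quintile labels, trip counts, and function names are all assumptions made for illustration. Note that this kind of reweighting can only address unequal usage; it cannot correct for people who are absent from the data because they have no phone.

```python
from collections import defaultdict


def trip_share_by_group(trips, weights=None):
    """Return each group's share of total (optionally weighted) trips.

    trips:   list of (user_id, group) tuples, one entry per recorded trip.
    weights: optional dict mapping user_id to the weight of each of that
             user's trips. If omitted, every trip counts as 1.
    """
    totals = defaultdict(float)
    for user_id, group in trips:
        totals[group] += 1.0 if weights is None else weights[user_id]
    grand_total = sum(totals.values())
    return {group: total / grand_total for group, total in totals.items()}


def per_user_weights(trips):
    """Weight each trip by 1 / (number of trips recorded for that user)."""
    counts = defaultdict(int)
    for user_id, _ in trips:
        counts[user_id] += 1
    return {user_id: 1.0 / n for user_id, n in counts.items()}


# Hypothetical toy data: one user per wealth quintile (Q1 poorest, Q5 richest),
# with wealthier users generating more recorded trips.
trips_per_user = {
    "u1": ("Q1", 1),
    "u2": ("Q2", 1),
    "u3": ("Q3", 2),
    "u4": ("Q4", 2),
    "u5": ("Q5", 6),
}
trips = [(user, group)
         for user, (group, n) in trips_per_user.items()
         for _ in range(n)]

raw = trip_share_by_group(trips)
debiased = trip_share_by_group(trips, per_user_weights(trips))

print("raw share of Q5:", round(raw["Q5"], 3))
print("reweighted share of Q5:", round(debiased["Q5"], 3))
```

In this toy example the richest fifth of users produces half of the raw trips, but after per-user reweighting its share matches its share of the population.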

NativeDATA


About: “NativeDATA is a free online resource that offers practical guidance for Tribes and Native-serving organizations. For this resource, Native-serving organizations includes Tribal and urban Indian organizations and Tribal Epidemiology Centers (TECs). 

Tribal and urban Indian communities need correct health information (data), so that community leaders can:

  • Watch disease trends
  • Respond to health threats
  • Create useful health policies…

Throughout, this resource offers practical guidance for obtaining and sharing health data in ways that honor Tribal sovereignty, data sovereignty, and public health authority. Public health authority is the authority of a sovereign government to protect the health, safety, and welfare of its citizens. As sovereign nations, Tribes have the power to define how they will use this authority to protect and promote the health of their communities. The federal government recognizes Tribes and Tribal Epidemiology Centers (TECs) as public health authorities under federal law.

Inside you will find expert advice to help you:…(More)”.

Pandemic Privacy


A Preliminary Analysis of Collection Technologies, Data Collection Laws, and Legislative Reform during COVID-19 by Benjamin Ballard, Amanda Cutinha, and Christopher Parsons: “…a preliminary comparative analysis of how different information technologies were mobilized in response to COVID-19 to collect data, the extent to which Canadian health or privacy or emergencies laws impeded the response to COVID-19, and ultimately, the potential consequences of reforming data protection or privacy laws to enable more expansive data collection, use, or disclosure of personal information in future health emergencies. In analyzing how data has been collected in the United States, United Kingdom, and Canada, we found that while many of the data collection methods could be mapped onto a trajectory of past collection practices, the breadth and extent of data collection in tandem with how communications networks were repurposed constituted novel technological responses to a health crisis. Similarly, while the intersection of public and private interests in providing healthcare and government services is not new, the ability for private companies such as Google and Apple to forcefully shape some of the technology-enabled pandemic responses speaks to the significant ability of private companies to guide or direct public health measures that rely on contemporary smartphone technologies. While we found that the uses of technologies were linked to historical efforts to combat the spread of disease, the nature and extent of private surveillance to enable public action was arguably unprecedented….(More)”.

Data protection in the context of covid-19. A short (hi)story of tracing applications


Book edited by Elise Poillot, Gabriele Lenzini, Giorgio Resta, and Vincenzo Zeno-Zencovich: “The volume presents the results of a research project (named “Legafight”) funded by the Luxembourg Fonds National de la Recherche in order to verify if and how digital tracing applications could be implemented in the Grand-Duchy in order to counter and abate the Covid-19 pandemic. This inevitably led to a deep comparative overview of the various existing models, starting from that of the European Union and those put into practice by Belgium, France, Germany, and Italy, with attention also to some Anglo-Saxon approaches (the UK and Australia). Not surprisingly, the main issue which had to be tackled was that of the protection of the personal data collected through the tracing applications, their use by public health authorities and the trust laid in tracing procedures by citizens. Over the last 18 months tracing apps have registered a rise, a fall, and a sudden rebirth as mediums devoted not so much to collecting data, but rather to distributing real-time information which should allow informed decisions and be used as repositories of health certifications…(More)”.