Collaboration in Healthcare: Implications of Data Sharing for Secondary Use in the European Union


Paper by Fanni Kertesz: “The European healthcare sector is transforming toward patient-centred and value-based healthcare delivery. The European Health Data Space (EHDS) Regulation aims to unlock the potential of health data by establishing a single market for its primary and secondary use. This paper examines the legal challenges associated with the secondary use of health data within the EHDS and offers recommendations for improvement. Key issues include the compatibility between the EHDS and the General Data Protection Regulation (GDPR), barriers to cross-border data sharing, and intellectual property concerns. Resolving these challenges is essential for realising the full potential of health data and advancing healthcare research and innovation within the EU…(More)”.

On Slicks and Satellites: An Open Source Guide to Marine Oil Spill Detection


Article by Wim Zwijnenburg: “The sheer scale of ocean oil pollution is staggering. In Europe, a suspected 3,000 major illegal oil dumps take place annually, with an estimated release of between 15,000 and 60,000 tonnes of oil ending up in the North Sea. In the Mediterranean, figures provided by the Regional Marine Pollution Emergency Response Centre estimate there are 1,500 to 2,000 oil spills every year.

The impact of any single oil spill on a marine or coastal ecosystem can be devastating and long-lasting. Animals such as birds, turtles, dolphins and otters can suffer from ingesting or inhaling oil, as well as getting stuck in the slick. The loss of water and soil quality can be toxic to both flora and fauna. Heavy metals enter the food chain, poisoning everything from plankton to shellfish, which in turn affects the livelihoods of coastal communities dependent on fishing and tourism.

However, with a wealth of open source earth observation tools at our fingertips, during such environmental disasters it’s possible for us to identify and monitor these spills, highlight at-risk areas, and even hold perpetrators accountable. …

There are several different types of remote sensing sensors we can use for collecting data about the Earth’s surface. In this article we’ll focus on two: optical and radar sensors. 

Optical imagery captures the broad light spectrum reflected from the Earth, also known as passive remote sensing. In contrast, Synthetic Aperture Radar (SAR) uses active remote sensing, sending radio waves down to the Earth’s surface and capturing them as they are reflected back. Any change in the reflection can indicate a change on ground, which can then be investigated. For more background, see Bellingcat contributor Ollie Ballinger’s Remote Sensing for OSINT Guide…(More)”.
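
As a rough illustration of the radar idea in the excerpt above: oil dampens the small surface waves that scatter radar energy back to the sensor, so slicks tend to show up as dark patches in SAR imagery. The sketch below flags unusually dark pixels in a small grid of backscatter values. The function names, the sample values and the -22 dB threshold are all hypothetical and not taken from the guide; real workflows calibrate the data and have to rule out look-alikes such as calm water.

```python
# Minimal sketch: flag low-backscatter ("dark") pixels in a SAR-like grid.
# All names, sample values and the threshold are illustrative assumptions.

def flag_dark_pixels(backscatter_db, threshold_db=-22.0):
    """Return a boolean mask marking pixels darker than threshold_db."""
    return [[value < threshold_db for value in row] for row in backscatter_db]


def dark_fraction(mask):
    """Return the share of pixels flagged as dark (0.0 for an empty mask)."""
    total = sum(len(row) for row in mask)
    dark = sum(cell for row in mask for cell in row)
    return dark / total if total else 0.0


# Hypothetical backscatter values in dB: open water around a darker patch.
scene = [
    [-12.0, -13.5, -12.8, -14.1],
    [-13.0, -25.2, -26.0, -13.7],
    [-12.5, -24.8, -25.5, -12.9],
]

mask = flag_dark_pixels(scene)
print(f"Share of candidate slick pixels: {dark_fraction(mask):.2f}")
```

A flagged patch is only a candidate for closer inspection, which matches the excerpt's point that a change in reflection "can then be investigated".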

Using internet search data as part of medical research


Blog by Susan Thomas and Matthew Thompson: “…In the UK, almost 50 million health-related searches are made using Google per year. Globally there are 100s of millions of health-related searches every day. And, of course, people are doing these searches in real-time, looking for answers to their concerns in the moment. It’s also possible that, even if people aren’t noticing and searching about changes to their health, their behaviour is changing. Maybe they are searching more at night because they are having difficulty sleeping or maybe they are spending more (or less) time online. Maybe an individual’s search history could actually be really useful for researchers. This realisation has led medical researchers to start to explore whether individuals’ online search activity could help provide those subtle, almost unnoticeable signals that point to the beginning of a serious illness.

Our recent review found 23 studies have been published so far that have done exactly this. These studies suggest that online search activity among people later diagnosed with a variety of conditions ranging from pancreatic cancer and stroke to mood disorders, was different to people who did not have one of these conditions.

One of these studies was published by researchers at Imperial College London, who used online search activity to identify signals of women with gynaecological malignancies. They found that women with malignant (e.g. ovarian cancer) and benign conditions had different search patterns, up to two months prior to a GP referral. 

Pause for a moment, and think about what this could mean. Ovarian cancer is one of the most devastating cancers women get. It’s desperately hard to detect early – and yet there are signals of this cancer visible in women’s internet searches months before diagnosis?…(More)”.

Data sovereignty for local governments. Considerations and enablers


Report by JRC: “Data sovereignty for local governments refers to a capacity to control and/or access data, and to foster a digital transformation aligned with societal values and EU Commission political priorities. Data sovereignty clauses are an instrument that local governments may use to compel companies to share data of public interest. Albeit promising, little is known about the peculiarities of this instrument and how it has been implemented so far. This policy brief aims at filling the gap by systematising existing knowledge and providing policy-relevant recommendations for its wider implementation…(More)”.

Community consent: neither a ceiling nor a floor


Article by Jasmine McNealy: “The 23andMe breach and the Golden State Killer case are two of the more “flashy” cases, but questions of consent, especially the consent of all of those affected by biodata collection and analysis in more mundane or routine health and medical research projects, are just as important. The communities of people affected have expectations about their privacy and the possible impacts of inferences that could be made about them in data processing systems. Researchers must, then, acquire community consent when attempting to work with networked biodata. 

Several benefits of community consent exist, especially for marginalized and vulnerable populations. These benefits include:

  • Ensuring that information about the research project spreads throughout the community,
  • Removing potential barriers that might be created by resistance from community members,
  • Alleviating the possible concerns of individuals about the perspectives of community leaders, and 
  • Allowing the recruitment of participants using methods most salient to the community.

But community consent does not replace individual consent and limits exist for both community and individual consent. Therefore, within the context of a biorepository, understanding whether community consent might be a ceiling or a floor requires examining governance and autonomy…(More)”.

The Data That Powers A.I. Is Disappearing Fast


Article by Kevin Roose: “For years, the people building powerful artificial intelligence systems have used enormous troves of text, images and videos pulled from the internet to train their models.

Now, that data is drying up.

Over the past year, many of the most important web sources used for training A.I. models have restricted the use of their data, according to a study published this week by the Data Provenance Initiative, an M.I.T.-led research group.

The study, which looked at 14,000 web domains that are included in three commonly used A.I. training data sets, discovered an “emerging crisis in consent,” as publishers and online platforms have taken steps to prevent their data from being harvested.

The researchers estimate that in the three data sets — called C4, RefinedWeb and Dolma — 5 percent of all data, and 25 percent of data from the highest-quality sources, has been restricted. Those restrictions are set up through the Robots Exclusion Protocol, a decades-old method for website owners to prevent automated bots from crawling their pages using a file called robots.txt.

The study also found that as much as 45 percent of the data in one set, C4, had been restricted by websites’ terms of service.

“We’re seeing a rapid decline in consent to use data across the web that will have ramifications not just for A.I. companies, but for researchers, academics and noncommercial entities,” said Shayne Longpre, the study’s lead author, in an interview.

Data is the main ingredient in today’s generative A.I. systems, which are fed billions of examples of text, images and videos. Much of that data is scraped from public websites by researchers and compiled in large data sets, which can be downloaded and freely used, or supplemented with data from other sources…(More)”.
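
To make the Robots Exclusion Protocol mentioned in the excerpt above more concrete, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The robots.txt content, the crawler name `ExampleAIBot` and the example.org URLs are invented for illustration and do not come from the study.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler is blocked from the whole site,
# every other crawler is blocked only from /private/.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


for agent in ("ExampleAIBot", "OtherBot"):
    allowed = can_crawl(ROBOTS_TXT, agent, "https://example.org/articles/1")
    print(f"{agent}: {'allowed' if allowed else 'restricted'}")
```

The file only states the site owner's wishes. Honouring it is up to each crawler, which is one reason the study also examined terms of service.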

Exploring Digital Biomarkers for Depression Using Mobile Technology


Paper by Yuezhou Zhang et al: “With the advent of ubiquitous sensors and mobile technologies, wearables and smartphones offer a cost-effective means for monitoring mental health conditions, particularly depression. These devices enable the continuous collection of behavioral data, providing novel insights into the daily manifestations of depressive symptoms.

We found several significant links between depression severity and various behavioral biomarkers: elevated depression levels were associated with diminished sleep quality (assessed through Fitbit metrics), reduced sociability (approximated by Bluetooth), decreased levels of physical activity (quantified by step counts and GPS data), a slower cadence of daily walking (captured by smartphone accelerometers), and disturbances in circadian rhythms (analyzed across various data streams).

Leveraging digital biomarkers for assessing and continuously monitoring depression introduces a new paradigm in early detection and development of customized intervention strategies. Findings from these studies not only enhance our comprehension of depression in real-world settings but also underscore the potential of mobile technologies in the prevention and management of mental health issues…(More)”.

Designing an Effective Governance Model for Data Collaboratives


Paper by Federico Bartolomucci & Francesco Leoni: “Data Collaboratives have gained traction as interorganizational partnerships centered on data exchange. They enhance the collective capacity of responding to contemporary societal challenges using data, while also providing participating organizations with innovation capabilities and reputational benefits. Unfortunately, data collaboratives often fail to advance beyond the pilot stage and are therefore limited in their capacity to deliver systemic change. The governance setting adopted by a data collaborative affects how it acts over the short and long term. We present a governance design model to develop context-dependent data collaboratives. Practitioners can use the proposed model and list of key reflective questions to evaluate the critical aspects of designing a governance model for their data collaboratives…(More)”.

The 4M Roadmap: A Higher Road to Profitability by Using Big Data for Social Good


Report by Brennan Lake: “As the private sector faces conflicting pressures to either embrace or shun socially responsible practices, companies with privately held big-data assets must decide whether to share access to their data for public good. While some managers object to data sharing over concerns of privacy and product cannibalization, others launch well intentioned yet short-lived CSR projects that fail to deliver on lofty goals.

By embedding Shared-Value principles into ‘Data-for-Good’ programs, data-rich firms can launch responsible data-sharing initiatives that minimize risk, deliver sustained impact, and improve overall competitiveness in the process.

The 4M Roadmap by Brennan Lake, a Big-Data and Social Impact professional, guides managers to adopt a ‘Data-for-Good’ model that emphasizes four key pillars of value-creation: Mission, Messaging, Methods, and Monetization. Through deep analysis and private-sector case studies, The 4M Roadmap demonstrates how companies can engage in responsible data sharing to benefit society and business alike…(More)”.

A New National Purpose: Harnessing Data for Health


Report by the Tony Blair Institute: “We are at a pivotal moment where the convergence of large health and biomedical data sets, artificial intelligence and advances in biotechnology is set to revolutionise health care, drive economic growth and improve the lives of citizens. And the UK has strengths in all three areas. The immense potential of the UK’s health-data assets, from the NHS to biobanks and genomics initiatives, can unlock new diagnostics and treatments, deliver better and more personalised care, prevent disease and ultimately help people live longer, healthier lives.

However, realising this potential is not without its challenges. The complex and fragmented nature of the current health-data landscape, coupled with legitimate concerns around privacy and public trust, has made for slow progress. The UK has had a tendency to provide short-term funding across multiple initiatives, which has led to an array of individual projects – many of which have struggled to achieve long-term sustainability and deliver tangible benefits to patients.

To overcome these challenges, it will be necessary to be bold and imaginative. We must look for ways to leverage the unique strengths of the NHS, such as its nationwide reach and cradle-to-grave data coverage, to create a health-data ecosystem that is much more than the sum of its many parts. This will require us to think differently about how we collect, manage and utilise health data, and to create new partnerships and models of collaboration that break down traditional silos and barriers. It will mean treating data as a key health resource and managing it accordingly.

One model to do this is the proposed sovereign National Data Trust (NDT) – an endeavour to streamline access to and curation of the UK’s valuable health-data assets…(More)”.