Open health data: Mapping the ecosystem


Paper by Roel Heijlen and Joep Crompvoets: “Governments around the world own multiple datasets related to the policy domain of health. Datasets range from vaccination rates to the availability of health care practitioners in a region to the outcomes of certain surgeries. Health is believed to be a promising subject in the case of open government data policies. However, the specific properties of health data such as its sensibilities regarding privacy, ethics, and ownership encompass particular conditions either enabling or preventing datasets to become freely and easily accessible for everyone…

This paper aims to map the ecosystem of open health data. By analyzing the foundations of health data and the commonalities of open data ecosystems via literature analysis, the socio-technical environment in which health data managed by governments are opened up or potentially stay closed is created. After its theoretical development, the open health data ecosystem is tested via a case study concerning the Data for Better Health initiative from the government of Belgium…

The policy domain of health includes de-identification activities, bioethical assessments, and the specific role of data providers within its open data ecosystem. However, the concept of open data does not always fully apply to the topic of health. For instance, several health datasets may be findable via government portals but not directly accessible. Differentiation within types of health data and data user capacities are recommendable for future research….(More)”

Under What Conditions Are Data Valuable for Development?


Paper by Dean Jolliffe et al: “Data produced by the public sector can have transformational impacts on development outcomes through better targeting of resources, improved service delivery, cost savings in policy implementation, increased accountability, and more. Around the world, the amount of data produced by the public sector is increasing at a rapid pace, yet their transformational impacts have not been realized fully. Why has the full value of these data not been realized yet? This paper outlines 12 conditions needed for the production and use of public sector data to generate value for development and presents case studies substantiating these conditions. The conditions are that data need to have adequate spatial and temporal coverage (are complete, frequent, and timely), are of high quality (are accurate, comparable, and granular), are easy to use (are accessible, understandable, and interoperable), and are safe to use (are impartial, confidential, and appropriate)…(More)”.

A Proposal for Researcher Access to Platform Data: The Platform Transparency and Accountability Act


Paper by Nathaniel Persily: “We should not need to wait for whistleblowers to blow their whistles, however, before we can understand what is actually happening on these extremely powerful digital platforms. Congress needs to act immediately to ensure that a steady stream of rigorous research reaches the public on the most pressing issues concerning digital technology. No one trusts the representations made by the platforms themselves, though, given their conflict of interest and understandable caution in releasing information that might spook shareholders. We need to develop an unprecedented system of corporate datasharing, mandated by government for independent research in the public interest.

This is easier said than done. Not only do the details matter, they are the only thing that matters. It is all well and good to call for “transparency” or “datasharing,” as an uncountable number of academics have, but the way government might set up this unprecedented regime will determine whether it can serve the grandiose purposes tech critics hope it will….(More)”.

Giant, free index to world’s research papers released online


Holly Else at Nature: “In a project that could unlock the world’s research papers for easier computerized analysis, an American technologist has released online a gigantic index of the words and short phrases contained in more than 100 million journal articles — including many paywalled papers.

The catalogue, which was released on 7 October and is free to use, holds tables of more than 355 billion words and sentence fragments listed next to the articles in which they appear. It is an effort to help scientists use software to glean insights from published work even if they have no legal access to the underlying papers, says its creator, Carl Malamud. He released the files under the auspices of Public Resource, a non-profit corporation in Sebastopol, California, that he founded.

Malamud says that because his index doesn’t contain the full text of articles, but only sentence snippets up to five words long, releasing it does not breach publishers’ copyright restrictions on the reuse of paywalled articles. However, one legal expert says that publishers might question the legality of how Malamud created the index in the first place.

Some researchers who have had early access to the index say it’s a major development in helping them to search the literature with software — a procedure known as text mining. Gitanjali Yadav, a computational biologist at the University of Cambridge, UK, who studies volatile organic compounds emitted by plants, says she aims to comb through Malamud’s index to produce analyses of the plant chemicals described in the world’s research papers. “There is no way for me — or anyone else — to experimentally analyse or measure the chemical fingerprint of each and every plant species on Earth. Much of the information we seek already exists, in published literature,” she says. But researchers are restricted by lack of access to many papers, Yadav adds….(More)”.
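
To make the idea of such an index concrete, here is a minimal sketch in Python of a table that lists short word sequences (up to five words) next to the articles in which they appear. It is an illustration only and not a description of how Malamud built his index: the function names, the whitespace tokenization, and the two-article toy corpus are all assumptions made for this example.

```python
from collections import defaultdict


def ngrams(text, max_n=5):
    """Yield every run of 1..max_n consecutive words in text (lowercased)."""
    words = text.lower().split()  # naive whitespace tokenization (assumption)
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            yield " ".join(words[i:i + n])


def build_index(articles, max_n=5):
    """Map each n-gram to the set of article IDs in which it appears."""
    index = defaultdict(set)
    for article_id, text in articles.items():
        for gram in ngrams(text, max_n):
            index[gram].add(article_id)
    return index


# Hypothetical toy corpus; the IDs and text are made up for illustration.
articles = {
    "doi:example/1": "plants emit volatile organic compounds",
    "doi:example/2": "volatile organic compounds affect air quality",
}

index = build_index(articles)

# Look up which articles mention a phrase without storing their full text.
matches = sorted(index["volatile organic compounds"])
print(matches)
```

Because only fragments of at most five words are stored, a longer passage cannot be looked up directly, although a researcher can still count and cross-reference terms across articles.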

Has COVID-19 been the making of Open Science?


Article by Lonni Besançon, Corentin Segalas and Clémence Leyrat: “Although many concepts fall under the umbrella of Open Science, some of its key concepts are: Open Access, Open Data, Open Source, and Open Peer Review. How far these four principles were embraced by researchers during the pandemic and where there is room for improvement, is what we, as early career researchers, set out to assess by looking at data on scientific articles published during the Covid-19 pandemic….Open Source and Open Data practices consist in making all the data and materials used to gather or analyse data available on relevant repositories. While we can find incredibly useful datasets shared publicly on COVID-19 (for instance those provided by the European Centre for Disease Control), they remain the exception rather than the norm. A spectacular example of this were the papers utilising data from the company Surgisphere, that led to retracted papers in The Lancet and The New England Journal of Medicine. In our paper, we highlight 4 papers that could have been retracted much earlier (and perhaps would never have been accepted) had the data been made accessible from the time of publication. As we argue in our paper, this presents a clear case for making open data and open source the default, with exceptions for privacy and safety. While some journals already have such policies, we go further in asking that, when data cannot be shared publicly, editors/publishers and authors/institutions should agree on a third party to check the existence and reliability/validity of the data and the results presented. This not only would strengthen the review process, but also enhance the reproducibility of research and further accelerate the production of new knowledge through data and code sharing…(More)”.

Can digital technologies improve health?


The Lancet: “If you have followed the news on digital technology and health in recent months, you will have read of a blockbuster fraud trial centred on a dubious blood-testing device, a controversial partnership between a telehealth company and a data analytics company, a social media company promising action to curb the spread of vaccine misinformation, and another addressing its role in the deteriorating mental health of young women. For proponents and critics alike, these stories encapsulate the health impact of many digital technologies, and the uncertain and often unsubstantiated position of digital technologies for health. The Lancet and Financial Times Commission on governing health futures 2030: growing up in a digital world, brings together diverse, independent experts to ask if this narrative can still be turned around? Can digital technologies deliver health benefits for all?

Digital technologies could improve health in many ways. For example, electronic health records can support clinical trials and provide large-scale observational data. These approaches have underpinned several high-profile research findings during the COVID-19 pandemic. Sequencing and genomics have been used to understand SARS-CoV-2 transmission and evolution. There is vast promise in digital technology, but the Commission argues that, overall, digital transformations will not deliver health benefits for all without fundamental and revolutionary realignment.

Globally, digital transformations are well underway and have had both direct and indirect health consequences. Direct effects can occur through, for example, the promotion of health information or propagating misinformation. Indirect ones can happen via effects on other determinants of health, including social, economic, commercial, and environmental factors, such as influencing people’s exposure to marketing or political messaging. Children and adolescents growing up in this digital world experience the extremes of digital access. Young people who spend large parts of their lives online may be protected or vulnerable to online harm. But many individuals remain digitally excluded, affecting their access to education and health information. Digital access, and the quality of that access, must be recognised as a key determinant of health. The Commission calls for connectivity to be recognised as a public good and human right.

Describing the accumulation of data and power by dominant actors, many of which are commercial, the Commissioners criticise business models based on the extraction of personal data, and those that benefit from the viral spread of misinformation. To redirect digital technologies to advance universal health coverage, the Commission invokes the guiding principles of democracy, equity, solidarity, inclusion, and human rights. Governments must protect individuals from emerging threats to their health, including bias, discrimination, and online harm to children. The Commission also calls for accountability and transparency in digital transformations, and for the governance of misinformation in health care—basic principles, but ones that have been overridden in a quest for freedom of expression and by the fear that innovation could be sidelined. Public participation and codesign of digital technologies, particularly including young people and those from affected communities, are fundamental.

The Commission also advocates for data solidarity, a radical new approach to health data in which both personal and collective interests and responsibilities are balanced. Rather than data being regarded as something to be owned or hoarded, it emphasises the social and relational nature of health data. Countries should develop data trusts that unlock potential health benefits in public data, while also safeguarding it.

Digital transformations cannot be reversed. But they must be rethought and changed. At its heart, this Commission is both an exposition of the health harms of digital technologies as they function now, and an optimistic vision of the potential alternatives. Calling for investigation and expansion of digital health technologies is not misplaced techno-optimism, but a serious opportunity to drive much needed change. Without new approaches, the world will not achieve the 2030 Sustainable Development Goals.

However, no amount of technical innovation or research will bring equitable health benefits from digital technologies without a fundamental redistribution of power and agency, achievable only through appropriate governance. There is a desperate need to reclaim digital technologies for the good of societies. Our future health depends on it….(More)”.

Open data in digital strategies against COVID-19: the case of Belgium


Paper by Robert Viseur: “COVID-19 has highlighted the importance of digital in the fight against the pandemic (control at the border, automated tracing, creation of databases…). In this research, we analyze the Belgian response in terms of open data. First, we examine the open data publication strategy in Belgium (a federal state with a sometimes complex functioning, especially in health), second, we conduct a case study (anatomy of the pandemic in Belgium) in order to better understand the strengths and weaknesses of the main COVID-19 open data repository. And third, we analyze the obstacles to open data publication. Finally, we discuss the Belgian COVID-19 open data strategy in terms of data availability, data relevance and knowledge management. In particular, we show how difficult it is to optimize the latter in order to make the best use of governmental, private and academic open data in a way that has a positive impact on public health policy….(More)”.

Data Science for Social Good: Philanthropy and Social Impact in a Complex World


Book edited by Ciro Cattuto and Massimo Lapucci: “This book is a collection of insights by thought leaders at first-mover organizations in the emerging field of “Data Science for Social Good”. It examines the application of knowledge from computer science, complex systems, and computational social science to challenges such as humanitarian response, public health, and sustainable development. The book provides an overview of scientific approaches to social impact – identifying a social need, targeting an intervention, measuring impact – and the complementary perspective of funders and philanthropies pushing forward this new sector.

TABLE OF CONTENTS


Introduction; By Massimo Lapucci

The Value of Data and Data Collaboratives for Good: A Roadmap for Philanthropies to Facilitate Systems Change Through Data; By Stefaan G. Verhulst

UN Global Pulse: A UN Innovation Initiative with a Multiplier Effect; By Dr. Paula Hidalgo-Sanchis

Building the Field of Data for Good; By Claudia Juech

When Philanthropy Meets Data Science: A Framework for Governance to Achieve Data-Driven Decision-Making for Public Good; By Nuria Oliver

Data for Good: Unlocking Privately-Held Data to the Benefit of the Many; By Alberto Alemanno

Building a Funding Data Ecosystem: Grantmaking in the UK; By Rachel Rank

A Reflection on the Role of Data for Health: COVID-19 and Beyond; By Stefan E. Germann and Ursula Jasper….(More)”

Data Stewardship Re-Imagined — Capacities and Competencies


Blog and presentation by Stefaan Verhulst: “In ways both large and small, COVID-19 has forced us to re-examine every aspect of our political, social, and economic systems. Among the many lessons policymakers have learned is that existing methods for using data are often insufficient for our most pressing challenges. In particular, we need to find new, innovative ways of tapping into the potential of privately held and siloed datasets that nonetheless contain tremendous public good potential, including complementing and extending official statistics. Data collaboratives are an emerging set of methods for accessing and reusing data that offer tremendous opportunities in this regard. In the last five years, we have studied and initiated numerous data collaboratives, in the process assembling a collection of over 200 example case studies to better understand their possibilities.

Among our key findings is the vital importance and essential role that needs to be played by Data Stewards.

Data stewards do not represent an entirely new profession; rather, their role could be understood as an extension and re-definition of existing organizational positions that manage and interact with data. Traditionally, the role of a data officer was limited either to data integrity or the narrow context of internal data governance and management, with a strong emphasis on technical competencies. This narrow conception is no longer sufficient, especially given the proliferation of data and the increasing potential of data sharing and collaboration. As such, we call for a re-imagination of data stewardship to encompass a wider range of functions and responsibilities, directed at leveraging data assets toward addressing societal challenges and improving people’s lives.

DATA STEWARDSHIP: functions and competencies to enable access to and re-use of data for public benefit in a systematic, sustainable, and responsible way.

In our vision, data stewards are professionals empowered to create public value (including official statistics) by re-using data and data expertise, identifying opportunities for productive cross-sectoral collaboration, and proactively requesting or enabling functional access to data, insights, and expertise. Data stewards are active in both the public and private sectors, promoting trust within and outside their organizations. They are essential to data collaboratives by providing functional access to unlock the potential of siloed data sets. In short, data stewards form a new — and essential — link in the data value chain….(More)”.

Licensure as Data Governance


Essay by Frank Pasquale: “…A licensure regime for data and the AI it powers would enable citizens to democratically shape data’s scope and proper use, rather than resigning ourselves to being increasingly influenced and shaped by forces beyond our control. To ground the case for more ex ante regulation, Part I describes the expanding scope of data collection, analysis, and use, and the threats that that scope poses to data subjects. Part II critiques consent-based models of data protection, while Part III examines the substantive foundation of licensure models. Part IV addresses a key challenge to my approach: the free expression concerns raised by the licensure of large-scale personal data collection, analysis, and use. Part V concludes with reflections on the opportunities created by data licensure frameworks and potential limitations upon them….(More)”.