Giant, free index to world’s research papers released online


Holly Else at Nature: “In a project that could unlock the world’s research papers for easier computerized analysis, an American technologist has released online a gigantic index of the words and short phrases contained in more than 100 million journal articles — including many paywalled papers.

The catalogue, which was released on 7 October and is free to use, holds tables of more than 355 billion words and sentence fragments listed next to the articles in which they appear. It is an effort to help scientists use software to glean insights from published work even if they have no legal access to the underlying papers, says its creator, Carl Malamud. He released the files under the auspices of Public Resource, a non-profit corporation in Sebastopol, California, that he founded.

Malamud says that because his index doesn’t contain the full text of articles, but only sentence snippets up to five words long, releasing it does not breach publishers’ copyright restrictions on the reuse of paywalled articles. However, one legal expert says that publishers might question the legality of how Malamud created the index in the first place.

Some researchers who have had early access to the index say it’s a major development in helping them to search the literature with software — a procedure known as text mining. Gitanjali Yadav, a computational biologist at the University of Cambridge, UK, who studies volatile organic compounds emitted by plants, says she aims to comb through Malamud’s index to produce analyses of the plant chemicals described in the world’s research papers. “There is no way for me — or anyone else — to experimentally analyse or measure the chemical fingerprint of each and every plant species on Earth. Much of the information we seek already exists, in published literature,” she says. But researchers are restricted by lack of access to many papers, Yadav adds….(More)”.
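
For readers wondering what such an index might look like in practice, here is a minimal sketch. It assumes the simplest possible design: each phrase of up to five words maps to the set of articles containing it. The excerpt does not describe the file format or build process of Malamud's actual index, so the function names, the toy corpus, and the whitespace tokenization are all illustrative assumptions.

```python
from collections import defaultdict


def ngrams(text, max_n=5):
    """Yield every run of 1..max_n consecutive words in the text."""
    words = text.lower().split()  # naive whitespace tokenization (assumption)
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            yield " ".join(words[i:i + n])


def build_index(articles, max_n=5):
    """Map each n-gram to the set of article IDs in which it appears."""
    index = defaultdict(set)
    for article_id, text in articles.items():
        for gram in ngrams(text, max_n):
            index[gram].add(article_id)
    return index


# Hypothetical toy corpus; the IDs and sentences are made up for illustration.
articles = {
    "paper-1": "plants emit volatile organic compounds",
    "paper-2": "volatile organic compounds affect air quality",
}

index = build_index(articles)
print(sorted(index["volatile organic compounds"]))  # ['paper-1', 'paper-2']
```

Because no stored phrase is longer than five words, a table like this supports lookups of the kind Yadav describes (which papers mention a given compound) without reproducing an article's full text.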

Has COVID-19 been the making of Open Science?


Article by Lonni Besançon, Corentin Segalas and Clémence Leyrat: “Although many concepts fall under the umbrella of Open Science, some of its key concepts are: Open Access, Open Data, Open Source, and Open Peer Review. How far these four principles were embraced by researchers during the pandemic and where there is room for improvement, is what we, as early career researchers, set out to assess by looking at data on scientific articles published during the Covid-19 pandemic….Open Source and Open Data practices consist in making all the data and materials used to gather or analyse data available on relevant repositories. While we can find incredibly useful datasets shared publicly on COVID-19 (for instance those provided by the European Centre for Disease Control), they remain the exception rather than the norm. A spectacular example of this were the papers utilising data from the company Surgisphere, that led to retracted papers in The Lancet and The New England Journal of Medicine. In our paper, we highlight 4 papers that could have been retracted much earlier (and perhaps would never have been accepted) had the data been made accessible from the time of publication. As we argue in our paper, this presents a clear case for making open data and open source the default, with exceptions for privacy and safety. While some journals already have such policies, we go further in asking that, when data cannot be shared publicly, editors/publishers and authors/institutions should agree on a third party to check the existence and reliability/validity of the data and the results presented. This not only would strengthen the review process, but also enhance the reproducibility of research and further accelerate the production of new knowledge through data and code sharing…(More)”.

Can digital technologies improve health?


The Lancet: “If you have followed the news on digital technology and health in recent months, you will have read of a blockbuster fraud trial centred on a dubious blood-testing device, a controversial partnership between a telehealth company and a data analytics company, a social media company promising action to curb the spread of vaccine misinformation, and another addressing its role in the deteriorating mental health of young women. For proponents and critics alike, these stories encapsulate the health impact of many digital technologies, and the uncertain and often unsubstantiated position of digital technologies for health. The Lancet and Financial Times Commission on governing health futures 2030: growing up in a digital world, brings together diverse, independent experts to ask if this narrative can still be turned around? Can digital technologies deliver health benefits for all?

Digital technologies could improve health in many ways. For example, electronic health records can support clinical trials and provide large-scale observational data. These approaches have underpinned several high-profile research findings during the COVID-19 pandemic. Sequencing and genomics have been used to understand SARS-CoV-2 transmission and evolution. There is vast promise in digital technology, but the Commission argues that, overall, digital transformations will not deliver health benefits for all without fundamental and revolutionary realignment.

Globally, digital transformations are well underway and have had both direct and indirect health consequences. Direct effects can occur through, for example, the promotion of health information or propagating misinformation. Indirect ones can happen via effects on other determinants of health, including social, economic, commercial, and environmental factors, such as influencing people’s exposure to marketing or political messaging. Children and adolescents growing up in this digital world experience the extremes of digital access. Young people who spend large parts of their lives online may be protected or vulnerable to online harm. But many individuals remain digitally excluded, affecting their access to education and health information. Digital access, and the quality of that access, must be recognised as a key determinant of health. The Commission calls for connectivity to be recognised as a public good and human right.

Describing the accumulation of data and power by dominant actors, many of which are commercial, the Commissioners criticise business models based on the extraction of personal data, and those that benefit from the viral spread of misinformation. To redirect digital technologies to advance universal health coverage, the Commission invokes the guiding principles of democracy, equity, solidarity, inclusion, and human rights. Governments must protect individuals from emerging threats to their health, including bias, discrimination, and online harm to children. The Commission also calls for accountability and transparency in digital transformations, and for the governance of misinformation in health care—basic principles, but ones that have been overridden in a quest for freedom of expression and by the fear that innovation could be sidelined. Public participation and codesign of digital technologies, particularly including young people and those from affected communities, are fundamental.

The Commission also advocates for data solidarity, a radical new approach to health data in which both personal and collective interests and responsibilities are balanced. Rather than data being regarded as something to be owned or hoarded, it emphasises the social and relational nature of health data. Countries should develop data trusts that unlock potential health benefits in public data, while also safeguarding it.

Digital transformations cannot be reversed. But they must be rethought and changed. At its heart, this Commission is both an exposition of the health harms of digital technologies as they function now, and an optimistic vision of the potential alternatives. Calling for investigation and expansion of digital health technologies is not misplaced techno-optimism, but a serious opportunity to drive much needed change. Without new approaches, the world will not achieve the 2030 Sustainable Development Goals.

However, no amount of technical innovation or research will bring equitable health benefits from digital technologies without a fundamental redistribution of power and agency, achievable only through appropriate governance. There is a desperate need to reclaim digital technologies for the good of societies. Our future health depends on it….(More)”.

Open data in digital strategies against COVID-19: the case of Belgium


Paper by Robert Viseur: “COVID-19 has highlighted the importance of digital in the fight against the pandemic (control at the border, automated tracing, creation of databases…). In this research, we analyze the Belgian response in terms of open data. First, we examine the open data publication strategy in Belgium (a federal state with a sometimes complex functioning, especially in health), second, we conduct a case study (anatomy of the pandemic in Belgium) in order to better understand the strengths and weaknesses of the main COVID-19 open data repository. And third, we analyze the obstacles to open data publication. Finally, we discuss the Belgian COVID-19 open data strategy in terms of data availability, data relevance and knowledge management. In particular, we show how difficult it is to optimize the latter in order to make the best use of governmental, private and academic open data in a way that has a positive impact on public health policy….(More)”.

Data Science for Social Good: Philanthropy and Social Impact in a Complex World


Book edited by Ciro Cattuto and Massimo Lapucci: “This book is a collection of insights by thought leaders at first-mover organizations in the emerging field of “Data Science for Social Good”. It examines the application of knowledge from computer science, complex systems, and computational social science to challenges such as humanitarian response, public health, and sustainable development. The book provides an overview of scientific approaches to social impact – identifying a social need, targeting an intervention, measuring impact – and the complementary perspective of funders and philanthropies pushing forward this new sector.

TABLE OF CONTENTS


Introduction; By Massimo Lapucci

The Value of Data and Data Collaboratives for Good: A Roadmap for Philanthropies to Facilitate Systems Change Through Data; By Stefaan G. Verhulst

UN Global Pulse: A UN Innovation Initiative with a Multiplier Effect; By Dr. Paula Hidalgo-Sanchis

Building the Field of Data for Good; By Claudia Juech

When Philanthropy Meets Data Science: A Framework for Governance to Achieve Data-Driven Decision-Making for Public Good; By Nuria Oliver

Data for Good: Unlocking Privately-Held Data to the Benefit of the Many; By Alberto Alemanno

Building a Funding Data Ecosystem: Grantmaking in the UK; By Rachel Rank

A Reflection on the Role of Data for Health: COVID-19 and Beyond; By Stefan E. Germann and Ursula Jasper….(More)”

Data Stewardship Re-Imagined — Capacities and Competencies


Blog and presentation by Stefaan Verhulst: “In ways both large and small, COVID-19 has forced us to re-examine every aspect of our political, social, and economic systems. Among the many lessons policymakers have learned is that existing methods for using data are often insufficient for our most pressing challenges. In particular, we need to find new, innovative ways of tapping into the potential of privately held and siloed datasets that nonetheless contain tremendous public good potential, including complementing and extending official statistics. Data collaboratives are an emerging set of methods for accessing and reusing data that offer tremendous opportunities in this regard. In the last five years, we have studied and initiated numerous data collaboratives, in the process assembling a collection of over 200 example case studies to better understand their possibilities.

Among our key findings is the vital importance and essential role that needs to be played by Data Stewards.

Data stewards do not represent an entirely new profession; rather, their role could be understood as an extension and re-definition of existing organizational positions that manage and interact with data. Traditionally, the role of a data officer was limited either to data integrity or the narrow context of internal data governance and management, with a strong emphasis on technical competencies. This narrow conception is no longer sufficient, especially given the proliferation of data and the increasing potential of data sharing and collaboration. As such, we call for a re-imagination of data stewardship to encompass a wider range of functions and responsibilities, directed at leveraging data assets toward addressing societal challenges and improving people’s lives.

DATA STEWARDSHIP: functions and competencies to enable access to and re-use of data for public benefit in a systematic, sustainable, and responsible way.

In our vision, data stewards are professionals empowered to create public value (including official statistics) by re-using data and data expertise, identifying opportunities for productive cross-sectoral collaboration, and proactively requesting or enabling functional access to data, insights, and expertise. Data stewards are active in both the public and private sectors, promoting trust within and outside their organizations. They are essential to data collaboratives by providing functional access to unlock the potential of siloed data sets. In short, data stewards form a new — and essential — link in the data value chain….(More)”.

Licensure as Data Governance


Essay by Frank Pasquale: “…A licensure regime for data and the AI it powers would enable citizens to democratically shape data’s scope and proper use, rather than resigning ourselves to being increasingly influenced and shaped by forces beyond our control. To ground the case for more ex ante regulation, Part I describes the expanding scope of data collection, analysis, and use, and the threats that that scope poses to data subjects. Part II critiques consent-based models of data protection, while Part III examines the substantive foundation of licensure models. Part IV addresses a key challenge to my approach: the free expression concerns raised by the licensure of large-scale personal data collection, analysis, and use. Part V concludes with reflections on the opportunities created by data licensure frameworks and potential limitations upon them….(More)”.

Feedback Loops in Open Data Ecosystems


Paper by Daniel Rudmark and Magnus Andersson: “Public agencies are increasingly publishing open data to increase transparency and fuel data-driven innovation. For these organizations, maintaining sufficient data quality is key to continuous re-use but also heavily dependent on feedback loops being initiated between data publishers and users. This paper reports from a longitudinal engagement with Scandinavian transportation agencies, where such feedback loops have been successfully established. Based on these experiences, we propose four distinct types of data feedback loops in which both data publishers and re-users play critical roles…(More)”.

UNCTAD calls on countries to make digital data flow for the benefit of all


Press Release: “The world needs a new global governance approach to enable digital data to flow across borders as freely as necessary and possible, says UNCTAD’s Digital Economy Report 2021 released on 29 September.

The UN trade and development body says the new approach should help maximize development gains, ensure those gains are equitably distributed and minimize risks and harms.

It should also enable worldwide data sharing, develop global digital public goods, increase trust and reduce uncertainty in the digital economy.

The report says the new global system should also help avoid further fragmentation of the internet, address policy challenges emerging from the dominant positions of digital platforms and narrow existing inequalities.

“It is more important than ever to embark on a new path for digital and data governance,” says UN Secretary-General António Guterres in his preface to the report.

“The current fragmented data landscape risks us failing to capture value that could accrue from digital technologies and it may create more space for substantial harms related to privacy breaches, cyberattacks and other risks.”

UNCTAD Secretary-General Rebeca Grynspan said: “We urgently need a renewed focus on achieving global digital and data governance, developing global digital public goods, increasing trust and reducing uncertainty in the digital economy. The pandemic has shown the critical importance of sharing health data globally – the issue of digital governance can no longer be postponed.”

Pandemic underscores need for new governance

Digital data play an increasingly important role as an economic and strategic resource, a trend reinforced by the COVID-19 pandemic.

The pandemic has shown the importance of sharing health data globally to help countries cope with its consequences, and for research purposes in finding vaccines.

“The increased interconnection and interdependence challenges in the global data economy call for moving away from the silo approach towards a more holistic, coordinated global approach,” UNCTAD Deputy Secretary-General Isabelle Durant said.

“Moreover, new and innovative ways of global governance are urgently needed, as the old ways may not be well suited to respond to the new context,” she added.

New UN data-related body proposed

UNCTAD proposes the formation of a new United Nations coordinating body, with a focus on, and with the skills for, assessing and developing comprehensive global digital and data governance. Its work should be multilateral, multi-stakeholder and multidisciplinary.

It should also seek to remedy the current underrepresentation of developing countries in global and regional data governance initiatives.

The body should also function as a complement to and in coherence with national policies and provide sufficient policy space to ensure countries with different levels of digital readiness and capacities can benefit from the data-driven digital economy…(More)”.

Statement of Principles to support proactive disclosure of government-held information


Statement of principles by the Australian information commissioners and ombudsmen: “Information commissioners and ombudsmen across Australia oversight and promote citizens’ rights to access government-held information and have powers to review agency decisions under the applicable right to information (RTI) legislation. Beyond formal rights of access, the proactive disclosure of government-held information promotes open government and advances our system of representative democracy.

All Australian governments (Commonwealth, state, territory, and local) and public institutions are strongly encouraged to commit to being Open by Design by building a culture of transparency and by prioritising, promoting and resourcing proactive disclosure.

These Principles recognise that:

  1. information held by government and public institutions is a public resource and, to the greatest extent possible, should be published promptly and proactively at the lowest reasonable cost, without the need for a formal access request, and
  2. a culture of transparency within government is everyone’s responsibility requiring action by all public sector leaders and officers to encourage and support the proactive disclosure of information, and
  3. appropriate, prompt and proactive disclosure of government-held information:
  • informs community – proactive disclosure leads to a more informed community, and awareness raising of government and public institutions’ strategic intentions and initiatives, driving innovation and improving standards. Transparent and coherent public communication can also address misinformation
  • increases participation and enhances decision-making – proactive disclosure increases citizen participation in government processes and promotes better informed decision-making through increased scrutiny, discussion, comment and review of government and public institutions’ decisions
  • builds trust and confidence – proactive disclosure enhances public sector accountability and integrity, builds public trust and confidence in decision-making by government and public institutions and strengthens principles of liberal democracy
  • improves service delivery – proactive disclosure improves service delivery by providing access to information faster and more easily than formal access regimes, providing the opportunity to decide when and how information is provided, and to contextualise and explain information
  • is required or permitted by law – proactive disclosure is mandated, permitted, or protected by law in all Australian states and territories and the Commonwealth
  • improves efficiency – proactive disclosure reduces the administrative burden on departments and agencies and the need for citizens to make a formal information access request.

 Australian information commissioners and ombudsmen recommend that public sector agencies:

  1. Embed a proactive disclosure culture in all public sector agencies and public institutions by…(More)”.