Open science: after the COVID-19 pandemic there can be no return to closed working


Article by Virginia Barbour and Martin Borchert: “In the few months since the first case of COVID-19 was identified, the underlying cause has been isolated, its symptoms agreed on, its genome sequenced, diagnostic tests developed, and potential treatments and vaccines are on the horizon. The astonishingly short time frame of these discoveries has only happened through a global open science effort.

The principles and practices underpinning open science are what underpin good research—research that is reliable, reproducible, and has the broadest impact possible. It specifically requires the application of principles and practices that make research FAIR (Findable, Accessible, Interoperable, Reusable); researchers are making their data and preliminary publications openly accessible, and then publishers are making the peer-reviewed research immediately and freely available to all. The rapid dissemination of research—through preprints in particular as well as journal articles—stands in contrast to what happened in the 2003 SARS outbreak when the majority of research on the disease was published well after the outbreak had ended.

Many outside observers might reasonably assume, given the digital world we all now inhabit, that science usually works like this. Yet this is very far from the norm for most research. Science is not something that just happens in response to emergencies or specific events—it is an ongoing, largely publicly funded, national and international enterprise….

Sharing of the underlying data that journal articles are based on is not yet a universal requirement for publication, nor are researchers usually recognised for data sharing.

There are many benefits associated with an open science model. Image adapted from: Gaelen Pinnock/UCT; CC-BY-SA 4.0.

Once published, even access to research is not seamless. The majority of academic journals still require a subscription to access. Subscriptions are expensive; Australian universities alone currently spend more than $300 million per year on subscriptions to academic journals. Access to academic journals also varies between universities with varying library budgets. The main markets for subscriptions to the commercial journal literature are higher education and health, with some access to government and commercial….(More)”.

How Statistics Can Help — Going Beyond COVID-19


Blog by Walter J. Radermacher at Data & Policy: “It is rightly pointed out that in the midst of a crisis of enormous dimensions we needed high quality statistics with utmost urgency, but that instead we are in danger of drowning in an ocean of data and information. The pandemic is accompanied and exacerbated by an infodemic. At this moment, and in this confusion and search for solutions, it seems appropriate to take advice from previous initiatives and draw lessons for the current situation. More than 20 years ago in the United Kingdom, the report “Statistics — A Matter of Trust” laid the foundations for overcoming the previously spreading crisis of confidence through a solidly structured statistical system. This report does not stand alone in international comparison. Rather, it is one of a series of global, European and national measures and agreements which, since the fall of the Berlin Wall in 1989, have strengthened official statistics as the backbone of policy in democratic societies, with the UN Fundamental Statistical Principles and the EU Statistics Code of Practice being prominent representatives. So, if we want to deal with our current difficulties, we should address precisely those points that have emerged as determining factors for the quality of statistics, with the following three questions: What (statistical products, quality profile)? How (methods)? Who (institutions)? The aim must be to ensure that statistical information is suitable for facilitating the resolution of conflicts by eliminating the need to argue about the facts and only about the conclusions to be drawn from them.

In the past, this task would have led relatively quickly to a situation where the need for information would have been directed to official statistics as the preferred provider; this has changed recently for many reasons. On the one hand, there is the danger that the much-cited data revolution and learning algorithms (so-called AI) are presented as an alternative to official statistics (which are perceived as too slow, too inflexible and too expensive), instead of emphasizing possible commonalities and cross-fertilization possibilities. On the other hand, after decades of austerity policies, official statistics are in a similarly defensive situation to that of the public health system in many respects and in many countries: There is a lack of financial reserves, personnel and know-how for the new and innovative work now so urgently needed.

It is therefore required, as in the 1990s, to ask the fundamental question again, namely, do we (still and again) really deserve official statistics as the backbone of democratic decision-making, and if so, what should their tasks be, how should they be financed and anchored in the political system?…(More)”.

Protecting Data Privacy and Rights During a Crisis are Key to Helping the Most Vulnerable in Our Community


Blog by Amen Ra Mashariki: “Governments should protect the data and privacy rights of their communities even during emergencies. It is a false trade-off to require more data without protection. We can and should do both — collect the appropriate data and protect it. Establishing and protecting the data rights and privacy of our communities’ underserved, underrepresented, disabled, and vulnerable residents is the only way we can combat the negative impact of COVID-19 or any other crisis.

Building trust is critical. Governments can strengthen data privacy protocols, beef up transparency mechanisms, and protect the public’s data rights in the name of building trust — especially with the most vulnerable populations. Otherwise, residents will opt out of engaging with government, and without their information, leaders like first responders will be blind to their existence when making decisions and responding to emergencies, as we are seeing with COVID-19.

As Chief Analytics Officer of New York City, I often remembered the words of Defense Secretary Donald Rumsfeld, especially with regards to using data during emergencies, that there are “known knowns, known unknowns, and unknown unknowns, and we will always get hurt by the unknown unknowns.” Meaning the things we didn’t know — the data that we didn’t have — was always going to be what hurt us during times of emergencies….

There are three key steps that governments can do right now to use data most effectively to respond to emergencies — both for COVID-19 and in the future.

Seek Open Data First

In times of crisis and emergencies, many believe that government and private entities, either purposefully or inadvertently, are willing to trample on the data rights of the public in the name of appropriate crisis response. This should not be a trade-off. We can respond to crises while keeping data privacy and data rights in the forefront of our minds. Rather than dismissing data rights, governments can start using data that is already openly available. This seems like a simple step, but it does two very important things. First, it forces you to understand the data that is already available in your jurisdiction. Second, it grows your ability to fill the gaps with respect to what you know about the city by looking outside of city government. …(More)”.

Responsible Data Toolkit


Andrew Young at The GovLab: “The GovLab and UNICEF, as part of the Responsible Data for Children initiative (RD4C), are pleased to share a set of user-friendly tools to support organizations and practitioners seeking to operationalize the RD4C Principles. These principles—Purpose-Driven, People-Centric, Participatory, Protective of Children’s Rights, Proportional, Professionally Accountable, and Prevention of Harms Across the Data Lifecycle—are especially important in the current moment, as actors around the world are taking a data-driven approach to the fight against COVID-19.

The initial components of the RD4C Toolkit are:

The RD4C Data Ecosystem Mapping Tool intends to help users to identify the systems generating data about children and the key components of those systems. After using this tool, users will be positioned to understand the breadth of data they generate and hold about children; assess data systems’ redundancies or gaps; identify opportunities for responsible data use; and achieve other insights.

The RD4C Decision Provenance Mapping methodology provides a way for actors designing or assessing data investments for children to identify key decision points and determine which internal and external parties influence those decision points. This distillation can help users to pinpoint any gaps and develop strategies for improving decision-making processes and advancing more professionally accountable data practices.

The RD4C Opportunity and Risk Diagnostic provides organizations with a way to take stock of the RD4C principles and how they might be realized as an organization reviews a data project or system. The high-level questions and prompts below are intended to help users identify areas in need of attention and to strategize next steps for ensuring more responsible handling of data for and about children across their organization.

Finally, the Data for Children Collaborative with UNICEF developed an Ethical Assessment that “forms part of [their] safe data ecosystem, alongside data management and data protection policies and practices.” The tool reflects the RD4C Principles and aims to “provide an opportunity for project teams to reflect on the material consequences of their actions, and how their work will have real impacts on children’s lives.”

RD4C launched in October 2019 with the release of the RD4C Synthesis Report, Selected Readings, and the RD4C Principles. Last month we published the RD4C Case Studies, which analyze data systems deployed in diverse country environments, with a focus on their alignment with the RD4C Principles. The case studies are: Romania’s The Aurora Project, Childline Kenya, and Afghanistan’s Nutrition Online Database.

To learn more about Responsible Data for Children, visit rd4c.org or contact rd4c [at] thegovlab.org. To join the RD4C conversation and be alerted to future releases, subscribe at this link.”

Coronavirus shows how badly we need consensus on collective data rights and needs


Blogpost by Ania Calderon: “The rapid spread of this disease is exposing fault lines in our political and social balance — most visibly in the lack of protection for the poorest or investment in healthcare systems. It’s also forcing us to think about how we can work across jurisdictions and political contexts to foster better collaboration, build trust in institutions, and save lives.

As we said recently in a call for Open COVID-19 Data, governments need data from other countries to model and flatten the curve, but there is little consistency in how they gather it. Meanwhile, the consequences of different approaches show the balance required in effectively implementing open data policies. For example, Singapore has published detailed personal data about every coronavirus patient, including where they work and live and whether they had contact with others. This helped the city-state keep its infection and death rates extremely low in the early stages of the epidemic, but also led to proportionality concerns as people might be targeted and harmed.

Overall, few governments are publishing the information on which they are basing these huge decisions. This makes it hard to collaborate, scrutinise, and build trust. For example, the models can only be as good as the data that feed them, and we need to understand their limitations. Opening up the data and the source code behind them would give citizens confidence that officials were making decisions in the public’s interest rather than their political ones. It would also foster the international joined-up action needed to meet this challenge. And it would allow non-state actors into the process to plug gaps and deliver and scale effective solutions quickly.

At the same time, legitimate concerns have been raised about how this data is used, both now and in the future.

As we say in our strategy, openness needs to be balanced with both individual and collective data rights, and policies need to account for context.

People may be ok to give up some of their privacy — like having their movements tracked by government smartphone apps — if that can help combat a global health crisis, but that would seem an unthinkable invasion of privacy to many in less exceptional times. We rightly worry how this data might be used later on, and by whom. Which shows that data systems need to be able to respond to changing times, while holding fundamental human rights and civil liberties in check.

As with so many things, this crisis is forcing the world to question orthodoxies around individual and collective data rights and needs. It shines a light on policies and approaches which might help avoid future disasters and build a fairer, healthier, more collaborative society overall….(More)”.

From Idea to Reality: Why We Need an Open Data Policy Lab


Stefaan G. Verhulst at Open Data Policy Lab: “The belief that we are living in a data age — one characterized by unprecedented amounts of data, with unprecedented potential — has become mainstream. We regularly read phrases such as “data is the most valuable commodity in the global economy” or that data provides decision-makers with an “ever-swelling flood of information.”

Without a doubt, there is truth in such statements. But they also leave out a major shortcoming — the fact that much of the most useful data continue to remain inaccessible, hidden in silos, behind digital walls, and in untapped “treasuries.”

For close to a decade, the technology and public interest community have pushed the idea of open data. At its core, open data represents a new paradigm of data availability and access. The movement borrows from the language of open source and is rooted in notions of a “knowledge commons”, a concept developed, among others, by scholars like Nobel Prize winner Elinor Ostrom.

Milestones and Limitations in Open Data

Significant milestones have been achieved in the short history of the open data movement. Around the world, an ever-increasing number of governments at the local, state and national levels now release large datasets for the public’s benefit. For example, New York City requires that all public data be published on a single web portal. The current portal site contains thousands of datasets that fuel projects on topics as diverse as school bullying, sanitation, and police conduct. In California, the Forest Practice Watershed Mapper allows users to track the impact of timber harvesting on aquatic life through the use of the state’s open data. Similarly, Denmark’s Building and Dwelling Register releases address data to the public free of charge, improving transparent property assessment for all interested parties.

A growing number of private companies have also initiated or engaged in “Data Collaborative” projects to leverage their private data toward the public interest. For example, Valassis, a direct-mail marketing company, shared its massive address database with community groups in New Orleans to visualize and track block-by-block repopulation rates after Hurricane Katrina. A wide number of data collaboratives are also currently being launched to respond to the COVID-19 pandemic. Through its COVID-19 Data Collaborative Program, the location-intelligence company Cuebiq is providing researchers access to the company’s data to study, for instance, the impacts of social distancing policies in Italy and New York City. The health technology company Kinsa Health’s US Health Weather initiative is likewise visualizing the rate of fever across the United States using data from its network of Smart Thermometers, thereby providing early indications regarding the location of likely COVID-19 outbreaks.

Yet despite such initiatives, many open data projects (and data collaboratives) remain fledgling — especially those at the state and local level.

Among other issues, the field has trouble scaling projects beyond initial pilots, and many potential stakeholders — private sector and government “owners” of data, as well as public beneficiaries — remain skeptical of open data’s value. In addition, terabytes of potentially transformative data remain inaccessible for re-use. It is absolutely imperative that we continue to make the case to all stakeholders regarding the importance of open data, and of moving it from an interesting idea to an impactful reality. In order to do this, we need a new resource — one that can inform the public and data owners, and that would guide decision-makers on how to achieve open data in a responsible manner, without undermining privacy and other rights.

Purpose of the Open Data Policy Lab

Today, with support from Microsoft and under the counsel of a global advisory board of open data leaders, The GovLab is launching an initiative designed precisely to build such a resource.

Our Open Data Policy Lab will draw on lessons and experiences from around the world to conduct analysis, provide guidance, build community, and take action to accelerate the responsible re-use and opening of data for the benefit of society and the equitable spread of economic opportunity…(More)”.

Casualties of a Pandemic: Truth, Trust and Transparency


Essay by Frank D. LoMonte at the Journal of Civic Information: “In an April 1 interview with NPR’s “Morning Edition,” retired U.S. Army Gen. Stanley A. McChrystal, former commander of U.S. forces in Iraq, explained that, in a crisis situation, accurate information from government authorities can be crucial in reassuring the public – and in the absence of accurate information, speculation and rumor will proliferate. Joni Mitchell, who’s probably never before appeared in the same paragraph with Stanley McChrystal, might have put it a touch more poetically: “Don’t it always seem to go; That you don’t know what you’ve got ’til it’s gone.”

The outbreak of the coronavirus strain COVID-19, which prompted the U.S. Department of Health and Human Services to declare a public health emergency on Jan. 31, 2020, is introducing Americans to a newfound world of austerity and loss. Professional haircuts, sit-down restaurant meals and recreational plane flights increasingly seem like memories from a bygone golden age (small inconveniences, to be sure, alongside the suffering of thousands who’ve died and the families they’ve left behind).

Access to information from government agencies, too, is adapting to a mail-order, drive-through society. As public-health authorities reached consensus that the spread of COVID-19 could be contained only by eliminating non-essential travel and group gatherings, strict adherence to open-meeting and public-records laws became a casualty alongside salad bars and theme-park rides. Governors and legislatures relaxed, or entirely waived, compliance with statutes that require agencies to open their meetings to in-person public attendance and promptly fulfill requests for documents.

As with all other areas of public life, some sacrifices in open-government formalities are unavoidable. With agencies down to a sustenance-level crew of essential workers, it’s unrealistic to expect that decades-old paper documents will be speedily located and produced. And it’s unsafe to invite people to congregate at public hearings to address their elected officials. But the public shouldn’t be alone in the sacrifice….(More)”.

Why isn’t the government publishing more data about coronavirus deaths?


Article by Jeni Tennison: “Studying the past is futile in an unprecedented crisis. Science is the answer – and open-source information is paramount…Data is a necessary ingredient in day-to-day decision-making – but in this rapidly evolving situation, it’s especially vital. Everything has changed, almost overnight. Demands for food, transport, and energy have been overhauled as more people stop travelling and work from home. Jobs have been lost in some sectors, and workers are desperately needed in others. Historic experience can no longer tell us how our society or economy is working. Past models hold little predictive power in an unprecedented situation. To know what is happening right now, we need up-to-date information….

This data is also crucial for scientists, who can use it to replicate and build upon each other’s work. Yet no open data has been published alongside the evidence for the UK government’s coronavirus response. While a model that informed the US government’s response is freely available as a Google spreadsheet, the Imperial College London model that prompted the current lockdown has still not been published as open-source code. Making data open – publishing it on the web, in spreadsheets, without restrictions on access – is the best way to ensure it can be used by the people who need it most.

There is currently no open data available on UK hospitalisation rates; no regional, age or gender breakdown of daily deaths. The more granular breakdown of registered deaths provided by the Office for National Statistics is only published on a weekly basis, and with a delay. It is hard to tell whether this data does not exist or the NHS has prioritised creating dashboards for government decision makers rather than informing the rest of the country. But the UK is making progress with regard to data: potential Covid-19 cases identified through online and call-centre triage are now being published daily by NHS Digital.

Of course, not all data should be open. Singapore has been publishing detailed data about every infected person, including their age, gender, workplace, where they have visited and whether they had contact with other infected people. This can both harm the people who are documented and incentivise others to lie to authorities, undermining the quality of data.

When people are concerned about how data about them is handled, they demand transparency. To retain our trust, governments need to be open about how data is collected and used, how it’s being shared, with whom, and for what purpose. Openness about the use of personal data to help tackle the Covid-19 crisis will become more pressing as governments seek to develop contact tracing apps and immunity passports….(More)”.

Now Is the Time for Open Access Policies—Here’s Why



Victoria Heath and Brigitte Vézina at Creative Commons: “Over the weekend, news emerged that upset even the most ardent skeptics of open access. Under the headline, “Trump vs Berlin” the German newspaper Welt am Sonntag reported that President Trump offered $1 billion USD to the German biopharmaceutical company CureVac to secure their COVID-19 vaccine “only for the United States.”

In response, Jens Spahn, the German health minister said such a deal was completely “off the table” and Peter Altmaier, the German economic minister replied, “Germany is not for sale.” Open science advocates were especially infuriated. Professor Lorraine Leeson of Trinity College Dublin, for example, tweeted, “This is NOT the time for this kind of behavior—it flies in the face of the #OpenScience work that is helping us respond meaningfully right now. This is the time for solidarity, not exclusivity.” The White House and CureVac have since denied the report. 

Today, we find ourselves at a pivotal moment in history—we must cooperate effectively to respond to an unprecedented global health emergency. The mantra, “when we share, everyone wins” applies now more than ever. With this in mind, we felt it imperative to underscore the importance of open access, specifically open science, in times of crisis.

Why open access matters, especially during a global health emergency 

One of the most important components of maintaining global health, specifically in the face of urgent threats, is the creation and dissemination of reliable, up-to-date scientific information to the public, government officials, humanitarian and health workers, as well as scientists.

Several scientific research funders like the Gates Foundation, the Hewlett Foundation, and the Wellcome Trust have long-standing open access policies and some have now called for increased efforts to share COVID-19 related research rapidly and openly to curb the outbreak. By licensing material under a CC BY-NC-SA license, the World Health Organization (WHO) is adopting a more conservative approach to open access that falls short of what the scientific community urgently needs in order to access and build upon critical information….(More)”.

The Economic Impact of Open Data: Opportunities for value creation in Europe


Press Release: “The European Data Portal publishes its study “The Economic Impact of Open Data: Opportunities for value creation in Europe”. It researches the value created by open data in Europe. It is the second study by the European Data Portal, following the 2015 report. The open data market size is estimated at €184 billion and forecast to reach between €199.51 and €334.21 billion in 2025. The report additionally considers how this market size is distributed along different sectors and how many people are employed due to open data. The efficiency gains from open data, such as potential lives saved, time saved, environmental benefits, and improvement of language services, as well as associated potential costs savings are explored and quantified where possible. Finally, the report also considers examples and insights from open data re-use in organisations. The key findings of the report are summarised below:

  1. The specification and implementation of high-value datasets as part of the new Open Data Directive is a promising opportunity to address quality & quantity demands of open data.
  2. Addressing quality & quantity demands is important, yet not enough to reach the full potential of open data.
  3. Open data re-users have to be aware and capable of understanding and leveraging the potential.
  4. Open data value creation is part of the wider challenge of skill and process transformation: a lengthy process whose change and impact are not always easy to observe and measure.
  5. Sector-specific initiatives and collaboration in and across private and public sector foster value creation.
  6. Combining open data with personal, shared, or crowdsourced data is vital for the realisation of further growth of the open data market.
  7. For different challenges, we must explore and improve multiple approaches of data re-use that are ethical, sustainable, and fit-for-purpose….(More)”.