From Idea to Reality: Why We Need an Open Data Policy Lab


Stefaan G. Verhulst at Open Data Policy Lab: “The belief that we are living in a data age — one characterized by unprecedented amounts of data, with unprecedented potential — has become mainstream. We regularly read phrases such as “data is the most valuable commodity in the global economy” or that data provides decision-makers with an “ever-swelling flood of information.”

Without a doubt, there is truth in such statements. But they also leave out a major shortcoming — the fact that much of the most useful data continue to remain inaccessible, hidden in silos, behind digital walls, and in untapped “treasuries.”

For close to a decade, the technology and public interest community have pushed the idea of open data. At its core, open data represents a new paradigm of data availability and access. The movement borrows from the language of open source and is rooted in notions of a “knowledge commons”, a concept developed, among others, by scholars like Nobel Prize winner Elinor Ostrom.

Milestones and Limitations in Open Data

Significant milestones have been achieved in the short history of the open data movement. Around the world, an ever-increasing number of governments at the local, state and national levels now release large datasets for the public’s benefit. For example, New York City requires that all public data be published on a single web portal. The current portal site contains thousands of datasets that fuel projects on topics as diverse as school bullying, sanitation, and police conduct. In California, the Forest Practice Watershed Mapper allows users to track the impact of timber harvesting on aquatic life through the use of the state’s open data. Similarly, Denmark’s Building and Dwelling Register releases address data to the public free of charge, improving transparent property assessment for all interested parties.

A growing number of private companies have also initiated or engaged in “Data Collaborative” projects to leverage their private data toward the public interest. For example, Valassis, a direct-mail marketing company, shared its massive address database with community groups in New Orleans to visualize and track block-by-block repopulation rates after Hurricane Katrina. A wide number of data collaboratives are also currently being launched to respond to the COVID-19 pandemic. Through its COVID-19 Data Collaborative Program, the location-intelligence company Cuebiq is providing researchers access to the company’s data to study, for instance, the impacts of social distancing policies in Italy and New York City. The health technology company Kinsa Health’s US Health Weather initiative is likewise visualizing the rate of fever across the United States using data from its network of Smart Thermometers, thereby providing early indications regarding the location of likely COVID-19 outbreaks.

Yet despite such initiatives, many open data projects (and data collaboratives) remain fledgling — especially those at the state and local level.

Among other issues, the field has trouble scaling projects beyond initial pilots, and many potential stakeholders — private sector and government “owners” of data, as well as public beneficiaries — remain skeptical of open data’s value. In addition, terabytes of potentially transformative data remain inaccessible for re-use. It is absolutely imperative that we continue to make the case to all stakeholders regarding the importance of open data, and of moving it from an interesting idea to an impactful reality. In order to do this, we need a new resource — one that can inform the public and data owners, and that would guide decision-makers on how to achieve open data in a responsible manner, without undermining privacy and other rights.

Purpose of the Open Data Policy Lab

Today, with support from Microsoft and under the counsel of a global advisory board of open data leaders, The GovLab is launching an initiative designed precisely to build such a resource.

Our Open Data Policy Lab will draw on lessons and experiences from around the world to conduct analysis, provide guidance, build community, and take action to accelerate the responsible re-use and opening of data for the benefit of society and the equitable spread of economic opportunity…(More)”.

Casualties of a Pandemic: Truth, Trust and Transparency


Essay by Frank D. LoMonte at the Journal of Civic Information: “In an April 1 interview with NPR’s “Morning Edition,” retired U.S. Army Gen. Stanley A. McChrystal, former commander of U.S. forces in Iraq, explained that, in a crisis situation, accurate information from government authorities can be crucial in reassuring the public – and in the absence of accurate information, speculation and rumor will proliferate. Joni Mitchell, who’s probably never before appeared in the same paragraph with Stanley McChrystal, might have put it a touch more poetically: “Don’t it always seem to go; That you don’t know what you’ve got ’til it’s gone.”

The outbreak of the coronavirus strain COVID-19, which prompted the U.S. Department of Health and Human Services to declare a public health emergency on Jan. 31, 2020, is introducing Americans to a newfound world of austerity and loss. Professional haircuts, sit-down restaurant meals and recreational plane flights increasingly seem like memories from a bygone golden age (small inconveniences, to be sure, alongside the suffering of thousands who’ve died and the families they’ve left behind).

Access to information from government agencies, too, is adapting to a mail-order, drive-through society. As public-health authorities reached consensus that the spread of COVID-19 could be contained only by eliminating non-essential travel and group gatherings, strict adherence to open-meeting and public-records laws became a casualty alongside salad bars and theme-park rides. Governors and legislatures relaxed, or entirely waived, compliance with statutes that require agencies to open their meetings to in-person public attendance and promptly fulfill requests for documents.

As with all other areas of public life, some sacrifices in open-government formalities are unavoidable. With agencies down to a sustenance-level crew of essential workers, it’s unrealistic to expect that decades-old paper documents will be speedily located and produced. And it’s unsafe to invite people to congregate at public hearings to address their elected officials. But the public shouldn’t be alone in the sacrifice….(More)”.

Why isn’t the government publishing more data about coronavirus deaths?


Article by Jeni Tennison: “Studying the past is futile in an unprecedented crisis. Science is the answer – and open-source information is paramount…Data is a necessary ingredient in day-to-day decision-making – but in this rapidly evolving situation, it’s especially vital. Everything has changed, almost overnight. Demands for food, transport, and energy have been overhauled as more people stop travelling and work from home. Jobs have been lost in some sectors, and workers are desperately needed in others. Historic experience can no longer tell us how our society or economy is working. Past models hold little predictive power in an unprecedented situation. To know what is happening right now, we need up-to-date information….

This data is also crucial for scientists, who can use it to replicate and build upon each other’s work. Yet no open data has been published alongside the evidence for the UK government’s coronavirus response. While a model that informed the US government’s response is freely available as a Google spreadsheet, the Imperial College London model that prompted the current lockdown has still not been published as open-source code. Making data open – publishing it on the web, in spreadsheets, without restrictions on access – is the best way to ensure it can be used by the people who need it most.

There is currently no open data available on UK hospitalisation rates; no regional, age or gender breakdown of daily deaths. The more granular breakdown of registered deaths provided by the Office for National Statistics is only published on a weekly basis, and with a delay. It is hard to tell whether this data does not exist or the NHS has prioritised creating dashboards for government decision makers rather than informing the rest of the country. But the UK is making progress with regard to data: potential Covid-19 cases identified through online and call-centre triage are now being published daily by NHS Digital.

Of course, not all data should be open. Singapore has been publishing detailed data about every infected person, including their age, gender, workplace, where they have visited and whether they had contact with other infected people. This can both harm the people who are documented and incentivise others to lie to authorities, undermining the quality of data.

When people are concerned about how data about them is handled, they demand transparency. To retain our trust, governments need to be open about how data is collected and used, how it’s being shared, with whom, and for what purpose. Openness about the use of personal data to help tackle the Covid-19 crisis will become more pressing as governments seek to develop contact tracing apps and immunity passports….(More)”.

Now Is the Time for Open Access Policies—Here’s Why



Victoria Heath and Brigitte Vézina at Creative Commons: “Over the weekend, news emerged that upset even the most ardent skeptics of open access. Under the headline, “Trump vs Berlin” the German newspaper Welt am Sonntag reported that President Trump offered $1 billion USD to the German biopharmaceutical company CureVac to secure their COVID-19 vaccine “only for the United States.”

In response, Jens Spahn, the German health minister said such a deal was completely “off the table” and Peter Altmaier, the German economic minister replied, “Germany is not for sale.” Open science advocates were especially infuriated. Professor Lorraine Leeson of Trinity College Dublin, for example, tweeted, “This is NOT the time for this kind of behavior—it flies in the face of the #OpenScience work that is helping us respond meaningfully right now. This is the time for solidarity, not exclusivity.” The White House and CureVac have since denied the report. 

Today, we find ourselves at a pivotal moment in history—we must cooperate effectively to respond to an unprecedented global health emergency. The mantra, “when we share, everyone wins” applies now more than ever. With this in mind, we felt it imperative to underscore the importance of open access, specifically open science, in times of crisis.

Why open access matters, especially during a global health emergency 

One of the most important components of maintaining global health, specifically in the face of urgent threats, is the creation and dissemination of reliable, up-to-date scientific information to the public, government officials, humanitarian and health workers, as well as scientists.

Several scientific research funders like the Gates Foundation, the Hewlett Foundation, and the Wellcome Trust have long-standing open access policies and some have now called for increased efforts to share COVID-19 related research rapidly and openly to curb the outbreak. By licensing material under a CC BY-NC-SA license, the World Health Organization (WHO) is adopting a more conservative approach to open access that falls short of what the scientific community urgently needs in order to access and build upon critical information….(More)”.

The Economic Impact of Open Data: Opportunities for value creation in Europe


Press Release: “The European Data Portal publishes its study “The Economic Impact of Open Data: Opportunities for value creation in Europe”. It researches the value created by open data in Europe. It is the second study by the European Data Portal, following the 2015 report. The open data market size is estimated at €184 billion and forecast to reach between €199.51 and €334.21 billion in 2025. The report additionally considers how this market size is distributed along different sectors and how many people are employed due to open data. The efficiency gains from open data, such as potential lives saved, time saved, environmental benefits, and improvement of language services, as well as associated potential costs savings are explored and quantified where possible. Finally, the report also considers examples and insights from open data re-use in organisations. The key findings of the report are summarised below:

  1. The specification and implementation of high-value datasets as part of the new Open Data Directive is a promising opportunity to address quality & quantity demands of open data.
  2. Addressing quality & quantity demands is important, yet not enough to reach the full potential of open data.
  3. Open data re-users have to be aware and capable of understanding and leveraging the potential.
  4. Open data value creation is part of the wider challenge of skill and process transformation: a lengthy process whose change and impact are not always easy to observe and measure.
  5. Sector-specific initiatives and collaboration in and across private and public sector foster value creation.
  6. Combining open data with personal, shared, or crowdsourced data is vital for the realisation of further growth of the open data market.
  7. For different challenges, we must explore and improve multiple approaches of data re-use that are ethical, sustainable, and fit-for-purpose….(More)”.

Assessing the Returns on Investment in Data Openness and Transparency


Paper by Megumi Kubota and Albert Zeufack: “This paper investigates the potential benefits for a country from investing in data transparency. The paper shows that increased data transparency can bring substantive returns in lower costs of external borrowing.

This result is obtained by estimating the impact of public data transparency on sovereign spreads conditional on the country’s level of institutional quality and public and external debt. While improving data transparency alone reduces the external borrowing costs for a country, the return is much higher when combined with stronger institutional quality and lower public and external debt. Similarly, the returns on investing in data transparency are higher when a country’s integration to the global economy deepens, as captured by trade and financial openness.

Estimation of an instrumental variable regression shows that Sub-Saharan African countries could have saved up to 14.5 basis points in sovereign bond spreads and decreased their external debt burden by US$405.4 million (0.02 percent of gross domestic product) in 2018, if their average level of data transparency was that of a country in the top quartile of the upper-middle-income country category. At the country level, Angola could have reduced its external debt burden by around US$73.6 million….(More)”.

Barriers to Working With National Health Service England’s Open Data


Paper by Ben Goldacre and Seb Bacon: “Open data is information made freely available to third parties in structured formats without restrictive licensing conditions, permitting commercial and noncommercial organizations to innovate. In the context of National Health Service (NHS) data, this is intended to improve patient outcomes and efficiency. EBM DataLab is a research group with a focus on online tools which turn our research findings into actionable monthly outputs. We regularly import and process more than 15 different NHS open datasets to deliver OpenPrescribing.net, one of the most high-impact use cases for NHS England’s open data, with over 15,000 unique users each month. In this paper, we have described the many breaches of best practices around NHS open data that we have encountered. Examples include datasets that repeatedly change location without warning or forwarding; datasets that are needlessly behind a “CAPTCHA” and so cannot be automatically downloaded; longitudinal datasets that change their structure without warning or documentation; near-duplicate datasets with unexplained differences; datasets that are impossible to locate, and thus may or may not exist; poor or absent documentation; and withholding of data for dubious reasons. We propose new open ways of working that will support better analytics for all users of the NHS. These include better curation, better documentation, and systems for better dialogue with technical teams….(More)”.

Reuse of open data in Quebec: from economic development to government transparency


Paper by Christian Boudreau: “Based on the history of open data in Quebec, this article discusses the reuse of these data by various actors within society, with the aim of securing desired economic, administrative and democratic benefits. Drawing on an analysis of government measures and community practices in the field of data reuse, the study shows that the benefits of open data appear to be inconclusive in terms of economic growth. On the other hand, their benefits seem promising from the point of view of government transparency in that it allows various civil society actors to monitor the integrity and performance of government activities. In the age of digital data and networks, the state must be seen not only as a platform conducive to innovation, but also as a rich field of study that is closely monitored by various actors driven by political and social goals….

Although the economic benefits of open data have been inconclusive so far, governments, at least in Quebec, must not stop investing in opening up their data. In terms of transparency, the results of the study suggest that the benefits of open data are sufficiently promising to continue releasing government data, if only to support the evaluation and planning activities of public programmes and services….(More)”.

How digital sleuths unravelled the mystery of Iran’s plane crash


Chris Stokel-Walker at Wired: “The video shows a faint glow in the distance, zig-zagging like a piece of paper caught in an underdraft, slowly meandering towards the horizon. Then there’s a bright flash and the trees in the foreground are thrown into shadow as Ukraine International Airlines flight PS752 hits the ground early on the morning of January 8, killing all 176 people on board.

At first, it seemed like an accident – engine failure was fingered as the cause – until the first video showing the plane seemingly on fire as it weaved to the ground surfaced. United States officials started to investigate, and a more complicated picture emerged. It appeared that the plane had been hit by a missile, corroborated by a second video that appears to show the moment the missile ploughs into the Boeing 737-800. While military and intelligence officials at governments around the world were conducting their inquiries in secret, a team of investigators were using open-source intelligence (OSINT) techniques to piece together the puzzle of flight PS752.

It’s not unusual nowadays for OSINT to lead the way in decoding key news events. When Sergei Skripal was poisoned, Bellingcat, an open-source intelligence website, tracked and identified his killers as they traipsed across London and Salisbury. They delved into military records to blow the cover of agents sent to kill. And in the days after the Ukraine Airlines plane crashed into the ground outside Tehran, Bellingcat and The New York Times have blown a hole in the supposition that the downing of the aircraft was an engine failure. The pressure – and the weight of public evidence – compelled Iranian officials to admit overnight on January 10 that the country had shot down the plane “in error”.

So how do they do it? “You can think of OSINT as a puzzle. To get the complete picture, you need to find the missing pieces and put everything together,” says Loránd Bodó, an OSINT analyst at Tech versus Terrorism, a campaign group. The team at Bellingcat and other open-source investigators pore over publicly available material. Thanks to our propensity to reach for our cameraphones at the sight of any newsworthy incident, video and photos are often available, posted to social media in the immediate aftermath of events. (The person who shot and uploaded the second video in this incident, of the missile appearing to hit the Boeing plane was a perfect example: they grabbed their phone after they heard “some sort of shot fired”.) “Open source investigations essentially involve the collection, preservation, verification, and analysis of evidence that is available in the public domain to build a picture of what happened,” says Yvonne McDermott Rees, a lecturer at Swansea University….(More)”.

Open Science, Open Data, and Open Scholarship: European Policies to Make Science Fit for the Twenty-First Century


Paper by Jean-Claude Burgelman et al: “Open science will make science more efficient, reliable, and responsive to societal challenges. The European Commission has sought to advance open science policy from its inception in a holistic and integrated way, covering all aspects of the research cycle from scientific discovery and review to sharing knowledge, publishing, and outreach. We present the steps taken with a forward-looking perspective on the challenges lying ahead, in particular the necessary change of the rewards and incentives system for researchers (for which various actors are co-responsible and which goes beyond the mandate of the European Commission). Finally, we discuss the role of artificial intelligence (AI) within an open science perspective….(More)”.