What the drive for open science data can learn from the evolving history of open government data


Stefaan Verhulst, Andrew Young, and Andrew Zahuranec at The Conversation: “Nineteen years ago, a group of international researchers met in Budapest to discuss a persistent problem. While experts published an enormous amount of scientific and scholarly material, few of these works were accessible. New research remained locked behind paywalls run by academic journals. The result was researchers struggled to learn from one another. They could not build on one another’s findings to achieve new insights. In response to these problems, the group developed the Budapest Open Access Initiative, a declaration calling for free and unrestricted access to scholarly journal literature in all academic fields.

In the years since, open access has become a priority for a growing number of universities, governments, and journals. But while access to scientific literature has increased, access to the scientific data underlying this research remains extremely limited. Researchers can increasingly see what their colleagues are doing but, in an era defined by the replication crisis, they cannot access the data to reproduce the findings or analyze it to produce new findings. In some cases there are good reasons to keep access to the data limited – such as confidentiality or sensitivity concerns – yet in many other cases data hoarding still reigns.

To make scientific research data open to citizens and scientists alike, open science data advocates can learn from open data efforts in other domains. By looking at the evolving history of the open government data movement, scientists can see both limitations to current approaches and identify ways to move forward from them….(More) (French version)”.

The Third Wave of Open Data Toolkit


The GovLab: “Today, as part of Open Data Week 2021, the Open Data Policy Lab is launching The Third Wave of Open Data Toolkit, which provides organizations with specific operational guidance on how to foster responsible, effective, and purpose-driven re-use. The toolkit (authored by Andrew Young, Andrew J. Zahuranec, Stefaan G. Verhulst, and Kateryna Gazaryan) supports the work of data stewards, responsible data leaders at public, private, and civil society organizations empowered to seek new ways to create public value through cross-sector data collaboration. The toolkit provides this support in a few different ways.

First, it offers a framework to make sense of the present and future open data ecosystem. Acknowledging that data re-use is the result of many stages, the toolkit separates each stage, identifying the ways the data lifecycle plays into data collaboration, the way data collaboration plays into the production of insights, the way insights play into conditions that enable further collaboration, and so on. By understanding the processes by which data is created and used, data stewards can promote better and more impactful data management.

Third Wave Framework

Second, the toolkit offers eight primers showing how data stewards can operationalize the actions previously identified as being part of the third wave. Each primer includes a brief explanation of what each action entails, offers some specific ways data stewards can implement these actions, and lists some supplementary pieces that might be useful in this work. The primers, which are available as part of the toolkit and as standalone two-pagers, are…(More)”.

Open Data Day 2021: How to unlock its potential moving forward?


Stefaan Verhulst, Andrew Young, and Andrew Zahuranec at Data and Policy: “For over a decade, data advocates have reserved one day out of the year to celebrate open data. Open Data Day 2021 comes at a time of unprecedented upheaval. As the world remains in the grip of COVID-19, open data researchers and practitioners must confront the challenge of how to use open data to address the types of complex, emergent challenges that are likely to define the rest of this century (and beyond). Amid threats like the ongoing pandemic, climate change, and systemic poverty, there is renewed pressure to find ways that open data can solve complex social, cultural, economic and political problems.

Over the past year, the Open Data Policy Lab, an initiative of The GovLab at NYU’s Tandon School of Engineering, held several sessions with leaders of open data from around the world. Over the course of these sessions, which we called the Summer of Open Data, we studied various strategies and trends, and identified future pathways for open data leaders to pursue. The results of this research suggest an emergent Third Wave of Open Data— one that offers a clear pathway for stakeholders of all types to achieve Open Data Day’s goal of “showing the benefits of open data and encouraging the adoption of open data policies in government, business, and civil society.”

The Third Wave of Open Data is central to how data is being collected, stored, shared, used, and reused around the world. In what follows, we explain this notion further, and argue that it offers a useful rubric through which to take stock of where we are — and to consider future goals — as we mark this latest iteration of Open Data Day.

The Past and Present of Open Data

The history of open data can be divided into several waves, each reflecting the priorities and values of the era in which they emerged….(More)”.

The Three Waves of Open Data

Covid-19 Data Cards: Building a Data Taxonomy for Pandemic Preparedness


Open Data Charter: “…We want to initiate the repair of the public’s trust through the building of a Pandemic Data Taxonomy with you — a network of data users and practitioners.

Building on feedback we got from our call to identify high value Open COVID-19 Data, we have structured a set of data cards, including key data types related to health issues, legal and socioeconomic impacts and fiscal transparency, within which are well-defined data models and dictionaries. Our target audience for this data taxonomy is governments. We are hoping this framework is a starting point towards building greater consistency around pandemic data release, and flags areas for better cooperation and standardisation within and between our governments and communities around the world.

We hope that together, with the input and feedback from a diverse group of data users and practitioners, we can have at the end of this public consultation and open-call, a document by a global collective, one that we can present to governments and public servants for their buy-in to reform our data infrastructures to be better prepared for future outbreaks.

In order to analyze the variables necessary to manage and investigate the different aspects of a pandemic, as exemplified by COVID-19, and based on a review of the type of data being released by 25 countries — we categorised the data in 4 major categories:

  • General — Contains the general concepts that all the files have in common and are defined, such as the METADATA, global sections of RISKS and their MITIGATION and the general STANDARDS required for the use, management and publication of the data. Then, a link to a spreadsheet, where more details of the precision, update frequency, publication methods and specific standards of each data set are defined.
  • Health Data — Describes how to manage and potentially publish the follow-up information on COVID-19 cases, considering data with temporal, geographical and demographic distribution along with the details for the study of the evolution of the disease.
  • Legal and Socioeconomic Impact Data — Contains the regulations, actions, measures, restrictions, protocols, documents and all the information regarding quarantine and the socio-economic impact as well as medical, labor or economic regulations for each data publisher.
  • Fiscal Data — Contains all budget allocations in accordance with the overall approved Pandemic budget, as well as the implemented adjustments. It also identifies specific allocations for facing prevention, detection, control, treatment and containment of the virus, as well as possible budget reallocations from other sectors or items derived from the actions mentioned above or by the derived economic constraints. It’s based on the recommendations made by GIFT and Open Contracting….(More)”
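As a rough illustration of how the four categories above might be represented in code, the following minimal Python sketch models each data card as a small record. The class name `DataCard`, the field names, the helper `find_card`, and the one-line summaries are hypothetical and chosen only for illustration; they are not taken from the Open Data Charter's own data models or dictionaries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCard:
    """Hypothetical record for one category of the pandemic data taxonomy."""
    category: str
    summary: str


# The four categories named in the excerpt; the summaries are paraphrases,
# not the Open Data Charter's official wording.
TAXONOMY = (
    DataCard("General", "Shared metadata, risks and mitigation, general standards"),
    DataCard("Health Data", "Follow-up information on COVID-19 cases"),
    DataCard("Legal and Socioeconomic Impact Data",
             "Regulations, measures and socio-economic impact"),
    DataCard("Fiscal Data", "Budget allocations, adjustments and reallocations"),
)


def find_card(category: str) -> DataCard:
    """Return the card whose category matches the given name, ignoring case."""
    for card in TAXONOMY:
        if card.category.lower() == category.lower():
            return card
    raise KeyError(category)


if __name__ == "__main__":
    for card in TAXONOMY:
        print(f"{card.category}: {card.summary}")
```

A real implementation would presumably attach the detailed data models, update frequencies and standards described in the linked spreadsheet; none of that detail is reproduced here.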

Inside the ‘Wikipedia of Maps,’ Tensions Grow Over Corporate Influence


Corey Dickinson at Bloomberg: “What do Lyft, Facebook, the International Red Cross, the U.N., the government of Nepal and Pokémon Go have in common? They all use the same source of geospatial data: OpenStreetMap, a free, open-source online mapping service akin to Google Maps or Apple Maps. But unlike those corporate-owned mapping platforms, OSM is built on a network of mostly volunteer contributors. Researchers have described it as the “Wikipedia for maps.”

Since it launched in 2004, OpenStreetMap has become an essential part of the world’s technology infrastructure. Hundreds of millions of monthly users interact with services derived from its data, from ridehailing apps, to social media geotagging on Snapchat and Instagram, to humanitarian relief operations in the wake of natural disasters. 

But recently the map has been changing, due to the growing impact of private sector companies that rely on it. In a 2019 paper published in the ISPRS International Journal of Geo-Information, a cross-institutional team of researchers traced how Facebook, Apple, Microsoft and other companies have gained prominence as editors of the map. Their priorities, the researchers say, are driving significant change to what is being mapped compared to the past.

“OpenStreetMap’s data is crowdsourced, which has always made spectators to the project a bit wary about the quality of the data,” says Dipto Sarkar, a professor of geoscience at Carleton University in Ottawa, and one of the paper’s co-authors. “As the data becomes more valuable and is used for an ever-increasing list of projects, the integrity of the information has to be almost perfect. These companies need to make sure there’s a good map of the places they want to expand in, and nobody else is offering that, so they’ve decided to fill it in themselves.”…(More)”.

How public should science be?


Discussion Report by Edel, A., Kübler: “Since the outbreak of the COVID-19 pandemic, the question of what role science should play in political discourse has moved into the focus of public interest with unprecedented vehemence. In addition to governments directly consulting individual virologists or (epidemiological) research institutes, major scientific institutions such as the German National Academy of Sciences Leopoldina and the presidents of four non-university research organisations have actively participated in the discussion by providing recommendations. More than ever before, scientific problem descriptions, data and evaluations are influencing political measures. It seems as if the relationship between science, politics and the public is currently being reassessed.

The current crisis situation has not created a new phenomenon but has only reinforced the trend of mutual reliance between science, politics and the public, which has been observed for some time. Decision-makers in the political arena and in business were already looking for ways to better substantiate and legitimise their decisions through external scientific expertise when faced with major societal challenges, for example when trying to deal with increasing immigration, climate protection and when preparing for far-reaching reforms (e.g. of the labour market or the pension system) or in economic crises. Research is also held in high esteem within society. The special edition of the ‘Science Barometer’ was able to demonstrate in the surveys an increased trust in science in the case of the current COVID-19 pandemic. Conversely, scientists have always been and continue to be active in the public sphere. For some time now, research experts have frequently been guests on talk shows. Authors from the field of science often write opinion pieces and guest contributions in daily newspapers and magazines. However, this role of research is by no means uncontroversial….(More)”.

Scholarly publishing needs regulation


Essay by Jean-Claude Burgelman: “The world of scientific communication has changed significantly over the past 12 months. Understandably, the amazing mobilisation of research and scholarly publishing in an effort to mitigate the effects of Covid-19 and find a vaccine has overshadowed everything else. But two other less-noticed events could also have profound implications for the industry and the researchers who rely on it.

On 10 January 2020, Taylor and Francis announced its acquisition of one of the most innovative small open-access publishers, F1000 Research. A year later, on 5 January 2021, another of the big commercial scholarly publishers, Wiley, paid nearly $300 million for Hindawi, a significant open-access publisher in London.

These acquisitions come alongside rapid change in publishers’ functions and business models. Scientific publishing is no longer only about publishing articles. It’s a knowledge industry—and it’s increasingly clear it needs to be regulated like one.

The two giant incumbents, Springer Nature and Elsevier, are already a long way down the road to open access, and have built up impressive in-house capacity. But Wiley, and Taylor and Francis, had not. That’s why they decided to buy young open-access publishers. Buying up a smaller, innovative competitor is a well-established way for an incumbent in any industry to expand its reach, gain the ability to do new things and reinvent its business model—it’s why Facebook bought WhatsApp and Instagram, for example.

New regulatory approach

To understand why this dynamic demands a new regulatory approach in scientific publishing, we need to set such acquisitions alongside a broader perspective of the business’s transformation into a knowledge industry. 

Monopolies, cartels and oligopolies in any industry are a cause for concern. By reducing competition, they stifle innovation and push up prices. But for science, the implications of such a course are particularly worrying. 

Science is a common good. Its products—and especially its spillovers, the insights and applications that cannot be monopolised—are vital to our knowledge societies. This means that having four companies control the worldwide production of car tyres, as they do, has very different implications to an oligopoly in the distribution of scientific outputs. The latter situation would give the incumbents a tight grip on the supply of knowledge.

Scientific publishing is not yet a monopoly, but Europe at least is witnessing the emergence of an oligopoly, in the shape of Elsevier, Springer Nature, Wiley, and Taylor and Francis. The past year’s acquisitions have left only two significant independent players in open-access publishing—Frontiers and MDPI, both based in Switzerland….(More)”.

An Open Data Team Experiments with a New Way to Tell City Stories


Article by Sean Finnan: “Can you see me?” says Mark Linnane, over Zoom, as he walks around a plastic structure on the floor of an office at Maynooth University. “That gives you some sense of the size of it. It’s 3.5 metres by 2.”

Linnane trails his laptop’s webcam over the surface of the off-white 3D model, giving a birds-eye view of tens of thousands of tiny buildings, the trails of roads and the clear pathway of the Liffey.

This replica of the heart of the city from Phoenix Park to Dublin Port was created to scale by the university’s Building City Dashboards team, using data from the Ordnance Survey Ireland.

In the five years since they started to grapple with the question of how to present data about the city in an engaging and accessible way, the team has experimented with virtual reality, and augmented reality – and most recently, with this new form of mapping, which blends the lego-like miniature of Dublin’s centre with changeable data projected on.

This could really come into its own as a public exhibit if they start to tell meaningful data-driven and empirical stories, says Linnane, a digital exhibition developer at Maynooth University.

Stories that are “relevant in terms of the everyday daily lives of people who will be coming to see it”, he says.

Layers of Meaning

Getting the projector that throws the visualisations onto the model to work right was Linnane’s job, he says.

He had to mesh the Ordnance Survey data with others that showed building heights for example. “Every single building down to the sheds in someone’s garden have a unique identifier,” says Linnane.

Projectors are built to project onto flat surfaces and not 3D models so that had to be finessed, too, he says. “Every step on the way was a new development. There wasn’t really a process there before.”

The printed 3D model shows 7km by 4km of Dublin and 122,355 structures, says Linnane. That includes bigger buildings but also small outbuildings, railway platforms, public toilets and glasshouses – all mocked up and serving as a canvas for a kaleidoscope of data.

“We’re just projecting data on to it and seeing what’s going on with that,” says Rob Kitchin, principal investigator at Maynooth University’s Programmable City project….(More)”

Image of model courtesy of Mark Linnane.

When FOIA Goes to Court: 20 Years of Freedom of Information Act Litigation by News Organizations and Reporters


Report by The FOIA Project: “The news media are powerful players in the world of government transparency and public accountability. One important tool for ensuring public accountability is invoking the transparency mandates provided by the Freedom of Information Act (FOIA). In 2020, news organizations and individual reporters filed 122 different FOIA suits to compel disclosure of federal government records, more than in any year on record according to federal court data back to 2001 analyzed by the FOIA Project.

In fact, the media alone have filed a total of 386 FOIA cases during the four years of the Trump Administration, from 2017 through 2020. This is greater than the total of 311 FOIA media cases filed during the sixteen years of the Bush and Obama Administrations combined. Moreover, many of these FOIA cases were the very first FOIA cases filed by members of the news media. Almost as many new FOIA litigators filed their first case in court in the past four years (178 from 2017 to 2020) as in the years 2001 to 2016, when 196 FOIA litigators filed their first case. Reporters made up the majority of these. During the past four years, more than four out of five first-time litigators were individual reporters. The ranks of FOIA litigators thus expanded considerably during the Trump Administration, with more reporters challenging agencies in court for failing to provide records they are seeking, either alone or with their news organizations.

Using the FOIA Project’s unique dataset of FOIA cases filed in federal court, this report provides unprecedented and valuable insight into the rapid growth of media lawsuits designed to make the government more transparent and accountable to the public. The complete, updated list of news media cases, along with the names of organizations and reporters who filed these suits, is available on the News Media List at FOIAProject.org. Figure 1 shows the total number of FOIA cases filed by the news media each year. Counts are available in Appendix Table 1 at the end of this report….(More)”.

Figure 1. Freedom of Information Act (FOIA) Cases Filed by News Organizations and Reporters in Federal Court, 2001–2020.
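The comparison in the excerpt above can be made concrete with some simple arithmetic. The sketch below uses only the totals quoted there (386 media cases and 178 first-time litigators over the four years 2017 to 2020, against 311 cases and 196 first-time litigators over the sixteen years 2001 to 2016) to compute average filings per year. The function and variable names are illustrative, and the per-year averages are derived here rather than stated in the report.

```python
def per_year(total: int, years: int) -> float:
    """Average count per year over a period of the given length."""
    return total / years


# Totals quoted in the excerpt.
recent_cases = per_year(386, 4)     # 2017 through 2020
earlier_cases = per_year(311, 16)   # 2001 through 2016

recent_first_timers = per_year(178, 4)
earlier_first_timers = per_year(196, 16)

# How many times higher the recent yearly averages are.
case_ratio = recent_cases / earlier_cases
first_timer_ratio = recent_first_timers / earlier_first_timers

if __name__ == "__main__":
    print(f"Media FOIA cases per year: {recent_cases:.1f} vs {earlier_cases:.1f}")
    print(f"First-time litigators per year: "
          f"{recent_first_timers:.1f} vs {earlier_first_timers:.1f}")
    print(f"Ratios: {case_ratio:.1f}x and {first_timer_ratio:.1f}x")
```

On these figures, media FOIA filings averaged roughly five times as many cases per year in 2017 to 2020 as in the preceding sixteen years. These are simple averages, so they say nothing about year-to-year variation within each period.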

Can open data increase younger generations’ trust in democratic institutions? A study in the European Union


Paper by Nicolás Gonzálvez-Gallego and Laura Nieto-Torrejón: “Scholars and policy makers are giving increasing attention to how young people are involved in politics and their confidence in the current democratic system. In a context of a global trust crisis in the European Union, this paper examines if open government data, a promising governance strategy, may help to boost Millennials’ and Generation Z’s trust in public institutions and satisfaction with public outcomes. First, results from our preliminary analysis challenge some popular beliefs by revealing that younger generations tend to trust in their institutions notably more than the rest of the European citizens. In addition, our findings show that open government data is a trust-enabler for Millennials and Generation Z, not only through a direct link between both, but also thanks to the mediator role of citizens’ satisfaction. Accordingly, public officers are encouraged to spread the implementation of open data strategies as a way to improve younger generations’ attachment to democratic institutions….(More)”.