Why Does Open Data Get Underused? A Focus on the Role of (Open) Data Literacy

Paper by Gema Santos-Hermosa et al: “Open data has been conceptualised as a strategic form of public knowledge. Tightly connected with the developments in open government and open science, the main claim is that access to open data (OD) might be a catalyser of social innovation and citizen empowerment. Nevertheless, the so-called (open) data divide, as a problem connected to the situation of OD usage and engagement, is a concern.

In this chapter, we introduce the OD usage trends, focusing on the role played by (open) data literacy amongst either users or producers: citizens, professionals, and researchers. Indeed, we attempted to cover the problem of OD through a holistic approach including two areas of research and practice: open government data (OGD) and open research data (ORD). After uncovering several factors blocking OD consumption, we point out that more OD is being published (albeit with low usage), and we overview the research on data literacy. While the intentions of stakeholders are driven by many motivations, the abilities that put them in the condition to enhance OD might require further attention. In the end, we focus on several lifelong learning activities supporting open data literacy, uncovering the challenges ahead to unleash the power of OD in society…(More)”.

AI-Ready Open Data

Explainer by Sean Long and Tom Romanoff: “Artificial intelligence and machine learning (AI/ML) have the potential to create applications that tackle societal challenges from human health to climate change. These applications, however, require data to power AI model development and implementation. Government’s vast amount of open data can fill this gap: McKinsey estimates that open data can help unlock $3 trillion to $5 trillion in economic value annually across seven sectors. But for open data to fuel innovations in academia and the private sector, the data must be both easy to find and use. While Data.gov makes it simpler to find the federal government’s open data, researchers still spend up to 80% of their time preparing data into a usable, AI-ready format. As Intel warns, “You’re not AI-ready until your data is.”

In this explainer, the Bipartisan Policy Center provides an overview of existing efforts across the federal government to improve the AI readiness of its open data. We answer the following questions:

  • What is AI-ready data?
  • Why is AI-ready data important to the federal government’s AI agenda?
  • Where is AI-ready data being applied across federal agencies?
  • How could AI-ready data become the federal standard?…(More)”.

Rethinking the impact of open data: A first step towards a European impact assessment for open data

Report for data.europa.eu: “This report is the first in a series of four that aims to establish a standard methodology for open data impact assessments that can be used across Europe. This exercise is key because a consistent definition of the impact of open data does not exist. The lack of a robust, conceptual foundation has made it more difficult for data portals to demonstrate their value through empirical evidence. It also challenges the EU’s ability to understand and compare performance across Member States. Most academic articles that look to explore the impact of data refer to existing open data frameworks, with the open data maturity (ODM) and open data barometer (ODB) ones most frequently represented. These two frameworks distinguish between different kinds of impact, and both mention social, political and economic impacts in particular. The ODM also includes the environmental impact in its framework.

Sometimes, these frameworks diverge from the European Commission’s own recommendations of how best to measure impact, as explained in specific sections of the better regulation guidelines and the better regulation toolbox. They help to answer a critical question for policymakers: do the benefits provided outweigh the costs of assembling and distributing (open) data? Future reports in this series will further explore how to better align existing frameworks, such as the ODM, with these critically important guidelines…(More)”.

Ready, set, share: Researchers brace for new data-sharing rules

Jocelyn Kaiser and Jeffrey Brainard in Science: “…By 2025, new U.S. requirements for data sharing will extend beyond biomedical research to encompass researchers across all scientific disciplines who receive federal research funding. Some funders in the European Union and China have also enacted data-sharing requirements. The new U.S. moves are feeding hopes that a worldwide movement toward increased sharing is in the offing. Supporters think it could speed the pace and reliability of science.

Some scientists may only need to make a few adjustments to comply with the policies. That’s because data sharing is already common in fields such as protein crystallography and astronomy. But in other fields the task could be weighty, because sharing is often an afterthought. For example, a study involving 7750 medical research papers found that just 9% of those published from 2015 to 2020 promised to make their data publicly available, and authors of just 3% actually shared, says lead author Daniel Hamilton of the University of Melbourne, who described the finding at the International Congress on Peer Review and Scientific Publication in September 2022. Even when authors promise to share their data, they often fail to follow through. Out of 21,000 journal articles that included data-sharing plans, a study published in PLOS ONE in 2020 found, fewer than 21% provided links to the repository storing the data.

Journals and funders, too, have a mixed record when it comes to supporting data sharing. Research presented at the September 2022 peer-review congress found only about half of the 110 largest public, corporate, and philanthropic funders of health research around the world recommend or require grantees to share data…

“Health research is the field where the ethical obligation to share data is the highest,” says Aidan Tan, a clinician-researcher at the University of Sydney who led the study. “People volunteer in clinical trials and put themselves at risk to advance medical research and ultimately improve human health.”

Across many fields of science, researchers’ support for sharing data has increased during the past decade, surveys show. But given the potential cost and complexity, many are apprehensive about the NIH policy, and other requirements to follow. “How we get there is pretty messy right now,” says Parker Antin, a developmental biologist and associate vice president for research at the University of Arizona. “I’m really not sure whether the total return will justify the cost. But I don’t know of any other way to find out than trying to do it.”

Science offers this guide as researchers prepare to plunge in….(More)”.

The State of Open Data Policy Repository

The State of Open Data Policy Repository is a collection of recent policy developments surrounding open data, data reuse, and data collaboration around the world. 

A refinement of compilation of policies launched at the Open Data Policy Summit last year, the State of Open Data Policy Online Repository is an interactive resource that looks at recent legislation, directives, and proposals that affect open data and data collaboration all around the world. It captures what kinds of data collaboration issues policymakers are currently focused on and where the momentum for data innovation is heading in countries around the world.

Users can filter policies according to region, country, focus, and type of data sharing. The review currently surfaced approximately 60 examples of recent legislative acts, proposals, directives, and other policy documents, from which the Open Data Policy Lab draws findings about the need to promote more innovative policy frameworks.

This collection shows that, despite increased interest in the third wave conception of open data, policy development remains nascent. It is primarily concerned with open data repositories at the expense of alternative forms of collaboration. Most policies listed focus on releasing government data and, elsewhere, most nations still don’t have open data rules or a method to put the policies in place. 

This work reveals a pressing need for institutions to create frameworks that can direct data professionals since there are worries that inaction may both allow for misuse of data and lead to missed chances to use data…(More)”.

Commission defines high-value datasets to be made available for re-use

Press Release: “Today, the Commission has published a list of high-value datasets that public sector bodies will have to make available for re-use, free of charge, within 16 months.

Certain public sector data, such as meteorological or air quality data are particularly interesting for creators of value-added services and applications and have important benefits for society, the environment and the economy – which is why they should be made available to the public…

The Regulation is set up under the Open Data Directive, which defines six categories of such high-value datasets: geospatial, earth observation and environment, meteorological, statistics, companies and mobility. This thematic range can be extended at a later stage to reflect technological and market developments. The datasets will be available in machine-readable format, via an Application Programming Interface and, where relevant, as bulk download.

The increased availability of data will boost entrepreneurship and result in the creation of new companies. High-value datasets can be an important resource for SMEs to develop new digital products and services, and therefore also an enabler helping them to attract investors. The re-use of datasets such as mobility or geolocalisation of buildings can open business opportunities for the logistics or transport sectors, as well as improve the efficiency of public service delivery, for example by understanding traffic flows to make transport more efficient. Meteorological observation data, radar data, air quality and soil contamination data can also support research and digital innovation as well as better-informed policymaking, especially in the fight against climate change….(More)”. See also: List of specific high-value datasets

Studying open government data: Acknowledging practices and politics

Paper by Gijs van Maanen: “Open government and open data are often presented as the Asterix and Obelix of modern government—one cannot discuss one, without involving the other. Modern government, in this narrative, should open itself up, be more transparent, and allow the governed to have a say in their governance. The usage of technologies, and especially the communication of governmental data, is then thought to be one of the crucial instruments helping governments achieving these goals. Much open government data research, hence, focuses on the publication of open government data, their reuse, and re-users. Recent research trends, by contrast, divert from this focus on data and emphasize the importance of studying open government data in practice, in interaction with practitioners, while simultaneously paying attention to their political character. This commentary looks more closely at the implications of emphasizing the practical and political dimensions of open government data. It argues that researchers should explicate how and in what way open government data policies present solutions to what kind of problems. Such explications should be based on a detailed empirical analysis of how different actors do or do not do open data. The key question to be continuously asked and answered when studying and implementing open government data is how the solutions openness present latch onto the problem they aim to solve…(More)”.


ResearchDataGov.org is a product of the federal statistical agencies and units, created in response to the Foundations of Evidence-based Policymaking Act of 2018. The site is the single portal for discovery of restricted data in the federal statistical system. The agencies have provided detailed descriptions of each data asset. Users can search for data by topic, agency, and keywords. Questions related to the data should be directed to the owning agency, using the contact information on the page that describes the data. In late 2022, users will be able to apply for access to these data using a single-application process built into ResearchDataGov. ResearchDataGov.org is built by and hosted at ICPSR at the University of Michigan, under contract and guidance from the National Center for Science and Engineering Statistics within the National Science Foundation.

The data described in ResearchDataGov.org are owned by and accessed through the agencies and units of the federal statistical system. Data access is determined by the owning or distributing agency and is limited to specific physical or virtual data enclaves. Even though all data assets are listed in a single inventory, they are not necessarily available for use in the same location(s). Multiple data assets accessed in the same location may not be able to be used together due to disclosure risk and other requirements. Please note the access modality of the data in which you are interested and seek guidance from the owning agency about whether assets can be linked or otherwise used together…(More)”.

A Landscape of Open Science Policies Research

Paper by Alejandra Manco: “This literature review aims to examine the approach given to open science policy in the different studies. The main findings are that the approach given to open science has different aspects: policy framing and its geopolitical aspects are described as an asymmetries replication and epistemic governance tool. The main geopolitical aspects of open science policies described in the literature are the relations between international, regional, and national policies. There are also different components of open science covered in the literature: open data seems much discussed in the works in the English language, while open access is the main component discussed in the Portuguese and Spanish speaking papers. Finally, the relationship between open science policies and the science policy is framed by highlighting the innovation and transparency that open science can bring into it…(More)”

Explore the first Open Science Indicators dataset

Article by Lauren Cadwallader, Lindsay Morton, and Iain Hrynaszkiewicz: “Open Science is on the rise. We can infer as much from the proliferation of Open Access publishing options; the steady upward trend in bioRxiv postings; the periodic rollout of new national, institutional, or funder policies. 

But what do we actually know about the day-to-day realities of Open Science practice? What are the norms? How do they vary across different research subject areas and regions? Are Open Science practices shifting over time? Where might the next opportunity lie and where do barriers to adoption persist? 

To even begin exploring these questions and others like them we need to establish a shared understanding of how we define and measure Open Science practices. We also need to understand the current state of adoption in order to track progress over time. That’s where the Open Science Indicators project comes in. PLOS conceptualized a framework for measuring Open Science practices according to the FAIR principles, and partnered with DataSeer to develop a set of numerical “indicators” linked to specific Open Science characteristics and behaviors observable in published research articles. Our very first dataset, now available for download at Figshare, focuses on three Open Science practices: data sharing, code sharing, and preprint posting…(More)”.