Open data: The building block of 21st century (open) science


Paper by Corina Pascu and Jean-Claude Burgelman: “Given this irreversibility of data-driven and reproducible science and the role machines will play in it, it is foreseeable that the production of scientific knowledge will be more like a constant flow of updated data-driven outputs, rather than a unique publication/article of some sort. Indeed, the future of scholarly publishing will be based more on the publication of data/insights, with the article as a narrative.

For open data to be valuable, reproducibility is a sine qua non (King, 2011; Piwowar, Vision and Whitlock, 2011) and—equally important, as most of the societal grand challenges require several sciences to work together—essential for interdisciplinarity.

This trend correlates with an epistemic shift already observable in the rationale of science: from demonstrating the absolute truth via a unique narrative (article or publication), to reaching the best possible understanding of what is needed at that moment to move forward in the production of knowledge to address problem “X” (de Regt, 2017).

Science in the 21st century will thus be more “liquid”: enabled by open science and data practices, supported or even co-produced by artificial intelligence (AI) tools and services, and hence a continuous flow of knowledge produced and used by (mainly) machines and people. In this paradigm, an article will be the “atomic” entity and often the least important output of the knowledge stream and scholarship production. Publishing will offer, in the first place, a platform where all parts of the knowledge stream are made available as such via peer review.

The new frontier in open science, and where most future revenue will be made, will be value-added data services (such as mining, intelligence, and networking) for people and machines. The use of AI is on the rise in society, but also in all aspects of research and science: what can be put into an algorithm will be; machines and deep learning add the factor “X.”

AI services for science are already being developed across the research process: data discovery, analysis, and knowledge extraction from research artefacts are accelerated with the use of AI. AI technologies also help to maximize the efficiency of the publishing process and to make peer review more objective (Table 1).

Table 1. Examples of AI services for science already being developed

Abbreviation: AI, artificial intelligence.

Source: Authors’ research based on public sources, 2021.

Ultimately, actionable knowledge and translation of its benefits to society will be handled by humans in the “machine era” for decades to come. But as computers are indispensable research assistants, we need to make what we publish understandable to them.

The availability of data that are “FAIR by design” (findable, accessible, interoperable, and reusable) and shared Application Programming Interfaces (APIs) will allow new ways of collaboration between scientists and machines to make the best use of research digital objects of any kind. The more FAIR data resources become available, the more it will be possible to use AI to extract and analyze valuable new information. The main challenge is to master the interoperability and quality of research data…(More)”.
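To make the idea of data that are “FAIR by design” concrete, the sketch below builds a schema.org `Dataset` record as JSON-LD, the kind of machine-readable metadata that makes a dataset findable and reusable by both people and AI services. The dataset title, creator, and DOI here are placeholders invented for illustration, and the field selection is a minimal assumption, not a complete FAIR implementation.

```python
import json

def make_dataset_metadata(title, creators, doi, license_url,
                          distribution_url, media_type):
    """Build a minimal schema.org Dataset record as JSON-LD.

    Findable:      title, creators, and a DOI-based identifier.
    Accessible:    a distribution URL for the data file.
    Interoperable: a standard media type and the schema.org vocabulary.
    Reusable:      an explicit licence.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": title,
        "creator": [{"@type": "Person", "name": c} for c in creators],
        "identifier": f"https://doi.org/{doi}",
        "license": license_url,
        "distribution": {
            "@type": "DataDownload",
            "contentUrl": distribution_url,
            "encodingFormat": media_type,
        },
    }

# Hypothetical dataset; the title, creator, DOI, and URL are placeholders.
record = make_dataset_metadata(
    title="Example open research dataset",
    creators=["A. Researcher"],
    doi="10.1234/example",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    distribution_url="https://example.org/data/dataset.csv",
    media_type="text/csv",
)
print(json.dumps(record, indent=2))
```

Because the record is plain JSON-LD, it can be embedded in a landing page or served from an API, which is what allows machines, not just human readers, to discover and interpret the dataset.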

State of Open Data Policy Repository


The GovLab: “To accompany its State of Open Data Policy Summit, the Open Data Policy Lab announced the release of a new resource to assess recent policy developments surrounding open data, data reuse, and data collaboration around the world: State of Open Data Repository of Recent Developments.

This document examines recent legislation, directives, and proposals that affect open data and data collaboration. Its goal is to capture signals of concern, direction, and leadership, so as to determine what stakeholders may focus on in the future. The review has so far surfaced approximately 50 examples of recent legislative acts, proposals, directives, and other policy documents, from which the Open Data Policy Lab draws findings about the need to promote more innovative policy frameworks.

This collection demonstrates that, while there is growing interest in open data and data collaboration, policy development still remains nascent and focused on open data repositories at the expense of other collaborative arrangements. As we indicated in our report on the Third Wave of Open Data, there is an urgent need for governance frameworks at the local, regional, and national level to facilitate responsible reuse…(More)”.

(When) Do Open Budgets Transform Lives? Progress and Next Steps in Fiscal Openness Research


Paper by Xiao Hui Tai, Shikhar Mehra & Joshua E. Blumenstock: “This paper documents the rapidly growing empirical literature that can plausibly claim to identify causal effects of transparency or participation in budgeting in a variety of contexts. Recent studies convincingly demonstrate that the power of audits travels well beyond the context of initial field-defining studies, consider participatory budgeting beyond Brazil, where such practices were pioneered, and examine previously neglected outcomes, notably revenues and procurement. Overall, the study of the impacts of fiscal openness has become richer and more nuanced. The most well-documented causal effects are positive: lower corruption and enhanced accountability at the ballot box. Moreover, these impacts have been shown to apply across different settings. This research concludes that the empirical case for open government in this policy area is rapidly growing in strength. This paper sets out challenges related to studying national-level reforms; working directly with governments; evaluating systems as opposed to programs; clarifying the relationship between transparency and participation; and understanding trade-offs for reforms in this area….(More)”.

Global Data Barometer


Report and Site by the Global Data Barometer: “This report provides an overview of the Global Data Barometer findings. The Barometer includes 39 primary indicators, and over 500 sub-questions, covering 109 countries (delivering more than 60,000 data points in total). In this report, we select just a few of these to explore, providing a non-exhaustive overview of some of the topics that could be explored further using Barometer data.

  • Section 1 provides a short overview of the key concepts used in the Barometer, and a short description of the methodology.
  • Section 2 looks at the four key pillars of the Barometer (governance, capability, availability and use), and provides headlines from each.
  • Section 3 provides a regional analysis, drawing on insights from Barometer regional hubs to understand the unique context of each region, and the relative strengths and weaknesses of countries.
  • Section 4 provides a short summary of learning from the first edition, and highlights directions for future work.

The full methodology, and details of how to access and work further with Barometer data, are contained in Appendices…(More)”.

In potentially seismic shift, Government could release almost all advice to ministers


Article by Henry Cooke (New Zealand): “The Government is considering proactively releasing almost all advice to ministers under a planned shakeup of transparency rules, a change that, if made, would amount to a seismic shift in the way the public sector communicates.

Open government advocates have cautiously welcomed the planned move, but say the devil will be in the detail – as the proactive release regime could end up defanging the Official Information Act (OIA).

The Public Service Commission is consulting with government departments and agencies on a proposal to release to the public all briefings and other advice given to ministers – unless there is a compelling reason not to, such as national security or breaching a commercial agreement, according to a person with knowledge of the discussions.

Currently, the Government proactively releases all Cabinet papers within 30 working days of a decision being made, but it does not release the advice that underpins those decisions. The Cabinet papers can also be redacted entirely or in part if the Government believes there is a good reason to do so.

Some advice is proactively released by individual agencies, but there is no uniform rule requiring it, nor any centralised repository. In practice, much of it is released only after the media or the opposition requests a copy under the OIA.

Under the new regime, all ministerial advice would be released without waiting for it to be requested, although the timeframe is not yet clear.

Ministers would also have to proactively release the titles of their briefings on a regular basis, meaning any advice that was not released could be requested under the OIA.

The Public Service Commission – which oversees the sprawling public sector – is also exploring options for a single point of access for these documents, instead of it being spread over many different websites….(More)”.

Taking Transparency to the Next Level


Blog by USAID: “In order for us all to work better together, foreign assistance data — how and where the U.S. government invests our foreign assistance dollars — must be easily, readily, and freely available to the public, media, and our international partners.

To uphold these core values of transparency and openness, USAID and the U.S. Department of State jointly re-launched ForeignAssistance.gov.

This one-stop shop helps the American taxpayer and other stakeholders understand the depth and breadth of the U.S. Government’s work in international development and humanitarian assistance, so that how much we invest, and where and when we invest it, is easier to access, use, and understand.

The new ForeignAssistance.gov provides a wealth of global information as well as specific details for individual countries.

The new, consolidated ForeignAssistance.gov is a visual, interactive website that advances transparency by publishing U.S. foreign assistance budget and financial data that is usable, accurate, and timely. The site empowers users to explore U.S. foreign assistance data through visualizations, while also providing the flexibility for users to create custom queries, download data, and conduct analyses by country, sector, or agency…(More)”.

Guns, Privacy, and Crime


Paper by Alessandro Acquisti & Catherine Tucker: “Open government holds the promise of a government that is both more efficient and more accountable and transparent. It is not clear, however, how transparent information about citizens and their interactions with government affects the welfare of those citizens, and if so in what direction. We investigate this by using as a natural experiment the effect of the online publication of the names and addresses of holders of handgun carry permits on criminals’ propensity to commit burglaries. In December 2008, a Memphis, TN newspaper published a searchable online database of names, zip codes, and ages of Tennessee handgun carry permit holders. We use detailed crime and handgun carry permit data for the city of Memphis to estimate the impact of publicity about the database on burglaries. We find that, after the database was publicized, burglaries increased in zip codes with fewer gun permits and decreased in those with more gun permits….(More)”

Transparency of open data ecosystems in smart cities: Definition and assessment of the maturity of transparency in 22 smart cities


Paper by Martin Lnenicka et al: “This paper focuses on the issue of the transparency maturity of open data ecosystems seen as the key for the development and maintenance of sustainable, citizen-centered, and socially resilient smart cities. This study inspects smart cities’ data portals and assesses their compliance with transparency requirements for open (government) data. The expert assessment of 34 portals representing 22 smart cities, with 36 features, allowed us to rank them and determine their level of transparency maturity according to four predefined levels of maturity – developing, defined, managed, and integrated. In addition, recommendations for identifying and improving the current maturity level and specific features have been provided. An open data ecosystem in the smart city context has been conceptualized, and its key components were determined. Our definition considers the components of the data-centric and data-driven infrastructure using the systems theory approach. We have defined five predominant types of current open data ecosystems based on prevailing data infrastructure components. The results of this study should contribute to the improvement of current data ecosystems and build sustainable, transparent, citizen-centered, and socially resilient open data-driven smart cities…(More)”.
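The paper's assessment maps each portal's compliance with 36 transparency features onto four maturity levels (developing, defined, managed, integrated). A minimal sketch of such a mapping is below; the percentage thresholds are illustrative assumptions of ours, not the paper's actual scoring rubric.

```python
def maturity_level(feature_scores, thresholds=(0.25, 0.5, 0.75)):
    """Map a portal's share of satisfied transparency features (1 = satisfied,
    0 = not) onto the four maturity levels named in the paper.

    The threshold cut-offs here are hypothetical; the paper's rubric may
    weight features or use different boundaries.
    """
    share = sum(feature_scores) / len(feature_scores)
    if share < thresholds[0]:
        return "developing"
    if share < thresholds[1]:
        return "defined"
    if share < thresholds[2]:
        return "managed"
    return "integrated"

# A hypothetical portal satisfying 30 of the 36 assessed features:
print(maturity_level([1] * 30 + [0] * 6))  # prints "integrated" (30/36 ≈ 0.83)
```

Encoding the rubric this way also makes the ranking of the 34 portals reproducible: the same feature checklist yields the same level for every assessor.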

Time to recognize authorship of open data


Nature Editorial: “At times, it seems there’s an unstoppable momentum towards the principle that data sets should be made widely available for research purposes (also called open data). Research funders all over the world are endorsing the open data-management standards known as the FAIR principles (which ensure data are findable, accessible, interoperable and reusable). Journals are increasingly asking authors to make the underlying data behind papers accessible to their peers. Data sets are accompanied by a digital object identifier (DOI) so they can be easily found. And this citability helps researchers to get credit for the data they generate.
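The citability the editorial describes rests on the DOI plus a handful of metadata fields. A minimal sketch of rendering those fields as a DataCite-style citation string is below; the creators, year, title, and DOI in the example are placeholders invented for illustration.

```python
def format_data_citation(creators, year, title, publisher, doi):
    """Render a dataset citation in the common DataCite-style form:
    Creator(s) (Year). Title. Publisher. https://doi.org/DOI
    """
    names = "; ".join(creators)
    return f"{names} ({year}). {title}. {publisher}. https://doi.org/{doi}"

# Hypothetical dataset; all fields below are placeholders.
citation = format_data_citation(
    creators=["Researcher, A.", "Researcher, B."],
    year=2022,
    title="Example open research dataset",
    publisher="Zenodo",
    doi="10.5281/zenodo.0000000",
)
print(citation)
```

Because the DOI resolves to the dataset's landing page, a citation built this way gives data generators the same traceable credit trail that journal articles already enjoy.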

But reality sometimes tells a different story. The world’s systems for evaluating science do not (yet) value openly shared data in the same way that they value outputs such as journal articles or books. Funders and research leaders who design these systems accept that there are many kinds of scientific output, but many reject the idea that there is a hierarchy among them.

In practice, those in powerful positions in science tend not to regard open data sets in the same way as publications when it comes to making hiring and promotion decisions or awarding memberships to important committees, or in national evaluation systems. The open-data revolution will stall unless this changes….

Universities, research groups, funding agencies and publishers should, together, start to consider how they could better recognize open data in their evaluation systems. They need to ask: how can those who have gone the extra mile on open data be credited appropriately?

There will always be instances in which researchers cannot be given access to human data. Data from infants, for example, are highly sensitive and need to pass stringent privacy and other tests. Moreover, making data sets accessible takes time and funding that researchers don’t always have. And researchers in low- and middle-income countries have concerns that their data could be used by researchers or businesses in high-income countries in ways that they have not consented to.

But crediting all those who contribute their knowledge to a research output is a cornerstone of science. The prevailing convention — whereby those who make their data open for researchers to use make do with acknowledgement and a citation — needs a rethink. As long as authorship on a paper is significantly more valued than data generation, this will disincentivize making data sets open. The sooner we change this, the better….(More)”.

Access Rules: Freeing Data from Big Tech for a Better Future


Book by Viktor Mayer-Schönberger and Thomas Ramge: “Information is power, and the time is now for digital liberation. Access Rules mounts a strong and hopeful argument for how informational tools at present in the hands of a few could instead become empowering machines for everyone. By forcing data-hoarding companies to open access to their data, we can reinvigorate both our economy and our society. Authors Viktor Mayer-Schönberger and Thomas Ramge contend that if we disrupt monopoly power and create a level playing field, digital innovations can emerge to benefit us all.

Over the past twenty years, Big Tech has managed to centralize the most relevant data on their servers, as data has become the most important raw material for innovation. However, dominant oligopolists like Facebook, Amazon, and Google, in contrast with their reputation as digital pioneers, are actually slowing down innovation and progress by withholding data for the benefit of their shareholders––at the expense of customers, the economy, and society. As Access Rules compellingly argues, ultimately it is up to us to force information giants, wherever they are located, to open their treasure troves of data to others. In order for us to limit global warming, contain a virus like COVID-19, or successfully fight poverty, everyone—including citizens and scientists, start-ups and established companies, as well as the public sector and NGOs—must have access to data. When everyone has access to the informational riches of the data age, the nature of digital power will change. Information technology will find its way back to its original purpose: empowering all of us to use information so we can thrive as individuals and as societies….(More)”.