Expert Group to Eurostat releases its report on the re-use of privately-held data for Official Statistics

Blog by Stefaan Verhulst: “…To inform its efforts, Eurostat set up an expert group in 2021 on ‘Facilitating the use of new data sources for official statistics’ to reflect on opportunities offered by the data revolution to enhance the reuse of private sector data for official statistics”.

Data reuse is a particularly important area for exploration, both because of the potential it offers and because it is not sufficiently covered by current policies. Data reuse occurs when data collected for one purpose is shared and reused for another, often with resulting social benefit. Currently, this process is limited by a fragmented and outdated policy and regulatory framework, and by often legitimate concerns over the ethical challenges posed by sharing (e.g., threats to individual privacy).

Nonetheless, despite such hurdles, a wide variety of evidence supports the idea that responsible data reuse can strengthen and supplement official statistics, and potentially lead to lasting and positive social impact.

Having reviewed and deliberated about these issues over several months, the expert group issued its report this week entitled “Empowering society by reusing privately held data for official statistics”. It seeks to develop recommendations and a framework for sustainable data reuse in the production of official statistics. It highlights regulatory gaps, fragmentation of practices, and a lack of clarity regarding businesses’ rights and obligations, and it draws attention to the ways in which current efforts to reuse data have often led to ad-hoc, one-off projects rather than systematic transformation.

The report considers a wide variety of evidence, including historical, policy, and academic research, as well as the theoretical literature… (More)”.

Many researchers say they’ll share data — but don’t

Article by Clare Watson: “Most biomedical and health researchers who declare their willingness to share the data behind journal articles do not respond to access requests or hand over the data when asked, a study reports.

Livia Puljak, who studies evidence-based medicine at the Catholic University of Croatia in Zagreb, and her colleagues analysed 3,556 biomedical and health science articles published in a month by 282 BMC journals. (BMC is part of Springer Nature, the publisher of Nature; Nature’s news team is editorially independent of its publisher.)

The team identified 381 articles with links to data stored in online repositories and another 1,792 papers for which the authors indicated in statements that their data sets would be available on reasonable request. The remaining studies stated that their data were in the published manuscript and its supplements, or generated no data, so sharing did not apply.

But of the 1,792 manuscripts for which the authors stated they were willing to share their data, more than 90% of corresponding authors either declined or did not respond to requests for raw data (see ‘Data-sharing behaviour’). Only 14%, or 254, of the contacted authors responded to e-mail requests for data, and a mere 6.7%, or 120 authors, actually handed over the data in a usable format. The study was published in the Journal of Clinical Epidemiology on 29 May.
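The figures above are internally consistent, as a quick arithmetic sketch shows (the split between responders who declined and those who shared is inferred from the counts quoted, not stated separately in the excerpt):

```python
# Sanity-check of the study's reported percentages (a sketch; the
# responder/decliner breakdown is inferred from the figures quoted above).
willing = 1792          # authors whose statements promised data on request
responded = 254         # corresponding authors who answered the e-mail request
shared = 120            # authors who actually handed over usable data

pct_responded = responded / willing * 100   # ~14%
pct_shared = shared / willing * 100         # ~6.7%

# "More than 90% either declined or did not respond": non-responders
# plus responders who did not hand over the data.
no_response = willing - responded           # 1,538 authors
responded_no_data = responded - shared      # 134 authors
pct_no_data = (no_response + responded_no_data) / willing * 100

print(f"{pct_responded:.1f}% responded, {pct_shared:.1f}% shared, "
      f"{pct_no_data:.1f}% declined or did not respond")
```

Run as written, this reproduces the article's 14% and 6.7% figures and confirms that well over 90% of authors ultimately provided no data.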

DATA-SHARING BEHAVIOUR: graphic showing the percentage of authors who were willing to share data. Source: Livia Puljak et al.

Puljak was “flabbergasted” that so few researchers actually shared their data. “There is a gap between what people say and what people do,” she says. “Only when we ask for the data can we see their attitude towards data sharing.”

“It’s quite dismaying that [researchers] are not coming forward with the data,” says Rebecca Li, who is executive director of non-profit global data-sharing platform Vivli and is based in Cambridge, Massachusetts…(More)”.

The Intersection of Data, Equity, and City Governments

Blog by Yuki Mitsuda: “The Open Data Policy Lab’s City Incubator program was established in September 2021 to help realize the Third Wave of Open Data at the subnational level by building data capacity among city intrapreneurs. In its first iteration, the program supported innovators from ten cities around the world to better use data to address the opportunities and challenges they face.

Reflecting on the six-month program, we saw that the work enabled participants to meet the needs of their cities and the people within them. The projects also revealed shared themes across cities — common challenges and issues that define urban, data-driven work in the 21st century. This blog explores one of the emerging themes we saw from participants in the City Incubator program: the intersection of equity, data, and city governments…

Three of our city incubator participants designed their data innovations around the ways cities and citizens can use data to measure and improve equity. 

  • Jennifer Bodnarchuk, a Senior Data Scientist at the Innovation & Technology Department in the City of Winnipeg, for example, led the development of a Diversity Dashboard that quantified and visualized their municipal government’s workforce representation. The tool can be used to measure the level of diversity represented in city-wide employment to move towards equitable hiring in the public sector. 
  • Henry Xavier Hernandez, the Chief Information Officer at the Information Technology Department in Guayaquil, Ecuador, and his team leveraged the City Incubator to develop Citizen 360, a public market analysis platform that helps businesses, organizations, and individuals identify economic opportunities in the city. This tool can aid small business owners from all backgrounds who are navigating the journey of starting a new business.
  • Andrea Calderon led Albuquerque’s Equity Index, which helps evaluate the reach of city service distribution with the goal of increasing municipal investment in pockets of the city where equitable city service provision has not yet been achieved. Albuquerque’s Equity Index work entailed assessing air quality in the city through the framework of cumulative impacts, which measures “exposures, public health, or environmental effects from the combined emissions in a geographic area” in pursuit of environmental justice…(More)”.

The Future of Open Data: Law, Technology and Media

Book edited by Pamela Robinson, and Teresa Scassa: “The Future of Open Data flows from a multi-year Social Sciences and Humanities Research Council (SSHRC) Partnership Grant project that set out to explore open government geospatial data from an interdisciplinary perspective. Researchers on the grant adopted a critical social science perspective grounded in the imperative that the research should be relevant to government and civil society partners in the field.

This book builds on the knowledge developed during the course of the grant and asks the question, “What is the future of open data?” The contributors’ insights into the future of open data combine observations from five years of research about the Canadian open data community with a critical perspective on what could and should happen as open data efforts evolve.

Each of the chapters in this book addresses different issues and each is grounded in distinct disciplinary or interdisciplinary perspectives. The opening chapter reflects on the origins of open data in Canada and how it has progressed to the present date, taking into account how the Indigenous data sovereignty movement intersects with open data. A series of chapters address some of the pitfalls and opportunities of open data and consider how the changing data context may impact sources of open data, limits on open data, and even liability for open data. Another group of chapters considers new landscapes for open data, including open data in the global South, the data priorities of local governments, and the emerging context for rural open data…(More)”.

Open data: The building block of 21st century (open) science

Paper by Corina Pascu and Jean-Claude Burgelman: “Given the irreversibility of data-driven and reproducible science, and the role machines will play in it, it is foreseeable that the production of scientific knowledge will become a constant flow of updated, data-driven outputs rather than a unique publication or article of some sort. Indeed, the future of scholarly publishing will be based more on the publication of data and insights, with the article as a narrative.

For open data to be valuable, reproducibility is a sine qua non (King 2011; Piwowar, Vision and Whitlock 2011) and — equally important, as most of the societal grand challenges require several sciences to work together — essential for interdisciplinarity.

This trend correlates with the epistemic shift already observed in the rationale of science: from demonstrating absolute truth via a unique narrative (article or publication), to reaching the best possible understanding of what, at that moment, is needed to move forward in the production of knowledge to address problem “X” (de Regt 2017).

Science in the 21st century will thus be more “liquid,” enabled by open science and data practices and supported or even co-produced by artificial intelligence (AI) tools and services: a continuous flow of knowledge produced and used by (mainly) machines and people. In this paradigm, an article will be the “atomic” entity and often the least important output of the knowledge stream and scholarship production. Publishing will, in the first place, offer a platform where all parts of the knowledge stream are made available as such via peer review.

The new frontier in open science, and where most future revenue will be made, will be value-added data services (such as mining, intelligence, and networking) for people and machines. The use of AI is on the rise in society, but also in all aspects of research and science: what can be put in an algorithm will be put; machines and deep learning add factor “X.”

AI services for science are already being developed along the research process: data discovery and analysis, and knowledge extraction from research artefacts, are accelerated with the use of AI. AI technologies also help to maximize the efficiency of the publishing process and make peer review more objective (Table 1).

Table 1. Examples of AI services for science already being developed. Abbreviation: AI, artificial intelligence. Source: Authors’ research based on public sources, 2021.

Ultimately, actionable knowledge and translation of its benefits to society will be handled by humans in the “machine era” for decades to come. But as computers are indispensable research assistants, we need to make what we publish understandable to them.

The availability of data that are “FAIR by design” and shared Application Programming Interfaces (APIs) will allow new ways of collaboration between scientists and machines to make the best use of research digital objects of any kind. The more findable, accessible, interoperable, and reusable (FAIR) data resources will become available, the more it will be possible to use AI to extract and analyze new valuable information. The main challenge is to master the interoperability and quality of research data…(More)”.

State of Open Data Policy Repository

The GovLab: “To accompany its State of Open Data Policy Summit, the Open Data Policy Lab announced the release of a new resource to assess recent policy developments surrounding open data, data reuse, and data collaboration around the world: State of Open Data Repository of Recent Developments.

This document examines recent legislation, directives, and proposals that affect open data and data collaboration. Its goal is to capture signals of concern, direction, and leadership so as to determine what stakeholders may focus on in the future. The review has so far surfaced approximately 50 examples of recent legislative acts, proposals, directives, and other policy documents, from which the Open Data Policy Lab draws findings about the need to promote more innovative policy frameworks.

This collection demonstrates that, while there is growing interest in open data and data collaboration, policy development remains nascent and focused on open data repositories at the expense of other collaborative arrangements. As we indicated in our report on the Third Wave of Open Data, there is an urgent need for governance frameworks at the local, regional, and national level to facilitate responsible reuse…(More)”.

(When) Do Open Budgets Transform Lives? Progress and Next Steps in Fiscal Openness Research

Paper by Xiao Hui Tai, Shikhar Mehra & Joshua E. Blumenstock: “This paper documents the rapidly growing empirical literature that can plausibly claim to identify causal effects of transparency or participation in budgeting in a variety of contexts. Recent studies convincingly demonstrate that the power of audits travels well beyond the context of initial field-defining studies, consider participatory budgeting beyond Brazil, where such practices were pioneered, and examine previously neglected outcomes, notably revenues and procurement. Overall, the study of the impacts of fiscal openness has become richer and more nuanced. The most well-documented causal effects are positive: lower corruption and enhanced accountability at the ballot box. Moreover, these impacts have been shown to apply across different settings. This research concludes that the empirical case for open government in this policy area is rapidly growing in strength. This paper sets out challenges related to studying national-level reforms; working directly with governments; evaluating systems as opposed to programs; clarifying the relationship between transparency and participation; and understanding trade-offs for reforms in this area….(More)”.

Global Data Barometer

Report and Site by the Global Data Barometer: “This report provides an overview of the Global Data Barometer findings. The Barometer includes 39 primary indicators, and over 500 sub-questions, covering 109 countries (delivering more than 60,000 data points in total). In this report, we select just a few of these to explore, providing a non-exhaustive overview of some of the topics that could be explored further using Barometer data.

  • Section 1 provides a short overview of the key concepts used in the Barometer, and a short description of the methodology.
  • Section 2 looks at the four key pillars of the Barometer (governance, capability, availability and use), and provides headlines from each.
  • Section 3 provides a regional analysis, drawing on insights from Barometer regional hubs to understand the unique context of each region, and the relative strengths and weaknesses of countries.
  • Section 4 provides a short summary of learning from the first edition, and highlights directions for future work.

The full methodology, and details of how to access and work further with Barometer data, are contained in Appendices…(More)”.

In potentially seismic shift, Government could release almost all advice to ministers

Article by Henry Cooke: (New Zealand) “The Government is considering proactively releasing almost all advice to ministers under a planned shake-up of transparency rules that, if carried out, would amount to a seismic shift in the way the public sector communicates.

Open government advocates have cautiously welcomed the planned move, but say the devil will be in the detail – as the proactive release regime could end up defanging the Official Information Act (OIA).

The Public Service Commission is consulting with government departments and agencies on a proposal to release to the public all briefings and other advice given to ministers – unless there is a compelling reason not to, such as national security or breaching a commercial agreement, according to a person with knowledge of the discussions.

Currently, the Government proactively releases all Cabinet papers within 30 working days of a decision being made, but it does not release the advice that underpins those decisions. The Cabinet papers can also be redacted entirely or in part if the Government believes there is a good reason to do so.

Some advice is proactively released by individual agencies, but there is no uniform rule requiring it, nor any centralised repository. In practice, much of it is released only after the media or the opposition requests a copy under the OIA.

The new regime would see all ministerial advice released without waiting for it to be requested, although it is not clear on what timeframe.

Ministers would also have to proactively release the titles of their briefings on a regular basis, meaning any advice that was not released could be requested under the OIA.

The Public Service Commission – which oversees the sprawling public sector – is also exploring options for a single point of access for these documents, instead of it being spread over many different websites….(More)”.

Taking Transparency to the Next Level

Blog by USAID: “In order for us all to work better together, foreign assistance data — how and where the U.S. government invests our foreign assistance dollars — must be easily, readily, and freely available to the public, media, and our international partners.

To uphold these core values of transparency and openness, USAID and the U.S. Department of State jointly re-launched their shared foreign assistance data website.
This one-stop-shop helps the American taxpayer and other stakeholders understand the depth and breadth of the U.S. Government’s work in international development and humanitarian assistance, so that how much we invest and where and when we invest it is easier to access, use, and understand.

The new site provides a wealth of global information as well as specific details for individual countries.

The new, consolidated site is a visual, interactive website that advances transparency by publishing U.S. foreign assistance budget and financial data that are usable, accurate, and timely. The site empowers users to explore U.S. foreign assistance data through visualizations, while also giving them the flexibility to create custom queries, download data, and conduct analyses by country, sector, or agency…(More)”.