Do disappearing data repositories pose a threat to open science and the scholarly record?


Article by Dorothea Strecker, Heinz Pampel, Rouven Schabinger and Nina Leonie Weisweiler: “Research data repositories, such as Zenodo or the UK Data Archive, are specialised information infrastructures that focus on the curation and dissemination of research data. One of repositories’ main tasks is maintaining their collections long-term, see for example the TRUST Principles, or the requirements of the certification organization CoreTrustSeal. Long-term preservation is also a prerequisite for several data practices that are getting increasing attention, such as data reuse and data citation.

For data to remain usable, the infrastructures that host them also have to be kept operational. However, the long-term operation of research data repositories is challenging, and sometimes, for varying reasons and despite best efforts, they are shut down….

In a recent study we therefore set out to take an infrastructure perspective on the long-term preservation of research data by investigating repositories across disciplines and types that were shut down. We also tried to estimate the impact of repository shutdown on data availability…

We found that repository shutdown was not rare: 6.2% of all repositories listed in re3data were shut down. Since the launch of the registry in 2012, at least one repository has been shut down each year (see Fig. 1). The median age of a repository when shutting down was 12 years…(More)”.

Missing Evidence: Tracking Academic Data Use around the World


World Bank Report: “Data-driven research on a country is key to producing evidence-based public policies. Yet little is known about where data-driven research is lacking and how it could be expanded. This paper proposes a method for tracking academic data use by country of subject, applying natural language processing to open-access research papers. The model’s predictions produce country estimates of the number of articles using data that are highly correlated with a human-coded approach, with a correlation of 0.99. Analyzing more than 1 million academic articles, the paper finds that the number of articles on a country is strongly correlated with its gross domestic product per capita, population, and the quality of its national statistical system. The paper identifies data sources that are strongly associated with data-driven research and finds that availability of subnational data appears to be particularly important. Finally, the paper classifies countries into groups based on whether they could most benefit from increasing their supply of or demand for data. The findings show that the former applies to many low- and lower-middle-income countries, while the latter applies to many upper-middle- and high-income countries…(More)”.

Are we entering a “Data Winter”?


Article by Stefaan G. Verhulst: “In an era where data drives decision-making, the accessibility of data for public interest purposes has never been more crucial. Whether shaping public policy, responding to disasters, or empowering research, data plays a pivotal role in our understanding of complex social, environmental, and economic issues. In 2015, I introduced the concept of Data Collaboratives to advance new and innovative partnerships between the public and private sectors that could make data more accessible for public interest purposes. More recently, I have been advocating for a reimagined approach to data stewardship to make data collaboration more systematic, agile, sustainable, and responsible.

We may be entering a “Data Winter”

Despite many advances toward data stewardship (especially during COVID-19) and despite the creation of several important data collaboratives (e.g., the Industry Data for Society Partnership), the project of opening access to data is proving increasingly challenging. Indeed, unless we step up our efforts in 2024, we may be entering a prolonged data winter — analogous to previous Artificial Intelligence winters, marked by reduced funding and interest in AI research, in which data assets that could be leveraged for the common good are instead frozen and immobilized. Recent developments, such as a decline in access to social media data for research and the growing privatization of climate data, along with a decrease in open data policy activity, signify a worrying trend. This blog takes stock of these developments and, building on some recent expert commentary, raises a number of concerns about the current state of data accessibility and its implications for the public interest. We conclude by calling for a new Decade of Data — one marked by a reinvigorated commitment to open data and data reuse for the public interest…(More)”.

The world needs an International Decade for Data–or risk splintering into AI ‘haves’ and ‘have-nots,’ UN researchers warn


Article by Tshilidzi Marwala and David Passarelli: “The rapid rise in data-driven technologies is shaping how many of us live–from biometric data collected by our smartwatches, artificial intelligence (AI) tools and models changing how we work, to social media algorithms that seem to know more about our content preferences than we do. Greater amounts of data are affecting all aspects of our lives, and indeed, society at large.

This explosion in data risks creating new inequalities, equipping a new set of “haves” who benefit from the power of data while excluding, or even harming, a set of “have-nots”–and splitting the international community into “data-poor” and “data-rich” worlds.

We know that data, when harnessed correctly, can be a powerful tool for sustainable development. Intelligent and innovative use of data can support public health systems, improve our understanding of climate change and biodiversity loss, anticipate crises, and tackle deep-rooted structural injustices such as racism and economic inequality.

However, the vast quantity of data is fueling an unregulated Wild West. Instead of simply issuing more warnings, governments must instead work toward good governance of data on a global scale. Due to the rapid pace of technological innovation, policies intended to protect society will inevitably fall behind. We need to be more ambitious.

To begin with, governments must ensure that the benefits derived from data are equitably distributed by establishing global ground rules for data collection, sharing, taxation, and re-use. This includes dealing with synthetic data and cross-border data flows…(More)”.

The New Knowledge


Book by Blayne Haggart and Natasha Tusikov: “From the global geopolitical arena to the smart city, control over knowledge—particularly over data and intellectual property—has become a key battleground for the exercise of economic and political power. For companies and governments alike, control over knowledge—what scholar Susan Strange calls the knowledge structure—has become a goal unto itself.

The rising dominance of the knowledge structure is leading to a massive redistribution of power, including from individuals to companies and states. Strong intellectual property rights have concentrated economic benefits in a smaller number of hands, while the “internet of things” is reshaping basic notions of property, ownership, and control. In the scramble to create and control data and intellectual property, governments and companies alike are engaging in ever-more surveillance.

The New Knowledge is a guide to and analysis of these changes, and of the emerging phenomenon of the knowledge-driven society. It highlights how the pursuit of the control over knowledge has become its own ideology, with its own set of experts drawn from those with the ability to collect and manipulate digital data. Haggart and Tusikov propose a workable path forward—knowledge decommodification—to ensure that our new knowledge is not treated simply as a commodity to be bought and sold, but as a way to meet the needs of the individuals and communities that create this knowledge in the first place…(More)”.

Climate change may kill data sovereignty


Article by Trisha Ray: “Data centres are the linchpin of a nation’s technological progress, serving as the nerve centers that power the information age. The need for robust and reliable data centre infrastructure cuts across the UN Sustainable Development Goals (SDGs), serving as an essential foundation for e-government, innovation and entrepreneurship, decent work, and economic growth. It comes as no surprise then that data sovereignty has gained traction over the past decade, particularly in the Global South. However, climate change threatens the very infrastructure that underpins the digital future, and its impact on data centres is a multifaceted challenge, with rising temperatures, extreme weather events, and changing environmental conditions posing significant threats to their reliability and sustainability, even as developing countries begin rolling out ambitious strategies and incentives to attract data centres…(More)”.

Data Science for Social Impact in Higher Education: First Steps


Data.org playbook: “… was designed to help you expand opportunities for social impact data science learning. As you browse, you will see a range of these opportunities including courses, modules for other courses, research and internship opportunities, and a variety of events and activities. The playbook also offers lessons learned to guide you through your process. Additionally, the Playbook includes profiles of students who have engaged in data science for social impact, guidance for engaging partners, and additional resources relating to evaluation and courses. We hope that this playbook will inspire and support your efforts to bring social impact data science to your institutions…

As you look at the range of ways you might bring data science for social impact to your students, remember that the intention is not for you to replicate what is here, but rather adapt them to your local contexts and conditions. You might draw pieces from several activities and combine them to create a customized strategy that works for you. Consider the assets you have around you and how you might be able to leverage them. At the same time, imagine how some of the lessons learned might reflect barriers you might face, as well. Most importantly, know that it is possible for you to create data science for social impact at your institution to bring benefit to your students and society…(More)”.

Medical AI could be ‘dangerous’ for poorer nations, WHO warns


Article by David Adam: “The introduction of health-care technologies based on artificial intelligence (AI) could be “dangerous” for people in lower-income countries, the World Health Organization (WHO) has warned.

The organization, which today issued a report describing new guidelines on large multi-modal models (LMMs), says it is essential that uses of the developing technology are not shaped only by technology companies and those in wealthy countries. If models aren’t trained on data from people in under-resourced places, those populations might be poorly served by the algorithms, the agency says.

“The very last thing that we want to see happen as part of this leap forward with technology is the propagation or amplification of inequities and biases in the social fabric of countries around the world,” Alain Labrique, the WHO’s director for digital health and innovation, said at a media briefing today.

The WHO issued its first guidelines on AI in health care in 2021. But the organization was prompted to update them less than three years later by the rise in the power and availability of LMMs. Also called generative AI, these models, including the one that powers the popular ChatGPT chatbot, process and produce text, videos and images…(More)”.

2024 Edelman Trust Barometer


Edelman: “The 2024 Edelman Trust Barometer reveals a new paradox at the heart of society. Rapid innovation offers the promise of a new era of prosperity, but instead risks exacerbating trust issues, leading to further societal instability and political polarization.

Innovation is accelerating – in regenerative agriculture, messenger RNA, renewable energy, and most of all in artificial intelligence. But society’s ability to process and accept rapid change is under pressure, with skepticism about science’s relationship with Government and the perception that the benefits skew towards the wealthy.

There is one issue on which the world stands united: innovation is being poorly managed – defined by lagging government regulation, uncertain impacts, lack of transparency, and an assumption that science is immutable. Our respondents cite this as a problem by nearly a two to one margin across most developed and developing countries, plus all age groups, income levels, educational levels, and genders. There is consensus among those who say innovation is poorly managed that society is changing too quickly and not in ways that benefit “people like me” (69%).

Many are concerned that Science is losing its independence: to Government, to the political process, and to the wealthy. In the U.S., two thirds assert that science is too politicized. For the first time in China, we see a contrast to their high trust in government: Three-quarters of respondents believe that Government and organizations that fund research have too much influence on science. There is concern about excessive influence of the elites, with 82% of those who say innovation is managed poorly believing that the system is biased in favor of the rich – this is 30 percentage points higher than those who feel innovation is managed well…(More)”.

Integrating Participatory Budgeting and Institutionalized Citizens’ Assemblies: A Community-Driven Perspective


Article by Nick Vlahos: “There is growing excitement in the democracy field about the potential of citizens’ assemblies (CAs), a practice that brings together groups of residents selected by lottery to deliberate on public policy issues. There is longitudinal evidence to suggest that deliberative mini-publics such as those who meet in CAs can be transformative when it comes to adding more nuance to public opinion on complex and potentially polarizing issues.

But there are two common critiques of CAs. The first is that they are not connected to centers of power (with very few notable exceptions) and don’t have authority to make binding decisions. The second is that they are often disconnected from the broader public, and indeed often claim to be making their own, new “publics” instead of engaging with existing ones.

In this article I propose that proponents of CAs could benefit from the thirty-year history of another democratic innovation—participatory budgeting (PB). There are nearly 12,000 recorded instances of PB to draw learnings from. I see value in both innovations (and have advocated and written about both) and would be interested to see some sort of experimentation that combines PB and CAs, from a decentralized, bottom-up, community-driven approach.

We can and should think about grassroots ways to scale and connect people across geography using combinations of democratic innovations, which along the way builds up (local) civic infrastructure by drawing from existing civic capital (resident-led groups, non-profits, service providers, social movements/mobilization etc.)…(More)”.