Two Open Science Foundations: Data Commons and Stewardship as Pillars for Advancing the FAIR Principles and Tackling Planetary Challenges


Article by Stefaan Verhulst and Jean Claude Burgelman: “Today the world is facing three major planetary challenges: war and peace, steering Artificial Intelligence and making the planet a healthy Anthropocene. As they are closely interrelated, they represent an era of “polycrisis”, to use the term Adam Tooze has coined. There are no simple solutions or quick fixes to these (and other) challenges; their interdependencies demand a multi-stakeholder, interdisciplinary approach.

As world leaders and experts convene in Baku for the 29th session of the Conference of the Parties to the United Nations Framework Convention on Climate Change (COP29), the urgency of addressing these global crises has never been clearer. A crucial part of addressing these challenges lies in advancing science, particularly open science underpinned by data made available in line with the FAIR principles (Findable, Accessible, Interoperable, and Reusable). In this era of computation, the transformative potential of research depends on the seamless flow and reuse of high-quality data to unlock breakthrough insights and solutions. Ensuring data is available in reusable, interoperable formats not only accelerates the pace of scientific discovery but also expedites the search for solutions to global crises.

Image of the retreat of the Columbia glacier by Jesse Allen, using Landsat data from the U.S. Geological Survey. Free to re-use from NASA Visible Earth.

While FAIR principles provide a vital foundation for making data accessible, interoperable and reusable, translating these principles into practice requires robust institutional approaches. Toward that end, we argue below that two foundational pillars must be strengthened:

  • Establishing Data Commons: The need for shared data ecosystems where resources can be pooled, accessed, and re-used collectively, breaking down silos and fostering cross-disciplinary collaboration.
  • Enabling Data Stewardship: Systematic and responsible data reuse requires more than access; it demands stewardship. Equipping institutions and scientists with the capabilities to maximize the value of data while safeguarding its responsible use is essential…(More)”.

A Second Academic Exodus From X?


Article by Josh Moody: “Two years ago, after Elon Musk bought Twitter for $44 billion, promptly renaming it X, numerous academics decamped from the platform. Now, in the wake of a presidential election fraught with online disinformation, a second exodus from the social media site appears underway.

Academics, including some with hundreds of thousands of followers, announced departures from the platform in the immediate aftermath of the election, decrying the toxicity of the website and citing objections to Musk and how he wielded the platform to back President-elect Donald Trump. The business mogul threw millions of dollars behind Trump and personally campaigned for him this fall. Musk also personally advanced various debunked conspiracy theories during the election cycle.

Amid another wave of exits, some users see this as the end of Academic Twitter, which was already arguably in its death throes…

LeBlanc, Kamola and Rosen all mentioned that they were moving to the platform Bluesky, which has grown to 14.5 million users, welcoming more than 700,000 new accounts in recent days. In September, Bluesky had nine million users…

A study published in PS: Political Science & Politics last month concluded that academics began to engage less after Musk bought the platform. But the peak of disengagement wasn’t when the billionaire took over the site in October 2022 but rather the next month, when he reinstated Donald Trump’s account, which the platform’s previous owners had deactivated following the Jan. 6, 2021, insurrection, which Trump encouraged.

The researchers reviewed 15,700 accounts from academics in economics, political science, sociology and psychology for their study.

James Bisbee, a political science professor at Vanderbilt University and article co-author, wrote via email that changes to the platform, particularly to the application programming interface, or API, undermined their ability to collect data for their research.

“Twitter used to be an amazing source of data for political scientists (and social scientists more broadly) thanks in part to its open data ethos,” Bisbee wrote. “Since Musk’s takeover, this is no longer the case, severely limiting the types of conclusions we could draw, and theories we could test, on this platform.”

To Bisbee, that loss is an understated issue: “Along with many other troubling developments on X since the change in ownership, the amputation of data access should not be ignored.”…(More)”.

Human-AI coevolution


Paper by Dino Pedreschi et al: “Human-AI coevolution, defined as a process in which humans and AI algorithms continuously influence each other, increasingly characterises our society, but is understudied in artificial intelligence and complexity science literature. Recommender systems and assistants play a prominent role in human-AI coevolution, as they permeate many facets of daily life and influence human choices through online platforms. The interaction between users and AI results in a potentially endless feedback loop, wherein users’ choices generate data to train AI models, which, in turn, shape subsequent user preferences. This human-AI feedback loop has peculiar characteristics compared to traditional human-machine interaction and gives rise to complex and often “unintended” systemic outcomes. This paper introduces human-AI coevolution as the cornerstone for a new field of study at the intersection between AI and complexity science focused on the theoretical, empirical, and mathematical investigation of the human-AI feedback loop. In doing so, we: (i) outline the pros and cons of existing methodologies and highlight shortcomings and potential ways for capturing feedback loop mechanisms; (ii) propose a reflection at the intersection between complexity science, AI and society; (iii) provide real-world examples for different human-AI ecosystems; and (iv) illustrate challenges to the creation of such a field of study, conceptualising them at increasing levels of abstraction, i.e., scientific, legal and socio-political…(More)”.

How to evaluate statistical claims


Blog by Sean Trott: “…The goal of this post is to distill what I take to be the most important, immediately applicable, and generalizable insights from these classes. That means that readers should be able to apply those insights without a background in math or knowing how to, say, build a linear model in R. In that way, it’ll be similar to my previous post about “useful cognitive lenses to see through”, but with a greater focus on evaluating claims specifically.

Lesson #1: Consider the whole distribution, not just the central tendency.

If you spend much time reading news articles or social media posts, the odds are good you’ll encounter some descriptive statistics: numbers summarizing or describing a distribution (a set of numbers or values in a dataset). One of the most commonly used descriptive statistics is the arithmetic mean: the sum of every value in a distribution, divided by the number of values overall. The arithmetic mean is a measure of “central tendency”, which just means it’s a way to characterize the typical or expected value in that distribution.

The arithmetic mean is a really useful measure. But as many readers might already know, it’s not perfect. It’s strongly affected by outliers (values that are really different from the rest of the distribution) and by things like the skew of a distribution (see the image below for examples of skewed distributions).

Three different distributions. Leftmost is a roughly “normal” distribution; middle is a “right-skewed” distribution; and rightmost is a “left-skewed” distribution.

In particular, the mean is pulled in the direction of outliers or distribution skew. That’s the logic behind the joke about the average salary of people at a bar jumping up as soon as a billionaire walks in. It’s also why other measures of central tendency, such as the median, are often presented alongside (or instead of) the mean—especially for distributions that happen to be very skewed, such as income or wealth.

It’s not that one of these measures is more “correct”. As Stephen Jay Gould wrote in his article The Median Is Not the Message, they’re just different perspectives on the same distribution:

A politician in power might say with pride, “The mean income of our citizens is $15,000 per year.” The leader of the opposition might retort, “But half our citizens make less than $10,000 per year.” Both are right, but neither cites a statistic with impassive objectivity. The first invokes a mean, the second a median. (Means are higher than medians in such cases because one millionaire may outweigh hundreds of poor people in setting a mean, but can balance only one mendicant in calculating a median.)…(More)”.
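
As a rough illustration of the mean-versus-median point in the excerpt above, the following Python sketch recomputes both statistics for a small group before and after one extreme value joins it. The income figures and the names `patrons` and `summarize` are invented for this example and do not come from the post.

```python
from statistics import mean, median

# Hypothetical annual incomes (in dollars) for ten bar patrons.
# These figures are invented purely for illustration.
patrons = [32_000, 38_000, 41_000, 45_000, 47_000,
           52_000, 55_000, 58_000, 61_000, 71_000]


def summarize(incomes):
    """Return (mean, median) for a list of incomes."""
    return mean(incomes), median(incomes)


# Central tendency before anyone unusual arrives.
before_mean, before_median = summarize(patrons)

# A billionaire walks in: one extreme outlier joins the distribution.
after_mean, after_median = summarize(patrons + [1_000_000_000])

print(f"before: mean={before_mean:,.0f}  median={before_median:,.0f}")
print(f"after:  mean={after_mean:,.0f}  median={after_median:,.0f}")
```

With these made-up numbers, the mean rises more than a thousandfold once the outlier is added, while the median shifts only from 49,500 to 52,000, which is why the median is often the more informative summary for heavily skewed data such as income.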

Engaging publics in science: a practical typology


Paper by Heather Douglas et al: “Public engagement with science has become a prominent area of research and effort for democratizing science. In the fall of 2020, we held an online conference, Public Engagement with Science: Defining and Measuring Success, to address questions of how to do public engagement well. The conference was organized around conceptualizations of the publics engaged, with attendant epistemic, ethical, and political valences. We present here the typology of publics we used (volunteer, representative sample, stakeholder, and community publics), discuss the differences among those publics and what those differences mean for practice, and situate this typology within the existing work on public engagement with science. We then provide an overview of the essays published in this journal arising from the conference, which provides a window into the rich work presented at the event…(More)”.

‘We were just trying to get it to work’: The failure that started the internet


Article by Scott Nover: “At the height of the Cold War, Charley Kline and Bill Duvall were two bright-eyed engineers on the front lines of one of technology’s most ambitious experiments. Kline, a 21-year-old graduate student at the University of California, Los Angeles (UCLA), and Duvall, a 29-year-old systems programmer at Stanford Research Institute (SRI), were working on a system called Arpanet, short for the Advanced Research Projects Agency Network. Funded by the US Department of Defense, the project aimed to create a network that could directly share data without relying on telephone lines. Instead, this system used a method of data delivery called “packet switching” that would later form the basis for the modern internet.

It was the first test of a technology that would change almost every facet of human life. But before it could work, you had to log in.

Kline sat at his keyboard between the lime-green walls of UCLA’s Boelter Hall Room 3420, prepared to connect with Duvall, who was working at a computer halfway across the state of California. But Kline didn’t even make it all the way through the word “L-O-G-I-N” before Duvall told him over the phone that his system had crashed. Thanks to that error, the first “message” that Kline sent Duvall on that autumn day in 1969 was simply the letters “L-O”…(More)”.

Artificial Intelligence, Scientific Discovery, and Product Innovation


Paper by Aidan Toner-Rodgers: “… studies the impact of artificial intelligence on innovation, exploiting the randomized introduction of a new materials discovery technology to 1,018 scientists in the R&D lab of a large U.S. firm. AI-assisted researchers discover 44% more materials, resulting in a 39% increase in patent filings and a 17% rise in downstream product innovation. These compounds possess more novel chemical structures and lead to more radical inventions. However, the technology has strikingly disparate effects across the productivity distribution: while the bottom third of scientists see little benefit, the output of top researchers nearly doubles. Investigating the mechanisms behind these results, I show that AI automates 57% of “idea-generation” tasks, reallocating researchers to the new task of evaluating model-produced candidate materials. Top scientists leverage their domain knowledge to prioritize promising AI suggestions, while others waste significant resources testing false positives. Together, these findings demonstrate the potential of AI-augmented research and highlight the complementarity between algorithms and expertise in the innovative process. Survey evidence reveals that these gains come at a cost, however, as 82% of scientists report reduced satisfaction with their work due to decreased creativity and skill underutilization…(More)”.

Digital Media Metaphors


Book edited by Johan Farkas and Marcus Maloney: “Bringing together leading scholars from media studies and digital sociology, this edited volume provides a comprehensive introduction to digital media metaphors, unpacking their power and limitations.

Digital technologies have reshaped our way of life. To grasp their dynamics and implications, people often rely on metaphors to provide a shared frame of reference. Scholars, journalists, tech companies, and policymakers alike speak of digital clouds, bubbles, frontiers, platforms, trolls, and rabbit holes. Some of these metaphors distort the workings of the digital realm and neglect key consequences. This collection, structured in three parts, explores metaphors across digital infrastructures, content, and users. Within these parts, each chapter examines a specific metaphor that has become near-ubiquitous in public debate. Doing so, the book engages not only with the technological, but also the social, political, and environmental implications of digital technologies and relations.

This unique collection will interest students and scholars of digital media and the broader fields of media and communication studies, sociology, and science and technology studies…(More)”.

Federated Data Infrastructures for Scientific Use


Policy paper by the German Council for Scientific Information Infrastructures: “…provides an overview and a comparative in-depth analysis of the emerging research (and research-related) data infrastructures NFDI, EOSC, Gaia-X and the European Data Spaces. In addition, the Council makes recommendations for their future development and coordination. The RfII notes that access to genuine high-quality research data and related core services is a matter of basic public supply and strongly advises achieving coherence between the various initiatives and approaches…(More)”.

Effective Data Stewardship in Higher Education: Skills, Competences, and the Emerging Role of Open Data Stewards


Paper by Panos Fitsilis et al: “The significance of open data in higher education stems from the changing tendencies towards open science, and open research in higher education encourages new ways of making scientific inquiry more transparent, collaborative and accessible. This study focuses on the critical role of open data stewards in this transition, essential for managing and disseminating research data effectively in universities, while it also highlights the increasing demand for structured training and professional policies for data stewards in academic settings. Building upon this context, the paper investigates the essential skills and competences required for effective data stewardship in higher education institutions. A critical literature review, coupled with practical engagement in open data stewardship at universities, provided insights into the roles and responsibilities of data stewards. In response to the needs and gaps identified in the literature, the paper proposes a structured training framework and comprehensive curriculum for data stewardship. It addresses five key competence categories for open data stewards, aligning them with current trends and essential skills and knowledge in the field. By advocating for a structured approach to data stewardship education, this work sets the foundation for improved data management in universities and serves as a critical step towards professionalizing the role of data stewards in higher education. The emphasis on the role of open data stewards is expected to advance data accessibility and sharing practices, fostering increased transparency, collaboration, and innovation in academic research. This approach contributes to the evolution of universities into open ecosystems, where there is free flow of data for global education and research advancement…(More)”.