Global population data is in crisis – here’s why that matters


Article by Andrew J Tatem and Jessica Espey: “Every day, decisions that affect our lives depend on knowing how many people live where. For example, how many vaccines are needed in a community, where polling stations should be placed for elections or who might be in danger as a hurricane approaches. The answers rely on population data.

But counting people is getting harder.

For centuries, census and household surveys have been the backbone of population knowledge. But we’ve just returned from the UN’s statistical commission meetings in New York, where experts reported that something alarming is happening to population data systems globally.

Census response rates are declining in many countries, resulting in large margins of error. The 2020 US census undercounted America’s Latino population by more than three times the rate of the 2010 census. In Paraguay, the latest census revealed a population one-fifth smaller than previously thought.

South Africa’s 2022 census post-enumeration survey revealed a likely undercount of more than 30%. According to the UN Economic Commission for Africa, undercounts and census delays due to COVID-19, conflict or financial limitations have resulted in an estimated one in three Africans not being counted in the 2020 census round.

When people vanish from data, they vanish from policy. When certain groups are systematically undercounted – often minorities, rural communities or poorer people – they become invisible to policymakers. This translates directly into political underrepresentation and inadequate resource allocation…(More)”.

Integrating Social Media into Biodiversity Databases: The Next Big Step?


Article by Muhammad Osama: “Digital technologies and social media have transformed ecology and conservation biology data collection. Traditional biodiversity monitoring often relies on field surveys, which can be time-consuming and biased toward rural habitats.

The Global Biodiversity Information Facility (GBIF) serves as a key repository for biodiversity data, but it faces challenges such as delayed data availability and underrepresentation of urban habitats.

Social media platforms have become valuable tools for rapid data collection, enabling users to share georeferenced observations instantly, reducing time lags associated with traditional methods. The widespread use of smartphones with cameras allows individuals to document wildlife sightings in real-time, enhancing biodiversity monitoring. Integrating social media data with traditional ecological datasets offers significant advancements, particularly in tracking species distributions in urban areas.

In this paper, the authors evaluated the habitat usage of the Jersey tiger moth (JTM) by comparing occurrence data from social media platforms (Instagram and Flickr) with traditional records from GBIF and iNaturalist. They hypothesized that social media data would reveal significant JTM occurrences in urban environments, which may be underrepresented in traditional datasets…(More)”.

The Language Data Space (LDS)


European Commission: “… welcomes launch of the Alliance for Language Technologies European Digital Infrastructure Consortium (ALT-EDIC) and the Language Data Space (LDS).

Aimed at addressing the shortage of European language data needed for training large language models, these projects are set to revolutionise multilingual Artificial Intelligence (AI) systems across the EU.

By offering services in all EU languages, the initiatives are designed to break down language barriers, providing better, more accessible solutions for smaller businesses within the EU. This effort not only aims to preserve the EU’s rich cultural and linguistic heritage in the digital age but also strengthens Europe’s quest for tech sovereignty. Formed in February 2024, the ALT-EDIC includes 17 participating Member States and 9 observer Member States and regions, making it one of the pioneering European Digital Infrastructure Consortia.

The LDS, part of the Common European Data Spaces, is crucial for increasing data availability for AI development in Europe. Developed by the Commission and funded by the DIGITAL programme, this project aims to create a cohesive marketplace for language data. This will enhance the collection and sharing of multilingual data to support European large language models. Initially accessible to selected institutions and companies, the project aims to eventually involve all European public and private stakeholders.

Find more information about the Alliance for Language Technologies European Digital Infrastructure Consortium (ALT-EDIC) and the Language Data Space (LDS)…(More)”

Large AI models are cultural and social technologies


Essay by Henry Farrell, Alison Gopnik, Cosma Shalizi, and James Evans: “Debates about artificial intelligence (AI) tend to revolve around whether large models are intelligent, autonomous agents. Some AI researchers and commentators speculate that we are on the cusp of creating agents with artificial general intelligence (AGI), a prospect anticipated with both elation and anxiety. There have also been extensive conversations about cultural and social consequences of large models, orbiting around two foci: immediate effects of these systems as they are currently used, and hypothetical futures when these systems turn into AGI agents, perhaps even superintelligent AGI agents.

But this discourse about large models as intelligent agents is fundamentally misconceived. Combining ideas from social and behavioral sciences with computer science can help us understand AI systems more accurately. Large models should not be viewed primarily as intelligent agents, but as a new kind of cultural and social technology, allowing humans to take advantage of information other humans have accumulated.

The new technology of large models combines important features of earlier technologies. Like pictures, writing, print, video, Internet search, and other such technologies, large models allow people to access information that other people have created. Large models – currently language, vision, and multi-modal – depend on the fact that the Internet has made the products of these earlier technologies readily available in machine-readable form. But like economic markets, state bureaucracies, and other social technologies, these systems not only make information widely available, they allow it to be reorganized, transformed, and restructured in distinctive ways. Adopting Herbert Simon’s terminology, large models are a new variant of the “artificial systems of human society” that process information to enable large-scale coordination…(More)”

A Quest for AI Knowledge


Paper by Joshua S. Gans: “This paper examines how the introduction of artificial intelligence (AI), particularly generative and large language models capable of interpolating precisely between known data points, reshapes scientists’ incentives for pursuing novel versus incremental research. Extending the theoretical framework of Carnehl and Schneider (2025), we analyse how decision-makers leverage AI to improve precision within well-defined knowledge domains. We identify conditions under which the availability of AI tools encourages scientists to choose more socially valuable, highly novel research projects, contrasting sharply with traditional patterns of incremental knowledge growth. Our model demonstrates a critical complementarity: scientists strategically align their research novelty choices to maximise the domain where AI can reliably inform decision-making. This dynamic fundamentally transforms the evolution of scientific knowledge, leading either to systematic “stepping stone” expansions or endogenous research cycles of strategic knowledge deepening. We discuss the broader implications for science policy, highlighting how sufficiently capable AI tools could mitigate traditional inefficiencies in scientific innovation, aligning private research incentives closely with the social optimum…(More)”.

The Age of AI in the Life Sciences: Benefits and Biosecurity Considerations


Report by the National Academies of Sciences, Engineering, and Medicine: “Artificial intelligence (AI) applications in the life sciences have the potential to enable advances in biological discovery and design at a faster pace and efficiency than is possible with classical experimental approaches alone. At the same time, AI-enabled biological tools developed for beneficial applications could potentially be misused for harmful purposes. Although the creation of biological weapons is not a new concept or risk, the potential for AI-enabled biological tools to affect this risk has raised concerns during the past decade.

This report, as requested by the Department of Defense, assesses how AI-enabled biological tools could uniquely impact biosecurity risk, and how advancements in such tools could also be used to mitigate these risks. The Age of AI in the Life Sciences reviews the capabilities of AI-enabled biological tools and can be used in conjunction with the 2018 National Academies report, Biodefense in the Age of Synthetic Biology, which sets out a framework for identifying the different risk factors associated with synthetic biology capabilities…(More)”

Nudges and Nudging: A User’s Manual


Paper by Cass Sunstein: “Many policies take the form of nudges, defined as liberty-preserving approaches that steer people in particular directions, but that also allow them to go their own way. Some nudges attempt to correct self-control problems. Some nudges attempt to counteract unrealistic optimism. Some nudges attempt to correct present bias. Some nudges attempt to correct market failures, as when people are nudged not to emit air pollution. For every conventional market failure, there is a potential nudge. For every behavioral bias (optimistic bias, present bias, availability bias, limited attention), there is a responsive nudge. There are many misconceptions about nudges and nudging, and they are a diversion…(More)”.

Vetted Researcher Data Access


Coimisiún na Meán: “Article 40 of the Digital Services Act (DSA) makes provision for researchers to access data from Very Large Online Platforms (VLOPs) or Very Large Online Search Engines (VLOSEs) for the purposes of studying systemic risk in the EU and assessing mitigation measures. There are two ways that researchers who are studying systemic risk in the EU can get access to data under Article 40 of the DSA.

Non-public data, known as “vetted researcher data access”, under Article 40(4)-(11). This is a process where a researcher, who has been vetted or assessed by a Digital Services Coordinator to have met the criteria as set out in DSA Article 40(8), can request access to non-public data held by a VLOP/VLOSE. The data must be limited in scope and deemed necessary and proportionate to the purpose of the research.

Public data under Article 40(12). This is a process where a researcher who meets the relevant criteria can apply for data access directly from a VLOP/VLOSE, for example, access to a content library or API of public posts…(More)”.

Reconciling open science with technological sovereignty


Paper by C. Huang & L. Soete: “In history, open science has been effective in facilitating knowledge sharing and promoting and diffusing innovations. However, as a result of geopolitical tensions, technological sovereignty has recently been increasingly emphasized in various countries’ science and technology policy making, posing a challenge to open science policy. In this paper, we argue that the European Union significantly benefits from and contributes to open science and should continue to support it. Similarly, China embraced foreign technologies and engaged in open science as its economy developed rapidly in the last 40 years. Today both economies could learn from each other in finding the right balance between open science and technological sovereignty particularly given the very different policy experience and the urgency of implementing new technologies addressing the grand challenges such as climate change faced by mankind…(More)”.

Disinformation: Definitions and examples


Explainer by Perthusasia Centre: “Disinformation has been a tool of manipulation and control for centuries, from ancient military strategies to Cold War propaganda. With the rapid advancement of technology, it has evolved into a sophisticated and pervasive security threat that transcends traditional boundaries.

This explainer takes the definitions and examples from our recent Indo-Pacific Analysis Brief, Disinformation and cognitive warfare by Senior Fellow Alana Ford, and creates a simple, standalone guide for quick reference…(More)”.