Book edited by Kelebogile Zvobgo and Francesca Parente: “This timely book presents a practical framework for conceptualizing and analyzing human rights issues such as repression, compliance, and transitional justice in an increasingly fraught climate for human rights globally. Emerging and established experts advance quantitative and mixed-methods research, showcasing innovative ways of measuring and evaluating multifaceted concepts.
Chapters cover a broad range of salient topics including state repression, civil society activism, compliance with international law, and transitional justice. Emphasizing that rigorous research is driven by substance, not methods, the contributing authors explain how they measure concepts that are vital to human rights research. They showcase diverse forms of evidence in descriptive and analytical studies and offer guidance on using cutting-edge techniques like machine learning and text analysis, charting a path for future empirical human rights research…(More)”.
Paper by Open Data Watch and PARIS21: “Deep cuts in development financing for statistics, legitimacy issues, rapid technological change like AI, and rising expectations for more inclusive and participatory data are colliding with long-standing weaknesses in trust, capacity, and data use. For many national statistical offices (NSOs), particularly in low- and middle-income countries, this convergence amounts to a systemic data crisis that threatens their relevance, credibility, and sustainability. At the same time, these pressures create a rare opportunity to rethink how data systems are designed, governed, and embedded in society.
This paper argues that the statistical community has reached a fork in the road. Incremental adjustment alone may no longer be sufficient. Countries and the international community face a strategic choice between two broad paths, each with distinct implications for legitimacy, financing, risk, and equity. Rather than prescribing a single solution, the paper aims to provoke informed debate ahead of the 57th Session of the UN Statistical Commission…(More)”.
Centre for Humanitarian Data: “Based on analysis of the HDX Data Grids, we estimate that 68 percent of crisis data is available and up-to-date across 22 humanitarian operations, down from 74 percent in the previous year.
The report provides details on the data available for each location, category and sub-category covered in the Data Grids. The 22 Data Grids include 411 unique datasets, which were downloaded almost four times as often as the average dataset on HDX.
In addition, the report explores changes in data availability, reflected both in the Data Grids and in conversations with data partners throughout the year. This qualitative analysis has allowed us to capture changes that would otherwise be hard to quantify. We also explain the impact of AI bots on web traffic and open data platforms following the aggressive rollout of large language models. Finally, we take a closer look at the organizations that contribute climate hazards data on HDX, which is essential for getting ahead of crises…(More)”.
Article by Lucila Pinto, Ehsan Masood, and Subhra Priyadarshini: “‘Uncertainty.’ ‘Loss of trust.’ ‘Definitely a crisis.’ These are some of the ways in which researchers describe the state of affairs for government data in many countries.
“There is a new type of politics that is undermining the credibility of official statistics,” says João Pedro Azevedo, chief statistician for the United Nations children’s agency UNICEF in New York City.
Official statistics are data collected and validated by both national statistical agencies and international organizations. Nearly every country has an agency for official statistics. They collect information and organize it into statistics about myriad aspects of life, including what people earn, how many individuals are employed, how well children perform in school, the quality of nutrition, how long patients have to wait for an operation, levels of air pollution and increases to average temperatures.
National agencies collect data through surveys and from secondary sources. These data sets are used by governments to inform policy, by businesses to plan for the future, and by researchers and advocacy organizations. Official statistics, such as those measuring nations’ gross domestic product (GDP), are also the foundation for monitoring progress towards the 17 UN Sustainable Development Goals, the world’s plan to end poverty and achieve environmental sustainability.
“Official statistics are like the backbone of a nation’s data infrastructure,” says Steve Pierson, director of science policy at the American Statistical Association (ASA) in Washington DC. “Just like any other infrastructure — roads, bridges and highways — they cannot fail.”
People who work with or study official statistics say that they have never experienced a period similar to today’s situation. Those who call the current state a crisis think it has been triggered by an accumulation of overlapping factors. These include falling response rates to national surveys, cuts to funding and, in some cases, government interference.
Although funded by governments, national statistics offices are expected to operate independently of politicians, not least so that they are free to report the data as measured — much as academic research operates at arm’s length from its public-funding bodies. Moreover, rules established by an assembly of the world’s national statisticians and endorsed by the UN require that some data sets meet international standards, which state that official statistics should be accurate, impartial, trustworthy and grounded in evidence.
Although there is a history of inappropriate government involvement in the collection and reporting of national statistics (A. V. Georgiou Stat. J. IAOS 37, 85–105; 2021), there is a record of statistics agencies calling out the misuse of such data, too. But researchers worry that this might not be the case in future. “I fear that it is becoming harder for official statisticians to do their jobs,” says Diane Coyle, research director at the Bennett School of Public Policy at the University of Cambridge, UK.
Nature explores problems with official statistics in four countries that are causing concern for researchers and statisticians…(More)”.
Paper by Benjamin S. Manning & John J. Horton: “Useful social science theories predict behavior across settings. However, applying a theory to make predictions in new settings is challenging: rarely can it be done without ad hoc modifications to account for setting-specific factors. We argue that AI agents put in simulations of those novel settings offer an alternative for applying theory, requiring minimal or no modifications. We present an approach for building such “general” agents that use theory-grounded natural language instructions, existing empirical data, and knowledge acquired by the underlying AI during training. To demonstrate the approach in settings where no data from that data-generating process exists – as is often the case in applied prediction problems – we design a heterogeneous population of 883,320 novel games. AI agents are constructed using human data from a small set of conceptually related but structurally distinct “seed” games. In preregistered experiments, on average, agents predict initial human play in a random sample of 1,500 games from the population better than (i) a cognitive hierarchy model, (ii) game-theoretic equilibria, and (iii) out-of-the-box agents. For a small set of separate novel games, these simulations predict responses from a new sample of human subjects better even than the most plausibly relevant published human data…(More)”.
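To make the Manning and Horton agent-construction recipe more concrete, here is a minimal sketch, in Python, of how such a theory-grounded agent might be assembled: a single prompt combines a natural-language statement of behavioral theory, observed human play from a few “seed” games, and a description of the novel game, and a population of simulated agents is queried to produce a predicted distribution of initial play. The `llm` callable, the seed-game records, and the prompt wording are illustrative assumptions, not the authors’ implementation.

```python
# Minimal sketch of a "general" agent for predicting initial play in a novel game.
# The llm callable, seed-game records, and prompt wording below are hypothetical
# placeholders for illustration; the published approach may differ in its details.

from collections import Counter
from typing import Callable

THEORY_INSTRUCTIONS = (
    "You are a participant in a one-shot game. People typically reason only a few "
    "steps ahead about what others will do and are drawn to salient payoffs. "
    "Choose the action a typical person would pick, and answer with the action only."
)

# Empirical anchors: observed human play in structurally distinct "seed" games.
SEED_GAMES = [
    {"game": "Pick a number from 0-100; closest to 2/3 of the average wins.",
     "human_play": "Most participants chose numbers between 20 and 40."},
    {"game": "Propose a split of $10; the other player may reject, leaving both with $0.",
     "human_play": "Most proposers offered $4-$5."},
]

def build_prompt(novel_game: str) -> str:
    """Combine theory, seed-game evidence, and the novel game into one instruction."""
    examples = "\n".join(
        f"Game: {g['game']}\nObserved human play: {g['human_play']}" for g in SEED_GAMES
    )
    return (f"{THEORY_INSTRUCTIONS}\n\nEvidence from related games:\n{examples}\n\n"
            f"New game: {novel_game}\nYour action:")

def predict_initial_play(novel_game: str, llm: Callable[[str], str],
                         n_agents: int = 100) -> Counter:
    """Query a population of simulated agents and tally their chosen actions."""
    prompt = build_prompt(novel_game)
    return Counter(llm(prompt) for _ in range(n_agents))
```

In the paper, predictions of this kind are evaluated against baselines such as a cognitive hierarchy model and game-theoretic equilibria; the sketch above only illustrates the prompt-assembly idea.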
Article by Stefaan Verhulst: “The world has become more complex, more dynamic and more interconnected than ever before. The challenges we face – from health to climate, from democratic resilience to economic transformation – are deeply intertwined. And we need new ideas to meet these challenges.
Europe has never lacked intellectual ambition, but ideas alone aren’t enough. To make real progress, we need breakthrough discoveries. We need evidence of what works. And we need the institutional capacity to test, validate and scale solutions across borders and disciplines.
That’s where science comes in. Yet good science depends on data. And if we want AI to supercharge discovery and transform science, then data becomes even more important.
The ‘datafication’ of society
Digitalisation has led to an unprecedented datafication of society. When citizens engage with government services, visit a doctor, use a mobility platform, shop online or track their steps and sleep through wearable devices, data are generated.
But this datafication doesn’t stop with individual behaviour. It extends deep into the productive fabric of our economies. Manufacturing systems, industrial supply chains, logistics networks, energy grids and robotic production lines are now embedded with sensors, connected devices and intelligent control systems. The implication is profound – data is no longer a by-product of digital services alone. It’s a structural feature of both our digital and physical infrastructures.
The remarkable feature of digital data isn’t merely its volume. It’s its reusability. When done responsibly, data created for one purpose can often be reused for entirely different objectives – including scientific research.
But there’s a fundamental constraint: access. Much of today’s most valuable data remains locked away in institutional stovepipes – within government agencies, universities and private companies. Despite its public value potential, it often remains inaccessible to scientists and public interest actors.
Europe has taken important steps to address this data asymmetry. Open data policies have expanded transparency. The Data Governance Act and the Data Act seek to facilitate data sharing and rebalance power in data markets. Article 40 of the Digital Services Act creates pathways for vetted researchers to access platform data. The European Open Science Cloud seeks to enable the sharing of scientific data. Sectoral data spaces – including those envisioned under the European Health Data Space – and Data Labs aim to provide structured, interoperable infrastructures for data access and use.
Yet instead of a steady expansion of access, we’re now witnessing a ‘data winter.’ Access to private sector data for research has declined in several domains. Open government data initiatives have slowed or been rolled back. Scientific datasets have become restricted or have disappeared. Open science has struggled to scale beyond pilot projects. And broader political retrenchment risks weakening some of the very infrastructures designed to enable responsible reuse.
Generative AI’s rapid expansion has also triggered backlash. Large-scale data scraping for AI training has blurred the line between openness and extraction. Consequently, institutions and content creators have become more protective, sometimes closing access altogether. And without reliable access to diverse, high-quality data, scientific progress risks stagnation.
What should Europe do? Three priorities stand out.
Access shouldn’t be only supply-driven
For too long, data policy has focused on releasing datasets without clearly articulating the questions they’re meant to answer. But the value of data – and increasingly the value of AI – depends directly on the value of the question.
In short, better questions define better discovery.
If we want to unlock meaningful access, we must invest in what might be called ‘question science’ – the systematic identification of high-priority societal questions; the structuring of those questions so they are researchable and actionable; the mapping of those questions to existing or potential data sources; and the embedding of those questions in funding frameworks, governance mandates, and institutional strategies.
When demand is vague, access debates remain abstract. When questions are clear, access becomes purposeful. Researchers, policymakers and data holders can align around concrete objectives. This requires structured, participatory processes that bring scientists, communities, funders and regulators together to define and prioritise the questions that matter most…(More)”.
Paper by Jana Leonie Peters and Marc Ziegele: “Users’ low willingness to participate in discussions in comment sections and the often-poor quality of their contributions have been identified as key challenges in online participation. To address these issues, previous research has proposed various strategies, including moderation. We argue that a less well-researched intervention, namely aggregation in the form of discussion summaries, reduces users’ information overload and enhances their objective knowledge and subjective knowledge, which in turn are positively associated with their willingness to participate and the deliberative quality of their comments. Results from an online experiment (n = 643) support most of our hypotheses, though objective knowledge does not directly impact willingness to comment. Differences between aggregation criteria were minimal, but fact-based aggregation was superior in improving objective knowledge compared to opinion- or argument-based approaches. These findings suggest that platform designers and moderators can utilize aggregation techniques to encourage participation and foster higher-quality online discourse…(More)”.
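As an illustration of what the fact-based aggregation discussed by Peters and Ziegele could look like on a platform, here is a minimal sketch in Python of a fact-focused summary step for a comment thread. The `summarize` callable and the prompt wording are assumptions introduced for illustration and are not drawn from the paper; opinion- or argument-based variants would swap in instructions that retain opinions or reasoned arguments instead.

```python
# Illustrative sketch of "fact-based" aggregation of a comment thread.
# The summarize callable and the prompt wording are assumptions for illustration;
# they are not the stimuli or pipeline used in the study.

from typing import Callable, List

FACT_BASED_PROMPT = (
    "Summarize the following discussion comments for a new reader. "
    "Report only verifiable factual claims that commenters raised; "
    "omit personal opinions, arguments, and evaluations. Use at most five sentences."
)

def aggregate_discussion(comments: List[str], summarize: Callable[[str], str]) -> str:
    """Bundle a comment thread into one prompt and return a fact-focused summary."""
    thread = "\n".join(f"- {c}" for c in comments)
    return summarize(f"{FACT_BASED_PROMPT}\n\nComments:\n{thread}")
```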
About: “AI systems, digital transformation, biodiversity markets, biotechnology, and finance for nature increasingly rely on data originating from indigenous and local territories.
However, the governance of these data flows remains largely undefined.
Version 1.0 of the Sovereign Data Supply Chain: Functional and Operational Framework seeks to address this gap.
It is a structured framework designed to evolve through feedback, territorial validation, pilot implementation, and collective iteration.
With Kinray Hub as lead author, NaturaTech LAC as catalytic funder and co-strategist, and strategic support from Climate Collective, this version serves as an initial architecture for transitioning from extractive models to sovereign data chains based on Collective Rights in Latin America and the Caribbean…(More)”.
Report by Neil Kleiman, Eric Gordon, and Mai-Ling Garcia: “As governments and communities across the United States struggle to make sense of artificial intelligence, one of the most capable—and underutilized—partners is often hiding in plain sight: local colleges and universities. Much of the public conversation about AI focuses on big tech companies or federal regulation. Meanwhile, far less attention has been paid to how higher education institutions can help cities and nonprofits deploy AI to serve residents and strengthen public trust.
Across the United States, higher education institutions are already governing AI internally, experimenting with operational use cases, and absorbing unprecedented investment to build technical capacity. And as the appetite for an AI-trained workforce blossoms, local colleges are now a prime pipeline for talent. At the same time, local governments and nonprofits are just beginning to respond to AI’s promise and translate it into public value.
This asymmetry presents a clear gap: Colleges and universities are increasingly adept at deploying AI, but the connection between local communities and higher ed remains underdeveloped.
This brief argues that AI has created a rare institutional opening to bridge the divide. Colleges are seeking clearer public relevance, governments require technical capacity, and communities are demanding institutions that are more responsive and trustworthy. Local leaders from governors to nonprofit executives who recognize this alignment—and act on it—can shape how AI strengthens democratic infrastructure rather than allowing it to evolve according to purely academic or commercial priorities…(More)”.
Article by Michelle Holko, John Wilbanks, and Sam Howell: “…Compute, talent, and capital are necessary for AI-enabled biotechnology, but biodata is the binding constraint. Without large, representative, and interoperable biological datasets, AI models cannot generalize, scale, or translate into real-world impact.
The application of AI to biotechnology carries profound promise for national power. From stronger, bio-based armor for U.S. warfighters to patching supply chain vulnerabilities with domestic biomanufacturing, the potential is as vast as biology itself. The country that leads in AI-enabled biology will set the pace not only in health and medical discovery but also in agriculture, industrial production, and potentially even future deterrence. Seizing this potential, however, will hinge on improving America’s access to high-quality, secure biodata that is designed specifically for AI.
Biodata holds the blueprints of life and has become a new form of strategic power in the age of AI. These data, including DNA, RNA, proteins, and metabolites, are foundational to innovation in bio-based materials, fuels, agriculture, and medicine.
The National Security Commission on Emerging Biotechnology’s 2025 final report concludes that dominance in biotechnology will “hinge on who controls the most complete, accurate, and secure biological datasets.” Biodata is a strategic asset for national power in the twenty-first century, analogous to advanced semiconductors or critical minerals. U.S. competitors, namely China, are moving fast to establish AI-bio leadership.
China’s Biotech Edge
China’s advantage in AI-enabled biotechnology is not simply scale, but also coordination. Beijing’s national strategies explicitly link biotechnology, big data, and artificial intelligence under directed planning, aiming to align data generation, compute resources, and industrial translation across sectors. One example is China’s non-invasive prenatal testing ecosystem: The domestic non-invasive prenatal testing market was valued at roughly $608 million in 2023 and is projected to exceed $1 billion by the end of the decade, reflecting widespread integration of genomic sequencing, hospital networks, and commercial bioinformatics services. Firms such as BGI Group operate large-scale sequencing and testing platforms (including the noninvasive fetal trisomy test) that generate and process substantial volumes of genomic data within an integrated ecosystem that spans clinical care, research, and industry. China has also rapidly expanded its domestic cell- and gene-therapy ecosystem, including multiple Chimeric Antigen Receptor T-cell therapy approvals and a growing clinical biomanufacturing base, shortening the path from research to deployment. At the same time, China is building the data substrate that makes AI-bio compounding possible: massive longitudinal health cohorts and national-level biodata platforms designed for large-scale integration and analysis…(More)”.