Stefaan Verhulst
Blog by Qhala: “…In AI, benchmarks are the gold standard for evaluation. They are used to test whether large language models (LLMs) can reason, diagnose, and communicate effectively. In healthcare, LLMs are tested against benchmarks before they’re considered “safe” for clinical use.
But here’s the problem: These benchmarks are primarily built for Western settings. They reflect English-language health systems, Western disease burdens, and datasets scraped from journals and exams thousands of kilometres away from the real-world clinics of Kisumu, Kano, or Kigali.
A study in Kenya found over 90 different clinical guidelines used by frontline health workers in primary care. That’s not chaos, it’s context. Medicine in Africa is deeply localised, shaped by resource availability, epidemiology, and culture. When a mother arrives with a feverish child, a community nurse doesn’t consult the United States Medical Licensing Examination (USMLE). She consults the local Ministry of Health protocol and speaks in Luo, Hausa, or Amharic.
In practice, human medical doctors have to go through various levels of rigorous, context-based, localised assessment before they can practise in a country and in a specific specialisation. These licensing exams aren’t arbitrary; they’re tailored to national priorities, clinical practices, and patient populations. They acknowledge that even great doctors must be assessed in context. These assessments are mandatory, and their logic is taken for granted when it comes to human clinicians. A Kenyan-trained doctor must pass the USMLE to practise in the United States. In the United Kingdom, it is the Professional and Linguistic Assessments Board (PLAB) test. In Australia, the relevant assessment is the Australian Medical Council (AMC) examination.
However, unlike the nationally ratified assessments for humans, LLM benchmarks, and consequently the LLMs and health AI tools evaluated against them, are neither created for local realities nor reflective of local context.
…Amidst the limitations of global benchmarks, a wave of important African-led innovations is starting to reshape the landscape. Projects like AfriMedQA represent some of the first structured attempts to evaluate large language models (LLMs) using African health contexts. These benchmarks thoughtfully align with the continent’s disease burden, such as malaria, HIV, and maternal health. Crucially, they also attempt to account for cultural nuances that are often overlooked in Western-designed benchmarks.
But even these fall short. They remain Anglophone…(More)”.
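For readers unfamiliar with what “testing an LLM against a benchmark” looks like in practice, below is a minimal sketch of the kind of multiple-choice evaluation loop that efforts like AfriMedQA build on. It is illustrative only: the question file, model identifier, and prompt format are assumptions, not details drawn from any of the projects mentioned above.

```python
# Minimal sketch of a multiple-choice medical-QA benchmark harness.
# Assumptions: questions live in a local JSON file with "question",
# "options", and "answer" fields; any instruction-tuned model served
# through the Hugging Face `transformers` pipeline will do.
import json
from transformers import pipeline

generator = pipeline("text-generation", model="some-instruction-tuned-model")  # placeholder model id

def ask(question: str, options: dict[str, str]) -> str:
    """Prompt the model and return the letter of the option it picks."""
    prompt = (
        "Answer the following clinical question with a single letter.\n"
        f"Question: {question}\n"
        + "\n".join(f"{k}. {v}" for k, v in options.items())
        + "\nAnswer:"
    )
    completion = generator(prompt, max_new_tokens=5)[0]["generated_text"]
    reply = completion[len(prompt):].strip()
    return reply[:1].upper()  # first character as the chosen option

with open("localised_clinical_questions.json") as f:  # hypothetical, locally curated benchmark file
    items = json.load(f)

correct = sum(ask(it["question"], it["options"]) == it["answer"] for it in items)
print(f"Accuracy: {correct / len(items):.1%} on {len(items)} locally curated questions")
```

The point of the sketch is the data rather than the loop: swapping in a locally curated question set, local-language prompts, and local clinical protocols is what turns a generic harness into a context-aware one.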
British Council: “From telling stories that seed future breakthroughs to diversifying AI datasets, artists reimagine what technologies can be, and who they can be for. This publication creates an international evidence base for this argument. 56 leaders in art and technology have offered 40 statements, spanning 20 countries and 5 continents. As a collection, they articulate artists, the cultural sector and creative industries as catalysing progressive innovation with cultural diversity, human values, and community at its core.
Responses include research leads from Adobe, Lelapa AI and Google, who detail the contribution artists make to the human-centric development of high-growth technologies. UK institutions like Serpentine and FACT, and LAS Art Foundation in Germany show cultural organisations are essential spaces for progressive artist-led R&D. Directors of TUMO Centre for Creative Technologies in Armenia, and Diriyah Art Futures in Saudi Arabia highlight education across art and technology as a source of skills for the future. Leaders of African Digital Heritage in Kenya and the Centre for Historical Memory in Colombia demonstrate how community ownership of technologies for heritage preservation increases network resilience. Artists such as Xu Bing in China and Libby Heaney in the UK present art as a site for public demystification of complex technologies, from space satellites to quantum computing.
The perspectives presented in this publication serve as a resource for policy making and programme development spanning art and technology. Global in scope, they offer case studies that highlight why innovation needs artists, on both a national and international scale…(More)”.
Paper by Alex Fischer et al: “While the Sustainable Development Goals (SDGs) were being negotiated, global policymakers assumed that advances in data technology and statistical capabilities, what was dubbed the “data revolution”, would accelerate development outcomes by improving policy efficiency and accountability. The 2014 report to the United Nations Secretary General, “A World That Counts” framed the data-for-development agenda, and proposed four pathways to impact: measuring for accountability, generating disaggregated and real-time data supplies, improving policymaking, and implementing efficiency. The subsequent experience suggests that while many recommendations were implemented globally to advance the production of data and statistics, the impact on SDG outcomes has been inconsistent. Progress towards SDG targets has stalled despite advances in statistical systems capability, data production, and data analytics. The coherence of the SDG policy agenda has undoubtedly improved aspects of data collection and supply, with SDG frameworks standardizing greater indicator reporting. However, other events, including the response to COVID-19, have played catalytic roles in statistical system innovation. Overall, increased financing for statistical systems has not materialized, though planning and monitoring of these national systems may have longer-term impacts. This article reviews how assumptions about the data revolution have evolved and where new assumptions are necessary to advance the impact across the data value chain. These include focusing on measuring what matters most for decision-making needs across polycentric institutions, leveraging the SDGs for global data standardization and strategic financial mobilization, closing data gaps while enhancing policymaker analytic capabilities, and fostering collective intelligence to drive data innovation, credible information, and sustainable development outcomes…(More)”.
Responsible Data for Children (RD4C): “From schools to clinics to the phones in their hands, children are generating more data than ever before. This data holds enormous potential, both informing smarter policies and helping every child to thrive. But with this opportunity comes serious risks, too. Misuse, breaches, and privacy violations are all too common. Without strong governance, the very systems meant to support children can expose them to harm, bias, or exclusion.
Since 2019, the Responsible Data for Children (RD4C) initiative—a partnership between UNICEF and The GovLab at New York University—has worked to strengthen and promote responsible data practices for and about children across every stage of the data lifecycle, from collection to processing to use.
In this time, RD4C.org has reached more than 10,000 users across 165 countries, with its tools and resources viewed over 74,000 times—a reflection of growing global momentum to make data governance work better for children.
We are now pleased to announce the launch of an upgraded RD4C.org: a more accessible, dynamic, and action-driven platform to support responsible data use for and about children in today’s rapidly evolving digital landscape.
What’s New
RD4C.org has been upgraded with a fresh design and new features to scale impact, deepen accessibility, and better equip those working to uphold children’s rights in the rapidly evolving digital age.
- Multilingual Access: The upgraded RD4C.org is now available in five languages — English, Spanish, French, Arabic, and Chinese. By making the site and its resources accessible in multiple languages, RD4C empowers practitioners, policymakers, and advocates, including children and young people, to adapt and apply child-centered data governance principles across diverse political, cultural, and operational contexts.
- Comprehensive Resource Hub: The redesigned resource section brings together videos, case studies, and RD4C tools in one cohesive space. This enhanced collection page offers practical, actionable insights for anyone working to advance child-centered data governance, whether shaping national policies, improving service delivery, or designing ethical data systems.
- A New Editorial Space for Global Commitment: As part of our deepening commitment to cross-sector collaboration, RD4C.org now hosts a dedicated space spotlighting the Commitment to Data Governance Fit for Children—a global initiative launched at the 2024 UN World Data Forum to co-develop responsible data systems with and for children, grounded in their rights and realities. This new editorial focus, featured in the blog section, highlights practical insights from both young people and key committed partners — including UNICEF, The GovLab, GPSDD, the Datasphere Initiative, the Abu Dhabi Early Childhood Authority, Highway Child, Develop Metrics, and others — showcasing real-world efforts to make data governance truly fit for children…(More)”.
Briefing for European Parliament: “…explores the potential of generative AI in supporting foresight analysis and strategic decision-making. Recent technological developments promise an increased role for large language models (LLMs) in policy research and analysis. From identifying trends and weak signals to fleshing out rich scenario narratives and bringing them to life in experiential and immersive ways, generative AI is empowering foresight analysts in their endeavour to anticipate uncertainties and support policymakers in preparing better for the future. As generative agents powered by LLMs become more adept at mimicking human behaviour, they could offer foresight practitioners and policy analysts new ways to gain additional insights at greater speed and scale, supporting their work.
However, to effectively integrate generative AI and LLMs into foresight practice, it is crucial to critically evaluate their limitations and biases. Human oversight and expertise are essential for ensuring the reliability and validity of AI-generated outputs, as is attention to transparency, accountability, and other ethical considerations. While generative AI can augment human capabilities, it should not be seen as a replacement for human involvement and judgment.
By combining human expertise with generative AI capabilities, foresight analysts can uncover new opportunities to enhance strategic planning in policymaking. A proactive and informed approach to adopting generative AI in foresight analysis may lead to more informed, nuanced, and effective strategies when dealing with complex futures…(More)”.
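As a rough illustration of the workflow the briefing describes, the sketch below prompts an LLM to expand a handful of trends and weak signals into a draft scenario narrative. The client library, model name, and prompt wording are assumptions chosen for illustration; any chat-completion API could stand in.

```python
# Illustrative sketch: turning trends and weak signals into a draft
# scenario narrative with a chat-completion API. The model name and
# prompt wording below are placeholders, not taken from the briefing.
from openai import OpenAI

client = OpenAI()  # assumes an API key is already configured in the environment

trends = [
    "ageing populations across the EU",
    "rapid uptake of small modular renewables",
]
weak_signals = [
    "pilot programmes for four-day work weeks in public administrations",
]

prompt = (
    "You are assisting a foresight analyst. Using the trends and weak signals "
    "below, draft a 300-word scenario for Europe in 2040. Make the scenario "
    "internally consistent, name its key uncertainties, and flag your assumptions.\n"
    f"Trends: {'; '.join(trends)}\n"
    f"Weak signals: {'; '.join(weak_signals)}"
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; any capable chat model works
    messages=[{"role": "user", "content": prompt}],
    temperature=0.8,  # some creativity helps scenarios diverge
)
print(response.choices[0].message.content)
```

Whatever tooling is used, the briefing’s caveat applies: the output is a starting point for human analysts to challenge and refine, not a finished scenario.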
Paper by Marcel Binz: “Establishing a unified theory of cognition has been an important goal in psychology. A first step towards such a theory is to create a computational model that can predict human behaviour in a wide range of settings. Here we introduce Centaur, a computational model that can predict and simulate human behaviour in any experiment expressible in natural language. We derived Centaur by fine-tuning a state-of-the-art language model on a large-scale dataset called Psych-101. Psych-101 has an unprecedented scale, covering trial-by-trial data from more than 60,000 participants performing in excess of 10,000,000 choices in 160 experiments. Centaur not only captures the behaviour of held-out participants better than existing cognitive models, but it also generalizes to previously unseen cover stories, structural task modifications and entirely new domains. Furthermore, the model’s internal representations become more aligned with human neural activity after fine-tuning. Taken together, our results demonstrate that it is possible to discover computational models that capture human behaviour across a wide range of domains. We believe that such models provide tremendous potential for guiding the development of cognitive theories, and we present a case study to demonstrate this…(More)”.
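The paper’s core move, fine-tuning a pretrained language model on natural-language transcripts of behavioural experiments, can be sketched roughly as follows. The base model, dataset formatting, and LoRA hyperparameters here are assumptions for illustration and are not the paper’s actual settings.

```python
# Rough sketch of parameter-efficient fine-tuning on text-encoded
# behavioural data, in the spirit of Centaur/Psych-101. All names and
# hyperparameters below are placeholders, not the paper's settings.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "some-open-llm"  # placeholder base model id
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Attach low-rank adapters so only a small fraction of weights is trained.
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"))

# Assume each record is an experiment transcript: instructions, trial
# history, and the participant's choice rendered as plain text.
data = load_dataset("json", data_files="behavioural_transcripts.json")["train"]
data = data.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=1024),
                remove_columns=data.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="centaur-sketch", per_device_train_batch_size=1,
                           num_train_epochs=1, learning_rate=1e-4),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

The sketch only covers the training side; the paper’s key test is whether the fine-tuned model predicts held-out participants’ choices better than established domain-specific cognitive models.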
Book by Robert V. Moody, Ming-Dao Deng: “One of life’s most fundamental revelations is change. Presenting the fascinating view that pattern is the manifestation of change, this unique book explores the science, mathematics, and philosophy of change and the ways in which they have come to inform our understanding of the world. Through discussions on chance and determinism, symmetry and invariance, information and entropy, quantum theory and paradox, the authors trace the history of science and bridge the gaps between mathematical, physical, and philosophical perspectives. Change as a foundational concept is deeply rooted in ancient Chinese thought, and this perspective is integrated into the narrative throughout, providing philosophical counterpoints to customary Western thought. Ultimately, this is a book about ideas, intended for a wide audience not so much as a book of answers but as an introduction to new ways of viewing the world.
- Combines mathematics and philosophy to explore the relationship between pattern and change
- Uses examples from the world around us to illustrate how thinking has developed over time and in different parts of the world
- Includes chapters on information, dynamics, symmetry, chance, order, the brain, and quantum mechanics, all introduced gently and building progressively toward deeper insights
- Accompanied online by additional chapters and endnotes to explore topics of further interest…(More)”.
Paper by Chiara Farronato, Andrey Fradkin & Tesary Lin: “We study the welfare consequences of choice architecture for online privacy using a field experiment that randomizes cookie consent banners. We study three ways in which firms or policymakers can influence choices: (1) nudging users through banner design to encourage acceptance of cookie tracking; (2) setting defaults when users dismiss banners; and (3) implementing consent decisions at the website versus browser level. Absent design manipulation, users accept all cookies more than half of the time. Placing cookie options behind extra clicks strongly influences choices, shifting users toward more easily accessible alternatives. Many users dismiss banners without making an explicit choice, underscoring the importance of default settings. Survey evidence further reveals substantial confusion about default settings. Using a structural model, we find that among consent policies requiring site-specific decisions, consumer surplus is maximized when consent interfaces clearly display all options and default to acceptance in the absence of an explicit choice. However, the welfare gains from optimizing banner design are much smaller than those from adopting browser-level consent, which eliminates the time costs of repeated decisions…(More)”.
A Report of the Center for Open Data Enterprise (CODE): “The U.S. has had a strong bipartisan consensus that open federal data is an essential public good. Since 2009, initiatives by Presidents Obama, Trump, and Biden and two acts of Congress have made federal data more accessible, transparent, and useful. The current presidential administration has not challenged these established principles. However, the administration has altered many government data programs on an individual basis, often with the rationale that they do not align with the President’s priorities.
Civil society has responded to these actions with a data rescue movement to archive critical datasets and keep them publicly available. There is a good chance that the movement will be able to save most of the federal data that was available in January 2025.
The greater risk, however, is to the future. The data we have today will not be very useful in a year or two, and future data collections are now under threat. Since the start of the Trump Administration, the federal government has:
● Dismantled and defunded agencies that collect data mandated by Congress
● Discontinued specific data programs
● Defunded research that can be a source of open scientific data
● Disbanded advisory committees for the U.S. Census Bureau and other data-collecting agencies and offices
● Removed data disaggregated by sexual orientation and gender identity
● Proposed changing established methods of data collection and publishing in some key areas
These changes can have a major impact on the many institutions – including state and local governments, businesses, civil society organizations, and more – that depend on federal data for policymaking, decision making, and growth…(More)”.
Report by ProPublica: “The Internal Revenue Service is building a computer program that would give deportation officers unprecedented access to confidential tax data.
ProPublica has obtained a blueprint of the system, which would create an “on demand” process allowing Immigration and Customs Enforcement to obtain the home addresses of people it’s seeking to deport.
Last month, in a previously undisclosed dispute, the acting general counsel at the IRS, Andrew De Mello, refused to turn over the addresses of 7.3 million taxpayers sought by ICE. In an email obtained by ProPublica, De Mello said he had identified multiple legal “deficiencies” in the agency’s request.
Two days later, on June 27, De Mello was forced out of his job, people familiar with the dispute said. The addresses have not yet been released to ICE. De Mello did not respond to requests for comment, and the administration did not address questions sent by ProPublica about his departure.
The Department of Government Efficiency began pushing the IRS to provide taxpayer data to immigration agents soon after President Donald Trump took office. The tax agency’s acting general counsel refused and was replaced by De Mello, who Trump administration officials viewed as more willing to carry out the president’s agenda. Soon after, the Department of Homeland Security, ICE’s parent agency, and the IRS negotiated a “memorandum of understanding” that included specific legal guardrails to safeguard taxpayers’ private information.
In his email, De Mello said ICE’s request for millions of records did not meet those requirements, which include having a written assurance that each taxpayer whose address is being sought was under active criminal investigation.
“There’s just no way ICE has 7 million real criminal investigations, that’s a fantasy,” said a former senior IRS official who had been advising the agency on this issue. The demands from the DHS were “unprecedented,” the official added, saying the agency was pressing the IRS to do what amounted to “a big data dump.”
In the past, when law enforcement sought IRS data to support its investigations, agencies would give the IRS the full legal name of the target, an address on file and an explanation of why the information was relevant to a criminal inquiry. Such requests rarely involved more than a dozen people at a time, former IRS officials said.
Danny Werfel, IRS commissioner during the Biden administration, said the privacy laws allowing federal investigators to obtain taxpayer data have never “been read to open the door to the sharing of thousands, tens of thousands, or hundreds of thousands of tax records for a broad-based enforcement initiative.”
A spokesperson for the White House said the planned use of IRS data was legal and a means of fulfilling Trump’s campaign pledge to carry out mass deportations of “illegal criminal aliens.”
Taxpayer data is among the most confidential in the federal government and is protected by strict privacy laws, which have historically limited its transfer to law enforcement and other government agencies. Unauthorized disclosure of taxpayer return information is a felony that can carry a penalty of up to five years in prison…(More)”.