Stefaan Verhulst

Article by Krzysztof Pelc: “We are good at predicting what machines will do better; we are far worse at predicting what people will value differently once that happens. Yet the history of technological disruption suggests a fairly consistent pattern: when one property becomes abundant, perceived value migrates elsewhere. The Arts and Crafts aesthetic thus rose up as a challenge to factory production. Similarly, despite quartz watches making accuracy trivial by the 1970s, mechanical watches once more dominate the global market by value (Raffaelli 2019). Technological shocks alter not only the goods on offer, but also the basis by which those goods are evaluated. What had seemed central is downgraded; what had been incidental becomes precious.

The direction of this change is invariably towards the human—not out of sentiment, but because in the wake of technological shocks, it is the human aspect that grows distinctive. The advent of large language models (LLMs) is beginning to have a similar effect on all writing. LLMs make verbal fluency cheap, they make competent prose abundant. As that happens, the old premium on flowing prose will weaken. If smoothness can be summoned on demand, smoothness no longer distinguishes. The scarce good will no longer be fluency, but provenance: whether a text can be traced to a particular human sensibility, lived experience, and intention. Call it the flight-to-humanity effect.

Usually, this revaluation works to the humans’ advantage. It’s the phenomenon that protects the radiologist, the craftsperson, the live performer, once their core output is superseded by machines. But writing will likely be an exception.

The problem is that writing is peculiarly ill-suited to certifying its own origins. In most domains, human provenance remains legible in the thing itself. A handmade bowl can bear the mark of its maker; a performer is present in the act; a physician’s judgment is tied to the physical person and their credentials. Writing is different. It arrives as a finished product, stripped of the conditions of its making. The reader sees the result, but not the process that produced it. Novels, essays, love notes, wedding speeches: none carry intrinsic evidence of authorship. This is not simply a practical difficulty. It’s a property of writing itself. And it’s what means that suspicion, once introduced, extends to every text alike…(More)”.

“Human authored”? Who knows

Paper by Stefaan Verhulst, Roshni Singh, Marta Dell’Aquila, Leonie Kunze, and Cosima Lenz: “Women’s health research remains under-resourced, underprioritized, and narrowly defined. Across the life course, women experience distinct health needs with significant implications for health and wellbeing, yet persistent gaps in evidence and data continue to reinforce inequities. In the absence of a universally accepted definition of women’s health, this study aimed to develop a topic map to capture its breadth and to identify an evidence-informed set of the top ten priority questions to guide future women’s health research and innovation.
Methods: We used a participatory, iterative methodology inspired by the 100 Questions Initiative, combining structured stakeholder engagement, rapid evidence synthesis, and iterative validation. An initial topic map was developed through an in-person workshop and refined through ongoing engagement with 77 global experts in women’s health and data science. Guided by the topic map, experts submitted research questions via a virtual survey, which were refined, clustered, prioritized, and ranked.
Results: The topic map served as a shared framework to guide the submission of actionable research questions and comprised four branches: (1) key domains of women’s health; (2) determinants and barriers; (3) technology and innovation; and (4) research and evidence gaps. A total of 113 questions were submitted, clustered into 56 themes, and narrowed to a top ten through expert prioritization, followed by public ranking via a virtual survey that yielded 115 responses. The highest-ranked questions focused on reframing and prioritizing women’s health, strengthening investment and innovation ecosystems, and addressing evidence gaps, research participation, data quality, and equity.
Conclusion: This study presents a comprehensive topic map that captures the complexity and cross-sectoral nature of women’s health and provides a unifying framework for the field. The prioritized questions offer a strategic foundation to guide future global research, policy, and investment to advance women’s health innovation…(More)”.

A Crowdsourced Topic Map and Future Research Agenda for Women’s Health

Editorial to Special Issue by Manuel Pérez-Troncoso, Katrina L. Bledsoe, Karen Peterman, Theresa N. Melton, and Rodney K. Hopson: “…People-centered approaches challenge evaluators to “walk the talk” of culturally responsive, equitable, and socially just practices by expanding the role of evaluation in service to society. This means not only studying with communities but also giving back, investing in, and standing side by side with them (Bledsoe 2021, 2014). Similar to many efforts working across multiple sectors, people-centered approaches often take place in communities shaped by a history of colonialism, discrimination, and marginalization, which continues to influence life, opportunity, and culture on a daily basis. Researchers and evaluators must strive to build authentic, collaborative relationships with participants to understand and help tell the story of how they are affected by the programs we work with. We must integrate and prioritize culture, local context, and community perspectives in all aspects of program and evaluation design, implementation, and use.

This issue identifies three key dimensions that differentiate People-Centered Evaluation (PCE) from program-centered evaluation, as outlined in Table 1: Full humanity (the evaluator’s positionality and axiology); prioritizing relationships (investment in relationships vs. extraction); and community engagement (pursues open vs. selective access). These dimensions reflect a relational worldview: ontologically, reality is understood as co-constructed through relationships and contexts; epistemologically, knowledge emerges through dialogue, participation, and lived experience; and methodologically, evaluation practices adapt to community-defined meanings and purposes (Mertens et al. 2025). We contend that evaluation feels and functions differently when it is prioritized using the people-centered distinctions in Table 1. We challenge readers to consider the following: In what ways would your evaluation practice look different if you began a new project with the intention of being an agent of social change versus a distant observer? In what ways would your evaluation practice look different if you began a new project with the intention of fostering and strengthening relationships with community members rather than focusing on creating a context for gathering data? In what ways would your evaluation theories, processes, and communication strategies differ if you prioritized and centered authentic community engagement?

TABLE 1. Differences between program-centered evaluation and people-centered evaluation.

| Dimensions | Program-centered evaluation | People-centered evaluation |
| --- | --- | --- |
| Full humanity | Pursue objectivity, impartial assessments of programs, initiatives, and strategies | Evaluators, as agents for social change, address inequalities; acknowledges positionality, and perspective |
| Prioritizing relationships | Focus on results, efficiency, and impact | Invest in long-term relationships with participants |
| Community engagement | Engage stakeholders selectively, often based on roles and specific need | Ensure open access and inclusive engagement with communities |

This issue aims to push the boundaries of evaluation by focusing on both theoretical advances and practical applications of people-centered approaches, based upon those that are culturally-responsive, indigenous, and equity-driven approaches…(More)”.

People-centered evaluation: Theory and Action

Article by Alex Pasternack: “On March 2, two days into the United States and Israel’s air campaign against Iran, CNN published imagery showing a still-smoking operations center at Port Shuaiba in Kuwait, where six American service members had just been killed by an Iranian drone—before the Pentagon had provided details of the strike, including the full death toll. A day later, the New York Times offered a preliminary rundown of damage to US military sites across the Gulf. In the following days, multiple outlets showed that a strike on an elementary school that killed 175 people had likely been carried out by the US—an apparent mistake, which the Pentagon initially disputed. Amid a cascade of restrictions and conflicting narratives, all of these reports relied on a cornerstone of open-source intelligence: commercial satellite imagery, much of it from a single vendor called Planet Labs. 

Then, on March 6, the flow of pictures began slowing to a crawl. Planet Labs, a San Francisco–based company that operates more than two hundred satellites capable of photographing most of Earth’s landmass once per day—an unparalleled frequency among commercial satellites—announced a four-day hold on “all new imagery collected over the Gulf States, Iraq, Kuwait, and adjacent conflict zones.” On March 11, Planet, as the firm is known, told customers the delay would be extended to fourteen days and expanded to include “all of Iran and nearby allied bases, in addition to the Gulf States and existing conflict zones.” Planet said it had made the decision through discussions with experts inside and outside of the government about preventing images from being “tactically leveraged by adversarial actors to target allied and NATO-partner personnel and civilians”—in other words, out of fear that Iran might use them to target the US and its allies in the Middle East.

On April 4, Planet’s stop in service became indefinite—imagery feeds would be halted, retroactive to March 8, local time. (Many outlets reported that the last available images would be from March 9.) “Due to the conflict in the Middle East, the U.S. government has requested all satellite imagery providers voluntarily implement an indefinite withhold of imagery in the designated Area of Interest,” the company told customers in an email, the text of which was provided to CJR by a spokesperson. Going forward, Planet said, it would release imagery on a case-by-case basis and for “urgent, mission-critical requirements or in the public interest.” A spokesperson told me that “this model is in line with the media policies of other remote-sensing companies.”…(More)”.

Blind spots

Paper by Ana Dodik and Moira Weigel: “We put forth a critical theoretical framework for analyzing generative models both descriptively and normatively. Our thesis is that generative models automate the production not only of intellectual labor or intelligence, but of a broader set of human social capacities we name “social doing.” We do this by historicizing the commodification of sociality in the digital economy, leading to the availability of social data as the precondition for generative models. We elaborate our definition of “social doing” by drawing a distinction between “use” and “exchange” sociality and further differentiate between the ways that generative models either substitute for or mediate existing social relations and processes. We then turn to existing empirical research on how people use generative model-based products and the effects that their use has upon them. In this, we introduce the concept of Synthetic Sociality, a social reality in part fabricated by Silicon Valley’s privately owned and undemocratically governed generative models. Lastly, we offer a normative analysis based on our findings and framework, and discuss future design opportunities…(More)”.

Synthetic Sociality: How Generative Models Privatize the Social Fabric

Article by Stefaan Verhulst and Claudia Chwalisz: “The race to build the infrastructure of artificial intelligence is accelerating. Across the world, fields, industrial parks, and suburban edges are being transformed into data centers — vast, warehouse-like facilities that power everything from cloud storage to large language models.

For technology companies, this expansion is claimed to be essential. For the communities where these facilities are built, it is becoming increasingly contentious.

Recent reporting in The New York Times and elsewhere has captured the growing unease. Residents are questioning the scale of water consumption required to cool servers, the strain on local energy grids, and the transformation of landscapes once defined by entirely different economic and environmental logics. In many cases, the promised benefits — jobs, investment, growth — feel limited when set against the demands these facilities place on shared resources.

What is emerging is not simply a series of local disputes. It is a broader challenge of legitimacy.

There is a concept for this, though it predates the digital economy. In the 1990s, mining and energy companies (often called extractive industries) began to recognize that regulatory approval was no longer sufficient to ensure that projects could proceed smoothly. Communities could — and did — push back against developments that were fully legal but widely perceived as unfair or harmful. The term that emerged to describe what was missing was “a social license to operate”.

A social license is not granted by governments. It is conferred, informally but powerfully, by the people who live with the consequences of a project. It depends on trust, on transparency, and on a sense that the balance between costs and benefits is acceptable. Crucially, it is not static. It can be strengthened over time — or withdrawn.

Data centers are now encountering this reality…(More)”.

Data Centers Need a Social License to Operate

White Paper by the Siegel Family Endowment: “We’re living in an era of unprecedented information abundance, yet still struggling to generate real insight. The issue isn’t a lack of data, but a lack of well-formed questions. The way we frame problems—and who gets to frame them—shapes everything that follows.

Better Questions, Better Insights introduces the emerging science of questions: a more rigorous approach to defining, testing, and refining the inquiries that guide our work.

At Siegel Family Endowment, this approach has shaped an inquiry-driven model of philanthropy—one that moves beyond linear solutions toward deeper systems change.

This paper offers a practical framework for embedding inquiry into decision-making, helping organizations move from information to insight—and from insight to impact…

This paper is an invitation. A look under the hood at how we’ve approached inquiry in our own work, and a starting point for shared exploration.

As the complexity of societal challenges grows, our approaches must evolve with it. That means embracing a more rigorous practice of curiosity—asking better questions, together—and expanding who gets to ask them.

If we can do that, we have an opportunity to modernize and democratize philanthropy in ways that better meet this moment…(More)”.

Better Questions, Better Insights

Report by the World Economic Forum: “Agentic artificial intelligence (AI) is driving a fundamental shift in capability, allowing systems to autonomously execute end-to-end, multi-step workflows. This technological progress is poised to transform how governments operate and serve citizens. However, without a strategic, evidence-based grasp of where agentic AI can deliver the greatest public value – balancing high potential with manageable complexity – governments risk investing in the wrong places, undermining confidence in the technology and launching pilots that fail to scale…(More)”.

Making Agentic AI Work for Government: A Readiness Framework

Initiative by Hugging Face: “Many of the deepest challenges in advancing AI for scientific discovery are not purely technical—they are social and organizational. Progress is often limited not by algorithms or computational power, but by how effectively we coordinate efforts, share resources, and collaborate across disciplinary boundaries. Hugging Science brings together a global community of researchers, developers, and practitioners committed to accelerating breakthroughs in physics, biology, chemistry, neuroscience, and beyond through open collaboration.

Our vision is grounded in the argument presented in our position paper: democratizing AI for science requires treating it as a collective social project where equitable participation and sustainable collaboration are prerequisites for technical progress.

What We Do

  • Launch collaborative challenges and open problem calls to identify and mobilize collective effort around upstream computational bottlenecks with broad applicability across scientific domains—such as efficient PDE solvers, multi-scale coupling, and high-dimensional sampling—rather than fragmenting resources across narrow, domain-specific applications
  • Build open toolkits, benchmarks, and workflows that address data fragmentation through standardized formats and shared evaluation metrics, making it easier for researchers at institutions of all resource levels to collaborate and build on each other’s work
  • Support cross-disciplinary exchange and education by creating resources that bridge the communication gap between domain scientists who prioritize mechanistic understanding and ML researchers who focus on predictive performance, enabling more effective collaboration
  • Nurture a community that values contributions to data curation, infrastructure development, and educational resources alongside algorithmic innovation—recognizing that datasets and infrastructure often have far greater long-term impact than individual models
  • Learn together through open discussion of both technical advances and the social and institutional barriers that constrain progress, working to align incentives and build sustainable practices for scientific AI…(More)”.

Hugging Science

Blog by Michael Hallsworth: “Last week, I was in San Francisco for the HumanX conference. Listening to people there pushed me to ask a question that’s been bouncing around in my head with increasing insistency:

What’s the psychological impact of being the human in the loop?

I feel like this issue is a time bomb that could destroy current plans of how AI will be governed. If you listen to any AI policy conversation for more than a few minutes, you’re likely to hear the phrase “human-in-the-loop” (HITL). It’s a catch-all term that provides reassurance and allows us to carry on with the technical discussion. Like in the workplace, if we just keep the right people “in the loop,” all will be well.

The idea evokes an image of a capable, watchful person who will intervene expertly if the system goes wrong. Whole governance frameworks are built on top of this comforting picture. For example, Article 14 of the EU AI Act tries to put a set of requirements on humans to “prevent or minimise the risks to health, safety or fundamental rights”.

But the Act says nothing about whether these humans will have the skills, attention, or motivation to perform this oversight. Or, even if they can, for how long. Or what the experience would be like.

In other words, we’re not thinking enough about what it actually feels like to be the human in the loop.

I find that gap increasingly hard to ignore because billions (?) of humans-in-the-loop may soon face two contrasting problems that we’ve been neglecting:

  • Verification burdens caused by too much cognitive stimulus;
  • Vigilance atrophy caused by too little stimulus.

The tricky thing is that these two risks can affect the same person on the same day. Moreover, they call for almost opposite responses. Even trickier! Here I suggest how we should start tackling this problem…(More)”.

Who wants to be the human in the loop?
