Stefaan Verhulst

WorkflowHub: a registry for computational workflows

Curated on May 24, 2025May 26, 2025 by Stefaan Verhulst

Paper by Ove Johan Ragnar Gustafsson et al: “The rising popularity of computational workflows is driven by the need for repetitive and scalable data processing, sharing of processing know-how, and transparent methods. As both combined records of analysis and descriptions of processing steps, workflows should be reproducible, reusable, adaptable, and available. Workflow sharing presents opportunities to reduce unnecessary reinvention, promote reuse, increase access to best practice analyses for non-experts, and increase productivity. In reality, workflows are scattered and difficult to find, in part due to the diversity of available workflow engines and ecosystems, and because workflow sharing is not yet part of research practice. WorkflowHub provides a unified registry for all computational workflows that links to community repositories, and supports both the workflow lifecycle and making workflows findable, accessible, interoperable, and reusable (FAIR). By interoperating with diverse platforms, services, and external registries, WorkflowHub adds value by supporting workflow sharing, explicitly assigning credit, enhancing FAIRness, and promoting workflows as scholarly artefacts. The registry has a global reach, with hundreds of research organisations involved, and more than 800 workflows registered…(More)”

Where Cloud Meets Cement

Curated on May 24, 2025May 24, 2025 by Stefaan Verhulst

Report by Hanna Barakat, Chris Cameron, Alix Dunn and Prathm Juneja, and Emma Prest: “This report examines the global expansion of data centers driven by AI and cloud computing, highlighting both their economic promises and the often-overlooked social and environmental costs. Through case studies across five countries, it investigates how governments and tech companies influence development, how communities resist harmful effects, and what support is needed for effective advocacy…(More)”.

Designing Shared Data Futures: Engaging young people on how to re-use data responsibly for health and well-being

Curated on May 22, 2025May 22, 2025 by Stefaan Verhulst

Report by Hannah Chafetz, Sampriti Saxena, Tracy Jo Ingram, Andrew J. Zahuranec, Jennifer Requejo and Stefaan Verhulst: “When young people are engaged in data decisions for or about them, they not only become more informed about this data, but can also contribute to new policies and programs that improve their health and well-being. However, oftentimes young people are left out of these discussions and are unaware of the data that organizations collect.

In October 2023, The Second Lancet Commission on Adolescent Health and well-being, the United Nations Children’s Fund (UNICEF), and The GovLab at New York University hosted six Youth Solutions Labs (or co-design workshops) with over 120 young people from 36 countries around the world. In addition to co-designing solutions to five key issues impacting their health and well-being, we sought to understand current sentiments around the re-use of data on those issues. The Labs provided several insights about young people’s preferences regarding: 1) the purposes for which data should be re-used to improve health and well-being, 2) the types and sources of data that should and should not be re-used, 3) who should have access to previously collected data, and 4) under what circumstances data re-use should take place. Additionally, participants provided suggestions of what ethical and responsible data re-use looks like to them and how young people can participate in decision making processes. In this paper, we elaborate on these findings and provide a series of recommendations to accelerate responsible data re-use for the health and well-being of young people…(More)”.

Computer Science and the Law

Curated on May 22, 2025May 22, 2025 by Stefaan Verhulst

Article by Steven M. Bellovin: “There were three U.S. technical/legal developments occurring in approximately 1993 that had a profound effect on the technology industry and on many technologists. More such developments are occurring with increasing frequency.

The three developments were, in fact, technically unrelated. One was a bill before the U.S. Congress for a standardized wiretap interface in phone switches, a concept that spread around the world under the generic name of “lawful intercept.” The second was an update to the copyright statute to adapt to the digital age. While there were some useful changes—caching proxies and ISPs transmitting copyrighted material were no longer to be held liable for making illegal copies of protected content—it also provided an easy way for careless or unscrupulous actors—including bots—to request takedown of perfectly legal material. The third was the infamous Clipper chip, an encryption device that provided a backdoor for the U.S.—and only the U.S.—government.

All three of these developments could be and were debated on purely legal or policy grounds. But there were also technical issues. Thus, one could argue on legal grounds that the Clipper chip granted the government unprecedented powers, powers arguably in violation of the Fourth Amendment to the U.S. Constitution. That, of course, is a U.S. issue—but technologists, including me, pointed out the technical risks of deploying a complex cryptographic protocol, anywhere in the world (and many other countries have since expressed similar desires). Sure enough, Matt Blaze showed how to abuse the Clipper chip to let it do backdoor-free encryption, and at least two other mechanisms for adding backdoors to encryption protocols were shown to have flaws that allowed malefactors to read data that others had encrypted.

These posed a problem: debating some issues intelligently required not just a knowledge of law or of technology, but of both. That is, some problems cannot be discussed purely on technical grounds or purely on legal grounds; the crux of the matter lies in the intersection.

Consider, for example, the difference between content and metadata in a communication. Metadata alone is extremely powerful; indeed, Michael Hayden, former director of both the CIA and the NSA, once said, “We kill people based on metadata.” The combination of content and metadata is of course even more powerful. However, under U.S. law (and the legal reasoning is complex and controversial), the content of a phone call is much more strongly protected than the metadata: who called whom, when, and for how long they spoke. But how does this doctrine apply to the Internet, a network that provides far more powerful abilities to the endpoints in a conversation? (Metadata analysis is not an Internet-specific phenomenon. The militaries of the world have likely been using it for more than a century.) You cannot begin to answer that question without knowing not just how the Internet actually works, but also the legal reasoning behind the difference. It took more than 100 pages for some colleagues and I, three computer scientists and a former Federal prosecutor, to show how the line between content and metadata can be drawn in some cases (and that the Department of Justice’s manuals and some Federal judges got the line wrong), but that in other cases, there is no possible line¹

Newer technologies pose the same sorts of risks…(More)”.

Future design in the public policy process: giving a voice to future generations

Curated on May 22, 2025May 22, 2025 by Stefaan Verhulst

Paper by Marij Swinkels, Olivier de Vette & Victor Toom: “Long-term public issues face the intergenerational problem: current policy decisions place a disproportionate burden on future generations while primarily benefitting those in the present. The interests of present generations trump those of future generations, as the latter play no explicit part as stakeholders in policy making processes. How can the interests of future generations be voiced in the present? In this paper, we explore an innovative method to incorporate the interests of future generations in the process of policymaking: future design. First, we situate future design in the policy process and relate it to other intergenerational policymaking initiatives that aim to redeem the intergenerational problem. Second, we show how we applied future design and provide insights into three pilots that we organized on two long-term public issues in the Netherlands: housing shortages and water management. We conclude that future design can effectively contribute to representing the interests of future generations, but that adoption of future design in different contexts also requires adaptation of the method. The findings increase our understanding of the value of future design as an innovative policymaking practice to strengthen intergenerational policymaking. As such, it provides policymakers with insights into how to use this method…(More)”.

Why Generative AI Isn’t Transforming Government (Yet) — and What We Can Do About It

Curated on May 22, 2025May 22, 2025 by Stefaan Verhulst

Article by Tiago C. Peixoto: “A few weeks ago, I reached out to a handful of seasoned digital services practitioners, NGOs, and philanthropies with a simple question: Where are the compelling generative AI (GenAI) use cases in public-sector workflows? I wasn’t looking for better search or smarter chatbots. I wanted examples of automation of real public workflows – something genuinely interesting and working. The responses, though numerous, were underwhelming.

That question has gained importance amid a growing number of reports forecasting AI’s transformative impact on government. The Alan Turing Institute, for instance, published a rigorous study estimating the potential of AI to help automate over 140 million government transactions in the UK. The Tony Blair Institute also weighed in, suggesting that a substantive portion of public-sector work could be automated. While the report helped bring welcome attention to the issue, its use of GPT-4 to assess task automatability has sparked a healthy discussion about how best to evaluate feasibility. Like other studies in this area, both reports highlight potential – but stop short of demonstrating real service automation.

Without testing technologies in real service environments – where workflows, incentives, and institutional constraints shape outcomes – and grounding each pilot in clear efficiency or well-being metrics, estimates risk becoming abstractions that underestimate feasibility.

This pattern aligns with what Arvind Narayanan and Sayash Kapoor argue in “AI as Normal Technology:” the impact of AI is realized only when methods translate into applications and diffuse through real-world systems. My own review, admittedly non-representative, confirms their call for more empirical work on the innovation-diffusion lag.

In the public sector, the gap between capability and impact is not only wide but also structural…(More)”

Urban Development Needs Systems Thinking

Curated on May 22, 2025May 22, 2025 by Stefaan Verhulst

Article by Yaera Chung: “More than three decades after the collapse of the Soviet Union, cities in Eastern Europe and Central Asia (EECA) continue to grapple with economic stagnation, aging infrastructure, and environmental degradation while also facing new pressures from climate change and regional conflicts. In this context, traditional city planning, which tackles problems in isolation, is struggling to keep up. Urban strategies often rely on siloed, one-off interventions that fail to reflect the complexity of social challenges or adapt to shifting conditions. As a result, efforts are frequently fragmented, overlook root causes, and miss opportunities for long-term, cross-sector collaboration.

Instead of addressing one issue at a time, cities need to develop a set of coordinated, interlinked solutions that tackle multiple urban challenges simultaneously and align efforts across sectors. As part of a broader strategy to address environmental, economic, and social goals at once, for example, cities might advance a range of initiatives, such as transforming biowaste into resources, redesigning streets to reduce air pollution, and creating local green jobs. These kinds of “portfolio” approaches are leading to lasting and systems-level change.

Since 2021, the United Nations Development Programme (UNDP) has been collaborating with 15 cities across EECA to solve problems in ways that embrace complexity and interconnectedness. Selected through open calls under two UNDP initiatives, Mayors for Economic Growth and the City Experiment Fund, these cities demonstrated a strong interest in tackling systemic issues. Their proposals highlighted the problems they face, their capacity for innovation, and local initiatives and partnerships.

Their ongoing journeys have surfaced four lessons that can help other cities move beyond conventional planning pitfalls, and adopt a more responsive, inclusive, and sustainable approach to urban development…(More)”.

Can We Trust Social Science Yet?

Curated on May 21, 2025May 21, 2025 by Stefaan Verhulst

Essay by Ryan Briggs: “Everyone likes the idea of evidence-based policy, but it’s hard to realize it when our most reputable social science journals are still publishing poor quality research.

Ideally, policy and program design is a straightforward process: a decision-maker faces a problem, turns to peer-reviewed literature, and selects interventions shown to work. In reality, that’s rarely how things unfold. The popularity of “evidence-based medicine” and other “evidence-based” topics highlights our desire for empirical approaches — but would the world actually improve if those in power consistently took social scienceevidence seriously? It brings me no joy to tell you that, at present, I think the answer is usually “no.”

Given the current state of evidence production in the social sciences, I believe that many — perhaps most — attempts to use social scientific evidence to inform policy will not lead to better outcomes. This is not because of politics or the challenges of scaling small programs. The problem is more immediate. Much of social science research is of poor quality, and sorting the trustworthy work from bad work is difficult, costly, and time-consuming.

But it is necessary. If you were to randomly select an empirical paper published in the past decade — including any studies from the top journals in political science or economics — there is a high chance that its findings may be inaccurate. And not just off by a little: possibly two times as large, or even incorrectly signed. As an academic, this bothers me. I think it should bother you, too. So let me explain why this happens…(More)”.

Simulating Human Behavior with AI Agents

Curated on May 21, 2025May 21, 2025 by Stefaan Verhulst

Brief by The Stanford Institute for Human-Centered AI (HAI): “…we introduce an AI agent architecture that simulates more than 1,000 real people. The agent architecture—built by combining the transcripts of two-hour, qualitative interviews with a large language model (LLM) and scored against social science benchmarks—successfully replicated real individuals’ responses to survey questions 85% as accurately as participants replicate their own answers across surveys staggered two weeks apart. The generative agents performed comparably in predicting people’s personality traits and experiment outcomes and were less biased than previously used simulation tools.

This architecture underscores the benefits of using generative agents as a research tool to glean new insights into real-world individual behavior. However, researchers and policymakers must also mitigate the risks of using generative agents in such contexts, including harms related to over-reliance on agents, privacy, and reputation…(More)”.

When data disappear: public health pays as US policy strays

Curated on May 21, 2025May 22, 2025 by Stefaan Verhulst

Paper by Thomas McAndrew, Andrew A Lover, Garrik Hoyt, and Maimuna S Majumder: “Presidential actions on Jan 20, 2025, by President Donald Trump, including executive orders, have delayed access to or led to the removal of crucial public health data sources in the USA. The continuous collection and maintenance of health data support public health, safety, and security associated with diseases such as seasonal influenza. To show how public health data surveillance enhances public health practice, we analysed data from seven US Government-maintained sources associated with seasonal influenza. We fit two models that forecast the number of national incident influenza hospitalisations in the USA: (1) a data-rich model incorporating data from all seven Government data sources; and (2) a data-poor model built using a single Government hospitalisation data source, representing the minimal required information to produce a forecast of influenza hospitalisations. The data-rich model generated reliable forecasts useful for public health decision making, whereas the predictions using the data-poor model were highly uncertain, rendering them impractical. Thus, health data can serve as a transparent and standardised foundation to improve domestic and global health. Therefore, a plan should be developed to safeguard public health data as a public good…(More)”.