

Framework by Josh Martin: “The first 90 days as a Chief Data Officer can make or break your tenure. You’re walking into an organization with high expectations, complex political dynamics, legacy technical debt, and competing priorities. Everyone wants quick wins, but sustainable change takes time. I learned this the hard way when I became Indiana’s Chief Data Officer in 2020—right as COVID-19 hit. Within weeks, I was leading the state’s data response while simultaneously building an agency from scratch. The framework below is what I wish I’d had on day one. This isn’t theory. It’s a battle-tested playbook from 13 years in state government, leading a 50-person data agency, navigating crises, and building enterprise data governance across 120+ agencies…(More)”.

The First 90 Days as a Chief Data Officer

Article by Adam Milward: “According to a recent Request for Information published in the Federal Register, ICE is seeking details from U.S. companies about “commercial Big Data and Ad Tech” products that could directly support investigative work.

As WIRED has reported, this appears to be the first time ICE has explicitly referenced ad tech in such a filing — signalling interest in repurposing technologies originally built for advertising, such as location and device data, for law-enforcement and surveillance purposes.

ICE has framed the request as exploratory and planning-oriented, asserting a commitment to civil liberties and privacy. However, this is not happening in isolation. ICE has previously purchased and used commercial data products — including mobile location data and analytics platforms — from vendors such as Palantir, Penlink (Webloc), and Venntel.

What are the implications for commercial organisations?

This kind of move by ICE throws a spotlight on the moral responsibilities of data-heavy companies, even when what they’re doing is technically legal.

I strongly believe in data federation and meaningful data sharing between public and private sectors. But we must be honest with ourselves: data sharing is not always an unqualified good.

If you’re sharing data or data tools with ICE, it seems reasonable to suggest you’re contributing to their output – and at the moment that is certainly not something I, or MetadataWorks as a company, would be comfortable with.

For now, most of these private companies are not legally forced to sell or share data with ICE.

In essence:

  • For the private sector, choosing to sell or share data or data tools is an ethical as well as a financial decision
  • Choosing not to sell is also a statement which could have real commercial implications…(More)”.
Enabling ICE: The Moral Obligations of Data Sharing

Article by Meghan Maury: “The Privacy Act of 1974 was designed to give people at least some control over how the federal government uses and shares their personal data. Under the law, agencies must notify the public when they plan to use personal information in new ways – including when they intend to share it with another agency – and give the public an opportunity to weigh in.

At dataindex.us, we track these data-sharing notices on our Take Action page. Recently, a pattern has emerged that you might miss if you’re only looking at one notice at a time.

Since around July of last year, the number and pace of data-sharing agreements between federal agencies and the Department of the Treasury have steadily increased. Most are framed as efforts to reduce “waste, fraud, and abuse” in government programs…

It might be. Cutting waste and fraud could mean taxpayer dollars are used more efficiently, programs run more smoothly, and services improve for the people who rely on them.

I’ve personally benefited from this kind of data sharing. When the Department of Education began pulling tax information directly from the IRS, I no longer had to re-enter everything for my financial aid forms. The process became faster, simpler, and far less error-prone…

The danger comes when automated data matching is used to decide who gets help (and who doesn’t!) without adequate safeguards. When errors happen, the consequences can be devastating.

Imagine a woman named Olivia Johnson. She has a spouse and three children and earns about $40,000 a year. Based on her income and family size, she qualifies for SNAP and other assistance that helps keep food on the table.

Right down the road lives another Olivia Johnson. She earns about $110,000 a year, has a spouse and one child, and doesn’t qualify for any benefits.

When SNAP runs Olivia’s application through a new data-matching system, it accidentally links her to the higher-earning Olivia. Her application is flagged as “fraud,” denied, and she’s barred from reapplying for a year.

This is a fictional example, but false matches like this are not rare. In many settings, a data error just means a messy spreadsheet or a bad statistic. In public benefit programs, it can mean a family goes hungry…(More)”
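
To make the failure mode concrete, here is a minimal sketch (entirely hypothetical records and a deliberately naive match rule, not any real SNAP system) of how linking records on name and location alone produces exactly this kind of false positive, and what a basic safeguard looks like:

```python
# Hypothetical illustration of a naive automated match; all records fabricated.
applicant = {"name": "Olivia Johnson", "zip": "46201"}

# Income records held by another agency (also fabricated).
income_records = [
    {"name": "Olivia Johnson", "zip": "46201", "dob": "1988-03-02", "income": 40_000},
    {"name": "Olivia Johnson", "zip": "46201", "dob": "1991-07-15", "income": 110_000},
]

# A rule keyed only on name and ZIP cannot tell the two Olivias apart.
matches = [r for r in income_records
           if r["name"] == applicant["name"] and r["zip"] == applicant["zip"]]

if len(matches) == 1:
    print(f"matched income: ${matches[0]['income']:,}")
else:
    # Safeguard: ambiguity should trigger human review and a chance to
    # contest, not an automatic fraud flag and a year-long bar on reapplying.
    print(f"{len(matches)} candidate records; route to manual review")
```

Requiring a second stable identifier (such as date of birth) and treating ambiguous matches as review cases rather than denials is the kind of safeguard the excerpt has in mind.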

The Rise of the Data Sharing Agreement

The GovLab: “…we are launching the Observatory of Public Sector AI, a research initiative of InnovateUS and a project of The Governance Lab. With data from more than 150,000 public servants, the Observatory represents one of the most comprehensive empirical efforts to date to understand awareness, attitudes, and adoption of AI as well as the impact of AI on work and workers.

Our goal is not simply to document learning, but to translate these insights into a clearer understanding of which investments in upskilling lead to better services, more effective policies, and stronger government capacity.

Our core hypothesis is straightforward: the right investments in public sector human capital can produce measurable improvements in government capability and performance, and ultimately better outcomes for residents. Skill-building is not peripheral to how the government works. It is central to creating institutions that are more effective, more responsive, and better equipped to deliver public value.

We are currently cleaning, analyzing, and expanding this dataset and will publish the Observatory’s first research report later this spring.

The Research Agenda

The Observatory is organized around a set of interconnected research questions that trace the full pathway from learning to impact.

We begin with baseline capacity, mapping where public servants start across core AI competencies, identifying where skill gaps are largest, and distinguishing individual limitations from structural constraints such as unclear policies or restricted access to tools.

We then examine task-level use, documenting what public servants are actually doing with AI. 

Our data also surface organizational obstacles that shape adoption far more than skill alone. Across agencies, respondents cite inconsistent guidance, uncertainty about permissions, and limited access as primary barriers. 

Through matched pre- and post-training assessments, we measure gains in technical proficiency, confidence, and ethical reasoning. We plan to track persistence through three- to six-month follow-ups to assess whether skills endure, reshape workflows, and diffuse across teams.
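
As an illustration of the paired design this implies, here is a minimal sketch of a matched pre-/post analysis; the scores, IDs, and field names are invented for the example and are not the Observatory's actual schema or pipeline:

```python
# Illustrative only: a standard matched (paired) analysis of training gains.
# Data and field names are invented, not the Observatory's schema.
import pandas as pd
from scipy.stats import ttest_rel

# One row per participant, pre and post scores joined on a stable ID.
scores = pd.DataFrame({
    "respondent_id": [101, 102, 103, 104, 105],
    "pre":  [52, 61, 48, 70, 55],   # e.g., AI-proficiency score, 0-100
    "post": [68, 72, 59, 74, 66],   # same instrument after training
})

scores["gain"] = scores["post"] - scores["pre"]

# Paired t-test: does the same person improve, on average?
t_stat, p_value = ttest_rel(scores["post"], scores["pre"])
print(f"mean gain: {scores['gain'].mean():.1f} points "
      f"(t = {t_stat:.2f}, p = {p_value:.4f})")
```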

We analyze how training shifts confidence and perceived value, both of which are essential precursors to behavior change. We collect indicators of effectiveness through self-reported workflow improvements that can later be paired with administrative performance data.

Finally, we examine variation across roles, agencies, and geographies; how workers exercise judgment when evaluating accuracy, bias, and reliability in AI outputs; and how different training modalities compare in producing durable learning outcomes…(More)”

Launching the Observatory of Public Sector AI: An Invitation to Build the Evidence Base Together

Article by David Oks: “Here’s the story of a remarkable scandal from a few years ago.

In the South Pacific, just north of Australia, there is a small, impoverished, and remote country called Papua New Guinea. It’s a country that I’ve always found absolutely fascinating. If there’s any outpost of true remoteness in the world, I think it’s either in the outer mountains of Afghanistan, in the deepest jungles of central Africa, or in the highlands of Papua New Guinea. (PNG, we call it.) Here’s my favorite fact: Papua New Guinea, with about 0.1 percent of the world’s population, hosts more than 10 percent of the world’s languages. Two villages, separated perhaps only by a few miles, will speak languages that are not mutually intelligible. And if you go into rural PNG, far into rural PNG, you’ll find yourself in places that time forgot.

But here’s a question about Papua New Guinea: how many people live there?

The answer should be pretty simple. National governments are supposed to provide annual estimates for their populations. And the PNG government does just that. In 2022, it said that there were 9.4 million people in Papua New Guinea. So 9.4 million people was the official number.

But how did the PNG government reach that number?

The PNG government conducts a census about every ten years. When the PNG government provided its 2022 estimate, the previous census had been done in 2011. But that census was a disaster, and the PNG government didn’t consider its own findings credible. So the PNG government took the 2000 census, which found that the country had 5.5 million people, and worked off of that one. So the 2022 population estimate was an extrapolation from the 2000 census, and the number that the PNG government arrived at was 9.4 million.
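
As a back-of-envelope reconstruction (the arithmetic implied by the excerpt's figures, not the PNG government's published method), the 2000 count and the 2022 estimate pin down the growth assumption behind that extrapolation:

```python
# Back-of-envelope check of the extrapolation described above (illustrative;
# not the PNG government's published methodology).
census_2000 = 5.5e6      # 2000 census count
estimate_2022 = 9.4e6    # official 2022 estimate
years = 2022 - 2000

# Constant annual growth rate that turns 5.5M in 2000 into 9.4M in 2022.
implied_rate = (estimate_2022 / census_2000) ** (1 / years) - 1
print(f"implied annual growth: {implied_rate:.2%}")  # roughly 2.5% per year
```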

But this, even the PNG government would admit, was a hazy guess.

About 80 percent of people in Papua New Guinea live in the countryside. And this is not a countryside of flat plains and paved roads: PNG is a country of mountain highlands and remote islands. Many places, probably most places, don’t have roads leading to them; and the roads that do exist are almost never paved. People speak different languages and have little trust in the central government, which simply isn’t a force in most of the country. So traveling across PNG is extraordinarily treacherous. It’s not a country where you can send people to survey the countryside with much ease. And so the PNG government really had no idea how many people lived in the country.

Late in 2022, word leaked of a report that the UN had commissioned. The report found that PNG’s population was not 9.4 million people, as the government maintained, but closer to 17 million people—roughly double the official number. Researchers had used satellite imagery and household surveys to find that the population in rural areas had been dramatically undercounted.

This was a huge embarrassment for the PNG government. It suggested, first of all, that they were completely incompetent and had no idea what was going on in the country that they claimed to govern. And it also meant that all the economic statistics about PNG—which presented a fairly happy picture—were entirely false. Papua New Guinea had been ranked as a “lower-middle income” country, along with India and Egypt; but if the report was correct then it was simply a “lower-income” country, like Afghanistan or Mali. Any economic progress that the government could have cited was instantly wiped away.

But it wasn’t as though the government could point to census figures of its own. So the country’s prime minister had to admit that he didn’t know what the population was: he didn’t know, he said, whether the population is “17 million, or 13 million, or 10 million.” It basically didn’t matter, he said, because no matter what the population was, “I cannot adequately educate, provide health cover, build infrastructures and create the enabling law and order environment” for the country’s people to succeed…(More)”.

A lot of population numbers are fake

Article by Phillip Olla: “For the past few years, artificial intelligence has felt almost miraculously accessible. Nonprofits, schools, public agencies, and social enterprises have been able to use advanced AI tools at little or no cost. Grant proposals, impact evaluations, program curricula, community outreach campaigns, and policy briefs are now routinely “co-written” with AI. This accessibility has been widely described as the “democratization” of AI. But it rests on a fragile foundation.

The reality is that the current era of “free” or heavily subsidized AI is a temporary phase, not a stable feature of the technology. As AI shifts from experimental tool to core infrastructure, its underlying economics, such as energy, hardware, privacy, and market power, are beginning to assert themselves. That will have serious consequences for equity, public interest work, and the organizations that serve communities most affected by social and economic inequality.

The question is no longer whether AI will become a paid, utility-like service. It is whether social sector institutions will help design that future or simply be forced to adapt to it on unfavorable terms…(More)”.

The Low-Cost AI Illusion

Paper by Yuval Rymon: “As artificial intelligence becomes embedded in democratic governance, a fundamental question emerges: how does AI transform the role of political representatives? This review analyzes AI’s impact across two channels: input representation (aggregating citizen preferences) and output representation (implementing policy decisions). It employs five democratic criteria to evaluate impacts, and examines the case studies of Taiwan’s vTaiwan platform and Austria’s AMS algorithmic profiling system. The analysis reveals AI transforms representatives’ roles along both channels: from interpreters of obscure public will to facilitators who reconcile clearly expressed preferences with practical constraints (input side), and from direct decision-makers to architects of algorithmic decision-making (ADM) systems (output side). Six institutional conditions determining whether AI enhances or undermines representation are derived: explicit democratic authorization of objectives, transparency extending to the system design stage, accountability mechanisms enabling challenge of system premises by operators, platform independence with institutional integration, active reduction of participation barriers, and clear authority frameworks preventing selective implementation of citizen consensus…(More)”.

Of the people, by the algorithm: how AI transforms the role of democratic representatives?

Article by Ruchika Joshi and Miranda Bogen: “The ability to remember you and your preferences is rapidly becoming a big selling point for AI chatbots and agents. 

Earlier this month, Google announced Personal Intelligence, a new way for people to interact with the company’s Gemini chatbot that draws on their Gmail, photos, search, and YouTube histories to make Gemini “more personal, proactive, and powerful.” It echoes similar moves by OpenAI, Anthropic, and Meta to add new ways for their AI products to remember and draw from people’s personal details and preferences. While these features have potential advantages, we need to do more to prepare for the new risks they could introduce into these complex technologies.

Personalized, interactive AI systems are built to act on our behalf, maintain context across conversations, and improve our ability to carry out all sorts of tasks, from booking travel to filing taxes. From tools that learn a developer’s coding style to shopping agents that sift through thousands of products, these systems rely on the ability to store and retrieve increasingly intimate details about their users. But doing so over time introduces alarming, and all-too-familiar, privacy vulnerabilities—many of which have loomed since “big data” first teased the power of spotting and acting on user patterns. Worse, AI agents now appear poised to plow through whatever safeguards had been adopted to avoid those vulnerabilities.

Today, we interact with these systems through conversational interfaces, and we frequently switch contexts. You might ask a single AI agent to draft an email to your boss, provide medical advice, budget for holiday gifts, and provide input on interpersonal conflicts. Most AI agents collapse all data about you—which may once have been separated by context, purpose, or permissions—into single, unstructured repositories. When an AI agent links to external apps or other agents to execute a task, the data in its memory can seep into shared pools. This technical reality creates the potential for unprecedented privacy breaches that expose not only isolated data points, but the entire mosaic of people’s lives…(More)”.
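
One way to picture the design distinction being drawn here is a memory store partitioned by context, in contrast to the single unstructured repository the authors describe. The class and API below are hypothetical, a sketch of the idea rather than any vendor's implementation:

```python
# Hypothetical sketch: context-scoped agent memory versus a flat pool.
from collections import defaultdict

class ContextScopedMemory:
    """Store each memory under an explicit context and release memories
    only to tasks operating in that same context."""

    def __init__(self) -> None:
        self._store: dict[str, list[str]] = defaultdict(list)

    def remember(self, context: str, fact: str) -> None:
        self._store[context].append(fact)

    def recall(self, context: str) -> list[str]:
        # A flat repository would return everything here, letting medical
        # details seep into, say, a shopping agent's shared data pool.
        return list(self._store[context])

memory = ContextScopedMemory()
memory.remember("health", "asked about migraine medication")
memory.remember("work", "drafting an email to their boss")

# A travel-booking task sees neither the health nor the work context.
print(memory.recall("travel"))  # []
```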

What AI “remembers” about you is privacy’s next frontier

Article by Sarah Wray: “The UK Government Digital Service (GDS) has published new guidelines to help public sector organisations prepare their datasets for use with artificial intelligence. Alongside a four-pillar framework, the guidance includes an AI-ready data action plan and a self-assessment checklist.

The document states: “The United Kingdom is at a critical inflection point in its adoption of artificial intelligence across sectors. While advances in machine learning, generative AI capabilities, and agentic AI capabilities continue at pace, the effectiveness, safety, and legitimacy of AI adoption remain fundamentally constrained by the quality, structure, and governance of underlying data.”

The guidelines, which were shaped by input from public sector bodies, departments and expert organisations, set out four pillars of AI-ready datasets to address these issues: technical optimisation; data and metadata quality; organisational and infrastructure context; and legal, security and ethical compliance.

The document states: “AI readiness is inherently socio-technical. Infrastructure modernisation, metadata fitness, and unstructured data pipelines are essential, but insufficient without clear accountability, sustained skills, and explicit legal and ethical decisioning at dataset level.”…The Department for Science, Innovation and Technology (DSIT) has also published a progress update on the National Data Library (NDL).

The forthcoming NDL is envisaged as a tool to make it “easier to find and reuse data across public sector organisations”. Its goal is to support “better prevention, intervention and detection, [and open] up data to industry, the voluntary sector, start-ups and academics to accelerate AI-driven innovation and boost growth”.

The creation of the NDL is backed by over £100m (US$138m) as part of a £1.9bn (US$2.6bn) total investment allocated to DSIT for cross-cutting digital priorities…(More)”.

New guidelines aim to make UK government datasets AI-ready

Paper by Bruno Botas et al: “The increasing use of social media, particularly X (formerly Twitter), has enabled citizens to openly share their views, making it a valuable arena for examining public perceptions of immigration and its intersections with racial discrimination and xenophobia. This study analyzes Spanish digital debates from January 2020 to January 2023 through a mixed methodology that combines text pre-processing, semantic filtering of keywords, topic modeling, and sentiment analysis. A five-topic solution obtained through Latent Dirichlet Allocation (LDA) captured the main dimensions of the discourse: (1) economic and political debates on immigration, (2) international migration and refugee contexts, (3) racism and social discrimination, (4) insults, stereotypes, and xenophobic framings, and (5) small boat arrivals and maritime management. Sentiment analysis using a transformer-based model (roBERTuito) revealed a strong predominance of negativity across all topics, with sharp spikes linked to major migration crises, humanitarian emergencies, and highly mediatized cultural events. Qualitative readings of representative posts further showed that negativity was often articulated through invasion metaphors, securitarian framings, satire, and ridicule, indicating that hostility was not merely reactive but embedded in broader economic, political, and cultural registers. These findings demonstrate that discriminatory discourse in Spain is event-driven, becoming particularly salient during crises and symbolic moments, and underline the persistent role of social media in amplifying racialized exclusion and partisan polarization…(More)”.
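
For a feel of the two-stage pipeline the abstract describes, here is a compressed sketch: scikit-learn provides LDA, and the pysentimiento library wraps roBERTuito for Spanish sentiment. The corpus, preprocessing, and parameters below are placeholder assumptions, not the authors' exact setup:

```python
# Sketch of the paper's pipeline on placeholder posts: five-topic LDA, then
# roBERTuito sentiment via pysentimiento. Corpus and parameters are
# illustrative assumptions, not the study's actual configuration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from pysentimiento import create_analyzer

posts = [
    "La inmigración afecta al mercado laboral y la economía",
    "Llegada de pateras a las costas de Canarias esta semana",
    # ... cleaned, semantically filtered posts would go here
]

# Stage 1: a five-topic LDA solution, as in the paper.
vectorizer = CountVectorizer()
doc_term = vectorizer.fit_transform(posts)
lda = LatentDirichletAllocation(n_components=5, random_state=0).fit(doc_term)
topic_of = lda.transform(doc_term).argmax(axis=1)

# Stage 2: transformer-based sentiment (pysentimiento's Spanish sentiment
# model is built on roBERTuito).
analyzer = create_analyzer(task="sentiment", lang="es")
for post, topic in zip(posts, topic_of):
    sentiment = analyzer.predict(post).output  # "POS", "NEU", or "NEG"
    print(f"topic {topic} | {sentiment} | {post}")
```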

Public perception on immigration and racial discrimination in Spain: a social media analysis using X data
