A New Paradigm for Fueling AI for the Public Good


Article by Kevin T. Frazier: “Imagine receiving this email in the near future: “Thank you for sharing data with the American Data Collective on May 22, 2025. After first sharing your workout data with SprintAI, a local startup focused on designing shoes for differently abled athletes, your data donation was also sent to an artificial intelligence research cluster hosted by a regional university. Your donation is on its way to accelerate artificial intelligence innovation and support researchers and innovators addressing pressing public needs!”

That is exactly the sort of message you could expect to receive if we made donations of personal data akin to blood donations—a pro-social behavior that may not immediately serve a donor’s individual needs but may nevertheless benefit the whole of the community. This vision of a future where data flow toward the public good is not science fiction—it is a tangible possibility if we address a critical bottleneck faced by innovators today.

Creating the data equivalent of blood banks may not seem like a pressing need or something that people should voluntarily contribute to, given widespread concerns about a few large artificial intelligence (AI) companies using data for profit-driven and, arguably, socially harmful ends. This narrow conception of the AI ecosystem fails to consider the hundreds of AI research initiatives and startups that have a desperate need for high-quality data. I was fortunate enough to meet leaders of those nascent AI efforts at Meta’s Open Source AI Summit in Austin, Texas. For example, I met with Matt Schwartz, who leads a startup that leans on AI to glean more diagnostic information from colonoscopies. I also connected with Edward Chang, a professor of neurological surgery at the University of California, San Francisco Weill Institute for Neurosciences, who relies on AI tools to discover new information on how and why our brains work. I also got to know Corin Wagen, whose startup is helping companies “find better molecules faster.” This is a small sample of the people leveraging AI for objectively good outcomes. They need your help. More specifically, they need your data.

A tragic irony shapes our current data infrastructure. Most of us share mountains of data with massive and profitable private parties—smartwatch companies, diet apps, game developers, and social media companies. Yet, AI labs, academic researchers, and public interest organizations best positioned to leverage our data for the common good are often those facing the most formidable barriers to acquiring the necessary quantity, quality, and diversity of data. Unlike OpenAI, they are not going to use bots to scrape the internet for data. Unlike Google and Meta, they cannot rely on their own social media platforms and search engines to act as perpetual data generators. And, unlike Anthropic, they lack the funds to license data from media outlets. So, while commercial entities amass vast datasets, frequently as a byproduct of consumer services and proprietary data acquisition strategies, mission-driven AI initiatives dedicated to public problems find themselves in a state of chronic data scarcity. This is not merely a hurdle—it is a systemic bottleneck choking off innovation where society needs it most, delaying or even preventing the development of AI tools that could significantly improve lives.

Individuals are, quite rightly, increasingly hesitant to share their personal information, with concerns about privacy, security, and potential misuse being both rampant and frequently justified by past breaches and opaque practices. Yet, in a striking contradiction, troves of deeply personal data are continuously siphoned by app developers, by tech platforms, and, often opaquely, by an extensive network of data brokers. This practice often occurs with minimal transparency and without informed consent concerning the full lifecycle and downstream uses of that data. This lack of transparency extends to how algorithms trained on this data make decisions that can impact individuals’ lives—from loan applications to job prospects—often without clear avenues for recourse or understanding, potentially perpetuating existing societal biases embedded in historical data…(More)”.

Sentinel Cities for Public Health


Article by Jesse Rothman, Paromita Hore & Andrew McCartor: “In 2017, a New York City health inspector visited the home of a 5-year-old child with an elevated blood lead level. With no sign of lead paint—the usual suspect in such cases—the inspector discovered dangerous levels of lead in a bright yellow container of “Georgian Saffron,” a spice obtained in the family’s home country. It was not the first case associated with the use of lead-containing Georgian spices—the NYC Health Department shared their findings with authorities in Georgia, which catalyzed a survey of children’s blood lead levels in Georgia, and led to increased regulatory enforcement and education. Significant declines in spice lead levels in the country have had ripple effects in NYC also: not only a drop in spice samples from Georgia containing detectable lead but also a significant reduction in blood lead levels among NYC children of Georgian ancestry.

This wasn’t a lucky break—it was the result of a systematic approach to transform local detection into global impact. Findings from local NYC surveillance are, of course, not limited to Georgian spices. Surveillance activities have identified a variety of lead-containing consumer products from around the world, from cosmetics and medicines to ceramics and other goods. Routinely surveying local stores for lead-containing products has resulted in the removal of over 30,000 hazardous consumer products from NYC store shelves since 2010.

How can we replicate and scale up NYC’s model to address the global crisis of lead poisoning?…(More)”.

Energy and AI Observatory


IEA’s Energy and AI Observatory: “… provides up-to-date data and analysis on the growing links between the energy sector and artificial intelligence (AI). The new and fast-moving field of AI requires a new approach to gathering data and information, and the Observatory aims to provide regularly updated data and a comprehensive view of the implications of AI on energy demand (energy for AI) and of AI applications for efficiency, innovation, resilience and competitiveness in the energy sector (AI for energy). This first-of-a-kind platform is developed and maintained by the IEA, with valuable contributions of data and insights from the IEA’s energy industry and tech sector partners, and complements the IEA’s Special Report on Energy and AI…(More)”.

AI alone cannot solve the productivity puzzle


Article by Carl Benedikt Frey: “Each time fears of AI-driven job losses flare up, optimists reassure us that artificial intelligence is a productivity tool that will help both workers and the economy. Microsoft chief Satya Nadella thinks autonomous AI agents will allow users to name their goal while the software plans, executes and learns across every system. A dream tool — if efficiency alone was enough to solve the productivity problem.

History says it is not. Over the past half-century we have filled offices and pockets with ever-faster computers, yet labour-productivity growth in advanced economies has slowed from roughly 2 per cent a year in the 1990s to about 0.8 per cent in the past decade. Even China’s once-soaring output per worker has stalled.

The shotgun marriage of the computer and the internet promised more than enhanced office efficiency — it envisioned a golden age of discovery. By placing the world’s knowledge in front of everyone and linking global talent, breakthroughs should have multiplied. Yet research productivity has sagged. The average scientist now produces fewer breakthrough ideas per dollar than their 1960s counterpart.

What went wrong? As economist Gary Becker once noted, parents face a quality-versus-quantity trade-off: the more children they have, the less they can invest in each child. The same might be said for innovation.

Large-scale studies of inventive output confirm the result: researchers juggling more projects are less likely to deliver breakthrough innovations. Over recent decades, scientific papers and patents have become increasingly incremental. History’s greats understood why. Isaac Newton kept a single problem “constantly before me . . . till the first dawnings open slowly, by little and little, into a full and clear light”. Steve Jobs concurred: “Innovation is saying no to a thousand things.”

Human ingenuity thrives where precedent is thin. Had the 19th century focused solely on better looms and ploughs, we would enjoy cheap cloth and abundant grain — but there would be no antibiotics, jet engines or rockets. Economic miracles stem from discovery, not repeating tasks at greater speed.

Large language models gravitate towards the statistical consensus. A model trained before Galileo would have parroted a geocentric universe; fed 19th-century texts it would have proved human flight impossible before the Wright brothers succeeded. A recent Nature review found that while LLMs lightened routine scientific chores, the decisive leaps of insight still belonged to humans. Even Demis Hassabis, whose team at Google DeepMind produced AlphaFold — a model that can predict the shape of a protein and is arguably AI’s most celebrated scientific feat so far — admits that achieving genuine artificial general intelligence systems that can match or surpass humans across the full spectrum of cognitive tasks may require “several more innovations”…(More)”.

Community-Aligned A.I. Benchmarks


White Paper by the Aspen Institute: “…When people develop machine learning models for AI products and services, they iterate to improve performance. 

What it means to “improve” a machine learning model depends on what you want the model to do, like correctly transcribe an audio sample or generate a reliable summary of a long document.

Machine learning benchmarks are similar to standardized tests that AI researchers and builders can score their work against. Benchmarks allow us to both see if different model tweaks improve the performance for the intended task and compare similar models against one another.

Some famous benchmarks in AI include ImageNet and the Stanford Question Answering Dataset (SQuAD).

Benchmarks are important, but their development and adoption have historically been somewhat arbitrary. The capabilities that benchmarks measure should reflect the priorities for what the public wants AI tools to be and do.

We can build positive AI futures, ones that emphasize what the public wants out of these emerging technologies. As such, it’s imperative that we build benchmarks worth striving for…(More)”.

The Loyalty Trap


Book by Jaime Lee Kucinskas: “…explores how civil servants navigated competing pressures and duties amid the chaos of the Trump administration, drawing on in-depth interviews with senior officials in the most contested agencies over the course of a tumultuous term. Jaime Lee Kucinskas argues that the professional culture and ethical obligations of the civil service stabilize the state in normal times but insufficiently prepare bureaucrats to cope with a president like Trump. Instead, federal employees became ensnared in intractable ethical traps, caught between their commitment to nonpartisan public service and the expectation of compliance with political directives. Kucinskas shares their quandaries, recounting attempts to preserve the integrity of government agencies, covert resistance, and a few bold acts of moral courage in the face of organizational decline and politicized leadership. A nuanced sociological account of the lessons of the Trump administration for democratic governance, The Loyalty Trap offers a timely and bracing portrait of the fragility of the American state…(More)”.

Manipulation: What It Is, Why It’s Bad, What to Do About It


Book by Cass Sunstein: “New technologies are offering companies, politicians, and others unprecedented opportunity to manipulate us. Sometimes we are given the illusion of power – of freedom – through choice, yet the game is rigged, pushing us in specific directions that lead to less wealth, worse health, and weaker democracy. In Manipulation, nudge theory pioneer and New York Times bestselling author Cass Sunstein offers a new definition of manipulation for the digital age, explains why it is wrong, and shows what we can do about it. He reveals how manipulation compromises freedom and personal agency, while threatening to reduce our well-being; he explains the difference between manipulation and unobjectionable forms of influence, including ‘nudges’; and he lifts the lid on online manipulation and manipulation by artificial intelligence, algorithms, and generative AI, as well as threats posed by deepfakes, social media, and ‘dark patterns,’ which can trick people into giving up time and money. Drawing on decades of groundbreaking research in behavioral science, this landmark book outlines steps we can take to counteract manipulation in our daily lives and offers guidance to protect consumers, investors, and workers…(More)”.

Participatory Approaches to Responsible Data Reuse and Establishing a Social License


Chapter by Stefaan Verhulst, Andrew J. Zahuranec & Adam Zable in Global Public Goods Communication (edited by Sónia Pedro Sebastião and Anne-Marie Cotton): “… examines innovative participatory processes for establishing a social license for reusing data as a global public good. While data reuse creates societal value, it can raise concerns and reinforce power imbalances when individuals and communities lack agency over how their data is reused. To address this, the chapter explores participatory approaches that go beyond traditional consent mechanisms. By engaging data subjects and stakeholders, these approaches aim to build trust and ensure data reuse benefits all parties involved.

The chapter presents case studies of participatory approaches to data reuse from various sectors. This includes The GovLab’s New York City “Data Assembly,” which engaged citizens to set conditions for reusing cell phone data during the COVID-19 response. These examples highlight both the potential and challenges of citizen engagement, such as the need to invest in data literacy and other resources to support meaningful public input. The chapter concludes by considering whether participatory processes for data reuse can foster digital self-determination…(More)”.

Facilitating the secondary use of health data for public interest purposes across borders


OECD Paper: “Recent technological developments create significant opportunities to process health data in the public interest. However, the growing fragmentation of frameworks applied to data has become a structural impediment to fully leveraging these opportunities. Public and private stakeholders suggest that three key areas should be analysed to support this outcome, namely: the convergence of governance frameworks applicable to health data use in the public interest across jurisdictions; the harmonisation of national procedures applicable to secondary health data use; and the public perceptions around the use of health data. This paper explores each of these three key areas and concludes with an overview of collective findings relating specifically to the convergence of legal bases for secondary data use…(More)”.

Protecting young digital citizens


Blog by Pascale Raulin-Serrier: “…As digital tools become more deeply embedded in children’s lives, many young users are unaware of the long-term consequences of sharing personal information online through apps, games, social media platforms and even educational tools. The large-scale collection of data related to their preferences, identity or lifestyle may be used for targeted advertising or profiling. This affects not only their immediate online experiences but can also have lasting consequences, including greater risks of discrimination and exclusion. These concerns underscore the urgent need for stronger safeguards, greater transparency and a child-centered approach to data governance.

CNIL’s initiatives to promote children’s privacy

In response to these challenges, the CNIL introduced eight recommendations in 2021 to provide practical guidance for children, parents and other stakeholders in the digital economy. These are built around several key pillars to promote and protect children’s privacy:

1. Providing specific safeguards

Children have distinct digital rights and must be able to exercise them fully. Under the European General Data Protection Regulation (GDPR), they benefit from special protections, including the right to be forgotten and, in some cases, the ability to consent to the processing of their data. In France, children can only register for social networks or online gaming platforms if they are over 15, or with parental consent if they are younger. CNIL helps hold platforms accountable by offering clear recommendations on how to present terms of service and collect consent in ways that are accessible and understandable to children.

2. Balancing autonomy and protection

The needs and capacities of a 6-year-old child differ greatly from those of a 16-year-old adolescent. It is essential to consider this diversity in online behaviour, maturity and the evolving ability to make informed decisions. The CNIL emphasizes the importance of offering children a digital environment that strikes a balance between protection and autonomy. It also advocates for digital citizenship education to empower young people with the tools they need to manage their privacy responsibly…(More)”. See also Responsible Data for Children.