Charting the AI for Good Landscape – A New Look


Article by Perry Hewitt and Jake Porway: “More than 50% of nonprofits report that their organizations use generative AI in day-to-day operations. We’ve also seen an explosion of AI tools and investments. 10% of all the AI companies that exist in the US were founded in 2022, and that number has likely grown in subsequent years. With investors funneling over $300B into AI and machine learning startups, it’s unlikely this trend will reverse any time soon.

Not surprisingly, the conversation about Artificial Intelligence (AI) is now everywhere, spanning from commercial uses such as virtual assistants and consumer AI to public goods, like AI-driven drug discovery and chatbots for education. The dizzying number of new AI programs and initiatives – over 5000 new tools listed in 2023 on AI directories like TheresAnAI alone – can make the AI landscape challenging to navigate in general, much less for social impact. Luckily, four years ago, we surveyed the Data and AI for Good landscape and mapped out distinct families of initiatives based on their core goals. Today, we are revisiting that landscape to help folks get a handle on the current state of AI for Good and to reflect on how the field has expanded, diversified, and matured…(More)”.

Smart Cities: Technologies and Policy Options to Enhance Services and Transparency


GAO Report: “Cities across the nation are using “smart city” technologies like traffic cameras and gunshot detectors to improve public services. In this technology assessment, we looked at their use in transportation and law enforcement.

Experts and city officials reported multiple benefits. For example, Houston uses cameras and Bluetooth sensors to measure traffic flow and adjust signal timing. Other cities use license plate readers to find stolen vehicles.

But the technologies can be costly and the benefits unclear. The data they collect may be sold, raising privacy and civil liberties concerns. We offer three policy options to address such challenges…(More)”.

Data Commons: The Missing Infrastructure for Public Interest Artificial Intelligence


Article by Stefaan Verhulst, Burton Davis and Andrew Schroeder: “Artificial intelligence is celebrated as the defining technology of our time. From ChatGPT to Copilot and beyond, generative AI systems are reshaping how we work, learn, and govern. But behind the headline-grabbing breakthroughs lies a fundamental problem: The data these systems depend on to produce useful results that serve the public interest is increasingly out of reach.

Without access to diverse, high-quality datasets, AI models risk reinforcing bias, deepening inequality, and returning less accurate, less reliable results. Yet, access to data remains fragmented, siloed, and increasingly enclosed. What was once open—government records, scientific research, public media—is now locked away by proprietary terms, outdated policies, or simple neglect. We are entering a data winter just as AI’s influence over public life is heating up.

This isn’t just a technical glitch. It’s a structural failure. What we urgently need is new infrastructure: data commons.

A data commons is a shared pool of data resources—responsibly governed, managed using participatory approaches, and made available for reuse in the public interest. Done correctly, commons can ensure that communities and other networks have a say in how their data is used, that public interest organizations can access the data they need, and that the benefits of AI can be applied to meet societal challenges.

Commons offer a practical response to the paradox of data scarcity amid abundance. By pooling datasets across organizations—governments, universities, libraries, and more—they match data supply with real-world demand, making it easier to build AI that responds to public needs.

We’re already seeing early signs of what this future might look like. Projects like Common Corpus, MLCommons, and Harvard’s Institutional Data Initiative show how diverse institutions can collaborate to make data both accessible and accountable. These initiatives emphasize open standards, participatory governance, and responsible reuse. They challenge the idea that data must be either locked up or left unprotected, offering a third way rooted in shared value and public purpose.

But the pace of progress isn’t matching the urgency of the moment. While policymakers debate AI regulation, they often ignore the infrastructure that makes public interest applications possible in the first place. Without better access to high-quality, responsibly governed data, AI for the common good will remain more aspiration than reality.

That’s why we’re launching The New Commons Challenge—a call to action for universities, libraries, civil society, and technologists to build data ecosystems that fuel public-interest AI…(More)”.

Real-time prices, real results: comparing crowdsourcing, AI, and traditional data collection


Article by Julius Adewopo, Bo Andree, Zacharey Carmichael, Steve Penson, Kamwoo Lee: “Timely, high-quality food price data is essential for shock-responsive decision-making. However, in many low- and middle-income countries, such data is often delayed, limited in geographic coverage, or unavailable due to operational constraints. Traditional price monitoring, which relies on structured surveys conducted by trained enumerators, is often constrained by challenges related to cost, frequency, and reach.

To help overcome these limitations, the World Bank launched the Real-Time Prices (RTP) data platform. This effort provides monthly price data using a machine learning framework. The models combine survey results with predictions derived from observations in nearby markets and related commodities. This approach helps fill gaps in local price data across a basket of goods, enabling real-time monitoring of inflation dynamics even when survey data is incomplete or irregular.

In parallel, new approaches—such as citizen-submitted (crowdsourced) data—are being explored to complement conventional data collection methods. These crowdsourced data were recently published in a Nature Scientific Data paper. While the adoption of these innovations is accelerating, maintaining trust requires rigorous validation.

A newly published study in PLOS compares these two emerging methods with the traditional, enumerator-led gold standard, providing new evidence that both crowdsourced and AI-imputed prices can serve as credible, timely alternatives to traditional ground-truth data collection—especially in contexts where conventional methods face limitations…(More)”.

These Startups Are Building Advanced AI Models Without Data Centers


Article by Will Knight: “Researchers have trained a new kind of large language model (LLM) using GPUs dotted across the world and fed private as well as public data—a move that suggests that the dominant way of building artificial intelligence could be disrupted.

Flower AI and Vana, two startups pursuing unconventional approaches to building AI, worked together to create the new model, called Collective-1.

Flower created techniques that allow training to be spread across hundreds of computers connected over the internet. The company’s technology is already used by some firms to train AI models without needing to pool compute resources or data. Vana provided sources of data including private messages from X, Reddit, and Telegram.

Collective-1 is small by modern standards, with 7 billion parameters—values that combine to give the model its abilities—compared to hundreds of billions for today’s most advanced models, such as those that power programs like ChatGPT, Claude, and Gemini.

Nic Lane, a computer scientist at the University of Cambridge and cofounder of Flower AI, says that the distributed approach promises to scale far beyond the size of Collective-1. Lane adds that Flower AI is partway through training a model with 30 billion parameters using conventional data, and plans to train another model with 100 billion parameters—close to the size offered by industry leaders—later this year. “It could really change the way everyone thinks about AI, so we’re chasing this pretty hard,” Lane says. He says the startup is also incorporating images and audio into training to create multimodal models.

Distributed model-building could also unsettle the power dynamics that have shaped the AI industry…(More)”

AI Action Plan Database


A project by the Institute for Progress: “In January 2025, President Trump tasked the Office of Science and Technology Policy with creating an AI Action Plan to promote American AI Leadership. The government requested input from the public, and received 10,068 submissions. The database below summarizes specific recommendations from these submissions. … We used AI to extract recommendations from each submission, and to tag them with relevant information. Click on a recommendation to learn more about it. See our analysis of common themes and ideas across these recommendations…(More)”.

Updating purpose limitation for AI: a normative approach from law and philosophy 


Paper by Rainer Mühlhoff and Hannah Ruschemeier: “The purpose limitation principle goes beyond the protection of individual data subjects: it aims to ensure transparency and fairness, with exceptions only for privileged purposes. However, in the current reality of powerful AI models, purpose limitation is often impossible to enforce and is thus structurally undermined. This paper addresses a critical regulatory gap in EU digital legislation: the risk of secondary use of trained models and anonymised training datasets. Anonymised training data, as well as AI models trained from this data, pose the threat of being freely reused in potentially harmful contexts such as insurance risk scoring and automated job applicant screening. We propose shifting the focus of purpose limitation from data processing to AI model regulation. This approach mandates that those training AI models define the intended purpose and restrict the use of the model solely to this stated purpose…(More)”.

Rebooting the global consensus: Norm entrepreneurship, data governance and the inalienability of digital bodies


Paper by Siddharth Peter de Souza and Linnet Taylor: “The establishment of norms among states is a common way of governing international actions. This article analyses the potential of norm-building for governing data and artificial intelligence technologies’ collective effects. Rather than focusing on state actors’ ability to establish and enforce norms, however, we identify a contrasting process taking place among civil society organisations in response to the international neoliberal consensus on the commodification of data. The norm we identify – ‘nothing about us without us’ – asserts civil society’s agency, and specifically the right of those represented in datasets to give or refuse permission through structures of democratic representation. We argue that this represents a form of norm-building that should be taken as seriously as that of states, and analyse how it is constructing the political power, relations, and resources to engage in governing technology at scale. We first outline how this counter-norming is anchored in data’s connections to bodies, land, community, and labour. We explore the history of formal international norm-making and the current norm-making work being done by civil society organisations internationally, and argue that these, although very different in their configurations and strategies, are comparable in scale and scope. Based on this, we make two assertions: first, that a norm-making lens is a useful way for both civil society and research to frame challenges to the primacy of market logics in law and governance, and second, that the conceptual exclusion of civil society actors as norm-makers is an obstacle to the recognition of counter-power in those spheres…(More)”.

Technical Tiers: A New Classification Framework for Global AI Workforce Analysis


Report by Siddhi Pal, Catherine Schneider and Ruggero Marino Lazzaroni: “… introduces a novel three-tiered classification system for global AI talent that addresses significant methodological limitations in existing workforce analyses by distinguishing between skill categories within the existing AI talent pool. By separating non-technical roles (Category 0), technical software development (Category 1), and advanced deep learning specialization (Category 2), our framework enables precise examination of AI workforce dynamics at a pivotal moment in global AI policy.

Through our analysis of a sample of 1.6 million individuals in the AI talent pool across 31 countries, we’ve uncovered clear patterns in technical talent distribution that significantly impact Europe’s AI ambitions. Asian nations hold an advantage in specialized AI expertise, with South Korea (27%), Israel (23%), and Japan (20%) maintaining the highest proportions of Category 2 talent. Within Europe, Poland and Germany stand out as leaders in specialized AI talent. This may be connected to their initiatives to attract tech companies and investments in elite research institutions, though further research is needed to confirm these relationships.

Our data also reveals a shifting landscape of global talent flows. Research shows that countries employing points-based immigration systems attract 1.5 times more high-skilled migrants than those using demand-led approaches. This finding takes on new significance in light of recent geopolitical developments affecting scientific research globally. As restrictive policies and funding cuts create uncertainty for researchers in the United States, one of the big destinations for European AI talent, the way nations position their regulatory environments, scientific freedoms, and research infrastructure will increasingly determine their ability to attract and retain specialized AI talent.

The gender analysis in our study illuminates another dimension of competitive advantage. Contrary to the overall AI talent pool, EU countries lead in female representation in highly technical roles (Category 2), occupying seven of the top ten global rankings. Finland, Czechia, and Italy have the highest proportion of female representation in Category 2 roles globally (39%, 31%, and 28%, respectively). This gender diversity represents not merely a social achievement but a potential strategic asset in AI innovation, particularly as global coalitions increasingly emphasize the importance of diverse perspectives in AI development…(More)”

Integrating Data Governance and Mental Health Equity: Insights from ‘Towards a Set of Universal Data Principles’


Article by Cindy Hansen: “This recent scholarly work, “Towards a Set of Universal Data Principles” by Steve MacFeely et al (2025), delves comprehensively into the expansive landscape of data management and governance. It is worth acknowledging the intricate processes through which humans collect, manage, and disseminate vast quantities of data. …To truly democratize digital mental healthcare, it’s crucial to empower individuals in their data journey. By focusing on Digital Self-Determination, people can participate in a transformative shift where control over personal data becomes a fundamental right, aligning with the proposed universal data principles. One can envision a world where mental health data, collected and used responsibly, contributes not only to personal well-being but also to the greater public good, echoing the need for data governance to serve society at large.

This concept of digital self-determination empowers individuals by ensuring they have the autonomy to decide who accesses their mental health data and how it’s utilized. Such empowerment is especially significant in the context of mental health, where data sensitivity is high, and privacy is paramount. Giving people the confidence to manage their data fosters trust and encourages them to engage more openly with digital health services, promoting a culture of trust which is a core element of the proposed data governance frameworks.

Holistic Research Canada’s Outcome Monitoring System honors this ethos, allowing individuals to control how their data is accessed, shared, and used while maintaining engagement with healthcare providers. With this system, people can actively participate in their mental health decisions, supported by data that offers transparency about their progress and prognoses, which is crucial in realizing the potential of data to serve both individual and broader societal interests.

Furthermore, this tool provides actionable insights into mental health journeys, promoting evidence-based practices, enhancing transparency, and ensuring that individuals’ rights are safeguarded throughout. These principles are vital to transforming individuals from passive subjects into active stewards of their data, consistent with the proposed principles of safeguarding data quality, integrity, and security…(More)”.