How data can transform government in Latin America and the Caribbean


Article by William Maloney, Daniel Rogger, and Christian Schuster: “Governments across Latin America and the Caribbean are grappling with deep governance challenges that threaten progress and stability, including the need to improve efficiency, accountability and transparency.

Amid these obstacles, however, the region possesses a powerful, often underutilized asset: the administrative data it collects as a part of its everyday operations.

When harnessed effectively using data analytics, this data has the potential to drive transformative change, unlock new opportunities for growth and help address some of the most pressing issues facing the region. It’s time to tap into this potential and use data to chart a path forward. To help governments make the most of the opportunities that this data presents, the World Bank has embarked on a decade-long project to synthesize the latest knowledge on how to measure and improve government performance. We have found that governments already have a lot of the data they need to dramatically improve public services while conserving scarce resources.

But it’s not enough to collect data. It must also be put to good use to improve decision making, design better public policy and strengthen public sector functioning. We call these tools and practices for repurposing government data government analytics…(More)”.

Launch: A Blueprint to Unlock New Data Commons for Artificial Intelligence (AI)


Blueprint by Hannah Chafetz, Andrew J. Zahuranec, and Stefaan Verhulst: “In today’s rapidly evolving AI landscape, it is critical to broaden access to diverse and high-quality data to ensure that AI applications can serve all communities equitably. Yet, we are on the brink of a potential “data winter,” where valuable data assets that could drive public good are increasingly locked away or inaccessible.

Data commons — collaboratively governed ecosystems that enable responsible sharing of diverse datasets across sectors — offer a promising solution. By pooling data under clear standards and shared governance, data commons can unlock the potential of AI for public benefit while ensuring that its development reflects the diversity of experiences and needs across society.

To accelerate the creation of data commons, the Open Data Policy Lab today releases “A Blueprint to Unlock New Data Commons for AI” — a guide on how to steward data to create data commons that enable public-interest AI use cases…the document is aimed at supporting libraries, universities, research centers, and other data holders (e.g. governments and nonprofits) through four modules:

  • Mapping the Demand and Supply: Understanding why AI systems need data, what data can be made available to train, adapt, or augment AI, and what a viable data commons prototype might look like that incorporates stakeholder needs and values;
  • Unlocking Participatory Governance: Co-designing key aspects of the data commons with key stakeholders and documenting these aspects within a formal agreement;
  • Building the Commons: Establishing the data commons from a practical perspective and ensuring that all stakeholders are incentivized to implement it; and
  • Assessing and Iterating: Evaluating how the commons is working and iterating as needed.

These modules are further supported by two supplementary taxonomies. The “Taxonomy of Data Types” provides a list of data types that can be valuable for public-interest generative AI use cases. The “Taxonomy of Use Cases” outlines public-interest generative AI applications that can be developed using a data commons approach, along with possible outcomes and stakeholders involved.
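
As a purely hypothetical illustration of how the two taxonomies might be paired when cataloguing datasets in a commons, the sketch below records a data type alongside the public-interest use cases it could support. All field names and values are assumptions for illustration, not taken from the Blueprint itself.

```python
# Hypothetical sketch of a data commons catalogue entry pairing the two taxonomies.
# Field names and values are illustrative assumptions, not drawn from the Blueprint.
from dataclasses import dataclass, field

@dataclass
class CommonsEntry:
    dataset_name: str
    data_type: str            # e.g. a category from a "Taxonomy of Data Types"-style list
    steward: str              # organization responsible under the shared governance agreement
    licence: str              # terms agreed through participatory governance
    use_cases: list[str] = field(default_factory=list)  # public-interest AI applications

entry = CommonsEntry(
    dataset_name="City transit ridership (anonymized)",
    data_type="administrative records",
    steward="Municipal open data office",
    licence="CC BY 4.0",
    use_cases=["service-planning assistant", "accessibility gap analysis"],
)
print(entry)
```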

A separate set of worksheets can be used to further guide organizations in deploying these tools…(More)”.

Vetted Researcher Data Access


Coimisiún na Meán: “Article 40 of the Digital Services Act (DSA) makes provision for researchers to access data from Very Large Online Platforms (VLOPs) or Very Large Online Search Engines (VLOSEs) for the purposes of studying systemic risk in the EU and assessing mitigation measures. There are two ways that researchers studying systemic risk in the EU can get access to data under Article 40 of the DSA.

  • Non-public data, known as “vetted researcher data access”, under Article 40(4)-(11). This is a process where a researcher, who has been vetted or assessed by a Digital Services Coordinator to have met the criteria as set out in DSA Article 40(8), can request access to non-public data held by a VLOP/VLOSE. The data must be limited in scope and deemed necessary and proportionate to the purpose of the research.

  • Public data under Article 40(12). This is a process where a researcher who meets the relevant criteria can apply for data access directly from a VLOP/VLOSE, for example, access to a content library or API of public posts…(More)”.

A US-run system alerts the world to famines. It’s gone dark after Trump slashed foreign aid


Article by Lauren Kent: “A vital, US-run monitoring system focused on spotting food crises before they turn into famines has gone dark after the Trump administration slashed foreign aid.

The Famine Early Warning Systems Network (FEWS NET) monitors drought, crop production, food prices and other indicators in order to forecast food insecurity in more than 30 countries…Now, its work to prevent hunger in Sudan, South Sudan, Somalia, Yemen, Ethiopia, Afghanistan and many other nations has been stopped amid the Trump administration’s effort to dismantle the US Agency for International Development (USAID).

“These are the most acutely food insecure countries around the globe,” said Tanya Boudreau, the former manager of the project.

Amid the aid freeze, FEWS NET has no funding to pay staff in Washington or those working on the ground. The website is down. And its treasure trove of data that underpinned global analysis on food security – used by researchers around the world – has been pulled offline.

FEWS NET is considered the gold-standard in the sector, and it publishes more frequent updates than other global monitoring efforts. Those frequent reports and projections are key, experts say, because food crises evolve over time, meaning early interventions save lives and save money…The team at the University of Colorado Boulder has built a model to forecast water demand in Kenya, which feeds some data into the FEWS NET project but also relies on FEWS NET data provided by other research teams.

The data is layered and complex. And scientists say pulling the data hosted by the US disrupts other research and famine-prevention work conducted by universities and governments across the globe.

“It compromises our models, and our ability to be able to provide accurate forecasts of ground water use,” Denis Muthike, a Kenyan scientist and assistant research professor at UC Boulder, told CNN, adding: “You cannot talk about food security without water security as well.”

“Imagine that that data is available to regions like Africa and has been utilized for years and years – decades – to help inform decisions that mitigate catastrophic impacts from weather and climate events, and you’re taking that away from the region,” Muthike said. He cautioned that it would take many years to build another monitoring service that could reach the same level…(More)”.

Funding the Future: Grantmakers Strategies in AI Investment


Report by Project Evident: “…looks at how philanthropic funders are approaching requests to fund the use of AI… there was common recognition of AI’s importance and the tension between the need to learn more and to act quickly to meet the pace of innovation, adoption, and use of AI tools.

This research builds on the work of a February 2024 Project Evident and Stanford Institute for Human-Centered Artificial Intelligence working paper, Inspiring Action: Identifying the Social Sector AI Opportunity Gap. That paper reported that more practitioners than funders (by over a third) claimed their organization utilized AI. 

“From our earlier research, as well as in conversations with funders and nonprofits, it’s clear there’s a mismatch in the understanding and desire for AI tools and the funding of AI tools,” said Sarah Di Troia, Managing Director of Project Evident’s OutcomesAI practice and author of the report. “Grantmakers have an opportunity to quickly upskill their understanding – to help nonprofits improve their efficiency and impact, of course, but especially to shape the role of AI in civil society.”

The report offers a number of recommendations to the philanthropic sector. For example, funders and practitioners should ensure that community voice is included in the implementation of new AI initiatives to build trust and help reduce bias. Grantmakers should consider funding that allows for flexibility and innovation so that the social and education sectors can experiment with approaches. Most importantly, funders should increase their capacity and confidence in assessing AI implementation requests along both technical and ethical criteria…(More)”.

Human Development and the Data Revolution


Book edited by Sanna Ojanperä, Eduardo López, and Mark Graham: “…explores the uses of large-scale data in the contexts of development, in particular, what techniques, data sources, and possibilities exist for harnessing large datasets and new online data to address persistent concerns regarding human development, inequality, exclusion, and participation.

Employing a global perspective to explore the latest advances at the intersection of big data analysis and human development, this volume brings together pioneering voices from academia, development practice, civil society organizations, government, and the private sector. With a two-pronged focus on theoretical and practical research on big data and computational approaches in human development, the volume covers such themes as data acquisition, data management, data mining and statistical analysis, network science, visual analytics, and geographic information systems and discusses them in terms of practical applications in development projects and initiatives. Ethical considerations surrounding these topics are visited throughout, highlighting the tradeoffs between benefitting and harming those who are the subjects of these new approaches…(More)”

Extending the CARE Principles: managing data for vulnerable communities in wartime and humanitarian crises


Essay by Yana Suchikova & Serhii Nazarovets: “The CARE Principles (Collective Benefit, Authority to Control, Responsibility, Ethics) were developed to ensure ethical stewardship of Indigenous data. However, their adaptability makes them an ideal framework for managing data related to vulnerable populations affected by armed conflicts. This essay explores the application of CARE principles to wartime contexts, with a particular focus on internally displaced persons (IDPs) and civilians living under occupation. These groups face significant risks of data misuse, ranging from privacy violations to targeted repression. By adapting CARE, data governance can prioritize safety, dignity, and empowerment while ensuring that data serves the collective welfare of affected communities. Drawing on examples from Indigenous data governance, open science initiatives, and wartime humanitarian challenges, this essay argues for extending CARE principles beyond their original scope. Such an adaptation highlights CARE’s potential as a universal standard for addressing the ethical complexities of data management in humanitarian crises and conflict-affected environments…(More)”.

Data, waves and wind to be counted in the economy


Article by Robert Cuffe: “Wind and waves are set to be included in calculations of the size of countries’ economies for the first time, as part of changes approved at the United Nations.

Assets like oilfields were already factored in under the rules – last updated in 2008.

This update aims to capture areas that have grown since then, such as the cost of using up natural resources and the value of data.

The changes come into force in 2030, and could mean an increase in estimates of the size of the UK economy, making promises to spend a fixed share of the economy on defence or aid more expensive.

The economic value of wind and waves can be estimated from the price of all the energy that can be generated from the turbines in a country.
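
That back-of-the-envelope idea can be made concrete with a small sketch (illustrative assumptions only, not the methodology in the UN rule book): multiply the energy a country's turbines can generate each year by its price, then discount that stream over an assumed asset life.

```python
# Simplified, illustrative valuation of a renewable resource from the energy it can generate.
# All figures and the discounting approach are assumptions for illustration,
# not the UN System of National Accounts method.
annual_generation_mwh = 25_000_000   # assumed yearly output of a country's turbines
price_per_mwh = 60.0                 # assumed average electricity price
asset_life_years = 25                # assumed remaining productive life
discount_rate = 0.035                # assumed discount rate

# Present value of the stream of energy revenues over the asset's life
value = sum(
    (annual_generation_mwh * price_per_mwh) / (1 + discount_rate) ** year
    for year in range(1, asset_life_years + 1)
)
print(f"Estimated asset value: ${value / 1e9:.1f} billion")
```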

The update also treats data as an asset in its own right on top of the assets that house it like servers and cables.

Governments use a common rule book for measuring the size of their economies and how they grow over time.

These changes to the rule book are “tweaks, rather than a rewrite”, according to Prof Diane Coyle of the University of Cambridge.

Ben Zaranko of the Institute for Fiscal Studies (IFS) calls it an “accounting” change, rather than a real change. He explains: “We’d be no better off in a material sense, and tax revenues would be no higher.”

But it could make economies look bigger, creating a possible future spending headache for the UK government…(More)”.

Bridging the Data Provenance Gap Across Text, Speech and Video


Paper by Shayne Longpre et al: “Progress in AI is driven largely by the scale and quality of training data. Despite this, there is a deficit of empirical analysis examining the attributes of well-established datasets beyond text. In this work we conduct the largest and first-of-its-kind longitudinal audit across modalities – popular text, speech, and video datasets – from their detailed sourcing trends and use restrictions to their geographical and linguistic representation. Our manual analysis covers nearly 4000 public datasets between 1990 and 2024, spanning 608 languages, 798 sources, 659 organizations, and 67 countries. We find that multimodal machine learning applications have overwhelmingly turned to web-crawled, synthetic, and social media platforms, such as YouTube, for their training sets, eclipsing all other sources since 2019. Secondly, tracing the chain of dataset derivations, we find that while less than 33% of datasets are restrictively licensed, over 80% of the source content in widely-used text, speech, and video datasets carry non-commercial restrictions. Finally, counter to the rising number of languages and geographies represented in public AI training datasets, our audit demonstrates measures of relative geographical and multilingual representation have failed to significantly improve their coverage since 2013. We believe the breadth of our audit enables us to empirically examine trends in data sourcing, restrictions, and Western-centricity at an ecosystem level, and that visibility into these questions is essential to progress in responsible AI. As a contribution to ongoing improvements in dataset transparency and responsible use, we release our entire multimodal audit, allowing practitioners to trace data provenance across text, speech, and video…(More)”.
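
The gap between dataset-level licences and source-content restrictions is easier to see with a toy sketch of provenance tracing, where a derived dataset inherits the most restrictive terms found anywhere upstream in its derivation chain. This is an illustrative assumption about how such tracing could work, not the authors' audit pipeline; the names and restrictiveness ranking are made up.

```python
# Illustrative toy of licence provenance tracing: a derived dataset inherits the most
# restrictive terms found anywhere in its chain of sources. Names and the ranking
# below are assumptions, not the paper's actual audit code.
RESTRICTIVENESS = {"public-domain": 0, "attribution": 1, "non-commercial": 2}

# Each dataset maps to (its own stated licence, its immediate sources)
provenance = {
    "web-crawl-corpus": ("attribution", []),
    "video-captions": ("non-commercial", []),
    "merged-training-set": ("attribution", ["web-crawl-corpus", "video-captions"]),
}

def effective_license(name: str) -> str:
    """Return the most restrictive licence along the derivation chain."""
    own_license, sources = provenance[name]
    licenses = [own_license] + [effective_license(s) for s in sources]
    return max(licenses, key=RESTRICTIVENESS.get)

# The merged set is labelled "attribution", but one of its sources imposes
# non-commercial terms, so those terms propagate to the derived dataset.
print(effective_license("merged-training-set"))  # -> non-commercial
```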

Artificial intelligence for modelling infectious disease epidemics


Paper by Moritz U. G. Kraemer et al: “Infectious disease threats to individual and public health are numerous, varied and frequently unexpected. Artificial intelligence (AI) and related technologies, which are already supporting human decision making in economics, medicine and social science, have the potential to transform the scope and power of infectious disease epidemiology. Here we consider the application to infectious disease modelling of AI systems that combine machine learning, computational statistics, information retrieval and data science. We first outline how recent advances in AI can accelerate breakthroughs in answering key epidemiological questions and we discuss specific AI methods that can be applied to routinely collected infectious disease surveillance data. Second, we elaborate on the social context of AI for infectious disease epidemiology, including issues such as explainability, safety, accountability and ethics. Finally, we summarize some limitations of AI applications in this field and provide recommendations for how infectious disease epidemiology can harness most effectively current and future developments in AI…(More)”.