A Quest for AI Knowledge


Paper by Joshua S. Gans: “This paper examines how the introduction of artificial intelligence (AI), particularly generative and large language models capable of interpolating precisely between known data points, reshapes scientists’ incentives for pursuing novel versus incremental research. Extending the theoretical framework of Carnehl and Schneider (2025), we analyse how decision-makers leverage AI to improve precision within well-defined knowledge domains. We identify conditions under which the availability of AI tools encourages scientists to choose more socially valuable, highly novel research projects, contrasting sharply with traditional patterns of incremental knowledge growth. Our model demonstrates a critical complementarity: scientists strategically align their research novelty choices to maximise the domain where AI can reliably inform decision-making. This dynamic fundamentally transforms the evolution of scientific knowledge, leading either to systematic “stepping stone” expansions or endogenous research cycles of strategic knowledge deepening. We discuss the broader implications for science policy, highlighting how sufficiently capable AI tools could mitigate traditional inefficiencies in scientific innovation, aligning private research incentives closely with the social optimum…(More)”.

What is a fair exchange for access to public data?


Blog and policy brief by Jeni Tennison: “The most obvious approach to get companies to share value back to the public sector in return for access to data is to charge them. However, there are a number of challenges with a “pay to access” approach: it’s hard to set the right price; it creates access barriers, particularly for cash-poor start-ups; and it creates a public perception that the government is willing to sell people’s data, and might be tempted to loosen privacy-protecting governance controls in exchange for cash.

Are there other options? The policy brief explores a range of other approaches and assesses these against five goals that a value-sharing framework should ideally meet, to:

  • Encourage use of public data, including by being easy for organisations to understand and administer.
  • Provide a return on investment for the public sector, offsetting at least some of the costs of supporting the National Data Library (NDL) infrastructure and minimising administrative costs.
  • Promote equitable innovation and economic growth in the UK, which might mean particularly encouraging smaller, home-grown businesses.
  • Create social value, particularly towards this Government’s other missions, such as achieving Net Zero or unlocking opportunity for all.
  • Build public trust by being easily explainable, avoiding misaligned incentives that encourage the breaking of governance guardrails, and feeling like a fair exchange.

In brief, alternatives to a pay-to-access model that still provide direct financial returns include:

  • Discounts: the public sector could secure discounts on products and services created using public data. However, this could be difficult to administer and enforce.
  • Royalties: taking a percentage of charges for products and services created using public data might be similarly hard to administer and enforce, but applies to more companies.
  • Equity: taking equity in start-ups can provide long-term returns and align with public investment goals.
  • Levies: targeted taxes on businesses that use public data can provide predictable revenue and encourage data use.
  • General taxation: general taxation can fund data infrastructure, but it may lack the targeted approach and public visibility of other methods.

It’s also useful to consider non-financial conditions that could be put on organisations accessing public data…(More)”.

A crowd-sourced repository for valuable government data


About: “DataLumos is an ICPSR archive for valuable government data resources. ICPSR has a long-standing commitment to safekeeping and disseminating US government and other social science data. DataLumos accepts deposits of public data resources from the community and recommendations of public data resources that ICPSR itself might add to DataLumos. Please consider making a monetary donation to sustain DataLumos…(More)”.

The Age of AI in the Life Sciences: Benefits and Biosecurity Considerations


Report by the National Academies of Sciences, Engineering, and Medicine: “Artificial intelligence (AI) applications in the life sciences have the potential to enable advances in biological discovery and design at a faster pace and efficiency than is possible with classical experimental approaches alone. At the same time, AI-enabled biological tools developed for beneficial applications could potentially be misused for harmful purposes. Although the creation of biological weapons is not a new concept or risk, the potential for AI-enabled biological tools to affect this risk has raised concerns during the past decade.

This report, as requested by the Department of Defense, assesses how AI-enabled biological tools could uniquely impact biosecurity risk, and how advancements in such tools could also be used to mitigate these risks. The Age of AI in the Life Sciences reviews the capabilities of AI-enabled biological tools and can be used in conjunction with the 2018 National Academies report, Biodefense in the Age of Synthetic Biology, which sets out a framework for identifying the different risk factors associated with synthetic biology capabilities…(More)”

Can Real-Time Metrics Fill China’s Data Gap?


Case study by Danielle Goldfarb: “After Chinese authorities abruptly reversed the country’s zero-COVID policy in 2022, global policymakers needed a clear and timely picture of the economic and health fallout.

China’s economy is the world’s second largest and the country has deep global links, so an accurate picture of its trajectory mattered for global health, growth and inflation. Getting a solid read was a challenge, however, since official health and economic data were not only untimely but also widely viewed as unreliable.

There are now vast amounts and varied types of digital data available, from satellite images to social media text to online payments; these, along with advances in artificial intelligence (AI), make it possible to collect and analyze such data in ways that were previously impossible.

Could these new tools help governments and global institutions refute or confirm China’s official picture and gather more timely intelligence?…(More)”.
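
To make the nowcasting idea concrete, here is a minimal sketch, in Python, of the kind of technique the case study describes: standardizing several real-time proxies and averaging them into a composite activity index. The indicator names and numbers are entirely hypothetical, and real systems use far richer data and models.

```python
# Illustrative sketch only: combine alternative real-time indicators
# (all names and values below are hypothetical) into a composite
# activity index that can be compared against official statistics.
from statistics import mean, stdev

def zscores(series):
    """Standardize a series so indicators on different scales are comparable."""
    mu, sigma = mean(series), stdev(series)
    return [(x - mu) / sigma for x in series]

# Hypothetical weekly proxies for economic activity:
indicators = {
    "satellite_nightlights": [93, 95, 90, 70, 65, 72, 80],
    "subway_ridership":      [1.00, 1.02, 0.95, 0.60, 0.55, 0.70, 0.85],
    "online_payments":       [210, 215, 205, 160, 150, 175, 190],
}

# Average the standardized indicators week by week into one index.
standardized = [zscores(values) for values in indicators.values()]
composite = [mean(week) for week in zip(*standardized)]

for week, value in enumerate(composite, start=1):
    print(f"week {week}: activity index {value:+.2f}")
```

A sustained gap between such an index and the official figures is precisely the kind of signal these approaches look for.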

Generative AI in Transportation Planning: A Survey


Paper by Longchao Da: “The integration of generative artificial intelligence (GenAI) into transportation planning has the potential to revolutionize tasks such as demand forecasting, infrastructure design, policy evaluation, and traffic simulation. However, there is a critical need for a systematic framework to guide the adoption of GenAI in this interdisciplinary domain. In this survey, we, a multidisciplinary team of researchers spanning computer science and transportation engineering, present the first comprehensive framework for leveraging GenAI in transportation planning. Specifically, we introduce a new taxonomy that categorizes existing applications and methodologies into two perspectives: transportation planning tasks and computational techniques. From the transportation planning perspective, we examine the role of GenAI in automating descriptive, predictive, generative simulation, and explainable tasks to enhance mobility systems. From the computational perspective, we detail advancements in data preparation, domain-specific fine-tuning, and inference strategies such as retrieval-augmented generation and zero-shot learning tailored to transportation applications. Additionally, we address critical challenges, including data scarcity, explainability, bias mitigation, and the development of domain-specific evaluation frameworks that align with transportation goals like sustainability, equity, and system efficiency. This survey aims to bridge the gap between traditional transportation planning methodologies and modern AI techniques, fostering collaboration and innovation. By addressing these challenges and opportunities, we seek to inspire future research that ensures ethical, equitable, and impactful use of generative AI in transportation planning…(More)”.
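
As a concrete illustration of one inference strategy the survey covers, the sketch below shows the skeleton of retrieval-augmented generation: retrieve the most relevant planning document for a query, then fold it into the prompt sent to a language model. It is a toy, standard-library-only approximation under assumed documents and scoring, not the paper's implementation; the actual LLM call is left as a placeholder.

```python
# Minimal retrieval-augmented generation (RAG) skeleton for a
# transportation-planning question. Documents, retrieval scoring, and
# the prompt format are all illustrative assumptions.
import math
from collections import Counter

corpus = {  # hypothetical planning documents
    "doc1": "Bus rapid transit corridors reduced peak congestion by improving headways.",
    "doc2": "Zoning reform near rail stations increased ridership and housing supply.",
    "doc3": "Congestion pricing shifted demand toward off-peak travel and transit.",
}

def bow(text):
    """Bag-of-words vector used for simple lexical retrieval."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

query = "How can a city reduce peak-hour congestion?"
qvec = bow(query)

# Retrieval step: pick the document most similar to the query.
best_id = max(corpus, key=lambda d: cosine(qvec, bow(corpus[d])))

# Augmentation step: ground the prompt in the retrieved context.
prompt = (
    "Answer the planning question using the context.\n"
    f"Context: {corpus[best_id]}\n"
    f"Question: {query}\n"
)
print(prompt)  # in practice, this prompt would be sent to an LLM
```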

How data can transform government in Latin America and the Caribbean


Article by William Maloney, Daniel Rogger, and Christian Schuster: “Governments across Latin America and the Caribbean are grappling with deep governance challenges that threaten progress and stability, including the need to improve efficiency, accountability and transparency.

Amid these obstacles, however, the region possesses a powerful, often underutilized asset: the administrative data it collects as part of its everyday operations.

When harnessed effectively using data analytics, this data has the potential to drive transformative change, unlock new opportunities for growth and help address some of the most pressing issues facing the region. It’s time to tap into this potential and use data to chart a path forward. To help governments make the most of the opportunities that this data presents, the World Bank has embarked on a decade-long project to synthesize the latest knowledge on how to measure and improve government performance. We have found that governments already have a lot of the data they need to dramatically improve public services while conserving scarce resources.

But it’s not enough to collect data. It must also be put to good use to improve decision making, design better public policy and strengthen public sector functioning. We call these tools and practices for repurposing government data government analytics…(More)”.

Launch: A Blueprint to Unlock New Data Commons for Artificial Intelligence (AI)


Blueprint by Hannah Chafetz, Andrew J. Zahuranec, and Stefaan Verhulst: “In today’s rapidly evolving AI landscape, it is critical to broaden access to diverse and high-quality data to ensure that AI applications can serve all communities equitably. Yet, we are on the brink of a potential “data winter,” where valuable data assets that could drive public good are increasingly locked away or inaccessible.

Data commons — collaboratively governed ecosystems that enable responsible sharing of diverse datasets across sectors — offer a promising solution. By pooling data under clear standards and shared governance, data commons can unlock the potential of AI for public benefit while ensuring that its development reflects the diversity of experiences and needs across society.

To accelerate the creation of data commons, the Open Data Policy Lab today releases “A Blueprint to Unlock New Data Commons for AI” — a guide on how to steward data to create data commons that enable public-interest AI use cases…the document is aimed at supporting libraries, universities, research centers, and other data holders (e.g. governments and nonprofits) through four modules:

  • Mapping the Demand and Supply: Understanding why AI systems need data, what data can be made available to train, adapt, or augment AI, and what a viable data commons prototype that incorporates stakeholder needs and values might look like;
  • Unlocking Participatory Governance: Co-designing key aspects of the data commons with key stakeholders and documenting these aspects within a formal agreement;
  • Building the Commons: Establishing the data commons from a practical perspective and ensuring all stakeholders are incentivized to implement it; and
  • Assessing and Iterating: Evaluating how the commons is working and iterating as needed.

These modules are further supported by two supplementary taxonomies. “The Taxonomy of Data Types” provides a list of data types that can be valuable for public-interest generative AI use cases. The “Taxonomy of Use Cases” outlines public-interest generative AI applications that can be developed using a data commons approach, along with possible outcomes and stakeholders involved.

A separate set of worksheets can be used to further guide organizations in deploying these tools…(More)”.

Vetted Researcher Data Access


Coimisiún na Meán: “Article 40 of the Digital Services Act (DSA) makes provision for researchers to access data from Very Large Online Platforms (VLOPs) or Very Large Online Search Engines (VLOSEs) for the purposes of studying systemic risk in the EU and assessing mitigation measures. There are two ways that researchers studying systemic risk in the EU can get access to data under Article 40 of the DSA.

  • Non-public data, known as “vetted researcher data access”, under Article 40(4)-(11): a process where a researcher who has been vetted by a Digital Services Coordinator as meeting the criteria set out in DSA Article 40(8) can request access to non-public data held by a VLOP/VLOSE. The data must be limited in scope and deemed necessary and proportionate to the purpose of the research.
  • Public data under Article 40(12): a process where a researcher who meets the relevant criteria can apply for data access directly from a VLOP/VLOSE, for example, access to a content library or API of public posts…(More)”.

A US-run system alerts the world to famines. It’s gone dark after Trump slashed foreign aid


Article by Lauren Kent: “A vital, US-run monitoring system focused on spotting food crises before they turn into famines has gone dark after the Trump administration slashed foreign aid.

The Famine Early Warning Systems Network (FEWS NET) monitors drought, crop production, food prices and other indicators in order to forecast food insecurity in more than 30 countries…Now, its work to prevent hunger in Sudan, South Sudan, Somalia, Yemen, Ethiopia, Afghanistan and many other nations has been stopped amid the Trump administration’s effort to dismantle the US Agency for International Development (USAID).

“These are the most acutely food insecure countries around the globe,” said Tanya Boudreau, the former manager of the project.

Amid the aid freeze, FEWS NET has no funding to pay staff in Washington or those working on the ground. The website is down. And its treasure trove of data that underpinned global analysis on food security – used by researchers around the world – has been pulled offline.

FEWS NET is considered the gold standard in the sector, and it publishes more frequent updates than other global monitoring efforts. Those frequent reports and projections are key, experts say, because food crises evolve over time, meaning early interventions save lives and save money…The team at the University of Colorado Boulder has built a model to forecast water demand in Kenya, which feeds some data into the FEWS NET project but also relies on FEWS NET data provided by other research teams.

The data is layered and complex. And scientists say pulling the data hosted by the US disrupts other research and famine-prevention work conducted by universities and governments across the globe.

“It compromises our models, and our ability to be able to provide accurate forecasts of ground water use,” Denis Muthike, a Kenyan scientist and assistant research professor at UC Boulder, told CNN, adding: “You cannot talk about food security without water security as well.”

“Imagine that that data is available to regions like Africa and has been utilized for years and years – decades – to help inform decisions that mitigate catastrophic impacts from weather and climate events, and you’re taking that away from the region,” Muthike said. He cautioned that it would take many years to build another monitoring service that could reach the same level…(More)”.