Panels giving scientific advice to Census Bureau disbanded by Trump administration


Article by Jeffrey Mervis: “…U.S. Secretary of Commerce Howard Lutnick has disbanded five outside panels that provide scientific and community advice to the U.S. Census Bureau and other federal statistical agencies just as preparations are ramping up for the country’s next decennial census, in 2030.

The dozens of demographers, statisticians, and public members on the five panels received nearly identical letters this week telling them that “the Secretary of Commerce has determined that the purposes for which the [committee] was established have been fulfilled, and the committee has been terminated effective February 28, 2025. Thank you for your service.”

Statistician Robert Santos, who last month resigned as Census Bureau director 3 years into his 5-year term, says he’s “terribly disappointed but not surprised” by the move, noting how a recent directive by President Donald Trump on gender identity has disrupted data collection for a host of federal surveys…(More)”.

New AI Collaboratives to take action on wildfires and food insecurity


Google: “…last September we introduced AI Collaboratives, a new funding approach designed to unite public, private and nonprofit organizations and researchers to create AI-powered solutions to help people around the world.

Today, we’re sharing more about our first two focus areas for AI Collaboratives: Wildfires and Food Security.

Wildfires are a global crisis, claiming more than 300,000 lives due to smoke exposure annually and causing billions of dollars in economic damage. …Google.org has convened more than 15 organizations, including Earth Fire Alliance and Moore Foundation, to help in this important effort. By coordinating funding and integrating cutting-edge science, emerging technology and on-the-ground applications, we can provide collaborators with the tools they need to identify and track wildfires in near real time; quantify wildfire risk; shift more acreage to beneficial fires; and ultimately reduce the damage caused by catastrophic wildfires.

Nearly one-third of the world’s population faces moderate or severe food insecurity due to extreme weather, conflict and economic shocks. The AI Collaborative: Food Security will strengthen the resilience of global food systems and improve food security for the world’s most vulnerable populations through AI technologies, collaborative research, data-sharing and coordinated action. To date, 10 organizations have joined us in this effort, and we’ll share more updates soon…(More)”.

Government data is disappearing before our eyes


Article by Anna Massoglia: “A battle is being waged in the quiet corners of government websites and data repositories. Essential public records are disappearing and, with them, Americans’ ability to hold those in power accountable.

Take the Department of Government Efficiency, Elon Musk’s federal cost-cutting initiative. Touted as “maximally transparent,” DOGE is supposed to make government spending more efficient. But when journalists and researchers exposed major errors — from double-counting contracts to conflating caps with actual spending — DOGE didn’t fix the mistakes. Instead, it made them harder to detect.

Many Americans hoped DOGE’s work would be a step toward cutting costs and restoring trust in government. But trust must be earned. If our leaders truly want to restore faith in our institutions, they must ensure that facts remain available to everyone, not just when convenient.

Since Jan. 20, public records across the federal government have been erased. Economic indicators that guide investments, scientific datasets that drive medical breakthroughs, federal health guidelines and historical archives that inform policy decisions have all been put on the chopping block. Some missing datasets have been restored but are incomplete or have unexplained changes, rendering them unreliable.

Both Republican and Democratic administrations have played a role in limiting public access to government records. But the scale and speed of the Trump administration’s data manipulation — combined with buyouts, resignations and other restructuring across federal agencies — signal a new phase in the war on public information. This is not just about deleting files; it’s about controlling what the public sees, shaping the narrative and limiting accountability.

The Trump administration is accelerating this trend with revisions to official records. Unelected advisors are overseeing a sweeping reorganization of federal data, granting entities like DOGE unprecedented access to taxpayer records with little oversight. This is not just a bureaucratic reshuffle — it is a fundamental reshaping of the public record.

The consequences of data manipulation extend far beyond politics. When those in power control the flow of information, they can dictate collective truth. Governments that manipulate information are not just rewriting statistics — they are rewriting history.

From authoritarian regimes that have erased dissent to leaders who have fabricated economic numbers to maintain their grip on power, the dangers of suppressing and distorting data are well-documented.

Misleading or inconsistent data can be just as dangerous as opacity. When hard facts are replaced with political spin, conspiracy theories take root and misinformation fills the void.

The fact that data suppression and manipulation have occurred before does not lessen the danger; rather, it underscores the urgency of taking proactive measures to safeguard transparency. A missing statistic today can become a missing historical fact tomorrow. Over time, that can reshape our reality…(More)”.

Large AI models are cultural and social technologies


Essay by Henry Farrell, Alison Gopnik, Cosma Shalizi, and James Evans: “Debates about artificial intelligence (AI) tend to revolve around whether large models are intelligent, autonomous agents. Some AI researchers and commentators speculate that we are on the cusp of creating agents with artificial general intelligence (AGI), a prospect anticipated with both elation and anxiety. There have also been extensive conversations about the cultural and social consequences of large models, orbiting around two foci: the immediate effects of these systems as they are currently used, and hypothetical futures in which these systems turn into AGI agents, perhaps even superintelligent AGI agents.

But this discourse about large models as intelligent agents is fundamentally misconceived. Combining ideas from social and behavioral sciences with computer science can help us understand AI systems more accurately. Large models should not be viewed primarily as intelligent agents, but as a new kind of cultural and social technology, allowing humans to take advantage of information other humans have accumulated.

The new technology of large models combines important features of earlier technologies. Like pictures, writing, print, video, Internet search, and other such technologies, large models allow people to access information that other people have created. Large models – currently language, vision, and multi-modal models – depend on the fact that the Internet has made the products of these earlier technologies readily available in machine-readable form. But like economic markets, state bureaucracies, and other social technologies, these systems not only make information widely available but also allow it to be reorganized, transformed, and restructured in distinctive ways. Adopting Herbert Simon’s terminology, large models are a new variant of the “artificial systems of human society” that process information to enable large-scale coordination…(More)”

Can small language models revitalize Indigenous languages?


Article by Brooke Tanner and Cameron F. Kerry: “Indigenous languages play a critical role in preserving cultural identity and transmitting unique worldviews, traditions, and knowledge, but at least 40% of the world’s 6,700 languages are currently endangered. The United Nations declared 2022-2032 as the International Decade of Indigenous Languages to draw attention to this threat, in hopes of supporting the revitalization of these languages and preservation of access to linguistic resources.  

Building on the advantages of small language models (SLMs), several initiatives have successfully adapted these models specifically for Indigenous languages. Such Indigenous language models (ILMs) represent a subset of SLMs that are designed, trained, and fine-tuned with input from the communities they serve. 

Case studies and applications 

  • Meta released No Language Left Behind (NLLB-200), a 54 billion–parameter open-source machine translation model that supports 200 languages as part of Meta’s universal speech translator project, including languages with limited translation resources. While the breadth of languages the model covers is novel, NLLB-200 can struggle to capture the intricacies of local context for low-resource languages and often relies on machine-translated sentence pairs gathered from across the internet because digitized monolingual data is scarce. 
  • Lelapa AI’s InkubaLM-0.4B is an SLM with applications for low-resource African languages. Trained on 1.9 billion tokens across languages including isiZulu, Yoruba, Swahili, and isiXhosa, InkubaLM-0.4B (with 400 million parameters) builds on Meta’s LLaMA 2 architecture, providing a smaller model than the original LLaMA 2 pretrained model with 7 billion parameters. 
  • IBM Research Brazil and the University of São Paulo have collaborated on projects aimed at preserving Brazilian Indigenous languages such as Guarani Mbya and Nheengatu. These initiatives emphasize co-creation with Indigenous communities and address concerns about cultural exposure and language ownership. Initial efforts included electronic dictionaries, word prediction, and basic translation tools. Notably, when a prototype writing assistant for Guarani Mbya raised concerns about exposing their language and culture online, project leaders paused further development pending community consensus.  
  • Researchers have fine-tuned pre-trained models for Nheengatu using linguistic educational sources and translations of the Bible, with plans to incorporate community-guided spellcheck tools. Because those translations rely on Bible data, primarily translated by colonial priests, they often sounded archaic and could reflect cultural abuse and violence; they were therefore classified as potentially “toxic” data that would not be used in any deployed system without explicit Indigenous community agreement…(More)”. (A hedged fine-tuning sketch follows this list.)
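
The sketch below shows one plausible way to fine-tune a small pretrained language model on a community-curated text corpus using the Hugging Face Transformers library. It is a minimal illustration, not the researchers’ actual pipeline: the base-model identifier and the corpus file name are assumptions, and any real use would follow the community-consent practices described above.

```python
# Minimal sketch (not the researchers' actual pipeline): fine-tune a small
# causal language model on community-approved text for a low-resource language.
# The model id and "community_corpus.txt" are placeholders / assumptions.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

BASE_MODEL = "lelapa/InkubaLM-0.4B"  # assumed Hugging Face id; any small causal LM works

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
if tokenizer.pad_token is None:      # small causal LMs often lack a pad token
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

# Plain-text corpus assembled and released with the language community's consent.
dataset = load_dataset("text", data_files={"train": "community_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="ilm-finetuned",
        num_train_epochs=3,
        per_device_train_batch_size=4,
        learning_rate=2e-5,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```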

Bridging Digital Divides: How PescaData is Connecting Small-Scale Fishing Cooperatives to the Blue Economy


Article by Stuart Fulton: “In this research project, we examine how digital platforms – specifically PescaData – can be leveraged to connect small-scale fishing cooperatives with impact investors and donors, creating new pathways for sustainable blue economy financing, while simultaneously ensuring fair data practices that respect data sovereignty and traditional ecological knowledge.

PescaData emerged as a pioneering digital platform that enables fishing communities to collect more accurate data to ensure sustainable fisheries. Since then, PescaData has evolved to provide software as a service to fishing cooperatives and to allow fishers to document their solutions to environmental and economic challenges. Since 2022, small-scale fishers have used it to document nearly 300 initiatives that contribute to multiple Sustainable Development Goals. 

Respecting Data Sovereignty in the Digital Age

One critical aspect of our research acknowledges the unique challenges of implementing digital tools in traditional cooperative settings. Unlike conventional tech implementations that often extract value from communities, PescaData’s approach centers on data sovereignty – the principle that fishing communities should maintain ownership and control over their data. As the PescaData case study demonstrates, a humanity-centric rather than merely user-centric approach is essential. This means designing with compassion and establishing clear governance around data from the very beginning. The data generated by fishing cooperatives represents not just information, but traditional knowledge accumulated over generations of resource management.

The fishers themselves have articulated clear principles for data governance in a cooperative model:

  • Ownership: Fishers, as data producers, decide who has access and under what conditions.
  • Transparency: Clear agreements on data use.
  • Knowledge assessment: Highlighting fishers’ contributions and placing them in decision-making positions.
  • Co-design: Ensuring the platform meets their specific needs.
  • Security: Protecting collected data…(More)”. (An illustrative sketch of these principles in code follows.)
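
The sketch below is our illustration, not PescaData’s implementation: it encodes the ownership and transparency principles as a simple record type whose data is released only under grants the producing cooperative has issued for a named requester and a stated purpose. All names and fields are hypothetical.

```python
# Illustrative sketch (not PescaData's code): cooperative-controlled data access.
# A record is released only under a grant the producing cooperative has issued
# for a named requester and a stated purpose, keeping an auditable access log.
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class AccessGrant:
    requester: str          # e.g. an impact investor or research group
    purposes: Set[str]      # conditions agreed with the cooperative


@dataclass
class CatchRecord:
    cooperative: str                 # the cooperative that produced the data
    payload: Dict[str, float]        # e.g. {"catch_kg": 120.0}
    grants: List[AccessGrant] = field(default_factory=list)
    access_log: List[str] = field(default_factory=list)

    def grant(self, requester: str, purposes: Set[str]) -> None:
        """The cooperative, as data owner, decides who gets access and for what."""
        self.grants.append(AccessGrant(requester, purposes))

    def release(self, requester: str, purpose: str) -> Dict[str, float]:
        """Release data only under an existing grant; log every access for transparency."""
        for g in self.grants:
            if g.requester == requester and purpose in g.purposes:
                self.access_log.append(f"{requester}:{purpose}")
                return self.payload
        raise PermissionError(
            f"{self.cooperative} has not authorised {requester} for '{purpose}'")


# Hypothetical usage
record = CatchRecord(cooperative="Cooperativa Ejemplo", payload={"catch_kg": 120.0})
record.grant("impact_fund_a", {"sustainability_reporting"})
record.release("impact_fund_a", "sustainability_reporting")  # allowed; logged
```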

A Quest for AI Knowledge


Paper by Joshua S. Gans: “This paper examines how the introduction of artificial intelligence (AI), particularly generative and large language models capable of interpolating precisely between known data points, reshapes scientists’ incentives for pursuing novel versus incremental research. Extending the theoretical framework of Carnehl and Schneider (2025), we analyse how decision-makers leverage AI to improve precision within well-defined knowledge domains. We identify conditions under which the availability of AI tools encourages scientists to choose more socially valuable, highly novel research projects, contrasting sharply with traditional patterns of incremental knowledge growth. Our model demonstrates a critical complementarity: scientists strategically align their research novelty choices to maximise the domain where AI can reliably inform decision-making. This dynamic fundamentally transforms the evolution of scientific knowledge, leading either to systematic “stepping stone” expansions or endogenous research cycles of strategic knowledge deepening. We discuss the broader implications for science policy, highlighting how sufficiently capable AI tools could mitigate traditional inefficiencies in scientific innovation, aligning private research incentives closely with the social optimum…(More)”.
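
As a stylized reading of that interpolation mechanism (our illustration, not the paper’s formal model): if established knowledge sits at points on a line of possible questions, an AI that interpolates precisely between known points makes the entire span between the extreme known points usable for decision-making, so a successful novel project beyond the frontier enlarges that span, while an incremental project inside it deepens knowledge without adding new territory.

```latex
% Stylized illustration only; not the formal model in Gans or Carnehl and Schneider (2025).
% Known answers sit at points on a line of questions; AI interpolates between them.
\[
  K = \{x_1 < x_2 < \dots < x_n\}, \qquad
  \mathcal{D}(K) \;=\; \bigcup_{i=1}^{n-1} [x_i, x_{i+1}] \;=\; [x_1, x_n]
\]
% A successful novel project at x* > x_n (a "stepping stone") expands the
% AI-usable domain from [x_1, x_n] to [x_1, x*]; an incremental project
% inside [x_1, x_n] improves precision but adds no new territory.
\[
  \mathcal{D}\bigl(K \cup \{x^{*}\}\bigr) = [x_1, x^{*}], \qquad x^{*} > x_n
\]
```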

What is a fair exchange for access to public data?


Blog and policy brief by Jeni Tennison: “The most obvious approach to get companies to share value back to the public sector in return for access to data is to charge them. However, there are a number of challenges with a “pay to access” approach: it’s hard to set the right price; it creates access barriers, particularly for cash-poor start-ups; and it creates a public perception that the government is willing to sell their data, and might be tempted to loosen privacy-protecting governance controls in exchange for cash.

Are there other options? The policy brief explores a range of other approaches and assesses these against five goals that a value-sharing framework should ideally meet, to:

  • Encourage use of public data, including by being easy for organisations to understand and administer.
  • Provide a return on investment for the public sector, offsetting at least some of the costs of supporting the National Data Library (NDL) infrastructure and minimising administrative costs.
  • Promote equitable innovation and economic growth in the UK, which might mean particularly encouraging smaller, home-grown businesses.
  • Create social value, particularly towards this Government’s other missions, such as achieving Net Zero or unlocking opportunity for all.
  • Build public trust by being easily explainable, avoiding misaligned incentives that encourage the breaking of governance guardrails, and feeling like a fair exchange.

In brief, alternatives to a pay-to-access model that still provide direct financial returns include:

  • Discounts: the public sector could secure discounts on products and services created using public data. However, this could be difficult to administer and enforce.
  • Royalties: taking a percentage of charges for products and services created using public data might be similarly hard to administer and enforce, but applies to more companies.
  • Equity: taking equity in startups can provide long-term returns and align with public investment goals.
  • Levies: targeted taxes on businesses that use public data can provide predictable revenue and encourage data use.
  • General taxation: general taxation can fund data infrastructure, but it may lack the targeted approach and public visibility of other methods.

It’s also useful to consider non-financial conditions that could be put on organisations accessing public data…(More)”.

A crowd-sourced repository for valuable government data


About: “DataLumos is an ICPSR archive for valuable government data resources. ICPSR has a long commitment to safekeeping and disseminating US government and other social science data. DataLumos accepts deposits of public data resources from the community and recommendations of public data resources that ICPSR itself might add to DataLumos. Please consider making a monetary donation to sustain DataLumos…(More)”.

The Age of AI in the Life Sciences: Benefits and Biosecurity Considerations


Report by the National Academies of Sciences, Engineering, and Medicine: “Artificial intelligence (AI) applications in the life sciences have the potential to enable advances in biological discovery and design at a faster pace and efficiency than is possible with classical experimental approaches alone. At the same time, AI-enabled biological tools developed for beneficial applications could potentially be misused for harmful purposes. Although the creation of biological weapons is not a new concept or risk, the potential for AI-enabled biological tools to affect this risk has raised concerns during the past decade.

This report, as requested by the Department of Defense, assesses how AI-enabled biological tools could uniquely impact biosecurity risk, and how advancements in such tools could also be used to mitigate these risks. The Age of AI in the Life Sciences reviews the capabilities of AI-enabled biological tools and can be used in conjunction with the 2018 National Academies report, Biodefense in the Age of Synthetic Biology, which sets out a framework for identifying the different risk factors associated with synthetic biology capabilities…(More)”