DOGE comes for the data wonks


The Economist: “For nearly three decades the federal government has painstakingly surveyed tens of thousands of Americans each year about their health. Door-knockers collect data on the financial toll of chronic conditions like obesity and asthma, and probe the exact doses of medications sufferers take. The result, known as the Medical Expenditure Panel Survey (MEPS), is the single most comprehensive, nationally representative portrait of American health care, a balkanised and unwieldy $5trn industry that accounts for some 17% of GDP.

MEPS is part of a largely hidden infrastructure of government statistics collection now in the crosshairs of the Department of Government Efficiency (DOGE). In mid-March officials at a unit of the Department of Health and Human Services (HHS) that runs the survey told employees that DOGE had slated them for an 80-90% reduction in staff and that this would “not be a negotiation”. Since then scores of researchers have taken voluntary buyouts. Those left behind worry about the integrity of MEPS. “Very unclear whether or how we can put on MEPS” with roughly half of the staff leaving, one said. On March 27th, the health secretary, Robert F. Kennedy junior, announced an overall reduction of 10,000 personnel at the department, in addition to those who took buyouts.

There are scores of underpublicised government surveys like MEPS that document trends in everything from house prices to the amount of lead in people’s blood. Many provide standard-setting datasets and insights into the world’s largest economy that the private sector has no incentive to replicate.

Even so, America’s system of statistics research is overly analogue and needs modernising. “Using surveys as the main source of information is just not working” because it is too slow and suffers from declining rates of participation, says Julia Lane, an economist at New York University. In a world where the economy shifts by the day, the lags in traditional surveys—whose results can take weeks or even years to refine and publish—are unsatisfactory. One practical reform DOGE might encourage is better integration of administrative data such as tax records and social-security filings which often capture the entire population and are collected as a matter of course.

As in so many other areas, however, DOGE’s sledgehammer is more likely to cause harm than to achieve improvements. And for all its clunkiness, America’s current system manages a spectacular feat. From Inuits in remote corners of Alaska to Spanish-speakers in the Bronx, it measures the country and its inhabitants remarkably well, given that the population is highly diverse and spread out over 4m square miles. Each month surveys from the federal government reach about 1.5m people, a number roughly equivalent to the population of Hawaii or West Virginia…(More)”.

Researching data discomfort: The case of Statistics Norway’s quest for billing data


Paper by Lisa Reutter: “National statistics offices are increasingly exploring the possibilities of utilizing new data sources to position themselves in emerging data markets. In 2022, Statistics Norway announced that the national agency will require the biggest grocers in Norway to hand over all collected billing data to produce consumer behavior statistics which had previously been produced by other sampling methods. An online article discussing this proposal sparked a surprisingly (at least to Statistics Norway) high level of interest among readers, many of whom expressed concerns about this intended change in data practice. This paper focuses on the multifaceted online discussions of the proposal, as these enable us to study citizens’ reactions and feelings towards increased data collection and emerging public-private data flows in a Nordic context. Through an explorative empirical analysis of comment sections, this paper investigates what is discussed by commenters and reflects upon why this case sparked so much interest among citizens in the first place. It therefore contributes to the growing literature of citizens’ voices in data-driven administration and to a wider discussion on how to research public feeling towards datafication. I argue that this presents an interesting case of discomfort voiced by citizens, which demonstrates the contested nature of data practices among citizens – and their ability to regard data as deeply intertwined with power and politics. This case also reminds researchers to pay attention to seemingly benign and small changes in administration beyond artificial intelligence…(More)”

What Autocrats Want From Academics: Servility


Essay by Anna Dumont: “Since Trump’s inauguration, the university community has received a good deal of “messaging” from academic leadership. We’ve received emails from our deans and university presidents; we’ve sat in department meetings regarding the “developing situation”; and we’ve seen the occasional official statement or op-ed or comment in the local newspaper. And the unfortunate takeaway from all this is that our leaders’ strategy rests on a disturbing and arbitrary distinction. The public-facing language of the university — mission statements, programming, administrative structures, and so on — has nothing at all to do with the autonomy of our teaching and research, which, they assure us, they hold sacrosanct. Recent concessions — say, the disappearance of the website of the Women’s Center — are concerning, they admit, but ultimately inconsequential to our overall working lives as students and scholars.

History, however, shows that public-facing statements are deeply consequential, and one episode from the 20-year march of Italian fascism strikes me as especially instructive. On October 8, 1931, a law went into effect requiring, as a condition of their employment, every Italian university professor to sign an oath pledging their loyalty to the government of Benito Mussolini. Out of over 1,200 professors in the country, only 12 refused.

Today, those who refused are known simply as “I Dodici”: the Twelve. They were a scholar of Middle Eastern languages, an organic chemist, a doctor of forensic medicine, three lawyers, a mathematician, a theologian, a surgeon, a historian of ancient Rome, a philosopher of Kantian ethics, and one art historian. Two, Francesco Ruffini and Edoardo Ruffini Avondo, were father and son. Four were Jewish. All of them were immediately fired…(More)”

Global population data is in crisis – here’s why that matters


Article by Andrew J Tatem and Jessica Espey: “Every day, decisions that affect our lives depend on knowing how many people live where. For example, how many vaccines are needed in a community, where polling stations should be placed for elections or who might be in danger as a hurricane approaches. The answers rely on population data.

But counting people is getting harder.

For centuries, census and household surveys have been the backbone of population knowledge. But we’ve just returned from the UN’s statistical commission meetings in New York, where experts reported that something alarming is happening to population data systems globally.

Census response rates are declining in many countries, resulting in large margins of error. The 2020 US census undercounted America’s Latino population by more than three times the rate of the 2010 census. In Paraguay, the latest census revealed a population one-fifth smaller than previously thought.

South Africa’s 2022 census post-enumeration survey revealed a likely undercount of more than 30%. According to the UN Economic Commission for Africa, undercounts and census delays due to COVID-19, conflict or financial limitations have resulted in an estimated one in three Africans not being counted in the 2020 census round.

When people vanish from data, they vanish from policy. When certain groups are systematically undercounted – often minorities, rural communities or poorer people – they become invisible to policymakers. This translates directly into political underrepresentation and inadequate resource allocation…(More)”.

Trump Admin Plans to Cut Team Responsible for Critical Atomic Measurement Data


Article by Louise Matsakis and Will Knight: “The US National Institute of Standards and Technology (NIST) is discussing plans to eliminate an entire team responsible for publishing and maintaining critical atomic measurement data in the coming weeks, as the Trump administration continues its efforts to reduce the US federal workforce, according to a March 18 email sent to dozens of outside scientists. The data in question underpins advanced scientific research around the world in areas like semiconductor manufacturing and nuclear fusion…(More)”.

The Language Data Space (LDS)


European Commission: “… welcomes launch of the Alliance for Language Technologies European Digital Infrastructure Consortium (ALT-EDIC) and the Language Data Space (LDS).

Aimed at addressing the shortage of European language data needed for training large language models, these projects are set to revolutionise multilingual Artificial Intelligence (AI) systems across the EU.

By offering services in all EU languages, the initiatives are designed to break down language barriers, providing better, more accessible solutions for smaller businesses within the EU. This effort not only aims to preserve the EU’s rich cultural and linguistic heritage in the digital age but also strengthens Europe’s quest for tech sovereignty. Formed in February 2024, the ALT-EDIC includes 17 participating Member States and 9 observer Member States and regions, making it one of the pioneering European Digital Infrastructure Consortia.

The LDS, part of the Common European Data Spaces, is crucial for increasing data availability for AI development in Europe. Developed by the Commission and funded by the DIGITAL programme, this project aims to create a cohesive marketplace for language data. This will enhance the collection and sharing of multilingual data to support European large language models. Initially accessible to selected institutions and companies, the project aims to eventually involve all European public and private stakeholders.

Find more information about the Alliance for Language Technologies European Digital Infrastructure Consortium (ALT-EDIC) and the Language Data Space (LDS)…(More)”

Panels giving scientific advice to Census Bureau disbanded by Trump administration


Article by Jeffrey Mervis: “…U.S. Secretary of Commerce Howard Lutnick has disbanded five outside panels that provide scientific and community advice to the U.S. Census Bureau and other federal statistical agencies just as preparations are ramping up for the country’s next decennial census, in 2030.

The dozens of demographers, statisticians, and public members on the five panels received nearly identical letters this week telling them that “the Secretary of Commerce has determined that the purposes for which the [committee] was established have been fulfilled, and the committee has been terminated effective February 28, 2025. Thank you for your service.”

Statistician Robert Santos, who last month resigned as Census Bureau director 3 years into his 5-year term, says he’s “terribly disappointed but not surprised” by the move, noting how a recent directive by President Donald Trump on gender identity has disrupted data collection for a host of federal surveys…(More)”.

New AI Collaboratives to take action on wildfires and food insecurity


Google: “…last September we introduced AI Collaboratives, a new funding approach designed to unite public, private and nonprofit organizations, and researchers, to create AI-powered solutions to help people around the world.

Today, we’re sharing more about our first two focus areas for AI Collaboratives: Wildfires and Food Security.

Wildfires are a global crisis, claiming more than 300,000 lives due to smoke exposure annually and causing billions of dollars in economic damage. …Google.org has convened more than 15 organizations, including Earth Fire Alliance and Moore Foundation, to help in this important effort. By coordinating funding and integrating cutting-edge science, emerging technology and on-the-ground applications, we can provide collaborators with the tools they need to identify and track wildfires in near real time; quantify wildfire risk; shift more acreage to beneficial fires; and ultimately reduce the damage caused by catastrophic wildfires.

Nearly one-third of the world’s population faces moderate or severe food insecurity due to extreme weather, conflict and economic shocks. The AI Collaborative: Food Security will strengthen the resilience of global food systems and improve food security for the world’s most vulnerable populations through AI technologies, collaborative research, data-sharing and coordinated action. To date, 10 organizations have joined us in this effort, and we’ll share more updates soon…(More)”.

Bridging Digital Divides: How PescaData is Connecting Small-Scale Fishing Cooperatives to the Blue Economy


Article by Stuart Fulton: “In this research project, we examine how digital platforms – specifically PescaData – can be leveraged to connect small-scale fishing cooperatives with impact investors and donors, creating new pathways for sustainable blue economy financing, while simultaneously ensuring fair data practices that respect data sovereignty and traditional ecological knowledge.

PescaData emerged as a pioneering digital platform that enables fishing communities to collect more accurate data to ensure sustainable fisheries. Since then, PescaData has evolved to provide software as a service to fishing cooperatives and to allow fishers to document their solutions to environmental and economic challenges. Since 2022, small-scale fishers have used it to document nearly 300 initiatives that contribute to multiple Sustainable Development Goals. 

Respecting Data Sovereignty in the Digital Age

One critical aspect of our research acknowledges the unique challenges of implementing digital tools in traditional cooperative settings. Unlike conventional tech implementations that often extract value from communities, PescaData’s approach centers on data sovereignty – the principle that fishing communities should maintain ownership and control over their data. As the PescaData case study demonstrates, a humanity-centric rather than merely user-centric approach is essential. This means designing with compassion and establishing clear governance around data from the very beginning. The data generated by fishing cooperatives represents not just information, but traditional knowledge accumulated over generations of resource management.

The fishers themselves have articulated clear principles for data governance in a cooperative model:

  • Ownership: Fishers, as data producers, decide who has access and under what conditions.
  • Transparency: Clear agreements on data use.
  • Knowledge assessment: Highlighting fishers’ contributions and placing them in decision-making positions.
  • Co-design: Ensuring the platform meets their specific needs.
  • Security: Protecting collected data…(More)”.

Government data is disappearing before our eyes


Article by Anna Massoglia: “A battle is being waged in the quiet corners of government websites and data repositories. Essential public records are disappearing and, with them, Americans’ ability to hold those in power accountable.

Take the Department of Government Efficiency, Elon Musk’s federal cost-cutting initiative. Touted as “maximally transparent,” DOGE is supposed to make government spending more efficient. But when journalists and researchers exposed major errors — from double-counting contracts to conflating caps with actual spending — DOGE didn’t fix the mistakes. Instead, it made them harder to detect.

Many Americans hoped DOGE’s work would be a step toward cutting costs and restoring trust in government. But trust must be earned. If our leaders truly want to restore faith in our institutions, they must ensure that facts remain available to everyone, not just when convenient.

Since Jan. 20, public records across the federal government have been erased. Economic indicators that guide investments, scientific datasets that drive medical breakthroughs, federal health guidelines and historical archives that inform policy decisions have all been put on the chopping block. Some missing datasets have been restored but are incomplete or have unexplained changes, rendering them unreliable.

Both Republican and Democratic administrations have played a role in limiting public access to government records. But the scale and speed of the Trump administration’s data manipulation — combined with buyouts, resignations and other restructuring across federal agencies — signal a new phase in the war on public information. This is not just about deleting files, it’s about controlling what the public sees, shaping the narrative and limiting accountability.

The Trump administration is accelerating this trend with revisions to official records. Unelected advisors are overseeing a sweeping reorganization of federal data, granting entities like DOGE unprecedented access to taxpayer records with little oversight. This is not just a bureaucratic reshuffle — it is a fundamental reshaping of the public record.

The consequences of data manipulation extend far beyond politics. When those in power control the flow of information, they can dictate collective truth. Governments that manipulate information are not just rewriting statistics — they are rewriting history.

From authoritarian regimes that have erased dissent to leaders who have fabricated economic numbers to maintain their grip on power, the dangers of suppressing and distorting data are well-documented.

Misleading or inconsistent data can be just as dangerous as opacity. When hard facts are replaced with political spin, conspiracy theories take root and misinformation fills the void.

The fact that data suppression and manipulation has occurred before does not lessen the danger, but underscores the urgency of taking proactive measures to safeguard transparency. A missing statistic today can become a missing historical fact tomorrow. Over time, that can reshape our reality…(More)”.