Paper by Jacy Reese Anthis et al: “Accurate and verifiable large language model (LLM) simulations of human research subjects promise an accessible data source for understanding human behavior and training new AI systems. However, results to date have been limited, and few social scientists have adopted these methods. In this position paper, we argue that the promise of LLM social simulations can be achieved by addressing five tractable challenges. We ground our argument in a literature survey of empirical comparisons between LLMs and human research subjects, commentaries on the topic, and related work. We identify promising directions with prompting, fine-tuning, and complementary methods. We believe that LLM social simulations can already be used for exploratory research, such as pilot experiments for psychology, economics, sociology, and marketing. More widespread use may soon be possible with rapidly advancing LLM capabilities, and researchers should prioritize developing conceptual models and evaluations that can be iteratively deployed and refined at pace with ongoing AI advances…(More)”.
Need a Side Gig? In China, Just Shake Your Phone
Article by Chen Yiru: “From a restaurant shift to a quick plumbing job, gig work in China is now just a phone shake away.
That’s the idea behind Tencent’s new “Nearby Jobs” feature, which was quietly rolled out nationwide on its messaging super app WeChat last week. Aimed at flexible job seekers, the tool connects users to verified listings in fields like driving, design, tech support, and catering — all within the country’s most-used app.
First piloted in Jiangmen, a city in the southern Guangdong province, the mini-program has expanded to more than 200 cities including Beijing, Shanghai, and Shenzhen. Tencent says it has already helped over 24,000 people secure short-term work, with filters that let users sort listings by pay, distance, payment schedule, and even gender preferences.
The “Nearby Jobs” tool borrows from WeChat’s classic “Shake” feature, first introduced in 2012 to connect nearby users by physically shaking their phones. While the original version was discontinued for mainland users in early 2024 due to privacy concerns, traces of the function have recently resurfaced in limited testing — hinting at a possible revival.
The launch comes amid rising demand for platforms that can bridge the gap between gig employers and job seekers. China is home to an estimated 200 million flexible workers, and market demand for blue-collar labor has surged 380% over the past five years, according to a 2024 industry report. Younger workers are driving much of this growth, with job applicants under 25 rising by 165% during the same period…(More)”.
Massive, Unarchivable Datasets of Cancer, Covid, and Alzheimer’s Research Could Be Lost Forever
Article by Sam Cole: “Almost two dozen repositories of research and public health data supported by the National Institutes of Health are marked for “review” under the Trump administration’s direction, and researchers and archivists say the data is at risk of being lost forever if the repositories go down.
“The problem with archiving this data is that we can’t,” Lisa Chinn, Head of Research Data Services at the University of Chicago, told 404 Media. Unlike other government datasets or web pages, downloading or otherwise archiving NIH data often requires a Data Use Agreement between a researcher institution and the agency, and those agreements are carefully administered through a disclosure risk review process.
A message appeared at the top of multiple NIH websites last week that says: “This repository is under review for potential modification in compliance with Administration directives.”
Repositories with the message include archives of cancer imagery, Alzheimer’s disease research, sleep studies, HIV databases, and COVID-19 vaccination and mortality data…
“So far, it seems like what is happening is less that these data sets are actively being deleted or clawed back and more that they are laying off the workers whose job is to maintain them, update them and maintain the infrastructure that supports them,” a librarian affiliated with the Data Rescue Project told 404 Media. “In time, this will have the same effect, but it’s really hard to predict. People don’t usually appreciate, much less our current administration, how much labor goes into maintaining a large research dataset.”…(More)”.
Beyond data egoism: let’s embrace data altruism
Blog by Frank Hamerlinck: “When it comes to data sharing, there’s often a gap between ambition and reality. Many organizations recognize the potential of data collaboration, yet when it comes down to sharing their own data, hesitation kicks in. The concern? Costs, risks, and unclear returns. At the same time, there’s strong enthusiasm for accessing data.
This is the paradox we need to break. Because if data egoism rules, real innovation is out of reach, making the need for data altruism more urgent than ever.
…More and more leaders recognize that unlocking data is essential to staying competitive on a global scale, and they understand that we must do so while upholding our European values. However, the real challenge lies in translating this growing willingness into concrete action. Many acknowledge its importance in principle, but few are ready to take the first step. And that’s a challenge we need to address – not just as organizations but as a society…
To break down barriers and accelerate data-driven innovation, we’re launching the FTI Data Catalog – a step toward making data sharing easier, more transparent, and more impactful.
The catalog provides a structured, accessible overview of available datasets, from location data and financial data to well-being data. It allows organizations to discover, understand, and responsibly leverage data with ease. Whether you’re looking for insights to fuel innovation, enhance decision-making, drive new partnerships or unlock new value from your own data, the catalog is built to support open and secure data exchange.
Feeling curious? Explore the catalog
By making data more accessible, we’re laying the foundation for a culture of collaboration. The road to data altruism is long, but it’s one worth walking. The future belongs to those who dare to share!…(More)”
The Measure of Progress: Counting What Really Matters
Book by Diane Coyle: “The ways that statisticians and governments measure the economy were developed in the 1940s, when the urgent economic problems were entirely different from those of today. In The Measure of Progress, Diane Coyle argues that the framework underpinning today’s economic statistics is so outdated that it functions as a distorting lens, or even a set of blinkers. When policymakers rely on such an antiquated conceptual tool, how can they measure, understand, and respond with any precision to what is happening in today’s digital economy? Coyle makes the case for a new framework, one that takes into consideration current economic realities.
Coyle explains why economic statistics matter. They are essential for guiding better economic policies; they involve questions of freedom, justice, life, and death. Governments use statistics that affect people’s lives in ways large and small. The metrics for economic growth were developed when a lack of physical rather than natural capital was the binding constraint on growth, intangible value was less important, and the pressing economic policy challenge was managing demand rather than supply. Today’s challenges are different. Growth in living standards in rich economies has slowed, despite remarkable innovation, particularly in digital technologies. As a result, politics is contentious and democracy strained.
Coyle argues that to understand the current economy, we need different data collected in a different framework of categories and definitions, and she offers some suggestions about what this would entail. Only with a new approach to measurement will we be able to achieve the right kind of growth for the benefit of all…(More)”.
DOGE comes for the data wonks
The Economist: “For nearly three decades the federal government has painstakingly surveyed tens of thousands of Americans each year about their health. Door-knockers collect data on the financial toll of chronic conditions like obesity and asthma, and probe the exact doses of medications sufferers take. The result, known as the Medical Expenditure Panel Survey (MEPS), is the single most comprehensive, nationally representative portrait of American health care, a balkanised and unwieldy $5trn industry that accounts for some 17% of GDP.
MEPS is part of a largely hidden infrastructure of government statistics collection now in the crosshairs of the Department of Government Efficiency (DOGE). In mid-March officials at a unit of the Department of Health and Human Services (HHS) that runs the survey told employees that DOGE had slated them for an 80-90% reduction in staff and that this would “not be a negotiation”. Since then scores of researchers have taken voluntary buyouts. Those left behind worry about the integrity of MEPS. “Very unclear whether or how we can put on MEPS” with roughly half of the staff leaving, one said. On March 27th, the health secretary, Robert F. Kennedy junior, announced an overall reduction of 10,000 personnel at the department, in addition to those who took buyouts.
There are scores of underpublicised government surveys like MEPS that document trends in everything from house prices to the amount of lead in people’s blood. Many provide standard-setting datasets and insights into the world’s largest economy that the private sector has no incentive to replicate.
Even so, America’s system of statistics research is overly analogue and needs modernising. “Using surveys as the main source of information is just not working” because it is too slow and suffers from declining rates of participation, says Julia Lane, an economist at New York University. In a world where the economy shifts by the day, the lags in traditional surveys—whose results can take weeks or even years to refine and publish—are unsatisfactory. One practical reform DOGE might encourage is better integration of administrative data such as tax records and social-security filings which often capture the entire population and are collected as a matter of course.
As in so many other areas, however, DOGE’s sledgehammer is more likely to cause harm than to achieve improvements. And for all its clunkiness, America’s current system manages a spectacular feat. From Inuits in remote corners of Alaska to Spanish-speakers in the Bronx, it measures the country and its inhabitants remarkably well, given that the population is highly diverse and spread out over 4m square miles. Each month surveys from the federal government reach about 1.5m people, a number roughly equivalent to the population of Hawaii or West Virginia…(More)”.
Researching data discomfort: The case of Statistics Norway’s quest for billing data
Paper by Lisa Reutter: “National statistics offices are increasingly exploring the possibilities of utilizing new data sources to position themselves in emerging data markets. In 2022, Statistics Norway announced that the national agency will require the biggest grocers in Norway to hand over all collected billing data to produce consumer behavior statistics which had previously been produced by other sampling methods. An online article discussing this proposal sparked a surprisingly (at least to Statistics Norway) high level of interest among readers, many of whom expressed concerns about this intended change in data practice. This paper focuses on the multifaceted online discussions of the proposal, as these enable us to study citizens’ reactions and feelings towards increased data collection and emerging public-private data flows in a Nordic context. Through an explorative empirical analysis of comment sections, this paper investigates what is discussed by commenters and reflects upon why this case sparked so much interest among citizens in the first place. It therefore contributes to the growing literature of citizens’ voices in data-driven administration and to a wider discussion on how to research public feeling towards datafication. I argue that this presents an interesting case of discomfort voiced by citizens, which demonstrates the contested nature of data practices among citizens–and their ability to regard data as deeply intertwined with power and politics. This case also reminds researchers to pay attention to seemingly benign and small changes in administration beyond artificial intelligence…(More)”
What Autocrats Want From Academics: Servility
Essay by Anna Dumont: “Since Trump’s inauguration, the university community has received a good deal of “messaging” from academic leadership. We’ve received emails from our deans and university presidents; we’ve sat in department meetings regarding the “developing situation”; and we’ve seen the occasional official statement or op-ed or comment in the local newspaper. And the unfortunate takeaway from all this is that our leaders’ strategy rests on a disturbing and arbitrary distinction. The public-facing language of the university — mission statements, programming, administrative structures, and so on — has nothing at all to do with the autonomy of our teaching and research, which, they assure us, they hold sacrosanct. Recent concessions — say, the disappearance of the website of the Women’s Center — are concerning, they admit, but ultimately inconsequential to our overall working lives as students and scholars.
History, however, shows that public-facing statements are deeply consequential, and one episode from the 20-year march of Italian fascism strikes me as especially instructive. On October 8, 1931, a law went into effect requiring, as a condition of their employment, every Italian university professor to sign an oath pledging their loyalty to the government of Benito Mussolini. Out of over 1,200 professors in the country, only 12 refused.
Today, those who refused are known simply as “I Dodici”: the Twelve. They were a scholar of Middle Eastern languages, an organic chemist, a doctor of forensic medicine, three lawyers, a mathematician, a theologian, a surgeon, a historian of ancient Rome, a philosopher of Kantian ethics, and one art historian. Two, Francesco Ruffini and Edoardo Ruffini Avondo, were father and son. Four were Jewish. All of them were immediately fired…(More)”
Global population data is in crisis – here’s why that matters
Article by Andrew J Tatem and Jessica Espey: “Every day, decisions that affect our lives depend on knowing how many people live where. For example, how many vaccines are needed in a community, where polling stations should be placed for elections or who might be in danger as a hurricane approaches. The answers rely on population data.
But counting people is getting harder.
For centuries, census and household surveys have been the backbone of population knowledge. But we’ve just returned from the UN’s statistical commission meetings in New York, where experts reported that something alarming is happening to population data systems globally.
Census response rates are declining in many countries, resulting in large margins of error. The 2020 US census undercounted America’s Latino population by more than three times the rate of the 2010 census. In Paraguay, the latest census revealed a population one-fifth smaller than previously thought.
South Africa’s 2022 census post-enumeration survey revealed a likely undercount of more than 30%. According to the UN Economic Commission for Africa, undercounts and census delays due to COVID-19, conflict or financial limitations have resulted in an estimated one in three Africans not being counted in the 2020 census round.
When people vanish from data, they vanish from policy. When certain groups are systematically undercounted – often minorities, rural communities or poorer people – they become invisible to policymakers. This translates directly into political underrepresentation and inadequate resource allocation…(More)”.
Trump Admin Plans to Cut Team Responsible for Critical Atomic Measurement Data
Article by Louise Matsakis and Will Knight: “The US National Institute of Standards and Technology (NIST) is discussing plans to eliminate an entire team responsible for publishing and maintaining critical atomic measurement data in the coming weeks, as the Trump administration continues its efforts to reduce the US federal workforce, according to a March 18 email sent to dozens of outside scientists. The data in question underpins advanced scientific research around the world in areas like semiconductor manufacturing and nuclear fusion…(More)”.