DOGE comes for the data wonks


The Economist: “For nearly three decades the federal government has painstakingly surveyed tens of thousands of Americans each year about their health. Door-knockers collect data on the financial toll of chronic conditions like obesity and asthma, and probe the exact doses of medications sufferers take. The result, known as the Medical Expenditure Panel Survey (MEPS), is the single most comprehensive, nationally representative portrait of American health care, a balkanised and unwieldy $5trn industry that accounts for some 17% of GDP.

MEPS is part of a largely hidden infrastructure of government statistics collection now in the crosshairs of the Department of Government Efficiency (DOGE). In mid-March officials at a unit of the Department of Health and Human Services (HHS) that runs the survey told employees that DOGE had slated them for an 80-90% reduction in staff and that this would “not be a negotiation”. Since then scores of researchers have taken voluntary buyouts. Those left behind worry about the integrity of MEPS. “Very unclear whether or how we can put on MEPS” with roughly half of the staff leaving, one said. On March 27th, the health secretary, Robert F. Kennedy junior, announced an overall reduction of 10,000 personnel at the department, in addition to those who took buyouts.

There are scores of underpublicised government surveys like MEPS that document trends in everything from house prices to the amount of lead in people’s blood. Many provide standard-setting datasets and insights into the world’s largest economy that the private sector has no incentive to replicate.

Even so, America’s system of statistics research is overly analogue and needs modernising. “Using surveys as the main source of information is just not working” because it is too slow and suffers from declining rates of participation, says Julia Lane, an economist at New York University. In a world where the economy shifts by the day, the lags in traditional surveys—whose results can take weeks or even years to refine and publish—are unsatisfactory. One practical reform DOGE might encourage is better integration of administrative data, such as tax records and social-security filings, which often capture the entire population and are collected as a matter of course.

As in so many other areas, however, DOGE’s sledgehammer is more likely to cause harm than to achieve improvements. And for all its clunkiness, America’s current system manages a spectacular feat. From Inuit in remote corners of Alaska to Spanish-speakers in the Bronx, it measures the country and its inhabitants remarkably well, given that the population is highly diverse and spread out over 4m square miles. Each month surveys from the federal government reach about 1.5m people, a number roughly equivalent to the population of Hawaii or West Virginia…(More)”.

Public Governance and Emerging Technologies


Book edited by Jurgen Goossens, Esther Keymolen, and Antonia Stanojević: “This open access book focuses on public governance’s increasing reliance on emerging digital technologies. ‘Disruptive’ or ‘emerging’ digital technologies, such as artificial intelligence and blockchain, are often portrayed as highly promising, with the potential to transform established societal, economic, or governmental practices. Unsurprisingly, public actors are therefore increasingly experimenting with the application of these emerging digital technologies in public governance.

The first part of the book shows how automation via algorithmic systems, the networked nature of distributed technologies such as blockchain, and the data-driven use of AI in public governance can promote hyper-connectivity and hyper-complexity. This trend and the associated concerns have drawn societal, political, and scholarly attention to regulatory compliance in view of the current and potential future uses of emerging technologies. Accordingly, the second part of the book focuses on regulatory compliance and regulatory solutions. It explores the compatibility of technology with existing regulations, existing legal tools that could be innovatively applied to the successful regulation of emerging technologies, and approaches to updating existing legislation or creating new legislation for the regulation of emerging technologies. While socio-ethical considerations on upholding public values in a digital world are at the heart of all chapters, the third part specifically focuses on public values and trust. It advances a conceptual, normative discussion, putting the spotlight on trust and other fundamental public values that should be safeguarded…(More)”

How governments can move beyond bureaucracy


Interview with Jorrit de Jong: “…Bureaucracy is not so much a system of rules as a system of values. It is an organizational form that governs how work gets done in accordance with principles that the sociologist Max Weber first codified: standardization, formalization, expert officialdom, specialization, hierarchy, and accountability. Add those up and you arrive at a system that values the written word; that is siloed, because that’s what specialization does; and that can sometimes be slow, because there is a chain of command and an approval process. Standardization supports the value that it doesn’t matter who you are, who you know, what you look like when you’re applying for a permit, or who is issuing the permit: the case will be evaluated on its merits. That is a good thing. Bureaucracy is a way to do business in a rational, impersonal, responsible and efficient way, at least in theory.

It becomes a problem when organizations start to violate their own values and lose connection with their purpose. If standardization turns into rigidity, doing justice to extenuating individual circumstances becomes hard. If formalization becomes pointless paper pushing, it defeats the purpose. And if accountability structures favor risk aversion over taking initiative, organizations can’t innovate.

Bureaucratic dysfunction occurs when the system that we’ve created ceases to produce the value that we wanted out of it. But that does not mean we have to throw the baby out with the bathwater. Can we create organizations that have the benefits of accountability, standardization, and specialization without the burdens of slowness, rigidity, and silos? My answer is yes. Research we did with the Bloomberg Harvard City Leadership Initiative shows how organizations can improve performance by building capabilities that make them more nimble, responsive, and user-friendly. Cities that leverage data to better understand the communities they serve and measure performance learn and improve faster. Cities that use design thinking to reinvent resident services save time and money. And cities that collaborate across organizational and sector boundaries come up with more effective solutions to urban problems…(More)”

Researching data discomfort: The case of Statistics Norway’s quest for billing data


Paper by Lisa Reutter: “National statistics offices are increasingly exploring the possibilities of utilizing new data sources to position themselves in emerging data markets. In 2022, Statistics Norway announced that the national agency would require the biggest grocers in Norway to hand over all collected billing data to produce consumer-behavior statistics that had previously been produced by other sampling methods. An online article discussing this proposal sparked a surprisingly (at least to Statistics Norway) high level of interest among readers, many of whom expressed concerns about this intended change in data practice. This paper focuses on the multifaceted online discussions of the proposal, as these enable us to study citizens’ reactions and feelings towards increased data collection and emerging public-private data flows in a Nordic context. Through an explorative empirical analysis of comment sections, this paper investigates what is discussed by commenters and reflects upon why this case sparked so much interest among citizens in the first place. It thereby contributes to the growing literature on citizens’ voices in data-driven administration and to a wider discussion on how to research public feeling towards datafication. I argue that this presents an interesting case of discomfort voiced by citizens, which demonstrates the contested nature of data practices among citizens, and their ability to regard data as deeply intertwined with power and politics. This case also reminds researchers to pay attention to seemingly benign and small changes in administration beyond artificial intelligence…(More)”

Oxford Intersections: AI in Society


Series edited by Philipp Hacker: “…provides an interdisciplinary corpus for understanding artificial intelligence (AI) as a global phenomenon that transcends geographical and disciplinary boundaries. Edited by a consortium of experts hailing from diverse academic traditions and regions, the 11 edited and curated sections provide a holistic view of AI’s societal impact. Critically, the work goes beyond the often Eurocentric or U.S.-centric perspectives that dominate the discourse, offering nuanced analyses that encompass the implications of AI for a range of regions of the world. Taken together, the sections of this work seek to move beyond the state of the art in three specific respects. First, they venture decisively beyond existing research efforts to develop a comprehensive account and framework for the rapidly growing importance of AI in virtually all sectors of society. Going beyond a mere mapping exercise, the curated sections assess opportunities, critically discuss risks, and offer solutions to the manifold challenges AI harbors in various societal contexts, from individual labor to global business, law and governance, and interpersonal relationships. Second, the work tackles specific societal and regulatory challenges triggered by the advent of AI and, more specifically, large generative AI models and foundation models, such as ChatGPT or GPT-4, which have so far received limited attention in the literature, particularly in monographs or edited volumes. Third, the novelty of the project is underscored by its decidedly interdisciplinary perspective: each section, whether covering Conflict; Culture, Art, and Knowledge Work; Relationships; or Personhood—among others—will draw on various strands of knowledge and research, crossing disciplinary boundaries and uniting perspectives most appropriate for the context at hand…(More)”.

Legal frictions for data openness


Paper by Ramya Chandrasekhar: “…investigates the legal entanglements of re-use when data and content from the open web are used to train foundation AI models. Based on conversations with AI researchers and practitioners, an online workshop, and a legal analysis of a repository of 41 legal disputes relating to copyright and data protection, this report highlights tensions between legal imaginations of data flows and the computational processes involved in training foundation models.

To realise the promise of the open web as open for all, this report argues that efforts oriented solely towards the techno-legal openness of training datasets are not enough. Techno-legal openness of datasets facilitates easy re-use of data. But certain well-resourced actors like Big Tech are able to take advantage of data flows on the open web to train proprietary foundation models, while giving little to no value back to either the maintenance of shared informational resources or communities of commoners. At the same time, open licenses no longer accommodate changing community preferences for the sharing and re-use of data and content.
In addition to the techno-legal openness of training datasets, there is a need for certain limits on the extractive power of well-resourced actors like Big Tech, combined with increased recognition of community data sovereignty. Alternative licensing frameworks, such as the Nwulite Obodo License, the Kaitiakitanga Licenses, the Montreal License, the OpenRAIL Licenses, the Open Data Commons License, and the AI2Impact Licenses, hold valuable insights in this regard. While these licensing frameworks impose more obligations on re-users and necessitate more collective thinking on interoperability, they are nonetheless necessary for the creation of healthy digital and data commons, to realise the original promise of the open web as open for all…(More)”.

Robotics for Global Development


Report by the Frontier Tech Hub: “Robotics could enable progress on 46% of SDG targets, yet this potential remains largely untapped in low- and middle-income countries.

While technological developments and new-found applications of artificial intelligence (AI) continue to capture significant attention and investment, using robotics to advance the Sustainable Development Goals (SDGs) is consistently overlooked. This is especially true when the focus moves from aerial robotics (drones) to robotic arms, ground robotics, and aquatic robotics. How might these types of robots accelerate global development in the least developed countries?

We aim to answer this question and inform the UK Foreign, Commonwealth & Development Office’s (FCDO) investment and policy towards robotics in the least developed countries (LDCs). In this emergent space, the UK FCDO has a unique opportunity to position itself as a global leader in leveraging robotics technology to accelerate sustainable development outcomes…(More)”.

Towards a set of universal data principles


Paper by Steve MacFeely, Angela Me, Friederike Schueuer, Joseph Costanzo, David Passarelli, Malarvizhi Veerappan, and Stefaan Verhulst: “Humanity collects, processes, shares, uses, and reuses a staggering volume of data. These data are the lifeblood of the digital economy; they feed algorithms and artificial intelligence, inform logistics, and shape markets, communication, and politics. Data do not just yield economic benefits; they can also have individual and societal benefits and impacts. Being able to access, process, use, and reuse data is essential for dealing with global challenges, such as managing and protecting the environment, intervening in the event of a pandemic, or responding to a disaster or crisis. While we have made great strides, we have yet to realize the full potential of data, in particular, the potential of data to serve the public good. This will require international cooperation and a globally coordinated approach. Many data governance issues cannot be fully resolved at national level. This paper presents a proposal for a preliminary set of data goals and principles. These goals and principles are envisaged as the normative foundations for an international data governance framework – one that is grounded in human rights and sustainable development. A principles-based approach to data governance helps create common values, and in doing so, helps to change behaviours, mindsets and practices. It can also help create a foundation for the safe use of all types of data and data transactions. The purpose of this paper is to present the preliminary principles to solicit reaction and feedback…(More)”.

Differential Privacy


Open access book by Simson L. Garfinkel: “Differential privacy (DP) is an increasingly popular, though controversial, approach to protecting personal data. DP protects confidential data by introducing carefully calibrated random numbers, called statistical noise, when the data is used. Google, Apple, and Microsoft have all integrated the technology into their software, and the US Census Bureau used DP to protect data collected in the 2020 census. In this book, Simson Garfinkel presents the underlying ideas of DP, and helps explain why DP is needed in today’s information-rich environment, why it was used as the privacy protection mechanism for the 2020 census, and why it is so controversial in some communities.

When DP is used to protect confidential data, like an advertising profile based on the web pages you have viewed with a web browser, the noise makes it impossible for someone to take that profile and reverse-engineer, with absolute certainty, the underlying confidential data on which the profile was computed. The book also chronicles the history of DP, describing its key participants and its limitations. Along the way, it presents a short history of the US Census and other approaches to data protection, such as de-identification and k-anonymity…(More)”.
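The noise-addition idea the book describes can be made concrete with a minimal sketch of the Laplace mechanism, the canonical DP technique for numeric queries. This is an illustration, not code from the book: the function name, the toy data, and the epsilon values are all hypothetical, and the sketch assumes a simple counting query, whose sensitivity is 1.

```python
import numpy as np

def dp_count(records, predicate, epsilon):
    """Return a differentially private count of records matching predicate.

    A counting query has sensitivity 1 (adding or removing one person
    changes the count by at most 1), so Laplace noise with scale
    1/epsilon suffices for epsilon-differential privacy.
    """
    true_count = sum(1 for r in records if predicate(r))
    noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

# Hypothetical example: privately count survey respondents older than 65.
ages = [23, 71, 45, 67, 34, 80, 55]
for eps in (0.1, 1.0, 10.0):  # smaller epsilon => more noise, stronger privacy
    print(eps, dp_count(ages, lambda a: a > 65, epsilon=eps))
```

The parameter epsilon calibrates the trade-off: smaller values add more noise and give stronger protection, which is why real deployments such as the 2020 census involved careful decisions about how to spend a limited privacy budget across the many statistics being published.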

Which Data Do Economists Use to Study Corruption?


World Bank paper: “…examines the data sources and methodologies used in economic research on corruption by analyzing 339 journal articles published in 2022 that include Journal of Economic Literature codes. The paper identifies the most commonly used data types, sources, and geographical foci, as well as whether studies primarily investigate the causes or consequences of corruption. Cross-country composite indicators remain the dominant measure, while single-country studies more frequently utilize administrative data. Articles in ranked journals are more likely to employ administrative and experimental data and to focus on the causes of corruption. The broader dataset of 882 articles highlights the significant academic interest in corruption across disciplines, particularly in political science and public policy. The findings raise concerns about the limited use of novel data sources and the relative neglect of research on the causes of corruption, underscoring the need for a more integrated approach within the field of economics…(More)”.