Paper by Ajinkya Kulkarni et al.: “Children are one of the most under-represented groups in speech technologies, as well as one of the most vulnerable in terms of privacy. Despite this, anonymization techniques targeting this population have received little attention. In this study, we seek to bridge this gap, and establish a baseline for the use of voice anonymization techniques designed for adult speech when applied to children’s voices. Such an evaluation is essential, as children’s speech presents a distinct set of challenges when compared to that of adults. This study comprises three children’s datasets, six anonymization methods, and objective and subjective utility metrics for evaluation. Our results show that existing systems for adults are still able to protect children’s voice privacy, but suffer from much higher utility degradation. In addition, our subjective study displays the challenges of automatic evaluation methods for speech quality in children’s speech, highlighting the need for further research…(More)”. See also: Responsible Data for Children.
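In voice-anonymization studies of this kind, privacy is commonly scored with the equal error rate (EER) of a speaker-verification attacker: the higher the EER after anonymization, the harder it is to re-identify the speaker. A minimal sketch of the metric, with toy similarity scores rather than the paper's actual systems or data:

```python
# Illustrative EER computation (toy scores, not the paper's pipeline):
# the equal error rate is the operating point where the attacker's
# false-accept and false-reject rates coincide.
import random

def equal_error_rate(target_scores, impostor_scores):
    """EER: threshold where false-accept rate ~= false-reject rate."""
    thresholds = sorted(target_scores + impostor_scores)
    best_gap, eer = float("inf"), 0.0
    for t in thresholds:
        # impostors wrongly accepted at this threshold
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        # genuine speakers wrongly rejected at this threshold
        frr = sum(s < t for s in target_scores) / len(target_scores)
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer

random.seed(0)
targets = [random.gauss(1.0, 0.5) for _ in range(1000)]    # same-speaker trials
impostors = [random.gauss(0.0, 0.5) for _ in range(1000)]  # different-speaker trials
print(f"EER: {equal_error_rate(targets, impostors):.3f}")
```

Effective anonymization pushes the target and impostor score distributions together, driving the EER toward 0.5 (chance level) for the attacker.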
Surveillance pricing: How your data determines what you pay
Article by Douglas Crawford: “Surveillance pricing, also known as personalized or algorithmic pricing, is a practice where companies use your personal data, such as your location, the device you’re using, your browsing history, and even your income, to determine what price to show you. It’s not just about supply and demand — it’s about you as a consumer and how much the system thinks you’re able (or willing) to pay.
Have you ever shopped online for a flight, only to find that the price mysteriously increased the second time you checked? Or have you and a friend searched for the same hotel room on your phones, only to find your friend sees a lower price? This isn’t a glitch — it’s surveillance pricing at work.
In the United States, surveillance pricing is becoming increasingly prevalent across various industries, including airlines, hotels, and e-commerce platforms. The practice exists elsewhere too, but in other parts of the world, such as the European Union, growing recognition of the danger this pricing model poses to citizens’ privacy has resulted in stricter data protection laws aimed at curbing it. The US appears to be moving in the opposite direction…(More)”.
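The mechanism the article describes — a base price nudged by signals the site can observe about you — can be sketched as a toy rule-based quote. The signals and weights below are invented for illustration and do not reflect any real company's model:

```python
# Hypothetical sketch of surveillance pricing: a base price adjusted by
# observable signals. Signals and multipliers are made up for illustration.
BASE_PRICE = 200.0

MARKUPS = {
    "device:iphone": 1.08,            # pricier device -> assumed bigger budget
    "repeat_search": 1.05,            # urgency signal: user checked twice
    "location:high_income_zip": 1.06,
    "incognito": 1.00,                # fewer signals -> default price
}

def quoted_price(signals):
    price = BASE_PRICE
    for s in signals:
        price *= MARKUPS.get(s, 1.0)
    return round(price, 2)

# Same room, two shoppers, two quotes:
print(quoted_price(["device:iphone", "repeat_search"]))
print(quoted_price(["incognito"]))
```

This is why two people searching for the same hotel room can see different prices: the quote is a function of the shopper's observed profile, not just the product.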
Reliable data facilitates better policy implementation
Article by Ganesh Rao and Parul Agarwal: “Across India, state government departments are at the forefront of improving human capabilities through education, health, and nutrition programmes. Their ability to do so effectively depends on administrative (or admin) data collected and maintained by their staff. This data is collected as part of regular operations and informs both day-to-day decision-making and long-term policy. While policymaking can draw on (reasonably reliable) sample surveys alone, effective implementation of schemes and services requires accurate individual-level admin data. However, unreliable admin data can be a severe constraint, forcing bureaucrats to rely on intuition, experience, and informed guesses. Improving the reliability of admin data can greatly enhance state capacity, thereby improving governance and citizen outcomes.
There has been some progress on this front in recent years. For instance, the Jan Dhan-Aadhaar-Mobile (JAM) trinity has significantly improved direct benefit transfer (DBT) mechanisms by ensuring that certain recipient data is reliable. However, challenges remain in accurately capturing the well-being of targeted citizens. Despite significant investments in the digitisation of data collection and management systems, persistent reliability issues undermine the government’s efforts to build a data-driven decision-making culture…
There is growing evidence of serious quality issues in admin data. At CEGIS, we have conducted extensive analyses of admin data across multiple states, uncovering systemic issues in key indicators across sectors and platforms. These quality issues compound over time, undermining both micro-level service delivery and macro-level policy planning. This results in distorted budget allocations, gaps in service provision, and weakened frontline accountability…(More)”.
Trump Taps Palantir to Compile Data on Americans
Article by Sheera Frenkel and Aaron Krolik: “In March, President Trump signed an executive order calling for the federal government to share data across agencies, raising questions over whether he might compile a master list of personal information on Americans that could give him untold surveillance power.
Mr. Trump has not publicly talked about the effort since. But behind the scenes, officials have quietly put technological building blocks into place to enable his plan. In particular, they have turned to one company: Palantir, the data analysis and technology firm.
The Trump administration has expanded Palantir’s work across the federal government in recent months. The company has received more than $113 million in federal government spending since Mr. Trump took office, according to public records, including additional funds from existing contracts as well as new contracts with the Department of Homeland Security and the Pentagon. (This does not include a $795 million contract that the Department of Defense awarded the company last week, which has not been spent.)
Representatives of Palantir are also speaking to at least two other agencies — the Social Security Administration and the Internal Revenue Service — about buying its technology, according to six government officials and Palantir employees with knowledge of the discussions.
The push has put a key Palantir product called Foundry into at least four federal agencies, including D.H.S. and the Health and Human Services Department. Widely adopting Foundry, which organizes and analyzes data, paves the way for Mr. Trump to easily merge information from different agencies, the government officials said…
Creating detailed portraits of Americans based on government data is not just a pipe dream. The Trump administration has already sought access to hundreds of data points on citizens and others through government databases, including their bank account numbers, the amount of their student debt, their medical claims and any disability status…(More)”.
Digital Democracy in a Divided Global Landscape
10 essays by the Carnegie Endowment for International Peace: “A first set of essays analyzes how local actors are navigating the new tech landscape. Lillian Nalwoga explores the challenges and upsides of Starlink satellite internet deployment in Africa, highlighting legal hurdles, security risks, and concerns about the platform’s leadership. As African nations look to Starlink as a valuable tool in closing the digital divide, Nalwoga emphasizes the need to invest in strong regulatory frameworks to safeguard digital spaces. Jonathan Corpus Ong and Dean Jackson analyze the landscape of counter-disinformation funding in local contexts. They argue that there is a “mismatch” between the priorities of funders and the strategies that activists would like to pursue, resulting in “ineffective and extractive workflows.” Ong and Jackson isolate several avenues for structural change, including developing “big tent” coalitions of activists and strategies for localizing aid projects. Janjira Sombatpoonsiri examines the role of local actors in foreign influence operations in Southeast Asia. She highlights three motivating factors that drive local participation in these operations: financial benefits, the potential to gain an edge in domestic power struggles, and the appeal of anti-Western narratives.
A second set of essays explores evolving applications of digital repression…
A third set focuses on national strategies and digital sovereignty debates…
A fourth set explores pressing tech policy and regulatory questions…(More)”.
How Canada Needs to Respond to the US Data Crisis
Article by Danielle Goldfarb: “The United States is cutting and undermining official US data across a wide range of domains, eroding the foundations of evidence-based policy making. This is happening mostly under the radar here in Canada, buried by news about US President Donald Trump’s barrage of tariffs and many other alarming actions. Doing nothing in response means Canada accepts blind spots in critical areas. Instead, this country should respond by investing in essential data and building the next generation of trusted public intelligence.
The United States has cut or altered more than 2,000 official data sets across the science, health, climate and development sectors, according to the National Security Archive. Deep staff cuts across all program areas effectively cancel or deeply erode many other statistical programs….
Even before this data purge, official US data methods were becoming less relevant and reliable. Traditional government surveys lag by weeks or months and face declining participation. This lag proved particularly problematic during the COVID-19 pandemic and also now, when economic data with a one- or two-month lag is largely irrelevant for tracking the real-time impact of constantly shifting Trump tariffs….
With deep ties to the United States, Canada needs to take action to reduce these critical blind spots. This challenge brings a major strength into the picture: Canada’s statistical agencies have strong reputations as trusted, transparent information sources.
First, Canada should strengthen its data infrastructure. Official Canadian data suffers from delays and declining response rates similar to those in the United States. Statistics Canada needs a renewed mandate and stable resources to produce policy-relevant indicators, especially in a timelier way, and in areas where US data has been cut or compromised.
Second, Canada could also act as a trusted place to store vulnerable indicators — inventorying missing data sets, archiving those at risk and coordinating global efforts to reconstruct essential metrics.
Third, Canada has an opportunity to lead in shaping the next generation of trusted and better public-interest intelligence…(More)”.
Making the case for collaborative digital infrastructure to scale regenerative food supply networks
Briefing paper from the Food Data Collaboration: “…a call to action to collaborate and invest in data infrastructure that will enable shorter, relational, regenerative food supply networks to scale.
These food supply networks play a vital role in achieving a truly sustainable and resilient food system. By embracing data technology that fosters commons ownership models, collaboration and interdependence we can build a more inclusive and dynamic food ecosystem in which collaborative efforts, as opposed to competitive businesses operating in silos, can achieve transformative scale.
Since 2022, the Food Data Collaboration has been exploring the potential for open data standards to enable shorter, relational, regenerative food supply networks to scale and pave the way towards a healthier, more equitable, and more resilient food future. This paper explores the high-level rationale for our approach and is essential reading for anyone keen to know more about the project’s aims, achievements and future development…(More)”.
Who Is Government?
Book edited by Michael Lewis: “The government is a vast, complex system that Americans pay for, rebel against, rely upon, dismiss, and celebrate. It’s also our shared resource for addressing the biggest problems of society. And it’s made up of people, mostly unrecognized and uncelebrated, doing work that can be deeply consequential and beneficial to everyone.
Michael Lewis invited his favorite writers, including Casey Cep, Dave Eggers, John Lanchester, Geraldine Brooks, Sarah Vowell, and W. Kamau Bell, to join him in finding someone doing an interesting job for the government and writing about them. The stories they found are unexpected, riveting, and inspiring, including a former coal miner devoted to making mine roofs less likely to collapse, saving thousands of lives; an IRS agent straight out of a crime thriller; and the manager who made the National Cemetery Administration the best-run organization, public or private, in the entire country. Each essay shines a spotlight on the essential behind-the-scenes work of exemplary federal employees.
Whether they’re digitizing archives, chasing down cybercriminals, or discovering new planets, these public servants are committed to their work and universally reluctant to take credit. Expanding on the Washington Post series, the vivid profiles in Who Is Government? blow up the stereotype of the irrelevant bureaucrat. They show how the essential business of government makes our lives possible, and how much it matters…(More)”.
Computer Science and the Law
Article by Steven M. Bellovin: “Three U.S. technical/legal developments, all occurring around 1993, had a profound effect on the technology industry and on many technologists. More such developments are occurring with increasing frequency.
The three developments were, in fact, technically unrelated. One was a bill before the U.S. Congress for a standardized wiretap interface in phone switches, a concept that spread around the world under the generic name of “lawful intercept.” The second was an update to the copyright statute to adapt to the digital age. While there were some useful changes—caching proxies and ISPs transmitting copyrighted material were no longer to be held liable for making illegal copies of protected content—it also provided an easy way for careless or unscrupulous actors—including bots—to request takedown of perfectly legal material. The third was the infamous Clipper chip, an encryption device that provided a backdoor for the U.S.—and only the U.S.—government.
All three of these developments could be and were debated on purely legal or policy grounds. But there were also technical issues. Thus, one could argue on legal grounds that the Clipper chip granted the government unprecedented powers, powers arguably in violation of the Fourth Amendment to the U.S. Constitution. That, of course, is a U.S. issue—but technologists, including me, pointed out the technical risks of deploying a complex cryptographic protocol, anywhere in the world (and many other countries have since expressed similar desires). Sure enough, Matt Blaze showed how to abuse the Clipper chip to let it do backdoor-free encryption, and at least two other mechanisms for adding backdoors to encryption protocols were shown to have flaws that allowed malefactors to read data that others had encrypted.
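The structural weakness Blaze exploited can be illustrated without the classified details of the Clipper/LEAF protocol: the law-enforcement access field was protected by only a 16-bit checksum, and any 16-bit check can be defeated by brute force in roughly 2^16 attempts on average. A toy sketch, using an invented stand-in checksum rather than the real (classified) one:

```python
# Toy illustration (not the actual Clipper/LEAF protocol): a 16-bit
# integrity check can be brute-forced, letting a user submit a bogus
# field that passes validation but carries no usable key -- the essence
# of Blaze's backdoor-free-encryption attack. The checksum function here
# is a stand-in; the real LEAF checksum was classified.
import hashlib

def checksum16(data: bytes) -> int:
    # 16 bits of a hash: only 65,536 possible values.
    return int.from_bytes(hashlib.sha256(data).digest()[:2], "big")

target = checksum16(b"genuine LEAF blob")

# Try candidate blobs until one collides on the 16-bit check.
for i in range(1 << 20):
    forged = b"bogus-" + i.to_bytes(4, "big")
    if checksum16(forged) == target:
        print(f"forged blob accepted after {i + 1} tries")
        break
```

A 16-bit search space is trivial for even 1990s hardware, which is why the checksum offered no real protection against a determined user stripping out the government's access.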
These posed a problem: debating some issues intelligently required not just a knowledge of law or of technology, but of both. That is, some problems cannot be discussed purely on technical grounds or purely on legal grounds; the crux of the matter lies in the intersection.
Consider, for example, the difference between content and metadata in a communication. Metadata alone is extremely powerful; indeed, Michael Hayden, former director of both the CIA and the NSA, once said, “We kill people based on metadata.” The combination of content and metadata is of course even more powerful. However, under U.S. law (and the legal reasoning is complex and controversial), the content of a phone call is much more strongly protected than the metadata: who called whom, when, and for how long they spoke. But how does this doctrine apply to the Internet, a network that provides far more powerful abilities to the endpoints in a conversation? (Metadata analysis is not an Internet-specific phenomenon. The militaries of the world have likely been using it for more than a century.) You cannot begin to answer that question without knowing not just how the Internet actually works, but also the legal reasoning behind the difference. It took more than 100 pages for some colleagues and me, three computer scientists and a former Federal prosecutor, to show how the line between content and metadata can be drawn in some cases (and that the Department of Justice’s manuals and some Federal judges got the line wrong), but that in other cases, there is no possible line.
Newer technologies pose the same sorts of risks…(More)”.
When data disappear: public health pays as US policy strays
Paper by Thomas McAndrew, Andrew A Lover, Garrik Hoyt, and Maimuna S Majumder: “Presidential actions on Jan 20, 2025, by President Donald Trump, including executive orders, have delayed access to or led to the removal of crucial public health data sources in the USA. The continuous collection and maintenance of health data support public health, safety, and security associated with diseases such as seasonal influenza. To show how public health data surveillance enhances public health practice, we analysed data from seven US Government-maintained sources associated with seasonal influenza. We fit two models that forecast the number of national incident influenza hospitalisations in the USA: (1) a data-rich model incorporating data from all seven Government data sources; and (2) a data-poor model built using a single Government hospitalisation data source, representing the minimal required information to produce a forecast of influenza hospitalisations. The data-rich model generated reliable forecasts useful for public health decision making, whereas the predictions using the data-poor model were highly uncertain, rendering them impractical. Thus, health data can serve as a transparent and standardised foundation to improve domestic and global health. Therefore, a plan should be developed to safeguard public health data as a public good…(More)”.
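The paper's qualitative point — that several independent surveillance streams yield far tighter forecasts than a single source — can be sketched with a toy simulation. This is not the authors' actual forecasting models, just an illustration of why combining seven noisy signals of the same underlying quantity shrinks uncertainty:

```python
# Toy simulation (not the paper's models): averaging several noisy
# surveillance signals of the same underlying hospitalisation count
# gives a much tighter estimate than relying on one source alone.
import random
import statistics

random.seed(42)
TRUE_HOSPITALISATIONS = 5000
N_TRIALS = 2000

def noisy_signal():
    """One data source's reading of the true count, with reporting noise."""
    return random.gauss(TRUE_HOSPITALISATIONS, 500)

# Data-poor: a single source per forecast.
data_poor = [noisy_signal() for _ in range(N_TRIALS)]

# Data-rich: seven independent sources combined per forecast.
data_rich = [statistics.mean(noisy_signal() for _ in range(7))
             for _ in range(N_TRIALS)]

print(f"data-poor spread: {statistics.stdev(data_poor):7.1f}")
print(f"data-rich spread: {statistics.stdev(data_rich):7.1f}")
```

With independent noise, combining seven sources cuts the spread by roughly a factor of sqrt(7), which is the statistical intuition behind the paper's finding that the data-poor model's forecasts were too uncertain to be actionable.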