Trump Taps Palantir to Compile Data on Americans


Article by Sheera Frenkel and Aaron Krolik: “In March, President Trump signed an executive order calling for the federal government to share data across agencies, raising questions over whether he might compile a master list of personal information on Americans that could give him untold surveillance power.

Mr. Trump has not publicly talked about the effort since. But behind the scenes, officials have quietly put technological building blocks into place to enable his plan. In particular, they have turned to one company: Palantir, the data analysis and technology firm.

The Trump administration has expanded Palantir’s work across the federal government in recent months. The company has received more than $113 million in federal government spending since Mr. Trump took office, according to public records, including additional funds from existing contracts as well as new contracts with the Department of Homeland Security and the Pentagon. (This does not include a $795 million contract that the Department of Defense awarded the company last week, which has not been spent.)

Representatives of Palantir are also speaking to at least two other agencies — the Social Security Administration and the Internal Revenue Service — about buying its technology, according to six government officials and Palantir employees with knowledge of the discussions.

The push has put a key Palantir product called Foundry into at least four federal agencies, including D.H.S. and the Health and Human Services Department. Widely adopting Foundry, which organizes and analyzes data, paves the way for Mr. Trump to easily merge information from different agencies, the government officials said…(More)

Creating detailed portraits of Americans based on government data is not just a pipe dream. The Trump administration has already sought access to hundreds of data points on citizens and others through government databases, including their bank account numbers, the amount of their student debt, their medical claims and any disability status…(More)”.

Designing Shared Data Futures: Engaging young people on how to re-use data responsibly for health and well-being


Report by Hannah Chafetz, Sampriti Saxena, Tracy Jo Ingram, Andrew J. Zahuranec, Jennifer Requejo and Stefaan Verhulst: “When young people are engaged in data decisions for or about them, they not only become more informed about this data, but can also contribute to new policies and programs that improve their health and well-being. However, oftentimes young people are left out of these discussions and are unaware of the data that organizations collect.

In October 2023, The Second Lancet Commission on Adolescent Health and Well-being, the United Nations Children’s Fund (UNICEF), and The GovLab at New York University hosted six Youth Solutions Labs (or co-design workshops) with over 120 young people from 36 countries around the world. In addition to co-designing solutions to five key issues impacting their health and well-being, we sought to understand current sentiments around the re-use of data on those issues. The Labs provided several insights about young people’s preferences regarding: 1) the purposes for which data should be re-used to improve health and well-being, 2) the types and sources of data that should and should not be re-used, 3) who should have access to previously collected data, and 4) under what circumstances data re-use should take place. Additionally, participants offered suggestions on what ethical and responsible data re-use looks like to them and how young people can participate in decision-making processes. In this paper, we elaborate on these findings and provide a series of recommendations to accelerate responsible data re-use for the health and well-being of young people…(More)”.

Our new AI strategy puts Wikipedia’s humans first


Blog by Chris Albon and Leila Zia: “Not too long ago, we were asked when we’re going to replace Wikipedia’s human-curated knowledge with AI. 

The answer? We’re not.

The community of volunteers behind Wikipedia is the most important and unique element of Wikipedia’s success. For nearly 25 years, Wikipedia editors have researched, deliberated, discussed, built consensus, and collaboratively written the largest encyclopedia humankind has ever seen. Their care and commitment to reliable encyclopedic knowledge is something AI cannot replace. 

That is why our new AI strategy doubles down on the volunteers behind Wikipedia.

We will use AI to build features that remove technical barriers to allow the humans at the core of Wikipedia to spend their valuable time on what they want to accomplish, and not on how to technically achieve it. Our investments will be focused on specific areas where generative AI excels, all in the service of creating unique opportunities that will boost Wikipedia’s volunteers: 

  • Supporting Wikipedia’s moderators and patrollers with AI-assisted workflows that automate tedious tasks in support of knowledge integrity; 
  • Giving Wikipedia’s editors time back by improving the discoverability of information on Wikipedia to leave more time for human deliberation, judgment, and consensus building; 
  • Helping editors share local perspectives or context by automating the translation and adaptation of common topics;
  • Scaling the onboarding of new Wikipedia volunteers with guided mentorship. 

You can read the Wikimedia Foundation’s new AI strategy over on Meta-Wiki…(More)”.

Brazil’s AI-powered social security app is wrongly rejecting claims


Article by Gabriel Daros: “Brazil’s social security institute, known as INSS, added AI to its app in 2018 in an effort to cut red tape and speed up claims. The office, known for its long lines and wait times, had around 2 million pending requests for everything from doctor’s appointments to sick pay to pensions to retirement benefits at the time. While the AI-powered tool has since helped process thousands of basic claims, it has also rejected requests from hundreds of people like de Brito — who live in remote areas and have little digital literacy — for minor errors.

The government is right to digitize its systems to improve efficiency, but that has come at a cost, Edjane Rodrigues, secretary for social policies at the National Confederation of Workers in Agriculture, told Rest of World.

“If the government adopts this kind of service to speed up benefits for the people, this is good. We are not against it,” she said. But, particularly among farm workers, claims can be complex because of the nature of their work, she said, referring to cases that require additional paperwork, such as when a piece of land is owned by one individual but worked by a group of families. “There are many peculiarities in agriculture, and rural workers are being especially harmed” by the app, according to Rodrigues.

“Each automated decision is based on specified legal criteria, ensuring that the standards set by the social security legislation are respected,” a spokesperson for INSS told Rest of World. “Automation does not work in an arbitrary manner. Instead, it follows clear rules and regulations, mirroring the expected standards applied in conventional analysis.”

Governments across Latin America have been introducing AI to improve their processes. Last year, Argentina began using ChatGPT to draft court rulings, a move that officials said helped cut legal costs and reduce processing times. Costa Rica has partnered with Microsoft to launch an AI tool to optimize tax data collection and check for fraud in digital tax receipts. El Salvador recently set up an AI lab to develop tools for government services.

But while some of these efforts have delivered promising results, experts have raised concerns about the risk of officials with little tech know-how applying these tools with no transparency or workarounds…(More)”.

DOGE’s Growing Reach into Personal Data: What it Means for Human Rights


Article by Deborah Brown: “Expansive interagency sharing of personal data could fuel abuses against vulnerable people and communities who are already being targeted by Trump administration policies, like immigrants, lesbian, gay, bisexual, and transgender (LGBT) people, and student protesters. The personal data held by the government reveals deeply sensitive information, such as people’s immigration status, race, gender identity, sexual orientation, and economic status.

A massive centralized government database could easily be used for a range of abusive purposes, like to discriminate against current federal employees and future job applicants on the basis of their sexual orientation or gender identity, or to facilitate the deportation of immigrants. It could result in people forgoing public services out of fear that their data will be weaponized against them by another federal agency.

But the danger doesn’t stop with those already in the administration’s crosshairs. The removal of barriers keeping private data siloed could allow the government or DOGE to deny federal loans for education or Medicaid benefits based on unrelated or even inaccurate data. It could also facilitate the creation of profiles containing all of the information various agencies hold on every person in the country. Such profiles, combined with social media activity, could facilitate the identification and targeting of people for political reasons, including in the context of elections.

Information silos exist for a reason. Personal data should be collected for a determined, specific, and legitimate purpose, and not used for another purpose without notice or justification, according to the key internationally recognized data protection principle, “purpose limitation.” Sharing data seamlessly across federal or even state agencies in the name of an undefined and unmeasurable goal of efficiency is incompatible with this core data protection principle…(More)”.

Data Localization: A Global Threat to Human Rights Online


Article by Freedom House: “From Pakistan to Zambia, governments around the world are increasingly proposing and passing data localization legislation. These laws, which refer to the rules governing the storage and transfer of electronic data across jurisdictions, are often justified as addressing concerns such as user privacy, cybersecurity, national security, and monopolistic market practices. Notwithstanding these laudable goals, data localization initiatives cause more harm than good, especially in legal environments with poor rule of law.

Data localization requirements can take many different forms. A government may require all companies collecting and processing certain types of data about local users to store the data on servers located in the country. Authorities may also restrict the foreign transfer of certain types of data or allow it only under narrow circumstances, such as after obtaining the explicit consent of users, receiving a license or permit from a public authority, or conducting a privacy assessment of the country to which the data will be transferred.

While data localization can have significant economic and security implications, the focus of this piece—in line with that of the Global Network Initiative and Freedom House—is on its potential human rights impacts, which are varied. Freedom House’s research shows that the rise in data localization policies worldwide is contributing to the global decline of internet freedom. Without robust transparency and accountability frameworks embedded into these provisions, digital rights are often put on the line. As these types of legislation continue to pop up globally, the need for rights-respecting solutions and norms for cross-border data flows is greater than ever…(More)”.

Towards a set of Universal data principles


Paper by Steve MacFeely, Angela Me, Friederike Schueuer, Joseph Costanzo, David Passarelli, Malarvizhi Veerappan, and Stefaan Verhulst: “Humanity collects, processes, shares, uses, and reuses a staggering volume of data. These data are the lifeblood of the digital economy; they feed algorithms and artificial intelligence, inform logistics, and shape markets, communication, and politics. Data do not just yield economic benefits; they can also have individual and societal benefits and impacts. Being able to access, process, use, and reuse data is essential for dealing with global challenges, such as managing and protecting the environment, intervening in the event of a pandemic, or responding to a disaster or crisis. While we have made great strides, we have yet to realize the full potential of data, in particular, the potential of data to serve the public good. This will require international cooperation and a globally coordinated approach, as many data governance issues cannot be fully resolved at the national level. This paper presents a proposal for a preliminary set of data goals and principles. These goals and principles are envisaged as the normative foundations for an international data governance framework – one that is grounded in human rights and sustainable development. A principles-based approach to data governance helps create common values, and in doing so, helps to change behaviours, mindsets, and practices. It can also help create a foundation for the safe use of all types of data and data transactions. The purpose of this paper is to present the preliminary principles to solicit reaction and feedback…(More)”.

Can small language models revitalize Indigenous languages?


Article by Brooke Tanner and Cameron F. Kerry: “Indigenous languages play a critical role in preserving cultural identity and transmitting unique worldviews, traditions, and knowledge, but at least 40% of the world’s 6,700 languages are currently endangered. The United Nations declared 2022-2032 as the International Decade of Indigenous Languages to draw attention to this threat, in hopes of supporting the revitalization of these languages and preservation of access to linguistic resources.  

Building on the advantages of small language models (SLMs), several initiatives have successfully adapted these models specifically for Indigenous languages. Such Indigenous language models (ILMs) represent a subset of SLMs that are designed, trained, and fine-tuned with input from the communities they serve. 

Case studies and applications 

  • Meta released No Language Left Behind (NLLB-200), a 54 billion–parameter open-source machine translation model that supports 200 languages as part of Meta’s universal speech translator project. The model includes support for languages with limited translation resources. While the model’s breadth of language coverage is novel, NLLB-200 can struggle to capture the intricacies of local context for low-resource languages and often relies on machine-translated sentence pairs from across the internet due to the scarcity of digitized monolingual data. 
  • Lelapa AI’s InkubaLM-0.4B is an SLM with applications for low-resource African languages. Trained on 1.9 billion tokens across languages including isiZulu, Yoruba, Swahili, and isiXhosa, InkubaLM-0.4B (with 400 million parameters) builds on Meta’s LLaMA 2 architecture, yielding a far smaller model than the original 7-billion-parameter LLaMA 2 pretrained model. 
  • IBM Research Brazil and the University of São Paulo have collaborated on projects aimed at preserving Brazilian Indigenous languages such as Guarani Mbya and Nheengatu. These initiatives emphasize co-creation with Indigenous communities and address concerns about cultural exposure and language ownership. Initial efforts included electronic dictionaries, word prediction, and basic translation tools. Notably, when a prototype writing assistant for Guarani Mbya raised concerns about exposing their language and culture online, project leaders paused further development pending community consensus.  
  • Researchers have fine-tuned pre-trained models for Nheengatu using linguistic educational sources and translations of the Bible, with plans to incorporate community-guided spellcheck tools. Since the translations, which relied on data from the Bible primarily translated by colonial priests, often sounded archaic and could reflect cultural abuse and violence, they were classified as potentially “toxic” data that would not be used in any deployed system without explicit Indigenous community agreement…(More)”.

The Access to Public Information: A Fundamental Right


Book by Alejandra Soriano Diaz: “Information is not only a fundamental human right, but it has also been shaped into a pillar for the exercise of other human rights around the world. It is the path for holding authorities and other powerful actors to account before the people, who are, for all purposes, the actual owners of public data.

Providing information about public decisions that have the potential to significantly impact a community is vital to modern democracy. This book explores the forms in which individuals and collectives are able to voice their opinions and participate in public decision-making when long-lasting effects are at stake, on present and future generations. The strong correlation between the right to access public information and the enjoyment of civil and political rights, as well as economic and environmental rights, emphasizes their interdependence.

This study raises a number of important questions aimed at mobilizing toward openness and empowering people’s right of ownership over their public information…(More)”.

Big brother: the effects of surveillance on fundamental aspects of social vision


Paper by Kiley Seymour et al: “Despite the dramatic rise of surveillance in our societies, only limited research has examined its effects on humans. While most research has focused on voluntary behaviour, no study has examined the effects of surveillance on more fundamental and automatic aspects of human perceptual awareness and cognition. Here, we show that being watched on CCTV markedly impacts a hardwired and involuntary function of human sensory perception—the ability to consciously detect faces. Using the method of continuous flash suppression (CFS), we show that when people are surveilled (N = 24), they are quicker than controls (N = 30) to detect faces. An independent control experiment (N = 42) ruled out an explanation based on demand characteristics and social desirability biases. These findings show that being watched impacts not only consciously controlled behaviours but also unconscious, involuntary visual processing. Our results have implications concerning the impacts of surveillance on basic human cognition as well as public mental health…(More)”.