AI Liability Along the Value Chain


Report by Beatriz Botero Arcila: “…explores how liability law can help solve the “problem of many hands” in AI: that is, determining who is responsible for harm that has been dealt in a value chain in which a variety of different companies and actors might be contributing to the development of any given AI system. This is aggravated by the fact that AI systems are both opaque and technically complex, making their behavior hard to predict.

Why AI Liability Matters

To find meaningful solutions to this problem, different kinds of experts have to come together. This resource is designed for a wide audience, but we indicate how specific audiences can best make use of different sections, overviews, and case studies.

Specifically, the report:

  • Proposes a 3-step analysis to consider how liability should be allocated along the value chain: 1) The choice of liability regime, 2) how liability should be shared amongst actors along the value chain and 3) whether and how information asymmetries will be addressed.
  • Argues that where ex-ante AI regulation is already in place, policymakers should consider how liability rules will interact with these rules.
  • Proposes a baseline liability regime where actors along the AI value chain share responsibility if fault can be demonstrated, paired with measures to alleviate or shift the burden of proof and to enable better access to evidence — which would incentivize companies to act with sufficient care and address information asymmetries between claimants and companies.
  • Argues that in some cases, courts and regulators should extend a stricter regime, such as product liability or strict liability.
  • Analyzes liability rules in the EU based on this framework…(More)”.

How crawlers impact the operations of the Wikimedia projects


Article by the Wikimedia Foundation: “Since the beginning of 2024, the demand for the content created by the Wikimedia volunteer community – especially for the 144 million images, videos, and other files on Wikimedia Commons – has grown significantly. In this post, we’ll discuss the reasons for this trend and its impact.

The Wikimedia projects are the largest collection of open knowledge in the world. Our sites are an invaluable destination for humans searching for information, and for all kinds of businesses that access our content automatically as a core input to their products. Most notably, the content has been a critical component of search engine results, which in turn has brought users back to our sites. But with the rise of AI, the dynamic is changing: We are observing a significant increase in request volume, with most of this traffic being driven by scraping bots collecting training data for large language models (LLMs) and other use cases. Automated requests for our content have grown exponentially, alongside the broader technology economy, via mechanisms including scraping, APIs, and bulk downloads. This expansion happened largely without sufficient attribution, which is key to drive new users to participate in the movement, and is causing a significant load on the underlying infrastructure that keeps our sites available for everyone. 

When Jimmy Carter died in December 2024, his page on English Wikipedia saw more than 2.8 million views over the course of a day. This was relatively high, but manageable. At the same time, quite a few users played a 1.5 hour long video of Carter’s 1980 presidential debate with Ronald Reagan. This caused a surge in the network traffic, doubling its normal rate. As a consequence, for about one hour a small number of Wikimedia’s connections to the Internet filled up entirely, causing slow page load times for some users. The sudden traffic surge alerted our Site Reliability team, who were swiftly able to address this by changing the paths our internet connections go through to reduce the congestion. But still, this should not have caused any issues, as the Foundation is well equipped to handle high traffic spikes during exceptional events. So what happened?…

Since January 2024, we have seen the bandwidth used for downloading multimedia content grow by 50%. This increase is not coming from human readers, but largely from automated programs that scrape the Wikimedia Commons image catalog of openly licensed images to feed images to AI models. Our infrastructure is built to sustain sudden traffic spikes from humans during high-interest events, but the amount of traffic generated by scraper bots is unprecedented and presents growing risks and costs…(More)”.

AI, Innovation and the Public Good: A New Policy Playbook


Paper by Burcu Kilic: “When Chinese start-up DeepSeek released R1 in January 2025, the groundbreaking open-source artificial intelligence (AI) model rocked the tech industry as a more cost-effective alternative to models running on more advanced chips. The launch coincided with industrial policy gaining popularity as a strategic tool for governments aiming to build AI capacity and competitiveness. Once dismissed under neoliberal economic frameworks, industrial policy is making a strong comeback with more governments worldwide embracing it to build digital public infrastructure and foster local AI ecosystems. This paper examines how the national innovation system framework can guide AI industrial policy to foster innovation and reduce reliance on dominant tech companies…(More)”.

Oxford Intersections: AI in Society


Series edited by Philipp Hacker: “…provides an interdisciplinary corpus for understanding artificial intelligence (AI) as a global phenomenon that transcends geographical and disciplinary boundaries. Edited by a consortium of experts hailing from diverse academic traditions and regions, the 11 edited and curated sections provide a holistic view of AI’s societal impact. Critically, the work goes beyond the often Eurocentric or U.S.-centric perspectives that dominate the discourse, offering nuanced analyses that encompass the implications of AI for a range of regions of the world. Taken together, the sections of this work seek to move beyond the state of the art in three specific respects. First, they venture decisively beyond existing research efforts to develop a comprehensive account and framework for the rapidly growing importance of AI in virtually all sectors of society. Going beyond a mere mapping exercise, the curated sections assess opportunities, critically discuss risks, and offer solutions to the manifold challenges AI harbors in various societal contexts, from individual labor to global business, law and governance, and interpersonal relationships. Second, the work tackles specific societal and regulatory challenges triggered by the advent of AI and, more specifically, large generative AI models and foundation models, such as ChatGPT or GPT-4, which have so far received limited attention in the literature, particularly in monographs or edited volumes. Third, the novelty of the project is underscored by its decidedly interdisciplinary perspective: each section, whether covering Conflict; Culture, Art, and Knowledge Work; Relationships; or Personhood—among others—will draw on various strands of knowledge and research, crossing disciplinary boundaries and uniting perspectives most appropriate for the context at hand…(More)”.

Robotics for Global development


Report by the Frontier Tech Hub: “Robotics could enable progress on 46% of SDG targets  yet this potential remains largely untapped in low and middle-income countries. 

While technological developments and new-found applications of artificial intelligence (AI) keep captivating significant attention and investments, using robotics to advance the Sustainable Development Goals (SDGs) is consistently overlooked. This is especially true when the focus moves from aerial robotics (drones) to robotic arms, ground robotics, and aquatic robotics. How might these types of robots accelerate global development in the least developed countries? 

We aim to answer this question and inform the UK Foreign, Commonwealth & Development Office’s (FCDO) investment and policy towards robotics in the least developed countries (LDCs). In an emergent space, the UK FCDO has a unique opportunity to position itself as a global leader in leveraging robotics technology to accelerate sustainable development outcomes…(More)”.

Cloze Encounters: The Impact of Pirated Data Access on LLM Performance


Paper by Stella Jia & Abhishek Nagaraj: “Large Language Models (LLMs) have demonstrated remarkable capabilities in text generation, but their performance may be influenced by the datasets on which they are trained, including potentially unauthorized or pirated content. We investigate the extent to which data access through pirated books influences LLM responses. We test the performance of leading foundation models (GPT, Claude, Llama, and Gemini) on a set of books that were and were not included in the Books3 dataset, which contains full-text pirated books and could be used for LLM training. We assess book-level performance using the “name cloze” word-prediction task. To examine the causal effect of Books3 inclusion we employ an instrumental variables strategy that exploits the pattern of book publication years in the Books3 dataset. In our sample of 12,916 books, we find significant improvements in LLM name cloze accuracy on books available within the Books3 dataset compared to those not present in these data. These effects are more pronounced for less popular books as compared to more popular books and vary across leading models. These findings have crucial implications for the economics of digitization, copyright policy, and the design and training of AI systems…(More)”.

Bubble Trouble


Article by Bryan McMahon: “…Venture capital (VC) funds, drunk on a decade of “growth at all costs,” have poured about $200 billion into generative AI. Making matters worse, the stock market’s bull run is deeply dependent on the growth of the Big Tech companies fueling the AI bubble. In 2023, 71 percent of the total gains in the S&P 500 were attributable to the “Magnificent Seven”—Apple, Nvidia, Tesla, Alphabet, Meta, Amazon, and Microsoft—all of which are among the biggest spenders on AI. Just four—Microsoft, Alphabet, Amazon, and Meta—combined for $246 billion of capital expenditure in 2024 to support the AI build-out. Goldman Sachs expects Big Tech to spend over $1 trillion on chips and data centers to power AI over the next five years. Yet OpenAI, the current market leader, expects to lose $5 billion this year, and its annual losses to swell to $11 billion by 2026. If the AI bubble bursts, it not only threatens to wipe out VC firms in the Valley but also blow a gaping hole in the public markets and cause an economy-wide meltdown…(More)”.

The Language Data Space (LDS)


European Commission: “… welcomes launch of the Alliance for Language Technologies European Digital Infrastructure Consortium (ALT-EDIC) and the Language Data Space (LDS).

Aimed at addressing the shortage of European language data needed for training large language models, these projects are set to revolutionise multilingual Artificial Intelligence (AI) systems across the EU.

By offering services in all EU languages, the initiatives are designed to break down language barriers, providing better, more accessible solutions for smaller businesses within the EU. This effort not only aims to preserve the EU’s rich cultural and linguistic heritage in the digital age but also strengthens Europe’s quest for tech sovereignty. Formed in February 2024, the ALT-EDIC includes 17 participating Member States and 9 observer Member States and regions, making it one of the pioneering European Digital Infrastructure Consortia.

The LDS, part of the Common European Data Spaces, is crucial for increasing data availability for AI development in Europe. Developed by the Commission and funded by the DIGITAL programme,  this project aims to create a cohesive marketplace for language data. This will enhance the collection and sharing of multilingual data to support European large language models. Initially accessible to selected institutions and companies, the project aims to eventually involve all European public and private stakeholders.

Find more information about the Alliance for Language Technologies European Digital Infrastructure Consortium (ALT-EDIC) and the Language Data Space (LDS)…(More)”

New AI Collaboratives to take action on wildfires and food insecurity


Google: “…last September we introduced AI Collaboratives, a new funding approach designed to unite public, private and nonprofit organizations, and researchers, to create AI-powered solutions to help people around the world.

Today, we’re sharing more about our first two focus areas for AI Collaboratives: Wildfires and Food Security.

Wildfires are a global crisis, claiming more than 300,000 lives due to smoke exposure annually and causing billions of dollars in economic damage. …Google.org has convened more than 15 organizations, including Earth Fire Alliance and Moore Foundation, to help in this important effort. By coordinating funding and integrating cutting-edge science, emerging technology and on-the-ground applications, we can provide collaborators with the tools they need to identify and track wildfires in near real time; quantify wildfire risk; shift more acreage to beneficial fires; and ultimately reduce the damage caused by catastrophic wildfires.

Nearly one-third of the world’s population faces moderate or severe food insecurity due to extreme weather, conflict and economic shocks. The AI Collaborative: Food Security will strengthen the resilience of global food systems and improve food security for the world’s most vulnerable populations through AI technologies, collaborative research, data-sharing and coordinated action. To date, 10 organizations have joined us in this effort, and we’ll share more updates soon…(More)”.

Large AI models are cultural and social technologies


Essay by Henry Farrell, Alison Gopnik, Cosma Shalizi, and James Evans: “Debates about artificial intelligence (AI) tend to revolve around whether large models are intelligent, autonomous agents. Some AI researchers and commentators speculate that we are on the cusp of creating agents with artificial general intelligence (AGI), a prospect anticipated with both elation and anxiety. There have also been extensive conversations about cultural and social consequences of large models, orbiting around two foci: immediate effects of these systems as they are currently used, and hypothetical futures when these systems turn into AGI agents perhaps even superintelligent AGI agents.

But this discourse about large models as intelligent agents is fundamentally misconceived. Combining ideas from social and behavioral sciences with computer science can help us understand AI systems more accurately. Large Models should not be viewed primarily as intelligent agents, but as a new kind of cultural and social technology, allowing humans to take advantage of information other humans have accumulated.

The new technology of large models combines important features of earlier technologies. Like pictures, writing, print, video, Internet search, and other such technologies, large models allow people to access information that other people have created. Large Models – currently language, vision, and multi-modal depend on the fact that the Internet has made the products of these earlier technologies readily available in machine-readable form. But like economic markets, state bureaucracies, and other social technologies, these systems not only make information widely available, they allow it to be reorganized, transformed, and restructured in distinctive ways. Adopting Herbert Simon’s terminology, large models are a new variant of the “artificial systems of human society” that process information to enable large-scale coordination…(More)”