Comparative Data Law


Conference Proceedings edited by Josef Drexl, Moritz Hennemann, Patricia Boshe, and Klaus Wiedemann: “The increasing relevance of data is now recognized all over the world. The large number of regulatory acts and proposals in the field of data law serves as a testament to the significance of data processing for the economies of the world. The European Union’s Data Strategy, the African Union’s Data Policy Framework and the Australian Data Strategy serve only as examples within a plethora of regulatory actions. Yet the purposeful and sensible use of data plays a role not only in economic terms, e.g. regarding the welfare or competitiveness of economies. The implications for society and the common good are at least equally relevant. For instance, data processing is an integral part of modern research methodology and can thus help to address the problems the world is facing today, such as climate change.

The conference was the third and final event of the Global Data Law Conference Series. Legal scholars from all over the world met to present and exchange their experiences with different data-related regulatory approaches. Various instruments and approaches to the regulation of data – personal or non-personal – were discussed, without losing sight of the global effects that go hand in hand with different kinds of regulation.

In compiling the conference proceedings, this book aims not only to provide a critical and analytical assessment of the status quo of data law in different countries today, but also to offer a forward-looking perspective on the pressing issues of our time, such as: How to promote sensible data sharing and purposeful data governance? Under which circumstances, if ever, do data localisation requirements make sense? How – and by whom – should international regulation be put in place? The proceedings engage in a discussion on future-oriented ideas and actions, thereby promoting a constructive and sensible approach to data law around the world…(More)”.

AI companies start winning the copyright fight


Article by Blake Montgomery: “…tech companies notched several victories in the fight over their use of copyrighted text to create artificial intelligence products.

Anthropic: A US judge has ruled that the use of books by Anthropic, maker of the Claude chatbot, to train its artificial intelligence system – without the authors’ permission – did not breach copyright law. Judge William Alsup compared the Anthropic model’s use of books to a “reader aspiring to be a writer.”

And the next day, Meta: The US district judge Vince Chhabria, in San Francisco, said in his decision on the Meta case that the authors had not presented enough evidence that the technology company’s AI would cause “market dilution” by flooding the market with work similar to theirs.

The same day that Meta received its favorable ruling, a group of writers sued Microsoft, alleging copyright infringement in the creation of that company’s Megatron text generator. Judging by the rulings in favor of Meta and Anthropic, the authors are facing an uphill battle.

These three cases are skirmishes in the wider legal war over copyrighted media, which rages on. Three weeks ago, Disney and NBCUniversal sued Midjourney, alleging that the company’s namesake AI image generator and forthcoming video generator made illegal use of the studios’ iconic characters like Darth Vader and the Simpson family. The world’s biggest record labels – Sony, Universal and Warner – have sued two companies that make AI-powered music generators, Suno and Udio. On the textual front, the New York Times’ suit against OpenAI and Microsoft is ongoing.

The lawsuits over AI-generated text were filed first, and, as their rulings emerge, the next question in the copyright fight is whether decisions about one type of media will apply to the next.

“The specific media involved in the lawsuit – written works versus images versus videos versus audio – will certainly change the fair-use analysis in each case,” said John Strand, a trademark and copyright attorney with the law firm Wolf Greenfield. “The impact on the market for the copyrighted works is becoming a key factor in the fair-use analysis, and the market for books is different than that for movies.”…(More)”.

Bad data leads to bad policy


Article by Georges-Simon Ulrich: “When the UN created a Statistical Commission in 1946, the world was still recovering from the devastation of the second world war. At the time, there was broad consensus that only reliable, internationally comparable data could prevent conflict, combat poverty and anchor global co-operation. Nearly 80 years later, this insight remains just as relevant, but the context has changed dramatically…

This erosion of institutional capacity could not come at a more critical moment. The UN is unable to respond adequately as it is facing a staffing shortfall itself. Due to ongoing austerity measures at the UN, many senior positions remain vacant, and the director of the UN Statistics Division has retired, with no successor appointed. This comes at a time when bold and innovative initiatives — such as a newly envisioned Trusted Data Observatory — are urgently needed to make official statistics more accessible and machine-readable.

Meanwhile, the threat of targeted disinformation is growing. On social media, distorted or manipulated content spreads at unprecedented speed. Emerging tools like AI chatbots exacerbate the problem. These systems rely on web content, not verified data, and are not built to separate truth from falsehood. Making matters worse, many governments cannot currently make their data usable for AI because it is not standardised, not machine-readable, or not openly accessible. The space for sober, evidence-based discourse is shrinking.

This trend undermines public trust in institutions, strips policymaking of its legitimacy, and jeopardises the UN Sustainable Development Goals (SDGs). Without reliable data, governments will be flying blind — or worse: they will be deliberately misled.

When countries lose control of their own data, or cannot integrate it into global decision-making processes, they become bystanders to their own development. Decisions about their economies, societies and environments are then outsourced to AI systems trained on skewed, unrepresentative data. The global south is particularly at risk, with many countries lacking access to quality data infrastructures. In countries such as Ethiopia, unverified information spreading rapidly on social media has fuelled misinformation-driven violence.

The Covid-19 pandemic demonstrated that strong data systems enable better crisis response. To counter these risks, the creation of a global Trusted Data Observatory (TDO) is essential. This UN co-ordinated, democratically governed platform would help catalogue and make accessible trusted data around the world — while fully respecting national sovereignty…(More)”

A New Social Contract for AI? Comparing CC Signals and the Social License for Data Reuse


Article by Stefaan Verhulst: “Last week, Creative Commons — the global nonprofit best known for its open copyright licenses — released “CC Signals: A New Social Contract for the Age of AI.” This framework seeks to offer creators a means to signal their preferences for how their works are used in machine learning, including training Artificial Intelligence systems. It marks an important step toward integrating re-use preferences and shared benefits directly into the AI development lifecycle….

From a responsible AI perspective, the CC Signals framework is an important development. It demonstrates how soft governance mechanisms — declarations, usage expressions, and social signaling — can supplement or even fill gaps left by inconsistent global copyright regimes in the context of AI. At the same time, this initiative provides an interesting point of comparison with our ongoing work to develop a Social License for Data Reuse. A social license for data reuse is a participatory governance framework that allows communities to collectively define, signal and enforce the conditions under which data about them can be reused — including training AI. Unlike traditional consent-based mechanisms, which focus on individual permissions at the point of collection, a social license introduces a community-centered, continuous process of engagement — ensuring that data practices align with shared values, ethical norms, and contextual realities. It provides a complementary layer to legal compliance, emphasizing trust, legitimacy, and accountability in data governance.
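
To make the idea of a machine-readable reuse signal concrete, here is a minimal sketch of what such a declaration could look like. It is purely hypothetical: neither CC Signals nor the Social License for Data Reuse prescribes this schema, and every field name below is an invented assumption.

```python
import json

# Hypothetical schema, for illustration only: neither CC Signals nor the
# Social License for Data Reuse defines these exact fields. The point is
# that a reuse preference can be expressed in a form machines can read
# and AI developers can check before training.
signal = {
    "subject": "https://example.org/datasets/community-health-survey",
    "issuer": "community-data-trust.example.org",
    "preferences": {
        "ai_training": "conditional",  # not a blanket yes/no
        "conditions": ["attribution", "benefit_sharing", "no_resale"],
    },
    "governance": {
        "type": "social_license",      # community-negotiated, not one-off consent
        "review_cycle_months": 12,     # preferences are revisited over time
    },
}

print(json.dumps(signal, indent=2))
```

The review-cycle field hints at the difference elaborated below: a social license treats reuse terms as an ongoing, community-governed process rather than a one-time grant.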

While both frameworks are designed to signal preferences and expectations for data or content reuse, they differ meaningfully in scope, method, and theory of change.

Below, we offer a comparative analysis of the two frameworks — highlighting how each approaches the challenge of embedding legitimacy and trust into AI and data ecosystems…(More)”.

Cloudflare Introduces Default Blocking of A.I. Data Scrapers


Article by Natallie Rocha: “Data for A.I. systems has become an increasingly contentious issue. OpenAI, Anthropic, Google and other companies building A.I. systems have amassed reams of information from across the internet to train their A.I. models. High-quality data is particularly prized because it helps A.I. models become more proficient in generating accurate answers, videos and images.

But website publishers, authors, news organizations and other content creators have accused A.I. companies of using their material without permission and payment. Last month, Reddit sued Anthropic, saying the start-up had unlawfully used the data of its more than 100 million daily users to train its A.I. systems. In 2023, The New York Times sued OpenAI and its partner, Microsoft, accusing them of copyright infringement of news content related to A.I. systems. OpenAI and Microsoft have denied those claims.

Some publishers have struck licensing deals with A.I. companies to receive compensation for their content. In May, The Times agreed to license its editorial content to Amazon for use in the tech giant’s A.I. platforms. Axel Springer, Condé Nast and News Corp have also entered into agreements with A.I. companies to receive revenue for the use of their material.

Mark Howard, the chief operating officer of Time, said he welcomed Cloudflare’s move. Data scraping by A.I. companies threatens anyone who creates content, he said, adding that news publishers like Time deserved fair compensation for what they published…(More)”.
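
The block works at the network edge: requests identified as coming from AI crawlers are refused unless the site owner allows them. Below is a minimal sketch of that core idea using a simple user-agent check; the crawler names are real, published identifiers, but the rule logic is an illustration and not Cloudflare’s actual implementation, which relies on far richer signals than user agents.

```python
# Illustrative only: Cloudflare's managed blocking is far more sophisticated
# (verified bot signatures, behavioural analysis). This sketch shows just the
# core idea of refusing known AI training crawlers unless the site opts in.
AI_CRAWLER_AGENTS = {"GPTBot", "ClaudeBot", "CCBot", "Google-Extended"}

def allow_request(user_agent: str, site_allows_ai_scraping: bool = False) -> bool:
    """Block requests whose user agent matches a known AI crawler,
    unless the site owner has explicitly opted in."""
    if site_allows_ai_scraping:
        return True
    ua = user_agent.lower()
    return not any(bot.lower() in ua for bot in AI_CRAWLER_AGENTS)

# A default-configured site turns OpenAI's crawler away...
print(allow_request("Mozilla/5.0 (compatible; GPTBot/1.0)"))       # False
# ...while an ordinary browser request passes through.
print(allow_request("Mozilla/5.0 (Windows NT 10.0; Win64; x64)"))  # True
```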

AI and Assembly: Coming Together and Apart in a Datafied World


Book edited by Toussaint Nothias and Lucy Bernholz: “Artificial intelligence has moved from the lab into everyday life and is now seemingly everywhere. As AI creeps into every aspect of our lives, the data grab required to power AI also expands. People worldwide are tracked, analyzed, and influenced, whether on or off their screens, inside their homes or outside in public, still or in transit, alone or together. What does this mean for our ability to assemble with others for collective action, including protesting, holding community meetings and organizing rallies? In this context, where and how does assembly take place, and who participates by choice and who by coercion? AI and Assembly explores these questions and offers global perspectives on the present and future of assembly in a world taken over by AI.

The contributors analyze how AI threatens free assembly by clustering people without consent, amplifying social biases, and empowering authoritarian surveillance. But they also explore new forms of associational life that emerge in response to these harms, from communities in the US conducting algorithmic audits to human rights activists in East Africa calling for biometric data protection and rideshare drivers in London advocating for fair pay. Ultimately, AI and Assembly is a rallying cry for those committed to a digital future beyond the narrow horizon of corporate extraction and state surveillance…(More)”.

Data integration and synthesis for pandemic and epidemic intelligence


Paper by Barbara Tornimbene et al: “The COVID-19 pandemic highlighted substantial obstacles in real-time data generation and management needed for clinical research and epidemiological analysis. Three years after the pandemic, reflection on the difficulties of data integration offers potential to improve emergency preparedness. The fourth session of the WHO Pandemic and Epidemic Intelligence Forum sought to report the experiences of key global institutions in data integration and synthesis, with the aim of identifying solutions for effective integration. Data integration, defined as the combination of heterogeneous sources into a cohesive system, allows for combining epidemiological data with contextual elements such as socioeconomic determinants to create a more complete picture of disease patterns. The approach is critical for predicting outbreaks, determining disease burden, and evaluating interventions. The use of contextual information improves real-time intelligence and risk assessments, allowing for faster outbreak responses. This report captures the growing acknowledgment of the importance of data integration in boosting public health intelligence and readiness, and shows examples of how global institutions are strengthening initiatives to respond to this need. However, obstacles persist, including interoperability, data standardization, and ethical considerations. The success of future data integration efforts will be determined by the development of a common technical and legal framework, the promotion of global collaboration, and the protection of sensitive data. Ultimately, effective data integration can potentially transform public health intelligence and the way we respond to future pandemics…(More)”.
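
To make the integration step concrete, here is a minimal Python sketch of the kind of join the paper describes, combining epidemiological counts with socioeconomic context; all district names and figures are invented for illustration.

```python
import pandas as pd

# Hypothetical inputs: weekly case counts and district-level context data.
cases = pd.DataFrame({
    "district": ["A", "B", "C"],
    "week": ["2024-W01", "2024-W01", "2024-W01"],
    "cases": [120, 45, 300],
})
context = pd.DataFrame({
    "district": ["A", "B", "C"],
    "population": [50_000, 20_000, 80_000],
    "poverty_rate": [0.12, 0.30, 0.22],
})

# The integration step: join heterogeneous sources on a shared key, then
# derive an indicator (incidence per 10,000) neither source yields alone.
merged = cases.merge(context, on="district")
merged["incidence_per_10k"] = merged["cases"] / merged["population"] * 10_000
print(merged[["district", "week", "incidence_per_10k", "poverty_rate"]])
```

In practice the hard part is exactly what the paper flags: getting both sources to share standardized keys and definitions in the first place.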

Why AI hardware needs to be open


Article by Ayah Bdeir: “Once again, the future of technology is being engineered in secret by a handful of people and delivered to the rest of us as a sealed, seamless, perfect device. When technology is designed in secrecy and sold to us as a black box, we are reduced to consumers. We wait for updates. We adapt to features. We don’t shape the tools; they shape us. 

This is a problem. And not just for tinkerers and technologists, but for all of us.

We are living through a crisis of disempowerment. Children are more anxious than ever; the former US surgeon general described a loneliness epidemic; people are increasingly worried about AI eroding education. The beautiful devices we use have been correlated with many of these trends. Now AI—arguably the most powerful technology of our era—is moving off the screen and into physical space. 

The timing is not a coincidence. Hardware is having a renaissance. Every major tech company is investing in physical interfaces for AI. Startups are raising capital to build robots, glasses, and wearables that will track our every move. The form factor of AI is the next battlefield. Do we really want our future mediated entirely through interfaces we can’t open, code we can’t see, and decisions we can’t influence? 

This moment creates an existential opening, a chance to do things differently. Because away from the self-centeredness of Silicon Valley, a quiet, grounded sense of resistance is reactivating. I’m calling it the revenge of the makers. 

In 2007, as the iPhone emerged, the maker movement was taking shape. This subculture advocates for learning-through-making in social environments like hackerspaces and libraries. DIY and open hardware enthusiasts gathered in person at Maker Faires—large events where people of all ages tinkered and shared their inventions in 3D printing, robotics, electronics, and more. Motivated by fun, self-fulfillment, and shared learning, the movement birthed companies like MakerBot, Raspberry Pi, Arduino, and (my own education startup) littleBits from garages and kitchen tables. I myself wanted to challenge the notion that technology had to be intimidating or inaccessible, creating modular electronic building blocks designed to put the power of invention in the hands of everyone…(More)”

5 Ways Cooperatives Can Shape the Future of AI


Article by Trebor Scholz and Stefano Tortorici: “Today, AI development is controlled by a small cadre of firms. Companies like OpenAI, Alphabet, Amazon, Meta, and Microsoft dominate through vast computational resources, massive proprietary datasets, deep pools of technical talent, extractive data practices, low-cost labor, and capital that enables continuous experimentation and rapid deployment. Even open-source challengers like DeepSeek run on vast computational muscle and industrial training pipelines.

This domination brings problems: privacy violations and cost-minimizing labor strategies, high environmental costs from data centers, and evident biases in models that can reinforce discrimination in hiring, healthcare, credit scoring, policing, and beyond. These problems tend to affect the people who are already too often left out. AI’s opaque algorithms don’t just sidestep democratic control and transparency—they shape who gets heard, who’s watched, and who’s quietly pushed aside.

Yet, as companies consider using this technology, there can seem to be few other options, leaving them feeling locked into these compromises.

A different model is taking shape, however, with little fanfare, but with real potential. AI cooperatives—organizations developing or governing AI technologies based on cooperative principles—offer a promising alternative. The cooperative movement, with its global footprint and diversity of models, has been successful in sectors ranging from banking and agriculture to insurance and manufacturing. Cooperative enterprises, which are owned and governed by their members, have long managed infrastructure for the public good.

A handful of AI cooperatives offer early examples of how democratic governance and shared ownership could shape more accountable and community-centered uses of the technology. Most are large agricultural cooperatives that are putting AI to use in their day-to-day operations, such as IFFCO’s DRONAI program (AI for fertilization), FrieslandCampina (dairy quality control), and Fonterra (milk production analytics). Cooperatives must urgently organize to challenge AI’s dominance or remain on the sidelines of critical political and technological developments.

There is undeniably potential here, for both existing cooperatives and companies that might want to partner with them. The $589 billion drop in Nvidia’s market cap triggered by DeepSeek shows how quickly open-source innovation can shift the landscape. But for cooperative AI labs to do more than signal intent, they need public infrastructure, civic partnerships, and serious backing…(More)”.

Trends in AI Supercomputers


Paper by Konstantin F. Pilz, James Sanders, Robi Rahman, and Lennart Heim: “Frontier AI development relies on powerful AI supercomputers, yet analysis of these systems is limited. We create a dataset of 500 AI supercomputers from 2019 to 2025 and analyze key trends in performance, power needs, hardware cost, ownership, and global distribution. We find that the computational performance of AI supercomputers has doubled every nine months, while hardware acquisition cost and power needs both doubled every year. The leading system in March 2025, xAI’s Colossus, used 200,000 AI chips, had a hardware cost of $7B, and required 300 MW of power, as much as 250,000 households. As AI supercomputers evolved from tools for science to industrial machines, companies rapidly expanded their share of total AI supercomputer performance, while the share of governments and academia diminished. Globally, the United States accounts for about 75% of total performance in our dataset, with China in second place at 15%. If the observed trends continue, the leading AI supercomputer in 2030 will achieve 2×10²² 16-bit FLOP/s, use two million AI chips, have a hardware cost of $200 billion, and require 9 GW of power. Our analysis provides visibility into the AI supercomputer landscape, allowing policymakers to assess key AI trends like resource needs, ownership, and national competitiveness…(More)”.
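
The 2030 figures follow from simple compound growth. Here is a back-of-the-envelope sketch using the doubling times and the Colossus baseline quoted above; the paper’s own fitting method likely differs, so the rounded outputs will not match its figures exactly.

```python
# Rough extrapolation from the March 2025 leader (xAI's Colossus) to 2030,
# using the doubling times reported in the paper. Illustrative only: the
# paper's projection method may differ, so expect small discrepancies.
YEARS_AHEAD = 5.0  # March 2025 -> early 2030

def grow(value: float, doubling_time_years: float, years: float) -> float:
    """Compound growth: value doubles every `doubling_time_years`."""
    return value * 2 ** (years / doubling_time_years)

cost_billions = grow(7, 1.0, YEARS_AHEAD)      # hardware cost doubles yearly
power_mw = grow(300, 1.0, YEARS_AHEAD)         # power needs double yearly
perf_multiple = grow(1, 9 / 12, YEARS_AHEAD)   # performance doubles every 9 months

print(f"~${cost_billions:.0f}B hardware, ~{power_mw / 1000:.1f} GW, "
      f"~{perf_multiple:.0f}x the 2025 leader's performance")
# -> ~$224B hardware, ~9.6 GW, ~102x; close to the paper's ~$200B and 9 GW
```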