The Emergent Landscape of Data Commons: A Brief Survey and Comparison of Existing Initiatives


Article by Stefaan G. Verhulst and Hannah Chafetz: “With the increased attention on the need for data to advance AI, data commons initiatives around the world are redefining how data can be accessed and re-used for societal benefit. These initiatives focus on providing access to data from various sources for a public purpose and are governed by the communities themselves. While diverse in focus, from health and mobility to language and environmental data, data commons are united by a common goal: democratizing access to data to fuel innovation and tackle global challenges.

This includes innovation in the context of artificial intelligence (AI). Data commons are providing the framework to make pools of diverse data available in machine-understandable formats for responsible AI development and deployment. By providing access to high-quality data sources with open licensing, data commons can help increase the quantity of training data in a less exploitative fashion, minimize AI providers’ reliance on data extracted across the internet without an open license, and increase the quality of AI output (while reducing misinformation).

Over the last few months, the Open Data Policy Lab (a collaboration between The GovLab and Microsoft) has conducted various research initiatives to explore these topics further and understand:

(1) how the concept of a data commons is changing in the context of artificial intelligence, and

(2) current efforts to advance the next generation of data commons.

In what follows we provide a summary of our findings thus far. We hope it inspires more data commons use cases for responsible AI innovation in the public’s interest…(More)”.

The Death of Search


Article by Matteo Wong: “For nearly two years, the world’s biggest tech companies have said that AI will transform the web, your life, and the world. But first, they are remaking the humble search engine.

Chatbots and search, in theory, are a perfect match. A standard Google search interprets a query and pulls up relevant results; tech companies have spent tens or hundreds of millions of dollars engineering chatbots that interpret human inputs, synthesize information, and provide fluent, useful responses. No more keyword refining or scouring Wikipedia—ChatGPT will do it all. Search is an appealing target, too: Shaping how people navigate the internet is tantamount to shaping the internet itself.

Months of prophesying about generative AI have now culminated, almost all at once, in what may be the clearest glimpse yet into the internet’s future. After a series of limited releases and product demos, mired in various setbacks and embarrassing errors, tech companies are debuting AI-powered search engines as fully realized, all-inclusive products. Last Monday, Google announced that it would launch its AI Overviews in more than 100 new countries; that feature will now reach more than 1 billion users a month. Days later, OpenAI announced a new search function in ChatGPT, available to paid users for now and soon opening to the public. The same afternoon, the AI-search start-up Perplexity shared instructions for making its “answer engine” the default search tool in your web browser.

For the past week, I have been using these products in a variety of ways: to research articles, follow the election, and run everyday search queries. In turn I have scried, as best I can, into the future of how billions of people will access, relate to, and synthesize information. What I’ve learned is that these products are at once unexpectedly convenient, frustrating, and weird. These tools’ current iterations surprised and, at times, impressed me, yet even when they work perfectly, I’m not convinced that AI search is a wise endeavor…(More)”.

Who Is Responsible for AI Copyright Infringement?


Article by Michael P. Goodyear: “Twenty-one-year-old college student Shane hopes to write a song for his boyfriend. In the past, Shane would have had to wait for inspiration to strike, but now he can use generative artificial intelligence to get a head start. Shane decides to use Anthropic’s AI chat system, Claude, to write the lyrics. Claude dutifully complies and creates the words to a love song. Shane, happy with the result, adds notes, rhythm, tempo, and dynamics. He sings the song and his boyfriend loves it. Shane even decides to post a recording to YouTube, where it garners 100,000 views.

But Shane did not realize that this song’s lyrics are similar to those of “Love Story,” Taylor Swift’s hit 2008 song. Shane must now contend with copyright law, which protects original creative expression such as music. Copyright grants the rights owner the exclusive rights to reproduce, perform, and create derivatives of the copyrighted work, among other things. If others take such actions without permission, they can be liable for damages up to $150,000. So Shane could be on the hook for tens of thousands of dollars for copying Swift’s song.

Copyright law has surged into the news in the past few years as one of the most important legal challenges for generative AI tools like Claude—not for the output of these tools but for how they are trained. Over two dozen pending court cases grapple with the question of whether training generative AI systems on copyrighted works without compensating or getting permission from the creators is lawful or not. Answers to this question will shape a burgeoning AI industry that is predicted to be worth $1.3 trillion by 2032.

Yet there is another important question that few have asked: Who should be liable when a generative AI system creates a copyright-infringing output? Should the user be on the hook?…(More)”

Assessing potential future artificial intelligence risks, benefits and policy imperatives


OECD Report: “The swift evolution of AI technologies calls for policymakers to consider and proactively manage AI-driven change. The OECD’s Expert Group on AI Futures was established to help meet this need and anticipate AI developments and their potential impacts. Informed by insights from the Expert Group, this report distils research and expert insights on prospective AI benefits, risks and policy imperatives. It identifies ten priority benefits, such as accelerated scientific progress, productivity gains and better sense-making and forecasting. It discusses ten priority risks, such as facilitation of increasingly sophisticated cyberattacks; manipulation, disinformation, fraud and resulting harms to democracy; concentration of power; incidents in critical systems; and exacerbated inequality and poverty. Finally, it points to ten policy priorities, including establishing clearer liability rules, drawing AI “red lines”, investing in AI safety and ensuring adequate risk management procedures. The report reviews existing public policy and governance efforts and remaining gaps…(More)”.

Human-AI coevolution


Paper by Dino Pedreschi et al: “Human-AI coevolution, defined as a process in which humans and AI algorithms continuously influence each other, increasingly characterises our society, but is understudied in artificial intelligence and complexity science literature. Recommender systems and assistants play a prominent role in human-AI coevolution, as they permeate many facets of daily life and influence human choices through online platforms. The interaction between users and AI results in a potentially endless feedback loop, wherein users’ choices generate data to train AI models, which, in turn, shape subsequent user preferences. This human-AI feedback loop has peculiar characteristics compared to traditional human-machine interaction and gives rise to complex and often “unintended” systemic outcomes. This paper introduces human-AI coevolution as the cornerstone for a new field of study at the intersection between AI and complexity science focused on the theoretical, empirical, and mathematical investigation of the human-AI feedback loop. In doing so, we: (i) outline the pros and cons of existing methodologies and highlight shortcomings and potential ways for capturing feedback loop mechanisms; (ii) propose a reflection at the intersection between complexity science, AI and society; (iii) provide real-world examples for different human-AI ecosystems; and (iv) illustrate challenges to the creation of such a field of study, conceptualising them at increasing levels of abstraction, i.e., scientific, legal and socio-political…(More)”.
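The feedback loop the authors describe, where users’ choices train the model and the model’s recommendations then shape users’ choices, can be sketched as a toy simulation. This is an illustrative model under assumed parameters (a popularity-based recommender, a fixed probability of following its suggestion), not the paper’s methodology:

```python
import random

def simulate_feedback_loop(n_users=200, n_items=10, rounds=50,
                           follow_prob=0.7, seed=42):
    """Toy human-AI feedback loop: user choices become training data
    (per-item counts), and the trained model's recommendation in turn
    steers subsequent user choices."""
    rng = random.Random(seed)
    counts = [1] * n_items  # "training data": interaction count per item
    for _ in range(rounds):
        recommended = counts.index(max(counts))  # model: suggest the most popular item
        for _ in range(n_users):
            if rng.random() < follow_prob:
                choice = recommended             # user follows the recommendation
            else:
                choice = rng.randrange(n_items)  # user chooses independently
            counts[choice] += 1                  # choice feeds back into training data
    return max(counts) / sum(counts)             # attention share of the top item

# With the loop active, attention concentrates on one item far beyond the
# uniform baseline of 1/n_items; with follow_prob=0 it stays near uniform.
print(simulate_feedback_loop(follow_prob=0.7))
print(simulate_feedback_loop(follow_prob=0.0))
```

Even this minimal model exhibits the “unintended systemic outcome” the paper points to: a mild individual tendency to follow suggestions compounds, round after round, into strong collective concentration.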

What is ‘sovereign AI’ and why is the concept so appealing (and fraught)?


Article by John Letzing: “Denmark unveiled its own artificial intelligence supercomputer last month, funded by the proceeds of wildly popular Danish weight-loss drugs like Ozempic. It’s now one of several sovereign AI initiatives underway, which one CEO believes can “codify” a country’s culture, history, and collective intelligence – and become “the bedrock of modern economies.”

That particular CEO, Jensen Huang, happens to run a company selling the sort of chips needed to pursue sovereign AI – that is, to construct a domestic vintage of the technology, informed by troves of homegrown data and powered by the computing infrastructure necessary to turn that data into a strategic reserve of intellect…

It’s not surprising that countries are forging expansive plans to put their own stamp on AI. But big-ticket supercomputers and other costly resources aren’t feasible everywhere.

Training a large language model has gotten a lot more expensive lately; the funds required for the necessary hardware, energy, and staff may soon top $1 billion. Meanwhile, geopolitical friction over access to the advanced chips necessary for powerful AI systems could further warp the global playing field.

Even for countries with abundant resources and access, there are “sovereignty traps” to consider. Governments pushing ahead on sovereign AI could risk undermining global cooperation meant to ensure the technology is put to use in transparent and equitable ways. That might make it a lot less safe for everyone.

An example: a place using AI systems trained on a local set of values for its security may readily flag behaviour out of sync with those values as a threat…(More)”.

Code and Craft: How Generative AI Tools Facilitate Job Crafting in Software Development


Paper by Leonie Rebecca Freise et al: “The rapid evolution of the software development industry challenges developers to manage their diverse tasks effectively. Traditional assistant tools in software development often fall short of supporting developers efficiently. This paper explores how generative artificial intelligence (GAI) tools, such as GitHub Copilot or ChatGPT, facilitate job crafting—a process where employees reshape their jobs to meet evolving demands. By integrating GAI tools into workflows, software developers can focus more on creative problem-solving, enhancing job satisfaction, and fostering a more innovative work environment. This study investigates how GAI tools influence task, cognitive, and relational job crafting behaviors among software developers, examining their implications for professional growth and adaptability within the industry. The paper provides insights into the transformative impacts of GAI tools on software development job crafting practices, emphasizing their role in enabling developers to redefine their job functions…(More)”.

AI Analysis of Body Camera Videos Offers a Data-Driven Approach to Police Reform


Article by Ingrid Wickelgren: “But unless something tragic happens, body camera footage generally goes unseen. “We spend so much money collecting and storing this data, but it’s almost never used for anything,” says Benjamin Graham, a political scientist at the University of Southern California.

Graham is among a small number of scientists who are reimagining this footage as data rather than just evidence. Their work leverages advances in natural language processing, which relies on artificial intelligence, to automate the analysis of video transcripts of citizen-police interactions. The findings have enabled police departments to spot policing problems, find ways to fix them and determine whether the fixes improve behavior.
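The automated transcript analysis described here can be pictured with a deliberately simplified sketch. The published studies train statistical NLP models on annotated data; the lexicon and scoring below are purely hypothetical, meant only to show the shape of the approach (scoring officer utterances for markers associated with respectful speech):

```python
# Hypothetical marker lexicon -- the real studies learn such signals from
# annotated transcripts rather than using a hand-written word list.
RESPECT_MARKERS = {"please", "thank", "sir", "ma'am", "sorry"}

def respect_score(utterance: str) -> int:
    """Count respect markers in one officer utterance (toy measure)."""
    words = {w.strip(".,!?").lower() for w in utterance.split()}
    return len(words & RESPECT_MARKERS)

transcript = [
    "Good evening, sir, license and registration please.",
    "Hands on the wheel now.",
]
# Score each utterance; the first contains respect markers, the second none.
print([respect_score(u) for u in transcript])
```

Applied beat-by-beat across thousands of stops, aggregate scores like these are what let researchers compare how interactions unfold across encounters, which hand review of footage cannot do at scale.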

Only a small number of police agencies have opened their databases to researchers so far. But if this footage were analyzed routinely, it would be a “real game changer,” says Jennifer Eberhardt, a Stanford University psychologist, who pioneered this line of research. “We can see beat-by-beat, moment-by-moment how an interaction unfolds.”

In papers published over the past seven years, Eberhardt and her colleagues have examined body camera footage to reveal how police speak to white and Black people differently and what type of talk is likely to either gain a person’s trust or portend an undesirable outcome, such as handcuffing or arrest. The findings have refined and enhanced police training. In a study published in PNAS Nexus in September, the researchers showed that the new training changed officers’ behavior…(More)”.

The history of AI and power in government


Book chapter by Shirley Kempeneer: “…begins by examining the simultaneous development of statistics and the state. Drawing on the works of notable scholars like Alain Desrosières, Theodore Porter, James Scott, and Michel Foucault, the chapter explores measurement as a product of modernity. It discusses the politics and power of (large) numbers, through their ability to make societies legible and controllable, also in the context of colonialism. The chapter then discusses the shift from data to big data and how AI and the state, just like statistics and the state, are mutually constitutive. It zooms in on shifting power relations, discussing the militarization of society, the outsourcing of the state to tech contractors, the exploitation of human bodies under the guise of ‘automation’, and the oppression of vulnerable citizens. Where news media often focus on the power of AI, that is supposedly escaping our control, this chapter relocates power in AI-systems, building on the work of Kate Crawford, Bruno Latour, and Emily Bender…(More)”