Data Rivers: Carving Out the Public Domain in the Age of Generative AI


Paper by Sylvie Delacroix: “What if the data ecosystems that made the advent of generative AI possible are being undermined by those very tools? For tools such as GPT4 (it is but one example of a tool made possible by scraping data from the internet), the erection of IP ‘fences’ is an existential threat. European and British regulators are alert to it: so-called ‘text and data mining’ exceptions are at the heart of intense debates. In the US, these debates are taking place in court hearings structured around ‘fair use’. While the concerns of the corporations developing these tools are being heard, there is currently no reliable mechanism for members of the public to exert influence on the (re)-balancing of the rights and responsibilities that shape our ‘data rivers’. Yet the existential threat that stems from restricted public access to such tools is arguably greater.

When it comes to re-balancing the data ecosystems that made generative AI possible, much can be learned from age-old river management practices, with one important proviso: data not only carries traces of our past. It is also a powerful tool to envisage different futures. If data-powered technologies such as GPT4 are to live up to their potential, we would do well to invest in bottom-up empowerment infrastructure. Such infrastructure could not only facilitate the valorisation of and participation in the public domain. It could also help steer the (re)-development of ‘copyright as privilege’ in a way that is better able to address the varied circumstances of today’s original content creators…”