Article by Stefaan G. Verhulst: “…For years, public interest advocates and other defenders of freedom on the Internet used “open” as a rallying cry. Open data. Open science. Open government. The idea was simple and noble: Knowledge should be shared freely, accessibly, and transparently to empower citizens, accelerate discovery, and improve governance.

For a time, this vision made sense, even if it was imperfectly implemented. But as with many well-intentioned revolutions, openness has more recently been weaponised. What began as a movement to democratise knowledge has instead become justification for a new kind of extraction — this time not of oil or minerals, but of meaning. This phenomenon has become especially evident with the rise of generative AI, which relies on its voracious appetite for public data to train its models and refine its predictions. In the process, the very datasets, research repositories, and public web archives that were designed to serve the public interest have been harvested to train the large language models now controlled by a few corporations in a handful of countries.

The situation is dire but it is not hopeless. In what follows, we describe the problem in greater detail, outline the insufficiency of current mechanisms, and then discuss some possible mitigating responses…(More)”.