Paper by Yaniv Benhamou and Melanie Dulong de Rosnay: “Data are often subject to a multitude of rights (e.g. original works or personal data posted on social media, or collected through captcha, subject to copyright, database rights and data protection) and voluntarily shared through non-standardized, non-interoperable contractual terms. This leads to fragmented legal regimes and has become an even greater challenge in the AI era, for example when online platforms set their own Terms of Service (ToS) in business-to-consumer (B2C) relationships.
This article proposes standard terms that may apply to all kinds of data (including personal and mixed datasets subject to different legal regimes) based on the open data philosophy initially developed for Free and Open Source software and Creative Commons licenses for artistic and other copyrighted works. In a first part, we analyse how to extend open standard terms to all kinds of data (II). In a second part, we suggest combining these open standard terms with collective governance instruments, in particular data trusts, inspired by commons-based projects and by the century-old collective management of copyright (III). In a last part, after a few concluding remarks (IV), we propose a template “Open Data Commons Licenses” (ODCL) combining compulsory and optional elements to be selected by licensors, illustrated by pictograms and icons inspired by the bricks of Creative Commons licenses and legal design techniques (V).
This proposal addresses the bargaining power imbalance and information asymmetry (by offering the licensor the ability to decide the terms), and conceptualises contract law differently. It reverses the current logic of contract: instead of letting companies (licensees) impose their own ToS on users (licensors, i.e. the copyright owner, data subject or data producer), licensors will reclaim the ability to set their own terms for access to and use of data, by selecting standard terms. This should also allow the management of complex datasets, increase data sharing, and improve trust and control over the data. Like previous open licensing standards, the model is expected to lower transaction costs by reducing the need to develop and read new, complicated contractual terms. It can also spread the virality of open data to all data in the AI era, if any input data under such terms used for AI training purposes propagates its conditions to all aggregated and output data. In other words, any data distributed under our ODCL template will turn all outputs into more or less open data and foster a data commons ecosystem. Finally, instead of full openness, our model allows for restrictions outside of certain boundaries (e.g. authorized users and uses), in order to protect the commons and certain values. The model would need to be governed and monitored by a collective data trust…(More)”.