A Fourth Wave of Open Data? Exploring the Spectrum of Scenarios for Open Data and Generative AI


Report by Hannah Chafetz, Sampriti Saxena, and Stefaan G. Verhulst: “Since late 2022, generative AI services and large language models (LLMs) have transformed how many individuals access and process information. However, how generative AI and LLMs can be augmented with open data from official sources, and how open data can be made more accessible with generative AI – potentially enabling a Fourth Wave of Open Data – remains an underexplored area.

For these reasons, the Open Data Policy Lab (a collaboration between The GovLab and Microsoft) decided to explore the possible intersections between open data from official sources and generative AI. Throughout the last year, the team has conducted a range of research initiatives about the potential of open data and generative AI, including a panel discussion, interviews, and Open Data Action Labs – a series of design sprints with a diverse group of industry experts.

These initiatives were used to inform our latest report, “A Fourth Wave of Open Data? Exploring the Spectrum of Scenarios for Open Data and Generative AI,” (May 2024) which provides a new framework and recommendations to support open data providers and other interested parties in making open data “ready” for generative AI…

The report outlines five scenarios in which open data from official sources (e.g. open government and open research data) and generative AI can intersect. Each of these scenarios includes case studies from the field and a specific set of requirements that open data providers can focus on to become ready for a scenario. These include…(More)” (arXiv).
