Data Rivers: Carving Out the Public Domain in the Age of Generative AI


Paper by Sylvie Delacroix: “What if the data ecosystems that made the advent of generative AI possible are being undermined by those very tools? For tools such as GPT4 (it is but one example of a tool made possible by scraping data from the internet), the erection of IP ‘fences’ is an existential threat. European and British regulators are alert to it: so-called ‘text and data mining’ exceptions are at the heart of intense debates. In the US, these debates are taking place in court hearings structured around ‘fair use’. While the concerns of the corporations developing these tools are being heard, there is currently no reliable mechanism for members of the public to exert influence on the (re)-balancing of the rights and responsibilities that shape our ‘data rivers’. Yet the existential threat that stems from restricted public access to such tools is arguably greater.

When it comes to re-balancing the data ecosystems that made generative AI possible, much can be learned from age-old river management practices, with one important proviso: data not only carries traces of our past. It is also a powerful tool to envisage different futures. If data-powered technologies such as GPT4 are to live up to their potential, we would do well to invest in bottom-up empowerment infrastructure. Such infrastructure could not only facilitate the valorisation of and participation in the public domain. It could also help steer the (re)-development of ‘copyright as privilege’ in a way that is better able to address the varied circumstances of today’s original content creators…(More)”

Data property, data governance and Common European Data Spaces


Paper by Thomas Margoni, Charlotte Ducuing and Luca Schirru: “The Data Act proposal of February 2022 constitutes a central element of a broader and ambitious initiative of the European Commission (EC) to regulate the data economy through the erection of a new general regulatory framework for data and digital markets. The resulting framework may be represented as a model of governance between a pure market-driven model and a fully regulated approach, thereby combining elements that traditionally belong to private law (e.g., property rights, contracts) and public law (e.g., regulatory authorities, limitation of contractual freedom). This article discusses the role of (intellectual) property rights as well as of other forms of rights allocation in data legislation with particular attention to the Data Act proposal. We argue that the proposed Data Act has the potential to play a key role in the way in which data, especially privately held data, may be accessed, used, and shared. Nevertheless, it is only by looking at the whole body of data (and data related) legislation that the broader plan for a data economy can be grasped in its entirety. Additionally, the Data Act proposal may also arguably reveal the elements for a transition from a property-based to a governance-based paradigm in the EU data strategy. Whereas elements of data governance abound, the stickiness of property rights and rhetoric seem however hard to overcome. The resulting regulatory framework, at least for now, is therefore an interesting but not always perfectly coordinated mix of both. Finally, this article suggests that the Data Act Proposal may have missed the chance to properly address the issue of data holders’ power and related information asymmetries, as well as the need for coordination mechanisms…(More)”.

End of data sharing could make Covid-19 harder to control, experts and high-risk patients warn


Article by Sam Whitehead: “…The federal government’s public health emergency that’s been in effect since January 2020 expires May 11. The emergency declaration allowed for sweeping changes in the U.S. health care system, like requiring state and local health departments, hospitals, and commercial labs to regularly share data with federal officials.

But some shared data requirements will come to an end and the federal government will lose access to key metrics as a skeptical Congress seems unlikely to grant agencies additional powers. And private projects, like those from The New York Times and Johns Hopkins University, which made covid data understandable and useful for everyday people, stopped collecting data in March.

Public health legal scholars, data experts, former and current federal officials, and patients at high risk of severe covid outcomes worry the scaling back of data access could make it harder to control covid.

There have been improvements in recent years, such as major investments in public health infrastructure and updated data reporting requirements in some states. But concerns remain that the overall shambolic state of U.S. public health data infrastructure could hobble the response to any future threats.

“We’re all less safe when there’s not the national amassing of this information in a timely and coherent way,” said Anne Schuchat, former principal deputy director of the Centers for Disease Control and Prevention.

A lack of data in the early days of the pandemic left federal officials, like Schuchat, with an unclear picture of the rapidly spreading coronavirus. And even as the public health emergency opened the door for data-sharing, the CDC labored for months to expand its authority.

Eventually, more than a year into the pandemic, the CDC gained access to data from private health care settings, such as hospitals and nursing homes, commercial labs, and state and local health departments…(More)”. See also: Why we still need data to understand the COVID-19 pandemic

Harnessing Data Innovation for Migration Policy: A Handbook for Practitioners


Report by IOM: “The Practitioners’ Handbook provides first-hand insights into why and how non-traditional data sources can contribute to better understanding migration-related phenomena. The Handbook aims to (a) bridge the practical and technical aspects of using data innovations in migration statistics, (b) demonstrate the added value of using new data sources and innovative methodologies to analyse key migration topics that may be hard to fully grasp using traditional data sources, and (c) identify good practices in addressing issues of data access and collaboration with multiple stakeholders (including the private sector), ethical standards, and security and data protection issues…(More)” See also Big Data for Migration Alliance.

Whose data commons? Whose city?


Blog by Gijs van Maanen and Anna Artyushina: “In 2020, the notion of data commons became a staple of the new European Data Governance Strategy, which envisions data cooperatives as key players of the European Union’s (EU) emerging digital market. In this new legal landscape, public institutions, businesses, and citizens are expected to share their data with the licensed data-governance entities that will oversee its responsible reuse. In 2022, the Open Future Foundation released several white papers where the NGO (non-governmental organisation) detailed a vision for the publicly governed and funded EU level data commons. Some academic researchers see data commons as a way to break the data silos maintained and exploited by Big Tech and, potentially, dismantle surveillance capitalism.

In this blog post, we discuss data commons as a concept and practice. Our argument here is that, for data commons to become a (partial) solution to the issues caused by data monopolies, they need to be politicised. As smart city scholar Shannon Mattern pointedly argues, the city is not a computer. This means that digitization and datafication of our cities involves making choices about what is worth digitising and whose interests are prioritised. These choices and their implications must be foregrounded when we discuss data commons or any emerging forms of data governance. It is important to ask whose data is made common and, subsequently, whose city we will end up living in…(More)”

Data Cooperatives as Catalysts for Collaboration, Data Sharing, and the (Trans)Formation of the Digital Commons


Paper by Michael Max Bühler et al: “Network effects, economies of scale, and lock-in-effects increasingly lead to a concentration of digital resources and capabilities, hindering the free and equitable development of digital entrepreneurship (SDG9), new skills, and jobs (SDG8), especially in small communities (SDG11) and their small and medium-sized enterprises (“SMEs”). To ensure the affordability and accessibility of technologies, promote digital entrepreneurship and community well-being (SDG3), and protect digital rights, we propose data cooperatives [1,2] as a vehicle for secure, trusted, and sovereign data exchange [3,4]. In post-pandemic times, community/SME-led cooperatives can play a vital role by ensuring that supply chains to support digital commons are uninterrupted, resilient, and decentralized [5]. Digital commons and data sovereignty provide communities with affordable and easy access to information and the ability to collectively negotiate data-related decisions. Moreover, cooperative commons (a) provide access to the infrastructure that underpins the modern economy, (b) preserve property rights, and (c) ensure that privatization and monopolization do not further erode self-determination, especially in a world increasingly mediated by AI. Thus, governance plays a significant role in accelerating communities’/SMEs’ digital transformation and addressing their challenges. Cooperatives thrive on digital governance and standards such as open trusted Application Programming Interfaces (APIs) that increase the efficiency, technological capabilities, and capacities of participants and, most importantly, integrate, enable, and accelerate the digital transformation of SMEs in the overall process. This policy paper presents and discusses several transformative use cases for cooperative data governance. 
The use cases demonstrate how platform/data-cooperatives, and their novel value creation can be leveraged to take digital commons and value chains to a new level of collaboration while addressing the most pressing community issues. The proposed framework for a digital federated and sovereign reference architecture will create a blueprint for sustainable development both in the Global South and North…(More)”

Responding to the coronavirus disease-2019 pandemic with innovative data use: The role of data challenges


Paper by Jamie Danemayer, Andrew Young, Siobhan Green, Lydia Ezenwa and Michael Klein: “Innovative, responsible data use is a critical need in the global response to the coronavirus disease-2019 (COVID-19) pandemic. Yet potentially impactful data are often unavailable to those who could utilize it, particularly in data-poor settings, posing a serious barrier to effective pandemic mitigation. Data challenges, a public call-to-action for innovative data use projects, can identify and address these specific barriers. To understand gaps and progress relevant to effective data use in this context, this study thematically analyses three sets of qualitative data focused on/based in low/middle-income countries: (a) a survey of innovators responding to a data challenge, (b) a survey of organizers of data challenges, and (c) a focus group discussion with professionals using COVID-19 data for evidence-based decision-making. Data quality and accessibility and human resources/institutional capacity were frequently reported limitations to effective data use among innovators. New fit-for-purpose tools and the expansion of partnerships were the most frequently noted areas of progress. Discussion participants identified building capacity for external/national actors to understand the needs of local communities can address a lack of partnerships while de-siloing information. A synthesis of themes demonstrated that gaps, progress, and needs commonly identified by these groups are relevant beyond COVID-19, highlighting the importance of a healthy data ecosystem to address emerging threats. This is supported by data holders prioritizing the availability and accessibility of their data without causing harm; funders and policymakers committed to integrating innovations with existing physical, data, and policy infrastructure; and innovators designing sustainable, multi-use solutions based on principles of good data governance…(More)”.

Seize the Future by Harnessing the Power of Data


Essay by Kriss Deiglmeier: “…Data is a form of power. And the sad reality is that power is being held increasingly by the commercial sector and not by organizations seeking to create a more just, sustainable, and prosperous world. A year into my tenure as the chief global impact officer at Splunk, I became consumed with the new era driven by data. Specifically, I was concerned with the emerging data divide, which I defined as “the disparity between the expanding use of data to create commercial value, and the comparatively weak use of data to solve social and environmental challenges.”…

To effectively address the emerging data future, the social impact sector must build an entire impact data ecosystem for this moment in time—and the next moment in time. The way to do that is by investing in those areas where we currently lag the commercial sector. Consider the following gaps:

  • Nonprofits are ill-equipped with the financial and technical resources they need to make full use of data, often due to underfunding.
  • The sector’s technical and data talent is a desert compared to the commercial sector.
  • While the sector is rich with output and service-delivery data, that data is locked away or is unusable in its current form.
  • The sector lacks living data platforms (collaboratives and data refineries) that can make use of sector-wide data in a way that helps improve service delivery, maximize impact, and create radical innovation.

The harsh realities of the sector’s disparate data skills, infrastructure, and competencies show the dire current state. For the impact sector to transition to a place of power, it must jump without hesitation into the arena of the Data Age—and invest time, talent, and money in filling in these gaps.

Regardless of our lagging position, the social sector has both an incredible opportunity and a unique capacity to drive the power of data into the emerging and unimaginable. The good news is that there’s pivotal work already happening in the sector that is making it easier to build the kind of impact data ecosystem needed to join the Data Age. The framing and terms used to describe this work are many—data for good, data science for impact, open data, public interest technology, data lakes, ethical data, and artificial intelligence ethics.

These individual pieces, while important, are not enough. To fully exploit the power of data for a more just, sustainable, and prosperous world, we need to be bold enough to build the full ecosystem and not be satisfied with piecemeal work. To do that we should begin by looking at the assets that we have and build on those.

People. There are dedicated leaders in the field of social innovation who are committed to using data for impact and who have been doing that for many years. We need to support them by investing in their work at scale. The list of people leading the way is constantly growing, but to name a few: Stefaan G. Verhulst, Joy Buolamwini, Jim Fruchterman, Katara McCarty, Geoff Mulgan, Rediet Abebe, Jason Saul, and Jake Porway….(More)”.

Could a Global “Wicked Problems Agency” Incentivize Data Sharing?


Paper by Susan Ariel Aaronson: “Global data sharing could help solve “wicked” problems (problems such as climate change, terrorism and global poverty that no one knows how to solve without creating further problems). There is no one or best way to address wicked problems because they have many different causes and manifest in different contexts. By mixing vast troves of data, policy makers and researchers may find new insights and strategies to address these complex problems. National and international government agencies and large corporations generally control the use of such data, and the world has made little progress in encouraging cross-sectoral and international data sharing. This paper proposes a new international cloud-based organization, the “Wicked Problems Agency,” to catalyze both data sharing and data analysis in the interest of mitigating wicked problems. This organization would work to prod societal entities — firms, individuals, civil society groups and governments — to share and analyze various types of data. The Wicked Problems Agency could provide a practical example of how data sharing can yield both economic and public good benefits…(More)”.

An agenda for advancing trusted data collaboration in cities


Report by Hannah Chafetz, Sampriti Saxena, Adrienne Schmoeker, Stefaan G. Verhulst, & Andrew J. Zahuranec: “… Joined by experts across several domains including smart cities, the law, and data ecosystem, this effort was focused on developing solutions that could improve the design of Data Sharing Agreements…we assessed what is needed to implement each aspect of our Contractual Wheel of Data Collaboration–a tool developed as a part of the Contracts for Data Collaborations initiative that seeks to capture the elements involved in data collaborations and Data Sharing Agreements.

In what follows, we provide key suggestions from this Action Lab…

  1. The Elements of Principled Negotiations: Those seeking to develop a Data Sharing Agreement often struggle to work with collaborators or agree to common ends. There is a need for a common resource that Data Stewards can use to initiate a principled negotiation process. To address this need, we would identify the principles to inform negotiations and the elements that could help achieve those principles. For example, participants voiced a need for fairness, transparency, and reciprocity principles. These principles could be supported by having a shared language or outlining the minimum legal documents required for each party. The final product would be a checklist or visualization of principles and their associated elements.
  2. Data Responsibility Principles by Design: …
  3. Readiness Matrix: …
  4. A Decision Provenance Approach for Data Collaboration: …
  5. The Contractual Wheel of Data Collaboration 2.0
  6. A Repository of Legal Drafting Technologies:…(More)”.