Facilitating Data Flows through Data Collaboratives

“A Practical Guide to Designing Valuable, Accessible, and Responsible Data Collaboratives” by Uma Kalkar, Natalia González Alarcón, Arturo Muente Kunigami and Stefaan Verhulst: “Data is an indispensable asset in today’s society, but its production and sharing are subject to well-known market failures. Among these: neither economic nor academic markets efficiently reward costly data collection and quality assurance efforts; data providers cannot easily supervise the appropriate use of their data; and, correspondingly, users have weak incentives to pay for, acknowledge, and protect data that they receive from providers. Data collaboratives are a potential non-market solution to this problem, bringing together data providers and users to address these market failures. The governance frameworks for these collaboratives are varied and complex and their details are not widely known. This guide proposes a methodology and a set of common elements that facilitate experimentation and creation of collaborative environments. It offers guidance to governments on implementing effective data collaboratives as a means to promote data flows in Latin America and the Caribbean, harnessing their potential to design more effective services and improve public policies…(More)”.

The Good and Bad of Anticipating Migration

Article by Sara Marcucci, Stefaan Verhulst, María Esther Cervantes, Elena Wüllhorst: “This blog is the first in a series that will be published weekly, dedicated to exploring innovative anticipatory methods for migration policy. Over the coming weeks, we will delve into various aspects of these methods, including their value, challenges, taxonomy, and practical applications.

This first blog serves as an exploration of the value proposition and challenges inherent in innovative anticipatory methods for migration policy. We delve into the various reasons why these methods hold promise for informing more resilient and proactive migration policies. These reasons include evidence-based policy development, enabling policymakers to ground their decisions in empirical evidence and future projections. Decision-takers, users, and practitioners can benefit from anticipatory methods for policy evaluation and adaptation, resource allocation, the identification of root causes, and the facilitation of humanitarian aid through early warning systems. However, it’s vital to acknowledge the challenges associated with the adoption and implementation of these methods, ranging from conceptual concerns such as fossilization, unfalsifiability, and the legitimacy of preemptive intervention, to practical issues like interdisciplinary collaboration, data availability and quality, capacity building, and stakeholder engagement. As we navigate through these complexities, we aim to shed light on the potential and limitations of anticipatory methods in the context of migration policy, setting the stage for deeper explorations in the coming blogs of this series…(More)”.

Towards a Holistic EU Data Governance

SITRA Publication: “The European Union’s ambitious data strategy aims to establish the EU as a leader in a data-driven society by creating a single market for data while fully respecting European policies on privacy, data protection, and competition law. To achieve the strategy’s bold aims, Europe needs more practical business cases where data flows across the organisations.

Reliable data sharing requires new technical, governance and business solutions. Data spaces address these needs by providing soft infrastructure to enable trusted and easy data flows across organisational boundaries.

Striking the right balance between regulation and innovation will be critical to creating a supportive environment for data-sharing business cases to flourish. In this working paper, we take an in-depth look at the governance issues surrounding data sharing and data spaces.

Data sharing requires trust. Trust can be facilitated by effective governance, meaning the rules for data sharing. These rules come from different arenas. The European Commission is establishing new regulations related to data, and member states also have their laws and authorities that oversee data-sharing activities. Ultimately, data spaces need local rules to enable interoperability and foster trust between participants. The governance framework for data spaces is called a rulebook, which codifies legal, business, technical, and ethical rules for data sharing.

The extensive discussions and interviews with experts reveal confusion in the field. People developing data sharing in practice or otherwise involved in data governance issues struggle to know who does what and who decides what. Data spaces also struggle to create internal governance structures in line with the regulatory environment. The interviews conducted for this study indicate that coordination at the member state level could play a decisive role in coordinating the EU-level strategy with concrete local data space initiatives.

The root cause of many of the pain points we identify is the problem of gaps, duplication and overlapping of roles between the different actors at all levels. To address these challenges and cultivate effective governance, a holistic data governance framework is proposed. This framework combines the existing approach of rulebooks with a new tool called the rolebook, which serves as a register of roles and bodies involved in data sharing. The rolebook aims to increase clarity and empower stakeholders at all levels to understand the current data governance structures.

In conclusion, effective governance is crucial for the success of the EU data strategy and the development of data spaces. By implementing the proposed holistic data governance framework, the EU can promote trust, balanced regulation and innovation, and support the growth of data spaces across sectors…(More)”.

The emergence of non-personal data markets

Report by the Think Tank of the European Parliament: “The European Commission’s Data Strategy aims to create a single market for data, open to data from across the world, where personal and non-personal data, including sensitive business data, are secure. The EU Regulation on the free flow of non-personal data allows non-personal data to be stored and processed anywhere in the EU without unjustified restrictions, with limited exceptions based on grounds of public security. The creation of multiple common sector-specific European data spaces aims to ensure Europe’s global competitiveness and data sovereignty. The Data Act proposed by the Commission aims to remove barriers to data access for both consumers and businesses and to establish common rules to govern the sharing of data generated using connected products or related services.

The aim of the study is to provide an in-depth, comprehensive, and issue-specific analysis of the emergence of non-personal data markets in Europe. The study seeks to identify the potential value of the non-personal data market, potential challenges and solutions, and the legislative/policy measures necessary to facilitate the further development of non-personal data markets. The study also ranks the main non-personal data markets by size and growth rate and provides a sector-specific analysis for the mobility and transport, energy, and manufacturing sectors…(More)”.

Four Questions to Guide Decision-Making for Data Sharing and Integration

Paper by the Actionable Intelligence for Social Policy Center: “This paper presents a Four Question Framework to guide data integration partners in building a strong governance and legal foundation to support ethical data use. While this framework was developed based on work in the United States that routinely integrates public data, it is meant to be a simple, digestible tool that can be adapted to any context. The framework was developed through a series of public deliberation workgroups and 15 years of field experience working with a diversity of data integration efforts across the United States.
The Four Questions – Is this legal? Is this ethical? Is this a good idea? How do we know (and who decides)? – should be considered within an established data governance framework and alongside core partners to determine whether and how to move forward when building an Integrated Data System (IDS) and also at each stage of a specific data project. We discuss these questions in depth, with a particular focus on the role of governance in establishing legal and ethical data use. In addition, we provide example data governance structures from two IDS sites and hypothetical scenarios that illustrate key considerations for the Four Question Framework.
A robust governance process is essential for determining whether data sharing and integration is legal, ethical, and a good idea within the local context. This process is iterative and as relational as it is technical, which means authentic collaboration across partners should be prioritized at each stage of a data use project. The Four Questions serve as a guide for determining whether to undertake data sharing and integration and should be regularly revisited throughout the life of a project…(More)”.

Can Google Trends predict asylum-seekers’ destination choices?

Paper by Haodong Qi & Tuba Bircan: “Google Trends (GT) collates the volumes of search keywords over time and by geographical location. Such data could, in theory, provide insights into people’s ex ante intentions to migrate, and hence be useful for predictive analysis of future migration. Empirically, however, the predictive power of GT is sensitive: it may vary depending on geographical context, the search keywords selected for analysis, as well as Google’s market share and its users’ characteristics and search behavior, among others. Unlike most previous studies attempting to demonstrate the benefit of using GT for forecasting migration flows, this article addresses a critical but less discussed issue: when GT cannot enhance the performances of migration models. Using EUROSTAT statistics on first-time asylum applications and a set of push-pull indicators gathered from various data sources, we train three classes of gravity models that are commonly used in the migration literature, and examine how the inclusion of GT may affect models’ abilities to predict refugees’ destination choices. The results suggest that the effects of including GT are highly contingent on the complexity of different models. Specifically, GT can only improve the performance of relatively simple models, but not of those augmented by flow Fixed-Effects or by Auto-Regressive effects. These findings call for a more comprehensive analysis of the strengths and limitations of using GT, as well as other digital trace data, in the context of modeling and forecasting migration. It is our hope that this nuanced perspective can spur further innovations in the field, and ultimately bring us closer to a comprehensive modeling framework of human migration…(More)”.
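
For readers unfamiliar with gravity models, the sketch below shows the general log-linear form such models often take, with an optional search-interest term added. It is an illustration only and not the authors' specification: the function name, the choice of population and distance as predictors, and every coefficient value are assumptions made for this example.

```python
import math


def gravity_flow(origin_pop, dest_pop, distance_km, gt_index=None,
                 const=1.0, a=1.0, b=1.0, c=2.0, d=0.5):
    """Predicted flow under a generic log-linear gravity model.

    All coefficients (const, a, b, c, d) are hypothetical defaults,
    not estimates taken from the paper.
    """
    log_flow = (math.log(const)
                + a * math.log(origin_pop)
                + b * math.log(dest_pop)
                - c * math.log(distance_km))
    if gt_index is not None:
        # Optional search-interest term; log1p keeps a zero index valid.
        log_flow += d * math.log1p(gt_index)
    return math.exp(log_flow)
```

In this form, adding the search-interest term only rescales the baseline prediction, which is one way to see why a richer baseline (for example, one with fixed effects) may leave less for such a term to explain.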

Essential requirements for the governance and management of data trusts, data repositories, and other data collaborations

Paper by Alison Paprica et al: “Around the world, many organisations are working on ways to increase the use, sharing, and reuse of person-level data for research, evaluation, planning, and innovation while ensuring that data are secure and privacy is protected. As a contribution to broader efforts to improve data governance and management, in 2020 members of our team published 12 minimum specification essential requirements (min specs) to provide practical guidance for organisations establishing or operating data trusts and other forms of data infrastructure… We convened an international team, consisting mostly of participants from Canada and the United States of America, to test and refine the original 12 min specs. Twenty-three (23) data-focused organisations and initiatives recorded the various ways they address the min specs. Sub-teams analysed the results, used the findings to make improvements to the min specs, and identified materials to support organisations/initiatives in addressing the min specs.
Analyses and discussion led to an updated set of 15 min specs covering five categories: one min spec for Legal, five for Governance, four for Management, two for Data Users, and three for Stakeholder & Public Engagement. Multiple changes were made to make the min specs language more technically complete and precise. The updated set of 15 min specs has been integrated into a Canadian national standard that, to our knowledge, is the first to include requirements for public engagement and Indigenous Data Sovereignty…(More)”.

Data Commons

Paper by R. V. Guha et al: “Publicly available data from open sources (e.g., United States Census Bureau (Census), World Health Organization (WHO), Intergovernmental Panel on Climate Change (IPCC)) are vital resources for policy makers, students and researchers across different disciplines. Combining data from different sources requires the user to reconcile the differences in schemas, formats, assumptions, and more. This data wrangling is time consuming, tedious and needs to be repeated by every user of the data. Our goal with Data Commons (DC) is to help make public data accessible and useful to those who want to understand this data and use it to solve societal challenges and opportunities. We do the data processing and make the processed data widely available via standard schemas and Cloud APIs. Data Commons is a distributed network of sites that publish data in a common schema and interoperate using the Data Commons APIs. Data from different Data Commons can be ‘joined’ easily. The aggregate of these Data Commons can be viewed as a single Knowledge Graph. This Knowledge Graph can then be searched over using Natural Language questions utilizing advances in Large Language Models. This paper describes the architecture of Data Commons, some of the major deployments and highlights directions for future work…(More)”.
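
The idea of ‘joining’ data published in a common schema can be illustrated with a minimal sketch. It does not use the Data Commons APIs; the function name, the entity identifiers, and the placeholder values are all hypothetical, and the only assumption carried over from the abstract is that every source keys its records by the same entity identifier.

```python
def join_on_entity(*sources):
    """Merge per-entity records from several sources that share entity IDs."""
    merged = {}
    for source in sources:
        for entity_id, observations in source.items():
            # Records for the same entity are combined into one dictionary.
            merged.setdefault(entity_id, {}).update(observations)
    return merged


# Hypothetical placeholder records, keyed by a shared (made-up) entity ID.
census_like = {"geo/A": {"population": 100}, "geo/B": {"population": 50}}
health_like = {"geo/A": {"life_expectancy": 80}}

joined = join_on_entity(census_like, health_like)
```

Because both sources agree on the identifier, no per-user reconciliation step is needed before the merge, which is the data-wrangling cost the abstract says a common schema removes.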

Data Collaboratives

Policy Brief by Center for the Governance of Change: “Despite the abundance of data generated, it is becoming increasingly clear that its accessibility and advantages are not equitably or effectively distributed throughout society. Data asymmetries, driven in large part by deeply entrenched inequalities and lack of incentives by many public- and private-sector organizations to collaborate, are holding back the public good potential of data and hindering progress and innovation in key areas such as financial inclusion, health, and the future of work.

More (and better) collaboration is needed to address the data asymmetries that exist across society, but early efforts at opening data have fallen short of achieving their intended aims. In the EU, the proposed Data Act is seeking to address these shortcomings and make more data available for public use by setting up new rules on data sharing. However, critics say its current reading risks limiting the potential for delivering innovative solutions by failing to establish cross-sectoral data-sharing frameworks, leaving the issue of public data stewardship off the table, and avoiding the thorny question of business incentives.

This policy brief, based on Stefaan Verhulst’s recent policy paper for the Center for the Governance of Change, argues that data collaboratives, an emerging model of collaboration in which participants from different sectors exchange data to solve public problems, offer a promising solution to address these data asymmetries and contribute to a healthy data economy that can benefit society as a whole. However, data collaboratives require a systematic, sustainable, and responsible approach to be successful, with a particular focus on:

Establishing a new science of questions, to help identify the most pressing public and private challenges that can be addressed with data sharing.
Fostering a new profession of data stewards, to promote a culture of responsible sharing within organizations and recognize opportunities for productive collaboration.
Clarifying incentives, to bring the private sector to the table and help operationalize data collaboration, ideally with some sort of market-led compensation model.
Establishing a social license for data reuse, to promote trust among stakeholders through public engagement, data stewardship, and an enabling regulatory framework.
Becoming more data-driven about data, to improve our understanding of collaboration, build sustainable initiatives, and achieve project accountability…(More)”.

Sharing Health Data: The Why, the Will, and the Way Forward.

Book edited by Grossmann C, Chua PS, Ahmed M, et al.: “Sharing health data and information across stakeholder groups is the bedrock of a learning health system. As data and information are increasingly combined across various sources, their generative value to transform health, health care, and health equity increases significantly. Facilitating this potential is an escalating surge of digital technologies (i.e., cloud computing, broadband and wireless solutions, digital health technologies, and application programming interfaces [APIs]) that, with each successive generation, not only enhance data sharing, but also improve in their ability to preserve privacy and identify and mitigate cybersecurity risks. These technological advances, coupled with notable policy developments, new interoperability standards (particularly the Fast Healthcare Interoperability Resources [FHIR] standard), and the launch of innovative payment models within the last decade, have resulted in a greater recognition of the value of health data sharing among patients, providers, and researchers. Consequently, a number of data sharing collaborations are emerging across the health care ecosystem.

Unquestionably, the COVID-19 pandemic has had a catalytic effect on this trend. The criticality of swift data exchange became evident at the outset of the pandemic, when the scientific community sought answers about the novel SARS-CoV-2 virus and emerging disease. Then, as the crisis intensified, data sharing graduated from a research imperative to a societal one, with a clear need to urgently share and link data across multiple sectors and industries to curb the effects of the pandemic and prevent the next one.

In spite of these evolving attitudes toward data sharing and the ubiquity of data-sharing partnerships, barriers persist. The practice of health data sharing occurs unevenly, prominent in certain stakeholder communities while absent in others. A stark contrast is observed between the volume, speed, and frequency with which health data is aggregated and linked—oftentimes with non-traditional forms of health data—for marketing purposes, and the continuing challenges patients experience in contributing data to their own health records. In addition, there are varying levels of data sharing. Not all types of data are shared in the same manner and at the same level of granularity, creating a patchwork of information. As highlighted by the gaps observed in the haphazard and often inadequate sharing of race and ethnicity data during the pandemic, the consequences can be severe—impacting the allocation of much-needed resources and attention to marginalized communities. Therefore, it is important to recognize the value of data sharing in which stakeholder participation is equitable and comprehensive— not only for achieving a future ideal state in health care, but also for redressing long-standing inequities…(More)”