The Complexities of Differential Privacy for Survey Data


Paper by Jörg Drechsler & James Bailie: “The concept of differential privacy (DP) has gained substantial attention in recent years, most notably since the U.S. Census Bureau announced the adoption of the concept for its 2020 Decennial Census. However, despite its attractive theoretical properties, implementing DP in practice remains challenging, especially when it comes to survey data. In this paper we present some results from an ongoing project funded by the U.S. Census Bureau that is exploring the possibilities and limitations of DP for survey data. Specifically, we identify five aspects that need to be considered when adopting DP in the survey context: the multi-staged nature of data production; the limited privacy amplification from complex sampling designs; the implications of survey-weighted estimates; the weighting adjustments for nonresponse and other data deficiencies, and the imputation of missing values. We summarize the project’s key findings with respect to each of these aspects and also discuss some of the challenges that still need to be addressed before DP could become the new data protection standard at statistical agencies…(More)”.

Artificial Intelligence for the Internal Democracy of Political Parties


Paper by Claudio Novelli et al: “The article argues that AI can enhance the measurement and implementation of democratic processes within political parties, known as Intra-Party Democracy (IPD). It identifies the limitations of traditional methods for measuring IPD, which often rely on formal parameters, self-reported data, and tools like surveys. Such limitations lead to partial data collection, rare updates, and significant resource demands. To address these issues, the article suggests that specific data management and Machine Learning techniques, such as natural language processing and sentiment analysis, can improve the measurement and practice of IPD…(More)”.

Toward a citizen science framework for public policy evaluation


Paper by Giovanni Esposito et al: “This study pioneers the use of citizen science in evaluating Freedom of Information laws, with a focus on Belgium, where since its 1994 enactment, Freedom of Information’s effectiveness has remained largely unexamined. Utilizing participatory methods, it engages citizens in assessing transparency policies, significantly contributing to public policy evaluation methodology. The research identifies regional differences in Freedom of Information implementation across Belgian municipalities, highlighting that larger municipalities handle requests more effectively, while administrations generally show reluctance to respond to requests from perceived knowledgeable individuals. This phenomenon reflects a broader European caution toward well-informed requesters. By integrating citizen science, this study not only advances our understanding of Freedom of Information law effectiveness in Belgium but also advocates for a more inclusive, collaborative approach to policy evaluation. It addresses the gap in researchers’ experience with citizen science, showcasing its vast potential to enhance participatory governance and policy evaluation…(More)”.

Private sector trust in data sharing: enablers in the European Union


Paper by Jaime Bernal: “Enabling private sector trust stands as a critical policy challenge for the success of the EU Data Governance Act and Data Act in promoting data sharing to address societal challenges. This paper attributes the widespread trust deficit to the unmanageable uncertainty that arises from businesses’ limited usage control to protect their interests in the face of unacceptable perceived risks. For example, a firm may hesitate to share its data with others in case it is leaked and falls into the hands of business competitors. To illustrate this impasse, competition, privacy, and reputational risks are introduced, respectively, in the context of three suboptimal approaches to data sharing: data marketplaces, data collaboratives, and data philanthropy. The paper proceeds by analyzing seven trust-enabling mechanisms comprised of technological, legal, and organizational elements to balance trust, risk, and control and assessing their capacity to operate in a fair, equitable, and transparent manner. Finally, the paper examines the regulatory context in the EU and the advantages and limitations of voluntary and mandatory data sharing, concluding that an approach that effectively balances the two should be pursued…(More)”.

Collaboration in Healthcare: Implications of Data Sharing for Secondary Use in the European Union


Paper by Fanni Kertesz: “The European healthcare sector is transforming toward patient-centred and value-based healthcare delivery. The European Health Data Space (EHDS) Regulation aims to unlock the potential of health data by establishing a single market for its primary and secondary use. This paper examines the legal challenges associated with the secondary use of health data within the EHDS and offers recommendations for improvement. Key issues include the compatibility between the EHDS and the General Data Protection Regulation (GDPR), barriers to cross-border data sharing, and intellectual property concerns. Resolving these challenges is essential for realising the full potential of health data and advancing healthcare research and innovation within the EU…(More)”.

Geographies of missing data: Spatializing counterdata production against feminicide


Paper by Catherine D’Ignazio et al: “Feminicide is the gender-related killing of cisgender and transgender women and girls. It reflects patriarchal and racialized systems of oppression and reveals how territories and socio-economic landscapes configure everyday gender-related violence. In recent decades, many grassroots data production initiatives have emerged with the aim of monitoring this extreme but invisibilized phenomenon. We bridge scholarship in feminist and information geographies with data feminism to examine the ways in which space, broadly defined, shapes the counterdata production strategies of feminicide data activists. Drawing on a qualitative study of 33 monitoring efforts led by civil society organizations across 15 countries, primarily in Latin America, we provide a conceptual framework for examining the spatial dimensions of data activism. We show how there are striking transnational patterns related to where feminicide goes unrecorded, resulting in geographies of missing data. In response to these omissions, activists deploy multiple spatialized strategies to make these geographies visible, to situate and contextualize each case of feminicide, to reclaim databases as spaces for memory and witnessing, and to build transnational networks of solidarity. In this sense, we argue that data activism about feminicide constitutes a space of resistance and resignification of everyday forms of gender-related violence…(More)”.

DAOs of Collective Intelligence? Unraveling the Complexity of Blockchain Governance in Decentralized Autonomous Organizations


Paper by Mark C. Ballandies, Dino Carpentras, and Evangelos Pournaras: “Decentralized autonomous organizations (DAOs) have transformed organizational structures by shifting from traditional hierarchical control to decentralized approaches, leveraging blockchain and cryptoeconomics. Despite managing significant funds and building global networks, DAOs face challenges like declining participation, increasing centralization, and inabilities to adapt to changing environments, which stifle innovation. This paper explores DAOs as complex systems and applies complexity science to explain their inefficiencies. In particular, we discuss DAO challenges, their complex nature, and introduce the self-organization mechanisms of collective intelligence, digital democracy, and adaptation. By applying these mechansims to improve DAO design and construction, a practical design framework for DAOs is created. This contribution lays a foundation for future research at the intersection of complexity science and DAOs…(More)”.

Using internet search data as part of medical research


Blog by Susan Thomas and Matthew Thompson: “…In the UK, almost 50 million health-related searches are made using Google per year. Globally there are 100s of millions of health-related searches every day. And, of course, people are doing these searches in real-time, looking for answers to their concerns in the moment. It’s also possible that, even if people aren’t noticing and searching about changes to their health, their behaviour is changing. Maybe they are searching more at night because they are having difficulty sleeping or maybe they are spending more (or less) time online. Maybe an individual’s search history could actually be really useful for researchers. This realisation has led medical researchers to start to explore whether individuals’ online search activity could help provide those subtle, almost unnoticeable signals that point to the beginning of a serious illness.

Our recent review found 23 studies have been published so far that have done exactly this. These studies suggest that online search activity among people later diagnosed with a variety of conditions ranging from pancreatic cancer and stroke to mood disorders, was different to people who did not have one of these conditions.

One of these studies was published by researchers at Imperial College London, who used online search activity to identify signals of women with gynaecological malignancies. They found that women with malignant (e.g. ovarian cancer) and benign conditions had different search patterns, up to two months prior to a GP referral. 

Pause for a moment, and think about what this could mean. Ovarian cancer is one of the most devastating cancers women get. It’s desperately hard to detect early – and yet there are signals of this cancer visible in women’s internet searches months before diagnosis?…(More)”.

Even laypeople use legalese


Paper by Eric Martínez, Francis Mollica and Edward Gibson: “Whereas principles of communicative efficiency and legal doctrine dictate that laws be comprehensible to the common world, empirical evidence suggests legal documents are largely incomprehensible to lawyers and laypeople alike. Here, a corpus analysis (n = 59) million words) first replicated and extended prior work revealing laws to contain strikingly higher rates of complex syntactic structures relative to six baseline genres of English. Next, two preregistered text generation experiments (n = 286) tested two leading hypotheses regarding how these complex structures enter into legal documents in the first place. In line with the magic spell hypothesis, we found people tasked with writing official laws wrote in a more convoluted manner than when tasked with writing unofficial legal texts of equivalent conceptual complexity. Contrary to the copy-and-edit hypothesis, we did not find evidence that people editing a legal document wrote in a more convoluted manner than when writing the same document from scratch. From a cognitive perspective, these results suggest law to be a rare exception to the general tendency in human language toward communicative efficiency. In particular, these findings indicate law’s complexity to be derived from its performativity, whereby low-frequency structures may be inserted to signal law’s authoritative, world-state-altering nature, at the cost of increased processing demands on readers. From a law and policy perspective, these results suggest that the tension between the ubiquity and impenetrability of the law is not an inherent one, and that laws can be simplified without a loss or distortion of communicative content…(More)”.

Regulating the Direction of Innovation


Paper by Joshua S. Gans: “This paper examines the regulation of technological innovation direction under uncertainty about potential harms. We develop a model with two competing technological paths and analyze various regulatory interventions. Our findings show that market forces tend to inefficiently concentrate research on leading paths. We demonstrate that ex post regulatory instruments, particularly liability regimes, outperform ex ante restrictions in most scenarios. The optimal regulatory approach depends critically on the magnitude of potential harm relative to technological benefits. Our analysis reveals subtle insurance motives in resource allocation across research paths, challenging common intuitions about diversification. These insights have important implications for regulating emerging technologies like artificial intelligence, suggesting the need for flexible, adaptive regulatory frameworks…(More)”.