The tensions of data sharing for human rights: A modern slavery case study


Paper by Jamie Hancock et al: “There are calls for greater data sharing to address human rights issues. Advocates claim this will provide an evidence base to increase transparency, improve accountability, enhance decision-making, identify abuses, and offer remedies for rights violations. However, these well-intentioned efforts have been found to sometimes enable harms against the people they seek to protect. This paper shows that issues relating to fairness, accountability, and transparency (FAccT) in and around data sharing can produce such ‘ironic’ consequences. It does so using an empirical case study: efforts to tackle modern slavery and human trafficking in the UK. We draw on a qualitative analysis of expert interviews, workshops, ecosystem mapping exercises, and a desk-based review. The findings show how, in the UK, a large ecosystem of data providers, hubs, and users emerged to process and exchange data from across the country. We identify how issues including legal uncertainties, non-transparent sharing procedures, and limited accountability regarding downstream uses of data may undermine efforts to tackle modern slavery and place victims of abuses at risk of further harms. Our findings help explain why data sharing activities can have negative consequences for human rights, even within human rights initiatives. Moreover, our analysis offers a window into how FAccT principles for technology relate to the human rights implications of data sharing. Finally, we discuss why these tensions may be echoed in other areas where data sharing is pursued for human rights concerns, identifying common features which may lead to similar results, especially where sensitive data is shared to achieve social goods or policy objectives…(More)”.

Societal interaction plans—A tool for enhancing societal engagement of strategic research in Finland


Paper by Kirsi Pulkkinen, Timo Aarrevaara, Mikko Rask, and Markku Mattila: “…we investigate the practices and capacities that define successful societal interaction of research groups with stakeholders in mutually beneficial processes. We studied the Finnish Strategic Research Council’s (SRC) first funded projects through a dynamic governance lens. The aim of the paper is to explore how the societal interaction was designed and commenced at the onset of the projects in order to understand the logic through which the consortia expected broad impacts to occur. The Finnish SRC introduced a societal interaction plan (SIP) approach, which requires research consortia to consider societal interaction alongside research activities in a way that exceeds conventional research plans. Hence, the first SRC projects’ SIPs and the implemented activities and working logics discussed in the interviews provide a window into exploring how active societal interaction reflects the call for dynamic, sustainable practices and new capabilities to better link research to societal development. We found that the capacities of dynamic governance were implemented by integrating societal interaction into research, in particular through a ‘drizzling’ approach. In these emerging practices SIP designs function as platforms for the formation of communities of experts, rather than traditional project management models or mere communication tools. The research groups utilized the benefits of pooling academic knowledge and skills with other types of expertise for mutual gain. They embraced the limits of expertise and reached out to societal partners to truly broker knowledge, and exchange and develop capacities and perspectives to solve grand societal challenges…(More)”.

Will we run out of data? Limits of LLM scaling based on human-generated data


Paper by Pablo Villalobos et al.: “We investigate the potential constraints on LLM scaling posed by the availability of public human-generated text data. We forecast the growing demand for training data based on current trends and estimate the total stock of public human text data. Our findings indicate that if current LLM development trends continue, models will be trained on datasets roughly equal in size to the available stock of public human text data between 2026 and 2032, or slightly earlier if models are overtrained. We explore how progress in language modeling can continue when human-generated text datasets cannot be scaled any further. We argue that synthetic data generation, transfer learning from data-rich domains, and data efficiency improvements might support further progress…(More)”.
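
To make the abstract’s extrapolation concrete, here is a minimal back-of-the-envelope sketch of the projection logic: grow the training-set size at a constant annual factor and ask when it crosses a fixed stock of public human text. All constants below are illustrative placeholders, not the paper’s estimates.

```python
# Back-of-the-envelope sketch of the trend-extrapolation logic described above.
# The constants are assumed placeholders, not the paper's estimates.
import math

dataset_tokens_now = 15e12   # assumed size of a frontier training set (tokens)
annual_growth = 2.8          # assumed yearly multiplicative growth in dataset size
stock_tokens = 300e12        # assumed total stock of public human-generated text

# Solve dataset_tokens_now * annual_growth**t = stock_tokens for t.
years_to_exhaustion = math.log(stock_tokens / dataset_tokens_now) / math.log(annual_growth)
print(f"Stock fully used around {2024 + years_to_exhaustion:.0f}")
# -> roughly 2027 under these assumptions, inside the paper's 2026-2032 window;
#    overtraining (more tokens per parameter) would pull the date earlier.
```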

What does it mean to be good? The normative and metaethical problem with ‘AI for good’


Article by Tom Stenson: “Using AI for good is an imperative for its development and regulation, but what exactly does it mean? This article contends that ‘AI for good’ is a powerful normative concept and is problematic for the ethics of AI because it oversimplifies complex philosophical questions in defining good and assumes a level of moral knowledge and certainty that may not be justified. ‘AI for good’ expresses a value judgement on what AI should be and its role in society, thereby functioning as a normative concept in AI ethics. As a moral statement, ‘AI for good’ makes two things implicit: i) we know what a good outcome is and ii) we know the process by which to achieve it. By examining these two claims, this article will articulate the thesis that ‘AI for good’ should be examined as a normative and metaethical problem for AI ethics. Furthermore, it argues that we need to pay more attention to our relationship with normativity and how it guides what we believe the ‘work’ of ethical AI should be…(More)”.

Scraping the demos. Digitalization, web scraping and the democratic project


Paper by Lena Ulbricht: “Scientific, political and bureaucratic elites use epistemic practices like “big data analysis” and “web scraping” to create representations of the citizenry and to legitimize policymaking. I develop the concept of “demos scraping” for these practices of gaining information about citizens (the “demos”) through automated analysis of digital trace data which are re-purposed for political means. This article critically engages with the discourse advocating demos scraping and provides a conceptual analysis of its democratic implications. It engages with advocates’ promise that demos scraping will reduce the gap between political elites and citizens, and highlights how demos scraping is presented as a superior means of accessing the “will of the people” and increasing democratic legitimacy. This leads me to critically discuss the implications of demos scraping for political representation and participation. In its current form, demos scraping is technocratic and de-politicizing; and the larger political and economic context in which it takes place makes it unlikely that it will reduce the gap between elites and citizens. From the analytic perspective of a post-democratic turn, demos scraping is an attempt by late modern, digitalized societies to address the democratic paradox of increasing citizen expectations coupled with a deep legitimation crisis…(More)”.

Participation in the Age of Foundation Models


Paper by Harini Suresh et al: “Growing interest and investment in the capabilities of foundation models have positioned such systems to impact a wide array of services, from banking to healthcare. Alongside these opportunities is the risk that these systems reify existing power imbalances and cause disproportionate harm to historically marginalized groups. The larger scale and domain-agnostic manner in which these models operate further heightens the stakes: any errors or harms are liable to recur across use cases. In AI & ML more broadly, participatory approaches hold promise to lend agency and decision-making power to marginalized stakeholders, leading to systems that better benefit justice through equitable and distributed governance. But existing approaches in participatory AI/ML are typically grounded in a specific application and set of relevant stakeholders, and it is not straightforward how to apply these lessons to the context of foundation models. Our paper aims to fill this gap.
First, we examine existing attempts at incorporating participation into foundation models. We highlight the tension between participation and scale, demonstrating that it is intractable for impacted communities to meaningfully shape a foundation model that is intended to be universally applicable. In response, we develop a blueprint for participatory foundation models that identifies more local, application-oriented opportunities for meaningful participation. In addition to the “foundation” layer, our framework proposes the “subfloor” layer, in which stakeholders develop shared technical infrastructure, norms and governance for a grounded domain such as clinical care, journalism, or finance, and the “surface” (or application) layer, in which affected communities shape the use of a foundation model for a specific downstream task. The intermediate “subfloor” layer scopes the range of potential harms to consider, and affords communities more concrete avenues for deliberation and intervention. At the same time, it avoids duplicative effort by scaling input across relevant use cases. Through three case studies in clinical care, financial services, and journalism, we illustrate how this multi-layer model can create more meaningful opportunities for participation than solely intervening at the foundation layer…(More)”.
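
As a purely illustrative aid, the layering described above can be sketched as a simple data model in which many “surface” applications share one domain “subfloor”, and subfloors share one foundation. This is our sketch, not the authors’ artifact; all class names and fields are assumptions.

```python
# Schematic sketch of the three-layer participatory blueprint (our illustration;
# names and fields are assumptions, not the paper's code).
from dataclasses import dataclass, field

@dataclass
class Foundation:
    """General-purpose model layer, shared across all domains."""
    model_name: str

@dataclass
class Subfloor:
    """Domain-scoped infrastructure, norms and governance (e.g. clinical care)."""
    domain: str
    foundation: Foundation
    norms: list[str] = field(default_factory=list)         # shared rules for the domain
    stakeholders: list[str] = field(default_factory=list)  # who deliberates at this layer

@dataclass
class Surface:
    """A specific downstream application shaped by affected communities."""
    task: str
    subfloor: Subfloor
    community_input: list[str] = field(default_factory=list)

# One subfloor scopes harms and pools deliberation across many surface uses:
clinical = Subfloor("clinical care", Foundation("some-foundation-model"),
                    norms=["clinician review of model outputs"],
                    stakeholders=["patients", "clinicians", "hospital ethics board"])
triage_notes = Surface("summarising triage notes", clinical,
                       community_input=["nurses flag unsafe abbreviations"])
```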

“The Death of Wikipedia?” — Exploring the Impact of ChatGPT on Wikipedia Engagement


Paper by Neal Reeves, Wenjie Yin, Elena Simperl: “Wikipedia is one of the most popular websites in the world, serving as a major source of information and learning resource for millions of users worldwide. While motivations for its usage vary, prior research suggests shallow information gathering — looking up facts and information or answering questions — dominates over more in-depth usage. On 30 November 2022, ChatGPT was released to the public and has quickly become a popular source of information, serving as an effective question-answering and knowledge-gathering resource. Early indications have suggested that it may be drawing users away from traditional question-answering services such as Stack Overflow, raising the question of how it may have impacted Wikipedia. In this paper, we explore Wikipedia user metrics across four areas: page views, unique visitor numbers, edit counts and editor numbers within twelve language instances of Wikipedia. We perform pairwise comparisons of these metrics before and after the release of ChatGPT and implement a panel regression model to observe and quantify longer-term trends. We find no evidence of a fall in engagement across any of the four metrics, instead observing that page views and visitor numbers increased in the period following ChatGPT’s launch. However, we observe a lower increase in languages where ChatGPT was available than in languages where it was not, which may suggest ChatGPT’s availability limited growth in those languages. Our results contribute to the understanding of how emerging generative AI tools are disrupting the Web ecosystem…(More)”. See also: Are we entering a Data Winter? On the urgent need to preserve data access for the public interest.
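
To illustrate the shape of such an analysis, here is a hedged sketch of a two-way fixed-effects panel regression with a post-release × availability interaction, run on synthetic data. The language sample, dates and effect sizes are our assumptions, not the paper’s data or code.

```python
# Illustrative two-way fixed-effects panel regression on synthetic data,
# in the spirit of the design described in the abstract (not the authors' code).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
languages = ["en", "de", "it", "sw"]   # hypothetical sample of language editions
release = pd.Timestamp("2022-11-30")   # ChatGPT's public release

rows = []
for lang in languages:
    available = int(lang != "it")      # assume ChatGPT usable except in one edition
    for i, date in enumerate(pd.date_range("2022-06-05", periods=52, freq="W")):
        post = int(date >= release)
        # Synthetic page views: common trend + a bump only where ChatGPT is available.
        views = 1000 + 5 * i + 50 * post * available + rng.normal(0, 20)
        rows.append({"lang": lang, "week": i, "post": post,
                     "available": available, "page_views": views})
df = pd.DataFrame(rows)

# Language and week dummies absorb level differences and the common time trend;
# the post x available interaction estimates the differential post-release change
# in engagement for editions where ChatGPT could be used.
fit = smf.ols("page_views ~ post:available + C(lang) + C(week)", data=df).fit()
print(fit.params["post:available"])   # ~50 by construction
```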

Towards a pan-EU Freedom of Information Act? Harmonizing Access to Information in the EU through the internal market competence


Paper by Alberto Alemanno and Sébastien Fassiaux: “This paper examines whether – and on what basis – the EU may harmonise the right of access to information across the Union. It does so by examining the available legal bases established by relevant international obligations, such as those stemming from the Council of Europe, and by EU primary law. It demonstrates that neither the Council of Europe – through the European Convention on Human Rights and the more recent Tromsø Convention – nor the EU – through Article 41 of the EU Charter of Fundamental Rights – requires the EU to enact minimum standards of access to information. That Charter provision, combined with Articles 10 and 11 TEU, instead requires only the EU institutions – not the EU Member States – to ensure public access to documents, including legislative texts and meeting minutes. Regulation 1049/2001 was adopted on such a legal basis (originally Art. 255 TEC) and should be revised accordingly. The paper demonstrates that the most promising legal basis enabling the EU to proceed towards the harmonisation of access to information within the EU is offered by Article 114 TFEU. It argues that harmonising the conditions governing access to information across Member States would facilitate cross-border activities and trade, thus enhancing the internal market. Moreover, this would ensure equal access to information for all EU citizens and residents, irrespective of their location within the EU. Therefore, the question is not whether but how the EU may – under Article 114 TFEU – act to harmonise access to information. While the EU enjoys wide legislative discretion under Article 114(1) TFEU, that discretion is not absolute: it is subject to limits derived from fundamental rights and from principles such as proportionality, equality, and subsidiarity. Hence the need to design a type of harmonisation capable of preserving existing national FOIAs while strengthening the weakest ones. The only type of harmonisation fit for purpose would therefore be minimal, as opposed to maximal, merely defining the minimum conditions that each Member State’s national legislation governing access to information must meet…(More)”.

Determinants of behaviour and their efficacy as targets of behavioural change interventions


Paper by Dolores Albarracín, Bita Fayaz-Farkhad & Javier A. Granados Samayoa: “Unprecedented social, environmental, political and economic challenges — such as pandemics and epidemics, environmental degradation and community violence — require taking stock of how to promote behaviours that benefit individuals and society at large. In this Review, we synthesize multidisciplinary meta-analyses of the individual and social-structural determinants of behaviour (for example, beliefs and norms, respectively) and the efficacy of behavioural change interventions that target them. We find that, across domains, interventions designed to change individual determinants can be ordered by increasing impact as those targeting knowledge, general skills, general attitudes, beliefs, emotions, behavioural skills, behavioural attitudes and habits. Interventions designed to change social-structural determinants can be ordered by increasing impact as legal and administrative sanctions; programmes that increase institutional trustworthiness; interventions to change injunctive norms; monitors and reminders; descriptive norm interventions; material incentives; social support provision; and policies that increase access to a particular behaviour. We find similar patterns for health and environmental behavioural change specifically. Thus, policymakers should focus on interventions that enable individuals to circumvent obstacles to enacting desirable behaviours rather than targeting salient but ineffective determinants of behaviour such as knowledge and beliefs…(More)”.

Artificial intelligence, the common good, and the democratic deficit in AI governance


Paper by Mark Coeckelbergh: “There is a broad consensus that artificial intelligence should contribute to the common good, but it is not clear what is meant by that. This paper discusses this issue and uses it as a lens for analysing what it calls the “democracy deficit” in current AI governance, which includes a tendency to deny the inherently political character of the issue and to take a technocratic shortcut. It indicates what we may agree on and what is and should be up to (further) deliberation when it comes to AI ethics and AI governance. Inspired by the republican tradition in political theory, it also argues for a more active role of citizens and (end-)users: not only as participants in deliberation but also in ensuring, creatively and communicatively, that AI contributes to the common good…(More)”.