Will we run out of data? Limits of LLM scaling based on human-generated data


Paper by Pablo Villalobos: “We investigate the potential constraints on LLM scaling posed by the availability of public human-generated text data. We forecast the growing demand for training data based on current trends and estimate the total stock of public human text data. Our findings indicate that if current LLM development trends continue, models will be trained on datasets roughly equal in size to the available stock of public human text data between 2026 and 2032, or slightly earlier if models are overtrained. We explore how progress in language modeling can continue when human-generated text datasets cannot be scaled any further. We argue that synthetic data generation, transfer learning from data-rich domains, and data efficiency improvements might support further progress…(More)”.
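
The forecast logic in the abstract above (dataset sizes growing over time until they meet a fixed stock of text) can be sketched roughly as follows. This is only an illustration: the function name, the constant-growth assumption, and the example numbers are hypothetical and are not taken from the paper, whose own model may differ.

```python
import math


def years_until_stock_reached(dataset_size, stock_size, annual_growth):
    """Years until an exponentially growing dataset matches a fixed stock.

    Assumes (hypothetically) a constant growth factor per year, so that
    dataset_size * annual_growth ** t == stock_size at the returned t.
    Sizes can be in any unit as long as both use the same one.
    """
    if dataset_size <= 0 or stock_size <= 0:
        raise ValueError("sizes must be positive")
    if annual_growth <= 1:
        raise ValueError("annual_growth must be greater than 1")
    if dataset_size >= stock_size:
        return 0.0  # the stock has already been reached
    return math.log(stock_size / dataset_size) / math.log(annual_growth)


# Made-up inputs: a dataset 100x smaller than the stock, doubling yearly.
print(round(years_until_stock_reached(1.0, 100.0, 2.0), 2))  # 6.64
```

Under these assumptions, a faster growth factor shortens the time to the crossover, which is one way to read the abstract's remark that overtraining brings the date slightly earlier.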

What does it mean to be good? The normative and metaethical problem with ‘AI for good’


Article by Tom Stenson: “Using AI for good is an imperative for its development and regulation, but what exactly does it mean? This article contends that ‘AI for good’ is a powerful normative concept and is problematic for the ethics of AI because it oversimplifies complex philosophical questions in defining good and assumes a level of moral knowledge and certainty that may not be justified. ‘AI for good’ expresses a value judgement on what AI should be and its role in society, thereby functioning as a normative concept in AI ethics. As a moral statement, AI for good makes two things implicit: i) we know what a good outcome is and ii) we know the process by which to achieve it. By examining these two claims, this article will articulate the thesis that ‘AI for good’ should be examined as a normative and metaethical problem for AI ethics. Furthermore, it argues that we need to pay more attention to our relationship with normativity and how it guides what we believe the ‘work’ of ethical AI should be…(More)”.

Building SimCity: How to Put the World in a Machine


Book by Chaim Gingold: “…explores the history of computer simulation by chronicling one of the most influential simulation games ever made: SimCity. As author Chaim Gingold explains, Will Wright, the visionary designer behind the urban planning game, created SimCity in part to learn about cities, thinking about the world as a complex system and appropriating ideas from traditions in which computers are used for modeling. As such, SimCity is a microcosm of the histories and cultures of computer simulation that engages with questions, themes, and representational techniques that reach back to the earliest computer simulations.

Gingold uses SimCity to explore a web of interrelated topics in the history of technology, software, and simulation, taking us far and wide—from the dawn of programmable computers to miniature cities made of construction paper and role-play. An unprecedented history of Maxis, the company founded to bring SimCity to market, the book reveals Maxis’s complex relations with venture capitalists, Nintendo, and the Santa Fe Institute, which shaped the evolution of Will Wright’s career; Maxis’s failure to back The Sims to completion; and the company’s sale to Electronic Arts.

A lavishly visual book, Building SimCity boasts a treasure trove of visual matter to help bring its wide-ranging subjects to life, including painstakingly crafted diagrams that explain SimCity’s operation, the Kodachrome photographs taken by Charles Eames of schoolchildren making model cities, and Nintendo’s manga-style “Dr. Wright” character design, just to name a few…(More)”.

Uganda’s Sweeping Surveillance State Is Built on National ID Cards


Article by Olivia Solon: “Uganda has spent hundreds of millions of dollars in the past decade on biometric tools that document a person’s unique physical characteristics, such as their face, fingerprints and irises, to form the basis of a comprehensive identification system. While the system is central to many of the state’s everyday functions, as Museveni has grown increasingly authoritarian over nearly four decades in power, it has also become a powerful mechanism for surveilling politicians, journalists, human rights advocates and ordinary citizens, according to dozens of interviews and hundreds of pages of documents obtained and analyzed by Bloomberg and nonprofit investigative newsroom Lighthouse Reports.

It’s a cautionary tale for any country considering establishing a biometric identity system without rigorous checks and balances and input from civil society. Dozens of global south countries have adopted this approach as part of an effort to meet sustainable development goals from the UN, which considers having a legal identity to be a fundamental human right. But, despite billions of dollars of investment, with backing from organizations including the World Bank, those identity systems haven’t always lived up to expectations. In many cases, the key problem is the failure to register large swathes of the population, leading to exclusion from public services. But in other places, like Uganda, inclusion in the system has been weaponized for surveillance purposes.

A year-long investigation by Bloomberg and Lighthouse Reports sheds new light on the ways in which Museveni’s regime has built and deployed this system to target opponents and consolidate power. It shows how the underlying software and data sets are easily accessed by individuals at all levels of law enforcement, despite official claims to the contrary. It also highlights, in some cases for the first time, how senior government and law enforcement officials have used these tools to target individuals deemed to pose a political threat…(More)”.

Using ChatGPT for analytics


Paper by Aleksei Turobov et al: “The utilisation of AI-driven tools, notably ChatGPT (Generative Pre-trained Transformer), within academic research is increasingly debated from several perspectives including ease of implementation, and potential enhancements in research efficiency, as against ethical concerns and risks such as biases and unexplained AI operations. This paper explores the use of the GPT model for initial coding in qualitative thematic analysis using a sample of United Nations (UN) policy documents. The primary aim of this study is to contribute to the methodological discussion regarding the integration of AI tools, offering a practical guide to validation for using GPT as a collaborative research assistant. The paper outlines the advantages and limitations of this methodology and suggests strategies to mitigate risks. Emphasising the importance of transparency and reliability in employing GPT within research methodologies, this paper argues for a balanced use of AI in supported thematic analysis, highlighting its potential to elevate research efficacy and outcomes…(More)”.

Unmasking and Quantifying Power Structures: How Network Analysis Enhances Peace and State-Building Efforts


Blog by Issa Luna Pla: “Critiques of peace and state-building efforts have pointed out the inadequate grasp of the origins of conflict, political unrest, and the intricate dynamics of criminal and illicit networks (Holt and Bouch, 2009; Cockayne and Lupel, 2011). This limited understanding has failed to sufficiently weaken their economic and political influence or effectively curb their activities and objectives. A recent study highlights that although punitive approaches may have temporarily diminished the power of these networks, the absence of robust analytical tools has made it difficult to assess the enduring impact of these strategies.

1. Application of Network Analytics in State-Building

The importance of analytics in international peace and state-building operations is becoming increasingly recognized (O’Brien, 2010; Gnanguenon, 2021; Rød et al., 2023). Analytics, particularly network analysis, plays a crucial role in dissecting and dismantling complex power structures that often undermine peace initiatives and governance reforms. This analytical approach is crucial for revealing and disrupting the entrenched networks that sustain ongoing conflicts or obstruct peace processes. From the experiences in Guatemala, three significant lessons have been learned regarding the need for analytics for regional and thematic priorities in such operations (Waxenecker, 2019). These insights are vital for understanding how to tailor analytical strategies to address specific challenges in conflict-affected areas.

  1. The effectiveness of the International Commission CICIG in dismantling criminal networks was constrained by its lack of advanced analytical tools. This limitation prevented a deeper exploration of the conflicts’ roots and hindered the assessment of the long-term impacts of its strategies. While the CICIG had a systematic approach to understanding criminal networks from a contextual and legal perspective, its action plans lacked comprehensive statistical analytics methodologies, leading to missed opportunities in targeting key strategic players within these networks. High-level arrests were based on available evidence and charges that prosecutors could substantiate, rather than a strategic analysis of actors’ roles and influences within the networks’ dynamics.
  2. Furthermore, the extent of network dismantlement and the lasting effects of imprisonment and financial control of the illicit groups’ assets remain unclear, highlighting the need for predictive analytics to anticipate conflicts and sustainability. Such tools could enable operations to forecast potential disruptions or stability, allowing for data-driven proactive measures to prevent violence or bolster peace.
  3. Lastly, insights derived from network analysis suggest that efforts should focus on enhancing diplomatic negotiations, promoting economic development and social capital, and balancing punitive measures with strategic interventions. By understanding the dynamics and modeling group behavior in conflict zones, negotiations can be better informed by a deep and holistic comprehension of the underlying power structures and motivations. This approach could also help in forecasting recidivism, assessing risks of network reorganization, and evaluating the potential for increased armament, workforce, or empowerment, thereby facilitating more effective and sustainable peacebuilding initiatives.
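
To make the idea of identifying "key strategic players" through network analysis more concrete, here is a minimal sketch. It is not the method used by CICIG or by the cited studies: the toy graph, the actor labels, and the function names are hypothetical, and betweenness centrality (computed with Brandes' algorithm) is only one of several measures an analyst might choose.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def betweenness(adj):
    """Unnormalized betweenness centrality for an unweighted,
    undirected graph, using Brandes' algorithm."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        order = []                      # nodes in order of discovery
        preds = {v: [] for v in adj}    # predecessors on shortest paths
        sigma = {v: 0 for v in adj}     # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):       # accumulate from farthest nodes back
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2 for v, c in score.items()}


# Hypothetical network: two tight groups linked only through actor "D".
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"),
         ("D", "E"), ("E", "F"), ("E", "G"), ("F", "G")]
scores = betweenness(build_adjacency(edges))
for actor, value in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(actor, value)
```

In this toy graph, "D" has only two direct ties yet scores highest (9.0, against 8.0 for "C" and "E"), because every path between the two groups passes through it. That is the kind of broker role that a count of direct contacts alone would not reveal.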

2. Advancing Legal and Institutional Reforms

Utilizing data sciences in conflicted environments offers unique insights into the behavior of illicit networks and their interactions within the public and private sectors (Morselli et al., 2007; Leuprecht and Hall, 2014; Campedelli et al., 2019). This systematic approach, grounded in the analysis of years of illicit activities in Guatemala, highlights the necessity of rethinking traditional legal and institutional frameworks…(More)”.

Scraping the demos. Digitalization, web scraping and the democratic project


Paper by Lena Ulbricht: “Scientific, political and bureaucratic elites use epistemic practices like “big data analysis” and “web scraping” to create representations of the citizenry and to legitimize policymaking. I develop the concept of “demos scraping” for these practices of gaining information about citizens (the “demos”) through automated analysis of digital trace data which are re-purposed for political means. This article critically engages with the discourse advocating demos scraping and provides a conceptual analysis of its democratic implications. It engages with the promise of demos scraping advocates to reduce the gap between political elites and citizens and highlights how demos scraping is presented as a superior form of accessing the “will of the people” and to increase democratic legitimacy. This leads me to critically discuss the implications of demos scraping for political representation and participation. In its current form, demos scraping is technocratic and de-politicizing; and the larger political and economic context in which it takes place makes it unlikely that it will reduce the gap between elites and citizens. From the analytic perspective of a post-democratic turn, demos scraping is an attempt of late modern and digitalized societies to address the democratic paradox of increasing citizen expectations coupled with a deep legitimation crisis…(More)”.

Participation in the Age of Foundation Models


Paper by Harini Suresh et al: “Growing interest and investment in the capabilities of foundation models has positioned such systems to impact a wide array of services, from banking to healthcare. Alongside these opportunities is the risk that these systems reify existing power imbalances and cause disproportionate harm to historically marginalized groups. The larger scale and domain-agnostic manner in which these models operate further heightens the stakes: any errors or harms are liable to reoccur across use cases. In AI & ML more broadly, participatory approaches hold promise to lend agency and decision-making power to marginalized stakeholders, leading to systems that better benefit justice through equitable and distributed governance. But existing approaches in participatory AI/ML are typically grounded in a specific application and set of relevant stakeholders, and it is not straightforward how to apply these lessons to the context of foundation models. Our paper aims to fill this gap.
First, we examine existing attempts at incorporating participation into foundation models. We highlight the tension between participation and scale, demonstrating that it is intractable for impacted communities to meaningfully shape a foundation model that is intended to be universally applicable. In response, we develop a blueprint for participatory foundation models that identifies more local, application-oriented opportunities for meaningful participation. In addition to the “foundation” layer, our framework proposes the “subfloor” layer, in which stakeholders develop shared technical infrastructure, norms and governance for a grounded domain such as clinical care, journalism, or finance, and the “surface” (or application) layer, in which affected communities shape the use of a foundation model for a specific downstream task. The intermediate “subfloor” layer scopes the range of potential harms to consider, and affords communities more concrete avenues for deliberation and intervention. At the same time, it avoids duplicative effort by scaling input across relevant use cases. Through three case studies in clinical care, financial services, and journalism, we illustrate how this multi-layer model can create more meaningful opportunities for participation than solely intervening at the foundation layer…(More)”.

Missions with Impact: A practical guide to formulating effective missions


Guide by the Bertelsmann Stiftung: “The complex challenges associated with sustainability transitions pose major problems for modern political systems and raise the question of whether new ways of negotiation, decision-making and implementation are needed to address these challenges. For example, given the broad-reaching effects of an issue like climate change on diverse aspects of daily life, policy fields and action areas, conventional solutions are unlikely to prove effective.

Mission orientation proves to be a promising approach for addressing cross-cutting thematic challenges. It involves formulating well-defined “missions” intended to direct innovation, economic activities and societal initiatives toward desired outcomes. These missions aim for transformational change, targeting fundamental shifts that extend beyond the usual political timelines to ensure enduring impact. Across several OECD countries and at the EU level, initiatives embracing a mission-oriented approach are gaining momentum. For instance, the EU’s mission of “100 climate-neutral cities” exemplifies this approach by exploring new pathways to achieve climate neutrality by 2030. Here, stakeholders from diverse sectors can get involved to help generate effective solutions targeting the objective of climate neutrality…(More)”.

In the Land of the Unreal


Book by Lisa Messeri: “In the mid-2010s, a passionate community of Los Angeles-based storytellers, media artists, and tech innovators formed around virtual reality (VR), believing that it could remedy society’s ills. Lisa Messeri offers an ethnographic exploration of this community, which conceptualized VR as an “empathy machine” that could provide glimpses into diverse social realities. She outlines how, in the aftermath of #MeToo, the backlash against Silicon Valley, and the turmoil of the Trump administration, it was imagined that VR—if led by women and other marginalized voices—could bring about a better world. Messeri delves into the fantasies that allowed this vision to flourish, exposing the paradox of attempting to use a singular VR experience to mend a fractured reality full of multiple, conflicting social truths. She theorizes this dynamic as unreal, noting how dreams of empathy collide with reality’s irreducibility to a “common” good. With In the Land of the Unreal, Messeri navigates the intersection of place, technology, and social change to show that technology alone cannot upend systemic forces attached to gender and race…(More)”.