Wisdom of the Silicon Crowd: LLM Ensemble Prediction Capabilities Rival Human Crowd Accuracy


Paper by Philipp Schoenegger, Indre Tuminauskaite, Peter S. Park, and Philip E. Tetlock: “Human forecasting accuracy in practice relies on the ‘wisdom of the crowd’ effect, in which predictions about future events are significantly improved by aggregating across a crowd of individual forecasters. Past work on the forecasting ability of large language models (LLMs) suggests that frontier LLMs, as individual forecasters, underperform compared to the gold standard of a human crowd forecasting tournament aggregate. In Study 1, we expand this research by using an LLM ensemble approach consisting of a crowd of twelve LLMs. We compare the aggregated LLM predictions on 31 binary questions to those of a crowd of 925 human forecasters from a three-month forecasting tournament. Our preregistered main analysis shows that the LLM crowd outperforms a simple no-information benchmark and is not statistically different from the human crowd. In exploratory analyses, we find that these two approaches are equivalent with respect to medium-effect-size equivalence bounds. We also observe an acquiescence effect, with mean model predictions being significantly above 50%, despite an almost even split of positive and negative resolutions. Moreover, in Study 2, we test whether LLM predictions (of GPT-4 and Claude 2) can be improved by drawing on human cognitive output. We find that both models’ forecasting accuracy benefits from exposure to the median human prediction as information, improving accuracy by between 17% and 28%, though this leads to less accurate predictions than simply averaging human and machine forecasts. Our results suggest that LLMs can achieve forecasting accuracy rivaling that of human crowd forecasting tournaments via the simple, practically applicable method of forecast aggregation. This replicates the ‘wisdom of the crowd’ effect for LLMs, and opens up their use for a variety of applications throughout society…(More)”.
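The aggregation method the abstract describes is operationally simple. A minimal sketch of median aggregation and Brier scoring against the no-information 0.5 benchmark, using made-up forecast numbers rather than the paper's data:

```python
from statistics import median

def brier(prob: float, outcome: int) -> float:
    """Squared error of a probability forecast against a 0/1 resolution."""
    return (prob - outcome) ** 2

# Twelve hypothetical model forecasts for one binary question that resolved "yes" (1).
llm_forecasts = [0.55, 0.60, 0.72, 0.48, 0.65, 0.58, 0.70, 0.62, 0.51, 0.67, 0.59, 0.63]

crowd_p = median(llm_forecasts)  # simple aggregation across the model "crowd"
benchmark = 0.5                  # no-information forecast

print(brier(crowd_p, 1) < brier(benchmark, 1))  # the aggregate beats the benchmark here
```

Lower Brier scores are better; aggregation helps because idiosyncratic errors of individual forecasters partially cancel out in the median.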

Unconventional data, unprecedented insights: leveraging non-traditional data during a pandemic


Paper by Kaylin Bolt et al: “The COVID-19 pandemic prompted new interest in non-traditional data sources to inform response efforts and mitigate knowledge gaps. While non-traditional data offers some advantages over traditional data, it also raises concerns related to biases, representativity, informed consent and security vulnerabilities. This study focuses on three specific types of non-traditional data: mobility, social media, and participatory surveillance platform data. Qualitative results are presented on the successes, challenges, and recommendations of key informants who used these non-traditional data sources during the COVID-19 pandemic in Spain and Italy….

Non-traditional data proved valuable in providing rapid results and filling data gaps, especially when traditional data faced delays. Increased data access and innovative collaborative efforts across sectors facilitated its use. Challenges included unreliable access and data quality concerns, particularly the lack of comprehensive demographic and geographic information. To further leverage non-traditional data, participants recommended prioritizing data governance, establishing data brokers, and sustaining multi-institutional collaborations. The value of non-traditional data was perceived as underutilized in public health surveillance, program evaluation and policymaking. Participants saw opportunities to integrate them into public health systems with the necessary investments in data pipelines, infrastructure, and technical capacity…(More)”.

A complexity science approach to law and governance


Introduction to a Special Issue by Pierpaolo Vivo, Daniel M. Katz and J. B. Ruhl: “The premise of this Special Issue is that legal systems are complex adaptive systems, and thus complexity science can be usefully applied to improve understanding of how legal systems operate, perform and change over time. The articles that follow take this proposition as a given and act on it using a variety of methods applied to a broad array of legal system attributes and contexts. Yet not too long ago some prominent legal scholars expressed scepticism that this field of study would produce more than broad generalizations, if even that. To orient readers unfamiliar with this field and its history, here we offer a brief background on how using complexity science to study legal systems has advanced from claims of ‘pseudoscience’ status to a widely adopted mainstream method. We then situate and summarize the articles.

The focus of complexity science is complex adaptive systems (CAS), systems ‘in which large networks of components with no central control and simple rules of operation give rise to complex collective behavior, sophisticated information processing and adaptation via learning or evolution’. It is important to distinguish CAS from systems that are merely complicated, such as a combustion engine, or complex but non-adaptive, such as a hurricane. A forest or coastal ecosystem, for example, is a complex network of diverse physical and biological components which, with no central control, is highly adaptive over time…(More)”.

Blockchain and public service delivery: a lifetime cross-referenced model for e-government


Paper by Maxat Kassen: “The article presents the results of field studies, analysing the perspectives of blockchain developers on decentralised service delivery and elaborating on unique algorithms for lifetime ledgers that reliably and safely record e-government transactions in an intrinsically cross-referenced manner. New technological niches of service delivery and emerging models of related data management in the industry were proposed and elaborated, such as the generation of unique lifetime personal data profiles, blockchain-driven cross-referencing of e-government metadata, parallel maintenance of serviceable ledgers for data identifiers, and the phenomenon of blockchain ‘black holes’ to ensure reliable protection of important public, corporate and civic information…(More)”.

Situating Data Sets: Making Public Data Actionable for Housing Justice


Paper by Anh-Ton Tran et al: “Activists, governments and academics regularly advocate for more open data. But how is data made open, and for whom is it made useful and usable? In this paper, we investigate and describe the work of making eviction data open to tenant organizers. We do this through an ethnographic description of ongoing work with a local housing activist organization. This work combines observation, direct participation in data work, and creating media artifacts, specifically digital maps. Our interpretation is grounded in D’Ignazio and Klein’s Data Feminism, emphasizing standpoint theory. Through our analysis and discussion, we highlight how shifting positionalities from data intermediaries to data accomplices affects the design of data sets and maps. We provide HCI scholars with three design implications when situating data for grassroots organizers: becoming a domain beginner, striving for data actionability, and evaluating our design artifacts by the social relations they sustain rather than just their technical efficacy…(More)”.

Community views on the secondary use of general practice data: Findings from a mixed-methods study


Paper by Annette J. Braunack-Mayer et al: “General practice data, particularly when combined with hospital and other health service data through data linkage, are increasingly being used for quality assurance, evaluation, health service planning and research. Using general practice data is particularly important in countries where general practitioners (GPs) are the first and principal source of health care for most people.

Although there is broad public support for the secondary use of health data, there are good reasons to question whether this support extends to general practice settings. GP–patient relationships may be very personal and longstanding and the general practice health record can capture a large amount of information about patients. There is also the potential for multiple angles on patients’ lives: GPs often care for, or at least record information about, more than one generation of a family. These factors combine to amplify patients’ and GPs’ concerns about sharing patient data….

Adams et al. have developed a model of social licence, specifically in the context of sharing administrative data for health research, based on an analysis of the social licence literature and founded on two principal elements: trust and legitimacy. In this model, trust is founded on research enterprises being perceived as reliable and responsive, including in relation to privacy and security of information, and having regard to the community’s interests and well-being.

Transparency and accountability measures may be used to demonstrate trustworthiness and, as a consequence, to generate trust. Transparency involves a level of openness about the way data are handled and used as well as about the nature and outcomes of the research. Adams et al. note that lack of transparency can undermine trust. They also note that the quality of public engagement is important and that simply providing information is not sufficient. While this is one element of transparency, other elements such as accountability and collaboration are also part of the trusting, reflexive relationship necessary to establish and support social licence.

The second principal element, legitimacy, is founded on research enterprises conforming to the legal, cultural and social norms of society and, again, acting in the best interests of the community. In diverse communities with a range of views and interests, it is necessary to develop a broad consensus on what amounts to the common good through deliberative and collaborative processes.

Social licence cannot be assumed. It must be built through public discussion and engagement to avoid undermining the relationship of trust with health care providers and confidence in the confidentiality of health information…(More)”

Data, Privacy Laws and Firm Production: Evidence from the GDPR


Paper by Mert Demirer, Diego J. Jiménez Hernández, Dean Li & Sida Peng: “By regulating how firms collect, store, and use data, privacy laws may change the role of data in production and alter firm demand for information technology inputs. We study how firms respond to privacy laws in the context of the EU’s General Data Protection Regulation (GDPR) by using seven years of data from a large global cloud-computing provider. Our difference-in-difference estimates indicate that, in response to the GDPR, EU firms decreased data storage by 26% and data processing by 15% relative to comparable US firms, becoming less “data-intensive.” To estimate the costs of the GDPR for firms, we propose and estimate a production function where data and computation serve as inputs to the production of “information.” We find that data and computation are strong complements in production and that firm responses are consistent with the GDPR representing a 20% increase in the cost of data on average. Variation in the firm-level effects of the GDPR and industry-level exposure to data, however, drives significant heterogeneity in our estimates of the impact of the GDPR on production costs…(More)”
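The headline estimates come from comparing the change in EU firms' usage to the change among comparable US firms over the same period. A toy sketch of that difference-in-differences logic on log outcomes, with all numbers hypothetical rather than taken from the paper:

```python
import math

def did(treat_pre: float, treat_post: float, ctrl_pre: float, ctrl_post: float) -> float:
    """DiD estimate on log outcomes: (change for treated) - (change for control)."""
    return (math.log(treat_post) - math.log(treat_pre)) - (
        math.log(ctrl_post) - math.log(ctrl_pre)
    )

# Hypothetical mean data storage (TB) per firm group before/after the GDPR.
# Treated = EU firms; control = US firms.
effect = did(treat_pre=100, treat_post=80, ctrl_pre=100, ctrl_post=108)

print(f"{effect:.2f} log points")  # about a 26% relative decline in this toy example
```

Working in logs makes the estimate a relative (percentage-style) change, which is how effects like "decreased data storage by 26%" are typically reported.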

Manipulation by design


Article by Jan Trzaskowski: “Human behaviour is affected by architecture, including how online user interfaces are designed. The purpose of this article is to provide insights into the regulation of behaviour modification by the design of choice architecture in light of the European Union data protection law (GDPR) and marketing law (UCPD). It has become popular to use the term ‘dark pattern’ (also ‘deceptive practices’) to describe such practices in online environments. The term provides a framework for identifying and discussing ‘problematic’ design practices, but the definitions and descriptions are not sufficient in themselves to draw the fine line between legitimate (lawful) persuasion and unlawful manipulation, which requires an inquiry into agency, self-determination, regulation and legal interpretation. The main contribution of this article is to place manipulative design, including ‘dark patterns’, within the framework of persuasion (marketing), technology (persuasive technology) and law (privacy and marketing)…(More)”.

Does information about citizen participation initiatives increase political trust?


Paper by Martin Ardanaz, Susana Otálvaro-Ramírez, and Carlos Scartascini: “Participatory programs can reduce the informational and power asymmetries that engender mistrust. These programs, however, cannot include every citizen. Hence, it is important to evaluate if providing information about those programs could affect trust among those who do not participate. We assess the effect of an informational campaign about these programs in the context of a survey experiment conducted in the city of Buenos Aires, Argentina. Results show that providing detailed information about citizen involvement and outputs of a participatory budget initiative marginally shapes voters’ assessments of government performance and political trust. In particular, it increases voters’ perceptions about the benevolence and honesty of the government. Effects are larger for individuals with ex ante more negative views about the local government’s quality and they differ according to the respondents’ interpersonal trust and their beliefs about the ability of their communities to solve the type of collective-action problems that the program seeks to address. This article complements the literature that has examined the effects of participatory interventions on trust, and the literature that evaluates the role of information. The results in the article suggest that participatory budget programs could directly affect budget allocations and trust for those who participate, and those that are well-disseminated could also affect trust in the broader population. Because mistrustful individuals tend to shy away from demanding the government public goods that increase overall welfare, well-disseminated participatory budget programs could affect budget allocations directly and through their effect on trust…(More)”.

Applying AI to Rebuild Middle Class Jobs


Paper by David Autor: “While the utopian vision of the current Information Age was that computerization would flatten economic hierarchies by democratizing information, the opposite has occurred. Information, it turns out, is merely an input into a more consequential economic function, decision-making, which is the province of elite experts. The unique opportunity that AI offers to the labor market is to extend the relevance, reach, and value of human expertise. Because of AI’s capacity to weave information and rules with acquired experience to support decision-making, it can be applied to enable a larger set of workers possessing complementary knowledge to perform some of the higher-stakes decision-making tasks that are currently arrogated to elite experts, e.g., medical care to doctors, document production to lawyers, software coding to computer engineers, and undergraduate education to professors. My thesis is not a forecast but an argument about what is possible: AI, if used well, can assist with restoring the middle-skill, middle-class heart of the US labor market that has been hollowed out by automation and globalization…(More)”.