Paper by Nikhil Agarwal, Alex Moehring, Pranav Rajpurkar & Tobias Salz: “While Artificial Intelligence (AI) algorithms have achieved performance levels comparable to human experts on various predictive tasks, human experts can still access valuable contextual information not yet incorporated into AI predictions. Humans assisted by AI predictions could outperform both human-alone or AI-alone. We conduct an experiment with professional radiologists that varies the availability of AI assistance and contextual information to study the effectiveness of human-AI collaboration and to investigate how to optimize it. Our findings reveal that (i) providing AI predictions does not uniformly increase diagnostic quality, and (ii) providing contextual information does increase quality. Radiologists do not fully capitalize on the potential gains from AI assistance because of large deviations from the benchmark Bayesian model with correct belief updating. The observed errors in belief updating can be explained by radiologists’ partially underweighting the AI’s information relative to their own and not accounting for the correlation between their own information and AI predictions. In light of these biases, we design a collaborative system between radiologists and AI. Our results demonstrate that, unless the documented mistakes can be corrected, the optimal solution involves assigning cases either to humans or to AI, but rarely to a human assisted by AI…(More)”.
How Good Are Privacy Guarantees? Platform Architecture and Violation of User Privacy
Paper by Daron Acemoglu, Alireza Fallah, Ali Makhdoumi, Azarakhsh Malekian & Asuman Ozdaglar: “Many platforms deploy data collected from users for a multitude of purposes. While some are beneficial to users, others are costly to their privacy. The presence of these privacy costs means that platforms may need to provide guarantees about how and to what extent user data will be harvested for activities such as targeted ads, individualized pricing, and sales to third parties. In this paper, we build a multi-stage model in which users decide whether to share their data based on privacy guarantees. We first introduce a novel mask-shuffle mechanism and prove it is Pareto optimal—meaning that it leaks the least about the users’ data for any given leakage about the underlying common parameter. We then show that under any mask-shuffle mechanism, there exists a unique equilibrium in which privacy guarantees balance privacy costs and utility gains from the pooling of user data for purposes such as assessment of health risks or product development. Paradoxically, we show that as users’ value of pooled data increases, the equilibrium of the game leads to lower user welfare. This is because platforms take advantage of this change to reduce privacy guarantees so much that user utility declines (whereas it would have increased with a given mechanism). Even more strikingly, we show that platforms have incentives to choose data architectures that systematically differ from those that are optimal from the user’s point of view. In particular, we identify a class of pivot mechanisms, linking individual privacy to choices by others, which platforms prefer to implement and which make users significantly worse off…(More)”.
Non-traditional data sources in obesity research: a systematic review of their use in the study of obesogenic environments
Paper by Julia Mariel Wirtz Baker, Sonia Alejandra Pou, Camila Niclis, Eugenia Haluszka & Laura Rosana Aballay: “The field of obesity epidemiology has made extensive use of traditional data sources, such as health surveys and reports from official national statistical systems, whose variety of data can be at times limited to explore a wider range of determinants relevant to obesity. Over time, other data sources began to be incorporated into obesity research, such as geospatial data (web mapping platforms, satellite imagery, and other databases embedded in Geographic Information Systems), social network data (such as Twitter, Facebook, Instagram, or other social networks), digital device data and others. The data revolution, facilitated by the massive use of digital devices with hundreds of millions of users and the emergence of the “Internet of Things” (IoT), has generated huge volumes of data from everywhere: customers, social networks and sensors, in addition to all the traditional sources mentioned above. In the research area, it offers fruitful opportunities, contributing in ways that traditionally sourced research data could not.
An international expert panel in obesity and big data pointed out some key factors in the definition of Big Data, stating that “it is always digital, has a large sample size, and a large volume or variety or velocity of variables that require additional computing power, as well as specialist skills in computer programming, database management and data science analytics”. Our interpretation of non-traditional data sources is an approximation to this definition, assuming that they are sources not traditionally used in obesity epidemiology and environmental studies, which can include digital devices, social media and geospatial data within a GIS, the latter mainly based on complex indexes that require advanced data analysis techniques and expertise.
Beyond the still discussed limitations, Big Data can be assumed as a great opportunity to improve the study of obesogenic environments, since it has been announced as a powerful resource that can provide new knowledge about human behaviour and social phenomena. Besides, it can contribute to the formulation and evaluation of policies and the development of interventions for obesity prevention. However, in this field of research, the suitability of these novel data sources is still a subject of considerable discussion, and their use has not been investigated from the obesogenic environment approach…(More)”.
Can AI language models replace human participants?
Paper by Danica Dillion et al: “Recent work suggests that language models such as GPT can make human-like judgments across a number of domains. We explore whether and when language models might replace human participants in psychological science. We review nascent research, provide a theoretical model, and outline caveats of using AI as a participant…(More)”
Diversity of Expertise is Key to Scientific Impact
Paper by Angelo Salatino, Simone Angioni, Francesco Osborne, Diego Reforgiato Recupero, Enrico Motta: “Understanding the relationship between the composition of a research team and the potential impact of their research papers is crucial as it can steer the development of new science policies for improving the research enterprise. Numerous studies assess how the characteristics and diversity of research teams can influence their performance across several dimensions: ethnicity, internationality, size, and others. In this paper, we explore the impact of diversity in terms of the authors’ expertise. To this purpose, we retrieved 114K papers in the field of Computer Science and analysed how the diversity of research fields within a research team relates to the number of citations their papers received in the upcoming 5 years. The results show that two different metrics we defined, reflecting the diversity of expertise, are significantly associated with the number of citations. This suggests that, at least in Computer Science, diversity of expertise is key to scientific impact…(More)”.
Data collaborations at a local scale: Lessons learnt in Rennes (2010–2021)
Paper by Simon Chignard and Marion Glatron: “Data sharing is a requisite for developing data-driven innovation and collaboration at the local scale. This paper aims to identify key lessons and recommendations for building trustworthy data governance at the local scale, including the public and private sectors. Our research is based on the experience gained in Rennes Metropole since 2010 and focuses on two thematic use cases: culture and energy. For each one, we analyzed how the power relations between actors and the local public authority shape the modalities of data sharing and exploitation. The paper will elaborate on challenges and opportunities at the local level, in perspective with the national and European frameworks…(More)”.
Artificial Intelligence, Big Data, Algorithmic Management, and Labor Law
Chapter by Pauline Kim: “Employers are increasingly relying on algorithms and AI to manage their workforces, using automated systems to recruit, screen, select, supervise, discipline, and even terminate employees. This chapter explores the effects of these systems on the rights of workers in standard work relationships, who are presumptively protected by labor laws. It examines how these new technological tools affect fundamental worker interests and how existing law applies, focusing primarily as examples on two particular concerns—nondiscrimination and privacy. Although current law provides some protections, legal doctrine has largely developed with human managers in mind, and as a result, fails to fully apprehend the risks posed by algorithmic tools. Thus, while anti-discrimination law prohibits discrimination by workplace algorithms, the existing framework has a number of gaps and uncertainties when applied to these systems. Similarly, traditional protections for employee privacy are ill-equipped to address the sheer volume and granularity of worker data that can now be collected, and the ability of computational techniques to extract new insights and infer sensitive information from that data. More generally, the expansion of algorithmic management affects other fundamental worker interests because it tends to increase employer power vis à vis labor. This chapter concludes by briefly considering the role that data protection laws might play in addressing the risks of algorithmic management…(More)”.
Opening industry data: The private sector’s role in addressing societal challenges
Paper by Jennifer Hansen and Yiu-Shing Pang: “This commentary explores the potential of private companies to advance scientific progress and solve social challenges through opening and sharing their data. Open data can accelerate scientific discoveries, foster collaboration, and promote long-term business success. However, concerns regarding data privacy and security can hinder data sharing. Companies have options to mitigate the challenges through developing data governance mechanisms, collaborating with stakeholders, communicating the benefits, and creating incentives for data sharing, among others. Ultimately, open data has immense potential to drive positive social impact and business value, and companies can explore solutions for their specific circumstances and tailor them to their specific needs…(More)”.
Turning the Cacophony of the Internet’s Tower of Babel into a Coherent General Collective Intelligence
Paper by Andy E. Williams: “Increasing the number, diversity, or uniformity of opinions in a group does not necessarily imply that those opinions will converge into a single more “intelligent” one, if an objective definition of the term intelligent exists as it applies to opinions. However, a recently developed approach called human-centric functional modeling provides what might be the first general model for individual or collective intelligence. In the case of the collective intelligence of groups, this model suggests how a cacophony of incoherent opinions in a large group might be combined into coherent collective reasoning by a hypothetical platform called “general collective intelligence” (GCI). When applied to solving group problems, a GCI might be considered a system that leverages collective reasoning to increase the beneficial insights that might be derived from the information available to any group. This GCI model also suggests how the collective reasoning ability (intelligence) might be exponentially increased compared to the intelligence of any individual in a group, potentially resulting in what is predicted to be a collective superintelligence….(More)”
Digital Sovereignty and Governance in the Data Economy: Data Trusteeship Instead of Property Rights on Data
Chapter by Ingrid Schneider: “This chapter challenges the current business models of the dominant platforms in the digital economy. In the search for alternatives, and towards the aim of achieving digital sovereignty, it proceeds in four steps: First, it discusses scholarly proposals to constitute a new intellectual property right on data. Second, it examines four models of data governance distilled from the literature that seek to see data administered (1) as a private good regulated by the market, (2) as a public good regulated by the state, (3) as a common good managed by a commons’ community, and (4) as a data trust supervised by means of stewardship by a trustee. Third, the strengths and weaknesses of each of these models, which are ideal types and serve as heuristics, are critically appraised. Fourth, data trusteeship which at present seems to be emerging as a promising implementation model for better data governance, is discussed in more detail, both in an empirical-descriptive way, by referring to initiatives in several countries, and analytically, by highlighting the challenges and pitfalls of data trusteeship…(More)”.