Data Statements: From Technical Concept to Community Practice

Paper by Angelina McMillan-Major, Emily M. Bender, and Batya Friedman: “Responsible computing ultimately requires that technical communities develop and adopt tools, processes, and practices that mitigate harms and support human flourishing. Prior efforts toward the responsible development and use of datasets, machine learning models, and other technical systems have led to the creation of documentation toolkits to facilitate transparency, diagnosis, and inclusion. This work takes the next step: to catalyze community uptake, alongside toolkit improvement. Specifically, starting from one such proposed toolkit specialized for language datasets, data statements for natural language processing, we explore how to improve the toolkit in three senses: (1) the content of the toolkit itself, (2) engagement with professional practice, and (3) moving from a conceptual proposal to a tested schema that the intended community of use may readily adopt. To achieve these goals, we first conducted a workshop with natural language processing practitioners to identify gaps and limitations of the toolkit as well as to develop best practices for writing data statements, yielding an interim improved toolkit. Then we conducted an analytic comparison between the interim toolkit and another documentation toolkit, datasheets for datasets. Based on these two integrated processes, we present our revised Version 2 schema and best practices in a guide for writing data statements. Our findings more generally provide integrated processes for co-evolving both technology and practice to address ethical concerns within situated technical communities…(More)”

Green Light

Google Research: “Road transportation is responsible for a significant amount of global and urban greenhouse gas emissions. It is especially problematic at city intersections where pollution can be 29 times higher than on open roads.  At intersections, half of these emissions come from traffic accelerating after stopping. While some amount of stop-and-go traffic is unavoidable, part of it is preventable through the optimization of traffic light timing configurations. To improve traffic light timing, cities need to either install costly hardware or run manual vehicle counts; both of these solutions are expensive and don’t provide all the necessary information. 

Green Light uses AI and Google Maps driving trends, with one of the strongest understandings of global road networks, to model traffic patterns and build intelligent recommendations for city traffic engineers to optimize traffic flow. Early numbers indicate a potential for up to 30% reduction in stops and 10% reduction in greenhouse gas emissions (1). By optimizing each intersection, and coordinating between adjacent intersections, we can create waves of green lights and help cities further reduce stop-and-go traffic. Green Light is now live in 70 intersections in 12 cities, 4 continents, from Haifa, Israel to Bangalore, India to Hamburg, Germany – and in these intersections we are able to save fuel and lower emissions for up to 30M car rides monthly. Green Light reflects Google Research’s commitment to use AI to address climate change and improve millions of lives in cities around the world…(More)”

Artificial intelligence and the local government: A five-decade scientometric analysis on the evolution, state-of-the-art, and emerging trends

Paper by Tan Yigitcanlar et al: “In recent years, the rapid advancement of artificial intelligence (AI) technologies has significantly impacted various sectors, including public governance at the local level. However, there exists a limited understanding of the overarching narrative surrounding the adoption of AI in local governments and its future. Therefore, this study aims to provide a comprehensive overview of the evolution, current state-of-the-art, and emerging trends in the adoption of AI in local government. A comprehensive scientometric analysis was conducted on a dataset comprising 7112 relevant literature records retrieved from the Scopus database in October 2023, spanning over the last five decades. The study findings revealed the following key insights: (a) exponential technological advancements over the last decades ushered in an era of AI adoption by local governments; (b) the primary purposes of AI adoption in local governments include decision support, automation, prediction, and service delivery; (c) the main areas of AI adoption in local governments encompass planning, analytics, security, surveillance, energy, and modelling; and (d) under-researched but critical research areas include ethics of and public participation in AI adoption in local governments. This study informs research, policy, and practice by offering a comprehensive understanding of the literature on AI applications in local governments, providing valuable insights for stakeholders and decision-makers…(More)”.

Brazil hires OpenAI to cut costs of court battles

Article by Marcela Ayres and Bernardo Caram: “Brazil’s government is hiring OpenAI to expedite the screening and analysis of thousands of lawsuits using artificial intelligence (AI), trying to avoid costly court losses that have weighed on the federal budget.

The AI service will flag to government the need to act on lawsuits before final decisions, mapping trends and potential action areas for the solicitor general’s office (AGU).

AGU told Reuters that Microsoft would provide the artificial intelligence services from ChatGPT creator OpenAI through its Azure cloud-computing platform. It did not say how much Brazil will pay for the services.

Court-ordered debt payments have consumed a growing share of Brazil’s federal budget. The government estimated it would spend 70.7 billion reais ($13.2 billion) next year on judicial decisions where it can no longer appeal. The figure does not include small-value claims, which historically amount to around 30 billion reais annually.

The combined amount of over 100 billion reais represents a sharp increase from 37.3 billion reais in 2015. It is equivalent to about 1% of gross domestic product, or 15% more than the government expects to spend on unemployment insurance and wage bonuses to low-income workers next year.

AGU did not provide a reason for Brazil’s rising court costs…(More)”.

Artificial Intelligence Applications for Social Science Research

Report by Megan Stubbs-Richardson et al: “Our team developed a database of 250 Artificial Intelligence (AI) applications useful for social science research. To be included in our database, the AI tool had to be useful for: 1) literature reviews, summaries, or writing, 2) data collection, analysis, or visualizations, or 3) research dissemination. In the database, we provide a name, description, and links to each of the AI tools that were current at the time of publication on September 29, 2023. Supporting links were provided when an AI tool was found using other databases. To help users evaluate the potential usefulness of each tool, we documented information about costs, log-in requirements, and whether plug-ins or browser extensions are available for each tool. Finally, as we are a team of scientists who are also interested in studying social media data to understand social problems, we also documented when the AI tools were useful for text-based data, such as social media. This database includes 132 AI tools that may have use for literature reviews or writing; 146 tools that may have use for data collection, analyses, or visualizations; and 108 that may be used for dissemination efforts. While 170 of the AI tools within this database can be used for general research purposes, 18 are specific to social media data analyses, and 62 can be applied to both. Our database thus offers some of the recently published tools for exploring the application of AI to social science research…(More)”

Designing for AI Transparency in Public Services: A User-Centred Study of Citizens’ Preferences

Paper by Stefan Schmager, Samrat Gupta, Ilias Pappas & Polyxeni Vassilakopoulou: “Enhancing transparency in AI enabled public services has the potential to improve their adoption and service delivery. Hence, it is important to identify effective design strategies for AI transparency in public services. To this end, we conduct this empirical qualitative study providing insights for responsible deployment of AI in practice by public organizations. We design an interactive prototype for a Norwegian public welfare service organization which aims to use AI to support sick leaves related services. Qualitative analysis of citizens’ data collected through survey, think-aloud interactions with the prototype, and open-ended questions revealed three key themes related to: articulating information in written form, representing information in graphical form, and establishing the appropriate level of information detail for improving AI transparency in public service delivery. This study advances research pertaining to design of public service portals and has implications for AI implementation in the public sector…(More)”.

What does it mean to be good? The normative and metaethical problem with ‘AI for good’

Article by Tom Stenson: “Using AI for good is an imperative for its development and regulation, but what exactly does it mean? This article contends that ‘AI for good’ is a powerful normative concept and is problematic for the ethics of AI because it oversimplifies complex philosophical questions in defining good and assumes a level of moral knowledge and certainty that may not be justified. ‘AI for good’ expresses a value judgement on what AI should be and its role in society, thereby functioning as a normative concept in AI ethics. As a moral statement, AI for good makes two things implicit: i) we know what a good outcome is and ii) we know the process by which to achieve it. By examining these two claims, this article will articulate the thesis that ‘AI for good’ should be examined as a normative and metaethical problem for AI ethics. Furthermore, it argues that we need to pay more attention to our relationship with normativity and how it guides what we believe the ‘work’ of ethical AI should be…(More)”.

Using ChatGPT for analytics

Paper by Aleksei Turobov et al: “The utilisation of AI-driven tools, notably ChatGPT (Generative Pre-trained Transformer), within academic research is increasingly debated from several perspectives including ease of implementation, and potential enhancements in research efficiency, as against ethical concerns and risks such as biases and unexplained AI operations. This paper explores the use of the GPT model for initial coding in qualitative thematic analysis using a sample of United Nations (UN) policy documents. The primary aim of this study is to contribute to the methodological discussion regarding the integration of AI tools, offering a practical guide to validation for using GPT as a collaborative research assistant. The paper outlines the advantages and limitations of this methodology and suggests strategies to mitigate risks. Emphasising the importance of transparency and reliability in employing GPT within research methodologies, this paper argues for a balanced use of AI in supported thematic analysis, highlighting its potential to elevate research efficacy and outcomes…(More)”.

Seeing Like a Data Structure

Essay by Barath Raghavan and Bruce Schneier: “Technology was once simply a tool—and a small one at that—used to amplify human intent and capacity. That was the story of the industrial revolution: we could control nature and build large, complex human societies, and the more we employed and mastered technology, the better things got. We don’t live in that world anymore. Not only has technology become entangled with the structure of society, but we also can no longer see the world around us without it. The separation is gone, and the control we thought we once had has revealed itself as a mirage. We’re in a transitional period of history right now.

We tell ourselves stories about technology and society every day. Those stories shape how we use and develop new technologies as well as the new stories and uses that will come with it. They determine who’s in charge, who benefits, who’s to blame, and what it all means.

Some people are excited about the emerging technologies poised to remake society. Others are hoping for us to see this as folly and adopt simpler, less tech-centric ways of living. And many feel that they have little understanding of what is happening and even less say in the matter.

But we never had total control of technology in the first place, nor is there a pretechnological golden age to which we can return. The truth is that our data-centric way of seeing the world isn’t serving us well. We need to tease out a third option. To do so, we first need to understand how we got here…(More)”

“The Death of Wikipedia?” — Exploring the Impact of ChatGPT on Wikipedia Engagement

Paper by Neal Reeves, Wenjie Yin, Elena Simperl: “Wikipedia is one of the most popular websites in the world, serving as a major source of information and learning resource for millions of users worldwide. While motivations for its usage vary, prior research suggests shallow information gathering — looking up facts and information or answering questions — dominates over more in-depth usage. On the 22nd of November 2022, ChatGPT was released to the public and has quickly become a popular source of information, serving as an effective question-answering and knowledge gathering resource. Early indications have suggested that it may be drawing users away from traditional question answering services such as Stack Overflow, raising the question of how it may have impacted Wikipedia. In this paper, we explore Wikipedia user metrics across four areas: page views, unique visitor numbers, edit counts and editor numbers within twelve language instances of Wikipedia. We perform pairwise comparisons of these metrics before and after the release of ChatGPT and implement a panel regression model to observe and quantify longer-term trends. We find no evidence of a fall in engagement across any of the four metrics, instead observing that page views and visitor numbers increased in the period following ChatGPT’s launch. However, we observe a lower increase in languages where ChatGPT was available than in languages where it was not, which may suggest ChatGPT’s availability limited growth in those languages. Our results contribute to the understanding of how emerging generative AI tools are disrupting the Web ecosystem…(More)”. See also: Are we entering a Data Winter? On the urgent need to preserve data access for the public interest.