Artificial Intelligence Applications for Social Science Research


Report by Megan Stubbs-Richardson et al.: “Our team developed a database of 250 Artificial Intelligence (AI) applications useful for social science research. To be included in our database, the AI tool had to be useful for: 1) literature reviews, summaries, or writing, 2) data collection, analysis, or visualizations, or 3) research dissemination. In the database, we provide a name, description, and links to each of the AI tools that were current at the time of publication on September 29, 2023. Supporting links were provided when an AI tool was found using other databases. To help users evaluate the potential usefulness of each tool, we documented information about costs, log-in requirements, and whether plug-ins or browser extensions are available for each tool. Finally, as we are a team of scientists who are also interested in studying social media data to understand social problems, we also documented when the AI tools were useful for text-based data, such as social media. This database includes 132 AI tools that may be useful for literature reviews or writing; 146 tools that may be useful for data collection, analyses, or visualizations; and 108 that may be used for dissemination efforts. While 170 of the AI tools within this database can be used for general research purposes, 18 are specific to social media data analyses, and 62 can be applied to both. Our database thus offers some of the recently published tools for exploring the application of AI to social science research…(More)”

Designing for AI Transparency in Public Services: A User-Centred Study of Citizens’ Preferences


Paper by Stefan Schmager, Samrat Gupta, Ilias Pappas & Polyxeni Vassilakopoulou: “Enhancing transparency in AI-enabled public services has the potential to improve their adoption and service delivery. Hence, it is important to identify effective design strategies for AI transparency in public services. To this end, we conduct an empirical qualitative study providing insights for the responsible deployment of AI in practice by public organizations. We design an interactive prototype for a Norwegian public welfare service organization that aims to use AI to support services related to sick leave. Qualitative analysis of citizens’ data collected through a survey, think-aloud interactions with the prototype, and open-ended questions revealed three key themes: articulating information in written form, representing information in graphical form, and establishing the appropriate level of information detail for improving AI transparency in public service delivery. This study advances research on the design of public service portals and has implications for AI implementation in the public sector…(More)”.

The tensions of data sharing for human rights: A modern slavery case study


Paper by Jamie Hancock et al: “There are calls for greater data sharing to address human rights issues. Advocates claim this will provide an evidence-base to increase transparency, improve accountability, enhance decision-making, identify abuses, and offer remedies for rights violations. However, these well-intentioned efforts have been found to sometimes enable harms against the people they seek to protect. This paper shows issues relating to fairness, accountability, or transparency (FAccT) in and around data sharing can produce such ‘ironic’ consequences. It does so using an empirical case study: efforts to tackle modern slavery and human trafficking in the UK. We draw on a qualitative analysis of expert interviews, workshops, ecosystem mapping exercises, and a desk-based review. The findings show how, in the UK, a large ecosystem of data providers, hubs, and users emerged to process and exchange data from across the country. We identify how issues including legal uncertainties, non-transparent sharing procedures, and limited accountability regarding downstream uses of data may undermine efforts to tackle modern slavery and place victims of abuses at risk of further harms. Our findings help explain why data sharing activities can have negative consequences for human rights, even within human rights initiatives. Moreover, our analysis offers a window into how FAccT principles for technology relate to the human rights implications of data sharing. Finally, we discuss why these tensions may be echoed in other areas where data sharing is pursued for human rights concerns, identifying common features which may lead to similar results, especially where sensitive data is shared to achieve social goods or policy objectives…(More)”.

Blueprints for Learning


Report by the Data Foundation: “The Foundations for Evidence-Based Policymaking Act of 2018 (Evidence Act) required the creation of learning agendas for the largest federal agencies. These agendas outline how agencies will identify and answer priority questions through data and evidence-building activities. The Data Foundation undertook an analysis of the agendas to understand how they were developed and how agencies plan to implement them, as part of the five-year milestone of the Evidence Act.

The analysis reveals both progress and areas for improvement in the development and use of learning agendas. All but one large agency produced a publicly available learning agenda, demonstrating a significant initial effort. However, several challenges were identified:

  • Limited detail on execution and use: Many learning agendas lacked specifics on how the identified priority questions would be addressed or how the evidence generated would be used.
  • Variation in quality: Agencies diverged in the comprehensiveness and clarity of their agendas, with some providing more detailed plans than others.
  • Resource constraints: The analysis suggests that a lack of dedicated resources may be hindering some agencies’ capacity to fully implement their learning agendas…(More)”.

Are We Ready for the Next Pandemic? Navigating the First and Last Mile Challenges in Data Utilization


Blog by Stefaan Verhulst, Daniela Paolotti, Ciro Cattuto and Alessandro Vespignani:

“Public health officials from around the world are gathering this week in Geneva for a weeklong meeting of the 77th World Health Assembly. A key question they are examining is: Are we ready for the next pandemic? As we have written elsewhere regarding access to and re-use of data, particularly non-traditional data, for pandemic preparedness and response: we are not. Below, we list ten recommendations to advance access to and reuse of non-traditional data for pandemics, drawing on input from a high-level workshop held in Brussels within the context of the ESCAPE program…(More)”

close.city


About: “Proximity governs how we live, work, and socialize. Close is an interactive travel time map for people who want to be near the amenities that matter most to them. Close builds on two core principles:

  1. Different people will prioritize being near different amenities
  2. A neighborhood is only as accessible as its most distant important amenity

When you select multiple amenities in Close, the map shows the travel time to the furthest of those amenities. You can set your preferred travel mode for reaching each amenity: Walking + Public Transit, Biking, or Combined. Close is currently in public beta, with more features and destination types coming over the next few months. The reliability of destinations will continually improve as new data sources and user feedback are incorporated. Close is built and maintained by Henry Spatial Analysis. You can stay up to date on the latest improvements to Close by subscribing to the newsletter.

How to use Close – Close includes travel time information for cities across the United States. To view a different location, select the search icon at the top left of the screen and enter a city or county name. To access map details, including a link to this About page, click the menu icon in the top left corner of the map…(More)”
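The “furthest amenity” rule described above is a simple max-aggregation. Here is a minimal sketch of it in Python, using hypothetical amenities and travel times rather than Close’s actual data or code:

```python
# A location's combined travel time is the maximum over the selected
# amenities: it is only as accessible as its most distant amenity.
# Amenity names and times below are hypothetical examples.
travel_minutes = {
    "grocery": 8,
    "library": 14,
    "park": 5,
}

selected = ["grocery", "library"]  # the amenities a user toggles on

combined = max(travel_minutes[a] for a in selected)
print(f"Combined travel time: {combined} min")  # -> 14 min
```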

Will we run out of data? Limits of LLM scaling based on human-generated data


Paper by Pablo Villalobos: “We investigate the potential constraints on LLM scaling posed by the availability of public human-generated text data. We forecast the growing demand for training data based on current trends and estimate the total stock of public human text data. Our findings indicate that if current LLM development trends continue, models will be trained on datasets roughly equal in size to the available stock of public human text data between 2026 and 2032, or slightly earlier if models are overtrained. We explore how progress in language modeling can continue when human-generated text datasets cannot be scaled any further. We argue that synthetic data generation, transfer learning from data-rich domains, and data efficiency improvements might support further progress…(More)”.
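The paper’s core projection is a crossover between exponentially growing data demand and a roughly fixed stock of public text. The toy Python sketch below illustrates that logic only; the stock size and growth rate are made-up placeholders, not the paper’s estimates:

```python
# Toy crossover calculation: in what year does training-data demand
# exceed the stock of public human text? All numbers are illustrative
# assumptions, not estimates from the paper.
STOCK_TOKENS = 3e14   # assumed total stock of public human text, in tokens
demand = 2e13         # assumed size of the largest training set in 2024
GROWTH = 2.0          # assumed yearly growth multiplier for dataset size

year = 2024
while demand < STOCK_TOKENS:
    year += 1
    demand *= GROWTH

print(f"Under these assumptions, demand exceeds the stock around {year}.")
```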

What does it mean to be good? The normative and metaethical problem with ‘AI for good’


Article by Tom Stenson: “Using AI for good is an imperative for its development and regulation, but what exactly does it mean? This article contends that ‘AI for good’ is a powerful normative concept and is problematic for the ethics of AI because it oversimplifies complex philosophical questions in defining good and assumes a level of moral knowledge and certainty that may not be justified. ‘AI for good’ expresses a value judgement on what AI should be and its role in society, thereby functioning as a normative concept in AI ethics. As a moral statement, ‘AI for good’ makes two things implicit: i) that we know what a good outcome is, and ii) that we know the process by which to achieve it. By examining these two claims, this article will articulate the thesis that ‘AI for good’ should be examined as a normative and metaethical problem for AI ethics. Furthermore, it argues that we need to pay more attention to our relationship with normativity and how it guides what we believe the ‘work’ of ethical AI should be…(More)”.

Uganda’s Sweeping Surveillance State Is Built on National ID Cards


Article by Olivia Solon: “Uganda has spent hundreds of millions of dollars in the past decade on biometric tools that document a person’s unique physical characteristics, such as their face, fingerprints, and irises, to form the basis of a comprehensive identification system. While the system is central to many of the state’s everyday functions, as President Yoweri Museveni has grown increasingly authoritarian over nearly four decades in power, it has also become a powerful mechanism for surveilling politicians, journalists, human rights advocates and ordinary citizens, according to dozens of interviews and hundreds of pages of documents obtained and analyzed by Bloomberg and the nonprofit investigative newsroom Lighthouse Reports.

It’s a cautionary tale for any country considering establishing a biometric identity system without rigorous checks and balances and input from civil society. Dozens of Global South countries have adopted this approach as part of an effort to meet the UN’s Sustainable Development Goals; the UN considers having a legal identity to be a fundamental human right. But, despite billions of dollars of investment, with backing from organizations including the World Bank, those identity systems haven’t always lived up to expectations. In many cases, the key problem is the failure to register large swathes of the population, leading to exclusion from public services. But in other places, like Uganda, inclusion in the system has been weaponized for surveillance purposes.

A year-long investigation by Bloomberg and Lighthouse Reports sheds new light on the ways in which Museveni’s regime has built and deployed this system to target opponents and consolidate power. It shows how the underlying software and data sets are easily accessed by individuals at all levels of law enforcement, despite official claims to the contrary. It also highlights, in some cases for the first time, how senior government and law enforcement officials have used these tools to target individuals deemed to pose a political threat…(More)”.

Using ChatGPT for analytics


Paper by Aleksei Turobov et al.: “The utilisation of AI-driven tools, notably ChatGPT (Generative Pre-trained Transformer), within academic research is increasingly debated from several perspectives, including ease of implementation and potential enhancements in research efficiency, weighed against ethical concerns and risks such as biases and unexplained AI operations. This paper explores the use of the GPT model for initial coding in qualitative thematic analysis, using a sample of United Nations (UN) policy documents. The primary aim of this study is to contribute to the methodological discussion on integrating AI tools, offering a practical guide for validating the use of GPT as a collaborative research assistant. The paper outlines the advantages and limitations of this methodology and suggests strategies to mitigate risks. Emphasising the importance of transparency and reliability in employing GPT within research methodologies, this paper argues for a balanced use of AI in supporting thematic analysis, highlighting its potential to elevate research efficacy and outcomes…(More)”.
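To make the workflow concrete, here is a minimal sketch of LLM-assisted initial coding in Python. It assumes the OpenAI Python client; the model name, prompt, and codebook are illustrative stand-ins, not the authors’ protocol:

```python
# Sketch of machine-assisted initial coding for thematic analysis.
# Assumes the OpenAI Python client (pip install openai) and an
# OPENAI_API_KEY in the environment. Codes and prompt are hypothetical.
from openai import OpenAI

client = OpenAI()

CODES = ["governance", "capacity building", "data sharing"]  # hypothetical codebook

def initial_code(segment: str) -> str:
    """Ask the model to assign candidate codes to one document segment."""
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative model choice
        messages=[
            {"role": "system",
             "content": ("You are a qualitative coding assistant. Assign zero or "
                         f"more of these codes: {', '.join(CODES)}. "
                         "Reply with a comma-separated list of codes only.")},
            {"role": "user", "content": segment},
        ],
        temperature=0,  # near-deterministic output aids reliability checks
    )
    return response.choices[0].message.content

# Machine-assigned codes are a first pass only; human validation of each
# assignment, as the paper emphasises, remains essential.
print(initial_code("Member states should strengthen national data infrastructures."))
```

Logging every prompt/response pair alongside the human-validated codes is one way to support the transparency and reliability the authors emphasise.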