Generative AI in Journalism


Report by Nicholas Diakopoulos et al.: “The introduction of ChatGPT by OpenAI in late 2022 captured the imagination of the public—and the news industry—with the potential of generative AI to upend how people create and consume media. Generative AI is a type of artificial intelligence technology that can create new content, such as text, images, audio, video, or other media, based on the data it has been trained on and according to written prompts provided by users. ChatGPT is the chat-based user interface that made the power and potential of generative AI salient to a wide audience, reaching 100 million users within two months of its launch.

Although similar technology had been around for years, by late 2022 it was suddenly working well enough to spur its integration into various products, presenting not only a host of opportunities for productivity and new experiences but also serious concerns about accuracy, the provenance and attribution of source information, and the increased potential for creating misinformation.

This report serves as a snapshot of how the news industry has grappled with the initial promises and challenges of generative AI towards the end of 2023. The sample of participants reflects how some of the more savvy and experienced members of the profession are reacting to the technology.

Based on participants’ responses, the authors found that generative AI is already changing work structure and organization, even as it triggers ethical concerns around its use. Here are some key takeaways:

  • Applications in News Production. The predominant current use cases for generative AI include various forms of textual content production, information gathering and sensemaking, multimedia content production, and business uses.
  • Changing Work Structure and Organization. A host of new roles is emerging to grapple with the changes introduced by generative AI, spanning leadership, editorial, product, legal, and engineering positions.
  • Work Redesign. There is an unmet opportunity to design new interfaces to support journalistic work with generative AI, in particular to enable the human oversight needed for efficient and confident checking and verification of outputs…(More)”.

How Copyright May Destroy Our Access To The World’s Academic Knowledge


Article by Glyn Moody: “The shift from analogue to digital has had a massive impact on most aspects of life. One area where that shift has the potential for huge benefits is in the world of academic publishing. Academic papers are costly to publish and distribute on paper, but in a digital format they can be shared globally for almost no cost. That’s one of the driving forces behind the open access movement. But as Walled Culture has reported, resistance from the traditional publishing world has slowed the shift to open access, and undercut the benefits that could flow from it.

That in itself is bad news, but new research from Martin Paul Eve (available as open access) shows that the way the shift to digital has been managed by publishers brings with it a new problem. For all their flaws, analogue publications have the great virtue that they are durable: once a library has a copy, it is likely to be available for decades, if not centuries. Digital scholarly articles come with no such guarantee. The Internet is constantly in flux, with many publishers and sites closing down each year, often without notice. That’s a problem when sites holding archival copies of scholarly articles vanish, making it harder, perhaps impossible, to access important papers. Eve explored whether publishers were placing copies of the articles they published in key archives. Ideally, digital papers would be available in multiple archives to ensure resilience, but the reality is that very few publishers did this. Ars Technica has a good summary of Eve’s results:

When Eve broke down the results by publisher, less than 1 percent of the 204 publishers had put the majority of their content into multiple archives. (The cutoff was 75 percent of their content in three or more archives.) Fewer than 10 percent had put more than half their content in at least two archives. And a full third seemed to be doing no organized archiving at all.

At the individual publication level, under 60 percent were present in at least one archive, and over a quarter didn’t appear to be in any of the archives at all. (Another 14 percent were published too recently to have been archived or had incomplete records.)…(More)”.

The Unintended Consequences of Data Standardization


Article by Cathleen Clerkin: “The benefits of data standardization within the social sector—and indeed just about any industry—are multiple, important, and undeniable. Access to the same type of data over time lends the ability to track progress and increase accountability. For example, over the last 20 years, my organization, Candid, has tracked grantmaking by the largest foundations to assess changes in giving trends. The data allowed us to demonstrate philanthropy’s disinvestment in historically Black colleges and universities. Data standardization also creates opportunities for benchmarking—allowing individuals and organizations to assess how they stack up to their colleagues and competitors. Moreover, large amounts of standardized data can help predict trends in the sector. Finally—and perhaps most importantly to the social sector—data standardization invariably reduces the significant reporting burdens placed on nonprofits.

Yet, for all of its benefits, data is too often proposed as a universal cure that will allow us to unequivocally determine the success of social change programs and processes. The reality is far more complex and nuanced. Left unchecked, the unintended consequences of data standardization pose significant risks to achieving a more effective, efficient, and equitable social sector…(More)”.

Data Authenticity, Consent, and Provenance for AI Are All Broken: What Will It Take to Fix Them?


Article by Shayne Longpre et al.: “New AI capabilities are owed in large part to massive, widely sourced, and underdocumented training data collections. Dubious collection practices have spurred crises in data transparency, authenticity, consent, privacy, representation, bias, copyright infringement, and the overall development of ethical and trustworthy AI systems. In response, AI regulation is emphasizing the need for training data transparency to understand AI model limitations. Based on a large-scale analysis of the AI training data landscape and existing solutions, we identify the missing infrastructure to facilitate responsible AI development practices. We explain why existing tools for data authenticity, consent, and documentation alone are unable to solve the core problems facing the AI community, and outline how policymakers, developers, and data creators can facilitate responsible AI development through universal data provenance standards…(More)”.

Why data about people are so hard to govern


Paper by Wendy H. Wong, Jamie Duncan, and David A. Lake: “How data on individuals are gathered, analyzed, and stored remains largely ungoverned at both domestic and global levels. We address the unique governance problem posed by digital data to provide a framework for understanding why data governance remains elusive. Data are easily transferable and replicable, making them a useful tool. But this characteristic creates massive governance problems for all of us who want to have some agency and choice over how (or if) our data are collected and used. Moreover, data are co-created: individuals are the object from which data are culled by an interested party. Yet any data point has a marginal value of close to zero, and thus individuals have little bargaining power when it comes to negotiating with data collectors. Relatedly, data follow the rule of winner-take-all—the parties that have the most data can leverage it for greater accuracy and utility, leading to natural oligopolies. Finally, data’s value lies in combination with proprietary algorithms that analyze them and predict patterns. Given these characteristics, private governance solutions are ineffective. Public solutions will also likely be insufficient. The imbalance in market power between the platforms that collect data and individuals will be reproduced in the political sphere. We conclude that some form of collective data governance is required. We examine the challenges to data governance by looking at a public effort, the EU’s General Data Protection Regulation; a private effort, Apple’s “privacy nutrition labels” in its App Store; and a collective effort, the First Nations Information Governance Centre in Canada…(More)”.

Creating an Integrated System of Data and Statistics on Household Income, Consumption, and Wealth: Time to Build


Report by the National Academies: “Many federal agencies provide data and statistics on inequality and related aspects of household income, consumption, and wealth (ICW). However, because the information provided by these agencies is often produced using different concepts, underlying data, and methods, the resulting estimates of poverty, inequality, mean and median household income, consumption, and wealth, as well as other statistics, do not always tell a consistent or easily interpretable story. Measures also differ in their accuracy, timeliness, and relevance, so it is difficult to address such questions as the effects of the Great Recession on household finances or of the Covid-19 pandemic and the ensuing relief efforts on household income and consumption. The presence of multiple, sometimes conflicting statistics at best muddies the waters of policy debates and, at worst, enables advocates with different policy perspectives to cherry-pick their preferred set of estimates. Achieving an integrated system of relevant, high-quality, and transparent household ICW data and statistics should go far toward reducing disagreement about who has how much, and from what sources. Further, such data are essential to advance research on economic wellbeing and to ensure that policies are well targeted to achieve societal goals…(More)”.

Objectivity vs affect: how competing forms of legitimacy can polarize public debate in data-driven public consultation


Paper by Alison Powell: “How do data and objectivity become politicized? How do processes intended to include citizen voices instead push them into social media that intensify negative expression? This paper examines the possibility and limits of ‘agonistic data practices’ (Crooks & Currie, 2021) by examining how data-driven consultation practices create competing forms of legitimacy for quantifiable knowledge and affective lived experience. Drawing on a two-year study of a private Facebook group self-presenting as a supportive space for working-class people critical of the development of ‘low-traffic neighbourhoods’ (LTNs), the paper reveals how the dynamics of ‘affective polarization’ associated the use of data with elite and exclusionary politics. Participants addressed this by framing their online contributions as ‘vernacular data’ and by associating numerical data with exclusion and inequality. Over time, the strong statements of feeling began to support content of a conspiratorial nature, reflected at the social level of discourse in the broader media environment, where stories of strong feeling gain legitimacy in right-wing sources. The paper concludes that ideologies of dataism and practices of datafication may create conditions for political extremism to develop when the potential conditions of ‘agonistic data practices’ are not met, and that consultation processes must avoid overly valorizing data and calculable knowledge if they wish to retain democratic accountability…(More)”.

AI and the Future of Government: Unexpected Effects and Critical Challenges


Policy Brief by Tiago C. Peixoto, Otaviano Canuto, and Luke Jordan: “Based on observable facts, this policy paper explores some of the less-acknowledged yet critically important ways in which artificial intelligence (AI) may affect the public sector and its role. Our focus is on those areas where AI’s influence might be understated currently, but where it has substantial implications for future government policies and actions.

We identify four main areas of impact that could redefine the public sector’s role, require new answers from it, or both. These areas are the emergence of a new language-based digital divide, job displacement in public administration, disruptions in revenue mobilization, and declining government responsiveness.

This discussion not only identifies critical areas but also underscores the importance of transcending conventional approaches in tackling them. As we examine these challenges, we shed light on their significance, seeking to inform policymakers and stakeholders about the nuanced ways in which AI may quietly, yet profoundly, alter the public sector landscape…(More)”.

AI for Good: Applications in Sustainability, Humanitarian Action, and Health


Book by Juan M. Lavista Ferres and William B. Weeks: “…an insightful and fascinating discussion of how one of the world’s most recognizable software companies is tackling intractable social problems with the power of artificial intelligence (AI). In the book, you’ll learn how climate change, illness and disease, and challenges to fundamental human rights are all being fought using replicable methods and reusable AI code.

The authors also provide:

  • Easy-to-follow, non-technical explanations of what AI is and how it works
  • Examinations of how healthcare is being improved, climate change is being addressed, and humanitarian aid is being facilitated around the world with AI
  • Discussions of the future of AI in the realm of social benefit organizations and efforts

An essential guide to impactful social change with artificial intelligence, AI for Good is a must-read resource for technical and non-technical professionals interested in AI’s social potential, as well as policymakers, regulators, NGO professionals, and non-profit volunteers…(More)”.

The Cambridge Handbook of Facial Recognition in the Modern State


Book edited by Rita Matulionyte and Monika Zalnieriute: “In situations ranging from border control to policing and welfare, governments are using automated facial recognition technology (FRT) to collect taxes, prevent crime, police cities, and control immigration. FRT involves the processing of a person’s facial image, usually for identification, categorisation, or counting. This ambitious handbook brings together a diverse group of legal, computer, communications, and social and political science scholars to shed light on how FRT has been developed, used by public authorities, and regulated in different jurisdictions across five continents. Informed by their experiences working on FRT across the globe, chapter authors analyse the increasing deployment of FRT in public and private life. The collection argues for the passage of new laws, rules, frameworks, and approaches to prevent the harms of FRT in the modern state, and it advances the debate on scrutiny of power and the accountability of public authorities that use FRT…(More)”.