Article by Yaqub Chaudhary and Jonnie Penn: “The rapid proliferation of large language models (LLMs) invites the possibility of a new marketplace for behavioral and psychological data that signals intent. This brief article introduces some initial features of that emerging marketplace. We survey recent efforts by tech executives to position the capture, manipulation, and commodification of human intentionality as a lucrative parallel to—and viable extension of—the now-dominant attention economy, which has bent consumer, civic, and media norms around users’ finite attention spans since the 1990s. We call this follow-on the intention economy. We characterize it in two ways. First, as a competition, initially, between established tech players armed with the infrastructural and data capacities needed to vie for first-mover advantage on a new frontier of persuasive technologies. Second, as a commodification of hitherto unreachable levels of explicit and implicit data that signal intent, namely those signals borne of combining (a) hyper-personalized manipulation via LLM-based sycophancy, ingratiation, and emotional infiltration and (b) increasingly detailed categorization of online activity elicited through natural language.
This new dimension of automated persuasion draws on the unique capabilities of LLMs and generative AI more broadly, which intervene not only on what users want, but also, to cite Williams, “what they want to want” (Williams, 2018, p. 122). We demonstrate through a close reading of recent technical and critical literature (including unpublished papers from ArXiv) that such tools are already being explored to elicit, infer, collect, record, understand, forecast, and ultimately manipulate, modulate, and commodify human plans and purposes, both mundane (e.g., selecting a hotel) and profound (e.g., selecting a political candidate)…(More)”.
Wikenigma – an Encyclopedia of Unknowns
About: “Wikenigma is a unique wiki-based resource specifically dedicated to documenting fundamental gaps in human knowledge.
Listing scientific and academic questions to which no-one, anywhere, has yet been able to provide a definitive answer. [ 1141 so far ]
That’s to say, a compendium of so-called ‘Known Unknowns’.
The idea is to inspire and promote interest in scientific and academic research by highlighting opportunities to investigate problems which no-one has yet been able to solve.
You can start browsing the content via the main menu on the left (or in the ‘Main Menu’ section if you’re using a small-screen device). Alternatively, the search box (above right) will find any articles with details that match your search terms…(More)”.
Overcoming challenges associated with broad sharing of human genomic data
Paper by Jonathan E. LoTempio Jr & Jonathan D. Moreno: “Since the Human Genome Project, the consensus position in genomics has been that data should be shared widely to achieve the greatest societal benefit. This position relies on imprecise definitions of the concept of ‘broad data sharing’. Accordingly, the implementation of data sharing varies among landmark genomic studies. In this Perspective, we identify definitions of broad that have been used interchangeably, despite their distinct implications. We further offer a framework with clarified concepts for genomic data sharing and probe six examples in genomics that produced public data. Finally, we articulate three challenges. First, we explore the need to reinterpret the limits of general research use data. Second, we consider the governance of public data deposition from extant samples. Third, we ask whether, in light of changing concepts of broad, participants should be encouraged to share their status as participants publicly or not. Each of these challenges is followed with recommendations…(More)”.
Towards Best Practices for Open Datasets for LLM Training
Paper by Stefan Baack et al: “Many AI companies are training their large language models (LLMs) on data without the permission of the copyright owners. The permissibility of doing so varies by jurisdiction: in jurisdictions like the EU and Japan, this is allowed under certain restrictions, while in the United States, the legal landscape is more ambiguous. Regardless of the legal status, concerns from creative producers have led to several high-profile copyright lawsuits, and the threat of litigation is commonly cited as a reason for the recent trend towards minimizing the information shared about training datasets by both corporate and public interest actors. This trend in limiting data information causes harm by hindering transparency, accountability, and innovation in the broader ecosystem by denying researchers, auditors, and impacted individuals access to the information needed to understand AI models.
While this could be mitigated by training language models on open access and public domain data, at the time of writing, there are no such models (trained at a meaningful scale) due to the substantial technical and sociological challenges in assembling the necessary corpus. These challenges include incomplete and unreliable metadata, the cost and complexity of digitizing physical records, and the diverse set of legal and technical skills required to ensure relevance and responsibility in a quickly changing landscape. Building towards a future where AI systems can be trained on openly licensed data that is responsibly curated and governed requires collaboration across legal, technical, and policy domains, along with investments in metadata standards, digitization, and fostering a culture of openness…(More)”.
How and When to Involve Crowds in Scientific Research
Book by Marion K. Poetz and Henry Sauermann: “This book explores how millions of people can significantly contribute to scientific research with their effort and experience, even if they are not working at scientific institutions and may not have formal scientific training.
Drawing on a strong foundation of scholarship on crowd involvement, this book helps researchers recognize and understand the benefits and challenges of crowd involvement across key stages of the scientific process. Designed as a practical toolkit, it enables scientists to critically assess the potential of crowd participation, determine when it can be most effective, and implement it to achieve meaningful scientific and societal outcomes.
The book also discusses how recent developments in artificial intelligence (AI) shape the role of crowds in scientific research and can enhance the effectiveness of crowd science projects…(More)”
Governing artificial intelligence means governing data: (re)setting the agenda for data justice
Paper by Linnet Taylor, Siddharth Peter de Souza, Aaron Martin, and Joan López Solano: “The field of data justice has been evolving to take into account the role of data in powering the field of artificial intelligence (AI). In this paper we review the main conceptual bases for governing data and AI: the market-based approach, the personal–non-personal data distinction and strategic sovereignty. We then analyse how these are being operationalised into practical models for governance, including public data trusts, data cooperatives, personal data sovereignty, data collaboratives, data commons approaches and indigenous data sovereignty. We interrogate these models’ potential for just governance based on four benchmarks which we propose as a reformulation of the Data Justice governance agenda identified by Taylor in her 2017 framework. Re-situating data justice at the intersection of data and AI, these benchmarks focus on preserving and strengthening public infrastructures and public goods; inclusiveness; contestability and accountability; and global responsibility. We demonstrate how they can be used to test whether a governance approach will succeed in redistributing power, engaging with public concerns and creating a plural politics of AI…(More)”.
Artificial Intelligence Narratives
A Global Voices Report: “…Framing AI systems as intelligent is further complicated and intertwined with neighboring narratives. In the US, AI narratives often revolve around opposing themes such as hope and fear, often bridging two strong emotions: existential fears and economic aspirations. In either case, they propose that the technology is powerful. These narratives contribute to the hype surrounding AI tools and their potential impact on society. Some examples include:
- An “AI arms race” between the United States and other global powers, particularly China.
- AI is a driver of economic growth and a transformative agent reshaping society.
- AI is a threat to labor.
- AI as a tool for good to benefit society.
- AI is an inevitable force that requires urgent regulation.
- If you don’t use AI, you will fall behind.
- AI is a democratizing force, lowering barriers to entry globally.
Many of these framings often present AI as an unstoppable and accelerating force. While this narrative can generate excitement and investment in AI research, it can also contribute to a sense of technological determinism and a lack of critical engagement with the consequences of widespread AI adoption. Counter-narratives are many and expand on the motifs of surveillance, erosions of trust, bias, job impacts, exploitation of labor, high-risk uses, the concentration of power, and environmental impacts, among others.
These narrative frames, combined with the metaphorical language and imagery used to describe AI, contribute to the confusion and lack of public knowledge about the technology. By positioning AI as a transformative, inevitable, and necessary tool for national success, these narratives can shape public opinion and policy decisions, often in ways that prioritize rapid adoption and commercialization…(More)”
Information Ecosystems and Troubled Democracy
Report by the Observatory on Information and Democracy: “This inaugural meta-analysis provides a critical assessment of the role of information ecosystems in the Global North and Global Majority World, focusing on their relationship with information integrity (the quality of public discourse), the fairness of political processes, the protection of media freedoms, and the resilience of public institutions.
The report addresses three thematic areas with a cross-cutting theme of mis- and disinformation:
- Media, Politics and Trust;
- Artificial Intelligence, Information Ecosystems and Democracy;
- and Data Governance and Democracy.
The analysis is based mainly on academic publications supplemented by reports and other materials from different disciplines and regions (1,664 citations selected from a total corpus of over 2,700 aggregated resources). The report showcases what we can learn from landmark research on often intractable challenges posed by rapid changes in information and communication spaces…(More)”.
What’s a Fact, Anyway?
Essay by Fergus McIntosh: “…For journalists, as for anyone, there are certain shortcuts to trustworthiness, including reputation, expertise, and transparency—the sharing of sources, for example, or the prompt correction of errors. Some of these shortcuts are more perilous than others. Various outfits, positioning themselves as neutral guides to the marketplace of ideas, now tout evaluations of news organizations’ trustworthiness, but relying on these requires trusting in the quality and objectivity of the evaluation. Official data is often taken at face value, but numbers can conceal motives: think of the dispute over how to count casualties in recent conflicts. Governments, meanwhile, may use their powers over information to suppress unfavorable narratives: laws originally aimed at misinformation, many enacted during the COVID-19 pandemic, can hinder free expression. The spectre of this phenomenon is fuelling a growing backlash in America and elsewhere.
Although some categories of information may come to be considered inherently trustworthy, these, too, are in flux. For decades, the technical difficulty of editing photographs and videos allowed them to be treated, by most people, as essentially incontrovertible. With the advent of A.I.-based editing software, footage and imagery have swiftly become much harder to credit. Similar tools are already used to spoof voices based on only seconds of recorded audio. For anyone, this might manifest in scams (your grandmother calls, but it’s not Grandma on the other end), but for a journalist it also puts source calls into question. Technologies of deception tend to be accompanied by ones of detection or verification—a battery of companies, for example, already promise that they can spot A.I.-manipulated imagery—but they’re often locked in an arms race, and they never achieve total accuracy. Though chatbots and A.I.-enabled search engines promise to help us with research (when a colleague “interviewed” ChatGPT, it told him, “I aim to provide information that is as neutral and unbiased as possible”), their inability to provide sourcing, and their tendency to hallucinate, looks more like a shortcut to nowhere, at least for now. The resulting problems extend far beyond media: election campaigns, in which subtle impressions can lead to big differences in voting behavior, feel increasingly vulnerable to deepfakes and other manipulations by inscrutable algorithms. Like everyone else, journalists have only just begun to grapple with the implications.
In such circumstances, it becomes difficult to know what is true, and, consequently, to make decisions. Good journalism offers a way through, but only if readers are willing to follow: trust and naïveté can feel uncomfortably close. Gaining and holding that trust is hard. But failure—the end point of the story of generational decay, of gold exchanged for dross—is not inevitable. Fact checking of the sort practiced at The New Yorker is highly specific and resource-intensive, and it’s only one potential solution. But any solution must acknowledge the messiness of truth, the requirements of attention, the way we squint to see more clearly. It must tell you to say what you mean, and know that you mean it…(More)”.
Governance of Indigenous data in open earth systems science
Paper by Lydia Jennings et al: “In the age of big data and open science, what processes are needed to follow open science protocols while upholding Indigenous Peoples’ rights? The Earth Data Relations Working Group (EDRWG) convened to address this question and envision a research landscape that acknowledges the legacy of extractive practices and embraces new norms across Earth science institutions and open science research. Using the National Ecological Observatory Network (NEON) as an example, the EDRWG recommends actions, applicable across all phases of the data lifecycle, that recognize the sovereign rights of Indigenous Peoples and support better research across all Earth Sciences…(More)”