Introduction to Digital Humanism


Open access textbook edited by Hannes Werthner et al.: “…introduces and defines digital humanism from a diverse range of disciplines. Following the 2019 Vienna Manifesto, the book calls for a digital humanism that describes, analyzes, and, most importantly, influences the complex interplay of technology and humankind, for a better society and life, fully respecting universal human rights.

The book is organized in three parts: Part I “Background” provides the multidisciplinary background needed to understand digital humanism in its philosophical, cultural, technological, historical, social, and economic dimensions. The goal is to present the necessary knowledge upon which an effective interdisciplinary discourse on digital humanism can be founded.

Part II “Digital Humanism – a System’s View” focuses on an in-depth presentation and discussion of the main digital humanism concerns arising in current digital systems. The goal of this part is to make readers aware of and sensitive to these issues, including, for example, the control and autonomy of AI systems, privacy and security, and the role of governance.

Part III “Critical and Societal Issues of Digital Systems” delves into critical societal issues raised by advances in digital technologies. While the public debate has often treated these issues separately, especially when they became visible through sensational events, the aim here is to shed light on the entire landscape and show how they are interconnected. This includes issues such as AI and ethics, fairness and bias, privacy and surveillance, and platform power and democracy.

This textbook is intended for students, teachers, and policy makers interested in digital humanism. It is designed both for stand-alone courses and as a complement to courses in computer science, as well as for curricula in science, engineering, the humanities, and the social sciences. Each chapter includes questions for students and an annotated reading list for diving deeper into the associated chapter material. The book aims to provide readers with as wide an exposure as possible to digital advances and their consequences for humanity. It includes constructive ideas and approaches that seek to ensure that our collective digital future is determined through human agency…(More)”.

Computational social science is growing up: why puberty consists of embracing measurement validation, theory development, and open science practices


Paper by Timon Elmer: “Puberty is a phase in which individuals often test their own boundaries and those of the people around them, further defining their identity – and thus their uniqueness compared to other individuals. Similarly, as Computational Social Science (CSS) grows up, it must strike a balance between its own practices and those of neighboring disciplines to achieve scientific rigor and refine its identity. However, certain areas within CSS have been reluctant to adopt rigorous scientific practices from other fields, which can be observed in an overreliance on passively collected data (e.g., digital traces, wearables) without questioning the validity of such data. This paper argues that CSS should embrace the potential of combining passive and active measurement practices to capitalize on the strengths of each approach, including objectivity and psychological quality. Additionally, the paper suggests that CSS would benefit from integrating practices and knowledge from other established disciplines, such as measurement validation, theoretical embedding, and open science practices. Based on this argument, the paper provides ten recommendations for CSS to mature as an interdisciplinary field of research…(More)”.
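The call for measurement validation can be made concrete with a small example. The sketch below (hypothetical data, not from the paper) checks the convergent validity of a passively collected measure against an active self-report of the same construct, the kind of cross-check the author argues CSS too often skips:

```python
# Minimal convergent-validity check: correlate a passively collected measure
# with an active self-report of the same construct.
# All values below are hypothetical illustrations, not data from the paper.
from statistics import correlation  # Pearson correlation, Python 3.10+

# Passive measure: daily social interactions inferred from phone sensors
passive_interactions = [4, 7, 2, 9, 5, 3, 8, 6, 1, 7]
# Active measure: self-reported interaction counts for the same days
self_reported_interactions = [5, 6, 3, 9, 4, 2, 7, 7, 2, 6]

r = correlation(passive_interactions, self_reported_interactions)
print(f"Convergent validity (Pearson r): {r:.2f}")
# A low r would suggest the passive trace does not measure the intended
# construct and should not be treated as a substitute for it.
```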

Informing Decisionmakers in Real Time


Article by Robert M. Groves: “In response, the National Science Foundation (NSF) proposed the creation of a complementary group to provide decisionmakers at all levels with the best available evidence from the social sciences to inform pandemic policymaking. In May 2020, with funding from NSF and additional support from the Alfred P. Sloan Foundation and the David and Lucile Packard Foundation, NASEM established the Societal Experts Action Network (SEAN) to connect “decisionmakers grappling with difficult issues to the evidence, trends, and expert guidance that can help them lead their communities and speed their recovery.” We chose to build a network because of the widespread recognition that no one small group of social scientists would have the expertise or the bandwidth to answer all the questions facing decisionmakers. What was needed was a structure that enabled an ongoing feedback loop between researchers and decisionmakers. This structure would foster the integration of evidence, research, and advice in real time, which broke with NASEM’s traditional form of aggregating expert guidance over lengthier periods.

In its first phase, SEAN’s executive committee set about building a network that could both gather and disseminate knowledge. To start, we brought in organizations of decisionmakers—including the National Association of Counties, the National League of Cities, the International City/County Management Association, and the National Conference of State Legislatures—to solicit their questions. Then we added capacity to the network by inviting social and behavioral organizations—like the National Bureau of Economic Research, the Natural Hazards Center at the University of Colorado Boulder, the Kaiser Family Foundation, the National Opinion Research Center at the University of Chicago, The Policy Lab at Brown University, and Testing for America—to join and respond to questions and disseminate guidance. In this way, SEAN connected teams of experts with evidence and answers to leaders and communities looking for advice…(More)”.

WikiCrow: Automating Synthesis of Human Scientific Knowledge


About: “As scientists, we stand on the shoulders of giants. Scientific progress requires curation and synthesis of prior knowledge and experimental results. However, the scientific literature is so expansive that synthesis, the comprehensive combination of ideas and results, is a bottleneck. The ability of large language models to comprehend and summarize natural language will transform science by automating the synthesis of scientific knowledge at scale. Yet current LLMs are limited by hallucinations, lack access to the most up-to-date information, and do not provide reliable references for statements.

Here, we present WikiCrow, an automated system that can synthesize cited Wikipedia-style summaries for technical topics from the scientific literature. WikiCrow is built on top of Future House’s internal LLM agent platform, PaperQA, which, in our testing, achieves state-of-the-art (SOTA) performance on a retrieval-focused version of PubMedQA and other benchmarks, including a new retrieval-first benchmark, LitQA, developed internally to evaluate systems retrieving full-text PDFs across the entire scientific literature.
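To make the retrieval-focused framing concrete, the sketch below illustrates a generic retrieve-then-cite loop of the kind such systems build on; the toy corpus, query, and ranking choices are illustrative assumptions and do not reflect PaperQA's actual implementation:

```python
# Illustrative retrieve-then-cite loop (not PaperQA's actual implementation).
# Requires scikit-learn: pip install scikit-learn
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy "literature" of (citation, passage) pairs -- hypothetical placeholders.
corpus = [
    ("Doe et al. 2021", "Gene X encodes a kinase involved in DNA repair."),
    ("Lee et al. 2019", "Knockout of gene X impairs the cellular stress response."),
    ("Smith et al. 2020", "Gene Y regulates lipid metabolism in hepatocytes."),
]
query = "What is the function of gene X?"

# 1) Retrieval: rank passages by lexical similarity to the question.
vectorizer = TfidfVectorizer()
doc_matrix = vectorizer.fit_transform([passage for _, passage in corpus])
query_vec = vectorizer.transform([query])
scores = cosine_similarity(query_vec, doc_matrix)[0]
top_k = sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)[:2]

# 2) Grounded synthesis: a language model would summarize only the retrieved
#    passages, attaching each citation so every statement is traceable.
for i in top_k:
    citation, passage = corpus[i]
    print(f"[{citation}] {passage} (similarity={scores[i]:.2f})")
```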

As a demonstration of the potential for AI to impact scientific practice, we use WikiCrow to generate draft articles for the 15,616 human protein-coding genes that currently lack Wikipedia articles, or that have article stubs. WikiCrow creates articles in 8 minutes, is much more consistent than human editors at citing its sources, and makes incorrect inferences or statements about 9% of the time, a number that we expect to improve as we mature our systems. WikiCrow will be a foundational tool for the AI Scientists we plan to build in the coming years, and will help us to democratize access to scientific research…(More)”.

How to make data open? Stop overlooking librarians


Article by Jessica Farrell: “The ‘Year of Open Science’, as declared by the US Office of Science and Technology Policy (OSTP), is now wrapping up. This followed an August 2022 memo from OSTP acting director Alondra Nelson, which mandated that data and peer-reviewed publications from federally funded research should be made freely accessible by the end of 2025. Federal agencies are required to publish full plans for the switch by the end of 2024.

But the specifics of how data will be preserved and made publicly available are far from being nailed down. I worked in archives for ten years and now facilitate two digital-archiving communities, the Software Preservation Network and BitCurator Consortium, at Educopia in Atlanta, Georgia. The expertise of people such as myself is often overlooked. More open-science projects need to integrate digital archivists and librarians, to capitalize on the tools and approaches that we have already created to make knowledge accessible and open to the public.

Making data open and ‘FAIR’ — findable, accessible, interoperable and reusable — poses technical, legal, organizational and financial questions. How can organizations best coordinate to ensure universal access to disparate data? Who will do that work? How can we ensure that the data remain open long after grant funding runs dry?

Many archivists agree that technical questions are the most solvable, given enough funding to cover the labour involved. But they are nonetheless complex. Ideally, any open research should be testable for reproducibility, but re-running scripts or procedures might not be possible unless all of the required coding libraries and environments used to analyse the data have also been preserved. Besides the contents of spreadsheets and databases, scientific-research data can include 2D or 3D images, audio, video, websites and other digital media, all in a variety of formats. Some of these might be accessible only with proprietary or outdated software…(More)”.
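As one small illustration of what preserving the computational environment can mean in practice, the snippet below (a sketch using only the Python standard library, not a prescribed standard) records the interpreter version, platform, and installed package versions alongside a dataset so an analysis can later be rerun against the same dependencies:

```python
# Sketch: record the software environment next to the data it produced,
# so an analysis can be re-run later against the same dependencies.
import json
import platform
import sys
from importlib.metadata import distributions  # Python 3.8+

manifest = {
    "python_version": sys.version,
    "platform": platform.platform(),
    "packages": sorted(
        f"{dist.metadata['Name']}=={dist.version}" for dist in distributions()
    ),
}

with open("environment_manifest.json", "w", encoding="utf-8") as fh:
    json.dump(manifest, fh, indent=2)

print(f"Recorded {len(manifest['packages'])} package versions.")
```

Depositing such a manifest with the data does not by itself guarantee reproducibility, but it gives future archivists and researchers a fighting chance of rebuilding the original environment.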

When Science Meets Power


Book by Geoff Mulgan: “Science and politics have collaborated throughout human history, and science is repeatedly invoked today in political debates, from pandemic management to climate change. But the relationship between the two is muddled and muddied.

Leading policy analyst Geoff Mulgan here calls attention to the growing frictions caused by the expanding authority of science, which sometimes helps politics but often challenges it.

He dissects the complex history of states’ use of science for conquest, glory and economic growth and shows the challenges of governing risk – from nuclear weapons to genetic modification, artificial intelligence to synthetic biology. He shows why the governance of science has become one of the biggest challenges of the twenty-first century, ever more prominent in daily politics and policy.

Whereas science is ordered around what we know and what is, politics engages what we feel and what matters. How can we reconcile the two, so that crucial decisions are both well informed and legitimate?

The book proposes new ways to organize democracy and government, both within nations and at a global scale, to better shape science and technology so that we can reap more of the benefits and fewer of the harms…(More)”.

Transmission Versus Truth, Imitation Versus Innovation: What Children Can Do That Large Language and Language-and-Vision Models Cannot (Yet)


Paper by Eunice Yiu, Eliza Kosoy, and Alison Gopnik: “Much discussion about large language models and language-and-vision models has focused on whether these models are intelligent agents. We present an alternative perspective. First, we argue that these artificial intelligence (AI) models are cultural technologies that enhance cultural transmission and are efficient and powerful imitation engines. Second, we explore what AI models can tell us about imitation and innovation by testing whether they can be used to discover new tools and novel causal structures and contrasting their responses with those of human children. Our work serves as a first step in determining which particular representations and competences, as well as which kinds of knowledge or skills, can be derived from particular learning techniques and data. In particular, we explore which kinds of cognitive capacities can be enabled by statistical analysis of large-scale linguistic data. Critically, our findings suggest that machines may need more than large-scale language and image data to allow the kinds of innovation that a small child can produce…(More)”.

Elon Musk is now taking applications for data to study X — but only EU risk researchers need apply…


Article by Natasha Lomas: “Lawmakers take note: Elon Musk-owned X appears to have quietly complied with a hard legal requirement in the European Union that requires larger platforms (aka VLOPs) to provide researchers with data access in order to study systemic risks arising from use of their services — risks such as disinformation, child safety issues, gender-based violence and mental health concerns.

X (or Twitter, as it was still called at the time) was designated a VLOP under the EU’s Digital Services Act (DSA) back in April, after the bloc’s regulators confirmed it met the criteria for an extra layer of rules to kick in, rules intended to drive algorithmic accountability by applying transparency measures to larger platforms.

Researchers intending to study systemic risks in the EU now appear to be able, at a minimum, to apply for access to X’s data via a web form reached through a button at the bottom of this page on its developer platform. (Note that researchers do not have to be based in the EU to meet the criteria; they just need to intend to study systemic risks in the EU.)…(More)”.

The Oligopoly’s Shift to Open Access. How the Big Five Academic Publishers Profit from Article Processing Charges 


Paper by Leigh-Ann Butler et al: “This study aims to estimate the total amount of article processing charges (APCs) paid to publish open access (OA) in journals controlled by the five large commercial publishers Elsevier, Sage, Springer-Nature, Taylor & Francis and Wiley between 2015 and 2018. Using publication data from WoS, OA status from Unpaywall and annual APC prices from open datasets and historical fees retrieved via the Internet Archive Wayback Machine, we estimate that globally authors paid $1.06 billion in publication fees to these publishers from 2015–2018. Revenue from gold OA amounted to $612.5 million, while $448.3 million was obtained for publishing OA in hybrid journals. Among the five publishers, Springer-Nature made the most revenue from OA ($589.7 million), followed by Elsevier ($221.4 million), Wiley ($114.3 million), Taylor & Francis ($76.8 million) and Sage ($31.6 million). With Elsevier and Wiley making most of APC revenue from hybrid fees and others focusing on gold, different OA strategies could be observed between publishers…(More)”.

Meta is giving researchers more access to Facebook and Instagram data


Article by Tate Ryan-Mosley: “Meta is releasing a new transparency product called the Meta Content Library and API, according to an announcement from the company today. The new tools will allow select researchers to access publicly available data on Facebook and Instagram in an effort to give a more overarching view of what’s happening on the platforms. 

The move comes as social media companies are facing public and regulatory pressure to increase transparency about how their products—specifically recommendation algorithms—work and what impact they have. Academic researchers have long been calling for better access to data from social media platforms, including Meta. This new library is a step toward increased visibility about what is happening on its platforms and the effect that Meta’s products have on online conversations, politics, and society at large. 

In an interview, Meta’s president of global affairs, Nick Clegg, said the tools “are really quite important” in that they provide, in a lot of ways, “the most comprehensive access to publicly available content across Facebook and Instagram of anything that we’ve built to date.” The Content Library will also help the company meet new regulatory requirements and obligations on data sharing and transparency, as the company notes in a blog post Tuesday.

The library and associated API were first released as a beta version several months ago. They allow researchers to access near-real-time data about Facebook pages, posts, groups, and events, as well as Instagram creator and business accounts, along with the associated numbers of reactions, shares, comments, and post view counts. While all this data is publicly available—as in, anyone can see public posts, reactions, and comments on Facebook—the new library makes it easier for researchers to search and analyze this content at scale…(More)”.
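As a rough illustration of how a researcher might work with a data-access interface of this kind, the sketch below searches for public posts and aggregates engagement counts. The endpoint URL, parameters, and response fields are hypothetical placeholders, not Meta's documented Content Library API:

```python
# Rough illustration only: the endpoint, parameters, and response fields below
# are hypothetical placeholders and do NOT reflect Meta's documented API.
import requests

BASE_URL = "https://example.invalid/content-library/search"  # placeholder URL
params = {
    "query": "climate policy",   # hypothetical search term
    "platform": "facebook",      # hypothetical filter
    "fields": "text,reactions,shares,comments,views",
    "limit": 100,
}
headers = {"Authorization": "Bearer <RESEARCHER_ACCESS_TOKEN>"}  # placeholder token

response = requests.get(BASE_URL, params=params, headers=headers, timeout=30)
posts = response.json().get("data", [])

# Aggregate engagement at scale instead of reading posts one by one.
total_reactions = sum(post.get("reactions", 0) for post in posts)
print(f"Retrieved {len(posts)} public posts; total reactions: {total_reactions}")
```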