Why these scientists devote time to editing and updating Wikipedia


Article by Christine Ro: “…A 2018 survey of more than 4,000 Wikipedians (as the site’s editors are called) found that 12% had a doctorate. Scientists made up one-third of the Wikimedia Foundation’s 16 trustees, according to Doronina.

Although Wikipedia is the best-known project under the Wikimedia umbrella, there are other ways for scientists to contribute besides editing Wikipedia pages. For example, an entomologist could upload photos of little-known insect species to Wikimedia Commons, a collection of images and other media. A computer scientist could add a self-published book to the digital textbook site Wikibooks. Or a linguist could explain etymology on the collaborative dictionary Wiktionary. All of these are open access, a key part of Wikimedia’s mission.

Although Wikipedia’s structure might seem daunting for new editors, there are parallels with academic documents.

For instance, Jess Wade, a physicist at Imperial College London, who focuses on creating and improving biographies of female scientists and scientists from low- and middle-income countries, says that the talk page, which is the behind-the-scenes portion of a Wikipedia page on which editors discuss how to improve it, is almost like the peer-review file of an academic paper…However, scientists have their own biases about aspects such as how to classify certain topics. This matters, Harrison says, because “Wikipedia is intended to be a general-purpose encyclopaedia instead of a scientific encyclopaedia.”

One example is a long-standing battle over Wikipedia pages on cryptids and folklore creatures such as Bigfoot. Labels such as ‘pseudoscience’ have angered cryptid enthusiasts and raised questions about different types of knowledge. One suggestion is for the pages to feature a disclaimer that says that a topic is not accepted by mainstream science.

Wade raises a point about resourcing, saying it’s especially difficult for the platform to retain academics who might be enthusiastic about editing Wikipedia initially, but then drop off. One reason is time. For full-time researchers, Wikipedia editing could be an activity best left to evenings, weekends and holidays…(More)”.

Regulatory Markets: The Future of AI Governance


Paper by Gillian K. Hadfield and Jack Clark: “Appropriately regulating artificial intelligence is an increasingly urgent policy challenge. Legislatures and regulators lack the specialized knowledge required to best translate public demands into legal requirements. Overreliance on industry self-regulation fails to hold producers and users of AI systems accountable to democratic demands. Regulatory markets, in which governments require the targets of regulation to purchase regulatory services from a private regulator, are proposed. This approach to AI regulation could overcome the limitations of both command-and-control regulation and self-regulation. Regulatory markets could enable governments to establish policy priorities for the regulation of AI, whilst relying on market forces and industry R&D efforts to pioneer the methods of regulation that best achieve policymakers’ stated objectives…(More)”.

Social Informatics


Book edited by Noriko Hara and Pnina Fichman: “Social informatics examines how society is influenced by digital technologies and how digital technologies are shaped by political, economic, and socio-cultural forces. The chapters in this edited volume use social informatics approaches to analyze recent issues in our increasingly data-intensive society.

Taking a social informatics perspective, this edited volume investigates the interaction between society and digital technologies and includes research that examines individuals, groups, organizations, and nations, as well as their complex relationships with pervasive mobile and wearable devices, social media platforms, artificial intelligence, and big data. This volume’s contributors range from seasoned and renowned researchers to upcoming researchers in social informatics. The readers of the book will understand theoretical frameworks of social informatics; gain insights into recent empirical studies of social informatics in specific areas such as big data and its effects on privacy, ethical issues related to digital technologies, and the implications of digital technologies for daily practices; and learn how the social informatics perspective informs research and practice…(More)”.

Handbook on Governance and Data Science


Handbook edited by Sarah Giest, Bram Klievink, Alex Ingrams, and Matthew M. Young: “This book is based on the idea that there are quite a few overlaps and connections between the field of governance studies and data science. Data science, with its focus on extracting insights from large datasets through sophisticated algorithms and analytics (Provost and Fawcett 2013), provides government with tools to potentially make more informed decisions, enhance service delivery, and foster transparency and accountability. Governance studies, concerned with the processes and structures through which public policy and services are formulated and delivered (Osborne 2006), increasingly rely on data-driven insights to address complex societal challenges, optimize resource allocation, and engage citizens more effectively (Meijer and Bolívar 2016). However, research insights in journals or at conferences remain quite separate, and thus there are limited spaces for having interconnected conversations. In addition, unprecedented societal challenges demand not only innovative solutions but new approaches to problem-solving.

In this context, data science techniques emerge as a crucial element in crafting a modern governance paradigm, offering predictive insights, revealing hidden patterns, and enabling real-time monitoring of public sentiment and service effectiveness, which are invaluable for public administrators (Kitchin 2014). However, the integration of data science into public governance also raises important considerations regarding data privacy, ethical use of data, and the need for transparency in algorithmic decision-making processes (Zuiderwijk and Janssen 2014). In short, this book is a space where governance and data science studies intersect, highlighting relevant opportunities and challenges at the intersection of both fields. Contributors to this book discuss the types of data science techniques applied in a governance context and the implications these have for government decisions and services. This also includes questions around the types of data that are used in government and how certain processes and challenges are measured…(More)”.

AI Upgrades the Internet of Things


Article by R. Colin Johnson: “Artificial Intelligence (AI) is renovating the fast-growing Internet of Things (IoT) by migrating AI innovations, including deep neural networks, Generative AI, and large language models (LLMs) from power-hungry datacenters to the low-power Artificial Intelligence of Things (AIoT). Located at the network’s edge, there are already billions of connected devices today, plus a predicted trillion more connected devices by 2035 (according to Arm, which licenses many of their processors).

The emerging details of this AIoT development period got a boost from ACM Transactions on Sensor Networks, which recently accepted for publication “Artificial Intelligence of Things: A Survey,” a paper authored by Mi Zhang of Ohio State University and collaborators at Michigan State University, the University of Southern California, and the University of California, Los Angeles. The survey is an in-depth reference to the latest AIoT research…

The survey addresses the subject of AIoT with AI-empowered sensing modalities including motion, wireless, vision, acoustic, multi-modal, ear-bud, and GenAI-assisted sensing. The computing section covers on-device inference engines, on-device learning, methods of training by partitioning workloads among heterogeneous accelerators, offloading privacy functions, federated learning that distributes workloads while preserving anonymity, integration with LLMs, and AI-empowered agents. Connection technologies discussed include Internet over Wi-Fi and over cellular/mobile networks, visible light communication systems, LoRa (long-range chirp spread-spectrum connections), and wide-area networks.

A sampling of domain-specific AIoTs reviewed in the survey includes AIoT systems for healthcare and well-being, for smart speakers, for video streaming, for video analytics, for autonomous driving, for drones, for satellites, for agriculture, for biology, and for artificial reality, virtual reality, and mixed reality…(More)”.

Intellectual property issues in artificial intelligence trained on scraped data


OECD Report: “Recent technological advances in artificial intelligence (AI), especially the rise of generative AI, have raised questions regarding the intellectual property (IP) landscape. As the demand for AI training data surges, certain data collection methods give rise to concerns about the protection of IP and other rights. This report provides an overview of key issues at the intersection of AI and some IP rights. It aims to facilitate a greater understanding of data scraping — a primary method for obtaining AI training data needed to develop many large language models. It analyses data scraping techniques, identifies key stakeholders, and worldwide legal and regulatory responses. Finally, it offers preliminary considerations and potential policy approaches to help guide policymakers in navigating these issues, ensuring that AI’s innovative potential is unleashed while protecting IP and other rights…(More)”.

Being an Effective Policy Analyst in the Age of Information Overload


Blog by Adam Thierer: “The biggest challenge of being an effective technology policy analyst, academic, or journalist these days is that the shelf life of your products is measured in weeks — and sometimes days — instead of months. Because of that, I’ve been adjusting my own strategies over time to remain effective.

The thoughts and advice I offer here are meant mostly for other technology policy analysts, whether you are a student or young professional just breaking into the field, or someone in the middle of your career looking to take it to the next level. But much of what I’ll say here is generally applicable across the field of policy analysis. It’s just a lot more relevant for people in the field of tech policy because of its fast-moving, ever-changing nature.

This essay will repeatedly reference two realities that have shaped my life both as an average citizen and as an academic and policy analyst: First, we used to live in a world of information scarcity, but we now live in a world of information abundance–and that trend is only accelerating. Second, life and work in a world of information overload is simultaneously a wonderful and awful thing, but one thing is for sure: there is absolutely no going back to the sleepy days of information scarcity.

If you care to be an effective policy analyst today, then you have to come to grips with these new realities. Here are a few tips…(More)”.

Building AI for the pluralistic society


Paper by Aida Davani and Vinodkumar Prabhakaran: “Modern artificial intelligence (AI) systems rely on input from people. Human feedback helps train models to perform useful tasks, guides them toward safe and responsible behavior, and is used to assess their performance. While hailing the recent AI advancements, we should also ask: which humans are we actually talking about? For AI to be most beneficial, it should reflect and respect the diverse tapestry of values, beliefs, and perspectives present in the pluralistic world in which we live, not just a single “average” or majority viewpoint. Diversity in perspectives is especially relevant when AI systems perform subjective tasks, such as deciding whether a response will be perceived as helpful, offensive, or unsafe. For instance, what one value system deems as offensive may be perfectly acceptable within another set of values.

Since divergence in perspectives often aligns with socio-cultural and demographic lines, preferentially capturing certain groups’ perspectives over others in data may result in disparities in how well AI systems serve different social groups. For instance, we previously demonstrated that simply taking a majority vote from human annotations may obfuscate valid divergence in perspectives across social groups, inadvertently marginalizing minority perspectives, and consequently performing less reliably for groups marginalized in the data. How AI systems should deal with such diversity in perspectives depends on the context in which they are used. However, current models lack a systematic way to recognize and handle such contexts.

With this in mind, here we describe our ongoing efforts in pursuit of capturing diverse perspectives and building AI for the pluralistic society in which we live… (More)”.
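
The point above about majority voting can be made concrete with a small sketch. The annotator groups, labels, and counts below are hypothetical and chosen only for illustration; they are not data from the paper, and the helper functions are not the authors' method. The sketch shows how a single aggregated label can hide a consistent disagreement between groups.

```python
from collections import Counter

# Hypothetical annotations for one model response: (annotator_group, label).
# Group names, labels, and counts are illustrative assumptions.
annotations = [
    ("group_a", "not_offensive"),
    ("group_a", "not_offensive"),
    ("group_a", "not_offensive"),
    ("group_a", "not_offensive"),
    ("group_b", "offensive"),
    ("group_b", "offensive"),
]


def majority_label(labels):
    """Return the most common label (ties go to the label seen first)."""
    return Counter(labels).most_common(1)[0][0]


def per_group_majority(pairs):
    """Return the majority label within each annotator group."""
    by_group = {}
    for group, label in pairs:
        by_group.setdefault(group, []).append(label)
    return {group: majority_label(labels) for group, labels in by_group.items()}


# A single majority vote across all annotators.
overall = majority_label([label for _, label in annotations])

# The same data, aggregated separately for each group.
by_group = per_group_majority(annotations)

print(overall)   # not_offensive
print(by_group)  # {'group_a': 'not_offensive', 'group_b': 'offensive'}
```

Here the overall vote reports "not_offensive" even though every annotator in the smaller group chose "offensive". Keeping the per-group breakdown alongside the aggregate is one simple way to keep that divergence visible.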

AI crawler wars threaten to make the web more closed for everyone


Article by Shayne Longpre: “We often take the internet for granted. It’s an ocean of information at our fingertips—and it simply works. But this system relies on swarms of “crawlers”—bots that roam the web, visit millions of websites every day, and report what they see. This is how Google powers its search engines, how Amazon sets competitive prices, and how Kayak aggregates travel listings. Beyond the world of commerce, crawlers are essential for monitoring web security, enabling accessibility tools, and preserving historical archives. Academics, journalists, and civil societies also rely on them to conduct crucial investigative research.  

Crawlers are endemic. Now representing half of all internet traffic, they will soon outpace human traffic. This unseen subway of the web ferries information from site to site, day and night. And as of late, they serve one more purpose: Companies such as OpenAI use web-crawled data to train their artificial intelligence systems, like ChatGPT. 

Understandably, websites are now fighting back for fear that this invasive species, AI crawlers, will help displace them. But there’s a problem: This pushback is also threatening the transparency and open borders of the web that allow non-AI applications to flourish. Unless we are thoughtful about how we fix this, the web will increasingly be fortified with logins, paywalls, and access tolls that inhibit not just AI but the biodiversity of real users and useful crawlers…(More)”.

How Philanthropy Built, Lost, and Could Reclaim the A.I. Race


Article by Sara Herschander: “How do we know you won’t pull an OpenAI?”

It’s the question Stella Biderman has gotten used to answering when she seeks funding from major foundations for EleutherAI, her two-year-old nonprofit A.I. lab that has developed open-source artificial intelligence models.

The irony isn’t lost on her. Not long ago, she declined a deal dangled by one of Silicon Valley’s most prominent venture capitalists who, with the snap of his fingers, promised to raise $100 million for the fledgling nonprofit lab — over 30 times EleutherAI’s current annual budget — if only the lab’s leaders would agree to drop its 501(c)(3) status.

In today’s A.I. gold rush, where tech giants spend billions on increasingly powerful models and top researchers command seven-figure salaries, to be a nonprofit A.I. lab is to be caught in a Catch-22: defend your mission to increasingly wary philanthropic funders or give in to temptation and become a for-profit company.

Philanthropy once played an outsize role in building major A.I. research centers and nurturing influential theorists — by donating hundreds of millions of dollars, largely to university labs — yet today those dollars are dwarfed by the billions flowing from corporations and venture capitalists. For tech nonprofits and their philanthropic backers, this has meant embracing a new role: pioneering the research and safeguards the corporate world won’t touch.

“If making a lot of money was my goal, that would be easy,” said Biderman, whose employees have seen their pay packages triple or quadruple after being poached by companies like OpenAI, Anthropic, and Google.

But EleutherAI doesn’t want to join the race to build ever-larger models. Instead, backed by grants from Open Philanthropy, Omidyar Network, and A.I. companies Hugging Face and StabilityAI, the group has carved out a different niche: researching how A.I. systems make decisions, maintaining widely used training datasets, and shaping global policy around A.I. safety and transparency…(More)”.