Paper by Philipp Schoenegger, Indre Tuminauskaite, Peter S. Park, and Philip E. Tetlock: “Human forecasting accuracy in practice relies on the ‘wisdom of the crowd’ effect, in which predictions about future events are significantly improved by aggregating across a crowd of individual forecasters. Past work on the forecasting ability of large language models (LLMs) suggests that frontier LLMs, as individual forecasters, underperform compared to the gold standard of a human crowd forecasting tournament aggregate. In Study 1, we expand this research by using an LLM ensemble approach consisting of a crowd of twelve LLMs. We compare the aggregated LLM predictions on 31 binary questions to those of a crowd of 925 human forecasters from a three-month forecasting tournament. Our preregistered main analysis shows that the LLM crowd outperforms a simple no-information benchmark and is not statistically different from the human crowd. In exploratory analyses, we find that these two approaches are equivalent with respect to medium-effect-size equivalence bounds. We also observe an acquiescence effect, with mean model predictions being significantly above 50%, despite an almost even split of positive and negative resolutions. Moreover, in Study 2, we test whether LLM predictions (of GPT-4 and Claude 2) can be improved by drawing on human cognitive output. We find that both models’ forecasting accuracy benefits from exposure to the median human prediction as information, improving accuracy by between 17% and 28%, though this still leads to less accurate predictions than simply averaging human and machine forecasts. Our results suggest that LLMs can achieve forecasting accuracy rivaling that of human crowd forecasting tournaments via the simple, practically applicable method of forecast aggregation. This replicates the ‘wisdom of the crowd’ effect for LLMs, and opens up their use for a variety of applications throughout society…(More)”.
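As a rough, hedged illustration of the aggregation idea described above (not the authors’ code, and with simulated numbers standing in for the study’s data), the Python sketch below takes the median of a hypothetical crowd of twelve LLM probability forecasts on binary questions, scores it with the Brier score, and also shows the Study 2-style averaging of human and machine forecasts:

```python
# Hedged sketch, not the authors' code: aggregate a "crowd" of LLM
# probability forecasts with the median and score it with the Brier score.
# All numbers are simulated stand-ins for the study's data.
import numpy as np

def brier_score(probabilities, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    probabilities = np.asarray(probabilities, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    return float(np.mean((probabilities - outcomes) ** 2))

rng = np.random.default_rng(0)

# Hypothetical forecasts: rows = 12 LLMs, columns = 31 binary questions.
llm_forecasts = rng.uniform(0.05, 0.95, size=(12, 31))

# "Wisdom of the crowd": take the median forecast for each question.
llm_crowd = np.median(llm_forecasts, axis=0)

# Hypothetical question resolutions (1 = event occurred, 0 = it did not).
resolutions = rng.integers(0, 2, size=31)

print("LLM-crowd Brier score:", brier_score(llm_crowd, resolutions))

# Study 2-style hybrid: average machine forecasts with a human median.
human_median = rng.uniform(0.05, 0.95, size=31)
hybrid = (llm_crowd + human_median) / 2
print("Hybrid Brier score:   ", brier_score(hybrid, resolutions))
```

Lower Brier scores are better; the median is only one common aggregation rule, and the paper’s exact aggregation and scoring choices may differ.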
Unconventional data, unprecedented insights: leveraging non-traditional data during a pandemic
Paper by Kaylin Bolt et al.: “The COVID-19 pandemic prompted new interest in non-traditional data sources to inform response efforts and mitigate knowledge gaps. While non-traditional data offers some advantages over traditional data, it also raises concerns related to biases, representativity, informed consent and security vulnerabilities. This study focuses on three specific types of non-traditional data: mobility, social media, and participatory surveillance platform data. Qualitative results are presented on the successes, challenges, and recommendations of key informants who used these non-traditional data sources during the COVID-19 pandemic in Spain and Italy….
Non-traditional data proved valuable in providing rapid results and filling data gaps, especially when traditional data faced delays. Increased data access and innovative collaborative efforts across sectors facilitated its use. Challenges included unreliable access and data quality concerns, particularly the lack of comprehensive demographic and geographic information. To further leverage non-traditional data, participants recommended prioritizing data governance, establishing data brokers, and sustaining multi-institutional collaborations. Non-traditional data was perceived as underutilized in public health surveillance, program evaluation, and policymaking. Participants saw opportunities to integrate these data sources into public health systems with the necessary investments in data pipelines, infrastructure, and technical capacity…(More)”.
Evaluation in the Post-Truth World
Book edited by Mita Marra, Karol Olejniczak, and Arne Paulson: “…explores the relationship between the nature of evaluative knowledge, the increasing demand in decision-making for evaluation and other forms of research evidence, and the post-truth phenomena of antiscience sentiments combined with illiberal tendencies of the present day. Rather than offer a checklist on how to deal with post-truth, the experts found herein wish to raise awareness and reflection throughout policy circles on the factors that influence our assessment and policy-related work in such a challenging environment. Journeying alongside the editors and contributors, readers benefit from three guiding questions that help identify not only specific challenges but also tools to deal with such challenges: How are policy problems conceptualized in the current political climate? What is the relationship between expertise and decision-making in today’s political circumstances? How complex has evaluation become as a social practice? Evaluation in the Post-Truth World will benefit evaluation practitioners at the program and project levels, as well as policy analysts and scholars interested in applications of evaluation in the public policy domain…(More)”.
Mark the good stuff: Content provenance and the fight against disinformation
BBC Blog: “BBC News’s Verify team is a dedicated group of 60 journalists who fact-check, verify video, counter disinformation, analyse data and – crucially – explain complex stories in the pursuit of truth. On Monday, March 4th, Verify published their first article using a new open media provenance technology called C2PA. The C2PA standard is a technology that records digitally signed information about the provenance of imagery, video and audio – information (or signals) that shows where a piece of media has come from and how it’s been edited. Like an audit trail or a history, these signals are called ‘content credentials’.
Content credentials can be used to help audiences distinguish between authentic, trustworthy media and content that has been faked. The digital signature attached to the provenance information ensures that when the media is “validated”, the person or computer reading the image can be sure that it came from the BBC (or any other source with its own X.509 certificate).
This is important for two reasons. First, it gives publishers like the BBC the ability to share transparently with our audiences what we do every day to deliver great journalism. Second, it allows us to mark content that is shared across third-party platforms (like Facebook), so audiences can trust that when they see a piece of BBC content it does in fact come from the BBC.
For the past three years, BBC R&D has been an active partner in the development of the C2PA standard. It has been developed in collaboration with major media and technology partners, including Microsoft, the New York Times and Adobe. Membership in C2PA is growing to include organisations from all over the world, from established hardware manufacturers like Canon to technology leaders like OpenAI, fellow media organisations like NHK, and even the Publicis Group, which covers the advertising industry. Google has now joined the C2PA steering committee and social media companies are leaning in too: Meta has recently announced that it is actively assessing how to implement C2PA across its platforms…(More)”.
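To make the “digitally signed provenance” idea above concrete, here is a minimal, hypothetical Python sketch of checking a signed manifest against a publisher’s X.509 certificate with the cryptography library. It is not the actual C2PA format, which defines its own manifest and signature structures; it only illustrates the underlying public-key verification step that lets a reader confirm a piece of media really came from the claimed source.

```python
# Hypothetical sketch of the public-key check behind content credentials.
# Not the real C2PA format: it only shows how a digital signature binds
# provenance information (a "manifest") to a publisher's certificate.
from cryptography import x509
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding

def verify_manifest(cert_pem: bytes, signature: bytes, manifest: bytes) -> bool:
    """Return True if `signature` over `manifest` checks out against the
    public key in the publisher's X.509 certificate (assumes an RSA key)."""
    certificate = x509.load_pem_x509_certificate(cert_pem)
    public_key = certificate.public_key()
    try:
        public_key.verify(signature, manifest, padding.PKCS1v15(), hashes.SHA256())
        return True
    except InvalidSignature:
        return False  # manifest was altered or signed by someone else

# Usage (placeholders): a real validator would also check the certificate
# chain and that the manifest's embedded hashes match the media bytes.
# ok = verify_manifest(publisher_cert_pem, manifest_signature, manifest_bytes)
```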
The AI data scraping challenge: How can we proceed responsibly?
Article by Lee Tiedrich: “Society faces an urgent and complex artificial intelligence (AI) data scraping challenge. Left unsolved, it could threaten responsible AI innovation. Data scraping refers to using web crawlers or other means to obtain data from third-party websites or social media properties. Today’s large language models (LLMs) depend on vast amounts of scraped data for training and potentially other purposes. Scraped data can include facts, creative content, computer code, personal information, brands, and just about anything else. At least some LLM operators directly scrape data from third-party sites. Common Crawl, LAION, and other sites make scraped data readily accessible. Meanwhile, Bright Data and others offer scraped data for a fee.
In addition to fueling commercial LLMs, scraped data can provide researchers with much-needed data to advance social good. For instance, Environmental Journal explains how scraped data enhances sustainability analysis. Nature reports that scraped data improves research about opioid-related deaths. Training data in different languages can help make AI more accessible for users in Africa and other underserved regions. Access to training data can even advance the OECD AI Principles by improving safety and reducing bias and other harms, particularly when such data is suitable for the AI system’s intended purpose…(More)”.
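One small, practical piece of “proceeding responsibly” is simply honouring a site’s robots.txt and pacing requests before scraping. The hedged Python sketch below (placeholder URL and user agent; it does not by itself resolve the legal and IP questions the article raises) shows a minimal polite fetcher:

```python
# Hedged sketch: a minimal "polite" fetcher that checks robots.txt and
# waits between requests before scraping a page. URLs are placeholders.
import time
import urllib.request
import urllib.robotparser
from urllib.parse import urlparse

USER_AGENT = "example-research-bot"  # hypothetical user agent string

def can_fetch(url: str, user_agent: str = USER_AGENT) -> bool:
    """Return True if the site's robots.txt permits fetching this URL."""
    parts = urlparse(url)
    robots_url = f"{parts.scheme}://{parts.netloc}/robots.txt"
    parser = urllib.robotparser.RobotFileParser()
    parser.set_url(robots_url)
    parser.read()
    return parser.can_fetch(user_agent, url)

def polite_fetch(url: str, delay_seconds: float = 1.0) -> bytes | None:
    """Fetch a page only if robots.txt allows it, with a crawl delay."""
    if not can_fetch(url):
        return None  # respect the site's stated crawling preferences
    time.sleep(delay_seconds)  # avoid hammering the server
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request) as response:
        return response.read()

# Example (placeholder URL):
# html = polite_fetch("https://example.com/some-page")
```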
The Computable City: Histories, Technologies, Stories, Predictions
Book by Michael Batty: “At every stage in the history of computers and communications, it is safe to say we have been unable to predict what happens next. When computers first appeared nearly seventy-five years ago, primitive computer models were used to help understand and plan cities, but as computers became faster, smaller, more powerful, and ever more ubiquitous, cities themselves began to embrace them. As a result, the smart city emerged. In The Computable City, Michael Batty investigates the circularity of this peculiar evolution: how computers and communications changed the very nature of our city models, which, in turn, are used to simulate systems composed of those same computers.
Batty first charts the origins of computers and examines how our computational urban models have developed and how they have been enriched by computer graphics. He then explores the sequence of digital revolutions and how they are converging, focusing on continual changes in new technologies, as well as the twenty-first-century surge in social media, platform economies, and the planning of the smart city. He concludes by revisiting the digital transformation as it continues to confound us, with the understanding that the city, now a high-frequency twenty-four-hour version of itself, changes our understanding of what is possible…(More)”.
Societal challenges and big qualitative data require a new era of methodological pragmatism
Blog by Alex Gillespie, Vlad Glăveanu, and Constance de Saint-Laurent: “The ‘classic’ methods we use today in psychology and the social sciences might seem relatively fixed, but they are the product of collective responses to concerns within a historical context. The 20th-century methods of questionnaires and interviews made sense in a world where researchers did not have access to what people did or said, and even if they did, could not analyse it at scale. Questionnaires and interviews were suited to 20th-century concerns (shaped by colonialism, capitalism, and the ideological battles of the Cold War) for understanding, classifying, and mapping opinions and beliefs.
However, what social scientists are faced with today is different due to the culmination of two historical trends. The first has to do with the nature of the problems we face. Inequalities, the climate emergency and current wars are compounded by a general rise in nationalism, populism, and especially post-truth discourses and ideologies. Nationalism and populism are not new, but the scale and sophistication of misinformation threatens to undermine collective responses to collective problems.
The second trend refers to technology and its accelerated development, especially the unprecedented accumulation of naturally occurring data (digital footprints) combined with increasingly powerful methods for data analysis (traditional and generative AI). It is often said that we live in the age of ‘big data’, but what is less often said is that this is in fact the age of ‘big qualitative data’. The biggest datasets are unstructured qualitative data (each minute adds 2.5 million Google text searches, 500 thousand photos on Snapchat, 500 hours of YouTube videos) and the most significant AI advances leverage this qualitative data and make it tractable for social research.
These two trends have been fuelling the rise in mixed methods research…(More)” (See also their new book ‘Pragmatism and Methodology’, which is open access.)
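As a hedged, minimal illustration of what making “big qualitative data” tractable can look like on the traditional-methods side (not an example from the blog; the snippets are placeholders), the Python sketch below clusters a handful of short texts with TF-IDF and k-means to surface rough themes. Real applications would run over millions of documents and, increasingly, over generative-AI pipelines:

```python
# Hedged sketch: turn unstructured text into TF-IDF features and cluster it
# to surface rough themes. The documents below are tiny placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

documents = [
    "rising energy prices and household inequality",
    "flooding linked to the climate emergency",
    "misinformation spreading on social media",
    "populist rhetoric in election campaigns",
]

# Convert the corpus into sparse TF-IDF vectors, dropping English stop words.
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(documents)

# Group the documents into two rough themes.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Print the most heavily weighted terms for each cluster centre.
terms = vectorizer.get_feature_names_out()
for cluster_id, centre in enumerate(kmeans.cluster_centers_):
    top_terms = [terms[i] for i in centre.argsort()[::-1][:3]]
    print(f"Cluster {cluster_id}: {', '.join(top_terms)}")
```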
Evaluating LLMs Through a Federated, Scenario-Writing Approach
Article by Bogdana “Bobi” Rakova: “What do screenwriters, AI builders, researchers, and survivors of gender-based violence have in common? I’d argue they all imagine new, safe, compassionate, and empowering approaches to building understanding.
In partnership with Kwanele South Africa, I lead an interdisciplinary team, exploring this commonality in the context of evaluating large language models (LLMs) — more specifically, chatbots that provide legal and social assistance in a critical context. The outcomes of our engagement are a series of evaluation objectives and scenarios that contribute to an evaluation protocol with the core tenet that when we design for the most vulnerable, we create better futures for everyone. In what follows I describe our process. I hope this methodological approach and our early findings will inspire other evaluation efforts to meaningfully center the margins in building more positive futures that work for everyone…(More)”
Generative AI: Navigating Intellectual Property
Factsheet by WIPO: “Generative artificial intelligence (AI) tools are rapidly being adopted by many businesses and organizations for the purpose of content generation. Such tools represent both a substantial opportunity to assist business operations and a significant legal risk due to current uncertainties, including intellectual property (IP) questions.
Many organizations are seeking to put guidance in place to help their employees mitigate these risks. While each business situation and legal context will be unique, the following Guiding Principles and Checklist are intended to assist organizations in understanding the IP risks, asking the right questions, and considering potential safeguards…(More)”.
Surveilling Alone
Essay by Christine Rosen: “When Jane Jacobs, author of the 1961 classic The Death and Life of Great American Cities, outlined the qualities of successful neighborhoods, she included “eyes on the street,” or, as she described this, the “eyes belonging to those we might call the natural proprietors of the street,” including shopkeepers and residents going about their daily routines. Not every neighborhood enjoyed the benefit of this informal sense of community, of course, but it was widely seen to be desirable. What Jacobs understood is that the combined impact of many local people practicing normal levels of awareness in their neighborhoods on any given day is surprisingly effective for community-building, with the added benefit of building trust and deterring crime.
Jacobs’s championing of these “natural proprietors of the street” was a response to a mid-century concern that aggressive city planning would eradicate the vibrant experience of neighborhoods like her own, the Village in New York City. Jacobs famously took on “master planner” Robert Moses after he proposed building an expressway through Lower Manhattan, a scheme that, had it succeeded, would have destroyed Washington Square Park and the Village, and turned neighborhoods around SoHo into highway underpasses. For Jacobs and her fellow citizen activists, the efficiency of the proposed highway was not enough to justify eliminating bustling sidewalks and streets, where people played a crucial role in maintaining the health and order of their communities.
Today, a different form of efficient design is eliminating “eyes on the street” — by replacing them with technological ones. The proliferation of neighborhood surveillance technologies such as Ring cameras and digital neighborhood-watch platforms and apps such as Nextdoor and Citizen has freed us from the constraints of having to be physically present to monitor our homes and streets. Jacobs’s “eyes on the street” are now cameras on many homes, and the everyday interactions between neighbors and strangers are now a network of cameras and platforms that promise to put “neighborhood security in your hands,” as the Ring Neighbors app puts it.
Inside our homes, we monitor ourselves and our family members with equal zeal, making use of video baby monitors, GPS-tracking software for children’s smartphones (or for covert surveillance by a suspicious spouse), and “smart” speakers that are always listening and often recording when they shouldn’t. A new generation of domestic robots, such as Amazon’s Astro, combines several of these features into a roving service-machine always at your beck and call around the house and ever watchful of its security when you are away…(More)”.