What Does Information Integrity Mean for Democracies?


Article by Kamya Yadav and Samantha Lai: “Democracies around the world are encountering unique challenges with the rise of new technologies. Experts continue to debate how social media has impacted democratic discourse, pointing to how algorithmic recommendations, influence operations, and cultural changes in norms of communication alter the way people consume information. Meanwhile, developments in artificial intelligence (AI) surface new concerns over how the technology might affect voters’ decision-making process. Already, we have seen its increased use in political campaigning.

In the run-up to Pakistan’s 2024 general elections, former Prime Minister Imran Khan used an artificially generated speech to campaign while imprisoned. Meanwhile, in the United States, a private company used an AI-generated imitation of President Biden’s voice to discourage people from voting. In response, the Federal Communications Commission outlawed the use of AI-generated robocalls.

Evolving technologies present new threats. Disinformation, misinformation, and propaganda are all different faces of the same problem: Our information environment—the ecosystem in which we create, disseminate, receive, and process information—is not secure, and we lack coherent goals to direct policy actions. Formulating short-term, reactive policy to counter or mitigate the effects of disinformation or propaganda can only bring us so far. Beyond defending democracies from unending threats, we should also be looking at what it will take to strengthen them. This raises the question: How do we work toward building secure and resilient information ecosystems? How can policymakers and democratic governments identify policy areas that require further improvement and shape their actions accordingly?…(More)”.

Commons-based Data Set: Governance for AI


Report by Open Future: “In this white paper, we propose an approach to sharing data sets for AI training as a public good governed as a commons. By adhering to the six principles of commons-based governance, data sets can be managed in a way that generates public value while making shared resources resilient to extraction or capture by commercial interests.

The purpose of defining these principles is two-fold:

First, we propose these principles as input into policy debates on data and AI governance. A commons-based approach can be introduced through regulatory means, funding and procurement rules, statements of principles, or data-sharing frameworks. Second, these principles can serve as a blueprint for the design of data sets that are governed and shared as a commons. To this end, we also provide practical examples of how these principles are being brought to life. Projects like Big Science or Common Voice have demonstrated that commons-based data sets can be successfully built.

These principles, tailored for the governance of AI data sets, build on our previous work on the Data Commons Primer. They are also the outcome of our research into the governance of AI data sets, including the AI_Commons case study. Finally, they are based on ongoing efforts to define how AI systems can be shared and made open, in which we have been participating, including the OSI-led process to define open-source AI systems and the DPGA Community of Practice exploring AI systems as Digital Public Goods…(More)”.


Digital public infrastructure and public value: What is ‘public’ about DPI?


Paper by David Eaves, Mariana Mazzucato and Beatriz Vasconcellos: “Digital Public Infrastructures (DPI) are becoming increasingly relevant in the policy and academic domains, with DPI not only regulated but also funded and created by governments, international organisations, philanthropies and the private sector. However, these transformations are not neutral; they have a direction. This paper addresses how to ensure that DPI is not only regulated but created and governed for the common good by maximising public value creation. Our analysis makes explicit which normative values may be associated with DPI development. We also argue that normative values are necessary but not sufficient for maximising public value creation with DPI, and that a more proactive role for the state and for governance is key. In this work, policymakers and researchers will find valuable frameworks for understanding where the value-creation elements of DPI come from and how to design DPI governance that maximises public value…(More)”.

Using online search activity for earlier detection of gynaecological malignancy


Paper by Jennifer F. Barcroft et al: “Ovarian cancer is the most lethal and endometrial cancer the most common gynaecological cancer in the UK, yet neither has a screening programme in place to facilitate early disease detection. The aim of this study is to evaluate whether online search data can be used to differentiate between individuals with malignant and benign gynaecological diagnoses.

This is a prospective cohort study evaluating online search data in symptomatic individuals (Google users) referred from primary care (GP) with a suspected cancer to a London hospital (UK) between December 2020 and June 2022. Informed written consent was obtained, and online search data was extracted via Google Takeout and anonymised. A health filter was applied to extract health-related terms for the 24 months prior to GP referral. A predictive model (outcome: malignancy) was developed using (1) search queries (terms model) and (2) categorised search queries (categories model). Area under the ROC curve (AUC) was used to evaluate model performance. 844 women were approached, 652 were eligible to participate and 392 were recruited. Of those recruited, 108 did not complete enrolment, 12 withdrew and 37 were excluded because they did not track Google searches or had an empty search history, leaving a cohort of 235.

The cohort had a median age of 53 years (range 20–81) and a malignancy rate of 26.0%. There was a difference in online search data between those with a benign and those with a malignant diagnosis, noted as early as 360 days in advance of GP referral when search queries were used directly, but only 60 days in advance when queries were divided into health categories. A model using online search data from the patients (n = 153) who performed health-related searches achieved its highest sample-corrected AUC of 0.82 at 60 days prior to GP referral.

Online search data appears to differ between individuals with malignant and benign gynaecological conditions, with a signal observed in advance of the GP referral date. Online search data needs to be evaluated in a larger dataset to determine its value as an early disease detection tool and whether its use leads to improved clinical outcomes…(More)”.
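To make the two modelling routes concrete, below is a minimal, hypothetical sketch of a “terms model”: each patient’s health-related queries are vectorised with TF-IDF, a logistic-regression classifier is fitted, and performance is summarised by AUC. The toy data, features and train/test split are illustrative assumptions, not the authors’ actual pipeline.

```python
# Hypothetical sketch of a "terms model": classify patients as having a
# malignant vs. benign diagnosis from their health-related search queries.
# Toy data and parameters are illustrative only; the study's real features
# and validation scheme are described in the paper.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# One document per patient: all health-related queries in the lookback
# window (e.g. the 60 days before GP referral), concatenated.
search_histories = [
    "bloating abdominal pain ovarian cyst symptoms",
    "knee pain after running stretches",
    "postmenopausal bleeding causes cancer risk",
    "headache remedies sleep hygiene",
]
labels = [1, 0, 1, 0]  # 1 = malignant diagnosis, 0 = benign

X_train, X_test, y_train, y_test = train_test_split(
    search_histories, labels, test_size=0.5, random_state=0, stratify=labels
)

vectoriser = TfidfVectorizer(ngram_range=(1, 2))
X_train_vec = vectoriser.fit_transform(X_train)
X_test_vec = vectoriser.transform(X_test)

model = LogisticRegression(max_iter=1000)
model.fit(X_train_vec, y_train)

# AUC on held-out patients: the probability that a randomly chosen
# malignant case is ranked above a randomly chosen benign one.
scores = model.predict_proba(X_test_vec)[:, 1]
print("AUC:", roc_auc_score(y_test, scores))
```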

Influence of public innovation laboratories on the development of public sector ambidexterity


Article by Christophe Favoreu et al: “Ambidexterity has become a major issue for public organizations as they manage increasingly strong contradictory pressures to optimize existing processes while innovating. Moreover, although public innovation laboratories are emerging, their influence on the development of ambidexterity remains largely unexplored. Our research aims to understand how innovation laboratories contribute to the formation of individual ambidexterity within the public sector. Drawing from three case studies, this research underscores the influence of these labs on public ambidexterity through the development of innovations by non-specialized actors and the deployment and reuse of innovative managerial practices and techniques outside the i-labs…(More)”.

Responsible Data Re-use in Developing Countries: Social Licence through Public Engagement


Report by Stefaan Verhulst, Laura Sandor, Natalia Mejia Pardo, Elena Murray and Peter Addo: “The datafication era has transformed the technological landscape, digitizing multiple areas of human life and offering opportunities for societal progress through the re-use of digital data. Developing countries stand to benefit from datafication but face challenges such as insufficient data quality and limited infrastructure. One of the primary obstacles to unlocking data re-use lies in agency asymmetries—disparities in decision-making authority among stakeholders—which fuel public distrust. Existing consent frameworks amplify the challenge, as they are individual-focused, lack information, and fail to address the nuances of data re-use. To address these limitations, a Social Licence for re-use becomes imperative—a community-focused approach that fosters responsible data practices and benefits all stakeholders. This shift is crucial for establishing trust and collaboration and for bridging the gap between institutions, governments, and citizens…(More)”.

Untapped


About: “Twenty-first century collective intelligence – combining people’s knowledge and skills, new forms of data and, increasingly, technology – has the untapped potential to transform the way we understand and act on climate change.

Collective intelligence for climate action in the Global South takes many forms: from crowdsourcing of indigenous knowledge to preserve biodiversity to participatory monitoring of extreme heat and farmer experiments adapting crops to weather variability.

This research analyzes 100+ climate case studies across 45 countries that tap into people’s participation and use new forms of data. It illustrates the potential that exists in communities everywhere to contribute to climate adaptation and mitigation efforts. It also aims to shine a light on practical ways in which these initiatives could be designed and further developed so that this potential can be fully unleashed…(More)”.

Central banks use AI to assess climate-related risks


Article by Huw Jones: “Central bankers said on Tuesday they have broken new ground by using artificial intelligence to collect data for assessing climate-related financial risks, just as the volume of disclosures from banks and other companies is set to rise.

The Bank for International Settlements, a forum for central banks, the Bank of Spain, Germany’s Bundesbank and the European Central Bank said their experimental Gaia AI project was used to analyse company disclosures on carbon emissions, green bond issuance and voluntary net-zero commitments.

Regulators of banks, insurers and asset managers need high-quality data to assess the impact of climate change on financial institutions. However, the absence of a single reporting standard confronts them with a patchwork of public information spread across text, tables and footnotes in annual reports.

Gaia was able to overcome differences in definitions and disclosure frameworks across jurisdictions to offer much-needed transparency, and make it easier to compare indicators on climate-related financial risks, the central banks said in a joint statement.

Despite variations in how the same data is reported by companies, Gaia focuses on the definition of each indicator, rather than how the data is labelled.

Furthermore, with the traditional approach, each additional key performance indicator, or KPI, and each new institution requires the analyst to either search for the information in public corporate reports or contact the institution for information…(More)”.
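As a rough illustration of the contrast drawn above, the sketch below scores disclosure snippets against the definition of an indicator rather than matching a fixed label. The indicator definition, snippets, and the use of TF-IDF cosine similarity are illustrative assumptions; the excerpt describes Gaia only as an AI project, not this particular architecture.

```python
# Hypothetical illustration of definition-based matching: instead of looking
# for a fixed label ("Scope 1 emissions"), score each disclosure snippet
# against the *definition* of the indicator and keep the best match.
# The definition and snippets below are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

indicator_definition = (
    "direct greenhouse gas emissions from sources owned or controlled "
    "by the company, reported in tonnes of CO2 equivalent"
)

# Snippets pulled from text, tables and footnotes of an annual report,
# each using its own terminology for the same underlying indicator.
snippets = [
    "Our direct GHG emissions from owned facilities totalled 1.2m tCO2e.",
    "The board met eleven times during the financial year.",
    "Scope 1 emissions (operational control basis): 1,200,000 tCO2e.",
    "Green bond issuance reached EUR 500m in 2023.",
]

vectoriser = TfidfVectorizer(ngram_range=(1, 2))
matrix = vectoriser.fit_transform([indicator_definition] + snippets)

# Rank snippets by similarity to the definition, not by exact label.
similarities = cosine_similarity(matrix[0], matrix[1:]).ravel()
for snippet, score in sorted(zip(snippets, similarities),
                             key=lambda pair: -pair[1]):
    print(f"{score:.2f}  {snippet}")
```

A production system would replace TF-IDF with richer semantic representations, but the design point is the same: each new KPI only requires writing one definition, not a new per-institution extraction rule.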

The Wisdom of Partisan Crowds: Comparing Collective Intelligence in Humans and LLM-based Agents


Paper by Yun-Shiuan Chuang et al: “Human groups are able to converge to more accurate beliefs through deliberation, even in the presence of polarization and partisan bias – a phenomenon known as the “wisdom of partisan crowds.” Large Language Model (LLM) agents are increasingly being used to simulate human collective behavior, yet few benchmarks exist for evaluating their dynamics against the behavior of human groups. In this paper, we examine the extent to which the wisdom of partisan crowds emerges in groups of LLM-based agents that are prompted to role-play as partisan personas (e.g., Democrat or Republican). We find that they not only display human-like partisan biases, but also converge to more accurate beliefs through deliberation, as humans do. We then identify several factors that interfere with convergence, including the use of chain-of-thought prompting and a lack of detail in personas. Conversely, fine-tuning on human data appears to enhance convergence. These findings show the potential and limitations of LLM-based agents as a model of human collective intelligence…(More)”
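For readers who want a feel for the setup, here is a stripped-down, hypothetical sketch of a deliberation loop among persona-prompted agents. The `ask_llm` function is a placeholder for a real chat-completion call; the stub returns persona-biased noisy estimates so the script runs end to end. The personas, question, and update rule are assumptions, not the paper’s actual benchmark.

```python
# Minimal sketch of a "wisdom of partisan crowds" simulation with LLM agents.
# ask_llm stands in for an LLM call; the stub simulates a partisan prior
# that gets pulled toward the peer mean during deliberation rounds.
import random
import statistics

QUESTION = "Estimate the current US unemployment rate (percent)."
TRUE_VALUE = 3.9  # illustrative ground truth for measuring group error

def ask_llm(persona: str, question: str, peer_answers: list) -> float:
    """Placeholder for an LLM call returning the agent's numeric estimate.

    A real prompt would include the persona, the question, and (after
    round 0) the peer answers shown to the agent."""
    prior = 5.5 if persona == "Republican" else 3.0  # assumed partisan bias
    if peer_answers:
        prior = 0.4 * prior + 0.6 * statistics.mean(peer_answers)
    return prior + random.gauss(0, 0.3)

agents = ["Democrat"] * 5 + ["Republican"] * 5
beliefs = [ask_llm(p, QUESTION, []) for p in agents]  # initial estimates

for round_num in range(3):  # deliberation rounds
    # Each agent sees everyone else's current answer, then revises.
    beliefs = [
        ask_llm(p, QUESTION, beliefs[:i] + beliefs[i + 1:])
        for i, p in enumerate(agents)
    ]
    group_error = abs(statistics.mean(beliefs) - TRUE_VALUE)
    print(f"round {round_num}: mean={statistics.mean(beliefs):.2f} "
          f"error={group_error:.2f}")
```

Convergence here means the spread of beliefs narrows toward a group mean that is more accurate than either partisan prior alone, which is the effect the paper tests for in real LLM agents.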

Data Disquiet: Concerns about the Governance of Data for Generative AI


Paper by Susan Aaronson: “The growing popularity of large language models (LLMs) has raised concerns about their accuracy. These chatbots can be used to provide information, but it may be tainted by errors or by made-up or false information (hallucinations) caused by problematic data sets or incorrect assumptions made by the model. The questionable results produced by chatbots have led to growing disquiet among users, developers and policy makers. The author argues that policy makers need to develop a systemic approach to address these concerns. The current piecemeal approach does not reflect the complexity of LLMs or the magnitude of the data upon which they are based; therefore, the author recommends incentivizing greater transparency and accountability around data-set development…(More)”.