Paper by Damon M Hall et al: “Citizen science is personal. Participation is contingent on the citizens’ connection to a topic or to interpersonal relationships meaningful to them. But from the peer-reviewed literature, scientists appear to have an acquisitive data-centered relationship with citizens. This has spurred ethical and pragmatic criticisms of extractive relationships with citizen scientists. We suggest five practical steps to shift citizen-science research from extractive to relational, reorienting the research process and providing reciprocal benefits to researchers and citizen scientists. By virtue of their interests and experience within their local environments, citizen scientists have expertise that, if engaged, can improve research methods and product design decisions. To boost the value of scientific outputs to society and participants, citizen-science research teams should rethink how they engage and value volunteers…(More)”.
Predicting IMF-Supported Programs: A Machine Learning Approach
Paper by Tsendsuren Batsuuri, Shan He, Ruofei Hu, Jonathan Leslie and Flora Lutz: “This study applies state-of-the-art machine learning (ML) techniques to forecast IMF-supported programs, analyzes the ML prediction results relative to traditional econometric approaches, explores non-linear relationships among predictors indicative of IMF-supported programs, and evaluates model robustness with regard to different feature sets and time periods. ML models consistently outperform traditional methods in out-of-sample prediction of new IMF-supported arrangements with key predictors that align well with the literature and show consensus across different algorithms. The analysis underscores the importance of incorporating a variety of external, fiscal, real, and financial features as well as institutional factors like membership in regional financing arrangements. The findings also highlight the varying influence of data processing choices such as feature selection, sampling techniques, and missing data imputation on the performance of different ML models and therefore indicate the usefulness of a flexible, algorithm-tailored approach. Additionally, the results reveal that models that are most effective in near and medium-term predictions may tend to underperform over the long term, thus illustrating the need for regular updates or more stable – albeit potentially near-term suboptimal – models when frequent updates are impractical…(More)”.
Whatever Happened to All Those Care Robots?
Article by Stephanie H. Murray: “So far, companion robots haven’t lived up to the hype—and might even exacerbate the problems they’re meant to solve…There are likely many reasons that the long-predicted robot takeover of elder care has yet to take off. Robots are expensive, and cash-strapped care homes don’t have money lying around to purchase a robot, let alone to pay for the training needed to actually use one effectively. And at least so far, social robots just aren’t worth the investment, Wright told me. Pepper can’t do a lot of the things people claimed he could—and he relies heavily on humans to help him do what he can. Despite some research suggesting they can boost well-being among the elderly, robots have shown little evidence that they make life easier for human caregivers. In fact, they require quite a bit of care themselves. Perhaps robots of the future will revolutionize caregiving as hoped. But the care robots we have now don’t even come close, and might even exacerbate the problems they’re meant to solve…(More)”.
Facial Recognition Technology: Current Capabilities, Future Prospects, and Governance
Report by the National Academies of Sciences, Engineering, and Medicine: “Facial recognition technology is increasingly used for identity verification and identification, from aiding law enforcement investigations to identifying potential security threats at large venues. However, advances in this technology have outpaced laws and regulations, raising significant concerns related to equity, privacy, and civil liberties.
This report explores the current capabilities, future possibilities, and necessary governance for facial recognition technology. Facial Recognition Technology discusses legal, societal, and ethical implications of the technology, and recommends ways that federal agencies and others developing and deploying the technology can mitigate potential harms and enact more comprehensive safeguards…(More)”.
Why we’re fighting to make sure labor unions have a voice in how AI is implemented
Article by Liz Shuler and Mike Kubzansky: “Earlier this month, Google’s co-founder admitted that the company had “definitely messed up” after its AI tool, Gemini, produced historically inaccurate images—including depictions of racially diverse Nazis. Sergey Brin cited a lack of “thorough testing” of the AI tool, but the incident is a good reminder that, despite all the hype around generative AI replacing human output, the technology still has a long way to go.
Of course, that hasn’t stopped companies from deploying AI in the workplace. Some even use the technology as an excuse to lay workers off. Since last May, at least 4,000 people have lost their jobs to AI, and 70% of workers across the country live with the fear that AI is coming for theirs next. And while the technology may still be in its infancy, it’s developing fast. Earlier this year, AI pioneer Mustafa Suleyman said that “left completely to the market and to their own devices, [AI tools are] fundamentally labor-replacing.” Without changes now, AI could be coming to replace a lot of people’s jobs.
It doesn’t have to be this way. AI has enormous potential to build prosperity and unleash human creativity, but only if it also works for working people. Ensuring that happens requires giving the voice of workers—the people who will engage with these technologies every day, and whose lives, health, and livelihoods are increasingly affected by AI and automation—a seat at the decision-making table.
As president of the AFL-CIO, representing 12.5 million working people across 60 unions, and CEO of Omidyar Network, a social change philanthropy that supports responsible technology, we believe that the single best movement to give everyone a voice is the labor movement. Empowering workers—from warehouse associates to software engineers—is the most powerful tactic we have to ensure that AI develops in the interests of the many, not the few…(More)”.
Monitoring global trade using data on vessel traffic
Article by Graham Pilgrim, Emmanuelle Guidetti and Annabelle Mourougane: “Rising uncertainties and geo-political tensions, together with more complex trade relations have increased the demand for data and tools to monitor global trade in a timely manner. At the same time, advances in Big Data Analytics and access to a huge quantity of alternative data – outside the realm of official statistics – have opened new avenues to monitor trade. These data can help identify bottlenecks and disruptions in real time but need to be cleaned and validated.
One such alternative data source is the Automatic Identification System (AIS), developed by the International Maritime Organisation, facilitating the tracking of vessels across the globe. The system includes messages transmitted by ships to land or satellite receivers, available in quasi real time. While it was primarily designed to ensure vessel safety, this data is particularly well suited for providing insights on trade developments, as over 80% in volume of international merchandise trade is carried by sea (UNCTAD, 2022). Furthermore, AIS data holds granular vessel information and detailed location data, which combined with other data sources can enable the identification of activity at a port (or even berth) level, by vessel type or by the jurisdiction of vessel ownership.
For a number of years, the UN Global Platform has made AIS data available to those compiling official statistics, such as National Statistics Offices (NSOs) or International Organisations. This has facilitated the development of new methodologies, for instance the automated identification of port locations (Irish Central Statistics Office, 2022). The data has also been exploited by data scientists and research centres to monitor trade in specific commodities such as Liquefied Natural Gas (QuantCube Technology, 2022) or to analyse port and shipping operations in a specific country (Tsalamanis et al., 2018). Beyond trade, the dataset has been used to track CO2 emissions from the maritime sector (Clarke et al., 2023).
New work from the OECD Statistics and Data Directorate contributes to existing research in this field in two major ways. First, it proposes a new methodology to identify ports, at a higher level of precision than in past research. Second, it builds indicators to monitor port congestion and trends in maritime trade flows and provides a tool to get detailed information and better understand those flows…(More)”.
What Does Information Integrity Mean for Democracies?
Article by Kamya Yadav and Samantha Lai: “Democracies around the world are encountering unique challenges with the rise of new technologies. Experts continue to debate how social media has impacted democratic discourse, pointing to how algorithmic recommendations, influence operations, and cultural changes in norms of communication alter the way people consume information. Meanwhile, developments in artificial intelligence (AI) surface new concerns over how the technology might affect voters’ decision-making process. Already, we have seen its increased use in relation to political campaigning.
In the run-up to Pakistan’s 2024 presidential elections, former Prime Minister Imran Khan used an artificially generated speech to campaign while imprisoned. Meanwhile, in the United States, a private company used an AI-generated imitation of President Biden’s voice to discourage people from voting. In response, the Federal Communications Commission outlawed the use of AI-generated robocalls.
Evolving technologies present new threats. Disinformation, misinformation, and propaganda are all different faces of the same problem: Our information environment—the ecosystem in which we disseminate, create, receive, and process information—is not secure and we lack coherent goals to direct policy actions. Formulating short-term, reactive policy to counter or mitigate the effects of disinformation or propaganda can only bring us so far. Beyond defending democracies from unending threats, we should also be looking at what it will take to strengthen them. This begs the question: How do we work toward building secure and resilient information ecosystems? How can policymakers and democratic governments identify policy areas that require further improvement and shape their actions accordingly?…(More)”.
Commons-based Data Set: Governance for AI
Report by Open Future: “In this white paper, we propose an approach to sharing data sets for AI training as a public good governed as a commons. By adhering to the six principles of commons-based governance, data sets can be managed in a way that generates public value while making shared resources resilient to extraction or capture by commercial interests.
The purpose of defining these principles is two-fold:
We propose these principles as input into policy debates on data and AI governance. A commons-based approach can be introduced through regulatory means, funding and procurement rules, statements of principles, or data sharing frameworks. Secondly, these principles can also serve as a blueprint for the design of data sets that are governed and shared as a commons. To this end, we also provide practical examples of how these principles are being brought to life. Projects like Big Science or Common Voice have demonstrated that commons-based data sets can be successfully built.
These principles, tailored for the governance of AI data sets, are built on our previous work on Data Commons Primer. They are also the outcome of our research into the governance of AI datasets, including the AI_Commons case study. Finally, they are based on ongoing efforts to define how AI systems can be shared and made open, in which we have been participating – including the OSI-led process to define open-source AI systems, and the DPGA Community of Practice exploring AI systems as Digital Public Goods…(More)”.

The six principles for commons-based data set governance are as follows:
Digital public infrastructure and public value: What is ‘public’ about DPI?
Paper by David Eaves, Mariana Mazzucato and Beatriz Vasconcellos: “Digital Public Infrastructures (DPI) are becoming increasingly relevant in the policy and academic domains, with DPI not just being regulated, but funded and created by governments, international organisations, philanthropies and the private sector. However, these transformations are not neutral; they have a direction. This paper addresses how to ensure that DPI is not only regulated but created and governed for the common good by maximising public value creation. Our analysis makes explicit which normative values may be associated with DPI development. We also argue that normative values are necessary but not sufficient for maximising public value creation with DPI, and that a more proactive role of the state and governance are key. In this work, policymakers and researchers will find valuable frameworks for understanding where the value-creation elements of DPI come from and how to design a DPI governance that maximises public value…(More)”.
Using online search activity for earlier detection of gynaecological malignancy
Paper by Jennifer F. Barcroft et al: Ovarian cancer is the most lethal and endometrial cancer the most common gynaecological cancer in the UK, yet neither have a screening program in place to facilitate early disease detection. The aim is to evaluate whether online search data can be used to differentiate between individuals with malignant and benign gynaecological diagnoses.
This is a prospective cohort study evaluating online search data in symptomatic individuals (Google user) referred from primary care (GP) with a suspected cancer to a London Hospital (UK) between December 2020 and June 2022. Informed written consent was obtained and online search data was extracted via Google takeout and anonymised. A health filter was applied to extract health-related terms for 24 months prior to GP referral. A predictive model (outcome: malignancy) was developed using (1) search queries (terms model) and (2) categorised search queries (categories model). Area under the ROC curve (AUC) was used to evaluate model performance. 844 women were approached, 652 were eligible to participate and 392 were recruited. Of those recruited, 108 did not complete enrollment, 12 withdrew and 37 were excluded as they did not track Google searches or had an empty search history, leaving a cohort of 235.s
The cohort had a median age of 53 years old (range 20–81) and a malignancy rate of 26.0%. There was a difference in online search data between those with a benign and malignant diagnosis, noted as early as 360 days in advance of GP referral, when search queries were used directly, but only 60 days in advance, when queries were divided into health categories. A model using online search data from patients (n = 153) who performed health-related search and corrected for sample size, achieved its highest sample-corrected AUC of 0.82, 60 days prior to GP referral.
Online search data appears to be different between individuals with malignant and benign gynaecological conditions, with a signal observed in advance of GP referral date. Online search data needs to be evaluated in a larger dataset to determine its value as an early disease detection tool and whether its use leads to improved clinical outcomes…(More)”.