Robotics for Global Development


Report by the Frontier Tech Hub: “Robotics could enable progress on 46% of SDG targets, yet this potential remains largely untapped in low- and middle-income countries.

While technological developments and new-found applications of artificial intelligence (AI) continue to attract significant attention and investment, using robotics to advance the Sustainable Development Goals (SDGs) is consistently overlooked. This is especially true when the focus moves from aerial robotics (drones) to robotic arms, ground robotics, and aquatic robotics. How might these types of robots accelerate global development in the least developed countries?

We aim to answer this question and inform the UK Foreign, Commonwealth & Development Office’s (FCDO) investment and policy towards robotics in the least developed countries (LDCs). In an emergent space, the UK FCDO has a unique opportunity to position itself as a global leader in leveraging robotics technology to accelerate sustainable development outcomes…(More)”.

Towards a Set of Universal Data Principles


Paper by Steve MacFeely, Angela Me, Friederike Schueuer, Joseph Costanzo, David Passarelli, Malarvizhi Veerappan, and Stefaan Verhulst: “Humanity collects, processes, shares, uses, and reuses a staggering volume of data. These data are the lifeblood of the digital economy; they feed algorithms and artificial intelligence, inform logistics, and shape markets, communication, and politics. Data do not just yield economic benefits; they can also have individual and societal benefits and impacts. Being able to access, process, use, and reuse data is essential for dealing with global challenges, such as managing and protecting the environment, intervening in the event of a pandemic, or responding to a disaster or crisis. While we have made great strides, we have yet to realize the full potential of data, in particular, the potential of data to serve the public good. This will require international cooperation and a globally coordinated approach. Many data governance issues cannot be fully resolved at national level. This paper presents a proposal for a preliminary set of data goals and principles. These goals and principles are envisaged as the normative foundations for an international data governance framework – one that is grounded in human rights and sustainable development. A principles-based approach to data governance helps create common values, and in doing so, helps to change behaviours, mindsets and practices. It can also help create a foundation for the safe use of all types of data and data transactions. The purpose of this paper is to present the preliminary principles to solicit reaction and feedback…(More)”.

Differential Privacy


Open-access book by Simson L. Garfinkel: “Differential privacy (DP) is an increasingly popular, though controversial, approach to protecting personal data. DP protects confidential data by introducing carefully calibrated random numbers, called statistical noise, when the data is used. Google, Apple, and Microsoft have all integrated the technology into their software, and the US Census Bureau used DP to protect data collected in the 2020 census. In this book, Simson Garfinkel presents the underlying ideas of DP, and helps explain why DP is needed in today’s information-rich environment, why it was used as the privacy protection mechanism for the 2020 census, and why it is so controversial in some communities.
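The “carefully calibrated random numbers” are typically drawn from a Laplace distribution whose scale depends on a privacy parameter ε. As a minimal illustrative sketch (not taken from the book), here is the Laplace mechanism applied to a counting query, which has sensitivity 1 because adding or removing one person changes the count by at most 1:

```python
import random

def dp_count(true_count: int, epsilon: float) -> float:
    """Release a count with epsilon-differential privacy (Laplace mechanism).

    A counting query has sensitivity 1, so the noise scale is 1/epsilon.
    A Laplace(0, b) draw equals the difference of two Exponential(1/b) draws.
    """
    noise = random.expovariate(epsilon) - random.expovariate(epsilon)
    return true_count + noise
```

With ε = 1, a true count of 100 is usually released as a value within a few units of 100; a smaller ε adds more noise and gives stronger privacy, which is precisely the accuracy–privacy trade-off at the heart of the 2020 census controversy.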

When DP is used to protect confidential data, like an advertising profile based on the web pages you have viewed with a web browser, the noise makes it impossible for someone to take that profile and reverse engineer, with absolute certainty, the underlying confidential data on which the profile was computed. The book also chronicles the history of DP and describes the key participants and its limitations. Along the way, it also presents a short history of the US Census and other approaches for data protection such as de-identification and k-anonymity…(More)”.

Which Data Do Economists Use to Study Corruption?


World Bank paper: “…examines the data sources and methodologies used in economic research on corruption by analyzing 339 journal articles published in 2022 that include Journal of Economic Literature codes. The paper identifies the most commonly used data types, sources, and geographical foci, as well as whether studies primarily investigate the causes or consequences of corruption. Cross-country composite indicators remain the dominant measure, while single-country studies more frequently utilize administrative data. Articles in ranked journals are more likely to employ administrative and experimental data and focus on the causes of corruption. The broader dataset of 882 articles highlights the significant academic interest in corruption across disciplines, particularly in political science and public policy. The findings raise concerns about the limited use of novel data sources and the relative neglect of research on the causes of corruption, underscoring the need for a more integrated approach within the field of economics…(More)”.

Global population data is in crisis – here’s why that matters


Article by Andrew J Tatem and Jessica Espey: “Every day, decisions that affect our lives depend on knowing how many people live where. For example, how many vaccines are needed in a community, where polling stations should be placed for elections or who might be in danger as a hurricane approaches. The answers rely on population data.

But counting people is getting harder.

For centuries, census and household surveys have been the backbone of population knowledge. But we’ve just returned from the UN’s statistical commission meetings in New York, where experts reported that something alarming is happening to population data systems globally.

Census response rates are declining in many countries, resulting in large margins of error. The 2020 US census undercounted America’s Latino population by more than three times the rate of the 2010 census. In Paraguay, the latest census revealed a population one-fifth smaller than previously thought.

South Africa’s 2022 census post-enumeration survey revealed a likely undercount of more than 30%. According to the UN Economic Commission for Africa, undercounts and census delays due to COVID-19, conflict or financial limitations have resulted in an estimated one in three Africans not being counted in the 2020 census round.

When people vanish from data, they vanish from policy. When certain groups are systematically undercounted – often minorities, rural communities or poorer people – they become invisible to policymakers. This translates directly into political underrepresentation and inadequate resource allocation…(More)”.

Trump Admin Plans to Cut Team Responsible for Critical Atomic Measurement Data


Article by Louise Matsakis and Will Knight: “The US National Institute of Standards and Technology (NIST) is discussing plans to eliminate an entire team responsible for publishing and maintaining critical atomic measurement data in the coming weeks, as the Trump administration continues its efforts to reduce the US federal workforce, according to a March 18 email sent to dozens of outside scientists. The data in question underpins advanced scientific research around the world in areas like semiconductor manufacturing and nuclear fusion…(More)”.

Web 3.0 Requires Data Integrity


Article by Bruce Schneier and Davi Ottenheimer: “If you’ve ever taken a computer security class, you’ve probably learned about the three legs of computer security—confidentiality, integrity, and availability—known as the CIA triad. When we talk about a system being secure, that’s what we’re referring to. All are important, but to different degrees in different contexts. In a world populated by artificial intelligence (AI) systems and AI agents, integrity will be paramount.

What is data integrity? It’s ensuring that no one can modify data—that’s the security angle—but it’s much more than that. It encompasses accuracy, completeness, and quality of data—all over both time and space. It’s preventing accidental data loss; the “undo” button is a primitive integrity measure. It’s also making sure that data is accurate when it’s collected—that it comes from a trustworthy source, that nothing important is missing, and that it doesn’t change as it moves from format to format. The ability to restart your computer is another integrity measure.
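One common technical building block for integrity (an illustration of the concept, not something drawn from the article) is a cryptographic hash: a short fingerprint that changes whenever the underlying data changes, so any modification in transit or storage is detectable:

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """Return a SHA-256 digest of the data; any change yields a different digest."""
    return hashlib.sha256(data).hexdigest()

record = b"balance=1000"
tag = fingerprint(record)

# Unchanged data verifies; a tampered record does not.
assert fingerprint(b"balance=1000") == tag
assert fingerprint(b"balance=9000") != tag
```

Hashes only detect modification; the broader notion of integrity the authors describe—accuracy at collection, completeness, provenance—requires organizational and design measures beyond any single mechanism.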

The CIA triad has evolved with the Internet. The first iteration of the Web—Web 1.0 of the 1990s and early 2000s—prioritized availability. This era saw organizations and individuals rush to digitize their content, creating what has become an unprecedented repository of human knowledge. Organizations worldwide established their digital presence, leading to massive digitization projects where quantity took precedence over quality. The emphasis on making information available overshadowed other concerns.

As Web technologies matured, the focus shifted to protecting the vast amounts of data flowing through online systems. This is Web 2.0: the Internet of today. Interactive features and user-generated content transformed the Web from a read-only medium to a participatory platform. The increase in personal data, and the emergence of interactive platforms for e-commerce, social media, and online everything demanded both data protection and user privacy. Confidentiality became paramount.

We stand at the threshold of a new Web paradigm: Web 3.0. This is a distributed, decentralized, intelligent Web. Peer-to-peer social-networking systems promise to break the tech monopolies’ control over how we interact with each other. Tim Berners-Lee’s open W3C protocol, Solid, represents a fundamental shift in how we think about data ownership and control. A future filled with AI agents requires verifiable, trustworthy personal data and computation. In this world, data integrity takes center stage…(More)”.

Cloze Encounters: The Impact of Pirated Data Access on LLM Performance


Paper by Stella Jia & Abhishek Nagaraj: “Large Language Models (LLMs) have demonstrated remarkable capabilities in text generation, but their performance may be influenced by the datasets on which they are trained, including potentially unauthorized or pirated content. We investigate the extent to which data access through pirated books influences LLM responses. We test the performance of leading foundation models (GPT, Claude, Llama, and Gemini) on a set of books that were and were not included in the Books3 dataset, which contains full-text pirated books and could be used for LLM training. We assess book-level performance using the “name cloze” word-prediction task. To examine the causal effect of Books3 inclusion, we employ an instrumental variables strategy that exploits the pattern of book publication years in the Books3 dataset. In our sample of 12,916 books, we find significant improvements in LLM name cloze accuracy on books available within the Books3 dataset compared to those not present in these data. These effects are more pronounced for less popular books as compared to more popular books and vary across leading models. These findings have crucial implications for the economics of digitization, copyright policy, and the design and training of AI systems…(More)”.
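The “name cloze” task can be sketched as follows: a character name is blanked out of a short passage, and a model that reproduces the exact name—with no other context to go on—has likely seen the book before. The helper names below are hypothetical, not the authors’ code:

```python
def name_cloze(passage: str, name: str, mask: str = "[MASK]") -> str:
    """Build a name-cloze prompt by blanking out one character name."""
    assert name in passage, "the name must appear in the passage"
    return passage.replace(name, mask)

def cloze_accuracy(predictions: list, answers: list) -> float:
    """Fraction of exact-match name predictions across a set of passages."""
    hits = sum(p.strip() == a for p, a in zip(predictions, answers))
    return hits / len(answers)

prompt = name_cloze("Call me Ishmael. Some years ago.", "Ishmael")
# The model is then asked to fill in the single masked name.
```

Aggregating this accuracy per book, and comparing books inside versus outside Books3, is what lets the paper treat memorization as a measurable signal of training-data access.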

Bubble Trouble


Article by Bryan McMahon: “…Venture capital (VC) funds, drunk on a decade of “growth at all costs,” have poured about $200 billion into generative AI. Making matters worse, the stock market’s bull run is deeply dependent on the growth of the Big Tech companies fueling the AI bubble. In 2023, 71 percent of the total gains in the S&P 500 were attributable to the “Magnificent Seven”—Apple, Nvidia, Tesla, Alphabet, Meta, Amazon, and Microsoft—all of which are among the biggest spenders on AI. Just four—Microsoft, Alphabet, Amazon, and Meta—combined for $246 billion of capital expenditure in 2024 to support the AI build-out. Goldman Sachs expects Big Tech to spend over $1 trillion on chips and data centers to power AI over the next five years. Yet OpenAI, the current market leader, expects to lose $5 billion this year, and its annual losses to swell to $11 billion by 2026. If the AI bubble bursts, it not only threatens to wipe out VC firms in the Valley but also to blow a gaping hole in the public markets and cause an economy-wide meltdown…(More)”.

From Insights to Action: Amplifying Positive Deviance within Somali Rangelands


Article by Basma Albanna, Andreas Pawelke and Hodan Abdullahi: “In every community, some individuals or groups achieve significantly better outcomes than their peers, despite having similar challenges and resources. Finding these so-called positive deviants and working with them to diffuse their practices is referred to as the Positive Deviance approach. The Data-Powered Positive Deviance (DPPD) method follows the same logic as the Positive Deviance approach but leverages existing non-traditional data sources, in conjunction with traditional data sources, to identify and scale the solutions of positive deviants. The UNDP Somalia Accelerator Lab was part of the first cohort of teams that piloted the application of DPPD, trying to tackle the rangeland health problem in the West Golis region. In this blog post we’re reflecting on the process we designed and tested to go from the identification and validation of successful practices to helping other communities adopt them.

Uncovering Rangeland Success

Three years ago we embarked on a journey to identify pastoral communities in Somaliland that demonstrated resilience in the face of adversity. Using a mix of traditional and non-traditional data sources, we wanted to explore and learn from communities that managed to have healthy rangelands despite the severe droughts of 2016 and 2017.

We engaged with government officials from various ministries, experts from the University of Hargeisa, international organizations like the FAO, and members of agro-pastoral communities to learn more about rangeland health. We then selected West Golis as our region of interest, given its majority pastoral community and relative ease of access. Using the Soil-Adjusted Vegetation Index (SAVI) computed from geospatial and earth observation data, we identified an initial group of potential positive deviants, illustrated as green circles in Figure 1 below.

Figure 1: Measuring the vegetation health within 5 km community buffer zones based on SAVI.
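SAVI, the index behind Figure 1, is a standard remote-sensing formula (Huete, 1988) that corrects the familiar NDVI for soil brightness—useful in sparsely vegetated rangelands. A minimal implementation, assuming reflectance bands scaled to [0, 1] (the blog post does not publish its processing code):

```python
def savi(nir: float, red: float, l: float = 0.5) -> float:
    """Soil-Adjusted Vegetation Index (Huete, 1988).

    nir, red: surface reflectances in [0, 1].
    l: soil-brightness correction factor; 0.5 is the usual choice
       for intermediate vegetation cover (l=0 reduces SAVI to NDVI).
    """
    return (nir - red) / (nir + red + l) * (1.0 + l)

# Healthy vegetation reflects strongly in near-infrared relative to red.
healthy = savi(0.5, 0.1)   # high index
bare_soil = savi(0.2, 0.2) # index of 0
```

In practice the index is computed per pixel over satellite imagery and averaged within each community’s 5 km buffer zone, which is what the caption above describes.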

Following the identification of potential positive deviants, we engaged with 18 pastoral communities from the Togdheer, Awdal, and Maroodijeex regions to validate whether the positive deviants we found using earth observation data were indeed doing better than the other communities.

The primary objective of the fieldwork was to uncover the existing practices and strategies that could explain the outperformance of positively deviant communities compared to other communities. The research team identified a range of strategies, including soil and water conservation techniques, locally produced pesticides, and reseeding practices, as summarized in Figure 2.

Figure 2: Strategies and practices that emerged from the fieldwork

Data-Powered Positive Deviance is not just about identifying outperformers and their successful practices. The real value lies in the diffusion, adoption and adaptation of these practices by individuals, groups or communities facing similar challenges. For this to succeed, both the positive deviants and those learning about their practices must take ownership and drive the process. Merely presenting the uncommon but successful practices of positive deviants to others will not work. The secret to success is in empowering the community to take charge, overcome challenges, and leverage their own resources and capabilities to effect change…(More)”.