Technical Tiers: A New Classification Framework for Global AI Workforce Analysis


Report by Siddhi Pal, Catherine Schneider and Ruggero Marino Lazzaroni: “… introduces a novel three-tiered classification system for global AI talent that addresses significant methodological limitations in existing workforce analyses by distinguishing between different skill categories within the existing AI talent pool. By separating non-technical roles (Category 0), technical software development (Category 1), and advanced deep learning specialization (Category 2), our framework enables precise examination of AI workforce dynamics at a pivotal moment in global AI policy.

Through our analysis of a sample of 1.6 million individuals in the AI talent pool across 31 countries, we’ve uncovered clear patterns in technical talent distribution that significantly impact Europe’s AI ambitions. Asian nations hold an advantage in specialized AI expertise, with South Korea (27%), Israel (23%), and Japan (20%) maintaining the highest proportions of Category 2 talent. Within Europe, Poland and Germany stand out as leaders in specialized AI talent. This may be connected to their initiatives to attract tech companies and investments in elite research institutions, though further research is needed to confirm these relationships.

Our data also reveals a shifting landscape of global talent flows. Research shows that countries employing points-based immigration systems attract 1.5 times more high-skilled migrants than those using demand-led approaches. This finding takes on new significance in light of recent geopolitical developments affecting scientific research globally. As restrictive policies and funding cuts create uncertainty for researchers in the United States, one of the main destinations for European AI talent, the way nations position their regulatory environments, scientific freedoms, and research infrastructure will increasingly determine their ability to attract and retain specialized AI talent.

The gender analysis in our study illuminates another dimension of competitive advantage. Contrary to the overall AI talent pool, EU countries lead in female representation in highly technical roles (Category 2), occupying seven of the top ten global rankings. Finland, Czechia, and Italy have the highest proportion of female representation in Category 2 roles globally (39%, 31%, and 28%, respectively). This gender diversity represents not merely a social achievement but a potential strategic asset in AI innovation, particularly as global coalitions increasingly emphasize the importance of diverse perspectives in AI development…(More)”

Mind the (Language) Gap: Mapping the Challenges of LLM Development in Low-Resource Language Contexts


White Paper by the Stanford Institute for Human-Centered AI (HAI), the Asia Foundation and the University of Pretoria: “…maps the LLM development landscape for low-resource languages, highlighting challenges, trade-offs, and strategies to increase investment; prioritize cross-disciplinary, community-driven development; and ensure fair data ownership…

  • Large language model (LLM) development suffers from a digital divide: Most major LLMs underperform for non-English—and especially low-resource—languages; are not attuned to relevant cultural contexts; and are not accessible in parts of the Global South.
  • Low-resource languages (such as Swahili or Burmese) face two crucial limitations: a scarcity of labeled and unlabeled language data and poor quality data that is not sufficiently representative of the languages and their sociocultural contexts.
  • To bridge these gaps, researchers and developers are exploring different technical approaches to developing LLMs that better perform for and represent low-resource languages but come with different trade-offs:
    • Massively multilingual models, developed primarily by large U.S.-based firms, aim to improve performance for more languages by including a wider range of (100-plus) languages in their training datasets.
    • Regional multilingual models, developed by academics, governments, and nonprofits in the Global South, use smaller training datasets made up of 10-20 low-resource languages to better cater to and represent a smaller group of languages and cultures.
    • Monolingual or monocultural models, developed by a variety of public and private actors, are trained on or fine-tuned for a single low-resource language and thus tailored to perform well for that language…(More)”

Artificial Intelligence and Big Data


Book edited by Frans L. Leeuw and Michael Bamberger: “…explores how Artificial Intelligence (AI) and Big Data contribute to the evaluation of the rule of law (covering legal arrangements, empirical legal research, law and technology, and international law), and social and economic development programs in both industrialized and developing countries. Issues of ethics and bias in the use of AI are also addressed and indicators of the growth of knowledge in the field are discussed.

Interdisciplinary and international in scope, and bringing together leading academics and practitioners from across the globe, the book explores the applications of AI and big data in Rule of Law and development evaluation, identifies differences in the approaches used in the two fields, and how each could learn from the approaches used in the other, as well as differences in the AI-related issues addressed in industrialized nations compared to those addressed in Africa and Asia.

Artificial Intelligence and Big Data is an essential read for researchers, academics and students working in the fields of Rule of Law and Development. Researchers in institutions working on new applications in AI will also benefit from the book’s practical insights…(More)”.

Designing New Institutions and Renewing Existing Ones – A Playbook


UNDP Report: “The world has long depended on public institutions to solve problems and meet needs — from running schools to building roads, taking care of public health to defense. Today, global challenges like climate change, election security, forced migration, and AI-induced unemployment demand new institutional responses, especially in the Global South.

The bad news? Many institutions now struggle with public distrust, being seen as too wasteful and inefficient, unresponsive and ineffective, and sometimes corrupt and outdated.

The good news? Fresh methods and models inspired by innovations in government, business, and civil society are now available that can help us rethink institutions, making them more public results oriented, agile, transparent, and fit for purpose. And ready for the future…(More)”.

Global population data is in crisis – here’s why that matters


Article by Andrew J Tatem and Jessica Espey: “Every day, decisions that affect our lives depend on knowing how many people live where. For example, how many vaccines are needed in a community, where polling stations should be placed for elections or who might be in danger as a hurricane approaches. The answers rely on population data.

But counting people is getting harder.

For centuries, census and household surveys have been the backbone of population knowledge. But we’ve just returned from the UN’s statistical commission meetings in New York, where experts reported that something alarming is happening to population data systems globally.

Census response rates are declining in many countries, resulting in large margins of error. The 2020 US census undercounted America’s Latino population by more than three times the rate of the 2010 census. In Paraguay, the latest census revealed a population one-fifth smaller than previously thought.

South Africa’s 2022 census post-enumeration survey revealed a likely undercount of more than 30%. According to the UN Economic Commission for Africa, undercounts and census delays due to COVID-19, conflict or financial limitations have resulted in an estimated one in three Africans not being counted in the 2020 census round.

When people vanish from data, they vanish from policy. When certain groups are systematically undercounted – often minorities, rural communities or poorer people – they become invisible to policymakers. This translates directly into political underrepresentation and inadequate resource allocation…(More)”.

From Insights to Action: Amplifying Positive Deviance within Somali Rangelands


Article by Basma Albanna, Andreas Pawelke and Hodan Abdullahi: “In every community, some individuals or groups achieve significantly better outcomes than their peers, despite having similar challenges and resources. Finding these so-called positive deviants and working with them to diffuse their practices is referred to as the Positive Deviance approach. The Data-Powered Positive Deviance (DPPD) method follows the same logic as the Positive Deviance approach but leverages existing, non-traditional data sources in conjunction with traditional ones to identify and scale the solutions of positive deviants. The UNDP Somalia Accelerator Lab was part of the first cohort of teams that piloted the application of DPPD, in its case to tackle the rangeland health problem in the West Golis region. In this blog post we’re reflecting on the process we designed and tested to go from the identification and validation of successful practices to helping other communities adopt them.

Uncovering Rangeland Success

Three years ago we embarked on a journey to identify pastoral communities in Somaliland that demonstrated resilience in the face of adversity. Using a mix of traditional and non-traditional data sources, we wanted to explore and learn from communities that managed to have healthy rangelands despite the severe droughts of 2016 and 2017.

We engaged with government officials from various ministries, experts from the University of Hargeisa, international organizations like the FAO and members of agro-pastoral communities to learn more about rangeland health. We then selected the West Golis as our region of interest with a majority pastoral community and relative ease of access. Employing the Soil-Adjusted Vegetation Index (SAVI) and using geospatial and earth observation data allowed us to identify an initial group of potential positive deviants illustrated as green circles in Figure 1 below.

Figure 1: Measuring the vegetation health within 5 km community buffer zones based on SAVI.
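For readers unfamiliar with the index, SAVI is conventionally computed from near-infrared (NIR) and red surface reflectance as (NIR − Red) / (NIR + Red + L) × (1 + L), where L is a soil-brightness correction factor, commonly set to 0.5. The sketch below illustrates that formula and a simple ranking of community buffers by mean SAVI. The reflectance values, community names and the `mean_savi` helper are hypothetical; the excerpt does not describe the team's actual processing pipeline.

```python
def savi(nir: float, red: float, soil_factor: float = 0.5) -> float:
    """Soil-Adjusted Vegetation Index for a single pixel of surface reflectance."""
    denominator = nir + red + soil_factor
    if denominator == 0:
        raise ValueError("nir + red + soil_factor must be non-zero")
    return (nir - red) / denominator * (1.0 + soil_factor)


def mean_savi(pixels, soil_factor: float = 0.5) -> float:
    """Average SAVI over the (nir, red) pixels that fall inside one buffer zone."""
    values = [savi(nir, red, soil_factor) for nir, red in pixels]
    return sum(values) / len(values)


# Hypothetical reflectance samples for two 5 km community buffers.
buffers = {
    "community_a": [(0.45, 0.10), (0.50, 0.12)],
    "community_b": [(0.25, 0.20), (0.22, 0.18)],
}

# Communities with the healthiest vegetation come first.
ranked = sorted(buffers, key=lambda name: mean_savi(buffers[name]), reverse=True)
```

With L set to 0 the expression reduces to the more familiar NDVI; the soil factor matters most where vegetation cover is sparse, as it often is in rangelands.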

Following the identification of potential positive deviants, we engaged with 18 pastoral communities from the Togdheer, Awdal, and Maroodijeex regions to validate whether the positive deviants we found using earth observation data were indeed doing better than the other communities.

The primary objective of the fieldwork was to uncover the existing practices and strategies that could explain the outperformance of positively-deviant communities compared to other communities. The research team identified a range of strategies, including soil and water conservation techniques, locally-produced pesticides, and reseeding practices as summarized in Figure 2.

Figure 2: Strategies and practices that emerged from the fieldwork

Data-Powered Positive Deviance is not just about identifying outperformers and their successful practices. The real value lies in the diffusion, adoption and adaptation of these practices by individuals, groups or communities facing similar challenges. For this to succeed, both the positive deviants and those learning about their practices must take ownership and drive the process. Merely presenting the uncommon but successful practices of positive deviants to others will not work. The secret to success is in empowering the community to take charge, overcome challenges, and leverage their own resources and capabilities to effect change…(More)”.

Privacy guarantees for personal mobility data in humanitarian response


Paper by Nitin Kohli, Emily Aiken & Joshua E. Blumenstock: “Personal mobility data from mobile phones and other sensors are increasingly used to inform policymaking during pandemics, natural disasters, and other humanitarian crises. However, even aggregated mobility traces can reveal private information about individual movements to potentially malicious actors. This paper develops and tests an approach for releasing private mobility data, which provides formal guarantees over the privacy of the underlying subjects. Specifically, we (1) introduce an algorithm for constructing differentially private mobility matrices and derive privacy and accuracy bounds on this algorithm; (2) use real-world data from mobile phone operators in Afghanistan and Rwanda to show how this algorithm can enable the use of private mobility data in two high-stakes policy decisions: pandemic response and the distribution of humanitarian aid; and (3) discuss practical decisions that need to be made when implementing this approach, such as how to optimally balance privacy and accuracy. Taken together, these results can help enable the responsible use of private mobility data in humanitarian response…(More)”.
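To give a sense of what a differentially private mobility matrix can look like, here is a minimal sketch using the standard Laplace mechanism on origin-destination counts. It is not the paper's algorithm, which the excerpt does not spell out. The function names, the per-person trip cap and the sample trips are all assumptions made for illustration, and the list of regions is assumed to be public rather than derived from the data.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw Laplace(0, scale) noise as the difference of two Exp(1) samples."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def private_mobility_matrix(trips, regions, epsilon, max_trips_per_person=1, seed=None):
    """Noisy origin-destination counts from (person_id, origin, destination) trips.

    Each person contributes at most max_trips_per_person trips, so adding or
    removing one person changes the matrix by at most that amount in L1 norm.
    Laplace noise with scale max_trips_per_person / epsilon then gives
    epsilon-differential privacy for the released matrix.
    """
    rng = random.Random(seed)
    counts = {(o, d): 0.0 for o in regions for d in regions}
    kept = {}
    for person, origin, destination in trips:
        if kept.get(person, 0) >= max_trips_per_person:
            continue  # drop trips beyond the cap to bound sensitivity
        kept[person] = kept.get(person, 0) + 1
        counts[(origin, destination)] += 1
    scale = max_trips_per_person / epsilon
    return {cell: count + laplace_noise(scale, rng) for cell, count in counts.items()}


# Hypothetical trips: person p1 appears twice but is capped at one trip.
sample_trips = [("p1", "A", "B"), ("p1", "A", "B"), ("p2", "A", "B"), ("p3", "B", "A")]
noisy = private_mobility_matrix(sample_trips, ["A", "B"], epsilon=1.0, seed=7)
```

A smaller epsilon means more noise and stronger privacy, which is the privacy-accuracy balance the authors refer to. A production release would also need to guard against floating-point weaknesses in noise sampling, which this sketch ignores.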

Impact Inversion


Blog by Victor Zhenyi Wang: “The very first project I worked on when I transitioned from commercial data science to development was during the nadir between South Africa’s first two COVID waves. A large international foundation was interested in working with the South African government and a technology non-profit to build an early warning system for COVID. The non-profit operated a WhatsApp based health messaging service that served about 2 million people in South Africa. The platform had run a COVID symptoms questionnaire which the foundation hoped could help the government predict surges in cases.

This kind of data-based “nowcasting” proved a useful tool in a number of other places e.g. some cities in the US. Yet in the context of South Africa, where the National Department of Health was mired in serious capacity constraints, government stakeholders were bearish about the usefulness of such a tool. Nonetheless, since the foundation was interested in funding this project, we went ahead with it anyway. The result was that we pitched this “early warning system” a handful of times to polite public health officials but it was otherwise never used. A classic case of development practitioners rendering problems technical and generating non-solutions that primarily serve the strategic objectives of the funders.

The technology non-profit did however express interest in a different kind of service — what about a language model that helps users answer questions about COVID? The non-profit’s WhatsApp messaging service is menu-based and they thought that a natural language interface could provide a better experience for users by letting them engage with health content on their own terms. Since we had ample funding from the foundation for the early warning system, we decided to pursue the chatbot project.

The project has now expanded to multiple other services run by the same non-profit, including the largest digital health service in South Africa. The project has won multiple grants and partnerships, including with Google, and has spun out into its own open source library. In many ways, in terms of sheer number of lives affected, this is the most impactful project I have had the privilege of supporting in my career in development, and I am deeply grateful to have been part of the team involved in bringing it into existence.

Yet the truth is, the “impact” of this class of interventions remains unclear. Even though a large randomized controlled trial was done to assess the impact of the WhatsApp service, such an evaluation only captures the performance of the service on outcome variables determined by the non-profit, not on whether these outcomes are appropriate. It certainly does not tell us whether the service was the best means available to achieve the ultimate goal of improving the lives of those in communities underserved by health services.

This project, and many others that I have worked on as a data scientist in development, uses an implicit framework for impact which I describe as the design-to-impact pipeline. A technology is designed and developed, then its impact on the world is assessed. There is a strong emphasis on reform: on improving the design, development, and deployment of development technologies. Development practitioners have a broad range of techniques to make sure that the process of creation is ethical and responsible; in some sense, legitimate. With the broad adoption of data-based methods of program evaluation, e.g. randomized controlled trials, we might even make knowledge claims that an intervention truly ought to bring certain benefits to communities in which the intervention is placed. This view imagines that technologies, once this process is completed, are simply unleashed onto the world, and that their impact is simply what was assessed ex ante. An industry of monitoring and evaluation surrounds their subsequent deployment; the relative success of interventions depends on the performance of benchmark indicators…(More)”.

AI Investment Potential Index: Mapping Global Opportunities for Sustainable Development


Paper by AFD: “…examines the potential of artificial intelligence (AI) investment to drive sustainable development across diverse national contexts. By evaluating critical factors, including AI readiness, social inclusion, human capital, and macroeconomic conditions, we construct a nuanced and comprehensive analysis of the global AI landscape. Employing advanced statistical techniques and machine learning algorithms, we identify nations with significant untapped potential for AI investment.
We introduce the AI Investment Potential Index (AIIPI), a novel instrument designed to guide financial institutions, development banks, and governments in making informed, strategic AI investment decisions. The AIIPI synthesizes metrics of AI readiness with socio-economic indicators to identify and highlight opportunities for fostering inclusive and sustainable growth. The methodological novelty lies in the weight selection process, which combines statistical modeling with an entropy-based weighting approach. Furthermore, we provide detailed policy implications to support stakeholders in making targeted investments aimed at reducing disparities and advancing equitable technological development…(More)”.
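The excerpt does not say how the entropy-based weights are derived or how they are combined with the statistical model. As a rough illustration only, the sketch below implements the generic entropy weight method often used for composite indices: indicators whose values vary more across countries carry more information and receive larger weights. The function name and the sample indicator values are hypothetical and are not taken from the AIIPI.

```python
import math


def entropy_weights(rows):
    """Entropy-method weights for indicator columns.

    rows holds one list of non-negative indicator values per country.
    A column that is nearly uniform across countries has entropy close to 1
    and therefore gets a weight close to 0.
    """
    n = len(rows)
    if n < 2:
        raise ValueError("need at least two rows to compute entropy weights")
    divergences = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        total = sum(column)
        if total <= 0:
            raise ValueError("each indicator column must have a positive sum")
        shares = [value / total for value in column]
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n)
        divergences.append(1.0 - entropy)
    total_divergence = sum(divergences)
    if abs(total_divergence) < 1e-12:
        # Every indicator is uniform, so fall back to equal weights.
        return [1.0 / len(divergences)] * len(divergences)
    return [d / total_divergence for d in divergences]


# Hypothetical data: three countries, two indicators. The first indicator is
# identical everywhere, so nearly all the weight goes to the second.
weights = entropy_weights([[1.0, 1.0], [1.0, 3.0], [1.0, 8.0]])
```

A composite score for each country would then be a weighted sum of its normalized indicators, though how the AIIPI blends these weights with model-derived ones is not described in the excerpt.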

Cross-border data flows in Africa: Continental ambitions and political realities


Paper by Melody Musoni, Poorva Karkare and Chloe Teevan: “Africa must prioritise data usage and cross-border data sharing to realise the goals of the African Continental Free Trade Area and to drive innovation and AI development. Accessible and shareable data is essential for the growth and success of the digital economy, enabling innovations and economic opportunities, especially in a rapidly evolving landscape.

African countries, through the African Union (AU), have a common vision of sharing data across borders to boost economic growth. However, the adopted continental digital policies are often inconsistently applied at the national level, where some member states implement restrictive measures like data localisation that limit the free flow of data.

The paper looks at national policies that often prioritise domestic interests and how those conflict with continental goals. This is due to differences in political ideologies, socio-economic conditions, security concerns and economic priorities. This misalignment between national agendas and the broader AU strategy is shaped by each country’s unique context, as seen in the examples of Senegal, Nigeria and Mozambique, which face distinct challenges in implementing the continental vision.

The paper concludes with actionable recommendations for the AU, member states and the partnership with the European Union. It suggests that the AU enhances support for data-sharing initiatives and urges member states to focus on policy alignment, address data deficiencies, build data infrastructure and find new ways to use data. It also highlights how the EU can strengthen its support for Africa’s data-sharing goals…(More)”.