From Insights to Action: Amplifying Positive Deviance within Somali Rangelands


Article by Basma Albanna, Andreas Pawelke and Hodan Abdullahi: “In every community, some individuals or groups achieve significantly better outcomes than their peers, despite having similar challenges and resources. Finding these so-called positive deviants and working with them to diffuse their practices is referred to as the Positive Deviance approach. The Data-Powered Positive Deviance (DPPD) method follows the same logic as the Positive Deviance approach but leverages existing non-traditional data sources, in conjunction with traditional data sources, to identify and scale the solutions of positive deviants. The UNDP Somalia Accelerator Lab was part of the first cohort of teams that piloted the application of DPPD, trying to tackle the rangeland health problem in the West Golis region. In this blog post, we’re reflecting on the process we designed and tested to go from the identification and validation of successful practices to helping other communities adopt them.

Uncovering Rangeland Success

Three years ago we embarked on a journey to identify pastoral communities in Somaliland that demonstrated resilience in the face of adversity. Using a mix of traditional and non-traditional data sources, we wanted to explore and learn from communities that managed to have healthy rangelands despite the severe droughts of 2016 and 2017.

We engaged with government officials from various ministries, experts from the University of Hargeisa, international organizations like the FAO, and members of agro-pastoral communities to learn more about rangeland health. We then selected West Golis as our region of interest, given its majority pastoral community and relative ease of access. Employing the Soil-Adjusted Vegetation Index (SAVI) with geospatial and earth observation data allowed us to identify an initial group of potential positive deviants, illustrated as green circles in Figure 1 below.

Figure 1: Measuring the vegetation health within 5 km community buffer zones based on SAVI.
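
For readers unfamiliar with SAVI, the following is a minimal sketch of how the index could be computed and averaged over a community buffer zone. It uses the standard SAVI formula, (NIR - Red) / (NIR + Red + L) * (1 + L), with the commonly used soil correction factor L = 0.5. The reflectance values, the community names and the helper `mean_savi` are hypothetical, and the simple ranking at the end is only an illustration, not the procedure the authors used.

```python
from statistics import mean


def savi(nir: float, red: float, soil_factor: float = 0.5) -> float:
    """Soil-Adjusted Vegetation Index for a single pixel.

    SAVI = (NIR - Red) / (NIR + Red + L) * (1 + L), where L is the soil
    brightness correction factor (0.5 is the commonly used default).
    """
    denominator = nir + red + soil_factor
    if denominator == 0:
        raise ValueError("NIR + Red + L must be non-zero")
    return (nir - red) / denominator * (1 + soil_factor)


def mean_savi(pixels):
    """Average SAVI over (nir, red) reflectance pairs in one buffer zone."""
    return mean(savi(nir, red) for nir, red in pixels)


# Hypothetical (NIR, Red) surface-reflectance samples per community buffer.
# Real inputs would come from earth observation imagery.
buffers = {
    "community_a": [(0.45, 0.10), (0.50, 0.12)],
    "community_b": [(0.30, 0.20), (0.28, 0.22)],
}

# Score each buffer and rank communities from greenest to least green.
scores = {name: mean_savi(pixels) for name, pixels in buffers.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

A ranking like this only suggests candidates. As the article describes, potential positive deviants still have to be validated through fieldwork.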

Following the identification of potential positive deviants, we engaged with 18 pastoral communities from the Togdheer, Awdal, and Maroodijeex regions to validate whether the positive deviants we found using earth observation data were indeed doing better than the other communities.

The primary objective of the fieldwork was to uncover the existing practices and strategies that could explain the outperformance of positively deviant communities compared to other communities. The research team identified a range of strategies, including soil and water conservation techniques, locally produced pesticides, and reseeding practices, as summarized in Figure 2.

Figure 2: Strategies and practices that emerged from the fieldwork

Data-Powered Positive Deviance is not just about identifying outperformers and their successful practices. The real value lies in the diffusion, adoption and adaptation of these practices by individuals, groups or communities facing similar challenges. For this to succeed, both the positive deviants and those learning about their practices must take ownership and drive the process. Merely presenting the uncommon but successful practices of positive deviants to others will not work. The secret to success is in empowering the community to take charge, overcome challenges, and leverage their own resources and capabilities to effect change…(More)”.

Integrating Social Media into Biodiversity Databases: The Next Big Step?


Article by Muhammad Osama: “Digital technologies and social media have transformed ecology and conservation biology data collection. Traditional biodiversity monitoring often relies on field surveys, which can be time-consuming and biased toward rural habitats.

The Global Biodiversity Information Facility (GBIF) serves as a key repository for biodiversity data, but it faces challenges such as delayed data availability and underrepresentation of urban habitats.

Social media platforms have become valuable tools for rapid data collection, enabling users to share georeferenced observations instantly, reducing time lags associated with traditional methods. The widespread use of smartphones with cameras allows individuals to document wildlife sightings in real-time, enhancing biodiversity monitoring. Integrating social media data with traditional ecological datasets offers significant advancements, particularly in tracking species distributions in urban areas.

In this paper, the authors evaluated the habitat usage of the Jersey tiger moth (JTM) by comparing occurrence data from social media platforms (Instagram and Flickr) with traditional records from GBIF and iNaturalist. They hypothesized that social media data would reveal significant JTM occurrences in urban environments, which may be underrepresented in traditional datasets…(More)”.

The Language Data Space (LDS)


European Commission: “… welcomes launch of the Alliance for Language Technologies European Digital Infrastructure Consortium (ALT-EDIC) and the Language Data Space (LDS).

Aimed at addressing the shortage of European language data needed for training large language models, these projects are set to revolutionise multilingual Artificial Intelligence (AI) systems across the EU.

By offering services in all EU languages, the initiatives are designed to break down language barriers, providing better, more accessible solutions for smaller businesses within the EU. This effort not only aims to preserve the EU’s rich cultural and linguistic heritage in the digital age but also strengthens Europe’s quest for tech sovereignty. Formed in February 2024, the ALT-EDIC includes 17 participating Member States and 9 observer Member States and regions, making it one of the pioneering European Digital Infrastructure Consortia.

The LDS, part of the Common European Data Spaces, is crucial for increasing data availability for AI development in Europe. Developed by the Commission and funded by the DIGITAL programme, this project aims to create a cohesive marketplace for language data. This will enhance the collection and sharing of multilingual data to support European large language models. Initially accessible to selected institutions and companies, the project aims to eventually involve all European public and private stakeholders.

Find more information about the Alliance for Language Technologies European Digital Infrastructure Consortium (ALT-EDIC) and the Language Data Space (LDS)…(More)”

Panels giving scientific advice to Census Bureau disbanded by Trump administration


Article by Jeffrey Mervis: “…U.S. Secretary of Commerce Howard Lutnick has disbanded five outside panels that provide scientific and community advice to the U.S. Census Bureau and other federal statistical agencies just as preparations are ramping up for the country’s next decennial census, in 2030.

The dozens of demographers, statisticians, and public members on the five panels received nearly identical letters this week telling them that “the Secretary of Commerce has determined that the purposes for which the [committee] was established have been fulfilled, and the committee has been terminated effective February 28, 2025. Thank you for your service.”

Statistician Robert Santos, who last month resigned as Census Bureau director 3 years into his 5-year term, says he’s “terribly disappointed but not surprised” by the move, noting how a recent directive by President Donald Trump on gender identity has disrupted data collection for a host of federal surveys…(More)”.

New AI Collaboratives to take action on wildfires and food insecurity


Google: “…last September we introduced AI Collaboratives, a new funding approach designed to unite public, private and nonprofit organizations, and researchers, to create AI-powered solutions to help people around the world.

Today, we’re sharing more about our first two focus areas for AI Collaboratives: Wildfires and Food Security.

Wildfires are a global crisis, claiming more than 300,000 lives due to smoke exposure annually and causing billions of dollars in economic damage. …Google.org has convened more than 15 organizations, including Earth Fire Alliance and Moore Foundation, to help in this important effort. By coordinating funding and integrating cutting-edge science, emerging technology and on-the-ground applications, we can provide collaborators with the tools they need to identify and track wildfires in near real time; quantify wildfire risk; shift more acreage to beneficial fires; and ultimately reduce the damage caused by catastrophic wildfires.

Nearly one-third of the world’s population faces moderate or severe food insecurity due to extreme weather, conflict and economic shocks. The AI Collaborative: Food Security will strengthen the resilience of global food systems and improve food security for the world’s most vulnerable populations through AI technologies, collaborative research, data-sharing and coordinated action. To date, 10 organizations have joined us in this effort, and we’ll share more updates soon…(More)”.

Government data is disappearing before our eyes


Article by Anna Massoglia: “A battle is being waged in the quiet corners of government websites and data repositories. Essential public records are disappearing and, with them, Americans’ ability to hold those in power accountable.

Take the Department of Government Efficiency, Elon Musk’s federal cost-cutting initiative. Touted as “maximally transparent,” DOGE is supposed to make government spending more efficient. But when journalists and researchers exposed major errors — from double-counting contracts to conflating caps with actual spending — DOGE didn’t fix the mistakes. Instead, it made them harder to detect.

Many Americans hoped DOGE’s work would be a step toward cutting costs and restoring trust in government. But trust must be earned. If our leaders truly want to restore faith in our institutions, they must ensure that facts remain available to everyone, not just when convenient.

Since Jan. 20, public records across the federal government have been erased. Economic indicators that guide investments, scientific datasets that drive medical breakthroughs, federal health guidelines and historical archives that inform policy decisions have all been put on the chopping block. Some missing datasets have been restored but are incomplete or have unexplained changes, rendering them unreliable.

Both Republican and Democratic administrations have played a role in limiting public access to government records. But the scale and speed of the Trump administration’s data manipulation, combined with buyouts, resignations and other restructuring across federal agencies, signal a new phase in the war on public information. This is not just about deleting files; it’s about controlling what the public sees, shaping the narrative and limiting accountability.

The Trump administration is accelerating this trend with revisions to official records. Unelected advisors are overseeing a sweeping reorganization of federal data, granting entities like DOGE unprecedented access to taxpayer records with little oversight. This is not just a bureaucratic reshuffle — it is a fundamental reshaping of the public record.

The consequences of data manipulation extend far beyond politics. When those in power control the flow of information, they can dictate collective truth. Governments that manipulate information are not just rewriting statistics — they are rewriting history.

From authoritarian regimes that have erased dissent to leaders who have fabricated economic numbers to maintain their grip on power, the dangers of suppressing and distorting data are well-documented.

Misleading or inconsistent data can be just as dangerous as opacity. When hard facts are replaced with political spin, conspiracy theories take root and misinformation fills the void.

The fact that data suppression and manipulation have occurred before does not lessen the danger but underscores the urgency of taking proactive measures to safeguard transparency. A missing statistic today can become a missing historical fact tomorrow. Over time, that can reshape our reality…(More)”.

Large AI models are cultural and social technologies


Essay by Henry Farrell, Alison Gopnik, Cosma Shalizi, and James Evans: “Debates about artificial intelligence (AI) tend to revolve around whether large models are intelligent, autonomous agents. Some AI researchers and commentators speculate that we are on the cusp of creating agents with artificial general intelligence (AGI), a prospect anticipated with both elation and anxiety. There have also been extensive conversations about cultural and social consequences of large models, orbiting around two foci: immediate effects of these systems as they are currently used, and hypothetical futures when these systems turn into AGI agents, perhaps even superintelligent AGI agents.

But this discourse about large models as intelligent agents is fundamentally misconceived. Combining ideas from social and behavioral sciences with computer science can help us understand AI systems more accurately. Large models should not be viewed primarily as intelligent agents, but as a new kind of cultural and social technology, allowing humans to take advantage of information other humans have accumulated.

The new technology of large models combines important features of earlier technologies. Like pictures, writing, print, video, Internet search, and other such technologies, large models allow people to access information that other people have created. Large models – currently language, vision, and multi-modal – depend on the fact that the Internet has made the products of these earlier technologies readily available in machine-readable form. But like economic markets, state bureaucracies, and other social technologies, these systems not only make information widely available, they allow it to be reorganized, transformed, and restructured in distinctive ways. Adopting Herbert Simon’s terminology, large models are a new variant of the “artificial systems of human society” that process information to enable large-scale coordination…(More)”

Can small language models revitalize Indigenous languages?


Article by Brooke Tanner and Cameron F. Kerry: “Indigenous languages play a critical role in preserving cultural identity and transmitting unique worldviews, traditions, and knowledge, but at least 40% of the world’s 6,700 languages are currently endangered. The United Nations declared 2022-2032 as the International Decade of Indigenous Languages to draw attention to this threat, in hopes of supporting the revitalization of these languages and preservation of access to linguistic resources.  

Building on the advantages of small language models (SLMs), several initiatives have successfully adapted these models specifically for Indigenous languages. Such Indigenous language models (ILMs) represent a subset of SLMs that are designed, trained, and fine-tuned with input from the communities they serve.

Case studies and applications 

  • Meta released No Language Left Behind (NLLB-200), a 54-billion-parameter open-source machine translation model that supports 200 languages as part of Meta’s universal speech translator project. The model includes support for languages with limited translation resources. While the model’s breadth of languages included is novel, NLLB-200 can struggle to capture the intricacies of local context for low-resource languages and often relies on machine-translated sentence pairs across the internet due to the scarcity of digitized monolingual data.
  • Lelapa AI’s InkubaLM-0.4B is an SLM with applications for low-resource African languages. Trained on 1.9 billion tokens across languages including isiZulu, Yoruba, Swahili, and isiXhosa, InkubaLM-0.4B (with 400 million parameters) builds on Meta’s LLaMA 2 architecture, providing a smaller model than the original LLaMA 2 pretrained model with 7 billion parameters. 
  • IBM Research Brazil and the University of São Paulo have collaborated on projects aimed at preserving Brazilian Indigenous languages such as Guarani Mbya and Nheengatu. These initiatives emphasize co-creation with Indigenous communities and address concerns about cultural exposure and language ownership. Initial efforts included electronic dictionaries, word prediction, and basic translation tools. Notably, when a prototype writing assistant for Guarani Mbya raised concerns about exposing their language and culture online, project leaders paused further development pending community consensus.  
  • Researchers have fine-tuned pre-trained models for Nheengatu using linguistic educational sources and translations of the Bible, with plans to incorporate community-guided spellcheck tools. Since the translations relying on data from the Bible, primarily translated by colonial priests, often sounded archaic and could reflect cultural abuse and violence, they were classified as potentially “toxic” data that would not be used in any deployed system without explicit Indigenous community agreement…(More)”.

Bridging Digital Divides: How PescaData is Connecting Small-Scale Fishing Cooperatives to the Blue Economy


Article by Stuart Fulton: “In this research project, we examine how digital platforms – specifically PescaData – can be leveraged to connect small-scale fishing cooperatives with impact investors and donors, creating new pathways for sustainable blue economy financing, while simultaneously ensuring fair data practices that respect data sovereignty and traditional ecological knowledge.

PescaData emerged as a pioneering digital platform that enables fishing communities to collect more accurate data to ensure sustainable fisheries. Since then, PescaData has evolved to provide software as a service to fishing cooperatives and to allow fishers to document their solutions to environmental and economic challenges. Since 2022, small-scale fishers have used it to document nearly 300 initiatives that contribute to multiple Sustainable Development Goals. 

Respecting Data Sovereignty in the Digital Age

One critical aspect of our research acknowledges the unique challenges of implementing digital tools in traditional cooperative settings. Unlike conventional tech implementations that often extract value from communities, PescaData’s approach centers on data sovereignty – the principle that fishing communities should maintain ownership and control over their data. As the PescaData case study demonstrates, a humanity-centric rather than merely user-centric approach is essential. This means designing with compassion and establishing clear governance around data from the very beginning. The data generated by fishing cooperatives represents not just information, but traditional knowledge accumulated over generations of resource management.

The fishers themselves have articulated clear principles for data governance in a cooperative model:

  • Ownership: Fishers, as data producers, decide who has access and under what conditions.
  • Transparency: Clear agreements on data use.
  • Knowledge assessment: Highlighting fishers’ contributions and placing them in decision-making positions.
  • Co-design: Ensuring the platform meets their specific needs.
  • Security: Protecting collected data…(More)”.

A Quest for AI Knowledge


Paper by Joshua S. Gans: “This paper examines how the introduction of artificial intelligence (AI), particularly generative and large language models capable of interpolating precisely between known data points, reshapes scientists’ incentives for pursuing novel versus incremental research. Extending the theoretical framework of Carnehl and Schneider (2025), we analyse how decision-makers leverage AI to improve precision within well-defined knowledge domains. We identify conditions under which the availability of AI tools encourages scientists to choose more socially valuable, highly novel research projects, contrasting sharply with traditional patterns of incremental knowledge growth. Our model demonstrates a critical complementarity: scientists strategically align their research novelty choices to maximise the domain where AI can reliably inform decision-making. This dynamic fundamentally transforms the evolution of scientific knowledge, leading either to systematic “stepping stone” expansions or endogenous research cycles of strategic knowledge deepening. We discuss the broader implications for science policy, highlighting how sufficiently capable AI tools could mitigate traditional inefficiencies in scientific innovation, aligning private research incentives closely with the social optimum…(More)”.