Stefaan Verhulst
Article by Basma Albanna, Andreas Pawelke and Hodan Abdullahi: “In every community, some individuals or groups achieve significantly better outcomes than their peers, despite facing similar challenges and having similar resources. Finding these so-called positive deviants and working with them to diffuse their practices is referred to as the Positive Deviance approach. The Data-Powered Positive Deviance (DPPD) method follows the same logic as the Positive Deviance approach but leverages existing, non-traditional data sources in conjunction with traditional ones to identify and scale the solutions of positive deviants. The UNDP Somalia Accelerator Lab was part of the first cohort of teams that piloted the application of DPPD, aiming to tackle the rangeland health problem in the West Golis region. In this blog post we reflect on the process we designed and tested to go from the identification and validation of successful practices to helping other communities adopt them.
Uncovering Rangeland Success
Three years ago we embarked on a journey to identify pastoral communities in Somaliland that demonstrated resilience in the face of adversity. Using a mix of traditional and non-traditional data sources, we wanted to explore and learn from communities that managed to have healthy rangelands despite the severe droughts of 2016 and 2017.
We engaged with government officials from various ministries, experts from the University of Hargeisa, international organizations like the FAO, and members of agro-pastoral communities to learn more about rangeland health. We then selected West Golis as our region of interest, given its majority-pastoral community and relative ease of access. Applying the Soil-Adjusted Vegetation Index (SAVI) to geospatial and earth observation data allowed us to identify an initial group of potential positive deviants, illustrated as green circles in Figure 1 below.

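For readers unfamiliar with the index, SAVI adjusts the standard NDVI ratio with a soil-brightness correction factor L (commonly 0.5), which matters in sparsely vegetated rangelands like these. A minimal sketch of the computation, assuming near-infrared and red reflectance bands are already loaded as arrays (the band values below are purely illustrative, not project data):

```python
import numpy as np

def savi(nir: np.ndarray, red: np.ndarray, L: float = 0.5) -> np.ndarray:
    """Soil-Adjusted Vegetation Index: ((NIR - Red) / (NIR + Red + L)) * (1 + L)."""
    nir = nir.astype(np.float64)
    red = red.astype(np.float64)
    return (nir - red) / (nir + red + L) * (1.0 + L)

# Hypothetical reflectance values (e.g., Sentinel-2 bands B8 and B4, scaled to 0-1)
nir = np.array([[0.45, 0.50], [0.30, 0.55]])
red = np.array([[0.10, 0.12], [0.20, 0.08]])
print(savi(nir, red))  # higher values indicate denser, healthier vegetation
```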
Following the identification of potential positive deviants, we engaged with 18 pastoral communities from the Togdheer, Awdal, and Maroodijeex regions to validate whether the positive deviants we found using earth observation data were indeed doing better than the other communities.
The primary objective of the fieldwork was to uncover the existing practices and strategies that could explain the outperformance of positively deviant communities relative to their peers. The research team identified a range of strategies, including soil and water conservation techniques, locally produced pesticides, and reseeding practices, as summarized in Figure 2.

Data-Powered Positive Deviance is not just about identifying outperformers and their successful practices. The real value lies in the diffusion, adoption and adaptation of these practices by individuals, groups or communities facing similar challenges. For this to succeed, both the positive deviants and those learning about their practices must take ownership and drive the process. Merely presenting the uncommon but successful practices of positive deviants to others will not work. The secret to success lies in empowering the community to take charge, overcome challenges, and leverage its own resources and capabilities to effect change…(More)”.
Article by Muhammad Osama: “Digital technologies and social media have transformed data collection in ecology and conservation biology. Traditional biodiversity monitoring often relies on field surveys, which can be time-consuming and biased toward rural habitats.
The Global Biodiversity Information Facility (GBIF) serves as a key repository for biodiversity data, but it faces challenges such as delayed data availability and underrepresentation of urban habitats.
Social media platforms have become valuable tools for rapid data collection, enabling users to share georeferenced observations instantly, reducing time lags associated with traditional methods. The widespread use of smartphones with cameras allows individuals to document wildlife sightings in real-time, enhancing biodiversity monitoring. Integrating social media data with traditional ecological datasets offers significant advancements, particularly in tracking species distributions in urban areas.
In this paper, the authors evaluated the Jersey tiger moth’s (JTM) habitat usage by comparing occurrence data from social media platforms (Instagram and Flickr) with traditional records from GBIF and iNaturalist. They hypothesized that social media data would reveal significant JTM occurrences in urban environments, which may be underrepresented in traditional datasets…(More)”.
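For readers who want to inspect the traditional baseline themselves, GBIF exposes a public occurrence API. A minimal sketch of pulling georeferenced Jersey tiger moth (Euplagia quadripunctaria) records follows; this is a convenience illustration, not the paper’s methodology, and the social media comparison would require separate data collection:

```python
import requests

# Query the public GBIF v1 occurrence API for Jersey tiger moth records.
resp = requests.get(
    "https://api.gbif.org/v1/occurrence/search",
    params={"scientificName": "Euplagia quadripunctaria", "limit": 50},
    timeout=30,
)
resp.raise_for_status()

for rec in resp.json()["results"]:
    # Keep only georeferenced records, since the comparison is spatial.
    if rec.get("decimalLatitude") is not None:
        print(rec.get("year"), rec["decimalLatitude"], rec["decimalLongitude"])
```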
European Commission: “…welcomes the launch of the Alliance for Language Technologies European Digital Infrastructure Consortium (ALT-EDIC) and the Language Data Space (LDS).
Aimed at addressing the shortage of European language data needed for training large language models, these projects are set to revolutionise multilingual Artificial Intelligence (AI) systems across the EU.
By offering services in all EU languages, the initiatives are designed to break down language barriers, providing better, more accessible solutions for smaller businesses within the EU. This effort not only aims to preserve the EU’s rich cultural and linguistic heritage in the digital age but also strengthens Europe’s quest for tech sovereignty. Formed in February 2024, the ALT-EDIC includes 17 participating Member States and 9 observer Member States and regions, making it one of the pioneering European Digital Infrastructure Consortia.
The LDS, part of the Common European Data Spaces, is crucial for increasing data availability for AI development in Europe. Developed by the Commission and funded by the DIGITAL programme, this project aims to create a cohesive marketplace for language data. This will enhance the collection and sharing of multilingual data to support European large language models. Initially accessible to selected institutions and companies, the project aims to eventually involve all European public and private stakeholders.
Find more information about the Alliance for Language Technologies European Digital Infrastructure Consortium (ALT-EDIC) and the Language Data Space (LDS)…(More)”
Paper by Julie Battilana, Christine M. Beckman, and Julie Yen: “As threats to democracy endanger the rights and freedoms of people around the world, scholars are increasingly interrogating the role that organizations play in shaping democratic and authoritarian societies. Just as societies can be more or less democratic, so, too, can organizations. This essay, in honor of ASQ’s 70th volume, argues for a deeper focus in organizational research on the extent to which organizations themselves are democratic and the outcomes associated with these varied models of organizing. First, we provide a framework for considering the extent to which organizations are democratically organized, accounting for the varied ways in which workers can participate in their organizations. Second, we call for research on the outcomes associated with democratic organizing at both the organizational and societal levels. We build from research arguing that the extent to which workers participate in organizational decision making can spill over to impact their expectations of and participation in civic life. Moving forward, we argue it is critical to recognize that questions of democracy and authoritarianism concern not only the political contexts in which organizations are embedded but also how organizations themselves are structured and contribute to society…(More)”
OECD Policy Primer: “Immersive technologies, such as augmented reality, digital twins and virtual worlds, offer innovative ways to interact with information and the environment by engaging one’s senses. This paper explores potential benefits of these technologies, from innovative commercial applications to addressing societal challenges. It also highlights potential risks, such as extensive data collection, mental or physical risks from misuse, and emerging cyber threats. It outlines policy opportunities and challenges in maximising these benefits while mitigating risks, with real-world use cases in areas like remote healthcare and education for people with disabilities. The paper emphasises the critical role of anticipatory governance and international collaboration in shaping the human-centric and values-based development and use of immersive technologies…(More)”.
Blog by Evelien Verschroeven: “As AI continues to evolve, it is essential to move beyond focusing solely on individual behavior changes. Individual input, whether through behavior, data, or critical content, remains important. New data and fresh perspectives are necessary for AI to continue learning, growing, and improving its relevance. However, as we head into what some are calling the golden years of AI, it is critical to acknowledge a potential challenge: within five years, it is predicted that 50% of the content AI generates will be based on AI-created material, creating a risk of “inbreeding,” in which AI learns from itself rather than from the diversity of human experience and knowledge.
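The “inbreeding” risk described here is what the research literature calls model collapse. A toy sketch of the mechanism (purely illustrative, not from the blog): when each generation of a model is fitted only to samples produced by the previous generation, estimation error accumulates and the learned distribution drifts away from the original human data.

```python
import numpy as np

rng = np.random.default_rng(42)

# Generation 0: "human" data, a wide real-world distribution.
mu, sigma = 0.0, 1.0

for generation in range(1, 11):
    # Each new "model" sees only samples produced by its predecessor,
    # never fresh human data.
    samples = rng.normal(mu, sigma, size=100)
    mu, sigma = samples.mean(), samples.std()
    print(f"generation {generation:2d}: mean={mu:+.3f}, std={sigma:.3f}")

# Over many generations the fitted spread tends to shrink and the mean drifts:
# diversity is lost even though every single step looked like a faithful fit.
```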
Platforms like Google’s AI for Social Good and Unanimous AI’s Swarm play pivotal roles in breaking this cycle. By encouraging the aggregation of real-world data, they add new content that can influence and shape AI’s evolution. While they focus on individual data contributions, they also help keep AI systems grounded in real-world scenarios, ensuring that the content remains critical and diverse.
However, human oversight is key. AI systems, even with the best intentions, are still learning from patterns that humans provide. It’s essential that AI continues to receive diverse human input, so that its understanding remains grounded in real-world perspectives. AI should be continuously checked and guided by human creativity, critical thinking, and social contexts, to ensure that it doesn’t become isolated or too self-referential.
As we continue advancing AI, it is crucial to embrace relational cognition and collective intelligence. This approach will allow AI to address both individual and collective needs, enhancing not only personal development but also strengthening social bonds and fostering more resilient, adaptive communities…(More)”.
Google: “…last September we introduced AI Collaboratives, a new funding approach designed to unite public, private and nonprofit organizations and researchers to create AI-powered solutions that help people around the world.
Today, we’re sharing more about our first two focus areas for AI Collaboratives: Wildfires and Food Security.
Wildfires are a global crisis, claiming more than 300,000 lives due to smoke exposure annually and causing billions of dollars in economic damage. …Google.org has convened more than 15 organizations, including Earth Fire Alliance and Moore Foundation, to help in this important effort. By coordinating funding and integrating cutting-edge science, emerging technology and on-the-ground applications, we can provide collaborators with the tools they need to identify and track wildfires in near real time; quantify wildfire risk; shift more acreage to beneficial fires; and ultimately reduce the damage caused by catastrophic wildfires.
Nearly one-third of the world’s population faces moderate or severe food insecurity due to extreme weather, conflict and economic shocks. The AI Collaborative: Food Security will strengthen the resilience of global food systems and improve food security for the world’s most vulnerable populations through AI technologies, collaborative research, data-sharing and coordinated action. To date, 10 organizations have joined us in this effort, and we’ll share more updates soon…(More)”.
Article by Jeffrey Mervis: “…U.S. Secretary of Commerce Howard Lutnick has disbanded five outside panels that provide scientific and community advice to the U.S. Census Bureau and other federal statistical agencies just as preparations are ramping up for the country’s next decennial census, in 2030.
The dozens of demographers, statisticians, and public members on the five panels received nearly identical letters this week telling them that “the Secretary of Commerce has determined that the purposes for which the [committee] was established have been fulfilled, and the committee has been terminated effective February 28, 2025. Thank you for your service.”
Statistician Robert Santos, who last month resigned as Census Bureau director 3 years into his 5-year term, says he’s “terribly disappointed but not surprised” by the move, noting how a recent directive by President Donald Trump on gender identity has disrupted data collection for a host of federal surveys…(More)”.
Article by Brooke Tanner and Cameron F. Kerry: “Indigenous languages play a critical role in preserving cultural identity and transmitting unique worldviews, traditions, and knowledge, but at least 40% of the world’s 6,700 languages are currently endangered. The United Nations declared 2022-2032 as the International Decade of Indigenous Languages to draw attention to this threat, in hopes of supporting the revitalization of these languages and preservation of access to linguistic resources.
Building on the advantages of small language models (SLMs), several initiatives have successfully adapted these models specifically for Indigenous languages. Such Indigenous language models (ILMs) represent a subset of SLMs that are designed, trained, and fine-tuned with input from the communities they serve.
Case studies and applications
- Meta released No Language Left Behind (NLLB-200), a 54 billion–parameter open-source machine translation model that supports 200 languages as part of Meta’s universal speech translator project. The model includes support for languages with limited translation resources. While the model’s breadth of language coverage is novel, NLLB-200 can struggle to capture the intricacies of local context for low-resource languages and, given the scarcity of digitized monolingual data, often relies on machine-translated sentence pairs mined from the internet (see the translation sketch after this list).
- Lelapa AI’s InkubaLM-0.4B is an SLM with applications for low-resource African languages. Trained on 1.9 billion tokens across languages including isiZulu, Yoruba, Swahili, and isiXhosa, InkubaLM-0.4B (with 400 million parameters) builds on Meta’s LLaMA 2 architecture, providing a far smaller model than the original 7-billion-parameter LLaMA 2 pretrained model.
- IBM Research Brazil and the University of São Paulo have collaborated on projects aimed at preserving Brazilian Indigenous languages such as Guarani Mbya and Nheengatu. These initiatives emphasize co-creation with Indigenous communities and address concerns about cultural exposure and language ownership. Initial efforts included electronic dictionaries, word prediction, and basic translation tools. Notably, when a prototype writing assistant for Guarani Mbya raised concerns about exposing the community’s language and culture online, project leaders paused further development pending community consensus.
- Researchers have fine-tuned pre-trained models for Nheengatu using linguistic educational sources and translations of the Bible, with plans to incorporate community-guided spellcheck tools. Because the translations relied on Bible data, primarily translated by colonial priests, they often sounded archaic and could reflect cultural abuse and violence; they were therefore classified as potentially “toxic” data that would not be used in any deployed system without explicit Indigenous community agreement…(More)”.
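As context for the NLLB-200 bullet above, the distilled checkpoints Meta released are publicly usable through the Hugging Face transformers library. A minimal sketch, assuming the 600M-parameter distilled checkpoint and Meta’s published FLORES-200 language codes (a quick sanity check of the model, not any of these projects’ methodology):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Distilled 600M-parameter NLLB-200 checkpoint (the full 54B MoE model is
# impractical to run locally).
name = "facebook/nllb-200-distilled-600M"
tokenizer = AutoTokenizer.from_pretrained(name, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(name)

inputs = tokenizer("Language is a vessel for culture.", return_tensors="pt")
# Force decoding into Swahili (FLORES-200 code swh_Latn), one of the
# lower-resource languages NLLB-200 covers.
out = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("swh_Latn"),
    max_new_tokens=50,
)
print(tokenizer.batch_decode(out, skip_special_tokens=True)[0])
```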
Article by Anna Massoglia: “A battle is being waged in the quiet corners of government websites and data repositories. Essential public records are disappearing and, with them, Americans’ ability to hold those in power accountable.
Take the Department of Government Efficiency, Elon Musk’s federal cost-cutting initiative. Touted as “maximally transparent,” DOGE is supposed to make government spending more efficient. But when journalists and researchers exposed major errors — from double-counting contracts to conflating caps with actual spending — DOGE didn’t fix the mistakes. Instead, it made them harder to detect.
Many Americans hoped DOGE’s work would be a step toward cutting costs and restoring trust in government. But trust must be earned. If our leaders truly want to restore faith in our institutions, they must ensure that facts remain available to everyone, not just when convenient.
Since Jan. 20, public records across the federal government have been erased. Economic indicators that guide investments, scientific datasets that drive medical breakthroughs, federal health guidelines and historical archives that inform policy decisions have all been put on the chopping block. Some missing datasets have been restored but are incomplete or have unexplained changes, rendering them unreliable.
Both Republican and Democratic administrations have played a role in limiting public access to government records. But the scale and speed of the Trump administration’s data manipulation, combined with buyouts, resignations and other restructuring across federal agencies, signal a new phase in the war on public information. This is not just about deleting files; it is about controlling what the public sees, shaping the narrative and limiting accountability.
The Trump administration is accelerating this trend with revisions to official records. Unelected advisors are overseeing a sweeping reorganization of federal data, granting entities like DOGE unprecedented access to taxpayer records with little oversight. This is not just a bureaucratic reshuffle — it is a fundamental reshaping of the public record.
The consequences of data manipulation extend far beyond politics. When those in power control the flow of information, they can dictate collective truth. Governments that manipulate information are not just rewriting statistics — they are rewriting history.
From authoritarian regimes that have erased dissent to leaders who have fabricated economic numbers to maintain their grip on power, the dangers of suppressing and distorting data are well-documented.
Misleading or inconsistent data can be just as dangerous as opacity. When hard facts are replaced with political spin, conspiracy theories take root and misinformation fills the void.
The fact that data suppression and manipulation have occurred before does not lessen the danger; it underscores the urgency of taking proactive measures to safeguard transparency. A missing statistic today can become a missing historical fact tomorrow. Over time, that can reshape our reality…(More)”.