Announcing SPARROW: A Breakthrough AI Tool to Measure and Protect Earth’s Biodiversity in the Most Remote Places


Blog by Juan Lavista Ferres: “The biodiversity of our planet is rapidly declining. We’ve likely reached a tipping point where it is crucial to use every tool at our disposal to help preserve what remains. That’s why I am pleased to announce SPARROW—Solar-Powered Acoustic and Remote Recording Observation Watch, developed by Microsoft’s AI for Good Lab. SPARROW is an AI-powered edge computing solution designed to operate autonomously in the most remote corners of the planet. Solar-powered and equipped with advanced sensors, it collects biodiversity data—from camera traps, acoustic monitors, and other environmental detectors—that are processed using our most advanced PyTorch-based wildlife AI models on low-energy edge GPUs. The resulting critical information is then transmitted via low-Earth orbit satellites directly to the cloud, allowing researchers to access fresh, actionable insights in real time, no matter where they are. 

Think of SPARROW as a network of Earth-bound satellites, quietly observing and reporting on the health of our ecosystems without disrupting them. By leveraging solar energy, these devices can run for a long time, minimizing their footprint and any potential harm to the environment…(More)”.

A linkless internet


Essay by Collin Jennings: “…But now Google and other websites are moving away from relying on links in favour of artificial intelligence chatbots. Considered as preserved trails of connected ideas, links make sense as early victims of the AI revolution since large language models (LLMs) such as ChatGPT, Google’s Gemini and others abstract the information represented online and present it in source-less summaries. We are at a moment in the history of the web in which the link itself – the countless connections made by website creators, the endless tapestry of ideas woven together throughout the web – is in danger of going extinct. So it’s pertinent to ask: how did links come to represent information in the first place? And what’s at stake in the movement away from links toward AI chat interfaces?

To answer these questions, we need to go back to the 17th century, when writers and philosophers developed the theory of mind that ultimately inspired early hypertext plans. In this era, prominent philosophers, including Thomas Hobbes and John Locke, debated the extent to which a person controls the succession of ideas that appears in her mind. They posited that the succession of ideas reflects the interaction between the data received from the senses and one’s mental faculties – reason and imagination. Subsequently, David Hume argued that all successive ideas are linked by association. He enumerated three kinds of associative connections among ideas: resemblance, contiguity, and cause and effect. In An Enquiry Concerning Human Understanding (1748), Hume offers examples of each relationship:

A picture naturally leads our thoughts to the original: the mention of one apartment in a building naturally introduces an enquiry or discourse concerning the others: and if we think of a wound, we can scarcely forbear reflecting on the pain which follows it.

The mind follows connections found in the world. Locke and Hume believed that all human knowledge comes from experience, and so they had to explain how the mind receives, processes and stores external data. They often reached for media metaphors to describe the relationship between the mind and the world. Locke compared the mind to a blank tablet, a cabinet and a camera obscura. Hume relied on the language of printing to distinguish between the vivacity of impressions imprinted upon one’s senses and the ideas recalled in the mind…(More)”.

Harnessing AI: How to develop and integrate automated prediction systems for humanitarian anticipatory action


CEPR Report: “Despite unprecedented access to data, resources, and wealth, the world faces an escalating wave of humanitarian crises. Armed conflict, climate-induced disasters, and political instability are displacing millions and devastating communities. Nearly one in every five children are living in or fleeing conflict zones (OCHA, 2024). Often the impacts of conflict and climatic hazards – such as droughts and floods – exacerbate each other, leading to even greater suffering. As crises unfold and escalate, the need for timely and effective humanitarian action becomes paramount.

Sophisticated systems for forecasting and monitoring natural and man-made hazards have emerged as critical tools to help inform and prompt action. The full potential for the use of such automated forecasting systems to inform anticipatory action (AA) is immense but is still to be realised. By providing early warnings and predictive insights, these systems could help organisations allocate resources more efficiently, plan interventions more effectively, and ultimately save lives and prevent or reduce humanitarian impact.


This Policy Insight provides an account of the significant technical, ethical, and organisational difficulties involved in such systems, and the current solutions in place…(More)”.

Harvard Is Releasing a Massive Free AI Training Dataset Funded by OpenAI and Microsoft


Article by Kate Knibbs: “Harvard University announced Thursday it’s releasing a high-quality dataset of nearly 1 million public-domain books that could be used by anyone to train large language models and other AI tools. The dataset was created by Harvard’s newly formed Institutional Data Initiative with funding from both Microsoft and OpenAI. It contains books scanned as part of the Google Books project that are no longer protected by copyright.

Around five times the size of the notorious Books3 dataset that was used to train AI models like Meta’s Llama, the Institutional Data Initiative’s database spans genres, decades, and languages, with classics from Shakespeare, Charles Dickens, and Dante included alongside obscure Czech math textbooks and Welsh pocket dictionaries. Greg Leppert, executive director of the Institutional Data Initiative, says the project is an attempt to “level the playing field” by giving the general public, including small players in the AI industry and individual researchers, access to the sort of highly-refined and curated content repositories that normally only established tech giants have the resources to assemble. “It’s gone through rigorous review,” he says…(More)”.

The Recommendation on Information Integrity


OECD Recommendation: “…The digital transformation of societies has reshaped how people interact and engage with information. Advancements in digital technologies and novel forms of communication have changed the way information is produced, shared, and consumed, locally and globally and across all media. Technological changes and the critical importance of online information platforms offer unprecedented access to information, foster citizen engagement and connection, and allow for innovative news reporting. However, they can also provide a fertile ground for the rapid spread of false, altered, or misleading content. In addition, new generative AI tools have greatly reduced the barriers to creating and spreading content.

Promoting the availability and free flow of high-quality, evidence-based information is key to upholding individuals’ ability to seek and receive information and ideas of all kinds and to safeguarding freedom of opinion and expression. 

The volume of content to which citizens are exposed can obscure and saturate public debates and help widen societal divisions. In this context, the quality of civic discourse declines as evidence-based information, which helps people make sense of their social environment, becomes harder to find. This reality has acted as a catalyst for governments to explore more closely the roles they can play, keeping as a priority in our democracies the necessity that governments should not exercise control of the information ecosystem and that, on the contrary, they support an environment where a plurality of information sources, views, and opinions can thrive…Building on the detailed policy framework outlined in the OECD report Facts not Fakes: Tackling Disinformation, Strengthening Information Integrity, the Recommendation provides an ambitious and actionable international standard that will help governments develop a systemic approach to foster information integrity, relying on a multi-stakeholder approach…(More)”.

How Years of Reddit Posts Have Made the Company an AI Darling


Article by Sarah E. Needleman: “Artificial-intelligence companies were one of Reddit’s biggest frustrations last year. Now they are a key source of growth for the social-media platform. 

These companies have an insatiable appetite for online data to train their models and display content in an easy-to-digest format. In mid-2023, Reddit, a social-media veteran and IPO newbie, turned off the spigot and began charging some businesses for access to its data. 

It turns out that Reddit’s ever-growing 19-year warehouse of user commentary makes it an attractive resource for AI companies. The platform recently reported its first quarterly profit as a publicly traded company, thanks partly to data-licensing deals it made in the past year with OpenAI and Google.

Reddit Chief Executive and co-founder Steve Huffman has said the company had to stop giving away its valuable data to the world’s largest companies for free. 

“It is an arms race,” he said at The Wall Street Journal’s Tech Live conference in October. “But we’re in talks with just about everybody, so we’ll see where these things land.”

Reddit’s huge amount of data works well for AI companies because it is organized by topics and uses a voting system instead of an algorithm to sort content quality, and because people’s posts tend to be candid.

For the first nine months of 2024, Reddit’s revenue category that includes licensing grew to $81.6 million from $12.3 million a year earlier.

While data-licensing revenue remains dwarfed by Reddit’s core advertising sales, the new category’s rapid growth reveals a potentially lucrative business line with relatively high margins.

Diversifying away from a reliance on advertising, while tapping into an AI-adjacent market, has also made Reddit attractive to investors who are searching for new exposure to the latest technology boom. Reddit’s stock has more than doubled in the past three months.

The source of Reddit’s newfound wealth is the burgeoning market for AI-useful data. Reddit’s willingness to sell its data to AI outfits makes it stand out, because there is only a finite amount of data available for AI companies to gobble up for free or purchase. Some executives and researchers say the industry’s need for high-quality text could outstrip supply within two years, potentially slowing AI’s development…(More)”.

Citizen science as an instrument for women’s health research


Paper by Sarah Ahannach et al: “Women’s health research is receiving increasing attention globally, but considerable knowledge gaps remain. Across many fields of research, active involvement of citizens in science has emerged as a promising strategy to help align scientific research with societal needs. Citizen science offers researchers the opportunity for large-scale sampling and data acquisition while engaging the public in a co-creative approach that solicits their input on study aims, research design, data gathering and analysis. Here, we argue that citizen science has the potential to generate new data and insights that advance women’s health. Based on our experience with the international Isala project, which used a citizen-science approach to study the female microbiome and its influence on health, we address key challenges and lessons for generating a holistic, community-centered approach to women’s health research. We advocate for interdisciplinary collaborations to fully leverage citizen science in women’s health toward a more inclusive research landscape that amplifies underrepresented voices, challenges taboos around intimate health topics and prioritizes women’s involvement in shaping health research agendas…(More)”.

Changing Behaviour by Adding an Option


Paper by Lukas Fuchs: “Adding an option is a neglected mechanism for bringing about behavioural change. This mechanism is distinct from nudges, which are changes in the choice architecture, and instead makes it possible to pursue republican paternalism, a unique form of paternalism in which choices are changed by expanding people’s set of options. I argue that this is truly a form of paternalism (albeit a relatively soft one) and illustrate some of its manifestations in public policy, specifically public options and market creation. Furthermore, I compare it with libertarian paternalism on several dimensions, namely respect for individuals’ agency, effectiveness, and efficiency. Finally, I consider whether policymakers have the necessary knowledge to successfully change behaviour by adding options. Given that adding an option has key advantages over nudges in most if not all of these dimensions, it should be considered indispensable in the behavioural policymaker’s toolbox…(More)”.

Must NLP be Extractive?


Paper by Steven Bird: “How do we roll out language technologies across a world with 7,000 languages? In one story, we scale the successes of NLP further into ‘low-resource’ languages, doing ever more with less. However, this approach does not recognise the fact that – beyond the 500 institutional languages – the remaining languages are oral vernaculars. These speech communities interact with the outside world using a ‘contact language’. I argue that contact languages are the appropriate target for technologies like speech recognition and machine translation, and that the 6,500 oral vernaculars should be approached differently. I share stories from an Indigenous community where local people reshaped an extractive agenda to align with their relational agenda. I describe the emerging paradigm of Relational NLP and explain how it opens the way to non-extractive methods and to solutions that enhance human agency…(More)”.

Navigating the AI Frontier: A Primer on the Evolution and Impact of AI Agents


Report by the World Economic Forum: “AI agents are autonomous systems capable of sensing, learning and acting upon their environments. This white paper explores their development and looks at how they are linked to recent advances in large language and multimodal models. It highlights how AI agents can enhance efficiency across sectors including healthcare, education and finance.

Tracing their evolution from simple rule-based programmes to sophisticated entities with complex decision-making abilities, the paper discusses both the benefits and the risks associated with AI agents. Ethical considerations such as transparency and accountability are emphasized, highlighting the need for robust governance frameworks and cross-sector collaboration.

By understanding the opportunities and challenges that AI agents present, stakeholders can responsibly leverage these systems to drive innovation, improve practices and enhance quality of life. This primer serves as a valuable resource for anyone seeking to gain a better grasp of this rapidly advancing field…(More)”.