Real-time prices, real results: comparing crowdsourcing, AI, and traditional data collection


Article by Julius Adewopo, Bo Andree, Zacharey Carmichael, Steve Penson, Kamwoo Lee: “Timely, high-quality food price data is essential for shock-responsive decision-making. However, in many low- and middle-income countries, such data is often delayed, limited in geographic coverage, or unavailable due to operational constraints. Traditional price monitoring, which relies on structured surveys conducted by trained enumerators, is often constrained by challenges related to cost, frequency, and reach.

To help overcome these limitations, the World Bank launched the Real-Time Prices (RTP) data platform. This effort provides monthly price data using a machine learning framework. The models combine survey results with predictions derived from observations in nearby markets and related commodities. This approach helps fill gaps in local price data across a basket of goods, enabling real-time monitoring of inflation dynamics even when survey data is incomplete or irregular.
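The excerpt does not detail the model itself, but the general idea (borrowing signal from nearby markets and from correlated commodities to fill a gap) can be sketched in a few lines. Everything below, including the column names and the weighting scheme, is a hypothetical illustration, not the RTP platform's actual code:

```python
import numpy as np
import pandas as pd

def impute_price(obs, k=3):
    """Blend (a) the same commodity in the k nearest markets and
    (b) correlated commodities in the target market, all observed in
    the same month, into one weighted estimate."""
    # (a) same commodity elsewhere, weighted by inverse distance
    nearby = obs[obs.kind == "nearby_market"].nsmallest(k, "distance_km").copy()
    nearby["w"] = 1.0 / nearby["distance_km"].clip(lower=1.0)

    # (b) other commodities here, weighted by historical correlation
    related = obs[obs.kind == "related_commodity"].copy()
    related["w"] = related["corr_with_target"].clip(lower=0.0)

    pool = pd.concat([nearby, related])
    if pool.empty or pool["w"].sum() == 0:
        return np.nan  # nothing informative to borrow from
    # Note: a real model would work on log-prices or price relatives
    # rather than mixing raw price levels across commodities.
    return float(np.average(pool["price"], weights=pool["w"]))

# Hypothetical observations for one (market, commodity, month) gap:
obs = pd.DataFrame({
    "kind": ["nearby_market", "nearby_market", "related_commodity"],
    "price": [104.0, 99.0, 87.0],
    "distance_km": [12.0, 40.0, np.nan],
    "corr_with_target": [np.nan, np.nan, 0.8],
})
print(impute_price(obs))
```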

In parallel, new approaches—such as citizen-submitted (crowdsourced) data—are being explored to complement conventional data collection methods. These crowdsourced data were recently published in a Nature Scientific Data paper. While the adoption of these innovations is accelerating, maintaining trust requires rigorous validation.

A newly published study in PLOS compares the two emerging methods with the traditional, enumerator-led gold standard, providing new evidence that both crowdsourced and AI-imputed prices can serve as credible, timely alternatives to traditional ground-truth data collection—especially in contexts where conventional methods face limitations…(More)”.

These Startups Are Building Advanced AI Models Without Data Centers


Article by Will Knight: “Researchers have trained a new kind of large language model (LLM) using GPUs dotted across the world and fed private as well as public data—a move that suggests that the dominant way of building artificial intelligence could be disrupted.

Flower AI and Vana, two startups pursuing unconventional approaches to building AI, worked together to create the new model, called Collective-1.

Flower created techniques that allow training to be spread across hundreds of computers connected over the internet. The company’s technology is already used by some firms to train AI models without needing to pool compute resources or data. Vana provided sources of data including private messages from X, Reddit, and Telegram.
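Flower's actual protocol is more elaborate than the article describes, but the core pattern it builds on, federated averaging, is easy to sketch: each machine trains on data it never shares, and only model updates travel to an aggregator. Here is a toy NumPy version, with all details (the linear model, learning rate, and dataset) invented for illustration:

```python
import numpy as np

def local_step(w, X, y, lr=0.05):
    """One gradient step of least-squares regression on a node's private data."""
    grad = 2 * X.T @ (X @ w - y) / len(y)
    return w - lr * grad

def federated_round(w, nodes):
    """Each node trains locally; only weight vectors travel over the wire."""
    updates = [local_step(w, X, y) for X, y in nodes]
    sizes = [len(y) for _, y in nodes]
    # The aggregator averages updates, weighted by local dataset size.
    return np.average(updates, axis=0, weights=sizes)

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
# Three "machines", each holding private data the others never see.
nodes = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    nodes.append((X, X @ true_w + rng.normal(scale=0.1, size=50)))

w = np.zeros(2)
for _ in range(300):
    w = federated_round(w, nodes)
print(w)  # converges near [2.0, -1.0] with no raw data ever pooled
```

The design point is that the bandwidth cost scales with the model, not the data, which is what makes training over internet-connected machines plausible at all.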

Collective-1 is small by modern standards, with 7 billion parameters—values that combine to give the model its abilities—compared to hundreds of billions for today’s most advanced models, such as those that power programs like ChatGPT, Claude, and Gemini.

Nic Lane, a computer scientist at the University of Cambridge and cofounder of Flower AI, says that the distributed approach promises to scale far beyond the size of Collective-1. Lane adds that Flower AI is partway through training a model with 30 billion parameters using conventional data, and plans to train another model with 100 billion parameters—close to the size offered by industry leaders—later this year. “It could really change the way everyone thinks about AI, so we’re chasing this pretty hard,” Lane says. He says the startup is also incorporating images and audio into training to create multimodal models.

Distributed model-building could also unsettle the power dynamics that have shaped the AI industry…(More)”

In Uncertain Times, Get Curious


Chapter (and book) by Elizabeth Weingarten: “Questions flow from curiosity. If we want to live and love the questions of our lives—How to live a life of purpose? Who am I in the aftermath of a big change or transition? What kind of person do I want to become as I grow older?—we must first ask them into conscious existence.

Many people have written entire books defining and redefining curiosity. But for me, the most helpful definition comes from a philosophy professor, Perry Zurn, and a systems neuroscientist, Dani Bassett: “For too long—and still too often—curiosity has been oversimplified,” they write, typically “reduced to the simple act of raising a hand or voicing a question, especially from behind a desk or a podium. . . . Scholars generally boil it down to ‘information-seeking’ behavior or a ‘desire to know.’ But curiosity is more than a feeling and certainly more than an act. And curiosity is always more than a single move or a single question.” Curiosity works, they write, by “linking ideas, facts, perceptions, sensations and data points together.” It is complex, mutating, unpredictable, and transformational. It is, fundamentally, an act of connection, an act of creating relationships between ideas and people. Asking questions, then, becoming curious, is not just about wanting to find the answer—it is also about our need to connect, with ourselves, with others, with the world.

And this, perhaps, is why our deeper questions are hardly ever satisfied by Google or by fast, easy answers from the people I refer to as the Charlatans of Certainty—the gurus, influencers, and “experts” peddling simple solutions to all the complex problems you face. This is also the reason there is no one-size-fits-all formula for cultivating curiosity—particularly the kind that allows us to live and love our questions, especially the questions that are hard to love, like “How can I live with chronic pain?” or “How do I extricate myself from a challenging relationship?” This kind of curiosity is a special flavor…(More)”. See also: Inquiry as Infrastructure: Defining Good Questions in the Age of Data and AI.

Mapping local knowledge supports science and stewardship


Paper by Sarah C. Risley, Melissa L. Britsch, Joshua S. Stoll & Heather M. Leslie: “Coastal marine social–ecological systems are experiencing rapid change. Yet, many coastal communities are challenged by incomplete data to inform collaborative research and stewardship. We investigated the role of participatory mapping of local knowledge in addressing these challenges. We used participatory mapping and semi-structured interviews to document local knowledge in two focal social–ecological systems in Maine, USA. By co-producing fine-scale characterizations of coastal marine social–ecological systems, highlighting local questions and needs, and generating locally relevant hypotheses on system change, our research demonstrates how participatory mapping and local knowledge can enhance decision-making capacity in collaborative research and stewardship. The results of this study directly informed a collaborative research project to document changes in multiple shellfish species, shellfish predators, and shellfish harvester behavior and other human activities. This research demonstrates that local knowledge can be a keystone component of collaborative social–ecological systems research and community-led environmental stewardship…(More)”.

Inquiry as Infrastructure: Defining Good Questions in the Age of Data and AI


Paper by Stefaan Verhulst: “The most consequential failures in data-driven policymaking and AI deployment often stem not from poor models or inadequate datasets but from poorly framed questions. This paper centers question literacy as a critical yet underdeveloped competency in the data and policy landscape. Arguing for a “new science of questions,” it explores what constitutes a good question: one that is not only technically feasible but also ethically grounded, socially legitimate, and aligned with real-world needs. Drawing on insights from The GovLab’s 100 Questions Initiative, the paper develops a taxonomy of question types (descriptive, diagnostic, predictive, and prescriptive) and identifies five essential criteria for question quality: questions must be general yet concrete, co-designed with affected communities and domain experts, purpose-driven and ethically sound, grounded in data and technical realities, and capable of evolving through iterative refinement. The paper also outlines common pathologies of bad questions, such as vague formulation, biased framing, and solution-first thinking. Rather than treating questions as incidental to analysis, it argues for institutionalizing deliberate question design through tools like Q-Labs, question maturity models, and new professional roles for data stewards. Ultimately, the paper contends that questions are infrastructures of meaning. What we ask shapes not only what data we collect or what models we build but also what values we uphold and what futures we make possible…(More)”.

Open with care: transparency and data sharing in civically engaged research


Paper by Ankushi Mitra: “Research transparency and data access are considered increasingly important for advancing research credibility, cumulative learning, and discovery. However, debates persist about how to define and achieve these goals across diverse forms of inquiry. This article intervenes in these debates, arguing that the participants and communities with whom scholars work are active stakeholders in science, with a range of rights and interests, and that researchers owe them obligations in the practice of transparency and openness. Drawing on civically engaged research and related approaches that advocate for subjects of inquiry to more actively shape its process and share in its benefits, I outline a broader vision of research openness not only as a matter of peer scrutiny among scholars or a top-down exercise in compliance, but rather as a space for engaging and maximizing opportunities for all stakeholders in research. Accordingly, this article provides an ethical and practical framework for broadening transparency, accessibility, and data-sharing and benefit-sharing in research. It promotes movement beyond open science to a more inclusive and socially responsive science anchored in a larger ethical commitment: that the pursuit of knowledge be accountable and its benefits made accessible to the citizens and communities who make it possible…(More)”.

Who Owns Science?


Article by Lisa Margonelli: “Only a few months into 2025, the scientific enterprise is reeling from a series of shocks—mass firings of the scientific workforce across federal agencies, cuts to federal research budgets, threats to indirect costs for university research, proposals to tax endowments, termination of federal science advisory committees, and research funds to prominent universities held hostage over political conditions. Amid all this, the public has not shown much outrage at—or even interest in—the dismantling of the national research project that they’ve been bankrolling for the past 75 years.

Some evidence of a disconnect from the scientific establishment was visible in confirmation hearings of administration appointees. During his Senate nomination hearing to head the Department of Health and Human Services, Robert F. Kennedy Jr. promised a reorientation of research from infectious disease toward chronic conditions, along with “radical transparency” to rebuild trust in science. While his fans applauded, he insisted that he was not anti-vaccine, declaring, “I am pro-safety.”

But the lack of public reaction to funding cuts need not be pinned on distrust of science; it could simply be that few citizens see the $200-billion-per-year, envy-of-the-world scientific enterprise as their own. On March 15, Alabama meteorologist James Spann took to Facebook to narrate the approach of 16 tornadoes in the state, noting that people didn’t seem to care about the president’s threat to close the National Weather Service. “People say, ‘Well, if they shut it down, I’ll just use my app,’” Spann told Inside Climate News. “Well, where do you think the information on your app comes from? It comes from computer model output that’s run by the National Weather Service.” The public has paid for those models for generations, but only a die-hard weather nerd can find the acronyms for the weather models that signal that investment on these apps…(More)”.

Inside arXiv—the Most Transformative Platform in All of Science


Article by Sheon Han: “Nearly 35 years ago, Paul Ginsparg created arXiv, a digital repository where researchers could share their latest findings—before those findings had been systematically reviewed or verified. Visit arXiv.org today (it’s pronounced like “archive”) and you’ll still see its old-school Web 1.0 design, featuring a red banner and the seal of Cornell University, the platform’s institutional home. But arXiv’s unassuming facade belies the tectonic reconfiguration it set off in the scientific community. If arXiv were to stop functioning, scientists from every corner of the planet would suffer an immediate and profound disruption. “Everybody in math and physics uses it,” Scott Aaronson, a computer scientist at the University of Texas at Austin, told me. “I scan it every night.”

Every industry has certain problems universally acknowledged as broken: insurance in health care, licensing in music, standardized testing in education, tipping in the restaurant business. In academia, it’s publishing. Academic publishing is dominated by for-profit giants like Elsevier and Springer. Calling their practice a form of thuggery isn’t so much an insult as an economic observation. Imagine if a book publisher demanded that authors write books for free and, instead of employing in-house editors, relied on other authors to edit those books, also for free. And not only that: The final product was then sold at prohibitively expensive prices to ordinary readers, and institutions were forced to pay exorbitant fees for access…(More)”.

Can We Measure the Impact of a Database?


Article by Peter Buneman, Dennis Dosso, Matteo Lissandrini, Gianmaria Silvello, and He Sun: “Databases publish data. This is undoubtedly the case for scientific and statistical databases, which have largely replaced traditional reference works. Database and Web technologies have led to an explosion in the number of databases that support scientific research, for obvious reasons: Databases provide faster communication of knowledge, hold larger volumes of data, are more easily searched, and are both human- and machine-readable. Moreover, they can be developed rapidly and collaboratively by a mixture of researchers and curators. For example, more than 1,500 curated databases are relevant to molecular biology alone. The value of these databases lies not only in the data they present but also in how they organize that data.

In the case of an author or journal, most bibliometric measures are obtained from citations to an associated set of publications. There are typically many ways of decomposing a database into publications, so we might use its organization to guide our choice of decompositions. We will show that when the database has a hierarchical structure, there is a natural extension of the h-index that works on this hierarchy…(More)”.
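The paper's exact generalization is not reproduced in the excerpt, but the flavor of the idea is easy to sketch: compute the classic h-index not just for the whole database but for every node of its hierarchy, over the citation counts of the records beneath it. The data layout below is hypothetical, a minimal illustration rather than the authors' formulation:

```python
def h_index(citations):
    """Largest h such that at least h items have >= h citations each."""
    counts = sorted(citations, reverse=True)
    h = 0
    while h < len(counts) and counts[h] >= h + 1:
        h += 1
    return h

def annotate_h(node):
    """Return all leaf citation counts under node, tagging every internal
    node (database, table, sub-collection, ...) with its own h-index."""
    if "cites" in node:                      # a leaf record
        return [node["cites"]]
    cites = [c for child in node["children"] for c in annotate_h(child)]
    node["h"] = h_index(cites)
    return cites

# Hypothetical two-level database: collections of citable records.
db = {"children": [
    {"children": [{"cites": 10}, {"cites": 4}]},
    {"children": [{"cites": 3}, {"cites": 1}, {"cites": 0}]},
]}
annotate_h(db)
print(db["h"], db["children"][0]["h"])  # 3 for the database, 2 for its first collection
```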

AI Is Evolving — And Changing Our Understanding Of Intelligence


Essay by Blaise Agüera y Arcas and James Manyika: “Dramatic advances in artificial intelligence today are compelling us to rethink our understanding of what intelligence truly is. Our new insights will enable us to build better AI and understand ourselves better.

In short, we are in paradigm-shifting territory.

Paradigm shifts are often fraught because it’s easier to adopt new ideas when they are compatible with one’s existing worldview but harder when they’re not. A classic example is the collapse of the geocentric paradigm, which dominated cosmological thought for roughly two millennia. In the geocentric model, the Earth stood still while the Sun, Moon, planets and stars revolved around us. The belief that we were at the center of the universe — bolstered by Ptolemy’s theory of epicycles, a major scientific achievement in its day — was both intuitive and compatible with religious traditions. Hence, Copernicus’s heliocentric paradigm wasn’t just a scientific advance but a hotly contested heresy and perhaps even, for some, as Benjamin Bratton notes, an existential trauma. So, today, artificial intelligence.

In this essay, we will describe five interrelated paradigm shifts informing our development of AI:

  1. Natural Computing — Computing existed in nature long before we built the first “artificial computers.” Understanding computing as a natural phenomenon will enable fundamental advances not only in computer science and AI but also in physics and biology.
  2. Neural Computing — Our brains are an exquisite instance of natural computing. Redesigning the computers that power AI so they work more like a brain will greatly increase AI’s energy efficiency — and its capabilities too.
  3. Predictive Intelligence — The success of large language models (LLMs) shows us something fundamental about the nature of intelligence: it involves statistical modeling of the future (including one’s own future actions) given evolving knowledge, observations and feedback from the past (a toy sketch of this idea follows the list). This insight suggests that current distinctions between designing, training and running AI models are transitory; more sophisticated AI will evolve, grow and learn continuously and interactively, as we do.
  4. General Intelligence — Intelligence does not necessarily require biologically based computation. Although AI models will continue to improve, they are already broadly capable, tackling an increasing range of cognitive tasks with a skill level approaching and, in some cases, exceeding individual human capability. In this sense, “Artificial General Intelligence” (AGI) may already be here — we just keep shifting the goalposts.
  5. Collective Intelligence — Brains, AI agents and societies can all become more capable through increased scale. However, size alone is not enough. Intelligence is fundamentally social, powered by cooperation and the division of labor among many agents. In addition to causing us to rethink the nature of human (or “more than human”) intelligence, this insight suggests social aggregations of intelligences and multi-agent approaches to AI development that could reduce computational costs, increase AI heterogeneity and reframe AI safety debates.
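The "statistical modeling of the future given feedback from the past" in point 3 can be made concrete with a deliberately tiny example: an online bigram predictor that revises its model after every observation. This is a toy analogy for the continual-learning point, not a description of how LLMs are actually built:

```python
from collections import defaultdict

class OnlinePredictor:
    """A minimal 'predictive intelligence': it never stops updating its
    statistical model of what comes next."""
    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))
        self.prev = None

    def observe(self, token):
        if self.prev is not None:
            self.counts[self.prev][token] += 1   # feedback from the past
        self.prev = token

    def predict(self):
        """Most probable next token under the evolving model."""
        successors = self.counts.get(self.prev)
        return max(successors, key=successors.get) if successors else None

p = OnlinePredictor()
for t in "the cat sat on the mat the cat ran by the".split():
    p.observe(t)
print(p.predict())  # 'cat': the model has seen "the -> cat" most often
```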

But to understand our own “intelligence geocentrism,” we must begin by reassessing our assumptions about the nature of computing, since it is the foundation of both AI and, we will argue, intelligence in any form…(More)”.