Inquiry as Infrastructure: Defining Good Questions in the Age of Data and AI


Paper by Stefaan Verhulst: “The most consequential failures in data-driven policymaking and AI deployment often stem not from poor models or inadequate datasets but from poorly framed questions. This paper centers question literacy as a critical yet underdeveloped competency in the data and policy landscape. Arguing for a “new science of questions,” it explores what constitutes a good question: one that is not only technically feasible but also ethically grounded, socially legitimate, and aligned with real-world needs. Drawing on insights from The GovLab’s 100 Questions Initiative, the paper develops a taxonomy of question types (descriptive, diagnostic, predictive, and prescriptive) and identifies five essential criteria for question quality: questions must be general yet concrete, co-designed with affected communities and domain experts, purpose-driven and ethically sound, grounded in data and technical realities, and capable of evolving through iterative refinement. The paper also outlines common pathologies of bad questions, such as vague formulation, biased framing, and solution-first thinking. Rather than treating questions as incidental to analysis, it argues for institutionalizing deliberate question design through tools like Q-Labs, question maturity models, and new professional roles for data stewards. Ultimately, the paper contends that questions are infrastructures of meaning. What we ask shapes not only what data we collect or what models we build but also what values we uphold and what futures we make possible…(More)”.

Open with care: transparency and data sharing in civically engaged research


Paper by Ankushi Mitra: “Research transparency and data access are considered increasingly important for advancing research credibility, cumulative learning, and discovery. However, debates persist about how to define and achieve these goals across diverse forms of inquiry. This article intervenes in these debates, arguing that the participants and communities with whom scholars work are active stakeholders in science, with a range of rights and interests that carry corresponding researcher obligations in the practice of transparency and openness. Drawing on civically engaged research and related approaches that advocate for subjects of inquiry to more actively shape its process and share in its benefits, I outline a broader vision of research openness not merely as a matter of peer scrutiny among scholars or a top-down exercise in compliance, but as a space for engaging and maximizing opportunities for all stakeholders in research. Accordingly, this article provides an ethical and practical framework for broadening transparency, accessibility, data-sharing, and benefit-sharing in research. It promotes movement beyond open science to a more inclusive and socially responsive science anchored in a larger ethical commitment: that the pursuit of knowledge be accountable and its benefits made accessible to the citizens and communities who make it possible…(More)”.

Who Owns Science?


Article by Lisa Margonelli: “Only a few months into 2025, the scientific enterprise is reeling from a series of shocks—mass firings of the scientific workforce across federal agencies, cuts to federal research budgets, threats to indirect costs for university research, proposals to tax endowments, termination of federal science advisory committees, and research funds to prominent universities held hostage over political conditions. Amid all this, the public has not shown much outrage at—or even interest in—the dismantling of the national research project that they’ve been bankrolling for the past 75 years.

Some evidence of a disconnect from the scientific establishment was visible in confirmation hearings of administration appointees. During his Senate nomination hearing to head the Department of Health and Human Services, Robert F. Kennedy Jr. promised a reorientation of research from infectious disease toward chronic conditions, along with “radical transparency” to rebuild trust in science. While his fans applauded, he insisted that he was not anti-vaccine, declaring, “I am pro-safety.”

But lack of public reaction to funding cuts need not be pinned on distrust of science; it could simply be that few citizens see the $200-billion-per-year, envy-of-the-world scientific enterprise as their own. On March 15, Alabama meteorologist James Spann took to Facebook to narrate the approach of 16 tornadoes in the state, taking note that people didn’t seem to care about the president’s threat to close the National Weather Service. “People say, ‘Well, if they shut it down, I’ll just use my app,’” Spann told Inside Climate News. “Well, where do you think the information on your app comes from? It comes from computer model output that’s run by the National Weather Service.” The public has paid for those models for generations, but only a die-hard weather nerd can find the acronyms for the weather models that signal that investment on these apps…(More)”.

Inside arXiv—the Most Transformative Platform in All of Science


Article by Sheon Han: “Nearly 35 years ago, physicist Paul Ginsparg created arXiv, a digital repository where researchers could share their latest findings—before those findings had been systematically reviewed or verified. Visit arXiv.org today (it’s pronounced like “archive”) and you’ll still see its old-school Web 1.0 design, featuring a red banner and the seal of Cornell University, the platform’s institutional home. But arXiv’s unassuming facade belies the tectonic reconfiguration it set off in the scientific community. If arXiv were to stop functioning, scientists from every corner of the planet would suffer an immediate and profound disruption. “Everybody in math and physics uses it,” Scott Aaronson, a computer scientist at the University of Texas at Austin, told me. “I scan it every night.”

Every industry has certain problems universally acknowledged as broken: insurance in health care, licensing in music, standardized testing in education, tipping in the restaurant business. In academia, it’s publishing. Academic publishing is dominated by for-profit giants like Elsevier and Springer. Calling their practice a form of thuggery isn’t so much an insult as an economic observation. Imagine if a book publisher demanded that authors write books for free and, instead of employing in-house editors, relied on other authors to edit those books, also for free. And not only that: The final product was then sold at prohibitively expensive prices to ordinary readers, and institutions were forced to pay exorbitant fees for access…(More)”.

Can We Measure the Impact of a Database?


Article by Peter Buneman, Dennis Dosso, Matteo Lissandrini, Gianmaria Silvello, and He Sun: “Databases publish data. This is undoubtedly the case for scientific and statistical databases, which have largely replaced traditional reference works. Database and Web technologies have led to an explosion in the number of databases that support scientific research, for obvious reasons: Databases provide faster communication of knowledge, hold larger volumes of data, are more easily searched, and are both human- and machine-readable. Moreover, they can be developed rapidly and collaboratively by a mixture of researchers and curators. For example, more than 1,500 curated databases are relevant to molecular biology alone. The value of these databases lies not only in the data they present but also in how they organize that data.

In the case of an author or journal, most bibliometric measures are obtained from citations to an associated set of publications. There are typically many ways of decomposing a database into publications, so we might use its organization to guide our choice of decompositions. We will show that when the database has a hierarchical structure, there is a natural extension of the h-index that works on this hierarchy…(More)”.
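
The standard h-index of a set of publications is the largest h such that at least h of them have h or more citations each. As a purely illustrative sketch (the article's actual hierarchical construction may differ), one naive way to lift this to a hierarchy is to score leaf records by their citation counts and score each internal node by the h-index of its children's scores:

```python
def h_index(scores):
    """Largest h such that at least h items have a score of h or more."""
    ranked = sorted(scores, reverse=True)
    h = 0
    for rank, score in enumerate(ranked, start=1):
        if score >= rank:
            h = rank
        else:
            break
    return h

def hierarchical_h(node):
    """Naive hierarchical h-index: leaves carry citation counts;
    each internal node takes the h-index of its children's scores."""
    if "children" in node:
        return h_index(hierarchical_h(child) for child in node["children"])
    return node["citations"]

# Hypothetical database decomposed into two sub-collections of cited records.
db = {"children": [
    {"children": [{"citations": 12}, {"citations": 7}, {"citations": 2}]},
    {"children": [{"citations": 5}, {"citations": 5}, {"citations": 1}]},
]}
print(hierarchical_h(db))  # 2
```

How a "citation to a database record" should be counted in the first place is, of course, part of the question the article raises; the numbers above are placeholders only.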

AI Is Evolving — And Changing Our Understanding Of Intelligence


Essay by Blaise Agüera y Arcas and James Manyika: “Dramatic advances in artificial intelligence today are compelling us to rethink our understanding of what intelligence truly is. Our new insights will enable us to build better AI and understand ourselves better.

In short, we are in paradigm-shifting territory.

Paradigm shifts are often fraught because it’s easier to adopt new ideas when they are compatible with one’s existing worldview but harder when they’re not. A classic example is the collapse of the geocentric paradigm, which dominated cosmological thought for roughly two millennia. In the geocentric model, the Earth stood still while the Sun, Moon, planets and stars revolved around us. The belief that we were at the center of the universe — bolstered by Ptolemy’s theory of epicycles, a major scientific achievement in its day — was both intuitive and compatible with religious traditions. Hence, Copernicus’s heliocentric paradigm wasn’t just a scientific advance but a hotly contested heresy and perhaps even, for some, as Benjamin Bratton notes, an existential trauma. So it is, today, with artificial intelligence.

In this essay, we will describe five interrelated paradigm shifts informing our development of AI:

  1. Natural Computing — Computing existed in nature long before we built the first “artificial computers.” Understanding computing as a natural phenomenon will enable fundamental advances not only in computer science and AI but also in physics and biology.
  2. Neural Computing — Our brains are an exquisite instance of natural computing. Redesigning the computers that power AI so they work more like a brain will greatly increase AI’s energy efficiency — and its capabilities too.
  3. Predictive Intelligence — The success of large language models (LLMs) shows us something fundamental about the nature of intelligence: it involves statistical modeling of the future (including one’s own future actions) given evolving knowledge, observations and feedback from the past. This insight suggests that current distinctions between designing, training and running AI models are transitory; more sophisticated AI will evolve, grow and learn continuously and interactively, as we do. (A toy illustration of this predictive framing follows this list.)
  4. General Intelligence — Intelligence does not necessarily require biologically based computation. Although AI models will continue to improve, they are already broadly capable, tackling an increasing range of cognitive tasks with a skill level approaching and, in some cases, exceeding individual human capability. In this sense, “Artificial General Intelligence” (AGI) may already be here — we just keep shifting the goalposts.
  5. Collective Intelligence — Brains, AI agents and societies can all become more capable through increased scale. However, size alone is not enough. Intelligence is fundamentally social, powered by cooperation and the division of labor among many agents. In addition to causing us to rethink the nature of human (or “more than human”) intelligence, this insight suggests social aggregations of intelligences and multi-agent approaches to AI development that could reduce computational costs, increase AI heterogeneity and reframe AI safety debates.
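
A toy aside on item 3, invented for illustration rather than taken from the essay: even a crude bigram model "predicts the future from statistics of the past," the same principle that large language models scale up by many orders of magnitude.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, which words followed it in past observations."""
    follows = defaultdict(Counter)
    words = text.lower().split()
    for prev, nxt in zip(words, words[1:]):
        follows[prev][nxt] += 1
    return follows

def predict_next(follows, word):
    """Most likely next word, inferred purely from those past statistics."""
    options = follows.get(word.lower())
    return options.most_common(1)[0][0] if options else None

model = train_bigrams("the sun rises in the east and the sun sets in the west")
print(predict_next(model, "the"))  # "sun": a prediction of the future from the past
```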

But to understand our own “intelligence geocentrism,” we must begin by reassessing our assumptions about the nature of computing, since it is the foundation of both AI and, we will argue, intelligence in any form…(More)”.

Behavioral AI: Unleash Decision Making with Data


Book by Rogayeh Tabrizi: “…delivers an intuitive roadmap to help organizations disentangle the complexity of their data to create tangible and lasting value. The book explains how to balance the multiple disciplines that power AI and behavioral economics using a combination of the right questions and insightful problem solving.

You’ll learn why intellectual diversity and combining subject matter experts in psychology, behavior, economics, physics, computer science, and engineering are essential to creating advanced AI solutions. You’ll also discover:

  • How behavioral economics principles influence data models and governance architectures and make digital transformation processes more efficient and effective
  • Discussions of the most important barriers to value in typical big data and AI projects and how to bring them down
  • The most effective methodology to help shorten the long, wasteful process of “boiling the ocean of data”

An exciting and essential resource for managers, executives, board members, and other business leaders engaged or interested in harnessing the power of artificial intelligence and big data, Behavioral AI will also benefit data and machine learning professionals…(More)”

A Century of Tomorrows 


Book by Glenn Adamson: “For millennia, predicting the future was the province of priests and prophets, the realm of astrologers and seers. Then, in the twentieth century, futurologists emerged, claiming that data and design could make planning into a rational certainty. Over time, many of these technologists and trend forecasters amassed power as public intellectuals, even as their predictions proved less than reliable. Now, amid political and ecological crises of our own making, we drown in a cacophony of potential futures, including, possibly, no future at all.

A Century of Tomorrows offers an illuminating account of how the world was transformed by the science (or is it?) of futurecasting. Beneath the chaos of competing tomorrows, Adamson reveals a hidden order: six key themes that have structured visions of what’s next. Helping him to tell this story are remarkable characters, including self-proclaimed futurologists such as Buckminster Fuller and Stewart Brand, as well as an eclectic array of other visionaries who have influenced our thinking about the world ahead: Octavia Butler and Ursula K. Le Guin, Shulamith Firestone and Sun Ra, Marcus Garvey and Timothy Leary, and more.

Arriving at a moment of collective anxiety and fragile hope, Adamson’s extraordinary book shows how our projections for the future are, always and ultimately, debates about the present. For tomorrow is contained within the only thing we can ever truly know: today…(More)”.

Unlocking Public Value with Non-Traditional Data: Recent Use Cases and Emerging Trends


Article by Adam Zable and Stefaan Verhulst: “Non-Traditional Data (NTD)—digitally captured, mediated, or observed data such as mobile phone records, online transactions, or satellite imagery—is reshaping how we identify, understand, and respond to public interest challenges. As part of the Third Wave of Open Data, these often privately held datasets are being responsibly re-used through new governance models and cross-sector collaboration to generate public value at scale.

In our previous post, we shared emerging case studies across health, urban planning, the environment, and more. Several months later, the momentum has not only continued but diversified. New projects reaffirm NTD’s potential—especially when linked with traditional data, embedded in interdisciplinary research, and deployed in ways that are privacy-aware and impact-focused.

This update profiles recent initiatives that push the boundaries of what NTD can do. Together, they highlight the evolving domains where this type of data is helping to surface hidden inequities, improve decision-making, and build more responsive systems:

  • Financial Inclusion
  • Public Health and Well-Being
  • Socioeconomic Analysis
  • Transportation and Urban Mobility
  • Data Systems and Governance
  • Economic and Labor Dynamics
  • Digital Behavior and Communication…(More)”.

LLM Social Simulations Are a Promising Research Method


Paper by Jacy Reese Anthis et al: “Accurate and verifiable large language model (LLM) simulations of human research subjects promise an accessible data source for understanding human behavior and training new AI systems. However, results to date have been limited, and few social scientists have adopted these methods. In this position paper, we argue that the promise of LLM social simulations can be achieved by addressing five tractable challenges. We ground our argument in a literature survey of empirical comparisons between LLMs and human research subjects, commentaries on the topic, and related work. We identify promising directions with prompting, fine-tuning, and complementary methods. We believe that LLM social simulations can already be used for exploratory research, such as pilot experiments for psychology, economics, sociology, and marketing. More widespread use may soon be possible with rapidly advancing LLM capabilities, and researchers should prioritize developing conceptual models and evaluations that can be iteratively deployed and refined at pace with ongoing AI advances…(More)”.
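
To make the prompting direction concrete, here is a minimal, hypothetical sketch rather than anything from the paper: each simulated respondent is a persona rendered into a prompt, and call_llm stands in for whichever chat-completion client a researcher actually uses. The persona fields and survey item are invented for illustration.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical stand-in; swap in a real LLM client (hosted API or local model)."""
    raise NotImplementedError("plug in an actual LLM call here")

def simulate_respondent(persona: dict, question: str, options: list) -> str:
    """Render a persona into a prompt and ask the model to answer as that person."""
    prompt = (
        f"You are a {persona['age']}-year-old {persona['occupation']} "
        f"living in {persona['location']}.\n"
        "Answer the survey question below as this person would, "
        "choosing exactly one of the listed options.\n\n"
        f"Question: {question}\n"
        f"Options: {', '.join(options)}\n"
        "Answer:"
    )
    return call_llm(prompt).strip()

# Illustrative pilot: a tiny synthetic sample for an exploratory study.
personas = [
    {"age": 34, "occupation": "nurse", "location": "a mid-sized US city"},
    {"age": 67, "occupation": "retired farmer", "location": "a rural county"},
]
question = "How much do you trust official economic forecasts?"
options = ["Not at all", "Somewhat", "A great deal"]
# answers = [simulate_respondent(p, question, options) for p in personas]
```

As the paper's emphasis on empirical comparisons with human subjects suggests, such outputs only become usable evidence once they are validated against real human samples.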