Article by Kevin Frazier: “The data relied on by OpenAI, Google, Meta, and other artificial intelligence (AI) developers is not readily available to other AI labs. Google and Meta relied, in part, on data gathered from their own products to train and fine-tune their models. OpenAI used tactics to acquire data that now would not work or may be more likely to be found in violation of the law (whether such tactics violated the law when originally used by OpenAI is being worked out in the courts). Upstart labs as well as research outfits find themselves with a dearth of data. Full realization of the positive benefits of AI, such as being deployed in costly but publicly useful ways (think tutoring kids or identifying common illnesses), as well as complete identification of the negative possibilities of AI (think perpetuating cultural biases) requires that labs other than the big players have access to quality, sufficient data.
The proper response is not to return to an exploitative status quo. Google, for example, may have relied on data from YouTube videos without meaningful consent from users. OpenAI may have hoovered up copyrighted data with little regard for the legal and social ramifications of that approach. In response to these questionable approaches, data has (rightfully) become harder to acquire. Cloudflare has equipped websites with the tools necessary to limit data scraping—the process of extracting data from another computer program. Regulators have developed new legal limits on data scraping or enforced old ones. Data owners have become more defensive over their content and, in some cases, more litigious. All of these largely positive developments from the perspective of data creators (which is to say, anyone and everyone who uses the internet) diminish the odds of newcomers entering the AI space. The creation of a public AI training data bank is necessary to ensure the availability of enough data for upstart labs and public research entities. Such banks would prevent those new entrants from having to go down the costly and legally questionable path of trying to hoover up as much data as possible…(More)”.
Article by Timothy Taylor: “When most people think of “experiments,” they think of test tubes and telescopes, of Petri dishes and Bunsen burners. But the physical apparatus is not central to what an “experiment” means. Instead, what matters is the ability to specify different conditions–and then to observe how the differences in the underlying conditions alter the outcomes. When “experiments” are understood in this broader way, the application of “experiments” is expanded.
For example, back in 1881 when Louis Pasteur tested his vaccine for sheep anthrax, he gave the vaccine to half of a flock of sheep, exposed the entire group to anthrax, and showed that those with the vaccine survived. More recently, the “Green Revolution” in agricultural technology was essentially a set of experiments, by systematically breeding plant varieties and then looking at the outcomes in terms of yield, water use, pest resistance, and the like.
This understanding of “experiment” can be applied in economics, as well. John A. List explains in “Field Experiments: Here Today Gone Tomorrow?” (American Economist, published online August 6, 2024). By “field experiments,” List is seeking to differentiate his topic from “lab experiments,” which for economists refers to experiments carried out in a classroom context, often with students as the subjects, and to focus instead on experiments that involve people in the “field”–that is, in the context of their actual economic activities, including work, selling and buying, charitable giving, and the like. As List points out, these kinds of economic experiments have been going on for decades. He points out that government agencies have been conducting field experiments for decades…(More)”.
Unfortunately, most urban data remains in silos, and the capacity of our cities to harness urban data to improve decision-making and strengthen citizen participation continues to be limited. As per the last Data Maturity Assessment Framework (DMAF) assessment conducted in November 2020 by MoHUA, among 100 smart cities only 45 have drafted/approved their City Data Policies, and just 32 had a dedicated budget for data-related activities in 2020–21. Moreover, in terms of fostering data collaborations, only 12 cities formed data alliances to achieve tangible outcomes. We hope smart cities continue this practice by conducting a yearly self-assessment to progress in their journey to harness data for improving their urban planning.
Seeding Urban Data Collaborative to advance City-level Data Engagements
There is a need to bring together a diverse set of stakeholders including governments, civil societies, academia, businesses and startups, volunteer groups and more to share and exchange urban data in a secure, standardised and interoperable manner, deriving more value from re-using data for participatory urban development. Along with improving data sharing among these stakeholders, it is necessary to regularly convene, ideate and conduct capacity building sessions and institutionalise data practices.
Urban Data Collaborative can bring together such diverse stakeholders who could address some of these perennial challenges in the ecosystem while spurring innovation…(More)”
By: Roshni Singh, Hannah Chafetz, and Stefaan G. Verhulst
The questions that society asks can transform public policy making, mobilize resources, and shape public discourse, yet decision makers around the world frequently focus on developing solutions rather than identifying the questions that need to be addressed to develop those solutions.
This blog provides a range of resources on the potential of questions for society. It includes readings on new approaches to formulating questions, how questions benefit public policy making and democracy, the importance of increasing the capacity for questioning at the individual level, and the role of questions in the age of AI and prompt engineering.
These readings underscore the need for a new science of questions – a new discipline solely focused on integrating participatory approaches for identifying, prioritizing, and addressing questions for society. This emerging discipline not only fosters creativity and critical thinking within societies but also empowers individuals and communities to engage actively in the questioning process, thereby promoting a more inclusive and equitable approach to addressing today’s societal challenges.
A few key takeaways from these readings:
Incorporating participatory approaches in questioning processes: Several of the readings discuss the value of including participatory approaches in questioning as a means to incorporate diverse perspectives, identify where there are knowledge gaps, and ensure the questions prioritized reflect current needs. In particular, the readings emphasize the role of open innovation and co-creation principles, workshops, and surveys as ways to make the questioning process more collaborative.
Advancing individuals’ questioning capability: Teaching individuals to ask their own questions fosters agency and is essential for effective democratic participation. The readings recommend cultivating this skill from early education through adulthood to empower individuals to engage actively in decision-making processes.
Improving questioning processes for responsible AI use: In the era of AI and prompt engineering, how questions are framed is key for deriving meaningful responses to AI queries. More focus on participatory question formulation in the context of AI can help foster more inclusive and responsible data governance.
In “Crowdsourcing Research Questions in Science,” the authors examine how involving the general public in formulating research questions can enhance scientific inquiry. They analyze two crowdsourcing projects in the medical sciences and find that crowd-generated questions often restate problems but provide valuable cross-disciplinary insights. Although these questions typically rank lower in novelty and scientific impact compared to professional questions, they match the practical impact of professional research. The authors argue that crowdsourcing can improve research by offering diverse perspectives. They emphasize the importance of using effective selection methods to identify and prioritize the most valuable contributions from the crowd, ensuring that the highest quality questions are highlighted and addressed.
This journal article emphasizes the growing importance of openness and collaboration in scientific research. The authors identify the lack of a unified understanding of these practices due to differences in disciplinary approaches and propose an Open Innovation in Science (OIS) Research Framework (co-developed with 47 scholars) to bridge these knowledge gaps and synthesize information across fields. The authors argue that integrating Open Science and Open Innovation concepts can enhance researchers’ and practitioners’ understanding of how these practices influence the generation and dissemination of scientific insights and innovation. The article highlights the need for interdisciplinary collaboration to address the complexities of societal, technical, and environmental challenges and provides a foundation for future research, policy discussions, and practical guidance in promoting open and collaborative scientific practices.
In “The Surprising Power of Questions,” published in Harvard Business Review, Alison Wood Brooks and Leslie K. John highlight how asking questions drives learning, innovation, and relationship building within organizations. They argue that many executives focus on answers but underestimate how well-crafted questions can enhance communication, build trust, and uncover risks. Drawing from behavioral science, the authors show how the type, tone, and sequence of questions influence the effectiveness of conversations. By refining their questioning skills, individuals can boost emotional intelligence, foster deeper connections, and unlock valuable insights that benefit both themselves and their organizations.
In “Choosing Policy-Relevant Research Questions,” Paul Kellner explains how social scientists can craft research questions that better inform policy decisions. He highlights the ongoing issue of social sciences not significantly impacting policy, as noted by experts like William Julius Wilson and Christopher Whitty. The article suggests methods for engaging policymakers in the research question formulation process, such as user engagement, co-creation, surveys, voting, and consensus-building workshops. Kellner provides examples where policymakers directly participated in the research, resulting in more practical and relevant outcomes. He concludes that improving coordination between researchers and policymakers can enhance the policy impact of social science research.
In this Op-Ed, Andrew P. Minigan emphasizes the critical role of curiosity and question formulation in education. He argues that alongside the “4 Cs” (creativity, critical thinking, communication, and collaboration), there should be a fifth C: curiosity. Asking questions enables students to identify knowledge gaps, think critically and creatively, and engage with peers. Research links curiosity to improved memory, academic achievement, and creativity. Despite these benefits, traditional teaching models often overlook curiosity. Minigan suggests teaching students to formulate questions to boost their curiosity and support educational goals. He concludes that nurturing curiosity is essential for developing innovative thinkers who can explore new, complex questions.
In this blog, Dan Rothstein highlights the importance of fostering “agency,” which is the ability of individuals to think and act independently, as a cornerstone of democracy. Rothstein and his colleague Luz Santana have spent over two decades at The Right Question Institute teaching people how to ask their own questions to enhance their participation in decision-making. They discovered that the inability to ask questions hinders involvement in decisions that impact individuals. Rothstein argues that learning to formulate questions is essential for developing agency and effective democratic participation. This skill should be taught from early education through adulthood. Despite its importance, many students do not learn this in college, so educators must focus on teaching question formulation at all levels. Rothstein concludes that empowering individuals to ask questions is vital for a strong democracy and should be a continuous effort across society.
In the chapter “From a Policy Problem to a Research Question: Getting It Right Together” from the Science for Policy Handbook, Marta Sienkiewicz emphasizes the importance of co-creation between researchers and policymakers to determine relevant research questions. She highlights the need for this approach due to the separation between research and policy cultures, and the differing natures of scientific (tame) and policy (wicked) problems. Sienkiewicz outlines a skills framework and provides examples from the Joint Research Centre (JRC), such as Knowledge Centres, staff exchanges, and collaboration facilitators, to foster interaction and collaboration. Engaging policymakers in the research question development process leads to more practical and relevant outcomes, builds trust, and strengthens relationships. This collaborative approach ensures that research is aligned with policy needs, increases the chances of evidence being used effectively in decision-making, and ultimately enhances the impact of scientific research on policy.
In “Methods for Collaboratively Identifying Research Priorities and Emerging Issues in Science and Policy,” the authors, William J. Sutherland et al., emphasize the importance of bridging the gap between scientific research and policy needs through collaborative approaches. They outline a structured, inclusive methodology that involves researchers, policymakers, and practitioners to jointly identify priority research questions. The approach includes gathering input from diverse stakeholders, iterative voting processes, and structured workshops to refine and prioritize questions, ensuring that the resulting research addresses critical societal and environmental challenges. These methods foster greater collaboration and ensure that scientific research is aligned with the practical needs of policymakers, thereby enhancing the relevance and impact of the research on policy decisions. This approach has been successfully applied in multiple fields, including conservation and agriculture, demonstrating its versatility in addressing both emerging issues and long-term policy priorities.
In this article co-authored with Anil Ananthaswamy, Stefaan Verhulst emphasizes the crucial role of framing questions correctly, particularly in the era of AI and data. They highlight how ChatGPT’s success underscores the power of well-formulated questions and their impact on deriving meaningful answers. Verhulst and Ananthaswamy argue that society’s focus on answers has overshadowed the importance of questioning, which shapes scientific inquiry, public policy, and data utilization. They call for a new science of questions that integrates diverse fields and promotes critical thinking, data literacy, and inclusive questioning to address biases and improve decision-making. This interdisciplinary effort aims to shift the emphasis from merely seeking answers to understanding the context and purpose behind the questions.
In this chapter published in “Global Digital Data Governance: Polycentric Perspectives”, Stefaan Verhulst explores the crucial role of formulating questions in ensuring responsible data usage. Verhulst argues that, in our data-driven society, responsibly handling data is key to maximizing public good and minimizing risks. He proposes a polycentric approach where the right questions are co-defined to enhance the social impact of data science. Drawing from both conceptual and practical knowledge, including his experience with The 100 Questions Initiative, Verhulst emphasizes that a participatory methodology in question formulation can democratize data use, ensuring data minimization, proportionality, participation, and accountability. By shifting from a supply-driven to a demand-driven approach, Verhulst envisions a new “science of questions” that complements data science, fostering a more inclusive and responsible data governance framework.
As we navigate the complexities of our rapidly changing world, the importance of asking the right questions cannot be overstated. We invite researchers, educators, policymakers, and curious minds alike to delve deeper into new approaches for questioning. By fostering an environment that values and prioritizes well-crafted questions, we can drive innovation, enhance education, improve public policy, and harness the potential of AI and data science. In the coming months, The GovLab, with the support of the Henry Luce Foundation, will be exploring these topics further through a series of roundtable discussions. Are you working on participatory approaches to questioning and are interested in getting involved? Email Stefaan G. Verhulst, Co-Founder and Chief R&D at The GovLab, at sverhulst@thegovlab.org.
Article by Jon Keegan: “As we run, drive, bike, and fly, humans leave behind telltale tracks of movement on Earth—if you know where to look. Physical tracks, thermal signatures, and chemical traces can reveal where we’ve been. But another type of breadcrumb trail comes from the radio signals emitted by the cars, planes, trains, and boats we use.
Just like ADS-B transmitters on airplanes, which provide real-time location, identification, speed, and orientation data, the AIS (Automatic Identification System) performs the same function for ships at sea.
Operating at 161.975 and 162.025 MHz, AIS transmitters broadcast a ship’s identification number, name, call sign, length, beam, type, and antenna location every six minutes. Ship location, position timestamp, and direction are transmitted more frequently. The primary purpose of AIS is maritime safety—it helps prevent collisions, assists in rescues, and provides insight into the impact of ship traffic on marine life.
Unlike ADS-B in a plane, AIS can only be turned off in rare circumstances. The result of this is a treasure trove of fascinating ship movement data. You can even watch live ship data on sites like Vessel Finder.
Using NOAA’s “Marine Cadastre” tool, you can download 16 years’ worth of detailed daily ship movements (filtered to the minute), in addition to “transit count” maps generated from a year’s worth of data to show each ship’s accumulated paths…(More)”.
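To make the idea of a “transit count” map concrete, here is a minimal sketch in Python of how position reports could be rolled up into per-cell counts. Everything in it is an assumption for illustration: the column names (`MMSI`, `BaseDateTime`, `LAT`, `LON`), the made-up sample rows, the 0.1-degree grid, and the counting rule (distinct vessels per cell, which is simpler than counting each pass of a vessel's track through a cell). It is not the method used to produce the published maps.

```python
import csv
import io
import math
from collections import defaultdict

# Hypothetical sample shaped like an AIS position export. The column names
# are assumptions; check the header of the file you actually download.
SAMPLE_CSV = """MMSI,BaseDateTime,LAT,LON
111111111,2024-01-01T00:00:00,40.72,-74.02
111111111,2024-01-01T00:01:00,40.75,-74.03
111111111,2024-01-01T00:02:00,40.83,-74.03
222222222,2024-01-01T00:00:00,40.72,-74.04
"""

CELLS_PER_DEGREE = 10  # 0.1-degree grid cells (illustrative choice)


def cell_for(lat, lon):
    """Return the integer grid index of the cell containing a coordinate."""
    return (math.floor(lat * CELLS_PER_DEGREE), math.floor(lon * CELLS_PER_DEGREE))


def transit_counts(rows):
    """Count distinct vessels (by MMSI) reported in each grid cell."""
    vessels_by_cell = defaultdict(set)
    for row in rows:
        cell = cell_for(float(row["LAT"]), float(row["LON"]))
        vessels_by_cell[cell].add(row["MMSI"])
    return {cell: len(vessels) for cell, vessels in vessels_by_cell.items()}


counts = transit_counts(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for cell, n in sorted(counts.items()):
    print(cell, n)
```

Using a set per cell means a ship that lingers in one cell is counted once, not once per report; a real analysis would also need to handle gaps between reports and cells a track crosses without reporting from.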
Article for Urban AI: “In May 2024, Nantes Métropole (France) launched a pioneering initiative titled “Nantes Débat de l’IA” (meaning “Nantes is Debating AI”). This year-long project is designed to curate the organization of events dedicated to artificial intelligence (AI) across the territory. The primary aim of this initiative is to foster dialogue among local stakeholders, enabling them to engage in meaningful discussions, exchange ideas, and develop a shared understanding of AI’s impact on the region.
Over the course of one year, the Nantes metropolitan area will host around sixty events focused on AI, bringing together a wide range of participants, including policymakers, businesses, researchers, and civil society. These events provide a platform for these diverse actors to share their perspectives, debate critical issues, and explore the potential opportunities and challenges AI presents. Through this collaborative process, the goal is to cultivate a common culture around AI, ensuring that all relevant voices are heard as the city works to integrate this transformative technology…(More)”.
About: “In a world where AI continues to be ever more entangled with our communities, cities, and decision-making processes, local governments are stepping up to address the challenges of AI governance. Today, we’re excited to announce the launch of the newly updated AI Localism Repository—a curated resource designed to help local governments, researchers, and citizens understand how AI is being governed at the state, city, or community level.
What is AI Localism?
AI Localism refers to the actions taken by local decision-makers to address AI governance in their communities. Unlike national or global policies, AI Localism offers immediate solutions tailored to specific local conditions, creating opportunities for greater effectiveness and accountability in the governance of AI.
What’s the AI Localism Repository?
The AI Localism Repository is a collection of examples of AI governance measures from around the world, focusing on how local governments are navigating the evolving landscape of AI. This resource is more than just a list of laws—it highlights innovative methods of AI governance, from the creation of expert advisory groups to the implementation of AI pilot programs.
Why AI Localism Matters
Local governments often face unique challenges in regulating AI, from ethical considerations to the social impact of AI in areas like law enforcement, housing, and employment. Yet, local initiatives are frequently overlooked by national and global AI policy observatories. The AI Localism Repository fills this gap, offering a platform for local policymakers to share their experiences and learn from one another…(More)”
Article by Stefaan Verhulst: “In September of this year, as world leaders assemble in New York for the 78th annual meeting of the United Nations (UN) General Assembly, they will confront a weighty agenda. War and peace will be at the forefront of conversations, along with efforts to tackle climate change and the ongoing migration crisis. Alongside these usual topics, however, the gathered dignitaries will also turn their attention to digital governance.
In 2021, the UN Secretary General proposed that a Global Digital Compact (GDC) be agreed upon that would “outline shared principles for an open, free and secure digital future for all”. The development of this Compact, which builds on a range of adjacent work streams at the UN, including activities related to the Sustainable Development Goals (SDGs), has now reached a vital inflection point. After a wide-ranging process of consultation, the General Assembly is expected to ratify the latest draft of the Digital Compact, which contains five key objectives and a commitment to thirteen cross-cutting principles. We have reached a rare moment of near-consensus in the global digital ecosystem, one that offers undeniable potential for revamping (and improving) our frameworks for global governance.
The Global Digital Compact will be agreed upon by UN Member States at the Summit of the Future at the United Nations Headquarters in New York, establishing guidelines for the responsible use and governance of digital technologies.
The growing prominence of these objectives and principles at the seat of global governance is a welcome development. Each is essential to developing a healthy, safe and responsible digital ecosystem. In particular, the emphasis on better data governance is a step forward, as is the related call for an enhanced approach for international AI governance. The two cannot be separated: data governance is the bedrock of AI governance.
Yet now that we are moving toward ratification of the Compact, we must focus on the next crucial, and in some ways most difficult, step: implementation. This is particularly important given that the digital realm faces in many ways a growing crisis of credibility, marked by rising concerns over exclusion, extraction, concentrations of power, mis- and disinformation, and what we have elsewhere referred to as an impending “data winter”.
Manifesting the goals of the Compact to create genuine and lasting impact is thus critical. In what follows, we explore four key ways in which the Compact’s key objectives can be operationalized to create a more vibrant, responsive and free global digital commons…(More)”.
Blog by Federico Bartolomucci: “…The conceptual distinctions between different data sharing models are mostly based on one fundamental element: the economic nature of data and its value.
Open data projects operate under the assumption that data is a non-rival (i.e. can be used by multiple people at the same time) and a non-excludable asset (i.e. anyone can use it, similar to a public good like roads or the air we breathe). This means that data can be shared with everyone, for any use, without losing its market and competitive value. The Humanitarian Data Exchange platform is a great example that allows organizations to share over 19,000 open data sets on all aspects of humanitarian response with others.
Data collaboratives treat data as an excludable asset that some people may be excluded from accessing (i.e. a ‘club good’, like a movie theater) and therefore share it only among a restricted pool of actors. At the same time, they overcome the rival nature of this data setup by linking its use to a specific purpose. These work best by giving the actors a voice in choosing the purpose for which the data will be used, and through specific agreements and governance bodies that ensure that those contributing data will not have their competitive position harmed, therefore incentivizing them to engage. A good example of this is the California Data Collaborative, which uses data from different actors in the water sector to develop high-level analysis on water distribution to guide policy, planning, and operations for water districts in the state of California.
Data ecosystems work by activating market mechanisms around data exchange to overcome reluctance to share data, rather than relying solely on its purpose of use. This means that actors can choose to share their data in exchange for compensation, be it monetary or in alternate forms such as other data. In this way, the compensation balances the potential loss of competitive advantage created by the sharing of a rival asset, as well as the costs and risks of sharing. The Enershare initiative aims to establish a marketplace utilizing blockchain and smart contracts to facilitate data exchange in the energy sector. The platform is based on a compensation system, which can be non-monetary, for exchanging assets and resources related to data (such as datasets, algorithms, and models) with energy assets and services (like heating system maintenance or the transfer of surplus locally self-produced energy).
These different models of data sharing have different operational implications…(More)”.
Column by Spencer Greenberg and Amber Dawn Ace: “In 1994, the U.S. Congress passed the largest crime bill in U.S. history, called the Violent Crime Control and Law Enforcement Act. The bill allocated billions of dollars to build more prisons and hire 100,000 new police officers, among other things. In the years following the bill’s passage, violent crime rates in the U.S. dropped drastically, from around 750 offenses per 100,000 people in 1990 to under 400 in 2018.
But can we infer, as this chart seems to ask us to, that the bill caused the drop in crime?
As it turns out, this chart wasn’t put together by sociologists or political scientists who’ve studied violent crime. Rather, we—a mathematician and a writer—devised it to make a point: Although charts seem to reflect reality, they often convey narratives that are misleading or entirely false.
Upon seeing that violent crime dipped after 1990, we looked up major events that happened right around that time—selecting one, the 1994 Crime Bill, and slapping it on the graph. There are other events we could have stuck on the graph just as easily that would likely have invited you to construct a completely different causal story. In other words, the bill and the data in the graph are real, but the story is manufactured.
Perhaps the 1994 Crime Bill really did cause the drop in violent crime, or perhaps the causality goes the other way: the spike in violent crime motivated politicians to pass the act in the first place. (Note that the act was passed slightly after the violent crime rate peaked!)
Charts are a concise way not only to show data but also to tell a story. Such stories, however, reflect the interpretations of a chart’s creators and are often accepted by the viewer without skepticism. As Noah Smith and many others have argued, charts contain hidden assumptions that can drastically change the story they tell…(More)”.