Data’s Role in Unlocking Scientific Potential


Report by the Special Competitive Studies Project: “…we outline two actionable steps the U.S. government can take immediately to address the data sharing challenges hindering scientific research.

1. Create Comprehensive Data Inventories Across Scientific Domains

We recommend that the Secretary of Commerce, acting through the Department of Commerce’s Chief Data Officer and the Director of the National Institute of Standards and Technology (NIST), and with the Federal Chief Data Officer Council (CDO Council), create a government-led inventory where organizations – universities, industries, and research institutes – can catalog their datasets with key details like purpose, description, and accreditation. Similar to platforms like data.gov, this centralized repository would make high-quality data more visible and accessible, promoting scientific collaboration. To boost participation, the government could offer incentives, such as grants or citation credits for researchers whose data is used. Contributing organizations would also be responsible for regularly updating their entries, ensuring the data stays relevant and searchable.
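
The report leaves the shape of such an inventory open. Purely as an illustrative sketch (not a schema proposed by SCSP, Commerce, or NIST), a catalog entry might carry fields like the ones below; all names and values are invented for illustration.

```python
from dataclasses import dataclass, asdict
from datetime import date
import json

@dataclass
class DatasetEntry:
    """One illustrative record in a hypothetical cross-domain data inventory."""
    title: str
    contributor: str      # university, company, or research institute
    purpose: str          # why the data was collected
    description: str      # what the dataset contains
    accreditation: str    # certifying standard or body, if any
    access_url: str
    last_updated: date    # contributors would refresh this field over time

# Example of what a contributing organization might submit (all values invented).
entry = DatasetEntry(
    title="Coastal Air Quality Sensor Readings, 2020-2024",
    contributor="Example State University",
    purpose="Long-term monitoring of coastal ozone and particulate levels",
    description="Hourly PM2.5 and ozone readings from 40 sensor sites",
    accreditation="ISO/IEC 17025-calibrated instruments",
    access_url="https://data.example.edu/coastal-air-quality",
    last_updated=date(2024, 9, 30),
)

print(json.dumps(asdict(entry), default=str, indent=2))
```

A registry in the spirit of data.gov could index these fields for search, which is what would make contributed datasets visible and keep stale entries detectable.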

2. Create Scientific Data Sharing Public-Private Partnerships

A critical recommendation of the National Data Action Plan was for the United States to facilitate the creation of data sharing public-private partnerships (PPPs) for specific sectors. The U.S. Government should coordinate data sharing partnerships among its departments and agencies, industry, academia, and civil society. Data collected by one entity can be tremendously valuable to others. But incentivizing data sharing is challenging, as privacy, security, legal (e.g., liability), and intellectual property (IP) concerns can limit willingness to share. However, narrowly scoped PPPs can help overcome these barriers, allowing for greater data sharing and mutually beneficial data use…(More)”

How Generative AI Content Could Influence the U.S. Election


Article by Valerie Wirtschafter: “…The contested nature of the presidential race means such efforts will undoubtedly continue, but they likely will remain discoverable, and their reach and ability to shape election outcomes will be minimal. Instead, the most meaningful uses of generative AI content could occur in highly targeted scenarios just prior to the election and/or in a contentious post-election environment where experience has demonstrated that potential “evidence” of malfeasance need not be true to mobilize a small subset of believers to act.

Because U.S. elections are managed at the state and county levels, low-level actors in some swing precincts or counties are catapulted to the national spotlight every four years. Since these actors are not well known to the public, targeted and personal AI-generated content can cause significant harm. Before the election, this type of fabricated content could take the form of a last-minute phone call by someone claiming to be an election worker alerting voters to an issue at their polling place.

After the election, it could become harassment of election officials or “evidence” of foul play. Due to the localized and personalized nature of this type of effort, it could be less rapidly discoverable for unknown figures not regularly in the public eye, difficult to debunk or prevent with existing tools and guardrails, and damaging to reputations. This tailored approach need not be driven by domestic actors—in fact, in the lead-up to the 2020 elections, Iranian actors pretended to be members of the Proud Boys and sent threatening emails to Democratic voters in select states demanding they vote for Donald Trump. Although election officials have worked tirelessly to brace for this possibility, they are correct to be on guard…(More)”

Buried Academic Treasures


Barrett and Greene: “…one of the presenters who said: “We have lots of research that leads to no results.”

As some of you know, we’ve written a book with Don Kettl to help academically trained researchers write in a way that would be understandable by decision makers who could make use of their findings. But the keys to writing well are only a small part of the picture. Elected and appointed officials have the capacity to ignore nearly anything, no matter how well written it is.

This is more than just a frustration to researchers; it’s a gigantic loss to the world of public administration. We spend lots of time reading through reports and frequently come across nuggets of insight that we believe could help make improvements in nearly every public sector endeavor, from human resources to budgeting to performance management to procurement and on and on. We, and others, can do our best to get attention for this kind of information, but that doesn’t mean the decision makers have the time or the inclination to act on great ideas.

We don’t want to place the blame for the disconnect between academia and practitioners on either party. To one degree or another, they’re both at fault, with taxpayers and the people who rely on government services – and that’s pretty much everybody except for people who have gone off the grid – as the losers.

Following, from our experience, are six reasons we believe it’s difficult to close the gap between the world of research and the realm of utility. The first three are aimed at government leaders; the last three have academics in mind…(More)”

First-of-its-kind dataset connects greenhouse gases and air quality


NOAA Research: “The GReenhouse gas And Air Pollutants Emissions System (GRA²PES), from NOAA and the National Institute of Standards and Technology (NIST), combines information on greenhouse gas and air quality pollutant sources into a single national database, offering innovative interactive map displays and new benefits for both climate and public health solutions.

A new U.S.-based system to combine air quality and greenhouse gas pollution sources into a single national research database is now available in the U.S. Greenhouse Gas Center portal. This geospatial data allows leaders at city, state, and regional scales to more easily identify and take steps to address air quality issues while reducing climate-related hazards for populations.

The dataset is the GReenhouse gas And Air Pollutants Emissions System (GRA²PES). A research project developed by NOAA and NIST, GRA²PES captures monthly greenhouse gas (GHG) emissions activity for multiple economic sectors to improve measurement and modeling for both GHG and air pollutants across the contiguous U.S.

Having the GHG and air quality constituents in the same dataset will be exceedingly helpful, said Columbia University atmospheric scientist Roisin Commane, the lead on a New York City project to improve emissions estimates…(More)”.
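
The announcement does not specify how the data is packaged. As a hedged sketch only, assuming a NetCDF-style gridded product with invented file and variable names (CO2, NOX), the payoff of having greenhouse gases and air pollutants on one grid might look like this with xarray:

```python
import xarray as xr

# Hypothetical GRA2PES-style file: monthly, sector-resolved emissions on a
# common grid over the contiguous U.S. File name and variables are assumed.
ds = xr.open_dataset("gra2pes_monthly_2021.nc")

# Because the GHG and air-quality species share one grid and time axis,
# they can be combined directly, e.g. a July NOx-to-CO2 emission ratio
# per grid cell, summed across economic sectors.
co2_july = ds["CO2"].sel(month=7).sum(dim="sector")
nox_july = ds["NOX"].sel(month=7).sum(dim="sector")
ratio = nox_july / co2_july

print(float(ratio.mean()))
```

With separate GHG and air-quality inventories, the same comparison would first require regridding the two products and reconciling their sector definitions.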

Science Diplomacy and the Rise of Technopoles


Article by Vaughan Turekian and Peter Gluckman: “…Science diplomacy has an important, even existential imperative to help the world reconsider the necessity of working together toward big global goals. Climate change may be the most obvious example of where global action is needed, but many other issues have similar characteristics—deep ocean resources, space, and other ungoverned areas, to name a few.

However, taking up this mantle requires acknowledging why past efforts have failed to meet their goals. The global commitment to Sustainable Development Goals (SDGs) is an example. Weaknesses in the UN system, compounded by varied commitments from member states, will prevent the achievement of the SDGs by 2030. This year’s UN Summit of the Future is intended to reboot the global commitment to the sustainability agenda. Regardless of what type of agreement is signed at the summit, its impact may be limited.  

The science community must play an active part in ensuring progress is in fact made, but that will require an expansion of the community’s current role. To understand what this might mean, consider that the Pact for the Future, agreed in New York City in September 2024, places “science, technology, and innovation” as one of its five themes. But that becomes actionable either in the narrow sense that technology will provide “answers” to global problems or in the platitudinous sense that science provides advice that is not acted upon. This dichotomy of unacceptable approaches has long bedeviled science’s influence.

For the world to make better use of science, science must take on an expanded responsibility in solving problems at both global and local scales. And science itself must become part of a toolkit—both at the practical and the diplomatic level—to address the sorts of challenges the world will face in the future. To make this happen, more countries must make science diplomacy a core part of their agenda by embedding science advisors within foreign ministries and connecting diplomats to science communities.

As the pace of technological change generates both existential risk and economic, environmental, and social opportunities, science diplomacy has a vital task in balancing outcomes for the benefit of more people. It can also bring the science community (including the social sciences and humanities) to play a critical role alongside nation states. And, as new technological developments enable nonstate actors, and especially the private sector, science diplomacy has an important role to play in helping nation states develop policy that can identify common solutions and engage key partners…(More)”.

How The New York Times incorporates editorial judgment in algorithms to curate its home page


Article by Zhen Yang: “Whether on the web or the app, the home page of The New York Times is a crucial gateway, setting the stage for readers’ experiences and guiding them to the most important news of the day. The Times publishes over 250 stories daily, far more than the 50 to 60 stories that can be featured on the home page at a given time. Traditionally, editors have manually selected and programmed which stories appear, when and where, multiple times daily. This manual process presents challenges:

  • How can we provide readers a relevant, useful, and fresh experience each time they visit the home page?
  • How can we make our editorial curation process more efficient and scalable?
  • How do we maximize the reach of each story and expose more stories to our readers?

To address these challenges, the Times has been actively developing and testing editorially driven algorithms to assist in curating home page content. These algorithms are editorially driven in that a human editor’s judgment or input is incorporated into every aspect of the algorithm — including deciding where on the home page the stories are placed, informing the rankings, and potentially influencing and overriding algorithmic outputs when necessary. From the get-go, we’ve designed algorithmic programming to elevate human curation, not to replace it…

The Times began using algorithms for content recommendations in 2011 but only recently started applying them to home page modules. For years, we only had one algorithmically-powered module, “Smarter Living,” on the home page, and later, “Popular in The Times.” Both were positioned relatively low on the page.

Three years ago, the formation of a cross-functional team — including newsroom editors, product managers, data scientists, data analysts, and engineers — brought the momentum needed to advance our responsible use of algorithms. Today, nearly half of the home page is programmed with assistance from algorithms that help promote news, features, and sub-brand content, such as The Athletic and Wirecutter. Some of these modules, such as the features module located at the top right of the home page on the web version, are in highly visible locations. During major news moments, editors can also deploy algorithmic modules to display additional coverage to complement a main module of stories near the top of the page. (The topmost news package of Figure 1 is an example of this in action.)…(More)”
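
The article describes the approach but not the code. The sketch below is a generic illustration, with invented story names and scores, of how editorial judgment can sit on top of an algorithmic ranking: editors pin stories to slots, boost or exclude others, and the algorithm orders only what remains. It is not The Times’s implementation.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Story:
    slug: str
    model_score: float                 # algorithmic relevance score (assumed input)
    editor_boost: float = 0.0          # editor-supplied adjustment to the ranking
    pinned_slot: Optional[int] = None  # editor forces a fixed position
    excluded: bool = False             # editor removes the story from the module

def curate(stories: list[Story], slots: int) -> list[str]:
    """Fill a module: pinned stories keep their slots; the rest are ranked."""
    eligible = [s for s in stories if not s.excluded]
    pinned = {s.pinned_slot: s for s in eligible if s.pinned_slot is not None}
    ranked = sorted(
        (s for s in eligible if s.pinned_slot is None),
        key=lambda s: s.model_score + s.editor_boost,
        reverse=True,
    )
    layout: list[str] = []
    for slot in range(slots):
        if slot in pinned:
            layout.append(pinned[slot].slug)
        elif ranked:
            layout.append(ranked.pop(0).slug)
    return layout

stories = [
    Story("election-live-updates", 0.92, pinned_slot=0),
    Story("climate-feature", 0.74, editor_boost=0.20),
    Story("weekend-recipes", 0.81),
    Story("outdated-explainer", 0.65, excluded=True),
]
print(curate(stories, slots=3))
# ['election-live-updates', 'climate-feature', 'weekend-recipes']
```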

Someone Put Facial Recognition Tech onto Meta’s Smart Glasses to Instantly Dox Strangers


Article by Joseph Cox: “A pair of students at Harvard have built what big tech companies refused to release publicly due to the overwhelming risks and danger involved: smart glasses with facial recognition technology that automatically looks up someone’s face and identifies them. The students have gone a step further too. Their customized glasses also pull other information about their subject from around the web, including their home address, phone number, and family members. 

The project is designed to raise awareness of what is possible with this technology, and the pair are not releasing their code, AnhPhu Nguyen, one of the creators, told 404 Media. But the experiment, tested in some cases on unsuspecting people in the real world according to a demo video, still shows the razor-thin line between a world in which people can move around with relative anonymity and one where your identity and personal information can be pulled up in an instant by strangers.

Nguyen and co-creator Caine Ardayfio call the project I-XRAY. It uses a pair of Meta’s commercially available Ray Ban smart glasses, and allows a user to “just go from face to name,” Nguyen said…(More)”.

Unlocking AI for All: The Case for Public Data Banks


Article by Kevin Frazier: “The data relied on by OpenAI, Google, Meta, and other artificial intelligence (AI) developers is not readily available to other AI labs. Google and Meta relied, in part, on data gathered from their own products to train and fine-tune their models. OpenAI used data-acquisition tactics that would not work today or that would now be more likely to be found in violation of the law (whether such tactics violated the law when originally used by OpenAI is being worked out in the courts). Upstart labs as well as research outfits find themselves with a dearth of data. Full realization of the positive benefits of AI, such as being deployed in costly but publicly useful ways (think tutoring kids or identifying common illnesses), as well as complete identification of the negative possibilities of AI (think perpetuating cultural biases), requires that labs other than the big players have access to sufficient, high-quality data.

The proper response is not to return to an exploitative status quo. Google, for example, may have relied on data from YouTube videos without meaningful consent from users. OpenAI may have hoovered up copyrighted data with little regard for the legal and social ramifications of that approach. In response to these questionable approaches, data has (rightfully) become harder to acquire. Cloudflare has equipped websites with the tools necessary to limit data scraping—the process of extracting data from another computer program. Regulators have developed new legal limits on data scraping or enforced old ones. Data owners have become more defensive over their content and, in some cases, more litigious. All of these largely positive developments from the perspective of data creators (which is to say, anyone and everyone who uses the internet) diminish the odds of newcomers entering the AI space. The creation of a public AI training data bank is necessary to ensure the availability of enough data for upstart labs and public research entities. Such banks would prevent those new entrants from having to go down the costly and legally questionable path of trying to hoover up as much data as possible…(More)”.

Zillow introduces First Street’s comprehensive climate risk data on for-sale listings across the US


Press Release: “Zillow® is introducing climate risk data, provided by First Street, the standard for climate risk financial modeling, on for-sale property listings across the U.S. Home shoppers will gain insights into five key risks—flood, wildfire, wind, heat and air quality—directly from listing pages, complete with risk scores, interactive maps and insurance requirements.

With more than 80% of buyers now considering climate risks when purchasing a home, this feature provides a clearer understanding of potential hazards, helping buyers to better assess long-term affordability and plan for the future. In assisting buyers to navigate the growing risk of climate change, Zillow is the only platform to feature tailored insurance recommendations alongside detailed historical insights, showing if or when a property has experienced past climate events, such as flooding or wildfires…

When using Zillow’s search map view, home shoppers can explore climate risk data through an interactive map highlighting five key risk categories: flood, wildfire, wind, heat and air quality. Each risk is color-coded and has its own color scale, helping consumers intuitively navigate their search. Informative labels give more context to climate data and link to First Street’s property-specific climate risk reports for full insights.

When viewing a for-sale property on Zillow, home shoppers will see a new climate risk section. This section includes a separate module for each risk category—flood, wildfire, wind, heat and air quality—giving detailed, property-specific data from First Street. This section not only shows how these risks might affect the home now and in the future, but also provides crucial information on wind, fire and flood insurance requirements.

Nationwide, more new listings came with major climate risk than homes listed for sale five years ago, according to a Zillow analysis conducted in August. That trend holds true for all five of the climate risk categories Zillow analyzed. Across all new listings in August, 16.7% were at major risk of wildfire, while 12.8% came with a major risk of flooding…(More)”.
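
Zillow has not published the underlying listing format. Purely as an assumption-laden sketch, a per-listing climate-risk module of the kind described above might carry a score per category plus an insurance flag; the field names, labels, and the 1-10 scale are invented here, not First Street’s actual scoring.

```python
from dataclasses import dataclass

# Invented labels for a color-coded scale; First Street's real scoring is
# property-specific and differs in its details.
RISK_LABELS = ["minimal", "minor", "moderate", "major", "severe"]

@dataclass
class RiskModule:
    category: str               # flood, wildfire, wind, heat, or air quality
    score: int                  # assumed 1-10 property-specific risk score
    insurance_required: bool = False

    @property
    def label(self) -> str:
        # Bucket the 1-10 score onto the coarse, color-coded map scale.
        return RISK_LABELS[(self.score - 1) // 2]

listing_risks = [
    RiskModule("flood", score=7, insurance_required=True),
    RiskModule("wildfire", score=2),
    RiskModule("heat", score=5),
]
for risk in listing_risks:
    print(f"{risk.category}: {risk.score}/10 ({risk.label}), "
          f"insurance required: {risk.insurance_required}")
```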

Federal Court Invalidates NYC Law Requiring Food Delivery Apps to Share Customer Data with Restaurants


Article by Hunton Andrews Kurth: “On September 24, 2024, a federal district court held that New York City’s “Customer Data Law” violates the First Amendment. Passed in the summer of 2021, the law requires food-delivery apps to share customer-specific data with restaurants that prepare delivered meals.

The New York City Council enacted the Customer Data Law to boost the local restaurant industry in the wake of the pandemic. The law requires food-delivery apps to provide restaurants (upon the restaurants’ request) with each diner’s full name, email address, phone number, delivery address, and order contents. Customers may opt out of such sharing. The law’s supporters argue that requiring such disclosure addresses exploitation by the delivery apps and helps restaurants advertise more effectively.

Normally, when a customer places an order through a food-delivery app, the app provides the restaurant with the customer’s first name, last initial and food order. Food-delivery apps share aggregate data analytics with restaurants but generally do not share customer-specific data beyond the information necessary to fulfill an order. Some apps, for example, provide restaurants with data related to their menu performance, customer feedback and daily operations.

Major food-delivery app companies challenged the Customer Data Law, arguing that its data sharing requirement compels speech impermissibly under the First Amendment. Siding with the apps, the U.S. District Court for the Southern District of New York declared the city’s law invalid, holding that its data sharing requirement is not appropriately tailored to a substantial government interest…(More)”.