Without appropriate metadata, data-sharing mandates are pointless


Article by Mark A. Musen: “Last month, the US government announced that research articles and most underlying data generated with federal funds should be made publicly available without cost, a policy to be implemented by the end of 2025. That’s atop other important moves. The European Union’s programme for science funding, Horizon Europe, already mandates that almost all data be FAIR (that is, findable, accessible, interoperable and reusable). The motivation behind such data-sharing policies is to make data more accessible so others can use them to both verify results and conduct further analyses.

But just getting those data sets online will not bring anticipated benefits: few data sets will really be FAIR, because most will be unfindable. What’s needed are policies and infrastructure to organize metadata.

Imagine having to search for publications on some topic — say, methods for carbon reclamation — but you could use only the article titles (no keywords, abstracts or search terms). That’s essentially the situation for finding data sets. If I wanted to identify all the deposited data related to carbon reclamation, the task would be futile. Current metadata often contain only administrative and organizational information, such as the name of the investigator and the date when the data were acquired.

What’s more, for scientific data to be useful to other researchers, metadata must sensibly and consistently communicate essentials of the experiments — what was measured, and under what conditions. As an investigator who builds technology to assist with data annotation, I find it frustrating that, in the majority of fields, the metadata standards needed to make data FAIR don’t even exist.

Metadata about data sets typically lack experiment-specific descriptors. If present, they’re sparse and idiosyncratic. An investigator searching the Gene Expression Omnibus (GEO), for example, might seek genomic data sets containing information on how a disease or condition manifests itself in young animals or humans. Performing such a search requires knowledge of how the age of individuals is represented — which, in the GEO repository, could be age, AGE, age (after birth), age (years), Age (yr-old) or dozens of other possibilities. (Often, such information is missing from data sets altogether.) Because the metadata are so ad hoc, automated searches fail, and investigators waste enormous amounts of time manually sifting through records to locate relevant data sets, with no guarantee that most (or any) can be found…(More)”.
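
To make the ad hoc labelling concrete, here is a minimal sketch in Python of what a search tool has to do just to locate an age field across records. Only the field-name variants quoted above come from the article; the record values are made up for illustration.

```python
import re

# Hypothetical records illustrating the field-name variants quoted above,
# with made-up values; real GEO metadata is messier still.
RAW_RECORDS = [
    {"age": "6"},
    {"AGE": "12 months"},
    {"age (after birth)": "3 weeks"},
    {"Age (yr-old)": "2"},
    {"sex": "female"},  # a record with no age descriptor at all
]

def extract_age(record):
    """Return the value of whatever key looks like an age field, else None."""
    for key, value in record.items():
        # Accept 'age', 'AGE', 'age (after birth)', 'Age (yr-old)', etc.
        if re.match(r"\s*age\b", key, flags=re.IGNORECASE):
            return value
    return None

for record in RAW_RECORDS:
    print(extract_age(record))
```

Even this crude pattern matching works only when a curator happened to use the word “age”; values and units remain inconsistent, which is why shared, machine-readable metadata standards matter more than clever search heuristics.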

Spirals of Delusion: How AI Distorts Decision-Making and Makes Dictators More Dangerous


Essay by Henry Farrell, Abraham Newman, and Jeremy Wallace: “In policy circles, discussions about artificial intelligence invariably pit China against the United States in a race for technological supremacy. If the key resource is data, then China, with its billion-plus citizens and lax protections against state surveillance, seems destined to win. Kai-Fu Lee, a famous computer scientist, has claimed that data is the new oil, and China the new OPEC. If superior technology is what provides the edge, however, then the United States, with its world-class university system and talented workforce, still has a chance to come out ahead. For either country, pundits assume that superiority in AI will lead naturally to broader economic and military superiority.

But thinking about AI in terms of a race for dominance misses the more fundamental ways in which AI is transforming global politics. AI will not transform the rivalry between powers so much as it will transform the rivals themselves. The United States is a democracy, whereas China is an authoritarian regime, and machine learning challenges each political system in its own way. The challenges to democracies such as the United States are all too visible. Machine learning may increase polarization—reengineering the online world to promote political division. It will certainly increase disinformation in the future, generating convincing fake speech at scale. The challenges to autocracies are more subtle but possibly more corrosive. Just as machine learning reflects and reinforces the divisions of democracy, it may confound autocracies, creating a false appearance of consensus and concealing underlying societal fissures until it is too late.

Early pioneers of AI, including the political scientist Herbert Simon, realized that AI technology has more in common with markets, bureaucracies, and political institutions than with simple engineering applications. Another pioneer of artificial intelligence, Norbert Wiener, described AI as a “cybernetic” system—one that can respond and adapt to feedback. Neither Simon nor Wiener anticipated how machine learning would dominate AI, but its evolution fits with their way of thinking. Facebook and Google use machine learning as the analytic engine of a self-correcting system, which continually updates its understanding of the data depending on whether its predictions succeed or fail. It is this loop between statistical analysis and feedback from the environment that has made machine learning such a formidable force…(More)”
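
As a rough, purely illustrative sketch of the self-correcting loop the essay describes (a toy model with made-up signals, not how any platform actually implements it), the snippet below predicts, observes feedback from its simulated environment, and updates itself accordingly:

```python
import random

# A toy version of the loop: predict, observe feedback, update the model.
weight = 0.0  # the model's single learned parameter

def predict(signal):
    """Predicted probability that a user engages, given one signal."""
    return min(1.0, max(0.0, 0.5 + weight * signal))

for _ in range(10_000):
    signal = random.uniform(-1.0, 1.0)
    p = predict(signal)
    # Simulated feedback from the environment: in this toy world,
    # engagement genuinely rises with the signal.
    engaged = 1.0 if random.random() < 0.5 + 0.3 * signal else 0.0
    weight += 0.05 * (engaged - p) * signal  # adjust toward the observed outcome

print(f"learned weight: {weight:.2f}")  # drifts toward the true value, ~0.3
```

The point of the toy example is the loop itself: the system’s “understanding” is nothing more than parameters continually corrected by whether its predictions succeed or fail.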

Belfast to launch ‘Citizen Office of Digital Innovation’


Article by Sarah Wray: The City of Belfast in Northern Ireland has launched a tender to develop and pilot a Citizen Office of Digital Innovation (CODI) – a capacity-building programme to boost resident engagement around data and technology.

The council says the pilot will support a ‘digital citizenship skillset’, enabling citizens to better understand and shape how technology is used in Belfast. It could also lead to the creation of tools that can be used and adapted by other cities under a Creative Commons licence.

The tender is seeking creative and interactive methods to explore topics such as co-design, citizen science, the Internet of Things, artificial intelligence and data science, and privacy. It cites examples of citizen-centric programmes elsewhere including Dublin’s Academy of the Near Future and the DTPR standard for visual icons to explain sensors and cameras that are deployed in public spaces…(More)”

Nudging Consumers to Purchase More Sustainably


Article by Erez Yoeli: “Most consumers still don’t choose sustainable products when the option is available. Americans may claim to be willing to pay more for green energy, but while green energy is available in the majority of states — 35 out of 50 states or roughly 80% of American households as of 2018, at least — only 14% of households were even aware of the green option, and less than half of these households purchased it. Hybrids and electric vehicles are available nationwide, but still amount to just 10% of sales — 6.6% and 3.4%, respectively, according to S&P Global’s subscription services.

Now it may be that this virtue thinking-doing gap will eventually close. I hope so. But it will certainly need help, because in these situations there’s often an insidious behavioral dynamic at work that stops stated good intentions from turning into actual good deeds…

Allow me to illustrate what I mean by “the plausible deniability effect” with an example from a now-classic behavioral economics study. Every year, around the holidays, Salvation Army volunteers collect donations for the needy outside supermarkets and other retail outlets. Researchers Justin Rao, Jim Andreoni, and Hanna Trachtmann teamed up with a Boston chapter of the Salvation Army to test ways of increasing donations.

Taking a supermarket that had two exit/entry points, the team randomly divided the volunteers into two groups. In one group, just one volunteer was assigned to stand in front of one door. For the other group, volunteers were stationed at both doors…(More)”.

China May Be Chasing Impossible Dream by Trying to Harness Internet Algorithms


Article by Karen Hao: “China’s powerful cyberspace regulator has taken the first step in a pioneering—and uncertain—government effort to rein in the automated systems that shape the internet.

Earlier this month, the Cyberspace Administration of China published summaries of 30 core algorithms belonging to two dozen of the country’s most influential internet companies, including TikTok owner ByteDance Ltd., e-commerce behemoth Alibaba Group Holding Ltd. and Tencent Holdings Ltd., owner of China’s ubiquitous WeChat super app.

The milestone marks the first systematic effort by a regulator to compel internet companies to reveal information about the technologies powering their platforms, which have shown the capacity to radically alter everything from pop culture to politics. It also puts Beijing on a path that some technology experts say few governments, if any, are equipped to handle….

One important question the effort raises, algorithm experts say, is whether direct government regulation of algorithms is practically possible.

The majority of today’s internet platform algorithms are based on a technology called machine learning, which automates decisions such as ad-targeting by learning to predict user behaviors from vast repositories of data. Unlike traditional algorithms that contain explicit rules coded by engineers, most machine-learning systems are black boxes, making it hard to decipher their logic or anticipate the consequences of their use.
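
The distinction can be illustrated with a deliberately simple, hypothetical contrast: an explicit rule an engineer could be asked to explain, versus a model whose behaviour lives in parameters fitted to (made-up) past data. This is only a sketch of the general idea, not any company’s actual system.

```python
from sklearn.linear_model import LogisticRegression

# Traditional algorithm: the logic is explicit and can be read off the code.
def rule_based_ad_targeting(age, minutes_on_app):
    return age < 30 and minutes_on_app > 60

# Machine-learning version: the behaviour lives in parameters fitted to
# toy historical data, not in rules an engineer wrote down.
X = [[22, 90], [45, 10], [31, 200], [19, 5], [60, 150], [27, 75]]  # age, minutes
y = [1, 0, 1, 0, 1, 1]                                             # clicked?
model = LogisticRegression().fit(X, y)

print(rule_based_ad_targeting(22, 90))  # True, and we can explain exactly why
print(model.predict([[22, 90]])[0])     # a prediction whose "why" sits in the weights
```

Logistic regression is about as transparent as learned models get; the recommendation systems the article describes are far larger and retrained continuously, which is part of why regulators find them so hard to audit.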

Beijing’s interest in regulating algorithms started in 2020, after TikTok sought an American buyer to avoid being banned in the U.S., according to people familiar with the government’s thinking. When several bidders for the short-video platform lost interest after Chinese regulators announced new export controls on information-recommendation technology, it tipped off Beijing to the importance of algorithms, the people said…(More)”.

New Theory for Increasingly Tangled Banks


Essay by Saran Twombly: “Decades before the COVID-19 pandemic demonstrated how rapidly infectious diseases could emerge and spread, the world faced the AIDS epidemic. Initial efforts to halt the contagion were slow as researchers focused on understanding the epidemiology of the virus. It was only by integrating epidemiological theory with behavioral theory that successful interventions began to control the spread of HIV. 

As the current pandemic persists, it is clear that similar applications of interdisciplinary theory are needed to inform decisions, interventions, and policy. Continued infections and the emergence of new variants are the result of complex interactions among evolution, human behavior, and shifting policies across space and over time. Due to this complexity, predictions about the pandemic based on data and statistical models alone—in the absence of any broader conceptual framework—have proven inadequate. Classical epidemiological theory has helped, but alone it has also led to limited success in anticipating surges in COVID-19 infections. Integrating evolutionary theory with data and other theories has revealed more about how and under what conditions new variants arise, improving such predictions.  

AIDS and COVID-19 are examples of complex challenges requiring coordination across families of scientific theories and perspectives. They are, in this sense, typical of many issues facing science and society today—climate change, biodiversity decline, and environmental degradation, to name a few. Such problems occupy interdisciplinary space and arise from no-analog conditions (i.e., situations to which there are no current equivalents), as what were previously only local perturbations trigger global instabilities. As with the pandemic crises, they involve interdependencies and new sources of uncertainty, cross levels of governance, span national boundaries, and include interactions at different temporal and spatial scales. 

Such problems, while impossible to solve from a single perspective, may be successfully addressed by integrating multiple theories. …(More)”.

The Theft of the Commons


Eula Biss at The New Yorker: “…The idea that shared resources are inevitably ruined by people who exploit them is sometimes called the tragedy of the commons. This is not just an attitude that passes for common sense but an economic theory: “The Tragedy of the Commons” was the title of a 1968 essay by the ecologist Garrett Hardin. His essay has been cited so often that it has kept the word commons in use among people who know nothing about the commons. “The tragedy of the commons develops in this way,” Hardin wrote. “Picture a pasture open to all. It is to be expected that each herdsman will try to keep as many cattle as possible on the commons. Such an arrangement may work reasonably satisfactorily for centuries because tribal wars, poaching, and disease keep the numbers of both man and beast well below the carrying capacity of the land. Finally, however, comes the day of reckoning, that is, the day when the long-desired goal of social stability becomes a reality. At this point, the inherent logic of the commons remorselessly generates tragedy.”

Hardin was a white nationalist who subscribed to what is now called “replacement theory.” He believed that the United States needed to restrict nonwhite immigration, because, as he put it, “a multiethnic society is insanity.” In 1974, he published an essay titled “Lifeboat Ethics: The Case Against Helping the Poor,” in which he warned of the dangers of creating a world food bank: “The less provident and less able will multiply at the expense of the abler and more provident,” he wrote, “bringing eventual ruin upon all who share in the commons.”

Hardin was writing long after the commons had been lost to enclosure, and his commons was purely hypothetical. Actual, historical commons weren’t the free-for-all he imagined. In Laxton, villagers who held rights to Westwood Common could keep twenty sheep there, or the equivalent in cows. No one was allowed to keep more animals on the commons in summer than they could support in winter. Common rights were continuously revisited and revised in the course of centuries, as demand rose and fell. In 1662, the court fined a Laxton man “for not felling his part of thistles in the Town Moor.” As E. P. Thompson observed, “Commoners themselves were not without commonsense.”…(More)”.

Why Japan is building smart cities from scratch


Article by Tim Hornyak: “By 2050, nearly 7 out of 10 people in the world will live in cities, up from just over half in 2020. Urbanization is nothing new, but an effort is under way across many high-income countries to make their cities smarter, using data, instrumentation and more efficient resource management. In most of these nations, the vast majority of smart-city projects involve upgrades to existing infrastructure. Japan stands out for its willingness to build smart communities from scratch as it grapples with a rapidly ageing population and a shrinking workforce, meaning that there are fewer people of working age to support older people.

In 2021, the proportion of Japan’s population aged 65 and over hit 29.1%, the highest in the world. By 2036 it will be 33%. Regional cities, especially, face a long, slow economic decline.

As a resource-poor, disaster-prone country, Japan has also had to pursue energy efficiency and resilience following the 2011 Tohoku earthquake and the tsunamis it triggered. The resulting meltdowns at the Fukushima Daiichi nuclear power plant initially encouraged a shift away from nuclear power, which accounted for less than 4% of Japan’s energy use in 2020. However, there are growing calls, led by Japan’s ruling Liberal Democratic Party, for some reactors to be reopened to provide energy security and tackle rising fuel prices…(More)”.

Turning city planning into a game


Article by Brian Owens: “…The digital twins that Eicker’s team builds are powerful modelling tools — but, because they are complex and data-intensive, they are generally used only by experts. That’s something Eicker wants to change. “We want more people to use [these tools] in an easier, more accessible and more playful way,” she says.

So the team harnessed the Unity video-game engine, essentially a software-development workspace that is optimized for quickly and easily building interactive video-game environments, to create Future City Playgrounds. This puts their complex scientific models behind the scenes of a computer game, creating a sort of Minecraft for urban design. “You can change the parameters of your simulation models in a game and send that back to the computational engines and then see what that does for your carbon balance,” she says. “It’s still running pretty serious scientific calculations in the back end, but the user doesn’t see that any more.”
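
The quoted description implies a simple client-server split: the game front end edits parameters, and a back-end engine recomputes the carbon balance. Below is a minimal sketch of that round trip; the endpoint, parameter names, and response field are hypothetical, not the project’s actual API.

```python
import json
import urllib.request

# Hypothetical round trip from the game front end to the simulation engine.
# The URL, parameter names, and response field are placeholders.
def request_carbon_balance(building_id, changes):
    """Send edited building parameters to the back end and return the
    recomputed annual carbon balance (tCO2e per year)."""
    payload = json.dumps({"building_id": building_id, "changes": changes}).encode()
    req = urllib.request.Request(
        "https://example.org/simulate",  # stand-in for the real engine endpoint
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["carbon_balance_tco2e"]

# Example: the player swaps in better-insulated windows and adds rooftop solar.
balance = request_carbon_balance(
    "plateau-block-12",
    {"window_u_value": 0.8, "rooftop_pv_kw": 25},
)
print(f"Updated annual carbon balance: {balance:.1f} tCO2e")
```

Keeping the heavy calculation behind a service like this is what lets the game stay responsive while, as Eicker puts it, “still running pretty serious scientific calculations in the back end.”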

In the game, users can play with a digital version of Montreal: they can shape a single building or cluster of buildings to simulate a neighbourhood retrofit project, click on surfaces or streets to modify them, or design buildings in empty lots to see how changing materials or adding clean-energy systems can affect the neighbourhood’s character, energy use and emissions. The goal of the game is to create the most sustainable building with a budget of $1 million — for example, by adding highly insulating but expensive windows, optimizing the arrangement of rooftop solar panels or using rooftop vegetation to moderate demand for heating and cooling.

A larger web-based version of the project that does not use the game engine allows users to see the effects of city-wide changes — such as how retrofitting 50% of all buildings in Montreal built before 1950 would affect the city’s carbon footprint….(More)”.

U.S. Government Effort to Tap Private Weather Data Moves Along Slowly


Article by Isabelle Bousquette: “The U.S. government’s six-year-old effort to improve its weather forecasting ability by purchasing data from private-sector satellite companies has started to show results, although the process is moving more slowly than anticipated.

After a period of testing, the National Oceanic and Atmospheric Administration, a scientific, service and regulatory arm of the Commerce Department, began purchasing data from two satellite companies, Spire Global Inc. of Vienna, Va., and GeoOptics Inc. of Pasadena, Calif.

The weather data from these two companies fills gaps in coverage left by NOAA’s own satellites, the agency said. NOAA also began testing data from a third company this year.

Beyond these companies, new entrants to the field offering weather data based on a broader range of technologies have been slow to emerge, the agency said.

“We’re getting a subset of what we hoped,” said Dan St. Jean, deputy director of the Office of System Architecture and Advanced Planning at NOAA’s Satellite and Information Service.

NOAA’s weather forecasts help the government formulate hurricane evacuation plans and make other important decisions. The agency began seeking out private sources of satellite weather data in 2016. The idea was to find a more cost-effective alternative to funding NOAA’s own satellite constellations, the agency said. It also hoped to seed competition and innovation in the private satellite sector.

It isn’t yet clear whether there is a cost benefit to using private data, in part because the relatively small number of competitors in the market has made it challenging to determine a steady market price, NOAA said.

“All the signs in the nascent ‘new space’ industry indicated that there would be a plethora of venture capitalists wanting to compete for NOAA’s commercial pilot/purchase dollars. But that just never materialized,” said Mr. St. Jean…(More)”.