A 630-Billion-Word Internet Analysis Shows ‘People’ Is Interpreted as ‘Men’


Dana G. Smith at Scientific American: “A massive linguistic analysis of more than half a trillion words concludes that we assign gender to words that, by their very definition, should be gender-neutral.

Psychologists at New York University analyzed text from nearly three billion Web pages and compared how often words for person (“individual,” “people,” and so on) were associated with terms for a man (“male,” “he”) or a woman (“female,” “she”). They found that male-related words overlapped with “person” more frequently than female words did. The cultural concept of a person, from this perspective, is more often a man than a woman, according to the study, which was published on April 1 in Science Advances.

To conduct the study, the researchers turned to an enormous open-source data set of Web pages called the Common Crawl, which pulls text from everything from corporate white papers to Internet discussion forums. For their analysis of the text—a total of more than 630 billion words—the researchers used word embeddings, a computational linguistic technique that assesses how similar two words are by looking for how often they appear together.

“You can take a word like the word ‘person’ and understand what we mean by ‘person,’ how we represent the word ‘person,’ by looking at the other words that we often use around the word ‘person,’” explains April Bailey, a postdoctoral researcher at N.Y.U., who conducted the study. “We found that there was more overlap between the words for people and words for men than words for people and the words for women…, suggesting that there is this male bias in the concept of a person.”

Scientists have previously studied gender bias in language, such as the idea that women are more closely associated with family and home life and that men are more closely linked with work. “But this is the first to study this really general gender stereotype—the idea that men are sort of the default humans—in this quantitative computational social science way,” says Molly Lewis, a research scientist at the psychology department at Carnegie Mellon University, who was not involved in the study….(More)”.
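The idea of judging word similarity from the company words keep can be illustrated with a toy sketch. It is not the study's pipeline: the researchers worked with word embeddings derived from the Common Crawl, whereas the code below builds raw co-occurrence counts from a tiny, invented and deliberately balanced corpus. The names `TOY_CORPUS`, `cooccurrence_vectors`, `cosine` and `mean_similarity`, as well as the window size, are assumptions made for illustration only.

```python
from collections import Counter, defaultdict
import math

# Hypothetical toy corpus, invented for illustration and balanced on purpose.
TOY_CORPUS = [
    "the person said he was tired",
    "the person said she was tired",
]


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` tokens of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            stop = min(len(tokens), i + window + 1)
            for j in range(start, stop):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(count * v[key] for key, count in u.items() if key in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_similarity(vectors, targets, attributes):
    """Average cosine similarity over every (target, attribute) word pair."""
    pairs = [(t, a) for t in targets for a in attributes]
    return sum(cosine(vectors[t], vectors[a]) for t, a in pairs) / len(pairs)


vectors = cooccurrence_vectors(TOY_CORPUS)
male_score = mean_similarity(vectors, ["person"], ["he"])
female_score = mean_similarity(vectors, ["person"], ["she"])
print(male_score, female_score)
```

Because the toy corpus treats "he" and "she" identically, the two scores come out equal. On real text, a gap between such scores is the kind of signal the researchers describe, although their method and data are far richer than this sketch.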

The rise of the data steward


Article by Sarah Wray: “As data use and collaboration become more advanced, there is a need for a new profession within the public and private sectors, says Stefaan Verhulst, Co-Founder and Chief Research and Development Officer at New York University’s The GovLab. He calls this role the ‘data steward’ and is also seeking to expand existing definitions of the term.

While many cities, government organisations, and private sector companies have chief data officers and chief privacy officers, Verhulst says this new function is broader and necessary as more organisations begin to explore data collaborations which bring together data from various sources to solve problems for the public good.

Many cities, for instance, want to get more value and innovation from the open data they share, and are also increasingly partnering to benefit from private sector data on mobility, spending, and more.

Several examples highlight the challenges, though. There have been disputes about data-sharing and privacy, such as between Uber and the Los Angeles Department of Transportation, while other initiatives have failed to gain traction. Copenhagen’s City Data Exchange facilitated the exchange of public and private data but was disbanded after it struggled to get enough data providers and users on the platform and to become financially sustainable.

Verhulst says that beyond ensuring the security and integrity of data, new skills required by data stewards include the ability to secure partnerships, adequately vet data partners and set up data-sharing agreements, as well as the capacity to steward data-sharing initiatives internally and obtain legal and executive buy-in. Data stewards should also develop financial models for data-sharing to ensure partnerships are sustainable over time.

“That’s quite often ignored,” says Verhulst. “It’s assumed that these things will pay for themselves. Well surprise, surprise, there are costs.”

In addition, there’s an important role for retaining an active focus on insights from data and problems to be solved. Many early open data efforts have taken a ‘build it and they will come’ approach, and usage at scale hasn’t always materialised.

A dynamic regulatory environment is also driving demand for new skills, says Verhulst, noting that the proposed EU Data Act indicates a mandate “to knock on the doors of the private sector [for data] in emergency contexts”.

“The question is: how do you go about that?” Verhulst comments. “Many organisations are going to have to figure this out.”

The GovLab is now running the third cohort of its training for data stewards, and the first focused in the Eastern Hemisphere.

The Developing a Data Reuse Strategy for Public Problems course is part of The GovLab’s Open Data Policy Lab, which is supported by Microsoft…(More)”.

The Russian invasion shows how digital technologies have become involved in all aspects of war


Article by Katharina Niemeyer, Dominique Trudel, Heidi J.S. Tworek, Maria Silina and Svitlana Matviyenko: “Since Russia invaded Ukraine, we keep hearing that this war is like no other; because Ukrainians have cellphones and access to social media platforms, the traditional control of information and propaganda cannot work and people are able to see through the fog of war.

As communications scholars and historians, it is important to add nuance to such claims. The question is not so much what is “new” in this war, but rather to understand its specific media dynamics. One important facet of this war is the interplay between old and new media — the many loops that go from Twitter to television to TikTok, and back and forth.

We have moved away from a relatively static communication model, where journalists report on the news within predetermined constraints and formats, to intense fragmentation and even participation. Information about the war becomes content, and users contribute to its circulation by sharing and commenting online…(More)”.

Researcher Helps Create Big Data ‘Early Alarm’ for Ukraine Abuses


Article by Chris Carroll: “From searing images of civilians targeted by shelling to detailed accounts of sick children and their families fleeing nearby fighting to seek medical care, journalists have created a kaleidoscopic view of the suffering that has engulfed Ukraine since Russia invaded—but the news media can’t be everywhere.

Social media practically can be, however, and a University of Maryland researcher is part of a U.S.-Ukrainian multi-institutional team that’s harvesting data from Twitter and analyzing it with machine-learning algorithms. The result is a real-time system that provides a running account of what people in Ukraine are facing, constructed from their own accounts.

The project, Data for Ukraine, has been running for about three weeks, and has shown itself able to surface important events a few hours ahead of Western or even Ukrainian media sources. It focuses on four areas: humanitarian needs, displaced people, civilian resistance and human rights violations. In addition to simply showing spikes of credible tweets about certain subjects the team is tracking, the system also geolocates tweets—essentially mapping where events take place.

“It’s an early alarm system for human rights abuses,” said Ernesto Calvo, professor of government and politics and director of UMD’s Inter-Disciplinary Lab for Computational Social Science. “For it to work, we need to know two basic things: what is happening or being reported, and who is reporting those things.”

Calvo and his lab focus on the second of those two requirements, and constructed a “community detection” system to identify key nodes of Twitter users from which to use data. Other team members with expertise in Ukrainian society and politics spotted him a list of about 400 verified users who actively tweet on relevant topics. Then Calvo, who honed his approach analyzing social media from political and environmental crises in Latin America, and his team expanded and deepened the collection, drawing on connections and followers of the initial list so that millions of tweets per day now feed the system.

Nearly half of the captured tweets are in Ukrainian, 30% are in English and 20% are in Russian. Knowing who to exclude—accounts started the day before the invasion, for instance, or with few long-term connections—is key, Calvo said…(More)”.
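The two steps described in this excerpt, growing the collection outward from a trusted seed list and then excluding suspicious accounts, can be sketched roughly as follows. This is not the team's "community detection" system: it is a plain one-hop expansion followed by a rule-based filter. The `Account` fields, the handles, the connection threshold and the cutoff date (the day before the invasion of February 24, 2022) are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Account:
    handle: str
    created: date
    long_term_connections: int  # hypothetical measure of established ties


def expand_seed(seed, connections):
    """One-hop expansion: the seed handles plus every handle connected to a seed."""
    expanded = set(seed)
    for handle in seed:
        expanded.update(connections.get(handle, ()))
    return expanded


def keep_account(account, created_before, min_connections):
    """Keep accounts that predate the cutoff and have enough long-term connections."""
    return (
        account.created < created_before
        and account.long_term_connections >= min_connections
    )


# Invented example data; none of these accounts are real.
accounts = {
    "seed_a": Account("seed_a", date(2015, 3, 1), 500),
    "old_b": Account("old_b", date(2018, 6, 10), 120),
    "new_c": Account("new_c", date(2022, 2, 23), 300),   # created the day before the invasion
    "lonely_d": Account("lonely_d", date(2019, 1, 5), 2),  # few long-term connections
}
connections = {"seed_a": ["old_b", "new_c", "lonely_d"]}

candidates = expand_seed(["seed_a"], connections)
kept = sorted(
    handle
    for handle in candidates
    if keep_account(accounts[handle], date(2022, 2, 23), min_connections=10)
)
print(kept)
```

In this sketch the brand-new account and the poorly connected one are dropped, while the seed and the established account remain. A production system would presumably weigh many more signals than these two.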

How Cities Are Using Digital Twins Like a SimCity for Policymakers


Article by Linda Poon: “The entire 40-square-mile metro region of Orlando, Florida, may soon live virtually inside the offices of the Orlando Economic Partnership (OEP). The group has partnered with the gaming company Unity to develop a 3-D model of the area — from its downtown core all the way out to Space Coast on the eastern edge of central Florida — that the city can show off to potential investors in its bid to grow as a tech hub.

“It’ll be a circular room with LED screens kind of 180 degrees,” says OEP President and Chief Executive Officer Tim Giuliani. “Then in the middle, we’re planning the holographic image, where the digital twin of the region will come to life.”

Orlando’s planned showcase is one of the flashier uses of a new technology that’s being lauded as a potential game changer for urban planning. Like a SimCity for policymakers, digital twins allow cities not just to create virtual models, but to run simulations of new policies or infrastructure projects and preview their potential impacts before making a decision in the real world. 

They may also be one of the more tangible opportunities for cities in the race for the so-called metaverse, an immersive network of virtual worlds that some leaders believe to be the future of urban living. Using 3-D mapping and analysis of static and real-time data, municipalities and businesses are increasingly adopting digital twin technology — although many of its potential uses remain aspirational thus far.

Orlando expects to use its digital twin technology for more than virtual tours. It also hopes to preview how different investments, like a transit system upgrade, might affect the built environment and its residents. Several other U.S. cities are building replicas to model traffic congestion strategies and drive net-zero climate goals. Las Vegas, Los Angeles, New York and Phoenix are all building out digital twins to lower building emissions as part of the Clean Cities Clean Future campaign from the software company Cityzenith. Globally, cities from Singapore to Helsinki and Dubai are also investing in the technology, with goals ranging from driving sustainability to promoting virtual tourism. 

The technology could help officials cut operating costs and carbon emissions of new construction, and avoid costly modifications after a project is completed. Amid an ever-looming climate crisis facing urban areas, it could enable cities to test the effectiveness of various measures against rising sea levels and urban heat. By one estimate, digital twins could save cities some $280 billion by 2030….(More)”

Google is using AI to better detect searches from people in crisis


Article by James Vincent: “In a personal crisis, many people turn to an impersonal source of support: Google. Every day, the company fields searches on topics like suicide, sexual assault, and domestic abuse. But Google wants to do more to direct people to the information they need, and says new AI techniques that better parse the complexities of language are helping.

Specifically, Google is integrating its latest machine learning model, MUM, into its search engine to “more accurately detect a wider range of personal crisis searches.” The company unveiled MUM at its I/O conference last year, and has since used it to augment search with features that try to answer questions connected to the original search.

In this case, MUM will be able to spot search queries related to difficult personal situations that earlier search tools could not, says Anne Merritt, a Google product manager for health and information quality.

“MUM is able to help us understand longer or more complex queries like ‘why did he attack me when i said i dont love him,’” Merritt told The Verge. “It may be obvious to humans that this query is about domestic violence, but long, natural-language queries like these are difficult for our systems to understand without advanced AI.”

Other examples of queries that MUM can react to include “most common ways suicide is completed” (a search Merritt says earlier systems “may have previously understood as information seeking”) and “Sydney suicide hot spots” (where, again, earlier responses would have likely returned travel information — ignoring the mention of “suicide” in favor of the more popular query for “hot spots”). When Google detects such crisis searches, it responds with an information box telling users “Help is available,” usually accompanied by a phone number or website for a mental health charity like Samaritans.

In addition to using MUM to respond to personal crises, Google says it’s also using an older AI language model, BERT, to better identify searches looking for explicit content like pornography. By leveraging BERT, Google says it’s “reduced unexpected shocking results by 30%” year-on-year. However, the company was unable to share absolute figures for how many “shocking results” its users come across on average, so while this is a comparative improvement, it gives no indication of how big or small the problem actually is.

Google is keen to tell you that AI is helping the company improve its search products — especially at a time when there’s a building narrative that “Google search is dying.” But integrating this technology comes with its downsides, too.

Many AI experts warn that Google’s increasing use of machine learning language models could surface new problems for the company, like introducing biases and misinformation into search results. AI systems are also opaque, offering engineers restricted insight into how they come to certain conclusions…(More)”.

Public Meetings Thwart Housing Reform Where It Is Needed Most


Interview with Katherine Levine Einstein by Jake Blumgart: “Public engagement can have downsides. Neighborhood participation in the housing permitting process makes existing political inequalities worse, limits housing supply and contributes to the affordability crisis….

In 2019, Katherine Levine Einstein and her co-authors at Boston University produced the first in-depth study of this dynamic, Neighborhood Defenders, providing a unique insight into how hyper-local democracy can produce warped land-use outcomes. Governing talked with her about the politics of delay, what kind of regulations hamper growth and when community meetings can still be an effective means of public feedback.

Governing: What could be wrong with a neighborhood meeting? Isn’t this democracy in its purest form? 

Katherine Levine Einstein: In this book, rather than look at things in their ideal form, we actually evaluated how they are working on the ground. We bring data to the question of whether neighborhood meetings are really providing community voice. One of the reasons that we think of them as this important cornerstone of American democracy is because they are supposedly providing us perspectives that are not widely heard, really amplifying the voices of neighborhood residents.

What we’re able to do in the book is to really bring home the idea that the people who are showing up are not actually representative of their broader communities and they are unrepresentative in really important ways. They’re much more likely to be opposed to new housing, and they’re demographically privileged on a number of dimensions….

What we find happens in practice is that even in less privileged places, these neighborhood meetings are actually amplifying more privileged voices. We study a variety of more disadvantaged places and what the dynamics of these meetings look like. The principles that hold in more affluent communities still play out in these less privileged places. You still hear from voices that are overwhelmingly opposed to new housing. The voices that are heard are much more likely to be homeowners, white and older…(More)”.

Befriending Trees to Lower a City’s Temperature


Peter Wilson at the New York Times: “New York, Denver, Shanghai, Ottawa and Los Angeles have all unveiled Million Tree Initiatives aimed at greatly increasing their urban forests because of the ability of trees to reduce city temperatures, absorb carbon dioxide and soak up excess rainfall.

Central Melbourne, on the other hand, lacks those cities’ financial firepower and is planning to plant a little more than 3,000 trees a year over the next decade. Yet it has gained the interest of other cities by using its extensive data to shore up the community engagement and political commitment required to sustain the decades-long work of building urban forests.

A small municipality covering just 14.5 square miles in the center of the greater Melbourne metropolitan area — which sprawls for 3,860 square miles and houses 5.2 million people in 31 municipalities — the city of Melbourne introduced its online map in 2013.

Called the Urban Forest Visual, the map displayed each of the 80,000 trees in its parks and streets, and showed each tree’s age, species and health. It also gave each tree its own email address so that people could help to monitor them and alert council workers to any specific problems.

That is when the magic happened.

City officials were surprised to see the trees receiving thousands of love letters. They ranged from jaunty greetings — “good luck with the photosynthesis” — to love poems and emotional tributes about how much joy the trees brought to people’s lives….(More)”.

Ukraine shows us the power of the 21st Century Citizen


Essay by Matt Leighninger: “This is a new kind of war, waged by a new kind of citizen.

The failure of the Russian forces to subdue Ukraine quickly has astonished experts, officials, and journalists worldwide. It shouldn’t. The Ukrainian resistance is just the latest example of the new attitudes and abilities of 21st century citizens.

While social media has been getting a lot of attention in this “TikTok War,” the real story is the growing determination and capacity of ordinary people. Around the world, ordinary people are fundamentally different from people of generations past. They have dramatically higher levels of education, far less deference to authority figures, and much greater facility with technology.

These trends have changed citizenship itself. We need to understand this shift so that societies, especially democratic ones, can figure out how to adapt, both in war and peace.

The war in Ukraine is instructive, in at least four ways.

First, citizens now have the ability to make their own media; Ukrainians, under attack, are mass-producing reality TV. Thanks to footage produced by thousands of people and viewed by millions, the war has a constantly unfolding cast of characters. Ukrainian farmers towing Russian vehicles, a soldier moonwalking in a field, people joyriding on a captured Russian tank, and a little girl singing “Let It Go” in a Kiev bomb shelter have become relatable, inspiring figures in the conflict. Seemingly every time Ukrainians have success on the battlefield, they upload videos of burned tanks and downed planes…(More)”.

How Do We End Wars? A Peace Researcher Puts Forward Some Innovative Approaches


Interview by Theodor Schaarschmidt: “Thania Paffenholz is an expert in international relations, based in Switzerland and Kenya, who conducts research on sustainable peace processes and advises institutions such as the United Nations, the European Union and the Organization for Security and Co-operation in Europe (OSCE). She is executive director of Inclusive Peace, a think tank that accompanies peace processes worldwide. Paffenholz talked with Spektrum der Wissenschaft, the German-language edition of Scientific American, about new ways to think about peacekeeping…

It is absurd that the fate of the country is mainly discussed by men older than 60, as is usual in this type of negotiation. Where is the rest of the population? What about women? What about younger people? Do they really want the same things as those in power? How can their perspectives be carried into the peace processes? There are now concepts for inclusive negotiation in which delegations from civil society discuss issues together with the leaders. In Eastern Europe, however, there are only a few examples of this….(More)”.