Who Is Making Sure the A.I. Machines Aren’t Racist?


Cade Metz at the New York Times: “Hundreds of people gathered for the first lecture at what had become the world’s most important conference on artificial intelligence — row after row of faces. Some were East Asian, a few were Indian, and a few were women. But the vast majority were white men. More than 5,500 people attended the meeting, five years ago in Barcelona, Spain.

Timnit Gebru, then a graduate student at Stanford University, remembers counting only six Black people other than herself, all of whom she knew, all of whom were men.

The homogeneous crowd crystallized for her a glaring issue. The big thinkers of tech say A.I. is the future. It will underpin everything from search engines and email to the software that drives our cars, directs the policing of our streets and helps create our vaccines.

But it is being built in a way that replicates the biases of the almost entirely male, predominantly white work force making it.

In the nearly 10 years I’ve written about artificial intelligence, two things have remained a constant: The technology relentlessly improves in fits and sudden, great leaps forward. And bias is a thread that subtly weaves through that work in a way that tech companies are reluctant to acknowledge.

On her first night home in Menlo Park, Calif., after the Barcelona conference, sitting cross-legged on the couch with her laptop, Dr. Gebru described the A.I. work force conundrum in a Facebook post.

“I’m not worried about machines taking over the world. I’m worried about groupthink, insularity and arrogance in the A.I. community — especially with the current hype and demand for people in the field,” she wrote. “The people creating the technology are a big part of the system. If many are actively excluded from its creation, this technology will benefit a few while harming a great many.”

The A.I. community buzzed about the mini-manifesto. Soon after, Dr. Gebru helped create a new organization, Black in A.I. After finishing her Ph.D., she was hired by Google….(More)”.

The Mathematics of How Connections Become Global


Kelsey Houston-Edwards at Scientific American: “When you hit “send” on a text message, it is easy to imagine that the note will travel directly from your phone to your friend’s. In fact, it typically goes on a long journey through a cellular network or the Internet, both of which rely on centralized infrastructure that can be damaged by natural disasters or shut down by repressive governments. For fear of state surveillance or interference, tech-savvy protesters in Hong Kong avoided the Internet by using software such as FireChat and Bridgefy to send messages directly between nearby phones.

These apps let a missive hop silently from one phone to the next, eventually connecting the sender to the receiver—the only users capable of viewing the message. The collections of linked phones, known as mesh networks or mobile ad hoc networks, enable a flexible and decentralized mode of communication. But for any two phones to communicate, they need to be linked via a chain of other phones. How many people scattered throughout Hong Kong need to be connected via the same mesh network before we can be confident that crosstown communication is possible?

Mesh network in action: when cell-phone ranges overlap, a linked chain of connections is established.
Credit: Jen Christiansen (graphic); Wee People font, ProPublica and Alberto Cairo (figure drawings)

A branch of mathematics called percolation theory offers a surprising answer: just a few people can make all the difference. As users join a new network, isolated pockets of connected phones slowly emerge. But full east-to-west or north-to-south communication appears all of a sudden as the density of users passes a critical and sharp threshold. Scientists describe such a rapid change in a network’s connectivity as a phase transition—the same concept used to explain abrupt changes in the state of a material such as the melting of ice or the boiling of water.

A phase transition in a mesh network: the density of users suddenly passes a critical threshold.
Credit: Jen Christiansen (graphic); Wee People font, ProPublica and Alberto Cairo (figure drawings)

Percolation theory examines the consequences of randomly creating or removing links in such networks, which mathematicians conceive of as a collection of nodes (represented by points) linked by “edges” (lines). Each node represents an object such as a phone or a person, and the edges represent a specific relation between two of them. The fundamental insight of percolation theory, which dates back to the 1950s, is that as the number of links in a network gradually increases, a global cluster of connected nodes will suddenly emerge….(More)”.
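The sudden emergence of a spanning cluster described above can be sketched with a toy continuum-percolation simulation. The setup here (phones dropped uniformly in a unit square, a fixed connection radius, and "crosstown communication" meaning a cluster that touches both the left and right edges) is an illustrative assumption, not a detail from the article:

```python
import math
import random

def spans(n, r, rng):
    """Drop n phones uniformly in the unit square; phones within
    distance r of each other can relay messages. Return True if some
    connected cluster touches both the left and right edges."""
    pts = [(rng.random(), rng.random()) for _ in range(n)]
    # Union-find over phones: merge any two phones within range r.
    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(pts[i], pts[j]) < r:
                parent[find(i)] = find(j)
    left = {find(i) for i, p in enumerate(pts) if p[0] < r}
    right = {find(i) for i, p in enumerate(pts) if p[0] > 1 - r}
    return bool(left & right)  # shared cluster touches both edges

def spanning_probability(n, r, trials=20, seed=0):
    """Estimate the chance that n phones form a left-to-right link."""
    rng = random.Random(seed)
    return sum(spans(n, r, rng) for _ in range(trials)) / trials
```

With a connection radius of 0.1, a few dozen phones almost never span the square, while several hundred almost always do; sweeping the density between those extremes reproduces the sharp phase transition the article describes.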

What the drive for open science data can learn from the evolving history of open government data


Stefaan Verhulst, Andrew Young, and Andrew Zahuranec at The Conversation: “Nineteen years ago, a group of international researchers met in Budapest to discuss a persistent problem. While experts published an enormous amount of scientific and scholarly material, few of these works were accessible. New research remained locked behind paywalls run by academic journals. The result was that researchers struggled to learn from one another. They could not build on one another’s findings to achieve new insights. In response to these problems, the group developed the Budapest Open Access Initiative, a declaration calling for free and unrestricted access to scholarly journal literature in all academic fields.

In the years since, open access has become a priority for a growing number of universities, governments, and journals. But while access to scientific literature has increased, access to the scientific data underlying this research remains extremely limited. Researchers can increasingly see what their colleagues are doing but, in an era defined by the replication crisis, they cannot access the data to reproduce the findings or analyze it to produce new findings. In some cases there are good reasons to keep access to the data limited – such as confidentiality or sensitivity concerns – yet in many other cases data hoarding still reigns.

To make scientific research data open to citizens and scientists alike, open science data advocates can learn from open data efforts in other domains. By looking at the evolving history of the open government data movement, scientists can both see the limitations of current approaches and identify ways to move forward from them….(More) (French version)”.

Lessons from all democracies


David Stasavage at Aeon: “Today, many people see democracy as under threat in a way that only a decade ago seemed unimaginable. Following the fall of the Berlin Wall in 1989, it seemed like democracy was the way of the future. But nowadays, the state of democracy looks very different; we hear about ‘backsliding’ and ‘decay’ and other descriptions of a sort of creeping authoritarianism. Some long-established democracies, such as the United States, are witnessing a violation of governmental norms once thought secure, and this has culminated in the recent insurrection at the US Capitol. If democracy is a torch that shines for a time before then burning out – think of Classical Athens and Renaissance city republics – it all feels as if we might be heading toward a new period of darkness. What can we do to reverse this apparent trend and support democracy?

First, we must dispense with the idea that democracy is like a torch that gets passed from one leading society to another. The core feature of democracy – that those who rule can do so only with the consent of the people – wasn’t invented in one place at one time: it evolved independently in a great many human societies.

Over several millennia and across multiple continents, early democracy was an institution in which rulers governed jointly with councils and assemblies of the people. From the Huron (who called themselves the Wendats) and the Iroquois (who called themselves the Haudenosaunee) in the Northeastern Woodlands of North America, to the republics of Ancient India, to examples of city governance in ancient Mesopotamia, these councils and assemblies were common. Classical Greece provided particularly important instances of this democratic practice, and it’s true that the Greeks gave us a language for thinking about democracy, including the word demokratia itself. But they didn’t invent the practice. If we want to better understand the strengths and weaknesses of our modern democracies, then early democratic societies from around the world provide important lessons.

The core feature of early democracy was that the people had power, even if multiparty elections (today, often thought to be a definitive feature of democracy) didn’t happen. The people, or at least some significant fraction of them, exercised this power in many different ways. In some cases, a ruler was chosen by a council or assembly, and was limited to being first among equals. In other instances, a ruler inherited their position, but faced constraints to seek consent from the people before taking actions both large and small. The alternative to early democracy was autocracy, a system where one person ruled on their own via bureaucratic subordinates whom they had recruited and remunerated. The word ‘autocracy’ is a bit of a misnomer here in that no one in this position ever truly ruled on their own, but it does signify a different way of organising political power.

Early democratic governance is clearly apparent in some ancient societies in Mesopotamia as well as in India. It flourished in a number of places in the Americas before European conquest, such as among the Huron and the Iroquois in the Northeastern Woodlands and in the ‘Republic of Tlaxcala’ that abutted the Triple Alliance, more commonly known as the Aztec Empire. It was also common in precolonial Africa. In all of these societies there were several defining features that tended to reinforce early democracy: small scale, a need for rulers to depend on the people for knowledge, and finally the ability of members of society to exit to other locales if they were unhappy with a ruler. These three features were not always present in the same measure, but collectively they helped to underpin early democracy….(More)”

Biden Creates Road Map for Equitable State and Local Data


Daniel Castro at GovTech: “On his first day in office, President Biden issued a flurry of administrative actions to reverse a number of President Trump’s policies and address the ongoing coronavirus pandemic. One of these included an executive order to advance racial equity and provide support for underserved communities. Notably, the order recognizes that achieving this goal will be difficult, if not impossible, without better data. This is a lesson that many state and local governments should take to heart by revisiting their collection policies to ensure data is equitable.

The executive order establishes that it is the policy of the Biden administration to “pursue a comprehensive approach to advancing equity for all, including people of color and others who have been historically underserved, marginalized, and adversely affected by persistent poverty and inequality.” To that end, the order dedicates a section to establishing an interagency working group on equitable data tasked with identifying inadequacies in federal data collection policies and programs, and recommending strategies for addressing any deficiencies.   

An inability to disaggregate data prevents policymakers from identifying disparate impacts of government programs on different populations in a variety of areas including health care, education, criminal justice, workforce and housing. Indeed, the U.S. Commission on Civil Rights has found that “data collection and reporting are essential to effective civil rights enforcement, and that a lack of effective civil rights data collection is problematic.”

This problem has repeatedly been on display throughout the COVID-19 pandemic. For example, at the outset of the pandemic last year, nearly half of states did not report data on race or ethnicity on those who were tested, hospitalized or died of COVID-19. And while the government has tried to take a data-driven response to the COVID-19 pandemic, a lack of data about different groups means that their needs are often hidden from policymakers….(More)”.

The Future of Nudging Will Be Personal


Essay by Stuart Mills: “Nudging, now more than a decade old as an intervention tool, has become something of a poster child for the behavioral sciences. We know that people don’t always act in their own best interest—sometimes spectacularly so—and nudges have emerged as a noncoercive way to live better in a world shaped by our behavioral foibles.

But with nudging’s maturity, we’ve also begun to understand some of the ways that it falls short. Take, for instance, research by Linda Thunström and her colleagues. They found that “successful” nudges can actually harm subgroups of a population. In their research, spendthrifts (those who spend freely) spent less when nudged, bringing them closer to optimal spending. But when given the same nudge, tightwads also spent less, taking them further from the optimal.

While a nudge might appear effective because a population benefited on average, at the individual level the story could be different. Should nudging penalize people who differ from the average just because, on the whole, a policy would benefit the population? Though individual versus population trade-offs are part and parcel of policymaking, as our ability to personalize advances through technology and data, these trade-offs seem less and less appealing….(More)”.

Building Digital Worlds: Where does GIS data come from?


Julie Stoner at Library of Congress: “Whether you’ve used an online map to check traffic conditions, a fitness app to track your jogging route, or found photos tagged by location on social media, many of us rely on geospatial data more and more each day. So what are the most common ways geospatial data is created and stored, and how does it differ from how we have stored geographic information in the past?

A primary method for creating geospatial data is to digitize directly from scanned analog maps. After maps are georeferenced, GIS software allows a data creator to manually digitize boundaries, place points, or define areas using the georeferenced map image as a reference layer. The goal of digitization is to capture information carefully stored in the original map and translate it into a digital format. As an example, let’s explore and then digitize a section of this 1914 Sanborn Fire Insurance Map from Eatonville, Washington.

Sanborn Fire Insurance Map from Eatonville, Pierce County, Washington. Sanborn Map Company, October 1914. Geography & Map Division, Library of Congress.

Sanborn Fire Insurance Maps were created to detail the built environment of American towns and cities through the late 19th and early 20th centuries. The creation of these information-dense maps allowed the Sanborn Fire Insurance Company to underwrite insurance agreements without needing to inspect each building in person. Sanborn maps have become incredibly valuable sources of historic information because of the rich geographic detail they store on each page.

When extracting information from analog maps, the digitizer must decide which features will be digitized and how information about those features will be stored. Behind the geometric features created through the digitization process, a table is utilized to store information about each feature on the map. Using the table, we can store information gleaned from the analog map, such as the name of a road or the purpose of a building. We can also quickly calculate new data, such as the length of a road segment. The data in the table can then be put to work in the visual display of the new digital information that has been created. This is often done through symbolization and map labels….(More)”.
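The digitize-then-attribute workflow described above can be sketched in plain Python. The feature names, attribute fields, and coordinates below are hypothetical, chosen only to show how each digitized geometry pairs with a row in an attribute table; real GIS work would use software such as QGIS or a library such as GeoPandas:

```python
import math

# Each digitized feature pairs a geometry with a row of attributes,
# mirroring the attribute table behind a GIS layer. All names and
# coordinates are invented for illustration.
features = [
    {
        "geometry": {"type": "LineString",
                     "coords": [(0.0, 0.0), (30.0, 40.0), (30.0, 100.0)]},
        "attributes": {"name": "Center St", "feature_type": "road"},
    },
    {
        "geometry": {"type": "Polygon",
                     "coords": [(5.0, 5.0), (25.0, 5.0),
                                (25.0, 20.0), (5.0, 20.0)]},
        "attributes": {"name": "General Store", "feature_type": "building"},
    },
]

def polyline_length(coords):
    """Sum the straight-line lengths of a digitized polyline's segments."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))

# Derive a new attribute from the geometry, as GIS software does on
# demand (e.g., computing the length of a digitized road segment).
for f in features:
    if f["geometry"]["type"] == "LineString":
        f["attributes"]["length"] = polyline_length(f["geometry"]["coords"])
```

The derived `length` attribute (110.0 map units for the invented road above) can then drive symbolization or labeling, just as the article describes.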

How Digital Trust Varies Around the World


Bhaskar Chakravorti, Ajay Bhalla, and Ravi Shankar Chaturvedi at Harvard Business Review: “As economies around the world digitalize rapidly in response to the pandemic, one component that can sometimes get left behind is user trust. What does it take to build out a digital ecosystem that users will feel comfortable actually using? To answer this question, the authors explored four components of digital trust: the security of an economy’s digital environment; the quality of the digital user experience; the extent to which users report trust in their digital environment; and the extent to which users actually use the digital tools available to them. They then used almost 200 indicators to rank 42 global economies on their performance in each of these four metrics, finding a number of interesting trends around how different economies have developed mechanisms for engendering trust, as well as how different types of trust do — or don’t — correspond to other digital development metrics…(More)”.
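The article does not detail how the authors combined their nearly 200 indicators into rankings, but a generic sketch of indicator aggregation — min-max normalizing each pillar across economies, then averaging the pillars — illustrates the kind of composite scoring such indices typically involve. All values and economy names below are invented:

```python
# Hypothetical indicator values per economy (higher = better); these
# are NOT data from the study, just placeholders for the four pillars.
raw = {
    "Economy A": {"security": 0.9, "experience": 0.7,
                  "attitudes": 0.6, "behavior": 0.8},
    "Economy B": {"security": 0.5, "experience": 0.9,
                  "attitudes": 0.8, "behavior": 0.4},
    "Economy C": {"security": 0.3, "experience": 0.4,
                  "attitudes": 0.5, "behavior": 0.6},
}

def minmax_normalize(values):
    """Rescale a pillar's values to [0, 1] across all economies."""
    lo, hi = min(values.values()), max(values.values())
    return {k: (v - lo) / (hi - lo) for k, v in values.items()}

pillars = ["security", "experience", "attitudes", "behavior"]
# Normalize each pillar across economies, then average per economy.
norm = {p: minmax_normalize({e: raw[e][p] for e in raw}) for p in pillars}
scores = {e: sum(norm[p][e] for p in pillars) / len(pillars) for e in raw}
ranking = sorted(scores, key=scores.get, reverse=True)
```

Real indices weight pillars and handle missing data far more carefully; this is only a minimal sketch of the normalize-and-aggregate pattern.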

DNA databases are too white, so genetics doesn’t help everyone. How do we fix that?


Tina Hesman Saey at ScienceNews: “It’s been two decades since the Human Genome Project first unveiled a rough draft of our genetic instruction book. The promise of that medical moon shot was that doctors would soon be able to look at an individual’s DNA and prescribe the right medicines for that person’s illness or even prevent certain diseases.

That promise, known as precision medicine, has yet to be fulfilled in any widespread way. True, researchers are getting clues about some genetic variants linked to certain conditions and some that affect how drugs work in the body. But many of those advances have benefited just one group: people whose ancestral roots stem from Europe. In other words, white people.

Instead of a truly human genome that represents everyone, “what we have is essentially a European genome,” says Constance Hilliard, an evolutionary historian at the University of North Texas in Denton. “That data doesn’t work for anybody apart from people of European ancestry.”

She’s talking about more than the Human Genome Project’s reference genome. That database is just one of many that researchers are using to develop precision medicine strategies. Often those genetic databases draw on data mainly from white participants. But race isn’t the issue. The problem is that collectively, those data add up to a catalog of genetic variants that don’t represent the full range of human genetic diversity.

When people of African, Asian, Native American or Pacific Island ancestry get a DNA test to determine if they inherited a variant that may cause cancer or if a particular drug will work for them, they’re often left with more questions than answers. The results often reveal “variants of uncertain significance,” leaving doctors with too little useful information. This happens less often for people of European descent. That disparity could change if genetics research included a more diverse group of participants, researchers agree (SN: 9/17/16, p. 8).

One solution is to make customized reference genomes for populations whose members die from cancer or heart disease at higher rates than other groups, for example, or who face other worse health outcomes, Hilliard suggests….(More)”.

Machine Learning Shows Social Media Greatly Affects COVID-19 Beliefs


Jessica Kent at HealthITAnalytics: “Using machine learning, researchers found that people’s biases about COVID-19 and its treatments are exacerbated when they read tweets from other users, a study published in JMIR showed.

The analysis also revealed that scientific events, like scientific publications, and non-scientific events, like speeches from politicians, equally influence health belief trends on social media.

The rapid spread of COVID-19 has resulted in an explosion of accurate and inaccurate information related to the pandemic – mainly across social media platforms, researchers noted.

“In the pandemic, social media has contributed to much of the information and misinformation and bias of the public’s attitude toward the disease, treatment and policy,” said corresponding study author Yuan Luo, chief Artificial Intelligence officer at the Institute for Augmented Intelligence in Medicine at Northwestern University Feinberg School of Medicine.

“Our study helps people to realize and re-think the personal decisions that they make when facing the pandemic. The study sends an ‘alert’ to the audience that the information they encounter daily might be right or wrong, and guide them to pick the information endorsed by solid scientific evidence. We also wanted to provide useful insight for scientists or healthcare providers, so that they can more effectively broadcast their voice to targeted audiences.”…(More)”.