We Need to Save Ignorance From AI


Christina Leuker and Wouter van den Bos in Nautilus:  “After the fall of the Berlin Wall, East German citizens were offered the chance to read the files kept on them by the Stasi, the much-feared Communist-era secret police service. To date, it is estimated that only 10 percent have taken the opportunity.

In 2007, James Watson, the co-discoverer of the structure of DNA, asked that he not be given any information about his APOE gene, one allele of which is a known risk factor for Alzheimer’s disease.

Most people tell pollsters that, given the choice, they would prefer not to know the date of their own death—or even the future dates of happy events.

Each of these is an example of willful ignorance. Socrates may have made the case that the unexamined life is not worth living, and Hobbes may have argued that curiosity is mankind’s primary passion, but many of our oldest stories actually describe the dangers of knowing too much. From Adam and Eve and the tree of knowledge to Prometheus stealing the secret of fire, they teach us that real-life decisions need to strike a delicate balance between choosing to know, and choosing not to.

But what if a technology came along that shifted this balance unpredictably, complicating how we make decisions about when to remain ignorant? That technology is here: It’s called artificial intelligence.

AI can find patterns and make inferences using relatively little data. Only a handful of Facebook likes are necessary to predict your personality, race, and gender, for example. Another computer algorithm claims it can distinguish between homosexual and heterosexual men with 81 percent accuracy, and homosexual and heterosexual women with 71 percent accuracy, based on their picture alone. An algorithm named COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) can predict criminal recidivism from data like juvenile arrests, criminal records in the family, education, social isolation, and leisure activities with 65 percent accuracy….

Recently, though, the psychologist Ralph Hertwig and legal scholar Christoph Engel have published an extensive taxonomy of motives for deliberate ignorance. They identified two sets of motives with particular relevance to the need for ignorance in the face of AI.

The first set of motives revolves around impartiality and fairness. Simply put, knowledge can sometimes corrupt judgment, and we often choose to remain deliberately ignorant in response. For example, peer reviews of academic papers are usually anonymous. Insurance companies in most countries are not permitted to know all the details of their clients’ health before they enroll; they only know general risk factors. This type of consideration is particularly relevant to AI, because AI can produce highly prejudicial information….(More)”.

Can crowdsourcing scale fact-checking up, up, up? Probably not, and here’s why


Mevan Babakar at NiemanLab: “We foolishly thought that harnessing the crowd was going to require fewer human resources, when in fact it required, at least at the micro level, more.”….There’s no end to the need for fact-checking, but fact-checking teams are usually small and struggle to keep up with the demand. In recent months, organizations like WikiTribune have suggested crowdsourcing as an attractive, low-cost way that fact-checking could scale.

As the head of automated fact-checking at the U.K.’s independent fact-checking organization Full Fact, I’ve had a lot of time to think about these suggestions, and I don’t believe that crowdsourcing can solve the fact-checking bottleneck. It might even make it worse. But — as two notable attempts, TruthSquad and FactCheckEU, have shown — even if crowdsourcing can’t help scale the core business of fact-checking, it could help streamline activities that take place around it.

Think of crowdsourced fact-checking as including three components: speed (how quickly the task can be done), complexity (how difficult the task is to perform; how much oversight it needs), and coverage (the number of topics or areas that can be covered). You can optimize for (at most) two of these at a time; the third has to be sacrificed.

High-profile examples of crowdsourcing like Wikipedia, Quora, and Stack Overflow harness and gather collective knowledge, and have proven that large crowds can be used in meaningful ways for complex tasks across many topics. But the tradeoff is speed.

Projects like Gender Balance (which asks users to identify the gender of politicians) and Democracy Club Candidates (which crowdsources information about election candidates) have shown that small crowds can have a big effect when it comes to simple tasks, done quickly. But the tradeoff is broad coverage.

At Full Fact, during the 2015 U.K. general election, we had 120 volunteers aid our media monitoring operation. They looked through the entire media output every day and extracted the claims being made. The tradeoff here was that the task wasn’t very complex (it didn’t need oversight, and we only had to do a few spot checks).

But we do have two examples of projects that have operated at both high levels of complexity, within short timeframes, and across broad areas: TruthSquad and FactCheckEU….(More)”.

NZ to perform urgent algorithm ‘stocktake’ fearing data misuse within government


Asha McLean at ZDNet: “The New Zealand government has announced it will be assessing how government agencies are using algorithms to analyse data, hoping to ensure transparency and fairness in decisions that affect citizens.

A joint statement from Minister for Government Digital Services Clare Curran and Minister of Statistics James Shaw said the algorithm “stocktake” will be conducted with urgency, but cites only the growing interest in data analytics as the reason for the probe.

“The government is acutely aware of the need to ensure transparency and accountability as interest grows regarding the challenges and opportunities associated with emerging technology such as artificial intelligence,” Curran said.

It was revealed in April that Immigration New Zealand may have been using citizen data for less-than-desirable purposes: data collected through the country’s visa application process, ostensibly to identify people in breach of their visa conditions, was allegedly being used to filter people based on their age, gender, and ethnicity.

Rejecting the idea that the data-collection project amounted to racial profiling, Immigration Minister Iain Lees-Galloway told Radio New Zealand that Immigration looks at a range of issues, including people who have made multiple visa applications and had them rejected.

“It looks at people who place the greatest burden on the health system, people who place the greatest burden on the criminal justice system, and uses that data to prioritise those people,” he said.

“It is important that we protect the integrity of our immigration system and that we use the resources that immigration has as effectively as we can — I do support them using good data to make good decisions about where best to deploy their resources.”

In the statement on Wednesday, Shaw pointed to two further data-modelling projects the government had embarked on, with one from the Ministry of Health looking into the probability of five-year post-transplant survival in New Zealand.

“Using existing data to help model possible outcomes is an important part of modern government decision-making,” Shaw said….(More)”.

Using Collaborative Crowdsourcing to Give Voice to Diverse Communities


Dennis Di Lorenzo at Campus Technology: “Universities face many critical challenges — student retention, campus safety, curriculum development priorities, alumni engagement and fundraising, and inclusion of diverse populations. In my role as dean of the New York University School of Professional Studies (NYUSPS) for the past four years, and in my prior 20 years of employment in senior-level positions within the school and at NYU, I have become intimately familiar with the complexities and the nuances of such multifaceted challenges.

For the past two years, one of our top priorities at NYUSPS has been striving to address sensitive issues regarding diversity and inclusion….

To identify and address the issues arising from the shifting dynamics we were encountering in our classrooms, my team initially set about gathering feedback from NYUSPS faculty members and students through roundtable discussions. Though many individuals participated, we sensed that some were anxious and unwilling to fully share their experiences. We were able to initiate some productive conversations; however, we found they weren’t getting to the heart of the matter. To provide a sense of anonymity that would allow members of the NYUSPS community to express their concerns more freely, we identified a collaboration tool called POPin and used it to conduct a series of crowdsourcing campaigns, beginning with faculty members and then moving on to students.

Fostering Vital Conversations

Using POPin’s online discussion tool, we were able to scale an intimate and sensitive conversation up to include more than 4,500 students and 2,100 faculty members from a wide variety of countries, cultural and religious backgrounds, gender and sexual identities, economic classes and life stages. Because the tool’s feedback mechanism is both anonymous and interactive, the scope and quality of the conversations increased dramatically….(More)”.

Data Violence and How Bad Engineering Choices Can Damage Society


Blog by Anna Lauren Hoffmann: “…In 2015, a black developer in New York discovered that Google’s algorithmic photo recognition software had tagged pictures of him and his friends as gorillas.

The same year, Facebook auto-suspended Native Americans for using their real names, and in 2016, facial recognition was found to struggle to read black faces.

Software in airport body scanners has flagged transgender bodies as threats for years. In 2017, Google Translate took gender-neutral pronouns in Turkish and converted them to gendered pronouns in English — with startlingly biased results.

“Violence” might seem like a dramatic way to talk about these accidents of engineering and the processes of gathering data and using algorithms to interpret it. Yet just like physical violence in the real world, this kind of “data violence” (a term inspired by Dean Spade’s concept of administrative violence) occurs as the result of choices that implicitly and explicitly lead to harmful or even fatal outcomes.

Those choices are built on assumptions and prejudices about people, intimately weaving them into processes and results that reinforce biases and, worse, make them seem natural or given.

Take the experience of being a woman and having to constantly push back against rigid stereotypes and aggressive objectification.

Writer and novelist Kate Zambreno describes these biases as “ghosts,” a violent haunting of our true reality. “A return to these old roles that we play, that we didn’t even originate. All the ghosts of the past. Ghosts that aren’t even our ghosts.”

Structural bias, Zambreno writes, is reinforced by the stereotypes fed to us in novels, films, and a pervasive cultural narrative that shapes the lives of real women every day. This extends to data and automated systems that now mediate our lives as well. Our viewing and shopping habits, our health and fitness tracking, and our financial information all conspire to create a “data double” of ourselves, produced about us by third parties and standing in for us on data-driven systems and platforms.

These fabrications don’t emerge de novo, disconnected from history or social context. Rather, they often pick up and unwittingly spit out a tangled mess of historical conditions and current realities.

Search engines are a prime example of how data and algorithms can conspire to amplify racist and sexist biases. The academic Safiya Umoja Noble threw these messy entanglements into sharp relief in her book Algorithms of Oppression. Google Search, she explains, has a history of offering up pages of porn for women from particular racial or ethnic groups, and especially black women. Google has also served up ads for criminal background checks alongside search results for African American–sounding names, as former Federal Trade Commission CTO Latanya Sweeney discovered.

“These search engine results for women whose identities are already maligned in the media, such as Black women and girls, only further debase and erode efforts for social, political, and economic recognition and justice,” Noble says.

These kinds of cultural harms go well beyond search results. Sociologist Rena Bivens has shown how the gender categories employed by platforms like Facebook can inflict symbolic violence against transgender and nonbinary users in ways that may never be made obvious to users….(More)”.

Gender is personal – not computational


Foad Hamidi, Morgan Scheuerman and Stacy Branham in the Conversation: “Efforts at automatic gender recognition – using algorithms to guess a person’s gender based on images, video or audio – raise significant social and ethical concerns that are not yet fully explored. Most current research on automatic gender recognition technologies focuses instead on technological details.

Our recent research found that people with diverse gender identities, including those identifying as transgender or gender nonbinary, are particularly concerned that these systems could miscategorize them. People who express their gender differently from stereotypical male and female norms already experience discrimination and harm as a result of being miscategorized or misunderstood. Ideally, technology designers should develop systems to make these problems less common, not more so.

As digital technologies become more powerful and sophisticated, their designers are trying to use them to identify and categorize complex human characteristics, such as sexual orientation, gender and ethnicity. The idea is that with enough training on abundant user data, algorithms can learn to analyze people’s appearance and behavior – and perhaps one day characterize people as well as, or even better than, other humans do.

Gender is a hard topic for people to handle. It’s a complex concept with important roles both as a cultural construct and a core aspect of an individual’s identity. Researchers, scholars and activists are increasingly revealing the diverse, fluid and multifaceted aspects of gender. In the process, they find that ignoring this diversity can lead to both harmful experiences and social injustice. For example, according to the 2016 National Transgender Survey, 47 percent of transgender participants stated that they had experienced some form of discrimination at their workplace due to their gender identity. More than half of transgender people who were harassed, assaulted or expelled because of their gender identity had attempted suicide….(More)”.

Introducing Sourcelist: Promoting diversity in technology policy


Susan Hennessey at Brookings: “…delighted to announce the launch of Sourcelist, a database of experts in technology policy from diverse backgrounds.

Here at Brookings, we built Sourcelist on the principle that technology policymaking stands to benefit from the inclusion of the voices of a broader diversity of people. It aims to help journalists, conference planners, and others to identify and connect with experts outside of their usual sources and panelists. Sourcelist’s purpose is to facilitate more diverse representation by leveraging technology to create a user-friendly resource for people whose decisions can make a difference. We hope that Sourcelist will take away the excuse that diverse experts couldn’t be found to comment on a story or participate on a panel.

Our first database is devoted to Women+. Countless organizations now recognize the institutional barriers that women and underrepresented gender identities face in tech policy. Sourcelist is a resource for those hoping to put recognition into practice.

I want to take the opportunity to personally thank the incredible team at Objectively that took an idea and turned it into the remarkable resource we’re launching today….(More)”.

The global identification challenge: Who are the 1 billion people without proof of identity?


Vyjayanti Desai at the World Bank: “…Using a combination of the self-reported figures from country authorities, birth registration and other proxy data, the 2018 ID4D Global Dataset suggests that as many as 1 billion people struggle to prove who they are. The data also revealed that of the 1 billion people without an official proof of identity:

  • 81% live in Sub-Saharan Africa and South Asia, indicating the need to scale up efforts in these regions;
  • 47% are below the national ID age of their country, highlighting the importance of strengthening birth registration efforts and creating a unique, lifetime identity;
  • 63% live in lower-middle-income economies, while 28% live in low-income economies, reinforcing that lack of identification is a critical concern for the global poor….

In addition, to further strengthen understanding of who the undocumented are and the barriers they face, ID4D partnered with the 2017 Global Findex to gather, for the first time this year, nationally representative survey data from 99 countries on foundational ID coverage, use, and barriers to access. Early findings suggest that residents of low-income countries, particularly women and the poorest 40%, are the most affected by a lack of ID. The survey data (albeit limited in its coverage to people aged 15 and older) confirm that the coverage gap is largest in low-income countries (LICs), where 38% of the surveyed population does not have a foundational ID. Regionally, Sub-Saharan Africa shows the largest coverage gap, where close to one in three people in surveyed countries lack a foundational ID.

Although global gender gaps in foundational ID coverage are relatively small, there is a large gender gap for the unregistered population in low-income countries – where over 45% of women lack a foundational ID, compared to 30% of men. The countries with the greatest gender gaps in foundational ID coverage also tend to be those with legal barriers to women’s access to identity documents….(More)”.

Using Data to Inform the Science of Broadening Participation


Donna K. Ginther at the American Behavioral Scientist: “In this article, I describe how data and econometric methods can be used to study the science of broadening participation. I start by showing that theory can be used to structure the approach to using data to investigate gender and race/ethnicity differences in career outcomes. I also illustrate this process by examining whether women of color who apply for National Institutes of Health research funding are confronted with a double bind where race and gender compound their disadvantage relative to Whites. Although high-quality data are needed for understanding the barriers to broadening participation in science careers, it cannot fully explain why women and underrepresented minorities are less likely to be scientists or have less productive science careers. As researchers, it is important to use all forms of data—quantitative, experimental, and qualitative—to deepen our understanding of the barriers to broadening participation….(More)”.

From Texts to Tweets to Satellites: The Power of Big Data to Fill Gender Data Gaps


At the UN Foundation Blog: “Twitter posts, credit card purchases, phone calls, and satellites are all part of our day-to-day digital landscape.

Detailed data, known broadly as “big data” because of the massive amounts of passively collected and high-frequency information that such interactions generate, are produced every time we use one of these technologies. These digital traces have great potential and have already developed a track record for application in global development and humanitarian response.

Data2X has focused particularly on what big data can tell us about the lives of women and girls in resource-poor settings. Our research, released today in a new report, Big Data and the Well-Being of Women and Girls, demonstrates how four big data sources can be harnessed to fill gender data gaps and inform policy aimed at mitigating global gender inequality. Big data can complement traditional surveys and other data sources, offering a glimpse into dimensions of girls’ and women’s lives that have otherwise been overlooked and providing a level of precision and timeliness that policymakers need to make actionable decisions.

Here are three findings from our report that underscore the power and potential offered by big data to fill gender data gaps:

  1. Social media data can improve understanding of the mental health of girls and women.

Mental health conditions, from anxiety to depression, are thought to be significant contributors to the global burden of disease, particularly for young women, though precise data on mental health is sparse in most countries. However, research by Georgia Tech, commissioned by Data2X, finds that social media provides an accurate barometer of mental health status….

  2. Cell phone and credit card records can illustrate women’s economic and social patterns – and track impacts of shocks in the economy.

Our spending priorities and social habits often indicate economic status, and these activities can also expose economic disparities between women and men.

By compiling cell phone and credit card records, our research partners at MIT traced patterns of women’s expenditures, spending priorities, and physical mobility. The research found that women have less mobility diversity than men, live further away from city centers, and report less total expenditure per capita….

  3. Satellite imagery can map rivers and roads, but it can also measure gender inequality.

Satellite imagery has the power to capture high-resolution, real-time data on everything from natural landscape features, like vegetation and river flows, to human infrastructure, like roads and schools. Research by our partners at the Flowminder Foundation finds that it is also able to measure gender inequality….(More)”.