Discrimination by algorithm: scientists devise test to detect AI bias


 at the Guardian: “There was the voice recognition software that struggled to understand women, the crime prediction algorithm that targeted black neighbourhoods and the online ad platform which was more likely to show men highly paid executive jobs.

Concerns have been growing about AI’s so-called “white guy problem” and now scientists have devised a way to test whether an algorithm is introducing gender or racial biases into decision-making.

Moritz Hardt, a senior research scientist at Google and a co-author of the paper, said: “Decisions based on machine learning can be both incredibly useful and have a profound impact on our lives … Despite the need, a vetted methodology in machine learning for preventing this kind of discrimination based on sensitive attributes has been lacking.”

The paper was one of several on detecting discrimination by algorithms to be presented at the Neural Information Processing Systems (NIPS) conference in Barcelona this month, indicating a growing recognition of the problem.

Nathan Srebro, a computer scientist at the Toyota Technological Institute at Chicago and co-author, said: “We are trying to enforce that you will not have inappropriate bias in the statistical prediction.”

The test is aimed at machine learning programs, which learn to make predictions about the future by crunching through vast quantities of existing data. Since the decision-making criteria are essentially learnt by the computer, rather than being pre-programmed by humans, the exact logic behind decisions is often opaque, even to the scientists who wrote the software….“Our criteria does not look at the innards of the learning algorithm,” said Srebro. “It just looks at the predictions it makes.”

Their approach, called Equality of Opportunity in Supervised Learning, works on the basic principle that when an algorithm makes a decision about an individual – be it to show them an online ad or award them parole – the decision should not reveal anything about the individual’s race or gender beyond what might be gleaned from the data itself.

For instance, if men were on average twice as likely to default on bank loans than women, and if you knew that a particular individual in a dataset had defaulted on a loan, you could reasonably conclude they were more likely (but not certain) to be male.

However, if an algorithm calculated that the most profitable strategy for a lender was to reject all loan applications from men and accept all female applications, the decision would precisely confirm a person’s gender.

“This can be interpreted as inappropriate discrimination,” said Srebro….(More)”.
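
One way to make the idea of judging an algorithm only by its predictions concrete is to compare, for each group, how often people who would in fact repay are approved. The sketch below is an illustration under stated assumptions, not the authors' own code: the function names, the record layout and the toy numbers are all hypothetical. It reads "equality of opportunity" as equal true positive rates across groups.

```python
from collections import defaultdict


def true_positive_rates(records):
    """Per group, the share of people who would repay that were approved.

    records: iterable of (group, would_repay, approved) tuples.
    Only the decisions and outcomes are used, never the model's internals.
    """
    repayers = defaultdict(int)
    approved_repayers = defaultdict(int)
    for group, would_repay, approved in records:
        if would_repay:
            repayers[group] += 1
            if approved:
                approved_repayers[group] += 1
    return {g: approved_repayers[g] / repayers[g] for g in repayers}


def equal_opportunity_gap(records):
    """Largest difference in true positive rate between any two groups."""
    rates = true_positive_rates(records)
    return max(rates.values()) - min(rates.values())


# Hypothetical toy data: a lender that rejects every man and accepts every woman.
blanket_rule = (
    [("men", True, False)] * 80 + [("men", False, False)] * 20
    + [("women", True, True)] * 90 + [("women", False, True)] * 10
)

# Hypothetical toy data: a lender that approves 75% of likely repayers in both groups.
balanced_rule = (
    [("men", True, True)] * 60 + [("men", True, False)] * 20
    + [("men", False, True)] * 5 + [("men", False, False)] * 15
    + [("women", True, True)] * 66 + [("women", True, False)] * 22
    + [("women", False, True)] * 2 + [("women", False, False)] * 10
)

print(equal_opportunity_gap(blanket_rule))   # gap of 1.0: the decision reveals gender
print(equal_opportunity_gap(balanced_rule))  # gap of 0.0: equal chances for repayers
```

A gap near zero means that creditworthy applicants have the same chance of approval whatever their group; the blanket rule in the excerpt sits at the opposite extreme.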

Social Movements and World-System Transformation


Book edited by Jackie Smith, Michael Goodhart, Patrick Manning, and John Markoff: “At a particularly urgent world-historical moment, this volume brings together some of the leading researchers of social movements and global social change and other emerging scholars and practitioners to advance new thinking about social movements and global transformation. Social movements around the world today are responding to crisis by defying both political and epistemological borders, offering alternatives to the global capitalist order that are imperceptible through the modernist lens. Informed by a world-historical perspective, contributors explain today’s struggles as building upon the experiences of the past while also coming together globally in ways that are inspiring innovation and consolidating new thinking about what a fundamentally different, more equitable, just, and sustainable world order might look like.

This collection offers new insights into contemporary movements for global justice, challenging readers to appreciate how modernist thinking both colors our own observations and complicates the work of activists seeking to resolve inequities and contradictions that are deeply embedded in Western cultural traditions and institutions. Contributors consider today’s movements in the longue durée—that is, they ask how Occupy Wall Street, the Arab Spring, and other contemporary struggles for liberation reflect, build upon, or diverge from anti-colonial and other emancipatory struggles of the past. Critical to this volume is its exploration of how divisions over gender equity and diversity of national cultures and class have impacted what are increasingly intersectional global movements. The contributions of feminist and indigenous movements come to the fore in this collective exploration of what the movements of yesterday and today can contribute to our ongoing effort to understand the dynamics of global transformation in order to help advance a more equitable, just, and ecologically sustainable world….(More)”.

What does Big Data mean to public affairs research?


Ines Mergel, R. Karl Rethemeyer, and Kimberley R. Isett at LSE’s The Impact Blog: “…Big Data promises access to vast amounts of real-time information from public and private sources that should allow insights into behavioral preferences, policy options, and methods for public service improvement. In the private sector, marketing preferences can be aligned with customer insights gleaned from Big Data. In the public sector however, government agencies are less responsive and agile in their real-time interactions by design – instead using time for deliberation to respond to broader public goods. The responsiveness Big Data promises is a virtue in the private sector but could be a vice in the public.

Moreover, we raise several important concerns with respect to relying on Big Data as a decision and policymaking tool. While in the abstract Big Data is comprehensive and complete, in practice today’s version of Big Data has several features that should give public sector practitioners and scholars pause. First, most of what we think of as Big Data is really ‘digital exhaust’ – that is, data collected for purposes other than public sector operations or research. Data sets that might be publicly available from social networking sites such as Facebook or Twitter were designed for purely technical reasons. The degree to which this data lines up conceptually and operationally with public sector questions is purely coincidental. Use of digital exhaust for purposes not previously envisioned can go awry. A good example is Google’s attempt to predict the flu based on search terms.

Second, we believe there are ethical issues that may arise when researchers use data that was created as a byproduct of citizens’ interactions with each other or with a government social media account. Citizens are not able to understand or control how their data is used and have not given consent for storage and re-use of their data. We believe that research institutions need to examine their institutional review board processes to help researchers and their subjects understand important privacy issues that may arise. Too often it is possible to infer individual-level insights about private citizens from a combination of data points and thus predict their behaviors or choices.

Lastly, Big Data can only represent those that spend some part of their life online. Yet we know that certain segments of society opt in to life online (by using social media or network-connected devices), opt out (either knowingly or passively), or lack the resources to participate at all. The demography of the internet matters. For instance, researchers tend to use Twitter data because its API allows data collection for research purposes, but many forget that Twitter users are not representative of the overall population. Instead, as a recent Pew Social Media 2016 update shows, only 24% of all online adults use Twitter. Internet participation generally is biased in terms of age, educational attainment, and income – all of which correlate with gender, race, and ethnicity. We believe therefore that predictive insights are potentially biased toward certain parts of the population, making generalisations highly problematic at this time….(More)”

Microsoft Shows Searches Can Boost Early Detection of Lung Cancer


Dina Bass at BloombergTech: “Microsoft Corp. researchers want to give patients and doctors a new tool in the quest to find cancers earlier: web searches.

Lung cancer can be detected a year prior to current methods of diagnosis in more than one-third of cases by analyzing a patient’s internet searches for symptoms and demographic data that put them at higher risk, according to research from Microsoft published Thursday in the journal JAMA Oncology. The study shows it’s possible to use search data to give patients or doctors enough reason to seek cancer screenings earlier, improving the prospects for treatment for lung cancer, which is the leading cause of cancer deaths worldwide.

To train their algorithms, researchers Ryen White and Eric Horvitz scanned anonymous queries in Bing, the company’s search engine. They took searchers who had asked Bing something that indicated a recent lung cancer diagnosis, such as questions about specific treatments or the phrase “I was just diagnosed with lung cancer.”

Then they went back over the user’s previous searches to see if there were other queries that might have indicated the possibility of cancer prior to diagnosis. They looked for searches such as those related to symptoms, including bronchitis, chest pain and blood in sputum. The researchers reviewed other risk factors such as gender, age, race and whether searchers lived in areas with high levels of asbestos and radon, both of which increase the risk of lung cancer. And they looked for indications the user was a smoker, such as people searching for smoking cessation products like Nicorette gum.

How effective this method can be depends on how many false positives — people who don’t end up having cancer but are told they may — you are willing to tolerate, the researchers said. More false positives also mean catching more cases early. With one false positive in 1,000, 39 percent of cases can be caught a year earlier, according to the study. Dropping to one false positive per 100,000 still could allow researchers to catch 3 percent of cases a year earlier, Horvitz said. The company published similar research on pancreatic cancer in June….(More)”
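
The trade-off between false positives and early detection described above can be sketched with a simple score threshold. This is a minimal illustration, assuming each searcher receives a numeric risk score; the function name, the scores and the resulting figures are hypothetical and are not taken from the Microsoft study.

```python
def recall_at_false_positive_budget(case_scores, non_case_scores, one_in):
    """Return (cutoff, recall) when at most one in `one_in` non-cases may be flagged.

    A searcher is flagged when their score is strictly above the cutoff.
    """
    allowed_false_positives = len(non_case_scores) // one_in
    ranked = sorted(non_case_scores, reverse=True)
    # Place the cutoff at the first non-case score we are NOT allowed to flag.
    if allowed_false_positives < len(ranked):
        cutoff = ranked[allowed_false_positives]
    else:
        cutoff = float("-inf")
    caught = sum(1 for score in case_scores if score > cutoff)
    return cutoff, caught / len(case_scores)


# Hypothetical scores: 1,000 people without cancer and 5 who were later diagnosed.
non_case_scores = [i / 1000 for i in range(1000)]
case_scores = [0.9995, 0.995, 0.95, 0.5, 0.2]

for one_in in (1000, 100, 10):
    cutoff, recall = recall_at_false_positive_budget(case_scores, non_case_scores, one_in)
    print(f"one false positive in {one_in}: cutoff {cutoff}, share of cases caught {recall}")
```

On this toy data a stricter false positive budget pushes the cutoff up and catches fewer cases, which is the same direction of trade-off the researchers describe.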

AI Ethics: The Future of Humanity 


Report by sparks & honey: “Through our interaction with machines, we develop emotional, human expectations of them. Alexa, for example, comes alive when we speak with it. AI is and will be a representation of its cultural context, the values and ethics we apply to one another as humans.

This machinery is eerily familiar as it mirrors us, and eventually becomes even smarter than us mere mortals. We’re programming its advantages based on how we see ourselves and the world around us, and we’re doing this at an incredible pace. This shift is pervading culture from our perceptions of beauty and aesthetics to how we interact with one another – and our AI.

Infused with technology, we’re asking: what does it mean to be human?

Our report examines:

• The evolution of our empathy from humans to animals and robots
• How we treat AI in its infancy like we do a child, allowing it space to grow
• The spectrum of our emotional comfort in a world embracing AI
• The cultural contexts fueling AI biases, such as gender stereotypes, that drive the direction of AI
• How we place an innate trust in machines, more than we do one another (Download for free)”

 

Obama Brought Silicon Valley to Washington


Jenna Wortham at The New York Times: “…“Fixing” problems with technology often just creates more problems, largely because technology is never developed in a neutral way: It embodies the values and biases of the people who create it. Crime-predicting software, celebrated when it was introduced in police departments around the country, turned out to reinforce discriminatory policing. Facebook was recently accused of suppressing conservative news from its trending topics. (The company denied a bias, but announced plans to train employees to neutralize political, racial, gender and age biases that could influence what it shows its user base.) Several studies have found that Airbnb has worsened the housing crises in some cities where it operates. In January, a report from the World Bank declared that tech companies were widening income inequality and wealth disparities, not improving them….

None of this was mentioned at South by South Lawn. Instead, speakers heralded the power of the tech community. John Lewis, the congressman and civil rights leader, gave a rousing talk that implored listeners to “get in trouble. Good trouble. Get in the way and make some noise.” Clay Dumas, chief of staff for the Office of Digital Strategy at the White House, told me in an email that the event could be considered part of a legacy to inspire social change and activism through technology. “In his final months in office,” he wrote, “President Obama wants to empower the generation of people that helped launch his candidacy and whose efforts carried him into office.”

…But a few days later, during a speech at Carnegie Mellon, Obama seemed to reckon with his feelings about the potential — and limits — of the tech world. The White House can’t be as freewheeling as a start-up, he said, because “by definition, democracy is messy. And part of government’s job is dealing with problems that nobody else wants to deal with.” But he added that he didn’t want people to become “discouraged and say, ‘I’m just not going to deal with government.’ ” Obama was the first American president to see technology as an engine to improve lives and accelerate society more quickly than any government body could. That lesson was apparent on the lawn. While I still don’t believe that technology is a panacea for society’s problems, I will always appreciate the first president who tried to bring what’s best about Silicon Valley to Washington, even if some of the bad came with it….(More)”

One Crucial Thing Can Help End Violence Against Girls


Eleanor Goldberg at The Huffington Post: “…There are statistics that demonstrate how many girls are in school, for example. But there’s a glaring lack of information on how many of them have dropped out ― and why ― concluded a new study, “Counting the Invisible Girls,” published this month by Plan International.

Why Data On Women And Girls Is Crucial

Without accurate information about the struggles girls face, such as abuse, child marriage, and dropout rates, governments and nonprofit groups can’t develop programs that cater to the specific needs of underserved girls. As a result, struggling girls across the globe have little chance of escaping the problems that prevent them from pursuing an education and becoming economically independent.

“If data used for policy-making is incomplete, we have a real challenge. Current data is not telling the full story,” Emily Courey Pryor, senior director of Data2X, said at the Social Good Summit in New York City last month. Data2X is a U.N.-led group that works with data collectors and policymakers to identify gender data issues and to help bring about solutions.

Plan International released its report to coincide with a number of major recent events….

How Data Helps Improve The Lives Of Women And Girls 

While data isn’t a panacea, it has proven in a number of instances to help marginalized groups.

Until last year, it was legal in Guatemala for a girl to marry at age 14 ― despite the numerous health risks associated with the practice. Young brides are more vulnerable to sexual abuse and more likely to face fatal complications related to pregnancy and childbirth than those who marry later.

To urge lawmakers to raise the minimum age of marriage, Plan International partnered with advocates and civil society groups to launch its “Because I am a Girl” initiative. It analyzed traditional Mayan laws and gathered evidence about the prevalence of child marriage and its impact on children’s lives. The group presented the information before Guatemala’s Congress and in August of last year, the minimum age for marriage was raised to 18.

A number of groups are heeding the call to continue to amass better data.

In May, the Bill and Melinda Gates Foundation pledged $80 million over the next three years to gather robust and reliable data.

In September, U.N. Women announced “Making Every Woman and Girl Count,” a public-private partnership that’s working to tackle the data issue. The program was unveiled at the U.N. General Assembly, and is working with the Gates Foundation, Data2X and a number of world leaders…(More)”

A cautionary tale about humans creating biased AI models


 at TechCrunch: “Most artificial intelligence models are built and trained by humans, and therefore have the potential to learn, perpetuate and massively scale the human trainers’ biases. This is the word of warning put forth in two illuminating articles published earlier this year by Jack Clark at Bloomberg and Kate Crawford at The New York Times.

Tl;dr: The AI field lacks diversity — even more spectacularly than most of our software industry. When an AI practitioner builds a data set on which to train his or her algorithm, it is likely that the data set will only represent one worldview: the practitioner’s. The resulting AI model demonstrates a non-diverse “intelligence” at best, and a biased or even offensive one at worst….

So what happens when you don’t consider carefully who is annotating the data? What happens when you don’t account for the differing preferences, tendencies and biases among varying humans? We ran a fun experiment to find out….Actually, we didn’t set out to run an experiment. We just wanted to create something fun that we thought our awesome tasking community would enjoy. The idea? Give people the chance to rate puppies’ cuteness in their spare time…There was a clear gender gap — a very consistent pattern of women rating the puppies as cuter than the men did. The gap between women’s and men’s ratings was more narrow for the “less-cute” (ouch!) dogs, and wider for the cuter ones. Fascinating.

I won’t even try to unpack the societal implications of these findings, but the lesson here is this: If you’re training an artificial intelligence model — especially one that you want to be able to perform subjective tasks — there are three areas in which you must evaluate and consider demographics and diversity:

  • yourself
  • your data
  • your annotators

This was a simple example: binary gender differences explaining one subjective numeric measure of an image. Yet it was unexpected and significant. As our industry deploys incredibly complex models that are pushing to the limit chip sets, algorithms and scientists, we risk reinforcing subtle biases, powerfully and at a previously unimaginable scale. Even more pernicious, many AIs reinforce their own learning, so we need to carefully consider “supervised” (aka human) re-training over time.

Artificial intelligence promises to change all of our lives — and it already subtly guides the way we shop, date, navigate, invest and more. But to make sure that it does so for the better, all of us practitioners need to go out of our way to be inclusive. We need to remain keenly aware of what makes us all, well… human. Especially the subtle, hidden stuff….(More)”
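
Checking annotators for the kind of gap the puppy experiment surfaced can be as simple as comparing average ratings per item across annotator groups before the labels are used for training. The sketch below is only an illustration: the function names, the record layout and the ratings are hypothetical and are not the article's data.

```python
from statistics import mean


def mean_rating_by_group(ratings):
    """Average score for each (item, annotator_group) pair.

    ratings: iterable of (item, annotator_group, score) tuples.
    """
    grouped = {}
    for item, group, score in ratings:
        grouped.setdefault((item, group), []).append(score)
    return {key: mean(scores) for key, scores in grouped.items()}


def rating_gap(ratings, group_a, group_b):
    """Per item, mean rating from group_a minus mean rating from group_b."""
    means = mean_rating_by_group(ratings)
    items = sorted({item for item, _, _ in ratings})
    return {
        item: means[(item, group_a)] - means[(item, group_b)]
        for item in items
        if (item, group_a) in means and (item, group_b) in means
    }


# Hypothetical ratings on a 1-5 cuteness scale.
toy_ratings = [
    ("puppy_1", "women", 5), ("puppy_1", "women", 4),
    ("puppy_1", "men", 4), ("puppy_1", "men", 3),
    ("puppy_2", "women", 3), ("puppy_2", "women", 3),
    ("puppy_2", "men", 3), ("puppy_2", "men", 2),
]

print(rating_gap(toy_ratings, "women", "men"))
```

A consistent non-zero gap is a signal that the label a model learns depends on who happened to annotate each item, so the mix of annotators deserves the same scrutiny as the data itself.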

The risks of relying on robots for fairer staff recruitment


Sarah O’Connor at the Financial Times: “Robots are not just taking people’s jobs away, they are beginning to hand them out, too. Go to any recruitment industry event and you will find the air is thick with terms like “machine learning”, “big data” and “predictive analytics”.

The argument for using these tools in recruitment is simple. Robo-recruiters can sift through thousands of job candidates far more efficiently than humans. They can also do it more fairly. Since they do not harbour conscious or unconscious human biases, they will recruit a more diverse and meritocratic workforce.

This is a seductive idea but it is also dangerous. Algorithms are not inherently neutral just because they see the world in zeros and ones.

For a start, any machine learning algorithm is only as good as the training data from which it learns. Take the PhD thesis of academic researcher Colin Lee, released to the press this year. He analysed data on the success or failure of 441,769 job applications and built a model that could predict with 70 to 80 per cent accuracy which candidates would be invited to interview. The press release plugged this algorithm as a potential tool to screen a large number of CVs while avoiding “human error and unconscious bias”.

But a model like this would absorb any human biases at work in the original recruitment decisions. For example, the research found that age was the biggest predictor of being invited to interview, with the youngest and the oldest applicants least likely to be successful. You might think it fair enough that inexperienced youngsters do badly, but the routine rejection of older candidates seems like something to investigate rather than codify and perpetuate. Mr Lee acknowledges these problems and suggests it would be better to strip the CVs of attributes such as gender, age and ethnicity before using them….(More)”

The Racist Algorithm?


Anupam Chander in the Michigan Law Review (2017 Forthcoming): “Are we on the verge of an apartheid by algorithm? Will the age of big data lead to decisions that unfairly favor one race over others, or men over women? At the dawn of the Information Age, legal scholars are sounding warnings about the ubiquity of automated algorithms that increasingly govern our lives. In his new book, The Black Box Society: The Hidden Algorithms Behind Money and Information, Frank Pasquale forcefully argues that human beings are increasingly relying on computerized algorithms that make decisions about what information we receive, how much we can borrow, where we go for dinner, or even whom we date. Pasquale’s central claim is that these algorithms will mask invidious discrimination, undermining democracy and worsening inequality. In this review, I rebut this prominent claim. I argue that any fair assessment of algorithms must be made against their alternative. Algorithms are certainly obscure and mysterious, but often no more so than the committees or individuals they replace. The ultimate black box is the human mind. Relying on contemporary theories of unconscious discrimination, I show that the consciously racist or sexist algorithm is less likely than the consciously or unconsciously racist or sexist human decision-maker it replaces. The principal problem of algorithmic discrimination lies elsewhere, in a process I label viral discrimination: algorithms trained or operated on a world pervaded by discriminatory effects are likely to reproduce that discrimination.

I argue that the solution to this problem lies in a kind of algorithmic affirmative action. This would require training algorithms on data that includes diverse communities and continually assessing the results for disparate impacts. Instead of insisting on race or gender neutrality and blindness, this would require decision-makers to approach algorithmic design and assessment in a race and gender conscious manner….(More)”