Unconscious gender bias in the Google algorithm


Interview in Metode with Londa Schiebinger, director of Gendered Innovations: “We were interested because the methods of sex and gender analysis are not in the university curriculum, yet they are very important. The first thing our group did was to develop those methods, and we present twelve methods on the website. We knew it would be very important to create case studies or concrete examples where sex and gender analysis added something new to the research. One of my favorite examples is machine translation. If you look at Google Translate, which is the main one in the United States – SYSTRAN is the main one in Europe – we found that it defaults to the masculine pronoun. So does SYSTRAN. If I put an article about myself into Google Translate, it defaults to «he said» instead of «she said». So, in an article about one of my visits to Spain, it defaults to «he thinks, he says…» and, occasionally, «it wrote». We wondered why this happened, and we found out: because Google Translate works on an algorithm, the problem is that «he said» appears on the web four times more often than «she said», so the machine gets it right if it chooses «he said». The algorithm is simply set up that way. But we also found that there was a huge change in the English language from 1968 to the current time: the proportion of «he said» to «she said» changed from 4-to-1 to 2-to-1. Still, the translation does not take this into account. So we went to Google and we said «Hey, what is going on?» and they said «Oh, wow, we didn’t know, we had no idea!». So what we recognized is that there is an unconscious gender bias in the Google algorithm. They did not intend this at all, so now there are a lot of people trying to fix it….

How can you fix that?

Oh, well, this is the thing! …I think algorithms in general are a problem because if there is any kind of unconscious bias in the data, the algorithm just returns that to you. So even though Google has company policies to support gender equality, there was an unconscious bias in their product that they did not intend. Now that they know about it, they can try to fix it….(More)”
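
Schiebinger’s diagnosis is that the algorithm simply returns the majority pattern in its training data. Below is a minimal sketch of such a frequency-based default rule in Python; the counts are invented to mirror the roughly 4-to-1 ratio she cites, and this is a toy illustration, not Google’s actual system:

```python
# Toy illustration (not Google's actual system): when the target language
# forces a pronoun choice the source left ambiguous, pick whichever form
# is most frequent in the training corpus. Counts below are invented.
corpus_counts = {
    "he said": 4_000_000,   # hypothetical web frequency
    "she said": 1_000_000,  # hypothetical web frequency
}

def choose_form(candidates, counts):
    """Return the candidate phrase with the highest corpus frequency."""
    return max(candidates, key=lambda phrase: counts.get(phrase, 0))

# A gender-ambiguous source sentence always resolves to the majority form,
# regardless of who the text is actually about.
print(choose_form(["he said", "she said"], corpus_counts))  # -> "he said"
```

Such a rule maximizes accuracy on a skewed corpus by construction, which is why the bias never shows up in the accuracy metrics the system is tuned against.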

Big data may be reinforcing racial bias in the criminal justice system


Laurel Eckhouse at the Washington Post: “Big data has expanded to the criminal justice system. In Los Angeles, police use computerized “predictive policing” to anticipate crimes and allocate officers. In Fort Lauderdale, Fla., machine-learning algorithms are used to set bond amounts. In states across the country, data-driven estimates of the risk of recidivism are being used to set jail sentences.

Advocates say these data-driven tools remove human bias from the system, making it more fair as well as more effective. But even as they have become widespread, we have little information about exactly how they work. Few of the organizations producing them have released the data and algorithms they use to determine risk.

We need to know more, because it’s clear that such systems face a fundamental problem: The data they rely on are collected by a criminal justice system in which race makes a big difference in the probability of arrest — even for people who behave identically. Inputs derived from biased policing will inevitably make black and Latino defendants look riskier than white defendants to a computer. As a result, data-driven decision-making risks exacerbating, rather than eliminating, racial bias in criminal justice.

Consider a judge tasked with making a decision about bail for two defendants, one black and one white. Our two defendants have behaved in exactly the same way prior to their arrest: They used drugs in the same amount, committed the same traffic offenses, owned similar homes and took their two children to the same school every morning. But the criminal justice algorithms do not rely on all of a defendant’s prior actions to reach a bail assessment — just those actions for which he or she has been previously arrested and convicted. Because of racial biases in arrest and conviction rates, the black defendant is more likely to have a prior conviction than the white one, despite identical conduct. A risk assessment relying on racially compromised criminal-history data will unfairly rate the black defendant as riskier than the white defendant.

To make matters worse, risk-assessment tools typically evaluate their success in predicting a defendant’s dangerousness on rearrests — not on defendants’ overall behavior after release. If our two defendants return to the same neighborhood and continue their identical lives, the black defendant is more likely to be arrested. Thus, the tool will falsely appear to predict dangerousness effectively, because the entire process is circular: Racial disparities in arrests bias both the predictions and the justification for those predictions.
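
Eckhouse’s circularity point lends itself to a simple simulation. In the hedged sketch below, all numbers are invented: two groups reoffend at exactly the same true rate, but one is policed more heavily, so its members are arrested more often. Measured “recidivism” then differs threefold between groups whose behavior is identical, and any tool scored against rearrest will appear to validate that gap.

```python
# Toy simulation with invented numbers: identical behavior, unequal policing.
import random

random.seed(0)

TRUE_REOFFENSE_RATE = 0.30                         # same for both groups
ARREST_RATE_GIVEN_OFFENSE = {"A": 0.9, "B": 0.3}   # group A is over-policed

def measured_recidivism(group: str, n: int = 100_000) -> float:
    """Fraction of a group that gets *arrested* for reoffending."""
    arrests = 0
    for _ in range(n):
        reoffends = random.random() < TRUE_REOFFENSE_RATE
        if reoffends and random.random() < ARREST_RATE_GIVEN_OFFENSE[group]:
            arrests += 1
    return arrests / n

for group in ("A", "B"):
    print(group, round(measured_recidivism(group), 3))
# Prints roughly 0.27 for A and 0.09 for B: a threefold "recidivism" gap
# produced entirely by policing intensity, not by behavior.
```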

We know that a black person and a white person are not equally likely to be stopped by police: Evidence on New York’s stop-and-frisk policy, investigatory stops, vehicle searches and drug arrests shows that black and Latino civilians are more likely to be stopped, searched and arrested than whites. In 2012, a white attorney spent days trying to get himself arrested in Brooklyn for carrying graffiti stencils and spray paint, a Class B misdemeanor. Even when police saw him tagging the City Hall gateposts, they sped past him, ignoring a crime for which 3,598 people were arrested by the New York Police Department the following year.

Before adopting risk-assessment tools in the judicial decision-making process, jurisdictions should demand that any tool being implemented undergo a thorough and independent peer-review process. We need more transparency and better data to learn whether these risk assessments have disparate impacts on defendants of different races. Foundations and organizations developing risk-assessment tools should be willing to release the data used to build these tools to researchers to evaluate their techniques for internal racial bias and problems of statistical interpretation. Even better, with multiple sources of data, researchers could identify biases in data generated by the criminal justice system before the data is used to make decisions about liberty. Unfortunately, producers of risk-assessment tools — even nonprofit organizations — have not voluntarily released anonymized data and computational details to other researchers, as is now standard in quantitative social science research….(More)”.

How to Do Social Science Without Data


Neil Gross in the New York Times: With the death last month of the sociologist Zygmunt Bauman at age 91, the intellectual world lost a thinker of rare insight and range. Because his style of work was radically different from that of most social scientists in the United States today, his passing is an occasion to consider what might be gained if more members of our profession were to follow his example….

Weber saw bureaucracies as powerful, but dispiritingly impersonal. Mr. Bauman amended this: Bureaucracy can be inhuman. Bureaucratic structures had deadened the moral sense of ordinary German soldiers, he contended, which made the Holocaust possible. They could tell themselves they were just doing their job and following orders.

Later, Mr. Bauman turned his scholarly attention to the postwar and late-20th-century worlds, where the nature and role of all-encompassing institutions were again his focal point. Craving stability after the war, he argued, people had set up such institutions to direct their lives — more benign versions of Weber’s bureaucracy. You could go to work for a company at a young age and know that it would be a sheltering umbrella for you until you retired. Governments kept the peace and helped those who couldn’t help themselves. Marriages were formed through community ties and were expected to last.

But by the end of the century, under pressure from various sources, those institutions were withering. Economically, global trade had expanded, while in Europe and North America manufacturing went into decline; job security vanished. Politically, too, changes were afoot: The Cold War drew to an end, Europe integrated and politicians trimmed back the welfare state. Culturally, consumerism seemed to pervade everything. Mr. Bauman noted major shifts in love and intimacy as well, including a growing belief in the contingency of marriage and — eventually — the popularity of online dating.

In Mr. Bauman’s view, it all connected. He argued we were witnessing a transition from the “solid modernity” of the mid-20th century to the “liquid modernity” of today. Life had become freer, more fluid and a lot more risky. In principle, contemporary workers could change jobs whenever they got bored. They could relocate abroad or reinvent themselves through shopping. They could find new sexual partners with the push of a button. But there was little continuity.

Mr. Bauman considered the implications. Some thrived in this new atmosphere; the institutions and norms previously in place could be stultifying, oppressive. But could a transient work force come together to fight for a more equitable distribution of resources? Could shopping-obsessed consumers return to the task of being responsible, engaged citizens? Could intimate partners motivated by short-term desire ever learn the value of commitment?…(More)”

Facebook introduces a way to help your neighbors after a disaster


Casey Newton at the Verge: “Last year Facebook announced Community Help, a new part of its Safety Check feature designed to connect disaster victims with Facebook users in the area who are offering their help. Now whenever Safety Check is activated, Community Help will let users find or offer food, shelter, transportation, and other forms of assistance. After testing the feature in December, Facebook is beginning to roll it out today in the United States, Canada, India, Saudi Arabia, Australia, and New Zealand.

Facebook says Community Help represents a logical next step for Safety Check, which was first announced in November 2014. Initially, each Safety Check was essentially created manually by Facebook’s team.

In November, the company announced that Safety Check would become more automated. Global crisis reporting agencies send Facebook alerts, which it then attempts to match to user posts in a geographic area. When it finds a spike in user posts, coupled with the alert, Facebook activates Safety Check. The company says employees oversee the process to prevent false positives — something it hasn’t always succeeded at doing.
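
As described, activation combines an external agency alert, a detected spike in local user posts, and employee review. The sketch below is a speculative reconstruction for illustration only; the field names, the spike test, and the thresholds are all assumptions, not Facebook’s actual logic:

```python
# Hypothetical sketch of a Safety Check-style activation rule; field names
# and thresholds are invented for illustration, not Facebook's actual logic.
from dataclasses import dataclass

@dataclass
class CrisisAlert:
    region: str
    source: str  # e.g. a global crisis reporting agency feed

def post_spike(current_rate: float, baseline_rate: float,
               ratio_threshold: float = 5.0) -> bool:
    # Assume a "spike" means the regional posting rate far exceeds baseline.
    return current_rate >= ratio_threshold * baseline_rate

def should_activate(alert: CrisisAlert, current_rate: float,
                    baseline_rate: float, human_approved: bool) -> bool:
    # Agency alert + spike in user posts + employee sign-off, per the
    # description above (human review guards against false positives).
    return (alert is not None
            and post_spike(current_rate, baseline_rate)
            and human_approved)

alert = CrisisAlert(region="example-region", source="crisis-agency-feed")
print(should_activate(alert, current_rate=120.0, baseline_rate=10.0,
                      human_approved=True))  # -> True
```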

In discussions with relief agencies, Facebook says it found that disaster victims were often coming to Facebook in search of help — or to offer some. In some cases, product designer Preethi Chethan says, they were pasting Facebook posts into spreadsheets to help sort them.

Community Help is designed to make post-disaster matchmaking easier. You’ll find it inside Safety Check — go there in the wake of a calamity, and after marking yourself safe you can create a post seeking or offering help. For starters, Community Help will only be available after natural disasters and accidents….(More)”.

It takes more than social media to make a social movement


Hayley Tsukayama in the Washington Post: “President Trump may have used the power of social media to make his way into the White House, but now social media networks are showing that muscle can work for his opposition, too. Last week, more than 1 million marchers went to Washington and cities around the country — sparked by a Facebook post from one woman with no history of activism. This weekend, the Internet exploded again in discussion about Trump’s travel suspension order, and many used social media to get together and protest the decision.

Twitter said that more than 25 million tweets were sent about the order — as compared with 12 million about Trump’s inauguration. Facebook said that its users generated 151 million “likes, posts, comments and shares” related to the ban, less than the 208 million interactions generated about the inauguration. The companies didn’t reveal how many of those were aimed at organizing, but the social media calls to get people to protest are a testament to the power of these platforms to move people.

The real question, however, is whether this burgeoning new movement can avoid the fate of so many others kick-started by the power of social networks — only to find that it’s much harder to make political change than to make a popular hashtag….

Zeynep Tufekci, an associate professor at the University of North Carolina at Chapel Hill who has written a forthcoming book on the power and fragility of movements born of social media, found in her research that the very ability of these movements to scale quickly is, in part, why they also can fall apart so quickly compared with traditional grass-roots campaigns….

Now, organizers can bypass the time it takes to build up the infrastructure for a massive march and all the publicity that comes with it. But that also means their high-profile movements skip some crucial organizing steps.

“Digitally networked movements look like the old movements. But by the time the civil rights movement had such a large march, they’d been working on [the issues] for 10 years — if not more,” Tufekci said. The months or even years spent discussing logistics, leafleting and building a coalition, she said, were crucial to the success of the civil rights movement. Other successful efforts, such as the Human Rights Campaign’s push to end the “don’t ask, don’t tell” policy that barred gay people from serving openly in the military, were also rooted in organizational structures that had been developing and refining their demands for years to present a unified front. Movements organized over social networks often have more trouble jelling, she said, particularly if different factions air their differences on Facebook and Twitter, drawing attention to fractures in a movement….(More).”

The City as a Lab: Open Innovation Meets the Collaborative Economy


Introduction to the Special Issue of California Management Review: “This article introduces the special issue on the increasing role of cities as a driver for (open) innovation and entrepreneurship. It frames the innovation space being cultivated by proactive cities. Drawing on the diverse papers selected in this special issue, this introduction explores a series of tensions that are emerging as innovators and entrepreneurs seek to engage with local governments and citizens in an effort to improve the quality of life and promote local economic growth…Urbanization, the democratization of innovation and technology, and collaboration are converging paradigms helping to drive entrepreneurship and innovation in urban areas around the globe. These three converging factors have been referred to as the urbanpreneur spiral….(More)”

Using GitHub in Government: A Look at a New Collaboration Platform


Justin Longo at the Center for Policy Informatics: “…I became interested in the potential for using GitHub to facilitate collaboration on text documents. This was largely inspired by the 2012 TED Talk by Clay Shirky where he argued that open source programmers could teach us something about how to do open governance:

Somebody put up a tool during the copyright debate last year in the Senate, saying, “It’s strange that Hollywood has more access to Canadian legislators than Canadian citizens do. Why don’t we use GitHub to show them what a citizen-developed bill might look like?” …

For this research, we undertook a census of Canadian government and public servant accounts on GitHub and surveyed those users, supplemented by interviews with key government technology leaders.

This research has now been published in the journal Canadian Public Administration. (If you don’t have access to the full document through the publisher, you can also find it here).

Despite the growing enthusiasm for GitHub (mostly from those familiar with open source software development), and the general rhetoric in favour of collaboration, we suspected that getting GitHub adopted in public sector organizations for text collaboration might be an uphill battle – not least because of the steep learning curve involved in using GitHub, and its inflexibility when used to edit text.

The history of computer-supported collaborative work platforms is littered with really cool interfaces that failed to appeal to users. The experience to date with GitHub in Canadian governments reflects this, as far as our research shows.

We found few government agencies with an active presence on GitHub, compared with government presence on social media in general. And while federal departments and public servants on GitHub are rare, provincial, territorial, First Nations and local governments are even rarer.

Individual accounts held by public servants were found mostly in the federal government, at higher rates than in broader society (see Mapping Collaborative Software). Within this small community, the distribution of contributions per user follows the classic long-tail distribution: a small number of contributors responsible for most of the work, a larger number of contributors doing very little on average, and many users contributing nothing.
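
The “classic long-tail distribution” described here (a few contributors doing most of the work, many contributing little or nothing) can be illustrated with synthetic data. The sketch below generates Pareto-distributed contribution counts; the numbers are simulated, not the study’s:

```python
# Synthetic illustration of a long-tail contribution distribution;
# generated numbers, not the study's data.
import random

random.seed(42)

# Pareto-distributed contribution counts: most users contribute little
# or nothing, while a handful account for most of the activity.
contributions = [int(random.paretovariate(1.2)) - 1 for _ in range(10_000)]

total = sum(contributions)
top_1_percent = sorted(contributions, reverse=True)[: len(contributions) // 100]
share = sum(top_1_percent) / total if total else 0.0

print(f"users with zero contributions: {contributions.count(0)}")
print(f"share of all contributions made by the top 1%: {share:.0%}")
```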

GitHub is still resisted by all but the most technically savvy. With a peculiar terminology and a work model that presuppose familiarity with command-line operations and the language of software coding, GitHub presents many barriers to the novice user. But while it is tempting to dismiss GitHub, as it currently exists, as ill-suited to collaborative document writing, it holds potential as a useful platform for facilitating collaboration in the public sector.

As an example, to help understand how GitHub might be used within governments for collaboration on text documents, we discuss a briefing note document flow in the paper (see the paper for a description of this lovely graphic).

[Figure: briefing note document flow]

A few other findings are addressed in the paper, from why public servants may choose not to collaborate even though they believe it’s the right thing to do, to an interesting story about what propelled the use of GitHub in the government of Canada in the first place….(More)”

Scientists have a word for studying the post-truth world: agnotology


In The Conversation: “But scientists have another word for “post-truth”. You might have heard of epistemology, or the study of knowledge. This field helps define what we know and why we know it. On the flip side of this is agnotology, or the study of ignorance. Agnotology is not often discussed, because studying the absence of something — in this case knowledge — is incredibly difficult.

Doubt is our product

Agnotology is more than the study of what we don’t know; it’s also the study of why we are not supposed to know it. One of its more important aspects is revealing how people, usually powerful ones, use ignorance as a strategic tool to hide or divert attention from societal problems in which they have a vested interest.

A perfect example is the tobacco industry’s dissemination of reports that continuously questioned the link between smoking and cancer. As one tobacco employee famously stated, “Doubt is our product.”

In a similar way, conservative think tanks such as The Heartland Institute work to discredit the science behind human-caused climate change.

Despite the fact that 97% of scientists agree on the anthropogenic causes of climate change, hired “experts” have been able to populate talk shows, news programmes, and the op-ed pages to suggest a lack of credible data or established consensus, even with evidence to the contrary.

These institutes generate pseudo-academic reports to counter scientific results. In this way, they are responsible for promoting ignorance….

Under agnotology 2.0, truth becomes a moot point. It is the sensation that counts. Public media leaders create an impact with whichever arguments they can muster, based on whatever fictional data they can create…Donald Trump entering the White House is the pinnacle of agnotology 2.0. Washington Post journalist Fareed Zakaria has argued that in politics, what matters is no longer the economy but identity; we would like to suggest that the problem runs deeper than that.

The issue is not whether we should search for identity, for fame, or for sensational opinions and entertainment. The overarching issue is the fallen status of our collective search for truth, in its many forms. It is no longer a positive attribute to seek out truth, determine biases, evaluate facts, or share knowledge.

Under agnotology 2.0, scientific thinking itself is under attack. In a post-fact and post-truth era, we could very well become post-science….(More)”.

How statistics lost their power – and why we should fear what comes next


In The Guardian: “In theory, statistics should help settle arguments. They ought to provide stable reference points that everyone – no matter what their politics – can agree on. Yet in recent years, divergent levels of trust in statistics have become one of the key schisms that have opened up in western liberal democracies. Shortly before the November presidential election, a study in the US discovered that 68% of Trump supporters distrusted the economic data published by the federal government. In the UK, a research project by Cambridge University and YouGov looking at conspiracy theories discovered that 55% of the population believes that the government “is hiding the truth about the number of immigrants living here”.

Rather than defusing controversy and polarisation, it seems as if statistics are actually stoking them. Antipathy to statistics has become one of the hallmarks of the populist right, with statisticians and economists chief among the various “experts” that were ostensibly rejected by voters in 2016. Not only are statistics viewed by many as untrustworthy, there appears to be something almost insulting or arrogant about them. Reducing social and economic issues to numerical aggregates and averages seems to violate some people’s sense of political decency.

Nowhere is this more vividly manifest than with immigration. The thinktank British Future has studied how best to win arguments in favour of immigration and multiculturalism. One of its main findings is that people often respond warmly to qualitative evidence, such as the stories of individual migrants and photographs of diverse communities. But statistics – especially regarding alleged benefits of migration to Britain’s economy – elicit quite the opposite reaction. People assume that the numbers are manipulated and dislike the elitism of resorting to quantitative evidence. Presented with official estimates of how many immigrants are in the country illegally, a common response is to scoff. Far from increasing support for immigration, British Future found, pointing to its positive effect on GDP can actually make people more hostile to it. GDP itself has come to seem like a Trojan horse for an elitist liberal agenda. Sensing this, politicians have now largely abandoned discussing immigration in economic terms.

All of this presents a serious challenge for liberal democracy. Put bluntly, the British government – its officials, experts, advisers and many of its politicians – does believe that immigration is on balance good for the economy. The British government did believe that Brexit was the wrong choice. The problem is that the government is now engaged in self-censorship, for fear of provoking people further.

This is an unwelcome dilemma. Either the state continues to make claims that it believes to be valid and is accused by sceptics of propaganda, or else politicians and officials are confined to saying what feels plausible and intuitively true, but may ultimately be inaccurate. Either way, politics becomes mired in accusations of lies and cover-ups.

The declining authority of statistics – and the experts who analyse them – is at the heart of the crisis that has become known as “post-truth” politics. And in this uncertain new world, attitudes towards quantitative expertise have become increasingly divided. From one perspective, grounding politics in statistics is elitist, undemocratic and oblivious to people’s emotional investments in their community and nation. It is just one more way that privileged people in London, Washington DC or Brussels seek to impose their worldview on everybody else. From the opposite perspective, statistics are quite the opposite of elitist. They enable journalists, citizens and politicians to discuss society as a whole, not on the basis of anecdote, sentiment or prejudice, but in ways that can be validated. The alternative to quantitative expertise is less likely to be democracy than an unleashing of tabloid editors and demagogues to provide their own “truth” of what is going on across society.

Is there a way out of this polarisation? Must we simply choose between a politics of facts and one of emotions, or is there another way of looking at this situation? One way is to view statistics through the lens of their history. We need to try to see them for what they are: neither unquestionable truths nor elite conspiracies, but tools designed to simplify the job of government, for better or worse. Viewed historically, we can see what a crucial role statistics have played in our understanding of nation states and their progress. This raises the alarming question of how – if at all – we will continue to have common ideas of society and collective progress, should statistics fall by the wayside….(More).”

DataCollaboratives.org – A New Resource on Creating Public Value by Exchanging Data


Recent years have seen exponential growth in the amount of data being generated and stored around the world. There is increasing recognition that this data can play a key role in solving some of the most difficult public problems we face.

However, much of the potentially useful data is currently privately held and not available for public insights. Data in the form of web clicks, social “likes,” geolocation and online purchases are typically tightly controlled, usually by entities in the private sector. Companies today generate an ever-growing stream of information from our proliferating sensors and devices. Increasingly, they—and various other actors—are asking if there is a way to make this data available for the public good. There is an ongoing search for new models of corporate responsibility around data in the digital era, moving toward the creation of “data collaboratives”.

Today, the GovLab is excited to launch a new resource for Data Collaboratives (datacollaboratives.org). Data Collaboratives are an emerging form of public-private partnership in which participants from different sectors — including private companies, research institutions, and government agencies — exchange data to help solve public problems.

The resource results from different partnerships with UNICEF (focused on creating data collaboratives to improve children’s lives) and Omidyar Network (studying new ways to match (open) data demand and supply to increase impact).

Natalia Adler, a data, research and policy planning specialist and the UNICEF Data Collaboratives Project Lead, notes, “At UNICEF, we’re dealing with the world’s most complex problems affecting children. Data Collaboratives offer an exciting opportunity to tap into previously inaccessible datasets and mobilize a wide range of data expertise to advance child rights around the world. It’s all about connecting the dots.”

To better understand the potential of these Collaboratives, the GovLab collected information on dozens of examples from across the world. These many and diverse initiatives clearly suggest the potential of Data Collaboratives to improve people’s lives when done responsibly. As Stefaan Verhulst, co-founder of the GovLab, puts it: “In the coming months and years, Data Collaboratives will be essential vehicles for harnessing the vast stores of privately held data toward the public good.”

In particular, our research to date suggests that Data Collaboratives offer a number of potential benefits, including enhanced:

  • Situational Awareness and Response: For example, Orbital Insight and the World Bank are using satellite imagery to measure and track poverty. This technology can, in some instances, “be more accurate than U.S. census data.”
  • Public Service Design and Delivery: The global mapping company Esri and Waze’s Connected Citizens program are using crowdsourced traffic information to help governments design better transportation.
  • Impact Assessment and Evaluation: Nielsen and the World Food Program (WFP) have been using data collected via mobile phone surveys to better monitor food insecurity in order to advise the WFP’s resource allocations….(More)