Algorithmic fairness: A code-based primer for public-sector data scientists


Paper by Ken Steif and Sydney Goldstein: “As the number of government algorithms grows, so does the need to evaluate algorithmic fairness. This paper has three goals. First, we ground the notion of algorithmic fairness in the context of disparate impact, arguing that for an algorithm to be fair, its predictions must generalize across different protected groups. Next, two algorithmic use cases are presented with code examples for how to evaluate fairness. Finally, we promote the concept of an open source repository of government algorithmic “scorecards,” allowing stakeholders to compare across algorithms and use cases….(More)”.
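
The disparate impact framing lends itself to a concrete check: do a model’s selection and error rates generalize across protected groups? Below is a minimal Python sketch of such a comparison. It is illustrative only; the function, column names, and the “80% rule” ratio are assumptions for this example, not code from the paper.

```python
import pandas as pd

def fairness_scorecard(df, group_col, label_col, pred_col):
    """Compare a binary classifier's selection and error rates across groups."""
    rows = []
    for group, g in df.groupby(group_col):
        positives = g[g[label_col] == 1]
        negatives = g[g[label_col] == 0]
        rows.append({
            group_col: group,
            "selection_rate": g[pred_col].mean(),                  # share predicted positive
            "false_positive_rate": (negatives[pred_col] == 1).mean(),
            "false_negative_rate": (positives[pred_col] == 0).mean(),
        })
    out = pd.DataFrame(rows).set_index(group_col)
    # Disparate-impact ratio: lowest group selection rate over the highest (the "80% rule").
    out["disparate_impact_ratio"] = out["selection_rate"].min() / out["selection_rate"].max()
    return out

# Hypothetical usage: `preds` holds a protected attribute, true labels, and predictions.
# print(fairness_scorecard(preds, "race", "outcome", "predicted"))
```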

Governance of artificial intelligence and personal health information


Jenifer Sunrise Winter in Digital Policy, Regulation and Governance: “This paper aims to assess the increasing challenges to governing the personal health information (PHI) essential for advancing artificial intelligence (AI) machine learning innovations in health care. Risks to privacy and justice/equity are discussed, along with potential solutions….

This paper argues that these characteristics of machine learning will overwhelm existing data governance approaches such as privacy regulation and informed consent. Enhanced governance techniques and tools will be required to help preserve the autonomy and rights of individuals to control their PHI. Debate among all stakeholders and informed critique of how, and for whom, PHI-fueled health AI are developed and deployed are needed to channel these innovations in societally beneficial directions.

Health data may be used to address pressing societal concerns, such as operational and system-level improvement, and innovations such as personalized medicine. This paper informs work seeking to harness these resources for societal good amidst many competing value claims and substantial risks for privacy and security….(More)”.

Claudette: an automated detector of potentially unfair clauses in online terms of service


Marco Lippi et al. in Artificial Intelligence and Law: “Terms of service of on-line platforms too often contain clauses that are potentially unfair to the consumer. We present an experimental study where machine learning is employed to automatically detect such potentially unfair clauses. Results show that the proposed system could provide a valuable tool for lawyers and consumers alike….(More)”.
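
The paper frames this as a clause-level text classification task. As a rough illustration only (not the authors’ system, and with made-up example clauses and labels), a baseline of this kind can be sketched in a few lines of Python:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy corpus: each clause is labeled 1 (potentially unfair) or 0 (not).
clauses = [
    "We may terminate your account at any time without notice.",
    "You retain ownership of the content you submit.",
    "Any dispute will be resolved exclusively by arbitration of our choosing.",
    "You can delete your account and your data at any time.",
]
labels = [1, 0, 1, 0]

# Clause-level classification: TF-IDF features feeding a linear SVM.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
model.fit(clauses, labels)

print(model.predict(["We reserve the right to change these terms at our sole discretion."]))
```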

Dirty Data, Bad Predictions: How Civil Rights Violations Impact Police Data, Predictive Policing Systems, and Justice


Paper by Rashida Richardson, Jason Schultz, and Kate Crawford: “Law enforcement agencies are increasingly using algorithmic predictive policing systems to forecast criminal activity and allocate police resources. Yet in numerous jurisdictions, these systems are built on data produced within the context of flawed, racially fraught, and sometimes unlawful practices (‘dirty policing’). This can include systemic data manipulation, falsified police reports, unlawful use of force, planted evidence, and unconstitutional searches. These policing practices shape the environment and the methodology by which data is created, which leads to inaccuracies, skews, and forms of systemic bias embedded in the data (‘dirty data’). Predictive policing systems informed by such data cannot escape the legacy of the unlawful or biased policing practices that they are built on. Nor do claims by predictive policing vendors that these systems provide greater objectivity, transparency, or accountability hold up. While some systems offer the ability to see the algorithms used, and occasionally even access to the data itself, there is no evidence to suggest that vendors independently or adequately assess the impact that unlawful and biased policing practices have on their systems, or otherwise assess how broader societal biases may affect their systems.

In our research, we examine the implications of using dirty data with predictive policing, and look at jurisdictions that (1) have utilized predictive policing systems and (2) have done so while under government commission investigations or federal court-monitored settlements, consent decrees, or memoranda of agreement stemming from corrupt, racially biased, or otherwise illegal policing practices. In particular, we examine the link between unlawful and biased police practices and the data used to train or implement these systems across thirteen case studies. We highlight three of these: (1) Chicago, an example where dirty data was ingested directly into the city’s predictive system; (2) New Orleans, an example where the extensive evidence of dirty policing practices suggests an extremely high risk that dirty data was or will be used in any predictive policing application; and (3) Maricopa County, where, despite extensive evidence of dirty policing practices, a lack of transparency and public accountability surrounding predictive policing prevents the public from assessing the risks of dirty data within such systems. The implications of these findings have widespread ramifications for predictive policing writ large. Deploying predictive policing systems in jurisdictions with extensive histories of unlawful police practices presents elevated risks that dirty data will lead to flawed, biased, and unlawful predictions, which in turn risk perpetuating additional harm via feedback loops throughout the criminal justice system. Thus, for any jurisdiction where police have been found to engage in such practices, the use of predictive policing in any context must be treated with skepticism, and mechanisms for the public to examine and reject such systems are imperative….(More)”.

7 things we’ve learned about computer algorithms


Aaron Smith at Pew Research Center: “Algorithms are all around us, using massive stores of data and complex analytics to make decisions with often significant impacts on humans – from choosing the content people see on social media to judging whether a person is a good credit risk or job candidate. Pew Research Center released several reports in 2018 that explored the role and meaning of algorithms in people’s lives today. Here are some of the key themes that emerged from that research.

  1. Algorithmically generated content platforms play a prominent role in Americans’ information diets. Sizable shares of U.S. adults now get news on sites like Facebook or YouTube that use algorithms to curate the content they show to their users. A study by the Center found that 81% of YouTube users say they at least occasionally watch the videos suggested by the platform’s recommendation algorithm, and that these recommendations encourage users to watch progressively longer content as they click through the videos suggested by the site.
  2. The inner workings of even the most common algorithms can be confusing to users. Facebook is among the most popular social media platforms, but roughly half of Facebook users – including six-in-ten users ages 50 and older – say they do not understand how the site’s algorithmically generated news feed selects which posts to show them. And around three-quarters of Facebook users are not aware that the site automatically estimates their interests and preferences based on their online behaviors in order to deliver them targeted advertisements and other content.
  3. The public is wary of computer algorithms being used to make decisions with real-world consequences. The public expresses widespread concern about companies and other institutions using computer algorithms in situations with potential impacts on people’s lives. More than half (56%) of U.S. adults think it is unacceptable to use automated criminal risk scores when evaluating people who are up for parole. And 68% think it is unacceptable for companies to collect large quantities of data about individuals for the purposes of offering them deals or other financial incentives. When asked to elaborate about their worries, many feel that these programs violate people’s privacy, are unfair, or simply will not work as well as decisions made by humans….(More)”.

Decoding Algorithms


Macalester College: “Ada Lovelace probably didn’t foresee the impact of the mathematical formula she published in 1843, now considered the first computer algorithm.

Nor could she have anticipated today’s widespread use of algorithms, in applications as different as the 2016 U.S. presidential campaign and Mac’s first-year seminar registration. “Over the last decade algorithms have become embedded in every aspect of our lives,” says Shilad Sen, professor in Macalester’s Math, Statistics, and Computer Science (MSCS) Department.

How do algorithms shape our society? Why is it important to be aware of them? And for readers who don’t know, what is an algorithm, anyway?…(More)”.

Evaluation and accurate diagnoses of pediatric diseases using artificial intelligence


Paper by Huimin Xia et al. in Nature Medicine: “Artificial intelligence (AI)-based methods have emerged as powerful tools to transform medical care. Although machine learning classifiers (MLCs) have already demonstrated strong performance in image-based diagnoses, analysis of diverse and massive electronic health record (EHR) data remains challenging. Here, we show that MLCs can query EHRs in a manner similar to the hypothetico-deductive reasoning used by physicians and unearth associations that previous statistical methods have not found. Our model applies an automated natural language processing system using deep learning techniques to extract clinically relevant information from EHRs. In total, 101.6 million data points from 1,362,559 pediatric patient visits presenting to a major referral center were analyzed to train and validate the framework.

Our model demonstrates high diagnostic accuracy across multiple organ systems and is comparable to experienced pediatricians in diagnosing common childhood diseases. Our study provides a proof of concept for implementing an AI-based system to aid physicians in tackling large amounts of data, augmenting diagnostic evaluations, and providing clinical decision support in cases of diagnostic uncertainty or complexity. Although this impact may be most evident in areas where healthcare providers are in relative shortage, the benefits of such an AI system are likely to be universal….(More)”.
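
As a loose illustration of the pipeline described (free-text notes converted into clinical features, then fed to a multiclass diagnostic classifier), here is a toy Python sketch. The notes, diagnoses, and the bag-of-words stand-in for the NLP step are all invented for this example and bear no relation to the study’s actual system or data.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy free-text chief complaints and hypothetical diagnoses.
notes = [
    "fever cough wheezing for three days",
    "fever sore throat difficulty swallowing",
    "vomiting diarrhea mild dehydration",
    "cough wheezing shortness of breath at night",
]
diagnoses = ["bronchiolitis", "pharyngitis", "gastroenteritis", "asthma"]

# Stand-in for the NLP step: turn each note into a bag of clinical terms,
# then train a multiclass classifier over those features.
model = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
model.fit(notes, diagnoses)

print(model.predict(["three days of fever and cough with wheezing"]))
```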

This is how AI bias really happens—and why it’s so hard to fix


Karen Hao at MIT Technology Review: “Over the past few months, we’ve documented how the vast majority of AI’s applications today are based on the category of algorithms known as deep learning, and how deep-learning algorithms find patterns in data. We’ve also covered how these technologies affect people’s lives: how they can perpetuate injustice in hiring, retail, and security and may already be doing so in the criminal legal system.

But it’s not enough just to know that this bias exists. If we want to be able to fix it, we need to understand the mechanics of how it arises in the first place.

How AI bias happens

We often shorthand our explanation of AI bias by blaming it on biased training data. The reality is more nuanced: bias can creep in long before the data is collected as well as at many other stages of the deep-learning process. For the purposes of this discussion, we’ll focus on three key stages.

Framing the problem. The first thing computer scientists do when they create a deep-learning model is decide what they actually want it to achieve. A credit card company, for example, might want to predict a customer’s creditworthiness, but “creditworthiness” is a rather nebulous concept. In order to translate it into something that can be computed, the company must decide whether it wants to, say, maximize its profit margins or maximize the number of loans that get repaid. It could then define creditworthiness within the context of that goal. The problem is that “those decisions are made for various business reasons other than fairness or discrimination,” explains Solon Barocas, an assistant professor at Cornell University who specializes in fairness in machine learning. If the algorithm discovered that giving out subprime loans was an effective way to maximize profit, it would end up engaging in predatory behavior even if that wasn’t the company’s intention.
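
A small, hypothetical example of this framing step: the same loan records can be labeled against two different definitions of “creditworthiness,” and the resulting targets disagree exactly where repayment and profit diverge. The data and thresholds below are invented for illustration.

```python
import pandas as pd

# Hypothetical loan records: the raw data is identical, but the target the model
# predicts depends on how the business defines "creditworthiness".
loans = pd.DataFrame({
    "repaid": [1, 1, 0, 1, 0],
    "profit": [120.0, -40.0, 260.0, 15.0, 310.0],  # fees and interest minus losses
})

# Framing A: creditworthy = likely to repay.
loans["target_repayment"] = loans["repaid"]

# Framing B: creditworthy = likely to be profitable (can reward high-fee subprime loans).
loans["target_profit"] = (loans["profit"] > 0).astype(int)

# The two labelings disagree wherever a borrower defaulted but still generated profit,
# so the same data trains models that optimize for very different behavior.
print(loans[["target_repayment", "target_profit"]])
```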

Collecting the data. There are two main ways that bias shows up in training data: either the data you collect is unrepresentative of reality, or it reflects existing prejudices. The first case might occur, for example, if a deep-learning algorithm is fed more photos of light-skinned faces than dark-skinned faces. The resulting face recognition system would inevitably be worse at recognizing darker-skinned faces. The second case is precisely what happened when Amazon discovered that its internal recruiting tool was dismissing female candidates. Because it was trained on historical hiring decisions, which favored men over women, it learned to do the same.
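
One simple check for the first failure mode (unrepresentative data) is to compare group shares in the training set against a reference population. The function and numbers below are made up for the example; real audits would need labeled demographic data and a defensible reference distribution.

```python
import pandas as pd

def representation_gap(train_groups, population_shares):
    """Compare group shares in the training data against a reference population.
    Large negative gaps flag groups the model will see too rarely."""
    train_shares = pd.Series(train_groups).value_counts(normalize=True)
    gaps = {g: train_shares.get(g, 0.0) - share for g, share in population_shares.items()}
    return pd.Series(gaps).sort_values()

# Hypothetical skin-tone annotations for a face dataset vs. an assumed 50/50 target population.
print(representation_gap(
    ["light"] * 800 + ["dark"] * 200,
    {"light": 0.5, "dark": 0.5},
))
```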

Preparing the data. Finally, it is possible to introduce bias during the data preparation stage, which involves selecting which attributes you want the algorithm to consider. (This is not to be confused with the problem-framing stage. You can use the same attributes to train a model for very different goals or use very different attributes to train a model for the same goal.) In the case of modeling creditworthiness, an “attribute” could be the customer’s age, income, or number of paid-off loans. In the case of Amazon’s recruiting tool, an “attribute” could be the candidate’s gender, education level, or years of experience. This is what people often call the “art” of deep learning: choosing which attributes to consider or ignore can significantly influence your model’s prediction accuracy. But while its impact on accuracy is easy to measure, its impact on the model’s bias is not.
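
The sketch below illustrates that last point on synthetic, hypothetical data: swapping one attribute (a zip code that proxies for a protected group) in or out changes both the model’s accuracy and how its positive predictions are distributed across groups. The accuracy change is easy to read off; the fairness change only becomes visible if you also track the protected attribute, which the model itself never sees.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000

# Synthetic applicants: zip code correlates with a protected group the model never sees,
# and the historical hiring outcomes are biased against that group.
group = rng.integers(0, 2, n)
zip_code = (group + rng.random(n) > 0.8).astype(int)
experience = rng.normal(5, 2, n)
hired = ((experience > 5) & (group == 0)).astype(int)

X_full = pd.DataFrame({"experience": experience, "zip_code": zip_code})
X_slim = X_full[["experience"]]

for name, X in [("with zip_code", X_full), ("experience only", X_slim)]:
    model = LogisticRegression(max_iter=1000).fit(X, hired)
    preds = model.predict(X)
    selection = pd.Series(preds).groupby(group).mean()  # positive-prediction rate per group
    print(name, "| accuracy:", round(model.score(X, hired), 3),
          "| selection rate by group:", selection.round(3).to_dict())
```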

Why AI bias is hard to fix

Given that context, some of the challenges of mitigating bias may already be apparent to you. Here we highlight four main ones….(More)”.

Artificial Intelligence and National Security


Report by Congressional Research Service: “Artificial intelligence (AI) is a rapidly growing field of technology with potentially significant implications for national security. As such, the U.S. Department of Defense (DOD) and other nations are developing AI applications for a range of military functions. AI research is underway in the fields of intelligence collection and analysis, logistics, cyber operations, information operations, command and control, and in a variety of semi-autonomous and autonomous vehicles.

Already, AI has been incorporated into military operations in Iraq and Syria. Congressional action has the potential to shape the technology’s development further, with budgetary and legislative decisions influencing the growth of military applications as well as the pace of their adoption.

AI technologies present unique challenges for military integration, particularly because the bulk of AI development is happening in the commercial sector. Although AI is not unique in this regard, the defense acquisition process may need to be adapted for acquiring emerging technologies like AI.

In addition, many commercial AI applications must undergo significant modification prior to being functional for the military. A number of cultural issues also challenge AI acquisition, as some commercial AI companies are averse to partnering with DOD due to ethical concerns, and even within the department, there can be resistance to incorporating AI technology into existing weapons systems and processes.

Potential international rivals in the AI market are creating pressure for the United States to compete for innovative military AI applications. China is a leading competitor in this regard, releasing a plan in 2017 to capture the global lead in AI development by 2030. Currently, China is primarily focused on using AI to make faster and better-informed decisions, as well as on developing a variety of autonomous military vehicles. Russia is also active in military AI development, with a primary focus on robotics. Although AI has the potential to impart a number of advantages in the military context, it may also introduce distinct challenges.

AI technology could, for example, facilitate autonomous operations, lead to more informed military decisionmaking, and increase the speed and scale of military action. However, it may also be unpredictable or vulnerable to unique forms of manipulation. As a result of these factors, analysts hold a broad range of opinions on how influential AI will be in future combat operations.

While a small number of analysts believe that the technology will have minimal impact, most believe that AI will have at least an evolutionary—if not revolutionary—effect….(More)”.

Mapping the challenges and opportunities of artificial intelligence for the conduct of diplomacy


DiploFoundation: “This report provides an overview of the evolution of diplomacy in the context of artificial intelligence (AI). AI has emerged as a very hot topic on the international agenda, impacting numerous aspects of our political, social, and economic lives. It is clear that AI will remain a permanent feature of international debates and will continue to shape societies and international relations.

It is impossible to ignore the challenges – and opportunities – AI is bringing to the diplomatic realm. Its relevance as a topic for diplomats and others working in international relations will only increase….(More)”.