AI ethics groups are repeating one of society’s classic mistakes


Article by Abhishek Gupta and Victoria Heath: “International organizations and corporations are racing to develop global guidelines for the ethical use of artificial intelligence. Declarations, manifestos, and recommendations are flooding the internet. But these efforts will be futile if they fail to account for the cultural and regional contexts in which AI operates.

AI systems have repeatedly been shown to cause problems that disproportionately affect marginalized groups while benefiting a privileged few. The global AI ethics efforts under way today—of which there are dozens—aim to help everyone benefit from this technology, and to prevent it from causing harm. Generally speaking, they do this by creating guidelines and principles for developers, funders, and regulators to follow. They might, for example, recommend routine internal audits or require protections for users’ personally identifiable information.

We believe these groups are well-intentioned and are doing worthwhile work. The AI community should, indeed, agree on a set of international definitions and concepts for ethical AI. But without more geographic representation, they’ll produce a global vision for AI ethics that reflects the perspectives of people in only a few regions of the world, particularly North America and northwestern Europe.

This work is not easy or straightforward. “Fairness,” “privacy,” and “bias” mean different things in different places. People also have disparate expectations of these concepts depending on their own political, social, and economic realities. The challenges and risks posed by AI also differ depending on one’s locale.

If organizations working on global AI ethics fail to acknowledge this, they risk developing standards that are, at best, meaningless and ineffective across all the world’s regions. At worst, these flawed standards will lead to more AI systems and tools that perpetuate existing biases and are insensitive to local cultures….(More)”.

AI Governance through Political Fora and Standards Developing Organizations


Report by Philippe Lorenz: “Shaping international norms around the ethics of Artificial Intelligence (AI) is perceived as a new responsibility by foreign policy makers. This responsibility is accompanied by a desire to play an active role in the most important international fora. Given the limited resources in terms of time and budget, foreign ministries need to set priorities for their involvement in the governance of AI. First and foremost, this requires an understanding of the entire AI governance landscape and the actors involved. The intention of this paper is to take a step back and familiarize foreign policy makers with the internal structures of the individual AI governance initiatives and the relationships between the involved actors. A basic understanding of the landscape also makes it easier to classify thematic developments and emerging actors, their agendas, and strategies.

This paper provides foreign policy practitioners with a mapping that can serve as a compass to navigate the complex web of stakeholders that shape the international debate on AI ethics. It plots political fora that serve as a platform for actors to agree upon ethical principles and pursue binding regulation. The mapping supplements the political purview with key actors who create technical standards on the ethics of AI. Furthermore, it describes the dynamic relationships between actors from these two domains. International governance addresses AI ethics through two different dimensions: political fora and Standards Developing Organizations (SDOs). Although it may be tempting to only engage on the diplomatic stage, this would be insufficient to help shape AI policy. Foreign policy makers must tend to both dimensions. While both governance worlds share the same topics and themes (in this case, AI ethics), they differ in their stakeholders, goals, outputs, and reach.

Key political and economic organizations such as the United Nations (UN), the Organisation for Economic Co-operation and Development (OECD), and the European Commission (EC) address ethical concerns raised by AI technologies. But so do SDOs such as the International Organization for Standardization (ISO), the International Electrotechnical Commission (IEC), and the IEEE Standards Association (IEEE SA). Although actors from the latter category are typically concerned with the development of standards that address terminology, ontology, and technical benchmarks that facilitate product interoperability and market access, they, too, address AI ethics.

But these discussions on AI ethics will be useless if they do not inform the development of concrete policies for how to govern the technology.

At international political fora, on the one hand, states shape the outputs, which are often limited to non-binding, soft AI principles. SDOs, on the other hand, tend to the private sector. They are characterized by consensus-based decision-making processes that facilitate the adoption of industry standards. These fora are generally not accessible to (foreign) policy makers, either because they cater exclusively to the private sector and bar policy makers from joining, or because active participation requires in-depth technical expertise as well as industry knowledge which may surpass diplomats’ skill sets. Nonetheless, as prominent standard setting bodies such as ISO, IEC, and IEEE SA pursue industry standards in AI ethics, foreign policy makers need to take notice, as this will likely have consequences for their negotiations at international political fora.

The precondition for active engagement is to gain an overview of the AI Governance environment. Foreign policy practitioners need to understand the landscape of stakeholders, identify key actors, and start to strategically engage with questions relevant to AI governance. This is necessary to determine whether a given initiative on AI ethics is aligned with one’s own foreign policy goals and, therefore, worth engaging with. It is also helpful to assess industry dynamics that might affect geo-economic deliberations. Lastly, all of this is vital information to report back to government headquarters to inform policy making, as AI policy is a matter of domestic and foreign policy….(More)”.

The Oxford Handbook of Ethics of AI


Book edited by Markus D. Dubber, Frank Pasquale, and Sunit Das: “This volume tackles a quickly-evolving field of inquiry, mapping the existing discourse as part of a general attempt to place current developments in historical context; at the same time, breaking new ground in taking on novel subjects and pursuing fresh approaches.

The term “A.I.” is used to refer to a broad range of phenomena, from machine learning and data mining to artificial general intelligence. The recent advent of more sophisticated AI systems, which function with partial or full autonomy and are capable of tasks which require learning and ‘intelligence’, presents difficult ethical questions, and has drawn concerns from many quarters about individual and societal welfare, democratic decision-making, moral agency, and the prevention of harm. This work ranges from explorations of normative constraints on specific applications of machine learning algorithms today (in everyday medical practice, for instance) to reflections on the (potential) status of AI as a form of consciousness with attendant rights and duties and, more generally still, on the conceptual terms and frameworks necessary to understand tasks requiring intelligence, whether “human” or “A.I.”…(More)”.

The Broken Algorithm That Poisoned American Transportation


Aaron Gordon at Vice: “…The Louisville highway project is hardly the first time travel demand models have missed the mark. Despite them being a legally required portion of any transportation infrastructure project that gets federal dollars, it is one of urban planning’s worst kept secrets that these models are error-prone at best and fundamentally flawed at worst.

Recently, I asked Renn how important those initial, rosy traffic forecasts of double-digit growth were to the boondoggle actually getting built.

“I think it was very important,” Renn said. “Because I don’t believe they could have gotten approval to build the project if they had not had traffic forecasts that said traffic across the river is going to increase substantially. If there isn’t going to be an increase in traffic, how do you justify building two bridges?”

Travel demand models come in different shapes and sizes. They can cover entire metro regions spanning across state lines or tackle a small stretch of a suburban roadway. And they have gotten more complicated over time. But they are rooted in what’s called the Four Step process, a rough approximation of how humans make decisions about getting from A to B. At the end, the model spits out numbers estimating how many trips there will be along certain routes.

As befits its name, the model goes through four steps in order to arrive at that number. First, it generates a kind of algorithmic map based on expected land use patterns (businesses will generate more trips than homes) and socio-economic factors (for example, high rates of employment will generate more trips than lower ones). Then it will estimate where people will generally be coming from and going to. The third step is to guess how they will get there, and the fourth is to then plot their actual routes, based mostly on travel time. The end result is a number of how many trips there will be in the project area and how long it will take to get around. Engineers and planners will then add a new highway, transit line, bridge, or other travel infrastructure to the model and see how things change. Or they will change the numbers in the first step to account for expected population or employment growth into the future. Often, these numbers are then used by policymakers to justify a given project, whether it’s a highway expansion or a light rail line…(More)”.
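The four steps described above can be sketched in a few lines of code. This is only a toy illustration, not the software that planners actually use: the two zones, the trip rates, the gravity-style weighting, the fixed car share, and every function name are assumptions invented for the example. It does show how a growth assumption entered in the first step carries straight through to the final traffic number.

```python
# Toy four-step travel demand model. Zones, trip rates, travel times, and the
# car share are invented for illustration only.

def trip_generation(zones, growth=1.0):
    """Step 1: trips produced and attracted per zone, from assumed land use."""
    productions = {z: d["households"] * 2.0 * growth for z, d in zones.items()}
    attractions = {z: d["jobs"] * 1.5 * growth for z, d in zones.items()}
    return productions, attractions


def trip_distribution(productions, attractions, travel_time):
    """Step 2: gravity model, destinations weighted by attractions / time**2."""
    trips = {}
    for origin, produced in productions.items():
        weights = {
            dest: attractions[dest] / travel_time[(origin, dest)] ** 2
            for dest in attractions
            if dest != origin
        }
        total = sum(weights.values())
        for dest, weight in weights.items():
            trips[(origin, dest)] = produced * weight / total
    return trips


def mode_choice(trips, car_share=0.8):
    """Step 3: split each flow between car and transit with a fixed share."""
    return {od: {"car": n * car_share, "transit": n * (1 - car_share)}
            for od, n in trips.items()}


def route_assignment(mode_trips, routes):
    """Step 4: all-or-nothing assignment of car trips to the fastest route."""
    loads = {}
    for od, by_mode in mode_trips.items():
        fastest = min(routes[od], key=routes[od].get)
        loads[fastest] = loads.get(fastest, 0.0) + by_mode["car"]
    return loads


def run_model(zones, travel_time, routes, growth=1.0):
    """Chain the four steps and return car trips per route."""
    productions, attractions = trip_generation(zones, growth)
    trips = trip_distribution(productions, attractions, travel_time)
    return route_assignment(mode_choice(trips), routes)


# Hypothetical two-zone region separated by a river with one crossing.
zones = {
    "north": {"households": 1000, "jobs": 400},
    "south": {"households": 600, "jobs": 900},
}
travel_time = {("north", "south"): 20, ("south", "north"): 20}
routes = {od: {"old bridge": 20} for od in travel_time}

baseline = run_model(zones, travel_time, routes)
forecast = run_model(zones, travel_time, routes, growth=1.3)
print(baseline, forecast)
```

In this sketch, raising `growth` from 1.0 to 1.3 raises the forecast bridge traffic by the same 30 percent, so the result depends entirely on the growth assumption fed into step one.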

Too many AI researchers think real-world problems are not relevant


Essay by Hannah Kerner: “Any researcher who’s focused on applying machine learning to real-world problems has likely received a response like this one: “The authors present a solution for an original and highly motivating problem, but it is an application and the significance seems limited for the machine-learning community.”

These words are straight from a review I received for a paper I submitted to the NeurIPS (Neural Information Processing Systems) conference, a top venue for machine-learning research. I’ve seen the refrain time and again in reviews of papers where my coauthors and I presented a method motivated by an application, and I’ve heard similar stories from countless others.

This makes me wonder: If the community feels that aiming to solve high-impact real-world problems with machine learning is of limited significance, then what are we trying to achieve?

The goal of artificial intelligence is to push forward the frontier of machine intelligence. In the field of machine learning, a novel development usually means a new algorithm or procedure, or, in the case of deep learning, a new network architecture. As others have pointed out, this hyperfocus on novel methods leads to a scourge of papers that report marginal or incremental improvements on benchmark data sets and exhibit flawed scholarship as researchers race to top the leaderboard.

Meanwhile, many papers that describe new applications present both novel concepts and high-impact results. But even a hint of the word “application” seems to spoil the paper for reviewers. As a result, such research is marginalized at major conferences. Their authors’ only real hope is to have their papers accepted in workshops, which rarely get the same attention from the community.

This is a problem because machine learning holds great promise for advancing health, agriculture, scientific discovery, and more. The first image of a black hole was produced using machine learning. The most accurate predictions of protein structures, an important step for drug discovery, are made using machine learning. If others in the field had prioritized real-world applications, what other groundbreaking discoveries would we have made by now?

This is not a new revelation. To quote a classic paper titled “Machine Learning that Matters”, by NASA computer scientist Kiri Wagstaff: “Much of current machine learning research has lost its connection to problems of import to the larger world of science and society.” The same year that Wagstaff published her paper, a convolutional neural network called AlexNet won a high-profile competition for image recognition centered on the popular ImageNet data set, leading to an explosion of interest in deep learning. Unfortunately, the disconnect she described appears to have grown even worse since then….(More)”.

AI technologies — like police facial recognition — discriminate against people of colour


Jane Bailey et al at The Conversation: “…In his game-changing 1993 book, The Panoptic Sort, scholar Oscar Gandy warned that “complex technology [that] involves the collection, processing and sharing of information about individuals and groups that is generated through their daily lives … is used to coordinate and control their access to the goods and services that define life in the modern capitalist economy.” Law enforcement uses it to pluck suspects from the general public, and private organizations use it to determine whether we have access to things like banking and employment.

Gandy prophetically warned that, if left unchecked, this form of “cybernetic triage” would exponentially disadvantage members of equality-seeking communities — for example, groups that are racialized or socio-economically disadvantaged — both in terms of what would be allocated to them and how they might come to understand themselves.

Some 25 years later, we’re now living with the panoptic sort on steroids. And examples of its negative effects on equality-seeking communities abound, such as the false identification of Williams.

Pre-existing bias

This sorting using algorithms infiltrates the most fundamental aspects of everyday life, occasioning both direct and structural violence in its wake.

The direct violence experienced by Williams is immediately evident in the events surrounding his arrest and detention, and the individual harms he experienced are obvious and can be traced to the actions of police who chose to rely on the technology’s “match” to make an arrest. More insidious is the structural violence perpetrated through facial recognition technology and other digital technologies that rate, match, categorize and sort individuals in ways that magnify pre-existing discriminatory patterns.

Structural violence is less obvious and less direct, and injures equality-seeking groups through the systematic denial of power, resources and opportunity. Simultaneously, it increases direct risk and harm to individual members of those groups.

Predictive policing uses algorithmic processing of historical data to predict when and where new crimes are likely to occur, assigns police resources accordingly and embeds enhanced police surveillance into communities, usually in lower-income and racialized neighbourhoods. This increases the chances that any criminal activity — including less serious criminal activity that might otherwise prompt no police response — will be detected and punished, ultimately limiting the life chances of the people who live within that environment….(More)”.

Algorithmic Colonisation of Africa


Abeba Birhane at The Elephant: “The African equivalents of Silicon Valley’s tech start-ups can be found in every possible sphere of life around all corners of the continent—in “Sheba Valley” in Addis Abeba, “Yabacon Valley” in Lagos, and “Silicon Savannah” in Nairobi, to name a few—all pursuing “cutting-edge innovations” in sectors like banking, finance, healthcare, and education. They are headed by technologists and those in finance from both within and outside the continent who seemingly want to “solve” society’s problems, using data and AI to provide quick “solutions”. As a result, the attempt to “solve” social problems with technology is exactly where problems arise. Complex cultural, moral, and political problems that are inherently embedded in history and context are reduced to problems that can be measured and quantified—matters that can be “fixed” with the latest algorithm.

As dynamic and interactive human activities and processes are automated, they are inherently simplified to the engineers’ and tech corporations’ subjective notions of what they mean. The reduction of complex social problems to a matter that can be “solved” by technology also treats people as passive objects for manipulation. Humans, however, far from being passive objects, are active meaning-seekers embedded in dynamic social, cultural, and historical backgrounds.

The discourse around “data mining”, “abundance of data”, and “data-rich continent” shows the extent to which the individual behind each data point is disregarded. This muting of the individual (a person with fears, emotions, dreams, and hopes) is symptomatic of how little attention is given to matters such as people’s well-being and consent, which should be the primary concerns if the goal is indeed to “help” those in need. Furthermore, this discourse of “mining” people for data is reminiscent of the coloniser’s attitude that declares humans as raw material free for the taking. Data is necessarily always about something and never about an abstract entity.

The collection, analysis, and manipulation of data potentially entails monitoring, tracking, and surveilling people. This necessarily impacts people directly or indirectly whether it manifests as change in their insurance premiums or refusal of services. The erasure of the person behind each data point makes it easy to “manipulate behavior” or “nudge” users, often towards profitable outcomes for companies. Considerations around the wellbeing and welfare of the individual user, the long-term social impacts, and the unintended consequences of these systems on society’s most vulnerable are pushed aside, if they enter the equation at all. For companies that develop and deploy AI, at the top of the agenda is the collection of more data to develop profitable AI systems rather than the welfare of individual people or communities. This is most evident in the FinTech sector, one of the prominent digital markets in Africa. People’s digital footprints, from their interactions with others to how much they spend on their mobile top ups, are continually surveyed and monitored to form data for making loan assessments. Smartphone data from browsing history, likes, and locations is recorded forming the basis for a borrower’s creditworthiness.

Artificial Intelligence technologies that aid decision-making in the social sphere are, for the most part, developed and implemented by the private sector whose primary aim is to maximise profit. Protecting individual privacy rights and cultivating a fair society is therefore the least of their concerns, especially if such practice gets in the way of “mining” data, building predictive models, and pushing products to customers. As decision-making of social outcomes is handed over to predictive systems developed by profit-driven corporates, not only are we allowing our social concerns to be dictated by corporate incentives, we are also allowing moral questions to be dictated by corporate interest.

“Digital nudges”, behaviour modifications developed to suit commercial interests, are a prime example. As “nudging” mechanisms become the norm for “correcting” individuals’ behaviour, eating habits, or exercise routines, those developing predictive models are bestowed with the power to decide what “correct” is. In the process, individuals that do not fit our stereotypical ideas of a “fit body”, “good health”, and “good eating habits” end up being punished, outcast, and pushed further to the margins. When these models are imported as state-of-the-art technology that will save money and “leapfrog” the continent into development, Western values and ideals are enforced, either deliberately or unintentionally….(More)”.

‘Selfies’ could be used to detect heart disease: new research uses artificial intelligence to analyse facial photos


European Society of Cardiology: “Sending a “selfie” to the doctor could be a cheap and simple way of detecting heart disease, according to the authors of a new study published today (Friday) in the European Heart Journal [1].

The study is the first to show that it’s possible to use a deep learning computer algorithm to detect coronary artery disease (CAD) by analysing four photographs of a person’s face.

Although the algorithm needs to be developed further and tested in larger groups of people from different ethnic backgrounds, the researchers say it has the potential to be used as a screening tool that could identify possible heart disease in people in the general population or in high-risk groups, who could be referred for further clinical investigations.

“To our knowledge, this is the first work demonstrating that artificial intelligence can be used to analyse faces to detect heart disease. It is a step towards the development of a deep learning-based tool that could be used to assess the risk of heart disease, either in outpatient clinics or by means of patients taking ‘selfies’ to perform their own screening. This could guide further diagnostic testing or a clinical visit,” said Professor Zhe Zheng, who led the research and is vice director of the National Center for Cardiovascular Diseases and vice president of Fuwai Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, People’s Republic of China.

He continued: “Our ultimate goal is to develop a self-reported application for high risk communities to assess heart disease risk in advance of visiting a clinic. This could be a cheap, simple and effective way of identifying patients who need further investigation. However, the algorithm requires further refinement and external validation in other populations and ethnicities.”

It is known already that certain facial features are associated with an increased risk of heart disease. These include thinning or grey hair, wrinkles, ear lobe crease, xanthelasmata (small, yellow deposits of cholesterol underneath the skin, usually around the eyelids) and arcus corneae (fat and cholesterol deposits that appear as a hazy white, grey or blue opaque ring in the outer edges of the cornea). However, they are difficult for humans to use successfully to predict and quantify heart disease risk.

Prof. Zheng, Professor Xiang-Yang Ji, who is director of the Brain and Cognition Institute in the Department of Automation at Tsinghua University, Beijing, and other colleagues enrolled 5,796 patients from eight hospitals in China to the study between July 2017 and March 2019. The patients were undergoing imaging procedures to investigate their blood vessels, such as coronary angiography or coronary computed tomography angiography (CCTA). They were divided randomly into training (5,216 patients, 90%) or validation (580, 10%) groups.

Trained research nurses took four facial photos with digital cameras: one frontal, two profiles and one view of the top of the head. They also interviewed the patients to collect data on socioeconomic status, lifestyle and medical history. Radiologists reviewed the patients’ angiograms and assessed the degree of heart disease depending on how many blood vessels were narrowed by 50% or more (≥ 50% stenosis), and their location. This information was used to create, train and validate the deep learning algorithm….(More)”.
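The random division of the 5,796 enrolled patients into training and validation groups, as described above, can be illustrated with a short sketch. The excerpt does not say how the researchers performed the split, so the function name, the integer patient IDs, and the fixed seed here are all assumptions made for the example.

```python
import random


def split_patients(patient_ids, n_validation, seed=0):
    """Shuffle patient IDs and hold out n_validation of them for validation."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible
    return ids[n_validation:], ids[:n_validation]


# Hypothetical IDs 0..5795 stand in for the 5,796 enrolled patients.
train_ids, val_ids = split_patients(range(5796), n_validation=580)
print(len(train_ids), len(val_ids))  # 5216 580
```

Holding out patients the algorithm never saw during training is what lets the validation group give an honest estimate of how the algorithm performs on new faces.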

An algorithm shouldn’t decide a student’s future


Hye Jung Han at Politico: “…Education systems across Europe struggled this year with how to determine students’ all-important final grades. But one system, the International Baccalaureate (“IB”) — a high school program that is highly regarded by European universities, and offered by both public and private schools in 152 countries — did something unusual.

Having canceled final exams, which make up the majority of an IB student’s grade, the Geneva-based foundation of the same name hastily built an algorithm that used a student’s coursework scores, predicted grades by teachers and their school’s historical IB results to guess what students might have scored if they had taken their exams in a hypothetical, pandemic-free year. The result of the algorithm became the student’s final grade.

The results were catastrophic. Soon after the grades were released, serious mismatches emerged between expected grades based on a student’s prior performance, and those awarded by the algorithm. Because IB students’ university admissions are contingent upon their final grades, the unexpectedly poor grades generated for some resulted in scholarships and admissions offers being revoked.

The IB had alternatives. It could instead have used students’ actual academic performance and graded on a generous curve. It could have incorporated practice test grades, third-party moderation to minimize grading bias and teachers’ broad evaluations of student progress.

It could have engaged with universities on flexibly factoring in final grades into this year’s admissions decisions, as universities contemplate opening their now-virtual classes to more students to replace lost revenue.

It increasingly seems like the greatest potential of the power promised by predictive data lies in the realm of misuse.

For this year’s graduating class, who have already responded with grace and resilience in their final year of school, the automating away of their capacity and potential is an unfair and unwanted preview of the world they are graduating into….(More)”.

Blame the politicians, not the technology, for A-level fiasco


The Editorial Board at the Financial Times: “The soundtrack of school students marching through Britain’s streets shouting “f*** the algorithm” captured the sense of outrage surrounding the botched awarding of A-level exam grades this year. But the students’ anger towards a disembodied computer algorithm is misplaced. This was a human failure. The algorithm used to “moderate” teacher-assessed grades had no agency and delivered exactly what it was designed to do.

It is politicians and educational officials who are responsible for the government’s latest fiasco and should be the target of students’ criticism….

Sensibly designed, computer algorithms could have been used to moderate teacher assessments in a constructive way. Using past school performance data, they could have highlighted anomalies in the distribution of predicted grades between and within schools. That could have led to a dialogue between Ofqual, the exam regulator, and anomalous schools to come up with more realistic assessments….

There are broader lessons to be drawn from the government’s algo fiasco about the dangers of automated decision-making systems. The inappropriate use of such systems to assess immigration status, policing policies and prison sentencing decisions is a live danger. In the private sector, incomplete and partial data sets can also significantly disadvantage under-represented groups when it comes to hiring decisions and performance measures.

Given the severe erosion of public trust in the government’s use of technology, it might now be advisable to subject all automated decision-making systems to critical scrutiny by independent experts. The Royal Statistical Society and The Alan Turing Institute certainly have the expertise to give a Kitemark of approval or flag concerns.

As ever, technology in itself is neither good nor bad. But it is certainly not neutral. The more we deploy automated decision-making systems, the smarter we must become in considering how best to use them and in scrutinising their outcomes. We often talk about a deficit of trust in our societies. But we should also be aware of the dangers of over-trusting technology. That may be a good essay subject for next year’s philosophy A-level….(More)”.
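The editorial's suggestion that past school performance data could have been used to highlight anomalies in predicted grades, prompting a dialogue with the regulator instead of an automatic adjustment, might look something like the following sketch. The school names, the top-grade shares, the 10-point tolerance, and the function itself are hypothetical; nothing here reflects how Ofqual's algorithm actually worked.

```python
def flag_anomalous_schools(historical_share, predicted_share, tolerance=0.10):
    """Return schools whose predicted top-grade share departs from history.

    Flagged schools are candidates for human review, not automatic regrading.
    """
    flagged = {}
    for school, past in historical_share.items():
        gap = predicted_share[school] - past
        if abs(gap) > tolerance:
            flagged[school] = round(gap, 2)
    return flagged


# Hypothetical shares of students receiving top grades.
historical = {"School A": 0.25, "School B": 0.40, "School C": 0.10}
predicted = {"School A": 0.28, "School B": 0.65, "School C": 0.12}

flagged = flag_anomalous_schools(historical, predicted)
print(flagged)
```

The design choice that matters is that the output is a list of schools to talk to, which leaves the final judgment with people.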