The imperative of interpretable machines


Julia Stoyanovich, Jay J. Van Bavel & Tessa V. West at Nature: “As artificial intelligence becomes prevalent in society, a framework is needed to connect interpretability and trust in algorithm-assisted decisions, for a range of stakeholders.

We are in the midst of a global trend to regulate the use of algorithms, artificial intelligence (AI) and automated decision systems (ADS). As reported by the One Hundred Year Study on Artificial Intelligence: “AI technologies already pervade our lives. As they become a central force in society, the field is shifting from simply building systems that are intelligent to building intelligent systems that are human-aware and trustworthy.” Major cities, states and national governments are establishing task forces, passing laws and issuing guidelines about responsible development and use of technology, often starting with its use in government itself, where there is, at least in theory, less friction between organizational goals and societal values.

In the United States, New York City has made a public commitment to opening the black box of the government’s use of technology: in 2018, an ADS task force was convened, the first of its kind in the nation, and charged with providing recommendations to New York City’s government agencies for how to become transparent and accountable in their use of ADS. In a 2019 report, the task force recommended using ADS where they are beneficial, reduce potential harm and promote fairness, equity, accountability and transparency. Can these principles become policy in the face of the apparent lack of trust in the government’s ability to manage AI in the interest of the public? We argue that overcoming this mistrust hinges on our ability to engage in substantive multi-stakeholder conversations around ADS, bringing with it the imperative of interpretability — allowing humans to understand and, if necessary, contest the computational process and its outcomes.

Remarkably little is known about how humans perceive and evaluate algorithms and their outputs, what makes a human trust or mistrust an algorithm, and how we can empower humans to exercise agency — to adopt or challenge an algorithmic decision. Consider, for example, scoring and ranking — data-driven algorithms that prioritize entities such as individuals, schools, or products and services. These algorithms may be used to determine creditworthiness and desirability for college admission or employment. Scoring and ranking are as ubiquitous and powerful as they are opaque. Despite their importance, members of the public often know little about why one person is ranked higher than another by a résumé screening or a credit scoring tool, how the ranking process is designed and whether its results can be trusted.

As an interdisciplinary team of scientists in computer science and social psychology, we propose a framework that forms connections between interpretability and trust, and develops actionable explanations for a diversity of stakeholders, recognizing their unique perspectives and needs. We focus on three questions (Box 1) about making machines interpretable: (1) what are we explaining, (2) to whom are we explaining and for what purpose, and (3) how do we know that an explanation is effective? By asking — and charting the path towards answering — these questions, we can promote greater trust in algorithms, and improve fairness and efficiency of algorithm-assisted decision making…(More)”.

Data & Policy


Data & Policy, an open-access journal exploring the potential of data science for governance and public decision-making, published its first cluster of peer-reviewed articles last week.

The articles include three contributions specifically concerned with data protection by design:

·       Gefion Thuermer and colleagues (University of Southampton) distinguish between data trusts and other data sharing mechanisms and discuss the need for workflows with data protection at their core;

·       Swee Leng Harris (King’s College London) explores Data Protection Impact Assessments as a framework for helping us know whether government use of data is legal, transparent and upholds human rights;

·       Giorgia Bincoletto’s (University of Bologna) study investigates data protection concerns arising from cross-border interoperability of Electronic Health Record systems in the European Union.

Also published is research by Jacqueline Lam and colleagues (University of Cambridge; Hong Kong University) on how fine-grained data from satellites and other sources can help us understand environmental inequality and socio-economic disparities in China, which also reflects on the importance of safeguarding data privacy and security. See also this week’s blogs on the potential of Data Collaboratives for COVID-19, by Editor-in-Chief Stefaan Verhulst (the GovLab), and on how COVID-19 exposes a widening data divide for the Global South, by Stefania Milan (University of Amsterdam) and Emiliano Treré (Cardiff University).

Data & Policy is an open access, peer-reviewed venue for contributions that consider how systems of policy and data relate to one another. Read the 5 ways you can contribute to Data & Policy and contact [email protected] with any questions….(More)”.

Why we need responsible data for children


Andrew Young and Stefaan Verhulst at The Conversation: “…Without question, the increased use of data poses unique risks for and responsibilities to children. While practitioners may have well-intended purposes to leverage data for and about children, the data systems used are often designed with (consenting) adults in mind without a focus on the unique needs and vulnerabilities of children. This can lead to the collection of inaccurate and unreliable data as well as the inappropriate and potentially harmful use of data for and about children….

Research undertaken in the context of the RD4C initiative uncovered the following trends and realities. These issues make clear why we need a dedicated data responsibility approach for children.

  • Today’s children are the first generation growing up at a time of rapid datafication, when almost all aspects of their lives, both on- and offline, are turned into data points. An entire generation of young people is being datafied – often starting even before birth. Each year, the average child will have more data collected about them over their lifetime than a similar child born in any previous year. The potential uses of such large volumes of data and their impact on children’s lives are unpredictable, and the data could be used against them.
  • Children typically do not have full agency to make decisions about their participation in programs or services which may generate and record personal data. Children may also lack the understanding to assess a decision’s purported risks and benefits. Privacy terms and conditions are often barely understood by educated adults, let alone children. As a result, there is a higher duty of care for children’s data.
  • Disaggregating data according to socio-demographic characteristics can improve service delivery and assist with policy development. However, it also creates risks for group privacy. Children can be identified, exposing them to possible harms. Disaggregated data for groups such as child-headed households and children experiencing gender-based violence can put vulnerable communities and children at risk. Data about children’s location itself can be risky, especially if they have some additional vulnerability that could expose them to harm.
  • Mishandling data can cause children to lose trust in institutions that deliver essential services, including vaccines, medicine and nutrition supplies. For organizations dealing with child well-being, this loss of trust can have severe consequences: distrust can cause families and children to refuse health, education, child protection and other public services. Such privacy-protective behavior can affect children throughout their lives, and potentially exacerbate existing inequities and vulnerabilities.
  • As volumes of collected and stored data increase, obligations and protections traditionally put in place for children may be difficult or impossible to uphold. The interests of children are not always prioritized when organizations define their legitimate interest to access or share personal information of children. The immediate benefit of a service provided does not always justify the risk or harm that might be caused by it in the future. Data analysis may be undertaken by people who do not have expertise in the area of child rights, as opposed to traditional research where practitioners are specifically educated in child subject research. Similarly, service providers collecting children’s data are not always specially trained to handle it, as international standards recommend.
  • Recent events around the world reveal the promise and pitfalls of algorithmic decision-making. While it can expedite certain processes, algorithms and their inferences can possess biases that can have adverse effects on people, for example those seeking medical care and attempting to secure jobs. The danger posed by algorithmic bias is especially pronounced for children and other vulnerable populations. These groups often lack the awareness or resources necessary to respond to instances of bias or to rectify any misconceptions or inaccuracies in their data.
  • Many of the children served by child welfare organizations have suffered trauma. Whether that trauma is physical, social or emotional in nature, repeatedly making children register for services or provide confidential personal information can amount to revictimization – re-exposing them to those traumas or instigating unwarranted feelings of shame and guilt.
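The group-privacy risk of disaggregation noted in the list above can be made concrete with a minimal Python sketch (all records, attribute names and the suppression threshold here are hypothetical): once data is broken down by several socio-demographic attributes, some cells shrink to a single child, and a standard safeguard, small-cell suppression, withholds any cell below a minimum count before publication.

```python
# Minimal, hypothetical sketch of the group-privacy risk of disaggregation:
# splitting records by several socio-demographic attributes produces "cells",
# and very small cells can identify individual children. A common safeguard
# is to suppress any cell below a minimum size before publishing counts.
from collections import Counter

# Illustrative records only: (district, household_type, age_band)
records = [
    ("north", "two-parent", "5-9"),
    ("north", "two-parent", "5-9"),
    ("north", "two-parent", "5-9"),
    ("north", "two-parent", "10-14"),
    ("north", "child-headed", "10-14"),  # a cell of one: identifiable
    ("south", "two-parent", "5-9"),
    ("south", "two-parent", "5-9"),
]

MIN_CELL_SIZE = 3  # threshold is illustrative; real policies vary

cells = Counter(records)
published = {cell: n for cell, n in cells.items() if n >= MIN_CELL_SIZE}
suppressed = {cell: n for cell, n in cells.items() if n < MIN_CELL_SIZE}

for cell, n in sorted(cells.items()):
    status = "publish" if n >= MIN_CELL_SIZE else "suppress"
    print(f"{cell}: count={n} -> {status}")
```

Small-cell suppression is only a first line of defence: combinations of published cells can still leak information, which is why stronger techniques such as k-anonymity and differential privacy exist.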

These trends and realities make clear the need for new approaches for maximizing the value of data to improve children’s lives, while mitigating the risks posed by our increasingly datafied society….(More)”.

The world after coronavirus


Yuval Noah Harari at the Financial Times: “Humankind is now facing a global crisis. Perhaps the biggest crisis of our generation. The decisions people and governments take in the next few weeks will probably shape the world for years to come. They will shape not just our healthcare systems but also our economy, politics and culture. We must act quickly and decisively. We should also take into account the long-term consequences of our actions.

When choosing between alternatives, we should ask ourselves not only how to overcome the immediate threat, but also what kind of world we will inhabit once the storm passes. Yes, the storm will pass, humankind will survive, most of us will still be alive — but we will inhabit a different world.  Many short-term emergency measures will become a fixture of life. That is the nature of emergencies. They fast-forward historical processes.

Decisions that in normal times could take years of deliberation are passed in a matter of hours. Immature and even dangerous technologies are pressed into service, because the risks of doing nothing are bigger. Entire countries serve as guinea-pigs in large-scale social experiments. What happens when everybody works from home and communicates only at a distance? What happens when entire schools and universities go online? In normal times, governments, businesses and educational boards would never agree to conduct such experiments. But these aren’t normal times. 

In this time of crisis, we face two particularly important choices. The first is between totalitarian surveillance and citizen empowerment. The second is between nationalist isolation and global solidarity. 

Under-the-skin surveillance

In order to stop the epidemic, entire populations need to comply with certain guidelines. There are two main ways of achieving this. One method is for the government to monitor people, and punish those who break the rules. Today, for the first time in human history, technology makes it possible to monitor everyone all the time. Fifty years ago, the KGB couldn’t follow 240m Soviet citizens 24 hours a day, nor could the KGB hope to effectively process all the information gathered. The KGB relied on human agents and analysts, and it just couldn’t place a human agent to follow every citizen. But now governments can rely on ubiquitous sensors and powerful algorithms instead of flesh-and-blood spooks. 

In their battle against the coronavirus epidemic several governments have already deployed the new surveillance tools. The most notable case is China. By closely monitoring people’s smartphones, making use of hundreds of millions of face-recognising cameras, and obliging people to check and report their body temperature and medical condition, the Chinese authorities can not only quickly identify suspected coronavirus carriers, but also track their movements and identify anyone they came into contact with. A range of mobile apps warn citizens about their proximity to infected patients…

If I could track my own medical condition 24 hours a day, I would learn not only whether I have become a health hazard to other people, but also which habits contribute to my health. And if I could access and analyse reliable statistics on the spread of coronavirus, I would be able to judge whether the government is telling me the truth and whether it is adopting the right policies to combat the epidemic. Whenever people talk about surveillance, remember that the same surveillance technology can usually be used not only by governments to monitor individuals — but also by individuals to monitor governments. 

The coronavirus epidemic is thus a major test of citizenship….(More)”.

Beyond a Human Rights-based approach to AI Governance: Promise, Pitfalls, Plea


Paper by Nathalie A. Smuha: “This paper discusses the establishment of a governance framework to secure the development and deployment of “good AI”, and describes the quest for a morally objective compass to steer it. Asserting that human rights can provide such compass, this paper first examines what a human rights-based approach to AI governance entails, and sets out the promise it propagates. Subsequently, it examines the pitfalls associated with human rights, particularly focusing on the criticism that these rights may be too Western, too individualistic, too narrow in scope and too abstract to form the basis of sound AI governance. After rebutting these reproaches, a plea is made to move beyond the calls for a human rights-based approach, and start taking the necessary steps to attain its realisation. It is argued that, without elucidating the applicability and enforceability of human rights in the context of AI; adopting legal rules that concretise those rights where appropriate; enhancing existing enforcement mechanisms; and securing an underlying societal infrastructure that enables human rights in the first place, any human rights-based governance framework for AI risks falling short of its purpose….(More)”.

World Justice Project (WJP) Rule of Law Index®


Interactive Overview: “The World Justice Project (WJP) Rule of Law Index® is the world’s leading source for original, independent data on the rule of law. Now covering 128 countries and jurisdictions, the Index relies on national surveys of more than 130,000 households and 4,000 legal practitioners and experts to measure how the rule of law is experienced and perceived around the world.

Effective rule of law reduces corruption, combats poverty and disease, and protects people from injustices large and small. It is the foundation for communities of justice, opportunity, and peace—underpinning development, accountable government, and respect for fundamental rights.

Learn more about the rule of law and explore the full WJP Rule of Law Index 2020 report, including PDF report download, data insights, methodology, and more at the Index report resources page….(More)”

CARE Principles for Indigenous Data Governance


The Global Indigenous Data Alliance: “The current movement toward open data and open science does not fully engage with Indigenous Peoples’ rights and interests. Existing principles within the open data movement (e.g. FAIR: findable, accessible, interoperable, reusable) primarily focus on characteristics of data that will facilitate increased data sharing among entities while ignoring power differentials and historical contexts. The emphasis on greater data sharing alone creates a tension for Indigenous Peoples, who are also asserting greater control over the application and use of Indigenous data and Indigenous Knowledge for collective benefit.

This includes the right to create value from Indigenous data in ways that are grounded in Indigenous worldviews and realise opportunities within the knowledge economy. The CARE Principles for Indigenous Data Governance are people and purpose-oriented, reflecting the crucial role of data in advancing Indigenous innovation and self-determination. These principles complement the existing FAIR principles encouraging open and other data movements to consider both people and purpose in their advocacy and pursuits….(More)”.

Policymaking in an Infomocracy


An interview with Malka Older: “…Nisa: There’s a line in your first book, “Democracy is of limited usefulness when there are no good choices, or when all the information access in the world can’t make people use it.” So imagine this world you’ve imagined has a much higher demand for free and accurate information access than we have now, in exchange for a fairly high amount of state surveillance. I’m curious what else we give up when we allow that amount of surveillance into our communities and whether that trade-off is necessary.

Malka: The amount of surveillance in the books is a very gentle extrapolation from where we are now. I don’t know if they need to be that connected but I do feel like privacy is a very relative concept. The way that we think of privacy now is very different than the way that it’s been thought of in the past, or the way it’s thought of in different places, and it’s very hard to put that back in the box. I was thinking more in terms of, since we are giving up our privacy anyway, what would I like to see done with all this information? Most of the types of surveillance that I mentioned are already very much in place. It’s hard to walk down the street without seeing surveillance cameras — they’re in private businesses, outside of apartment buildings, in lobbies, and buses and trains and pretty much everywhere.  We already know that whatever we do online is recorded and tracked in some way. If we have smartphones—which I don’t, I’m trying to resist, although it’s getting harder and harder—pretty much all of our movements are being tracked that way. The difference from the book is that the current situation of surveillance is very fragmented, and a combination of private sector and public sector, as opposed to one monolithic organization. Although, it’s not clear how different it really is from our present when governments are able to subpoena information from the private sector. The other part is that we give away a lot of this information, if not all of it, whenever we accept the terms of service agreements. We’re basically saying, in exchange for having this cool phone, I will let you use my data. But we’re learning that companies are often going far beyond what we legally agreed to, and even what we legally agree to is done in such convoluted terms and there’s an imbalance of information to begin with. That’s really problematic. 
Rather than thinking in terms of privacy as a kind of absolute, or in terms of surveillance, I tend to think more about who owns the data and who has access to the data. The real problem is not just that there are cameras everywhere, but that we don’t know who is watching those cameras or who is able to access them at any given time. Similarly, the fact that all of our online data is being recorded is not necessarily a huge problem, except when we have no way of knowing what the data is contributing to when it’s amalgamated, and no recourse or control over how it’s eventually used. There’s a real difference between all the data we create in our online trails being in the hands of a corporation that does not need to share it or reveal it, and is using it to make money, and that data being available to everybody, or held under some sort of very clear and equitable terms where we have much more choice about what it’s used for and where we could access our own data. For me, it’s very much about the power structures involved….(More)”.

How Philanthropy Can Help Lead on Data Justice


Louise Lief at Stanford Social Innovation Review: “Today, data governs almost every aspect of our lives, shaping the opportunities we have, how we perceive reality and understand problems, and even what we believe to be possible. Philanthropy is particularly data driven, relying on it to inform decision-making, define problems, and measure impact. But what happens when data design and collection methods are flawed, lack context, or contain critical omissions and misdirected questions? With bad data, data-driven strategies can misdiagnose problems and worsen inequities with interventions that don’t reflect what is needed.

Data justice begins by asking who controls the narrative. Who decides what data is collected and for which purpose? Who interprets what it means for a community? Who governs it? In recent years, affected communities, social justice philanthropists, and academics have all begun looking deeper into the relationship between data and social justice in our increasingly data-driven world. But philanthropy can play a game-changing role in developing practices of data justice to more accurately reflect the lived experience of communities being studied. Simply incorporating data justice principles into everyday foundation practice—and requiring it of grantees—would be transformative: It would not only revitalize research, strengthen communities, influence policy, and accelerate social change, it would also help address deficiencies in current government data sets.

When Data Is Flawed

Some of the most pioneering work on data justice has been done by Native American communities, who have suffered more than most from problems with bad data. A 2017 analysis of American Indian data challenges—funded by the W.K. Kellogg Foundation and the Morris K. Udall and Stewart L. Udall Foundation—documented how much data on Native American communities is of poor quality, inaccurate, inadequate, inconsistent, irrelevant, and/or inaccessible. The National Congress of American Indians even described American Native communities as “The Asterisk Nation,” because in many government data sets they are represented only by an asterisk denoting sampling errors instead of data points.

Where it concerns Native Americans, data is often not standardized: different government databases identify tribal members in at least seven different ways using different criteria; federal and state statistics often misclassify race and ethnicity; and some data collection methods don’t allow tribes to count tribal citizens living off the reservation. For over a decade the Department of the Interior’s Bureau of Indian Affairs has struggled to capture the data it needs for a crucial labor force report it is legally required to produce; methodology errors and reporting problems have been so extensive that at times they prevented the report from being published at all. But when the Department of the Interior changed several reporting requirements in 2014 and combined data submitted by tribes with US Census data, it only compounded the problem, making historical comparisons more difficult. Moreover, Native Americans have charged that the Census Bureau significantly undercounts both the American Indian population and key indicators like joblessness….(More)”.

Smarter government or data-driven disaster: the algorithms helping control local communities


Release by MuckRock: “What is the chance you, or your neighbor, will commit a crime? Should the government change a child’s bus route? Add more police to a neighborhood or take some away?

Everyday government decisions, from bus routes to policing, used to be based on limited information and human judgment. Governments can now collect and analyze hundreds of data points every day, and use them to automate many of their decisions.

Does handing government decisions over to algorithms save time and money? Can algorithms be fairer or less biased than human decision making? Do they make us safer? Automation and artificial intelligence could improve the notorious inefficiencies of government, or they could exacerbate existing errors in the data being used to power them.

MuckRock and the Rutgers Institute for Information Policy & Law (RIIPL) have compiled a collection of algorithms used in communities across the country to automate government decision-making.

Go right to the database.

We have also compiled policies and other guiding documents local governments use to make room for the future use of algorithms. You can find those as a project on DocumentCloud.

View policies on smart cities and technologies

These collections are a living resource and attempt to communally collect records and known instances of automated decision making in government….(More)”.