The Seductive Diversion of ‘Solving’ Bias in Artificial Intelligence


Blog by Julia Powles and Helen Nissenbaum: “Serious thinkers in academia and business have swarmed to the A.I. bias problem, eager to tweak and improve the data and algorithms that drive artificial intelligence. They’ve latched onto fairness as the objective, obsessing over competing constructs of the term that can be rendered in measurable, mathematical form. If the hunt for a science of computational fairness was restricted to engineers, it would be one thing. But given our contemporary exaltation and deference to technologists, it has limited the entire imagination of ethics, law, and the media as well.

There are three problems with this focus on A.I. bias. The first is that addressing bias as a computational problem obscures its root causes. Bias is a social problem, and seeking to solve it within the logic of automation is always going to be inadequate.

Second, even apparent success in tackling bias can have perverse consequences. Take the example of a facial recognition system that works poorly on women of color because of the group’s underrepresentation both in the training data and among system designers. Alleviating this problem by seeking to “equalize” representation merely co-opts designers in perfecting vast instruments of surveillance and classification.

When underlying systemic issues remain fundamentally untouched, the bias fighters simply render humans more machine readable, exposing minorities in particular to additional harms.

Third — and most dangerous and urgent of all — is the way in which the seductive controversy of A.I. bias, and the false allure of “solving” it, detracts from bigger, more pressing questions. Bias is real, but it’s also a captivating diversion.

What has been remarkably underappreciated is the key interdependence of the twin stories of A.I. inevitability and A.I. bias. Against the corporate projection of an otherwise sunny horizon of unstoppable A.I. integration, recognizing and acknowledging bias can be seen as a strategic concession — one that subdues the scale of the challenge. Bias, like job losses and safety hazards, becomes part of the grand bargain of innovation.

The reality that bias is primarily a social problem and cannot be fully solved technically becomes a strength, rather than a weakness, for the inevitability narrative. It flips the script. It absorbs and regularizes the classification practices and underlying systems of inequality perpetuated by automation, allowing relative increases in “fairness” to be claimed as victories — even if all that is being done is to slice, dice, and redistribute the makeup of those negatively affected by actuarial decision-making.

In short, the preoccupation with narrow computational puzzles distracts us from the far more important issue of the colossal asymmetry between societal cost and private gain in the rollout of automated systems. It also denies us the possibility of asking: Should we be building these systems at all?…(More)”.

To Reduce Privacy Risks, the Census Plans to Report Less Accurate Data


Mark Hansen at the New York Times: “When the Census Bureau gathered data in 2010, it made two promises. The form would be “quick and easy,” it said. And “your answers are protected by law.”

But mathematical breakthroughs, easy access to more powerful computing, and widespread availability of large and varied public data sets have made the bureau reconsider whether the protection it offers Americans is strong enough. To preserve confidentiality, the bureau’s directors have determined they need to adopt a “formal privacy” approach, one that adds uncertainty to census data before it is published and achieves privacy assurances that are provable mathematically.

The census has always added some uncertainty to its data, but a key innovation of this new framework, known as “differential privacy,” is a numerical value describing how much privacy loss a person will experience. It determines the amount of randomness — “noise” — that needs to be added to a data set before it is released, and sets up a balancing act between accuracy and privacy. Too much noise would mean the data would not be accurate enough to be useful — in redistricting, in enforcing the Voting Rights Act or in conducting academic research. But too little, and someone’s personal data could be revealed.
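
As an illustration of the mechanics described above, the sketch below adds Laplace noise to a single count in Python; the epsilon values and block count are assumed for illustration, and the bureau's actual disclosure avoidance system is far more elaborate than this.

```python
import numpy as np

def dp_count(true_count, epsilon):
    """Release a count with Laplace noise calibrated to the privacy-loss budget.

    Adding or removing one person changes a count by at most 1, so the
    Laplace scale is 1 / epsilon. Smaller epsilon means more noise:
    stronger privacy, lower accuracy.
    """
    noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

block_population = 1437  # hypothetical block-level count
print(dp_count(block_population, epsilon=0.1))  # very noisy, strong privacy
print(dp_count(block_population, epsilon=5.0))  # close to the truth, weaker privacy
```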

On Thursday, the bureau will announce the trade-off it has chosen for data publications from the 2018 End-to-End Census Test it conducted in Rhode Island, the only dress rehearsal before the actual census in 2020. The bureau has decided to enforce stronger privacy protections than companies like Apple or Google had when they each first took up differential privacy….

In presentation materials for Thursday’s announcement, special attention is paid to lessening any problems with redistricting: the potential complications of using noisy counts of voting-age people to draw district lines. (By contrast, in 2000 and 2010 the swapping mechanism produced exact counts of potential voters down to the block level.)

The Census Bureau has been an early adopter of differential privacy. Still, instituting the framework on such a large scale is not an easy task, and even some of the big technology firms have had difficulties. For example, shortly after Apple’s announcement in 2016 that it would use differential privacy for data collected from its macOS and iOS operating systems, it was revealed that the actual privacy loss of their systems was much higher than advertised.
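
Much of that gap comes from composition: every release that touches the same person's data spends additional privacy budget, and the losses add up. A rough sketch, using assumed figures rather than Apple's actual parameters:

```python
# Sequential composition: if a user's data feeds k independent releases,
# each with privacy loss epsilon_i, the worst-case total loss is the sum.
per_submission_epsilon = 2.0   # assumed per-upload budget
submissions_per_day = 8        # assumed number of daily uploads per device
total_daily_epsilon = per_submission_epsilon * submissions_per_day
print(total_daily_epsilon)     # 16.0: a far weaker guarantee than any single release suggests
```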

Some scholars question the bureau’s abandonment of techniques like swapping in favor of differential privacy. Steven Ruggles, Regents Professor of history and population studies at the University of Minnesota, has relied on census data for decades. Through the Integrated Public Use Microdata Series, he and his team have regularized census data dating to 1850, providing consistency between questionnaires as the forms have changed, and enabling researchers to analyze data across years.

“All of the sudden, Title 13 gets equated with differential privacy — it’s not,” he said, adding that if you make a guess about someone’s identity from looking at census data, you are probably wrong. “That has been regarded in the past as protection of privacy. They want to make it so that you can’t even guess.”

“There is a trade-off between usability and risk,” he added. “I am concerned they may go far too far on privileging an absolutist standard of risk.”

In a working paper published Friday, he said that with the number of private services offering personal data, a prospective hacker would have little incentive to turn to public data such as the census “in an attempt to uncover uncertain, imprecise and outdated information about a particular individual.”…(More)”.

New methods help identify what drives sensitive or socially unacceptable behaviors


Mary Guiden at Physorg: “Conservation scientists and statisticians at Colorado State University have teamed up to solve a key problem for the study of sensitive behaviors like poaching, harassment, bribery, and drug use.

Sensitive behaviors—defined as socially unacceptable or not compliant with rules and regulations—are notoriously hard to study, researchers say, because people often do not want to answer direct questions about them.

To overcome this challenge, scientists have developed indirect questioning approaches that protect responders’ identities. However, these methods also make it difficult to predict which sectors of a population are more likely to participate in sensitive behaviors, and which factors, such as knowledge of laws, education, or income, influence the probability that an individual will engage in a sensitive behavior.

Assistant Professor Jennifer Solomon and Associate Professor Michael Gavin of the Department of Human Dimensions of Natural Resources at CSU, and Abu Conteh from MacEwan University in Alberta, Canada, have teamed up with Professor Jay Breidt and doctoral student Meng Cao in the CSU Department of Statistics to develop a new method to solve the problem.

The study, “Understanding the drivers of sensitive behavior using Poisson regression from quantitative randomized response technique data,” was published recently in PLOS One.

Conteh, who, as a doctoral student, worked with Gavin in New Zealand, used a specific technique, known as quantitative randomized response, to elicit confidential answers to questions on behaviors related to non-compliance with natural resource regulations in a protected area in Sierra Leone.

In this technique, the researcher conducting interviews has a large container of pingpong balls, some with numbers and some without. The interviewer asks the respondent to pick a ball at random, without revealing it to the interviewer. If the ball has a number, the respondent tells the interviewer the number. If the ball does not have a number, the respondent reveals how many times he illegally hunted animals in a given time period….
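
Because only the respondent knows whether the ball carried a number, each individual answer is deniable, yet aggregate behavior can still be estimated. The sketch below illustrates the basic moment-based estimator behind this design; it is not the Poisson regression model developed in the paper, and the mixing probability and printed numbers are assumed values.

```python
import numpy as np

p_forced = 0.4                                # assumed share of numbered balls
forced_values = np.array([0, 1, 2, 3, 4, 5])  # assumed numbers printed on balls
mu_forced = forced_values.mean()

def estimate_true_mean(reported):
    """Recover the mean of the sensitive count from masked reports.

    E[reported] = p_forced * mu_forced + (1 - p_forced) * mu_true,
    so mu_true = (mean(reported) - p_forced * mu_forced) / (1 - p_forced).
    """
    reported = np.asarray(reported, dtype=float)
    return (reported.mean() - p_forced * mu_forced) / (1 - p_forced)

# Simulated survey: 1,000 respondents whose true counts average about 1.2
rng = np.random.default_rng(0)
true_counts = rng.poisson(1.2, size=1000)
use_forced = rng.random(1000) < p_forced
reports = np.where(use_forced, rng.choice(forced_values, size=1000), true_counts)
print(estimate_true_mean(reports))  # should land near 1.2
```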

Armed with the new computer program, the scientists found that people from rural communities with less access to jobs in urban centers were more likely to hunt in the reserve. People in communities with a greater proportion of people displaced by Sierra Leone’s 10-year civil war were also more likely to hunt illegally….

The researchers said that collaborating across disciplines was and is key to addressing complex problems like this one. It is commonplace for people to be noncompliant with rules and regulations, and equally important for social scientists to analyze these behaviors….(More)”.

Beyond GDP: Measuring What Counts for Economic and Social Performance


OECD Book: “Metrics matter for policy and policy matters for well-being. In this report, the co-chairs of the OECD-hosted High Level Expert Group on the Measurement of Economic Performance and Social Progress, Joseph E. Stiglitz, Jean-Paul Fitoussi and Martine Durand, show how over-reliance on GDP as the yardstick of economic performance misled policy makers who did not see the 2008 crisis coming. When the crisis did hit, concentrating on the wrong indicators meant that governments made inadequate policy choices, with severe and long-lasting consequences for many people.

While GDP is the most well-known and most powerful economic indicator, it can’t tell us everything we need to know about the health of countries and societies. In fact, it can’t even tell us everything we need to know about economic performance. We need to develop dashboards of indicators that reveal who is benefitting from growth, whether that growth is environmentally sustainable, how people feel about their lives, and what factors contribute to an individual’s or a country’s success. This book looks at progress made over the past 10 years in collecting well-being data, and in using them to inform policies. An accompanying volume, For Good Measure: Advancing Research on Well-being Metrics Beyond GDP, presents the latest findings from leading economists and statisticians on selected issues within the broader agenda on defining and measuring well-being….(More)”.

Open Government Data for Inclusive Development


Chapter by F. van Schalkwyk and M. Cañares in “Making Open Development Inclusive” (MIT Press), edited by Matthew L. Smith and Ruhiya Kris Seward: “This chapter examines the relationship between open government data and social inclusion. Twenty-eight open data initiatives from the Global South are analyzed to find out how and in what contexts the publication of open government data tends to result in the inclusion of habitually marginalized communities in governance processes such that they may lead better lives.

The relationship between open government data and social inclusion is examined by presenting an analysis of the outcomes of open data projects. This analysis is based on a constellation of factors that were identified as having a bearing on open data initiatives with respect to inclusion. The findings indicate that open data can contribute to an increase in access and participation — both components of inclusion. In these cases, this particular finding indicates that a more open, participatory approach to governance practice is taking root. However, the findings also show that access and participation approaches to open government data have, in the cases studied here, not successfully disrupted the concentration of power in political and other networks, and this has placed limits on open data’s contribution to a more inclusive society.

The chapter starts by presenting a theoretical framework for the analysis of the relationship between open data and inclusion. The framework sets out the complex relationship between social actors, information and power in the network society. This is critical, we suggest, in developing a realistic analysis of the contexts in which open data activates its potential for transformation. The chapter then articulates the research question and presents the methodology used to operationalize those questions. The findings and discussion section that follows examines the factors affecting the relationship between open data and inclusion, and how these factors are observed to play out across several open data initiatives in different contexts. The chapter ends with concluding remarks and an attempt to synthesize the insights that emerged in the preceding sections….(More)”.

Better Data for Doing Good: Responsible Use of Big Data and Artificial Intelligence


Report by the World Bank: “Describes opportunities for harnessing the value of big data and artificial intelligence (AI) for social good and how new families of AI algorithms now make it possible to obtain actionable insights automatically and at scale. Beyond internet business or commercial applications, multiple examples already exist of how big data and AI can help achieve shared development objectives, such as the 2030 Agenda for Sustainable Development and the Sustainable Development Goals (SDGs). But ethical frameworks in line with increased uptake of these new technologies remain necessary—not only concerning data privacy but also relating to the impact and consequences of using data and algorithms. Public recognition has grown concerning AI’s potential to create both opportunities for societal benefit and risks to human rights. Development calls for seizing the opportunity to shape future use as a force for good, while at the same time ensuring the technologies address inequalities and avoid widening the digital divide….(More)”.

Artificial Intelligence: Public-Private Partnerships join forces to boost AI progress in Europe


European Commission Press Release: “…the Big Data Value Association and euRobotics agreed to cooperate more closely in order to boost the advancement of artificial intelligence (AI) in Europe. Both associations want to strengthen their collaboration on AI in the future, specifically by:

  • Working together to boost European AI, building on existing industrial and research communities and on results of the Big Data Value PPP and SPARC PPP. This will contribute to the European Commission’s ambitious approach to AI, backed by a drastic increase in investment, reaching €20 billion in total public and private funding in Europe by the end of 2020.
  • Enabling joint pilots, for example, to accelerate the use and integration of big data, robotics and AI technologies in different sectors and society as a whole
  • Exchanging best practices and approaches from existing and future projects of the Big Data PPP and the SPARC PPP
  • Contributing to the European Digital Single Market, developing strategic roadmaps and position papers

This Memorandum of Understanding between the PPPs follows the European Commission’s approach to AI presented in April 2018 and the Declaration of Cooperation on Artificial Intelligence signed by all 28 Member States and Norway. This Friday, 7 December, the Commission will present its EU coordinated plan….(More)”.

Data Collaboration, Pooling and Hoarding under Competition Law


Paper by Bjorn Lundqvist: “In the Internet of Things era, devices will monitor and collect data, whilst device-producing firms will store, distribute, analyse and re-use data on a grand scale. A great deal of data analytics will be used to enable firms to understand and make use of the collected data. The infrastructure around the collected data is controlled, and access to the data flow is thus restricted on technical as well as legal grounds. Legally, the data are being obscured behind a thicket of property rights, including intellectual property rights. Therefore, there is no general “data commons” for everyone to enjoy.

If firms would like to combine data, they need to give each other access, whether by sharing, trading, or pooling the data. On the one hand, industry-wide pooling of data could increase the efficiency of certain services and contribute to innovation in other services, for example self-driving cars or personalized medicine. On the other hand, firms combining business data may use the data not to advance their services or products, but to collude, to exclude competitors or to abuse their market position. Indeed, by combining their data in a pool, they can gain market power and, hence, the ability to violate competition law. Moreover, we also see firms hoarding data from various sources, creating de facto data pools. This article will discuss what implications combining data in data pools might have for competition, and when competition law should be applicable. It develops the idea that data pools harbour great opportunities, whilst acknowledging that there are still risks to take into consideration, and to regulate….(More)”.

Using Mobile Network Data for Development: How it works


Blog by Derval Usher and Darren Hanniffy: “…We aim to equip decision makers with data tools so that they have access to the analysis on the fly. But to help this scale we need progress in three areas:

1. The framework to support Shared Value partnerships.

2. Shared understanding of The Proposition and the benefits for all parties.

3. Access to finance and a funding strategy, designing-in innovation.

1. Any Public-Private Partnership should be aligned to achieve impact centered on the SDGs through a Shared Value / Inclusive Business approach. Mobile network operators are consumed with the challenge of maintaining or upgrading their infrastructure, driving device sales and sustaining their agent networks to reach the last mile. Measuring impact against the SDGs has not been a priority. Mobile network operators tend not to seek out partnerships with traditional development donors or development implementers. But there is a growing realisation of the potential and the need to partner. It’s important to move from a service-level, transactional relationship to a strategic partnership approach.

Private sector partners have been fundamental to the success of UN Global Pulse as these companies are often the custodians of the big data sets from which we develop valuable development and humanitarian insights. Although in previous years our private sector partners were framed primarily as data philanthropists, we are beginning to see a shift in the relationship to one of shared value. Our work generates public value and also insights that can enhance business operations. This shared value model is attracting more private enterprises to engage and to explore their own data, and more broadly to investigate the value of their networks and data as part of the data innovation ecosystem, which the Global Pulse lab network will build on as we move forward.

2. Partners need to be more propositional and less charitable. They need to recognise the fact that earning profit may help ensure the sustainability of digital platforms and services that offer developmental impact. Through partnership we can attract innovative finance, deliver mobile for development programmes, measure impact and create affordable commercial solutions to development challenges that become sustainable by design. Pulse Lab Jakarta and Digicel have been flexible with one another, which is important as this partnership has not always been a priority for either side. But we believe in unlocking the power of mobile data for development and therefore continue to make progress.

3. Development and commercial strategies should be more aligned to create an enabling environment. Currently they are not. Private sector needs to become a strategic partner to development where multi-annual development funds align with commercial strategy. Mobile network operators continue to invest in their network particularly in developing countries and the digital platform is coming into being in the markets where Digicel operates. But the platform is new and experience is limited within governments, the development community and indeed even within mobile network operators.

We need to see donors actively engage during the development of multi-annual funding facilities….(More)”.

Why We Need to Audit Algorithms


James Guszcza, Iyad Rahwan, Will Bible, Manuel Cebrian and Vic Katyal at Harvard Business Review: “Algorithmic decision-making and artificial intelligence (AI) hold enormous potential and are likely to be economic blockbusters, but we worry that the hype has led many people to overlook the serious problems of introducing algorithms into business and society. Indeed, we see many succumbing to what Microsoft’s Kate Crawford calls “data fundamentalism” — the notion that massive datasets are repositories that yield reliable and objective truths, if only we can extract them using machine learning tools. A more nuanced view is needed. It is by now abundantly clear that, left unchecked, AI algorithms embedded in digital and social technologies can encode societal biases, accelerate the spread of rumors and disinformation, amplify echo chambers of public opinion, hijack our attention, and even impair our mental wellbeing.

Ensuring that societal values are reflected in algorithms and AI technologies will require no less creativity, hard work, and innovation than developing the AI technologies themselves. We have a proposal for a good place to start: auditing. Companies have long been required to issue audited financial statements for the benefit of financial markets and other stakeholders. That’s because — like algorithms — companies’ internal operations appear as “black boxes” to those on the outside. This gives managers an informational advantage over the investing public, which could be abused by unethical actors. Requiring managers to report periodically on their operations provides a check on that advantage. To bolster the trustworthiness of these reports, independent auditors are hired to provide reasonable assurance that the reports coming from the “black box” are free of material misstatement. Should we not subject societally impactful “black box” algorithms to comparable scrutiny?

Indeed, some forward thinking regulators are beginning to explore this possibility. For example, the EU’s General Data Protection Regulation (GDPR) requires that organizations be able to explain their algorithmic decisions. The city of New York recently assembled a task force to study possible biases in algorithmic decision systems. It is reasonable to anticipate that emerging regulations might be met with market pull for services involving algorithmic accountability.

So what might an algorithm auditing discipline look like? First, it should adopt a holistic perspective. Computer science and machine learning methods will be necessary, but likely not sufficient foundations for an algorithm auditing discipline. Strategic thinking, contextually informed professional judgment, communication, and the scientific method are also required.
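
As one narrowly computational ingredient such an audit might include (necessary, but on its own not sufficient), an auditor could compare a model's false-positive rates across demographic groups. The sketch below is a generic illustration with made-up data, not a prescribed audit standard.

```python
import numpy as np

def false_positive_rate_gap(y_true, y_pred, groups):
    """Largest pairwise gap in false-positive rates across groups.

    A large gap flags a disparity an auditor would investigate further,
    alongside questions about data provenance, context, and use.
    """
    y_true, y_pred, groups = map(np.asarray, (y_true, y_pred, groups))
    rates = {}
    for g in np.unique(groups):
        negatives = (groups == g) & (y_true == 0)
        rates[g] = y_pred[negatives].mean() if negatives.any() else float("nan")
    values = list(rates.values())
    return rates, max(values) - min(values)

# Hypothetical audit sample: labels, model predictions, and group membership
y_true = [0, 0, 1, 0, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]
groups = ["a", "a", "a", "b", "b", "b", "b", "b"]
print(false_positive_rate_gap(y_true, y_pred, groups))
```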

As a result, algorithm auditing must be interdisciplinary in order for it to succeed….(More)”.