A Right to Reasonable Inferences: Re-Thinking Data Protection Law in the Age of Big Data and AI

Paper by Sandra Wachter and Brent Mittelstadt: “Big Data analytics and artificial intelligence (AI) draw non-intuitive and unverifiable inferences and predictions about the behaviors, preferences, and private lives of individuals. These inferences draw on highly diverse and feature-rich data of unpredictable value, and create new opportunities for discriminatory, biased, and invasive decision-making. Concerns about algorithmic accountability are often actually concerns about the way in which these technologies draw privacy-invasive and non-verifiable inferences about us that we cannot predict, understand, or refute.

Data protection law is meant to protect people’s privacy, identity, reputation, and autonomy, but is currently failing to protect data subjects from the novel risks of inferential analytics. The broad concept of personal data in Europe could be interpreted to include inferences, predictions, and assumptions that refer to or impact on an individual. If inferences are seen as personal data, individuals are granted numerous rights under data protection law. However, the legal status of inferences is heavily disputed in legal scholarship, and marked by inconsistencies and contradictions within and between the views of the Article 29 Working Party and the European Court of Justice.

As we show in this paper, individuals are granted little control and oversight over how their personal data is used to draw inferences about them. Compared to other types of personal data, inferences are effectively ‘economy class’ personal data in the General Data Protection Regulation (GDPR). Data subjects’ rights to know about (Art 13-15), rectify (Art 16), delete (Art 17), object to (Art 21), or port (Art 20) personal data are significantly curtailed when it comes to inferences, often requiring a greater balance with the controller’s interests (e.g. trade secrets, intellectual property) than would otherwise be the case. Similarly, the GDPR provides insufficient protection against sensitive inferences (Art 9) or remedies to challenge inferences or important decisions based on them (Art 22(3))….

In this paper we argue that a new data protection right, the ‘right to reasonable inferences’, is needed to help close the accountability gap currently posed by ‘high risk inferences’, meaning inferences that are privacy-invasive or reputation-damaging and have low verifiability in the sense of being predictive or opinion-based. In cases where algorithms draw ‘high risk inferences’ about individuals, this right would require ex-ante justification to be given by the data controller to establish whether an inference is reasonable. This disclosure would address (1) why certain data is a relevant basis to draw inferences; (2) why these inferences are relevant for the chosen processing purpose or type of automated decision; and (3) whether the data and methods used to draw the inferences are accurate and statistically reliable. The ex-ante justification is bolstered by an additional ex-post mechanism enabling unreasonable inferences to be challenged. A right to reasonable inferences must, however, be reconciled with EU jurisprudence and counterbalanced with IP and trade secrets law as well as freedom of expression and Article 16 of the EU Charter of Fundamental Rights: the freedom to conduct a business….(More)”.

Human Rights in the Big Data World

Paper by Francis Kuriakose and Deepa Iyer: “An ethical approach to human rights conceives and evaluates law through its underlying value concerns. This paper examines human rights after the introduction of big data using an ethical approach to rights. First, the central value concerns such as equity, equality, sustainability and security are derived from the history of the digital technological revolution. Then, the properties and characteristics of big data are analyzed to understand emerging value concerns such as accountability, transparency, traceability, explainability and disprovability.

Using these value points, this paper argues that big data calls for two types of evaluations regarding human rights. The first is the reassessment of existing human rights in the digital sphere, predominantly through the right to equality and the right to work. The second is the conceptualization of new digital rights such as the right to privacy and the right against propensity-based discrimination. The paper concludes that as we increasingly share the world with intelligent systems, these new values expand and modify the existing human rights paradigm….(More)”.

The law and ethics of big data analytics: A new role for international human rights in the search for global standards

David Nersessian at Business Horizons: “The Economist recently declared that digital information has overtaken oil as the world’s most valuable commodity. Big data technology is inherently global and borderless, yet little international consensus exists over what standards should govern its use. One source of global standards benefitting from considerable international consensus might be used to fill the gap: international human rights law.

This article considers the extent to which international human rights law operates as a legal or ethical constraint on global commercial use of big data technologies. By providing clear baseline standards that apply worldwide, human rights can help shape cultural norms—implemented as ethical practices and global policies and procedures—about what businesses should do with their information technologies. In this way, human rights could play a broad and important role in shaping business thinking about the proper handling of this increasingly valuable commodity in the modern global society…(More)”.

Emerging Labour Market Data Sources towards Digital Technical and Vocational Education and Training (TVET)

Paper by Nikos Askitas, Rafik Mahjoubi, Pedro S. Martins, Koffi Zougbede for Paris21/OECD: “Experience from both technology and policy making shows that solutions for labour market improvements are simply choices of new, more tolerable problems. All data solutions supporting digital Technical and Vocational Education and Training (TVET) will have to incorporate a roadmap of changes rather than an unrealistic super-solution. The ideal situation is a world in which labour market participants engage in intelligent strategic behavior in an informed, fair and sophisticated manner.

Labour market data captures transactions within labour market processes. In order to successfully capture such data, we need to understand the specifics of these market processes. Designing an ecosystem of labour market matching facilitators and rules of engagement for contributing to a lean and streamlined Labour Market Information System (LMIS) is the best way to create Big Data with context relevance. This is in contrast with pre-existing Big Data captured by global job boards or social media, for which relevance is limited by the technology access gap and its variations across the developing world.

Network effects occur in technology and job facilitation, as seen in the developed world. Managing and instigating the right network effects might be crucial to avoid fragmented stagnation and inefficiency. This is key to avoid throwing money behind wrong choices that do not gain traction.

A mixed mode approach is possibly the ideal approach for developing countries. Mixing offline and online elements correctly will be crucial in bridging the technology access gap and reaping the benefits of digitisation at the same time.

Properly incentivising the various entities is critical for progress. This applies especially to the private sector, which is significantly more agile and inventive, has “skin in the game” and a long-term commitment to conditions in the field, has intimate knowledge of how to bridge the technology gap, and brings a better understanding of the particular ambient context in which it operates. To summarise: Big Data starts small.

Managing expectations and creating incentives for the various stakeholders will be crucial in establishing digitally supported TVET. Developing the right business models will be crucial in the short term and beyond, and it will be the result of creating the right mix of technological and policy expertise with good knowledge of the situation on the ground….(More)”.

Don’t forget people in the use of big data for development

Joshua Blumenstock at Nature: “Today, 95% of the global population has mobile-phone coverage, and the number of people who own a phone is rising fast (see ‘Dialling up’)1. Phones generate troves of personal data on billions of people, including those who live on a few dollars a day. So aid organizations, researchers and private companies are looking at ways in which this ‘data revolution’ could transform international development.

Some businesses are starting to make their data and tools available to those trying to solve humanitarian problems. The Earth-imaging company Planet in San Francisco, California, for example, makes its high-resolution satellite pictures freely available after natural disasters so that researchers and aid organizations can coordinate relief efforts. Meanwhile, organizations such as the World Bank and the United Nations are recruiting teams of data scientists to apply their skills in statistics and machine learning to challenges in international development.

But in the rush to find technological solutions to complex global problems there’s a danger of researchers and others being distracted by the technology and losing track of the key hardships and constraints that are unique to each local context. Designing data-enabled applications that work in the real world will require a slower approach that pays much more attention to the people behind the numbers…(More)”.

Is the Government More Entrepreneurial Than You Think?

Freakonomics Radio (Podcast): We all know the standard story: our economy would be more dynamic if only the government would get out of the way. The economist Mariana Mazzucato says we’ve got that story backward. She argues that the government, by funding so much early-stage research, is hugely responsible for big successes in tech, pharma, energy, and more. But the government also does a terrible job in claiming credit — and, more important, getting a return on its investment….

MAZZUCATO: “…And I’ve been thinking about this especially around the big data and the kind of new questions around privacy with Facebook, etc. Instead of having a situation where all the data basically gets captured, which is citizens’ data, by companies which then, in some way, we have to pay into in terms of accessing these great new services — whether they’re free or not, we’re still indirectly paying. We should have the data in some sort of public repository because it’s citizens’ data. The technology itself was funded by the citizens. What would Uber be without GPS, publicly financed? What would Google be without the Internet, publicly financed? So, the tech was financed from the state, the citizens; it’s their data. Why not completely reverse the current relationship and have that data in a public repository which companies actually have to pay into to get access to it under certain strict conditions which could be set by an independent advisory council?… (More)”

What if technologies had their own ethical standards?

European Parliament: “Technologies are often seen either as objects of ethical scrutiny or as challenging traditional ethical norms. The advent of autonomous machines, deep learning and big data techniques, blockchain applications and ‘smart’ technological products raises the need to introduce ethical norms into these devices. The very act of building new and emerging technologies has also become the act of creating specific moral systems within which human and artificial agents will interact through transactions with moral implications. But what if technologies introduced and defined their own ethical standards?…(More)”.

AI and Big Data: A Blueprint for a Human Rights, Social and Ethical Impact Assessment

Alessandro Mantelero in Computer Law & Security Review: “The use of algorithms in modern data processing techniques, as well as data-intensive technological trends, suggests the adoption of a broader view of the data protection impact assessment. This will force data controllers to go beyond the traditional focus on data quality and security, and consider the impact of data processing on fundamental rights and collective social and ethical values.

Building on studies of the collective dimension of data protection, this article sets out to embed this new perspective in an assessment model centred on human rights (the Human Rights, Ethical and Social Impact Assessment, HRESIA). This self-assessment model intends to overcome the limitations of the existing assessment models, which are either too closely focused on data processing or have an extent and granularity that make them too complicated to evaluate the consequences of a given use of data. In terms of architecture, the HRESIA has two main elements: a self-assessment questionnaire and an ad hoc expert committee. As a blueprint, this contribution focuses mainly on the nature of the proposed model, its architecture and its challenges; a more detailed description of the model and the content of the questionnaire will be discussed in a future publication drawing on the ongoing research….(More)”.

Towards Digital Enlightenment: Essays on the Dark and Light Sides of the Digital Revolution

Book edited by Dirk Helbing: “This new collection of essays follows in the footsteps of the successful volume Thinking Ahead – Essays on Big Data, Digital Revolution, and Participatory Market Society, published at a time when our societies were on a path to technological totalitarianism, as exemplified by mass surveillance reported by Edward Snowden and others.

Meanwhile the threats have diversified and tech companies have gathered enough data to create detailed profiles about almost everyone living in the modern world – profiles that can predict our behavior better than our friends, families, or even partners. This is not only used to manipulate people’s opinions and voting behaviors, but more generally to influence consumer behavior at all levels. It is becoming increasingly clear that we are rapidly heading towards a cybernetic society, in which algorithms and social bots aim to control both the societal dynamics and individual behaviors….(More)”.

Origin Privacy: Protecting Privacy in the Big-Data Era

Paper by Helen Nissenbaum, Sebastian Benthall, Anupam Datta, Michael Carl Tschantz, and Piotr Mardziel: “Machine learning over big data poses challenges for our conceptualization of privacy. Such techniques can discover surprising and counterintuitive associations that take innocent-looking data and turn it into important inferences about a person. For example, buying carbon monoxide monitors has been linked to paying credit card bills, while buying chrome-skull car accessories predicts not doing so. Also, Target may have used the buying of scent-free hand lotion and vitamins as a sign that the buyer is pregnant. If we take pregnancy status to be private and assume that we should prohibit the sharing of information that can reveal that fact, then we have created an unworkable notion of privacy, one in which sharing any scrap of data may violate privacy.

Prior technical specifications of privacy depend on the classification of certain types of information as private or sensitive; privacy policies in these frameworks limit access to data that allow inference of this sensitive information. As the above examples show, today’s data-rich world creates a new kind of problem: it is difficult if not impossible to guarantee that information does not allow inference of sensitive topics. This makes information flow rules based on information topic unstable.

We address the problem of providing a workable definition of private data that takes into account emerging threats to privacy from large-scale data collection systems. We build on Contextual Integrity and its claim that privacy is appropriate information flow, or flow according to socially or legally specified rules.

As in other adaptations of Contextual Integrity (CI) to computer science, the parameterization of social norms in CI is translated into a logical specification. In this work, we depart from CI by considering rules that restrict information flow based on its origin and provenance, instead of on its type, topic, or subject.

We call this concept of privacy as adherence to origin-based rules Origin Privacy. Origin Privacy rules can be found in some existing data protection laws. This motivates the computational implementation of origin-based rules for the simple purpose of compliance engineering. We also formally model origin privacy to determine what security properties it guarantees relative to the concerns that motivate it….(More)”.
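The contrast the authors draw between topic-based and origin-based flow rules can be sketched in code. The fragment below is an illustrative reading of the idea, not the paper’s formal model; the data items, origins, destinations, and policy structure are all hypothetical.

```python
# Illustrative sketch (not from the paper): a topic-based rule blocks flows
# by inferred subject matter, which is unstable because sensitive topics can
# be inferred from untagged data; an origin-based rule depends only on where
# the data came from and where it is going. All names are hypothetical.

from dataclasses import dataclass

@dataclass(frozen=True)
class DataItem:
    content: str
    origin: str                      # context in which the data was first collected
    topics: frozenset = frozenset()  # tagged subject matter; inference may reveal more

def topic_rule_allows(item: DataItem, sensitive_topics: set) -> bool:
    # Unstable: an item whose tags look innocent may still allow inference
    # of a sensitive topic (e.g. lotion purchases revealing pregnancy).
    return not (item.topics & sensitive_topics)

def origin_rule_allows(item: DataItem, destination: str, allowed_flows: dict) -> bool:
    # Stable under inference: the decision ignores content and topic entirely
    # and checks only whether this origin-to-destination flow is permitted.
    return destination in allowed_flows.get(item.origin, set())

# Example policy: purchase records may flow to the retailer's fulfilment
# system but not to an ad broker, regardless of what they might reveal.
allowed = {"retail-checkout": {"fulfilment"}}
purchase = DataItem("scent-free lotion", origin="retail-checkout")

print(origin_rule_allows(purchase, "fulfilment", allowed))  # True
print(origin_rule_allows(purchase, "ad-broker", allowed))   # False
```

Note how `topic_rule_allows` would also permit sharing the lotion purchase (its tags carry no sensitive topic), even though, on the paper’s account, that flow can reveal pregnancy status: this is exactly the instability that motivates origin-based rules.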