AI Ethics Needs Good Data


Paper by Angela Daly, S Kate Devitt, and Monique Mann: “In this chapter we argue that discourses on AI must transcend the language of ‘ethics’ and engage with power and political economy in order to constitute ‘Good Data’. In particular, we must move beyond the depoliticised language of ‘ethics’ currently deployed (Wagner 2018) in determining whether AI is ‘good’ given the limitations of ethics as a frame through which AI issues can be viewed. In order to circumvent these limits, we use instead the language and conceptualisation of ‘Good Data’, as a more expansive term to elucidate the values, rights and interests at stake when it comes to AI’s development and deployment, as well as that of other digital technologies.

Good Data considerations move beyond recurring themes of data protection/privacy and the FAT (fairness, transparency and accountability) movement to include explicit political economy critiques of power. Instead of yet more ethics principles (that tend to say the same or similar things anyway), we offer four ‘pillars’ on which Good Data AI can be built: community, rights, usability and politics. Overall, we view AI’s ‘goodness’ as an explicitly political (economy) question of power and one which is always related to the degree to which AI is created and used to increase the wellbeing of society and especially to increase the power of the most marginalized and disenfranchised. We offer recommendations and remedies towards implementing ‘better’ approaches towards AI. Our strategies enable a different (but complementary) kind of evaluation of AI as part of the broader socio-technical systems in which AI is built and deployed….(More)”.

Towards Algorithm Auditing: A Survey on Managing Legal, Ethical and Technological Risks of AI, ML and Associated Algorithms


Paper by Adriano Koshiyama: “Business reliance on algorithms is becoming ubiquitous, and companies are increasingly concerned about their algorithms causing major financial or reputational damage. High-profile cases include VW’s Dieselgate scandal with fines worth $34.69B, Knight Capital’s bankruptcy (~$450M) caused by a glitch in its algorithmic trading system, and Amazon’s AI recruiting tool being scrapped after showing bias against women. In response, governments are legislating and imposing bans, regulators are fining companies, and the Judiciary is discussing potentially making algorithms artificial “persons” in Law.

Soon there will be ‘billions’ of algorithms making decisions with minimal human intervention: from autonomous vehicles and finance to medical treatment, employment, and legal decisions. Indeed, scaling to problems beyond the human is a major point of using such algorithms in the first place. As with Financial Audit, governments, business and society will require Algorithm Audit: formal assurance that algorithms are legal, ethical and safe. A new industry is envisaged: Auditing and Assurance of Algorithms (cf. Data privacy), with the remit to professionalize and industrialize AI, ML and associated algorithms.

The stakeholders range from those working on policy and regulation, to industry practitioners and developers. We also anticipate the nature and scope of the auditing levels and framework presented will inform those interested in systems of governance and compliance to regulation/standards. Our goal in this paper is to survey the key areas necessary to perform auditing and assurance, and instigate the debate in this novel area of research and practice….(More)”.
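
To make the idea of an algorithm audit more concrete, the following is a minimal Python sketch of one narrow type of check such an audit could formalize: measuring group-fairness gaps in a model's predictions. The metrics shown (demographic parity and equal opportunity differences) are standard in the fairness literature; the synthetic data and function names are illustrative assumptions and do not reproduce the auditing framework surveyed in the paper.

```python
import numpy as np

def demographic_parity_difference(y_pred, group):
    """Gap in positive-prediction rates between two groups (coded 0/1)."""
    rate_a = y_pred[group == 0].mean()
    rate_b = y_pred[group == 1].mean()
    return abs(rate_a - rate_b)

def equal_opportunity_difference(y_true, y_pred, group):
    """Gap in true-positive rates between two groups (coded 0/1)."""
    tpr = []
    for g in (0, 1):
        mask = (group == g) & (y_true == 1)
        tpr.append(y_pred[mask].mean())
    return abs(tpr[0] - tpr[1])

# Synthetic audit data: model predictions, ground truth and a protected attribute.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, 1000)
group = rng.integers(0, 2, 1000)
y_pred = rng.integers(0, 2, 1000)

print("Demographic parity gap:", demographic_parity_difference(y_pred, group))
print("Equal opportunity gap:", equal_opportunity_difference(y_true, y_pred, group))
```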

Pretty Good Phone Privacy


Paper by Paul Schmitt and Barath Raghavan: “To receive service in today’s cellular architecture, phones uniquely identify themselves to towers and thus to operators. This is now a cause of major privacy violations, as operators sell and leak identity and location data of hundreds of millions of mobile users. In this paper, we take an end-to-end perspective on the cellular architecture and find key points of decoupling that enable us to protect user identity and location privacy with no changes to physical infrastructure, no added latency, and no requirement of direct cooperation from existing operators. We describe Pretty Good Phone Privacy (PGPP) and demonstrate how our modified backend stack (NGC) works with real phones to provide ordinary yet privacy-preserving connectivity. We explore inherent privacy and efficiency tradeoffs in a simulation of a large metropolitan region. We show how PGPP maintains today’s control overheads while significantly improving user identity and location privacy…(More)”.
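
The core decoupling idea, namely paying for service under a known identity while attaching to the network anonymously, can be sketched in a few lines. The toy Python below uses plain single-use random tokens; the class names and flow are illustrative assumptions, and a real deployment such as PGPP would need blind-signature style tokens so that even the issuer cannot link an attachment back to a subscriber.

```python
import secrets

class BillingServer:
    """Issues connectivity tokens at purchase time. In this toy version it could
    still link tokens to buyers; blind signatures would prevent that in practice."""
    def __init__(self):
        self.valid_tokens = set()

    def purchase_tokens(self, subscriber_id, n):
        # Payment is tied to the subscriber; the tokens themselves carry no identity.
        tokens = [secrets.token_hex(16) for _ in range(n)]
        self.valid_tokens.update(tokens)
        return tokens

class CoreNetwork:
    """Grants connectivity if a token is valid and unspent, without learning who attaches."""
    def __init__(self, billing):
        self.billing = billing
        self.spent = set()

    def attach(self, token):
        if token in self.billing.valid_tokens and token not in self.spent:
            self.spent.add(token)
            return True
        return False

billing = BillingServer()
core = CoreNetwork(billing)
tokens = billing.purchase_tokens("alice", 3)   # identity known only at purchase time
print(core.attach(tokens[0]))  # True: connectivity granted without revealing identity
print(core.attach(tokens[0]))  # False: tokens are single-use
```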

Mini-Publics and the Wider Public: The Perceived Legitimacy of Randomly Selecting Citizen Representatives


Paper by James Pow: “There are two important dimensions to the membership of mini-publics that are distinct from the membership of conventional representative institutions: the selection mechanism (sortition) and the profile of the body’s eligible membership (‘ordinary’ citizens). This article examines the effects of these design features on perceived legitimacy. A survey experiment in the deeply divided context of Northern Ireland finds no evidence that variation in mini-public selection features has an overall effect on perceived legitimacy, but there are important individual-level differences….(More)”.

Fostering trustworthy data sharing: Establishing data foundations in practice


Paper by Sophie Stalla-Bourdillon, Laura Carmichael and Alexsis Wintour: “Independent data stewardship remains a core component of good data governance practice. Yet, there is a need for more robust independent data stewardship models that are able to oversee data-driven, multi-party data sharing, usage and re-usage, which can better incorporate citizen representation, especially in relation to personal data. We propose that data foundations—inspired by Channel Islands’ foundations laws—provide a workable model for good data governance not only in the Channel Islands, but also elsewhere. A key advantage of this model—in addition to leveraging existing legislation and building on established precedent—is the statutory role of the guardian, which is a unique requirement in the Channel Islands and which, when interpreted in a data governance model, provides the independent data steward. The principal purpose of this paper, therefore, is to demonstrate why data foundations are well suited to the needs of data sharing initiatives. We further examine how data foundations could be established in practice—and provide key design principles that should be used to guide the design and development of any data foundation….(More)”.

Tracking COVID-19 using online search


Paper by Vasileios Lampos et al: “Previous research has demonstrated that various properties of infectious diseases can be inferred from online search behaviour. In this work we use time series of online search query frequencies to gain insights about the prevalence of COVID-19 in multiple countries. We first develop unsupervised modelling techniques based on associated symptom categories identified by the United Kingdom’s National Health Service and Public Health England. We then attempt to minimise an expected bias in these signals caused by public interest—as opposed to infections—using the proportion of news media coverage devoted to COVID-19 as a proxy indicator. Our analysis indicates that models based on online searches precede the reported confirmed cases and deaths by 16.7 (10.2–23.2) and 22.1 (17.4–26.9) days, respectively. We also investigate transfer learning techniques for mapping supervised models from countries where the spread of the disease has progressed extensively to countries that are in earlier phases of their respective epidemic curves. Furthermore, we compare time series of online search activity against confirmed COVID-19 cases or deaths jointly across multiple countries, uncovering interesting querying patterns, including the finding that rarer symptoms are better predictors than common ones. Finally, we show that web searches improve the short-term forecasting accuracy of autoregressive models for COVID-19 deaths. Our work provides evidence that online search data can be used to develop complementary public health surveillance methods to help inform the COVID-19 response in conjunction with more established approaches….(More)”.
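
As a rough illustration of one step described above, estimating how far a search-based signal leads reported outcomes, here is a minimal Python sketch that picks the lag maximising the correlation between a synthetic query-frequency series and a synthetic outcome series. The data and the simple cross-correlation approach are illustrative assumptions; the paper's actual models are considerably more elaborate.

```python
import numpy as np

def best_leading_lag(search, outcome, max_lag=30):
    """Return the lag (in days) at which the search series best correlates
    with the later outcome series, i.e. how far search 'leads' the outcome."""
    best_lag, best_corr = 0, -np.inf
    for lag in range(max_lag + 1):
        s = search[:len(search) - lag]
        o = outcome[lag:]
        r = np.corrcoef(s, o)[0, 1]
        if r > best_corr:
            best_lag, best_corr = lag, r
    return best_lag, best_corr

# Synthetic example: the outcome is a noisy copy of the search signal delayed by 20 days.
rng = np.random.default_rng(1)
t = np.arange(200)
search = np.exp(-((t - 80) ** 2) / 400)            # epidemic-like bump in search interest
outcome = np.roll(search, 20) + 0.02 * rng.standard_normal(200)
outcome[:20] = 0                                   # discard wrap-around from np.roll

lag, corr = best_leading_lag(search, outcome)
print(f"search leads outcome by ~{lag} days (r = {corr:.2f})")
```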

Politics and Open Science: How the European Open Science Cloud Became Reality (the Untold Story)


Jean-Claude Burgelman at Data Intelligence: “This article will document how the European Open Science Cloud (EOSC) emerged as one of the key policy intentions to foster Open Science (OS) in Europe. It will describe some of the typical, non-rational roadblocks on the way to implement EOSC. The article will also argue that the only way Europe can take care of its research data in a way that fits the European specificities fully, is by supporting EOSC.

It is fair to say—note the word FAIR here—that realizing the European Open Science Cloud (EOSC) is now part and parcel of the European Data Science (DS) policy. In particular since, from 2021, EOSC will be in the hands of the independent EOSC Association and thus potentially way out of the so-called “Brussels Bubble”.

This article will document the whole story of how EOSC emerged in this “bubble” as one of the policy intentions to foster Open Science (OS) in Europe. In addition, it will describe some of the typical, non-rational roadblocks on the way to implement EOSC. The article will also argue that the only way Europe can take care of its research data in a way that fits the European specificities fully, is by supporting EOSC….(More)”

The Janus Face of the Liberal International Information Order: When Global Institutions Are Self-Undermining


Paper by Henry Farrell and Abraham L. Newman: “Scholars and policymakers long believed that norms of global information openness and private-sector governance helped to sustain and promote liberalism. These norms are being increasingly contested within liberal democracies. In this article, we argue that a key source of debate over the Liberal International Information Order (LIIO), a sub-order of the Liberal International Order (LIO), is generated internally by “self-undermining feedback effects,” that is, mechanisms through which institutional arrangements undermine their own political conditions of survival over time. Empirically, we demonstrate how global governance of the Internet, transnational disinformation campaigns, and domestic information governance interact to sow the seeds of this contention. In particular, illiberal states converted norms of openness into a vector of attack, unsettling political bargains in liberal states concerning the LIIO. More generally, we set out a broader research agenda to show how the international relations discipline might better understand institutional change as well as the informational aspects of the current crisis in the LIO….(More)”

How a Google Street View image of your house predicts your risk of a car accident


MIT Technology Review: “Google Street View has become a surprisingly useful way to learn about the world without stepping into it. People use it to plan journeys, to explore holiday destinations, and to virtually stalk friends and enemies alike.

But researchers have found more insidious uses. In 2017 a team of researchers used the images to study the distribution of car types in the US and then used that data to determine the demographic makeup of the country. It turns out that the car you drive is a surprisingly reliable proxy for your income level, your education, your occupation, and even the way you vote in elections.

[Image: Street View of houses in Poland]

Now a different group has gone even further. Łukasz Kidziński at Stanford University in California and Kinga Kita-Wojciechowska at the University of Warsaw in Poland have used Street View images of people’s houses to determine how likely they are to be involved in a car accident. That’s valuable information that an insurance company could use to set premiums.

The result raises important questions about the way personal information can leak from seemingly innocent data sets and whether organizations should be able to use it for commercial purposes.

Insurance data

The researchers’ method is straightforward. They began with a data set of 20,000 records of people who had taken out car insurance in Poland between 2013 and 2015. These were randomly selected from the database of an undisclosed insurance company.

Each record included the address of the policyholder and the number of damage claims he or she made during the 2013–’15 period. The insurer also shared its own prediction of future claims, calculated using its state-of-the-art risk model that takes into account the policyholder’s zip code and the driver’s age, sex, claim history, and so on.

The question that Kidziński and Kita-Wojciechowska investigated is whether they could make a more accurate prediction using a Google Street View image of the policyholder’s house….(More)”.
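
To make the modelling setup concrete, here is a hypothetical Python sketch of the kind of comparison involved: a baseline claims model using the insurer's actuarial variables versus the same model augmented with features annotated from Street View images of the policyholder's house. The feature names, synthetic data and choice of classifier are illustrative assumptions, not the authors' actual pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)
n = 20_000  # matches the size of the insurer's sample described above

# Baseline actuarial variables (illustrative): driver age, claim history, zip-code risk.
baseline = rng.standard_normal((n, 3))
# Hypothetical features annotated from Street View: house type, condition, density.
image_feats = rng.standard_normal((n, 3))
# Synthetic target: whether the policyholder filed a claim during the policy period.
logits = 0.8 * baseline[:, 0] + 0.5 * image_feats[:, 1] - 2.0
y = (rng.random(n) < 1 / (1 + np.exp(-logits))).astype(int)

def auc_for(features):
    X_tr, X_te, y_tr, y_te = train_test_split(features, y, random_state=0)
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])

print("baseline only:          ", round(auc_for(baseline), 3))
print("baseline + image feats: ", round(auc_for(np.hstack([baseline, image_feats])), 3))
```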

Economic complexity theory and applications


Paper by César A. Hidalgo: “Economic complexity methods have become popular tools in economic geography, international development and innovation studies. Here, I review economic complexity theory and applications, with a particular focus on two streams of literature: the literature on relatedness, which focuses on the evolution of specialization patterns, and the literature on metrics of economic complexity, which uses dimensionality reduction techniques to create metrics of economic sophistication that are predictive of variations in income, economic growth, emissions and income inequality….(More)”.
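
As an illustration of the dimensionality-reduction idea behind these metrics, here is a minimal Python sketch of the standard eigenvector formulation of the Economic Complexity Index (ECI), computed from a toy binary country-by-product matrix. The toy matrix is invented; real applications derive it from trade data using a revealed-comparative-advantage threshold.

```python
import numpy as np

def eci(M):
    """Economic Complexity Index from a binary country-by-product matrix M,
    where M[c, p] = 1 if country c exports product p with comparative advantage."""
    diversity = M.sum(axis=1)            # k_c,0: number of products per country
    ubiquity = M.sum(axis=0)             # k_p,0: number of countries per product
    # M_tilde[c, c'] = sum_p M[c, p] * M[c', p] / (k_c,0 * k_p,0)
    M_tilde = (M / diversity[:, None]) @ (M / ubiquity).T
    eigvals, eigvecs = np.linalg.eig(M_tilde)
    order = np.argsort(eigvals.real)[::-1]
    v = eigvecs[:, order[1]].real        # eigenvector of the second-largest eigenvalue
    v = (v - v.mean()) / v.std()         # standardize to zero mean, unit variance
    # Fix the sign so that more diversified economies score higher.
    if np.corrcoef(v, diversity)[0, 1] < 0:
        v = -v
    return v

# Toy example: 4 countries x 5 products (1 = exports with comparative advantage).
M = np.array([
    [1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 0, 0, 0, 1],
])
print(eci(M))
```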