Declaration on Ethics and Data Protection in Artificial Intelligence


Declaration: “…The 40th International Conference of Data Protection and Privacy Commissioners considers that any creation, development and use of artificial intelligence systems shall fully respect human rights, particularly the rights to the protection of personal data and to privacy, as well as human dignity, non-discrimination and fundamental values, and shall provide solutions to allow individuals to maintain control and understanding of artificial intelligence systems.

The Conference therefore endorses the following guiding principles, as its core values to preserve human rights in the development of artificial intelligence:

  1. Artificial intelligence and machine learning technologies should be designed, developed and used in respect of fundamental human rights and in accordance with the fairness principle, in particular by:
     a. considering individuals’ reasonable expectations by ensuring that the use of artificial intelligence systems remains consistent with their original purposes, and that the data are used in a way that is not incompatible with the original purpose of their collection,
     b. taking into consideration not only the impact that the use of artificial intelligence may have on the individual, but also the collective impact on groups and on society at large,
     c. ensuring that artificial intelligence systems are developed in a way that facilitates human development and does not obstruct or endanger it, thus recognizing the need for delineation and boundaries on certain uses,…(More)

When AI Misjudgment Is Not an Accident


Douglas Yeung at Scientific American: “The conversation about unconscious bias in artificial intelligence often focuses on algorithms that unintentionally cause disproportionate harm to entire swaths of society—those that wrongly predict black defendants will commit future crimes, for example, or facial-recognition technologies developed mainly by using photos of white men that do a poor job of identifying women and people with darker skin.

But the problem could run much deeper than that. Society should be on guard for another twist: the possibility that nefarious actors could seek to attack artificial intelligence systems by deliberately introducing bias into them, smuggled inside the data that helps those systems learn. This could introduce a worrisome new dimension to cyberattacks, disinformation campaigns or the proliferation of fake news.
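To make this mechanism concrete, the sketch below (an editorial illustration, not drawn from the article) poisons a toy training set by flipping labels for one subgroup and shows how a simple classifier's error rate for that group rises; all data, names and numbers are hypothetical assumptions.

```python
# Minimal, hypothetical sketch of a data-poisoning "bias attack":
# flip labels for one subgroup in the training data and measure how the
# trained model's error rate for that subgroup changes.
# (Illustrative only; real attacks and real pipelines are far more complex.)
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
group = rng.integers(0, 2, n)                    # synthetic subgroup membership (0 or 1)
x = rng.normal(size=(n, 3))                      # synthetic features
y = (x[:, 0] + 0.5 * x[:, 1] > 0).astype(int)    # ground-truth labels

def error_by_group(training_labels):
    # Train on the supplied labels, then compare predictions to the true labels per group.
    model = LogisticRegression().fit(np.c_[x, group], training_labels)
    pred = model.predict(np.c_[x, group])
    return [float(np.mean(pred[group == g] != y[group == g])) for g in (0, 1)]

# Poison the training data: flip 30% of labels inside group 1 only.
poisoned = y.copy()
flip = (group == 1) & (rng.random(n) < 0.3)
poisoned[flip] = 1 - poisoned[flip]

print("error by group, clean training data:   ", error_by_group(y))
print("error by group, poisoned training data:", error_by_group(poisoned))
```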

According to a U.S. government study on big data and privacy, biased algorithms could make it easier to mask discriminatory lending, hiring or other unsavory business practices. Algorithms could be designed to take advantage of seemingly innocuous factors that can be discriminatory. Employing existing techniques, but with biased data or algorithms, could make it easier to hide nefarious intent. Commercial data brokers collect and hold onto all kinds of information, such as online browsing or shopping habits, that could be used in this way.

Biased data could also serve as bait. Corporations could release biased data with the hope competitors would use it to train artificial intelligence algorithms, causing competitors to diminish the quality of their own products and consumer confidence in them.

Algorithmic bias attacks could also be used to more easily advance ideological agendas. If hate groups or political advocacy organizations want to target or exclude people on the basis of race, gender, religion or other characteristics, biased algorithms could give them either the justification or more advanced means to directly do so. Biased data also could come into play in redistricting efforts that entrench racial segregation (“redlining”) or restrict voting rights.

Finally, foreign actors could use deliberate bias attacks as a national security threat, destabilizing societies by undermining government legitimacy or sharpening public polarization. This would fit naturally with tactics that reportedly seek to exploit ideological divides by creating social media posts and buying online ads designed to inflame racial tensions….(More)”.

The Lack of Decentralization of Data: Barriers, Exclusivity, and Monopoly in Open Data


Paper by Carla Hamida and Amanda Landi: “Recently, Facebook creator Mark Zuckerberg was on trial for the misuse of personal data. In 2013, the National Security Agency was exposed by Edward Snowden for invading the privacy of inhabitants of the United States by examining personal data. We see in the news examples, like the two just described, of government agencies and private companies being less than truthful about their use of our data. A related issue is that these same government agencies and private companies do not share their own data, and this creates the problem of data openness.

Government, academics, and citizens can play a role in making data more open. At present, there are non-profit organizations that research data openness, such as the Open Data Charter, the Global Open Data Index, and the Open Data Barometer. These organizations use different methods to measure the openness of data, which leads us to ask what open data means, how openness is measured, who decides how open data should be, and to what extent society is affected by the availability, or lack of availability, of data. In this paper, we explore these questions with an examination of two of the non-profit organizations that study the open data problem extensively….(More)”.

Deep Fakes: A Looming Challenge for Privacy, Democracy, and National Security


Paper by Robert Chesney and Danielle Keats Citron: “Harmful lies are nothing new. But the ability to distort reality has taken an exponential leap forward with “deep fake” technology. This capability makes it possible to create audio and video of real people saying and doing things they never said or did. Machine learning techniques are escalating the technology’s sophistication, making deep fakes ever more realistic and increasingly resistant to detection.

Deep-fake technology has characteristics that enable rapid and widespread diffusion, putting it into the hands of both sophisticated and unsophisticated actors. While deep-fake technology will bring with it certain benefits, it also will introduce many harms. The marketplace of ideas already suffers from truth decay as our networked information environment interacts in toxic ways with our cognitive biases. Deep fakes will exacerbate this problem significantly. Individuals and businesses will face novel forms of exploitation, intimidation, and personal sabotage. The risks to our democracy and to national security are profound as well.

Our aim is to provide the first in-depth assessment of the causes and consequences of this disruptive technological change, and to explore the existing and potential tools for responding to it. We survey a broad array of responses, including: the role of technological solutions; criminal penalties, civil liability, and regulatory action; military and covert-action responses; economic sanctions; and market developments. We cover the waterfront from immunities to immutable authentication trails, offering recommendations to improve law and policy and anticipating the pitfalls embedded in various solutions….(More)”.

Privacy and Synthetic Datasets


Paper by Steven M. Bellovin, Preetam K. Dutta and Nathan Reitinger: “Sharing is a virtue, instilled in us from childhood. Unfortunately, when it comes to big data — i.e., databases possessing the potential to usher in a whole new world of scientific progress — the legal landscape prefers a hoggish motif. The historic approach to the resulting database–privacy problem has been anonymization, a subtractive technique incurring not only poor privacy results, but also lackluster utility. In anonymization’s stead, differential privacy arose; it provides better, near-perfect privacy, but is nonetheless subtractive in terms of utility.

Today, another solution is coming to the fore: synthetic data. Using the magic of machine learning, synthetic data offers a generative, additive approach — the creation of almost-but-not-quite replica data. In fact, as we recommend, synthetic data may be combined with differential privacy to achieve a best-of-both-worlds scenario. After unpacking the technical nuances of synthetic data, we analyze its legal implications, finding both over- and under-inclusive applications. Privacy statutes either overweigh or downplay the potential for synthetic data to leak secrets, inviting ambiguity. We conclude by finding that synthetic data is a valid, privacy-conscious alternative to raw data, but is not a cure-all for every situation. In the end, computer science progress must be met with proper policy in order to move the area of useful data dissemination forward….(More)”.
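As a concrete, if simplified, illustration of that combination (an editorial sketch, not the authors' method), the snippet below builds a differentially private histogram of a hypothetical sensitive column and samples synthetic records from it; the epsilon value and the data are assumptions made purely for illustration.

```python
# Toy sketch: synthetic data drawn from a differentially private (noisy) histogram.
import numpy as np

rng = np.random.default_rng(1)
real_ages = rng.integers(18, 90, size=10_000)   # hypothetical sensitive column

bins = np.arange(18, 91)
counts, _ = np.histogram(real_ages, bins=bins)

epsilon = 1.0                                    # assumed privacy budget
sensitivity = 1.0                                # one person changes one bin count by 1
noisy = counts + rng.laplace(scale=sensitivity / epsilon, size=counts.shape)
noisy = np.clip(noisy, 0, None)                  # simplified post-processing

# Sample "almost-but-not-quite replica" records from the noisy distribution.
probs = noisy / noisy.sum()
synthetic_ages = rng.choice(bins[:-1], size=10_000, p=probs)

print("real mean age:     ", real_ages.mean())
print("synthetic mean age:", synthetic_ages.mean())
```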

‘Do Not Track,’ the Privacy Tool Used by Millions of People, Doesn’t Do Anything


Kashmir Hill at Gizmodo: “When you go into the privacy settings on your browser, there’s a little option there to turn on the “Do Not Track” function, which will send an invisible request on your behalf to all the websites you visit telling them not to track you. A reasonable person might think that enabling it will stop a porn site from keeping track of what she watches, or keep Facebook from collecting the addresses of all the places she visits on the internet, or prevent third-party trackers she’s never heard of from following her from site to site. According to a recent survey by Forrester Research, a quarter of American adults use “Do Not Track” to protect their privacy. (Our own stats at Gizmodo Media Group show that 9% of visitors have it turned on.) We’ve got bad news for those millions of privacy-minded people, though: “Do Not Track” is like spray-on sunscreen, a product that makes you feel safe while doing little to actually protect you.

“Do Not Track,” as it was first imagined a decade ago by consumer advocates, was going to be a “Do Not Call” list for the internet, helping to free people from annoying targeted ads and creepy data collection. But only a handful of sites respect the request, the most prominent of which are Pinterest and Medium. (Pinterest won’t use offsite data to target ads to a visitor who’s elected not to be tracked, while Medium won’t send their data to third parties.) The vast majority of sites, including this one, ignore it….(More)”.
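For readers wondering what that “invisible request” actually is: it is simply an HTTP header, DNT: 1, attached to every request the browser makes, and honoring it is entirely voluntary for the receiving site. A rough Python sketch (the URL and the server-side helper are hypothetical):

```python
import requests

# The browser's "invisible request": the same page fetch, with the DNT header set.
response = requests.get("https://example.com/", headers={"DNT": "1"})
print(response.status_code)

# What honoring the signal might look like on the server side (hypothetical helper;
# nothing forces a site to call anything like this, which is the article's point).
def should_track(request_headers: dict) -> bool:
    return request_headers.get("DNT") != "1"

print(should_track({"DNT": "1"}))   # False: the visitor asked not to be tracked
print(should_track({}))             # True: no preference expressed
```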

Here’s What the USMCA Does for Data Innovation


Joshua New at the Center for Data Innovation: “…the Trump administration announced the United States-Mexico-Canada Agreement (USMCA), the trade deal it intends to replace NAFTA with. The parties—Canada, Mexico, and the United States—still have to adopt the deal, and if they do, they will enjoy several welcome provisions that can give a boost to data-driven innovation in all three countries.

First, USMCA is the first trade agreement in the world to promote the publication of open government data. Article 19.18 of the agreement officially recognizes that “facilitating public access to and use of government information fosters economic and social development, competitiveness, and innovation.” Though the deal does not require parties to publish open government data, to the extent they choose to publish this data, it directs them to adhere to best practices for open data, including ensuring it is in open, machine-readable formats. Additionally, the deal directs parties to try to cooperate and identify ways they can expand access to and the use of government data, particularly for the purposes of creating economic opportunity for small and medium-sized businesses. While this is a welcome provision, the United States still needs legislation to ensure that publishing open data becomes an official responsibility of federal government agencies.

Second, Article 19.11 of USMCA prevents parties from restricting “the cross-border transfer of information, including personal information, by electronic means if this activity is for the conduct of the business of a covered person.” Additionally, Article 19.12 prevents parties from requiring people or firms “to use or locate computing facilities in that Party’s territory as a condition for conducting business in that territory.” In effect, these provisions prevent parties from enacting protectionist data localization requirements that inhibit the flow of data across borders. This is important because many countries have disingenuously argued for data localization requirements on the grounds that it protects their citizens from privacy or security harms, despite the location of data having no bearing on either privacy or security, to prop up their domestic data-driven industries….(More)”.

A Right to Reasonable Inferences: Re-Thinking Data Protection Law in the Age of Big Data and AI


Paper by Sandra Wachter and Brent Mittelstadt: “Big Data analytics and artificial intelligence (AI) draw non-intuitive and unverifiable inferences and predictions about the behaviors, preferences, and private lives of individuals. These inferences draw on highly diverse and feature-rich data of unpredictable value, and create new opportunities for discriminatory, biased, and invasive decision-making. Concerns about algorithmic accountability are often actually concerns about the way in which these technologies draw privacy invasive and non-verifiable inferences about us that we cannot predict, understand, or refute.

Data protection law is meant to protect people’s privacy, identity, reputation, and autonomy, but is currently failing to protect data subjects from the novel risks of inferential analytics. The broad concept of personal data in Europe could be interpreted to include inferences, predictions, and assumptions that refer to or impact on an individual. If seen as personal data, individuals are granted numerous rights under data protection law. However, the legal status of inferences is heavily disputed in legal scholarship, and marked by inconsistencies and contradictions within and between the views of the Article 29 Working Party and the European Court of Justice.

As we show in this paper, individuals are granted little control and oversight over how their personal data is used to draw inferences about them. Compared to other types of personal data, inferences are effectively ‘economy class’ personal data in the General Data Protection Regulation (GDPR). Data subjects’ rights to know about (Art 13-15), rectify (Art 16), delete (Art 17), object to (Art 21), or port (Art 20) personal data are significantly curtailed when it comes to inferences, often requiring a greater balance with controller’s interests (e.g. trade secrets, intellectual property) than would otherwise be the case. Similarly, the GDPR provides insufficient protection against sensitive inferences (Art 9) or remedies to challenge inferences or important decisions based on them (Art 22(3))….

In this paper we argue that a new data protection right, the ‘right to reasonable inferences’, is needed to help close the accountability gap currently posed by ‘high risk inferences’, meaning inferences that are privacy invasive or reputation damaging and have low verifiability in the sense of being predictive or opinion-based. In cases where algorithms draw ‘high risk inferences’ about individuals, this right would require ex-ante justification to be given by the data controller to establish whether an inference is reasonable. This disclosure would address (1) why certain data is a relevant basis to draw inferences; (2) why these inferences are relevant for the chosen processing purpose or type of automated decision; and (3) whether the data and methods used to draw the inferences are accurate and statistically reliable. The ex-ante justification is bolstered by an additional ex-post mechanism enabling unreasonable inferences to be challenged. A right to reasonable inferences must, however, be reconciled with EU jurisprudence and counterbalanced with IP and trade secrets law as well as freedom of expression and Article 16 of the EU Charter of Fundamental Rights: the freedom to conduct a business….(More)”.

Human Rights in the Big Data World


Paper by Francis Kuriakose and Deepa Iyer: “An ethical approach to human rights conceives and evaluates law through the underlying value concerns. This paper examines human rights after the introduction of big data using an ethical approach to rights. First, the central value concerns such as equity, equality, sustainability and security are derived from the history of the digital technological revolution. Then, the properties and characteristics of big data are analyzed to understand emerging value concerns such as accountability, transparency, traceability, explainability and disprovability.

Using these value points, this paper argues that big data calls for two types of evaluations regarding human rights. The first is the reassessment of existing human rights in the digital sphere predominantly through right to equality and right to work. The second is the conceptualization of new digital rights such as right to privacy and right against propensity-based discrimination. The paper concludes that as we increasingly share the world with intelligence systems, these new values expand and modify the existing human rights paradigm….(More)”.

Text Analysis Systems Mine Workplace Emails to Measure Staff Sentiments


Alan Rothman at LLRX: “…For all of these good, bad or indifferent workplaces, a key question is whether any of the actions of management to engage the staff and listen to their concerns ever resulted in improved working conditions and higher levels of job satisfaction.

The answer is most often “yes”. Just having a say in, and some sense of control over, our jobs and workflows can indeed have a demonstrable impact on morale, camaraderie and the bottom line. As posited in the Hawthorne Effect, also termed the “Observer Effect”, this was first discovered during studies in the 1920s and 1930s when the management of a factory made improvements to the lighting and work schedules. In turn, worker satisfaction and productivity temporarily increased. This was not so much because there was more light, but rather, that the workers sensed that management was paying attention to, and then acting upon, their concerns. The workers perceived they were no longer just cogs in a machine.

Perhaps, too, the Hawthorne Effect is in some ways the workplace equivalent of the Heisenberg’s Uncertainty Principle in physics. To vastly oversimplify this slippery concept, the mere act of observing a subatomic particle can change its position.¹

Giving the processes of observation, analysis and change at the enterprise level a modern (but non-quantum) spin is a fascinating new article in the September 2018 issue of The Atlantic entitled What Your Boss Could Learn by Reading the Whole Company’s Emails, by Frank Partnoy. I highly recommend a click-through and full read if you have an opportunity. I will summarize and annotate it, and then, considering my own thorough lack of understanding of the basics of y=f(x), pose some of my own physics-free questions….

Today the text analytics business, like the work done by KeenCorp, is thriving. It has been long-established as the processing behind email spam filters. Now it is finding other applications including monitoring corporate reputations on social media and other sites.²

The finance industry is another growth sector, as investment banks and hedge funds scan a wide variety of information sources to locate “slight changes in language” that may point towards pending increases or decreases in share prices. Financial research providers are using artificial intelligence to mine “insights” from their own selections of news and analytical sources.

But is this technology effective?

In a paper entitled Lazy Prices, by Lauren Cohen (Harvard Business School and NBER), Christopher Malloy (Harvard Business School and NBER), and Quoc Nguyen (University of Illinois at Chicago), in a draft dated February 22, 2018, the researchers found that the share price of a company, in this case NetApp, measurably went down after the firm “subtly changed” the “descriptions of certain risks” in its 2010 annual report. Algorithms can detect such changes more quickly and effectively than humans. The company subsequently clarified in its 2011 annual report its “failure to comply” with reporting requirements in 2010. A highly skilled stock analyst “might have missed that phrase”, but once again it was captured by the “researchers’ algorithms”.
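The core of such detection can be surprisingly simple. As a rough sketch of the idea (not the paper's exact method), one can compare the year-over-year similarity of a filing's risk-factor language and flag unusually large changes; the text strings below are invented placeholders, not NetApp's actual filings.

```python
# Flagging subtle year-over-year wording changes via text similarity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

report_2010 = "We may fail to comply with export control regulations ..."      # placeholder
report_2011 = "We failed to comply with certain export control requirements ..."  # placeholder

# Vectorize both filings and compute their cosine similarity; a drop relative
# to a company's usual boilerplate stability is the signal the researchers exploit.
tfidf = TfidfVectorizer().fit_transform([report_2010, report_2011])
similarity = cosine_similarity(tfidf[0], tfidf[1])[0, 0]
print(f"year-over-year similarity: {similarity:.2f}")
```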

In the hands of a “skeptical investor”, this information might well have led them to question the differences between the 2010 and 2011 annual reports and, in turn, saved them a great deal of money. This detection was an early signal of a looming decline in NetApp’s stock. Half a year after the 2011 report’s publication, it was reported that the Syrian government had bought the company’s equipment and “used that equipment to spy on its citizens”, causing further declines.

Now text analytics is being deployed at a new target: the composition of employees’ communications. Although it has been found that workers have no expectation of privacy in their workplace communications, some companies have been reluctant to mine them because of privacy concerns. Even so, companies are finding it more and more challenging to resist the “urge to mine employee information”, especially as text analysis systems continue to improve.

Among the evolving enterprise applications is the use of text analytics by human resources departments to assess overall employee morale. Vibe, for example, is an app that scans communications on Slack, a widely used enterprise messaging platform. Vibe’s algorithm measures the positive and negative emotions of a work team and reports them in real time….(More)”.
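As a closing illustration of what such a system does at its simplest (a toy editorial sketch, not Vibe's actual algorithm and not Slack's API), the snippet below scores a handful of messages against a small word lexicon and reports an average sentiment per channel.

```python
# Toy lexicon-based sentiment monitor for team messages, aggregated per channel.
from collections import defaultdict

POSITIVE = {"great", "thanks", "love", "nice", "shipped", "awesome"}
NEGATIVE = {"blocked", "broken", "frustrated", "late", "bug", "angry"}

def message_score(text: str) -> int:
    # Count positive words minus negative words; real systems use far richer models.
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def team_sentiment(messages):
    """messages: iterable of (channel, text) pairs, e.g. pulled from a chat export."""
    totals = defaultdict(list)
    for channel, text in messages:
        totals[channel].append(message_score(text))
    return {ch: sum(scores) / len(scores) for ch, scores in totals.items()}

sample = [
    ("#release", "shipped the fix, thanks everyone, great work"),
    ("#release", "still blocked on the broken build, frustrated"),
    ("#design", "love the new mockups, nice"),
]
print(team_sentiment(sample))
```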