This is how computers “predict the future”


Dan Kopf at Quartz: “The poetically named “random forest” is one of data science’s most-loved prediction algorithms. Developed primarily by statistician Leo Breiman in the 1990s, the random forest is cherished for its simplicity. Though it is not always the most accurate prediction method for a given problem, it holds a special place in machine learning because even those new to data science can implement and understand this powerful algorithm.

This was the algorithm used in an exciting 2017 study on suicide predictions, conducted by biomedical-informatics specialist Colin Walsh of Vanderbilt University and psychologists Jessica Ribeiro and Joseph Franklin of Florida State University. Their goal was to take what they knew about a set of 5,000 patients with a history of self-injury, and see if they could use those data to predict the likelihood that those patients would commit suicide. The study was done retrospectively. Sadly, almost 2,000 of these patients had killed themselves by the time the research was underway.

Altogether, the researchers had over 1,300 different characteristics they could use to make their predictions, including age, gender, and various aspects of the individuals’ medical histories. If the predictions from the algorithm proved to be accurate, the algorithm could theoretically be used in the future to identify people at high risk of suicide, and deliver targeted programs to them. That would be a very good thing.

Predictive algorithms are everywhere. In an age when data are plentiful and computing power is mighty and cheap, data scientists increasingly take information on people, companies, and markets—whether given willingly or harvested surreptitiously—and use it to guess the future. Algorithms predict what movie we might want to watch next, which stocks will increase in value, and which advertisement we’re most likely to respond to on social media. Artificial-intelligence tools, like those used for self-driving cars, often rely on predictive algorithms for decision making….(More)”.
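For readers curious about what a random forest does mechanically, here is a minimal sketch in plain Python. It is not the code used in the study described above: the two numeric features, the labels, and every function name (gini, build_tree, fit_forest, predict_forest) are invented for illustration. It shows only the core idea: train many small decision trees, each on a bootstrap resample of the data and a random subset of features, then let the trees vote.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity: 0.0 when every label in the group is the same."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def build_tree(rows, labels, n_features, rng, depth=0, max_depth=3):
    """Grow one small tree; a leaf is a bare label, a node is a 4-tuple."""
    if depth >= max_depth or len(set(labels)) == 1:
        return Counter(labels).most_common(1)[0][0]
    best = None
    # Random feature subset: the "random" part of a random forest.
    for f in rng.sample(range(len(rows[0])), n_features):
        for t in sorted({r[f] for r in rows}):
            left = [i for i, r in enumerate(rows) if r[f] <= t]
            right = [i for i, r in enumerate(rows) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini([labels[i] for i in left])
                     + len(right) * gini([labels[i] for i in right])) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:  # no usable split: fall back to a majority-vote leaf
        return Counter(labels).most_common(1)[0][0]
    _, f, t, left, right = best
    return (
        f, t,
        build_tree([rows[i] for i in left], [labels[i] for i in left],
                   n_features, rng, depth + 1, max_depth),
        build_tree([rows[i] for i in right], [labels[i] for i in right],
                   n_features, rng, depth + 1, max_depth),
    )


def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf's label."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node


def fit_forest(rows, labels, n_trees=25, n_features=1, seed=0):
    """Train each tree on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], n_features, rng))
    return forest


def predict_forest(forest, row):
    """Majority vote across all trees."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]


# Made-up toy data: two numeric features, label 1 for the "high" group.
rows = [[1, 1], [2, 1], [1, 2], [2, 2], [8, 9], [9, 8], [8, 8], [9, 9]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
forest = fit_forest(rows, labels)
print(predict_forest(forest, [0, 0]), predict_forest(forest, [10, 10]))
```

A real application, with over a thousand characteristics per person, would normally rely on an established library and careful validation rather than a hand-rolled sketch like this one.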

The Moral Machine experiment


Jean-François Bonnefon, Iyad Rahwan et al. in Nature: “With the rapid development of artificial intelligence have come concerns about how machines will make moral decisions, and the major challenge of quantifying societal expectations about the ethical principles that should guide machine behaviour. To address this challenge, we deployed the Moral Machine, an online experimental platform designed to explore the moral dilemmas faced by autonomous vehicles.

This platform gathered 40 million decisions in ten languages from millions of people in 233 countries and territories. Here we describe the results of this experiment. First, we summarize global moral preferences. Second, we document individual variations in preferences, based on respondents’ demographics. Third, we report cross-cultural ethical variation, and uncover three major clusters of countries. Fourth, we show that these differences correlate with modern institutions and deep cultural traits. We discuss how these preferences can contribute to developing global, socially acceptable principles for machine ethics. All data used in this article are publicly available….(More)”.

Deep Fakes: A Looming Challenge for Privacy, Democracy, and National Security


Paper by Robert Chesney and Danielle Keats Citron: “Harmful lies are nothing new. But the ability to distort reality has taken an exponential leap forward with “deep fake” technology. This capability makes it possible to create audio and video of real people saying and doing things they never said or did. Machine learning techniques are escalating the technology’s sophistication, making deep fakes ever more realistic and increasingly resistant to detection.

Deep-fake technology has characteristics that enable rapid and widespread diffusion, putting it into the hands of both sophisticated and unsophisticated actors. While deep-fake technology will bring with it certain benefits, it also will introduce many harms. The marketplace of ideas already suffers from truth decay as our networked information environment interacts in toxic ways with our cognitive biases. Deep fakes will exacerbate this problem significantly. Individuals and businesses will face novel forms of exploitation, intimidation, and personal sabotage. The risks to our democracy and to national security are profound as well.

Our aim is to provide the first in-depth assessment of the causes and consequences of this disruptive technological change, and to explore the existing and potential tools for responding to it. We survey a broad array of responses, including: the role of technological solutions; criminal penalties, civil liability, and regulatory action; military and covert-action responses; economic sanctions; and market developments. We cover the waterfront from immunities to immutable authentication trails, offering recommendations to improve law and policy and anticipating the pitfalls embedded in various solutions….(More)”.

Privacy and Synthetic Datasets


Paper by Steven M. Bellovin, Preetam K. Dutta and Nathan Reitinger: “Sharing is a virtue, instilled in us from childhood. Unfortunately, when it comes to big data — i.e., databases possessing the potential to usher in a whole new world of scientific progress — the legal landscape prefers a hoggish motif. The historic approach to the resulting database–privacy problem has been anonymization, a subtractive technique incurring not only poor privacy results, but also lackluster utility. In anonymization’s stead, differential privacy arose; it provides better, near-perfect privacy, but is nonetheless subtractive in terms of utility.

Today, another solution is leaning into the fore, synthetic data. Using the magic of machine learning, synthetic data offers a generative, additive approach — the creation of almost-but-not-quite replica data. In fact, as we recommend, synthetic data may be combined with differential privacy to achieve a best-of-both-worlds scenario. After unpacking the technical nuances of synthetic data, we analyze its legal implications, finding both over and under inclusive applications. Privacy statutes either overweigh or downplay the potential for synthetic data to leak secrets, inviting ambiguity. We conclude by finding that synthetic data is a valid, privacy-conscious alternative to raw data, but is not a cure-all for every situation. In the end, computer science progress must be met with proper policy in order to move the area of useful data dissemination forward….(More)”.
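To make the utility trade-off of differential privacy concrete, here is a small Python sketch of the Laplace mechanism, one standard way to answer a counting query with differential privacy. The excerpt does not say which mechanism the authors have in mind, so this choice is an assumption, and the records and the names laplace_noise and private_count are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Laplace(0, scale) noise, drawn as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Noisy count of matching records under epsilon-differential privacy.

    Adding or removing one record changes a count by at most 1
    (sensitivity 1), so the noise scale is 1 / epsilon.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical records, invented for this example.
records = [
    {"age": 34, "smoker": True},
    {"age": 51, "smoker": False},
    {"age": 29, "smoker": True},
    {"age": 62, "smoker": True},
    {"age": 45, "smoker": False},
]

rng = random.Random(42)
noisy = private_count(records, lambda r: r["smoker"], epsilon=1.0, rng=rng)
print(f"true count: 3, released count: {noisy:.2f}")
```

A smaller epsilon means more noise and stronger privacy, which is the sense in which the approach costs utility: the released answer is deliberately less accurate than the true one.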

Wearable device data and AI can reduce health care costs and paperwork


Darrell West at Brookings: “Though digital technology has transformed nearly every corner of the economy in recent years, the health care industry seems stubbornly immune to these trends. That may soon change if more wearable devices record medical information that physicians can use to diagnose and treat illnesses at earlier stages. Last month, Apple announced that an FDA-approved electrocardiograph (EKG) will be included in the latest generation Apple Watch to check the heart’s electrical activity for signs of arrhythmia. However, the availability of this data does not guarantee that health care providers are currently equipped to process all of it. To cope with growing amounts of medical data from wearable devices, health care providers may need to adopt artificial intelligence that can identify data trends and spot any deviations that indicate illness. Greater medical data, accompanied by artificial intelligence to analyze it, could expand the capabilities of human health care providers and offer better outcomes at lower costs for patients….

By 2016, American health care spending had already ballooned to 17.9 percent of GDP. The rise in spending saw a parallel rise in health care employment. Patients still need doctors, nurses, and health aides to administer care, yet these health care professionals might not yet be able to make sense of the massive quantities of data coming from wearable devices. Doctors already spend much of their time filling out paperwork, which leaves less time to interact with patients. The opportunity may arise for artificial intelligence to analyze the coming flood of data from wearable devices. Tracking small changes as they happen could make a large difference in diagnosis and treatment: AI could detect abnormal heartbeat, respiration, or other signs that indicate worsening health. Catching symptoms before they worsen may be key to improving health outcomes and lowering costs….(More)”.
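As a rough illustration of what "spotting deviations" in wearable data could mean, the Python sketch below flags heart-rate readings that stray far from a trailing baseline. It is a simple statistical rule standing in for the AI systems the article envisions, not a clinical method; the readings, the window size, the threshold, and the function name flag_deviations are all made up for this example.

```python
from statistics import mean, stdev


def flag_deviations(readings, window=5, threshold=3.0):
    """Return indexes of readings far from the trailing-window average.

    A reading is flagged when it differs from the mean of the previous
    `window` readings by more than `threshold` standard deviations.
    """
    flagged = []
    for i in range(window, len(readings)):
        baseline = readings[i - window:i]
        mu, sigma = mean(baseline), stdev(baseline)
        if sigma > 0 and abs(readings[i] - mu) > threshold * sigma:
            flagged.append(i)
    return flagged


# Hypothetical resting heart-rate samples (beats per minute).
heart_rate = [62, 64, 63, 65, 63, 64, 62, 118, 64, 63]
print(flag_deviations(heart_rate))  # → [7]
```

Only the jump to 118 is flagged. A production system would need per-patient baselines, clinical validation, and handling for sensor noise, none of which this sketch attempts.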

How pro-trust initiatives are taking over the Internet


Sara Fisher at Axios: “Dozens of new initiatives have launched over the past few years to address fake news and the erosion of faith in the media, creating a measurement problem of its own.

Why it matters: So many new efforts are launching simultaneously to solve the same problem that it’s become difficult to track which ones do what and which ones are partnering with each other….

To name a few:

  • The Trust Project, which is made up of dozens of global news companies, announced this morning that the number of journalism organizations using the global network’s “Trust Indicators” now totals 120, making it one of the larger global initiatives to combat fake news. Some of these groups (like NewsGuard) work with Trust Project and are a part of it.
  • News Integrity Initiative (Facebook, Craig Newmark Philanthropic Fund, Ford Foundation, Democracy Fund, John S. and James L. Knight Foundation, Tow Foundation, AppNexus, Mozilla and Betaworks)
  • NewsGuard (Longtime journalists and media entrepreneurs Steven Brill and Gordon Crovitz)
  • The Journalism Trust Initiative (Reporters Without Borders, Agence France Presse, the European Broadcasting Union and the Global Editors Network)
  • Internews (Longtime international non-profit)
  • Accountability Journalism Program (American Press Institute)
  • Trusting News (Reynolds Journalism Institute)
  • Media Manipulation Initiative (Data & Society)
  • Deepnews.ai (Frédéric Filloux)
  • Trust & News Initiative (Knight Foundation, Facebook and Craig Newmark in affiliation with Duke University)
  • Our.News (Independently run)
  • WikiTribune (Wikipedia founder Jimmy Wales)

There are also dozens of fact-checking efforts being championed by different third-parties, as well as efforts being built around blockchain and artificial intelligence.

Between the lines: Most of these efforts include some sort of mechanism for allowing readers to physically discern real journalism from fake news via some sort of badge or watermark, but that presents problems as well.

  • Attempts to flag or call out news as being real and valid have in the past been rejected even further by those who wish to discredit vetted media.
  • For example, Facebook said in December that it will no longer use “Disputed Flags” — red flags next to fake news articles — to identify fake news for users, because it found that “putting a strong image, like a red flag, next to an article may actually entrench deeply held beliefs – the opposite effect to what we intended.”…(More)”.

Governing artificial intelligence: ethical, legal, and technical opportunities and challenges


Introduction to the Special Issue of the Philosophical Transactions of the Royal Society by Sandra Wachter, Brent Mittelstadt, Luciano Floridi and Corinne Cath: “Artificial intelligence (AI) increasingly permeates every aspect of our society, from the critical, like urban infrastructure, law enforcement, banking, healthcare and humanitarian aid, to the mundane like dating. AI, including embodied AI in robotics and techniques like machine learning, can improve economic, social welfare and the exercise of human rights. Owing to the proliferation of AI in high-risk areas, the pressure is mounting to design and govern AI to be accountable, fair and transparent. How can this be achieved and through which frameworks? This is one of the central questions addressed in this special issue, in which eight authors present in-depth analyses of the ethical, legal-regulatory and technical challenges posed by developing governance regimes for AI systems. It also gives a brief overview of recent developments in AI governance, how much of the agenda for defining AI regulation, ethical frameworks and technical approaches is set, as well as providing some concrete suggestions to further the debate on AI governance…(More)”.

Governing Artificial Intelligence: Upholding Human Rights & Dignity


Report by Mark Latonero that “…shows how human rights can serve as a “North Star” to guide the development and governance of artificial intelligence.

The report draws the connections between AI and human rights; reframes recent AI-related controversies through a human rights lens; and reviews current stakeholder efforts at the intersection of AI and human rights.

This report is intended for stakeholders–such as technology companies, governments, intergovernmental organizations, civil society groups, academia, and the United Nations (UN) system–looking to incorporate human rights into social and organizational contexts related to the development and governance of AI….(More)”.

A Right to Reasonable Inferences: Re-Thinking Data Protection Law in the Age of Big Data and AI


Paper by Sandra Wachter and Brent Mittelstadt: “Big Data analytics and artificial intelligence (AI) draw non-intuitive and unverifiable inferences and predictions about the behaviors, preferences, and private lives of individuals. These inferences draw on highly diverse and feature-rich data of unpredictable value, and create new opportunities for discriminatory, biased, and invasive decision-making. Concerns about algorithmic accountability are often actually concerns about the way in which these technologies draw privacy invasive and non-verifiable inferences about us that we cannot predict, understand, or refute.

Data protection law is meant to protect people’s privacy, identity, reputation, and autonomy, but is currently failing to protect data subjects from the novel risks of inferential analytics. The broad concept of personal data in Europe could be interpreted to include inferences, predictions, and assumptions that refer to or impact on an individual. If seen as personal data, individuals are granted numerous rights under data protection law. However, the legal status of inferences is heavily disputed in legal scholarship, and marked by inconsistencies and contradictions within and between the views of the Article 29 Working Party and the European Court of Justice.

As we show in this paper, individuals are granted little control and oversight over how their personal data is used to draw inferences about them. Compared to other types of personal data, inferences are effectively ‘economy class’ personal data in the General Data Protection Regulation (GDPR). Data subjects’ rights to know about (Art 13-15), rectify (Art 16), delete (Art 17), object to (Art 21), or port (Art 20) personal data are significantly curtailed when it comes to inferences, often requiring a greater balance with controller’s interests (e.g. trade secrets, intellectual property) than would otherwise be the case. Similarly, the GDPR provides insufficient protection against sensitive inferences (Art 9) or remedies to challenge inferences or important decisions based on them (Art 22(3))….

In this paper we argue that a new data protection right, the ‘right to reasonable inferences’, is needed to help close the accountability gap currently posed by ‘high risk inferences’, meaning inferences that are privacy invasive or reputation damaging and have low verifiability in the sense of being predictive or opinion-based. In cases where algorithms draw ‘high risk inferences’ about individuals, this right would require ex-ante justification to be given by the data controller to establish whether an inference is reasonable. This disclosure would address (1) why certain data is a relevant basis to draw inferences; (2) why these inferences are relevant for the chosen processing purpose or type of automated decision; and (3) whether the data and methods used to draw the inferences are accurate and statistically reliable. The ex-ante justification is bolstered by an additional ex-post mechanism enabling unreasonable inferences to be challenged. A right to reasonable inferences must, however, be reconciled with EU jurisprudence and counterbalanced with IP and trade secrets law as well as freedom of expression and Article 16 of the EU Charter of Fundamental Rights: the freedom to conduct a business….(More)”.

The free flow of non-personal data


Joint statement by Vice-President Ansip and Commissioner Gabriel on the European Parliament’s vote on the new EU rules facilitating the free flow of non-personal data: “The European Parliament adopted today a Regulation on the free flow of non-personal data proposed by the European Commission in September 2017. …

We welcome today’s vote at the European Parliament. A digital economy and society cannot exist without data and this Regulation concludes another key pillar of the Digital Single Market. Only if data flows freely can Europe get the best from the opportunities offered by digital progress and technologies such as artificial intelligence and supercomputers.  

This Regulation does for non-personal data what the General Data Protection Regulation has already done for personal data: free and safe movement across the European Union. 

With its vote, the European Parliament has sent a clear signal to all businesses of Europe: it makes no difference where in the EU you store and process your data – data localisation requirements within the Member States are a thing of the past. 

The new rules will provide a major boost to the European data economy, as it opens up potential for European start-ups and SMEs to create new services through cross-border data innovation. This could lead to a 4% – or €739 billion – higher EU GDP until 2020 alone. 

Together with the General Data Protection Regulation, the Regulation on the free flow of non-personal data will allow the EU to fully benefit from today’s and tomorrow’s data-based global economy.” 

Background

Since the Communication on the European Data Economy was adopted in January 2017 as part of the Digital Single Market strategy, the Commission has run a public online consultation, organised structured dialogues with Member States and has undertaken several workshops with different stakeholders. These evidence-gathering initiatives have led to the publication of an impact assessment….The Regulation on the free flow of non-personal data has no impact on the application of the General Data Protection Regulation (GDPR), as it does not cover personal data. However, the two Regulations will function together to enable the free flow of any data – personal and non-personal – thus creating a single European space for data. In the case of a mixed dataset, the GDPR provision guaranteeing free flow of personal data will apply to the personal data part of the set, and the free flow of non-personal data principle will apply to the non-personal part. …(More)”.