Let’s make private data into a public good


Article by Mariana Mazzucato: “The internet giants depend on our data. A new relationship between us and them could deliver real value to society….We should ask how the value of these companies has been created, how that value has been measured, and who benefits from it. If we go by national accounts, the contribution of internet platforms to national income (as measured, for example, by GDP) is represented by the advertisement-related services they sell. But does that make sense? It’s not clear that ads really contribute to the national product, let alone to social well-being—which should be the aim of economic activity. Measuring the value of a company like Google or Facebook by the number of ads it sells is consistent with standard neoclassical economics, which interprets any market-based transaction as signaling the production of some kind of output—in other words, no matter what the thing is, as long as a price is received, it must be valuable. But in the case of these internet companies, that’s misleading: if online giants contribute to social well-being, they do it through the services they provide to users, not through the accompanying advertisements.

This way we have of ascribing value to what the internet giants produce is completely confusing, and it’s generating a paradoxical result: their advertising activities are counted as a net contribution to national income, while the more valuable services they provide to users are not.

Let’s not forget that a large part of the technology and necessary data was created by all of us, and should thus belong to all of us. The underlying infrastructure that all these companies rely on was created collectively (via the tax dollars that built the internet), and it also feeds off network effects that are produced collectively. There is indeed no reason why the public’s data should not be owned by a public repository that sells the data to the tech giants, rather than vice versa. But the key issue here is not just sending a portion of the profits from data back to citizens but also allowing them to shape the digital economy in a way that satisfies public needs. Using big data and AI to improve the services provided by the welfare state—from health care to social housing—is just one example.

Only by thinking about digital platforms as collective creations can we construct a new model that offers something of real value, driven by public purpose. We’re never far from a media story that stirs up a debate about the need to regulate tech companies, which creates a sense that there’s a war between their interests and those of national governments. We need to move beyond this narrative. The digital economy must be subject to the needs of all sides; it’s a partnership of equals where regulators should have the confidence to be market shapers and value creators….(More)”.

Artificial Intelligence


Stanford Encyclopedia of Philosophy: “Artificial intelligence (AI) is the field devoted to building artificial animals (or at least artificial creatures that – in suitable contexts – appear to be animals) and, for many, artificial persons (or at least artificial creatures that – in suitable contexts – appear to be persons). Such goals immediately ensure that AI is a discipline of considerable interest to many philosophers, and this has been confirmed (e.g.) by the energetic attempt, on the part of numerous philosophers, to show that these goals are in fact un/attainable. On the constructive side, many of the core formalisms and techniques used in AI come out of, and are indeed still much used and refined in, philosophy: first-order logic and its extensions; intensional logics suitable for the modeling of doxastic attitudes and deontic reasoning; inductive logic, probability theory, and probabilistic reasoning; practical reasoning and planning, and so on. In light of this, some philosophers conduct AI research and development as philosophy.
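
To make the lineage concrete: the ‘probabilistic reasoning’ the entry mentions is, at its core, Bayesian updating. A minimal worked example in Python, with invented numbers:

```python
# Bayes' rule: P(H | E) = P(E | H) * P(H) / P(E)
# Toy diagnostic example; all probabilities are invented for illustration.

prior_h = 0.01          # P(H): prior probability of the hypothesis
p_e_given_h = 0.9       # P(E | H): probability of the evidence if H is true
p_e_given_not_h = 0.05  # P(E | ~H): probability of the evidence if H is false

# Law of total probability: P(E) = P(E|H)P(H) + P(E|~H)P(~H)
p_e = p_e_given_h * prior_h + p_e_given_not_h * (1 - prior_h)

# Posterior belief after observing the evidence
posterior_h = p_e_given_h * prior_h / p_e
print(f"P(H | E) = {posterior_h:.3f}")  # ~= 0.154
```

The same update rule, scaled up, sits underneath much of the probabilistic reasoning that modern AI systems perform.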

In the present entry, the history of AI is briefly recounted, proposed definitions of the field are discussed, and an overview of the field is provided. In addition, both philosophical AI (AI pursued as and out of philosophy) and philosophy of AI are discussed, via examples of both. The entry ends with some de rigueur speculative commentary regarding the future of AI….(More)”.

Health Insurers Are Vacuuming Up Details About You — And It Could Raise Your Rates


Marshall Allen at ProPublica: “With little public scrutiny, the health insurance industry has joined forces with data brokers to vacuum up personal details about hundreds of millions of Americans, including, odds are, many readers of this story. The companies are tracking your race, education level, TV habits, marital status, net worth. They’re collecting what you post on social media, whether you’re behind on your bills, what you order online. Then they feed this information into complicated computer algorithms that spit out predictions about how much your health care could cost them.

Are you a woman who recently changed your name? You could be newly married and have a pricey pregnancy pending. Or maybe you’re stressed and anxious from a recent divorce. That, too, the computer models predict, may run up your medical bills.

Are you a woman who’s purchased plus-size clothing? You’re considered at risk of depression. Mental health care can be expensive.

Low-income and a minority? That means, the data brokers say, you are more likely to live in a dilapidated and dangerous neighborhood, increasing your health risks.
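
Mechanically, the models the article describes amount to ordinary supervised learning: consumer attributes in, a predicted cost or risk score out. A deliberately tiny sketch of the idea, with invented features, data, and labels (the brokers’ actual models are proprietary and vastly larger):

```python
# Toy sketch of a cost-risk model: lifestyle/demographic attributes in,
# predicted claims cost out. Features, data, and labels are all invented.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Hypothetical binary features:
# [recent_name_change, plus_size_purchases, behind_on_bills, low_income]
X = np.array([
    [1, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
])
y = np.array([9200.0, 7800.0, 6100.0, 2400.0])  # invented annual claims costs

model = GradientBoostingRegressor(n_estimators=50).fit(X, y)

new_member = np.array([[1, 1, 0, 0]])  # name change + plus-size purchases
print(f"Predicted cost: ${model.predict(new_member)[0]:,.0f}")
```

The controversial part is not the algorithm, which is textbook regression, but the provenance of the features being fed into it.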

“We sit on oceans of data,” said Eric McCulley, director of strategic solutions for LexisNexis Risk Solutions, during a conversation at the data firm’s booth at an insurance industry gathering. And he isn’t apologetic about using it. “The fact is, our data is in the public domain,” he said. “We didn’t put it out there.”

Insurers contend they use the information to spot health issues in their clients — and flag them so they get services they need. And companies like LexisNexis say the data shouldn’t be used to set prices. But as a research scientist from one company told me: “I can’t say it hasn’t happened.”

At a time when every week brings a new privacy scandal and worries abound about the misuse of personal information, patient advocates and privacy scholars say the insurance industry’s data gathering runs counter to its touted, and federally required, allegiance to patients’ medical privacy. The Health Insurance Portability and Accountability Act, or HIPAA, only protects medical information.

“We have a health privacy machine that’s in crisis,” said Frank Pasquale, a professor at the University of Maryland Carey School of Law who specializes in issues related to machine learning and algorithms. “We have a law that only covers one source of health information. They are rapidly developing another source.”…(More)”.

How Charities Are Using Artificial Intelligence to Boost Impact


Nicole Wallace at the Chronicle of Philanthropy: “The chaos and confusion of conflict often separate family members fleeing for safety. The nonprofit Refunite uses advanced technology to help loved ones reconnect, sometimes across continents and after years of separation.

Refugees register with the service by providing basic information — their name, age, birthplace, clan and subclan, and so forth — along with similar facts about the people they’re trying to find. Powerful algorithms search for possible matches among the more than 1.1 million individuals in the Refunite system. The analytics are further refined using the more than 2,000 searches that the refugees themselves do daily.
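
Refunite’s algorithms are not public, but the standard technique for this kind of problem is fuzzy record matching: scoring similarity across several fields so that inconsistent spellings and transliterations of the same person still match. A minimal sketch, with invented names and field weights:

```python
# Minimal sketch of fuzzy record matching across several fields.
# Registry entries, query, and field weights are invented for illustration.
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Normalized string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def match_score(query: dict, candidate: dict) -> float:
    # Weighted sum over fields; the weights are illustrative guesses.
    fields = [("name", 0.5), ("birthplace", 0.3), ("clan", 0.2)]
    return sum(w * similarity(query[f], candidate[f]) for f, w in fields)

registry = [
    {"name": "Abdi Mohamed", "birthplace": "Mogadishu", "clan": "Hawiye"},
    {"name": "Abdi Mohammed", "birthplace": "Muqdisho", "clan": "Hawiye"},
    {"name": "Fatima Hassan", "birthplace": "Kismayo", "clan": "Darod"},
]
query = {"name": "Abdi Muhammad", "birthplace": "Mogadishu", "clan": "Hawiye"}

# Rank candidates by score; the top results become suggested matches
for person in sorted(registry, key=lambda c: match_score(query, c), reverse=True):
    print(f"{match_score(query, person):.2f}  {person['name']}")
```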

The goal: find loved ones or those connected to them who might help in the hunt. Since Refunite introduced the first version of the system in 2010, it has helped more than 40,000 people reconnect.

One factor complicating the work: Cultures define family lineage differently. Refunite co-founder Christopher Mikkelsen confronted this problem when he asked a boy in a refugee camp if he knew where his mother was. “He asked me, ‘Well, what mother do you mean?’ ” Mikkelsen remembers. “And I went, ‘Uh-huh, this is going to be challenging.’ ”

Fortunately, artificial intelligence is well suited to learning and recognizing different family patterns. But the technology struggles with some simple things, like distinguishing the image of a chicken from that of a car. Mikkelsen believes refugees in camps could offset this weakness by tagging photographs — “car” or “not car” — to help train algorithms. Such work could earn them badly needed cash: The group hopes to set up a system that pays refugees for doing such work.

“To an American, earning $4 a day just isn’t viable as a living,” Mikkelsen says. “But to the global poor, getting an access point to earning this is revolutionizing.”

Another group, Wild Me, a nonprofit founded by scientists and technologists, has created an open-source software platform that combines artificial intelligence and image recognition to identify and track individual animals. Using the system, scientists can better estimate the number of endangered animals and follow them over large expanses without using invasive techniques….

To fight sex trafficking, police officers often go undercover and interact with people trying to buy sex online. Sadly, demand is high, and there are never enough officers.

Enter Seattle Against Slavery. The nonprofit’s tech-savvy volunteers created chatbots designed to disrupt sex trafficking. Built with input from trafficking survivors and law-enforcement agencies, the bots can hold simultaneous, drawn-out conversations with hundreds of people and arrange rendezvous that never materialize. The group hopes to frustrate buyers so much that they give up their hunt for sex online….
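
The bots themselves are not public, but the core capability described, one program holding hundreds of drawn-out conversations at once, is a standard concurrency pattern. A minimal sketch with placeholder replies and a simulated messaging channel:

```python
# Sketch of one process holding many simultaneous decoy conversations.
# Replies and the "channel" (print) are placeholders; the real bots are not public.
import asyncio
import random

REPLIES = ["Sorry, who is this again?", "Can we talk later tonight?",
           "What part of town are you in?"]

async def decoy_conversation(buyer_id: int) -> None:
    for _ in range(3):  # a drawn-out, multi-message exchange
        await asyncio.sleep(random.uniform(0.1, 0.5))  # simulated typing delay
        print(f"[buyer {buyer_id}] bot: {random.choice(REPLIES)}")
    print(f"[buyer {buyer_id}] bot: suggesting a meetup that won't happen")

async def main() -> None:
    # asyncio lets hundreds of these run concurrently in a single process
    await asyncio.gather(*(decoy_conversation(i) for i in range(5)))

asyncio.run(main())
```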

A Philadelphia charity is using machine learning to adapt its services to clients’ needs.

Benefits Data Trust helps people enroll in government-assistance programs like food stamps and Medicaid. Since 2005, the group has helped more than 650,000 people access $7 billion in aid.

The nonprofit has data-sharing agreements with jurisdictions to access more than 40 lists of people who likely qualify for government benefits but do not receive them. The charity contacts those who might be eligible and encourages them to call the Benefits Data Trust for help applying….(More)”.

What is a data trust?


Essay by Jack Hardinges at ODI: “There are different interpretations of what a data trust is, or should be…

There’s no well-used definition of ‘a data trust’, or even consensus on what one is. Much of the recent interest in data trusts in the UK has been fuelled by a UK government-commissioned independent review into Artificial Intelligence (AI) in 2017, which recommended them as a way to ‘share data in a fair, safe and equitable way’. However, there has been wider international interest in the concept for some time.

At a very high level, the aim of data trusts appears to be to give people and organisations confidence when enabling access to data in ways that provide them with some value (either directly or indirectly) in return. Beyond that high-level goal, there are a variety of thoughts about what form they should take. In our work so far, we’ve found different interpretations of the term ‘data trust’:

  • A data trust as a repeatable framework of terms and mechanisms.
  • A data trust as a mutual organisation.
  • A data trust as a legal structure.
  • A data trust as a store of data.
  • A data trust as public oversight of data access….(More)”.

What if people were paid for their data?


The Economist: “Jennifer Lyn Morone, an American artist, thinks most people now live in a state of ‘data slavery’. To get free online services, she laments, they hand over intimate information to technology firms. “Personal data are much more valuable than you think,” she says. To highlight this sorry state of affairs, Ms Morone has resorted to what she calls “extreme capitalism”: she registered herself as a company in Delaware in an effort to exploit her personal data for financial gain. She created dossiers containing different subsets of data, which she displayed in a London gallery in 2016 and offered for sale, starting at £100 ($135). The entire collection, including her health data and social-security number, can be had for £7,000.

Only a few buyers have taken her up on this offer and she finds “the whole thing really absurd”…. Given the current state of digital affairs, in which the collection and exploitation of personal data is dominated by big tech firms, Ms Morone’s approach, in which individuals offer their data for sale, seems unlikely to catch on. But what if people really controlled their data—and the tech giants were required to pay for access? What would such a data economy look like?…

Labour, like data, is a resource that is hard to pin down. Workers were not properly compensated for labour for most of human history. Even once people were free to sell their labour, it took decades for wages to reach liveable levels on average. History won’t repeat itself, but chances are that it will rhyme, Glen Weyl, an economist at Microsoft Research, predicts in “Radical Markets”, a provocative new book he has co-written with Eric Posner of the University of Chicago. He argues that in the age of artificial intelligence, it makes sense to treat data as a form of labour.

To understand why, it helps to keep in mind that “artificial intelligence” is something of a misnomer. Messrs Weyl and Posner call it “collective intelligence”: most AI algorithms need to be trained using reams of human-generated examples, in a process called machine learning. Unless they know what the right answers (provided by humans) are meant to be, algorithms cannot translate languages, understand speech or recognise objects in images. Data provided by humans can thus be seen as a form of labour which powers AI. As the data economy grows up, such data work will take many forms. Much of it will be passive, as people engage in all kinds of activities—liking social-media posts, listening to music, recommending restaurants—that generate the data needed to power new services. But some people’s data work will be more active, as they make decisions (such as labelling images or steering a car through a busy city) that can be used as the basis for training AI systems….
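
A minimal sketch of that point: strip away the human-provided labels below and there is nothing for the algorithm to learn. The reviews and labels are invented toy data:

```python
# Without the human-supplied labels, the model below cannot be trained at all.
# Reviews and labels are invented toy data.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

reviews = ["loved this restaurant", "terrible food", "great service",
           "awful experience", "really enjoyed it"]
labels = ["pos", "neg", "pos", "neg", "pos"]  # the human "data work"

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(reviews)
model = MultinomialNB().fit(X, labels)

print(model.predict(vectorizer.transform(["really loved the service"])))  # ['pos']
```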

But much still needs to happen for personal data to be widely considered as labour, and paid for as such. First, the right legal framework will be needed to encourage the emergence of a new data economy. The European Union’s new General Data Protection Regulation, which came into effect in May, already gives people extensive rights to check, download and even delete personal data held by companies. Second, the technology to keep track of data flows needs to become much more capable. Research to calculate the value of particular data to an AI service is in its infancy.
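
One simple approach from that young line of research is leave-one-out valuation: retrain the model without a given data point and measure how much predictive quality is lost. A toy sketch, with invented data:

```python
# Leave-one-out data valuation: a point's "value" is the test accuracy the
# model loses when trained without it. All data are invented.
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0.10], [0.35], [0.40], [0.62], [0.80], [0.90]])
y = np.array([0, 0, 0, 1, 1, 1])
X_test = np.array([[0.20], [0.30], [0.70], [0.85]])
y_test = np.array([0, 0, 1, 1])

def accuracy(X_train, y_train):
    return LogisticRegression().fit(X_train, y_train).score(X_test, y_test)

baseline = accuracy(X, y)
for i in range(len(X)):
    loo = accuracy(np.delete(X, i, axis=0), np.delete(y, i))
    print(f"point {i}: value = {baseline - loo:+.2f}")
```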

Third, and most important, people will have to develop a “class consciousness” as data workers. Most people say they want their personal information to be protected, but then trade it away for nearly nothing, something known as the “privacy paradox”. Yet things may be changing: more than 90% of Americans think being in control of who can get data on them is important, according to the Pew Research Centre, a think-tank….(More)”.

Ways to think about machine learning


Benedict Evans: “We’re now four or five years into the current explosion of machine learning, and pretty much everyone has heard of it. It’s not just that startups are forming every day or that the big tech platform companies are rebuilding themselves around it – everyone outside tech has read the Economist or BusinessWeek cover story, and many big companies have some projects underway. We know this is a Next Big Thing.

Going a step further, we mostly understand what neural networks might be, in theory, and we get that this might be about patterns and data. Machine learning lets us find patterns or structures in data that are implicit and probabilistic (hence ‘inferred’) rather than explicit, patterns that previously only people, and not computers, could find. These techniques address a class of questions that were previously ‘hard for computers and easy for people’ or, perhaps more usefully, ‘hard for people to describe to computers’. And we’ve seen some cool (or worrying, depending on your perspective) speech and vision demos.
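
A minimal sketch of what ‘finding implicit structure’ means in practice: the toy data below contain two groups that are never labelled, and a clustering algorithm recovers them without being given any explicit rule:

```python
# Unsupervised structure-finding: nobody tells the algorithm what the groups
# are; it infers them from the data. The points are synthetic.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
group_a = rng.normal(loc=[0, 0], scale=0.5, size=(50, 2))  # latent group A
group_b = rng.normal(loc=[3, 3], scale=0.5, size=(50, 2))  # latent group B
points = np.vstack([group_a, group_b])  # no labels anywhere

clusters = KMeans(n_clusters=2, n_init=10).fit_predict(points)
print(clusters[:5], clusters[-5:])  # the two latent groups, recovered
```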

I don’t think, though, that we yet have a settled sense of quite what machine learning means – what it will mean for tech companies or for companies in the broader economy, how to think structurally about what new things it could enable, or what machine learning means for all the rest of us, and what important problems it might actually be able to solve.

This isn’t helped by the term ‘artificial intelligence’, which tends to end any conversation as soon as it’s begun. As soon as we say ‘AI’, it’s as though the black monolith from the beginning of 2001 has appeared, and we all become apes screaming at it and shaking our fists. You can’t analyze ‘AI’.

Indeed, I think one could propose a whole list of unhelpful ways of talking about current developments in machine learning. For example:

  • Data is the new oil
  • Google and China (or Facebook, or Amazon, or BAT) have all the data
  • AI will take all the jobs
  • And, of course, saying AI itself.

More useful things to talk about, perhaps, might be:

  • Automation
  • Enabling technology layers
  • Relational databases. …(More)”.

We Need to Save Ignorance From AI


Christina Leuker and Wouter van den Bos in Nautilus:  “After the fall of the Berlin Wall, East German citizens were offered the chance to read the files kept on them by the Stasi, the much-feared Communist-era secret police service. To date, it is estimated that only 10 percent have taken the opportunity.

In 2007, James Watson, the co-discoverer of the structure of DNA, asked that he not be given any information about his APOE gene, one allele of which is a known risk factor for Alzheimer’s disease.

Most people tell pollsters that, given the choice, they would prefer not to know the date of their own death—or even the future dates of happy events.

Each of these is an example of willful ignorance. Socrates may have made the case that the unexamined life is not worth living, and Hobbes may have argued that curiosity is mankind’s primary passion, but many of our oldest stories actually describe the dangers of knowing too much. From Adam and Eve and the tree of knowledge to Prometheus stealing the secret of fire, they teach us that real-life decisions need to strike a delicate balance between choosing to know, and choosing not to.

But what if a technology came along that shifted this balance unpredictably, complicating how we make decisions about when to remain ignorant? That technology is here: It’s called artificial intelligence.

AI can find patterns and make inferences using relatively little data. Only a handful of Facebook likes are necessary to predict your personality, race, and gender, for example. Another computer algorithm claims it can distinguish between homosexual and heterosexual men with 81 percent accuracy, and homosexual and heterosexual women with 71 percent accuracy, based on their picture alone. An algorithm named COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) can predict criminal recidivism from data like juvenile arrests, criminal records in the family, education, social isolation, and leisure activities with 65 percent accuracy….
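
Predictions like these are less mysterious than they sound: each like becomes a binary feature, and a simple model maps features to a trait. A toy sketch with invented likes and labels (the cited studies used far larger samples and richer models):

```python
# Toy trait prediction from "likes": binary features in, probability out.
# All pages, users, and labels are invented.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Rows are users; columns are hypothetical pages they have liked
likes = np.array([
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 1, 1],
    [0, 1, 0, 1],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
])
trait = np.array([1, 1, 0, 0, 1, 0])  # invented binary attribute

model = LogisticRegression().fit(likes, trait)
new_user = np.array([[1, 0, 0, 0]])  # "only a handful of likes"
print(model.predict_proba(new_user)[0, 1])  # predicted probability of trait
```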

Recently, though, the psychologist Ralph Hertwig and legal scholar Christoph Engel have published an extensive taxonomy of motives for deliberate ignorance. They identified two sets of motives that are particularly relevant to the need for ignorance in the face of AI.

The first set of motives revolves around impartiality and fairness. Simply put, knowledge can sometimes corrupt judgment, and we often choose to remain deliberately ignorant in response. For example, peer reviews of academic papers are usually anonymous. Insurance companies in most countries are not permitted to know all the details of their clients’ health before they enroll; they only know general risk factors. This type of consideration is particularly relevant to AI, because AI can produce highly prejudicial information….(More)”.

Against the Dehumanisation of Decision-Making – Algorithmic Decisions at the Crossroads of Intellectual Property, Data Protection, and Freedom of Information


Paper by Guido Noto La Diega: “Nowadays algorithms can decide if one can get a loan, is allowed to cross a border, or must go to prison. Artificial intelligence techniques (natural language processing and machine learning, first and foremost) enable private and public decision-makers to analyse big data in order to build profiles, which are used to make decisions in an automated way.

This work presents ten arguments against algorithmic decision-making. These revolve around the concepts of ubiquitous discretionary interpretation, holistic intuition, algorithmic bias, the three black boxes, psychology of conformity, power of sanctions, civilising force of hypocrisy, pluralism, empathy, and technocracy.

The lack of transparency of the algorithmic decision-making process does not stem merely from the characteristics of the relevant techniques used, which can make it impossible to access the rationale of the decision. It depends also on the abuse of and overlap between intellectual property rights (the “legal black box”). In the US, nearly half a million patented inventions concern algorithms; more than 67% of the algorithm-related patents were issued over the last ten years and the trend is increasing.

To counter the increased monopolisation of algorithms by means of intellectual property rights (with trade secrets leading the way), this paper presents three legal routes that enable citizens to ‘open’ the algorithms.

First, copyright and patent exceptions, as well as trade secrets, are discussed.

Second, the GDPR is critically assessed. In principle, data controllers are not allowed to use algorithms to take decisions that have legal effects on the data subject’s life or similarly significantly affect them. However, when they are allowed to do so, the data subject still has the right to obtain human intervention, to express their point of view, as well as to contest the decision. Additionally, the data controller shall provide meaningful information about the logic involved in the algorithmic decision.

Third, this paper critically analyses the first known case of a court using the access right under the freedom of information regime to grant an injunction to release the source code of the computer program that implements an algorithm.

Only an integrated approach – which takes into account intellectual property, data protection, and freedom of information – may provide the citizen affected by an algorithmic decision with an effective remedy, as required by the Charter of Fundamental Rights of the EU and the European Convention on Human Rights….(More)”.

Ontario is trying a wild experiment: Opening access to its residents’ health data


Dave Gershgorn at Quartz: “The world’s most powerful technology companies have a vision for the future of healthcare. You’ll still go to your doctor’s office, sit in a waiting room, and explain your problem to someone in a white coat. But instead of relying solely on their own experience and knowledge, your doctor will consult an algorithm that’s been trained on the symptoms, diagnoses, and outcomes of millions of other patients. Instead of a radiologist reading your x-ray, a computer will be able to detect minute differences and instantly identify a tumor or lesion. Or at least that’s the goal.

AI systems like these, currently under development by companies including Google and IBM, can’t read textbooks and journals, attend lectures, or do rounds—they need millions of real-life examples to understand all the different variations between one patient and another. In general, AI is only as good as the data it’s trained on, but medical data is exceedingly private—most developed countries have strict health data protection laws, such as HIPAA in the United States….

These approaches, which favor companies with considerable resources, are pretty much the only way to get large troves of health data in the US because the American health system is so disparate. Healthcare providers keep personal files on each of their patients, and can only transmit them to other accredited healthcare workers at the patient’s request. There’s no single place where all health data exists. It’s more secure, but less efficient for analysis and research.

Ontario, Canada, might have a solution, thanks to its single-payer healthcare system. All of Ontario’s health data exists in a few enormous caches under government control. (After all, the government needs to keep track of all the bills it’s paying.) Similar structures exist elsewhere in Canada, such as Quebec, but Toronto, which has become a major hub for AI research, wants to lead the charge in providing this data to businesses.

Until now, the only people allowed to study this data were government organizations or researchers who partnered with the government to study disease. But Ontario has now entrusted the MaRS Discovery District—a cross between a tech incubator and WeWork—to build a platform for approved companies and researchers to access this data, dubbed Project Spark. The project, initiated by MaRS and Canada’s University Health Network, began exploring how to share this data after both organizations expressed interest to the government about giving broader health data access to researchers and companies looking to build healthcare-related tools.

Project Spark’s goal is to create an API, or a way for developers to request information from the government’s data cache. This could be used to create an app for doctors to access the full medical history of a new patient. Ontarians could access their health records at any time through similar software, and catalog health issues as they occur. Or researchers, like the ones trying to build AI to assist doctors, could request a different level of access that provides anonymized data on Ontarians who meet certain criteria. If you wanted to study every Ontarian who had Alzheimer’s disease over the last 40 years, that data would only be authorization and a few lines of code away.
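
What those ‘few lines of code’ might look like is unknowable until the API ships; the sketch below is purely hypothetical, with an invented endpoint, parameters, and response format:

```python
# Purely hypothetical sketch of a research query against such an API.
# Endpoint, auth scheme, parameters, and response fields are all invented;
# Project Spark's actual interface has not been published.
import requests

BASE_URL = "https://api.example-projectspark.ca/v1"  # hypothetical
headers = {"Authorization": "Bearer <token-granted-after-approval>"}

params = {
    "condition": "alzheimers",  # hypothetical cohort filter
    "from_year": 1978,
    "to_year": 2018,
    "anonymized": "true",       # researcher-level access: no identifiers
}

response = requests.get(f"{BASE_URL}/cohorts", headers=headers, params=params)
response.raise_for_status()
print(response.json()["record_count"])  # hypothetical response field
```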

There are currently 100 companies lined up to get access to the data, which comprises health records from Ontario’s 14 million residents. (MaRS won’t say who the companies are.) …(More)”