Paper by Wolfgang Kerber: “…analyses whether competition law can help to solve problems of access to data and interoperability in IoT ecosystems, where often one firm has exclusive control of the data produced by a smart device (and of the technical access to this device). Such a gatekeeper position can lead to the elimination of competition for aftermarket and other complementary services in such IoT ecosystems. This problem is analysed both from an economic and a legal perspective, and also generally for IoT ecosystems as well as for the much discussed problems of “access to in-vehicle data and resources” in connected cars, where the “extended vehicle” concept of the car manufacturers leads to such positions of exclusive control. The paper analyses, in particular, the competition rules about abusive behavior of dominant firms (Art. 102 TFEU) and of firms with “relative market power” (§ 20 (1) GWB) in German competition law. These provisions might offer (if appropriately applied and amended) at least some solutions for these data access problems. Competition law, however, might not be sufficient for dealing with all or most of these problems, i.e. that also additional solutions might be needed (data portability, direct data (access) rights, or sector-specific regulation)….(More)”.
Algorithmic Censorship on Social Platforms: Power, Legitimacy, and Resistance
Paper by Jennifer Cobbe: “Effective content moderation by social platforms has long been recognised as both important and difficult, with numerous issues arising from the volume of information to be dealt with, the culturally sensitive and contextual nature of that information, and the nuances of human communication. Attempting to scale moderation efforts, various platforms have adopted, or signalled their intention to adopt, increasingly automated approaches to identifying and suppressing content and communications that they deem undesirable. However, algorithmic forms of online censorship by social platforms bring their own concerns, including the extensive surveillance of communications and the use of machine learning systems with the distinct possibility of errors and biases. This paper adopts a governmentality lens to examine algorithmic censorship by social platforms in order to assist in the development of a more comprehensive understanding of the risks of such approaches to content moderation. This analysis shows that algorithmic censorship is distinctive for two reasons: (1) it would potentially bring all communications carried out on social platforms within reach, and (2) it would potentially allow those platforms to take a much more active, interventionist approach to moderating those communications. Consequently, algorithmic censorship could allow social platforms to exercise an unprecedented degree of control over both public and private communications, with poor transparency, weak or non-existent accountability mechanisms, and little legitimacy. Moreover, commercial considerations would be inserted further into the everyday communications of billions of people. Due to the dominance of the web by a small number of social platforms, this control may be difficult or impractical to escape for many people, although opportunities for resistance do exist.
While automating content moderation may seem like an attractive proposition for both governments and platforms themselves, the issues identified in this paper are cause for concern and should be given serious consideration….(More)”.
How big data can affect your bank account – and life
Alena Buyx, Barbara Prainsack and Aisling McMahon at The Conversation: “Mustafa loves good coffee. In his free time, he often browses high-end coffee machines that he cannot currently afford but is saving for. One day, travelling to a friend’s wedding abroad, he gets to sit next to another friend on the plane. When Mustafa complains about how much he paid for his ticket, it turns out that his friend paid less than half of what he paid, even though they booked around the same time.
He looks into possible reasons for this and concludes that it must be related to his browsing of expensive coffee machines and equipment. He is very angry about this and complains to the airline, who send him a lukewarm apology that refers to personalised pricing models. Mustafa feels that this is unfair but does not challenge it. Pursuing it any further would cost him time and money.
This story – which is hypothetical, but can and does occur – demonstrates the potential for people to be harmed by data use in the current “big data” era. Big data analytics involves using large amounts of data from many sources which are linked and analysed to find patterns that help to predict human behaviour. Such analysis, even when perfectly legal, can harm people.
Mustafa, for example, has likely been affected by personalised pricing practices whereby his search for high-end coffee machines has been used to make certain assumptions about his willingness to pay or buying power. This in turn may have led to his higher priced airfare. While this has not resulted in serious harm in Mustafa’s case, instances of serious emotional and financial harm are, unfortunately, not rare, including the denial of mortgages for individuals and risks to a person’s general credit worthiness based on associations with other individuals. This might happen if an individual shares some similar characteristics to other individuals who have poor repayment histories….(More)”.
Making data colonialism liveable: how might data’s social order be regulated?
Paper by Nick Couldry & Ulises Mejias: “Humanity is currently undergoing a large-scale social, economic and legal transformation based on the massive appropriation of social life through data extraction. This quantification of the social represents a new colonial move. While the modes, intensities, scales and contexts of dispossession have changed, the underlying drive of today’s data colonialism remains the same: to acquire “territory” and resources from which economic value can be extracted by capital. The injustices embedded in this system need to be made “liveable” through a new legal and regulatory order….(More)”.
The Why of the World
Book review by Tim Maudlin of The Book of Why: The New Science of Cause and Effect by Judea Pearl and Dana Mackenzie: “Correlation is not causation.” Though true and important, the warning has hardened into the familiarity of a cliché. Stock examples of so-called spurious correlations are now a dime a dozen. As one example goes, a Pacific island tribe believed flea infestations to be good for one’s health because they observed that healthy people had fleas while sick people did not. The correlation is real and robust, but fleas do not cause health, of course: they merely indicate it. Fleas on a fevered body abandon ship and seek a healthier host. One should not seek out and encourage fleas in the quest to ward off sickness.
The rub lies in another observation: that the evidence for causation seems to lie entirely in correlations. But for seeing correlations, we would have no clue about causation. The only reason we discovered that smoking causes lung cancer, for example, is that we observed correlations in that particular circumstance. And thus a puzzle arises: if causation cannot be reduced to correlation, how can correlation serve as evidence of causation?
The Book of Why, co-authored by the computer scientist Judea Pearl and the science writer Dana Mackenzie, sets out to give a new answer to this old question, which has been around—in some form or another, posed by scientists and philosophers alike—at least since the Enlightenment. In 2011 Pearl won the Turing Award, computer science’s highest honor, for “fundamental contributions to artificial intelligence through the development of a calculus of probabilistic and causal reasoning,” and this book sets out to explain what all that means for a general audience, updating his more technical book on the same subject, Causality, published nearly two decades ago. Written in the first person, the new volume mixes theory, history, and memoir, detailing both the technical tools of causal reasoning Pearl has developed as well as the tortuous path by which he arrived at them—all along bucking a scientific establishment that, in his telling, had long ago contented itself with data-crunching analysis of correlations at the expense of investigation of causes. There are nuggets of wisdom and cautionary tales in both these aspects of the book, the scientific as well as the sociological…(More)”.
Sharenthood: Why We Should Think before We Talk about Our Kids Online
Book by Leah Plunkett: “Our children’s first digital footprints are made before they can walk—even before they are born—as parents use fertility apps to aid conception, post ultrasound images, and share their baby’s hospital mug shot. Then, in rapid succession come terabytes of baby pictures stored in the cloud, digital baby monitors with built-in artificial intelligence, and real-time updates from daycare. When school starts, there are cafeteria cards that catalog food purchases, bus passes that track when kids are on and off the bus, electronic health records in the nurse’s office, and a school surveillance system that has eyes everywhere. Unwittingly, parents, teachers, and other trusted adults are compiling digital dossiers for children that could be available to everyone—friends, employers, law enforcement—forever. In this incisive book, Leah Plunkett examines the implications of “sharenthood”—adults’ excessive digital sharing of children’s data. She outlines the mistakes adults make with kids’ private information, the risks that result, and the legal system that enables “sharenting.”
Plunkett describes various modes of sharenting—including “commercial sharenting,” efforts by parents to use their families’ private experiences to make money—and unpacks the faulty assumptions made by our legal system about children, parents, and privacy. She proposes a “thought compass” to guide adults in their decision making about children’s digital data: play, forget, connect, and respect. Enshrining every false step and bad choice, Plunkett argues, can rob children of their chance to explore and learn lessons. The Internet needs to forget. We need to remember….(More)”.
Is Privacy and Personal Data Set to Become the New Intellectual Property?
Paper by Leon Trakman, Robert Walters, and Bruno Zeller: “A pressing concern today is whether the rationale underlying the protection of personal data is itself a meaningful foundation for according intellectual property (IP) rights in personal data to data subjects. In particular, are there particular technological attributes about the collection, use and processing of personal data on the Internet, and global access to that data, that provide a strong justification to extend IP rights to data subjects? A central issue in so determining is whether data subjects need the protection of such rights in a technological revolution in which they are increasingly exposed to the use and abuse of their personal data. A further question is how IP law can provide them with the requisite protection of their private space, or whether other means of protecting personal data, such as through general contract rights, render IP protections redundant, or at least, less necessary. This paper maintains that lawmakers often fail to distinguish between general property and IP protection of personal data; that IP protection encompasses important attributes of both property and contract law; and that laws that implement IP protection in light of its sui generis attributes are more fitting means of protecting personal data than the alternatives. The paper demonstrates that one of the benefits of providing IP rights in personal data goes some way to strengthening data subjects’ control and protection over their personal data and strengthening data protection law more generally. It also argues for greater harmonization of IP law across jurisdictions to ensure that the protection of personal data becomes more coherent and internationally sustainable….(More)”.
How to Build Artificial Intelligence We Can Trust
Gary Marcus and Ernest Davis at the New York Times: “Artificial intelligence has a trust problem. We are relying on A.I. more and more, but it hasn’t yet earned our confidence.
Tesla cars driving in Autopilot mode, for example, have a troubling history of crashing into stopped vehicles. Amazon’s facial recognition system works great much of the time, but when asked to compare the faces of all 535 members of Congress with 25,000 public arrest photos, it found 28 matches, when in reality there were none. A computer program designed to vet job applicants for Amazon was discovered to systematically discriminate against women. Every month new weaknesses in A.I. are uncovered.
The problem is not that today’s A.I. needs to get better at what it does. The problem is that today’s A.I. needs to try to do something completely different.
In particular, we need to stop building computer systems that merely get better and better at detecting statistical patterns in data sets — often using an approach known as deep learning — and start building computer systems that from the moment of their assembly innately grasp three basic concepts: time, space and causality….
We face a choice. We can stick with today’s approach to A.I. and greatly restrict what the machines are allowed to do (lest we end up with autonomous-vehicle crashes and machines that perpetuate bias rather than reduce it). Or we can shift our approach to A.I. in the hope of developing machines that have a rich enough conceptual understanding of the world that we need not fear their operation. Anything else would be too risky….(More)”.
How Should Scientists’ Access To Health Databanks Be Managed?
Richard Harris at NPR: “More than a million Americans have donated genetic information and medical data for research projects. But how that information gets used varies a lot, depending on the philosophy of the organizations that have gathered the data.
Some hold the data close, while others are working to make the data as widely available to as many researchers as possible — figuring science will progress faster that way. But scientific openness can be constrained by both practical and commercial considerations.
Three major projects in the United States illustrate these differing philosophies.
VA scientists spearhead research on veterans database
The first project involves three-quarters of a million veterans, mostly men over age 60. Every day, 400 to 500 blood samples show up in a modern lab in the basement of the Veterans Affairs hospital in Boston. Luis Selva, the center’s associate director, explains that robots extract DNA from the samples and then the genetic material is sent out for analysis….
Intermountain Healthcare teams with deCODE genetics
Our second example involves what is largely an extended family: descendants of settlers in Utah, primarily from the Church of Jesus Christ of Latter-day Saints. This year, Intermountain Healthcare in Utah announced that it was going to sequence the complete DNA of half a million of its patients, resulting in what the health system says will be the world’s largest collection of complete genomes….
NIH’s All of Us aims to diversify and democratize research
Our third and final example is an effort by the National Institutes of Health to recruit a million Americans for a long-term study of health, behavior and genetics. Its philosophy sharply contrasts with that of Intermountain Health.
“We do have a very strong goal around diversity, in making sure that the participants in the All of Us research program reflect the vast diversity of the United States,” says Stephanie Devaney, the program’s deputy director….(More)”.
Raw data won’t solve our problems — asking the right questions will
Stefaan G. Verhulst in apolitical: “If I had only one hour to save the world, I would spend fifty-five minutes defining the questions, and only five minutes finding the answers,” is a famous aphorism attributed to Albert Einstein.
Behind this quote is an important insight about human nature: Too often, we leap to answers without first pausing to examine our questions. We tout solutions without considering whether we are addressing real or relevant challenges or priorities. We advocate fixes for problems, or for aspects of society, that may not be broken at all.
This misordering of priorities is especially acute — and represents a missed opportunity — in our era of big data. Today’s data has enormous potential to solve important public challenges.
However, policymakers often fail to invest in defining the questions that matter, focusing mainly on the supply side of the data equation (“What data do we have or must have access to?”) rather than the demand side (“What is the core question and what data do we really need to answer it?” or “What data can or should we actually use to solve those problems that matter?”).
As such, data initiatives often provide marginal insights while at the same time generating unnecessary privacy risks by accessing and exploring data that may not in fact be needed at all in order to address the root of our most important societal problems.
A new science of questions
So what are the truly vexing questions that deserve attention and investment today? Toward what end should we strategically seek to leverage data and AI?
The truth is that policymakers and other stakeholders currently don’t have a good way of defining questions or identifying priorities, nor a clear framework to help us leverage the potential of data and data science toward the public good.
This is a situation we seek to remedy at The GovLab, an action research center based at New York University.
Our most recent project, the 100 Questions Initiative, seeks to begin developing a new science and practice of questions — one that identifies the most urgent questions in a participatory manner. Launched last month, the goal of this project is to develop a process that takes advantage of distributed and diverse expertise on a range of given topics or domains so as to identify and prioritize those questions that are high impact, novel and feasible.
Because we live in an age of data and much of our work focuses on the promises and perils of data, we seek to identify the 100 most pressing problems confronting the world that could be addressed by greater use of existing, often inaccessible, datasets through data collaboratives – new forms of cross-disciplinary collaboration beyond public-private partnerships focused on leveraging data for good….(More)”.