Crowdsourcing reliable local data


Paper by Jane Lawrence Sumner, Emily M. Farris, and Mirya R. Holman: “The adage “All politics is local” in the United States is largely true. Of the United States’ 90,106 governments, 99.9% are local governments. Despite variations in institutional features, descriptive representation, and policy making power, political scientists have been slow to take advantage of these variations. One obstacle is that comprehensive data on local politics is often extremely difficult to obtain; as a result, data is unavailable or costly, hard to replicate, and rarely updated.

We provide an alternative: crowdsourcing this data. We demonstrate and validate crowdsourcing data on local politics, using two different data collection projects. We evaluate different measures of consensus across coders and validate the crowd’s work against elite and professional datasets. In doing so, we show that crowd-sourced data is both highly accurate and easy to use, and we demonstrate that non-experts can be used to collect, validate, or update local data…. All data from the project available at https://dataverse.harvard.edu/dataverse/2chainz …(More)”.
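The abstract does not spell out its consensus measures, so the following is only a minimal sketch of one common approach: take the majority label across coders for each item, keep items where agreement clears a threshold, and compare the kept labels to a reference dataset. The function names, the threshold, and the sample data are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def consensus(labels):
    """Return (majority_label, agreement_share) for one item's crowd labels."""
    counts = Counter(labels)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(labels)


def validate(crowd, reference, min_agreement=0.5):
    """Compare crowd consensus with a reference dataset.

    Items whose agreement share does not exceed min_agreement, or that are
    missing from the reference, are skipped. Returns (accuracy, coverage):
    accuracy is the share of kept items matching the reference, coverage is
    the share of all crowd items that were kept.
    """
    kept = 0
    correct = 0
    for item, labels in crowd.items():
        label, share = consensus(labels)
        if share <= min_agreement or item not in reference:
            continue
        kept += 1
        if label == reference[item]:
            correct += 1
    accuracy = correct / kept if kept else float("nan")
    coverage = kept / len(crowd) if crowd else 0.0
    return accuracy, coverage


# Hypothetical example: three coders record the party of each town's mayor.
crowd_labels = {
    "Town A": ["D", "D", "R"],
    "Town B": ["R", "R", "R"],
    "Town C": ["D", "R", "I"],  # no majority, so it is dropped
}
reference_labels = {"Town A": "D", "Town B": "R", "Town C": "R"}

accuracy, coverage = validate(crowd_labels, reference_labels)
```

In this made-up example two of the three towns clear the agreement threshold and both match the reference, so accuracy is 1.0 with coverage of two thirds. Raising `min_agreement` trades coverage for confidence in the retained labels.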

Deep Fakes: A Looming Challenge for Privacy, Democracy, and National Security


Paper by Robert Chesney and Danielle Keats Citron: “Harmful lies are nothing new. But the ability to distort reality has taken an exponential leap forward with “deep fake” technology. This capability makes it possible to create audio and video of real people saying and doing things they never said or did. Machine learning techniques are escalating the technology’s sophistication, making deep fakes ever more realistic and increasingly resistant to detection.

Deep-fake technology has characteristics that enable rapid and widespread diffusion, putting it into the hands of both sophisticated and unsophisticated actors. While deep-fake technology will bring with it certain benefits, it also will introduce many harms. The marketplace of ideas already suffers from truth decay as our networked information environment interacts in toxic ways with our cognitive biases. Deep fakes will exacerbate this problem significantly. Individuals and businesses will face novel forms of exploitation, intimidation, and personal sabotage. The risks to our democracy and to national security are profound as well.

Our aim is to provide the first in-depth assessment of the causes and consequences of this disruptive technological change, and to explore the existing and potential tools for responding to it. We survey a broad array of responses, including: the role of technological solutions; criminal penalties, civil liability, and regulatory action; military and covert-action responses; economic sanctions; and market developments. We cover the waterfront from immunities to immutable authentication trails, offering recommendations to improve law and policy and anticipating the pitfalls embedded in various solutions….(More)”.

Privacy and Synthetic Datasets


Paper by Steven M. Bellovin, Preetam K. Dutta and Nathan Reitinger: “Sharing is a virtue, instilled in us from childhood. Unfortunately, when it comes to big data — i.e., databases possessing the potential to usher in a whole new world of scientific progress — the legal landscape prefers a hoggish motif. The historic approach to the resulting database–privacy problem has been anonymization, a subtractive technique incurring not only poor privacy results, but also lackluster utility. In anonymization’s stead, differential privacy arose; it provides better, near-perfect privacy, but is nonetheless subtractive in terms of utility.

Today, another solution is leaning into the fore, synthetic data. Using the magic of machine learning, synthetic data offers a generative, additive approach — the creation of almost-but-not-quite replica data. In fact, as we recommend, synthetic data may be combined with differential privacy to achieve a best-of-both-worlds scenario. After unpacking the technical nuances of synthetic data, we analyze its legal implications, finding both over and under inclusive applications. Privacy statutes either overweigh or downplay the potential for synthetic data to leak secrets, inviting ambiguity. We conclude by finding that synthetic data is a valid, privacy-conscious alternative to raw data, but is not a cure-all for every situation. In the end, computer science progress must be met with proper policy in order to move the area of useful data dissemination forward….(More)”.
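For readers unfamiliar with differential privacy, here is a minimal sketch of the standard Laplace mechanism applied to a counting query. It illustrates only the differential-privacy half of the combination the authors recommend. It does not generate synthetic data and is not the paper's method. The function names and the sample records are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def dp_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records that satisfy the predicate.

    A counting query changes by at most 1 when one record is added or
    removed (sensitivity 1), so Laplace noise with scale 1/epsilon gives
    epsilon-differential privacy for this single query.
    """
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical records: ages of seven people in a dataset.
ages = [23, 35, 41, 58, 62, 29, 47]
noisy_over_40 = dp_count(ages, lambda age: age > 40, epsilon=1.0)
```

Smaller values of `epsilon` add more noise, which is the utility cost the abstract describes as "subtractive".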

Challenges facing social media platforms in conflict prevention in Kenya since 2007: A case of Ushahidi platform


Paper by A.K. Njeru, B. Malakwen and M. Lumala in the International Academic Journal of Social Sciences and Education: “Throughout history, information has been a key factor in conflict management around the world. The media can play its important role as society’s watchdog by exposing to the masses what is essential but hidden; however, the same media may also be used to mobilize masses to violence. Social media can therefore act as a tool for widening the democratic space, but can also lead to destabilization of peace.

The aim of the study was to establish the challenges facing social media platforms in conflict prevention in Kenya since 2007: a case of Ushahidi platform in Kenya. The paradigm that was found suitable for this study is Pragmatism. The study used a mixed approach. In this study, interviews, focus group discussions and content analysis of the Ushahidi platform were chosen as the tools of data collection. In order to bring order, structure and interpretation to the collected data, the researcher systematically organized the data by coding it into categories and constructing matrixes. After classifying the data, the researcher compared and contrasted it to the information retrieved from the literature review.

The study found that one major weak point of social media as a tool for conflict prevention is the lack of ethical standards and professionalism among users. It is too liberal and thus can be used to spread unverified information and distorted facts that might be detrimental to peace building and conflict prevention. This has led some users to question the credibility of the information that is circulated through social media. The other weak point of social media as a tool for peace building is that it depends to a major extent on access to the internet. The availability of internet in low units doesn’t necessarily mean cheap access. So over time the high cost of internet might affect the efficiency of social media as a tool. The study concluded that information credibility is essential if social media is to be an effective tool in conflict prevention and peace building.

The nature of social media, which allows for anonymity of identity, gives room for unverified information to be floated around the social media networks; this can be detrimental to conflict prevention and peace building initiatives. There is therefore a need for information verification and authentication by a trusted agent, to offer information pertaining to violence, conflict prevention and peace building on the social media platforms. The study recommends that the Ushahidi platform should be seen as an agent of social change and should discuss the social mobilization which it may be able to bring about. The study further suggests that if we can look at the Ushahidi platform as a development agent, we can then take this a step further and ask, or try to find, a methodology that looks at the Ushahidi platform as a peacemaking agent, or as an aid to the maintenance of peace in a post-conflict setting, thereby tapping into the Ushahidi platform’s full potential….(More)”.

Political Lawyering for the 21st Century


Paper by Deborah N. Archer: “Legal education purports to prepare the next generation of lawyers capable of tackling the urgent and complex social justice challenges of our time. But law schools are failing in that public promise. Clinical education offers the best opportunity to overcome those failings by teaching the skills lawyers need to tackle systemic and interlocking legal and social problems. But too often even clinical education falls short: it adheres to conventional pedagogical methodologies that are overly narrow and, in the end, limit students’ abilities to manage today’s complex racial and social justice issues. This article contends that clinical education needs to embrace and reimagine political lawyering for the 21st century in order to prepare aspiring lawyers to tackle both new and chronic issues of injustice through a broad array of advocacy strategies….(More)”.

DNA databases are too white. This man aims to fix that.


Interview of Carlos D. Bustamante by David Rotman: “In the 15 years since the Human Genome Project first exposed our DNA blueprint, vast amounts of genetic data have been collected from millions of people in many different parts of the world. Carlos D. Bustamante’s job is to search that genetic data for clues to everything from ancient history and human migration patterns to the reasons people with different ancestries are so varied in their response to common diseases.

Bustamante’s career has roughly spanned the period since the Human Genome Project was completed. A professor of genetics and biomedical data science at Stanford and 2010 winner of a MacArthur genius award, he has helped to tease out the complex genetic variation across different populations. These variants mean that the causes of diseases can vary greatly between groups. Part of the motivation for Bustamante, who was born in Venezuela and moved to the US when he was seven, is to use those insights to lessen the medical disparities that still plague us.

But while it’s an area ripe with potential for improving medicine, it’s also fraught with controversies over how to interpret genetic differences between human populations. In an era still obsessed with race and ethnicity—and marred by the frequent misuse of science in defining the characteristics of different groups—Bustamante remains undaunted in searching for the nuanced genetic differences that these groups display.

Perhaps his optimism is due to his personality—few sentences go by without a “fantastic” or “extraordinarily exciting.” But it is also his recognition as a population geneticist of the incredible opportunity that understanding differences in human genomes presents for improving health and fighting disease.

David Rotman, MIT Technology Review’s editor at large, discussed with Bustamante why it’s so important to include more people in genetic studies and understand the genetics of different populations.

How good are we at making sure that the genomic data we’re collecting is inclusive?

I’m optimistic, but it’s not there yet.

In our 2011 paper, the statistic we had was that more than 96% of participants in genome-wide association studies were of European descent. In the follow-up in 2016, the number went from 96% to around 80%. So that’s getting better. Unfortunately, or perhaps fortunately, a lot of that is due to the entry of China into genetics. A lot of that was due to large-scale studies in Chinese and East Asian populations. Hispanics, for example, make up less than 1% of genome-wide association studies. So we need to do better. Ultimately, we want precision medicine to benefit everybody.

Aside from a fairness issue, why is diversity in genomic data important? What do we miss without it?

First of all, it has nothing to do with political correctness. It has everything to do with human biology and the fact that human populations and the great diaspora of human migrations have left their mark on the human genome. The genetic underpinnings of health and disease have shared components across human populations and things that are unique to different populations….(More)”.

Crowdsourcing the vote: New horizons in citizen forecasting


Article by Mickael Temporão, Yannick Dufresne, Justin Savoie and Clifton van der Linden in International Journal of Forecasting: “People do not know much about politics. This is one of the most robust findings in political science and is backed by decades of research. Most of this research has focused on people’s ability to know about political issues and party positions on these issues. But can people predict elections? Our research uses a very large dataset (n>2,000,000) collected during ten provincial and federal elections in Canada to test whether people can predict the electoral victor and the closeness of the race in their district throughout the campaign. The results show that they can. This paper also contributes to the emerging literature on citizen forecasting by developing a scaling method that allows us to compare the closeness of races and that can be applied to multiparty contexts with varying numbers of parties. Finally, we assess the accuracy of citizen forecasting in Canada when compared to voter expectations weighted by past votes and political competency….(More)”.

A Right to Reasonable Inferences: Re-Thinking Data Protection Law in the Age of Big Data and AI


Paper by Sandra Wachter and Brent Mittelstadt: “Big Data analytics and artificial intelligence (AI) draw non-intuitive and unverifiable inferences and predictions about the behaviors, preferences, and private lives of individuals. These inferences draw on highly diverse and feature-rich data of unpredictable value, and create new opportunities for discriminatory, biased, and invasive decision-making. Concerns about algorithmic accountability are often actually concerns about the way in which these technologies draw privacy invasive and non-verifiable inferences about us that we cannot predict, understand, or refute.

Data protection law is meant to protect people’s privacy, identity, reputation, and autonomy, but is currently failing to protect data subjects from the novel risks of inferential analytics. The broad concept of personal data in Europe could be interpreted to include inferences, predictions, and assumptions that refer to or impact on an individual. If seen as personal data, individuals are granted numerous rights under data protection law. However, the legal status of inferences is heavily disputed in legal scholarship, and marked by inconsistencies and contradictions within and between the views of the Article 29 Working Party and the European Court of Justice.

As we show in this paper, individuals are granted little control and oversight over how their personal data is used to draw inferences about them. Compared to other types of personal data, inferences are effectively ‘economy class’ personal data in the General Data Protection Regulation (GDPR). Data subjects’ rights to know about (Art 13-15), rectify (Art 16), delete (Art 17), object to (Art 21), or port (Art 20) personal data are significantly curtailed when it comes to inferences, often requiring a greater balance with controller’s interests (e.g. trade secrets, intellectual property) than would otherwise be the case. Similarly, the GDPR provides insufficient protection against sensitive inferences (Art 9) or remedies to challenge inferences or important decisions based on them (Art 22(3))….

In this paper we argue that a new data protection right, the ‘right to reasonable inferences’, is needed to help close the accountability gap currently posed by ‘high risk inferences’, meaning inferences that are privacy invasive or reputation damaging and have low verifiability in the sense of being predictive or opinion-based. In cases where algorithms draw ‘high risk inferences’ about individuals, this right would require ex-ante justification to be given by the data controller to establish whether an inference is reasonable. This disclosure would address (1) why certain data is a relevant basis to draw inferences; (2) why these inferences are relevant for the chosen processing purpose or type of automated decision; and (3) whether the data and methods used to draw the inferences are accurate and statistically reliable. The ex-ante justification is bolstered by an additional ex-post mechanism enabling unreasonable inferences to be challenged. A right to reasonable inferences must, however, be reconciled with EU jurisprudence and counterbalanced with IP and trade secrets law as well as freedom of expression and Article 16 of the EU Charter of Fundamental Rights: the freedom to conduct a business….(More)”.

A Doctor’s Prescription: Data May Finally Be Good for Your Health


Interview by Art Kleiner: “In 2015, Robert Wachter published The Digital Doctor: Hope, Hype, and Harm at the Dawn of Medicine’s Computer Age, a skeptical account of digitization in hospitals. Despite the promise offered by the digital transformation of healthcare, electronic health records had not delivered better care and greater efficiency. The cumbersome design, legacy procedures, and resistance from staff were frustrating everyone — administrators, nurses, consultants, and patients. Costs continued to rise, and preventable medical mistakes were not spotted. One patient at Wachter’s own hospital, one of the nation’s finest, was given 39 times the correct dose of antibiotics by an automated system that nobody questioned. The teenager survived, but it was clear that there needed to be a new approach to the management and use of data.

Wachter has for decades considered the delivery of healthcare through a lens focused on patient safety and quality. In 1996, he coauthored a paper in the New England Journal of Medicine that coined the term hospitalist in describing and promoting a new way of managing patients in hospitals: having one doctor — the hospitalist — “own” the patient journey from admission to discharge. The primary goal was to improve outcomes and save lives. Wachter argued it would also reduce costs and increase efficiency, making the business case for better healthcare. And he was right. Today there are more than 50,000 hospitalists, and it took just two years from the article’s publication to have the first data proving his point. In 2016, Wachter was named chair of the Department of Medicine at the University of California, San Francisco (UCSF), where he has worked since 1990.

Today, Wachter is, to paraphrase the title of a recent talk, less grumpy than he used to be about health tech. The hope part of his book’s title has materialized in some areas faster than he predicted. AI’s advances in imaging are already helping the detection of cancers become more accurate. As data collection has become better systematized, big technology firms such as Google, Amazon, and Apple are entering (in Google’s case, reentering) the field and having more success focusing their problem-solving skills on healthcare issues. In his San Francisco office, Wachter sat down with strategy+business to discuss why the healthcare system may finally be about to change….

Systems for Fresh Thinking

S+B: The changes you appreciate seem to have less to do with technological design and more to do with people getting used to the new systems, building their own variations, and making them work.
WACHTER: The original electronic health record was just a platform play to get the data in digital form. It didn’t do anything particularly helpful in terms of helping the physicians make better decisions or helping to connect one kind of doctor with another kind of doctor. But it was a start.

I remember that when we were starting to develop our electronic health record at UCSF, 12 or 13 years ago, I hired a physician who is now in charge of our health computer system. I said to him, “We don’t have our electronic health record in yet, but I’m pretty sure we will in seven or eight years. What will your job be when that’s done?” I actually thought once the system was fully implemented, we’d be done with the need to innovate and evolve in health IT. That, of course, was asinine.

S+B: That’s like saying to an auto mechanic, “What will your job be when we have automatic transmissions?”
WACHTER: Right, but even more so, because many of us saw electronic health records as the be-all and end-all of digitally facilitated medicine. But putting in the electronic health record is just step one of 10. Then you need to start connecting all the pieces, and then you add analytics that make sense of the data and make predictions. Then you build tools and apps to fit into the workflow and change the way you work.

One of my biggest epiphanies was this: When you digitize, in any industry, nobody is clever enough to actually change anything. All they know how to do is digitize the old practice. You only start seeing real progress when smart people come in, begin using the new system, and say, “Why the hell do we do it that way?” And then you start thinking freshly about the work. That’s when you have a chance to reimagine the work in a digital environment…(More)”.

Human Rights in the Big Data World


Paper by Francis Kuriakose and Deepa Iyer: “An ethical approach to human rights conceives and evaluates law through the underlying value concerns. This paper examines human rights after the introduction of big data using an ethical approach to rights. First, the central value concerns such as equity, equality, sustainability and security are derived from the history of the digital technological revolution. Then, the properties and characteristics of big data are analyzed to understand emerging value concerns such as accountability, transparency, traceability, explainability and disprovability.

Using these value points, this paper argues that big data calls for two types of evaluations regarding human rights. The first is the reassessment of existing human rights in the digital sphere, predominantly through the right to equality and the right to work. The second is the conceptualization of new digital rights such as the right to privacy and the right against propensity-based discrimination. The paper concludes that as we increasingly share the world with intelligence systems, these new values expand and modify the existing human rights paradigm….(More)”.