Rationality and politics of algorithms. Will the promise of big data survive the dynamics of public decision-making?


Paper by H.G. (Haiko) van der Voort et al.: “Big data promises to transform public decision-making for the better by making it more responsive to actual needs and policy effects. However, much recent work on big data in public decision-making assumes a rational view of decision-making, which has been much criticized in the public administration debate.

In this paper, we apply this view, and a more political one, to the context of big data and offer a qualitative study. We question the impact of big data on decision-making, realizing that big data – including its new methods and functions – must inevitably encounter existing political and managerial institutions. By studying two illustrative cases of big data use processes, we explore how these two worlds meet. Specifically, we look at the interaction between data analysts and decision makers.

In this we distinguish between a rational view and a political view, and between an information logic and a decision logic. We find that big data provides ample opportunities for both analysts and decision makers to do a better job, but this doesn’t necessarily imply better decision-making, because big data also provides opportunities for actors to pursue their own interests. Big data enables both data analysts and decision makers to act as autonomous agents rather than as links in a functional chain. Therefore, big data’s impact cannot be interpreted only in terms of its functional promise; it must also be acknowledged as a phenomenon set to impact our policymaking institutions, including their legitimacy….(More)”.

The causal effect of trust


Paper by Björn Bartling, Ernst Fehr, David Huffman and Nick Netzer: “Trust affects almost all human relationships – in families, organizations, markets and politics. However, identifying the conditions under which trust, defined as people’s beliefs in the trustworthiness of others, has a causal effect on the efficiency of human interactions has proven to be difficult. We show experimentally and theoretically that trust indeed has a causal effect. The duration of the effect depends, however, on whether initial trust variations are supported by multiple equilibria.

We study a repeated principal-agent game with multiple equilibria and document empirically that an efficient equilibrium is selected if principals believe that agents are trustworthy, while players coordinate on an inefficient equilibrium if principals believe that agents are untrustworthy. Yet, if we change the institutional environment such that there is a unique equilibrium, initial variations in trust have short-run effects only. Moreover, if we weaken contract enforcement in the latter environment, exogenous variations in trust do not even have a short-run effect. The institutional environment thus appears to be key for whether trust has causal effects and whether the effects are transient or persistent…(More)”.
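To make the multiple-equilibria logic concrete, here is a toy sketch in Python with assumed payoffs (purely illustrative, not the paper’s experimental design): in a repeated trust game, an agent who is trusted can be kept honest by the threat of permanent distrust, but only when future payoffs matter enough, so an efficient “trust and honor” equilibrium and an inefficient “no trust” equilibrium can coexist, and principals’ beliefs determine which one is played.

    # Toy repeated trust game with assumed payoffs (not the paper's design).
    # Cooperation yields 1 per period to the agent; a one-off betrayal yields 2,
    # after which the principal never trusts again (payoff 0 thereafter).
    def honor_sustainable(honor_payoff=1.0, betray_payoff=2.0, discount=0.6):
        """Grim-trigger check: honoring trust is an equilibrium strategy if the
        discounted value of ongoing cooperation beats a one-off betrayal."""
        cooperate_forever = honor_payoff / (1 - discount)  # 1 + d + d^2 + ...
        betray_once = betray_payoff                        # then permanent distrust
        return cooperate_forever >= betray_once

    print(honor_sustainable(discount=0.6))  # True  -> the efficient equilibrium also exists
    print(honor_sustainable(discount=0.3))  # False -> only the no-trust outcome remains

In the paper’s terms, an institutional change that leaves only one equilibrium is what makes initial variations in trust transient rather than persistent.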

The Lack of Decentralization of Data: Barriers, Exclusivity, and Monopoly in Open Data


Paper by Carla Hamida and Amanda Landi: “Recently, Facebook creator Mark Zuckerberg was on trial for the misuse of personal data. In 2013, the National Security Agency was exposed by Edward Snowden for invading the privacy of inhabitants of the United States by examining personal data. We see in the news examples, like the two just described, of government agencies and private companies being less than truthful about their use of our data. A related issue is that these same government agencies and private companies do not share their own data, and this creates the problem of data openness.

Government, academics, and citizens can play a role in making data more open. At present, there are non-profit organizations that research data openness, such as the Open Data Charter, the Global Open Data Index, and the Open Data Barometer. These organizations have different methods of measuring the openness of data, which leads us to question what open data means, how one measures how open data is, who decides how open data should be, and to what extent society is affected by the availability, or lack of availability, of data. In this paper, we explore these questions with an examination of two of the non-profit organizations that study the open data problem extensively….(More)”.

Crowdsourcing reliable local data


Paper by Jane Lawrence Sumner, Emily M. Farris, and Mirya R. Holman: “In the United States, the adage “All politics is local” is largely true. Of the United States’ 90,106 governments, 99.9% are local governments. Despite variations in institutional features, descriptive representation, and policy-making power, political scientists have been slow to take advantage of these variations. One obstacle is that comprehensive data on local politics is often extremely difficult to obtain; as a result, data is unavailable or costly, hard to replicate, and rarely updated.

We provide an alternative: crowdsourcing this data. We demonstrate and validate crowdsourcing data on local politics, using two different data collection projects. We evaluate different measures of consensus across coders and validate the crowd’s work against elite and professional datasets. In doing so, we show that crowdsourced data is both highly accurate and easy to use, and we demonstrate that non-experts can be used to collect, validate, or update local data….All data from the project are available at https://dataverse.harvard.edu/dataverse/2chainz …(More)”.
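As a rough illustration of how consensus across coders can be measured (a generic sketch, not the authors’ actual pipeline), one can aggregate the labels that several coders assign to the same item by majority vote, record the share of coders who agree, and flag low-agreement items for validation against an elite or professional dataset:

    # Minimal consensus measure for crowd-coded items (assumed workflow).
    from collections import Counter

    def aggregate(codings, threshold=0.8):
        """codings: list of labels supplied by different coders for one item."""
        counts = Counter(codings)
        label, votes = counts.most_common(1)[0]
        agreement = votes / len(codings)  # share of coders agreeing with the modal label
        return {"label": label, "agreement": agreement, "flag": agreement < threshold}

    # Example: five coders recording the party of a hypothetical mayor.
    print(aggregate(["Democrat", "Democrat", "Democrat", "Republican", "Democrat"]))
    # -> {'label': 'Democrat', 'agreement': 0.8, 'flag': False}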

Deep Fakes: A Looming Challenge for Privacy, Democracy, and National Security


Paper by Robert Chesney and Danielle Keats Citron: “Harmful lies are nothing new. But the ability to distort reality has taken an exponential leap forward with “deep fake” technology. This capability makes it possible to create audio and video of real people saying and doing things they never said or did. Machine learning techniques are escalating the technology’s sophistication, making deep fakes ever more realistic and increasingly resistant to detection.

Deep-fake technology has characteristics that enable rapid and widespread diffusion, putting it into the hands of both sophisticated and unsophisticated actors. While deep-fake technology will bring with it certain benefits, it also will introduce many harms. The marketplace of ideas already suffers from truth decay as our networked information environment interacts in toxic ways with our cognitive biases. Deep fakes will exacerbate this problem significantly. Individuals and businesses will face novel forms of exploitation, intimidation, and personal sabotage. The risks to our democracy and to national security are profound as well.

Our aim is to provide the first in-depth assessment of the causes and consequences of this disruptive technological change, and to explore the existing and potential tools for responding to it. We survey a broad array of responses, including: the role of technological solutions; criminal penalties, civil liability, and regulatory action; military and covert-action responses; economic sanctions; and market developments. We cover the waterfront from immunities to immutable authentication trails, offering recommendations to improve law and policy and anticipating the pitfalls embedded in various solutions….(More)”.

Privacy and Synthetic Datasets


Paper by Steven M. Bellovin, Preetam K. Dutta and Nathan Reitinger: “Sharing is a virtue, instilled in us from childhood. Unfortunately, when it comes to big data — i.e., databases possessing the potential to usher in a whole new world of scientific progress — the legal landscape prefers a hoggish motif. The historic approach to the resulting database–privacy problem has been anonymization, a subtractive technique incurring not only poor privacy results, but also lackluster utility. In anonymization’s stead, differential privacy arose; it provides better, near-perfect privacy, but is nonetheless subtractive in terms of utility.

Today, another solution is coming to the fore: synthetic data. Using the magic of machine learning, synthetic data offers a generative, additive approach — the creation of almost-but-not-quite replica data. In fact, as we recommend, synthetic data may be combined with differential privacy to achieve a best-of-both-worlds scenario. After unpacking the technical nuances of synthetic data, we analyze its legal implications, finding both over- and under-inclusive applications. Privacy statutes either overweigh or downplay the potential for synthetic data to leak secrets, inviting ambiguity. We conclude by finding that synthetic data is a valid, privacy-conscious alternative to raw data, but is not a cure-all for every situation. In the end, computer science progress must be met with proper policy in order to move the area of useful data dissemination forward….(More)”.
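As a concrete illustration of the best-of-both-worlds combination the authors recommend, the sketch below (a minimal assumed approach, not the paper’s implementation) generates synthetic records by sampling from a histogram of the raw data to which Laplace noise has been added, so the released records are differentially private with respect to the originals:

    # Minimal differentially private synthetic-data sketch (illustrative only).
    import numpy as np

    def dp_synthetic(records, domain, epsilon, n_synthetic, seed=0):
        """records: tuples drawn from the finite attribute `domain` (list of tuples)."""
        rng = np.random.default_rng(seed)
        # 1. Histogram of the raw data over the full attribute domain.
        counts = np.array([sum(r == cell for r in records) for cell in domain], dtype=float)
        # 2. Laplace mechanism: each record affects one cell, so sensitivity is 1.
        noisy = counts + rng.laplace(scale=1.0 / epsilon, size=len(domain))
        # 3. Clip negatives and normalize into a sampling distribution.
        probs = np.clip(noisy, 0, None)
        probs = probs / probs.sum() if probs.sum() > 0 else np.full(len(domain), 1 / len(domain))
        # 4. Sample synthetic records; only the noisy histogram ever touches the raw data.
        idx = rng.choice(len(domain), size=n_synthetic, p=probs)
        return [domain[i] for i in idx]

    # Tiny example with two binary attributes (age_over_40, smoker).
    domain = [(a, s) for a in (0, 1) for s in (0, 1)]
    raw = [(1, 0), (1, 1), (0, 0), (1, 0), (0, 1)]
    print(dp_synthetic(raw, domain, epsilon=1.0, n_synthetic=10))

Production systems typically train richer generative models (Bayesian networks, GANs, and the like) under a privacy budget; the point of the sketch is only that synthetic records are drawn from a noised summary rather than released raw.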

Challenges facing social media platforms in conflict prevention in Kenya since 2007: A case of Ushahidi platform


Paper by A.K. Njeru, B. Malakwen and M. Lumala in the International Academic Journal of Social Sciences and Education: “Throughout history, information has been a key factor in conflict management around the world. The media can play its important role as society’s watchdog by exposing to the masses what is essential but hidden; however, the same media may also be used to mobilize masses to violence. Social media can therefore act as a tool for widening the democratic space, but can also lead to destabilization of peace.

The aim of the study was to establish the challenges facing social media platforms in conflict prevention in Kenya since 2007, taking the Ushahidi platform in Kenya as a case. The paradigm found suitable for this study is pragmatism, and the study used a mixed-methods approach. Interviews, focus group discussions and content analysis of the Ushahidi platform were chosen as the tools of data collection. In order to bring order, structure and interpretation to the collected data, the researcher systematically organized the data by coding it into categories and constructing matrices. After classifying the data, the researcher compared and contrasted it with the information retrieved from the literature review.

The study found that one major weak point of social media as a tool for conflict prevention is the lack of ethical standards and professionalism among its users. It is too liberal and thus can be used to spread unverified information and distorted facts that might be detrimental to peace building and conflict prevention. This has already led some users to question the credibility of the information circulated through social media. The other weak point of social media as a tool for peace building is that it depends to a major extent on access to the internet. The availability of the internet in small units does not necessarily mean cheap access, so over time the high cost of internet access might affect the effectiveness of social media as a tool. The study concluded that information credibility is essential if social media is to be effective as a tool in conflict prevention and peace building.

The nature of social media, which allows for anonymity of identity, gives room for unverified information to be floated around social media networks; this can be detrimental to conflict prevention and peace-building initiatives. There is therefore a need for verification and authentication, by a trusted agent, of information appertaining to violence, conflict prevention and peace building on social media platforms. The study recommends that the Ushahidi platform should be seen as an agent of social change and that attention should be paid to the social mobilization it may be able to bring about. The study further suggests that if the Ushahidi platform can be viewed as a development agent, this could be taken a step further by seeking a methodology that treats the platform as a peacemaking agent, or as a means of assisting in the maintenance of peace in post-conflict settings, thereby tapping into the Ushahidi platform’s full potential….(More)”.

Political Lawyering for the 21st Century


Paper by Deborah N. Archer: “Legal education purports to prepare the next generation of lawyers capable of tackling the urgent and complex social justice challenges of our time. But law schools are failing in that public promise. Clinical education offers the best opportunity to overcome those failings by teaching the skills lawyers need to tackle systemic and interlocking legal and social problems. But too often even clinical education falls short: it adheres to conventional pedagogical methodologies that are overly narrow and, in the end, limit students’ abilities to manage today’s complex racial and social justice issues. This article contends that clinical education needs to embrace and reimagine political lawyering for the 21st century in order to prepare aspiring lawyers to tackle both new and chronic issues of injustice through a broad array of advocacy strategies….(More)”.

DNA databases are too white. This man aims to fix that.


Interview with Carlos D. Bustamante by David Rotman: “In the 15 years since the Human Genome Project first exposed our DNA blueprint, vast amounts of genetic data have been collected from millions of people in many different parts of the world. Carlos D. Bustamante’s job is to search that genetic data for clues to everything from ancient history and human migration patterns to the reasons people with different ancestries are so varied in their response to common diseases.

Bustamante’s career has roughly spanned the period since the Human Genome Project was completed. A professor of genetics and biomedical data science at Stanford and 2010 winner of a MacArthur genius award, he has helped to tease out the complex genetic variation across different populations. These variants mean that the causes of diseases can vary greatly between groups. Part of the motivation for Bustamante, who was born in Venezuela and moved to the US when he was seven, is to use those insights to lessen the medical disparities that still plague us.

But while it’s an area ripe with potential for improving medicine, it’s also fraught with controversies over how to interpret genetic differences between human populations. In an era still obsessed with race and ethnicity—and marred by the frequent misuse of science in defining the characteristics of different groups—Bustamante remains undaunted in searching for the nuanced genetic differences that these groups display.

Perhaps his optimism is due to his personality—few sentences go by without a “fantastic” or “extraordinarily exciting.” But it is also his recognition as a population geneticist of the incredible opportunity that understanding differences in human genomes presents for improving health and fighting disease.

David Rotman, MIT Technology Review’s editor at large, discussed with Bustamante why it’s so important to include more people in genetic studies and understand the genetics of different populations.

How good are we at making sure that the genomic data we’re collecting is inclusive?

I’m optimistic, but it’s not there yet.

In our 2011 paper, the statistic we had was that more than 96% of participants in genome-wide association studies were of European descent. In the follow-up in 2016, the number went from 96% to around 80%. So that’s getting better. Unfortunately, or perhaps fortunately, a lot of that is due to the entry of China into genetics. A lot of that was due to large-scale studies in Chinese and East Asian populations. Hispanics, for example, make up less than 1% of genome-wide association studies. So we need to do better. Ultimately, we want precision medicine to benefit everybody.

Aside from a fairness issue, why is diversity in genomic data important? What do we miss without it?

First of all, it has nothing to do with political correctness. It has everything to do with human biology and the fact that human populations and the great diaspora of human migrations have left their mark on the human genome. The genetic underpinnings of health and disease have shared components across human populations and things that are unique to different populations….(More)”.

Crowdsourcing the vote: New horizons in citizen forecasting


Article by Mickael Temporão, Yannick Dufresne, Justin Savoie, and Clifton van der Linden in the International Journal of Forecasting: “People do not know much about politics. This is one of the most robust findings in political science and is backed by decades of research. Most of this research has focused on people’s ability to know about political issues and party positions on these issues. But can people predict elections? Our research uses a very large dataset (n > 2,000,000) collected during ten provincial and federal elections in Canada to test whether people can predict the electoral victor and the closeness of the race in their district throughout the campaign. The results show that they can. This paper also contributes to the emerging literature on citizen forecasting by developing a scaling method that allows us to compare the closeness of races and that can be applied to multiparty contexts with varying numbers of parties. Finally, we assess the accuracy of citizen forecasting in Canada when compared to voter expectations weighted by past votes and political competency….(More)”.
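The paper develops its own scaling method for comparing the closeness of races; as a much simpler stand-in (purely illustrative, not the authors’ measure), closeness in a multiparty district could be expressed as the gap between the two leading parties’ mean citizen-forecast vote shares:

    # Naive closeness measure for a multiparty race (illustrative, not the paper's method).
    def closeness(forecast_shares):
        """forecast_shares: dict mapping party -> mean citizen-forecast vote share (0-1)."""
        top_two = sorted(forecast_shares.values(), reverse=True)[:2]
        return top_two[0] - top_two[1]  # 0 = dead heat, larger = safer seat

    # Hypothetical three-party district:
    print(round(closeness({"Liberal": 0.41, "Conservative": 0.38, "NDP": 0.21}), 3))  # 0.03 -> very close

Whatever the measure, it has to remain comparable whether a district has two viable parties or five, which is precisely the comparability problem the paper’s scaling method is designed to address.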