Can Big Data Revolutionize International Human Rights Law?


Galit A. Sarfaty in the Journal of International Law: “International human rights efforts have been overly reliant on reactive tools and focused on treaty compliance, while often underemphasizing the prevention of human rights violations. I argue that data analytics can play an important role in refocusing the international human rights regime on its original goal of preventing human rights abuses, but it comes at a cost.

There are risks in advancing a data-driven approach to human rights, including the privileging of certain rights subject to quantitative measurement and the precipitation of further human rights abuses in the process of preventing other violations. Moreover, the increasing use of big data can ultimately privatize the international human rights regime by transforming the corporation into a primary gatekeeper of rights protection. Such unintended consequences need to be addressed in order to maximize the benefits and minimize the risks of using big data in this field….(More)”.

Data-Intensive Approaches To Creating Innovation For Sustainable Smart Cities


Science Trends: “Located at the complex intersection of economic development and environmental change, cities play a central role in our efforts to move towards sustainability. Reducing air and water pollution, improving energy efficiency while securing energy supply, and minimizing vulnerabilities to disruptions and disturbances are interconnected and pose a formidable challenge, with their dynamic interactions changing in highly complex and unpredictable ways….

The Beijing City Lab demonstrates the usefulness of open urban data in mapping urbanization with a fine spatiotemporal scale and reflecting social and environmental dimensions of urbanization through visualization at multiple scales.

The basic principle of open data will generate significant opportunities for promoting interdisciplinary and inter-organizational research, producing new data sets through the integration of different sources, avoiding duplication of research, facilitating the verification of previous results, and encouraging citizen scientists and crowdsourcing approaches. Open data is also expected to help governments promote transparency, citizen participation, and access to information in policy-making processes.

Despite this significant potential, however, numerous challenges remain in facilitating innovation for urban sustainability through open data. The scope and amount of data collected and shared are still limited, and quality control, error monitoring, and cleaning of open data are indispensable for securing the reliability of analysis. The organizational and legal frameworks of data-sharing platforms are often not well defined or established, and it is critical to address interoperability between data standards, the balance between open and proprietary data, and normative and legal issues such as data ownership, personal privacy, confidentiality, law enforcement, and the maintenance of public safety and national security….
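
To make the quality-control point concrete, here is a minimal sketch (with an entirely invented slice of a city air-quality feed) of the kind of cleaning and error-monitoring step described above:

```python
import pandas as pd

# Entirely invented slice of an open city air-quality feed.
readings = pd.DataFrame({
    "station_id": ["S1", "S1", "S2", None, "S3"],
    "pm25": [12.0, 12.0, -4.0, 30.0, 2400.0],
})

# Basic quality control: drop exact duplicates and rows missing key fields.
readings = readings.drop_duplicates().dropna(subset=["station_id"])

# Error monitoring: flag physically implausible values rather than silently deleting them.
readings["suspect"] = (readings["pm25"] < 0) | (readings["pm25"] > 1000)
print(f"{int(readings['suspect'].sum())} suspect readings out of {len(readings)}")

# Keep only plausible rows for analysis, preserving the flagged set for later review.
clean = readings.loc[~readings["suspect"]].copy()
```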

These findings are described in the article entitled Facilitating data-intensive approaches to innovation for sustainability: opportunities and challenges in building smart cities, published in the journal Sustainability Science. This work was led by Masaru Yarime from the City University of Hong Kong….(More)”.

Government data: How open is too open?


Sharon Fisher at HPE: “The notion of “open government” appeals to both citizens and IT professionals seeking access to freely available government data. But is there such a thing as data access being too open? Governments may want to be transparent, yet they need to avoid releasing personally identifiable information.

There’s no question that open government data offers many benefits. It gives citizens access to the data their taxes paid for, enables government oversight, and powers the applications developed by government, vendors, and citizens that improve people’s lives.

However, data breaches and concerns about the amount of data that government is collecting make some people wonder: When is it too much?

“As we think through the big questions about what kind of data a state should collect, how it should use it, and how to disclose it, these nuances become not some theoretical issue but a matter of life and death to some people,” says Alexander Howard, deputy director of the Sunlight Foundation, a Washington nonprofit that advocates for open government. “There are people in government databases where the disclosure of their [physical] location is the difference between a life-changing day and Wednesday.”

Open data supporters point out that much of this data has been considered a public record all along and tout the value of its use in analytics. But having personal data aggregated in a single place that is accessible online—as opposed to, say, having to go to an office and physically look up each record—makes some people uneasy.

Privacy breaches, wholesale

“We’ve seen a real change in how people perceive privacy,” says Michael Morisy, executive director at MuckRock, a Cambridge, Massachusetts, nonprofit that helps media and citizens file public records requests. “It’s been driven by a long-standing concept in transparency: practical obscurity.” Even if something was technically a public record, effort needed to be expended to get one’s hands on it. That amount of work might be worth it for, say, someone running for office, but on the whole, private citizens didn’t have to worry. Things are different now, says Morisy. “With Google, and so much data being available at the click of a mouse or the tap of a phone, what was once practically obscure is now instantly available.”

People are sometimes also surprised to find out that public records can contain their personally identifiable information (PII), such as addresses, phone numbers, and even Social Security numbers. That may be on purpose or because someone failed to redact the data properly.
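
As a rough sketch of what redaction before release can involve (a purely illustrative example, not any agency’s actual process), the snippet below masks US-style Social Security and phone numbers in free text:

```python
import re

# Illustrative patterns only; real redaction pipelines need far more robust rules plus human review.
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")

def redact(text: str) -> str:
    """Replace matched identifiers with fixed placeholders."""
    text = SSN.sub("[REDACTED-SSN]", text)
    text = PHONE.sub("[REDACTED-PHONE]", text)
    return text

record = "Permit holder: J. Doe, 555-12-3456, phone 212-555-0100"
print(redact(record))
# Permit holder: J. Doe, [REDACTED-SSN], phone [REDACTED-PHONE]
```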

Such exposure has had consequences. Over the years, there have been a number of incidents in which PII from public records, including addresses, was used to harass and sometimes even kill people. For example, in 1989, Rebecca Schaeffer was murdered by a stalker who learned her address from the Department of Motor Vehicles. Other examples of harassment via driver’s license records include thieves who tracked down the addresses of owners of expensive cars and activists who sent anti-abortion literature to women who had visited health clinics that performed abortions.

In response, in 1994, Congress enacted the Driver’s Privacy Protection Act to restrict the sale of such data. More recently, the state of Idaho passed a law protecting the identity of hunters who shot wolves, because the hunters were being harassed by wolf supporters. Similarly, the state of New York allowed concealed pistol permit holders to make their name and address private after a newspaper published an online interactive map showing the names and addresses of all handgun permit holders in Westchester and Rockland counties….(More)”.

Cops, Docs, and Code: A Dialogue between Big Data in Health Care and Predictive Policing


Paper by I. Glenn Cohen and Harry Graver: “Big data” has become the ubiquitous watchword of this decade. Predictive analytics, the use of electronic algorithms to forecast future events in real time, is something we want to do with big data, and it is interfacing with the law in a myriad of settings: how votes are counted and voter rolls revised, the targeting of taxpayers for auditing, the selection of travelers for more intensive searching, pharmacovigilance, the creation of new drugs and diagnostics, etc.
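
As a stylized illustration of such forecasting (our own sketch, not the authors’ method), the snippet below fits a simple logistic-regression risk score on toy historical outcomes and then scores a new case, using scikit-learn:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy historical data: each row is a past case with two illustrative features,
# and the label records whether the event of interest later occurred.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=200) > 0).astype(int)

model = LogisticRegression().fit(X, y)

# "Real-time" use: turn a new, unseen case into a probability of the event.
new_case = np.array([[0.8, -0.2]])
print(f"Predicted probability of event: {model.predict_proba(new_case)[0, 1]:.2f}")
```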

In this paper, written for the symposium “Future Proofing the Law,” we want to engage in a bit of legal arbitrage; that is, we want to examine which insights from legal analysis of predictive analytics in better-trodden ground — predictive policing — can be useful for understanding relatively newer ground for legal scholars — the use of predictive analytics in health care. To the degree lessons can be learned from this dialogue, we think they go in both directions….(More)”.

New York City moves to create accountability for algorithms


Lauren Kirchner at ArsTechnica: “The algorithms that play increasingly central roles in our lives often emanate from Silicon Valley, but the effort to hold them accountable may have another epicenter: New York City. Last week, the New York City Council unanimously passed a bill to tackle algorithmic discrimination—the first measure of its kind in the country.

The algorithmic accountability bill, waiting to be signed into law by Mayor Bill de Blasio, establishes a task force that will study how city agencies use algorithms to make decisions that affect New Yorkers’ lives, and whether any of the systems appear to discriminate against people based on age, race, religion, gender, sexual orientation, or citizenship status. The task force’s report will also explore how to make these decision-making processes understandable to the public.
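
As a rough illustration of the kind of aggregate check such a task force might run (our own sketch, not anything prescribed by the bill), the snippet below compares favorable-outcome rates across groups in a hypothetical decision log:

```python
import pandas as pd

# Hypothetical decision log from an automated system; groups and outcomes are invented.
decisions = pd.DataFrame({
    "group":    ["A", "A", "A", "A", "B", "B", "B", "B"],
    "approved": [1,   1,   1,   0,   1,   0,   0,   0],
})

# Favorable-outcome rate per group.
rates = decisions.groupby("group")["approved"].mean()

# A simple disparate-impact ratio: lowest group rate divided by highest.
ratio = rates.min() / rates.max()
print(rates)
print(f"Disparate-impact ratio: {ratio:.2f} (values well below 1.0 warrant closer scrutiny)")
```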

The bill’s sponsor, Council Member James Vacca, said he was inspired by ProPublica’s investigation into racially biased algorithms used to assess the criminal risk of defendants….

A previous, more sweeping version of the bill had mandated that city agencies publish the source code of all algorithms being used for “targeting services” or “imposing penalties upon persons or policing” and to make them available for “self-testing” by the public. At a hearing at City Hall in October, representatives from the mayor’s office expressed concerns that this mandate would threaten New Yorkers’ privacy and the government’s cybersecurity.

The bill was one of two moves the City Council made last week concerning algorithms. On Thursday, the committees on health and public safety held a hearing on the city’s forensic methods, including controversial tools that the chief medical examiner’s office crime lab has used for difficult-to-analyze samples of DNA.

As a ProPublica/New York Times investigation detailed in September, an algorithm created by the lab for complex DNA samples has been called into question by scientific experts and former crime lab employees.

The software, called the Forensic Statistical Tool, or FST, has never been adopted by any other lab in the country….(More)”.

The whys of social exclusion: insights from behavioral economics


Paper by Karla Hoff and James Sonam Walsh: “All over the world, people are prevented from participating fully in society through mechanisms that go beyond the structural and institutional barriers identified by rational choice theory (poverty, exclusion by law or force, taste-based and statistical discrimination, and externalities from social networks).

This essay discusses four additional mechanisms that bounded rationality can explain: (i) implicit discrimination, (ii) self-stereotyping and self-censorship, (iii) “fast thinking” adapted to underclass neighborhoods, and (iv) “adaptive preferences,” in which an oppressed group views its oppression as natural or even preferred.

Stable institutions have cognitive foundations — concepts, categories, social identities, and worldviews — that function like lenses through which individuals see themselves and the world. Abolishing or reforming a discriminatory institution may have little effect on these lenses. Groups previously discriminated against by law or policy may remain excluded through habits of the mind. Behavioral economics recognizes forces of social exclusion left out of rational choice theory, and identifies ways to overcome them. Some interventions have had very consequential impact….(More)”.

Accountability of AI Under the Law: The Role of Explanation


Paper by Finale Doshi-Velez and Mason Kortz: “The ubiquity of systems using artificial intelligence or “AI” has brought increasing attention to how those systems should be regulated. The choice of how to regulate AI systems will require care. AI systems have the potential to synthesize large amounts of data, allowing for greater levels of personalization and precision than ever before—applications range from clinical decision support to autonomous driving and predictive policing. That said, our AIs continue to lag in common sense reasoning [McCarthy, 1960], and thus there exist legitimate concerns about the intentional and unintentional negative consequences of AI systems [Bostrom, 2003, Amodei et al., 2016, Sculley et al., 2014]. How can we take advantage of what AI systems have to offer, while also holding them accountable?

In this work, we focus on one tool: explanation. Questions about a legal right to explanation from AI systems were recently debated in the context of the EU General Data Protection Regulation [Goodman and Flaxman, 2016, Wachter et al., 2017a], and thus thinking carefully about when and how explanation from AI systems might improve accountability is timely. Good choices about when to demand explanation can help prevent negative consequences from AI systems, while poor choices may not only fail to hold AI systems accountable but also hamper the development of much-needed beneficial AI systems.

Below, we briefly review current societal, moral, and legal norms around explanation, and then focus on the different contexts under which explanation is currently required under the law. We find that there exists great variation around when explanation is demanded, but there also exist important consistencies: when demanding explanation from humans, what we typically want to know is whether and how certain input factors affected the final decision or outcome.

These consistencies allow us to list the technical considerations that must be addressed if we desire AI systems that can provide the kinds of explanations currently required of humans under the law. Contrary to the popular view of AI systems as indecipherable black boxes, we find that this level of explanation should generally be technically feasible but may sometimes be practically onerous—there are certain aspects of explanation that may be simple for humans to provide but challenging for AI systems, and vice versa. As an interdisciplinary team of legal scholars, computer scientists, and cognitive scientists, we recommend that, for the present, AI systems can and should be held to a similar standard of explanation as humans currently are; in the future we may wish to hold an AI to a different standard….(More)”
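
As a minimal sketch of how a system could report whether and how input factors affected a decision (our own illustration, not the authors’ proposal), consider per-feature contributions in a simple linear scoring model; the factor names and weights below are invented:

```python
import numpy as np

# Invented model: a linear score over three named input factors.
feature_names = ["income", "prior_defaults", "years_employed"]
weights = np.array([0.4, -1.2, 0.3])
bias = -0.1

def explain(x: np.ndarray) -> None:
    """Print each factor's contribution to the decision score, largest first."""
    contributions = weights * x
    score = contributions.sum() + bias
    decision = "approve" if score > 0 else "deny"
    print(f"Decision: {decision} (score {score:.2f})")
    for name, c in sorted(zip(feature_names, contributions), key=lambda t: -abs(t[1])):
        print(f"  {name}: {c:+.2f}")

explain(np.array([1.5, 1.0, 2.0]))
```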

Democracy in the digital age: digital agora or dystopia


Paper by Peter Parycek, Bettina Rinnerbauer, and Judith Schossböck in the International Journal of Electronic Governance: “Information and communication technologies (ICTs) affect democracy and the rule of law. Digitalisation has been perceived as a stimulus towards a more participative society or as support to decision making, but not without criticism. The authors draw on a legal review, case studies, and quantitative survey data about citizens’ views on transparency and participation in the German-speaking region to summarise selected discourses of democratisation via ICTs and the dominant critique. The paper concludes with an outlook on contemporary questions of digital democracy between the dialectic of protecting citizens’ rights and citizen control. It is proposed that prospective e-participation projects will concentrate on processes of innovation and creativity as opposed to participation rates. Future investigations should evaluate the contexts in which a more data-driven, automated form of decision making could be supported and collect indicators for where to draw the line between the protection and control of citizens, including research on specific tools…(More)”.

What Are Data? A Categorization of the Data Sensitivity Spectrum


Paper by John M.M. Rumbold and Barbara K. Pierscionek in Big Data Research: “The definition of data might at first glance seem prosaic, but formulating a definitive and useful definition is surprisingly difficult. This question is important because of the protection given to data in law and ethics. Healthcare data are universally considered sensitive (and confidential), so it might seem that the categorisation of less sensitive data is relatively unimportant for medical data research. This paper will explore the arguments that this is not necessarily the case and the relevance of recognizing this.

The categorization of data and information requires re-evaluation in the age of Big Data in order to ensure that the appropriate protections are given to different types of data. The aggregation of large amounts of data requires an assessment of the harms and benefits that pertain to large datasets linked together, rather than simply assessing each datum or dataset in isolation. Big Data produce new data via inferences, and this must be recognized in ethical assessments. We propose a schema for a granular assessment of data categories. The use of schemata such as this will assist decision-making by providing research ethics committees and information governance bodies with guidance about the relative sensitivities of data. This will ensure that appropriate and proportionate safeguards are provided for data research subjects and reduce inconsistency in decision making…(More)”.
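
As one way to make such a schema operational (the categories and safeguards below are our own illustrative reading, not the schema the authors propose), a granular assessment might be encoded roughly as follows:

```python
from enum import IntEnum

class Sensitivity(IntEnum):
    """Illustrative ordering from least to most sensitive."""
    OPEN = 1           # e.g. published aggregate statistics
    DEIDENTIFIED = 2   # individual-level data with direct identifiers removed
    PERSONAL = 3       # identifiable but non-special-category data
    SPECIAL = 4        # e.g. health, biometric, or genetic data

def minimum_safeguard(category: Sensitivity) -> str:
    """Map a category to an (illustrative) minimum safeguard for research use."""
    return {
        Sensitivity.OPEN: "none beyond licence terms",
        Sensitivity.DEIDENTIFIED: "re-identification risk assessment",
        Sensitivity.PERSONAL: "legal basis or consent plus access controls",
        Sensitivity.SPECIAL: "ethics review and strict access controls",
    }[category]

print(minimum_safeguard(Sensitivity.SPECIAL))
```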

Mixed Messages? The Limits of Automated Social Media Content Analysis


CDT Paper by Natasha Duarte, Emma Llanso and Anna Loup: “Governments and companies are turning to automated tools to make sense of what people post on social media, for everything ranging from hate speech detection to law enforcement investigations. Policymakers routinely call for social media companies to identify and take down hate speech, terrorist propaganda, harassment, “fake news” or disinformation, and other forms of problematic speech. Other policy proposals have focused on mining social media to inform law enforcement and immigration decisions. But these proposals wrongly assume that automated technology can accomplish on a large scale the kind of nuanced analysis that humans can accomplish on a small scale.

This paper explains the capabilities and limitations of tools for analyzing the text of social media posts and other online content. It is intended to help policymakers understand and evaluate available tools and the potential consequences of using them to carry out government policies. This paper focuses specifically on the use of natural language processing (NLP) tools for analyzing the text of social media posts. We explain five limitations of these tools that caution against relying on them to decide who gets to speak, who gets admitted into the country, and other critical determinations. This paper concludes with recommendations for policymakers and developers, including a set of questions to guide policymakers’ evaluation of available tools….(More)”.
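
To ground what such NLP tools typically do, here is a minimal sketch of a bag-of-words text classifier of the kind often used for content flagging; the training posts and labels are toy data, and production systems are far more elaborate (which is precisely where the limitations discussed above arise):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training data; real moderation datasets are vastly larger and noisier.
posts = [
    "I will hurt you if you come here",
    "you people should all disappear",
    "great game last night, congrats to the team",
    "looking forward to the weekend hike",
]
labels = [1, 1, 0, 0]  # 1 = flag for review, 0 = leave up

classifier = make_pipeline(TfidfVectorizer(), LogisticRegression())
classifier.fit(posts, labels)

# The model only sees word statistics, not context, sarcasm, or dialect,
# which is the core limitation the paper describes.
new_post = ["that joke absolutely killed me"]
print(f"Probability of flagging: {classifier.predict_proba(new_post)[0, 1]:.2f}")
```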