Habeas Data: Privacy vs. The Rise of Surveillance Tech


Book by Cyrus Farivar: “Habeas Data shows how the explosive growth of surveillance technology has outpaced our understanding of the ethics, mores, and laws of privacy.

Award-winning tech reporter Cyrus Farivar makes the case by taking ten historic court decisions that defined our privacy rights and matching them against the capabilities of modern technology. It’s an approach that combines the charge of a legal thriller with the shock of the daily headlines.

Chapters include: the 1960s prosecution of a bookie that established the “reasonable expectation of privacy” in nonpublic places beyond your home (but how does that ruling apply now, when police can chart your every move and hear your every conversation within your own home — without even having to enter it?); the 1970s case where the police monitored a lewd caller — the decision of which is now the linchpin of the NSA’s controversial metadata tracking program revealed by Edward Snowden; and a 2010 low-level burglary trial that revealed police had tracked a defendant’s past 12,898 locations before arrest — an invasion of privacy grossly out of proportion to the alleged crime, which showed how authorities are all too willing to take advantage of the ludicrous gap between the slow pace of legal reform and the rapid transformation of technology.

A dazzling exposé that journeys from Oakland, California to the halls of the Supreme Court to the back of a squad car, Habeas Data combines deft reportage, deep research, and original interviews to offer an X-ray diagnostic of our current surveillance state….(More)”.

The EU Wants to Build One of the World’s Largest Biometric Databases. What Could Possibly Go Wrong?


Grace Dobush at Fortune: “China and India have built the world’s largest biometric databases, but the European Union is about to join the club.

The Common Identity Repository (CIR) will consolidate biometric data on almost all visitors and migrants to the bloc, as well as some EU citizens—connecting existing criminal, asylum, and migration databases and integrating new ones. It has the potential to affect hundreds of millions of people.

The plan for the database, first proposed in 2016 and approved by the EU Parliament on April 16, was sold as a way to better track and monitor terrorists, criminals, and unauthorized immigrants.

The system will target the fingerprints and identity data for visitors and immigrants initially, and represents the first step towards building a truly EU-wide citizen database. At the same time, though, critics argue its mere existence will increase the potential for hacks, leaks, and law enforcement abuse of the information….

The European Parliament and the European Council have promised to address those concerns through “proper safeguards” to protect personal privacy and to regulate officers’ access to data. In 2016, they passed a law regarding law enforcement’s access to personal data, alongside the General Data Protection Regulation, or GDPR.

But total security is a tall order. Germany is currently dealing with multiple instances of police officers allegedly leaking personal information to far-right groups. Meanwhile, a Swedish hacker went to prison for hacking into Denmark’s public records system in 2012 and dumping online the personal data of hundreds of thousands of citizens and migrants….(More)”.


LAPD moving away from data-driven crime programs over potential racial bias


Mark Puente in The Los Angeles Times: “The Los Angeles Police Department pioneered the controversial use of data to pinpoint crime hot spots and track violent offenders.

Complex algorithms and vast databases were supposed to revolutionize crime fighting, making policing more efficient as number-crunching computers helped to position scarce resources.

But critics long complained about inherent bias in the data — gathered by officers — that underpinned the tools.

They claimed a partial victory when LAPD Chief Michel Moore announced he would end one highly touted program intended to identify and monitor violent criminals. On Tuesday, the department’s civilian oversight panel raised questions about whether another program, aimed at reducing property crime, also disproportionately targets black and Latino communities.

Members of the Police Commission demanded more information about how the agency plans to overhaul a data program that helps predict where and when crimes will likely occur. One questioned why the program couldn’t be suspended.

“There is very limited information” on the program’s impact, Commissioner Shane Murphy Goldsmith said.

The action came as so-called predictive policing— using search tools, point scores and other methods — is under increasing scrutiny by privacy and civil liberties groups that say the tactics result in heavier policing of black and Latino communities. The argument was underscored at Tuesday’s commission meeting when several UCLA academics cast doubt on the research behind crime modeling and predictive policing….(More)”.

Introducing the Contractual Wheel of Data Collaboration


Blog by Andrew Young and Stefaan Verhulst: “Earlier this year we launched the Contracts for Data Collaboration (C4DC) initiative — an open collaborative with charter members from The GovLab, UN SDSN Thematic Research Network on Data and Statistics (TReNDS), University of Washington and the World Economic Forum. C4DC seeks to address the inefficiencies of developing contractual agreements for public-private data collaboration by informing and guiding those seeking to establish a data collaborative by developing and making available a shared repository of relevant contractual clauses taken from existing legal agreements. Today TReNDS published “Partnerships Founded on Trust,” a brief capturing some initial findings from the C4DC initiative.

The Contractual Wheel of Data Collaboration [beta] — Stefaan G. Verhulst and Andrew Young, The GovLab

As part of the C4DC effort, and to support Data Stewards in the private sector and decision-makers in the public and civil sectors seeking to establish Data Collaboratives, The GovLab developed the Contractual Wheel of Data Collaboration [beta]. The Wheel seeks to capture key elements involved in data collaboration while demystifying contracts and moving beyond the type of legalese that can create confusion and barriers to experimentation.

The Wheel was developed based on an assessment of existing legal agreements, engagement with The GovLab-facilitated Data Stewards Network, and analysis of the key elements of our Data Collaboratives Methodology. It features 22 legal considerations organized across 6 operational categories that can act as a checklist for the development of a legal agreement between parties participating in a Data Collaborative:…(More)”.

Data Trusts: More Data than Trust? The Perspective of the Data Subject in the Face of a Growing Problem


Paper by Christine Rinik: “In the recent report, Growing the Artificial Intelligence Industry in the UK, Hall and Pesenti suggest the use of a ‘data trust’ to facilitate data sharing. Whilst government and corporations are focusing on their need to facilitate data sharing, the perspective of many individuals is that too much data is being shared. The issue is not only about data, but about power. The individual does not often have a voice when issues relating to data sharing are tackled. Regulators can cite the ‘public interest’ when data governance is discussed, but the individual’s interests may diverge from that of the public.

This paper considers the data subject’s position with respect to data collection leading to considerations about surveillance and datafication. Proposals for data trusts will be considered applying principles of English trust law to possibly mitigate the imbalance of power between large data users and individual data subjects. Finally, the possibility of a workable remedy in the form of a class action lawsuit which could give the data subjects some collective power in the event of a data breach will be explored. Despite regulatory efforts to protect personal data, there is a lack of public trust in the current data sharing system….(More)”.

Synthetic data: innovation for public good


Blog Post by Catrin Cheung: “What is synthetic data, and how can it be used for public good? ….Synthetic data are artificially generated data that have the look and structure of real data, but do not contain any information on individuals. They also contain more general characteristics that are used to find patterns in the data.

They are modelled on real data, but designed in a way which safeguards the legal, ethical and confidentiality requirements of the original data. Given their resemblance to the original data, synthetic data are useful in a range of situations, for example when data is sensitive or missing. They are used widely as teaching materials, to test code or mathematical models, or as training data for machine learning models….

There’s currently a wealth of research emerging from the health sector, as the nature of data published is often sensitive. Public Health England have synthesised cancer data which can be freely accessed online. NHS Scotland are making advances in cutting-edge machine learning methods such as Variational Auto Encoders and Generative Adversarial Networks (GANs).

There is growing interest in this area of research, and its influence extends beyond the statistical community. While the Data Science Campus have also used GANs to generate synthetic data in their latest research, its power is not limited to data generation. It can be trained to construct features almost identical to our own across imagery, music, speech and text. In fact, GANs have been used to create a painting of Edmond de Belamy, which sold for $432,500 in 2018!

Within the ONS, a pilot to create synthetic versions of securely held Labour Force Survey data has been carried out using a package in R called “synthpop”. This synthetic dataset can be shared with approved researchers to de-bug codes, prior to analysis of data held in the Secure Research Service….

Although much progress has been made in this field, one challenge that persists is guaranteeing the accuracy of synthetic data. We must ensure that the statistical properties of synthetic data match the properties of the original data.

Additional features, such as the presence of non-numerical data, add to this difficult task. For example, if something is listed as “animal” and can take the possible values “dog”, “cat” or “elephant”, it is difficult to convert this information into a format suitable for precise calculations. Furthermore, given that datasets have different characteristics, there is no straightforward solution that can be applied to all types of data….particular focus was also placed on the use of synthetic data in the field of privacy, following from the challenges and opportunities identified by the National Statistician’s Quality Review of privacy and data confidentiality methods published in December 2018….(More)”.
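
The “animal” example in the excerpt above can be made concrete with a small sketch. It is not taken from the ONS post and is not how synthpop works. It assumes one-hot encoding as one common way to turn categories into numbers, and it uses a deliberately naive synthesiser that only resamples category proportions. The function names (one_hot, proportions, synthesise, max_proportion_gap) and the toy records are hypothetical.

```python
import random
from collections import Counter

# Categories taken from the "animal" example in the excerpt.
ANIMALS = ["dog", "cat", "elephant"]


def one_hot(value, categories=ANIMALS):
    """Encode one category as a list of 0/1 indicators, one per category."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == c else 0 for c in categories]


def proportions(values, categories=ANIMALS):
    """Share of records in each category (values must be non-empty)."""
    counts = Counter(values)
    total = len(values)
    return {c: counts[c] / total for c in categories}


def synthesise(original, n, seed=0):
    """Draw n synthetic values using the original's category proportions.

    A naive stand-in for a real synthesiser: it preserves only the
    marginal distribution of a single column, nothing else.
    """
    rng = random.Random(seed)
    props = proportions(original)
    weights = [props[c] for c in ANIMALS]
    return rng.choices(ANIMALS, weights=weights, k=n)


def max_proportion_gap(original, synthetic):
    """Largest absolute difference in category share between two samples."""
    p, q = proportions(original), proportions(synthetic)
    return max(abs(p[c] - q[c]) for c in ANIMALS)


if __name__ == "__main__":
    # Toy records invented for illustration, not real survey data.
    original = ["dog"] * 50 + ["cat"] * 30 + ["elephant"] * 20
    synthetic = synthesise(original, n=1000, seed=42)
    print("one-hot for 'cat':", one_hot("cat"))
    print("largest gap in category share:", max_proportion_gap(original, synthetic))
```

A small gap here says only that one column's category shares were preserved. Checking relationships between columns, which is where the accuracy challenge described above mostly lies, would need further comparisons.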

Tracking Phones, Google Is a Dragnet for the Police


Jennifer Valentino-DeVries at the New York Times: “….The warrants, which draw on an enormous Google database employees call Sensorvault, turn the business of tracking cellphone users’ locations into a digital dragnet for law enforcement. In an era of ubiquitous data gathering by tech companies, it is just the latest example of how personal information — where you go, who your friends are, what you read, eat and watch, and when you do it — is being used for purposes many people never expected. As privacy concerns have mounted among consumers, policymakers and regulators, tech companies have come under intensifying scrutiny over their data collection practices.

The Arizona case demonstrates the promise and perils of the new investigative technique, whose use has risen sharply in the past six months, according to Google employees familiar with the requests. It can help solve crimes. But it can also snare innocent people.

Technology companies have for years responded to court orders for specific users’ information. The new warrants go further, suggesting possible suspects and witnesses in the absence of other clues. Often, Google employees said, the company responds to a single warrant with location information on dozens or hundreds of devices.

Law enforcement officials described the method as exciting, but cautioned that it was just one tool….

The technique illustrates a phenomenon privacy advocates have long referred to as the “if you build it, they will come” principle — anytime a technology company creates a system that could be used in surveillance, law enforcement inevitably comes knocking. Sensorvault, according to Google employees, includes detailed location records involving at least hundreds of millions of devices worldwide and dating back nearly a decade….(More)”.

The Privacy Project


The New York Times: “Companies and governments are gaining new powers to follow people across the internet and around the world, and even to peer into their genomes. The benefits of such advances have been apparent for years; the costs — in anonymity, even autonomy — are now becoming clearer. The boundaries of privacy are in dispute, and its future is in doubt. Citizens, politicians and business leaders are asking if societies are making the wisest tradeoffs. The Times is embarking on this months-long project to explore the technology and where it’s taking us, and to convene debate about how it can best help realize human potential….(More)”

Does Privacy Matter?

What Do They Know, and How Do They Know It?

What Should Be Done About This?

What Can I Do?


The Market for Data Privacy


Paper by Tarun Ramadorai, Antoine Uettwiller and Ansgar Walther: “We scrape a comprehensive set of US firms’ privacy policies to facilitate research on the supply of data privacy. We analyze these data with the help of expert legal evaluations, and also acquire data on firms’ web tracking activities. We find considerable and systematic variation in privacy policies along multiple dimensions including ease of access, length, readability, and quality, both within and between industries. Motivated by a simple theory of big data acquisition and usage, we analyze the relationship between firm size, knowledge capital intensity, and privacy supply. We find that large firms with intermediate data intensity have longer, legally watertight policies, but are more likely to share user data with third parties….(More)”.

Platform Surveillance


Editorial by David Murakami Wood and Torin Monahan for a special issue of Surveillance and Society: “This editorial introduces this special responsive issue on “platform surveillance.” We develop the term platform surveillance to account for the manifold and often insidious ways that digital platforms fundamentally transform social practices and relations, recasting them as surveillant exchanges whose coordination must be technologically mediated and therefore made exploitable as data. In the process, digital platforms become dominant social structures in their own right, subordinating other institutions, conjuring or sedimenting social divisions and inequalities, and setting the terms upon which individuals, organizations, and governments interact.

Emergent forms of platform capitalism portend new governmentalities, as they gradually draw existing institutions into alignment or harmonization with the logics of platform surveillance while also engendering subjectivities (e.g., the gig-economy worker) that support those logics. Because surveillance is essential to the operations of digital platforms, because it structures the forms of governance and capital that emerge, the field of surveillance studies is uniquely positioned to investigate and theorize these phenomena….(More)”.