A Premature Eulogy for Privacy


Review Article by Evan Selinger: “Every so often a sound critical thinker and superb writer asks the wrong questions of a wheel-spinning topic like privacy and then draws the wrong conclusions. This is the case with Firmin DeBrabander in his Life After Privacy: Reclaiming Democracy in a Surveillance Society. Professor of Philosophy at the Maryland Institute College of Art, DeBrabander has a gift for clearly expressing complex ideas and explaining why underappreciated moments in the history of ideas have contemporary relevance. In this book, he aims “to understand the prospects and future of democracy without privacy, or very little of it.” That attempt necessarily leads him both to undervalue privacy and make a case for accepting a severely weakened democracy.

To be sure, DeBrabander doesn’t dismiss privacy with any sort of enthusiasm. On the contrary, he loves his privacy, depicting himself as someone who has to block his “beloved” but overly disclosive students on Facebook. “If I had my druthers, my personal data would be sacrosanct,” he writes. But he is convinced privacy is a lost cause. He makes what are essentially six claims about privacy — some of them seemingly obvious and others more startling — to buttress his eulogy.

Prosecuting Privacy

It’s worth listing DeBrabander’s six propositions here before then rebutting or at least complicating their veracity. The first privacy proposition repeats what has become a seeming truism: we’re living in a “confessional culture” that normalizes oversharing. His second has two components, both much discussed in the last several years: companies participating in the “surveillance economy” have an insatiable appetite for our personal information, and consumers don’t fully comprehend just how much value these companies are able to extract from it through data analytics. He emphasizes Charles Duhigg’s much-discussed 2012 New York Times article about Target using predictive analytics on big data to identify pregnant customers and present them with relevant coupons. Consumers, he contends, will continue giving away massive amounts of personal information to make their cars, homes, cities, and even bodies smarter; it’s likely they won’t be any more equipped in the future to assess tradeoffs and determine when they’re being exploited….(More)”.

COVID-19 Tests Gone Rogue: Privacy, Efficacy, Mismanagement and Misunderstandings


Paper by Manuel Morales et al: “COVID-19 testing, the cornerstone for effective screening and identification of COVID-19 cases, remains paramount as an intervention tool to curb the spread of COVID-19 both at local and national levels. However, the speed at which the pandemic struck and the response was rolled out, the widespread impact on healthcare infrastructure, the lack of sufficient preparation within the public health system, and the complexity of the crisis led to utter confusion among test-takers. Invasion of privacy remains a crucial concern. The user experience of test takers remains low. User friction affects user behavior and discourages participation in testing programs. Test efficacy has been overstated. Test results are poorly understood resulting in inappropriate follow-up recommendations. Herein, we review the current landscape of COVID-19 testing, identify four key challenges, and discuss the consequences of the failure to address these challenges. The current infrastructure around testing and information propagation is highly privacy-invasive and does not leverage scalable digital components. In this work, we discuss challenges complicating the existing COVID-19 testing ecosystem and highlight the need to improve the testing experience for the user and reduce privacy invasions. Digital tools will play a critical role in resolving these challenges….(More)”.

Augmented Reality and the Surveillance Society


Mark Pesce at IEEE Spectrum: “First articulated in a 1965 white paper by Ivan Sutherland, titled “The Ultimate Display,” augmented reality (AR) lay beyond our technical capacities for 50 years. That changed when smartphones began providing people with a combination of cheap sensors, powerful processors, and high-bandwidth networking—the trifecta needed for AR to generate its spatial illusions. Among today’s emerging technologies, AR stands out as particularly demanding—for computational power, for sensed data, and, I’d argue, for attention to the danger it poses.

Unlike virtual-reality (VR) gear, which creates for the user a completely synthetic experience, AR gear adds to the user’s perception of her environment. To do that effectively, AR systems need to know where in space the user is located. VR systems originally used expensive and fragile systems for tracking user movements from the outside in, often requiring external sensors to be set up in the room. But the new generation of VR accomplishes this through a set of techniques collectively known as simultaneous localization and mapping (SLAM). These systems harvest a rich stream of observational data—mostly from cameras affixed to the user’s headgear, but sometimes also from sonar, lidar, structured light, and time-of-flight sensors—using those measurements to update a continuously evolving model of the user’s spatial environment.

For safety’s sake, VR systems must be restricted to certain tightly constrained areas, lest someone blinded by VR goggles tumble down a staircase. AR doesn’t hide the real world, though, so people can use it anywhere. That’s important because the purpose of AR is to add helpful (or perhaps just entertaining) digital illusions to the user’s perceptions. But AR has a second, less appreciated, facet: It also functions as a sophisticated mobile surveillance system.

This second quality is what makes Facebook’s recent Project Aria experiment so unnerving. Nearly four years ago, Mark Zuckerberg announced Facebook’s goal to create AR “spectacles”—consumer-grade devices that could one day rival the smartphone in utility and ubiquity. That’s a substantial technical ask, so Facebook’s research team has taken an incremental approach. Project Aria packs the sensors necessary for SLAM within a form factor that resembles a pair of sunglasses. Wearers collect copious amounts of data, which is fed back to Facebook for analysis. This information will presumably help the company to refine the design of an eventual Facebook AR product.

The concern here is obvious: When it comes to market in a few years, these glasses will transform their users into data-gathering minions for Facebook. Tens, then hundreds of millions of these AR spectacles will be mapping the contours of the world, along with all of its people, pets, possessions, and peccadilloes. The prospect of such intensive surveillance at planetary scale poses some tough questions about who will be doing all this watching and why….(More)”.

Inside India’s booming dark data economy


Snigdha Poonam and Samarth Bansal at Rest of World: “…The black market for data, as it exists online in India, resembles those for wholesale vegetables or smuggled goods. Customers are encouraged to buy in bulk, and the variety of what’s on offer is mind-boggling: There are databases about parents, cable customers, pregnant women, pizza eaters, mutual funds investors, and almost any niche group one can imagine. A typical database consists of a spreadsheet with row after row of names and key details: Sheila Gupta, 35, lives in Kolkata, runs a travel agency, and owns a BMW; Irfaan Khan, 52, lives in Greater Noida, and has a son who just applied to engineering college. The databases are usually updated every three months (the older one is, the less it is worth), and if you buy several at the same time, you’ll get a discount. Business is always brisk, and transactions are conducted quickly. No one will ask you for your name, let alone inquire why you want the phone numbers of five million people who have applied for bank loans.

There isn’t a reliable estimate of the size of India’s data economy or of how much money it generates annually. Regarding the former, each broker we spoke to had a different guess: One said only about one or two hundred professionals make up the top tier, another that every big Indian city has at least a thousand people trading data. To find them, potential customers need only look for their ads on social media or run searches with industry keywords and hashtags — “data,” “leads,” “database” — combined with detailed information about the kind of data they want and the city they want it from.

Privacy experts believe that the data-brokering industry has existed since the early days of the internet’s arrival in India. “Databases have been bought and sold in India for at least 15 years now. I remember a case from way back in 2006 of leaked employee data from Naukri.com (one of India’s first online job portals) being sold on CDs,” says Nikhil Pahwa, the editor and publisher of MediaNama, which covers technology policy. By 2009, data brokers were running SMS-marketing companies that offered complementary services: procuring targeted data and sending text messages in bulk. Back then, there was simply less data, “and those who had it could sell it at whatever price,” says Himanshu Bhatt, a data broker who claims to be retired. That is no longer the case: “Today, everyone has every kind of data,” he said.

No broker we contacted would openly discuss their methods of hunting, harvesting, and selling data. But the day-to-day work generally consists of following the trails that people leave during their travels around the internet. Brokers trawl data storage websites armed with a digital fishing net. “I was shocked when I was surfing [cloud-hosted data sites] one day and came across Aadhaar cards,” Bhatt remarked, referring to India’s state-issued biometric ID cards. Images of them were available to download in bulk, alongside completed loan applications and salary sheets.

Again, the legal boundaries here are far from clear. Anybody who has ever filled out a form on a coupon website or requested a refund for a movie ticket has effectively entered their information into a database that can be sold without their consent by the company it belongs to. A neighborhood cell phone store can sell demographic information to a political party for hyperlocal campaigning, and a fintech company can stealthily transfer an individual’s details from an astrology app onto its own server, to gauge that person’s creditworthiness. When somebody shares employment history on LinkedIn or contact details on a public directory, brokers can use basic software such as web scrapers to extract that data.

But why bother hacking into a database when you can buy it outright? More often, “brokers will directly approach a bank employee and tell them, ‘I need the high-end database’,” Bhatt said. And as demand for information increases, so, too, does data vulnerability. A 2019 survey found that 69% of Indian companies haven’t set up reliable data security systems; 44% have experienced at least one breach already. “In the past 12 months, we have seen an increasing trend of Indians’ data [appearing] on the dark web,” says Beenu Arora, the CEO of the global cyberintelligence firm Cyble….(More)”.

Consumer Bureau To Decide Who Owns Your Financial Data


Article by Jillian S. Ambroz: “A federal agency is gearing up to make wide-ranging policy changes on consumers’ access to their financial data.

The Consumer Financial Protection Bureau (CFPB) is looking to implement the area of the 2010 Dodd-Frank Wall Street Reform and Consumer Protection Act pertaining to a consumer’s rights to his or her own financial data. It is detailed in section 1033.

The agency has been laying the groundwork on this move for years, from requesting information in 2016 from financial institutions to hosting a symposium earlier this year on the problems of screen scraping, a risky but common method of collecting consumer data.

Now the agency, which was established by the Dodd-Frank Act, is asking for comments on this critical and controversial topic ahead of the proposed rulemaking. Unlike other regulations that affect single industries, this could be all-encompassing because the consumer data rule touches almost every market the agency covers, according to the story in American Banker.

The Trump administration all but ‘systematically neutered’ the agency.

With the ruling, the agency seeks to clarify its compliance expectations and help establish market practices to ensure consumers have access to consumer financial data. The agency sees an opportunity here to help shape this evolving area of financial technology, or fintech, recognizing both the opportunities and the risks to consumers as more fintechs become enmeshed with their data and day-to-day lives.

Its goal is “to better effectuate consumer access to financial records,” as stated in the regulatory filing….(More)”.

Evaluating Identity Disclosure Risk in Fully Synthetic Health Data: Model Development and Validation


Paper by Khaled El Emam et al: “There has been growing interest in data synthesis for enabling the sharing of data for secondary analysis; however, there is a need for a comprehensive privacy risk model for fully synthetic data: If the generative models have been overfit, then it is possible to identify individuals from synthetic data and learn something new about them.

Objective: The purpose of this study is to develop and apply a methodology for evaluating the identity disclosure risks of fully synthetic data.

Methods: A full risk model is presented, which evaluates both identity disclosure and the ability of an adversary to learn something new if there is a match between a synthetic record and a real person. We term this “meaningful identity disclosure risk.” The model is applied on samples from the Washington State Hospital discharge database (2007) and the Canadian COVID-19 cases database. Both of these datasets were synthesized using a sequential decision tree process commonly used to synthesize health and social science data.

Results: The meaningful identity disclosure risk for both of these synthesized samples was below the commonly used 0.09 risk threshold (0.0198 and 0.0086, respectively), and 4 times and 5 times lower than the risk values for the original datasets, respectively.

Conclusions: We have presented a comprehensive identity disclosure risk model for fully synthetic data. The results for this synthesis method on 2 datasets demonstrate that synthesis can reduce meaningful identity disclosure risks considerably. The risk model can be applied in the future to evaluate the privacy of fully synthetic data….(More)”.
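
The threshold comparison reported in the Results can be checked with a few lines of arithmetic. The sketch below is only illustrative: the variable and function names are ours, not the paper's, and the numbers are the ones quoted in the abstract above.

```python
# Values quoted in the abstract; names are illustrative, not from the paper.
RISK_THRESHOLD = 0.09

synthetic_risk = {
    "Washington State hospital discharge (2007)": 0.0198,
    "Canadian COVID-19 cases": 0.0086,
}


def below_threshold(risk, threshold=RISK_THRESHOLD):
    """Return True when a measured risk is under the chosen threshold."""
    return risk < threshold


for name, risk in synthetic_risk.items():
    # How many times larger the threshold is than the measured risk.
    headroom = RISK_THRESHOLD / risk
    print(f"{name}: risk={risk}, below threshold={below_threshold(risk)}, "
          f"threshold is {headroom:.1f}x the measured risk")
```

Both synthesized samples sit well under the threshold, which is the point the abstract makes; the comparison against the original datasets is not reproduced here because those risk values are not quoted in the excerpt.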

How the U.S. Military Buys Location Data from Ordinary Apps


Joseph Cox at Vice: “The U.S. military is buying the granular movement data of people around the world, harvested from innocuous-seeming apps, Motherboard has learned. The most popular app among a group Motherboard analyzed connected to this sort of data sale is a Muslim prayer and Quran app that has more than 98 million downloads worldwide. Others include a Muslim dating app, a popular Craigslist app, an app for following storms, and a “level” app that can be used to help, for example, install shelves in a bedroom.

Through public records, interviews with developers, and technical analysis, Motherboard uncovered two separate, parallel data streams that the U.S. military uses, or has used, to obtain location data. One relies on a company called Babel Street, which creates a product called Locate X. U.S. Special Operations Command (USSOCOM), a branch of the military tasked with counterterrorism, counterinsurgency, and special reconnaissance, bought access to Locate X to assist on overseas special forces operations. The other stream is through a company called X-Mode, which obtains location data directly from apps, then sells that data to contractors, and by extension, the military.

The news highlights the opaque location data industry and the fact that the U.S. military, which has infamously used other location data to target drone strikes, is purchasing access to sensitive data. Many of the users of apps involved in the data supply chain are Muslim, which is notable considering that the United States has waged a decades-long war on predominantly Muslim terror groups in the Middle East, and has killed hundreds of thousands of civilians during its military operations in Pakistan, Afghanistan, and Iraq. Motherboard does not know of any specific operations in which this type of app-based location data has been used by the U.S. military.

The apps sending data to X-Mode include Muslim Pro, an app that reminds users when to pray and what direction Mecca is in relation to the user’s current location. The app has been downloaded over 50 million times on Android, according to the Google Play Store, and over 98 million in total across other platforms including iOS, according to Muslim Pro’s website….(More)”.

The responsible use of data for and about children: treading carefully and ethically


Q&A with Stefaan G. Verhulst and Andrew Young: “…working in collaboration with UNICEF on an initiative called Responsible Data for Children (RD4C). Its focus is on data – the risks it poses to children, as well as the opportunities it offers.

You have been working with UNICEF on the Responsible Data for Children initiative (RD4C). What is this and why do we need to be talking more about ‘responsible data’?

To date, the relationship between the datafication of everyday life and child welfare has been under-explored, both by researchers in data ethics and those who work to advance the rights of children. This neglect is a lost opportunity, and also poses a risk to children.

Today’s children are the first generation to grow up amid the rapid datafication of virtually every aspect of social, cultural, political and economic life. This alone calls for greater scrutiny of the role played by data. An entire generation is being datafied, often starting before birth. Every year the average child will have more data collected about them in their lifetime than would a similar child born any year prior. Ironically, humanitarian and development organizations working with children are themselves among the key actors contributing to the increased collection of data. These organizations rely on a wide range of technologies, including biometrics, digital identity systems, remote-sensing technologies, mobile and social media messaging apps, and administrative data systems. The data generated by these tools and platforms inevitably includes potentially sensitive PII data (personally identifiable information) and DII data (demographically identifiable information). All of this begs much closer scrutiny, and a more systematic framework to guide how child-related data is collected, stored, and used.

Towards this aim, we have also been working with the Data for Children Collaborative, based in Edinburgh, in establishing innovative and ethical practices around the use of data to improve the lives of children worldwide….(More)”.

Federated Learning for Privacy-Preserving Data Access


Paper by Małgorzata Śmietanka, Hirsh Pithadia and Philip Treleaven: “Federated learning is a pioneering privacy-preserving data technology and also a new machine learning model trained on distributed data sets.

Companies collect huge amounts of historic and real-time data to drive their business and collaborate with other organisations. However, data privacy is becoming increasingly important because of regulations (e.g. EU GDPR) and the need to protect their sensitive and personal data. Companies need to manage data access: firstly within their organizations (so they can control staff access), and secondly protecting raw data when collaborating with third parties. What is more, companies are increasingly looking to ‘monetize’ the data they’ve collected. However, under new legislation, utilising data across different organisations is becoming increasingly difficult (Yu, 2016).

Federated learning pioneered by Google is the emerging privacy-preserving data technology and also a new class of distributed machine learning models. This paper discusses federated learning as a solution for privacy-preserving data access and distributed machine learning applied to distributed data sets. It also presents a privacy-preserving federated learning infrastructure….(More)”.
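
To make the idea of training on distributed data sets concrete, here is a minimal sketch of federated averaging on a toy linear model. It is not the infrastructure the paper presents: the helper names (`local_update`, `federated_average`, `make_client`), the synthetic data, and the hyperparameters are all assumptions chosen for illustration. Each client fits the model on its own data, and only the resulting parameters are sent to the server for averaging.

```python
import random


def local_update(w, b, data, lr=0.05, epochs=20):
    """Gradient descent on one client's private data; raw data never leaves."""
    n = len(data)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            grad_w += 2 * err * x
            grad_b += 2 * err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def federated_average(updates, sizes):
    """Server step: average client parameters, weighted by local data size."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return w, b


def make_client(n, rng, lo, hi):
    """Hypothetical client data drawn from y = 3x + 1 plus small noise."""
    xs = [rng.uniform(lo, hi) for _ in range(n)]
    return [(x, 3 * x + 1 + rng.gauss(0, 0.01)) for x in xs]


rng = random.Random(0)
clients = [
    make_client(30, rng, 0.0, 1.0),
    make_client(50, rng, 0.5, 1.5),
    make_client(20, rng, -1.0, 0.0),
]

w, b = 0.0, 0.0  # global model held by the server
for _ in range(100):
    # Each client starts from the current global model and trains locally.
    updates = [local_update(w, b, data) for data in clients]
    w, b = federated_average(updates, [len(d) for d in clients])

print(f"global model: w={w:.3f}, b={b:.3f}")  # should approach w=3, b=1
```

Only model parameters cross the client boundary in this sketch. A production system would typically add protections such as secure aggregation or differential privacy, since parameters alone can still leak information about the underlying data.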

Not fit for Purpose: A critical analysis of the ‘Five Safes’


Paper by Chris Culnane, Benjamin I. P. Rubinstein, and David Watts: “Adopted by government agencies in Australia, New Zealand, and the UK as policy instrument or as embodied into legislation, the ‘Five Safes’ framework aims to manage risks of releasing data derived from personal information. Despite its popularity, the Five Safes has undergone little legal or technical critical analysis. We argue that the Five Safes is fundamentally flawed: from being disconnected from existing legal protections and appropriation of notions of safety without providing any means to prefer strong technical measures, to viewing disclosure risk as static through time and not requiring repeat assessment. The Five Safes provides little confidence that resulting data sharing is performed using ‘safety’ best practice or for purposes in service of public interest….(More)”.