privacy

Health Data Privacy under the GDPR: Big Data Challenges and Regulatory Responses

Curated on August 21, 2020August 21, 2020 by Stefaan Verhulst

Book edited by Maria Tzanou: “The growth of data collecting goods and services, such as ehealth and mhealth apps, smart watches, mobile fitness and dieting apps, electronic skin and ingestible tech, combined with recent technological developments such as increased capacity of data storage, artificial intelligence and smart algorithms have spawned a big data revolution that has reshaped how we understand and approach health data. Recently the COVID-19 pandemic has foregrounded a variety of data privacy issues. The collection, storage, sharing and analysis of health- related data raises major legal and ethical questions relating to privacy, data protection, profiling, discrimination, surveillance, personal autonomy and dignity.

This book examines health privacy questions in light of the GDPR and the EU’s general data privacy legal framework. The GDPR is a complex and evolving body of law that aims to deal with several technological and societal health data privacy problems, while safeguarding public health interests and addressing its internal gaps and uncertainties. The book answers a diverse range of questions including: What role can the GDPR play in regulating health surveillance and big (health) data analytics? Can it catch up with the Internet age developments? Are the solutions to the challenges posed by big health data to be found in the law? Does the GDPR provide adequate tools and mechanisms to ensure public health objectives and the effective protection of privacy? How does the GDPR deal with data that concern children’s health and academic research?

By analysing a number of diverse questions concerning big health data under the GDPR from various different perspectives, this book will appeal to those interested in privacy, data protection, big data, health sciences, information technology, the GDPR, EU and human rights law….(More)”.

Data Justice and COVID-19: Global Perspectives

Curated on August 15, 2020August 13, 2020 by Stefaan Verhulst

Book edited by edited by Linnet Taylor, Aaron Martin, Gargi Sharma and Shazade Jameson: “In early 2020, as the COVID-19 pandemic swept the world and states of emergency were declared by one country after another, the global technology sector—already equipped with unprecedented wealth, power, and influence—mobilised to seize the opportunity. This collection is an account of what happened next and captures the emergent conflicts and responses around the world. The essays provide a global perspective on the implications of these developments for justice: they make it possible to compare how the intersection of state and corporate power—and the way that power is targeted and exercised—confronts, and invites resistance from, civil society in countries worldwide.

This edited volume captures the technological response to the pandemic in 33 countries, accompanied by nine thematic reflections, and reflects the unfolding of the first wave of the pandemic.

This book can be read as a guide to the landscape of technologies deployed during the pandemic and also be used to compare individual country strategies. It will prove useful as a tool for teaching and learning in various disciplines and as a reference point for activists and analysts interested in issues of data justice.

The essays interrogate these technologies and the political, legal, and regulatory structures that determine how they are applied. In doing so,the book exposes the workings of state technological power to critical assessment and contestation….(More)”

Strengthening Privacy Protections in COVID-19 Mobile Phone–Enhanced Surveillance Programs

Curated on August 11, 2020August 11, 2020 by Stefaan Verhulst

Rand Report: “Dozens of countries, including the United States, have been using mobile phone tools and data sources for COVID-19 surveillance activities, such as tracking infections and community spread, identifying populated areas at risk, and enforcing quarantine orders. These tools can augment traditional epidemiological interventions, such as contact tracing with technology-based data collection (e.g., automated signaling and record-keeping on mobile phone apps). As the response progresses, other beneficial technologies could include tools that authenticate those with low risk of contagion or that build community trust as stay-at-home orders are lifted.

However, the potential benefits that COVID-19 mobile phone–enhanced public health (“mobile”) surveillance program tools could provide are also accompanied by potential for harm. There are significant risks to citizens from the collection of sensitive data, including personal health, location, and contact data. People whose personal information is being collected might worry about who will receive the data, how those recipients might use the data, how the data might be shared with other entities, and what measures will be taken to safeguard the data from theft or abuse.

The risk of privacy violations can also impact government accountability and public trust. The possibility that one’s privacy will be violated by government officials or technology companies might dissuade citizens from getting tested for COVID-19, downloading public health–oriented mobile phone apps, or sharing symptom or location data. More broadly, real or perceived privacy violations might discourage citizens from believing government messaging or complying with government orders regarding COVID-19.

As U.S. public health agencies consider COVID-19-related mobile surveillance programs, they will need to address privacy concerns to encourage broad uptake and protect against privacy harms. Otherwise, COVID-19 mobile surveillance programs likely will be ineffective and the data collected unrepresentative of the situation on the ground….(More)“.

What privacy preserving techniques make possible: for transport authorities

Curated on August 9, 2020August 12, 2020 by Stefaan Verhulst

Blog by Georgina Bourke: “The Mayor of London listed cycling and walking as key population health indicators in the London Health Inequalities Strategy. The pandemic has only amplified the need for people to use cycling as a safer and healthier mode of transport. Yet as the majority of cyclists are white, Black communities are less likely to get the health benefits that cycling provides. Groups like Transport for London (TfL) should monitor how different communities cycle and who is excluded. Organisations like the London Office of Technology and Innovation (LOTI) could help boroughs procure privacy preserving technology to help their efforts.

But at the moment, it’s difficult for public organisations to access mobility data held by private companies. One reason is because mobility data is sensitive. Even if you remove identifiers like name and address, there’s still a risk you can reidentify someone by linking different data sets together. This means you could track how an individual moved around a city. I wrote more about the privacy risks with mobility data in a previous blog post. The industry’s awareness of privacy issues in using and sharing mobility data is rising. In the case of Los Angeles Department of Transport’s Mobility Data Specification (LADOT), Uber is concerned about sharing anonymised data because of the privacy risk. Both organisations are now involved in a legal battle to see which has the rights to the data. This might have been avoided if Uber had applied privacy preserving techniques….

Privacy preserving techniques can help mobility providers share important insights with authorities without compromising peoples’ privacy.

Instead of requiring access to all customer trip data, authorities could ask specific questions like, where are the least popular places to cycle? If mobility providers apply techniques like randomised response, an individual’s identity is obscured by the noise added to the data. This means it’s highly unlikely that someone could be reidentified later on. And because this technique requires authorities to ask very specific questions – for randomised response to work, the answer has to be binary, ie Yes or No – authorities will also be practicing data minimisation by default.

It’s easy to imagine transport authorities like TfL combining privacy preserved mobility data from multiple mobility providers to compare insights and measure service provision. They could cross reference the privacy preserved bike trip data with demographic data in the local area to learn how different communities cycle. The first step to addressing inequality is being able to measure it….(More)”.

Public perceptions on data sharing: key insights from the UK and the USA

Curated on August 3, 2020August 3, 2020 by Stefaan Verhulst

Paper by Saira Ghafur, Jackie Van Dael, Melanie Leis and Ara Darzi, and Aziz Sheikh: “Data science and artificial intelligence (AI) have the potential to transform the delivery of health care. Health care as a sector, with all of the longitudinal data it holds on patients across their lifetimes, is positioned to take advantage of what data science and AI have to offer. The current COVID-19 pandemic has shown the benefits of sharing data globally to permit a data-driven response through rapid data collection, analysis, modelling, and timely reporting.

Despite its obvious advantages, data sharing is a controversial subject, with researchers and members of the public justifiably concerned about how and why health data are shared. The most common concern is privacy; even when data are (pseudo-)anonymised, there remains a risk that a malicious hacker could, using only a few datapoints, re-identify individuals. For many, it is often unclear whether the risks of data sharing outweigh the benefits.

A series of surveys over recent years indicate that the public holds a range of views about data sharing. Over the past few years, there have been several important data breaches and cyberattacks. This has resulted in patients and the public questioning the safety of their data, including the prospect or risk of their health data being shared with unauthorised third parties.

We surveyed people across the UK and the USA to examine public attitude towards data sharing, data access, and the use of AI in health care. These two countries were chosen as comparators as both are high-income countries that have had substantial national investments in health information technology (IT) with established track records of using data to support health-care planning, delivery, and research. The UK and USA, however, have sharply contrasting models of health-care delivery, making it interesting to observe if these differences affect public attitudes.

Willingness to share anonymised personal health information varied across receiving bodies (figure). The more commercial the purpose of the receiving institution (eg, for an insurance or tech company), the less often respondents were willing to share their anonymised personal health information in both the UK and the USA. Older respondents (≥35 years) in both countries were generally less likely to trust any organisation with their anonymised personal health information than younger respondents (<35 years)…

Despite the benefits of big data and technology in health care, our findings suggest that the rapid development of novel technologies has been received with concern. Growing commodification of patient data has increased awareness of the risks involved in data sharing. There is a need for public standards that secure regulation and transparency of data use and sharing and support patient understanding of how data are used and for what purposes….(More)”.

Interventions to mitigate the racially discriminatory impacts of emerging tech including AI

Curated on July 30, 2020July 30, 2020 by Stefaan Verhulst

Joint Civil Society Statement: “As widespread recent protests have highlighted, racial inequality remains an urgent and devastating issue around the world, and this is as true in the context of technology as it is everywhere else. In fact, it may be more so, as algorithmic technologies based on big data are deployed at previously unimaginable scale, reproducing the discriminatory systems that build and govern them.

The undersigned organizations welcome the publication of the report “Racial discrimination and emerging digital technologies: a human rights analysis,” by Special Rapporteur on contemporary forms of racism, racial discrimination, xenophobia and related intolerance, E. Tendayi Achiume, and wish to underscore the importance and timeliness of a number of the recommendations made therein:

Technologies that have had or will have significant racially discriminatory impacts should be banned outright.
While incremental regulatory approaches may be appropriate in some contexts, where a technology is demonstrably likely to cause racially discriminatory harm, it should not be deployed until that harm can be prevented. Moreover, certain technologies may always have disparate racial impacts, no matter how much their accuracy can be improved. In the present moment, racially discriminatory technologies include facial and affect recognition technology and so-called predictive analytics. We support Special Rapporteur Achiume’s call for mandatory human rights impact assessments as a prerequisite for the adoption of new technologies. We also believe that where such assessments reveal that a technology has a high likelihood of deleterious racially disparate impacts, states should prevent its use through a ban or moratorium. We join the Special Rapporteur in welcoming recent municipal bans, for example, on the use of facial recognition technology, and encourage national governments to adopt similar policies. Correspondingly, we reiterate our support for states’ imposition of an immediate moratorium on the trade and use of privately developed surveillance tools until such time as states enact appropriate safeguards, and congratulate Special Rapporteur Achiume on joining that call.
Gender mainstreaming and representation along racial, national and other intersecting identities requires radical improvement at all levels of the tech sector.…
Technologists cannot solve political, social, and economic problems without the input of domain experts and those personally impacted.…
Access to technology is as urgent an issue of racial discrimination as inequity in the design of technologies themselves.…
Representative and disaggregated data is a necessary, if not sufficient, condition for racial equity in emerging digital technologies, but it must be collected and managed equitably as well.…
States as well as corporations must provide remedies for racial discrimination, including reparations.… (More)”.

The Atlas of Surveillance

Curated on July 28, 2020July 28, 2020 by Stefaan Verhulst

Electronic Frontier Foundation: “Law enforcement surveillance isn’t always secret. These technologies can be discovered in news articles and government meeting agendas, in company press releases and social media posts. It just hasn’t been aggregated before.

That’s the starting point for the Atlas of Surveillance, a collaborative effort between the Electronic Frontier Foundation and the University of Nevada, Reno Reynolds School of Journalism. Through a combination of crowdsourcing and data journalism, we are creating the largest-ever repository of information on which law enforcement agencies are using what surveillance technologies. The aim is to generate a resource for journalists, academics, and, most importantly, members of the public to check what’s been purchased locally and how technologies are spreading across the country.

We specifically focused on the most pervasive technologies, including drones, body-worn cameras, face recognition, cell-site simulators, automated license plate readers, predictive policing, camera registries, and gunshot detection. Although we have amassed more than 5,000 datapoints in 3,000 jurisdictions, our research only reveals the tip of the iceberg and underlines the need for journalists and members of the public to continue demanding transparency from criminal justice agencies….(More)”.

Differential Privacy for Privacy-Preserving Data Analysis

Curated on July 27, 2020July 27, 2020 by Stefaan Verhulst

Introduction to a Special Blog Series by NIST: “…How can we use data to learn about a population, without learning about specific individuals within the population? Consider these two questions:

“How many people live in Vermont?”
“How many people named Joe Near live in Vermont?”

The first reveals a property of the whole population, while the second reveals information about one person. We need to be able to learn about trends in the population while preventing the ability to learn anything new about a particular individual. This is the goal of many statistical analyses of data, such as the statistics published by the U.S. Census Bureau, and machine learning more broadly. In each of these settings, models are intended to reveal trends in populations, not reflect information about any single individual.

But how can we answer the first question “How many people live in Vermont?” — which we’ll refer to as a query — while preventing the second question from being answered “How many people name Joe Near live in Vermont?” The most widely used solution is called de-identification (or anonymization), which removes identifying information from the dataset. (We’ll generally assume a dataset contains information collected from many individuals.) Another option is to allow only aggregate queries, such as an average over the data. Unfortunately, we now understand that neither approach actually provides strong privacy protection. De-identified datasets are subject to database-linkage attacks. Aggregation only protects privacy if the groups being aggregated are sufficiently large, and even then, privacy attacks are still possible [1, 2, 3, 4].

Differential Privacy

Differential privacy [5, 6] is a mathematical definition of what it means to have privacy. It is not a specific process like de-identification, but a property that a process can have. For example, it is possible to prove that a specific algorithm “satisfies” differential privacy.

Informally, differential privacy guarantees the following for each individual who contributes data for analysis: the output of a differentially private analysis will be roughly the same, whether or not you contribute your data. A differentially private analysis is often called a mechanism, and we denote it ℳ.

Figure 1: Informal Definition of Differential Privacy

Figure 1 illustrates this principle. Answer “A” is computed without Joe’s data, while answer “B” is computed with Joe’s data. Differential privacy says that the two answers should be indistinguishable. This implies that whoever sees the output won’t be able to tell whether or not Joe’s data was used, or what Joe’s data contained.

We control the strength of the privacy guarantee by tuning the privacy parameter ε, also called a privacy loss or privacy budget. The lower the value of the ε parameter, the more indistinguishable the results, and therefore the more each individual’s data is protected.

Figure 2: Formal Definition of Differential Privacy

We can often answer a query with differential privacy by adding some random noise to the query’s answer. The challenge lies in determining where to add the noise and how much to add. One of the most commonly used mechanisms for adding noise is the Laplace mechanism [5, 7].

Queries with higher sensitivity require adding more noise in order to satisfy a particular `epsilon` quantity of differential privacy, and this extra noise has the potential to make results less useful. We will describe sensitivity and this tradeoff between privacy and usefulness in more detail in future blog posts….(More)”.

Addressing trust in public sector data use

Curated on July 22, 2020July 22, 2020 by Stefaan Verhulst

Centre for Data Ethics and Innovation: “Data sharing is fundamental to effective government and the running of public services. But it is not an end in itself. Data needs to be shared to drive improvements in service delivery and benefit citizens. For this to happen sustainably and effectively, public trust in the way data is shared and used is vital. Without such trust, the government and wider public sector risks losing society’s consent, setting back innovation as well as the smooth running of public services. Maximising the benefits of data driven technology therefore requires a solid foundation of societal approval.

AI and data driven technology offer extraordinary potential to improve decision making and service delivery in the public sector – from improved diagnostics to more efficient infrastructure and personalised public services. This makes effective use of data more important than it has ever been, and requires a step-change in the way data is shared and used. Yet sharing more data also poses risks and challenges to current governance arrangements.

The only way to build trust sustainably is to operate in a trustworthy way. Without adequate safeguards the collection and use of personal data risks changing power relationships between the citizen and the state. Insights derived by big data and the matching of different data sets can also undermine individual privacy or personal autonomy. Trade-offs are required which reflect democratic values, wider public acceptability and a shared vision of a data driven society. CDEI has a key role to play in exploring this challenge and setting out how it can be addressed. This report identifies barriers to data sharing, but focuses on building and sustaining the public trust which is vital if society is to maximise the benefits of data driven technology.

There are many areas where the sharing of anonymised and identifiable personal data by the public sector already improves services, prevents harm, and benefits the public. Over the last 20 years, different governments have adopted various measures to increase data sharing, including creating new legal sharing gateways. However, despite efforts to increase the amount of data sharing across the government, and significant successes in areas like open data, data sharing continues to be challenging and resource-intensive. This report identifies a range of technical, legal and cultural barriers that can inhibit data sharing.

Barriers to data sharing in the public sector

Technical barriers include limited adoption of common data standards and inconsistent security requirements across the public sector. Such inconsistency can prevent data sharing, or increase the cost and time for organisations to finalise data sharing agreements.

While there are often pre-existing legal gateways for data sharing, underpinned by data protection legislation, there is still a large amount of legal confusion on the part of public sector bodies wishing to share data which can cause them to start from scratch when determining legality and commit significant resources to legal advice. It is not unusual for the development of data sharing agreements to delay the projects for which the data is intended. While the legal scrutiny of data sharing arrangements is an important part of governance, improving the efficiency of these processes – without sacrificing their rigour – would allow data to be shared more quickly and at less expense.

Even when legal, the permissive nature of many legal gateways means significant cultural and organisational barriers to data sharing remain. Individual departments and agencies decide whether or not to share the data they hold and may be overly risk averse. Data sharing may not be prioritised by a department if it would require them to bear costs to deliver benefits that accrue elsewhere (i.e. to those gaining access to the data). Departments sharing data may need to invest significant resources to do so, as well as considering potential reputational or legal risks. This may hold up progress towards finding common agreement on data sharing. When there is an absence of incentives, even relatively small obstacles may mean data sharing is not deemed worthwhile by those who hold the data – despite the fact that other parts of the public sector might benefit significantly….(More)”.

Privacy‐Preserving Data Visualization: Reflections on the State of the Art and Research Opportunities

Curated on July 22, 2020July 22, 2020 by Stefaan Verhulst

Paper by Kaustav Bhattacharjee, Min Chen. and Aritra Dasgupta: “Preservation of data privacy and protection of sensitive information from potential adversaries constitute a key socio‐technical challenge in the modern era of ubiquitous digital transformation. Addressing this challenge needs analysis of multiple factors: algorithmic choices for balancing privacy and loss of utility, potential attack scenarios that can be undertaken by adversaries, implications for data owners, data subjects, and data sharing policies, and access control mechanisms that need to be built into interactive data interfaces.

Visualization has a key role to play as part of the solution space, both as a medium of privacy‐aware information communication and also as a tool for understanding the link between privacy parameters and data sharing policies. The field of privacy‐preserving data visualization has witnessed progress along many of these dimensions. In this state‐of‐the‐art report, our goal is to provide a systematic analysis of the approaches, methods, and techniques used for handling data privacy in visualization. We also reflect on the road‐map ahead by analyzing the gaps and research opportunities for solving some of the pressing socio‐technical challenges involving data privacy with the help of visualization….(More)”.