Data, Privacy Laws and Firm Production: Evidence from the GDPR


Paper by Mert Demirer, Diego J. Jiménez Hernández, Dean Li & Sida Peng: “By regulating how firms collect, store, and use data, privacy laws may change the role of data in production and alter firm demand for information technology inputs. We study how firms respond to privacy laws in the context of the EU’s General Data Protection Regulation (GDPR) by using seven years of data from a large global cloud-computing provider. Our difference-in-differences estimates indicate that, in response to the GDPR, EU firms decreased data storage by 26% and data processing by 15% relative to comparable US firms, becoming less “data-intensive.” To estimate the costs of the GDPR for firms, we propose and estimate a production function where data and computation serve as inputs to the production of “information.” We find that data and computation are strong complements in production and that firm responses are consistent with the GDPR representing a 20% increase in the cost of data on average. Variation in the firm-level effects of the GDPR and industry-level exposure to data, however, drives significant heterogeneity in our estimates of the impact of the GDPR on production costs…(More)”
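The paper's headline estimate can be illustrated with a toy difference-in-differences calculation. All figures below are hypothetical, chosen only so that the implied effect matches the reported 26% storage decline; they are not the paper's data:

```python
import math

# Hypothetical group-by-period means of log data storage for firms on a
# cloud platform. EU firms are treated by the GDPR; US firms are the
# comparison group. These numbers are illustrative, not the paper's.
mean_log_storage = {
    ("EU", "pre"): 5.0, ("EU", "post"): 4.7,
    ("US", "pre"): 5.1, ("US", "post"): 5.1,
}

# Difference-in-differences: the EU pre/post change minus the US pre/post
# change nets out trends common to both groups.
did = ((mean_log_storage[("EU", "post")] - mean_log_storage[("EU", "pre")])
       - (mean_log_storage[("US", "post")] - mean_log_storage[("US", "pre")]))

# A -0.30 log-point effect corresponds to roughly a 26% decline in levels.
pct_change = math.exp(did) - 1
print(f"DiD estimate: {did:.2f} log points ({pct_change:.0%})")
```

The log specification is why a -0.30 estimate translates into a percentage change via exp(did) - 1 rather than reading the coefficient directly as a percentage.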

Data Is What Data Does: Regulating Based on Harm and Risk Instead of Sensitive Data


Paper by Daniel J. Solove: “Heightened protection for sensitive data is becoming quite trendy in privacy laws around the world. Originating in European Union (EU) data protection law and included in the EU’s General Data Protection Regulation, sensitive data singles out certain categories of personal data for extra protection. Commonly recognized special categories of sensitive data include racial or ethnic origin, political opinions, religious or philosophical beliefs, trade union membership, health, sexual orientation and sex life, and biometric and genetic data.

Although heightened protection for sensitive data appropriately recognizes that not all situations involving personal data should be protected uniformly, the sensitive data approach is a dead end. The sensitive data categories are arbitrary and lack any coherent theory for identifying them. The borderlines of many categories are so blurry that they are useless. Moreover, it is easy to use nonsensitive data as a proxy for certain types of sensitive data.

Personal data is akin to a grand tapestry, with different types of data interwoven to a degree that makes it impossible to separate out the strands. With Big Data and powerful machine learning algorithms, most nonsensitive data give rise to inferences about sensitive data. In many privacy laws, data giving rise to inferences about sensitive data is also protected as sensitive data. Arguably, then, nearly all personal data can be sensitive, and the sensitive data categories can swallow up everything. As a result, most organizations are currently processing a vast amount of data in violation of the laws.

This Article argues that the problems with the sensitive data approach make it unworkable and counterproductive as well as expose a deeper flaw at the root of many privacy laws. These laws make a fundamental conceptual mistake—they embrace the idea that the nature of personal data is a sufficiently useful focal point for the law. But nothing meaningful for regulation can be determined solely by looking at the data itself. Data is what data does.

To be effective, privacy law must focus on harm and risk rather than on the nature of personal data…(More)”.

Future-Proofing Transparency: Re-Thinking Public Record Governance For the Age of Big Data


Paper by Beatriz Botero Arcila: “Public records, public deeds, and even open data portals often include personal information that can now be easily accessed online. Yet, for all the recent attention given to informational privacy and data protection, scant literature exists on the governance of personal information that is available in public documents. This Article examines the critical issue of balancing privacy and transparency within public record governance in the age of Big Data.

With Big Data and powerful machine learning algorithms, personal information in public records can easily be used to infer sensitive data about people or aggregated to create a comprehensive personal profile of almost anyone. This information is public and open, however, for many good reasons: ensuring political accountability, facilitating democratic participation, enabling economic transactions, and combating illegal activities such as money laundering and terrorism financing. Can the interest in record publicity coexist with the growing ease of deanonymizing and revealing sensitive information about individuals?

This Article addresses this question from a comparative perspective, focusing on US and EU access to information law. The Article shows that the publicity of records was, in the past and notwithstanding their presumptively public nature, protected in practice because most people would not trouble themselves to go to public offices to review them, and it was practically impossible to aggregate them to draw extensive profiles of people. Drawing from this insight and contemporary debates on data governance, this Article challenges the binary classification of data as either published or not and proposes a risk-based framework that re-inserts that natural friction into public record governance by leveraging techno-legal methods in how information is published and accessed…(More)”.

Defending the rights of refugees and migrants in the digital age


Primer by Amnesty International: “This is an introduction to the pervasive and rapid deployment of digital technologies in asylum and migration management systems across the globe, including the United States, United Kingdom and the European Union. Defending the rights of refugees and migrants in the digital age highlights some of the key digital technology developments in asylum and migration management systems, in particular systems that process large quantities of data, and the human rights issues arising from their use. This introductory briefing aims to build our collective understanding of these emerging technologies and hopes to add to wider advocacy efforts to stem their harmful effects…(More)”.

Privacy-Enhancing and Privacy-Preserving Technologies: Understanding the Role of PETs and PPTs in the Digital Age


Paper by the Centre for Information Policy Leadership: “The paper explores how organizations are approaching privacy-enhancing technologies (“PETs”) and how PETs can advance data protection principles, and provides examples of how specific types of PETs work. It also explores potential challenges to the use of PETs and possible solutions to those challenges.

CIPL emphasizes the enormous potential inherent in these technologies to mitigate privacy risks and support innovation, and recommends a number of steps to foster further development and adoption of PETs. In particular, CIPL calls for policymakers and regulators to incentivize the use of PETs through clearer guidance on key legal concepts that impact the use of PETs, and by adopting a pragmatic approach to the application of these concepts.

CIPL’s recommendations towards wider adoption are as follows:

  • Issue regulatory guidance and incentives regarding PETs: Official regulatory guidance addressing PETs in the context of specific legal obligations or concepts (such as anonymization) will incentivize greater investment in PETs.
  • Increase education and awareness about PETs: PET developers and providers need to show tangible evidence of the value of PETs and help policymakers, regulators and organizations understand how such technologies can facilitate responsible data use.
  • Develop industry standards for PETs: Industry standards would help facilitate interoperability for the use of PETs across jurisdictions and help codify best practices that support technical reliability and foster trust in these technologies.
  • Recognize PETs as a demonstrable element of accountability: PETs complement robust data privacy management programs and should be recognized as an element of organizational accountability…(More)”.

Toward a Solid Acceptance of the Decentralized Web of Personal Data: Societal and Technological Convergence


Article by Ana Pop Stefanija et al: “Citizens using common online services such as social media, health tracking, or online shopping effectively hand over control of their personal data to the service providers—often large corporations. The services using and processing personal data are also holding the data. This situation is problematic, as has been recognized for some time: competition and innovation are stifled; data is duplicated; and citizens are in a weak position to enforce legal rights such as access, rectification, or erasure. The approach to address this problem has been to ascertain that citizens can access and update, with every possible service provider, the personal data that providers hold of or about them—the foundational view taken in the European General Data Protection Regulation (GDPR).

Recently, however, various societal, technological, and regulatory efforts are taking a very different approach, turning things around. The central tenet of this complementary view is that citizens should regain control of their personal data. Once in control, citizens can decide which providers they want to share data with, and if so, exactly which part of their data. Moreover, they can revisit these decisions anytime…(More)”.

How Tracking and Technology in Cars Is Being Weaponized by Abusive Partners


Article by Kashmir Hill: “After almost 10 years of marriage, Christine Dowdall wanted out. Her husband was no longer the charming man she had fallen in love with. He had become narcissistic, abusive and unfaithful, she said. After one of their fights turned violent in September 2022, Ms. Dowdall, a real estate agent, fled their home in Covington, La., driving her Mercedes-Benz C300 sedan to her daughter’s house near Shreveport, five hours away. She filed a domestic abuse report with the police two days later.

Her husband, a Drug Enforcement Administration agent, didn’t want to let her go. He called her repeatedly, she said, first pleading with her to return, and then threatening her. She stopped responding to him, she said, even though he texted and called her hundreds of times.

Ms. Dowdall, 59, started occasionally seeing a strange new message on the display in her Mercedes, about a location-based service called “mbrace.” The second time it happened, she took a photograph and searched for the name online.

“I realized, oh my God, that’s him tracking me,” Ms. Dowdall said.

“Mbrace” was part of “Mercedes me” — a suite of connected services for the car, accessible via a smartphone app. Ms. Dowdall had only ever used the Mercedes Me app to make auto loan payments. She hadn’t realized that the service could also be used to track the car’s location. One night, when she visited a male friend’s home, her husband sent the man a message with a thumbs-up emoji. A nearby camera captured his car driving in the area, according to the detective who worked on her case.

Ms. Dowdall called Mercedes customer service repeatedly to try to remove her husband’s digital access to the car, but the loan and title were in his name, a decision the couple had made because he had a better credit score than she did. Even though she was making the payments, had a restraining order against her husband and had been granted sole use of the car during divorce proceedings, Mercedes representatives told her that her husband was the customer so he would be able to keep his access. There was no button she could press to take away the app’s connection to the vehicle.

“This is not the first time that I’ve heard something like this,” one of the representatives told Ms. Dowdall…(More)”.

The 2010 Census Confidentiality Protections Failed, Here’s How and Why


Paper by John M. Abowd, et al: “Using only 34 published tables, we reconstruct five variables (census block, sex, age, race, and ethnicity) in the confidential 2010 Census person records. Using the 38-bin age variable tabulated at the census block level, at most 20.1% of reconstructed records can differ from their confidential source on even a single value for these five variables. Using only published data, an attacker can verify that all records in 70% of all census blocks (97 million people) are perfectly reconstructed. The tabular publications in Summary File 1 thus have prohibited disclosure risk similar to the unreleased confidential microdata. Reidentification studies confirm that an attacker can, within blocks with perfect reconstruction accuracy, correctly infer the actual census response on race and ethnicity for 3.4 million vulnerable population uniques (persons with nonmodal characteristics) with 95% accuracy, the same precision as the confidential data achieve and far greater than statistical baselines. The flaw in the 2010 Census framework was the assumption that aggregation prevented accurate microdata reconstruction, justifying weaker disclosure limitation methods than were applied to 2010 Census public microdata. The framework used for 2020 Census publications defends against attacks that are based on reconstruction, as we also demonstrate here. Finally, we show that alternatives to the 2020 Census Disclosure Avoidance System with similar accuracy (enhanced swapping) also fail to protect confidentiality, and those that partially defend against reconstruction attacks (incomplete suppression implementations) destroy the primary statutory use case: data for redistricting all legislatures in the country in compliance with the 1965 Voting Rights Act…(More)”.
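The core idea of a reconstruction attack can be sketched in miniature (a hypothetical three-person block and made-up tables, not the paper's actual method or data): published aggregates act as constraints on the underlying microdata, and when only one set of person records satisfies all of them, the block is perfectly reconstructed from the tables alone.

```python
from collections import Counter
from itertools import combinations_with_replacement, product

# Hypothetical published tables for one tiny census block (made-up numbers).
TOTAL = 3
SEX_COUNTS = {"F": 1, "M": 2}
AGE_COUNTS = {"0-17": 1, "18-64": 2}
SEX_BY_AGE = {("M", "18-64"): 2, ("F", "0-17"): 1}  # a cross-tabulation

CELLS = list(product(SEX_COUNTS, AGE_COUNTS))  # all possible (sex, age) records

def consistent(block):
    """Does a candidate multiset of person records match every published table?"""
    c = Counter(block)
    return (
        sum(c.values()) == TOTAL
        and all(sum(v for (s, _), v in c.items() if s == sex) == n
                for sex, n in SEX_COUNTS.items())
        and all(sum(v for (_, a), v in c.items() if a == age) == n
                for age, n in AGE_COUNTS.items())
        and all(c.get(cell, 0) == n for cell, n in SEX_BY_AGE.items())
    )

# Brute-force every possible block of TOTAL records. Here the tables admit
# exactly one solution, so the microdata are recovered from aggregates alone.
solutions = [b for b in combinations_with_replacement(CELLS, TOTAL)
             if consistent(b)]
print(len(solutions))  # 1: the block is perfectly reconstructed
```

At real scale the attack the paper describes uses many more tables and integer programming rather than enumeration, but the logic is the same: aggregation alone does not prevent recovering the records.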
A Feasibility Study of Differentially Private Summary Statistics and Regression Analyses with Evaluations on Administrative and Survey Data


Report by Andrés F. Barrientos, Aaron R. Williams, Joshua Snoke, Claire McKay Bowen: “Federal administrative data, such as tax data, are invaluable for research, but because of privacy concerns, access to these data is typically limited to select agencies and a few individuals. An alternative to sharing microlevel data is to allow individuals to query statistics without directly accessing the confidential data. This paper studies the feasibility of using differentially private (DP) methods to make certain queries while preserving privacy. We also include new methodological adaptations to existing DP regression methods for using new data types and returning standard error estimates. We define feasibility as the impact of DP methods on analyses for making public policy decisions and the accuracy of the queries according to several utility metrics. We evaluate the methods using Internal Revenue Service data and public-use Current Population Survey data and identify how specific data features might challenge some of these methods. Our findings show that DP methods are feasible for simple, univariate statistics but struggle to produce accurate regression estimates and confidence intervals. To the best of our knowledge, this is the first comprehensive statistical study of DP regression methodology on real, complex datasets, and the findings have significant implications for the direction of a growing research field and public policy…(More)”.
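A minimal sketch of the simplest kind of mechanism such a study evaluates is the Laplace mechanism applied to a counting query (the report's regression and standard-error methods are considerably more involved, and the data below are made up):

```python
import math
import random

def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise via the inverse CDF."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def dp_count(data, predicate, epsilon, rng=None):
    """Differentially private count query.

    Adding or removing one record changes a count by at most 1, so the
    sensitivity is 1 and the noise scale is 1/epsilon: a smaller epsilon
    (stronger privacy) means noisier answers -- the accuracy trade-off
    a feasibility study of this kind measures.
    """
    rng = rng or random.Random(0)  # fixed seed here only for reproducibility
    true_count = sum(1 for x in data if predicate(x))
    return true_count + laplace_noise(1.0 / epsilon, rng)

# Hypothetical microdata: how many filers report income above $50,000?
incomes = [52_000, 61_000, 48_000, 75_000]
noisy_count = dp_count(incomes, lambda x: x > 50_000, epsilon=1.0)
print(round(noisy_count, 2))  # near the true count of 3, but randomized
```

The report's finding that DP works well for univariate statistics but poorly for regression reflects this trade-off compounding: a regression consumes the privacy budget across many such noisy queries, and the noise propagates into coefficient and standard-error estimates.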