Privacy-Enhancing and Privacy-Preserving Technologies: Understanding the Role of PETs and PPTs in the Digital Age


Paper by the Centre for Information Policy Leadership: “The paper explores how organizations are approaching privacy-enhancing technologies (“PETs”) and how PETs can advance data protection principles, and provides examples of how specific types of PETs work. It also explores potential challenges to the use of PETs and possible solutions to those challenges.

CIPL emphasizes the enormous potential inherent in these technologies to mitigate privacy risks and support innovation, and recommends a number of steps to foster further development and adoption of PETs. In particular, CIPL calls for policymakers and regulators to incentivize the use of PETs through clearer guidance on key legal concepts that impact the use of PETs, and by adopting a pragmatic approach to the application of these concepts.

CIPL’s recommendations towards wider adoption are as follows:

  • Issue regulatory guidance and incentives regarding PETs: Official regulatory guidance addressing PETs in the context of specific legal obligations or concepts (such as anonymization) will incentivize greater investment in PETs.
  • Increase education and awareness about PETs: PET developers and providers need to show tangible evidence of the value of PETs and help policymakers, regulators and organizations understand how such technologies can facilitate responsible data use.
  • Develop industry standards for PETs: Industry standards would help facilitate interoperability for the use of PETs across jurisdictions and help codify best practices to support technical reliability to foster trust in these technologies.
  • Recognize PETs as a demonstrable element of accountability: PETs complement robust data privacy management programs and should be recognized as an element of organizational accountability…(More)”.

Toward a Solid Acceptance of the Decentralized Web of Personal Data: Societal and Technological Convergence


Article by Ana Pop Stefanija et al: “Citizens using common online services such as social media, health tracking, or online shopping effectively hand over control of their personal data to the service providers—often large corporations. The services using and processing personal data are also holding the data. This situation is problematic, as has been recognized for some time: competition and innovation are stifled; data is duplicated; and citizens are in a weak position to enforce legal rights such as access, rectification, or erasure. The approach to address this problem has been to ascertain that citizens can access and update, with every possible service provider, the personal data that providers hold of or about them—the foundational view taken in the European General Data Protection Regulation (GDPR).

Recently, however, various societal, technological, and regulatory efforts are taking a very different approach, turning things around. The central tenet of this complementary view is that citizens should regain control of their personal data. Once in control, citizens can decide which providers they want to share data with, and if so, exactly which part of their data. Moreover, they can revisit these decisions anytime…(More)”.

How Tracking and Technology in Cars Is Being Weaponized by Abusive Partners


Article by Kashmir Hill: “After almost 10 years of marriage, Christine Dowdall wanted out. Her husband was no longer the charming man she had fallen in love with. He had become narcissistic, abusive and unfaithful, she said. After one of their fights turned violent in September 2022, Ms. Dowdall, a real estate agent, fled their home in Covington, La., driving her Mercedes-Benz C300 sedan to her daughter’s house near Shreveport, five hours away. She filed a domestic abuse report with the police two days later.

Her husband, a Drug Enforcement Administration agent, didn’t want to let her go. He called her repeatedly, she said, first pleading with her to return, and then threatening her. She stopped responding to him, she said, even though he texted and called her hundreds of times.

Ms. Dowdall, 59, started occasionally seeing a strange new message on the display in her Mercedes, about a location-based service called “mbrace.” The second time it happened, she took a photograph and searched for the name online.

“I realized, oh my God, that’s him tracking me,” Ms. Dowdall said.

“Mbrace” was part of “Mercedes me” — a suite of connected services for the car, accessible via a smartphone app. Ms. Dowdall had only ever used the Mercedes Me app to make auto loan payments. She hadn’t realized that the service could also be used to track the car’s location. One night, when she visited a male friend’s home, her husband sent the man a message with a thumbs-up emoji. A nearby camera captured his car driving in the area, according to the detective who worked on her case.

Ms. Dowdall called Mercedes customer service repeatedly to try to remove her husband’s digital access to the car, but the loan and title were in his name, a decision the couple had made because he had a better credit score than she did. Even though she was making the payments, had a restraining order against her husband and had been granted sole use of the car during divorce proceedings, Mercedes representatives told her that her husband was the customer so he would be able to keep his access. There was no button she could press to take away the app’s connection to the vehicle.

“This is not the first time that I’ve heard something like this,” one of the representatives told Ms. Dowdall…(More)”.

The 2010 Census Confidentiality Protections Failed, Here’s How and Why


Paper by John M. Abowd, et al: “Using only 34 published tables, we reconstruct five variables (census block, sex, age, race, and ethnicity) in the confidential 2010 Census person records. Using the 38-bin age variable tabulated at the census block level, at most 20.1% of reconstructed records can differ from their confidential source on even a single value for these five variables. Using only published data, an attacker can verify that all records in 70% of all census blocks (97 million people) are perfectly reconstructed. The tabular publications in Summary File 1 thus have prohibited disclosure risk similar to the unreleased confidential microdata. Reidentification studies confirm that an attacker can, within blocks with perfect reconstruction accuracy, correctly infer the actual census response on race and ethnicity for 3.4 million vulnerable population uniques (persons with nonmodal characteristics) with 95% accuracy, the same precision as the confidential data achieve and far greater than statistical baselines. The flaw in the 2010 Census framework was the assumption that aggregation prevented accurate microdata reconstruction, justifying weaker disclosure limitation methods than were applied to 2010 Census public microdata. The framework used for 2020 Census publications defends against attacks that are based on reconstruction, as we also demonstrate here. Finally, we show that alternatives to the 2020 Census Disclosure Avoidance System with similar accuracy (enhanced swapping) also fail to protect confidentiality, and those that partially defend against reconstruction attacks (incomplete suppression implementations) destroy the primary statutory use case: data for redistricting all legislatures in the country in compliance with the 1965 Voting Rights Act…(More)”.
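The paper’s central claim — that enough published aggregate tables can uniquely determine the underlying person-level records — can be illustrated in miniature. The following is a toy sketch only (a hypothetical three-person block and invented tables, not the paper’s actual 34-table attack): the attacker enumerates every microdata set consistent with the released counts, and if exactly one survives, reconstruction is exact.

```python
from itertools import combinations_with_replacement, product
from collections import Counter

# Toy census block: 3 residents, each described by sex and an age bin.
SEXES = ("M", "F")
AGE_BINS = ("0-17", "18-64", "65+")

def published_tables(records):
    """The 'released tables' for a block: sex counts, age-bin counts,
    and sex counts among seniors (a partial cross-tabulation)."""
    sex_tab = Counter(r[0] for r in records)
    age_tab = Counter(r[1] for r in records)
    senior_by_sex = Counter(r[0] for r in records if r[1] == "65+")
    return sex_tab, age_tab, senior_by_sex

# Confidential microdata the agency holds but never releases directly.
confidential = [("M", "18-64"), ("F", "18-64"), ("F", "65+")]
released = published_tables(confidential)

# Attacker: enumerate every multiset of 3 records consistent with the
# released tables. If exactly one survives, reconstruction is exact.
candidates = [
    list(combo)
    for combo in combinations_with_replacement(product(SEXES, AGE_BINS), 3)
    if published_tables(list(combo)) == released
]

print(len(candidates))  # 1 -- the tables pin down the microdata exactly
print(sorted(candidates[0]) == sorted(confidential))  # True
```

With only the two one-way tables the candidate set would contain several reconstructions; adding the small cross-tabulation collapses it to one. The paper’s point is that the thousands of detailed tables in Summary File 1 play that collapsing role at national scale.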

A Feasibility Study of Differentially Private Summary Statistics and Regression Analyses with Evaluations on Administrative and Survey Data


Report by Andrés F. Barrientos, Aaron R. Williams, Joshua Snoke, Claire McKay Bowen: “Federal administrative data, such as tax data, are invaluable for research, but because of privacy concerns, access to these data is typically limited to select agencies and a few individuals. An alternative to sharing microlevel data is to allow individuals to query statistics without directly accessing the confidential data. This paper studies the feasibility of using differentially private (DP) methods to make certain queries while preserving privacy. We also include new methodological adaptations to existing DP regression methods for using new data types and returning standard error estimates. We define feasibility as the impact of DP methods on analyses for making public policy decisions and the queries’ accuracy according to several utility metrics. We evaluate the methods using Internal Revenue Service data and public-use Current Population Survey data and identify how specific data features might challenge some of these methods. Our findings show that DP methods are feasible for simple, univariate statistics but struggle to produce accurate regression estimates and confidence intervals. To the best of our knowledge, this is the first comprehensive statistical study of DP regression methodology on real, complex datasets, and the findings have significant implications for the direction of a growing research field and public policy…(More)”.
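The kind of simple, univariate DP query the report finds feasible can be sketched with the standard Laplace mechanism. This is a generic illustration, not the authors’ implementation: the function names, budget split, and clamping bounds below are our own illustrative choices.

```python
import math
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) by inverting its CDF."""
    u = rng.random() - 0.5
    sign = 1.0 if u >= 0 else -1.0
    return -scale * sign * math.log(1.0 - 2.0 * abs(u))

def dp_count(data, predicate, epsilon, rng):
    # A count has L1 sensitivity 1 (one record changes it by at most 1),
    # so Laplace noise with scale 1/epsilon gives epsilon-DP.
    return sum(1 for x in data if predicate(x)) + laplace_noise(1.0 / epsilon, rng)

def dp_mean(data, lower, upper, epsilon, rng):
    # Clamp to [lower, upper] to bound the sum's sensitivity, then split
    # the privacy budget between a noisy sum and a noisy count.
    clamped = [min(max(x, lower), upper) for x in data]
    noisy_sum = sum(clamped) + laplace_noise((upper - lower) / (epsilon / 2), rng)
    noisy_n = len(data) + laplace_noise(1.0 / (epsilon / 2), rng)
    return noisy_sum / max(noisy_n, 1.0)

rng = random.Random(0)
incomes = [random.Random(i).uniform(20_000, 120_000) for i in range(10_000)]
est = dp_mean(incomes, 0, 200_000, epsilon=1.0, rng=rng)
true = sum(incomes) / len(incomes)
print(abs(est - true) / true < 0.05)  # a usable estimate at this sample size
```

At this sample size the noise is small relative to the signal, which matches the report’s finding for univariate statistics; regression coefficients and their confidence intervals, where the paper reports DP methods struggling, compound many such noisy quantities.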

How Americans View Data Privacy


Pew Research: “…Americans – particularly Republicans – have grown more concerned about how the government uses their data. The share who say they are worried about government use of people’s data has increased from 64% in 2019 to 71% today. That reflects rising concern among Republicans (from 63% to 77%), while Democrats’ concern has held steady. (Each group includes those who lean toward the respective party.)

The public increasingly says they don’t understand what companies are doing with their data. Some 67% say they understand little to nothing about what companies are doing with their personal data, up from 59%.

Most believe they have little to no control over what companies or the government do with their data. While these shares have ticked down compared with 2019, vast majorities feel this way about data collected by companies (73%) and the government (79%).

We’ve studied Americans’ views on data privacy for years. The topic remains in the national spotlight today, and it’s particularly relevant given the policy debates ranging from regulating AI to protecting kids on social media. But these are far from abstract concepts. They play out in the day-to-day lives of Americans in the passwords they choose, the privacy policies they agree to and the tactics they take – or not – to secure their personal information. We surveyed 5,101 U.S. adults using Pew Research Center’s American Trends Panel to give voice to people’s views and experiences on these topics.

In addition to the key findings covered on this page, the three chapters of this report provide more detail on these and related topics…(More)”.

What Big Tech Knows About Your Body


Article by Yael Grauer: “If you were seeking online therapy from 2017 to 2021—and a lot of people were—chances are good that you found your way to BetterHelp, which today describes itself as the world’s largest online-therapy purveyor, with more than 2 million users. Once you were there, after a few clicks, you would have completed a form—an intake questionnaire, not unlike the paper one you’d fill out at any therapist’s office: Are you new to therapy? Are you taking any medications? Having problems with intimacy? Experiencing overwhelming sadness? Thinking of hurting yourself? BetterHelp would have asked you if you were religious, if you were LGBTQ, if you were a teenager. These questions were just meant to match you with the best counselor for your needs, small text would have assured you. Your information would remain private.

Except BetterHelp isn’t exactly a therapist’s office, and your information may not have been completely private. In fact, according to a complaint brought by federal regulators, for years, BetterHelp was sharing user data—including email addresses, IP addresses, and questionnaire answers—with third parties, including Facebook and Snapchat, for the purposes of targeting ads for its services. It was also, according to the Federal Trade Commission, poorly regulating what those third parties did with users’ data once they got them. In July, the company finalized a settlement with the FTC and agreed to refund $7.8 million to consumers whose privacy regulators claimed had been compromised. (In a statement, BetterHelp admitted no wrongdoing and described the alleged sharing of user information as an “industry-standard practice.”)

We leave digital traces about our health everywhere we go: by completing forms like BetterHelp’s. By requesting a prescription refill online. By clicking on a link. By asking a search engine about dosages or directions to a clinic or pain in chest dying. By shopping, online or off. By participating in consumer genetic testing. By stepping on a smart scale or using a smart thermometer. By joining a Facebook group or a Discord server for people with a certain medical condition. By using internet-connected exercise equipment. By using an app or a service to count your steps or track your menstrual cycle or log your workouts. Even demographic and financial data unrelated to health can be aggregated and analyzed to reveal or infer sensitive information about people’s physical or mental-health conditions…(More)”.

It’s Official: Cars Are the Worst Product Category We Have Ever Reviewed for Privacy


Article by the Mozilla Foundation: “Car makers have been bragging about their cars being “computers on wheels” for years to promote their advanced features. However, the conversation about what driving a computer means for its occupants’ privacy hasn’t really caught up. While we worried that our doorbells and watches that connect to the internet might be spying on us, car brands quietly entered the data business by turning their vehicles into powerful data-gobbling machines. Machines that, thanks to all those brag-worthy bells and whistles, have an unmatched power to watch, listen, and collect information about what you do and where you go in your car.

All 25 car brands we researched earned our *Privacy Not Included warning label — making cars the official worst category of products for privacy that we have ever reviewed…(More)”.

A new way to look at data privacy


Article by Adam Zewe: “Imagine that a team of scientists has developed a machine-learning model that can predict whether a patient has cancer from lung scan images. They want to share this model with hospitals around the world so clinicians can start using it in diagnosis.

But there’s a problem. To teach their model how to predict cancer, they showed it millions of real lung scan images, a process called training. Those sensitive data, which are now encoded into the inner workings of the model, could potentially be extracted by a malicious agent. The scientists can prevent this by adding noise, or generic randomness, to the model that makes it harder for an adversary to guess the original data. However, this perturbation reduces a model’s accuracy, so the less noise one can add, the better.

MIT researchers have developed a technique that enables the user to potentially add the smallest amount of noise possible, while still ensuring the sensitive data are protected.

The researchers created a new privacy metric, which they call Probably Approximately Correct (PAC) Privacy, and built a framework based on this metric that can automatically determine the minimal amount of noise that needs to be added. Moreover, this framework does not need knowledge of the inner workings of a model or its training process, which makes it easier to use for different types of models and applications.

In several cases, the researchers show that the amount of noise required to protect sensitive data from adversaries is far less with PAC Privacy than with other approaches. This could help engineers create machine-learning models that provably hide training data, while maintaining accuracy in real-world settings…

A fundamental question in data privacy is: How much sensitive data could an adversary recover from a machine-learning model with noise added to it?

Differential Privacy, one popular privacy definition, says privacy is achieved if an adversary who observes the released model cannot infer whether an arbitrary individual’s data was used in the training process. But provably preventing an adversary from distinguishing data usage often requires large amounts of noise to obscure it. This noise reduces the model’s accuracy.

PAC Privacy looks at the problem a bit differently. It characterizes how hard it would be for an adversary to reconstruct any part of randomly sampled or generated sensitive data after noise has been added, rather than only focusing on the distinguishability problem…(More)”
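In spirit, PAC Privacy’s black-box calibration can be caricatured as: rerun the release mechanism on resampled data, measure how unstable its output is, and add noise proportional to that instability. The following is a deliberately simplified, hypothetical sketch — the real framework bounds mutual information and computes anisotropic noise, and every name below is illustrative rather than taken from the authors’ code.

```python
import random
import statistics

def output_instability(release_fn, data, trials, rng):
    # Black-box step: rerun the release function on bootstrap resamples of
    # the data and measure how much its (scalar) output moves. Outputs that
    # barely depend on any single record need very little noise.
    outputs = []
    for _ in range(trials):
        resample = [rng.choice(data) for _ in data]
        outputs.append(release_fn(resample))
    return statistics.stdev(outputs)

def private_release(release_fn, data, trials=200, noise_multiplier=1.0, rng=None):
    rng = rng or random.Random(0)
    sigma = output_instability(release_fn, data, trials, rng)
    # Add Gaussian noise scaled to the measured instability -- no knowledge
    # of release_fn's internals was needed.
    return release_fn(data) + rng.gauss(0.0, noise_multiplier * sigma)

# Toy "model": release the mean of the data with calibrated noise.
data = [random.Random(i).gauss(50.0, 5.0) for i in range(1_000)]
released = private_release(lambda d: sum(d) / len(d), data)
print(45 < released < 55)  # the calibrated noise stays near the true mean
```

The appeal mirrored here is the one the article describes: the mechanism is treated as a black box, so the same calibration loop applies to a mean, a regression, or a neural network, and stable outputs automatically receive less noise than the worst-case analysis behind Differential Privacy would demand.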