Anonymization: The imperfect science of using data while preserving privacy


Paper by Andrea Gadotti et al: “Information about us, our actions, and our preferences is created at scale through surveys or scientific studies or as a result of our interaction with digital devices such as smartphones and fitness trackers. The ability to safely share and analyze such data is key for scientific and societal progress. Anonymization is considered by scientists and policy-makers as one of the main ways to share data while minimizing privacy risks. In this review, we offer a pragmatic perspective on the modern literature on privacy attacks and anonymization techniques. We discuss traditional de-identification techniques and their strong limitations in the age of big data. We then turn our attention to modern approaches to share anonymous aggregate data, such as data query systems, synthetic data, and differential privacy. We find that, although no perfect solution exists, applying modern techniques while auditing their guarantees against attacks is the best approach to safely use and share data today…(More)”.
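As a rough illustration of one of the modern techniques the abstract names, differential privacy, the sketch below adds Laplace noise to a counting query. It is not taken from the paper: the function names `laplace_noise` and `private_count`, the sample data, and the choice of the Laplace mechanism are all assumptions made for illustration only.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential variables with
    mean `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count of the records that satisfy `predicate`.

    A counting query has sensitivity 1 (adding or removing one person
    changes the count by at most 1), so Laplace noise with scale
    1 / epsilon is the standard calibration for epsilon-differential privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    # Hypothetical data: ages of six survey respondents.
    ages = [23, 35, 41, 67, 70, 18]
    rng = random.Random()
    # Smaller epsilon means more noise and a stronger privacy guarantee.
    print(private_count(ages, lambda age: age >= 40, epsilon=0.5, rng=rng))
```

A single released count is only one query; repeated queries consume additional privacy budget, which is one reason the review stresses auditing guarantees against attacks.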

AI mass surveillance at Paris Olympics


Article by Anne Toomey McKenna: “The 2024 Paris Olympics is drawing the eyes of the world as thousands of athletes and support personnel and hundreds of thousands of visitors from around the globe converge in France. It’s not just the eyes of the world that will be watching. Artificial intelligence systems will be watching, too.

Government and private companies will be using advanced AI tools and other surveillance tech to conduct pervasive and persistent surveillance before, during and after the Games. The Olympic world stage and international crowds pose increased security risks so significant that in recent years authorities and critics have described the Olympics as the “world’s largest security operations outside of war.”

The French government, hand in hand with the private tech sector, has harnessed that legitimate need for increased security as grounds to deploy technologically advanced surveillance and data gathering tools. Its surveillance plans to meet those risks, including controversial use of experimental AI video surveillance, are so extensive that the country had to change its laws to make the planned surveillance legal.

The plan goes beyond new AI video surveillance systems. According to news reports, the prime minister’s office has negotiated a provisional decree that is classified to permit the government to significantly ramp up traditional, surreptitious surveillance and information gathering tools for the duration of the Games. These include wiretapping; collecting geolocation, communications and computer data; and capturing greater amounts of visual and audio data…(More)”.

Community consent: neither a ceiling nor a floor


Article by Jasmine McNealy: “The 23andMe breach and the Golden State Killer case are two of the more “flashy” cases, but questions of consent, especially the consent of all of those affected by biodata collection and analysis in more mundane or routine health and medical research projects, are just as important. The communities of people affected have expectations about their privacy and the possible impacts of inferences that could be made about them in data processing systems. Researchers must, then, acquire community consent when attempting to work with networked biodata. 

Several benefits of community consent exist, especially for marginalized and vulnerable populations. These benefits include:

  • Ensuring that information about the research project spreads throughout the community,
  • Removing potential barriers that might be created by resistance from community members,
  • Alleviating the possible concerns of individuals about the perspectives of community leaders, and 
  • Allowing the recruitment of participants using methods most salient to the community.

But community consent does not replace individual consent and limits exist for both community and individual consent. Therefore, within the context of a biorepository, understanding whether community consent might be a ceiling or a floor requires examining governance and autonomy…(More)”.

The Great Scrape: The Clash Between Scraping and Privacy


Paper by Daniel J. Solove and Woodrow Hartzog: “Artificial intelligence (AI) systems depend on massive quantities of data, often gathered by “scraping” – the automated extraction of large amounts of data from the internet. A great deal of scraped data is about people. This personal data provides the grist for AI tools such as facial recognition, deep fakes, and generative AI. Although scraping enables web searching, archival, and meaningful scientific research, scraping for AI can also be objectionable or even harmful to individuals and society.

Organizations are scraping at an escalating pace and scale, even though many privacy laws are seemingly incongruous with the practice. In this Article, we contend that scraping must undergo a serious reckoning with privacy law. Scraping violates nearly all of the key principles in privacy laws, including fairness; individual rights and control; transparency; consent; purpose specification and secondary use restrictions; data minimization; onward transfer; and data security. With scraping, data protection laws built around these requirements are ignored.

Scraping has evaded a reckoning with privacy law largely because scrapers act as if all publicly available data were free for the taking. But the public availability of scraped data shouldn’t give scrapers a free pass. Privacy law regularly protects publicly available data, and privacy principles are implicated even when personal data is accessible to others.

This Article explores the fundamental tension between scraping and privacy law. With the zealous pursuit and astronomical growth of AI, we are in the midst of what we call the “great scrape.” There must now be a great reconciliation…(More)”.

Everyone Has A Price — And Corporations Know Yours


Article by David Dayen: “Six years ago, I was at a conference at the University of Chicago, the intellectual heart of corporate-friendly capitalism, when my eyes found the cover of the Chicago Booth Review, the business school’s flagship publication. “Are You Ready for Personalized Pricing?” the headline asked. I wasn’t, so I started reading.

The story looked at how online shopping, persistent data collection, and machine-learning algorithms could combine to generate the stuff of economists’ dreams: individual prices for each customer. It even recounted an experiment in 2015, where online employment website ZipRecruiter essentially outsourced its pricing strategy to two University of Chicago economists, Sanjog Misra and Jean-Pierre Dubé…(More)”.

How the Rise of the Camera Launched a Fight to Protect Gilded Age Americans’ Privacy


Article by Sohini Desai: “In 1904, a widow named Elizabeth Peck had her portrait taken at a studio in a small Iowa town. The photographer sold the negatives to Duffy’s Pure Malt Whiskey, a company that avoided liquor taxes for years by falsely advertising its product as medicinal. Duffy’s ads claimed the fantastical: that it cured everything from influenza to consumption, that it was endorsed by clergymen, that it could help you live until the age of 106. The portrait of Peck ended up in one of these dubious ads, published in newspapers across the country alongside what appeared to be her unqualified praise: “After years of constant use of your Pure Malt Whiskey, both by myself and as given to patients in my capacity as nurse, I have no hesitation in recommending it.”

Duffy’s lies were numerous. Peck (misleadingly identified as “Mrs. A. Schuman”) was not a nurse, and she had not spent years constantly slinging back malt beverages. In fact, she fully abstained from alcohol. Peck never consented to the ad.

The camera’s first great age—which began in 1888 when George Eastman debuted the Kodak—is full of stories like this one. Beyond the wonders of a quickly developing art form and technology lay widespread lack of control over one’s own image, perverse incentives to make a quick buck, and generalized fear at the prospect of humiliation and the invasion of privacy…(More)”.

Cryptographers Discover a New Foundation for Quantum Secrecy


Article by Ben Brubaker: “…Say you want to send a private message, cast a secret vote or sign a document securely. If you do any of these tasks on a computer, you’re relying on encryption to keep your data safe. That encryption needs to withstand attacks from codebreakers with their own computers, so modern encryption methods rely on assumptions about what mathematical problems are hard for computers to solve.

But as cryptographers laid the mathematical foundations for this approach to information security in the 1980s, a few researchers discovered that computational hardness wasn’t the only way to safeguard secrets. Quantum theory, originally developed to understand the physics of atoms, turned out to have deep connections to information and cryptography. Researchers found ways to base the security of a few specific cryptographic tasks directly on the laws of physics. But these tasks were strange outliers — for all others, there seemed to be no alternative to the classical computational approach.

By the end of the millennium, quantum cryptography researchers thought that was the end of the story. But in just the past few years, the field has undergone another seismic shift.

“There’s been this rearrangement of what we believe is possible with quantum cryptography,” said Henry Yuen, a quantum information theorist at Columbia University.

In a string of recent papers, researchers have shown that most cryptographic tasks could still be accomplished securely even in hypothetical worlds where practically all computation is easy. All that matters is the difficulty of a special computational problem about quantum theory itself.

“The assumptions you need can be way, way, way weaker,” said Fermi Ma, a quantum cryptographer at the Simons Institute for the Theory of Computing in Berkeley, California. “This is giving us new insights into computational hardness itself.”…(More)”.

Uganda’s Sweeping Surveillance State Is Built on National ID Cards


Article by Olivia Solon: “Uganda has spent hundreds of millions of dollars in the past decade on biometric tools that document a person’s unique physical characteristics, such as their face, fingerprints and irises, to form the basis of a comprehensive identification system. While the system is central to many of the state’s everyday functions, as Museveni has grown increasingly authoritarian over nearly four decades in power, it has also become a powerful mechanism for surveilling politicians, journalists, human rights advocates and ordinary citizens, according to dozens of interviews and hundreds of pages of documents obtained and analyzed by Bloomberg and nonprofit investigative newsroom Lighthouse Reports.

It’s a cautionary tale for any country considering establishing a biometric identity system without rigorous checks and balances and input from civil society. Dozens of global south countries have adopted this approach as part of an effort to meet sustainable development goals from the UN, which considers having a legal identity to be a fundamental human right. But, despite billions of dollars of investment, with backing from organizations including the World Bank, those identity systems haven’t always lived up to expectations. In many cases, the key problem is the failure to register large swathes of the population, leading to exclusion from public services. But in other places, like Uganda, inclusion in the system has been weaponized for surveillance purposes.

A year-long investigation by Bloomberg and Lighthouse Reports sheds new light on the ways in which Museveni’s regime has built and deployed this system to target opponents and consolidate power. It shows how the underlying software and data sets are easily accessed by individuals at all levels of law enforcement, despite official claims to the contrary. It also highlights, in some cases for the first time, how senior government and law enforcement officials have used these tools to target individuals deemed to pose a political threat…(More)”.

What are location services and how do they work?


Article by Douglas Crawford: “Location services refer to a combination of technologies used in devices like smartphones and computers that use data from your device’s GPS, WiFi, mobile (cellular networks), and sometimes even Bluetooth connections to determine and track your geographic location.

This information can be accessed by your operating system (OS) and the apps installed on your device. In many cases, this allows them to perform their purpose correctly or otherwise deliver useful content and features. 

For example, navigation/map, weather, ridesharing (such as Uber or Lyft), and health and fitness tracking apps require location services to perform their functions, while dating, travel, and social media apps can offer additional functionality with access to your device’s location services (such as being able to locate a Tinder match or see recommendations for nearby restaurants).

There’s no doubt location services (and the apps that use them) can be useful. However, the technology can be (and is) also abused by apps to track your movements. The apps then usually sell this information to advertising and analytics companies that combine it with other data to create a profile of you, which they can then use to sell ads.

Unfortunately, this behavior is not limited to “rogue” apps. Apps usually regarded as legitimate, including almost all Google apps, Facebook, Instagram, and others, routinely send detailed and highly sensitive location details back to their developers by default. And it’s not just apps: operating systems themselves, such as Google’s Android and Microsoft Windows, also closely track your movements using location services.

This makes weighing the undeniable usefulness of location services with the need to maintain a basic level of privacy a tricky balancing act. However, because location services are so easy to abuse, all operating systems include built-in safeguards that give you some control over their use.

In this article, we’ll look at how location services work…(More)”.

The not-so-silent type: Vulnerabilities across keyboard apps reveal keystrokes to network eavesdroppers


Report by Jeffrey Knockel, Mona Wang, and Zoë Reichert: “Typing logographic languages such as Chinese is more difficult than typing alphabetic languages, where each letter can be represented by one key. There is no way to fit the tens of thousands of Chinese characters that exist onto a single keyboard. Despite this obvious challenge, technologies have developed which make typing in Chinese possible. To enable the input of Chinese characters, a writer will generally use a keyboard app with an “Input Method Editor” (IME). IMEs offer a variety of approaches to inputting Chinese characters, including via handwriting, voice, and optical character recognition (OCR). One popular phonetic input method is Zhuyin, and shape or stroke-based input methods such as Cangjie or Wubi are commonly used as well. However, used by nearly 76% of mainland Chinese keyboard users, the most popular way of typing in Chinese is the pinyin method, which is based on the pinyin romanization of Chinese characters.

All of the keyboard apps we analyze in this report fall into the category of input method editors (IMEs) that offer pinyin input. These keyboard apps are particularly interesting because they have grown to accommodate the challenge of allowing users to type Chinese characters quickly and easily. While many keyboard apps operate locally, solely within a user’s device, IME-based keyboard apps often have cloud features which enhance their functionality. Because of the complexities of predicting which characters a user may want to type next, especially in logographic languages like Chinese, IMEs often offer “cloud-based” prediction services which reach out over the network. Enabling “cloud-based” features in these apps means that longer strings of syllables that users type will be transmitted to servers elsewhere. As many have previously pointed out, “cloud-based” keyboards and input methods can function as vectors for surveillance and essentially behave as keyloggers. While the content of what users type is traveling from their device to the cloud, it is additionally vulnerable to network attackers if not properly secured. This report is not about how operators of cloud-based IMEs read users’ keystrokes, which is a phenomenon that has already been extensively studied and documented. This report is primarily concerned with the issue of protecting this sensitive data from network eavesdroppers…(More)”.