AI Global Surveillance Technology


Carnegie Endowment: “Artificial intelligence (AI) technology is rapidly proliferating around the world. A growing number of states are deploying advanced AI surveillance tools to monitor, track, and surveil citizens to accomplish a range of policy objectives—some lawful, others that violate human rights, and many of which fall into a murky middle ground.

In order to appropriately address the effects of this technology, it is important to first understand where these tools are being deployed and how they are being used.

To provide greater clarity, Carnegie presents an AI Global Surveillance (AIGS) Index—representing one of the first research efforts of its kind. The index compiles empirical data on AI surveillance use for 176 countries around the world. It does not distinguish between legitimate and unlawful uses of AI surveillance. Rather, the purpose of the research is to show how new surveillance capabilities are transforming the ability of governments to monitor and track individuals or systems. It specifically asks:

  • Which countries are adopting AI surveillance technology?
  • What specific types of AI surveillance are governments deploying?
  • Which countries and companies are supplying this technology?

Learn more about our findings and how AI surveillance technology is spreading rapidly around the globe….(More)”.

How big data can affect your bank account – and life


Alena Buyx, Barbara Prainsack and Aisling McMahon at The Conversation: “Mustafa loves good coffee. In his free time, he often browses high-end coffee machines that he cannot currently afford but is saving for. One day, travelling to a friend’s wedding abroad, he gets to sit next to another friend on the plane. When Mustafa complains about how much he paid for his ticket, it turns out that his friend paid less than half of what he paid, even though they booked around the same time.

He looks into possible reasons for this and concludes that it must be related to his browsing of expensive coffee machines and equipment. He is very angry about this and complains to the airline, which sends him a lukewarm apology that refers to personalised pricing models. Mustafa feels that this is unfair but does not challenge it. Pursuing it any further would cost him time and money.

This story – which is hypothetical, but can and does occur – demonstrates the potential for people to be harmed by data use in the current “big data” era. Big data analytics involves using large amounts of data from many sources which are linked and analysed to find patterns that help to predict human behaviour. Such analysis, even when perfectly legal, can harm people.

Mustafa, for example, has likely been affected by personalised pricing practices whereby his search for high-end coffee machines has been used to make certain assumptions about his willingness to pay or buying power. This in turn may have led to his higher-priced airfare. While this has not resulted in serious harm in Mustafa’s case, instances of serious emotional and financial harm are, unfortunately, not rare, including the denial of mortgages for individuals and risks to a person’s general creditworthiness based on associations with other individuals. This might happen if an individual shares characteristics with other individuals who have poor repayment histories….(More)”.

Sharenthood: Why We Should Think before We Talk about Our Kids Online


Book by Leah Plunkett: “Our children’s first digital footprints are made before they can walk—even before they are born—as parents use fertility apps to aid conception, post ultrasound images, and share their baby’s hospital mug shot. Then, in rapid succession come terabytes of baby pictures stored in the cloud, digital baby monitors with built-in artificial intelligence, and real-time updates from daycare. When school starts, there are cafeteria cards that catalog food purchases, bus passes that track when kids are on and off the bus, electronic health records in the nurse’s office, and a school surveillance system that has eyes everywhere. Unwittingly, parents, teachers, and other trusted adults are compiling digital dossiers for children that could be available to everyone—friends, employers, law enforcement—forever. In this incisive book, Leah Plunkett examines the implications of “sharenthood”—adults’ excessive digital sharing of children’s data. She outlines the mistakes adults make with kids’ private information, the risks that result, and the legal system that enables “sharenting.”

Plunkett describes various modes of sharenting—including “commercial sharenting,” efforts by parents to use their families’ private experiences to make money—and unpacks the faulty assumptions made by our legal system about children, parents, and privacy. She proposes a “thought compass” to guide adults in their decision making about children’s digital data: play, forget, connect, and respect. Enshrining every false step and bad choice, Plunkett argues, can rob children of their chance to explore and learn lessons. The Internet needs to forget. We need to remember….(More)”.

Study finds Big Data eliminates confidentiality in court judgements


Swissinfo: “Swiss researchers have found that algorithms that mine large swaths of data can eliminate anonymity in federal court rulings. This could have major ramifications for transparency and privacy protection.

This is the result of a study by the University of Zurich’s Institute of Law, published in the legal journal “Jusletter” and shared by Swiss public television SRF on Monday.

The study relied on a “web scraping technique” or mining of large swaths of data. The researchers created a database of all decisions of the Supreme Court available online from 2000 to 2018 – a total of 122,218 decisions. Additional decisions from the Federal Administrative Court and the Federal Office of Public Health were also added.

Using an algorithm and manual searches for connections between data, the researchers were able to de-anonymise (that is, reveal the identities in) 84% of the judgments in less than an hour.
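The core of such an attack is record linkage: matching attributes that survive anonymisation in a ruling against a public dataset until only one candidate remains. The sketch below illustrates the idea under stated assumptions; the field names, data, and matching keys are entirely hypothetical and are not drawn from the Zurich study.

```python
# Minimal record-linkage sketch: an anonymised ruling is "de-anonymised"
# when its surviving attributes match exactly one entry in a public
# registry. All field names and data here are hypothetical.

def link_records(rulings, registry, keys=("decision_year", "drug_class")):
    """Return rulings whose key attributes match exactly one registry entry."""
    matches = {}
    for ruling in rulings:
        candidates = [
            entry for entry in registry
            if all(ruling[k] == entry[k] for k in keys)
        ]
        if len(candidates) == 1:  # a unique match reveals the identity
            matches[ruling["case_id"]] = candidates[0]["company"]
    return matches

rulings = [
    {"case_id": "A-101", "decision_year": 2016, "drug_class": "statin"},
    {"case_id": "A-102", "decision_year": 2017, "drug_class": "opioid"},
]
registry = [
    {"company": "PharmaCo X", "decision_year": 2016, "drug_class": "statin"},
    {"company": "PharmaCo Y", "decision_year": 2017, "drug_class": "opioid"},
    {"company": "PharmaCo Z", "decision_year": 2017, "drug_class": "statin"},
]
print(link_records(rulings, registry))
# {'A-101': 'PharmaCo X', 'A-102': 'PharmaCo Y'}
```

The more attributes an "anonymised" document preserves, the more often this intersection collapses to a single candidate, which is why the researchers warn the technique generalises to any publicly available database.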

In this specific study, the researchers were able to identify the pharma companies and medicines hidden in the documents of the complaints filed in court.  

Study authors say that this could have far-reaching consequences for transparency and privacy. One of the study’s co-authors, Kerstin Noëlle Vokinger, professor of law at the University of Zurich, explains that, “With today’s technological possibilities, anonymisation is no longer guaranteed in certain areas”. The researchers say the technique could be applied to any publicly available database.

Vokinger added there is a need to balance necessary transparency while safeguarding the personal rights of individuals.

Adrian Lobsiger, the Swiss Federal Data Protection Commissioner, told SRF that this confirms his view that facts may need to be treated as personal data in the age of technology….(More)”.

Government wants access to personal data while it pushes privacy


Sara Fischer and Scott Rosenberg at Axios: “Over the past two years, the U.S. government has tried to rein in how major tech companies use the personal data they’ve gathered on their customers. At the same time, government agencies are themselves seeking to harness those troves of data.

Why it matters: Tech platforms use personal information to target ads, whereas the government can use it to prevent and solve crimes, deliver benefits to citizens — or (illegally) target political dissent.

Driving the news: A new report from the Wall Street Journal details the ways in which family DNA testing sites like FamilyTreeDNA are pressured by the FBI to hand over customer data to help solve criminal cases using DNA.

  • The trend has privacy experts worried about the potential implications of the government having access to large pools of genetic data, even though many people whose data is included never agreed to its use for that purpose.

The FBI has particular interest in data from genetic and social media sites, because it could help solve crimes and protect the public.

  • For example, the FBI is “soliciting proposals from outside vendors for a contract to pull vast quantities of public data” from Facebook, Twitter Inc. and other social media companies, the Wall Street Journal reports.
  • The request is meant to help the agency surveil social behavior to “mitigate multifaceted threats, while ensuring all privacy and civil liberties compliance requirements are met.”
  • Meanwhile, the Trump administration has also urged social media platforms to cooperate with the government in efforts to flag individual users as potential mass shooters.

Other agencies have their eyes on big data troves as well.

  • Earlier this year, settlement talks between Facebook and the Department of Housing and Urban Development broke down over an advertising discrimination lawsuit when, according to a Facebook spokesperson, HUD “insisted on access to sensitive information — like user data — without adequate safeguards.”
  • HUD presumably wanted access to the data to ensure advertising discrimination wasn’t occurring on the platform, but it’s unclear whether the agency needed user data to be able to support that investigation….(More)”.

The Ethics of Hiding Your Data From the Machines


Molly Wood at Wired: “…But now that data is being used to train artificial intelligence, and the insights those future algorithms create could quite literally save lives.

So while targeted advertising is an easy villain, data-hogging artificial intelligence is a dangerously nuanced and highly sympathetic bad guy, like Erik Killmonger in Black Panther. And it won’t be easy to hate.

I recently met with a company that wants to do a sincerely good thing. They’ve created a sensor that pregnant women can wear, and it measures their contractions. It can reliably predict when women are going into labor, which can help reduce preterm births and C-sections. It can get women into care sooner, which can reduce both maternal and infant mortality.

All of this is an unquestionable good.

And this little device is also collecting a treasure trove of information about pregnancy and labor that is feeding into clinical research that could upend maternal care as we know it. Did you know that the way most obstetricians learn to track a woman’s progress through labor is based on a single study from the 1950s, involving 500 women, all of whom were white?…

To save the lives of pregnant women and their babies, researchers and doctors, and yes, startup CEOs and even artificial intelligence algorithms need data. To cure cancer, or at least offer personalized treatments that have a much higher possibility of saving lives, those same entities will need data….

And for us consumers, well, a blanket refusal to offer up our data to the AI gods isn’t necessarily the good choice either. I don’t want to be the person who refuses to contribute my genetic data via 23andMe to a massive research study that could, and I actually believe this is possible, lead to cures and treatments for diseases like Parkinson’s and Alzheimer’s and who knows what else.

I also think I deserve a realistic assessment of the potential for harm to find its way back to me, because I didn’t think through or wasn’t told all the potential implications of that choice—like how, let’s be honest, we all felt a little stung when we realized the 23andMe research would be through a partnership with drugmaker (and reliable drug price-hiker) GlaxoSmithKline. Drug companies, like targeted ads, are easy villains—even though this partnership actually could produce a Parkinson’s drug. But do we know what GSK’s privacy policy looks like? That deal was a level of sharing we didn’t necessarily expect….(More)”.

Stop the Open Data Bus, We Want to Get Off


Paper by Chris Culnane, Benjamin I. P. Rubinstein, and Vanessa Teague: “The subject of this report is the re-identification of individuals in the Myki public transport dataset released as part of the Melbourne Datathon 2018. We demonstrate the ease with which we were able to re-identify ourselves, our co-travellers, and complete strangers; our analysis raises concerns about the nature and granularity of the data released, in particular the ability to identify vulnerable or sensitive groups…..

This work highlights how a large number of passengers could be re-identified in the 2018 Myki data release, with detailed discussion of specific people. The implications of re-identification are potentially serious: ex-partners, one-time acquaintances, or other parties can determine places of home, work, times of travel, co-travelling patterns—presenting risk to vulnerable groups in particular…

In 2018 the Victorian Government released a large passenger-centric transport dataset to a data science competition—the 2018 Melbourne Datathon. Access to the data was unrestricted, with a URL provided on the datathon’s website to download the complete dataset from an Amazon S3 Bucket. Over 190 teams analysed the data through the two-month competition period. The data consisted of touch on and touch off events for the Myki smart card ticketing system used throughout the state of Victoria, Australia. With such data, contestants would be able to apply retrospective analyses on an entire public transport system, explore suitability of predictive models, etc.

The Myki ticketing system is used across Victorian public transport: on trains, buses and trams. The dataset was a longitudinal dataset, consisting of touch on and touch off events from Week 27 in 2015 through to Week 26 in 2018. Each event contained a card identifier (cardId; not the actual card number), the card type, the time of the touch on or off, and various location information, for example a stop ID or route ID, along with other fields which we omit here for brevity. Events could be indexed by the cardId and as such, all the events associated with a single card could be retrieved. There are a total of 15,184,336 cards in the dataset—more than twice the 2018 population of Victoria. It appears that all touch on and off events for metropolitan trains and trams have been included, though other forms of transport such as intercity trains and some buses are absent. In total there are nearly 2 billion touch on and off events in the dataset.
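Because all events for a card can be retrieved by its cardId, an attacker who knows even a few of someone’s trips (say, a shared commute) can search for the card whose history contains all of them. The sketch below illustrates this uniqueness problem under stated assumptions; the field names and events are illustrative, not the dataset’s real schema.

```python
# Hedged sketch of trip-based re-identification: find the card(s) whose
# event history contains every trip the attacker already knows about.
# Field names ("cardId", "stopId", "time") and data are illustrative.

from collections import defaultdict

def index_by_card(events):
    """Group touch events by cardId, as the dataset's indexing allows."""
    by_card = defaultdict(set)
    for e in events:
        by_card[e["cardId"]].add((e["stopId"], e["time"]))
    return by_card

def candidate_cards(by_card, known_trips):
    """Cards consistent with every trip the attacker knows."""
    return [card for card, trips in by_card.items()
            if set(known_trips) <= trips]

events = [
    {"cardId": 1, "stopId": "Flinders", "time": "2018-03-05T08:01"},
    {"cardId": 1, "stopId": "Richmond", "time": "2018-03-05T08:14"},
    {"cardId": 2, "stopId": "Flinders", "time": "2018-03-05T08:01"},
    {"cardId": 2, "stopId": "Southern Cross", "time": "2018-03-05T08:20"},
]
known = [("Flinders", "2018-03-05T08:01"), ("Richmond", "2018-03-05T08:14")]
print(candidate_cards(index_by_card(events), known))  # [1]
```

With billions of timestamped events per card, a handful of known trips is typically enough to shrink the candidate set to one, which is how the authors re-identified themselves and their co-travellers.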

No information was provided as to the de-identification that was performed on the dataset. Our analysis indicates that little to no de-identification took place on the bulk of the data, as will become evident in Section 3. The exception is the cardId, which appears to have been mapped in some way from the Myki Card Number. The exact mapping has not been discovered, although concerns remain as to its security effectiveness….(More)”.

Data Management Law for the 2020s: The Lost Origins and the New Needs


Paper by Przemysław Pałka: “In the data analytics society, each individual’s disclosure of personal information imposes costs on others. This disclosure enables companies, deploying novel forms of data analytics, to infer new knowledge about other people and to use this knowledge to engage in potentially harmful activities. These harms go beyond privacy and include difficult-to-detect price discrimination, preference manipulation, and even social exclusion. Currently existing, individual-focused data protection regimes leave the law unable to account for these social costs or to manage them. 

This Article suggests a way out, by proposing to re-conceptualize the problem of social costs of data analytics through the new frame of “data management law.” It offers a critical comparison of the two existing models of data governance: the American “notice and choice” approach and the European “personal data protection” regime (currently expressed in the GDPR). Tracing their origin to a single report issued in 1973, the article demonstrates how they developed differently under the influence of different ideologies (market-centered liberalism, and human rights, respectively). It also shows how both ultimately failed at addressing the challenges outlined already forty-five years ago. 

To tackle these challenges, this Article argues for three normative shifts. First, it proposes to go beyond “privacy” and towards “social costs of data management” as the framework for conceptualizing and mitigating negative effects of corporate data usage. Second, it argues for going beyond individual interests to account for collective ones, and for replacing contracts with regulation as the means of creating norms governing data management. Third, it argues that the nature of the decisions about these norms is political, and so political means, in place of technocratic solutions, need to be employed….(More)”.

The Data Protection Officer Handbook


Handbook by Douwe Korff and Marie Georges: “This Handbook was prepared for and is used in the EU-funded “T4DATA” training-of-trainers programme. Part I explains the history and development of European data protection law and provides an overview of European data protection instruments including the Council of Europe Convention and its “Modernisation” and the various EU data protection instruments relating to Justice and Home Affairs, the CFSP and the EU institutions, before focusing on the GDPR in Part II. The final part (Part III) consists of detailed practical advice on the various tasks of the Data Protection Officer now institutionalised by the GDPR. Although produced for the T4DATA programme that focusses on DPOs in the public sector, it is hoped that the Handbook will also be useful to anyone else interested in the application of the GDPR, including DPOs in the private sector….(More)”.

Guidance Note: Statistical Disclosure Control


Centre for Humanitarian Data: “Survey and needs assessment data, or what is known as ‘microdata’, is essential for providing adequate response to crisis-affected people. However, collecting this information does present risks. Even as great effort is taken to remove unique identifiers such as names and phone numbers from microdata so no individual persons or communities are exposed, combining key variables such as location or ethnicity can still allow for re-identification of individual respondents. Statistical Disclosure Control (SDC) is one method for reducing this risk. 
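One basic check used in SDC practice is k-anonymity: a respondent is at risk when their combination of quasi-identifiers (such as location and ethnicity) is shared by fewer than k records. The sketch below shows a minimal version of that check; it is an illustration of the general technique, not the Centre’s actual tooling, and all column names and data are hypothetical.

```python
# Minimal k-anonymity check, a common SDC risk measure: flag any
# combination of quasi-identifiers shared by fewer than k respondents.
# Column names and records are hypothetical.

from collections import Counter

def k_anonymity_violations(records, quasi_identifiers, k=2):
    """Return quasi-identifier combinations appearing fewer than k times."""
    combos = Counter(
        tuple(r[q] for q in quasi_identifiers) for r in records
    )
    return {combo: n for combo, n in combos.items() if n < k}

records = [
    {"district": "North", "ethnicity": "A", "age_band": "20-29"},
    {"district": "North", "ethnicity": "A", "age_band": "20-29"},
    {"district": "South", "ethnicity": "B", "age_band": "30-39"},  # unique
]
print(k_anonymity_violations(records, ("district", "ethnicity"), k=2))
# {('South', 'B'): 1}
```

Records flagged this way would then be suppressed, generalised (e.g. coarser locations or age bands), or otherwise perturbed before release.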

The Centre has developed a Guidance Note on Statistical Disclosure Control that outlines the steps involved in the SDC process, potential applications for its use, case studies and key actions for humanitarian data practitioners to take when managing sensitive microdata. Along with an overview of what SDC is and what tools are available, the Guidance Note outlines how the Centre is using this process to mitigate risk for datasets shared on HDX. …(More)”.