IRS Used Cellphone Location Data to Try to Find Suspects


Byron Tau at the Wall Street Journal: “The Internal Revenue Service attempted to identify and track potential criminal suspects by purchasing access to a commercial database that records the locations of millions of American cellphones.

The IRS Criminal Investigation unit, or IRS CI, had a subscription to access the data in 2017 and 2018, and the way it used the data was revealed last week in a briefing by IRS CI officials to Sen. Ron Wyden’s (D., Ore.) office. The briefing was described to The Wall Street Journal by an aide to the senator.

IRS CI officials told Mr. Wyden’s office that their lawyers had given verbal approval for the use of the database, which is sold by a Virginia-based government contractor called Venntel Inc. Venntel obtains anonymized location data from the marketing industry and resells it to governments. IRS CI added that it let its Venntel subscription lapse after it failed to locate any targets of interest during the year it paid for the service, according to Mr. Wyden’s aide.

Justin Cole, a spokesman for IRS CI, said it entered into a “limited contract with Venntel to test their services against the law enforcement requirements of our agency.” IRS CI pursues the most serious and flagrant violations of tax law, and it said it used the Venntel database in “significant money-laundering, cyber, drug and organized-crime cases.”

The episode demonstrates a growing law enforcement interest in reams of anonymized cellphone movement data collected by the marketing industry. Government entities can try to use the data to identify individuals—which in many cases isn’t difficult with such databases.
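The re-identification point above can be illustrated with a toy sketch (all data, device behavior and place names below are invented): a device’s most frequent night-time and daytime locations often amount to a home/work pair, which is close to unique per person even when no name is attached to the record.

```python
from collections import Counter

# Hypothetical "anonymized" pings for a single device ID.
# Each ping is (hour_of_day, location). No name is attached, yet the
# most common night-time and daytime locations tend to correspond to
# a home and a workplace -- a pair that narrows identity dramatically.
pings = [
    (2, "oak_st_12"), (3, "oak_st_12"), (23, "oak_st_12"),
    (10, "main_st_400"), (11, "main_st_400"), (14, "main_st_400"),
]

def infer_home_work(pings):
    # Night: most common location between 9 p.m. and 5 a.m.
    night = Counter(loc for hour, loc in pings if hour >= 21 or hour <= 5)
    # Day: most common location during working hours.
    day = Counter(loc for hour, loc in pings if 9 <= hour <= 17)
    home = night.most_common(1)[0][0] if night else None
    work = day.most_common(1)[0][0] if day else None
    return home, work

print(infer_home_work(pings))  # ('oak_st_12', 'main_st_400')
```

This is why “anonymized” location databases are, as the article notes, often not difficult to tie back to individuals.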

It also shows that data from the marketing industry can be used as an alternative to obtaining data from cellphone carriers, a process that requires a court order. Until 2018, prosecutors needed “reasonable grounds” to seek cell tower records from a carrier. In June 2018, the U.S. Supreme Court strengthened the requirement to show probable cause that a crime has been committed before such data can be obtained from carriers….(More)”

Defining a ‘new normal’ for data privacy in the wake of COVID-19


Jack Dunn at IAPP: “…It is revealing that our relationship with privacy is amorphous and requires additional context in light of transformative technologies, new economic realities and public health emergencies. How can we reasonably evaluate the costs and benefits of Google or Facebook sharing location data with the federal government when it has been perfectly legal for Walgreens to share access to customer data with pharmaceutical advertisers? How does aggregating and anonymizing data safeguard privacy when a user’s personal data can be revealed through other data points?

The pandemic is only revealing that we’ve yet to reach a consensus on privacy norms that will come to define the digital age. 

This isn’t the first time that technology confounded notions of privacy and consumer protection. In fact, the constitutional right to privacy was born out of another public health crisis. Before 1965, 32 women died in childbirth for every 100,000 live births, and 25 infants died per 100,000 live births. As a result, medical professionals and women’s rights advocates began arguing for greater access to birth control. When state legislatures sought to minimize access, birth control advocates filed lawsuits that eventually led to the Supreme Court’s seminal case regarding the right to privacy, Griswold v. Connecticut.

Today, there is growing public concern over the way in which consumer data is used to consolidate economic gain among the few while steering public perception among the many — particularly at a time when privacy seems to be the price for ending public health emergencies.

But the COVID-19 outbreak is also highlighting how user data has the capacity to improve consumer well being and public health. While strict adherence to traditional notions of privacy may be ineffectual in a time of exponential technological growth, the history of our relationship to privacy and technology suggests regulatory policies can strike a balance between otherwise competing interests….(More)“.

Tech Firms Are Spying on You. In a Pandemic, Governments Say That’s OK.


Sam Schechner, Kirsten Grind and Patience Haggin at the Wall Street Journal: “While an undergraduate at the University of Virginia, Joshua Anton created an app to prevent users from drunk dialing, which he called Drunk Mode. He later began harvesting huge amounts of user data from smartphones to resell to advertisers.

Now Mr. Anton’s company, called X-Mode Social Inc., is one of a number of little-known location-tracking companies that are being deployed in the effort to reopen the country. State and local authorities wielding the power to decide when and how to reopen are leaning on these vendors for the data to underpin those critical judgment calls.

In California, Gov. Gavin Newsom’s office used data from Foursquare Labs Inc. to figure out if beaches were getting too crowded; when the state discovered they were, it tightened its rules. In Denver, the Tri-County Health Department is monitoring counties where the population on average tends to stray more than 330 feet from home, using data from Cuebiq Inc.

Researchers at the University of Texas at San Antonio are using movement data from a variety of companies, including the geolocation firm SafeGraph, to guide city officials there on the best strategies for getting residents back to work.

Many of the location-tracking firms, data brokers and other middlemen are part of the ad-tech industry, which has come under increasing fire in recent years for building what critics call a surveillance economy. Data for targeting ads at individuals, including location information, can also end up in the hands of law-enforcement agencies or political groups, often with limited disclosure to users. Privacy laws are cropping up in states including California, along with calls for federal privacy legislation like that in the European Union.

But some public-health authorities are setting aside those concerns to fight an unprecedented pandemic. Officials are desperate for all types of data to identify people potentially infected with the virus and to understand how they are behaving to predict potential hot spots—whether those people realize it or not…(More)”

IoT Security Is a Mess. Privacy ‘Nutrition’ Labels Could Help


Lily Hay Newman at Wired: “…Given that IoT security seems unlikely to magically improve anytime soon, researchers and regulators are rallying behind a new approach to managing IoT risk. Think of it as nutrition labels for embedded devices.

At the IEEE Symposium on Security & Privacy last month, researchers from Carnegie Mellon University presented a prototype security and privacy label they created based on interviews and surveys of people who own IoT devices, as well as privacy and security experts. They also published a tool for generating their labels. The idea is not only to shed light on a device’s security posture but also to explain how it manages user data and what privacy controls it has. For example, the labels highlight whether a device can get security updates and how long a company has pledged to support it, as well as the types of sensors present, the data they collect, and whether the company shares that data with third parties.

“In an IoT setting, the amount of sensors and information you have about users is potentially invasive and ubiquitous,” says Yuvraj Agarwal, a networking and embedded systems researcher who worked on the project. “It’s like trying to fix a leaky bucket. So transparency is the most important part. This work shows and enumerates all the choices and factors for consumers.”

Nutrition labels on packaged foods have a certain amount of standardization around the world, but they’re still more opaque than they could be. And security and privacy issues are even less intuitive to most people than soluble and insoluble fiber. So the CMU researchers focused a lot of their efforts on making their IoT label as transparent and accessible as possible. To that end, they included both a primary and secondary layer to the label. The primary label is what would be printed on device boxes. To access the secondary label, you could follow a URL or scan a QR code to see more granular information about a device….(More)”.
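As a rough illustration of the two-layer design described above, a label might be modeled as a primary layer printed on the box plus a secondary layer reached via the QR code’s URL. The field names and values below are invented for the sketch, not the CMU prototype’s actual schema:

```python
# Hypothetical two-layer IoT privacy label. Field names are illustrative;
# the CMU researchers' published label format may differ.
label = {
    "primary": {  # printed on the device box
        "security_updates": "automatic, supported until 2026",
        "sensors": "microphone, location",
        "data_shared_with": "third parties, for advertising",
    },
    "secondary_url": "https://example.com/label/smart-speaker",  # QR target
    "secondary": {  # granular detail behind the QR code / URL
        "collection_frequency": "continuous while device is on",
        "retention": "indefinite",
        "purposes": "functionality, advertising",
    },
}

def render_primary(label):
    # Produce the box-printed layer as human-readable lines.
    lines = [f"{k.replace('_', ' ')}: {v}" for k, v in label["primary"].items()]
    return "\n".join(lines)

print(render_primary(label))
```

The design choice mirrors food labels: a glanceable summary for purchase decisions, with full disclosure one scan away.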

Sharing Health Data and Biospecimens with Industry — A Principle-Driven, Practical Approach


Kayte Spector-Bagdady et al at the New England Journal of Medicine: “The advent of standardized electronic health records, sustainable biobanks, consumer-wellness applications, and advanced diagnostics has resulted in new health information repositories. As highlighted by the Covid-19 pandemic, these repositories create an opportunity for advancing health research by means of secondary use of data and biospecimens. Current regulations in this space give substantial discretion to individual organizations when it comes to sharing deidentified data and specimens. But some recent examples of health care institutions sharing individual-level data and specimens with companies have generated controversy. Academic medical centers are therefore both practically and ethically compelled to establish best practices for governing the sharing of such contributions with outside entities. We believe that the approach we have taken at Michigan Medicine could help inform the national conversation on this issue.

The Federal Policy for the Protection of Human Subjects offers some safeguards for research participants from whom data and specimens have been collected. For example, researchers must notify participants if commercial use of their specimens is a possibility. These regulations generally cover only federally funded work, however, and they don’t apply to deidentified data or specimens. Because participants value transparency regarding industry access to their data and biospecimens, our institution set out to create standards that would better reflect participants’ expectations and honor their trust. Using a principlist approach that balances beneficence and nonmaleficence, respect for persons, and justice, buttressed by recent analyses and findings regarding contributors’ preferences, Michigan Medicine established a formal process to guide our approach….(More)”.

Digital contact tracing and surveillance during COVID-19


Report on General and Child-specific Ethical Issues by Gabrielle Berman, Karen Carter, Manuel García-Herranz and Vedran Sekara: “The last few years have seen a proliferation of means and approaches being used to collect sensitive or identifiable data on children. Technologies such as facial recognition and other biometrics, increased processing capacity for ‘big data’ analysis and data linkage, and the roll-out of mobile and internet services and access have substantially changed the nature of data collection, analysis, and use.

Real-time data are essential to support decision-makers in government, development and humanitarian agencies such as UNICEF to better understand the issues facing children, plan appropriate action, monitor progress and ensure that no one is left behind. But the collation and use of personally identifiable data may also pose significant risks to children’s rights.

UNICEF has undertaken substantial work to provide a foundation to understand and balance the potential benefits and risks to children of data collection. This work includes the Industry Toolkit on Children’s Online Privacy and Freedom of Expression and a partnership with GovLab on Responsible Data for Children (RD4C) – which promotes good practice principles and has developed practical tools to assist field offices, partners and governments to make responsible data management decisions.

Balancing the need to collect data to support good decision-making versus the need to protect children from harm created through the collection of the data has never been more challenging than in the context of the global COVID-19 pandemic. The response to the pandemic has seen an unprecedented rapid scaling up of technologies to support digital contact tracing and surveillance. The initial approach has included:

  • tracking using mobile phones and other digital devices (tablet computers, the Internet of Things, etc.)
  • surveillance to support movement restrictions, including through the use of location monitoring and facial recognition
  • a shift from in-person service provision and routine data collection to the use of remote or online platforms (including new processes for identity verification)
  • an increased focus on big data analysis and predictive modelling to fill data gaps…(More)”.

Using Data for COVID-19 Requires New and Innovative Governance Approaches


Stefaan G. Verhulst and Andrew Zahuranec at Data & Policy blog: “There has been a rapid increase in the number of data-driven projects and tools released to contain the spread of COVID-19. Over the last three months, governments, tech companies, civic groups, and international agencies have launched hundreds of initiatives. These efforts range from simple visualizations of public health data to complex analyses of travel patterns.

When designed responsibly, data-driven initiatives could provide the public and their leaders the ability to be more effective in addressing the virus. The Atlantic and the New York Times have both published work that relies on innovative data use. These and other examples, detailed in our #Data4COVID19 repository, can fill vital gaps in our understanding and allow us to better respond and recover to the crisis.

But data is not without risk. Collecting, processing, analyzing and using any type of data, no matter how good the intentions of its users, can lead to harmful ends. Vulnerable groups can be excluded. Analysis can be biased. Data use can reveal sensitive information about people and locations. In addressing all these hazards, organizations need to be intentional in how they work throughout the data lifecycle.

Decision Provenance: Documenting decisions and decision makers across the Data Life Cycle

Unfortunately, the individuals and teams responsible for making these design decisions at each critical point of the data lifecycle are rarely identified or recognized by all those interacting with these data systems.

This lack of visibility can undermine professional accountability and limit actors’ ability to identify the best points of intervention for mitigating data risks, or to recognize missed uses of potentially impactful data. Tracking decision provenance is essential.

As Jatinder Singh, Jennifer Cobbe, and Chris Norval of the University of Cambridge explain, decision provenance refers to tracking and recording decisions about the collection, processing, sharing, analyzing, and use of data. It involves instituting mechanisms to force individuals to explain how and why they acted. It is about using documentation to provide transparency and oversight in the decision-making process for everyone inside and outside an organization.
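A minimal sketch of what a decision-provenance record might look like follows. The schema is a hypothetical simplification for illustration, not the mechanism Singh, Cobbe and Norval actually propose:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical decision-provenance record: who decided what, at which
# stage of the data lifecycle, and why. Field names are invented.
@dataclass
class DecisionRecord:
    stage: str        # lifecycle stage: collection, processing, sharing, ...
    decision: str     # what was decided
    decided_by: str   # accountable person or team
    rationale: str    # the "how and why they acted"
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

log = []
log.append(DecisionRecord(
    stage="sharing",
    decision="share aggregated mobility counts with health department",
    decided_by="data steward, analytics team",
    rationale="aggregation to >=50 users per cell judged low re-identification risk",
))
print([r.stage for r in log])  # ['sharing']
```

An append-only log of such records is one simple way to give everyone inside and outside an organization the transparency and oversight the paragraph above describes.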

Toward that end, The GovLab at NYU Tandon developed the Decision Provenance Mapping. We designed this tool for designated data stewards tasked with coordinating the responsible use of data across organizational priorities and departments….(More)”

Removing the pump handle: Stewarding data at times of public health emergency


Reema Patel at Significance: “There is a saying, incorrectly attributed to Mark Twain, that states: “History never repeats itself, but it rhymes”. Seeking to understand the implications of the current crisis for the effective use of data, I’ve drawn on the nineteenth-century cholera outbreak in London’s Soho to identify some “rhyming patterns” that might inform our approaches to data use and governance at this time of public health crisis.

Where better to begin than with the work of Victorian pioneer John Snow? In 1854, Snow’s use of a dot map to illustrate clusters of cholera cases around public water pumps, and of statistics to establish the connection between the quality of water sources and cholera outbreaks, led to a breakthrough in public health interventions – and, famously, the removal of the handle of a water pump in Broad Street.

Data is vital

We owe a lot to Snow, especially now. His example teaches us that data has a central role to play in saving lives, and that the effective use of (and access to) data is critical for enabling timely responses to public health emergencies.

Take, for instance, transport app CityMapper’s rapid redeployment of its aggregated transport data. In the early days of the Covid-19 pandemic, this formed part of an analysis of compliance with social distancing restrictions across a range of European cities. There is also the US-based health weather map, which uses anonymised and aggregated data to visualise fever, specifically influenza-like illnesses. This data helped model early indications of where, and how quickly, Covid-19 was spreading….
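An aggregate mobility signal of this kind can be sketched as current trip volume expressed as a share of a pre-crisis baseline. The method and the numbers below are invented for illustration, not Citymapper’s actual index:

```python
# Toy "mobility index": trips observed now as a percentage of a
# pre-pandemic baseline for the same city. All figures are invented.
baseline_trips = {"Paris": 100_000, "Berlin": 80_000}
current_trips = {"Paris": 12_000, "Berlin": 28_000}

def mobility_index(city):
    # Percentage of normal movement, rounded to a whole number.
    return round(100 * current_trips[city] / baseline_trips[city])

for city in baseline_trips:
    print(city, f"{mobility_index(city)}% of normal movement")
# Paris 12% of normal movement
# Berlin 35% of normal movement
```

Because only city-level counts enter the calculation, no individual trips need to be exposed to produce the compliance picture described above.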

Ethics and human rights still matter

As the current crisis evolves, many have expressed concern that the pandemic will be used to justify the rapid roll out of surveillance technologies that do not meet ethical and human rights standards, and that this will be done in the name of the “public good”. Examples of these technologies include symptom- and contact-tracing applications. Privacy experts are also increasingly concerned that governments will be trading off more personal data than is necessary or proportionate to respond to the public health crisis.

Many ethical and human rights considerations (including those listed at the bottom of this piece) are at risk of being overlooked at this time of emergency, and governments would be wise not to press ahead regardless, ignoring legitimate concerns about rights and standards. Instead, policymakers should begin to address these concerns by asking how we can prepare (now and in future) to establish clear and trusted boundaries for the use of data (personal and non-personal) in such crises.

Democratic states in Europe and the US have not, in recent memory, prioritised infrastructures and systems for a crisis of this scale – and this has contributed to our current predicament. Contrast this with Singapore, which suffered outbreaks of SARS and H1N1, and channelled this experience into implementing pandemic preparedness measures.

We cannot undo the past, but we can begin planning and preparing constructively for the future, and that means strengthening global coordination and finding mechanisms to share learning internationally. Getting the right data infrastructure in place has a central role to play in addressing ethical and human rights concerns around the use of data….(More)”.

Data Privacy Budget and Solutions Forecast


Survey by FTI Consulting: “…reported significant increases in spend and data privacy-related programs. Though respondents are increasing their emphasis on privacy compliance, the results showed that many are also willing to take risks in the interest of tapping into the value of their data. Still others believe that “good faith” efforts will improve their position with regulators. Key findings include:

  • 97 percent of organizations will increase their spend on data privacy in the coming year, with nearly one-third indicating plans to increase budgets by between 90 percent and more than 100 percent.
  • 78 percent agreed with the statement: “The value of data is encouraging organizations to find ways to avoid complying fully with data privacy regulation.”
  • 87 percent of respondents believed that steps toward compliance will mitigate regulatory scrutiny. More than half strongly agreed with this idea.
  • 44 percent said they expect lack of awareness and training to be the key data privacy challenge of the coming year.             

In terms of solutions, respondents indicated a diverse array of techniques for the coming year, and only 6 percent said they had no plans for change. The top-rated solutions set for implementation over the next 12 months included establishing a clear, consistent set of data privacy standards, updating agreements and contracts with external parties, reviewing standard data privacy practices of supply chains and building privacy-by-design programs….(More)”.

Big data, privacy and COVID-19 – learning from humanitarian expertise in data protection


Andrej Zwitter & Oskar J. Gstrein at the Journal of International Humanitarian Action: “The use of location data to control the coronavirus pandemic can be fruitful and might improve the ability of governments and research institutions to combat the threat more quickly. It is important to note that location data is not the only useful data that can be used to curb the current crisis. Genetic data can be relevant for AI enhanced searches for vaccines and monitoring online communication on social media might be helpful to keep an eye on peace and security (Taulli n.d.). However, the use of such large amounts of data comes at a price for individual freedom and collective autonomy. The risks of the use of such data should ideally be mitigated through dedicated legal frameworks which describe the purpose and objectives of data use, its collection, analysis, storage and sharing, as well as the erasure of ‘raw’ data once insights have been extracted. In the absence of such clear and democratically legitimized norms, one can only resort to fundamental rights provisions such as Article 8 paragraph 2 of the ECHR that reminds us that any infringement of rights such as privacy need to be in accordance with law, necessary in a democratic society, pursuing a legitimate objective and proportionate in their application.

However as shown above, legal frameworks including human rights standards are currently not capable of effectively ensuring data protection, since they focus too much on the individual as the point of departure. Hence, we submit that currently applicable guidelines and standards for responsible data use in the humanitarian sector should also be fully applicable to corporate, academic and state efforts which are currently enacted to curb the COVID-19 crisis globally. Instead of ‘re-calibrating’ the expectations of individuals on their own privacy and collective autonomy, the requirements for the use of data should be broader and more comprehensive. Applicable principles and standards as developed by OCHA, the 510 project of the Dutch Red Cross, or by academic initiatives such as the Signal Code are valid minimum standards during a humanitarian crisis. Hence, they are also applicable minimum standards during the current pandemic.

Core findings that can be extracted from these guidelines and standards for the practical implementation into data driven responses to COVID-19 are:

  • data sensitivity is highly contextual; one and the same data can be sensitive in different contexts. Location data during the current pandemic might be very useful for epidemiological analysis. However, if (ab-)used to re-calibrate political power relations, data can be open for misuse. Hence, any party supplying data and data analysis needs to check whether data and insights can be misused in the context they are presented.
  • privacy and data protection are important values; they do not disappear during a crisis. Nevertheless, they have to be weighed against respective benefits and risks.
  • data-breaches are inevitable; with time (t) approaching infinity, the chance of any system being hacked or becoming insecure approaches 100%. Hence, it is not a question of whether, but when. Therefore, organisations have to prepare sound data retention and deletion policies.
  • data ethics is an obligation to provide high quality analysis; using machine learning and big data might be appealing for the moment, but the quality of source data might be low, and results might be unreliable, or even harmful. Biases in incomplete datasets, algorithms and human users are abundant and widely discussed. We must not forget that in times of crisis, the risk of bias is more pronounced, and more problematic due to the vulnerability of data subjects and groups. Therefore, working to the highest standards of data processing and analysis is an ethical obligation.
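The “breaches are inevitable” point can be made concrete with a toy model: if each year carries an independent breach probability p, the probability of at least one breach within t years is 1 − (1 − p)^t, which tends toward 1 as t grows. The 2% annual figure below is purely illustrative:

```python
# Toy model for "with time approaching infinity, the chance of any
# system being hacked approaches 100%": assuming an independent breach
# probability p each year, P(at least one breach in t years) = 1 - (1-p)**t.
def breach_probability(p_per_year, years):
    return 1 - (1 - p_per_year) ** years

for years in (1, 10, 50, 200):
    print(years, round(breach_probability(0.02, years), 3))
# 1 0.02
# 10 0.183
# 50 0.636
# 200 0.982
```

Even a modest annual risk compounds toward certainty, which is why the guidelines treat retention and deletion policies as a question of when, not whether.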

The adherence to these principles is particularly relevant in times of crisis such as now, where they mark the difference between societies that focus on control and repression on the one hand, and those who believe in freedom and autonomy on the other. Eventually, we will need to think of including data policies into legal frameworks for state of emergency regulations, and coordinate with corporate stakeholders as well as private organisations on how to best deal with such crises. Data-driven practices have to be used in a responsible manner. Furthermore, it will be important to observe whether data practices and surveillance assemblages introduced under current circumstances will be rolled back to status quo ante when returning to normalcy. If not, our rights will become hollowed out, just waiting for the next crisis to eventually become irrelevant….(More)”.