Privacy by design in big data


An overview of privacy enhancing technologies in the era of big data analytics by the European Union Agency for Network and Information Security (ENISA): “The extensive collection and further processing of personal information in the context of big data analytics has given rise to serious privacy concerns, especially relating to wide scale electronic surveillance, profiling, and disclosure of private data. In order to allow for all the benefits of analytics without invading individuals’ private sphere, it is of utmost importance to draw the limits of big data processing and integrate the appropriate data protection safeguards in the core of the analytics value chain. ENISA, with the current report, aims at supporting this approach, taking the position that, with respect to the underlying legal obligations, the challenges of technology (for big data) should be addressed by the opportunities of technology (for privacy). To this end, in the present study we first explain the need to shift the discussion from “big data versus privacy” to “big data with privacy”, adopting the privacy and data protection principles as an essential value of big data, not only for the benefit of the individuals, but also for the very prosperity of big data analytics. In this respect, the concept of privacy by design is key in identifying the privacy requirements early at the big data analytics value chain and in subsequently implementing the necessary technical and organizational measures. Therefore, after an analysis of the proposed privacy by design strategies in the different phases of the big data value chain, we provide an overview of specific identified privacy enhancing technologies that we find of special interest for the current and future big data landscape.
In particular, we discuss anonymization, the “traditional” analytics technique, the emerging area of encrypted search and privacy preserving computations, granular access control mechanisms, policy enforcement and accountability, as well as data provenance issues. Moreover, new transparency and access tools in big data are explored, together with techniques for user empowerment and control. Following the aforementioned work, one immediate conclusion that can be derived is that achieving “big data with privacy” is not an easy task and a lot of research and implementation is still needed. Yet, we find that this task can be possible, as long as all the involved stakeholders take the necessary steps to integrate privacy and data protection safeguards in the heart of big data, by design and by default. To this end, ENISA makes the following recommendations:

  • Privacy by design applied …
  • Decentralised versus centralised data analytics …
  • Support and automation of policy enforcement
  • Transparency and control….
  • User awareness and promotion of PETs …
  • A coherent approach towards privacy and big data ….(More)”

Big and Open Linked Data (BOLD) in government: A challenge to transparency and privacy?


Marijn Janssen and Jeroen van den Hoven in Government Information Quarterly: “Big and Open Linked Data (BOLD) results in new opportunities and have the potential to transform government and its interactions with the public. BOLD provides the opportunity to analyze the behavior of individuals, increase control, and reduce privacy. At the same time BOLD can be used to create an open and transparent government. Transparency and privacy are considered as important societal and democratic values that are needed to inform citizens and let them participate in democratic processes. Practices in these areas are changing with the rise of BOLD. Although intuitively appealing, the concepts of transparency and privacy have many interpretations and are difficult to conceptualize, which makes it often hard to implement them. Transparency and privacy should be conceptualized as complex, non-dichotomous constructs interrelated with other factors. Only by conceptualizing these values in this way, the nature and impact of BOLD on privacy and transparency can be understood, and their levels can be balanced with security, safety, openness and other socially-desirable values….(More)”

 

Privacy in Public Spaces: What Expectations of Privacy Do We Have in Social Media Intelligence?


Paper by Edwards, Lilian and Urquhart, Lachlan: “In this paper we give a basic introduction to the transition in contemporary surveillance from top down traditional police surveillance to profiling and “pre-crime” methods. We then review in more detail the rise of open source (OSINT) and social media (SOCMINT) intelligence and its use by law enforcement and security authorities. Following this we consider what, if any, privacy protection is currently given in UK law to SOCMINT. Given the largely negative response to the above question, we analyse what reasonable expectations of privacy there may be for users of public social media, with reference to existing case law on art 8 of the ECHR. Two factors are in particular argued to be supportive of a reasonable expectation of privacy in open public social media communications: first, the failure of many social network users to perceive the environment where they communicate as “public”; and secondly, the impact of search engines (and other automated analytics) on traditional conceptions of structured dossiers as most problematic for state surveillance. Lastly, we conclude that existing law does not provide adequate protection for open SOCMINT and that this will be increasingly significant as more and more personal data is disclosed and collected in public without well-defined expectations of privacy….(More)”

Big Data for Development: A Review of Promises and Challenges


Martin Hilbert in the Development Policy Review: “The article uses a conceptual framework to review empirical evidence and some 180 articles related to the opportunities and threats of Big Data Analytics for international development. The advent of Big Data delivers a cost-effective prospect for improved decision-making in critical development areas such as healthcare, economic productivity and security. At the same time, the well-known caveats of the Big Data debate, such as privacy concerns and human resource scarcity, are aggravated in developing countries by long-standing structural shortages in the areas of infrastructure, economic resources and institutions. The result is a new kind of digital divide: a divide in the use of data-based knowledge to inform intelligent decision-making. The article systematically reviews several available policy options in terms of fostering opportunities and minimising risks….(More)”

The Moral Failure of Computer Scientists


Kaveh Waddell at the Atlantic: “Computer scientists and cryptographers occupy some of the ivory tower’s highest floors. Among academics, their work is prestigious and celebrated. To the average observer, much of it is too technical to comprehend. The field’s problems can sometimes seem remote from reality.

But computer science has quite a bit to do with reality. Its practitioners devise the surveillance systems that watch over nearly every space, public or otherwise—and they design the tools that allow for privacy in the digital realm. Computer science is political, by its very nature.

That’s at least according to Phillip Rogaway, a professor of computer science at the University of California, Davis, who has helped create some of the most important tools that secure the Internet today. Last week, Rogaway took his case directly to a roomful of cryptographers at a conference in Auckland, New Zealand. He accused them of a moral failure: By allowing the government to construct a massive surveillance apparatus, the field had abused the public trust. Rogaway said the scientists had a duty to pursue social good in their work.
He likened the danger posed by modern governments’ growing surveillance capabilities to the threat of nuclear warfare in the 1950s, and called upon scientists to step up and speak out today, as they did then.

I spoke to Rogaway about why cryptographers fail to see their work in moral terms, and the emerging link between encryption and terrorism in the national conversation. A transcript of our conversation appears below, lightly edited for concision and clarity….(More)”

Opening up government data for public benefit


Keiran Hardy at the Mandarin (Australia): “…This post explains the open data movement and considers the benefits and risks of releasing government data as open data. It then outlines the steps taken by the Labor and Liberal governments in accordance with this trend. It argues that the Prime Minister’s task, while admirably intentioned, is likely to prove difficult due to ongoing challenges surrounding the requirements of privacy law and a public service culture that remains reluctant to release government data into the public domain….

A key purpose of releasing government data is to improve the effectiveness and efficiency of services delivered by the government. For example, data on crops, weather and geography might be analysed to improve current approaches to farming and industry, or data on hospital admissions might be analysed alongside demographic and census data to improve the efficiency of health services in areas of need. It has been estimated that such innovation based on open data could benefit the Australian economy by up to $16 billion per year.

Another core benefit is that the open data movement is making gains in transparency and accountability, as a greater proportion of government decisions and operations are being shared with the public. These democratic values are made clear in the OGP’s Open Government Declaration, which aims to make governments ‘more open, accountable, and responsive to citizens’.

Open data can also improve democratic participation by allowing citizens to contribute to policy innovation. Events like GovHack, an annual Australian competition in which government, industry and the general public collaborate to find new uses for open government data, epitomise a growing trend towards service delivery informed by user input. The winner of the “Best Policy Insights Hack” at GovHack 2015 developed a software program for analysing which suburbs are best placed for rooftop solar investment.

At the same time, the release of government data poses significant risks to the privacy of Australian citizens. Much of the open data currently available is spatial (geographic or satellite) data, which is relatively unproblematic to post online as it poses minimal privacy risks. However, for the full benefits of open data to be gained, these kinds of data need to be supplemented with information on welfare payments, hospital admission rates and other potentially sensitive areas which could drive policy innovation.

Policy data in these areas would be de-identified — that is, all names, addresses and other obvious identifying information would be removed so that only aggregate or statistical data remains. However, debates continue as to the reliability of de-identification techniques, as there have been prominent examples of individuals being re-identified by cross-referencing datasets….
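The re-identification risk mentioned above can be made concrete with a minimal sketch of a linkage attack, in which a "de-identified" dataset is cross-referenced against a public one sharing the same quasi-identifiers. All records and field names below are invented for illustration; real attacks of this kind have used voter rolls, hospital discharge data and similar sources.

```python
# Hypothetical linkage attack: names were stripped from the health data,
# but quasi-identifiers (postcode, birth year, gender) remain and can be
# matched against a public roll that still carries names.

deidentified_health = [
    {"postcode": "2600", "birth_year": 1971, "gender": "F", "diagnosis": "asthma"},
    {"postcode": "3000", "birth_year": 1985, "gender": "M", "diagnosis": "diabetes"},
]

public_roll = [
    {"name": "A. Citizen", "postcode": "2600", "birth_year": 1971, "gender": "F"},
    {"name": "B. Resident", "postcode": "3000", "birth_year": 1990, "gender": "M"},
]

QUASI_IDENTIFIERS = ("postcode", "birth_year", "gender")

def reidentify(health_rows, roll_rows):
    """Return (name, diagnosis) pairs where the quasi-identifiers match uniquely."""
    matches = []
    for h in health_rows:
        key = tuple(h[q] for q in QUASI_IDENTIFIERS)
        candidates = [r for r in roll_rows
                      if tuple(r[q] for q in QUASI_IDENTIFIERS) == key]
        if len(candidates) == 1:  # a unique match re-identifies the record
            matches.append((candidates[0]["name"], h["diagnosis"]))
    return matches

print(reidentify(deidentified_health, public_roll))
# → [('A. Citizen', 'asthma')]
```

The first health record links uniquely to one person on the roll, so removing names alone was not enough; this is why the reliability of de-identification remains contested.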

With regard to open data, a culture resistant to releasing government information appears to be driven by several similar factors, including:

  • A generational preference amongst public service management for maintaining secrecy of information, whereas younger generations expect that data should be made freely available;
  • Concerns about the quality or accuracy of information being released;
  • Fear that mistakes or misconduct on behalf of government employees might be exposed;
  • Limited understanding of the benefits that can be gained from open data; and
  • A lack of leadership to help drive the open data movement.

If open data policies have a similar effect on public service culture as FOI legislation, it may be that open data policies in fact hinder transparency by having a chilling effect on government decision-making for fear of what might be exposed….

These legal and cultural hurdles will pose ongoing challenges for the Turnbull government in seeking to release greater amounts of government data as open data….(More)

What Privacy Papers Should Policymakers be Reading in 2016?


Stacy Gray at the Future of Privacy Forum: “Each year, FPF invites privacy scholars and authors to submit articles and papers to be considered by members of our Advisory Board, with an aim toward showcasing those articles that should inform any conversation about privacy among policymakers in Congress, as well as at the Federal Trade Commission and in other government agencies. For our sixth annual Privacy Papers for Policymakers, we received submissions on topics ranging from mobile app privacy, to location tracking, to drone policy.

Our Advisory Board selected papers that describe the challenges and best practices of designing privacy notices, ways to minimize the risks of re-identification of data by focusing on process-based data release policy and taking a precautionary approach to data release, the relationship between privacy and markets, and bringing the concept of trust more strongly into privacy principles.

Our top privacy papers for 2015 are, in alphabetical order:
Florian Schaub, Rebecca Balebako, Adam L. Durity, and Lorrie Faith Cranor
Ira S. Rubinstein and Woodrow Hartzog
Arvind Narayanan, Joanna Huey, and Edward W. Felten
Ryan Calo
Neil Richards and Woodrow Hartzog
Our two papers selected for Notable Mention are:
Peter Swire (Testimony, Senate Judiciary Committee Hearing, July 8, 2015)
Joel R. Reidenberg
….(More)”

Open Data, Privacy, and Fair Information Principles: Towards a Balancing Framework


Paper by Zuiderveen Borgesius, Frederik J. and van Eechoud, Mireille and Gray, Jonathan: “Open data are held to contribute to a wide variety of social and political goals, including strengthening transparency, public participation and democratic accountability, promoting economic growth and innovation, and enabling greater public sector efficiency and cost savings. However, releasing government data that contain personal information may threaten privacy and related rights and interests. In this paper we ask how these privacy interests can be respected, without unduly hampering benefits from disclosing public sector information. We propose a balancing framework to help public authorities address this question in different contexts. The framework takes into account different levels of privacy risks for different types of data. It also separates decisions about access and re-use, and highlights a range of different disclosure routes. A circumstance catalogue lists factors that might be considered when assessing whether, under which conditions, and how a dataset can be released. While open data remains an important route for the publication of government information, we conclude that it is not the only route, and there must be clear and robust public interest arguments in order to justify the disclosure of personal information as open data….(More)”

Decoding the Future for National Security


George I. Seffers at Signal: “U.S. intelligence agencies are in the business of predicting the future, but no one has systematically evaluated the accuracy of those predictions—until now. The intelligence community’s cutting-edge research and development agency uses a handful of predictive analytics programs to measure and improve the ability to forecast major events, including political upheavals, disease outbreaks, insider threats and cyber attacks.

The Office for Anticipating Surprise at the Intelligence Advanced Research Projects Activity (IARPA) is a place where crystal balls come in the form of software, tournaments and throngs of people. The office sponsors eight programs designed to improve predictive analytics, which uses a variety of data to forecast events. The programs all focus on incidents outside of the United States, and the information is anonymized to protect privacy. The programs are in different stages, some having recently ended as others are preparing to award contracts.

But they all have one more thing in common: They use tournaments to advance the state of the predictive analytic arts. “We decided to run a series of forecasting tournaments in which people from around the world generate forecasts about, now, thousands of real-world events,” says Jason Matheny, IARPA’s new director. “All of our programs on predictive analytics do use this tournament style of funding and evaluating research.” The Open Source Indicators program used a crowdsourcing technique in which people across the globe offered their predictions on such events as political uprisings, disease outbreaks and elections.

The data analyzed included social media trends, Web search queries and even cancelled dinner reservations—an indication that people are sick. “The methods applied to this were all automated. They used machine learning to comb through billions of pieces of data to look for that signal, that leading indicator, that an event was about to happen,” Matheny explains. “And they made amazing progress. They were able to predict disease outbreaks weeks earlier than traditional reporting.” The recently completed Aggregative Contingent Estimation (ACE) program also used a crowdsourcing competition in which people predicted events, including whether weapons would be tested, treaties would be signed or armed conflict would break out along certain borders. Volunteers were asked to provide information about their own background and what sources they used. IARPA also tested participants’ cognitive reasoning abilities. Volunteers provided their forecasts every day, and IARPA personnel kept score. Interestingly, they discovered the “deep domain” experts were not the best at predicting events. Instead, people with a certain style of thinking came out the winners. “They read a lot, not just from one source, but from multiple sources that come from different viewpoints. They have different sources of data, and they revise their judgments when presented with new information. They don’t stick to their guns,” Matheny reveals. …
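The article says IARPA "kept score" of daily forecasts without naming the metric; a common proper scoring rule for exactly this kind of probability-forecast tournament is the Brier score, sketched below under that assumption. The two forecasters and their numbers are invented for illustration.

```python
# Sketch of tournament scoring with the Brier score (a standard proper
# scoring rule; the article does not specify IARPA's exact metric).
# Lower is better: 0.0 is a perfect forecast, 1.0 the worst possible.

def brier_score(forecast_prob, outcome):
    """Squared error between the forecast probability and the 0/1 outcome."""
    return (forecast_prob - outcome) ** 2

def mean_brier(forecasts):
    """Average Brier score over (probability, outcome) pairs for one forecaster."""
    return sum(brier_score(p, o) for p, o in forecasts) / len(forecasts)

# Two hypothetical forecasters on the same three events (outcome 1 = happened).
hedger  = [(0.6, 1), (0.5, 0), (0.55, 1)]  # vague, near-50% forecasts
updater = [(0.9, 1), (0.2, 0), (0.8, 1)]   # confident, revises with new evidence

print(mean_brier(hedger))   # ≈ 0.204
print(mean_brier(updater))  # ≈ 0.030
```

A scoring rule like this rewards exactly the behavior the ACE results describe: forecasters who commit to confident probabilities and revise them when new information arrives outscore those who "stick to their guns" near 50 percent.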

The ACE research also contributed to a recently released book, Superforecasting: The Art and Science of Prediction, according to the IARPA director. The book was co-authored, along with Dan Gardner, by Philip Tetlock, the Annenberg University professor of psychology and management at the University of Pennsylvania who also served as a principal investigator for the ACE program. Like ACE, the Crowdsourcing Evidence, Argumentation, Thinking and Evaluation program uses the forecasting tournament format, but it also requires participants to explain and defend their reasoning. The initiative aims to improve analytic thinking by combining structured reasoning techniques with crowdsourcing.

Meanwhile, the Foresight and Understanding from Scientific Exposition (FUSE) program forecasts science and technology breakthroughs….(More)”

Sharing Information


An Ericsson Consumer Insight Summary Report: “In the age of the internet we often hear how companies, authorities and other organizations get access to our personal information. As a result, the topic of privacy is frequently debated. What is sometimes overlooked is how we as individuals watch in return. We observe not only each other, but also companies and authorities – and we share what we see. Your neighbor searches the net about the family that just moved in next door. The traveler films his hotel and shares the video with other potential holidaymakers. A friend shares her experience about her employer on a social network. As sharing online continues to grow, we are starting to see the impact on both our individual lives and society. In this report we begin to uncover how consumers perceive their influence – but also some issues that arise as a result….

By sharing more information than ever, smartphone owners are increasingly acting like citizen journalists

> Over 70 percent of all smartphone users share personal photos regularly. 69 percent share more than they did 2 years ago

> 69 percent also read or watch other people’s shared content more than they did 2 years ago

People report wrongdoings by businesses and authorities online

> 34 percent of smartphone owners who have had bad experiences with companies say they usually share their experiences online. 27 percent repost other consumers’ complaints on a weekly basis

> Over half of smartphone users surveyed believe that being able to express opinions online about companies has increased their influence

Consumers expect shared information to have an effect on society and the world

> 54 percent believe that the internet has increased the possibility for whistleblowers to expose corrupt and illicit behavior in companies and organizations

> Furthermore, 37 percent of smartphone users believe that sharing information about a corrupt company online has greater impact than going to the police

With new power comes new challenges

> 46 percent of smartphone users would like a verification service to check the authenticity of an online posting or news clip

> 64 percent would like to be able to stop negative information about themselves circulating online

> 1 in 2 says protecting personal information should be a priority on the political agenda, although only 1 in 4 says it is not….(More)”