Can a set of equations keep U.S. census data private?


Jeffrey Mervis at Science: “The U.S. Census Bureau is making waves among social scientists with what it calls a ‘sea change’ in how it plans to safeguard the confidentiality of data it releases from the decennial census.

The agency announced in September 2018 that it will apply a mathematical concept called differential privacy to its release of 2020 census data after conducting experiments that suggest current approaches can’t assure confidentiality. But critics of the new policy believe the Census Bureau is moving too quickly to fix a system that isn’t broken. They also fear the changes will degrade the quality of the information used by thousands of researchers, businesses, and government agencies.

The move has implications that extend far beyond the research community. Proponents of differential privacy say a fierce, ongoing legal battle over plans to add a citizenship question to the 2020 census has only underscored the need to assure people that the government will protect their privacy....

Differential privacy, first described in 2006, isn’t a substitute for swapping and other ways to perturb the data. Rather, it allows someone—in this case, the Census Bureau—to measure the likelihood that enough information will “leak” from a public data set to open the door to reconstruction.

“Any time you release a statistic, you’re leaking something,” explains Jerry Reiter, a professor of statistics at Duke University in Durham, North Carolina, who has worked on differential privacy as a consultant with the Census Bureau. “The only way to absolutely ensure confidentiality is to release no data. So the question is, how much risk is OK? Differential privacy allows you to put a boundary” on that risk....

In the case of census data, however, the agency has already decided what information it will release, and the number of queries is unlimited. So its challenge is to calculate how much the data must be perturbed to prevent reconstruction....
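To make the mechanics concrete, here is a minimal Python sketch of the Laplace mechanism, the textbook noise-addition technique that underlies differential privacy. It is an illustration of the general idea only: the block data and parameter values are invented for this example, and the bureau's production system is far more elaborate.

    import random

    def laplace_noise(scale: float) -> float:
        """Sample Laplace(0, scale) as the difference of two exponentials."""
        return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)

    def private_count(records, predicate, epsilon: float) -> float:
        """Release a count with epsilon-differential privacy.

        A counting query has sensitivity 1 (adding or removing one person
        changes the count by at most 1), so Laplace noise with scale
        1/epsilon provides the epsilon guarantee.
        """
        true_count = sum(1 for r in records if predicate(r))
        return true_count + laplace_noise(1.0 / epsilon)

    # Hypothetical block: publish a noisy count of 55-year-olds.
    block = [{"age": 55}, {"age": 23}, {"age": 55}, {"age": 71}]
    print(private_count(block, lambda p: p["age"] == 55, epsilon=0.5))

The parameter epsilon is the “boundary” Reiter describes: smaller values mean more noise and less leakage, and the epsilons of successive releases add up, which is why an agency must budget its total privacy loss across everything it publishes.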

John Abowd, the Census Bureau’s chief scientist and a professor of labor economics at Cornell University, first learned that traditional procedures to limit disclosure were vulnerable—and that algorithms existed to quantify the risk—at a 2005 conference on privacy attended mainly by cryptographers and computer scientists. “We were speaking different languages, and there was no Rosetta Stone,” he says.

He took on the challenge of finding common ground. In 2008, building on a long relationship with the Census Bureau, he and a team at Cornell created the first application of differential privacy to a census product. It is a web-based tool, called OnTheMap, that shows where people work and live….

The bureau’s reconstruction experiment was a three-step process that required substantial computing power. First, the researchers reconstructed records for individuals—say, a 55-year-old Hispanic woman—by mining the aggregated census tables. Then, they tried to match the reconstructed individuals to even more detailed census block records (which still lacked names and addresses); they found “putative matches” about half the time.

Finally, they compared the putative matches to commercially available credit databases in hopes of attaching a name to a particular record. Even if they could, however, the team didn’t know whether they had actually found the right person.
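The reconstruction step itself can be shown at toy scale. The sketch below, using invented tables for a three-person block, enumerates every set of records consistent with a few published counts; the bureau's actual experiment tackled something like this constraint problem at vastly larger scale, with specialized solvers rather than brute force.

    from itertools import product

    AGES = [25, 55]
    ETHNICITIES = ["Hispanic", "Non-Hispanic"]
    SEXES = ["F", "M"]

    # Hypothetical aggregate tables published for a 3-person block.
    published = {
        "total": 3,
        "age_55": 2,
        "hispanic": 2,
        "female": 1,
        "hispanic_female": 1,
    }

    def tabulate(records):
        """Recompute the published tables from a candidate record set."""
        return {
            "total": len(records),
            "age_55": sum(r[0] == 55 for r in records),
            "hispanic": sum(r[1] == "Hispanic" for r in records),
            "female": sum(r[2] == "F" for r in records),
            "hispanic_female": sum(
                r[1] == "Hispanic" and r[2] == "F" for r in records
            ),
        }

    domain = list(product(AGES, ETHNICITIES, SEXES))
    # Brute-force every multiset of 3 records; keep those whose
    # tables match the published aggregates exactly.
    consistent = {
        tuple(sorted(combo))
        for combo in product(domain, repeat=published["total"])
        if tabulate(combo) == published
    }
    for records in sorted(consistent):
        print(records)
    # If few candidate record sets survive (ideally one), the
    # "anonymous" aggregates have effectively reconstructed each
    # individual on the block.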

Abowd won’t say what proportion of the putative matches appeared to be correct. (He says a forthcoming paper will contain the ratio, which he calls “the amount of uncertainty an attacker would have once they claim to have reidentified a person from the public data.”) Although one of Abowd’s recent papers notes that “the risk of re-identification is small,” he believes the experiment proved reidentification “can be done.” And that, he says, “is a strong motivation for moving to differential privacy.”…

Such arguments haven’t convinced Steven Ruggles, a historical demographer at the University of Minnesota, and other social scientists opposed to applying differential privacy to the 2020 census. They are circulating manuscripts that question the significance of the census reconstruction exercise and that call on the agency to delay and change its plan....

Ruggles, meanwhile, has spent a lot of time thinking about the kinds of problems differential privacy might create. His Minnesota institute, for instance, disseminates data from the Census Bureau and 105 other national statistical agencies to 176,000 users. And he fears differential privacy will put a serious crimp in that flow of information…

There are also questions of capacity and accessibility. The bureau’s secure federal research data centers require users to do all their work onsite, so researchers would have to travel, and the centers offer fewer than 300 workstations in total....

Abowd has said, “The deployment of differential privacy within the Census Bureau marks a sea change for the way that official statistics are produced and published.” And Ruggles agrees. But he says the agency hasn’t done enough to equip researchers with the maps and tools needed to navigate the uncharted waters….(More)”.

The Paradox of Police Data


Stacy Wood in KULA: knowledge creation, dissemination, and preservation studies: “This paper considers the history and politics of ‘police data.’ Police data, I contend, is a category of endangered data reliant on voluntary and inconsistent reporting by law enforcement agencies; it is also inconsistently described and routinely housed in systems that were not designed with long-term strategies for data preservation, curation or management in mind. Moreover, whereas US law enforcement agencies have, for over a century, produced and published a great deal of data about crime, data about the ways in which police officers spend their time and make decisions about resources—as well as information about patterns of individual officer behavior, use of force, and in-custody deaths—is difficult to find. This presents a paradoxical situation wherein vast stores of extant data are completely inaccessible to the public. This paradoxical state is not new, but rather the continuation of a long history co-constituted by technologies, epistemologies and context….(More)”.

Data Policy in the Fourth Industrial Revolution: Insights on personal data


Report by the World Economic Forum: “Development of comprehensive data policy necessarily involves trade-offs. Cross-border data flows are crucial to the digital economy. The use of data is critical to innovation and technology. However, to engender trust, we need to have appropriate levels of protection in place to ensure privacy, security and safety. Over 120 laws in effect across the globe today provide differing levels of protection for data, but few anticipated…

Data Policy in the Fourth Industrial Revolution: Insights on personal data, a paper by the World Economic Forum in collaboration with the Ministry of Cabinet Affairs and the Future, United Arab Emirates, examines the relationship between risk and benefit, recognizing the impact of culture, values and social norms. This work is a start toward developing a comprehensive data policy toolkit and knowledge repository of case studies for policy makers and data policy leaders globally….(More)”.

The UN Principles on Personal Data Protection and Privacy


United Nations System: “The Principles on Personal Data Protection and Privacy set out a basic framework for the processing of personal data by, or on behalf of, the United Nations System Organizations in carrying out their mandated activities.

The Principles aim to: (i) harmonize standards for the protection of personal data across the UN System; (ii) facilitate the accountable processing of personal data; and (iii) ensure respect for the human rights and fundamental freedoms of individuals, in particular the right to privacy. These Principles apply to personal data contained in any form and processed in any manner. Where appropriate, they may also be used as a benchmark for the processing of non-personal data in sensitive contexts that may put certain individuals or groups of individuals at risk of harm.
 
The High Level Committee on Management (HLCM) formally adopted the Principles at its 36th Meeting on 11 October 2018. The adoption followed the HLCM’s decision at its 35th Meeting in April 2018 to engage with the UN Privacy Policy Group (UN PPG) in developing a set of high-level principles on the cross-cutting issue of data privacy. Preceding the 36th HLCM meeting in October, the Principles were developed and unanimously endorsed by the organizations represented on the UN PPG….(More) (Download the Personal Data Protection and Privacy Principles)

In High-Tech Cities, No More Potholes, but What About Privacy?


Timothy Williams in The New York Times: “Hundreds of cities, large and small, have adopted or begun planning smart city projects. But the risks are daunting. Experts say cities frequently lack the expertise to understand the privacy, security and financial implications of such arrangements. Some mayors acknowledge that they have yet to master the responsibilities that go along with collecting billions of bits of data from residents….

Supporters of “smart cities” say that the potential is enormous and that some projects could go beyond creating efficiencies and actually save lives. Among the plans under development are augmented reality programs that could help firefighters find people trapped in burning buildings and the collection of sewer samples by robots to determine opioid use so that city services could be aimed at neighborhoods most in need.

The hazards are also clear.

“Cities don’t know enough about data, privacy or security,” said Lee Tien, a lawyer at the Electronic Frontier Foundation, a nonprofit organization focused on digital rights. “Local governments bear the brunt of so many duties — and in a lot of these cases, they are often too stupid or too lazy to talk to people who know.”

Cities habitually feel compelled to outdo each other, but the competition has now been intensified by lobbying from tech companies and federal inducements to modernize.

“There is incredible pressure on an unenlightened city to be a ‘smart city,’” said Ben Levine, executive director at MetroLab Network, a nonprofit organization that helps cities adapt to technology change.

That has left Washington, D.C., and dozens of other cities testing self-driving cars and Orlando trying to harness its sunshine to power electric vehicles. San Francisco has a system that tracks bicycle traffic, while Palm Beach, Fla., uses cycling data to decide where to send street sweepers. Boise, Idaho, monitors its trash dumps with drones. Arlington, Tex., is looking at creating a transit system based on data from ride-sharing apps….(More)”.

A Research Roadmap to Advance Data Collaboratives Practice as a Novel Research Direction


Iryna Susha, Theresa A. Pardo, Marijn Janssen, Natalia Adler, Stefaan G. Verhulst and Todd Harbour in the International Journal of Electronic Government Research (IJEGR): “An increasing number of initiatives have emerged around the world to help facilitate data sharing and collaborations to leverage different sources of data to address societal problems. They are called ‘data collaboratives’. Data collaboratives are seen as a novel way to match real-life problems with relevant expertise and data from across the sectors. Despite their significance and growing experimentation by practitioners, there has been limited research in this field. In this article, the authors report on the outcomes of a panel discussing critical issues facing data collaboratives and develop a research and development agenda. The panel included participants from government, academia, and practice and was held in June 2017 during the 18th International Conference on Digital Government Research at City University of New York (Staten Island, New York, USA). The article begins by discussing the concept of data collaboratives. Then the authors formulate research questions and topics for the research roadmap based on the panel discussions. The research roadmap poses questions across nine different topics: conceptualizing data collaboratives, value of data, matching data to problems, impact analysis, incentives, capabilities, governance, data management, and interoperability. Finally, the authors discuss how digital government research can contribute to answering some of the identified research questions….(More)”. See also: http://datacollaboratives.org/

Blockchain and Sustainable Growth


Cathy Mulligan in the UN Chronicle: “…What can blockchain give us, then?

Blockchain’s 1,000 Thought Experiments

Blockchain is still new and will evolve many times before it can be fully integrated into society. We have seen similar trajectories before in the technology industry; examples include the Internet of things, mobile telephony and even the Internet itself. Every one of those technologies went through various iterations before it was fully integrated and used within society. Many technical, social and political obstacles had to be slowly but surely overcome.

It is often useful, therefore, to approach emerging technologies with some depth of thought—not by expecting them to act immediately as fully functional solutions, but by treating them as a lens on the possible. Such an approach allows for a broader discussion, one in which we can challenge our preconceived notions. Blockchain has already illustrated the power of individuals connected via the Internet with sufficient computing power at their disposal. Far from merely tweeting, taking and sharing photos or videos, such people can also create an entirely new economic structure.

The power of blockchain thus lies not in the technology itself but rather in how it has reframed many discussions across various parts of our society and economy. Blockchain shows us that there are options, that we can organize society differently. It has launched 1,000 different thought experiments but the resulting solutions, which will be delivered a decade or two from now, may or may not be based on blockchain or cryptocurrencies. The discussions that started from this point, however, will have been important contributions to the progress that society makes around digital technologies and what they can mean for humankind. For these reasons, it is important that everyone, including the United Nations, engage with these technologies to understand and learn from them.

At its most basic level, blockchain speaks to a deep, human need, one of being able to trust other people, organizations and companies in a world where most of our interactions are mediated and stored digitally. It is arguable how well it captures that notion of trust, or whether any technology can ever actually replicate what a human being thinks, feels and acts like when they trust and are trusted. These concepts are deeply human, as are the power structures within which digital solutions are built. Blockchain is often discussed as removing intermediaries or creating democratic solutions to problems, but it may merely replace existing analogue power structures with digital ones, and cause decision-making within such contexts to become more brutally binary. ‘Truth’ on the blockchain does not leave room for interpretation, as today’s systems do.

Context is critical for the development of any technology, as is the political economy within which it exists. Those who have tried to use blockchain, however, have quickly realized something: it forces a new level of cooperation. It requires partnerships and deep discussions of what transparency and inclusion truly look like….

Perhaps one of the reasons that blockchain has received so much attention is because it speaks to something that many people across the world are feeling instinctively: that we can only create new solutions to some of the world’s oldest problems by working together and including everyone in the discussion. Blockchain appeals to many people as a viable solution precisely because it is about applying a counter-intuitive approach to problems; despite the often technology-deterministic manner in which it is discussed, it is important to listen to the underlying message. The call to inclusion, trust and multilateralism that blockchain attempts to address from a technical perspective is one that will continue for many decades to come and one to which we must find new ways to respond via Governments, civil society, academia, non-governmental organizations and international organizations such as the United Nations….(More)”.

Digital Investigative Journalism


Book edited by Oliver Hahn and Florian Stalph: “In the post-digital era, investigative journalism around the world faces a revolutionary shift in the way information is gathered and interpreted. Reporters in the field are confronted with data sources, new logics of information dissemination, and a flood of disinformation. Investigative journalists are working with programmers, designers and scientists to develop innovative tools and hands-on approaches that assist them in disclosing the misuse of power and uncovering injustice.

This volume provides an overview of the most sophisticated techniques of digital investigative journalism: data and computational journalism, which investigates stories hidden in numbers; immersive journalism, which digs into virtual reality; drone journalism, which conquers hitherto inaccessible territories; visual and interactive journalism, which reforms storytelling with images and audience perspectives; and digital forensics and visual analytics, which help to authenticate digital content and identify sources in order to detect manipulation. All these techniques are discussed against the backdrop of international political scenarios and globally networked societies….(More)”.

What Is (in) Good Data?


Introductory Chapter by Monique Mann, S. Kate Devitt and Angela Daly for the book on “Good Data”: “In recent years, there has been an exponential increase in the collection, aggregation and automated analysis of information by government and private actors. In response to this there has been significant critique regarding what could be termed ‘bad’ data practices in the globalised digital economy. These include the mass gathering of data about individuals, in opaque, unethical and at times illegal ways, and the increased use of that data in unaccountable and discriminatory forms of algorithmic decision-making.

This edited collection has emerged from our frustration and depression over the previous years of our academic and activist careers critiquing these dystopian ‘Bad Data’ practices. Rather, in this text on ‘Good Data’ we seek to move our work from critique to imagining and articulating a more optimistic vision of the datafied future. We see many previous considerations of Bad Data practices, including our own, as only providing critiques rather than engaging constructively with a new vision of how digital technologies and data can be used productively and justly to further social, economic, cultural and political goals. The objective of the Good Data project is to start a multi-disciplinary and multi-stakeholder conversation around promoting good and ethical data practices and initiatives, towards a fair and just digital economy and society. In doing so, we combine expertise from various disciplines and sectors, including law, criminology, justice, public health, data science, digital media, and philosophy. The contributors to this text also have expertise in areas such as renewable energy, sociology, social media, digital humanities, and political science. There are many fields of knowledge that need to come together to build the Good Data future. This project has also brought together academic, government and industry experts along with rights advocates and activists to examine and propose initiatives that seek to promote and embed social justice, due process rights, autonomy, freedom from discrimination and environmental sustainability principles….(More)”.

Government Information in Canada: Access and Stewardship


Book edited by Amanda Wakaruk and Sam-chin Li: “Government information is not something that most people think about until they need it or see it in a headline. Indeed, even then, librarians, journalists, and intellectually curious citizens will rarely recognize or identify that the statistics needed to complete a report, or the scandal-breaking evidence behind a politician’s resignation, was sourced from taxpayer-funded publications and documents. Fewer people will likely appreciate the fact that access to government information is a requirement of a democratic society.

Government Information in Canada introduces the average librarian, journalist, researcher, and intellectually curious citizen to the often complex, rarely obvious, and sometimes elusive foundational element of a liberal democracy: publicly accessible government information.

While our primary goal is to provide an overview of the state of access to Canadian government information in the late-twentieth and early twenty-first centuries, we hope that this work will also encourage its readers to become more active in the government information community by contributing to government consultations and seeking out information that is produced by their governing bodies. ….

One of our goals is to document the state of government information in Canada at a point of transition. To help orient readers to today’s sub-discipline of librarianship, we offer four points that have been observed and learned over decades of working with government information in academic environments.

  1. Access to government information is the foundation of a functioning democracy and underpins informed citizen engagement. Government information allows us to assess our governing bodies — access that is required for a democracy to function.
  2. Government information has enduring value. The work of countless academics and other experts is disseminated via government information. Government publications and documents are used by academics and social commentators in all areas of intellectual output, resulting in the production of books, reports, speeches, and so forth, which have shaped our society and understanding of the world. For example, Silent Spring, the book that launched the modern environmental movement, was full of references to government information; furthermore, legal scholars, lawyers, and judges use legislative documents to interpret and apply the law; journalists use government documents to inform the electorate about their governing bodies.
  3. Government information is precarious and requires stewardship. The strongest system of stewardship for government information is one that operates in partnership with, and at arm’s length of, author agencies. Most content is digital, but this does not mean that it is posted and openly available online. Furthermore, content made available online does not necessarily remain accessible to the public.
  4. Government publications and documents are different from most books, journals, and content born on the Internet. Government information does not fit into the traditional dissemination channels developed and simplified through customer feedback and the pursuit of higher profits. The agencies that produce government information are motivated by different factors than those of traditional publishers…(More)”.