Why Protecting Data Privacy Matters, and When


Anne Russell at Data Science Central: “It’s official. Public concerns over the privacy of data used in digital approaches have reached an apex. Worried about the safety of digital networks, consumers want to gain control over what they increasingly sense as a loss of power over how their data is used. It’s not hard to wonder why. Look at the extent of coverage on the U.S. Government data breach last month and the sheer growth in the number of attacks against government and others overall. Then there is the increasing coverage on the inherent security flaws built into the internet, through which most of our data flows. The costs of data breaches to individuals, industries, and government are adding up. And users are taking note….
If you’re not sure whether the data fueling your approach will raise privacy and security flags, consider the following. When it comes to data privacy and security, not all data is going to be of equal concern. Much depends on the level of detail in data content, data type, data structure, volume, and velocity, and indeed how the data itself will be used and released.

First there is the data where security and privacy has always mattered and for which there is already an existing and well galvanized body of law in place. Foremost among these is classified or national security data where data usage is highly regulated and enforced. Other data for which there exists a considerable body of international and national law regulating usage includes:

  • Proprietary Data – specifically the data that makes up the intellectual capital of individual businesses and gives them their competitive economic advantage over others, including data protected under copyright, patent, or trade secret laws and the sensitive, protected data that companies collect on behalf of their customers;
  • Infrastructure Data – data from the physical facilities and systems – such as roads, electrical systems, communications services, etc. – that enable local, regional, national, and international economic activity; and
  • Controlled Technical Data – technical, biological, chemical, and military-related data and research that could be considered of national interest and be under foreign export restrictions….

The second group of data that raises privacy and security concerns is personal data. Commonly referred to as Personally Identifiable Information (PII), it is any data that distinguishes individuals from each other. It is also the data that an increasing number of digital approaches rely on, and the data whose use tends to raise the most public ire. …

A third category of data needing privacy consideration is the data related to good people working in difficult or dangerous places. Activists, journalists, politicians, whistle-blowers, business owners, and others working in contentious areas and conflict zones need secure means to communicate and share data without fear of retribution and personal harm. That there are parts of the world where individuals can be in mortal danger for speaking out is one of the reasons that Tor (The Onion Router) has received substantial funding from multiple government and philanthropic groups, even at the high risk of enabling anonymized criminal behavior. Indeed, in the absence of alternate secure networks on which to pass data, many would be in grave danger, including the organizers of the Arab Spring in 2010 as well as dissidents in Syria and elsewhere….(More)”

 

Beyond Propaganda


Foreign Policy: “This essay is adapted from the first in a series of publications by the Legatum Institute’s Transitions Forum on the politics of information in the 21st century.

Pity the poor propagandist! Back in the 20th century, it was a lot easier to control an authoritarian country’s hearts and minds. All domestic media could be directed out of a government office. Foreign media could be jammed. Borders were sealed, and your population couldn’t witness the successes of a rival system. You had a clear narrative with at least a theoretically enticing vision of social justice or national superiority, one strong enough to fend off the seductions of liberal democracy and capitalism. Anyone who disagreed could be isolated, silenced, and suppressed.

Those were the halcyon days of what the Chinese call “thought work” — and Soviets called the “engineering of human souls.” And until recently, it seemed as if they were gone forever. Today’s smart phones and laptops mean any citizen can be their own little media center. Borders are more open. Western films, cars, and search engines permeate virtually everywhere. All regimes are experimenting with at least some version of capitalism, which theoretically means that everyone has more in common.

Yet the story is far from straightforward. Neo-authoritarian, “hybrid,” and illiberal democratic regimes in countries such as Venezuela, Turkey, China, Syria, and Russia have not given up on propaganda. They have found completely new ways of pursuing it, many of them employing technologies invented in the democratic world.

Why fight the information age and globalization when you can use it?

Often, the techniques are quite subtle. After analyzing the real-time censorship of 1,382 Chinese websites during the first half of 2011 — 11,382,221 posts in all — researchers from Harvard University found that the government’s propagandists did in fact tolerate criticism of politicians and policies. But they immediately censored any online attempts to organize collective protests, including some that were not necessarily critical of the regime. One heavily censored event, for example, was meant to highlight fears that nuclear spillage from Japan would reach China….(More)”

African American family records from era of slavery to be available free online


Joanna Walters in The Guardian: “Millions of African Americans will soon be able to trace their families through the era of slavery, some to the countries from which their ancestors were snatched, thanks to a new and free online service that is digitizing a huge cache of federal records for the first time.

Handwritten records collecting information on newly freed slaves that were compiled just after the civil war will be available for easy searches through a new website, it was announced on Friday.

The records belong to the Freedmen’s Bureau, an administrative body created by Congress in 1865 to assist slaves in 15 states and the District of Columbia transition into free citizenship.

Before that time, slaves were legally regarded as property in the US and their names were not officially documented. They often appeared only as dash marks – even on their owners’ records.

African Americans trying to trace family history today regularly hit the research equivalent of a brick wall prior to 1870, when black people were included in the US census for the first time.

Now a major project run by several organisations is beginning to digitise the 1.5 million handwritten records from the Freedmen’s Bureau, which feature more than four million names and are held by various federal bodies, for full online access.

All the records are expected to be online by late 2016, to coincide with the opening of the new Smithsonian National Museum of African American History and Culture on the National Mall in Washington.

Hollis Gentry, a genealogy specialist at the Smithsonian, said at the announcement of the project in Los Angeles on Friday: “The records serve as a bridge to slavery and freedom. You can look at some of the original documents that were created at the time when these people were living. They are the earliest records detailing people who were formerly enslaved. We get a sense of their voice, their dreams.”…

The Freedmen’s Bureau made records that include marriages and church and financial details as well as full names, dates of birth and histories of slave ownership.

They have been available for access by the public in Washington, but only in person by searching through hundreds of pages of handwritten documents.

The project to put the documents online is a collaboration involving the Smithsonian, the National Archives, the Afro-American Historical and Genealogical Society, the California African American Museum and FamilySearch. The last-named body is a large online genealogy organisation run by the Church of Jesus Christ of Latter-Day Saints – otherwise known as the Mormon church, based in Salt Lake City.

Volunteers will help to digitise the handwritten records and they will be added to the website as they become available. The website is discoverfreedmen.org….”

 

Improving Crowdsourcing and Citizen Science as a Policy Mechanism for NASA


Paper by Balcom Brittany: “This article examines citizen science projects, defined as “a form of open collaboration where members of the public participate in the scientific process, including identifying research questions, collecting and analyzing the data, interpreting the results, and problem solving,” as an effective and innovative tool for National Aeronautics and Space Administration (NASA) science in line with the Obama Administration’s Open Government Directive. Citizen science projects allow volunteers with no technical training to participate in analysis of large sets of data that would otherwise constitute prohibitively tedious and lengthy work for research scientists. Zooniverse.com hosts a multitude of popular space-focused citizen science projects, many of which have been extraordinarily successful and have enabled new research publications and major discoveries. This article takes a multifaceted look at such projects by examining the benefits of citizen science, effective game design, and current desktop computer and mobile device usage trends. It offers suggestions of potential research topics to be studied with emerging technologies, policy considerations, and opportunities for outreach. This analysis includes an overview of other crowdsourced research methods such as distributed computing and contests. New research and data analysis of mobile phone usage, scientific curiosity, and political engagement among Zooniverse.com project participants has been conducted for this study…(More)”

Secrecy and Publicity in Votes and Debates


Book edited by Jon Elster: “In the spirit of Jeremy Bentham’s Political Tactics, this volume offers the first comprehensive discussion of the effects of secrecy and publicity on debates and votes in committees and assemblies. The contributors – sociologists, political scientists, historians, and legal scholars – consider the micro-technology of voting (the devil is in the detail), the historical relations between the secret ballot and universal suffrage, the use and abolition of secret voting in parliamentary decisions, and the sometimes perverse effects of the drive for greater openness and transparency in public affairs. The authors also discuss the normative questions of secret versus public voting in national elections and of optimal mixes of secrecy and publicity, as well as the opportunities for strategic behavior created by different voting systems. Together with two previous volumes on Collective Wisdom (Cambridge, 2012) and Majority Decisions (Cambridge, 2014), the book sets a new standard for interdisciplinary work on collective decision-making….(More)”

When Guarding Student Data Endangers Valuable Research


Susan M. Dynarski in the New York Times: “There is widespread concern over threats to privacy posed by the extensive personal data collected by private companies and public agencies.

Some of the potential danger comes from the government: The National Security Agency has swept up the telephone records of millions of people, in what it describes as a search for terrorists. Other threats are posed by hackers, who have exploited security gaps to steal data from retail giants like Target and from the federal Office of Personnel Management.

Resistance to data collection was inevitable — and it has been particularly intense in education.

Privacy laws have already been strengthened in some states, and multiple bills now pending in state legislatures and in Congress would tighten the security and privacy of student data. Some of this proposed legislation is so broadly written, however, that it could unintentionally choke off the use of student data for its original purpose: assessing and improving education. This data has already exposed inequities, allowing researchers and advocates to pinpoint where poor, nonwhite and non-English-speaking children have been educated inadequately by their schools.

Data gathering in education is indeed extensive: Across the United States, large, comprehensive administrative data sets now track the academic progress of tens of millions of students. Educators parse this data to understand what is working in their schools. Advocates plumb the data to expose unfair disparities in test scores and graduation rates, building cases to target more resources for the poor. Researchers rely on this data when measuring the effectiveness of education interventions.

To my knowledge there has been no large-scale, Target-like theft of private student records — probably because students’ test scores don’t have the market value of consumers’ credit card numbers. Parents’ concerns have mainly centered not on theft, but on the sharing of student data with third parties, including education technology companies. Last year, parents resisted efforts by the tech start-up InBloom to draw data on millions of students into the cloud and return it to schools as teacher-friendly “data dashboards.” Parents were deeply uncomfortable with a third party receiving and analyzing data about their children.

In response to such concerns, some pending legislation would scale back the authority of schools, districts and states to share student data with third parties, including researchers. Perhaps the most stringent of these proposals, sponsored by Senator David Vitter, a Louisiana Republican, would effectively end the analysis of student data by outside social scientists. This legislation would have banned recent prominent research documenting the benefits of smaller classes, the value of excellent teachers and the varied performance of charter schools.

Under current law, education agencies can share data with outside researchers only to benefit students and improve education. Collaborations with researchers allow districts and states to tap specialized expertise that they otherwise couldn’t afford. The Boston public school district, for example, has teamed up with early-childhood experts at Harvard to plan and evaluate its universal prekindergarten program.

In one of the longest-standing research partnerships, the University of Chicago works with the Chicago Public Schools to improve education. Partnerships like Chicago’s exist across the nation, funded by foundations and the United States Department of Education. In one initiative, a Chicago research consortium compiled reports showing high school principals that many of the seniors they had sent off to college swiftly dropped out without earning a degree. This information spurred efforts to improve high school counseling and college placement.

Specific, tailored information in the hands of teachers, principals or superintendents empowers them to do better by their students. No national survey could have told Chicago’s principals how their students were doing in college. Administrative data can provide this information, cheaply and accurately…(More)”

Introducing the Governance Data Alliance


“The overall assumption of the Governance Data Alliance is that governance data can contribute to improved sustainable economic and human development outcomes and democratic accountability in all countries. The contribution that governance data will make to those outcomes will of course depend on a whole range of issues that vary across contexts: development processes, policy processes, and the role that data plays all differ considerably. Nevertheless, there are some core requirements that need to be met if data is to make a difference, and articulating them can provide a framework to help us understand and improve the impact that data has on development and accountability across different contexts.

We also collectively make another implicit (and important) assumption: that the current state of affairs is vastly insufficient when it comes to the production and usage of high-quality governance data. In other words, the status quo needs to be significantly improved upon. Data gathered from participants in the April 2014 design session helps to paint that picture in granular terms. Data production remains highly irregular and ad hoc; data usage does not match data production in many cases (e.g. users want data that don’t exist and do not use data that is currently produced); production costs remain high and inconsistent across producers despite possibilities for economies of scale; and feedback loops between governance data producers and governance data users are either non-existent or rarely employed. We direct readers to http://dataalliance.globalintegrity.org for a fuller treatment of those findings.

Three requirements need to be met if governance data is to lead to better development and accountability outcomes, whether those outcomes are about core “governance” issues such as levels of inclusion, or about service delivery and human development outcomes that may be shaped by the quality of governance. Those requirements are:

  • The availability of governance data.
  • The quality of governance data, including its usability and salience.
  • The informed use of governance data.

(Or to use the metaphor of markets, we face a series of market failures: supply of data is inconsistent and not uniform; user demand cannot be efficiently channeled to suppliers to redirect their production to address those deficiencies; and transaction costs abound through non-existent data standards and lack of predictability.)

If data are not available about those aspects of governance that are expected to have an impact on development outcomes and democratic accountability, no progress will be made. The risk is that data about key issues will be lacking, or that there will be gaps in coverage, whether country coverage, time periods covered, or sectors, or that data sets produced by different actors may not be comparable. This might come about for reasons including the following: a lack of knowledge – amongst producers, and amongst producers and users – about what data is needed and what data is available; high costs, and limited resources to invest in generating data; and, institutional incentives and structures (e.g. lack of autonomy, inappropriate mandate, political suppression of sensitive data, organizational dysfunction – relating, for instance, to National Statistical Offices) that limit the production of governance data….

What A Governance Data Alliance Should Do (Or, Making the Market Work)

During the several months of creative exploration around possibilities for a Governance Data Alliance, dozens of activities were identified as possible solutions (in whole or in part) to the challenges identified above. This note identifies what we believe to be the most important and immediate activities that an Alliance should undertake, knowing that other activities can and should be rolled into an Alliance work plan in the out years as the initiative matures and early successes (and failures) are achieved and digested.

A brief summary of the proposals that follow:

  1. Design and implement a peer-to-peer training program between governance data producers to improve the quality and salience of existing data.
  2. Develop a lightweight data standard to be adopted by producer organizations to make it easier for users to consume governance data.
  3. Mine the 2014 Reform Efforts Survey to understand who currently uses which governance data around the world.
  4. Leverage the 2014 Reform Efforts Survey “plumbing” to field customized follow-up surveys to better assess what data users seek in future governance data.
  5. Pilot (on a regional basis) coordinated data production amongst producer organizations to fill coverage gaps, reduce redundancies, and respond to actual usage and user preferences….(More)”
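Proposal 2 above — a lightweight data standard adopted by producer organizations — could in practice be little more than a shared record envelope plus a validator. The sketch below illustrates the idea; the field names and rules are hypothetical assumptions for illustration, not the Alliance's actual standard.

```python
# Hypothetical sketch of a lightweight governance-data record standard.
# All field names and validation rules here are illustrative assumptions,
# not the Governance Data Alliance's actual specification.

REQUIRED_FIELDS = {"producer", "indicator", "country", "year", "value", "methodology_url"}

def validate_record(record: dict) -> list:
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    if "year" in record and not isinstance(record["year"], int):
        problems.append("year must be an integer")
    if "country" in record and len(str(record["country"])) != 3:
        problems.append("country should be an ISO 3166-1 alpha-3 code")
    return problems

record = {
    "producer": "Global Integrity",
    "indicator": "budget_transparency",
    "country": "KEN",
    "year": 2014,
    "value": 72,
    "methodology_url": "https://example.org/methodology",
}
print(validate_record(record))  # → []
```

Even a minimal shared envelope like this would lower the transaction costs the note describes: users could consume any producer's output with one parser, and producers could check conformance before publishing.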

Exploring Open Energy Data in Urban Areas


The Worldbank: “…Energy efficiency – using less energy input to deliver the same level of service – has been described by many as the ‘first fuel’ of our societies. However, lack of adequate data to accurately predict and measure energy efficiency savings, particularly at the city level, has limited the realization of its promise over the past two decades.
Why Open Energy Data?
Open Data can be a powerful tool to reduce information asymmetry in markets, increase transparency and help achieve local economic development goals. Several sectors like transport, public sector management and agriculture have started to benefit from Open Data practices. Energy markets are often characterized by less-than-optimal conditions with high system inefficiencies, misaligned incentives and low levels of transparency. As such, the sector has a lot to potentially gain from embracing Open Data principles.
The United States is a leader in this field with its ‘Energy Data’ initiative. This initiative makes data easy to find, understand and apply, helping to fuel a clean energy economy. For example, the Energy Information Administration’s (EIA) open application programming interface (API) has more than 1.2 million time series of data and is frequently visited by users from the private sector, civil society and media. In addition, the Green Button  initiative is empowering American citizens to have access to their own energy usage data, and OpenEI.org is an Open Energy Information platform to help people find energy information, share their knowledge and connect to other energy stakeholders.
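To make the EIA example concrete, the sketch below shows how a developer might build a query against the EIA's open API and read one of its time series. The endpoint path, parameter names, and response shape are assumptions modeled on the EIA's public v1 "series" documentation; verify them against the current API before relying on this.

```python
import json
from urllib.parse import urlencode

# Sketch of querying the EIA open API ("series" endpoint). The base URL,
# parameter names, and response shape below are assumptions based on the
# public documentation — check the current EIA API docs before use.
BASE = "https://api.eia.gov/series/"

def build_series_url(api_key: str, series_id: str) -> str:
    return BASE + "?" + urlencode({"api_key": api_key, "series_id": series_id})

# Offline example: parse a response in the documented shape
# (values here are illustrative, not real EIA figures).
sample_response = json.loads("""
{"series": [{"series_id": "ELEC.GEN.ALL-US-99.A",
             "name": "Net generation : all fuels : United States : annual",
             "units": "thousand megawatthours",
             "data": [["2014", 4093606], ["2013", 4065964]]}]}
""")

series = sample_response["series"][0]
latest_period, latest_value = series["data"][0]
print(series["units"], latest_period, latest_value)
```

The point of the exercise is how little code a well-documented open API demands of its users: one URL, one key, one predictable JSON shape — which is precisely the gap the Open Energy Data Assessments aim to close in emerging markets.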
Introducing the Open Energy Data Assessment
To address this data gap in emerging and developing countries, the World Bank is conducting a series of Open Energy Data Assessments in urban areas. The objective is to identify important energy-related data, raise awareness of the benefits of Open Data principles and improve the flow of data between traditional energy stakeholders and others interested in the sector.
The first cities we assessed were Accra, Ghana and Nairobi, Kenya. Both are among the fastest-growing cities in the world, with dynamic entrepreneurial and technology sectors, and both are capitals of countries with an ongoing National Open Data Initiative. The two cities have also been selected to be part of the Negawatt Challenge, a World Bank international competition supporting technology innovation to solve local energy challenges.
The ecosystem approach
The starting point for the exercise was to consider the urban energy sector as an ecosystem, comprised of data suppliers, data users, key datasets, a legal framework, funding mechanisms, and ICT infrastructure. The methodology that we used adapted the established World Bank Open Data Readiness Assessment (ODRA), which highlights valuable connections between data suppliers and data demand.  The assessment showcases how to match pressing urban challenges with the opportunity to release and use data to address them, creating a longer-term commitment to the process. Mobilizing key stakeholders to provide quick, tangible results is also key to this approach….(More) …See also World Bank Open Government Data Toolkit.”

How Crowdsourcing Can Help Us Fight ISIS


At the Huffington Post: “There’s no question that ISIS is gaining ground. …So how else can we fight ISIS? By crowdsourcing data – i.e. asking a relevant group of people for their input via text or the Internet on specific ISIS-related issues. In fact, ISIS has been using crowdsourcing to enhance its operations since last year in two significant ways. Why shouldn’t we?

First, ISIS is using its crowd of supporters in Syria, Iraq and elsewhere to help strategize new policies. Last December, the extremist group leveraged its global crowd via social media to brainstorm ideas on how to kill 26-year-old Jordanian coalition fighter pilot Moaz al-Kasasba. ISIS supporters used the hashtag “Suggest a Way to Kill the Jordanian Pilot Pig” and “We All Want to Slaughter Moaz” to make their disturbing suggestions, which included decapitation, running al-Kasasba over with a bulldozer and burning him alive (which was the winner). Yes, this sounds absurd and was partly a publicity stunt to boost ISIS’ image. But the underlying strategy to crowdsource new strategies makes complete sense for ISIS as it continues to evolve – which is what the US government should consider as well.

In fact, in February, the US government tried to crowdsource more counterterrorism strategies. Via its official blog, DipNote, the State Department asked the crowd – in this case, US citizens – for their suggestions for solutions to fight violent extremism. This inclusive approach to policymaking was obviously important for strengthening democracy, with more than 180 entries posted over two months from citizens across the US. But did this crowdsourcing exercise actually improve US strategy against ISIS? Not really. What might help is if the US government asked a crowd of experts across varied disciplines and industries about counterterrorism strategies specifically against ISIS, also giving these experts the opportunity to critique each other’s suggestions to reach one optimal strategy. This additional, collaborative, competitive and interdisciplinary expert insight can only help President Obama and his national security team to enhance their anti-ISIS strategy.

Second, ISIS has been using its crowd of supporters to collect intelligence information to better execute its strategies. Since last August, the extremist group has crowdsourced data via a Twitter campaign specifically on Saudi Arabia’s intelligence officials, including names and other personal details. This apparently helped ISIS in its two suicide bombing attacks during prayers at a Shiite mosque last month; it also presumably helped ISIS infiltrate a Saudi Arabian border town via Iraq in January. This additional, collaborative approach to intelligence collection can only help President Obama and his national security team to enhance their anti-ISIS strategy.

In fact, last year, the FBI used crowdsourcing to spot individuals who might be travelling abroad to join terrorist groups. But what if we asked the crowd of US citizens and residents to give us information specifically on where they’ve seen individuals get lured by ISIS in the country, as well as on specific recruitment strategies they may have noted? This might also lead to more real-time data points on ISIS defectors returning to the US – who are they, why did they defect and what can they tell us about their experience in Syria or Iraq? Overall, crowdsourcing such data (if verifiable) would quickly create a clearer picture of trends in recruitment and defectors across the country, which can only help the US enhance its anti-ISIS strategies.

This collaborative approach to data collection could also be used in Syria and Iraq with texts and online contributions from locals helping us to map ISIS’ movements….(More)”

India wants all government organizations to develop open APIs


Medianama: “The department of electronics and information technology (DeitY) is looking to frame a policy (pdf) for adopting and developing open application programming interfaces (APIs) in government organizations to promote software interoperability for all e-governance applications & systems. The policy shall be applicable to all central government organizations and to those state governments that choose to adopt the policy.

DeitY also said that all information and data of a government organisation shall be made available through open APIs, as per the National Data Sharing and Accessibility Policy, and shall adhere to the National Cyber Security Policy.

Policy points

– Each published API of a Government organization shall be provided free of charge, whenever possible, to other government organizations and the public.

– Each published API shall be properly documented with sample code and sufficient information for developers to make use of the API.

– The life-cycle of the open API shall be made available by the API publishing Government organisation. The API shall be backward compatible with at least two earlier versions.

– Government organizations may use an authentication mechanism to enable service interoperability and single sign-on.

– All Open API systems built and data provided shall adhere to GoI security policies and guidelines.

…. This would allow anyone to build a website or an application and pull government information into the public domain. Everyone knows navigating a government website can be nightmarish. For example, Indian Railways provides open APIs which enabled the development of applications such as RailYatri. Through the eRail APIs, the application pulls information including lists of stations, trains between stations, the route of a train, train fares, PNR status, live train status, seat availability, cancelled, rescheduled or diverted train information, and the current running status of a train. …(More)”
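The RailYatri example above can be sketched in a few lines: an app calls a published API, gets back structured JSON, and presents it in a friendlier form than the government website. The response shape, field names, and helper below are hypothetical stand-ins for illustration, not the actual eRail API contract.

```python
import json

# Illustrative sketch of how an app like RailYatri might consume a
# government-published open API such as eRail's. The JSON shape and
# field names below are hypothetical, not the real eRail response format.
sample_pnr_response = json.loads("""
{"pnr": "XXXXXXXXXX",
 "train": {"number": "12952", "name": "Mumbai Rajdhani"},
 "passengers": [{"booking_status": "W/L 12", "current_status": "CNF"}]}
""")

def summarize_pnr(resp: dict) -> str:
    """Turn a raw PNR-status payload into a one-line, user-friendly summary."""
    train = resp["train"]
    confirmed = sum(1 for p in resp["passengers"]
                    if p["current_status"] == "CNF")
    return (f"{train['number']} {train['name']}: "
            f"{confirmed}/{len(resp['passengers'])} passengers confirmed")

print(summarize_pnr(sample_pnr_response))
```

This is the payoff the DeitY policy points at: once an API is published, documented with sample code, and kept backward compatible, third-party developers can do the presentation work that government portals often do poorly.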

See also “Policy on Open Application Programming Interfaces (APIs) for Government of India”.