Lessons from Cambridge Analytica: one way to protect your data


Julia Apostle in the Financial Times: “The unsettling revelations about how data firm Cambridge Analytica surreptitiously exploited the personal information of Facebook users is yet another demoralising reminder of how much data has been amassed about us, and of how little control we have over it.

Unfortunately, the General Data Protection Regulation privacy laws that are coming into force across Europe — with more demanding consent, transparency and accountability requirements, backed by huge fines — may improve practices, but they will not change the governing paradigm: the law labels those who gather our data as “controllers”. We are merely “subjects”.

But if the past 20 years have taught us anything, it is that when business and legislators have been too slow to adapt to public demand — for goods and services that we did not even know we needed, such as Amazon, Uber and bitcoin — computer scientists have stepped in to fill the void. And so it appears that the realms of data privacy and security are deserving of some disruption. This might come in the form of “self-sovereign identity” systems.

The theory behind self-sovereign identity is that individuals should control the data elements that form the basis of their digital identities, and not centralised authorities such as governments and private companies. In the current online environment, we all have multiple log-ins, usernames, customer IDs and personal data spread across countless platforms and stored in myriad repositories.

Instead of this scattered approach, we should each possess the digital equivalent of a wallet that contains verified pieces of our identities. We can then choose which identification to share, with whom, and when. Self-sovereign identity systems are currently being developed.

They involve the creation of a unique and persistent identifier attributed to an individual (called a decentralised identity), which cannot be taken away. The systems use public/private key cryptography, which enables a user with a private key (a string of numbers) to share information with unlimited recipients who can access the encrypted data if they possess a corresponding public key.

The systems also rely on decentralised ledger applications like blockchain. While key cryptography has been around for a long time, it is the development of decentralised ledger technology, which also supports the trading of cryptocurrencies without the involvement of intermediaries, that will allow self-sovereign identity systems to take off. The potential uses for decentralised identity are legion and small-scale implementation is already happening. The Swiss municipality of Zug started using a decentralised identity system called uPort last year, to allow residents access to certain government services. The municipality announced it will also use the system for voting this spring….

Decentralised identity is more difficult to access and therefore there is less financial incentive for hackers to try. Self-sovereign identity systems could eliminate many of our data privacy concerns while empowering individuals in the online world and turning the established data order on its head. But the success of the technology depends on its widespread adoption….(More)

Artificial Intelligence and the Need for Data Fairness in the Global South


Medium blog by Yasodara Cordova: “…The data collected by industry represents AI opportunities for governments, to improve their services through innovation. Data-based intelligence promises to increase the efficiency of resource management by improving transparency, logistics, social welfare distribution — and virtually every government service. E-government enthusiasm took of with the realization of the possible applications, such as using AI to fight corruption by automating the fraud-tracking capabilities of cost-control tools. Controversially, the AI enthusiasm has spread to the distribution of social benefits, optimization of tax oversight and control, credit scoring systems, crime prediction systems, and other applications based in personal and sensitive data collection, especially in countries that do not have comprehensive privacy protections.

There are so many potential applications, society may operate very differently in ten years when the “datafixation” has advanced beyond citizen data and into other applications such as energy and natural resource management. However, many countries in the Global South are not being given necessary access to their countries’ own data.

Useful data are everywhere, but only some can take advantage. Beyond smartphones, data can be collected from IoT components in common spaces. Not restricted to urban spaces, data collection includes rural technology like sensors installed in tractors. However, even when the information is related to issues of public importance in developing countries —like data taken from road mesh or vital resources like water and land — it stays hidden under contract rules and public citizens cannot access, and therefore take benefit, from it. This arrangement keeps the public uninformed about their country’s operations. The data collection and distribution frameworks are not built towards healthy partnerships between industry and government preventing countries from realizing the potential outlined in the previous paragraph.

The data necessary to the development of better cities, public policies, and common interest cannot be leveraged if kept in closed silos, yet access often costs more than is justifiable. Data are a primordial resource to all stages of new technology, especially tech adoption and integration, so the necessary long term investment in innovation needs a common ground to start with. The mismatch between the pace of the data collection among big established companies and small, new, and local businesses will likely increase with time, assuming no regulation is introduced for equal access to collected data….

Currently, data independence remains restricted to discussions on the technological infrastructure that supports data extraction. Privacy discussions focus on personal data rather than the digital accumulation of strategic data in closed silos — a necessary discussion not yet addressed. The national interest of data is not being addressed in a framework of economic and social fairness. Access to data, from a policy-making standpoint, needs to find a balance between the extremes of public, open access and limited, commercial use.

A final, but important note: the vast majority of social media act like silos. APIs play an important role in corporate business models, where industry controls the data it collects without reward, let alone user transparency. Negotiation of the specification of APIs to make data a common resource should be considered, for such an effort may align with the citizens’ interest….(More)”.

Is Distributed Ledger Technology Built for Personal Data?


Paper by Henry Chang: “Some of the appealing characteristics of distributed ledger technology (DLT), which blockchain is a type of, include guaranteed integrity, disintermediation and distributed resilience. These characteristics give rise to the possible consequences of immutability, unclear ownership, universal accessibility and trans-border storage. These consequences have the potential to contravene data protection principles of Purpose Specification, Use Limitation, Data Quality, Individual Participation and Trans-Border Data Flow. This paper endeavors to clarify the various types of DLTs, how they work, why they exhibit the depicted characteristics and the consequences. Using the universal privacy principles developed by the Organisation of Economic Cooperation and Development (OECD), this paper then describes how each of the consequence causes concerns for privacy protection and how attempts are being made to address them in the design and implementation of various applications of blockchain and DLT, and indicates where further research and best-practice developments lie….(More)”.

The World’s Biggest Biometric Database Keeps Leaking People’s Data


Rohith Jyothish at FastCompany: “India’s national scheme holds the personal data of more than 1.13 billion citizens and residents of India within a unique ID system branded as Aadhaar, which means “foundation” in Hindi. But as more and more evidence reveals that the government is not keeping this information private, the actual foundation of the system appears shaky at best.

On January 4, 2018, The Tribune of India, a news outlet based out of Chandigarh, created a firestorm when it reported that people were selling access to Aadhaar data on WhatsApp, for alarmingly low prices….

The Aadhaar unique identification number ties together several pieces of a person’s demographic and biometric information, including their photograph, fingerprints, home address, and other personal information. This information is all stored in a centralized database, which is then made accessible to a long list of government agencies who can access that information in administrating public services.

Although centralizing this information could increase efficiency, it also creates a highly vulnerable situation in which one simple breach could result in millions of India’s residents’ data becoming exposed.

The Annual Report 2015-16 of the Ministry of Electronics and Information Technology speaks of a facility called DBT Seeding Data Viewer (DSDV) that “permits the departments/agencies to view the demographic details of Aadhaar holder.”

According to @databaazi, DSDV logins allowed third parties to access Aadhaar data (without UID holder’s consent) from a white-listed IP address. This meant that anyone with the right IP address could access the system.

This design flaw puts personal details of millions of Aadhaar holders at risk of broad exposure, in clear violation of the Aadhaar Act.…(More)”.

On democracy


Sophie in ‘t Veld (European Parliament) in a Special Issue of Internet Policy Review on Political micro-targeting edited by Balazs Bodo, Natali Helberger and Claes de Vreese: Democracy is valuable and vulnerable, which is reason enough to remain alert for new developments that can undermine her. In recent months, we have seen enough examples of the growing impact of personal data in campaigns and elections. It is important and urgent for us to publicly debate this development. It is easy to see why we should take action against extremist propaganda of hatemongers aiming to recruit young people for violent acts. But we euphemistically speak of ‘fake news’ when lies, ‘half-truths’, conspiracy theories, and sedition creepily poison public opinion.

The literal meaning of democracy is ‘the power of the people’. ‘Power’ presupposes freedom. Freedom to choose and to decide. Freedom from coercion and pressure. Freedom from manipulation. ‘Power’ also presupposes knowledge. Knowledge of all facts, aspects, and options. And knowing how to balance them against each other. When freedom and knowledge are restricted, there can be no power.

In a democracy, every individual choice influences society as a whole. Therefore, the common interest is served with everyone’s ability to make their choices in complete freedom, and with complete knowledge.

The interests of parties and political candidates who compete for citizen’s votes may differ from that higher interest. They want citizens to see their political advertising, and only theirs, not that of their competitors. Not only do parties and candidates compete for the voter’s favour. They contend for his exclusive time and attention as well.

POLITICAL TARGETING

No laws dictate what kind of information a voter should rely on to be able to make the right consideration. For lamb chops, toothpaste, mortgages or cars, for example, it’s mandatory for producers to mention the origin and properties. This enables consumers to make a responsible decision. Providing false information is illegal. All ingredients, properties, and risks have to be mentioned on the label.

Political communication, however, is protected by freedom of speech. Political parties are allowed to use all kinds of sales tricks.

And, of course, campaigns do their utmost and continuously test the limits of the socially acceptable….(More)”.

Government data: How open is too open?


Sharon Fisher at HPE: “The notion of “open government” appeals to both citizens and IT professionals seeking access to freely available government data. But is there such a thing as data access being too open? Governments may want to be transparent, yet they need to avoid releasing personally identifiable information.

There’s no question that open government data offers many benefits. It gives citizens access to the data their taxes paid for, enables government oversight, and powers the applications developed by government, vendors, and citizens that improve people’s lives.

However, data breaches and concerns about the amount of data that government is collecting makes some people wonder: When is it too much?

“As we think through the big questions about what kind of data a state should collect, how it should use it, and how to disclose it, these nuances become not some theoretical issue but a matter of life and death to some people,” says Alexander Howard, deputy director of the Sunlight Foundation, a Washington nonprofit that advocates for open government. “There are people in government databases where the disclosure of their [physical] location is the difference between a life-changing day and Wednesday.

Open data supporters point out that much of this data has been considered a public record all along and tout the value of its use in analytics. But having personal data aggregated in a single place that is accessible online—as opposed to, say, having to go to an office and physically look up each record—makes some people uneasy.

Privacy breaches, wholesale

“We’ve seen a real change in how people perceive privacy,” says Michael Morisy, executive director at MuckRock, a Cambridge, Massachusetts, nonprofit that helps media and citizens file public records requests. “It’s been driven by a long-standing concept in transparency: practical obscurity.” Even if something was technically a public record, effort needed to be expended to get one’s hands on it. That amount of work might be worth it about, say, someone running for office, but on the whole, private citizens didn’t have to worry. Things are different now, says Morisy. “With Google, and so much data being available at the click of a mouse or the tap of a phone, what was once practically obscure is now instantly available.”

People are sometimes also surprised to find out that public records can contain their personally identifiable information (PII), such as addresses, phone numbers, and even Social Security numbers. That may be on purpose or because someone failed to redact the data properly.

That’s had consequences. Over the years, there have been a number of incidents in which PII from public records, including addresses, was used to harass and sometimes even kill people. For example, in 1989, Rebecca Schaeffer was murdered by a stalker who learned her address from the Department of Motor Vehicles. Other examples of harassment via driver’s license numbers include thieves who tracked down the address of owners of expensive cars and activists who sent anti-abortion literature to women who had visited health clinics that performed abortions.

In response, in 1994, Congress enacted the Driver’s Privacy Protection Act to restrict the sale of such data. More recently, the state of Idaho passed a law protecting the identity of hunters who shot wolves, because the hunters were being harassed by wolf supporters. Similarly, the state of New York allowed concealed pistol permit holders to make their name and address private after a newspaper published an online interactive map showing the names and addresses of all handgun permit holders in Westchester and Rockland counties….(More)”.

It’s Time to Tax Companies for Using Our Personal Data


Saadia Madsbjerg in the New York Times: “Our data is valuable. Each year, it generates hundreds of billions of dollars’ worth of economic activity, mostly between and within corporations — all on the back of information about each of us.

It’s this transaction — between you, the user, giving up details of yourself to a company in exchange for a product like a photo app or email, or a whole ecosystem like Facebook — that’s worth by some estimates $1,000 per person per year, a number that is quickly rising.

The value of our personal data is primarily locked up in the revenues of large corporations. Some, like data brokerages, exist solely to buy and sell sets of that data.

Why should companies be the major, and often the only, beneficiaries of this largess? They shouldn’t. Those financial benefits need to be shared, and the best way to do it is to impose a small tax on this revenue and use the proceeds to build a better, more equitable internet and society that benefit us all.

The data tax could be a minor cost, less than 1 percent of the revenue companies earn from selling our personal data, spread out over an entire industry. Individually, no company’s bottom line would substantially suffer; collectively, the tax would pull money back to the public, from an industry profiting from material and labor that is, at its very core, our own.

This idea is not new. It is, essentially, a sales tax, among the oldest taxes that exist, but it hasn’t been done because assigning a fixed monetary value to our data can be very difficult. For a lot of internet businesses, our personal data either primarily flows through the business or remains locked within….(More)”.

Tech’s fight for the upper hand on open data


Rana Foroohar at the Financial Times: “One thing that’s becoming very clear to me as I report on the digital economy is that a rethink of the legal framework in which business has been conducted for many decades is going to be required. Many of the key laws that govern digital commerce (which, increasingly, is most commerce) were crafted in the 1980s or 1990s, when the internet was an entirely different place. Consider, for example, the US Computer Fraud and Abuse Act.

This 1986 law made it a federal crime to engage in “unauthorised access” to a computer connected to the internet. It was designed to prevent hackers from breaking into government or corporate systems. …While few hackers seem to have been deterred by it, the law is being used in turf battles between companies looking to monetise the most valuable commodity on the planet — your personal data. Case in point: LinkedIn vs HiQ, which may well become a groundbreaker in Silicon Valley.

LinkedIn is the dominant professional networking platform, a Facebook for corporate types. HiQ is a “data-scraping” company, one that accesses publicly available data from LinkedIn profiles and then mixes it up in its own quantitative black box to create two products — Keeper, which tells employers which of their employees are at greatest risk of being recruited away, and Skill Mapper, which provides a summary of the skills possessed by individual workers. LinkedIn allowed HiQ to do this for five years, before developing a very similar product to Skill Mapper, at which point LinkedIn sent the company a “cease and desist” letter, and threatened to invoke the CFAA if HiQ did not stop tapping its user data.

..Meanwhile, a case that might have been significant mainly to digital insiders is being given a huge publicity boost by Harvard professor Laurence Tribe, the country’s pre-eminent constitutional law scholar. He has joined the HiQ defence team because, as he told me, he believes the case is “tremendously important”, not only in terms of setting competitive rules for the digital economy, but in the realm of free speech. According to Prof Tribe, if you accept that the internet is the new town square, and “data is a central type of capital”, then it must be freely available to everyone — and LinkedIn, as a private company, cannot suddenly decide that publicly accessible, Google-searchable data is their private property….(More)”.

Blockchain Could Help Us Reclaim Control of Our Personal Data


Michael Mainelli at Harvard Business Review: “…numerous smaller countries, such as Singapore, are exploring national identity systems that span government and the private sector. One of the more successful stories of governments instituting an identity system is Estonia, with its ID-kaarts. Reacting to cyber-attacks against the nation, the Estonian government decided that it needed to become more digital, and even more secure. They decided to use a distributed ledger to build their system, rather than a traditional central database. Distributed ledgers are used in situations where multiple parties need to share authoritative information with each other without a central third party, such as for data-logging clinical assessments or storing data from commercial deals. These are multi-organization databases with a super audit trail. As a result, the Estonian system provides its citizens with an all-digital government experience, significantly reduced bureaucracy, and significantly high citizen satisfaction with their government dealings.

Cryptocurrencies such as Bitcoin have increased the awareness of distributed ledgers with their use of a particular type of ledger — blockchain — to hold the details of coin accounts among millions of users. Cryptocurrencies have certainly had their own problems with their wallets and exchanges — even ID-kaarts are not without their technical problems — but the distributed ledger technology holds firm for Estonia and for cryptocurrencies. These technologies have been working in hostile environments now for nearly a decade.

The problem with a central database like the ones used to house social security numbers, or credit reports, is that once it’s compromised, a thief has the ability to copy all of the information stored there. Hence the huge numbers of people that can be affected — more than 140 million people in the Equifax breach, and more than 50 million at Home Depot — though perhaps Yahoo takes the cake with more than three billion alleged customer accounts hacked.  Of course, if you can find a distributed ledger online, you can copy it, too. However, a distributed ledger, while available to everyone, may be unreadable if its contents are encrypted. Bitcoin’s blockchain is readable to all, though you can encrypt things in comments. Most distributed ledgers outside cryptocurrencies are encrypted in whole or in part. The effect is that while you can have a copy of the database, you can’t actually read it.

This characteristic of encrypted distributed ledgers has big implications for identity systems.  You can keep certified copies of identity documents, biometric test results, health data, or academic and training certificates online, available at all times, yet safe unless you give away your key. At a whole system level, the database is very secure. Each single ledger entry among billions would need to be found and then individually “cracked” at great expense in time and computing, making the database as a whole very safe.

Distributed ledgers seem ideal for private distributed identity systems, and many organizations are working to provide such systems to help people manage the huge amount of paperwork modern society requires to open accounts, validate yourself, or make payments.  Taken a small step further, these systems can help you keep relevant health or qualification records at your fingertips.  Using “smart” ledgers, you can forward your documentation to people who need to see it, while keeping control of access, including whether another party can forward the information. You can even revoke someone’s access to the information in the future….(More)”.

Selected Readings on Blockchain and Identity


By Hannah Pierce and Stefaan Verhulst

The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of blockchain and identity was originally published in 2017.

The potential of blockchain and other distributed ledger technologies to create positive social change has inspired enthusiasm, broad experimentation, and some skepticism. In this edition of the Selected Readings series, we explore and curate the literature on blockchain and how it impacts identity as a means to access services and rights. (In a previous edition we considered the Potential of Blockchain for Transforming Governance).

Introduction

In 2008, an unknown source calling itself Satoshi Nakamoto released a paper named Bitcoin: A Peer-to-Peer Electronic Cash System which introduced Blockchain. Blockchain is a novel technology that uses a distributed ledger to record transactions and ensure compliance. Blockchain and other Distributed Ledger technologies (DLTs) rely on an ability to act as a vast, transparent, and secure public database.

Distributed ledger technologies (DLTs) have disruptive potential beyond innovation in products, services, revenue streams and operating systems within industry. By providing transparency and accountability in new and distributed ways, DLTs have the potential to positively empower underserved populations in myriad ways, including providing a means for establishing a trusted digital identity.

Consider the potential of DLTs for 2.4 billion people worldwide, about 1.5 billion of whom are over the age of 14, who are unable to prove identity to the satisfaction of authorities and other organizations – often excluding them from property ownership, free movement, and social protection as a result. At the same time, transition to a DLT led system of ID management involves various risks, that if not understood and mitigated properly, could harm potential beneficiaries.

Annotated Selected Reading List

Governance

Cuomo, Jerry, Richard Nash, Veena Pureswaran, Alan Thurlow, Dave Zaharchuk. “Building trust in government: Exploring the potential of blockchains.” IBM Institute for Business Value. January 2017.

This paper from the IBM Institute for Business Value culls findings from surveys conducted with over 200 government leaders in 16 countries regarding their experiences and expectations for blockchain technology. The report also identifies “Trailblazers”, or governments that expect to have blockchain technology in place by the end of the year, and details the views and approaches that these early adopters are taking to ensure the success of blockchain in governance. These Trailblazers also believe that there will be high yields from utilizing blockchain in identity management and that citizen services, such as voting, tax collection and land registration, will become increasingly dependent upon decentralized and secure identity management systems. Additionally, some of the Trailblazers are exploring blockchain application in borderless services, like cross-province or state tax collection, because the technology removes the need for intermediaries like notaries or lawyers to verify identities and the authenticity of transactions.

Mattila, Juri. “The Blockchain Phenomenon: The Disruptive Potential of Distributed Consensus Architectures.” Berkeley Roundtable on the International Economy. May 2016.

This working paper gives a clear introduction to blockchain terminology, architecture, challenges, applications (including use cases), and implications for digital trust, disintermediation, democratizing the supply chain, an automated economy, and the reconfiguration of regulatory capacity. As far as identification management is concerned, Mattila argues that blockchain can remove the need to go through a trusted third party (such as a bank) to verify identity online. This could strengthen the security of personal data, as the move from a centralized intermediary to a decentralized network lowers the risk of a mass data security breach. In addition, using blockchain technology for identity verification allows for a more standardized documentation of identity which can be used across platforms and services. In light of these potential capabilities, Mattila addresses the disruptive power of blockchain technology on intermediary businesses and regulating bodies.

Identity Management Applications

Allen, Christopher.  “The Path to Self-Sovereign Identity.” Coindesk. April 27, 2016.

In this Coindesk article, author Christopher Allen lays out the history of digital identities, then explains a concept of a “self-sovereign” identity, where trust is enabled without compromising individual privacy. His ten principles for self-sovereign identity (Existence, Control, Access, Transparency, Persistence, Portability, Interoperability, Consent, Minimization, and Protection) lend themselves to blockchain technology for administration. Although there are actors making moves toward the establishment of self-sovereign identity, there are a few challenges that face the widespread implementation of these tenets, including legal risks, confidentiality issues, immature technology, and a reluctance to change established processes.

Jacobovitz, Ori. “Blockchain for Identity Management.” Department of Computer Science, Ben-Gurion University. December 11, 2016.

This technical report discusses advantages of blockchain technology in managing and authenticating identities online, such as the ability for individuals to create and manage their own online identities, which offers greater control over access to personal data. Using blockchain for identity verification can also afford the potential of “digital watermarks” that could be assigned to each of an individual’s transactions, as well as negating the creation of unique usernames and passwords online. After arguing that this decentralized model will allow individuals to manage data on their own terms, Jacobvitz provides a list of companies, projects, and movements that are using blockchain for identity management.

Mainelli, Michael. “Blockchain Will Help Us Prove Our Identities in a Digital World.” Harvard Business Review. March 16, 2017.

In this Harvard Business Review article, author Michael Mainelli highlights a solution to identity problems for rich and poor alike–mutual distributed ledgers (MDLs), or blockchain technology. These multi-organizational data bases with unalterable ledgers and a “super audit trail” have three parties that deal with digital document exchanges: subjects are individuals or assets, certifiers are are organizations that verify identity, and inquisitors are entities that conducts know-your-customer (KYC) checks on the subject. This system will allow for a low-cost, secure, and global method of proving identity. After outlining some of the other benefits that this technology may have in creating secure and easily auditable digital documents, such as greater tolerance that comes from viewing widely public ledgers, Mainelli questions if these capabilities will turn out to be a boon or a burden to bureaucracy and societal behavior.

Personal Data Security Applications

Banafa, Ahmed. “How to Secure the Internet of Things (IoT) with Blockchain.” Datafloq. August 15, 2016.

This article details the data security risks that are coming up as the Internet of Things continues to expand, and how using blockchain technology can protect the personal data and identity information that is exchanged between devices. Banafa argues that, as the creation and collection of data is central to the functions of Internet of Things devices, there is an increasing need to better secure data that largely confidential and often personally identifiable. Decentralizing IoT networks, then securing their communications with blockchain can allow to remain scalable, private, and reliable. Enabling blockchain’s peer-to-peer, trustless communication may also enable smart devices to initiate personal data exchanges like financial transactions, as centralized authorities or intermediaries will not be necessary.

Shrier, David, Weige Wu and Alex Pentland. “Blockchain & Infrastructure (Identity, Data Security).” Massachusetts Institute of Technology. May 17, 2016.

This paper, the third of a four-part series on potential blockchain applications, covers the potential of blockchains to change the status quo of identity authentication systems, privacy protection, transaction monitoring, ownership rights, and data security. The paper also posits that, as personal data becomes more and more valuable, that we should move towards a “New Deal on Data” which provides individuals data protection–through blockchain technology– and the option to contribute their data to aggregates that work towards the common good. In order to achieve this New Deal on Data, robust regulatory standards and financial incentives must be provided to entice individuals to share their data to benefit society.