Data for Public Benefit


Carnegie Trust: “Public services are essential to our lives. Collecting, using and sharing data better could help deliver these services more effectively. But as well as delivering many public benefits the sharing of personal data can also involve risks.

‘Data for Public Benefit’ is a joint initiative with Involve and Understanding Patient Data. The report presents new research from across six local authority areas in England and has found that there are big differences in how public services currently define and weigh up public benefits and risks of data sharing.

We’ve developed a framework to help organisations make better decisions about when data should and shouldn’t be shared. This framework will help professionals weigh up the purpose of sharing data against the potential for harm and help public service providers have conversations with the public about data sharing….(More)”.

The digital economy is disrupting our old models


Diane Coyle at The Financial Times: “One of the many episodes of culture shock I experienced as a British student in the US came when I first visited the university health centre. They gave me my medical notes to take away. Once I was over the surprise, I concluded this was entirely proper. After all, the true data was me, my body. I was reminded of this moment from the early 1980s when reflecting on the debate about Facebook and data, one of the collective conclusions of which seems to be that personal data are personal property so there need to be stronger rights of ownership. If I do not like what Facebook is doing with my data, I should be able to withdraw them. Yet this fix for the problem is not straightforward.

“My” data are inextricably linked with that of other people, who are in my photographs or in my network. Once the patterns and correlations have been extracted from it, withdrawing my underlying data is neither here nor there, for the value lies in the patterns. The social character of information can be seen from the recent example of Strava accidentally publishing maps of secret American military bases because the aggregated route data revealed all the service personnel were running around the edge of their camps. One or two withdrawals of personal data would have made no difference. To put it in economic jargon, we are in the territory of externalities and public goods. Information once shared cannot be unshared.

The digital economy is one of externalities and public goods to a far greater degree than in the past. We have not begun to get to grips with how to analyse it, still less to develop policies for the common good. There are two questions at the heart of the challenge: what norms and laws about property rights over intangibles such as data or ideas or algorithms are going to be needed? And what will the best balance between collective and individual actions be or, to put it another way, between government and market?

Tussles about rights over intangible or intellectual property have been going on for a while: patent trolls on the one hand, open source creators on the other. However, the issue is far from settled. Do we really want to accept, for example, that John Deere, in selling an expensive tractor to a farmer, is only in fact renting it out because it claims property rights over the installed software?

Free digital goods of the open source kind are being cross-subsidised by their creators’ other sources of income. Free digital goods of the social media kind are being funded by various advertising services — and that turns out to be an ugly solution. Yet the network effects are so strong, the benefits they provide so great, that if Facebook and Google were shut down by antitrust action tomorrow, replacement digital groups could well emerge before too long. China seems to be in effect nationalising its big digital platforms but many in the west will find that even less appealing than a private data market. In short, neither “market” nor “state” looks like the right model for ownership and governance in an information economy pervaded by externalities and public goods. Finding alternative models for the creation and sharing of value in the digital world, when these are inherently collective and non-rival activities, is an urgent challenge….(More)”.

Data in the EU: Commission steps up efforts to increase availability and boost healthcare data sharing


Press Release: “Today, the European Commission is putting forward a set of measures to increase the availability of data in the EU, building on previous initiatives to boost the free flow of non-personal data in the Digital Single Market.

Data-driven innovation is a key enabler of market growth, job creation, particularly for SMEs and startups, and the development of new technologies. It allows citizens to easily access and manage their health data, and allows public authorities to use data better in research, prevention and health system reforms….

Today’s proposals build on the General Data Protection Regulation (GDPR), which will enter into application as of 25 May 2018. They will ensure:

  • Better access to and reusability of public sector data: A revised law on Public Sector Information covers data held by public undertakings in transport and utilities sectors. The new rules limit the exceptions that allow public bodies to charge more than the marginal costs of data dissemination for the reuse of their data. They also facilitate the reusability of open research data resulting from public funding, and oblige Member States to develop open access policies. Finally, the new rules require – where applicable – technical solutions like Application Programming Interfaces (APIs) to provide real-time access to data.
  • Scientific data sharing in 2018: A new set of recommendations addresses the policy and technological changes since the last Commission proposal on access to and preservation of scientific information. They offer guidance on implementing open access policies in line with open science objectives, research data and data management, the creation of a European Open Science Cloud, and text and data-mining. They also highlight the importance of incentives, rewards, skills and metrics appropriate for the new era of networked research.
  • Private sector data sharing in business-to-business and business-to-government contexts: A new Communication entitled “Towards a common European data space” provides guidance for businesses operating in the EU on the legal and technical principles that should govern data sharing collaboration in the private sector.
  • Securing citizens’ healthcare data while fostering European cooperation: The Commission is today setting out a plan of action that puts citizens first when it comes to data on citizens’ health: by securing citizens’ access to their health data and introducing the possibility to share their data across borders; by using larger data sets to enable more personalised diagnoses and medical treatment, and better anticipate epidemics; and by promoting appropriate digital tools, allowing public authorities to better use health data for research and for health system reforms. Today’s proposal also covers the interoperability of electronic health records as well as a mechanism for voluntary coordination in sharing data – including genomic data – for disease prevention and research….(More)”.

Use our personal data for the common good


Hetan Shah at Nature: “Data science brings enormous potential for good — for example, to improve the delivery of public services, and even to track and fight modern slavery. No wonder researchers around the world — including members of my own organization, the Royal Statistical Society in London — have had their heads in their hands over headlines about how Facebook and the data-analytics company Cambridge Analytica might have handled personal data. We know that trustworthiness underpins public support for data innovation, and we have just seen what happens when that trust is lost….But how else might we ensure the use of data for the public good rather than for purely private gain?

Here are two proposals towards this goal.

First, governments should pass legislation to allow national statistical offices to gain anonymized access to large private-sector data sets under openly specified conditions. This provision was part of the United Kingdom’s Digital Economy Act last year and will improve the ability of the UK Office for National Statistics to assess the economy and society for the public interest.

My second proposal is inspired by the legacy of John Sulston, who died earlier this month. Sulston was known for his success in advocating for the Human Genome Project to be openly accessible to the science community, while a competitor sought to sequence the genome first and keep data proprietary.

Like Sulston, we should look for ways of making data available for the common interest. Intellectual-property rights expire after a fixed time period: what if, similarly, technology companies were allowed to use the data that they gather only for a limited period, say, five years? The data could then revert to a national charitable corporation that could provide access to certified researchers, who would both be held to account and be subject to scrutiny that ensures the data are used for the common good.

Technology companies would move from being data owners to becoming data stewards…(More)” (see also http://datacollaboratives.org/).

To serve a free society, social media must evolve beyond data mining


Barbara Romzek and Aram Sinnreich at The Conversation: “…For years, watchdogs have been warning about sharing information with data-collecting companies, firms engaged in the relatively new line of business that some academics have called “surveillance capitalism.” Most casual internet users are only now realizing how easy – and common – it is for unaccountable and unknown organizations to assemble detailed digital profiles of them. They do this by combining the discrete bits of information consumers have given up to e-tailers, health sites, quiz apps and countless other digital services.

As scholars of public accountability and digital media systems, we know that the business of social media is based on extracting user data and offering it for sale. There’s no simple way for them to protect data as many users might expect. Like the social pollution of fake news, bullying and spam that Facebook’s platform spreads, the company’s privacy crisis also stems from a power imbalance: Facebook knows nearly everything about its users, who know little to nothing about it.

It’s not enough for people to delete their Facebook accounts. Nor is it likely that anyone will successfully replace it with a nonprofit alternative centering on privacy, transparency and accountability. Furthermore, this problem is not specific just to Facebook. Other companies, including Google and Amazon, also gather and exploit extensive personal data, and are locked in a digital arms race that we believe threatens to destroy privacy altogether….

Governments need to be better guardians of public welfare – including privacy. Many companies using various aspects of technology in new ways have so far avoided regulation by stoking fears that rules might stifle innovation. Facebook and others have often claimed that they’re better at regulating themselves in an ever-changing environment than a slow-moving legislative process could be….

To encourage companies to serve democratic principles and focus on improving people’s lives, we believe the chief business model of the internet needs to shift to building trust and verifying information. While it won’t be an immediate change, social media companies pride themselves on their adaptability and should be able to take on this challenge.

The alternative, of course, could be far more severe. In the 1980s, when federal regulators decided that AT&T was using its power in the telephone market to hurt competition and consumers, they forced the massive conglomerate to break up. A similar but less dramatic change happened in the early 2000s when cellphone companies were forced to let people keep their phone numbers even if they switched carriers.

Data, and particularly individuals’ personal data, are the precious metals of the internet age. Protecting individual data while expanding access to the internet and its many social benefits is a fundamental challenge for free societies. Creating, using and protecting data properly will be crucial to preserving and improving human rights and civil liberties in this still young century. To meet this challenge will require both vigilance and vision, from businesses and their customers, as well as governments and their citizens….(More)”.

Practical approaches to big data privacy over time


Micah Altman, Alexandra Wood, David R O’Brien and Urs Gasser in International Data Privacy Law: “

  • Governments and businesses are increasingly collecting, analysing, and sharing detailed information about individuals over long periods of time.
  • Vast quantities of data from new sources and novel methods for large-scale data analysis promise to yield deeper understanding of human characteristics, behaviour, and relationships and advance the state of science, public policy, and innovation.
  • The collection and use of fine-grained personal data over time, at the same time, is associated with significant risks to individuals, groups, and society at large.
  • This article examines a range of long-term research studies in order to identify the characteristics that drive their unique sets of risks and benefits and the practices established to protect research data subjects from long-term privacy risks.
  • We find that many big data activities in government and industry settings have characteristics and risks similar to those of long-term research studies, but are subject to less oversight and control.
  • We argue that the risks posed by big data over time can best be understood as a function of temporal factors comprising age, period, and frequency and non-temporal factors such as population diversity, sample size, dimensionality, and intended analytic use.
  • Increasing complexity in any of these factors, individually or in combination, creates heightened risks that are not readily addressable through traditional de-identification and process controls.
  • We provide practical recommendations for big data privacy controls based on the risk factors present in a specific case and informed by recent insights from the state of the art and practice….(More)”.

Selected Readings on Data Responsibility, Refugees and Migration


By Kezia Paladina, Alexandra Shaw, Michelle Winowatan, Stefaan Verhulst, and Andrew Young

The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of Data Collaboration for Migration was originally published in 2018.

Special thanks to Paul Currion, whose data responsibility literature review gave us a head start when developing the below. (Check out his article listed below on Refugee Identity.)

The collection below is also meant to complement our article in the Stanford Social Innovation Review on Data Collaboration for Migration where we emphasize the need for a Data Responsibility Framework moving forward.

From climate change to politics to finance, there is growing recognition that some of the most intractable problems of our era are information problems. In recent years, the ongoing refugee crisis has increased the call for new data-driven approaches to address the many challenges and opportunities arising from migration. While data – including data from the private sector – holds significant potential value for informing analysis and targeted international and humanitarian response to (forced) migration, decision-makers often lack an actionable understanding of if, when and how data could be collected, processed, stored, analyzed, used, and shared in a responsible manner.

Data responsibility – including the responsibility to protect data and shield its subjects from harms, and the responsibility to leverage and share data when it can provide public value – is an emerging field seeking to go beyond just privacy concerns. The forced migration arena has a number of particularly important issues impacting responsible data approaches, including the risks of leveraging data regarding individuals fleeing a hostile or repressive government.

In this edition of the GovLab’s Selected Readings series, we examine the emerging literature on data responsibility approaches in the refugee and forced migration space – part of an ongoing series focused on Data Responsibility. The below reading list features annotated readings related to the Policy and Practice of data responsibility for refugees, and the specific responsibility challenges regarding Identity and Biometrics.

Data Responsibility and Refugees – Policy and Practice

International Organization for Migration (IOM) (2010) IOM Data Protection Manual. Geneva: IOM.

  • This IOM manual includes 13 data protection principles related to the following activities: lawful and fair collection, specified and legitimate purpose, data quality, consent, transfer to third parties, confidentiality, access and transparency, data security, retention of personal data, application of the principles, ownership of personal data, oversight, compliance and internal remedies (and exceptions).
  • For each principle, the IOM manual features targeted data protection guidelines, and templates and checklists are included to help foster practical application.

Norwegian Refugee Council (NRC) Internal Displacement Monitoring Centre / OCHA (eds.) (2008) Guidance on Profiling Internally Displaced Persons. Geneva: Inter-Agency Standing Committee.

  • This NRC document contains guidelines on gathering better data on Internally Displaced Persons (IDPs), based on country context.
  • An IDP profile is defined as the number of displaced persons, their location, the causes and patterns of displacement, and humanitarian needs, among others.
  • It further states that collecting IDP data is challenging and that the current state of IDP data is hampering assistance programs.
  • Chapter I of the document explores the rationale for IDP profiling. Chapter II describes the who aspect of profiling: who IDPs are and common pitfalls in distinguishing them from other population groups. Chapter III describes the different methodologies that can be used in different contexts and suggests some of the advantages and disadvantages of each, what kind of information is needed, and when it is appropriate to profile.

United Nations High Commissioner for Refugees (UNHCR). Model agreement on the sharing of personal data with Governments in the context of hand-over of the refugee status determination process. Geneva: UNHCR.

  • This document from UNHCR provides a template agreement guiding the sharing of data between a national government and UNHCR. The model agreement’s guidance is aimed at protecting the privacy and confidentiality of individual data while promoting improvements to service delivery for refugees.

United Nations High Commissioner for Refugees (UNHCR) (2015). Policy on the Protection of Personal Data of Persons of Concern to UNHCR. Geneva: UNHCR.

  • This policy outlines the rules and principles regarding the processing of personal data of persons engaged by UNHCR with the purpose of ensuring that the practice is consistent with UNGA’s regulation of computerized personal data files that was established to protect individuals’ data and privacy.
  • UNHCR requires its personnel to apply the following principles when processing personal data: (i) Legitimate and fair processing; (ii) Purpose specification; (iii) Necessity and proportionality; (iv) Accuracy; (v) Respect for the rights of the data subject; (vi) Confidentiality; (vii) Security; (viii) Accountability and supervision.

United Nations High Commissioner for Refugees (UNHCR) (2015) Privacy Impact Assessment of UNHCR Cash Based Interventions.

  • This impact assessment focuses on privacy issues related to financial assistance for refugees in the form of cash transfers. For international organizations like UNHCR to determine eligibility for cash assistance, data “aggregation, profiling, and social sorting techniques,” are often needed, leading to a need for a responsible data approach.
  • This Privacy Impact Assessment (PIA) aims to identify the privacy risks posed by the program and seeks to enhance safeguards that can mitigate those risks.
  • A key issue raised in the PIA is the challenge of ensuring that individuals’ data will not be used for purposes other than those initially specified.

Data Responsibility in Identity and Biometrics

Bohlin, A. (2008) “Protection at the Cost of Privacy? A Study of the Biometric Registration of Refugees.” Lund: Faculty of Law of the University of Lund.

  • This 2008 study focuses on the systematic biometric registration of refugees conducted by UNHCR in refugee camps around the world, to understand whether enhancing the registration mechanism of refugees contributes to their protection and guarantee of human rights, or whether refugee registration exposes people to invasions of privacy.
  • Bohlin found that, at the time, UNHCR had failed to put proper safeguards in place for data dissemination, exposing refugees’ data to the risk of misuse. She goes on to suggest data protection regulations that could be put in place in order to protect refugees’ privacy.

Currion, Paul. (2018) “The Refugee Identity.” Medium.

  • Developed as part of a DFID-funded initiative, this essay considers Data Requirements for Service Delivery within Refugee Camps, with a particular focus on refugee identity.
  • Among other findings, Currion concludes that since “the digitisation of aid has already begun…aid agencies must therefore pay more attention to the way in which identity systems affect the lives and livelihoods of the forcibly displaced, both positively and negatively.”
  • Currion argues that a Responsible Data approach, as opposed to a process defined by a Data Minimization principle, provides “useful guidelines,” but notes that data responsibility “still needs to be translated into organisational policy, then into institutional processes, and finally into operational practice.”

Farraj, A. (2010) “Refugees and the Biometric Future: The Impact of Biometrics on Refugees and Asylum Seekers.” Colum. Hum. Rts. L. Rev. 42 (2010): 891.

  • This article argues that biometrics help refugees and asylum seekers establish their identity, which is important for ensuring the protection of their rights and service delivery.
  • However, Farraj also describes several risks related to biometrics, such as misidentification and misuse of data, leading to a need for proper approaches to the collection, storage, and utilization of biometric information by governments, international organizations, or other parties.

GSMA (2017) Landscape Report: Mobile Money, Humanitarian Cash Transfers and Displaced Populations. London: GSMA.

  • This paper from GSMA seeks to evaluate how mobile technology can be helpful in refugee registration, cross-organizational data sharing, and service delivery processes.
  • One of its assessments is that the use of mobile money in a humanitarian context depends on the supporting regulatory environment that contributes to unlocking the true potential of mobile money. The examples include extension of SIM dormancy period to anticipate infrequent cash disbursements, ensuring that persons without identification are able to use the mobile money services, and so on.
  • Additionally, GSMA argues that mobile money will be most successful when there is an ecosystem to support other financial services such as remittances, airtime top-ups, savings, and bill payments. These services will be especially helpful in including displaced populations in development.

GSMA (2017) Refugees and Identity: Considerations for mobile-enabled registration and aid delivery. London: GSMA.

  • This paper emphasizes the importance of registration in the context of humanitarian emergency, because being registered and having a document that proves this registration is key in acquiring services and assistance.
  • Studying the cases of Kenya and Iraq, the report concludes by providing three recommendations to improve mobile data collection and registration processes: 1) establish more flexible KYC for mobile money, because refugees are often not able to meet existing requirements; 2) encourage interoperability and data sharing to avoid fragmented and duplicative registration management; and 3) build partnership and collaboration among governments, humanitarian organizations, and multinational corporations.

Jacobsen, Katja Lindskov (2015) “Experimentation in Humanitarian Locations: UNHCR and Biometric Registration of Afghan Refugees.” Security Dialogue, Vol 46 No. 2: 144–164.

  • In this article, Jacobsen studies the biometric registration of Afghan refugees, and considers how “humanitarian refugee biometrics produces digital refugees at risk of exposure to new forms of intrusion and insecurity.”

Jacobsen, Katja Lindskov (2017) “On Humanitarian Refugee Biometrics and New Forms of Intervention.” Journal of Intervention and Statebuilding, 1–23.

  • This article traces the evolution of the use of biometrics at the Office of the United Nations High Commissioner for Refugees (UNHCR) – moving from a few early pilot projects (in the early-to-mid-2000s) to the emergence of a policy in which biometric registration is considered a ‘strategic decision’.

Manby, Bronwen (2016) “Identification in the Context of Forced Displacement.” Washington DC: World Bank Group. Accessed August 21, 2017.

  • In this paper, Manby describes the consequences of not having an identity in a situation of forced displacement. It prevents displaced populations from getting various services and creates a higher chance of exploitation. It also lowers the effectiveness of humanitarian actions, as lacking identity prevents humanitarian organizations from delivering their services to the displaced populations.
  • Lack of identity can be both the consequence and the cause of forced displacement. People who have no identity can be considered illegal and risk being deported. At the same time, conflicts that lead to displacement can also result in loss of ID during travel.
  • The paper identifies different stakeholders and their interest in the case of identity and forced displacement, and finds that the biggest challenge for providing identity to refugees is the politics of identification and nationality.
  • Manby concludes that in order to address this challenge, there needs to be more effective coordination among governments, international organizations, and the private sector to come up with alternative ways of providing identification and services to displaced persons. She also argues that it is essential to ensure that national identification becomes a universal practice for states.

McClure, D. and Menchi, B. (2015). Challenges and the State of Play of Interoperability in Cash Transfer Programming. Geneva: UNHCR/World Vision International.

  • This report reviews the elements that contribute to the interoperability design for Cash Transfer Programming (CTP). The design framework offered here maps out these various features and also looks at the state of the problem and the state of play through a variety of use cases.
  • The study considers the current state of play and provides insights into ways to address the multi-dimensionality of interoperability measures in increasingly complex ecosystems.

NRC / International Human Rights Clinic (2016). Securing Status: Syrian refugees and the documentation of legal status, identity, and family relationships in Jordan.

  • This report examines Syrian refugees’ attempts to obtain identity cards and other forms of legally recognized documentation (mainly, Ministry of Interior Service Cards, or “new MoI cards”) in Jordan through the state’s Urban Verification Exercise (“UVE”). These MoI cards are significant because they allow Syrians to live outside of refugee camps and move freely about Jordan.
  • The text reviews the acquisition process and the subsequent challenges and consequences that refugees face when unable to obtain documentation. Refugees can encounter issues ranging from lack of access to basic services to arrest, detention, forced relocation to camps and refoulement.
  • Seventy-two Syrian refugee families in Jordan were interviewed in 2016 for this report and their experiences with obtaining MoI cards varied widely.

Office of Internal Oversight Services (2015). Audit of the operations in Jordan for the Office of the United Nations High Commissioner for Refugees. Report 2015/049. New York: UN.

  • This report documents the January 1, 2012 – March 31, 2014 audit of Jordanian operations, which is intended to ensure the effectiveness of the UNHCR Representation in the state.
  • The main goals of the Regional Response Plan for Syrian refugees included relieving the pressure on Jordanian services and resources while still maintaining protection for refugees.
  • The audit initially assessed the Representation as unsatisfactory, and OIOS made several recommendations under the two key controls, which the Representation acknowledged. Those recommendations included:
    • Project management:
      • Providing training to staff involved in financial verification of partners supervise management
      • Revising standard operating procedure on cash based interventions
      • Establishing ways to ensure that appropriate criteria for payment of all types of costs to partners’ staff are included in partnership agreements
    • Regulatory framework:
      • Preparing an annual needs-based procurement plan and establishing adequate management oversight processes
      • Creating procedures for the assessment of renovation work in progress and issuing written change orders
      • Protecting data and ensuring timely consultation with the UNHCR Division of Financial and Administrative Management

UNHCR/WFP (2015). Joint Inspection of the Biometrics Identification System for Food Distribution in Kenya. Geneva: UNHCR/WFP.

  • This report outlines the partnership between WFP and UNHCR in their effort to promote a biometric identification checking system to support food distribution in the Dadaab and Kakuma refugee camps in Kenya.
  • The two entities conducted a joint inspection mission in March 2015, and the system was considered an effective tool and a model for other country operations.
  • Still, 11 recommendations are proposed and responded to in this text to further improve the efficiency of the biometric system, including real-time evaluation of impact, the need for automatic alerts, and documentation of best practices, among others.

Replicating the Justice Data Lab in the USA: Key Considerations


Blog by Tracey Gyateng and Tris Lumley: “Since 2011, NPC has researched, supported and advocated for the development of impact-focussed Data Labs in the UK. The goal has been to unlock government administrative data so that organisations (primarily nonprofits) who provide a social service can understand the impact of their services on the people who use them.

So far, one of these Data Labs has been developed to measure re-offending outcomes (the Justice Data Lab), and others are currently being piloted for employment and education. Given our seven years of work in this area, we at NPC have decided to reflect on the key factors needed to create a Data Lab with our report: How to Create an Impact Data Lab. This blog outlines these factors, examines whether they are present in the USA, and asks what the next steps should be — drawing on the research undertaken with the Governance Lab….Below we examine the key factors and to what extent they appear to be present within the USA.

Environment: A broad culture that supports impact measurement. Similar to the UK, nonprofits in the USA are increasingly measuring the impact they have had on the participants of their service and sharing the difficulties of undertaking robust, high quality evaluations.

Data: Individual person-level administrative data. A key difference between the two countries is that, in the USA, personal data on social services tends to be held at a local, rather than central, level. In the UK, social services data such as reoffending, education and employment are collated into a central database. In the USA, the federal government has limited centrally collated personal data; instead, this data can be found at state/city level….

A leading advocate: A Data Lab project team, and strong networks. Data Labs do not manifest by themselves. They require a lead agency to campaign with, and on behalf of, nonprofits to set out a persuasive case for their development. In the USA, we have developed a partnership with the Governance Lab to seek out opportunities where Data Labs can be established, but given the size of the country, there is scope for further collaborations and/or advocates to be identified and supported.

Customers: Identifiable organisations that would use the Data Lab. Initial discussions with several US nonprofits and academia indicate support for a Data Lab in their context. Broad consultation based on an agreed region and outcome(s) will be needed to fully assess the potential customer base.

Data owners: Engaged civil servants. Generating buy-in and persuading various stakeholders, including data owners, analysts and politicians, is a critical part of setting up a Data Lab. While the exact profiles of the right people to approach can only be assessed once a region and outcome(s) of interest have been chosen, there are encouraging signs, such as the passing of the Foundations for Evidence-Based Policy Making Act of 2017 in the House of Representatives, which, among other things, mandates the appointment of “Chief Evaluation Officers” in government departments, suggesting that there is bipartisan support for increased data-driven policy evaluation.

Legal and ethical governance: A legal framework for sharing data. In the UK, all personal data is subject to data protection legislation, which provides standardised governance for how personal data can be processed across the country and within the European Union. A universal data protection framework does not exist within the USA; therefore, data sharing agreements between customers and government data-owners will need to be designed for the purposes of Data Labs, unless there are existing agreements that enable data sharing for research purposes. This will need to be investigated at the state/city level of a desired Data Lab.

Funding: Resource and support for driving the set-up of the Data Lab. Most of our policy lab case studies were funded by a mixture of philanthropy and government grants. It is expected that a similar mixed funding model will need to be created to establish Data Labs. One alternative is the model adopted by the Washington State Institute for Public Policy (WSIPP), which was created by the Washington State Legislature and is funded on a project basis, primarily by the state. Additionally, funding will be needed to enable advocates of a Data Lab to campaign for the service….(More)”.

How Democracy Can Survive Big Data


Colin Koopman in The New York Times: “…The challenge of designing ethics into data technologies is formidable. This is in part because it requires overcoming a century-long ethos of data science: Develop first, question later. Datafication first, regulation afterward. A glimpse at the history of data science shows as much.

The techniques that Cambridge Analytica uses to produce its psychometric profiles are the cutting edge of data-driven methodologies first devised 100 years ago. The science of personality research was born in 1917. That year, in the midst of America’s fevered entry into war, Robert Sessions Woodworth of Columbia University created the Personal Data Sheet, a questionnaire that promised to assess the personalities of Army recruits. The war ended before Woodworth’s psychological instrument was ready for deployment, but the Army had envisioned its use according to the precedent set by the intelligence tests it had been administering to new recruits under the direction of Robert Yerkes, a professor of psychology at Harvard at the time. The data these tests could produce would help decide who should go to the fronts, who was fit to lead and who should stay well behind the lines.

The stakes of those wartime decisions were particularly stark, but the aftermath of those psychometric instruments is even more unsettling. As the century progressed, such tests — I.Q. tests, college placement exams, predictive behavioral assessments — would affect the lives of millions of Americans. Schoolchildren who may have once or twice acted out in such a way as to prompt a psychometric evaluation could find themselves labeled, setting them on an inescapable track through the education system.

Researchers like Woodworth and Yerkes (or their Stanford colleague Lewis Terman, who formalized the first SAT) did not anticipate the deep consequences of their work; they were too busy pursuing the great intellectual challenges of their day, much like Mr. Zuckerberg in his pursuit of the next great social media platform. Or like Cambridge Analytica’s Christopher Wylie, the twentysomething data scientist who helped build psychometric profiles of two-thirds of all Americans by leveraging personal information gained through uninformed consent. All of these researchers were, quite understandably, obsessed with the great data science challenges of their generation. Their failure to consider the consequences of their pursuits, however, is not so much their fault as it is our collective failing.

For the past 100 years we have been chasing visions of data with a singular passion. Many of the best minds of each new generation have devoted themselves to delivering on the inspired data science promises of their day: intelligence testing, building the computer, cracking the genetic code, creating the internet, and now this. We have in the course of a single century built an entire society, economy and culture that runs on information. Yet we have hardly begun to engineer data ethics appropriate for our extraordinary information carnival. If we do not do so soon, data will drive democracy, and we may well lose our chance to do anything about it….(More)”.

The Cambridge Handbook of Consumer Privacy


Handbook by Evan Selinger, Jules Polonetsky, and Omer Tene: “Businesses are rushing to collect personal data to fuel surging demand. Data enthusiasts claim personal information that’s obtained from the commercial internet, including mobile platforms, social networks, cloud computing, and connected devices, will unlock path-breaking innovation, including advanced data security. By contrast, regulators and activists contend that corporate data practices too often disempower consumers by creating privacy harms and related problems. As the Internet of Things matures and facial recognition, predictive analytics, big data, and wearable tracking grow in power, scale, and scope, a controversial ecosystem will exacerbate the acrimony over commercial data capture and analysis. The only productive way forward is to get a grip on the key problems right now and change the conversation….(More)”.