Introducing the Contractual Wheel of Data Collaboration


Blog by Andrew Young and Stefaan Verhulst: “Earlier this year we launched the Contracts for Data Collaboration (C4DC) initiative — an open collaborative with charter members from The GovLab, UN SDSN Thematic Research Network on Data and Statistics (TReNDS), University of Washington and the World Economic Forum. C4DC seeks to address the inefficiencies of developing contractual agreements for public-private data collaboration by developing and making available a shared repository of relevant contractual clauses taken from existing legal agreements, informing and guiding those seeking to establish a data collaborative. Today TReNDS published “Partnerships Founded on Trust,” a brief capturing some initial findings from the C4DC initiative.


The Contractual Wheel of Data Collaboration [beta] — Stefaan G. Verhulst and Andrew Young, The GovLab

As part of the C4DC effort, and to support Data Stewards in the private sector and decision-makers in the public and civil sectors seeking to establish Data Collaboratives, The GovLab developed the Contractual Wheel of Data Collaboration [beta]. The Wheel seeks to capture key elements involved in data collaboration while demystifying contracts and moving beyond the type of legalese that can create confusion and barriers to experimentation.

The Wheel was developed based on an assessment of existing legal agreements, engagement with The GovLab-facilitated Data Stewards Network, and analysis of the key elements of our Data Collaboratives Methodology. It features 22 legal considerations organized across 6 operational categories that can act as a checklist for the development of a legal agreement between parties participating in a Data Collaborative:…(More)”.
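
Because the Wheel is designed to function as a checklist, a team drafting an agreement could encode it as a simple machine-readable structure and flag what a draft has not yet addressed. A minimal sketch in Python — the category and clause names below are illustrative placeholders, not the Wheel's actual 6 categories or 22 considerations:

```python
# Placeholder checklist: these categories and items are hypothetical,
# standing in for the Wheel's real operational categories and considerations.
CHECKLIST = {
    "Parties and roles": ["Data provider", "Data user", "Intermediaries"],
    "Data specification": ["Scope of data shared", "Format and granularity"],
    "Use and access": ["Permitted purposes", "Access controls"],
    "Governance": ["Oversight body", "Audit rights"],
}

def unresolved(agreement: dict) -> list[str]:
    """Return checklist items a draft agreement does not yet address."""
    return [
        f"{category}: {item}"
        for category, items in CHECKLIST.items()
        for item in items
        if item not in agreement.get(category, [])
    ]

draft = {"Parties and roles": ["Data provider", "Data user"]}
print(unresolved(draft))  # everything still open in this early draft
```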

Data-driven models of governance across borders


Introduction to Special Issue of FirstMonday, edited by Payal Arora and Hallam Stevens: “This special issue looks closely at contemporary data systems in diverse global contexts and, through this set of papers, highlights the struggles we face as we negotiate efficiency and innovation with universal human rights and social inclusion. The studies presented in these essays are situated in diverse models of policy-making, governance, and/or activism across borders. Attention to big data governance in western contexts has tended to highlight how data increases state and corporate surveillance of citizens, affecting rights to privacy. By moving beyond Euro-American borders — to places such as Africa, India, China, and Singapore — we show here how data regimes are motivated and understood on very different terms….

To establish a kind of baseline, the special issue opens by considering attitudes toward big data in Europe. René König’s essay examines the role of “citizen conferences” in understanding the public’s view of big data in Germany. These “participatory technology assessments” demonstrated that citizens were concerned about the control of big data (should it be under the control of the government or individuals?), about the need for more education about big data technologies, and the need for more government regulation. Participants expressed, in many ways, traditional liberal democratic views and concerns about these technologies centered on individual rights, individual responsibilities, and education. Their proposed solutions too — more education and more government regulation — fit squarely within western liberal democratic traditions.

In contrast to this, Payal Arora’s essay draws us immediately into the vastly different contexts of data governance in India and China. India’s Aadhaar biometric identification system, through tracking its citizens with iris scanning and other measures, promises to root out corruption and provide social services to those most in need. Likewise, China’s emerging “social credit system,” while having immense potential for increasing citizen surveillance, offers ways of increasing social trust and fostering more responsible social behavior online and offline. Although the potential for authoritarian abuses of both systems is high, Arora focuses on how these technologies are locally understood and lived on an everyday basis, in ways that range from empowering to oppressing their people. From this perspective, the technologies offer modes of “disrupt[ing] systems of inequality and oppression” that should open up new conversations about what democratic participation can and should look like in China and India.

If China and India offer contrasting non-democratic and democratic cases, we turn next to a context that is neither completely western nor completely non-western, neither completely democratic nor completely liberal. Hallam Stevens’ account of government data in Singapore suggests the very different role that data can play in this unique political and social context. Although the island state’s data.gov.sg participates in global discourses of sharing, “open data,” and transparency, much of the data made available by the government is oriented towards the solution of particular economic and social problems. Ultimately, the ways in which data are presented may contribute to entrenching — rather than undermining or transforming — existing forms of governance. The account of data and its meanings that is offered here once again challenges the notion that such data systems can or should be understood in the same ways that similar systems have been understood in the western world.

If systems such as Aadhaar, “social credit,” and data.gov.sg profess to make citizens and governments more visible and legible, Rolien Hoyng examines what may remain invisible even within highly pervasive data-driven systems. In the world of e-waste, data-driven modes of surveillance and logistics are critical for recycling. But many blind spots remain. Hoyng’s account reminds us that despite the often-supposed all-seeing-ness of big data, we should remain attentive to what escapes the data’s gaze. Here, in the midst of datafication, we find “invisibility, uncertainty, and, therewith, uncontrollability.” This points also to the gap between the fantasies of how data-driven systems are supposed to work, and their realization in the world. Such interstices allow individuals — those working with e-waste in Shenzhen or Africa, for example — to find and leverage hidden opportunities. From this perspective, the “blind spots of big data” take on a very different significance.

Big data systems provide opportunities for some, but reduce those for others. Mark Graham and Mohammad Amir Anwar examine what happens when online outsourcing platforms create a “planetary labor market.” Although providing opportunities for many people to make money via their Internet connection, Graham and Anwar’s interviews with workers across sub-Saharan Africa demonstrate how “platform work” alters the balance of power between labor and capital. For many low-wage workers across the globe, the platform- and data-driven planetary labor market means downward pressure on wages, fewer opportunities to collectively organize, less worker agency, and less transparency about the nature of the work itself. Moving beyond bold pronouncements that the “world is flat” and that big data is empowering, Graham and Anwar show how data-driven systems of employment can act to reduce opportunities for those residing in the poorest parts of the world. The affordances of data and platforms create a planetary labor market for global capital but tie workers ever-more tightly to their own localities. Once again, the valences of global data systems look very different from this “bottom-up” perspective.

Philippa Metcalfe and Lina Dencik shift this conversation from the global movement of labor to that of people, as they write about the implications of European datafication systems for the governance of refugees entering this region. This work highlights how intrinsic to datafication systems is the classification, coding, and collating of people to legitimize the extent of their belonging in the society they seek to live in. The authors argue that these datafied regimes of power have substantively increased their role in regulating human mobility under the guise of national security. These means of data surveillance can foster new forms of containment and entrapment of entire groups of people, creating further divides between “us” and “them.” Through vast interoperable databases, digital registration processes, biometric data collection, and social media identity verification, refugees have become some of the most monitored groups at a global level while, at the same time, their struggles remain the most invisible in popular discourse….(More)”.

Privacy-Preserved Data Sharing for Evidence-Based Policy Decisions: A Demonstration Project Using Human Services Administrative Records for Evidence-Building Activities


Paper by the Bipartisan Policy Center: “Emerging privacy-preserving technologies and approaches hold considerable promise for improving data privacy and confidentiality in the 21st century. At the same time, more information is becoming accessible to support evidence-based policymaking.

In 2017, the U.S. Commission on Evidence-Based Policymaking unanimously recommended that further attention be given to the deployment of privacy-preserving data-sharing applications. If these types of applications can be tested and scaled in the near-term, they could vastly improve insights about important policy problems by using disparate datasets. At the same time, the approaches could promote substantial gains in privacy for the American public.

There are numerous ways to engage in privacy-preserving data sharing. This paper primarily focuses on secure computation, which allows information to be accessed securely, guarantees privacy, and permits analysis without making private information available. Three key issues motivated the launch of a domestic secure computation demonstration project using real government-collected data:

  • Using new privacy-preserving approaches addresses pressing needs in society. Current widely accepted approaches to managing privacy risks—like preventing the identification of individuals or organizations in public datasets—will become less effective over time. While there are many practices currently in use to keep government-collected data confidential, they often fail to incorporate modern developments in computer science, mathematics, and statistics in a timely way. New approaches can enable researchers to combine datasets to improve the capability for insights, without being impeded by traditional concerns about bringing large, identifiable datasets together. In fact, if these approaches succeed, traditional methods of combining data for analysis may no longer be necessary.
  • There are emerging technical applications to deploy certain privacy-preserving approaches in targeted settings. These emerging procedures are increasingly enabling larger-scale testing of privacy-preserving approaches across a variety of policy domains, governmental jurisdictions, and agency settings to demonstrate the privacy guarantees that accompany data access and use.
  • Widespread adoption and use by public administrators will only follow meaningful and successful demonstration projects. For example, secure computation approaches are complex and can be difficult to understand for those unfamiliar with their potential. Implementing new privacy-preserving approaches will require thoughtful attention to public policy implications, public opinions, legal restrictions, and other administrative limitations that vary by agency and governmental entity.

This project used real-world government data to illustrate the applicability of secure computation compared to the classic data infrastructure available to some local governments. The project took place in a domestic, non-intelligence setting to increase the salience of potential lessons for public agencies….(More)”.
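
To make “analysis without making private information available” concrete: one common building block of secure computation is additive secret sharing, in which each party splits its private value into random shares so that only the aggregate statistic can ever be reconstructed. A minimal sketch in Python — the agencies and counts are hypothetical, and this illustrates the general technique rather than the demonstration project's actual software, which the excerpt does not specify:

```python
import random

PRIME = 2**61 - 1  # modulus; all share arithmetic happens in this finite field

def share(value, n_parties):
    """Split a private value into n additive shares that sum to it mod PRIME."""
    shares = [random.randrange(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

# Hypothetical: three agencies each hold a caseload count they cannot disclose.
private_counts = {"agency_a": 1200, "agency_b": 860, "agency_c": 430}
n = len(private_counts)

# Each agency distributes one share to every participant; no single share
# (nor any subset smaller than all n) reveals anything about a raw count.
all_shares = [share(v, n) for v in private_counts.values()]

# Each participant locally sums the shares it received...
partial_sums = [sum(column) % PRIME for column in zip(*all_shares)]

# ...and only the recombined total -- the statistic itself -- is revealed.
total = sum(partial_sums) % PRIME
print(total)  # 2490: the joint count, computed without pooling raw records
```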

Data: The Lever to Promote Innovation in the EU


Blog Post by Juan Murillo Arias: “…But in order for data to truly become a lever that foments innovation in benefit of society as a whole, we must understand and address the following factors:

1. Disconnected, disperse sources. As users of digital services (transportation, finance, telecommunications, news or entertainment) we leave a different digital footprint for each service that we use. These footprints, which are different facets of the same polyhedron, can even be contradictory on occasion. For this reason, they must be seen as complementary. Analysts should be aware that they must cross-reference data sources from different origins in order to create a reliable picture of our preferences; otherwise we will be basing decisions on partial or biased information. How many times do we receive advertising for items we have already purchased, or tourist destinations where we have already been? And this is just one example of digital marketing. When scoring financial solvency, or monitoring health, the more complete the digital picture of the person is, the more accurate the diagnosis will be.

Furthermore, from the user’s standpoint, proper management of their entire, disperse digital footprint is a challenge, and centralized consent could be very beneficial. In the financial world, the PSD2 regulations have already forced banks to open this information to other banks if customers so desire. The purpose is to foster competition and facilitate portability, but this opening up has also enabled the development of new information-aggregation services that are very useful to financial services users. It would be ideal if this step of breaking down barriers and moving toward a more transparent market took place simultaneously in all sectors, in order to avoid possible distortions to competition and, by extension, consumer harm. Customer consent would thus open the door to building a more accurate picture of our preferences.

2. The public and private sectors’ asymmetric capacity to gather data. This is related to citizens using public services less frequently than private services in the new digital channels. However, governments could benefit from the information possessed by private companies. These anonymous, aggregated data can help to ensure more dynamic public management. Even personal data could open the door to customized education or healthcare on an individual level. In order to analyze all of this, the European Commission has created a working group of 23 experts. The purpose is to come up with a series of recommendations regarding the best legal, technical and economic framework to encourage this information transfer across sectors.

3. The lack of incentives for companies and citizens to encourage the reuse of their data. The reality today is that most companies use these sources solely internally. Only a few have decided to explore data sharing through different models (for academic research or for the development of commercial services). As a result of this and other factors, the public sector largely continues using the survey method to gather information instead of reading the digital footprint citizens produce. Multiple studies have demonstrated that this digital footprint would be useful for describing socioeconomic dynamics and monitoring the evolution of official statistical indicators. However, these studies have rarely gone on to become pilot projects due to the lack of incentives for a private company to open up to the public sector, or to society in general, in a way that makes this new activity sustainable.

4. Limited commitment to the diversification of services. Another barrier is the fact that information-based product development is somewhat removed from the type of services that the main data generators (telecommunications, banks, commerce, electricity, transportation, etc.) traditionally provide. Therefore, these data-based initiatives are not part of their main business and are more closely tied to companies’ innovation areas, where exploratory proofs of concept are often not consolidated as a new line of business.

5. Bidirectionality. Data should also flow from the public sector to the rest of society. The first regulatory framework was created for this purpose. Although it is still very recent (the PSI Directive on the re-use of public sector data was passed in 2013), it is currently being revised in an attempt to foster the consolidation of an open data ecosystem that emanates from the public sector as well. On the one hand, this would enable greater transparency; on the other, the development of solutions to improve multiple fields in which public actors are key, such as the environment, transportation and mobility, health, education, justice and the planning and execution of public works. Special emphasis will be placed on high-value datasets, such as statistical or geospatial data — data with tremendous potential to accelerate the emergence of a wide variety of information-based products and services that add value. The Commission will begin working with the Member States to identify these datasets.

In its report, Creating Value through Open Data, the European Data Portal estimates that government agencies making their data accessible will inject an extra €65 billion into the EU economy this year.

6. The commitment to analytical training and financial incentives for innovation. These are the key factors that have given rise to the digital unicorns that have emerged, more so in the U.S. and China than in Europe….(More)”

New York City ‘Open Data’ Paves Way for Innovative Technology


Leo Gringut at the International Policy Digest: “The philosophy behind “Open Data for All” turns on the idea that easy access to government data offers everyday New Yorkers the chance to grow and innovate: “Data is more than just numbers – it’s information that can create new opportunities and level the playing field for New Yorkers. It’s the illumination that changes frameworks, the insight that turns impenetrable issues into solvable problems.” Fundamentally, the newfound accessibility of City data is revolutionizing NYC business. According to Albert Webber, Program Manager for Open Data, City of New York, a key part of his job is “to engage the civic technology community that we have, which is very strong, very powerful in New York City.”

Fundamentally, Open Data is a game-changer for hundreds of New York companies, from startups to corporate giants, all of whom rely on data for their operations. The effect is set to be particularly profound in New York City’s most important economic sector: real estate. Seeking to transform the real estate and construction market in the City, valued at a record-setting $1 trillion in 2016, companies have been racing to develop tools that will harness the power of Open Data to streamline bureaucracy and management processes.

One such technology is the Citiscape app. Developed by a passionate team of real estate experts with more than 15 years of experience in the field, the app assembles data from the Department of Buildings and the Environmental Control Board into one easy-to-navigate interface. According to Citiscape Chief Operational Officer Olga Khaykina, the secret is in the app’s simplicity, which puts every aspect of project management at the user’s fingertips. “We made DOB and ECB just one tap away,” said Khaykina. “You’re one tap away from instant and accurate updates and alerts from the DOB that will keep you informed about any changes to ongoing projects. One tap away from organized and cloud-saved projects, including accessible and coordinated interaction with all team members through our in-app messenger. And one tap away from uncovering technical information about any building in NYC, just by entering its address.” Gone are the days of continuously refreshing the DOB website in hopes of an update on a minor complaint or a status change regarding your project; Citiscape does the busywork so you can focus on your project.
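
For readers curious how apps like Citiscape consume this information: NYC Open Data exposes its datasets through the Socrata (SODA) API, which any developer can query over HTTPS. A minimal sketch in Python — the dataset id and column names below are assumptions for illustration only; the current DOB dataset identifiers should be looked up on data.cityofnewyork.us:

```python
import requests

# Assumed dataset id for DOB Job Application Filings -- verify on the portal.
URL = "https://data.cityofnewyork.us/resource/ic3t-wcy2.json"

params = {
    "$limit": 25,                         # SODA paging parameter
    "$order": "latest_action_date DESC",  # assumed column name
}
rows = requests.get(URL, params=params, timeout=30).json()

for row in rows:
    # Field names here are assumptions based on typical DOB schemas.
    print(row.get("job__"), row.get("job_status_descrp"), row.get("street_name"))
```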

The Citiscape team emphasized that, without access to Open Data, this project would have been impossible….(More)”.

New Data Tools Connect American Workers to Education and Job Opportunities


Department of Commerce: “These are the real stories of the people who recently participated in the Census Bureau initiative called The Opportunity Project—a novel, collaborative effort between government agencies, technology companies, and nongovernment organizations to translate government open data into user-friendly tools that solve real-world problems for families, communities, and businesses nationwide. On March 1, they came together to share their projects at The Opportunity Project’s Demo Day. Projects like theirs help veterans, aspiring technologists, and all Americans connect with career and educational opportunities, as Bryan and Olivia did.

One barrier for many American students and workers is the lack of clear data to help match them with educational opportunities and jobs. Students want information on the best courses that lead to high-paying, high-demand jobs. Job seekers want to find the jobs that best match their skills, or where to find new skills that open up career development opportunities. Despite the increasing availability of big data and the long-standing, highly regarded federal statistical system, significant data gaps remain around basic labor market questions.

  • What is the payoff of a bachelor’s degree versus an apprenticeship, 2-year degree, industry certification, or other credential?
  • What are the jobs of the future?  Which jobs of today also will be the jobs of the future? What skills and experience do companies value most?

The Opportunity Project brings government, communities, and companies like IBM, the veteran-led Shift.org, and Nepris together to create tools to answer simple questions related to education, employment, health, transportation, housing, and many other matters that are critical to helping Americans advance in their lives and careers….(More)”.

PayStats helps assess the impact of the low-emission area Madrid Central


BBVA API Market: “How do town-planning decisions affect a city’s routines? How can data help assess and make decisions? The granularity and detailed information offered by PayStats allowed Madrid’s city council to draw a more accurate map of consumer behavior and gain an objective measurement of the impact of the traffic restriction measures on commercial activity.

In this case, 20 million aggregate and anonymized transactions with BBVA cards and any other card at BBVA POS terminals were analyzed to study the effect of the changes made by Madrid’s city council to road access to the city center.

The BBVA PayStats API is targeted at all kinds of organizations, including the public sector, as in this case. Madrid’s city council used it to find out how restricting car access to Madrid Central impacted Christmas shopping. From information gathered between December 1, 2018 and January 7, 2019, a comparison was made between data from the last two Christmas seasons, contrasting the revenue increase in Madrid Central (Gran Vía and five subareas) with the increase across the entire city.

According to the report drawn up by council experts, 5.984 billion euros were spent across the city. The sample shows a 3.3% increase in spending in Madrid when compared to the same time the previous year; this goes up to 9.5% in Gran Vía and reaches 8.6% in the central area….(More)”.
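
The underlying comparison is a straightforward year-over-year growth calculation on aggregated card spending. A sketch in Python with pandas — the city total (5,984 million euros) is from the report, but the area-level and prior-year figures are hypothetical values chosen to reproduce the quoted growth rates, since the raw PayStats output is not shown here:

```python
import pandas as pd

# Illustrative frame: only the 2018 city total comes from the report;
# the other figures are back-computed placeholders (millions of euros).
spend = pd.DataFrame({
    "area": ["Madrid (entire city)", "Gran Vía", "Madrid Central"],
    "christmas_2017": [5792.8, 101.0, 455.0],
    "christmas_2018": [5984.0, 110.6, 494.1],
})

# Year-over-year growth, expressed as a percentage.
spend["yoy_pct"] = 100 * (spend["christmas_2018"] / spend["christmas_2017"] - 1)
print(spend[["area", "yoy_pct"]].round(1))  # ~3.3, 9.5, 8.6
```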

The Palgrave Handbook of Global Health Data Methods for Policy and Practice


Book edited by Sarah B. Macfarlane and Carla AbouZahr: “This handbook compiles methods for gathering, organizing and disseminating data to inform policy and manage health systems worldwide. Contributing authors describe national and international structures for generating data and explain the relevance of ethics, policy, epidemiology, health economics, demography, statistics, geography and qualitative methods to describing population health. The reader, whether a student of global health, public health practitioner, programme manager, data analyst or policymaker, will appreciate the methods, context and importance of collecting and using global health data….(More)”.

Toward an Open Data Bias Assessment Tool Measuring Bias in Open Spatial Data


Working Paper by Ajjit Narayanan and Graham MacDonald: “Data is a critical resource for government decisionmaking, and in recent years, local governments, in a bid for transparency, community engagement, and innovation, have released many municipal datasets on publicly accessible open data portals. At the same time, advocates, reporters, and others have voiced concerns about the bias of algorithms used to guide public decisions and the data that power them.

Although significant progress is being made in developing tools for algorithmic bias and transparency, we could not find any standardized tools available for assessing bias in open data itself. In other words, how can policymakers, analysts, and advocates systematically measure the level of bias in the data that power city decisionmaking, whether an algorithm is used or not?

To fill this gap, we present a prototype of an automated bias assessment tool for geographic data. This new tool will allow city officials, concerned residents, and other stakeholders to quickly assess the bias and representativeness of their data. The tool allows users to upload a file with latitude and longitude coordinates and receive simple metrics of spatial and demographic bias across their city.

The tool is built on geographic and demographic data from the Census and assumes that the population distribution in a city represents the “ground truth” of the underlying distribution in the data uploaded. To provide an illustrative example of the tool’s use and output, we test our bias assessment on three datasets—bikeshare station locations, 311 service request locations, and Low Income Housing Tax Credit (LIHTC) building locations—across a few hand-selected example cities….(More)”
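
One way to implement the core of such a comparison: spatially join the uploaded coordinates to census tracts, then compare each tract's share of points against its share of population — for instance with a dissimilarity index, where 0 means the data mirror the population and values near 1 indicate severe spatial bias. A sketch in Python with geopandas; the file names and the "pop" column are placeholders, and this illustrates the general approach rather than the authors' exact metrics:

```python
import geopandas as gpd
import pandas as pd

# Placeholders: points.csv holds the uploaded lat/lon pairs; tracts.geojson
# carries tract geometries with a total-population column named "pop".
pts = pd.read_csv("points.csv")
points = gpd.GeoDataFrame(
    pts, geometry=gpd.points_from_xy(pts["lon"], pts["lat"]), crs="EPSG:4326"
)
tracts = gpd.read_file("tracts.geojson").to_crs("EPSG:4326")

# Assign each point to the census tract that contains it.
joined = gpd.sjoin(points, tracts, predicate="within")
counts = joined.groupby("index_right").size()
tracts["n_points"] = tracts.index.map(counts).fillna(0)

# Each tract's share of the data vs. its share of the population.
p = tracts["n_points"] / tracts["n_points"].sum()
q = tracts["pop"] / tracts["pop"].sum()

# Dissimilarity index: 0 = data mirror the population; 1 = maximal bias.
print(f"spatial bias (dissimilarity index): {0.5 * (p - q).abs().sum():.3f}")
```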

Circular City Data


First Volume of Circular City, A Research Journal by New Lab edited by André Corrêa d’Almeida: “…Circular City Data is the topic being explored in the first iteration of New Lab’s The Circular City program, which looks at data and knowledge as the energy, flow, and medium of collaboration. Circular data refers to the collection, production, and exchange of data, and business insights, between a series of collaborators around a shared set of inquiries. In some scenarios, data may be produced by start-ups and of high value to the city; in other cases, data may be produced by the city and of potential value to the public, start-ups, or enterprise companies. The conditions that need to be in place to safely, ethically, and efficiently extrapolate the highest potential value from data are what this program aims to uncover.

Similar to living systems, urban systems can be enhanced if the total pool of data available, i.e., energy, can be democratized and decentralized and data analytics used widely to positively impact quality of life. The abundance of data available, the vast differences in capacity across organizations to handle it, and the growing complexity of urban challenges provide an opportunity to test how principles of circular city data can help establish new forms of public and private partnerships that make cities more economically prosperous, livable, and resilient. Though we talk of an overabundance of data, it is often still not visible or tactically wielded at the local level in a way that benefits people.

Circular City Data is an effort to build a safe environment whereby start-ups, city agencies, and larger firms can collect, produce, access and exchange data, as well as business insights, through transaction mechanisms that do not necessarily require currency, i.e., through reciprocity. Circular data is data that travels across a number of stakeholders, helping to deliver insights and make clearer the opportunities where such stakeholders can work together to improve outcomes. It includes cases where a set of “circular” relationships need to be in place in order to produce such data and business insights. For example, if an AI company lacks access to raw data from the city, it won’t be able to provide valuable insights to the city. Or, Numina required an established relationship with the DBP in order to access infrastructure necessary for them to install their product and begin generating data that could be shared back with them.

Next, the case study documents and explains how The Circular City program was conceived, designed, and implemented, with the goal of offering lessons for scalability at New Lab and replicability in other cities around the world. The three papers that follow investigate and methodologically test the value of circular data applied to three different, but related, urban challenges: economic growth, mobility, and resilience. At the end, the conclusion offers a meta-analysis of the value of circular city data for the future of cities and presents, in integrated form, the tools developed in each paper that can be used to implement and scale up a circular city program…(More).

Contents

  • Introduction to The Circular City Research Program (André Corrêa d’Almeida)
  • The Circular City Program: The Case Study (André Corrêa d’Almeida and Caroline McHeffey)  
  • Circular Data for a Circular City: Value Propositions for Economic Development (Stefaan G. Verhulst, Andrew Young, and Andrew J. Zahuranec)  
  • Circular Data for a Circular City: Value Propositions for Mobility (Arnaud Sahuguet)
  • Circular Data for a Circular City: Value Propositions for Resilience and Sustainability (Nilda Mesa)
  • Conclusion (André Corrêa d’Almeida)