A Roadmap to a Nationwide Data Infrastructure for Evidence-Based Policymaking


Introduction by Julia Lane and Andrew Reamer of a Special Issue of the Annals of the American Academy of Political and Social Science: “Throughout the United States, there is broad interest in expanding the nation’s capacity to design and implement public policy based on solid evidence. That interest has been stimulated by the new types of data that are available that can transform the way in which policy is designed and implemented. Yet progress in making use of sensitive data has been hindered by the legal, technical, and operational obstacles to access for research and evaluation. Progress has also been hindered by an almost exclusive focus on the interest and needs of the data users, rather than the interest and needs of the data providers. In addition, data stewardship is largely artisanal in nature.

There are very real consequences that result from lack of action. State and local governments are often hampered in their capacity to effectively mount and learn from innovative efforts. Although jurisdictions often have treasure troves of data from existing programs, the data are stove-piped, underused, and poorly maintained. The experience reported by one large city public health commissioner is too common: “We commissioners meet periodically to discuss specific childhood deaths in the city. In most cases, we each have a thick file on the child or family. But the only time we compare notes is after the child is dead.”1 In reality, most localities lack the technical, analytical, staffing, and legal capacity to make effective use of existing and emerging resources.

It is our sense that fundamental changes are necessary and a new approach must be taken to building data infrastructures. In particular,

  1. Privacy and confidentiality issues must be addressed at the beginning—not added as an afterthought.
  2. Data providers must be involved as key stakeholders throughout the design process.
  3. Workforce capacity must be developed at all levels.
  4. The scholarly community must be engaged to identify the value to research and policy….

To develop a roadmap for the creation of such an infrastructure, the Bill and Melinda Gates Foundation, together with the Laura and John Arnold Foundation, hosted a day-long workshop of more than sixty experts to discuss the findings of twelve commissioned papers and their implications for action. This volume of The ANNALS showcases those twelve articles. The workshop papers were grouped into three thematic areas: privacy and confidentiality, the views of data producers, and comprehensive strategies that have been used to build data infrastructures in other contexts. The authors and the attendees included computer scientists, social scientists, practitioners, and data producers.

This introductory article places the research in both an historical and a current context. It also provides a framework for understanding the contribution of the twelve articles….(More)”.

Open Data Risk Assessment


Report by the Future of Privacy Forum: “The transparency goals of the open data movement serve important social, economic, and democratic functions in cities like Seattle. At the same time, some municipal datasets about the city and its citizens’ activities carry inherent risks to individual privacy when shared publicly. In 2016, the City of Seattle declared in its Open Data Policy that the city’s data would be “open by preference,” except when doing so may affect individual privacy. To ensure its Open Data Program effectively protects individuals, Seattle committed to performing an annual risk assessment and tasked the Future of Privacy Forum (FPF) with creating and deploying an initial privacy risk assessment methodology for open data.

This Report provides tools and guidance to the City of Seattle and other municipalities navigating the complex policy, operational, technical, organizational, and ethical standards that support privacy-protective open data programs. Although there is a growing body of research regarding open data privacy, open data managers and departmental data owners need to be able to employ a standardized methodology for assessing the privacy risks and benefits of particular datasets internally, without access to a bevy of expert statisticians, privacy lawyers, or philosophers. By optimizing its internal processes and procedures, developing and investing in advanced statistical disclosure control strategies, and following a flexible, risk-based assessment process, the City of Seattle – and other municipalities – can build mature open data programs that maximize the utility and openness of civic data while minimizing privacy risks to individuals and addressing community concerns about ethical challenges, fairness, and equity.

This Report first describes inherent privacy risks in an open data landscape, with an emphasis on potential harms related to re-identification, data quality, and fairness. To address these risks, the Report includes a Model Open Data Benefit-Risk Analysis (“Model Analysis”). The Model Analysis evaluates the types of data contained in a proposed open dataset, the potential benefits – and concomitant risks – of releasing the dataset publicly, and strategies for effective de-identification and risk mitigation. This holistic assessment guides city officials to determine whether to release the dataset openly, in a limited access environment, or to withhold it from publication (absent countervailing public policy considerations). …(More)”.

Urban Big Data: City Management and Real Estate Markets


Report by Richard Barkham, Sheharyar Bokhari and Albert Saiz: “In this report, we discuss recent trends in the application of urban big data and their impact on real estate markets. We expect such technologies to improve quality of life and the productivity of cities over the long run.

We forecast that smart city technologies will reinforce the primacy of the most successful global metropolises at least for a decade or more. A few select metropolises in emerging countries may also leverage these technologies to leapfrog on the provision of local public services.

In the long run, all cities throughout the urban system will end up adopting successful and cost-effective smart city initiatives. Nevertheless, smaller-scale interventions are likely to crop up everywhere, even in the short run. Such targeted programs are more likely to improve conditions in blighted or relatively deprived neighborhoods, which could generate gentrification and higher valuations there. It is unclear whether urban information systems will have a centralizing or suburbanizing impact. They are likely to make denser urban centers more attractive, but they are also bound to make suburban or exurban locations more accessible…(More)”.

They Are Watching You—and Everything Else on the Planet


Cover article by Robert Draper for Special Issue of the National Geographic: “Technology and our increasing demand for security have put us all under surveillance. Is privacy becoming just a memory?…

In 1949, amid the specter of European authoritarianism, the British novelist George Orwell published his dystopian masterpiece 1984, with its grim admonition: “Big Brother is watching you.” As unsettling as this notion may have been, “watching” was a quaintly circumscribed undertaking back then. That very year, 1949, an American company released the first commercially available CCTV system. Two years later, in 1951, Kodak introduced its Brownie portable movie camera to an awestruck public.

Today more than 2.5 trillion images are shared or stored on the Internet annually—to say nothing of the billions more photographs and videos people keep to themselves. By 2020, one telecommunications company estimates, 6.1 billion people will have phones with picture-taking capabilities. Meanwhile, in a single year an estimated 106 million new surveillance cameras are sold. More than three million ATMs around the planet stare back at their customers. Tens of thousands of cameras known as automatic number plate recognition devices, or ANPRs, hover over roadways—to catch speeding motorists or parking violators but also, in the case of the United Kingdom, to track the comings and goings of suspected criminals. The untallied but growing number of people wearing body cameras now includes not just police but also hospital workers and others who aren’t law enforcement officers. Proliferating as well are personal monitoring devices—dash cams, cyclist helmet cameras to record collisions, doorbells equipped with lenses to catch package thieves—that are fast becoming a part of many a city dweller’s everyday arsenal. Even less quantifiable, but far more vexing, are the billions of images of unsuspecting citizens captured by facial-recognition technology and stored in law enforcement and private-sector databases over which our control is practically nonexistent.

Those are merely the “watching” devices that we’re capable of seeing. Presently the skies are cluttered with drones—2.5 million of which were purchased in 2016 by American hobbyists and businesses. That figure doesn’t include the fleet of unmanned aerial vehicles used by the U.S. government not only to bomb terrorists in Yemen but also to help stop illegal immigrants entering from Mexico, monitor hurricane flooding in Texas, and catch cattle thieves in North Dakota. Nor does it include the many thousands of airborne spying devices employed by other countries—among them Russia, China, Iran, and North Korea.

We’re being watched from the heavens as well. More than 1,700 satellites monitor our planet. From a distance of about 300 miles, some of them can discern a herd of buffalo or the stages of a forest fire. From outer space, a camera clicks and a detailed image of the block where we work can be acquired by a total stranger….

This is—to lift the title from another British futurist, Aldous Huxley—our brave new world. That we can see it coming is cold comfort since, as Carnegie Mellon University professor of information technology Alessandro Acquisti says, “in the cat-and-mouse game of privacy protection, the data subject is always the weaker side of the game.” Simply submitting to the game is a dispiriting proposition. But to actively seek to protect one’s privacy can be even more demoralizing. University of Texas American studies professor Randolph Lewis writes in his new book, Under Surveillance: Being Watched in Modern America, “Surveillance is often exhausting to those who really feel its undertow: it overwhelms with its constant badgering, its omnipresent mysteries, its endless tabulations of movements, purchases, potentialities.”

The desire for privacy, Acquisti says, “is a universal trait among humans, across cultures and across time. You find evidence of it in ancient Rome, ancient Greece, in the Bible, in the Quran. What’s worrisome is that if all of us at an individual level suffer from the loss of privacy, society as a whole may realize its value only after we’ve lost it for good.”…(More)”.

Extracting crowd intelligence from pervasive and social big data


Introduction by Leye Wang, Vincent Gauthier, Guanling Chen and Luis Moreira-Matias of Special Issue of the Journal of Ambient Intelligence and Humanized Computing: “With the prevalence of ubiquitous computing devices (smartphones, wearable devices, etc.) and social network services (Facebook, Twitter, etc.), humans are generating massive digital traces continuously in their daily life. Considering the invaluable crowd intelligence residing in these pervasive and social big data, a spectrum of opportunities is emerging to enable promising smart applications for easing individual life, increasing company profit, as well as facilitating city development. However, the nature of big data also poses fundamental challenges to the techniques and applications relying on pervasive and social big data, from multiple perspectives such as algorithm effectiveness, computation speed, energy efficiency, user privacy, server security, data heterogeneity and system scalability. This special issue presents the state-of-the-art research achievements in addressing these challenges. After a rigorous review process by the reviewers and guest editors, eight papers were accepted, as follows.

The first paper “Automated recognition of hypertension through overnight continuous HRV monitoring” by Ni et al. proposes a non-invasive way to differentiate hypertension patients from healthy people with pervasive sensors such as a waist belt. To this end, the authors train a machine learning model based on the heart rate data sensed by waist belts worn by a crowd of people, and the experiments show that the detection accuracy is around 93%.

The second paper “The workforce analyzer: group discovery among LinkedIn public profiles” by Dai et al. describes two methods for discovering user groups among LinkedIn public profiles: one based on K-means and the other on SVM. The authors contrast the results of both methods and provide insights about the trending professional orientations of the workforce from an online perspective.

The third paper “Tweet and followee personalized recommendations based on knowledge graphs” by Pla Karidi et al. presents an efficient semantic recommendation method that helps users filter the Twitter stream for interesting content. The foundation of this method is a knowledge graph that can represent all user topics of interest as a variety of concepts, objects, events, persons, entities, locations and the relations between them. An important advantage of the authors’ method is that it reduces the effects of problems such as over-recommendation and over-specialization.

The fourth paper “CrowdTravel: scenic spot profiling by using heterogeneous crowdsourced data” by Guo et al. proposes CrowdTravel, a multi-source social media data fusion approach for multi-aspect tourism information perception, which can provide travelling assistance for tourists by crowd intelligence mining. Experiments over a dataset of several popular scenic spots in Beijing and Xi’an, China, indicate that the authors’ approach attains fine-grained characterization for the scenic spots and delivers excellent performance.

The fifth paper “Internet of Things based activity surveillance of defence personnel” by Bhatia et al. presents a comprehensive IoT-based framework for analyzing the national integrity of defence personnel in light of their daily activities. Specifically, an Integrity Index Value is defined for each member of the defence personnel based on different social engagements and activities, to detect vulnerabilities to national security. In addition, a probabilistic decision-tree-based automated decision-making procedure is presented to aid defence officials in analyzing the various activities of defence personnel for integrity assessment.

The sixth paper “Recommending property with short days-on-market for estate agency” by Mou et al. proposes an appraisal framework that automatically recommends estates with short days-on-market, using transaction data and profile information crawled from websites. Both the spatial and temporal characteristics of an estate are integrated into the framework. The results show that the proposed framework can accurately identify about 78% of such estates.

The seventh paper “An anonymous data reporting strategy with ensuring incentives for mobile crowd-sensing” by Li et al. proposes a system and a strategy that ensure anonymous data reporting while preserving incentives. The proposed protocol is arranged in five stages that mainly leverage three concepts: (1) slot reservation based on shuffle, (2) data submission based on bulk transfer and multi-player dc-nets, and (3) an incentive mechanism based on blind signatures.

The last paper “Semantic place prediction from crowd-sensed mobile phone data” by Celik et al. semantically classifies places visited by smartphone users, applying machine learning algorithms to data collected from the sensors and wireless interfaces available on the phones, as well as phone usage patterns such as battery level and time-related information. For this study, the authors collected data from 15 participants at Galatasaray University for one month and tried different classification algorithms such as decision tree, random forest, k-nearest neighbour, naive Bayes, and multi-layer perceptron….(More)”.
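The setup Celik et al. describe — predicting a place label from phone-derived features — can be sketched with a toy k-nearest-neighbour classifier. The feature names, values, and place labels below are invented for illustration and are not the study's actual sensor set:

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.
    `train` is a list of (feature_vector, label) pairs."""
    dists = sorted(
        (math.dist(features, query), label) for features, label in train
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Hypothetical normalized features: (battery_level, time_of_day, wifi_density)
train = [
    ((0.9, 0.1, 0.2), "home"),
    ((0.8, 0.2, 0.1), "home"),
    ((0.4, 0.5, 0.9), "work"),
    ((0.3, 0.6, 0.8), "work"),
    ((0.5, 0.9, 0.4), "cafe"),
]

print(knn_predict(train, (0.35, 0.55, 0.85)))  # → "work"
```

Real systems would of course use far richer features and cross-validated model selection, as the paper's comparison of several algorithms suggests.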

Advanced Design for the Public Sector


Essay by Kristofer Kelly-Frere & Jonathan Veale: “…It might surprise some, but it is now common for governments across Canada to employ in-house designers to work on very complex and public issues.

There are design teams giving shape to experiences, services, processes, programs, infrastructure and policies. The Alberta CoLab, the Ontario Digital Service, BC’s Government Digital Experience Division, the Canadian Digital Service, Calgary’s Civic Innovation YYC, and, in partnership with government, MaRS Solutions Lab stand out. The Government of Nova Scotia recently launched the NS CoLab. There are many, many more. Perhaps hundreds.

Design-thinking. Service Design. Systemic Design. Strategic Design. They are part of the same story. Connected by their ability to focus and shape a transformation of some kind. Each is an advanced form of design oriented directly at humanizing legacy systems — massive services built by a culture that increasingly appears out-of-sorts with our world. We don’t need a new design pantheon, we need a unifying force.

We have no shortage of systems that require reform. And no shortage of challenges. Among them, the inability to assemble a common understanding of the problems in the first place, and then a lack of agency over these unwieldy systems. We have fanatics and nativists who believe in simple, regressive and violent solutions. We have a social economy that elevates these marginal voices. We have well-vested interests who benefit from maintaining the status quo and who lack actionable migration paths to new models. The median public may no longer see themselves in liberal democracy. Populism and dogmatism are rampant. The government, in some spheres, is not credible or trusted.

The traditional designer’s niche is narrowing at the same time government itself is becoming fragile. It is already cliche to point out that private wealth and resources allow broad segments of the population to “opt out.” This is quite apparent at the municipal level where privatized sources of security, water, fire protection and even sidewalks effectively produce private shadow governments. Scaling up, the most wealthy may simply purchase residency or citizenship or invest in emerging nation states. Without re-invention this erosion will continue. At the same time artificial intelligence, machine learning and automation are already displacing frontline design and creative work. This is the opportunity: Building systems awareness and agency on the foundations of craft and empathy that are core to human-centered design. Time is of the essence. Transitions from one era to the next are historically tumultuous times. Moreover, these changes proceed faster than expected and in unexpected directions….(More).

How Helsinki uses a board game to promote public participation


Bloomberg Cities: “When mayors talk about “citizen engagement,” two things usually seem clear: It’s a good thing and we need more of it. But defining exactly what citizen engagement means — and how city workers should do it — can be a lot harder than it sounds.

To make the concept real, the city of Helsinki has come up with a creative solution. City leaders made a board game that small teams of managers and front-line staff can play together. As they do so, they learn about dozens of methods for involving citizens in their work, from public meetings to focus groups to participatory budgeting.

It’s called the “Participation Game,” and over the past year, more than 2,000 Helsinki employees from all city departments have played it close to 250 times. Tommi Laitio, who heads the city’s Division of Culture and Leisure, said the game has been a surprise hit with employees because it helps cut through jargon and put public participation in concrete terms they can easily relate to.

“‘Citizen engagement’ is one of those buzzwords that gets thrown around a lot,” Laitio said. “But it means different things to different people. For some, it might mean involving citizens in a co-design process. For others, it might mean answering feedback by email. And there’s a huge difference in ambition between those approaches.”

The game’s rollout comes as Helsinki is overhauling local governance with a goal of making City Hall more responsive to the public. Starting last June, more power is vested in local political leaders, including the mayor, Jan Vapaavuori. More than 30 individual city departments are now consolidated into four. And there’s a deep new focus on involving citizens in decision making. That’s where the board game comes in.

Helsinki’s experiment is part of a wider movement both in and out of government to “gamify” workforce training, service delivery and more….(More)”.

Artificial intelligence and smart cities


Essay by Michael Batty at Urban Analytics and City Sciences: “…The notion of the smart city of course conjures up these images of such an automated future. Much of our thinking about this future, certainly in the more popular press, is about everything ranging from the latest App on our smart phones to driverless cars while somewhat deeper concerns are about efficiency gains due to the automation of services ranging from transit to the delivery of energy. There is no doubt that routine and repetitive processes – algorithms if you like – are improving at an exponential rate in terms of the data they can process and the speed of execution, faithfully following Moore’s Law.

Pattern recognition techniques that lie at the basis of machine learning are highly routinized iterative schemes where the pattern in question – be it a signature, a face, the environment around a driverless car and so on – is computed as an elaborate averaging procedure which takes a series of elements of the pattern and weights them in such a way that the pattern can be reproduced perfectly by the combinations of elements of the original pattern and the weights. This is in essence the way neural networks work. When one says that they ‘learn’ and that the current focus is on ‘deep learning’, all that is meant is that with complex patterns and environments, many layers of neurons (elements of the pattern) are defined and the iterative procedures are run until there is a convergence with the pattern that is to be explained. Such processes are iterative, additive and not much more than sophisticated averaging, but using machines that can operate virtually at the speed of light and thus process vast volumes of big data. When these kinds of algorithms can be run in real time, and many already can be, there is the prospect of many kinds of routine behaviour being displaced. It is in this sense that AI might usher in an era of truly disruptive processes. This, according to Brynjolfsson and McAfee, is beginning to happen as we reach the second half of the chessboard.

The real issue in terms of AI involves problems that are peculiarly human. Much of our work is highly routinized and many of our daily actions and decisions are based on relatively straightforward patterns of stimulus and response. The big questions involve the extent to which those of our behaviours which are not straightforward can be automated. In fact, although machines are able to beat human players in many board games and there is now the prospect of machines beating the very machines that were originally designed to play against humans, the real power of AI may well come from collaboratives of man and machine, working together, rather than ever more powerful machines working by themselves. In the last 10 years, some of my editorials have tracked what is happening in the real-time city – the smart city as it is popularly called – which has become key to many new initiatives in cities. In fact, cities – particularly big cities, world cities – have become the flavour of the month, but the focus has not been on their long-term evolution but on how we use them on a minute-by-minute to week-by-week basis.

Many of the patterns that define the smart city on these short-term cycles can be predicted using AI largely because they are highly routinized but even for highly routine patterns, there are limits on the extent to which we can explain them and reproduce them. Much advancement in AI within the smart city will come from automation of the routine, such as the use of energy, the delivery of location-based services, transit using information being fed to operators and travellers in real time and so on. I think we will see some quite impressive advances in these areas in the next decade and beyond. But the key issue in urban planning is not just this short term but the long term and it is here that the prospects for AI are more problematic….(More)”.
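Batty's characterization of neural-network "learning" as an iterative re-weighting of pattern elements until outputs converge on the target pattern can be illustrated with a minimal single-neuron sketch. This is a deliberately simplified toy (a perceptron learning a trivial two-input pattern), not any system discussed in the essay:

```python
# A single artificial neuron: the prediction is a weighted sum of pattern
# elements pushed through a threshold; "learning" iteratively nudges the
# weights and bias until the outputs converge on the target pattern.
def train_neuron(samples, epochs=20, lr=0.1):
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            predicted = 1 if weights[0] * x1 + weights[1] * x2 + bias > 0 else 0
            error = target - predicted
            weights[0] += lr * error * x1   # nudge each weight toward the target
            weights[1] += lr * error * x2
            bias += lr * error
    return weights, bias

# Toy pattern: output 1 only when both inputs are present (logical AND).
samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_neuron(samples)
predict = lambda x1, x2: 1 if weights[0] * x1 + weights[1] * x2 + bias > 0 else 0
print([predict(x1, x2) for (x1, x2), _ in samples])  # → [0, 0, 0, 1]
```

"Deep learning" stacks many layers of such units, but the core loop — weight, compare, adjust, repeat until convergence — is the "sophisticated averaging" the essay describes.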

Who Owns Urban Mobility Data?


David Zipper at City Lab: “How, exactly, should policymakers respond to the rapid rise of new private mobility services such as ride-hailing, dockless shared bicycles, and microtransit? … The most likely solution is via a data exchange that anonymizes rider data and gives public experts (and perhaps academic and private ones too) the ability to answer policy questions.

This idea is starting to catch on. The World Bank’s OpenTraffic project, founded in 2016, initially developed ways to aggregate traffic information derived from commercial fleets. A handful of private companies like Grab and Easy Taxi pledged their support when OpenTraffic launched. This fall, the project became part of SharedStreets, a collaboration between the National Association of City Transportation Officials (NACTO), the World Resources Institute, and the OECD’s International Transport Forum to pilot new ways of collecting and sharing a variety of public and private transport data. …(More).

People-Led Innovation: Toward a Methodology for Solving Urban Problems in the 21st Century


New Methodology by Andrew Young, Jeffrey Brown, Hannah Pierce, and Stefaan G. Verhulst: “More and more people live in urban settings. At the same time, and often resulting from the growing urban population, cities worldwide are increasingly confronted with complex environmental, social, and economic shocks and stresses. When seeking to develop adequate and sustainable responses to these challenges, cities are realizing that traditional methods and existing resources often fall short.

Addressing 21st century challenges will require innovative approaches.

“People-Led Innovation: Toward a Methodology for Solving Urban Problems in the 21st Century,” is a new methodology by The GovLab and Bertelsmann Foundation aimed at empowering public entrepreneurs, particularly city-level government officials, to engage the capacity and expertise of people in solving major public challenges. This guide focuses on unlocking an undervalued asset for innovation and the co-creation of solutions: people and their expertise…

Designed for city officials, and others seeking ways to improve people’s lives, the methodology provides:

  • A phased approach that helps leaders develop solutions in an iterative manner that is more effective and legitimate, by placing people, and groups of people, at the center of all stages of the problem-solving process, including problem definition, ideation, experimentation, and iteration.
  • A flexible framework that, instead of offering rigid prescriptions, provides suggested checklists to probe a more people-led approach when developing innovative solutions to urban challenges.
  • A matrix to determine what kind of engagement (e.g., commenting, co-creating, reviewing, and/or reporting), and by whom (e.g., community-based organizations, residents, foundation partners, among others) is most appropriate at what stage of the innovation lifecycle.
  • A curation of inspirational examples, set at each phase of the methodology, where public entrepreneurs and others have sought to create positive impacts by engaging people in practice….(More)”.