The Everyday Life of an Algorithm


Book by Daniel Neyland: “This open access book begins with an algorithm–a set of IF…THEN rules used in the development of a new, ethical, video surveillance architecture for transport hubs. Readers are invited to follow the algorithm over three years, charting its everyday life. Questions of ethics, transparency, accountability and market value must be grasped by the algorithm in a series of ever more demanding forms of experimentation. Here the algorithm must prove its ability to get a grip on everyday life if it is to become an ordinary feature of the settings where it is being put to work. Through investigating the everyday life of the algorithm, the book opens a conversation with existing social science research that tends to focus on the power and opacity of algorithms. In this book we have unique access to the algorithm’s design, development and testing, but can also bear witness to its fragility and dependency on others….(More)”.

The global race is on to build ‘City Brains’


Prediction by Geoff Mulgan, Eva Grobbink and Vincent Straub: “The USSR’s launch of the Sputnik 1 satellite in 1957 was a major psychological blow to the United States. The US had believed it was technologically far ahead of its rival, but was confronted with proof that the USSR was pulling ahead in some fields. After a bout of soul-searching the country responded with extraordinary vigour, massively increasing investment in space technologies and promising to put a man on the Moon by the end of the 1960s.

In 2019, China’s success in smart cities could prompt a similar “Sputnik Moment” for the rest of the world. It may not be as dramatic as that of 1957. But unlike beeping satellites and Moon landings, it could be coming to a town near you….

The concept of a “smart city” has been around for several decades, often associated with hype, grandiose failures, and an overemphasis on hardware rather than people (Nesta has previously written on how we can rethink smart cities and ensure digital innovation realises the potential of technology and people). But various technologies are now coming of age which bring the vision of a smart city closer to fruition. China is in the forefront, investing heavily in sensors and infrastructures, and its ET City Brain project shows just how far the country’s thinking has progressed.

First launched in September 2016, ET City Brain is a collaboration between Chinese technology giant Alibaba and several cities. It was first trialled in Hangzhou, the hometown of Alibaba’s executive chairman, Jack Ma, but has since expanded to other Chinese cities. Earlier this year, Kuala Lumpur became the first city outside of China to import the ET City Brain model.

The ET City Brain system gathers large amounts of data (including logs, videos, and data streams) from sensors. These are then processed by algorithms in supercomputers and fed back into control centres around the city for administrators to act on—in some cases, automation means the system works without any human intervention at all.

So far, the project has been used to monitor congestion in Hangzhou, improve the response of emergency services in Guangzhou, and detect traffic accidents in Suzhou. In Hangzhou, Alibaba was given control of 104 traffic light junctions in the city’s Xiaoshan district and tasked with managing traffic flows. By combining mass video surveillance with live data from public transportation systems, ET City Brain was able to autonomously change traffic lights so that emergency vehicles could travel to accident scenes without interruption. As a result, arrival times for ambulances improved by 49 percent….(More)”.

Trusting Intelligent Machines: Deepening Trust Within Socio-Technical Systems


Peter Andras et al in IEEE Technology and Society Magazine: “Intelligent machines have reached capabilities that go beyond a level that a human being can fully comprehend without sufficiently detailed understanding of the underlying mechanisms. The choice of moves in the game of Go generated by DeepMind’s AlphaGo Zero is an impressive example of an artificial intelligence system calculating results that even a human expert for the game can hardly retrace. But this is, quite literally, a toy example. In reality, intelligent algorithms are encroaching more and more into our everyday lives, be it through algorithms that recommend products for us to buy, or whole systems such as driverless vehicles. We are delegating ever more aspects of our daily routines to machines, and this trend looks set to continue in the future. Indeed, continued economic growth is set to depend on it. The nature of human-computer interaction in the world that the digital transformation is creating will require (mutual) trust between humans and intelligent, or seemingly intelligent, machines. But what does it mean to trust an intelligent machine? How can trust be established between human societies and intelligent machines?…(More)”.

Innovations In The Fight Against Corruption In Latin America


Blog Post by Beth Noveck:  “…The Inter-American Development Bank (IADB) has published an important, practical and prescriptive report with recommendations for every sector of society from government to individuals on innovative and effective approaches to combatting corruption. While focused on Latin America, the report’s proposals, especially those on the application of new technology in the fight against corruption, are relevant around the world….


The recommendations about the use of new technologies, including big data, blockchain and collective intelligence, are drawn from an effort undertaken last year by the Governance Lab at New York University’s Tandon School of Engineering to crowdsource such solutions and advice on how to implement them from a hundred global experts. (See the Smarter Crowdsourcing against Corruption report here.)…

Big data, when published as open data, namely in a form that can be re-used without legal or technical restriction and in a machine-readable format that computers can analyze, is another tool in the fight against corruption. With machine readable, big and open data, those outside of government can pinpoint and measure irregularities in government contracting, as Instituto Observ is doing in Brazil.

Opening up judicial data, such as case processing times, judges’ and prosecutors’ salaries, and information about selection processes (CVs, professional and academic backgrounds, and written and oral exam scores), provides activists and reformers with the tools to fight judicial corruption. The Civil Association for Equality and Justice (ACIJ) (a non-profit advocacy group) in Argentina uses such open justice data in its Concursos Transparentes (Transparent Contests) to fight judicial corruption. Jusbrasil is a private open justice company also using open data to reform the courts in Brazil….(More)”

On the privacy-conscientious use of mobile phone data


Yves-Alexandre de Montjoye et al in Nature: “The breadcrumbs we leave behind when using our mobile phones—who we call, for how long, and from where—contain unprecedented insights about us and our societies. Researchers have compared the recent availability of large-scale behavioral datasets, such as the ones generated by mobile phones, to the invention of the microscope, giving rise to the new field of computational social science.

With mobile phone penetration rates reaching 90% and under-resourced national statistical agencies, the data generated by our phones—traditional Call Detail Records (CDR) but also high-frequency x-Detail Record (xDR)—have the potential to become a primary data source to tackle crucial humanitarian questions in low- and middle-income countries. For instance, they have already been used to monitor population displacement after disasters, to provide real-time traffic information, and to improve our understanding of the dynamics of infectious diseases. These data are also used by governmental and industry practitioners in high-income countries.
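A traditional CDR is, in essence, a timestamped record of which antenna handled a call or text, while xDRs extend this to higher-frequency events such as data sessions. A minimal sketch of such a record (field names are purely illustrative, not any operator’s actual schema):

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class CallDetailRecord:
    """Minimal, illustrative CDR: one row per call or SMS event."""
    caller_id: str        # pseudonymous subscriber identifier
    callee_id: str        # pseudonymous identifier of the other party
    timestamp: datetime   # when the event started
    duration_s: int       # call length in seconds (0 for SMS)
    antenna_id: str       # cell tower that handled the event (coarse location)

# xDRs log location far more often than calls alone, e.g. one record
# per data session, which is what makes them so rich for mobility research.
record = CallDetailRecord("A17", "B42", datetime(2015, 3, 1, 9, 30), 124, "tower_0081")
print(record.antenna_id)
```

Note that even without names or phone numbers, the antenna field ties every event to a coarse location, which is what drives both the humanitarian value and the privacy risks discussed below.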

While there is little doubt on the potential of mobile phone data for good, these data contain intimate details of our lives: rich information about our whereabouts, social life, preferences, and potentially even finances. A BCG study showed, for example, that 60% of Americans consider location data and phone number history—both available in mobile phone data—as “private”.

Historically and legally, the balance between the societal value of statistical data (in aggregate) and the protection of privacy of individuals has been achieved through data anonymization. While hundreds of different anonymization algorithms exist, most of them are variations and improvements of the seminal k-anonymity algorithm introduced in 1998. Recent studies have, however, shown that pseudonymization and standard de-identification are not sufficient to prevent users from being re-identified in mobile phone data. Four data points—approximate places and times where an individual was present—have been shown to be enough to uniquely re-identify them 95% of the time in a mobile phone dataset of 1.5 million people. Furthermore, re-identification estimations using unicity—a metric to evaluate the risk of re-identification in large-scale datasets—and attempts at k-anonymizing mobile phone data ruled out de-identification as sufficient to truly anonymize the data. This was echoed in the recent report of the [US] President’s Council of Advisors on Science and Technology on Big Data Privacy, which considers de-identification to be useful as an “added safeguard, but [emphasized that] it is not robust against near-term future re-identification methods”.
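The unicity idea can be illustrated with a toy sketch (this is an illustration of the concept, not the authors’ implementation): given a pseudonymized dataset of (place, time) points per user, an attacker with a few such points as side information simply keeps the users whose traces contain all of them.

```python
import random

def matching_users(dataset, side_info):
    """Return the users whose trace contains every (place, hour) pair in side_info."""
    return [u for u, points in dataset.items() if side_info <= points]

def unicity(dataset, p, trials=1000, seed=0):
    """Estimate unicity: the fraction of users uniquely re-identified
    by p points drawn at random from their own trace."""
    rng = random.Random(seed)
    users = list(dataset)
    unique = 0
    for _ in range(trials):
        u = rng.choice(users)
        pts = set(rng.sample(sorted(dataset[u]), p))
        if len(matching_users(dataset, pts)) == 1:
            unique += 1
    return unique / trials

# Toy pseudonymized dataset: (antenna, hour) points per user.
toy = {
    "u1": {("A", 9), ("B", 12), ("C", 18)},
    "u2": {("A", 9), ("B", 12), ("D", 18)},
    "u3": {("E", 8), ("F", 13), ("C", 18)},
}
print(len(matching_users(toy, {("A", 9), ("B", 12)})))            # 2: still ambiguous
print(len(matching_users(toy, {("A", 9), ("B", 12), ("C", 18)})))  # 1: only u1 matches
```

Even in this tiny example, adding one more spatio-temporal point collapses the anonymity set from two candidates to one, which is the mechanism behind the “four points re-identify 95% of people” result at scale.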

The limits of the historical de-identification framework to adequately balance risks and benefits in the use of mobile phone data are a major hindrance to their use by researchers, development practitioners, humanitarian workers, and companies. This became particularly clear at the height of the Ebola crisis, when qualified researchers (including some of us) were prevented from accessing relevant mobile phone data on time despite efforts by mobile phone operators, the GSMA, and UN agencies, with privacy being cited as one of the main concerns.

These privacy concerns are, in our opinion, due to the failures of the traditional de-identification model and the lack of a modern and agreed upon framework for the privacy-conscientious use of mobile phone data by third parties, especially in the context of the EU General Data Protection Regulation (GDPR). Such frameworks have been developed for the anonymous use of other sensitive data such as census, household survey, and tax data. The positive societal impact of making these data accessible and the technical means available to protect people’s identity have been considered and a trade-off, albeit far from perfect, has been agreed on and implemented. This has allowed the data to be used in aggregate for the benefit of society. Such thinking and an agreed upon set of models has been missing so far for mobile phone data. This has left data protection authorities, mobile phone operators, and data users with little guidance on technically sound yet reasonable models for the privacy-conscientious use of mobile phone data. This has often resulted in suboptimal trade-offs, if any.

In this paper, we propose four models for the privacy-conscientious use of mobile phone data (Fig. 1). All of these models 1) focus on a use of mobile phone data in which only statistical, aggregate information is ultimately needed by a third party and, while this needs to be confirmed on a per-country basis, 2) are designed to fall under the legal umbrella of “anonymous use of the data”. Examples of cases in which only statistical aggregated information is ultimately needed by the third party are discussed below. They would include, e.g., disaster management, mobility analysis, or the training of AI algorithms in which only aggregate information on people’s mobility is ultimately needed by agencies, and exclude cases in which individual-level identifiable information is needed such as targeted advertising or loans based on behavioral data.

Figure 1
Figure 1: Matrix of the four models for the privacy-conscientious use of mobile phone data.

First, it is important to insist that none of these models is a silver bullet…(More)”.

Towards matching user mobility traces in large-scale datasets


Paper by Daniel Kondor, Behrooz Hashemian,  Yves-Alexandre de Montjoye and Carlo Ratti: “The problem of unicity and reidentifiability of records in large-scale databases has been studied in different contexts and approaches, with focus on preserving privacy or matching records from different data sources. With an increasing number of service providers nowadays routinely collecting location traces of their users on unprecedented scales, there is a pronounced interest in the possibility of matching records and datasets based on spatial trajectories. Extending previous work on reidentifiability of spatial data and trajectory matching, we present the first large-scale analysis of user matchability in real mobility datasets on realistic scales, i.e., between two datasets that consist of several million people’s mobility traces, coming from a mobile network operator and transportation smart card usage. We extract the relevant statistical properties which influence the matching process and analyze their impact on the matchability of users. We show that for individuals with typical activity in the transportation system (those making 3-4 trips per day on average), a matching algorithm based on the co-occurrence of their activities is expected to achieve only a 16.8% success rate after a one-week observation of their mobility traces, and over 55% after four weeks. We show that the main determinant of matchability is the expected number of co-occurring records in the two datasets. Finally, we discuss different scenarios in terms of data collection frequency and give estimates of matchability over time. We show that with higher frequency data collection becoming more common, we can expect much higher success rates in even shorter intervals….(More)”.
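The core of co-occurrence matching can be sketched in a few lines (a simplified illustration of the idea, not the paper’s actual algorithm or scoring): discretize both datasets into (location, time-bin) events, then, for each mobile-network trace, pick the smart-card trace with which it shares the most events.

```python
from collections import Counter

def cooccurrences(trace_a, trace_b):
    """Count events falling in the same (location, time-bin) in both traces."""
    return sum((Counter(trace_a) & Counter(trace_b)).values())

def best_match(phone_trace, card_traces):
    """Match one mobile-network trace against all smart-card traces by
    maximal co-occurrence; returns (card_id, score)."""
    scores = {cid: cooccurrences(phone_trace, t) for cid, t in card_traces.items()}
    return max(scores.items(), key=lambda kv: kv[1])

# Toy example: (station, hour-bin) events observed in both sources.
phone = [("central", 8), ("riverside", 9), ("central", 18)]
cards = {
    "card_1": [("central", 8), ("riverside", 9), ("central", 18)],
    "card_2": [("central", 8), ("airport", 14)],
}
print(best_match(phone, cards))  # card_1 co-occurs on all three events
```

The paper’s finding that matchability is driven by the expected number of co-occurring records shows up directly here: more trips per day and longer observation windows mean more shared events, and hence a sharper gap between the true match’s score and everyone else’s.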

Making NHS data work for everyone


Reform: This report looks at the access and use of NHS data by private sector companies for research or product and service development purposes….

The private sector is an important partner to the NHS and plays a crucial role in the development of healthcare technologies that use data collected by hospitals or GP practices. It provides the skills and know-how to develop data-driven tools which can be used to improve patient care. However, this is not a one-sided exchange as the NHS makes the data available to build these tools and offers medical expertise to make sense of the data. This is known as the “value exchange”. Our research uncovered that there is a lack of clarity over what a fair value exchange looks like. This lack of clarity, in conjunction with the lack of national guidance on the types of partnerships that could be developed, has led to a patchwork of approaches on the ground….

Knowing what the “value exchange” is between patients, the NHS and industry allows for a more informed conversation about what constitutes a fair partnership when there is access to data to create a product or service.

WHAT NEEDS TO CHANGE?

  1. Engage with the public
  2. A national strategy
  3. Access to good quality data
  4. Commercial and legal skills…(More)

We Need an FDA For Algorithms


Interview with Hannah Fry on the promise and danger of an AI world by Michael Segal: “…Why do we need an FDA for algorithms?

It used to be the case that you could just put any old colored liquid in a glass bottle and sell it as medicine and make an absolute fortune. And then not worry about whether or not it’s poisonous. We stopped that from happening because, well, for starters it’s kind of morally repugnant. But also, it harms people. We’re in that position right now with data and algorithms. You can harvest any data that you want, on anybody. You can infer any data that you like, and you can use it to manipulate them in any way that you choose. And you can roll out an algorithm that genuinely makes massive differences to people’s lives, both good and bad, without any checks and balances. To me that seems completely bonkers. So I think we need something like the FDA for algorithms. A regulatory body that can protect the intellectual property of algorithms, but at the same time ensure that the benefits to society outweigh the harms.

Why is the regulation of medicine an appropriate comparison?

If you swallow a bottle of colored liquid and then you keel over the next day, then you know for sure it was poisonous. But there are much more subtle things in pharmaceuticals that require expert analysis to be able to weigh up the benefits and the harms. To study the chemical profile of these drugs that are being sold and make sure that they actually are doing what they say they’re doing. With algorithms it’s the same thing. You can’t expect the average person in the street to study Bayesian inference or be totally well read in random forests, and have the kind of computing prowess to look up a code and analyze whether it’s doing something fairly. That’s not realistic. Simultaneously, you can’t have some code of conduct that every data science person signs up to, and agrees that they won’t tread over some lines. It has to be a government, really, that does this. It has to be government that analyzes this stuff on our behalf and makes sure that it is doing what it says it does, and in a way that doesn’t end up harming people.

How did you come to write a book about algorithms?

Back in 2011 in London, we had these really bad riots in London. I’d been working on a project with the Metropolitan Police, trying mathematically to look at how these riots had spread and to use algorithms to ask how could the police have done better. I went to go and give a talk in Berlin about this paper we’d published about our work, and they completely tore me apart. They were asking questions like, “Hang on a second, you’re creating this algorithm that has the potential to be used to suppress peaceful demonstrations in the future. How can you morally justify the work that you’re doing?” I’m kind of ashamed to say that it just hadn’t occurred to me at that point in time. Ever since, I have really thought a lot about the point that they made. And started to notice around me that other researchers in the area weren’t necessarily treating the data that they were working with, and the algorithms that they were creating, with the ethical concern they really warranted. We have this imbalance where the people who are making algorithms aren’t talking to the people who are using them. And the people who are using them aren’t talking to the people who are having decisions made about their lives by them. I wanted to write something that united those three groups….(More)”.

G20/OECD Compendium of good practices on the use of open data for Anti-corruption


OECD: “This compendium of good practices was prepared by the OECD at the request of the G20 Anti-corruption Working Group (ACWG), to raise awareness of the benefits of open data policies and initiatives in: 

  • fighting corruption,
  • increasing public sector transparency and integrity,
  • fostering economic development and social innovation.

This compendium provides an overview of initiatives for the publication and re-use of open data to fight corruption across OECD and G20 countries and underscores the impact that a digital transformation of the public sector can deliver in terms of better governance across policy areas.  The practices illustrate the use of open data as a way of fighting corruption and show how open data principles can be translated into concrete initiatives.

The publication is divided into three sections:

Section 1 discusses the benefits of open data for greater public sector transparency and performance, national competitiveness and social engagement, and how these initiatives contribute to greater public trust in government.

Section 2 highlights the preconditions necessary across different policy areas related to anti-corruption (e.g. open government, public procurement) to sustain the implementation of an “Open by default” approach that could help government move from a perspective that focuses on increasing access to public sector information to one that enhances the publication of open government data for re-use and value co-creation. 

Section 3 presents the results of the OECD survey administered across OECD and G20 countries, good practices on the publishing and reusing of open data for anti-corruption in G20 countries, and lessons learned from the definition and implementation of these initiatives. This chapter also discusses the implications for broader national matters such as freedom of press, and the involvement of key actors of the open data ecosystem (e.g. journalists and civil society organisations) as key partners in open data re-use for anti-corruption…(More)”.

Data Flow in the Smart City: Open Data Versus the Commons


Chapter by Richard Beckwith, John Sherry and David Prendergast in The Hackable City: “Much of the recent excitement around data, especially ‘Big Data,’ focuses on the potential commercial or economic value of data. How that data will affect people isn’t much discussed. People know that smart cities will deploy Internet-based monitoring and that flows of the collected data promise to produce new values. Less considered is that smart cities will be sites of new forms of citizen action—enabled by an ‘economy’ of data that will lead to new methods of collectivization, accountability, and control which, themselves, can provide both positive and negative values to the citizenry. Therefore, smart city design needs to consider not just measurement and publication of data but also the implications of city-wide deployment, data openness, and the possibility of unintended consequences if data leave the city….(More)”.