A History of the Data-Tracked User


Essay by Tanya Kant: “Who among us hasn’t blindly accepted a cookie notice or an inscrutable privacy policy, or been stalked by a creepy “personalized” ad? Tracking and profiling are now commonplace fixtures of the digital everyday. This holds true even if you use tracker blockers, which have been likened to “using an umbrella in a hurricane.”

In most instances, data tracking is conducted in the name of offering “personalized experiences” to web users: individually targeted marketing, tailored newsfeeds, recommended products and content. To offer such experiences, platforms such as Facebook and Google use a dizzyingly extensive list of categories to track and profile people: gender, age, ethnicity, lifestyle and consumption preferences, language, voice recordings, facial recognition, location, political leanings, music and film taste, income, credit status, employment status, home ownership status, marital status — the list goes on….

As I explore in this case study, and as part of my work on algorithmic identity, data tracking does not just match the “right” people with the “right” products and services — it can discriminate, govern, and regulate web users in ways that demand close attention to the social and ethical implications of targeting.

It is not an overstatement to propose that data tracking underpins the online economy as we know it.

Commercial platform providers frame data tracking as inevitable: Data in exchange for a (personalized) service is presented as the best, and often the only, option for platform users. Yet this has not always been the case: In the mid-to-late 1990s, when the web was still in its infancy, “cyberspace” was largely celebrated as public, non-tracked space which afforded users freedom of anonymity. How then did the individual tracking of users come to dominate the web as a market practice?

The following timeline outlines a partial history of the data-tracked user. It centers largely on developments that have affected European (and to a lesser extent U.S.) web users. This timeline includes developments in commercial targeting in the EU and U.S. rather than global developments in algorithmic policing, spatial infrastructures, medicine, and education, all of which are related but deserve their own timelines. This brief history fits into ongoing conversations around algorithmic targeting by reminding us that being tracked and targeted emerges from a historically specific set of developments. Increased legal scrutiny of targeting means that individual targeting as we know it may soon change dramatically. Yet as long as profiling web users is assumed to equate to more profit, it is more than likely that data tracking will persist in some form.


1940s. “Identity scoring” emerges: the categorization of individuals to calculate the benefits or risks of lending credit to certain groups of people….(More)”.

Why you should develop a Rules as Code-enabled future


Blog by Tim de Sousa: “In 2021, Rules as Code (RaC) is truly hitting its stride. More governments are exploring the concept of machine-consumable legislation, regulation and policy, research institutes have been established, papers and reports are being published, tools and platforms are being built, and multi-disciplinary teams are learning new ways to draft and implement rules by getting their hands dirty.

RaC is still an emerging practice. Much of the current discussion about RaC is centred on introductory questions such as why and how we should code rules (and we’ve tried to answer those questions here), but to understand the true potential of RaC, we have to take a longer view.

In this two-part series, I set out some possible optimistic futures that could be enabled by RaC. We have to ask ourselves what kind of world we want to build with coded rules, so we can better plan how to get there.

Trustworthy automated decisions

The first reaction that RaC practitioners are often faced with is the fear of the killer robot. What happens if the automated system makes a wrong decision? What if that decision hurts someone? This is not an unfounded fear – we have seen poorly implemented and poorly used automated systems raise debts that are not owed, and lead to the arrest of innocent people. All human-built systems have flaws, and RaC-enabled systems are not immune.

As a former administrative lawyer and someone who grapples with the ethical uses of technology on a daily basis, I find the use of RaC to help people understand what decisions are being made and how they’re being made – that is, to enable trustworthy automated decisions – particularly compelling.

Administrative law is the body of law that regulates how governments make decisions. In common law countries, this generally includes requirements that only relevant matters should be taken into account, irrelevant matters should not be, reasons should be given for decisions, and there should be workable avenues for merits reviews of decisions…(More)”
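To make the idea of machine-consumable rules concrete, here is a minimal sketch of a coded rule. The benefit rule, its threshold, and all names below are invented for illustration, not drawn from the post or any real legislation; the point is that a rule drafted as code can return not just an outcome but the reasons for it, echoing the administrative-law requirements the post describes.

```python
from dataclasses import dataclass

# Hypothetical eligibility rule, invented for illustration:
# "An applicant qualifies if they are at least 18 years old
# and their income is below the threshold."
MINIMUM_AGE = 18
INCOME_THRESHOLD = 30_000

@dataclass
class Applicant:
    age: int
    income: float

def assess(applicant: Applicant) -> tuple[bool, list[str]]:
    """Decide eligibility and give reasons for the decision,
    building the 'reasons should be given' requirement into
    the rule itself."""
    reasons = []
    if applicant.age < MINIMUM_AGE:
        reasons.append(f"Applicant is {applicant.age}; the minimum age is {MINIMUM_AGE}.")
    if applicant.income >= INCOME_THRESHOLD:
        reasons.append(f"Income {applicant.income:,.0f} is not below the threshold of {INCOME_THRESHOLD:,}.")
    eligible = not reasons  # eligible only if no criterion failed
    if eligible:
        reasons.append("All criteria are satisfied.")
    return eligible, reasons

eligible, reasons = assess(Applicant(age=17, income=20_000))
print(eligible, reasons)  # False, with the failed criterion spelled out
```

Because every decision carries its reasons, a rule coded this way is inspectable and reviewable in exactly the way the post argues trustworthy automated decisions must be.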

A Curation of Tools for Promoting Effective Data Re-Use for Addressing Public Challenges


Blog by Sampriti Saxena, Andrew J. Zahuranec, and Stefaan Verhulst: “Data can be a powerful tool for change, if made accessible and leveraged responsibly. Since The GovLab’s Data Program began work six years ago, it has structured most of its work around this fact, seeking to tackle many of the complex challenges we face today by enabling access to data. 

In this blog, the Data Program provides a curation of tools that we’ve developed over the years to enable systematic, sustainable and responsible re-use of data – i.e. data stewardship. Organized into four categories—each dealing with a topic or issue we’ve studied in our work—this listing gives data practitioners the resources they need to make sense of the data they work with.  We hope that it will help practitioners to develop fit-for-purpose approaches when designing open data and data collaboration initiatives.

  • Data Collaboration…
  • Open Data…
  • Responsible Data…
  • Participatory Approaches…(More)”.

The search engine of 1896


The Generalist Academy: In 1896 Paul Otlet set up a bibliographic query service by mail: a 19th century search engine….The end of the 19th century was awash with the written word: books, monographs, and publications of all kinds. It was fiendishly difficult to find what you wanted in that mess. Bibliographies – compilations of references on a specific subject – were the maps to this vast informational territory. But they were expensive and time-consuming to compile.

Paul Otlet had a passion for information. More precisely, he had a passion for organising information. He and Henri La Fontaine made bibliographies on many subjects – and then turned their efforts towards creating something better. A master bibliography. A bibliography to rule them all, nothing less than a complete record of everything that had ever been published on every topic. This was their plan: the grandly named Universal Bibliographic Repertory.

This ambitious endeavour listed sources for every topic that its creators could imagine. The references were meticulously recorded on index cards that were filed in a massive series of drawers like the ones pictured above. The whole thing was arranged according to their Universal Decimal Classification, and it was enormous. In 1895 there were four hundred thousand entries. At its peak in 1934, there were nearly sixteen million.

How could you access such a mega-bibliography? Well, Otlet and La Fontaine set up a mail service. People sent in queries and received a summary of publications relating to that topic. Curious about the native religions of Sumatra? Want to explore the 19th century decipherment of Akkadian cuneiform? Send a request to the Universal Bibliographic Repertory, get a tidy list of the references you need. It was nothing less than a manual search engine, one hundred and twenty-five years ago.

[Image: Encyclopedia Universalis, by Paul Otlet. Public domain, via Wikimedia Commons]

Otlet had many more ambitions: a world encyclopaedia of knowledge, contraptions to easily access every publication in the world (he was an early microfiche pioneer), and a whole city to serve as the bright centre of global intellect. These ambitions were mostly unrealised, due to lack of funds and the intervention of war. But today Otlet is recognised as an important figure in the history of information science…(More)”.

Little Rock Shows How Open Data Drives Resident Engagement


Blog by Ross Schwartz: “The 12th Street corridor is in the heart of Little Rock, stretching west from downtown across multiple neighborhoods. But for years the area has suffered from high crime rates and disinvestment, and is considered a food desert.

With the intention of improving public safety and supporting efforts to revitalize the area, the City built a new police station on the street in 2014. In the years that followed, as city staff ramped up efforts to place data at the center of problem-solving, the City began to hold two-day “Data Academy” trainings for city employees and residents on foundational data practices, including data analysis.

Responding to public safety concerns, a 2018 Data Academy training focused on 12th Street. A cross-department team dug into data sets to understand the challenges facing the area, looking at variables including crime, building code violations, and poverty. It turned out the neighborhood with the highest levels of crime and blight was actually blocks away from 12th Street itself, in Midtown. A predominantly African-American neighborhood just east of the University of Arkansas at Little Rock campus, Midtown has a mix of older longtime homeowners and younger renters.

“It was a real data-driven ‘a-ha’ moment — an example of what you can understand about a city if you have the right data sets and look in the right places,” says Melissa Bridges, Little Rock’s performance and innovation coordinator. With support from What Works Cities (WWC), for the last five years she’s led Little Rock’s efforts to build open data and performance measurement resources and infrastructure…

Newly aware of Midtown’s challenges, city officials decided to engage residents in the neighborhood and adjacent areas. Data Academy members hosted a human-centered design workshop, during which residents were given the opportunity to self-prioritize their pressing concerns. Rather than lead the workshop, officials from various city departments quietly observed the discussion.

The main issue that emerged? Many parts of Midtown were poorly lit due to broken or blocked streetlights. Many residents didn’t feel safe and didn’t know how to alert the City to get lights fixed or vegetation cut back. A review of 311 request data showed that few streetlight problems in the area were ever reported to the City.

Aware of studies showing the correlation between dark streets and crime, the City designed a streetlight canvassing project in partnership with area neighborhood associations to engage and empower residents. Bridges and her team built canvassing route maps using Google Maps and Little Rock Citizen Connect, which collects 311 requests and other data sets. Then they gathered resident volunteers to walk or drive Midtown’s streets on a Friday night, using the City’s 311 mobile app to make a light service request and tag the location….(More)”.

Contemplating the COVID crisis: what kind of inquiry do we need to learn the right lessons?


Essay by Geoff Mulgan: “Boris Johnson has announced a UK inquiry into COVID-19 to start in 2022, a parallel one is being planned in Scotland, and many more will emerge all over the world. But how should such inquiries be designed and run? What kind of inquiry can do most to mitigate or address the harms caused by the pandemic?

We’re beginning to look at this question at IPPO (the International Public Policy Observatory), including a global scan with our partners, INGSA and the Blavatnik School of Government, on how inquiries are being developed around the world, plus engagement with governments and parliaments across the UK.

It’s highly likely that the most traditional models of inquiries will be adopted – just because that’s what people at the top are used to, or because they look politically expedient. But we think it would be good to look at the options and to encourage some creativity.

The pandemic has prompted extraordinary innovation; there is no reason why inquiries should be devoid of any. Moreover, the pandemic affected every sector of life – and was far more ‘systemic’ than the kinds of issue or event addressed by typical inquiries in the past. That should be reflected in how lessons are learned.

So here are some initial thoughts on what the defaults look like, why they are likely to be inadequate, and what some alternatives might be. This article proposes the idea of a ‘whole of society’ inquiry model: one with a distributed rather than centralised structure, which focuses on learning more than blame, and which can connect the thousands of organisations that have had to make so many difficult decisions throughout the crisis, as well as the lived experiences of the public and frontline staff. We hope that it will prompt responses, including better ideas about what kinds of inquiry will serve us best…

There are many different options for inquiries, and this is a good moment to consider them. They range from ‘truth and reconciliation’ inquiries to no-fault compensation processes to the ways industries such as airlines deal with crashes, through to academic analyses of events like the 2007/08 financial crash. They can involve representative or random samples of the public (e.g. citizens’ assemblies and juries) or just experts and officials…

The idea of a distributed inquiry is not entirely new. Colombia, for example, attempted something along these lines as part of its peace process. Many health systems use methods such as ‘collaboratives’ to organise accelerated learning. Doubtless there is much to be learned from these and other examples. For the UK in particular, it is vital there are contextually appropriate designs for the four nations as well as individual cities and regions.

As already indicated, a key is to combine sensible inquiries focused on particular sectors (e.g. what did universities do, what worked…) and make connections between them. As IPPO’s work on COVID inequalities has highlighted, the patterns are very complex but involve a huge amount of harm – captured in our ‘inequalities matrix’, below.

So, while the inquiries need to dig deep on multiple fronts and to look more like a matrix than a single question, what might connect them all would be a commitment to a set of shared elements:

  • Facts: In each case, a precondition for learning is establishing the facts, as well as the evidence on what did or didn’t work well. This is a process closer to what evidence intermediary organisations – such as the UK’s What Works Network – do than a judicial process designed for binary judgments (guilty/not guilty). This would be helped by some systematic curation and organisation of the evidence in easily accessible forms, of the kind that IPPO is doing….(More)”

Introducing collective crisis intelligence


Blogpost by Annemarie Poorterman et al: “…It has been estimated that over 600,000 Syrians have been killed since the start of the civil war, including tens of thousands of civilians killed in airstrike attacks. Predicting where and when strikes will occur and issuing time-critical warnings enabling civilians to seek safety is an ongoing challenge. It was this problem that motivated the development of Sentry Syria, an early warning system that alerts citizens to a possible airstrike. Sentry uses acoustic sensor data, reports from on-the-ground volunteers, and open media ‘scraping’ to detect warplanes in flight. It uses historical data and AI to validate the information from these different data sources and then issues warnings to civilians 5-10 minutes in advance of a strike via social media, TV, radio and sirens. These extra minutes can be the difference between life and death.

Sentry Syria is just one example of an emerging approach in humanitarian response we call collective crisis intelligence (CCI). CCI methods combine the collective intelligence (CI) of local community actors (e.g. volunteer plane spotters in the case of Sentry) with a wide range of additional data sources, artificial intelligence (AI) and predictive analytics to support crisis management and reduce the devastating impacts of humanitarian emergencies….(More)”
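The technical core of a system like Sentry is fusing noisy, heterogeneous signals into a single, time-critical alerting decision. The sketch below is a deliberately simplified illustration of that idea: the source weights, threshold, and function names are invented assumptions, and a weighted corroboration score stands in for the historical-data-plus-AI validation the post describes.

```python
import time
from dataclasses import dataclass

# Assumed trust weights per signal type (illustrative values only;
# a real system would calibrate these against historical outcomes).
SOURCE_WEIGHTS = {"acoustic_sensor": 0.6, "volunteer_report": 0.3, "media_scrape": 0.1}
ALERT_THRESHOLD = 0.7  # assumed cut-off for issuing a warning

@dataclass
class Observation:
    source: str        # one of the keys in SOURCE_WEIGHTS
    region: str        # reported location of the warplane
    timestamp: float   # seconds since epoch
    confidence: float  # the source's own 0-1 confidence

def fuse(observations, region, window_seconds=300):
    """Combine recent observations for one region into a 0-1 score."""
    now = time.time()
    recent = [o for o in observations
              if o.region == region and now - o.timestamp < window_seconds]
    score = sum(SOURCE_WEIGHTS.get(o.source, 0.0) * o.confidence for o in recent)
    return min(score, 1.0)

def maybe_alert(observations, region):
    """Issue a warning only when corroborated evidence crosses the threshold."""
    score = fuse(observations, region)
    if score >= ALERT_THRESHOLD:
        # In production this would fan out to social media, TV, radio
        # and sirens; here we just print.
        print(f"WARNING: possible airstrike near {region} (score={score:.2f})")
    return score

obs = [Observation("acoustic_sensor", "Idlib", time.time(), 0.9),
       Observation("volunteer_report", "Idlib", time.time() - 60, 0.8)]
maybe_alert(obs, "Idlib")  # 0.6*0.9 + 0.3*0.8 = 0.78, so a warning fires
```

The design point is corroboration: no single source is trusted enough to trigger a warning on its own, which keeps false alarms down without giving up the few minutes of lead time that matter.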

Can national statistical offices shape the data revolution?


Article by Juan Daniel Oviedo, Katharina Fenz, François Fonteneau, and Simon Riedl: “In recent years, breakthrough technologies in artificial intelligence (AI) and the use of satellite imagery have made it possible to disrupt the way we collect, process, and analyze data. Facilitated by the intersection of new statistical techniques and the availability of (big) data, it is now possible to create hypergranular estimates.

National statistical offices (NSOs) could be at the forefront of this change. Conventional tasks of statistical offices, such as the coordination of household surveys and censuses, will remain at the core of their work. However, just like AI can enhance the capabilities of doctors, it also has the potential to make statistical offices better, faster, and eventually cheaper.

Still, many countries struggle to make this happen. In a COVID-19 world marked by constrained financial and statistical capacities, making innovation work for statistical offices is of prime importance to create better lives for all…

In the case of Colombia, this novel method facilitated a scale-up from existing poverty estimates that contained 1,123 data points to 78,000 data points, which represents a 70-fold increase. This results in much more granular estimates highlighting Colombia’s heterogeneity between and within municipalities (see Figure 1).

Figure 1. Poverty shares (%), Colombia, 2018

Traditional methods don’t allow for cost-efficient hypergranular estimations but serve as a reference point, due to their ground-truthing capacity. Hence, we have combined existing data with novel AI techniques to go down to granular estimates of up to 4×4 kilometers. In particular, we have trained an algorithm to connect daytime and nighttime satellite images….(More)”.
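The basic recipe behind such hypergranular estimates is: aggregate survey-based poverty rates to the grid cells where they exist, extract features from daytime and nighttime imagery for every cell, fit a supervised model on the cells with ground truth, and predict the rest. The sketch below assumes image features have already been extracted into a table; the file name, column names, and model choice are illustrative assumptions, not the authors' actual pipeline.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

# Hypothetical input: one row per 4x4 km grid cell, with features
# already extracted from daytime imagery (e.g. building density)
# and nighttime imagery (e.g. mean light intensity).
cells = pd.read_csv("grid_cells.csv")  # assumed file
feature_cols = ["night_light_mean", "building_density", "road_density"]

# Cells overlapping household-survey clusters carry ground truth;
# the rest have a missing poverty_rate.
labeled = cells.dropna(subset=["poverty_rate"])

model = GradientBoostingRegressor(random_state=0)

# Sanity-check generalization before trusting the resulting map.
scores = cross_val_score(model, labeled[feature_cols],
                         labeled["poverty_rate"], cv=5, scoring="r2")
print(f"cross-validated R^2: {scores.mean():.2f}")

# Fit on all labeled cells, then predict every cell in the country,
# turning ~1,000 survey estimates into a wall-to-wall poverty map.
model.fit(labeled[feature_cols], labeled["poverty_rate"])
cells["poverty_rate_pred"] = model.predict(cells[feature_cols])
```

The excerpt's point about ground-truthing is the crucial caveat: the predicted map is only as credible as the model's cross-validated agreement with the traditional survey estimates it is trained against.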

Civic Technology for Participatory Cities of the Future


Article by Francesca Esses: “As cities continue to expand and demographics diversify, it has become challenging for governments, on a local level, to make informed decisions representative of the local population. Traditional means of public engagement, such as town hall meetings and public hearings, are no longer sufficient for attaining meaningful citizen input into policy and decision making. These types of engagement methods have come under criticism for their inaccessibility, timelines, and representation of the broader demographic of modern society…

Winston Churchill is famously quoted as saying, “Never let a good crisis go to waste”. History would suggest that pandemics have forced humans to embrace change; they have been described as a ‘portal’ from the old world to the next. COVID-19 has created an unparalleled opportunity to reimagine technology’s role in shaping society. It is anticipated that a surge in technological innovation will materialise from the pandemic and the subsequent economic instability.

Concerning smart cities, the COVID-19 pandemic has been referred to as the “lubricant” for further development in this area. There has been a significant rise in civic tech projects globally as a direct response to the pandemic: organisations such as Code for Japan, Code for Germany, and Code for Pakistan have all launched several projects in response to the virus. We’ve already seen civic tech initiatives across Africa implemented as a direct response to the pandemic; the Civic Tech Innovation Network referenced at least 140 initiatives across the continent.

Civic technologists also created a comprehensive COVID-19 data platform available at global.health, described as the first easy-to-use global repository, enabling open access to real-time data containing over 30 million anonymised cases in over 100 countries. The data curated on the site aims to help epidemiologists monitor the trajectory of the virus and track variants. A list of other corona-focused civic tech initiatives can be found here….

Restrictions put in place due to COVID-19 have positively impacted the earth’s climate, resulting in reduced pollution, with carbon emissions falling globally. We’ve all seen the images of smog-free skies over notoriously smoggy cities across the world. According to reports, overall carbon dioxide emissions dropped by 7 percent compared to 2019. It’s argued that a more socially conscious and responsible consumer is likely to emerge post-pandemic, with a greater focus on sustainability, responsible living, and carbon footprint….(More)

Building a Responsible Open Data Ecosystem: Mobility Data & COVID-19


Blog by Anna Livaccari: “Over the last year and a half, COVID-19 has changed the way people move, work, shop, and live. The pandemic has necessitated new data-sharing initiatives to understand new patterns of movement, analyze the spread of COVID-19, and inform research and decision-making. Earlier this year, Cuebiq collaborated with the Open Data Institute (ODI) and NYU’s The GovLab to explore the efficacy of these new initiatives. 

The ODI is a non-profit organization that brings together commercial and non-commercial organizations and governments to address global issues as well as advise on how data can be used for positive social good. As part of a larger project titled “COVID-19: Building an open and trustworthy data ecosystem,” the ODI published a new report with Cuebiq and The GovLab, an action research center at NYU’s Tandon School of Engineering that has pioneered the concept of data collaboratives and runs the data stewards network among other initiatives to advance data-driven decision making in the public interest. This report, “The Use of Mobility Data for Responding to the COVID-19 Pandemic,” specifically addresses key enablers and obstacles to the successful sharing of mobility data between public and private organizations during the pandemic….

Since early 2020, researchers and policy makers have been eager to understand the impact of COVID-19. With the help of mobility data, organizations from different sectors were able to answer some of the most pressing questions regarding the pandemic: questions about policy decisions, mass-communication strategies, and overall socioeconomic impact. Mobility data can be applied to specific use cases and can help answer complex questions, a fact that The GovLab discusses in its short-form mobility data brief. Understanding exactly how organizations employ mobility data can also improve how institutions operate post-pandemic and make data collaboration as a whole more responsible, sustainable, and systemic.

Cuebiq and the GovLab identified 51 projects where mobility data was used for pandemic response, and then selected five case studies to analyze further. The report defines mobility data, the ethics surrounding it, and the lessons learned for the future….(More)”.