Stefaan Verhulst
Paper by Bert-Jaap Koops: “Function creep – the expansion of a system or technology beyond its original purposes – is a well-known phenomenon in STS, technology regulation, and surveillance studies. Correction: it is a well-referenced phenomenon. Yearly, hundreds of publications use the term to criticise developments in technology regulation and data governance. But why function creep is problematic, and why authors call system expansion ‘function creep’ rather than ‘innovation’, is underresearched. If the core problem is unknown, we can hardly identify suitable responses; therefore, we first need to understand what the concept actually refers to.
Surprisingly, no-one has ever written a paper about the concept itself. This paper fills that gap in the literature, by analysing and defining ‘function creep’. This creates conceptual clarity that can help structure future debates and address function creep concerns. First, I analyse what ‘function creep’ refers to, through semiotic analysis of the term and its role in discourse. Second, I discuss concepts that share family resemblances, including other ‘creep’ concepts and many theoretical notions from STS, economics, sociology, public policy, law, and discourse theory. Function creep can be situated in the nexus of reverse adaptation and self-augmentation of technology, incrementalism and disruption in policy and innovation, policy spillovers, ratchet effects, transformative use, and slippery slope argumentation.
Based on this, function creep can be defined as *an imperceptibly transformative and therewith contestable change in a data-processing system’s proper activity*. What distinguishes function creep from innovation is that it denotes some qualitative change in functionality that causes concern not only because of the change itself, but also because the change is insufficiently acknowledged as transformative and therefore requiring discussion. Argumentation theory illuminates how the pejorative ‘function creep’ functions in debates: it makes visible that what looks like linear change is actually non-linear, and simultaneously calls for much-needed debate about this qualitative change…(More)”.
Kate Kaye at IAPP: “In the early 2000s, internet accessibility made risks of exposing individuals from population demographic data more likely than ever. So, the U.S. Census Bureau turned to an emerging privacy approach: synthetic data.
Some argue the algorithmic techniques used to develop privacy-secure synthetic datasets go beyond traditional deidentification methods. Today, along with the Census Bureau, clinical researchers, autonomous vehicle system developers and banks use these fake datasets that mimic statistically valid data.
In many cases, synthetic data is built from existing data by filtering it through machine learning models. Real data representing real individuals flows in, and fake data mimicking individuals with corresponding characteristics flows out.
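As a rough illustration of that flow (and not the Census Bureau's actual methodology), the sketch below fits a simple generative model to a stand-in "confidential" table and then samples a synthetic table from it; the two columns and the Gaussian-mixture choice are assumptions made purely for the example.

```python
# Toy illustration of synthetic-data generation (not the Census Bureau's method):
# fit a simple generative model to confidential records, then sample fake records
# that mimic the statistical structure without corresponding to real individuals.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Stand-in for a confidential table: (age, income) for 1,000 real individuals.
real = np.column_stack([
    rng.normal(45, 12, 1_000),          # age
    rng.lognormal(10.5, 0.5, 1_000),    # income
])

# "Real data flows in": learn a model of the joint distribution.
model = GaussianMixture(n_components=5, random_state=0).fit(real)

# "Fake data flows out": sample synthetic records from the fitted model.
synthetic, _ = model.sample(1_000)

print("real means:     ", real.mean(axis=0).round(1))
print("synthetic means:", synthetic.mean(axis=0).round(1))
```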
When data scientists at the Census Bureau began exploring synthetic data methods, adoption of the internet had made deidentified, open-source data on U.S. residents, their households and businesses more accessible than in the past.
Especially concerning, census-block-level information was now widely available. Because in rural areas, a census block could represent data associated with as few as one house, simply stripping names, addresses and phone numbers from that information might not be enough to prevent exposure of individuals.
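The risk the statisticians were reacting to can be shown with a k-anonymity-style group count: once direct identifiers are stripped, any census block containing a single record still points to a single household. The toy table and field names below are invented for illustration only.

```python
# Illustrative check on hypothetical data: after dropping names and addresses,
# how many records share each census block? A group of size 1 means the
# remaining attributes can single out a household.
import pandas as pd

deidentified = pd.DataFrame({
    "census_block": ["B001", "B001", "B001", "B002", "B003", "B003"],
    "household_size": [3, 2, 4, 5, 1, 2],    # quasi-identifiers remain
    "income_bracket": ["40-60k", "20-40k", "60-80k", "80k+", "20-40k", "40-60k"],
})

k_per_block = deidentified.groupby("census_block").size()
print(k_per_block)
print(k_per_block[k_per_block == 1])  # blocks like sparse rural ones: here B002,
                                      # a single household, effectively identifying
```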
“There was pretty widespread angst” among statisticians, said John Abowd, the bureau’s associate director for research and methodology and chief scientist. The hand-wringing led to a “gradual awakening” that prompted the agency to begin developing synthetic data methods, he said.
Synthetic data built from the real data preserves privacy while providing information that is still relevant for research purposes, Abowd said: “The basic idea is to try to get a model that accurately produces an image of the confidential data.”
The plan for the 2020 census is to produce a synthetic image of that original data. The bureau also produces On the Map, a web-based mapping and reporting application that provides synthetic data showing where workers are employed and where they live along with reports on age, earnings, industry distributions, race, ethnicity, educational attainment and sex.
Of course, the real census data is still locked away, too, Abowd said: “We have a copy and the national archives have a copy of the confidential microdata.”…(More)”.
Book by Daeyeol Lee: “What is intelligence? How did it begin and evolve to human intelligence? Does a high level of biological intelligence require a complex brain? Can man-made machines be truly intelligent? Is AI fundamentally different from human intelligence? In Birth of Intelligence, distinguished neuroscientist Daeyeol Lee tackles these pressing fundamental issues. To better prepare for future society and its technology, including how the use of AI will impact our lives, it is essential to understand the biological root and limits of human intelligence. After systematically reviewing biological and computational underpinnings of decision making and intelligent behaviors, Birth of Intelligence proposes that true intelligence requires life…(More)”.
Paper by David S. Watson & Luciano Floridi: “We propose a formal framework for interpretable machine learning. Combining elements from statistical learning, causal interventionism, and decision theory, we design an idealised explanation game in which players collaborate to find the best explanation(s) for a given algorithmic prediction. Through an iterative procedure of questions and answers, the players establish a three-dimensional Pareto frontier that describes the optimal trade-offs between explanatory accuracy, simplicity, and relevance. Multiple rounds are played at different levels of abstraction, allowing the players to explore overlapping causal patterns of variable granularity and scope. We characterise the conditions under which such a game is almost surely guaranteed to converge on a (conditionally) optimal explanation surface in polynomial time, and highlight obstacles that will tend to prevent the players from advancing beyond certain explanatory thresholds. The game serves a descriptive and a normative function, establishing a conceptual space in which to analyse and compare existing proposals, as well as design new and improved solutions….(More)”
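The three-dimensional Pareto frontier at the centre of the game can be illustrated in a few lines: among candidate explanations scored on accuracy, simplicity, and relevance, keep only those that no other candidate beats on every criterion. This is a minimal sketch of that frontier computation under made-up scores, not the authors' formal game or convergence procedure.

```python
# Minimal sketch: compute the Pareto frontier over candidate explanations
# scored on (accuracy, simplicity, relevance), higher being better on each axis.
# The candidates and their scores are invented for illustration.
from typing import Dict, Tuple

candidates: Dict[str, Tuple[float, float, float]] = {
    "sparse linear surrogate": (0.72, 0.95, 0.80),
    "decision rule list": (0.78, 0.85, 0.75),
    "full feature attribution": (0.90, 0.40, 0.85),
    "single-feature heuristic": (0.55, 0.99, 0.60),
    "deep surrogate model": (0.88, 0.30, 0.70),
}

def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
    """a dominates b if it is at least as good everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

frontier = {
    name: scores
    for name, scores in candidates.items()
    if not any(dominates(other, scores)
               for other_name, other in candidates.items() if other_name != name)
}
print(frontier)  # only the non-dominated trade-offs between the three criteria remain
```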
Alex Hern at The Guardian: “A transatlantic divide over how location data is used to fight the coronavirus highlights the lack of safeguards for Americans’ personal data, academics and data scientists have warned.
The US Centers for Disease Control and Prevention (CDC) has turned to data provided by the mobile advertising industry to analyse population movements in the midst of the pandemic.
Owing to a lack of systematic privacy protections in the US, data collected by advertising companies is often extremely detailed: companies with access to GPS location data, such as weather apps or some e-commerce sites, have been known to sell that data on for ad targeting purposes. That data provides much more granular information on the location and movement of individuals than the mobile network data received by the UK government from carriers including O2 and BT.
While both datasets track individuals at the collection level, GPS data is accurate to within five metres, according to Yves-Alexandre de Montjoye, a data scientist at Imperial College, while mobile network data is accurate to around 0.1 km² in city centres and far coarser in less dense areas – the difference between locating an individual to a specific room in their home and locating them to their street…
But, warns de Montjoye, such data is never truly anonymous. “The original data is pseudonymised, yet it is quite easy to reidentify someone. Knowing where someone was is enough to reidentify them 95% of the time, using mobile phone data. So there’s the privacy concern: you need to process the pseudonymised data, but the pseudonymised data can be reidentified. Most of the time, if done properly, the aggregates cannot be de-anonymised.”
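The 95% figure refers to earlier findings that a handful of spatio-temporal points is usually enough to single out one trace in a pseudonymised mobility dataset. The toy example below shows the mechanics of such a matching attack; the trace format, cell identifiers, and matching rule are assumptions for illustration, not the researchers' actual method.

```python
# Toy illustration of reidentification from pseudonymised location traces:
# knowing a few (place, hour) points about a person is often enough to match
# exactly one pseudonym in the dataset. Data and matching rule are invented.
pseudonymised_traces = {
    "user_a1": {("cell_17", 8), ("cell_03", 13), ("cell_17", 19)},
    "user_b2": {("cell_17", 8), ("cell_42", 13), ("cell_09", 19)},
    "user_c3": {("cell_55", 8), ("cell_03", 13), ("cell_55", 19)},
}

# An attacker knows two points about their target (say, home cell in the
# morning and workplace cell at lunchtime), learned from public information.
known_points = {("cell_17", 8), ("cell_42", 13)}

matches = [pid for pid, trace in pseudonymised_traces.items()
           if known_points <= trace]
print(matches)  # ['user_b2'] -- a unique match, so the pseudonym is broken
```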
The data scientist points to successful attempts to use location data in tracking outbreaks of malaria in Kenya or dengue in Pakistan as proof that location data has use in these situations, but warns that trust will be hurt if data collected for modelling purposes is then “surreptitiously used to crack down on individuals not respecting quarantines or kept and used for unrelated purposes”….(More)”.
Book by Adam Kucharski: “From ideas and infections to financial crises and “fake news,” why the science of outbreaks is the science of modern life.
These days, whenever anything spreads, whether it’s a YouTube fad or a political rumor, we say it went viral. But how does virality actually work? In The Rules of Contagion, epidemiologist Adam Kucharski explores topics including gun violence, online manipulation, and, of course, outbreaks of disease to show how much we get wrong about contagion, and how astonishing the real science is.
Why did the president retweet a Mussolini quote as his own? Why do financial bubbles take off so quickly? Why are disinformation campaigns so effective? And what makes the emergence of new illnesses – such as MERS, SARS, or the coronavirus disease COVID-19 – so challenging? By uncovering the crucial factors driving outbreaks, we can see how things really spread — and what we can do about it….(More)”.
Blog post by Stefaan Verhulst: “We live in almost unimaginable times. The spread of COVID-19 is a human tragedy and global crisis that will impact our communities for many years to come. The social and economic costs are huge and mounting, and they are already contributing to a global slowdown. Every day, the emerging pandemic reveals new vulnerabilities in various aspects of our economic, political and social lives. These include our vastly overstretched public health services, our dysfunctional political climate, and our fragile global supply chains and financial markets.
The unfolding crisis is also making shortcomings clear in another area: the way we re-use data responsibly. Although this aspect of the crisis has been less remarked upon than other, more obvious failures, those who work with data—and who have seen its potential to impact the public good—understand that we have failed to create the necessary governance and institutional structures that would allow us to harness data responsibly to halt or at least limit this pandemic. A recent article in Stat, an online journal dedicated to health news, characterized the COVID-19 outbreak as “a once-in-a-century evidence fiasco.” The article continues:
“At a time when everyone needs better information, […] we lack reliable evidence on how many people have been infected with SARS-CoV-2 or who continue to become infected. Better information is needed to guide decisions and actions of monumental significance and to monitor their impact.”
It doesn’t have to be this way, and these data challenges are not an excuse for inaction. As we explain in what follows, there is ample evidence that the re-use of data can help mitigate health pandemics. A robust (if somewhat unsystematized) body of knowledge could direct policymakers and others in their efforts. In the second part of this article, we outline eight steps that key stakeholders can and should take to better re-use data in the fight against COVID-19. In particular, we argue that more responsible data stewardship and increased use of data collaboratives are critical….(More)”.
Paper by Nuria Oliver, et al: “This paper describes how mobile phone data can guide government and public health authorities in determining the best course of action to control the COVID-19 pandemic and in assessing the effectiveness of control measures such as physical distancing. It identifies key gaps and reasons why this kind of data is only scarcely used, although its value in similar epidemics has been proven in a number of use cases. It presents ways to overcome these gaps and key recommendations for urgent action, most notably the establishment of mixed expert groups at national and regional levels, and the inclusion and support of governments and public authorities early on. It is authored by a group of experienced data scientists, epidemiologists, demographers and representatives of mobile network operators who jointly put their work at the service of the global effort to combat the COVID-19 pandemic….(More)”.
GDPR Hub: “The sudden outbreak of COVID-19 infections (“Corona-Virus”), which the WHO has declared a pandemic, affects data protection in various ways. Different data protection authorities have published guidelines for employers and other parties involved in the processing of data related to the Corona-Virus (read more below).
The Corona-Virus has also prompted the use of various technologies based on data collection and other data processing activities by EU/EEA member states and private companies. These processing activities mostly focus on preventing and slowing the further spread of the Corona-Virus and on monitoring citizens’ compliance with governmental measures such as quarantine. Some of them are based on anonymous or anonymized data (for example, for statistics or movement patterns), but some proposals also revolve around personalized tracking.
At the moment, it is not easy to figure out which processing activities are actually planned and which are only rumors. This page will therefore be updated once certain processing activities have been confirmed. For now, this article does not assess the lawfulness of particular processing activities, but rather outlines the general conditions for data processing in connection with the Corona-Virus.
It must be noted that several activities – such as monitoring whether citizens comply with quarantine and stay indoors by looking at mobile phone locations – can be carried out without using personal data under Article 4(1) GDPR, provided that all necessary information can be derived from anonymised data. The GDPR does not apply to activities that rely only on anonymised data….(More)”.
Chapter by Oran Doyle and Rachael Walsh: “Populisms come in different forms, but all involve a political rhetoric that invokes the will of a unitary people to combat perceived constraints, whether economic, legal, or technocratic. In this chapter, our focus is democratic backsliding aided by populist rhetoric. Some have suggested deliberative democracy as a means to combat this form of populism. Deliberative democracy encourages and facilitates both consultation and contestation, emphasizing plurality of voices, the legitimacy of disagreement, and the imperative of reasoned persuasion. Its participatory and inclusive character has the potential to undermine the credibility of populists’ claims to speak for a unitary people. Ireland has been widely referenced in constitutionalism’s deliberative turn, given its recent integration of deliberative mini-publics into the constitutional amendment process.
Reviewing the Irish experience, we suggest that deliberative mini-publics are unlikely to reverse democratic backsliding. Populist rhetoric is fueled by the very measures intended to combat democratic backsliding: enhanced constitutional constraints merely illustrate how the will of the people is being thwarted. The virtues of Ireland’s experiment in deliberative democracy — citizen participation, integration with representative democracy, deliberation, balanced information, expertise — have all been criticized in ways that are at least consistent with populist narratives. The failure of such narratives to take hold in Ireland, we suggest, may be due to a political system that is already resistant to populist rhetoric, as well as a tradition of participatory constitutionalism. The experiment with deliberative mini-publics may have strengthened Ireland’s constitutional culture by reinforcing anti-populist features. But it cannot be assumed that this experience would be replicated in larger countries polarized along political, ethnic, or religious lines….(More)”.