Stefaan Verhulst

The Ethics of Hiding Your Data From the Machines

Curated on August 26, 2019August 26, 2019 by Stefaan Verhulst

Molly Wood at Wired: “…But now that data is being used to train artificial intelligence, and the insights those future algorithms create could quite literally save lives.

So while targeted advertising is an easy villain, data-hogging artificial intelligence is a dangerously nuanced and highly sympathetic bad guy, like Erik Killmonger in Black Panther. And it won’t be easy to hate.

I recently met with a company that wants to do a sincerely good thing. They’ve created a sensor that pregnant women can wear, and it measures their contractions. It can reliably predict when women are going into labor, which can help reduce preterm births and C-sections. It can get women into care sooner, which can reduce both maternal and infant mortality.

All of this is an unquestionable good.

And this little device is also collecting a treasure trove of information about pregnancy and labor that is feeding into clinical research that could upend maternal care as we know it. Did you know that the way most obstetricians learn to track a woman’s progress through labor is based on a single study from the 1950s, involving 500 women, all of whom were white?…

To save the lives of pregnant women and their babies, researchers and doctors, and yes, startup CEOs and even artificial intelligence algorithms need data. To cure cancer, or at least offer personalized treatments that have a much higher possibility of saving lives, those same entities will need data….

And for we consumers, well, a blanket refusal to offer up our data to the AI gods isn’t necessarily the good choice either. I don’t want to be the person who refuses to contribute my genetic data via 23andMe to a massive research study that could, and I actually believe this is possible, lead to cures and treatments for diseases like Parkinson’s and Alzheimer’s and who knows what else.

I also think I deserve a realistic assessment of the potential for harm to find its way back to me, because I didn’t think through or wasn’t told all the potential implications of that choice—like how, let’s be honest, we all felt a little stung when we realized the 23andMe research would be through a partnership with drugmaker (and reliable drug price-hiker) GlaxoSmithKline. Drug companies, like targeted ads, are easy villains—even though this partnership actually couldproduce a Parkinson’s drug. But do we know what GSK’s privacy policy looks like? That deal was a level of sharing we didn’t necessarily expect….(More)”.

The Practice of Civic Tech: Tensions in the Adoption and Use of New Technologies in Community Based Organizations

Curated on August 25, 2019August 25, 2019 by Stefaan Verhulst

Eric Gordon and Rogelio Alejandro Lopez in Media and Communication: “This article reports on a qualitative study of community based organizations’ (CBOs) adoption of information communication technologies (ICT). As ICTs in the civic sector, otherwise known as civic tech, get adopted with greater regularity in large and small organizations, there is need to understand how these technologies shape and challenge the nature of civic work. Based on a nine-month ethnographic study of one organization in Boston and additional interviews with fourteen other organizations throughout the United States, the study addresses a guiding research question: how do CBOs reconcile the changing (increasingly mediated) nature of civic work as ICTs, and their effective adoption and use for civic purposes, increasingly represent forward-thinking, progress, and innovation in the civic sector?—of civic tech as a measure of “keeping up with the times.”

From a sense of top-down pressures to innovate in a fast-moving civic sector, to changing bottom-up media practices among community constituents, our findings identify four tensions in the daily practice of civic tech, including: 1) function vs. representation, 2) amplification vs. transformation, 3) grassroots vs. grasstops, and 4) youth vs. adults. These four tensions, derived from a grounded theory approach, provide a conceptual picture of a civic tech landscape that is much more complicated than a suite of tools to help organizations become more efficient. The article concludes with recommendations for practitioners and researchers….(More)”.

Companies Collect a Lot of Data, But How Much Do They Actually Use?

Curated on August 25, 2019August 25, 2019 by Stefaan Verhulst

Article by Priceonomics Data Studio: “For all the talk of how data is the new oil and the most valuable resource of any enterprise, there is a deep dark secret companies are reluctant to share — most of the data collected by businesses simply goes unused.

This unknown and unused data, known as dark data comprises more than half the data collected by companies. Given that some estimates indicate that 7.5 septillion (7,700,000,000,000,000,000,000) gigabytes of data are generated every single day, not using most of it is a considerable issue.

In this article, we’ll look at this dark data. Just how much of it is created by companies, what are the reasons this data isn’t being analyzed, and what are the costs and implications of companies not using the majority of the data they collect.

Before diving into the analysis, it’s worth spending a moment clarifying what we mean by the term “dark data.” Gartner defines dark data as:

“The information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes (for example, analytics, business relationships and direct monetizing).

To learn more about this phenomenon, Splunk commissioned a global survey of 1,300+ business leaders to better understand how much data they collect, and how much is dark. Respondents were from IT and business roles, and were located in Australia, China, France, Germany, Japan, the United States, and the United Kingdom. across various industries. For the report, Splunk defines dark data as: “all the unknown and untapped data across an organization, generated by systems, devices and interactions.”

While the costs of storing data has decreased overtime, the cost of saving septillions of gigabytes of wasted data is still significant. What’s more, during this time the strategic importance of data has increased as companies have found more and more uses for it. Given the cost of storage and the value of data, why does so much of it go unused?

The following chart shows the reasons why dark data isn’t currently being harnessed:

By a large margin, the number one reason given for not using dark data is that companies lack a tool to capture or analyze the data. Companies accumulate data from server logs, GPS networks, security tools, call records, web traffic and more. Companies track everything from digital transactions to the temperature of their server rooms to the contents of retail shelves. Most of this data lies in separate systems, is unstructured, and cannot be connected or analyzed.

Second, the data captured just isn’t good enough. You might have important customer information about a transaction, but it’s missing location or other important metadata because that information sits somewhere else or was never captured in useable format.

Additionally, dark data exists because there is simply too much data out there and a lot of is unstructured. The larger the dataset (or the less structured it is), the more sophisticated the tool required for analysis. Additionally, these kinds of datasets often time require analysis by individuals with significant data science expertise who are often is short supply.

The implications of the prevalence are vast. As a result of the data deluge, companies often don’t know where all the sensitive data is stored and can’t be confident they are complying with consumer data protection measures like GDPR. …(More)”.

Exploring the Smart City Indexes and the Role of Macro Factors for Measuring Cities Smartness

Curated on August 25, 2019August 25, 2019 by Stefaan Verhulst

María Verónica Alderete in Social Indicators Research: “The main objective of this paper is to discuss the key factors involved in the definition of smart city indexes. Although recent literature has explored the smart city subject, it is of concern if macro ICT factors should also be considered for assessing the technological innovation of a city. To achieve this goal, firstly a literature review of smart city is provided. An analysis of the smart city concept together with a theoretical framework based on the knowledge society and the Quintuple Helix innovation model are included. Secondly, the study analyzes some smart city cases in developed and developing countries. Thirdly, it describes, criticizes and compares some well-known smart city indexes. Lastly, the empirical literature is explored to detect if there are studies proposing changes in smart city indexes or methodologies to consider the macro level variables. It results that cities at the top of the indexes rankings are from developed countries. On the other side, most cities at the bottom of the ranking are from developing or not developed countries. As a result, it is addressed that the ICT development of Smart Cities depends both on the cities’ characteristics and features, and on macro-technological factors. Secondly, there is a scarce number of papers in the subject including macro or country factors, and most of them are revisions of the literature or case studies. There is a lack of studies discussing the indexes’ methodologies. This paper provides some guidelines to build one….(More)”.

Stop the Open Data Bus, We Want to Get Off

Curated on August 25, 2019August 25, 2019 by Stefaan Verhulst

Paper by Chris Culnane, Benjamin I. P. Rubinstein, and Vanessa Teague: “The subject of this report is the re-identification of individuals in the Myki public transport dataset released as part of the Melbourne Datathon 2018. We demonstrate the ease with which we were able to re-identify ourselves, our co-travellers, and complete strangers; our analysis raises concerns about the nature and granularity of the data released, in particular the ability to identify vulnerable or sensitive groups…..

This work highlights how a large number of passengers could be re-identified in the 2018 Myki data release, with detailed discussion of specific people. The implications of re-identification are potentially serious: ex-partners, one-time acquaintances, or other parties can determine places of home, work, times of travel, co-travelling patterns—presenting risk to vulnerable groups in particular…

In 2018 the Victorian Government released a large passenger centric transport dataset to a data science competition—the 2018 Melbourne Datathon. Access to the data was unrestricted, with a URL provided on the datathon’s website to download the complete dataset from an Amazon S3 Bucket. Over 190 teams continued to analyse the data through the 2 month competition period. The data consisted of touch on and touch off events for the Myki smart card ticketing system used throughout the state of Victoria, Australia. With such data, contestants would be able to apply retrospective analyses on an entire public transport system, explore suitability of predictive models, etc.

The Myki ticketing system is used across Victorian public transport: on trains, buses and trams. The dataset was a longitudinal dataset, consisting of touch on and touch off events from Week 27 in 2015 through to Week 26 in 2018. Each event contained a card identifier (cardId; not the actual card number), the card type, the time of the touch on or off, and various location information, for example a stop ID or route ID, along with other fields which we omit here for brevity. Events could be indexed by the cardId and as such, all the events associated with a single card could be retrieved. There are a total of 15,184,336 cards in the dataset—more than twice the 2018 population of Victoria. It appears that all touch on and off events for metropolitan trains and trams have been included, though other forms of transport such as intercity trains and some buses are absent. In total there are nearly 2 billion touch on and off events in the dataset.

No information was provided as to the de-identification that was performed on the dataset. Our analysis indicates that little to no de-identification took place on the bulk of the data, as will become evident in Section 3. The exception is the cardId, which appears to have been mapped in some way from the Myki Card Number. The exact mapping has not been discovered, although concerns remain as to its security effectiveness….(More)”.

Datafication and accountability in public health

Curated on August 24, 2019August 25, 2019 by Stefaan Verhulst

Introduction to a special issue of Social Studies of Science by Klaus Hoeyer, Susanne Bauer, and Martyn Pickersgill: “In recent years and across many nations, public health has become subject to forms of governance that are said to be aimed at establishing accountability. In this introduction to a special issue, From Person to Population and Back: Exploring Accountability in Public Health, we suggest opening up accountability assemblages by asking a series of ostensibly simple questions that inevitably yield complicated answers: What is counted? What counts? And to whom, how and why does it count? Addressing such questions involves staying attentive to the technologies and infrastructures through which data come into being and are made available for multiple political agendas. Through a discussion of public health, accountability and datafication we present three key themes that unite the various papers as well as illustrate their diversity….(More)”.

Governance sinkholes

Curated on August 24, 2019August 24, 2019 by Stefaan Verhulst

Blog post by Geoff Mulgan: “Governance sinkholes appear when shifts in technology, society and the economy throw up the need for new arrangements. Each industrial revolution has created many governance sinkholes – and prompted furious innovation to fill them. The fourth industrial revolution will be no different. But most governments are too distracted to think about what to do to fill these holes, let alone to act. This blog sets out my diagnosis – and where I think the most work is needed to design new institutions….

It’s not too hard to get a map of the fissures and gaps – and to see where governance is needed but is missing. There are all too many of these now.

Here are a few examples. One is long-term care, currently missing adequate financing, regulation, information and navigation tools, despite its huge and growing significance. The obvious contrast is with acute healthcare, which, for all its problems, is rich in institutions and governance.

A second example is lifelong learning and training. Again, there is a striking absence of effective institutions to provide funding, navigation, policy and problem solving, and again, the contrast with the institution-rich fields of primary, secondary and tertiary education is striking. The position on welfare is not so different, as is the absence of institutions fit for purpose in supporting people in precarious work.

I’m particularly interested in another kind of sinkhole: the absence of the right institutions to handle data and knowledge – at global, national and local levels – now that these dominate the economy, and much of daily life. In field after field, there are huge potential benefits to linking data sets and connecting artificial and human intelligence to spot patterns or prevent problems. But we lack any institutions with either the skills or the authority to do this well, and in particular to think through the trade-offs between the potential benefits and the potential risks….(More)”.

What the Hack? – Towards a Taxonomy of Hackathons

Curated on August 24, 2019August 27, 2019 by Stefaan Verhulst

Paper by Christoph Kollwitz and Barbara Dinter: “In order to master the digital transformation and to survive in global competition, companies face the challenge of improving transformation processes, such as innovation processes. However, the design of these processes poses a challenge, as the related knowledge is still largely in its infancy. A popular trend since the mid-2000s are collaborative development events, so-called hackathons, where people with different professional backgrounds work collaboratively on development projects for a defined period. While hackathons are a widespread phenomenon in practice and many field reports and individual observations exist, there is still a lack of holistic and structured representations of the new phenomenon in literature.

The paper at hand aims to develop a taxonomy of hackathons in order to illustrate their nature and underlying characteristics. For this purpose, a systematic literature review is combined with existing taxonomies or taxonomy-like artifacts (e.g. morphological boxes, typologies) from similar research areas in an iterative taxonomy development process. The results contribute to an improved understanding of the phenomenon hackathon and allow the more effective use of hackathons as a new tool in organizational innovation processes. Furthermore, the taxonomy provides guidance on how to apply hackathons for organizational innovation processes….(More)”.

Governing Complexity: Analyzing and Applying Polycentricity

Curated on August 24, 2019August 26, 2019 by Stefaan Verhulst

Book edited by Andreas Thiel, William A. Blomquist, and Dustin E. Garrick: “There has been a rapid expansion of academic interest and publications on polycentricity. In the contemporary world, nearly all governance situations are polycentric, but people are not necessarily used to thinking this way. Governing Complexity provides an updated explanation of the concept of polycentric governance. The editors provide examples of it in contemporary settings involving complex natural resource systems, as well as a critical evaluation of the utility of the concept. With contributions from leading scholars in the field, this book makes the case that polycentric governance arrangements exist and it is possible for polycentric arrangements to perform well, persist for long periods, and adapt. Whether they actually function well, persist, or adapt depends on multiple factors that are reviewed and discussed, both theoretically and with examples from actual cases….(More)”.

After Technopoly

Curated on August 23, 2019August 23, 2019 by Stefaan Verhulst

Alan Jacobs at the New Atlantis: “Technocratic solutionism is dying. To replace it, we must learn again the creation and reception of myth….
What Neil Postman called “technopoly” may be described as the universal and virtually inescapable rule of our everyday lives by those who make and deploy technology, especially, in this moment, the instruments of digital communication. It is difficult for us to grasp what it’s like to live under technopoly, or how to endure or escape or resist the regime. These questions may best be approached by drawing on a handful of concepts meant to describe a slightly earlier stage of our common culture.

First, following on my earlier essay in these pages, “Wokeness and Myth on Campus” (Summer/Fall 2017), I want to turn again to a distinction by the Polish philosopher Leszek Kołakowski between the “technological core” of culture and the “mythical core” — a distinction he believed is essential to understanding many cultural developments.

“Technology” for Kołakowski is something broader than we usually mean by it. It describes a stance toward the world in which we view things around us as objects to be manipulated, or as instruments for manipulating our environment and ourselves. This is not necessarily meant in a negative sense; some things ought to be instruments — the spoon I use to stir my soup — and some things need to be manipulated — the soup in need of stirring. Besides tools, the technological core of culture includes also the sciences and most philosophy, as those too are governed by instrumental, analytical forms of reasoning by which we seek some measure of control.

By contrast, the mythical core of culture is that aspect of experience that is not subject to manipulation, because it is prior to our instrumental reasoning about our environment. Throughout human civilization, says Kołakowski, people have participated in myth — they may call it “illumination” or “awakening” or something else — as a way of connecting with “nonempirical unconditioned reality.” It is something we enter into with our full being, and all attempts to describe the experience in terms of desire, will, understanding, or literal meaning are ways of trying to force the mythological core into the technological core by analyzing and rationalizing myth and pressing it into a logical order. This is why the two cores are always in conflict, and it helps to explain why rational argument is often a fruitless response to people acting from the mythical core….(More)”.