The Data Revolution

Review of Rob Kitchin’s The Data Revolution: Big Data, Open Data, Data Infrastructures & their Consequences by David Moats in Theory, Culture and Society: “…As an industry, academia is not immune to cycles of hype and fashion. Terms like ‘postmodernism’, ‘globalisation’, and ‘new media’ have each had their turn filling the top line of funding proposals. Although they are each grounded in tangible shifts, these terms become stretched and fudged to the point of becoming almost meaningless. Yet, they elicit strong, polarised reactions. For at least the past few years, ‘big data’ seems to be the buzzword, which elicits funding, as well as the ire of many in the social sciences and humanities.

Rob Kitchin’s book The Data Revolution is one of the first systematic attempts to strip back the hype surrounding our current data deluge and take stock of what is really going on. This is crucial because this hype is underpinned by very real societal change, threats to personal privacy and shifts in store for research methods. The book acts as a helpful wayfinding device in an unfamiliar terrain, which is still being reshaped, and is admirably written in a language relevant to social scientists, comprehensible to policy makers and accessible even to the less tech savvy among us.

The Data Revolution seems to present itself as the definitive account of this phenomena but in filling this role ends up adopting a somewhat diplomatic posture. Kitchin takes all the correct and reasonable stances on the matter and advocates all the right courses of action but he is not able to, in the context of this book, pursue these propositions fully. This review will attempt to tease out some of these latent potentials and how they might be pushed in future work, in particular the implications of the ‘performative’ character of both big data narratives and data infrastructures for social science research.

Kitchin’s book starts with the observation that ‘data’ is a misnomer – etymologically data should refer to phenomena in the world which can be abstracted, measured etc. as opposed to the representations and measurements themselves, which should by all rights be called ‘capta’. This is ironic because the worst offenders in what Kitchin calls “data boosterism” seem to conflate data with ‘reality’, unmooring data from its conditions of production and making relationship between the two given or natural.

As Kitchin notes, following Bowker (2005), ‘raw data’ is an oxymoron: data are not so much mined as produced and are necessarily framed technically, ethically, temporally, spatially and philosophically. This is the central thesis of the book, that data and data infrastructures are not neutral and technical but also social and political phenomena. For those at the critical end of research with data, this is a starting assumption, but one which not enough practitioners heed. Most of the book is thus an attempt to flesh out these rapidly expanding data infrastructures and their politics….

Kitchin is at his best when revealing the gap between the narratives and the reality of data analysis such as the fallacy of empiricism – the assertion that, given the granularity and completeness of big data sets and the availability of machine learning algorithms which identify patterns within data (with or without the supervision of human coders), data can “speak for themselves”. Kitchin reminds us that no data set is complete and even these out-of-the-box algorithms are underpinned by theories and assumptions in their creation, and require context specific knowledge to unpack their findings. Kitchin also rightly raises concerns about the limits of big data, that access and interoperability of data is not given and that these gaps and silences are also patterned (Twitter is biased as a sample towards middle class, white, tech savy people). Yet, this language of veracity and reliability seems to suggest that big data is being conceptualised in relation to traditional surveys, or that our population is still the nation state, when big data could helpfully force us to reimagine our analytic objects and truth conditions and more pressingly, our ethics (Rieder, 2013).

However, performativity may again complicate things. As Kitchin observes, supermarket loyalty cards do not just create data about shopping, they encourage particular sorts of shopping; when research subjects change their behaviour to cater to the metrics and surveillance apparatuses built into platforms like Facebook (Bucher, 2012), then these are no longer just data points representing the social, but partially constitutive of new forms of sociality (this is also true of other types of data as discussed by Savage (2010), but in perhaps less obvious ways). This might have implications for how we interpret data, the distribution between quantitative and qualitative approaches (Latour et al., 2012) or even more radical experiments (Wilkie et al., 2014). Kitchin is relatively cautious about proposing these sorts of possibilities, which is not the remit of the book, though it clearly leaves the door open…(More)”