Virtuous and vicious circles in the data life-cycle

Paper by Elizabeth Yakel, Ixchel M. Faniel, and Zachary J. Maiorana: “In June 2014, ‘Data sharing reveals complexity in the westward spread of domestic animals across Neolithic Turkey’, was published in PLoS One (Arbuckle et al. 2014). In this article, twenty-three authors, all zooarchaeologists, representing seventeen different archaeological sites in Turkey investigated the domestication of animals across Neolithic southwest Asia, a pivotal era of change in the region’s economy. The PLoS One article originated in a unique data sharing, curation, and reuse project in which a majority of the authors agreed to share their data and perform analyses across the aggregated datasets. The extent of data sharing and the breadth of data reuse and collaboration were previously unprecedented in archaeology. In the present article, we conduct a case study of the collaboration leading to the development of the PLoS One article. In particular, we focus on the data sharing, data curation, and data reuse practices exercised during the project in order to investigate how different phases in the data life-cycle affected each other.

Studies of data practices have generally engaged issues from the singular perspective of data producers, sharers, curators, or reusers. Furthermore, past studies have tended to focus on one aspect of the life-cycle (production, sharing, curation, reuse, etc.). A notable exception is Carlson and Anderson’s (2007) comparative case study of four research projects which discusses the life-cycle of data from production through sharing with an eye towards reuse. However, that study primarily addresses the process of data sharing. While we see from their research that data producers’ and curators’ decisions and actions regarding data are tightly coupled and have future consequences, those consequences are not fully explicated since the authors do not discuss reuse in depth.

Taking a perspective that captures the trajectory of data, our case study discusses actions and their consequences throughout the data life-cycle. Our research theme explores how different stakeholders and their work practices positively and/or negatively affected other phases of the life-cycle. More specifically, we focus on data production practices and data selection decisions made during data sharing as these have frequent and diverse consequences for other life-cycle phases in our case study. We address the following research questions:

  1. How do different aspects of data production positively and negatively impact other phases in the life-cycle?
  2. How do data selection decisions during sharing positively and negatively impact other phases in the life-cycle?
  3. How can the work of data curators intervene to reinforce positive actions or mitigate negative actions?…”