Massive, Unarchivable Datasets of Cancer, Covid, and Alzheimer’s Research Could Be Lost Forever


Article by Sam Cole: “Almost two dozen repositories of research and public health data supported by the National Institutes of Health are marked for “review” under the Trump administration’s direction, and researchers and archivists say the data is at risk of being lost forever if the repositories go down. 

“The problem with archiving this data is that we can’t,” Lisa Chinn, Head of Research Data Services at the University of Chicago, told 404 Media. Unlike other government datasets or web pages, downloading or otherwise archiving NIH data often requires a Data Use Agreement between a research institution and the agency, and those agreements are carefully administered through a disclosure risk review process.

A message appeared at the top of multiple NIH websites last week that says: “This repository is under review for potential modification in compliance with Administration directives.”

Repositories with the message include archives of cancer imagery, Alzheimer’s disease research, sleep studies, HIV databases, and COVID-19 vaccination and mortality data…

“So far, it seems like what is happening is less that these data sets are actively being deleted or clawed back and more that they are laying off the workers whose job is to maintain them, update them and maintain the infrastructure that supports them,” a librarian affiliated with the Data Rescue Project told 404 Media. “In time, this will have the same effect, but it’s really hard to predict. People don’t usually appreciate, much less our current administration, how much labor goes into maintaining a large research dataset.”…”