First, design for data sharing


John Wilbanks & Stephen H Friend in Nature Biotechnology: “To upend current barriers to sharing clinical data and insights, we need a framework that not only accounts for choices made by trial participants but also qualifies researchers wishing to access and analyze the data.

This March, Sage Bionetworks (Seattle) began sharing curated data collected from >9,000 participants of mPower, a smartphone-enabled health research study for Parkinson’s disease. The mPower study is notable as one of the first observational assessments of human health to rapidly achieve scale as a result of its design and execution purely through a smartphone interface. To support this unique study design, we developed a novel electronic informed consent process that includes participant-determined data-sharing preferences. It is through these preferences that the new data—including self-reported outcomes and quantitative sensor data—are shared broadly for secondary analysis. Our hope is that by sharing these data immediately, prior even to our own complete analysis, we will shorten the time to harnessing any utility that this study’s data may hold to improve the condition of patients who suffer from this disease.

Turbulent times for data sharing

Our release of mPower comes at a turbulent time in data sharing. The power of data for secondary research is top of mind for many these days. Vice President Joe Biden, in heading President Barack Obama’s ambitious cancer ‘moonshot’, describes data sharing as second only to funding to the success of the effort. However, this powerful support for data sharing stands in opposition to the opinions of many within the research establishment. To wit, the august New England Journal of Medicine (NEJM)’s recent editorial suggesting that those who wish to reuse clinical trial data without the direct participation and approval of the original study team are “research parasites”4. In the wake of colliding perspectives on data sharing, we must not lose sight of the scientific and societal ends served by such efforts.

It is important to acknowledge that meaningful data sharing is a nontrivial process that can require substantial investment to ensure that data are shared with sufficient context to guide data users. When data analysis is narrowly targeted to answer a specific and straightforward question—as with many clinical trials—this added effort might not result in improved insights. However, many areas of science, such as genomics, astronomy and high-energy physics, have moved to data collection methods in which large amounts of raw data are potentially of relevance to a wide variety of research questions, but the methodology of moving from raw data to interpretation is itself a subject of active research….(More)”