Building a Data Infrastructure for the Bioeconomy


Article by Gopal P. Sarma and Melissa Haendel: “While the development of vaccines for COVID-19 has been widely lauded, other successful components of the national response to the pandemic have not received as much attention. The National COVID Cohort Collaborative (N3C), for example, flew under the public’s radar, even though it aggregated crucial US public health data about the new disease through cross-institutional collaborations among government, private, and nonprofit health and research organizations. These data, which were made available to researchers via cutting-edge software tools, have helped in myriad ways: they led to identification of the clinical characteristics of acute COVID-19 for risk prediction, assisted in providing clinical care for immunocompromised adults, revealed how COVID infection affects children, and documented that vaccines appear to reduce the risk of developing long COVID.

N3C has created the largest national, publicly available patient-level dataset in US history. Through a unique public-private partnership, over 300 participating organizations quickly overcame privacy concerns and data silos to include 13 million patient records in the project. More than 3,000 participating scientists are now working to overcome the particular challenge faced in the United States—the lack of a national healthcare data infrastructure available in many other countries—to support public health and medical responses. N3C shows great promise for unraveling answers to questions related to COVID, but it could easily be expanded for many areas of public health, including pandemic preparedness and monitoring disease status across the population.

As public servants dedicated to improving public health and equity, we believe that to unite the nation’s fragmented public health system, the United States should establish a standing capacity to collect, harmonize, and sustain a wide range of data types and sources. The public health data collected by N3C would ultimately be but one component of a rich landscape of interoperable data systems that can guide public policy in an era of rapid environmental change, sophisticated biological threats, and an economy enabled by biotechnology. Such an effort will require new thinking about data collection, infrastructure, and regulation, but its benefits could be enormous—enabling policymakers to make decisions in an increasingly complex world. And as the interconnections between society, industry, and government continue to intensify, decisionmaking of all types and scales will be more efficient and responsive if it can rely on significantly expanded data collection and analysis capabilities…”