Bringing Structure and Design to Data Governance


Report by John Wilbanks et al: “Before COVID-19 took over the world, the Governance team at Sage Bionetworks had started working on an analysis of data governance structures and systems to be published as a “green paper” in late 2020. Today we’re happy to publicly release that paper, Mechanisms to Govern Responsible Sharing of Open Data: A Progress Report.

In the paper, we provide a landscape analysis of models of governance for open data sharing based on our observations in the biomedical sciences. We offer an overview of those observations and show areas where we think this work can expand to supply further support for open data sharing outside the sciences.

The central argument of this paper is that the “right” system of governance is determined by first understanding the nature of the collaborative activities intended. These activities map to types of governance structures, which in turn can be built out of standardized parts — what we call governance design patterns. In this way, governance for data science can be easy to build, follow key laws and ethics regimes, and enable innovative models of collaboration. We provide an initial survey of structures and design patterns, as well as examples of how we leverage this approach to rapidly build out ethics-centered governance in biomedical research.

While there is no one-size-fits-all solution, we argue for learning from ongoing data science collaborations and building on from existing standards and tools. And in so doing, we argue for data governance as a discipline worthy of expertise, attention, standards, and innovation.

We chose to call this report a “green paper” in recognition of its maturity and coverage: it’s a snapshot of our data governance ecosystem in biomedical research, not the world of all data governance, and the entire field of data governance is in its infancy. We have licensed the paper under CC-BY 4.0 and published it in github via Manubot in hopes that the broader data governance community might fill in holes we left, correct mistakes we made, add references and toolkits and reference implementations, and generally treat this as a framework for talking about how we share data…(More)”.