Microsoft Research Open Data


Microsoft Research Open Data: “… is a data repository that makes available datasets that researchers at Microsoft have created and published in conjunction with their research. You can browse available datasets and either download them or directly copy them to an Azure-based Virtual Machine or Data Science Virtual Machine. To the extent possible, we follow FAIR (findable, accessible, interoperable and reusable) data principles and will continue to push towards the highest standards for data sharing. We recognize that there are dozens of data repositories already in use by researchers and expect that the capabilities of this repository will augment existing efforts. Datasets are categorized by their primary research area. You can find links to research projects or publications with the dataset.

What is our goal?

Our goal is to provide a simple platform to Microsoft’s researchers and collaborators to share datasets and related research technologies and tools. The site has been designed to simplify access to these data sets, facilitate collaboration between researchers using cloud-based resources, and enable the reproducibility of research. We will continue to evolve and grow this repository and add features to it based on feedback from the community.

How did this project come to be?

Over the past few years, our team, based at Microsoft Research, has worked extensively with the research community to create cloud-based research infrastructure. We started this project as a prototype about a year ago and are excited to finally share it with the research community to support data-intensive research in the cloud. Because almost all research projects have a data component, there is real need for curated and meaningful datasets in the research community, not only in computer science but in interdisciplinary and domain sciences. We have now made several such datasets available for download or use directly on cloud infrastructure…”