The New Tech Tools in Data Sharing


Essay by Massimo Russo and Tian Feng: “…Cloud providers are integrating data-sharing capabilities into their product suites and investing in R&D that addresses new features such as data directories, trusted execution environments, and homomorphic encryption. They are also partnering with industry-specific ecosystem orchestrators to provide joint solutions.

Cloud providers are moving beyond infrastructure to enable broader data sharing. In 2018, for example, Microsoft teamed up with Oracle and SAP to kick off its Open Data Initiative, which focuses on interoperability among the three large platforms. Microsoft has also begun an Open Data Campaign to close the data divide and help smaller organizations get access to data needed for innovation in artificial intelligence (AI). Amazon Web Services (AWS) has begun a number of projects designed to promote open data, including the AWS Data Exchange and the Open Data Sponsorship Program. In addition to these large providers, specialty technology companies and startups are likewise investing in solutions that further data sharing.

Technology solutions today generally fall into three categories: mitigating risks, enhancing value, and reducing friction. The following is a noncomprehensive list of solutions in each category.

1. Mitigating the Risks of Data Sharing

Potential financial, competitive, and brand risks associated with data disclosure inhibit data sharing. To address these risks, data platforms are embedding solutions to control use, limit data access, encrypt data, and create substitute or synthetic data. (See slide 2 in the slideshow.)

Data Breaches. Here are some of the technological solutions designed to prevent data breaches and unauthorized access to sensitive or private data:

  • Data modification techniques alter individual data elements or full data sets while maintaining data integrity. They provide increasing levels of protection but at a cost: loss of granularity of the underlying data. De-identification and masking strip personal identifier information and use encryption, allowing most of the data value to be preserved. More complex encryptions can increase security, but they also remove resolution of information from the data set.
  • Secure data storage and transfer can help ensure that data stays safe both at rest and in transit. Cloud solutions such as Microsoft Azure and AWS have invested in significant platform security and interoperability.
  • Distributed ledger technologies, such as blockchain, permit data to be stored and shared in a decentralized manner that makes it very difficult to tamper with. IOTA, for example, is a distributed ledger platform for IoT applications supported by industry players such as Bosch and Software AG.
  • Secure computation enables analysis without revealing details of the underlying data. This can be done at a software level, with techniques such as secure multiparty computation (MPC) that allow potentially untrusting parties to jointly compute a function without revealing their private inputs. For example, with MPC, two parties can calculate the intersection of their respective encrypted data sets while revealing only information about the intersection. Google, for one, is embedding MPC in its open-source Private Join and Compute tools.
  • Trusted execution environments (TEEs) are hardware modules separate from the operating system that allow for secure data processing within an encrypted private area on the chip. Startup Decentriq is partnering with Intel and Microsoft to explore confidential computing by means of TEEs. There is a significant opportunity for IoT equipment providers to integrate TEEs into their products….(More)”
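
To make the secure multiparty computation idea above more concrete, here is a minimal sketch of one common building block, additive secret sharing, used to compute a joint sum. It is an illustration under stated assumptions, not the method used by Google's Private Join and Compute or any other product named in the essay. The function names (make_shares, secure_sum) and the modulus are hypothetical choices, all parties are simulated in one process, and every party is assumed to follow the protocol honestly.

```python
import secrets

# Assumed modulus for share arithmetic; any prime larger than the true sum works.
PRIME = 2**61 - 1


def make_shares(value, n_parties):
    """Split `value` into n random-looking shares that add up to it mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]


def secure_sum(private_inputs):
    """Simulate parties jointly computing a sum without pooling raw inputs."""
    n = len(private_inputs)
    # Each party i splits its input and would send share j to party j.
    all_shares = [make_shares(v, n) for v in private_inputs]
    # Party j adds the shares it received; one partial sum alone looks random.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % PRIME for j in range(n)
    ]
    # Combining the published partial sums yields only the total.
    return sum(partial_sums) % PRIME


if __name__ == "__main__":
    # Three hypothetical organizations with private counts.
    print(secure_sum([120, 75, 305]))  # 500
```

In a real deployment the shares would travel over authenticated channels between separate machines, and the protocol would need protections against parties that deviate from it; the sketch only shows why no single share or partial sum discloses an individual input.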