Forget GMOs. The Future of Food Is Data—Mountains of It

Cade Metz at Wired: “… Led by Dan Zigmond—who previously served as chief data scientist for YouTube, then Google Maps—this ambitious project aims to accelerate the work of all the biochemists, food scientists, and chefs on the first floor, providing a computer-generated shortcut to what Hampton Creek sees as the future of food. “We’re looking at the whole process,” Zigmond says of his data team, “trying to figure out what it all means and make better predictions about what is going to happen next.”

The project highlights a movement, spreading through many industries, that seeks to supercharge research and development using the kind of data analysis and manipulation pioneered in the world of computer science, particularly at places like Google and Facebook. Several projects already are using such techniques to feed the development of new industrial materials and medicines. Others hope the latest data analytics and machine learning techniques can help diagnose disease. “This kind of approach is going to allow a whole new type of scientific experimentation,” says Jeremy Howard, who as the president of Kaggle once oversaw the leading online community of data scientists and is now applying tricks of the data trade to healthcare as the founder of Enlitic.
Zigmond’s project is the first major effort to apply “big data” to the development of food, and though it’s only just getting started—with some experts questioning how effective it will be—it could spur additional research in the field. The company may license its database to others, and Hampton Creek founder and CEO Josh Tetrick says it may even open source the data, so to speak, freely sharing it with everyone. “We’ll see,” says Tetrick, a former college football linebacker who founded Hampton Creek after working on economic and social campaigns in Liberia and Kenya. “That would be in line with who we are as a company.”…
Initially, Zigmond and his team will model protein interactions on individual machines, using tools like the R programming language (a common means of crunching data) and machine learning algorithms much like those that recommend products on … As the database expands, they plan to arrange for much larger and more complex models that run across enormous clusters of computer servers, using the sort of sweeping data-analysis software systems employed by the likes of Google. “Even as we start to get into the tens and hundreds of thousands and millions of proteins,” Zigmond says, “it starts to be more than you can handle with traditional database techniques.”
In particular, Zigmond is exploring the use of deep learning, a form of artificial intelligence that goes beyond ordinary machine learning. Google is using deep learning to drive the speech recognition system in Android phones. Microsoft is using it to translate Skype calls from one language to another. Zigmond believes it can help model the creation of new foods….”