Hal Hodson in the New Scientist: “The search giant is automatically building Knowledge Vault, a massive database that could give us unprecedented access to the world’s facts.
GOOGLE is building the largest store of knowledge in human history – and it’s doing so without any human help. Instead, Knowledge Vault autonomously gathers and merges information from across the web into a single base of facts about the world, and the people and objects in it.
The breadth and accuracy of this gathered knowledge is already becoming the foundation of systems that allow robots and smartphones to understand what people ask them. It promises to let Google answer questions like an oracle rather than a search engine, and even to turn a new lens on human history.
Knowledge Vault is a type of “knowledge base” – a system that stores information so that machines as well as people can read it. Where a database deals with numbers, a knowledge base deals with facts. When you type “Where was Madonna born” into Google, for example, the place given is pulled from Google’s existing knowledge base.
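As a rough illustration of the idea, and not a description of Google's actual design, a knowledge base can be pictured as a collection of subject-predicate-object facts that a program can query. The `Fact` type, the `born_in` and `located_in` predicate names and the `lookup` helper below are all assumptions made for this sketch.

```python
from typing import List, NamedTuple, Optional


class Fact(NamedTuple):
    """One machine-readable statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# A toy knowledge base. The predicate names are hypothetical.
FACTS: List[Fact] = [
    Fact("Madonna", "born_in", "Bay City, Michigan"),
    Fact("Bay City, Michigan", "located_in", "United States"),
]


def lookup(subject: str, predicate: str, facts: List[Fact] = FACTS) -> Optional[str]:
    """Return the object of the first matching fact, or None if nothing matches."""
    for fact in facts:
        if fact.subject == subject and fact.predicate == predicate:
            return fact.obj
    return None


# "Where was Madonna born" reduces to a lookup on (Madonna, born_in).
print(lookup("Madonna", "born_in"))
```

A real system would also have to map the typed question onto the right subject and predicate, which this sketch leaves out.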
This existing base, called Knowledge Graph, relies on crowdsourcing to expand its information. But the firm noticed that growth was stalling; humans could only take it so far. So Google decided it needed to automate the process. It started building the Vault by using an algorithm to automatically pull in information from all over the web, using machine learning to turn the raw data into usable pieces of knowledge.
Knowledge Vault has pulled in 1.6 billion facts to date. Of these, 271 million are rated as “confident facts”, to which Google’s model ascribes a more than 90 per cent chance of being true. It does this by cross-referencing new facts with what it already knows.
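To make the numbers concrete, the sketch below filters a list of scored facts by the 90 per cent threshold mentioned above and works out what share of the 1.6 billion facts counts as confident. The `ScoredFact` type, the example facts and their probabilities are invented for illustration; how Google's model actually scores a fact is not described here.

```python
from typing import List, NamedTuple


class ScoredFact(NamedTuple):
    """A candidate fact plus the model's estimated probability that it is true."""
    subject: str
    predicate: str
    obj: str
    probability: float


# Threshold taken from the article: more than a 90 per cent chance of being true.
CONFIDENCE_THRESHOLD = 0.9


def confident_facts(facts: List[ScoredFact],
                    threshold: float = CONFIDENCE_THRESHOLD) -> List[ScoredFact]:
    """Keep only facts whose probability is strictly above the threshold."""
    return [fact for fact in facts if fact.probability > threshold]


# Hypothetical candidates with made-up probabilities.
candidates = [
    ScoredFact("Madonna", "born_in", "Bay City, Michigan", 0.97),
    ScoredFact("Madonna", "born_in", "Detroit", 0.35),
]
kept = confident_facts(candidates)

# Share of confident facts, using the figures quoted in the article.
share = 271_000_000 / 1_600_000_000
print(len(kept), f"{share:.1%}")
```

On the article's figures, roughly one fact in six clears the threshold.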
“It’s a hugely impressive thing that they are pulling off,” says Fabian Suchanek, a data scientist at Télécom ParisTech in France.
Google’s Knowledge Graph is currently bigger than the Knowledge Vault, but it only includes manually integrated sources such as the CIA Factbook.
Knowledge Vault offers Google fast, automatic expansion of its knowledge – and it’s only going to get bigger. As well as analysing text on a webpage for facts to feed its knowledge base, Google can also peer under the surface of the web, hunting for hidden sources of data such as the figures that feed Amazon product pages.
Tom Austin, a technology analyst at Gartner in Boston, says that the world’s biggest technology companies are racing to build similar vaults. “Google, Microsoft, Facebook, Amazon and IBM are all building them, and they’re tackling these enormous problems that we would never even have thought of trying 10 years ago,” he says.
The potential of a machine system that has the whole of human knowledge at its fingertips is huge. One of the first applications will be virtual personal assistants that go way beyond what Siri and Google Now are capable of, says Austin…”