US start-up aims to steer through flood of data


Richard Waters in the Financial Times: “The ‘open data’ movement has produced a deluge of publicly available information this decade, as governments like those in the UK and US have released large volumes of data for general use.

But the flood has left researchers and data scientists with a problem: how do they find the best data sets, ensure these are accurate and up to date, and combine them with other sources of information?

The most ambitious in a spate of start-ups trying to tackle this problem is set to be unveiled on Monday, when data.world opens for limited release. A combination of online repository and social network, the site is designed to be a central platform to support the burgeoning activity around freely available data.

The aim closely mirrors GitHub, which has been credited with spurring the open source software movement by becoming both a place to store and find free programs and a crowdsourcing tool for identifying the most useful.

“We are at an inflection point,” said Jeff Meisel, chief marketing officer for the US Census Bureau. A “massive amount of data” has been released under open data provisions, he said, but “what hasn’t been there are the tools, the communities, the infrastructure to make that data easier to mash up”….

Data.world plans to seed its site with about a thousand data sets and attract academics as its first users, said Mr Hurt. By letting users create personal profiles on the site, follow others and collaborate around the information they are working on, the site hopes to create the kind of social dynamic that makes it more useful the more it is used.

An attraction of the service is the ability to upload data in any format and then use common web standards to link different data sets and create mash-ups with the information, said Dean Allemang, an expert in online data…”