Driving Innovation With Open Data


Research Article by The GovLab's Joel Gurin (Chapter 6 in the report "The Future of Data-Driven Innovation"): The chapters in this report provide ample evidence of the power of data and its business potential. But like any business resource, data is only valuable if the benefit of using it outweighs its cost. Data collection, management, distribution, quality control, and application all come at a price—a potential obstacle for companies of any size, though especially for small and medium-sized enterprises.
Over the last several years, however, the “I” of data’s return on investment (ROI) has become less of a hurdle, and new data-driven companies are developing rapidly as a result. One major reason is that governments at the federal, state, and local level are making more data available at little or no charge for the private sector and the public to use. Governments collect data of all kinds—including scientific, demographic, and financial data—at taxpayer expense.
Now, public sector agencies and departments are increasingly repaying that public investment by making their data available to all for free or at a low cost. This is Open Data. While there are still costs in putting the data to use, the growing availability of this national resource is becoming a significant driver for hundreds of new businesses. This chapter describes the growing potential of Open Data and the data-driven innovation it supports, the types of data and applications that are most promising, and the policies that will encourage innovation going forward.

Open Data as Universal Service. New perspectives in the Information Profession


Paper by L. Fernando Ramos Simón et al. in Procedia – Social and Behavioral Sciences: "The Internet provides a global information flow, which improves living conditions in poor countries as well as in rich ones. Owing to its abundance and quality, public information (meteorological, geographic and transport information, and also the content managed in libraries, archives and museums) is an incentive for change, becoming invaluable and accessible to all citizens. It is clear that Open Data plays a significant role and provides a business service in the digital economy. Nevertheless, it remains unclear how this wealth of public data might be offered as a universal service, making it available to all citizens in matters of education, health and culture: a function that has traditionally been assumed by libraries. In addition, information professionals will have to acquire new skills that enable them to assume a new role in information management: data management (Open Data) and content management (Open Content). Thus, this study analyzes the new roles that information professionals will assume, covering metadata, interoperability, access licenses, information search and retrieval tools, and applications for data queries…"

Why the Open Definition Matters for Open Data: Quality, Compatibility and Simplicity


Rufus Pollock at Open Knowledge: “The Open Definition performs an essential function as a “standard”, ensuring that when you say “open data” and I say “open data” we both mean the same thing. This standardization, in turn, ensures the quality, compatibility and simplicity essential to realizing one of the main practical benefits of “openness”: the greatly increased ability to combine different datasets together to drive innovation, insight and change. …
However, these benefits are at significant risk both from quality-dilution and "open-washing" (non-open data being passed off as open), as well as from fragmentation of the ecosystem, as the proliferation of open licenses, each with their own slightly different terms and conditions, leads to incompatibility. The Open Definition helps eliminate these risks and ensures we realize the full benefits of open. It acts as the "gold standard" for open content and data, guaranteeing quality and preventing incompatibility. This post explores in more detail why it's important to have the Open Definition and the clear standard it provides for what "open" means in open data and open content…"

Uncovering State And Local Gov’s 15 Hidden Successes


Emily Jarvis at GovLoop: “From garbage trucks to vacant lots, cities and states are often tasked with the thankless job of cleaning up a community’s mess. These are tasks that are often overlooked, but are critical to keeping a community vibrant.
But even in these sometimes thankless jobs, there are real innovations happening. Take Miami-Dade County where they are using hybrid garbage trucks to save the community millions of dollars in fuel every year and make the environment a little cleaner. Or head over to Milwaukee where the city is turning vacant and abandoned lots into urban farms.
Those are just two of the fifteen examples GovLoop uncovered in our new guide, From the State House to the County Clerk – 15 Challenges and Success Stories.
We have broken the challenges into four categories:

  • Internal Best Practices
  • Tech Challenges
  • Health and Safety
  • Community Engagement and Outreach

Here's another example: the open data movement has the potential to affect governing and civic engagement at the state and local government levels. But today very few agencies are actively providing open data. In fact, only 46 U.S. cities and counties have open data sites. One of the cities on the leading edge of the open data movement is Fort Worth, Texas.

“When I came into office, that was one of my campaign promises, that we would get Fort Worth into this century on technology and that we would take a hard look at open records requests and requests for data,” Mayor Betsy Price said in an interview with the Star-Telegram. “It goes a lot further to being transparent and letting people participate in their government and see what we are doing. It is the people’s data, and it should be easy to access.”

The website, data.fortworthtexas.gov, offers data and documents such as certificates of occupancy, development permits and residential permits for download in several formats, including Excel and PDF. Not all datasets are available yet — the city said its priority was to put the most-requested data on the portal first. Next up? Crime data, code violations, restaurant ratings and capital projects progress.

City officials’ ultimate goal is to create and adopt a full open data policy. As part of the launch, they are also looking for local software developers and designers who want to help guide the open data initiative. Those interested in participating can sign up online to receive more information….”

UN Data Revolution Group


Website: "UN Secretary-General Ban Ki-moon has asked an Independent Expert Advisory Group to make concrete recommendations on bringing about a data revolution in sustainable development. Here you can find out more about the work of the group, and feed into the process by adding your comments to this site or sending a private consultation submission…"


Why we’re failing to get the most out of open data


Victoria Lemieux at the WEF Blog: “An unprecedented number of individuals and organizations are finding ways to explore, interpret and use Open Data. Public agencies are hosting Open Data events such as meetups, hackathons and data dives. The potential of these initiatives is great, including support for economic development (McKinsey, 2013), anti-corruption (European Public Sector Information Platform, 2014) and accountability (Open Government Partnership, 2012). But is Open Data’s full potential being realized?
A news item from Computer Weekly casts doubt. A recent report notes that, in the United Kingdom, poor data quality is hindering the government's Open Data program. The report goes on to explain that – in an effort to make the public sector more transparent and accountable – UK public bodies have been publishing spending records every month since November 2010. The authors of the report, who conducted an analysis of 50 spending-related data releases by the Cabinet Office since May 2010, found that the data was of such poor quality that using it would require advanced computer skills.
Far from being a one-off problem, research suggests that this issue is ubiquitous and endemic. Some estimates indicate that as much as 80 percent of the time and cost of an analytics project is attributable to the need to clean up “dirty data” (Dasu and Johnson, 2003).
In addition to data quality issues, data provenance can be difficult to determine. Knowing where data originates and by what means it has been disclosed is key to being able to trust data. If end users do not trust data, they are unlikely to believe they can rely upon the information for accountability purposes. Establishing data provenance does not "spring full blown from the head of Zeus." It entails a good deal of effort undertaking such activities as enriching data with metadata – data about data – recording the date of creation, the creator of the data, and who has had access to the data over time, and ensuring that both data and metadata remain unalterable.
Similarly, if people think that data could be tampered with, they are unlikely to place trust in it; full comprehension of data relies on the ability to trace its origins….”
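The article's point about keeping data and metadata "unalterable" can be illustrated with a small sketch. One common technique for tamper-evidence (an assumption here, not something the article prescribes) is to hash-chain provenance records, so that altering any earlier entry invalidates every later hash. All field names and values below are invented:

```python
import hashlib
import json

def add_provenance_entry(chain, event):
    """Append an event, binding it to the hash of the previous entry."""
    prev_hash = chain[-1]["entry_hash"] if chain else "0" * 64
    record = {"event": event, "prev_hash": prev_hash}
    record["entry_hash"] = hashlib.sha256(
        json.dumps(record, sort_keys=True).encode()
    ).hexdigest()
    chain.append(record)
    return chain

def verify(chain):
    """Recompute every hash; tampering anywhere breaks the chain."""
    prev = "0" * 64
    for record in chain:
        body = {"event": record["event"], "prev_hash": record["prev_hash"]}
        expected = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()
        ).hexdigest()
        if record["prev_hash"] != prev or record["entry_hash"] != expected:
            return False
        prev = record["entry_hash"]
    return True

chain = []
add_provenance_entry(chain, {"created_by": "Cabinet Office", "date": "2010-11"})
add_provenance_entry(chain, {"accessed_by": "analyst", "date": "2014-05"})
assert verify(chain)

chain[0]["event"]["created_by"] = "someone else"  # rewrite history...
assert not verify(chain)                          # ...and verification fails
```

The design choice is the same one behind append-only audit logs: trust comes not from hiding the metadata but from making any retroactive edit detectable.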

Plenario


About Plenario: “Plenario makes it possible to rethink the way we use open data. Instead of being constrained by the data that is accessible and usable, let’s start by formulating our questions and then find the data to answer them. Plenario makes this easy by tying together all datasets on one map and one timeline—because in the real world, everything affects everything else…
The problem
Over the past few years, levels of government from the federal administration to individual municipalities like the City of Chicago have begun embracing open data, releasing datasets publicly for free. This movement has vastly increased the amount of data available, but existing platforms and technologies are designed mainly to view and access individual datasets one at a time. This restriction contradicts decades of research contending that no aspect of the urban landscape is truly isolated; in today’s cities, everything is connected to everything else.
Furthermore, researchers are often limited in the questions they can ask by the data available to answer them. It is not uncommon to spend 75% of one's time locating, downloading, cleaning, and standardizing the relevant datasets—leaving precious few resources for the important work.
What we do
Plenario is designed to take us from "spreadsheets on the web" to truly smart open data. This rests on two fundamental breakthroughs:

1)  Allow users to assemble and download data from multiple, independent data sources, such as two different municipal data portals, or the federal government and a privately curated dataset.
2)  Unite all datasets along a single spatial and temporal index, making it possible to do complex aggregations with one query.

With these advances, Plenario allows users to study regions over specified time periods using all relevant data, regardless of original source, and represent the data as a single time series. By providing a single, centralized hub for open data, the Plenario platform enables urban scientists to ask the right questions with as few constraints as possible….
Plenario is being implemented by the Urban Center for Computation and Data and DataMade.
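The "single spatial and temporal index" idea can be sketched in a few lines. The following is a hypothetical illustration of the concept, not Plenario's actual implementation: records from two independent sources are snapped to a shared grid cell and time bucket, after which a single groupby aggregates across both. The datasets and values are invented:

```python
import pandas as pd

# Two independent open datasets (contents invented for the sketch).
crimes = pd.DataFrame({
    "lat": [41.88, 41.88, 41.79],
    "lon": [-87.63, -87.63, -87.60],
    "date": pd.to_datetime(["2014-06-02", "2014-06-04", "2014-06-03"]),
})
permits = pd.DataFrame({
    "lat": [41.88, 41.79],
    "lon": [-87.63, -87.60],
    "date": pd.to_datetime(["2014-06-05", "2014-06-02"]),
})

def spatiotemporal_index(df, cell=0.01, freq="W"):
    """Snap each record to a grid cell and a time bucket --
    the shared index that makes cross-dataset queries possible."""
    out = df.copy()
    out["cell"] = list(zip((out["lat"] / cell).round() * cell,
                           (out["lon"] / cell).round() * cell))
    out["bucket"] = out["date"].dt.to_period(freq)
    return out

# One query over both sources: event counts per cell, per week, per source.
combined = pd.concat([
    spatiotemporal_index(crimes).assign(source="crimes"),
    spatiotemporal_index(permits).assign(source="permits"),
])
counts = combined.groupby(["cell", "bucket", "source"]).size().unstack(fill_value=0)
print(counts)
```

Because every record carries the same (cell, bucket) key regardless of origin, asking "what happened in this region over this period?" becomes one aggregation instead of one download-and-clean cycle per dataset.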

France Announces An Ambitious New Data Strategy


At TechCrunch: "After four long months of speculation and political maneuvering, the French Government finally announced that France is getting its first Chief Data Officer….
First, it's all about pursuing Etalab's work on open data. The small team acted like a startup and quickly iterated on its central platform and multiple side projects. It came up with pragmatic solutions to complicated public issues, such as public health data or fiscal policy simulation. France now ranks fourth in the United Nations e-government survey.
Now, the CDO will have even more official and informal legitimacy to ask other ministries to release data sets. It’s not just about following open government theories — it’s not just about releasing public data to serve the public interest. The team can also simulate new policies before they are implemented, and share recommendations with the ministries working on these new policies.
When a new policy is written, the Government should evaluate all the ins and outs of it before implementation. Citizens should expect no less from their government.
At a larger scale, this nomination is very significant for the French Government. For years, its digital strategy was mostly about finding the best way to communicate through the Internet. But when it came to creating new policies, computers couldn’t help them.
Also announced today, the Government is modernizing and unifying its digital platform between all its ministries and services — it’s never too late. The CDO team will work closely with the DISIC to design this platform — it should be a multi-year project.
Finally, the Government will invest $160 million (€125 million) to innovate in the public sector when it makes sense. In other words, the government will work with private companies (and preferably young innovative companies) to improve the infrastructure that powers the public sector.
France is the first European country to get a Chief Data Officer…”

Mapping the Next Frontier of Open Data: Corporate Data Sharing


Stefaan Verhulst at the GovLab (cross-posted at the UN Global Pulse Blog): “When it comes to data, we are living in the Cambrian Age. About ninety percent of the data that exists today has been generated within the last two years. We create 2.5 quintillion bytes of data on a daily basis—equivalent to a “new Google every four days.”
All of this means that we are certain to witness a rapid intensification in the process of “datafication”– already well underway. Use of data will grow increasingly critical. Data will confer strategic advantages; it will become essential to addressing many of our most important social, economic and political challenges.
This explains–at least in large part–why the Open Data movement has grown so rapidly in recent years. More and more, it has become evident that questions surrounding data access and use are emerging as one of the transformational opportunities of our time.
Today, it is estimated that over one million datasets have been made open or public. The vast majority of this open data is government data—information collected by agencies and departments in countries as varied as India, Uganda and the United States. But what of the terabyte after terabyte of data that is collected and stored by corporations? This data is also quite valuable, but it has been harder to access.
The topic of private sector data sharing was the focus of a recent conference organized by the Responsible Data Forum, the Data and Society Research Institute and Global Pulse (see event summary). Participants at the conference, which was hosted by The Rockefeller Foundation in New York City, included representatives from a variety of sectors who converged to discuss ways to improve access to private data: the data held by private entities and corporations. The purpose of that access was rooted in a broad recognition that private data has the potential to foster much public good. At the same time, a variety of constraints—notably privacy and security, but also proprietary interests and data protectionism on the part of some companies—hold back this potential.
The framing for issues surrounding sharing private data has been broadly referred to under the rubric of “corporate data philanthropy.” The term refers to an emerging trend whereby companies have started sharing anonymized and aggregated data with third-party users who can then look for patterns or otherwise analyze the data in ways that lead to policy insights and other public good. The term was coined at the World Economic Forum meeting in Davos, in 2011, and has gained wider currency through Global Pulse, a United Nations data project that has popularized the notion of a global “data commons.”
Although still far from prevalent, some examples of corporate data sharing exist….

Help us map the field

A more comprehensive mapping of the field of corporate data sharing would draw on a wide range of case studies and examples to identify opportunities and gaps, and to inspire more corporations to allow access to their data (consider, for instance, the GovLab Open Data 500 mapping for open government data). From a research point of view, the following questions would be important to ask:

  • What types of data sharing have proven most successful, and which ones least?
  • Who are the users of corporate shared data, and for what purposes?
  • What conditions encourage companies to share, and what are the concerns that prevent sharing?
  • What incentives can be created (economic, regulatory, etc.) to encourage corporate data philanthropy?
  • What differences (if any) exist between shared government data and shared private sector data?
  • What steps need to be taken to minimize potential harms (e.g., to privacy and security) when sharing data?
  • What’s the value created from using shared private data?

We (the GovLab; Global Pulse; and Data & Society) welcome your input to add to this list of questions, or to help us answer them by providing case studies and examples of corporate data philanthropy. Please add your examples below, use our Google Form or email them to us at corporatedata@thegovlab.org”

Journey tracking app will use cyclist data to make cities safer for bikes


Springwise: "Most cities were never designed to cater for the huge numbers of bikes seen on their roads every day, and as the number of cyclists grows, so do fatality statistics, owing to limited investment in safe cycle paths. While Berlin already crowdsources bikers' favorite cycle routes and maps them through the Dynamic Connections platform, a new app called WeCycle lets cyclists track their journeys, pooling their data to create heat maps for city planners.
Created by the UK's TravelAI transport startup, WeCycle taps into the current consumer trend for quantifying every aspect of life, including journey times. By downloading the free iOS app, London cyclists can seamlessly create stats each time they get on their bike. The app runs in the background and uses the device's accelerometer to smartly distinguish walking or running from cycling. Cyclists can then see how far they've traveled, how fast they cycle and every route they've taken. Additionally, the app also tracks bus and car travel.
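How an accelerometer can distinguish travel modes is not spelled out, and TravelAI's actual classifier is not public. As a purely hypothetical sketch, a crude classifier can threshold simple statistics of the accelerometer magnitude: walking produces strong periodic spikes from heel strikes, while pedaling is smoother. Every threshold and sample value below is invented for illustration:

```python
import statistics

def classify_window(magnitudes, spike_threshold=1.5):
    """Label a short window of accelerometer magnitudes (in g).
    Thresholds are invented for illustration only."""
    mean = statistics.mean(magnitudes)
    stdev = statistics.pstdev(magnitudes)
    if stdev < 0.05:          # almost no movement at all
        return "stationary"
    # Heel strikes make walking far spikier than pedaling.
    return "walking" if mean + 2 * stdev > spike_threshold else "cycling"

walking = [1.0, 1.8, 0.6, 1.9, 0.7, 1.8, 0.9]    # spiky: step impacts
cycling = [1.00, 1.10, 0.92, 1.05, 0.95, 1.08]   # smooth cadence
print(classify_window(walking))   # classified as walking
print(classify_window(cycling))   # classified as cycling
```

Real systems would use labeled training data, frequency-domain features and smoothing across windows, but the underlying intuition is the same: different modes of travel leave statistically distinct motion signatures.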
Anyone who downloads the app agrees that their data can be sent anonymously to TravelAI, creating an accurate and real-time information resource. The aim is to create tools such as heat maps and behavior monitoring that help cities and local authorities learn more about how citizens use roads, to better inform their transport policies.
WeCycle follows in the footsteps of similar apps such as Germany's Radwende and the Toronto Cycling App—both released this year—in taking a popular trend and turning it into data that could help make cities a safer place to cycle…. Website: www.travelai.info