UN Data Revolution Group


Website: “UN Secretary-General Ban Ki-moon has asked an Independent Expert Advisory Group to make concrete recommendations on bringing about a data revolution in sustainable development. Here you can find out more about the work of the group, and feed into the process by adding your comments to this site or sending a private consultation submission.”

Consultation Areas

Why we’re failing to get the most out of open data


Victoria Lemieux at the WEF Blog: “An unprecedented number of individuals and organizations are finding ways to explore, interpret and use Open Data. Public agencies are hosting Open Data events such as meetups, hackathons and data dives. The potential of these initiatives is great, including support for economic development (McKinsey, 2013), anti-corruption (European Public Sector Information Platform, 2014) and accountability (Open Government Partnership, 2012). But is Open Data’s full potential being realized?
A news item from Computer Weekly casts doubt. A recent report notes that, in the United Kingdom, poor data quality is hindering the government’s Open Data program. The report goes on to explain that – in an effort to make the public sector more transparent and accountable – UK public bodies have been publishing spending records every month since November 2010. The authors of the report, who conducted an analysis of 50 spending-related data releases by the Cabinet Office since May 2010, found that the data was of such poor quality that using it would require advanced computer skills.
Far from being a one-off problem, research suggests that this issue is ubiquitous and endemic. Some estimates indicate that as much as 80 percent of the time and cost of an analytics project is attributable to the need to clean up “dirty data” (Dasu and Johnson, 2003).
In addition to data quality issues, data provenance can be difficult to determine. Knowing where data originates and by what means it has been disclosed is key to being able to trust data. If end users do not trust data, they are unlikely to believe they can rely upon the information for accountability purposes. Establishing data provenance does not “spring full blown from the head of Zeus.” It entails a good deal of effort undertaking such activities as enriching data with metadata – data about data – such as the date of creation, the creator of the data, who has had access to the data over time and ensuring that both data and metadata remain unalterable.
Similarly, if people think that data could be tampered with, they are unlikely to place trust in it; full comprehension of data relies on the ability to trace its origins….”
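The provenance metadata described above (creator, creation date, access history, and a way to detect alteration) can be illustrated with a minimal sketch. It is not taken from the article: the function names, the record fields, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib


def make_provenance_record(data: bytes, creator: str, created: str) -> dict:
    """Build a hypothetical provenance record for a dataset.

    The SHA-256 digest acts as a fingerprint: any later change to the
    data bytes produces a different digest.
    """
    return {
        "creator": creator,      # who produced the data
        "created": created,      # date of creation, e.g. "2014-05-01"
        "access_log": [],        # who has had access over time
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def is_unaltered(data: bytes, record: dict) -> bool:
    """Return True if the data still matches the recorded digest."""
    return hashlib.sha256(data).hexdigest() == record["sha256"]


original = b"department,amount\nCabinet Office,1200\n"
record = make_provenance_record(original, "Example Agency", "2014-05-01")
tampered = b"department,amount\nCabinet Office,9200\n"
```

Here `is_unaltered(original, record)` holds while `is_unaltered(tampered, record)` does not. A digest only detects changes to the data; keeping the metadata itself unalterable would need something more, such as a digital signature or an append-only log.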

Plenario


About Plenario: “Plenario makes it possible to rethink the way we use open data. Instead of being constrained by the data that is accessible and usable, let’s start by formulating our questions and then find the data to answer them. Plenario makes this easy by tying together all datasets on one map and one timeline—because in the real world, everything affects everything else…
The problem
Over the past few years, levels of government from the federal administration to individual municipalities like the City of Chicago have begun embracing open data, releasing datasets publicly for free. This movement has vastly increased the amount of data available, but existing platforms and technologies are designed mainly to view and access individual datasets one at a time. This restriction contradicts decades of research contending that no aspect of the urban landscape is truly isolated; in today’s cities, everything is connected to everything else.
Furthermore, researchers are often limited in the questions they can ask by the data available to answer them. It is not uncommon to spend 75% of one’s time locating, downloading, cleaning, and standardizing the relevant datasets—leaving precious little resources for the important work.
What we do
Plenario is designed to take us from “spreadsheets on the web” to truly smart open data. This rests on two fundamental breakthroughs:

1)  Allow users to assemble and download data from multiple, independent data sources, such as two different municipal data portals, or the federal government and a privately curated dataset.
2)  Unite all datasets along a single spatial and temporal index, making it possible to do complex aggregations with one query.

With these advances, Plenario allows users to study regions over specified time periods using all relevant data, regardless of original source, and represent the data as a single time series. By providing a single, centralized hub for open data, the Plenario platform enables urban scientists to ask the right questions with as few constraints as possible….
Plenario is being implemented by the Urban Center for Computation and Data and DataMade.”
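The idea of uniting datasets along a single spatial and temporal index can be sketched as follows. This is not Plenario's actual implementation or API: the grid size `CELL_DEG`, the function names, and the sample dataset names are hypothetical, chosen only to show how records from independent sources can share one key and be aggregated together.

```python
import math
from collections import Counter
from datetime import datetime

CELL_DEG = 0.01  # hypothetical grid cell size, in degrees


def index_key(lat: float, lon: float, when: datetime) -> tuple:
    """Map a record to a shared (day, grid row, grid column) key."""
    return (
        when.date().isoformat(),
        math.floor(lat / CELL_DEG),
        math.floor(lon / CELL_DEG),
    )


def aggregate(datasets: dict) -> dict:
    """Count records per shared key, broken down by source dataset.

    `datasets` maps a dataset name to a list of (lat, lon, datetime) rows.
    """
    combined = {}
    for name, rows in datasets.items():
        for lat, lon, when in rows:
            key = index_key(lat, lon, when)
            combined.setdefault(key, Counter())[name] += 1
    return combined


# Two made-up sources with records in the same cell on the same day.
sample = {
    "crimes": [(41.881, -87.623, datetime(2014, 9, 1, 8, 0))],
    "potholes": [
        (41.885, -87.627, datetime(2014, 9, 1, 17, 30)),
        (41.950, -87.700, datetime(2014, 9, 2, 9, 0)),
    ],
}
result = aggregate(sample)
```

Because every record is reduced to the same kind of key, a single pass answers a question spanning both sources, and sorting the keys by day yields one time series per grid cell.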

France Announces An Ambitious New Data Strategy


at TechCrunch: “After four long months of speculations and political maneuvering, the French Government finally announced that France is getting its first Chief Data Officer….
First, it’s all about pursuing Etalab’s work when it comes to open data. The small team acted as a startup and quickly iterated on its central platform and multiple side projects. It came up with pragmatic solutions to complicated public issues, such as public health data or fiscal policy simulation. France is now the fourth country in the United Nations e-government survey.
Now, the CDO will have even more official and informal legitimacy to ask other ministries to release data sets. It’s not just about following open government theories — it’s not just about releasing public data to serve the public interest. The team can also simulate new policies before they are implemented, and share recommendations with the ministries working on these new policies.
When a new policy is written, the Government should evaluate all the ins and outs of it before implementation. Citizens should expect no less from their government.
At a larger scale, this nomination is very significant for the French Government. For years, its digital strategy was mostly about finding the best way to communicate through the Internet. But when it came to creating new policies, computers couldn’t help them.
Also announced today, the Government is modernizing and unifying its digital platform between all its ministries and services — it’s never too late. The CDO team will work closely with the DISIC to design this platform — it should be a multi-year project.
Finally, the Government will invest $160 million (€125 million) to innovate in the public sector when it makes sense. In other words, the government will work with private companies (and preferably young innovative companies) to improve the infrastructure that powers the public sector.
France is the first European country to get a Chief Data Officer…”

Mapping the Next Frontier of Open Data: Corporate Data Sharing


Stefaan Verhulst at the GovLab (cross-posted at the UN Global Pulse Blog): “When it comes to data, we are living in the Cambrian Age. About ninety percent of the data that exists today has been generated within the last two years. We create 2.5 quintillion bytes of data on a daily basis—equivalent to a “new Google every four days.”
All of this means that we are certain to witness a rapid intensification in the process of “datafication”– already well underway. Use of data will grow increasingly critical. Data will confer strategic advantages; it will become essential to addressing many of our most important social, economic and political challenges.
This explains–at least in large part–why the Open Data movement has grown so rapidly in recent years. More and more, it has become evident that questions surrounding data access and use are emerging as one of the transformational opportunities of our time.
Today, it is estimated that over one million datasets have been made open or public. The vast majority of this open data is government data—information collected by agencies and departments in countries as varied as India, Uganda and the United States. But what of the terabyte after terabyte of data that is collected and stored by corporations? This data is also quite valuable, but it has been harder to access.
The topic of private sector data sharing was the focus of a recent conference organized by the Responsible Data Forum, Data and Society Research Institute and Global Pulse (see event summary). Participants at the conference, which was hosted by The Rockefeller Foundation in New York City, included representatives from a variety of sectors who converged to discuss ways to improve access to private data, that is, the data held by private entities and corporations. The purpose for that access was rooted in a broad recognition that private data has the potential to foster much public good. At the same time, a variety of constraints—notably privacy and security, but also proprietary interests and data protectionism on the part of some companies—hold back this potential.
The framing for issues surrounding sharing private data has been broadly referred to under the rubric of “corporate data philanthropy.” The term refers to an emerging trend whereby companies have started sharing anonymized and aggregated data with third-party users who can then look for patterns or otherwise analyze the data in ways that lead to policy insights and other public good. The term was coined at the World Economic Forum meeting in Davos, in 2011, and has gained wider currency through Global Pulse, a United Nations data project that has popularized the notion of a global “data commons.”
Although still far from prevalent, some examples of corporate data sharing exist….

Help us map the field

A more comprehensive mapping of the field of corporate data sharing would draw on a wide range of case studies and examples to identify opportunities and gaps, and to inspire more corporations to allow access to their data (consider, for instance, the GovLab Open Data 500 mapping for open government data). From a research point of view, the following questions would be important to ask:

  • What types of data sharing have proven most successful, and which ones least?
  • Who are the users of corporate shared data, and for what purposes?
  • What conditions encourage companies to share, and what are the concerns that prevent sharing?
  • What incentives can be created (economic, regulatory, etc.) to encourage corporate data philanthropy?
  • What differences (if any) exist between shared government data and shared private sector data?
  • What steps need to be taken to minimize potential harms (e.g., to privacy and security) when sharing data?
  • What’s the value created from using shared private data?

We (the GovLab; Global Pulse; and Data & Society) welcome your input to add to this list of questions, or to help us answer them by providing case studies and examples of corporate data philanthropy. Please add your examples below, use our Google Form or email them to us at corporatedata@thegovlab.org”

Journey tracking app will use cyclist data to make cities safer for bikes


Springwise: “Most cities were never designed to cater for the huge numbers of bikes seen on their roads every day, and as the number of cyclists grows, so do the fatality statistics thanks to limited investment in safe cycle paths. While Berlin already crowdsources bikers’ favorite cycle routes and maps them through the Dynamic Connections platform, a new app called WeCycle lets cyclists track their journeys, pooling their data to create heat maps for city planners.
Created by the UK’s TravelAI transport startup, WeCycle taps into the current consumer trend for quantifying every aspect of life, including journey times. By downloading the free iOS app, London cyclists can seamlessly create stats each time they get on their bike. The app runs in the background and uses the device’s accelerometer to smartly distinguish walking or running from cycling. They can then see how far they’ve traveled, how fast they cycle and every route they’ve taken. Additionally, the app also tracks bus and car travel.
Anyone that downloads the app agrees that their data can be anonymously sent to TravelAI, creating an accurate and real-time information resource. It aims to create tools such as heat maps and behavior monitoring for cities and local authorities to learn more about how citizens are using roads to better inform their transport policies.
WeCycle follows in the footsteps of similar apps such as Germany’s Radwende and the Toronto Cycling App — both released this year — in taking a popular trend and turning it into data that could help make cities a safer place to cycle….” Website: www.travelai.info

Citizen Science: The Law and Ethics of Public Access to Medical Big Data


New Paper by Sharona Hoffman: “Patient-related medical information is becoming increasingly available on the Internet, spurred by government open data policies and private sector data sharing initiatives. Websites such as HealthData.gov, GenBank, and PatientsLikeMe allow members of the public to access a wealth of health information. As the medical information terrain quickly changes, the legal system must not lag behind. This Article provides a base on which to build a coherent data policy. It canvasses emergent data troves and wrestles with their legal and ethical ramifications.
Publicly accessible medical data have the potential to yield numerous benefits, including scientific discoveries, cost savings, the development of patient support tools, healthcare quality improvement, greater government transparency, public education, and positive changes in healthcare policy. At the same time, the availability of electronic personal health information that can be mined by any Internet user raises concerns related to privacy, discrimination, erroneous research findings, and litigation. This Article analyzes the benefits and risks of health data sharing and proposes balanced legislative, regulatory, and policy modifications to guide data disclosure and use.”

5 great apps backed with open data


Jeanne Holm at OpenSource.com: “Data.gov has taken open source to heart. Beyond just providing open data and open source code, the entire process involves open civic engagement. All team ideas, public interactions, and new ideas (from any interaction) are cross-posted and entered in Github. These are tracked openly and completed to milestones for full transparency. We also recently redesigned the website at Data.gov through usability testing and open engagement on Github.
Today, I want to share with you just five of the hundreds of applications that have been developed by the public using open government data. These are examples of the kind of apps, visualizations, and analyses that are created from working with developers, educators, and businesses on a specific challenge at events that pull the community together, like data jams, meetups, and conferences.

Archimedes

Archimedes makes tools that give quantitative models to doctors and patients so that they can find effective interventions, predict how interventions will affect an individual’s health risk, and help decision-makers analyze health outcomes….

Trulia

Trulia provides insights into neighborhoods where you might be interested in moving. Looking at the homes and apartments for sale and rent, trends and prices in real estate, and neighborhood characteristics, Trulia gives you the data to make decisions about buying, selling, renting, and moving….

HelloWallet

HelloWallet helps people to manage their money, and to learn about and start making investments. Some of the subjects for individuals include retirement readiness, debt levels, emergency savings, and health savings….

SaferCar

Consumers looking for a new car can find a safer car by using the SaferCar app from the Department of Transportation. Powered by data on five-star safety ratings from the National Highway Traffic Safety Administration, consumers can look at new and used car ratings, recalls and complaints, and information about installing child seats….

Red Cross Hurricane

The Safety.Data.gov community of Data.gov held a Safety Datapalooza and brought together developers, businesses, NGOs, and government participants to brainstorm ways to put government data to use to improve the lives of citizens in America. A 90-day challenge was issued to create some of these apps and concepts, and one was with the Red Cross to create an app that would help people find safe ways to move around during a natural disaster. This included rail, roads, buses, and airports–which were open and what schedules they were running on. These data were provided by the Department of Transportation. As Hurricane Sandy descended on the east coast, we accelerated the development of the Red Cross Hurricane app and launched the app as the Hurricane touched ground…”

The Rise of Data Poverty in America


Report by Daniel Castro for the Center for Data Innovation: “Data-driven innovations offer enormous opportunities to advance important societal goals. However, to take advantage of these opportunities, individuals must have access to high-quality data about themselves and their communities. If certain groups routinely do not have data collected about them, their problems may be overlooked and their communities held back in spite of progress elsewhere. Given this risk, policymakers should begin a concerted effort to address the “data divide”—the social and economic inequalities that may result from a lack of collection or use of data about individuals or communities.”

Value Based Prioritisation of Open Government Data Investments


ePSI Platform: “This ePSI platform topic report explores how Governments are increasingly prioritising their investments in Open Government Data on the basis of the value that can be unlocked by opening up government datasets.
The report elaborates on a working definition for high value datasets from different dimensions, both from the perspective of the data publisher and data re-user. This working definition has been used to identify and prioritise datasets to be listed on the European Union Open Data Portal, allowing EU institutions to better determine which new datasets should be published with priority, or to identify which high value datasets already listed on the portal should be improved with priority.”