Mapping the Next Frontier of Open Data: Corporate Data Sharing


Stefaan Verhulst at the GovLab (cross-posted at the UN Global Pulse Blog): “When it comes to data, we are living in the Cambrian Age. About ninety percent of the data that exists today has been generated within the last two years. We create 2.5 quintillion bytes of data on a daily basis—equivalent to a “new Google every four days.”
All of this means that we are certain to witness a rapid intensification in the process of “datafication,” which is already well underway. Use of data will grow increasingly critical. Data will confer strategic advantages; it will become essential to addressing many of our most important social, economic and political challenges.
This explains, at least in large part, why the Open Data movement has grown so rapidly in recent years. More and more, it has become evident that questions surrounding data access and use are emerging as one of the transformational opportunities of our time.
Today, it is estimated that over one million datasets have been made open or public. The vast majority of this open data is government data—information collected by agencies and departments in countries as varied as India, Uganda and the United States. But what of the terabyte after terabyte of data that is collected and stored by corporations? This data is also quite valuable, but it has been harder to access.
The topic of private sector data sharing was the focus of a recent conference organized by the Responsible Data Forum, Data and Society Research Institute and Global Pulse (see event summary). Participants at the conference, which was hosted by The Rockefeller Foundation in New York City, included representatives from a variety of sectors who converged to discuss ways to improve access to private data, that is, the data held by private entities and corporations. The purpose of that access was rooted in a broad recognition that private data has the potential to foster much public good. At the same time, a variety of constraints (notably privacy and security, but also proprietary interests and data protectionism on the part of some companies) hold back this potential.
The framing for issues surrounding sharing private data has been broadly referred to under the rubric of “corporate data philanthropy.” The term refers to an emerging trend whereby companies have started sharing anonymized and aggregated data with third-party users who can then look for patterns or otherwise analyze the data in ways that lead to policy insights and other public good. The term was coined at the World Economic Forum meeting in Davos, in 2011, and has gained wider currency through Global Pulse, a United Nations data project that has popularized the notion of a global “data commons.”
Although still far from prevalent, some examples of corporate data sharing exist….

Help us map the field

A more comprehensive mapping of the field of corporate data sharing would draw on a wide range of case studies and examples to identify opportunities and gaps, and to inspire more corporations to allow access to their data (consider, for instance, the GovLab Open Data 500 mapping for open government data). From a research point of view, the following questions would be important to ask:

  • What types of data sharing have proven most successful, and which ones least?
  • Who are the users of corporate shared data, and for what purposes?
  • What conditions encourage companies to share, and what are the concerns that prevent sharing?
  • What incentives can be created (economic, regulatory, etc.) to encourage corporate data philanthropy?
  • What differences (if any) exist between shared government data and shared private sector data?
  • What steps need to be taken to minimize potential harms (e.g., to privacy and security) when sharing data?
  • What’s the value created from using shared private data?

We (the GovLab; Global Pulse; and Data & Society) welcome your input to add to this list of questions, or to help us answer them by providing case studies and examples of corporate data philanthropy. Please add your examples below, use our Google Form or email them to us at [email protected]

Journey tracking app will use cyclist data to make cities safer for bikes


Springwise: “Most cities were never designed to cater for the huge numbers of bikes seen on their roads every day, and as the number of cyclists grows, so do the fatality statistics thanks to limited investment in safe cycle paths. While Berlin already crowdsources bikers’ favorite cycle routes and maps them through the Dynamic Connections platform, a new app called WeCycle lets cyclists track their journeys, pooling their data to create heat maps for city planners.
Created by the UK’s TravelAI transport startup, WeCycle taps into the current consumer trend for quantifying every aspect of life, including journey times. By downloading the free iOS app, London cyclists can seamlessly create stats each time they get on their bike. The app runs in the background and uses the device’s accelerometer to smartly distinguish walking or running from cycling. They can then see how far they’ve traveled, how fast they cycle and every route they’ve taken. Additionally, the app tracks bus and car travel.
Anyone that downloads the app agrees that their data can be anonymously sent to TravelAI, creating an accurate and real-time information resource. It aims to create tools such as heat maps and behavior monitoring for cities and local authorities to learn more about how citizens are using roads to better inform their transport policies.
WeCycle follows in the footsteps of similar apps such as Germany’s Radwende and the Toronto Cycling App, both released this year, in taking a popular trend and turning it into data that could help make cities a safer place to cycle….” Website: www.travelai.info

Citizen Science: The Law and Ethics of Public Access to Medical Big Data


New Paper by Sharona Hoffman: “Patient-related medical information is becoming increasingly available on the Internet, spurred by government open data policies and private sector data sharing initiatives. Websites such as HealthData.gov, GenBank, and PatientsLikeMe allow members of the public to access a wealth of health information. As the medical information terrain quickly changes, the legal system must not lag behind. This Article provides a base on which to build a coherent data policy. It canvasses emergent data troves and wrestles with their legal and ethical ramifications.
Publicly accessible medical data have the potential to yield numerous benefits, including scientific discoveries, cost savings, the development of patient support tools, healthcare quality improvement, greater government transparency, public education, and positive changes in healthcare policy. At the same time, the availability of electronic personal health information that can be mined by any Internet user raises concerns related to privacy, discrimination, erroneous research findings, and litigation. This Article analyzes the benefits and risks of health data sharing and proposes balanced legislative, regulatory, and policy modifications to guide data disclosure and use.”

5 great apps backed with open data


Jeanne Holm at OpenSource.com: “Data.gov has taken open source to heart. Beyond just providing open data and open source code, the entire process involves open civic engagement. All team ideas, public interactions, and new ideas (from any interaction) are cross-posted and entered in Github. These are tracked openly and completed to milestones for full transparency. We also recently redesigned the website at Data.gov through usability testing and open engagement on Github.
Today, I want to share with you just five of the hundreds of applications that have been developed by the public using open government data. These are examples of the kind of apps, visualizations, and analyses that are created from working with developers, educators, and businesses on a specific challenge at events that pull the community together, like data jams, meetups, and conferences.

Archimedes

Archimedes makes tools that give quantitative models to doctors and patients so that they can find effective interventions, predict how interventions will affect an individual’s health risk, and help decision-makers analyze health outcomes….

Trulia

Trulia provides insights into neighborhoods where you might be interested in moving. Looking at the homes and apartments for sale and rent, trends and prices in real estate, and neighborhood characteristics, Trulia gives you the data to make decisions about buying, selling, renting, and moving….

HelloWallet

HelloWallet helps people to manage their money, and to learn about and start making investments. Some of the subjects for individuals include retirement readiness, debt levels, emergency savings, and health savings….

SaferCar

Consumers looking for a new car can find a safer car by using the SaferCar app from the Department of Transportation. Powered by data on five-star safety ratings from the National Highway Traffic Safety Administration, consumers can look at new and used car ratings, recalls and complaints, and information about installing child seats….

Red Cross Hurricane

The Safety.Data.gov community of Data.gov held a Safety Datapalooza and brought together developers, businesses, NGOs, and government participants to brainstorm ways to put government data to use to improve the lives of citizens in America. A 90-day challenge was issued to create some of these apps and concepts, and one was with the Red Cross to create an app that would help people find safe ways to move around during a natural disaster. This included rail, roads, buses, and airports–which were open and what schedules they were running on. These data were provided by the Department of Transportation. As Hurricane Sandy descended on the east coast, we accelerated the development of the Red Cross Hurricane app and launched the app as the Hurricane touched ground…”

The Rise of Data Poverty in America


Report by Daniel Castro for the Center for Data Innovation: “Data-driven innovations offer enormous opportunities to advance important societal goals. However, to take advantage of these opportunities, individuals must have access to high-quality data about themselves and their communities. If certain groups routinely do not have data collected about them, their problems may be overlooked and their communities held back in spite of progress elsewhere. Given this risk, policymakers should begin a concerted effort to address the “data divide”: the social and economic inequalities that may result from a lack of collection or use of data about individuals or communities.”

Value Based Prioritisation of Open Government Data Investments


ePSI platform: “This ePSI platform topic report explores how Governments are increasingly prioritising their investments in Open Government Data on the basis of the value that can be unlocked by opening up government datasets.
The report elaborates on a working definition for high value datasets from different dimensions, both from the perspective of the data publisher and data re-user. This working definition has been used to identify and prioritise datasets to be listed on the European Union Open Data Portal, allowing EU institutions to better determine which new datasets should be published with priority, or to identify which high value datasets already listed on the portal should be improved with priority.”

Stacking Up the Benefits of Openness


at Digital Gov: “Open government, open source, openness. These words are often used in talking about open data, but we sometimes forget that the root of all of this is an open community. Individuals working together to release government data and put it to use to help their neighbors and reach new personal goals.
This sense of community in the open data field shows up in many places. I see it when people volunteer at the National Day of Civic Hacking, crowdsource data integrity with MapGive, or mentor with Girls Who Code. And each day I see it on Open Data Stack Exchange, where people ask questions about open data issues, searches, or challenges, and strangers half a world away answer the question within an hour.
We launched the Open Data Stack Exchange in 2013 as a way of helping to build community and open up the knowledge in our emergent field. What started slowly soon took off, with 3,375 participants today having provided 1,592 answers to 721 questions. Anyone can ask a question. These have ranged from data requests (looking for specific hard-to-find data) to technical questions on parsing or visualizing data. More importantly, anyone can answer a question, too. You’ll notice from the numbers that most questions have more than one answer, with the asker being able to choose the best answer and everyone being able to vote the questions and answers up and down. The forum is loosely moderated (I’ve served as one of the moderators since inception), but predominantly self-governed. Google trusts this method and forum so much that within a few minutes of answering a question, it will pop to the top of the Google search results for that topic.
What are people asking on the Open Data Stack Exchange? One question is seeking applications being developed with open data, one is looking for a database of open databases and another seeks data about the Ebola outbreak. Answers, edits, comments, suggestions…all are part of the conversation and documentation of our collective open data knowledge. This type of community-vetted, open forum helps to evolve and preserve our collective wisdom into the future. I encourage people who ask questions of Data.gov to do so on Stack Exchange so that everyone can see the answer, and flag those for easy reference (OpenFDA does the same)…”

How Open Data Is Transforming City Life


Joel Gurin, The GovLab, at Techonomy: “Start a business. Manage your power use. Find cheap rents, or avoid crime-ridden neighborhoods. Cities and their citizens worldwide are discovering the power of “open data”—public data and information available from government and other sources that can help solve civic problems and create new business opportunities. By opening up data about transportation, education, health care, and more, municipal governments are helping app developers, civil society organizations, and others to find innovative ways to tackle urban problems. For any city that wants to promote entrepreneurship and economic development, open data can be a valuable new resource.
The urban open data movement has been growing for several years, with American cities including New York, San Francisco, Chicago, and Washington in the forefront. Now an increasing number of government officials, entrepreneurs, and civic hackers are recognizing the potential of open data. The results have included applications that can be used across many cities as well as those tailored to an individual city’s needs.
At first, the open data movement was driven by a commitment to transparency and accountability. City, state, and local governments have all released data about their finances and operations in the interest of good government and citizen participation. Now some tech companies are providing platforms to make this kind of city data more accessible, useful, and comparable. Companies like OpenGov and Govini make it possible for city managers and residents to examine finances, assess police department overtime, and monitor other factors that let them compare their city’s performance to neighboring municipalities.
Other new businesses are tapping city data to provide residents with useful, practical information. One of the best examples is NextBus, which uses metropolitan transportation data to tell commuters when to expect a bus along their route. Commuter apps like this have become common in cities in the U.S. and around the world. Another website, SpotCrime, collects, analyzes, and maps crime statistics to tell city dwellers which areas are safest or most dangerous and to offer crime alerts. And the Chicago-based Purple Binder helps people in need find city healthcare services. Many companies in the Open Data 500, the study of open data companies that I direct at the GovLab at NYU, use data from cities as well as other sources….
Some of the most ambitious uses of city data—with some of the greatest potential—focus on improving education. In Washington, the nonprofit Learn DC has made data about public schools available through a portal that state agencies, community organizations, and civic hackers can all use. They’re using it for collaborative research and action that, they say, has “empowered every DC parent to participate in shaping the future of the public education system.”…”

In democracy and disaster, emerging world embraces 'open data'


Jeremy Wagstaff at Reuters: “‘Open data’ – the trove of data-sets made publicly available by governments, organizations and businesses – isn’t normally linked to high-wire politics, but just may have saved last month’s Indonesian presidential elections from chaos.
Data is considered open when it’s released for anyone to use and in a format that’s easy for computers to read. The uses are largely commercial, such as the GPS data from U.S.-owned satellites, but data can range from budget numbers and climate and health statistics to bus and rail timetables.
It’s a revolution that’s swept the developed world in recent years as governments and agencies like the World Bank have freed up hundreds of thousands of data-sets for use by anyone who sees a use for them. Data.gov, a U.S. site, lists more than 100,000 data-sets, from food calories to magnetic fields in space.
Consultants McKinsey reckon open data could add up to $3 trillion worth of economic activity a year – from performance ratings that help parents find the best schools to governments saving money by releasing budget data and asking citizens to come up with cost-cutting ideas. All the apps, services and equipment that tap the GPS satellites, for example, generate $96 billion of economic activity each year in the United States alone, according to a 2011 study.
But so far open data has had a limited impact in the developing world, where officials are wary of giving away too much information, and where there’s the issue of just how useful it might be: for most people in emerging countries, property prices and bus schedules aren’t top priorities.
But last month’s election in Indonesia – a contentious face-off between a disgraced general and a furniture-exporter turned reformist – highlighted how powerful open data can be in tandem with a handful of tech-smart programmers, social media savvy and crowdsourcing.
“Open data may well have saved this election,” said Paul Rowland, a Jakarta-based consultant on democracy and governance…”
 

Assessing Social Value in Open Data Initiatives: A Framework


Paper by Gianluigi Viscusi, Marco Castelli and Carlo Batini in Future Internet Journal: “Open data initiatives are characterized, in several countries, by a great extension of the number of data sets made available for access by public administrations, constituencies, businesses and other actors, such as journalists, international institutions and academics, to mention a few. However, most of the open data sets rely on selection criteria, based on a technology-driven perspective, rather than a focus on the potential public and social value of data to be published. Several experiences and reports confirm this issue, such as those of the Open Data Census. However, there are also relevant best practices. The goal of this paper is to investigate the different dimensions of a framework suitable to support public administrations, as well as constituencies, in assessing and benchmarking the social value of open data initiatives. The framework is tested on three initiatives, referring to three different countries, Italy, the United Kingdom and Tunisia. The countries have been selected to provide a focus on European and Mediterranean countries, considering also the difference in legal frameworks (civil law vs. common law countries).”