What do businesses really look for in open data?


Harvey Lewis in Computer Weekly:  “In 2015, the UK’s primary open data portal, www.data.gov.uk, will be six years old. The portal hosts approximately 20,000 official data sets from central government departments and their agencies, local authorities and other public sector bodies across the country. Just over half of these data sets are available as open data under the Open Government Licence (OGL). Data.gov.uk forms part of an international network of over three hundred open data efforts that have seen not just thousands but millions of data sets worldwide becoming freely available for personal or commercial use. [See http://datacatalogs.org and www.quandle.com].
…simply publishing open data does not guarantee that a business will use it…., if businesses are building new products or services, or relying on the data to inform their strategy, a number of characteristics other than just openness become critical in determining success:

  • Provenance – what is the source of the data and how was it collected? Is it authoritative?
  • Completeness and accuracy – are the examples and features of the data present and correct, and, if not, is the quality understood and documented?
  • Consistency – is the data published in a consistent, easy-to-access format and are any changes documented?
  • Timeliness – is the data available when it is needed for the time periods needed?
  • Richness – does the data contain a level of detail sufficient to answer our questions?
  • Guarantees of availability – will the data continue to be made available in the future?

If these characteristics cannot be guaranteed in open data or are unavailable except under a commercial licence then many businesses would prefer to pay to get them. While some public sector bodies – particularly the Trading Funds – have, over the years, established strong connections with business users of their data and understand their needs implicitly, the Open Data Institute is the first to cement these characteristics into a formal certification scheme for publishers of open data.
A campaign is needed to get publishers to adopt these certificates and to recognise that, economically at least, they are as important as Sir Tim Berners-Lee’s five-star scale for linked open data.  ….”

Restoring Confidence in Open, Shared and Personal Data


Report of the UK Digital Government Review: “It is obvious that government needs to be able to use data both to deliver services and to present information to public view. How else would government know which bank account to place a pension payment into, or a citizen know the results of an election or how to contact their elected representatives?

As more and more data is created, preserved and shared in ever-increasing volumes a number of urgent questions are begged: over opportunities and hazards; over the importance of using best-practice techniques, insights and technologies developed in the private sector, academia and elsewhere; over the promises and limitations of openness; and how all this might be articulated and made accessible to the public.

Government has already adopted “open data” (we will discuss this more in the next section) and there are now increasing calls for government to pay more attention to data analytics and so-called “big data” – although the first faltering steps to unlock benefits, here, have often ended in the discovery that using large-scale data is a far more nuanced business than was initially assumed.

Debates around government and data have often been extremely high-profile – the NHS care.data [27] debate was raging while this review was in progress – but they are also shrouded in terms that can generate confusion and complexities that are not easily summarized.

In this chapter we will unpick some of these terms and some parts of the debate. This is a detailed and complex area and there is much more that could have been included [28]. This is not an area that can easily be summarized into a simple bullet-pointed list of policies.

Within this report we will use the following terms and definitions, proceeding to a detailed analysis of each in turn:

Type of Data, with Definition [29] and Examples:

1. Open Data
Definition: Data that can be freely used, reused and redistributed by anyone – subject only, at most, to the requirement to attribute and sharealike
Examples:
  • Insolvency notices in the London Gazette
  • Government spending information
  • Public transport information
  • Official National Statistics

2. Shared Data
Definition: Restricted data provided to restricted organisations or individuals for restricted purposes
Examples:
  • National Pupil Database
  • NHS care.data
  • Integrated health and social care
  • Individual census returns

3. Personal Data
Definition: Data that relate to a living individual who can be identified from that data. For full legal definition see [30]
Examples:
  • Health records
  • Individual tax records
  • Insolvency notices in the London Gazette
  • National Pupil Database

NB These definitions overlap. Personal data can exist in both open and shared data.
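
The overlap noted in the report's definitions can be made concrete with a small sketch. It is only an illustration, not part of the review: the `DataType` flags and the `overlapping` helper are hypothetical names, and each dataset is tagged purely from where it appears in the report's example lists.

```python
from enum import Flag, auto


class DataType(Flag):
    """The three types of data defined in the report (names are illustrative)."""
    OPEN = auto()
    SHARED = auto()
    PERSONAL = auto()


# Tags follow the report's example lists; a dataset listed under two
# types gets both flags.
EXAMPLES = {
    "Insolvency notices in the London Gazette": DataType.OPEN | DataType.PERSONAL,
    "Government spending information": DataType.OPEN,
    "National Pupil Database": DataType.SHARED | DataType.PERSONAL,
    "NHS care.data": DataType.SHARED,
    "Health records": DataType.PERSONAL,
}


def overlapping(examples):
    """Return the names of datasets that fall under more than one type."""
    return sorted(
        name
        for name, tags in examples.items()
        if sum(1 for member in DataType if member in tags) > 1
    )


print(overlapping(EXAMPLES))
```

Under these tags, the two datasets that the report lists twice are the ones returned as overlapping.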

This social productivity will help build future economic productivity; in the meantime it will improve people’s lives and it will enhance our democracy. From our analysis it was clear that there was room for improvement…”

New Tool in Fighting Corruption: Open Data


Martin Tisne at Omidyar Network: “Yesterday in Brisbane, the G20 threw its weight behind open data by featuring it prominently in the G20 Anti-Corruption working action plan. Specifically, the action plan calls for effort in three related areas:

(1)   Prepare a G20 compendium of good practices and lessons learned on open data and its application in the fight against corruption
(2)   Prepare G20 Open Data Principles, including identifying areas or sectors where their application is particularly useful
(3)   Complete self‑assessments of G20 country open data frameworks and initiatives

Open data describes information that is not simply public, but that has been published in a manner that makes it easy to access and easy to compare and connect with other information.
This matters for anti-corruption – if you are a journalist or a civil society activist investigating bribery and corruption, those connections are everything. They tell you that an anonymous person (e.g. ‘Mr Smith’) who owns an obscure company registered in a tax haven is linked to another company that has been illegally exporting timber from a neighboring country. That the said Mr. Smith is also the son-in-law of the mining minister of yet another country, who herself has been accused of embezzling mining revenues. As we have written elsewhere on this blog, investigative journalists, prosecution authorities, and civil society groups all need access to this linked data for their work.
The action plan also links open data to the wider G20 agenda, citing its impact on the ability of businesses to make better investment decisions. You can find the full detail here….”

Innovating Practice in a Culture of Expertise


Aleem Walji at SSI Review: “When I joined the World Bank five years ago to lead a new innovation practice, the organization asked me to help expand the space for experimentation and learning with an emphasis on emergent technologies. But that mandate was intimidating and counter-intuitive in an “expert-driven” culture. Experts want detailed plans, budgets, clear success indicators, and minimal risk. But innovation is about managing risk and navigating uncertainty intelligently. You fail fast and fail forward. It has been a step-by-step process, and the journey is far from over, but the World Bank today sees innovation as essential to achieving its mission.
It’s taught me a lot about seeding innovation in a culture of expertise, including phasing change across approaches to technology, teaming, problem solving, and ultimately leadership.
Innovating technologies: As a newcomer, my goal was not to try to change the World Bank’s culture. I was content to carve out a space where my team could try new things we couldn’t do elsewhere in the institution, learn fast, and create impact. Our initial focus was leveraging technologies with approaches that, if they took root, could be very powerful.
Over the first 18 to 24 months, we served as an incubator for ideas and had a number of successes that built on senior management’s support for increased access to information. The Open Data Initiative, for example, made our trove of information on countries, people, projects, and programs widely available and searchable. To our surprise, people came in droves to access it. We also launched the Mapping for Results initiative, which mapped project results and poverty data to show the relationship between where we lend and where the poor live, and the results of our work. These programs are now mainstream at the World Bank and have penetrated other development institutions….
Innovating teams: The lab idea—phase two—would require collaboration and experimentation in an unprecedented way. For example, we worked with other parts of the World Bank and a number of outside organizations to incubate the Open Development Technology Alliance, now part of the digital engagement unit of the World Bank. It worked to enhance accountability, and improve the delivery and quality of public services through technology-enabled citizen engagement such as using mobile phones, interactive mapping, and social media to draw citizens into collective problem mapping and problem solving….
Innovating problem solving: At the same time, we recognized that we face some really complex problems that the World Bank’s traditional approach of lending to governments and supervising development projects is not solving. For this, we needed another type of lab that innovated the very way we solve problems. We needed a deliberate process for experimenting, learning, iterating, and adapting. But that’s easier said than done. At our core, we are an expert-driven organization with know-how in disciplines ranging from agricultural economics and civil engineering to maternal health and early childhood development. Our problem-solving architecture is rooted in designing technical solutions to complicated problems. Yet the hardest problems in the world defy technical fixes. We work in contexts where political environments shift, leaders change, and conditions on the ground constantly evolve. Problems like climate change, financial inclusion, food security, and youth unemployment demand new ways of solving old problems.
The innovation we most needed was innovation in the leadership architecture of how we confront complex challenges. We share knowledge and expertise on the “what” of reform, but the “how” is what we need most. We need to marry know-how with do-how. We need multiyear, multi-stakeholder, and systems approaches to solving problems. We need to get better at framing and reframing problems, integrative thinking, and testing a range of solutions. We need to iterate and course-correct as we learn what works and doesn’t work in which context. That’s where we are right now with what we call “integrated leadership learning innovation”—phase four. It’s all about shaping an innovative process to address complex problems….”

Get the Data Button


BetaNYC: “A Web Button to Link Web Projects with their Source Data

How to use

When building a website, report, map, data visualization, or any other project, use this button to link to the underlying data. That’s it! Tell your friends! Let’s make this a thing!
Button sizes: 180x60, 125x50, 120x60, 110x32, 88x31, 80x15, 36x13
Born out of a discussion of the NYC Open Data Working Group on 14 November 2014.
Icon made by Freepik from www.flaticon.com is licensed by CC BY 3.0”
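
As a rough sketch of the "How to use" step, the snippet below builds the HTML for a button that links a project to its source data. The function name, the `image_base` location and the image file naming are assumptions made for illustration only; the excerpt does not say where the button images are hosted. The sizes are the ones listed in the excerpt.

```python
from html import escape

# Button sizes listed in the excerpt (width x height, in pixels).
BUTTON_SIZES = ["180x60", "125x50", "120x60", "110x32", "88x31", "80x15", "36x13"]


def get_the_data_button(data_url, size="88x31",
                        image_base="https://example.org/buttons"):
    """Return an HTML snippet that links a button image to the source data.

    image_base and the "get-the-data-<size>.png" naming are hypothetical
    placeholders, not the project's real hosting scheme.
    """
    if size not in BUTTON_SIZES:
        raise ValueError(f"unknown size {size!r}; choose one of {BUTTON_SIZES}")
    width, height = size.split("x")
    href = escape(data_url, quote=True)
    src = escape(f"{image_base}/get-the-data-{size}.png", quote=True)
    return (
        f'<a href="{href}">'
        f'<img src="{src}" width="{width}" height="{height}" alt="Get the data">'
        f"</a>"
    )


print(get_the_data_button("https://data.example.org/my-dataset.csv"))
```

The returned string can be pasted next to any map, chart or report so that readers can reach the underlying data in one click.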

The Next Frontier of Engagement: Civic Innovation Labs


Maayan Dembo at Planetizen: “As described by Clayton Christensen, a professor at the Harvard Business School who developed the term “disruptive innovation,” a successful office for social innovation should employ four main tactics to accomplish its mission. First, governments should invest “in innovations that are developed and identified by citizens outside of government who better understand the problems.” Second, the office should support “‘bottom-up’ initiatives, in preference to ‘trickle-down’ philanthropy—because the societal impact of the former is typically greater.” Third, Christensen argues that the office should utilize impact metrics to measure performance and, finally, that it should also invest in social innovation outside of the non-profit sector.
Los Angeles’ most recent citizen-driven social innovation initiative, the Civic Innovation Lab, is an 11-month project aimed at prototyping new solutions for issues within the city of Los Angeles. It is supported by the HubLA, Learn Do Share, the Los Angeles City Tech Bullpen, and Innovate LA, a membership organization within the Los Angeles County Economic Development Corporation. Private and public sector support for such labs, in one of the largest cities in America, is highly unprecedented, and because this initiative in Los Angeles is a new mechanism explicitly supported by the public sector, it warrants a critical check on its motivations and accomplishments. Depending on its success, the Civic Innovation Lab could serve as a model for future municipalities.
The Los Angeles Civic Innovation Lab operates in three main phases: 1) workshops where citizens learn about the possibilities of Open Data and discuss what deep challenges face Los Angeles (called the “Discover, Define, Design” stage), 2) a call for solutions to solve the design challenges brought to light in the first phase, and 3) a six-month accelerator program to prototype selected solutions. I participated in the most recent Civic Innovation Lab session, a three-day workshop concluding the “Discover, Define, Design” phase….”

A New Ebola Crisis Page Built with Open Data


HDX team: “We are introducing a new Ebola crisis page that provides an overview of the data available in HDX. The page includes an interactive map of the worst-affected countries, the top-line figures for the crisis, a graph of cumulative Ebola cases and deaths, and over 40 datasets.
We have been working closely with UNMEER and WHO to make Ebola data available for public use. We have also received important contributions from the British Red Cross, InterAction, MapAction, the Standby Task Force, the US Department of Defense, and WFP, among others.

How we built it

The process to create this page started a couple of months ago by simply linking to existing data sites, such as OpenStreetMap’s geospatial data or OCHA’s common operational datasets. We then created a service by extracting the data on Ebola cases and deaths from the bi-weekly WHO situation report and making the raw files available for analysts and developers.
The OCHA Regional Office in Dakar contributed a dataset that included Ebola cases by district, which they had been collecting from reports by the national Ministries of Health since March 2014. This data was picked up by The New York Times graphics team and by Gapminder which partnered with Google Crisis Response to add the data to the Google Public Data Explorer.

As more organizations shared Ebola datasets through HDX, users started to transform the data into useful graphs and maps. These visuals were then shared back with the wider community through the HDX gallery. We have incorporated many of these user-generated visual elements into the design of our new Ebola crisis page….”
See also Hacking Ebola.
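
The step described under "How we built it", turning per-country counts of cases and deaths into top-line figures, might look something like the sketch below. It is an illustration under stated assumptions: the column names, the placeholder countries and every number in `SAMPLE` are made up, and nothing here reflects the actual HDX files or WHO figures.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows. Column names and values are assumptions for
# illustration, not the HDX schema or real WHO data.
SAMPLE = """date,country,cases,deaths
2014-10-01,A,100,40
2014-10-01,B,50,20
2014-10-15,A,130,55
2014-10-15,B,70,30
"""


def topline_by_date(csv_text):
    """Sum cumulative cases and deaths across countries for each report date."""
    totals = defaultdict(lambda: {"cases": 0, "deaths": 0})
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["date"]]["cases"] += int(row["cases"])
        totals[row["date"]]["deaths"] += int(row["deaths"])
    # ISO dates sort chronologically as plain strings.
    return dict(sorted(totals.items()))


for date, figures in topline_by_date(SAMPLE).items():
    print(date, figures["cases"], figures["deaths"])
```

Keeping the raw rows per country and deriving the totals on demand mirrors the approach in the excerpt, where the raw files are published and the graphs are built on top of them.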

Spain is trialling city monitoring using sound


Springwise: “There’s more traffic on today’s city streets than there ever has been, and managing it all can prove to be a headache for local authorities and transport bodies. In the past, we’ve seen the City of Calgary in Canada detect drivers’ Bluetooth signals to develop a map of traffic congestion. Now the EAR-IT project in Santander, Spain, is using acoustic sensors to measure the sounds of city streets and determine real time activity on the ground.
Launched as part of the autonomous community’s SmartSantander initiative, the experimental scheme placed hundreds of acoustic processing units around the region. These pick up the sounds being made in any given area and, when processed through an audio recognition engine, can provide data about what’s going on on the street. Smaller ‘motes’ were also developed to provide more accurate location information about each sound.
Created by members of Portugal’s UNINOVA institute and IT consultants EGlobalMark, the system was able to use city noises to detect things such as traffic congestion, parking availability and the location of emergency vehicles based on their sirens. It could then automatically trigger smart signs to display up-to-date information, for example.
The team particularly focused on a junction near the city hospital that’s a hotspot for motor accidents. Rather than force ambulance drivers to risk passing through a red light and into lateral traffic, the sensors were able to detect when and where an emergency vehicle was coming through and automatically change the lights in their favor.
The system could also be used to pick up ‘sonic events’ such as gunshots or explosions and detect their location. The researchers have also trialled an indoor version that can sense if an elderly resident has fallen over or to turn lights off when the room becomes silent.”

Seattle Launches Sweeping, Ethics-Based Privacy Overhaul


for the Privacy Advisor: “The City of Seattle this week launched a citywide privacy initiative aimed at providing greater transparency into the city’s data collection and use practices.
To that end, the city has convened a group of stakeholders, the Privacy Advisory Committee, comprising various government departments, to look at the ways the city is using data collected from practices as common as utility bill payments and renewing pet licenses or during the administration of emergency services like police and fire. By this summer, the committee will deliver the City Council suggested principles and a “privacy statement” to provide direction on privacy practices citywide.
In addition, the city has partnered with the University of Washington, where Jan Whittington, assistant professor of urban design and planning and associate director at the Center for Information Assurance and Cybersecurity, has been given a $50,000 grant to look at open data, privacy and digital equity and how municipal data collection could harm consumers.
Responsible for all things privacy in this progressive city is Michael Mattmiller, who was hired to the position of chief technology officer (CTO) for the City of Seattle in June. Before his current gig, he worked as a senior strategist in enterprise cloud privacy for Microsoft. He said it’s an exciting time to be at the helm of the office because there’s momentum, there’s talent and there’s intention.
“We’re at this really interesting time where we have a City Council that strongly cares about privacy … We have a new police chief who wants to be very good on privacy … We also have a mayor who is focused on the city being an innovative leader in the way we interact with the public,” he said.
In fact, some City Council members have taken it upon themselves to meet with various groups and coalitions. “We have a really good, solid environment we think we can leverage to do something meaningful,” Mattmiller said….
Armbruster said the end goal is to create policies that will hold weight over time.
“I think when looking at privacy principles, from an ethical foundation, the idea is to create something that will last while technology dances around us,” she said, adding the principles should answer the question, “What do we stand for as a city and how do we want to move forward? So any technology that falls into our laps, we can evaluate and tailor or perhaps take a pass on as it falls under our ethical framework.”
The bottom line, Mattmiller said, is making a decision that says something about Seattle and where it stands.
“How do we craft a privacy policy that establishes who we want to be as a city and how we want to operate?” Mattmiller asked.”

OpenUp Corporate Data while Protecting Privacy


Article by Stefaan G. Verhulst and David Sangokoya, (The GovLab) for the OpenUp? Blog: “Consider a few numbers: By the end of 2014, the number of mobile phone subscriptions worldwide is expected to reach 7 billion, nearly equal to the world’s population. More than 1.82 billion people communicate on some form of social network, and almost 14 billion sensor-laden everyday objects (trucks, health monitors, GPS devices, refrigerators, etc.) are now connected and communicating over the Internet, creating a steady stream of real-time, machine-generated data.
Much of the data generated by these devices is today controlled by corporations. These companies are in effect “owners” of terabytes of data and metadata. Companies use this data to aggregate, analyze, and track individual preferences, provide more targeted consumer experiences, and add value to the corporate bottom line.
At the same time, even as we witness a rapid “datafication” of the global economy, access to data is emerging as an increasingly critical issue, essential to addressing many of our most important social, economic, and political challenges. While the rise of the Open Data movement has opened up over a million datasets around the world, much of this openness is limited to government (and, to a lesser extent, scientific) data. Access to corporate data remains extremely limited. This is a lost opportunity. If corporate data—in the form of Web clicks, tweets, online purchases, sensor data, call data records, etc.—were made available in a de-identified and aggregated manner, researchers, public interest organizations, and third parties would gain greater insights on patterns and trends that could help inform better policies and lead to greater public good (including combatting Ebola).
Corporate data sharing holds tremendous promise. But its potential—and limitations—are also poorly understood. In what follows, we share early findings of our efforts to map this emerging open data frontier, along with a set of reflections on how to safeguard privacy and other citizen and consumer rights while sharing. Understanding the practice of shared corporate data—and assessing the associated risks—is an essential step in increasing access to socially valuable data held by businesses today. This is a challenge certainly worth exploring during the forthcoming OpenUp conference!
Understanding and classifying current corporate data sharing practices
Corporate data sharing remains very much a fledgling field. There has been little rigorous analysis of different ways or impacts of sharing. Nonetheless, our initial mapping of the landscape suggests there have been six main categories of activity—i.e., ways of sharing—to date:…
Assessing risks of corporate data sharing
Although the shared corporate data offers several benefits for researchers, public interest organizations, and other companies, there do exist risks, especially regarding personally identifiable information (PII). When aggregated, PII can serve to help understand trends and broad demographic patterns. But if PII is inadequately scrubbed and aggregated data is linked to specific individuals, this can lead to identity theft, discrimination, profiling, and other violations of individual freedom. It can also lead to significant legal ramifications for corporate data providers….”
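
To illustrate what sharing data "in a de-identified and aggregated manner" could mean in practice, here is a minimal sketch. It uses small-count suppression, a common disclosure-control technique that the article itself does not describe; the `aggregate_counts` name, the field names and the threshold of 5 are all assumptions chosen for the example.

```python
from collections import Counter


def aggregate_counts(records, key, min_count=5):
    """Count records per group and drop groups smaller than min_count.

    Only group-level counts are returned, never individual records.
    The threshold is an illustrative assumption, not a legal or
    regulatory standard.
    """
    counts = Counter(record[key] for record in records)
    return {group: n for group, n in counts.items() if n >= min_count}


# Hypothetical call records: the identifier is never copied to the output.
records = (
    [{"subscriber_id": i, "district": "North"} for i in range(6)]
    + [{"subscriber_id": 100 + i, "district": "South"} for i in range(2)]
)

print(aggregate_counts(records, "district"))
```

Suppressing small groups reduces the chance that an aggregate can be traced back to a specific person, though on its own it does not guarantee that the data cannot be re-identified.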