Open Data Supply: Enriching the usability of information


Report by Phoensight: “With the emergence of increasing computational power, high cloud storage capacity and big data comes an eager anticipation of one of the biggest IT transformations of our society today.

Open data has an instrumental role to play in our digital revolution by creating unprecedented opportunities for governments and businesses to leverage previously unavailable information to strengthen their analytics and decision-making for new client experiences. Whilst virtually every business recognises the value of data and the importance of the analytics built on it, the ability to realise the potential for maximising revenue and cost savings is not straightforward. The discovery of valuable insights often involves the acquisition of new data and an understanding of it. As we move towards an increasing supply of open data, technological and other entrepreneurs will look to better utilise government information for improved productivity.

This report uses a data-centric approach to examine the usability of information by considering ways in which open data could better facilitate data-driven innovations and further boost our economy. It assesses the state of open data today and suggests ways in which data providers could supply open data to optimise its use. A number of useful measures of information usability, such as accessibility, quantity, quality and openness, are presented, which together contribute to the Open Data Usability Index (ODUI). For the first time, a comprehensive assessment of open data usability has been developed and is expected to be a critical step in taking the open data agenda to the next level.
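
The excerpt lists the measures feeding the ODUI but not how they are combined. As a minimal sketch of one plausible composite, assuming equal weights and 0–100 dimension scores (neither of which Phoensight specifies in this excerpt):

```python
# Hypothetical composite in the spirit of the ODUI. The four dimensions come
# from the report excerpt; the equal weights and 0-100 scale are assumptions.

def usability_index(scores, weights=None):
    """Combine per-dimension scores (0-100) into a single usability index."""
    dimensions = ["accessibility", "quantity", "quality", "openness"]
    if weights is None:
        weights = {d: 0.25 for d in dimensions}  # assumed equal weighting
    return sum(scores[d] * weights[d] for d in dimensions)

# Example: a dataset that is easy to access but poorly documented.
print(usability_index({"accessibility": 90, "quantity": 70,
                       "quality": 40, "openness": 80}))  # -> 70.0
```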

With over two million government datasets assessed against the open data usability framework, and models developed to link entire countries’ datasets to key industry sectors, never before has such an extensive analysis been undertaken. Government open data across Australia, Canada, Singapore, the United Kingdom and the United States reveals that most countries have room to improve their information usability. It was found that for 2015 the United Kingdom led the way, followed by Canada, Singapore, the United States and Australia. The global potential of government open data is expected to reach 20 exabytes by 2020, provided governments are able to release as much data as possible within legislative constraints….(More)”

The Open Data Barometer (3rd edition)


The Open Data Barometer: “Once the preserve of academics and statisticians, data has become a development cause embraced by everyone from grassroots activists to the UN Secretary-General. There’s now a clear understanding that we need robust data to drive democracy and development — and a lot of it.

Last year, the world agreed the Sustainable Development Goals (SDGs) — seventeen global commitments that set an ambitious agenda to end poverty, fight inequality and tackle climate change by 2030. Recognising that good data is essential to the success of the SDGs, the Global Partnership for Sustainable Development Data and the International Open Data Charter were launched as the SDGs were unveiled. These alliances mean the “data revolution” now has over 100 champions willing to fight for it. Meanwhile, Africa adopted the African Data Consensus — a roadmap to improving data standards and availability in a region that has notoriously struggled to capture even basic information such as birth registration.

But while much has been made of the need for bigger and better data to power the SDGs, this year’s Barometer follows the lead set by the International Open Data Charter by focusing on how much of this data will be openly available to the public.

Open data is essential to building accountable and effective institutions, and to ensuring public access to information — both goals of SDG 16. It is also essential for meaningful monitoring of progress on all 169 SDG targets. Yet the promise and possibilities offered by opening up data to journalists, human rights defenders, parliamentarians, and citizens at large go far beyond even these….

At a glance, here are this year’s key findings on the state of open data around the world:

    • Open data is entering the mainstream. The majority of the countries in the survey (55%) now have an open data initiative in place and a national data catalogue providing access to datasets available for re-use. Moreover, new open data initiatives are getting underway or are promised for the near future in a number of countries, including Ecuador, Jamaica, St. Lucia, Nepal, Thailand, Botswana, Ethiopia, Nigeria, Rwanda and Uganda. Demand is high: civil society and the tech community are using government data in 93% of countries surveyed, even in countries where that data is not yet fully open.
    • Despite this, there’s been little to no progress on the number of truly open datasets around the world. Even with the rapid spread of open government data plans and policies, too much critical data remains locked in government filing cabinets. For example, only two countries publish acceptably detailed open public spending data. Of all 1,380 government datasets surveyed, almost 90% are still closed — roughly the same as in the last edition of the Open Data Barometer (when only 130 out of 1,290 datasets, or 10%, were open). What is more, much of the approximately 10% of data that meets the open definition is of poor quality, making it difficult for potential data users to access, process and work with it effectively.
    • “Open-washing” is jeopardising progress. Many governments have advertised their open data policies as a way to burnish their democratic and transparent credentials. But open data, while extremely important, is just one component of a responsive and accountable government. Open data initiatives cannot be effective if not supported by a culture of openness where citizens are encouraged to ask questions and engage, and supported by a legal framework. Disturbingly, in this edition we saw a backslide on freedom of information, transparency, accountability, and privacy indicators in some countries. Until all these factors are in place, open data cannot be a true SDG accelerator.
    • Implementation and resourcing are the weakest links. Progress on the Barometer’s implementation and impact indicators has stalled or even gone into reverse in some cases. Open data can result in net savings for the public purse, but getting individual ministries to allocate the budget and staff needed to publish their data is often an uphill battle, and investment in building user capacity (both inside and outside of government) is scarce. Open data is not yet entrenched in law or policy, and the legal frameworks supporting most open data initiatives are weak. This is a symptom of the tendency of governments to view open data as a fad or experiment with little to no long-term strategy behind its implementation. This results in haphazard implementation, weak demand and limited impact.
    • The gap between data haves and have-nots needs urgent attention. Twenty-six of the top 30 countries in the ranking are high-income countries. Half of open datasets in our study are found in just the top 10 OECD countries, while almost none are in African countries. As the UN pointed out last year, such gaps could create “a whole new inequality frontier” if allowed to persist. Open data champions in several developing countries have launched fledgling initiatives, but too often those good open data intentions are not adequately resourced, resulting in weak momentum and limited success.
    • Governments at the top of the Barometer are being challenged by a new generation of open data adopters. Traditional open data stalwarts such as the USA and UK have seen their rate of progress on open data slow, signalling that new political will and momentum may be needed as more difficult elements of open data are tackled. Fortunately, a new generation of open data adopters, including France, Canada, Mexico, Uruguay, South Korea and the Philippines, are starting to challenge the ranking leaders and are adopting a leadership attitude in their respective regions. The International Open Data Charter could be an important vehicle to sustain and increase momentum in challenger countries, while also stimulating renewed energy in traditional open data leaders….(More)”

Emerging urban digital infomediaries and civic hacking in an era of big data and open data initiatives


Chapter by Thakuriah, P., Dirks, L., and Keita, Y. in Seeing Cities Through Big Data: Research Methods and Applications in Urban Informatics (forthcoming): “This paper assesses non-traditional urban digital infomediaries who are pushing the agenda of urban Big Data and Open Data. Our analysis identified a mix of private, public, non-profit and informal infomediaries, ranging from very large organizations to independent developers. Using a mixed-methods approach, we identified four major groups of organizations within this dynamic and diverse sector: general-purpose ICT providers, urban information service providers, open and civic data infomediaries, and independent and open source developers. A total of nine organizational types are identified within these four groups. We align these nine organizational types along five dimensions that account for their missions and major interests, products and services, and the activities they undertake: techno-managerial, scientific, business and commercial, urban engagement, and openness and transparency. We discuss urban ICT entrepreneurs, and the role of informal networks involving independent developers, data scientists and civic hackers in a field that has historically involved professionals from urban planning and public management. Additionally, we examine convergence in the sector by analyzing overlaps in their activities, as determined by a text-mining exercise of organizational webpages. We also consider increasing similarities in products and services offered by the infomediaries, while highlighting ideological tensions that might arise given the overall complexity of the sector and differences in the backgrounds and end-goals of the participants involved. There is much room for creation of knowledge and value networks in the urban data sector and for improved cross-fertilization among bodies of knowledge….(More)”
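
The abstract mentions a text-mining exercise over organizational webpages without describing the method. A common way to quantify such overlaps is TF-IDF vectors with cosine similarity; the sketch below is an illustrative stand-in for the authors’ unspecified approach, with invented page snippets:

```python
# Illustrative overlap analysis: TF-IDF + cosine similarity over webpage text.
# The organization names and page snippets are invented placeholders; this is
# not the chapter's actual method, which the excerpt does not specify.
import itertools

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

pages = {
    "general_ict_provider": "cloud platforms analytics consulting smart city services",
    "civic_data_infomediary": "open data portals civic hacking transparency datasets",
    "independent_developer": "open source apps civic hacking transit datasets apis",
}

tfidf = TfidfVectorizer().fit_transform(pages.values())
overlap = cosine_similarity(tfidf)  # pairwise similarity between organizations

names = list(pages)
for i, j in itertools.combinations(range(len(names)), 2):
    print(f"{names[i]} vs {names[j]}: {overlap[i, j]:.2f}")
```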

Open Data and Beyond


Paper by Frederika Welle Donker, Bastiaan van Loenen and Arnold K. Bregt: “In recent years, there has been an increasing trend of releasing public sector information as open data. Governments worldwide see the potential benefits of opening up their data: more transparency, increased governmental efficiency and effectiveness, and external societal and economic benefits. The private sector also recognizes the potential benefits of making its datasets available as open data. One such company is Liander, an energy network administrator in the Netherlands. Liander views open data as a contributing factor to energy conservation. However, to date there has been little research into the actual effects of open data. This research has developed a monitoring framework to assess the effects of open data, and has applied the framework to Liander’s small-scale energy consumption dataset….(More)”

OpenTrials: towards a collaborative open database of all available information on all clinical trials


Paper by Ben Goldacre and Jonathan Gray at BioMed Central: “OpenTrials is a collaborative and open database for all available structured data and documents on all clinical trials, threaded together by individual trial. With a versatile and expandable data schema, it is initially designed to host and match the following documents and data for each trial: registry entries; links, abstracts, or texts of academic journal papers; portions of regulatory documents describing individual trials; structured data on methods and results extracted by systematic reviewers or other researchers; clinical study reports; and additional documents such as blank consent forms, blank case report forms, and protocols. The intention is to create an open, freely re-usable index of all such information and to increase discoverability, facilitate research, identify inconsistent data, enable audits on the availability and completeness of this information, support advocacy for better data and drive up standards around open data in evidence-based medicine. The project has phase I funding. This will allow us to create a practical data schema and populate the database initially through web-scraping, basic record linkage techniques, crowd-sourced curation around selected drug areas, and import of existing sources of structured data and documents. It will also allow us to create user-friendly web interfaces onto the data and conduct user engagement workshops to optimise the database and interface designs. Where other projects have set out to manually and perfectly curate a narrow range of information on a smaller number of trials, we aim to use a broader range of techniques and attempt to match a very large quantity of information on all trials. We are currently seeking feedback and additional sources of structured data….(More)”
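
The paper names “basic record linkage techniques” without detail. As a hedged sketch of what threading documents together by trial might look like, using normalized ClinicalTrials.gov NCT identifiers as the linking key (the registry snippet, helper function, and matching rule here are hypothetical; OpenTrials’ actual schema is not given in this excerpt):

```python
# Hypothetical record linkage: thread a journal paper to a registry entry by
# extracting and canonicalizing an NCT identifier from free text.
import re

NCT_ID = re.compile(r"NCT[\s-]*(\d{8})", re.IGNORECASE)

def normalize_id(text):
    """Return a canonical NCT identifier found in free text, or None."""
    match = NCT_ID.search(text)
    return f"NCT{match.group(1)}" if match else None

registry = {"NCT01234567": {"title": "Example trial", "papers": []}}
papers = [{"abstract": "Results of trial nct 01234567 ...", "doi": "10.x/abc"}]

for paper in papers:
    trial_id = normalize_id(paper["abstract"])
    if trial_id in registry:
        registry[trial_id]["papers"].append(paper["doi"])

print(registry)  # the paper's DOI is now threaded under NCT01234567
```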

New Orleans Gamifies the City Budget


Kelsey E. Thomas at Next City: “New Orleanians can try their hand at being “mayor for a day” with a new interactive website released Wednesday by the Committee for a Better New Orleans.

The Big Easy Budget Game uses open data from the city to allow players to create their own version of an operating budget. Players are given a digital $602 million and have to balance the budget — keeping in mind the government’s responsibilities, the previous year’s spending and their personal priorities.

Each department in the game has a minimum funding level (players can’t just quit funding public schools if they feel like it), and restricted funding, such as state or federal dollars, is off limits.
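
Mechanically, the game is a constrained allocation problem: spending must sum to the fixed total while respecting each department’s floor. A toy sketch of that validation logic follows; only the $602 million total comes from the article, and the departments and minimums are invented:

```python
# Toy validation of a proposed game budget. Departments and floors are
# invented for illustration; only the $602M total is from the article.

TOTAL = 602_000_000  # the game's fixed operating budget

MINIMUMS = {  # hypothetical per-department funding floors
    "public_schools": 40_000_000,
    "police": 120_000_000,
    "parks": 5_000_000,
}

def validate(allocations):
    """Return a list of rule violations for a proposed budget."""
    errors = [f"{dept} is below its minimum funding level"
              for dept, floor in MINIMUMS.items()
              if allocations.get(dept, 0) < floor]
    if sum(allocations.values()) != TOTAL:
        errors.append("budget does not balance against the $602M total")
    return errors

proposal = {"public_schools": 60_000_000, "police": 130_000_000,
            "parks": 12_000_000, "everything_else": 400_000_000}
print(validate(proposal))  # -> [] means the budget balances
```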

CBNO hopes to attract 600 players this year, and plans to compile the data from each player into a crowdsourced meta-budget called “The People’s Budget.” Next fall, the People’s Budget will be released along with the city’s proposed 2017 budget.

Along with the budgeting game, CBNO released a more detailed website, also built on the city’s open data, that breaks down the city’s budgeted versus actual spending from 2007 onward and can be filtered. The goal is to allow users without big-data experience to easily research funding relevant to their neighborhoods.

Many cities have been releasing interactive websites to make their data more accessible to residents. Checkbook NYC updates more than $70 billion in city expenses daily and breaks them down by transaction. Fiscal Focus Pittsburgh is an online visualization tool that outlines revenues and expenses in the city’s budget….(More)”

Open data and the API economy: when it makes sense to give away data


Article at ZDNet: “Open data is one of those refreshing trends that flows in the opposite direction of the culture of fear that has developed around data security. Instead of putting data under lock and key, surrounded by firewalls and sandboxes, some organizations see value in making data available to all comers — especially developers.

The GovLab.org, a nonprofit advocacy group, published an overview of the benefits governments and organizations are realizing from open data, as well as some of the challenges. The group defines open data as “publicly available data that can be universally and readily accessed, used and redistributed free of charge. It is structured for usability and computability.”…

For enterprises, an open-data stance may be the fuel to build a vibrant ecosystem of developers and business partners. Scott Feinberg, API architect for The New York Times, is one of the people helping to lead the charge to open-data ecosystems. In a recent CXOTalk interview with ZDNet colleague Michael Krigsman, he explains how, through the NYT APIs program, developers can sign up for access to 165 years’ worth of content.

But it requires a lot more than simply throwing some APIs out into the market. Establishing such a comprehensive effort across APIs requires a change in mindset that many organizations may not be ready for, Feinberg cautions. “You can’t be stingy,” he says. “You have to just give it out. When we launched our developer portal, there were a lot of questions like, are people going to be stealing our data, questions like that. Just give it away. You don’t have to give it all, but don’t be stingy, and you will find that, first off, not that many people are going to use it at first. You’re going to find that out. But the people who do, you’re going to find those passionate people who are really interested in using your data in new ways.”

Feinberg clarifies that the NYT’s APIs are not giving out articles for free. Rather, he explains, “What we give is everything but article content. You can search for articles. You can find out what’s trending. You can almost do anything you want with our data through our APIs, with the exception of actually reading all of the content. It’s really about giving people the opportunity to really interact with your content in ways that you’ve never thought of, and empowering your community to figure out what they want. You know, while we don’t give our actual article text away, we give pretty much everything else, and people build a lot of really cool stuff on top of that.”
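
As a concrete illustration of the kind of access Feinberg describes, here is a minimal sketch of querying the NYT Article Search API, one of the APIs offered through the developer portal. The endpoint and response shape follow the NYT’s public documentation but may have changed since; YOUR_API_KEY is a placeholder for a key obtained from developer.nytimes.com:

```python
# Minimal Article Search query: returns metadata and links, not article text,
# which matches Feinberg's description of what the APIs expose.
import requests

URL = "https://api.nytimes.com/svc/search/v2/articlesearch.json"
params = {"q": "open data", "api-key": "YOUR_API_KEY"}  # placeholder key

response = requests.get(URL, params=params, timeout=10)
response.raise_for_status()

for doc in response.json()["response"]["docs"]:
    print(doc["headline"]["main"], doc["web_url"])
```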

Open data sets, of course, have to be worthy of the APIs that offer them. In his post, Borne outlines the seven qualities open data needs to have to be of value to developers and consumers. (Yes, they’re also “Vs” like big data.)

  1. Validity: It’s “critical to pay attention to these data validity concerns when your organization’s data are exposed to scrutiny and inspection by others,” Borne states.
  2. Value: The data needs to be the font of new ideas, new businesses, and innovations.
  3. Variety: Exposing the wide variety of data available can be “a scary proposition for any data scientist,” Borne observes, but nonetheless is essential.
  4. Voice: Remember that “your open data becomes the voice of your organization to your stakeholders.”
  5. Vocabulary: “The semantics and schema (data models) that describe your data are more critical than ever when you provide the data for others to use,” says Borne. “Search, discovery, and proper reuse of data all require good metadata, descriptions, and data modeling.”
  6. Vulnerability: Accept that open data, because it is so open, will be subjected to “misuse, abuse, manipulation, or alteration.”
  7. proVenance: This is the governance requirement behind open data offerings. “Provenance includes ownership, origin, chain of custody, transformations that have been made to it, processing that has been applied to it (including which versions of processing software were used), the data’s uses and their context, and more,” says Borne….(More)”

When open data is a Trojan Horse: The weaponization of transparency in science and governance


Karen E.C. Levy and David Merritt Johns in Big Data and Society: “Openness and transparency are becoming hallmarks of responsible data practice in science and governance. Concerns about data falsification, erroneous analysis, and misleading presentation of research results have recently strengthened the call for new procedures that ensure public accountability for data-driven decisions. Though we generally count ourselves in favor of increased transparency in data practice, this Commentary highlights a caveat. We suggest that legislative efforts that invoke the language of data transparency can sometimes function as “Trojan Horses” through which other political goals are pursued. Framing these maneuvers in the language of transparency can be strategic, because approaches that emphasize open access to data carry tremendous appeal, particularly in current political and technological contexts. We illustrate our argument through two examples of pro-transparency policy efforts, one historical and one current: industry-backed “sound science” initiatives in the 1990s, and contemporary legislative efforts to open environmental data to public inspection. Rules that exist mainly to impede science-based policy processes weaponize the concept of data transparency. The discussion illustrates that, much as Big Data itself requires critical assessment, the processes and principles that attend it—like transparency—also carry political valence, and, as such, warrant careful analysis….(More)”

Liberating data for public value: The case of Data.gov


Paper by Rashmi Krishnamurthy and Yukika Awazu in the International Journal of Information Management: “Public agencies around the globe are liberating their data. Drawing on a case of Data.gov, we outline the challenges and opportunities that lie ahead for the liberation of public data. Data.gov is an online portal that provides open access to datasets generated by US public agencies and countries around the world in a machine-readable format. By discussing the challenges and opportunities faced by Data.gov, we provide several lessons that can inform research and practice. We suggest that providing access to open data in itself does not spur innovation. Specifically, we claim that public agencies need to spend resources to improve the capacities of their organizations to move toward ‘open data by default’; develop the capacity of communities to use data to solve problems; and think critically about the unintended consequences of providing access to public data. We also suggest that public agencies need better metrics to evaluate the success of open-data efforts in achieving their goals….(More)”
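
For a sense of what machine-readable access to Data.gov looks like in practice: the catalog runs on CKAN, whose standard action API supports programmatic dataset search. A minimal sketch, treating the endpoint and response shape as assumptions to verify against the current catalog.data.gov documentation:

```python
# Search the Data.gov catalog via CKAN's standard package_search action.
import requests

URL = "https://catalog.data.gov/api/3/action/package_search"
response = requests.get(URL, params={"q": "energy consumption", "rows": 5},
                        timeout=10)
response.raise_for_status()

for dataset in response.json()["result"]["results"]:
    publisher = (dataset.get("organization") or {}).get("title", "unknown")
    print(dataset["title"], "-", publisher)
```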

Open Data Impact: When Demand and Supply Meet


Stefaan Verhulst and Andrew Young at the GovLab: “Today, in “Open Data Impact: When Demand and Supply Meet,” the GovLab and Omidyar Network release key findings about the social, economic, cultural and political impact of open data. The findings are based on 19 detailed case studies of open data projects from around the world. These case studies were prepared in order to address an important shortcoming in our understanding of when, and how, open data works. While there is no shortage of enthusiasm for open data’s potential, nor of conjectural estimates of its hypothetical impact, few rigorous, systematic analyses exist of its concrete, real-world impact…. The 19 case studies that inform this report, all of which can be found at Open Data’s Impact (odimpact.org), a website specially set up for this project, were chosen for their geographic and sectoral representativeness. They seek to go beyond the descriptive (what happened) to the explanatory (why it happened, and what is the wider relevance or impact)….

In order to achieve the potential of open data and scale the impact of the individual projects discussed in our report, we need a better – and more granular – understanding of the enabling conditions that lead to success. We found 4 central conditions (“4Ps”) that play an important role in ensuring success:


  • Partnerships: Intermediaries and data collaboratives play an important role in ensuring success, allowing for enhanced matching of supply and demand of data.
  • Public infrastructure: Developing open data as a public infrastructure, open to all, enables wider participation, and a broader impact across issues and sectors.
  • Policies: Clear policies regarding open data, including those promoting regular assessments of open data projects, are also critical for success.
  • Problem definition: Open data initiatives that have a clear target or problem definition have more impact and are more likely to succeed than those with vaguely worded statements of intent or unclear reasons for existence. 

Core Challenges

Finally, the success of a project is also determined by the obstacles and challenges it confronts. Our research uncovered 4 major challenges (“4Rs”) confronting open data initiatives across the globe:


  • Readiness: A lack of readiness or capacity (evident, for example, in low Internet penetration or technical literacy rates) can severely limit the impact of open data.
  • Responsiveness: Open data projects are significantly more likely to be successful when they remain agile and responsive—adapting, for instance, to user feedback or early indications of success and failure.
  • Risks: For all its potential, open data does pose certain risks, notably to privacy and security; a greater, more nuanced understanding of these risks will be necessary to address and mitigate them.
  • Resource Allocation: While open data projects can often be launched cheaply, those projects that receive generous, sustained and committed funding have a better chance of success over the medium and long term.

Toward a Next Generation Open Data Roadmap

The report we release today concludes with ten recommendations for policymakers, advocates, users, funders and other stakeholders in the open data community. For each step, we include a few concrete methods of implementation – ways to translate the broader recommendation into meaningful impact.

Together, these 10 recommendations and their means of implementation amount to what we call a “Next Generation Open Data Roadmap.” This roadmap is just a start, and we plan to continue fleshing it out in the near future. For now, it offers a way forward. It is our hope that this roadmap will help guide future research and experimentation so that we can continue to better understand how the potential of open data can be fulfilled across geographies, sectors and demographics.

Additional Resources

In conjunction with the release of our key findings paper, we also launch today an “Additional Resources” section on the Open Data’s Impact website. The goal of that section is to provide context on our case studies, and to point in the direction of other, complementary research. It includes the following elements:

  • A “repository of repositories,” including other compendiums of open data case studies and sources;
  • A compilation of some popular open data glossaries;
  • A number of open data research publications and reports, with a particular focus on impact;
  • A collection of open data definitions and a matrix of analysis to help assess those definitions….(More)”