OpenPrism


Thomas Levine: “There are loads of open data portals. There’s even a portal about data portals. And each of these portals has loads of datasets.
OpenPrism is my most recent attempt at understanding what is going on in all of these portals. Read on if you want to see why I made it, or just go to the site and start playing with it.

People don’t know much about open data

Nobody seems to know what is in the data portals. Many people know about datasets that are relevant to their work, municipality, &c., but nobody seems to know about the availability of data on broader topics, and nobody seems to have a good way of finding out what is available.
If someone does know any of this, he probably works for an open data portal. Still, he probably doesn’t know much about what is going on in other portals.

Naive search method

One difficulty in discovering open data is the search paradigm.
Open data portals approach searching data as if data were normal prose; your search terms are some keywords, a category, &c., and your results are dataset titles and descriptions.
There are other approaches. For example, AppGen searches for datasets with the same variables as each other, and the results are automatically generated app prototypes.

Siloed open data portals

Another issue is that people tend to use data from only one portal; they use their local government’s portals or their organizations’ portals.
Let me give you a couple examples of why this should maybe be different. Perhaps I’m considering making an app to help people find parking, and I want to see what parking lot data are available before I put much work into the app. Or maybe I want to find all of the data about sewer overflows so that I can expand my initiative to reduce water pollution.
OpenPrism is one small attempt at making it easier to search. Rather than going to all of the different portals and making a separate search for each portal, you type your search in one search bar, and you get results from a bunch of different Socrata, CKAN and Junar portals.”
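The one-search-bar idea described above can be sketched as a small fan-out. This is not OpenPrism's actual code: the function names, the `searchers` mapping, and the `portal.example.org` host are assumptions made for illustration. Only the CKAN `package_search` action is a real endpoint; Socrata and Junar searchers would be plugged in as additional callables.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode


def ckan_search_url(portal, query, rows=10):
    """Build a CKAN package_search URL for one portal (portal URL is supplied by the caller)."""
    params = urlencode({"q": query, "rows": rows})
    return f"{portal.rstrip('/')}/api/3/action/package_search?{params}"


def federated_search(query, searchers):
    """Send the same query to every portal and merge the results.

    `searchers` (hypothetical) maps a portal name to a callable that takes
    the query and returns a list of dataset titles. A portal that raises is
    skipped, so one failing portal does not sink the whole search.
    """
    def run(item):
        name, search = item
        try:
            return [(name, title) for title in search(query)]
        except Exception:
            return []

    # Query the portals concurrently; pool.map keeps the input order.
    with ThreadPoolExecutor(max_workers=8) as pool:
        batches = list(pool.map(run, searchers.items()))
    return [hit for batch in batches for hit in batch]
```

Each searcher hides its portal's own API, so the merge step does not need to know whether a result came from Socrata, CKAN or Junar.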

Open Access


Reports by the UK’s House of Commons, Business, Innovation and Skills Committee: “Open access refers to the immediate, online availability of peer reviewed research articles, free at the point of access (i.e. without subscription charges or paywalls). Open access relates to scholarly articles and related outputs. Open data (which is a separate area of Government policy and outside the scope of this inquiry) refers to the availability of the underlying research data itself. At the heart of the open access movement is the principle that publicly funded research should be publicly accessible. Open access expanded rapidly in the late twentieth century with the growth of the internet and digitisation (the transcription of data into a digital form), as it became possible to disseminate research findings more widely, quickly and cheaply.
Whilst there is widespread agreement that the transition to open access is essential in order to improve access to knowledge, there is a lack of consensus about the best route to achieve it. To achieve open access at scale in the UK, there will need to be a shift away from the dominant subscription-based business model. Inevitably, this will involve a transitional period and considerable change within the scholarly publishing market.
For the UK to transition to open access, an effective, functioning and competitive market in scholarly communications will be vital. The evidence we saw over the course of this inquiry shows that this is currently far from the case, with journal subscription prices rising at rates that are unsustainable for UK universities and other subscribers. There is a significant risk that the Government’s current open access policy will inadvertently encourage and prolong the dysfunctional elements of the scholarly publishing market, which are a major barrier to access.”
See Volume I and Volume II

Understanding the impact of releasing and re-using open government data


New Report by the European Public Sector Information Platform: “While there has been a proliferation of open data portals and data re-using tools and applications of tremendous speed in the last decade, research and understanding about the impact of opening up public sector information and open government data (OGD hereinafter) has been lagging behind.
Until now, there have been some research efforts to structure the concept of the impact of OGD, suggesting various theories of change, their measuring methodologies or, in some cases, concrete calculations as to what financial benefits opening government data brings to the table. For instance, the European Commission conducted a study on pricing of public sector information, which attempted to evaluate the direct and indirect economic impact of opening public data and identified key indicators to monitor the effects of open data portals. Also, the Open Data Research Network issued a background report in April 2012 suggesting a general framework of key indicators to measure the impact of open data initiatives at both the provision and re-use stages.
Building on the research efforts up to date, this report will reflect upon the main types of impacts OGD may have and will also present key measuring frameworks to observe the change OGD initiatives may bring about.”

Open data for accountable governance: Is data literacy the key to citizen engagement?


at UNDP’s Voices of Eurasia blog: “How can technology connect citizens with governments, and how can we foster, harness, and sustain the citizen engagement that is so essential to anti-corruption efforts?
UNDP has worked on a number of projects that use technology to make it easier for citizens to report corruption to authorities:

These projects are showing some promising results, and provide insights into how a more participatory, interactive government could develop.
At the heart of the projects is the ability to use citizen generated data to identify and report problems for governments to address….

Wanted: Citizen experts

As Kenneth Cukier, The Economist’s Data Editor, has discussed, data literacy will become the new computer literacy. Big data is still nascent and it is impossible to predict exactly how it will affect society as a whole. What we do know is that it is here to stay and data literacy will be integral to our lives.
It is essential that we understand how to interact with big data and the possibilities it holds.
Data literacy needs to be integrated into the education system. Educating non-experts to analyze data is critical to enabling broad participation in this new data age.
As technology advances, key government functions become automated, and government data sharing increases, newer ways for citizens to engage will multiply.
Technology changes rapidly, but the human mind and societal habits cannot. After years of closed government and bureaucratic inefficiency, adaptation of a new approach to governance will take time and education.
We need to bring up a generation that sees being involved in government decisions as normal, and that views participatory government as a right, not an ‘innovative’ service extended by governments.

What now?

In the meantime, while data literacy lies in the hands of a few, we must continue to connect those who have the technological skills with citizen experts seeking to change their communities for the better – as has been done in many Social Innovation Camps recently (in Montenegro, Ukraine and Armenia at Mardamej and Mardamej Reloaded and across the region at Hurilab).
The social innovation camp and hackathon models are an increasingly debated topic (covered by Susannah Vila, David Eaves, Alex Howard and Clay Johnson).
On the whole, evaluations are leading to newer models that focus on greater integration of mentorship to increase sustainability – which I readily support. However, I do have one comment:
Social innovation camps are often criticized for a lack of sustainability – a claim based on the limited number of apps that go beyond the prototype phase. I find a certain sense of irony in this, for isn’t this what innovation is about: Opening oneself up to the risk of failure in the hope of striking something great?
In the words of Vinod Khosla:

“No failure means no risk, which means nothing new.”

As more data is released, the opportunity for new apps and new ways for citizen interaction will multiply and, who knows, someone might come along and transform government just as TripAdvisor transformed the travel industry.”

Public Open Data: The Good, the Bad, the Future


at IDEALAB: “Some of the most powerful tools combine official public data with social media or other citizen input, such as the recent partnership between Yelp and the public health departments in New York and San Francisco for restaurant hygiene inspection ratings. In other contexts, such tools can help uncover and ultimately reduce corruption by making it easier to “follow the money.”
Despite the opportunities offered by “free data,” this trend also raises new challenges and concerns, among them, personal privacy and security. While attention has been devoted to the unsettling power of big data analysis and “predictive analytics” for corporate marketing, similar questions could be asked about the value of public data. Does it contribute to community cohesion that I can find out with a single query how much my neighbors paid for their house or (if employed by public agencies) their salaries? Indeed, some studies suggest that greater transparency leads not to greater trust in government but to resignation and apathy.
Exposing certain law enforcement data also increases the possibility of vigilantism. California law requires the registration and publication of the home addresses of known sex offenders, for instance. Or consider the controversy and online threats that erupted when, shortly after the Newtown tragedy, a newspaper in New York posted an interactive map of gun permit owners in nearby counties.
…Policymakers and officials must still mind the “big data gap.” So what does the future hold for open data? Publishing data is only one part of the information ecosystem. To be useful, tools must be developed for cleaning, sorting, analyzing and visualizing it as well. …
For-profit companies and non-profit watchdog organizations will continue to emerge and expand, building on the foundation of this data flood. Public-private partnerships such as those between San Francisco and Appallicious or Granicus, startups created by Code for America’s Incubator, and non-partisan organizations like the Sunlight Foundation and MapLight rely on public data repositories for their innovative applications and analysis.
Making public data more accessible is an important goal and offers enormous potential to increase civic engagement. To make the most effective and equitable use of this resource for the public good, cities and other government entities should invest in the personnel and equipment — hardware and software — to make it universally accessible. At the same time, Chief Data Officers (or equivalent roles) should also be alert to the often hidden challenges of equity, inclusion, privacy, and security.”

The Other Side of Open is Not Closed


Dazza Greenwood at Civics.com: “Impliedly, the opposite of “open” is “closed” but the other side of open data, open APIs and open access is usually still about enabling access but only when allowed or required. Open government also needs to include adequate methods to access and work with data and other resources that are not fully open. In fact, many (most?) high value, mission critical and societally important data access is restricted in some way. If a data-set is not fully public record then a good practice is to think of it as “protected” and to ensure access according to proper controls.
As a metaphorical illustration, you could look at an open data system like a village square or agora that is architected and intended to be broadly accessible. On the other side of the spectrum, you could see a protected data system more like a castle or garrison, that is architected to be secure from intruders but features guarded gates and controlled access points in order to function.
In fact, this same conceptual approach applies well beyond data and includes everything you could consider a resource on the Internet. In other words, any asset, service, process or other item that can exist at a URL (or URI) is a resource and can be positioned somewhere on a spectrum from openly accessible to access protected. It is easy to forget that the “R” in URL stands for “Resource” and the whole wonderful web connects to resources of every nature and description. Data – structured, raw or otherwise – is just the tip of the iceberg.
Resources on the web could be apps and other software, or large-scale enterprise network services, or just a single text file with a few lines of HTML. The concept of enabling access permission to “protected resources” on the web is the cornerstone of OAuth2 and is now being extended by the OpenID Connect standard, the User Managed Access protocol and other specifications to enable a powerful array of REST-based authorization possibilities…”

A promising phenomenon of open data: A case study of the Chicago open data project


Paper by Maxat Kassen in Government Information Quarterly: “This article presents a case study of the open data project in the Chicago area. The main purpose of the research is to explore empowering potential of an open data phenomenon at the local level as a platform useful for promotion of civic engagement projects and provide a framework for future research and hypothesis testing. Today the main challenge in realization of any e-government projects is a traditional top–down administrative mechanism of their realization itself practically without any input from members of the civil society. In this respect, the author of the article argues that the open data concept realized at the local level may provide a real platform for promotion of proactive civic engagement. By harnessing collective wisdom of the local communities, their knowledge and visions of the local challenges, governments could react and meet citizens’ needs in a more productive and cost-efficient manner. Open data-driven projects that focused on visualization of environmental issues, mapping of utility management, evaluating of political lobbying, social benefits, closing digital divide, etc. are only some examples of such perspectives. These projects are perhaps harbingers of a new political reality where interactions among citizens at the local level will play a more important role than communication between civil society and government due to the empowering potential of the open data concept.”

A Modern Approach to Open Data


at the Sunlight Foundation blog: “Last year, a group of us who work daily with open government data — Josh Tauberer of GovTrack.us, Derek Willis at The New York Times, and myself — decided to stop each building the same basic tools over and over, and start building a foundation we could share.
We set up a small home at github.com/unitedstates, and kicked it off with a couple of projects to gather data on the people and work of Congress. Using a mix of automation and curation, they gather basic information from all over the government — THOMAS.gov, the House and Senate, the Congressional Bioguide, GPO’s FDSys, and others — that everyone needs to report, analyze, or build nearly anything to do with Congress.
Once we centralized this work and started maintaining it publicly, we began getting contributions nearly immediately. People educated us on identifiers, fixed typos, and gathered new data. Chris Wilson built an impressive interactive visualization of the Senate’s budget amendments by extending our collector to find and link the text of amendments.
This is an unusual, and occasionally chaotic, model for an open data project. github.com/unitedstates is a neutral space; GitHub’s permissions system allows many of us to share the keys, so no one person or institution controls it. What this means is that while we all benefit from each other’s work, no one is dependent or “downstream” from anyone else. It’s a shared commons in the public domain.
There are a few principles that have helped make the unitedstates project something that’s worth our time:…”

White House Expands Guidance on Promoting Open Data


NextGov: “White House officials have announced expanded technical guidance to help agencies make more data accessible to the public in machine-readable formats.
Following up on President Obama’s May executive order linking the pursuit of open data to economic growth, innovation and government efficiency, two budget and science office spokesmen on Friday published a blog post highlighting new instructions and answers to frequently asked questions.
Nick Sinai, deputy chief technology officer at the Office of Science and Technology Policy, and Dominic Sale, supervisory policy analyst at the Office of Management and Budget, noted that the policy now in place means that all “newly generated government data will be required to be made available in open, machine-readable formats, greatly enhancing their accessibility and usefulness, while ensuring privacy and security.”

Announcing Project Open Data from Cloudant Labs


Yuriy Dybskiy from Cloudant: “There has been an emerging pattern over the last few years of more and more government datasets becoming available for public access. Earlier this year, the White House announced official policy on such data – Project Open Data.

Available resources

Here are four resources on the topic:

  1. Tim Berners-Lee: Open, Linked Data for a Global Community – [10 min video]
  2. Rufus Pollock: Open Data – How We Got Here and Where We’re Going – [24 min video]
  3. Open Knowledge Foundation Datasets – http://data.okfn.org/data
  4. Max Ogden: Project dat – collaborative data – [github repo]

One of the main challenges is access to the datasets. If only there were a database that had easy access to its data baked right in it.
Luckily, there is CouchDB and Cloudant, which share the same APIs to access data via HTTP. This makes for a really great option to store interesting datasets.

Cloudant Open Data

Today we are happy to announce a Cloudant Labs project – Cloudant Open Data!
Several datasets are available at the moment, for example, businesses_sf – data regarding businesses registered in San Francisco and sf_pd_incidents – a collection of incident reports (criminal and non-criminal) made by the San Francisco Police Department.
We’ll add more, but if you have one you’d like us to add faster – drop us a line at open-data@cloudant.com
Create an account and play with these datasets yourself”
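The HTTP access that the Cloudant announcement describes can be sketched as follows. The `_all_docs` endpoint with `limit` and `include_docs` is part of the CouchDB HTTP API, and `businesses_sf` is one of the dataset names mentioned above. The base URL is a placeholder and the helper names are invented for this example, since the announcement does not give the account host.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def all_docs_url(base_url, database, limit=5):
    """Build the CouchDB _all_docs URL for a database, including document bodies."""
    params = urlencode({"limit": limit, "include_docs": "true"})
    return f"{base_url.rstrip('/')}/{database}/_all_docs?{params}"


def fetch_docs(base_url, database, limit=5):
    """Fetch the first `limit` documents of a publicly readable database."""
    with urlopen(all_docs_url(base_url, database, limit)) as response:
        payload = json.load(response)
    # With include_docs=true, each row carries the full document under "doc".
    return [row["doc"] for row in payload["rows"]]


# Placeholder host; substitute the real account URL before calling fetch_docs.
print(all_docs_url("https://account.example.com", "businesses_sf"))
```

Because the same API is shared by CouchDB and Cloudant, the sketch should work against either one as long as the database allows anonymous reads.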