A Modern Approach to Open Data

At the Sunlight Foundation blog: “Last year, a group of us who work daily with open government data — Josh Tauberer of GovTrack.us, Derek Willis at The New York Times, and myself — decided to stop each building the same basic tools over and over, and start building a foundation we could share.
We set up a small home at github.com/unitedstates, and kicked it off with a couple of projects to gather data on the people and work of Congress. Using a mix of automation and curation, they gather basic information from all over the government — THOMAS.gov, the House and Senate, the Congressional Bioguide, GPO’s FDSys, and others — that everyone needs to report, analyze, or build nearly anything to do with Congress.
Once we centralized this work and started maintaining it publicly, we began getting contributions nearly immediately. People educated us on identifiers, fixed typos, and gathered new data. Chris Wilson built an impressive interactive visualization of the Senate’s budget amendments by extending our collector to find and link the text of amendments.
This is an unusual, and occasionally chaotic, model for an open data project. github.com/unitedstates is a neutral space; GitHub’s permissions system allows many of us to share the keys, so no one person or institution controls it. What this means is that while we all benefit from each other’s work, no one is dependent or “downstream” from anyone else. It’s a shared commons in the public domain.
There are a few principles that have helped make the unitedstates project something that’s worth our time:…”

White House Expands Guidance on Promoting Open Data

NextGov: “White House officials have announced expanded technical guidance to help agencies make more data accessible to the public in machine-readable formats.
Following up on President Obama’s May executive order linking the pursuit of open data to economic growth, innovation and government efficiency, two budget and science office spokesmen on Friday published a blog post highlighting new instructions and answers to frequently asked questions.
Nick Sinai, deputy chief technology officer at the Office of Science and Technology Policy, and Dominic Sale, supervisory policy analyst at the Office of Management and Budget, noted that the policy now in place means that all ‘newly generated government data will be required to be made available in open, machine-readable formats, greatly enhancing their accessibility and usefulness, while ensuring privacy and security.’”

Announcing Project Open Data from Cloudant Labs

Yuriy Dybskiy from Cloudant: “There has been an emerging pattern over the last few years of more and more government datasets becoming available for public access. Earlier this year, the White House announced official policy on such data – Project Open Data.

Available resources

Here are four resources on the topic:

  1. Tim Berners-Lee: Open, Linked Data for a Global Community – [10 min video]
  2. Rufus Pollock: Open Data – How We Got Here and Where We’re Going – [24 min video]
  3. Open Knowledge Foundation Datasets – http://data.okfn.org/data
  4. Max Ogden: Project dat – collaborative data – [github repo]

One of the main challenges is access to the datasets. If only there were a database that had easy access to its data baked right in it.
Luckily, there is CouchDB and Cloudant, which share the same APIs to access data via HTTP. This makes for a really great option to store interesting datasets.

Cloudant Open Data

Today we are happy to announce a Cloudant Labs project – Cloudant Open Data!
Several datasets are available at the moment, for example, businesses_sf – data regarding businesses registered in San Francisco and sf_pd_incidents – a collection of incident reports (criminal and non-criminal) made by the San Francisco Police Department.
We’ll add more, but if you have one you’d like us to add faster – drop us a line at open-data@cloudant.com
Create an account and play with these datasets yourself”
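
The excerpt above notes that CouchDB and Cloudant expose data over the same HTTP API. As a rough sketch of what that access might look like, the Python below builds a CouchDB-style `_all_docs` request URL for one of the named datasets and pulls the documents out of a JSON response. The host `example.cloudant.com` is a placeholder (the excerpt does not give the real account host), the helper names are hypothetical, and the fields in the sample response are invented for illustration.

```python
import json
from urllib.parse import quote, urlencode

# Placeholder host: an assumption, not the real Cloudant Open Data account.
BASE_URL = "https://example.cloudant.com"


def all_docs_url(base_url, db, limit=5, include_docs=True):
    """Build a CouchDB-style _all_docs URL for the given database."""
    query = urlencode({
        "limit": limit,
        # CouchDB expects JSON booleans (true/false) in query parameters.
        "include_docs": json.dumps(include_docs),
    })
    return f"{base_url.rstrip('/')}/{quote(db, safe='')}/_all_docs?{query}"


def extract_docs(response_body):
    """Return the documents contained in an _all_docs JSON response."""
    payload = json.loads(response_body)
    return [row["doc"] for row in payload.get("rows", []) if "doc" in row]


# Dataset name taken from the excerpt; the response below is made up.
url = all_docs_url(BASE_URL, "sf_pd_incidents")
sample_response = json.dumps({
    "total_rows": 1,
    "offset": 0,
    "rows": [
        {"id": "a1", "key": "a1", "value": {"rev": "1-abc"},
         "doc": {"_id": "a1", "category": "example"}},
    ],
})
print(url)
print(extract_docs(sample_response))
```

With a real account host, the URL could be fetched with `urllib.request.urlopen` and the response body passed to `extract_docs`.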

Guidelines for Open Data Policies

“The Sunlight Foundation created this living document to present a broad vision of the kinds of challenges that open data policies can actively address.
A few general notes: Although some provisions may carry more importance or heft than others, these Guidelines are not ranked in order of priority, but organized to help define What Data Should be Public, How to Make Data Public, and How to Implement Policy — three key elements of any legislation, executive order, or other policy seeking to include language about open data. Further, it’s worth repeating that these provisions are only a guide. As such, they do not address every question one should consider in preparing a policy. Instead, these provisions attempt to answer the specific question: What can or should an open data policy do?”

Data is Inert — It’s What You Do With It That Counts

Kevin Merritt, CEO and Founder, Socrata, in NextGov: “In its infancy, the open data movement was mostly about offering catalogs of government data online that concerned citizens and civic activists could download. But now, a wide variety of external stakeholders are using open data to deliver new applications and services. At the same time, governments themselves are harnessing open data to drive better decision-making.
In a relatively short period of time, open data has evolved from serving as fodder for data publishing to fuel for open innovation.
One of the keys to making this transformation truly work, however, is our ability to re-instrument or re-tool underlying business systems and processes so managers can receive open data in consumable forms on a regular, continuous basis in real time….”

Smart Government and Big, Open Data: The Trickle-Up Effect

Anthony Townsend at the Future Now Blog: “As we grow numb to the daily headlines decrying the unimaginable scope of data being collected from Internet companies by the National Security Agency’s Prism program, it’s worth remembering that governments themselves also produce mountains of data. Tabulations of the most recent U.S. census, conducted in 2010, involved billions of data points and trillions of calculations. Not surprisingly, it is probably safe to assume that the federal government is also the world’s largest spender on database software—its tab with just one company, market-leader Oracle, passed $700 million in 2012 alone. Government data isn’t just big in scope. It is deep in history—governments have been accumulating data for centuries. In 2006, the genealogical research site Ancestry.com imported 600 terabytes of data (about what Facebook collects in a single day!) from the first fifteen U.S. censuses (1790 to 1930).

But the vast majority of data collected by governments never sees the light of day. It sits squirreled away on servers, and is only rarely cross-referenced in ways that private sector companies do all the time to gain insights into what’s actually going on across the country, and emerging problems and opportunities. Yet as governments all around the world have realized, if shared safely with due precautions to protect individual privacy, in the hands of citizens all of this data could be a national civic monument of tremendous economic and social value.”

Why the world’s governments are interested in creating hubs for open data

In Gigaom: “Amid the tech giants and eager startups that have camped out in East London’s trendy Shoreditch neighborhood, the Open Data Institute is the rare nonprofit on the block that talks about feel-good sorts of things like “triple-bottom line” and “social and environmental value.” …Governments everywhere are embracing the idea that open data is the right way to manage services for citizens. The U.K. has been a leader on this — just check out the simplicity of gov.uk — which is one of the reasons why ODI is U.K.-born…. “Open data” is open access to the data that has exploded on the scene in recent years, some of it due to the rise of our connected, digital lifestyles from the internet, sensors, GPS, and cell phones, just to name a few resources. But ODI is particularly interested in working with data sets that can have big global and societal impacts, like health, financial, environmental and government data. For example, in conjunction with startup OpenCorporates, ODI recently helped launch a data visualization about Goldman Sachs’s insanely complex corporate structure.”

A Videogame That Recruits Players to Map the Brain

Wired: “I’m no neuroscientist, and yet, here I am at my computer attempting to reconstruct a neural circuit of a mouse’s retina. It’s not quite as difficult and definitely not as boring as it sounds. In fact, it’s actually pretty fun, which is a good thing considering I’m playing a videogame.
Called EyeWire, the browser-based game asks players to map the connections between retinal neurons by coloring in 3-D slices of the brain. Much like any other game out there, being good at EyeWire earns you points, but the difference is that the data you produce during gameplay doesn’t just get you on a leader board—it’s actually used by scientists to build a better picture of the human brain.
Created by neuroscientist Sebastian Seung’s lab at MIT, EyeWire basically gamifies the professional research Seung and his collaborators do on a daily basis. Seung is studying the connectome, the hyper-complex tangle of connections among neurons in the brain.”

New Report Finds Cost-Benefit Analyses Improve Budget Choices & Taxpayer Results

Press Release: “A new report shows cost-benefit analyses have helped states make better investments of public dollars by identifying programs and policies that deliver high returns. However, the majority of states are not yet consistently using this approach when making critical decisions. This 50-state look at cost-benefit analysis, a method that compares the expense of public programs to the returns they deliver, was released today by the Pew-MacArthur Results First Initiative, a project of The Pew Charitable Trusts and the John D. and Catherine T. MacArthur Foundation.

The study, “States’ Use of Cost-benefit Analysis: Improving Results for Taxpayers”, comes at a time when states are under continuing pressure to direct limited dollars toward the most cost-effective programs and policies while curbing spending on those that do not deliver. The report is the first comprehensive study of how all 50 states and the District of Columbia analyze the costs and benefits of programs and policies, report findings, and incorporate the assessments into decision-making. It identifies key challenges states face in conducting and using the analyses and offers strategies to overcome those obstacles. The study includes a review of state statutes, a search for cost-benefit analyses released between 2008 and 2011, and interviews with legislators, legislative and program evaluation staff, executive officials, report authors, and agency officials.”

The Recent Rise of Government Open Data APIs

Janet Wagner in ProgrammableWeb: “In recent months, the number of government open data APIs has been increasing rapidly due to a variety of factors including the development of open data technology platforms, the launch of Project Open Data and a recent White House executive order regarding government data.
ProgrammableWeb writer Mark Boyd has recently written three articles related to open data APIs: one about the latest release of the CKAN API, one about the UK Open Data Institute, and one about the CivOmega Open Data Search Engine. This post is a brief overview of several recent factors that have led to the rise of government open data APIs.”