Guidelines for Open Data Policies


“The Sunlight Foundation created this living document to present a broad vision of the kinds of challenges that open data policies can actively address.
A few general notes: Although some provisions may carry more importance or heft than others, these Guidelines are not ranked in order of priority, but organized to help define What Data Should be Public, How to Make Data Public, and How to Implement Policy — three key elements of any legislation, executive order, or other policy seeking to include language about open data. Further, it’s worth repeating that these provisions are only a guide. As such, they do not address every question one should consider in preparing a policy. Instead, these provisions attempt to answer the specific question: What can or should an open data policy do?”

Data is Inert — It’s What You Do With It That Counts


Kevin Merritt, CEO and Founder, Socrata, in NextGov: “In its infancy, the open data movement was mostly about offering catalogs of government data online that concerned citizens and civic activists could download. But now, a wide variety of external stakeholders are using open data to deliver new applications and services. At the same time, governments themselves are harnessing open data to drive better decision-making.
In a relatively short period of time, open data has evolved from serving as fodder for data publishing to fuel for open innovation.
One of the keys to making this transformation truly work, however, is our ability to re-instrument or re-tool underlying business systems and processes so managers can receive open data in consumable forms on a regular, continuous basis in real-time….”

Smart Government and Big, Open Data: The Trickle-Up Effect


Anthony Townsend at the Future Now Blog: “As we grow numb to the daily headlines decrying the unimaginable scope of data being collected from Internet companies by the National Security Agency’s Prism program, it’s worth remembering that governments themselves produce mountains of data too. Tabulations of the most recent U.S. census, conducted in 2010, involved billions of data points and trillions of calculations. It is probably safe to assume that the federal government is also the world’s largest spender on database software—its tab with just one company, market leader Oracle, passed $700 million in 2012 alone. Government data isn’t just big in scope. It is deep in history—governments have been accumulating data for centuries. In 2006, the genealogical research site Ancestry.com imported 600 terabytes of data (about what Facebook collects in a single day!) from the first fifteen U.S. censuses (1790 to 1930).

But the vast majority of data collected by governments never sees the light of day. It sits squirreled away on servers and is only rarely cross-referenced in the ways that private-sector companies routinely cross-reference their own data to gain insight into what’s actually happening across the country and into emerging problems and opportunities. Yet, as governments around the world have realized, if shared safely, with due precautions to protect individual privacy, all of this data could become, in the hands of citizens, a national civic monument of tremendous economic and social value.”

Why the world’s governments are interested in creating hubs for open data


In Gigaom: “Amid the tech giants and eager startups that have camped out in East London’s trendy Shoreditch neighborhood, the Open Data Institute is the rare nonprofit on the block that talks about feel-good sorts of things like “triple-bottom line” and “social and environmental value.” …Governments everywhere are embracing the idea that open data is the right way to manage services for citizens. The U.K. has been a leader on this — just check out the simplicity of gov.uk — which is one of the reasons why ODI is U.K. born…. “Open data” means open access to data, which has exploded in recent years with the rise of our connected, digital lifestyles, flowing from the internet, sensors, GPS, and cell phones, to name a few sources. But ODI is particularly interested in working with data sets that can have big global and societal impacts, like health, financial, environmental and government data. For example, in conjunction with startup OpenCorporates, ODI recently helped launch a data visualization of Goldman Sachs’s insanely complex corporate structure.”

A Videogame That Recruits Players to Map the Brain


Wired: “I’m no neuroscientist, and yet, here I am at my computer attempting to reconstruct a neural circuit of a mouse’s retina. It’s not quite as difficult and definitely not as boring as it sounds. In fact, it’s actually pretty fun, which is a good thing considering I’m playing a videogame.
Called EyeWire, the browser-based game asks players to map the connections between retinal neurons by coloring in 3-D slices of the brain. Much like in any other game out there, being good at EyeWire earns you points, but the difference is that the data you produce during gameplay doesn’t just get you on a leaderboard—it’s actually used by scientists to build a better picture of the brain.
Created by neuroscientist Sebastian Seung’s lab at MIT, EyeWire basically gamifies the professional research Seung and his collaborators do on a daily basis. Seung is studying the connectome, the hyper-complex tangle of connections among neurons in the brain.”

New Report Finds Cost-Benefit Analyses Improve Budget Choices & Taxpayer Results


Press Release: “A new report shows cost-benefit analyses have helped states make better investments of public dollars by identifying programs and policies that deliver high returns. However, the majority of states are not yet consistently using this approach when making critical decisions. This 50-state look at cost-benefit analysis, a method that compares the expense of public programs to the returns they deliver, was released today by the Pew-MacArthur Results First Initiative, a project of The Pew Charitable Trusts and the John D. and Catherine T. MacArthur Foundation.

The study, “States’ Use of Cost-Benefit Analysis: Improving Results for Taxpayers,” comes at a time when states are under continuing pressure to direct limited dollars toward the most cost-effective programs and policies while curbing spending on those that do not deliver. The report is the first comprehensive study of how all 50 states and the District of Columbia analyze the costs and benefits of programs and policies, report findings, and incorporate the assessments into decision-making. It identifies key challenges states face in conducting and using the analyses and offers strategies to overcome those obstacles. The study includes a review of state statutes, a search for cost-benefit analyses released between 2008 and 2011, and interviews with legislators, legislative and program evaluation staff, executive officials, report authors, and agency officials.”
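
To make the method concrete, the sketch below discounts a hypothetical program’s projected costs and benefits to present value and compares them. All figures, the three-year horizon, and the 3% discount rate are illustrative assumptions, not numbers from the Pew-MacArthur report.

```python
# Toy cost-benefit analysis: discount projected annual costs and benefits
# to present value, then compare them. Every number here is made up.
def present_value(cash_flows, rate):
    """Discount a list of year-end cash flows at the given annual rate."""
    return sum(v / (1 + rate) ** t for t, v in enumerate(cash_flows, start=1))

costs = [100_000, 20_000, 20_000]   # hypothetical program spending, years 1-3
benefits = [0, 90_000, 110_000]     # hypothetical returns, e.g. avoided costs

pv_costs = present_value(costs, 0.03)
pv_benefits = present_value(benefits, 0.03)
print(f"Benefit-cost ratio: {pv_benefits / pv_costs:.2f}")  # > 1 means net positive
```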

The Recent Rise of Government Open Data APIs


Janet Wagner in ProgrammableWeb: “In recent months, the number of government open data APIs has been increasing rapidly due to a variety of factors including the development of open data technology platforms, the launch of Project Open Data and a recent White House executive order regarding government data.
ProgrammableWeb writer Mark Boyd has recently written three articles related to open data APIs: one about the latest release of the CKAN API, one about the UK Open Data Institute and one about the CivOmega Open Data Search Engine. This post is a brief overview of several recent factors that have led to the rise of government open data APIs.”
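
To give a feel for what these government open data APIs look like in practice, here is a minimal sketch against CKAN’s Action API, the interface mentioned above. The portal URL and search term are placeholders; any CKAN-backed catalog exposes the same package_search action.

```python
# Minimal sketch: query a CKAN catalog's Action API for datasets.
# The portal URL and search term are placeholders; any CKAN-backed
# portal exposes the same package_search action.
import json
from urllib.parse import urlencode
from urllib.request import urlopen

CKAN_BASE = "https://demo.ckan.org/api/3/action"  # placeholder portal

def search_datasets(query, rows=5):
    url = CKAN_BASE + "/package_search?" + urlencode({"q": query, "rows": rows})
    with urlopen(url) as resp:
        payload = json.load(resp)
    if not payload.get("success"):
        raise RuntimeError("CKAN Action API call failed")
    return payload["result"]["results"]

for dataset in search_datasets("transport"):
    # Each result is a dataset ("package") carrying downloadable resources.
    print(dataset["title"], "-", len(dataset.get("resources", [])), "resources")
```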


The Power of Hackathons


Woodrow Wilson International Center for Scholars: “The Commons Lab of the Science and Technology Innovation Program is proud to announce the release of The Power of Hackathons: A Roadmap for Sustainable Open Innovation. Hackathons are collaborative events that have long been part of programmer culture, where people gather in person, online or both to work together on a problem. This could involve creating an application, improving an existing one or testing a platform.
In recent years, government agencies at multiple levels have started holding hackathon events of their own. For this brief, author Zachary Bastian interviewed agency staff, hackathon planners and hackathon participants to better understand how these events can be structured. The fundamental lesson was that a hackathon is not a panacea, but instead should be part of a broader strategy centered on open data and innovation.
The full brief can be found here”

Sitegeist


“Sitegeist is a mobile application that helps you to learn more about your surroundings in seconds. Drawing on publicly available information, the app presents solid data in a simple at-a-glance format to help you tap into the pulse of your location. From demographics about people and housing to the latest popular spots or weather, Sitegeist presents localized information visually so you can get back to enjoying the neighborhood. The application draws on free APIs such as the U.S. Census, Yelp! and others to showcase what’s possible with access to data. Sitegeist was created by the Sunlight Foundation in consultation with design firm IDEO and with support from the John S. and James L. Knight Foundation. It is the third in a series of National Data Apps.”
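
The pattern behind an app like Sitegeist is simple: take a location, call a public API, and render the answer. Below is a rough sketch of the Census half of that pattern; the endpoint and variable code (P001001, total population in the 2010 SF1 dataset) reflect the public Census API as I understand it and should be checked against api.census.gov, which also issues free keys for heavier use.

```python
# Rough sketch of the kind of lookup an app like Sitegeist performs:
# fetch a demographic figure from the public U.S. Census API.
# Endpoint and variable code should be verified against api.census.gov.
import json
from urllib.request import urlopen

def state_population(state_fips="06"):  # "06" is California's FIPS code
    url = ("https://api.census.gov/data/2010/dec/sf1"
           f"?get=P001001,NAME&for=state:{state_fips}")
    with urlopen(url) as resp:
        header, row = json.load(resp)  # first row holds column names
    return dict(zip(header, row))

print(state_population())  # e.g. {'P001001': '37253956', 'NAME': 'California', ...}
```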

9 models to scale open data – past, present and future


Open Knowledge Foundation Blog: “The possibilities of open data have been enthralling us for 10 years… But that excitement isn’t what matters in the end. What matters is scale – which organisational structures will make this movement explode? This post quickly and provocatively goes through some that haven’t worked (yet!) and some that have.
Ones that are working now
1) Form a community to enter new data. Open Street Map and MusicBrainz are two big examples. It works because the community is the originator of the data. That said, neither has dominated its industry as much as I thought it would have by now.
2) Sell tools to an upstream generator of open data. This is what CKAN does for central governments (and what the new ScraperWiki CKAN tool helps with). It’s what mySociety does when it sells FixMyStreet installs to local councils, thereby publishing their potholes as RSS feeds (see the sketch after this list).
3) Use open data (quietly). Every organisation does this and never talks about it. It’s key to quite old data resellers like Bloomberg, and it is what most of ScraperWiki’s professional services customers ask us to do. The value to society is enormous and invisible. The big flaw is that it doesn’t help scale the supply of open data.
4) Sell tools to downstream users. This isn’t necessarily specific to open data – existing software like spreadsheets and Business Intelligence tools can be used with open or closed data. Lots of open data is on the web, so tools like the new ScraperWiki, which work well with web data, are particularly suited to it.
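
Picking up the forward reference in model 2: once a FixMyStreet install publishes problem reports as RSS, any downstream consumer can poll the feed with a few lines of standard-library code. The feed URL below is hypothetical; real installs expose their own per-area feeds.

```python
# Minimal consumer for an open-data RSS feed of the kind a FixMyStreet
# install publishes for a council. The feed URL is hypothetical.
import xml.etree.ElementTree as ET
from urllib.request import urlopen

FEED_URL = "https://fixmystreet.example.org/rss/area/1234"  # placeholder

with urlopen(FEED_URL) as resp:
    tree = ET.parse(resp)

for item in tree.iterfind(".//item"):  # one <item> per reported problem
    title = item.findtext("title", default="(untitled)")
    link = item.findtext("link", default="")
    print(title, "->", link)
```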
Ones that haven’t worked
5) Collaborative curation. ScraperWiki started as an audacious attempt to create an open data curation community, based on editing scraping code in a wiki. In its original form (now called ScraperWiki Classic) this didn’t scale. …With a few exceptions, notably OpenCorporates, there aren’t yet open data curation projects.
6) General-purpose data marketplaces, particularly ones that mainly reuse open data, haven’t taken off. They might one day, but I think they first need well-adopted higher-level standards for data formatting and syncing (perhaps something like dat, perhaps something based on CSV files).
Ones I expect more of in the future
These are quite exciting models which I expect to see a lot more of.
7) Give labour/money to upstream to help them create better data. This is quite new. The only (and most excellent) example of it is the UK’s National Archives curating the Statute Law Database. They do the work with the help of staff seconded from commercial legal publishers and other parts of government.
It’s clever because it channels money to the upstream source, which people trust the most and which has the most ability to improve data quality.
8) Viral open data licensing. MySQL made lots of money this way, offering proprietary dual licenses of GPL’d software to embedded-systems makers. In data this could use OKFN’s Open Database License, and organisations would pay when they wanted to mix the open data with their own closed data. I don’t know of anyone actively using it, although Chris Taggart from OpenCorporates mentioned this model to me years ago.
9) Corporations release data for strategic advantage. Companies are starting to release their own data for strategic gain. This is very new. Expect more of it.”