Economic effects of open data policy still 'anecdotal'


Adam Mazmanian in FCW: “A year after the launch of the government’s digital strategy, there’s no official tally of the economic activity generated by the release of government datasets for use in commercial applications.
“We have anecdotal examples, but nothing official yet,” said federal CIO Steven VanRoekel in an invitation-only meeting with reporters at the FOSE conference on May 15. “It’s an area where we have an opportunity to start to talk about this, because it’s starting to tick up a bit, and the numbers are looking pretty good.” (Related story: APIs help agencies say yes)…
The Obama administration is banking on an explosion in the use of federal datasets for commercial and government applications alike. Last week’s executive order and accompanying directive from the Office of Management and Budget task agencies with making open and machine-readable data the new default setting for government information.
VanRoekel said that the merits of the open data standard don’t necessarily need to be justified by economic activity….
The executive order also spells out privacy concerns arising from the so-called “mosaic effect,” by which information from disparate datasets can be overlaid to decipher personally identifiable information.”

Wikipedia Recent Changes Map



The Verge: “By watching a new visualization, known plainly as the Wikipedia Recent Changes Map, viewers can see the location of every unregistered Wikipedia user who makes a change to the open encyclopedia. It provides a voyeuristic look at the rate that knowledge is contributed to the website, giving you the faintest impression of the Spaniard interested in the television show Jackass or the Brazilian who defaced the page on the Jersey Devil to feature a photograph of the new pope. Though the visualization moves quickly, it’s only displaying about one-fifth of the edits being made: Wikipedia doesn’t reveal location data for registered users, and unregistered users make up just 15 to 20 percent of all contribution, according to studies of the website.”
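The map works because unregistered editors are identified on Wikipedia by their IP address, which can be roughly geolocated; registered accounts expose no such location. Below is a minimal sketch of that idea in Python, using Wikimedia's public EventStreams recent-changes feed rather than whichever live edit feed the map itself consumed in 2013, and stopping at spotting anonymous edits: plotting them would additionally require an IP-to-coordinates lookup such as a GeoIP database.

```python
import ipaddress
import json

import requests

# Wikimedia's public server-sent-events feed of recent changes across all wikis.
STREAM_URL = "https://stream.wikimedia.org/v2/stream/recentchange"


def is_anonymous(user: str) -> bool:
    """Unregistered editors appear with their IP address as the username."""
    try:
        ipaddress.ip_address(user)
        return True
    except ValueError:
        return False


def watch_anonymous_edits():
    # The feed is a long-lived SSE response; each event's JSON payload sits on a "data:" line.
    with requests.get(STREAM_URL, stream=True, timeout=60) as resp:
        for line in resp.iter_lines():
            if not line or not line.startswith(b"data:"):
                continue
            change = json.loads(line[len(b"data:"):])
            user = change.get("user", "")
            if change.get("type") == "edit" and is_anonymous(user):
                # A real map would now geolocate `user` (an IP) and plot the point.
                print(f"{change['wiki']}: {user} edited \"{change['title']}\"")


if __name__ == "__main__":
    watch_anonymous_edits()
```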

OpenData Latinoamérica


Mariano Blejman and Miguel Paz @ IJNet Blog: “We need a central repository where you can share the data that you have proved to be reliable. Our answer to this need: OpenData Latinoamérica, which we are leading as ICFJ Knight International Journalism Fellows.
Inspired by the open data portal created by ICFJ Knight International Journalism Fellow Justin Arenstein in Africa, OpenData Latinoamérica aims to improve the use of data in this region where data sets too often fail to show up where they should, and when they do, are scattered about the web at governmental repositories and multiple independent repositories where the data is removed too quickly.

The portal will be used at two big upcoming events: Bolivia’s first DataBootCamp and the Conferencia de Datos Abiertos (Open Data Conference) in Montevideo, Uruguay. Then, we’ll hold a series of hackathons and scrape-athons in Chile, which is in a period of presidential elections in which citizens increasingly demand greater transparency. Releasing data and developing applications for accountability will be the key.”

Challenge: Visualizing Online Takedown Requests


visualizing.org: “The free flow of information defines the Internet. Innovations like Wikipedia and crowdsourcing owe their existence to and are powered by the resulting streams of knowledge and ideas. Indeed, more information means more choice, more freedom, and ultimately more power for the individual and society. But — citing reasons like defamation, national security, and copyright infringement — governments, corporations, and other organizations at times may regulate and restrict information online. By blocking or filtering sites, issuing court orders limiting access to information, enacting legislation or pressuring technology and communication companies, governments and other organizations aim to censor one of the most important means of free expression in the world. What does this mean and to what extent should attempts to censor online content be permitted?…
We challenge you to visualize the removal requests in Google’s Transparency Report. What in this data should be communicated to the general public? Are there any trends or patterns in types of requests that have been complied with? Have legal and policy environments shaped what information is available and/or restricted in different countries? The data set on government requests (~1 thousand rows) provides summaries broken down by country, Google product, and reason. The data set on copyright requests, however, is much larger (~1 million rows) and includes each individual request. Use one or both data sets, by themselves or with other open data sets. We’re excited to partner with Google for this challenge, and we’re offering $5,000 in prizes.”


Deadline: Thursday, June 27, 2013, 11:59 pm EDT
Winner Announced: Thursday, July 11, 2013

More at http://visualizing.org/contests/visualizing-online-takedown-requests
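As a starting point on the smaller dataset, here is a hedged sketch in Python. It assumes the government-requests data has been exported as a CSV with columns along the lines of country, product, reason, and a requests count; these column names are illustrative, not Google's exact schema.

```python
import pandas as pd

# Hypothetical export of the ~1,000-row government-requests dataset;
# column names here are illustrative, not Google's exact schema.
df = pd.read_csv("government_removal_requests.csv")

# Which countries file the most removal requests?
by_country = (
    df.groupby("country")["requests"]
      .sum()
      .sort_values(ascending=False)
)
print(by_country.head(10))

# Break the top filers down by the reason cited (defamation, national
# security, copyright, and so on) to look for the patterns the challenge
# asks about.
top = by_country.head(10).index
reasons = (
    df[df["country"].isin(top)]
      .pivot_table(index="country", columns="reason",
                   values="requests", aggfunc="sum", fill_value=0)
)
print(reasons)
```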

New Open Data Executive Order and Policy


The White House: “The Obama Administration today took groundbreaking new steps to make information generated and stored by the Federal Government more open and accessible to innovators and the public, to fuel entrepreneurship and economic growth while increasing government transparency and efficiency.
Today’s actions—including an Executive Order signed by the President and an Open Data Policy released by the Office of Management and Budget and the Office of Science and Technology Policy—declare that information is a valuable national asset whose value is multiplied when it is made easily accessible to the public.  The Executive Order requires that, going forward, data generated by the government be made available in open, machine-readable formats, while appropriately safeguarding privacy, confidentiality, and security.
The move will make troves of previously inaccessible or unmanageable data easily available to entrepreneurs, researchers, and others who can use those files to generate new products and services, build businesses, and create jobs….
Along with the Executive Order and Open Data Policy, the Administration announced a series of complementary actions:
• A new Data.gov.  In the months ahead, Data.gov, the powerful central hub for open government data, will launch new services that include improved visualization, mapping tools, better context to help locate and understand these data, and robust Application Programming Interface (API) access for developers.
• New open source tools to make data more open and accessible.  The US Chief Information Officer and the US Chief Technology Officer are releasing free, open source tools on Github, a site that allows communities of developers to collaboratively develop solutions.  This effort, known as Project Open Data, can accelerate the adoption of open data practices by providing plug-and-play tools and best practices to help agencies improve the management and release of open data.  For example, one tool released today automatically converts simple spreadsheets and databases into APIs for easier consumption by developers.  Anyone, from government agencies to private citizens to local governments and for-profit companies, can freely use and adapt these tools starting immediately. [A minimal sketch of this spreadsheet-to-API idea follows the quoted list below.]
• Building a 21st century digital government.  As part of the Administration’s Digital Government Strategy and Open Data Initiatives in health, energy, education, public safety, finance, and global development, agencies have been working to unlock data from the vaults of government, while continuing to protect privacy and national security.  Newly available or improved data sets from these initiatives will be released today and over the coming weeks as part of the one year anniversary of the Digital Government Strategy.
• Continued engagement with entrepreneurs and innovators to leverage government data.  The Administration has convened and will continue to bring together companies, organizations, and civil society for a variety of summits to highlight how these innovators use open data to positively impact the public and address important national challenges.  In June, Federal agencies will participate in the fourth annual Health Datapalooza, hosted by the nonprofit Health Data Consortium, which will bring together more than 1,800 entrepreneurs, innovators, clinicians, patient advocates, and policymakers for information sessions, presentations, and “code-a-thons” focused on how the power of data can be harnessed to help save lives and improve healthcare for all Americans.
For more information on open data highlights across government visit: http://www.whitehouse.gov/administration/eop/ostp/library/docsreports”
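The spreadsheets-to-APIs tool mentioned in the list above lives in Project Open Data on GitHub; the sketch below is not that tool, just a minimal Python illustration of the idea it describes: read a flat CSV and serve its rows as a read-only JSON API, optionally filtered by column values passed as query parameters. The file name and filtering behavior are assumptions for the example.

```python
import csv
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

CSV_FILE = "dataset.csv"  # any simple spreadsheet exported as CSV


def load_rows():
    with open(CSV_FILE, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


class CsvApiHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # e.g. GET /?state=CA&year=2013 returns only the matching rows as JSON.
        query = parse_qs(urlparse(self.path).query)
        rows = load_rows()
        for column, values in query.items():
            rows = [r for r in rows if r.get(column) in values]
        body = json.dumps(rows).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    HTTPServer(("", 8000), CsvApiHandler).serve_forever()
```

Run it next to a dataset.csv and a request such as http://localhost:8000/?state=CA returns only the matching rows as JSON.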

The Uncertain Relationship Between Open Data and Accountability


Tiago Peixoto’s Response to Yu and Robinson’s paper on “The New Ambiguity of ‘Open Government’”: “By looking at the nature of data that may be disclosed by governments, Harlan Yu and David Robinson provide an analytical framework that evinces the ambiguities underlying the term “open government data.” While agreeing with their core analysis, I contend that the authors ignore the enabling conditions under which transparency may lead to accountability, notably the publicity and political agency conditions. I argue that the authors also overlook the role of participatory mechanisms as an essential element in unlocking the potential for open data to produce better government decisions and policies. Finally, I conduct an empirical analysis of the publicity and political agency conditions in countries that have launched open data efforts, highlighting the challenges associated with open data as a path to accountability.”

Civilized Discourse Construction Kit


Jeff Atwood at “Coding Horror”: “Forum software? Maybe. Let’s see, it’s 2013, has forum software advanced at all in the last ten years? I’m thinking no.
Forums are the dark matter of the web, the B-movies of the Internet. But they matter. To this day I regularly get excellent search results on forum pages for stuff I’m interested in. Rarely a day goes by that I don’t end up on some forum, somewhere, looking for some obscure bit of information. And more often than not, I find it there….

At Stack Exchange, one of the tricky things we learned about Q&A is that if your goal is to have an excellent signal to noise ratio, you must suppress discussion. Stack Exchange only supports the absolute minimum amount of discussion necessary to produce great questions and great answers. That’s why answers get constantly re-ordered by votes, that’s why comments have limited formatting and length and only a few display, and so forth….

Today we announce the launch of Discourse, a next-generation, 100% open source discussion platform built for the next decade of the Internet.

Discourse-logo-big

The goal of the company we formed, Civilized Discourse Construction Kit, Inc., is exactly that – to raise the standard of civilized discourse on the Internet through seeding it with better discussion software:

  • 100% open source and free to the world, now and forever.
  • Feels great to use. It’s fun.
  • Designed for hi-resolution tablets and advanced web browsers.
  • Built-in moderation and governance systems that let discussion communities protect themselves from trolls, spammers, and bad actors – even without official moderators.”

6 Things You May Not Know About Open Data


GovTech: “On Friday, May 3, Palo Alto, Calif., CIO Jonathan Reichental …said that when it comes to making data more open, “The invisible becomes visible,” and he outlined six major points that identify and define what open data really is:

1.  It’s the liberation of peoples’ data

The public sector collects data that pertains to government, such as employee salaries, trees or street information, and government entities are therefore responsible for liberating that data so the constituent can view it in an accessible format. Though this practice has become more commonplace in recent years, Reichental said government should have been doing this all along.

2.  Data has to be consumable by a machine

Copying data from a spreadsheet onto a website, or locking it inside a PDF, isn’t the easiest way to make it retrievable. To make data more open, it needs to be in a machine-readable format so users don’t have to go through the additional trouble of finding or extracting it. [A minimal illustration follows the quoted list below.]

3.  Data has a derivative value

When data is made available to the public, people like app developers, architects or others are able to analyze the data. In some cases, data can be used in city planning to understand what’s happening at the city scale.

4.  It eliminates the middleman

For many states, public records laws require them to provide data when a public records request is made. But oftentimes, complying with such request regulations involves long and cumbersome processes. Lawyers and other government officials must process paperwork, and it can take weeks to complete a request. By having data readily available, these processes can be eliminated, thus also eliminating the middleman responsible for processing the requests. Direct access to the data saves time and resources.

5.  Data creates deeper accountability

Since government is expected to provide accessible data, it is therefore being watched, making it more accountable for its actions — everything from emails, salaries and city council minutes can be viewed by the public.

6.  Open Data builds trust

When the community can see what’s going on in its government through access to data, Reichental said, individuals begin to build more trust in their government and feel less like the government is hiding information.”
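Returning to point 2 above: the practical difference a machine-readable release makes is that it can be consumed in a few lines of code, where a table trapped in a PDF cannot. A minimal illustration in Python, assuming a hypothetical street-tree CSV of the kind Reichental alludes to (file and column names are invented for the example):

```python
import csv

# Hypothetical city export: one row per city-maintained street tree.
with open("street_trees.csv", newline="", encoding="utf-8") as f:
    trees = list(csv.DictReader(f))

# Because the format is machine-readable, simple questions are one-liners.
oaks = [t for t in trees if t.get("species", "").lower() == "oak"]
print(f"{len(oaks)} of {len(trees)} street trees are oaks")
```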

Linking open data to augmented intelligence and the economy


The Open Data Institute’s Professor Nigel Shadbolt (@Nigel_Shadbolt), interviewed by (@digiphile): “…there are some clear learnings. One that I’ve been banging on about recently has been that yes, it really does matter to turn the dial so that governments have a presumption to publish non-personal public data. If you would publish it anyway, under a Freedom of Information request or whatever your local legislative equivalent is, why aren’t you publishing it anyway as open data? That, as a behavioral change, is a big one for many administrations where either the existing workflow or culture is, “Okay, we collect it. We sit on it. We do some analysis on it, and we might give it away piecemeal if people ask for it.” We should construct the publication process from the outset to presume to publish openly. That’s still something that we are two or three years away from, working hard with the public sector to work out how to do it and how to do it properly.
We’ve also learned that in many jurisdictions, the amount of [open data] expertise within administrations and within departments is slight. There just isn’t really the skillset, in many cases, for people to know what it is to publish using technology platforms. So there’s a capability-building piece, too.
One of the most important things is it’s not enough to just put lots and lots of datasets out there. It would be great if the “presumption to publish” meant they were all out there anyway — but when you haven’t got any datasets out there and you’re thinking about where to start, the tough question is to say, “How can I publish data that matters to people?”
The data that matters is revealed when we look at the download stats on these various UK, US and other [open data] sites: there’s a very, very distinctive parallel curve. Some datasets are very, very heavily utilized; you suspect they have high utility to many, many people. Many of the others, if they can be found at all, aren’t being used particularly much. That’s not to say that, under that long tail, there aren’t large amounts of use. A particularly arcane open dataset may have exquisite use to a small number of people.
The real truth is that it’s easy to republish your national statistics. It’s much harder to do a serious job on publishing your spending data in detail, publishing police and crime data, publishing educational data, publishing actual overall health performance indicators. These are tough datasets to release. As people are fond of saying, it holds politicians’ feet to the fire. It’s easy to build a site that’s full of stuff — but does the stuff actually matter? And does it have any economic utility?”

An API for "We the People"


The White House Blog: “We can’t talk about We the People without getting into the numbers — more than 8 million users, more than 200,000 petitions, more than 13 million signatures. The sheer volume of participation is, to us, a sign of success.
And there’s a lot we can learn from a set of data that rich and complex, but we shouldn’t be the only people drawing from its lessons.
So starting today, we’re making it easier for anyone to do their own analysis or build their own apps on top of the We the People platform. We’re introducing the first version of our API, and we’re inviting you to use it.
Get started here: petitions.whitehouse.gov/developers
This API provides read-only access to data on all petitions that passed the 150 signature threshold required to become publicly-available on the We the People site. For those who don’t need real-time data, we plan to add the option of a bulk data download in the near future. Until that’s ready, an incomplete sample data set is available for download here.”
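A minimal sketch of calling that API from Python follows. The endpoint URL, query parameters, and response fields are assumptions based on the description above (a read-only JSON petitions resource behind the developers page); consult petitions.whitehouse.gov/developers for the actual base URL and schema.

```python
import requests

# Illustrative endpoint and parameters; see petitions.whitehouse.gov/developers
# for the real base URL, parameter names, and response schema.
PETITIONS_URL = "https://petitions.whitehouse.gov/api/v1/petitions.json"

resp = requests.get(PETITIONS_URL, params={"limit": 10, "offset": 0}, timeout=30)
resp.raise_for_status()

# Assumed response shape: a "results" list of petition objects, each carrying
# a title and a running signature count.
for petition in resp.json().get("results", []):
    print(petition.get("title"), "-", petition.get("signatureCount"), "signatures")
```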