If My Data Is an Open Book, Why Can’t I Read It?


Natasha Singer in the New York Times: “Never mind all the hoopla about the presumed benefits of an ‘open data’ society. In our day-to-day lives, many of us are being kept in the data dark.

“The fact that I am producing data and companies are collecting it to monetize it, if I can’t get a copy myself, I do consider it unfair,” says Latanya Sweeney, the director of the Data Privacy Lab at Harvard, where she is a professor of government and technology….

In fact, a few companies are challenging the norm of corporate data hoarding by actually sharing some information with the customers who generate it — and offering tools to put it to use. It’s a small but provocative trend in the United States, where only a handful of industries, like health care and credit, are required by federal law to provide people with access to their records.

Last year, San Diego Gas and Electric, a utility, introduced an online energy management program in which customers can view their electricity use in monthly, daily or hourly increments. There is even a practical benefit: customers can earn credits by reducing energy consumption during peak hours….
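Behind a program like SDG&E’s, the hourly meter readings have to be rolled up into the daily and monthly views the customer sees. A minimal sketch of that rollup with made-up readings (a real feed would come from the utility’s own export tool, not this format):

```python
from collections import defaultdict

# Hypothetical hourly meter readings (kWh); timestamps are ISO strings.
readings = [
    ("2013-05-01T00:00", 1.2),
    ("2013-05-01T13:00", 0.8),
    ("2013-05-02T09:00", 1.5),
    ("2013-05-02T18:00", 2.0),  # an evening hour that might fall in a peak window
]

def rollup(readings, key_len):
    """Sum kWh by a timestamp prefix: 10 chars = daily, 7 = monthly."""
    totals = defaultdict(float)
    for ts, kwh in readings:
        totals[ts[:key_len]] += kwh
    return dict(totals)

daily = rollup(readings, 10)   # per-day totals
monthly = rollup(readings, 7)  # per-month totals
```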

Deepbills project


Cato Institute: “The Deepbills project takes the raw XML of Congressional bills (available at FDsys and Thomas) and adds additional semantic information to them inside the text.

You can download the continuously updated data at http://deepbills.cato.org/download

Congress already produces machine-readable XML of almost every bill it proposes, but that XML is designed primarily for formatting a paper copy, not for extracting information. For example, it’s not currently possible to find every mention of an Agency, every legal reference, or even every spending authorization in a bill without having a human being read it….
Currently the following information is tagged:

  • Legal citations…
  • Budget Authorities (both Authorizations of Appropriations and Appropriations)…
  • Agencies, bureaus, and subunits of the federal government.
  • Congressional committees
  • Federal elective officeholders (Congressmen)”
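To see what inline semantic tagging buys you, here is a toy sketch of querying annotations instead of reading prose. The element and attribute names below are illustrative, not Cato’s actual schema (see deepbills.cato.org for that):

```python
import xml.etree.ElementTree as ET

# A toy fragment in the spirit of Deepbills' inline annotations; the real
# schema's element and attribute names differ.
bill_xml = """
<bill>
  <section>
    The <entity type="agency">Department of Energy</entity> is authorized
    <entity type="funding">$10,000,000</entity> under
    <entity type="law">42 U.S.C. 7112</entity>.
  </section>
</bill>
"""

root = ET.fromstring(bill_xml)
# Every tagged agency, spending authorization, and legal citation, found
# without a human reading the bill:
tagged = [(e.get("type"), e.text) for e in root.iter("entity")]
```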

Introducing: Project Open Data


White House Blog: “Technology evolves rapidly, and it can be challenging for policy and its implementation to evolve at the same pace.  Last week, President Obama launched the Administration’s new Open Data Policy and Executive Order aimed at ensuring that data released by the government will be as accessible and useful as possible.  To make sure this tech-focused policy can keep up with the speed of innovation, we created Project Open Data.
Project Open Data is an online, public repository intended to foster collaboration and promote the continual improvement of the Open Data Policy. We wanted to foster a culture change in government where we embrace collaboration and where anyone can help us make open data work better. The project is published on GitHub, an open source platform that allows communities of developers to collaboratively share and enhance code.  The resources and plug-and-play tools in Project Open Data can help accelerate the adoption of open data practices.  For example, one tool instantly converts spreadsheets and databases into APIs for easier consumption by developers.  The idea is that anyone, from Federal agencies to state and local governments to private citizens, can freely use and adapt these open source tools—and that’s exactly what’s happening.
Within the first 24 hours after Project Open Data was published, more than two dozen contributions (or “pull requests” in GitHub speak) were submitted by the public. The submissions included everything from fixing broken links, to providing policy suggestions, to contributing new code and tools. One pull request even included new code that translates geographic data from locked formats into open data that is freely available for use by anyone…”
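The spreadsheet-to-API tool mentioned above boils down to one core step: turning tabular rows into JSON a web server can hand out. A minimal sketch of that step (not the project’s actual code):

```python
import csv
import io
import json

def csv_to_api_payload(csv_text):
    """Turn a spreadsheet export into the JSON a read-only API would serve.
    A sketch of the idea behind the converter tool, not its implementation."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows)

# A hypothetical two-column spreadsheet export:
sheet = "agency,datasets\nGSA,120\nNOAA,312\n"
payload = csv_to_api_payload(sheet)
# payload is a JSON array of row objects, ready for any web framework to serve
```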

IRS: Turn over a new leaf, Open up Data


Beth Simone Noveck and Stefaan Verhulst in Forbes: “The core task for Danny Werfel, the new acting commissioner of the IRS, is to repair the agency’s tarnished reputation and achieve greater efficacy and fairness in IRS investigations. Mr. Werfel can show true leadership by restructuring how the IRS handles its tax-exempt enforcement processes.
One of Mr. Werfel’s first actions on the job should be the immediate implementation of the groundbreaking Presidential Executive Order and Open Data policy, released last week, that requires data captured and generated by the government be made available in open, machine-readable formats. Doing so will make the IRS a beacon to other agencies in how to use open data to screen any wrongdoing and strengthen law enforcement.
By sharing readily available IRS data on tax-exempt organizations, encouraging Congress to pass a budget proposal that mandates release of all tax-exempt returns in a machine-readable format, and increasing the transparency of its own processes, the agency can begin to turn the page on this scandal and help rebuild trust and partnership between government and its citizens.”
See full article here.

Economic effects of open data policy still 'anecdotal'


Adam Mazmanian in FCW: “A year after the launch of the government’s digital strategy, there’s no official tally of the economic activity generated by the release of government datasets for use in commercial applications.
“We have anecdotal examples, but nothing official yet,” said federal CIO Steven VanRoekel in an invitation-only meeting with reporters at the FOSE conference on May 15. “It’s an area where we have an opportunity to start to talk about this, because it’s starting to tick up a bit, and the numbers are looking pretty good.” (Related story: APIs help agencies say yes)…
The Obama administration is banking on an explosion in the use of federal datasets for commercial and government applications alike. Last week’s executive order and accompanying directive from the Office of Management and Budget tasks agencies with making open and machine readable data the new default setting for government information.
VanRoekel said that the merits of the open data standard don’t necessarily need to be justified by economic activity….
The executive order also spells out privacy concerns arising from the so-called “mosaic effect,” by which information from disparate datasets can be overlaid to decipher personally identifiable information.”
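The mosaic effect is easy to demonstrate: a linkage attack of the kind Latanya Sweeney made famous needs nothing more than a join on shared quasi-identifiers. A toy sketch with invented records:

```python
# Two datasets, each "anonymized" on its own, can re-identify people when
# overlaid on shared quasi-identifiers (here: ZIP code and birth year).
health = [  # published without names
    {"zip": "20500", "birth_year": 1961, "diagnosis": "asthma"},
]
voters = [  # public voter roll with names
    {"zip": "20500", "birth_year": 1961, "name": "J. Doe"},
    {"zip": "20001", "birth_year": 1975, "name": "A. Smith"},
]

reidentified = [
    (v["name"], h["diagnosis"])
    for h in health
    for v in voters
    if (v["zip"], v["birth_year"]) == (h["zip"], h["birth_year"])
]
# The join reveals what neither dataset did alone.
```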

Wikipedia Recent Changes Map



The Verge: “By watching a new visualization, known plainly as the Wikipedia Recent Changes Map, viewers can see the location of every unregistered Wikipedia user who makes a change to the open encyclopedia. It provides a voyeuristic look at the rate that knowledge is contributed to the website, giving you the faintest impression of the Spaniard interested in the television show Jackass or the Brazilian who defaced the page on the Jersey Devil to feature a photograph of the new pope. Though the visualization moves quickly, it’s only displaying about one-fifth of the edits being made: Wikipedia doesn’t reveal location data for registered users, and unregistered users make up just 15 to 20 percent of all contributions, according to studies of the website.”
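The map’s filtering step is simple in principle: an edit is mappable only when the editor is unregistered, which in MediaWiki’s recent-changes feed means the username is an IP address. A sketch with sample records (field names simplified from the real feed):

```python
import ipaddress

# Sample records shaped loosely like a recent-changes feed; only edits
# whose "user" field is an IP address (an unregistered editor) carry a
# location the map can plot.
changes = [
    {"title": "Jackass (TV series)", "user": "81.36.10.2"},
    {"title": "Jersey Devil", "user": "CuriousEditor"},
    {"title": "Pope Francis", "user": "200.148.33.7"},
]

def is_anonymous(user):
    """An unregistered editor's username is a valid IPv4/IPv6 address."""
    try:
        ipaddress.ip_address(user)
        return True
    except ValueError:
        return False

mappable = [c for c in changes if is_anonymous(c["user"])]
# Two of the three edits above could be geolocated and drawn on the map.
```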

OpenData Latinoamérica


Mariano Blejman and Miguel Paz in the IJNet Blog: “We need a central repository where you can share the data that you have proved to be reliable. Our answer to this need: OpenData Latinoamérica, which we are leading as ICFJ Knight International Journalism Fellows.
Inspired by the open data portal created by ICFJ Knight International Journalism Fellow Justin Arenstein in Africa, OpenData Latinoamérica aims to improve the use of data in this region where data sets too often fail to show up where they should, and when they do, are scattered about the web at governmental repositories and multiple independent repositories where the data is removed too quickly.

The portal will be used at two big upcoming events: Bolivia’s first DataBootCamp and the Conferencia de Datos Abiertos (Open Data Conference) in Montevideo, Uruguay. Then, we’ll hold a series of hackathons and scrape-athons in Chile, which is in a period of presidential elections in which citizens increasingly demand greater transparency. Releasing data and developing applications for accountability will be the key.”

Challenge: Visualizing Online Takedown Requests


visualizing.org: “The free flow of information defines the Internet. Innovations like Wikipedia and crowdsourcing owe their existence to and are powered by the resulting streams of knowledge and ideas. Indeed, more information means more choice, more freedom, and ultimately more power for the individual and society. But — citing reasons like defamation, national security, and copyright infringement — governments, corporations, and other organizations at times may regulate and restrict information online. By blocking or filtering sites, issuing court orders limiting access to information, enacting legislation or pressuring technology and communication companies, governments and other organizations aim to censor one of the most important means of free expression in the world. What does this mean and to what extent should attempts to censor online content be permitted?…
We challenge you to visualize the removal requests in Google’s Transparency Report. What in this data should be communicated to the general public? Are there any trends or patterns in types of requests that have been complied with? Have legal and policy environments shaped what information is available and/or restricted in different countries? The data set on government requests (~1 thousand rows) provides summaries broken down by country, Google product, and reason. The data set on copyright requests, however, is much larger (~1 million rows) and includes each individual request. Use one or both data sets, by themselves or with other open data sets. We’re excited to partner with Google for this challenge, and we’re offering $5,000 in prizes.”
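A natural first cut at the smaller government-requests dataset is a simple tally by reason or country before any charting. A sketch with invented rows (the report’s actual column headers differ):

```python
from collections import Counter

# Rows shaped loosely like the Transparency Report's summary data; the
# field names and values here are illustrative, not the report's own.
requests = [
    {"country": "US", "reason": "defamation", "items": 12},
    {"country": "US", "reason": "copyright", "items": 40},
    {"country": "BR", "reason": "defamation", "items": 25},
]

by_reason = Counter()
by_country = Counter()
for r in requests:
    by_reason[r["reason"]] += r["items"]
    by_country[r["country"]] += r["items"]
# A tally like this is the raw material for spotting trends by reason
# or by country before building any visualization.
```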


Deadline: Thursday, June 27, 2013, 11:59 pm EDT
Winner Announced: Thursday, July 11, 2013

More at http://visualizing.org/contests/visualizing-online-takedown-requests

New Open Data Executive Order and Policy


The White House: “The Obama Administration today took groundbreaking new steps to make information generated and stored by the Federal Government more open and accessible to innovators and the public, to fuel entrepreneurship and economic growth while increasing government transparency and efficiency.
Today’s actions—including an Executive Order signed by the President and an Open Data Policy released by the Office of Management and Budget and the Office of Science and Technology Policy—declare that information is a valuable national asset whose value is multiplied when it is made easily accessible to the public.  The Executive Order requires that, going forward, data generated by the government be made available in open, machine-readable formats, while appropriately safeguarding privacy, confidentiality, and security.
The move will make troves of previously inaccessible or unmanageable data easily available to entrepreneurs, researchers, and others who can use those files to generate new products and services, build businesses, and create jobs….
Along with the Executive Order and Open Data Policy, the Administration announced a series of complementary actions:
• A new Data.Gov.  In the months ahead, Data.gov, the powerful central hub for open government data, will launch new services that include improved visualization, mapping tools, better context to help locate and understand these data, and robust Application Programming Interface (API) access for developers.
• New open source tools to make data more open and accessible.  The US Chief Information Officer and the US Chief Technology Officer are releasing free, open source tools on Github, a site that allows communities of developers to collaboratively develop solutions.  This effort, known as Project Open Data, can accelerate the adoption of open data practices by providing plug-and-play tools and best practices to help agencies improve the management and release of open data.  For example, one tool released today automatically converts simple spreadsheets and databases into APIs for easier consumption by developers.  Anyone, from government agencies to private citizens to local governments and for-profit companies, can freely use and adapt these tools starting immediately.
• Building a 21st century digital government.  As part of the Administration’s Digital Government Strategy and Open Data Initiatives in health, energy, education, public safety, finance, and global development, agencies have been working to unlock data from the vaults of government, while continuing to protect privacy and national security.  Newly available or improved data sets from these initiatives will be released today and over the coming weeks as part of the one year anniversary of the Digital Government Strategy.
• Continued engagement with entrepreneurs and innovators to leverage government data.  The Administration has convened and will continue to bring together companies, organizations, and civil society for a variety of summits to highlight how these innovators use open data to positively impact the public and address important national challenges.  In June, Federal agencies will participate in the fourth annual Health Datapalooza, hosted by the nonprofit Health Data Consortium, which will bring together more than 1,800 entrepreneurs, innovators, clinicians, patient advocates, and policymakers for information sessions, presentations, and “code-a-thons” focused on how the power of data can be harnessed to help save lives and improve healthcare for all Americans.
For more information on open data highlights across government visit: http://www.whitehouse.gov/administration/eop/ostp/library/docsreports”

The Uncertain Relationship Between Open Data and Accountability


Tiago Peixoto’s response to Yu and Robinson’s paper “The New Ambiguity of ‘Open Government’”: “By looking at the nature of data that may be disclosed by governments, Harlan Yu and David Robinson provide an analytical framework that evinces the ambiguities underlying the term “open government data.” While agreeing with their core analysis, I contend that the authors ignore the enabling conditions under which transparency may lead to accountability, notably the publicity and political agency conditions. I argue that the authors also overlook the role of participatory mechanisms as an essential element in unlocking the potential for open data to produce better government decisions and policies. Finally, I conduct an empirical analysis of the publicity and political agency conditions in countries that have launched open data efforts, highlighting the challenges associated with open data as a path to accountability.”