If My Data Is an Open Book, Why Can’t I Read It?


Natasha Singer in the New York Times: “Never mind all the hoopla about the presumed benefits of an ‘open data’ society. In our day-to-day lives, many of us are being kept in the data dark.

“The fact that I am producing data and companies are collecting it to monetize it, if I can’t get a copy myself, I do consider it unfair,” says Latanya Sweeney, the director of the Data Privacy Lab at Harvard, where she is a professor of government and technology….

In fact, a few companies are challenging the norm of corporate data hoarding by actually sharing some information with the customers who generate it — and offering tools to put it to use. It’s a small but provocative trend in the United States, where only a handful of industries, like health care and credit, are required by federal law to provide people with access to their records.

Last year, San Diego Gas and Electric, a utility, introduced an online energy management program in which customers can view their electricity use in monthly, daily or hourly increments. There is even a practical benefit: customers can earn credits by reducing energy consumption during peak hours….

Deepbills project


Cato Institute: “The Deepbills project takes the raw XML of Congressional bills (available at FDsys and Thomas) and adds semantic information inside the text.

You can download the continuously-updated data at http://deepbills.cato.org/download

Congress already produces machine-readable XML of almost every bill it proposes, but that XML is designed primarily for formatting a paper copy, not for extracting information. For example, it’s not currently possible to find every mention of an Agency, every legal reference, or even every spending authorization in a bill without having a human being read it….
Currently the following information is tagged:

  • Legal citations…
  • Budget Authorities (both Authorizations of Appropriations and Appropriations)…
  • Agencies, bureaus, and subunits of the federal government.
  • Congressional committees
  • Federal elective officeholders (Congressmen)”
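To make the tagging concrete, here is a minimal Python sketch of how such inline semantic markup could be consumed. The namespace URI, tag names, and `entity-type` values below are illustrative assumptions, not necessarily the exact Deepbills schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment in the spirit of the Deepbills markup: semantic
# tags wrapped around entities inside the bill text. The actual Cato
# schema and attribute names may differ.
SAMPLE = """<bill xmlns:cato="http://namespaces.cato.org/catoxml">
  <text>The <cato:entity entity-type="federal-body">Department of
Energy</cato:entity> is authorized
  <cato:entity entity-type="auth-authorization">$50,000,000</cato:entity>
  under <cato:entity entity-type="law-citation">42 U.S.C. 7101</cato:entity>.</text>
</bill>"""

CATO_ENTITY = "{http://namespaces.cato.org/catoxml}entity"

def tagged_entities(xml_string):
    """Return (entity-type, text) pairs for every tagged entity."""
    root = ET.fromstring(xml_string)
    out = []
    for e in root.iter(CATO_ENTITY):
        # Join the element's text and normalize internal whitespace.
        text = " ".join("".join(e.itertext()).split())
        out.append((e.get("entity-type"), text))
    return out
```

With markup like this, finding every agency mention or spending authorization becomes a mechanical query instead of a human read-through.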

IRS: Turn over a new leaf, Open up Data


Beth Simone Noveck and Stefaan Verhulst in Forbes: “The core task for Danny Werfel, the new acting commissioner of the IRS, is to repair the agency’s tarnished reputation and achieve greater efficacy and fairness in IRS investigations. Mr. Werfel can show true leadership by restructuring how the IRS handles its tax-exempt enforcement processes.
One of Mr. Werfel’s first actions on the job should be the immediate implementation of the groundbreaking Presidential Executive Order and Open Data policy, released last week, which requires that data captured and generated by the government be made available in open, machine-readable formats. Doing so will make the IRS a beacon to other agencies in how to use open data to screen for wrongdoing and strengthen law enforcement.
By sharing readily available IRS data on tax-exempt organizations, encouraging Congress to pass a budget proposal that mandates release of all tax-exempt returns in a machine-readable format, and increasing the transparency of its own processes, the agency can begin to turn the page on this scandal and help rebuild trust and partnership between government and its citizens.”
See full article here.

Finding the Common Good in an Era of Dysfunctional Governance


New Essay by Thomas E. Mann and Norman J. Ornstein in the Spring 2013 issue of Daedalus (a journal of the American Academy of Arts & Sciences):

“The framers designed a constitutional system in which the government would play a vigorous role in securing the liberty and well-being of a large and diverse population. They built a political system around a number of key elements, including debate and deliberation, divided powers competing with one another, regular order in the legislative process, and avenues to limit and punish corruption. America in recent years has struggled to adhere to each of these principles, leading to a crisis of governability and legitimacy. The roots of this problem are twofold. The first is a serious mismatch between our political parties, which have become as polarized and vehemently adversarial as parliamentary parties, and a separation-of-powers governing system that makes it extremely difficult for majorities to act. The second is the asymmetric character of the polarization. The Republican Party has become a radical insurgency—ideologically extreme, scornful of facts and compromise, and dismissive of the legitimacy of its political opposition. Securing the common good in the face of these developments will require structural changes but also an informed and strategically focused citizenry.”

New NAS Report: Copyright in the Digital Era: Building Evidence for Policy


National Academies of Sciences: “Over the course of several decades, copyright protection has been expanded and extended through legislative changes occasioned by national and international developments. The content and technology industries affected by copyright and its exceptions, and in some cases balancing the two, have become increasingly important as sources of economic growth, relatively high-paying jobs, and exports. Since the expansion of digital technology in the mid-1990s, they have undergone a technological revolution that has disrupted long-established modes of creating, distributing, and using works ranging from literature and news to film and music to scientific publications and computer software.

In the United States and internationally, these disruptive changes have given rise to a strident debate over copyright’s proper scope and terms and means of its enforcement–a debate between those who believe the digital revolution is progressively undermining the copyright protection essential to encourage the funding, creation, and distribution of new works and those who believe that enhancements to copyright are inhibiting technological innovation and free expression.

Copyright in the Digital Era: Building Evidence for Policy examines a range of questions regarding copyright policy by using a variety of methods, such as case studies, international and sectoral comparisons, and experiments and surveys. This report is especially critical in light of digital age developments that may, for example, change the incentive calculus for various actors in the copyright system, impact the costs of voluntary copyright transactions, pose new enforcement challenges, and change the optimal balance between copyright protection and exceptions.”

Is Privacy Algorithmically Impossible?


MIT Technology Review: “In 1995, the European Union introduced privacy legislation that defined ‘personal data’ as any information that could identify a person, directly or indirectly. The legislators were apparently thinking of things like documents with an identification number, and they wanted them protected just as if they carried your name.
Today, that definition encompasses far more information than those European legislators could ever have imagined—easily more than all the bits and bytes in the entire world when they wrote their law 18 years ago.
Here’s what happened. First, the amount of data created each year has grown exponentially (see figure)…
Much of this data is invisible to people and seems impersonal. But it’s not. What modern data science is finding is that nearly any type of data can be used, much like a fingerprint, to identify the person who created it: your choice of movies on Netflix, the location signals emitted by your cell phone, even your pattern of walking as recorded by a surveillance camera. In effect, the more data there is, the less any of it can be said to be private. We are coming to the point that if the commercial incentives to mine the data are in place, anonymity of any kind may be “algorithmically impossible,” says Princeton University computer scientist Arvind Narayanan.”
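A toy Python sketch (with entirely made-up records) illustrates the mechanism behind this kind of re-identification: even with names removed, a combination of innocuous-looking attributes is often unique to a single person, and so works like the fingerprint the article describes.

```python
from collections import Counter

# Made-up "anonymized" records: no names, just three ordinary attributes
# (ZIP code, birth date, sex). Purely illustrative data.
records = [
    ("02138", "1960-07-15", "F"),
    ("02138", "1960-07-15", "M"),
    ("02139", "1984-03-02", "F"),
    ("02139", "1984-03-02", "F"),
    ("02140", "1975-11-30", "M"),
]

def unique_fraction(rows):
    """Fraction of rows whose attribute combination appears exactly once."""
    counts = Counter(rows)
    return sum(1 for r in rows if counts[r] == 1) / len(rows)
```

Here three of the five records are already unique; anyone who knows those three attributes about you can pick your row out of the "anonymous" data.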

Life in the City Is Essentially One Giant Math Problem


Smithsonian Magazine: “A new science—so new it doesn’t have its own journal, or even an agreed-upon name—is exploring these laws. We will call it “quantitative urbanism.” It’s an effort to reduce to mathematical formulas the chaotic, exuberant, extravagant nature of one of humanity’s oldest and most important inventions, the city.
The systematic study of cities dates back at least to the Greek historian Herodotus. In the early 20th century, scientific disciplines emerged around specific aspects of urban development: zoning theory, public health and sanitation, transit and traffic engineering. By the 1960s, the urban-planning writers Jane Jacobs and William H. Whyte used New York as their laboratory to study the street life of neighborhoods, the walking patterns of Midtown pedestrians, the way people gathered and sat in open spaces. But their judgments were generally aesthetic and intuitive…
Only in the past decade has the ability to collect and analyze information about the movement of people begun to catch up to the size and complexity of the modern metropolis itself…
Deep mathematical principles underlie even such seemingly random and historically contingent facts as the distribution of the sizes of cities within a country. There is typically one largest city, whose population is twice that of the second-largest city and three times that of the third-largest, and a growing number of smaller cities whose sizes also fall into a predictable pattern. This principle is known as Zipf’s law, which applies across a wide range of phenomena…”
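The rank-size rule described in the excerpt can be written as P(n) ≈ P(1)/n, where P(1) is the population of the largest city and n is a city's rank. A quick Python sketch, using a made-up largest-city population for illustration:

```python
def zipf_population(largest, rank):
    """Rank-size rule: the rank-n city has roughly 1/n the largest city's population."""
    return largest / rank

# With a hypothetical largest city of 8 million people, the rule predicts
# populations of 8M, 4M, ~2.67M, and 2M for the top four cities.
predicted = [round(zipf_population(8_000_000, n)) for n in range(1, 5)]
```

Real city-size data only approximates this curve, but the fit across many countries is what makes the law striking.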

An API for "We the People"


The White House Blog: “We can’t talk about We the People without getting into the numbers — more than 8 million users, more than 200,000 petitions, more than 13 million signatures. The sheer volume of participation is, to us, a sign of success.
And there’s a lot we can learn from a set of data that rich and complex, but we shouldn’t be the only people drawing from its lessons.
So starting today, we’re making it easier for anyone to do their own analysis or build their own apps on top of the We the People platform. We’re introducing the first version of our API, and we’re inviting you to use it.
Get started here: petitions.whitehouse.gov/developers
This API provides read-only access to data on all petitions that passed the 150-signature threshold required to become publicly available on the We the People site. For those who don’t need real-time data, we plan to add the option of a bulk data download in the near future. Until that’s ready, an incomplete sample data set is available for download here.”
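As a rough illustration of the kind of analysis the API invites, here is a Python sketch that filters a petition listing by signature count. The post links to the developer page rather than documenting the format, so the endpoint path is omitted and the JSON field names and response shape below are assumptions, not the documented API.

```python
import json

# Assumed response shape for a petitions listing; the real field names
# may differ — consult petitions.whitehouse.gov/developers.
SAMPLE_RESPONSE = """{
  "results": [
    {"title": "Example petition", "signatureCount": 152, "status": "open"},
    {"title": "Another petition", "signatureCount": 98431, "status": "responded"}
  ]
}"""

def petitions_over(raw_json, threshold):
    """Return titles of petitions whose signature count meets the threshold."""
    data = json.loads(raw_json)
    return [p["title"] for p in data["results"]
            if p["signatureCount"] >= threshold]
```

Swapping the sample string for a live HTTP response would turn this into a working client; the filtering logic stays the same either way.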

Frameworks for a Location–Enabled Society


Annual CGA Conference: “Location-enabled devices are weaving ‘smart grids’ and building ‘smart cities’; they allow people to discover a friend in a shopping mall, catch a bus at its next stop, check surrounding air quality while walking down a street, or avoid a rain storm on a tourist route – now or in the near future. And increasingly they allow those who provide services to track us, whether we are walking past stores on the street or seeking help in a natural disaster.
The Centre for Spatial Law and Policy based in Washington, DC, the Center for Geographic Analysis, the Belfer Center for Science and International Affairs and the Berkman Center for Internet and Society at Harvard University are co-hosting a two-day program examining the legal and policy issues that will impact geospatial technologies and the development of location-enabled societies. The event will take place at Harvard University on May 2-3, 2013…The goal is to explore the different dimensions of policy and legal concerns in geospatial technology applications, and to begin creating a policy and legal framework for a location-enabled society. Download the conference program brochure.
Live Webcast:

Stream videos at Ustream

Department of Better Technology


Next City reports: “…opening up government can get expensive. That’s why two developers this week launched the Department of Better Technology, an effort to make open government tools cheaper, more efficient and easier to engage with.

As founder Clay Johnson explains in a post on the site’s blog, a federal website launched last year to catalogue databases on government contracts cost $181 million to build, $81 million more than a recent research initiative to map the human brain.

“I’d like to say that this is just a one-off anomaly, but government regularly pays millions of dollars for websites,” writes Johnson, the former director of Sunlight Labs at the Sunlight Foundation and author of the 2012 book The Information Diet.

The first undertaking of Johnson and his partner, GovHub co-founder Adam Becker, is a tool meant to make it simpler for businesses to find government projects to bid on, as well as help officials streamline the process of managing procurements. In a pilot experiment, Johnson writes, the pair found that not only were bids coming in faster and at a reduced price, but more people were doing the bidding.

Per Johnson, “many of the bids that came in were from businesses that had not ordinarily contracted with the federal government before.”
The Department of Better Technology will accept five cities to test a beta version of this tool, called Procure.io, in 2013.”