The Declassification Engine


Wired: “The CIA offers an electronic search engine that lets you mine about 11 million agency documents that have been declassified over the years. It’s called CREST, short for CIA Records Search Tool. But this represents only a portion of the CIA’s declassified materials, and if you want unfettered access to the search engine, you’ll have to physically visit the National Archives at College Park, Maryland….
a new project launched by a team of historians, mathematicians, and computer scientists at Columbia University in New York City. Led by Matthew Connelly — a Columbia professor trained in diplomatic history — the project is known as The Declassification Engine, and it seeks to provide a single online database for declassified documents from across the federal government, including the CIA, the State Department, and potentially any other agency.
The project is still in the early stages, but the team has already assembled a database of documents that stretches back to the 1940s, and it has begun building new tools for analyzing these materials. In aggregating all documents into a single database, the researchers hope to not only provide quicker access to declassified materials, but to glean far more information from these documents than we otherwise could.
In the parlance of the day, the project is tackling these documents with the help of Big Data. If you put enough of this declassified information in a single place, Connelly believes, you can begin to predict what government information is still being withheld.”

Deepbills project


Cato Institute: “The Deepbills project takes the raw XML of Congressional bills (available at FDsys and Thomas) and adds additional semantic information to them inside the text.

You can download the continuously-updated data at http://deepbills.cato.org/download

Congress already produces machine-readable XML of almost every bill it proposes, but that XML is designed primarily for formatting a paper copy, not for extracting information. For example, it’s not currently possible to find every mention of an Agency, every legal reference, or even every spending authorization in a bill without having a human being read it….
Currently the following information is tagged:

  • Legal citations…
  • Budget Authorities (both Authorizations of Appropriations and Appropriations)…
  • Agencies, bureaus, and subunits of the federal government.
  • Congressional committees
  • Federal elective officeholders (Congressmen)”

Crowdfunding gives rise to projects truly in public domain


USA Today: “Crowdfunding, the cyberpractice of pooling individuals’ money for a cause, so far has centered on private enterprise. It’s now spreading to public spaces and other community projects that are typically the domain of municipalities.

The global reach and speed of the Internet are raising not just money but awareness and galvanizing communities.

SmartPlanet.com recently reported that crowdfunding capital projects is gaining momentum, giving communities part ownership of everything from a 66-story downtown skyscraper in Bogota to a bridge in Rotterdam, the Netherlands. Several websites such as neighborland.com and neighbor.ly are platforms to raise money for projects ranging from planting fruit trees in San Francisco to building a playground that accommodates disabled children in Parsippany, N.J.

“Community groups are increasingly ready to challenge cities’ plans,” says Bryan Boyer, an independent consultant and adviser to The Finnish Innovation Fund SITRA, a think tank. “We’re all learning to live in the context of a networked society.”

Crowdfunder, which connects entrepreneurs and investors globally, just launched a local version — CROWDFUNDx.”

What the Obama Campaign's Chief Data Scientist Is Up to Now


Alexis Madrigal in The Atlantic: “By all accounts, Rayid Ghani’s data work for President Obama’s reelection campaign was brilliant and unprecedented. Ghani probably could have written a ticket to work at any company in the world, or simply collected speaking fees for a few years telling companies how to harness the power of data like the campaign did.
But instead, Ghani headed to the University of Chicago to bring sophisticated data analysis to difficult social problems. Working with the Computation Institute and the Harris School of Public Policy, Ghani will serve as the chief data scientist for the Urban Center for Computation and Data.”

Feel the force


The Economist: “Three new books look at power in the digital age…
To Save Everything, Click Here: The Folly of Technological Solutionism. By Evgeny Morozov. PublicAffairs; 415 pages; $28.99. Allen Lane; £20.
Who Owns the Future? By Jaron Lanier. Simon and Schuster; 397 pages; $28. Allen Lane; £20.
The New Digital Age: Reshaping the Future of People, Nations and Business. By Eric Schmidt and Jared Cohen. Knopf; 319 pages; $26.95. John Murray; £25.”

Open government data shines a light on hospital billing and health care costs


Alex Howard: “If transparency is the best disinfectant, casting sunlight upon the cost of care in hospitals across the United States will make the health care system itself healthier.
The Department of Health and Human Services has released open data that compares the billing for the 100 most common treatments and procedures performed at more than 3000 hospitals in the U.S. The Medicare provider charge data shows significant variation within communities and across the country for the same procedures.
One hospital charged $8,000, another $38,000 — for the same condition. This data is enabling newspapers like the Washington Post to show people the actual costs of health care and create interactive features that enable people to search for individual hospitals and see how they compare. The New York Times explored the potential reasons behind wild disparities in billing at length today, from sicker patients to longer hospitalizations to higher labor costs.”

A Page From the Tri-Sector Athlete Playbook: Designing a Pro-Bono Partnership Model for Cities and Public Agencies


Jeremy Goldberg: “Leaders in our social systems and institutions are faced with many of the same challenges of the past century, but they are tasked to solve them within new fiscal realities. In the United States these fiscal realities are tied to the impact of the most recent economic recession coupled with declining property and tax revenues. While these issues seem largely to be “problems” that many perceive to belong to our government, leadership across sectors has had to respond and adapt in numerous ways, some of which unfortunately include pay and hiring-freezes, lay-offs and cuts to important public services and programs related to education, parks and safety.
Fortunately, within this “new normal” there are examples of leadership within the public and private sector confronting these challenges head-on through innovative public-private partnerships (P3s). For example, municipal governments are turning to opportunities like IBM’s Smarter Cities Challenge, which provides funding and a team of IBM employees to assist the city in solving specific public problems. Other cities such as Boston, Louisville and San Francisco have established initiatives, projects and Offices of Civic Innovation where government, technologists, communities and residents are collaborating to solve problems through open-data initiatives and platforms.
This new generation of innovative P3s demonstrates the inherent power of what Joseph Nye coined a tri-sector athlete — someone who is able and experienced in business, government and the social sector. Today, unlike any other time before, tri-sector athletes are demonstrating that business as usual just won’t cut it. These athletes, myself included, believe it’s the perfect moment for civic innovation, the perfect time for civic collaboration, and the perfect moment for an organization like Fuse Corps to lead the national civic entrepreneurship movement… and I’m proud to be a part of it.”

D4D Challenge Winners announced


Global Pulse Blog: “The winners of the Data for Development challenge – an international research challenge using a massive anonymized dataset provided by telecommunications company Orange – were announced at the NetMob 2013 Conference in Boston last week….
In this post we’ll look at the winners and how their research could be put to use.

Best Visualization prize winner: “Exploration and Analysis of Massive Mobile Phone Data: A Layered Visual Analytics Approach”

Best Development prize winner: “AllAboard: a System for Exploring Urban Mobility and Optimizing Public Transport Using Cellphone Data”

Best Scientific prize winner: “Analyzing Social Divisions Using Cell Phone Data”

First prize winner: “Exploiting Cellular Data for Disease Containment and Information Campaigns Strategies in Country-Wide Epidemics””

New NAS Report: Copyright in the Digital Era: Building Evidence for Policy


National Academies of Sciences: “Over the course of several decades, copyright protection has been expanded and extended through legislative changes occasioned by national and international developments. The content and technology industries affected by copyright and its exceptions, and in some cases balancing the two, have become increasingly important as sources of economic growth, relatively high-paying jobs, and exports. Since the expansion of digital technology in the mid-1990s, they have undergone a technological revolution that has disrupted long-established modes of creating, distributing, and using works ranging from literature and news to film and music to scientific publications and computer software.

In the United States and internationally, these disruptive changes have given rise to a strident debate over copyright’s proper scope and terms and means of its enforcement–a debate between those who believe the digital revolution is progressively undermining the copyright protection essential to encourage the funding, creation, and distribution of new works and those who believe that enhancements to copyright are inhibiting technological innovation and free expression.

Copyright in the Digital Era: Building Evidence for Policy examines a range of questions regarding copyright policy by using a variety of methods, such as case studies, international and sectoral comparisons, and experiments and surveys. This report is especially critical in light of digital age developments that may, for example, change the incentive calculus for various actors in the copyright system, impact the costs of voluntary copyright transactions, pose new enforcement challenges, and change the optimal balance between copyright protection and exceptions.”