Deepbills project


Cato Institute: “The Deepbills project takes the raw XML of Congressional bills (available at FDsys and Thomas) and adds additional semantic information to them in inside the text.

You can download the continuously-updated data at http://deepbills.cato.org/download

Congress already produces machine-readable XML of almost every bill it proposes, but that XML is designed primarily for formatting a paper copy, not for extracting information. For example, it’s not currently possible to find every mention of an Agency, every legal reference, or even every spending authorization in a bill without having a human being read it….
Currently the following information is tagged:

  • Legal citations…
  • Budget Authorities (both Authorizations of Appropriations and Appropriations)…
  • Agencies, bureaus, and subunits of the federal government.
  • Congressional committees
  • Federal elective officeholders (Congressmen)”