Index: Crime and Criminal Justice Data


The Living Library Index – inspired by the Harper’s Index – provides important statistics and highlights global trends in governance innovation. This installment focuses on crime and criminal justice data and was originally published in 2015.

This index provides information about the type of crime and criminal justice data collected, shared and used in the United States. Because data related to the criminal justice system is often unreliable, or just plain missing, this index also highlights some of the issues that stand in the way of accessing useful and in-demand statistics.

Data Collections: National Crime Statistics

  • Number of incident-based crime datasets created by the Federal Bureau of Investigation (FBI): 2
    • Number of U.S. Statistical Agencies: 13
    • How many of those are focused on criminal justice: 1, the Bureau of Justice Statistics (BJS)
    • Number of data collections focused on criminal justice the BJS produces: 61
    • Number of federal-level APIs available for crime or criminal justice data: 1, the National Crime Victimization Survey (NCVS)
    • Frequency of the NCVS: annually
  • Number of Statistical Analysis Centers (SACs), organizations that are essentially clearinghouses for crime and criminal justice data for each state, the District of Columbia, Puerto Rico and the Northern Mariana Islands: 53

Open data, data use and the impact of those efforts

  • Number of datasets that are returned when “criminal justice” is searched for on Data.gov: 417, including federal-, state- and city-level datasets
  • Number of datasets that are returned when “crime” is searched for on Data.gov: 281
  • Percentage by which public complaints dropped after officers started wearing body cameras, according to a study done in Rialto, Calif.: 88
  • Percentage by which reported incidents of officer use of force fell after officers started wearing body cameras, according to a study done in Rialto, Calif.: 5
  • Percentage by which crime decreased during an experiment in predictive policing in Shreveport, LA: 35
  • Number of crime data sets made available by the Seattle Police Department – generally seen as a leader in police data innovation – on the Seattle.gov website: 4
    • Major crime stats by category in aggregate
    • Crime trend reports
    • Precinct data by beat
    • State sex offender database
  • Number of datasets mapped by the Seattle Police Department: 2
    • 911 incidents
    • Police reports
  • Number of states where risk assessment tools must be used in pretrial proceedings to help determine whether a defendant is released from jail before a trial: at least 11

Police Data

  • Number of federally mandated databases that collect information about officer use of force or officer involved shootings, nationwide: 0
  • The year a crime bill was passed that called for data on excessive force to be collected for research and statistical purposes, but has never been funded: 1994
  • Number of police departments that committed to being a part of the White House’s Police Data Initiative: 21
  • Percentage of police departments surveyed in 2013 by the Office of Community Oriented Policing within the Department of Justice that are not using body cameras, therefore not collecting body camera data: 75

The criminal justice system

  • Parts of the criminal justice system where data about an individual can be created or collected: at least 6
    • Entry into the system (arrest)
    • Prosecution and pretrial
    • Sentencing
    • Corrections
    • Probation/parole
    • Recidivism

Sources

  • Crime Mapper. Philadelphia Police Department. Accessed August 24, 2014.

The New Science of Sentencing


Anna Maria Barry-Jester et al at the Marshall Project: “Criminal sentencing has long been based on the present crime and, sometimes, the defendant’s past criminal record. In Pennsylvania, judges could soon consider a new dimension: the future.

Pennsylvania is on the verge of becoming one of the first states in the country to base criminal sentences not only on what crimes people have been convicted of, but also on whether they are deemed likely to commit additional crimes. As early as next year, judges there could receive statistically derived tools known as risk assessments to help them decide how much prison time — if any — to assign.

Risk assessments have existed in various forms for a century, but over the past two decades, they have spread through the American justice system, driven by advances in social science. The tools try to predict recidivism — repeat offending or breaking the rules of probation or parole — using statistical probabilities based on factors such as age, employment history and prior criminal record. They are now used at some stage of the criminal justice process in nearly every state. Many court systems use the tools to guide decisions about which prisoners to release on parole, for example, and risk assessments are becoming increasingly popular as a way to help set bail for inmates awaiting trial.

But Pennsylvania is about to take a step most states have until now resisted for adult defendants: using risk assessment in sentencing itself. A state commission is putting the finishing touches on a plan that, if implemented as expected, could allow some offenders considered low risk to get shorter prison sentences than they would otherwise or avoid incarceration entirely. Those deemed high risk could spend more time behind bars.

Pennsylvania, which already uses risk assessment in other phases of its criminal justice system, is considering the approach in sentencing because it is struggling with an unwieldy and expensive corrections system. Pennsylvania has roughly 50,000 people in state custody, 2,000 more than it has permanent beds for. Thousands more are in local jails, and hundreds of thousands are on probation or parole. The state spends $2 billion a year on its corrections system — more than 7 percent of the total state budget, up from less than 2 percent 30 years ago. Yet recidivism rates remain high: 1 in 3 inmates is arrested again or reincarcerated within a year of being released.

States across the country are facing similar problems — Pennsylvania’s incarceration rate is almost exactly the national average — and many policymakers see risk assessment as an attractive solution. Moreover, the approach has bipartisan appeal: Among some conservatives, risk assessment appeals to the desire to spend tax dollars on locking up only those criminals who are truly dangerous to society. And some liberals hope a data-driven justice system will be less punitive overall and correct for the personal, often subconscious biases of police, judges and probation officers. In theory, using risk assessment tools could lead to both less incarceration and less crime.

There are more than 60 risk assessment tools in use across the U.S., and they vary widely. But in their simplest form, they are questionnaires — typically filled out by a jail staff member, probation officer or psychologist — that assign points to offenders based on anything from demographic factors to family background to criminal history. The resulting scores are based on statistical probabilities derived from previous offenders’ behavior. A low score designates an offender as “low risk” and could result in lower bail, less prison time or less restrictive probation or parole terms; a high score can lead to tougher sentences or tighter monitoring.

The risk assessment trend is controversial. Critics have raised numerous questions: Is it fair to make decisions in an individual case based on what similar offenders have done in the past? Is it acceptable to use characteristics that might be associated with race or socioeconomic status, such as the criminal record of a person’s parents? And even if states can resolve such philosophical questions, there are also practical ones: What to do about unreliable data? Which of the many available tools — some of them licensed by for-profit companies — should policymakers choose?…(More)”
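
The questionnaire approach described above can be sketched as a simple points table. This is only a minimal illustration: the factor names, point values and cut-off below are hypothetical and are not taken from any real risk assessment tool, whose weights would be derived from statistical analysis of previous offenders' behavior.

```python
# Hypothetical points table. Factor names and values are invented for
# illustration and do not come from any real instrument.
POINTS = {
    "under_25": 2,
    "unemployed": 1,
    "per_prior_conviction": 1,
}
MAX_PRIOR_POINTS = 3   # assumed cap on points from prior convictions
HIGH_RISK_CUTOFF = 4   # assumed threshold separating "low" from "high"


def risk_score(age, employed, prior_convictions):
    """Add up questionnaire points for one person."""
    score = 0
    if age < 25:
        score += POINTS["under_25"]
    if not employed:
        score += POINTS["unemployed"]
    prior_points = prior_convictions * POINTS["per_prior_conviction"]
    score += min(prior_points, MAX_PRIOR_POINTS)
    return score


def risk_label(score):
    """Map a numeric score to the label a decision-maker would see."""
    return "high risk" if score >= HIGH_RISK_CUTOFF else "low risk"
```

Under these made-up weights, a 40-year-old with a job and no record scores 0 ("low risk"), while an unemployed 22-year-old with five prior convictions scores 6 ("high risk"). The sketch also shows why critics focus on the inputs: whichever factors go into the table determine who ends up above the cut-off.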

Flawed Humans, Flawed Justice


Adam Benforado in the New York Times on using “…lessons from behavioral science to make police and courts more fair…. WHAT would it take to achieve true criminal justice in America?

Imagine that we got rid of all of the cops who cracked racist jokes and prosecutors blinded by a thirst for power. Imagine that we cleansed our courtrooms of lying witnesses and foolish jurors. Imagine that we removed every judge who thought the law should bend to her own personal agenda and every sadistic prison guard.

We would certainly feel just then. But we would be wrong.

We would still have unarmed kids shot in the back and innocent men and women sentenced to death. We would still have unequal treatment, disregarded rights and profound mistreatment.

The reason is simple and almost entirely overlooked: Our legal system is based on an inaccurate model of human behavior. Until recently, we had no way of understanding what was driving people’s thoughts, perceptions and actions in the criminal arena. So, we built our institutions on what we had: untested assumptions about what deceit looks like, how memories work and when punishment is merited.

But we now have tools — from experimental methods and data collection approaches to brain-imaging technologies — that provide an incredible opportunity to establish a new and robust foundation.

Our justice system must be reconstructed upon scientific fact. We can start by acknowledging what the data says about the fundamental flaws in our current legal processes and structures.

Consider the evidence that we treat as nearly unassailable proof of guilt at trial — an unwavering eyewitness, a suspect’s signed confession or a forensic match to the crime scene.

While we charge tens of thousands of people with crimes each year after they are identified in police lineups, research shows that eyewitnesses chose an innocent person roughly one-third of the time. Our memories can fail us because we’re frightened. They can be altered by the word choice of a detective. They can be corrupted by previously seeing someone’s image on a social media site.

Picking out lying suspects from their body language is ineffective. And trying then to gain a confession by exaggerating the strength of the evidence and playing down the seriousness of the offense can encourage people to admit to terrible things they didn’t do.

Even seemingly objective forensic analysis is far from incorruptible. Recent data shows that fingerprint — and even DNA — matches are significantly more likely when the forensic expert is aware that the sample comes from someone the police believe is guilty.

With the aid of psychology, we see there’s a whole host of seemingly extraneous forces influencing behavior and producing systematic distortions. But they remain hidden because they don’t fit into our familiar legal narratives.

We assume that the specific text of the law is critical to whether someone is convicted of rape, but research shows that the details of the criminal code — whether it includes a “force” requirement or excuses a “reasonably mistaken” belief in consent — can be irrelevant. What matters are the backgrounds and identities of the jurors.

When a black teenager is shot by a police officer, we expect to find a bigot at the trigger.

But studies suggest that implicit bias, rather than explicit racism, is behind many recent tragedies. Indeed, simulator experiments show that the biggest danger posed to young African-American men may not be hate-filled cops, but well-intentioned police officers exposed to pervasive, damaging stereotypes that link the concepts of blackness and violence.

Likewise, Americans have been sold a myth that there are two kinds of judges — umpires and activists — and that being unbiased is a choice that a person makes. But the truth is that all judges are swayed by countless forces beyond their conscious awareness or control. It should have no impact on your case, for instance, whether your parole hearing is scheduled first thing in the morning or right before lunch, but when scientists looked at real parole boards, they found that judges were far more likely to grant petitions at the beginning of the day than they were midmorning.

The choice of where to place the camera in an interrogation room may seem immaterial, yet experiments show that it can affect whether a confession is determined to be coerced. When people watch a recording with the camera behind the detective, they are far more likely to find that the confession was voluntary than when watching the interactions from the perspective of the suspect.

With such challenges to our criminal justice system, what can possibly be done? The good news is that an evidence-based approach also illuminates the path forward.

Once we have clear data that something causes a bias, we can then figure out how to remove that influence. …(More)”

The Missing Statistics of Criminal Justice


Matt Ford at the Atlantic: “An abundance of data has fueled the reform movement, but from prisons to prosecutors, crucial questions remain unquantified.

After Ferguson, a noticeable gap in criminal-justice statistics emerged: the use of lethal force by the police. The federal government compiles a wealth of data on homicides, burglaries, and arson, but no official, reliable tabulation of civilian deaths by law enforcement exists. A partial database kept by the FBI is widely considered to be misleading and inaccurate. (The Washington Post has just released a more expansive total of nearly 400 police killings this year.) “It’s ridiculous that I can’t tell you how many people were shot by the police last week, last month, last year,” FBI Director James Comey told reporters in April.

This raises an obvious question: If the FBI can’t tell how many people were killed by law enforcement last year, what other kinds of criminal-justice data are missing? Statistics are more than just numbers: They focus the attention of politicians, drive the allocation of resources, and define the public debate. Public officials—from city councilors to police commanders to district attorneys—are often evaluated based on how these numbers change during their terms in office. But existing statistical measures only capture part of the overall picture, and the problems that go unmeasured are often also unaddressed. What changes could the data that isn’t currently collected produce if it were gathered?….

Without reliable official statistics, scholars often must gather and compile necessary data themselves. “A few years ago, I was struck at how many police killings of civilians we seemed to be having in Philadelphia,” Gottschalk said as an example. “They would be buried in the newspaper, and I was stunned by how difficult it was to compile that information and compare it to New York and do it on a per-capita basis. It wasn’t readily available.” As a result, criminal-justice researchers often spend more time gathering data than analyzing it.

This data’s absence shapes the public debate over mass incarceration in the same way that silence between notes of music gives rhythm to a song. Imagine debating the economy without knowing the unemployment rate, or climate change without knowing the sea level, or healthcare reform without knowing the number of uninsured Americans. Legislators and policymakers heavily rely on statistics when crafting public policy. Criminal-justice statistics can also influence judicial rulings, including those by the Supreme Court, with implications for the entire legal system.

Beyond their academic and policymaking value, there’s also a certain power to statistics. They have the irreplaceable ability to both clarify social issues and structure the public’s understanding of them. A wealth of data has allowed sociologists, criminologists, and political scientists to diagnose serious problems with the American criminal-justice system over the past twenty years. Now that a growing bipartisan consensus recognizes the problem exists, gathering the right facts and figures could help point the way towards solutions…(More)”
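
The per-capita comparison mentioned in the excerpt above is a small piece of arithmetic once the counts exist. The sketch below assumes a rate per 100,000 residents, a common convention; the function name and the figures in the usage note are hypothetical and are not real counts for any city.

```python
def rate_per_100k(count, population):
    """Return the number of incidents per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Multiply before dividing to keep round inputs exact.
    return count * 100_000 / population


def compare_cities(counts, populations):
    """Return {city: rate per 100,000} for every city present in both inputs."""
    return {
        city: rate_per_100k(counts[city], populations[city])
        for city in counts
        if city in populations
    }
```

For example, with invented inputs of 15 incidents in a city of 1,500,000 and 40 incidents in a city of 8,000,000, the rates are 1.0 and 0.5 per 100,000. The smaller city has fewer incidents but twice the rate, which is the kind of comparison raw counts hide. The hard part, as the excerpt notes, is obtaining reliable counts in the first place.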

Opening Criminal Justice Data


Sunlight Foundation: “As part of a new initiative, the Sunlight Foundation has begun amassing an inventory of public and privately-produced criminal justice data. The spreadsheet on this page is a work in progress but we’re publishing it now with hopes that people can use it for research or reporting and even contribute to it. Please go through the spreadsheet — so far we have an inventory started with information from 26 states and the federal government. When we’re done, we’ll have an inventory of data from all 50 states and the District of Columbia. You can read more about this project, submit your own work and feedback below….(More)”

Dubai detectives to get Google Glass to fight crime


Reuters: “Dubai police plan to issue detectives with Google Glass hands-free eyewear to help them fight crime using facial recognition technology, a police spokesman in the wealthy Gulf Arab emirate said.
The wearable device consists of a tiny computer screen mounted in the corner of an eyeglass frame and is capable of taking photos, recording video and playing sound. The spokesman confirmed a report in Dubai’s 7 Days newspaper that software developed by Dubai police would enable a connection between the wearer and a database of wanted people. Once the device “recognized” a suspect based on a face print, it would alert the officer wearing the gadget.
The gadget would be used in a first phase to combat traffic violations and track vehicles suspected of involvement in motoring offences. A second phase would see the technology rolled out to detectives, the spokesman said.
The U.S. Internet company said in a blogpost in May that anyone in the United States could buy the gadget for $1,500.
Dubai’s decision appears in line with the authorities’ determination to spare no expense in equipping the police…”

Civic Works Project translates data into community tools


The blog of the John S. and James L. Knight Foundation: “The Civic Works Project is a two-year effort to create apps and other tools to help increase the utility of local government data to benefit community organizations and the broader public.
This project looks systemically at public and private information that can be used to engage residents, solve community problems and increase government accountability. We believe that there is a new frontier where information can be used to improve public services and community building efforts that benefit local residents.
Through the Civic Works Project, we’re seeking to improve access to information and identify solutions to problems facing diverse communities. Uncovering the value of data—and the stories behind it—can enhance the provision of public services through the smart application of technology.
Here’s some of what we’ve accomplished.
Partnership with WBEZ Public Data Blog
The WBEZ Public Data Blog is dedicated to examining and promoting civic data in Chicago, Cook County and Illinois. WBEZ is partnering with the Smart Chicago Collaborative to provide news and analysis on open government by producing content items that explain and tell stories hidden in public data. The project seeks to increase the utility, understanding, awareness and availability of local civic data. It comprises blog postings on the hidden uses of data and stories from the data, while including diverse voices and discussions on how innovations can improve civic life. It also features interviews with community organizations, businesses, government leaders and residents on challenges that could be solved through more effective use of public data.
Crime and Punishment in Chicago
The Crime and Punishment in Chicago project will provide an index of data sources regarding the criminal justice system in Chicago. This site will aggregate sources of data, how this data is generated, how to get it and what data is unavailable.
Illinois OpenTech Challenge
The Illinois Open Technology Challenge aims to bring governments, developers and communities together to create digital tools that use public data to serve today’s civic needs and promote economic development. Smart Chicago and our partners worked with government officials to publish 138 new datasets (34 in Champaign, 15 in Rockford, 12 in Belleville, and 77 from the 42 municipalities in the South Suburban Mayors and Managers Association) on the State of Illinois data portal. Smart Chicago has worked with developers in meet-ups all over the state—in six locations in four cities with 149 people. The project has also allowed Smart Chicago to conduct outreach in each of our communities to reach regular residents with needs that can be addressed through data and technology.
LocalData + SWOP
The LocalData + SWOP project is part of our effort to help bridge technology gaps in high-capacity organizations. This effort helps the Southwest Organizing Project collect information about vacant and abandoned housing using the LocalData tool.
Affordable Care Act Outreach App
With the ongoing implementation of the Affordable Care Act, community organizations such as LISC-Chicago have been hard at work providing navigators to help residents register through the healthcare.gov site.
Currently, LISC-Chicago organizers are in neighborhoods contacting residents and encouraging them to go to their closest Center for Working Families. Using a combination of software, such as Wufoo and Twilio, Smart Chicago is helping LISC with its outreach by building a tool that enables organizers to send text reminders to sign up for health insurance to residents.
Texting Tools: Twilio and Textizen
Smart Chicago is expanding the Affordable Care Act outreach project to engage residents in other ways using SMS messaging.
Smart Chicago is also a local provider for Textizen, an SMS-based survey tool that civic organizations can use to obtain resident feedback. Organizations can create a survey campaign and then place the survey options on posters, postcards or screens during live events. They can then receive real-time feedback as people text in their answers.
WikiChicago
WikiChicago will be a hyper-local Wikipedia-like website that anyone can edit. For this project, Smart Chicago is partnering with the Chicago Public Library to feature local authors and books about Chicago, and to publish more information about Chicago’s rich history.”

The Moneyball Effect: How smart data is transforming criminal justice, healthcare, music, and even government spending


TED: “When Anne Milgram became the Attorney General of New Jersey in 2007, she was stunned to find out just how little data was available on who was being arrested, who was being charged, who was serving time in jails and prisons, and who was being released. “It turns out that most big criminal justice agencies like my own didn’t track the things that matter,” she says in today’s talk, filmed at TED@BCG. “We didn’t share data, or use analytics, to make better decisions and reduce crime.”
Milgram’s idea for how to change this: “I wanted to moneyball criminal justice.”
Moneyball, of course, is the name of a 2011 movie starring Brad Pitt and the book it’s based on, written by Michael Lewis in 2003. The term refers to a practice adopted by the Oakland A’s general manager Billy Beane in 2002 — the organization began basing decisions not on star power or scout instinct, but on statistical analysis of measurable factors like on-base and slugging percentages. This worked exceptionally well. On a tiny budget, the Oakland A’s made it to the playoffs in 2002 and 2003, and — since then — nine other major league teams have hired sabermetric analysts to crunch these types of numbers.
Milgram is working hard to bring smart statistics to criminal justice. To hear the results she’s seen so far, watch this talk. And below, take a look at a few surprising sectors that are getting the moneyball treatment as well.

Moneyballing music. Last year, Forbes magazine profiled the firm Next Big Sound, a company using statistical analysis to predict how musicians will perform in the market. The idea is that — rather than relying on the instincts of A&R reps — past performance on Pandora, Spotify, Facebook, etc. can be used to predict future potential. The article reads, “For example, the company has found that musicians who gain 20,000 to 50,000 Facebook fans in one month are four times more likely to eventually reach 1 million. With data like that, Next Big Sound promises to predict album sales within 20% accuracy for 85% of artists, giving labels a clearer idea of return on investment.”
Moneyballing human resources. In November, The Atlantic took a look at the practice of “people analytics” and how it’s affecting employers. (Billy Beane had something to do with this idea — in 2012, he gave a presentation at the TLNT Transform Conference called “The Moneyball Approach to Talent Management.”) The article describes how Bloomberg reportedly logs its employees’ keystrokes and the casino, Harrah’s, tracks employee smiles. It also describes where this trend could be going — for example, how a video game called Wasabi Waiter could be used by employers to judge potential employees’ ability to take action, solve problems and follow through on projects. The article looks at the ways these types of practices are disconcerting, but also how they could level an inherently unequal playing field. After all, the article points out that gender, race, age and even height biases have been demonstrated again and again in our current hiring landscape.
Moneyballing healthcare. Many have wondered: what about a moneyball approach to medicine? (See this call out via Common Health, this piece in Wharton Magazine or this op-ed on The Huffington Post from the President of the New York State Health Foundation.) In his TED Talk, “What doctors can learn from each other,” Stefan Larsson proposed an idea that feels like something of an answer to this question. In the talk, Larsson gives a taste of what can happen when doctors and hospitals measure their outcomes and share this data with each other: they are able to see which techniques are proving the most effective for patients and make adjustments. (Watch the talk for a simple way surgeons can make hip surgery more effective.) He imagines a continuous learning process for doctors — that could transform the healthcare industry to give better outcomes while also reducing cost.
Moneyballing government. This summer, John Bridgeland (the director of the White House Domestic Policy Council under President George W. Bush) and Peter Orszag (the director of the Office of Management and Budget in Barack Obama’s first term) teamed up to pen a provocative piece for The Atlantic called, “Can government play moneyball?” In it, the two write, “Based on our rough calculations, less than $1 out of every $100 of government spending is backed by even the most basic evidence that the money is being spent wisely.” The two explain how, for example, there are 339 federally-funded programs for at-risk youth, the grand majority of which haven’t been evaluated for effectiveness. And while many of these programs might show great results, some that have been evaluated show troubling results. (For example, Scared Straight has been shown to increase criminal behavior.) Yet, some of these ineffective programs continue because a powerful politician champions them. While Bridgeland and Orszag show why Washington is so averse to making data-based appropriation decisions, the two also see the ship beginning to turn around. They applaud the Obama administration for a 2014 budget with an “unprecedented focus on evidence and results.” The pair also gave a nod to the nonprofit Results for America, which advocates that for every $99 spent on a program, $1 be spent on evaluating it. The pair even suggest a “Moneyball Index” to encourage politicians not to support programs that don’t show results.
In any industry, figuring out what to measure, how to measure it and how to apply the information gleaned from those measurements is a challenge. Which of the applications of statistical analysis has you the most excited? And which has you the most terrified?”
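
For readers unfamiliar with the two baseball measures named in the excerpt above, the sketch below computes them from their standard definitions: on-base percentage is times on base divided by plate appearances that count toward it, and slugging percentage is total bases divided by at-bats. The function and parameter names are illustrative, and returning 0.0 for an empty denominator is an assumption made for convenience.

```python
def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sacrifice_flies):
    """OBP = (H + BB + HBP) / (AB + BB + HBP + SF)."""
    denominator = at_bats + walks + hit_by_pitch + sacrifice_flies
    if denominator == 0:
        return 0.0  # assumed convention for a player with no appearances
    return (hits + walks + hit_by_pitch) / denominator


def slugging_percentage(singles, doubles, triples, home_runs, at_bats):
    """SLG = total bases / at-bats, weighting each hit by bases gained."""
    if at_bats == 0:
        return 0.0  # assumed convention for a player with no at-bats
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats
```

Both measures are simple ratios of counts that every team already recorded, which is part of the moneyball point: the analysis did not need new data so much as a decision to act on the data already in hand.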

Safety Datapalooza Shows Power of Data.gov Communities


Lisa Nelson at DigitalGov: “The White House Office of Public Engagement held the first Safety Datapalooza illustrating the power of Data.gov communities. Federal Chief Technology Officer Todd Park and Deputy Secretary of Transportation John Porcari hosted the event, which touted the data available on Safety.Data.gov and the community of innovators using it to make effective tools for consumers.
The event showcased many of the tools that have been produced as a result of opening this safety data, including:

  • PulsePoint, from the San Ramon Fire Protection District, a lifesaving mobile app that allows CPR-trained volunteers to be notified if someone nearby is in need of emergency assistance;
  • Commute and crime maps, from Trulia, allow home buyers to choose their new residence based on two important everyday factors; and
  • Hurricane App, from the American Red Cross, to monitor storm conditions, prepare your family and home, find help, and let others know you’re safe even if the power is out.

Safety data is far from alone in generating innovative ideas and gathering a community of developers and entrepreneurs. Data.gov currently has 16 different topically diverse communities on land and sea — the Cities and Oceans communities being two such examples. Data.gov’s communities are a virtual meeting spot for interested parties across government, academia and industry to come together and put the data to use. Data.gov enables a whole set of tools to make these communities come to life: apps, blogs, challenges, forums, ranking, rating and wikis.
For a summary of the Safety Datapalooza visit Transportation’s “Fast Lane” blog.”

The Wise Way to Crowdsource a Manhunt


in the New Yorker: “If Reddit were looking for a model to follow, it could use NASA’s Clickworkers experiment, which in 2000-01 let tens of thousands of amateurs look at photos of Mars in order to identify craters on the planet and classify them by age. That study found that the aggregated judgments of the amateur “clickworkers” were “virtually indistinguishable from the inputs of a geologist with years of experience.”
The problem from Reddit’s perspective, of course, is that this method of sleuthing would be far less exciting for users, and would probably generate less traffic, than its current free-for-all approach. The point of the “find-the-bombers” subthread, after all, wasn’t just to find the bombers—it was also to connect and talk with others, and to feel like you were part of a virtual community. But valuable as that experience may have been for users, it also diminished the chances of the community coming up with useful information. Reddit has done an excellent job of being engaging. Now it needs to figure out if it wants to be effective.”
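
One way to picture the "aggregated judgments" idea in the excerpt above is a majority vote over independent labels. The sketch below is an assumption for illustration only: the excerpt does not say how the Clickworkers study combined its inputs, and the function name, the minimum-vote rule and the sample labels are all hypothetical.

```python
from collections import Counter


def aggregate_labels(labels_by_item, min_votes=3):
    """Return the majority label for each item that has enough agreement.

    Items with fewer than `min_votes` labels, or with no label chosen by
    more than half of the contributors, are left out of the result.
    """
    consensus = {}
    for item, labels in labels_by_item.items():
        if len(labels) < min_votes:
            continue  # too few independent judgments to trust
        label, votes = Counter(labels).most_common(1)[0]
        if votes * 2 > len(labels):  # strict majority required
            consensus[item] = label
    return consensus
```

With sample input such as `{"photo1": ["crater", "crater", "not crater"], "photo2": ["crater"]}`, only `photo1` gets a consensus label, because `photo2` has a single vote. The design choice that matters is independence: each contributor labels without seeing the others' answers, which is the opposite of an open discussion thread where early guesses shape later ones.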