Avoiding Garbage In – Garbage Out: Improving Administrative Data Quality for Research


Blog by : “In June, I presented the webinar ‘Improving Administrative Data Quality for Research and Analysis’ for members of the Association of Public Data Users (APDU). APDU is a national network that provides a venue to promote education, share news, and advocate on behalf of public data users.

The webinar served as a primer to help smaller organizations begin to use their data for research. Participants were given the tools to transform their administrative data into “research-ready” datasets.

I first reviewed seven major administrative data quality issues and discussed how they can affect research and analysis. For instance, incorrect value formats, an unclear unit of analysis, and duplicate records can make the data difficult to use. Invalid or inconsistent values lead to inaccurate analysis results. Missing or outlier values can produce results that are both inaccurate and biased. All these issues make the data less useful for research.

Next, I presented concrete strategies for reviewing the data to identify each of these quality issues. I also discussed several tips to make the data review process easier, faster, and more replicable. The most important of these tips are: (1) reviewing every variable in the data set, whether you expect problems or not, and (2) relying on data documentation to understand how the data should look….(More)”.
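
To make the two tips concrete, here is a minimal sketch of what a replicable review could look like in Python. It is not taken from the webinar. The codebook, the variable names (`client_id`, `age`, `status`), the allowed values, and the sample rows are all hypothetical, invented only for illustration. The idea is to write down what the documentation says each variable should look like, then check every variable against it, counting missing and invalid values and flagging duplicate record identifiers. A simple documented range stands in for an outlier check here; real outlier review would likely need more than that.

```python
from collections import Counter

# Hypothetical codebook: what the data documentation says each variable
# should look like. Names, ranges, and categories are assumptions.
CODEBOOK = {
    "client_id": {"type": str},
    "age": {"type": int, "min": 0, "max": 120},
    "status": {"type": str, "allowed": {"active", "closed"}},
}

# Hypothetical raw records, as they might come out of a CSV export.
SAMPLE_ROWS = [
    {"client_id": "A1", "age": "34", "status": "active"},
    {"client_id": "A2", "age": "", "status": "closed"},
    {"client_id": "A2", "age": "210", "status": "Closed"},
    {"client_id": "A3", "age": "n/a", "status": "active"},
]


def review_variable(values, spec):
    """Count missing and invalid values for one variable against its spec."""
    missing = 0
    invalid = 0
    for value in values:
        if value is None or value == "":
            missing += 1
            continue
        try:
            parsed = spec["type"](value)  # wrong format, e.g. "n/a" for an int
        except (TypeError, ValueError):
            invalid += 1
            continue
        if "allowed" in spec and parsed not in spec["allowed"]:
            invalid += 1  # inconsistent category, e.g. "Closed" vs "closed"
        elif "min" in spec and parsed < spec["min"]:
            invalid += 1  # below the documented range
        elif "max" in spec and parsed > spec["max"]:
            invalid += 1  # above the documented range
    return {"missing": missing, "invalid": invalid}


def review_dataset(rows, codebook, id_field):
    """Review every documented variable and list duplicated identifiers."""
    report = {
        name: review_variable([row.get(name) for row in rows], spec)
        for name, spec in codebook.items()
    }
    id_counts = Counter(row.get(id_field) for row in rows)
    duplicates = sorted(
        key for key, count in id_counts.items()
        if count > 1 and key not in (None, "")
    )
    return report, duplicates


report, duplicates = review_dataset(SAMPLE_ROWS, CODEBOOK, "client_id")
for name, counts in report.items():
    print(name, counts)
print("duplicate ids:", duplicates)
```

Because the loop runs over every entry in the codebook, no variable is skipped just because nobody expected a problem with it, and rerunning the script on a refreshed extract repeats exactly the same checks.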