Combining Racial Groups in Data Analysis Can Mask Important Differences in Communities


Blog by Jonathan Schwabish and Alice Feng: “Surveys, datasets, and published research often lump together racial and ethnic groups, which can erase the experiences of certain communities. Combining groups with different experiences can mask how specific groups and communities are faring and, in turn, affect how government funds are distributed, how services are provided, and how groups are perceived.

Large surveys that collect data on race and ethnicity are used to disburse government funds and services in a number of ways. The US Department of Housing and Urban Development, for instance, distributes millions of dollars annually to Native American tribes through the Indian Housing Block Grant. And statistics on race and ethnicity are used as evidence in employment discrimination lawsuits and to help determine whether banks are discriminating against people and communities of color.

Despite the potentially large effects these data can have, researchers don’t always disaggregate their analyses into more specific racial groups. Many point to small sample sizes as a limitation for including more race and ethnicity categories in their analysis, but efforts to gather more specific data and disaggregate available survey results are critical to creating better policy for everyone.

To illustrate how aggregating racial groups can mask important variation, we looked at the 2019 poverty rate across 139 detailed race categories in the Census Bureau’s annual American Community Survey (ACS). The ACS provides information that helps determine how more than $675 billion in government funds is distributed each year.

The official poverty rate in the United States stood at 10.5 percent in 2019, with significant variation across racial and ethnic groups. The primary question in the ACS concerning race includes 15 separate checkboxes, with space to print additional names or races for some options (a separate question refers to Hispanic or Latino origin).

Screenshot of the American Community Survey's race question

Although the survey offers ample latitude for interviewees to respond with their race, researchers have a tendency to aggregate racial categories. People who identify as Asian or Pacific Islander (API), for example, are often combined in economic analyses.

This aggregation can mask variation within racial or ethnic categories. As an example, one analysis that used the ACS showed 11 percent of children in the API group are in poverty, relative to 18 percent of the overall population. But that estimate could understate the poverty rate among children who identify as Pacific Islanders and could overstate the poverty rate among children who identify as Asian, which itself is a broad grouping that encompasses many different communities with various experiences. Similar aggregating can be found across economic literature, including on education, immigration (PDF), and wealth…”
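
The arithmetic behind this masking effect can be sketched in a few lines of Python. The two subgroups and their counts below are hypothetical, chosen only to make the calculation easy to follow; they are not ACS estimates and do not describe any real community. The point is that a combined rate is a population-weighted average, so it sits close to the rate of the larger subgroup and can sit far from the rate of the smaller one.

```python
# Hypothetical counts for illustration only; these are NOT ACS estimates.
groups = {
    "Subgroup A": {"children": 9_000, "in_poverty": 900},
    "Subgroup B": {"children": 1_000, "in_poverty": 250},
}


def poverty_rate(in_poverty, children):
    """Share of children in poverty for a single group."""
    return in_poverty / children


def combined_rate(groups):
    """Poverty rate when all groups are pooled into one category."""
    total_children = sum(g["children"] for g in groups.values())
    total_in_poverty = sum(g["in_poverty"] for g in groups.values())
    return total_in_poverty / total_children


# Disaggregated rates, one per subgroup.
rates = {
    name: poverty_rate(g["in_poverty"], g["children"])
    for name, g in groups.items()
}

# Aggregated rate for the pooled category.
combined = combined_rate(groups)

for name, rate in rates.items():
    print(f"{name}: {rate:.1%}")
print(f"Combined: {combined:.1%}")
```

With these made-up counts, the pooled rate lands between the two subgroup rates and much nearer to the larger subgroup's, so reporting only the pooled figure would understate poverty in the smaller subgroup.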