The application of a currently proposed differential privacy algorithm to the 2020 United States Census data and additional data products may affect the usefulness of these data, the accuracy of estimates and rates derived from them, and critical knowledge about social phenomena such as health disparities. We test the ramifications of applying differential privacy to released data by studying estimates of US mortality rates for the overall population and three major racial/ethnic groups. We ask how changes in the denominators of these vital rates due to the implementation of differential privacy can lead to biased estimates. We situate where these changes are most likely to matter by disaggregating biases by population size, degree of urbanization, and adjacency to a metropolitan area. Our results suggest that differential privacy will more strongly affect mortality rate estimates for non-Hispanic blacks and Hispanics than estimates for non-Hispanic whites. We also find significant changes in estimated mortality rates for less populous areas, with more pronounced changes when stratified by race/ethnicity. We find larger changes in estimated mortality rates for areas with lower levels of urbanization or adjacency to metropolitan areas, with these changes being greater for non-Hispanic blacks and Hispanics. These findings highlight the consequences of implementing differential privacy, as proposed, for research examining population composition, particularly mortality disparities across racial/ethnic groups and along the urban/rural continuum. Overall, they demonstrate the challenges in using the data products derived from the proposed disclosure avoidance methods, while highlighting critical instances where scientific understandings may be negatively impacted.
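The denominator effect described in the abstract can be illustrated with a minimal sketch. It does not implement the Census Bureau's disclosure avoidance algorithm; it applies the same hypothetical absolute perturbation to the population denominators of two areas and compares the relative change in a crude mortality rate. All counts, the perturbation size, and the function names are assumptions made for illustration only.

```python
def mortality_rate(deaths: int, population: float) -> float:
    """Crude mortality rate, expressed as deaths per 100,000 population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return deaths / population * 100_000


def relative_rate_change(deaths: int, true_pop: float, published_pop: float) -> float:
    """Relative change in the rate when a perturbed denominator replaces the true one.

    Assumes deaths > 0 so that the true rate is nonzero.
    """
    true_rate = mortality_rate(deaths, true_pop)
    published_rate = mortality_rate(deaths, published_pop)
    return (published_rate - true_rate) / true_rate


# Hypothetical areas with the same underlying rate (1,000 per 100,000) and the
# same absolute perturbation of +20 persons in the published denominator.
small_area = relative_rate_change(deaths=2, true_pop=200, published_pop=220)
large_area = relative_rate_change(deaths=2000, true_pop=200_000, published_pop=200_020)

print(f"small area: {small_area:+.2%}")
print(f"large area: {large_area:+.2%}")
```

Under these assumed numbers, the same absolute change in the denominator shifts the small-area rate by roughly nine percent and the large-area rate by about one hundredth of a percent, which is consistent with the abstract's finding that less populous areas and smaller subgroups see larger changes.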
|Original language||English (US)|
|Number of pages||8|
|Journal||Proceedings of the National Academy of Sciences of the United States of America|
|State||Published - Jun 16 2020|