TY - JOUR
T1 - Accuracy and repeatability of commercial geocoding
AU - Whitsel, Eric A.
AU - Rose, Kathryn M.
AU - Wood, Joy L.
AU - Henley, Amanda C.
AU - Liao, Duanping
AU - Heiss, Gerardo
N1 - Funding Information:
Grants from the National Heart, Lung, and Blood Institute (R01-HL64142-03) funded this study and supported Dr. Whitsel (5-T32-HL07055).
PY - 2004/11/15
Y1 - 2004/11/15
AB - The authors estimated accuracy and repeatability of commercial geocoding to guide vendor selection in the Life Course Socioeconomic Status, Social Context and Cardiovascular Disease study (2001-2002). They submitted 1,032 participant addresses (97% in Maryland, Minnesota, Mississippi, or North Carolina) to vendor A twice over 9 months and measured repeatability as agreement between levels of address matching, discordance (%) between statistical tabulation areas, and median distance (d, in meters) and bearing (θ, in degrees) between coordinates assigned on each occasion (H0: Σ_{i=1}^{n} θ_i/n = 180°). They also submitted 75 addresses of nearby air pollution monitors (77% urban/suburban; 69% residential/commercial) to vendors A and B and then measured accuracy by comparing vendor- and US Environmental Protection Agency (EPA)-assigned geocodes using the above measures. Repeatability of geocodes assigned by vendor A was high (kappa = 0.90; census block group discordance = 5%; d < 1 m; θ = 177°). The match rate for EPA monitor addresses was higher for vendor B versus A (88% vs. 76%), but discordance at census block group, tract, and county levels also was, respectively, 1.4-, 1.9-, and 5.0-fold higher for vendor B. Moreover, coordinates assigned by vendor B were further from those assigned by the EPA (d = 212 m vs. 149 m; θ = 131° vs. 171°). These findings suggest that match rates, repeatability, and accuracy should be used to guide vendor selection.
UR - http://www.scopus.com/inward/record.url?scp=8444239741&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=8444239741&partnerID=8YFLogxK
U2 - 10.1093/aje/kwh310
DO - 10.1093/aje/kwh310
M3 - Article
C2 - 15522859
AN - SCOPUS:8444239741
VL - 160
SP - 1023
EP - 1029
JO - American Journal of Epidemiology
JF - American Journal of Epidemiology
SN - 0002-9262
IS - 10
ER -