Geocoding public health data [Letter]

Virginia Commonwealth University, Richmond, Virginia, United States
American Journal of Public Health (Impact Factor: 4.55). 06/2003; 93(5):699; author reply 699-700. DOI: 10.2105/AJPH.93.5.699
Source: PubMed

Available from: Henry Joseph Carretta, Apr 11, 2014
  • Source
    • "ZCTAs are created from Census blocks, where the ZIP Code of the majority of residents in a Census block determines that block's membership within a ZCTA. The merits and detriments of the ZIP Code as a geocode have been discussed at length previously [5-10]."
    ABSTRACT: The need to estimate the distance from an individual to a service provider is common in public health research. However, estimated distances are often imprecise and, we suspect, biased due to a lack of specific residential location data. In many cases, to protect subject confidentiality, data sets contain only a ZIP Code or a county. This paper describes an algorithm, known as "the probabilistic sampling method" (PSM), which was used to create a distribution of estimated distances to a health facility for a person whose region of residence was known, but for which demographic details and centroids were known for smaller areas within the region. From this distribution, the median distance is the most likely distance to the facility. The algorithm, using Monte Carlo sampling methods, drew a probabilistic sample of all the smaller areas (Census blocks) within each participant's reported region (ZIP Code), weighting these areas by the number of residents in the same age group as the participant. To test the PSM, we used data from a large cross-sectional study that screened women at a clinic for intimate partner violence (IPV). We had data on each woman's age and ZIP Code, but no precise residential address. We used the PSM to select a sample of census blocks, then calculated network distances from each census block's centroid to the closest IPV facility, resulting in a distribution of distances from these locations to the geocoded locations of known IPV services. We selected the median distance as the most likely distance traveled and computed confidence intervals that describe the shortest and longest distance within which any given percent of the distance estimates lie. We compared our results to those obtained using two other geocoding approaches. We show that one method overestimated the most likely distance and the other underestimated it. Neither of the alternative methods produced confidence intervals for the distance estimates. 
    The algorithm was implemented in R code. The PSM has a number of benefits over traditional geocoding approaches. This methodology improves the precision of estimates of geographic access to services when complete residential address information is unavailable and, by computing the expected distribution of possible distances for any respondent and associated distance confidence limits, sensitivity analyses on distance access measures are possible. Faulty or imprecise distance measures may compromise decisions about service location and misdirect scarce resources.
    International Journal of Health Geographics 01/2011; 10(1):4. DOI:10.1186/1476-072X-10-4 · 2.62 Impact Factor
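
The sampling procedure described in the abstract above can be sketched roughly as follows. This is an illustrative Python sketch, not the authors' R implementation: the function names (`haversine_km`, `psm_distances`, `summarize`) and the input layout are hypothetical, and straight-line (haversine) distance stands in for the network distances the paper reports. Each Census block is given as a centroid plus a weight, where the weight is assumed to be the number of residents in the participant's age group.

```python
import random
import statistics
from math import radians, sin, cos, asin, sqrt


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km (a stand-in for network distance)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * 6371.0 * asin(min(1.0, sqrt(a)))


def psm_distances(blocks, facilities, n_draws=1000, seed=0):
    """Draw blocks within one ZIP Code with probability proportional to weight.

    blocks     -- list of (lat, lon, weight); weight is assumed to be the
                  block's resident count in the participant's age group
    facilities -- list of (lat, lon) for geocoded service locations
    Returns one distance (block centroid to nearest facility) per draw.
    """
    rng = random.Random(seed)
    weights = [w for _, _, w in blocks]
    sampled = rng.choices(blocks, weights=weights, k=n_draws)
    return [
        min(haversine_km(lat, lon, f_lat, f_lon) for f_lat, f_lon in facilities)
        for lat, lon, _ in sampled
    ]


def summarize(distances, level=0.95):
    """Median plus the empirical interval containing `level` of the draws."""
    ordered = sorted(distances)
    n = len(ordered)
    lower = ordered[int((1 - level) / 2 * (n - 1))]
    upper = ordered[int((1 + level) / 2 * (n - 1))]
    return statistics.median(ordered), lower, upper
```

Calling `summarize(psm_distances(blocks, facilities))` returns the median as the most likely distance, together with the lower and upper bounds of the interval. Blocks with a weight of zero are never drawn, which mirrors the idea of weighting by residents in the participant's age group.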
  • Source
    ABSTRACT: Geographic information systems have proven instrumental in assessing environmental impacts on individual and community health, but numerous methodological challenges are associated with analyses of highly localized phenomena in which spatially misaligned data are used. In a case study based on child care facility and traffic data for the Los Angeles metropolitan area, we assessed the extent of facility misclassification with spatially unreconciled data from 3 different governmental agencies in an attempt to identify child care centers in which young children are at risk from high concentrations of toxic vehicle-exhaust pollutants. Relative to geographically corrected data, unreconciled information produced a modest bias in terms of aggregated number of facilities at risk and a substantial number of false positives and negatives.
    American Journal of Public Health 04/2006; 96(3):499-504. DOI:10.2105/AJPH.2005.071373 · 4.55 Impact Factor
  • Source
    ABSTRACT: This research develops methods for determining the effect of geocoding quality on relationships between environmental exposures and health. The likelihood of detecting an existing relationship - statistical power - between measures of environmental exposures and health depends not only on the strength of the relationship but also on the level of positional accuracy and completeness of the geocodes from which the measures of environmental exposure are made. This paper summarizes the results of simulation studies conducted to examine the impact of inaccuracies of geocoded addresses generated by three types of geocoding processes: a) addresses located on orthophoto maps; b) addresses matched to TIGER files (U.S. Census or their derivative street files); and c) addresses from E-911 geocodes (developed by local authorities for emergency dispatch purposes). The simulated odds of disease using exposures modelled from the highest quality geocodes could be sufficiently recovered using other, more commonly used, geocoding processes such as TIGER and E-911; however, the strength of the odds relationship between disease exposures modelled at geocodes generally declined with decreasing geocoding accuracy. Although these specific results cannot be generalized to new situations, the methods used to determine the sensitivity of results can be used in new situations. Estimated measures of positional accuracy must be used in the interpretation of results of analyses that investigate relationships between health outcomes and exposures measured at residential locations. Analyses similar to those employed in this paper can be used to validate interpretation of results from empirical analyses that use geocoded locations with estimated measures of positional accuracy.
    International Journal of Health Geographics 02/2008; 7(1):13. DOI:10.1186/1476-072X-7-13 · 2.62 Impact Factor