Identifying Clusters of Active Transportation Using Spatial Scan Statistics

Statistical Research and Applications Branch, Division of Cancer Control and Population Sciences, National Cancer Institute, Bethesda, Maryland, USA.
American Journal of Preventive Medicine (Impact Factor: 4.53). 09/2009; 37(2):157-66. DOI: 10.1016/j.amepre.2009.04.021
Source: PubMed


There is intense interest in the possibility that neighborhood characteristics influence active transportation such as walking or biking. The purpose of this paper is to illustrate how a spatial cluster identification method can evaluate the geographic variation of active transportation and identify neighborhoods with unusually high or low levels of active transportation.
Self-reported walking/biking prevalence, demographic characteristics, street connectivity variables, and neighborhood socioeconomic data were collected from respondents to the 2001 California Health Interview Survey (CHIS; N=10,688) in Los Angeles County (LAC) and San Diego County (SDC). Spatial scan statistics were used to identify clusters of high or low walking/biking prevalence (with and without age adjustment) and of high or low amounts of time spent walking and biking. The data, a subset from the 2001 CHIS, were analyzed in 2007-2008.
Geographic clusters of significantly high or low prevalence of walking and biking were detected in LAC and SDC. Structural variables such as street connectivity and shorter block lengths are consistently associated with higher levels of active transportation, but associations between active transportation and socioeconomic variables at the individual and neighborhood levels are mixed. Only one cluster with less time spent walking and biking among walkers/bikers was detected in LAC, and this was of borderline significance. Age-adjustment affects the clustering pattern of walking/biking prevalence in LAC, but not in SDC.
The use of spatial scan statistics to identify significant clustering of health behaviors such as active transportation complements traditional regression analysis, which examines associations between behavior and environmental factors, by identifying specific geographic areas with unusual levels of the behavior, independent of predefined administrative units.
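
The abstract does not describe the scan statistic itself, so the following is only a minimal sketch of one common variant: a Kulldorff-style scan under a Bernoulli model (each respondent either walks/bikes or does not) with circular windows centered on each respondent location. The function names, the 50% maximum window size, the Monte Carlo settings, and the toy grid are all assumptions made for illustration; they are not the authors' configuration, and tied distances are handled only by sort order.

```python
import math
import random

def bernoulli_llr(c, n, C, N):
    """Log-likelihood ratio for a window holding c cases among n people,
    given C cases among N people overall (Bernoulli model)."""
    if n == 0 or n == N:
        return 0.0

    def xlogy(x, y):
        # Convention: 0 * log(0) = 0
        return x * math.log(y) if x > 0 else 0.0

    p_in = c / n
    p_out = (C - c) / (N - n)
    ll_alt = (xlogy(c, p_in) + xlogy(n - c, 1 - p_in)
              + xlogy(C - c, p_out) + xlogy((N - n) - (C - c), 1 - p_out))
    ll_null = xlogy(C, C / N) + xlogy(N - C, 1 - C / N)
    return max(0.0, ll_alt - ll_null)

def scan(points, cases, max_frac=0.5):
    """Return (best_llr, member_indices, 'high' or 'low') over circular windows
    grown around every point, up to max_frac of the population."""
    N, C = len(points), sum(cases)
    best = (0.0, [], None)
    for cx, cy in points:
        # Indices ordered by squared distance from the window center
        order = sorted(range(N),
                       key=lambda j: (points[j][0] - cx) ** 2 + (points[j][1] - cy) ** 2)
        c = 0
        for n, j in enumerate(order, start=1):
            if n > max_frac * N:
                break
            c += cases[j]
            llr = bernoulli_llr(c, n, C, N)
            if llr > best[0]:
                kind = "high" if c / n > C / N else "low"
                best = (llr, order[:n], kind)
    return best

def monte_carlo_p(points, cases, n_sim=99, seed=0):
    """Permutation p-value: shuffle case labels over fixed locations and
    compare each replicate's maximum LLR with the observed maximum."""
    rng = random.Random(seed)
    observed = scan(points, cases)[0]
    shuffled = list(cases)
    exceed = 0
    for _ in range(n_sim):
        rng.shuffle(shuffled)
        if scan(points, shuffled)[0] >= observed:
            exceed += 1
    return (exceed + 1) / (n_sim + 1)

# Toy data (hypothetical): an 8x8 grid where only one 3x3 corner block is "active".
grid = [(x, y) for x in range(8) for y in range(8)]
active = [1 if x < 3 and y < 3 else 0 for x, y in grid]
llr, members, kind = scan(grid, active)
p_value = monte_carlo_p(grid, active)
```

Because the windows are built from point locations rather than from census tracts or ZIP codes, the most likely cluster found this way does not depend on predefined administrative units. An age-adjusted analysis would need covariate-adjusted expected counts, which this sketch omits.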

Available from: Linda Pickle,
    • "Even though STSS-based approaches have commonly been used in epidemiology to detect disease outbreaks (Kulldorff et al. 2005; Neill et al. 2005), in crime science to detect crime hotspots (Maciejewski et al. 2010; Nakaya and Yano 2010) amongst others (SaTScan 2010); their investigation in transportation science is a recent research endeavour. According to the best of our knowledge, only Huang et al. (2009) investigated the use of spatial scan statistics to detect clusters of active transportation (i.e. walking, cycling); however, they have not considered the temporal aspect of the phenomenon. "
    ABSTRACT: This paper proposes two novel methods for non-recurrent congestion (NRC) event detection on heterogeneous urban road networks based on link journey time (LJT) estimates. Heterogeneity exists on urban road networks in two main aspects: variation in link lengths and data quality. The proposed NRC detection methods are referred to as percentile-based NRC detection and space–time scan statistics (STSS) based NRC detection. Both of these methods capture the heterogeneity of an urban road network by modelling the LJTs with a lognormal distribution. Empirical analyses are conducted on London's urban road network consisting of 424 links for the 20 weekdays of October 2010. Various parameter settings are tested for both of the methods, and the results favour STSS-based NRC detection method over the percentile-based NRC detection method. Link-based analyses demonstrate the effectiveness of the proposed methods in capturing the heterogeneity of the analysed road network.
    09/2015; 11(9):1-33. DOI:10.1080/23249935.2015.1087229
    • "However, these large spatial clusters will cover a large area with a larger and more heterogeneous population. Conversely, clusters generated using smaller circular windows will produce smaller clusters but will contain a more homogeneous population which can help policy makers in planning more focused community interventions [48,49]. For example, Fang et al. used circular windows no more than 20% to identify hemorrhagic fever with renal syndrome clusters and smaller circular windows no more than 10% to identify possible subclusters for more efficient resource allocation for preventing hemorrhagic fever with renal syndrome [41]. "
    ABSTRACT: Late antenatal care and smoking during pregnancy are two important factors that are amenable to intervention. Despite the adverse health impacts of smoking during pregnancy and the health benefits of early first antenatal visit on both the mother and the unborn child, substantial proportions of women still smoke during pregnancy or have their first antenatal visit after 10 weeks gestation. This study was undertaken to assess the usefulness of geospatial methods in identifying communities at high risk of smoking during pregnancy and timing of the first antenatal visit, for which targeted interventions may be warranted, and more importantly, feasible. The Perinatal Data Collection, from 1999 to 2008 for south-western Sydney, were obtained from the New South Wales Ministry of Health. Maternal addresses at the time of delivery were georeferenced. A spatial scan statistic implemented in SaTScan was then used to identify statistically significant spatial clusters of women who smoked during pregnancy or women whose first antenatal care visit occurred at or after 10 weeks of pregnancy. Four spatial clusters of maternal smoking during pregnancy and four spatial clusters of first antenatal visit occurring at or after 10 weeks were identified in our analyses. In the maternal smoking during pregnancy clusters, higher proportions of mothers were aged less than 35 years and had their first antenatal visit at or after 10 weeks, and a lower proportion of mothers were primiparous. For the clusters of increased risk of late first antenatal visit at or after 10 weeks of gestation, a higher proportion of mothers lived in the most disadvantaged areas and a lower proportion of mothers were primiparous. The application of spatial analyses provides a means to identify spatial clusters of antenatal risk factors and to investigate the associated socio-demographic characteristics of the clusters.
    International Journal of Health Geographics 10/2013; 12(1):46. DOI:10.1186/1476-072X-12-46 · 2.62 Impact Factor
    • "Third, spatial data resolution and process aetiology are two inherently related themes that uniquely impact the design and results of research. It has been suggested in papers from this review that some levels of data resolution are not representative of the aetiology of some processes (Huang et al., 2009; Vieira et al., 2009). To answer this call, work on data collection techniques and spatial cluster analysis methods that incorporate a notion of an individual's activity space and mobility (Orellana and Wachowicz, 2011; Zenk et al., 2011), or residential history (Meliker and Sloan, 2011), should be encouraged throughout the disciplines involved with spatial cluster analysis. "
    ABSTRACT: Spatial cluster analysis is a uniquely interdisciplinary endeavour, and so it is important to communicate and disseminate ideas, innovations, best practices and challenges across practitioners, applied epidemiology researchers and spatial statisticians. In this research we conducted a scoping review to systematically search peer-reviewed journal databases for research that has employed spatial cluster analysis methods on individual-level, address location, or x and y coordinate derived data. To illustrate the thematic issues raised by our results, methods were tested using a dataset where known clusters existed. Point pattern methods, spatial clustering and cluster detection tests, and a locally weighted spatial regression model were most commonly used for individual-level, address location data (n = 29). The spatial scan statistic was the most popular method for address location data (n = 19). Six themes were identified relating to the application of spatial cluster analysis methods and subsequent analyses, which we recommend that researchers consider: exploratory analysis, visualization, spatial resolution, aetiology, scale and spatial weights. It is our intention that researchers seeking direction for using spatial cluster analysis methods consider the caveats and strengths of each approach, but also explore the numerous other methods available for this type of analysis. Applied spatial epidemiology researchers and practitioners should give special consideration to applying multiple tests to a dataset. Future research should focus on developing frameworks for selecting appropriate methods and the corresponding spatial weighting schemes.
    Geospatial Health 05/2013; 7(2):183-98. DOI:10.4081/gh.2013.79 · 1.19 Impact Factor