Abstract and Figures

Spatial statistics is a growing discipline providing important analytical techniques to a wide range of fields in the natural and social sciences. In the R package GWmodel, we introduce techniques from a particular branch of spatial statistics, termed geographically weighted (GW) models. GW models suit situations where data are not described well by some global model, but where there are spatial regions in which a suitably localised calibration provides a better description. The approach uses a moving window weighting technique, where localised models are found at target locations. Outputs are mapped to provide a useful exploratory tool for investigating the nature of spatial heterogeneity in the data. GWmodel includes: GW summary statistics, GW principal components analysis, GW regression, GW regression with a local ridge compensation, and GW regression for prediction; some of which are provided in basic and robust forms.
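As an illustration of the moving window weighting idea, the following minimal Python sketch computes a geographically weighted mean, one of the simplest GW summary statistics. It is not the GWmodel API: the function names, the Gaussian kernel choice, the bandwidth value and the toy data are all assumptions made for this example.

```python
import math

def gaussian_weight(distance, bandwidth):
    """Gaussian kernel: weight decays smoothly with distance from the target."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)

def gw_mean(coords, values, target, bandwidth):
    """Geographically weighted mean of `values` at the location `target`."""
    weights = [gaussian_weight(math.dist(c, target), bandwidth) for c in coords]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# Hypothetical toy data: values increase from west to east.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [1.0, 2.0, 3.0, 4.0]

# One local mean per target location; here the targets are the data points.
local_means = [gw_mean(coords, values, c, bandwidth=1.0) for c in coords]
global_mean = sum(values) / len(values)
```

The local means follow the west-to-east trend that the single global mean hides; as the bandwidth grows, every local mean approaches the global one.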
... This expression allows the β to vary with the location coordinates $(u_i, v_i)$, making the model spatially non-stationary; and allows β to be estimated via weighted least squares with the weights matrix obtained from a Gaussian kernel, attributing larger weights to values of predictors from more proximate locations. Analysis was undertaken using the package 'GWmodel' [46]. Standard errors were estimated using the bootstrap function, with coefficients re-estimated 1,000 times at each grid point. ...
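The weighted least squares step described here can be sketched in a few lines of Python. This is a simplified illustration with one predictor and a Gaussian kernel, not the code used in the cited study or in GWmodel; the function names, the bandwidth and the synthetic data (whose true slope rises from west to east) are assumptions.

```python
import math

def gaussian_weight(distance, bandwidth):
    """Gaussian kernel: nearer observations receive larger weights."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)

def local_wls(coords, x, y, target, bandwidth):
    """Weighted least squares fit of y = b0 + b1 * x centred on `target`.

    Returns (intercept, slope) for that single location.
    """
    w = [gaussian_weight(math.dist(c, target), bandwidth) for c in coords]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical data on a west-east transect; the true slope is 1 + 0.2 * u.
coords = [(0.5 * i, 0.0) for i in range(20)]
x = [float(i % 4) for i in range(20)]
y = [2.0 + (1.0 + 0.2 * u) * xi for (u, _), xi in zip(coords, x)]

west = local_wls(coords, x, y, target=(0.0, 0.0), bandwidth=1.5)
east = local_wls(coords, x, y, target=(9.5, 0.0), bandwidth=1.5)
```

Repeating the fit at every grid point yields a coefficient surface that can be mapped; in this toy case the slope estimated at the eastern end is larger than the one at the western end.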
... The non-stationary model has some limitations. For example, the bandwidth is optimized based on accurate prediction of the response variable, not on accurate estimation of the coefficients [46]. Especially when the regression model is fitted within a small kernel or with limited data, collinearity can be a problem [71]. ...
Article
Background: To achieve malaria elimination it is essential to understand the impact of insecticide-treated net (ITN) programmes. Here, the impact of ITN access and use on malaria prevalence in children in Malawi was investigated using Malaria Indicator Survey (MIS) data. Methods: MIS data from 2012, 2014 and 2017 were used to investigate the relationship between malaria prevalence in children (6–59 months) and ITN use. Generalized linear modelling (GLM), geostatistical mixed regression modelling and non-stationary GLM were undertaken to evaluate trends, spatial patterns and local dynamics, respectively. Results: Malaria prevalence in Malawi was 27.1% (95% CI 23.1–31.2%) in 2012 and similar in both 2014 (32.1%, 95% CI 25.5–38.7) and 2017 (23.9%, 95% CI 20.3–27.4%). ITN coverage and use increased during the same time period, with household ITN access growing from 19.0% (95% CI 15.6–22.3%) of households with at least 1 ITN for every 2 people sleeping in the house the night before to 41.7% (95% CI 39.1–44.4%) and ITN use from 41.1% (95% CI 37.3–44.9%) of the population sleeping under an ITN the previous night to 57.4% (95% CI 55.0–59.9%). Both the geostatistical and non-stationary GLM regression models showed child malaria prevalence had a negative association with ITN population access and a positive association with ITN use, although affected by large uncertainties. The non-stationary GLM highlighted the spatial heterogeneity in the relationship between childhood malaria and ITN dynamics across the country. Conclusion: Malaria prevalence in children under five had a negative association with ITN population access and a positive association with ITN use, with spatial heterogeneity in these relationships across Malawi. This study presents an important modelling approach that allows malaria control programmes to spatially disentangle the impact of interventions on malaria cases.
... These GW models form a generic, open, and continually evolving technical framework to explore spatial heterogeneities from a wide range of disciplines in the natural and social sciences. Many of the listed GW models are incorporated into a range of R [27] packages, including spgwr [28], mgwrsar [18], GWLelast [29], spMoran [30], gwer [31], lctools [32,33], gwrr [34], CARBayes [35] and GWmodel [36,37]. In particular, GWmodel contains functions to calibrate and estimate a wide range of models or techniques based on geographical weighting schemes. ...
... (2) Specify the dependent and independent variables from the combobox and list box; if the checkbox "Enable automatic model specification" is ticked, the independent variables are selected automatically via a stepwise procedure (see details in [47]); it is ticked in this example. (3) Define the weighting scheme using the radio buttons: choose a fixed or adaptive bandwidth, either user-defined or optimized via the cross-validation (CV) approach or the corrected Akaike Information Criterion (AICc) [37], and finally choose the kernel function from the combobox; the bi-square kernel function with an optimized adaptive bandwidth is adopted in this example. (4) Select the radio button for the distance metric, where "According to CRS" means that great circle or Euclidean distances are calculated when the coordinate reference system (CRS) is geographic or projected, respectively; a Minkowski distance or a user-supplied distance matrix can also be chosen via the remaining radio buttons. ...
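The combination of a bi-square kernel, an adaptive bandwidth and CV-based optimization can be sketched as follows. This is an illustrative Python outline, not GWmodelS or GWmodel code. To stay short it assumes a local constant (intercept-only) model instead of a full regression, defines the adaptive bandwidth as the distance to the k-th nearest neighbour, and uses made-up data; all names are hypothetical.

```python
import math

def bisquare(distance, bandwidth):
    """Bi-square kernel: weights fall to exactly zero at the bandwidth."""
    if distance >= bandwidth:
        return 0.0
    return (1.0 - (distance / bandwidth) ** 2) ** 2

def loo_prediction(coords, values, i, k):
    """Predict values[i] from the other points (leave-one-out).

    The bandwidth adapts to the local data density: it is the distance
    from point i to its k-th nearest neighbour.
    """
    others = sorted(
        (math.dist(coords[i], coords[j]), values[j])
        for j in range(len(coords)) if j != i
    )
    bandwidth = others[k - 1][0]
    weights = [bisquare(d, bandwidth) for d, _ in others]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no neighbour inside the bandwidth; increase k")
    return sum(w * v for w, (_, v) in zip(weights, others)) / total

def cv_score(coords, values, k):
    """Sum of squared leave-one-out prediction errors for a given k."""
    return sum(
        (values[i] - loo_prediction(coords, values, i, k)) ** 2
        for i in range(len(coords))
    )

# Hypothetical, irregularly spaced observations along a transect.
positions = [0.0, 0.7, 1.9, 2.4, 3.8, 4.1, 5.6, 6.2, 7.9, 8.3, 9.6, 10.4]
coords = [(u, 0.0) for u in positions]
values = [math.sin(u / 2.0) for u in positions]

# Try every neighbour count from 3 up to n - 1 and keep the best one.
candidates = range(3, len(coords))
scores = {k: cv_score(coords, values, k) for k in candidates}
best_k = min(scores, key=scores.get)
```

Choosing `k` rather than a fixed distance lets the kernel widen where observations are sparse and narrow where they are dense; an AICc-based search would replace `cv_score` with a different criterion but keep the same loop.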
Article
Spatial heterogeneity or non-stationarity has become a popular and necessary concern in exploring relationships between variables. In this regard, geographically weighted (GW) models provide a powerful collection of techniques for its quantitative description. We developed a user-friendly, high-performance and systematic software package, named GWmodelS, to promote better and broader use of such models. Apart from a variety of GW models, including GW descriptive statistics, GW regression models, and GW principal components analysis, data management and mapping tools have also been incorporated with well-designed interfaces.
... Coordinated class indicates highly coordinated and sustainable development between PGS and SED. Geographically weighted regression (GWR) is a geo-statistical method that builds on the traditional least squares model by incorporating spatial characteristics through distance weighting, which allows local parameter estimation [59]. Spatial data are usually characterized by spatial non-stationarity, and the results of fitting spatial data with a general linear regression model cannot completely reflect the real characteristics of the data, while GWR can effectively detect spatial non-stationarity and allow different spatial relationships in different geographic spaces [56]. ...
Article
Several studies have revealed that park green space (PGS) plays a crucial role in improving residents’ quality of life and promoting sustainable development of the environment. However, rapid urbanization and population growth have led to an inequitable supply and demand for PGS, especially in high-density cities, which has been widely recognized as an important environmental justice issue. Yet few studies have evaluated the equity and sustainability of PGS in high-density cities at multiple scales. This study developed a framework to explore the spatial equity of PGS and its coupling coordination degree (CCD) with socioeconomic deprivation (SED) based on a multi-scale approach (pocket park, community park, and comprehensive park), then analyzed the spatial correlation between PGS and CCD. The results showed that: (1) The overall supply of three-scale PGS does not meet residents’ demand for PGS resources in the study area, and the urban center has the highest demand for PGS. (2) Among the three-scale PGS, the comprehensive PGS has the strongest supply capacity, but it also has the most severe supply–demand mismatch. (3) Although the service radius of pocket PGS is smaller than that of community PGS, the supply of pocket PGS is higher. (4) More than 95% of the studied area lacks coordination between PGS and SED development. (5) The subsystem that has the greatest spatial correlation with CCD in pocket PGS and comprehensive PGS was the number of configurations, while that in community PGS was the spatial arrangement. This study not only provides a theoretical reference for conducting research on PGS equity in high-density cities, but also provides a novel perspective on the sustainable, coordinated development and planning of urban PGS systems.
Article
In the United States, the rise in hypertension prevalence has been connected to neighborhood characteristics. While various studies have found a link between neighborhood and health, they do not evaluate the relative dependence of each component in the growth of hypertension and, more significantly, how this value differs geographically (i.e., across different neighborhoods). This study ranks the contribution of ten socioeconomic neighborhood factors to hypertension prevalence in Chicago, Illinois, using multiple global and local machine learning models at the census tract level. First, we use Geographical Random Forest, a recently proposed non-linear machine learning regression method, to assess each predictive factor’s spatial variation and contribution to hypertension prevalence. Then we compare GRF performance to Geographically Weighted Regression (local model), Random Forest (global model), and OLS (global model). The results indicate that GRF outperforms all models and that the importance of variables varies by census tract. Household composition is the most important factor in the Chicago tracts, while on the other hand, Housing type and Transportation is the least important factor. While the household composition is the most important determinant around north Lake Michigan, the socioeconomic condition of the neighborhood in Chicago’s mid-north has the most importance on hypertension prevalence. Understanding how the importance of socioeconomic factors associated with hypertension prevalence varies spatially aids in the design and implementation of health policies based on the most critical factors identified at the local level (i.e., tract), rather than relying on broad city-level guidelines (i.e., for entire Chicago and other large cities).
Chapter
Spatial heterogeneity or non-stationarity is a prominent characteristic of data relationships. In line with Tobler’s first law of geography, a number of local statistics or local models have been proposed to explore spatial heterogeneities in spatial patterns or relationships. A particular branch of spatial statistics, termed geographically weighted (GW) models, has evolved to encompass local techniques applicable in situations when data are not described well by global models. Typical GW models and techniques include GW regression, GW descriptive statistics, GW principal components analysis, GW discriminant analysis, GW visualization techniques and GW artificial neural networks. These GW models form a generic, open, and continually evolving technical framework to explore spatial heterogeneities from a wide range of disciplines in the natural and social sciences. In this study, we present a high-performance computing framework to incorporate the GW models with parallel computing techniques. We developed a software package, namely GWmodelS, to facilitate a flexible implementation of GW models. This study describes the procedures of geospatial data management, parameter optimization, model calibration and result visualization associated with GWmodelS. This software provides free services for scientific research and educational courses in the related domains. Keywords: Spatial statistics, Spatial heterogeneity, GWR, Exploratory data analysis, CUDA
Preprint
This study describes the development of PISCOt (v1.2), an innovative high-spatial resolution (0.01°) daily air temperature dataset for Peru (1981-2020). The development of PISCOt involves four main steps: i) quality control; ii) gap-filling; iii) homogenisation of weather stations; and iv) spatial interpolation. The methodological framework allows the representation of the complex spatial variability of air temperature at a more accurate scale than other national and global products (e.g. PISCOt v1.1, ERA5-Land, TerraClimate, CHIRTS). The technical validation indicates mean absolute errors of less than 1.5 °C at climatological and daily mean scales. The new PISCOt dataset appropriately captures the temporal trends which highlights its usefulness to understand the historical variability of air temperature. For the first time, PISCOt v1.2 provides a suitable and widely applicable baseline at the local and regional level in the face of data scarcity in several regions of Peru for applications related to climate change, water balance studies, or the assessment of ecosystems, among others.
Article
Ensuring the social equity of planning measures in social systems requires an understanding of human dynamics, particularly how individual relationships, activities, and interactions intersect with individual needs. Spatial microsimulation models (SMSMs) support planning for human security goals by representing human dynamics through realistic, georeferenced synthetic populations, that a) provide a complete representation of social systems while b) also protecting individual privacy. In this paper, we present UrbanPop, an open and reproducible SMSM framework for analysis of human dynamics with high spatial, temporal, and demographic resolution. UrbanPop creates synthetic populations of demographically detailed worker and student agents, positioning them first at probable nighttime locations (home), then moving them to probable daytime locations (work/school). Summary aggregations of these populations match the granular detail available at the census block group level in the American Community Survey Summary File (SF), providing realistic approximations of the actual population. UrbanPop users can select particular demographic traits important in their application, resulting in a highly tailored agent population. We first lay out UrbanPop's baseline methodology, including population synthesis, activity modeling, and diagnostics, then demonstrate these capabilities by developing case studies of shifting population distributions and high-risk populations in Knox County, TN during the global COVID-19 pandemic.
Article
Geographically neural network weighted regression (GNNWR) is an improved model of GWR combined with a neural network. It has a stronger ability to fit nonlinear functions, and complex geographical processes can be modeled more fully. GNNWR uses the distance metric of Euclidean space to express the relationship between sample points. However, in addition to spatial location features, geographic entities also have many diverse attribute features. Incorporating attribute features into the modeling process can make the model more suitable for the real geographical process. Therefore, we proposed a spatial-attribute proximities deep neural network to aggregate data from the spatial feature and attribute feature, so that one unified distance metric can be used to express the spatial and attribute relationships between sample points at the same time. Based on GNNWR, we designed a spatial and attribute neural network weighted regression (SANNWR) model to adapt to this new unified distance metric. We developed one case study to examine the effectiveness of SANNWR. We used PM2.5 concentration data in China as the research object and compared the prediction accuracy between GWR, GNNWR and SANNWR. The results showed that the “spatial-attribute” unified distance metric is useful, and that the SANNWR model showed the best performance.
Article
Increasingly, the geographically weighted regression (GWR) model is being used for spatial prediction rather than for inference. Our study compares GWR as a predictor to (a) its global counterpart of multiple linear regression (MLR); (b) traditional geostatistical models such as ordinary kriging (OK) and universal kriging (UK), with MLR as a mean component; and (c) hybrids, where kriging models are specified with GWR as a mean component. For this purpose, we test the performance of each model on data simulated with differing levels of spatial heterogeneity (with respect to data relationships in the mean process) and spatial autocorrelation (in the residual process). Our results demonstrate that kriging (in a UK form) should be the preferred predictor, reflecting its optimal statistical properties. However, the GWR-kriging hybrids perform with merit and, as such, a predictor of this form may provide a worthy alternative to UK for particular (non-stationary relationship) situations when UK models cannot be reliably calibrated. GWR predictors tend to perform more poorly than their more complex GWR-kriging counterparts, but both GWR-based models are useful in that they provide extra information on the spatial processes generating the data that are being predicted. Keywords: Relationship nonstationarity, Relationship heterogeneity, GWR, Kriging, Spatial interpolation
Chapter
This article presents stochastic and nonstochastic methods of spatial prediction, using a unified notation. The geostatistical method (i.e., kriging in its various forms) has an advantage over other predictors in that it adapts to the quantity and quality of spatial dependence demonstrated by the data. Properties of the various methods are discussed very briefly.