Stochastic Environmental Research and Risk Assessment

Published by Springer Nature

Online ISSN: 1436-3259


Print ISSN: 1436-3240


Fig. 1 Eleven-county study area in southeastern Michigan 
Fig. 4 Local clusters of residential histories for time geographies of participants' ages. Snapshots of continuous animation from STIS. k = 8 nearest neighbors. a Study participants at age 15 years. b Study participants at age 45 years 
Space-time clustering of case-control data with residential histories: Insights into empirical induction periods, age-specific susceptibility, and calendar year-specific effects

September 2007


78 Reads


Our research group recently developed Q-statistics for evaluating space-time clustering in case-control studies with residential histories. This technique relies on time-dependent nearest-neighbor relationships to examine clustering at any moment in the life-course of the residential histories of cases relative to that of controls. In addition, in place of the widely used null hypothesis of spatial randomness, each individual's probability of being a case is based instead on his/her risk factors and covariates. In this paper, we extend this approach to illustrate how alternative temporal orientations (e.g., years prior to diagnosis/recruitment, participant's age, and calendar year) influence a spatial clustering pattern. These temporal orientations are valuable for shedding light on the duration of time between clustering and subsequent disease development (known as the empirical induction period), and for revealing age-specific susceptibility windows and calendar year-specific effects. An ongoing population-based bladder cancer case-control study is used to demonstrate this approach. Data collection is currently incomplete and therefore no inferences should be drawn; we analyze these data to demonstrate these novel methods. Maps of space-time clustering of bladder cancer cases are presented using different temporal orientations while accounting for covariates and known risk factors. This systematic approach for evaluating space-time clustering has the potential to generate novel hypotheses about environmental risk factors and provides insights into empirical induction periods, age-specific susceptibility, and calendar year-specific effects.
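The time-dependent nearest-neighbor idea in this abstract can be illustrated with a minimal sketch. It is not the published Q-statistic: it only counts cases among each case's k nearest neighbors at a chosen moment, and it omits the covariate-adjusted null hypothesis and any significance testing. The data layout (`people`, `history` tuples) and the function names are hypothetical.

```python
import math


def location_at(history, t):
    """Return the (x, y) residence occupied at time t, or None if unknown.

    `history` is a list of (start, end, x, y) tuples; this layout is an
    assumption made for the sketch.
    """
    for start, end, x, y in history:
        if start <= t < end:
            return (x, y)
    return None


def local_case_counts(people, t, k):
    """For each case located at time t, count the cases among its k nearest neighbors."""
    located = []
    for pid, (is_case, history) in people.items():
        loc = location_at(history, t)
        if loc is not None:
            located.append((pid, is_case, loc))

    counts = {}
    for pid, is_case, loc in located:
        if not is_case:
            continue
        # Distances from this case to every other located participant.
        others = sorted(
            (math.dist(loc, o_loc), o_pid, o_case)
            for o_pid, o_case, o_loc in located
            if o_pid != pid
        )
        counts[pid] = sum(1 for _, _, o_case in others[:k] if o_case)
    return counts


# Toy residential histories: participant "a" moves in 1980.
people = {
    "a": (True, [(1960, 1980, 0.0, 0.0), (1980, 2000, 5.0, 5.0)]),
    "b": (True, [(1960, 2000, 0.1, 0.0)]),
    "c": (False, [(1960, 2000, 0.0, 0.2)]),
    "d": (False, [(1960, 2000, 5.0, 5.1)]),
}
print(local_case_counts(people, 1970, k=1))
print(local_case_counts(people, 1990, k=1))
```

Re-indexing `t` by age or by years before diagnosis, instead of calendar year, would give the alternative temporal orientations the abstract describes.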

Fig. 1 Picture of the four PAH samplers around Ground Zero  
Fig. 3 a The spatial validation showing reduction in the MSE using BME compared to space/time kriging. The number for PAH refers to: 1 benz(a)anthracene, 2 chrysene, 3 benzo(b)fluoranthene, 4 benzo(k)fluoranthene, 5 benzo(a)pyrene, 6 indeno(1,2,3-c,d)pyrene, 7 dibenzo(a,h)anthracene, 8 benzo(g,h,i)perylene, and 9 benzo(e)pyrene. b The temporal validation showing reduction in the MSE for benzo(a)pyrene  
Fig. 4 a and b Time-integrated individual-level exposure to benzo(a)pyrene (ng inhaled) during the 150 days after 9/11 as calculated by Eq. (2); c and d the time-integrated benzo(a)pyrene population burden (ng × persons/mi²) for the 150 days after 9/11 as calculated by Eq. (3)  
Mass fraction spatiotemporal geostatistics and its application to map atmospheric polycyclic aromatic hydrocarbons after 9/11

December 2009


100 Reads

This work proposes a space/time estimation method for atmospheric PM2.5 components by modelling the mass fraction at a selection of space/time locations where the component is measured and applying the model to the extensive PM2.5 monitoring network. The method we developed utilizes the nonlinear Bayesian maximum entropy framework to perform the geostatistical estimation. We implemented this approach using data from nine carcinogenic, particle-bound polycyclic aromatic hydrocarbons (PAHs) measured from archived PM2.5 samples collected at four locations around the World Trade Center (WTC) from September 22, 2001 to March 27, 2002. The mass fraction model developed at these four sites was used to estimate PAH concentrations at additional PM2.5 monitors. Even with limited PAH data, a spatial validation showed the application of the mass fraction model reduced the mean squared error (MSE) by 7–22%, while in the temporal validation there was an exponential improvement in MSE positively associated with the number of days of PAH data removed. Our results include space/time maps of atmospheric PAH concentrations in the New York area after 9/11.
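The core idea of a mass fraction model, stripped of the Bayesian maximum entropy machinery the paper actually uses, can be sketched as follows. This is a deterministic simplification under the assumption of a single average fraction; the function names and sample values are made up for illustration.

```python
def mean_mass_fraction(component, pm25):
    """Average ratio of component to PM2.5 over paired, co-located samples."""
    ratios = [c / p for c, p in zip(component, pm25) if p > 0]
    return sum(ratios) / len(ratios)


def estimate_component(pm25_value, fraction):
    """Apply the fraction at a monitor that reports PM2.5 only."""
    return fraction * pm25_value


# Hypothetical paired samples: PAH in ng/m^3, PM2.5 in ug/m^3.
pah = [0.2, 0.3, 0.1]
pm25 = [20.0, 30.0, 10.0]

fraction = mean_mass_fraction(pah, pm25)  # ng of PAH per ug of PM2.5
print(estimate_component(25.0, fraction))
```

A space/time version would let the fraction vary with location and date and would carry its uncertainty into the estimate, which is where a geostatistical framework comes in.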

Spatial and temporal variation of precipitation in Sudan and their possible causes during 1948–2005

January 2012


691 Reads






Temporal and spatial patterns of precipitation are essential to understanding soil moisture status, which is vital for vegetation regeneration in arid ecosystems. The purposes of this study are (1) to understand the temporal and spatial variations of precipitation in Sudan during 1948–2005 by using high-quality global precipitation data known as Precipitation REConstruction (PREC), constructed at the National Oceanic and Atmospheric Administration (NOAA) Climate Prediction Center, and (2) to discuss the relationship between precipitation variability and moisture flux based on the NCEP/NCAR reanalysis data, in order to ascertain the potential causes of the spatial and temporal variations of precipitation in the region. Results showed that (1) annual and monthly precipitation in Sudan had great spatial variability, and mean annual precipitation varied from almost nil in the North to about 1500 mm in the extreme Southwest; (2) precipitation of the main rainy season, i.e., July, August and September, and annual total precipitation in the central part of Sudan decreased significantly during 1948–2005; (3) abrupt change points were found in the annual, July, August and September series in the late 1960s, when precipitation decreased more rapidly than in other periods; and (4) the decreasing precipitation was associated with the weakening African summer monsoon. The summer moisture flux over Sudan tended to decrease after the late 1960s, which reduced the northward propagation of moisture flux in North Africa. This study provides a complementary view to previous studies that attempted to explain the persistent Sahel drought and its possible causes. Keywords: Precipitation, Large scale, Abrupt, Moisture flux, Sudan, African monsoon

Spatial and temporal characteristics of changes in precipitation during 1957-2007 in the Haihe River basin, China

October 2011


104 Reads

The present study explores the spatial and temporal changing patterns of the precipitation in the Haihe River basin of North China during 1957–2007 at annual, seasonal and monthly scales. The Mann–Kendall and Sen’s T tests are employed to detect the trends, and the segmented regression is applied to investigate possible change points. Meanwhile, Sen’s slope estimator is computed to represent the magnitudes of the temporal trends. The regional precipitation trends are also discussed based on the regional index series of four sub-basins in the basin. Serial correlation of the precipitation series is checked prior to the application of the statistical test to ensure the validity of trend detection. Moreover, moisture flux variations based on the NCEP/NCAR reanalysis dataset are investigated to further reveal the possible causes behind the changes in precipitation. The results show that: (1) Although the directions of annual precipitation trends at all stations are downward, only seven stations have significant trends at the 90% confidence level, and these stations are mainly located in the western and southeastern Haihe River basin. (2) Summer is the only season showing a strong downward trend. For the monthly series, significant decreasing trends are mainly found during July, August and November, while significant increasing trends are mostly observed during May and December. In comparison with the annual series, more intensive changes can be found in the monthly series, which may indicate a shift in the precipitation regime. (3) Most shifts from increasing trends to decreasing trends occurred in May–June, July, August and December series, while opposed shifts mainly occurred in November. Summer is the only season displaying strong shift trends and the change points mostly emerged during the late 1970s to early 1980s. (4) An obvious decrease in moisture flux is observed after 1980 in comparison with the observations before 1980. 
The similar changing patterns of the monthly moisture budget and precipitation confirm that large-scale atmospheric circulation may be responsible for the shift in the annual cycle of precipitation in the Haihe River basin. These findings are expected to contribute to a more accurate picture of changing regional precipitation patterns and to a better understanding of the underlying linkages between climate change and alterations of hydrological cycles in the Haihe River basin. Keywords: Climate change, Precipitation, Trend analysis, Spatial distribution, The Haihe River basin, China
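The two trend tools named in this abstract, the Mann–Kendall test and Sen's slope estimator, can be sketched in a few lines. This is a textbook version under simplifying assumptions: no correction for tied values and no pre-whitening for serial correlation (which the study checks separately). The function names and the sample series are illustrative only.

```python
import math
from itertools import combinations
from statistics import median


def mann_kendall(series):
    """Return (S, Z, two-sided p-value) for the Mann-Kendall trend test.

    Uses the normal approximation without a tie correction.
    """
    n = len(series)
    # S sums the signs of all later-minus-earlier differences.
    s = sum((b > a) - (b < a) for a, b in combinations(series, 2))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return s, z, p


def sens_slope(series):
    """Median of all pairwise slopes (change per time step)."""
    slopes = [
        (series[j] - series[i]) / (j - i)
        for i, j in combinations(range(len(series)), 2)
    ]
    return median(slopes)


# Hypothetical annual precipitation totals with a steady decline.
precip = [10, 9, 8, 7, 6, 5]
s, z, p = mann_kendall(precip)
print(s, round(z, 3), round(p, 4), sens_slope(precip))
```

A trend would be called significant at the 90% confidence level when `p` falls below 0.10; the sign of `S` or of Sen's slope gives its direction.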

Fig. 1 Location of the study region (FW China) and gauging stations. The regions are a Xinjiang; b Xizang (Tibet); c Qinghai; d Gansu; and e west Inner Mongolia
Table 1 Extreme temperature events defined by percentiles, associated definition and abbreviated names
Fig. 12 MK trend of intraseasonal maximum/minimum temperature anomalies. a Annual; b spring; c summer; d autumn; e winter
Changes of temperature extremes for 1960-2004 in Far-West China

August 2009


365 Reads

The spatial and temporal patterns of temperature extremes, defined by the 5th and 95th percentiles of a daily maximum/minimum temperature dataset, were analyzed using the Mann–Kendall test and linear regression. The results indicate that: (1) the seasonal minimum temperature shows a stronger increasing trend than the seasonal maximum temperature; (2) compared with the maximum temperature, more stations display significantly increasing trends in the frequency and intensity of minimum temperature; (3) more stations have significantly decreasing trends in the intra-seasonal extreme temperature anomaly in summer and winter than in spring and autumn, and the areal mean minimum temperature shows a stronger increasing trend than the areal mean maximum temperature; (4) the warming process in Far-West (FW) China is characterized mainly by significantly increasing minimum temperature. The research will help local communities adapt to alterations in water resources and the ecological environment caused by changes in temperature extremes in FW China, an ecologically fragile region of the country.
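A percentile-based definition of extremes can be sketched as below. The abstract does not give the exact definitions from its Table 1, so this assumes a common convention: a threshold from a reference sample, frequency as the number of exceedances, and intensity as the mean of the exceeding values. All names and numbers are hypothetical.

```python
def percentile(values, q):
    """Percentile by linear interpolation between order statistics (q in 0..100)."""
    data = sorted(values)
    pos = (len(data) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


def extreme_days(daily, reference, q=95.0, warm=True):
    """Return (threshold, frequency, intensity) of extreme days.

    warm=True counts values above the q-th percentile of `reference`;
    warm=False counts values below it (use q=5 for cold extremes).
    """
    threshold = percentile(reference, q)
    if warm:
        hits = [x for x in daily if x > threshold]
    else:
        hits = [x for x in daily if x < threshold]
    intensity = sum(hits) / len(hits) if hits else None
    return threshold, len(hits), intensity


reference = [float(v) for v in range(1, 101)]  # stand-in for a base period
season = [90.0, 96.0, 99.0, 50.0]
print(extreme_days(season, reference, q=95.0, warm=True))
```

Computing the frequency and intensity for each year and then applying a trend test to those yearly series is one way to arrive at the station-level trends reported above.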

Fig. 1 Location of the measuring stations  
Table 1 Reaches of the Yangtze River 
Fig. 6 Precipitation anomaly of the 1991–2000 period in comparison to the period 1951–1990 in percent  
Fig. 7 Onset and length of the mei-yu season in Nanjing 1950–2001  
Fig. 8 June/July average V-wind at the 850 hPa level in m/s for the years 1985 (upper) and 1998 (lower)  
Analysis of Precipitation Trends in 1990s in the Yangtze River Catchment

September 2006


156 Reads

Precipitation trends in the Yangtze River catchment (PR China) have been analyzed for the past 50 years by applying the Mann–Kendall trend test and geospatial analyses. Monthly precipitation trends of 36 stations have been calculated. Significant positive trends can be observed at many stations for the summer months, which naturally show precipitation maxima. They were preceded and/or followed by negative trends. This observation points towards a concentration of summer precipitation within a shorter period of time. The analysis of a second, gridded data set with 0.5° resolution reveals trends with distinct spatial patterns. The combination of classic trend tests and spatially interpolated precipitation data sets allows the spatiotemporal visualization of detected trends. Months with positive trends point to an aggravation of the already severe situation in a region that is particularly prone to flood disasters during summer. Reasons for the observed trends were found in variations in the meridional wind pattern at the 850 hPa level, which account for an increased transport of warm moist air to the Yangtze River catchment during the summer months.

Spatiotemporal statistical analysis of influenza mortality risk in the State of California during the period 1997–2001

March 2008


84 Reads

Using the Bayesian maximum entropy (BME) method of spatiotemporal statistics, the present study examines the geographical risk pattern of influenza mortality in the state of California during the time period 1997–2001. BME risk analysis is of considerable value, since influenza is the largest contributing factor to wintertime mortality increases in the US. By incorporating age-adjusted mortality data collected at the county level, informative influenza mortality maps were generated and composite space-time influenza dependences were assessed quantitatively. On the basis of this analysis, essential risk patterns and correlations were detected across the state during wintertime. It was found that significantly high risks initially occurred during December in the west-central part of the state; in the following two weeks the risk distribution extended into the south and east-central parts of the state; in late February significant influenza mortalities were detected mainly in the west-central part of the state. These findings, combined with the results of earlier works, can lead to useful conclusions regarding influenza risk assessment in a space-time context and also point toward promising future research directions.

Projected streamflow in the Huaihe River Basin (2010–2100) using artificial neural network

July 2009


83 Reads

Climate projections for the Huaihe River Basin, China, for the years 2001–2100 are derived from the ECHAM5/MPI-OM model based on observed precipitation and temperature data covering 1964–2007. Streamflow for the Huaihe River under three emission scenarios (SRES-A2, A1B, B1) from 2010 to 2100 is then projected by applying artificial neural networks (ANN). The results show that annual streamflow will change significantly under the three scenarios from 2010 to 2100. The interannual fluctuations cover a significant increasing streamflow trend under the SRES-A2 scenario (2051–2085). The streamflow trend declines gradually under the SRES-A1B scenario (2024–2037), and shows no obvious trend under the SRES-B1 scenario. From 2010 to 2100, the correlation coefficient between the observed and modeled streamflow is highest under the SRES-A2 scenario. Combining the SRES-A2 scenario of the ECHAM5 model with ANN might therefore be the best approach for assessing and projecting future water resources in the Huaihe basin and other catchments. Compared to the observed period of streamflows, the projected periodicity of streamflows shows significant changes under different emission scenarios. Under the A2 and A1B scenarios, the period would lengthen to about 32–33 years and 27–28 years, respectively, but under the B1 scenario the period would not change, as it is about 5–6 years and the observed period is about 7–8 years. All this might affect drought/flood management, water supply and irrigation projects in the Huaihe River basin. Keywords: Observed data, Streamflow projection, Artificial neural networks, The Huaihe Basin

2D Monte Carlo versus 2D Fuzzy Monte Carlo health risk assessment

February 2005


208 Reads

Risk estimates can be calculated using crisp estimates of the exposure variables (i.e., contaminant concentration, contact rate, exposure frequency and duration, body weight, and averaging time). However, aggregate and cumulative exposure studies require a better understanding of exposure variables and uncertainty and variability associated with them. Probabilistic risk assessment (PRA) studies use probability distributions for one or more variables of the risk equation in order to quantitatively characterize variability and uncertainty. Two-dimensional Monte Carlo Analysis (2D MCA) is one of the advanced modeling approaches that may be used to conduct PRA studies. In this analysis the variables of the risk equation along with the parameters of these variables (for example mean and standard deviation for a normal distribution) are described in terms of probability density functions (PDFs). A variable described in this way is called a second order random variable. Significant data or considerable insight to uncertainty associated with these variables is necessary to develop the appropriate PDFs for these random parameters. Typically, available data and accuracy and reliability of such data are not sufficient for conducting a reliable 2D MCA. Thus, other theories and computational methods that propagate uncertainty and variability in exposure and health risk assessment are needed. One such theory is possibility analysis based on fuzzy set theory, which allows the utilization of incomplete information (incomplete information includes vague and imprecise information that is not sufficient to generate probability distributions for the parameters of the random variables of the risk equation) together with expert judgment. In this paper, as an alternative to 2D MCA, we are proposing a 2D Fuzzy Monte Carlo Analysis (2D FMCA) to overcome this difficulty. 
In this approach, instead of describing the parameters of the PDFs used to define the variables of the risk equation as random variables, we describe them as fuzzy numbers. This approach introduces new concepts and risk characterization methods. In this paper we compare the two approaches with respect to their computational requirements, data requirements and availability. For a hypothetical case, we also provide a comparative interpretation of the results generated.
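The nested structure of a two-dimensional Monte Carlo analysis can be sketched as follows. The outer loop samples uncertain distribution parameters and the inner loop samples variability across individuals. Every distribution, constant and function name here is a placeholder chosen for illustration, and the dose formula is the generic form built from the exposure variables listed in the abstract, not the paper's case study.

```python
import random
import statistics


def average_daily_dose(conc, body_weight, intake_rate=2.0, exposure_freq=350.0,
                       exposure_dur=30.0, averaging_time=30.0 * 365.0):
    """Generic dose equation: C * IR * EF * ED / (BW * AT). Defaults are placeholders."""
    return conc * intake_rate * exposure_freq * exposure_dur / (body_weight * averaging_time)


def two_d_monte_carlo(n_outer=200, n_inner=500, seed=0):
    """Return (5th, median, 95th) percentiles of the inner-loop 95th-percentile dose."""
    rng = random.Random(seed)
    p95_per_outer = []
    for _ in range(n_outer):
        # Outer loop: uncertainty about the parameters of the concentration PDF.
        mu = rng.gauss(1.0, 0.1)
        sigma = rng.uniform(0.2, 0.4)
        doses = []
        for _ in range(n_inner):
            # Inner loop: variability between individuals, given those parameters.
            conc = max(rng.gauss(mu, sigma), 0.0)
            body_weight = max(rng.gauss(70.0, 10.0), 30.0)
            doses.append(average_daily_dose(conc, body_weight))
        doses.sort()
        p95_per_outer.append(doses[int(0.95 * (n_inner - 1))])
    p95_per_outer.sort()
    lo = p95_per_outer[int(0.05 * (n_outer - 1))]
    hi = p95_per_outer[int(0.95 * (n_outer - 1))]
    return lo, statistics.median(p95_per_outer), hi


print(two_d_monte_carlo())
```

In the fuzzy variant described in the abstract, the outer sampling of `mu` and `sigma` would be replaced by fuzzy numbers (for example, intervals at chosen membership levels), so that the inner simulation is repeated over the bounds of each interval rather than over random draws.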

Table 1 Estimates of $\tilde{B}$ by the two methods considered, for 16 simulations of the lognormal diffusion random field (known non-constant mean case) 
Fig. 1 Grids with the 49 observation locations and the 361 locations for prediction
Fig. 2 Straight line fitted by least squares  
Estimation and prediction of a 2D lognormal diffusion random field

January 2005


80 Reads

This paper describes techniques for estimation and prediction of two-parameter lognormal diffusion random fields. The drift and diffusion coefficients, which characterize a two-parameter lognormal diffusion under certain conditions, are estimated by maximum likelihood. For data on a regular grid, an alternative method is proposed to estimate the diffusion coefficient. Both of these estimates are compared in several situations. The kriging predictors are formulated involving the drift and diffusion coefficients and the predictions are obtained using the estimates of these coefficients.

Do 2-amino-3,8-dimethylimidazo[4,5-f] quinoxaline data support the conclusion of threshold carcinogenic effects?

January 2008


25 Reads

The objectives of this paper are to (1) reexamine the data that were used to support the conclusion of a threshold effect for 2-amino-3,8-dimethylimidazo[4,5-f] quinoxaline (MeIQx)-induced initiation and carcinogenicity at low doses in the rat liver, and (2) discuss issues and uncertainties about assessing cancer risk at low doses. Our analysis is part of an effort to understand proper interpretation and modeling of data related to cancer mechanisms and is not an effort to develop a risk assessment for this compound. The data reanalysis presented herein shows that the low-dose initiation activity of MeIQx, which can be found in cooked meat, cannot be dismissed. It is argued that the threshold effect for carcinogenic agents cannot be determined by statistical non-significance alone; more relevant biological information is required. A biologically motivated procedure is proposed for data analyses. The concept and procedure that are appropriate for analyzing MeIQx data are equally applicable to other compounds with comparable data.

3D Inverse modelling of groundwater flow at a fractured site using a stochastic continuum model with multiple statistical populations

April 2002


132 Reads

3D groundwater flow at the fractured site of Äspö (Sweden) is simulated. The aim was to characterise the site as adequately as possible and to provide measures of the uncertainty of the estimates. A stochastic continuum model is used to simulate groundwater flow both in the major fracture planes and in the background. The positions of the major fracture planes are incorporated deterministically in the model, and the statistical distribution of the hydraulic conductivity is modelled by the concept of multiple statistical populations; each fracture plane is an independent statistical population. Multiple equally likely realisations are built that are conditioned to geological information on the positions of the major fracture planes, hydraulic conductivity data, steady state head data and head responses to six different interference tests. The experimental information could be reproduced closely. The results of the conditioning are analysed in terms of the ensemble-averaged average fracture plane conductivities, the ensemble variance of average fracture plane conductivities and the statistical distribution of the hydraulic conductivity in the fracture planes. These results are evaluated after each conditioning stage. It is found that conditioning to hydraulic head data results in an increase of the hydraulic conductivity variance, while the statistical distribution of log hydraulic conductivity, initially Gaussian, becomes more skewed for many of the fracture planes in most of the realisations.

FIG. 2: Theoretical multifractal spectra of the singular vector-valued measures described in Figure 1. (a) $\tau(q)$ spectrum (Eq. (3.3)). (b) $D(h)$ singularity spectrum obtained by Legendre transforming the $\tau(q)$ spectrum (Eq. (2.17)). The symbols correspond to the following model parameters: $P_2 = 2$, $P_3 = 1$ and $C = P_1 = P_4 = 0.5$, $0.3$ and $0.1$ (one plot symbol per value).  
FIG. 8: Determination of the $\tau_{\mathbf{v}}(q)$ and $D_{\mathbf{v}}(h)$ spectra of the velocity field with the TWTMM method. The analyzing wavelet is the same as in figure 7. (a) $\log_2 Z(q, a)$ vs $\log_2 a$; (b) $h(q, a)$ vs $\log_2 a$; the solid lines correspond to linear regression fit estimates in the range $2^{1.2}\sigma_W \lesssim a \lesssim 2^{3.5}\sigma_W$. (c) $\tau_{\mathbf{v}}(q)$ vs $q$; the solid line corresponds to a fit of the data with the log-normal parabolic spectrum (4.1) for the parameter values $C_0 = 3.02$, $C_1 = -0.34$ and $C_2 = 0.049$ (Eq. (4.2)). (d) $D_{\mathbf{v}}(h)$ vs $h$, as obtained from the scaling behavior of $h(q, a)$ (Eq. (2.19)) and $D(q, a)$ (Eq. (2.20)); the solid line corresponds to a fit of the data with the log-normal parabolic spectrum (4.3) with the same parameter values (Eq. (4.2)). These results correspond to annealed averaging over 18 $(256)^3$ snapshots of $\mathbf{v}(\mathbf{x})$. $a$ is expressed in $\sigma_W$ (= 13 pixels) units.
FIG. 9: Determination of the $\tau_{\boldsymbol{\omega}}(q)$ and $D_{\boldsymbol{\omega}}(h)$ spectra of the vorticity field with the TWTMM method, with the same analyzing wavelet as in figure 7, and with box-counting techniques (the two methods are plotted with different symbols). (a) $\log_2 Z(q, a)$ vs $\log_2 a$; (b) $h(q, a)$ vs $\log_2 a$; the solid and dashed lines correspond to linear regression fit estimates in the range $2^{1.2}\sigma_W \lesssim a \lesssim 2^{3.5}\sigma_W$. (c) $\tau_{\boldsymbol{\omega}}(q)$ vs $q$; (d) $D_{\boldsymbol{\omega}}(h)$ vs $h$ as obtained from the scaling behavior of $h(q, a)$ (Eq. (2.19)) and $D(q, a)$ (Eq. (2.20)); the dashed line corresponds to the parabolic spectrum found for $\mathbf{v}$ in figure 8d (Eqs. (4.2) and (4.3)) after a translation of one unit to the left (Eqs. (4.4) and (4.5)); the dashed vertical line marks the K41 value $h_{\boldsymbol{\omega}} = h_{\mathbf{v}} - 1 = -2/3$. These results correspond to annealed averaging over 18 $(256)^3$ snapshots of $\boldsymbol{\omega}(\mathbf{x})$. $a$ is expressed in $\sigma_W$
A multifractal formalism for vector-valued random fields based on wavelet analysis: Application to turbulent velocity and vorticity 3D numerical data

August 2005


113 Reads

Extreme atmospheric events are intimately related to the statistics of atmospheric turbulent velocities. These, in turn, exhibit multifractal scaling, which determines the asymptotic behavior of velocities and whose parameter evaluation is therefore of great current interest. We combine singular value decomposition techniques and wavelet transform analysis to generalize the multifractal formalism to vector-valued random fields. The so-called Tensorial Wavelet Transform Modulus Maxima (TWTMM) method is calibrated on synthetic self-similar 2D vector-valued multifractal measures and monofractal 3D vector-valued fractional Brownian fields. We report the results of applying the TWTMM method to turbulent velocity and vorticity fields generated by direct numerical simulations of the incompressible Navier–Stokes equations. This study reveals the existence of an intimate relationship, $D_{\mathbf{v}}(h+1) = D_{\boldsymbol{\omega}}(h)$, between the singularity spectra of these two vector fields, which are found to be significantly more intermittent than previously estimated from longitudinal and transverse velocity increment statistics.
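The Legendre transform that links a $\tau(q)$ spectrum to a $D(h)$ singularity spectrum, and the one-unit shift between the velocity and vorticity spectra, can be checked numerically with a short sketch. The listing does not reproduce the paper's Eqs. (4.1) and (4.3), so the parabolic form and sign convention below, $\tau(q) = -C_0 - C_1 q - C_2 q^2/2$ with $D(h) = C_0 - (h + C_1)^2 / (2 C_2)$, are an assumption; the parameter values are those quoted in the Fig. 8 caption.

```python
# Parameter values quoted in the Fig. 8 caption.
C0, C1, C2 = 3.02, -0.34, 0.049


def tau(q):
    """Assumed log-normal (parabolic) tau(q) spectrum."""
    return -C0 - C1 * q - 0.5 * C2 * q * q


def d_closed_form(h):
    """Closed-form Legendre transform of the assumed tau(q)."""
    return C0 - (h + C1) ** 2 / (2.0 * C2)


def d_legendre(h, q_min=-20.0, q_max=20.0, steps=40001):
    """Numerical Legendre transform D(h) = min over q of (q*h - tau(q))."""
    best = float("inf")
    for i in range(steps):
        q = q_min + (q_max - q_min) * i / (steps - 1)
        best = min(best, q * h - tau(q))
    return best


def d_vorticity(h):
    """Vorticity spectrum implied by D_v(h + 1) = D_omega(h)."""
    return d_closed_form(h + 1.0)


for h in (0.2, 0.34, 0.5):
    print(h, d_legendre(h), d_closed_form(h))
```

Under this assumed form the velocity spectrum peaks at $h = -C_1 = 0.34$, close to the K41 value $1/3$, and the shifted vorticity spectrum peaks one unit to the left, near the $-2/3$ marked in the Fig. 9 caption.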

Trace Element Contaminations of Roadside Soils from Two Cultivated Wetlands after Abandonment in a Typical Plateau Lakeshore, China

January 2010


51 Reads

Total concentrations of As, Cd, Cu, Ni, Pb and Zn were determined in order to assess changes in contamination levels in roadside soils of a long-term abandoned tillage (LAT) belt and a short-term abandoned tillage (SAT) belt on a plateau lakeshore in China. Results showed that the mean concentrations of these trace elements, except for Cd and Pb, were lower than background values. The contamination index values of these trace elements fluctuated and generally decreased with increasing distance from the road in both sampling belts, except for Cu and Ni. Both LAT and SAT soils showed Cd and Pb contamination. The integrated contamination level was more serious in LAT soils than in SAT soils, with heavy contamination at distances of 5 to 10 m from the road in LAT soils. The effects of the studied rural road on both belts were clearly visible up to 250 m away from the road. Keywords: Trace element, Abandoned tillage soil, Roadside, Contamination index, Plateau lake

Fig. 1 The plot depicts the relation between the number of trees and the canopy area (m²). Bins with a width of 30 m² are used. The observed values are shown with the solid line, while the optimal negative exponential distribution (Eq. 6) is shown with a dotted line. The histogram bars are also shown on the plot. Regression for the estimated data was excellent (R² = 0.9936, P = 8.1354 × 10⁻⁸).  
Fig. 2 The plot depicts the relation between the number of trees on a logarithmic scale (vertical axis) and the canopy area (m²). The observed data are shown by the continuous line, while the estimates based on Eq. (6) are shown by the broken (dash-dot) line. "Lower" and "upper" bounds based on the 95% confidence intervals for the regression parameters a and b are also shown by means of broken (dashed) lines  
Estimating tree abundance from remotely sensed imagery in semi-arid and arid environments: Bringing small trees to the light

January 2009


167 Reads

The analysis of remotely sensed images provides a powerful method for estimating tree abundance. However, a number of trees have sizes that are below the spatial resolution of remote sensing images, and as a result they cannot be observed and classified. We propose a method for estimating the number of such sub-resolution trees on forest stands. The method is based on a backwards extrapolation of the size-class distribution of trees as observed from the remotely sensed images. We apply our method to a tree database containing around 13,000 tree individuals to determine the number of sub-resolution trees. While the proposed method is formulated for estimating tree abundance from remotely sensed images, it is generally applicable to any database containing tree canopy surface area data with a minimum size cut-off.
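The backwards extrapolation of a size-class distribution can be sketched as below. The paper's Eq. (6) is not reproduced in this listing, so the sketch assumes a negative exponential of the form N(A) = a · exp(−b · A) fitted by least squares on the logarithm of the bin counts; the bin midpoints and counts are synthetic.

```python
import math


def fit_negative_exponential(areas, counts):
    """Fit ln(count) = ln(a) - b * area by least squares; return (a, b)."""
    ys = [math.log(c) for c in counts]
    n = len(areas)
    mean_x = sum(areas) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in areas)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(areas, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


def predicted_count(area, a, b):
    """Expected number of trees in the size class centred on `area`."""
    return a * math.exp(-b * area)


# Synthetic observed bins (30 m^2 wide) above a 30 m^2 resolution cut-off.
midpoints = [45.0, 75.0, 105.0, 135.0]
observed = [1000.0 * math.exp(-0.02 * m) for m in midpoints]

a, b = fit_negative_exponential(midpoints, observed)
# Extrapolate backwards to the unobserved 0-30 m^2 size class.
print(predicted_count(15.0, a, b))
```

With real data the fit would not be exact, and the confidence intervals on a and b (as in the Fig. 2 caption) would translate into lower and upper bounds on the extrapolated count.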

An efficient tool for accelerating the numerical solution of the stochastic subsurface flow problem using neural networks

November 2000


13 Reads

 An efficient numerical solution for the two-dimensional groundwater flow problem using artificial neural networks (ANNs) is presented. Under stationary velocity conditions with unidirectional mean flow, the conductivity realizations and the head gradients, obtained by a traditional finite difference solution to the flow equation, are given as input-output pairs to train a neural network. The ANN is trained successfully and a certain level of recognition of the relationship between input conductivity patterns and output head gradients is achieved. The trained network produced velocity realizations that are physically plausible without solving the flow equation for each of the conductivity realizations. This is achieved in a small fraction of the time necessary for solving the flow equations. The prediction accuracy of the ANN reaches 97.5% for the longitudinal head gradient and 94.7% for the transverse gradient. Head-gradient and velocity statistics in terms of the first two moments are obtained with a very high accuracy. The cross covariances between head gradients and the fluctuating log-conductivity (log-K) and between velocity and log-K obtained with the ANN approach match very closely those obtained by a traditional numerical solution. The same is true for the velocity components auto-covariances. The results are also extended to transport simulations with very good accuracy. Spatial moments (up to the fourth) of mean-concentration plumes obtained using ANNs are in very good agreement with the traditional Monte Carlo simulations. Furthermore, the concentration second moment (concentration variance) is very close between the two approaches. Considering the fact that higher moments of concentration need more computational effort in numerical simulations, the advantage of the presented approach in saving long computational times is evident. 
Another advantage of the ANNs approach is the ability to generalize a trained network to conductivity distributions different from those used in training. However, the accuracy of the approach in cases with higher conductivity variances is being investigated.

Statistical algorithms accounting for background density in the detection of UXO target areas at DoD munitions sites

February 2009


30 Reads

Statistically defensible methods are presented for developing geophysical detector sampling plans and analyzing data for munitions response sites where unexploded ordnance (UXO) may exist. Detection methods for distinguishing areas of elevated anomaly density from background density are shown. Additionally, methods are described which aid in the choice of transect pattern and spacing to assure, with a given degree of confidence, that a target area (TA) of specific size, shape, and anomaly density will be identified using the detection methods. Methods for evaluating the sensitivity of designs to variation in certain parameters are also discussed. The methods presented have been incorporated into the freely available Visual Sample Plan (VSP) software and demonstrated at multiple sites in the United States. Application examples from actual transect designs and surveys from the previous two years are demonstrated.

Accounting for high-order correlations in probabilistic characterization of environmental variables, and evaluation

February 2008


11 Reads

Probabilistic characterization of environmental variables or data typically involves distributional fitting. Correlations, when present in variables or data, can considerably complicate the fitting process. In this work, effects of high-order correlations on distributional fitting were examined, and how they are technically accounted for was described using two multi-dimensional formulation methods: maximum entropy (ME) and Koehler–Symanowski (KS). The ME method formulates a least-biased distribution by maximizing its entropy, and the KS method uses a formulation that conserves specified marginal distributions. Two bivariate environmental data sets, ambient particulate matter and water quality, were chosen for illustration and discussion. Three metrics (log-likelihood function, root-mean-square error, and bivariate Kolmogorov–Smirnov statistic) were used to evaluate distributional fit. Bootstrap confidence intervals were also employed to help inspect the degree of agreement between distributional and sample moments. It is shown that both methods are capable of fitting the data well and have the potential for practical use. The KS distributions were found to be of good quality, and using the maximum likelihood method for the parameter estimation of a KS distribution is computationally efficient.

Accounting for the uncertainty in the local mean in spatial prediction by BME

September 2007


14 Reads

Bayesian Maximum Entropy (BME) has been successfully used in geostatistics to calculate predictions of spatial variables given some general knowledge base and sets of hard (precise) and soft (imprecise) data. This general knowledge base commonly consists of the means at each of the locations considered in the analysis, and the covariances between these locations. When the means are not known, the standard practice is to estimate them from the data; this is done by either generalized least squares or maximum likelihood. The BME prediction then treats these estimates as the general knowledge means, and ignores their uncertainty. In this paper we develop a prediction that is based on the BME method that can be used when the general knowledge consists of the covariance model only. This prediction incorporates the uncertainty in the estimated local mean. We show that in some special cases our prediction is equal to results from classical geostatistics. We investigate the differences between our approach and the standard approach for predicting in this common practical situation.
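As a point of reference for the "standard practice" mentioned above, the generalized least squares estimate of a constant local mean and its variance can be written as follows. This is the textbook form, not taken from the paper; the symbols (data vector z, covariance matrix C, vector of ones 1) are notation assumed here for illustration.

```latex
\hat{m} \;=\; \frac{\mathbf{1}^{\top}\mathbf{C}^{-1}\mathbf{z}}{\mathbf{1}^{\top}\mathbf{C}^{-1}\mathbf{1}},
\qquad
\operatorname{Var}(\hat{m}) \;=\; \frac{1}{\mathbf{1}^{\top}\mathbf{C}^{-1}\mathbf{1}}
```

The second expression is the uncertainty that a plug-in prediction ignores when it treats the estimated mean as known.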

Table 1 Comparison of fitted models
Fig. 5 One-day-ahead forecasts of hourly ozone level for the second week of August, 2002 at Taft station using various models 
Fig. 6 Comparison of observed and one-day-ahead forecast of daily average ozone concentrations for April to June 2002 (top) and July to October 2002 (bottom) 
Accounting seasonal nonstationarity in time series models for short-term ozone level forecast

January 2005


87 Reads

Due to the nonlinear feature of an ozone process, regression based models such as the autoregressive models with an exogenous vector process (ARX) suffer from persistent diurnal behaviors in residuals that cause systematic over-predictions and under-predictions and fail to make accurate multi-step forecasts. In this article we present a simple class of the functional coefficient ARX (FARX) model which allows the regression coefficients to vary as a function of another variable. As a special case of the FARX model, we investigate the threshold ARX (TARX) model of Tong [Lecture notes in Statistics, Springer-Verlag, Berlin, 1983; Nonlinear time series: a dynamics system approach, Oxford University Press, Oxford, 1990] which separates the ARX model in terms of a variable called the threshold variable. In this study we use time of day as the threshold variable. The TARX model can be used directly for ozone forecasts; however, investigation of the estimated coefficients over the threshold regimes suggests polynomial coefficient functions in the FARX model. This provides a parsimonious model without deteriorating the forecast performance and successfully captures the diurnal nonstationarity in ozone data. A general linear F-test is used to test varying coefficients and the portmanteau tests, based on the autocorrelation and partial autocorrelation of fitted residuals, are used to test error autocorrelations. The proposed models were applied to a 2-year dataset of hourly ozone concentrations obtained in downtown Cincinnati, OH, USA. For the exogenous processes, outdoor temperature, wind speed, and wind direction were used. The results showed that both TARX and FARX models substantially improve one-day-ahead forecasts and remove the diurnal pattern in residuals for the cases considered.
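The threshold idea can be illustrated with a minimal sketch: an ARX model whose coefficients switch with time of day, fitted by least squares separately in each regime. Everything specific here is an assumption made for illustration and not taken from the paper: the two regimes and their 08:00 and 20:00 boundaries, the first-order lag, the single exogenous input, the absence of an intercept, and the synthetic noise-free data.

```python
import random

def regime(hour):
    """Hypothetical two-regime split on time of day (the threshold variable)."""
    return "day" if 8 <= hour < 20 else "night"

def fit_two_regressors(rows):
    """Least squares for y = a*u + b*v (no intercept) via the 2x2 normal equations."""
    suu = sum(u * u for u, v, y in rows)
    svv = sum(v * v for u, v, y in rows)
    suv = sum(u * v for u, v, y in rows)
    suy = sum(u * y for u, v, y in rows)
    svy = sum(v * y for u, v, y in rows)
    det = suu * svv - suv * suv
    a = (suy * svv - suv * svy) / det
    b = (suu * svy - suv * suy) / det
    return a, b

def fit_tarx(y, x, hours):
    """Fit y[t] = a_r * y[t-1] + b_r * x[t] separately for each regime r."""
    groups = {"day": [], "night": []}
    for t in range(1, len(y)):
        groups[regime(hours[t])].append((y[t - 1], x[t], y[t]))
    return {name: fit_two_regressors(rows) for name, rows in groups.items()}

# Synthetic, noise-free data generated from known regime-specific coefficients.
random.seed(0)
true_coef = {"day": (0.6, 2.0), "night": (0.8, 0.5)}
n = 480
hours = [t % 24 for t in range(n)]
x = [random.uniform(0.0, 1.0) for _ in range(n)]
y = [1.0]
for t in range(1, n):
    a, b = true_coef[regime(hours[t])]
    y.append(a * y[t - 1] + b * x[t])

estimates = fit_tarx(y, x, hours)
for name, (a, b) in estimates.items():
    print(name, round(a, 4), round(b, 4))
```

Because the synthetic series contains no noise, the per-regime fits recover the generating coefficients up to rounding; with real hourly data the regimes, lags, and inputs would need to be chosen and tested as the abstract describes.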

Table 1 Some typical clustering algorithms and applicability (Tan et al. 2006; Witten and Frank 2000)
Fig. 3 Scatter plots (a, b, c) and histograms (d, e, f) for visual exploratory analysis  
Fig. 7 Thematic polygon maps as the stratification frame: a soil type, b landform
An information-fusion method to identify pattern of spatial heterogeneity for improving the accuracy of estimation

October 2008


193 Reads

While spatial autocorrelation is used in spatial sampling surveys to improve the precision of estimates of a population feature at areal units, spatial heterogeneity, used as the stratification frame in a survey, also often has a considerable effect on precision. In the context of increasingly rich spatiotemporal data, this paper proposes an information-fusion method to identify patterns of spatial heterogeneity, which can be used as an informative stratification to improve estimation accuracy. Data mining is the major analysis component of our method: multivariate statistics, association analysis, decision trees and rough sets are used for data filtering, identification of contributing factors, and examination of relationships; classification and clustering are used to identify patterns of spatial heterogeneity from auxiliary variables relevant to the goal and thus to stratify the samples. These methods are illustrated and examined in a case study of the cultivable land survey in Shandong Province, China. Unlike many stratification schemes, which oversimplify by stratifying on the goal variable alone, this approach fuses information from multiple sources to identify patterns of spatial heterogeneity, stratifying samples at geographical units as an informative polygon map and thereby increasing the precision of estimates in the sampling survey, as demonstrated in our case study.

Accuracy of spatio-temporal RARX model predictions of water table depths

January 2002


105 Reads

Time series of water table depths (H_t) are predicted in space using a regionalised autoregressive exogenous variable (RARX) model with precipitation surplus as input variable. Because of their physical basis, RARX model parameters can be guessed from auxiliary information such as a digital elevation model (DEM), digital topographic maps and digitally stored soil profile descriptions. Three different approaches to regionalising RARX parameters are used. In the 'direct' method (DM) the precipitation surplus is transformed into H_t using the guessed RARX parameters. In the 'indirect' method (IM) the predictions from DM are corrected for observed systematic errors. In the Kalman filter approach the parameters of regionalisation functions for the RARX model parameters are optimised using observations on H_t. These regionalisation functions describe the dependence of the RARX parameters on spatial co-ordinates. External drift kriging and simple kriging with varying means are applied as regionalisation functions, using guessed RARX model parameters or DEM data as secondary variables. Predictions of H_t at given days, as well as estimates of expected water table depths, are made for a study area of 1375 ha. The performance of the three approaches is tested by cross-validation using observed values of H_t in 27 wells which are positioned following a stratified random sampling design. IM performs significantly better with respect to systematic errors than the alternative methods in estimating expected water table depths. The Kalman filter methods perform better than both DM and IM in predicting the temporal variation of H_t, as is indicated by lower random errors. In particular, the Kalman filter method that uses DEM data as an external drift outperforms the alternative methods with respect to the prediction of the temporal variation of the water table depth.

Fig. 1 Streamflow data. Excess prediction error DRMSE for increasing calibration sample size (logarithmic scales)
Table 1 Main characteristics of the model averaging methods considered
Fig. 2 Tensiometric pressure head data. Excess prediction error DRMSE for increasing calibration sample size (logarithmic scales)  
Comparison of point forecast accuracy of model averaging methods in hydrologic applications

August 2010


306 Reads

Multi-model averaging is currently receiving a surge of attention in the atmospheric, hydrologic, and statistical literature to explicitly handle conceptual model uncertainty in the analysis of environmental systems and derive predictive distributions of model output. Such density forecasts are necessary to help analyze which parts of the model are well resolved, and which parts are subject to considerable uncertainty. Yet, accurate point predictors are still desired in many practical applications. In this paper, we compare a suite of different model averaging techniques by their ability to improve forecast accuracy of environmental systems. We compare equal weights averaging (EWA), Bates-Granger model averaging (BGA), averaging using Akaike’s information criterion (AICA) and Bayes’ Information Criterion (BICA), Bayesian model averaging (BMA), Mallows model averaging (MMA), and Granger-Ramanathan averaging (GRA) for two different hydrologic systems involving water flow through a 1950 km² watershed and a 5 m deep vadose zone. Averaging methods with weights restricted to the multi-dimensional simplex (positive weights summing up to one) are shown to have considerably larger forecast errors than approaches with unconstrained weights. Whereas various sophisticated model averaging approaches have recently emerged in the literature, our results convincingly demonstrate the advantages of GRA for hydrologic applications. This method achieves performance similar to that of MMA and BMA, but is much simpler to implement and use, and computationally much less demanding.

Keywords: Bates-Granger weights; Bayesian model averaging; Granger-Ramanathan weights; Mallows model averaging; Streamflow forecasting; Tensiometric pressure head
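The contrast between simplex-constrained and unconstrained weights can be sketched with three of the simpler schemes. The code below is an illustration under stated assumptions, not the paper's implementation: BGA weights are taken as inverse mean squared errors normalised to sum to one, GRA is taken as ordinary least squares on the forecasts without an intercept (one of several Granger-Ramanathan variants), and the two "models" are synthetic series that both underpredict the observations.

```python
import math
import random

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def combine(forecasts, weights):
    """Weighted sum of the individual model forecasts at each time step."""
    n = len(forecasts[0])
    return [sum(w * f[t] for w, f in zip(weights, forecasts)) for t in range(n)]

def rmse(pred, obs):
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def ewa_weights(forecasts):
    """Equal weights: on the simplex by construction."""
    k = len(forecasts)
    return [1.0 / k] * k

def bga_weights(forecasts, obs):
    """Inverse mean-squared-error weights, normalised to sum to one (assumed form)."""
    inv = [1.0 / rmse(f, obs) ** 2 for f in forecasts]
    total = sum(inv)
    return [v / total for v in inv]

def gra_weights(forecasts, obs):
    """Unconstrained least-squares weights from the normal equations (no intercept)."""
    k = len(forecasts)
    gram = [[sum(p * q for p, q in zip(forecasts[i], forecasts[j]))
             for j in range(k)] for i in range(k)]
    rhs = [sum(p * o for p, o in zip(forecasts[i], obs)) for i in range(k)]
    return solve_linear(gram, rhs)

# Synthetic calibration data: both hypothetical models are biased low.
random.seed(1)
obs = [10.0 + 5.0 * math.sin(t / 5.0) for t in range(200)]
model_a = [0.6 * o + random.gauss(0.0, 0.5) for o in obs]
model_b = [0.7 * o + random.gauss(0.0, 0.5) for o in obs]
forecasts = [model_a, model_b]

w_ewa = ewa_weights(forecasts)
w_bga = bga_weights(forecasts, obs)
w_gra = gra_weights(forecasts, obs)
for name, w in [("EWA", w_ewa), ("BGA", w_bga), ("GRA", w_gra)]:
    print(name, [round(v, 3) for v in w], round(rmse(combine(forecasts, w), obs), 3))
```

On the calibration sample the least-squares weights cannot do worse than any simplex-constrained combination, because they minimise the same squared error over a larger set. Whether that advantage carries over to new data is the empirical question the paper examines.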

Assessment of the prediction error in a large-scale application of a dynamic soil acidification model

August 2002


80 Reads

The prediction error of a relatively simple soil acidification model (SMART2) was assessed before and after calibration, focussing on the Al and NO3 concentrations on a block scale. Although SMART2 is especially developed for application on a national to European scale, it still runs at a point support. A 5×5 km² grid was used for application on the European scale. Block characteristic values were obtained simply by taking the median value of the point support values within the corresponding grid cell. In order to increase confidence in model predictions on large spatial scales, the model was calibrated and validated for the Netherlands, using a resolution that is feasible for Europe as a whole. Because observations are available only at the point support, it was necessary to transfer them to the block support of the model results. For this purpose, about 250 point observations of soil solution concentrations in forest soils were upscaled to a 5×5 km² grid map, using multiple linear regression analysis combined with block kriging. The resulting map with upscaled observations was used for both validation and calibration. A comparison of the map with model predictions using nominal parameter values and the map with the upscaled observations showed that the model overestimated the predicted Al and NO3 concentrations. The nominal model results were still in the 95% confidence interval of the upscaled observations, but calibration improved the model predictions and strongly reduced the model error. However, the model error after calibration remains rather large.
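The block-median step described for the model inputs (the median of point-support values within each grid cell) can be sketched as below. The coordinates, values, and function name are hypothetical, and the sketch covers only this median aggregation, not the regression and block kriging used to upscale the observations.

```python
import math
from collections import defaultdict
from statistics import median

def block_medians(points, cell_km=5.0):
    """Group (x_km, y_km, value) points into square cells and take the median per cell."""
    cells = defaultdict(list)
    for x_km, y_km, value in points:
        key = (math.floor(x_km / cell_km), math.floor(y_km / cell_km))
        cells[key].append(value)
    return {key: median(values) for key, values in cells.items()}

# Hypothetical point-support values at locations given in km.
points = [
    (1.0, 1.0, 0.2),
    (2.5, 4.0, 0.6),
    (4.9, 0.1, 0.4),
    (6.0, 1.0, 1.0),
    (9.0, 3.0, 3.0),
]
blocks = block_medians(points)
for cell, value in sorted(blocks.items()):
    print(cell, value)
```

The median is less sensitive than the mean to a few extreme point values within a cell, which is one plausible reason to prefer it for block characteristic values.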

Spatial and temporal characteristics of actual evapotranspiration over Haihe River basin in China

July 2011


458 Reads

Spatial and temporal characteristics of actual evapotranspiration over the Haihe River basin in China during 1960–2002 are estimated using the complementary relationship and the Thornthwaite water balance (WB) approaches. Firstly, the long-term water balance equation is used to validate and select the most suitable long-term average annual actual evapotranspiration equations for nine subbasins. Then, the most suitable method, the Pike equation, is used to calibrate parameters of the complementary relationship models and the WB model at each station. The results show that the advection aridity (AA) model more closely estimates actual evapotranspiration than does the Granger and Gray (GG) model, especially for annual and summer evapotranspiration, when compared with the WB model estimates. The results from the AA model and the WB model are then used to analyze spatial and temporal changing characteristics of the actual evapotranspiration over the basin. The analysis shows that annual actual evapotranspiration during 1960–2002 exhibits similar decreasing trends in most parts of the Haihe River basin for the AA and WB models. Decreasing trends in annual precipitation and potential evapotranspiration, which directly affect water supply and the energy available for actual evapotranspiration respectively, jointly lead to the decrease in actual evapotranspiration in the basin. A weakening of the water cycle seems to have appeared, and as a consequence, the water supply capacity has been decreasing, aggravating water shortage and restricting sustainable social and economic development in the region.

Keywords: Complementary relationship; Thornthwaite water balance model; Actual evapotranspiration; Trend; Haihe River basin; China
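The abstract does not state the Pike equation. A minimal sketch, assuming the commonly cited form E = P / sqrt(1 + (P / PET)^2) with long-term mean annual precipitation P and potential evapotranspiration PET in the same units, is shown here. The function name and the input values are hypothetical.

```python
import math

def pike_actual_et(precip, pet):
    """Long-term mean annual actual evapotranspiration from the assumed Pike form."""
    if precip < 0 or pet <= 0:
        raise ValueError("precip must be non-negative and pet must be positive")
    return precip / math.sqrt(1.0 + (precip / pet) ** 2)

# Hypothetical annual totals in mm for a dry, an intermediate, and a wetter case.
for precip, pet in [(300.0, 1000.0), (550.0, 1000.0), (1000.0, 1000.0)]:
    print(precip, pet, round(pike_actual_et(precip, pet), 1))
```

In this form the estimate stays below both precipitation (the water limit) and potential evapotranspiration (the energy limit), approaching precipitation in dry conditions where P is much smaller than PET.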

Top-cited authors