Recent advances in computing science and web technology provide the environmental community with continuously expanding resources for data collection and analysis that pose unprecedented challenges to the design of analysis methods, workflows, and interaction with data sets. In the light of the recent UK Research Council-funded Environmental Virtual Observatory pilot project, this paper gives an overview of currently available implementations of web-based technologies for processing large and heterogeneous datasets and discusses their relevance to environmental data processing, simulation and prediction. We found that the processing of the simple datasets used in the pilot proved to be relatively straightforward using a combination of R, RPy2, PyWPS and PostgreSQL. However, the use of NoSQL databases and more versatile frameworks, such as implementations based on OGC standards, may provide a wider and more flexible set of features that particularly facilitate working with larger volumes and more heterogeneous data sources.
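To make the toolchain named above concrete, the sketch below shows a web processing service (assuming PyWPS 4.x) whose handler hands its input to R through rpy2 and returns a summary statistic. The process identifier, inputs and statistic are illustrative rather than the pilot's actual services, and the PostgreSQL retrieval step is omitted.

```python
# Minimal sketch (assumed PyWPS 4.x + rpy2; requires a local R installation):
# a WPS process that delegates a summary statistic to R. Names are illustrative.
from pywps import Process, LiteralInput, LiteralOutput
from rpy2 import robjects


class RMean(Process):
    def __init__(self):
        inputs = [LiteralInput('values', 'Comma-separated numeric values',
                               data_type='string')]
        outputs = [LiteralOutput('mean', 'Mean computed in R', data_type='float')]
        super(RMean, self).__init__(
            self._handler,
            identifier='r_mean',
            title='Mean of a series computed via R',
            inputs=inputs,
            outputs=outputs,
            store_supported=True,
            status_supported=True,
        )

    def _handler(self, request, response):
        # Parse the request, push the values into an R vector, call R's mean()
        raw = request.inputs['values'][0].data
        vec = robjects.FloatVector([float(v) for v in raw.split(',')])
        response.outputs['mean'].data = float(robjects.r['mean'](vec)[0])
        return response
```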
Landslides are a significant hazard in many parts of the world and exhibit a high, and often underestimated, damage potential. Deploying landslide early warning systems is one risk management strategy, amongst others, that can be used to protect local communities. In geotechnical applications, slope stability models play an important role in predicting slope behaviour under external influences; however, they are only rarely incorporated into landslide early warning systems. In this study, the physically based slope stability model CHASM (Combined Hydrology and Stability Model) was initially applied to a reactivated landslide in the Swabian Alb to assess stability conditions and was subsequently integrated into a prototype of a semi-automated landslide early warning system. The results of the CHASM application demonstrate that for several potential shear surfaces the Factor of Safety is relatively low, and subsequent rainfall events could cause instability. To integrate and automate CHASM within an early warning system, international geospatial standards were employed to ensure the interoperability of system components and the transferability of the implemented system as a whole. The CHASM algorithm is run automatically as a web processing service, utilising both fixed, predetermined input data and variable input data, including hydrological monitoring data and quantitative rainfall forecasts. Once pre-defined modelling or monitoring thresholds are exceeded, a web notification service distributes SMS and email messages to relevant experts, who then decide whether to issue an early warning, together with appropriate advice on actions, to local and regional stakeholders. This study successfully demonstrated the potential of this new approach to landslide early warning. Moving from demonstration to the active issuance of early warnings will require the future acquisition of high-quality data on mechanical properties and distributed pore water pressure regimes.
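The notification step of such a system can be sketched in a few lines of Python: when the latest model output or rainfall forecast crosses a pre-defined threshold, a message goes out to the experts on duty. Thresholds, addresses and the mail relay below are hypothetical, and only the e-mail channel of the SMS/e-mail pair is shown; this is a sketch of the pattern, not the implemented web notification service.

```python
import smtplib
from email.message import EmailMessage

# Hypothetical thresholds and recipients; in a real system these would come from
# the early warning system's configuration rather than being hard-coded.
FOS_THRESHOLD = 1.1          # minimum acceptable Factor of Safety
RAIN_THRESHOLD_MM = 40.0     # 24 h quantitative rainfall forecast threshold
EXPERTS = ["duty.geologist@example.org"]


def check_and_notify(factor_of_safety: float, forecast_rain_mm: float) -> bool:
    """Notify the experts if either the modelled Factor of Safety or the rainfall
    forecast crosses its threshold; returns True when a notification was sent."""
    if factor_of_safety >= FOS_THRESHOLD and forecast_rain_mm < RAIN_THRESHOLD_MM:
        return False
    msg = EmailMessage()
    msg["Subject"] = "Landslide EWS: threshold exceeded"
    msg["From"] = "ews@example.org"
    msg["To"] = ", ".join(EXPERTS)
    msg.set_content(
        f"Model run: Factor of Safety = {factor_of_safety:.2f} "
        f"(threshold {FOS_THRESHOLD}), 24 h forecast = {forecast_rain_mm:.1f} mm "
        f"(threshold {RAIN_THRESHOLD_MM}). Please review and decide on a warning."
    )
    with smtplib.SMTP("localhost") as server:   # assumed local mail relay
        server.send_message(msg)
    return True


# Example: a marginally stable slope with a wet forecast triggers a notification.
# check_and_notify(1.05, 55.0)
```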
Spatial interpolation of precipitation data is of great importance for hydrological modelling. Geostatistical methods (kriging) are widely applied to interpolate from point measurements to continuous surfaces. The first step in kriging is semi-variogram modelling, which usually uses only one variogram model for all the data. The objective of this paper was to develop different algorithms of spatial interpolation for daily rainfall on 1 km² regular grids over the catchment area and to compare the results of geostatistical and deterministic approaches. The study used 30 years of daily rainfall data from 70 raingages in the hilly landscape of the Ourthe and Ambleve catchments in Belgium (2908 km²). This area lies between 35 and 693 m in elevation and is drained by river networks that are tributaries of the Meuse River. For the geostatistical algorithms, seven semi-variogram models (logarithmic, power, exponential, Gaussian, rational quadratic, spherical and penta-spherical) were fitted to the sample semi-variogram on a daily basis. These seven variogram models were also adopted to avoid negative interpolated rainfall. Elevation, extracted from a digital elevation model, was incorporated into the multivariate geostatistics. Seven validation raingages and cross-validation were used to compare the interpolation performance of these algorithms applied to different densities of raingages. We found that, among the seven variogram models used, the Gaussian model was most frequently the best fit. Using seven variogram models can avoid negative daily rainfall in ordinary kriging. Negative kriging estimates were observed more often for convective than for stratiform rain. The performance of the different methods varied only slightly with the density of raingages between 8 and 70 raingages, but differed markedly when interpolating with only 4 raingages. Spatial interpolation with the geostatistical and Inverse Distance Weighting (IDW) algorithms considerably outperformed interpolation with the Thiessen polygon method commonly used in various hydrological models. Integrating elevation into Kriging with an External Drift (KED) and Ordinary Cokriging (OCK) did not improve the interpolation accuracy for daily rainfall. Ordinary Kriging (ORK) and IDW were considered the best methods, as they provided the smallest RMSE values in nearly all cases. Care should be taken in applying Universal Kriging (UNK) and KED when interpolating daily rainfall with very few neighbourhood sample points. These recommendations complement results reported in the literature. ORK, UNK and KED gave slightly better results using only the spherical model, whereas OCK achieved better results using the seven variogram models.
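The day-by-day selection among several variogram models can be illustrated with a short sketch that fits three of the seven candidate functions to an empirical semivariogram by least squares and keeps the one with the smallest residual. The empirical values below are invented, and the fitting details (weights, bounds, and the remaining four models) differ from the study's own procedure.

```python
import numpy as np
from scipy.optimize import curve_fit

# Candidate semivariogram models (three of the seven used in the study):
def spherical(h, c0, c, a):
    h = np.asarray(h, dtype=float)
    g = c0 + c * (1.5 * h / a - 0.5 * (h / a) ** 3)
    return np.where(h <= a, g, c0 + c)

def exponential(h, c0, c, a):
    return c0 + c * (1.0 - np.exp(-h / a))

def gaussian(h, c0, c, a):
    return c0 + c * (1.0 - np.exp(-(h / a) ** 2))

# Invented empirical semivariogram for one day: lags (km) and semivariances (mm^2)
lags = np.array([2.0, 5.0, 10.0, 15.0, 20.0, 30.0, 40.0])
gamma = np.array([1.1, 2.4, 4.0, 4.9, 5.4, 5.8, 5.9])

best_name, best_sse, best_params = None, np.inf, None
for name, model in [("spherical", spherical), ("exponential", exponential),
                    ("gaussian", gaussian)]:
    try:
        params, _ = curve_fit(model, lags, gamma, p0=[0.5, 5.0, 20.0],
                              bounds=([0.0, 0.0, 1e-3], [np.inf, np.inf, np.inf]))
    except RuntimeError:
        continue                      # this model did not converge for this day
    sse = np.sum((model(lags, *params) - gamma) ** 2)
    if sse < best_sse:
        best_name, best_sse, best_params = name, sse, params

print("best model for this day:", best_name,
      "(nugget, sill, range) =", np.round(best_params, 2))
```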
High-resolution temperature and precipitation variations and their seasonal extremes since 1500 are presented for the European Alps (43.25–48.25°N, 4.25–16.25°E). The gridded reconstruction has a spatial resolution of 0.5° × 0.5°; monthly grids are reconstructed back to 1659 and seasonal grids back to 1500–1658. The reconstructions are based on a combination of long instrumental station data and documentary proxy evidence, combined using principal component regression analysis. Annual, winter and summer Alpine temperatures indicate a transition from cold conditions prior to 1900 to present-day warmth. Very harsh winters occurred at the turn of the seventeenth century. Warm summers were recorded around 1550, during the second half of the eighteenth century and towards the end of the twentieth century. The years 1994, 2000, 2002, and particularly 2003 were the warmest since 1500. Unlike temperature, precipitation variation over the European Alps showed no significant low-frequency trend, and its uncertainty increases back towards 1500. The years 1540, 1921 and 2003 were very likely the driest in the context of the last 500 years.
Running correlations between the North Atlantic Oscillation Index (NAOI) and the Alpine temperature and precipitation reconstructions demonstrate the importance of this mode in explaining Alpine winter climate over the last centuries. The winter NAOI correlates positively with Alpine temperatures and negatively with precipitation. These correlations, however, are temporally unstable. We conclude that the Alps are situated in a band of varying influence of the NAO, and that other atmospheric circulation modes controlled Alpine temperature and precipitation variability through the recent past.
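Principal component regression of the kind named above can be sketched generically: the predictor matrix (instrumental series and documentary indices) is reduced to its leading principal components over a calibration period, the target is regressed on those components, and the fitted relation is applied to the full predictor record. Everything below, including the number of components retained and the synthetic data, is illustrative and not the authors' calibration.

```python
import numpy as np

def pcr_reconstruct(X_cal, y_cal, X_full, n_pc=3):
    """Generic principal component regression: calibrate y on the leading
    principal components of X over the calibration period, then apply the
    fitted model to the full predictor record."""
    mu, sd = X_cal.mean(axis=0), X_cal.std(axis=0)
    Zc = (X_cal - mu) / sd                      # standardised calibration predictors
    U, s, Vt = np.linalg.svd(Zc, full_matrices=False)
    V = Vt[:n_pc].T                             # loadings of the leading components
    scores_cal = Zc @ V
    A = np.column_stack([np.ones(len(scores_cal)), scores_cal])
    coef, *_ = np.linalg.lstsq(A, y_cal, rcond=None)
    scores_full = ((X_full - mu) / sd) @ V      # project the full record
    return np.column_stack([np.ones(len(scores_full)), scores_full]) @ coef

# Toy example with synthetic predictors and target (300 time steps, 12 predictors)
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 12))
y = X[:, :3].sum(axis=1) + rng.normal(scale=0.5, size=300)
recon = pcr_reconstruct(X[-150:], y[-150:], X, n_pc=3)   # calibrate on the recent half
print("correlation over the full period:", round(np.corrcoef(recon, y)[0, 1], 2))
```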
Rainfall-induced shallow landslides are common phenomena in many parts of the world, affecting cultivation and infrastructure and sometimes causing human losses. Assessing the triggering zones of shallow landslides is fundamental for land planning at different scales. This work defines a reliable methodology to extend a slope stability analysis from the site-specific to local scale by using a well-established physically based model (TRIGRS-unsaturated). The model is initially applied to a sample slope and then to the surrounding 13.4 km² area in Oltrepò Pavese (northern Italy). To obtain more reliable input data for the model, long-term hydro-meteorological monitoring has been carried out at the sample slope, which has been assumed to be representative of the study area. Field measurements identified the triggering mechanism of shallow failures and were used to verify the reliability of the model to obtain pore water pressure trends consistent with those measured during the monitoring activity. In this way, more reliable trends have been modelled for past landslide events, such as the April 2009 event that was assumed as a benchmark. The assessment of shallow landslide triggering zones obtained using TRIGRS-unsaturated for the benchmark event appears good for both the monitored slope and the whole study area, with better results when a pedological instead of geological zoning is considered at the regional scale. The sensitivity analyses of the influence of the soil input data show that the mean values of the soil properties give the best results in terms of the ratio between the true positive and false positive rates. The scheme followed in this work allows us to obtain better results in the assessment of shallow landslide triggering areas in terms of the reduction in the overestimation of unstable zones with respect to other distributed models applied in the past.
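As a point of reference for how pore water pressure enters a slope stability calculation, the classical infinite-slope factor of safety can be written in a few lines; this is the textbook expression, not the internal formulation of TRIGRS-unsaturated, and the parameter values are invented.

```python
import math

def infinite_slope_fs(c_eff, phi_eff_deg, gamma_soil, depth, slope_deg, pore_pressure):
    """Classical infinite-slope factor of safety:
        FS = [c' + (gamma*z*cos^2(beta) - u) * tan(phi')] / (gamma*z*sin(beta)*cos(beta))
    with effective cohesion c' (kPa), friction angle phi' (deg), unit weight gamma
    (kN/m^3), slip depth z (m), slope angle beta (deg) and pore pressure u (kPa)."""
    beta = math.radians(slope_deg)
    phi = math.radians(phi_eff_deg)
    shear_stress = gamma_soil * depth * math.sin(beta) * math.cos(beta)
    effective_normal = gamma_soil * depth * math.cos(beta) ** 2 - pore_pressure
    return (c_eff + effective_normal * math.tan(phi)) / shear_stress

# Invented parameters: the same slope before and after a rise in pore water pressure
print(infinite_slope_fs(5.0, 26.0, 19.0, 3.0, 25.0, pore_pressure=0.0))   # drier case
print(infinite_slope_fs(5.0, 26.0, 19.0, 3.0, 25.0, pore_pressure=20.0))  # wetter case
```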
Global loss of life from landslides is poorly quantified. A global data set of fatalities from nonseismically triggered landslides that resulted in loss of life between A.D. 2004 and 2010 permits, for the first time, proper quantification of impacts and spatial distributions. In total, 2620 fatal landslides were recorded worldwide during the seven-year period of the study, causing a total of 32,322 recorded fatalities. These totals of landslides and victims are an order of magnitude greater than other data sets have indicated, but analysis of the data suggests that they may still slightly underestimate the true human costs. The majority of human losses occur in Asia, especially along the Himalayan Arc and in China. This geographical concentration dominates the annual landslide cycle, which peaks in the Northern Hemisphere summer months. Finally, the number of fatalities per event follows a fat-tailed power-law distribution, and the density of landslides is moderately correlated with population density on a national basis.
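The fat-tailed claim can be illustrated with the standard maximum-likelihood estimator of a power-law exponent above a chosen lower cut-off (the continuous-data approximation of Clauset et al.). The event sizes below are synthetic, so the numbers say nothing about the actual data set.

```python
import numpy as np

def powerlaw_alpha(x, xmin):
    """Continuous-data MLE of the power-law exponent alpha for the tail x >= xmin:
        alpha_hat = 1 + n / sum(ln(x_i / xmin))
    (an approximation here, since real fatality counts are discrete integers)."""
    tail = np.asarray(x, dtype=float)
    tail = tail[tail >= xmin]
    return 1.0 + len(tail) / np.log(tail / xmin).sum()

# Synthetic "fatalities per event": a Pareto tail above xmin = 1;
# shape a = 1.5 corresponds to a tail exponent alpha = a + 1 = 2.5
rng = np.random.default_rng(42)
events = rng.pareto(1.5, size=2620) + 1.0
print("estimated exponent:", round(powerlaw_alpha(events, xmin=1.0), 2))
```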
The general linear model encompasses statistical methods such as regression and analysis of variance (anova) which are commonly used by soil scientists. The standard ordinary least squares (OLS) method for estimating the parameters of the general linear model is a design-based method that requires that the data have been collected according to an appropriate randomized sample design. Soil data are often obtained by systematic sampling on transects or grids, so OLS methods are not appropriate.
Parameters of the general linear model can be estimated from systematically sampled data by model-based methods: the parameters of a model of the covariance structure of the errors are estimated first and are then treated as known when the remaining parameters of the model are estimated. Residual maximum likelihood (REML) is the preferred way to estimate the variance parameters because, unlike maximum likelihood, it accounts for the degrees of freedom used in estimating the fixed effects and so gives less biased estimates. We present the REML solution to this problem. We then demonstrate how REML can be used to estimate parameters for regression and anova-type models using data from two systematic surveys of soil.
We compare an efficient, gradient-based implementation of REML (ASReml) with an implementation that uses simulated annealing. In general the results were very similar; where they differed, the error covariance model had a spherical variogram function, which can have local optima in its likelihood function. In these cases the simulated annealing results were better than those of the gradient method because simulated annealing is good at escaping local optima.
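A bare-bones sketch of the model-based route is given below: the REML log-likelihood of a linear model with spatially correlated errors (here an assumed exponential correlation plus nugget on a synthetic transect) is maximised numerically, and the fixed effects are then estimated by generalised least squares with the fitted covariance. It is a sketch under those assumptions only, not ASReml, the simulated annealing implementation, or the spherical-variogram case discussed in the paper.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.spatial.distance import cdist

def reml_neg_loglik(log_theta, y, X, D):
    """Negative REML log-likelihood (up to a constant) for y = X b + e, e ~ N(0, V),
    with V = s2_nugget * I + s2_spatial * exp(-D / range)."""
    s2_n, s2_s, rng_par = np.exp(log_theta)
    V = s2_n * np.eye(len(y)) + s2_s * np.exp(-D / rng_par)
    _, logdetV = np.linalg.slogdet(V)
    Vi = np.linalg.inv(V)
    XtViX = X.T @ Vi @ X
    _, logdetX = np.linalg.slogdet(XtViX)
    beta = np.linalg.solve(XtViX, X.T @ Vi @ y)   # GLS fixed effects for this theta
    r = y - X @ beta
    return 0.5 * (logdetV + logdetX + r @ Vi @ r)

# Synthetic systematic transect: 60 points every 10 m, linear trend + correlated error
rng = np.random.default_rng(3)
coords = np.arange(60, dtype=float)[:, None] * 10.0
D = cdist(coords, coords)
X = np.column_stack([np.ones(60), coords[:, 0] / 100.0])
V_true = 0.2 * np.eye(60) + 1.0 * np.exp(-D / 30.0)
y = X @ np.array([5.0, 0.8]) + np.linalg.cholesky(V_true) @ rng.normal(size=60)

fit = minimize(reml_neg_loglik, x0=np.log([0.5, 0.5, 20.0]), args=(y, X, D),
               method="Nelder-Mead")
s2_n, s2_s, rng_est = np.exp(fit.x)
V_hat = s2_n * np.eye(60) + s2_s * np.exp(-D / rng_est)
Vi = np.linalg.inv(V_hat)
beta_gls = np.linalg.solve(X.T @ Vi @ X, X.T @ Vi @ y)
print("variance parameters (nugget, spatial, range):", np.round([s2_n, s2_s, rng_est], 2))
print("GLS estimates of the fixed effects:", np.round(beta_gls, 2))
```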
The methods kriging with external drift (KED) and indicator kriging with external drift (IKED) are used for the spatial interpolation of hourly rainfall from rain gauges, using additional information from radar, daily precipitation from a denser network, and elevation. The techniques are illustrated using data from the storm period of 10 to 13 August 2002 that led to the extreme flood event in the Elbe river basin in Germany. Cross-validation is applied to compare the interpolation performance of the KED and IKED methods using different additional information with the univariate reference methods nearest neighbour (NN, or Thiessen polygons), inverse square distance weighting (IDW), ordinary kriging (OK) and ordinary indicator kriging (IK). Special attention is given to the analysis of the impact of semivariogram estimation on interpolation performance. Hourly and average semivariograms are inferred from daily, hourly and radar data, considering either isotropic or anisotropic behaviour and using automatic and manual fitting procedures. The multivariate methods KED and IKED clearly outperform the univariate ones, with the most important additional information being radar, followed by precipitation from the daily network and elevation, which plays only a secondary role here. The best performance is achieved when all additional information is used simultaneously with KED. The indicator-based kriging methods provide, in some cases, smaller root mean square errors than the methods that use the original data, but at the expense of a significant loss of variance. The impact of the semivariogram on interpolation performance is not very high. The best results are obtained using an automatic fitting procedure with isotropic variograms inferred either from hourly or radar data.
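The mechanics of KED can be seen in a minimal single-point solver: the ordinary kriging system is extended by one row and column so that the weights also reproduce the external drift variable (here a radar value) at the target location. The exponential semivariogram, coordinates and drift values below are invented, and the study's hourly and daily variogram inference, anisotropy and indicator variant are not reproduced.

```python
import numpy as np

def gamma_exp(h, nugget=0.1, sill=1.0, rng=20.0):
    """Assumed exponential semivariogram (parameters invented); gamma(0) = 0."""
    h = np.asarray(h, dtype=float)
    return np.where(h > 0.0, nugget + sill * (1.0 - np.exp(-h / rng)), 0.0)

def ked_point(xy, z, drift, xy0, drift0):
    """Kriging with external drift at a single target point.
    xy: (n, 2) gauge coordinates, z: gauge rainfall, drift: external variable
    (e.g. radar estimate) at the gauges, xy0/drift0: target location and its drift."""
    n = len(z)
    H = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    h0 = np.linalg.norm(xy - xy0, axis=1)
    A = np.zeros((n + 2, n + 2))
    A[:n, :n] = gamma_exp(H)
    A[:n, n] = A[n, :n] = 1.0             # unbiasedness constraint
    A[:n, n + 1] = A[n + 1, :n] = drift   # drift reproduction constraint
    b = np.concatenate([gamma_exp(h0), [1.0, drift0]])
    lam = np.linalg.solve(A, b)[:n]       # kriging weights
    return float(lam @ z)

# Invented example: five gauges (km coordinates), hourly rainfall (mm), radar values
xy = np.array([[0.0, 0.0], [8.0, 2.0], [3.0, 9.0], [12.0, 7.0], [6.0, 15.0]])
z = np.array([2.0, 3.5, 1.2, 4.1, 0.8])
radar = np.array([1.8, 3.9, 1.0, 4.5, 0.9])
print("KED estimate:", round(ked_point(xy, z, radar, np.array([6.0, 6.0]), 2.6), 2))
```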
Data repositories such as the USGS National Water Information System (NWIS) and the EPA Storage and Retrieval System (STORET) offer a considerable volume of data for researchers and engineers in the United States through their websites. While accessible through a web browser, data from these sources cannot be directly ingested by modeling or analysis tools without human intervention. Differing input/output formats, syntax and terminology make data discovery and retrieval a major time sink. This paper examines the web services developed as part of the Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI) Hydrologic Information System (HIS) project as a means to standardize access to hydrologic data repositories, facilitate data discovery and enable direct machine-to-machine communication.
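The kind of direct machine-to-machine access the paper argues for looks roughly like the sketch below, which pulls daily discharge for one gauge from the public USGS NWIS REST water services; the endpoint, parameters and JSON layout reflect that public interface as best recalled and may differ, and the CUAHSI WaterOneFlow services themselves are SOAP-based with analogous GetValues/GetSiteInfo operations.

```python
import requests

# Retrieval of daily discharge for one USGS gauge via the public NWIS water
# services (endpoint and parameter names may have changed; inspect the response
# if the JSON layout differs from the WaterML-style structure assumed below).
URL = "https://waterservices.usgs.gov/nwis/dv/"
params = {
    "format": "json",
    "sites": "01646500",        # Potomac River near Washington, DC (example gauge)
    "parameterCd": "00060",     # discharge, cubic feet per second
    "startDT": "2010-01-01",
    "endDT": "2010-01-31",
}
resp = requests.get(URL, params=params, timeout=30)
resp.raise_for_status()
for ts in resp.json()["value"]["timeSeries"]:
    name = ts["variable"]["variableName"]
    values = ts["values"][0]["value"]
    print(name, "first observation:", values[0]["dateTime"], values[0]["value"])
```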
This paper presents three multivariate geostatistical algorithms for incorporating a digital elevation model into the spatial prediction of rainfall: simple kriging with varying local means; kriging with an external drift; and colocated cokriging. The techniques are illustrated using annual and monthly rainfall observations measured at 36 climatic stations in a 5000 km² region of Portugal. Cross-validation is used to compare the prediction performances of the three geostatistical interpolation algorithms with the straightforward linear regression of rainfall against elevation and three univariate techniques: the Thiessen polygon; inverse square distance; and ordinary kriging. Larger prediction errors are obtained for the two algorithms (inverse square distance, Thiessen polygon) that ignore both the elevation and the rainfall records at surrounding stations. The three multivariate geostatistical algorithms outperform the other interpolators, in particular the linear regression, which stresses the importance of accounting for spatially dependent rainfall observations in addition to the colocated elevation. Last, ordinary kriging yields more accurate predictions than linear regression when the correlation between rainfall and elevation is moderate (less than 0.75 in the case study).
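The first of the three algorithms, simple kriging with varying local means, can be sketched compactly: rainfall is regressed on the colocated elevation, the regression prediction supplies the local mean everywhere, and the regression residuals are kriged with zero mean and added back. The covariance model and all numbers below are invented, and the KED and colocated cokriging variants are not shown.

```python
import numpy as np

def cov_exp(h, sill=1.0, rng=15.0):
    """Assumed exponential covariance for the regression residuals (invented)."""
    return sill * np.exp(-np.asarray(h, dtype=float) / rng)

def sk_varying_local_means(xy, rain, elev, xy0, elev0):
    """Simple kriging with varying local means: a linear regression of rainfall on
    elevation provides the local mean; the residuals are kriged with known zero mean."""
    # 1. local means from the rainfall-elevation regression
    G = np.column_stack([np.ones(len(rain)), elev])
    coef, *_ = np.linalg.lstsq(G, rain, rcond=None)
    resid = rain - G @ coef
    mean0 = coef[0] + coef[1] * elev0
    # 2. simple kriging of the residuals
    H = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    h0 = np.linalg.norm(xy - xy0, axis=1)
    lam = np.linalg.solve(cov_exp(H), cov_exp(h0))   # simple kriging weights
    return float(mean0 + lam @ resid)

# Invented example: five stations (km coordinates), annual rainfall (mm), elevation (m)
xy = np.array([[0.0, 0.0], [10.0, 4.0], [4.0, 12.0], [15.0, 9.0], [8.0, 18.0]])
rain = np.array([900.0, 1050.0, 980.0, 1200.0, 1100.0])
elev = np.array([150.0, 320.0, 240.0, 520.0, 410.0])
print(round(sk_varying_local_means(xy, rain, elev, np.array([7.0, 8.0]), 300.0), 1))
```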
Increased landslide activity is commonly listed as an expected impact of human-induced climate change. This paper examines the theoretical and empirical bases for this assertion. It identifies the mechanisms by which climate can induce landsliding and examines the manner in which these mechanisms may respond to changes in a range of climatic parameters. It is argued that inherent limiting stability factors, which vary with different terrain conditions and landslide types, ultimately govern the nature of response to changing climate. Several modelling approaches are evaluated on the basis of their potential to predict landslide response to climate projections. Given reliable input data of appropriate form and resolution, the existing slope stability, hydrological, and statistical models are for the most part capable of yielding useful prognoses on occurrence, reactivation, magnitude and frequency of landsliding. While there is a strong theoretical basis for increased landslide activity as a result of predicted climate change, there remains a high level of uncertainty resulting from the margins of error inherent in scenario-driven global climate predictions, and the lack of sufficient spatial resolution of currently available downscaled projections. Examples from New Zealand are used to illustrate the extent to which changes resulting from human activity have affected slope stability. Changes resulting from human activity are seen as a factor of equal, if not greater, importance than climate change in affecting the temporal and spatial occurrence of landslides.
The variogram is essential for local estimation and mapping of any variable by kriging. The variogram itself must usually be estimated from sample data. The sampling density is a compromise between precision and cost, but it must be sufficiently dense to encompass the principal spatial sources of variance. A nested, multi-stage sampling scheme with separating distances increasing in geometric progression from stage to stage will do that. The data may then be analyzed by a hierarchical analysis of variance to estimate the components of variance for every stage, and hence lag. By accumulating the components, starting from the shortest lag, one obtains a rough variogram for modest effort. For balanced designs the analysis of variance is optimal; for unbalanced ones, however, these estimators are not necessarily the best, and analysis by residual maximum likelihood (reml) will usually be preferable. The paper summarizes the underlying theory and illustrates its application with data from three surveys: one in which the design had four stages and was balanced, and two implemented with unbalanced designs to economize when there were more stages. A Fortran program is available for the analysis of variance, and code for the reml analysis is listed in the paper.
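The balanced case can be sketched numerically: simulate a three-stage balanced nested design, form the hierarchical analysis of variance, convert the mean squares to components by the usual method-of-moments identities, and accumulate the components from the shortest lag upwards to obtain the rough variogram. The stage structure, lags and variance components below are hypothetical.

```python
import numpy as np

# Hypothetical balanced nested design: 9 main centres (stage 1), 2 substations per
# centre (stage 2), 2 sampling points per substation (stage 3); lags shorten in
# geometric progression from stage 1 to stage 3.
rng = np.random.default_rng(0)
a, b, c = 9, 2, 2                       # levels per stage
s1, s2, s3 = 1.0, 0.6, 0.3              # "true" components used to simulate data
y = (rng.normal(0.0, np.sqrt(s1), (a, 1, 1))
     + rng.normal(0.0, np.sqrt(s2), (a, b, 1))
     + rng.normal(0.0, np.sqrt(s3), (a, b, c)))

grand = y.mean()
m1 = y.mean(axis=(1, 2))                # stage-1 (centre) means
m2 = y.mean(axis=2)                     # stage-2 (substation) means

# Sums of squares, degrees of freedom and mean squares for the balanced hierarchy
ms1 = b * c * ((m1 - grand) ** 2).sum() / (a - 1)
ms2 = c * ((m2 - m1[:, None]) ** 2).sum() / (a * (b - 1))
ms3 = ((y - m2[:, :, None]) ** 2).sum() / (a * b * (c - 1))

# ANOVA (method-of-moments) estimators of the variance components; note that the
# differences can come out negative by chance, one motivation for reml instead.
c3 = ms3                                # finest stage
c2 = (ms2 - ms3) / c
c1 = (ms1 - ms2) / (b * c)

# Accumulating the components from the shortest lag upwards gives a rough variogram.
print("components (fine to coarse):", np.round([c3, c2, c1], 3))
print("accumulated rough variogram:", np.round(np.cumsum([c3, c2, c1]), 3))
```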
Considering the scope of water resources research, data can be available from many different sources that use different nomenclature, storage technologies, interfaces and even languages, which makes data discovery a hard and time-consuming task. This paper addresses the development of an ontology-aided, clustered search mechanism that enables querying multiple hydrologic and environmental data repositories through a single interface, regardless of the heterogeneity that exists between these sources.
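In outline, an ontology-aided search maps the user's concept onto the term each repository actually understands before the query is dispatched; the tiny sketch below illustrates that mapping step only, with invented repository names and vocabularies.

```python
# Hypothetical sketch of ontology-aided search: a concept from a small ontology is
# expanded into the keyword each repository uses, so one query fans out to several
# heterogeneous catalogues. Repository names and vocabularies are invented.
ONTOLOGY = {
    "streamflow": {
        "synonyms": ["discharge", "flow rate", "river flow"],
        "keywords": {                    # per-repository controlled vocabulary
            "NWIS": "00060",             # NWIS parameter code for discharge
            "STORET": "Flow",
            "LocalCatalogue": "river_discharge_m3s",
        },
    },
}

def expand_query(concept: str) -> dict:
    """Map a user concept onto the search term each repository expects."""
    return ONTOLOGY[concept.lower()]["keywords"]

if __name__ == "__main__":
    for repo, term in expand_query("streamflow").items():
        print(f"query {repo} for '{term}'")
```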
Geostatistics is essential for environmental scientists. Weather and climate vary from place to place, soil varies at every scale at which it is examined, and even man-made attributes, such as the distribution of pollution, vary. The techniques used in geostatistics are ideally suited to the needs of environmental scientists, who use them to make the best of sparse data for prediction and to plan future surveys when resources are limited. Geostatistical technology has advanced much in the last few years and many of these developments are being incorporated into the practitioner's repertoire. This second edition describes these techniques for environmental scientists. Topics such as stochastic simulation, sampling, data screening, spatial covariances, the variogram and its modeling, and spatial prediction by kriging are described in rich detail. At each stage the underlying theory is fully explained, and the rationale behind the choices is given, allowing the reader to appreciate the assumptions and constraints involved.