Notes on the document “Stochastic modelling demystified” by S.M. Papalexiou, 2010
The following scanned 14-page document was submitted to and verified by a lawyer on June 30, 2010, in
Athens, Greece. The first and last pages (in Greek) provide legal details. Pages 2-13 (in English)
describe a general multivariate and cyclostationary stochastic modeling scheme that makes it possible to
generate synthetic time series having any marginal distribution and correlation structure. The purpose
of this legal document was to establish copyright for software that would use this method to generate
time series.
At that time (2009) I was not aware that similar ideas had been used in other research fields; for
example, a simple univariate case with continuous marginals can be found as early as 1975 in Li &
Hammond. However, these schemes and those published later were not general and flexible enough, or
easy enough to apply, to deal with intermittent processes, and it seems that this type of scheme was
totally neglected in hydroclimatology. Originally, I developed this framework in 2009 to perform a
multivariate and cyclostationary simulation of daily rainfall at 13 stations in Greece for the needs of a
small research project. I presented the simulation results in Chapters 3-6 of the 2009 project report (in
Greek), while the precise method was described in the legal document that follows. The simulation used
mixed-type marginals, with the Burr type XII distribution describing nonzero precipitation, and a MAR(1)
model to preserve correlations.
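The idea behind that simulation can be illustrated with a minimal univariate sketch; it is not the code of the legal document or of the 2009 project. It assumes a single-site Gaussian AR(1) parent process in place of the MAR(1) (taken here to mean a multivariate lag-1 autoregressive model), the standard three-parameter Burr type XII form F(x) = 1 - (1 + (x/scale)^c)^(-k), and hypothetical parameter values. It also takes the parent lag-1 correlation `rho_z` as given instead of solving for the value that reproduces a target correlation. All function names are illustrative.

```python
import math
import numpy as np

def gaussian_ar1(n, rho, rng):
    """Standard Gaussian AR(1) series with lag-1 autocorrelation rho."""
    z = np.empty(n)
    z[0] = rng.standard_normal()
    # Innovations scaled so that the series keeps unit variance.
    eps = rng.standard_normal(n) * math.sqrt(1.0 - rho**2)
    for t in range(1, n):
        z[t] = rho * z[t - 1] + eps[t]
    return z

def burr12_quantile(u, scale, c, k):
    """Inverse of F(x) = 1 - (1 + (x/scale)**c)**(-k) (assumed parameterization)."""
    return scale * ((1.0 - u) ** (-1.0 / k) - 1.0) ** (1.0 / c)

def mixed_quantile(u, p0, scale, c, k):
    """Quantile of a mixed-type marginal: probability mass p0 at zero, Burr XII above."""
    u = np.asarray(u, dtype=float)
    x = np.zeros_like(u)
    wet = u > p0
    # Rescale the "wet" probabilities to (0, 1) before inverting the Burr XII CDF.
    x[wet] = burr12_quantile((u[wet] - p0) / (1.0 - p0), scale, c, k)
    return x

def simulate(n, rho_z, p0, scale, c, k, seed=0):
    """Transform a parent Gaussian AR(1) series into an intermittent series."""
    rng = np.random.default_rng(seed)
    z = gaussian_ar1(n, rho_z, rng)
    # Standard normal CDF maps the parent series to uniform probabilities.
    u = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    u = np.clip(u, 0.0, 1.0 - 1e-12)  # guard against u == 1 in the quantile
    return mixed_quantile(u, p0, scale, c, k)

# Hypothetical parameter values, chosen only for illustration.
x = simulate(n=20000, rho_z=0.7, p0=0.8, scale=5.0, c=1.2, k=3.0)
print("fraction of zeros:", np.mean(x == 0.0))
print("lag-1 correlation:", np.corrcoef(x[:-1], x[1:])[0, 1])
```

The generated series has roughly the prescribed proportion of zeros, while its lag-1 correlation is generally lower than `rho_z`, because the nonlinear marginal transformation weakens the correlation of the parent Gaussian process.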
For several unfortunate personal reasons, I published the complete work only many years later (July 21,
2017) as an arXiv preprint (Papalexiou, 2017), and a few months later in Advances in Water Resources
(Papalexiou, 2018). The method also evolved into a disaggregation framework preserving marginals
and correlations (DiPMaC), including nonstationary simulations (Papalexiou et al., 2018). I attempted to
unify and extend these methods and provide a simple framework for univariate and multivariate
modeling preserving any continuous, discrete, binary or mixed-type marginals having any valid
autocorrelation structure (including long memory). The focus of this work was specifically on
hydroclimatic variables such as precipitation, streamflow, wind, humidity, etc.
I release this legal document as a personal note and didactic case. The initial idea back in 2009 for
commercial software that would implement this method for time series generation evolved into
something “better”, that is, an open-source R package named CoSMoS, freely available on CRAN.
References
Papalexiou, S. M. (2018). Unified theory for stochastic modelling of hydroclimatic processes: Preserving
marginal distributions, correlation structures, and intermittency. Advances in Water Resources,
115, 234–252. https://doi.org/10.1016/j.advwatres.2018.02.013
Papalexiou, S. M., Markonis, Y., Lombardo, F., AghaKouchak, A., & Foufoula-Georgiou, E. (2018). Precise
Temporal Disaggregation Preserving Marginals and Correlations (DiPMaC) for Stationary and
Nonstationary Processes. Water Resources Research. https://doi.org/10.1029/2018WR022726
Papalexiou, S. M. (2017). A unified theory for exact stochastic modelling of univariate and multivariate
processes with continuous, mixed type, or discrete marginal distributions and any correlation
structure. ArXiv:1707.06842 [Math, Stat]. Retrieved from http://arxiv.org/abs/1707.06842
Li, S. T., & Hammond, J. L. (1975). Generation of Pseudorandom Numbers with Specified Univariate
Distributions and Correlation Coefficients. IEEE Transactions on Systems, Man, and Cybernetics,
SMC-5(5), 557–561. https://doi.org/10.1109/TSMC.1975.5408380
Hydroclimatic processes come in all “shapes and sizes”. They are characterized by different spatiotemporal correlation structures and probability distributions that can be continuous, mixed-type, discrete or even binary. Simulating such processes by reproducing precisely their marginal distribution and linear correlation structure, including features like intermittency, can greatly improve hydrological analysis and design. Traditionally, modelling schemes are case specific and typically attempt to preserve few statistical moments providing inadequate and potentially risky distribution approximations. Here, a single framework is proposed that unifies, extends, and improves a general-purpose modelling strategy, based on the assumption that any process can emerge by transforming a specific “parent” Gaussian process. A novel mathematical representation of this scheme, introducing parametric correlation transformation functions, enables straightforward estimation of the parent-Gaussian process yielding the target process after the marginal back transformation, while it provides a general description that supersedes previous specific parameterizations, offering a simple, fast and efficient simulation procedure for every stationary process at any spatiotemporal scale. This framework, also applicable for cyclostationary and multivariate modelling, is augmented with flexible parametric correlation structures that parsimoniously describe observed correlations. Real-world simulations of various hydroclimatic processes with different correlation structures and marginals, such as precipitation, river discharge, wind speed, humidity, extreme events per year, etc., as well as a multivariate example, highlight the flexibility, advantages, and complete generality of the method.
Hydroclimatic processes are characterized by heterogeneous spatiotemporal correlation structures and marginal distributions that can be continuous, mixed-type, discrete or even binary. Simulating such processes exactly can greatly improve hydrological analysis and design. Yet this challenging task is often accomplished by ad hoc and approximate methodologies that are devised for specific variables and purposes. In this study, a single framework is proposed allowing the exact simulation of processes with any marginal and any correlation structure. We unify, extend, and improve a general-purpose modelling strategy based on the assumption that any process can emerge by transforming a parent Gaussian process with a specific correlation structure. A novel mathematical representation of the parent-Gaussian scheme provides a consistent and fully general description that supersedes previous specific parameterizations, resulting in a simple, fast and efficient simulation procedure for every spatiotemporal process. In particular, by introducing a simple but flexible procedure we obtain a parametric expression of the correlation transformation function, which makes it possible to assess the correlation structure of the parent-Gaussian process that yields the prescribed correlation of the target process after marginal back transformation. The same framework is also applicable for cyclostationary and multivariate modelling. The simulation of a variety of hydroclimatic variables with very different correlation structures and marginals, such as precipitation, stream flow, wind speed, humidity, extreme events per year, etc., as well as a multivariate application, highlights the flexibility, advantages, and complete generality of the proposed methodology.
Hydroclimatic variables such as precipitation and temperature are often measured or simulated by climate models at coarser spatiotemporal scales than those needed for operational purposes. This has motivated more than half a century of research in developing disaggregation methods that break down coarse-scale time series into finer scales, with two primary objectives: (a) reproducing the statistical properties of the fine-scale process and (b) preserving the original coarse-scale data. Existing methods either preserve a limited number of statistical moments at the fine scale, which is often insufficient and can lead to an unrepresentative approximation of the actual marginal distribution, or are based on a limited number of a priori distributional assumptions, for example, lognormal. Additionally, they are not able to account for potential nonstationarity in the underlying fine-scale process. Here we introduce a novel disaggregation method, named Disaggregation Preserving Marginals and Correlations (DiPMaC), that is able to disaggregate a coarse-scale time series to any finer scale, while reproducing the probability distribution and the linear correlation structure of the fine-scale process. DiPMaC is also generalized for arbitrary nonstationary scenarios to reproduce time varying marginals. Additionally, we introduce a computationally efficient algorithm, based on Bernoulli trials, to optimize the disaggregation procedure and guarantee preservation of the coarse-scale values. We focus on temporal disaggregation and demonstrate the method by disaggregating monthly precipitation to hourly, and time series with trends (e.g., climate model projections), while we show its potential to disaggregate based on general nonstationary scenarios. The example applications demonstrate the performance and robustness of DiPMaC.
This correspondence presents a procedure for generating correlated random variables with specified non-Gaussian probability distribution functions (pdf's) such as might be required for Monte Carlo simulation studies. Specifically, a method is presented for generating an arbitrary number of pseudorandom numbers each with a prescribed probability distribution and with a prescribed correlation coefficient matrix for the collection of random numbers. Collections of typical numbers generated with the method are evaluated with chi-squared tests for the distribution functions and with confidence intervals for the correlation coefficients derived from maximum likelihood estimates. In all cases tested the generated numbers passed the tests.