Scientific Reports | 7: 5449 | DOI:10.1038/s41598-017-05822-y
www.nature.com/scientificreports
Can weather generation capture precipitation patterns across different climates, spatial scales and under data scarcity?
Korbinian Breinl1, Giuliano Di Baldassarre1, Marc Girons Lopez2, Michael Hagenlocher3, Giulia Vico4 & Anna Rutgersson1
Stochastic weather generators can generate very long time series of weather patterns, which are
indispensable in earth sciences, ecology and climate research. Yet, both their potential and limitations
remain largely unclear because past research has typically focused on eclectic case studies at small
spatial scales in temperate climates. In addition, stochastic multi-site algorithms are usually not publicly
available, making the reproducibility of results difficult. To overcome these limitations, we investigated
the performance of the reduced-complexity multi-site precipitation generator TripleM across three
different climatic regions in the United States. By resampling observations, we investigated for the
first time the performance of a multi-site precipitation generator as a function of the extent of the
gauge network and the network density. The definition of the role of the network density provides new
insights into the applicability in data-poor contexts. The performance was assessed using nine different
statistical metrics with main focus on the inter-annual variability of precipitation and the lengths of
dry and wet spells. Among our study regions, our results indicate a more accurate performance in wet
temperate climates compared to drier climates. Performance deficits are more marked at larger spatial
scales due to the increasing heterogeneity of climatic conditions.
Precipitation is a key component of the water cycle, which in turn affects terrestrial ecosystems, agricultural
production and human well-being. Access to long precipitation time series is crucial for many ecological, agricultural
or hydrological studies, as well as for public health and climate research1, 2. Many regions lack such wealth of data,
so that realistic simulations of precipitation patterns are needed. Simulations have to preserve the spatial and
temporal dynamics as well as the correlation structures of precipitation patterns and their variability as they are
fundamental for impact analyses3.
Precipitation patterns can be simulated either with numerical weather prediction models or stochastic
algorithms. These methods are complementary and have specific advantages and drawbacks. Numerical weather
prediction models include a physical description of the entire atmosphere and its interaction with the land surface,
often also including oceans and vegetation, making the simulated fields physically consistent. This however leads
to high computational costs and potential limitations in both the number of simulations that can be generated
and their spatial resolution. Typically, the feasible spatial resolution is coarser than required for most impact
assessments. Moreover, the accuracy of precipitation fields produced by such models can suffer from spatiotemporal
and amplitude errors depending on the model physics, dynamics and model configuration4, 5.
Stochastic algorithms, in contrast, require considerably less computational effort and can therefore easily
provide long time series. Multi-site stochastic precipitation generators are mathematical algorithms for producing
synthetic precipitation based on multiple ground observation sites (i.e. precipitation gauges). They can
simulate precipitation patterns in space and time similar to the actual observations. Several algorithms exist, often
embedded in weather generators for various climate variables. Stochastic precipitation generators can be used for
downscaling of numerical weather models and for climate projections6–15, flood and drought assessments16–21,
agricultural studies22–24, food security25, 26, as well as public27–29 and veterinary health30. The main drawback of
statistical methods is that, although they preserve spatial and temporal correlation structures, they cannot,
unlike numerical weather prediction, simulate the associated large-scale dynamics leading to temperature and
precipitation variabilities.
1Department of Earth Sciences, Uppsala University, Villavägen 16, 75236, Uppsala, Sweden. 2Department of
Geography, University of Zurich, Winterthurerstrasse 190, 8057, Zurich, Switzerland. 3Institute for Environment and
Human Security, United Nations University (UNU-EHS), UN Campus, Platz der Vereinten Nationen 1, 53113, Bonn,
Germany. 4Department of Crop Production Ecology, Swedish University of Agricultural Sciences, Ulls väg 16, 75007,
Uppsala, Sweden. Correspondence and requests for materials should be addressed to K.B. (email: korbinian.breinl@geo.uu.se)
Received: 15 March 2017
Accepted: 20 June 2017
Published: xx xx xxxx
Despite this limitation, there is undoubtedly potential for more extensive application of multi-site
precipitation generators. Yet surprisingly little knowledge is available regarding their application across spatial scales,
in different climates and under conditions of data scarcity. So far, stochastic multi-site precipitation generators
have been primarily applied at small spatial scales (not exceeding some tens of kilometers), with only a handful
of sites10, 13, 31–37. Only very few authors have focused on larger spatial scales38–41. The majority of these studies has
been carried out in temperate and precipitation-rich climates in developed countries, where dense observation
networks and long time series of reliable climate data are the norm8, 9, 19, 20, 22, 24, 39, 42, 43. Furthermore, the
widespread application of precipitation generators has been limited by the lack of publicly available transparent source
codes and the mathematical complexity of many models, so that setting up a model still requires major efforts.
The complexity of algorithms has been recently identified as an issue by Apel et al.44 and the fragmented body of
knowledge has been critically reviewed by Ailliot et al.45.
Towards an easier and more widespread use of stochastic multi-site precipitation generation, here we first
assess the performance of multi-site precipitation generation across three different climatic zones and across
spatial scales in the United States, from about thirty kilometers to over one thousand kilometers of maximum
extent. Second, we link the density of the observation network to the performance of the precipitation generation,
to provide new insights into model performance under conditions of data scarcity and thus into the applicability
in data-poor regions, such as in emerging economies and developing countries.
Addressing these multiple aspects requires the generation of very large data amounts that go far beyond what
has yet been presented: for our study we generated almost 1.5 million years of synthetic precipitation. For this
reason, we use the latest version of the very fast reduced-complexity stochastic multi-site precipitation generator
TripleM (Multisite Markov Model), which requires only two key parameters for simulating any gauge network
in its simplest setup. Other algorithms require a very large number of parameters that grow exponentially with
the number of gauges42, making comprehensive studies not feasible. To our knowledge, TripleM is the most
straightforward multi-site precipitation generator currently available and thus probably one of very few models
that allows for comprehensive studies.
Data and Experiments
Station-based climate observations. In order to fulfill the objectives of the study, a homogeneous
dataset covering different climatic zones and providing a sufficiently dense observation network is needed. For these
reasons, we use the dataset of daily precipitation observations available for the United States from the Global
Historical Climatology Network - Daily (GHCN-Daily)46 for the 30-year period 1986–2015, which has been
compiled by the National Climatic Data Center (NCDC) (https://www.ncdc.noaa.gov/oa/climate/ghcn-daily/).
From this dataset we selected three study areas representing different climatic conditions, located in the
North-East (NE), South-East (SE) and West (W) of the United States (Fig. 1).
The NE is dominated by a relatively cold climate without any dry season, with evenly distributed monthly
precipitation and warmer summers. The SE is dominated by a temperate and tropical monsoon climate without
any pronounced dry season and moist, hot summers. The W is dominated by an arid and semi-arid climate with
frequent droughts47. It has a marked seasonality in precipitation and includes a temperature gradient with warmer
(South) and colder regions (North). While the inter-annual variability of precipitation is more evenly distributed
over the year in the NE and SE, it has a pronounced annual cycle in the W, reaching comparatively high values in
the winter months. The lengths of dry and wet spells show similar annual cycles in the NE and SE with dry spells
peaking in winter/spring and in the fall. Dry spells are longest in the summer in the W. For the period 1986–2015,
72 precipitation gauges with complete time series are available for the NE, 111 for the SE and 98 for the W.
Design of the experiments. To evaluate model performance under differing conditions of data availability/
scarcity, we investigated four different levels of gauge network densities (Table 1).
The density scenarios are based on actual precipitation gauge network densities of the GHCN-Daily dataset
(1986–2015) in two of the three study areas in the United States (“very high”), the average density over Europe
(“high”) as well as China (“medium”) as an orientation for emerging economies, and the average density on the
African continent (“low”) as an orientation for developing countries (see Figure S1 in the Supplementary
material). The gauges in each scenario were selected subjectively, aiming for networks that are evenly
distributed in space. As each density scenario required a comparable network density for each study
area, the “very high” density scenario could not be examined for the W (Table 1).
For each density scenario, we conducted four separate experiments, each starting at one of the four so-called
‘starting sites’ (located in four different regions of each study area; see red gauges in Fig. 1). The four starting sites
(i.e. four experiments starting in different regions of the study area) were introduced to capture the obviously not
fully homogeneous climate of each study area. Each experiment began by considering a minimum precipitation
gauge network of three sites (i.e. the starting site and its two closest sites), continuously widening up the
precipitation gauge network by adding the next closest precipitation gauge up to the maximum number of gauges
available for each density scenario. For each precipitation gauge network, we simulated 30 different ensembles to
obtain stable results, each time over the 30-year period (i.e. 900 years). In other words, for each density scenario
and starting site in each study area, we performed 30 runs for a network of three gauges, 30 runs for four gauges
and so on, up to 30 runs for all available gauges. For example, the total number of simulated years in the NE for
the density scenario “very high” is 4 (experiments using the four starting sites) × 70 (different precipitation gauge
networks between three and 72 sites) × 30 (different ensembles) × 30 (observation years) = 252,000 years. The
combination of all three study areas (i.e. climates), precipitation gauge network sizes (i.e. spatial scale) and
number of sites (i.e. network density) led to a total of 1,461,600 generated precipitation years.
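The nested-network design and the worked NE example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`haversine_km`, `nested_networks`, `simulated_years`) are hypothetical, and "next closest" is interpreted here as closest to the starting site by great-circle distance, which the text does not specify.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def nested_networks(coords, start, min_size=3):
    """Yield growing gauge networks (lists of gauge indices).

    Assumption: gauges are added in order of distance from the starting site.
    """
    order = sorted(range(len(coords)),
                   key=lambda i: haversine_km(coords[start], coords[i]))
    for size in range(min_size, len(coords) + 1):
        yield order[:size]

def simulated_years(n_gauges, n_starts=4, n_ensembles=30, n_obs_years=30, min_size=3):
    """Simulated years for one study area and one density scenario."""
    n_networks = n_gauges - min_size + 1  # networks of min_size, ..., n_gauges sites
    return n_starts * n_networks * n_ensembles * n_obs_years

# NE, "very high" density: 4 x 70 x 30 x 30
print(simulated_years(72))  # 252000
```

With 72 gauges there are 70 nested networks (three to 72 sites), which reproduces the 252,000 years quoted for the NE.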
We used the semi-parametric multi-site precipitation generator TripleM21, 48, which we optimized for large
gauge networks to perform the experiments. We simulated daily precipitation amounts by a pure
resampling of the observations (bootstrap) to eliminate uncertainties arising from parametric precipitation
sampling. A detailed description of the TripleM algorithm is available in the Methods.
Figure 1. The three study areas in the North-East (NE), South-East (SE) and West (W) of the United States,
including the location of all precipitation gauges available for the period 1986–2015 (grey dots), the starting
sites of the experiments (see section ‘Design of the experiments’), plots of the mean precipitation, annual
standard deviation of precipitation, mean length of dry and wet spells, averaged over all gauges for each month.
The bar/line plots also contain information on the mean annual precipitation (MAP). The map was generated in
ArcGIS 10.2 (http://www.esri.com/), related bar/line plots in MATLAB 2016a (http://www.mathworks.com/).
Study area   Number of gauges per density scenario                                              Maximum gauge
             very high           high                 medium               low                  network extent (km)
             (5,200 km²/gauge)   (11,400 km²/gauge)   (48,000 km²/gauge)   (94,400 km²/gauge)
NE           72                  32                   8                    4                    1,173
SE           111                 52                   12                   6                    1,161
W            not available       98                   23                   12                   1,167
Table 1. Number of gauges for the three study areas, the four simulated precipitation gauge density scenarios
referred to as ‘very high’, ‘high’, ‘medium’ and ‘low’, and the maximum extent of the networks in each scenario.
Results and Discussion
We focus on four key metrics relevant for climate change and climate change impacts studies, namely: (i) the
inter-annual standard deviation of precipitation, (ii) the average maximum length of dry spells (dry periods), (iii)
the mean length of dry spells, and (iv) the average maximum length of wet spells (wet periods). The intra-annual
distribution of precipitation and inter-annual variability in precipitation amounts are key drivers of the
functioning of terrestrial ecosystems, and hence local carbon balance, agricultural production, natural hazards such as
floods and droughts, and have both direct and indirect impacts on human health and well-being49–51. Inter-annual
variations in precipitation and temperature explain on average a third of the global crop yield variability50. Mean
dry spells represent continuing water stress of plants52, while maximum dry spells (ii) are of relevancy for drought
studies. Maximum wet spells (iv) influence floods and, depending on climatic conditions, have a direct impact
on the prevalence of water-related vector-borne diseases, such as Chikungunya53 or Rift Valley fever54, 55, but also
on agriculture56. We assessed the performance of the precipitation generator with focus on the average annual
performance and on the summer (Jun, Jul, Aug) and winter (Dec, Jan, Feb) seasons separately. The precipitation
generator performance was characterized as the relative error between the mean of the 30 simulations for all sites
of each precipitation gauge network and the observations. Since we conducted four experiments with four
starting sites in each study area, in the figures below we show the mean of these four simulations.
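As an illustration of how spell-length metrics and the relative error can be computed, the following Python sketch may help. It is not the authors' code: the function names are hypothetical, a wet day is assumed to be any day with precipitation above a threshold of 0 mm (the text gives no threshold), and the relative error is expressed in percent with negative values indicating underestimation.

```python
import numpy as np

def spell_lengths(precip, wet=True, threshold=0.0):
    """Lengths of consecutive wet (or dry) runs in a daily precipitation series.

    Assumption: a day counts as wet when precipitation exceeds `threshold`.
    """
    state = np.asarray(precip) > threshold
    if not wet:
        state = ~state
    # Pad with zeros so that every run has a detectable start and end
    padded = np.concatenate(([0], state.astype(int), [0]))
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    return ends - starts

def relative_error(simulated, observed):
    """Relative error (%) of the mean simulated metric against the observed value."""
    return 100.0 * (np.mean(simulated) - observed) / observed

daily = [0.0, 0.0, 1.2, 3.0, 0.0, 0.0, 0.0, 0.5]
dry = spell_lengths(daily, wet=False)        # dry spells of 2 and 3 days
print(dry.mean(), dry.max())                 # mean and maximum dry spell length
print(relative_error([9.0, 10.0, 9.5], 10.0))  # -5.0, i.e. a 5% underestimation
```

In the study, the simulated value would be the mean over the 30 ensemble members for all sites of a network; the three-element list above merely stands in for such an ensemble.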
For a more in-depth assessment, we examined five additional standard hydrological metrics: (i) the simulated
mean precipitation, (ii) the daily standard deviation of precipitation, (iii) the mean length of wet spells, (iv) the
lag-1 autocorrelation of precipitation occurrence as well as (v) the cross-correlation of precipitation occurrence
lagged by one day as a proxy for the persistence of weather situations. The results for these metrics are reported
in the Supplementary material.
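The two occurrence-based metrics (iv) and (v) can be illustrated with a short Python sketch. The exact estimators used in the study are not given in the text, so the Pearson correlation of binary occurrence series is an assumption here, and the function names are hypothetical.

```python
import numpy as np

def lag1_autocorrelation(occ):
    """Pearson correlation between a binary occurrence series and itself one day later."""
    occ = np.asarray(occ, dtype=float)
    return np.corrcoef(occ[:-1], occ[1:])[0, 1]

def lag1_cross_correlation(occ_a, occ_b):
    """Pearson correlation between occurrences at site A and next-day occurrences at site B."""
    a = np.asarray(occ_a, dtype=float)
    b = np.asarray(occ_b, dtype=float)
    return np.corrcoef(a[:-1], b[1:])[0, 1]

# Two-day wet and dry blocks: weak positive persistence
site_a = [1, 1, 0, 0, 1, 1, 0, 0]
print(lag1_autocorrelation(site_a))
```

Both functions return NaN for a series without any variation (for example an entirely dry record), which would need separate handling in arid regions.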
Climate and spatial scale. We present the impact of the spatial scale (Figs 2 and 3) for the high density
scenario (see Table 1) for the three climates. The performance generally decreases with increasing gauge network
size. This is expected as in TripleM daily snapshots of precipitation occurrences are first clustered according
to their similarity and then simulated based on a univariate Markov process. A larger extent means larger, less
homogeneous precipitation snapshots and lower performance. For the four metrics, increasing the network size
increases the mean error on average by 3.7% in the NE (from 0.0% with three sites), 6.2% in the SE (from 1.0%
with three sites) and 4.3% in the W (from 5.7% with three sites).
Figure 2. Relative error for all sites in the North-East (blue), the South-East (green) and the West (magenta) for
all months. The error is plotted for four metrics against the maximum extent of each simulated gauge network.
For each study area, the lines show the mean of the four simulations (using four different starting sites; see
Fig. 1).
For the annual performance (Fig.2), the inter-annual standard deviation tends to be underestimated. is is a
typical phenomenon of daily weather generators, referred to as overdispersion. e underestimation is generally
low for the NE and the SE and higher for the W. On average, the underestimation reaches a maximum of 6.4% in
the NE (starting at 3.0% for the smallest network size) and 6.2% (starting at 4.3%) in the SE, with a slightly
decreasing performance towards larger gauge networks. e underestimation in the W increases from 16.7%
for three gauges to 19.6% for all sites. Daily weather generators rely on daily weather scenarios and have thus a
limited capability for reproducing the inter-annual variability. e underestimation is predominately caused by
the resampling approach. e bootstrap only takes into account observations and cannot generate very extreme
events, which underestimates the sampling distribution, especially for small sample sizes. e latter explains
the higher overdispersion in the W where precipitation events are rare. Attempts have been made to overcome
this shortcoming5759. For example, overdispersion could be further reduced also in TripleM-type models by
introducing parametric precipitation sampling with heavy tailed distributions as suggested by Wilks39. However,
tting of parametric precipitation curves in dry areas may be infeasible due to the limited number of precipitation
observations. Seasonal dierences are shown in Fig.3. In the NE and SE, the variability is more underestimated
in the summer. e observed annual standard deviation averaged over the entire precipitation gauge network is
43.8% higher in summer than in winter in the NE and 30.1% higher in the SE. In the W, it is 8.7 times higher in
Figure 3. Relative error for all sites in the North-East (blue), the South-East (green) and the West (magenta)
for the summer (solid lines) and winter season (dashed lines). e error is plotted for four metrics against the
maximum extent of each simulated gauge network. For each study area, the lines show the mean of the four
simulations (using four dierent starting sites; see Fig.1).
www.nature.com/scientificreports/
6
Scientific RepoRts | 7: 5449 | DOI:10.1038/s41598-017-05822-y
winter than in summer due to the predominantly arid summer. It is an inherent property of the bootstrap that
the underestimation decreases exponentially with an increasing variability of the observations. is explains
why seasons with a relatively high inter-annual variability are more strongly underestimated than seasons with a
relatively low variability.
The length of maximum dry spells is slightly overestimated in the NE with 2.1% for three gauges and
underestimated for larger extents, reaching 1.6% at full extent (Fig. 2). The trend is similar for the SE and W, starting
with an overestimation of 1.3% and underestimation of 3.5% respectively and reaching 4.3% and 5.2% at
full extent. Mean dry spells are likewise least underestimated in the NE (0.3% to 4.3%). The underestimation
in the SE and W starts with 1.4% and 3.2%, reaching 9.8% and 7.9% at full extent. The bias for simulating
maximum wet spells is smallest in the NE (1.2% to 2.5%). Maximum wet spells are less well reproduced in the
SE and W and follow similar trends (0.5% and 0.6% to 8.3% and 7.4%).
In the NE and the SE maximum dry spells are better reproduced in winter compared to summer (Fig. 3).
This is related to the persistence of weather events, expressed by the lagged cross-correlation of the precipitation
occurrences, which is 6.4% higher in winter in the NE and 13.8% higher in the SE compared to summer. The
clustering approach performs better when precipitation events are predominantly of frontal nature. The convective
systems that are common in summer are more variable, with smaller scales in time and space, thus leading to
more distinctive precipitation patterns and reducing the clustering performance. The performance for mean dry
spells is similar with almost equal performance in summer and winter in the NE.
Performance dierences between the NE and the SE are related to the strong impact of convective systems in
the SE, particularly in summer: Florida is the state in the United States with the highest thunderstorm activity60, 61.
Precipitation contribution of tropical cyclones to the seasonal precipitation totals can reach up to 20% in the
coastal regions, with comparatively high inter-annual variabilities depending on whether a year has hurricane
observations or not62. According to the International Best Track Archive for Climate Stewardship IBTrACS63
(release version v03r09), the South-East study area as presented in this research has been hit by 23 named and
three unnamed tropical cyclones between 1986 and 2015 in the summer season. Conversely, precipitation in the
NE is predominately of frontal nature. According to a study by Hawcroet al.64 using two dierent reanalysis
datasets the contribution of extratropical cyclones to the total precipitation in the NE study area reaches over
80% in the winter season and over 65% in the summer season with uncertainties of up to about 20% depending
on the reanalysis dataset under investigation. In the W, the precipitation climatology is much more complex,
with a pronounced spatial heterogeneity of precipitation with a large impact of smaller-scale climatic controls in
the mountainous areas65. e region is also strongly inuenced by the El Niño–Southern Oscillation (ENSO). In
the Great Basin, which covers most of the Western study area except for California, above normal precipitation
between October and March is predominately associated to ENSO years66. e inter-annual variability is also
linked to ENSO67. e mountain ranges of the Sierra Nevada in California receive high precipitation amounts due
to orographic eects, which also explain the dry conditions in the Great Basin because of a rain shadow eect. e
lagged cross-correlation of observed precipitation occurrences (i.e. weather persistence) is three times higher in
winter than summer, due to the dominant inuence of midlatitudinal synoptic-scale storms68, 69. e still better
performance for dry spells in the W in summer is related to the arid summer (recorded precipitation on only
4.3% of all days), making a pronounced underestimation of dry spells unlikely. e performance for maximum
wet spells in the NE and in the SE is similar to the performance in regard to dry spells. e performance is better
during winter with higher persistence of weather events. Maximum wet spells are equally reproduced in both
seasons in the W. e 90% condence intervals (see Supplementary material) show similar spreads across seasons
and study areas. e most signicant dierences are related to the W: For summer, condence intervals are signif-
icantly wider for the majority of metrics, which is related to the low number of precipitation days. e results for
the medium and low gauge density scenarios (Table1, not shown here) showed comparable results.
Network density and spatial scale. The gauge network density impacts the performance. Here, for all
available density scenarios (Table 1), we focus on the annual performance only (Fig. 4), but seasonal
performances are comparable.
The performance differs between network densities. For the majority of the metrics, the model bias
decreases with a reduced network density, with differences between about one to five percent, depending on
the study area and maximum extent. However, low density does not always mean better performance, primarily
for the inter-annual standard deviation of precipitation in the SE and W. To reach the same fixed duplication
rate of observations, fewer clusters of daily precipitation snapshots are required for small networks. Thus, the
clusters represent weather situations less well, which effectively reduces the model performance. The opposite
applies to dry and wet spells where the bias decreases with a reduced network density. The most pronounced
differences can be recognized for maximum and mean dry spells in the SE, and maximum wet spells in the SE
and in the W. The phenomenon is likewise caused by the clustering algorithm. A smaller number of gauges
leads to a better distinction between the clustered daily precipitation snapshots and therefore higher similarity
within these clusters, which improves the performance. Deviations are higher in the less homogeneous climates
of the SE and W. The slightly better performance may give the impression that a less dense gauge network
may likewise be preferable, but (i) differences in the performance do not exceed one to five
percent and (ii) most applications require the interpolation of the simulated precipitation patterns, where a high
number of stations is desirable.
Conclusion
is study is a rst step towards overcoming the fragmented, eclectic knowledge in stochastic generation of pre-
cipitation patterns and is thereby a call for testing multiple, and possibly publicly available, model codes across
dierent climate types, spatial scales and network densities. e comparison of 30-year long observed daily
www.nature.com/scientificreports/
7
Scientific RepoRts | 7: 5449 | DOI:10.1038/s41598-017-05822-y
precipitation patterns with generated precipitation across three dierent climates shows a general adequate agree-
ment when considering relatively small regions, although key metrics such as dry or wet spells are oen underes-
timated. Larger spatial scales lead to reduced performance in reproducing the observations. e simulations are
Figure 4. Relative error for all sites in the North-East (NE), the South-East (SE) and the West (W) for all
months. e error is plotted for four metrics against the maximum extent of each simulated gauge network and
density scenario. For each study area and scenario, the lines show the mean of the four simulations (using four
dierent starting sites; see Fig.1). e solid line represents the results for the very-high gauge network density
(not available for the West), the dashed line for the high-density, the dash-dot line for the medium density and
the dotted line for the low gauge density scenario.
www.nature.com/scientificreports/
8
Scientific RepoRts | 7: 5449 | DOI:10.1038/s41598-017-05822-y
less biased in wet temperate climates than in dry climates. Seasons and locations that are dominated by frontal
precipitation are better reproduced than seasons with a more pronounced impact of convective systems. is
explains the dierent performance obtained in the temperate North-East and subtropical South-East. Seasons
with a higher inter-annual variability of precipitation are less well reproduced, as demonstrated with the Western
study area, which is inuenced by ENSO.
In this research, we focused on the current climate. There are different approaches to parameterize
precipitation generators to simulate climate change, for example by altering the precipitation values using output from
climate models, as suggested by Turkington et al.13. However, as the pure alteration of the precipitation
amounts ignores potential future changes in dry and wet spells, another promising avenue could be to condition
the clustering of the daily precipitation snapshots in the TripleM model to the distribution of current and future
circulation patterns to incorporate changes of dry spells, wet spells and also in the autocorrelation of
precipitation. Simulating climate change with weather generators however has inherent limitations in regard to decadal
variabilities and long-term trends. The consideration of other climate types beyond the three of this study would
be another interesting topic for investigation.
The development of common evaluation standards, for instance information on relative errors for better
comparability, is highly desirable. Additional comparative studies, particularly in countries with lower network
densities (Figure S1, Supplementary material), would be useful to validate the findings of this study. Further, the
proposed methodology should be complemented to enable simulating projected precipitation patterns that can
be used for climate change impact studies, ideally in the developing world, where impacts of climate change are
often most significant. At this point in time, facing numerous published types of algorithms, eclectic case studies,
a very limited number of transparent publicly available source codes and a lack of common evaluation standards,
the full potential of stochastic multi-site weather generation remains unclear. The issue magnifies when different
model types are parameterized for simulating future climate scenarios. We made a first step towards closing this
gap by demonstrating that, if there is awareness and knowledge of stochastic approaches and model type specific
opportunities and shortcomings, stochastic multi-site precipitation generation has the potential to support a
variety of societally and ecologically relevant issues in different climates, at different spatial scales and under
differing conditions of data availability.
Methods
The reduced complexity multi-site precipitation generator TripleM (Multisite Markov Model) applied here works
as follows: First, daily snapshots of the precipitation occurrences (i.e. catchment-wide precipitation patterns) are
clustered according to their similarity. The model uses the non-hierarchical k-means clustering method70, 71 and
the Hamming distance (equation (1)).
$$\mathrm{distance}(x, y) = \frac{1}{p} \sum_{j=1}^{p} I\{x_j \neq y_j\}, \qquad (1)$$
where I is the indicator function.
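A minimal Python sketch of this distance is given below. It assumes that x and y are binary occurrence vectors with one element per gauge (so that p is the number of gauges) and that the distance is the fraction of gauges at which the two daily snapshots disagree; the function name is hypothetical.

```python
import numpy as np

def hamming_distance(x, y):
    """Fraction of positions at which two binary occurrence vectors differ."""
    x = np.asarray(x, dtype=bool)
    y = np.asarray(y, dtype=bool)
    if x.shape != y.shape:
        raise ValueError("occurrence vectors must have the same length")
    # Mean of the indicator I{x_j != y_j} over all p positions
    return float(np.mean(x != y))

# Two daily snapshots over four gauges (1 = wet, 0 = dry)
day_a = [1, 0, 1, 1]
day_b = [1, 1, 0, 1]
print(hamming_distance(day_a, day_b))  # 0.5: the snapshots differ at two of four gauges
```

Because the vectors are binary, the distance depends only on where precipitation occurs, not on how much falls, which is consistent with clustering occurrences rather than amounts.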
In the original version of TripleM48, the k-means clustering was applied to daily snapshots of precipitation
amounts that were first standardized using the z-score transformation in order to take into account the
heteroscedastic nature of the precipitation. This led to a satisfactory performance in a comparatively small Alpine
precipitation gauge network not exceeding a maximum distance between sites of about 150 km. For this research,
we ran multiple experiments with different clustering methods and it turned out that the performance increases
significantly for large gauge networks when applying the Hamming distance to binary precipitation occurrences.
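As an illustration of equation (1), the following minimal Python sketch computes the Hamming distance between two binary occurrence vectors, where p is the length of the vectors. The function names, the wet-day threshold of 0 mm and the example values are assumptions made here for illustration; they are not taken from the TripleM MATLAB code.

```python
def to_occurrence(amounts, threshold=0.0):
    """Convert a daily snapshot of precipitation amounts into a binary
    occurrence vector (1 = wet, 0 = dry). The threshold is an assumption."""
    return [1 if a > threshold else 0 for a in amounts]


def hamming_distance(x, y):
    """Equation (1): share of positions at which two vectors differ."""
    if len(x) != len(y) or not x:
        raise ValueError("vectors must be non-empty and of equal length p")
    p = len(x)
    return sum(1 for xj, yj in zip(x, y) if xj != yj) / p


# Two hypothetical daily snapshots (mm) at four sites.
day_a = to_occurrence([0.0, 2.5, 0.0, 7.1])  # [0, 1, 0, 1]
day_b = to_occurrence([1.2, 3.0, 0.0, 0.0])  # [1, 1, 0, 0]
print(hamming_distance(day_a, day_b))         # 0.5 (sites 1 and 4 differ)
```

Because the distance only counts disagreeing sites, two days with very different amounts but the same wet/dry pattern are treated as identical at the clustering stage.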
Second, the clustered occurrence vectors are simulated with a Markov process (equation (2)), where the
transition probabilities depend on m previous days, i.e.,
$$\mathrm{PR}\{X_{t+1} \mid X_t, X_{t-1}, X_{t-2}, \ldots, X_1\} = \mathrm{PR}\{X_{t+1} \mid X_t, X_{t-1}, \ldots, X_{t-m}\} \quad \text{with } 1 \le m < t \qquad (2)$$
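A minimal Python sketch of the first-order case of such a Markov process follows: transition counts between consecutive cluster labels are tallied from an observed label sequence, and a synthetic sequence is then drawn from them. The function names, the restart rule for a state without an observed successor, and the toy label sequence are assumptions for illustration, not the TripleM implementation.

```python
import random
from collections import defaultdict


def fit_transitions(labels):
    """Count transitions between consecutive cluster labels."""
    counts = defaultdict(lambda: defaultdict(int))
    for current, following in zip(labels, labels[1:]):
        counts[current][following] += 1
    return {state: dict(nxt) for state, nxt in counts.items()}


def simulate_clusters(transitions, start, n_days, rng):
    """Draw a synthetic label sequence of length n_days (at least 1)."""
    sequence = [start]
    while len(sequence) < n_days:
        successors = transitions.get(sequence[-1])
        if not successors:
            # Assumption: a state without observed successor restarts
            # the chain from a randomly chosen known state.
            state = rng.choice(sorted(transitions))
        else:
            states = sorted(successors)
            weights = [successors[s] for s in states]
            state = rng.choices(states, weights=weights)[0]
        sequence.append(state)
    return sequence


# Hypothetical cluster labels of ten observed days.
observed = [0, 1, 1, 2, 0, 1, 2, 2, 0, 1]
table = fit_transitions(observed)
print(simulate_clusters(table, observed[0], 15, random.Random(42)))
```

Sampling with the raw counts as weights is equivalent to sampling from the row-normalized transition probabilities.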
Once the synthetic time series of clusters are simulated, each cluster is replaced by a random amount vector
(i.e. daily snapshot of precipitation amounts) belonging to the same cluster. In a last step, which is optional, the
model introduces sampling of parametric precipitation amounts in combination with an adapted version of a
resampling approach by Clark et al.41, to account for unobserved precipitation extremes. The method is shown
in Fig. 5, using a hypothetical example of three sites and a ten-state Markov chain: After generating synthetic
time series of clusters using the Markov process (a), amount vectors are randomly drawn from all observations
that fit the corresponding cluster (b). Following this, synthetic precipitation amounts are sampled independently
for each site from parametric curves (c), optionally using correlated uniform random numbers from a Cholesky
decomposition72. The use of correlated random numbers avoids the generation of significantly different
precipitation amounts across sites, which becomes increasingly important when generating short synthetic time series.
In the last step (d), the parametric precipitation amounts are reshuffled according to the original ranks after the
resampling in step (b), to maintain the inter-site correlations.
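Steps (b) and (d) can be sketched in a few lines of Python. Here, `resample_amounts` draws one observed amount vector per simulated day from the pool of observations sharing that day's cluster, and `reshuffle_by_rank` reorders the parametric amounts of a single site so that their ranks follow those of the resampled amounts. All names and the toy data are assumptions for illustration; the handling of dry days, ties, and monthly or seasonal grouping in TripleM is not reproduced here.

```python
import random
from collections import defaultdict


def resample_amounts(cluster_seq, obs_labels, obs_amounts, rng):
    """Step (b): for each simulated cluster, draw one observed amount
    vector at random from all observed days in the same cluster."""
    pool = defaultdict(list)
    for label, vector in zip(obs_labels, obs_amounts):
        pool[label].append(vector)
    return [rng.choice(pool[c]) for c in cluster_seq]


def reshuffle_by_rank(resampled, parametric):
    """Step (d), one site: the day holding the k-th smallest resampled
    amount receives the k-th smallest parametric amount."""
    if len(resampled) != len(parametric):
        raise ValueError("both series must have the same length")
    order = sorted(range(len(resampled)), key=lambda i: resampled[i])
    sorted_parametric = sorted(parametric)
    reshuffled = [0.0] * len(resampled)
    for rank, day in enumerate(order):
        reshuffled[day] = sorted_parametric[rank]
    return reshuffled


# Hypothetical example: three observed days at two sites, two clusters.
obs_labels = [0, 1, 0]
obs_amounts = [[0.0, 1.2], [3.4, 5.0], [0.0, 0.8]]
print(resample_amounts([1, 0, 1], obs_labels, obs_amounts, random.Random(7)))

# Resampled amounts at one site and independently sampled parametric amounts.
print(reshuffle_by_rank([5.0, 1.0, 3.0], [10.0, 30.0, 20.0]))  # [30.0, 10.0, 20.0]
```

Since each site is reordered to follow the ranks of amount vectors that were observed jointly, the rank dependence between sites is inherited from the observations.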
The entire simulation process is depicted in Fig. 6.
In its most simplistic setup (resampling, i.e. bootstrap without parametric sampling of precipitation amounts, as
applied in this study), TripleM has two key parameters the user has to define: the duplication rate and the order of
the Markov chain. As for the duplication rate, an inherent characteristic of TripleM is that the clustering approach
will duplicate parts of the time series: A higher number of clusters will generally improve the reproduction of
various metrics of the observations such as the precipitation autocorrelation, but result in duplicated
observations in the simulations, especially in large station networks. Here and elsewhere48, a maximum duplication rate
of only 1% produced satisfactory results. Higher duplication rates increase the computational costs. The second
key parameter is the order of the Markov chain used. In this study, a first-order Markov chain was used. For larger
observation networks, it is recommended not to increase the order due to the exponentially growing state-space
related to higher orders.
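The growth of the state space can be made concrete with a short calculation. With k clusters, a Markov chain of order m conditions on k^m possible histories, so the transition table holds k^m × k probabilities. The Python sketch below tabulates this; the function name is hypothetical, and k = 10 simply mirrors the ten-state example of Fig. 5.

```python
def transition_table_size(n_clusters, order):
    """Return (number of conditioning histories, number of table entries)
    for a Markov chain with n_clusters states and the given order."""
    histories = n_clusters ** order
    return histories, histories * n_clusters


for m in (1, 2, 3):
    print(m, transition_table_size(10, m))
# 1 (10, 100)
# 2 (100, 1000)
# 3 (1000, 10000)
```

With a fixed record length, the number of observed transitions available per table entry therefore shrinks by a factor of k for every additional order.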
Figure 5. Key steps of precipitation generation in TripleM after clustering of the daily precipitation snapshots
and Markov simulation (a), including resampling of amount vectors (b), parametric sampling (c) and
reshuffling (d).
Figure 6. Schematic flow diagram of the TripleM precipitation generator. TripleM can be used as a bootstrap
model (Output 1) and a parametric precipitation model (Output 2). Parallelograms represent time series or
variables, boxes represent methods. Blue parallelograms represent input and output data. Cholesky matrices and
transition matrices are either derived monthly (12) or seasonally (4). The parametric distribution parameters
are either derived monthly (12) or seasonally (4) for the number of gauges simulated (n).
Another model-specific characteristic is the reproduction of inter-site correlations. If long synthetic time series
are generated in combination with parametric precipitation sampling, inter-site correlations are better
reproduced. This is caused by the reshuffling method. With long synthetic time series, the pool of parametric
precipitation amounts becomes more similar to the resampled precipitation amounts. In TripleM, the reshuffling is
conducted over all generated years separately for all months or seasons depending on the chosen model setup.
The choice of parametric models for the synthetic precipitation amounts is another influencing factor in general,
which has been discussed in the past36, 73, 74 and is not specific to TripleM. The MATLAB code offers the Gamma
distribution, the Weibull distribution or a compound distribution of the Weibull distribution for lower and a
Generalized Pareto distribution for higher and extreme precipitation amounts with a user-defined threshold
between both curves.
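To illustrate parametric sampling with correlated uniform random numbers, the following Python sketch draws correlated standard normal variates via a Cholesky factor, maps them to uniform numbers with the normal distribution function, and passes these through the Weibull quantile function. This is one common way of obtaining correlated uniforms and is an assumption here; the correlation matrix, the Weibull parameters and all function names are hypothetical, and the MATLAB code may proceed differently.

```python
import math
import random


def cholesky(a):
    """Lower-triangular factor L with L L^T = a (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def correlated_uniforms(low, rng):
    """One vector of uniforms whose underlying normals have correlation L L^T."""
    n = len(low)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
    return [0.5 * (1.0 + math.erf(v / math.sqrt(2.0))) for v in y]


def weibull_quantile(u, scale, shape):
    """Inverse Weibull distribution function; u is clamped below 1."""
    u = min(max(u, 0.0), 1.0 - 1e-12)
    return scale * (-math.log(1.0 - u)) ** (1.0 / shape)


# Hypothetical correlation between two sites and Weibull parameters per site.
low = cholesky([[1.0, 0.8], [0.8, 1.0]])
rng = random.Random(3)
u = correlated_uniforms(low, rng)
amounts = [weibull_quantile(u[0], 6.0, 0.8), weibull_quantile(u[1], 4.5, 0.9)]
print(u, amounts)
```

With strongly correlated uniforms, a high quantile at one site tends to coincide with a high quantile at the other, which is the behaviour the text describes as avoiding very different amounts across sites.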
TripleM offers monthly and seasonal setups. All steps, including the clustering of amount vectors, the fitting of
the Markov chains, the simulation of the Markov process and the reshuffling of parametric precipitation amounts,
can either be run monthly or seasonally. In this study we used a monthly setup.
Code and data availability. The MATLAB source code of TripleM, a user manual and a training dataset
are available from the GitHub page, https://github.com/KBreinl/TripleM. The data used in this paper are available
from the NOAA websites.
References
1. Aerts, J. C. J. H. & Botzen, W. J. W. Climate change impacts on pricing long-term flood insurance: A comprehensive study for the Netherlands. Global Environ Chang 21, 1045–1060 (2011).
2. Van Loon, A. F. et al. Drought in the Anthropocene. Nat Geosci 9, 89–91 (2016).
3. AghaKouchak, A. et al. Geometrical Characterization of Precipitation Patterns. J Hydrometeorol 12, 274–285 (2011).
4. Schwartz, C. S. et al. Toward Improved Convection-Allowing Ensembles: Model Physics Sensitivities and Optimizing Probabilistic Guidance with Small Ensemble Membership. Weather Forecast 25, 263–280 (2010).
5. Bray, M. et al. Rainfall uncertainty for extreme events in NWP downscaling model. Hydrol Process 25, 1397–1406 (2011).
6. Ciais, P. et al. Europe-wide reduction in primary productivity caused by the heat and drought in 2003. Nature 437, 529–533 (2005).
7. Piao, S. L. et al. Net carbon dioxide losses of northern ecosystems in response to autumn warming. Nature 451, 49–52 (2008).
8. Burton, A. et al. Downscaling transient climate change using a Neyman-Scott Rectangular Pulses stochastic rainfall model. J Hydrol 381, 18–32 (2010).
9. Feddersen, H. & Andersen, U. A method for statistical downscaling of seasonal ensemble predictions. Tellus A 57, 398–408 (2005).
10. Palutikof, J. P. et al. Generating rainfall and temperature scenarios at multiple sites: Examples from the Mediterranean. J Climate 15, 3529–3548 (2002).
11. Forsythe, N. et al. Application of a stochastic weather generator to assess climate change impacts in a semi-arid climate: The Upper Indus Basin. J Hydrol 517, 1019–1034 (2014).
12. Jones, P. D. et al. Downscaling regional climate model outputs for the Caribbean using a weather generator. Int J Climatol 36, 4141–4163 (2016).
13. Turkington, T. et al. A new flood type classification method for use in climate change impact studies. Weather and Climate Extremes 14, 1–16 (2016).
14. Trnka, M. et al. Adverse weather conditions for European wheat production will become more frequent with climate change. Nat Clim Change 4, 637–643 (2014).
15. Holding, S. et al. Groundwater vulnerability on small islands. Nat Clim Change 6, 1100–1103 (2016).
16. Breinl, K. et al. A joint modelling framework for daily extremes of river discharge and precipitation in urban areas. Journal of Flood Risk Management 10, 97–114 (2017).
17. Qin, X. S. & Lu, Y. Study of Climate Change Impact on Flood Frequencies: A Combined Weather Generator and Hydrological Modeling Approach. J Hydrometeorol 15, 1205–1219 (2014).
18. Khazaei, M. R. et al. Assessment of climate change impact on floods using weather generator and continuous rainfall-runoff model. Int J Climatol 32, 1997–2006 (2012).
19. Harris, C. N. P. et al. The use of probabilistic weather generator information for climate change adaptation in the UK water sector. Meteorol Appl 21, 129–140 (2014).
20. Leander, R. & Buishand, T. A. A daily weather generator based on a two-stage resampling algorithm. J Hydrol 374, 185–195 (2009).
21. Breinl, K. Driving a lumped hydrological model with precipitation output from weather generators of different complexity. Hydrolog Sci J 61, 1395–1414 (2016).
22. Hansen, J. W. & Ines, A. V. M. Stochastic disaggregation of monthly rainfall data for crop simulation studies. Agr Forest Meteorol 131, 233–246 (2005).
23. Greene, A. M. et al. A climate generator for agricultural planning in southeastern South America. Agr Forest Meteorol 203, 217–228 (2015).
24. Mearns, L. O. et al. Mean and variance change in climate scenarios: Methods, agricultural applications, and measures of uncertainty. Clim Change 35, 367–396 (1997).
25. Stevens, T. & Madani, K. Future climate impacts on maize farming and food security in Malawi. Scientific Reports 6 (2016).
26. Semenov, M. A. & Shewry, P. R. Modelling predicts that heat stress, not drought, will increase vulnerability of wheat in Europe. Scientific Reports 1 (2011).
27. Charron, D. F. et al. Links Between Climate, Water And Waterborne Illness, and Projected Impacts of Climate Change. Health Canada (2005).
28. Morin, C. W. & Comrie, A. C. Regional and seasonal response of a West Nile virus vector to climate change. P Natl Acad Sci USA 110, 15620–15625 (2013).
29. Ogden, N. H. et al. Climate change and the potential for range expansion of the Lyme disease vector Ixodes scapularis in Canada. Int J Parasitol 36, 63–70 (2006).
30. Clare, F. C. et al. Climate forcing of an emerging pathogenic fungus across a montane multi-host community. Philosophical Transactions of the Royal Society B: Biological Sciences 371 (2016).
31. Baigorria, G. A. & Jones, J. W. GiST: A Stochastic Model for Generating Spatially and Temporally Correlated Daily Rainfall Data. J Climate 23, 5990–6008 (2010).
32. Bardossy, A. & Pegram, G. G. S. Copula based multisite model for daily precipitation simulation. Hydrol Earth Syst Sc 13, 2299–2314 (2009).
33. Serinaldi, F. A multisite daily rainfall generator driven by bivariate copula-based mixed distributions. J Geophys Res-Atmos 114 (2009).
34. Brissette, F. P. et al. Efficient stochastic generation of multi-site synthetic precipitation data. J Hydrol 345, 121–133 (2007).
35. Serinaldi, F. Copula-based mixed models for bivariate rainfall data: an empirical study in regression perspective. Stoch Env Res Risk A 23, 677–693 (2009).
36. Breinl, K. et al. Stochastic generation of multi-site daily precipitation for applications in risk management. J Hydrol 498, 23–35 (2013).
37. Khazaei, M. et al. A new daily weather generator to preserve extremes and low-frequency variability. Clim Change 119, 631–645 (2013).
38. Leander, R. & Buishand, T. A. Resampling of regional climate model output for the simulation of extreme river flows. J Hydrol 332, 487–496 (2007).
39. Wilks, D. S. Multisite generalization of a daily stochastic precipitation generation model. J Hydrol 210, 178–191 (1998).
40. Rayner, D. et al. A multi-state weather generator for daily precipitation for the Torne River basin, northern Sweden/western Finland. Advances in Climate Change Research 7, 70–81 (2016).
41. Clark, M. P. et al. A resampling procedure for generating conditioned daily weather sequences. Water Resour Res 40 (2004).
42. Mehrotra, R. et al. A comparison of three stochastic multi-site precipitation occurrence generators. J Hydrol 331, 280–292 (2006).
43. Mehrotra, R. & Sharma, A. A semi-parametric model for stochastic generation of multi-site daily rainfall exhibiting low-frequency variability. J Hydrol 335, 180–193 (2007).
44. Apel, H. et al. Combined fluvial and pluvial urban flood hazard analysis: concept development and application to Can Tho city, Mekong Delta, Vietnam. Nat. Hazards Earth Syst. Sci. 16, 941–961 (2016).
45. Ailliot, P. et al. Stochastic weather generators: an overview of weather type models. J Soc Fr Statistique 156, 101–113 (2015).
46. Menne, M. J. et al. An Overview of the Global Historical Climatology Network-Daily Database. J Atmos Ocean Tech 29, 897–910 (2012).
47. AghaKouchak, A. et al. Water and climate: Recognize anthropogenic drought. Nature 524, 409–411 (2015).
48. Breinl, K. et al. Simulating daily precipitation and temperature: a weather generation framework for assessing hydrometeorological hazards. Meteorol Appl 22, 334–347 (2015).
49. Knapp, A. K. & Smith, M. D. Variation among biomes in temporal dynamics of aboveground primary production. Science 291, 481–484 (2001).
50. Ray, D. K. et al. Climate variation explains a third of global crop yield variability. Nat Commun 6 (2015).
51. Porporato, A. et al. Superstatistics of hydro-climatic fluctuations and interannual ecosystem productivity. Geophys Res Lett 33 (2006).
52. Frank, D. A. et al. Effects of climate extremes on the terrestrial carbon cycle: concepts, processes and potential future impacts. Global Change Biol 21, 2861–2880 (2015).
53. Fischer, D. et al. Climate change effects on Chikungunya transmission in Europe: geospatial analysis of vector’s climatic suitability and virus’ temperature requirements. Int J Health Geogr 12 (2013).
54. Linthicum, K. J. et al. Climate and satellite indicators to forecast Rift Valley fever epidemics in Kenya. Science 285, 397–400 (1999).
55. Taylor, D. et al. Environmental change and Rift Valley fever in eastern Africa: projecting beyond HEALTHY FUTURES. Geospatial Health 11, 115–128 (2016).
56. Lobell, D. B. et al. Climate extremes in California agriculture. Clim Change 109, 355–363 (2011).
57. Katz, R. W. & Parlange, M. B. Overdispersion phenomenon in stochastic modeling of precipitation. J Climate 11, 591–601 (1998).
58. Kim, Y. et al. Reducing overdispersion in stochastic weather generators using a generalized linear modeling approach. Climate Res 53, 13–24 (2012).
59. Chen, J. et al. A daily stochastic weather generator for preserving low-frequency of climate variability. J Hydrol 388, 480–490 (2010).
60. Orville, R. E. & Huffines, G. R. Cloud-to-ground lightning in the United States: NLDN results in the first decade, 1989–98. Mon Weather Rev 129, 1179–1193 (2001).
61. Hodanish, S. et al. A 10-yr monthly lightning climatology of Florida: 1986–95. Weather Forecast 12, 439–448 (1997).
62. Prat, O. P. & Nelson, B. R. Precipitation Contribution of Tropical Cyclones in the Southeastern United States from 1998 to 2009 Using TRMM Satellite Data. J Climate 26, 1047–1062 (2013).
63. Knapp, K. R. et al. The International Best Track Archive for Climate Stewardship (Ibtracs) Unifying Tropical Cyclone Data. B Am Meteorol Soc 91, 363–376 (2010).
64. Hawcroft, M. K. et al. How much Northern Hemisphere precipitation is associated with extratropical cyclones? Geophys Res Lett 39 (2012).
65. Mock, C. J. Climatic Controls and Spatial Variations of Precipitation in the Western United States. J Climate 9, 1111–1125 (1996).
66. Ropelewski, C. F. & Halpert, M. S. Global and Regional Scale Precipitation Patterns Associated with the El-Nino Southern Oscillation. Mon Weather Rev 115, 1606–1626 (1987).
67. Rajagopalan, B. & Lall, U. Interannual variability in western US precipitation. J Hydrol 210, 51–67 (1998).
68. Cayan, D. R. & Roads, J. O. Local Relationships between United-States West-Coast Precipitation and Monthly Mean Circulation Parameters. Mon Weather Rev 112, 1276–1282 (1984).
69. Lareau, N. P. & Horel, J. D. The Climatology of Synoptic-Scale Ascent over Western North America: A Perspective on Storm Tracks. Mon Weather Rev 140, 1761–1778 (2012).
70. Hartigan, J. A. Clustering Algorithms. (Wiley, 1975).
71. Hartigan, J. A. & Wong, M. A. Algorithm AS 136: A K-Means Clustering Algorithm. J Roy Stat Soc C 28, 100–108 (1979).
72. Watkins, D. S. Fundamentals of matrix computations. 3rd edn, (Wiley, 2010).
73. Papalexiou, S. M. et al. How extreme is extreme? An assessment of daily rainfall distribution tails. Hydrol Earth Syst Sc 17, 851–862 (2013).
74. Vlcek, O. & Huth, R. Is daily precipitation Gamma-distributed? Adverse effects of an incorrect use of the Kolmogorov-Smirnov test. Atmos Res 93, 759–766 (2009).
Acknowledgements
This research has been funded by the project STEEP STREAMS, which is funded by the Swedish Research Council
FORMAS within WaterJPI, ERA-Net Cofund WaterWorks 2014. The data used in this paper are available from
the NOAA websites.
Author Contributions
K.B. and G.D.B. conceived the research; K.B. prepared the data, improved the MATLAB code for modelling large
gauge networks and ran the experiments; K.B. and G.D.B. designed the experiments with major contributions
by A.R., G.V. and M.H.; M.G.L. revised the MATLAB code for high performance; all authors contributed to the
interpretation and writing of the manuscript with major contributions by M.H.
Additional Information
Supplementary information accompanies this paper at doi:10.1038/s41598-017-05822-y
Competing Interests: The authors declare that they have no competing interests.
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International
License, which permits use, sharing, adaptation, distribution and reproduction in any medium or
format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons license, and indicate if changes were made. The images or other third party material in this
article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the
material. If material is not included in the article’s Creative Commons license and your intended use is not
permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the
copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
© The Author(s) 2017
... Moreover, distribution functions with more parameters performed better than those with fewer parameters [10]. Breinl et al. [11] used the stochastic multi-site precipitation generator TripleM (Multisite Markov Model) to simulate precipitation patterns and conducted a review of applications related to the SPG approaches. The statistical analysis of such large-scale datasets requires parameter estimation techniques that are computationally effective and adequately capture the dynamism of the underlying processes. ...
Article
Full-text available
Precipitation modeling holds significant importance in various fields such as agriculture, animal husbandry, weather derivatives, hydrology, and risk and disaster preparedness. Stochastic precipitation generators (SPGs) represent a class of statistical models designed to generate synthetic data capable of simulating dry and wet precipitation stretches for a long duration. The construction of Hidden Markov Models (HMMs), which treat latent meteorological circumstances as hidden states, is an efficient technique for simulating precipitation. Considering that there are many choices of emission distributions used to generate positive precipitation, the characteristics of different distributions for simulating positive precipitation have not been fully explored. The paper includes a simulation study that demonstrates how the Pareto distribution, when used as the distribution for generating positive precipitation, addresses the limitations of the exponential and gamma distributions in predicting heavy precipitation events. Additionally, the Pareto distribution offers flexibility through adjustable parameters, making it a promising option for precipitation modeling. We can estimate parameters in HMMs using forward–backward algorithms, Variational Bayes Expectation-Maximization (VBEM), and Stochastic Variational Bayes (SVB). In the Xilingol League, located in the central part of the Inner Mongolia Autonomous Region, China, our study involved data analysis to identify crucial locations demonstrating a robust correlation and notable partial correlation between the Normalized Difference Vegetation Index (NDVI) and annual precipitation. We performed fitting of monthly dry days ratios and monthly precipitation using seasonal precipitation and year-round precipitation data at these crucial locations. Subsequently, we conducted precipitation predictions for the daily, monthly, and annual time frames using the new test dataset observations. 
The study concludes that the SPG fits the monthly dry-day ratio better for annual daily precipitation data than for seasonal daily precipitation data. The fitting error for the monthly dry day ratio corresponding to annual daily precipitation data is 0.053 (exponential distribution) and 0.066 (Pareto distribution), while for seasonal daily precipitation data, the fitting error is 0.14 (exponential distribution) and 0.15 (Pareto distribution). The exponential distribution exhibits the poorest performance as a model for predicting future precipitation, with average errors of 2.49 (daily precipitation), 40.62 (monthly precipitation), and 130.40 (annual precipitation). On the other hand, the Pareto distribution demonstrates the best overall predictive performance, with average errors of 0.69 (daily precipitation), 34.69 (monthly precipitation), and 66.42 (annual precipitation). The results of this paper can provide decision support for future grazing strategies in the Xilingol League.
... To avoid the pitfall of physically unrealistic simulations, stochastic rainfall models embed a significant part of our conceptual knowledge about rainfall behavior in their parameterization (i.e., they implement statistical relationships that reflect as closely as possible the physical processes at work). However, rainfall properties (Krajewski et al., 2003) and, in turn, the performance of stochastic rainfall generators (Breinl et al., 2017;Vu et al., 2018) strongly depend on the climate of the area of interest. Hence, different models have been proposed for different climates, with each model focusing on a specific aspect of rainfall, for instance, rainfall seasonality in monsoonal climates (Greene et al., 2011), rainfall spatial-temporal correlation in temperate climates (Paschalis et al., 2013), or rainfall occurrence and extreme intensities in arid regions (Wilcox et al., 2021). ...
Article
Full-text available
Stochastic rainfall generators are probabilistic models of rainfall space–time behavior. During parameterization and calibration, they allow the identification and quantification of the main modes of rainfall variability. Hence, stochastic rainfall models can be regarded as probabilistic conceptual models of rainfall dynamics. As with most conceptual models in earth sciences, the performance of stochastic rainfall models strongly relies on their adequacy in representing the rain process at hand. On tropical islands with high elevation topography, orographic rain enhancement challenges most existing stochastic models because it creates localized precipitations with strong spatial gradients, which break down the stationarity of rain statistics. To allow for stochastic rainfall modeling on tropical islands, despite non-stationarity of rain statistics, we propose a new stochastic daily multi-site rainfall generator specifically for areas with significant orographic effects. Our model relies on a preliminary classification of daily rain patterns into rain types based on rainfall space and intensity statistics, and sheds new light on rainfall variability at the island scale. Within each rain type, the distribution of rainfall through the island is modeled by combining a non-parametric resampling of past analogs of a latent field describing the spatial distribution of rainfall, and a parametric gamma transform function describing rain intensity. When applied to the stochastic simulation of rainfall on the islands of O`ahu (Hawai`i, United States of America) and Tahiti (French Polynesia) in the tropical Pacific, the proposed model demonstrates good skills in jointly simulating site-specific and island-scale rain statistics. Hence, it provides a new tool for stochastic impact studies in tropical islands, in particular for watershed water resource management.
... While the output provided from these models are statistical estimates and therefore have uncertainty built in, ensemble datasets generated from these models can improve other climate and weather models. Breinl et al. (2017) provides a review of current SPG approaches and applications. ...
Preprint
Full-text available
Stochastic precipitation generators (SPGs) are a class of statistical models which generate synthetic data that can simulate dry and wet rainfall stretches for long durations. Generated precipitation time series data are used in climate projections, impact assessment of extreme weather events, and water resource and agricultural management. We construct an SPG for daily precipitation data that is specified as a semi-continuous distribution at every location, with a point mass at zero for no precipitation and a mixture of two exponential distributions for positive precipitation. Our generators are obtained as hidden Markov models (HMMs) where the underlying climate conditions form the states. We fit a 3-state HMM to daily precipitation data for the Chesapeake Bay watershed in the Eastern coast of the USA for the wet season months of July to September from 2000--2019. Data is obtained from the GPM-IMERG remote sensing dataset, and existing work on variational HMMs is extended to incorporate semi-continuous emission distributions. In light of the high spatial dimension of the data, a stochastic optimization implementation allows for computational speedup. The most likely sequence of underlying states is estimated using the Viterbi algorithm, and we are able to identify differences in the weather regimes associated with the states of the proposed model. Synthetic data generated from the HMM can reproduce monthly precipitation statistics as well as spatial dependency present in the historical GPM-IMERG data.
... Some significant fluctuations in the atmospheric patterns such as the frequency and rotational changes of large-scale atmospheric circulations, increasing sea surface temperatures and instability conditions between surface and upper levels change the precipitation amount, duration and intensity, which significantly affects the water cycle of a region (Mishra and Singh, 2010;Trenberth, 2011;Li et al., 2016;Caloiero and Coscarelli, 2020;Dong et al., 2020). Additionally, prolonged wet and/or dry spells negatively affect the natural resources by causing extreme weather events such as floods, droughts and heat waves (Houghton et al., 2001;Li et al., 2010;Breinl et al., 2017;Li et al., 2017;Li et al., 2018;Nabeel and Athar, 2018;Breinl et al., 2020). From natural meteorological disasters, droughts occur in certain places of the earth during dry spells with insufficient precipitation and can be intensified by long continued successive dry days. ...
Article
Understanding the variations and leading atmospheric mechanisms causing wet and dry episodes are very important for the water management strategies. For this purpose, this study investigates the spatiotemporal variation and background environmental conditions for wet and dry spell lengths in Turkey. With this aim, Mann-Kendall rank statistic test and fitted ordinary least squares regression method are implemented to the 92 meteorology stations, which are homogenously distributed over Turkey, for the period 1966–2018. In terms of atmospheric circulation mechanisms, synoptic composites of NCEP/NCAR Reanalysis data (sea level pressure, air temperature at 850-hPa level, geopotential height at 500-hPa level) and sea surface temperature from NOAA High Resolution data are applied to the five years of highest frequency long dry/wet spells. According to the results, both the increasing frequencies of wet spells and their highest contribution to the precipitation are mainly found in the eastern Black Sea Region (BSR) of Turkey during all seasons, especially in spring. In this sub-region, statistically significant increasing (decreasing) springtime wet (dry) spell lengths in the neighboring five stations (i.e. Ardahan, Artvin, Bayburt, Hopa, Rize) indicated that the majority of wet spells (almost 30%) appear to occur between 3 and 5 days and the maximum number of the wet spell days are shown in the last two decades of the eastern BSR. Local quasi-stationary surface high located over eastern Black Sea (1016-hPa core) and associated weak northerly winds transfer relatively warm moist air from the sea surface (11.8 °C in average). This air mass meets with cold- land and low level (6 °C at 850 hPa) air masses, developing instability conditions and resulting in low precipitation rates in wet spells for the eastern BSR. 
On the other hand, by the extension of the south-west Asian Monsoon to the inner parts of Turkey, relatively hot dry air is transferred to the eastern BSR via southerly winds. As a result of this long-staying Asiatic monsoon characteristics, more dry days are shown in the region due to insufficient moisture and associated lack of precipitation.
... Precipitation is a crucial component of the water cycle, directly or indirectly related to many domains (Breinl et al., 2017;Ye et al., 2018). As the critical factor, rainfall data is vital to the reliability of water resources planning, hydraulic infrastructure design, and flood and drought risk assessment (Kim et al., 2017). ...
Article
Full-text available
As an essential part of the hydrological cycle, precipitation directly contributes to surface runoff and river runoff formation. Simulation on precipitation variables can effectively solve the adverse effects on hydrological assessment in some areas with insufficient or even no runoff observation. With the widespread use of various weather generators, the traditional stochastic hydrological simulation methods tend to be gradually replaced. To compare these two approaches mentioned above, this paper utilizes the precipitation records from 1958 to 2011 at nine meteorological stations within Huaihe River System in Henan Province to evaluate a stochastic hydrological simulation method, SARIMA model, and two types of weather generators, WeaGETS and LARS-WG, through the comparison of statistical characteristics regarding precipitation variables, such as mean, mean square error, extreme value and coefficient of variation. The results show that (1) on the annual scale, SARIMA has a better performance to reproduce the mean and mean square error as well as the extreme precipitation events than weather generators; (2) regarding the monthly-scale precipitation simulation, SARIMA is good at reproducing the statistical properties of monthly precipitation at the average level, while WeaGETS and LARS-WG work better in simulating monthly precipitation extremes; (3) compared with weather generators, SARIMA is highly constrained by the observed records, and among these two weather generators, WeaGETS scores higher on monthly precipitation simulation under the same sample length conditions. In conclusion, the traditional hydrological simulation method, SARIMA, and weather generators, WeaGETS and LARS-WG, have both benefits and drawbacks. The appropriate choice depends on different research backgrounds and purposes.
Article
Enhancing spatial data attributes is crucial for effective basin-scale environmental modelling and improving our understanding and management of precipitation patterns. In this study, we focused on reconstructing homogeneous areal precipitation data in the complex terrain of the Calore River Basin (CRB) in Southern Italy. Until 1869, weather observations in the region were inconsistent, unstandardised, and lacked coordination, but the establishment of meteorological observatories brought a more unified approach to weather monitoring. We relied on the rainfall data obtained from two of these historical observatories: Benevento (1869–present) and Montevergine (1884–present). We utilised a statistical regression framework that considered rainfall measurements and temporal properties from specific locations to reconstruct and visually analyse the evolution patterns of annual mean areal precipitation (MAP) in the CRB from 1869 to 2020. The analysis revealed that mean MAP decreased from 1153 mm yr−1 (1869–1951) to 998 mm yr−1 (1952–2020). This decrease was accompanied by a reduction in interannual variability (from 168 mm yr−1 to 147 mm yr−1 standard deviation), and the difference between the means was significant (p < 0.0001), suggesting a sudden shift in the time-series. These findings provide a basis for CRB water resource management and insights for modelling other complex Mediterranean basins.
Article
Precipitation is crucial for the hydrological cycle and is directly related to many ecological processes. Historically, measurements of precipitation totals were made at weather stations, but spatial and temporal coverage suffered due to the lack of a robust network of weather stations and temporal gaps in observations. Several products have been proposed to identify the location of the occurrence of precipitation and measure its intensity from different types of estimates, based on alternative data sources, that have global (or quasi-global) coverage with long historical time series. However, there are concerns about the accuracy of these estimates. The objective of this study is to evaluate the accuracy of the ERA5 product for two ecoregions of the Canadian Prairies through comparison with monthly means measured from 1981–2019 at ten weather stations (in-situ), as well as to assess the intraseasonal variability of precipitation and identify dry and wet periods based on the annual Standardized Precipitation Index (SPI) derived from ERA5. A significant relationship between in-situ data and ERA5 data (with R² varying between 0.42 and 0.76; p < 0.01) was observed in nine of the ten weather stations analyzed, with lower RMSE in the Mixed Ecoregion. The Mean Absolute Percentage Error (MAPE) results showed greater agreement between the datasets in May (average R value of 0.84 and an average MAPE value of 32.33%), while greater divergences were observed in February (average R value of 0.57 and an average MAPE value of 50.40%). The analysis of wet and dry periods, based on the SPI derived from ERA5, and the comparison with events associated with the El Niño-Southern Oscillation (ENSO), showed that the ERA5 data and the derived SPI make it possible to identify anomalies in time series with consistent patterns that can be associated with historical events highlighted in the literature.
Therefore, our results show that ERA5 data has potential to be an alternative for estimating precipitation in regions with few in-situ stations or with gaps in the time series in the Canadian Prairies, especially at the beginning of the growing season.
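The RMSE and MAPE scores mentioned above follow standard definitions. A minimal sketch of both is given below; the function names and the sample values are hypothetical and are not taken from the cited study.

```python
import math


def rmse(observed, estimated):
    """Root mean square error between paired observations and estimates."""
    squared = [(e - o) ** 2 for o, e in zip(observed, estimated)]
    return math.sqrt(sum(squared) / len(squared))


def mape(observed, estimated):
    """Mean absolute percentage error in percent.

    Undefined when an observation is zero, so zero-precipitation
    values would have to be excluded or handled separately.
    """
    ratios = [abs((e - o) / o) for o, e in zip(observed, estimated)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical monthly means (mm): station values vs. a gridded estimate.
station = [10.0, 20.0, 40.0]
gridded = [12.0, 18.0, 40.0]
print(rmse(station, gridded), mape(station, gridded))
```

RMSE is expressed in the units of the data, whereas MAPE is relative, which is why it can be large in months with low precipitation totals.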
Article
Climate change is one of the most important threats to civilization. Originating mainly in anthropogenic activity and the intensive use of environmental resources, climate change affects ecosystems and people's lives. Changes in precipitation volume and cycles severely affect agriculture and food security. Therefore, building meteorological forecasts is important for planning agricultural work and water management. In this respect, this article attempts to outline the future evolution of precipitation in the northern part of Dobrogea, a region increasingly affected by extreme meteorological events: long drought periods followed by high precipitation amounts.
Article
A recurrent issue encountered in environmental, ecological or agricultural impact studies in which climate is an important driving force is to provide fast and realistic simulations of atmospheric variables such as temperature, precipitation and wind at a few specific locations, at daily or hourly temporal scales. Spatio-temporal dynamics and correlation structures among the variables of interest, as well as weather persistence and natural variability have to be reproduced accurately in a distributional sense. This quest leads to a large variety of so-called stochastic weather generators (WGs) in the literature. Here, we provide an up-to-date overview of weather type WG models. Weather types classically represent daily characteristics of the relevant atmospheric information at hand. There are many ways to build such weather states, either hidden or observed, and to infer their properties. This overview should help statisticians as well as meteorologists and climate product users to understand the probabilistic concepts and models behind weather type WGs, and to identify their advantages and limits.
Article
Changes in the timings of seasonality as a result of anthropogenic climate change are predicted to occur over the coming decades. While this is expected to have widespread impacts on the dynamics of infectious disease through environmental forcing, empirical data are lacking. Here, we investigated whether seasonality, specifically the timing of spring ice-thaw, affected susceptibility to infection by the emerging pathogenic fungus Batrachochytrium dendrobatidis (Bd) across a montane community of amphibians that are suffering declines and extirpations as a consequence of this infection. We found a robust temporal association between the timing of the spring thaw and Bd infection in two host species, where we show that an early onset of spring forced high prevalences of infection. A third highly susceptible species (the midwife toad, Alytes obstetricans) maintained a high prevalence of infection independent of time of spring thaw. Our data show that perennially overwintering midwife toad larvae may act as a year-round reservoir of infection with variation in time of spring thaw determining the extent to which infection spills over into sympatric species. We used future temperature projections based on global climate models to demonstrate that the timing of spring thaw in this region will advance markedly by the 2050s, indicating that climate change will further force the severity of infection. Our findings on the effect of annual variability on multi-host infection dynamics show that the community-level impact of fungal infectious disease on biodiversity will need to be re-evaluated in the face of climate change. This article is part of the themed issue ‘Tackling emerging fungal threats to animal health, food security and ecosystem resilience’.
Article
Agriculture is the mainstay of Malawi’s economy and maize is the most important crop for food security. As a Least Developed Country (LDC), adverse effects of climate change (CC) on agriculture in Malawi are expected to be significant. We examined the impacts of CC on maize production and food security in Malawi’s dominant cereal producing region, Lilongwe District. We used five Global Circulation Models (GCMs) to make future (2011 to 2100) rainfall and temperature projections and simulated maize yields under these projections. Our future rainfall projections did not reveal a strong increasing or decreasing trend, but temperatures are expected to increase. Our crop modelling results, for the short-term future, suggest that maize farming might benefit from CC. However, faster crop growth could worsen Malawi’s soil fertility problem. Increasing temperature could drive lower maize yields in the medium to long-term future. Consequently, up to 12% of the population in Lilongwe District might be vulnerable to food insecurity by the end of the century. Measures to increase soil fertility and moisture must be developed to build resilience into Malawi’s agriculture sector.
Article
Flood type classification is an optimal tool to cluster floods with similar meteorological triggering conditions. Under climate change these flood types may change differently, and new flood types may develop. This paper presents a new methodology to classify flood types, particularly for use in climate change impact studies. A weather generator is coupled with a conceptual rainfall-runoff model to create long synthetic records of discharge to efficiently build an inventory with a high number of flood events. Significant discharge days are classified into causal types using k-means clustering of temperature and precipitation indicators capturing differences in rainfall amount, antecedent rainfall and snow-cover and day of year. From climate projections of bias-corrected temperature and precipitation, future discharge and associated change in flood types are assessed. The approach is applied to two different Alpine catchments: the Ubaye region, a small catchment in France, dominated by rain-on-snow flood events during spring, and the larger Salzach catchment in Austria, affected more by rainfall summer/autumn flood events. The results show that the approach is able to reproduce the observed flood types in both catchments. Under future climate scenarios, the methodology identifies changes in the distribution of flood types and characteristics of the flood types in both study areas. The developed methodology has potential to be used in flood impact assessment and disaster risk management, as future changes in flood types will have implications for both the local social and ecological systems.
Article
The upper part of a probability distribution, usually known as the tail, governs both the magnitude and the frequency of extreme events. The tail behaviour of all probability distributions may be, loosely speaking, categorized into two families: heavy-tailed and light-tailed distributions, with the latter generating "milder" and less frequent extremes compared to the former. This emphasizes how important for hydrological design it is to assess the tail behaviour correctly. Traditionally, the wet-day daily rainfall has been described by light-tailed distributions like the Gamma distribution, although heavier-tailed distributions have also been proposed and used, e.g., the Lognormal, the Pareto, the Kappa, and other distributions. Here we investigate the distribution tails for daily rainfall by comparing the upper part of empirical distributions of thousands of records with four common theoretical tails: those of the Pareto, Lognormal, Weibull and Gamma distributions. Specifically, we use 15 029 daily rainfall records from around the world with record lengths from 50 to 172 yr. The analysis shows that heavier-tailed distributions are in better agreement with the observed rainfall extremes than the more often used lighter tailed distributions. This result has clear implications on extreme event modelling and engineering design.
Article
Many urban areas experience both fluvial and pluvial floods, because locations next to rivers are preferred settlement areas and the predominantly sealed urban surface prevents infiltration and facilitates surface inundation. The latter problem is enhanced in cities with insufficient or non-existent sewer systems. While there are a number of approaches to analyse either a fluvial or pluvial flood hazard, studies of a combined fluvial and pluvial flood hazard are hardly available. Thus this study aims to analyse a fluvial and a pluvial flood hazard individually, but also to develop a method for the analysis of a combined pluvial and fluvial flood hazard. This combined fluvial–pluvial flood hazard analysis is performed taking Can Tho city, the largest city in the Vietnamese part of the Mekong Delta, as an example. In this tropical environment the annual monsoon triggered floods of the Mekong River, which can coincide with heavy local convective precipitation events, causing both fluvial and pluvial flooding at the same time. The fluvial flood hazard was estimated with a copula-based bivariate extreme value statistic for the gauge Kratie at the upper boundary of the Mekong Delta and a large-scale hydrodynamic model of the Mekong Delta. This provided the boundaries for 2-dimensional hydrodynamic inundation simulation for Can Tho city. The pluvial hazard was estimated by a peak-over-threshold frequency estimation based on local rain gauge data and a stochastic rainstorm generator. Inundation for all flood scenarios was simulated by a 2-dimensional hydrodynamic model implemented on a Graphics Processing Unit (GPU) for time-efficient flood propagation modelling. The combined fluvial–pluvial flood scenarios were derived by adding rainstorms to the fluvial flood events during the highest fluvial water levels. 
The probabilities of occurrence of the combined events were determined assuming independence of the two flood types and taking the seasonality and probability of coincidence into account. All hazards – fluvial, pluvial and combined – were accompanied by an uncertainty estimation taking into account the natural variability of the flood events. This resulted in probabilistic flood hazard maps showing the maximum inundation depths for a selected set of probabilities of occurrence, with maps showing the expectation (median) and percentile maps showing the uncertainty. The results are critically discussed and their usage in flood risk management is outlined.
Article
The majority of naturally occurring freshwater on small islands is groundwater, which is primarily recharged by precipitation. Recharge rates are therefore likely to be impacted by climate change. Freshwater resources on small islands are particularly vulnerable to climate change because they are limited in size and easily compromised. Here we have compiled available aquifer system characteristics and water-use data for 43 small island developing states distributed worldwide, based on local expert knowledge, publications and regional data sets. Current vulnerability was assessed by evaluating the recharge volume per capita. For future vulnerability, climate change projections were used to estimate changes in aquifer recharge. We find that 44% of islands are in a state of water stress, and while recharge is projected to increase by as much as 117% on 12 islands situated in the western Pacific and Indian Ocean, recharge is projected to decrease by up to 58% on the remaining 31 islands. Of great concern is the lack of enacted groundwater protection legislation for many of the small island developing states identified as highly vulnerable to current and future conditions. Recharge indicators, shown alongside the state of legal groundwater protections, provide a global picture of groundwater supply vulnerability under current and future climate change conditions. http://rdcu.be/kqK4
Article
This paper describes a new weather generator – the 10-state empirical model – that combines a 10-state, first-order Markov chain with a non-parametric precipitation amounts model. Using a doubly-stochastic transition-matrix results in a weather generator for which the overall precipitation distribution (including both wet and dry days) and the temporal-correlation can be modified independently for climate change studies. This paper assesses the ability of the 10-state empirical model to simulate daily area-average precipitation in the Torne River catchment in northern Sweden/western Finland in the context of 3 other models: a 10-state model with a parametric (Gamma) amounts model; a wet/dry chain with the empirical amounts model; and a wet/dry chain with the parametric amounts model. The ability to accurately simulate the distribution of multi-day precipitation in the catchment is the primary consideration. Results showed that the 10-state empirical model represented accumulated 2- to 14-day precipitation most realistically. Further, the distribution of precipitation on wet days in the catchment is related to the placement of a wet day within a wet-spell, and the 10-state models represented this realistically, while the wet/dry models did not. Although all four models accurately reproduced the annual and monthly averages in the training data, all models underestimated inter-annual and inter-seasonal variance. Even so, the 10-state empirical model performed best. We conclude that the multi-state model is a promising candidate for hydrological applications, as it simulates multi-day precipitation well, but that further development is required to improve the simulation of interannual variation.
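One of the comparison models named above, a wet/dry chain with a parametric (Gamma) amounts model, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the cited authors' implementation: the function name, the transition probabilities and the Gamma parameters are hypothetical, and the chain is assumed to start from a dry day.

```python
import random


def simulate_precipitation(n_days, p_wet_after_dry, p_wet_after_wet,
                           shape, scale, seed=None):
    """Simulate daily precipitation with a two-state, first-order Markov chain.

    Wet-day amounts are drawn from a Gamma distribution with the given
    shape and scale; dry days receive 0.0. All parameters are hypothetical.
    """
    rng = random.Random(seed)
    series = []
    wet = False  # assumption: the day before the simulation starts is dry
    for _ in range(n_days):
        # The probability of a wet day depends only on the previous day's state.
        p_wet = p_wet_after_wet if wet else p_wet_after_dry
        wet = rng.random() < p_wet
        series.append(rng.gammavariate(shape, scale) if wet else 0.0)
    return series


# Illustrative parameter values only.
series = simulate_precipitation(365, p_wet_after_dry=0.3, p_wet_after_wet=0.6,
                                shape=0.8, scale=5.0, seed=42)
print(sum(1 for x in series if x > 0.0), "wet days,", round(sum(series), 1), "mm")
```

Because each day depends only on the previous day's state, a model of this kind carries no memory at longer time scales, which is consistent with the underestimated inter-annual variance reported above.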