Adv. Geosci., 45, 139–145, 2018
© Author(s) 2018. This work is distributed under
the Creative Commons Attribution 4.0 License.
A stochastic model for the hourly solar radiation process for application in renewable resources management
Giannis Koudouris, Panayiotis Dimitriadis, Theano Iliopoulou, Nikos Mamassis, and Demetris Koutsoyiannis
Department of Water Resources and Environmental Engineering, School of Civil Engineering, National Technical University of Athens (NTUA), Heroon Polytechneiou 5, 157 80, Zographou, Greece
Correspondence: Giannis Koudouris
Received: 31 May 2018 – Revised: 1 August 2018 – Accepted: 3 August 2018 – Published: 14 August 2018
Abstract. Since the beginning of the 21st century, the scientific community has made huge leaps to exploit renewable energy sources, with solar radiation being one of the most important. However, the variability of solar radiation has a significant impact on solar energy conversion systems, such as photovoltaic systems, which are characterized by a fast and non-linear response to incident solar radiation. The performance prediction of these systems is typically based on hourly or daily data, because data are usually available at these time scales. The aim of this work is to investigate the stochastic nature and time evolution of the solar radiation process at the daily and hourly scales, with the ultimate goal of creating a new cyclostationary stochastic model capable of reproducing the dependence structure and the marginal distribution of hourly solar radiation via the clearness index $K_T$.
1 Introduction
Human activities are either explicitly or implicitly linked with the dynamic behaviour of the solar radiation process. As a result, during the last two decades extensive research (Ettoumi et al., 2002; Tovar-Pescador, 2008; Reno et al., 2012; Tsekouras and Koutsoyiannis, 2014) has been carried out into the stochastic nature of the solar radiation process (e.g. marginal distribution, dependence structure). Although many distributions popular in geophysics (e.g. Gamma, Pareto, Lognormal, Logistic, mixture of two Normal distributions) are suggested in the literature (Ayodele and Ogunjuyigbe, 2015; Jurado et al., 1995; Aguiar and Collares-Pereira, 1992; Hollands and Huget, 1983) and may exhibit a good overall fit, they cannot adequately fit the right tail of the distribution. This can be explained by the fact that the right boundary of the process varies at a seasonal scale. Also, the maximum value of solar radiation that can be measured at the earth's surface is the solar constant (i.e. $G_{sc} = 1367$ W m⁻²) and therefore distributions which are not right-bounded should not be applied to solar radiation. Koudouris et al. (2017) introduced a new marginal distribution (the Kumaraswamy distribution) for the daily solar radiation process, which was verified by three goodness-of-fit tests. The Kumaraswamy distribution may not be a very popular distribution, but it originates from the Beta family of distributions (Jones, 2009) and exhibits some technical advantages in modelling, e.g. an invertible closed form of the cumulative distribution function:
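$$F(z; a, b) = \int_0^z f(\xi; a, b)\,\mathrm{d}\xi = 1 - (1 - z^a)^b$$
$$Q(z) = F^{-1}(z) = \left\{1 - (1 - z)^{1/b}\right\}^{1/a} \qquad (1)$$
where $z \in [0, 1]$ is standardized according to $z = (z' - z_{\min})/(z_{\max} - z_{\min})$, with $z'$ the value before standardization and $z_{\min}$ and $z_{\max}$ the minimum and maximum values determined from the empirical time series.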
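The closed-form inverse is what makes the Kumaraswamy distribution convenient for simulation. The following minimal sketch illustrates Eq. (1) and the min-max standardization; the function names and the parameter values used in any call are illustrative assumptions, not taken from the study.

```python
import random


def kumaraswamy_cdf(z, a, b):
    """F(z; a, b) = 1 - (1 - z**a)**b for z in [0, 1]."""
    return 1.0 - (1.0 - z ** a) ** b


def kumaraswamy_quantile(u, a, b):
    """Closed-form inverse of the CDF: Q(u) = (1 - (1 - u)**(1/b))**(1/a)."""
    return (1.0 - (1.0 - u) ** (1.0 / b)) ** (1.0 / a)


def standardize(values):
    """Rescale raw observations to [0, 1] with the empirical minimum and maximum."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def sample_kumaraswamy(n, a, b, seed=0):
    """Inverse-transform sampling: push uniform draws through the quantile function."""
    rng = random.Random(seed)
    return [kumaraswamy_quantile(rng.random(), a, b) for _ in range(n)]
```

Because the quantile function is explicit, no numerical root finding is needed when generating synthetic values.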
In this research framework, a more extensive analysis is conducted for hourly solar radiation. The marginal distribution at the hourly scale and the dependence structure of the examined process are investigated, with the ultimate goal of synthesizing a preliminary cyclostationary stochastic model capable of generating synthetic series calibrated from regional climate data. Furthermore, one of the most common characteristics of hydrometeorological processes is the double periodicity (i.e. the diurnal and seasonal variation of the process); solar radiation exhibits the same behaviour
(e.g. Fig. 1). The seasonality occurs due to the deterministic movement of the earth in orbit around the sun and around its axis of rotation.
Published by Copernicus Publications on behalf of the European Geosciences Union.
Figure 1. Double periodicity diagram of solar radiation (i.e. averaged measured value for each hour) for the Denver station. Axes: hour of day (1 to 24) versus solar radiation (W m⁻²), with one curve per month.
In order to proceed to the hourly scale data analysis, the double periodicity must be removed from the measured data. To achieve that, the clearness index $K_T$ is introduced (Eq. 2):
$$K_T = \frac{I}{I_0}, \quad K_T \in [0, 1] \qquad (2)$$
The index can be considered as a ratio of both deterministic and stochastic elements. The clearness index $K_T$ describes all the meteorological stochastic influences (e.g. cloudiness, dew point, temperature, atmospheric aerosols), as it is the ratio of the actual solar radiation measured on the ground, $I$, to that available at the top of the atmosphere, $I_0$, and so it accounts for the transparency of the atmosphere.
The denominator of Eq. (2) is a deterministic process,
which can be determined from
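where $d$ is the solar eccentricity (Eq. 4), $G_{sc}$ is the solar constant, $\omega$ is the hour angle (Eq. 5), $\varphi$ is the latitude and $\delta$ is the declination (Eq. 6); these quantities are given as
$$d = 1 + 0.034\cos\left(\frac{2\pi J}{365} - 0.05\right) \qquad (4)$$
where $J = 1$ and $J = 365$ refer to 1 January and 31 December, respectively,
$$\omega = (St - 12)\,\mathrm{h} \cdot \frac{360^\circ}{24\,\mathrm{h}} \;\Rightarrow\; \omega = 15^\circ\,(St - 12), \quad \omega \in [-180^\circ, 180^\circ] \qquad (5)$$
where $St$ is the solar time, $St \in [0, 23]$, so that between successive hours $\omega_1 - \omega_2 = 15^\circ$;
$$\delta = -0.49\cos\left(\frac{2\pi J}{365} + 0.16\right) \qquad (6)$$
Therefore, investigating the stochastic nature of the clearness index $K_T$ can lead to conclusions about the fluctuations in hourly solar radiation.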
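The deterministic part of the clearness index can be sketched in a few lines. This is only an illustration under stated assumptions: it uses the common textbook expression $I_0 = G_{sc}\, d \cos\theta_z$ for the instantaneous top-of-atmosphere radiation on a horizontal surface, which may differ from the hourly expression of Eq. (3); the eccentricity coefficients and the function names are likewise assumptions. The declination amplitude is kept as printed in Eq. (6) and exposed as a parameter.

```python
import math

G_SC = 1367.0  # solar constant in W m^-2, as given in the text


def eccentricity(day_of_year):
    """Assumed form of Eq. (4): d = 1 + 0.034 cos(2*pi*J/365 - 0.05)."""
    return 1.0 + 0.034 * math.cos(2.0 * math.pi * day_of_year / 365.0 - 0.05)


def declination(day_of_year, amplitude=0.49):
    """Eq. (6), in radians; the amplitude follows the value printed in the text."""
    return -amplitude * math.cos(2.0 * math.pi * day_of_year / 365.0 + 0.16)


def hour_angle_deg(solar_time):
    """Eq. (5): 15 degrees per hour away from solar noon."""
    return 15.0 * (solar_time - 12.0)


def top_of_atmosphere(day_of_year, solar_time, latitude_deg):
    """Assumed textbook form I0 = Gsc * d * cos(zenith), floored at zero at night."""
    phi = math.radians(latitude_deg)
    delta = declination(day_of_year)
    omega = math.radians(hour_angle_deg(solar_time))
    cos_zenith = (math.sin(phi) * math.sin(delta)
                  + math.cos(phi) * math.cos(delta) * math.cos(omega))
    return max(0.0, G_SC * eccentricity(day_of_year) * cos_zenith)


def clearness_index(measured, day_of_year, solar_time, latitude_deg):
    """Eq. (2): K_T = I / I0; returns None when the sun is below the horizon."""
    i0 = top_of_atmosphere(day_of_year, solar_time, latitude_deg)
    return None if i0 <= 0.0 else measured / i0
```

Returning `None` for night hours keeps them out of the statistics instead of producing a division by zero.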
2 Experimental data
The analysis of hourly solar radiation is conducted via the clearness index $K_T$. In order to examine the behaviour of the process on a world-wide scale, data from both the United States of America and Greece are examined. Data for Greece were retrieved from the Hydrological Observatory of Athens (HOA). The network consists of more than 10 stations located in the Attica region, measuring environmental variables of hydrometeorological interest. Each station is equipped with a data logger which records at 10 min intervals. These measurements were aggregated to mean hourly data through the open software “Hydrognomon” (http://hydrognomon.org, last access: 9 August 2018). For the Attica region, the $K_T$ timeseries were generated utilizing Eq. (3) by producing the hourly top-of-atmosphere solar intensity at the local station. For the USA, the database of the NREL (National Renewable Energy Laboratory) NSRDB (National Solar Radiation Database) was used, which contains more than 1500 stations with hourly solar radiation, but only 40 of them include time series with more than 14 years of measurements of hourly global solar radiation measured on a horizontal surface.
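The aggregation of 10 min records to hourly means can be illustrated as follows. This is a generic sketch, not the Hydrognomon procedure: the record layout (pairs of timestamp and value) and the convention that a record belongs to the clock hour in which it was stamped are assumptions.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def hourly_means(records):
    """Average (timestamp, value) pairs within each clock hour."""
    buckets = defaultdict(list)
    for stamp, value in records:
        # Truncate the timestamp to the start of its hour to form the group key.
        buckets[stamp.replace(minute=0, second=0, microsecond=0)].append(value)
    return {hour: mean(values) for hour, values in sorted(buckets.items())}


# Six hypothetical 10 min readings within one hour, in W m^-2.
readings = [(datetime(2018, 6, 1, 10, m), v)
            for m, v in zip(range(0, 60, 10), (100, 200, 300, 400, 500, 600))]
hourly = hourly_means(readings)
```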
3 Hourly solar radiation stochastic investigation
3.1 Marginal distribution
According to previous research (Koudouris et al., 2017), daily solar radiation can be modelled by the Kumaraswamy distribution (Eq. 1). A new investigation of the marginal distribution of hourly solar radiation is conducted through the clearness index $K_T$. Firstly, for every examined station, the hourly empirical $K_T$ data are divided into 288 time series (i.e. 24 hourly time series for each month, each constructed from approximately 30 daily observations per year over a certain number of years). This technique is usually found to be sufficient, allowing the construction of a stationary model for the solar radiation time series. The Kumaraswamy distribution is fitted to the empirical data of the clearness index, to evaluate the statistical properties of the solar data under study. Furthermore, three goodness-of-fit tests (Marsaglia and Marsaglia, 2004; Csorgo and Faraway, 1996; Burnham and Anderson, 2003) were applied (i.e. Kolmogorov-Smirnov, Cramér-von Mises and Anderson-Darling) with the significance level set at 5 %. For the Psyttalia station (Greece), only 170 of the 288 time series are considered, namely those with a mean value of solar radiation much larger than zero (0, 1367], as a result of the absence of sunshine during the night period. We calculated that the Kumaraswamy distribution is appropriate for only 44 of them. The empirical probability distributions that were constructed for these results seem to indicate that the clearness index, and therefore the solar radiation, exhibits a bimodal behaviour. As a result, from the analysis of all HOA network stations, the Kumaraswamy distribution cannot describe the empirical data sufficiently well and thus it is an insufficient distribution for modelling hourly solar radiation in the Attica region. Nevertheless, the empirical probability distributions that were constructed from NSRDB stations indicate that hourly solar radiation does not exhibit a bimodal behaviour at every geographic location. This is due to the fact that hourly solar radiation is strongly correlated with the cloudiness process. Consequently, in regions where the cloudiness process does not exhibit fluctuations at small time scales, the Kumaraswamy distribution seems to adequately fit the hourly empirical data. This assumption is confirmed after investigating the marginal distribution of the clearness index at the Barrow station in Alaska, where cloudiness does not exhibit a highly varying behaviour, in contrast to the Attica region. For a better representation of the multivariate analysis scenarios of the clearness index, a linear combination of Kumaraswamy distributions is proposed. A new distribution is constructed considering the sum of two Kumaraswamy distributions conventionally chosen and weighted by a parameter $\lambda \in [0, 1]$. The probability distribution and density functions of the proposed distribution are:
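$$F(x; \lambda, a_1, b_1, a_2, b_2) = \lambda F(x; a_1, b_1) + (1 - \lambda) F(x; a_2, b_2) = \lambda\left[1 - (1 - x^{a_1})^{b_1}\right] + (1 - \lambda)\left[1 - (1 - x^{a_2})^{b_2}\right] \qquad (7)$$
$$f(x; \lambda, a_1, b_1, a_2, b_2) = \lambda f(x; a_1, b_1) + (1 - \lambda) f(x; a_2, b_2) = \lambda a_1 b_1 x^{a_1 - 1}(1 - x^{a_1})^{b_1 - 1} + (1 - \lambda) a_2 b_2 x^{a_2 - 1}(1 - x^{a_2})^{b_2 - 1} \qquad (8)$$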
The parameters are calculated via the least-squares estimator or by maximum likelihood estimation. The resulting $f(x)$ and $F(x)$ exhibit good agreement with the empirical ones (e.g. Fig. 2), notably when solar radiation exhibits a bimodal behaviour (e.g. Fig. 2c, b).
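The weighted sum of two Kumaraswamy distributions and a least-squares objective for fitting it can be sketched as below. The function names, the plotting position $(i + 0.5)/n$ for the empirical distribution and the parameter values are illustrative assumptions; any numerical optimizer could be used to minimize `squared_error` over the five parameters.

```python
def kuma_cdf(x, a, b):
    """Kumaraswamy distribution function on [0, 1]."""
    return 1.0 - (1.0 - x ** a) ** b


def kuma_pdf(x, a, b):
    """Kumaraswamy density on (0, 1)."""
    return a * b * x ** (a - 1.0) * (1.0 - x ** a) ** (b - 1.0)


def mixture_cdf(x, lam, a1, b1, a2, b2):
    """Weighted sum of two Kumaraswamy distribution functions."""
    return lam * kuma_cdf(x, a1, b1) + (1.0 - lam) * kuma_cdf(x, a2, b2)


def mixture_pdf(x, lam, a1, b1, a2, b2):
    """Weighted sum of the two corresponding densities."""
    return lam * kuma_pdf(x, a1, b1) + (1.0 - lam) * kuma_pdf(x, a2, b2)


def squared_error(sample, params):
    """Sum of squared gaps between the empirical CDF and the mixture CDF."""
    ordered = sorted(sample)
    n = len(ordered)
    return sum((mixture_cdf(x, *params) - (i + 0.5) / n) ** 2
               for i, x in enumerate(ordered))


# Hypothetical parameter set (lam, a1, b1, a2, b2) mixing a low and a high component.
example_params = (0.4, 2.0, 8.0, 6.0, 2.0)
```

Since each component integrates to one and the weights sum to one, the mixture is itself a proper distribution on [0, 1].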
3.2 Dependence structure
It is well known from the literature that hydrometeorological processes show a large variability (often linked to the maximum entropy; Koutsoyiannis, 2011) at different time scales, exhibiting so-called long-term persistence (LTP), also known as Hurst-Kolmogorov dynamics (Koutsoyiannis, 2002). In order to investigate whether solar radiation exhibits LTP behaviour, the Hurst parameter (representing the dependence structure of the process) is estimated via the climacogram tool (i.e. a double logarithmic plot of the variance of the averaged process versus the averaging time scale $k$; Koutsoyiannis, 2010). The Hurst parameter is identified only at large scales. Thus, in Fig. 3, it is identified for time scales > 1 year (8760 h), where the seasonal variation is averaged out, and it equals half the slope of the climacogram, as the scale tends to infinity, plus 1. The climacogram has some important statistical advantages compared to the autocovariance and the power spectrum (Dimitriadis and Koutsoyiannis, 2015a; Harrouni et al., 2005). Exploiting the Hurst parameter, the persistence behaviour of the process is quantified and examined. The analysis of both NSRDB and HOA stations yields Hurst parameter estimates larger than 0.5 (where the latter value corresponds to white-noise behaviour); consequently, the examined process exhibits long-term persistence and cannot be considered as either white noise or a Markov process.
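The climacogram and the slope-based Hurst estimate described above can be sketched as follows. This is a simplified illustration: it uses non-overlapping averages and an ordinary least-squares slope over the supplied scales, and it ignores the estimator bias that the study accounts for; the function names are assumptions.

```python
import math
from statistics import mean, pvariance


def climacogram(series, scales):
    """Variance of the series averaged over non-overlapping windows of length k."""
    result = {}
    for k in scales:
        blocks = [mean(series[i:i + k]) for i in range(0, len(series) - k + 1, k)]
        result[k] = pvariance(blocks)
    return result


def hurst_from_climacogram(gamma):
    """H = 1 + slope / 2, with the slope fitted by least squares in log-log space."""
    xs = [math.log(k) for k in gamma]
    ys = [math.log(v) for v in gamma.values()]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return 1.0 + slope / 2.0
```

For white noise the variance of the averaged process falls as $1/k$, so the slope is close to $-1$ and the estimate is close to 0.5; a persistent process gives a flatter slope and a larger $H$.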
Figure 2. Examples of comparison between empirical and theoretical distribution (Cdf and Pdf), where the latter is a linear combination of Kumaraswamy distributions (labelled “Bimodal Kumaraswamy” in the panels). (a, b) Psyttalia station, Greece (bimodal); (c, d) Barrow station, Alaska (unimodal). Panel titles: June, 09:00 and February, 19:00.
Figure 3. Climacogram (variance $\gamma(k)$ versus time scale $k$ in hours, logarithmic axes) of three investigated stations: Denver, H = 0.78; Elizabeth City, H = 0.75; Bluefield, Virginia, H = 0.64.
4 Methodology and application of the model
In this section, we describe a simple methodology to produce synthetic hourly solar radiation. After we carefully select a marginal distribution model (e.g. Eq. 8, Fig. 5), we estimate the distribution parameters $p_{i,j}$ for each diurnal-seasonal process $x_{i,j}$ (e.g. 12 × 24 different sets of parameters; Fig. 4). Then we homogenize the timeseries (Dimitriadis, 2017) by applying the distribution function $F(x_{i,j}; p_{i,j})$ to each one of the diurnal-seasonal processes (with the estimated, e.g. $i = 12$ × $j = 24$, sets of parameters $p_{i,j}$) and then by applying the inverse of the standard Gaussian (or any other) distribution function to each diurnal-seasonal process, i.e. $y = N^{-1}(F(x_{i,j}; p_{i,j}); 0, 1)$. Note that in case the diurnal-seasonal processes are all Gaussian, the proposed homogenization is equivalent to the standard normalization scheme, where for each process the mean is subtracted and the residual is divided by the standard deviation.
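The homogenization step $y = N^{-1}(F(x))$ can be illustrated with a single Kumaraswamy marginal. In this sketch the dictionary `params`, its (month, hour) keys and the parameter values are hypothetical; a fitted model would hold one parameter set per diurnal-seasonal process, and the clipping constant `eps` is an assumption added to keep the Gaussian quantile finite at the bounds.

```python
from statistics import NormalDist


def kuma_cdf(x, a, b):
    """Kumaraswamy distribution function on [0, 1]."""
    return 1.0 - (1.0 - x ** a) ** b


def homogenize(x, a, b, eps=1e-12):
    """Map a clearness-index value to a standard normal score: y = N^-1(F(x))."""
    u = min(max(kuma_cdf(x, a, b), eps), 1.0 - eps)  # keep inv_cdf finite at 0 and 1
    return NormalDist(0.0, 1.0).inv_cdf(u)


# Hypothetical lookup: (month, hour) -> (a, b) of the fitted marginal.
params = {(6, 9): (2.0, 3.0)}
a, b = params[(6, 9)]
scores = [homogenize(x, a, b) for x in (0.2, 0.5, 0.8)]
```

The transformation is monotonic, so the ranks of the observations within each month-hour group are preserved.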
In this implicit way, we manage to homogenize the timeseries $x_{i,j} \sim F(x_{i,j}; p_{i,j})$ to $y \sim N(0, 1)$. In case the marginal distribution is unknown or difficult to estimate, we may use non-linear transformation schemes based on the maximization of entropy (Koutsoyiannis et al., 2008; Dimitriadis and Koutsoyiannis, 2015b). It is noted that a more robust approach to reduce the 12 × 24 sets of parameters would be to employ an analytical expression for the double solar periodicity (as done for the wind process in Deligiannis et al., 2016). This homogenization scheme has been applied to several processes such as wind (Deligiannis et al., 2016), solar radiation (Koudouris et al., 2017), wave height, wave period and wind for renewable energy production (Moschos et al., 2017), river discharge (Pizarro et al., 2018) and precipitation (Dimitriadis and Koutsoyiannis, 2018). However, it is noted that this scheme assumes stationarity in the dependence structure rather than cyclostationarity (for such analyses see Koutsoyiannis et al., 2008, and references therein).
Figure 4. Results of the simulation model for the Denver station: (a) 2-year synthetic timeseries of hourly $K_T$; (b) yearly average of hourly solar radiation observations vs. synthetic values of a 15-year simulation.
Figure 5. Comparison of distribution (a, c) and density (b, d) functions of simulated and observed data, for hour 13:00: (a, b) May; (c, d) September.
The above homogenization enables the estimation and modelling of the dependence structure after the effect of the double periodicity has been approximately removed. This homogenization scheme also enables approximating the correlations among the diurnal processes for the same scale (Fig. 6). After the estimation of the $N(0, 1)$ homogenized timeseries, we estimate the dependence structure through the second-order climacogram and we fit a generalized Hurst-Kolmogorov (GHK) model (Dimitriadis and Koutsoyiannis,
where $\gamma$ is the standardized variance, $k$ the scale (h), $q$ a scale parameter and $H$ the Hurst parameter (for the examined process we estimate $q = 2$ and $H = 0.83$, considering the bias effect; Fig. 7).
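A GHK-type climacogram can be sketched as follows, assuming the form $\gamma(k) = (1 + k/q)^{2H - 2}$, which is commonly used for Hurst-Kolmogorov models with a standardized variance. This expression and the function name are assumptions made for illustration, not a quotation of the fitted model; the default values are the estimates $q = 2$ and $H = 0.83$ reported in the text.

```python
def ghk_climacogram(k, q=2.0, hurst=0.83):
    """Assumed GHK form: standardized variance of the averaged process at scale k (hours)."""
    return (1.0 + k / q) ** (2.0 * hurst - 2.0)


# Standardized variance at a few averaging scales, from one hour to one year.
scales = [1, 24, 720, 8760]
values = [ghk_climacogram(k) for k in scales]
```

Under this form the log-log slope tends to $2H - 2$ at large scales, which is consistent with reading $H$ as one plus half the asymptotic slope of the climacogram.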
Figure 6. Lag-1 autocorrelation of hourly time series for each month: (a) empirical; (b) simulated.
Figure 7. Climacogram (standardized variance $\gamma(k)$ versus time scale $k$ in hours) of observed and simulated $K_T$ vs. white noise and a homogenized process without periodicities.
In this implicit manner, the marginal characteristics of each periodic part are exactly preserved (since the marginal distribution functions are implicitly handled through the proposed homogenization) and the expectation of the second-order dependence structure (e.g. correlation function) is also exactly preserved after properly adjusting for bias through the mode or expected value of the estimator (Dimitriadis, 2017). It is noted that higher-order moments of processes with HK behaviour cannot be adequately preserved in an implicit manner (see an illustrative example in Dimitriadis and Koutsoyiannis, 2018, their Appendix D) and thus, for a more accurate preservation of the dependence structure, an explicit algorithm is necessary (Dimitriadis and Koutsoyiannis, 2018). We may use a simple generation scheme, such as the sum of AR(1) models (SAR; Dimitriadis and Koutsoyiannis, 2015b), that can synthesize any $N(0, 1)$ autoregressive-like dependence structure, which can later be transformed back to the original distribution function $F(x_{i,j}; p_{i,j})$, in this way producing a double-periodic process with the desired marginal distribution for each diurnal-seasonal cycle as well as the desired dependence structure. Finally, we multiply each value of the synthetic $K_T$ by the deterministically determined value of the hourly intensity of solar radiation at the top of the atmosphere.
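The generation step can be sketched as a sum of independent AR(1) processes followed by the back-transformation to the clearness index. This is an illustration only: the component weights and lag-1 coefficients are hypothetical rather than fitted to a target climacogram, and a single Kumaraswamy marginal stands in for the month-by-hour set of distributions.

```python
import math
import random
from statistics import NormalDist


def sar_series(n, components, seed=0):
    """Sum of independent AR(1) processes with unit total variance.

    components: (weight, rho) pairs; the weights are variance shares summing to 1
    and rho is the lag-1 autocorrelation of each component.
    """
    rng = random.Random(seed)
    # Start each component from its stationary distribution.
    states = [rng.gauss(0.0, math.sqrt(w)) for w, _ in components]
    out = []
    for _ in range(n):
        for idx, (w, rho) in enumerate(components):
            innovation_sd = math.sqrt(w * (1.0 - rho ** 2))
            states[idx] = rho * states[idx] + rng.gauss(0.0, innovation_sd)
        out.append(sum(states))
    return out


def back_transform(y, a, b):
    """Gaussian score -> clearness index through the Kumaraswamy quantile."""
    u = NormalDist().cdf(y)
    return (1.0 - (1.0 - u) ** (1.0 / b)) ** (1.0 / a)


# Hypothetical components: one fast-decaying and one slowly decaying AR(1).
gaussian = sar_series(1000, [(0.5, 0.3), (0.5, 0.95)], seed=1)
synthetic_kt = [back_transform(y, 2.0, 3.0) for y in gaussian]
```

Multiplying each synthetic $K_T$ value by the top-of-atmosphere radiation of the corresponding hour then gives the synthetic hourly solar radiation.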
5 Conclusions
The hourly marginal distribution of solar radiation is investigated via the clearness index $K_T$. Regarding the marginal distribution, after analysing a variety of stations with different regional climate data, it is concluded that the Kumaraswamy distribution cannot adequately describe the empirical data of hourly solar radiation. Therefore, a new bimodal distribution is constructed, which is a weighted sum of two Kumaraswamy distributions. This distribution is fully characterized by five parameters and can adequately fit the empirical hourly data of the clearness index in the regions investigated in this study. Also, the dependence structure of the solar radiation process is investigated via the climacogram tool. It is concluded that, since the Hurst parameter is estimated to be higher than 0.5, the examined process exhibits long-term persistence and cannot be considered as either white noise or a Markov process. Finally, a new preliminary stochastic model is proposed, capable of reproducing the clearness index $K_T$ and so the hourly solar radiation. The model
can maintain and preserve the probability density function by means of the first four central moments, and also the Hurst parameter, which represents the correlation and the persistent behaviour of the process.
Data availability. The datasets generated during the current study are available from the corresponding author on reasonable request.
Author contributions. GK conceived of the presented idea, developed the theory and performed the computations. TI helped with R environment statistical tests. PD verified the analytical methods and encouraged GK to investigate the LTP behaviour of the solar radiation process. NM contributed to sample preparation. DK and PD supervised the findings of this work. GK took the lead in writing the manuscript. All authors provided critical feedback and helped shape the research, analysis and manuscript.
Competing interests. The authors declare that they have no conflict of interest.
Special issue statement. This article is part of the special issue “European Geosciences Union General Assembly 2018, EGU Division Energy, Resources & Environment (ERE)”. It is a result of the EGU General Assembly 2018, Vienna, Austria, 8–13 April 2018.
Acknowledgements. The authors warmly thank the anonymous reviewers for their most helpful comments. The statistical analyses were performed in the R statistical environment (RDC Team, 2006), also using the contributed packages VGAM (Yee, 2015), fitdistrplus (Delignette-Muller and Dutang, 2015) and lmomco (Asquith, 2018).
Edited by: Sonja Martens
Reviewed by: two anonymous referees
References
Aguiar, R. and Collares-Pereira, M. T. A. G.: TAG: a time-dependent, autoregressive, Gaussian model for generating synthetic hourly radiation, Sol. Energy, 49, 167–174, 1992.
Asquith, W. H.: Lmomco – L-moments, censored L-moments, trimmed L-moments, L-comoments, and many distributions, R package version 2.3.1, Texas Tech University, Lubbock, Texas, USA, 2018.
Ayodele, T. R. and Ogunjuyigbe, A. S. O.: Prediction of monthly average global solar radiation based on statistical distribution of clearness index, Energy, 90, 1733–1742, 2015.
Burnham, K. P. and Anderson, D. R.: Model selection and multimodel inference: a practical information-theoretic approach, Springer Science & Business Media, New York, USA, 2003.
Csorgo, S. and Faraway, J. J.: The exact and asymptotic distributions of Cramér-von Mises statistics, J. Roy. Stat. Soc. B, 58, 221–234, 1996.
Deligiannis, I., Dimitriadis, P., Daskalou, O., Dimakos, Y., and Koutsoyiannis, D.: Global investigation of double periodicity of hourly wind speed for stochastic simulation; application in Greece, Enrgy. Proced., 97, 278–285, 2016.
Delignette-Muller, M. L. and Dutang, C.: fitdistrplus: An R package for fitting distributions, J. Stat. Softw., 64, 1–34, 2015.
Dimitriadis, P.: Hurst-Kolmogorov dynamics in hydrometeorological processes and in the microscale turbulence, PhD thesis, National Technical University of Athens, Athens, Greece, 167 pp.,
Dimitriadis, P. and Koutsoyiannis, D.: Climacogram versus autocovariance and power spectrum in stochastic modelling for Markovian and Hurst–Kolmogorov processes, Stoch. Env. Res. Risk A., 29, 1649–1669, 2015a.
Dimitriadis, P. and Koutsoyiannis, D.: Application of stochastic methods to double cyclostationary processes for hourly wind speed simulation, Enrgy. Proced., 76, 406–411, 2015b.
Dimitriadis, P. and Koutsoyiannis, D.: Stochastic synthesis approximating any process dependence and distribution, Stoch. Env. Res. Risk A., 32, 1493–1515, 2018.
Ettoumi, F. Y., Mefti, A., Adane, A., and Bouroubi, M. Y.: Statistical analysis of solar measurements in Algeria using beta distributions, Renew. Energ., 26, 47–67, 2002.
Harrouni, S., Guessoum, A., and Maafi, A.: Classification of daily solar irradiation by fractional analysis of 10-min-means of solar irradiance, Theor. Appl. Climatol., 80, 27–36, 2005.
Hollands, K. G. T. and Huget, R. G.: A probability density function for the clearness index, with applications, Sol. Energy, 30, 195–209, 1983.
Jones, M. C.: Kumaraswamy’s distribution: A beta-type distribution with some tractability advantages, Stat. Methodol., 6, 70–81,
Jurado, M., Caridad, J. M., and Ruiz, V.: Statistical distribution of the clearness index with radiation data integrated over five minute intervals, Sol. Energy, 55, 469–473, 1995.
Koudouris, G., Dimitriadis, P., Iliopoulou, T., Mamassis, N., and Koutsoyiannis, D.: Investigation on the stochastic nature of the solar radiation process, Enrgy. Proced., 125, 398–404, 2017.
Koutsoyiannis, D.: The Hurst phenomenon and fractional Gaussian noise made easy, Hydrolog. Sci. J., 47, 573–595, 2002.
Koutsoyiannis, D.: HESS Opinions “A random walk on water”, Hydrol. Earth Syst. Sci., 14, 585–601, 14-585-2010, 2010.
Koutsoyiannis, D.: Hurst–Kolmogorov dynamics as a result of extremal entropy production, Physica A, 390, 1424–143, 2011.
Koutsoyiannis, D., Yao, H., and Georgakakos, A.: Medium-range flow prediction for the Nile: a comparison of stochastic and deterministic methods, Hydrolog. Sci. J., 53, 142–164, 2008.
Marsaglia, G. and Marsaglia, J.: Evaluating the Anderson-Darling distribution, J. Stat. Softw., 9, 1–5, 2004.
Moschos, E., Manou, G., Dimitriadis, P., Afentoulis, V., Koutsoyiannis, D., and Tsoukala, V. K.: Harnessing wind and wave resources for a Hybrid Renewable Energy System in remote islands: a combined stochastic and deterministic approach, Enrgy. Proced., 125, 415–424, 2017.
Pizarro, A., Dimitriadis, P., Chalakatevaki, M., Samela, C., Manfreda, S., and Koutsoyiannis, D.: An integrated stochastic model of the river discharge process with emphasis on floods and bridge scour, European Geosciences Union General Assembly 2018, vol. 20, 11 April 2018, Vienna, Austria, EGU2018-8271, 2018.
Reno, M. J., Hansen, C. W., and Stein, J. S.: Global horizontal irradiance clear sky models: Implementation and analysis, Tech. Rep. SAND2012-2389, Sandia National Laboratories, Albuquerque, New Mexico and Livermore, California, USA, 2012.
RDC Team (R Development Core Team): R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2006.
Tovar-Pescador, J.: Modelling the statistical properties of solar radiation and proposal of a technique based on Boltzmann statistics, in: Modeling Solar Radiation at the Earth’s Surface, Springer, Berlin, Heidelberg, Germany, 55–91, 2008.
Tsekouras, G. and Koutsoyiannis, D.: Stochastic analysis and simulation of hydrometeorological processes associated with wind and solar energy, Renew. Energ., 63, 624–633, 2014.
Yee, T. W.: Vector generalized linear and additive models: with an implementation in R, Springer, New York, USA, 2015.
... One of the main features of the hydrometeorological processes (such as wind speed [73] and solar irradiance [74]) is the double periodicity, interpreting the diurnal and seasonal variation of these uncertain variables. Note that the seasonality occurs considering the deterministic movement of the earth in orbit around the sun and around its axis of rotation [74]. ...
... One of the main features of the hydrometeorological processes (such as wind speed [73] and solar irradiance [74]) is the double periodicity, interpreting the diurnal and seasonal variation of these uncertain variables. Note that the seasonality occurs considering the deterministic movement of the earth in orbit around the sun and around its axis of rotation [74]. Hence, we represent the double periodicity of the solar irradiance and wind speed in Fig. 8. ...
This paper presents a stochastic planning algorithm to plan an operation of a multi-microgrid (MMG) in an electricity market considering the integration of stochastic renewable energy resources (RERs). The proposed planning algorithm investigates the optimal operation of resources (i.e., wind turbine (WT), fuel cell (FC), Electrolyzer, photovoltaic (PV) panel, and microturbine (MT)) and energy storage (ES). Various uncertainties (e.g., the power production of WT, the power production of PV, the departure time of electric vehicle (EV), the arrival time of EV, and the traveled distance of EV) are initially forecasted according to the observed data. The prediction error is estimated by fitting the forecasted data and observed data using a Copula method. A Cournot equilibrium and game theory (GT) are applied to model the real-time electricity market and its interactions with the MMG. The proposed algorithm is examined in a sample MMG to determine the operation of uncertain resources and ES. The obtained results are compared with a baseline and the other conventional optimization methods to verify the effectiveness of the proposed algorithm. The obtained results authenticate the importance of modeling the interaction between the MMG and electricity market, especially under the high integration of uncertain RERs, resulting in above 8% cost reduction in the MMG.
... For the sake of brevity, six of them are presented here, which are more precise for this type of data. Three of them (Beta, Kumaraswamy (Kuma) and Logit-Normal (LogitN)) are suitable for fitting random variables bounded within [0, 1], which is the typical range of average daily CSI values [13,39]. The other three (Logistic (Logist), Log-Logistic (Log-Log), and Generalized Extreme Value (GEV)) are suitable instead for modeling unbounded random variables; however, they may also be useful to characterize bounded variables such as average daily CSI [40]. ...
Full-text available
In this paper, we introduce a model representing the key characteristics of high frequency variations of solar irradiance and photovoltaic (PV) power production based on Clear Sky Index (CSI) data. The model is suitable for data-driven decision-making in electrical distribution grids, e.g., descriptive/predictive analyses, optimization, and numerical simulation. We concentrate on solar irradiance data since the power production of a PV system strongly correlates with solar irradiance at the site location. The solar irradiance is not constant due to the Earth’s orbit and irradiance absorption/scattering from the clouds. To simulate the operation of a PV system with one-minute resolution for a specific coordinate, we have to use a model based on the CSI of the solar irradiance data, capturing the uncertainties caused by cloud movements. The proposed model is based on clustering the days of each year into groups of days, e.g., (i) cloudy, (ii) intermittent cloudy, and (iii) clear sky. The CSI data of each group are divided into bins of magnitudes and the transition probabilities among the bins are identified to deliver a Markov Chain (MC) model to track the intraday weather condition variations. The proposed model is tested on the measurements of two PV systems located at two different climatic regions: (a) Yverdon-les-Bains, Switzerland; and (b) Oahu, Hawaii, USA. The model is compared with a previously published N-state MC model and the performance of the proposed model is elaborated.
... In literature, the Hurst-Kolmogorov dynamics term is used to express and simulate a more general behaviour of the geophysical processes extending from fine-scale fractal structure to the large-scale longterm persistence . In fact, there are global-scale studies in literature, which have traced the LTP behaviour in several of the processes affecting the ET 0 , such as in temperature and wind , and solar radiation (Koudouris et al., 2018). In the presence of the LTP behaviour, the variability of the ET 0 is expected to highly increase (in comparison with a white-noise and Markov behaviours), leading to a decrease in predictability of the ET 0 . ...
Reference evapotranspiration (ET0) is a key component of the water cycle. In this study, a modified Penman–Monteith method incorporating the CO2 effect on surface resistance was used to estimate past change in ET0 in China for 1961–2019 and to project future changes for 2040–2099. A partial-differential equation was used to attribute the changes in ET0. Results indicated the following. (1) A shift point in the annual ET0 series occurred in 1993, in which annual ET0 in China had been decreasing and then increased significantly by -18.93 and 11.19 mm·decade⁻¹ before and after 1993, respectively, with ET0 having changed most drastically within the temperate continental climate zone. (2) Wind speed and solar radiation were the major dominant factors responsible for the change in ET0 during 1961–1992, and their contribution rates were 48.30 and 38.92%, respectively. During 1993–2019, increasing maximum air temperature was the dominant factor contributing to increasing ET0, with a contribution rate of 81.71%, yet wind speed was the dominant factor affecting ET0 changes at 45% of the stations in China. (3) In future projections, maximum air temperature is expected to be the dominant factor influencing the increase in ET0, the values of which were projected at 1.49 and 16.05 mm·decade⁻¹ during 2040–2099 under RCP4.5 and 8.5 scenarios, respectively, with projections from five GCMs. The increasing surface resistance response to elevated CO2 was found to be an important contributor to the decrease in ET0. Particularly for the RCP8.5 scenario, the increase in surface resistance was found to lead to the decrease in ET0 by -0.68 mm·year⁻¹ (-42.37%). This suggests that historical and future tendencies towards aridity in China may be considerably weaker and less extensive than previously assumed.
... We note that a similar index denoted 'clearness index' was introduced by Koudouris et al. (2018). It represents the ratio between the effective solar radiation and the one measured at the top of the atmosphere. ...
This article deals with the production of energy through photovoltaic (PV) panels. The efficiency and quantity of energy produced by a PV panel depend on both deterministic factors, mainly related to the technical characteristics of the panels, and stochastic factors, essentially the amount of incident solar radiation and some climatic variables that modify the efficiency of solar panels such as temperature and wind speed. The main objective of this work is to estimate the energy production of a PV system with fixed technical characteristics through the modeling of the stochastic factors listed above. Besides, we estimate the economic profitability of the plant, net of taxation or subsidiary payment policies, considered taking into account the hourly spot price curve of electricity and its correlation with solar radiation, via vector autoregressive models. Our investigation ends with a Monte Carlo simulation of the models introduced. We also propose the pricing of some quanto options that allow hedging both the price risk and the volumetric risk.
... We note that in general, the spatiotemporal variability of the hourly clearness index may be significant since it depends on local topographic characteristics (mostly affecting scenarios 1 and 3) and on the temporal variability, although we do not expect the spatial variability to be significant at such a small scale. The temporal variability is triggered by the short-term high autocorrelation as well as by the long-term behaviour identified from the time series analysis (Dimitriadis and Koutsoyiannis, 2015) of solar irradiation both over the area of interest and globally (Koudouris et al., 2017, 2018). A proper measure to express the strength of the long-term persistence of a stochastic process is the Hurst parameter (Hurst, 1951; Koutsoyiannis, 2010). ...
We investigate the application of a solar-powered bus route to a small-scale transportation system, such as that of a university campus. In particular, we explore the prospect of replacing conventional fossil fuel buses by electric buses powered by solar energy and electricity provided by the central grid. To this end, we employ GIS mapping technology to estimate the solar radiation at the university campus and, accordingly, we investigate three different scenarios for harnessing the available solar power: (1) solar panels installed on the roof of bus stop shelters, (2) solar panels installed at an unused open space in the university, and (3) solar roads, i.e. roads constructed from photovoltaic (PV) materials. For each of the three scenarios, we investigate the optimal technical configuration, the resulting energy generation, as well as the capital cost for application in the case of NTUA campus in Athens (Greece). The preliminary feasibility analysis showcases that all three scenarios contribute to satisfying transportation demand, proportionately to their size, with scenario (2) presenting the lowest capital cost in relation to energy generation. Therefore, we further explore this scenario by simulating its daily operation including the actions of buying and selling energy to the central grid, when there is energy deficit or surplus, respectively. A sensitivity analysis is carried out in order to ascertain the optimal size of the solar panel installation in relation to profit and reliability. Overall, results indicate that, despite the high capital costs, solar-powered transportation schemes present a viable alternative for replacing conventional buses at the studied location, especially considering conventional PV panels. We note that present results heavily depend on the choice of capacity factors of PV materials, which differ among technologies. Yet, as capacity factors of PV panels are currently increasing, the studied schemes might be more promising in the future.
... It should be noted that the clearness index is sensitive to the short-term effects (atmospheric influences, which are described by statistics) and the long-term effects (Earth's movement, which is described by astronomy) [49]. In general, it represents the ratio of the global solar irradiance on a terrestrial horizontal surface (which is a stochastic quantity) to the global solar irradiance on an extraterrestrial horizontal surface (which is a deterministic quantity) for the same time and site [6,50]. In this context, the concepts of long-term solar radiation data (either daily or monthly average daily) and short-term solar radiation data (either hourly or monthly average hourly) can be utilized to estimate the clearness index [6]. ...
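As a rough illustration of the ratio described in this excerpt, the sketch below divides a measured horizontal irradiance by a deterministic extraterrestrial value. It assumes a solar constant of 1367 W/m², Cooper's approximation for the solar declination and the common 0.033 eccentricity correction; these choices and all function names are assumptions made for the example, not details taken from the cited works.

```python
import math

SOLAR_CONSTANT = 1367.0  # W/m^2, assumed value for this sketch


def extraterrestrial_horizontal(day_of_year, latitude_deg, hour_angle_deg):
    """Extraterrestrial irradiance on a horizontal surface in W/m^2 (deterministic)."""
    phi = math.radians(latitude_deg)
    # Cooper's approximation of the solar declination
    delta = math.radians(23.45) * math.sin(2 * math.pi * (284 + day_of_year) / 365)
    omega = math.radians(hour_angle_deg)
    cos_zenith = (math.sin(phi) * math.sin(delta)
                  + math.cos(phi) * math.cos(delta) * math.cos(omega))
    # Correction for the varying Earth-Sun distance
    eccentricity = 1 + 0.033 * math.cos(2 * math.pi * day_of_year / 365)
    return max(0.0, SOLAR_CONSTANT * eccentricity * cos_zenith)


def clearness_index(measured, day_of_year, latitude_deg, hour_angle_deg):
    """Ratio of measured to extraterrestrial horizontal irradiance, or None at night."""
    g0 = extraterrestrial_horizontal(day_of_year, latitude_deg, hour_angle_deg)
    if g0 <= 0.0:
        return None
    return measured / g0


# Hypothetical reading: 650 W/m^2 at solar noon on day 172 at latitude 38 degrees north
print(clearness_index(650.0, 172, 38.0, 0.0))
```

The denominator depends only on date, time and location, which is why the excerpt calls it deterministic; all the stochastic behaviour sits in the measured numerator.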
The precise estimation of solar radiation data is essential in the long-term evaluation of the techno-economic performance of solar energy conversion systems (e.g., concentrated solar thermal collectors and photovoltaic plants) for each site around the world, particularly direct normal irradiance, which is commonly utilized in designing solar concentrated collectors. However, the lack of direct normal irradiance data compared with global and diffuse horizontal irradiance data and the high cost of measurement equipment represent significant challenges for exploiting and managing solar energy. Consequently, this study was performed to develop two hierarchical methodologies by using various models, empirical correlations and regression equations to estimate hourly solar irradiance data for various worldwide locations (using new correlation coefficients) and different sky conditions (using cloud cover range). Additionally, the preliminary assessment of the potential of solar energy in the selected region was carried out by developing a comprehensive analysis of the solar irradiance data and the clearness index to make a proper decision on the capability of utilizing solar energy technologies. A case study for the San Antonio region in Texas was selected to demonstrate the accuracy of the proposed methodologies for estimating hourly direct normal irradiance and monthly average hourly direct normal irradiance data at this region. The estimated data show good accuracy compared with measured solar data by using locally adjusted coefficients and different statistical indicators. Furthermore, the obtained results show that the selected region is unequivocally amenable to harnessing solar energy as the prime source of energy by utilizing concentrating and non-concentrating solar energy systems.
... The first step of solar energy calculations was carried out to study the capability of incorporating solar energy as a source of thermal power in the system by estimating the potential of this type of energy in the selected site of a case study. The hourly global solar irradiance data measured in the (128), which is considered as a stochastic parameter because it is a function of a period of year, seasons, climatic conditions, and geographic site [88]. ...
The production of shale gas and oil is associated with the generation of substantial amounts of wastewater. With the growing emphasis on sustainable development, the energy sector has been intensifying efforts to manage water resources while diversifying the energy portfolio used in treating wastewater to include fossil and renewable energy. The nexus of water and energy introduces complexity in the optimization of the water management systems. Furthermore, the uncertainty in the data for energy (e.g., solar intensity) and cost (e.g., price fluctuation) introduces additional complexities. The objective of this work is to develop a novel framework for optimizing wastewater treatment and water-management systems in shale gas production while incorporating fossil and solar energy and accounting for uncertainties. Solar energy is utilized via collection, recovery, storage, and dispatch of heat. Heat integration with an adjacent industrial facility is considered. Additionally, electric power production is intended to supply a reverse osmosis (RO) plant and the local electric grid. The optimization problem is formulated as a multi-scenario mixed integer non-linear programming (MINLP) problem that is a deterministic equivalent of a two-stage stochastic programming model for handling uncertainty in operational conditions through a finite set of scenarios. The results show the capability of the system to address water-energy nexus problems in shale gas production based on the system’s economic and environmental merits. A case study for Eagle Ford Basin in Texas is solved by enabling effective water treatment and energy management strategies to attain the maximum annual profit of the entire system while achieving minimum environmental impact.
The stochastic structures of potential evaporation and evapotranspiration (PEV and PET or ETo) are analyzed using the ERA5 hourly reanalysis data and the Penman–Monteith model applied to the well-known CIMIS network. The latter includes high-quality ground meteorological samples with long lengths and simultaneous measurements of monthly incoming shortwave radiation, temperature, relative humidity, and wind speed. It is found that both the PEV and PET processes exhibit a moderate long-range dependence structure with a Hurst parameter of 0.64 and 0.69, respectively. Additionally, it is noted that their marginal structures are found to be light-tailed when estimated through the Pareto–Burr–Feller distribution function. Both results are consistent with the global-scale hydrological-cycle path, determined by all the above variables and rainfall, in terms of the marginal and dependence structures. Finally, it is discussed how the existence of, even moderate, long-range dependence can increase the variability and uncertainty of both processes and, thus, limit their predictability.
To seek stochastic analogies in key processes related to the hydrological cycle, an extended collection of several billions of data values from hundreds of thousands of worldwide stations is used in this work. The examined processes are the near-surface hourly temperature, dew point, relative humidity, sea level pressure, and atmospheric wind speed, as well as the hourly/daily streamflow and precipitation. Through the use of robust stochastic metrics such as the K-moments and a second-order climacogram (i.e., variance of the averaged process vs. scale), it is found that several stochastic similarities exist in both the marginal structure, in terms of the first four moments, and in the second-order dependence structure. Stochastic similarities are also detected among the examined processes, forming a specific hierarchy among their marginal and dependence structures, similar to the one in the hydrological cycle. Finally, similarities are also traced to the isotropic and nearly Gaussian turbulence, as analyzed through extensive lab recordings of grid turbulence and of turbulent buoyant jet along the axis, which resembles the turbulent shear and buoyant regime that dominates and drives the hydrological-cycle processes in the boundary layer. The results are found to be consistent with other studies in literature such as solar radiation, ocean waves, and evaporation, and they can be also justified by the principle of maximum entropy. Therefore, they allow for the development of a universal stochastic view of the hydrological-cycle under the Hurst–Kolmogorov dynamics, with marginal structures extending from nearly Gaussian to Pareto-type tail behavior, and with dependence structures exhibiting roughness (fractal) behavior at small scales, long-term persistence at large scales, and a transient behavior at intermediate scales.
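The climacogram mentioned above (variance of the averaged process versus scale) can be sketched in a few lines. The example assumes the simple Hurst-Kolmogorov scaling in which the variance at scale k is proportional to k^(2H-2), so that H follows from the log-log slope; it ignores the bias adjustments that the cited studies discuss, and the function names and scales are illustrative.

```python
import math
import random


def climacogram(series, scales):
    """Sample variance of the block-averaged series for each averaging scale k."""
    result = []
    for k in scales:
        n_blocks = len(series) // k
        means = [sum(series[i * k:(i + 1) * k]) / k for i in range(n_blocks)]
        grand = sum(means) / n_blocks
        result.append(sum((m - grand) ** 2 for m in means) / (n_blocks - 1))
    return result


def hurst_from_climacogram(scales, variances):
    """Least-squares slope of log(variance) vs log(scale); H = 1 + slope / 2."""
    xs = [math.log(k) for k in scales]
    ys = [math.log(v) for v in variances]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return 1 + slope / 2


# White noise has no persistence, so the estimate should be close to 0.5
random.seed(1)
white = [random.gauss(0.0, 1.0) for _ in range(20000)]
scales = [1, 2, 4, 8, 16, 32]
H = hurst_from_climacogram(scales, climacogram(white, scales))
print(round(H, 2))
```

A persistent series would show a slower drop of variance with scale and therefore an estimate above 0.5.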
An extension of the symmetric-moving-average (SMA) scheme is presented for stochastic synthesis of a stationary process for approximating any dependence structure and marginal distribution. The extended SMA model can exactly preserve an arbitrary second-order structure as well as the high order moments of a process, thus enabling a better approximation of any type of dependence (through the second-order statistics) and marginal distribution function (through statistical moments), respectively. Interestingly, by explicitly preserving the coefficient of kurtosis, it can also simulate certain aspects of intermittency, often characterizing the geophysical processes. Several applications with alternative hypothetical marginal distributions, as well as with real world processes, such as precipitation, wind speed and grid-turbulence, highlight the scheme’s wide range of applicability in stochastic generation and Monte-Carlo analysis. Particular emphasis is given on turbulence, in an attempt to simulate in a simple way several of its characteristics regarded as puzzles.
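The basic symmetric-moving-average idea can be sketched as a weighted sum of white noise with weights that are symmetric around the current time step. The sketch below covers only the second-order part with Gaussian noise; the weights are arbitrary illustrative values rather than ones derived from a target dependence structure, and the preservation of higher-order moments described in the abstract is not attempted.

```python
import random


def sma_generate(weights, n, rng):
    """Generate x_i = sum over j in [-q, q] of a_|j| * v_(i+j), with v white noise.

    weights holds [a_0, a_1, ..., a_q].
    """
    q = len(weights) - 1
    noise = [rng.gauss(0.0, 1.0) for _ in range(n + 2 * q)]
    series = []
    for i in range(n):
        center = i + q  # shift so that indices center - q .. center + q are valid
        series.append(sum(weights[abs(j)] * noise[center + j]
                          for j in range(-q, q + 1)))
    return series


def sma_autocovariance(weights, lag):
    """Autocovariance implied by the weights for unit-variance white noise."""
    q = len(weights) - 1
    return sum(weights[abs(l)] * weights[abs(l + lag)]
               for l in range(-q, q - lag + 1))


weights = [1.0, 0.5, 0.25]  # hypothetical a_0, a_1, a_2
x = sma_generate(weights, 20000, random.Random(3))
print([sma_autocovariance(weights, lag) for lag in range(6)])
```

Comparing the sample variance of `x` with `sma_autocovariance(weights, 0)` gives a quick consistency check of the generator.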
Floods have an important influence on society, being able to affect human life, human properties and also cultural heritage. Nevertheless, the dynamics of floods and their interaction with infrastructure over time is still unexplored. Therefore, there is a significant need for the development of new hydrologic and hydraulic modeling techniques able to represent the process in a realistic way. With this aim, the stochastic structure of the discharge has been modeled by a generalized Hurst-Kolmogorov (HK) process in terms of dependence structure (from long to short term) and marginal distribution (from left to right distribution tail). Several long length discharge time series have been filtered with the aim to ensure a minimum human influence on the discharge regime. Time series were analyzed using the climacogram stochastic tool for the analysis because of its good properties, such as small statistical errors, a priori known bias and a mean close to its mode. Finally, a general and parsimonious discharge model, with emphasis on floods, is coupled with a hydraulic model for long run numerical simulations. The authors are seeking to apply these ideas to evaluate the hydraulic infrastructure risk due to the discharge uncertainty and, in particular, to assess the bridge scour risk.
The high complexity and uncertainty of atmospheric dynamics has long been identified through the observation and analysis of hydroclimatic processes such as temperature, dew-point, humidity, atmospheric wind, precipitation, atmospheric pressure, river discharge and stage etc. Particularly, all these processes seem to exhibit high unpredictability due to the clustering of events, a behaviour first identified in Nature by H.E. Hurst in 1951 while working at the River Nile, although its mathematical description is attributed to A. N. Kolmogorov who developed it while studying turbulence in 1940. To give credit to both scientists this behaviour and dynamics is called Hurst-Kolmogorov (HK). In order to properly study the clustering of events as well as the stochastic behaviour of hydroclimatic processes in general we would require numerous measurements at the annual scale. Unfortunately, long records of high quality annual data are hardly available in observations of hydroclimatic processes. However, the microscopic processes driving and generating the hydroclimatic ones are governed by turbulent state. By studying turbulent phenomena in situ we may be able to understand certain aspects of the related macroscopic processes in field. Certain strong advantages of studying microscopic turbulent processes in situ are the recording of very long time series, the high resolution of records and the controlled environment of the laboratory. The analysis of these time series offers the opportunity of better comprehension, control and comparison of the two scientific methods through the deterministic and stochastic approach. In this thesis, we explore and further advance the second-order stochastic framework for the empirical as well as theoretical estimation of the marginal characteristic and dependence structure of a process (from small to extreme behaviour in time and state).
Also, we develop and apply explicit and implicit algorithms for stochastic synthesis of mathematical processes as well as stochastic prediction of physical processes. Moreover, we analyze several turbulent processes and we estimate the Hurst parameter (H >> 0.5 for all cases) and the drop of variance with scale based on experiments in turbulent jets held at the laboratory. Additionally, we propose a stochastic model for the behaviour of a process from the micro to the macro scale that results from the maximization of entropy for both the marginal distribution and the dependence structure. Finally, we apply this model to microscale turbulent processes, as well as hydroclimatic ones extracted from thousands of stations around the globe including countless data. The most important innovation of this thesis is that, to the Author’s knowledge, a unique framework (through modelling of common expression of both the marginal density distribution function and the second-order dependence structure) is presented that can include the simulation of the discretization effect, the statistical bias, certain aspects of the turbulent intermittent (or else fractal) behaviour (at the microscale of the dependence structure) and the long-term behaviour (at the macroscale of the dependence structure), the extreme events (at the left and right tail of the marginal distribution), as well as applications to 13 turbulent and hydroclimatic processes including experimentation and global analyses of surface stations (overall, several billions of observations). The major innovations of the thesis are: (a) the further development, and extensive application to numerous processes, of the classical second-order stochastic framework including innovative approaches to account for intermittency, discretization effects and statistical bias; (b) the further development of stochastic generation schemes such as the Sum of Autoregressive (SAR) models, e.g. AR(1) or ARMA(1,1), the Symmetric-Moving-Average (SMA) scheme in many dimensions (that can generate any process second-order dependence structure, approximate any marginal distribution to the desired level of accuracy and simulate certain aspects of the intermittent behaviour) and explicit and implicit (pseudo) cyclo-stationary (pCSAR and pCSMA) schemes for simulating the deterministic periodicities of a process such as seasonal and diurnal; and (c) the introduction and application of an extended stochastic model (with an innovative identical expression of a four-parameter marginal distribution density function and correlation structure, i.e. g(x; C) = λ/(1 + |x/a + b|^c)^d, with C = [λ, a, b, c, d]), that encloses a large variety of distributions (ranging from Gaussian to powered-exponential and Pareto) as well as dependence structures (such as white noise, Markov and HK), and is in agreement (in this form or through more simplified versions) with an interestingly large variety of turbulent (such as horizontal and vertical thermal jet of positively buoyant processes using laser-induced-fluorescence techniques as well as grid-turbulence generated within a wind-tunnel), geostatistical (such as 2d rock formations), and hydroclimatic processes (such as temperature, atmospheric wind, dew-point and thus, humidity, precipitation, atmospheric pressure, river discharges and solar radiation, in a global scale, as well as a very long time series of river stage, and wave height and period). Amazingly, all examined physical processes (overall 13) exhibited long-range dependence and in particular, most (if treated properly within a robust physical and statistical framework, e.g. by adjusting the process for sampling errors as well as discretization and bias effects) with a mean long-term persistence parameter equal to H ≈ 5/6 (as in the case of isotropic grid-turbulence), and (for the processes examined in the microscale such as atmospheric wind, surface temperature and dew-point, in a global scale, and a long duration discharge time series and storm event in terms of precipitation and wind) a powered-exponential behaviour with a fractal parameter close to M ≈ 1/3 (as in the case of isotropic grid-turbulence).
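The common expression quoted in this abstract, λ divided by (1 + |x/a + b|^c) raised to the power d, is simple enough to evaluate directly. The sketch below follows the parenthesisation as printed in the abstract; the parameter values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def g(x, lam, a, b, c, d):
    """Evaluate lam / (1 + |x/a + b|**c)**d for the parameter set C = [lam, a, b, c, d]."""
    return lam / (1.0 + abs(x / a + b) ** c) ** d


# Hypothetical parameter values, for illustration only
print(g(0.0, 1.0, 1.0, 0.0, 2.0, 1.0))  # peak value when b = 0
print(g(1.0, 1.0, 1.0, 0.0, 2.0, 1.0))  # half of the peak at x = a for c = 2, d = 1
print(g(3.0, 2.0, 1.0, 0.0, 1.0, 2.0))
```

For large |x| the expression decays roughly like |x/a| raised to the power -c·d, which is consistent with the Pareto-type tails mentioned in the abstract.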
Wind and wave resources enclose an important portion of the planet’s energy potential. While wind energy has been effectively harnessed through the last decades to substitute other forms of energy production, the utilization of the synergy between wind and wave resource has not yet been adequately investigated. Such a hybrid energy system could prove efficient in covering the needs of non-connected remote islands. A combined deterministic and stochastic methodology is presented in a case study of a remote Aegean island, by assessing a 100-year climate scenario incorporating uncertainty parameters and exploring the possibilities of fully covering its energy demands.
A detailed investigation of the variability of solar radiation can prove useful towards more efficient and sustainable design of renewable resources systems. In this context, we analyze observations from Athens, Greece and we investigate the marginal distribution of the solar radiation process at a daily and hourly step, the long-term behavior based on the annual scale of the process, as well as the double periodicity (diurnal-seasonal) of the process. Finally, we apply a parsimonious double-cyclostationary stochastic model to generate hourly synthetic time series preserving the marginal statistical characteristics, the double periodicity and the dependence structure of the process.
The wind process is considered an important hydrometeorological process and one of the basic resources of renewable energy. In this paper, we analyze the double periodicity of wind, i.e., daily and annual, for numerous wind stations with hourly data around the globe and we develop a four-parameter model. Additionally, we apply this model to several stations in Greece and we estimate their marginal characteristics and stochastic structure best described by an extended-Pareto marginal probability function and a Hurst-Kolmogorov process, respectively.
The package fitdistrplus provides functions for fitting univariate distributions to different types of data (continuous censored or non-censored data and discrete data) and allowing different estimation methods (maximum likelihood, moment matching, quantile matching and maximum goodness-of-fit estimation). Outputs of fitdist and fitdistcens functions are S3 objects, for which specific methods are provided, including summary, plot and quantile. This package also provides various functions to compare the fit of several distributions to the same data set and can handle bootstrapping of parameter estimates. Detailed examples are given in food risk assessment, ecotoxicology and insurance contexts. Download at
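Of the estimation methods listed above, moment matching is the easiest to illustrate. The sketch below is a plain Python stand-in, not the R package's API: it fits a gamma distribution by equating the sample mean and variance to shape × scale and shape × scale², using synthetic data with assumed true parameters.

```python
import random
import statistics


def fit_gamma_moments(data):
    """Moment-matching fit of a gamma distribution; returns (shape, scale)."""
    mean = statistics.mean(data)
    var = statistics.variance(data)
    shape = mean ** 2 / var  # from mean = shape * scale and var = shape * scale**2
    scale = var / mean
    return shape, scale


# Synthetic sample with assumed true parameters shape = 2, scale = 3
random.seed(7)
sample = [random.gammavariate(2.0, 3.0) for _ in range(20000)]
shape, scale = fit_gamma_moments(sample)
print(round(shape, 2), round(scale, 2))
```

Maximum likelihood or quantile matching would generally give different estimates from the same sample, which is why the package exposes several methods side by side.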
In this paper, we present a methodology to analyze processes of double cyclostationarity (e.g. daily and seasonal). This method preserves the marginal characteristics as well as the dependence structure of a process (through the use of climacogram). It consists of a normalization scheme with two periodicities. Furthermore, we apply it to a meteorological station in Greece and construct a stochastic model capable of preserving the Hurst-Kolmogorov behaviour. Finally, we produce synthetic time-series (based on aggregated Markovian processes) for the purpose of wind speed and energy production simulation (based on a proposed industrial wind turbine).
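One simple way to picture a normalization with two periodicities is to standardize each value by the mean and standard deviation of its own (month, hour) group. This is only an assumed generic form; the abstract does not spell out the scheme the authors use, and the record layout and sample values below are hypothetical.

```python
from collections import defaultdict
import statistics


def standardize_by_month_and_hour(records):
    """Standardize values within each (month, hour) group.

    records is a list of (month, hour, value) tuples; the output keeps their order.
    """
    groups = defaultdict(list)
    for month, hour, value in records:
        groups[(month, hour)].append(value)
    stats = {key: (statistics.mean(vals), statistics.pstdev(vals))
             for key, vals in groups.items()}
    standardized = []
    for month, hour, value in records:
        mean, std = stats[(month, hour)]
        # A constant group carries no variability, so map it to zero.
        standardized.append((value - mean) / std if std > 0 else 0.0)
    return standardized


# Hypothetical wind-speed readings at noon in January and July
records = [(1, 12, 2.0), (1, 12, 4.0), (7, 12, 10.0), (7, 12, 30.0)]
print(standardize_by_month_and_hour(records))
```

After such a step the seasonal and diurnal cycles are removed from the first two moments, leaving a series whose dependence structure can be studied with tools such as the climacogram.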
It is shown that an asymptotically precise one‐term correction to the asymptotic distribution function of the classical Cramér‐von Mises statistic approximates the exact distribution function remarkably closely for sample sizes as small as 7 or even smaller. This correction can be quickly evaluated, and hence it is suitable for the computation of practically exact p‐values when testing simple goodness of fit. Similar findings hold for Watson's rotationally invariant modification, where a sample size of 4 appears to suffice.
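For reference, the classical Cramér-von Mises statistic itself can be computed as 1/(12n) plus the sum of squared differences between the hypothesized distribution function at the ordered sample points and the values (2i - 1)/(2n). The sketch below computes only this statistic for a fully specified distribution; it does not implement the one-term correction to its distribution function that the abstract describes, and the uniform test case is an illustrative choice.

```python
def cramer_von_mises(sample, cdf):
    """Cramér-von Mises statistic for a fully specified hypothesized cdf."""
    n = len(sample)
    ordered = sorted(sample)
    total = 1.0 / (12 * n)
    for i, x in enumerate(ordered, start=1):
        total += (cdf(x) - (2 * i - 1) / (2 * n)) ** 2
    return total


def uniform_cdf(x):
    """Distribution function of the uniform distribution on [0, 1]."""
    return min(1.0, max(0.0, x))


# A sample sitting exactly on the expected midpoints gives the minimum value 1/(12n)
print(cramer_von_mises([0.125, 0.375, 0.625, 0.875], uniform_cdf))
```

Larger values of the statistic indicate a poorer agreement between the sample and the hypothesized distribution.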
This book presents a greatly enlarged statistical framework compared to generalized linear models (GLMs) with which to approach regression modelling. Comprising about half-a-dozen major classes of statistical models, and fortified with necessary infrastructure to make the models more fully operable, the framework allows analyses based on many semi-traditional applied statistics models to be performed as a coherent whole. Since their advent in 1972, GLMs have unified important distributions under a single umbrella with enormous implications. However, GLMs are not flexible enough to cope with the demands of practical data analysis. And data-driven GLMs, in the form of generalized additive models (GAMs), are also largely confined to the exponential family. The methodology here and accompanying software (the extensive VGAM R package) are directed at these limitations and are described comprehensively for the first time in one volume. This book treats distributions and classical models as generalized regression models, and the result is a much broader application base for GLMs and GAMs. The book can be used in senior undergraduate or first-year postgraduate courses on GLMs or categorical data analysis and as a methodology resource for VGAM users. In the second part of the book, the R package VGAM allows readers to grasp immediately applications of the methodology. R code is integrated in the text, and datasets are used throughout. Potential applications include ecology, finance, biostatistics, and social sciences. The methodological contribution of this book stands alone and does not require use of the VGAM package.