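$$ e_t = y_t - f_t^{(m)} \qquad (1) $$

$$ \mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} |e_i| = \underset{i=1,n}{\mathrm{mean}}\,|e_i| \qquad (2) $$

$$ \mathrm{MdAE} = \underset{i=1,n}{\mathrm{median}}\,|e_i| \qquad (3) $$

$$ \mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n} e_i^2 = \underset{i=1,n}{\mathrm{mean}}\,(e_i^2) \qquad (4) $$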
World Applied Sciences Journal 24 (Information Technologies in Modern Industry, Education & Society): 171-176, 2013
ISSN 1818-4952
© IDOSI Publications, 2013
DOI: 10.5829/idosi.wasj.2013.24.itmies.80032
Corresponding Author: Shcherbakov, Volgograd State Technical University, Lenin avenue, 28, 400005, Volgograd, Russia.
A Survey of Forecast Error Measures
Maxim Vladimirovich Shcherbakov, Adriaan Brebels,
Nataliya Lvovna Shcherbakova, Anton Pavlovich Tyukov,
Timur Alexandrovich Janovsky and Valeriy Anatol’evich Kamaev
Volgograd State Technical University, Volgograd, Russia
Submitted: Aug 7, 2013; Accepted: Sep 18, 2013; Published: Sep 25, 2013
Abstract: This article reviews commonly used forecast error measures. The measures are grouped into seven categories: absolute forecasting errors, measures based on percentage errors, symmetric errors, measures based on relative errors, scaled errors, relative measures and other error measures. For each measure the formula is presented and its drawbacks are discussed. To reduce the impact of outliers, an Integral Normalized Mean Square Error has been proposed. Because each error measure has disadvantages that can lead to an inaccurate evaluation of forecasting results, it is impossible to choose only one measure; recommendations for selecting appropriate error measures are therefore given.
Key words: Forecasting, forecast accuracy, forecast error measurements
INTRODUCTION
Different criteria, such as forecast error measurements, the speed of calculation, interpretability and others, have been used to assess the quality of forecasting [1-6]. Forecast error measures, or forecast accuracy, are the most important in solving practical problems [6]. Typically, the commonly used forecast error measurements are applied for estimating the quality of forecasting methods and for choosing the best forecasting mechanism in the case of multiple objects. In every domain a set of "traditional" error measurements is applied as a preset, despite their drawbacks.
This paper provides an analysis of existing and quite common forecast error measures that are used in forecasting [4, 7-10]. The measures are divided into groups according to the method of calculating the error value for a certain time t. The calculating formula, the description of the drawbacks and the names of the assessments are considered for each error measure.
A Review
Absolute Forecasting Error: The first group is based on the absolute error calculation. It includes estimates based on the calculation of the value $e_t$ given by equation (1), where $y_t$ is the measured value at time t and $f_t^{(m)}$ is the predicted value at time t, obtained from the forecast model m. Hereinafter the index of the model (m) will be omitted.
Mean Absolute Error, MAE, is given by formula (2), where n is the forecast horizon and mean(•) is the mean operation.
Median Absolute Error, MdAE, is obtained using formula (3), where median(•) is the operation for calculating a median.
Mean Square Error, MSE, is calculated by formula (4).
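As a minimal sketch of how measures (2)-(4) behave, the following Python snippet computes them for a small series. The function names and the sample data are illustrative assumptions, not part of the original paper; the square root of the MSE is included because it is expressed on the same scale as the data.

```python
import math
import statistics


def errors(actual, forecast):
    """Forecast errors e_t = y_t - f_t, as in equation (1)."""
    return [y - f for y, f in zip(actual, forecast)]


def mae(actual, forecast):
    """Mean absolute error, equation (2)."""
    return statistics.mean([abs(e) for e in errors(actual, forecast)])


def mdae(actual, forecast):
    """Median absolute error, equation (3)."""
    return statistics.median([abs(e) for e in errors(actual, forecast)])


def mse(actual, forecast):
    """Mean square error, equation (4)."""
    return statistics.mean([e * e for e in errors(actual, forecast)])


def rmse(actual, forecast):
    """Square root of the mean square error."""
    return math.sqrt(mse(actual, forecast))


# Hypothetical data: the last point is an outlier that the forecast misses.
actual = [10.0, 12.0, 14.0, 100.0]
forecast = [11.0, 12.0, 13.0, 60.0]

print(mae(actual, forecast))   # 10.5
print(mdae(actual, forecast))  # 1.0
print(mse(actual, forecast))   # 400.5
print(rmse(actual, forecast))
```

In this example the single missed outlier dominates MAE and MSE, while MdAE stays at 1.0.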
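$$ \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} e_i^2} = \sqrt{\underset{i=1,n}{\mathrm{mean}}\,(e_i^2)} \qquad (5) $$

$$ p_t = \frac{e_t}{y_t} \qquad (6) $$

$$ \mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n} 100\cdot|p_i| = \underset{i=1,n}{\mathrm{mean}}\,(100\cdot|p_i|) \qquad (7) $$

$$ \mathrm{MdAPE} = \underset{i=1,n}{\mathrm{median}}\,(100\cdot|p_i|) \qquad (8) $$

$$ \mathrm{RMSPE} = \sqrt{\underset{i=1,n}{\mathrm{mean}}\,(100\cdot p_i)^2} \qquad (9) $$

$$ \mathrm{RMdSPE} = \sqrt{\underset{i=1,n}{\mathrm{median}}\,(100\cdot p_i)^2} \qquad (10) $$

$$ s_t = \frac{e_t}{y_t + f_t} \qquad (11) $$

$$ \mathrm{sMAPE} = \frac{1}{n}\sum_{i=1}^{n} 200\cdot|s_i| = \underset{i=1,n}{\mathrm{mean}}\,(200\cdot|s_i|) \qquad (12) $$

$$ \mathrm{sMdAPE} = \underset{i=1,n}{\mathrm{median}}\,(200\cdot|s_i|) \qquad (13) $$

$$ \mathrm{msMAPE} = \frac{1}{n}\sum_{i=1}^{n} \frac{|y_i - f_i|}{(y_i + f_i)/2 + S_i} \qquad (14) $$

$$ S_i = \frac{1}{i-1}\sum_{k=1}^{i-1} |y_k - \bar{y}_{i-1}|, \qquad \bar{y}_{i-1} = \frac{1}{i-1}\sum_{k=1}^{i-1} y_k $$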
Hence, Root Mean Square Error, RMSE, is calculated by formula (5).
These error measures are the most popular in various domains [8, 9]. However, absolute error measures have the following shortcomings.
• The main drawback is scale dependency [9]. Therefore, if the forecast task includes objects with different scales or magnitudes, absolute error measures cannot be applied.
• The next drawback is the high influence of outliers in the data on the forecast performance evaluation [11]. If the data contain an outlier with an extreme value (a common case in real-world tasks), absolute error measures provide conservative values.
• RMSE and MSE have low reliability: the results can differ depending on which fraction of the data is used [4].
Measures Based on Percentage Errors: Percentage errors are calculated based on the value $p_t$ given by equation (6). These errors are also among the most common in the forecasting domain. The group of percentage-based errors includes the following measures.
Mean Absolute Percentage Error, MAPE, is given by formula (7).
Median Absolute Percentage Error, MdAPE, is more resistant to outliers and is calculated according to formula (8).
Root Mean Square Percentage Error, RMSPE, is calculated according to formula (9), and its median counterpart, the Root Median Square Percentage Error, RMdSPE, according to formula (10).
We note the following shortcomings.
• Division by zero occurs when the actual value is equal to zero.
• Non-symmetry: the error values differ depending on whether the predicted value is bigger or smaller than the actual value [12-14].
• Outliers have a significant impact on the result, particularly if an outlier has a value much bigger than the maximal value of the "normal" cases [4].
• The error measures are biased. This can lead to an incorrect evaluation of the performance of forecasting models [15].
Symmetric Errors: The criteria included in this group are calculated based on the value $s_t$ given by equation (11).
The group includes the following measures. Symmetric Mean Absolute Percentage Error, sMAPE, is calculated according to formula (12), and the Symmetric Median Absolute Percentage Error, sMdAPE, according to formula (13).
To avoid the problems associated with division by zero, a modified sMAPE (Modified sMAPE, msMAPE) has been proposed. Its denominator has an additional term $S_i$, as given in formula (14).
Developing the idea of including additional terms, more sophisticated measures were presented in [16]: KL-N, KL-N1, KL-N2, KL-DE1, KL-DE2, IQR.
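The percentage and symmetric measures in (7), (8) and (12) can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the function names and sample values are hypothetical, and the explicit zero checks simply surface the division-by-zero cases noted above.

```python
import statistics


def mape(actual, forecast):
    """MAPE, equation (7): mean of 100 * |e_t / y_t|."""
    if any(y == 0 for y in actual):
        raise ZeroDivisionError("MAPE is undefined when an actual value is zero")
    return statistics.mean([100 * abs((y - f) / y) for y, f in zip(actual, forecast)])


def mdape(actual, forecast):
    """MdAPE, equation (8): median of 100 * |e_t / y_t|."""
    if any(y == 0 for y in actual):
        raise ZeroDivisionError("MdAPE is undefined when an actual value is zero")
    return statistics.median([100 * abs((y - f) / y) for y, f in zip(actual, forecast)])


def smape(actual, forecast):
    """sMAPE, equation (12): mean of 200 * |e_t / (y_t + f_t)|."""
    if any(y + f == 0 for y, f in zip(actual, forecast)):
        raise ZeroDivisionError("sMAPE is undefined when y_t + f_t is zero")
    return statistics.mean([200 * abs((y - f) / (y + f)) for y, f in zip(actual, forecast)])


# Swapping the roles of actual and forecast changes MAPE but not sMAPE.
print(mape([100.0], [150.0]))   # 50.0
print(mape([150.0], [100.0]))   # about 33.33
print(smape([100.0], [150.0]))  # about 40.0
print(smape([150.0], [100.0]))  # about 40.0
```

The same absolute miss of 50 units yields 50% or about 33% under MAPE depending on which value is the actual one, which illustrates the non-symmetry listed among the shortcomings of percentage errors.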
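$$ r_t = \frac{e_t}{y_t - f_t^{*}} \qquad (15) $$

$$ f_t^{*} = y_{t-l} \qquad (16) $$

$$ \mathrm{MRAE} = \underset{i=1,n}{\mathrm{mean}}\,|r_i| \qquad (17) $$

$$ \mathrm{MdRAE} = \underset{i=1,n}{\mathrm{median}}\,|r_i| \qquad (18) $$

$$ q_t = \frac{e_t}{\frac{1}{n-1}\sum_{i=2}^{n} |y_i - y_{i-1}|} \qquad (19) $$

$$ \mathrm{MASE} = \underset{i=1,n}{\mathrm{mean}}\,|q_i| \qquad (20) $$

$$ \mathrm{RMSSE} = \sqrt{\underset{i=1,n}{\mathrm{mean}}\,(q_i^2)} \qquad (22) $$

$$ \mathrm{RMAE} = \frac{\mathrm{MAE}}{\mathrm{MAE}^{*}} \qquad (23) $$

$$ \mathrm{RRMSE} = \frac{\mathrm{RMSE}}{\mathrm{RMSE}^{*}} \qquad (24) $$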
The following disadvantages should be noted.
• Despite its name, this error is also non-symmetric [13].
• Furthermore, if the actual value is equal to the forecasted value but with the opposite sign, or if both values are zero, a division by zero error occurs.
• These criteria are affected by outliers in the same way as the percentage errors.
• If more complex estimations are used, the results become harder to interpret, which slows their adoption in practice [4].
Measures Based on Relative Errors: The basis for calculating the errors in this group is the value $r_t$ determined by equation (15), where $f_t^{*}$ is the predicted value obtained using a reference (benchmark) model. The main practice is to use a naive model as the reference model, equation (16), where l is the value of the lag and l = 1.
The group includes the following measures. Mean Relative Absolute Error, MRAE, is given by formula (17).
Median Relative Absolute Error, MdRAE, is calculated according to formula (18).
Geometric Mean Relative Absolute Error, GMRAE, is calculated similarly to (17), but the geometric mean gmean(•) is used instead of mean(•).
The following shortcomings should be noted.
• By formulas (15)-(18), a division by zero error occurs if the predicted value obtained by the reference model is equal to the actual value.
• If the naive model is chosen as the reference, a division by zero error occurs whenever the time series contains a run of identical consecutive values.
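The relative-error measures (15)-(18) with a naive reference model can be sketched in Python as below. The function names and sample series are assumptions made for illustration; the first `lag` observations are skipped because the naive reference has no value for them, which is one possible convention rather than something the paper specifies.

```python
import statistics


def relative_errors(actual, forecast, lag=1):
    """r_t = e_t / (y_t - f*_t), with the naive reference f*_t = y_{t-lag}."""
    r = []
    for t in range(lag, len(actual)):
        reference_error = actual[t] - actual[t - lag]
        if reference_error == 0:
            # Two identical consecutive values make the naive reference exact.
            raise ZeroDivisionError(f"reference error is zero at t={t}")
        r.append((actual[t] - forecast[t]) / reference_error)
    return r


def mrae(actual, forecast, lag=1):
    """Mean relative absolute error, equation (17)."""
    return statistics.mean([abs(r) for r in relative_errors(actual, forecast, lag)])


def mdrae(actual, forecast, lag=1):
    """Median relative absolute error, equation (18)."""
    return statistics.median([abs(r) for r in relative_errors(actual, forecast, lag)])


def gmrae(actual, forecast, lag=1):
    """Geometric mean relative absolute error (requires non-zero errors)."""
    return statistics.geometric_mean([abs(r) for r in relative_errors(actual, forecast, lag)])


# Hypothetical series and forecasts.
actual = [10.0, 12.0, 15.0, 14.0]
forecast = [10.0, 11.0, 14.0, 15.0]

print(relative_errors(actual, forecast))  # 0.5, about 0.333, 1.0
print(mrae(actual, forecast))             # about 0.611
print(mdrae(actual, forecast))            # 0.5
print(gmrae(actual, forecast))            # about 0.550
```

Calling `relative_errors([5.0, 5.0, 5.0], [5.0, 4.0, 6.0])` raises `ZeroDivisionError`, which is the identical-values case described in the last shortcoming.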
Scaled Error: The scaled error $q_t$, which is the basis for the measures of this group, is given by equation (19).
This group contains the Mean Absolute Scaled Error, MASE, proposed in [9]. It is calculated according to formula (20).
Another measure of this group is the Root Mean Square Scaled Error, RMSSE, calculated by formula (22) [10].
These measures are symmetric and resistant to outliers. However, we can point to two drawbacks.
• If the real values on the forecast horizon are all equal to each other, division by zero occurs.
• In addition, a weak bias of the estimates can be observed in experiments analogous to those in [15].
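A short Python sketch of the scaled measures follows, assuming the scale in (19) is computed from the same actual values that are being forecast, as the equation is written here. The function names and data are hypothetical.

```python
import math
import statistics


def scaled_errors(actual, forecast):
    """q_t from equation (19): e_t divided by the mean absolute one-step change of y."""
    n = len(actual)
    scale = sum(abs(actual[i] - actual[i - 1]) for i in range(1, n)) / (n - 1)
    if scale == 0:
        raise ZeroDivisionError("all actual values are identical")
    return [(y - f) / scale for y, f in zip(actual, forecast)]


def mase(actual, forecast):
    """Mean absolute scaled error, equation (20)."""
    return statistics.mean([abs(q) for q in scaled_errors(actual, forecast)])


def rmsse(actual, forecast):
    """Root mean square scaled error, equation (22)."""
    return math.sqrt(statistics.mean([q * q for q in scaled_errors(actual, forecast)]))


# Hypothetical data: the mean absolute one-step change of `actual` is 2.0.
actual = [10.0, 12.0, 15.0, 14.0]
forecast = [11.0, 12.0, 13.0, 14.0]

print(scaled_errors(actual, forecast))  # [-0.5, 0.0, 1.0, 0.0]
print(mase(actual, forecast))           # 0.375
print(rmsse(actual, forecast))          # about 0.559
```

Because the errors are divided by a quantity in the units of the series, the results are unit-free and can be compared across series with different scales.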
Relative Measures: This group consists of measures calculated as the ratio of one of the error measures mentioned above, obtained for the estimated forecasting model, to the same measure obtained for a reference model. Relative Mean Absolute Error, RelMAE (written RMAE in the formula), is calculated by formula (23), where MAE and MAE* are the mean absolute errors for the analyzed forecasting model and the reference model respectively, calculated using formula (2).
Relative Root Mean Square Error, RelRMSE (written RRMSE in the formula), is calculated similarly to (23), except that the right side is calculated by (5), which gives formula (24).
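The ratios in (23) and (24) can be sketched in Python as follows. The reference forecasts are passed in by the caller; using the previous actual value as the reference in the example is an assumption made for illustration, as are the function names and data.

```python
import math
import statistics


def mae(actual, forecast):
    """Mean absolute error, formula (2)."""
    return statistics.mean([abs(y - f) for y, f in zip(actual, forecast)])


def rmse(actual, forecast):
    """Root mean square error, formula (5)."""
    return math.sqrt(statistics.mean([(y - f) ** 2 for y, f in zip(actual, forecast)]))


def relative_measure(measure, actual, forecast, reference):
    """Ratio of a measure for the analyzed model to the same measure for the reference."""
    reference_value = measure(actual, reference)
    if reference_value == 0:
        raise ZeroDivisionError("the reference forecast error is zero")
    return measure(actual, forecast) / reference_value


# Hypothetical data; `reference` holds the previous actual values (10, 12, 15).
actual = [12.0, 15.0, 14.0]
forecast = [11.0, 14.0, 15.0]
reference = [10.0, 12.0, 15.0]

print(relative_measure(mae, actual, forecast, reference))   # 0.5
print(relative_measure(rmse, actual, forecast, reference))  # about 0.463
```

A value below 1 means the analyzed model has a smaller error than the reference on this series.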
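$$ \mathrm{LMR} = \log\left(\frac{\mathrm{RMSE}}{\mathrm{RMSE}^{*}}\right) \qquad (25) $$

$$ \mathrm{PB}(\mathrm{MAE}) = 100\% \cdot \mathrm{mean}\left(I\{\mathrm{MAE} < \mathrm{MAE}^{*}\}\right) \qquad (26) $$

$$ I\{\mathrm{MAE} < \mathrm{MAE}^{*}\} = \begin{cases} 1, & \text{if } \mathrm{MAE} < \mathrm{MAE}^{*}; \\ 0, & \text{otherwise.} \end{cases} \qquad (27) $$

$$ \mathrm{nRMSE} = \frac{1}{\bar{y}} \sqrt{\underset{i=1,n}{\mathrm{mean}}\,(e_i^2)} \qquad (28) $$

$$ \mathrm{inRSE} = \frac{1}{\sum_{i=1}^{n} y_i} \sqrt{\underset{i=1,n}{\mathrm{mean}}\,(e_i^2)} \qquad (29) $$

$$ \mathrm{NRMSE} = \sqrt{\frac{\sum_{i=1}^{n} e_i^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}}, \qquad \bar{y} = \frac{1}{n}\sum_{k=1}^{n} y_k \qquad (30) $$

$$ \mathrm{Std\_AE} = \sqrt{\frac{\sum_{i=1}^{n} (|e_i| - \mathrm{MAE})^2}{n-1}} \qquad (31) $$

$$ \mathrm{Std\_APE} = \sqrt{\frac{\sum_{i=1}^{n} (|p_i| - \mathrm{MAPE})^2}{n-1}} \qquad (32) $$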
In some situations it is reasonable to calculate the logarithm of the ratio (24). In this case, the measure is called the Log Mean Squared Error Ratio, LMR, formula (25).
Syntetos et al. proposed a more complex assessment of the relative geometric standard deviation, the Relative Geometric Root Mean Square Error, RGRMSE [17].
The next group of measures counts the number of cases where the error of the analyzed model is smaller than that of the reference model. For instance, PB(MAE), Percentage Better (MAE), is calculated by formula (26), where I{•} is the operator that yields the value zero or one in accordance with expression (27).
By analogy with PB(MAE), Percentage Better (MSE) can be defined.
The disadvantages of these measures are the following.
• A division by zero error occurs if the reference forecast error is equal to zero.
• These criteria determine the number of cases in which the analyzed forecasting model is superior to the base model, but do not evaluate the size of the difference.
Other Error Measures: This group includes measures proposed in various studies to avoid the shortcomings of existing and common measures.
To avoid scale dependency, the Normalized Root Mean Square Error (nRMSE) has been proposed, calculated by formula (28), where $\bar{y}$ is the normalization factor, which is usually equal to either the maximum measured value on the forecast horizon or the difference between the maximum and minimum values. The normalization factor can be calculated over the entire interval or time horizon, or over a defined short interval of observation [18]. However, this estimate is affected by outliers if an outlier has a value much bigger than the maximal "normal" value. To reduce the impact of outliers, the Integral Normalized Mean Square Error, inRSE [19], has been proposed, calculated by formula (29).
Some studies calculate NRMSE as in formula (30) [16], where $\bar{y}$ is the mean of the measured values.
Other measures, Std_AE and Std_APE [20, 21], are calculated by formulas (31) and (32) respectively.
As a drawback, a division by zero error occurs if the normalization factor is equal to zero.
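The normalization in (28) and the spread measure in (31) can be sketched as below. This is a hedged Python illustration: the `normalization` argument and its two options ("max" and "range") are names invented here to mirror the two normalization factors mentioned in the text, and the data are hypothetical.

```python
import math
import statistics


def nrmse(actual, forecast, normalization="max"):
    """Equation (28): RMSE divided by a normalization factor."""
    rmse = math.sqrt(statistics.mean([(y - f) ** 2 for y, f in zip(actual, forecast)]))
    if normalization == "max":
        factor = max(actual)
    elif normalization == "range":
        factor = max(actual) - min(actual)
    else:
        raise ValueError("normalization must be 'max' or 'range'")
    if factor == 0:
        raise ZeroDivisionError("normalization factor is zero")
    return rmse / factor


def std_ae(actual, forecast):
    """Equation (31): sample standard deviation of the absolute errors."""
    return statistics.stdev([abs(y - f) for y, f in zip(actual, forecast)])


# Hypothetical data.
actual = [10.0, 12.0, 15.0, 14.0]
forecast = [11.0, 12.0, 13.0, 14.0]

print(nrmse(actual, forecast, "max"))    # about 0.0745
print(nrmse(actual, forecast, "range"))  # about 0.2236
print(std_ae(actual, forecast))          # about 0.9574
```

The two normalization choices give noticeably different values for the same forecast, so the factor in use should be reported alongside the result.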
Recommendations on How to Choose Error Measures: One of the most difficult issues is choosing the most appropriate measures from these groups. Because each error measure has disadvantages that can lead to an inaccurate evaluation of the forecasting results, it is impossible to choose only one measure [5].
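To illustrate why a single measure is not enough, the sketch below ranks two hypothetical models by MAE and by RMSE. The model names, the data and the helper `best_model_per_measure` are assumptions introduced only for this example.

```python
import math
import statistics


def mae(actual, forecast):
    """Mean absolute error, formula (2)."""
    return statistics.mean([abs(y - f) for y, f in zip(actual, forecast)])


def rmse(actual, forecast):
    """Root mean square error, formula (5)."""
    return math.sqrt(statistics.mean([(y - f) ** 2 for y, f in zip(actual, forecast)]))


MEASURES = {"MAE": mae, "RMSE": rmse}


def best_model_per_measure(actual, forecasts):
    """For each measure, return the name of the model with the smallest error."""
    return {
        name: min(forecasts, key=lambda model: measure(actual, forecasts[model]))
        for name, measure in MEASURES.items()
    }


# Hypothetical case: m1 is always off by 2, m2 is exact except for one miss of 5.
actual = [10.0, 10.0, 10.0, 10.0]
forecasts = {
    "m1": [12.0, 12.0, 12.0, 12.0],
    "m2": [10.0, 10.0, 10.0, 15.0],
}

print(best_model_per_measure(actual, forecasts))  # {'MAE': 'm2', 'RMSE': 'm1'}
```

Here MAE is 2.0 for m1 and 1.25 for m2, while RMSE is 2.0 for m1 and 2.5 for m2, so the two measures disagree about which model is better.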
We provide the following guidelines for choosing the error measures.
• If forecast performance is evaluated for time series with the same scale and data preprocessing procedures have been performed (data cleaning, anomaly detection), it is reasonable to choose MAE, MdAE, RMSE. In the case of different scales, these error measures are not applicable. The following recommendations are provided for multi-scale cases.
• Although percentage errors are commonly used in real-world forecast tasks, they are not recommended because of their non-symmetry.
• If the range of the values lies in the positive half-plane and there are no outliers in the data, it is advisable to use symmetric error measures.
• If the data are "dirty", i.e. contain outliers, it is advisable to apply the scaled measures such as MASE, inRSE. In this case (i) the horizon should be large enough, (ii) there should be no identical values, (iii) the normalization factor should not be equal to zero.
• If the predicted data have seasonal or cyclical patterns, it is advisable to use the normalized error measures, wherein the normalization factors can be calculated within the interval equal to the cycle or season.
• If there are no results of prior analysis and no a priori information about the quality of the data, it is reasonable to use the defined set of error measures. After calculation, the results are analyzed with respect to division by zero errors and contradiction cases:
  - for the same time series, model $m_1$ is better than $m_2$ according to one error measure, but worse according to another;
  - for different time series, model $m_1$ is better in most cases, but worse in a few cases.
CONCLUSION
The review contains the error measures for time series forecasting models. All these measures are grouped into seven groups: absolute forecasting errors, percentage forecasting errors, symmetric forecasting errors, measures based on relative errors, scaled errors, relative measures and other (modified) measures. For each error measure the way of calculation is presented. Shortcomings are also defined for each group.
ACKNOWLEDGMENTS
The authors would like to thank RFBR for support of the research (Grants #12-07-31017, 12-01-00684).
REFERENCES
1. Tyukov, A., A. Brebels, M. Shcherbakov and V. Kamaev, 2012. A concept of web-based energy data quality assurance and control system. In the Proceedings of the 14th International Conference on Information Integration and Web-based Applications & Services, pp: 267-271.
2. Kamaev, V.A., M.V. Shcherbakov, D.P. Panchenko, N.L. Shcherbakova and A. Brebels, 2012. Using Connectionist Systems for Electric Energy Consumption Forecasting in Shopping Centers. Automation and Remote Control, 73(6): 1075-1084.
3. Owoeye, D., M. Shcherbakov and V. Kamaev, 2013. A photovoltaic output backcast and forecast method based on cloud cover and historical data. In the Proceedings of the Sixth IASTED Asian Conference on Power and Energy Systems (AsiaPES 2013), pp: 28-31.
4. Armstrong, J.S. and F. Collopy, 1992. Error measures for generalizing about forecasting methods: Empirical comparisons. International Journal of Forecasting, 8(1): 69-80.
5. Mahmoud, E., 1984. Accuracy in forecasting: A survey. Journal of Forecasting, 3(2): 139-159.
6. Yokuma, J.T. and J.S. Armstrong, 1995. Beyond accuracy: Comparison of criteria used to select forecasting methods. International Journal of Forecasting, 11(4): 591-597.
7. Armstrong, J.S., 2001. Evaluating forecasting methods. In Principles of forecasting: a handbook for researchers and practitioners. Norwell, MA: Kluwer Academic Publishers, pp: 443-512.
8. Gooijer, J.G.D. and R.J. Hyndman, 2006. 25 years of time series forecasting. International Journal of Forecasting, 22(3): 443-473.
9. Hyndman, R.J. and A.B. Koehler, 2006. Another look at measures of forecast accuracy. International Journal of Forecasting, 22(4): 679-688.
10. Theodosiou, M., 2011. Forecasting monthly and quarterly time series using STL decomposition. International Journal of Forecasting, 27(4): 1178-1195.
11. Shcherbakov, M.V. and A. Brebels, 2011. Outliers and anomalies detection based on neural networks forecast procedure. In the Proceedings of the 31st Annual International Symposium on Forecasting, ISF-2011, pp: 21-22.
12. Goodwin, P. and R. Lawton, 1999. On the asymmetry of the symmetric MAPE. International Journal of Forecasting, 15(4): 405-408.
13. Koehler, A.B., 2001. The asymmetry of the sAPE measure and other comments on the M3-competition. International Journal of Forecasting, 17: 570-574.
14. Makridakis, S., 1993. Accuracy measures: Theoretical and practical concerns. International Journal of Forecasting, 9: 527-529.
15. Kolassa, S. and R. Martin, 2011. Percentage errors can ruin your day (and rolling the dice shows how). Foresight, (Fall): 21-27.
16. Assessing Forecast Accuracy Measures. Date View 01.08.2013 http://www.stat.iastate.edu/preprint/articles/2004-10.pdf
17. Syntetos, A.A. and J.E. Boylan, 2005. The accuracy of intermittent demand estimates. International Journal of Forecasting, 21(2): 303-314.
18. Tyukov, A., M. Shcherbakov and A. Brebels, 2011. Automatic two way synchronization between server and multiple clients for HVAC system. In the Proceedings of The 13th International Conference on Information Integration and Web-based Applications & Services, pp: 467-470.
19. Brebels, A., M.V. Shcherbakov and V.A. Kamaev, 2010. Mathematical and statistical framework for comparison of neural network models with other algorithms for prediction of Energy consumption in shopping centres. In the Proceedings of the 37 Int. Conf. Information Technology in Science Education Telecommunication and Business, suppl. to Journal Open Education, pp: 96-97.
20. Casella, G. and R. Berger, 1990. Statistical inference. 2nd ed. Duxbury Press, pp: 686.
21. Kusiak, A., M. Li and Z. Zhang, 2010. A data-driven approach for steam load prediction in buildings. Applied Energy, 87(3): 925-933.
... It is important not to examine an individual error measure during the assessment of a model. If all time series are on the same scale, the procedures of preprocessing are accomplished, and the aim was to assess the forecasting, then the MAE must be chosen because it is easier to be explained (see Shcherbakov et al., 2013) [56]. Chai and Draxler (2014) [54] suggested the RMSE over the MAE when the error distribution is expected to be Gaussian. ...
... It is important not to examine an individual error measure during the assessment of a model. If all time series are on the same scale, the procedures of preprocessing are accomplished, and the aim was to assess the forecasting, then the MAE must be chosen because it is easier to be explained (see Shcherbakov et al., 2013) [56]. Chai and Draxler (2014) [54] suggested the RMSE over the MAE when the error distribution is expected to be Gaussian. ...
... In case the data contain outliers, the application of escalating measures is recommended, e.g., the mean absolute scale error (MASE). In this case, the time horizon must be large enough but there must not be repeated values and the normalized factor must be equal to zero (Shcherbakov et al., 2013) [56]. ...
Article
Full-text available
Time series forecasting provides a vital basis for the control and management of various systems. The time series data in the real world are usually strongly nonstationary and nonlinear, which increases the difficulty of reliable forecasting. To fully utilize the learning capability of machine learning in time series forecasting, an adaptive broad echo state network (ABESN) is proposed in this paper. Firstly, the broad learning system (BLS) is used as a framework, and the reservoir pools in the echo state network (ESN) are introduced to form the broad echo state network (BESN). Secondly, for the problem of information redundancy in the reservoir structure in BESN, an adaptive optimization algorithm for the BESN structure based on the pruning algorithm is proposed. Thirdly, an adaptive optimization algorithm of hyperparameters based on the nonstationary test index is proposed. In brief, the structure and hyperparameter optimization algorithms are studied to form the ABESN based on the proposed BESN model in this paper. The ABESN is applied to the data forecasting of air humidity and electric load. The experiments show that the proposed ABESN has a better learning ability for nonstationary time series data and can achieve higher forecasting accuracy.
... The most commonly used performance metrics which are implemented in regression analysis cases in machine learning studies are Mean Absolute Error (MAE), Mean Square Error (MSE), and Root Mean Square Error (RMSE) [36]. In fact, each error measurement has different disadvantages that can lead to inaccurate evaluation of forecasting results, which makes it not recommended to only use one measurement [37]. This research aimed to forecast indoor temperature and humidity in the future, which made MAE and RMSE an ideal choice for collecting error information in the model. ...
Article
Full-text available
Solar Dryer Dome (SDD), which is an agriculture facility for preserving and drying agriculture products, needs an intelligent system for predicting future indoor climate conditions, including temperature and humidity. An accurate indoor climate prediction can help to control its indoor climate conditions by efficiently scheduling its actuators, which include fans, heaters, and dehumidifiers that consume a lot of electricity. This research implemented deep learning architectures to predict future indoor climate conditions such as indoor temperature and indoor humidity using a dataset generated from the SDD facility in Sumedang, Indonesia. This research compared adapted sequenced baseline architectures with sequence-to-sequence (seq2seq) or encoder-decoder architectures in predicting sequence time series data as the input and output of both architecture models which are built based on Recurrent Neural Network (RNN) layers such as Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM). The result shows that the adapted sequence baseline model using GRU is the best model, whereas seq2seq models yield bigger Mean Absolute Error (MAE) values by almost ten times. Overall, all the proposed deep learning models are categorized as extremely strong with R2 ≥ 0.99.
... It is acknowledged that a lot of effort has been made to obtain the right parameter values, i.e., the noise is removed completely without the loss of data information [77,78]. However, these optimization approaches have one issue, which is that they use the Frobenius norm in their formula, so they fail to work with outlier-containing data, a downside of the Frobenius metric [79]. On the other hand, Eigenvector Clipping distinguishes itself from others [74] as it does not require any training parameters, making its outcome robust and more reliable. ...
Article
Full-text available
We analyze the correlation between different assets in the cryptocurrency market throughout different phases, specifically bearish and bullish periods. Taking advantage of a fine-grained dataset comprising 34 historical cryptocurrency price time series collected tick-by-tick on the HitBTC exchange, we observe the changes in interactions among these cryptocurrencies from two aspects: time and level of granularity. Moreover, the investment decisions of investors during turbulent times caused by the COVID-19 pandemic are assessed by looking at the cryptocurrency community structure using various community detection algorithms. We found that finer-grain time series describes clearer the correlations between cryptocurrencies. Notably, a noise and trend removal scheme is applied to the original correlations thanks to the theory of random matrices and the concept of Market Component, which has never been considered in existing studies in quantitative finance. To this end, we recognized that investment decisions of cryptocurrency traders vary between bearish and bullish markets. The results of our work can help scholars, especially investors, better understand the operation of the cryptocurrency market, thereby building up an appropriate investment strategy suitable to the prevailing certain economic situation.
Article
The accuracy of calculation methods for determining the evapotranspiration (ET) of corn for grain under drip irrigation in the steppe of Ukraine was established. A comprehensive assessment of calculation methods for soil optimal water regime formation during different growth phases of maize plants was carried out. The accuracy of the estimated value of evapotranspiration was determined by the mean absolute percentage error (MAPE). It has been proven that the use of calculation methods without taking into account the climatic conditions of Southern Ukraine leads to a significant error in determining the actual evapotranspiration. By the Penman-Monteith method, the MAPE of 16.3-26.9% corresponds to the good and satisfactory accuracy of the chosen calculation model. Using the methods of A.M. and S.M. Alpatyev as well as D.A. Stoyko the MAPE increased to 22.2-39.7% and 20.8-29.1%, respectively, which proved their satisfactory accuracy. The calculation method of M.M. Ivanov ensured the MAPE of 48,7-76,8%; that is unsatisfactory calculation accuracy. Adapted crop coefficients Kc for the conditions of the South of Ukraine increased the accuracy of calculating ET by the Penman-Monteith method by an average of 2,2 times, D.A. Shtoyko and A.M. and S.M. Alpatiev by 1,9 and 2,2 times, and M.M. Ivanov by 4,4 times. An analysis of the MAPE using various calculation methods for determining the evapotranspiration of corn for grain under drip irrigation showed that the Penman-Monteith method provides the smallest error (MAPE = 9.1%), which corresponds to high prediction accuracy. In a wet year, the accuracy of ET determination decreases by all methods, which indicates an increase in the MAPE: by Penman-Monteith and D.A. Shtoyko - up to 11.9% and 18.7%, respectively, and the determination accuracy decreases to category “good”. When calculating using the methods of A.M. and S.M. Alpatiev and M.M. 
Ivanov the MAPE increased to 23,3% and 21,5%, respectively, and the accuracy of ET determination was satisfactory.
Chapter
Most of the current data sources generate large amounts of data over time. Renewable energy generation is one example of such data sources. Machine learning is often applied to forecast time series. Since data flows are usually large, trends in data may change and learned patterns might not be optimal in the most recent data. In this paper, we analyse wind energy generation data extracted from the Sistema de Información del Operador del Sistema (ESIOS) of the Spanish power grid. We perform a study to evaluate detecting concept drifts to retrain models and thus improve the quality of forecasting. To this end, we compare the performance of a linear regression model when it is retrained randomly and when a concept drift is detected, respectively. Our experiments show that a concept drift approach improves forecasting between a 7.88% and a 33.97% depending on the concept drift technique applied.KeywordsMachine learningConcept drift detectionData streamingTime seriesWind energy forecasting
Chapter
Nowadays, swarm intelligence shows a high accuracy while solving difficult problems, including image processing problem. Image Edge detection is a complex optimization problem due to the high-resolution images involving large matrix of pixels. The current work describes several sensitive to the environment models involving swarm intelligence. The agents’ sensitivity is used in order to guide the swarm to obtain the best solution. Both theoretical general guidance and a practical example for a particular swarm are included. The quality of results is measured using several known measures.KeywordsSwarm intelligenceImage processingImage Edge Detection
Article
The beetle antennae search (BAS) algorithm is a single-solution metaheuristic optimizer that imitates the foraging behavior of beetles. Due to its easy implementation and fast convergence, it has been applied in various engineering fields. To enhance the local search ability and preclude the blindness of position updates performed in BAS, an improved beetle antennae search algorithm with Lévy flight (IBAS) is proposed in this paper. Additionally, to further enhance the algorithm performance, the features of inertia, normalization, and elitist selection are adopted into this proposed hybrid algorithm. In a comparison with other algorithms based on 15 well-known benchmark functions and 4 classic engineering problems, IBAS maintains good performance under the premise of a trade-off between efficacy and efficiency. The detailed results suggest that the improvements adapted in IBAS alleviate the shortcomings of BAS. In addition, a real-world engineering problem, laser energy prediction in micro-laser assisted turning, is used to further validate the potential of the proposed algorithm for industrial applications. The results indicate the feasibility of the new solution based on the IBAS algorithm. Two further applications of IBAS in the reconstruction of the temperature field and online laser energy detection are also discussed.
Article
Prediction and the stock market go hand in hand. Due to the inherent limitations of traditional forecasting methods and the pursuit to uncover the hidden patterns in stock market data, stock market prediction using data mining techniques has caught the fancy of academicians, researchers, and investors. Based on a systematic review of more than 143 research studies spanning 25 years, the present paper brings to light the major issues concerning forecasting of stock markets based on data mining techniques, such as usage of data mining techniques in the stock market, input data types, single versus hybrid techniques, instruments and stock markets researched, types of software and algorithms used, measures of forecast accuracy, and performance of various data mining techniques. Emerging patterns related to various dimensions have been critically analyzed by highlighting the existing limitations and suggesting future research paradigms. This analysis can be useful for academicians, researchers and investors looking for futuristic directions in a given research domain.
Article
Recently, the challenges in modeling complex dynamical systems, and the advancement in machine learning methodologies have indicated a new promising direction for damage assessment in civil and mechanical systems. Powerful and efficient data-driven approaches have been increasingly employed in Structural Health Monitoring (SHM) to extract Damage Sensitive Features (DSFs) from the monitored dynamic response of structures. In this study, a New Generalized Auto-Encoder (NGAE), integrated with a statistical-pattern-recognition-based approach that uses the power cepstral coefficients of structural acceleration responses as DSFs, is proposed for structural damage assessment. This NGAE is well-generalized in the components of cepstral coefficients that represent the structural properties of the entire system thanks to a newly defined encoder-decoder mapping, which largely reduces rid of the data variance attributed to different types of excitations and measurement noise. The cepstral coefficients, by virtue of a compact representation of the structural properties, can greatly simplify the structure of the NGAE, and therefore, significantly accelerate training and inference speeds with very few computational requirements. Two specific evaluation metrics that relates to the autoencoder signal reconstruction error are defined and used to assess the presence of damage. The proposed method has been validated through numerical simulations and experimental data, and shows better performance compared to a Traditional Auto-Encoder (TAE) and the Principal Component Analysis (PCA).
Article
Reference evapotranspiration (ETo) as a component in the hydrological cycle is calculated using many methods. In this study, the capability of four data‐driven methods including artificial neural network (ANN), adaptive neuro‐fuzzy inference system (ANFIS), support vector machine (SVM), and M5 tree model has been evaluated for daily ETo estimation in the south of the Caspian Sea. Different combinations of climatic data, solar radiation (Rs), mean air temperature (Tmean), mean relative humidity (RHmean), and wind speed (U) during 1991–2020 were used as input variables. The data were divided into training and test data. The values generated from the methods were compared with those of the FAO‐56 Penman‐Monteith as a standard method. The results indicated that the accuracy of ANFIS was increased for estimating ETo, especially in the validation phase, when all climatic variables were used as inputs in the synoptic stations. Overall, based on the evaluation of the performances, it was found that the ANFIS with Tmean, RHmean, Rs, and U variables had the best accuracy, while the ANN, SVM, and M5 with only one input of U had the worst performance. The ANFIS with Tmean and Rs was recommended for modelling ETo if there are fewer climatic variables in these regions. Location of the three synoptic stations in the southern parts of the Caspian Sea.
Conference Paper
Silicon photonics has become in the past years an important technology adopted by a growing number of industrial players to develop their next generation optical transceivers. However, most of the technology platforms established in CMOS fabrication lines are kept captive or open to only a restricted number of customers. In order to make silicon photonics accessible to a large number of players, several initiatives exist around the world to develop open platforms. In this paper we will present imec's silicon photonics active platform accessible through multi-project wafer runs.
Article
We report high-performance germanium waveguide photodetectors (WPDs) without doping in germanium or direct metal contacts on germanium, grown on and contacted through a silicon p-i-n diode structure. Wafer-scale measurements demonstrate high responsivities larger than 1.0 A/W across the C-band and low dark current of ~3 nA at -1 V and ~8 nA at -2 V. Owing to its small dimensions, the Ge WPD exhibits a high optoelectrical 3-dB bandwidth of 20 and 27 GHz at low-bias voltages of -1 and -2 V, respectively, which are sufficient for operation at 28 Gb/s. The reduced processing complexity at the tungsten contact plug module combined with the high responsivity makes these Ge WPD devices particularly attractive for emerging low-cost CMOS-Si photonics transceivers.
Conference Paper
Because input data for existing PV forecast algorithms are hard to come by, there is a need for a method that relies on data that is relatively easy to obtain and uses a much simpler algorithm while still producing acceptable results. This paper proposes a method for photovoltaic output backcast and forecast using cloud cover and historical data. The method is about five times more efficient than the benchmark model.
Article
A solution is presented for short-term electrical energy forecasting in shopping centers located in the Netherlands and Belgium. A forecasting method is proposed on the basis of connectionist systems. A general description of the forecasting method is provided, and its specific features with respect to the forecasting problem are studied. Several connectionist models are generalized, stated and applied, notably a moving average model, a linear regression model, and a neural network model. In addition, changes in forecasting quality are demonstrated depending on different input variables. The results of using these connectionist models are discussed, and conclusions regarding specific features of every model are outlined.
Article
We present a silicon-on-insulator (SOI) polarization-insensitive fiber-to-fiber coupler fabricated on a 200-mm wafer with the standard complementary metal-oxide-semiconductor technology. The coupling losses from a lensed fiber into a 500-nm-wide SOI waveguide were measured to be less than 1 dB in the 1520- to 1600-nm spectral range and below 3 dB between 1300 and 1600 nm.
Conference Paper
Precise dimension control technology for the fabrication of silicon photonics devices was established. The dimension control technology consists of device fabrication using 40-nm-node CMOS technology and in-line process monitoring by an optical wafer-level probing system. As a result of process optimization in waveguide formation, superior dimension control in 440-nm-wide / 220-nm-thick waveguides was achieved, in which a waveguide width deviation of 1.0 nm and a height deviation of 0.3 nm were respectively obtained for a single 300 mm wafer. In the characterization of 5th-order coupled resonator optical waveguides (CROWs), a remarkably small deviation of resonant frequency of 0.7 nm in a single wafer was confirmed, a value that agreed with the theoretical estimation from the fabrication error. As for the optical wafer-level probing system, a quite small deviation of less than 0.2 dB in I/O coupling loss between optical devices under test and the fiber probe was confirmed. It was successfully shown that the combination of the precise process control and the in-line optical process control monitor is sufficient for reproducible device fabrication for wide-bandwidth optical interconnection.
Article
This paper is a re-examination of the benefits and limitations of decomposition and combination techniques in the area of forecasting, and also a contribution to the field, offering a new forecasting method. The new method is based on the disaggregation of time series components through the STL decomposition procedure, the extrapolation of linear combinations of the disaggregated sub-series, and the reaggregation of the extrapolations to obtain estimates for the global series. Applying the forecasting method to data from the NN3 and M1 Competition series, the results suggest that it can perform well relative to four other standard statistical techniques from the literature, namely the ARIMA, Theta, Holt-Winters' and Holt's Damped Trend methods. The relative advantages of the new method are then investigated further relative to a simple combination of the four statistical methods and a Classical Decomposition forecasting method. The strength of the method lies in its ability to predict long lead times with relatively high levels of accuracy, and to perform consistently well for a wide range of time series, irrespective of the characteristics, underlying structure and level of noise of the data.