An Advanced Intelligent Method for Wind Power Prediction
E. E. Elattar1, I. Taha2
1Department of Electrical Engineering, Menoﬁa University, Shebin El-Kom, Egypt.
2Department of Electrical Engineering, Tanta University, Tanta, Egypt
SUMMARY
Wind power prediction is one of the most critical aspects of wind power integration and operation. This paper proposes a new approach for wind power prediction. The proposed method integrates kernel principal component analysis (KPCA) with the locally weighted group method of data handling (LWGMDH), which is derived by combining the GMDH with the local regression method and weighted least squares (WLS) regression. In the proposed model, KPCA is used to extract features of the inputs and obtain kernel principal components for constructing the phase space of the multivariate time series of the inputs. LWGMDH is then employed to solve the wind power prediction problem. The coefficient parameters are calculated using WLS regression, where each point in the neighborhood is weighted according to its distance from the current prediction point. In addition, the weighted distance algorithm is presented to optimize the weighting function bandwidth. The proposed model is validated using a real-world dataset.
Key words: Wind power prediction, group method of data handling, locally weighted group method of data handling, weighted distance, kernel principal component analysis, state space reconstruction.
1. INTRODUCTION
Wind power is the fastest-growing power generation sector in the world, and global installed wind capacity is expected to keep growing. The output power of wind farms is hard to control because of the uncertain and variable nature of the wind resource. Hence, integrating a large share of wind power into an electricity system poses important challenges to the stability of the power grid and the reliability of electricity supply [1]. Wind power prediction is one of the most critical aspects of wind power integration and operation. It supports the scheduled operation of wind turbines and conventional generators, and thus helps achieve low spinning reserve and optimal operating cost [2].
Short term prediction generally covers horizons from a few minutes to a few hours or days. It is required in generation commitment and market operation. Short term power prediction is a very important field of research for the energy sector, where system operators must handle a substantial amount of fluctuating power from the increasing installed wind power capacity. Its time scales are in the order of some days (for the forecast horizon) and from minutes to hours (for the time step) [3].
Correspondence to: Dr. E. E. Elattar, Department of Electrical Engineering, Faculty of Engineering, Menofia University, Shebin El-Kom 32511, Egypt. Email: dr.elattar10@yahoo.com
Various methods have been identified for short term wind power prediction. They can be categorized into physical methods, statistical methods, methods based upon artificial intelligence (AI) and hybrid approaches [4]. The physical method needs many physical considerations to give good prediction precision and is usually used for long term prediction [5], whereas the statistical method performs well in short term prediction [6].
The traditional statistical methods are time-series-based methods, such as the persistence method [7] and the auto regressive integrated moving average (ARIMA) method [8, 9]. These methods are based on a linear regression model and cannot always represent the nonlinear characteristics of the inputs. The AI methods describe the relation between input and output data from past time series by a non-statistical approach such as artificial neural networks (ANN) [10, 11], fuzzy logic [7] and neuro-fuzzy systems [12]. Moreover, hybrid methods [13, 14] have also been applied successfully to short-term wind power prediction.
Support vector regression (SVR) [15] has been ap-
plied to wind speed prediction with success [16]. SVR
has been shown to be very resistant to the overﬁtting
problem and gives a high generalization performance
in prediction problems. SVR has been evaluated on
several time series datasets [17].
The Group Method of Data Handling (GMDH) is a self-organizing method that was first developed by Ivakhnenko [18]. The main idea of GMDH is to
build an analytical function in a feedforward network
based on a quadratic node transfer function whose
coeﬃcients are obtained using a regression technique
[19]. GMDH has been applied to solve many predic-
tion problems with success [20, 21].
All the above techniques are known as global time series predictors, in which a predictor is trained using all available data but gives a prediction using only a current data window. Global predictors suffer from some drawbacks, which are discussed in our previous work [22]. We proposed the local SVR method to overcome these drawbacks; more details of the local predictor can be found in [22].
Phase space reconstruction is an important step in
local prediction methods. The traditional time series
reconstruction techniques usually use the coordinate
delay (CD) method [23] to calculate the embedding
dimension and the time delay constant of the time
series [24].
The traditional time series reconstruction techniques have a serious problem: there may be correlation between different features in the reconstructed phase space, which degrades the quality of the phase space reconstruction [25]. In recent years, kernel principal component analysis (KPCA), which is one type of nonlinear principal component analysis (PCA), has been used to process nonlinear time series [26]. KPCA is an unsupervised technique based on performing principal component analysis in the feature space of a kernel. The main idea of KPCA is first to map the original inputs into a high-dimensional feature space via a kernel map, which makes the data structure more linear, and then to calculate principal components in that high-dimensional feature space [25].
Moreover, our previous work on the local SVR predictor was extended to locally weighted support vector regression (LWSVR) by modifying the risk function of the SVR algorithm with locally weighted regression (LWR) while keeping the regularization term in its original form [27, 28]. LWSVR has been applied to the short term load forecasting (STLF) problem [27, 28]. Although the LWSVR method improves the accuracy of STLF, it suffers from some limitations. First, the most serious limitation of the SVR algorithm is the uncertainty in the choice of a kernel; the best choice of kernel for a given problem is still a research issue. Second, selecting the SVR parameters is difficult because structured methods for choosing them efficiently are lacking. Finally, the SVR algorithm is computationally slower than artificial neural networks.
To avoid the limitations of the existing methods, this paper proposes a new method based on an alternative machine learning technique, the GMDH.
The proposed method is derived by combining the
GMDH with the local regression method and weighted
least squares regression and employing the weighted
distance algorithm which uses the Mahalanobis dis-
tance to optimize the weighting function’s bandwidth.
In the proposed model, the phase space is reconstructed using the KPCA method, so that the problem of the traditional time series reconstruction techniques can be avoided. The proposed method has been evaluated using a real-world dataset.
The paper is organized as follows: Section 2 describes the time series reconstruction based on the KPCA method. Section 3 reviews the GMDH algorithm. The LWGMDH method is introduced in Section 4. Section 5 describes the weighted distance algorithm. Experimental results and comparisons with other methods are presented in Section 6. Finally, Section 7 concludes the work.
2. TIME SERIES RECONSTRUCTION
BASED ON KPCA
In recent years, KPCA has been used to process nonlinear time series and to overcome the problem of the CD method [26]. In KPCA, the computations are performed in a feature space that is nonlinearly related to the input space. This feature space is defined by an inner product kernel in accordance with Mercer's theorem [29]. However, unlike other forms of nonlinear PCA, the implementation of KPCA relies on linear algebra: the original inputs are mapped into a high-dimensional feature space via a kernel map, which makes the data structure more linear. In this paper, the commonly used Gaussian kernel is employed. A detailed introduction to basic KPCA can be found in [25, 28, 29].
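As an illustration of the KPCA step, the following minimal Python sketch computes kernel principal components with a Gaussian kernel. It is not the authors' implementation: the kernel form exp(-||x - z||^2 / (2 * width_sq)), the function names and the parameter `width_sq` are assumptions made here for illustration, since the exact kernel parametrization is not given in the text.

```python
import numpy as np


def gaussian_kernel_matrix(X, width_sq):
    """Pairwise Gaussian kernel; the 2 * width_sq scaling is an assumed convention."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * width_sq))


def kpca_fit_transform(X, n_components, width_sq):
    """Project the rows of X onto the leading kernel principal components."""
    n = X.shape[0]
    K = gaussian_kernel_matrix(X, width_sq)
    # Center the kernel matrix, i.e. remove the mean in feature space.
    ones = np.full((n, n), 1.0 / n)
    K_centered = K - ones @ K - K @ ones + ones @ K @ ones
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(K_centered)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # The projection of training point i on component k is sqrt(lambda_k) * v_k[i].
    return eigvecs * np.sqrt(np.maximum(eigvals, 0.0))
```

The returned columns are mutually uncorrelated, which is the property the reconstruction relies on to reduce correlation between features.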
3. Group Method of Data Handling (GMDH)
Suppose that the original dataset consists of $M$ columns of values of the system input variables, $X = (x_1(t), x_2(t), \ldots, x_M(t))$, $t = 1, 2, \ldots, N$, and a column of observed values of the output, where $N$ is the length of the dataset.
The connection between the input and output variables can be represented by an infinite Volterra-Kolmogorov-Gabor polynomial of the form:
$$ y = a_0 + \sum_{i=1}^{M} a_i x_i + \sum_{i=1}^{M}\sum_{j=1}^{M} a_{ij} x_i x_j + \sum_{i=1}^{M}\sum_{j=1}^{M}\sum_{k=1}^{M} a_{ijk} x_i x_j x_k + \ldots \tag{1} $$
where the sums run over the $M$ input variables, and $A(a_0, a_i, a_{ij}, a_{ijk}, \ldots)$ and $X(x_i, x_j, x_k, \ldots)$ are the vectors of the coefficients and input variables of the resulting multi-input single-output system, respectively.
In the GMDH algorithm, the Volterra-Kolmogorov-Gabor series is estimated by a cascade of second order polynomials using only pairs of variables [18], in the form of:
$$ y = a_0 + a_1 x_i + a_2 x_j + a_3 x_i x_j + a_4 x_i^2 + a_5 x_j^2 \tag{2} $$
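A single node of the form (2) can be sketched in Python as follows. The function names `quadratic_features` and `node_output` are hypothetical and introduced only for illustration; the column order follows the coefficient order $a_0, \ldots, a_5$ of Eq. (2).

```python
import numpy as np


def quadratic_features(xi, xj):
    """Columns [1, xi, xj, xi*xj, xi^2, xj^2], matching a0..a5 in Eq. (2)."""
    xi = np.asarray(xi, dtype=float)
    xj = np.asarray(xj, dtype=float)
    return np.column_stack([np.ones_like(xi), xi, xj, xi * xj, xi ** 2, xj ** 2])


def node_output(coeffs, xi, xj):
    """Evaluate the quadratic node for arrays of paired inputs."""
    return quadratic_features(xi, xj) @ np.asarray(coeffs, dtype=float)
```

For example, with coefficients (1, 2, 3, 4, 5, 6) and inputs $x_i = 1$, $x_j = 2$, the node output is 1 + 2 + 6 + 8 + 5 + 24 = 46.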
The corresponding network, shown in Fig. 1, can be constructed from simple polynomials. As the learning procedure evolves, branches that do not contribute significantly to the specific output can be deleted, which allows only the dominant causal relationships to evolve.
The GMDH network training procedure can be summarized as follows:
• The GMDH network begins with only input nodes. All combinations of different pairs of them are generated using the quadratic polynomial (2) and sent into the first layer of the network. The total number of polynomials (nodes) that can be constructed is equal to $M(M-1)/2$.
• Least squares regression is used to compute the optimal coefficients $A(a_0, a_1, a_2, a_3, a_4, a_5)$ of each polynomial (node) so that it best fits the training data, by minimizing:
$$ e = \frac{1}{N}\sum_{i=1}^{N} (\hat{y}_i - y_i)^2 \tag{3} $$
where $\hat{y}_i$ is the node output and $y_i$ is the observed output. The least squares solution of (3) is given by:
$$ A = (X^T X)^{-1} X^T Y \tag{4} $$
where $Y = [y_1, y_2, \ldots, y_N]^T$,
$$ X = \begin{bmatrix} 1 & x_{1P} & x_{1Q} & x_{1P}x_{1Q} & x_{1P}^2 & x_{1Q}^2 \\ 1 & x_{2P} & x_{2Q} & x_{2P}x_{2Q} & x_{2P}^2 & x_{2Q}^2 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ 1 & x_{NP} & x_{NQ} & x_{NP}x_{NQ} & x_{NP}^2 & x_{NQ}^2 \end{bmatrix} $$
and $P, Q \in \{1, 2, \ldots, M\}$.
• Compute the mean squared error of each node using (3).
• Sort the nodes in order of increasing error.
• Select the best nodes, which give the smallest error, from the candidate set to be used as inputs to the next layer, with all combinations of different pairs of them being sent into the second layer.
• This process is repeated until the current layer is found to be no better than the previous one. The best node of the previous layer is then used as the final solution.
Figure 1: GMDH network (inputs $x_1$ to $x_4$ feeding three layers of six nodes each, $y_1^{(l)}$ to $y_6^{(l)}$, $l = 1, 2, 3$).
More details about the GMDH and its diﬀerent
applications have been reported in [19, 30].
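The training procedure of this section can be sketched in Python as follows. This is a simplified illustration under stated assumptions rather than the reference algorithm: nodes are ranked by their training error from Eq. (3), `np.linalg.lstsq` replaces the explicit inverse of Eq. (4) for numerical stability, and the names `fit_layer`, `train_gmdh`, `n_keep` and `max_layers` are hypothetical.

```python
from itertools import combinations

import numpy as np


def quadratic_features(xi, xj):
    """Design matrix of Eq. (2): [1, xi, xj, xi*xj, xi^2, xj^2]."""
    return np.column_stack([np.ones_like(xi), xi, xj, xi * xj, xi ** 2, xj ** 2])


def fit_layer(inputs, y, n_keep):
    """Fit one quadratic node per input pair; return the n_keep lowest-error nodes."""
    nodes = []
    for p, q in combinations(range(inputs.shape[1]), 2):
        X = quadratic_features(inputs[:, p], inputs[:, q])
        coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)  # least squares, Eq. (4)
        pred = X @ coeffs
        mse = float(np.mean((pred - y) ** 2))  # Eq. (3)
        nodes.append((mse, (p, q), coeffs, pred))
    nodes.sort(key=lambda node: node[0])  # increasing error
    return nodes[:n_keep]


def train_gmdh(inputs, y, n_keep=4, max_layers=10):
    """Grow layers until the best error stops improving; return the best node."""
    best = None
    layer_inputs = inputs
    for _ in range(max_layers):
        kept = fit_layer(layer_inputs, y, n_keep)
        if best is not None and kept[0][0] >= best[0]:
            break  # current layer is no better than the previous one
        best = kept[0]
        if len(kept) < 2:
            break  # not enough nodes left to form a pair
        # Outputs of the selected nodes become the inputs of the next layer.
        layer_inputs = np.column_stack([node[3] for node in kept])
    return best
```

With $M = 4$ inputs, `fit_layer` builds $M(M-1)/2 = 6$ candidate nodes before selection, which matches the node count stated above.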
3
4. Locally Weighted Group Method of Data
Handling (LWGMDH)
The LWGMDH method is derived by combining GMDH with the local regression method and weighted least squares (WLS) regression. To predict the output value $y$ for each query point $x_q$ belonging to the testing set, the GMDH is trained using only the $K$ nearest neighbors of $x_q$ ($1 < K \le N$). The coefficient parameters are calculated using WLS regression, where each point in the neighborhood is weighted according to its distance from $x_q$. Points that are close to $x_q$ have large weights, and points far from $x_q$ have small weights.
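The neighbor selection described above can be sketched as follows. The function name `k_nearest_neighbors` is hypothetical, and the Euclidean distance is used here because it is the metric the design procedure uses for neighbor selection.

```python
import numpy as np


def k_nearest_neighbors(X_train, x_query, k):
    """Indices and Euclidean distances of the k training points closest to x_query."""
    dists = np.linalg.norm(X_train - x_query, axis=1)
    idx = np.argsort(dists)[:k]
    return idx, dists[idx]
```

Only the rows `X_train[idx]` and the matching outputs would then be used to train the local GMDH model for that query point.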
Overall, the framework of the design procedure of
the LWGMDH comes as a sequence of the following
steps.
Step 1: Reconstruct the time series: Load the multivariate time series dataset $X = (x_1(t), x_2(t), \ldots, x_M(t))$, $t = 1, 2, \ldots, N$. Use the KPCA method to calculate the number of principal components of each dataset (we set the time delay constant of all datasets equal to 1). Then, reconstruct the multivariate time series using these values.
Step 2: Form the training and validation data: The input dataset after reconstruction is divided into two parts, a training dataset $X_{tr}$ of size $N_{tr}$ and a validation dataset $X_{va}$ of size $N_{va}$.
Step 3: For each query point $x_q$, choose the $K$ nearest neighbors of this query point using the Euclidean distance between $x_q$ and each point in $X_{tr}$ ($1 < K \le N_{tr}$).
Step 4: Create the first layer: Using the $K$ nearest neighbors only, all combinations of the inputs are generated based on (2) and sent into the first layer of the network.
Step 5: Estimate the coefficient parameters of each node: The vector of coefficients $A$ is derived by minimizing the locally weighted mean squared error
$$ e = \frac{1}{K}\sum_{i=1}^{K} w_i (\hat{y}_i - y_i)^2 \tag{5} $$
where $w_i$ is the weighting function. Many weighting functions have been proposed [31], of which the Gaussian kernel, tricube kernel and quadratic kernel are the most popular [31]. In this work, we employ the commonly used Gaussian kernel weighting function:
$$ w_i = \exp\left(-\frac{\|x_i - x_q\|^2}{h^2}\right) \tag{6} $$
where $x_q$ is the query point, $x_i$ is a data point belonging to the nearest neighbors of $x_q$, and $h$ is the bandwidth parameter, which plays an important role in local modelling. An optimization method for the bandwidth is discussed in the next section.
The weighted least squares solution of (5) is given by:
$$ A = ((WX)^T (WX))^{-1} (WX)^T (WY) \tag{7} $$
where $W$ is the diagonal matrix with diagonal elements $W_{ii} = w_i$ and zeros elsewhere [31], $Y = [y_1, y_2, \ldots, y_K]^T$, $A(a_0, a_1, a_2, a_3, a_4, a_5)$, and $X$ is defined as in the previous section but with the number of rows equal to $K$ (the number of nearest neighbors). This procedure is repeated for all nodes of the layer.
Step 6: Select the nodes with the best predictive capability to create the next layer: Each node in the current layer is evaluated using the training and validation datasets. The nodes that give the best predictive performance for the output variable are then chosen as inputs to the next layer, with all combinations of the selected nodes based on (2) being sent into the next layer. In this paper, we use a predetermined number of these nodes. The coefficient parameters of each node in this layer are estimated using the same procedure as in Step 5.
Step 7: Check the stopping criterion: The modelling is terminated when
$$ e_{l+1} \ge e_l \tag{8} $$
where $e_{l+1}$ is the minimal identification error of the current layer and $e_l$ is the minimal identification error of the previous layer. In that case, the best node of the previous layer ($l$) is used as the final solution for the current query point. If the stopping criterion is not satisfied, the model has to be expanded, and Steps 6 and 7 are repeated until it is satisfied.
Step 8: Steps 3 to 7 are repeated until the future values of all query points have been obtained.
Figure 2 presents the computation procedure of
the proposed method.
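The weighted coefficient estimation of Step 5 can be sketched in Python as follows. This is an illustrative sketch, not the authors' code: the function names are hypothetical, and `np.linalg.lstsq` on the scaled system is used in place of the explicit inverse in Eq. (7). Note that solving Eq. (7) with $W_{ii} = w_i$ minimizes the sum of $(w_i r_i)^2$ over the residuals $r_i$; scaling the rows by $\sqrt{w_i}$ instead would minimize the sum of $w_i r_i^2$. The sketch follows Eq. (7) as written.

```python
import numpy as np


def gaussian_weights(X_neighbors, x_query, h):
    """Gaussian kernel weights: exp(-||x_i - x_q||^2 / h^2)."""
    sq_dists = np.sum((X_neighbors - x_query) ** 2, axis=1)
    return np.exp(-sq_dists / h ** 2)


def wls_coefficients(X_design, y, weights):
    """Solve A = ((W X)^T (W X))^{-1} (W X)^T (W y) with W = diag(weights)."""
    WX = weights[:, None] * X_design  # scale each row of X by its weight
    Wy = weights * y
    coeffs, *_ = np.linalg.lstsq(WX, Wy, rcond=None)
    return coeffs
```

Here `X_design` would be the six-column quadratic design matrix of a node, built from the $K$ nearest neighbors only.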
5. Weighted Distance Algorithm For
Optimizing The Bandwidth
The weighting function bandwidth ($h$) is a very important parameter in local modelling. If $h$ is infinite, then the local modelling becomes global. On the other hand, if $h$ is too small, there may not be an adequate number of data points in the neighborhood for a good prediction.
There are several ways to choose this parameter, such as constant bandwidth selection; nearest neighbor bandwidth selection, where $h$ is set to the distance between the query point and the $K$th nearest point; and global bandwidth selection, where $h$ is calculated globally by an optimization process [31].
In the constant bandwidth selection method, the weighting function uses volumes of training data with constant size and shape. However, its performance is unsatisfactory for nonlinear systems, as the density and distribution of data points are unlikely to be identical everywhere in the dataset [32]. In this paper, we use the weighted distance algorithm, which uses the Mahalanobis distance metric to optimize the bandwidth ($h$), to improve the accuracy of the proposed method.
With the Mahalanobis distance metric, the problems of scale and correlation inherent in the Euclidean distance are no longer an issue. With the Euclidean distance, the set of points at equal distance from a given location is a sphere. The Mahalanobis distance metric stretches this sphere to correct for the respective scales of the different variables.
The standard Mahalanobis distance metric can be defined as:
$$ d(x) = \sqrt{(x - \mu)^T S^{-1} (x - \mu)} \tag{9} $$
where $x$ is a vector of data, $\mu$ is the mean and $S^{-1}$ is the inverse covariance matrix.
The Mahalanobis distance metric between the query point $x_q$ and a data point $x$ is defined as
$$ d_q = \sqrt{(x - x_q)^T S^{-1} (x - x_q)} $$
where $x$ belongs to the $K$ nearest neighbors of the query point $x_q$ and $S^{-1}$ is computed after removing the mean from each column. The bandwidth $h_q$ is a function of $d_q$:
$$ h_q = \Theta(d_q) \tag{10} $$
where $d_{min} \le d_q \le d_{max}$, $d_{min}$ is the distance between $x_q$ and its closest neighbor, and $d_{max}$ is the distance between $x_q$ and its farthest neighbor.
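The distance between a query point and its neighbors can be sketched in Python as below. Two choices here are assumptions made for illustration: the covariance is estimated from the neighbors themselves with `np.cov` (which removes the column means), and the pseudo-inverse `np.linalg.pinv` is used so that a singular covariance matrix does not raise an error. The function name is hypothetical.

```python
import numpy as np


def mahalanobis_distances(X_neighbors, x_query):
    """Mahalanobis distance from x_query to each row of X_neighbors."""
    # np.cov subtracts the column means before estimating the covariance.
    S = np.atleast_2d(np.cov(X_neighbors, rowvar=False))
    S_inv = np.linalg.pinv(S)  # pseudo-inverse: an assumption for robustness
    diff = X_neighbors - x_query
    # Quadratic form diff_i^T S_inv diff_i for every neighbor i.
    sq = np.einsum("ij,jk,ik->i", diff, S_inv, diff)
    return np.sqrt(np.maximum(sq, 0.0))
```

The smallest and largest returned values would then play the roles of $d_{min}$ and $d_{max}$.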
According to the LWR method, the point corresponding to $d_q = d_{min}$ is the most important, that is, $h_{max} = \Theta(d_{min}) = 1$, while the point corresponding to $d_q = d_{max}$ is the least important, that is, $h_{min} = \Theta(d_{max}) = \delta$, where $\delta$ is a real constant. This constant is a low sensitivity parameter. Therefore, after a few trials, we fixed it at 0.01, which gave the best results.
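The bandwidth $h_q$ can be selected as a function of $d_q$ as follows [32]:
$$ h_q = \Theta(d_q) = a\left(\frac{1 - b\,d_q}{d_q}\right)^2 + c \tag{11} $$
where $a$, $b$ and $c$ are constants. By applying the boundary conditions, these constants can be calculated, which gives [32]:
$$ h_q = (1 - \delta)\left(\frac{d_{min}(d_{max} - d_q)}{d_q (d_{max} - d_{min})}\right)^2 + \delta \tag{12} $$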
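The step from Eq. (11) to Eq. (12) can be checked with the short derivation below. It reads Eq. (11) as $\Theta(d_q) = a((1 - b\,d_q)/d_q)^2 + c$ and assumes that the squared term is required to vanish at $d_q = d_{max}$; the two boundary conditions alone do not fix three constants, and this choice is the one that reproduces Eq. (12). The derivation in [32] may proceed differently.

```latex
% Boundary conditions: \Theta(d_{min}) = 1 and \Theta(d_{max}) = \delta.
\begin{align*}
\Theta(d_q) &= a\left(\frac{1 - b\,d_q}{d_q}\right)^2 + c \\
b = \frac{1}{d_{max}},\quad c = \delta
  &\;\Rightarrow\; \Theta(d_{max}) = a \cdot 0 + \delta = \delta \\
\Theta(d_{min}) = 1
  &\;\Rightarrow\; a\left(\frac{d_{max} - d_{min}}{d_{min}\,d_{max}}\right)^2 = 1 - \delta
  \;\Rightarrow\; a = (1 - \delta)\left(\frac{d_{min}\,d_{max}}{d_{max} - d_{min}}\right)^2 \\
\Theta(d_q) &= (1 - \delta)\left(\frac{d_{min}\,d_{max}}{d_{max} - d_{min}}\right)^2
  \left(\frac{d_{max} - d_q}{d_{max}\,d_q}\right)^2 + \delta
  = (1 - \delta)\left(\frac{d_{min}(d_{max} - d_q)}{d_q\,(d_{max} - d_{min})}\right)^2 + \delta
\end{align*}
```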
The Gaussian kernel weighting function used in this paper can then be written as:
$$ w = \exp\left(-\frac{d_q^2}{h_q^2}\right) \tag{13} $$
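The bandwidth rule and the resulting weights can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes $0 < d_{min} < d_{max}$, since the bandwidth expression is undefined when a neighbor coincides with the query point or when all neighbors are equidistant.

```python
import numpy as np


def bandwidth(d_q, d_min, d_max, delta=0.01):
    """Distance-dependent bandwidth; assumes 0 < d_min < d_max."""
    ratio = d_min * (d_max - d_q) / (d_q * (d_max - d_min))
    return (1.0 - delta) * ratio ** 2 + delta


def neighbor_weights(distances, delta=0.01):
    """Gaussian weights exp(-d^2 / h^2) with a per-neighbor bandwidth h."""
    d = np.asarray(distances, dtype=float)
    h = bandwidth(d, d.min(), d.max(), delta)
    return np.exp(-(d ** 2) / h ** 2)
```

By construction, the bandwidth equals 1 for the closest neighbor and $\delta$ for the farthest one, so the weights decrease as the distance grows.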
6. Experimental Results
6.1. Data
To evaluate the performance of the proposed method, it has been tested for wind power prediction using real data from wind farms in Alberta, Canada [33]. Alberta has the highest percentage of total installed wind generation capacity of any province in Canada. There are more than 40 wind projects proposed for future development in Alberta. Alberta includes many wind farms, such as the Ghost Pine wind farm (51 turbines, 81.6 MW total capacity), the Taber wind farm (37 turbines, 81.4 MW total capacity) and the Wintering Hills wind farm (55 turbines, 88 MW total capacity) [34]. The total installed wind power capacity in 2011 was 800 MW. Under the most recent governmental goals for the wind sector, this value is to be raised to 893 MW in 2012 [34]. The wind power profile in Alberta, Canada, in January 2011 is shown in Fig. 3.
Figure 2: Flowchart of the proposed method.
6.2. Parameters
To implement a good model, some important parameters must be chosen. There are two important parameters in the KPCA algorithm, which is used to reconstruct the phase space: the number of principal components ($n_c$) and $w^2$ in the Gaussian kernel function. The optimal values of these parameters, computed using the cross validation method, are $w^2 = 1.09$ and $n_c = 10$.
In the local prediction model, choosing the neighborhood size ($K$) is a very important step. This parameter is calculated as described in [27], where $k_{max}$ and $\beta$ are fixed for all test cases at 45% of $N$ and 80, respectively.
Figure 3: Wind power profile in Alberta, Canada, in January 2011 (x-axis: Day; y-axis: Wind Power (MW)).
Table 1: Comparative RMSE Results
             Winter  Spring  Summer  Fall   Average
Persistence  13.71   16.19   14.42   22.99  16.83
SARIMA        6.70    6.59    8.09   13.88   8.82
LRBF          5.03    4.85    4.76    6.97   5.40
LWGMDH        4.01    3.90    3.72    5.32   4.24
Table 2: Improvement of the LWGMDH Over Other Approaches Regarding RMSE
             Average RMSE  Improvement
LWGMDH        4.24         --
Persistence  16.83         74.81%
SARIMA        8.82         51.93%
LRBF          5.40         21.48%
6.3. Forecasting accuracy evaluation
For all experiments, we quantified the prediction performance with the root mean square error (RMSE) and normalized mean absolute error (NMAE) criteria. They are defined as:
$$ \mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{h=1}^{N} (\hat{p}_h - p_h)^2} \tag{14} $$
$$ \mathrm{NMAE} = \frac{1}{N}\sum_{h=1}^{N} \frac{|\hat{p}_h - p_h|}{p_{inst}} \times 100 \tag{15} $$
where $\hat{p}_h$ and $p_h$ are the forecasted and actual wind power at hour $h$, respectively, $p_{inst}$ is the installed wind power capacity and $N$ is the number of forecasted hours.
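The two error criteria can be computed with the short Python sketch below. The function and argument names are hypothetical; the forecast and actual series are assumed to be hourly values in the same unit as the installed capacity.

```python
import numpy as np


def rmse(forecast, actual):
    """Root mean square error between forecast and actual values."""
    forecast = np.asarray(forecast, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.sqrt(np.mean((forecast - actual) ** 2)))


def nmae(forecast, actual, installed_capacity):
    """Mean absolute error normalized by installed capacity, in percent."""
    forecast = np.asarray(forecast, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.mean(np.abs(forecast - actual)) / installed_capacity * 100.0)
```

For instance, forecasts of 100 and 200 MW against actual values of 110 and 180 MW give absolute errors of 10 and 20 MW, so the NMAE for an 800 MW installed capacity is 15 / 800 x 100 = 1.875%.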
6.4. Results
The proposed LWGMDH method has been applied to the prediction of the total wind power in Alberta, Canada. Its performance is compared with three published approaches employing the same dataset: persistence, seasonal ARIMA (SARIMA) and local radial basis function (LRBF). Historical wind power data are the only inputs for training the proposed method. For the sake of clear comparison, no exogenous variables are considered.
The proposed LWGMDH method predicts the values of the wind power subseries one day ahead, taking into account the wind power data of the previous 3 months (the first 80% of these data are used for training, while the last 20% are used for validation). The length of the forecast horizon for the Alberta dataset is 24 hours. Four test weeks (Monday to Sunday) corresponding to the four seasons of 2011 are randomly selected for this numerical experiment: the second week of February 2011 as a winter week, the third week of May 2011 as a spring week, the second week of August 2011 as a summer week, and the first week of November 2011 as a fall week.
The error (RMSE and NMAE) of each day during each testing week is calculated. The average error of each testing week (Monday to Sunday) is then calculated by averaging the seven error values of its forecast days. Finally, the overall mean performance over the four testing weeks is calculated for each method.
Table 1 shows a comparison between the proposed LWGMDH method and three other approaches (Persistence, SARIMA and LRBF) regarding the RMSE criterion.
These results show that the proposed method outperforms the other methods. Table 2 shows the RMSE improvements of the LWGMDH method over Persistence, SARIMA and LRBF.
Table 3 shows a comparison between the proposed LWGMDH method and three other approaches (Persistence, SARIMA and LRBF) regarding the NMAE criterion.
Table 3: Comparative NMAE Results
             Winter  Spring  Summer  Fall   Average
Persistence  6.59    7.66    7.51    11.07  8.21
SARIMA       3.21    3.09    3.84     6.53  4.17
LRBF         2.38    2.31    2.20     3.26  2.54
LWGMDH       1.94    1.85    1.71     2.48  1.99
These results show the superiority of the proposed method over the other methods. Table 4 shows the NMAE improvements of the LWGMDH method over Persistence, SARIMA and LRBF.
Table 4: Improvement of the LWGMDH Over Other Approaches Regarding NMAE
             Average NMAE  Improvement
LWGMDH       1.99          --
Persistence  8.21          75.76%
SARIMA       4.17          52.28%
LRBF         2.54          21.65%
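The improvement figures are not defined by a formula in the text, but they are consistent with the relative error reduction (baseline error minus LWGMDH error, divided by the baseline error). The short Python check below, with a hypothetical function name, reproduces the tabulated RMSE and NMAE improvements from the reported averages under that assumption.

```python
def improvement_percent(baseline_error, proposed_error):
    """Relative error reduction of the proposed method over a baseline, in percent."""
    return (baseline_error - proposed_error) / baseline_error * 100.0


# Average errors as reported in the comparison tables.
average_rmse = {"Persistence": 16.83, "SARIMA": 8.82, "LRBF": 5.40}
average_nmae = {"Persistence": 8.21, "SARIMA": 4.17, "LRBF": 2.54}

rmse_gain = {m: round(improvement_percent(e, 4.24), 2) for m, e in average_rmse.items()}
nmae_gain = {m: round(improvement_percent(e, 1.99), 2) for m, e in average_nmae.items()}
```

For example, the RMSE improvement over Persistence is (16.83 - 4.24) / 16.83 x 100, which rounds to 74.81%.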
Figs. 4-7 show the predicted hourly wind power versus the actual wind power for one day (as an example) of each testing week using the proposed LWGMDH method. These results show that our predicted values are very close to the actual values.
The above results indicate that the proposed method is less sensitive to wind power volatility than the other techniques used in the comparison.
To further study the superiority of the LWGMDH method, it is also executed for all 52 weeks of 2011 for the Alberta dataset and compared with the three other approaches (Persistence, SARIMA and LRBF).
The results show that the proposed LWGMDH method improves the RMSE and NMAE for the 52 weeks of 2011 over the Persistence, SARIMA and LRBF methods. Table 5 shows the RMSE and NMAE improvements of the LWGMDH method over Persistence, SARIMA and LRBF. In addition, Fig. 8 compares the LWGMDH method with the Persistence, SARIMA and LRBF methods for each month of 2011 regarding the RMSE criterion. Similar results are obtained using the NMAE criterion. These results show the robustness of the proposed LWGMDH method and its performance over the long run of a complete year.
Figure 4: Forecasted and actual hourly wind power for February 9, 2011 (x-axis: Hour; y-axis: Wind Power (MW); series: Actual, Predicted).
Figure 5: Forecasted and actual hourly wind power for May 17, 2011 (x-axis: Hour; y-axis: Wind Power (MW); series: Actual, Predicted).
Figure 6: Forecasted and actual hourly wind power for August 11, 2011 (x-axis: Hour; y-axis: Wind Power (MW); series: Actual, Predicted).
Figure 7: Forecasted and actual hourly wind power for November 3, 2011 (x-axis: Hour; y-axis: Wind Power (MW); series: Actual, Predicted).
Table 5: Improvement of the LWGMDH Over Other Methods for All 52 Weeks of Year 2011
             RMSE Improvement  NMAE Improvement
LWGMDH       --                --
Persistence  73.88%            75.35%
SARIMA       51.19%            51.81%
LRBF         20.29%            21.01%
Figure 8: RMSE results for the year 2011 (x-axis: Month; y-axis: RMSE; series: LWGMDH, LRBF, SARIMA, Persistence).
7. Conclusions
In this paper, we have proposed an LWGMDH method based on KPCA for wind power prediction. In the proposed method, KPCA is used to reconstruct the time series phase space, and the neighboring points of each query point are found using the Euclidean distance. Only these neighboring points are used to train the GMDH, where the coefficient parameters are calculated using weighted least squares (WLS) regression. In addition, the weighting function's bandwidth, which plays a very important role in local modelling, is optimized by the weighted distance algorithm.
By using KPCA, the drawback of the traditional time series reconstruction techniques can be avoided by decreasing the correlation between different features in the reconstructed phase space. Also, by combining GMDH with the local regression method, the drawbacks of global methods can be overcome. In addition, by using WLS, each point in the neighborhood is weighted according to its distance from the current query point, so that points close to the current query point have larger weights than others. Moreover, the weighted distance algorithm overcomes the disadvantage of using a fixed value for the weighting function's bandwidth. Together, these steps improve the accuracy of the proposed model.
A real-world dataset has been used to evaluate the performance of the proposed model, which has been compared with the Persistence, SARIMA and LRBF methods. The numerical results show the superiority of the proposed model over the Persistence, SARIMA and LRBF methods based on different error measures.
References
[1] M. Negnevitsky, P. Johnson, S. Santoso, Short term wind
power forecasting using hybrid intelligent systems, in:
Proc. of Power Eng. Soc. General Meeting, Tampa, FL,
2007, pp. 1–4.
[2] M. Khalid, A. Savkin, A method for short-term wind
power prediction with multiple observation points, IEEE
Trans. Power Systems 27 (2) (2012) 579–586.
[3] H. P. J. Catalo, V. Mendes, A hybrid PSO ANFIS ap-
proach for short-term wind power prediction in Portugal,
IEEE Trans Sustainable Energy 2 (1) (2011) 50–59.
[4] M. Negnevitsky, P. Mandal, A. K. Srivastava, Machine
learning applications for load, price and wind power pre-
diction in power systems, in: Proc. of 15th Int. Conf. on
Intelligent Syst. Appl. to Power Syst., 2009, pp. 1–6.
[5] L. Wang, L. Dong, Y. Hao, X. Liao, Wind power predic-
tion using wavelet transform and chaotic characteristics,
in: Proc. of World Non-Grid-Connected Wind Power and
Energy Conf., (WNWEC 2009), 2009, pp. 1–5.
[6] L. Ma, S. Y. Luan, C. W. Jiang, H. L. Liu, Y. Zhang,
A review on the forecasting of wind speed and generated
power, Renew. Sust. Energy Rev. 13 (4) (2009) 915–920.
[7] G. Sideratos, N. Hatziargyriou, Using radial basis neural
networks to estimate wind power production, in: Proc. of
Power Eng. Soc. General Meeting, Tampa, FL, 2007, pp.
1–7.
[8] D. F. W. Jiang, Z. Yan, Z. Hu, Wind speed forecasting
using autoregressive moving average/generalized autore-
gressive conditional heteroscedasticity model, European
Transactions on Electrical Power 22 (5) (2012) 662–673.
[9] R. G. Kavasseri, K. Seetharaman, Day-ahead wind speed
forecasting using f-ARIMA models, Renew. Energy 34 (5)
(2009) 1388–1393.
[10] F. K. N. Amjady, H. Zareipour, Wind power prediction by
a new forecast engine composed of modiﬁed hybrid neural
network and enhanced particle swarm optimization, IEEE
Trans. Sustainable Energy 2 (3) (2011) 265–276.
[11] K. Bhaskar, S. Singh, AWNN-assisted wind power fore-
casting using feed-forward neural network, IEEE Trans.
Sustainable Energy 3 (2) (2012) 306–315.
[12] C. W. Potter, M. Negnevitsky, Very short-term wind fore-
casting for Tasmanian power generation, IEEE Trans.
Power Systems 21 (2) (2006) 965–972.
[13] F. K. N. Amjady, H. Zareipour, A new hybrid iterative
method for short-term wind speed forecasting, European
Transactions on Electrical Power 21 (1) (2011) 581–595.
[14] S. Fan, J. R. Liao, R. Yokoyama, L. N. Chen, W. J. Le,
Forecasting the wind generation using a two-stage network
based on meteorological information, IEEE Trans. Energy
Conversion 24 (2) (2009) 474–482.
[15] A. J. Smola, B. Scholkopf, A tutorial on support vector
regression, NeuroCOLT Technical Report NC-TR-98-030,
Royal Holloway College, University of London, 1998.
[16] D. Ying, J. Lu, Q. Li, Short-term wind speed forecasting of
wind farm based on least square-support vector machine,
Power Syst. Technology 32 (15) (2008) 62–66.
[17] M. Zhang, Short-term load forecasting based on support
vector machines regression, in: Proceedings of the Fourth
Int. Conf. on Machine Learning and Cyber., 2005.
[18] A. G. Ivakhnenko, Polynomial theory of complex systems,
IEEE Trans. of Syst., Man and Cyber. SMC-1 (1971) 364–
378.
[19] S. J. Farlow, Self-organizing method in modeling: GMDH
type algorithm, Marcel Dekker Inc., 1984.
[20] D. Srinivasan, Energy demand prediction using GMDH
networks, Neurocomputing 72 (2008) 625–629.
[21] R. E. Abdel-Aal, M. A. Elhadidy, S. M. Shaahid, Modeling
and forecasting the mean hourly wind speed time series us-
ing GMDH-based abductive networks, Renewable Energy
34 (7) (2009) 1686–1699.
[22] E. E. El-Attar, J. Y. Goulermas, Q. H. Wu, Forecasting
electric daily peak load based on local prediction, in: IEEE
Power Engineering Society General Meeting (PESGM09),
2009, pp. 1–6.
[23] F. Takens, Detecting strange attractors in turbulence,
Lect. Notes in Mathematics (Springer Berlin) 898 (1981)
366–381.
[24] D. Tao, X. Hongfei, Chaotic time series prediction based
on radial basis function network, in: Eighth ACIS Int.
Conf. on Software Engin., Artiﬁcial Intelligence, Network-
ing, and Parallel/Distributed Computing, 2007, pp. 595–
599.
[25] L. Cao, K. Chua, W. Chong, H. Lee, Q. Gu, A comparison
of PCA, KPCA and ICA for dimensionality reduction
in support vector machine, Neurocomputing 55
(2003) 321–336.
[26] F. Chen, C. Han, Time series forecasting based on wavelet
KPCA and support vector machine, in: IEEE Int. Conf.
on Automation and Logistics, 2007, pp. 1487–1491.
[27] E. E. Elattar, J. Y. Goulermas, Q. H. Wu, Electric load
forecasting based on locally weighted support vector re-
gression, IEEE Trans. Syst., Man and Cyber. C, Appl.
and Rev. 40 (4) (2010) 438–447.
[28] E. E. Elattar, J. Y. Goulermas, Q. H. Wu, Integrating
KPCA and locally weighted support vector regression
for short-term load forecasting, in: the 15th IEEE
Mediterranean Electrotechnical Conf. (MELECON 2010),
Valletta, Malta, 2010, pp. 1528–1533.
[29] S. Haykin, Neural networks: A comprehensive foundation,
Prentice-Hall, Inc., 1999.
[30] A. G. Ivakhnenko, G. A. Ivakhnenko, The review of prob-
lems solved by algorithms of the group method of data
handling (GMDH), Pattern Recognition and Image Anal-
ysis 5 (1995) 527–535.
[31] C. G. Atkeson, A. W. Moore, S. Schaal, Locally weighted
learning, Artiﬁcial Intelligence Review (Special Issue on
Lazy Learning) 11 (1997) 11–73.
[32] H. Wang, C. Cao, H. Leung, An improved locally weighted
regression for a converter re-vanadium prediction model-
ing, in: Proceedings of the 6th World Congress on Intelli-
gent Control and Automation, 2006, pp. 1515–1519.
[33] Alberta electric system operator (AESO), wind power in-
tegration (2012).
URL http://www.aeso.ca/gridoperations/13902.html
[34] Canadian wind power association (CANWEA) (2012).
URL http://www.canwea.ca/farms/wind-farms_e.php