All content in this area was uploaded by Ibrahim B M Taha on Feb 05, 2016

An Advanced Intelligent Method for Wind Power Prediction

E. E. Elattar1∗, I. Taha2

1Department of Electrical Engineering, Menoﬁa University, Shebin El-Kom, Egypt.

2Department of Electrical Engineering, Tanta University, Tanta, Egypt

SUMMARY

Wind power prediction is one of the most critical aspects of wind power integration and operation. This paper proposes a new approach to wind power prediction. The proposed method integrates kernel principal component analysis (KPCA) with the locally weighted group method of data handling (LWGMDH), which is derived by combining the GMDH with the local regression method and weighted least squares (WLS) regression. In the proposed model, KPCA is used to extract features of the inputs and obtain kernel principal components for constructing the phase space of the multivariate time series of the inputs. LWGMDH is then employed to solve the wind power prediction problem. The coefficient parameters are calculated using WLS regression, where each point in the neighborhood is weighted according to its distance from the current prediction point. In addition, the weighted distance algorithm is presented to optimize the bandwidth of the weighting function. The proposed model is validated using a real-world dataset.

Key words: Wind power prediction, group method of data handling, locally weighted group method of data handling, weighted distance, kernel principal component analysis, state space reconstruction.

1. INTRODUCTION

Wind power is the fastest-growing power generation sector in the world today, and globally installed wind power capacity is expected to continue growing. The output power of wind farms is hard to control because of the uncertain and variable nature of the wind resource. Hence, integrating a large share of wind power into an electricity system poses important challenges to the stability of the power grid and the reliability of the electricity supply [1]. Wind power prediction is one of the most critical aspects of wind power integration and operation. It allows the operation of wind turbines and conventional generators to be scheduled, thus achieving a low spinning reserve and an optimal operating cost [2].

Short-term prediction generally covers horizons ranging from a few minutes to a few hours or days. It is required for generation commitment and market operation. Short-term power prediction is a very important field of research for the energy sector, where system operators must handle a large amount of fluctuating power from the increasing installed wind power capacity. Its time scales are on the order of some days (for the forecast horizon) and from minutes to hours (for the time step) [3].

∗Correspondence to: Dr. E. E. Elattar, Department of Electrical Engineering, Faculty of Engineering, Menofia University, Shebin El-Kom 32511, Egypt. Email: dr.elattar10@yahoo.com

Various methods have been proposed for short-term wind power prediction. They can be categorized into physical methods, statistical methods, methods based on artificial intelligence (AI) and hybrid approaches [4]. Physical methods need many physical considerations to give good prediction precision and are usually used for long-term prediction [5], whereas statistical methods perform well in short-term prediction [6].

The traditional statistical methods are time-series-based methods, such as the persistence method [7] and the autoregressive integrated moving average (ARIMA) method [8, 9]. These methods are based on a linear regression model and cannot always represent the nonlinear characteristics of the inputs. The AI methods describe the relation between input and output data from past time series by a non-statistical approach such as the artificial neural network (ANN) [10, 11], fuzzy logic [7] and neuro-fuzzy [12]. Moreover, hybrid methods [13, 14] have also been applied to short-term wind power prediction with success.

Support vector regression (SVR) [15] has been ap-

plied to wind speed prediction with success [16]. SVR

has been shown to be very resistant to the overﬁtting

problem and gives a high generalization performance

in prediction problems. SVR has been evaluated on

several time series datasets [17].

The Group Method of Data Handling (GMDH) is a self-organizing method that was first developed by Ivakhnenko [18]. The main idea of GMDH is to

build an analytical function in a feedforward network

based on a quadratic node transfer function whose

coeﬃcients are obtained using a regression technique

[19]. GMDH has been applied to solve many predic-

tion problems with success [20, 21].

All the above techniques are known as global time series predictors, in which a predictor is trained using all available data but gives a prediction using only a current data window. Global predictors suffer from some drawbacks, which are discussed in our previous work [22].

We proposed the local SVR method to overcome the drawbacks of global predictors; more details of the local predictor can be found in [22].

Phase space reconstruction is an important step in

local prediction methods. The traditional time series

reconstruction techniques usually use the coordinate

delay (CD) method [23] to calculate the embedding

dimension and the time delay constant of the time

series [24].
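To make the reconstruction step concrete, a minimal sketch of coordinate-delay embedding is given below. It assumes the usual construction in which each state vector collects `m` samples of the series spaced `tau` steps apart; the function name `delay_embed` and the sample values are illustrative and are not taken from this work.

```python
def delay_embed(series, m, tau):
    """Return state vectors [x(t), x(t - tau), ..., x(t - (m - 1) * tau)].

    m is the embedding dimension and tau the time delay (both assumed given).
    """
    start = (m - 1) * tau  # first index with a full history available
    return [[series[t - k * tau] for k in range(m)]
            for t in range(start, len(series))]


# Illustrative series only; real inputs would be wind power measurements.
vectors = delay_embed([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], m=3, tau=2)
```

Each row of `vectors` is one point of the reconstructed phase space, with the most recent sample first.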

The traditional time series reconstruction techniques have a serious problem: there may be correlation between different features in the reconstructed phase space, which degrades the quality of the phase space reconstruction [25]. In recent years, kernel principal component analysis (KPCA), which is one type of nonlinear principal component analysis (PCA), has been used to process nonlinear time series [26]. KPCA is an unsupervised technique based on performing principal component analysis in the feature space of a kernel. The main idea of KPCA is first to map the original inputs into a high-dimensional feature space via a kernel map, which makes the data structure more linear, and then to calculate principal components in the high-dimensional feature space [25].

Moreover, our previous work on the local SVR predictor was extended to locally weighted support vector regression (LWSVR) by modifying the risk function of the SVR algorithm with the use of locally weighted regression (LWR) while keeping the regularization term in its original form [27, 28]. LWSVR has been applied to the short-term load forecasting (STLF) problem [27, 28]. Although the LWSVR method improves the accuracy of STLF, it suffers from some limitations. First, the most serious limitation of the SVR algorithm is the uncertainty in the choice of kernel; the best choice of kernel for a given problem is still a research issue. The second limitation is the selection of the SVR parameters, because structured methods for selecting these parameters efficiently are lacking. Finally, the SVR algorithm is computationally slower than artificial neural networks.

To avoid the limitations of the existing methods, a new method is proposed in this paper using an alternative machine learning technique, the GMDH.

The proposed method is derived by combining the

GMDH with the local regression method and weighted

least squares regression and employing the weighted

distance algorithm which uses the Mahalanobis dis-

tance to optimize the weighting function’s bandwidth.

In the proposed model, the phase space is reconstructed using the KPCA method, so that the problem of the traditional time series reconstruction techniques can be avoided. The proposed method has been evaluated using a real-world dataset.

The paper is organized as follows: Section 2 describes the time series reconstruction based on the KPCA method. Section 3 reviews the GMDH algorithm. The LWGMDH method is introduced in Section 4. Section 5 describes the weighted distance algorithm. Experimental results and comparisons with other methods are presented in Section 6. Finally, Section 7 concludes the work.

2. TIME SERIES RECONSTRUCTION BASED ON KPCA

In recent years, KPCA has been used to process nonlinear time series and to overcome the problem of the CD method [26]. In KPCA, the computations are performed in a feature space that is nonlinearly related to the input space. This feature space is defined by an inner product kernel in accordance with Mercer's theorem [29]. However, unlike other forms of nonlinear PCA, the implementation of KPCA relies on linear algebra, by mapping the original inputs into a high-dimensional feature space via a kernel map, which makes the data structure more linear. In this paper, the commonly used Gaussian kernel is employed. A detailed introduction to basic KPCA can be found in [25, 28, 29].
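As a brief reminder of the computation, the standard KPCA steps with a Gaussian kernel can be written as follows. This is a sketch of the textbook formulation rather than the exact variant used here: the scaling of the kernel width $w^2$ in the exponent is an assumption, and $n$ (number of samples), $\mathbf{K}$, $\tilde{k}$ (the centred kernel), $\boldsymbol{\alpha}_k$ and $\lambda_k$ are notation introduced for illustration.

```latex
\begin{align*}
K_{ij} &= k(\mathbf{x}_i,\mathbf{x}_j)
        = \exp\!\left(-\frac{\lVert \mathbf{x}_i-\mathbf{x}_j\rVert^{2}}{w^{2}}\right),
        \qquad i,j = 1,\dots,n \\
\tilde{\mathbf{K}} &= \mathbf{K} - \mathbf{1}_n\mathbf{K} - \mathbf{K}\mathbf{1}_n
        + \mathbf{1}_n\mathbf{K}\mathbf{1}_n,
        \qquad (\mathbf{1}_n)_{ij} = \tfrac{1}{n} \\
\tilde{\mathbf{K}}\,\boldsymbol{\alpha}_k &= \lambda_k\,\boldsymbol{\alpha}_k,
        \qquad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_{n_c},
        \qquad \lambda_k\,\lVert\boldsymbol{\alpha}_k\rVert^{2} = 1 \\
s_k(\mathbf{x}) &= \sum_{i=1}^{n} \alpha_{k,i}\,\tilde{k}(\mathbf{x}_i,\mathbf{x}),
        \qquad k = 1,\dots,n_c
\end{align*}
```

The first line builds the kernel matrix, the second centres it in feature space, the third solves the eigenproblem, and the last gives the $k$-th kernel principal component $s_k$ of an input $\mathbf{x}$; the first $n_c$ components form the reconstructed feature vector.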

3. Group Method of Data Handling (GMDH)

Suppose that the original dataset consists of $M$ columns holding the values of the system input variables, that is, $X = (x_1(t), x_2(t), \dots, x_M(t))$, $t = 1, 2, \dots, N$, and one column holding the observed values of the output, where $N$ is the length of the dataset.

The connection between the input and output variables can be represented by an infinite Volterra-Kolmogorov-Gabor polynomial of the form:

$$y = a_0 + \sum_{i=1}^{N} a_i x_i + \sum_{i=1}^{N}\sum_{j=1}^{N} a_{ij} x_i x_j + \sum_{i=1}^{N}\sum_{j=1}^{N}\sum_{k=1}^{N} a_{ijk} x_i x_j x_k + \dots \tag{1}$$

where $N$ is the number of the data of the dataset, and $A(a_0, a_i, a_{ij}, a_{ijk}, \dots)$ and $X(x_i, x_j, x_k, \dots)$ are the vectors of the coefficients and input variables of the resulting multi-input single-output system, respectively.

In the GMDH algorithm, the Volterra-Kolmogorov-

Gabor series is estimated by a cascade of second order

polynomials using only pairs of variables [18] in the

form of:

$$y = a_0 + a_1 x_i + a_2 x_j + a_3 x_i x_j + a_4 x_i^2 + a_5 x_j^2 \tag{2}$$

The corresponding network, shown in Fig. 1, can be constructed from simple polynomials. As the learning procedure evolves, branches that do not contribute significantly to the specific output can be deleted, which allows only the dominant causal relationships to evolve.

The GMDH network training algorithm proce-

dures can be summarized as follows:

• The GMDH network begins with only input nodes. All combinations of different pairs of them are generated using the quadratic polynomial of Eq. (2) and sent into the first layer of the network. The total number of polynomials (nodes) that can be constructed is equal to $M(M-1)/2$.

Figure 1: GMDH network (inputs $x_1$ to $x_4$ feeding three layers of six nodes each, $y_k^{(l)}$ with $k = 1, \dots, 6$ and $l = 1, 2, 3$).

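• Use least squares regression to compute the optimal coefficients $A(a_0, a_1, a_2, a_3, a_4, a_5)$ of each polynomial (node) so that it best fits the training data, by minimizing the mean squared error

$$e = \frac{1}{N}\sum_{i=1}^{N} (\hat{y}_i - y_i)^2 \tag{3}$$

where $\hat{y}_i$ is the node output and $y_i$ is the observed output. The least squares solution of (3) is given by:

$$A = (X^T X)^{-1} X^T Y \tag{4}$$

where $Y = [y_1, y_2, \dots, y_N]^T$,

$$X = \begin{bmatrix} 1 & x_{1P} & x_{1Q} & x_{1P}x_{1Q} & x_{1P}^2 & x_{1Q}^2 \\ 1 & x_{2P} & x_{2Q} & x_{2P}x_{2Q} & x_{2P}^2 & x_{2Q}^2 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ 1 & x_{NP} & x_{NQ} & x_{NP}x_{NQ} & x_{NP}^2 & x_{NQ}^2 \end{bmatrix}$$

and $P, Q \in \{1, 2, \dots, M\}$.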

• Compute the mean squared error of each node using Eq. (3).

• Sort the nodes in order of increasing error.

• Select the best nodes, i.e. those with the smallest error in the candidate set, as inputs to the next layer, with all combinations of different pairs of them being sent into the second layer.

• Repeat this process until the current layer is found to be no better than the previous one. The best node of the previous layer is then used as the final solution.

More details about the GMDH and its different applications have been reported in [19, 30].
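The training of one GMDH layer described above can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the implementation used in this work: it uses only the Python standard library, solves the normal equations of Eq. (4) by Gaussian elimination, assumes $X^T X$ is non-singular, and ranks nodes on the same data used for fitting. All function names (`design_row`, `solve`, `fit_node`, `node_mse`, `first_layer`) are hypothetical.

```python
from itertools import combinations


def design_row(xp, xq):
    """One row of X, ordered like the terms of Eq. (2)."""
    return [1.0, xp, xq, xp * xq, xp * xp, xq * xq]


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting.

    Assumes A is square and non-singular.
    """
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z


def fit_node(xp, xq, y):
    """Least squares coefficients of one node, i.e. Eq. (4)."""
    X = [design_row(p, q) for p, q in zip(xp, xq)]
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(6)] for i in range(6)]
    XtY = [sum(r[i] * t for r, t in zip(X, y)) for i in range(6)]
    return solve(XtX, XtY)


def node_mse(coef, xp, xq, y):
    """Mean squared error of one node, i.e. Eq. (3)."""
    preds = [sum(c * v for c, v in zip(coef, design_row(p, q)))
             for p, q in zip(xp, xq)]
    return sum((ph - t) ** 2 for ph, t in zip(preds, y)) / len(y)


def first_layer(columns, y):
    """Fit one node per pair of input columns; sort by increasing error."""
    nodes = []
    for p, q in combinations(range(len(columns)), 2):
        coef = fit_node(columns[p], columns[q], y)
        nodes.append((node_mse(coef, columns[p], columns[q], y), (p, q), coef))
    return sorted(nodes, key=lambda node: node[0])
```

With $M$ input columns, `first_layer` returns $M(M-1)/2$ candidate nodes; the outputs of the best ones would serve as the input columns of the next layer.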

4. Locally Weighted Group Method of Data Handling (LWGMDH)

The LWGMDH method is derived by combining GMDH with the local regression method and weighted least squares (WLS) regression. To predict the output value $y$ for each query point $x_q$ belonging to the testing set, the GMDH is trained using only the $K$ nearest neighbors of $x_q$ ($1 < K \ll N$). The coefficient parameters are calculated using WLS regression, where each point in the neighborhood is weighted according to its distance from $x_q$. Points that are close to $x_q$ have large weights, and points far from $x_q$ have small weights.

Overall, the framework of the design procedure of

the LWGMDH comes as a sequence of the following

steps.

• Step 1: Reconstruct the time series: Load the multivariate time series dataset $X = (x_1(t), x_2(t), \dots, x_M(t))$, $t = 1, 2, \dots, N$. Use the KPCA method to calculate the number of principal components of each dataset (we set the time delay constant of all datasets equal to 1). Then, reconstruct the multivariate time series using these values.

• Step 2: Form the training and validation data: The input dataset after reconstruction, $X$, is divided into two parts, a training dataset $X_{tr}$ and a validation dataset $X_{va}$. The size of the training dataset is $N_{tr}$, while the size of the validation dataset is $N_{va}$.

• Step 3: For each query point $x_q$, choose the $K$ nearest neighbors of this query point using the Euclidean distance between $x_q$ and each point in $X_{tr}$ ($1 < K \ll N_{tr}$).

• Step 4: Create the first layer: Using the $K$ nearest neighbors only, all combinations of the inputs are generated based on (2) and sent into the first layer of the network.

• Step 5: Estimate the coefficient parameters of each node: The vector of coefficients $A$ is derived by minimizing the locally weighted mean squared error

$$e = \frac{1}{K}\sum_{i=1}^{K} w_i (\hat{y}_i - y_i)^2 \tag{5}$$

where $w$ is the weighting function. Many weighting functions have been proposed [31], of which the Gaussian kernel, the tricube kernel and the quadratic kernel are the most popular [31]. In this work, we employ the commonly used Gaussian kernel weighting function:

$$w_i = \exp\left(-\frac{\lVert x_i - x_q \rVert^2}{h^2}\right) \tag{6}$$

where $x_q$ is the query point, $x_i$ is a data point belonging to the nearest neighbors of $x_q$, and $h$ is the bandwidth parameter, which plays an important role in local modelling. An optimization method for the bandwidth is discussed in the next section of the paper.

The weighted least squares solution of (5) is given by:

$$A = ((WX)^T (WX))^{-1} (WX)^T (WY) \tag{7}$$

where $W$ is the diagonal matrix with diagonal elements $W_{ii} = w_i$ and zeros elsewhere [31], $Y = [y_1, y_2, \dots, y_K]^T$, $A(a_0, a_1, a_2, a_3, a_4, a_5)$ is the coefficient vector, and $X$ is defined as in the previous section but with the number of rows equal to $K$ (the number of nearest neighbors). This procedure is repeated for all nodes of the layer.

• Step 6: Select the nodes with the best predictive capability to create the next layer: Each node in the current layer is evaluated using the training and validation datasets. The nodes that give the best predictive performance for the output variable are then chosen as inputs to the next layer, with all combinations of the selected nodes based on (2) being sent into the next layer. In this paper, we use a predetermined number of these nodes. The coefficient parameters of each node in this layer are estimated using the same procedure as in Step 5.

• Step 7: Check the stopping criterion: The modelling can be terminated when

$$e_{l+1} \ge e_l \tag{8}$$

where $e_{l+1}$ is the minimal identification error of the current layer and $e_l$ is the minimal identification error of the previous layer. In that case, the best node of the previous layer ($l$) is used as the final solution for the current query point. If the stopping criterion is not satisfied, the model has to be expanded, and Steps 6 and 7 are repeated until the stopping criterion is satisfied.

• Step 8: Repeat Steps 3 to 7 until the future values of all query points are acquired.

Figure 2 presents the computation procedure of

the proposed method.
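Steps 3 to 5 for a single node can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation used in this work: it uses only the Python standard library, fits a single quadratic node of Eq. (2) on two chosen inputs, and minimizes Eq. (5) directly by solving $X^T W X A = X^T W Y$ with $W_{ii} = w_i$; this coincides with Eq. (7) if the diagonal of $W$ there is taken to hold the square roots of the $w_i$, which is an assumption about the intended convention. The function names and the fixed bandwidth `h` are hypothetical.

```python
import math


def solve(A, b):
    """Solve A z = b by Gaussian elimination (A assumed non-singular)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z


def k_nearest(points, xq, k):
    """Step 3: indices of the k points closest to xq (Euclidean distance)."""
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], xq))
    return order[:k]


def gaussian_weights(points, xq, h):
    """Eq. (6): w_i = exp(-||x_i - x_q||^2 / h^2)."""
    return [math.exp(-math.dist(x, xq) ** 2 / h ** 2) for x in points]


def quad_row(x, p, q):
    """Row of X for inputs p and q, ordered like the terms of Eq. (2)."""
    return [1.0, x[p], x[q], x[p] * x[q], x[p] ** 2, x[q] ** 2]


def fit_node_wls(points, y, w, p, q):
    """Step 5: weighted least squares fit of one quadratic node."""
    rows = [quad_row(x, p, q) for x in points]
    XtWX = [[sum(wi * r[i] * r[j] for wi, r in zip(w, rows))
             for j in range(6)] for i in range(6)]
    XtWY = [sum(wi * r[i] * t for wi, r, t in zip(w, rows, y))
            for i in range(6)]
    return solve(XtWX, XtWY)


def predict_local(train_x, train_y, xq, k, h, p=0, q=1):
    """Fit one locally weighted node around xq and evaluate it at xq."""
    idx = k_nearest(train_x, xq, k)
    pts = [train_x[i] for i in idx]
    ys = [train_y[i] for i in idx]
    w = gaussian_weights(pts, xq, h)
    coef = fit_node_wls(pts, ys, w, p, q)
    return sum(c * v for c, v in zip(coef, quad_row(xq, p, q)))
```

In the full method, such a fit would be repeated for every pair of inputs and every layer, and the fixed `h` would be replaced by the bandwidth selected for each neighbor.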

5. Weighted Distance Algorithm For Optimizing The Bandwidth

The weighting function bandwidth ($h$) is a very important parameter in local modelling. If $h$ is infinite, the local modelling becomes global. On the other hand, if $h$ is too small, there may not be an adequate number of data points in the neighborhood for a good prediction.

There are several ways to select this parameter, such as constant bandwidth selection; nearest neighbor bandwidth selection, where $h$ is set to the distance between the query point and the $K$th nearest point; and global bandwidth selection, where $h$ is calculated globally by an optimization process [31].

The constant bandwidth selection method, in which training data with constant size and shape are used, is the easiest and most common way to adjust the radius of the weighting function. However, its performance is unsatisfactory for nonlinear systems, as the density and distribution of data points are unlikely to be identical everywhere in the dataset [32]. In this paper, we use the weighted distance algorithm, which uses the Mahalanobis distance metric, to optimize the bandwidth ($h$) and improve the accuracy of the proposed method.

With the Mahalanobis distance metric, the problems of scale and correlation inherent in the Euclidean distance are no longer an issue. With the Euclidean distance, the set of points at equal distance from a given location is a sphere. The Mahalanobis distance metric stretches this sphere to correct for the respective scales of the different variables.

The standard Mahalanobis distance metric can be defined as:

$$d(x) = \sqrt{(x - \mu)^T S^{-1} (x - \mu)} \tag{9}$$

where $x$ is a vector of data, $\mu$ is the mean and $S^{-1}$ is the inverse covariance matrix.

The Mahalanobis distance between the query point $x_q$ and a data point $x$ is defined as $d_q = \sqrt{(x - x_q)^T S^{-1} (x - x_q)}$, where $x$ belongs to the $K$ nearest neighbors of the query point $x_q$ and $S^{-1}$ is computed after removing the mean from each column. The bandwidth $h_q$ is then a function of $d_q$:

$$h_q = \Theta(d_q) \tag{10}$$

where $d_{min} \le d_q \le d_{max}$, $d_{min}$ is the distance between $x_q$ and the closest neighbor, and $d_{max}$ is the distance between $x_q$ and the farthest neighbor.

According to the LWR method, the point corresponding to $d_q = d_{min}$ is the most important, that is, $h_{max} = \Theta(d_{min}) = 1$, while the point corresponding to $d_q = d_{max}$ is the least important, that is, $h_{min} = \Theta(d_{max}) = \delta$, where $\delta$ is a real constant. This constant is a low-sensitivity parameter. Therefore, after a few trials, we fixed it at 0.01, which gave the best results.

The bandwidth $h_q$ can be selected as a function of $d_q$ as follows [32]:

$$h_q = \Theta(d_q) = a\left(\frac{1 - b\,d_q}{d_q}\right)^2 + c \tag{11}$$

where $a$, $b$ and $c$ are constants. By applying the boundary conditions, we can calculate these constants and get [32]:

$$h_q = (1 - \delta)\left(\frac{d_{min}(d_{max} - d_q)}{d_q (d_{max} - d_{min})}\right)^2 + \delta \tag{12}$$
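The step from Eq. (11) to Eq. (12) can be checked as follows. Two boundary conditions cannot fix three constants on their own, so the choice $b = 1/d_{max}$ below is an assumption: it is the value that makes the squared term vanish at $d_q = d_{max}$ and that reproduces Eq. (12).

```latex
\begin{align*}
b = \frac{1}{d_{max}},\quad \Theta(d_{max}) = \delta
  &\;\Rightarrow\; c = \delta \\
\Theta(d_{min}) = 1
  &\;\Rightarrow\; a\left(\frac{d_{max}-d_{min}}{d_{min}\,d_{max}}\right)^{2} = 1-\delta
  \;\Rightarrow\; a = (1-\delta)\left(\frac{d_{min}\,d_{max}}{d_{max}-d_{min}}\right)^{2} \\
h_q = a\left(\frac{d_{max}-d_q}{d_q\,d_{max}}\right)^{2} + c
  &= (1-\delta)\left(\frac{d_{min}\,(d_{max}-d_q)}{d_q\,(d_{max}-d_{min})}\right)^{2} + \delta
\end{align*}
```

Substituting $d_q = d_{min}$ and $d_q = d_{max}$ into the last line returns $1$ and $\delta$, respectively, as required.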

The Gaussian kernel weighting function used in this paper can then be written as:

$$w = \exp\left(-\frac{d_q^2}{h_q^2}\right) \tag{13}$$
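The bandwidth rule of Eq. (12) and the weight of Eq. (13) can be sketched as below. This is a minimal illustration, not the implementation used in this work: the Mahalanobis distances are assumed to be already computed and strictly positive, the handling of the degenerate case $d_{max} = d_{min}$ is an assumption, and the function names are hypothetical.

```python
import math


def bandwidth(d_q, d_min, d_max, delta=0.01):
    """Eq. (12): bandwidth h_q for a neighbor at distance d_q > 0."""
    if d_max == d_min:
        # Degenerate neighborhood (assumption): all neighbors equally far,
        # so every point gets the maximum bandwidth.
        return 1.0
    ratio = d_min * (d_max - d_q) / (d_q * (d_max - d_min))
    return (1.0 - delta) * ratio ** 2 + delta


def gaussian_weight(d_q, h_q):
    """Eq. (13): w = exp(-d_q^2 / h_q^2)."""
    return math.exp(-(d_q ** 2) / (h_q ** 2))


def neighbor_weights(distances, delta=0.01):
    """Weights of all neighbors of one query point from their distances."""
    d_min, d_max = min(distances), max(distances)
    return [gaussian_weight(d, bandwidth(d, d_min, d_max, delta))
            for d in distances]
```

By construction the closest neighbor receives $h_q = 1$ and the farthest receives $h_q = \delta$, so the weights fall off quickly with distance.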

6. Experimental Results

6.1. Data

To evaluate the performance of the proposed method, it has been tested for wind power prediction using real data from wind farms in Alberta, Canada [33]. Alberta has the highest percentage of total installed wind generation capacity of any province in Canada. There are more than 40 wind projects proposed for future development in Alberta. Alberta includes many wind farms, such as the Ghost Pine wind farm (51 turbines, 81.6 MW total capacity), the Taber wind farm (37 turbines, 81.4 MW total capacity) and the Wintering Hills wind farm (55 turbines, 88 MW total capacity) [34]. The total installed wind power capacity in 2011 was 800 MW. According to the most recent governmental goals for the wind sector, this value will be raised to 893 MW in 2012 [34]. The wind power profile in Alberta, Canada, in January 2011 is shown in Fig. 3.

Figure 2: Flowchart of the proposed method.

6.2. Parameters

To implement a good model, some important parameters have to be chosen. There are two important parameters in the KPCA algorithm, which is used to reconstruct the phase space: the number of principal components ($n_c$) and $w^2$ in the Gaussian kernel function. The optimal values of these parameters, computed using the cross-validation method, are $w^2 = 1.09$ and $n_c = 10$.

In the local prediction model, choosing the neighborhood size ($K$) is a very important step. This parameter is calculated as described in [27], where $k_{max}$ and $\beta$ are fixed for all test cases at 45% of $N$ and 80, respectively.

Figure 3: Wind power profile in Alberta, Canada, in January 2011 (x-axis: day; y-axis: wind power in MW).

Table 1: Comparative RMSE Results

Method        Winter   Spring   Summer   Fall    Average
Persistence   13.71    16.19    14.42    22.99   16.83
SARIMA         6.70     6.59     8.09    13.88    8.82
LRBF           5.03     4.85     4.76     6.97    5.40
LWGMDH         4.01     3.90     3.72     5.32    4.24

Table 2: Improvement of the LWGMDH Over Other Approaches Regarding RMSE

Method        Average RMSE   Improvement
LWGMDH         4.24          --
Persistence   16.83          74.81%
SARIMA         8.82          51.93%
LRBF           5.40          21.48%

6.3. Forecasting accuracy evaluation

For all experiments, we quantified the prediction performance with the root mean square error (RMSE) and normalized mean absolute error (NMAE) criteria, defined as:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{h=1}^{N} \left[\hat{p}_h - p_h\right]^2} \tag{14}$$

$$\mathrm{NMAE} = \frac{1}{N}\sum_{h=1}^{N} \frac{\lvert \hat{p}_h - p_h \rvert}{p_{inst}} \times 100 \tag{15}$$

where $\hat{p}_h$ and $p_h$ are the forecasted and actual wind power at hour $h$, respectively, $p_{inst}$ is the installed wind power capacity and $N$ is the number of forecasted hours.
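The two criteria can be computed as in the short sketch below. It is an illustration only: the function names are hypothetical, and inputs are assumed to be equal-length sequences of hourly values in MW.

```python
import math


def rmse(forecast, actual):
    """Root mean square error, Eq. (14)."""
    n = len(actual)
    return math.sqrt(sum((f - a) ** 2 for f, a in zip(forecast, actual)) / n)


def nmae(forecast, actual, p_inst):
    """Normalized mean absolute error in percent, Eq. (15).

    p_inst is the installed wind power capacity, in the same unit as the data.
    """
    n = len(actual)
    total = sum(abs(f - a) for f, a in zip(forecast, actual))
    return 100.0 * total / (n * p_inst)
```

Because the NMAE is normalized by the installed capacity, it can be compared across systems of different size, whereas the RMSE keeps the unit of the data.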

6.4. Results

The proposed LWGMDH method has been applied to the prediction of the total wind power in Alberta, Canada. The performance of the proposed method is compared with three published approaches employing the same dataset: persistence, seasonal ARIMA (SARIMA) and local radial basis function (LRBF). Historical wind power data are the only inputs for training the proposed method. For the sake of a clear comparison, no exogenous variables are considered.

The proposed LWGMDH method predicts the values of the wind power subseries one day ahead, taking into account the wind power data of the previous 3 months (the first 80% of these data are used for training, while the last 20% are used for validation). The length of the forecast horizon for the Alberta dataset is 24 hours. Four test weeks (Monday to Sunday) corresponding to the four seasons of 2011 are randomly selected for this numerical experiment: the second week of February 2011 as a winter week, the third week of May 2011 as a spring week, the second week of August 2011 as a summer week, and the first week of November 2011 as a fall week.

The error (RMSE and NMAE) of each day dur-

ing each testing week is calculated. Then the aver-

age error of each testing week (Monday to Sunday)

is calculated by averaging the seven error values of

its corresponding forecast days. Finally, the overall

mean performance for the four testing weeks for each

method can be calculated.

Table 1 compares the proposed LWGMDH method with the three other approaches (Persistence, SARIMA and LRBF) regarding the RMSE criterion. These results show that the proposed method outperforms the other methods. Table 2 shows the RMSE improvements of the LWGMDH method over Persistence, SARIMA and LRBF.

Table 3 compares the proposed LWGMDH method with the three other approaches (Persistence, SARIMA and LRBF) regarding the NMAE criterion. These results again show the superiority of the proposed method over the other methods. Table 4 shows the NMAE improvements of the LWGMDH method over Persistence, SARIMA and LRBF.

Table 3: Comparative NMAE Results

Method        Winter   Spring   Summer   Fall    Average
Persistence    6.59     7.66     7.51    11.07    8.21
SARIMA         3.21     3.09     3.84     6.53    4.17
LRBF           2.38     2.31     2.20     3.26    2.54
LWGMDH         1.94     1.85     1.71     2.48    1.99

Table 4: Improvement of the LWGMDH Over Other Approaches Regarding NMAE

Method        Average NMAE   Improvement
LWGMDH         1.99          --
Persistence    8.21          75.76%
SARIMA         4.17          52.28%
LRBF           2.54          21.65%
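The improvement figures in Tables 2 and 4 are consistent with a relative error reduction, that is, the difference between the baseline error and the LWGMDH error divided by the baseline error. The formula is not stated explicitly in the text, so the sketch below is an assumption that happens to reproduce the tabulated values to two decimals; the function name `improvement` is hypothetical.

```python
def improvement(baseline_error, proposed_error):
    """Relative error reduction of the proposed method, in percent."""
    return 100.0 * (baseline_error - proposed_error) / baseline_error


# Average errors copied from Tables 1 and 3.
rmse_avg = {"Persistence": 16.83, "SARIMA": 8.82, "LRBF": 5.40, "LWGMDH": 4.24}
nmae_avg = {"Persistence": 8.21, "SARIMA": 4.17, "LRBF": 2.54, "LWGMDH": 1.99}

rmse_gain = {name: improvement(err, rmse_avg["LWGMDH"])
             for name, err in rmse_avg.items() if name != "LWGMDH"}
nmae_gain = {name: improvement(err, nmae_avg["LWGMDH"])
             for name, err in nmae_avg.items() if name != "LWGMDH"}

for name in rmse_gain:
    print(f"{name}: RMSE {rmse_gain[name]:.2f}%, NMAE {nmae_gain[name]:.2f}%")
```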

Figs. 4 to 7 show the predicted hourly wind power versus the actual wind power for one day (as an example) of each testing week using the proposed LWGMDH method. These results show that the predicted values are very close to the actual values.

The above results indicate that the proposed method is less sensitive to wind power volatility than the other techniques used in the comparison.

To further study the superiority of the LWGMDH method, it is also executed for all 52 weeks of 2011 for the Alberta dataset and compared with the three other approaches (Persistence, SARIMA and LRBF).

The results show that the proposed LWGMDH method improves the RMSE and NMAE for the 52 weeks of 2011 over the Persistence, SARIMA and LRBF methods. Table 5 shows the RMSE and NMAE improvements of the LWGMDH method over Persistence, SARIMA and LRBF. In addition, Fig. 8 compares the LWGMDH method with the Persistence, SARIMA and LRBF methods for each month of 2011 regarding the RMSE criterion. Similar results are obtained using the NMAE criterion. These results show the robustness of the proposed LWGMDH method and its performance over a long run of a complete year.

Figure 4: Forecasted and actual hourly wind power for February 9, 2011 (x-axis: hour; y-axis: wind power in MW).

Figure 5: Forecasted and actual hourly wind power for May 17, 2011 (x-axis: hour; y-axis: wind power in MW).

Table 5: Improvement of the LWGMDH Over Other Methods for All 52 Weeks Of Year 2011

Method        RMSE Improvement   NMAE Improvement
LWGMDH        --                 --
Persistence   73.88%             75.35%
SARIMA        51.19%             51.81%
LRBF          20.29%             21.01%

Figure 6: Forecasted and actual hourly wind power for August 11, 2011 (x-axis: hour; y-axis: wind power in MW).

Figure 7: Forecasted and actual hourly wind power for November 3, 2011 (x-axis: hour; y-axis: wind power in MW).

7. Conclusions

In this paper, we have proposed an LWGMDH method based on KPCA for wind power prediction. In the proposed method, KPCA is used to reconstruct the time series phase space, and the neighboring points of each query point are selected by Euclidean distance. Only these neighboring points are used to train the GMDH, whose coefficient parameters are calculated using weighted least squares (WLS) regression. In addition, the bandwidth of the weighting function, which plays a very important role in local modelling, is optimized by the weighted distance algorithm.

By using KPCA, the drawback of the traditional time series reconstruction techniques is avoided, because the correlation between different features in the reconstructed phase space is decreased. Also, by combining GMDH with the local regression method, the drawbacks of global methods are overcome. In addition, by using WLS, each point in the neighborhood is weighted according to its distance from the current query point, so that points close to the current query point have larger weights than others. Moreover, by using the weighted distance algorithm, the disadvantage of using a fixed value for the bandwidth of the weighting function is overcome. Together, these improve the accuracy of the proposed model.

Figure 8: RMSE Results For The Year Of 2011 (x-axis: month; y-axis: RMSE; series: LWGMDH, LRBF, SARIMA, Persistence).

A real-world dataset has been used to evaluate the performance of the proposed model, which has been compared with the Persistence, SARIMA and LRBF methods. The numerical results show the superiority of the proposed model over the Persistence, SARIMA and LRBF methods on different error measures.

References

[1] M. Negnevitsky, P. Johnson, S. Santoso, Short term wind

power forecasting using hybrid intelligent systems, in:

Proc. of Power Eng. Soc. General Meeting, Tampa, FL,

2007, pp. 1–4.

[2] M. Khalid, A. Savkin, A method for short-term wind

power prediction with multiple observation points, IEEE

Trans. Power Systems 27 (2) (2012) 579–586.

[3] H. P. J. Catalo, V. Mendes, A hybrid PSO ANFIS ap-

proach for short-term wind power prediction in Portugal,

IEEE Trans Sustainable Energy 2 (1) (2011) 50–59.

[4] M. Negnevitsky, P. Mandal, A. K. Srivastava, Machine

learning applications for load, price and wind power pre-

diction in power systems, in: Proc. of 15th Int. Conf. on

Intelligent Syst. Appl. to Power Syst., 2009, pp. 1–6.

[5] L. Wang, L. Dong, Y. Hao, X. Liao, Wind power predic-

tion using wavelet transform and chaotic characteristics,

in: Proc. of World Non-Grid-Connected Wind Power and

Energy Conf., (WNWEC 2009), 2009, pp. 1–5.

[6] L. Ma, S. Y. Luan, C. W. Jiang, H. L. Liu, Y. Zhang,

A review on the forecasting of wind speed and generated

power, Renew. Sust. Energy Rev. 13 (4) (2009) 915–920.

[7] G. Sideratos, N. Hatziargyriou, Using radial basis neural

networks to estimate wind power production, in: Proc. of

Power Eng. Soc. General Meeting, Tampa, FL, 2007, pp.

1–7.

[8] D. F. W. Jiang, Z. Yan, Z. Hu, Wind speed forecasting

using autoregressive moving average/generalized autore-

gressive conditional heteroscedasticity model, European

Transactions on Electrical Power 22 (5) (2012) 662–673.

[9] R. G. Kavasseri, K. Seetharaman, Day-ahead wind speed

forecasting using f-ARIMA models, Renew. Energy 34 (5)

(2009) 1388–1393.

[10] F. K. N. Amjady, H. Zareipour, Wind power prediction by

a new forecast engine composed of modiﬁed hybrid neural

network and enhanced particle swarm optimization, IEEE

Trans. Sustainable Energy 2 (3) (2011) 265–276.

[11] K. Bhaskar, S. Singh, AWNN-assisted wind power fore-

casting using feed-forward neural network, IEEE Trans.

Sustainable Energy 3 (2) (2012) 306–315.

[12] C. W. Potter, M. Negnevitsky, Very short-term wind forecasting for Tasmanian power generation, IEEE Trans. Power Systems 21 (2) (2006) 965–972.

[13] F. K. N. Amjady, H. Zareipour, A new hybrid iterative

method for short-term wind speed forecasting, European

Transactions on Electrical Power 21 (1) (2011) 581–595.

[14] S. Fan, J. R. Liao, R. Yokoyama, L. N. Chen, W. J. Le,

Forecasting the wind generation using a two-stage network

based on meteorological information, IEEE Trans. Energy

Conversion 24 (2) (2009) 474–482.

[15] A. J. Smola, B. Scholkopf, A tutorial on support vector

regression, NeuroCOLT Technical Report NC-TR-98-030,

Royal Holloway College, University of London, 1998.

[16] D. Ying, J. Lu, Q. Li, Short-term wind speed forecasting of

wind farm based on least square-support vector machine,

Power Syst. Technology 32 (15) (2008) 62–66.

[17] M. Zhang, Short-term load forecasting based on support

vector machines regression, in: Proceedings of the Fourth

Int. Conf. on Machine Learning and Cyber., 2005.

[18] A. G. Ivakhnenko, Polynomial theory of complex systems,

IEEE Trans. of Syst., Man and Cyber. SMC-1 (1971) 364–

378.

[19] S. J. Farlow, Self-organizing method in modeling: GMDH

type algorithm, Marcel Dekker Inc., 1984.

[20] D. Srinivasan, Energy demand prediction using GMDH

networks, Neurocomputing 72 (2008) 625–629.

[21] R. E. Abdel-Aal, M. A. Elhadidy, S. M. Shaahid, Modeling

and forecasting the mean hourly wind speed time series us-

ing GMDH-based abductive networks, Renewable Energy

34 (7) (2009) 1686–1699.

[22] E. E. El-Attar, J. Y. Goulermas, Q. H. Wu, Forecasting

electric daily peak load based on local prediction, in: IEEE

Power Engineering Society General Meeting (PESGM09),

2009, pp. 1–6.

[23] F. Takens, Detecting strange attractors in turbulence,

Lect. Notes in Mathematics (Springer Berlin) 898 (1981)

366–381.

[24] D. Tao, X. Hongfei, Chaotic time series prediction based

on radial basis function network, in: Eighth ACIS Int.

Conf. on Software Engin., Artiﬁcial Intelligence, Network-

ing, and Parallel/Distributed Computing, 2007, pp. 595–

599.

[25] L. Cao, K. Chua, W. Chong, H. Lee, Q. Gu, A comparison of PCA, KPCA and ICA for dimensionality reduction in support vector machine, Neurocomputing 55 (2003) 321–336.

[26] F. Chen, C. Han, Time series forecasting based on wavelet

KPCA and support vector machine, in: IEEE Int. Conf.

on Automation and Logistics, 2007, pp. 1487–1491.

[27] E. E. Elattar, J. Y. Goulermas, Q. H. Wu, Electric load

forecasting based on locally weighted support vector re-

gression, IEEE Trans. Syst., Man and Cyber. C, Appl.

and Rev. 40 (4) (2010) 438–447.

[28] E. E. Elattar, J. Y. Goulermas, Q. H. Wu, Integrating KPCA and locally weighted support vector regression for short-term load forecasting, in: the 15th IEEE Mediterranean Electrotechnical Conf. (MELECON 2010), Valletta, Malta, 2010, pp. 1528–1533.

[29] S. Haykin, Neural networks: A comprehensive foundation, Prentice-Hall, Inc., 1999.

[30] A. G. Ivakhnenko, G. A. Ivakhnenko, The review of prob-

lems solved by algorithms of the group method of data

handling (GMDH), Pattern Recognition and Image Anal-

ysis 5 (1995) 527–535.

[31] C. C. Atkeson, A. W. Moore, S. Schaal, Locally weighted

learning, Artiﬁcial Intelligence Review (Special Issue on

Lazy Learning) 11 (1997) 11–73.

[32] H. Wang, C. Cao, H. Leung, An improved locally weighted

regression for a converter re-vanadium prediction model-

ing, in: Proceedings of the 6th World Congress on Intelli-

gent Control and Automation, 2006, pp. 1515–1519.

[33] Alberta electric system operator (AESO), wind power in-

tegration (2012).

URL http://www.aeso.ca/gridoperations/13902.html

[34] Canadian wind power association (CANWEA) (2012).

URL http://www.canwea.ca/farms/wind-farms_e.php
