All content in this area was uploaded by Xiaodong Jia on Aug 06, 2019
Gaussian Process Regression for Numerical Wind Speed Prediction Enhancement
Haoshu Cai^a, Xiaodong Jia^a,*, Jianshe Feng^a, Yuan-Ming Hsu^a, Wenzhe Li^a, Jay Lee^a
^a NSF I/UCR Center for Intelligent Maintenance Systems, Department of Mechanical Engineering, University of Cincinnati, PO Box 210072, Cincinnati, Ohio 45221-0072, USA

Abstract:
This paper studies the application of a Multi-Task Gaussian Process (MTGP) regression model to enhance numerical predictions of wind speed. In the proposed method, a Support Vector Regressor (SVR) is first used to fuse the predictions from Numerical Weather Predictors (NWP). The purpose of this regressor is to map the numerical predictions at coarse geographical nodes to the desired turbine location. The SVR prediction output is then further enhanced by the MTGP regression model. Validation on real-world data from a large-scale off-shore wind farm shows that the proposed method significantly improves the prediction accuracy of the NWP at both short-term horizons (1~6 hours ahead) and long-term horizons (7~24 hours ahead). More importantly, the short-term prediction accuracy after enhancement is comparable to, or even better than, that of cutting-edge statistical models for short-term extrapolation.

Keywords: Wind speed prediction, Multi-Task Gaussian Process, Gaussian Process Regression, Support Vector Machine, Time Series Prediction, Forecasting

Corresponding Author:
Xiaodong Jia; E-mail: jiaxg@mail.uc.edu; Tel: (513)556-3412; Fax: (513)556-3390
Address: 560 Baldwin Hall, University of Cincinnati, PO Box 210072, Cincinnati, OH 45221

1. Introduction
Driven by the demand for renewable energy, large numbers of wind power generators have been erected over the past decade [1-3]. However, one limitation that impedes the further development of wind farms is their high Operation and Maintenance (O&M) cost, which is largely caused by the uncertainty of wind power production. To better adapt to inconsistent wind conditions and reduce costs, wind speed (WS) prediction is identified as one of the key inputs for wind farm power dispatching and maintenance planning. In current practice, both the short-term WS prediction for the future 0-6 hours and the long-term prediction for the future 7-24 hours are important inputs for power grid dispatching, and the power output of each wind turbine is optimized based on the predicted wind conditions [4, 5]. For maintenance planning, the short-term WS prediction within the future 6 hours also serves as a critical input to schedule the maintenance activities for the next day and to minimize the production loss [6-9]. Investigations on advanced analytics for WS prediction therefore hold great economic and academic value.
WS prediction is intrinsically challenging due to the intermittent fluctuations of WS under intricate meteorological conditions. To better predict the WS at different time horizons, short-term and long-term predictions are made by different types of data-driven models. Short-term prediction is largely based on statistical approaches, which extrapolate the WS series by modeling its time evolution. Examples include Auto-Regressive (AR) and Auto-Regressive Moving Average (ARMA) models, Artificial Neural Networks (ANN), and the Kalman Filter (KF) or Unscented Kalman Filter (UKF), which are discussed in [10-13]. To enhance the performance and robustness of these prediction algorithms, various combined models have also been proposed in the literature. Monfared et al. [12] use fuzzy logic and ANN to model WS estimation based on the statistical properties of the input time series. Kani et al. [14] use ANN and Markov chains to capture patterns in WS time series data. Chen et al. [15] integrate Support Vector Regression (SVR) with KF to realize dynamic state estimation. Santamaría-Bonfil et al. [16, 17] use an SVR model whose parameters are tuned by heuristic algorithms such as the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO). For long-term WS prediction, the outputs of Numerical Weather Predictors (NWP) are generally preferred, since the accuracy of statistical models deteriorates quickly as the prediction horizon grows. NWP results are usually given on coarse geographical grids and may exhibit systematic prediction bias over complex terrain. To address these issues, the NWP results are used as reference predictions and regression models are employed to post-process the NWP outputs. For example, KF is explored as a post-processing method in [18] to correct the NWP results and remove systematic bias. Similarly, a practical KF-based methodology is used to improve NWP predictions in [19], and the results are validated on two years of data.
In summary, the related work shows that statistical models for time series extrapolation give rather satisfactory accuracy at short-term horizons (1~6 hours ahead), but their accuracy deteriorates quickly once the prediction horizon exceeds 6 hours. For long-term prediction (7 hours ahead and beyond), NWP predictions are necessary to guarantee the prediction accuracy. However, because NWP results are given on coarse geographical grids and wind dynamics are complex, the direct NWP output is normally biased with respect to the prediction target. This bias is more prominent in complex terrain and dynamic wind environments. Although the KF structure in [18] and [19] mitigates the NWP prediction bias to some extent, it still fails to yield satisfactory results at short-term prediction horizons. Moreover, the KF relies on several prior assumptions: the regression coefficients in the KF need to be known beforehand, the noise term needs to follow a Gaussian distribution, and the dynamic process modeled by the KF must be linear.
To address these challenges and limitations, this work proposes a novel methodology that uses the Multi-Task Gaussian Process (MTGP) to post-process numerical weather predictions. One major contribution of the proposed method is that it not only enhances the prediction accuracy at long-term horizons (7~24 hours ahead) but also significantly improves the accuracy at short-term horizons (1~6 hours ahead). In particular, the short-term prediction accuracy of the proposed method is found to be even better than that of statistical models specifically proposed for short-term extrapolation. In the proposed method, the spatial correlation of wind speed between the NWP prediction grids and the turbine nacelle position is first modeled by SVR. Subsequently, the prediction output of the SVR is further enhanced by MTGP. The superiority of the proposed method is demonstrated by benchmarking against the cutting-edge technique for short-term prediction in [15] and the recent techniques for NWP enhancement in [18, 19]. To the authors' knowledge, this is the first time that MTGP is applied to wind speed prediction, and there are still few prediction methods in the literature that achieve good accuracy at both short-term and long-term prediction horizons.
The rest of this paper is organized as follows. Section 2 introduces the technical background. Section 3 presents the proposed methodology. Section 4 shows the experimental results, the comparison with several benchmarks, and the related discussion. Finally, conclusions and future work are presented in Section 5.
2. Technical Backgrounds
2.1. Standard Gaussian Process Regression
Gaussian Process Regression (GPR) is a non-parametric method that can model arbitrarily complex systems. In many prediction problems, GPR is preferred because it naturally provides a representation of the prediction uncertainty. GPR models a time series using a Gaussian prior that is parameterized by a mean function (MF) and a covariance function (CovF), as described below:

$$ \mathbf{y} = f(\mathbf{x}) \sim \mathcal{N}\big(m(\mathbf{x}),\, k(\mathbf{x}, \mathbf{x}')\big) \tag{1} $$

In Eq. (1), $\mathbf{x}$ and $\mathbf{y}$ denote the input and output in the training dataset, and $f$ is known as the latent variable in the GPR model. In most applications, the mean function $m(\mathbf{x})$ in Eq. (1) is set to 0. The CovF $k(\mathbf{x}, \mathbf{x}')$, which describes the similarity between input data points, is the key ingredient of GPR, since data points with similar inputs are likely to have similar target values [20]. One of the most frequently used kernel functions in the current literature, the squared exponential (SE), is shown below:

$$ k_{SE}(d) = \theta_1^2 \exp\left(-\frac{d^2}{2\theta_2^2}\right) \tag{2} $$

Where $d = \lVert \mathbf{x} - \mathbf{x}' \rVert$ in Eq. (2) is the Euclidean distance between two indices $\mathbf{x}$ and $\mathbf{x}'$. The parameters $\theta_1$ and $\theta_2$ in Eq. (2) are the hyper-parameters that need to be optimized.
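During model training, the negative log marginal likelihood (NLML) in Eq. (3) is minimized so that the hyper-parameters in the kernel matrix can be estimated.

$$ \mathrm{NLML} = -\log p(\mathbf{y} \mid \mathbf{x}, \theta) = \frac{1}{2}\log\left|K + \sigma_n^2 I\right| + \frac{1}{2}\mathbf{y}^T\left(K + \sigma_n^2 I\right)^{-1}\mathbf{y} + \frac{n}{2}\log(2\pi) \tag{3} $$

Here $K = K(\mathbf{x}, \mathbf{x})$ is the kernel matrix of the training inputs, $\sigma_n^2$ is the noise variance, and $n$ is the number of training points. The unknown hyper-parameter vector $\theta$ in Eq. (3) is determined by minimizing the NLML. The optimization problem for parameter estimation is written as:

$$ \hat{\theta} = \arg\min_{\theta}\; -\log p(\mathbf{y} \mid \mathbf{x}, \theta) \tag{4} $$

Since the NLML is differentiable with respect to the hyper-parameters, it can be minimized by off-the-shelf optimization algorithms such as gradient descent. The NLML is in general not convex in the hyper-parameters, so the solution may depend on the initial values.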
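As a minimal sketch of Eq. (3), assuming a zero mean function, 1-D time indices and the SE kernel of Eq. (2), the NLML can be evaluated through a Cholesky factorization. The function names `se_kernel` and `nlml` and the argument names `theta1`, `theta2` and `sigma_n` are illustrative choices and are not taken from the authors' implementation.

```python
import numpy as np


def se_kernel(x1, x2, theta1, theta2):
    """Squared-exponential kernel of Eq. (2) for 1-D inputs."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    d = x1[:, None] - x2[None, :]          # pairwise differences
    return theta1**2 * np.exp(-d**2 / (2.0 * theta2**2))


def nlml(x, y, theta1, theta2, sigma_n):
    """Negative log marginal likelihood of Eq. (3), zero mean assumed."""
    y = np.asarray(y, dtype=float)
    n = y.size
    K = se_kernel(x, x, theta1, theta2) + sigma_n**2 * np.eye(n)
    L = np.linalg.cholesky(K)              # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))   # K^{-1} y
    log_det = 2.0 * np.sum(np.log(np.diag(L)))            # log|K|
    return 0.5 * log_det + 0.5 * (y @ alpha) + 0.5 * n * np.log(2.0 * np.pi)
```

In practice this function would be handed to a gradient-based optimizer and restarted from several initial values, because the objective is generally not convex in the hyper-parameters.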
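After model training, the predictive distribution of GPR at the testing data point $\mathbf{x}^*$ can be described as:

$$ \mathbf{f}^* \mid \mathbf{x}, \mathbf{y}, \mathbf{x}^* \sim \mathcal{N}\big(\bar{\mathbf{f}}^*, \operatorname{cov}(\mathbf{f}^*)\big) \tag{5} $$

$$ \bar{\mathbf{f}}^* = m(\mathbf{x}^*) + K(\mathbf{x}^*, \mathbf{x})\left(K(\mathbf{x}, \mathbf{x}) + \sigma_n^2 I\right)^{-1}\big(\mathbf{y} - m(\mathbf{x})\big) \tag{6} $$

$$ \operatorname{cov}(\mathbf{f}^*) = K(\mathbf{x}^*, \mathbf{x}^*) - K(\mathbf{x}^*, \mathbf{x})\left(K(\mathbf{x}, \mathbf{x}) + \sigma_n^2 I\right)^{-1} K(\mathbf{x}, \mathbf{x}^*) \tag{7} $$

Where $\bar{\mathbf{f}}^*$ is the prediction result and $\operatorname{cov}(\mathbf{f}^*)$ describes the prediction uncertainty.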
The mean of the GPR predictive distribution in Eq. (6) is a linear combination of the target variable in the training set when the mean function $m(\mathbf{x}) = 0$. Under this condition, the mean of the predictive distribution can be rewritten as:

$$ \bar{\mathbf{f}}^* = K(\mathbf{x}^*, \mathbf{x})\left(K(\mathbf{x}, \mathbf{x}) + \sigma_n^2 I\right)^{-1}\mathbf{y} = W_{GPR}\,\mathbf{y} \tag{8} $$

Where $W_{GPR}$ is the weighting matrix of the standard GPR.
The mean function $m(\mathbf{x})$ of GPR is normally set to 0 for trend-free time series. GPR is a non-parametric approach that can model time series or systems of arbitrary complexity when provided with sufficient data. A non-zero mean function is normally employed when a clear trend is observed in the time series or when there is a sound assumption about the trend term. For example, in [21] an exponential trend term is employed as the mean function to better extrapolate the long-term degradation trajectory of a battery cell. In the present study, the mean function is set to 0, since no consistent trend of wind speed was observed over the prediction horizon.
When using GPR to make predictions, there are several general steps to follow:
Step 1: Given the training data $\mathbf{x}$ and $\mathbf{y}$, the hyper-parameters of the GPR model are obtained by minimizing the NLML in Eq. (3);
Step 2: Given the testing time index $\mathbf{x}^*$, the predictive distribution of GPR is obtained by using Eq. (5)~Eq. (7). The hyper-parameters in Eq. (5)~Eq. (7) are obtained from Step 1;
Step 3: The mean of the predictive distribution in Eq. (6) is employed as the predicted value, and the confidence interval is derived from the covariance in Eq. (7). In this study, the error bound is set as $\bar{\mathbf{f}}^* \pm 2\sqrt{\operatorname{cov}(\mathbf{f}^*)}$.
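Steps 2 and 3 can be sketched as follows, assuming a zero mean function, the SE kernel of Eq. (2) and hyper-parameters that have already been estimated in Step 1. The helper names `se_kernel` and `gpr_predict` are hypothetical, and the 2-sigma bound is computed from the diagonal of the covariance in Eq. (7).

```python
import numpy as np


def se_kernel(x1, x2, theta1, theta2):
    """Squared-exponential kernel of Eq. (2) for 1-D inputs."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    d = x1[:, None] - x2[None, :]
    return theta1**2 * np.exp(-d**2 / (2.0 * theta2**2))


def gpr_predict(x, y, x_star, theta1, theta2, sigma_n):
    """Predictive mean (Eq. 8) and 2-sigma bounds (from Eq. 7), zero mean assumed."""
    y = np.asarray(y, dtype=float)
    K = se_kernel(x, x, theta1, theta2) + sigma_n**2 * np.eye(y.size)
    K_s = se_kernel(x_star, x, theta1, theta2)        # K(x*, x)
    K_ss = se_kernel(x_star, x_star, theta1, theta2)  # K(x*, x*)
    # W_GPR = K(x*, x) (K + sigma_n^2 I)^{-1}; K is symmetric, so transpose the solve
    W = np.linalg.solve(K, K_s.T).T
    mean = W @ y
    cov = K_ss - W @ K_s.T
    std = np.sqrt(np.clip(np.diag(cov), 0.0, None))   # guard against tiny negatives
    return mean, mean - 2.0 * std, mean + 2.0 * std
```

Far from the training data the mean falls back to the zero prior and the bound widens to about two prior standard deviations, which is the behaviour that makes pure extrapolation unreliable at long horizons.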
2.2. Multi-Task Gaussian Process Regression
MTGP is an extension of the GPR model that deals with the situation in which the GPR model has multiple outputs, and it is described in [22] as a special case of standard GPR. MTGP was originally proposed in [23], and its superiority in multivariate physiological time-series analysis was demonstrated in [24]. A more recent study on using MTGP for battery capacity prediction also reports improved results [21]. In the setting of WS prediction, the input of MTGP is the time indices of the WS series, and the output of MTGP is multiple WS series, including the historical WS at the turbine nacelle and the reference series from the NWP model.
The key in MTGP is to capture the correlation across multiple outputs by using the covariance kernel function below:

$$ k_{MTGP}(x, x', l, l') = k_c(l, l') \times k_t(x, x') \tag{9} $$

Where $l, l'$ represent the indices of the series and there are $m$ series in total. $k_c$ and $k_t$ in Eq. (9) model the correlation across the multiple outputs and the covariance within one series, respectively. $x, x'$ denote the time indices for tasks $l$ and $l'$. Based on Eq. (9), the kernel matrix of MTGP can be constructed as:

$$ K_{MTGP}(X, L, \theta_c, \theta_t) = K_C(L, \theta_c) \otimes K_t(X, \theta_t) \tag{10} $$

Where $\otimes$ is the Kronecker product, and $\theta_c$ and $\theta_t$ are the hyper-parameters in the kernel matrix. In Eq. (10), $K_C$ is an $m \times m$ similarity matrix that models the correlation or similarity across multiple series. $K_t$ is a symmetric $n_l \times n_l$ matrix that models the covariance across the time indices of the $l$-th series, where $n_l$ represents the number of time indices of the $l$-th series. Therefore, $K_{MTGP}$ is a matrix that captures both the covariance over time and the similarity across multiple output series. To ensure that $K_C$ is positive semi-definite, $K_C$ is constructed based on the Cholesky decomposition as below:

$$ K_C = L L^T = \begin{bmatrix} k_c(1,1) & k_c(1,2) & \cdots & k_c(1,m) \\ k_c(2,1) & k_c(2,2) & \cdots & k_c(2,m) \\ \vdots & \vdots & \ddots & \vdots \\ k_c(m,1) & k_c(m,2) & \cdots & k_c(m,m) \end{bmatrix} \tag{11} $$

Where $L$ in Eq. (11) is a lower triangular matrix. The elements of $K_C$ represent the similarity level between each pair of series. According to the description in [23, 24], these elements of $K_C$ can be interpreted as correlation coefficients.
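The construction in Eq. (10) and Eq. (11) can be illustrated with a short sketch. It assumes that all $m$ series share the same time indices, so that the Kronecker product applies directly, and that $K_t$ is the SE kernel of Eq. (2). The function names `task_matrix` and `mtgp_kernel` and the flat parameter vector `chol_params` are hypothetical.

```python
import numpy as np


def se_kernel(x1, x2, theta1, theta2):
    """Squared-exponential kernel of Eq. (2) for 1-D inputs."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    d = x1[:, None] - x2[None, :]
    return theta1**2 * np.exp(-d**2 / (2.0 * theta2**2))


def task_matrix(chol_params, m):
    """Build K_C = L L^T (Eq. 11) from the m(m+1)/2 free entries of L.

    Entries are filled row by row into the lower triangle of L.
    """
    L = np.zeros((m, m))
    L[np.tril_indices(m)] = chol_params
    return L @ L.T


def mtgp_kernel(x, chol_params, m, theta1, theta2):
    """Kernel matrix of Eq. (10) when every series uses the same time indices x."""
    K_c = task_matrix(chol_params, m)          # m x m similarity across series
    K_t = se_kernel(x, x, theta1, theta2)      # n x n covariance over time
    return np.kron(K_c, K_t)                   # (m n) x (m n)
```

Parameterizing through the Cholesky factor keeps $K_C$ positive semi-definite for any real parameter values, so the optimizer can search the entries of $L$ without constraints.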
3. Methodology
3.1. Using MTGP for NWP enhancement
When using MTGP for time series prediction, all the training and testing procedures are the same as for traditional GPR except for the construction of the kernel matrix. To better illustrate the kernel matrix in MTGP, we use the case of two output tasks as an example. In this scenario, the MTGP model can be constructed as:

$$ \begin{bmatrix} \mathbf{y}_r \\ \mathbf{y}_p \end{bmatrix} \sim \mathcal{N}\big(\mathbf{0},\, K_{MTGP}(\mathbf{x}, \mathbf{x})\big), \quad K_{MTGP}(\mathbf{x}, \mathbf{x}) = \begin{bmatrix} \rho_{rr} K(\mathbf{x}_r, \mathbf{x}_r) & \rho_{rp} K(\mathbf{x}_r, \mathbf{x}_p) \\ \rho_{pr} K(\mathbf{x}_p, \mathbf{x}_r) & \rho_{pp} K(\mathbf{x}_p, \mathbf{x}_p) \end{bmatrix} \tag{12} $$

Where $\mathbf{y}_r \in \mathbb{R}^{r \times 1}$ and $\mathbf{y}_p \in \mathbb{R}^{p \times 1}$ are two outputs with different dimensionality, and $\mathbf{x}_r$ and $\mathbf{x}_p$ are the model inputs for the two output tasks. The dimensionality of each input point in $\mathbf{x}_r$ and $\mathbf{x}_p$ should be the same. In the 1D case, or time series prediction, $\mathbf{x}_r$ and $\mathbf{x}_p$ are the time indices of the two time series. $\rho_{rr}$, $\rho_{rp}$, $\rho_{pr}$ and $\rho_{pp}$ denote the correlation coefficients of the two output series. These four coefficients are treated as part of the hyper-parameters in MTGP and are obtained by minimizing the NLML during model training. It is also important to note that $\rho_{rp} = \rho_{pr}$ always holds due to Eq. (11).
The application of MTGP to NWP enhancement is illustrated in Fig. 1. In Fig. 1, $\mathbf{y}_r$ serves as the reference wind speed series given by the NWP, and its time indices are written as $\mathbf{x}_r = [t-k, \ldots, t, \ldots, t+24]$. $\mathbf{y}_p$ in Eq. (12) corresponds to the measured wind speed series at the turbine nacelle, which is going to be extrapolated to the future 24 hours. The time indices of $\mathbf{y}_p$ are $\mathbf{x}_p = [t-k, \ldots, t]$, where $k$ is the length of the time window for model construction. By using $\mathbf{y}_r$, $\mathbf{y}_p$, $\mathbf{x}_r$ and $\mathbf{x}_p$ as training data, an MTGP model is constructed and a better prediction at the time indices $\mathbf{x}_p^* = [t+1, \ldots, t+24]$ can be obtained. The predictive distribution is obtained following Eq. (5)~Eq. (7).

Fig. 1 MTGP Enhancement for Reference-based Prediction

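Like the standard GPR, the prediction results of MTGP can also be interpreted as a linear combination of the historical observations, which can be written as:

$$ \bar{\mathbf{f}}_p^* = K_{MTGP}(\mathbf{x}_p^*, \mathbf{x})\left(K_{MTGP}(\mathbf{x}, \mathbf{x}) + \sigma_n^2 I\right)^{-1}\begin{bmatrix} \mathbf{y}_r \\ \mathbf{y}_p \end{bmatrix} = W_{MTGP}\begin{bmatrix} \mathbf{y}_r \\ \mathbf{y}_p \end{bmatrix} \tag{13} $$

Where

$$ K_{MTGP}(\mathbf{x}, \mathbf{x}) = \begin{bmatrix} \rho_{rr} K(\mathbf{x}_r, \mathbf{x}_r) & \rho_{rp} K(\mathbf{x}_r, \mathbf{x}_p) \\ \rho_{pr} K(\mathbf{x}_p, \mathbf{x}_r) & \rho_{pp} K(\mathbf{x}_p, \mathbf{x}_p) \end{bmatrix}_{(r+p) \times (r+p)} $$

$$ K_{MTGP}(\mathbf{x}_p^*, \mathbf{x}) = \begin{bmatrix} \rho_{pr} K(\mathbf{x}_p^*, \mathbf{x}_r) & \rho_{pp} K(\mathbf{x}_p^*, \mathbf{x}_p) \end{bmatrix}_{24 \times (r+p)} $$

From Eq. (13), one can see that the prediction output of the MTGP model is simply a linear combination of $\mathbf{y}_r$ and $\mathbf{y}_p$. Consequently, MTGP is employed as a novel approach to further enhance the prediction accuracy of NWP in this investigation.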
Because the prediction result of MTGP is a linear combination of $\mathbf{y}_r$ and $\mathbf{y}_p$, the MTGP model enhances the NWP results in both short-term and long-term accuracy. The presence of the $\rho_{pp} K(\mathbf{x}_p^*, \mathbf{x}_p)$ term in the kernel matrix mainly contributes to the short-term prediction accuracy, since $\rho_{pp}$ is more dominant than $\rho_{pr}$ at short-term horizons. The prediction uses $\mathbf{y}_r$ as a reference in the long-term extrapolation, which is why the prediction accuracy does not deteriorate significantly at long-term horizons. More importantly, the algorithm achieves enhanced prediction accuracy at both short-term and long-term horizons mainly because MTGP automatically decides the optimal trade-off between the NWP output and the extrapolation of the wind speed measured at the turbine nacelle. This is achieved by obtaining the optimal $\rho_{pp}$ and $\rho_{pr}$ in the model training phase. Therefore, the model is expected to be superior to NWP at long-term horizons and superior to statistical models at short-term horizons.
To summarize the discussion, the algorithm proposed for NWP enhancement based on MTGP is described below. It is important to note that the proposed method requires re-training the MTGP model at each prediction step. This implies that the proposed model is not subject to the bias caused by seasonal effects, since a new model is derived at each prediction step using only data from the recent past.
Algorithm 1. Using MTGP for NWP output enhancement
At a certain time step $t$:
Step 1: Initialize $\mathbf{y}_r$, $\mathbf{y}_p$, $\mathbf{x}_r$, $\mathbf{x}_p$, $\mathbf{x}_p^*$ and the time window length $k$ as: $\mathbf{x}_r = [t-k, \ldots, t, \ldots, t+24]$, $\mathbf{x}_p^* = [t+1, \ldots, t+24]$, $\mathbf{x}_p = [t-k, \ldots, t]$. $\mathbf{y}_r \in \mathbb{R}^{r \times 1}$ is the NWP prediction at the times $\mathbf{x}_r$, and $\mathbf{y}_p \in \mathbb{R}^{p \times 1}$ is the measured wind speed at the turbine nacelle during the times $\mathbf{x}_p$.
Step 2: Construct the MTGP as in Eq. (12) and obtain the optimized hyper-parameters $\rho_{rr}$, $\rho_{rp}$, $\rho_{pp}$ and the other hyper-parameters in the kernel function. The optimal hyper-parameters are obtained by minimizing the NLML.
Step 3: Obtain the predictive distribution of MTGP. The prediction mean is given by Eq. (13).
Step 4: Propagate to time $t+1$ and repeat Step 1~Step 3.
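Steps 1 and 3 of Algorithm 1 can be sketched for one time step as follows. This is only an illustration under stated assumptions: the hyper-parameters are passed in as fixed values instead of being optimized as in Step 2, the SE kernel of Eq. (2) is used for $K$, and the names `mtgp_enhance` and `chol` (the three free entries of the 2 x 2 Cholesky factor of Eq. (11)) are hypothetical.

```python
import numpy as np


def se_kernel(x1, x2, theta1, theta2):
    """Squared-exponential kernel of Eq. (2) for 1-D inputs."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    d = x1[:, None] - x2[None, :]
    return theta1**2 * np.exp(-d**2 / (2.0 * theta2**2))


def mtgp_enhance(y_r, y_p, t, k, chol, theta1, theta2, sigma_n):
    """Prediction mean of Eq. (13) for horizons t+1..t+24 (fixed hyper-parameters)."""
    x_r = np.arange(t - k, t + 25, dtype=float)     # NWP reference: history + next 24 h
    x_p = np.arange(t - k, t + 1, dtype=float)      # nacelle measurements: history only
    x_star = np.arange(t + 1, t + 25, dtype=float)  # time indices to predict
    y_r = np.asarray(y_r, dtype=float)
    y_p = np.asarray(y_p, dtype=float)
    if y_r.size != x_r.size or y_p.size != x_p.size:
        raise ValueError("y_r needs k+25 values and y_p needs k+1 values")

    # rho = L L^T gives [[rho_rr, rho_rp], [rho_pr, rho_pp]] with rho_rp == rho_pr
    L = np.array([[chol[0], 0.0], [chol[1], chol[2]]])
    rho = L @ L.T

    def kern(a, b):
        return se_kernel(a, b, theta1, theta2)

    # Block kernel matrix of Eq. (12), size (r+p) x (r+p)
    K = np.block([
        [rho[0, 0] * kern(x_r, x_r), rho[0, 1] * kern(x_r, x_p)],
        [rho[1, 0] * kern(x_p, x_r), rho[1, 1] * kern(x_p, x_p)],
    ])
    # Cross-covariance between the prediction points and the training data, 24 x (r+p)
    K_star = np.hstack([rho[1, 0] * kern(x_star, x_r), rho[1, 1] * kern(x_star, x_p)])

    y = np.concatenate([y_r, y_p])
    return K_star @ np.linalg.solve(K + sigma_n**2 * np.eye(y.size), y)
```

A rolling forecast would call this function once per hour with the latest window, after re-estimating the hyper-parameters by minimizing the NLML on that window.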
3.2. The proposed methodology and implementation
The goal of this research is to predict the WS in the future 24 hours. The given data involves the WS at the turbine nacelle collected by the Supervisory Control and Data Acquisition (SCADA) system and the weather forecast data from the NWP model. The NWP data is given as the average WS within 1 hour at each forecast grid node at three different heights: 10 m, 50 m and 90 m. In this research, the NWP data issued today and one day before is used to build the model. Therefore, at any specific point in time, the available data includes the SCADA data up to the present and the NWP data updated today and 1 day ago. Spatially, the NWP data includes the WS prediction at the 9 grid nodes in Fig. 2, which has a rather coarse spatial resolution of roughly 9.5 kilometers. The Wind Turbine (WT) is located within the area covered by these grid nodes, as shown in Fig. 2, and the heights of the turbines remain unknown.

Fig. 2 WT Position and Weather Forecast Grid Position.

Fig. 3 Statement of the WS prediction problem

The proposed methodology is illustrated in Fig. 3. In Fig. 3, the time resolution of the NWP data is 1 hour. As mentioned before, the NWP data reported today and one day before is used. Therefore, the dimensionality of the numerical weather predictions is 9 grid nodes × 3 heights × 2 forecast issues = 54. As shown in Fig. 3, a time window of length $k$ is needed to establish the SVR model between all 54 numerical wind predictions and the measured wind speed series. The purpose of this SVR model is to give a reference prediction series $\mathbf{y}_r$ for the future 24 hours. This prediction is based merely on the NWP outputs and is subsequently enhanced by the MTGP model. In the MTGP enhancement step, the detailed procedures for model construction and prediction are described in Algorithm 1. It is important to highlight that the SVR model and the MTGP model are re-trained at each time step using the historical data. Therefore, the seasonal effect of the wind speed distribution is not a concern in the present model, because the prediction is based on the prediction output of the NWP and the extrapolation of the wind series in the recent past.

To evaluate the performance of the proposed method, the Root Mean Square Error (RMSE) and the Mean Absolute Percentage Error (MAPE) are used as criteria for the prediction accuracy. Suppose the WS data from the current time point to the next 24 hours is $\mathbf{x}'_i = [x'_{i,1}, \ldots, x'_{i,24}]$ and the predicted WS data is $\mathbf{x}^*_i = [x^*_{i,1}, \ldots, x^*_{i,24}]$. The RMSE and MAPE at a certain prediction horizon $h$ are calculated as follows:

$$ RMSE_h = \sqrt{\frac{\sum_{i=1}^{m}\left(x^*_{i,h} - x'_{i,h}\right)^2}{m}} \tag{14} $$

$$ MAPE_h = \frac{1}{m}\sum_{i=1}^{m}\frac{\left|x^*_{i,h} - x'_{i,h}\right|}{x'_{i,h}} \tag{15} $$

Where $m$ represents the number of prediction steps and $h$ denotes the prediction horizon.
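Eq. (14) and Eq. (15) translate directly into code. The sketch below assumes that `predicted` and `actual` are equally long sequences holding the values for one fixed horizon $h$ over $m$ prediction steps, and that all actual wind speeds are strictly positive. The function names are illustrative.

```python
import math


def rmse_at_horizon(predicted, actual):
    """RMSE of Eq. (14) over m prediction steps at one horizon."""
    m = len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / m)


def mape_at_horizon(predicted, actual):
    """MAPE of Eq. (15) as a fraction; multiply by 100 to express it in percent."""
    m = len(actual)
    return sum(abs(p - a) / a for p, a in zip(predicted, actual)) / m
```

Because MAPE divides by the measured wind speed, steps with very low wind inflate it strongly, which is one reason to report RMSE alongside it.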
To better demonstrate the improvements made by the proposed methodology, the methods listed in Table 1 are benchmarked. In Table 1, the Best NWP model uses the RMSE to select the single series of NWP data with the highest forecast accuracy. The SVR-NWP model uses an SVR model to fuse the NWP data for prediction, as in Step 1 of Fig. 3. SVR+UKF focuses on short-term prediction and is a dynamic methodology proposed in [15].
To compare with other peer algorithms for NWP enhancement, the KF structure discussed in [18, 19] is implemented and benchmarked. In those works, the bias of the NWP is modeled as a third-order polynomial:

$$ e_t = c_{0,t} + c_{1,t}\,x^*_t + c_{2,t}\,x^{*2}_t + c_{3,t}\,x^{*3}_t + v_t \tag{16} $$

Where $e_t$ is the scalar bias of the NWP at time $t$, $c_{0,t}$, $c_{1,t}$, $c_{2,t}$ and $c_{3,t}$ are the polynomial coefficients, and $v_t$ is a Gaussian noise term. The above equation is implemented in a KF structure as shown below:

$$ \mathbf{c}_t = \mathbf{c}_{t-1} + \mathbf{w}_t, \qquad e_t = \left[1,\, x^*_t,\, x^{*2}_t,\, x^{*3}_t\right]\mathbf{c}_t + v_t \tag{17} $$

Where $\mathbf{c}_t = \left[c_{0,t}, c_{1,t}, c_{2,t}, c_{3,t}\right]^T$.
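A minimal sketch of the benchmark filter in Eq. (16) and Eq. (17) is given below. The process noise `q`, the measurement noise `r`, the zero initial state and the identity initial covariance are assumptions made for illustration; the source does not state these values, and the function names are hypothetical.

```python
import numpy as np


def kf_bias_filter(x_pred, bias_obs, q=1e-4, r=0.1):
    """Track the polynomial coefficients c_t of Eq. (17) with a Kalman filter.

    x_pred   : NWP wind speed predictions x*_t
    bias_obs : observed NWP bias e_t at the same time steps
    q, r     : assumed process and measurement noise variances
    """
    c = np.zeros(4)            # state estimate [c0, c1, c2, c3]
    P = np.eye(4)              # state covariance
    Q = q * np.eye(4)
    for x, e in zip(x_pred, bias_obs):
        H = np.array([1.0, x, x**2, x**3])   # observation row of Eq. (17)
        P = P + Q                            # time update of the random walk
        S = H @ P @ H + r                    # innovation variance (scalar)
        K = P @ H / S                        # Kalman gain
        c = c + K * (e - H @ c)              # measurement update of the state
        P = P - np.outer(K, H @ P)           # (I - K H) P
    return c, P


def predict_bias(c, x):
    """Evaluate the polynomial bias model of Eq. (16) without the noise term."""
    return float(np.array([1.0, x, x**2, x**3]) @ c)
```

The predicted bias is then combined with the raw NWP value, with the sign depending on how the bias is defined, to obtain the corrected forecast. Rescaling the wind speed before forming the cubic terms may help keep the filter well conditioned.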
To summarize, the benchmarking algorithms are tabulated in Table 1. The prediction accuracies of the algorithms in Table 1 are benchmarked on real-world data in the next section.

Table 1 List of benchmarking algorithms

| Short-term prediction (1~6 hours ahead) | Long-term prediction (7~24 hours ahead) | Short and long-term prediction (1~24 hours ahead) |
|---|---|---|
| SVR+UKF [15] | Best NWP+KF [19]; SVR-NWP+KF [19]; SVR-NWP; Best NWP | SVR-NWP+MTGP (the proposed method) |

4. Results and Discussions
The performance of the proposed method is validated on off-shore wind farm data collected within half a year. The dataset includes the WS series at the turbine nacelle collected by the SCADA system and the NWP forecast data described above. For the KF-based methods in [18], the data from January and February 2017 is used for cross-validation and model training, while the data from March and September 2017 is used for model performance testing. For the other methods, which do not need pre-training, the data from March and September is used for model testing and result benchmarking. At the beginning of the analysis, all the time series from the NWP and the anemometer measurements at the turbine nacelle are synchronized and pre-processed to a 1-hour interval.
To demonstrate the advantages of the proposed methodology, the prediction model is validated on two different turbines, and the results are benchmarked in Table 2 ~ Table 5. Table 2 and Table 3 show the prediction results of WT #1 in April and September. Table 4 and Table 5 show the prediction results of WT #2 during the same months. The best prediction accuracies at each prediction horizon are highlighted in bold characters. In general, the proposed model SVR-NWP+MTGP yields the best results compared with all the others.
Comparing the prediction accuracies of the different methods at different prediction horizons, the following findings are highlighted. (1) The SVR+UKF model proposed in [15] demonstrates improved prediction accuracy at short-term horizons (1~6 hours ahead). However, its extrapolation accuracy deteriorates very quickly at long-term horizons (7~24 hours ahead). (2) Best NWP and SVR-NWP demonstrate good accuracy at long-term horizons. However, their short-term accuracies are not comparable to those of the statistical models. (3) Best NWP+KF and SVR-NWP+KF demonstrate enhanced prediction accuracy compared with Best NWP and SVR-NWP at short-term horizons, especially 1~3 hours ahead. This finding indicates that the KF structure in [18, 19] can effectively enhance the prediction accuracy as expected. However, the performance of this post-processing step at long-term prediction horizons is quite unstable, and on some occasions it makes the prediction accuracy even worse. (4) The proposed method, SVR-NWP+MTGP, demonstrates excellent accuracy at both short-term and long-term horizons. Its prediction accuracies at short-term horizons are comparable to those of the SVR+UKF model and even better on some occasions. More importantly, the long-term prediction accuracy of the proposed method is better than that of Best NWP, SVR-NWP, Best NWP+KF and SVR-NWP+KF. In addition, the improvements made by the proposed method are consistent over different turbine locations and different months.
The superiority of the proposed method is further illustrated in Fig. 4 and Fig. 5 by comparing the RMSE and MAPE of the different methods. Fig. 4 and Fig. 5 show the validation results on two different wind turbines in two different months. At both short-term and long-term prediction horizons, the proposed method gives the best accuracy in terms of both RMSE and MAPE. It is highlighted that the short-term prediction performance of the proposed method is comparable with the state-of-the-art approach SVR+UKF in [15], and that the long-term prediction accuracy of the proposed method is superior to the KF structure presented in [19] for NWP post-processing.
Table 2 Turbine #1 Comparison Result of April (columns give the prediction horizon in hours ahead)

| Methods | 1-h MAPE | 1-h RMSE | 4-h MAPE | 4-h RMSE | 6-h MAPE | 6-h RMSE | 12-h MAPE | 12-h RMSE | 18-h MAPE | 18-h RMSE | 24-h MAPE | 24-h RMSE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SVR+UKF [15] | 21.578 | 1.320 | 42.362 | 2.960 | 46.781 | 3.504 | 56.992 | 4.516 | 57.390 | 4.780 | 59.211 | 5.192 |
| Best NWP | 50.810 | 3.729 | 57.339 | 3.951 | 59.534 | 4.009 | 60.912 | 3.993 | 57.215 | 3.874 | 48.689 | 3.551 |
| SVR-NWP | 32.462 | 2.404 | 45.674 | 3.302 | 49.068 | 3.571 | 52.059 | 3.994 | 52.630 | 4.225 | 55.041 | 4.404 |
| SVR-NWP+KF [19] | 30.179 | 2.069 | 43.264 | 2.747 | 46.452 | 2.896 | 50.164 | 3.109 | 51.725 | 3.274 | 55.526 | 3.346 |
| Best NWP+KF [19] | 37.949 | 2.689 | 49.353 | 3.450 | 50.793 | 3.364 | 56.417 | 3.834 | 58.062 | 3.915 | 58.855 | 3.821 |
| SVR-NWP+MTGP (the proposed method) | 18.891 | 1.281 | 41.312 | 2.481 | 44.645 | 2.669 | 49.831 | 2.872 | 52.026 | 2.991 | 51.488 | 3.013 |

Table 3 Turbine #1 Comparison Result of September (columns give the prediction horizon in hours ahead)

| Methods | 1-h MAPE | 1-h RMSE | 4-h MAPE | 4-h RMSE | 6-h MAPE | 6-h RMSE | 12-h MAPE | 12-h RMSE | 18-h MAPE | 18-h RMSE | 24-h MAPE | 24-h RMSE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SVR+UKF [15] | 24.493 | 1.437 | 40.850 | 2.656 | 46.770 | 3.342 | 57.907 | 4.440 | 64.258 | 5.051 | 69.382 | 5.488 |
| Best NWP | 61.311 | 3.448 | 64.257 | 3.512 | 63.280 | 3.488 | 60.932 | 3.474 | 64.637 | 3.561 | 59.814 | 3.332 |
| SVR-NWP | 34.850 | 2.267 | 43.685 | 2.709 | 45.656 | 2.807 | 48.943 | 2.959 | 49.246 | 2.999 | 50.100 | 3.077 |
| SVR-NWP+KF [19] | 35.316 | 2.585 | 46.288 | 2.826 | 49.599 | 2.940 | 56.513 | 3.206 | 58.604 | 3.312 | 58.544 | 3.387 |
| Best NWP+KF [19] | 37.598 | 2.670 | 44.393 | 3.147 | 46.075 | 3.321 | 48.928 | 3.365 | 48.732 | 3.386 | 53.402 | 3.593 |
| SVR-NWP+MTGP (the proposed method) | 22.588 | 1.442 | 42.071 | 2.577 | 45.660 | 2.780 | 48.101 | 2.853 | 48.635 | 2.919 | 50.383 | 3.109 |

Table 4 Turbine #2 Comparison Result of April (columns give the prediction horizon in hours ahead)

| Methods | 1-h MAPE | 1-h RMSE | 4-h MAPE | 4-h RMSE | 6-h MAPE | 6-h RMSE | 12-h MAPE | 12-h RMSE | 18-h MAPE | 18-h RMSE | 24-h MAPE | 24-h RMSE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SVR+UKF [15] | 20.522 | 1.284 | 42.853 | 2.597 | 48.267 | 3.030 | 58.061 | 3.993 | 62.880 | 4.487 | 60.653 | 4.612 |
| Best NWP | 54.510 | 3.511 | 59.528 | 3.663 | 61.808 | 3.727 | 64.229 | 3.722 | 62.155 | 3.659 | 52.734 | 3.380 |
| SVR-NWP | 32.938 | 2.256 | 43.780 | 3.067 | 46.679 | 3.328 | 50.568 | 3.682 | 50.279 | 3.824 | 52.941 | 3.978 |
| SVR-NWP+KF [19] | 31.651 | 1.900 | 43.678 | 2.577 | 46.279 | 2.750 | 49.773 | 2.962 | 51.621 | 3.057 | 53.239 | 3.115 |
| Best NWP+KF [19] | 36.036 | 2.563 | 49.136 | 3.424 | 50.643 | 3.379 | 54.012 | 3.545 | 54.849 | 3.508 | 58.816 | 3.899 |
| SVR-NWP+MTGP (the proposed method) | 18.317 | 1.240 | 38.614 | 2.274 | 41.994 | 2.466 | 46.547 | 2.656 | 47.635 | 2.755 | 48.145 | 2.797 |

Table 5 Turbine #2 Comparison Result of September (columns give the prediction horizon in hours ahead)

| Methods | 1-h MAPE | 1-h RMSE | 4-h MAPE | 4-h RMSE | 6-h MAPE | 6-h RMSE | 12-h MAPE | 12-h RMSE | 18-h MAPE | 18-h RMSE | 24-h MAPE | 24-h RMSE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SVR+UKF [15] | 23.349 | 1.387 | 41.875 | 2.618 | 45.514 | 3.124 | 55.323 | 3.994 | 65.124 | 4.764 | 71.706 | 5.212 |
| Best NWP | 59.158 | 3.441 | 60.907 | 3.460 | 62.713 | 3.609 | 60.261 | 3.532 | 64.388 | 3.649 | 59.738 | 3.461 |
| SVR-NWP | 33.385 | 2.343 | 41.175 | 2.757 | 43.057 | 2.855 | 45.717 | 2.954 | 46.806 | 3.018 | 48.089 | 3.098 |
| SVR-NWP+KF [19] | 33.409 | 2.268 | 42.550 | 2.787 | 45.107 | 2.921 | 50.296 | 3.139 | 50.496 | 3.210 | 50.851 | 3.271 |
| Best NWP+KF [19] | 34.456 | 2.539 | 41.002 | 3.071 | 43.953 | 3.334 | 46.819 | 3.460 | 46.340 | 3.465 | 51.382 | 3.614 |
| SVR-NWP+MTGP (the proposed method) | 21.661 | 1.452 | 39.950 | 2.564 | 44.061 | 2.848 | 47.951 | 3.062 | 47.378 | 3.041 | 48.272 | 3.146 |

Fig. 4 Benchmarking of Turbine #1 prediction. (a) RMSE in April; (b) MAPE in April; (c) RMSE in September; (d) MAPE in September.

Fig. 5 Benchmarking of Turbine #2 prediction. (a) RMSE in April; (b) MAPE in April; (c) RMSE in September; (d) MAPE in September.

Fig. 6 Detailed Results and Error Bounds of SVR-NWP+MTGP model. (a) 1-hour Ahead Prediction; (b) 6-hour Ahead Prediction; (c) 12-hour Ahead Prediction; (d) 24-hour Ahead Prediction.

Fig. 6 shows detailed results of the SVR-NWP+MTGP model for a segment randomly selected from the testing dataset. Several findings are highlighted below: (1) In general, the proposed model fits the ground truth well. (2) Shorter horizons lead to smaller error bounds. At long-term horizons, the proposed model is able to describe the overall trend of the WS data. (3) In most cases, the true value of the WS data falls within the predicted error bounds, except for the wind gust around time points 300-350. However, for 1-hour ahead prediction, the proposed model is able to predict the wind gust very well.
Fig. 7 compares the error distributions of the benchmarking models and the proposed model
318
in two months. Several key points for discussion in Fig. 7 are as follows: (1) The SVR+UKF and
319
the proposed method gives the smallest error in 1-hour ahead prediction. And the proposed method
320
is slightly better than SVR+UKF; (2) the proposed method also gives the best prediction at 6-hour
321
15
ahead prediction. The prediction error of the SVR+UKF method starts to increase at this prediction
322
horizon; (3) At 12-hour Ahead prediction and 24-hours ahead prediction, the SVR-NWP and SVR-
323
NWP+ MTGP give the smallest error. (4) Comparing SVR+UKF, Best NWP and the proposed
324
method, one can easily find that the proposed method keeps the advantage of SVR+UKF in the
325
short-term horizons and the advantage of NWP in the long-term horizons; (5) Comparing Best
326
NWP, SVR+NWP and SVR-NWP+MTGP, one can find that the MTGP mainly reduces the
327
prediction error in the short-term horizons. At long-term horizons, the prediction results are
328
slightly better than SVR-NWP and significantly better than NWP; (6) By comparing with the
329
recent models in literature SVR+UKF[15] and NWP+KF[19], the proposed method gives the best
330
overall prediction accuracy in the 24 hours ahead prediction.
Fig. 7 Probabilistic Histograms of Residues with the Distribution Mean and 2σ Interval. Panels cover the 1-hour, 6-hour, 12-hour and 24-hour ahead horizons for SVR+UKF [15], Best NWP, Best NWP+KF [19], SVR-NWP and SVR-NWP+MTGP.
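The quantities summarized in the benchmarking figures and in the residual histograms can be computed with a few lines of code. The sketch below uses the standard definitions of RMSE and MAPE and takes the residual as prediction minus observation; these conventions, the function names, and the sample values are assumptions for illustration, and the paper's exact definitions may differ (for example in how near-zero wind speeds are handled in MAPE).

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

def mape(y_true, y_pred):
    """Mean absolute percentage error in percent (assumes no zero observations)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(100.0 * np.mean(np.abs((y_pred - y_true) / y_true)))

def residual_summary(y_true, y_pred):
    """Mean of the residuals and the mean +/- 2 sigma interval (sample std)."""
    res = np.asarray(y_pred, float) - np.asarray(y_true, float)
    mu = float(res.mean())
    sigma = float(res.std(ddof=1))
    return mu, (mu - 2.0 * sigma, mu + 2.0 * sigma)

# Made-up values for illustration only (not data from the paper).
y_true = [4.0, 5.0, 10.0]
y_pred = [5.0, 4.0, 8.0]

print(rmse(y_true, y_pred))              # sqrt(2), about 1.414
print(mape(y_true, y_pred))              # about 21.67 (percent)
print(residual_summary(y_true, y_pred))  # mean about -0.667 with its 2-sigma interval
```

A residual mean close to zero together with a narrow 2σ interval corresponds to the tall, centered histograms that the discussion above attributes to the better-performing models.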
Finally, the execution time of the proposed model is discussed. The proposed methodology
and the benchmarking methods are run on a PC with 32 GB of RAM and a 3.50 GHz CPU under
Windows 10 Enterprise. At each time point, the WS data and NWP data of the last 120 hours are
used in the testing procedure of all the models. The proposed model is run 10 times on the data
from March and September.
Fig. 8 shows the average execution time and the accuracy for different time block
lengths. The average execution time refers to the average prediction time at each time point. The
time block length denotes the number of hours of past WS and NWP data that are taken
into consideration, as shown in Fig. 3. The results indicate that the execution time grows as the
time block lengthens. Meanwhile, the prediction accuracy in terms of RMSE and MAPE improves
as more WS and NWP data are fed into the model. However, once the time block length reaches
96 hours, the average execution time keeps growing while RMSE and MAPE converge to a stable
level. Therefore, a time block of 96 hours is recommended to balance prediction
performance against execution time.
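The trade-off described above can be turned into a simple selection rule: measure the mean prediction time per time point, then pick the shortest time block beyond which the error no longer improves noticeably. The sketch below is one possible way to do this; the helper names, the 1% tolerance, and the RMSE values are hypothetical and invented for illustration, and are not the measurements behind Fig. 8.

```python
import time

def mean_prediction_time(predict, inputs):
    """Average wall-clock time of predict(x) over all inputs, in seconds."""
    start = time.perf_counter()
    for x in inputs:
        predict(x)
    return (time.perf_counter() - start) / len(inputs)

def smallest_stable_block(block_lengths, errors, rel_tol=0.01):
    """Shortest block length whose error is within rel_tol of every longer block.

    rel_tol is an assumed tolerance (1% relative improvement), not from the paper.
    """
    for i in range(len(block_lengths) - 1):
        best_later = min(errors[i + 1:])
        if (errors[i] - best_later) / errors[i] <= rel_tol:
            return block_lengths[i]
    return block_lengths[-1]

# Invented RMSE values per time block length (hours), for illustration only.
lengths = [24, 48, 72, 96, 120]
rmse_values = [3.00, 2.60, 2.40, 2.30, 2.29]

chosen = smallest_stable_block(lengths, rmse_values)
print(chosen)  # 96 for these invented values

# Timing a placeholder predictor; a real model's predict function would go here.
avg_seconds = mean_prediction_time(lambda x: x * 2.0, [1.0, 2.0, 3.0])
```

The same rule can be applied to MAPE; if the two metrics disagree, taking the longer of the two selected blocks is the more conservative choice.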
Fig. 8 Average Execution Time and Prediction Accuracy of Time Block Length. (a) Average Execution Time vs. Time Block Length; (b) RMSE of Different Time Block Length; (c) MAPE of Different Time Block Length.
5. Conclusion
In this paper, a novel WS prediction method is proposed. The effectiveness and the superiority of
the proposed method are validated on a dataset collected from an off-shore wind farm. The final
results suggest the following conclusions: (1) the proposed method can be effectively employed to
improve the prediction accuracy of the numerical weather prediction; (2) the proposed method
combines the advantages of time series extrapolation methods for short-term prediction with the
advantages of NWP at long-term prediction horizons; (3) the proposed method reports improved
prediction accuracy compared with the recently proposed models SVR+UKF [15] and
NWP+KF [19].

In future work, the proposed method will be integrated into commercial software for
practical use, and sparse Gaussian process methods will be explored to further improve the
computational efficiency.
Reference:
[1] I. Colak, S. Sagiroglu, and M. Yesilbudak, "Data mining and wind power prediction: A literature review," Renewable Energy, vol. 46, pp. 241-247, 2012.
[2] J. Jung and R. P. Broadwater, "Current status and future advances for wind speed and power forecasting," Renewable and Sustainable Energy Reviews, vol. 31, pp. 762-777, 2014.
[3] X. Jia, C. Jin, M. Buzza, Y. Di, D. Siegel, and J. Lee, "A deviation based assessment methodology for multiple machine health patterns classification and fault detection," Mechanical Systems and Signal Processing, vol. 99, pp. 244-261, 2018.
[4] J. Jin, D. Zhou, P. Zhou, S. Qian, and M. Zhang, "Dispatching strategies for coordinating environmental awareness and risk perception in wind power integrated system," Energy, vol. 106, pp. 453-463, 2016.
[5] X. Zhang, W. Cai, and Z. Gan, "Optimal dispatching strategies of active power for DFIG wind farm based on GA algorithm," in Control and Decision Conference (CCDC), 2016 Chinese, 2016, pp. 6094-6099.
[6] P. Eecen, H. Braam, L. Rademakers, and T. Obdam, "Estimating costs of operations and maintenance of offshore wind farms," in European Wind Energy Conference and Exhibition, Milan, Italy, 2007.
[7] A. Kovács, G. Erdős, L. Monostori, and Z. J. Viharos, "Scheduling the maintenance of wind farms for minimizing production loss," IFAC Proceedings Volumes, vol. 44, pp. 14802-14807, 2011.
[8] A. Kovács, G. Erdős, Z. J. Viharos, and L. Monostori, "A system for the detailed scheduling of wind farm maintenance," CIRP Annals-Manufacturing Technology, vol. 60, pp. 497-501, 2011.
[9] X. Jia, C. Jin, M. Buzza, W. Wang, and J. Lee, "Wind turbine performance degradation assessment based on a novel similarity metric for machine performance curves," Renewable Energy, vol. 99, pp. 1191-1201, 2016.
[10] H. Liu, H.-q. Tian, and Y.-f. Li, "Comparison of two new ARIMA-ANN and ARIMA-Kalman hybrid methods for wind speed prediction," Applied Energy, vol. 98, pp. 415-424, 2012.
[11] O. B. Shukur and M. H. Lee, "Daily wind speed forecasting through hybrid KF-ANN model based on ARIMA," Renewable Energy, vol. 76, pp. 637-647, 2015.
[12] M. Monfared, H. Rastegar, and H. M. Kojabadi, "A new strategy for wind speed forecasting using artificial intelligent methods," Renewable Energy, vol. 34, pp. 845-848, 2009.
[13] X. Jia, Y. Di, J. Feng, Q. Yang, H. Dai, and J. Lee, "Adaptive virtual metrology for semiconductor chemical mechanical planarization process using GMDH-type polynomial neural networks," Journal of Process Control, vol. 62, pp. 44-54, 2018.
[14] S. A. Pourmousavi Kani and M. M. Ardehali, "Very short-term wind speed prediction: A new artificial neural network–Markov chain model," Energy Conversion and Management, vol. 52, pp. 738-745, 2011.
[15] K. Chen and J. Yu, "Short-term wind speed prediction using an unscented Kalman filter based state-space support vector regression approach," Applied Energy, vol. 113, pp. 690-705, 2014.
[16] G. Santamaría-Bonfil, A. Reyes-Ballesteros, and C. Gershenson, "Wind speed forecasting for wind farms: A method based on support vector regression," Renewable Energy, vol. 85, pp. 790-809, 2016.
[17] S. Salcedo-Sanz, E. G. Ortiz-García, Á. M. Pérez-Bellido, A. Portilla-Figueras, and L. Prieto, "Short term wind speed prediction based on evolutionary support vector regression algorithms," Expert Systems with Applications, vol. 38, pp. 4052-4057, 2011.
[18] P. Louka, G. Galanis, N. Siebert, G. Kariniotakis, P. Katsafados, I. Pytharoulis, et al., "Improvements in wind speed forecasts for wind power prediction purposes using Kalman filtering," Journal of Wind Engineering and Industrial Aerodynamics, vol. 96, pp. 2348-2362, 2008.
[19] F. Cassola and M. Burlando, "Wind speed and wind energy forecast through Kalman filtering of Numerical Weather Prediction model output," Applied Energy, vol. 99, pp. 154-166, 2012.
[20] C. E. Rasmussen, "Gaussian processes for machine learning," 2006.
[21] R. R. Richardson, M. A. Osborne, and D. A. Howey, "Gaussian process regression for forecasting battery state of health," Journal of Power Sources, vol. 357, pp. 209-219, 2017.
[22] H. Liu, J. Cai, and Y.-S. Ong, "Remarks on Multi-Output Gaussian Process Regression," Knowledge-Based Systems, 2018.
[23] E. V. Bonilla, K. M. Chai, and C. Williams, "Multi-task Gaussian process prediction," in Advances in neural information processing systems, 2008, pp. 153-160.
[24] R. Dürichen, M. A. Pimentel, L. Clifton, A. Schweikard, and D. A. Clifton, "Multitask gaussian processes for multivariate physiological time-series analysis," IEEE Transactions on Biomedical Engineering, vol. 62, pp. 314-322, 2015.