Gaussian Process Regression for Numerical Wind Speed Prediction Enhancement
Haoshu Cai a, Xiaodong Jia a,*, Jianshe Feng a, Yuan-Ming Hsu a, Wenzhe Li a, Jay Lee a
a NSF I/UCR Center for Intelligent Maintenance Systems,
Department of Mechanical Engineering, University of Cincinnati, PO Box 210072,
Cincinnati, Ohio 45221-0072, USA
This paper studies the application of a Multi-Task Gaussian Process (MTGP) regression model
to enhance numerical predictions of wind speed. In the proposed method, a Support Vector
Regressor (SVR) is first utilized to fuse the predictions from Numerical Weather Predictors (NWP).
The purpose of this regressor is to map the numerical predictions at coarse geographical nodes to
the desired turbine location. In the subsequent analysis, the SVR prediction output is further
enhanced by the MTGP regression model. Validation on real-world data from a large-scale
off-shore wind farm shows that the proposed method significantly improves the prediction
accuracy of the NWP at both the short-term horizons (1~6 hours ahead) and the long-term
horizons (7~24 hours ahead). More importantly, the short-term prediction accuracy after
enhancement is found to be comparable to, or even better than, that of cutting-edge statistical
models for short-term prediction.
Keywords: Wind speed prediction, Multi-Task Gaussian Process, Gaussian Process Regression,
Support Vector Machine, Time Series Prediction, Forecasting
Xiaodong Jia; E-mail: firstname.lastname@example.org; Tel: (513)556-3412; Fax: (513)556-3390
Address: 560 Baldwin Hall, University of Cincinnati, PO Box 210072, Cincinnati, OH 45221
1. Introduction
Driven by the demand for renewable energy, large numbers of wind power generators have been
erected over the past decade [1-3]. However, one limitation that impedes the further development
of wind farms is their high Operation and Maintenance (O&M) cost, which is essentially caused by
the uncertainty of wind power production. To better adapt to inconsistent wind conditions and
reduce costs, wind speed (WS) prediction is identified as one of the key inputs for wind
farm power dispatching and maintenance planning. In current practice, both the short-term
WS prediction for the future 0-6 hours and the long-term prediction for the future 7-24 hours are
important inputs for power grid dispatching, and the power output of
each wind turbine is optimized based on the predicted wind conditions [4, 5]. For maintenance
planning, the short-term WS prediction within the future 6 hours also serves as a critical input to
schedule the maintenance activities for the next day and to minimize the production loss [6-9]. To
this end, investigations on advanced analytics for WS prediction hold great economic and
academic value.
WS prediction is intrinsically challenging due to the intermittent fluctuations in WS under
intricate meteorological conditions. To better predict the WS at different time horizons,
short-term and long-term predictions are made by different types of data-driven models.
Short-term prediction is largely based on statistical approaches, which extrapolate the WS series
by modeling its time evolution. Related examples include Auto-Regressive
(AR) and Auto-Regressive Moving Average (ARMA) models, Artificial Neural Networks (ANN), and the
Kalman Filter (KF) or Unscented Kalman Filter (UKF), which are discussed in [10-13]. To enhance
the performance and robustness of these prediction algorithms, various combined models have
also been proposed in the literature. Monfared et al.  utilize fuzzy logic and ANN to model WS
estimation based on the statistical properties of the input time series. Kani et al. use ANN and
a Markov Chain to capture patterns in WS time series data. Chen et al. integrate Support Vector
Regression (SVR) with the UKF to realize dynamic state estimation. Santamaría-Bonfil et al. [16, 17]
tune the parameters of the SVR model using heuristic algorithms such as Genetic
Algorithm (GA) and Particle Swarm Optimization (PSO). For long-term WS prediction, the
prediction outputs from the Numerical Weather Predictors (NWP) are generally preferred, since
the accuracy of statistical models deteriorates quickly as the prediction horizon grows.
The prediction results from NWP are usually given at coarse geographical grids, and
these results may exhibit systematic prediction bias over complex terrain. To address these issues,
the NWP results are used as reference predictions and regression models are employed to
post-process the NWP outputs. For example, KF is explored as a post-processing method in  to
correct the prediction results from the NWP model and avoid systematic bias. Similarly, a practical
methodology based on KF is utilized to improve the prediction of NWP in  and the results are
validated on two years’ data.
Summarizing the related research, statistical models for time series
extrapolation give rather satisfactory accuracy at the short-term horizons (1~6 hours ahead);
however, their accuracy deteriorates very fast when the prediction horizon exceeds 6 hours. For
long-term prediction (7 hours ahead and beyond), the predictions from NWP are necessary to
guarantee the prediction accuracy. However, since the NWP results are given at coarse
geographical grids and the wind dynamics are inherently complex, the direct output from NWP is
normally biased relative to the prediction target. This bias is more prominent in complex terrain
and dynamic wind environments. Although the KF structure in  and  mitigates the prediction
bias of NWP to some extent, this method still fails to yield satisfactory results at short-term
prediction horizons. Moreover, the KF relies on several prior assumptions: the regression
coefficients in the KF need to be known beforehand, the noise term needs to follow a Gaussian
distribution, and the dynamic process modeled by the KF must be linear.
To address these challenges and limitations, this work proposes a novel methodology that uses
the Multi-Task Gaussian Process (MTGP) to post-process the numerical weather predictions. One major
contribution of the proposed method is that it not only enhances the prediction accuracy at the
long-term horizons (7~24 hours ahead) but also significantly improves the accuracy at the
short-term horizons (1~6 hours ahead). In particular, the short-term prediction accuracy of the
proposed method is found to be even better than that of the statistical models that are
specifically proposed for short-term extrapolation. In the proposed method, the spatial
correlation of wind speed between the prediction grids of the NWP and the turbine nacelle
position is first modeled by SVR. Subsequently, the prediction outcome of the SVR is further
enhanced by MTGP. The superiority of the proposed method is demonstrated by benchmarking against
cutting-edge prediction techniques for short-term
prediction in  and the recent techniques for NWP enhancement in [18, 19]. To the authors’
knowledge, this is the first time that MTGP has been applied to wind speed prediction.
Moreover, few prediction methods in the literature achieve good accuracy at both
short-term and long-term prediction horizons.
The rest of this paper is organized as follows. Section 2 illustrates the technical backgrounds.
In Section 3, the proposed methodology is presented and described. Section 4 shows the
experiment results, the comparison with several benchmarks and the discussions. Finally,
conclusions and future work are presented in Section 5.
2. Technical Backgrounds
2.1. Standard Gaussian Process Regression
Gaussian Process Regression (GPR) is a non-parametric method that can model arbitrarily
complex systems. In prediction problems, GPR is often preferred for its flexibility and its ability
to provide uncertainty representations. GPR models a time series using a Gaussian prior that is
parameterized by a mean function (MF) and a covariance function (CovF) as described below:
$$ \mathbf{y} = f(\mathbf{x}) \sim \mathcal{N}\big(m(\mathbf{x}),\, k(\mathbf{x}, \mathbf{x}')\big) \tag{1} $$
In Eq.(1), $\mathbf{x}$ and $\mathbf{y}$ denote the input and output in the training dataset and $f$ is known as
the latent variable in the GPR model. In most applications, the mean function $m(\mathbf{x})$ in Eq.(1) is set to
0, and the CovF $k(\mathbf{x}, \mathbf{x}')$, which describes the similarity between input data points, is the key ingredient
in GPR, since data points with similar inputs are likely to have similar target values $\mathbf{y}$. In the
current literature, one of the most frequently used kernel functions is the squared exponential (SE) kernel:
$$ k_{SE}(r) = \sigma_f^2 \exp\left(-\frac{r^2}{2\ell^2}\right) \tag{2} $$
where $r = \lVert \mathbf{x} - \mathbf{x}' \rVert$ in Eq.(2) is the Euclidean distance between two indices $\mathbf{x}$ and $\mathbf{x}'$.
The parameters $\sigma_f$ and $\ell$ in Eq.(2) are the hyper-parameters that
need to be optimized.
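During the model training, the negative log marginal likelihood (NLML) in Eq. (3) is
minimized, so that the hyper-parameters in the kernel matrix can be estimated.
$$ \mathrm{NLML} = -\log p(\mathbf{y} \mid \mathbf{x}, \boldsymbol{\theta}) = \frac{1}{2}\log\left|\mathbf{K} + \sigma_n^2 \mathbf{I}\right| + \frac{1}{2}\big(\mathbf{y} - m(\mathbf{x})\big)^{T}\big(\mathbf{K} + \sigma_n^2 \mathbf{I}\big)^{-1}\big(\mathbf{y} - m(\mathbf{x})\big) + \frac{n}{2}\log(2\pi) \tag{3} $$
Here $\mathbf{K} = K(\mathbf{x}, \mathbf{x})$ is the kernel matrix on the training inputs, $\sigma_n^2$ is the noise variance and $n$ is the
number of training points. The unknown hyper-parameter vector $\boldsymbol{\theta}$ in Eq. (3) is determined by
minimizing the NLML. The optimization problem for parameter estimation is written as:
$$ \hat{\boldsymbol{\theta}} = \arg\min_{\boldsymbol{\theta}} \; -\log p(\mathbf{y} \mid \mathbf{x}, \boldsymbol{\theta}) \tag{4} $$
Since the NLML is differentiable with respect to the hyper-parameters, it can be minimized by
off-the-shelf gradient-based optimization algorithms, such as gradient descent. Note that the NLML
is in general not convex in the hyper-parameters, so the optimizer may converge to a local minimum.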
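After the model training, the predictive distribution of GPR at a testing data point $\mathbf{x}^*$ can be
written as:
$$ \mathbf{f}^* \mid \mathbf{x}, \mathbf{y}, \mathbf{x}^* \sim \mathcal{N}\big(\bar{\mathbf{f}}^*,\, \operatorname{cov}(\mathbf{f}^*)\big) \tag{5} $$
$$ \bar{\mathbf{f}}^* = m(\mathbf{x}^*) + K(\mathbf{x}^*, \mathbf{x})\big(K(\mathbf{x}, \mathbf{x}) + \sigma_n^2 \mathbf{I}\big)^{-1}\big(\mathbf{y} - m(\mathbf{x})\big) \tag{6} $$
$$ \operatorname{cov}(\mathbf{f}^*) = K(\mathbf{x}^*, \mathbf{x}^*) - K(\mathbf{x}^*, \mathbf{x})\big(K(\mathbf{x}, \mathbf{x}) + \sigma_n^2 \mathbf{I}\big)^{-1} K(\mathbf{x}, \mathbf{x}^*) \tag{7} $$
where $\bar{\mathbf{f}}^*$ is the prediction result and $\operatorname{cov}(\mathbf{f}^*)$
represents the prediction uncertainty.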
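The mean of the GPR predictive distribution in Eq.(6) is a linear combination of the target variable
$\mathbf{y}$ in the training set when the mean function $m(\mathbf{x}) = 0$. Under this condition, the mean of the
predictive distribution can be re-written as:
$$ \bar{\mathbf{f}}^* = K(\mathbf{x}^*, \mathbf{x})\big(K(\mathbf{x}, \mathbf{x}) + \sigma_n^2 \mathbf{I}\big)^{-1}\mathbf{y} = \mathbf{W}\mathbf{y} \tag{8} $$
where $\mathbf{W} = K(\mathbf{x}^*, \mathbf{x})\big(K(\mathbf{x}, \mathbf{x}) + \sigma_n^2 \mathbf{I}\big)^{-1}$ is the weighting matrix for the standard GPR.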
The mean function $m(\mathbf{x})$ of GPR is normally set to 0 for trend-free time series. GPR is
a non-parametric approach that can be employed to model time series or systems of
arbitrary complexity when provided with sufficient data. A non-zero mean function is normally
employed when a clear trend is observed in the time series or there is a sound assumption about the
trend term. For example, in , an exponential trend term is employed as the mean function to better
extrapolate the degradation trajectory of the battery cell in the long term. In the present study, the
mean function is set to 0 since we did not see a consistent trend term of wind speed in the prediction
When using GPR to make prediction, there are several general steps to follow:
Step 1: Given training data $\mathbf{x}$ and $\mathbf{y}$, the hyper-parameters in the GPR model are obtained by
minimizing the NLML in Eq. (3);
Step 2: Given the testing time index $\mathbf{x}^*$, the predictive distribution of GPR is obtained by
using Eq.(5)~Eq.(7). The hyper-parameters in Eq.(5)~Eq.(7) are obtained from Step 1;
Step 3: The mean of the predictive distribution in Eq.(6) is employed as the predicted value, and
the confidence interval is derived by using the covariance function in Eq.(7). In this study, the
error bound is as
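The three GPR steps can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in this study. The zero mean function and the SE kernel follow the text. The synthetic sine series, the fixed noise level, the coarse grid search that stands in for gradient-based NLML minimization, and the two-standard-deviation error bound are all assumptions made for the example. The helper names `se_kernel`, `nlml` and `gpr_predict` are hypothetical.

```python
import numpy as np

def se_kernel(xa, xb, sigma_f, ell):
    """Squared exponential kernel between two 1-D index arrays."""
    r = xa[:, None] - xb[None, :]
    return sigma_f**2 * np.exp(-0.5 * (r / ell) ** 2)

def nlml(x, y, sigma_f, ell, sigma_n):
    """Negative log marginal likelihood for a zero mean function."""
    n = len(x)
    Ky = se_kernel(x, x, sigma_f, ell) + sigma_n**2 * np.eye(n)
    chol = np.linalg.cholesky(Ky)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y))
    # 0.5*log|Ky| equals the sum of the log-diagonal of the Cholesky factor
    return 0.5 * y @ alpha + np.sum(np.log(np.diag(chol))) + 0.5 * n * np.log(2 * np.pi)

def gpr_predict(x, y, x_star, sigma_f, ell, sigma_n):
    """Predictive mean, covariance and weighting matrix W (zero mean function)."""
    Ky = se_kernel(x, x, sigma_f, ell) + sigma_n**2 * np.eye(len(x))
    Ks = se_kernel(x_star, x, sigma_f, ell)
    W = np.linalg.solve(Ky, Ks.T).T          # W = K(x*, x) (K + sigma_n^2 I)^-1
    mean = W @ y
    cov = se_kernel(x_star, x_star, sigma_f, ell) - W @ Ks.T
    return mean, cov, W

# Synthetic hourly series (assumption, not wind farm data)
x = np.arange(48, dtype=float)
y = np.sin(2 * np.pi * x / 24.0)

# Step 1: pick hyper-parameters by minimizing the NLML over a coarse grid
# (assumption: a grid search replaces a gradient-based optimizer)
grid = [(1.0, ell, 0.1) for ell in (1.0, 3.0, 6.0, 12.0)]
best = min(grid, key=lambda theta: nlml(x, y, *theta))

# Step 2: predictive distribution for the next 6 time indices
x_star = np.arange(48, 54, dtype=float)
mean, cov, W = gpr_predict(x, y, x_star, *best)

# Step 3: mean as prediction, bound from the predictive covariance
# (assumption: a two-standard-deviation bound)
std = np.sqrt(np.clip(np.diag(cov), 0.0, None))
lower, upper = mean - 2 * std, mean + 2 * std
```

The returned `W` makes the linear-combination reading explicit: the predictive mean is exactly `W @ y`.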
2.2. Multi-Task Gaussian Process Regression
MTGP is an extension of the GPR model, and it is described as a special case of standard
GPR , to deal with the situation in which the GPR model has multiple outputs. MTGP was originally
proposed in , and the superiority of MTGP in multivariate physiological time-series
analysis was demonstrated in . Another more recent study on using MTGP for battery
capacity prediction also presents improved results. In the setting of WS prediction, the input
of MTGP is the time indices of the WS series, and the output of MTGP is multiple WS series,
including the historical WS at the turbine nacelle and the reference series from the NWP model.
The key in MTGP is to recognize the correlation across multiple outputs by using the
covariance kernel function below:
$$ k_{MTGP}(x, x', l, l') = k_c(l, l') \times k_t(x, x') \tag{9} $$
where $l$ and $l'$ represent the indices of the series, of which there are $m$ in total, and $k_c$ and $k_t$
in Eq.(9) model the correlation across the multiple outputs and the covariance for one series,
respectively. $x$ and $x'$ denote the time indices for tasks $l$ and $l'$. Based on Eq.(9), the kernel matrix of
MTGP can be constructed as:
$$ K_{MTGP}(X, L, \theta_c, \theta_t) = K_c(L, \theta_c) \otimes K_t(X, \theta_t) \tag{10} $$
where $\otimes$ is the Kronecker product, and $\theta_c$ and $\theta_t$ are the hyper-parameters in the kernel matrix. In
Eq.(10), $K_c$ is a similarity matrix that models the correlation or similarity across multiple
series, and $K_t$ is a symmetric $n_l \times n_l$ matrix that models the covariance across the time indices for the
$l$-th series, where $n_l$ represents the number of time indices for the $l$-th series. Therefore, $K_c$ is an
$m \times m$ matrix that captures the similarity across multiple output series. To ensure that $K_c$
is positive semi-definite, $K_c$ is constructed based on the Cholesky decomposition as
$$ K_c = \begin{bmatrix} k_c(1,1) & k_c(1,2) & \cdots & k_c(1,m) \\ k_c(2,1) & k_c(2,2) & \cdots & k_c(2,m) \\ \vdots & \vdots & \ddots & \vdots \\ k_c(m,1) & k_c(m,2) & \cdots & k_c(m,m) \end{bmatrix} = \mathbf{L}_c \mathbf{L}_c^{T} \tag{11} $$
where $\mathbf{L}_c$ is a lower triangular matrix. The elements in $K_c$ represent the similarity level between
each pair of the reference series. According to the description in [23, 24], these elements in $K_c$ can
be interpreted as correlation coefficients.
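As a small numerical illustration of Eq.(10) and Eq.(11), the sketch below builds a two-task kernel matrix from a lower triangular factor and a temporal kernel. It is an assumption-laden example rather than the configuration used in this work: the entries of `L_c`, the length scale and the five shared time indices are hypothetical values, and an SE kernel is assumed for the temporal part.

```python
import numpy as np

def se_kernel(xa, xb, ell):
    """Unit-amplitude squared exponential kernel on 1-D time indices."""
    r = xa[:, None] - xb[None, :]
    return np.exp(-0.5 * (r / ell) ** 2)

# Lower triangular factor for m = 2 series (hypothetical values)
L_c = np.array([[1.0, 0.0],
                [0.8, 0.6]])
K_c = L_c @ L_c.T            # positive semi-definite by construction

# Temporal kernel on time indices shared by both series (assumption)
x = np.arange(5, dtype=float)
K_t = se_kernel(x, x, ell=2.0)

# Kernel matrix of the multi-task model: block (l, l') equals K_c[l, l'] * K_t
K_mtgp = np.kron(K_c, K_t)
```

The Kronecker form requires every series to be observed at the same time indices. When the series have different index sets, the same block structure can be assembled block by block instead.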
3.1. Using MTGP for NWP enhancement
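When using MTGP for time series prediction, all the training and testing procedures are
the same as for traditional GPR except for the construction of the kernel matrix. To better
illustrate the kernel matrix in MTGP, we use the case of two output tasks as an example. In this
scenario, the MTGP model can be constructed as:
$$ \begin{bmatrix} \mathbf{y}_r \\ \mathbf{y}_p \end{bmatrix} \sim \mathcal{N}\left( \begin{bmatrix} \mathbf{0} \\ \mathbf{0} \end{bmatrix},\, \begin{bmatrix} k_{rr} K(\mathbf{x}_r, \mathbf{x}_r) & k_{rp} K(\mathbf{x}_r, \mathbf{x}_p) \\ k_{pr} K(\mathbf{x}_p, \mathbf{x}_r) & k_{pp} K(\mathbf{x}_p, \mathbf{x}_p) \end{bmatrix} \right) \tag{12} $$
where $\mathbf{y}_r$ and $\mathbf{y}_p$ are two outputs with different dimensionality, and $\mathbf{x}_r$ and $\mathbf{x}_p$ are
the model inputs for the two output tasks; the dimensionality of the inputs $\mathbf{x}_r$ and $\mathbf{x}_p$
should be the same.
In the 1D case of time series prediction, $\mathbf{x}_r$ and $\mathbf{x}_p$
are the time indices of the two time series.
$k_{rr}$, $k_{rp}$, $k_{pr}$ and $k_{pp}$ denote the correlation coefficients of the two output series. These four
coefficients are treated as part of the hyper-parameters in MTGP and are obtained by minimizing
the NLML during model training. It is also important to note that $k_{rp} = k_{pr}$
is always valid due to Eq.(11).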
The application of MTGP for NWP enhancement is illustrated in Fig. 1. In Fig. 1, $\mathbf{y}_r$ is
the reference wind speed series given by the NWP, and its time indices $\mathbf{x}_r$ are
written as $\mathbf{x}_r = [t-k, \ldots, t, \ldots, t+24]$.
$\mathbf{y}_p$ in Eq.(12) corresponds to the measured wind speed series at
the turbine nacelle, which is to be extrapolated 24 hours into the future. The time indices of
$\mathbf{y}_p$ are $\mathbf{x}_p = [t-k, \ldots, t]$, where $k$ is the length of the time window for model
construction. By using $\{\mathbf{x}_r, \mathbf{y}_r, \mathbf{x}_p, \mathbf{y}_p\}$ as training data, an MTGP model is constructed and
a better prediction at times $\mathbf{x}_p^* = [t+1, \ldots, t+24]$
can be obtained. The predictive distribution can be simply obtained in the same way as for
standard GPR, with the MTGP kernel matrix in place of the standard one.
Fig. 1 MTGP Enhancement for Reference-based Prediction
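Like the standard GPR, the prediction results of MTGP can also be interpreted as a linear
combination of the historical observations, which can be written as:
$$ \bar{\mathbf{f}}_p^* = K_{MTGP}(\mathbf{x}_p^*, \mathbf{x})\big(K_{MTGP}(\mathbf{x}, \mathbf{x}) + \sigma_n^2 \mathbf{I}\big)^{-1} \begin{bmatrix} \mathbf{y}_r \\ \mathbf{y}_p \end{bmatrix} = \mathbf{W}_{MTGP} \begin{bmatrix} \mathbf{y}_r \\ \mathbf{y}_p \end{bmatrix} \tag{13} $$
where
$$ K_{MTGP}(\mathbf{x}, \mathbf{x}) = \begin{bmatrix} k_{rr} K(\mathbf{x}_r, \mathbf{x}_r) & k_{rp} K(\mathbf{x}_r, \mathbf{x}_p) \\ k_{pr} K(\mathbf{x}_p, \mathbf{x}_r) & k_{pp} K(\mathbf{x}_p, \mathbf{x}_p) \end{bmatrix}, \qquad K_{MTGP}(\mathbf{x}_p^*, \mathbf{x}) = \begin{bmatrix} k_{pr} K(\mathbf{x}_p^*, \mathbf{x}_r) & k_{pp} K(\mathbf{x}_p^*, \mathbf{x}_p) \end{bmatrix} $$
From Eq. (13), one can see that the prediction output of the MTGP model is simply a
linear combination of $\mathbf{y}_r$ and $\mathbf{y}_p$. Consequently, MTGP is employed as a novel approach to
further enhance the prediction accuracy of NWP in this investigation.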
Because the prediction result of GPR and MTGP is given as a linear combination
of $\mathbf{y}_r$ and $\mathbf{y}_p$, the MTGP model enhances the NWP results in both short-term and
long-term accuracy. The presence of the $k_{pp} K(\mathbf{x}_p, \mathbf{x}_p)$
term in the kernel matrix mainly contributes to
the short-term prediction accuracy, since $\mathbf{y}_p$ is more dominant than $\mathbf{y}_r$ at the short-term
horizons. The prediction result considers $\mathbf{y}_r$ as a reference in the long-term extrapolations, which
is why the prediction accuracy does not deteriorate significantly at the long-term horizons. More
importantly, the algorithm can achieve enhanced prediction accuracy at both short-term and
long-term horizons mainly because MTGP automatically decides the trade-off between the
NWP output and the extrapolation of measured wind speed at the turbine nacelle. This is achieved by
obtaining the optimal $k_{rr}$, $k_{rp}$ and $k_{pp}$ in the model training phase. Therefore, the model is expected
to be superior to NWP at long-term horizons and superior to statistical models at short-term
horizons.
As a summary of the discussion, the algorithm proposed for NWP enhancement based on
MTGP is described below. It is important to note that the proposed method requires re-training
the MTGP model at each prediction step. This implies that the proposed model is not subject to
the bias known as seasonal effects, since a new model is derived at each prediction step by
utilizing data from the recent past only.
Algorithm 1. Using MTGP for NWP output enhancement
Step 1: At a certain time step $t$, define $\mathbf{x}_r$, $\mathbf{x}_p$, $\mathbf{x}_p^*$
and the time window length $k$ as:
$\mathbf{x}_r = [t-k, \ldots, t, \ldots, t+24]$, $\mathbf{x}_p = [t-k, \ldots, t]$, $\mathbf{x}_p^* = [t+1, \ldots, t+24]$.
$\mathbf{y}_r$ is the NWP prediction at time $\mathbf{x}_r$, and $\mathbf{y}_p$ is the measured wind speed
at the turbine nacelle during time $\mathbf{x}_p$.
Step 2: Construct the MTGP as Eq. (12) and obtain the optimized hyper-parameters $k_{rr}$, $k_{rp}$, $k_{pp}$
and other hyper-parameters in the kernel function. The optimal hyper-parameters are obtained
by minimizing the NLML.
Step 3: Obtain the predictive distribution of MTGP. The prediction mean is described as Eq. (13).
Step 4: Propagate to time $t+1$ and repeat Step 1~Step 3.
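The steps of Algorithm 1 can be sketched as follows. This is only an illustrative outline under several assumptions, not the implementation evaluated in this paper. The hyper-parameters (`L_c`, `ell`, `sigma_n`) are fixed hypothetical values, so the NLML minimization of Step 2 is omitted. The series are synthetic. The data are centred by the mean of the measured window so that a zero-mean prior is reasonable. An SE kernel is assumed for the temporal part. The function name `mtgp_step` is hypothetical.

```python
import numpy as np

def se_kernel(xa, xb, ell):
    """Unit-amplitude squared exponential kernel on 1-D time indices."""
    r = xa[:, None] - xb[None, :]
    return np.exp(-0.5 * (r / ell) ** 2)

def mtgp_step(t, k, nwp, measured, L_c, ell, sigma_n):
    """Steps 1-3 at time t; returns the 24-hour prediction mean and weights."""
    # Step 1: time indices of the reference series, measured series and targets
    x_r = np.arange(t - k, t + 25)
    x_p = np.arange(t - k, t + 1)
    x_star = np.arange(t + 1, t + 25)
    offset = measured[x_p].mean()          # assumption: centring for the zero-mean GP
    y = np.concatenate([nwp[x_r], measured[x_p]]) - offset

    # Step 2 (simplified): fixed hyper-parameters instead of NLML minimization
    K_c = L_c @ L_c.T                      # rows and columns ordered as (r, p)
    k_rr, k_rp, k_pp = K_c[0, 0], K_c[0, 1], K_c[1, 1]
    K = np.block([
        [k_rr * se_kernel(x_r, x_r, ell), k_rp * se_kernel(x_r, x_p, ell)],
        [k_rp * se_kernel(x_p, x_r, ell), k_pp * se_kernel(x_p, x_p, ell)],
    ])
    K_star = np.hstack([k_rp * se_kernel(x_star, x_r, ell),
                        k_pp * se_kernel(x_star, x_p, ell)])

    # Step 3: prediction mean as a linear combination of [y_r, y_p]
    Ky = K + sigma_n**2 * np.eye(len(y))
    W = np.linalg.solve(Ky, K_star.T).T
    return W @ y + offset, W

# Synthetic hourly data (assumption): a biased reference and the measured series
hours = np.arange(200)
truth = 8.0 + 2.0 * np.sin(2 * np.pi * hours / 24.0)
nwp = truth + 0.5
measured = truth.copy()

L_c = np.array([[1.0, 0.0], [0.9, 0.4]])   # hypothetical task-correlation factor
k = 48
forecasts = {}
for t in (72, 73):                          # Step 4: propagate and repeat
    forecasts[t], W = mtgp_step(t, k, nwp, measured, L_c, ell=4.0, sigma_n=0.1)
```

Only measurements up to time `t` enter each step, whereas the reference series is available up to `t + 24`, which is what allows it to guide the long-horizon part of the forecast.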
3.2. The proposed methodology and implementation
The goal of this research is to predict WS over the next 24 hours. The given data
involve the WS at the turbine nacelle collected by the Supervisory Control and Data Acquisition
(SCADA) system and the weather forecast data from the NWP model. The NWP data are given as
the average WS within 1 hour at each forecast grid node at three different heights: 10 m, 50 m and
90 m. In this research, the NWP data issued today and one day before are used to build the model.
Therefore, at any specific point in time, the available data include the SCADA data up to the present
and the NWP data updated today and 1 day ago. Spatially, the NWP data include the WS
prediction at the 9 grid nodes in Fig. 2, which have a rather coarse spatial resolution of roughly 9.5
kilometers. The Wind Turbine (WT) position is located within the area covered by these grid nodes,
as shown in Fig. 2, and the heights of the turbines remain unknown.
Fig. 2 WT Position and Weather Forecast Grid Position.
Fig. 3 Statement of the WS prediction problem
The proposed methodology is illustrated in Fig. 3. In Fig. 3, the time resolution of the NWP data
is 1 hour. As mentioned before, the NWP data reported today and one day before are utilized.
Therefore, the dimensionality of the numerical weather predictions is $9 \times 3 \times 2 = 54$
(9 grid nodes, 3 heights and 2 forecast issues). As shown in
Fig. 3, a time window of length $k$ is needed to establish the SVR model between all 54 numerical
wind predictions and the measured wind speed series. The purpose of this SVR model is to give a
reference prediction series $\mathbf{y}_r$
for the next 24 hours. This prediction is based solely on the NWP
outputs and is subsequently enhanced by the MTGP model. In the MTGP enhancement step,
the detailed procedures for model construction and prediction are described in Algorithm
1. It is important to highlight that the SVR model and the MTGP model are re-trained at each time
step by using the historical data for model construction. Therefore, the seasonal effect of the wind
speed distribution is not a concern in the present model, because the prediction is made based on
the prediction output of NWP and the extrapolation of the wind series in the recent past.
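To evaluate the performance of the proposed method, Root Mean Square Error (RMSE) and
Mean Absolute Percentage Error (MAPE) are used as criteria for the prediction accuracy.
Suppose that, at the $i$-th prediction step, the measured WS at prediction horizon $h$ within the next 24 hours is
$y'_{i,h}$ and the predicted WS is referred to as
$y^*_{i,h}$. RMSE and MAPE at a certain prediction horizon are calculated as
$$ RMSE_h = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\big(y^*_{i,h} - y'_{i,h}\big)^2}, \qquad MAPE_h = \frac{1}{m}\sum_{i=1}^{m}\left|\frac{y^*_{i,h} - y'_{i,h}}{y'_{i,h}}\right| \times 100\% $$
where $m$ represents the number of prediction steps and $h$ denotes the prediction horizon.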
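For concreteness, the two criteria can be computed per horizon as in the short sketch below. The array layout (one row per prediction step, one column per horizon), the function name `rmse_mape` and the toy numbers are assumptions for illustration, and MAPE is expressed in percent.

```python
import numpy as np

def rmse_mape(y_true, y_pred):
    """Per-horizon RMSE and MAPE (%) for arrays of shape (steps, horizons)."""
    err = y_pred - y_true
    rmse = np.sqrt(np.mean(err**2, axis=0))
    mape = 100.0 * np.mean(np.abs(err) / np.abs(y_true), axis=0)
    return rmse, mape

# Toy example: 2 prediction steps, 2 horizons (made-up wind speeds in m/s)
y_true = np.array([[10.0, 8.0],
                   [5.0, 4.0]])
y_pred = np.array([[11.0, 8.0],
                   [4.0, 5.0]])
rmse, mape = rmse_mape(y_true, y_pred)
# rmse is approximately [1.0, 0.7071]; mape is [15.0, 12.5]
```

Because MAPE divides by the measured value, it becomes unstable when the measured wind speed is close to zero, which is worth keeping in mind when reading the two criteria side by side.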
To better demonstrate the improvements made by the proposed methodology, the methods
listed in Table 1 will be benchmarked. In Table 1, Best NWP model uses RMSE to select one
series of NWP data with the highest forecast accuracy. SVR-NWP model uses an SVR model to
fuse the NWP data for prediction as Step 1 in Fig. 3. SVR+UKF focuses on short-term prediction
and it is a dynamic methodology that is proposed in .
To compare with other peer algorithms for NWP enhancement, the KF structure that is discussed
in [18, 19] is implemented and benchmarked. In their discussions, the bias of the NWP is modeled
as a high-order polynomial:
$$ e_t = c_{0,t} + c_{1,t}\, x_t^{*} + c_{2,t}\, x_t^{*2} + c_{3,t}\, x_t^{*3} + v_t $$
where $e_t$ is the scalar bias of the NWP at time $t$,
$c_{0,t}, c_{1,t}, c_{2,t}, c_{3,t}$ are the polynomial coefficients, and
$v_t$ is a Gaussian noise term. The above equation is implemented in a KF structure as shown
below:
$$ \mathbf{c}_t = \mathbf{c}_{t-1} + \mathbf{w}_t, \qquad e_t = [1,\, x_t^{*},\, x_t^{*2},\, x_t^{*3}]\, \mathbf{c}_t + v_t $$
$$ \mathbf{c}_t = [c_{0,t},\, c_{1,t},\, c_{2,t},\, c_{3,t}]^{T} $$
To summarize, a list of benchmarking algorithms is tabulated in Table 1. The prediction
accuracies of the algorithms in Table 1 are benchmarked on real-world data in the next
section.
Table 1 List of benchmarking algorithms
Prediction horizon | Algorithms
Short-term prediction (1~6 hours ahead) | SVR+UKF
Long-term prediction (6~24 hours ahead) | Best NWP; SVR-NWP
Short and long-term prediction (1~24 hours ahead) | Best NWP+KF; SVR-NWP+KF; SVR-NWP+MTGP (The proposed method)
4. Results and Discussions
The performance of the proposed method is validated on off-shore wind farm data
collected within half a year. The dataset under study includes the WS series at the turbine nacelle
collected by the SCADA system and the NWP forecast data described above. For the KF-ensembled
methods in , the data from January and February 2017 are used for cross-validation and model
training, while the data from March and September 2017 are used for model performance testing. For
the other methods, which do not need pre-training, the data from March and September are used for model
testing and result benchmarking. At the beginning of the analysis, all the time series from the NWP
and the anemometer measurements at the turbine nacelle are synchronized and pre-processed to have a
1-hour time resolution.
To demonstrate the advantages of the proposed methodology, the prediction model is
validated on two different turbines, and the results are benchmarked in Table 2 ~ Table 5. Table 2 and
Table 3 show the prediction results of WT #1 in April and September. Table 4 and Table 5 show
the prediction results of WT #2 during the same months. The best prediction accuracies at each
prediction horizon are highlighted in bold. Generally, the proposed model SVR-NWP+MTGP
yields the best results compared with all the others.
Comparing the prediction accuracies of different methods at different prediction horizons, the
following findings are highlighted. (1) The SVR+UKF model that is proposed in  demonstrates
improved prediction accuracy at short-term horizons (1~6 hours ahead). However, its
extrapolation accuracy deteriorates very fast at long-term horizons (7~24 hours ahead). (2) Best
NWP and SVR-NWP demonstrate good accuracy at long-term horizons. However, their accuracies
at short-term horizons are not comparable to those of the statistical models; (3) Best NWP+KF and
SVR-NWP+KF demonstrate enhanced prediction accuracy compared with Best NWP and SVR-NWP at
short-term horizons, especially 1~3 hours ahead. This finding indicates that the KF structure in [18, 19]
can effectively enhance the prediction accuracy as expected. However, the performance of such
post-processing steps at long-term prediction horizons is quite unstable. On some occasions, it makes
the prediction accuracy even worse; (4) The proposed method, SVR-NWP+MTGP, demonstrates
excellent accuracy at both short-term and long-term horizons. Its prediction accuracies at
short-term horizons are found to be comparable to those of the SVR+UKF model and even better on
some occasions. More importantly, the long-term prediction accuracy of the proposed method is
better than that of Best NWP, SVR-NWP, Best NWP+KF and SVR-NWP+KF. In addition, the improvements
made by the proposed method are consistent over different turbine locations and different months.
The superiority of the proposed method is further illustrated in Fig. 4 and Fig. 5 by comparing
the RMSE and MAPE of different methods. Fig. 4 and Fig. 5 present the
validation results on two different wind turbines in two different months. At both short-term and
long-term prediction horizons, the proposed method gives the best accuracy in terms of both RMSE
and MAPE. The short-term prediction performance of the proposed method is
comparable with the state-of-the-art approach SVR+UKF in , and the long-term prediction accuracy
of the proposed method is superior to the KF structure that is presented in  for NWP
post-processing.
Table 2 Turbine #1 Comparison Result of April
Table 3 Turbine #1 Comparison Result of September
Table 4 Turbine #2 Comparison Result of April
Table 5 Turbine #2 Comparison Result of September
Fig. 4 Benchmarking of Turbine #1 prediction. (a) RMSE in April; (b) MAPE in April; (c)
RMSE in September; (d) MAPE in September.
Fig. 5 Benchmarking of Turbine #2 prediction. (a) RMSE in April; (b) MAPE in April; (c)
RMSE in September; (d) MAPE in September.
Fig. 6 Detailed Results and Error Bounds of SVR-NWP+MTGP model (a) 1-hour Ahead
Prediction; (b) 6-hour Ahead Prediction; (c) 12-hour Ahead Prediction; (d) 24-hour Ahead
Prediction.
Fig. 6 shows detailed results of the SVR-NWP+MTGP model for a segment randomly selected from
the testing dataset. Several findings are highlighted below: (1) Generally, the proposed model fits
the ground truth well. (2) Shorter horizons lead to smaller error bounds. At long-term horizons,
the proposed model is capable of describing the overall trends of the WS data. (3) In most cases, the true
value of the WS data falls within the predicted error bounds, except for the wind gust around time
points 300-350. However, for 1-hour ahead prediction, the proposed model is capable of predicting the wind gust.
Fig. 7 compares the error distributions of the benchmarking models and the proposed model
in two months. Several key points for discussion in Fig. 7 are as follows: (1) SVR+UKF and
the proposed method give the smallest error in 1-hour ahead prediction, and the proposed method
is slightly better than SVR+UKF; (2) the proposed method also gives the best prediction at the 6-hour
ahead horizon, whereas the prediction error of the SVR+UKF method starts to increase at this
horizon; (3) at 12-hour ahead and 24-hour ahead prediction, SVR-NWP and
SVR-NWP+MTGP give the smallest error; (4) comparing SVR+UKF, Best NWP and the proposed
method, one can find that the proposed method keeps the advantage of SVR+UKF at the
short-term horizons and the advantage of NWP at the long-term horizons; (5) comparing Best
NWP, SVR-NWP and SVR-NWP+MTGP, one can find that the MTGP mainly reduces the
prediction error at the short-term horizons. At long-term horizons, the prediction results are
slightly better than those of SVR-NWP and significantly better than those of NWP; (6) compared
with the recent models in the literature, SVR+UKF and NWP+KF, the proposed method gives the best
overall prediction accuracy in the 24 hours ahead prediction.
Fig. 7 Probabilistic Histograms of Residues with the Distribution Mean and 2σ Interval
Finally, the execution time of the proposed model is discussed. The proposed methodology
and the benchmarking methods are run on a PC with 32 GB of RAM and a 3.50 GHz CPU, running Windows 10
Enterprise. At each time point, the WS data and NWP data of the last 120 hours are included in
the testing procedure of all models. The proposed model is run 10 times on the data from March and
Fig. 8 shows the average execution time and the accuracy for different time block
lengths. The average execution time refers to the average prediction time at each time point. The
time block length denotes the number of hours of past WS and NWP data taken
into consideration, as shown in Fig. 3. The execution time grows as the time block length
increases. Meanwhile, the prediction accuracies in terms of RMSE and MAPE improve as
more WS and NWP data are fed into the model. However, once the time block length reaches
96 hours, the average execution time keeps growing while RMSE and MAPE converge to a stable
level. Therefore, a time block of 96 hours is recommended to balance prediction
performance and execution time.
Fig. 8 Average Execution Time and Prediction Accuracy of Time Block Length. (a) Average
Execution Time vs. Time Block Length (b) RMSE of Different Time Block Length (c) MAPE
of Different Time Block Length.
In this paper, a novel WS prediction method is proposed. The effectiveness and superiority of
the proposed method are validated on a dataset collected from an off-shore wind farm. The final
results suggest the following conclusions. (1) The proposed method can be effectively employed to
improve the prediction accuracy of numerical weather prediction; (2) the proposed method
combines the advantages of time series extrapolation methods for short-term prediction with the
advantages of NWP at long-term prediction horizons; (3) the proposed method reports improved
prediction accuracy comparing with the recently proposed models of SVR+UKF  and
In future works, the proposed method will be integrated into commercial software for
practical use and the sparse Gaussian process methods will be explored to further boost the
 I. Colak, S. Sagiroglu, and M. Yesilbudak, "Data mining and wind power prediction: A
literature review," Renewable Energy, vol. 46, pp. 241-247, 2012.
 J. Jung and R. P. Broadwater, "Current status and future advances for wind speed and
power forecasting," Renewable and Sustainable Energy Reviews, vol. 31, pp. 762-777,
 X. Jia, C. Jin, M. Buzza, Y. Di, D. Siegel, and J. Lee, "A deviation based assessment
methodology for multiple machine health patterns classification and fault detection,"
Mechanical Systems and Signal Processing, vol. 99, pp. 244-261, 2018.
 J. Jin, D. Zhou, P. Zhou, S. Qian, and M. Zhang, "Dispatching strategies for coordinating
environmental awareness and risk perception in wind power integrated system," Energy,
vol. 106, pp. 453-463, 2016.
 X. Zhang, W. Cai, and Z. Gan, "Optimal dispatching strategies of active power for DFIG
wind farm based on GA algorithm," in Control and Decision Conference (CCDC), 2016
Chinese, 2016, pp. 6094-6099.
 P. Eecen, H. Braam, L. Rademakers, and T. Obdam, "Estimating costs of operations and
maintenance of offshore wind farms," in European Wind Energy Conference and
Exhibition, Milan, Italy, 2007.
 A. Kovács, G. Erdös, L. Monostori, and Z. J. Viharos, "Scheduling the maintenance of
wind farms for minimizing production loss," IFAC Proceedings Volumes, vol. 44, pp.
 A. Kovacs, G. Erdős, Z. J. Viharos, and L. Monostori, "A system for the detailed
scheduling of wind farm maintenance," CIRP Annals-Manufacturing Technology, vol.
60, pp. 497-501, 2011.
 X. Jia, C. Jin, M. Buzza, W. Wang, and J. Lee, "Wind turbine performance degradation
assessment based on a novel similarity metric for machine performance curves,"
Renewable Energy, vol. 99, pp. 1191-1201, 2016.
 H. Liu, H.-q. Tian, and Y.-f. Li, "Comparison of two new ARIMA-ANN and ARIMA-
Kalman hybrid methods for wind speed prediction," Applied Energy, vol. 98, pp. 415-
 O. B. Shukur and M. H. Lee, "Daily wind speed forecasting through hybrid KF-ANN
model based on ARIMA," Renewable Energy, vol. 76, pp. 637-647, 2015.
 M. Monfared, H. Rastegar, and H. M. Kojabadi, "A new strategy for wind speed
forecasting using artificial intelligent methods," Renewable Energy, vol. 34, pp. 845-848,
 X. Jia, Y. Di, J. Feng, Q. Yang, H. Dai, and J. Lee, "Adaptive virtual metrology for
semiconductor chemical mechanical planarization process using GMDH-type polynomial
neural networks," Journal of Process Control, vol. 62, pp. 44-54, 2018.
 S. A. Pourmousavi Kani and M. M. Ardehali, "Very short-term wind speed prediction: A
new artificial neural network–Markov chain model," Energy Conversion and
Management, vol. 52, pp. 738-745, 2011.
 K. Chen and J. Yu, "Short-term wind speed prediction using an unscented Kalman filter
based state-space support vector regression approach," Applied Energy, vol. 113, pp. 690-
 G. Santamaría-Bonfil, A. Reyes-Ballesteros, and C. Gershenson, "Wind speed forecasting
for wind farms: A method based on support vector regression," Renewable Energy, vol.
85, pp. 790-809, 2016.
 S. Salcedo-Sanz, E. G. Ortiz-García, Á. M. Pérez-Bellido, A. Portilla-Figueras, and L.
Prieto, "Short term wind speed prediction based on evolutionary support vector
regression algorithms," Expert Systems with Applications, vol. 38, pp. 4052-4057,
 P. Louka, G. Galanis, N. Siebert, G. Kariniotakis, P. Katsafados, I. Pytharoulis, et al.,
"Improvements in wind speed forecasts for wind power prediction purposes using
Kalman filtering," Journal of Wind Engineering and Industrial Aerodynamics, vol. 96,
pp. 2348-2362, 2008.
 F. Cassola and M. Burlando, "Wind speed and wind energy forecast through Kalman
filtering of Numerical Weather Prediction model output," Applied Energy, vol. 99, pp.
154-166, 2012.
 C. E. Rasmussen, "Gaussian processes for machine learning," 2006.
 R. R. Richardson, M. A. Osborne, and D. A. Howey, "Gaussian process regression for
forecasting battery state of health," Journal of Power Sources, vol. 357, pp. 209-219,
 H. Liu, J. Cai, and Y.-S. Ong, "Remarks on Multi-Output Gaussian Process Regression,"
Knowledge-Based Systems, 2018.
 E. V. Bonilla, K. M. Chai, and C. Williams, "Multi-task Gaussian process prediction," in
Advances in neural information processing systems, 2008, pp. 153-160.
 R. Dürichen, M. A. Pimentel, L. Clifton, A. Schweikard, and D. A. Clifton, "Multitask
gaussian processes for multivariate physiological time-series analysis," IEEE
Transactions on Biomedical Engineering, vol. 62, pp. 314-322, 2015.