
Abstract

Modeling wind speed has a significant impact on wind energy systems and has attracted attention from numerous researchers. Wind speed prediction is considered a challenging task because of the inherently nonlinear and random character of wind. Therefore, machine learning models have gained popularity in this field. In this paper, three machine learning approaches, Gaussian process regression (GPR), bagged regression trees (BTs) and support vector regression (SVR), were applied to predict the weekly wind speed (maximum, mean, minimum) of a target station using data from other stations, which were specified as reference stations. Daily wind speed data, obtained from the Malaysian Meteorological Department for 14 measuring stations in Malaysia and covering the period from 2000 to 2019, were used. The results showed that predictions of the mean weekly wind speed were more accurate than predictions of the maximum and minimum wind speed. In general, the GPR model could effectively predict the weekly wind speed of the target station using the measured data of other stations. Errors found in this model were within acceptable limits. The predictions of this model were compared with the measured data, and only Kota Kinabalu station showed an unacceptable range of prediction. To assess the prediction performance of the proposed model, two models were used for comparison: the BTs model and the SVR model. Although the comparison of GPR with the BTs model at Kuching station showed slightly better performance for the BTs model in maximum and minimum wind speed prediction, the prediction outcomes of the other 13 stations showed better performance for the proposed GPR model. Moreover, the proposed model generated smaller prediction errors than the SVR model at all stations.
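The idea of predicting a target station from reference stations with GPR can be illustrated with a minimal sketch. It is not the authors' implementation: the squared-exponential kernel, the fixed length scale, the noise level, the centering of the targets and the toy numbers are all assumptions chosen for illustration, and the function names are hypothetical. Each input row holds the weekly wind speeds of two made-up reference stations, and the target is the weekly wind speed at the target station.

```python
import math

def rbf(a, b, length_scale):
    """Squared-exponential kernel between two feature vectors (assumed kernel)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-0.5 * sq_dist / length_scale ** 2)

def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower

def forward_sub(lower, rhs):
    """Solve L z = rhs for z."""
    out = [0.0] * len(rhs)
    for i in range(len(rhs)):
        out[i] = (rhs[i] - sum(lower[i][k] * out[k] for k in range(i))) / lower[i][i]
    return out

def backward_sub(lower, rhs):
    """Solve L^T x = rhs for x."""
    n = len(rhs)
    out = [0.0] * n
    for i in reversed(range(n)):
        out[i] = (rhs[i] - sum(lower[k][i] * out[k] for k in range(i + 1, n))) / lower[i][i]
    return out

def gpr_fit(inputs, targets, length_scale=2.0, noise=1e-6):
    """Fit a GP with a constant mean (the sample mean) and an RBF kernel."""
    n = len(inputs)
    mean_y = sum(targets) / n
    gram = [[rbf(inputs[i], inputs[j], length_scale) + (noise if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    lower = cholesky(gram)
    alpha = backward_sub(lower, forward_sub(lower, [t - mean_y for t in targets]))
    return {"inputs": inputs, "alpha": alpha, "lower": lower,
            "mean_y": mean_y, "length_scale": length_scale}

def gpr_predict(model, x_new):
    """Return the predictive mean and latent variance at x_new."""
    k_star = [rbf(x_new, x, model["length_scale"]) for x in model["inputs"]]
    mean = model["mean_y"] + sum(k * a for k, a in zip(k_star, model["alpha"]))
    v = forward_sub(model["lower"], k_star)
    variance = max(1.0 - sum(vi * vi for vi in v), 0.0)
    return mean, variance

# Toy data (hypothetical): weekly speeds at two reference stations -> target station.
reference = [[2.0, 3.0], [3.0, 4.0], [5.0, 5.5], [6.0, 7.0]]
target = [2.5, 3.4, 5.2, 6.6]
model = gpr_fit(reference, target)
mean_seen, var_seen = gpr_predict(model, [3.0, 4.0])      # a training input
mean_far, var_far = gpr_predict(model, [100.0, 100.0])    # far from all data
```

With near-zero noise the model reproduces the training targets almost exactly, while far from the data it falls back to the sample mean with high variance. That predictive variance is one reason GPR is attractive when uncertainty matters.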
Article
The performance of four different machine learning–based approaches (long short-term memory (LSTM), support vector machine regression (SVMR), Gaussian process regression (GPR), and multi-gene genetic programming (MGGP) models) in estimating long-term monthly temperatures was investigated in this study. Data from 250 measuring stations in Turkey were used in the present trials. The month number of the year, latitude, longitude, and altitude were used as input variables of the models. The error statistics mean absolute error (MAE), mean square error (MSE), root mean square error (RMSE), coefficient of determination (R2), and Nash–Sutcliffe efficiency coefficient (NSE) were used to compare the four models. In terms of these five error statistics, the models yielded similar outcomes. Therefore, Taylor and violin diagrams were used to assess how close the model-estimated values were to the measured data. The Taylor and violin diagrams revealed that the GPR model performed better in estimating maximum and average temperatures than the other three models. Also, the Kruskal–Wallis test indicated that the measured and estimated data came from the same distribution. At the end of this study, the efficiency of the methods recommended for comparisons was proven.
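The five error statistics named above can be computed as in the following sketch. It assumes R2 is taken as the squared Pearson correlation between observed and predicted values, which is a common convention but is not stated in the abstract. The function name and the sample numbers are hypothetical.

```python
import math

def error_statistics(observed, predicted):
    """Return MAE, MSE, RMSE, R2 (squared Pearson correlation, assumed) and NSE."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    ss_obs = sum((o - mean_obs) ** 2 for o in observed)
    ss_pred = sum((p - mean_pred) ** 2 for p in predicted)
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    r2 = cov * cov / (ss_obs * ss_pred)
    # NSE compares the squared errors with the variance of the observations.
    nse = 1.0 - sum(e * e for e in errors) / ss_obs
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2, "NSE": nse}

# Hypothetical example: predictions with a constant bias of +0.5.
stats = error_statistics([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5])
# MAE = 0.5, MSE = 0.25, RMSE = 0.5, R2 = 1.0, NSE = 0.8
```

The biased example shows why several statistics are reported together: the correlation-based R2 is perfect, yet the NSE and the error magnitudes still expose the systematic offset.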
Article
In this research, the monthly wind speed time series of Kirsehir was investigated using stand-alone, hybrid and ensemble models. Artificial neural networks, Gaussian process regression, support vector machines and multivariate adaptive regression splines were employed as stand-alone machine learning models, while the discrete wavelet transform was utilized as a pre-processing technique to create hybrid models. Moreover, for the first time in wind speed prediction, we generated a multi-stage ensemble model by using the M5 Model Tree (M5) algorithm to increase model accuracy. Two major tasks were considered necessary: the first is to obtain the lag times by using autocorrelation functions, and the second is to determine the optimum mother wavelet as well as the decomposition level to reduce the uncertainties in wavelet modeling. The results revealed that the hybrid wavelet models outperformed the stand-alone models, while a significant improvement was also observed in the M5 ensemble models, as the highest Nash–Sutcliffe efficiency coefficient values were obtained in the M5 hybrid wavelet multi-stage ensemble models for each lead time prediction. The findings of the study were assessed with respect to various performance indicators and the Kruskal–Wallis test to indicate whether the results are statistically significant. The proposed multi-stage ensemble framework was also benchmarked against classical tree-based ensembles, such as Random Forest, AdaBoost and XGBoost.
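Selecting lag times from the autocorrelation function can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how lags were chosen, so the sketch keeps every lag whose sample autocorrelation exceeds the usual approximate 95% bound of 1.96/sqrt(n). The function names and the synthetic series are hypothetical.

```python
import math

def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    numer = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return numer / denom

def significant_lags(series, max_lag):
    """Lags whose |ACF| exceeds the approximate 95% bound (assumed selection rule)."""
    bound = 1.96 / math.sqrt(len(series))
    return [lag for lag in range(1, max_lag + 1)
            if abs(autocorrelation(series, lag)) > bound]

# Synthetic series with a period of four steps.
series = [1.0, 0.0, -1.0, 0.0] * 10
lags = significant_lags(series, max_lag=4)  # -> [2, 4]
```

The selected lags would then define which past values are fed to a model as inputs.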
Article
This study aims to predict daily ionospheric Total Electron Content (TEC) using a Gaussian Process Regression (GPR) model and Multiple Linear Regression (MLR). For this purpose, daily TEC values from 2015 to 2017 were collected from two Global Navigation Satellite System (GNSS) stations in Turkey. The performance of the GPR model was compared with that of the classical MLR model using Taylor diagrams and relative error graphs. Six models with various input parameters were built for both the GPR and MLR techniques. The results showed that although the models perform similarly, the GPR model estimated the TEC values more precisely at one and two days ahead. Therefore, the GPR model is recommended to forecast the TEC values at the corresponding GNSS stations over Turkey.
Article
This paper introduces an R package, ForecastTB, that can be used to compare the accuracy of different forecasting methods as related to the characteristics of a time series dataset. ForecastTB is a plug-and-play structured module, and several forecasting methods can be included with simple instructions. The proposed test-bench is not limited to the default forecasting and error metric functions, and users are able to append, remove, or choose the desired methods as required. In addition, several plotting functions and statistical performance metrics are provided to visualize the comparative performance and accuracy of different forecasting methods. Furthermore, this paper presents real application examples with natural time series datasets (i.e., wind speed and solar radiation) to demonstrate how the ForecastTB package can be used to compare forecasting methods as affected by the characteristics of a dataset. Modeling results indicated the applicability and robustness of the proposed R package ForecastTB for time series forecasting.
Article
Wind speed is the main component of wind power. Wind speed forecasting is therefore of great importance. It makes it possible to plan the dispatch, determine the hours of storage needed and the amount of stored energy that should be used, and avoid the large fluctuations in the electrical grid caused by the nature of renewable energy resources. In this paper, we propose four hybrid models based on Support Vector Machine (SVM) and Artificial Neural Networks (ANNs), or simply Neural Networks (NN), for wind speed forecasting. Ordinary Least Squares (OLS) analysis is used to select the parameters that most influence wind speed. Then, the Support Vector Machine and Artificial Neural Network models are tuned by a Genetic Algorithm (GA) and Particle Swarm Optimization (PSO). The performance of these models is evaluated using three statistical indicators: the Mean Square Error (MSE), Mean Error (ME) and Mean Absolute Error (MAE). The results show a better performance of the neural model compared to the support vector machine.
Article
Evaporation is one of the most critical factors in agricultural, hydrological, and meteorological studies. Due to the interactions of multiple climatic factors, evaporation is considered a complex and nonlinear phenomenon to model. Thus, machine learning methods have gained popularity in this realm. In the present study, four machine learning methods, Gaussian Process Regression (GPR), K-Nearest Neighbors (KNN), Random Forest (RF) and Support Vector Regression (SVR), were used to predict pan evaporation (PE). Meteorological data including PE, temperature (T), relative humidity (RH), wind speed (W), and sunny hours (S) were collected from 2011 through 2017. The accuracy of the studied methods was determined using the statistical indices of Root Mean Squared Error (RMSE), correlation coefficient (R) and Mean Absolute Error (MAE). Furthermore, Taylor charts were utilized for evaluating the accuracy of the mentioned models. The results of this study showed that at Gonbad-e Kavus, Gorgan and Bandar Torkman stations, respectively, GPR with RMSE of 1.521 mm/day, 1.244 mm/day, and 1.254 mm/day; KNN with RMSE of 1.991 mm/day, 1.775 mm/day, and 1.577 mm/day; RF with RMSE of 1.614 mm/day, 1.337 mm/day, and 1.316 mm/day; and SVR with RMSE of 1.55 mm/day, 1.262 mm/day, and 1.275 mm/day had more appropriate performances in estimating PE values. It was found that GPR for Gonbad-e Kavus station with input parameters of T, W and S, and GPR for Gorgan and Bandar Torkman stations with input parameters of T, RH, W and S, had the most accurate predictions and were proposed for precise estimation of PE. The findings of the current study indicated that PE values may be accurately estimated with a few easily measured meteorological parameters.
Article
High-precision and reliable wind speed forecasting is a challenge for meteorologists. We used multiple nonparametric tree-based machine learning techniques for predicting the maximum wind speed at 10 m using selected convective weather variables. The analysis is based on 127 convective storms from 2005 to 2013. The study evaluated two error models, the Bayesian Additive Regression Trees (BART) and the Quantile Regression Forests (QRF), and compared them in terms of point estimates and prediction intervals. The error model performances were evaluated using error metrics that assess both the bias and random error of point estimates, and using ensemble verification statistics for the prediction intervals. The study showed that error modeling based on QRF is superior to BART, especially in terms of point estimate and prediction interval results. Wind speed prediction through QRF was successfully verified using systematic and random error metrics, and ensemble verification statistics of the corresponding prediction intervals. The model generated realizations of wind speed that successfully encapsulated the reference wind speed and notably reduced systematic and random error. The predicted wind speed from QRF can potentially support emergency preparedness efforts associated with severe weather impacts.
Article
Strong winds can cause train derailment and truck rollover, which may result in service interruption, serious injury, and even loss of life. Wind-induced accidents are closely related to the maximum value of short-term wind speed, which highlights the importance of regulating vehicle velocity based on wind gusts. Accurate prediction of wind gusts is essential to control the vehicle velocity ahead of time, thereby reducing the risk of accidents. The majority of existing approaches focus on the prediction of mean wind speed. In contrast, fairly limited research applies machine learning models to forecast wind gusts, which have strong time-varying characteristics and volatility. In this study, a probabilistic approach is presented to forecast wind gusts using ensemble learning. The ensemble model includes three machine learning models, namely, random forest (RF), long short-term memory (LSTM), and Gaussian process regression (GPR). The proposed probabilistic approach allows for the quantification of uncertainty in the prediction of wind gusts. The feasibility of the ensemble model is illustrated by using field wind measurements acquired from a long-span cable-stayed bridge. Compared to the persistence, RF, LSTM, GPR, averaging, and gradient boosting decision tree models, the proposed ensemble model exhibits higher accuracy and generalization performance.
Article
Precise prediction of wind power is important for sustainably integrating wind power into a smart grid. The need for short-term predictions grows with increasing installed capacity. The main contribution of this work is adopting a bagging ensemble of decision trees for wind power prediction. The choice of this regression approach is motivated by its ability to combine many relatively weak single trees to reach a higher prediction performance than single regressors. Moreover, it reduces the overall error and has the capacity to merge numerous models. The performance of bagged trees for predicting wind power has been compared to four commonly known prediction methods, namely multivariate linear regression, support vector regression, principal component regression, and partial least squares regression. Real measurements recorded every ten minutes from an actual wind turbine are used to illustrate the prediction quality of the studied methods. Results showed that the bagged trees regression approach reached the highest prediction performance, with a coefficient of determination of 0.982. The results showed that the bagged trees approach is followed by support vector regression with a Gaussian kernel, the same model when using a quadratic kernel, and the multivariate linear regression, partial least squares, and principal component regression gave the lowest prediction performance. The models investigated in this study can serve as a helpful tool for model-based anomaly detection in wind turbines.
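The bagging idea described above, averaging many weak trees that are each fitted to a bootstrap resample, can be sketched in a few lines. This is an illustration only, not the authors' model: it assumes one-dimensional inputs, uses one-split trees (stumps) as the weak learners, and all function names and data are hypothetical.

```python
import random

def fit_stump(xs, ys):
    """Fit a one-split regression tree on 1-D inputs by minimising squared error."""
    best = None
    for threshold in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        if not left or not right:
            continue
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        # All resampled inputs are identical: fall back to a constant prediction.
        mean_y = sum(ys) / len(ys)
        return (float("inf"), mean_y, mean_y)
    return best[1:]

def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean

def fit_bagged_stumps(xs, ys, n_trees=25, seed=0):
    """Fit each stump on a bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    ensemble = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        ensemble.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return ensemble

def predict_bagged(ensemble, x):
    """Average the predictions of all stumps in the ensemble."""
    return sum(predict_stump(s, x) for s in ensemble) / len(ensemble)

# Hypothetical step-shaped data: low output below x = 5, high output from x = 5.
xs = [float(i) for i in range(10)]
ys = [1.0 if x < 5 else 3.0 for x in xs]
ensemble = fit_bagged_stumps(xs, ys)
low = predict_bagged(ensemble, 0.0)
high = predict_bagged(ensemble, 9.0)
```

Because each stump sees a different resample, the split points vary from tree to tree, and averaging smooths the prediction near the step. Full-depth trees and multivariate inputs follow the same pattern.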