## No full-text available

To read the full-text of this research,

you can request a copy directly from the authors.

... Many models capable of forecasting solar irradiance over different horizons have been developed and proposed. The forecasting horizons of the existing models range from short to medium to long term, where short-term models forecast the solar irradiance from 10 min up to 30 min ahead [4][5][6][7][8][9][10][11][12][13][14][15][16]. Most of the available models fall in the medium-horizon category and forecast the solar irradiance at an hourly time step. ...

... Of all the available models that have been studied, the majority have a root mean square error (RMSE) value between 50 W m⁻² and 80 W m⁻² [4,5,7,14,22,29,36,37,38,39,47]. Some models have low accuracy, with RMSE values of 165 W m⁻² and above [24,35], and ...

... R. Arumugham and P. Rajendran / Renewable Energy xxx (xxxx) [1][2][3][4][5][6][7][8][9] some with higher accuracy, with RMSE values ranging from 10 W m⁻² to 30 W m⁻² [15,17,21,26,42]. Overall, the available models are claimed to have good accuracy in forecasting solar irradiance, with mean absolute error (MAE) values between 20 W m⁻² and 60 W m⁻² [5,14,15,26,37,38] and coefficient of determination (R²) values of 0.93 and above [1,21,28,29,31,37]. Models with lower accuracy have MAE values as high as 130 W m⁻² [35] and R² values between 0.8 and 0.9 [29,38,39]. ...

The focus of this study is to develop a highly accurate formulation to estimate the day number (DN), solar declination angle (SOLDEC) and solar altitude angle (SOLALT), and to predict the diffuse horizontal irradiance (DHI), direct normal irradiance (DNI) and global horizontal irradiance (GHI) for any location around the world at any time of the day, for both short-term and long-term periods. Regression analysis is done using 12 continuous years of satellite-measured historical solar irradiance, weather and solar angle data at a temporal resolution of 10 minutes for 12 cities around the world: Kuala Lumpur, Auckland, Tokyo, Riyadh, London, Accra, Antananarivo, Brasilia, Lima, Quito, Ottawa and Honolulu. The models generated through the regression analysis perform better than existing models in predicting the solar irradiance; hence, these models are efficient and reliable for universal global application.

... That is why accurate forecasting of radiation is one of the challenges for scientists, not only for prediction but for design as well. Currently, there are many approaches for forecasting irradiance [5][6][7][8][9][10][11][12][13][14][15][16][17][18][19][20][21]. They range from mathematical formulas and combinations of them up to the analysis of satellite images using artificial intelligence. ...

... Yet, the results were surprisingly not as accurate as expected [7]. In this review paper, the three most commonly used approaches for forecasting irradiation [7][8][9][10][11][12][13][14][15][16][17][18][19][20][21] were examined: exponential smoothing, the seasonal method and neural networks. The same historical database was used for all three, and a comparison between them was then carried out. ...

... (c) Looking at a particular day and finding the median of that day for each year. (d) Multiplying the value from step (c) by the last part of (b), which gives the forecast for each day of the future period [10]. ...

... Furthermore, Benali et al. [17] found that the RF algorithm gives accurate hourly forecasts for the different components of solar radiation compared to the ANN and smart persistence (SP) models. Urraca et al. [18] used the SVR model for forecasting global solar irradiation 1 h ahead in Southeast Spain. The results showed the high performance of the SVR model compared to the RF and classical linear models. ...

... In fact, the radial basis function (RBF) has been widely used by researchers in the field of solar radiation forecasting [3,11,14,18]. For this reason, the RBF has been utilized as a kernel function whose expression is formulated as follows: ...
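The expression itself is elided in the snippet above, so it is not reproduced here. As a minimal sketch, the Gaussian RBF kernel is commonly written as K(x, z) = exp(-γ‖x − z‖²); the function names and the `gamma` parameterization below are illustrative assumptions (some authors use 1/(2σ²) instead of γ), not the cited paper's notation.

```python
import math

def rbf_kernel(x, z, gamma=0.5):
    """Gaussian RBF kernel K(x, z) = exp(-gamma * ||x - z||^2).

    gamma is an assumed parameterization; the cited paper may use sigma instead.
    """
    if len(x) != len(z):
        raise ValueError("x and z must have the same length")
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def gram_matrix(samples, gamma=0.5):
    """Pairwise kernel values for a list of feature vectors."""
    return [[rbf_kernel(a, b, gamma) for b in samples] for a in samples]

# Hypothetical feature vectors (e.g. normalized irradiance and temperature).
samples = [[0.0, 0.0], [1.0, 1.0], [0.5, 0.2]]
K = gram_matrix(samples)
```

Identical inputs give a kernel value of 1, and the value decays toward 0 as the inputs move apart, which is what lets a kernel method such as SVR fit nonlinear relationships.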

Photovoltaic power generation depends significantly on solar radiation, which is variable and unpredictable in nature. As a result, the production of electricity from photovoltaic power cannot be guaranteed permanently during the operational phase. Forecasting global solar radiation can play a key role in overcoming this drawback of intermittency. This paper proposes a new hybrid method based on machine learning (ML) algorithms and a daily classification technique to forecast global solar radiation 1 h ahead in the city of Évora. Firstly, several comparative studies have been done between random forest (RF), gradient boosting (GB), support vector machines (SVM), and artificial neural network (ANN). These comparisons were made using annual, seasonal, and daily testing sets in order to determine the best ML algorithm under different meteorological conditions. Subsequently, the daily classification technique has been applied to classify the original training set into sunny and cloudy training subsets in order to enhance the forecasting accuracy. The evaluation of the proposed ML algorithms was carried out using the normalized root mean square error (nRMSE) and the normalized mean absolute error (nMAE). The results of the seasonal comparison show that the RF model performs well for spring and autumn seasons with nRMSE equaling 22.53% and 23.42%, respectively, while the SVR model gives good results for winter and summer seasons with nRMSE equaling 24.31% and 8.41%, respectively. In addition, the daily comparison demonstrates that the RF model performs well for cloudy days with nRMSE = 41.40%, while the SVR model yields good results for sunny days with nRMSE = 8.88%. The results show that the daily classification technique enhances the forecasting accuracy of ML models. Furthermore, this study demonstrates that the forecasting accuracy of ML algorithms depends significantly on sky conditions.
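The abstract does not state how nRMSE and nMAE are normalized. The sketch below assumes the common convention of dividing by the mean of the observed values and reporting a percentage; other papers normalize by the range or the maximum, so the function names and that choice are assumptions.

```python
import math

def nrmse(observed, predicted):
    """RMSE divided by the mean observation, in percent (assumed convention)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    rmse = math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)
    return 100.0 * rmse / mean_obs

def nmae(observed, predicted):
    """MAE divided by the mean observation, in percent (assumed convention)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mae = sum(abs(p - o) for o, p in zip(observed, predicted)) / n
    return 100.0 * mae / mean_obs

# Hypothetical hourly irradiance values in W/m^2, for illustration only.
observed = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 310.0]
print(nrmse(observed, predicted), nmae(observed, predicted))
```

Because both metrics are scaled by the mean observation, low-irradiance conditions tend to produce larger percentages for the same absolute error, which is one possible reading of the gap between the sunny-day and cloudy-day figures.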

... The solar irradiance prediction models can be broadly divided into three categories: (i) physical (ii) statistical and (iii) machine learning methods. Despite the pros and cons of each forecasting method, machine learning methods, a sub branch of artificial intelligence (AI) techniques have gained enormous attention due to their prediction accuracy and reliability [2]. In this work, machine learning methods are presented for solar irradiance forecasting. ...

... The performance of the proposed model is evaluated using normalized root mean square error (nRMSE) and forecast skill, which are formulated by Eqs. (2) and (3): ...
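Eqs. (2) and (3) are not included in the snippet. A widely used definition of forecast skill, assumed here rather than taken from the cited paper, is 1 − RMSE_model / RMSE_reference, with persistence as the reference. The function names are illustrative.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between two equally long sequences."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)

def forecast_skill(observed, model_pred, reference_pred):
    """Skill of a model relative to a reference forecast (assumed definition).

    1.0 means a perfect model, 0.0 means no improvement over the reference,
    and negative values mean the model is worse than the reference.
    """
    ref_error = rmse(observed, reference_pred)
    if ref_error == 0.0:
        raise ValueError("reference forecast is perfect; skill is undefined")
    return 1.0 - rmse(observed, model_pred) / ref_error

# Hypothetical values: the model halves the RMSE of the reference.
observed = [100.0, 200.0, 300.0]
model_pred = [110.0, 190.0, 310.0]
reference_pred = [120.0, 180.0, 320.0]
skill = forecast_skill(observed, model_pred, reference_pred)
```

Under this definition, a reported skill of about 50% over persistence would correspond to roughly halving the persistence RMSE.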

Solar irradiance prediction is an emerging area of research for various applications in the renewable energy domain. So far, numerous physical models, statistical models and machine learning based techniques have been utilized to accomplish solar irradiance prediction. However, existing models are not good at learning long-term historical dependencies, which compromises the modeling of non-linear solar irradiance patterns. In this paper, a novel prediction model from the deep neural network family (Long Short Term Memory, LSTM) is used to predict hourly solar irradiance with enhanced prediction accuracy by considering long-term historical data dependencies. To provide an extensive and robust assessment of the proposed model, the present study employs National Solar Radiation Database (NSRDB) data for evaluating prediction accuracy at 7 locations in India with different climatic conditions. The proposed model is compared with Feed Forward Neural Network (FFNN), Extreme Gradient Boost (XGBoost) and Persistence models across a broad coverage of geographical regions. Empirical outcomes suggest that the proposed LSTM model outperforms the other models, with an average forecast skill of 50.72% over the persistence model.

... Solar irradiance forecasting is a broad modeling problem that can be subdivided based on temporal and spatial/geographical resolutions for which the irradiance is being predicted. Various physics-based and statistical methods (including traditional machine learning methods) have been employed in the literature and are currently being used in the industry for different temporal and spatial resolutions [3][4]. A recent detailed review of solar-forecasting literature by Yang et al. [5] classifies solar-forecasting methods into five classes, namely regression, numerical weather prediction, time series, image-based forecasting, and machine learning. ...

... There are two parts to this process. First is the input gate layer, which decides which values will be updated, as shown in equation (4). Next, a vector of new candidate values, $\tilde{C}_t$, is created by a hyperbolic tangent (tanh) layer, which could be added to the state, as shown in equation (5). ...
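Equations (4) and (5) are not part of the snippet. The sketch below assumes the standard LSTM formulation, in which the input gate is i_t = σ(W_i·[h_{t−1}, x_t] + b_i) and the candidate vector is tanh(W_C·[h_{t−1}, x_t] + b_C). All names, shapes and weights are hypothetical and chosen only to make the two steps concrete.

```python
import math

def sigmoid(v):
    """Logistic function, squashing a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))

def affine(weights, inputs, bias):
    """Compute W @ inputs + bias for a list-of-rows weight matrix."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]

def input_gate_step(h_prev, x_t, W_i, b_i, W_c, b_c):
    """Return the input gate i_t and the candidate values (standard LSTM form)."""
    concat = h_prev + x_t  # concatenation [h_{t-1}, x_t]
    i_t = [sigmoid(v) for v in affine(W_i, concat, b_i)]        # which values to update
    c_tilde = [math.tanh(v) for v in affine(W_c, concat, b_c)]  # proposed new values
    return i_t, c_tilde

# Hypothetical sizes: one hidden unit, one input feature.
i_t, c_tilde = input_gate_step(
    h_prev=[0.2], x_t=[0.8],
    W_i=[[0.5, -0.3]], b_i=[0.1],
    W_c=[[0.4, 0.9]], b_c=[0.0],
)
```

In the standard formulation, the element-wise product of the two outputs is what gets added to the cell state after the forget gate has scaled the previous state.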

The amount of energy generation from renewable energy sources, particularly from wind and photovoltaic plants, has seen a rapid rise in the last decade. Reliable and economic operation of power systems thus requires an accurate estimate of the power generated from renewable generation plants, particularly those that are intermittent in nature. This has accentuated the need to find an efficient and scalable scheme for forecasting meteorological parameters, such as solar radiation, with better accuracy. For short-term solar irradiance forecasting, the traditional point forecasting methods are rendered less useful due to the non-stationary characteristic of solar power. In this research work, we propose a unified architecture for multi-time-scale predictions for intra-day solar irradiance forecasting using recurrent neural networks (RNN) and long-short-term memory networks (LSTMs). This paper also lays out a framework for extending this modeling approach to intra-hour forecasting horizons, thus making it a multi-time-horizon forecasting approach capable of predicting intra-hour as well as intra-day solar irradiance. We develop an end-to-end pipeline to effectuate the proposed architecture. The robustness of the approach is demonstrated with case studies conducted for geographically scattered sites across the United States. The predictions demonstrate that our proposed unified architecture based approach is effective for multi-time-scale solar forecasts and achieves a lower root-mean-square prediction error when benchmarked against the best-performing methods documented in the literature that use separate models for each time-scale during the day. The proposed method enables multi-time-horizon forecasts with real-time inputs, which have a significant potential for practical industry applications in the evolving grid.

... Ref. [155] introduces two machine learning models, Support Vector Regression and Random Forest, which take meteorological records as input. The models show significant improvement over persistence and other reference models [155]. A neuro-fuzzy estimator based on meteorological input for medium-term forecasting is reported by [156]. ...

... The second group directly forecasts the PV power output. Notable examples of the first group are [6][7][8][9][10]. For example, Lorenz et al. [6] predicted the hourly PV power output for up to 2 days ahead based on the weather forecasts of solar irradiance. ...

... They also derived regional power forecasts by up-scaling the forecasts of representative PV systems. Urraca et al. [10] predicted the solar irradiance 1 h ahead based on recorded meteorological data and computed solar variables. They developed two types of models: fixed and moving, and applied a number of prediction algorithms—Support Vector Regression (SVR), random forests, linear regression and nearest neighbor. ...

Solar energy generated from PhotoVoltaic (PV) systems is one of the most promising types of renewable energy. However, it is highly variable as it depends on the solar irradiance and other meteorological factors. This variability creates difficulties for the large-scale integration of PV power in the electricity grid and requires accurate forecasting of the electricity generated by PV systems. In this paper we consider 2D-interval forecasts, where the goal is to predict summary statistics for the distribution of the PV power values in a future time interval. 2D-interval forecasts have been recently introduced, and they are more suitable than point forecasts for applications where the predicted variable has a high variability. We propose a method called NNE2D that combines variable selection based on mutual information and an ensemble of neural networks, to compute 2D-interval forecasts, where the two interval boundaries are expressed in terms of percentiles. NNE2D was evaluated for univariate prediction of Australian solar PV power data for two years. The results show that it is a promising method, outperforming persistence baselines and other methods used for comparison in terms of accuracy and coverage probability.

... SVM has been widely employed by researchers for solar radiation prediction due to its good generalization capabilities and its capacity to deal with nonlinear time series. Nonlinear SVM regression is based on kernel functions that map inputs into a high-dimensional feature space, in which linear regression is carried out by minimizing the insensitive loss function (Urraca, 2016). Figure 1 shows the nonlinear transformation using a kernel function. ...
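To make the "insensitive loss function" concrete, here is a minimal sketch of the ε-insensitive loss commonly used in support vector regression: errors smaller than ε cost nothing, and larger errors are penalized linearly. The function names and the default ε are illustrative assumptions, and the full SVR objective also contains a regularization term that is not shown.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Zero inside the epsilon tube, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

def mean_epsilon_insensitive_loss(targets, predictions, epsilon=0.1):
    """Average loss over a set of samples."""
    losses = [epsilon_insensitive_loss(t, p, epsilon)
              for t, p in zip(targets, predictions)]
    return sum(losses) / len(losses)

# Hypothetical normalized irradiance targets and predictions.
targets = [0.50, 0.80, 0.30]
predictions = [0.55, 0.60, 0.30]
avg_loss = mean_epsilon_insensitive_loss(targets, predictions, epsilon=0.1)
```

Only the second sample falls outside the tube here, so it alone contributes to the average; this tolerance for small errors is what makes the fitted function depend on a subset of the training points (the support vectors).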

Photovoltaic production is highly dependent on solar radiation time series, which are sporadic. Grid operators have a significant problem integrating photovoltaic energy sources into the electrical grid due to the unpredictability of solar radiation. Forecasting global solar radiation can mitigate the intermittency caused by the variability of weather conditions. It allows the grid operators to predict photovoltaic power production to facilitate the planning and dispatching tasks of the electric grid. In this work, we have proposed a new hybrid method to predict solar radiation one hour ahead in Évora city (Portugal). The hybrid model is based on the daily classification of global solar radiation and machine learning algorithms such as support vector machines (SVM) and artificial neural network (ANN). We have collected five years of global horizontal solar radiation data from the meteorological station of Évora city. We have evaluated the performance of the proposed model using normalized root mean square error (nRMSE) and normalized mean absolute error (nMAE). The results show that, for sunny days, the SVM model performs better than the ANN model with nRMSE = 9.15% and nMAE = 4.65%, while for cloudy days, the ANN model gives better results than the SVM model with nRMSE = 42.09% and nMAE = 25.1%. Moreover, we have carried out a performance comparison with the recent literature. The results show the superiority of the proposed hybrid model compared to models from the literature.

... Furthermore, the analysis of the literature covered by the present review paper does not include the techniques used for solar irradiance forecasting. To that end, readers are invited to consult some useful references on solar irradiance forecasting such as [73,74]. ...

The management of clean energy is usually key to environmental, economic, and sustainable development. The energy management system (EMS) manages clean energy from many sources grouped in a small power plant such as a microgrid (MG). In this case, forecasting methods are used to help the EMS and allow clean energy to be used with high efficiency. The aim of this review paper is to provide the necessary information about the basic principles and standards of photovoltaic (PV) power forecasting by surveying numerous research studies carried out on PV power forecasting, specifically in the short-term time horizon, which is advantageous for the EMS and the grid operator. At the same time, this contribution offers a state of the art of the different methods and approaches used for PV power forecasting, along with a careful study of different time and spatial horizons. Furthermore, this review paper can support tenders in PV power forecasting.

... In [14], smart baseline models for solar irradiation forecasting are presented, comparing different ML approaches such as SVR, NN and random forest (RF). The results of these ML techniques were compared with genetic algorithm (GA) approaches, with SVR and GAs achieving comparable performance in the MSE metric. ...

This paper aims to implement an efficient renewable energy selection (either solar or wind) based on the chosen geographic location of Aguascalientes, Mexico through a Machine Learning (ML) method. Likewise, the information listed below will provide both a critical analysis and review of the state-of-the-art applications for ML Algorithms such as Support Vector Machines (SVM), Linear Regression (LR) and Neural Network Models (NNM). Rigorous data measurements taken over a period of six months, including those of solar irradiance, temperature, wind speed and wind direction to name a few, were the inputs used in different algorithms in order to find the one that could most accurately predict future weather conditions. Based on the obtained results, the best ML Algorithm ended up being Random Forest, an approach that is capable of building an accurate prediction through the calculation of two crucial parameters: Mean Square Error (MSE) and Mean Absolute Error (MAE).

... ion index, and solar irradiance data of three locations in China. Their results suggested that RF shows excellent results in comparison to empirical models. Gala et al. (2016) proposed a hybrid model along with three different machine learning models, including support vector regression (SVR), RF, and GB to predict daily and 3-hourly GHI in Spain. Urraca et al. (2016) developed SVR, k-nearest neighbour, linear regression, and the RF model to predict hourly GHI in Spain and reported that RF is efficient in predicting GHI. Further, Ibrahim and Khatib (2017) applied the RF optimized by the firefly algorithm to estimate hourly GHI in Malaysia, and the proposed ensemble model performed better than neural ...

... The Persistence method is a widely used model to predict solar irradiance and is also used as a reference method to assess the forecast skill of other methods (Chu et al., 2015; Kaur et al., 2016; Urraca et al., 2016). It is based on the assumption that the atmospheric condition is going to be the same as the immediately preceding condition. ...
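The persistence assumption can be sketched in a few lines: the forecast for time t + h is simply the value observed at time t. The second function shows one common "smart persistence" variant that holds the clear-sky index constant instead of the irradiance itself. It assumes clear-sky irradiance values are available from some external clear-sky model, which is not specified here, and all names are illustrative.

```python
def persistence_forecast(series, horizon=1):
    """Forecasts for series[horizon:], each equal to the value `horizon` steps earlier."""
    if not 1 <= horizon < len(series):
        raise ValueError("horizon must be between 1 and len(series) - 1")
    return list(series[:-horizon])

def smart_persistence_forecast(ghi, clear_sky, horizon=1):
    """Persist the clear-sky index k_t = GHI_t / clear_sky_t (assumed variant).

    clear_sky must come from an external clear-sky model (hypothetical input).
    """
    if len(ghi) != len(clear_sky):
        raise ValueError("ghi and clear_sky must have the same length")
    forecasts = []
    for t in range(len(ghi) - horizon):
        # At night the clear-sky value is zero; fall back to an index of zero.
        k_t = ghi[t] / clear_sky[t] if clear_sky[t] > 0 else 0.0
        forecasts.append(k_t * clear_sky[t + horizon])
    return forecasts

# Hypothetical hourly values in W/m^2.
ghi = [100.0, 300.0, 450.0]
clear_sky = [200.0, 400.0, 500.0]
naive = persistence_forecast(ghi)                   # targets: ghi[1:], forecasts: ghi[:-1]
smart = smart_persistence_forecast(ghi, clear_sky)  # accounts for the rising sun
```

Plain persistence ignores the deterministic daily cycle, so it lags behind on mornings and evenings; scaling by the clear-sky curve removes that part of the error, which is why it is often the harder baseline to beat.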

... Urraca et al. investigated multiple models including the Support Vector Regression (SVR) and RF models to forecast global solar irradiance for 1 h horizon in a site of southeast Spain. The authors found that machine learning methods outperformed the conventional persistence model [74]. In Ref. [71], Hassan et al. explored the potential of four tree-based methods: random forest, decision trees, bagging and gradient boosting for estimating global, diffuse and beam solar irradiance. ...

Accurate forecasting of solar irradiance is a key issue for planning and management of renewable solar energy production technologies. The present paper aims to propose new machine learning forecasting models based on optimized ANNs in order to accurately predict solar irradiance. For this purpose, an evolutionary framework is suggested to generate multiple models for different time horizons up to 6 h ahead by the evolution of the forecasting history and ANN architecture. A dataset of 28 Moroccan cities is used in our experiments in order to explore the performance of the proposed models under different climatic conditions. The proposed framework is then evaluated through a zoning scenario that gives our models the ability to accurately forecast solar irradiance at sites where no such data is available. Two other scenarios are used to assess and compare the resulting performance. For all studied scenarios, the obtained results show good generalization abilities, with NRMSE varying from 7.59% to 12.49% and NMAE from 4.41% to 8.12% as best performances for solar irradiance forecasting from 1 to 6 h ahead, respectively. A comparative study is then conducted with three other models (smart persistence, regression trees and random forest), showing the better performance of our proposed HAEANN models.

... Studies have used machine learning algorithms, such as k-nearest neighbor (kNN) [35,36], multilayer perceptron [37][38][39][40], and wavelet neural network [41]; some compared or combined multiple machine learning models in the prediction results. For instance, Urraca et al. [42] compared the prediction results of support vector regression with those of random forests, linear regression (LR) and kNN. Yousif et al. [43] compared a self-organizing feature map with multilayer perceptron and support vector machine for forecasting energy production in PV panels. ...

Southern Taiwan has excellent solar energy resources that remain largely unused. This study incorporated a measure that aids in providing simple and effective power generation efficiency assessments of solar panel brands in the planning stage of installing these panels on roofs. The proposed methodology can be applied to evaluate photovoltaic (PV) power generation panels installed on building rooftops in Southern Taiwan. In the first phase, this study selected panels of the BP3 series, including BP350, BP365, BP380, and BP3125, to assess their PV output efficiency. BP Solar is a manufacturer and installer of photovoltaic solar cells. This study first derived ideal PV power generation and then determined the suitable tilt angle for the PV panels leading to direct sunlight that could be acquired to increase power output by panels installed on building rooftops. The potential annual power outputs for these solar panels were calculated. Climate data from 2016 were used to estimate the annual solar power output of the BP3 series per unit area. The results indicated that BP380 was the most efficient model for power generation (183.5 kWh/m²-y), followed by BP3125 (182.2 kWh/m²-y); by contrast, BP350 was the least efficient (164.2 kWh/m²-y). In the second phase, to simulate meteorological uncertainty during hourly PV power generation, a surface solar radiation prediction model was developed. This study used a deep learning–based deep neural network (DNN) for predicting hourly irradiation. The simulation results of the DNN were compared with those of a backpropagation neural network (BPN) and a linear regression (LR) model. In the final phase, the panel of module BP3125 was used as an example and demonstrated the hourly PV power output prediction at different lead times on a solar panel. The results demonstrated that the proposed method is useful for evaluating the power generation efficiency of the solar panels.

... In Ref. [28], an ensemble tree model, Random Forests (RF), is compared with SVR and K-Nearest Neighbors (K-NN) for forecasting global solar irradiance 1 h ahead in Spain. In their hybrid method, Gala et al. [29] use gradient boosted regression and RF ensemble models for solar irradiance forecasting in seven regions in Spain. ...

This article investigates the competence of ensemble learning techniques in solar irradiance prediction. The literature survey shows that an ensemble tree model, random forests, is the most frequently studied ensemble model. However, ensembles of support vector regression (SVR) and artificial neural networks (ANN) are also possible. This study is therefore the first detailed evaluation of ensemble models in the solar irradiance estimation domain. Boosting and bagging ensembles of SVR, ANN and decision tree (DT) are developed to estimate solar irradiance on an hourly basis in five cities in Turkey. First, frequently used base models (SVR, ANN, and DT) are created and tested using 5 years of meteorological data. Then boosting and bagging ensembles of the base models are developed and tested with the same data. The base models are compared with their ensemble counterparts in terms of average coefficient of determination (R²) and root mean squared error (RMSE). The comparative results show that boosting and bagging ensemble models improve SVR, ANN, and DT in terms of RMSE by between 4.6% and 14.6% on average. The results show empirically that ensemble models improve the prediction accuracy of various base regression models and that the approach can be applied to other machine learning models used in solar irradiance prediction.

... A comparison between the hybrid model and other popular statistical time series models such as ARIMA, linear exponential smoothing (LES), simple exponential smoothing (SES) and random walk (RW) was conducted, which indicated the superior forecasting accuracy of the proposed model. Urraca et al. (2016) used two machine learning techniques, random forests and SVR, and classical linear regression to forecast solar irradiation for horizons of 1 h at a site in Southeast Spain located at 39°11ʹ38ʺN and 0°26ʹ13ʺW. The study involved the use of two approaches: fixed and moving models. ...

Conventional fossil fuels are depleting daily due to the growing human population. Previous research has shown that renewable energy sources, especially solar and wind, can be suitable alternatives to conventional energy sources that could satisfy global demand and protect the atmospheric environment. There are many factors that influence the performance of solar and wind energy prediction tools. Accurate forecasting of solar and wind energy resources is highly needed for the optimum utilization of these resources. Different methods have been applied to forecast solar and wind energy resources. The prediction performance of the support vector machine modeling approach was found to be better than that of other modeling approaches. The support vector machine is fast, simple to use and reliable, and provides accurate results. Findings based on critical analysis suggest that hybrid support vector machine models can reach much higher accuracies than other models for both solar and wind energy predictions for most locations. This investigation highlighted the main problems, opportunities and future work in this research area. Novel hybrid models are proposed for further investigation for more accurate predictions of solar and wind energy resources.

... RFR exhibits excellent generalisation performance and can minimise the influence as a result of imbalanced datasets. Therefore, currently numerous studies have been directed at forecasting of renewable energy power generation [10][11][12]. ...

Photovoltaic (PV) electric power has been widely employed to satisfy rising energy demands because inexhaustible renewable energy is environmentally friendly. In order to mitigate the impact caused by the uncertainty of solar radiation in grid-connected PV systems, a hybrid method based on a deep convolutional neural network (CNN) is introduced for short-term PV power forecasting. In the proposed method, different frequency components are first decomposed from the historical time series of PV power through variational mode decomposition (VMD). Then, they are constructed into a two-dimensional data form with correlations in both daily and hourly timescales that can be extracted by convolution kernels. Moreover, the time series of residue from VMD is refined into advanced features by a CNN, which could reduce the data size and be easier for further model training along with meteorological elements. The hybrid model has been verified by forecasting the output power of PV arrays with diverse capacities in various hourly timescales, which demonstrates its superiority over commonly used methods.

... In [39] and [40] the machine. Note that the use of smart persistence depends on the clear-sky model used, as described in [41]. ...

Simple, naïve, smart or clearness persistences are tools widely used as naïve predictors for global solar irradiation forecasting. It is essential to compare the performance of sophisticated prediction approaches with that of a reference approach, generally a naïve method. In this paper, a new kind of naïve "nowcaster" is developed: a persistence model based on the stochastic aspect of the measured solar energy signal, denoted stochastic persistence, and constructed without needing a large collection of historical data. Two versions are proposed, one based on an additive and one on a multiplicative scheme; a theoretical description and an experimental validation based on measurements made in Ajaccio (France) and Tilos (Greece) are presented. The results show that this approach is efficient, easy to implement and, unlike the machine learning methods usually employed, does not need historical data. This new solar irradiation predictor could become an interesting tool and a new member of the solar forecasting family.

... It will also be a benchmark for any industry to manage its energy consumption over time. Baseline models are not necessarily used only to show the relationship of energy usage; they are also used in [6] for solar irradiation forecasting with support vector regression and random forest algorithms, and for predicting customer baseline loads [7][8][9] for demand response and peak time rebate programs. In order to model the baseline energy, the variables chosen in relation to energy consumption and the period of the baseline itself are important factors to consider. ...

... Before performing text mining using the preprocessed text files, a short description of each identified emerging technology is given below. An agency that provides daily, weekly or monthly meteorological data across Spain, as used in Urraca et al. (2016): http://eportal.mapama.gob.es/websiar/Inicio.aspx ...

Text mining is an emerging topic that advances the review of academic literature. This paper presents a preliminary study on how to review solar irradiance and photovoltaic (PV) power forecasting (both topics combined as "solar forecasting" for short) using text mining, which serves as the first part of a forthcoming series of text mining applications in solar forecasting. This study contains three main contributions: (1) establishing the technological infrastructure (authors, journals & conferences, publications, and organizations) of solar forecasting via the top 1000 papers returned by a Google Scholar search; (2) consolidating the frequently-used abbreviations in solar forecasting by mining the full texts of 249 ScienceDirect publications; and (3) identifying key innovations in recent advances in solar forecasting (e.g., shadow camera, forecast reconciliation). As most of the steps involved in the above analysis are automated via an application programming interface, the presented method can be transferred to other solar engineering topics, or any other scientific domain, by means of changing the search word. The authors acknowledge that text mining, at its present stage, serves as a complement to, but not a replacement of, conventional review papers.

... Because solar irradiance is the key factor affecting the output power of solar PV plants, solar irradiance forecasting is an important technology for reducing the uncertainty in PV power generation [7][8][9][10]. Especially in cloudy weather conditions, the solar irradiance on the ground can fluctuate significantly at the minute level, which makes intra-hour solar irradiance forecasting [11,12] considerably more difficult than forecasting at hourly [13] or daily [14] time scales. ...

Irradiance received at the earth's surface is the main factor that affects the output power of solar PV plants, and it is chiefly determined by the cloud distribution seen in a ground-based sky image at the corresponding moment. Linear extrapolation-based ultra-short-term solar PV power forecasting approaches rely on predicting the cloud distribution in future sky images, which requires accurate calculation of cloud motion displacement vectors (CMDVs) from historical sky images. Theoretically, the CMDV can be obtained from the coordinate of the peak pulse calculated by a Fourier phase correlation theory (FPCT) method from the frequency-domain information of sky images. The peak pulse is significant and unique only when the cloud deformation between two consecutive sky images is slight enough, which is likely for a very short time interval (such as 1 min or shorter) with ordinary changes in cloud speed. When the deformation of the clouds between two consecutive sky images is more pronounced, as under fast-changing cloud speeds, there may be more than one pulse with similar values. This would probably lead to significant errors if the CMDVs were still obtained only from the single coordinate of the peak pulse. However, estimating the deformation of clouds between two images and its influence on FPCT-based CMDV calculations is extremely complex, because the motion of clouds is difficult to describe and model. Therefore, to improve accuracy and reliability under these circumstances in a simple manner, an image-phase-shift-invariance (IPSI) based CMDV calculation method using FPCT is proposed for minute-time-scale solar power forecasting. First, multiple CMDVs are calculated with the FPCT method from consecutive image pairs that have been synchronously rotated by different angles relative to the original images. Second, the final CMDV is generated from all of the calculated CMDVs through a centroid iteration strategy based on their density and distance distribution. Third, the influence of the rotation angle resolution on the final CMDV is analyzed as a means of parameter estimation. Simulations under various scenarios, including both thick and thin cloud conditions, indicated that the proposed IPSI-based CMDV calculation method using FPCT is more accurate and reliable than the original FPCT method, the optical flow (OF) method, and the particle image velocimetry (PIV) method.
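
The basic phase-correlation step that this abstract builds on can be illustrated with a minimal sketch. It is not the authors' IPSI implementation: it omits the rotated image pairs and the centroid iteration, uses a naive DFT that is only practical for tiny images, and the 8×8 synthetic "sky image", the function names and the circular-shift test case are all assumptions made for illustration.

```python
import cmath


def dft2(img, inverse=False):
    """Naive 2-D discrete Fourier transform (only sensible for tiny images)."""
    rows, cols = len(img), len(img[0])
    sign = 1.0 if inverse else -1.0
    out = [[0j] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            acc = 0j
            for x in range(rows):
                for y in range(cols):
                    angle = sign * 2.0 * cmath.pi * (u * x / rows + v * y / cols)
                    acc += img[x][y] * cmath.exp(1j * angle)
            out[u][v] = acc / (rows * cols) if inverse else acc
    return out


def phase_correlation_shift(prev_img, curr_img, eps=1e-12):
    """Return (row_shift, col_shift) of curr_img relative to prev_img."""
    rows, cols = len(prev_img), len(prev_img[0])
    f_prev, f_curr = dft2(prev_img), dft2(curr_img)
    # The normalised cross-power spectrum keeps only the phase difference.
    cross = [[0j] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            c = f_curr[u][v] * f_prev[u][v].conjugate()
            cross[u][v] = c / abs(c) if abs(c) > eps else 0j
    surface = dft2(cross, inverse=True)
    # The peak of the correlation surface marks the displacement.
    peak_r, peak_c = max(
        ((r, c) for r in range(rows) for c in range(cols)),
        key=lambda rc: surface[rc[0]][rc[1]].real,
    )
    # Indices past the midpoint correspond to negative (wrapped) shifts.
    if peak_r > rows // 2:
        peak_r -= rows
    if peak_c > cols // 2:
        peak_c -= cols
    return peak_r, peak_c


def circular_shift(img, dr, dc):
    """Shift an image by (dr, dc) pixels with wrap-around (synthetic motion)."""
    rows, cols = len(img), len(img[0])
    return [[img[(r - dr) % rows][(c - dc) % cols] for c in range(cols)]
            for r in range(rows)]


# Hypothetical 8x8 image containing one small "cloud".
prev_img = [[0.0] * 8 for _ in range(8)]
prev_img[2][3], prev_img[2][4], prev_img[3][3] = 1.0, 0.5, 0.8
curr_img = circular_shift(prev_img, 1, 2)  # cloud moves 1 row down, 2 columns right

shift = phase_correlation_shift(prev_img, curr_img)
print(shift)
```

With a rigid shift the correlation surface has a single sharp peak. The abstract's point is that deforming clouds blur this peak into several competing pulses, which is what the rotated-pair strategy is meant to address.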

... When solar power is integrated into the electric power system, the unpredictable and intermittent nature of solar irradiance can cause several severe problems, such as voltage instability and poor power quality [3]. Thus, it is necessary to forecast solar irradiance in order to maintain the security and stability of the power grid [4]. However, it is reported that many developing countries have missing or insufficient solar irradiance data because of the prohibitive cost of measurement equipment and maintenance [5]. ...

Integrating solar energy into the electricity grid is an important but challenging task. Forecasting errors can not only break the supply–demand balance but also cause additional costs. Therefore, accurate and effective forecasting of the global horizontal irradiance is key to photovoltaic installations. In this paper, a sparse quadratic radial basis function neural network (QRBF) is established. The Eclat algorithm is applied to mine association rules and thereby determine the meteorological variables relevant to forecasting the global horizontal irradiance. QRBF is reformulated as a linear-in-the-parameters problem, and a novel approach called the square root progressive quantile variable selection procedure (SRPQVSP) is proposed to reduce the complexity of the model structure. Furthermore, the cuckoo search (CS) algorithm is used to optimize the model parameters so as to boost forecasting accuracy. Finally, the developed model is verified at four sites in Qinghai province, China, with different terrain, latitude and other meteorological characteristics. The experimental results reveal that the developed models, composed of the selected variables, deliver superior performance over other existing approaches.

... Advanced non-linear methods of data processing, in which an (in general) agnostic computer model is trained on historic data, include random forest (RF) search methods [41], the training of artificial neural networks (ANN) by e.g. Sfetsos and Coonick [42], Paoli et al. [43], Voyant et al. [44], Pedro and Coimbra [45] and the use of support vector regression (SVR) as can be seen in work by Fonseca et al. [46], Boland [47], Rana et al. [48]. ...

Solar forecasting is a necessary component of the economical realization of high penetration levels of photovoltaic (PV) systems. This paper presents a short-term, intra-hour solar forecasting method. This "peer-to-peer" (P2P) forecasting method is based on the cross-correlation time lag between the clear-sky index time series of pairs of PV systems that are influenced sequentially by (what is assumed to be) the same cloud, with the feature that the forecast horizon (FH) can be set at a fixed value. The P2P forecasting algorithm was evaluated for 11 central PV systems (out of 202) over a half-year period from the 1st of March through the 31st of August 2015 using the forecast skill (FS) metric. A positive FS means an improvement over the reference clear-sky index persistence forecast. The P2P forecasting method was evaluated over subsets of days with high, all, or low irradiance variability. The average forecast skill (avgFS) for forecast horizons between 5 and 8 min was 5.99%, -1.61% and -16.0% over these periods respectively, indicating the superior performance of the P2P method over persistence during the highly variable days, which are the most interesting from the perspective of electricity grid management.
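
The cross-correlation time lag at the heart of the P2P idea can be sketched as follows. This is an illustrative toy, not the paper's algorithm: the two clear-sky index series are made up, the Pearson correlation is used as the similarity measure by assumption, and the names `pearson` and `best_lag` are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def best_lag(upstream, downstream, max_lag):
    """Find the delay (in samples) at which downstream best matches upstream."""
    best = (0, -2.0)  # (lag, correlation); -2 is below any valid correlation
    for lag in range(max_lag + 1):
        a = upstream[:len(upstream) - lag]
        b = downstream[lag:]
        r = pearson(a, b)
        if r > best[1]:
            best = (lag, r)
    return best


# Hypothetical clear-sky index series: a cloud passes the upstream PV system
# and reaches the downstream system three samples later.
upstream = [1.0, 1.0, 0.9, 0.4, 0.3, 0.5, 0.9, 1.0, 1.0, 1.0, 0.95, 1.0, 1.0, 1.0]
downstream = [1.0, 1.0, 1.0] + upstream[:-3]

lag, corr = best_lag(upstream, downstream, max_lag=6)
print(lag, round(corr, 3))

# Once the lag is known, the latest upstream value serves as a forecast
# for the downstream system `lag` samples ahead.
forecast_for_downstream = upstream[-1]
```

In this sketch the forecast horizon equals the detected lag. The abstract notes that the actual method allows the horizon to be fixed, which would require selecting peers whose lag matches the desired horizon.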

... An evaluation using German solar data showed RMSE of 4%. Other recent indirect approaches include [4,16,17]. ...

... Urraca et al. [37] adopted SVR, RF, linear regression and the k-nearest neighbors algorithm (kNN) to estimate 1-h-ahead global solar irradiance at a single site on the eastern coast of Spain, using both fixed and moving models. Their results assessed the potential of SVR and RF in modeling global irradiance and showed that SVR models have lower bias errors while RF models have lower root mean square errors. ...

This article provides the first comprehensive study to explore the potential of tree-based ensemble methods in modeling solar radiation. Gradient boosting, bagging and random forest (RF) models have been developed for estimating global, diffuse and normal radiation components at daily and hourly time scales. The developed ensemble models have been compared to their corresponding multi-layer perceptron (MLP), support vector regression (SVR) and decision tree (DT) models. The results show that the suggested techniques are very reliable and accurate, despite being relatively simple. The average validation coefficients of determination (R²) for the boosting, bagging and RF algorithms are (0.957, 0.971, 0.967) for the global irradiation model, (0.768, 0.786, 0.791) for the diffuse irradiation model, (0.769, 0.785, 0.792) for the normal irradiation model, (0.852, 0.890, 0.883) for the hourly global irradiance model, (0.778, 0.869, 0.853) for the diffuse irradiance model, and (0.797, 0.897, 0.880) for the normal irradiance model. In general, the bagging and RF algorithms showed better estimates than gradient boosting. However, the gradient boosting algorithm was the most stable, with a maximum increase of 10.32% in the test root mean square error, compared to 41.3% for the MLP algorithm. The SVR algorithm offers the best combination of stability and prediction accuracy. Nevertheless, its computational costs are up to 39 times those of the ensemble methods. The new ensemble methods have been recommended for generating synthetic radiation data to be used for simulating and evaluating the performance of different solar energy systems.
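
To make the bagging idea concrete, here is a deliberately small sketch that averages one-split regression trees (stumps) trained on bootstrap resamples. It is a toy under stated assumptions, not the models evaluated in the article: the base learner, the number of estimators, the seed and the noise-free step-shaped data are all hypothetical choices for illustration.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split regression tree by minimising the squared error."""
    pairs = sorted(zip(xs, ys))
    best = None  # (sse, threshold, left_mean, right_mean)
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot split between identical x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # all x identical: fall back to predicting the mean
        mean = sum(ys) / len(ys)
        return (float("inf"), mean, mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x < threshold else right_mean


def fit_bagging(xs, ys, n_estimators=25, seed=0):
    """Train each stump on a bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps


def predict_bagging(stumps, x):
    """Bagging prediction is the average over all base learners."""
    return sum(predict_stump(s, x) for s in stumps) / len(stumps)


# Hypothetical data: a predictor (e.g. an index from 0 to 19) and a
# step-shaped daily irradiation response in arbitrary units.
xs = list(range(20))
ys = [2.0 if x < 10 else 8.0 for x in xs]

model = fit_bagging(xs, ys)
low, high = predict_bagging(model, 3), predict_bagging(model, 16)
print(low, high)
```

A random forest adds random feature subsetting at each split on top of this resampling, and gradient boosting instead fits learners sequentially to residuals. Neither is shown here.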

... A strict filtering procedure has been applied to the available data and the number of ground stations supplying data for the study has been limited to 44 (see Figure 2). This meteorological network has been amply used to test different methodologies to obtain solar irradiation maps by using support vector machines, satellite data (Antonanzas-Torres et al., 2013a) or parametric (Antonanzas-Torres et al., 2013b) and predictive models (Urraca et al., 2016). ...

Four spatial interpolation methods (Inverse Distance Weighted, Spline, Kriging and Natural Neighbor) and their different variations are employed to map Global Horizontal Irradiation (GHI) in Castilla-León, Spain. The work has been performed using the software ArcGIS, widely used in geostatistical applications, showing the versatility of the system and its applicability to climate data. The measuring network consists of 71 ground meteorological stations with seven complete years of half-hourly data sets, yielding annual daily averages of GHI. The interpolation results are tested against data from the four Spanish National Meteorological Agency (AEMET) stations available in the region using standard statistical indicators (RMSE, MBE, MAPE and MAE). An additional partial cross validation of the results, which excludes five stations from the measuring network, employs different criteria to verify the results of the interpolation methods applied. This work contributes to the classification of interpolation methods for obtaining climatological data across large areas with a low number of irregularly distributed measurement points and low topographic complexity. The Universal Kriging method with a quadratic semi-variogram shows the best results with respect to the RMSE and MAE statistical indicators.
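
Of the four interpolation families named above, Inverse Distance Weighted (IDW) is the simplest to write down. The sketch below shows the basic form only: the station coordinates, the GHI values and the exponent `power=2.0` are assumptions for illustration, and the study itself used ArcGIS rather than hand-written code.

```python
import math


def idw(stations, target, power=2.0):
    """Inverse-distance-weighted estimate at `target` from (x, y, value) stations."""
    tx, ty = target
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        distance = math.hypot(x - tx, y - ty)
        if distance == 0.0:
            return value  # the target coincides with a station
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical stations: (x in km, y in km, annual mean daily GHI in kWh/m2).
stations = [(0.0, 0.0, 4.0), (10.0, 0.0, 6.0), (0.0, 10.0, 5.0)]

estimate = idw(stations, (5.0, 0.0))
print(round(estimate, 3))
```

Nearby stations dominate the estimate because the weights fall off with the square of the distance. Kriging differs in that it derives the weights from a fitted semi-variogram rather than from distance alone.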

... There are different models suitable for this purpose. Urraca et al. provided a smart baseline model for solar irradiation forecasting in [31]. This model would be necessary for those solar-based system configurations. ...

A multi-objective optimization model for urban integrated electrical, thermal and gas grids is presented. The main system consists of a retrofitted natural gas pressure regulation station where a turbo-expander allows energy to be recovered from the process. Here, the natural gas must be preheated in order to avoid the formation of methane hydrates. The preheating phase could be based on fossil fuels, renewables or a thermal mix. Depending on the system configuration, the proposed optimization model enables a proper differentiation based on how the natural gas preheating process is expected to be accomplished. This differentiation is addressed by weighting the electricity produced by the turbo-expander and linking it to proper remuneration tariffs. The effectiveness of the model has been tested on an existing plant located in the city of Genoa. Here, the thermal energy is provided by two redundant gas-fired boilers and a cogeneration unit. Furthermore, the whole system is thermally integrated with a district heating network. Numerical simulation results, obtained with the commercial proprietary software Honeywell UniSim Design Suite, have been compared with the optimal solutions achieved. The effectiveness of the model, in terms of economic and environmental performance, is finally quantified. For specific conditions, the model achieves an operational cost reduction of about 17% with respect to thermal-load-tracking control logic.

... There are three metrics selected to verify the forecasting performance of all models, including the mean absolute percentage error (MAPE), the root mean squared error (RMSE), the normalized RMSE by the average solar radiation (RMSE/Avg) and the forecasting skill score (s) [39,40]. The smaller these criteria are, the better the forecasting results are. ...
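
The metrics named in this excerpt can be written out as a short sketch. The excerpt does not give the formulas, so the definitions below are the commonly used ones and are assumptions here, in particular the skill score taken as one minus the ratio of the model RMSE to a reference (e.g. persistence) RMSE. Under that definition a larger skill score is better, unlike the error metrics. All data values are invented.

```python
import math


def rmse(forecast, observed):
    """Root mean squared error."""
    n = len(observed)
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(forecast, observed)) / n)


def mape(forecast, observed):
    """Mean absolute percentage error in percent (observations must be non-zero)."""
    n = len(observed)
    return 100.0 * sum(abs(f - o) / abs(o) for f, o in zip(forecast, observed)) / n


def nrmse_by_mean(forecast, observed):
    """RMSE normalised by the average observed value (RMSE/Avg)."""
    return rmse(forecast, observed) / (sum(observed) / len(observed))


def skill_score(forecast, reference, observed):
    """Assumed definition: 1 - RMSE(model) / RMSE(reference forecast)."""
    return 1.0 - rmse(forecast, observed) / rmse(reference, observed)


# Hypothetical hourly irradiance values in W/m2.
observed = [400.0, 500.0, 600.0, 500.0]
forecast = [410.0, 490.0, 610.0, 490.0]
persistence = [380.0, 400.0, 500.0, 600.0]  # previous-hour observation as reference

print(rmse(forecast, observed))
print(round(mape(forecast, observed), 3))
print(nrmse_by_mean(forecast, observed))
print(round(skill_score(forecast, persistence, observed), 3))
```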

Forecasting of effective solar irradiation has attracted huge interest in recent decades, mainly because of its various applications in grid-connected photovoltaic installations. This paper develops and investigates an ensemble-learning-based multistage intelligent approach to forecast 5-day global horizontal radiation at four locations in India. A two-way interaction model is considered in order to detect correlations between the features. The core of the novel method is ensemble learning, based on the divide-and-conquer principle, which is applied to enhance forecasting accuracy and model stability. LASSO, an efficient feature selection method, is applied in the input space, with the regularization parameter selected by cross-validation. A weight vector that best represents the importance of each individual model in the ensemble is provided by glowworm swarm optimization. The combination of feature selection and parameter selection helps create diversity in the ensemble. To illustrate the validity of the proposed method, the datasets at the four locations in India are split into training and test sets. The results of the real-data experiments demonstrate the efficiency and efficacy of the proposed method compared with its competitors.

... Statistical models emerged in late 90s to overcome these issues, combining several input variables in more complex non-linear relationships. In solar estimation, different algorithms have been tested with this purpose: Artificial Neural Network (ANN) [17][18][19][20], Support Vector Machine (SVM) [21][22][23], regression trees [24] or fuzzy logic [25]. Besides, several hybrid techniques have also been proposed to combine these algorithms with meta-heuristics, such as Genetic Algorithms (GA) [26,27], in order to optimize and automate the calibration process. ...

Solar radiation can be estimated by a variety of methods in an attempt to overcome the limitations of on-ground records. Novel methods appear often, but they are rarely compared to others based on a different approach. This study surveys the main types of estimation methods for daily Global Horizontal Irradiation (GHI), and then one characteristic technique per group is selected, discarding possible hybrid approaches: a parametric model based on temperatures and precipitation (Antonanzas model), a statistical model (XGBoost), interpolated ground-based measurements (Ordinary Kriging (OK)), a satellite-based dataset (CM-SAF-SARAH), and a reanalysis dataset (ERA-Interim). The techniques are evaluated in relation to the seasonal variation, the clearness index and the spatial performance at 38 ground stations in central Spain from 2001 to 2013.
Three different tiers of estimations were obtained, with SARAH and OK the best-performing methods overall. The SARAH dataset (MAE = 1.10 ± 0.13 MJ/m², MBE = 0.22 ± 0.36 MJ/m²) generated estimates with the lowest spread, but led to a slight overestimation in low-altitude flat areas. OK (MAE = 1.10 ± 0.25 MJ/m², MBE = 0.00 ± 0.31 MJ/m²) outperformed SARAH in these flat areas (high density of stations), but at the expense of a higher variability. Conversely, SARAH surpassed OK when the distance to the closest station exceeded 20-30 km. The ERA-Interim reanalysis and XGBoost were in the second tier of estimations, and the parametric model yielded the worst results overall. ERA-Interim exhibited a systematic overestimation. The locally trained Antonanzas and XGBoost models struggled to model the atmospheric transmissivity, showing large positive errors in spring months and a small underestimation on clear-sky days. Finally, a summary of the strengths and weaknesses of the five methods provides a deeper understanding for selecting an adequate estimation approach.

... Other advanced AI approaches have also been explored. In particular, Urraca et al. [197] applied random forest (RF) and support vector regression (SVR) techniques for global horizontal irradiance (GHI) prediction in 1-h forecast horizon, and found that these advanced methods significantly outperformed the classical persistence model with root mean square errors (RMSE) of 0.17 and 0.18. In another study, k-nearest neighbours (KNN) and SVR were employed for one-hour-ahead hourly solar PV power prediction based on empirical meteorological data and numerical weather predictions, wherein both models were found to be highly successful for 10-station setting [198]. ...

An overview of numerical and mathematical modelling-based distributed generation (DG) system optimisation techniques is presented in this review paper. The objective is to compare different aspects of these two broad classes of DG optimisation techniques, explore their applications, and identify potential research directions from the reviewed studies. Introductory descriptions of the general electrical power system and the DG system are first provided, followed by reviews of renewable resource assessment, load demand analysis, model formulation, and optimisation techniques. In the review of renewable resource assessment models, uncertain solar and wind energy resources are emphasised, and applications of forecasting models are highlighted based on their prediction horizons, computational power requirements, and training data intensity. For the DG optimisation framework, power generator (solar, wind and tidal), energy storage and energy balance models are discussed; in the optimisation technique section, both numerical and mathematical modelling optimisation methods are reviewed, analysed and critiqued, with recommendations for their improvement. Overall, this review provides preliminary guidelines, research gaps and recommendations for developing a better and more user-friendly DG energy planning optimisation tool.

... Several energy related studies have also used RF algorithm in modeling both classification and regression problems. For example, Urraca et al. [32] used RF to model the solar irradiation. Tooke et al. [33] implemented RF to predict the building age and energy consumption. ...

Efficient and effective city planning for improving the energy performance of residential buildings requires a clear understanding of the influential features. Previous studies on modeling the relationships between influential features and energy consumption have several gaps and limitations, such as a linear modeling methodology and insufficient consideration of particular features. This study therefore aims to investigate the influence of 171 possibly related features on the regional energy use intensity (EUI) of residential buildings using a non-linear regression algorithm, namely Random Forests (RF). New York City (NYC) was chosen as the study area because of data availability. The 171 features cover seven different aspects: building, economy, education, environment, households, surroundings, and transportation. The average site EUI of the residential buildings in each Block Group (BG) was set as the dependent variable. The regression model was compared to models using typical linear methods, such as Multiple Linear Regression and Lasso. The results show that the RF model achieved a lower mean square error. In addition, the top 20 influential features were identified based on the out-of-bag estimation in RF. Results show that a lower percentage of well-educated people, a higher percentage of households heated by fuel oil, lower household income and more residential complaints per capita are correlated with higher average site EUI in NYC. Related suggestions on improving the energy performance in different regions are presented to the local government.

... We will not cover the different irradiance forecasting techniques found in literature. For that, readers are referred to some reviews on irradiance forecasting (Inman et al., 2013; Diagne et al., 2013; Gueymard and Ruiz-Arias, 2015; Urraca et al., 2016). ...

Variability of solar resource poses difficulties in grid management as solar penetration rates rise continuously. Thus, the task of solar power forecasting becomes crucial to ensure grid stability and to enable an optimal unit commitment and economical dispatch. Several forecast horizons can be identified, spanning from a few seconds to days or weeks ahead, as well as spatial horizons, from single site to regional forecasts. New techniques and approaches arise worldwide each year to improve accuracy of models with the ultimate goal of reducing uncertainty in the predictions. This paper appears with the aim of compiling a large part of the knowledge about solar power forecasting, focusing on the latest advancements and future trends. Firstly, the motivation to achieve an accurate forecast is presented with the analysis of the economic implications it may have. It is followed by a summary of the main techniques used to issue the predictions. Then, the benefits of point/regional forecasts and deterministic/probabilistic forecasts are discussed. It has been observed that most recent papers highlight the importance of probabilistic predictions and they incorporate an economic assessment of the impact of the accuracy of the forecasts on the grid. Later on, a classification of authors according to forecast horizons and origin of inputs is presented, which represents the most up-to-date compilation of solar power forecasting studies. Finally, all the different metrics used by the researchers have been collected and some remarks for enabling a fair comparison among studies have been stated.

... In their subsequent work [14] the same group proposed a new method using sky camera images instead of satellite images and a maximum cross-correlation method to determine the cloud motion vectors, obtaining nRMSE = 11-25% for all sky conditions. Urraca et al. [15] predicted the solar irradiance 1 h ahead based on recorded meteorological data and computed solar variables. They developed two types of models: fixed, that are trained once using all training data, and moving, that build a separate prediction model for each testing instance using a subset of the training data (e.g. ...

We consider the task of forecasting the electricity power generated by a solar PhotoVoltaic (PV) system for forecasting horizons from 5 to 60 min ahead, from previous PV power and meteorological data. We present a new method based on advanced machine learning algorithms for variable selection and prediction. The correlation based variable selection identifies a small set of informative variables that are used as inputs for an ensemble of neural networks and support vector regression algorithms to generate the predictions. We develop two types of models: univariate, that use only previous PV power data, and multivariate, that also use previous weather data, and evaluate their performance on Australian PV data for two years. The results show that the univariate models performed similarly to the multivariate models, achieving mean relative error of 4.15–9.34%. Hence, the PV power output for very short-term forecasting horizons of 5–60 min can be predicted accurately by using only previous PV power data, without weather information. The most accurate model was univariate ensemble of neural networks, predicting the PV power output separately for each step of the forecasting horizon.

... Furthermore, some studies mixed satellite-derived irradiation estimates with on-ground measurements of solar irradiation in order to improve the spatial estimation [10], or used the sunshine duration as an explanatory variable [11]. In the last decades, many studies have addressed the proposal and evaluation of clear-sky models for solar resource assessments and also for the forecasting of solar irradiation [12], with great differences in model performance under different topographical and atmospheric conditions. Badescu et al. [13] evaluated fifty-four different clear-sky models at three different sites in Romania, concluding that none of the models performed best for all sets of input data, while some models ranked noticeably better than others. ...

The exponential growth of solar power has been witnessed in the past decade and is projected by the ambitious policy targets. Nevertheless, the proliferation of solar energy poses challenges to power system operations, mostly due to its uncertainty, locational specificity, and variability. The prevalence of smart grids enables artificial intelligence (AI) techniques to mitigate solar integration problems with massive amounts of solar energy data. Different AI subfields (e.g., machine learning, deep learning, ensemble learning, and metaheuristic learning) have brought breakthroughs in solar energy, especially in its grid integration. However, AI research in solar integration is still at the preliminary stage, and is lagging behind the AI mainstream. Aiming to inspire deep AI involvement in the solar energy domain, this paper presents a taxonomical overview of AI applications in solar photovoltaic (PV) systems. Text mining techniques are first used as an assistive tool to collect, analyze, and categorize a large volume of literature in this field. Then, based on the constructed literature infrastructure, recent advancements in AI applications to solar forecasting, PV array detection, PV system fault detection, design optimization, and maximum power point tracking control problems are comprehensively reviewed. Current challenges and future trends of AI applications in solar integration are also discussed for each application theme.

In this study, a novel photovoltaic power forecasting system that utilizes a deep Convolutional Neural Network (CNN) structure and an input signal decomposition algorithm is proposed. The proposed CNN architecture extracts deep features to forecast short-term power using transfer learning-based AlexNet. The historical power, solar radiation, wind speed, and temperature data are selected as the inputs. The signal decomposition algorithm called Empirical Mode Decomposition (EMD) is utilized to decompose the historical power signal into sub-components. In order to extract deep features, all input parameters are converted to 2D feature maps and fed to the input of the CNN. The experiments are carried out on a grid-tied Photovoltaic Power Plant (PVPP) with 1000 kW installed capacity located in Turkey. The experiments are performed under four weather conditions (partly cloudy, cloudy-rainy, heavy-rainy, and sunny days) to show the effectiveness of the proposed method. The obtained results are compared with benchmark regression algorithms. The proposed method gives the highest Correlation Coefficient (R) and the lowest Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and SMAPE values under all horizons and weather conditions. For 1-h to 5-h ahead, the average R values of the proposed method are 97.28%, 95.77%, 94.49%, 93.61%, and 92.62%, respectively. The average RMSE values are 4.90%, 6.30%, 7.50%, 8.00%, and 9.17% for 1-h to 5-h ahead. The experimental results confirm that the proposed method outperforms the conventional regression algorithms and delivers competitive performance.

The solar power plant is an alternative for the provision of environmentally friendly renewable electricity, especially in the tropics, which are sufficiently exposed to the sun throughout the year. However, environmental conditions such as rainfall, solar radiation, or clouds may affect the output power of photovoltaic (PV) systems. These factors make it difficult to know whether PV can meet the needs of the existing load. This research develops a model to predict the output power of a 160 × 285 W PV system located in the tropics under specific environmental conditions. The prediction models are developed in the Python programming language using neural networks with a single hidden layer and with two hidden layers, as well as traditional Multiple Linear Regression. The simulation results show that the neural network with two hidden layers has a higher level of accuracy than the single-hidden-layer network and Multiple Linear Regression, as seen from the values of R², MSE, and MAE.

Analysing the output power of a solar photovoltaic (PV) system at the design stage, and predicting its performance under different weather conditions, is a primary task to be carried out before any installation. Owing to the large penetration of solar PV systems into the traditional grid and the growing construction of smart grids, clean and economical power must be injected into the grid so that grid disturbances are avoided. The power that a solar PV system can generate depends on the environment in which it operates and on two other important factors: the amount of solar insolation and the temperature. Because these two factors are intermittent in nature, forecasting the output of a solar PV system is a most difficult task. In this paper, a comparative analysis of different solar PV forecasting methods is presented. A MATLAB Simulink model based on real-time data collected in Odisha (20.9517° N, 85.0985° E), India, is used to forecast the performance of the solar PV system.

Recently, many machine learning techniques have been successfully employed in photovoltaic (PV) power output prediction because of their strong non-linear regression capacities. However, a single machine learning algorithm does not have stable prediction performance or sufficient generalization capability in the prediction of PV power output. In this work, a hybrid model (SDA-GA-ELM) based on extreme learning machine (ELM), genetic algorithm (GA) and customized similar day analysis (SDA) has been developed to predict hourly PV power output. In the SDA, the Pearson correlation coefficient is employed to measure the similarity between different days based on five meteorological factors, and the data samples similar to those from the target forecast day are selected as the training set of the ELM. This operation can effectively increase the number of useful samples and reduce the time spent on training. In the ELM, the optimal values of the hidden bias and the input weight are searched by GA to improve the prediction accuracy. The performance of the proposed forecast model is evaluated with the coefficient of determination (R²), mean absolute error (MAE) and normalized root mean square error (nRMSE). The results show that the SDA-GA-ELM model has higher accuracy and stability in day-ahead PV power prediction.

An accurate short-term global solar irradiation (GHI) forecast is essential for integrating photovoltaic systems into the electricity grid, because it reduces some of the problems caused by the intermittency of solar energy, including rapid fluctuations in energy, storage management, and the high costs of electricity.
In this paper, the authors propose a new hybrid approach to forecast hourly GHI for the city of Al-Hoceima, Morocco. For this purpose, a deep long short-term memory network is trained on a combination of the hourly GHI ground measurements from the meteorological station of Al-Hoceima and the satellite-derived GHI from the pixels neighbouring the point of interest. XGBoost, Random Forest, and Recursive Feature Elimination with cross-validation were used to select the most relevant features, namely the lagged satellite-derived GHI around the point of interest, as input to the proposed model, and the best forecasting model is selected using the Grid Search algorithm. The simulation results showed that the proposed approach performs well and outperforms other benchmark approaches.

The objective of this research is to build models for various time resolutions to predict global solar irradiation using data mining and statistical techniques. The time resolutions analyzed are 5 min, 1 hour and 1 day ahead. The models tested herein are three supervised machine learning (ML) techniques: nonlinear autoregressive neural network (NAR), support vector regression (SVR) and random forest (RF). A linear autoregressive (AR) model and the naive persistence (PER) model have also been included. The datasets come from two sites in Algeria, Algiers and Ghardaia, which have different climatic conditions during the year corresponding to two types of climate, Mediterranean and arid. One important contribution of this research to global irradiance forecasting is the benchmarking of the ML techniques used, taking into account the lack of practical results and the needs detected in the literature, especially for the RF model; to the best of our knowledge, the random forest method has never been tested as it has been in our study, where it is based only on past values of the same variable, without exogenous data, to forecast the future ones. The results of this research show that RF is the best technique by a slight margin, especially for hourly forecasts. The proposed models perform less well both under unstable sky conditions (Algiers) and when the resolution is 1 day, because the time series become significantly less correlated as more randomness is included.
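
The two simplest benchmarks mentioned in this abstract, naive persistence and a linear autoregressive model, can be sketched for the one-step-ahead, first-order case. The order (AR(1)), the ordinary-least-squares fit and the synthetic clear-sky-index-like series are assumptions for illustration; the abstract does not state the model order or fitting procedure used.

```python
def persistence_forecast(series):
    """One-step-ahead persistence: the forecast for t+1 is the value at t."""
    return series[:-1]


def fit_ar1(series):
    """Ordinary least squares fit of y[t] = c + phi * y[t-1]."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi


def ar1_forecast(c, phi, last_value):
    """One-step-ahead AR(1) prediction from the most recent observation."""
    return c + phi * last_value


# Hypothetical series generated exactly by y[t] = 0.1 + 0.8 * y[t-1].
series = [0.2]
for _ in range(9):
    series.append(0.1 + 0.8 * series[-1])

c, phi = fit_ar1(series)
print(round(c, 6), round(phi, 6))
print(ar1_forecast(c, phi, series[-1]), persistence_forecast(series)[-1])
```

Both models use only past values of the target variable, which matches the univariate setting the abstract describes for its random forest experiments.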

The penetration of photovoltaic (PV) energy into modern electric power and energy systems has gradually increased in recent years because it is abundant, inexhaustible and clean. To reduce the negative impacts of PV energy on electric power and energy systems, an advanced, high-accuracy forecasting approach is urgently needed. To this end, a novel hybrid method for deterministic PV power forecasting based on wavelet transform (WT) and deep convolutional neural network (DCNN) is first proposed in this paper. WT is used to decompose the original signal into several frequency series, each of which has clearer outlines and behaviors. DCNN is employed to extract the nonlinear features and invariant structures exhibited in each frequency. Then, a probabilistic PV power forecasting model that combines the proposed deterministic method with spline quantile regression (QR) is developed to statistically evaluate the probabilistic information in PV power data. The proposed deterministic and probabilistic forecasting methods are applied to real PV data series collected from PV farms in Belgium. Numerical results presented in the case studies demonstrate that the proposed methods improve forecasting accuracy across seasons and prediction horizons when compared to conventional forecasting models.

Large forecast errors in solar power prediction cause challenges for the management of electric grids. Here, the classification technique Random Forests is applied to analyze the possible linkage of hourly or daily forecast errors to the actual situation given by a set of meteorological variables. This forms a prediction of the forecast error and can thus be used to update the forecast. The performance of this scheme is assessed for the example of irradiance forecasts in Brazil. While little to no improvement is obtained for next-hour forecasts, significant improvements are obtained for next-day forecasts.

This paper examines the motivation, applications and development of short-term solar forecasting using ground-based sky imagery for controlling equipment on electrical grids.
Historically, there has not been a great deal of interaction between the fields of solar forecasting and electrical grid research. This situation is changing rapidly as solar forecasting is becoming increasingly important for dealing with the mass uptake of photovoltaics (PV) and solar-thermal generation on electrical grids around the world. The interaction between these two fields is examined, along with the opportunities for applying solar forecasting in on-grid and mini-grid applications.
We review solar forecasting techniques, summarise their applications and match suitable techniques to applications. A review of current solar forecasting techniques, including Numerical Weather Prediction (NWP), statistical/data-driven approaches, and satellite techniques, is presented. We also compare the characteristics of each technique to illustrate the requirements, status and suitability of each for use in short-term solar forecasting applications. The application of sky-camera (‘skycam’) forecasting in electrical grids is discussed, and a detailed case study demonstrates the use of these techniques to enhance grid operation. The control strategy developed demonstrates an increase in the real-time penetration of renewable power by dynamically adapting solar inverter setpoints according to cloud forecast data.
A novel short-term solar forecasting system is presented which makes use of inexpensive ground-based sky imaging cameras (or ‘skycams’). This system is able to predict changes in irradiance by forecasting cloud movement up to 20 min ahead, with a 10 s update frequency.
Several novel techniques for skycam setup and forecasting are presented. These include:
(a) A new, high-performance approach to cloud classification using a novel set of neural network input features.
(b) A new method for calibrating a lens distortion model.
(c) A novel technique using per-pixel cloud movement vectors to predict the timing and extent of sun shading events.
This latter technique is capable of correctly classifying 97% of cloud pixels from a validation database of over 500,000 examples.
Finally, we present a new technique for taking features extracted from sky-camera pixel data and building a model for predicting large shading events. We show an example of how this model can be easily adapted for either conservative or aggressive operation of a solar power system with a backup generator. In conservative operation, this model is shown to supply adequate or excess energy up to 99.96% of the time using 4 min-ahead forecasts when used for scheduling a backup electrical generator; meaning the system would require only minimal battery storage, while producing a large reduction in fossil-fuel consumption.

Background
This study investigated the impact of renal dysfunction (RD) on long-term outcomes in elderly patients with acute coronary syndrome (ACS), and evaluated prognostic factors in elderly patients with ACS and RD.
Methods
This longitudinal prospective study included 184 consecutive patients who were admitted with ACS between January 2009 and January 2010 and also had RD. Patients were divided into five groups according to their estimated glomerular filtration rate (eGFR): 1) eGFR ≥ 90 mL/min/1.73 m2 with evidence of kidney damage, 2) 60 ≤ eGFR < 90 mL/min/1.73 m2, 3) 30 ≤ eGFR < 60 mL/min/1.73 m2, 4) 15 ≤ eGFR < 30 mL/min/1.73 m2, and 5) eGFR < 15 mL/min/1.73 m2. The primary endpoints were death and complications during hospitalization. The secondary endpoint was any major adverse cardiac event (MACE) during follow-up.
Results
The mean follow-up period was 502.2 ± 203.6 days. The mean patient age was 73.7 ± 9.4 years, and 61.4% of the patients were men. Severe RD (eGFR < 30 mL/min/1.73 m2) was an independent predictor of MACE. Severe RD was associated with a low hemoglobin level, low left ventricular ejection fraction, and high levels of high-sensitivity C-reactive protein, N-terminal pro-B-type natriuretic peptide, and cystatin C. Survival was significantly poorer in patients with severe RD than in patients with mild RD.
Conclusions
Among patients with ACS, severe RD was associated with advanced age, diabetes, hypertension, and cardiac dysfunction. Severe RD was an independent risk factor for MACE, and was associated with poor prognosis.

This pair of articles presents the results of a study about forecasting photovoltaic (PV) electricity production for some power plants in mainland France. Forecasts are built with statistical methods exploiting outputs from numerical weather prediction (NWP) models. Contrary to most other studies, forecasts are built without using technical information on the power plants. In each article, several statistical methods are used to build forecast models and their performance is compared by means of adequate scores. When a best forecast emerges, its characteristics are then further assessed in order to gain deeper insight into its merits and flaws. The robustness of the results is evaluated with intensive use of cross-validation.
The companion article Zamo et al. (2013) will deal with probabilistic forecasts of daily production 2 days ahead. By “probabilistic” we mean that our forecast models yield some quantiles of the expected production’s probability distribution.
This article deals with forecasting hourly PV production for the next day in a deterministic way, which means the mean expectable hourly PV production is forecast for each daytime hour. In this part of our study, predictors come from ARPEGE, Météo France’s deterministic NWP model. Our best model is very reliable and performs well, even compared to the best expectable performance computed using observations as predictors. It also points to the value of using predictors based on human forecasters’ experience.

The solaR package allows for reproducible research both for photovoltaics (PV) systems performance and solar radiation. It includes a set of classes, methods and functions to calculate the sun geometry and the solar radiation incident on a photovoltaic generator and to simulate the performance of several applications of the photovoltaic energy. This package performs the whole calculation procedure from both daily and intradaily global horizontal irradiation to the final productivity of grid-connected PV systems and water pumping PV systems. It is designed using a set of S4 classes whose core is a group of slots with multivariate time series. The classes share a variety of methods to access the information and several visualization methods. In addition, the package provides a tool for the visual statistical analysis of the performance of a large PV plant composed of several systems. Although solaR is primarily designed for time series associated to a location defined by its latitude/longitude values and the temperature and irradiation conditions, it can be easily combined with spatial packages for space-time analysis.

Data preprocessing techniques generally refer to the addition, deletion, or transformation of the training set data. Preprocessing data is a crucial step prior to modeling since data preparation can make or break a model’s predictive ability. To illustrate general preprocessing techniques, we begin by introducing a cell segmentation data set (Section 3.1). This data set contains common predictor problems such as skewness, outliers, and missing values. Sections 3.2 and 3.3 review predictor transformations for single predictors and multiple predictors, respectively. In Section 3.4 we discuss several approaches for handling missing data. Other preprocessing steps may include removing (Section 3.5), adding (Section 3.6), or binning (Section 3.7) predictors, all of which must be done carefully so that predictive information is not lost or erroneous information is added to the data. The computing section (3.8) provides R syntax for the previously described preprocessing steps. Exercises are provided at the end of the chapter to solidify concepts.

When predicting a numeric outcome, some measure of accuracy is typically used to evaluate the model’s effectiveness. However, there are different ways to measure accuracy, each with its own nuance. In Section 5.1 we define common measures for evaluating quantitative performance. We also discuss the concept of variance-bias trade-off (Section 5.2), and the implication of this principle for predictive modeling. In Section 5.3, we demonstrate how measures of predictive performance can be generated in R.
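As a concrete illustration of the kind of accuracy measures this chapter abstract refers to, the sketch below computes RMSE, MAE and the coefficient of determination (R²). It is written in Python rather than the book's R, and the function names and toy values are assumptions chosen for the example.

```python
import math


def rmse(observed, predicted):
    """Root mean square error: penalises large errors more strongly."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(observed, predicted):
    """Mean absolute error: average magnitude of the errors."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical values; every prediction is off by exactly 10 units.
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
print(rmse(observed, predicted), mae(observed, predicted), r_squared(observed, predicted))
```

When all errors have the same magnitude, as here, RMSE and MAE coincide; they diverge as soon as a few errors are much larger than the rest, which is one of the nuances between the measures.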

In the history of research of the learning problem one can extract four periods that can be characterized by four bright events: (i) Constructing the first learning machines, (ii) constructing the fundamentals of the theory, (iii) constructing neural networks, (iv) constructing the alternatives to neural networks.

We develop a hybrid, real-time solar forecasting computational model to construct prediction intervals (PIs) of one-minute averaged direct normal irradiance for four intra-hour forecasting horizons: 5, 10, 15, and 20 min. This hybrid model, which integrates sky imaging techniques with support vector machine and artificial neural network sub-models, is developed using one year of co-located, high-quality irradiance and sky image recordings in Folsom, California. We validate the proposed model using six months of measured irradiance and sky image data, and apply it to construct operational PI forecasts in real time at the same observatory. In the real-time scenario, the hybrid model significantly outperforms the reference persistence model and provides high-performance PIs regardless of forecast horizon and weather condition.

This work proposes a novel forecast methodology for intra-hour solar irradiance based on optimized pattern recognition from local telemetry and sky imaging. The model, based on the k-nearest-neighbors (kNN) algorithm, predicts the global (GHI) and direct (DNI) components of irradiance for horizons ranging from 5 min up to 30 min, and the corresponding uncertainty prediction intervals. An optimization algorithm determines the best set of patterns and other free parameters in the model, such as the number of nearest neighbors. Results show that the model achieves significant forecast improvements (between 10% and 25%) over a reference persistence forecast. The results also show that large ramps in the irradiance time series are not captured very well by the point forecasts, mostly because those events are underrepresented in the historical dataset. The inclusion of sky images in the pattern recognition results in a small improvement (below 5%) relative to the kNN without images, but it helps in the definition of the uncertainty intervals (especially in the case of DNI). The prediction intervals determined with this method show good performance, with high probability coverage (≈90% for GHI and ≈85% for DNI) and narrow average normalized width (≈8% for GHI and ≈17% for DNI).
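The pattern-matching idea behind a kNN forecast can be sketched as follows: find the k historical windows most similar to the most recent one, then average what followed them. This is a simplified illustration under stated assumptions, not the paper's optimized method: the Euclidean distance, the min/max interval built from the neighbours' outcomes, and all function names are hypothetical choices made for the example.

```python
import math


def knn_forecast(history, pattern_len, horizon, k):
    """Return (point forecast, (lower, upper)) for `horizon` steps ahead."""
    current = history[-pattern_len:]
    candidates = []
    # Last start index whose window and future value both lie inside history.
    last_start = len(history) - pattern_len - horizon
    for start in range(last_start + 1):
        window = history[start:start + pattern_len]
        future = history[start + pattern_len + horizon - 1]
        candidates.append((math.dist(window, current), future))
    candidates.sort(key=lambda item: item[0])
    outcomes = [future for _, future in candidates[:k]]
    point = sum(outcomes) / len(outcomes)
    # Assumed interval: the spread of the neighbours' outcomes.
    return point, (min(outcomes), max(outcomes))


def coverage(observations, intervals):
    """Fraction of observations that fall inside their prediction interval."""
    inside = sum(lo <= obs <= hi for obs, (lo, hi) in zip(observations, intervals))
    return inside / len(observations)


# Hypothetical periodic series: the pattern [2, 3] is always followed by 0.
history = [0, 1, 2, 3] * 5
print(knn_forecast(history, pattern_len=2, horizon=1, k=3))
```

The `coverage` helper corresponds to the probability coverage quoted in the abstract; events that are rare in `history`, such as large ramps, have few close neighbours, which is consistent with the weakness the abstract reports.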

Solar global irradiation is barely recorded in remote areas around the world. The lack of access to an electricity grid in these areas presents an enormous opportunity for electrification through renewable energy sources and, specifically, with photovoltaic energy where great solar resources are available. Traditionally, solar resource estimation was performed using parametric-empirical models based on the relationship between solar irradiation and other atmospheric and commonly measured variables, such as temperatures, rainfall, sunshine duration, etc., achieving a relatively high level of certainty. The significant improvement in soft-computing techniques, applied extensively in many research fields, has led to improvements in solar global irradiation modeling. This study conducts a comparative assessment of four different soft-computing techniques (artificial neural networks, support vector regression, M5P regression trees, and extreme learning machines). The results were also compared with two well-known parametric models [Liu and Scott, Agric. For. Meteorol. 106(1), 41–59 (2001) and Antonanzas-Torres et al., Renewable Energy 60, 604–614 (2013b)]. A striking mean absolute error of 1.74 MJ/m² day was achieved with support vector regression (around 10% lower than with classic parametric models). Furthermore, the annual sums of estimated solar irradiation with this technique were within the intrinsic tolerance of pyranometers (5%). This methodology is implemented in the free R software environment and released at www.github.com/EDMANSOLAR/remote for future replications of the study in different areas.

Bagging predictors is a method for generating multiple versions of a predictor and using these to get an aggregated predictor. The aggregation averages over the versions when predicting a numerical outcome and does a plurality vote when predicting a class. The multiple versions are formed by making bootstrap replicates of the learning set and using these as new learning sets. Tests on real and simulated data sets using classification and regression trees and subset selection in linear regression show that bagging can give substantial gains in accuracy. The vital element is the instability of the prediction method. If perturbing the learning set can cause significant changes in the predictor constructed, then bagging can improve accuracy.
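The procedure described in this abstract (bootstrap replicates of the learning set, one predictor per replicate, then averaging or plurality voting) can be sketched as below. The base learner, a one-split regression stump, and all names are assumptions chosen to keep the example short; the abstract's own experiments use classification and regression trees and subset selection.

```python
import random
from collections import Counter


def fit_stump(train):
    """Fit a one-split regression stump on (x, y) pairs by minimising squared error."""
    xs = sorted({x for x, _ in train})
    best = None
    for lo, hi in zip(xs, xs[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in train if x <= threshold]
        right = [y for x, y in train if x > threshold]
        mean_left, mean_right = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((y - mean_left) ** 2 for y in left)
               + sum((y - mean_right) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, mean_left, mean_right)
    if best is None:  # every x identical: fall back to the mean response
        mean_all = sum(y for _, y in train) / len(train)
        return lambda x: mean_all
    _, threshold, mean_left, mean_right = best
    return lambda x: mean_left if x <= threshold else mean_right


def bagged_regressor(train, n_replicates=25, seed=0):
    """Average the predictions of stumps fitted on bootstrap replicates."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_replicates):
        replicate = [rng.choice(train) for _ in train]  # sample with replacement
        models.append(fit_stump(replicate))
    return lambda x: sum(model(x) for model in models) / len(models)


def plurality_vote(labels):
    """Aggregation rule for class predictions: the most common label wins."""
    return Counter(labels).most_common(1)[0][0]


# Hypothetical step-shaped data: y jumps from 0 to 1 at x = 5.
train = [(x, 0.0 if x < 5 else 1.0) for x in range(10)]
predict = bagged_regressor(train)
print(predict(0), predict(9))
```

A stump is used because tree-like learners change noticeably when the learning set is perturbed, which is the instability the abstract identifies as the condition under which bagging helps.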

Forecasting the AC power output of a PV plant accurately is important both for plant owners and electric system operators. Two main categories of PV modeling are available: the parametric and the nonparametric. In this paper, a methodology using a nonparametric PV model is proposed, using as inputs several forecasts of meteorological variables from a Numerical Weather Forecast model, and actual AC power measurements of PV plants. The methodology was built upon the R environment and uses Quantile Regression Forests as machine learning tool to forecast AC power with a confidence interval. Real data from five PV plants was used to validate the methodology, and results show that daily production is predicted with an absolute cvMBE lower than 1.3%.

Solar global irradiation is barely recorded in isolated rural areas around the world. Traditionally, solar resource estimation has been performed using parametric-empirical models based on the relationship of solar irradiation with other atmospheric and commonly measured variables, such as temperatures, rainfall, and sunshine duration, achieving a relatively high level of certainty. Considerable improvement in soft-computing techniques, which have been applied extensively in many research fields, has led to improvements in solar global irradiation modeling, although most of these techniques lack spatial generalization.
This new methodology proposes support vector machines for regression with optimized variable selection via genetic algorithms to generate non-locally dependent and accurate models. A case study in Spain has demonstrated the value of this methodology. It achieved a striking reduction in the mean absolute error (MAE), 41.4% and 19.9%, compared to the classic parametric models of Bristow & Campbell and Antonanzas-Torres et al., respectively.

We develop a standalone, real-time solar forecasting computational platform to predict one minute averaged solar irradiance ramps ten minutes in advance. This platform integrates cloud tracking techniques using a low-cost fisheye network camera and artificial neural network (ANN) algorithms, where the former is used to introduce exogenous inputs and the latter is used to predict solar irradiance ramps. We train and validate the forecasting methodology with measured irradiance and sky imaging data collected for a six-month period, and apply it operationally to forecast both global horizontal irradiance and direct normal irradiance at two separate locations characterized by different micro-climates (coastal and continental) in California. The performance of the operational forecasts is assessed in terms of common statistical metrics, and also in terms of three proposed ramp metrics, used to assess the quality of ramp predictions. Results show that the forecasting platform proposed in this work outperforms the reference persistence model for both locations.

The support-vector network is a new learning machine for two-group classification problems. The machine conceptually implements the following idea: input vectors are non-linearly mapped to a very high-dimension feature space. In this feature space a linear decision surface is constructed. Special properties of the decision surface ensure high generalization ability of the learning machine. The idea behind the support-vector network was previously implemented for the restricted case where the training data can be separated without errors. We here extend this result to non-separable training data. High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated. We also compare the performance of the support-vector network to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.

Accurate prediction of solar radiation is of high importance for proper operation of the electrical grid. Over short horizons, forecasting solar irradiance is often performed by extrapolation of field measurements. Four tailored statistical models for forecasting hourly average solar irradiance are proposed and assessed in this paper. These follow from the well-known regression and ARIMA class of models, but bring into the model formulation various physically motivated additional features. These capture the distribution of solar radiation more effectively. Their performance is compared with the performance of a standard model used in the strictly black-box style often encountered in practice. Overall results demonstrate that the proposed models are significantly more accurate than the standard model, under conditions of mostly cloudy skies.

A methodology for downscaling solar irradiation from satellite-derived databases is described using R software. Different packages such as raster, parallel, solaR, gstat, sp and rasterVis are considered in this study for improving solar resource estimation in areas with complex topography, in which downscaling is a very useful tool for reducing inherent deviations in satellite-derived irradiation databases, which lack high spatial resolution. A topographical analysis of horizon blocking and sky-view is developed with a digital elevation model to determine what fraction of hourly solar irradiation reaches the Earth’s surface. Finally, kriging with external drift is applied for a better estimation of solar irradiation throughout the region analyzed, including the use of local measurements. This methodology has been implemented as an example within the region of La Rioja in northern Spain. The mean absolute error found using the methodology proposed is 91.92 kWh/m², vs. 172.62 kWh/m² using the original satellite-derived database (a striking 46.75% lower).
The code is freely available without restrictions for future replications or variations of the study at https://github.com/EDMANSolar/downscaling.

Applied Predictive Modeling covers the overall predictive modeling process, beginning with the crucial steps of data preprocessing, data splitting and foundations of model tuning. The text then provides intuitive explanations of numerous common and modern regression and classification techniques, always with an emphasis on illustrating and solving real data problems. The text illustrates all parts of the modeling process through many hands-on, real-life examples, and every chapter contains extensive R code for each step of the process. This multi-purpose text can be used as an introduction to predictive models and the overall modeling process, a practitioner's reference handbook, or as a text for advanced undergraduate or graduate level predictive modeling courses. To that end, each chapter contains problem sets to help solidify the covered concepts and uses data available in the book's R package. This text is intended for a broad audience as both an introduction to predictive models as well as a guide to applying them. Non-mathematical readers will appreciate the intuitive explanations of the techniques while an emphasis on problem-solving with real data across a wide variety of applications will aid practitioners who wish to extend their expertise. Readers should have knowledge of basic statistical ideas, such as correlation and linear regression analysis. While the text is biased against complex equations, a mathematical background is needed for advanced topics. © Springer Science+Business Media New York 2013. All rights reserved.

The integration of massive solar energy supply into existing grids requires an accurate forecast of the solar resource to manage the energy balance. In this context, we propose a new approach to forecast the Global Horizontal Irradiance at ground level from satellite images and ground-based measurements. The training of spatio-temporal multidimensional autoregressive models with HelioClim-3 data along with 15-min averaged GHI time series is tested with respect to a ground-based station from the BSRN network. Forecast horizons from 15 min to 1 h provided very promising results, validated on a one-year ground-based pyranometric data set. The performance has been compared to another similar method from the literature by means of relative metrics. The proposed approach paves the way for the use of satellite-based surface solar irradiance (SSI) estimation as an SSI map nowcasting method that makes it possible to capture spatio-temporal correlation for the improvement of a local SSI forecast.

When part of the power is generated by grid-connected photovoltaic installations, an effective global solar irradiation (GSI) forecasting tool becomes a must to ensure the quality and the security of the electrical grid. GSI forecasts allow the quantification of generated photovoltaic (PV) power and help electrical grid operators anticipate problems related to the nature of PV power and plan adequate solutions and decisions. In this study, a new methodology for local forecasting of daily global horizontal irradiance (GHI) is proposed. This methodology is a combination of spatial modelling and artificial neural network (ANN) techniques. An ANN-based model is developed to predict the local GHI based on daily weather forecasts provided by the US National Oceanic and Atmospheric Administration (NOAA) for four neighbouring locations. The methodology was tested for two locations: Le Bourget du Lac (45°38′44″N, 5°51′33″E), which is located in the French Alps, and Cadarache (43°42′28″N, 05°46′31″E), which is located in the south of France. The model’s forecasts were compared to measured data for the two locations, and validation results indicate that the ANN-based method presented in this study can estimate daily GHI with satisfactory accuracy.

This paper discusses the performance of a novel Coral Reefs Optimization – Extreme Learning Machine (CRO–ELM) algorithm in a real problem of global solar radiation prediction. The work considers different meteorological data from the radiometric station at Murcia (southern Spain), both from measurements, radiosondes and meteorological models, and fully describes the hybrid CRO–ELM to solve the prediction of the daily global solar radiation from these data. The algorithm is designed in such a way that the ELM solves the prediction problem, whereas the CRO evolves the weights of the neural network, in order to improve the solutions obtained. The experiments carried out have shown that the CRO–ELM approach is able to obtain an accurate prediction of the daily global radiation, better than the classical ELM, and the Support Vector Regression algorithm.

This paper proposes an accurate short-term solar irradiance prediction scheme via support vector regression. Utilizing clearness index conversion and appropriate features, the support vector regression models are able to output satisfactory prediction results. The prediction results are further improved by the proposed ramp-down event forecasting and solar irradiance refinement procedures. With the help of all-sky image analysis, two separate regression models are constructed based on the cloud obstruction conditions near the solar disk. With bi-model prediction, the behavior of the changing irradiance can be captured more accurately. Moreover, if a ramp-down event is forecast, the predicted irradiance is corrected based on the cloud cover ratio in the area near the sun. The experiments have shown that the proposed method can effectively improve the prediction accuracy on a highly challenging dataset.
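The abstract does not spell out its clearness index conversion, so the sketch below uses one common definition as an assumption: the ratio of measured GHI to the extraterrestrial irradiance on a horizontal plane. The solar constant of 1361 W/m², the simple eccentricity correction and all function names are assumptions for illustration.

```python
import math

SOLAR_CONSTANT = 1361.0  # W/m^2, assumed value


def extraterrestrial_horizontal(day_of_year, zenith_deg):
    """Irradiance on a horizontal plane at the top of the atmosphere (W/m^2)."""
    if zenith_deg >= 90.0:
        return 0.0  # sun at or below the horizon
    eccentricity = 1.0 + 0.033 * math.cos(2.0 * math.pi * day_of_year / 365.0)
    return SOLAR_CONSTANT * eccentricity * math.cos(math.radians(zenith_deg))


def to_clearness_index(ghi, day_of_year, zenith_deg):
    """Convert GHI to a clearness index; None when the sun is not up."""
    reference = extraterrestrial_horizontal(day_of_year, zenith_deg)
    if reference <= 0.0:
        return None
    return ghi / reference


def from_clearness_index(kt, day_of_year, zenith_deg):
    """Convert a (predicted) clearness index back to GHI in W/m^2."""
    return kt * extraterrestrial_horizontal(day_of_year, zenith_deg)


# Hypothetical reading: 500 W/m^2 on day 172 with the sun 30 degrees from zenith.
kt = to_clearness_index(500.0, day_of_year=172, zenith_deg=30.0)
print(kt, from_clearness_index(kt, day_of_year=172, zenith_deg=30.0))
```

Working in clearness index removes most of the deterministic daily and seasonal cycle, so a regression model trained on it mainly has to learn the cloud-driven variability; predictions are converted back to irradiance with the inverse function.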

Power forecasting is an important factor for planning the operations of photovoltaic (PV) systems. This paper presents an advanced statistical method for solar power forecasting based on artificial intelligence techniques. The method requires as input past power measurements and meteorological forecasts of solar irradiance, relative humidity and temperature at the site of the photovoltaic power system. A self-organized map (SOM) is trained to classify the local weather type 24 h ahead as provided by online meteorological services. A unique feature of the method is that, following a preliminary weather type classification, the neural networks can be well trained to improve the forecast accuracy. The proposed method is suitable for the operational planning of transmission system operators, i.e. a forecasting horizon of 24 h ahead, and for PV power system operators trading in electricity markets. Application of the forecasting method to the power production of an actual PV power system shows the validity of the method.

We forecast hourly solar irradiance time series using satellite image analysis and a hybrid exponential smoothing state space (ESSS) model together with artificial neural networks (ANN). Since cloud cover is the major factor affecting solar irradiance, cloud detection and classification are crucial to forecast solar irradiance. Geostationary satellite images provide cloud information, allowing a cloud cover index to be derived and analysed using self-organizing maps (SOM). Owing to the stochastic nature of cloud generation in tropical regions, the ESSS model is used to forecast cloud cover index. Among different models applied in ANN, we favour the multi-layer perceptron (MLP) to derive solar irradiance based on the cloud cover index. This hybrid model has been used to forecast hourly solar irradiance in Singapore and the technique is found to outperform traditional forecasting models.

The higher penetration of renewable resources in the energy portfolios of several communities accentuates the need for accurate forecasting of variable resources (solar, wind, tidal) at several different temporal scales in order to achieve power grid balance. Solar generation technologies have experienced strong energy market growth in the past few years, with corresponding increase in local grid penetration rates. As is the case with wind, the solar resource at the ground level is highly variable mostly due to cloud cover variability, atmospheric aerosol levels, and indirectly and to a lesser extent, participating gases in the atmosphere. The inherent variability of solar generation at higher grid penetration levels poses problems associated with the cost of reserves, dispatchable and ancillary generation, and grid reliability in general. As a result, high accuracy forecast systems are required for multiple time horizons that are associated with regulation, dispatching, scheduling and unit commitment. Here we review the theory behind these forecasting methodologies, and a number of successful applications of solar forecasting methods for both the solar resource and the power output of solar plants at the utility scale level.

In this work, a new hybrid model for short-term power forecasting of a grid-connected photovoltaic plant is introduced. The new model combines two well-known methods: the seasonal auto-regressive integrated moving average method (SARIMA) and the support vector machines method (SVMs). An experimental database of the power produced by a small-scale 20 kWp GCPV plant is used to develop and verify the effectiveness of the proposed model in short-term forecasting. Hourly forecasts of the power produced by the plant were carried out for a few days, showing quite good accuracy. A comparative study is also presented, showing that the developed hybrid model performs better than both the SARIMA and the SVM models.

Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error for forests converges a.s. to a limit as the number of trees in the forest becomes large. The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation between them. Using a random selection of features to split each node yields error rates that compare favorably to Adaboost (Y. Freund & R. Schapire, Machine Learning: Proceedings of the Thirteenth International conference, ***, 148–156), but are more robust with respect to noise. Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the splitting. Internal estimates are also used to measure variable importance. These ideas are also applicable to regression.

In this work, a new two-phase approach is used to predict the hourly solar radiation series. In the detrending phase, several models are applied to remove the non-stationary trend in the solar radiation series. To judge the goodness of the different detrending models, the Augmented Dickey–Fuller method is applied to test the stationarity of the residual. The optimal model is used to detrend the solar radiation series. In the prediction phase, the Autoregressive and Moving Average (ARMA) model is used to predict the stationary residual series. Furthermore, the controversial Time Delay Neural Network (TDNN) is applied to do the prediction. Because ARMA and TDNN each have their own strengths, a novel hybrid model that combines both ARMA and TDNN is applied to produce better predictions. The simulation results show that this hybrid model can take advantage of both ARMA and TDNN and give excellent results.

The effectiveness of ultrafiltration for the purification of recombinant proteins from aqueous corn endosperm and germ extracts was examined using model proteins of two different sizes, recombinant type I human collagen (rCollagen, 265kDa) and green fluorescent protein (GFP, 27kDa), to evaluate the effects of membrane pore size, transmembrane pressure (TMP), crossflow rate, and filtration pH on permeate flux and protein sieving. Using a 300kDa MWCO membrane resulted in a significant loss of rCollagen, whereas a 100kDa MWCO membrane completely retained rCollagen. Increasing the filtration crossflow rate and TMP resulted in a higher permeate flux without significantly altering the sieving of the host cell proteins (HCP) or GFP. The greatest HCP sieving was observed in the endosperm extract filtration at low pH and, compared to endosperm, the filtration of germ extracts had lower HCP sieving. GFP exhibited similar sieving as the average HCP for all filtration conditions. rCollagen purity of 89% was achieved with only diafiltration of endosperm extracts and, when preceded by precipitation, a purity of >99% was attained. Thus, ultrafiltration is a valuable method to separate and purify corn-hosted recombinant proteins >100kDa, particularly when the expression is targeted to the endosperm.

Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?

Artificial neural networks are a powerful tool for forecasting solar irradiance. To achieve higher forecasting accuracy, an artificial neural network and wavelet analysis have been combined to develop a new method for forecasting solar irradiance. In this paper, the sampled data sequence of solar irradiance is mapped into several time-frequency domains using the wavelet transform, and a recurrent back-propagation (BP) network is established for each domain. The forecasted solar irradiance equals the algebraic sum of the components of all the time-frequency domains, each predicted by its corresponding network. A discount coefficient method is adopted in updating the weights and biases of the networks so that the later forecasts play more important roles. On the basis of the principle of combining artificial neural networks and wavelet analysis, a model is completed for forecasting solar irradiance. Based on the historical day-by-day records of solar irradiance in Shanghai, an example of forecasting total irradiance is presented. The results of the example indicate that the method makes the forecasts much more accurate than forecasts using artificial neural networks without wavelet analysis.
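
The "algebraic sum of the components" idea can be sketched with a Haar multiresolution split, under several assumptions: the abstract does not name the wavelet family, so Haar is chosen here only because it reduces to block averages; the per-component forecaster is a naive persistence rule standing in for the recurrent BP networks; and the eight irradiance values are invented.

```python
def block_average(signal, block):
    """Replace each run of `block` samples by its mean (Haar smooth at that scale)."""
    out = []
    for start in range(0, len(signal), block):
        chunk = signal[start:start + block]
        mean = sum(chunk) / len(chunk)
        out.extend([mean] * len(chunk))
    return out


def haar_components(signal, levels):
    """Split a signal into `levels` detail components plus one smooth component.

    Every component has the same length as the input, and the components
    add back up to the input sample by sample.
    """
    components = []
    previous = list(signal)
    for level in range(1, levels + 1):
        smooth = block_average(signal, 2 ** level)
        components.append([p - s for p, s in zip(previous, smooth)])  # detail
        previous = smooth
    components.append(previous)  # coarsest smooth component
    return components


def persistence_forecast(component):
    """Stand-in forecaster: the next value equals the last observed value."""
    return component[-1]


# Invented sample values in W/m^2; the length is a multiple of 2**levels.
irradiance = [0.0, 120.0, 340.0, 560.0, 610.0, 480.0, 250.0, 60.0]
components = haar_components(irradiance, levels=2)

# Forecast each component separately, then sum the component forecasts.
forecast = sum(persistence_forecast(c) for c in components)
reconstruction = [sum(c[i] for c in components) for i in range(len(irradiance))]
```

With persistence as the per-component model, the summed forecast simply reproduces the last observation; the decomposition only pays off once each component gets a model suited to its own time scale, which is the role the networks play in the paper.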

In the theory of linear models, the concept of degrees of freedom plays an important role. This concept is often used for measurement of model complexity, for obtaining an unbiased estimate of the error variance, and for comparison of different models. I have developed a concept of generalized degrees of freedom (GDF) that is applicable to complex modeling procedures. The definition is based on the sum of the sensitivity of each fitted value to perturbation in the corresponding observed value. The concept is nonasymptotic in nature and does not require analytic knowledge of the modeling procedures. The concept of GDF offers a unified framework under which complex and highly irregular modeling procedures can be analyzed in the same way as classical linear models. By using this framework, many difficult problems can be solved easily. For example, one can now measure the number of observations used in a variable selection process. Different modeling procedures, such as a tree-based regression and a projection pursuit regression, can be compared on the basis of their residual sums of squares and the GDF that they cost. I apply the proposed framework to measure the effect of variable selection in linear models, leading to corrections of selection bias in various goodness-of-fit statistics. The theory also has interesting implications for the effect of general model searching by a human modeler.
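
The definition above, a sum over observations of the sensitivity of each fitted value to its own observed value, can be checked numerically on procedures whose degrees of freedom are known. The sketch below uses a one-at-a-time finite difference, which is a simplification chosen for clarity and not necessarily the estimation procedure used in the paper; `generalized_dof`, the two fitting functions, and the data are all illustrative.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns the fitted values."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [intercept + slope * x for x in xs]


def fit_mean(xs, ys):
    """Constant model: every fitted value is the sample mean of y."""
    mean_y = sum(ys) / len(ys)
    return [mean_y for _ in ys]


def generalized_dof(fit, xs, ys, eps=1e-3):
    """Sum over i of d(fitted_i)/d(y_i), estimated by perturbing one y_i at a time."""
    base = fit(xs, ys)
    total = 0.0
    for i in range(len(ys)):
        perturbed = list(ys)
        perturbed[i] += eps
        total += (fit(xs, perturbed)[i] - base[i]) / eps
    return total


# Illustrative data only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.1, 0.9, 2.2, 2.8, 4.1, 5.3]
gdf_line = generalized_dof(fit_line, xs, ys)  # two parameters: intercept, slope
gdf_mean = generalized_dof(fit_mean, xs, ys)  # one parameter: the mean
```

For these linear procedures the sensitivities are the diagonal of the hat matrix, so the sum recovers the classical parameter count. The same function accepts any fitting procedure that maps `(xs, ys)` to fitted values, which is what makes the concept applicable to procedures without an analytic form.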

This technical note presents a conversion function between the widely used Linke turbidity coefficient TL, the atmospheric water vapor and urban aerosol content. It takes into account the altitude of the application site. The function is based on radiative transfer calculations and validated with the help of an independent clear sky model. Its precision is around 0.12 units of TL.

Forecasting of solar irradiance is in general significant for planning the operations of power plants which convert renewable energies into electricity. In particular, the ability to predict the solar irradiance (up to 24 h or even more) can become fundamental in making power dispatching plans for Grid Connected Photovoltaic Plants (GCPV) and, for stand-alone and hybrid systems, a useful reference for improving the control algorithms of charge controllers. In this paper, a practical method for solar irradiance forecasting using an artificial neural network (ANN) is presented. The proposed Multilayer Perceptron (MLP) model makes it possible to forecast the solar irradiance 24 h ahead using the present values of the mean daily solar irradiance and air temperature. An experimental database of solar irradiance and air temperature data (from July 1st 2008 to May 23rd 2009 and from November 23rd 2009 to January 24th 2010) has been used. The database was collected in Trieste (latitude 45°40′N, longitude 13°46′E), Italy. In order to check the generalization capability of the MLP forecaster, a K-fold cross-validation was carried out. The results indicate that the proposed model performs well, with a correlation coefficient in the range 98–99% for sunny days and 94–96% for cloudy days. As an application, a comparison between the forecasted energy and the energy produced by the GCPV plant installed on the rooftop of the municipality of Trieste shows the goodness of the proposed model.
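
The K-fold check mentioned above can be sketched as follows, with a simple least-squares line standing in for the MLP and synthetic numbers standing in for the Trieste measurements. The fold layout (contiguous blocks), the value of K, and all function names are assumptions, since the abstract does not specify them.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size


def fit_line(xs, ys):
    """Least-squares line; returns (intercept, slope). Stand-in for the MLP."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def cross_validate(xs, ys, k):
    """Fit on k-1 folds, score the held-out fold by correlation, repeat k times."""
    scores = []
    for train, test in k_fold_splits(len(xs), k):
        intercept, slope = fit_line([xs[i] for i in train], [ys[i] for i in train])
        predicted = [intercept + slope * xs[i] for i in test]
        observed = [ys[i] for i in test]
        scores.append(pearson_r(predicted, observed))
    return scores


# Synthetic input/target pairs (not measured data).
noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.3, -0.3, 0.1, 0.2, -0.2]
xs = [float(i) for i in range(12)]
ys = [2.0 * x + 1.0 + e for x, e in zip(xs, noise)]
scores = cross_validate(xs, ys, k=4)
held_out = sorted(i for _, test in k_fold_splits(len(xs), 4) for i in test)
```

Each sample is held out exactly once, so the per-fold correlations together indicate how the model behaves on data it was not fitted to.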