Article
Comparison of Training Approaches for Photovoltaic
Forecasts by Means of Machine Learning
Alberto Dolara, Francesco Grimaccia, Sonia Leva, Marco Mussetta
and Emanuele Ogliari †,*
Dipartimento di Energia, Politecnico di Milano, via La Masa 34, 20156 Milano, Italy;
alberto.dolara@polimi.it (A.D.); francesco.grimaccia@polimi.it (F.G.); sonia.leva@polimi.it (S.L.);
marco.mussetta@polimi.it (M.M.)
*Correspondence: emanuelegiovanni.ogliari@polimi.it; Tel.: +39-2399-8524
† These authors contributed equally to this work.
Received: 31 December 2017; Accepted: 28 January 2018; Published: 2 February 2018
Abstract: The relevance of forecasting in renewable energy sources (RES) applications is increasing, due to their intrinsic variability. In recent years, several machine learning and hybrid techniques have been employed to perform day-ahead photovoltaic (PV) output power forecasts. In this paper, the authors compare the main characteristics of the artificial neural network used in a hybrid method, focusing in particular on the training approach. The influence of different data-set compositions on the forecast outcome has been inspected by increasing the training data-set size and by varying the training and validation shares, in order to assess the most effective training method of this machine learning approach, based on commonly used and newly defined performance indexes for the prediction error. The results are validated over a one-year time range of experimentally measured data. Novel error metrics are proposed and compared with traditional ones, showing the best approach for the different cases of either a newly deployed PV plant or an already-existing PV facility.
Keywords: photovoltaics; power forecasting; artificial neural networks
1. Introduction
In recent years, several forecasting methods have been developed for the output power of renewable energy sources (RES) [1], addressing in particular the intrinsic variability of parameters related to changing weather conditions, which directly affect the photovoltaic (PV) systems’ power output [2]. This growing attention is mainly due to the increasing share of RES in power systems, which involves novel technical challenges for the efficiency of the electrical grid [3]. In particular, predictive tools based on historical data can generally provide advantages in PV plant operation [4,5], reduce excess production, and take advantage of incentives for RES production [6].
Among the commonly used forecasting models, most aim to predict the expected power production based on forecasts from numerical weather prediction (NWP) systems [7]. This is a complex problem with high degrees of non-linearity; for this reason, it is commonly approached by means of advanced models and techniques, e.g., evolutionary computation [8], machine learning (ML) [9], and artificial neural networks (ANNs) [10]. These are pseudo-stochastic iterative approaches defined in the class of computational intelligence techniques, and are usually employed to address pattern recognition, function approximation, control, and forecasting problems [11]. Moreover, they are generally able to handle incomplete or missing data and solve problems with a high degree of complexity.
Recently, several ANN layouts have been developed to solve different tasks [12], such as time series prediction, complex dynamical system emulation [13], speech generation, handwritten digit recognition, and image compression, due to their ability to learn from extended time series of historical measurements with acceptable error levels compared to other statistical and physical forecasting models [14]. Currently, ANN employment in forecasting is quite straightforward due to the widespread development of specific software applications [15–17].
Appl. Sci. 2018, 8, 228; doi:10.3390/app8020228 www.mdpi.com/journal/applsci
In particular, the first attempts at solar power forecasting by means of ANN started more than a decade ago [18]. Generally, in the case of PV power output, common training data are the historical measurements of power production from a PV facility and meteorological parameters unique to the facility location, including temperature, global horizontal irradiance (i.e., the intensity of all the solar radiation components on a horizontal surface) [19], and cloud cover above the facility. Additional forecasted variables from the numerical weather predictions can also be considered, such as wind speed, humidity, pressure, etc. [20].
Novel forecasting models were recently implemented by adding an estimate of the clear sky
radiation to the series of historical local weather data, as reported in [21].
Additionally, the effectiveness of ensemble methods was demonstrated in [22], thus giving additional advantages in terms of results reliability and the implementation of efficient parallel computing techniques.
In their previous work [23], the authors conducted a detailed analysis to find a procedure for the best ANN layout and settings in terms of the number of layers, neurons, and trials for the PV day-ahead forecast. Furthermore, evidence showed that the forecasting performance of ML techniques is affected by the composition of the training data-set, as well as by input selection [24,25].
In this paper, a specific study is conducted on training data-sets in order to provide a more detailed analysis of the effect of different approaches in the training data-set composition on the day-ahead forecast of the PV power production. In particular, the authors present some procedures to set up the training and validation data-sets for the ANN used in the physical hybrid method to perform the day-ahead PV power forecast in view of the electricity market. Moreover, a novel error metric is proposed and compared with traditional ones, in order to validate the best training approach in different cases: indeed, the procedures outlined herein can be adopted to set up data-sets based on either historical data retrieved from an existing PV plant or on incremental data measurements in a newly deployed PV plant. The test data-set is made up of the 24 hourly PV power values forecasted one day ahead.
The paper is structured as follows: Section 2 provides an overview of the considered approaches for the composition of the training database, considering both cases of historical data retrieved from an existing PV plant and incremental data measurements in a newly deployed PV plant; Section 3 presents the methodology implemented to compare the different training approaches presented here, proposing some new metrics aimed at evaluating the suitability of the proposed configurations in terms of error performance and statistical behavior; Section 4 presents the considered case study, which is used to test the proposed training approaches; specific simulations and numerical results are provided in Section 5, and final remarks are reported in Section 6.
2. Training Database Composition Approaches
In order to perform the day-ahead forecast, the ANN needs to be trained. Hence, the amount of historical data employed in the supervised learning determines the ANN forecast capability. These data consist of samples exploited in the process of identifying the links among neurons in the network which minimize the error in the forecast. To this end, the whole set of available samples is divided into two groups:
• the “training set” (or equally “training database”), which is used to adjust the weights among neurons by performing the forecast on the same samples,
• the “validation set”, which is used as a stopping criterion to avoid over-fitting and under-fitting. It verifies the quality of the trained network on additional samples which have not been previously included in the training set. The purpose of this step is to test the generalization capability of the neural network on a new data-set.
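As an illustration of how a validation set can act as a stopping criterion, the following minimal sketch selects the training epoch after which the validation error stopped improving. The function name `best_stopping_epoch` and the `patience` parameter are assumptions made for this example; the paper does not state which stopping rule was actually used.

```python
def best_stopping_epoch(validation_errors, patience=3):
    """Return the index of the epoch whose weights would be kept.

    validation_errors: validation error measured after each training epoch.
    patience: hypothetical number of non-improving epochs tolerated
              before training is stopped (not taken from the paper).
    """
    best_error = float("inf")
    best_epoch = 0
    for epoch, error in enumerate(validation_errors):
        if error < best_error:
            # Validation error improved: remember this epoch.
            best_error, best_epoch = error, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: stop training.
            break
    return best_epoch
```

Stopping on the validation error rather than on the training error is what guards against over-fitting: the training error usually keeps decreasing even after the network has started to memorize its training samples.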
Learning occurs by updating elements within the network; thus, its response iteratively improves to match the desired output. An ANN is trained when it has learned its task and converges to a solution. To achieve this, some learning algorithms are commonly used:
• error back-propagation (EBP),
• gradient descent,
• conjugate gradient,
• evolutionary algorithms (genetic algorithms, particle swarm optimization, etc.).
Depending on the problem, the fastest algorithm may converge rapidly to a local minimum, which does not guarantee the maximum accuracy. In addition, it should be considered that a large training set provides a better sample of the trends, improving generalization, but it generally slows down the learning process. If an ANN is not properly trained or sized, there are usually undesired results, such as “overfitting” and “underfitting” [26]. Using ANN ensembles by averaging their outputs has been demonstrated to be beneficial, as it helps to avoid chance correlations and the overtraining problem [27,28].
However, choosing both the most suitable learning algorithm and the proper size of the training set which minimizes the error is a challenge which should be faced in each case study [29–31].
In this paper, we inspect how the behavior on the day-ahead forecast is influenced by the possible
characteristics summarized in Figure 1. The first characteristic of the data-set is either “incremental”
when the elements belonging to the training data-set are progressively available over time and the
training set size gradually increases or “complete” if an already existing database of samples is
available. The second characteristic refers to the way the data-set is used for training the ANN.
As the forecast-making is mainly a stochastic process, the choice could be to use entirely the same
training data set for each forecast of the ensemble (we refer to the single forecast with the term “trial”,
and in this case, all the trials will be the same in the ensemble) or to shuffle its elements, grouping them
in smaller subsets adopted each time to separately train a different ANN (in this other case, each trial
is independent, as all the training data-sets are different). Finally, the mean of the resulting output is
usually calculated in the so-called “ensemble” forecast. The third characteristic is related to the order of
the hourly samples that constitute the training data-set. They can either appear consecutively, in the
chronological order of the time series they belong to, or be randomly grouped and mixed up.
Figure 1. Main features of the ANN training data-sets.
The combination of these characteristics results in different ANN training methods, which could
affect the forecast.
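The distinction between dependent and independent trials, and the final ensemble averaging, can be sketched as follows. This is only an illustrative outline: the function names, the `subset_share` parameter, and the seed are hypothetical, and the size of the shuffled subsets is not specified in the text.

```python
import random
from statistics import fmean


def trial_training_sets(samples, n_trials, independent, subset_share=0.9, seed=0):
    """Build one training data-set per trial of the ensemble.

    independent=False: every trial receives the same data-set (dependent trials).
    independent=True:  every trial receives a freshly shuffled subset.
    subset_share is an assumed value, not taken from the paper.
    """
    rng = random.Random(seed)
    if not independent:
        return [list(samples) for _ in range(n_trials)]
    size = round(len(samples) * subset_share)
    # rng.sample draws without replacement and returns the items in random order.
    return [rng.sample(samples, size) for _ in range(n_trials)]


def ensemble_forecast(trial_forecasts):
    """Average the hourly forecasts of all trials, hour by hour."""
    return [fmean(hour_values) for hour_values in zip(*trial_forecasts)]
```

Here each element of `trial_forecasts` would be the list of hourly power values produced by one trained network; how each network is trained is outside the scope of the sketch.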
All of the assumptions exposed here are valid, in general terms, for all ANN-based methods. In this specific paper, the authors employ the Physical Hybrid Artificial Neural Network (PHANN) method for the day-ahead forecast, as described in detail in [14,21]. This procedure mixes the physical Clear Sky Radiation Model (CSRM) and the stochastic ANN method, as reported in Figure 2.
Figure 2. Physical Hybrid Artificial Neural Network (PHANN) method schematic diagram.
2.1. Incremental Training Data-Set
An incremental data-set occurs when the available samples are limited. Usually this is the case of real-time or time-dependent processes, where data can be acquired only progressively. Consider, for example, our case study, where the monitoring system starts recording data from the first day of operation of the PV plant: initially a small amount of data is recorded, and if we acquire hourly samples, 24 samples are added to the historical data-set every day.
In this database composition (e.g., see Figure 3), the days which can be employed for ANN training are those available starting from the PV plant commissioning (day 1) until the $k_d$ day before the forecast (day $X_d$). As a consequence, the size of the training database will increase over time.
In order to supply the data-set to the network for the training step, samples can be arranged in different ways. Those adopted in this paper are listed in Figure 4, and lead to different forecast results. A short description is given in the following:
• Method A employs chronologically consecutive samples, assigning the 90% of the samples which are closest to the forecast day to the training set and the remaining 10% of the samples to the validation set.
• Method A* employs the same chronologically consecutive samples, assigning 90% of the samples to the training set and the 10% of the samples which are closest to the forecast day to the validation set.
• Method B randomly assigns the samples, 90% to the training set and 10% to the validation set.
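The three arrangements can be written down compactly. The sketch below assumes that `samples` is a list in chronological order (oldest first, so the last element is the one closest to the forecast day); the function name and the `seed` argument are illustrative choices, not part of the original method description.

```python
import random


def split_samples(samples, method, train_share=0.9, seed=0):
    """Return (training_set, validation_set) for methods "A", "A*" and "B".

    samples: hourly samples in chronological order, oldest first.
    """
    n_train = round(len(samples) * train_share)
    n_val = len(samples) - n_train
    if method == "A":
        # Training: the 90% closest to the forecast day; validation: the oldest 10%.
        return samples[n_val:], samples[:n_val]
    if method == "A*":
        # Training: the oldest 90%; validation: the 10% closest to the forecast day.
        return samples[:n_train], samples[n_train:]
    if method == "B":
        # Random assignment: chronological order is discarded before splitting.
        shuffled = list(samples)
        random.Random(seed).shuffle(shuffled)
        return shuffled[:n_train], shuffled[n_train:]
    raise ValueError(f"unknown method: {method!r}")
```

For instance, with 10 days of hourly data (240 samples), each method yields 216 training samples and 24 validation samples; only their position in the time series differs.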
Figure 3. Hourly samples are progressively available in an incremental training database. PV: photovoltaic.
Figure 4. Training database composition for methods A, A*, and B.
In the first two methods (A and A*), the effect of the proximity of the training set to the forecast
day is examined (implying seasonal variations on the parameter), inspecting how the forecast is
affected by the proximity of the samples employed in the training rather than in the validation step.
For example, it is clear that forecasting spring days cannot be accurate if the training samples belong to
the past autumn or winter, and the same consideration applies for the validation. Reasonably, we are
expecting that the further the samples of the validation are, the less accurate the forecast. Obviously,
this problem is not addressed in Method B, as samples are randomly chosen.
2.2. Complete Training Data-Set
In the complete data-set, an extended amount of samples is available, but it might belong to a period of time which is distant from the days of the forecast, as shown in Figure 5. In this case, the samples employed for the ANN training can either be mixed (as shown in Figure 6) each time that a trial is performed, so that trials are independent (Method C1), or be the same for every trial, so that each trial depends on the same training data-set (Method C2).
Figure 5. Hourly samples belonging to an extended period of time are available in a complete training database.
The complete list of the training methods which have been adopted in this paper is in Table 1.
The different shares of the training and the validation set, 90% and 10%, respectively, have been set up
in previous works.
Figure 6. Hourly samples belonging to an extended period of time in a complete training database are randomly mixed.
Table 1. Different methods for the composition of the ANN training data-sets which have been analysed. In (90%ts 10%vs), ts = training set; vs = validation set.

| Method | Data-Set    | Trials      | Samples                   |
|--------|-------------|-------------|---------------------------|
| A      | Incremental | Dependent   | Consecutive (10%vs 90%ts) |
| A*     | Incremental | Dependent   | Consecutive (90%ts 10%vs) |
| B1     | Incremental | Independent | Random                    |
| B2     | Incremental | Dependent   | Random                    |
| C1     | Complete    | Independent | Random                    |
| C2     | Complete    | Dependent   | Random                    |
3. Evaluation Indexes
The effect of the different training methods is investigated by means of some evaluation indexes. These indexes aim at assessing the accuracy of the forecasts and the related error, and it is therefore necessary to define them. There is a wide variety of existing definitions of forecasting performance, and technical papers present many of these indexes; hence, we report some of the most commonly used definitions in the literature [32–34].
The hourly error $e_h$ is the starting definition, given as the difference between the hourly mean values of the power measured in the $h$-th hour, $P_{m,h}$, and the forecast $P_{p,h}$ provided by the adopted model [32,35]:

$$e_h = P_{m,h} - P_{p,h} \quad (\mathrm{W}). \tag{1}$$
From the hourly error expression and its absolute value $|e_h|$, other definitions can be inferred, e.g., the well-known mean absolute percentage error (MAPE):

$$MAPE = \frac{1}{N}\sum_{h=1}^{N}\left|\frac{e_h}{P_{m,h}}\right| \cdot 100, \tag{2}$$

where $N$ represents the number of samples (hours) considered: usually it is calculated for a single day, month, or year.
Since the hourly measured power $P_{m,h}$ changes significantly during the same day (i.e., sunrise, noon, and sunset), for the sake of a fair comparison, in this paper the authors preferred to consider the normalized mean absolute error $NMAE_\%$:

$$NMAE_\% = \frac{1}{N}\sum_{h=1}^{N}\frac{|e_h|}{C} \cdot 100, \tag{3}$$

where the percentage of the absolute error is referred to the rated power $C$ of the plant, in place of the hourly measured power $P_{m,h}$.
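The practical difference between MAPE and the normalized mean absolute error can be seen on a small numerical example. The sketch below implements both definitions as given above; the function names and the toy power values are illustrative assumptions, not data from the case study.

```python
def mape_percent(measured, forecast):
    """Mean absolute percentage error; undefined when a measured value is zero."""
    n = len(measured)
    return 100.0 / n * sum(abs((m - p) / m) for m, p in zip(measured, forecast))


def nmae_percent(measured, forecast, rated_power):
    """Mean absolute error normalized by the rated power C of the plant."""
    n = len(measured)
    return 100.0 / n * sum(abs(m - p) / rated_power for m, p in zip(measured, forecast))


# Toy hourly powers in W (hypothetical); the first hour mimics a low-power sunrise sample.
measured = [5.0, 100.0, 200.0, 100.0]
forecast = [15.0, 110.0, 180.0, 100.0]
print(mape_percent(measured, forecast))         # dominated by the 5 W hour
print(nmae_percent(measured, forecast, 245.0))  # same errors, referred to 245 W
```

In this example a 10 W error on a 5 W measurement contributes 200% to the MAPE sum, while the same error contributes only about 4% when referred to the rated power, which is why the normalized index is less sensitive to sunrise and sunset hours.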
In this paper we also adopted the mean value of all the $NMAE_{\%,d}$, which refers to the $d$-th day, calculated over the whole period. Therefore, we introduce $\overline{NMAE}_\%$, which is the mean of all the daily $NMAE_{\%,d}$ obtained with a given data-set:

$$\overline{NMAE}_\% = \frac{1}{D}\sum_{d=1}^{D} NMAE_{\%,d}. \tag{4}$$
The weighted mean absolute error $WMAE_\%$ is based on total energy production:

$$WMAE_\% = \frac{\sum_{h=1}^{N}|e_h|}{\sum_{h=1}^{N}P_{m,h}} \cdot 100. \tag{5}$$
The normalized root mean square error $nRMSE$ is based on the maximum hourly power output $P_{m,h}$:

$$nRMSE_\% = \frac{\sqrt{\dfrac{\sum_{h=1}^{N}|e_h|^2}{N}}}{\max(P_{m,h})} \cdot 100. \tag{6}$$
This error definition is the well-known root mean square error ($RMSE$), which has been normalized over the maximum hourly power output $P_{m,h}$ measured in the considered time range, for the sake of a fair comparison.
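Equations (5) and (6) translate directly into code. The following is a minimal sketch with illustrative function names and toy data; it assumes that the total measured energy and the maximum measured power are both greater than zero.

```python
import math


def wmae_percent(measured, forecast):
    """Sum of absolute hourly errors over the total measured energy, in percent."""
    total_error = sum(abs(m - p) for m, p in zip(measured, forecast))
    return 100.0 * total_error / sum(measured)


def nrmse_percent(measured, forecast):
    """Root mean square error normalized by the maximum measured hourly power."""
    n = len(measured)
    mean_square = sum((m - p) ** 2 for m, p in zip(measured, forecast)) / n
    return 100.0 * math.sqrt(mean_square) / max(measured)


# Toy hourly powers in W (hypothetical values, not from the case study).
measured = [0.0, 100.0, 200.0, 100.0]
forecast = [0.0, 110.0, 180.0, 100.0]
print(wmae_percent(measured, forecast))
print(nrmse_percent(measured, forecast))
```

Because the errors are squared before averaging, the 20 W error weighs four times as much as the 10 W error in the second index, whereas it weighs only twice as much in the first.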
$NMAE_\%$ is widely used to evaluate the accuracy of predictions and trend estimations. In fact, relative errors are often large because they are divided by small power values (for instance, the low values associated with sunset and sunrise): in such cases, $WMAE_\%$ could be very large and biased, while $NMAE_\%$, by weighting these values with the capacity of the plant $C$, is more useful.
The $nRMSE_\%$ measures the mean magnitude of the absolute hourly errors $|e_h|$. In fact, it gives a relatively higher weight to larger errors, thus allowing particularly undesirable results to be emphasized. If we consider the daily trends of the aforementioned indexes (shown in Figure 7), it can be seen how they are correlated, while in Figure 8 the scatterplot of their values, normalized to the relative maxima, clearly shows these correlations between the three error indexes. Furthermore, the Pearson–Bravais correlation index $\rho_{xy}$ [36] has been calculated to underline the direct relationship among the error indexes:

$$\rho_{xy} = \frac{\sum_{i=1}^{N}(x_i-\mu_x)(y_i-\mu_y)}{\sqrt{\sum_{i=1}^{N}(x_i-\mu_x)^2}\,\sqrt{\sum_{i=1}^{N}(y_i-\mu_y)^2}}. \tag{7}$$
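The correlation index of Equation (7) can be computed for any two series of daily error values. The sketch below is a plain implementation of that formula; the function name is illustrative, and it assumes that neither series is constant (otherwise the denominator is zero).

```python
import math


def pearson_correlation(x, y):
    """Pearson-Bravais correlation index between two equally long series."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    covariance = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mu_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mu_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)
```

A value close to +1 between, for example, the daily values of two error indexes indicates that they rise and fall together, which is the direct relationship the text refers to.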
However, as shown in Figure 7, the daily evaluation indexes expressed in Equations (3), (5), and (6) can vary a great deal, and are unable to give complete information “at a glance” on the accuracy of the prediction. For example, consider Figures 9 and 10, where the forecasts and the relevant evaluation indexes for 1 April and 4 November 2014, respectively, are depicted. In both cases, daily $NMAE_\%$ values are quite low (around 2–3%), and a forecast assessment based solely on this index could be misleading.
In fact, 1 April was quite a sunny day, and the bell-shaped hourly power curve which was forecast (the red starred line) accurately followed the measured one (the blue circled line). The cloudy day of 4 November 2014 was a different story: the forecast red curve is biased towards the noon hours, while the actual blue curve is biased towards the morning. However, on the second day, the daily $NMAE_\%$ value is lower. This is owing to the normalisation of the mean absolute error with the net capacity of the plant. Regarding the other evaluation indexes, even if they are correlated, they can exceed the 100% cap, as happens for example to $WMAE_\%$ in Figure 7 on day 72.
Figure 7. Example of the daily errors trend. NMAE: normalized mean absolute error; nRMSE: normalized root mean square error; WMAE: weighted mean absolute error.
Figure 8. Normalized daily errors correlated in a scatterplot.
Figure 9. Example of a sunny day forecast (1 April 2014) with the relevant evaluation indexes. $EMAE_\%$: envelope-weighted mean absolute error.
Figure 10. Example of a cloudy day forecast (4 November 2014) with the relevant evaluation indexes.
Starting from these assumptions, and in view of a more useful summary evaluation, an additional performance index is proposed, aiming to provide a value between 0% and 100% of the forecast accuracy. Therefore, the envelope-weighted mean absolute error $EMAE_\%$ is defined as:

$$EMAE_\% = \frac{\sum_{h=1}^{N}|e_h|}{\sum_{h=1}^{N}\max(P_{m,h},P_{p,h})} \cdot 100, \tag{8}$$

where the numerator is the sum of the absolute hourly errors, as in $WMAE_\%$, while the denominator is the sum of the maximum between the forecast and the measured hourly power. In particular, this definition is consistent with a graphical representation of the error, where the numerator corresponds to the yellow area shown in Figures 9 and 10 and the denominator is the sum of the gray and yellow areas highlighted in the same figures. With reference to the above-mentioned days, while the two $NMAE_\%$ values are nearly the same, the $EMAE_\%$ is 11% in the first case and 40% in the second case, and it never exceeds 100%.
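The reason why the envelope-weighted index cannot exceed 100% is that, for non-negative powers, $|P_{m,h} - P_{p,h}| \le \max(P_{m,h}, P_{p,h})$ in every hour. The following minimal sketch implements Equation (8); the function name is illustrative, the powers are assumed to be non-negative, and returning 0 when both curves are identically zero is a convention chosen here, not one stated in the paper.

```python
def emae_percent(measured, forecast):
    """Envelope-weighted mean absolute error, in percent (bounded by 0 and 100)."""
    total_error = sum(abs(m - p) for m, p in zip(measured, forecast))
    envelope = sum(max(m, p) for m, p in zip(measured, forecast))
    if envelope == 0:
        # Both curves are zero everywhere: treated as a perfect forecast (assumption).
        return 0.0
    return 100.0 * total_error / envelope


# Toy hourly powers in W (hypothetical values).
print(emae_percent([0.0, 100.0, 200.0, 100.0], [0.0, 110.0, 180.0, 100.0]))
# Worst case: power is forecast where none was measured.
print(emae_percent([0.0, 0.0], [50.0, 50.0]))
```

In the worst case shown in the usage lines the index reaches exactly 100, whereas an index normalized by the measured energy alone would not even be defined, since the measured energy is zero.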
As with the daily $NMAE_{\%,d}$, in this study we also introduced the mean value of all the $EMAE_{\%,d}$, which are referred to the $d$-th day, calculated over the whole period. Therefore, $\overline{EMAE}_\%$ is the mean of all the daily $EMAE_{\%,d}$ for a given data-set:

$$\overline{EMAE}_\% = \frac{1}{D}\sum_{d=1}^{D} EMAE_{\%,d}. \tag{9}$$
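Averaging the daily values is not the same as computing the index once over the pooled hours, because every day carries the same weight regardless of its energy. The sketch below illustrates Equation (9) under the assumption that the hourly series contains a whole number of days; the function names and the `hours_per_day` parameter are illustrative.

```python
def emae_percent(measured, forecast):
    """Envelope-weighted mean absolute error of one period, in percent."""
    total_error = sum(abs(m - p) for m, p in zip(measured, forecast))
    envelope = sum(max(m, p) for m, p in zip(measured, forecast))
    return 100.0 * total_error / envelope if envelope else 0.0


def mean_daily_emae_percent(measured, forecast, hours_per_day=24):
    """Mean of the daily EMAE values over all D days of the series."""
    if len(measured) % hours_per_day != 0:
        raise ValueError("the series must contain a whole number of days")
    daily_values = [
        emae_percent(measured[start:start + hours_per_day],
                     forecast[start:start + hours_per_day])
        for start in range(0, len(measured), hours_per_day)
    ]
    return sum(daily_values) / len(daily_values)
```

The same pattern applies to the mean of the daily normalized mean absolute errors: only the per-day function changes.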
4. Case Study
Experimental data for this study were taken from the laboratory SolarTechLab [37] located in Milano, Italy (coordinates: 45°30′10.588″ N; 9°9′23.677″ E). In 2014, the DC output power of a single PV module with the following characteristics was recorded:
• PV technology: Silicon mono crystalline,
• Rated power (net capacity of the PV module): 245 Wp,
• Azimuth: 6°30′ (assuming 0° as south direction and counting clockwise),
• Solar panel tilt angle (β): 30°.
The monitoring activity of the PV system parameters lasted from 8 February to 14 December
2014, but the employable data, without interruptions and discontinuities, amount to 216 days.
These 24-hourly samples were used as the database for the forecasting methods comparison.
The PV module was linked to the electric grid by a micro-inverter ABB MICRO-0.25-I-OUTD [38], guaranteeing the optimization of the production. Its operating parameters (DC power included) were transmitted in real time to a workstation for storage using a ZigBee protocol wireless connection.
An important issue that arises is how to avoid missing values and outliers. A suitable pre-processing
procedure, which has already been developed and described in detail in [39], is applied here.
The weather forecasts employed were delivered by a weather service each day at 11 a.m. of the day before the forecasted one, for the exact location of the PV plant. The historical hourly database of these parameters was used to train the network and includes the following parameters:
• $T_{amb}$: ambient temperature (°C),
• $GHI$: global horizontal irradiance (W/m²),
• $G_{POA}$: global irradiance on the plane of the array (W/m²),
• $W_s$: wind speed (m/s),
• $W_d$: wind direction (°),
• $P$: pressure (hPa),
• $R$: precipitation (mm),
• $C_c$: cloud cover (%),
• $C_t$: cloud type (Low/Medium/High).
In addition to these parameters, in order to train the PHANN method, the local time $LT$ (hh:mm) of the day and the Clear Sky Radiation Model $CSRM$ (W/m²) were also provided. These are the eleven inputs of the ANN. The specific settings of the ANN, except for the training database composition (as presented in Section 2), were selected on the basis of a sensitivity analysis, as outlined in a previous study [23]. The ANN settings adopted in this study were:
• neurons in the input layer: 11,
• neurons in the first hidden layer: 11,
• neurons in the second hidden layer: 5,
• neurons in the output layer: 1,
• training algorithm: Levenberg–Marquardt,
• activation function: sigmoid,
• number of trials in the ensemble forecast: 40.
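To give a sense of the size of such a network, the sketch below builds a fully connected 11-11-5-1 layout with random weights and evaluates one forward pass. It is only an illustration of the listed layer sizes: the weight initialization, the use of a sigmoid on the output neuron, and all names are assumptions, and the Levenberg–Marquardt training step is not reproduced.

```python
import math
import random

LAYER_SIZES = [11, 11, 5, 1]  # input, first hidden, second hidden, output


def count_parameters(sizes):
    """Number of weights and biases of a fully connected feed-forward network."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


def init_network(sizes, seed=0):
    """Random weights in [-0.5, 0.5] and zero biases (arbitrary initialization)."""
    rng = random.Random(seed)
    return [
        ([[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)],
         [0.0] * n_out)
        for n_in, n_out in zip(sizes, sizes[1:])
    ]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(network, inputs):
    """Propagate one input vector through all layers (sigmoid on every layer)."""
    activations = inputs
    for weights, biases in network:
        activations = [
            sigmoid(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations


network = init_network(LAYER_SIZES)
print(count_parameters(LAYER_SIZES))  # 198 weights and biases in total
print(forward(network, [0.5] * 11))   # one value in (0, 1), to be rescaled to power
```

With 198 adjustable parameters and 24 samples per day, a 10-day data-set provides only slightly more samples than parameters, which is consistent with the attention the study pays to the data-set size.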
The shares of the data included in the training and in the validation steps have been adjusted by means of another sensitivity analysis. Independently of how many days were employed in the training, the database was divided into two groups containing different amounts of data. The first group was used to train the network and the remaining data for the validation. Finally, the ensemble forecast was performed. This procedure was repeated several times, progressively increasing the number of days employed in the training process. The above-mentioned performance indexes over the whole year were calculated, and the results are plotted in Figures 11 and 12 according to the different shares adopted between training and validation. The results depicted here refer to the training method C1, and the reason for this choice is explained in Section 5. As can be seen, the best results are always obtained by adopting 90% of the data for the training and the remaining 10% for the validation (the blue curve with rhomboidal markers). However, the zoom in the top-right corner of Figure 11 shows that, for the largest amount of data (210 days), adopting 80% of the data for the training and 20% for the validation (the purple dotted curve) provided similar results. The same trends as for $NMAE_\%$ were obtained in Figure 12, where $EMAE_\%$ is shown as a function of the data-set size and the shares of training and validation set.
Figure 11. $NMAE_\%$ as a function of the dataset size.
Figure 12. $EMAE_\%$ as a function of the dataset size.
The same analysis is performed for the training Method A* by comparing the results of $NMAE_\%$ in Figure 13 and the new error definition in Equation (9), shown in Figure 14.
Figure 13. $NMAE_\%$ as a function of the dataset size.
Figure 14. $EMAE_\%$ as a function of the dataset size.
5. Results
The study carried out so far aimed to compare different methods of composing the data-set employed for the training of the ANN, highlighting the most effective ones. The day-ahead forecasts were analysed by means of the indexes defined in Section 3. The graph in Figure 15 shows the trend of the $NMAE_\%$ calculated for the different training-set composition methods, for increasing data-set sizes. The best training method, which performed better overall with all the data-sets considered, was C1. Instead, in the short-range training, with only 10 days available in the data-set, method C2 scored the worst result, with $NMAE_\%$ equal to 6.079. As the data-set size increased, C2 aligned with C1 above 90–130 days. The trends of the other evaluation indexes are shown in Figures 16–18 and confirm the same results. From this perspective, method C2 again scored the worst result, with $EMAE_\%$ equal to 36.51. According to the $NMAE_\%$ shown in Figure 15, methods B1 and B2 generally performed very similarly.
Figure 15. $NMAE_\%$ as a function of the dataset size.
Figure 16. $nRMSE_\%$ as a function of the dataset size.
Figure 17. $WMAE_\%$ as a function of the dataset size.
Figure 18. $EMAE_\%$ as a function of the dataset size.
As a general comment on the reported results, it can be stated that method A is best suited
when the availability of historical data is limited (e.g., newly deployed PV plant), while method
C1 appears to be most effective in the case of a greater availability of data (e.g., at least one year of
power measurements from the considered PV facility). Generally speaking, ensembles composed of
independent trials are most effective. The performance of methods B1 and B2 was halfway between
A and C, and their effectiveness in the case of newly deployed PV plants became significant after
a minimum period of measurement data accumulation (above 60 days).
6. Conclusions
This paper has presented a specific study aimed at analyzing the effect of different approaches in the composition of a training data-set for the day-ahead forecasting of PV power production. In particular, the authors proposed different procedures to set up the training and validation data-sets for the ANN used in the physical hybrid method to perform the power forecast in view of the electricity market. The approaches outlined here can be adopted to set up data-sets based on either historical data retrieved from an existing PV plant or on incremental data measurements in a newly deployed PV facility. In particular, the influence of different data-set compositions on the forecast outcome has been inspected by increasing the training data-set size and by varying the training and validation shares, in order to assess the most effective training method of this machine learning approach, based on commonly used and newly defined performance indexes for the prediction error. The reported results have been validated over a 1-year time range of experimentally measured data from a real PV power plant, considering a comparison of various error measures and showing the best approach for the different cases of either newly deployed or already existing PV facilities.
Author Contributions:
In this research activity, all of the authors were involved in the data analysis and
preprocessing phase, the simulation, the results analysis and discussion, and the manuscript’s preparation.
All of the authors have approved the submitted manuscript. All the authors equally contributed to the writing of
the paper.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Pelland, S.; Remund, J.; Kleissl, J.; Oozeki, T.; De Brabandere, K. Photovoltaic and solar forecasting: State of the art. IEA PVPS Task 2013, 14, 1–36.
2. Paulescu, M.; Paulescu, E.; Gravila, P.; Badescu, V. Weather Modeling and Forecasting of PV Systems Operation; Springer Science & Business Media: Berlin, Germany, 2012.
3. Raza, M.Q.; Nadarajah, M.; Ekanayake, C. On recent advances in PV output power forecast. Sol. Energy 2016, 136, 125–144.
4. Faranda, R.S.; Hafezi, H.; Leva, S.; Mussetta, M.; Ogliari, E. The Optimum PV Plant for a Given Solar DC/AC Converter. Energies 2015, 8, 4853–4870.
5. Dolara, A.; Lazaroiu, G.C.; Leva, S.; Manzolini, G.; Votta, L. Snail Trails and Cell Microcrack Impact on PV Module Maximum Power and Energy Production. IEEE J. Photovolt. 2016, 6, 1269–1277.
6. Omar, M.; Dolara, A.; Magistrati, G.; Mussetta, M.; Ogliari, E.; Viola, F. Day-ahead forecasting for photovoltaic power using artificial neural networks ensembles. In Proceedings of the 2016 IEEE International Conference on Renewable Energy Research and Applications (ICRERA), Birmingham, UK, 20–23 November 2016; pp. 1152–1157.
7. Cali, Ü. Grid and Market Integration of Large-Scale Wind Farms Using Advanced Wind Power Forecasting: Technical and Energy Economic Aspects; Erneuerbare Energien und Energieeffizienz—Renewable Energies and Energy Efficiency; Kassel University Press: Kassel, Germany, 2011.
8. Ni, Q.; Zhuang, S.; Sheng, H.; Kang, G.; Xiao, J. An ensemble prediction intervals approach for short-term PV power forecasting. Sol. Energy 2017, 155, 1072–1083.
9. Simonov, M.; Mussetta, M.; Grimaccia, F.; Leva, S.; Zich, R. Artificial intelligence forecast of PV plant production for integration in smart energy systems. Int. Rev. Electr. Eng. 2012, 7, 3454–3460.
10. Duan, Q.; Shi, L.; Hu, B.; Duan, P.; Zhang, B. Power forecasting approach of PV plant based on similar time periods and Elman neural network. In Proceedings of the 2015 Chinese Automation Congress (CAC), Wuhan, China, 27–29 November 2015; pp. 1258–1262.
11. Gardner, M.; Dorling, S. Artificial neural networks (the multilayer perceptron)—A review of applications in the atmospheric sciences. Atmos. Environ. 1998, 32, 2627–2636.
12. Nelson, M.; Illingworth, W. A Practical Guide to Neural Nets; Physical Sciences; Addison-Wesley: Boston, MA, USA, 1991; 316p.
13. Bose, B.K. Neural Network Applications in Power Electronics and Motor Drives—An Introduction and Perspective. IEEE Trans. Ind. Electron. 2007, 54, 14–33.
14. Ogliari, E.; Dolara, A.; Manzolini, G.; Leva, S. Physical and hybrid methods comparison for the day ahead PV output power forecast. Renew. Energy 2017, 113, 11–21.
15. Elder, J.F.; Abbott, D.W. A comparison of leading data mining tools. In Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 27–31 August 1998; Volume 28.
16. Bergstra, J.; Breuleux, O.; Bastien, F.; Lamblin, P.; Pascanu, R.; Desjardins, G.; Turian, J.; Warde-Farley, D.; Bengio, Y. Theano: A CPU and GPU math compiler in Python. In Proceedings of the 9th Python in Science Conference, Austin, TX, USA, 28 June–3 July 2010; pp. 1–7.
17. Collobert, R.; Kavukcuoglu, K.; Farabet, C. Torch7: A matlab-like environment for machine learning. In Proceedings
of the BigLearn, NIPS Workshop, Sierra Nevada, Spain, 16–17 December 2011; Number EPFL-CONF-192376.
18.
Kalogirou, S. Artificial Intelligence in Energy and Renewable Energy Systems; Nova Publishers: Hauppauge,
NY, USA, 2007.
19.
Duffie, J.A.; Beckman, W.A. Solar Engineering of Thermal Processes; John Wiley & Sons: Hoboken, NJ, USA, 2013.
20.
Gandelli, A.; Grimaccia, F.; Leva, S.; Mussetta, M.; Ogliari, E. Hybrid model analysis and validation for
PV energy production forecasting. In Proceedings of the 2014 International Joint Conference on Neural
Networks (IJCNN), Beijing, China, 6–11 July 2014; pp. 1957–1962.
21.
Dolara, A.; Grimaccia, F.; Leva, S.; Mussetta, M.; Ogliari, E. A Physical Hybrid Artificial Neural Network for
Short Term Forecasting of PV Plant Power Output. Energies 2015,8, 1138–1153.
22.
Rana, M.; Koprinska, I.; Agelidis, V.G. Forecasting solar power generated by grid connected PV systems
using ensembles of neural networks. In Proceedings of the 2015 International Joint Conference on Neural
Networks (IJCNN), Killarney, Ireland, 12–16 July 2015; pp. 1–8.
23.
Grimaccia, F.; Leva, S.; Mussetta, M.; Ogliari, E. ANN Sizing Procedure for the Day-Ahead Output Power
Forecast of a PV Plant. Appl. Sci. 2017,7, 622.
24.
Netsanet, S.; Zhang, J.; Zheng, D.; Hui, M. Input parameters selection and accuracy enhancement techniques
in PV forecasting using Artificial Neural Network. In Proceedings of the 2016 IEEE International Conference
on Power and Renewable Energy (ICPRE), Shanghai, China, 21–23 October 2016; pp. 565–569.
Appl. Sci. 2018,8, 228 16 of 16
25.
Panapakidis, I.P.; Christoforidis, G.C. A hybrid ANN/GA/ANFIS model for very short-term PV power
forecasting. In Proceedings of the 2017 11th IEEE International Conference on Compatibility, Power Electronics
and Power Engineering (CPE-POWERENG), Cadiz, Spain, 4–6 April 2017; pp. 412–417.
26.
Tetko, I.V.; Livingstone, D.J.; Luik, A.I. Neural network studies. 1. Comparison of overfitting and overtraining.
J. Chem. Inf. Comput. Sci. 1995,35, 826–833.
27.
Hansen, L.K.; Salamon, P. Neural network ensembles. IEEE Trans. Pattern Anal. Mach. Intell.
1990
,12,
993–1001.
28.
Perrone, M.P. General averaging results for convex optimization. In Proceedings of the 1993 Connectionist
Models Summer School; Psychology Press: London, UK, 1994; pp. 364–371.
29.
Odom, M.D.; Sharda, R. A neural network model for bankruptcy prediction. In Proceedings of the 1990 IJCNN
International Joint Conference on Neural Networks, San Diego, CA, USA, 17–21 June 1990; pp. 163–168.
30.
Hagan, M.T.; Demuth, H.B.; Beale, M.H. Neural Network Design; Campus Publishing Service, University of
Colorado Bookstore: Boulder, CO, USA, 2014; ISBN 9780971732100.
31.
Chen, S.H.; Jakeman, A.J.; Norton, J.P. Artificial intelligence techniques: an introduction to their use for
modelling environmental systems. Math. Comput. Simul. 2008,78, 379–400.
32.
Monteiro, C.; Fernandez-Jimenez, L.A.; Ramirez-Rosado, I.J.; Munoz-Jimenez, A.; Lara-Santillan, P.M.
Short-Term Forecasting Models for Photovoltaic Plants: Analytical versus Soft-Computing Techniques.
Math. Probl. Eng. 2013,2013, 767284.
33.
Ulbricht, R.; Fischer, U.; Lehner, W.; Donker, H. First Steps Towards a Systematical Optimized Strategy for
Solar Energy Supply Forecasting. In Proceedings of the European Conference on Machine Learning and
Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD 2013), Riva del Garda, Italy,
23–27 September 2013.
34. Kleissl, J. Solar Energy Forecasting and Resource Assessment; Academic Press: Cambridge, MA, USA, 2013.
35.
Ogliari, E.; Grimaccia, F.; Leva, S.; Mussetta, M. Hybrid Predictive Models for Accurate Forecasting in PV
Systems. Energies 2013,6, 1918–1929.
36.
Wolfram, M.; Bokhari, H.; Westermann, D. Factor influence and correlation of short term demand for
control reserve. In Proceedings of the 2015 IEEE Eindhoven PowerTech, Eindhoven, The Netherlands,
29 June–2 July 2015; pp. 1–5.
37.
SolarTechLab Department of Energy. Available online: http://www.solartech.polimi.it/ (accessed on
30 September 2017).
38.
ABB MICRO-0.25-I-OUTD. Availableonline: https://library.e.abb.com/public/0ac164c3b03678c085257cbd0061a446/
MICRO-CDD_BCD.00373_EN.pdf (accessed on 21 January 2018).
39.
Leva, S.; Dolara, A.; Grimaccia, F.; Mussetta, M.; Ogliari, E. Analysis and validation of 24 hours ahead neural
network forecasting of photovoltaic output power. Math. Comput. Simul. 2017,131, 88–100.
© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).