All content in this area was uploaded by Saud M. Al-Fattah on May 16, 2018
Electronic copy available at: http://ssrn.com/abstract=2216337
Artificial Neural Network Models for Forecasting Global Oil Market Volatility
Saud M. Al-Fattah
King Abdullah Petroleum Studies and Research Center (KAPSARC)
P.O. Box 88550, Riyadh 11672, Saudi Arabia; Email: saud.fattah@kapsarc.org
ABSTRACT
Energy market volatility affects macroeconomic conditions and can unduly affect the economies of
energy-producing countries. Large price swings can be detrimental to both producers and
consumers. Market volatility can cause infrastructure and capacity investments to be delayed,
employment losses, and inefficient investments. In sum, the growth potential for energy-producing
countries is adversely affected. Undoubtedly, greater stability of oil prices can reduce uncertainty in
energy markets, for the benefit of consumers and producers alike. Therefore, modeling and
forecasting crude oil price volatility is critical in many financial and investment applications.
The purpose of this paper is to develop new predictive models for describing and forecasting
global oil price volatility using artificial neural network (ANN) modeling technology.
Two ANN models were developed: one for WTI futures price
volatility and the other for WTI spot price volatility. These models were designed,
trained, verified, and tested using historical oil market data. The estimations and predictions from
the ANN models closely match the historical data of WTI from January 1994 to April 2012. These
models appear to capture the dynamics and the direction of oil price volatility very well.
The ANN models developed in this study can be used: as short-term as well as long-term
predictive tools for the direction of oil price volatility, to quantitatively examine the effects of
various physical and economic factors on future oil market volatility, to understand the effects of
different mechanisms for reducing market volatility, and to recommend policy options and
programs incorporating mechanisms that can potentially reduce the market volatility. With this
improved method for modeling oil price volatility, experts and market analysts will be able to
empirically test new approaches to mitigating market volatility. The outcome of this work provides a
roadmap for research to improve predictability and accuracy of energy and crude models.
Keywords: Oil Price volatility, modeling oil market, artificial neural network, forecasting
I. INTRODUCTION
1.1 Oil Market Volatility
Crude oil price plays a major role in global economic activity, and its variation can have effects that
span other markets and impact global economic growth. Volatility is a measure of the degree to which
the prices of a commodity fluctuate. We define it as the standard deviation of price returns over the
sample timeframe. Understanding volatility of the crude oil price is important for several reasons
(Regnier, 2007). First, long-term uncertainty in future oil prices can alter the incentives to develop
new oil fields in producing countries. Second, it can curb the implementation of alternative
energy policies in consumer countries. Third, in the short term, volatility can affect the demand
for storage. In particular, as Pindyck (2004) noted, greater volatility should lead to an increased
demand for storage and an increase in both spot prices and the marginal convenience yield (footnote 1). Moreover,
volatility is critical in the pricing of derivatives, whose trading volume has increased significantly in
the last decade (Matar et al., 2012).
Economic models and policy simulations can forecast energy market behavior to some
extent using information on supply and demand (Kang, 2009; Narayan, 2007; Poon and Granger,
2003; Sadorsky, 2006; Sidorenko, 2002; Wang, 2011). Matar et al. (2012) published a review of
these models. Such models can be misleading, though, unless they explicitly incorporate the factors that
affect market volatility. Market volatility can be affected by such factors as:
• Geopolitical events and swings in international markets;
• Investor behavior, including speculation inside and outside the energy industry;
• The linkage between the physical and the financial markets;
• Political instability that threatens energy supplies;
• The level of transparency of market data regarding supply, demand, costs, stocks, reserves,
and capacity;
• Changes in market regulation and investment rules;
• The shift toward renewable energies and stricter fuel-quality standards; and
• Macro-economic factors such as economic growth, exchange rates and monetary policies.
GARCH-type models, among others, have been the dominant conventional method for
modeling oil prices and volatility. Comparative studies that have examined the factors
influencing oil price volatility and the different approaches to modeling it have generally not reached
a consensus on which of these conventional econometric and financial models forecasts oil price
volatility best. One promising approach that we propose is the
application of artificial intelligence with ANN for modeling and forecasting oil market volatility. The
ANN approach has already been used in other studies to forecast financial markets and
commodity prices, but it has not been thoroughly explored or applied for modeling oil
price volatility. The purpose of this study is therefore to apply the ANN approach to
modeling oil market volatility. It is not the intention of this work to predict
oil prices in absolute terms; rather, the objective is to reasonably predict the direction of their
volatility.
1 The marginal convenience yield is the economic benefit from having physical possession of the commodity.
International energy companies and institutions, investment researchers, and academic
researchers have devoted significant efforts to understanding and mitigating energy market volatility.
However, new developments in energy production, investment patterns, and the geopolitical
environment will require researchers to continue updating and refining their understanding of energy
markets. Additionally, new modeling techniques are being developed that may enhance researchers’
ability to model markets and volatility.
Energy and financial markets are an area that is ripe for exploration with artificial intelligence
technology, given the amount and frequency of market data that are collected. In particular, ANN
can be used to find patterns in data and to model complex relationships between inputs and outputs.
This paper is organized as follows. Section 1 presents an overview of the oil price volatility
and the ANNs. Section 2 presents data sources, preparation and preprocessing. Section 3 presents
the input selection methods and the dimensionality reduction techniques. Sections 4 and 5 discuss
the ANN models design, development procedures, and the results. Section 6 concludes with future
work.
1.2 Artificial Neural Networks
ANNs have seen an enormous escalation of interest during the past few decades. ANNs have proved to be
powerful and effective tools for solving practical and complex problems in the petroleum industry
(Mohaghegh, 2005; Al-Fattah & Startzman, 2003) and in economics and finance (Azoff, 1994; Trippi &
Turban, 1996). Advantages of neural network methods over conventional simulation and regression
techniques include the ability to solve highly nonlinear, complex, and dynamic problems without
prior knowledge of the relationships between input and output variables, and the capability to
handle either continuous (numerical) or categorical data as either inputs or outputs. Furthermore,
the ANN approach is intuitively attractive because it is based on crude low-level models of biological systems,
which simply learn by example. The neural network is fed with data representative of the problem
at hand and is trained on them repeatedly until it learns the pattern and behavior of the data.
ANNs mimic the human brain in thinking with parallel processing of the information in the
nervous system. They can also be defined as computational models of biological neural structures.
Each ANN commonly consists of a number of fully connected nodes or neurons grouped in layers.
These layers can include one input layer, one or more hidden layers, and one output layer. The
number of nodes in each of these layers depends on the number of input and output variables, and
the architecture of the ANN.
Figure 1 depicts the basic configuration of a three-layer multilayer perceptron (MLP)
network: one input layer, one hidden layer, and one output layer. Each node has multiple inputs
and a single output. The input layer has as many nodes as there are independent variables, and the
output layer has as many nodes as there are dependent variables. Each input is modified by a
weight, which multiplies the input value. The input can be raw data or the output of other nodes or
neurons. With reference to an assigned threshold value and activation function, the node
combines these weighted inputs and uses them to determine its output. This output can be either the
final product or an input to another node (Bishop, 1995; Fausett, 1994; Haykin, 1994; Patterson,
1996).
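The node computation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the choice of a logistic activation, and the numeric weights are hypothetical and are not taken from the trained networks of this study.

```python
import math

def sigmoid(x):
    """Logistic activation: squashes any real input into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def node_output(inputs, weights, threshold=0.0):
    """Weighted sum of the inputs, offset by a threshold, passed through the activation."""
    net = sum(w * x for w, x in zip(weights, inputs)) - threshold
    return sigmoid(net)

def mlp_forward(inputs, hidden_weights, output_weights):
    """Forward pass through one hidden layer and a single output node."""
    hidden = [node_output(inputs, w) for w in hidden_weights]
    return node_output(hidden, output_weights)

# Illustrative values only (three inputs, two hidden nodes, one output node).
example_inputs = [0.2, -0.5, 1.0]
example_hidden_weights = [[0.4, 0.1, -0.3], [-0.2, 0.6, 0.5]]
example_output_weights = [0.7, -0.4]
example_prediction = mlp_forward(example_inputs, example_hidden_weights, example_output_weights)
```

Each hidden node sees every input, and the output node sees every hidden node, which is what "fully connected" means in the description above.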
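Figure 1- An example of a three-layer MLP neural network structure.
[Figure: inputs (IN) such as supply, demand, spare capacity, and economic indicators enter the nodes of the input layer, pass through weighted connections to the hidden layer, and reach the output layer, whose output (OUT) is the price volatility.]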
The objective for modeling oil market volatility is to predict the target or dependent variable
(oil price volatility) that varies in time using previous values of the target variable and/or other input
variables. The independent/input variables can include crude oil prices, supply, demand, spare
capacity, and economic indicators. This time series problem can be solved using the following
network types: multilayer perceptron (MLP), radial basis function (RBF), generalized regression
neural network (GRNN), and linear (Fausett, 1994; Haykin, 1994; Patterson, 1996). The linear
model is essentially conventional regression analysis. In this study, I experimented with the first
three network types (MLP, RBF, and GRNN) and found that the MLP network outperforms the other
network types for this particular study.
The design and development of an ANN model involve seven important procedures. These
procedures include: 1. Data acquisition and preparation, 2. Data preprocessing, 3. Inputs selection
and dimensionality reduction, 4. Network design, 5. Network training, 6. Verifying, and 7. Testing.
Figure 2 is a flowchart illustrating the ANN development strategies implemented in this study (Al-
Fattah & Al-Naim, 2009).
II. DATA
2.1 Input Data and Sources
The data used to develop the ANN model for oil market volatility were collected from the U.S. Energy
Information Administration (EIA, 2012). West Texas Intermediate (WTI) daily crude oil futures and spot
prices were used from Jan. 1994 to April 2012. Because the other variables were available only at monthly
frequency, the WTI daily data were converted to a monthly time scale by taking the oil price on the tenth day of
each month as the observation for that month. This is done to avoid proximity with contract
expiration dates. For months whose tenth day is not a
trading day, the nearest previous or succeeding trading day is used instead. The daily returns have been adjusted for
intermediate non-trading days, which include weekends and holidays (Pindyck, 2004). In this study,
we used crude prices for the first-month contract. Monthly data for the period Jan. 1994 to
April 2012 were used for the input variables, which include, among others, oil supply from major
countries, U.S. crude oil inventory, oil consumption in OECD and non-OECD countries, U.S.
ending stocks of crude oil, crude oil spare capacity, economic indicators, and world total petroleum
stocks.
Figure 2- Flowchart of ANN design and development procedure deployed in this study.
Source: Revised from Al-Fattah and Al-Naim, 2009.
These input variables were selected because they are influencing factors on oil market volatility as
discussed by Matar et al. (2012). Oil supply, demand, inventory levels and storage are among the
factors impacting oil price volatility. The formula used to compute the price return is given by
$$ r_t = 100 \ln\left(\frac{P_t}{P_{t-1}}\right) \tag{1} $$
where r_t is the price return and P_t is the oil price at time t.
The following equation adjusts for the n-1 intermediate non-trading days, following the procedure
detailed by Pindyck (2004):
$$ r_t = 100 \, \frac{\ln\left(P_t / P_{t-n}\right)}{s_n / s_1} \tag{2} $$
where P_{t-n} is the price n physical days earlier, s_1 is the standard deviation of log price changes
over an interval of one physical day, and s_n
is the standard deviation of log price changes over an interval of n physical days. When computing
the returns from spot prices, an additional adjustment is necessary to account for the convenience
yield (Pindyck, 2004). The price volatility can then be computed as either the squared return (v_t =
r_t^2) or the absolute return (v_t = |r_t|).
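As a minimal sketch of these calculations, the Python code below computes returns and the two volatility measures from a price series. It assumes that the return of Equation 1 is the percentage log price change between consecutive observations; the function names and the sample prices are hypothetical, and the non-trading-day and convenience-yield adjustments are deliberately left out.

```python
import math

def percent_log_returns(prices):
    """Percentage log price change between consecutive observations."""
    return [100.0 * math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

def squared_volatility(returns):
    """Volatility measured as the squared return, v_t = r_t^2."""
    return [r * r for r in returns]

def absolute_volatility(returns):
    """Volatility measured as the absolute return, v_t = |r_t|."""
    return [abs(r) for r in returns]

# Hypothetical monthly prices in USD per barrel, for illustration only.
sample_prices = [95.0, 100.0, 90.0, 99.0]
sample_returns = percent_log_returns(sample_prices)
sample_vol_squared = squared_volatility(sample_returns)
sample_vol_absolute = absolute_volatility(sample_returns)
```

Both measures discard the sign of the return, so a sharp fall and a sharp rise of the same size contribute equally to volatility.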
2.2 Data Preparation
Data acquisition, preparation, and preprocessing are considered the most important and most
time-consuming tasks in the ANN model development process (Fig. 2). Determining the number of data points
required to develop a neural network model is often difficult. There are some
empirical rules in the literature that relate the number of data points needed to the size of the
network. One of these rules suggests that there should be at least 10 times as many data points as
connections in the network (Hill & Lewicki, 2007). In this study, we have a total of 220
observations and consider about 22 input variables, a ratio of about 10 to 1, so we have followed this
rule of thumb. In fact, the number of data points required depends on the difficulty of the problem that the
network is trying to model. In general, most practical problems require hundreds or thousands of
data points. As the number of input variables increases, the number of data points needed
increases nonlinearly. Even a small number of input variables can sometimes require a large number of
data points. This problem is known as “the curse of dimensionality.” If the data set is larger but
still limited, this can be compensated for by creating an ensemble of networks, in which
the individual networks are each trained using a different resampling of the available data. The average of
the predictions from the best networks is then used (Hill & Lewicki, 2007).
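The resampling idea behind an ensemble can be illustrated with the short Python sketch below. It is an assumption-laden simplification: a least-squares line stands in for each trained network, every member (rather than only the best ones) is averaged, and all names and data are hypothetical.

```python
import random

def fit_line(xs, ys):
    """Ordinary least-squares line; a simple stand-in for training one network."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x

def train_ensemble(xs, ys, n_members=10, seed=0):
    """Fit one member per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    members = []
    while len(members) < n_members:
        idx = [rng.randrange(n) for _ in range(n)]
        # Skip degenerate resamples that contain a single distinct x value.
        if len({xs[i] for i in idx}) < 2:
            continue
        members.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return members

def ensemble_predict(members, x):
    """Average the predictions of all ensemble members."""
    return sum(slope * x + intercept for slope, intercept in members) / len(members)

# Toy data lying exactly on y = 2x + 1, for illustration only.
toy_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
toy_y = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0]
toy_members = train_ensemble(toy_x, toy_y)
toy_forecast = ensemble_predict(toy_members, 6.0)
```

With noisy data, each resample would yield a slightly different member, and averaging them reduces the variance of the combined prediction.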
2.3 Data Preprocessing
Data preprocessing is an important procedure in the development of ANN models. The
preprocessing procedures used in the construction process of the ANN model of this study are
input/output normalization and transformation (Hill & Lewicki, 2007).
2.3.1 Transformation
In my experience, ANNs perform better with data that are normally distributed and not seasonally
adjusted. Input data that exhibit trends or periodic variations make data transformation
necessary. There are different ways to transform the input variables into forms that make the
input data easier for the neural network to interpret and faster to train on. Examples of such
transformation forms include the variable first derivative, relative variable difference, natural
logarithm of relative variable, square root of variable, and trigonometric functions. The choice of the
first-derivative transform, for example, can remove the trend in each input variable, thus helping to
reduce the multicollinearity among the input variables. Using the first derivative also results in
greater fluctuation and contrast in the values of the input variables. This improves the ability of the
ANN model to detect significant changes in patterns.
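Three of the transformations listed above can be sketched as follows. The discrete first difference is used here as the counterpart of the first derivative for monthly data; that choice, the function names, and the sample series are assumptions made for illustration.

```python
import math

def first_difference(series):
    """Discrete first derivative: change between consecutive observations."""
    return [b - a for a, b in zip(series, series[1:])]

def relative_difference(series):
    """Change between consecutive observations relative to the earlier value."""
    return [(b - a) / a for a, b in zip(series, series[1:])]

def log_relative(series):
    """Natural logarithm of the ratio of consecutive observations."""
    return [math.log(b / a) for a, b in zip(series, series[1:])]

# A hypothetical input variable with a steady upward trend.
trending = [100.0, 102.0, 104.0, 106.0, 108.0]
detrended = first_difference(trending)
```

For the trending series, the first difference is constant, which shows how the transformation removes a linear trend from an input variable.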
2.3.2 Normalization
Normalization is a process of standardizing the possible numerical range that the input data can
take. It enhances the fairness of training by preventing an input with large values from swamping
out another input that is equally important but with smaller values. Normalization is also
recommended because the network training parameters can be tuned for a given range of input data;
thus, the training process can be carried over to similar tasks.
The mean/standard deviation normalization method was used to normalize all the input and
output variables of the neural network model developed in this study. Mean/standard deviation
preprocessing is the most commonly used method and generally works well in almost every case.
It has the benefits that it normalizes the input variable without any loss of information and that the
original values can easily be recovered. Each input variable as well as the output was normalized
using the following equation:
$$ X'_i = \frac{X_i - \mu_i}{\sigma_i} \tag{3} $$
where X' = normalized input/output vector, X = original input/output vector, µ = mean of the
original input/output vector, σ = standard deviation of the input/output vector, and i = index of the
input/output vector. Each input/output variable was normalized using the equation above with its own
mean and standard deviation values. This normalization process was applied to the whole data set,
including the training and testing sets. The single set of normalization parameters of each variable
(i.e., the standard deviation and the mean) can then be preserved and applied to new data during
forecasting.
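A minimal Python sketch of this normalization, including the reuse of the stored parameters on new data, is given below. The function names and sample values are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which estimator was used.

```python
import statistics

def fit_normalizer(values):
    """Return the mean and standard deviation used to normalize one variable."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # assumption: population standard deviation
    return mean, std

def normalize(values, mean, std):
    """Subtract the mean and divide by the standard deviation."""
    return [(v - mean) / std for v in values]

def denormalize(values, mean, std):
    """Recover the original scale from normalized values."""
    return [v * std + mean for v in values]

# Hypothetical observations of one input variable.
history = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mu, sigma = fit_normalizer(history)
history_normalized = normalize(history, mu, sigma)

# The stored parameters are reused for a new observation at forecasting time.
new_normalized = normalize([6.0], mu, sigma)
```

Because the transformation is linear, `denormalize` returns the original values exactly up to rounding, which is the "no loss of information" property noted above.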
III. INPUT SELECTION AND DIMENSIONALITY REDUCTION
One of the critical tasks in the design of the ANN model is the decision of which of the available
variables to use as inputs to the neural network. The only guaranteed method to select the best input
set is to train networks with all possible input sets and all possible architectures, and to select the
best. Practically, this is impossible for any significant number of potential input variables. It becomes
thornier when multicollinearity exists among some of the input variables, which means that any set
of variables might be sufficient.
Some neural network architectures can actually learn to ignore redundant or insignificant
variables. Other architectures are adversely affected, and in all situations a large number of input
variables means that a larger number of training data points is needed to prevent the trained
network from over-learning, or memorizing, the data. Over-learning (over-fitting) is a major
problem that degrades network performance significantly. It can cause the
network to perform very well with the training data set but poorly with the testing set.
Consequently, the performance of a network can be improved by reducing the number of redundant
and insignificant input variables, leading the network to generalize and not memorize. There are
sophisticated algorithms for selecting input variables. These techniques
include the genetic algorithm, forward and backward stepwise algorithms, and principal component
analysis (Hill & Lewicki, 2007; Goldberg, 1989).
The genetic algorithm is an optimization algorithm that can search efficiently for binary
strings by processing an initially random population of strings using an artificial system that mimics
natural selection. Here each binary string acts as a mask marking which candidate variables are
included. The genetic algorithm is therefore an efficient technique for identifying
significant variables when there are large numbers of variables, and it offers a valuable verification for
smaller numbers of variables. In particular, the genetic algorithm works well at recognizing
interdependencies between variables located close together on the masking strings. The genetic
algorithm can detect subsets of variables that are not revealed by other methods. However, the
method is time consuming; if the number of variables is large, it typically
requires training and testing many thousands of networks, which can mean running the program for a
couple of days (Goldberg, 1989; Fausett, 1994).
Forward and backward stepwise algorithms usually run much faster than the genetic
algorithm if there are a reasonable number of variables. Both techniques, forward and backward
stepwise selection, are also equally effective if there are not too many complex interdependencies
between variables. Forward stepwise input-selection technique works by adding variables one at a
time, while the backward stepwise input-selection technique operates starting with the complete set
of variables then removing variables one at a time (Hill & Lewicki, 2007).
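The forward stepwise procedure can be sketched as a greedy loop, shown below in Python. The scoring function is a hypothetical stand-in: in practice it would train a network on the candidate subset and return its performance on the verification data. The variable names and gain values are invented for illustration.

```python
def forward_stepwise(candidates, score, min_gain=1e-9):
    """Greedily add the variable that most improves the score until no gain remains."""
    selected = []
    best = score(selected)
    remaining = list(candidates)
    while remaining:
        trial_scores = {name: score(selected + [name]) for name in remaining}
        winner = max(trial_scores, key=trial_scores.get)
        if trial_scores[winner] - best < min_gain:
            break  # no remaining variable improves the model enough
        selected.append(winner)
        remaining.remove(winner)
        best = trial_scores[winner]
    return selected

# Hypothetical contribution of each candidate variable to model performance.
toy_gain = {"inventory_change": 0.5, "opec_supply": 0.3, "noise": 0.0}

def toy_score(subset):
    """Stand-in for the verification performance of a network trained on `subset`."""
    return sum(toy_gain[name] for name in subset)

chosen = forward_stepwise(list(toy_gain), toy_score)
```

Backward stepwise selection mirrors this loop: it starts from the full set and removes, one at a time, the variable whose removal hurts the score least.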
Another common approach to dimensionality reduction is principal component analysis
(PCA) (Bishop, 1995; Hill & Lewicki, 2007), which can be represented as a linear network. PCA can
often extract a small number of components from fairly high-dimensional original data while
preserving the important structure of the data.
In this study, we used the genetic algorithm and the forward and backward stepwise techniques to
determine the significant input variables of the neural network from a total of fifty-eight candidate input
variables. We found that these techniques yield almost the same results. Using these techniques, we
eliminated the redundant input variables, a reduction of about 62%, and selected the best
twenty input variables, which contributed significantly to the model’s
performance. All twenty of these significant input variables are retained in the network
model development. The best predictors of WTI crude oil price volatility include, but are not
limited to, the U.S. crude inventory change, U.S. crude oil production, OPEC oil production,
OECD inventory, oil consumption by FSU, India and China, and OECD oil consumption. The
results show that the US business inventory change is the most significant input parameter
included in the model, with an F-value greater than one. The OPEC total petroleum supply is the least
significant of the selected input variables, with an F-value of one, yet it is
important to retain it in the model. We set the threshold for selecting an input
variable at an F-value of one or above. It should be noted that all these input variables
are normalized.
IV. ANN MODEL DESIGN
This section discusses the design aspects for selecting the neural network architecture. These include
the learning or training algorithm, the number of nodes in each layer, the number of hidden layers,
and the type of transfer function.
4.1 Architecture
The neural network architecture determines how the weights are interconnected in the
network and specifies the type of learning rules that may be used. Selection of network architecture
is one of the first things done in setting up a neural network. The multilayer perceptron (MLP)
(Haykin, 1994; Azoff, 1994; Trippi & Turban, 1996) is the most commonly used architecture and is
generally recommended for most applications; hence it is selected to be used for this study.
4.2 Learning Algorithm
Selection of a learning rule is also an important step because it affects the determination of input
functions, transfer functions, and associated parameters. The network used in this study is based on
a back-propagation (BP) design, the most widely recognized and most commonly used supervised-
learning algorithm (Haykin, 1994).
The fundamental structure of a back-propagation neural network consists of an input layer,
one or more hidden layers, and an output layer. Fig. 1 shows the architecture of a BP neural
network. A layer consists of a number of processing elements or neurons. The layers are fully
connected, indicating that each neuron of the input layer is connected to each hidden layer node.
Similarly, each hidden layer node is connected to each output layer node. The number of nodes,
needed for the input and output layers, depends on the number of input and output variables
designed for the neural network.
4.3 Transfer Function
A transfer function acts on the value returned by the input function. The input function associates
the input vector with the weight vector to obtain the net input to the node given a particular input
vector (Haykin, 1994). Each of the transfer functions introduces nonlinearity into the neural
network, enriching its representational capacity. Actually, one of the ANN advantages over the
traditional regression techniques is the nonlinearity introduced by the transfer function to the
network. There are a number of transfer functions. Among those are sigmoid, arctan, sin, linear,
Gaussian, and Cauchy. The most commonly used transfer function is the sigmoid function. It
squashes and compresses the input function when the variables take on large positive or large
negative values. Large positive values asymptotically approach one, while large negative values are
squashed to zero. The sigmoid is given by (Haykin, 1994),
$$ f(x) = \frac{1}{1 + e^{-x}} \tag{4} $$
Figure 3 is a typical plot of the sigmoid function. In essence, the activation function acts as a
nonlinear gain for the node. The gain is actually the slope of the sigmoid at a specific point. It varies
from a low value at large negative inputs, to a high value at zero input, and then drops back toward
zero as the input becomes large and positive.
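The behavior of the sigmoid and of its gain can be checked numerically with the short Python sketch below. The function names are illustrative; the gain uses the standard identity that the slope of the logistic function equals f(x)(1 - f(x)).

```python
import math

def sigmoid(x):
    """Logistic function of Equation 4."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_gain(x):
    """Slope of the sigmoid at x, using f'(x) = f(x) * (1 - f(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

# Evaluate the function and its gain across the input range plotted in Figure 3.
grid = [-20, -10, -5, 0, 5, 10, 20]
outputs = [sigmoid(x) for x in grid]
gains = [sigmoid_gain(x) for x in grid]
```

The gain peaks at 0.25 for zero input and falls toward zero at both ends of the grid, matching the description above.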
Figure 3- Sigmoid function.
[Plot: output (0 to 1) versus input (-20 to 20).]
4.4 Example:
To illustrate the backpropagation training algorithm, consider the simple network shown in Figure 4.
Figure 4- Example neural network illustrating backpropagation training algorithm.
[Figure: inputs I1 = 0.35 and I2 = 0.90; weights into the top hidden neuron are 0.1 (from I1) and 0.8 (from I2); weights into the bottom hidden neuron are 0.4 (from I1) and 0.6 (from I2); weights into the output neuron are 0.3 (from the top hidden neuron) and 0.9 (from the bottom hidden neuron); target = 0.5.]
This network has two nodes in the input layer, two nodes in the hidden layer,
and a single node in the output layer. Assume that the nodes have a sigmoid activation function. We will
(i) Perform a forward pass on the network.
(ii) Perform a reverse pass (training) once (target = 0.5).
(iii) Perform a further forward pass and comment on the result.
(i) Input to top neuron = (0.35 x 0.1) + (0.9 x 0.8) = 0.755. Out = 1/(1 + e^(-0.755)) = 0.68.
Input to bottom neuron = (0.9 x 0.6) + (0.35 x 0.4) = 0.68. Out = 0.6637.
Input to final neuron = (0.3 x 0.68) + (0.9 x 0.6637) = 0.80133. Out = 0.69.
(ii) Output error δ = (target - output) x (1 - output) x output = (0.5 - 0.69) x (1 - 0.69) x 0.69 = -0.0406.
Calculating new weights for output layer:
w1+ = w1 + (δ x input) = 0.3 + (-0.0406 x 0.68) = 0.272392
w2+ = w2 + (δ x input) = 0.9 + (-0.0406 x 0.6637) = 0.87305
Calculating errors for hidden layer, using the updated output weights and each hidden neuron's output:
δ1 = δ x w1+ x (1 - output) x output = -0.0406 x 0.272392 x (1 - 0.68) x 0.68 = -2.406 x 10^-3
δ2 = δ x w2+ x (1 - output) x output = -0.0406 x 0.87305 x (1 - 0.6637) x 0.6637 = -7.916 x 10^-3
New hidden layer weights:
w3+ = 0.1 + (-2.406 x 10^-3 x 0.35) = 0.09916.
w4+ = 0.8 + (-2.406 x 10^-3 x 0.9) = 0.7978.
w5+ = 0.4 + (-7.916 x 10^-3 x 0.35) = 0.3972.
w6+ = 0.6 + (-7.916 x 10^-3 x 0.9) = 0.5928.
(iii) The old error was (0.5 - 0.69) = -0.19, and the new error is (0.5 - 0.68205) = -0.18205.
Therefore, the magnitude of the error has been reduced. The training continues until the network converges to a solution
with acceptable tolerance of error, or reaches the maximum defined number of iterations or epochs.
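The worked example can be reproduced with the Python sketch below. It follows the example's own conventions, which are assumptions of the illustration: a learning rate of one, no threshold terms, and hidden-layer errors computed from the already-updated output weights. The function and variable names are hypothetical. The code carries full floating-point precision, so its results agree with the hand calculation only to about three decimal places.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, hidden_w, output_w):
    """Return the hidden-layer outputs and the network output."""
    hidden_out = [sigmoid(sum(w * x for w, x in zip(ws, inputs))) for ws in hidden_w]
    output = sigmoid(sum(w * h for w, h in zip(output_w, hidden_out)))
    return hidden_out, output

def train_once(inputs, target, hidden_w, output_w):
    """One reverse pass: update output weights, then hidden weights."""
    hidden_out, output = forward(inputs, hidden_w, output_w)
    delta = (target - output) * (1.0 - output) * output
    new_output_w = [w + delta * h for w, h in zip(output_w, hidden_out)]
    # As in the worked example, hidden errors use the updated output weights.
    hidden_delta = [delta * w * (1.0 - h) * h for w, h in zip(new_output_w, hidden_out)]
    new_hidden_w = [[w + d * x for w, x in zip(ws, inputs)]
                    for ws, d in zip(hidden_w, hidden_delta)]
    return new_hidden_w, new_output_w

inputs = [0.35, 0.9]
hidden_w = [[0.1, 0.8], [0.4, 0.6]]   # rows: top neuron, bottom neuron
output_w = [0.3, 0.9]
target = 0.5

_, out_before = forward(inputs, hidden_w, output_w)
new_hidden_w, new_output_w = train_once(inputs, target, hidden_w, output_w)
_, out_after = forward(inputs, new_hidden_w, new_output_w)
```

Repeating `train_once` in a loop would continue to shrink the error until one of the stopping criteria is met.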
V. ANN MODEL DEVELOPMENT
5.1 Training, Verifying, and Testing
In this study, the data (January 1994 to April 2012) are partitioned into three groups: a training set
(70% of data), a verification or validation set (15% of data), and a testing set (15% of data). This
apportioning is recommended by others (Hill and Lewicki, 2007); however, an
80%-10%-10% apportioning is also possible. Table 1 shows the partitioning of the WTI crude
price data sets. In the training process, the network is exposed repeatedly to the input data, and the weights
and thresholds of the post-synaptic potential function are adjusted using a backpropagation training
algorithm until the network predicts the output well enough to meet the
error threshold requirements.
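The partitioning can be sketched as a chronological split, as suggested by the consecutive periods in Table 1. The function name and the placeholder series below are hypothetical.

```python
def chronological_split(series, train_frac=0.70, valid_frac=0.15):
    """Split a time-ordered series into training, validation, and testing sets."""
    n = len(series)
    n_train = round(n * train_frac)
    n_valid = round(n * valid_frac)
    train = series[:n_train]
    valid = series[n_train:n_train + n_valid]
    test = series[n_train + n_valid:]  # the remainder is the testing set
    return train, valid, test

# Placeholder for the 220 monthly observations (Jan. 1994 to Apr. 2012).
monthly_observations = list(range(220))
train_set, valid_set, test_set = chronological_split(monthly_observations)
```

Keeping the sets in time order means that the testing set contains only observations that come after everything used for training and validation.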
Normally, the training data subset is presented to the network in several or even hundreds of
iterations. Each presentation of the training data to the network for adjustment of weights and
thresholds is referred to as an epoch or iteration. This process continues until the overall error
function has been sufficiently minimized (see the above example for illustration). The convergence
criteria for the trained network are either a residual error of 1.0 x 10^-6 or less or a maximum of
1000 epochs, whichever occurs first.
The overall error is also computed for the second subset of the data, which is sometimes
referred to as the verification or validation data. The verification data act as a watchdog and take
no part in the adjustment of weights and thresholds during training, but the network’s performance
is continually checked against this subset as training continues. The training is stopped when the
error for the verification data stops decreasing or starts to increase.
Use of the verification subset of data is important because, with unlimited training, the
neural network usually starts over-learning (over-fitting) the training data, which leads to the
memorization problem: given no restrictions
on training, a neural network may match the training data almost perfectly but perform very
poorly on new data such as the testing set. A good ANN model is characterized by its ability to
generalize. Using the verification subset to stop training at the point where generalization potential
is best is a critical consideration in training neural networks.
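The stopping logic can be sketched in Python as follows. The `run_epoch` callback, the `patience` parameter, and the simulated error curves are hypothetical; the error tolerance of 1.0 x 10^-6 and the limit of 1000 epochs are the convergence criteria stated earlier in this section.

```python
def train_with_early_stopping(run_epoch, max_epochs=1000, tolerance=1.0e-6, patience=1):
    """Stop when the training error is small enough, the verification error
    stops improving, or the epoch limit is reached.

    run_epoch(epoch) is assumed to perform one training epoch and return
    (training_error, verification_error).
    """
    best_valid = float("inf")
    best_epoch = 0
    stalled = 0
    for epoch in range(1, max_epochs + 1):
        train_error, valid_error = run_epoch(epoch)
        if valid_error < best_valid:
            best_valid, best_epoch, stalled = valid_error, epoch, 0
        else:
            stalled += 1  # verification error did not improve this epoch
        if train_error <= tolerance or stalled >= patience:
            break
    return best_epoch, best_valid

def simulated_epoch(epoch):
    """Toy error curves: training error keeps falling, verification error
    reaches its minimum at epoch 5 and then rises (over-learning)."""
    return 1.0 / epoch, 0.1 + (epoch - 5) ** 2 / 100.0

stop_epoch, stop_error = train_with_early_stopping(simulated_epoch)
```

In this toy run the training error would keep decreasing indefinitely, yet training halts as soon as the verification error turns upward, which is the watchdog role described above.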
A third subset of data, the testing set, serves as an additional independent
check on the generalization capabilities of the neural network and acts as a blind test of the
performance and accuracy of the network (Al-Fattah, 2011; Fausett, 1994). In this study, several
neural network architectures and training algorithms were tried to achieve the best results.
The software tool used to develop the ANN model in this study is STATISTICA
(StatSoft, 2011).
Table 1– Partitioning of WTI Crude Data Sets
Data Set                    No. of Data Points   No. of Years   Period
Estimation (Training)       154 (70%)            ~ 13           Jan. 1994 – Oct. 2006
Validation (Verification)   33 (15%)             ~ 3            Nov. 2006 – Jul. 2009
Forecasting (Testing)       33 (15%)             ~ 3            Aug. 2009 – Apr. 2012
5.2 Discussion of Results
The ANN model for oil market volatility developed in this study was successfully trained,
verified, and checked for generalization. The best ANN model has the following
characteristics: MLP network architecture, backpropagation training algorithm, logistic transfer
function, and three layers (input layer with four nodes, hidden layer with four nodes, and output layer
with one node).
The result of the ANN model for WTI oil market volatility is shown in Figure 5. As can
be seen from this figure, the ANN model matches the observed oil price volatility and
captures very well the direction and the path of the price volatility, whether upward or
downward. The ANN model also captured the negative oil shock that took place in 2008 due
to the world financial crisis. It was not the intention of this study to predict the oil price with high
accuracy. Rather, the purpose is to model and predict the direction and the path of the oil price
volatility. I believe achieving this objective is of significant importance to oil producers, consumers,
investors, and traders. Figure 6 shows that the errors of the ANN model are distributed roughly
symmetrically around zero, indicating that they do not depart markedly from a normal distribution.
Figure 7, which presents the normal probability plot, supports this observation by showing that most
of the data fall on the straight line. A cross-plot of the observed volatility data and the ANN model
results, depicted in Figure 8, indicates close agreement between the observed and predicted
volatility, with the majority of the data falling on the 45° straight line.
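The residual check behind a normal probability plot can be sketched as follows. This is an illustration only: the residuals are synthetic draws, not the model's errors, and the plotting position (i + 0.5)/n is an assumed convention that may differ from the one STATISTICA uses. The function name `normal_probability_points` is hypothetical.

```python
import random
from statistics import NormalDist


def normal_probability_points(residuals):
    """Pair each sorted residual with its expected standard-normal quantile."""
    ordered = sorted(residuals)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [
        (obs, std_normal.inv_cdf((i + 0.5) / n))
        for i, obs in enumerate(ordered)
    ]


rng = random.Random(42)
residuals = [rng.gauss(0.0, 0.6) for _ in range(220)]  # synthetic stand-in data
points = normal_probability_points(residuals)

# Print every 44th pair to show how observed and expected values line up.
for obs, expected in points[::44]:
    print(f"observed = {obs:+.3f}   expected normal value = {expected:+.3f}")
```

If the residuals are close to normal, plotting the expected values against the observed values gives points that lie near a straight line.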
Figure 5- Results of ANN model for WTI futures price volatility.
[Plot: normalized WTI crude futures price volatility, Jan. 1994 – Apr. 2012; actual vs. ANN model across the estimation, validation, and forecasting periods.]
Figure 6- Residuals distribution of ANN model of WTI crude futures price volatility.
[Histogram: residual errors distribution [1.MLP 58-14-1]; frequency vs. residual errors of volatility (output).]
Figure 7- Normal probability plot of WTI crude futures price volatility.
[Plot: normal probability plot of the residuals; expected normal value vs. observed value.]
Figure 8- A cross-plot of WTI crude futures price volatility model.
[Cross-plot of ANN model vs. actual data: model output against target values.]
Statistical error and graphical analyses (Hill & Lewicki, 2007) are used in this study to
examine the performance of the ANN model. The statistical parameters used are the output data
standard deviation (SDo), output error mean (Er), output error standard deviation (SDe), output
absolute error mean (Ea), correlation coefficient (R), mean squared error (Ems), and root mean
squared error (Erms). It is good practice to judge and compare the performance of competing
networks using the statistical error analysis of the verification or testing subsets, not the
training set. The training set almost always yields better results than the validation or testing
sets because the network is fitted directly to the training data, which make up the majority of the
data, and so learns their structure and pattern very well. Therefore, a good measure of the
network’s performance is its error on test data that the network has not seen during the
development of the model.
Table 2 presents the results of the statistical analysis for the ANN oil-market volatility model
developed in this study. The results are shown for the training, validation, and testing sets as
well as for the complete data set. As is usually the case, the errors of the testing (forecasting,
out-of-sample) set are higher than those of the training set. The standard deviation ratio (SDR) is
less than one for the complete data set and for each subset except the validation set, for which it
is above unity. The Ems values are 0.186 for the training set, 0.858 for the verification set, and
0.851 for the testing set, with an overall Ems value of 0.384 for the entire dataset. The correlation
coefficients for the model subsets range from about 0.6 to 0.9, with an overall R value of 0.8 for
the entire dataset, indicating a good overall fit. In oil market and pricing models, correlation
coefficients of about 0.6 and higher are reasonable and not considered low. Overall, the model fits
the complete data set well, with an Ems value of 0.384 and a correlation coefficient of 0.8.
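The ratios quoted above can be checked with simple arithmetic on the Table 2 values. The sketch below assumes that SDR is the error standard deviation divided by the output data standard deviation; the paper does not define SDR explicitly, but this reading reproduces the overall value of about 0.7 quoted in the conclusions. It also checks that each reported Erms is the square root of the corresponding Ems to within rounding.

```python
import math

# Values transcribed from Table 2 (WTI futures price volatility model).
table2 = {
    "all":        {"sd_output": 0.882, "sd_error": 0.617, "mse": 0.384, "rmse": 0.620},
    "training":   {"sd_output": 0.895, "sd_error": 0.433, "mse": 0.186, "rmse": 0.431},
    "validation": {"sd_output": 0.580, "sd_error": 0.808, "mse": 0.858, "rmse": 0.926},
    "testing":    {"sd_output": 1.060, "sd_error": 0.936, "mse": 0.851, "rmse": 0.923},
}

# Assumed definition: SDR = error standard deviation / output data standard deviation.
sdr = {name: row["sd_error"] / row["sd_output"] for name, row in table2.items()}

# Gap between the reported Erms and sqrt(Ems); small gaps are rounding effects.
rmse_gap = {name: abs(math.sqrt(row["mse"]) - row["rmse"]) for name, row in table2.items()}

for name in table2:
    print(f"{name:<10} SDR = {sdr[name]:.3f}   |sqrt(Ems) - Erms| = {rmse_gap[name]:.4f}")
```

Under this assumed definition, an SDR below one means the model's errors vary less than the data themselves, so the validation set is the only subset where they vary more.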
We also developed another ANN model for WTI spot price volatility. Its results are similar to
those of the WTI futures price volatility model. Figure 9 shows the excellent agreement
between the actual WTI spot price volatility and the ANN model predictions.
Table 2. Results of ANN model performance for WTI futures price volatility.

Parameter                              All Data Sets   Training Set   Validation Set   Testing Set
Output data standard deviation (SDo)   0.882           0.895          0.580            1.060
Mean relative error (Er)               -0.077          -0.005         -0.475           -0.027
Error standard deviation (SDe)         0.617           0.433          0.808            0.936
Mean absolute error (Ea)               0.467           0.344          0.762            0.755
Correlation coefficient (R)            0.794           0.903          0.573            0.588
Mean squared error (Ems)               0.384           0.186          0.858            0.851
Root mean squared error (Erms)         0.620           0.431          0.926            0.923
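Figure 9- ANN Model Prediction of WTI spot price volatility.
[Plot: WTI crude spot price volatility, Jan. 1994 - Apr. 2012; actual vs. ANN model across the estimation, validation, and forecasting periods.]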
VI. CONCLUSIONS AND FUTURE WORK
6.1 Conclusions
This study applied a novel approach for modeling and predicting oil market volatility using
ANN. Two ANN models were successfully developed: one for WTI futures price volatility and the
other for WTI spot price volatility. The ANN models were developed with three layers, an MLP
architecture, and a backpropagation training algorithm. Input selection techniques were deployed to
identify the variables significant to the performance of the network and to eliminate redundant
variables. The ANN models were designed, trained, verified, and tested using historical oil
market data. The estimations and predictions from the ANN models closely match the
data of WTI crude prices from January 1994 to April 2012. The ANN models appear to capture very
well the dynamics and the direction of the volatility for both WTI crude spot and futures prices. The
predictions of the ANN models show an excellent agreement with the observed data, yielding an
overall standard deviation ratio of 0.7 and a correlation coefficient of 0.8, and match very well
the direction of the oil market volatility.
Greater stability of oil prices can reduce uncertainty in energy markets, for the benefit
of consumers and producers alike. The ANN models developed in this study can be used as
short-term as well as long-term predictive tools for the direction and the path of oil price volatility,
to quantitatively examine the effects of various physical and economic factors on future oil market
volatility, to understand the effects of different mechanisms for reducing market volatility, and to
recommend policy options and programs incorporating mechanisms that can potentially reduce
market volatility. With this improved method for modeling oil price volatility, experts and market
analysts will be able to empirically test new approaches to mitigating market volatility. The outcome
of this work provides a roadmap for research to improve the predictability and accuracy of energy
and crude oil models.
6.2 Future Work
This study has met its objectives within the scope of work set forth. However, the following
are our recommendations and the issues to be addressed in future research.
1. At the time of this study, most of the influential factors impacting oil market volatility have data
with monthly frequency. Using significant input variables with higher data frequency (daily or
weekly) would greatly help improve the ANN model performance and the accuracy of its
predictions.
2. The scope of this study is to develop an ANN model that can describe historical data behavior
and predict satisfactorily the direction of the oil price volatility. These newly developed models
can be used as an oil market volatility forecaster and as a short-term predictive tool for the path
of the oil price volatility.
3. With the neural network model developed in this study, we recommend further analysis to
evaluate quantitatively the effects of the various physical and economic factors impacting future
oil market volatility.
4. Similar research can be pursued for modeling and forecasting the price volatility of other
global oil markets such as Brent, Dubai, and OPEC crudes. Similar work can also be pursued
to develop ANN models for gas market volatility or for other commodities, following the
same methodology presented in this study.
5. To recognize the merits, power, and performance of the ANN, a comparative study of oil market
volatility can be performed between the ANN approach and conventional regression
techniques (e.g., the GARCH model).
6. We recommend updating the newly developed ANN models periodically as new data become
available, in particular once they cease to be adequate.
ACKNOWLEDGMENTS
The author would like to thank the following for their valuable comments and review of the paper:
Fred Joutz, James Smith, Emmanuel Ntui, and Shaikh Arifusalam.
REFERENCES
Al-Fattah, S.M., & Startzman, R.A. (2003). “Neural Network Approach Predicts U.S. Natural Gas
Production.” SPE Production Facilities Journal, 18(2):84-91.
Al-Fattah, S.M. & H.A. Al-Naim. (Feb. 2009). “Artificial-Intelligence Technology Predicts Relative
Permeability of Giant Carbonate Reservoirs.” SPE Reservoir Evaluation and Engineering
Journal, 12(1):96-103.
Al-Fattah, S.M. (2011). Innovative Methods for Analyzing and Forecasting World Gas Supply.
Germany: Lambert Academic Publishing.
Azoff, E.M. (1994). Neural Network Time Series Forecasting of Financial Markets. Chichester,
England: John Wiley & Sons Ltd. Inc.
Bishop, C. (1995). Neural Networks for Pattern Recognition. Oxford: University Press.
Energy Information Administration (EIA), 2012, Retrieved from EIA website:
http://www.eia.doe.gov/.
Fausett, L. (1994). Fundamentals of Neural Networks. New York, NY: Prentice-Hall.
Goldberg, D.E. (1989). Genetic Algorithms. Reading, MA: Addison Wesley.
Haykin, S. (1994). Neural Networks: A Comprehensive Foundation. New York City: Macmillan
College Publishing Co.
Hill, T. & Lewicki, P. (2007). Statistics: Methods and Applications. Tulsa, OK: StatSoft.
Kang, S.H., Kang, S.M., & Yoon, S.M. (2009). “Forecasting Volatility of Crude Oil Markets.”
Journal of Energy Economics, 31:119-125.
Matar, W., Al-Fattah, S., Atallah, T., & Pierru, A. (Dec. 28, 2012). “An Introduction to Oil Market
Volatility Analysis.” USAEE Working Paper No. 12-152. Available at SSRN:
http://ssrn.com/abstract=2194214.
Mohaghegh, S.D. (2005). “Recent Developments in Application of Artificial Intelligence in
Petroleum Engineering.” J. of Petroleum Technology, 57 (4): 86-91. SPE-89033-MS.
Narayan, P., & Narayan, S. (2007). “Modeling Oil Price Volatility.” Journal of Energy Policy, 35:
6549-6553.
Ou, P., & Wang, H. (2011). “Applications of Neural Networks in Modeling and Forecasting
Volatility of Crude Oil Markets: Evidences from US and China.” Advanced Materials
Research, 230-232:953-957.
Patterson, D. (1996). Artificial Neural Networks. Singapore: Prentice Hall.
Pindyck, R. (2004). “Volatility in Natural Gas and Oil Markets.” The Journal of Energy and
Development, 30:1-19.
Poon, S., & Granger, C. (2003). “Forecasting Volatility in Financial Markets: A Review.” Journal of
Economic Literature, 41:478-539.
Regnier, E. (2007). “Oil and Energy Price Volatility.” Journal of Energy Economics, 29:405-427.
Sadorsky, P. (2006). “Modeling and Forecasting Petroleum Futures Volatility.” Journal of Energy
Economics, 28:467-488.
Shiang, T. (2010). Forecasting Volatility with Smooth Transition Exponential Smoothing in
Commodity Market. University Putra Malaysia.
Sidorenko, N., Baron, M., & Rosenberg, M. (2002). “Estimating Oil Price Volatility: A GARCH
Model.” EPRM, 62-65, Website: http://www.eprm.com.
StatSoft, Inc. (2011). STATISTICA (Data Analysis Software System), Version 10.
http://www.statsoft.com/
Trippi, R.R. & Turban, E. (eds.). (1996). Neural Networks in Finance and Investing: Using Artificial
Intelligence to Improve Real-World Performance. Chicago: Irwin Professional Publishing.
Wang, Y., Wu, C., & Wei, Y. (2011). “Can GARCH-Class Models Capture Long Memory in WTI
Crude Oil Markets?” Journal of Economic Modeling, 28(3):921-927.
Wei, Y., Wang, Y., & Huang, D. (2010). “Forecasting Crude Oil Market Volatility: Further Evidence
Using GARCH-Class Volatility.” Journal of Energy Economics, 32(6):1477-1484.
APPENDIX
Statistical Error Analysis:
Statistical error analysis is used to evaluate and assess the performance and the accuracy of the
prediction models. The parameters used in the statistical analysis are described below.
1. Mean relative error (Er)
This is a measure of the relative deviation of the predicted values (Xp) from the observed values (Xo)
and is defined by:
$$E_r = \frac{1}{n} \sum_{i} \left[ \frac{X_o - X_p}{X_o} \right]_i, \quad i = 1, 2, \ldots, n \tag{A-1}$$
2. Mean absolute error (Ea)
This parameter computes the relative absolute deviation from the observed or measured data and is
expressed as:
$$E_a = \frac{1}{n} \sum_{i} \left| \frac{X_o - X_p}{X_o} \right|_i, \quad i = 1, 2, \ldots, n \tag{A-2}$$
3. Mean squared error (Ems)
It is a measure of the average of the squares of the error. It is arguably the most important criterion
used to evaluate the performance of the predictor or an estimator. It is defined as:
$$E_{ms} = \frac{1}{n} \sum_{i} \left( X_o - X_p \right)_i^2, \quad i = 1, 2, \ldots, n \tag{A-3}$$
4. Root mean squared error (Erms)
The root mean squared error measures the data dispersion around zero deviation. It is expressed as:
$$E_{rms} = \sqrt{ \frac{1}{n} \sum_{i} \left( X_o - X_p \right)_i^2 }, \quad i = 1, 2, \ldots, n \tag{A-4}$$
5. Standard deviation (SD)
The standard deviation measures the dispersion or variation of the data from the mean. It is
given by:
$$SD = \sqrt{ \frac{1}{n-1} \sum_{i} \left( X_i - \bar{X} \right)^2 }, \quad i = 1, 2, \ldots, n \tag{A-5}$$
Where,
$$\bar{X} = \frac{1}{n} \sum_{i} X_i \tag{A-6}$$
6. The correlation coefficient (R)
It measures the strength of the relationship between the independent and dependent variables. It
varies between 0 and 1; a value of 1 indicates a perfect correlation or a strong relationship, whereas a
value of 0 indicates no correlation at all or no relationship among the given independent variables. It
is expressed as:
$$R = \sqrt{ 1 - \frac{ \sum_{i} \left( X_o - X_p \right)_i^2 }{ \sum_{i} \left( X_o - \bar{X} \right)_i^2 } }, \quad i = 1, 2, \ldots, n \tag{A-7}$$
where $\bar{X}$ is the mean of the observed values.
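The error measures in this appendix can be sketched in a few lines of code. This is one plausible reading of the definitions above, not the STATISTICA implementation, and it rests on several assumptions: the error is taken as observed minus predicted, Er and Ea use deviations relative to the observed value (so observed values must be nonzero), and the error standard deviation uses the n - 1 divisor. The function name `error_metrics` and the sample numbers are hypothetical.

```python
import math


def error_metrics(observed, predicted):
    """Compute Er, Ea, Ems, Erms and the error standard deviation for paired data."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equally long sequences with at least two points")
    n = len(observed)
    errors = [xo - xp for xo, xp in zip(observed, predicted)]  # assumed sign convention
    relative = [e / xo for e, xo in zip(errors, observed)]     # requires nonzero xo
    mean_error = sum(errors) / n
    mse = sum(e * e for e in errors) / n
    return {
        "Er": sum(relative) / n,                  # mean relative error
        "Ea": sum(abs(r) for r in relative) / n,  # mean absolute relative error
        "Ems": mse,                               # mean squared error
        "Erms": math.sqrt(mse),                   # root mean squared error
        "SDe": math.sqrt(sum((e - mean_error) ** 2 for e in errors) / (n - 1)),
    }


# Made-up numbers for illustration only; these are not data from the study.
observed = [2.0, 4.0, 5.0, 8.0]
predicted = [1.0, 5.0, 5.0, 6.0]
metrics = error_metrics(observed, predicted)
for name, value in metrics.items():
    print(f"{name:<5} {value:.4f}")
```

Because positive and negative deviations cancel in Er but not in Ea or Ems, a model can show a mean error near zero while its absolute and squared errors remain large.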