Content uploaded by Saud M. Al-Fattah on May 16, 2018. Content may be subject to copyright.

Electronic copy available at: http://ssrn.com/abstract=2216337

Artificial Neural Network Models for Forecasting Global Oil Market Volatility

Saud M. Al-Fattah

King Abdullah Petroleum Studies and Research Center (KAPSARC)

P.O. Box 88550, Riyadh 11672, Saudi Arabia; Email: saud.fattah@kapsarc.org

ABSTRACT

Energy market volatility affects macroeconomic conditions and can unduly affect the economies of

energy-producing countries. Large price swings can be detrimental to both producers and

consumers. Market volatility can cause infrastructure and capacity investments to be delayed,

employment losses, and inefficient investments. In sum, the growth potential for energy-producing

countries is adversely affected. Undoubtedly, greater stability of oil prices can reduce uncertainty in

energy markets, for the benefit of consumers and producers alike. Therefore, modeling and

forecasting crude oil price volatility is critical in many financial and investment applications.

The purpose of this paper is to develop new predictive models for describing and forecasting global oil price volatility using artificial intelligence with artificial neural network (ANN) modeling technology. Two ANN models were developed: one for WTI futures price volatility and the other for WTI spot price volatility. These models were successfully designed, trained, verified, and tested using historical oil market data. The estimations and predictions from the ANN models closely match the historical WTI data from January 1994 to April 2012, and the models appear to capture very well the dynamics and direction of oil price volatility.

The ANN models developed in this study can be used as short-term and long-term predictive tools for the direction of oil price volatility; to quantitatively examine the effects of various physical and economic factors on future oil market volatility; to understand the effects of different mechanisms for reducing market volatility; and to recommend policy options and programs incorporating mechanisms that can potentially reduce market volatility. With this

improved method for modeling oil price volatility, experts and market analysts will be able to

empirically test new approaches to mitigating market volatility. The outcome of this work provides a

roadmap for research to improve the predictability and accuracy of energy and crude oil models.

Keywords: oil price volatility, oil market modeling, artificial neural networks, forecasting


I. INTRODUCTION

1.1 Oil Market Volatility

Crude oil price plays a major role in global economic activity, and its variation can have effects that

span other markets and impact global economic growth. Volatility is a measure of the degree to which the prices of a commodity fluctuate. We define it as the standard deviation of price returns over the

sample timeframe. Understanding volatility of the crude oil price is important for several reasons

(Regnier, 2007). First, long-term uncertainty in future oil prices can alter the incentives to develop

new oil fields in producing countries. Second, this can also curb the implementation of alternative

energy policies in consumer countries. Third, in the short-term, volatility can also affect the demand

for storage. In particular, as Pindyck (2004) noted, greater volatility should lead to an increased

demand for storage and an increase in both spot prices and marginal convenience yield1. Moreover,

volatility is critical in the pricing of derivatives, whose trading volume has significantly increased in the last decade (Matar et al., 2012).

Economic models and policy simulations can forecast energy market behavior to some

extent using information on supply and demand (Kang, 2009; Narayan, 2007; Poon and Granger,

2003; Sadorsky, 2006; Sidorenko, 2002; Wang, 2011). Matar et al. (2012) published a review of

these models. These models can be misleading, though, unless they explicitly incorporate factors that

affect market volatility. Market volatility can be affected by such factors as:

• Geopolitical events and swings in international markets;

• Investor behavior, including speculation inside and outside the energy industry;

• The linkage between the physical and the financial markets;

• Political instability that threatens energy supplies;

• The level of transparency of market data regarding supply, demand, costs, stocks, reserves,

and capacity;

• Changes in market regulation and investment rules;

• The shift toward renewable energies and stricter fuel-quality standards; and

• Macro-economic factors such as economic growth, exchange rates and monetary policies.

Among others, GARCH-type models have been the dominant conventional method for

modeling oil prices and volatility. Comparative studies, which have examined the effect of differing

factors that influence oil price volatility and its different modeling approaches, generally didn’t reach

1 The marginal convenience yield is the economic benefit from having physical possession of the commodity.


a consensus on which of these conventional econometric and financial models best forecasts oil price volatility. One promising approach that we propose is the

application of artificial intelligence with ANN for modeling and forecasting oil market volatility. The

ANN approach has already been used in other studies to forecast financial markets and commodity pricing, but it has not been thoroughly explored and applied to modeling oil price volatility. It is therefore the purpose of this study to apply this novel ANN approach to modeling oil market volatility. It should be noted that the intention of this work is not to predict oil prices in absolute terms; rather, the objective is to reasonably predict the direction of their volatility.

International energy companies and institutions, investment researchers, and academic

researchers have devoted significant efforts to understanding and mitigating energy market volatility.

However, new developments in energy production, investment patterns, and the geopolitical

environment will require researchers to continue updating and refining their understanding of energy

markets. Additionally, new modeling techniques are being developed that may enhance researchers’

ability to model markets and volatility.

Energy and financial markets are ripe for exploration with artificial intelligence technology, given the amount and frequency of the market data that are collected. In particular, ANNs

can be used to find patterns in data and to model complex relationships between inputs and outputs.

This paper is organized as follows. Section 1 presents an overview of the oil price volatility

and the ANNs. Section 2 presents data sources, preparation and preprocessing. Section 3 presents

the input selection methods and the dimensionality reduction techniques. Sections 4 and 5 discuss

the ANN models design, development procedures, and the results. Section 6 concludes with future

work.

1.2 Artificial Neural Networks

ANNs have seen an enormous escalation of interest during the past few decades. They have proven to be powerful and effective tools for solving practical and complex problems in the petroleum industry (Mohaghegh, 2005; Al-Fattah & Startzman, 2003) and in economics and finance (Azoff, 1994; Trippi & Turban, 1996). Advantages of neural network methods over conventional simulation and regression

techniques include the ability to solve highly nonlinear complex and dynamic problems without

prior knowledge of the relationships between input and output variables, and the capability to

handle either continuous (numerical) or categorical data as either inputs or outputs. Furthermore,


the ANN approach is intuitively attractive because it is based on crude, low-level models of biological systems that simply learn by example. The neural network is fed representative data for the problem at hand and is exposed to training until it learns the pattern and behavior of the data.

ANNs mimic the human brain's parallel processing of information in the nervous system. They can also be defined as computational models of biological neural structures.

Each ANN commonly consists of a number of fully connected nodes or neurons grouped in layers.

These layers can include one input layer, one or more hidden layers, and one output layer. The

number of nodes in each of these layers depends on the number of input and output variables, and

the architecture of the ANN.

Figure 1 depicts the basic configuration of a multilayer perceptron (MLP) three-layer network: one input layer, one hidden layer, and one output layer. Each node has multiple inputs and a single output. The input layer has the same number of nodes as the independent variables, and the output layer has the same number of nodes as the dependent variables. Each input is modified by a weight, which multiplies the input value. The input can be raw data or the output of other nodes. With reference to an assigned threshold value and activation function, the node combines these weighted inputs and uses them to determine its output. This output can be either the final product or an input to another node (Bishop, 1995; Fausett, 1994; Haykin, 1994; Patterson, 1996).

Figure 1- An example of a three-layer MLP neural network structure. [Input-layer nodes (e.g., supply, demand, spare capacity, economic indicators) connect through weights to a hidden layer and on to a single output node for price volatility.]


The objective for modeling oil market volatility is to predict the target or dependent variable

(oil price volatility) that varies in time using previous values of the target variable and/or other input

variables. The independent/input variables can include crude oil prices, supply, demand, spare

capacity, and economic indicators. This time series problem can be solved using the following

network types: Multilayer Perceptrons (MLP), Radial Basis Function (RBF), Generalized Regression

Neural Network (GRNN), and Linear (Fausett, 1994; Haykin, 1994; Patterson, 1996). The Linear

model is basically the conventional regression analysis. In this study, I experimented with the first

three network types, MLP, RBF, and GRNN, and found that the MLP network outperformed the other network types for this particular study.

The design and development of an ANN model involves seven important procedures: 1. data acquisition and preparation; 2. data preprocessing; 3. input selection and dimensionality reduction; 4. network design; 5. network training; 6. verification; and 7. testing.

Figure 2 is a flowchart illustrating the ANN development strategies implemented in this study (Al-Fattah & Al-Naim, 2009).

II. DATA

2.1 Input Data and Sources

The data used to develop the ANN model for oil market volatility were collected from the Energy Information Administration (EIA, 2012). West Texas Intermediate (WTI) daily crude oil futures and spot prices were used from January 1994 to April 2012. Because the other variables were available only at monthly frequency, the WTI daily data were converted to a monthly time scale by taking the tenth trading day of each month as the data point for that month. The daily returns were adjusted for intermediate non-trading days, including weekends and holidays (Pindyck, 2004). In computing the price returns at the monthly frequency, the oil price on the tenth day of each month is used to avoid proximity to contract expiration dates. For months whose tenth day is not a trading day, the nearest preceding or succeeding trading day is used instead. In this study, we used crude prices for the first-month contract. The monthly data for the period January 1994 to

April 2012 were used for the input variables which include, among others, oil supply from major

countries, U.S. crude oil inventory, oil consumption in OECD and non-OECD countries, U.S.

ending stocks of crude oil, crude oil spare capacity, economic indicators, and world total petroleum

stocks.


Figure 2- Flowchart of ANN design and development procedure deployed in this study.

Source: Revised from Al-Fattah and Al-Naim, 2009.


These input variables were selected because they influence oil market volatility, as discussed by Matar et al. (2012). Oil supply, demand, inventory levels and storage are among the

factors impacting oil price volatility. The formula used to compute the price return is given by

r_t = 100 ln(P_t / P_{t-1}) ………………………………………………………………….. (1)

where P_t is the crude oil price at time t. The following equation adjusts for n − 1 intermediate non-trading days, following the procedure detailed by Pindyck (2004):

r_t = 100 ln(P_t / P_{t-1}) / √n ………………...………………………………………(2)

where n is the number of physical days spanned by the price change; dividing by √n rescales the standard deviation of log price changes over an interval of n physical days to the standard deviation over an interval of one physical day. When computing the returns from spot prices, an additional adjustment is necessary to account for the convenience yield (Pindyck, 2004). The price volatility can then be computed as either the squared return (v_t = r_t²) or the absolute return (v_t = |r_t|).
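The return computation, the √n non-trading-day adjustment, and the two volatility proxies described above can be sketched as follows (a minimal illustration; the prices and the 30-day interval are hypothetical, not taken from the paper's data):

```python
import math

def price_return(p_now, p_prev, n_days=1):
    """Percent log-price return (Eq. 1), divided by sqrt(n) to rescale a
    change spanning n physical days to a one-day equivalent (Eq. 2)."""
    return 100.0 * math.log(p_now / p_prev) / math.sqrt(n_days)

def volatility(r, absolute=True):
    """Volatility proxy: absolute return |r| or squared return r^2."""
    return abs(r) if absolute else r * r

# Hypothetical prices on the tenth trading day of consecutive months,
# separated by 30 physical days (illustrative values, not EIA data)
r = price_return(102.0, 100.0, n_days=30)
v = volatility(r)
```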

2.2 Data Preparation

Data acquisition, preparation, and preprocessing are considered the most important and most time-consuming tasks in the ANN model development process (Fig. 2). Determining the optimal amount of data required for developing a neural network model often presents complications. There are empirical rules in the literature that relate the number of data points needed to the size of the network. One of these rules suggests that there should be at least ten times as many data points as connections in the network (Hill & Lewicki, 2007). In this study, we have a total of 220 observations and about 22 input variables, roughly following this ten-to-one rule of thumb. In practice, the number of data points required depends on the difficulty of the problem the network is trying to model. In general, most practical problems require hundreds or thousands of data points. As the number of input variables increases, the number of data points needed increases nonlinearly; even a small number of input variables can require a large number of data points. This problem is known as “the curse of dimensionality.” If there is a larger but


still limited data set, this can be compensated for by creating an ensemble of networks. The individual networks are trained on different resamplings of the available data, and the average of the predictions from the best networks is used. The resulting combined network is called an ensemble (Hill & Lewicki, 2007).
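The resample-and-average ensemble idea can be sketched as follows; `train_member` is a hypothetical stand-in (here it simply predicts the sample mean) for training one network on one resampling:

```python
import random

def train_member(sample):
    """Hypothetical stand-in for training one network on a resampled
    data set; here it simply 'predicts' the sample mean."""
    return sum(sample) / len(sample)

def ensemble_predict(data, n_members=10, seed=0):
    """Bootstrap-resample a limited data set, train one member per
    resample, and average the members' predictions."""
    rng = random.Random(seed)
    predictions = []
    for _ in range(n_members):
        # resampling with replacement, same size as the original data
        sample = [rng.choice(data) for _ in data]
        predictions.append(train_member(sample))
    return sum(predictions) / len(predictions)
```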

2.3 Data Preprocessing

Data preprocessing is an important procedure in the development of ANN models. The

preprocessing procedures used in the construction process of the ANN model of this study are

input/output normalization and transformation (Hill & Lewicki, 2007).

2.3.1 Transformation

In my experience, ANNs perform better with normally distributed data and with data that are not seasonally adjusted. Input data exhibiting trends or periodic variations therefore require transformation. There are different ways to transform the input variables into forms that the neural network can interpret more easily and train on faster. Examples of such

transformation forms include the variable first derivative, relative variable difference, natural

logarithm of relative variable, square root of variable, and trigonometric functions. The choice of the

first-derivative transform, for example, can remove the trend in each input variable, thus helping to

reduce the multicollinearity among the input variables. Using the first derivative also results in

greater fluctuation and contrast in the values of the input variables. This improves the ability of the

ANN model to detect significant changes in patterns.
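The first-derivative (first-difference) transform described above can be illustrated with a short sketch (the trending series is hypothetical):

```python
def first_difference(series):
    """First-derivative (first-difference) transform: replace each value
    with its change from the previous one, removing the trend."""
    return [b - a for a, b in zip(series, series[1:])]

trended = [10.0, 12.0, 15.0, 19.0]     # hypothetical trending input
detrended = first_difference(trended)  # [2.0, 3.0, 4.0]
```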

2.3.2 Normalization

Normalization is a process of standardizing the possible numerical range that the input data can

take. It enhances the fairness of training by preventing an input with large values from swamping

out another input that is equally important but with smaller values. Normalization is also

recommended because the network training parameters can be tuned for a given range of input data;

thus, the training process can be carried over to similar tasks.

The mean/standard-deviation normalization method was used to normalize all the input and output variables of the neural network model developed in this study. Mean/standard-deviation preprocessing is the most commonly used method and generally works well in almost every case.

It has the benefits that it normalizes the input variable without any loss of information, and it


transforms back the original values easily. Each input variable, as well as the output, was normalized using the following equation:

X′_i = (X_i − μ) / σ ……………………………………………….. (3)

where X′ = normalized input/output vector, X = original input/output vector, μ = mean of the original input/output, σ = standard deviation of the input/output vector, and i = index of the input/output vector. Each input/output variable was normalized using the equation above with its own mean and standard deviation. This normalization was applied to the whole data set, including the training and testing sets. The single set of normalization parameters for each variable (i.e., the standard deviation and the mean) can then be preserved and applied to new data during forecasting.
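A minimal sketch of this mean/standard-deviation normalization, with the parameters preserved for reuse on new data (the sample values are illustrative only):

```python
import statistics

def fit_norm(values):
    """Compute and preserve the normalization parameters (Eq. 3)."""
    return statistics.mean(values), statistics.pstdev(values)

def normalize(values, mu, sigma):
    """z-score each value: (x - mu) / sigma."""
    return [(x - mu) / sigma for x in values]

def denormalize(values, mu, sigma):
    """Recover the original values without loss of information."""
    return [x * sigma + mu for x in values]

x = [2.0, 4.0, 6.0]      # illustrative input variable
mu, sigma = fit_norm(x)  # preserved for use on new data during forecasting
z = normalize(x, mu, sigma)
```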

III. INPUT SELECTION AND DIMENSIONALITY REDUCTION

One of the critical tasks in the design of the ANN model is the decision of which of the available

variables to use as inputs to the neural network. The only guaranteed method to select the best input

set is to train networks with all possible input sets and all possible architectures, and to select the

best. Practically, this is impossible for any significant number of potential input variables. The task becomes thornier when multicollinearity exists among some of the input variables, in which case several different sets of variables might be equally sufficient.

Some neural network architectures can actually learn to ignore redundant or insignificant variables. Other architectures are unfavorably impacted, and in all cases too many input variables mean that a larger number of training data points is needed to prevent the trained network from over-learning, or memorization. Over-learning, or over-fitting, is a major

problem that deteriorates the network performance significantly. Over-learning can cause the

network to perform very well with the training data set, but poorly with the testing set.

Consequently, the performance of a network can be improved by reducing the number of redundant

and insignificant input variables, leading the network to generalize and not memorize. There are

highly sophisticated algorithms that determine the selection of input variables. These techniques


include the genetic algorithm, forward and backward stepwise selection, and principal component analysis (Hill & Lewicki, 2007; Goldberg, 1989).

The genetic algorithm is an optimization algorithm that searches efficiently for binary strings by processing an initially random population of strings with an artificial system that mimics natural selection. It is therefore an efficient technique for identifying significant variables when there are large numbers of variables, and it offers a valuable check for smaller numbers of variables. In particular, the genetic algorithm works well at recognizing interdependencies between variables located close together on the masking strings, and it can detect subsets of variables that are not revealed by other methods. However, the genetic algorithm is time consuming; if the number of variables is large, it typically requires training and testing many thousands of networks, which can take a couple of days of computation (Goldberg, 1989; Fausett, 1994).

Forward and backward stepwise algorithms usually run much faster than the genetic

algorithm if there are a reasonable number of variables. Both techniques, forward and backward

stepwise selection, are also equally effective if there are not too many complex interdependencies

between variables. The forward stepwise input-selection technique works by adding variables one at a time, while the backward stepwise technique starts with the complete set of variables and removes them one at a time (Hill & Lewicki, 2007).
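The forward stepwise idea can be sketched generically, assuming a user-supplied `score` function (e.g. validation performance of a network trained on the candidate subset); the variable names in the usage example below are hypothetical:

```python
def forward_stepwise(candidates, score, max_vars=None):
    """Greedy forward selection: repeatedly add the candidate variable
    that most improves score(selected); stop when no candidate helps."""
    selected, remaining = [], list(candidates)
    best = score(selected)
    while remaining and (max_vars is None or len(selected) < max_vars):
        gains = [(score(selected + [v]), v) for v in remaining]
        top_score, top_var = max(gains)
        if top_score <= best:      # no candidate improves the score
            break
        best = top_score
        selected.append(top_var)
        remaining.remove(top_var)
    return selected

# Toy score: reward variables in a hypothetical 'useful' set,
# with a small penalty per variable to discourage redundancy
useful = {"supply", "demand"}
score = lambda s: sum(v in useful for v in s) - 0.1 * len(s)
chosen = forward_stepwise(["supply", "noise", "demand"], score)
```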

Another common approach to dimensionality reduction is principal component analysis (PCA) (Bishop, 1995; Hill & Lewicki, 2007), which can be implemented as a linear network. PCA can often extract a small number of components from fairly high-dimensional original data while preserving the important structure of the data.
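A pure-Python sketch of extracting the leading principal component by power iteration on the covariance matrix (a simplified stand-in for full PCA; the two perfectly collinear sample variables are illustrative):

```python
def pca_first_component(X, iters=100):
    """Leading principal component of the rows of X via power iteration
    on the covariance matrix of the centered data."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    Xc = [[row[j] - means[j] for j in range(d)] for row in X]
    # covariance matrix of the centered data
    C = [[sum(Xc[i][a] * Xc[i][b] for i in range(n)) / n
          for b in range(d)] for a in range(d)]
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(C[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # component scores: projection of each centered row on the axis
    scores = [sum(Xc[i][j] * v[j] for j in range(d)) for i in range(n)]
    return v, scores

# Two collinear variables collapse onto a single component
axis, scores = pca_first_component([[1.0, 2.0], [2.0, 4.0],
                                    [3.0, 6.0], [4.0, 8.0]])
```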

In this study, we used the genetic algorithm and the forward and backward stepwise techniques to determine the significant input variables of the neural network from a total of fifty-eight candidate input variables. We found that these techniques yield almost the same results. Using them, we eliminated the redundant input variables, a reduction of about 62%, and selected the best twenty input variables, those contributing significantly to the model’s performance, for the network model development. Among the best predictors of WTI crude oil price volatility are the U.S. crude inventory change, U.S. crude oil production, OPEC oil production, OECD inventory, oil consumption by the FSU, India and China, and OECD oil consumption. The results show that the U.S. business inventory change is the most significant input parameter, with an F-value greater than one. The OPEC total petroleum supply is the least significant of the selected variables, with an F-value of one, yet it is important enough to be retained in the model. We set the threshold for selecting input variables at an F-value of one and above. It should be noted that all these input variables are normalized.

IV. ANN MODEL DESIGN

This section discusses the design aspects for selecting the neural network architecture. These include

the learning or training algorithm, the number of nodes in each layer, the number of hidden layers,

and the type of transfer function.

4.1 Architecture

The neural network architecture determines how the weights are interconnected in the network and specifies the type of learning rules that may be used. Selecting the network architecture is one of the first steps in setting up a neural network. The multilayer perceptron (MLP) (Haykin, 1994; Azoff, 1994; Trippi & Turban, 1996) is the most commonly used architecture and is generally recommended for most applications; hence it was selected for this study.

4.2 Learning Algorithm

Selection of a learning rule is also an important step because it affects the determination of input

functions, transfer functions, and associated parameters. The network used in this study is based on

a back-propagation (BP) design, the most widely recognized and most commonly used supervised-learning algorithm (Haykin, 1994).

The fundamental structure of a back-propagation neural network consists of an input layer,

one or more hidden layers, and an output layer. Fig. 1 shows the architecture of a BP neural

network. A layer consists of a number of processing elements or neurons. The layers are fully

connected, indicating that each neuron of the input layer is connected to each hidden layer node.

Similarly, each hidden layer node is connected to each output layer node. The number of nodes needed for the input and output layers depends on the number of input and output variables designed for the neural network.


4.3 Transfer Function

A transfer function acts on the value returned by the input function. The input function associates

the input vector with the weight vector to obtain the net input to the node given a particular input

vector (Haykin, 1994). Each of the transfer functions introduces nonlinearity into the neural

network, enriching its representational capacity. Actually, one of the ANN advantages over the

traditional regression techniques is the nonlinearity introduced by the transfer function to the

network. There are a number of transfer functions. Among those are sigmoid, arctan, sin, linear,

Gaussian, and Cauchy. The most commonly used transfer function is the sigmoid function. It

squashes and compresses the input function when the variables take on large positive or large

negative values. Large positive values asymptotically approach one, while large negative values are

squashed to zero. The sigmoid is given by (Haykin, 1994)

f(x) = 1 / (1 + e^(−x)) ……………………………………………….. (4)

Figure 3 is a typical plot of the sigmoid function. In essence, the activation function acts as a

nonlinear gain for the node. The gain is actually the slope of the sigmoid at a specific point. It varies

from a low value at large negative inputs, to a high value at zero input, and then drops back toward

zero as the input becomes large and positive.

Figure 3- Sigmoid function. [Plot of the output, from 0 to 1, against inputs from −20 to 20.]
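The sigmoid of Eq. 4 and its slope (the nonlinear gain discussed above) can be written as a short sketch:

```python
import math

def sigmoid(x):
    """Logistic transfer function of Eq. 4: f(x) = 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_gain(x):
    """Slope of the sigmoid, f'(x) = f(x)(1 - f(x)): the nonlinear gain,
    highest at x = 0 and decaying toward zero for large |x|."""
    s = sigmoid(x)
    return s * (1.0 - s)
```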


4.4 Example

To illustrate the backpropagation training algorithm, consider the simple network shown below.

Figure 4- Example neural network illustrating backpropagation training algorithm.

This network, shown in Figure 4, has two nodes in the input layer, two nodes in the hidden layer,

and a single node in the output layer. Assume that the nodes have a sigmoid activation function. Let us

(i) Perform a forward pass on the network.

(ii) Perform a reverse pass (training) once (target = 0.5).

(iii) Perform a further forward pass and comment on the result.

(i) Input to top hidden neuron = (0.35 × 0.1) + (0.9 × 0.8) = 0.755. Out = 1/(1 + e^(−0.755)) = 0.68.
Input to bottom hidden neuron = (0.9 × 0.6) + (0.35 × 0.4) = 0.68. Out = 0.6637.
Input to final neuron = (0.3 × 0.68) + (0.9 × 0.6637) = 0.80133. Out = 0.69.

(ii) Output error δ = (target − output) × (1 − output) × output = (0.5 − 0.69) × (1 − 0.69) × 0.69 = −0.0406.

Calculating new weights for the output layer:
w1+ = w1 + (δ × input) = 0.3 + (−0.0406 × 0.68) = 0.272392
w2+ = w2 + (δ × input) = 0.9 + (−0.0406 × 0.6637) = 0.87305

Calculating errors for the hidden layer (using the updated output weights):
δ1 = δ × w1+ × (1 − output) × output = −0.0406 × 0.272392 × (1 − 0.68) × 0.68 = −2.406 × 10⁻³
δ2 = δ × w2+ × (1 − output) × output = −0.0406 × 0.87305 × (1 − 0.6637) × 0.6637 = −7.916 × 10⁻³

[Figure 4 parameters: inputs I1 = 0.35 and I2 = 0.90; hidden-layer weights 0.1 and 0.8 (top node) and 0.4 and 0.6 (bottom node); output-layer weights 0.3 and 0.9; target output = 0.5.]


New hidden layer weights:
w3+ = 0.1 + (−2.406 × 10⁻³ × 0.35) = 0.09916
w4+ = 0.8 + (−2.406 × 10⁻³ × 0.9) = 0.7978
w5+ = 0.4 + (−7.916 × 10⁻³ × 0.35) = 0.3972
w6+ = 0.6 + (−7.916 × 10⁻³ × 0.9) = 0.5928

(iii) The old error was (0.5 − 0.69) = −0.19, and the new error is (0.5 − 0.68205) = −0.18205. Therefore, the error has been reduced. Training continues until the network converges to a solution within an acceptable error tolerance, or reaches the maximum defined number of iterations, or epochs.
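The worked example above can be reproduced numerically as follows (a sketch using a learning rate of 1, matching the example's update rule):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Inputs, weights, and target from the worked example (Figure 4)
i1, i2 = 0.35, 0.90
w3, w4 = 0.1, 0.8    # weights into the top hidden node
w5, w6 = 0.4, 0.6    # weights into the bottom hidden node
w1, w2 = 0.3, 0.9    # weights into the output node
target = 0.5

def forward():
    """Forward pass using the current (module-level) weights."""
    h1 = sigmoid(i1 * w3 + i2 * w4)
    h2 = sigmoid(i1 * w5 + i2 * w6)
    return h1, h2, sigmoid(w1 * h1 + w2 * h2)

h1, h2, out = forward()                    # out ~ 0.69

# Reverse pass: output delta, then weight updates (learning rate = 1)
delta = (target - out) * (1 - out) * out   # ~ -0.0406
w1, w2 = w1 + delta * h1, w2 + delta * h2
d1 = delta * w1 * (1 - h1) * h1            # hidden deltas, using the
d2 = delta * w2 * (1 - h2) * h2            # updated weights as in the text
w3, w4 = w3 + d1 * i1, w4 + d1 * i2
w5, w6 = w5 + d2 * i1, w6 + d2 * i2

_, _, out2 = forward()                     # ~ 0.682: the error has shrunk
```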

V. ANN MODEL DEVELOPMENT

5.1 Training, Verifying, and Testing

In this study, the data (January 1994 to April 2012) are partitioned into three groups: a training set (70% of the data), a verification or validation set (15%), and a testing set (15%). This apportioning is recommended as near-optimal (Hill & Lewicki, 2007); an 80%-10%-10% split is also possible. Table 1 shows the partitioning of the WTI crude price data sets. In the training process, the network is exposed repeatedly to the input data, and the weights and thresholds of the post-synaptic potential function are adjusted using a backpropagation training algorithm until the network correctly predicts the output within the error-threshold requirements.

Normally, the training data subset is presented to the network in several or even hundreds of

iterations. Each presentation of the training data to the network for adjustment of weights and

thresholds is referred to as an epoch or iteration. This process continues until the overall error

function has been sufficiently minimized (see the above example for illustration). The convergence

criteria for the trained network are either a residual error of 1.0 × 10⁻⁶ or less, or a maximum of 1000 epochs, whichever occurs first.

The overall error is also computed for the second subset of the data which is sometimes

referred to as the verification or validation data. The verification data acts as a watchdog and takes


no part in the adjustment of weights and thresholds during training, but the network’s performance is continually checked against this subset as training continues. Training is stopped when the error on the verification data stops decreasing or starts to increase.
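The stopping logic described above can be sketched generically; `step` and `val_error` are hypothetical callbacks standing in for one training epoch and the verification-set evaluation:

```python
def train_with_early_stopping(step, val_error, max_epochs=1000, tol=1e-6):
    """Generic early-stopping loop.  `step()` runs one training epoch and
    returns the training error; `val_error()` evaluates the verification
    set, which takes no part in the weight adjustments.  Stop when the
    verification error stops decreasing, the training (residual) error
    falls below `tol`, or `max_epochs` is reached."""
    best_val = float("inf")
    for epoch in range(1, max_epochs + 1):
        train_err = step()
        v = val_error()
        if v >= best_val:        # verification error no longer improving
            return epoch
        best_val = v
        if train_err <= tol:     # residual-error convergence criterion
            return epoch
    return max_epochs
```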

Use of the verification subset is important because, with unlimited training, the neural network usually starts over-learning (over-fitting) the training data, leading to memorization. Memorization means that, given no restrictions on training, a neural network may match the training data almost perfectly yet perform very poorly on new data such as the testing set. A good ANN model is characterized by generalization, and using the verification subset to stop training at the point when generalization potential is best is a critical consideration in training neural networks.

A third subset of the data, the testing set, serves as an additional independent check on the generalization capabilities of the neural network and acts as a blind test of the network’s performance and accuracy (Al-Fattah, 2011; Fausett, 1994). In this study, several neural network architectures and training algorithms were attempted to achieve the best results.

The software tool that was used in developing the ANN model in this study is STATISTICA

(StatSoft, 2011).

Table 1– Partitioning of WTI Crude Data Sets

Data Set                  | No. of Data Points | No. of Years | Period
Estimation (Training)     | 154 (70%)          | ~ 13         | Jan. 1994 – Oct. 2006
Validation (Verification) | 33 (15%)           | ~ 3          | Nov. 2006 – Jul. 2009
Forecasting (Testing)     | 33 (15%)           | ~ 3          | Aug. 2009 – Apr. 2012
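The 70/15/15 chronological partitioning of Table 1 can be sketched as follows (the integer "months" stand in for the actual monthly observations):

```python
def chronological_split(data, train_frac=0.70, val_frac=0.15):
    """Partition a time series chronologically into training,
    verification, and testing sets (70/15/15, as in Table 1)."""
    n = len(data)
    n_train = round(n * train_frac)
    n_val = round(n * val_frac)
    return (data[:n_train],
            data[n_train:n_train + n_val],
            data[n_train + n_val:])

# 220 monthly observations (Jan. 1994 - Apr. 2012) -> 154 / 33 / 33
months = list(range(220))
training, validation, testing = chronological_split(months)
```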

5.2 Discussion of Results

The ANN model for oil market volatility developed in this study was successfully trained, verified, and checked for generalization. The best-performing model had the following characteristics: MLP network architecture, back-propagation training algorithm, logistic transfer function, and three layers (an input layer with four nodes, a hidden layer with four nodes, and an output layer with one node).

The result of the ANN model for WTI oil market volatility is shown in Figure 5. As can be seen from this figure, the ANN model matches the observed oil price volatility and captures very well the direction and path of the price volatility, whether upward or downward. The ANN model also captured the negative oil shock that took place in 2008 due to the world financial crisis. It was not the intention of this study to predict the oil price with high accuracy; rather, the purpose is to model and predict the direction and path of oil price volatility. I believe achieving this objective is of significant importance to oil producers, consumers, investors, and traders. Figure 6 shows that the error distribution of the ANN model is evenly distributed around zero, indicating that the errors do not deviate from normality. Figure 7, which presents the normal probability plot, supports this observation by showing that most of the data fall on the straight line. A cross-plot of the observed volatility data and the ANN model results, depicted in Figure 8, indicates close agreement between the observed and predicted volatility, with the majority of the data falling on the 45° line.

Figure 5- Results of ANN model for WTI futures price volatility.
[Figure: Normalized WTI crude futures price volatility (Jan. 1994 – Apr. 2012); actual vs. ANN model, with estimation, validation, and forecasting periods marked.]


[Figure: WTI crude oil futures price volatility; residual errors distribution of the 1.MLP 58-14-1 network; residual errors of volatility (x-axis) vs. frequency (y-axis).]
Figure 6- Residuals distribution of ANN model of WTI crude futures price volatility.

[Figure: Normal probability plot of WTI price volatility residuals; observed value (x-axis) vs. expected normal value (y-axis).]
Figure 7- Normal probability plot of WTI crude futures price volatility.


[Figure: WTI crude oil futures price volatility; cross-plot of ANN model vs. actual data, normalized WTI price return (target) vs. normalized WTI price return (output).]
Figure 8- A cross-plot of WTI crude futures price volatility model.

Statistical error and graphical analyses (Hill & Lewicki, 2007) are used in this study to examine the performance of the ANN model. The statistical parameters used are the output data standard deviation (SDo), output error mean (Er), output error standard deviation (SDe), output absolute error mean (Ea), correlation coefficient (R), mean squared error (Ems), and root mean squared error (Erms). It is good practice to apply the statistical error analysis to the verification or testing subsets, but not the training set, when judging and comparing the performance of a network against competing networks. The training set almost always yields better results than the validation or testing sets, since the network is exposed to the majority of the data during training, which lets it learn the data structure and capture the data pattern very well. Therefore, a good measure of the network's performance is to use the test data that have not been seen by the network during the development of the model.


Table 2 presents the results of the statistical analysis for the ANN oil-market volatility model developed in this study. The results are shown for each of the training, validation, and testing sets as well as for the complete data set. As is usually the case, the errors of the test (forecasting, or out-of-sample) set are slightly higher than those of the training set. The standard deviation ratio (SDR) is less than one for all data and its subsets except the validation set, where it is slightly higher than unity. The Ems values are 0.186 for the training set, 0.858 for the verification set, and 0.851 for the testing set, with an overall Ems value of 0.384 for the entire dataset. The correlation coefficients for the model subsets range from 0.6 to 0.9, with an overall R value of 0.8 for the entire dataset, indicating a high degree of accuracy. In oil market and pricing models, correlation coefficients of about 0.6 and higher are reasonable and not considered low.

We also developed another ANN model for WTI spot price volatility. The results are no different from those of the WTI futures price volatility model. Figure 9 shows the excellent agreement between the actual WTI spot price volatility and the ANN model predictions.

Table 2. Results of ANN model performance for WTI futures price volatility.

Parameter | All Data Sets | Training Set | Validation Set | Testing Set
Output data standard deviation (SDo) | 0.882 | 0.895 | 0.580 | 1.060
Mean relative error (Er) | -0.077 | -0.005 | -0.475 | -0.027
Error standard deviation (SDe) | 0.617 | 0.433 | 0.808 | 0.936
Mean absolute error (Ea) | 0.467 | 0.344 | 0.762 | 0.755
Correlation coefficient (R) | 0.794 | 0.903 | 0.573 | 0.588
Mean squared error (Ems) | 0.384 | 0.186 | 0.858 | 0.851
Root mean squared error (Erms) | 0.620 | 0.431 | 0.926 | 0.923


Figure 9- ANN model prediction of WTI spot price volatility.
[Figure: WTI crude spot price volatility (Jan. 1994 – Apr. 2012); actual vs. ANN model, with estimation, validation, and forecasting periods marked.]

VI. CONCLUSIONS AND FUTURE WORK

6.1 Conclusions

This study applied a novel approach for modeling and predicting the oil market volatility using ANN. Two ANN models were successfully developed: one for WTI futures price volatility and the other for WTI spot price volatility. The ANN models were developed with three layers, an MLP architecture, and a backpropagation training algorithm. Input selection techniques were deployed to identify the variables significant to the performance of the network and to eliminate redundant variables. The ANN models were well designed, trained, verified, and tested using historical oil market data. The estimations and predictions from the ANN models closely match the data of WTI crude prices from January 1994 to April 2012. The ANN models appear to capture very well the dynamics and the direction of the volatility for both WTI crude spot and futures prices. The predictions of the ANN models show an excellent agreement with the observed data by yielding an


overall standard deviation ratio value of 0.7 and a correlation coefficient of 0.8, and match very well

the direction of the oil market volatility.

Greater stability of oil prices can reduce uncertainty in energy markets, for the benefit of consumers and producers alike. The ANN models developed in this study can be used as short-term as well as long-term predictive tools for the direction and the path of oil price volatility; to quantitatively examine the effects of various physical and economic factors on future oil market volatility; to understand the effects of different mechanisms for reducing market volatility; and to recommend policy options and programs incorporating mechanisms that can potentially reduce market volatility. With this improved method for modeling oil price volatility, experts and market analysts will be able to empirically test new approaches to mitigating market volatility. The outcome of this work provides a roadmap for research to improve the predictability and accuracy of energy and crude oil models.

6.2 Future Work

This study has met its objectives within the scope of work set forth. The following recommendations identify issues to be addressed in future research.

1. At the time of this study, most of the influential factors impacting oil market volatility have data at monthly frequency. Using significant input variables with higher data frequency (daily or weekly) would greatly help improve the ANN model performance and its prediction accuracy.

2. The scope of this study was to develop an ANN model that can describe historical data behavior and satisfactorily predict the direction of the oil price volatility. The newly developed models can be used as oil market volatility forecasters and as short-term predictive tools for the path of the oil price volatility.

3. With the neural network model developed in this study, we recommend further analysis to

evaluate quantitatively the effects of the various physical and economic factors impacting future

oil market volatility.

4. A similar research study can be pursued for modeling and forecasting the price volatility of other global oil markets such as Brent, Dubai, and OPEC crudes. Similar work can also be pursued to develop ANN models for gas market volatility, or for any other commodity, following the same methodology presented in this study.


5. To recognize the merits, power, and performance of the ANN approach, a comparative study of oil market volatility can be performed between ANN and conventional econometric techniques (e.g., GARCH models).

6. It is recommended that the newly developed ANN models be updated periodically as new data become available, particularly when they cease to be adequate.
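For the comparative study with GARCH suggested in item 5, a minimal GARCH(1,1) benchmark can be sketched as follows. The parameter values here are purely illustrative; a real comparison would estimate them by maximum likelihood from the volatility data.

```python
import numpy as np

# Minimal GARCH(1,1) conditional-variance recursion, the conventional
# benchmark for volatility modeling. Parameter values are illustrative,
# not estimated from data.

def garch11_variance(returns, omega=0.02, alpha=0.10, beta=0.85):
    """sigma2[t] = omega + alpha * r[t-1]**2 + beta * sigma2[t-1]."""
    sigma2 = np.empty(len(returns))
    sigma2[0] = np.var(returns)  # initialize at the sample variance
    for t in range(1, len(returns)):
        sigma2[t] = omega + alpha * returns[t - 1] ** 2 + beta * sigma2[t - 1]
    return sigma2

rng = np.random.default_rng(1)
r = rng.normal(scale=0.4, size=500)  # toy return series
sig2 = garch11_variance(r)

# With alpha + beta < 1 the process is stationary and the long-run
# (unconditional) variance is omega / (1 - alpha - beta).
print(round(0.02 / (1 - 0.10 - 0.85), 3))  # 0.4
```

Comparing the ANN forecasts against this recursion on the same test set, using the error measures in the appendix, would quantify how much the ANN approach improves on the conventional benchmark.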

ACKNOWLEDGMENTS

The author would like to thank the following for their valuable comments and review of the paper:

Fred Joutz, James Smith, Emmanuel Ntui, and Shaikh Arifusalam.

REFERENCES

Al-Fattah, S.M., & Startzman, R.A. (2003). “Neural Network Approach Predicts U.S. Natural Gas

Production.” SPE Production Facilities Journal, 18(2):84-91.

Al-Fattah, S.M. & H.A. Al-Naim. (Feb. 2009). “Artificial-Intelligence Technology Predicts Relative

Permeability of Giant Carbonate Reservoirs.” SPE Reservoir Evaluation and Engineering

Journal, 12(1):96-103.

Al-Fattah, S.M. (2011). Innovative Methods for Analyzing and Forecasting World Gas Supply.

Germany: Lambert Academic Publishing.

Azoff, E.M. (1994). Neural Network Time Series Forecasting of Financial Markets. Chichester,

England: John Wiley & Sons.

Bishop, C. (1995). Neural Networks for Pattern Recognition. Oxford: Oxford University Press.

Energy Information Administration (EIA). (2012). Retrieved from the EIA website: http://www.eia.doe.gov/.

Fausett, L. (1994). Fundamentals of Neural Networks. New York, NY: Prentice-Hall.

Goldberg, D.E. (1989). Genetic Algorithms. Reading, MA: Addison Wesley.

Haykin, S. (1994). Neural Networks: A Comprehensive Foundation. New York City: Macmillan

College Publishing Co.

Hill, T. & Lewicki, P. (2007). Statistics: Methods and Applications. Tulsa, OK: StatSoft.

Kang, S.H., Kang, S.M., & Yoon, S.M. (2009). “Forecasting Volatility of Crude Oil Markets.”

Journal of Energy Economics, 31:119-125.


Matar, W., Al-Fattah, S., Atallah, T., & Pierru, A. (Dec. 28, 2012). “An Introduction to Oil Market

Volatility Analysis.” USAEE Working Paper No. 12-152. Available at SSRN:

http://ssrn.com/abstract=2194214.

Mohaghegh, S.D. (2005). “Recent Developments in Application of Artificial Intelligence in

Petroleum Engineering.” J. of Petroleum Technology, 57 (4): 86-91. SPE-89033-MS.

Narayan, P., & Narayan, S. (2007). “Modeling Oil Price Volatility.” Journal of Energy Policy, 35:

6549-6553.

Ou, P., & Wang, H. (2011). “Applications of Neural Networks in Modeling and Forecasting

Volatility of Crude Oil Markets: Evidences from US and China.” Advanced Materials

Research, 230-232:953-957.

Patterson, D. (1996). Artificial Neural Networks. Singapore: Prentice Hall.

Pindyck, R. (2004). “Volatility in Natural Gas and Oil Markets.” The Journal of Energy and

Development, 30:1-19.

Poon, S., & Granger, C. (2003). “Forecasting Volatility in Financial Markets: A Review.” Journal of

Economic Literature, 41:478-539.

Regnier, E. (2007). “Oil and Energy Price Volatility.” Journal of Energy Economics, 29:405-427.

Sadorsky, P. (2006). “Modeling and Forecasting Petroleum Futures Volatility.” Journal of Energy

Economics, 28:467-488.

Shiang, T. (2010). Forecasting Volatility with Smooth Transition Exponential Smoothing in

Commodity Market. University Putra Malaysia.

Sidorenko, N., Baron, M., & Rosenberg, M. (2002). “Estimating Oil Price Volatility: A GARCH

Model.” EPRM, 62-65, Website: http://www.eprm.com.

StatSoft, Inc. (2011). STATISTICA (Data Analysis Software System), Version 10.

http://www.statsoft.com/

Trippi, R.R. & Turban, E. (eds.). (1996). Neural Networks in Finance and Investing: Using Artificial

Intelligence to Improve Real-World Performance. Chicago: Irwin Professional Publishing.

Wang, Y., Wu, C., & Wei, Y. (2011). “Can GARCH-Class Models Capture Long Memory in WTI

Crude Oil Markets?” Journal of Economic Modeling, 28(3):921-927.

Wei, Y., Wang, Y., & Huang, D. (2010). “Forecasting Crude Oil Market Volatility: Further Evidence

Using GARCH-Class Volatility.” Journal of Energy Economics, 32(6):1477-1484.


APPENDIX

Statistical Error Analysis:

Statistical error analysis is used to evaluate and assess the performance and the accuracy of the

prediction models. The parameters used in the statistical analysis are described below.

1. Mean relative error (Er)

This is a measure of the relative deviation of the predicted values (Xp) from the observed values (Xo) and is defined by:

$E_r = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{X_{o,i} - X_{p,i}}{X_{o,i}}\right)$ ……………………… (A-1)

2. Mean absolute error (Ea)

This parameter computes the relative absolute deviation from the observed or measured data and is expressed as:

$E_a = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{X_{o,i} - X_{p,i}}{X_{o,i}}\right|$ ……………………… (A-2)

3. Mean squared error (Ems)

This is a measure of the average of the squares of the errors and is arguably the most important criterion used to evaluate the performance of a predictor or estimator. It is defined as:

$E_{ms} = \frac{1}{n}\sum_{i=1}^{n}\left(X_{o,i} - X_{p,i}\right)^2$ ……………………… (A-3)

4. Root mean squared error (Erms)

The root mean squared error measures the data dispersion around zero deviation. It is expressed as:

$E_{rms} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(X_{o,i} - X_{p,i}\right)^2}$ ……………………… (A-4)


5. Standard deviation (SD)

The standard deviation measures the dispersion or variation of the data from the mean. It is given by:

$SD = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}$ ……………………… (A-5)

where

$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ ……………………… (A-6)

6. Correlation coefficient (R)

The correlation coefficient measures the strength of the relationship between the independent and dependent variables. It varies between 0 and 1; a value of 1 indicates a perfect correlation or a strong relationship, whereas a value of 0 indicates no correlation or no relationship among the given variables. It is expressed as:

$R = \sqrt{1 - \frac{\sum_{i=1}^{n}\left(X_{o,i} - X_{p,i}\right)^2}{\sum_{i=1}^{n}\left(X_{o,i} - \bar{X}_o\right)^2}}$ ……………………… (A-7)
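The error measures defined in this appendix can be implemented compactly. The sketch below is an illustrative NumPy implementation of the measures in Eqs. A-1 through A-7; the function and dictionary key names are our own choosing.

```python
import numpy as np

# Direct implementation of the appendix's statistical error measures
# (Eqs. A-1 to A-7). Names are illustrative; formulas follow the standard
# definitions used in the paper.

def error_metrics(observed, predicted):
    xo = np.asarray(observed, dtype=float)
    xp = np.asarray(predicted, dtype=float)
    rel = (xo - xp) / xo                     # relative deviations
    err = xo - xp
    return {
        "Er": rel.mean(),                    # A-1 mean relative error
        "Ea": np.abs(rel).mean(),            # A-2 mean absolute error
        "Ems": (err ** 2).mean(),            # A-3 mean squared error
        "Erms": np.sqrt((err ** 2).mean()),  # A-4 root mean squared error
        "SDe": err.std(ddof=1),              # A-5/A-6 error standard deviation
        "R": np.sqrt(1 - (err ** 2).sum()
                     / ((xo - xo.mean()) ** 2).sum()),  # A-7 correlation coeff.
    }

m = error_metrics([1.0, 2.0, 4.0], [1.0, 2.0, 4.0])  # perfect prediction
print(m["Ems"], m["R"])  # 0.0 1.0
```

As discussed in Section 5.2, these metrics are most informative when computed on the validation and testing subsets rather than the training set.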