Artificial Intelligence Approach for Modeling and Forecasting Oil-Price Volatility

Saud M. Al-Fattah, Saudi Aramco
Summary
Oil market volatility affects macroeconomic conditions and can unduly affect the economies of oil-producing countries. Large price
swings can be detrimental to producers and consumers: they can delay infrastructure and capacity investments, cause employment
losses and inefficient investments, and adversely affect the growth potential of energy-producing countries. Greater stability of
oil prices increases the certainty of oil markets for the benefit of oil consumers and producers. Therefore, modeling
and forecasting crude-oil price volatility is a strategic endeavor for many oil market and investment applications.
This paper focuses on the development of a new predictive model for describing and forecasting the behavior and dynamics of
global oil-price volatility. A hybrid artificial intelligence model (GANNATS), which combines a genetic algorithm (GA), an artificial
neural network (ANN), and data mining (DM) time-series (TS) analysis, was developed to forecast the futures price volatility of West
Texas Intermediate (WTI) crude. The WTI price volatility model was designed, trained, verified, and tested using historical
oil market data. The predictions from the GANNATS model closely matched the historical data of WTI futures price volatility. The
model not only described the behavior and captured the dynamics of oil-price volatility, but also predicted the direction of
movement of oil market volatility with an accuracy of 88%.
The model is applicable as a predictive tool for oil-price volatility and its direction of movements, benefiting oil producers, consumers,
investors, and traders. It assists these key market players in making sound decisions and taking corrective courses of action for oil market
stability, development strategies, and future investments; this could lead to increased profits and to reduced costs and market losses. In
addition, this improved method for modeling oil-price volatility enables experts and market analysts to empirically test new approaches
for mitigating market volatility. It also provides a roadmap for improving the predictability and accuracy of energy and crude models.
Introduction
Oil-Price Volatility. The price of crude oil plays a major role in global economic activity, and its fluctuations can affect other markets
and impact global economic growth. Volatility is a measure of the degree to which prices of a commodity fluctuate. We define it as the
standard deviation of price returns over the sample time frame. Understanding the volatility of crude-oil pricing is important for several
reasons. First, long-term uncertainty in future oil prices can alter the incentives to develop new oil fields in producing countries.
Second, this uncertainty can also curb the implementation of alternative energy policies in oil-consuming countries. Third, in the short term,
volatility can also affect the demand for oil inventories (Regnier 2007; Huntington et al. 2014). Moreover, volatility is critical for pricing
derivatives, whose trading volume has increased significantly in the last decade (Matar et al. 2013; Huntington et al. 2014).
Economic models and policy simulations can, to some extent, forecast oil market behavior using data from oil market fundamentals,
including supply and demand as well as inventories. Various types of econometric models were developed and published (Poon and
Granger 2003; Sadorsky 2006; Narayan and Narayan 2007; Kang et al. 2009; Wang et al. 2011). A survey of econometric models for
oil prices and volatility was published by Matar et al. (2013) and Huntington et al. (2013). These econometric models had limitations
and weaknesses because they did not explicitly incorporate influential factors of market volatility. Market volatility can be affected by
endogenous and exogenous factors, such as geopolitical events and instability in important oil regions, imbalance of market fundamen-
tals (supply, demand, and storage), spare oil capacity, speculation by investors, the relationship between the physical and financial mar-
kets, transparency of oil market data, changes in market regulations and policies, and the role of nonconventional oil and renewable
energy in the global energy mix, along with econometric factors, such as economic growth, exchange rates, and monetary policies
(Matar et al. 2013; Huntington et al. 2014).
Significant research and study efforts have been devoted to understanding and mitigating energy market volatility. The new develop-
ments in energy production, investment strategies, and geopolitical environment require continuous updating and refined understanding
of the energy market’s behavior and dynamics. Furthermore, there are new advanced modeling techniques in development that are
expected to enhance and improve modeling of oil market dynamics and volatility.
Artificial Intelligence (AI). The energy and financial markets are burgeoning areas for exploring AI, a science that has gained
increased interest in recent years owing to the power and capability of the state-of-the-art technology and its various applications in the
petroleum industry (Al-Fattah and Startzman 2001, 2003; Mohaghegh 2005; Mohaghegh et al. 2011), and in economics and finance
(Azoff 1994; Trippi and Turban 1996). The most common types of AI models are ANN, machine learning (ML), deep learning, GA,
support vector machine, fuzzy logic, and the boosted decision model. In particular, ANN is used for recognizing patterns in data and
modeling complex relationships between a target and its influential factors and parameters. ANNs can be defined as parallel processing
models of biological neural structures. Each ANN commonly consists of a number of fully connected nodes or neurons grouped in
layers; these layers can include one input layer, one or more hidden layers, and one output layer. The number of nodes in each of these
layers is dependent upon the number of input and output variables, and the architecture of the network, as shown in Fig. 1. The neural
network is fed with data that are representative of the problem, and is submitted to training until it learns the pattern and behavior of the
data. GA is an optimization technique that operates on binary strings and is commonly used to search for and identify the factors that have an
impact on the predicted variable. The GA technique is discussed further in a later section.
Copyright © 2019 Society of Petroleum Engineers
Original SPE manuscript received for review 13 December 2017. Revised manuscript received for review 10 September 2018. Paper (SPE 195584) peer approved 14 January 2019.
DOI: 10.2118/195584-PA
August 2019 SPE Reservoir Evaluation & Engineering
This paper presents a hybrid approach of AI using a GA, ANN, and DM time-series model for describing and forecasting crude-oil
market volatility and its direction of movement. The hybrid model, referred to as the GANNATS model, was aimed at modeling the
behavior and capturing the dynamics of oil-price fluctuations. It should be emphasized that the intention of this work was not to predict
oil prices in absolute terms, but rather to model the behavior and capture the dynamics of oil market volatility and to
adequately predict the direction of oil-price volatility. The aim was to provide a predictive tool for oil
producers, consumers, traders, and investors.
Methodology
The development of the AI model implemented in this study followed several procedures: (1) data acquisition
and preparation; (2) data mining and preprocessing; (3) feature selection of significant input variables; (4) model design;
(5) model training, verification, and testing; (6) model performance evaluation; (7) model optimization and fine-tuning; and (8) postprocessing
of model results. Fig. 2 presents the framework and workflow implemented in this study to develop
an AI model [adapted from Al-Fattah and Al-Naim (2009)].
Oil-price volatility (target) and future values were forecast using previous values or lagged time steps of the oil-price variable and
its influential input variables. For example, the values of the input variables for the current and previous months were used to forecast
the volatility of future months. This study experimented with three types of ANN architecture: multilayer perceptron (MLP), radial
basis function (RBF), and generalized regression neural network (GRNN). The MLP network performed better than did the other net-
work types for this particular AI application.
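The lagged-input arrangement described above can be illustrated with a minimal sketch. It assumes the input variables are stored as one row per month; the function name `make_lagged_dataset`, the parameters `n_lags` and `horizon`, and the toy numbers are hypothetical and are not taken from the study.

```python
def make_lagged_dataset(inputs, target, n_lags=2, horizon=1):
    """Pair lagged input rows with a future target value.

    inputs : list of per-month rows (one value per input variable)
    target : list of per-month volatility values
    Each X row holds the inputs for months t, t-1, ..., t-n_lags+1,
    and the matching y value is the target at month t + horizon.
    """
    X, y = [], []
    for t in range(n_lags - 1, len(inputs) - horizon):
        row = []
        for lag in range(n_lags):  # lag 0 = current month, lag 1 = previous month
            row.extend(inputs[t - lag])
        X.append(row)
        y.append(target[t + horizon])
    return X, y


# Toy data (hypothetical): 6 months, 2 input variables.
inputs = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0], [6.0, 7.0], [8.0, 9.0], [10.0, 11.0]]
target = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
X, y = make_lagged_dataset(inputs, target, n_lags=2, horizon=1)
```

With two lags and a one-month horizon, the first usable sample combines months 1 and 0 to predict month 2, so the usable sample count is smaller than the number of observations.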
Data Preparation
Data acquisition, preparation, and preprocessing are considered to be the most crucial and most time-consuming tasks in the model-
development process. The data used in this study were from the US Energy Information Administration (EIA 2012), a public-domain
source of energy data. Nominal daily data of WTI crude-oil futures from January 1994 to April 2012 were used. Because the majority
of the input variables were provided monthly, the WTI daily data were converted to a monthly time scale, as described by Matar et al.
(2013). This study used WTI crude-price data for the first-month futures contract.
Fig. 3 shows the WTI crude futures prices with the accompanying major global events that impacted its volatility. The major eco-
nomic and geopolitical events that influenced the oil prices varied in their degree of influence on the volatility of the oil market. The
Asian Financial Crisis caused an economic recession during 1997 and 1998 owing to an oil-price depression. The Second Gulf War in
2003 increased oil prices. The 9/11 Terrorist Attacks in 2001 and the 2008 Financial Crisis caused dramatic drops in the price of oil
that consequently led to the highest-volatility oil market to date.
The monthly input data for a pool of predictor variables affecting oil-price volatility include US oil production; Organization of the
Petroleum Exporting Countries (OPEC) production; the oil supply from major producing countries; US oil inventory change; Organiza-
tion for Economic Cooperation and Development (OECD) oil consumption; oil consumption by Russia, India, and China; OPEC spare
capacity; world total petroleum stocks; and economic indicators, such as gross domestic product (GDP), consumer price index,
exchange rates, and interest rates. These input variables were initially selected because they influence the oil market volatility. In this
study, a total of 220 monthly data observations were used and 58 input variables were considered, including transformed variables with
functional links.
The formula used to compute the price returns was expressed as
$$r_t = \ln\left(\frac{P_t}{P_{t-1}}\right), \tag{1}$$
where $P_t$ is the oil price (USD/bbl) at time $t$ and $P_{t-1}$ is the oil price at time $t-1$. The oil price volatility ($v_t$) was then computed as the
returns squared:
$$v_t = r_t^2. \tag{2}$$
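As a small worked illustration of Eqs. 1 and 2, the sketch below computes log returns and squared-return volatility from a price series. The function names and the four prices are hypothetical; whether the returns are then scaled to percent, as the units of Table 1 suggest, is an assumption left out of the code.

```python
import math


def log_returns(prices):
    """Eq. 1: r_t = ln(P_t / P_{t-1}) for each consecutive pair of prices."""
    return [math.log(p / p_prev) for p_prev, p in zip(prices, prices[1:])]


def squared_return_volatility(returns):
    """Eq. 2: v_t = r_t ** 2."""
    return [r * r for r in returns]


prices = [20.0, 22.0, 21.0, 25.0]  # hypothetical monthly prices, USD/bbl
r = log_returns(prices)
v = squared_return_volatility(r)
```

A series of n prices yields n - 1 returns, which is consistent with the 220 price observations and 219 return observations listed in Table 1.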
Data Mining and Preprocessing
Data mining is an essential preprocessing procedure in the development of an AI model. It can be defined as a method that mines and
explores data patterns and characteristics using AI methods, statistical techniques, and database systems. Because modeling and fore-
casting oil-price volatility is a time-series analysis application, data mining was used as an exploratory tool to identify data patterns and
[Fig. 1—Three-layer feed-forward MLP neural network structure. Input layer: supply, demand, oil inventory, economic indicators. Hidden layer: nodes connected by weights. Output layer: volatility.]
detect trends in regard to oil market volatility. We also used data mining to detect data anomalies in the oil market volatility using
association rule learning, which searches for relationships between variables, together with visualization and reporting of the data summary. In
addition, data mining was used for feature selection and variable screening, and was used in conjunction with other modeling applications
for data subsampling. Variable normalization and transformation techniques were also used in the procedure for
data preprocessing.
I. Data Warehousing/Preparation
   Data collection
   Data quality control
II. Data Mining/Preprocessing
   Data exploration
   Transformation (derivatives, logarithmic, time lags)
   Normalization (mean/standard deviation, minimum/maximum)
   Partitioning sets (training, validation, and testing)
   Sampling methods (70-15-15% rule, bootstrap, K-mean, random)
III. Feature Selection and Variables Screening
   Genetic algorithm (GA)
   Principal-component analysis (PCA)
   Forward and backward stepwise selections
   Sensitivity analysis
IV. Model Design
   Type (classification, regression, time series)
   Architecture (MLP, RBF, GRNN, PNN)
   Learning algorithm (BP, GD, CG, QBP)
   Number of layers (input, hidden, and output)
   Number of nodes in each layer
   Transfer/activation functions (logistic, arctan, identity)
   Convergence/stopping criteria (error tolerance, no. of epochs)
V. Model Training and Validation
   Is training successful? (No: adjust model parameters. Yes: continue.)
VI. Model Performance Evaluation
   Generalization attributes
   Statistical error analysis
   Graphical error analysis
VII. Model Testing
   Are results satisfactory? (No: go to VIII. Yes: go to IX.)
VIII. Modify, Optimize, and Fine-Tune Parameters
IX. Post-Processing of Model Results
Fig. 2—Framework, workflow, and template of GANNATS model development methodology (adapted from Al-Fattah and
Al-Naim 2009). BP = back-propagation algorithm; CG = conjugate gradient algorithm; GD = gradient descent algorithm; GRNN =
generalized-regression neural network; QBP = quick-back propagation algorithm; PNN = probabilistic neural network; RBF =
radial-basis function.
Transformation. We found that time-series ANN models performed better with normally distributed data and not seasonally adjusted
data (Al-Fattah and Startzman 2001, 2003). A time series that exhibits trends or periodic variations is nonstationary,
meaning that the mean and variance of the data are not constant over time. A transformation technique was used to remove
the trends and periodic variations, making it easier for the neural network model to interpret the input data, to search more efficiently
for relationships between variables, and to perform the training process quickly. Transformation with functional links can take various
forms, including each variable’s derivatives, natural logarithm, time lags, growth rate, and adjustment to per capita terms. A first-
derivative transformation can remove the trend in each input variable, thus reducing the serial correlation and the multicollinearity
among the input variables. This step helps the time series to maintain a constant mean and variance over time, rendering it stationary—
a necessary requirement for econometric modeling of time-series data. Experience with and understanding of econometric modeling in
time-series analysis were found to be beneficial in the development of the GANNATS models.
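Two of the transformations named above, the first derivative (approximated here by a first difference) and the natural logarithm, can be sketched as follows. The function names and the toy series are hypothetical, and the exact functional links used in the study are not specified here.

```python
import math


def first_difference(series):
    """Discrete first derivative: x_t - x_{t-1}. Removes a linear trend."""
    return [b - a for a, b in zip(series, series[1:])]


def log_transform(series):
    """Natural logarithm of each (strictly positive) value."""
    return [math.log(x) for x in series]


# Hypothetical series with a linear trend: the difference is constant.
trend = [100.0 + 2.0 * t for t in range(6)]
diff = first_difference(trend)

# Hypothetical series that doubles each month: the log growth rate is constant.
doubling = [1.0, 2.0, 4.0, 8.0]
growth = first_difference(log_transform(doubling))
```

In both toy cases the transformed series has a constant mean, which is the property the transformation step is meant to restore.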
Normalization. Normalization standardizes the possible numerical range that the input data can take, thus preventing the network
from becoming biased to large numerical values over smaller ones. Normalization was first applied and recommended by Al-Fattah and
Startzman (2001, 2003) to predict natural-gas supply using AI. Each input and output variable was normalized using the mean/standard-
deviation normalization method. Table 1 presents a summary of data statistics of WTI crude futures prices for inflation-adjusted vs. nor-
malized vs. return prices.
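A minimal sketch of mean/standard-deviation normalization follows. It assumes the sample standard deviation (n - 1 denominator) is used, which is an assumption rather than a detail stated in the text; the function names are hypothetical. The numerical check uses the price mean (45.36) and standard deviation (29.65) reported in Table 1.

```python
import statistics


def zscore(values):
    """Normalize a series to zero mean and unit (sample) standard deviation."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation, n - 1 denominator
    return [(v - mean) / std for v in values], mean, std


def zscore_with(value, mean, std):
    """Normalize a single value with a known mean and standard deviation."""
    return (value - mean) / std


z, m, s = zscore([1.0, 2.0, 3.0])

# Check against Table 1 (observed prices: mean 45.36, standard deviation 29.65).
norm_min = round(zscore_with(11.35, 45.36, 29.65), 2)     # observed minimum price
norm_max = round(zscore_with(133.88, 45.36, 29.65), 2)    # observed maximum price
norm_median = round(zscore_with(31.19, 45.36, 29.65), 2)  # observed median price
```

The three checks give -1.15, 2.99, and -0.48, which agree in magnitude with the normalized price statistics in Table 1.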
Feature Selection and Variables Impact Analysis. One of the tasks in the AI model design is to decide which of the available variables
to use as inputs to the model. The only guaranteed method for selecting the best input set is to train the networks with all the possible
input sets and architectures. In practice, this is infeasible when there is a significant number of potential input
variables. It becomes more problematic when multicollinearity exists among some of the input variables, which means that any of several
subsets of variables might be sufficient.
Overlearning or overfitting occurs when an unsuitable set of input variables (too many or too few) degrades the performance of the
AI model, causing it to memorize and not generalize the data structure. Overlearning can cause the network to perform very
well with the training data set, but poorly with the testing set. The performance of the network model can be improved by selecting the
significant variables and reducing the redundant ones, leading to generalization (not memorization) of the network model. There are
highly sophisticated algorithms that determine the selection of significant input variables. These techniques include the GA, forward
and backward stepwise algorithms, and principal-component analysis (PCA) (Goldberg 1989; Hill and Lewicki 2007).
The GA is an optimization algorithm that can efficiently search over binary strings by processing an initially random population of
strings using an artificial system that mimics natural selection. GA is, therefore, an efficient advanced technique for
                        Observed Data         Normalized Data
Statistic               Prices     Return     Prices     Return
Mean                    45.36      0.88       0.00       0.00
Standard error          2.00       0.56       0.07       0.07
Median                  31.19      1.71       -0.48      0.10
Standard deviation      29.65      8.29       1.00       1.00
Sample variance         879.01     68.67      1.00       1.00
Kurtosis                0.25       1.96       0.25       1.96
Skewness                0.89       0.81       0.89       0.81
Minimum                 11.35      -33.20     -1.15      -4.11
Maximum                 133.88     20.41      2.99       2.36
Observations            220        219        220        219
Table 1—Data statistics of WTI crude futures prices (USD/bbl) and return (%).
[Fig. 3—Monthly WTI crude-oil futures price (1994–2012) for first-month contract. Axes: price (USD/bbl, 0 to 160) vs. year (1994 to 2012). Annotated events: 1998 Asian Financial Crisis; 2001 9/11 Terrorist Attacks; 2003 Kuwait’s Liberation War; 2008–2009 World Financial Crisis.]
identifying significant variables within a large number of variables, and offers a valuable verification within a small number of variables.
In particular, GA recognizes interdependencies between variables located close together on the masking strings. It can also detect
subsets of variables that were not revealed by other methods. GA composes hundreds to thousands of combinations of input variables,
adding or removing input variables along the way, to efficiently reach the optimal variable selection. The GA method used to be time
consuming, particularly for a large number of variables, typically requiring the training and testing of many thousands of networks
(Goldberg 1989); however, with current advancements in computing, the speed of the GA method has
improved dramatically, making it an attractive and viable solution.
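The masking-string idea can be sketched with a small GA over binary masks, where a 1 means the variable is kept. This is an illustrative sketch only: the tournament selection, one-point crossover, bit-flip mutation, elitism, and all parameter values are assumptions, and `toy_fitness` is a hypothetical stand-in for the network training and testing that would score each mask in practice.

```python
import random


def genetic_feature_selection(n_features, fitness, pop_size=30, generations=40,
                              crossover_rate=0.8, mutation_rate=0.02, seed=0):
    """Search binary masks (1 = keep variable) for the highest fitness."""
    rng = random.Random(seed)
    # Start from the full variable set plus random masks.
    population = [[1] * n_features]
    population += [[rng.randint(0, 1) for _ in range(n_features)]
                   for _ in range(pop_size - 1)]
    best = max(population, key=fitness)

    def tournament(pop):
        # Pick two masks at random and keep the fitter one.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = [best[:]]  # elitism: the best mask survives unchanged
        while len(children) < pop_size:
            p1, p2 = tournament(population), tournament(population)
            child = p1[:]
            if rng.random() < crossover_rate:
                cut = rng.randint(1, n_features - 1)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            # Bit-flip mutation adds or removes individual variables.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = children
        best = max(population, key=fitness)
    return best


# Hypothetical fitness: reward three "informative" variables, penalize mask size.
INFORMATIVE = {0, 3, 7}


def toy_fitness(mask):
    hits = sum(mask[i] for i in INFORMATIVE)
    return hits - 0.1 * sum(mask)


best_mask = genetic_feature_selection(10, toy_fitness)
selected = [i for i, g in enumerate(best_mask) if g == 1]
```

Because the full-variable mask is seeded into the initial population and the best mask is always carried forward, the returned mask never scores worse than using all variables.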
Forward and backward stepwise algorithms usually run much faster than the GA if there is a reasonable number of variables. Both
techniques, forward and backward stepwise selections, are equally effective if there are not too many complex interdependencies
between the variables. The forward stepwise input-selection technique operates by adding variables one at a time, while the backward
stepwise input-selection method starts with the complete set of variables and then proceeds to remove one variable at a time (Hill and
Lewicki 2007). Another approach to dimensionality reduction is PCA, which can be represented as a linear network. PCA can often
extract a small number of components from fairly high-dimensional original data while preserving the integral structure of the data.
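The forward stepwise procedure described above can be sketched as a greedy loop that adds one variable at a time and stops when no addition improves the score. The function name, the stopping rule, and the toy `score` function are hypothetical; in practice the score would come from a fitted model.

```python
def forward_stepwise(n_features, score):
    """Greedy forward selection: add the single best variable each round."""
    selected = []
    best_score = score(selected)
    while True:
        candidates = [(score(selected + [j]), j)
                      for j in range(n_features) if j not in selected]
        if not candidates:
            break
        cand_score, cand = max(candidates)
        if cand_score <= best_score:  # no remaining variable improves the score
            break
        selected.append(cand)
        best_score = cand_score
    return selected, best_score


# Hypothetical score: reward three "informative" variables, penalize set size.
INFORMATIVE = {0, 3, 7}


def score(subset):
    return sum(1 for j in subset if j in INFORMATIVE) - 0.1 * len(subset)


chosen, final_score = forward_stepwise(10, score)
```

The backward variant mirrors this loop: it starts from the full set and removes the variable whose removal improves the score most.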
GA, forward stepwise, and backward stepwise techniques were used to identify the significant input variables in the neural network
among a total of 58 input variables. These techniques yielded almost identical results with a few slight differences. While implementing
the feature selection techniques, GA was used as a benchmark to eliminate the redundant input variables, thus reducing the number of
input variables from 58 to 20 and achieving a 66% reduction of total initial input variables. Therefore, the best 20 input variables that
contributed significantly to the model’s performance were retained for use in the model development. The F-value, $R^2$, and p-value
were statistical measures used to determine the significance of each selected input variable as a predictor of oil-price volatility. All
these statistics showed similar results but with differing rankings; however, the F-value was selected as a benchmark. The F-value
statistic measured the significant contribution of a given input variable to the model’s overall performance relative to other input variables.
Input variables with an F-value of unity or greater were retained. It
should be noted that all the input and output variables were normalized.
The final results of the features selection procedure are presented by the variables impact plot (VIP) shown in Fig. 4, depicting the
significance and impact of the selected input variables on oil-price volatility. The variables depicted by the VIP were normalized and
transformed with functional links, and were expressed as $X_1$, $X_2$, etc. Analyses showed that the optimal predictors selected for the WTI
crude-oil-price volatility were the US oil inventory change; US oil production; OPEC oil production; OECD inventory; US GDP; oil
consumption of Russia, India, and China; OECD oil consumption; and the OPEC oil spare capacity. The results of the input-parameter
selection showed that the US oil inventory change was the most significant in the GANNATS model, followed by US oil production
and OPEC oil production.
Model Design
This section discusses the design aspects that were considered when selecting the neural-network architecture. The neural-network
architecture determines how the nodes are interconnected within the network and specifies the type of learning rules
that might be used. The MLP (Azoff 1994; Trippi and Turban 1996) is the most commonly used architecture and was found suitable for
this study. The network used in this study was based on the quasi-Newton algorithm, a type of back-propagation learning; back-propagation
is the most widely recognized and used supervised-learning algorithm (Azoff 1994). The fundamental structure of the network
consists of an input layer, one hidden layer, and one output layer. Fig. 1 shows the architecture of a three-layered feed-forward
neural network. The layers are fully connected: each node of the input layer is connected to each hidden-layer
node, and each hidden-layer node is connected to each output-layer node. Transfer or activation functions, such as sigmoid,
arctan, exponential, and identity, act on the value returned by the input functions. The nonlinear transfer functions introduce nonlinearity
into the neural network, enriching its representational capacity. The sigmoid function was found to work well and was used for
this application. Networks with one, two, and three hidden layers were tested. The best results were achieved using one hidden
layer, maintaining simplicity, reliability, and accuracy consistent with the conventional wisdom of AI modeling and forecasting development
strategies.
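A forward pass through a fully connected three-layer network of this kind can be sketched as follows. The layer sizes (20 inputs, 14 hidden nodes, 1 output) are those reported in the Results section; the random initial weights, the zero biases, and the identity transfer function on the output node are assumptions made only for illustration.

```python
import math
import random


def sigmoid(z):
    """Logistic transfer function."""
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (an assumed initialization)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(x, hidden, output):
    """Input -> sigmoid hidden layer -> linear (identity) output layer."""
    w_h, b_h = hidden
    w_o, b_o = output
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(w_h, b_h)]
    return [sum(w * hi for w, hi in zip(row, h)) + b
            for row, b in zip(w_o, b_o)]


rng = random.Random(0)
hidden = init_layer(20, 14, rng)   # every input node feeds every hidden node
output = init_layer(14, 1, rng)    # every hidden node feeds the output node
x = [0.0] * 20                     # one normalized input vector (all zeros here)
y_hat = forward(x, hidden, output)

# Number of adjustable weights and thresholds in a 20-14-1 network.
n_params = 20 * 14 + 14 + 14 * 1 + 1
```

Under these assumptions a 20-14-1 network has 309 adjustable parameters, which is worth comparing with the number of monthly observations available for training.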
[Fig. 4—VIP showing significant inputs selected for WTI oil-price-volatility model. Axes: input variable (X1 to X20) vs. importance (F-value, 0 to 3).]
Model Development
In this study, the data were partitioned into three subsets: the training set (70%, January 1994–October 2006), the verification or valida-
tion set (15%, November 2006–July 2009), and the test set (15%, August 2009–April 2012). This was the optimal apportioning of the
data; however, a data apportioning of 80/10/10% was also possible. In the training process, the network was repeatedly exposed to
input data, and the weights and thresholds of the post-synaptic potential function were adjusted using the quasi-Newton training algorithm
until the network correctly predicted the output by satisfying the convergence criteria. The convergence criteria for the trained
network were set at a residual error of $1.0 \times 10^{-6}$ or less, or at maximum epochs or iterations of 1,000, whichever occurred first. Gradient
descent (GD) and conjugate gradient (CG) training algorithms were also attempted.
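The 70/15/15% apportioning can be checked with a short sketch that splits the 220 monthly observations in time order and maps index positions back to calendar months. The function names are hypothetical, and the assumption is that index 0 corresponds to January 1994.

```python
def chronological_split(n_obs, train_frac=0.70, valid_frac=0.15):
    """Split observation indices in time order into training, validation, test."""
    n_train = round(n_obs * train_frac)
    n_valid = round(n_obs * valid_frac)
    train = range(0, n_train)
    valid = range(n_train, n_train + n_valid)
    test = range(n_train + n_valid, n_obs)
    return train, valid, test


def month_label(index, start_year=1994, start_month=1):
    """Map an observation index to (year, month), index 0 = January 1994."""
    months = (start_month - 1) + index
    return (start_year + months // 12, months % 12 + 1)


train, valid, test = chronological_split(220)
boundaries = {
    "train": (month_label(train[0]), month_label(train[-1])),
    "valid": (month_label(valid[0]), month_label(valid[-1])),
    "test": (month_label(test[0]), month_label(test[-1])),
}
```

The split gives 154, 33, and 33 observations, and the boundary months reproduce the date ranges quoted above.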
The overall error was computed for the verification data subset. The verification data act as a standard that takes no part in the adjust-
ment of weights and thresholds during training, but the network’s performance was continually checked against this subset during train-
ing. The training was stopped when the error for the verification data stopped decreasing or started to increase. Use of the verification
subset of data is important because, with unlimited training, the neural network usually starts to overlearn the training data, thus leading
to the network’s memorization problem. The use of a verification subset to terminate training at a point when the generalization potential
is optimal is a critical consideration in training neural networks. A third subset of data, the testing set, served as an additional independent
check on the generalization capabilities of the neural network, and acted as a blind test of its performance and accuracy.
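The stopping rule based on the verification subset can be sketched with a one-parameter toy problem. Plain gradient descent is used here in place of the quasi-Newton algorithm, and the two quadratic error functions, the learning rate, and the `patience` value are hypothetical; only the 1,000-epoch cap is taken from the text.

```python
def train_with_early_stopping(w0, grad_train, val_error, lr=0.1,
                              max_epochs=1000, patience=3):
    """Gradient descent on the training error, stopped by the verification error."""
    w = w0
    best_w, best_err = w, val_error(w)
    epochs_without_improvement = 0
    epoch = 0
    for epoch in range(1, max_epochs + 1):
        w = w - lr * grad_train(w)           # weight update uses training data only
        err = val_error(w)                   # verification data only monitor progress
        if err < best_err:
            best_w, best_err = w, err
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break                        # verification error stopped decreasing
    return best_w, best_err, epoch


# Toy problem: training error (w - 2)^2 is minimized at w = 2,
# while verification error (w - 1.5)^2 is minimized at w = 1.5.
best_w, best_err, stopped_epoch = train_with_early_stopping(
    w0=0.0,
    grad_train=lambda w: 2.0 * (w - 2.0),
    val_error=lambda w: (w - 1.5) ** 2,
)
```

In this toy case the training error keeps falling as the weight approaches 2, but the weight that is kept is the one with the lowest verification error, and training halts at epoch 9, well before the epoch cap.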
The prediction model was constructed so that the input-variable data from the previous month were used to forecast the next
month’s WTI futures price volatility. Selection of the optimal performing model was made on the
basis of the generalization and statistical indicators of the model, which will be discussed in the next section.
Results
The oil-market-volatility GANNATS hybrid model developed in this study was successfully trained, verified, and tested for adequate
predictions. The hybrid model was optimally developed with an MLP network architecture, the quasi-Newton training algorithm, the
sigmoid transfer function, and three layers (a 20-node input layer, a 14-node hidden layer, and a 1-node output layer). The forecasting
model was based on the time-series data of past monthly time lags (i.e., $t-1$, $t-2$, etc.) of input variables to forecast the future monthly
timesteps (i.e., $t+1$, $t+2$, etc.) of the predictor of oil-price volatility.
Feature selection and variables-impact-analysis methods were used to identify significant variables and remove redundant ones,
resulting in a 66% reduction of total initially selected variables. The most influential factors that impacted the model of WTI oil volatility
were the US oil inventory change; US oil production; OPEC oil production; OECD inventory; US GDP; oil consumption of Russia,
China, and India; and the OPEC spare capacity (Fig. 4). Oil supply disruptions and global geopolitical events in unstable producing
countries around the world were also believed to be significant drivers of oil prices and volatility. Quantifying these exogenous factors
for inclusion in the model would improve and increase the certainty of the oil-price-volatility predictions.
Evaluation of the model’s performance by means of statistical key performance indicators (KPIs), graphical error analysis, and the
generalization attribute was used to examine and assess the adequacy and prediction accuracy of the GANNATS hybrid model. The
results of the statistical and graphical error analyses, presented in the next section, showed that the model had a good generalization
attribute with excellent prediction accuracy for oil-price volatility and its direction of movement.
The main results of the GANNATS model for the WTI oil market volatility are shown by Figs. 5 and 6. Fig. 5 shows the
performance of the observed WTI oil price returns compared to that predicted by the model, while Fig. 6 shows the excellent
performance of the model predictions of the price volatility. Analyses of the results showed that the predictions made by the model
adequately matched the observed historical oil-price volatility. In addition, the hybrid model captured the directions of the price returns
and volatility, whether upward or downward. The model also accurately captured the negative oil price shock as a result of the
2008 Financial Crisis, as well as other economic and geopolitical events. The model also demonstrated its capability to predict the
direction of the WTI oil-price volatility during the forecasting (testing) phase of model development, as shown in Figs. 5 and 6.
As previously mentioned, the scope of this study was to develop an advanced analytic and predictive model to describe the behavior,
capture the dynamics, and satisfactorily predict the direction of oil-price volatility. The GANNATS model was successfully developed
as an oil-market-volatility forecaster and as a short-term predictive tool for the direction of oil-price volatility. Development of this
system is beneficial for oil producers, consumers, investors, and traders alike.
[Figure 5: observed and predicted returns. Axes: Normalized Price Returns (%) vs. Time (months). Data partitions: Estimation (training), Verification (validation), Forecasting (testing). Annotated events: 1998 Asian Financial Crisis; 2001 9/11 Terrorist Attacks; 2003 Kuwaits Liberation War; 2008–2009 Financial Crisis.]
Fig. 5: Results of prediction performance of GANNATS model for WTI futures price return.
REE195584 DOI: 10.2118/195584-PA Date: 19-July-19 Stage: Page: 822 Total Pages: 10
ID: jaganm Time: 17:09 I Path: S:/REE#/Vol0 0000/190012/Comp/APPFile/SA-REE#190012
822 August 2019 SPE Reservoir Evaluation & Engineering
Performance Evaluation
Generalization of the Model. Critical problems involved in building an ANN or an ML model include overfitting, overlearning, and/or
memorization—whereby the model tends to produce excellent results during the training stage but performs poorly in the testing stage.
These problems can be caused by insufficient data, inclusion of redundant and insignificant input variables, discounting the verification
or validation subset in data partitioning, improper setting of the hidden layer, and/or inappropriate network configuration. An adequate
AI prediction model is characterized and evaluated by its generalization attributes. Using the concept of ensemble modeling also
improves the generalization property of the network model. An ensemble model is a combination or an average of the optimal perform-
ing models that have different parameters of network configurations. In most cases, an ensemble model provides better results of reliabil-
ity and forecasting accuracy than an individual model.
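The averaging form of an ensemble described above can be sketched in a few lines. This is a minimal illustration only, assuming each member network has already produced a forecast series of equal length; the function name `ensemble_average` and the sample numbers are hypothetical and are not taken from the GANNATS implementation.

```python
def ensemble_average(member_predictions):
    """Average the forecasts of several models, time step by time step.

    member_predictions: list of equal-length lists, one list per member model.
    Returns a single list holding the mean forecast at each time step.
    """
    if not member_predictions:
        raise ValueError("at least one member model is required")
    length = len(member_predictions[0])
    if any(len(series) != length for series in member_predictions):
        raise ValueError("all member forecasts must have the same length")
    n_members = len(member_predictions)
    # zip(*...) groups the forecasts of all members for the same time step
    return [sum(step) / n_members for step in zip(*member_predictions)]


# Hypothetical forecasts from three networks with different configurations
members = [
    [1.0, 2.0, 3.0],
    [2.0, 2.0, 5.0],
    [3.0, 2.0, 4.0],
]
print(ensemble_average(members))
```

A weighted combination is an equally plausible reading of "a combination or an average"; the unweighted mean is used here only because it is the simplest case.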
The generalization property of the model is characterized by errors within the training and testing data sets that decrease as the number of iterations increases, by model residuals that are approximately normally distributed around zero, and by statistical errors within the testing subset that are not larger than the errors within the training subset. Analysis of the GANNATS model showed that the error within the testing data set decreased as the number of training epochs increased. In addition, the residual histogram of the model, shown in Fig. 7, indicated that the residuals were approximately normally distributed. Fig. 8 is a normal probability plot that supports the approximate normality of the model residuals; the straight line fitted to the plotted points has R² = 0.9742. The Jarque-Bera (JB) test is another commonly used statistic for normality. For the residuals of this model, JB = 21.02 with p-value = 0.00003 (n = 219, skewness = 0.2425, kurtosis = 1.438, and two degrees of freedom). Because this p-value is below 0.01, the JB test rejects the null hypothesis of normality at the 1% level of significance; the departure from strict normality comes mainly from the kurtosis term rather than from the small skewness. The graphical evidence and the error trends nevertheless indicate that the model maintained a good generalization attribute.
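The reported JB statistic and p-value can be reproduced from the reported sample size, skewness, and kurtosis. The sketch below uses the standard JB formula, JB = (n/6)(S² + K²/4), and assumes that the reported kurtosis of 1.438 is the excess kurtosis (kurtosis minus 3); that assumption is not stated in the text, but it is the reading under which the reported JB value is recovered. With two degrees of freedom, the chi-square tail probability reduces to exp(-JB/2). The function name is illustrative.

```python
import math


def jarque_bera_from_moments(n, skewness, excess_kurtosis):
    """Return (JB statistic, p-value) from sample size and sample moments.

    Assumes excess_kurtosis is kurtosis minus 3 (an assumption about the
    reported value). For 2 degrees of freedom the chi-square survival
    function is exactly exp(-x / 2).
    """
    jb = n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


jb, p_value = jarque_bera_from_moments(n=219, skewness=0.2425, excess_kurtosis=1.438)
print(f"JB = {jb:.2f}, p-value = {p_value:.5f}")  # JB = 21.02, p-value = 0.00003
```

Under the usual decision rule, a p-value below the chosen significance level (here 0.01) leads to rejection of the null hypothesis of normality at that level.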
Error Analysis. Statistical and graphical error analyses were used to evaluate and assess the performance and accuracy of the prediction model. The statistical KPIs used in the error analysis were mean relative percentage error ($E_r$), mean absolute percentage error ($E_a$), and root mean squared error ($E_{rms}$). The formulas for these commonly used statistical error indices can be found in the literature (Hill and Lewicki 2007).
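For orientation, the three indices can be sketched as follows. The paper does not state the formulas, so the definitions below are assumptions based on common usage: the percentage deviation is taken as 100 × (observed − predicted) / observed, $E_r$ is its mean, $E_a$ is the mean of its absolute value, and $E_{rms}$ is the root mean square of the percentage deviations (consistent with the unit of % given in the Nomenclature). The sign convention and the function name are illustrative and may differ from those used in the study.

```python
import math


def error_kpis(observed, predicted):
    """Return (E_r, E_a, E_rms) in percent under the assumed definitions.

    Observed values must be nonzero because each deviation is expressed
    relative to the observed value.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    # Percentage deviation of each prediction from its observed value
    deviations = [100.0 * (o - p) / o for o, p in zip(observed, predicted)]
    n = len(deviations)
    e_r = sum(deviations) / n                              # mean relative error
    e_a = sum(abs(d) for d in deviations) / n              # mean absolute error
    e_rms = math.sqrt(sum(d * d for d in deviations) / n)  # root mean squared error
    return e_r, e_a, e_rms


# Hypothetical data, not taken from the study
e_r, e_a, e_rms = error_kpis([100.0, 200.0, 400.0], [110.0, 190.0, 400.0])
print(f"E_r = {e_r:.3f}%, E_a = {e_a:.3f}%, E_rms = {e_rms:.3f}%")
```

Because positive and negative deviations cancel in $E_r$, a value near zero indicates little systematic bias, whereas $E_a$ and $E_{rms}$ measure the typical size of the deviations.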
[Figure 6: observed and predicted volatility. Axes: Normalized Price Volatility vs. Time (months).]
Fig. 6: Results of GANNATS model for WTI futures price volatility, expressed as a squared return.
[Figure 7: residual histogram with cumulative curve. Axes: Frequency and Cumulative % vs. Residual.]
Fig. 7: Histogram of residual distribution of GANNATS model of WTI crude futures price returns.
Directional Predictability Indicator (DPI). Oil market practitioners measure a model’s prediction accuracy and its tendency to pre-
dict the direction of oil market volatility—either upward, stagnant, or downward. To measure the performance of the prediction of oil-
price-volatility direction, we modified the mean directional accuracy (MDA) (Schnader and Stekler 1990; Pesaran and Timmermann
2004) to a more representative measure of the direction of predictions, referred to as the DPI. Using the concept of the random-walk
process between the observed and forecast models, we defined DPI by the following:
$$\mathrm{DPI} = \frac{1}{N}\sum_{t=1}^{N} S_t, \qquad (3)$$
where
$$S_t = \begin{cases} 1, & \text{if } (y_t - y_{t-1})(\hat{y}_t - \hat{y}_{t-1}) > 0 \quad \text{(correct; both ups or both downs)} \\ 0, & \text{if } (y_t - y_{t-1})(\hat{y}_t - \hat{y}_{t-1}) < 0 \quad \text{(false; either up and down or down and up)}, \end{cases} \qquad (4)$$
and $y$ is the measured or observed target value, $\hat{y}$ is the predicted output value, $t$ is the trading month, and $N$ is the total number of predicted cases (not the number of measured cases). $S_t$ represents the classification of the directional prediction, whether it was correct or false on the basis of satisfying the specified condition. DPI is a statistical index used to measure the capability and performance of the GANNATS model to predict the direction of the forecasts (upward, stagnant, or downward) compared to the actual realized direction. The higher the DPI value (the closer to unity), the higher the prediction accuracy for the direction of the forecast. A lower DPI value (closer to zero) indicates poor prediction accuracy for the direction of the forecast.
DPI states that if $(y_t - y_{t-1}) > 0$ or if $(\hat{y}_t - \hat{y}_{t-1}) > 0$, then the term is positive and the direction is upward. If $(y_t - y_{t-1}) < 0$ or if $(\hat{y}_t - \hat{y}_{t-1}) < 0$, then the term is negative, indicating a downward direction. If both terms of $S_t$ in Eq. 4 agree in sign [i.e., positive and positive (upward direction), or negative and negative (downward direction)], then the predicted direction of the time series is correct, whether upward or downward. Hence, the direction of the target variable is classified as correctly predicted. If the value at the current timestep equals the value at the previous timestep in either term of $S_t$ (i.e., $y_t = y_{t-1}$ or $\hat{y}_t = \hat{y}_{t-1}$), then the product in Eq. 4 is zero, indicating that the direction was stagnant. Otherwise, the two terms have opposite signs and their product is negative, indicating that the direction was incorrectly predicted by the model. DPI differs from MDA because the MDA defines $S_t$ in terms of the product $(y_t - y_{t-1})(\hat{y}_t - y_{t-1})$.
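The DPI of Eqs. 3 and 4 and its difference from the MDA can be illustrated with a short sketch. Two choices below are assumptions rather than statements from the text: a stagnant step (zero product) is scored as 0, because Eq. 4 assigns a score only to strictly positive or strictly negative products, and N is taken as the number of consecutive month pairs available in the series. The function names and the sample series are hypothetical.

```python
def dpi(observed, predicted):
    """Directional predictability indicator (Eqs. 3 and 4).

    A step scores 1 when the observed change and the predicted change
    (each relative to its own previous value) have the same sign.
    Assumption: a zero product (stagnant step) scores 0.
    """
    scores = [
        1 if (observed[t] - observed[t - 1]) * (predicted[t] - predicted[t - 1]) > 0 else 0
        for t in range(1, len(observed))
    ]
    return sum(scores) / len(scores)


def mda(observed, predicted):
    """Mean directional accuracy: the predicted change is measured
    from the previous observed value rather than the previous prediction."""
    scores = [
        1 if (observed[t] - observed[t - 1]) * (predicted[t] - observed[t - 1]) > 0 else 0
        for t in range(1, len(observed))
    ]
    return sum(scores) / len(scores)


# Hypothetical monthly series, not taken from the study
y_obs = [1.0, 2.0, 1.5, 3.0, 3.5]
y_hat = [1.1, 1.8, 1.9, 2.5, 2.4]
print(dpi(y_obs, y_hat))  # 0.5
print(mda(y_obs, y_hat))  # 0.75
```

In this example the two measures disagree at the second step: the prediction rises from 1.8 to 1.9 while the observation falls, so DPI scores a miss, yet the prediction of 1.9 lies below the previous observed value of 2.0, so MDA scores a hit.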
Table 2 presents the results of the statistical analysis for the oil-market-volatility model developed in this study. The results are reported for the training, validation, and testing sets, as well as the complete data set. The statistical measures, or KPIs, indicated excellent accuracy for all the data subsets. The $E_r$ values were –0.0729 for the testing set and –0.0770 for the entire data set. The $E_a$ value was 0.4105 for the testing set and 0.4672 for the entire data set. The $E_{rms}$ was 0.5247 for the testing set and 0.6199 for the complete data set. The DPI measurement for the directional accuracy of predictions also showed excellent results; the DPI was 87.9% for the testing set and 83.6% for the entire data set. This indicated that the hybrid model forecast the direction of oil-price volatility with an accuracy of approximately 88% in the testing phase, with an overall directional accuracy of approximately 84%.
[Figure 8: normal probability plot with fitted straight line, R² = 0.9742. Axes: Expected Normal Value vs. Residual.]
Fig. 8: Normal probability plot of model.
KPI/Data Set    All Data (%)    Training (%)    Validation (%)    Testing (%)
$E_r$           –0.0770         –0.0831         –0.0528           –0.0729
$E_a$           0.4672          0.5086          0.3318            0.4105
$E_{rms}$       0.6199          0.6683          0.4520            0.5247
DPI             83.6            81.0            90.9              87.9
Table 2: Key performance indicators for WTI futures price-volatility model.
Fig. 7 shows the histogram of residuals and cumulative error distribution of the model, which were evenly distributed and predomin-
antly clustered around an error of near zero. Fig. 9 is a crossplot of the observed volatility data and model predictions, which shows the
close agreement between the observed and predicted volatility by indicating that the majority of the data falls on a straight line.
Conclusions
This study developed a time-series AI model for describing and predicting the oil market volatility using a hybrid approach of GA,
ANN, and DM. The GANNATS model exhibited the capability to describe the behavior, capture the dynamics, and predict the
direction of the oil-price volatility with 88% accuracy.
The most influential factors of the WTI oil-price-volatility model are the US oil inventory change; US oil production; OPEC oil pro-
duction; OECD inventory; US GDP; oil consumption of Russia, China, and India; and OPEC spare capacity. Supply disruptions of oil
and global geopolitical events in unstable producing countries around the world are also indicative drivers of oil prices and greater vola-
tility. Quantifying these exogenous factors for inclusion in the model can improve and increase the certainty of the predictions of oil
market volatility.
The GANNATS hybrid model can be used as a risk-management tool, and as a short-term predictive tool for the direction of move-
ment of oil-price volatility. It can also be used to quantitatively examine the effects of various physical and economic factors on future
oil market volatility, to understand the effects of different mechanisms for reducing market volatility, and to recommend policy options
and programs incorporating mechanisms that can potentially lessen the market volatility. This improved method for modeling oil-price
volatility can enable experts and market analysts to empirically test new approaches to mitigating market volatility. This work can also
provide a roadmap for research to improve predictability and accuracy of energy and crude models.
The VIP is a graphical tool and an important outcome of the feature selection and variables screening to identify significant variables,
thus reducing the curse of dimensionality of the AI model.
The DPI, a more representative measure of the directional prediction accuracy of oil volatility, was introduced.
Experience shows that knowledge of econometric modeling of time-series analysis, as well as understanding the physical behavior of
the dependent variable vs. its independent input variables, can lead to the successful development of time-series AI models.
A framework (a workflow template) was constructed as a convenient tool for the development strategies of AI models.
Recommendations
The following are recommendations and issues to be addressed in future studies.
With the methodology presented, similar studies can be pursued for modeling and forecasting the price volatility of other global oil
markets, such as Brent and Dubai.
A similar work could be pursued for developing AI models for gas market volatility or other energy commodities.
At the time of this study, most of the significant factors influencing the oil market volatility had data provided monthly. Using varia-
bles with high data frequency (daily or weekly) can improve the model’s prediction performance for oil-price volatility.
The methodology presented could be used to perform further in-depth impact analysis to quantitatively evaluate the effects of various
physical and economic factors impacting the future oil market volatility.
A comparative study of oil market volatility should be conducted between the AI approach and the conventional econo-
metric models.
The developed GANNATS model should be updated periodically as new data become available, particularly once its predictions cease to be adequate.
Nomenclature
$E_a$ = mean absolute percentage error, %
$E_r$ = mean relative percentage error, %
$E_{rms}$ = root mean squared error, %
[Figure 9: crossplot. Axes: Predicted WTI Returns (normalized) vs. Actual WTI Returns (normalized).]
Fig. 9: Crossplot of WTI crude futures price returns and return model.
Acknowledgments
The author would like to thank Saudi Aramco for its support in the publication of this paper. Thanks are also extended to Fred Joutz,
James Smith, Emmanuel Ntui, and Shaikh Arifusalam for their reviews and comments. The views and opinions presented in this paper
belong solely to the author and not necessarily to Saudi Aramco.
References
Al-Fattah, S. M. and Startzman, R. A. 2001. Predicting Natural Gas Production Using Artificial Neural Network. Presented at the SPE Hydrocarbon Economics and Evaluation Symposium, Dallas, Texas, 2–3 April. SPE-68593-MS. https://doi.org/10.2118/68593-MS.
Al-Fattah, S. M. and Startzman, R. A. 2003. Neural Network Approach Predicts U.S. Natural Gas Production. SPE Prod & Fac 18 (2): 84–91. SPE-82411-PA. https://doi.org/10.2118/82411-PA.
Al-Fattah, S. M. and Al-Naim, H. A. 2009. Artificial-Intelligence Technology Predicts Relative Permeability of Giant Carbonate Reservoirs. SPE Res Eval & Eng 12 (1): 96–103. SPE-109018-PA. https://doi.org/10.2118/109018-PA.
Azoff, E. M. 1994. Neural Network Time Series Forecasting of Financial Markets. Chichester, England: John Wiley & Sons Ltd. Inc.
Energy Information Administration (EIA). 2012. U.S. Energy Information Administration Independent Statistics & Analysis. http://www.eia.doe.gov/ (accessed 15 July 2012).
Goldberg, D. E. 1989. Genetic Algorithms. Reading, Massachusetts: Addison Wesley.
Hill, T. and Lewicki, P. 2007. Statistics: Methods and Applications, digital edition. Tulsa, Oklahoma: StatSoft.
Huntington, H., Al-Fattah, S. M., Huang, Z. et al. 2013. Oil Markets and Price Movements: A Survey of Models. Social Science Research Network, USAEE Working Paper No. 13-129. http://ssrn.com/abstract=2277330 or https://doi.org/10.2139/ssrn.2277330.
Huntington, H., Al-Fattah, S. M., Huang, Z. et al. 2014. Oil Price Drivers and Movements: The Challenge for Future Research. Alternative Investment Analyst Review 2 (4): 11–28.
Kang, S. H., Kang, S. M., and Yoon, S. M. 2009. Forecasting Volatility of Crude Oil Markets. Energy Econ 31 (1): 119–125. https://doi.org/10.1016/j.eneco.2008.09.006.
Matar, W., Al-Fattah, S. M., Atallah, T. et al. 2013. An Introduction to Oil Market Volatility Analysis. OPEC Energy Rev 37 (3): 247–269. OPEC-12007. https://doi.org/10.1111/opec.12007.
Mohaghegh, S. D. 2005. Recent Developments in Application of Artificial Intelligence in Petroleum Engineering. J Pet Technol 57 (4): 86–91. SPE-89033-JPT. https://doi.org/10.2118/89033-JPT.
Mohaghegh, S. D., Al-Fattah, S. M., and Popa, A. 2011. Artificial Intelligence and Data Mining Applications in the E&P Industry, digital edition. Richardson, Texas: Society of Petroleum Engineers.
Narayan, P. and Narayan, S. 2007. Modeling Oil Price Volatility. Energy Policy 35 (12): 6549–6553. https://doi.org/10.1016/j.enpol.2007.07.020.
Pesaran, M. H. and Timmermann, A. 2004. How Costly is it To Ignore Breaks When Forecasting the Direction of a Time Series? Int J Forecast 20 (3): 411–425. https://doi.org/10.1016/S0169-2070(03)00068-2.
Poon, S. and Granger, C. 2003. Forecasting Volatility in Financial Markets: A Review. J Econ Lit 41 (2): 478–539. https://doi.org/10.1257/002205103765762743.
Regnier, E. 2007. Oil and Energy Price Volatility. Energy Econ 29 (3): 405–427. https://doi.org/10.1016/j.eneco.2005.11.003.
Sadorsky, P. 2006. Modeling and Forecasting Petroleum Futures Volatility. Energy Econ 28 (4): 467–488. https://doi.org/10.1016/j.eneco.2006.04.005.
Schnader, M. H. and Stekler, H. O. 1990. Evaluating Predictions of Change. J Bus 63 (1): 99–107. https://doi.org/10.1086/296486.
Trippi, R. R. and Turban, E. 1996. Neural Networks in Finance and Investing: Using Artificial Intelligence to Improve Real-World Performance. Chicago: Irwin Professional Publishing.
Wang, Y., Wu, C., and Wei, Y. 2011. Can GARCH-Class Models Capture Long Memory in WTI Crude Oil Markets? Econ Model 28 (3): 921–927. https://doi.org/10.1016/j.econmod.2010.11.002.
Saud M. Al-Fattah is a corporate consultant of strategy and market analysis at Saudi Aramco. He has experience in reservoir
management, energy markets and economics, oil and gas planning and development, reserves assessment, and petroleum
engineering applications. Al-Fattah has authored or coauthored three books and more than 40 peer-reviewed papers, and
holds one US patent. He holds a PhD degree from Texas A&M University and MSc and BSc degrees from King Fahd University of
Petroleum and Minerals (KFUPM), all in petroleum engineering. In addition, Al-Fattah holds an Executive MBA degree from
Prince Muhammad University, Saudi Arabia. He is a technical reviewer for SPE Reservoir Evaluation & Engineering and other
industry publications. Al-Fattah has served in several SPE volunteer activities, in local and international committees, as an Edito-
rial Review Committee member for SPE Reservoir Evaluation & Engineering, as a coauthor of an SPE digital book on artificial intel-
ligence and data mining, and as vice-chair (2006) and chair (2007) of the SPE Saudi Arabia Annual Technical Conference.
Al-Fattah also established the SPE Student Chapter at KFUPM, and served as president from 1990 to 1994.
REE195584 DOI: 10.2118/195584-PA Date: 19-July-19 Stage: Page: 826 Total Pages: 10
ID: jaganm Time: 17:09 I Path: S:/REE#/Vol00000/190012/Comp/APPFile/SA-REE#190012
826 August 2019 SPE Reservoir Evaluation & Engineering
... The GA composes hundreds to thousands of combinations of input variables, after adding or removing redundant input variables, to efficiently reach the optimal set of input variables. Further details on ANN and GA have been provided by Azoff (1994), Goldberg (1989), and Al-Fattah (2019, 2020. ...
... The purpose of this paper is to develop a rigorous data-driven model to describe, analyze, and forecast the crude oil demand for a high oil producer (Saudi Arabia) and a high oil consumer (China) using a hybrid AI approach called GANNATS. GANNATS was initiated by Startzman (2001 and2003) and formulated and developed by Al-Fattah (2019). ...
... The flowchart in Fig. 1 depicts the workflow of the development strategies of the GANNATS model applied to modeling, analyzing, and forecasting the oil demand of Saudi Arabia and China. More details of the GANNATS methodology have been provided by Al-Fattah (2019, 2020. ...
Article
This paper develops a rigorous and advanced data-driven model to describe, analyze, and forecast the global crude oil demand. The study deploys a hybrid approach of artificial intelligence techniques, namely, the genetic-algorithm, neural-network, and data-mining approach for time-series models (GANNATS). The GANNATS was developed and applied to two country cases, including one for a high oil producer (Saudi Arabia) and one for a high oil consumer (China), to develop crude oil demand forecasts. The input variables of the neural network models include gross domestic product (GDP), the country’s population, oil prices, gas prices, and transport data, in addition to transformed variables and functional links. The artificial intelligence predictive models of oil demand were successfully developed, trained, validated, and tested using historical oil-market data, yielding excellent oil demand predictions. The performance of the intelligent models for Saudi Arabia and China was examined using rigorous indicators of generalizability, predictability, and accuracy. The GANNATS forecasting models show that the crude oil demand for both Saudi Arabia and China will continue to increase over the forecast period but with a mildly declining growth, particularly for Saudi Arabia. This decreasing growth in the demand for oil can be attributed to increased energy efficiency, fuel switching, conversion of power plants from crude oil to gas-based plants, and increased utilization of renewable energy, such as solar and wind for electricity generation and water desalination. In this study, the feature engineering of variables selection techniques has been applied to identify and understand significant factors that impact and drive the crude oil demand. The proposed GANNATS methodology optimizes and upgrades the conventional process of developing oil demand forecasts. It also improves and enhances the predictability and accuracy of the current oil demand forecasting models.
... In the aspect of international oil price forecasting, many scholars have also made outstanding progress. Ali Safari et al. [23]combined the exponential smoothing model, the autoregressive integrated moving average model, and nonlinear autoregressive neural network in a structure of the state space model in 2018 to increase accuracy of forecasting. In [24], Aimei Hu built ARIMA(5,1,3) and GARCH(1,1) models in accordance with monthly data of WTI crude oil and made a forecasting of oil prices in 2012 which showed that the forecasting results' accuracy of GARCH is higher than that of the ARIMA model, and the mean relative error decreased from 8.2157% to 5.4791%, and the root mean square error decreased from 9.449168 to 7.25275. ...
Article
Full-text available
It is meaningful and of certain theoretical value for the development of economy through analyzing fluctuation rules of international oil prices and forecasting the future trend of international oil prices. By composing the autoregressive integrated moving average (ARIMA) model and the combination model of autoregressive integrated moving average model-generalized autoregressive conditional heteroskedasticity (ARIMA-GARCH) for analyzing and forecasting international oil prices, study shows that the combination model of ARIMA (1,1,0)-GARCH (1,1) is more suitable for short-term forecasting of international oil prices with higher accuracy that the MAPE of forecasting has reduced from 1.549% to 0.045% and the RMSE of forecasting has reduced from 1.032 to 0.071.
... To further improve the accuracy of traditional statistical techniques for price prediction of crude oil, artificial intelligence-based models with good generalization, self-learning ability, and memory capacity have been developed. It is seen that the prediction performance of AI-based single models such as artificial neural network (ANN) [23,24], support vector regression (SVR) [25], deep learning based on LSTM [26e28], and ANN-Fuzzy regression [29] is superior to traditional statistical models. Although the performance of AI-based models is superior to traditional statistical methods, when used as a single model, they fall short of dealing with nonlinear dynamics and chaoticity. ...
Article
Abstract Estimating the price of crude oil, which is seen as an important resource for economic development and stability in the world, is a topic of great interest by policy makers and market participants. However, the chaotic and nonlinear characteristics of crude oil time series (COTS) make it difficult to estimate crude oil prices with high accuracy. To overcome these challenges, a new crude oil price prediction model is proposed in this study, which includes the long short-term memory (LSTM), technical indicators such as trend, volatility and momentum, and the chaotic Henry gas solubility optimization (CHGSO) technique. In the proposed model, features based on trend, momentum and volatility technical indicators are utilized. The features are obtained by using the trend indicators such as exponential moving average (EMA), simple moving average (SMA) and Kaufman's adaptive moving average (KAMA), the momentum indicators such as commodity channel index (CCI), rate of change (ROC) and relative strength index (RSI), and the volatility indicators such as average true range (ATR), volatility ratio (VR) and highest high-lowest low (HHLL). These indicators are obtained separately for the West Texas Intermediate (WTI) and Brent COTS. Especially, including the volatility indicator in the model is important in terms of the robustness of the proposed model. The features based on EMA, SMA, KAMA indicators are composed by changing the period values between 3 and 10, the features based on ROC indicator is created by changing the period values between 5 and 12, and the features based on CCI, RSI, ATR, VR and HHLL indicators are formed by changing the period values between 5 and 20. The features are selected by CHGSO algorithm based on the logistic chaotic map, which is successful in avoiding local optima and balancing exploitation and exploration in the search space. 
Both Theil's U and the mean absolute percentage error (MAPE) values are utilized in the optimization algorithm as the objective function. The results show that the proposed prediction model copes with the chaoticity and nonlinear dynamics of both WTI and Brent COTS.
... This great popular effective joint approach is known as ensemble learning approach, which boosts the potency of deep learning reducing forecasting uncertainty [18]. The strengths of ensemble approach show its potentiality with reducing the over-estimation risk [19], reliable and stable accuracy of forecasting [20,21]. Ensemble forecasting approach has been successfully devoted in crude oil [22], gold market [23], energy market [24], financial market etc. [25]. ...
Article
Full-text available
The prices of agro-commodities are highly volatile. Hence it is a challenge to the farmers to ensure fair and remunerative prices of these commodities. As a result, there is a need for prediction of agro market price appropriately. The closing price prediction of one soft commodity product, Cotton 29 mm and one agro-commodity product, Guar gum are chosen. The existing reported methods exhibit poor prediction performance. To alleviate this problem, the current investigation is undertaken for better prediction of closing prices. The deep ensemble approach using convolutional neural network (CNN) and stacked autoencoder (SAE) is employed to improve the prediction performance. For the ensemble strategy, the weights are optimized using three bio-inspired techniques such as genetic algorithm (GA), particle swarm optimization (PSO) and spider monkey optimization (SMO). Eighteen attributes relating to the closing price of each products are considered as input to the proposed models. The simulation based experimental results demonstrate the following contribution of the paper. Firstly, it is observed that CNN outperforms the SAE model in terms of short range prediction and vice versa for long range prediction. Secondly, the prediction performance of all the three ensemble models has been determined. Thirdly, out of three ensemble models, ensemble-SMO (ESMO) shows the best prediction performance in terms of mean square error and coefficient of multiple determination (R2). It is then followed by ensemble-PSO and ensemble-GA respectively. The performance of proposed best ESMO is compared with the Grey wolf optimization based multiquadratic kernel KELM model (GWO-KELM) and it is observed that the proposed ESMO outperforms the GWO-KELM model.
... Particular attention is paid to the impact of artificial intelligence on productivity [10,11] and interchangeability of jobs [12][13][14][15]. Artificial intelligence has been widely used to analyse various economic concepts and problems, for example, in a city smartness analysis [16] prediction of knowledge-hiding behaviour [17] as well as modelling and forecasting [18,19]. The increased use of artificial intelligence in spatial economic analysis is based on its benefits [6,20,21], which include a faster and more accurate answer; a solution to more sophisticated challenges, as methods of artificial intelligence can handle very large quantities of indicators; modelling of dynamic indicators, as algorithms can adapt to newly submitted data and be retrained; and possibilities to predict and model values of indicators; possibilities to analyse each country individually in the context of influence of other countries. ...
Article
Full-text available
A rich volume of literature has analysed country investment attractiveness in a wide range of contexts. The research has mostly focused on traditional economic concepts—economic, social, managerial, governmental, and geopolitical determinants—with a lack of focus on the smartness approach. Smartness is a social construct, which means that it has no objective presence but is “defined into existence”. It cannot be touched or measured based on uniform criteria but, rather, on the ones that are collectively agreed upon and stem from the nature of definition. Key determinants of smartness learning—intelligence, agility, networking, digital, sustainability, innovativeness and knowledgeability—serve as a platform for the deeper analysis of the research problem. In this article, we assessed country investment attractiveness through the economic subjects’ competences and environment empowering them to attract and maintain investments in the country. The country investment attractiveness was assessed by artificial intelligence (in particular, neural networks), which has found widespread application in the sciences and engineering but has remained rather limited in economics and confined to specific areas like counties’ investment attractiveness. The empirical research relies on the case of assessing investment attractiveness of 29 European countries by the use of 58 indicators and 31,958 observations of annual data of the 2000–2018 time period. The advantages and limitations of the use of artificial intelligence in assessing countries’ investment attractiveness proved the need for soft competences for work with artificial intelligence and decision-making based on the information gathered by such research. The creativity, intelligence, agility, networking, sustainability, social responsibility, innovativeness, digitality, learning, curiosity and being knowledge-driven are the competences that, together, are needed in all stages of economic analysis.
... Furthermore, evolving modeling techniques, including artificial intelligence and advanced predictive analytics, are being developed as proved to enhance and improve the ability to model and forecast the behavior of oil market dynamics and price volatility. (Al-Fattah, 2019) The role of large oil producers on the oil market stability can't be overemphasized. Fig. 5 shows the changes of oil production of Saudi Arabia from 2001-2018, and the impact on the changes of oil prices on quarterly basis. ...
Article
The beginning of the new century was marked by another petroleum boom-and-bust cycle. Oil prices hovered around $18-20/bbl through most of the 1990s, after which crude prices collapsed to $10/bbl in 1998 and 1999. Soon thereafter, oil prices began a steady and, at times, sharp rise on the way to $147/bbl in July 2008. This climb was followed by an abrupt decline after the onset of the global “Lehman” economic crisis in September 2008, which drove the crude oil price down to as low as $32/bbl in December 2008. After a relatively swift recovery, another oil shock, the “market share” shock, took place in 2014-2016; average oil prices plunged from $108/bbl in the second quarter of 2014 to $30/bbl in the first quarter of 2016. Brent kicked off 2018 with average oil prices of $69/bbl in January and hovered around $85/bbl during October 2018; since then, however, oil prices dwindled from $80/bbl to $51/bbl by year-end 2018. The year 2019 started with Brent oil prices fluctuating in the $55-65/bbl range. The rapid increase in the oil price and its sudden and dramatic decline raise a fundamental question about the oil industry: Why is it so difficult to accurately predict the price of oil? Supply-demand balance, economic growth, oil inventories, and spare capacity are market fundamentals that drive oil prices and market dynamics. Market financialization, resource availability, technology advancements, and geopolitical events are also important drivers of oil price movements. Collaborative efforts should be geared towards: an acceptable and reasonable level of oil prices for the benefit of oil producers and consumers alike; meeting future oil demand and making adequate spare capacity available to the market; and incentivizing upstream capital investment.
Reasonable predictions of oil prices require reliable and consistent data, rigorous advanced analytical methods, intelligent forecasting tools, and a better understanding of the factors that influence oil prices and oil market conditions.
Article
Crude oil is a mixture of petroleum liquids and gases that is extracted from the ground by oil wells. It is an important source of fuel and is used in the production of several products. Given the important role the price of crude oil plays, it is extremely important for managers to predict future oil prices when making operational decisions such as when to purchase material, how much to produce, and what modes of transportation to use. The goal of this paper is to develop a forecasting model to predict oil prices that helps management reduce operational costs, increase profit, and enhance competitive advantage. We first analyze the primary theories related to oil price forecasting, followed by a review of the two main streams of forecast theory: the Target Capacity Utilization (TCU) Rule and Exhaustible Resources Theory. We implement a Target Capacity Utilization Rule recursive simulation model and test it on historical data from 1987 through 2017 to predict crude oil prices for 1991 through 2017. We tried several variations of the base model, and the best method produced MAD, MSE, MAPE and MPE of 12.676, 280.92, 0.2597, and 0.028, respectively. We further estimated monthly oil price forecasts based on the yearly forecasts from our best method. The calculated MAD, MSE, MAPE and MPE values are 5.66, 82.1163, 0.1246 and 0.038, respectively, which shows that our model is also promising at the monthly level.
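The four error measures quoted in this abstract can be illustrated with a minimal sketch. The function name `forecast_errors` is hypothetical, and the conventions used (error defined as actual minus forecast, MAPE and MPE expressed as fractions) are assumptions, since the abstract does not state them.

```python
from typing import Dict, Sequence


def forecast_errors(actual: Sequence[float], forecast: Sequence[float]) -> Dict[str, float]:
    """Return MAD, MSE, MAPE and MPE for paired actual/forecast values.

    Assumed conventions (not given in the abstract):
    error = actual - forecast; MAPE and MPE are fractions, not percentages.
    Actual values must be non-zero for MAPE and MPE to be defined.
    """
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    return {
        "MAD": sum(abs(e) for e in errors) / n,                           # mean absolute deviation
        "MSE": sum(e * e for e in errors) / n,                            # mean squared error
        "MAPE": sum(abs(e / a) for e, a in zip(errors, actual)) / n,      # mean absolute percentage error
        "MPE": sum(e / a for e, a in zip(errors, actual)) / n,            # mean percentage error (signed)
    }
```

Because MPE keeps the sign of each error, over- and under-predictions can cancel, so it is best read as a bias indicator alongside MAPE rather than as an accuracy measure on its own.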
Article
The prediction of crude oil prices has important research significance. This paper contributes to the literature on hybrid models for forecasting crude oil prices. We apply ensemble empirical mode decomposition (EEMD) to decompose the residual term (RES), which contains complex information after variational mode decomposition (VMD), and then combine it with a kernel extreme learning machine (KELM) optimized by particle swarm optimization (PSO) to construct the VMD-RES.-EEMD-PSO-KELM model. To verify the validity of the model, this paper conducts empirical analyses of Brent crude oil and West Texas Intermediate (WTI) crude oil. The empirical results show that the proposed model improves the prediction accuracy of crude oil prices.
Article
The accurate prediction of energy price is critical to the energy market orientation, and it can provide a reference for policymakers and market participants. In practice, energy prices are affected by external factors, and their accurate prediction is challenging. This paper provides a systematic decade review of data-driven models for energy price prediction. Energy prices include four types: natural gas, crude oil, electricity, and carbon. Through the screening, 171 publications are reviewed in detail from the aspects of the basic model, the data cleaning method, and optimizer. Publishing time, model structure, prediction accuracy, prediction horizon, and input variables for energy price prediction are discussed. The main contributions and findings of this paper are as follows: (1) basic prediction models for energy price, data cleaning methods, and optimizers are classified and described; (2) the structure of the prediction model is finely classified, and it is inferred that the hybrid model and prediction architecture with multiple techniques are the focus of research and the development direction in the future; (3) root mean square error, mean absolute percentage error, and mean absolute error are the three most frequently used error indicators, and the maximum mean absolute percentage error is less than 0.2; (4) the ranges of data size and data division ratio for energy price prediction in different horizons are given, the proportion of the test set is usually in the range of 0.05-0.35; (5) the input variables for energy price prediction are summarized; (6) the data cleaning method has a more significant role in improving the accuracy of energy price prediction than the optimizer.
Article
This paper develops a novel AI and data-driven predictive model to analyze and forecast energy markets, and tests it on the gasoline demand of Saudi Arabia. The AI model is based on a genetic algorithm (GA), artificial neural network (ANN), and data mining (DM) approach for time-series (TS) analysis, referred to as GANNATS. The GANNATS predictive model was successfully designed, trained, validated, and tested using real historical market data. Results show that the model yields accurate predictions with robust key performance indicators. A double cross-validation of the model verified that Saudi Arabia's gasoline demand declined by 2.5% in 2017 from its 2016 level. The model forecasts that Saudi gasoline demand will maintain mild growth over the short-term outlook. A variable impact and screening analysis was performed to identify the factors driving gasoline demand. The recent decline in Saudi gasoline demand is primarily attributed to improvements in vehicle efficiency, the lifting of fuel price subsidies, declining population growth, and changes in consumer behavior. This paper enriches existing knowledge of best practices for forecasting domestic and global gasoline demand. In addition, the methodology presented improves on traditional econometric models and enhances the predictability and accuracy of forecasts of gasoline demand.
Article
Determination of relative permeability data is required for almost all calculations of fluid flow in petroleum reservoirs. Water-oil relative permeability data play important roles in characterizing the simultaneous two-phase flow in porous rocks and predicting the performance of immiscible displacement processes in oil reservoirs. They are used, among other applications, for determining fluid distributions and residual saturations, predicting future reservoir performance, and estimating ultimate recovery. These data are probably the most valuable information required in reservoir simulation studies. Estimates of relative permeability are generally obtained from laboratory experiments with reservoir core samples. In the absence of laboratory measurements of relative permeability, empirical correlations are usually used to estimate relative permeability data. Developing empirical correlations that give accurate estimates of relative permeability has had limited success and has proved difficult, especially for carbonate reservoir rocks. Artificial neural network (ANN) technology has proved successful and useful in solving complex structured and nonlinear problems. This paper presents a new modeling technology to accurately predict water-oil relative permeability using ANN. The ANN models of relative permeability were developed using experimental data from waterflood core-test samples collected from carbonate reservoirs of giant Saudi Arabian oil fields. Three groups of data sets were used for training, verifying, and testing the ANN models. Analysis of the results for the testing data set shows excellent agreement with the experimental relative permeability data. In addition, error analyses show that the ANN models developed in this study outperform all published correlations.
The benefits of this work include meeting the increased demand for special core analysis, optimizing the number of laboratory measurements, integrating into reservoir simulation and reservoir management studies, and providing significant savings in the cost and time of extensive laboratory work.
Article
During the 1970s, oil market models offered a framework for understanding the growing market power being exercised by major oil producing countries. Few such models have been developed in recent years. Moreover, most large institutions do not use models directly for explaining recent oil price trends or projecting their future levels. Models of oil prices have become more computational, more data driven, less structural and increasingly short run since 2004. Quantitative analysis has shifted strongly towards identifying the role of financial instruments in shaping oil price movements. Although it is important to understand these short-run issues, a large vacuum exists between explanations that track short-run volatility within the context of long-run equilibrium conditions. The theories and models of oil demand and supply that are reviewed in this paper, although imperfect in many respects, offer a clear and well-defined perspective on the forces that are shaping the markets for crude oil and refined products. The complexity of the world oil market has increased dramatically in recent years and new approaches are needed to understand, model, and forecast oil prices today. Several kinds of models have been proposed, including structural, computational, and reduced form models. Recently, artificial intelligence has also been introduced. This paper provides: (1) a model taxonomy and the uses of models, which provide the motivation for its preparation, (2) a brief chronology explaining how oil market models have evolved over time, (3) three different model types: structural, computational, and reduced form models, and (4) artificial intelligence and data mining for oil market models.
Article
The complexity of the world oil market has increased dramatically in recent years and new approaches are needed to understand, model, and forecast oil prices today. In addition to the commencement of the financialization era in oil markets, there have been structural changes in the global oil market. Financial instruments are communicating information about future conditions much more rapidly than in the past. Prices from long and short duration contracts have started moving more together. Sudden supply and demand adjustments, such as the financial crisis of 2008-2009, faster Chinese economic growth, the Libyan uprising, the Iranian nuclear standstill or the Deepwater Horizon oil spill, change expectations and current prices. Although volatility appears greater, financialization makes price discovery more robust. Most empirical economic studies suggest that fundamental values shaped expectations over 2004-2008, although financial bubbles may have emerged just prior to and during the summer of 2008. With increased price volatility, major exporters are considering ways and means to achieve more price stability to improve long-term production and consumption decisions. Managing excess capacity has historically been an important method for keeping world crude oil prices stable during periods of sharp demand or supply shifts. Building and maintaining excess capacity in current markets allows greater price stability when Asian economic growth suddenly accelerates or during periods of supply uncertainty in major producing regions. OPEC can contribute to price stability more easily when members agree on the best use of oil production capacity. Important structural changes have emerged in the global oil market after major price increases.
Partially motivated by government policies, major improvements in energy and oil efficiencies occurred after the oil price increases of the early and late 1970s, such as improved vehicle fuel efficiency, building codes, and power grids and systems. On the supply side, seismic imaging and horizontal drilling as well as favorable tax regimes expanded production capacity in countries outside OPEC. After the oil price increases of 2004-2008, investments in oil sands, deep water, biofuels and other non-conventional sources accelerated. Recent improvements in shale gas production could well be transferred to oil-producing activities, resulting in expanded oil supplies in areas recently considered prohibitively expensive. The search for alternative transportation fuels continues with expanded research into compressed natural gas, biofuels, diesel made from natural gas, and electric vehicles. Still, some aspects of the world oil market are not well understood. Despite numerous attempts to model the behavior of OPEC or its members, there exists no credible, verifiable theory about the behavior of the 50-year-old organization. OPEC has not consistently acted like a monolithic cartel, constraining supplies to raise prices. Empirical evidence suggests that members sometimes coordinate supply responses and at other times compete with each other. Supply-restraint strategies include slower capacity expansions as well as curtailed production from existing capacity. Regional political considerations and broader economic goals (beyond oil) are influential factors in a country’s oil decisions. Furthermore, the economies of OPEC members as well as their financial needs have changed dramatically since the 1970s and 1980s. This paper is a broad review of economic research and literature related to the structure and functioning of the world oil market.
The theories and models of oil demand and supply reviewed here, although imperfect in many respects, offer a clear and well-defined perspective on the forces that are shaping the markets for crude oil and refined products. Much work remains to be done if we are to achieve a more complete understanding of these forces and the trends that lie ahead. The contents that follow represent an assessment of how far we have come and where we are headed. Of course, the entire world shares a vital interest in the many benefits that flow from an efficient, well-functioning oil market. It is intended and hoped, therefore, that the discussion in this review will find a broader audience.
Article
Modelling and forecasting crude oil price volatility is crucial in many financial and investment applications. The main purpose of this paper is to review and assess the current state of oil market volatility knowledge. It highlights the properties and characteristics of oil price volatility that models seek to capture, and discusses the different modelling approaches to oil price volatility. Asymmetric response to price change, persistence and mean reversion, structural breaks, and possible market spillover of volatility are discussed. To complement the discussion, West Texas Intermediate futures price data are used to illustrate these properties using non-parametric and conditional modelling methods. The generalised autoregressive conditional heteroskedasticity-type models usually applied in the oil price volatility literature are also explored. We additionally examine the exogenous factors that may influence volatility in the oil markets.
Article
The industrial and residential market for natural gas produced in the United States has become increasingly significant. Within the past ten years the wellhead value of produced natural gas has rivaled and sometimes exceeded the value of crude oil. Forecasting natural gas supply is an economically important and challenging endeavor. This paper presents a new approach to predict natural gas production for the United States using an artificial neural network. We developed a neural network model to forecast U.S. natural gas supply to the year 2020. Our results indicate that the U.S. will maintain its 1999 production of natural gas to 2001, after which production starts increasing. The network model indicates that natural gas production will increase during the period 2002 to 2012 at an average rate of 0.5%/yr. This rate of increase will more than double for the period 2013 to 2020. The neural network was developed with an initial large pool of input parameters. The input pool included exploratory, drilling, production, and econometric data. Preprocessing the input data involved normalization and functional transformation. Dimension reduction techniques and sensitivity analysis of input variables were used to remove redundant and unimportant input parameters and to simplify the neural network. The remaining input parameters of the reduced neural network included data on gas exploratory wells, oil/gas exploratory wells, oil exploratory wells, gas depletion rate, proved reserves, gas wellhead prices, and growth rate of gross domestic product. The three-layer neural network was successfully trained with yearly data from 1950 to 1989 using the quick-propagation learning algorithm. The target output of the neural network is the production rate of natural gas. The agreement between predicted and actual production rates was excellent.
A test set, not used to train the network and containing data from 1990 to 1998, was used to verify and validate the network performance for prediction. Analysis of the test results shows that the neural network approach provides an excellent match of actual gas production data. An econometric approach, called stochastic modeling or time series analysis, was used to develop forecasting models for the neural network input parameters. A comparison between the forecasts of this study and other forecasts is presented. The neural network model can be used as a short-term as well as a long-term predictive tool for natural gas supply. The model can also be used to examine quantitatively the effects of the various physical and economic factors on future gas production.
Article
This paper investigates whether GARCH-type models can adequately capture the long memory widely observed in the volatility of WTI crude oil returns. Within this framework, we model the volatility of spot and futures returns using several GARCH-class models. Then, using two non-parametric methods, detrended fluctuation analysis (DFA) and rescaled range analysis (R/S), we compare the long memory properties of the conditional volatility series obtained from GARCH-class models to those of the actual volatility series. Our results show that GARCH-class models capture the long memory properties well for time scales larger than a year. However, for time scales smaller than a year, the GARCH-class models are misspecified.
Article
In this paper, we examine the volatility of crude oil price using daily data for the period 1991–2006. Our main innovation is that we examine volatility in various sub-samples in order to judge the robustness of our results. Our main findings can be summarised as follows: (1) across the various sub-samples, there is inconsistent evidence of asymmetry and persistence of shocks; and (2) over the full sample period, evidence suggests that shocks have permanent effects, and asymmetric effects, on volatility. These findings imply that the behaviour of oil prices tends to change over short periods of time.
Article
This article investigates the efficacy of a volatility model for three crude oil markets — Brent, Dubai, and West Texas Intermediate (WTI) — with regard to its ability to forecast and identify volatility stylized facts, in particular volatility persistence or long memory. In this context, we assess persistence in the volatility of the three crude oil prices using conditional volatility models. The CGARCH and FIGARCH models are better equipped to capture persistence than are the GARCH and IGARCH models. The CGARCH and FIGARCH models also provide superior performance in out-of-sample volatility forecasts. We conclude that the CGARCH and FIGARCH models are useful for modeling and forecasting persistence in the volatility of crude oil prices.
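For readers unfamiliar with the baseline these abstracts compare against, the standard GARCH(1,1) conditional variance recursion can be sketched as follows. This is only an illustration under stated assumptions: the function name `garch11_variance` is hypothetical, the parameters are supplied by the caller rather than estimated, and the series is initialized at the unconditional variance. It does not implement the CGARCH or FIGARCH models that the article favors.

```python
from typing import List, Sequence


def garch11_variance(returns: Sequence[float], omega: float,
                     alpha: float, beta: float) -> List[float]:
    """Conditional variances from sigma2[t] = omega + alpha*r[t-1]**2 + beta*sigma2[t-1].

    Assumptions of this sketch: parameters are given (no estimation), and
    sigma2[0] is set to the unconditional variance omega / (1 - alpha - beta).
    """
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        raise ValueError("need omega > 0, alpha >= 0, beta >= 0, alpha + beta < 1")
    if len(returns) == 0:
        return []
    sigma2 = [omega / (1.0 - alpha - beta)]  # start at the long-run variance
    for r_prev in returns[:-1]:
        # Each new variance mixes the last squared return and the last variance.
        sigma2.append(omega + alpha * r_prev ** 2 + beta * sigma2[-1])
    return sigma2
```

In this recursion the sum `alpha + beta` governs how slowly a volatility shock decays. The IGARCH case corresponds to `alpha + beta = 1`, which this sketch rejects because the unconditional variance used for initialization is then undefined.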
Article
This article applies a technique developed by Robert C. Merton (1981) and Roy D. Henriksson and Robert C. Merton (1981) for evaluating the market timing of financial managers to macroeconomic predictions of change. This methodology may be used to determine whether the predictions may be of value to the user. As an illustration, the methodology is applied to a set of real gross national product forecasts. Copyright 1990 by the University of Chicago.
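The idea behind this kind of directional evaluation can be sketched with the conditional hit rates used in Merton (1981): p1, the share of actual increases that were forecast as increases, and p2, the share of actual non-increases that were forecast as non-increases. Forecasts carry directional information when p1 + p2 exceeds 1. The sketch below computes only these rates; the function name `directional_hit_rates` and the choice to treat a zero change as "not up" are assumptions, and the formal significance test is not implemented.

```python
from typing import Sequence, Tuple


def directional_hit_rates(actual_changes: Sequence[float],
                          forecast_changes: Sequence[float]) -> Tuple[float, float, float]:
    """Return (p1, p2, p1 + p2) for paired actual and forecast changes.

    p1: fraction of actual up-moves that were forecast as up.
    p2: fraction of actual non-up moves that were forecast as non-up.
    Assumption of this sketch: a change of exactly zero counts as "not up".
    """
    if len(actual_changes) != len(forecast_changes):
        raise ValueError("inputs must have equal length")
    up_hits = up_total = down_hits = down_total = 0
    for a, f in zip(actual_changes, forecast_changes):
        if a > 0:
            up_total += 1
            up_hits += 1 if f > 0 else 0
        else:
            down_total += 1
            down_hits += 1 if f <= 0 else 0
    if up_total == 0 or down_total == 0:
        raise ValueError("need at least one up and one non-up observation")
    p1 = up_hits / up_total
    p2 = down_hits / down_total
    return p1, p2, p1 + p2
```

A forecaster who always predicts "up" scores p1 = 1 and p2 = 0, so the sum equals 1 and shows no timing value, which is why the sum is more informative than a raw hit rate.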