
Artificial Intelligence Approach for Modeling and Forecasting Oil-Price Volatility


Saud M. Al-Fattah, Saudi Aramco
Oil market volatility affects macroeconomic conditions and can unduly affect the economies of oil-producing countries. Large price
swings can be detrimental to producers and consumers, causing infrastructure and capacity investments to be delayed, employment
losses, inefficient investments, and/or the growth potential for energy-producing countries to be adversely affected. Undoubtedly,
greater stability of oil prices increases the certainty of oil markets for the benefit of oil consumers and producers. Therefore, modeling
and forecasting crude-oil price volatility is a strategic endeavor for many oil market and investment applications.
This paper focuses on the development of a new predictive model for describing and forecasting the behavior and dynamics of
global oil-price volatility. Using a hybrid approach of artificial intelligence with a genetic algorithm (GA), artificial neural network
(ANN), and data mining (DM) time-series (TS), a model referred to as GANNATS was developed to forecast the futures price volatility of West
Texas Intermediate (WTI) crude. The WTI price volatility model was successfully designed, trained, verified, and tested using historical
oil market data. The predictions from the GANNATS model closely matched the historical data of WTI futures price volatility. The
model not only described the behavior and captured the dynamics of oil-price volatility, but also demonstrated the capability for pre-
dicting the direction of movements of oil market volatility with an accuracy of 88%.
The model is applicable as a predictive tool for oil-price volatility and its direction of movements, benefiting oil producers, consumers,
investors, and traders. It assists these key market players in making sound decisions and taking corrective courses of action for oil market
stability, development strategies, and future investments; this could lead to increased profits and to reduced costs and market losses. In
addition, this improved method for modeling oil-price volatility enables experts and market analysts to empirically test new approaches
for mitigating market volatility. It also provides a roadmap for improving the predictability and accuracy of energy and crude models.
Oil-Price Volatility. The price of crude oil plays a major role in global economic activity, and its fluctuations can affect other markets
and impact global economic growth. Volatility is a measure of the degree to which prices of a commodity fluctuate. We define it as the
standard deviation of price returns over the sample time frame. Understanding the volatility of crude-oil pricing is important for several
reasons. First, long-term uncertainty in future oil prices can alter the incentives to develop new oil fields in producing countries.
Second, this can also curb the implementation of alternative energy policies in oil-consumer countries. Third, in the short-term, volatil-
ity can also affect the demand for oil inventories (Regnier 2007; Huntington et al. 2014). Moreover, volatility is critical for pricing
derivatives whose trading volume has increased significantly in the last decade (Matar et al. 2013; Huntington et al. 2014).
Economic models and policy simulations can, to some extent, forecast oil market behavior using data from oil market fundamentals,
including supply and demand as well as inventories. Various types of econometric models were developed and published (Poon and
Granger 2003; Sadorsky 2006; Narayan and Narayan 2007; Kang et al. 2009; Wang et al. 2011). A survey of econometric models for
oil prices and volatility was published by Matar et al. (2013) and Huntington et al. (2013). These econometric models had limitations
and weaknesses because they did not explicitly incorporate influential factors of market volatility. Market volatility can be affected by
endogenous and exogenous factors, such as geopolitical events and instability in important oil regions, imbalance of market fundamen-
tals (supply, demand, and storage), spare oil capacity, speculation by investors, the relationship between the physical and financial mar-
kets, transparency of oil market data, changes in market regulations and policies, and the role of nonconventional oil and renewable
energy in the global energy mix, along with econometric factors, such as economic growth, exchange rates, and monetary policies
(Matar et al. 2013; Huntington et al. 2014).
Significant research and study efforts have been devoted to understanding and mitigating energy market volatility. The new develop-
ments in energy production, investment strategies, and geopolitical environment require continuous updating and refined understanding
of the energy market’s behavior and dynamics. Furthermore, there are new advanced modeling techniques in development that are
expected to enhance and improve modeling of oil market dynamics and volatility.
Artificial Intelligence (AI). The energy and financial markets are burgeoning areas for exploring AI, a science that has gained
increased interest in recent years owing to the power and capability of the state-of-the-art technology and its various applications in the
petroleum industry (Al-Fattah and Startzman 2001, 2003; Mohaghegh 2005; Mohaghegh et al. 2011), and in economics and finance
(Azoff 1994; Trippi and Turban 1996). The most common types of AI models are ANN, machine learning (ML), deep learning, GA,
support vector machine, fuzzy logic, and the boosted decision model. In particular, ANN is used for recognizing patterns in data and
modeling complex relationships between a target and its influential factors and parameters. ANNs can be defined as parallel processing
models of biological neural structures. Each ANN commonly consists of a number of fully connected nodes or neurons grouped in
layers; these layers can include one input layer, one or more hidden layers, and one output layer. The number of nodes in each of these
layers is dependent upon the number of input and output variables, and the architecture of the network, as shown in Fig. 1. The neural
network is fed with data that are representative of the problem, and is submitted to training until it learns the pattern and behavior of the
data. GA is an optimization technique that searches over binary strings and is commonly used to identify the factors that have an
impact on the variable being predicted. The GA technique is discussed further in a later section.
Copyright © 2019 Society of Petroleum Engineers
Original SPE manuscript received for review 13 December 2017. Revised manuscript received for review 10 September 2018. Paper (SPE 195584) peer approved 14 January 2019.
REE195584 DOI: 10.2118/195584-PA Date: 19-July-19 Stage: Page: 817 Total Pages: 10
ID: jaganm Time: 17:09 I Path: S:/REE#/Vol0 0000/190012/Comp/APPFile/SA-REE#190012
August 2019 SPE Reservoir Evaluation & Engineering 817
This paper presents a hybrid approach of AI using a GA, ANN, and DM time-series model for describing and forecasting crude-oil
market volatility and its direction of movement. The hybrid model, referred to as the GANNATS model, was aimed at modeling the
behavior and capturing the dynamics of oil-price fluctuations. It should be emphasized that the intention of this work was not to predict
the oil prices in absolute terms, but rather to model the behavior and capture the dynamics of oil market volatility in addition to
adequately predicting the direction of oil-price volatility. It was hoped that achieving that would lead to a predictive tool for oil pro-
ducers, consumers, traders, and investors.
The development strategies of the AI model implemented in this study required several precise procedures, including (1) data acquisi-
tion and preparation; (2) data mining and preprocessing; (3) features selection of significant input variables; (4) model design;
(5) model training, verification, and testing; (6) model performance evaluation; (7) model optimization and fine tuning; and (8) post-
processing of model results. Fig. 2 presents the framework, workflow, and template of a strategy implemented in this study to develop
an AI model [adapted from Al-Fattah and Al-Naim (2009)].
Oil-price volatility (target) and future values were forecast using previous values or lagged time steps of the oil-price variable and
its influential input variables. For example, the values of the input variables for the current and previous months were used to forecast
the volatility of future months. This study experimented with three types of ANN architecture: multilayer perceptron (MLP), radial
basis function (RBF), and generalized regression neural network (GRNN). The MLP network performed better than did the other net-
work types for this particular AI application.
Data Preparation
Data acquisition, preparation, and preprocessing are considered to be the most crucial and most time-consuming tasks in the model-
development process. The data used in this study were from the US Energy Information Administration (EIA 2012), a public-domain
source of energy data. Nominal daily data of WTI crude-oil futures from January 1994 to April 2012 were used. Because the majority
of the input variables were provided monthly, the WTI daily data were converted to a monthly time scale, as described by Matar et al.
(2013). This study used WTI crude-price data for the first-month futures contract.
Fig. 3 shows the WTI crude futures prices with the accompanying major global events that impacted its volatility. The major eco-
nomic and geopolitical events that influenced the oil prices varied in their degree of influence on the volatility of the oil market. The
Asian Financial Crisis caused an economic recession during 1997 and 1998 owing to an oil-price depression. The Second Gulf War in
2003 increased oil prices. The 9/11 Terrorist Attacks in 2001 and the 2008 Financial Crisis caused dramatic drops in the price of oil
that consequently led to the highest-volatility oil market to date.
The monthly input data for a pool of predictor variables affecting oil-price volatility include US oil production; Organization of the
Petroleum Exporting Countries (OPEC) production; the oil supply from major producing countries; US oil inventory change; Organiza-
tion for Economic Cooperation and Development (OECD) oil consumption; oil consumption by Russia, India, and China; OPEC spare
capacity; world total petroleum stocks; and economic indicators, such as gross domestic product (GDP), consumer price index,
exchange rates, and interest rates. These input variables were initially selected because they influence the oil market volatility. In this
study, a total of 220 monthly data observations were used and 58 input variables were considered, including transformed variables with
functional links.
The formula used to compute the price returns was expressed as
$$ r_t = \ln\left(\frac{P_t}{P_{t-1}}\right), $$
where $P_t$ is the oil price (USD/bbl) at time $t$ and $P_{t-1}$ is the oil price at time $t-1$. The oil-price volatility ($v_t$) was then computed as the returns squared:
$$ v_t = r_t^2. $$
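As a minimal sketch of the return and volatility definitions above, the following Python computes log returns and squared-return volatility from a short price series. The price values and the function names are illustrative assumptions, not data or code from this study.

```python
import math

def log_returns(prices):
    """r_t = ln(P_t / P_{t-1}) for each consecutive pair of prices."""
    return [math.log(p_t / p_prev) for p_prev, p_t in zip(prices[:-1], prices[1:])]

def squared_return_volatility(returns):
    """v_t = r_t ** 2, the squared-return volatility proxy."""
    return [r * r for r in returns]

# Hypothetical monthly prices in USD/bbl (not taken from the EIA data set).
prices = [20.0, 22.0, 21.0, 25.0]
returns = log_returns(prices)
volatility = squared_return_volatility(returns)
```

A series of n prices yields n - 1 returns, which is consistent with the 220 price observations and 219 return observations reported in Table 1.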
Data Mining and Preprocessing
Data mining is an essential preprocessing procedure in the development of an AI model. It can be defined as a method that mines and explores data patterns and characteristics using AI methods, statistical techniques, and database systems. Because modeling and forecasting oil-price volatility is a time-series analysis application, data mining was used as an exploratory tool to identify data patterns and detect trends in regard to oil market volatility. We also used data mining to detect data anomalies in the oil market volatility using association rule learning, which searches for relationships between variables, together with visualization and reporting of the data summary. In addition, data mining was used for features selection and variables screening, and was used in conjunction with other modeling applications for data subsampling. Variables normalization and transformation techniques were also used in the procedure for data preprocessing.
Fig. 1. Three-layer feed-forward MLP neural network structure. (Labels recoverable from the figure: inputs such as oil inventory and economic indicators; input layer; hidden layer; output layer.)
Workflow steps shown in Fig. 2:
I. Data Warehousing/Preparation
   Data collection
   Data quality control
II. Data Mining/Preprocessing
   Data exploratory
   Transformation (derivatives, logarithmic, time lags)
   Normalization (mean/standard deviation, minimum/maximum)
   Partitioning sets (training, validation, and testing)
   Sampling methods (70-15-15% rule, bootstrap, K-mean, random)
III. Feature Selection and Variables Screening
   Genetic algorithm (GA)
   Principal-component analysis (PCA)
   Forward and backward stepwise selections
   Sensitivity analysis
IV. Model Design
   Type (classification, regression, time series)
   Architecture (MLP, RBF, GRNN, PNN)
   Learning algorithm (BP, GD, CG, QBP)
   Number of layers (input, hidden, and output)
   Number of nodes in each layer
   Transfer/activation functions (logistic, arctan, identity)
   Convergence/stopping criteria (error tolerance, no. of epochs)
V. Model Training and ...
   Decision: Is training ...
VI. Model Performance
   Generalization attributes
   Statistical error analysis
   Graphical error analysis
VII. Model Testing
   Decision: Are results ...
VIII. Modify, Optimize, and Fine-Tune
   Adjust model
IX. Post-Processing of Model Results
Fig. 2. Framework, workflow, and template of GANNATS model development methodology (adapted from Al-Fattah and Al-Naim 2009). BP = back-propagation algorithm; CG = conjugate gradient algorithm; GD = gradient descent algorithm; GRNN = generalized-regression neural network; QBP = quick-back-propagation algorithm; PNN = probabilistic neural network; RBF = radial-basis function.
Transformation. We found that time-series ANN models performed better with normally distributed data and not seasonally adjusted
data (Al-Fattah and Startzman 2001, 2003). Having a time series that exhibited trends or periodic variations rendered data nonstation-
ary, thus indicating that the mean and variance of the data were not constant over time. A transformation technique was used to remove
the trends and periodic variations, making it easier for the neural network model to interpret the input data, to search more efficiently
for relationships between variables, and to perform the training process quickly. Transformation with functional links can take various
forms, including each variable’s derivatives, natural logarithm, time lags, growth rate, and adjustment to per capita terms. A first-
derivative transformation can remove the trend in each input variable, thus reducing the serial correlation and the multicollinearity
among the input variables. This step helps the time series to maintain a constant mean and variance over time, rendering it stationary—
a necessary requirement for econometric modeling of time-series data. Experience with and understanding of econometric modeling in
time-series analysis were found to be beneficial in the development of the GANNATS models.
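To illustrate how a first-derivative transformation removes a trend, the sketch below differences a hypothetical series with a linear trend and also shows a log-difference (growth-rate) transform. The series values and function names are assumptions made for illustration; the study does not specify its implementation.

```python
import numpy as np

def first_difference(x):
    """First-derivative transform: x_t - x_{t-1}."""
    x = np.asarray(x, dtype=float)
    return x[1:] - x[:-1]

def log_growth_rate(x):
    """Growth-rate transform: ln(x_t) - ln(x_{t-1}); requires positive values."""
    x = np.asarray(x, dtype=float)
    return np.diff(np.log(x))

# Hypothetical input variable with a linear upward trend of 2.5 units per month.
trend_series = 100.0 + 2.5 * np.arange(12)
detrended = first_difference(trend_series)   # constant series: trend removed
growth = log_growth_rate(trend_series)
```

After differencing, the linear trend collapses to a constant, so the transformed series has a stable mean; one observation is lost at the start of the sample.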
Normalization. Normalization standardizes the possible numerical range that the input data can take, thus preventing the network
from becoming biased to large numerical values over smaller ones. Normalization was first applied and recommended by Al-Fattah and
Startzman (2001, 2003) to predict natural-gas supply using AI. Each input and output variable was normalized using the mean/standard-
deviation normalization method. Table 1 presents a summary of data statistics of WTI crude futures prices for inflation-adjusted vs. nor-
malized vs. return prices.
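The mean/standard-deviation normalization can be sketched as follows. The function name and the use of the sample standard deviation (ddof=1) are assumptions; the worked check uses the mean, standard deviation, and maximum price quoted in Table 1.

```python
import numpy as np

def zscore(x, mean=None, std=None):
    """Mean/standard-deviation normalization: (x - mean) / std.

    If mean/std are not supplied, they are estimated from x using the
    sample standard deviation (ddof=1), which is an assumption here.
    """
    x = np.asarray(x, dtype=float)
    mean = x.mean() if mean is None else mean
    std = x.std(ddof=1) if std is None else std
    return (x - mean) / std, mean, std

# Worked check with the price statistics quoted in Table 1.
mean_price, std_price, max_price = 45.36, 29.65, 133.88
z_max = (max_price - mean_price) / std_price   # about 2.99, as in Table 1

# Generic use on a hypothetical series.
z, m, s = zscore([1.0, 2.0, 3.0, 4.0])
```

Because the function accepts a precomputed mean and standard deviation, the same parameters can be reused when normalizing other subsets of the data.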
Feature Selection and Variables Impact Analysis. One of the tasks in the AI model design is to decide which of the available varia-
bles to use as inputs to the model. The only guaranteed method for selecting the best input set is to train the networks with all the possi-
ble input sets and architectures. Speaking practically, this is impossible when presented with a significant number of potential input
variables. It becomes more problematic when multicollinearity exists among some of the input variables, which means that any set of
variables might be sufficient.
Overlearning, or overfitting, can occur when an unsuitable set of input variables (too many or too few) is used, which degrades the performance of the
AI model and causes it to memorize rather than generalize the data structure. Overlearning can cause the network to perform very
well with the training data set, but poorly with the testing set. The performance of the network model can be improved by selecting the
significant variables and reducing the redundant ones, leading to generalization (not memorization) of the network model. There are
highly sophisticated algorithms that determine the selection of significant input variables. These techniques include the GA, forward
and backward stepwise algorithms, and principal-component analysis (PCA) (Goldberg 1989; Hill and Lewicki 2007).
Table 1. Data statistics of WTI crude futures prices (USD/bbl) and return (%).
                        Observed Data         Normalized Data
Statistic               Prices     Return     Prices     Return
Mean                     45.36       0.88       0.00       0.00
Standard error            2.00       0.56       0.07       0.07
Median                   31.19       1.71      -0.48       0.10
Standard deviation       29.65       8.29       1.00       1.00
Sample variance         879.01      68.67       1.00       1.00
Kurtosis                  0.25       1.96       0.25       1.96
Skewness                  0.89       0.81       0.89       0.81
Minimum                  11.35     -33.20      -1.15      -4.11
Maximum                 133.88      20.41       2.99       2.36
Observations               220        219        220        219
Fig. 3. Monthly WTI crude-oil futures price (1994–2012) for first-month contract. (Horizontal axis: year, 1994 to 2012; vertical axis: price, USD/bbl. Annotated events: 1998 Asian Financial Crisis; 2001 9/11 Terrorist Attacks; 2003 Kuwait's Liberation War; 2008–2009 World Financial Crisis.)
The GA is an optimization algorithm that can efficiently search for binary strings by processing an initially random population of strings using an artificial system that mimics natural selection. GA is, therefore, an efficient advanced technique for identifying significant variables within a large number of variables, and offers a valuable verification within a small number of variables. In particular, GA recognizes interdependencies between variables located close together on the masking strings. It can also detect subsets of variables that were not revealed by other methods. GA composes hundreds to thousands of combinations of input variables, adding or removing input variables between combinations, to efficiently reach the optimal variable selection. The GA method used to be time consuming, particularly for a large number of variables, typically requiring the training and testing of many thousands of networks (Goldberg 1989); however, with current advancements in computing, the speed of the GA method has improved dramatically, making it an attractive and viable solution.
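The GA-based variable selection described above can be sketched as a search over binary masks, where each bit marks whether an input variable is retained. This is a simplified illustration under stated assumptions: the fitness function uses a least-squares fit as a fast stand-in for training a neural network, and the population size, mutation rate, size penalty, and synthetic data are all hypothetical choices rather than the settings used in this study.

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(mask, X_tr, y_tr, X_va, y_va, penalty=0.01):
    """Higher is better: negative validation MSE minus a per-variable penalty."""
    if not mask.any():
        return -np.inf
    A_tr = np.column_stack([np.ones(len(X_tr)), X_tr[:, mask]])
    A_va = np.column_stack([np.ones(len(X_va)), X_va[:, mask]])
    coef, *_ = np.linalg.lstsq(A_tr, y_tr, rcond=None)
    mse = np.mean((A_va @ coef - y_va) ** 2)
    return -mse - penalty * mask.sum()

def ga_select(X_tr, y_tr, X_va, y_va, pop_size=30, generations=40, p_mut=0.05):
    n_vars = X_tr.shape[1]
    pop = rng.random((pop_size, n_vars)) < 0.5            # random binary masks
    score = lambda m: fitness(m, X_tr, y_tr, X_va, y_va)
    for _ in range(generations):
        ranked = pop[np.argsort([score(m) for m in pop])[::-1]]
        parents = ranked[: pop_size // 2]                  # keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n_vars)                  # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child ^= rng.random(n_vars) < p_mut            # bit-flip mutation
            children.append(child)
        pop = np.vstack([parents, np.array(children)])
    return max(pop, key=score)

# Synthetic demonstration: only columns 0 and 3 drive the target.
X = rng.normal(size=(260, 8))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + rng.normal(scale=0.1, size=260)
best_mask = ga_select(X[:200], y[:200], X[200:], y[200:])
```

Retaining the fitter half of each generation unchanged means the best mask found so far is never lost, at the cost of somewhat less exploration.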
Forward and backward stepwise algorithms usually run much faster than the GA if there is a reasonable number of variables. Both
techniques, forward and backward stepwise selections, are equally effective if there are not too many complex interdependencies
between the variables. The forward stepwise input-selection technique operates by adding variables one at a time, while the backward
stepwise input-selection method starts with the complete set of variables and then proceeds to remove one variable at a time (Hill and
Lewicki 2007). Another approach to dimensionality reduction is PCA, which can be represented as a linear network. PCA can often
extract a small number of components from fairly high-dimensional original data while preserving the integral structure of the data.
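The forward stepwise procedure, which adds one variable at a time, can be illustrated with the sketch below. As an assumption for brevity, candidate sets are scored by the validation error of a least-squares fit rather than by a trained network, and the stopping threshold `min_gain` and the synthetic data are hypothetical.

```python
import numpy as np

def validation_mse(cols, X_tr, y_tr, X_va, y_va):
    """Validation MSE of a least-squares fit using only the listed columns."""
    A_tr = np.column_stack([np.ones(len(X_tr))] + [X_tr[:, j] for j in cols])
    A_va = np.column_stack([np.ones(len(X_va))] + [X_va[:, j] for j in cols])
    coef, *_ = np.linalg.lstsq(A_tr, y_tr, rcond=None)
    return float(np.mean((A_va @ coef - y_va) ** 2))

def forward_stepwise(X_tr, y_tr, X_va, y_va, min_gain=0.01):
    """Add the single most helpful variable per step until the gain is too small."""
    selected, remaining = [], list(range(X_tr.shape[1]))
    best = validation_mse(selected, X_tr, y_tr, X_va, y_va)
    while remaining:
        trials = [(validation_mse(selected + [j], X_tr, y_tr, X_va, y_va), j)
                  for j in remaining]
        err, j = min(trials)
        if best - err < min_gain:
            break
        selected.append(j)
        remaining.remove(j)
        best = err
    return selected

# Synthetic demonstration: only columns 0 and 3 drive the target.
rng = np.random.default_rng(1)
X = rng.normal(size=(260, 6))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + rng.normal(scale=0.1, size=260)
selected = forward_stepwise(X[:200], y[:200], X[200:], y[200:])
```

Backward stepwise selection follows the same pattern in reverse, starting from the complete set and removing the least useful variable at each step.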
GA, forward stepwise and backward stepwise techniques were used to identify the significant input variables in the neural network
among a total of 58 input variables. These techniques yielded almost identical results with a few slight differences. While implementing
the features selection techniques, GA was used as a benchmark to eliminate the redundant input variables, thus reducing the number of
input variables from 58 to 20 and achieving a 66% reduction of total initial input variables. Therefore, the best 20 input variables that
contributed significantly to the model’s performance were retained for use in the model development. The F-value, R², and p-value were statistical measures used to determine the significance of each selected input variable as a predictor of oil-price volatility. All these statistics showed similar results but with slightly different rankings; the F-value was selected as the benchmark. The F-value statistic measured the contribution of a given input variable to the model’s overall performance relative to the other input variables. An input variable was retained if its F-value was unity or greater. It should be noted that all the input and output variables were normalized.
The final results of the features selection procedure are presented by the variables impact plot (VIP) shown in Fig. 4, depicting the
significance and impact of the selected input variables on oil-price volatility. The variables depicted by the VIP were normalized and
transformed with functional links, and were expressed as indexed variables $X_i$. Analyses showed that the optimal predictors selected for the WTI
crude-oil-price volatility were the US oil inventory change; US oil production; OPEC oil production; OECD inventory; US GDP; oil
consumption of Russia, India, and China; OECD oil consumption; and the OPEC oil spare capacity. The results of the input-parameter
selection showed that the US oil inventory change was the most significant in the GANNATS model, followed by US oil production
and OPEC oil production.
Model Design
This section discusses the design aspects that were considered when selecting the neural-network architecture. The neural-network
architecture determines the method by which the weights are interconnected within the network and specifies the type of learning rules
that might be used. The MLP (Azoff 1994; Trippi and Turban 1996) is the most commonly used architecture and was found suitable for
this study. The network used in this study was based on the quasi-Newton algorithm, a type of back-propagation learning; back-propagation is the most widely recognized and used supervised-learning algorithm (Azoff 1994). The fundamental structure of the quasi-Newton neural network consists of an input layer, one hidden layer, and one output layer. Fig. 1 shows the architecture of a three-layered feed-forward neural network. The layers have nodes that are fully connected, indicating that each node of the input layer is connected to each hidden-layer node, and that each hidden-layer node is connected to each output-layer node. Transfer or activation functions, such as sigmoid, arctan, exponential, and identity, act on the value returned by the input functions. The nonlinear transfer functions introduce nonlinearity into the neural network, enriching its representational capacity. The sigmoid function was found to work well and was used for this application. Networks with one, two, and three hidden layers were tested. The best results were achieved using one hidden layer, maintaining simplicity, reliability, and accuracy consistent with the conventional wisdom of AI modeling and forecasting development strategies.
Fig. 4. VIP showing significant inputs selected for WTI oil-price-volatility model. (Horizontal axis: importance (F-value), 0 to 3; vertical axis: input variable.)
Model Development
In this study, the data were partitioned into three subsets: the training set (70%, January 1994–October 2006), the verification or valida-
tion set (15%, November 2006–July 2009), and the test set (15%, August 2009–April 2012). This was the optimal apportioning of the
data; however, a data apportioning of 80/10/10% was also possible. In the training process, the network was repeatedly exposed to
input data, and the weights and thresholds of the post-synaptic potential function were adjusted using the quasi-Newton training algo-
rithm until the network correctly predicted the output by satisfying the convergence criteria. The convergence criteria for the trained
network were set at a residual error of 1.010
or less, or at maximum epochs or iterations of 1,000, whichever occurred first. Gradient
descent (GD) and conjugate gradient (CG) training algorithms were also attempted.
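The 70/15/15% chronological partitioning can be checked with a short sketch. The function name is an assumption; the split keeps observations in time order, which matches the date ranges stated above.

```python
def chronological_split(n_obs, train_frac=0.70, val_frac=0.15):
    """Split indices 0..n_obs-1 into consecutive training, verification, and test blocks."""
    n_train = round(n_obs * train_frac)
    n_val = round(n_obs * val_frac)
    indices = list(range(n_obs))
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])

# 220 monthly observations, January 1994 to April 2012.
train_idx, val_idx, test_idx = chronological_split(220)
```

For 220 monthly observations this gives 154, 33, and 33 months, which agrees with January 1994 to October 2006, November 2006 to July 2009, and August 2009 to April 2012.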
The overall error was computed for the verification data subset. The verification data act as a standard that takes no part in the adjust-
ment of weights and thresholds during training, but the network’s performance was continually checked against this subset during train-
ing. The training was stopped when the error for the verification data stopped decreasing or started to increase. Use of the verification
subset of data is important because, with unlimited training, the neural network usually starts to overlearn the training data, thus leading
to the network’s memorization problem. The use of a verification subset to terminate training at a point when the generalization potential
is optimal is a critical consideration in training neural networks. A third subset of data, the testing set, served as an additional independent
check on the generalization capabilities of the neural network, and acted as a blind test of its performance and accuracy.
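The stopping rule based on the verification subset can be sketched as follows. The `patience` parameter (how many non-improving epochs to tolerate) and the error values are assumptions introduced for illustration; the text states only that training stopped when the verification error stopped decreasing or started to increase.

```python
def early_stopping_epoch(val_errors, patience=3):
    """Return the epoch with the lowest verification error seen before stopping.

    Training is considered stopped once the verification error has failed to
    improve for `patience` consecutive epochs.
    """
    best_epoch, best_err, wait = 0, float("inf"), 0
    for epoch, err in enumerate(val_errors):
        if err < best_err:
            best_epoch, best_err, wait = epoch, err, 0
        else:
            wait += 1
            if wait >= patience:
                break
    return best_epoch

# Hypothetical verification errors per epoch: they fall, then start to rise.
val_errors = [1.0, 0.6, 0.4, 0.35, 0.36, 0.38, 0.41, 0.45, 0.2]
stop_at = early_stopping_epoch(val_errors, patience=3)
```

In this example the rule stops after three non-improving epochs and returns epoch 3; the later value of 0.2 is never reached, which shows the trade-off that a small patience makes.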
The prediction model was constructed so that input-variable data from the previous month were used to forecast the next month’s WTI futures price volatility. Selection of the optimal performing model was made on the
basis of the generalization and statistical indicators of the model, which will be discussed in the next section.
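The one-month-ahead setup can be illustrated by pairing each month's target with the inputs from the preceding month or months. The function name, the `n_lags` parameter, and the toy arrays are assumptions for illustration.

```python
import numpy as np

def make_one_step_pairs(inputs, target, n_lags=1):
    """Pair the inputs from months t-n_lags..t-1 with the target at month t."""
    inputs = np.asarray(inputs, dtype=float)
    target = np.asarray(target, dtype=float)
    rows, labels = [], []
    for t in range(n_lags, len(target)):
        rows.append(inputs[t - n_lags:t].ravel())   # lagged inputs, flattened
        labels.append(target[t])                    # value to be forecast
    return np.array(rows), np.array(labels)

# Toy data: 6 months, 2 input variables per month.
inputs = np.arange(12).reshape(6, 2)
target = np.arange(6) * 10.0
X_lag, y_next = make_one_step_pairs(inputs, target, n_lags=1)
```

With one lag, the first usable sample pairs the inputs of month 0 with the target of month 1, so one observation is lost at the start of the series.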
The oil-market-volatility GANNATS hybrid model developed in this study was successfully trained, verified, and tested for adequate
predictions. The hybrid model was optimally developed with an MLP network architecture, the quasi-Newton training algorithm, the
sigmoid transfer function, and three layers (a 20-node input layer, a 14-node hidden layer, and a 1-node output layer). The forecasting
model was based on the time-series data of past monthly time lags (i.e., $t-1$, etc.) of input variables to forecast the future monthly
timesteps (i.e., $t+1$, etc.) of the predictor of oil-price volatility.
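The forward pass of a 20-14-1 MLP with a sigmoid hidden layer can be sketched as below. The random weights, the batch of inputs, and the identity output activation are assumptions for illustration; trained weights and the output-layer activation used in the study are not given in the text.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlp_forward(x, W1, b1, W2, b2):
    """Three-layer feed-forward pass: input -> sigmoid hidden layer -> output."""
    hidden = sigmoid(x @ W1 + b1)
    return hidden @ W2 + b2          # identity output activation (assumed)

rng = np.random.default_rng(42)
n_in, n_hidden, n_out = 20, 14, 1    # layer sizes reported for the final model

# Untrained placeholder weights; a real model would learn these in training.
W1 = rng.normal(scale=0.1, size=(n_in, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, n_out))
b2 = np.zeros(n_out)

x = rng.normal(size=(5, n_in))       # 5 hypothetical normalized input rows
y_hat = mlp_forward(x, W1, b1, W2, b2)
n_params = W1.size + b1.size + W2.size + b2.size
```

A fully connected 20-14-1 network has 20 × 14 + 14 + 14 × 1 + 1 = 309 adjustable weights and thresholds, which is worth comparing against the 154 training observations when judging the risk of overlearning.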
Features selection and variables-impact-analysis methods were used to identify significant variables and remove redundant ones,
resulting in a 66% reduction of total initially selected variables. The most influential factors that impacted the model of WTI oil volatil-
ity were the US oil inventory change; US oil production; OPEC oil production; OECD inventory; US GDP; oil consumption of Russia,
China, and India; and the OPEC spare capacity (Fig. 4). Oil supply disruptions and global geopolitical events in unstable producing
countries around the world were also believed to be significant drivers of oil prices and volatility. Quantifying these exogenous factors
for inclusion in the model would improve and increase the certainty of the oil-price-volatility predictions.
Evaluation of the model’s performance by means of statistical key performance indicators (KPIs), graphical error analysis, and the
generalization attribute was used to examine and assess the adequacy and prediction accuracy of the GANNATS hybrid model. The
results of the statistical and graphical error analyses, presented in the next section, showed that the model had a good generalization
attribute with excellent prediction accuracy for oil-price volatility and its direction of movement.
The main results of the GANNATS model for the WTI oil market volatility are shown by Figs. 5 and 6. Fig. 5 shows the
performance of the observed WTI oil price returns compared to that predicted by the model, while Fig. 6 shows the excellent
performance of the model predictions of the price volatility. Analyses of the results showed that the predictions made by the model
matched adequately the observed historical oil-price volatility. In addition, the hybrid model captured the directions of the price returns
and volatility, whether it was upward or downward. The model also accurately captured the negative oil price shock as a result of the
2008 Financial Crisis, as well as other economic and geopolitical events. The model also demonstrated its capability to predict the
direction of the WTI oil-price volatility during the forecasting (testing) phase of model development, as shown in Figs. 5 and 6.
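One way to quantify how well a model captures the direction of movement is the share of periods in which the predicted and observed series move the same way. The definition below (comparing the sign of successive changes in each series) is an assumption; this section does not spell out how the study computed its directional accuracy, and the numbers are hypothetical.

```python
def directional_accuracy(observed, predicted):
    """Fraction of steps where observed and predicted changes share a direction."""
    hits = 0
    steps = len(observed) - 1
    for t in range(1, len(observed)):
        obs_up = observed[t] - observed[t - 1] > 0
        pred_up = predicted[t] - predicted[t - 1] > 0
        hits += obs_up == pred_up
    return hits / steps

# Hypothetical normalized volatility values.
observed = [1.0, 2.0, 1.5, 1.8, 1.2]
predicted = [1.1, 1.9, 1.6, 1.5, 1.0]
accuracy = directional_accuracy(observed, predicted)
```

In this toy example the predicted series moves in the same direction as the observed series in three of four steps.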
As previously mentioned, the scope of this study was to develop an advanced analytic and predictive model to describe the behavior,
capture the dynamics, and satisfactorily predict the direction of oil-price volatility. The GANNATS model was successfully developed
as an oil-market-volatility forecaster and as a short-term predictive tool for the direction of oil-price volatility. Development of this
system is beneficial for oil producers, consumers, investors, and traders alike.
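The volatility series in this study is expressed as a squared return (Fig. 6). The following is a minimal sketch of how such a series can be derived from a price series. It assumes log returns and a z-score normalization, neither of which is confirmed by this section; the function names and the price values are hypothetical and serve only as an illustration.

```python
import math

def log_returns(prices):
    """Return r_t = ln(P_t / P_{t-1}) for consecutive prices (assumed return definition)."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]

def squared_return_volatility(returns):
    """Volatility proxy: the squared return of each period."""
    return [r * r for r in returns]

def standardize(values):
    """Z-score normalization (an assumed scheme; the study's own scheme may differ)."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / std for v in values]

# Hypothetical monthly prices in USD/bbl, for illustration only.
prices = [60.0, 63.0, 58.0, 58.0, 66.0]
returns = log_returns(prices)
volatility = squared_return_volatility(returns)
normalized_returns = standardize(returns)
```

A month with an unchanged price gives a zero return and therefore zero squared-return volatility, whereas large moves in either direction give large volatility values.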
[Figure: normalized price returns (%) vs. time (months, 0 to 220); observed and predicted returns over the estimation (training) and forecasting periods; event labels: "1998 Asian Financial Crisis," "2001 9/11 Terrorist Attacks," "Financial Crisis," and "2003 Kuwaits Liberation War."]
Fig. 5—Results of prediction performance of GANNATS model for WTI futures price return.
REE195584 DOI: 10.2118/195584-PA Date: 19-July-19 Stage: Page: 822 Total Pages: 10
ID: jaganm Time: 17:09 I Path: S:/REE#/Vol0 0000/190012/Comp/APPFile/SA-REE#190012
822 August 2019 SPE Reservoir Evaluation & Engineering
Performance Evaluation
Generalization of the Model. Critical problems involved in building an ANN or an ML model include overfitting, overlearning, and/or
memorization—whereby the model tends to produce excellent results during the training stage but performs poorly in the testing stage.
These problems can be caused by insufficient data, inclusion of redundant and insignificant input variables, discounting the verification
or validation subset in data partitioning, improper setting of the hidden layer, and/or inappropriate network configuration. An adequate
AI prediction model is characterized and evaluated by its generalization attributes. Using the concept of ensemble modeling also
improves the generalization property of the network model. An ensemble model is a combination or an average of the best-performing
models that have different parameters of network configurations. In most cases, an ensemble model provides better reliability and
forecasting accuracy than an individual model.
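The averaging form of an ensemble described above can be sketched as follows. This is an illustration under stated assumptions, not the GANNATS implementation: the equal weighting, the function name, and the member predictions are all hypothetical.

```python
def ensemble_average(member_predictions):
    """Average the predictions of several models, timestep by timestep."""
    if not member_predictions:
        raise ValueError("at least one member model is required")
    n_steps = len(member_predictions[0])
    if any(len(p) != n_steps for p in member_predictions):
        raise ValueError("all members must predict the same number of timesteps")
    n_members = len(member_predictions)
    # zip(*...) groups the predictions of all members for each timestep.
    return [sum(step) / n_members for step in zip(*member_predictions)]

# Hypothetical outputs of three networks with different configurations.
member_predictions = [
    [0.10, 0.40, 0.30],
    [0.20, 0.50, 0.10],
    [0.30, 0.30, 0.20],
]
ensemble = ensemble_average(member_predictions)
```

Equal weights are the simplest choice; a weighted combination would be an equally valid reading of "a combination or an average."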
The generalization property of the model is characterized by errors within the training and testing data sets that decrease with an
increasing number of iterations; the residual of the model is normally distributed around zero, and the statistical errors within the testing
subset are not fewer than the errors within the training subset. Analysis of the GANNATS model showed that the errors within the
testing data set decreased as the number of training epochs increased. In addition, the residual histogram of the
model, shown in Fig. 7, indicated that the residuals were normally distributed. Fig. 8 is a normal probability plot that supports the
normality of the model residuals and illustrates that the residuals fell close to a straight line ($R^2 = 0.9742$). The Jarque-Bera (JB)
test is another commonly used statistic for normality. For the residuals of this model, the test gave JB = 21.02 with p-value = 0.00003
($n$ = 219, skewness = 0.2425, kurtosis = 1.438, and two degrees of freedom). The JB normality test showed that the null hypothesis of normality
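is rejected at the 1% level of significance, because the p-value is below 0.01. The JB statistic is dominated by the kurtosis term,
whereas the skewness is small, and the histogram (Fig. 7) and normal probability plot (Fig. 8) indicate that the residuals are centered
near zero and approximately normal. This analysis indicates that the model maintained a good generalization attribute.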
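The reported JB statistic and p-value can be reproduced from the reported sample size and moments. The sketch below assumes that the reported kurtosis of 1.438 is the excess kurtosis (kurtosis minus 3), because that reading reproduces JB = 21.02. The function name is hypothetical, and the p-value uses the closed-form upper tail of the chi-square distribution with two degrees of freedom.

```python
import math

def jarque_bera(n, skewness, excess_kurtosis):
    """Jarque-Bera statistic and its asymptotic p-value.

    JB = n/6 * (S^2 + K^2/4), where K is the excess kurtosis.
    Under normality, JB follows a chi-square distribution with 2 degrees
    of freedom, whose upper-tail probability is exp(-JB/2).
    """
    statistic = n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value

# Values reported for the GANNATS residuals.
jb, p = jarque_bera(n=219, skewness=0.2425, excess_kurtosis=1.438)
print(f"JB = {jb:.2f}, p-value = {p:.5f}")  # prints "JB = 21.02, p-value = 0.00003"
```

Under the usual decision rule, a p-value below the chosen significance level leads to rejection of the null hypothesis of normality.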
Errors Analysis. Statistical and graphical error analyses were used to evaluate and assess the performance and accuracy of the
prediction model. The statistical KPIs used in the errors analysis were mean relative percentage error ($E_r$), mean absolute percentage
error ($E_a$), and root mean squared error ($E_{rms}$). The formulas for these commonly used statistical error indices can be found in
the literature (Hill and Lewicki 2007).
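The formulas are not reproduced in this paper. As a minimal sketch, the three indices can be computed as shown below, assuming their conventional definitions. The sign convention for the relative error (observed minus predicted), the function names, and the sample values are assumptions for illustration and are not taken from the study.

```python
import math

def mean_relative_pct_error(observed, predicted):
    """E_r: mean of the signed relative errors, in percent."""
    n = len(observed)
    return 100.0 / n * sum((y - yhat) / y for y, yhat in zip(observed, predicted))

def mean_absolute_pct_error(observed, predicted):
    """E_a: mean of the absolute relative errors, in percent."""
    n = len(observed)
    return 100.0 / n * sum(abs((y - yhat) / y) for y, yhat in zip(observed, predicted))

def root_mean_squared_error(observed, predicted):
    """E_rms: square root of the mean squared error, in the units of the data."""
    n = len(observed)
    return math.sqrt(sum((y - yhat) ** 2 for y, yhat in zip(observed, predicted)) / n)

# Hypothetical observed and predicted values, for illustration only.
observed = [2.0, 4.0, 5.0]
predicted = [1.0, 5.0, 5.0]
e_r = mean_relative_pct_error(observed, predicted)
e_a = mean_absolute_pct_error(observed, predicted)
e_rms = root_mean_squared_error(observed, predicted)
```

Because positive and negative errors cancel in the signed index, it measures bias, while the absolute index measures the typical error size. Both percentage indices are undefined when an observed value is zero.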
[Figure: normalized price volatility vs. time (months, 0 to 220); observed and predicted volatility.]
Fig. 6—Results of GANNATS model for WTI futures price volatility, expressed as a squared return.
[Figure: histogram of residuals over the range –2.5 to 2.5, with cumulative %.]
Fig. 7—Histogram of residual distribution of GANNATS model of WTI crude futures price returns.
Directional Predictability Indicator (DPI). Oil market practitioners measure a model’s prediction accuracy and its tendency to
predict the direction of oil market volatility: upward, stagnant, or downward. To measure the performance of the prediction of
oil-price-volatility direction, we modified the mean directional accuracy (MDA) (Schnader and Stekler 1990; Pesaran and Timmermann
2004) to a more representative measure of the direction of predictions, referred to as the DPI. Using the concept of the random-walk
process between the observed and forecast models, we defined DPI by the following:
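\[
\mathrm{DPI} = \frac{1}{N}\sum_{t=1}^{N} S_t,
\]
where
\[
S_t =
\begin{cases}
1, & \text{if } (y_t - y_{t-1})(\hat{y}_t - \hat{y}_{t-1}) > 0 \quad \text{(correct; both ups or both downs)} \\
0, & \text{if } (y_t - y_{t-1})(\hat{y}_t - \hat{y}_{t-1}) < 0 \quad \text{(false; either up and down or down and up),}
\end{cases}
\tag{4}
\]
and $y$ is the measured or observed target value, $\hat{y}$ is the predicted output value, $t$ is the trading month, and $N$ is the
total number of predicted cases (not the number of measured cases). $S_t$ represents the classification of the directional prediction,
whether it was correct or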
false on the basis of satisfying the specified condition. DPI is a statistical index used to measure the capability and performance of the
GANNATS model to predict the direction of the forecasts (upward, stagnant, or downward) compared to the actual realized direction.
The higher the DPI value is (or closer to unity), the higher will be the prediction accuracy for the direction of the forecast. A lower
DPI value (or closer to zero) indicates poor prediction accuracy for the direction of the forecast.
DPI states that if $(y_t - y_{t-1}) > 0$ or if $(\hat{y}_t - \hat{y}_{t-1}) > 0$, then the term is positive and the direction is
upward. If $(y_t - y_{t-1}) < 0$ or if $(\hat{y}_t - \hat{y}_{t-1}) < 0$, then the term is negative, indicating a downward direction.
If both terms of $S_t$ in Eq. 4 agree in sign [i.e., positive and positive (upward direction), or negative and negative (downward
direction)], then the predicted direction of the time series is correct, whether it was upward or downward. Hence, the direction of the
target variable is classified as correctly predicted. If the value at the current timestep equals the value at the previous timestep in
either term of $S_t$ (i.e., $y_t = y_{t-1}$ or $\hat{y}_t = \hat{y}_{t-1}$), then the product would be zero, indicating that the
predicted direction was stagnant. Otherwise, mixed signs of the two terms would produce a negative product, indicating that the
direction was incorrectly predicted by the model. DPI differs from MDA because the MDA defines $S_t$ as
$[(y_t - y_{t-1})(\hat{y}_t - y_{t-1})]$, in which the predicted change is measured from the previous observed value rather than from
the previous predicted value.
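The difference between the DPI and the MDA can be illustrated with a short sketch. The function names and the sample series are hypothetical. Stagnant cases (a zero product) are counted here as not correct, which is an assumption, since the text classifies them separately. The number of evaluated month-to-month transitions is used as the number of cases.

```python
def dpi(observed, predicted):
    """Directional Predictability Indicator.

    Fraction of transitions where the observed change and the predicted
    change (both measured from their own previous values) share a sign.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equally long series with at least two points")
    hits = 0
    for t in range(1, len(observed)):
        product = (observed[t] - observed[t - 1]) * (predicted[t] - predicted[t - 1])
        if product > 0:  # assumption: a zero product (stagnant) is not a hit
            hits += 1
    return hits / (len(observed) - 1)

def mda(observed, predicted):
    """Mean directional accuracy: the predicted change is measured from
    the previous observed value instead of the previous predicted value."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equally long series with at least two points")
    hits = 0
    for t in range(1, len(observed)):
        product = (observed[t] - observed[t - 1]) * (predicted[t] - observed[t - 1])
        if product > 0:
            hits += 1
    return hits / (len(observed) - 1)

# Hypothetical normalized volatility values, for illustration only.
observed = [1.0, 2.0, 1.5, 1.5, 3.0]
predicted = [1.2, 1.8, 1.9, 1.9, 2.5]
dpi_value = dpi(observed, predicted)  # 0.5
mda_value = mda(observed, predicted)  # 0.75
```

In the second transition the observed value falls from 2.0 to 1.5 while the prediction rises from 1.8 to 1.9, so the DPI counts a miss. The MDA counts a hit there because the prediction of 1.9 lies below the previous observed value of 2.0.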
Table 2 presents the results of the statistical analysis for the oil-market-volatility model developed in this study. The results shown
denote the training, validation, and testing sets, as well as the complete data set. Evaluation of the statistical results for the
performance of the GANNATS model indicated that all the statistical measures, or KPIs, of all the data subsets showed excellent
accuracy. The $E_r$ values were –0.0729 for the testing set and –0.0770 for the entire data set. The $E_a$ value for the testing set was
0.4105, with an overall $E_a$ value of 0.4672 for the entire data set. The $E_{rms}$ was 0.5247 for the testing set and 0.6199 for the
complete data set. The DPI measurement for the directional accuracy of predictions showed excellent results; the DPI was
87.9% for the testing set and 83.6% for the complete data set. This indicated that the hybrid model demonstrated its capability to forecast
the direction of oil-price volatility with an accuracy of approximately 88%, with an overall model prediction accuracy of 84%.
[Figure: normal probability plot with expected normal value axis, tick range –3 to 3, and fitted line with R2 = 0.9742.]
Fig. 8—Normal probability plot of model.
Table 2—Key performance indicators for WTI futures price-volatility model.
KPI/Data Set     All Data    Training    Validation    Testing
$E_r$ (%)        –0.0770     –0.0831     –0.0528       –0.0729
$E_a$ (%)         0.4672      0.5086      0.3318        0.4105
$E_{rms}$ (%)     0.6199      0.6683      0.4520        0.5247
DPI (%)           83.6        81.0        90.9          87.9
Fig. 7 shows the histogram of residuals and cumulative error distribution of the model, which were evenly distributed and
predominantly clustered around an error of near zero. Fig. 9 is a crossplot of the observed volatility data and model predictions, which
shows the close agreement between the observed and predicted volatility by indicating that the majority of the data falls on a straight line.
This study developed a time-series AI model for describing and predicting the oil market volatility using a hybrid approach of GA,
ANN, and DM. The GANNATS model exhibited the capability to describe the behavior, capture the dynamics, and predict the
direction of the oil-price volatility with 88% accuracy.
The most influential factors of the WTI oil-price-volatility model are the US oil inventory change; US oil production; OPEC oil
production; OECD inventory; US GDP; oil consumption of Russia, China, and India; and OPEC spare capacity. Supply disruptions of oil
and global geopolitical events in unstable producing countries around the world are also indicative drivers of oil prices and greater
volatility. Quantifying these exogenous factors for inclusion in the model can improve and increase the certainty of the predictions of
oil market volatility.
The GANNATS hybrid model can be used as a risk-management tool, and as a short-term predictive tool for the direction of
movement of oil-price volatility. It can also be used to quantitatively examine the effects of various physical and economic factors on future
oil market volatility, to understand the effects of different mechanisms for reducing market volatility, and to recommend policy options
and programs incorporating mechanisms that can potentially lessen the market volatility. This improved method for modeling oil-price
volatility can enable experts and market analysts to empirically test new approaches to mitigating market volatility. This work can also
provide a roadmap for research to improve predictability and accuracy of energy and crude models.
The VIP is a graphical tool and an important outcome of the feature selection and variables screening to identify significant variables,
thus reducing the curse of dimensionality of the AI model.
The DPI, a more representative measure of the directional prediction accuracy of oil volatility, was introduced.
Experience shows that knowledge of econometric modeling of time-series analysis, as well as understanding the physical behavior of
the dependent variable vs. its independent input variables, can lead to the successful development of time-series AI models.
A framework, workflow, or template was constructed as a useful, convenient, and handy tool for development strategies of
AI models.
The following are recommendations and issues to be addressed in future studies.
With the methodology presented, similar studies can be pursued for modeling and forecasting the price volatility of other global oil
markets, such as Brent and Dubai.
A similar work could be pursued for developing AI models for gas market volatility or other energy commodities.
At the time of this study, most of the significant factors influencing the oil market volatility had data provided monthly. Using
variables with high data frequency (daily or weekly) can improve the model’s prediction performance for oil-price volatility.
The methodology presented could be used to perform further in-depth impact analysis to quantitatively evaluate the effects of various
physical and economic factors impacting the future oil market volatility.
A comparative study of oil market volatility should be conducted between the AI approach and the conventional
econometric models.
It is recommended that the developed GANNATS model be updated periodically as new data become available, particularly when its
predictions cease to be adequate.
Nomenclature
$E_a$ = mean absolute percentage error, %
$E_r$ = mean relative percentage error, %
$E_{rms}$ = root mean squared error, %
[Figure: predicted WTI returns (normalized) vs. actual WTI returns (normalized); tick range –5 to 5.]
Fig. 9—Crossplot of WTI crude futures price returns and return model.
The author would like to thank Saudi Aramco for its support in the publication of this paper. Thanks are also extended to Fred Joutz,
James Smith, Emmanuel Ntui, and Shaikh Arifusalam for their reviews and comments. The views and opinions presented in this paper
belong solely to the author and not necessarily to Saudi Aramco.
Al-Fattah, S. M. and Startzman, R. A. 2001. Predicting Natural Gas Production Using Artificial Neural Network. Presented at the SPE Hydrocarbon
Economics and Evaluation Symposium, Dallas, Texas, 2–3 April. SPE-68593-MS.
Al-Fattah, S. M. and Startzman, R. A. 2003. Neural Network Approach Predicts U.S. Natural Gas Production. SPE Prod & Fac 18 (2): 84–91. SPE-
Al-Fattah, S. M. and Al-Naim, H. A. 2009. Artificial-Intelligence Technology Predicts Relative Permeability of Giant Carbonate Reservoirs. SPE Res
Eval & Eng 12 (1): 96–103. SPE-109018-PA.
Azoff, E. M. 1994. Neural Network Time Series Forecasting of Financial Markets. Chichester, England: John Wiley & Sons Ltd. Inc.
Energy Information Administration (EIA). 2012. U.S. Energy Information Administration Independent Statistics & Analysis.
(accessed 15 July 2012).
Goldberg, D. E. 1989. Genetic Algorithms. Reading, Massachusetts: Addison Wesley.
Hill, T. and Lewicki, P. 2007. Statistics: Methods and Applications, digital edition. Tulsa, Oklahoma: StatSoft.
Huntington, H., Al-Fattah, S. M., Huang, Z. et al. 2013. Oil Markets and Price Movements: A Survey of Models. Social Science Research Network,
USAEE Working Paper No. 13-129. or
Huntington, H., Al-Fattah, S. M., Huang, Z. et al. 2014. Oil Price Drivers and Movements: The Challenge for Future Research. Alternative Investment
Analyst Review 2(4): 11–28.
Kang, S. H., Kang, S. M., and Yoon, S. M. 2009. Forecasting Volatility of Crude Oil Markets. Energy Econ 31 (1): 119–125.
Matar, W., Al-Fattah, S. M., Atallah, T. et al. 2013. An Introduction to Oil Market Volatility Analysis. OPEC Energy Rev 37 (3): 247–269. OPEC-
Mohaghegh, S. D. 2005. Recent Developments in Application of Artificial Intelligence in Petroleum Engineering. J Pet Technol 57 (4): 86–91. SPE-
Mohaghegh, S. D., Al-Fattah, S. M., and Popa, A. 2011. Artificial Intelligence and Data Mining Applications in the E&P Industry, digital edition.
Richardson, Texas: Society of Petroleum Engineers.
Narayan, P. and Narayan, S. 2007. Modeling Oil Price Volatility. Energy Policy 35 (12): 6549–6553.
Pesaran, M. H. and Timmermann, A. 2004. How Costly is it To Ignore Breaks When Forecasting the Direction of a Time Series? Int J Forecast
20 (3): 411–425.
Poon, S. and Granger, C. 2003. Forecasting Volatility in Financial Markets: A Review. J Econ Lit 41 (2): 478–539.
Regnier, E. 2007. Oil and Energy Price Volatility. Energy Econ 29 (3): 405–427.
Sadorsky, P. 2006. Modeling and Forecasting Petroleum Futures Volatility. Energy Econ 28 (4): 467–488.
Schnader, M. H. and Stekler, H. O. 1990. Evaluating Predictions of Change. J Bus 63 (1): 99–107.
Trippi, R. R. and Turban, E. 1996. Neural Networks in Finance and Investing: Using Artificial Intelligence to Improve Real-World Performance.
Chicago: Irwin Professional Publishing.
Wang, Y., Wu, C., and Wei, Y. 2011. Can GARCH-Class Models Capture Long Memory in WTI Crude Oil Markets? Econ Model 28 (3): 921–927.
Saud M. Al-Fattah is a corporate consultant of strategy and market analysis at Saudi Aramco. He has experience in reservoir
management, energy markets and economics, oil and gas planning and development, reserves assessment, and petroleum
engineering applications. Al-Fattah has authored or coauthored three books and more than 40 peer-reviewed papers, and
holds one US patent. He holds a PhD degree from Texas A&M University and MSc and BSc degrees from King Fahd University of
Petroleum and Minerals (KFUPM), all in petroleum engineering. In addition, Al-Fattah holds an Executive MBA degree from
Prince Muhammad University, Saudi Arabia. He is a technical reviewer for SPE Reservoir Evaluation & Engineering and other
industry publications. Al-Fattah has served in several SPE volunteer activities, in local and international committees, as an
Editorial Review Committee member for SPE Reservoir Evaluation & Engineering, as a coauthor of an SPE digital book on artificial
intelligence and data mining, and as vice-chair (2006) and chair (2007) of the SPE Saudi Arabia Annual Technical Conference.
Al-Fattah also established the SPE Student Chapter at KFUPM, and served as president from 1990 to 1994.
REE195584 DOI: 10.2118/195584-PA Date: 19-July-19 Stage: Page: 826 Total Pages: 10
ID: jaganm Time: 17:09 I Path: S:/REE#/Vol00000/190012/Comp/APPFile/SA-REE#190012
826 August 2019 SPE Reservoir Evaluation & Engineering
... This section provides a detailed structure of market analysis from the investors' point of view and the stock sellers' strategies for performing buying and selling of stocks to obtain optimal profit [26], [27]. The trading strategies are: ...
... Genetic algorithm plays a major role in the field of stock market prediction. The development of the feature with transformation method and genetic algorithms reduced the feature space dimensionality, which removes the irrelevant factors that predicts the stock prices [26].This method has a better prediction results than the transformation based on fuzzy expert system. The Genetic algorithm transformation has a better outcome in stock market prediction than the other feature transformation techniques. ...
Full-text available
Stock Market Prediction is a challenging task due to the volatile, unpredictable and chaotic nature of the stock market. Global digitization has revamped SMP and trading techniques. Many researchers have employed Machine learning for predicting future value of stocks helping investors to make safe and wise financial decisions. This study systematically examines the traditional prediction methods and the modern approaches that utilize Artificial Intelligence and Machine Learning for the task of prediction. The study compares and contrasts various supervised and unsupervised techniques and Artificial Neural Networks that use temporal data for prediction. Performance of algorithms depends on the dynamic input data, and the nature of forecast. Data fitting is an important concern for identifying, analyzing and predicting future instances. Extensive research is required to build appropriate modules for data pre-processing, analysis, and prediction. Comparing the performance of ML algorithms with traditional methods is required to prove their effectiveness. The study explores the strengths of various ML algorithms to develop a basic understanding, and paves the way for further research in the field of Stock Market Prediction.
... where y i is the actual value of ith sample andŷ i is the predicted value of the i-th sample based on the desired model. The model with the lowest error values is more reliable [57]. The results of calculating the criteria for thirty sample countries based on the four studied models are shown in Table 4. ...
Full-text available
Given the prevalence of the digital world, artificial intelligence (AI) stands out as one of the most prominent technologies for demand prediction. Although numerous studies have explored energy demand forecasting using machine learning models, previous research has been limited to incorporating either a country’s macroeconomic characteristics or exogenous elements as input variables. The simultaneous consideration of both endogenous and exogenous economic elements in demand forecasting has been disregarded. Furthermore, the stability of machine learning models for energy exporters and importers facing varying uncertainties has not been adequately examined. Therefore, this study aims to address these gaps by investigating these issues comprehensively. To accomplish this objective, data from 30 countries spanning the period from 2000 to 2020 was selected. In predicting oil demand, endogenous economic variables, such as carbon emissions, income level, energy price, gross domestic product (GDP), population growth, urbanization, trade liberalization, inflation, foreign direct investment (FDI), and financial development, were considered alongside exogenous factors, including energy sanctions and the COVID-19 pandemic. The findings indicate that among the input variables examined in demand forecasting, oil sanctions and the COVID-19 pandemic have had the most significant impact on reducing oil demand, while trade liberalization has proven to be the most influential factor in increasing oil demand. Furthermore, the support vector regression (SVR) model outperforms other models in terms of lower prediction error, as revealed by the error assessment of statistical models and AI in forecasting oil demand. Additionally, when comparing the stability of models in oil exporting and importing countries facing different levels of demand uncertainty, the SVR model demonstrates higher stability compared to other models.
... The price fluctuations of crude oil have significant implications for the overall economy of a country. India, ranked as the third-largest crude oil importer worldwide, is particularly vulnerable have used sophisticated forecasting models like time-series approach GARCH (Morana, 2001), GARCH-MIDAS-EUEPU (Dai et al., 2022b), structural VECM (Bekiros & Diks, 2008;Kaufmann & Ullman, 2009), dynamic optimization model (Pindyck, 1978), and others (Al-Fattah, 2019;Chen et al., 2017;Gong & Lin, 2017;Ramyar & Kianfar, 2019) to reliably predict the crude oil prices in the short and long run to help the decision-makers in framing the policy. While these techniques are effective in comprehending price movements over the medium to long-term, they do lack a non-linear and non-parametric approach that could potentially offer superior results and improved computational efficiency. ...
Full-text available
The uncertainty caused by high volatile crude oil prices and the higher level of deregulations worldwide has significant effects on the economic growth of a country. The financial markets of many developing countries experienced a severe downturn during the oil price shocks in March-April 2020. Traditional predictive approaches, which assume linearity and stationarity of time series in the long run, fail to accurately capture short-term fluctuations. This paper presents an efficient algorithm based on ARMA denoising and taking advantage of the wavelet transformation. By decomposing the time series and extracting the intricate underlying structure, wavelet denoising minimizes distortions and enhances forecasting accuracy. The results demonstrate a substantial improvement in performance compared to conventional forecasting techniques.
... Al-Fattah [44] introduced a novel model for predicting the price of gasoline. This model uses AI, GA, ANN, and data mining techniques to predict the gasoline price. ...
Full-text available
In an increasingly automated world, Artificial Intelligence (AI) promises to revolutionize how people work, consume, and develop their societies. Science and technology advancement has led humans to seek solutions to problems; however, AI-based technology is not novel and has a wide range of economic applications. This paper examines AI applications in economics, including stock trading, market analysis, and risk assessment. A comprehensive taxonomy is proposed to investigate AI applications in various scopes of the proposed categories. Furthermore, we will discuss this area's most significant AI-based techniques and evaluation criteria. As a final step, we will identify challenges, open issues, and future work suggestions. INDEX TERMS Internet of things, Artificial intelligence, Economy, Machine learning, Stock market, Neural network.
... Kristjanpoller & Minutolo, 2016;Al-Fattah, 2019, Bouteska et al., 2023, and individual stocks for different markets(Calôba et al. 2001;Fong et al. 2005;Wang et al. 2012;Kristjanpoller & Minutolo, 2015;Kaushik et al. 2019). ...
Full-text available
Recently, the use of machine learning (ML) in scientific disciplines has experienced an unprecedented increase. Finance has not been an exception. Several works have been published in recent years using mltechniques. However, one of the topics with the least number of developed papers is volatility in this context. Nevertheless, the data analyzed here suggest changes regarding this issue. Data obtained from the Web of Science database show that between 2001 and 2010 there were 33 published papers associated with this topic. Surprisingly, between 2019 and 2023, 189 manuscripts have been published related to this topic. The purpose of this work is to review the works related to the applications of ml in volatility. For this, a classification of the main proposals on this topic is proposed following a narrative methodology, accompanied by a statistical and bibliometric analysis in which novel techniques such as K-means were used. The results are suggestive. Although most papers focus on volatility prediction through neural networks and support vector machines, there is a lack of studies related to volatility transmission, calibration of volatility surfaces,and corporate finance. Moreover, the obtained results indicate that there is a gap in the production of worksrelated to these topics in finance and economics specialized journals.
... This information is valuable for authorities as it helps them validate records and activities. Furthermore, it can be utilized in proposals for repurposing the surrounding land, such as converting it into farmland [20]- [21]. ...
... In the aspect of international oil price forecasting, many scholars have also made outstanding progress. Ali Safari et al. [23]combined the exponential smoothing model, the autoregressive integrated moving average model, and nonlinear autoregressive neural network in a structure of the state space model in 2018 to increase accuracy of forecasting. In [24], Aimei Hu built ARIMA(5,1,3) and GARCH(1,1) models in accordance with monthly data of WTI crude oil and made a forecasting of oil prices in 2012 which showed that the forecasting results' accuracy of GARCH is higher than that of the ARIMA model, and the mean relative error decreased from 8.2157% to 5.4791%, and the root mean square error decreased from 9.449168 to 7.25275. ...
Full-text available
It is meaningful and of certain theoretical value for the development of economy through analyzing fluctuation rules of international oil prices and forecasting the future trend of international oil prices. By composing the autoregressive integrated moving average (ARIMA) model and the combination model of autoregressive integrated moving average model-generalized autoregressive conditional heteroskedasticity (ARIMA-GARCH) for analyzing and forecasting international oil prices, study shows that the combination model of ARIMA (1,1,0)-GARCH (1,1) is more suitable for short-term forecasting of international oil prices with higher accuracy that the MAPE of forecasting has reduced from 1.549% to 0.045% and the RMSE of forecasting has reduced from 1.032 to 0.071.
... To further improve the accuracy of traditional statistical techniques for price prediction of crude oil, artificial intelligence-based models with good generalization, self-learning ability, and memory capacity have been developed. It is seen that the prediction performance of AI-based single models such as artificial neural network (ANN) [23,24], support vector regression (SVR) [25], deep learning based on LSTM [26e28], and ANN-Fuzzy regression [29] is superior to traditional statistical models. Although the performance of AI-based models is superior to traditional statistical methods, when used as a single model, they fall short of dealing with nonlinear dynamics and chaoticity. ...
Abstract Estimating the price of crude oil, which is seen as an important resource for economic development and stability in the world, is a topic of great interest by policy makers and market participants. However, the chaotic and nonlinear characteristics of crude oil time series (COTS) make it difficult to estimate crude oil prices with high accuracy. To overcome these challenges, a new crude oil price prediction model is proposed in this study, which includes the long short-term memory (LSTM), technical indicators such as trend, volatility and momentum, and the chaotic Henry gas solubility optimization (CHGSO) technique. In the proposed model, features based on trend, momentum and volatility technical indicators are utilized. The features are obtained by using the trend indicators such as exponential moving average (EMA), simple moving average (SMA) and Kaufman's adaptive moving average (KAMA), the momentum indicators such as commodity channel index (CCI), rate of change (ROC) and relative strength index (RSI), and the volatility indicators such as average true range (ATR), volatility ratio (VR) and highest high-lowest low (HHLL). These indicators are obtained separately for the West Texas Intermediate (WTI) and Brent COTS. Especially, including the volatility indicator in the model is important in terms of the robustness of the proposed model. The features based on EMA, SMA, KAMA indicators are composed by changing the period values between 3 and 10, the features based on ROC indicator is created by changing the period values between 5 and 12, and the features based on CCI, RSI, ATR, VR and HHLL indicators are formed by changing the period values between 5 and 20. The features are selected by CHGSO algorithm based on the logistic chaotic map, which is successful in avoiding local optima and balancing exploitation and exploration in the search space. 
Both Theil's U and the mean absolute percentage error (MAPE) values are utilized in the optimization algorithm as the objective function. The results show that the proposed prediction model copes with the chaoticity and nonlinear dynamics of both WTI and Brent COTS.
The main aim of the proposed study is to develop a hybrid temporal model that provides learning pattern for classifying the temporal data. These results are unusual, which is in contrast with the Hidden Markov Models (HMM). The system is evaluated in terms of the capabilities of a hybrid learning algorithm, which is applied over the temporal data. Performance of the hybrid algorithm depends entirely on the dynamic data, which is fed into the system. The data fitting is an important concern, to find, analyse and predict the future instance. Hence, the difficulty in making a hybrid algorithm to fit the dynamic data is increasing, however, the data fits in better proportion over the expert system. An expensive research is required to build the required module for data pre-processing, analyzing and prediction. Also comparing such systems’ performance with the conventional schemes is required to prove its effectiveness. The study aims at developing a most generic artificial neural network hybrid algorithm, which predicts well the stock market data without the knowledge of past outputs. Hence, the end user does not trouble the recognition system and that is regarded as the virtues of soft computing tools
Full-text available
Determination of relative permeability data is required for almost all calculations of fluid flow in petroleum reservoirs. Water-oil relative permeability data play important roles in characterizing the simultaneous two-phase flow in porous rocks and predicting the performance of immiscible displacement processes in oil reservoirs. They are used, among other applications, for determining fluid distributions and residual saturations, predicting future reservoir performance, and estimating ultimate recovery. Undoubtedly, these data are considered probably the most valuable information required in reservoir simulation studies. Estimates of relative permeability are generally obtained from laboratory experiments with reservoir core samples. In the absence of the laboratory measurement of relative permeability data, empirical correlations are usually used to estimate relative permeability data. Developing empirical correlations for obtaining accurate estimates of relative permeability data showed limited success, and proved difficult, especially for carbonate reservoir rocks. Artificial neural network (ANN) technology has proved successful and useful in solving complex structured and nonlinear problems. This paper presents a new modeling technology to predict accurately water-oil relative permeability using ANN. The ANN models of relative permeability were developed using experimental data from waterflood core tests samples collected from carbonate reservoirs of giant Saudi Arabian oil fields. Three groups of data sets were used for training, verification, and testing the ANN models. Analysis of results of the testing data set show excellent agreement with the experimental data of relative permeability. In addition, error analyses show that the ANN models developed in this study outperform all published correlations. 
The benefits of this work include meeting the increased demand for conducting special core analysis, optimizing the number of laboratory measurements, integrating into reservoir simulation and reservoir management studies, and providing significant cost savings on extensive lab work and substantial required time.
During the 1970s, oil market models offered a framework for understanding the growing market power being exercised by major oil producing countries. Few such models have been developed in recent years. Moreover, most large institutions do not use models directly for explaining recent oil price trends or projecting their future levels. Models of oil prices have become more computational, more data driven, less structural and increasingly short run since 2004. Quantitative analysis has shifted strongly towards identifying the role of financial instruments in shaping oil price movements. Although it is important to understand these short-run issues, a large vacuum exists between explanations that track short-run volatility within the context of long-run equilibrium conditions. The theories and models of oil demand and supply that are reviewed in this paper, although imperfect in many respects, offer a clear and well-defined perspective on the forces that are shaping the markets for crude oil and refined products. The complexity of the world oil market has increased dramatically in recent years and new approaches are needed to understand, model, and forecast oil prices today. Several kinds of models have been proposed, including structural, computational and reduced form models. Recently, artificial intelligence was also introduced. This paper provides: (1) a model taxonomy and the uses of models, providing the motivation for its preparation, (2) a brief chronology explaining how oil market models have evolved over time, (3) three different model types: structural, computational, and reduced form models, and (4) artificial intelligence and data mining for oil market models.
The complexity of the world oil market has increased dramatically in recent years and new approaches are needed to understand, model, and forecast oil prices today. In addition to the commencement of the financialization era in oil markets, there have been structural changes in the global oil market. Financial instruments are communicating information about future conditions much more rapidly than in the past. Prices from long and short duration contracts have started moving more together. Sudden supply and demand adjustments, such as the financial crisis of 2008-2009, faster Chinese economic growth, the Libyan uprising, the Iranian nuclear standstill or the Deepwater Horizon oil spill, change expectations and current prices. Although volatility appears greater, financialization makes price discovery more robust. Most empirical economic studies suggest that fundamental values shaped expectations over 2004-2008, although financial bubbles may have emerged just prior to and during the summer of 2008. With increased price volatility, major exporters are considering ways and means to achieve more price stability to improve long-term production and consumption decisions. Managing excess capacity has historically been an important method for keeping world crude oil prices stable during periods of sharp demand or supply shifts. Building and maintaining excess capacity in current markets allows greater price stability when Asian economic growth suddenly accelerates or during periods of supply uncertainty in major producing regions. OPEC can contribute to price stability more easily when members agree on the best use of oil production capacity. Important structural changes have emerged in the global oil market after major price increases.
Partially motivated by government policies, major improvements in energy and oil efficiencies occurred after the oil price increases of the early and late 1970s, such as improved vehicle fuel efficiency, building codes, and power grids and systems. On the supply side, seismic imaging and horizontal drilling as well as favorable tax regimes expanded production capacity in countries outside OPEC. After the oil price increases of 2004-2008, investments in oil sands, deep water, biofuels and other non-conventional sources accelerated. Recent improvements in shale gas production could well be transferred to oil-producing activities, resulting in expanded oil supplies in areas recently considered prohibitively expensive. The search for alternative transportation fuels continues with expanded research into compressed natural gas, biofuels, diesel made from natural gas, and electric vehicles. Still, some aspects of the world oil market are not well understood. Despite numerous attempts to model the behavior of OPEC or its members, there exists no credible, verifiable theory about the behavior of the 50-year-old organization. OPEC has not consistently acted like a monolithic cartel, constraining supplies to raise prices. Empirical evidence suggests that members sometimes coordinate supply responses and at other times compete with each other. Supply-restraint strategies include slower capacity expansions as well as curtailed production from existing capacity. Regional political considerations and broader economic goals (beyond oil) are influential factors in a country’s oil decisions. Furthermore, the economies of OPEC members as well as their financial needs have changed dramatically from the 1970s and 1980s. This review represents a broad survey of economic research and literature related to the structure and functioning of the world oil market.
The theories and models of oil demand and supply reviewed here, although imperfect in many respects, offer a clear and well-defined perspective on the forces that are shaping the markets for crude oil and refined products. Much work remains to be done if we are to achieve a more complete understanding of these forces and the trends that lie ahead. The contents that follow represent an assessment of how far we have come and where we are headed. Of course, the entire world shares a vital interest in the many benefits that flow from an efficient, well-functioning oil market. It is intended and hoped, therefore, that the discussion in this review will find a broader audience.
Modelling and forecasting crude oil price volatility is crucial in many financial and investment applications. The main purpose of this paper is to review and assess the current state of oil market volatility knowledge. It highlights the properties and characteristics of oil price volatility that models seek to capture, and discusses the different modelling approaches to oil price volatility. Asymmetric response to price change, persistence and mean reversion, structural breaks, and possible market spillover of volatility are discussed. To complement the discussion, West Texas Intermediate futures price data are used to illustrate these properties using non-parametric and conditional modelling methods. The generalised autoregressive conditional heteroskedasticity-type models usually applied in the oil price volatility literature are also explored. We additionally examine the exogenous factors that may influence volatility in the oil markets.
The industrial and residential market for natural gas produced in the United States has become increasingly significant. Within the past ten years the wellhead value of produced natural gas has rivaled and sometimes exceeded the value of crude oil. Forecasting natural gas supply is an economically important and challenging endeavor. This paper presents a new approach to predict natural gas production for the United States using an artificial neural network. We developed a neural network model to forecast U.S. natural gas supply to the year 2020. Our results indicate that the U.S. will maintain its 1999 production of natural gas to 2001, after which production starts increasing. The network model indicates that natural gas production will increase during the period 2002 to 2012 at an average rate of 0.5%/yr. This rate of increase will more than double for the period 2013 to 2020. The neural network was developed with an initial large pool of input parameters. The input pool included exploratory, drilling, production, and econometric data. Preprocessing the input data involved normalization and functional transformation. Dimension reduction techniques and sensitivity analysis of input variables were used to remove redundant and unimportant input parameters, and to simplify the neural network. The remaining input parameters of the reduced neural network included data of gas exploratory wells, oil/gas exploratory wells, oil exploratory wells, gas depletion rate, proved reserves, gas wellhead prices, and growth rate of gross domestic product. The three-layer neural network was successfully trained with yearly data from 1950 to 1989 using the quick-propagation learning algorithm. The target output of the neural network is the production rate of natural gas. The agreement between predicted and actual production rates was excellent.
A test set, not used to train the network and containing data from 1990 to 1998, was used to verify and validate the network performance for prediction. Analysis of the test results shows that the neural network approach provides an excellent match of actual gas production data. An econometric approach, called stochastic modeling or time series analysis, was used to develop forecasting models for the neural network input parameters. A comparison of forecasts between this study and other forecasts is presented. The neural network model can be used as a short-term as well as a long-term predictive tool of natural gas supply. The model can also be used to examine quantitatively the effects of the various physical and economic factors on future gas production.
This paper investigates whether GARCH-type models can adequately capture the long memory that is widely present in the volatility of WTI crude oil returns. Within this framework, we model the volatility of spot and futures returns using several GARCH-class models. Then, using two non-parametric methods, detrended fluctuation analysis (DFA) and rescaled range analysis (R/S), we compare the long memory properties of the conditional volatility series obtained from GARCH-class models with those of the actual volatility series. Our results show that GARCH-class models capture the long memory properties well for time scales larger than a year. However, for time scales smaller than a year, the GARCH-class models are misspecified.
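As a rough illustration of the rescaled range (R/S) analysis mentioned in the abstract above, the sketch below estimates a Hurst exponent from a series. It is a minimal sketch and not the cited authors' procedure: the function names, the window sizes, and the use of non-overlapping windows are all assumptions chosen for brevity. An estimate near 0.5 suggests no long memory, while values above 0.5 suggest persistence.

```python
import math
import random


def rescaled_range(series):
    """R/S statistic of one window: range of the cumulative
    mean-adjusted sums divided by the window's standard deviation."""
    n = len(series)
    mean = sum(series) / n
    cum = lo = hi = 0.0
    for x in series:
        cum += x - mean
        lo, hi = min(lo, cum), max(hi, cum)
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    return (hi - lo) / std if std > 0 else float("nan")


def hurst_rs(series, window_sizes=(8, 16, 32, 64)):
    """Hurst exponent estimate: least-squares slope of
    log(mean R/S) against log(window size).

    The window sizes are an illustrative assumption."""
    xs, ys = [], []
    for w in window_sizes:
        # Split the series into non-overlapping windows of length w.
        chunks = [series[i:i + w] for i in range(0, len(series) - w + 1, w)]
        values = [v for v in map(rescaled_range, chunks) if not math.isnan(v)]
        if values:
            xs.append(math.log(w))
            ys.append(math.log(sum(values) / len(values)))
    if len(xs) < 2:
        raise ValueError("series too short for the chosen window sizes")
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    random.seed(0)
    noise = [random.gauss(0.0, 1.0) for _ in range(512)]  # synthetic data
    print(hurst_rs(noise))
```

In a long-memory study the input would typically be a volatility proxy such as absolute or squared returns; the synthetic white noise here only demonstrates the call.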
In this paper, we examine the volatility of crude oil price using daily data for the period 1991–2006. Our main innovation is that we examine volatility in various sub-samples in order to judge the robustness of our results. Our main findings can be summarised as follows: (1) across the various sub-samples, there is inconsistent evidence of asymmetry and persistence of shocks; and (2) over the full sample period, evidence suggests that shocks have permanent effects, and asymmetric effects, on volatility. These findings imply that the behaviour of oil prices tends to change over short periods of time.
This article investigates the efficacy of a volatility model for three crude oil markets — Brent, Dubai, and West Texas Intermediate (WTI) — with regard to its ability to forecast and identify volatility stylized facts, in particular volatility persistence or long memory. In this context, we assess persistence in the volatility of the three crude oil prices using conditional volatility models. The CGARCH and FIGARCH models are better equipped to capture persistence than are the GARCH and IGARCH models. The CGARCH and FIGARCH models also provide superior performance in out-of-sample volatility forecasts. We conclude that the CGARCH and FIGARCH models are useful for modeling and forecasting persistence in the volatility of crude oil prices.
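For orientation, the plain GARCH(1,1) model that serves as the baseline in the abstract above updates the conditional variance as sigma2[t] = omega + alpha * r[t-1]^2 + beta * sigma2[t-1], and alpha + beta measures how persistent volatility shocks are. The sketch below implements only this recursion with given parameters. The function name, the parameter values, and the choice to start from the sample variance are assumptions, and the code does not estimate parameters or implement CGARCH or FIGARCH.

```python
def garch11_variance(returns, omega, alpha, beta):
    """Conditional variances from the GARCH(1,1) recursion.

    sigma2[t] = omega + alpha * returns[t-1]**2 + beta * sigma2[t-1]
    The first value is set to the sample variance of the returns,
    which is one common initialisation (an assumption here).
    """
    if not returns:
        return []
    mean = sum(returns) / len(returns)
    sigma2 = [sum((r - mean) ** 2 for r in returns) / len(returns)]
    for r in returns[:-1]:
        sigma2.append(omega + alpha * r ** 2 + beta * sigma2[-1])
    return sigma2


if __name__ == "__main__":
    # Hypothetical daily returns and parameters, for illustration only.
    demo_returns = [0.01, -0.02, 0.015, -0.03, 0.005]
    omega, alpha, beta = 1e-6, 0.08, 0.90
    print(garch11_variance(demo_returns, omega, alpha, beta))
    print("persistence (alpha + beta):", alpha + beta)
```

When alpha + beta equals 1 the recursion corresponds to the IGARCH case, in which shocks to variance do not die out.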
This article applies a technique developed by Robert C. Merton (1981) and Roy D. Henriksson and Robert C. Merton (1981) for evaluating the market timing of financial managers to macroeconomic predictions of change. This methodology may be used to determine whether the predictions may be of value to the user. As an illustration, the methodology is applied to a set of real gross national product forecasts. Copyright 1990 by the University of Chicago.
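The market-timing idea in the abstract above can be sketched with two conditional hit rates: p1, the share of actual rises that were forecast as rises, and p2, the share of actual non-rises that were forecast as non-rises. Under Merton's criterion, directional forecasts carry value when p1 + p2 exceeds 1. The code below is a minimal sketch that computes only these point estimates and omits the formal significance test. The function name and the convention of treating a zero change as a non-rise are assumptions.

```python
def timing_probabilities(forecast_changes, actual_changes):
    """Return (p1, p2): the fraction of actual rises forecast as rises,
    and the fraction of actual non-rises forecast as non-rises.

    A change of zero is counted as a non-rise (an assumption)."""
    up_hits = up_total = down_hits = down_total = 0
    for f, a in zip(forecast_changes, actual_changes):
        if a > 0:
            up_total += 1
            up_hits += f > 0
        else:
            down_total += 1
            down_hits += f <= 0
    if up_total == 0 or down_total == 0:
        raise ValueError("need at least one rise and one non-rise in the actual data")
    return up_hits / up_total, down_hits / down_total


if __name__ == "__main__":
    # Hypothetical forecast and realised changes, for illustration only.
    forecast = [1.0, -1.0, -1.0, -1.0]
    actual = [1.0, 1.0, -1.0, -1.0]
    p1, p2 = timing_probabilities(forecast, actual)
    print(p1, p2, "valuable" if p1 + p2 > 1 else "no timing value")
```

Unlike a single overall hit rate, the sum p1 + p2 is not inflated by a forecaster who always predicts the more frequent direction, since such a forecaster scores exactly 1.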