Neural Network Approach Predicts U.S.
Natural Gas Production
S.M. Al-Fattah, SPE, Saudi Aramco, and R.A. Startzman, SPE, Texas A&M U.
Summary
The industrial and residential market for natural gas produced in
the United States has become increasingly significant. Within the
past 10 years, the wellhead value of produced natural gas has
rivaled and sometimes exceeded the value of crude oil. Forecasting
natural gas supply is an economically important and challenging
endeavor. This paper presents a new approach to predict natural
gas production for the United States with an artificial neural net-
work (NN).
We developed an NN model to forecast the U.S. natural gas
supply to 2020. Our results indicate that the U.S. will maintain its
1999 production of natural gas until 2001, after which production
increases. The network model indicates that natural gas production
will increase by an average rate of 0.5%/yr from 2002 to 2012.
This increase will more than double from 2013 to 2020.
The NN was developed with a large initial pool of input pa-
rameters. The input pool included exploratory, drilling, produc-
tion, and econometric data. Preprocessing the input data involved
normalization and functional transformation. Dimension-reduction
techniques and sensitivity analysis of input variables were used to
reduce redundant and unimportant input parameters and to sim-
plify the NN. The remaining parameters included data from gas
exploratory wells, oil/gas exploratory wells, oil exploratory wells,
gas depletion rate, proved reserves, gas wellhead prices, and
growth rate of the gross domestic product. The three-layer NN was
successfully trained with yearly data from 1950 to 1989 using the
quick-propagation learning algorithm. The NN’s target output is
the production rate of natural gas. The agreement between pre-
dicted and actual production rates was excellent. A test set not used
to train the network and containing data from 1990 to 1998 was
used to verify and validate the network prediction performance.
Analysis of the test results showed that the NN approach provides
an excellent match with actual gas production data. An economet-
ric approach, called stochastic modeling or time-series analysis,
was used to develop forecasting models for NN input parameters.
A comparison of forecasts between this study and another is pre-
sented.
The NN model has use as a short-term as well as a long-term
predictive tool for natural gas supply. The model can also be used
to quantitatively examine the effects of the various physical and
economic factors on future gas production.
Introduction
In recent years, there has been a growing interest in applying artificial NNs [1–4] to various areas of science, engineering, and finance. Among other applications [4] to petroleum engineering, NNs have been used for pattern recognition in well-test interpretation [5] and for prediction in well logs [4] and phase behavior [6].
Artificial NNs are an information-processing technology in-
spired by studies of the brain and nervous system. In other words,
they are computational models of biological neural structures.
Each NN generally consists of a number of interconnected pro-
cessing elements (PE) or neurons grouped in layers. Fig. 1 shows
the basic structure of a three-layer network—one input, one hid-
den, and one output. The neuron consists of multiple inputs and a
single output. “Input” denotes the values of the independent variables, and “output” denotes the dependent variable. Each input is modified by a weight that multiplies the input value. The input can be
raw data or output from other PEs or neurons. With reference to a
threshold value and activation function, the neuron will combine
these weighted inputs and use them to determine its output. The
output can be either the final product or an input to another neuron.
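The processing element just described can be sketched in a few lines. This is an illustrative sketch, not the paper's software; the function name and the choice of a sigmoid activation (introduced later as Eq. 3) are ours.

```python
import math

def neuron_output(inputs, weights, bias):
    """One processing element: combine the weighted inputs with a bias,
    then pass the net input through a sigmoid activation to produce
    the neuron's single output."""
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-net))
```

With all weights and bias at zero, the sigmoid returns its midpoint, 0.5; any real-valued net input is squashed into (0, 1).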
This paper describes the methodology of developing an artifi-
cial NN model to predict U.S. natural gas production. It presents
the results of the NN modeling approach and compares it to other
modeling approaches.
Data Sources
The data used to develop the artificial NN model for U.S. gas production were collected mostly from the Energy Information Admin. (EIA) [7]. U.S. marketed-gas production for 1918 to 1997 was obtained from Twentieth Century Petroleum Statistics [8, 9], with the EIA’s 1998 production data. Gas-discovery data from 1900 to 1998 were from Refs. 7 and 10. Proved gas reserves for 1949 to 1999 came from the Oil and Gas J. (OGJ) database [11]. The EIA provides various statistics on U.S. historical energy data, including gas production, exploration, drilling, and econometrics. These data are available to the public and can easily be downloaded from the EIA website [7]. The following data (1949 to 1998) were downloaded from that site:
- Gas discovery rate.
- Population.
- Gas wellhead price.
- Oil wellhead price.
- Gross domestic product (D_G), with purchasing power parity (PPP) based on 1992 U.S. dollars.
- Gas exploratory wells: footage and wells drilled.
- Oil exploratory wells: footage and wells drilled.
- Percentage of successful wells drilled.
- Oil and gas exploratory wells: footage and wells drilled.
- Proved gas reserves.
Other input parameters were also derived from the previous data parameters. The derived input parameters include the following.
Gross domestic product growth rate. This input parameter was calculated with the following formula [12]:

G_DP,i+1 = [ (D_G,i+1 / D_G,i)^(1/(t_(i+1) − t_i)) − 1 ] × 100, ..........(1)

where D_G = gross domestic product, G_DP = growth rate of gross domestic product, t = time, and i = observation number.
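Eq. 1 is a standard compound annual growth rate between two observations. A minimal sketch (the function name is ours):

```python
def gdp_growth_rate(dg_prev, dg_next, t_prev, t_next):
    """Eq. 1: compound growth rate (%) of GDP between observations
    i and i+1, annualized over the elapsed time t_(i+1) - t_i."""
    years = t_next - t_prev
    return ((dg_next / dg_prev) ** (1.0 / years) - 1.0) * 100.0
```

For example, GDP growing from 100 to 121 over two years corresponds to a 10%/yr compound rate.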
Average depth drilled per well. This is calculated by dividing
the footage drilled by the number of exploratory wells drilled each
year. This is done for the gas exploratory wells, oil exploratory
wells, and oil-and-gas exploratory wells, resulting in three addi-
tional new input variables.
Depletion rate. This measures how fast the reserves are being
depleted each year at that year’s production rate. It is calculated as
the annual production divided by the proved reserves and is ex-
pressed in a percentage.
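The two derived quantities above are simple ratios; a sketch with our own function names:

```python
def average_depth_per_well(footage_drilled, wells_drilled):
    """Average depth drilled per exploratory well in a given year:
    total footage divided by the number of wells drilled."""
    return footage_drilled / wells_drilled

def depletion_rate(annual_production, proved_reserves):
    """Annual depletion rate, in percent: the fraction of proved
    reserves produced in one year at that year's production rate."""
    return 100.0 * annual_production / proved_reserves
```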
Data Preprocessing
Data preparation is a critical procedure in the development of an
artificial NN system. The preprocessing procedures used in the
construction process of this study’s NN model are input/output
normalization and transformation.
Copyright © 2003 Society of Petroleum Engineers
This paper (SPE 82411) was revised for publication from paper SPE 67260, first presented
at the 2001 SPE Production and Operations Symposium, Oklahoma City, Oklahoma, 25-28
March. Original manuscript received for review 22 April 2002. Revised manuscript received
21 November 2002. Paper peer approved 3 December 2002.
84 May 2003 SPE Production & Facilities
Normalization. Normalization is the process of standardizing the
possible numerical range input data can take. It enhances the fair-
ness of training by preventing an input with large values from
swamping out another that is equally important but has smaller
values. Normalization is also recommended because the network
training parameters can be tuned for a given range of input data;
thus, the training process can be carried over to similar tasks.
We used the mean/standard-deviation normalization method to normalize all the NN’s input and output variables. Mean/standard-deviation preprocessing is the most commonly used method and generally works well in almost every case. Its advantages are that it processes the input variable without any loss of information and that its transform is mathematically reversible. Each input variable, as well as the output, was normalized with the following formula [13]:

X′_i = (X_i − μ_i) / σ_i, ..........(2)

where X′ = normalized input/output vector, X = original input/output vector, μ = mean of the original input/output, σ = standard deviation of the input/output vector, and i = index of the input/output vector. Each input/output variable was normalized with its mean and standard deviation values with Eq. 2. This process was applied to all the data, including the training and testing sets. The single set of normalization parameters for each variable (i.e., the standard deviation and the mean) was then preserved to be applied to new data during forecasting.
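Eq. 2 and its preserved parameters can be sketched as follows (numpy-based; the function names are ours):

```python
import numpy as np

def fit_normalizer(x):
    """Compute and preserve the per-variable mean and standard
    deviation used in Eq. 2."""
    x = np.asarray(x, dtype=float)
    return x.mean(), x.std()

def normalize(x, mean, std):
    """Eq. 2: subtract the mean, divide by the standard deviation."""
    return (np.asarray(x, dtype=float) - mean) / std

def denormalize(x_norm, mean, std):
    """Eq. 2 is mathematically reversible: recover the original scale."""
    return np.asarray(x_norm, dtype=float) * std + mean
```

The same (mean, std) pair fitted on the historical data is later applied to new data during forecasting.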
Transformation. Our experience is that an NN performs better with normally distributed, nonseasonal data. Input data exhibiting trends or periodic variations make data transformation necessary. There are different ways to transform the input variables into forms that make the input data easier for the NN to interpret and faster to process during training. Examples of such transformations include the variable’s first derivative, the relative variable difference, the natural logarithm of the relative variable, the square root of the variable, and trigonometric functions. In this study, all input and output variables were transformed with the first derivative of each. This choice of transform removed the trend in each input variable, thus helping to reduce the multicollinearity among the input variables.
Using the first derivative also results in greater fluctuation and
contrast in the values of the input variables. This improves the
ability of the NN model to detect significant changes in patterns.
For instance, if gas exploratory footage (one of the input variables)
is continuously increasing, the actual level may not be as important
as the first time-derivative of footage, or the rate of change in
footage from year to year.
The first-derivative transformation, however, resulted in a loss
of one data point because of its mathematical formulation.
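The first-derivative transform used here is the year-over-year first difference; a short sketch (function name ours) that also shows the loss of one data point:

```python
import numpy as np

def first_difference(series):
    """First-derivative (year-over-year difference) transform.
    The result has one fewer point than the input series."""
    s = np.asarray(series, dtype=float)
    return s[1:] - s[:-1]
```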
Selection of NN Inputs and Outputs
Gas production was selected as the NN output because it is the
prediction target. Diagnostic techniques, such as scatter plots and
correlation matrices, were performed on the data to check their
validity and to study relationships between the target and each of
the predictor variables. For example, a scatter plot for average
footage drilled per oil and gas exploratory well vs. gas production
is shown in Fig. 2. The correlation coefficients for all inputs vs. the
target (gas production) are given in Table 1. The highest correla-
tion coefficient value is 0.924 for Input I-9, average footage drilled
per oil and gas exploratory well. This is also shown in Fig. 2 by the
high linear correlation of this variable with gas production. The
correlation matrix helps reduce the number of input variables by
excluding those with high correlation coefficients, some of which,
however, are important and needed to be included in the network
model because of their physical relations with the target. This
problem can be alleviated by applying transformation techniques
to remove the trend and reduce the high correlation coefficient.
Fig. 3 shows a scatter plot of Input I-9 vs. gas production after
performing the normalization and the first derivative transforma-
tion. The figure shows that the data points are more scattered and
fairly distributed around the zero horizontal line. The preprocess-
ing procedure resulted in a 45% reduction of the correlation coef-
ficient for this input, from 0.924 to 0.512.
NN Model Design
There are a number of design factors that must be considered in
constructing an NN model. These considerations include selection
Fig. 1—Basic structure of a three-layer back-propagation (BP) NN.
Fig. 2—Scatter plot of gas production and average footage drilled per oil and gas exploratory well.
of the NN architecture, the learning rule, the number of processing
elements in each layer, the number of hidden layers, and the type
of transfer function. Fig. 4 depicts an illustration of the NN model
designed in this study.
Architecture. The NN architecture determines the method by
which the weights are interconnected in the network and specifies
the type of learning rules that may be used. Selecting the network
architecture is one of the first tasks in setting up an NN. The
multilayer, normal feedforward network [1–3] is the most commonly used architecture and is generally recommended for most applications; hence, it was selected for this study.
Learning Algorithm. Selection of a learning rule is also an im-
portant step because it affects the determination of input and trans-
fer functions and associated parameters. The network used is based
on a back-propagation (BP) design [1], the most widely recognized and most commonly used supervised-learning algorithm. In this study, the quick-propagation (QP) learning algorithm [14], which is an enhanced version of BP, is used for its performance and
speed. The advantage of QP is that it runs faster than BP by
minimizing the time required to find a good set of weights with
heuristic rules. These rules automatically regulate the step size and
detect conditions that accelerate learning. The optimum step size is
then determined by evaluating the trend of the weight updates
with time.
The fundamental design of a BP NN consists of an input layer,
a hidden layer, and an output layer, as shown in Fig. 4. A layer
consists of a number of processing elements or neurons and is fully
connected, indicating that each neuron of the input layer is con-
nected to each hidden-layer node. Similarly, each hidden-layer
node is connected to each output-layer node. The number of nodes
needed for the input and output layers depends on the number of
inputs and outputs designed for the NN.
Activation Rule. A transfer function acts on the value returned by
the input function, which combines the input vector with the
weight vector to obtain the net input to the processing element
given a particular input vector. Each transfer function introduces a
nonlinearity into the NN, enriching its representational capacity. In
fact, it is the nonlinearity of the transfer function that gives an NN
its advantage vs. conventional or traditional regression techniques.
There are also a number of transfer functions. Among those are
sigmoid, arctan, sin, linear, Gaussian, and Cauchy. The most com-
monly used transfer function is the sigmoid function. It squashes
and compresses the input function when it takes on large positive
or negative values. Large positive values asymptotically approach
1, while large negative values are squashed to 0. The sigmoid is
given by [1]

f(x) = 1 / [1 + exp(−x)]. ..........(3)
Fig. 5 is a typical plot of the sigmoid function. In essence, the
activation function acts as a nonlinear gain for the processing
element. The gain is actually the slope of the sigmoid at a specific
point. It varies from a low value at large negative inputs to a high
value at zero input, then drops back toward zero as the input
becomes large and positive.
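The sigmoid of Eq. 3 and its slope (the processing element's nonlinear gain) can be sketched directly; the function names are ours:

```python
import math

def sigmoid(x):
    """Eq. 3: squashes any real input into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_gain(x):
    """Slope of the sigmoid at x. It peaks at x = 0 (value 0.25) and
    falls toward zero as the input becomes large in magnitude."""
    s = sigmoid(x)
    return s * (1.0 - s)
```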
Training Procedure
In the first step of the development process, the available data were
divided into training and test sets. The training set was selected to
cover the data from 1949 to 1989 (40-year data points), while the
testing set covered the data from 1990 to 1998 (9-year data points).
We chose to split the data based on an 80/20 rule. We first nor-
malized all input variables and the output with the average/
standard deviation method, then took the first derivative of all
input variables, including the output. In the initial training and
testing phases, we developed the network model with most of the
default parameters in the NN software. Generally, these default
settings provided satisfactory beginning results. We examined dif-
ferent architectures, different learning rules, and different input
and transfer functions (with increasing numbers of hidden-layer
neurons) on the training set to find the optimal learning parameters
and then the optimal architecture. We primarily used the black-box
testing approach (comparing network results to actual historical
results) to verify that the inputs produce the desired outputs. During training, we used several diagnostic tools to facilitate understanding of how the network is training. These include:
- The MSE of the entire output.
- A plot of the MSE vs. the number of iterations.
Fig. 4—NN design from this study.
Fig. 5—Sigmoid function.
Fig. 3—Scatter plot of gas production and average footage drilled per oil and gas exploratory well after data preprocessing.
- The percentage of training- or testing-set samples that are correct based on a chosen tolerance value.
- A plot of the actual vs. the network output.
- A histogram of all the weights in the network.
The three-layer network with all initial 15 input variables was
trained with the training samples. We chose the number of neurons
in the hidden layer on the basis of existing rules of thumb [2, 3] and
experimentation. One rule of thumb states that the number of
hidden-layer neurons should be approximately 75% of the input
variables. Another rule suggests that the number of hidden-layer
neurons be approximately 50% of the total number of input and
output variables. One of the advantages of the neural software used
in this study is that it allows the user to specify a range for the
minimum and maximum number of hidden neurons. Putting all
this knowledge together with our experimentation experience al-
lowed us to specify the range of 5 to 12 hidden neurons for the
single hidden layer.
We used the input sensitivity analysis to study the significance
of each input parameter and how it affects network performance.
This procedure helps to reduce the redundant input parameters and
determine the optimum number of NN input parameters. In each
training run, the results of the input sensitivity analysis are exam-
ined and the least-significant input parameter is deleted, then the
weights are reset and the network-training process is restarted with
the remaining input parameters. This process is repeated until all
the input parameters are found to have a significant contribution to
network performance. The input is considered significant when its
normalized effect value is greater than or equal to 0.7 in the train-
ing set and 0.5 in the test set. We varied the number of iterations
used to train the network from 500 to 7,000 to find the optimal
number. Three thousand iterations were used for most of the train-
ing runs. In the process, training is automatically terminated when
the maximum iterations are reached or the mean square error of the
network falls to less than the set limit, specified as 1.0×10⁻⁵. While
training the network, the test set is also evaluated. This step en-
ables a pass through the test set for each pass through the training
set. However, this step does not intervene with the training statis-
tics other than evaluating the test set while training for fine-tuning
and generalizing the network parameters.
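The input-pruning procedure described above can be sketched as a loop. Here `train_fn` and `effect_fn` are hypothetical stand-ins for the neural software's training run (with weights reset) and its sensitivity report of normalized effect values; the thresholds are those stated in the text.

```python
def prune_inputs(inputs, train_fn, effect_fn,
                 train_threshold=0.7, test_threshold=0.5):
    """Iteratively delete the least-significant input parameter until
    every remaining input meets the normalized-effect thresholds on
    both the training set and the test set."""
    inputs = list(inputs)
    while inputs:
        model = train_fn(inputs)            # reset weights and retrain
        effects = effect_fn(model, inputs)  # {input: (train_eff, test_eff)}
        weakest = min(inputs, key=lambda k: effects[k][0])
        train_eff, test_eff = effects[weakest]
        if train_eff >= train_threshold and test_eff >= test_threshold:
            break                           # all inputs are significant
        inputs.remove(weakest)
    return inputs
```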
After training, the network performance was tested. The test set
was used to determine how well the network performed with data
it had not seen during training.
To evaluate network performance, the classification option
used specified the network output as correct based on a set toler-
ance. This method evaluates the percentage of training and testing
samples that faithfully generalize the patterns and values of the
network outputs. We used a tolerance of 0.05 in this study (the
default value is 0.5), meaning that all outputs for a sample must be
within this tolerance for it to be considered correct. Another mea-
sure is the plot of the mean square error vs. the number of itera-
tions. A well-trained network is characterized by decreasing errors
for both the training and test sets as the number of itera-
tions increases.
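The tolerance-based classification can be sketched as follows (function name ours; the 0.05 tolerance is the value used in this study):

```python
def percent_correct(predicted, actual, tolerance=0.05):
    """Percentage of samples whose output falls within the chosen
    tolerance of the actual value; a sample outside the tolerance
    is counted as incorrect."""
    hits = sum(1 for p, a in zip(predicted, actual)
               if abs(p - a) <= tolerance)
    return 100.0 * hits / len(actual)
```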
Results of Training and Testing
We used the input sensitivity-analysis technique
2,14
to gauge the
sensitivity of the gas production rate (output) for any particular
input. The method makes use of the weight values of a successfully
trained network to extract the information relevant to any particu-
lar input node. The outcome is the effect and normalized effect
values for each input variable at the gas-production output rate.
These effect values represent an assessment of the influence of any
particular input node on the output node.
The results of the input-identification process and training pro-
cedure indicated that the network has excellent performance with
11 input parameters. We found that these parameters, described in
Table 2, contribute significantly to network performance.
Tables 3 and 4 present the results of the input sensitivity
analysis for the training and test sets, respectively. The normalized
effect values indicate that all 11 inputs contribute significantly to
the improvement of the network performance and to the prediction
of the U.S. natural-gas production rate for both the training and test
sets. The training set input-sensitivity analysis (Table 3) shows that
the gas annual depletion rate (I-15) is the most significant input
parameter contributing to network performance and, hence, to pre-
dicting U.S. natural gas production. Although we found it impor-
tant to network performance improvement and kept it in the model,
the input of gas wellhead prices (I-3) has the least normalized
effect value (0.7) of all other inputs in the training set. Table 4
shows that all inputs in the test set exceeded the arbitrary specified
threshold value of 0.5, indicating that all inputs contribute signifi-
cantly to the network model.
The network was trained with 5,000 iterations and the QP
learning algorithm. We found that the optimum number of hidden-
layer nodes is 5. Fig. 6 shows the NN model prediction, after the
training and validation processes, superimposed on the normal-
ized, actual U.S. gas production. The NN prediction results show
excellent agreement with the actual production data in both the
training and testing stages. These results indicate that the network
was trained and validated very well and is ready to be used for
forecasting. In addition, statistical and graphical error analyses
were used to examine network performance.
Optimization of Network Parameters. We attempted different
network configurations to optimize the number of hidden nodes
and number of iterations and thus fine-tune the network perfor-
mance, running numerous simulations in the optimization process.
Table 5 presents potential cases for illustration purposes only and
shows that increasing the number of iterations to more than 5,000
improves the training-set performance but worsens the test-set per-
formance. In addition, decreasing the number of iterations to 3,000
yields higher errors for both the training and test sets. The number of hidden-layer nodes was also varied, from 4 to 22. Increasing the
number of hidden nodes to more than five showed good results for
the training set but gave unsatisfactory results for the test set,
which is the most important. From these analyses, the op-
timal network configuration for this specific U.S. gas produc-
tion model is a three-layer QP network with 11 input nodes, 5
hidden nodes, and 1 output node. The network is optimally trained
with 5,000 iterations.
Error Analysis. Statistical accuracy of this network performance
is given in Table 5 (Case 11a). The mean squared error (MSE) is 0.0034 for the training set and 0.0252 for the test set. Fig. 7 shows
the MSE vs. the iterations for both the training and test sets. The
errors with training-set samples decrease consistently throughout
the training process. In addition, errors with the test-set samples
decrease fairly consistently along with the training-set samples,
indicating that the network is generalizing rather than memorizing.
All the training- and test-set samples yield results of 100% correct
based on 0.05 tolerance, as shown in Fig. 8.
Fig. 9 shows the residual plot of the NN model for both the
training and test samples. The plot shows not only that training set
errors are minimal but also that they are evenly distributed around
zero, as shown in Fig. 10. As is usually the case, errors in test
samples are slightly higher than in training ones. The crossplots of
predicted vs. actual values for natural gas production are presented
in Figs. 11 and 12. Almost all the plotted points of this study’s NN
model fall very close to the perfect 45° straight line, indicating its
high degree of accuracy.
Forecasting
After successful development of the NN model for U.S. natural gas
production, future gas production rates must also be forecast. To
Fig. 6—Performance of the NN model with actual U.S. gas production.
Fig. 7—Convergence behavior of the QP three-layer network (11, 5, 1) that learned from the U.S. natural gas production data.
Fig. 8—Behavior of training and testing samples classified as correct.
implement the network model for prediction, forecast models
should be developed for all 11 network inputs or obtained from
independent studies. We developed forecasting models for all the
independent network inputs (except for the input of gas wellhead
prices) with the time-series-analysis approach. The forecasts for
the gas wellhead prices came from the EIA [15]. We adjusted the EIA forecasts for gas prices, based on 1998 U.S. dollars/Mcf, to 1992 U.S. dollars/Mcf so that the forecasts would be compatible with the historical gas prices used in network development. We developed the forecasting models for the NN input variables with the Box-Jenkins methodology [16] of time-series analysis. Details of forecast development for other network inputs are described in Ref. 17.
Before implementing the network model for forecasting, we
took one additional step, taking the test set back and adding it to
the original training set. The network could then be trained only
one time, keeping the same configuration and parameters of the
original trained network intact. The purpose of this step is to have
the network take into account the effects of all available data.
Because the amount of data is limited, this ensures generalization
of the network performance, yielding better forecasting.
Next, we saved data for the forecasted network inputs for 1999
to 2020 as a test-set file, whereas the training-set file contained
data from 1950 to 1998. We then ran the network with one pass
through all the training and test sets. We retained the obtained data
results in their original form by adding the output value at a given
time to its previous one. After decoding the first-difference output
values, we denormalized the obtained values for the training and
test samples with the same normalization parameters as in the
data preprocessing.
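The decoding step reverses the preprocessing in the order described: integrate the first-difference outputs in normalized space (adding each value to its predecessor), then denormalize with the preserved mean and standard deviation, i.e., the inverse of Eq. 2. A sketch with our own names:

```python
import numpy as np

def decode_forecast(diff_outputs, last_norm_value, mean, std):
    """Undo the two preprocessing steps: cumulatively sum the
    first-difference network outputs from the last known normalized
    value, then invert the mean/standard-deviation normalization."""
    levels = last_norm_value + np.cumsum(np.asarray(diff_outputs, dtype=float))
    return levels * std + mean
```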
Fig. 13 shows this study’s NN forecasting model for U.S. gas production to 2020. It also shows the excellent match between the NN model results and actual natural gas production data. The NN forecasting model indicates that U.S. gas production in 1999 was in decline, at 1.8% below the 1998 production. Production stays near the 1999 level, with a slight decline, until 2001, after which gas production starts to increase. From 2002 to 2012, gas production will increase steadily, with an average growth rate of approximately 0.5%/yr. The NN model indicates that this growth will more than double from 2013 to 2020, to a 1.3%/yr average growth rate. By 2019, gas production is predicted to reach 22.6 Tcf/yr, approximately the same as the 1973 production level.
The NN forecasting model developed in this study is dependent
not only on the performance of the trained data set but also on the
future performance of forecasted input parameters. Therefore, the
network model should be updated periodically when new data
become available. While it is desirable to update the network
model with new data, the architecture and its parameters need not
be changed. However, a one-time run to train the network with the
updated data is necessary.
Comparison of Forecasts
This section compares the forecasts of U.S. natural gas production
from the EIA [15] with the NN approach and with the stochastic modeling approach developed by Al-Fattah [17].
The EIA 2000 fore-
cast of U.S. gas supply is based on U.S. Geological Survey
(USGS) estimates of U.S. natural gas resources, including conven-
tional and unconventional gas. The main assumptions of the EIA
forecast are as follows:
Fig. 10—Frequency of residuals in the NN model.
Fig. 11—Crossplot of NN prediction model and actual gas production (first difference).
Fig. 12—Crossplot of NN prediction model and actual gas production (normalized).
Fig. 9—Residual plot of the NN model.
- Drilling, operating, and lease-equipment costs are expected to decline by 0.3 to 2%.
- Exploratory success rates are expected to increase by 0.5%/yr.
- Finding rates will improve by 1 to 6%/yr.
Fig. 14 shows the EIA forecast compared to those from this
study with the NN and time-series analysis (or stochastic modeling).
The stochastic forecast modeling approach we used was based
on the Box-Jenkins time series method as described in detail by
Al-Fattah [17].
We studied past trends of all input data to determine
if their values could be predicted with an “autoregressive inte-
grated moving average” (ARIMA) time-series model. An ARIMA
model predicts a value in a time series as a linear combination of
its own past values and errors. A separate ARIMA model was
developed for each input variable in the NN forecasting model.
Analyses of all input time series showed that the ARIMA model
was both adequate (errors were small) and stationary (errors
showed no time trend).
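In standard Box-Jenkins notation (a textbook form, not reproduced from the paper), an ARIMA(p, d, q) model for a series x_t can be written as:

```latex
\nabla^d x_t = c + \sum_{i=1}^{p} \phi_i \,\nabla^d x_{t-i}
             + \sum_{j=1}^{q} \theta_j \,\varepsilon_{t-j}
             + \varepsilon_t
```

where ∇ᵈ denotes d-fold differencing, φᵢ are the autoregressive coefficients, θⱼ the moving-average coefficients, and ε_t the model errors; an adequate, stationary fit corresponds to small errors ε_t with no trend in time.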
When we used the ARIMA model to directly forecast gas pro-
duction with only time-dependent data, we were unable to achieve
time-independent errors throughout the production history (from
1918 to 1998). However, because we determined previously that
both the depletion and reserves discovery rates were stationary
time series, we used these two ARIMA models to forecast gas
production by multiplying the depletion rate and the gas reserves.
The product of these two time series determines the stochastic gas
forecast in Fig. 14.
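As a concrete illustration of this two-series approach, the sketch below (hypothetical code, not from the paper; the data values are synthetic) fits the simplest autoregressive case, an AR(1) model, to each series by least squares, forecasts each recursively, and multiplies the two forecasts to obtain a production forecast:

```python
def fit_ar1(series):
    """Least-squares fit of x_t = c + phi * x_{t-1}."""
    x_prev, x_next = series[:-1], series[1:]
    n = len(x_prev)
    mean_prev = sum(x_prev) / n
    mean_next = sum(x_next) / n
    cov = sum((a - mean_prev) * (b - mean_next) for a, b in zip(x_prev, x_next))
    var = sum((a - mean_prev) ** 2 for a in x_prev)
    phi = cov / var
    c = mean_next - phi * mean_prev
    return c, phi

def forecast_ar1(c, phi, last_value, steps):
    """Recursive multi-step forecast from the last observed value."""
    out, x = [], last_value
    for _ in range(steps):
        x = c + phi * x
        out.append(x)
    return out

# Synthetic illustration (these are NOT the paper's data):
depletion = [0.070, 0.071, 0.072, 0.074, 0.075]   # fraction of reserves/yr
reserves = [170.0, 168.0, 167.0, 166.0, 164.0]    # Tcf

c_d, phi_d = fit_ar1(depletion)
c_r, phi_r = fit_ar1(reserves)
dep_fc = forecast_ar1(c_d, phi_d, depletion[-1], 3)
res_fc = forecast_ar1(c_r, phi_r, reserves[-1], 3)

# Production forecast = depletion rate x reserves, as in the paper's approach.
prod_fc = [d * r for d, r in zip(dep_fc, res_fc)]
```

A full ARIMA fit would also difference the series and include moving-average error terms; the AR(1) case is shown only to make the "forecast each series, then multiply" structure explicit.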
The EIA forecast of U.S. gas supply, at approximately 20 Tcf/yr for 2000, is higher than the NN forecast of approximately 19.5 Tcf/yr. However, the EIA forecast matches the NN forecast from 2001 to 2003, after which it increases considerably, with average annual increases of 2.4% from 2004 to 2014 and 1.3% thereafter.
The stochastic-derived model gives a production forecast that is
much higher than the EIA and NN forecasts. The forecast of U.S.
gas supply from the stochastic-derived model shows an exponen-
tial trend with an average growth rate of 2.3%/yr.
The NN forecast is based on the following assumptions for the independent input forecasts:
• Gas prices are expected to increase by 1.5%/yr.
• The gas depletion rate is expected to increase by 1.45%/yr.
• Drilling of gas exploratory wells will increase by 3.5%/yr.
• Drilling of oil/gas exploratory wells will increase an average of 2.5%/yr.
• G_D (gross domestic product) will have an average increase of 2.1%/yr.
The NN forecast takes into account the effects of the physical and economic factors on U.S. gas production, which makes forecasts of natural gas supply more reliable. The NN model indicates that U.S. gas production will increase from 2002 to 2012 by 0.5%/yr on average. Thereafter, gas production will increase at a higher rate, averaging 1.3%/yr through 2020.
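As a rough consistency check on these numbers (a back-of-the-envelope sketch; the ~19.5 Tcf/yr starting level is the NN forecast for 2000 quoted earlier), compounding the stated average growth rates approximately reproduces the 2020 level:

```python
# Compound the stated average growth rates from roughly the 2001 level
# (~19.5 Tcf/yr, the NN forecast around 2000-2001) out to 2020.
production = 19.5                 # Tcf/yr, approximate 2001 level
for _ in range(2012 - 2001):      # ~0.5%/yr average from 2002 through 2012
    production *= 1.005
for _ in range(2020 - 2012):      # ~1.3%/yr average from 2013 through 2020
    production *= 1.013
print(round(production, 1))       # ~22.8 Tcf/yr, close to the quoted ~23 Tcf/yr
```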
Conclusions
This paper presents a new approach to forecast the future produc-
tion of U.S. natural gas with an NN. The three-layer network was
trained and tested successfully, and comparison with actual pro-
duction data showed excellent agreement. Forecasts of the network
input parameters were developed with a stochastic-modeling
approach to time-series analysis. The network model included
various physical and economic input parameters, rendering it a
useful short-term as well as long-term forecasting tool for future
gas production.
The NN model's forecasting results showed that U.S. gas production would decline from its 1998 level at a rate of 1.8%/yr in 1999 and remain at the 1999 level through 2001. After 2001, gas production increases steadily until 2012, at an average growth rate of approximately 0.5%/yr. This growth more than doubles for 2013 to 2020, averaging 1.3%/yr. By 2020, gas production is predicted at 23 Tcf/yr, slightly higher than the 1973 production level.
The NN model is useful as both a short-term and a long-term predictive tool for future gas production. We recommend using the model developed in this study in further analyses to quantitatively evaluate the effects of the various physical and economic factors on future gas production.
Nomenclature
G_D = gross domestic product, U.S. dollars
G_DP = growth rate of gross domestic product
i = observation number
t = time, yr
X = input/output vector
X′ = normalized input/output vector
μ = mean or arithmetic average
σ = standard deviation
References
1. Haykin, S.: Neural Networks: A Comprehensive Foundation, Macmillan College Publishing Co., New York City (1994).
2. Azoff, E.M.: Neural Network Time Series Forecasting of Financial
Markets, John Wiley & Sons Ltd. Inc., Chichester, England (1994).
3. Neural Networks in Finance and Investing: Using Artificial Intelligence to Improve Real-World Performance, revised edition, R.R. Trippi and E. Turban (eds.), Irwin Professional Publishing, Chicago, Illinois (1996).
4. Mohaghegh, S.: “Virtual-Intelligence Applications in Petroleum Engineering: Part I—Artificial Neural Networks,” JPT (September 2000) 64.
5. Al-Kaabi, A.U. and Lee, W.J.: “Using Artificial Neural Nets To Identify the Well-Test Interpretation Model,” SPEFE (September 1993) 233.
Fig. 13—NN forecasting model of U.S. gas production.
Fig. 14—Comparison of U.S. gas-production forecasts.
6. Habiballah, W.A., Startzman, R.A., and Barrufet, M.A.: “Use of Neural
Networks for Prediction of Vapor/Liquid Equilibrium K Values for
Light-Hydrocarbon Mixtures,” SPERE (May 1996) 121.
7. EIA, Internet Home Page: http://www.eia.doe.gov/.
8. Twentieth Century Petroleum Statistics, 52nd ed., DeGolyer and MacNaughton, Dallas (1996).
9. Twentieth Century Petroleum Statistics, 54th ed., DeGolyer and MacNaughton, Dallas (1998).
10. Attanasi, E.D. and Root, D.H.: “The Enigma of Oil and Gas Field
Growth,” AAPG Bull. (March 1994) 78, 321.
11. Energy Statistics Sourcebook, 13th edition, OGJ Energy Database,
PennWell Publishing Co., Tulsa (1998).
12. “World Energy Projection System,” DOE/EIA-M050, Office of Integrated Analysis and Forecasting, U.S. Dept. of Energy, EIA, Washington, DC (September 1997).
13. Kutner, M.H. et al.: Applied Linear Statistical Models, fourth edition,
Irwin, Chicago (1996).
14. ThinksPro: Neural Networks Software for Windows User’s Guide,
Logical Designs Consulting Inc., La Jolla, California (1995).
15. “Annual Energy Outlook 2000,” DOE/EIA-0383, Office of Integrated
Analysis and Forecasting, U.S. Dept. of Energy, EIA, Washington, DC
(1999).
16. Box, G.E., Jenkins, G.M., and Reinsel, G.C.: Time Series Analysis: Forecasting and Control, third edition, Prentice-Hall Inc., Englewood Cliffs, New Jersey (1994).
17. Al-Fattah, S.M.: “New Approaches for Analyzing and Predicting Global Natural Gas Production,” PhD dissertation, Texas A&M U., College Station, Texas (2000).
SI Metric Conversion Factors
ft × 3.048* E–01 = m
ft³ × 2.831 685 E–02 = m³
*Conversion factor is exact.
Saud Al-Fattah is a reservoir management engineer in the Reservoir Management Dept. of Saudi Aramco, Dhahran. His specialties include reservoir engineering, operations research, economic evaluation, forecasting, and strategic planning. Al-Fattah holds MS and BS degrees from King Fahd U. of Petroleum and Minerals and a PhD degree from Texas A&M U., all in petroleum engineering. Richard A. (Dick) Startzman is currently a professor of petroleum engineering at Texas A&M U. He was employed by Chevron Corporation for 20 years in research, operations, and management in the U.S., Europe, and the Middle East. He joined the petroleum engineering faculty at Texas A&M in 1982. His research interests include reservoir engineering, economic evaluation, artificial intelligence, and optimization. He was named to the Peterson Professorship in 1993. He has been active in the Society of Petroleum Engineers and was elected a Distinguished Member in 1994.
... ANNs have seen a great increase of interest during the past few years. They are powerful and useful tools for solving practical problems in the petroleum industry (Mohaghegh 2005;Al-Fattah and Startzman 2003). Advantages of neural network techniques (Bishop 1995;Fausett 1994;Haykin 1994;Patterson 1996) over conventional techniques include the ability to address highly nonlinear relationships, independence from assumptions about the distribution of input or output variables, and the ability to address either continuous or categorical data as either inputs or outputs. ...
... Oilwet was represented as {1, 0, 0}, mixed-wet as {0, 1, 0}, and water-wet as {0, 0, 1}. In this study, we applied two normalization algorithms-mean/standard deviation, and minimax-to ensure that the network's input and output will be in a sensible range (Al-Fattah and Startzman 2003). The simplest normalization function is the minimax, which finds the minimum and maximum values of a variable in the data and performs a linear transformation using a shift and a scale factor to convert the values into the target range, which is typically [0.0, 1.0]. ...
... An important measure of the network performance is the plot of the root-mean-square error vs. the number of iterations or epochs. A well-trained network is characterized by decreasing errors for both the training and verification data sets as the number of iterations increases (Al-Fattah and Startzman 2003). Statistical analysis used in this study to examine the performance of a network are the output-data SD, output error mean, output error SD, output absolute error mean, SD ratio, and Pearson-R correlation coefficient (Hill and Lewicki 2006). ...
Article
Full-text available
Determination of relative permeability data is required for almost all calculations of fluid flow in petroleum reservoirs. Water-oil relative permeability data play important roles in characterizing the simultaneous two-phase flow in porous rocks and predicting the performance of immiscible displacement processes in oil reservoirs. They are used, among other applications, for determining fluid distributions and residual saturations, predicting future reservoir performance, and estimating ultimate recovery. Undoubtedly, these data are considered probably the most valuable information required in reservoir simulation studies. Estimates of relative permeability are generally obtained from laboratory experiments with reservoir core samples. Because the laboratory measurement of relative permeability is rather delicate, expensive and time consuming, empirical correlations are usually used to predict relative permeability data, or to estimate them in the absence of experimental data. However, developing empirical correlations for obtaining accurate estimates of relative permeability data showed limited success and proved difficult especially for carbonate reservoir rocks. Artificial neural network (ANN) technology has proved successful and useful in solving complex structured and nonlinear problems. This paper presents a new modeling technology to predict accurately water-oil relative permeability using ANN. The ANN models of relative permeability were developed using experimental data from waterflood core tests samples collected from carbonate reservoirs of giant Saudi Arabian oil fields. Three groups of data sets were used for training, verification, and testing the ANN models. Analysis of results of the testing data set show excellent agreement with the experimental data of relative permeability. In addition, error analyses show that the ANN models developed in this study outperform all published correlations. 
The benefits of this work include meeting the increased demand for conducting special core analysis, optimizing the number of laboratory measurements, integrating into reservoir simulation and reservoir management studies, and providing significant cost savingson extensive lab work and substantial required time. Introduction Artificial neural networks have seen an explosion of interest over the past few years. They are powerful and useful tools for solving practical problems in the petroleum industry (Mohaghegh 2005; Al-Fattah and Startzman 2003). Advantages of neural network techniques (Bishop 1995; Fausett 1994; Haykin 1994; Patterson 1996) over conventional techniques include the ability to address highly nonlinear relationships, independence from assumptions about the distribution of input or output variables, and the ability to address either continuous or categorical data as either inputs or outputs. In addition, neural networks are intuitively appealing as they are based on crude low-level models of biological systems. Neural networks, as in biological systems, simply learn by examples. The neural network user provides representative data and trains the neural networks to learn the structure of the data.
... A hybrid threelayer network that can be used to identify beam pump malfunctions from downhole pump cards is presented (Nazi et al., 1994). Neural network based methods are used to estimate rock permeability in Mohaghegh et al. (1995), Malki et al. (1996), Guler et al. (2003) and predict U.S. natural gas production (Al-Fattah and Startzman, 2003). Artificial network technology is also used to conduct downhole fluid analysis (Hegeman et al., 2007) and predict liquid holdup in twophase flow (Shippen and Scott, 2002). ...
... In recent years, the intelligent methods have been applied [6], such as the artificial neural networks (ANN) [7][8][9][10], support vector machines (SVM) [11], adaptive neural fuzzy interface (ANFIS) [12]. Some of these methods were used to build the time series prediction models [13][14][15], and others were used to build the multivariate regression models [16][17][18]. Although the intelligent methods are better at handling the nonlinearities situations, they do not perform well at times while attempting to solve highly nonlinearity time series data [19]. ...
Article
Full-text available
Prediction of petroleum production plays a key role in the petroleum engineering, but an accurate prediction is difficult to achieve due to the complex underground conditions. In this paper, we employ the kernel method to extend the Arps decline model into a nonlinear multivariate prediction model, which is called the nonlinear extension of Arps decline model (NEA). The basic structure of the NEA is developed from the Arps exponential decline equation, and the kernel method is employed to build a nonlinear combination of the input series. Thus, the NEA is efficient to deal with the nonlinear relationship between the input series and the petroleum production with a one-step linear recursion, which combines the merits of commonly used decline curve methods and intelligent methods. The case studies are carried out with the production data from two real-world oil field in China and India to assess the efficiency of the NEA model, and the results show that the NEA is eligible to describe the nonlinear relationship between the influence factors and the oil production, and it is applicable to make accurate forecasts for the oil production in the real applications.
Article
The tracer breakthrough curve tends to be unimodal in heavy oil reservoir due to high oil-water viscosity ratio, which makes it difficult to classify thief zone (TZ) in these reservoirs using tracer breakthrough curve. To solve the problem, the paper applies the convolutional neural network (CNN) to achieve fast and accurate classification of TZ in heavy oil reservoir. The tracer flow analytical model is established by equating TZ with the flow tubes that satisfy the Hagen-Poiseuille equation. Then, 3000 tracer breakthrough curves are generated by the model as sample. Additionally, one-hot encoding method is applied to deal with these sample curves. Through the orthogonal design, the optimal combination of hyperparameters are determined to establish an OD-CNN model. According to the results, the number of convolutional layers is the most significant influencing factor in the accuracy of OD-CNN. Besides, the optimal hyperparameter combination for OD-CNN is detailed as follows. The number of convolutional layers is 4, the dropout rate is 0.6, the initialization method is Xavier normal, the optimizer is Adam, and the activation function is ReLU. Compared with random forest (RF) and K-means, the accuracy of OD-CNN on the training set is 94.67%, which is higher than 82.30% of RF and 75.63% of k-means. Moreover, OD-CNN can correctly classify 89 of the 100 curves from the oilfield indicating the reliability of OD-CNN. Thus, applying orthogonal design to OD-CNN can avoid the blindness of hyperparameter combination optimization and significantly reduce computing time.
Article
This paper develops a rigorous and advanced data-driven model to describe, analyze, and forecast the global crude oil demand. The study deploys a hybrid approach of artificial intelligence techniques, namely, the genetic-algorithm, neural-network, and data-mining approach for time-series models (GANNATS). The GANNATS was developed and applied to two country cases, including one for a high oil producer (Saudi Arabia) and one for a high oil consumer (China), to develop crude oil demand forecasts. The input variables of the neural network models include gross domestic product (GDP), the country’s population, oil prices, gas prices, and transport data, in addition to transformed variables and functional links. The artificial intelligence predictive models of oil demand were successfully developed, trained, validated, and tested using historical oil-market data, yielding excellent oil demand predictions. The performance of the intelligent models for Saudi Arabia and China was examined using rigorous indicators of generalizability, predictability, and accuracy. The GANNATS forecasting models show that the crude oil demand for both Saudi Arabia and China will continue to increase over the forecast period but with a mildly declining growth, particularly for Saudi Arabia. This decreasing growth in the demand for oil can be attributed to increased energy efficiency, fuel switching, conversion of power plants from crude oil to gas-based plants, and increased utilization of renewable energy, such as solar and wind for electricity generation and water desalination. In this study, the feature engineering of variables selection techniques has been applied to identify and understand significant factors that impact and drive the crude oil demand. The proposed GANNATS methodology optimizes and upgrades the conventional process of developing oil demand forecasts. It also improves and enhances the predictability and accuracy of the current oil demand forecasting models.
Article
Full-text available
Predictive analysis of the reservoir surveillance data is crucial for the high-efficiency management of oil and gas reservoirs. Here we introduce a new approach to reservoir surveillance that uses the machine learning tree boosting method to forecast production data. In this method, the prediction target is the decline rate of oil production at a given time for one well in the low-permeability carbonate reservoir. The input data to train the model includes reservoir production data (e.g., oil rate, water cut, gas oil ratio (GOR)) and reservoir operation data (e.g., history of choke size and shut-down activity) of 91 producers in this reservoir for the last 20 years. The tree boosting algorithm aims to quantitatively uncover the complicated hidden patterns between the target prediction parameter and other monitored data of a high variety, through state-of-the-art automatic classification and multiple linear regression algorithms. We also introduce a segmentation technique that divides the multivariate time-series production and operation data into a sequence of discrete segments. This feature extraction technique can transfer key features, based on expert knowledge derived from the in-reservoir surveillance, into a data form that is suitable for the machine learning algorithm. Compared with traditional methods, the approach proposed in this article can handle surveillance data in a multivariate time-series form with different strengths of internal correlation. It also provides capabilities for data obtained in multiple wells, measured from multiple sources, as well as of multiple attributes. Our application results indicate that this approach is quite promising in capturing the complicated patterns between the target variable and several other explanatory variables, and thus in predicting the daily oil production rate.
Article
This paper develops a novel AI and data-driven predictive model to analyze and forecast energy markets, and tests it for gasoline demand of Saudi Arabia. The AI model is based on a genetic algorithm (GA), artificial neural network (ANN), and data mining (DM) approach for time-series (TS) analysis, referred to as GANNATS. The GANNATS predictive model was successfully designed, trained, validated, and tested using real historical market data. Results show that the model yields accurate predictions with robust key performance indicators. A double cross-validation of the model verified that Saudi Arabia's gasoline demand declined by 2.5% in 2017 from its 2016 level. The model forecasts that Saudi gasoline demand will maintain a mild growth over the short-term outlook. Variables impact and screening analysis was performed to identify the influencing factors driving the gasoline demand. The recent decline in Saudi gasoline demand is primarily attributed to the improvements in vehicle efficiency, lifting of fuel price subsidies, declining population growth, and changes in consumer behavior. This paper enriches existing knowledge of best practices for forecasting domestic and global gasoline demand. In addition, the methodology presented improves on traditional econometric models and enhances the predictability and accuracy of forecasts of gasoline demand.
Article
Full-text available
Oil market volatility affects macroeconomic conditions and can unduly affect the economies of oil-producing countries. Large price swings can be detrimental to producers and consumers, causing infrastructure and capacity investments to be delayed, employment losses, inefficient investments, and/or the growth potential for energy-producing countries to be adversely affected. Undoubtedly, greater stability of oil prices increases the certainty of oil markets for the benefit of oil consumers and producers. Therefore, modeling and forecasting crude-oil price volatility is a strategic endeavor for many oil market and investment applications. This paper focused on the development of a new predictive model for describing and forecasting the behavior and dynamics of global oil-price volatility. Using a hybrid approach of artificial intelligence with a genetic algorithm (GA), artificial neural network (ANN), and data mining (DM) time-series (TS) (GANNATS) model was developed to forecast the futures price volatility of West Texas Intermediate (WTI) crude. The WTI price volatility model was successfully designed, trained, verified, and tested using historical oil market data. The predictions from the GANNATS model closely matched the historical data of WTI futures price volatility. The model not only described the behavior and captured the dynamics of oil-price volatility, but also demonstrated the capability for predicting the direction of movements of oil market volatility with an accuracy of 88%. The model is applicable as a predictive tool for oil-price volatility and its direction of movements, benefiting oil producers, consumers, investors, and traders. It assists these key market players in making sound decisions and taking corrective courses of action for oil market stability, development strategies, and future investments; this could lead to increased profits and to reduced costs and market losses. 
In addition, this improved method for modeling oil-price volatility enables experts and market analysts to empirically test new approaches for mitigating market volatility. It also provides a roadmap for improving the predictability and accuracy of energy and crude models.
Article
Quantitative appraisal of different operating areas and assessment of uncertainty due to reservoir heterogeneities are crucial elements in optimization of production and development strategies in oil sands operations. Although detailed compositional simulators are available for recovery performance evaluation for steam-assisted gravity drainage (SAGD), the simulation process is usually deterministic and computationally demanding, and it is not quite practical for real-time decision-making and forecasting. Data mining and machine learning algorithms provide efficient modeling alternatives, particularly when the underlying physical relationships between system variables are highly complex, non-linear, and possibly uncertain.
Article
Full-text available
Distinguished Author Series articles are general, descriptive representations that summarize the state of the art in an area of technology by describing recent developments for readers who are not specialists in the topics discussed. Written by individuals recognized to be experts in the area, these articles provide key references to more definitive work and present specific details only to illustrate the technology. Purpose: to inform the general readership of recent advances in various areas of petroleum engineering. Summary This is the first article of a three-article series on virtual intelligence and its applications in petroleum and natural gas engineering. In addition to discussing artificial neural networks, the series covers evolutionary programming and fuzzy logic. Intelligent hybrid systems that incorporate an integration of two or more of these paradigms and their application in the oil and gas industry are also discussed in these articles. The intended audience is the petroleum professional who is not quite familiar with virtual intelligence but would like to know more about the technology and its potential. Those with a prior understanding of and experience with the technology should also find the articles useful and informative. Background and Definitions This section covers some historical background of the technology, provides definitions of virtual intelligence and artificial neural networks, and offers more general information on the nature and mechanism of the artificial neural network and its relation to biological neural networks. Virtual intelligence has been referred to by different names. Among these are artificial intelligence, computational intelligence, and soft computing. There seems to be no uniformly acceptable name for this collection of analyticools among the researchers and practitioners of the technology. 
Of these, artificial intelligence is used the least as an umbrella term because artificial intelligence has historically referred to rule-based expert systems and today is used synonymously with expert systems. Expert systems made many promises of delivering intelligent computers and programs but did not fulfill these promises. Many believe that soft computing is the most appropriate term to use and that virtual intelligence is a subset of soft computing. While this argument has merit, we use the term virtual intelligence throughout these articles.
Article
SPE Members Abstract The objective of this paper is to present the application of a new approach to identify a preliminary well test interpretation model from derivative plot data. Our approach is based on artificial neural networks technology. In this approach, a neural nets simulator which employs back propagation as the learning algorithm is trained on representative examples of derivative plots for a wide range of well test interpretation models. The trained nets are then used to identify the well test interpretation model from new well tests. In this paper we show that using artificial neural networks technology is a significant improvement over pattern recognition techniques currently used (e.g., syntactic pattern recognition) in well test interpretation. Artificial neural networks have the ability to generalize their understanding of the pattern recognition space they are taught to identify. This implies that they can identify patterns from incomplete and distorted data. This ability is very patterns from incomplete and distorted data. This ability is very useful when dealing with well tests which often have incomplete and noisy data. Moreover, artificial neural networks eliminate the need for elaborate data preparation (e.g., smoothing, segmenting, and symbolic transformation) and they do not require writing complex rules to identify a pattern. Artificial neural networks eliminate the need for using rules by automatically building an internal understanding of the pattern recognition space in the form of weights that describe the strength of the connections between the net processing units. The paper illustrates the application of this new approach with a field example. The mathematical derivation and implementation of this approach can be found in Ref. 1. Introduction In a pressure transient test a signal of pressure vs. time is recorded. 
When this signal is plotted using specialized plotting functions, it produces diagnostic plots such as derivative or Horner plots which we use often in the interpretation process. The signal on these plots is deformed and shaped by some underlying mechanisms in the formation and the wellbore. These mechanisms are known as the well test interpretation model. The objective of this work is to identify these mechanisms from the signatures present on the derivative plot. The problem of identifying the well test interpretation model has been described in the literature as the inverse problem. The traditional way of solving an inverse problem is to use inverse theory techniques (e.g., regression analysis). A major disadvantage of such techniques is that we have to assume an interpretation model. The inverse theory provides estimates of the model parameters but not the model itself. Realizing that more than one interpretation model can produce the same signal. This approach can lead to misleading results. What we seek in this study is the model itself rather than its parameters. Finding the model parameter after identifying the model is a simple problem. In this study we trained a neural nets simulator to identify the well test interpretation model from the derivative plot. The neural nets simulator can be part of a well test expert system or a computer enhanced well test interpretation. The mathematical derivation and implementation of this approach are detailed in Ref 1. LITERATURE REVIEW In 1988, Allain and Horne used syntactic pattern recognition and a rule-based approach to identify the well test interpretation model automatically from the derivative plot. Their approach is based on transforming the derivative plot into a symbolic form. The symbols generated (e.g., UP, DOWN, etc.) are used by a rule. 
based system to construct the shapes (e.g., maxima, minima, stabilizations) present on the derivative plot and, consequently present on the derivative plot and, consequently identify the well test interpretation model. The transformation process from digital data to symbols is carried out by process from digital data to symbols is carried out by approximating the derivative curve by a sequence of straight lines. The linear approximation is assumed successful when the fit error of each straight line is within an allowable tolerance. The attributes(e.g, the slope) of each straight line are used to describe the orientation (i.e., UP, DOWN, FLAT) of the curve segments based on preselected angle thresholds. Symbolic merging (i.e., grouping similar consecutive symbols as one symbol) is executed to reduce the symbols to the least possible number. This step is necessary to arrive at a finite number of rules which identify the well test interpretation model from the derivative symbolic form. P. 213
Article
SPE Members Abstract The objective of this paper is to present a new approach to identify a preliminary well test interpretation model from derivative plot data. Our approach is based on artificial neuralnetworks technology. In this approach, a neural nets simulator which employs back propagation as the learning algorithm is trained on representative examples of derivative plots for a wide range of well test interpretation models. The trained nets are then used to identify the well test interpretation model from new well tests. In this paper we show that using artificial neural networks technology is a significant improvement over pattern recognition techniques currently used (e.g., syntactic pattern recognition) in well test interpretation. Artificial neural networks have the ability to generalize their understanding of the pattern recognition space they are taught. to identify. This implies that they can identify patterns from incomplete and distorted data. This ability is very useful when dealing with well tests which often have incomplete and noisy data. Moreover, artificial neural networks eliminate the need for elaborate data preparation (e.g., smoothing, segmenting, and symbolic transformation) and they do not require writing complex rules to identify a pattern. Artificial neural networks eliminate the need for using rules by automatically building an internal understanding of the pattern recognition space in the form of weights that describe the strength of the connections between the net processing units. The paper illustrates the application of this new approach with two field examples. Introduction In a pressure transient test a signal of pressure vs. time is recorded. when this signal is plotted using specialized plotting functions, it produces diagnostic plots such as derivative or Horner plots which we use often in the interpretation process. 
The signal on these plots is deformed and shaped by underlying mechanisms in the formation and the wellbore. These mechanisms are known as the well test interpretation model. The objective of this work is to identify these mechanisms from the signatures present on the derivative plot. The problem of identifying the well test interpretation model has been described in the literature as the inverse problem. The traditional way of solving an inverse problem is to use inverse theory techniques (e.g., regression analysis). A major disadvantage of such techniques is that an interpretation model must be assumed. Inverse theory provides estimates of the model parameters but not the model itself. Because more than one interpretation model can produce the same signal, this approach can lead to misleading results. What we seek in this study is the model itself rather than its parameters; finding the model parameters after identifying the model is a simple problem. In this study we trained a neural net simulator to identify the well test interpretation model from the derivative plot. The neural net simulator can be part of a well test expert system or a computer-enhanced well test interpretation tool. LITERATURE REVIEW In 1988, Allain and Horne used syntactic pattern recognition and a rule-based approach to identify the well test interpretation model automatically from the derivative plot. Their approach is based on transforming the derivative plot into a symbolic form. The symbols generated (e.g., UP, DOWN, etc.) are used by a rule-based system to construct the shapes (e.g., maxima, minima, stabilizations) present on the derivative plot and, consequently, identify the well test interpretation model. The transformation process from digital data to symbols is carried out by approximating the derivative curve by a sequence of straight lines. The linear approximation is considered successful when the fit error of each straight line is within an allowable tolerance.
The attributes (e.g., the slope) of each straight line are used to describe the orientation (i.e., UP, DOWN, FLAT) of the curve segments based on preselected angle thresholds. Symbolic merging (i.e., grouping similar consecutive symbols as one symbol) is executed to reduce the symbols to the smallest possible number.
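The curve-to-symbol pipeline described above (piecewise-linear fitting within a tolerance, slope-angle labeling, and symbolic merging) can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the function names, tolerance, and angle threshold are assumptions.

```python
import numpy as np

def classify(slope, flat_deg=10.0):
    """Label a line segment UP, DOWN, or FLAT from its slope angle
    (flat_deg is an assumed angle threshold in degrees)."""
    angle = np.degrees(np.arctan(slope))
    if abs(angle) < flat_deg:
        return "FLAT"
    return "UP" if angle > 0 else "DOWN"

def symbolize(log_t, log_dp, tol=0.05):
    """Approximate a derivative curve by straight-line segments whose
    fit error stays within tol, label each segment, and merge runs of
    identical symbols into one (symbolic merging)."""
    symbols, start = [], 0
    for end in range(2, len(log_t) + 1):
        x, y = log_t[start:end], log_dp[start:end]
        coeffs = np.polyfit(x, y, 1)
        err = np.max(np.abs(np.polyval(coeffs, x) - y))
        if err > tol and end - start > 2:
            # Close the segment just before the point that broke the fit.
            prev = np.polyfit(log_t[start:end - 1], log_dp[start:end - 1], 1)
            symbols.append(classify(prev[0]))
            start = end - 2
    symbols.append(classify(np.polyfit(log_t[start:], log_dp[start:], 1)[0]))
    # Symbolic merging: collapse consecutive duplicates.
    merged = [symbols[0]]
    for s in symbols[1:]:
        if s != merged[-1]:
            merged.append(s)
    return merged
```

On a derivative curve that rises and then stabilizes, this reduces the data to the two-symbol sequence ["UP", "FLAT"], which a rule base can then match against candidate interpretation models.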
Article
Growth in estimates of recovery in discovered fields is an important source of annual additions to United States proved reserves. This paper examines historical field growth and presents estimates of future additions to proved reserves from fields discovered before 1992. Field-level data permitted the sample to be partitioned, on the basis of recent field growth patterns, into outlier and common field sets, which were analyzed separately. The outlier field set accounted for less than 15% of resources, yet grew proportionately six times as much as the common fields. Because the outlier field set contained large old heavy-oil fields and old low-permeability gas fields, its future growth is expected to be particularly sensitive to prices. A lower bound of a range of estimates of future growth was calculated by applying monotone growth functions computed from the common field set to all fields. Higher growth estimates were obtained by extrapolating growth of the common field set and assuming the outlier fields would maintain the same share of total growth that occurred from 1978 through 1991. By 2020, the two estimates for additions to reserves from pre-1992 fields are 23 and 32 billion bbl of oil in oil fields and 142 and 195 tcf of gas in gas fields. 20 refs., 8 figs., 3 tabs.
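A "monotone growth function" of the kind applied to the common field set maps field age to a cumulative growth factor (estimated ultimate recovery relative to the initial estimate). The sketch below is purely illustrative: the saturating functional form and its parameters are our assumptions, not the paper's fitted functions.

```python
import math

def growth_factor(age_years):
    """Illustrative monotone, saturating growth factor: ratio of the
    current recovery estimate to the initial estimate at a given field
    age (parameters are assumed for illustration)."""
    return 1.0 + 0.9 * (1.0 - math.exp(-age_years / 20.0))

def additions(initial_estimate, discovery_year, from_year, to_year):
    """Reserve additions expected between two appraisal years, obtained
    by differencing the growth factor at the two field ages."""
    g0 = growth_factor(from_year - discovery_year)
    g1 = growth_factor(to_year - discovery_year)
    return initial_estimate * (g1 - g0)
```

Because the curve is monotone and saturating, a young field contributes more growth over a fixed horizon than an old one, which is why partitioning by growth pattern (and treating the still-growing outliers separately) matters for the forecast.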
Article
Equilibrium ratios play a fundamental role in understanding the phase behavior of hydrocarbon mixtures. They are important in predicting compositional changes under varying temperature and pressure conditions in reservoirs, surface separators, and production and transportation facilities. In particular, they are critical for reliable and successful compositional reservoir simulation. This paper presents a new approach for predicting K-values using Neural Networks (NN). The method is applied to binary and multicomponent mixtures; K-value prediction accuracy is on the order of that of traditional methods, but computing speed is significantly faster. Introduction Equilibrium ratios, more commonly known as K-values, relate the vapor mole fraction (yi) to the liquid mole fraction (xi) of a component (i) in a mixture: Ki = yi/xi. (1) In a fluid mixture consisting of different chemical components, K-values depend on mixture pressure, temperature, and composition. There are a number of methods for predicting K-values; basically, these methods compute K-values either explicitly or iteratively. The explicit methods correlate K-values with component parameters (e.g., critical properties) and mixture parameters (e.g., convergence pressure). Iterative methods are based on an equation of state (EOS) and are usually tuned with binary interaction parameters. A literature search, and experience with the phase behavior of hydrocarbon systems, have shown that current explicit methods are not accurate because they neglect compositional effects. The EOS approach requires an extensive amount of computational time, may have convergence problems, and must be supplied with good binary interaction parameters. In compositional reservoir simulation, where millions of K-values are required, the method becomes time consuming and adds to the complexity of simulation studies, making some of them impractical.
Neural Networks (NN) are an emerging technology that seems to offer two advantages: fast computation and accuracy. The objective of this paper is to show the potential of using NN for predicting K-values. Different NNs were trained using the Scaled Conjugate Gradient (SCG) algorithm and were used to predict the K-values for binary and multicomponent mixtures.
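As a concrete instance of the "explicit" class of methods the abstract contrasts with NN and EOS approaches, Wilson's correlation estimates a K-value from a component's critical properties and acentric factor alone; its lack of composition dependence is exactly the limitation noted above. The methane property values below are approximate field-unit figures used for illustration, not data from the paper.

```python
import math

def wilson_k(p, t, p_c, t_c, omega):
    """Wilson's correlation: an explicit K-value estimate from critical
    pressure p_c, critical temperature t_c, and acentric factor omega.
    Pressures in psia, absolute temperatures in degrees Rankine."""
    return (p_c / p) * math.exp(5.373 * (1.0 + omega) * (1.0 - t_c / t))

# Methane at 500 psia and 150 F (about 610 R); approximate properties.
k_methane = wilson_k(p=500.0, t=610.0, p_c=666.4, t_c=343.0, omega=0.0115)
```

A light component such as methane yields K well above 1 (it partitions preferentially into the vapor), which is why such explicit estimates are commonly used to initialize iterative EOS flash calculations.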
Article
In a pressure-transient test, a signal of pressure vs. time is recorded. When this signal is plotted with specialized plotting functions, diagnostic plots, such as derivative or Horner plots, are produced that often are used in the interpretation process. The signal on these plots is deformed and shaped by underlying mechanisms in the formation and wellbore. These mechanisms are the well-test interpretation model. The objective of this work is to identify these mechanisms from the signatures on the derivative plot. Identifying the well-test interpretation model is described in the literature as an inverse problem. The traditional way of solving an inverse problem is with inverse theory techniques (e.g., regression analysis). A serious disadvantage of such techniques is that one has to assume an interpretation model. The inverse theory provides estimates of the model parameters but not of the model itself. Because more than one interpretation model can produce the same signal, this approach can lead to misleading results. The authors seek the model itself instead of its parameters in this study. In this study, they trained a neural net simulator to identify the well-test interpretation model from the derivative plot. The neural net simulator can be part of a well-test expert system or a computer-enhanced well-test interpretation.
Article
From the Publisher: A neural network is a computer program that can recognise patterns in data, learn from this and (in the case of time series data) make forecasts of future patterns. There are now over 20 commercially available neural network programs designed for use on financial markets and there have been some notable reports of their successful application. However, like any other computer program, neural networks are only as good as the data they are given and the questions that are asked of them. Proper use of a neural network involves spending time understanding and cleaning the data: removing errors, preprocessing and postprocessing. This book takes the reader beyond the 'black-box' approach to neural networks and provides the knowledge that is required for their proper design and use in financial markets forecasting - with an emphasis on futures trading. Comprehensively specified benchmarks are provided (including weight values), drawn from time series examples in chaos theory and financial futures. The book covers data preprocessing, random walk theory, trading systems and risk analysis. It also provides a literature review, a tutorial on backpropagation, and a chapter on further reading and software. For the professional financial forecaster this book is without parallel as a comprehensive, practical and up-to-date guide to this important subject.
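The preprocessing the blurb stresses typically means scaling the series and converting it into supervised (lagged inputs, next value) pairs before any network training. A minimal sketch of both steps, with function names of our own choosing rather than the book's:

```python
import numpy as np

def make_windows(series, n_lags):
    """Turn a 1-D time series into (lagged inputs, next-value target)
    pairs suitable for training a forecasting network."""
    X, y = [], []
    for i in range(len(series) - n_lags):
        X.append(series[i:i + n_lags])
        y.append(series[i + n_lags])
    return np.array(X), np.array(y)

def minmax_scale(x):
    """Scale values to [0, 1]; return the min and span as well so that
    network predictions can be mapped back to original units."""
    lo, hi = x.min(), x.max()
    return (x - lo) / (hi - lo), lo, hi - lo
```

Keeping the scaling parameters (lo and the span) is the postprocessing half of the job: forecasts made in the scaled space are inverted with `pred * span + lo` before they are compared with market prices.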