
Neural Network Approach Predicts U.S. Natural Gas Production

S.M. Al-Fattah, SPE, Saudi Aramco, and R.A. Startzman, SPE, Texas A&M U.

Summary

The industrial and residential market for natural gas produced in the United States has become increasingly significant. Within the past 10 years, the wellhead value of produced natural gas has rivaled and sometimes exceeded the value of crude oil. Forecasting natural gas supply is an economically important and challenging endeavor. This paper presents a new approach to predict natural gas production for the United States with an artificial neural network (NN).

We developed an NN model to forecast the U.S. natural gas supply to 2020. Our results indicate that the U.S. will maintain its 1999 production of natural gas until 2001, after which production increases. The network model indicates that natural gas production will increase at an average rate of 0.5%/yr from 2002 to 2012. This rate will more than double from 2013 to 2020.

The NN was developed with a large initial pool of input parameters. The input pool included exploratory, drilling, production, and econometric data. Preprocessing the input data involved normalization and functional transformation. Dimension-reduction techniques and sensitivity analysis of the input variables were used to remove redundant and unimportant input parameters and to simplify the NN. The remaining parameters included data from gas exploratory wells, oil/gas exploratory wells, oil exploratory wells, gas depletion rate, proved reserves, gas wellhead prices, and growth rate of the gross domestic product. The three-layer NN was successfully trained with yearly data from 1950 to 1989 using the quick-propagation learning algorithm. The NN's target output is the production rate of natural gas. The agreement between predicted and actual production rates was excellent. A test set not used to train the network, containing data from 1990 to 1998, was used to verify and validate the network's prediction performance. Analysis of the test results showed that the NN approach provides an excellent match with actual gas production data. An econometric approach, called stochastic modeling or time-series analysis, was used to develop forecasting models for the NN input parameters. A comparison of forecasts between this study and another is presented.

The NN model has use as a short-term as well as a long-term predictive tool for natural gas supply. The model can also be used to quantitatively examine the effects of various physical and economic factors on future gas production.

Introduction

In recent years, there has been growing interest in applying artificial NNs (Refs. 1 through 4) to various areas of science, engineering, and finance. Among other applications to petroleum engineering (Ref. 4), NNs have been used for pattern recognition in well-test interpretation (Ref. 5) and for prediction in well logs (Ref. 4) and phase behavior (Ref. 6).

Artificial NNs are an information-processing technology inspired by studies of the brain and nervous system; in other words, they are computational models of biological neural structures. Each NN generally consists of a number of interconnected processing elements (PEs), or neurons, grouped in layers. Fig. 1 shows the basic structure of a three-layer network: one input, one hidden, and one output layer. A neuron has multiple inputs and a single output. "Input" denotes the values of the independent variables, and "output" the dependent variable. Each input is modified by a weight, which multiplies the input value. The input can be raw data or the output of other PEs. With reference to a threshold value and an activation function, the neuron combines the weighted inputs and uses them to determine its output. The output can be either the final product or an input to another neuron.

This paper describes the methodology of developing an artificial NN model to predict U.S. natural gas production. It presents the results of the NN modeling approach and compares them with other modeling approaches.

Data Sources

The data used to develop the artificial NN model for U.S. gas production were collected mostly from the Energy Information Admin. (EIA) (Ref. 7). U.S. marketed-gas production for 1918 to 1997 was obtained from Twentieth Century Petroleum Statistics (Refs. 8 and 9), supplemented with the EIA's 1998 production data. Gas-discovery data from 1900 to 1998 were from Refs. 7 and 10. Proved gas reserves for 1949 to 1999 came from the Oil and Gas J. (OGJ) database (Ref. 11). The EIA provides various statistics on historical U.S. energy data, including gas production, exploration, drilling, and econometrics. These data are available to the public and can easily be downloaded from the internet. The following data (1949 to 1998) were downloaded from the EIA website (Ref. 7).
• Gas discovery rate.
• Population.
• Gas wellhead price.
• Oil wellhead price.
• Gross domestic product (D_G), with purchasing power parity (PPP) based on 1992 U.S. dollars.
• Gas exploratory wells.
➢ Footage and wells drilled.
• Oil exploratory wells.
➢ Footage and wells drilled.
➢ Percentage of successful wells drilled.
• Oil and gas exploratory wells.
➢ Footage and wells drilled.
• Proved gas reserves.

Other input parameters were derived from the previous data. The derived input parameters include:
• Gross domestic product growth rate. This input parameter was calculated with the following formula (Ref. 12):

G_{DP_{i+1}} = [ (D_{G_{i+1}} / D_{G_i})^{1/(t_{i+1} - t_i)} - 1 ] × 100, ..........(1)

where D_G = gross domestic product, G_DP = growth rate of gross domestic product, t = time, and i = observation number.
• Average depth drilled per well. This is calculated by dividing the footage drilled by the number of exploratory wells drilled each year. This is done for the gas exploratory wells, oil exploratory wells, and oil-and-gas exploratory wells, resulting in three additional input variables.
• Depletion rate. This measures how fast the reserves are being depleted each year at that year's production rate. It is calculated as the annual production divided by the proved reserves and is expressed as a percentage.
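The three derived parameters can be sketched in a few lines of Python. This is an illustrative reading of the definitions above, not the authors' actual preprocessing code; the series and function names are hypothetical.

```python
def gdp_growth_rate(gdp, years):
    """Annualized GDP growth rate (%) between consecutive observations (Eq. 1)."""
    rates = []
    for i in range(len(gdp) - 1):
        exponent = 1.0 / (years[i + 1] - years[i])
        rates.append(((gdp[i + 1] / gdp[i]) ** exponent - 1.0) * 100.0)
    return rates

def avg_depth_per_well(footage, wells):
    """Average depth drilled per exploratory well in each year."""
    return [f / w for f, w in zip(footage, wells)]

def depletion_rate(production, proved_reserves):
    """Annual production as a percentage of proved reserves."""
    return [100.0 * p / r for p, r in zip(production, proved_reserves)]
```

For consecutive yearly observations the exponent in Eq. 1 reduces to 1, and the formula becomes the ordinary year-over-year percentage change.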

Data Preprocessing

Data preparation is a critical step in the development of an artificial NN system. The preprocessing procedures used in constructing this study's NN model are input/output normalization and transformation.

Copyright © 2003 Society of Petroleum Engineers

This paper (SPE 82411) was revised for publication from paper SPE 67260, first presented at the 2001 SPE Production and Operations Symposium, Oklahoma City, Oklahoma, 25–28 March. Original manuscript received for review 22 April 2002. Revised manuscript received 21 November 2002. Paper peer approved 3 December 2002.

84 May 2003 SPE Production & Facilities

Normalization. Normalization is the process of standardizing the numerical range the input data can take. It enhances the fairness of training by preventing an input with large values from swamping another that is equally important but has smaller values. Normalization is also recommended because the network training parameters can be tuned for a given range of input data; thus, the training process can be carried over to similar tasks.

We used the mean/standard-deviation normalization method to normalize all the NN's input and output variables. Mean/standard-deviation preprocessing is the most commonly used method and generally works well in almost every case. Its advantages are that it processes the input variable without any loss of information and that its transform is mathematically reversible. Each input variable, as well as the output, was normalized with the following formula (Ref. 13):

X'_i = (X_i - \mu_i) / \sigma_i, ..........(2)

where X' = normalized input/output vector, X = original input/output vector, \mu = mean of the original input/output, \sigma = standard deviation of the input/output vector, and i = input/output vector index. Each input/output variable was normalized with its own mean and standard deviation with Eq. 2. This process was applied to all the data, including the training and testing sets. The single set of normalization parameters for each variable (i.e., the mean and the standard deviation) was then preserved to be applied to new data during forecasting.
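A minimal sketch of this reversible z-score normalization, with the per-variable parameters saved for reuse on new forecast data (plain Python; the function names are illustrative, not from any NN package):

```python
import math

def fit_normalizer(values):
    """Compute and save the mean/standard-deviation parameters of Eq. 2."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return {"mean": mean, "std": math.sqrt(var)}

def normalize(values, params):
    """Apply Eq. 2: zero-mean, unit-standard-deviation scaling."""
    return [(v - params["mean"]) / params["std"] for v in values]

def denormalize(values, params):
    """The transform is mathematically reversible: recover the original units."""
    return [v * params["std"] + params["mean"] for v in values]
```

In use, the parameters fitted on the training data are preserved and applied unchanged to the forecast-period inputs, exactly as the text describes.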

Transformation. Our experience is that an NN performs better with normally distributed, nonseasonal data. Input data exhibiting a trend or periodic variations make data transformation necessary. There are different ways to transform the input variables into forms that let the NN interpret the input data more easily and train faster. Examples of such transformations include the variable's first derivative, the relative variable difference, the natural logarithm of the relative variable, the square root of the variable, and trigonometric functions. In this study, all input and output variables were transformed with their first derivatives. This choice removed the trend in each input variable, thus helping to reduce the multicollinearity among the input variables.

Using the first derivative also results in greater fluctuation and contrast in the values of the input variables, which improves the ability of the NN model to detect significant changes in patterns. For instance, if gas exploratory footage (one of the input variables) is continuously increasing, the actual level may not be as important as the first time derivative of footage, the rate of change in footage from year to year.

The first-derivative transformation, however, resulted in the loss of one data point because of its mathematical formulation.
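For yearly data, the first derivative reduces to a first difference. A sketch (illustrative Python, not the paper's code) showing the transform and why exactly one point is lost:

```python
def first_difference(series):
    """Year-over-year change; the output has one fewer point than the input."""
    return [series[i + 1] - series[i] for i in range(len(series) - 1)]
```

Differencing a 41-point series (1949 to 1989) leaves 40 training points, which matches the training-set size quoted later in the paper.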

Selection of NN Inputs and Outputs

Gas production was selected as the NN output because it is the prediction target. Diagnostic techniques, such as scatter plots and correlation matrices, were applied to the data to check their validity and to study the relationships between the target and each of the predictor variables. For example, a scatter plot of average footage drilled per oil and gas exploratory well vs. gas production is shown in Fig. 2. The correlation coefficients for all inputs vs. the target (gas production) are given in Table 1. The highest correlation coefficient is 0.924, for Input I-9, average footage drilled per oil and gas exploratory well; Fig. 2 shows the correspondingly high linear correlation of this variable with gas production. The correlation matrix helps reduce the number of input variables by excluding those with high correlation coefficients; some of these variables, however, are important and need to be included in the network model because of their physical relation to the target. This problem can be alleviated by applying transformations that remove the trend and reduce the high correlation coefficient. Fig. 3 shows a scatter plot of Input I-9 vs. gas production after the normalization and first-derivative transformation. The data points are more scattered and fairly evenly distributed around the zero horizontal line. The preprocessing procedure reduced the correlation coefficient of this input by 45%, from 0.924 to 0.512.

NN Model Design

There are a number of design factors that must be considered in constructing an NN model. These include selection of the NN architecture, the learning rule, the number of processing elements in each layer, the number of hidden layers, and the type of transfer function. Fig. 4 depicts the NN model designed in this study.

Fig. 1—Basic structure of a three-layer back-propagation (BP) NN.

Fig. 2—Scatter plot of gas production and average footage drilled per oil and gas exploratory well.

Architecture. The NN architecture determines how the weights are interconnected in the network and specifies the types of learning rules that may be used. Selecting the network architecture is one of the first tasks in setting up an NN. The multilayer normal-feedforward architecture (Refs. 1 through 3) is the most commonly used and is generally recommended for most applications; hence, it was selected for this study.

Learning Algorithm. Selection of a learning rule is also an important step because it affects the choice of input and transfer functions and their associated parameters. The network used here is based on a back-propagation (BP) design (Ref. 1), the most widely recognized and most commonly used supervised-learning algorithm. In this study, the quick-propagation (QP) learning algorithm (Ref. 14), an enhanced version of BP, is used for its performance and speed. The advantage of QP is that it runs faster than BP by minimizing the time required to find a good set of weights with heuristic rules. These rules automatically regulate the step size and detect conditions that accelerate learning. The optimum step size is then determined by evaluating the trend of the weight updates over time.

The fundamental design of a BP NN consists of an input layer, a hidden layer, and an output layer, as shown in Fig. 4. A layer consists of a number of processing elements, or neurons, and is fully connected: each neuron of the input layer is connected to each hidden-layer node, and each hidden-layer node is connected to each output-layer node. The number of nodes needed for the input and output layers depends on the number of inputs and outputs designed for the NN.

Activation Rule. A transfer function acts on the value returned by the input function, which combines the input vector with the weight vector to obtain the net input to the processing element for a particular input vector. Each transfer function introduces a nonlinearity into the NN, enriching its representational capacity; in fact, it is the nonlinearity of the transfer function that gives an NN its advantage over conventional regression techniques.

There are a number of transfer functions, among them the sigmoid, arctan, sine, linear, Gaussian, and Cauchy. The most commonly used is the sigmoid function. It squashes and compresses the input function when it takes on large positive or negative values: large positive values asymptotically approach 1, while large negative values are squashed to 0. The sigmoid is given by

f(x) = 1 / [1 + \exp(-x)]. ..........(3)

Fig. 5 is a typical plot of the sigmoid function. In essence, the activation function acts as a nonlinear gain for the processing element. The gain is the slope of the sigmoid at a specific point; it varies from a low value at large negative inputs to a high value at zero input, then drops back toward zero as the input becomes large and positive.
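The sigmoid of Eq. 3 and its slope (the "gain" just described) can be written out directly; the derivative identity f'(x) = f(x)[1 − f(x)] is a standard property of this function:

```python
import math

def sigmoid(x):
    """Eq. 3: squashes any real input into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_gain(x):
    """Slope of the sigmoid: peaks at x = 0 and decays toward zero at both extremes."""
    s = sigmoid(x)
    return s * (1.0 - s)
```

The gain peaks at 0.25 at zero input, matching the behavior described in the text.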

Training Procedure

In the first step of the development process, the available data were divided into training and test sets. The training set covered the data from 1949 to 1989 (40 yearly data points), while the test set covered the data from 1990 to 1998 (9 yearly data points); we chose to split the data on an 80/20 basis. We first normalized all input variables and the output with the mean/standard-deviation method, then took the first derivative of all input variables, including the output. In the initial training and testing phases, we developed the network model with most of the default parameters in the NN software; these default settings generally provided satisfactory starting results. We examined different architectures, learning rules, and input and transfer functions (with increasing numbers of hidden-layer neurons) on the training set to find the optimal learning parameters and then the optimal architecture. We primarily used the black-box testing approach (comparing network results to actual historical results) to verify that the inputs produce the desired outputs. During training, we used several diagnostic tools to facilitate understanding of how the network is training. These include
• The MSE of the entire output.
• A plot of the MSE vs. the number of iterations.
• The percentage of training- or test-set samples that are correct based on a chosen tolerance value.
• A plot of the actual vs. the network output.
• A histogram of all the weights in the network.

Fig. 3—Scatter plot of gas production and average footage drilled per oil and gas exploratory well after data preprocessing.

Fig. 4—NN design from this study.

Fig. 5—Sigmoid function.

The three-layer network with all 15 initial input variables was trained with the training samples. We chose the number of neurons in the hidden layer on the basis of existing rules of thumb (Refs. 2 and 3) and experimentation. One rule of thumb states that the number of hidden-layer neurons should be approximately 75% of the number of input variables. Another suggests that the number of hidden-layer neurons be approximately 50% of the total number of input and output variables. One advantage of the neural software used in this study is that it allows the user to specify a range for the minimum and maximum number of hidden neurons. Putting this knowledge together with our experimentation allowed us to specify a range of 5 to 12 hidden neurons for the single hidden layer.
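The two rules of thumb are simple arithmetic; a small helper (hypothetical, for illustration) makes the arithmetic explicit for this network's 15 inputs and 1 output:

```python
def hidden_neuron_candidates(n_inputs, n_outputs):
    """Two common rules of thumb for sizing a single hidden layer."""
    rule_a = round(0.75 * n_inputs)                 # ~75% of the input count
    rule_b = round(0.50 * (n_inputs + n_outputs))   # ~50% of inputs plus outputs
    return rule_a, rule_b
```

With 15 inputs and 1 output the rules suggest roughly 11 and 8 hidden neurons, consistent with the 5-to-12 search range chosen in the study.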

We used input sensitivity analysis to study the significance of each input parameter and how it affects network performance. This procedure helps remove redundant input parameters and determine the optimum number of NN inputs. In each training run, the results of the input sensitivity analysis are examined and the least significant input parameter is deleted; the weights are then reset and the network-training process restarted with the remaining input parameters. This process is repeated until all remaining input parameters contribute significantly to network performance. An input is considered significant when its normalized effect value is greater than or equal to 0.7 in the training set and 0.5 in the test set. We varied the number of iterations used to train the network from 500 to 7,000 to find the optimal number; 3,000 iterations were used for most of the training runs. Training terminates automatically when the maximum number of iterations is reached or when the network's mean squared error falls below the set limit, specified as 1.0×10⁻⁵. While training the network, the test set is also evaluated: each pass through the training set triggers a pass through the test set. This step does not affect the training statistics; it only evaluates the test set during training for fine-tuning and generalizing the network parameters.
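The iterative pruning procedure can be sketched as a loop. Here `train` and `normalized_effects` stand in for the NN software's training run and sensitivity output, so this is a schematic of the procedure under those assumptions, not the actual package API:

```python
def prune_inputs(inputs, train, normalized_effects,
                 train_thresh=0.7, test_thresh=0.5):
    """Repeatedly drop the least significant input until all pass both thresholds."""
    inputs = list(inputs)
    while True:
        model = train(inputs)  # reset weights and retrain from scratch
        train_eff, test_eff = normalized_effects(model)
        # inputs failing either significance threshold
        weak = [x for x in inputs
                if train_eff[x] < train_thresh or test_eff[x] < test_thresh]
        if not weak:
            return inputs, model
        # delete the least significant input, then restart training
        inputs.remove(min(weak, key=lambda x: min(train_eff[x], test_eff[x])))
```

The loop terminates when every surviving input meets the 0.7 training-set and 0.5 test-set normalized-effect thresholds quoted in the text.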

After training, the network performance was tested. The test set was used to determine how well the network performed on data it had not seen during training.

To evaluate network performance, the classification option used marks a network output as correct based on a set tolerance. This method evaluates the percentage of training and test samples that faithfully reproduce the patterns and values of the network outputs. We used a tolerance of 0.05 in this study (the default value is 0.5), meaning that all outputs for a sample must be within this tolerance for the sample to be considered correct. Another measure is the plot of the mean squared error vs. the number of iterations: a well-trained network is characterized by decreasing errors for both the training and test sets as the number of iterations increases.

Results of Training and Testing

We used the input sensitivity-analysis technique (Refs. 2 and 14) to gauge the sensitivity of the gas production rate (the output) to any particular input. The method uses the weight values of a successfully trained network to extract the information relevant to any particular input node. The outcome is the effect and normalized-effect values of each input variable on the gas-production output rate. These effect values represent an assessment of the influence of any particular input node on the output node.

The results of the input-identification process and training procedure indicated that the network performs excellently with 11 input parameters. We found that these parameters, described in Table 2, contribute significantly to network performance.

Tables 3 and 4 present the results of the input sensitivity analysis for the training and test sets, respectively. The normalized-effect values indicate that all 11 inputs contribute significantly to the improvement of network performance and to the prediction of the U.S. natural gas production rate for both the training and test sets. The training-set analysis (Table 3) shows that the gas annual depletion rate (I-15) is the most significant input parameter contributing to network performance and, hence, to predicting U.S. natural gas production. The input of gas wellhead prices (I-3) has the lowest normalized-effect value (0.7) of all inputs in the training set, although we found it important to network performance and kept it in the model. Table 4 shows that all inputs in the test set exceeded the specified threshold value of 0.5, indicating that all inputs contribute significantly to the network model.

The network was trained with 5,000 iterations and the QP learning algorithm. We found that the optimum number of hidden-layer nodes is 5. Fig. 6 shows the NN model prediction, after the training and validation processes, superimposed on the normalized actual U.S. gas production. The NN predictions show excellent agreement with the actual production data in both the training and testing stages, indicating that the network was trained and validated well and is ready to be used for forecasting. In addition, statistical and graphical error analyses were used to examine network performance.

Optimization of Network Parameters. We tried different network configurations to optimize the number of hidden nodes and the number of iterations and thus fine-tune network performance, running numerous simulations in the process. Table 5 presents selected cases for illustration and shows that increasing the number of iterations beyond 5,000 improves training-set performance but worsens test-set performance; decreasing the number of iterations to 3,000 yields higher errors for both sets. The number of hidden-layer nodes was also varied, from 4 to 22. Increasing the number of hidden nodes beyond five gave good results for the training set but unsatisfactory results for the test set, which is the more important. From these analyses, the optimal network configuration for this U.S. gas production model is a three-layer QP network with 11 input nodes, 5 hidden nodes, and 1 output node, optimally trained with 5,000 iterations.

Error Analysis. The statistical accuracy of this network is given in Table 5 (Case 11a). The mean squared error (MSE) is 0.0034 for the training set and 0.0252 for the test set. Fig. 7 shows the MSE vs. the number of iterations for both sets. The errors on the training-set samples decrease consistently throughout the training process; errors on the test-set samples decrease fairly consistently along with them, indicating that the network is generalizing rather than memorizing. All the training- and test-set samples are classified as 100% correct at a 0.05 tolerance, as shown in Fig. 8.

Fig. 9 shows the residual plot of the NN model for both the training and test samples. The training-set errors are not only minimal but also evenly distributed around zero, as shown in Fig. 10. As is usually the case, errors on the test samples are slightly higher than on the training ones. Crossplots of predicted vs. actual natural gas production are presented in Figs. 11 and 12; almost all the plotted points fall very close to the perfect 45° line, indicating the model's high degree of accuracy.

Forecasting

After successful development of the NN model for U.S. natural gas production, future gas production rates must be forecast. To implement the network model for prediction, forecast models must be developed for all 11 network inputs or obtained from independent studies. We developed forecasting models for all the independent network inputs (except gas wellhead prices) with the time-series-analysis approach; the forecasts for gas wellhead prices came from the EIA (Ref. 15). We adjusted the EIA gas-price forecasts, based on 1998 U.S. dollars/Mcf, to 1992 U.S. dollars/Mcf so that they would be compatible with the historical gas prices used in network development. We developed the forecasting models for the NN input variables with the Box-Jenkins methodology of time-series analysis (Ref. 16). Details of forecast development for the other network inputs are described in Ref. 17.

Before implementing the network model for forecasting, we took one additional step: adding the test set back to the original training set. The network could then be trained one more time, keeping the configuration and parameters of the original trained network intact. The purpose of this step is to have the network take into account the effects of all available data. Because the amount of data is limited, this ensures generalization of the network performance, yielding better forecasts.

Next, we saved the forecasted network inputs for 1999 to 2020 as a test-set file, while the training-set file contained data from 1950 to 1998. We then ran the network with one pass through the training and test sets. We restored the outputs to their original form by adding the output value at a given time to its previous value. After decoding the first-difference output values, we denormalized the obtained values for the training and test samples with the same normalization parameters as in the data preprocessing.

Fig. 6—Performance of the NN model with actual U.S. gas production.

Fig. 7—Convergence behavior of the QP three-layer network (11, 5, 1) that learned from the U.S. natural gas production data.

Fig. 8—Behavior of training and testing samples classified as correct.
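The two-step decoding just described (undo the first difference in normalized space, then undo the normalization) can be sketched as follows. The function and parameter names are illustrative; `mean` and `std` are the per-variable normalization parameters preserved during preprocessing:

```python
def decode_forecast(diff_outputs, last_norm_level, mean, std):
    """Convert first-difference NN outputs back to production levels.

    diff_outputs:    network outputs in normalized first-difference form
    last_norm_level: last known production level in normalized units
    mean, std:       normalization parameters saved during preprocessing
    """
    levels = []
    norm = last_norm_level
    for d in diff_outputs:
        norm += d                          # cumulative sum restores normalized levels
        levels.append(norm * std + mean)   # reverse Eq. 2 back to original units
    return levels
```

Because differencing followed normalization, the decode must accumulate first and denormalize second; reversing the order would give wrong levels.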

Fig. 13 shows this study's NN forecast of U.S. gas production to 2020, along with the excellent match between the NN model results and actual natural gas production data. The NN forecasting model indicates that U.S. gas production declined in 1999, by 1.8% of the 1998 production. Production stays near the 1999 level, with a slight decline, until 2001, after which gas production starts to increase. From 2002 to 2012, gas production increases steadily, with an average growth rate of approximately 0.5%/yr. The NN model indicates that this growth will more than double from 2013 to 2020, to a 1.3%/yr average growth rate. By 2019, gas production is predicted to reach 22.6 Tcf/yr, approximately the 1973 production level.

The NN forecasting model developed in this study depends not only on the performance of the trained data set but also on the future behavior of the forecasted input parameters. The network model should therefore be updated periodically as new data become available. While it is desirable to update the network model with new data, the architecture and its parameters need not be changed; only a one-time training run with the updated data is necessary.

Comparison of Forecasts

This section compares forecasts of U.S. natural gas production from the EIA (Ref. 15), from the NN approach, and from the stochastic modeling approach developed by Al-Fattah (Ref. 17). The EIA 2000 forecast of U.S. gas supply is based on U.S. Geological Survey (USGS) estimates of U.S. natural gas resources, including conventional and unconventional gas. The main assumptions of the EIA forecast are as follows:
• Drilling, operating, and lease equipment costs are expected to decline by 0.3 to 2%.
• Exploratory success rates are expected to increase by 0.5%/yr.
• Finding rates will improve by 1 to 6%/yr.

Fig. 9—Residual plot of the NN model.

Fig. 10—Frequency of residuals in the NN model.

Fig. 11—Crossplot of NN prediction model and actual gas production (first difference).

Fig. 12—Crossplot of NN prediction model and actual gas production (normalized).

Fig. 14 shows the EIA forecast compared with those from this study's NN and time-series analysis (stochastic modeling).

The stochastic forecast modeling approach we used was based on the Box-Jenkins time-series method, as described in detail by Al-Fattah (Ref. 17). We studied past trends of all input data to determine whether their values could be predicted with an autoregressive integrated moving average (ARIMA) time-series model. An ARIMA model predicts a value in a time series as a linear combination of its own past values and errors. A separate ARIMA model was developed for each input variable in the NN forecasting model. Analyses of all input time series showed that the ARIMA models were both adequate (errors were small) and stationary (errors showed no time trend).

When we used an ARIMA model to forecast gas production directly, with only time-dependent data, we could not achieve time-independent errors throughout the production history (1918 to 1998). However, because we had determined previously that both the depletion rate and the reserves discovery rate were stationary time series, we used these two ARIMA models to forecast gas production by multiplying the depletion rate and the gas reserves. The product of these two time series gives the stochastic gas forecast in Fig. 14.

The EIA forecast of U.S. gas supply, at approximately 20 Tcf/yr for 2000, is higher than the NN forecast of approximately 19.5 Tcf/yr. The EIA forecast matches the NN one from 2001 to 2003, after which it increases considerably, with annual average increases of 2.4% from 2004 to 2014 and 1.3% thereafter.

The stochastic-derived model gives a production forecast that is much higher than the EIA and NN forecasts; it shows an exponential trend with an average growth rate of 2.3%/yr.
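The production reconstruction from the two stationary series can be sketched as below. For brevity, a least-squares AR(1) fit stands in for the paper's full Box-Jenkins ARIMA models, so treat this as a schematic of the idea under that simplifying assumption, not the authors' method:

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = a + b * x[t-1] (stand-in for a full ARIMA fit)."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    return my - b * mx, b

def forecast_ar1(series, steps):
    """Roll the fitted recurrence forward from the last observed value."""
    a, b = fit_ar1(series)
    out, last = [], series[-1]
    for _ in range(steps):
        last = a + b * last
        out.append(last)
    return out

def stochastic_production(depletion_rate_pct, reserves, steps):
    """Production forecast = forecasted depletion rate (%) x forecasted reserves."""
    dr = forecast_ar1(depletion_rate_pct, steps)
    rv = forecast_ar1(reserves, steps)
    return [d / 100.0 * r for d, r in zip(dr, rv)]
```

The key point is structural: production itself is not modeled directly; it is the product of two separately forecast stationary series.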

The NN forecast is based on the following assumed independent input forecasts.
• Gas prices are expected to increase by 1.5%/yr.
• The gas depletion rate is expected to increase by 1.45%/yr.
• Drilling of gas exploratory wells will improve by 3.5%/yr.
• Drilling of oil/gas exploratory wells will increase by an average of 2.5%/yr.
• D_G will increase by an average of 2.1%/yr.

The NN forecast takes into account the effects of physical and economic factors on U.S. gas production, which makes its forecasts of natural gas supply reliable. The NN model indicates that U.S. gas production will increase from 2002 to 2012 by 0.5%/yr on average. Thereafter, gas production will increase faster, averaging 1.3%/yr through 2020.

Conclusions

This paper presents a new approach to forecast the future production of U.S. natural gas with an NN. The three-layer network was trained and tested successfully, and comparison with actual production data showed excellent agreement. Forecasts of the network input parameters were developed with a stochastic-modeling approach to time-series analysis. The network model included various physical and economic input parameters, rendering it a useful short-term as well as long-term forecasting tool for future gas production.

The NN model's forecasting results showed that 1998 U.S. gas production would decline at a rate of 1.8%/yr in 1999, with production in 2001 returning to the 1999 level. After 2001, gas production increases steadily until 2012, with an average growth rate of approximately 0.5%/yr. This growth more than doubles from 2013 to 2020, with a 1.3%/yr average growth rate. By 2020, gas production is predicted at 23 Tcf/yr, slightly higher than the 1973 production level.
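The stated growth rates can be checked for consistency against the 2020 figure by compounding from the roughly 19.5-Tcf/yr level the NN forecasts around 2000 to 2001:

```python
# Rough arithmetic check of the forecast path described in the text.
p = 19.5                       # approximate production level through 2001, Tcf/yr
for _ in range(2002, 2013):    # 0.5%/yr average growth, 2002-2012 (11 years)
    p *= 1.005
for _ in range(2013, 2021):    # 1.3%/yr average growth, 2013-2020 (8 years)
    p *= 1.013
# p lands near 23 Tcf/yr, consistent with the 2020 prediction above.
```

The compounded result is within rounding of the 23 Tcf/yr stated in the conclusions.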

The NN model is useful as both a short-term and a long-term predictive tool for future gas production. It can also be used to examine quantitatively the effects of various physical and economic factors on future gas production. With the NN model developed in this study, we recommend further analysis to evaluate those effects in detail.

Nomenclature
GDP = gross domestic product, U.S. dollars
G_GDP = growth rate of gross domestic product
i = observation number
t = time, yr
X = input/output vector
X′ = normalized input/output vector
μ = mean or arithmetic average
σ = standard deviation

References

1. Haykin, S.: Neural Networks: A Comprehensive Foundation, Macmillan College Publishing Co., New York City (1994).
2. Azoff, E.M.: Neural Network Time Series Forecasting of Financial Markets, John Wiley & Sons Ltd., Chichester, England (1994).
3. Neural Networks in Finance and Investing: Using Artificial Intelligence to Improve Real-World Performance, revised edition, R.R. Trippi and E. Turban (eds.), Irwin Professional Publishing, Chicago, Illinois (1996).
4. Mohaghegh, S.: "Virtual-Intelligence Applications in Petroleum Engineering: Part I—Artificial Neural Networks," JPT (September 2000) 64.
5. Al-Kaabi, A.U. and Lee, W.J.: "Using Artificial Neural Nets To Identify the Well-Test Interpretation Model," SPEFE (September 1993) 233.

Fig. 13—NN forecasting model of U.S. gas production.

Fig. 14—Comparison of U.S. gas-production forecasts.

90 May 2003 SPE Production & Facilities

6. Habiballah, W.A., Startzman, R.A., and Barrufet, M.A.: "Use of Neural Networks for Prediction of Vapor/Liquid Equilibrium K Values for Light-Hydrocarbon Mixtures," SPERE (May 1996) 121.
7. EIA, Internet home page: http://www.eia.doe.gov/.
8. Twentieth Century Petroleum Statistics, 52nd edition, DeGolyer and MacNaughton, Dallas (1996).
9. Twentieth Century Petroleum Statistics, 54th edition, DeGolyer and MacNaughton, Dallas (1998).
10. Attanasi, E.D. and Root, D.H.: "The Enigma of Oil and Gas Field Growth," AAPG Bull. (March 1994) 78, 321.
11. Energy Statistics Sourcebook, 13th edition, OGJ Energy Database, PennWell Publishing Co., Tulsa (1998).
12. "World Energy Projection System," DOE/EIA-M050, Office of Integrated Analysis and Forecasting, U.S. Dept. of Energy, EIA, Washington, DC (September 1997).
13. Kutner, M.H. et al.: Applied Linear Statistical Models, fourth edition, Irwin, Chicago (1996).
14. ThinksPro: Neural Networks Software for Windows User's Guide, Logical Designs Consulting Inc., La Jolla, California (1995).
15. "Annual Energy Outlook 2000," DOE/EIA-0383, Office of Integrated Analysis and Forecasting, U.S. Dept. of Energy, EIA, Washington, DC (1999).
16. Box, G.E., Jenkins, G.M., and Reinsel, G.C.: Time Series Analysis: Forecasting and Control, third edition, Prentice-Hall Inc., Englewood Cliffs, New Jersey (1994).
17. Al-Fattah, S.M.: "New Approaches for Analyzing and Predicting Global Natural Gas Production," PhD dissertation, Texas A&M U., College Station, Texas (2000).

SI Metric Conversion Factors
ft × 3.048* E−01 = m
ft³ × 2.831 685 E−02 = m³
*Conversion factor is exact.

Saud Al-Fattah is a reservoir management engineer in the Reservoir Management Dept. of Saudi Aramco, Dhahran. His specialties include reservoir engineering, operations research, economic evaluation, forecasting, and strategic planning. Al-Fattah holds MS and BS degrees from King Fahd U. of Petroleum and Minerals and a PhD degree from Texas A&M U., all in petroleum engineering. Richard A. (Dick) Startzman is currently a professor of petroleum engineering at Texas A&M U. He was employed by Chevron Corporation for 20 years in research, operations, and management in the U.S., Europe, and the Middle East. He joined the petroleum engineering faculty at Texas A&M in 1982. His research interests include reservoir engineering, economic evaluation, artificial intelligence, and optimization. He was named to the Peterson Professorship in 1993. He has been active in the Society of Petroleum Engineers and was elected a Distinguished Member in 1994.
