This report is confidential.
19.01.2005 [Skriv inn tekst] 1
Application of Artificial Neural Networks to Predict Shale Content
Kesheng Wang, Resko Barna, Maxim Boldin, Pablo Pascual, Ove R. Hjelmervik
Knowledge Discovery Laboratory
Department of Production and Quality Engineering
The prediction of parameters in a hydrocarbon reservoir is typically carried out by an
interpreter using sparse well information and seismic data. The resulting maps may contain
varying levels of uncertainty, depending on the experience of the interpreter and on the
availability and quality of the seismic and well log data.
This report describes an Artificial Neural Network (ANN) approach to the problem of
predicting shale content in the reservoir. An interval of seismic data representing the zone
of interest is extracted from a three-dimensional data volume. Seismic data and well log data
are used as the input and target of a Regularized Back-propagation (RBP) neural network. A
series of ANNs is trained and the results are presented.
The results show that realistic prediction maps can be generated using an ANN approach in
which the input data consist of raw seismic amplitudes and a depth index, and the target
data consist of VSH (fraction of rock made up of shale) well logs. The maps created for
subintervals in the zone of interest show consistent features. The ANN prediction error is
also acceptable compared with human interpretation.
Development and exploitation of a hydrocarbon reservoir typically make use of maps
describing the spatial distributions of relevant parameters. These maps are generated from
well information that may or may not adequately sample the reservoir. In addition,
reservoirs often exhibit a high degree of heterogeneity, which introduces high levels of
uncertainty into interpolated parameters. A recent approach has been to use complex
trace attributes derived from 3-D seismic data volumes to track the desired parameters
through heterogeneous zones. This is usually a time-consuming process requiring a
An oil or gas reservoir is, in large part, characterized by lithological parameters such as
porosity, permeability and shale content. Accurate estimates of these parameters are
important for calculating oil or gas reserves and developing exploration strategies. The
motivation for this study was to be able to produce accurate shale content predictions.
At present, shale content estimates are produced by geostatistical interpretation techniques
using well log data. This approach can account for large-scale variation in lithological
parameters, but is unable to provide the level of resolution required to produce accurate
production estimates or to determine location if the reservoir lies in a zone exhibiting a high degree
Currently, 3-D seismic surveying is one of the primary tools for characterizing a
hydrocarbon reservoir. Attributes extracted from seismic data are used to qualitatively
describe variations in lithology and associated physical parameters across zones of
interest. By correlating seismic data with well log data, it is, in principle, possible to
produce qualitative predictions at each common midpoint (CMP) location.
Conventional techniques correlate seismic attributes to lithological parameters using
deterministic and stochastic methods. The use of an ANN approach enables seismic data to
be related to shale content without explicitly defining the relationships between the various
parameters, i.e. without the prior knowledge needed to establish a mathematical model.
More importantly, ANNs naturally utilize intervals of seismic data or combinations of
attributes and other information rather than single values. This ability increases the
amount of information available for making predictions.
This report describes a set of procedures that were used to produce shale content
predictions for an oil field in the Hammerfest basin, located in the Norwegian Sea off the
coast of Norway. The primary focus is on how shale content predictions depend on
attributes from the seismic trace and the well log. One well log attribute related to shale
content, VSH, was selected as the target of the ANN, and combinations of several slots of
seismic amplitude samples and a time index were selected as input to the ANN.
Because the important issue is to define shale heterogeneities in the sand-dominated
reservoir, the VSH response is used as the indicator of reservoir characteristics.
Consequently, the idea is to predict the VSH response from seismic attributes, which
enables VSH to be generated from a seismic trace. Predicting the shale content reduces the
risk of drilling by differentiating between sand layers and shale layers. The well log VSH
varies between 0 and 1 and indicates the shale content: a VSH value of 1 indicates pure
shale, while 0 indicates pure sand.
A seismic sample can be a sample from any processed seismic cube – i.e., it can be taken
from migrated stack, P-impedance stack, S-impedance stack, or any LMR stack.
Well data are the key to establishing the relationship between shale content and seismic
attributes. Before such a relationship can be established, the VSH log, which was recorded
as a function of depth, has to be up-scaled and transformed into a function of time. The
up-scaling was conducted with an averaging process.
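The up-scaling step can be sketched as follows. This is a minimal illustration rather than the procedure used in the study: it assumes a hypothetical time-depth table (`td_depth`, `td_twt`) for the depth-to-time conversion, linear interpolation between its points, and a plain block average of all log samples that fall into each seismic sample interval. The function name and the toy values are also assumptions.

```python
import numpy as np

def upscale_log_to_time(depth, vsh, td_depth, td_twt, dt_ms):
    """Convert a depth-indexed log to two-way time and block-average it.

    depth, vsh       : log depths (m) and VSH values (fraction, 0..1)
    td_depth, td_twt : hypothetical time-depth pairs (m, ms), increasing
    dt_ms            : seismic sampling interval in ms
    """
    # Depth-to-time conversion by linear interpolation of the time-depth table.
    twt = np.interp(depth, td_depth, td_twt)
    # Assign each log sample to the seismic sample interval that contains it.
    bins = np.floor(twt / dt_ms).astype(int)
    first = bins.min()
    n_bins = bins.max() - first + 1
    sums = np.bincount(bins - first, weights=vsh, minlength=n_bins)
    counts = np.bincount(bins - first, minlength=n_bins)
    # Average per interval; intervals without any log sample stay NaN.
    avg = np.full(n_bins, np.nan)
    np.divide(sums, counts, out=avg, where=counts > 0)
    times = (first + np.arange(n_bins)) * dt_ms
    return times, avg

# Toy example: four log samples, 2 ms seismic sampling.
depth = np.array([2000.0, 2001.0, 2002.0, 2003.0])
vsh = np.array([0.2, 0.4, 0.6, 0.8])
times, avg = upscale_log_to_time(
    depth, vsh,
    td_depth=np.array([2000.0, 2004.0]),
    td_twt=np.array([1800.0, 1804.0]),
    dt_ms=2.0,
)
print(times, avg)
```

In practice the averaging window and the time-depth relation would come from the well data themselves; the block average here only shows the principle of reducing the log to the seismic sampling interval.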
Several options (including single-attribute regression, multi-attribute correlation, and
Artificial Neural Networks) were investigated in this study. This report focuses on the
ANN application. The study on which this report is based used real seismic and
well-log data originating from the Hammerfest basin, a map of which is shown in Figure 1.
Figure 1. Map of the Hammerfest field in the Norwegian Sea, Norway.
2. Architecture of the applied ANN
Artificial Neural Networks have been used extensively in the oil industry. In the
approximately 10 years since McCormack's review of neural network applications in geophysics,
much work has been done to bring such applications into the mainstream of geophysical
interpretation. Some of these efforts are documented in the literature [Lindsay et al.,
2002; Liu et al., 2002; Nikravesh et al., 2003; Poulton, 2002; Russell et al., 2002, 2003;
Sandham et al., 2003; Tonn, 2002; Walls et al., 2002; Wong et al., 2002], which includes
many reports and extensive references on Artificial Neural Network applications in
reservoir characterization. Most of these applications have been in reservoir
characterization, seismic object detection, creating pseudo logs, and log editing. In this
study, we use a Multilayer Perceptron (MLP) ANN with a Back-propagation (BP)
2.1 Back-propagation ANN
The Artificial Neural Network technique employed was a Back-propagation net of the
Multilayer Perceptron (MLP) type. The basic building element of this model is the
Perceptron or Processing Element (PE). The network is an artificial representation of the
human brain that tries to simulate its learning process. In this type of network, Processing
Elements are organized in layers. In its simplest form, the MLP consists of three separate
layers: an input layer, a hidden layer and an output layer (see Figure 2). Every PE in each
layer is connected to every PE in the next layer. Thus, the net is
fully connected and feedforward. MLPs are trained on a representative data set
(supervised learning). Known examples, each consisting of an input pattern and the
corresponding output pattern, are repeatedly fed into the ANN during the training process.
The output of the ANN is compared with the target output and the difference (error) is
minimized by adjusting the weights of the connections.
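The training loop described above can be sketched in a few lines of NumPy. This is an illustrative toy, not the network used in the study: it assumes a tanh hidden layer, a linear output node, full-batch gradient descent on the mean square error, and synthetic data; all function names, layer sizes and the learning rate are hypothetical.

```python
import numpy as np

def init_mlp(n_in, n_hidden, n_out, rng):
    """Small random weights for a fully connected 3-layer MLP."""
    return {
        "W1": rng.normal(0.0, 0.5, (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.5, (n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }

def forward(p, X):
    """Feedforward pass: tanh hidden layer, linear output layer."""
    H = np.tanh(X @ p["W1"] + p["b1"])
    Y = H @ p["W2"] + p["b2"]
    return H, Y

def train_step(p, X, T, lr=0.1):
    """One back-propagation update; returns the MSE before the update."""
    H, Y = forward(p, X)
    E = Y - T                                # difference between output and target
    dY = 2.0 * E / E.size                    # gradient of the MSE w.r.t. the output
    dH = (dY @ p["W2"].T) * (1.0 - H ** 2)   # error propagated back through tanh
    p["W2"] -= lr * (H.T @ dY)
    p["b2"] -= lr * dY.sum(axis=0)
    p["W1"] -= lr * (X.T @ dH)
    p["b1"] -= lr * dH.sum(axis=0)
    return float(np.mean(E ** 2))

# Synthetic input/target patterns standing in for seismic samples and a log value.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, (64, 4))
T = 0.5 + 0.3 * X[:, :1] - 0.2 * X[:, 1:2] * X[:, 2:3]
params = init_mlp(4, 5, 1, rng)
losses = [train_step(params, X, T) for _ in range(500)]
print(f"MSE: first step {losses[0]:.4f}, last step {losses[-1]:.4f}")
```

The same examples are presented repeatedly, and each pass moves the weights a small step in the direction that reduces the error, which is the behaviour the paragraph describes.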
Figure 2. Back-propagation ANN architecture (inputs: seismic samples; output: well log VSH).
The representative data set consisted of seismic traces extracted at and around the wells.
(In this test, we simply chose a seismic trace at the well.) The VSH trace is recorded
in the specified well. In general, the nearest nine traces should be selected around each
well, so that at each well nine traces are presented to the ANN and related to the
desired output (VSH log).
The variations available for constructing an ANN are considerable. The number of input
nodes as well as the number of hidden layers can be modified to improve the
performance of the ANN. Furthermore, the number of nodes in each hidden layer can be
varied, and ANN parameters such as the learning rate and momentum, as well as the
activation functions, can be changed.
The concept of this test was to keep the structure simple and the number of input nodes as
small as possible. This minimizes the risk of overtraining, i.e. the ANN modelling the
training data so closely that it loses its ability to generalize. This is critical
because generalization is the network's ability to identify input patterns that were not part
of the training set.
2.2 Regularized Back-propagation network
In the tests, a regularized Back-propagation network proposed by Saggaf et al.
was selected. A traditional BP ANN is constructed by solving a system of equations such
that the network weights minimize the misfit error between the training data and the
network output. The Objective Function (OF) of the optimization is thus the mean square
error of the misfit:

$$OF = \frac{1}{N} \sum_{i=1}^{N} e_i^2$$

where e_i is the misfit error for the ith training observation and N is the total number of
observations in the training data set. The non-monotonic generalization behavior of the
network can be alleviated by modifying the objective function such that not only the
misfit error is reduced, but the network weights as well:
where w_i is the ith network weight, M is the total number of weights in the ANN, and λ is
the regularization parameter.
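A regularized objective consistent with these definitions can be written as a weighted sum of the mean square misfit and the mean square network weight. The form below is a sketch that assumes a convex weighting by λ; the symbol $OF_{reg}$ and the exact placement of λ are assumptions rather than something stated in this report:

$$OF_{reg} = (1 - \lambda)\,\frac{1}{N}\sum_{i=1}^{N} e_i^2 \;+\; \lambda\,\frac{1}{M}\sum_{i=1}^{M} w_i^2$$

With λ = 0 this reduces to the plain mean square misfit, and as λ approaches 1 the misfit term is progressively ignored in favour of small weights.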
This modified objective function results in a network that not only fits the data, but also
has small weights, which give rise to smaller variations in the output and thus yield a
smoother result in output space that is less likely to over-fit the data.
The regularization parameter λ determines the relative emphasis of smoothness versus
degree of fit to the training data. A small value of λ results in an essentially
un-regularized network. On the other hand, if λ is too large, the network does not
adequately fit the data.
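The trade-off controlled by λ can be illustrated numerically. The sketch assumes the objective is the convex combination (1 − λ)·(mean square misfit) + λ·(mean square weight), which is one common way of writing a regularized objective and may differ from the exact form used in the study; `regularized_objective` and the toy numbers are hypothetical.

```python
import numpy as np

def regularized_objective(errors, weights, lam):
    """(1 - lam) * mean(e_i^2) + lam * mean(w_i^2)  (assumed form)."""
    errors = np.asarray(errors, dtype=float)
    weights = np.asarray(weights, dtype=float)
    mse = np.mean(errors ** 2)    # misfit term over the N observations
    msw = np.mean(weights ** 2)   # weight term over the M network weights
    return (1.0 - lam) * mse + lam * msw

errors = [0.1, -0.2, 0.05]
weights = [1.5, -2.0, 0.5, 3.0]
for lam in (0.0, 0.1, 0.9):
    print(lam, regularized_objective(errors, weights, lam))
```

At λ = 0 only the misfit counts, so nothing discourages large weights; near λ = 1 the weight term dominates and a network minimizing this value would have little incentive to fit the data.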
3. Input and Output structure
We discuss here two distinct geometries for setting up the network with regard to its
input and output vectors. A seismic trace near a well and a seismic trace far from the well
are presented to the network input while the observed reservoir property in the well (we
focus on porosity here) is presented at the output. After training, seismic traces in the
inter-well regions are fed to the ANN to estimate or predict the porosity there. Note that
these architectures describe the input/output geometry and are independent of the ANN type.
3.1 Parallelized architecture
This architecture represents the conventional geometry that is most often used with
ANNs. In this setup, each seismic trace is represented by one input vector, and hence the
network has as many input nodes as there are time samples in the seismic trace. At the
network output, each node can represent either an instantaneous time sample of the
reservoir porosity or an interval average of that porosity. The numbers of nodes in the
input and output layers are decoupled, and there is no requirement for these two numbers
to be the same. A schematic of this structure is shown in Figure 3.
Figure 3. Schematic of the BP ANN: an input layer with nodes Seismic Sample 1 to N and an output layer with nodes Porosity Sample 1 to M.
The advantage of this architecture is its flexibility in setting up the problem,
because any number of well log samples can be specified, independent of the number of
seismic samples. The disadvantage is that the size of the BP ANN increases
significantly with any rise in the number of time samples of the seismic trace.
This is because the number of nodes in the input layer is directly tied to the number of
samples of the seismic trace, and the number of nodes in the output layer is tied to
the number of samples of the well log data. For a typical BP ANN that has 50 nodes in its
hidden layer, each extra time sample adds 50 extra neural weights to be determined.
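The weight count behind this statement is simple arithmetic for a fully connected three-layer network. The helper below is a hypothetical illustration; the sizes of 80 input and 80 output samples are arbitrary example values, and bias terms are left out unless requested.

```python
def n_weights(n_in, n_hidden, n_out, biases=False):
    """Number of connection weights in a fully connected 3-layer network."""
    count = n_in * n_hidden + n_hidden * n_out
    if biases:
        count += n_hidden + n_out
    return count

n_hidden = 50
for n_samples in (80, 81):
    print(n_samples, n_weights(n_samples, n_hidden, n_out=80))
# 80 input samples -> 8000 weights; 81 input samples -> 8050 weights
```

Each additional input node connects to every hidden node, so the increment per extra time sample equals the hidden layer size.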
3.2 Serialized architecture
In the serialized architecture, each sample in the seismic trace is fed to the network in
sequence, and the output represents the instantaneous porosity at that time sample. All the
seismic samples of all traces are thus streamed through the ANN one by one in a
continuous sequence. This architecture is shown in Figure 4. It is too optimistic,
however, to expect such a simple ANN to be powerful enough to capture the intricacies
of anything but the simplest of cases, and indeed such an ANN performs rather poorly in
our data tests. The reason is that, compared to the parallelized architecture,
this simple serialized architecture not only has very little vertical spatial coverage of the
data, but also has no delineation of the data belonging to one trace from the next; thus
the correlation characteristics of the input, which are important for estimation, are not fully
Figure 4. Serialized architecture of input/output layer: (a) the simple one; (b) the improved one. Labels in the figure: Input Layer, Hidden Layer, Output Layer; Seismic Sample i-1, Seismic Sample i, Seismic Sample i+1; Porosity Sample i.
The following steps can be used to improve the serialized architecture:
1. Extending the vertical spatial coverage of each seismic vector by including some
of the samples before and after the current input sample.
2. Adding the time index to the input.
Hence, for the simplest case, four nodes are used in the input layer. The first receives the
seismic sample at time t_i, the second receives the seismic sample at time t_{i-1}, the third
receives the seismic sample at time t_{i+1}, and the fourth receives the index i itself. This
input is fed sequentially for each time sample t_i in the seismic traces. When t_i is the first
or the last sample in the trace, zero is used for the sample at t_{i-1} or t_{i+1},
respectively. A schematic of this architecture is shown in Figure 4(b).
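The four-node input described above can be assembled from a trace as follows. This is a minimal sketch: the function name is hypothetical, and the index is assumed to start at 0 for the first sample of the trace.

```python
import numpy as np

def serialized_inputs(trace):
    """Build one 4-element input vector per time sample of a seismic trace.

    Columns: sample at t_i, sample at t_{i-1}, sample at t_{i+1}, index i.
    Missing neighbours at the two ends of the trace are set to zero.
    """
    trace = np.asarray(trace, dtype=float)
    padded = np.concatenate(([0.0], trace, [0.0]))  # zero-pad both ends
    current = padded[1:-1]
    previous = padded[:-2]
    following = padded[2:]
    index = np.arange(trace.size, dtype=float)      # assumed to start at 0
    return np.column_stack([current, previous, following, index])

X = serialized_inputs([0.3, -0.1, 0.7, 0.2])
print(X)
```

Each row is one input pattern, so a trace with n samples yields n patterns for a network with only four input nodes, regardless of the trace length.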
In contrast to the simple one-input-node architecture of Figure 4(a), this improved serialized
configuration performed as well as the parallelized configuration in the tests. Even more
spatial correlation can be captured, enhancing the effectiveness of the architecture, by
adding more points on each side of the current input sample. In fact, as the
number of such points increases, the serialized architecture approaches the limit of the
parallelized one. However, the current input and output arrangement, with just one extra
sample on each side plus the time index, proved to perform satisfactorily for the purpose.
Furthermore, as the number of additional points is increased, edge inconsistencies due to
the zeros at the beginning and end of the seismic trace start to become significant.
The advantage of the serialized architecture is that it requires only small BP ANNs to handle
even long seismic traces or a large number of traces in the training data set. On the other
hand, the number of time samples for the input (seismic trace) and the output (porosity) has
to be the same. This limitation may be alleviated, however, by subsequent averaging of the
output if interval averages of porosity are desired. Table 1 summarizes the properties of
the parallelized and serialized architectures.
Table 1. Summary of the properties of the parallelized and serialized ANN input/output architectures.

| Parallelized architecture | Serialized architecture |
|---|---|
| Number of input samples need not equal number of output samples. | Number of input samples must equal number of output samples. |
| Number of neurons in input layer equals the number of samples in the input. | Number of neurons in input layer is independent of the number of samples in the input. |
| Produces few input vectors. | Produces numerous input vectors. |
| Produces large output vectors. | Produces small output vectors. |
| Requires large size of hidden layer. | Requires small size of hidden layer. |
3.3 Input and output slots
The input of the ANN was three consecutive seismic amplitudes and their sliding-window
index. (Note that the sliding-window index is not the same as depth, since the time
intervals for each well were adjusted to the actual depth of the horizon.) The output of the
ANN was the VSH well log value corresponding to the seismic amplitude in the
middle of the three inputs.
Figure 5. Principle of the sliding-window technique (window sliding in 2 ms steps). The window relates several seismic
samples simultaneously to a certain sand property (e.g. VSH). Correction: the label HVS in the figure should read VSH.
4. Data description and preprocessing
4.1 Seismic data
The raw seismic data are a random line transecting wells 7120/9-1&2 and 7121/7-1&2 from
survey st0306 covering the Snøhvit and Albatross Fields (see Figure 1). The data quality
is good in general. The random line exists as a near stack (0-30 degrees) and a far stack (40-
The raw seismic amplitudes were the final processed data samples of the reservoir
interval. The instantaneous amplitude attribute was calculated from the raw amplitudes.
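How the attribute was computed is not described here. A common definition, assumed in the sketch below, takes the instantaneous amplitude as the envelope (magnitude) of the analytic signal obtained through a Hilbert transform. The implementation uses only NumPy's FFT; the function name and the synthetic trace are hypothetical.

```python
import numpy as np

def instantaneous_amplitude(trace):
    """Envelope of a real trace via the FFT-based analytic signal."""
    x = np.asarray(trace, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)
    # Keep DC (and Nyquist for even n), double the positive frequencies,
    # and zero the negative frequencies.
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    analytic = np.fft.ifft(spectrum * h)
    return np.abs(analytic)

# Synthetic check: a pure cosine of amplitude 0.8 has a constant envelope of 0.8.
t = np.arange(256) / 256.0
trace = 0.8 * np.cos(2 * np.pi * 16 * t)
env = instantaneous_amplitude(trace)
print(env.min(), env.max())
```

On real traces the envelope varies slowly compared with the oscillating amplitude, which is why it is often used as a smoother input attribute.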
Figure 6. Random line transecting four wells on the Albatross structure.
Figure 7. Random line transecting four wells in the Albatross Field. The horizon
corresponding to the top of the Hammerfest reservoir is marked by the arrow.
4.2 Well log data
The general attributes of the well logs are listed below:
LFP_Vp_V: P-wave velocity (virgin)
LFP_Vs_V*: S-wave velocity (virgin)
LFP_RHOB_V: Density (virgin)
LFP_AI_V: Acoustic Impedance
LFP_EI_HILT_V: Elastic Impedance
LFP_GR: Gamma ray
LFP_NPHI: Neutron Porosity
LFP_PHIT: Total Porosity
LFP_SGT: Gas saturation
LFP_SOT: Oil Saturation
LFP_SWT: Water Saturation
LFP_VpVs: Vp/Vs ratio
LFP_VSH: Shale content
VSH well logs from the Statoil OpenWorks database were used. VSH tops were
carefully picked to correspond to the interpreted horizon. The logs were converted
from depth to time and resampled to the sampling interval of the seismic data.
In the tests three well logs were used: two provided the training data
and one provided the measured well log data with which the prediction was
A short time period was chosen to avoid the uniqueness problem. A three-sample-wide
window was slid within the black rectangle shown in the figure below (Figure 8).
Figure 8. Seismic trace and selected short time period.
For the three wells used in the experiment, the depth of the horizon to which
the black rectangle window was attached differs. The difference can be seen in the
figure below (Figure 9).
Figure 9. The different horizons of wells 72, 71 and 91.
Training the network with such data yielded acceptable results, as shown in the figures below.
Figure 10. Prediction result for well 71.
Figure 11. Prediction result for well 72.
Figure 12. Prediction result for well 91.
In the above experiments the time interval spanned 160 ms. In another experiment we
expanded the interval to see whether the results would become worse, due to the probable
occurrence of the uniqueness problem. The results are shown below.
Figure 13. Expanded interval results.
The prediction has clearly become much less precise, although it still follows the
trend of the real shale content.
The experiments presented in this document show that using an ANN for well log
prediction is a reasonable choice. However, the way the ANN is used has to be designed
carefully, which necessarily requires a basic understanding of the data and of the phenomena
to which the ANN is applied.
This study shows that a neural network approach can be a valuable technique for
predicting shale content in a short interval in the zone of interest.
Compared to conventional methods, artificial neural networks have several advantages,
while many of their limitations are the same as those of other mathematical methods.
The inclusion of the time index (depth index) in the input data is shown to be essential for
the ANN to produce realistic predictions. Test results show that misfit errors are
In order to use ANNs successfully in seismic analysis, it is crucial to go through these steps:
Define exactly the problem to be solved
Understand the dependencies between parameters
Establish the existence of the functions
Design the training data (source, guess, etc.)
Design the ANN
The first three points have to be addressed together with experts in geology and in other
mathematical models. Following these steps minimizes the risk of failure when the final
application is tested. However, it is not obvious how these steps can be carried out
successfully by geologists, mathematicians or neural network experts. These points should
thus be the basis of further research.
In this study, we used only a small part of the data for training and validation (3 well logs
and 3 seismic traces). Further research needs to extend the method to the whole field, which
will help to make more accurate predictions.
In principle, the method used in this study generalizes to all attributes, especially
to porosity and permeability. We have tested the mapping between the seismic trace and
the well log LFP_PHIT (total porosity), but the results were not satisfactory. We attribute
this failure to the following:
1. Not enough test data. (We used only 3-4 pairs of seismic data and well logs: 2
or 3 were used for training and 1 for validation.)
2. The input design may not be correct. The structure of the inputs plays an important
role in reservoir characterization problems. It is not as simple as other engineering
3. Window slots also need to be considered carefully.
Successful prediction of porosity and permeability has been reported in the
literature [Saggaf et al., 2003]. We may carry out more tests for this purpose.
Another future test may characterize the properties of the reservoir without well logs.
This method does not predict particular parameters but instead performs an unbiased
(i.e. minimal user input) analysis resulting in groups or clusters of similar data vectors
(e.g. seismic traces). This clustering is easily turned into maps that can then be
compared with conventional interpretation maps or parameter predictions from
unsupervised ANN approaches. In addition, the resulting weights from the classification
approach are easily interpreted as prototype data vectors for the resulting clusters. We
have investigated the model of supervised ANN, structure of a slab of seismic traces and