Seismic facies classification using different deep convolutional neural networks

Tao Zhao, Geophysical Insights
SUMMARY
Convolutional neural networks (CNNs) are a type of
supervised learning technique that can be applied directly to
amplitude data for seismic data classification. The high
flexibility in CNN architecture enables researchers to design
different models for specific problems. In this study, I
introduce an encoder-decoder CNN model for seismic facies
classification, which classifies all samples in a seismic line
simultaneously and provides superior seismic facies quality
compared with traditional patch-based CNN methods. I
compare the encoder-decoder model with a traditional
patch-based model to assess the usability of both CNN
architectures.
INTRODUCTION
With the rapid development of GPU computing and the success
obtained in the computer vision domain, deep learning
techniques, represented by convolutional neural networks
(CNNs), have started to attract seismic interpreters to
supervised seismic facies classification. A comprehensive
review of deep learning techniques is provided in LeCun et
al. (2015). Although still in its infancy, CNN-based seismic
classification has been successfully applied to both prestack
(Araya-Polo et al., 2017) and poststack (Waldeland and
Solberg, 2017; Huang et al., 2017; Lewis and Vigh, 2017)
data for fault and salt interpretation, for identifying different
wave characteristics (Serfaty et al., 2017), and for
estimating velocity models (Araya-Polo et al., 2018).
The main advantages of CNNs over other supervised
classification methods are their spatial awareness and
automatic feature extraction. For image classification
problems, rather than using the intensity value at each pixel
individually, a CNN analyzes the patterns among pixels in an
image and automatically generates features (in seismic data,
attributes) suitable for classification. Because seismic data
are 3D tomographic images, we would expect CNNs to be
naturally adaptable to seismic data classification. However,
seismic classification has some distinct characteristics that
make it more challenging than other image classification
problems. First, classical image
classification aims at distinguishing different images, whereas
seismic classification aims at distinguishing different
geological objects within the same image. Therefore, from
an image processing point of view, seismic classification is
in fact a segmentation problem rather than a classification
problem (partitioning an image into blocky pixel shapes with a
coarser set of colors). Second, training data for
seismic classification are much sparser than for classical
image classification problems, for which massive data sets are
publicly available. Third, in seismic data, all features are
represented by different patterns of reflectors, and the
boundaries between different features are rarely explicitly
defined. In contrast, features in an image from computer
artwork or photography are usually well defined. Finally,
because of the uncertainty in seismic data and the nature of
manual interpretation, the training data in seismic
classification are always contaminated by noise.
To address the first challenge, most, if not all,
studies on CNN-based seismic facies classification published
to date perform classification on small patches of data
to infer the class label of the seismic sample at the patch
center. In this fashion, seismic facies classification is done
by traversing patches centered at every sample in a
seismic volume. An alternative approach, although less
discussed, is to use CNN models designed for image
segmentation tasks (Long et al., 2015; Badrinarayanan et al.,
2017; Chen et al., 2018) to obtain sample-level labels for a
whole 2D profile (e.g., an inline) simultaneously, and then to
traverse all 2D profiles in a volume.
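The patch-centered traversal described above can be illustrated with a minimal sketch. It is not the author's implementation: it uses a 2D section instead of 3D patches for brevity, zero-pads at the edges (an assumption), and replaces the trained CNN with a hypothetical toy rule, `toy_classifier`.

```python
def extract_patch(section, row, col, half):
    """Return the (2*half+1) x (2*half+1) patch centered at (row, col).

    Samples outside the section are zero-padded (an assumption; skipping
    edge samples is the other common choice).
    """
    n_rows, n_cols = len(section), len(section[0])
    patch = []
    for r in range(row - half, row + half + 1):
        patch_row = []
        for c in range(col - half, col + half + 1):
            inside = 0 <= r < n_rows and 0 <= c < n_cols
            patch_row.append(section[r][c] if inside else 0.0)
        patch.append(patch_row)
    return patch


def classify_section(section, classify_patch, half):
    """Label every sample by classifying the patch centered on it."""
    return [
        [classify_patch(extract_patch(section, r, c, half))
         for c in range(len(section[0]))]
        for r in range(len(section))
    ]


def toy_classifier(patch):
    """Hypothetical stand-in for a trained CNN: 1 if the patch mean is positive."""
    values = [v for patch_row in patch for v in patch_row]
    return 1 if sum(values) / len(values) > 0 else 0


# Toy amplitude section: negative on the left, positive on the right.
toy_section = [
    [-1.0, -1.0, 2.0, 2.0],
    [-1.0, -1.0, 2.0, 2.0],
    [-1.0, -1.0, 2.0, 2.0],
]
labels = classify_section(toy_section, toy_classifier, half=1)
for label_row in labels:
    print(label_row)
```

Each sample triggers one classifier call on its own patch, so neighboring patches overlap heavily.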
In this study, I use an encoder-decoder CNN model as an
implementation of the aforementioned second approach. I
apply both the encoder-decoder model and patch-based
model to seismic facies classification using data from the
North Sea, with the objective of demonstrating the strengths
and weaknesses of the two CNN models. I conclude that the
encoder-decoder model provides much better classification
quality, whereas the patch-based model is more flexible on
training data, possibly making it easier to use in production.
THE TWO CNN MODELS
Patch-based model
A basic patch-based model consists of several convolutional
layers, pooling (downsampling) layers, and fully-connected
layers. For an input image (for seismic data, amplitudes in a
small 3D window), a CNN model first automatically extracts
several high-level abstractions of the image (similar to
seismic attributes) using the convolutional and pooling
layers, then classifies the extracted attributes using the fully-
connected layers, which are similar to traditional multilayer
perceptron networks. The output from the network is a single
value representing the facies label of the seismic sample at
the center of the input patch. An example of patch-based
model architecture is provided in Figure 1a. In this example,
the network is employed to classify salt versus non-salt from
seismic amplitude in the SEAM synthetic data (Fehler and
Larner, 2008). One input instance is a small patch of data
bounded by the red box, and the corresponding output is a
class label for this whole patch, which is then assigned to the
sample at the patch center. The sample marked as the red dot
is classified as non-salt.
10.1190/segam2018-2997085.1
© 2018 SEG
SEG International Exposition and 88th Annual Meeting
Encoder-decoder model
The encoder-decoder is a popular network structure for
image segmentation tasks. Encoder-decoder models share a
common idea: first extract high-level abstractions
of the input images using convolutional layers, then recover
sample-level class labels by “deconvolution” operations.
Chen et al. (2018) introduce a current state-of-the-art
encoder-decoder model and concisely review some
popular predecessors. An example of an encoder-decoder
model architecture is provided in Figure 1b. As in the
patch-based example, this encoder-decoder network is
employed to classify salt versus non-salt from seismic
amplitude in the SEAM synthetic data. Unlike the
patch-based network, in the encoder-decoder network one input
instance is a whole line of seismic amplitude, and the
corresponding output is a whole line of class labels, which
has the same dimensions as the input data. In this case, all
samples in the middle of the line are classified as salt
(marked in red), and other samples are classified as non-salt
(marked in white), with minimal error.
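The same-size input and output can be made concrete with a short shape-bookkeeping sketch. It assumes each encoder level halves both dimensions and each decoder level doubles them; the number of levels and the line dimensions are hypothetical, and the text does not state the layer configuration actually used.

```python
def encoder_decoder_shapes(height, width, n_levels):
    """Track feature-map (height, width) through a toy encoder-decoder.

    Assumes each encoder level halves both dimensions (2x2 pooling) and
    each decoder level doubles them (2x upsampling or "deconvolution").
    """
    factor = 2 ** n_levels
    if height % factor or width % factor:
        # In this simplified scheme the line must be padded or cropped first.
        raise ValueError(f"dimensions must be multiples of {factor}")
    shapes = [(height, width)]
    for _ in range(n_levels):  # encoder: downsample
        height, width = height // 2, width // 2
        shapes.append((height, width))
    for _ in range(n_levels):  # decoder: upsample
        height, width = height * 2, width * 2
        shapes.append((height, width))
    return shapes


# Hypothetical line of 256 time samples by 512 traces, three levels.
shapes = encoder_decoder_shapes(height=256, width=512, n_levels=3)
print(shapes)
```

The first and last entries match, which is what allows one label per input sample; the middle entry is the coarsest representation the encoder produces.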
APPLICATION OF THE TWO CNN MODELS
For demonstration purposes, I use the F3 seismic survey
acquired in the North Sea, offshore Netherlands, which is
freely accessible to the geoscience research community. In
this study, I am interested in automatically extracting seismic
facies that have specific seismic amplitude patterns. To
avoid potential disagreement on the geological
meaning of the facies to extract, I name the facies purely
on the basis of their reflection characteristics. Table 1 provides a
list of the extracted facies. There are eight seismic facies with
distinct amplitude patterns; another facies (“everything else”)
is used for samples not belonging to the eight target facies.
Facies number   Facies name
1               Varies amplitude steeply dipping
2               Random
3               Low coherence
4               Low amplitude deformed
5
6               High amplitude deformed
7               Moderate amplitude continuous
8               Chaotic
0               Everything else
Table 1. Seismic facies extracted in the study.
To generate training data for the seismic facies listed above,
different picking scenarios are employed to accommodate
the different input data formats required by the two CNN
models (small 3D patches versus whole 2D lines). For the
patch-based model, 3D patches of seismic amplitude data are
extracted around seed points within user-defined
polygons. Approximately 400,000 3D patches of
size 65×65×65 are generated for the patch-based model, which
is a reasonable amount for seismic data of this size. Figure
2a shows an example line on which seed point locations are
defined in the co-rendered polygons.
The encoder-decoder model requires much more effort for
generating labeled data. I manually interpret the target facies
on 40 inlines across the seismic survey and use these for
building the network. Although the total number of seismic
samples in 40 lines is enormous, the encoder-decoder
model considers them as only 40 input instances, which is
in fact a very small training set for a CNN. Figure 2b
shows an interpreted line that is used in training the
network.
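The contrast between the two training sets can be put into numbers with a small calculation. The patch size, patch count, and number of lines come from the text; the line dimensions (`SAMPLES_PER_TRACE`, `TRACES_PER_LINE`) are hypothetical values chosen only for illustration.

```python
# Figures quoted in the text.
PATCH_SIZE = 65          # samples along each patch dimension
N_PATCHES = 400_000      # approximate number of training patches
N_LABELED_LINES = 40     # manually interpreted inlines

# Hypothetical line dimensions, for illustration only.
SAMPLES_PER_TRACE = 400
TRACES_PER_LINE = 900

# Amplitude values contained in one 65 x 65 x 65 patch.
samples_per_patch = PATCH_SIZE ** 3

# Labeled samples contained in the interpreted lines.
labeled_samples = N_LABELED_LINES * SAMPLES_PER_TRACE * TRACES_PER_LINE

# Training instances seen by the patch-based model for every
# instance seen by the encoder-decoder model.
instance_ratio = N_PATCHES // N_LABELED_LINES

print(f"amplitude values per patch: {samples_per_patch:,}")
print(f"labeled samples in {N_LABELED_LINES} lines: {labeled_samples:,}")
print(f"patch instances per line instance: {instance_ratio:,}")
```

Under these assumed dimensions the 40 lines hold millions of labeled samples, yet the network sees only 40 training instances, against roughly 400,000 for the patch-based model.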
Figure 1. Sketches for CNN architecture of a) 2D
patch-based model and b) encoder-decoder model.
In the 2D patch-based model, each input data
instance is a small 2D patch of seismic amplitude
centered at the sample to be classified. The
corresponding output is then a class label for the
whole 2D patch (in this case, non-salt), which is
usually assigned to the sample at the center. In the
encoder-decoder model, each input data instance
is a whole inline (or crossline/time slice) of
seismic amplitude. The corresponding output is a
whole line of class labels, so that each sample is
assigned a label (in this case, some samples are salt
and others are non-salt). Different types of layers
are denoted in different colors, with layer types
marked at their first appearance in the network.
The size of the cuboids approximately represents
the output size of each layer.
In both tests, I randomly use 90% of the generated training
data to train the network and use the remaining 10% for
testing. On an Nvidia Quadro M5000 GPU with 8GB
memory, the patch-based model takes about 30 minutes to
converge, whereas the encoder-decoder model needs about
500 minutes. Besides training faster, the patch-based
model also has a higher test accuracy, at almost 100%
(99.9988%, to be exact), versus 94.1% for the
encoder-decoder model. However, this accuracy measurement
can be misleading. For a patch-based model, when
picking the training and testing data, interpreters usually
pick the most representative samples of each facies, for
which they have the most confidence. This results in
high-quality training (and testing) data that are less noisy, and
most of the ambiguous samples that are challenging for
the classifier are excluded from testing. In contrast, to use an
encoder-decoder model, interpreters have to interpret all the
target facies in a training line. For example, if the target is
faults, one needs to pick all faults in a training line; otherwise,
unlabeled faults will be treated as “non-fault” and
confuse the classifier. Therefore, interpreters have to make
some less confident interpretations when generating
training and testing data. Figures 2c and 2d show the seismic
facies predicted by the two CNN models on the same line
shown in Figures 2a and 2b. We observe better-defined facies
from the encoder-decoder model than from the
patch-based model.
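A random 90/10 split and a test accuracy of the kind quoted above can be sketched as follows. This is an illustrative outline only, not the author's code: the helper names, the fixed seed, and the toy labels are assumptions.

```python
import random


def split_train_test(instances, test_fraction=0.1, seed=0):
    """Randomly hold out a fraction of the instances for testing."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(true_labels, predicted_labels):
    """Fraction of samples whose predicted label matches the picked label."""
    matches = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return matches / len(true_labels)


# 40 interpreted lines, identified here simply by index.
train_lines, test_lines = split_train_test(range(40))
print(len(train_lines), len(test_lines))

# Toy labels: three of four test samples are predicted correctly.
print(accuracy([0, 1, 1, 8], [0, 1, 2, 8]))
```

Accuracy computed this way only reflects the samples that were picked, which is why confidently picked patch centers and fully interpreted lines are not directly comparable.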
Figure 3 shows prediction results from the two networks on
a line away from the training lines, and Figure 4 shows
prediction results from the two networks on a crossline.
As on the training line, the encoder-decoder model, compared
with the patch-based model, provides facies as cleaner
geobodies that require
much less post-editing for regional stratigraphic
classification (Figure 5). This can be attributed to the
ability of the encoder-decoder model to capture the large-scale
spatial arrangement of facies, whereas the patch-based
model only senses patterns in small 3D windows. To form
such windows, the patch-based model also needs to pad or
simply skip samples close to the edges of a 3D seismic
volume. Moreover, although training is much faster for the
patch-based model, its prediction stage is very
computationally intensive, because it processes N×N×N times
as much data as the original seismic volume (N is the patch
size along each dimension). In this study, the patch-based
method takes about 400 seconds to predict a line, compared
with less than 1 second for the encoder-decoder model.
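The N×N×N factor can be checked with a short calculation. The patch size of 65 and the per-line timings come from the text; the function name and the assumption of one full patch per labeled sample (with padding at the edges) are illustrative.

```python
def patch_prediction_workload(n_samples, patch_size, n_dims=3):
    """Amplitude values read by a patch-based model to label n_samples.

    Assumes one full patch per labeled sample, with padding at the edges.
    """
    return n_samples * patch_size ** n_dims


N = 65  # patch size along each dimension, from the text

# Values read per labeled sample, i.e. the N x N x N overhead factor.
overhead = patch_prediction_workload(1, N)
print(f"values read per labeled sample: {overhead:,}")

# Reported prediction times per line: about 400 s versus under 1 s,
# so the ratio below is a lower bound on the speed-up.
min_speedup = 400 / 1
print(f"speed-up of at least {min_speedup:.0f}x per line")
```

The measured speed-up is far smaller than the raw overhead factor, which is unsurprising since the two models differ in more than the amount of data they read.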
CONCLUSION
In this study, I compared two types of CNN models in the
application of seismic facies classification. The more
commonly used patch-based model requires much less effort
in generating labeled data, but its classification result is
suboptimal compared with that of the encoder-decoder model, and
its prediction stage can be very time consuming. The
encoder-decoder model generates a superior classification
result at near real-time speed, at the expense of more tedious
labeled data picking and a longer training time.
ACKNOWLEDGEMENT
The author thanks Geophysical Insights for permission
to publish this work, dGB Earth Sciences for
providing the F3 North Sea seismic data to the public, and
ConocoPhillips for sharing the MalenoV project for public
use, which was referenced when generating the training data.
The CNN models discussed in this study are implemented in
TensorFlow, an open source library from Google.
Figure 2. Seismic amplitude co-rendered with the
training data picked on inline 340 for a) the patch-based
model and b) the encoder-decoder model, and the prediction results
from c) the patch-based model and d) the encoder-decoder
model. Target facies are colored from colder to warmer colors
in the order shown in Table 1. Compare Facies 5, 6 and 8.
Figure 3. Prediction results from the two networks on a line away from the training lines. a) Predicted facies from the patch-based
model. b) Predicted facies from the encoder-decoder based model. Target facies are colored in colder to warmer colors in the order
shown in Table 1. The yellow dotted line marks the location of the crossline shown in Figure 4. Compare Facies 1, 5 and 8.
Figure 4. Prediction results from the two networks on a crossline. a) Predicted facies from the patch-based model. b) Predicted
facies from the encoder-decoder model. Target facies are colored in colder to warmer colors in the order shown in Table 1. The
yellow dotted lines mark the locations of the inlines shown in Figures 2 and 3. Compare Facies 5 and 8.
Figure 5. Volumetric display of the predicted facies from the encoder-decoder model. The facies volume is visually cropped for
display purpose. An inline and a crossline of seismic amplitude co-rendered with predicted facies are also displayed to show a
broader distribution of the facies. Target facies are colored in colder to warmer colors in the order shown in Table 1.
REFERENCES
Araya-Polo, M., T. Dahlke, C. Frogner, C. Zhang, T. Poggio, and D. Hohl, 2017, Automated fault detection without seismic processing: The Leading Edge, 36, 208-214, https://doi.org/10.1190/tle36030208.1.
Araya-Polo, M., J. Jennings, A. Adler, and T. Dahlke, 2018, Deep-learning tomography: The Leading Edge, 37, 58-66, https://doi.org/10.1190/tle37010058.1.
Badrinarayanan, V., A. Kendall, and R. Cipolla, 2017, SegNet: A deep convolutional encoder-decoder architecture for image segmentation: IEEE Transactions on Pattern Analysis and Machine Intelligence, 39, 2481-2495, https://doi.org/10.1109/tpami.2016.2644615.
Chen, L. C., G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, 2018, DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs: IEEE Transactions on Pattern Analysis and Machine Intelligence, 40, 834-848, https://doi.org/10.1109/tpami.2017.2699184.
Chen, L. C., Y. Zhu, G. Papandreou, F. Schroff, and H. Adam, 2018, Encoder-decoder with atrous separable convolution for semantic image segmentation: arXiv preprint, arXiv:1802.02611v2.
Fehler, M., and K. Larner, 2008, SEG advanced modeling (SEAM): Phase I first year update: The Leading Edge, 27, 1006-1007, https://doi.org/10.1190/1.2967551.
Huang, L., X. Dong, and T. E. Clee, 2017, A scalable deep learning platform for identifying geologic features from seismic attributes: The Leading Edge, 36, 249-256, https://doi.org/10.1190/tle36030249.1.
LeCun, Y., Y. Bengio, and G. Hinton, 2015, Deep learning: Nature, 521, 436-444, https://doi.org/10.1038/nature14539.
Lewis, W., and D. Vigh, 2017, Deep learning prior models from seismic images for full-waveform inversion: 87th Annual International Meeting, SEG, Expanded Abstracts, 1512-1517, https://doi.org/10.1190/segam2017-17627643.1.
Long, J., E. Shelhamer, and T. Darrell, 2015, Fully convolutional networks for semantic segmentation: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 3431-3440.
Serfaty, Y., L. Itan, D. Chase, and Z. Koren, 2017, Wavefield separation via principle component analysis and deep learning in the local angle domain: 87th Annual International Meeting, SEG, Expanded Abstracts, 991-995, https://doi.org/10.1190/segam2017-17676855.1.
Waldeland, A. U., and A. H. S. S. Solberg, 2017, Salt classification using deep learning: 79th Annual International Conference and Exhibition, EAGE, Extended Abstracts, Tu-B4-12, https://doi.org/10.3997/2214-4609.201700918.
... Domain adaptation (DA) techniques have emerged as a promising avenue to tackle the issue of domain shift 16 . Feature-level adaptation techniques have gained significant attention, with methods such as Maximum Mean Discrepancy (MMD) and adversarial training, demonstrating their ability to align feature distributions 35 . ...
Article
Full-text available
Medical image analysis, empowered by artificial intelligence (AI), plays a crucial role in modern healthcare diagnostics. However, the effectiveness of machine learning models hinges on their ability to generalize to diverse patient populations, presenting domain shift challenges. This study explores the domain shift problem in chest X-ray classification, focusing on cross-population variations, especially in underrepresented groups. We analyze the impact of domain shifts across three population datasets acting as sources using a Nigerian chest X-ray dataset acting as the target. Model performance is evaluated to assess disparities between source and target populations, revealing large discrepancies when the models trained on a source were applied to the target domain. To address with the evident domain shift among the populations, we propose a supervised adversarial domain adaptation (ADA) technique. The feature extractor is first trained on the source domain using a supervised loss function in ADA. The feature extractor is then frozen, and an adversarial domain discriminator is introduced to distinguish between the source and target domains. Adversarial training fine-tunes the feature extractor, making features from both domains indistinguishable, thereby creating domain-invariant features. The technique was evaluated on the Nigerian dataset, showing significant improvements in chest X-ray classification performance. The proposed model achieved a 90.08% accuracy and a 96% AUC score, outperforming existing approaches such as multi-task learning (MTL) and continual learning (CL). This research highlights the importance of developing domain-aware models in AI-driven healthcare, offering a solution to cross-population domain shift challenges in medical imaging.
... In seismic facies classification, machine learning methods provide several advantages over traditional approaches. They can handle large volumes of seismic data efficiently, allowing for rapid analysis and interpretation [8]. Less than a decade ago, significant research focused on applying machine learning (ML) techniques to reduce risks in drilling and oil exploration operations. ...
Article
Full-text available
This paper integrates supervised Bayesian classification with a conjugate-based artificial neural network (ANN) to classify seismic data into four lithofacies defined at well locations. This study focuses on the Miocene Ghar Formation, which serve as a highly productive reservoir in the northwest Persian Gulf and the Mesopotamian Basin. Following well log corrections and shear wave velocity (Vs) prediction through rock physics modeling, an adaptive rock physics template (AI vs. SQp), was generated to enhance the accuracy of facies discrimination in well log data. Based on the observed lithology and petrophysical evaluation, four facies were identified: shale, carbonate, high-porosity sand, and low-porosity sand. Following pre-stack simultaneous seismic inversion, an unsupervised facies classification was conducted to validate the capability of the available seismic data, along with its quantitative interpretation results, in effectively classifying facies. AI and SQp were utilized as the primary inputs to facies discrimination in the Ghar Reservoir, while the Probability density function (PDF) results served as auxiliary inputs in the conjugate-based artificial neural network analysis. This study presents a novel integration of Bayesian classification and artificial neural networks (ANN), resulting in a 3D lithofacies distribution model along with associated probabilities, offering new insights into reservoir quality and spatial distribution. This combined approach not only enhances the accuracy of predictions by leveraging the strengths of both methods but also addresses challenges such as uncertainty quantification and the interpretability of ANN models. The results indicate that high-porosity sand facies are concentrated at the crest of the structure, with the probability data and geological interpretation supporting this finding.
... CNN be able to make a characteristic model with very nonlinear of data with ability to easily utilize large amounts of training material, and generalize well to invisible samples. CNN [6], multiclass structure element classification [7], salt detection [8], and seismic facies classification [9], [10], [11]. ...
Article
Full-text available
Convolutional Neural Network (CNN) method is one of a machine learning algorithm that is adapted from the the way of human brain works and is used to facilitate the processing of large amounts of data. The CNN method predicts the lithology of a well with well log data, there are Gamma-Ray Logs, Density Logs, Neutron Porosity Logs, and DT Logs. The CNN model is formed by several layers, such as 2 Convolutional Layers, 2 Dense Layers, 0.5 Dropout Layers, and 256 Dense Nodes. Contrary outcomes are observed in predictions that were previously generated utilizing 10 input parameters from Poseidon2, Poseidon North, Well Proteus1, and Poseidon1 data. The lithology frequency of Claystone and Siltstone is the dominant type based on the data from Poseidon2, Poseidon North, and Proteus1 Wells. The Poseidon1 Well data mostly consists of Claystone lithological frequency. The trained model utilizes data from the Poseidon2, Poseidon North, and Proteus1 wells, which exhibit lithology distinct from that of the Poseidon1 Wells. Consequently, the model is limited to predicting lithology based solely on log types that align with the training data.. The amount of 3730 initial data from the Poseidon2, Poseidon North, and Proteus1 wells were entered and trained which later became the CNN model and then tested the 820 new data. The accuracy of the lithology prediction was done by comparing the actual facies with the predicted facies which had an accuracy of 87.9% at the Poseidon2, Poseidon North, and Proteus1 wells. The resulting RMSE value is 1.4. As a result, the CNN method can predict the lithology according to the original facies and identify the lithology of the Poseidon1 well as a target at a depth of 4620 m to 4660 m with Claystone lithology. CNN method can also predict thin layers such as Ferruginous Calcilutite and Altered Volcanic. The input section of the CNN method is required to use well report data that matches the depth data.
... In recent years, many methods have been proposed for automatic seismic facies classification by using supervised, semi-supervised, and unsupervised learning. Supervised learning methods (Wrona et al., 2018;Zhao, 2018;Liu et al., 2018;Zhang et al., 2021) first use large amounts of labeled data to train a convolutional neural network (CNN) model and then use the trained model for automatic seismic facies classification. Semi-supervised learning methods (Qi et al., 2016;Dunham et al., 2020;Liu et al., 2020) use both labeled and unlabeled data to train the network to learn the features and distributions characterizing seismic facies. ...
Article
Full-text available
Seismic facies classification is crucial for seismic stratigraphic interpretation and hydrocarbon reservoir characterization but remains a tedious and time-consuming task that requires significant manual effort. Data-driven deep-learning approaches are highly promising for automating the seismic facies classification with high efficiency and accuracy, as they have already achieved significant success in similar image classification tasks within the field of computer vision (CV). However, unlike the CV domain, the field of seismic exploration lacks a comprehensive benchmark dataset for seismic facies, severely limiting the development, application, and evaluation of deep-learning approaches in seismic facies classification. To address this gap, we propose a comprehensive workflow to construct a massive-scale benchmark dataset of seismic facies and evaluate its effectiveness in training a deep-learning model. Specifically, we first develop a knowledge graph of seismic facies based on geological concepts and seismic reflection configurations. Guided by the graph, we then implement the three strategies of field seismic data curation, knowledge-guided synthesization, and generative adversarial network (GAN)-based generation to construct a benchmark dataset of 8000 diverse samples for five common seismic facies. Finally, we use the benchmark dataset to train a network and then apply it to two 3-D seismic data for automatic seismic facies classification. The predictions are highly consistent with expert interpretation results, demonstrating that the diversity and representativeness of our benchmark dataset are sufficient to train a network that can be generalized well in seismic facies classification across field data. We have made this dataset (10.5281/zenodo.10777460, ), the trained model, and the associated codes (10.5281/zenodo.13150879, ) publicly available for further research and validation of intelligent seismic facies classification.
... Typically, the entire seismic volumes need to be cropped into patched samples by a sliding window in order to meet the input requirement of the original CNNs. Researchers commonly assign the desired property of the geometric central measurement point of each sample as its label during the training process of these "patched" models [12]. Hence, the model tends to discard the positional and ordinal details of samples as it extracts the hidden information from every single one independently. ...
Article
Full-text available
Seismic facies interpretation plays a vital role in oil and gas exploration and production. However, traditional methods, such as trace inversion and manual interpretation, are often time-consuming and labor-intensive. In recent years, deep learning algorithms have emerged as promising and efficient tools for facies identification with 3D seismic data. As a rapidly developing field, deep learning models with various network structures rise up all the time. Some of them are employed by researchers in the case studies of facies interpretation and are claimed as the better methods. However, the influence of the input features especially their inherent data structure, has attracted few discussions so far. Furthermore, most current studies using artificial intelligence for seismic interpretation primarily rely on two major branches of deep learning algorithms: convolutional neural networks (CNNs) which are skilled in capturing spatial patterns, and recurrent neural networks (RNNs) which are effective at modeling temporal dependencies. As a result, these networks and their variants fail to simultaneously leverage both spatial and temporal coupling of the multidimensional data. In this paper, we replace the matrix multiplications inside the memory cell of the general long short-term memory unit with a convolution operation, which is a basic module of the deep learning framework, to attach the capability of capturing the spatial dependencies with temporal dynamic behavior to the recurrent architecture. A patched deep learning model based on this theoretically rational and programming feasible RNN variant is implemented in three experiments of the 3D seismic facies interpretation. Our study firstly highlights the importance of the input seismic attributes in providing valuable information for making accurate predictions. 
The results from the first experiment demonstrate that the selection of seismic attributes based on their correlation with the interpretation target greatly enhances the model performance. Furthermore, by comparing the predictions from the proposed model with the ones from the model that just utilizes the spatial dependencies, our study emphasizes the significance of incorporating spatio-temporal dependencies within the chosen seismic attributes during the interpretation, as it leads to improved predictions, especially in boundary detection. Last but not least, our experiments demonstrate that the contribution of spatial dependencies to 3D seismic interpretation diminishes as the spatial distance increases. Therefore, selectively augmenting the training data with samples that have weaker spatial correlations can significantly enhance the model’s performance. Based on our findings, we prefer to conclude that interpreters that consider spatio-temporal dependencies inside the full covering optimized attributes can improve the quality of 3D seismic facies interpretation. This conclusion can serve as an outline for the workflow of Deep Learning-assisted 3D seismic interpretation.
... This high computational cost restricts exploration to macro-scale velocities and hinders the acquisition of finer details from high-resolution images. However, the field of seismic interpretation has seen significant advancements with the integration of deep learning techniques [34,26]. ...
Article
Seismic interpretation is a crucial task in geophysics, requiring accurate prediction of subsurface layer thickness and seismic wave velocity. Traditional methods are computationally intensive and often hindered by noise in seismic data. Deep learning offers a promising solution to analyze complex geographical structures, but its computational complexity can be a barrier for deployment. This study introduces a deep learning-based approach enhanced by knowledge distillation to predict subsurface layer thickness and seismic wave velocity. By leveraging pre-trained convolutional neural networks (CNNs) and Transformers as teacher models, we transfer knowledge to smaller, more efficient student models through multiple knowledge distillation strategies, including cross-architecture, multi-teacher, and self-distillation. Implementing knowledge distillation from a Vision Transformer (ViT) teacher to CNN student models enhances the performance of student models in depth and velocity predictions compared to baseline CNN models without knowledge distillation. Aggregating knowledge from multiple teacher models improves model generalization and reduces overfitting, as evidenced by a decreased gap between training and validation losses. Self-distillation significantly enhanced the performance of simpler architectures like VGG19, achieving over 95% accuracy in several depth and velocity prediction tasks. Our approach was validated using simulated velocity model data, demonstrating that distilling knowledge from deeper, complex models into smaller ones maintains high prediction accuracy while reducing computational requirements. These findings emphasize the efficiency of knowledge distillation in deploying deep learning models in resource-constrained environments. 
Integrating deep learning with knowledge distillation techniques not only enhances the accuracy of seismic interpretation models but also makes them more feasible for practical applications, offering a powerful tool for seismic exploration and potentially transforming how seismic data is processed for subsurface characterization and accelerating decision-making in exploration and production.
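The teacher-to-student transfer described in the abstract above can be illustrated with a temperature-softened distillation loss. This is a minimal sketch of the common soft-target formulation, not the loss used by the cited study, which the abstract does not specify; the function names, the example logits, and the temperature value are assumptions chosen for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over a list of logits, softened by dividing by the temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence from the softened teacher distribution to the softened
    student distribution, scaled by temperature squared (a common convention)."""
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(teacher, student))
    return temperature ** 2 * kl


# Hypothetical logits for a three-class prediction.
teacher_logits = [4.0, 1.0, 0.5]
matching_loss = distillation_loss(teacher_logits, teacher_logits)
mismatch_loss = distillation_loss([0.5, 1.0, 4.0], teacher_logits)
```

The loss is zero when the student reproduces the teacher's logits and grows as the two softened distributions diverge. In practice this term is usually combined with an ordinary supervised loss on the labels.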
Article
Deep learning (DL) has been widely used to enhance the efficiency and accuracy of seismic facies classification. However, most DL-based seismic facies classification methods rely only on seismic amplitudes and require substantial labeled training data. We propose the Unconformity-enhanced Hierarchical Context Fusion Network (UHCFNet), which uses a seismic unconformity attribute to increase the accuracy of seismic facies classification and reduce the amount of data required for model training. We first calculate the seismic unconformity attribute to highlight potential seismic facies boundaries. We then propose the UHCFNet by combining the Hierarchical Context Fusion Network (HCFNet) with an unconformity guiding branch to integrate the unconformity attribute as an additional constraint. The unconformity guiding branch incorporates the unconformity-aware Transformer block (UTB) to extract unconformity features. Next, we validate the performance of our proposed UHCFNet by applying it and two baseline DL models to the Netherlands F3 field data. Quantitative and qualitative comparisons illustrate that our proposed model provides more accurate seismic facies classification results than baseline DL models, especially for regions with complicated structures and minority facies with fewer seismic samples. Moreover, to further demonstrate the robustness of our UHCFNet, we train the proposed model and baseline DL models using only 5% of the original training dataset. Comparisons between different models indicate that our proposed UHCFNet has a lower dependency on large datasets, as the UHCFNet still provides accurate seismic facies classifications for regions with complicated structures with limited training data.
Article
Velocity-model building is a key step in hydrocarbon exploration. The main product of velocity-model building is an initial model of the subsurface that is subsequently used in seismic imaging and interpretation workflows. Reflection or refraction tomography and full-waveform inversion (FWI) are the most commonly used techniques in velocity-model building. On one hand, tomography is a time-consuming activity that relies on successive updates of highly human-curated analysis of gathers. On the other hand, FWI is very computationally demanding with no guarantees of global convergence. We propose and implement a novel concept that bypasses these demanding steps, directly producing an accurate gridding or layered velocity model from shot gathers. Our approach relies on training deep neural networks. The resulting predictive model maps relationships between the data space and the final output (particularly the presence of high-velocity segments that might indicate salt formations). The training task takes a few hours for 2D data, but the inference step (predicting a model from previously unseen data) takes only seconds. The promising results shown here for synthetic 2D data demonstrate a new way of using seismic data and suggest fast turnaround of workflows that now make use of machine-learning approaches to identify key structures in the subsurface.
Conference Paper
Full-waveform inversion (FWI) is now a mature technology that is routinely used in exploration around the world to obtain high-resolution earth models. In geological areas such as the Gulf of Mexico, however, reconstructing complex salt geobodies poses a huge challenge to FWI due to the absence of low frequencies in the data needed to resolve such features. A skilled seismic interpreter has to interpret these geobodies, manually insert them into the earth model, and repeat this process several times in the earth model building workflow. Deep learning algorithms have gained a lot of interest in recent years by obtaining state-of-the-art results in various problems arising in the fields of computer vision, automatic speech recognition and natural language processing. We investigate the use of these algorithms to generate useful prior models for full-waveform inversion by learning features relevant to earth model building from a seismic image. We test this methodology in full-waveform inversion by generating a probability map of salt bodies in the migrated image along with a prior model and incorporating it in the FWI objective function. This approach is shown to be promising in enabling an automated salt body reconstruction using FWI.
Article
For hydrocarbon exploration, large volumes of data are acquired and used in physical modeling-based workflows to identify geologic features of interest such as fault networks, salt bodies, or, in general, elements of petroleum systems. The adjoint modeling step, which transforms the data into the model space, and subsequent interpretation can be very expensive, both in terms of computing resources and domain-expert time. We propose and implement a unique approach that bypasses these demanding steps, directly assisting interpretation. We do this by training a deep neural network to learn a mapping relationship between the data space and the final output (particularly, spatial points indicating fault presence). The key to obtaining accurate predictions is the use of the Wasserstein loss function, which properly handles the structured output, in our case by exploiting fault surface continuity. The promising results shown here for synthetic data demonstrate a new way of using seismic data and suggest more direct methods to identify key elements in the subsurface.
Article
We present a novel and practical deep fully convolutional neural network architecture for semantic pixel-wise segmentation termed SegNet. This core trainable segmentation engine consists of an encoder network, a corresponding decoder network followed by a pixel-wise classification layer. The architecture of the encoder network is topologically identical to the 13 convolutional layers in the VGG16 network. The role of the decoder network is to map the low resolution encoder feature maps to full input resolution feature maps for pixel-wise classification. The novelty of SegNet lies in the manner in which the decoder upsamples its lower resolution input feature map(s). Specifically, the decoder uses pooling indices computed in the max-pooling step of the corresponding encoder to perform non-linear upsampling. This eliminates the need for learning to upsample. The upsampled maps are sparse and are then convolved with trainable filters to produce dense feature maps. We compare our proposed architecture with the fully convolutional network (FCN) architecture and its variants. This comparison reveals the memory versus accuracy trade-off involved in achieving good segmentation performance. The design of SegNet was primarily motivated by road scene understanding applications. Hence, it is efficient both in terms of memory and computational time during inference. It is also significantly smaller in the number of trainable parameters than competing architectures and can be trained end-to-end using stochastic gradient descent. We also benchmark the performance of SegNet on Pascal VOC12 salient object segmentation and the recent SUN RGB-D indoor scene understanding challenge. We show that SegNet provides competitive performance although it is significantly smaller than other architectures. We also provide a Caffe implementation of SegNet and a webdemo at http://mi.eng.cam.ac.uk/projects/segnet/
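The pooling-index upsampling described in the abstract above can be sketched in one dimension. This is a simplified illustration rather than the SegNet implementation: it assumes a 1D signal, non-overlapping windows, and hypothetical function names. The encoder step records where each maximum came from, and the decoder step puts each pooled value back at that position, leaving zeros elsewhere.

```python
def max_pool_with_indices(signal, window=2):
    """Non-overlapping 1D max pooling that also records the position of each maximum."""
    pooled, indices = [], []
    for start in range(0, len(signal) - window + 1, window):
        block = signal[start:start + window]
        offset = max(range(window), key=lambda k: block[k])
        pooled.append(block[offset])
        indices.append(start + offset)
    return pooled, indices


def unpool_with_indices(pooled, indices, output_length):
    """Place each pooled value at its recorded position; every other entry stays zero."""
    upsampled = [0.0] * output_length
    for value, index in zip(pooled, indices):
        upsampled[index] = value
    return upsampled


signal = [1.0, 3.0, 2.0, 0.5, 4.0, 4.5]
pooled, indices = max_pool_with_indices(signal)
restored = unpool_with_indices(pooled, indices, len(signal))
# restored == [0.0, 3.0, 2.0, 0.0, 0.0, 4.5]
```

The restored signal is sparse, which is why the abstract notes that the upsampled maps are subsequently convolved with trainable filters to produce dense feature maps.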
Article
Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.
Article
The modern requirement for analyzing and interpreting ever-larger volumes of seismic data to identify prospective hydrocarbon prospects within stringent time deadlines represents an ongoing challenge in petroleum exploration. To provide a computer-based aid in addressing this challenge, we have developed a "big data" platform to facilitate the work of geophysicists in interpreting and analyzing large volumes of seismic data with scalable performance. We have constructed this platform on a modern distributed-memory infrastructure, providing a customized seismic analytics software development toolkit, and a Web-based graphical workflow interface along with a remote 3D visualization capability. These support the management of seismic data volumes, attributes processing, seismic analytics model development, workflow execution, and 3D volume visualization on a scalable, distributed computing platform. Early experiences show that computationally demanding deep learning methods such as convolutional neural networks (CNN) provide improved results over traditional methods such as support vector machines (SVMs) and logistic regression for identifying geologic faults in 3D seismic volumes. Our experiments show encouraging accuracy in identifying faults by combining CNN and traditional machine learning models with a variety of seismic attributes, and the platform is able to deliver scalable performance.
Article
In this work we address the task of semantic image segmentation with Deep Learning and make three main contributions that are experimentally shown to have substantial practical merit. First, we highlight convolution with upsampled filters, or 'atrous convolution', as a powerful tool in dense prediction tasks. Atrous convolution allows us to explicitly control the resolution at which feature responses are computed within Deep Convolutional Neural Networks. It also allows us to effectively enlarge the field of view of filters to incorporate larger context without increasing the number of parameters or the amount of computation. Second, we propose atrous spatial pyramid pooling (ASPP) to robustly segment objects at multiple scales. ASPP probes an incoming convolutional feature layer with filters at multiple sampling rates and effective fields-of-views, thus capturing objects as well as image context at multiple scales. Third, we improve the localization of object boundaries by combining methods from DCNNs and probabilistic graphical models. The commonly deployed combination of max-pooling and downsampling in DCNNs achieves invariance but has a toll on localization accuracy. We overcome this by combining the responses at the final DCNN layer with a fully connected Conditional Random Field (CRF), which is shown both qualitatively and quantitatively to improve localization performance. Our proposed "DeepLab" system sets the new state-of-art at the PASCAL VOC-2012 semantic image segmentation task, reaching 79.7% mIOU in the test set, and advances the results on three other datasets: PASCAL-Context, PASCAL-Person-Part, and Cityscapes. All of our code is made publicly available online.
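The atrous convolution mentioned in the abstract above can be illustrated with a small 1D example. This is a minimal sketch under simplifying assumptions (1D input, no padding, a hypothetical function name), not the DeepLab code. Spacing the kernel taps `rate` samples apart enlarges the field of view while the number of weights stays the same.

```python
def atrous_conv1d(signal, kernel, rate=1):
    """Valid-mode 1D correlation with kernel taps spaced `rate` samples apart."""
    span = (len(kernel) - 1) * rate + 1  # effective field of view in samples
    output = []
    for start in range(len(signal) - span + 1):
        output.append(sum(w * signal[start + k * rate] for k, w in enumerate(kernel)))
    return output


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]  # three weights in both cases below

dense = atrous_conv1d(signal, kernel, rate=1)    # field of view: 3 samples
dilated = atrous_conv1d(signal, kernel, rate=2)  # field of view: 5 samples
# dense == [-2, -2, -2, -2, -2]
# dilated == [-4, -4, -4]
```

With `rate=2` the same three weights compare samples four positions apart instead of two, so each output value reflects a wider context at no extra parameter cost.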