Curvilinear lineament extraction: Bayesian optimization of Principal
Component Wavelet Analysis and Hysteresis Thresholding
Bahman Abbassi *, Li-Zhen Cheng
Université du Québec en Abitibi-Témiscamingue, QC, J9X 5E4, Canada
ARTICLE INFO
Keywords:
Bayesian optimization
Hysteresis thresholding
Principal component analysis
Wavelet transform
Curvilinear lineament
ABSTRACT
Understanding deformation networks, visible as curvilinear lineaments in images, is crucial for geoscientific explorations. However, traditional manual extraction of lineaments is expertise-dependent, time-consuming, and labor-intensive. This study introduces an automated method to extract and identify geological faults from aeromagnetic images, integrating Bayesian Hyperparameter Optimization (BHO), Principal Component Wavelet Analysis (PCWA), and Hysteresis Thresholding Algorithm (HTA). The continuous wavelet transform (CWT), employed across various scales and orientations, enhances feature extraction quality, while Principal Component Analysis (PCA) within the CWT eliminates redundant information, focusing on relevant features. Using a Gaussian Process surrogate model, BHO autonomously fine-tunes hyperparameters for optimal curvilinear pattern recognition, resulting in a highly accurate and computationally efficient solution for curvilinear lineament mapping. Empirical validation using aeromagnetic images from a prominent fault zone in the James Bay region of Quebec, Canada, demonstrates significant accuracy improvements, with a 23% improvement in $F_\beta$ Score over the unoptimized PCWA-HTA and a marked 300% improvement over traditional HTA methods, underscoring the added value of fusing BHO with PCWA in the curvilinear lineament extraction process. The iterative nature of BHO progressively refines hyperparameters, enhancing geological feature detection. Early BHO iterations broadly explore the hyperparameter space, identifying low-frequency curvilinear features representing deep lineaments. As BHO advances, hyperparameter fine-tuning increases sensitivity to high-frequency features indicative of shallow lineaments. This progressive refinement ensures that later iterations better detect detailed structures, demonstrating BHO's robustness in distinguishing various curvilinear features and improving the accuracy of curvilinear lineament extraction. For future work, we aim to expand the method's applicability by incorporating multiple geophysical image types, enhancing adaptability across diverse geological contexts.
1. Introduction
Understanding the brittle deformation framework of bedrock is crucial for assessing geological sites. These deformations influence groundwater dynamics, hydrochemistry, and the mechanical properties of the bedrock, all of which are essential information for civil engineers, earthquake risk analysts, and mineral exploration geoscientists (Ahmadi and Pekkan, 2021; Tirén, 2010).
Hobbs introduced the term "lineament" in 1903 (Hobbs, 1903), describing it as "significant lines of landscape that reveal the hidden architecture of the rock basement." Modern definitions extend to curvilinear features such as faults, fractures, shear zones, and other tectonic structures, categorized into two types. Negative lineaments, such as joints, faults, and shear zones, are related to rock deformation; positive lineaments, including dykes and dyke swarms, form when magma intrudes into pre-existing fractures. This study focuses on negative lineaments, which are essential for understanding groundwater flow, seismic risk assessment, and mineral exploration.
Remote sensing techniques allow mapping these lineaments, even under soil cover (Masoud and Koike, 2011; Farahbakhsh et al., 2019; Ahmadi and Pekkan, 2021). Geophysical data, including magnetic, electromagnetic, radiometric, and gravimetric measurements, along with aerial photography, elevation data, multispectral sensing, laser, and radar, are invaluable for lineament studies (Fedi and Florio, 2001; Masoud and Koike, 2011; Boe, 2012; Soto-Pinto et al., 2013). However, traditional manual extraction methods are expertise-dependent, time-consuming, and labor-intensive (Tirén, 2010). Furthermore, manual digitization may overlook subtle or complex geological features, particularly in areas with low-resolution data or poor visibility, leading to incomplete datasets. Various algorithms have been developed for automatic detection, using image processing techniques to address this.

* Corresponding author.
E-mail address: bahman.abbassi@uqat.ca (B. Abbassi).
Contents lists available at ScienceDirect
Computers and Geosciences
journal homepage: www.elsevier.com/locate/cageo
https://doi.org/10.1016/j.cageo.2024.105768
Received 25 July 2024; Received in revised form 4 November 2024; Accepted 6 November 2024
Computers & Geosciences 194 (2025) 105768
Available online 7 November 2024
0098-3004/© 2024 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

The Hough Transform (Wang and Howarth, 1990; Fitton and Cox, 1998; Mohammadpour et al., 2020) effectively detects straight lines, even in noisy conditions, but struggles with curvilinear features. Sobel and Laplacian filters (Patel et al., 2011; Zhang et al., 2018) are widely used edge-detection methods known for computational efficiency but are highly noise-sensitive and require extensive pre-processing. PCA (Ölgen, 2004; Boutrika et al., 2019; Farahbakhsh et al., 2019) enhances prominent features through dimensionality reduction, though it may overlook subtle lineaments in multi-dimensional datasets without guidance on the optimal level of dimensionality reduction needed. The Radon Transform (Boe, 2012; Krylov and Nelson, 2014) is advantageous for detecting linear structures but is less effective with curvilinear patterns. The CWT (Guo et al., 2010; Tu and Karstoft, 2015; Zhou et al., 2023) excels at multi-scale feature extraction but relies on fixed parameters, which may reduce adaptability across different geological contexts. Panagiotakis and Kokinou (2014, 2015) demonstrated the effectiveness of applying topology and shape optimization techniques to distinguish geological faults from similar geomorphological structures. Though this approach helps to separate curvilinear features from linear ones, it still lacks proper feature extraction and adaptability in fine-tuning the embedded parameters. Recent studies highlight the potential of BHO for fine-tuning key parameters in applications like small-scale fault detection, hazard modeling, and landslide susceptibility mapping (Sun et al., 2021; Janizadeh et al., 2022; Wang et al., 2022). However, BHO has not yet been integrated with other lineament extraction methods to boost adaptability.
These limitations highlight the need for an approach that is both adaptable and capable of capturing complex, curvilinear structures. The present study addresses these challenges by combining BHO with PCWA and an HTA, offering a more flexible and accurate solution for curvilinear lineament extraction in aeromagnetic imagery. The proposed algorithm enables a fine-tuned extraction process at various wavelet frequencies. Incorporating CWT and PCA provides robust multi-scale edge detection, enhancing the precision of geological fault delineations. In this study, the fine-tuned PCWA-HTA algorithm with BHO effectively identifies hidden features across varying wavelet scales and directions, integrating them into a new spatial construct. Consequently, it reveals obscured lineament sources in the imagery, distinguishing genuine geological faults from analogous linear structures. It also overcomes traditional techniques' deficiencies, which often misinterpret noise and fine textures as geological faults due to a lack of multi-scale analysis, leading to erroneous representations.
The proposed solution was tested using total magnetic field intensity datasets from a region in Quebec, Canada. The results demonstrated superior performance in unveiling hidden lineaments within aeromagnetic datasets compared to established approaches. The algorithm is not limited to magnetic datasets, showcasing adaptability to various geophysical and geoscientific imagery. Additionally, its applicability extends beyond geological fault detection, proving helpful in identifying diverse fracture types in macroscopic rock specimens, drill core samples, and microscopic images.
2. Methodology
2.1. 2D CWT
The basic theory of the CWT involves decomposing an image into scaled and translated versions using a chosen wavelet, known as the mother wavelet. Introduced in 1997 by Moreau et al., it has become essential in geophysical data analysis for identifying potential field sources through maxima lines in the CWT coefficients. The wavelet coefficients are determined by convolving the mother wavelet with geophysical images, allowing extraction of time-frequency properties in time series or space-frequency structures in 2D images (Moreau et al., 1997; Hornby et al., 1999; Fedi and Florio, 2001; Boukerbout et al., 2003; Sailhac et al., 2009).
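Given a 2D image $I(\vec{x}) \in L^2(\mathbb{R}^2)$, the 2D CWT coefficients can be expressed as (Antoine et al., 2004):

$$C_I(a, \vec{b}, \theta) = \int_{\mathbb{R}^2} I(\vec{x})\, \overline{\psi_{a,\vec{b},\theta}(\vec{x})}\, d^2\vec{x} = \int_{\mathbb{R}^2} \hat{I}(\vec{\omega})\, \overline{\hat{\psi}_{a,\vec{b},\theta}(\vec{\omega})}\, e^{i \vec{b} \cdot \vec{\omega}}\, d^2\vec{\omega} \qquad (1)$$

where $a$, $\vec{b}$, and $\theta$ are the scale factor, the translation vector, and the rotational angle, respectively. $\overline{\psi}(\vec{x})$ is the complex conjugate of the mother wavelet $\psi(\vec{x})$ in 2D. The superscript hats denote the Fourier transforms of the functions.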
All families of mother wavelets can be generally expressed as:

$$\psi_{a,\vec{b},\theta}(\vec{x}) = \frac{1}{a}\, \psi\!\left( r_{-\theta}\, \frac{\vec{x} - \vec{b}}{a} \right) \qquad (2)$$

where $r_{-\theta}$ denotes the rotation matrix that controls the direction in which the mother wavelet translates:

$$r_{-\theta} = \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix} \qquad (3)$$

where $\theta \in [0, 2\pi)$. The calculated coefficient $C_I$ measures the degree of similarity between the image $I(\vec{x})$ and the directionally translated and scaled mother wavelet $\psi_{a,\vec{b},\theta}$ at a specific location $\vec{b}$, scale $a$, and angle $\theta$. For isotropic wavelets, like the Mexican Hat, we omit $r_{-\theta}$.
We can rewrite equation (1) in the convolution form:

$$C_I(a, \vec{b}, \theta) = \left[ \psi_{a,\theta}(\vec{x}) * I(\vec{x}) \right](\vec{b}) = f_{a,\theta}(\vec{b}) \qquad (4)$$

where $*$ denotes the convolution of the image with the mother wavelet, and wavelet coefficients are calculated over a 2D plane according to the translations of $\vec{b}$. The results are several 2D layers of features ($f_{a,\theta}$) in different scales and angles.
In this study, anisotropic Gaussian mother wavelets were constructed
using successive derivatives of a Gaussian distribution function (Jacques
et al., 2003). These wavelets were chosen for their superior ability to
capture directional features, offering enhanced sensitivity to elongated
and curvilinear structures that align with geological lineaments across
varying orientations and scales. This adaptability is essential for
detecting subtle geological features, where isotropic wavelets often fall
short. Furthermore, Gaussian wavelets provide a smoother transformation with reduced boundary effects, both of which are critical for achieving high accuracy in feature extraction.
The 2D Gaussian used to construct anisotropic Gaussian mother wavelets is defined as (Antoine et al., 2004):

$$g(\vec{x}) = e^{-|\vec{x}|^2 / 2\sigma^2} \qquad (5)$$

where $\vec{x} = (x, y)$ and thus $g(x, y) = e^{-(x^2 + y^2)/2\sigma^2}$. The parameter $\sigma$ determines the width of the Gaussian. Therefore, the mother wavelet is a differentiated Gaussian of the form:

$$\psi_{a,\theta} = (\partial/\partial x)^m (\partial/\partial y)^n g_{a,\theta} \qquad (6)$$
The generalized form of the resulting mother wavelet in the frequency plane is:

$$\hat{\psi}_{a,\theta}(\vec{\omega}) = (i\omega_x)^m (i\omega_y)^n e^{-(\omega_x^2 + \omega_y^2)/2} \qquad (7)$$

where $\vec{\omega} = (\omega_x, \omega_y)$ is the frequency plane (Antoine et al., 2004).
A simple Gaussian derivative wavelet as an edge detector can be constructed by setting $m$ and $n$ to 1 and 0, respectively, as shown in Fig. 1.
Our analysis focuses on locality and smooth curvilinearity, tuning $m$ and $n$ to extract features spanning a broad spectrum of details. Higher values of $m$ and $n$ result in wavelets with more vanishing moments, capturing finer details, while minimal values prevent substantial amplification of high-frequency noise. In this study, higher-order differentiation ($m > 1$ and $n > 1$) is used for high-frequency feature detection, and lower-order differentiation ($m = 1$ and $n = 0$) for low-frequency features. However, for $m > 2$, the increase in vanishing moments sharpens edges but introduces unwanted noise, a byproduct of higher-order differentiation.
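
As a rough illustration of Eqs. (2), (3), and (7), the frequency response of a scaled and rotated Gaussian derivative wavelet can be evaluated at a single frequency. This is a minimal sketch, not the authors' implementation. The function name `gaussian_derivative_wavelet_hat` is hypothetical, the rotation matrix is taken exactly as printed in Eq. (3), and the leading factor $a$ is an assumption that follows from the $1/a$ normalization of Eq. (2) under the usual Fourier scaling property.

```python
import math


def gaussian_derivative_wavelet_hat(wx, wy, a=1.0, theta=0.0, m=1, n=0):
    """Frequency response of a scaled, rotated Gaussian derivative wavelet.

    Evaluates a * psi_hat(a * r * w), where psi_hat is Eq. (7) and r is the
    rotation matrix as printed in Eq. (3). Illustrative sketch only.
    """
    # Rotate the frequency vector with the matrix of Eq. (3).
    rx = math.cos(theta) * wx - math.sin(theta) * wy
    ry = math.sin(theta) * wx + math.cos(theta) * wy
    # Dilate by the scale factor a.
    rx, ry = a * rx, a * ry
    # Eq. (7): (i*wx)^m (i*wy)^n exp(-(wx^2 + wy^2) / 2).
    envelope = math.exp(-(rx * rx + ry * ry) / 2.0)
    return a * (1j * rx) ** m * (1j * ry) ** n * envelope


# First-order edge detector (m=1, n=0) sampled along the x-frequency axis.
edge_response = gaussian_derivative_wavelet_hat(1.0, 0.0)
# The response vanishes at zero frequency, so the wavelet has zero mean.
dc_response = gaussian_derivative_wavelet_hat(0.0, 0.0)
```

Multiplying the 2D Fourier transform of an image by this response on a frequency grid and inverting the transform would give one feature layer $f_{a,\theta}$ in the sense of Eq. (4).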
2.2. 2D principal component spectral analysis
Studies show the effectiveness of CWT algorithms in the detection of lineaments associated with tectonic structures in geoscientific images (Jordan and Schott, 2005; Sailhac et al., 2009; Xu et al., 2020). However, the efficiency of feature extraction is limited when applying CWT without proper dimensionality reduction, particularly when isolating subtle lineament patterns among noise. Luo et al. (2016) developed an ensemble 4D-seismic history-matching framework that applies wavelet multiresolution analysis with magnitude-based thresholding to achieve sparse data representation, enhancing computational efficiency by retaining only the most significant wavelet coefficients. However, their simple magnitude-based thresholding approach does not consider the statistical nature of underlying features. In contrast, PCA emphasizes data variance, which allows for retaining a broader range of features, making it more effective for preserving subtle geological details that may be missed with a purely magnitude-focused method. Additionally, PCA is computationally efficient compared to other, more complex feature selection techniques that can be costly and resource-intensive, particularly in high-dimensional wavelet settings (Donald et al., 2009). Applying PCA to the outputs of the CWT decomposition helps to reduce the dimensionality of the wavelet features.
Recent studies from Zhang et al. (2019) and Lim et al. (2021) formulate the fusion between PCA and CWT. Guo et al. (2009) also show PCA integration with spectral decomposition in seismic data interpretation for identifying stratigraphic features. They use PCA on the Matched Pursuit algorithm as a seismic spectral decomposition method. In this study, we used a similar approach to develop the PCWA algorithm that combines the effectiveness of CWT in edge detection and the efficiency of PCA in the reduction of redundancies in wavelet spectra. This transformation simplifies the process of recognizing curvilinear patterns by retaining as much variance as possible. The two transforms operate over different domains: CWT operates spatially (and/or directionally) over each input image, while the PCA transform functions spectrally over an entire set of images. PCA decorrelates the band-to-band spectral information contained in the wavelet coefficients and therefore yields new smaller decorrelated datasets that can be used for curvilinear lineament extraction.

Fig. 1. Gaussian mother wavelet in spatial and frequency domain ($\psi_{a,\theta}$ and $\hat{\psi}_{a,\theta}$, respectively), as the first partial derivative of the Gaussian probability distribution function in the x direction with $m = 1$, $n = 0$. a) $\psi_{a,\theta}$ and $\hat{\psi}_{a,\theta}$ on scale $a = 1$, with varying angles $\theta = \{0, \pi/4, \pi/2\}$. b) $\psi_{a,\theta}$ and $\hat{\psi}_{a,\theta}$ on scale $a = 5$, with varying angles $\theta = \{0, \pi/4, \pi/2\}$.

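The resulting spectral features are the extracted wavelet coefficients $f_{a,\theta}(\vec{b})$ for every input image. Therefore, if we have $n_I$ input images with $n_x$ by $n_y$ pixels and mother wavelet scaling in $n_a$ scales and translating in $n_\theta$ directions, the total number of spectral features is $t = n_I n_a n_\theta$, and the number of samples for each feature is $s = n_x n_y$. PCA seeks a linear transformation mapping the original $t$-dimensional spectral feature matrix $F$ onto $r$-dimensional decorrelated principal components $P$ such that $r \ll t$:

$$F = \begin{bmatrix} f_{1,1} & f_{1,2} & \dots & f_{1,t} \\ f_{2,1} & f_{2,2} & \dots & f_{2,t} \\ \vdots & \vdots & \ddots & \vdots \\ f_{s,1} & f_{s,2} & \dots & f_{s,t} \end{bmatrix}_{(s \times t)} \rightarrow P = \begin{bmatrix} f'_{1,1} & f'_{1,2} & \dots & f'_{1,r} \\ f'_{2,1} & f'_{2,2} & \dots & f'_{2,r} \\ \vdots & \vdots & \ddots & \vdots \\ f'_{s,1} & f'_{s,2} & \dots & f'_{s,r} \end{bmatrix}_{(s \times r)} \qquad (8)$$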
In PCA, the initial preprocessing step involves transforming the input matrix $F$ to enhance the interpretability of the resulting principal components. This is achieved by ensuring that each extracted spectral feature (represented by the columns of the matrix $F$) has a mean of zero, making it centered around the origin, and a variance of one. This autoscaling step standardizes the feature distributions so that all features contribute equally to the analysis, which gives the subsequent principal components a clear statistical meaning. We implemented PCA using Singular Value Decomposition (SVD) on rectangular matrices. The SVD algorithm aims to find the eigenvectors and eigenvalues of the correlation matrix of the standardized inputs $F_{STD}$. Each eigenvector is characterized by a unique eigenvalue directly related to a principal component.
SVD allows for a low-rank approximation by considering only a subset of the largest singular values and corresponding singular vectors, representing the original dataset with fewer dimensions while minimizing the spectral information loss. SVD decomposes $F_{STD}$ into three resulting matrices that, multiplied together, return the original input matrix (Shlens, 2005; Jolliffe and Cadima, 2016):

$$F_{STD} = U \Sigma V^T \qquad (9)$$

where $U$, the matrix of left singular vectors, is an s-by-s matrix with orthogonal columns ($U^T U = I_s$), containing all the information about the samples (rows); $V$, the matrix of right singular vectors, is a t-by-t matrix with orthogonal rows and columns ($V^T V = V V^T = I_t$) that contains all the information about the spectral features (columns); and $\Sigma$ is an s-by-t matrix that records the SVD process, compressing all the significant information into the first columns of the new matrices. The matrix $\Sigma$ shows how the wavelet features are compressed and contains $r \le t$ singular values $\sigma_i \ge 0$, $i = \{1, 2, \dots, r\}$, on the main diagonal, with zeros filling the rest of the matrix.
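Fig. 2 summarizes the application of economy-size SVD on autoscaled wavelet coefficients when $r \le t \ll s$. This is the case in the majority of geophysical dimensionality reduction applications, where the number of features is much smaller than the number of samples and the dimensionality is either kept at $t$ or reduced below it. Therefore, an approximation of $F_{STD}$ can be presented as:

$$\tilde{F}_{STD} = \begin{bmatrix} u_{1,1} & u_{1,2} & \dots & u_{1,r} \\ u_{2,1} & u_{2,2} & \dots & u_{2,r} \\ \vdots & \vdots & \ddots & \vdots \\ u_{s,1} & u_{s,2} & \dots & u_{s,r} \end{bmatrix}_{(s \times r)} \begin{bmatrix} \sigma_1 & 0 & \dots & 0 \\ 0 & \sigma_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \sigma_r \end{bmatrix}_{(r \times r)} \begin{bmatrix} v_{1,1} & v_{1,2} & \dots & v_{1,t} \\ v_{2,1} & v_{2,2} & \dots & v_{2,t} \\ \vdots & \vdots & \ddots & \vdots \\ v_{r,1} & v_{r,2} & \dots & v_{r,t} \end{bmatrix}_{(r \times t)} \qquad (10)$$

where $\tilde{F}_{STD}$ denotes the low-rank approximation of $F_{STD}$.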
The singular values in $\Sigma_r = \mathrm{diag}[\sigma_1, \sigma_2, \dots, \sigma_r, \dots, 0, \dots, 0]$ are arranged in descending order, indicating their importance. Multiplying the original $F_{STD}$ with the truncated matrix of right singular vectors ($V$) gives the principal components:

$$P_i = F_{STD} V \qquad (11)$$
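
The link between the truncated SVD and the variance retained by the principal components can be made explicit. The relations below are standard PCA identities rather than results of this study. The symbols $\lambda_i$ and $\eta_r$ are introduced here only for illustration, $U_r$ and $V_r$ denote the first $r$ columns of $U$ and $V$, and $\Sigma_r$ here denotes the leading $r \times r$ block of $\Sigma$. The eigenvalue relation assumes the autoscaling uses the sample variance with $s - 1$ in the denominator.

```latex
\begin{aligned}
P &= F_{STD} V_r = U_r \Sigma_r, \\
\lambda_i &= \frac{\sigma_i^2}{s - 1}, \qquad i = 1, \dots, t, \\
\eta_r &= \frac{\sum_{i=1}^{r} \sigma_i^2}{\sum_{i=1}^{t} \sigma_i^2}.
\end{aligned}
```

Here $\lambda_i$ is the $i$-th eigenvalue of the correlation matrix of the autoscaled features, and $\eta_r$ is the fraction of the total variance kept by the first $r$ components, which is one way to reason about how far the dimensionality can be reduced.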
2.3. Curvilinearity extraction
The principal component spectral analysis enhances and extracts the
edges inside aeromagnetic imagery. This section addresses converting
these edges to meaningful geological faults discernible from other linear
patterns. Here, we used a method that computes the Slope and Aspect of
the extracted principal components and their derivatives (Panagiotakis
and Kokinou, 2014). Then, image enhancement and HTA by
region-growing pixel-labeling method are used to detect curvilinear
geological faults (Panagiotakis and Kokinou, 2015).
Fig. 2. SVD of wavelet coefficients when the number of features is much smaller than the number of samples, and the dimensionality is reduced from $t$ to $r$. The black shaded parts are truncated in the economy-sized SVD procedure; therefore, the last $s - r$ columns of $U$ are irrelevant and set to zeros.

The algorithm starts with preprocessing the input images to remove unwanted noise and artifacts. Then, the CWT is applied at different scales and directions, increasing the dimensionality. PCA decorrelates the spectral features while reducing dimensionality and eliminating repetitive patterns. Next, the algorithm calculates the slope and aspect of the extracted features based on the plane tangent vector of the wavelet principal component $P$. The tangent vector $T(x, y)$ to the surface $P(x, y)$ at a point $(x, y)$ is given by the gradient of $P$ at that point (Burrough et al., 1998; Panagiotakis and Kokinou, 2014, 2015):

$$T(x, y) = \nabla P(x, y) = \left[ P_x(x, y),\, P_y(x, y) \right]^T \qquad (12)$$
where $P_x = \partial P / \partial x$ and $P_y = \partial P / \partial y$ are the rates of surface change in the $x$ and $y$ directions. The slope and aspect of $P$ at each point are therefore (Panagiotakis and Kokinou, 2014, 2015):

$$S(x, y) = \tan^{-1}\!\left( \left( P_x^2 + P_y^2 \right)^{1/2} \right), \qquad A(x, y) = \operatorname{atan2}\!\left( P_y, P_x \right) \qquad (13)$$

where $\left( P_x^2 + P_y^2 \right)^{1/2}$ is the Euclidean norm of the tangent vector $T(x, y)$. The slope $S(x, y) \in [0, 90]$ and aspect $A(x, y) \in [0, 360]$ are usually measured in degrees.
The potential lineaments are extractable in the places where slopes ($S$), slopes of slopes ($S'$), and slopes of aspects ($A'$) are high (Panagiotakis and Kokinou, 2014, 2015):

$$L = \left( S^2 \cdot S' \cdot A' \right)^{1/4} \qquad (14)$$
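
The chain from Eq. (12) to Eq. (14) can be sketched on a small grid. This is an illustrative sketch, not the authors' code: the helper names (`gradient`, `slope_aspect`, `lineament_enhancement`) are hypothetical, derivatives are approximated by central differences with one-sided differences on the border, unit pixel spacing is assumed, and the wrap-around of the aspect at 0/360 degrees is not treated specially. The grid needs at least two rows and two columns.

```python
import math


def gradient(grid):
    """Finite-difference partial derivatives (x along columns, y along rows)."""
    rows, cols = len(grid), len(grid[0])
    px = [[0.0] * cols for _ in range(rows)]
    py = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            c0, c1 = max(c - 1, 0), min(c + 1, cols - 1)
            r0, r1 = max(r - 1, 0), min(r + 1, rows - 1)
            # Central differences inside, one-sided differences on the border.
            px[r][c] = (grid[r][c1] - grid[r][c0]) / (c1 - c0)
            py[r][c] = (grid[r1][c] - grid[r0][c]) / (r1 - r0)
    return px, py


def slope_aspect(grid):
    """Slope and aspect in degrees, following Eq. (13)."""
    px, py = gradient(grid)
    slope = [[math.degrees(math.atan(math.hypot(a, b))) for a, b in zip(rx, ry)]
             for rx, ry in zip(px, py)]
    aspect = [[math.degrees(math.atan2(b, a)) % 360.0 for a, b in zip(rx, ry)]
              for rx, ry in zip(px, py)]
    return slope, aspect


def lineament_enhancement(grid):
    """Enhanced image L = (S^2 * S' * A')^(1/4), following Eq. (14)."""
    s, a = slope_aspect(grid)
    s_prime, _ = slope_aspect(s)   # slope of the slope image
    a_prime, _ = slope_aspect(a)   # slope of the aspect image
    return [[(sv ** 2 * sp * ap) ** 0.25 for sv, sp, ap in zip(rs, rsp, rap)]
            for rs, rsp, rap in zip(s, s_prime, a_prime)]


# A tilted plane has constant slope and aspect, so L is zero everywhere.
plane = [[2.0 * c for c in range(4)] for _ in range(4)]
# A curved surface changes slope and aspect, so L becomes positive.
curved = [[float(c * c + r) for c in range(4)] for r in range(4)]
```

On the tilted plane the enhanced image is zero, which matches the idea that $L$ responds to changes in slope and aspect rather than to a uniform gradient.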
However, the enhanced image $L$ does not consider the curvilinearity of the lineaments to differentiate the geological faults from other linear patterns. Panagiotakis and Kokinou (2015) proposed a method to tackle this problem by improving curvilinearity. The algorithm convolves $L$ with a zero mean step filter $G$ with a width of $w$ and an orientation angle $\varphi$ (Panagiotakis and Kokinou, 2014, 2015):

$$I_g = L * G(w, \varphi) \qquad (15)$$
As a modification to the Panagiotakis and Kokinou method, the width of the step filter ($w$) in this study is set to change for each input wavelet principal component $P$ based on the global variance of each image as a measure of the complexity of curvilinear structures. A higher variance indicates a wider spread of pixel intensities, suggesting a higher level of detail and thus higher complexity, so a lower $w$ is used to capture high-frequency curvilinear lineaments. The result, $I_g$, contains curvilinear structures that appear in the local maxima of $L$. Then the algorithm calculates the maximum of the corresponding pixel values of the images (Panagiotakis and Kokinou, 2014, 2015):

$$I_m = \max_{a, \varphi} I_g \qquad (16)$$
In the resulting image $I_m$, all curvilinear structures under any orientation have been enhanced. A pixel labeling algorithm (Jang and Hong, 2002) ranks the pixels of $I_m$ based on their similarity to curvilinear structures. This process includes binarization via HTA, creating a binary image highlighting potential curvilinear faults. Two thresholds are defined to perform it: $T_l$ (low threshold) and $T_h$ (high threshold). $T_l$ identifies weak lineament pixels, and $T_h$ identifies strong lineament pixels. We also calculate $k$, the median value of the 9-pixel neighborhood of pixel point $p$ in $I_m$. According to the HTA (flowchart in Fig. 3), for each pixel point $p$:
- If $I_m \ge T_h$ and $I_m \ge k$, then $p$ is a strong lineament pixel ($C_1$).
- If $I_m \ge T_h$, or $I_m \ge T_l$, and $I_m \ge k$, then $p$ is a weak lineament pixel ($C_2$).
- Otherwise, $p$ is a non-lineament pixel ($C_3$).
$C_2$ pixels are classified as $C_1$ if connected to a pixel of $C_1$; otherwise, they are classified as class $C_3$. This region-growing method isolates genuine curvilinear geological faults from noise and other linear artifacts.
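
The three labeling rules and the region-growing step can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the published implementation: the function name `hysteresis_threshold` is hypothetical, the 9-pixel neighborhood is clipped at the image border, connectivity is taken as 8-connected, and weak pixels are promoted transitively through chains of weak pixels. The test image and threshold values are made up for illustration.

```python
from collections import deque
from statistics import median


def hysteresis_threshold(im, t_low, t_high):
    """Return a boolean mask of lineament pixels for a 2D list `im`."""
    rows, cols = len(im), len(im[0])

    def neighbours(r, c):
        # 8-connected neighbours that fall inside the image.
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (dr or dc) and 0 <= rr < rows and 0 <= cc < cols:
                    yield rr, cc

    strong, weak = set(), set()
    for r in range(rows):
        for c in range(cols):
            value = im[r][c]
            # k: median of the (border-clipped) 3x3 neighbourhood.
            k = median([value] + [im[rr][cc] for rr, cc in neighbours(r, c)])
            if value >= t_high and value >= k:
                strong.add((r, c))          # class C1
            elif value >= t_low and value >= k:
                weak.add((r, c))            # class C2
            # everything else stays in class C3

    # Region growing: promote weak pixels connected to strong ones.
    accepted = set(strong)
    queue = deque(strong)
    while queue:
        r, c = queue.popleft()
        for nb in neighbours(r, c):
            if nb in weak and nb not in accepted:
                accepted.add(nb)
                queue.append(nb)
    return [[(r, c) in accepted for c in range(cols)] for r in range(rows)]


# Hypothetical 5x7 image: one line with a strong centre, one isolated weak pixel.
image = [[0.0] * 7 for _ in range(5)]
image[2][:5] = [0.5, 0.5, 0.9, 0.5, 0.5]
image[0][6] = 0.5
mask = hysteresis_threshold(image, t_low=0.4, t_high=0.8)
```

In this example the weak pixels along the line are kept because they connect to the strong centre pixel, while the isolated weak pixel in the corner is discarded.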
2.4. Fine-tuning curvilinear lineament extraction through BHO
BHO has become a valuable tool in numerical modeling for geoscientific explorations, especially in optimizing complex models where exhaustive parameter tuning is computationally prohibitive (Sun et al., 2021; Janizadeh et al., 2022; Wang et al., 2022).
In this study, a significant impediment in implementing PCWA with HTA was the computational expense incurred while calibrating multiple underlying parameters. These parameters are the number of scales ($n_a$), Wavelet Smoothness Filter Ratio (WSFR), PCA Dimensionality Reduction (DR), Step Filtering Widths ($w$), and the Variability of the Step Filtering Widths (VSFW). The role of the WSFR is to alleviate unwanted interpolation artifacts that may arise during scale augmentation. The DR parameter signifies the quantity of the wavelet features extracted by PCA, aiding in reducing the dimensionality of the wavelet coefficients whilst maintaining critical information within the wavelet spectra. The parameter $w$ influences the degree of curvilinearity of the extracted lineaments; a smaller value of $w$ results in heightened curvilinearity. The VSFW determines how strongly $w$ varies with the complexity of the extracted spectral features. A larger value of VSFW yields a larger $w$ for less complex features and a reduced $w$ for more curvilinear features. For subsequent lineament extractions, we employed BHO to facilitate the optimization of these hyperparameters. BHO is a practical approach for global optimization of costly functions (Shahriari et al., 2016). Unlike random search methods, Bayesian optimization dynamically navigates the hyperparameter space by choosing the next combination based on prior observations, which significantly reduces the computational burden and accelerates convergence toward optimal solutions (Archetti and Candelieri, 2019; Yang and Shami, 2020). This approach is particularly effective in geoscientific applications, where models often involve high-dimensional, noisy data that require precise tuning to distinguish meaningful geological features from background noise.
Fig. 3. The flowchart describes curvilinear lineament extraction by HTA and region-growing methods (Jang and Hong, 2002; Panagiotakis and Kokinou, 2014, 2015) applied to spectral principal components. If the value of $I_m$ at point $p$ is greater than the high threshold $T_h$ and also greater than $k$, then the pixel is classified as a strong lineament pixel ($C_1$). If the value of $I_m$ at point $p$ is greater than the high threshold $T_h$, or greater than the low threshold $T_l$, and greater than $k$, then it is classified as a weak lineament pixel ($C_2$). Otherwise, everything else is classified as pixels that do not belong to curvilinear structures ($C_3$).

$F_\beta$ Score, a variant of the widely used F Score, was employed as the performance measure (Puthiya Parambath et al., 2014). The $F_\beta$ Score is an adjustable harmonic mean of precision ($A$) and recall ($B$), two fundamental metrics in binary classification (Puthiya Parambath et al., 2014):

$$F_\beta = \frac{(1 + \beta^2)\, A \cdot B}{\beta^2 \cdot A + B}, \qquad A = \frac{T^+}{T^+ + F^+}, \qquad B = \frac{T^+}{T^+ + F^-} \qquad (17)$$
where $T^+$ is the number of true positive pixels, or the number of correctly identified lineament pixels, $F^+$ is the number of false positive pixels, or the number of non-lineament pixels misidentified as lineament pixels, and $F^-$ is the number of false negative pixels, or the number of lineament pixels misidentified as non-lineament pixels. $A$ measures the proportion of true positives out of all positive predictions made by the model, emphasizing the accuracy of the model's positive predictions. $B$, on the other hand, quantifies the proportion of true positives out of all actual positive instances, reflecting the model's ability to identify positive instances. The factor $\beta$ allows recall to be weighted higher than precision ($\beta > 1$) or precision to be weighted higher than recall ($\beta < 1$).
This flexibility is beneficial when one metric is deemed more important. Given the importance of precision in lineament detection, a $\beta$ value less than one was chosen to emphasize precision in the $F_\beta$ Score calculation. This implies that we place more importance on the model's ability to accurately predict lineaments, even at the potential expense of missing some true lineaments. The reasoning behind this choice is that in geological studies, falsely identified lineaments could lead to significant misinterpretations and potentially costly misdirection in subsequent exploratory efforts.
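
The effect of $\beta$ in Eq. (17) is easy to check numerically. The sketch below is illustrative only: the function name `f_beta`, the pixel counts, and the value $\beta = 0.5$ are hypothetical choices, since the specific $\beta$ used in the study is not restated here.

```python
def f_beta(tp, fp, fn, beta):
    """F-beta score from pixel counts, following Eq. (17).

    tp, fp, fn: true positive, false positive, and false negative counts.
    Returns 0.0 when precision or recall is undefined.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0   # A in Eq. (17)
    recall = tp / (tp + fn) if tp + fn else 0.0      # B in Eq. (17)
    denominator = beta ** 2 * precision + recall
    if denominator == 0.0:
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


# Hypothetical detector with precision 0.8 and recall 0.5.
precision_weighted = f_beta(tp=8, fp=2, fn=8, beta=0.5)
balanced = f_beta(tp=8, fp=2, fn=8, beta=1.0)
recall_weighted = f_beta(tp=8, fp=2, fn=8, beta=2.0)
```

For this precise but incomplete detector, the score is highest with $\beta < 1$ and lowest with $\beta > 1$, which is the behavior the precision-oriented choice of $\beta$ relies on.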
The objective function for optimization is defined as $f_{obj}(x) = 1 - F_\beta(x)$, where $x$ represents the vector of hyperparameters to be optimized. The goal is to minimize $f_{obj}(x)$, effectively maximizing the $F_\beta$ Score.
Bayesian Optimization uses a Gaussian Process (GP) to model the objective function (Rasmussen and Williams, 2006; Shahriari et al., 2016) by considering the mean and covariance function, explicitly using a Matérn kernel:

$$f_{obj}(x) \sim GP\left( \bar{x},\, k_M(x, x') \right) \qquad (18)$$

where $\bar{x}$ is the mean function, $k_M(x, x')$ is the Matérn kernel function, $x$ is one point in the hyperparameter space, and $x'$ is another point distinct from $x$. The kernel determines the correlation between $x$ and $x'$, affecting how the GP infers values in unexplored regions by using nearby, evaluated points to estimate values and uncertainty. This correlation enables the GP to make informed predictions where closer points have higher covariance, allowing the model to balance exploration and exploitation by predicting smoother values in areas close to known points while indicating higher uncertainty in more distant, unexplored regions.
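
To make the role of the kernel concrete, the sketch below evaluates a Matérn covariance between two hyperparameter vectors. The smoothness $\nu = 5/2$ is an assumption made here for illustration, since the text only states that a Matérn kernel is used; the function name `matern52`, the single shared length scale, and the sample points are likewise hypothetical.

```python
import math


def matern52(x, x_prime, length_scale=1.0, sigma_f=1.0):
    """Matern covariance with smoothness 5/2 between two points.

    k(r) = sigma_f^2 * (1 + sqrt(5) r / l + 5 r^2 / (3 l^2)) * exp(-sqrt(5) r / l),
    where r is the Euclidean distance between x and x_prime.
    """
    r = math.dist(x, x_prime) / length_scale
    s = math.sqrt(5.0) * r
    return sigma_f ** 2 * (1.0 + s + s * s / 3.0) * math.exp(-s)


# Hypothetical points in a two-parameter slice of the search space.
same_point = matern52((0.2, 0.4), (0.2, 0.4))
near_point = matern52((0.2, 0.4), (0.3, 0.4))
far_point = matern52((0.2, 0.4), (2.0, 3.0))
```

The covariance equals the signal variance at zero distance and decays as the points move apart, which is why the GP is confident near evaluated hyperparameter sets and uncertain far from them.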
The BHO procedure begins with sampling $n_0$ initial points in the hyperparameter space. An internal heuristic is used to determine $n_0$ in the MATLAB implementation of the BHO. For low-dimensional spaces (usually with fewer than ten parameters, like in this study with five parameters), BHO typically starts with around 5–10 initial points (10 points in this study). The function may increase the number of seed points for higher-dimensional spaces to gather a more robust understanding of the objective function. The points $x_1, x_2, \dots, x_{n_0}$ and their objective function values $f_{obj}(x_i)$ form the initial dataset $D_{n_0} = \{(x_i, f_{obj}(x_i))\}_{i=1}^{n_0}$ used to train the GP model, with Maximum Likelihood Estimation determining the optimal values for the GP's hyperparameters, such as the length scale (the distance over which two points in the hyperparameter space are considered similar) and the variance in the Matérn kernel. This training process adjusts the GP model to best fit the observed data, resulting in a posterior distribution of the objective function over the hyperparameter space. This posterior distribution represents both the predicted mean function values and the associated uncertainty, establishing a baseline for the optimization process.
The Expected Improvement (EI) acquisition function is used to select new points. EI guides exploration by balancing exploitation (improving known good areas) with exploration (testing uncertain areas):

$$EI(x) = \mathbb{E}\left[ \left( f(x) - f(x^+) \right)^+ \right] \qquad (19)$$

where $f(x^+)$ is the best-known objective value so far, $\mathbb{E}$ denotes the expectation operator, and the term $\left( f(x) - f(x^+) \right)^+$ represents improvement over the best-known value (zero if no improvement). This function assigns higher values to points with a higher probability of improvement, focusing sampling on regions where gains are likely.
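
When the GP posterior at a candidate point is Gaussian with mean $\mu$ and standard deviation $\sigma$, the expectation in Eq. (19) has a well-known closed form. The sketch below uses that closed form with the sign convention of Eq. (19), where larger values of $f$ count as improvement. The function name `expected_improvement` and the numerical inputs are hypothetical, and this is not a description of the internals of any particular toolbox.

```python
import math


def expected_improvement(mu, sigma, best):
    """Closed-form EI for a Gaussian posterior N(mu, sigma^2).

    EI = (mu - best) * Phi(z) + sigma * phi(z), with z = (mu - best) / sigma,
    where Phi and phi are the standard normal CDF and PDF.
    """
    if sigma <= 0.0:
        # No uncertainty: improvement is deterministic.
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mu - best) * cdf + sigma * pdf


# Same predicted mean as the incumbent, but different uncertainty.
ei_certain = expected_improvement(mu=0.7, sigma=0.0, best=0.7)
ei_uncertain = expected_improvement(mu=0.7, sigma=0.1, best=0.7)
```

A point whose predicted mean only matches the incumbent still receives positive EI when its uncertainty is non-zero, which is how the acquisition function rewards exploration. If the quantity being optimized is minimized instead, as with $f_{obj} = 1 - F_\beta$, the same formula applies with `mu - best` replaced by `best - mu`.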
In each iteration $n = n_0 + 1, n_0 + 2, \dots, N$, after the initial $n_0$ evaluations, the algorithm performs one new evaluation to refine the objective function approximation. At each step, it determines the next point to sample, $x_{n+1}$, by selecting the point that maximizes EI over the entire search space $R$:

$$x_{n+1} = \underset{x \in R}{\arg\max}\; EI(x) \qquad (20)$$

The objective function is then evaluated at $x_{n+1}$, and the new observation $\left( x_{n+1}, f_{obj}(x_{n+1}) \right)$ is added to the dataset:

$$D_{n+1} = D_n \cup \left\{ \left( x_{n+1}, f_{obj}(x_{n+1}) \right) \right\} \qquad (21)$$
The GP model is updated to incorporate $D_{n+1}$, refining its posterior
distribution. This iterative approach allows the algorithm to add one
new sample at a time, improving the GP model’s approximation with
each evaluation. The process continues until the convergence criteria
are met, balancing exploration and exploitation to search the
hyperparameter space efficiently. In the MATLAB implementation,
convergence is determined by two criteria. First, a Tolerance on
Objective Improvement of $10^{-6}$ is used, allowing the process to halt if
improvements in the objective function fall below this threshold,
indicating potential convergence. Additionally, a Max Stalled Iterations
limit of 30 ensures that if no improvement occurs over 30 consecutive
evaluations, the optimization stops to prevent unnecessary calculations,
assuming it has reached a stable solution.
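One possible reading of these two criteria is a single stall counter: an
evaluation counts as stalled when it raises the best score so far by no
more than the tolerance, and the run stops after 30 stalled evaluations in
a row. The sketch below implements that reading in plain Python. The
function names and the score sequence are hypothetical, and the exact
stopping logic of the MATLAB implementation may differ.

```python
def running_best(scores):
    """Best-so-far curve for a maximization problem."""
    best, out = float("-inf"), []
    for s in scores:
        best = max(best, s)
        out.append(best)
    return out

def should_stop(scores, tol=1e-6, max_stalled=30):
    """True when the last `max_stalled` evaluations each improved the
    best-so-far score by no more than `tol` (assumed interpretation)."""
    best = running_best(scores)
    stalled = 0
    for prev, curr in zip(best, best[1:]):
        stalled = stalled + 1 if (curr - prev) <= tol else 0
    return stalled >= max_stalled

# Hypothetical score history: three improving evaluations, then a plateau.
history = [0.70, 0.77, 0.84] + [0.83] * 30
print(should_stop(history))        # plateau of 30 evaluations reached
print(should_stop(history[:-1]))   # only 29 stalled evaluations so far
```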
The converged calculation gives the optimal hyperparameter set $x^{*}$
as:
$$x^{*} = \underset{x \in R}{\arg\max}\; f(x) \tag{22}$$
This optimization process enhances curvilinear lineament extraction
by tuning hyperparameters to maximize precision in curvilinear feature
detection.
We also accounted for the algorithm’s predictability in this study.
Given that manually digitized lineaments may often be incomplete, our
proposed algorithm can potentially enhance lineament detection in
areas where manual digitization may not have fully captured the line-
aments. In such cases, lineaments predicted by the algorithm, even if
they do not precisely match the manually digitized lineaments, still hold
value. However, this introduces a challenge in evaluating the perfor-
mance of our algorithm. We might mislabel valuable predictions as false
positives if we strictly compare the algorithm’s predictions with the
manually digitized lineaments. To address this challenge, instead of
requiring an exact pixel match, we deemed a lineament to have been
successfully detected by the algorithm if the detection falls within close
vicinity of the manually digitized lineament, specifically within one
pixel. This approach ensures that our performance measure reflects the
practical value added by our algorithm in identifying lineaments that
may have been missed or incompletely captured in the manual digiti-
zation process.
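The one-pixel tolerance can be illustrated with a small scoring routine.
In this sketch, lineaments are sets of (row, column) pixels; a detected
pixel counts as correct if a ground-truth pixel lies within one pixel of
it, and a ground-truth pixel counts as recovered if a detection lies
within one pixel of it. The 8-connected (Chebyshev) neighborhood, the
default β = 1, and all names are assumptions made for illustration only.

```python
def neighbours(pixels, radius=1):
    """All pixels within `radius` (Chebyshev distance) of any pixel in the set."""
    out = set()
    for r, c in pixels:
        for dr in range(-radius, radius + 1):
            for dc in range(-radius, radius + 1):
                out.add((r + dr, c + dc))
    return out

def tolerant_f_beta(detected, truth, beta=1.0, radius=1):
    """F-beta score with a spatial tolerance of `radius` pixels."""
    detected, truth = set(detected), set(truth)
    if not detected or not truth:
        return 0.0
    precision = len(detected & neighbours(truth, radius)) / len(detected)
    recall = len(truth & neighbours(detected, radius)) / len(truth)
    if precision + recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Toy example: a diagonal fault, detected one pixel to the right,
# plus one spurious detection far from any digitized fault.
truth = {(0, 0), (1, 1), (2, 2)}
detected = {(0, 1), (1, 2), (2, 3), (5, 5)}
print(tolerant_f_beta(detected, truth, radius=1))  # tolerant match
print(tolerant_f_beta(detected, truth, radius=0))  # strict pixel match
```

With strict matching the shifted detection scores zero, while the
tolerant version credits it and penalizes only the isolated pixel.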
We also employed the Random Forest model as a sensitivity analysis
tool to rank the relative importance of each hyperparameter in our
optimization framework. In a Random Forest, each decision tree works
by repeatedly splitting the data based on hyperparameter values that
reduce the prediction error (Gregorutti et al., 2015). At each split, the
tree chooses the hyperparameter threshold that best divides the data to
decrease error. The importance score for a hyperparameter $p_i$ is
B. Abbassi and L.-Z. Cheng Computers and Geosciences 194 (2025) 105768
calculated by summing up how much each split on $p_i$ reduces the error
across all the trees in the Random Forest, then averaging this value
(Gregorutti et al., 2015):
$$S(p_i) = \frac{1}{T} \sum_{t=1}^{T} \sum_{j \in \mathrm{nodes}(p_i)} \Delta I_j^{(t)} \tag{23}$$
where $T$ is the total number of trees, each $j$ represents a split using
$p_i$ in tree $t$, and $\Delta I_j^{(t)}$ is the error reduction achieved by
that split. The term $j \in \mathrm{nodes}(p_i)$ refers to all the nodes in the
decision trees where the hyperparameter $p_i$ is used for splitting. This score helps
us understand which hyperparameters are most critical to minimizing
the BHO error. This approach is valuable because it leverages the
Random Forest’s ability to capture nonlinear relationships and in-
teractions, providing a more comprehensive understanding of hyper-
parameter sensitivity compared to linear methods (Probst et al., 2019;
Antoniadis et al., 2021). The discussion section elaborates on the
resulting ranked sensitivity scores and their implications for improving
curvilinear lineament extraction.
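Eq. (23) amounts to a sum over splits followed by an average over trees.
The sketch below computes it from a hand-written list of splits rather
than from a trained forest; the data structure, the function name, and the
error-reduction values are invented for illustration and carry no relation
to the results reported later.

```python
from collections import defaultdict

def importance_scores(forest):
    """forest: list of trees; each tree is a list of
    (hyperparameter_name, error_reduction) pairs, one per split.
    Returns S(p_i) as in Eq. (23): summed reductions divided by the tree count."""
    totals = defaultdict(float)
    for tree in forest:
        for name, reduction in tree:
            totals[name] += reduction
    n_trees = len(forest)
    return {name: total / n_trees for name, total in totals.items()}

# Two toy trees with made-up error reductions.
toy_forest = [
    [("WSFR", 0.5), ("w", 0.25)],
    [("WSFR", 0.25), ("DR", 0.125), ("WSFR", 0.25)],
]
toy_scores = importance_scores(toy_forest)
for name in sorted(toy_scores, key=toy_scores.get, reverse=True):
    print(name, toy_scores[name])
```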
3. Application to magnetic data from James Bay
3.1. Geological context
The study area near Lake Yasinski in the James Bay territory of
Quebec, Canada, offers a rich geological setting for understanding fault
distributions and associated mineralization (Fig. 4). The digitized faults
in Quebec’s SIGÉOM database form part of a comprehensive geological
dataset maintained by the Quebec Ministry of Natural Resources and
Forests (SIGÉOM, 2024). This dataset includes over 23,000 faults,
compiled from field surveys, remote sensing, and historical geological
data. Faults are digitized using GIS software and standardized processes
to ensure data consistency, with accuracy dependent on the scale and
quality of field verification, especially in remote areas. SIGÉOM
geologists record on-site data, which is later digitized in an Oracle
database, ensuring a high degree of standardization across geological
features (Gouvernement du Québec, 2019; SIGÉOM, 2024). We use the
SIGÉOM fault data as ground truth to calculate the Fβ score by
comparing detected lineaments from our algorithm with the digitized
faults. This Fβ score evaluates precision and recall, assessing the
algorithm’s effectiveness in identifying true geological structures and
reducing false positives for Bayesian optimization.
The interplay between faulting and hydrothermal fluid circulation is
critical for understanding the region’s mineralization processes. This
region is characterized by Archean and Proterozoic rocks from the
Superior Province, with a complex interplay of volcanic, plutonic, and
sedimentary units. The Proterozoic gabbroic dykes intrude the Archean
volcanic and plutonic units related to the La Grande sub-province. The
Archean units comprise the Complexe de Langelier, a mix of older
tonalitic gneiss, younger tonalites, and other significant plutonic bodies.
Major fault networks in the Lake Yasinski area facilitate the movement
of hydrothermal fluids, which are essential for mineralization
processes. The region is notable for its diverse mineralization types.
Algoma-type iron formations, significant for their iron content and
association with banded iron formations, are prominent. Magmatic
chromium and platinum group element mineralization occurs in association
with mafic and ultramafic intrusions. Uranium conglomerates are also
present, highlighting their potential economic significance. Previous
studies indicate that gold and copper mineralization are both veinous
and disseminated and closely linked to faulting and fluid accumulation
along foliation planes (Goutier et al., 1998, 1999; Gaudreault and
Beauregard, 2001).
3.2. Magnetic data sets
The study utilized high-resolution residual magnetic field intensity
data from SIGÉOM (2024), a geoscientific information system for
Québec, Canada (Fig. 5). These datasets include compilations from the
Abitibi and James Bay sectors, featuring flight line spacings of 100–300
m and flight heights of 40–100 m, ensuring detailed and accurate data
collection.
Fig. 4. A geological map of the study area compiled from SIGÉOM (2024) shows the major geological units, fault networks, and mineral deposits. The geological
units include the Complexe de Langelier (CL), the Formation d’Apple (FA), the Groupe de Yasinski (GY), the Formation de Shabudowan (FS), and the Formation
d’Ekomiak (FE). Plutonic bodies such as Pluton d’Amisach Wat (PAW) and Pluton de Tipitipisu (PT) are also indicated.
The normalized residual magnetic field values were interpolated
using a 75-m grid, providing refined resolution for the study area. The
tight spacing between flight lines and low flight height ensures that the
data accurately reflect subsurface magnetic anomalies. These magnetic
data are foundational for the subsequent lineament extraction process,
enabling precise identification of geological curvilinear patterns.
4. Results
Edge detection was implemented using the CWT algorithm on total
magnetic field intensity datasets from the James Bay region. This
method utilizes predefined parameters a, m, and n, as shown in Table 1,
which lists the values of m and n by scale (a). This allows effective
mapping of sharp transitions in the image and enhances the ability to
distinguish between high-frequency and low-frequency curvilinear
patterns (Fig. 6).
This is performed by increasing m and n for high-frequency features
and reducing them for extracting low-frequency features. The resulting
spectral features predominantly exhibit patterns linked to specific scales
and directions, making much of the spectra redundant. This necessitates
a dimensionality reduction approach to isolate the most salient features
for curvilinear lineament extraction. Consequently, SVD was applied to
the wavelet coefficients, separating spectral features into low- and
high-frequency principal components (Fig. 7). The principal components
(PCs) are sorted according to their eigenvalues. The first few PCs
capture most of the variance in the wavelet spectra and highlight major
low-frequency features. In contrast, the later PCs represent high-frequency
features, with the last PC typically considered noise and, therefore, eliminated.
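The SVD step can be sketched with NumPy on synthetic data. Here each
column of `features` stands in for one wavelet response (one scale and
orientation) flattened over all pixels; the random data, the array
sizes, and the function name `pca_via_svd` are assumptions for
illustration and do not reproduce the paper's PCWA code.

```python
import numpy as np

def pca_via_svd(features, n_keep):
    """features: (n_pixels, n_channels) array, one column per wavelet response.
    Returns scores on the first `n_keep` PCs and the variance fraction of every PC."""
    centered = features - features.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_keep] * s[:n_keep]     # projections onto the leading PCs
    explained = s**2 / np.sum(s**2)         # sorted from largest to smallest
    return scores, explained

# Synthetic stand-in: 8 redundant channels driven by 2 underlying patterns.
rng = np.random.default_rng(0)
patterns = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 8))
features = patterns @ mixing + 0.01 * rng.normal(size=(500, 8))

scores, explained = pca_via_svd(features, n_keep=2)
print(scores.shape)
print(np.round(explained, 4))
```

Because the eight channels are built from only two patterns, nearly all
of the variance falls on the first two components, which mirrors the
redundancy argument made above.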
The next step is the application of HTA to extract the curvilinear
lineaments from the PCs. However, a significant challenge in
implementing this approach is the computational expense incurred during
hyperparameter calibration. The tuning process can be extensive, often
resulting in suboptimal outcomes if not handled efficiently. To mitigate
this, we embedded a BHO in the PCWA-HTA scheme to predict the
performance of the given hyperparameters using a probabilistic model
explained in Section 2.4. The objective function is based on the
maximization of the Fβ score, influenced by the set of hyperparameters. The
GP provides a distribution over the objective function, enabling us to
quantify our uncertainty regarding the curvilinear lineament extraction
performance. The optimized parameters generally converge towards
more meaningful solutions after several iterations of Bayesian global
maximization of the Fβ score.
The optimization process iteratively refines the acquisition function
to determine the next sampling point, updates the GP with new
observations, and repeats until the stopping condition is met. As illustrated in
Fig. 8, an estimated minimum global objective is achieved after 100
iterations of Bayesian optimization. The objective function significantly
improves in the early iterations, indicating that the initial exploration
phase effectively identifies broad regions in the hyperparameter space
with high potential. By the 100th iteration, the BHO process converges
to an optimal set of hyperparameters, as evidenced by the plateau in the
objective function’s value. This convergence indicates that further
iterations would likely yield diminishing returns in terms of performance
improvement.
After some initial variability, the number of scales (n_a) stabilizes
around 4 (Fig. 9a). This stabilization indicates that the BHO algorithm
consistently identifies four scales as optimal for capturing the relevant
features. The values for the WSFR parameter fluctuate earlier but
converge between 0.2 and 0.3 in the later stages (Fig. 9b). This
convergence suggests that slight smoothing is beneficial for enhancing
curvilinear lineaments without introducing significant artifacts. The DR
parameter remains relatively stable around 15–16 in the later iterations,
implying that retaining 15–16 principal components effectively
balances the preservation of important spectral features and dimensionality
reduction (Fig. 9c). The step filtering width (w) shows significant
variability in early iterations but stabilizes around 62–66 (Fig. 9d). This
range suggests an optimal width for enhancing curvilinear structures.
The VSFW parameter also converges to values between 0.4 and 0.5,
indicating the importance of dynamically adjusting the filtering width
based on the complexity of the curvilinear structures (Fig. 9e).
In BHO, discrete variables are handled by evaluating only the fixed,
allowable values rather than interpolating between them. The GP model
respects the discrete nature of these variables by modeling each possible
value as a distinct option rather than part of a continuous spectrum. The
acquisition function evaluates each discrete value based on expected
improvement, iteratively refining the GP model with observations from
only the predefined set. This approach allows BHO to select the optimal
discrete values without any approximation between points.
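The idea of scoring only the allowed values can be sketched as an
exhaustive search over a small grid. The grid values and the stand-in
acquisition function below are hypothetical; in practice the acquisition
would be the EI computed from the GP posterior, and the allowable values
are those defined for each hyperparameter.

```python
from itertools import product

def best_discrete_candidate(grid, acquisition):
    """Score every allowed combination and return the one with the
    highest acquisition value; no values between grid points are tried."""
    names = list(grid)
    candidates = [dict(zip(names, values)) for values in product(*grid.values())]
    return max(candidates, key=acquisition)

def toy_acquisition(params):
    # Hypothetical stand-in for EI, peaking at n_a = 4 and DR = 16.
    return -((params["n_a"] - 4) ** 2) - ((params["DR"] - 16) ** 2) / 16

# Illustrative grid of allowed values (not the ranges used in the study).
grid = {"n_a": [3, 4, 5], "DR": [12, 14, 16]}
choice = best_discrete_candidate(grid, toy_acquisition)
print(choice)
```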
Table 2 lists the tuned parameters across different BHO iterations.
Notably, the median values from the final 35 iterations were selected as
the tuned parameters for each case, as the optimization process tends to
stabilize within this range of iterations. As illustrated, the Fβ score
improved significantly with optimized parameters compared to the
unoptimized settings.
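Taking the median over the final 35 iterations is a small computation
that can be sketched as follows. The iteration history here is
synthetic and the function name is hypothetical; only the window length
of 35 comes from the text.

```python
from statistics import median

def tuned_from_tail(history, tail=35):
    """history: list of dicts, one per BHO iteration.
    Returns the per-parameter median over the last `tail` iterations."""
    window = history[-tail:]
    return {name: median(step[name] for step in window) for name in window[0]}

# Synthetic 100-iteration history: unsettled early values, then stable ones.
history = (
    [{"n_a": 5, "w": 76}] * 65
    + [{"n_a": 4, "w": 62}] * 20
    + [{"n_a": 4, "w": 66}] * 15
)
tuned = tuned_from_tail(history)
print(tuned)
```

With an odd window length such as 35, the median is always one of the
sampled values, which keeps discrete parameters on their allowed grid.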
Fig. 5. Normalized residual total magnetic field intensities across Lake Yasinski
in the James Bay territory of Quebec (SIGÉOM, 2024). The map displays
normalized color variations representing different magnetic intensities, with
blue indicating lower intensities (normalized to 0) and red indicating higher
intensities (normalized to 1), critical for identifying underlying geological
structures.
Table 1
The choices of m and n as a function of scale (a).
Scale (a)   m   n
5           1   0
4           2   0
3           1   1
2           2   1
1           2   2
Fig. 6. Shifting, scaling, and rotating Gaussian mother wavelets with different orders of differentiation (m and n) on a sample magnetic data set for θ = π/4 and a =
{1, 2, …, 5}. Increasing the differentiation orders based on Table 1 improves the detection of higher-frequency edge features on shorter scales. Crisper edges are
detected for higher orders of differentiation (m or n > 2) at the cost of unwanted artifacts.
Fig. 7. Extracted spectral principal components with a Gaussian mother wavelet expanding on five scales and shifting in eight directions (every π/8 in a range
between 0 and π or π/2). Gaussian differentiation parameters (m and n) are selected based on Table 1. Typically, the first few principal components account for the
majority of the variance in the CWT coefficients.
Fig. 10 summarizes the improvement in the Fβ score for various
parameter combinations. The significant improvement from the
conventional HTA (Fβ = 0.216) to the PCWA-HTA (Fβ = 0.704) underscores
the importance of incorporating PCA with CWT even with unoptimized
parameter settings. The Fβ score increases during the initial
optimization stages, reflecting the immediate benefits of BHO. By iteration
10, the Fβ score reaches 0.770, indicating that early adjustments to the
hyperparameters significantly enhance the lineament detection
performance. As the iterations progress, the improvements in the Fβ
score become more gradual. Iteration 25 achieves an Fβ score of 0.841,
while iteration 50 reaches 0.850. This trend suggests that the
optimization process is fine-tuning the parameters to achieve incremental gains
in performance. From iteration 75 onward, the Fβ score stabilizes
around 0.858 to 0.862. This stabilization indicates that Bayesian
optimization has effectively identified a set of hyperparameters that
consistently yield high performance in curvilinear pattern recognition.
The median Fβ score from the final 35 iterations is 0.861, close to the
score at iteration 100 (Fβ score of 0.862). This proximity highlights the
reliability and robustness of the optimized parameters.
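The relative gains implied by these scores can be recomputed directly.
The short sketch below uses the HTA, PCWA-HTA, and iteration-100 scores
quoted above; the function name is illustrative, and the exact
percentages shift slightly if the median score (0.861) is used as the
reference instead.

```python
def relative_gain(new, old):
    """Relative improvement of `new` over `old`, in percent."""
    return 100.0 * (new - old) / old

# Scores quoted in the text: conventional HTA, unoptimized PCWA-HTA,
# and PCWA-HTA-BHO at iteration 100.
f_hta, f_pcwa_hta, f_bho = 0.216, 0.704, 0.862

print(f"BHO vs PCWA-HTA: {relative_gain(f_bho, f_pcwa_hta):.1f}%")
print(f"BHO vs HTA:      {relative_gain(f_bho, f_hta):.1f}%")
print(f"PCWA-HTA vs HTA: {relative_gain(f_pcwa_hta, f_hta):.1f}%")
```

These come out at roughly 22% and 299%, close to the 23% and 300%
improvements quoted in the abstract.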
Fig. 11 compares the algorithm’s results against the conventional
method for lineament extraction. Fig. 11a displays the extracted
lineaments using the conventional HTA approach. With an Fβ score of 0.22,
this approach demonstrates limited capability in curvilinear lineament
extraction, resulting in sparse and incomplete curvilinearity
representation. Fig. 11b illustrates the application of PCWA followed by HTA
using unoptimized hyperparameters (Table 2). The Fβ score significantly
improves to 0.70, highlighting the effectiveness of PCWA-HTA in
enhancing curvilinear features compared to the conventional method
(HTA).
Fig. 8. Performance of BHO for curvilinear lineament extraction with PCWA
and HTA.
Fig. 9. Performance of BHO in curvilinear lineament extraction using PCWA
and HTA. The mean values of the optimized parameters are determined by
selecting the values from the last 35 iterations for each tuned parameter.
Table 2
Performance comparison of unoptimized versus optimized algorithms for
curvilinear lineament extraction.
Group        Case                          Fβ     n_a  WSFR  DR   w   VSFW
Unoptimized  HTA                           0.216  N/A  N/A   N/A  76  0.25
Unoptimized  PCWA-HTA                      0.704  5    0     12   76  0.25
Optimized    PCWA-HTA-BHO: Iteration 10    0.770  4    0.1   9    74  0
Optimized    PCWA-HTA-BHO: Iteration 25    0.841  4    0.25  14   70  0.4
Optimized    PCWA-HTA-BHO: Iteration 50    0.850  3    0.25  15   62  0.4
Optimized    PCWA-HTA-BHO: Iteration 75    0.858  4    0.2   15   66  0.5
Optimized    PCWA-HTA-BHO: Median          0.861  4    0.25  16   62  0.4
Optimized    PCWA-HTA-BHO: Iteration 100   0.862  4    0.3   16   62  0.4
Fig. 10. Fβ score for various parameter combinations. Using HTA on just one
image (magnetic data) with unoptimized parameters (w = 76, VSFW = 0.25)
resulted in an Fβ score of 0.22 (red triangle). Combining PCWA with HTA
enhanced the Fβ score to 0.7 with fixed parameters n_a = 5, WSFR = 0, DR = 12,
w = 76, VSFW = 0.25 (green circle). Integrating the BHO with PCWA-HTA
fine-tunes the parameters and improves the Fβ scores in successive BHO iterations.
The median value shows the median score from the final 35 iterations.
The results of BHO embedment are presented in Fig. 11c, d, and e.
Fig. 11c displays the results after ten iterations of BHO, achieving an Fβ
score of 0.77. This early stage shows notable improvements in the
density and accuracy of the detected curvilinear patterns. Fig. 11d shows
further refinement after 75 iterations, with an Fβ score of 0.86. Fig. 11e
represents the median Fβ score of 0.86 from the final 35 iterations,
providing a robust average performance measure of the optimized
algorithm. Digitized faults are shown in Fig. 11f, representing the limited
geological faults as observed. This serves as a ground truth for
evaluating the effectiveness of the curvilinear lineament extraction methods.
The comparative analysis underscores the superior performance of
the proposed algorithm integrating Bayesian-optimized PCWA with
HTA. The substantial improvement in Fβ scores across iterations
highlights the effectiveness of Bayesian optimization in fine-tuning the
hyperparameters, leading to more accurate and comprehensive
detection of curvilinear geological faults. This dual capability of confirming
known faults and suggesting new ones makes this method a powerful
tool for geological explorations.
While the Fβ score provides a valuable metric for assessing the
algorithm’s performance, the ground truth dataset consists of a limited set
of manually digitized faults, as shown in Fig. 11f, which may not capture
all geological features present in the area. This incompleteness,
particularly in regions where digitization is sparse or inconsistent, may lead to
underestimation of the algorithm’s effectiveness, as valid detections that
do not align with digitized faults could be misclassified as false positives.
Additionally, the subjective nature of manual digitization introduces
variability in data quality, potentially affecting precision and recall
calculations. The one-pixel proximity threshold used to evaluate
matches between detected and digitized lineaments (as described in
Section 2.4) may also influence the Fβ score, depending on the scale and
resolution of the digitized lineaments. These factors should be taken into
account when interpreting the accuracy of the Fβ score, primarily when
the ground truth is based on a limited or imprecise digitization. We
recommend utilizing comprehensive, high-resolution digitization
verified by field geologists to minimize potential inaccuracies. This ensures
that the input data accurately represents geological structures, thereby
reducing the risk of compromised outputs that may arise from
inaccuracies inherent in statistical optimization processes.
Additionally, the iterative nature of BHO provides an approach to
enhancing the detection of hidden curvilinear features by progressively
refining the model’s hyperparameters. Initially, BHO broadly explores
the hyperparameter space, focusing on prominent, low-frequency
curvilinear features representing significant but less detailed
structures. These low-frequency curvilinear features (Fig. 11c), indicative of
deep lineaments, are more likely to be identified in the early iterations of
BHO when the algorithm seeks to establish general trends and capture
major patterns. As the BHO process advances into later iterations, the
focus shifts towards fine-tuning and exploiting known promising regions
within the hyperparameter space. This refined search enhances the
model’s sensitivity to high-frequency curvilinear features (Fig. 11d and
e), which correspond to more detailed and intricate structures. These
high-frequency features typically indicate shallow lineaments and
require precise tuning for accurate detection. The progressive
refinement inherent in BHO ensures that the model can capture these subtle
details as iterations increase. Thus, the logical progression from
identifying low-frequency, deep lineaments in early iterations to
high-frequency, shallow lineaments in later iterations underscores the
robustness of this optimization technique in distinguishing between
different depths of geological features. However, this focus on detailed
shallow features elevates the risk of erroneously interpreting noise and
interpolation artifacts as genuine lineaments, particularly under
conditions of suboptimal data resolution or elevated levels of instrumental
and environmental noise in datasets.
Fig. 11. Comparison of the proposed algorithm with the conventional method applied to total magnetic field datasets. (a) Lineament extraction utilizing
conventional HTA; (b) Curvilinear lineament extraction utilizing PCWA with HTA; (c & d) Outcomes of the proposed algorithm, integrating Bayesian-optimized PCWA with
HTA (iterations 10 and 75); (e) Results of the median Fβ scores from the final 35 iterations; (f) Limited geological faults as observed.
Elevated variance of a parameter during the final iterations
suggests a diminished contribution of that parameter to the maximization
of the Fβ score (Fig. 9). However, if a hyperparameter exhibits high
variability during optimization, it does not necessarily imply it is
unimportant. Instead, it might indicate that the Bayesian optimization
found an optimal region for that hyperparameter early on. To quantify
an importance score for hyperparameter ranking, we used a Random
Forest model. By integrating the BHO error metric (Fig. 8) with observed
hyperparameter trends (Fig. 9), we thoroughly evaluated and ranked
each hyperparameter’s relative importance in optimizing the curvilinear
lineament extraction process (Section 2.4). This approach enabled us to
assess the sensitivity of each hyperparameter in relation to the BHO
error, providing valuable insights into their contributions to overall
optimization. Results of hyperparameter importance analysis from the
Random Forest model are presented in Fig. 12, showcasing the
significance of WSFR, w, and DR in optimizing extraction quality. As can be
seen, the most important hyperparameter in the BHO process is WSFR,
with an importance score of 0.512, indicating that it has the most
significant impact on reducing BHO error. Since Random Forest prioritized
this feature in many tree splits, it consistently contributed to the lower
BHO error, making it the most influential hyperparameter in curvilinear
lineament extraction. In addition to WSFR, w and DR are essential,
contributing to detailed resolution and feature retention. Conversely,
VSFW and n_a exhibit lower importance, suggesting a lesser impact on
overall extraction quality. These findings guide the prioritization of
hyperparameters for enhancing curvilinear lineament extraction quality
(Fig. 12).
The proposed method advances curvilinear lineament extraction by
automating and optimizing parameters through BHO, addressing
limitations in prior approaches. Unlike manual methods, which are
time-intensive and subjective (Tirén, 2010), our approach minimizes
human involvement, enhancing consistency. Fixed-parameter models
(Masoud and Koike, 2011; Farahbakhsh et al., 2019; Ahmadi and
Pekkan, 2021) lack adaptability, while BHO dynamically adjusts
hyperparameters for precise extractions. Compared to traditional edge
detection (Guo et al., 2010; Boe, 2012; Krylov and Nelson, 2014; Tu and
Karstoft, 2015; Xu et al., 2020; Zhou et al., 2023), our use of the PCWA
algorithm retains essential geological features, improving clarity. While
the topological optimization strategy by Panagiotakis and Kokinou
(2015) improves adaptability in curvilinear shape extraction, it lacks
spectral feature pretreatments, which our method inherently integrates,
enabling efficient multiscale and multidimensional extraction. Our
method is designed to handle various data qualities; however, extremely
low-resolution images or datasets with high noise levels can still
challenge its effectiveness, where significant noise may interfere with
lineament clarity. Although wavelet smoothing filters help reduce noise,
high noise levels in the data might lead to artifacts misidentified as
lineaments.
5. Conclusions
This research introduces a novel automated method for extracting
curvilinear geological lineaments from aeromagnetic images by inte-
grating Bayesian Hyperparameter Optimization (BHO) with Principal
Component Wavelet Analysis (PCWA) and a Hysteresis Thresholding
Algorithm (HTA). The proposed approach addresses the limitations of
traditional manual and fixed-parameter extraction techniques, which
are often time-consuming, expertise-dependent, and lack adaptability
across different geological contexts.
In our method, PCWA effectively extracts multi-scale and multi-
directional features while reducing dimensionality and eliminating
redundant information in the wavelet spectra. HTA, coupled with a
region-growing pixel-labeling method, accurately detects and delineates
curvilinear lineaments by classifying pixels based on adaptive
thresholds. The integration of BHO allows for the autonomous fine-tuning of
multiple hyperparameters in the PCWA-HTA algorithm, significantly
enhancing the precision and efficiency of curvilinear lineament
mapping.
Our empirical validation using aeromagnetic data from the James
Bay region in Quebec, Canada, demonstrates substantial accuracy
improvements. The optimized algorithm achieved a 23% increase in the Fβ
score over the unoptimized PCWA-HTA and a marked 300% improvement
over the traditional HTA method, highlighting the effectiveness of
BHO in enhancing curvilinear pattern recognition. The iterative nature
of BHO progressively refines hyperparameters, enabling the detection of
both deep lineaments in early iterations and finer, shallow lineaments in
later iterations.
Furthermore, the Random Forest sensitivity analysis revealed that
hyperparameters such as Wavelet Smoothness Filter Ratio, Step Filtering
Width, and PCA Dimensionality Reduction play the most crucial roles in
optimizing extraction quality.
This study significantly contributes to lineament analysis by
providing an accurate, efficient, and scalable solution for detecting
curvilinear geological faults. Enhanced detection of these faults
improves our understanding of subsurface structures, which is essential for
applications in mineral exploration, groundwater management, and
seismic risk assessment. Although our method was tested on
aeromagnetic imagery, the flexibility of the Bayesian-optimized PCWA-HTA
approach suggests potential adaptability to other geophysical domains,
such as gravity and seismic datasets. Future work will explore this
adaptability, aiming to expand the method’s utility across diverse
geological contexts.
CRediT authorship contribution statement
Bahman Abbassi: Writing – review & editing, Writing – original
draft, Visualization, Validation, Supervision, Software, Resources,
Methodology, Investigation, Formal analysis, Data curation,
Conceptualization. Li-Zhen Cheng: Writing – review & editing, Validation,
Supervision, Resources, Project administration, Investigation, Funding
acquisition.
Fig. 12. Importance scores of hyperparameters based on Random Forest
sensitivity analysis. The Wavelet Smoothness Filter Ratio (WSFR) is highly
important, indicating its critical role in reducing BHO error by enhancing
feature clarity. Step Filtering Width (w) and PCA Dimensionality Reduction
(DR) also significantly impact the detail and information retention levels,
respectively. In contrast, the Variability of Step Filtering Widths (VSFW) and
the number of scales (n_a) show lower importance, suggesting a limited impact
on the optimization process.
Code availability section
CLE2D: Curvilinear Lineament Extraction 2D.
Contact: bahman.abbassi@uqat.ca, +1 819 290 1800.
Hardware requirements: This program is designed to run on any
Windows-based personal computer with at least 8 GB of random-access
memory (RAM). Increasing the RAM size allows larger images to be
processed at once. A solid-state drive (SSD) with a non-volatile memory
express interface (NVMe) is recommended.
Program language: MATLAB.
Software required: MATLAB 2024a – version 24.1 (update 4 and
later).
Program size: 3.8 MB.
The source codes can be downloaded at
https://github.com/bahmanabbassi/CLE2D. To execute lineament extraction,
please read the instructions in the user manual of the Program CLE2D ver. 1.1.
Declaration of competing interest
The authors declare that they have no known competing financial
interests or personal relationships that could have appeared to influence
the work reported in this paper.
Acknowledgments
This study was supported by grants from the Natural Sciences and
Engineering Research Council of Canada and Fonds Québécois de la
Recherche sur la Nature et les Technologies.
Data availability
All code, the user manual, and data sets are available at
https://github.com/bahmanabbassi/CLE2D
References
Ahmadi, H., Pekkan, E., 2021. Fault-based geological lineaments extraction using remote
sensing and GIS—a review. Geosciences 11 (5). https://doi.org/10.3390/
geosciences11050183.
Antoine, J.-P., Murenzi, R., Vandergheynst, P., Twareque Ali, S., 2004. Two-dimensional
Wavelets and Their Relatives. Cambridge University Press. https://doi.org/10.1017/
CBO9780511543395.
Antoniadis, A., Lambert-Lacroix, S., Poggi, J.-M., 2021. Random forests for global
sensitivity analysis: a selective review. Reliab. Eng. Syst. Saf. 206, 107312. https://
doi.org/10.1016/j.ress.2020.107312.
Archetti, F., Candelieri, A., 2019. Bayesian Optimization and Data Science. Springer.
https://doi.org/10.1007/978-3-030-24494-1.
Boe, T.H., 2012. Enhancement of large faults with a windowed 3D Radon transform
filter. SEG Technical Program. Society of Exploration Geophysicists.
https://doi.org/10.1190/segam2012-1008.1.
Boukerbout, H., Gibert, D., Sailhac, P., 2003. Identification of sources of potential fields
with the continuous wavelet transform: application to VLF data. Geophys. Res. Lett.
30 (8). https://doi.org/10.1029/2003GL016884.
Boutrika, R., Ducrot, D., Aissa, D.E., 2019. Contribution of remote sensing to mapping In-
Abeggui gold deposit (Central Hoggar, South Algeria). Arabian J. Geosci. 12 (2).
https://doi.org/10.1007/s12517-018-4201-3.
Burrough, P.A., McDonnell, R.A., Lloyd, C.D., 1998. Principles of Geographical
Information Systems, third ed. Oxford University Press.
Donald, D.A., Everingham, Y.L., McKinna, L.W., Coomans, D., 2009. Feature selection in
the wavelet domain: adaptive wavelets. In: Brown, S.D., Tauler, R., Walczak, B.
(Eds.), Comprehensive Chemometrics. Elsevier, pp. 647–679. https://doi.org/
10.1016/B978-044452701-1.00033-8.
Farahbakhsh, E., Chandra, R., Olierook, H.K.H., Scalzo, R., Clark, C., Reddy, S.M.,
Müller, R.D., 2019. Computer vision-based framework for extracting tectonic
lineaments from optical remote sensing data. Int. J. Rem. Sens. 41 (5), 1760–1787.
https://doi.org/10.1080/01431161.2019.1674462.
Fedi, M., Florio, G., 2001. Detection of potential fields source boundaries by enhanced
horizontal derivative method. Geophys. Prospect. 49 (1), 40–58.
https://doi.org/10.1046/j.1365-2478.2001.00235.x.
Fitton, N.C., Cox, S.J.D., 1998. Optimizing the application of the Hough transform for
automatic feature extraction from geoscientific images. Comput. Geosci. 24 (10),
933–951. https://doi.org/10.1016/S0098-3004(98)00070-3.
Gaudreault, D., Beauregard, A.-J., 2001. Rapport de travaux d’exploration, été-automne
2000, propriétés Yasinski-Nord et PEM 1404. Géologica Groupe-Conseil Inc, GM
59611. Gouvernement du Québec. https://gq.mines.gouv.qc.ca/documents/examine/GM59611/.
Goutier, J., Dion, C., Lafrance, I., David, J., Parent, M., Dion, D.-J., 1999. Géologie de la
région des lacs Langelier et Threefold (33F/03 et 33F/04). Gouvernement du
Québec, Ministère des Ressources naturelles. RG, 98-18.
https://gq.mines.gouv.qc.ca/documents/examine/RG9818/.
Goutier, J., Doucet, P., Dion, C., Beausoleil, C., David, J., Parent, M., Dion, D.-J., 1998.
Géologie de la région du lac Kowskatehkakmow (33F/06). Gouvernement du
Québec, Ministère des Ressources naturelles, RG, 98-16.
https://gq.mines.gouv.qc.ca/documents/examine/RG9816/.
Gouvernement du Québec, 2019. SIGÉOM step-by-step user guide: Mastering the
SIGÉOM data. Ministère de l’Énergie et des Ressources naturelles. Direction
Générale de Géologie Québec, Canada, pp. 42–44.
https://gq.mines.gouv.qc.ca/documents/sigeom/TOUTQC/ANG.
Gregorutti, B., Michel, B., Saint-Pierre, P., 2015. Grouped variable importance with
random forests and application to multiple functional data analysis. Comput. Stat.
Data Anal. 90, 15–35. https://doi.org/10.1016/j.csda.2015.04.002.
Guo, F., Yang, Y., Chen, B., Guo, L., 2010. A novel multi-scale edge detection technique
based on wavelet analysis with application in multiphase flows. Powder Technol.
202 (1–3), 171–177. https://doi.org/10.1155/2015/504725.
Guo, H., Marfurt, K.J., Liu, J., 2009. Principal component spectral analysis. Geophysics
74 (4), P35–P43. https://doi.org/10.1190/1.3119264.
Hobbs, W.H., 1903. Lineaments of the Atlantic border region. Bull. Geol. Soc. Am. 15 (1),
483–506.
Hornby, P., Boschetti, F., Horowitz, F.G., 1999. Analysis of potential field data in the
wavelet domain. Geophys. J. Int. 137 (1), 175–196. https://doi.org/10.1046/j.1365-
246x.1999.00788.x.
Jacques, L., Coron, A., Demanet, L., Rivoldini, A., Vandergheynst, P., 2003. Yet Another
Wavelet Toolbox: Reference Guide. Version 0.1.1, The YAW Toolbox Team, p. 33.
https://github.com/jacquesdurden/yawtb.
Jang, J.-H., Hong, K.-S., 2002. Detection of curvilinear structures and reconstruction of
their regions in gray-scale images. Pattern Recogn. 35 (5), 807–824. https://doi.org/
10.1016/S0031-3203(01)00073-5.
Janizadeh, S., Vafakhah, M., Kapelan, Z., Mobarghaee Dinan, N., 2022. Hybrid XGBoost
model with various Bayesian hyperparameter optimization algorithms for flood
hazard susceptibility modeling. Geocarto Int. 37 (25), 8273–8292. https://doi.org/
10.1080/10106049.2021.1996641.
Jolliffe, I.T., Cadima, J., 2016. Principal component analysis: a review and recent
developments. Phil. Trans. Math. Phys. Eng. Sci. 374 (2065), 20150202. https://doi.
org/10.1098/rsta.2015.0202.
Jordan, G., Schott, B., 2005. Application of wavelet analysis to the study of spatial
pattern of morphotectonic lineaments in digital terrain models. A case study. Rem.
Sens. Environ. 94 (1), 31–38. https://doi.org/10.1016/j.rse.2004.08.013.
Krylov, V.A., Nelson, J.D., 2014. Stochastic extraction of elongated curvilinear structures
with applications. IEEE Trans. Image Process. 23 (12), 5360–5373. https://doi.org/
10.1109/TIP.2014.2363612.
Lim, Y., Kwon, J., Oh, H.-S., 2021. Principal component analysis in the wavelet domain.
Pattern Recogn. 119, 108096. https://doi.org/10.1016/j.patcog.2021.108096.
Luo, X., Bhakta, T., Jakobsen, M., Nævdal, G., 2016. An ensemble 4D-seismic history-
matching framework with sparse representation based on wavelet multiresolution
analysis. SPE Bergen One Day Seminar. Society of Petroleum Engineers. https://doi.
org/10.2118/180025-PA. SPE-180025-MS.
Masoud, A.A., Koike, K., 2011. Auto-detection and integration of tectonically significant
lineaments from SRTM DEM and remotely sensed geophysical data. ISPRS J.
Photogrammetry Remote Sens. 66 (6), 818–832. https://doi.org/10.1016/j.
isprsjprs.2011.08.003.
Mohammadpour, M., Bahroudi, A., Abedi, M., 2020. Automatic lineament extraction
method in mineral exploration using CANNY algorithm and Hough transform.
Geotectonics 54 (3), 366–382. https://doi.org/10.1134/S0016852120030085.
Moreau, F., Gibert, D., Holschneider, M., Saracco, G., 1997. Wavelet analysis of potential
fields. Inverse Probl. 13 (1), 1–25. https://doi.org/10.1088/0266-5611/13/1/013.
Ölgen, M.K., 2004. Determining lineaments and geomorphic features using Landsat 5-TM
data on the lower Bakircay plain, Western Turkey. Aegean Geogr. J. 13 (1), 47–57.
Panagiotakis, C., Kokinou, E., 2014. Automatic enhancement and detection of active sea
faults from bathymetry. In: Proceedings of the 22nd International Conference on
Pattern Recognition, pp. 855–860. https://doi.org/10.1109/ICPR.2014.157.
Panagiotakis, C., Kokinou, E., 2015. Linear pattern detection of geological faults via a
topology and shape optimization method. IEEE J. Sel. Top. Appl. Earth Obs. Rem.
Sens. 8 (1), 3–11. https://doi.org/10.1109/JSTARS.2014.2363080.
Patel, J., Patwardhan, J., Sankhe, K., Kumbhare, R., 2011. Fuzzy inference-based edge
detection system using Sobel and Laplacian of Gaussian operators. In: Proceedings of
the International Conference & Workshop on Emerging Trends in Technology.
https://doi.org/10.1145/1980022.1980171.
Probst, P., Boulesteix, A.-L., Bischl, B., 2019. Tunability importance of hyperparameters
of machine learning algorithms. J. Mach. Learn. Res. 20 (1), 1–32. https://doi.org/
10.48550/arXiv.1802.09596.
Puthiya Parambath, S.A., Usunier, N., Grandvalet, Y., 2014. Optimizing F-measures by
cost-sensitive classification. In: Advances in Neural Information Processing Systems.
https://papers.nips.cc/paper_files/paper/2014/hash/678a1491514b7f1006d605e9161946b1-Abstract.html.
Rasmussen, C.E., Williams, C.K.I., 2006. Gaussian Processes for Machine Learning. The
MIT Press. https://doi.org/10.7551/mitpress/3206.001.0001.
Sailhac, P., Gibert, D., Boukerbout, H., 2009. The theory of the continuous wavelet
transform in the interpretation of potential fields: a review. Geophys. Prospect. 57
(4), 517–525. https://doi.org/10.1111/j.1365-2478.2009.00794.x.
B. Abbassi and L.-Z. Cheng
Computers and Geosciences 194 (2025) 105768
Shahriari, B., Swersky, K., Wang, Z., Adams, R.P., de Freitas, N., 2016. Taking the human
out of the loop: a review of Bayesian optimization. Proc. IEEE 104 (1), 148–175.
https://doi.org/10.1109/JPROC.2015.2494218.
Shlens, J., 2005. A tutorial on principal component analysis. Syst. Neurobiol. Lab.
https://doi.org/10.48550/arXiv.1404.1100.
SIGÉOM, 2024. Système d’Information Géominière: Gouvernement du Québec. Ministère
des Ressources naturelles et des Forêts. https://sigeom.mines.gouv.qc.ca.
Soto-Pinto, C., Arellano-Baeza, A., Sánchez, G., 2013. A new code for automatic
detection and analysis of the lineament patterns for geophysical and geological
purposes (ADALGEO). Comput. Geosci. 57, 93–103. https://doi.org/10.1016/j.
cageo.2013.03.019.
Sun, D., Xu, J., Wen, H., Wang, D., 2021. Assessment of landslide susceptibility mapping
based on Bayesian hyperparameter optimization: a comparison between logistic
regression and random forest. Eng. Geol. 281. https://doi.org/10.1016/j.
enggeo.2020.105972. Article ID 105972.
Tirén, S., 2010. Lineament interpretation short review and methodology. Swedish
Radiation Safety Authority, Stockholm, Sweden, Report 2010, 33, pp. 3–22.
https://www.stralsakerhetsmyndigheten.se/en/publications/reports/waste-shipments-physical-protection/2010/201033.
Tu, G.J., Karstoft, H., 2015. Logarithmic dyadic wavelet transform with its applications
in edge detection and reconstruction. Appl. Soft Comput. 26, 193–201. https://doi.
org/10.1016/j.asoc.2014.09.044.
Wang, J., Howarth, P.J., 1990. Use of the Hough transform in automated lineament
detection. IEEE Trans. Geosci. Rem. Sens. 28 (4), 561–567. https://doi.org/10.1109/
TGRS.1990.572949.
Wang, X., Ding, C., Chen, T., Yu, T., 2022. Research on the application of Bayesian-
optimized XGBoost in minor faults in coalfields. Math. Probl. Eng. https://doi.org/
10.1155/2022/3409468. Article ID 3409468.
Xu, J., Wen, X., Zhang, H., Luo, D., Li, J., Xu, L., Yu, M., 2020. Automatic extraction of
lineaments based on wavelet edge detection and aided tracking by hill shade. Adv.
Space Res. 65 (1), 506–517. https://doi.org/10.1016/j.asr.2019.09.045.
Yang, L., Shami, A., 2020. On hyperparameter optimization of machine learning
algorithms: theory and practice. Neurocomputing 415, 295–316. https://doi.org/
10.1016/j.neucom.2020.07.061.
Zhang, G., Tang, B., Chen, Z., 2019. Operational modal parameter identification based on
PCA-CWT. Measurement 139, 334–345. https://doi.org/10.1016/j.
measurement.2019.02.078.
Zhang, K., Zhang, Y., Wang, P., Tian, Y., Yang, J., 2018. An improved Sobel edge
algorithm and FPGA implementation. Proc. Comput. Sci. 131, 243–248. https://doi.
org/10.1016/j.procs.2018.04.209.
Zhou, J., Li, Z., Chen, J., 2023. Application of two-dimensional Morlet wavelet transform
in damage detection for composite laminates. Compos. Struct. 318. https://doi.org/
10.1016/j.compstruct.2023.117091.