Earth Science Informatics (2024) 17:1625–1644
https://doi.org/10.1007/s12145-024-01243-4
RESEARCH
Application of geophysical and multispectral imagery data
for predictive mapping of a complex geo-tectonic unit: a case study
of the East Vardar Ophiolite Zone, North-Macedonia
Filip Arnaut1 · Dragana Đurić2 · Uroš Đurić3 · Mileva Samardžić-Petrović3 · Igor Peshevski4
Received: 12 July 2023 / Accepted: 28 January 2024 / Published online: 20 February 2024
© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2024
Abstract
The Random Forest (RF) and K nearest neighbors (KNN) machine learning (ML) algorithms were evaluated for their
ability to predict ophiolite occurrences in the East Vardar Zone (EVZ) of central North Macedonia. A predictive map of the
investigated area was created using three data sources: geophysical data (digital elevation model, gravity and geomagnetic),
multispectral optical satellite images (Landsat 7 ETM+ and their derivatives), and geological data (distance to fault map
and ophiolite outcrops map). The research compared and discussed the statistical and geological findings
obtained with different training dataset class ratios, evaluated against a testing dataset characterized by significant class
imbalance. The results suggest that selecting a suitable class balance for the training dataset is a critical factor in
achieving accurate ophiolite prediction with the RF and KNN algorithms. The analysis of feature importance revealed that the
Bouguer gravity anomaly map, the map of the total intensity of the Earth’s magnetic field reduced to the pole, the distance to fault map,
the band ratio BR3 map obtained from multispectral satellite images, and the digital elevation model are the most significant
features for predicting ophiolites within the EVZ. KNN showed poorer results than RF in terms of both the evaluation
metrics and the visual analysis of prediction maps. The methods used in this research can be applied to the predictive mapping
of complex geo-tectonic units covered by dense vegetation, and may indicate the presence of these units even if they were
not previously mapped, particularly when geophysical data are used as features.
Keywords Random Forest · K nearest neighbors · Remote Sensing · Geophysical data · Predictive mapping · East Vardar Zone
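The class-ratio experiment summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic two-feature "pixels" standing in for the geophysical and multispectral features, the helper names (`undersample`, `knn_predict`, `precision_recall_f1`, `make_pixels`), the ratios 1:1, 1:5 and 1:20, and the choice k = 5 are all assumptions made for illustration. Only a plain KNN classifier is shown; an RF model would normally come from a library.

```python
import math
import random


def undersample(samples, ratio, rng):
    """Keep every positive sample and draw `ratio` negatives per positive."""
    pos = [s for s in samples if s[1] == 1]
    neg = [s for s in samples if s[1] == 0]
    n_neg = min(len(neg), ratio * len(pos))
    return pos + rng.sample(neg, n_neg)


def knn_predict(train, x, k=5):
    """Majority vote among the k nearest training samples (Euclidean distance)."""
    nearest = sorted(train, key=lambda s: math.dist(s[0], x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0


def precision_recall_f1(y_true, y_pred):
    """Metrics for the positive (hypothetical 'ophiolite') class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def make_pixels(n_pos, n_neg, rng):
    """Synthetic data: positives centred at (2, 2), negatives at (0, 0)."""
    pos = [((rng.gauss(2, 1), rng.gauss(2, 1)), 1) for _ in range(n_pos)]
    neg = [((rng.gauss(0, 1), rng.gauss(0, 1)), 0) for _ in range(n_neg)]
    return pos + neg


rng = random.Random(0)
train_pool = make_pixels(50, 1000, rng)
test = make_pixels(20, 400, rng)  # strongly imbalanced test set (1:20)
y_true = [label for _, label in test]

results = {}
for ratio in (1, 5, 20):  # assumed positive:negative training ratios
    train = undersample(train_pool, ratio, rng)
    y_pred = [knn_predict(train, x) for x, _ in test]
    results[ratio] = precision_recall_f1(y_true, y_pred)
    p, r, f = results[ratio]
    print(f"train ratio 1:{ratio}  precision={p:.2f}  recall={r:.2f}  F1={f:.2f}")
```

In a sketch like this the testing set is left untouched while only the training set is resampled, which is what allows the effect of the training class ratio on precision and recall to be compared across runs.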
Introduction
Machine learning (ML) algorithms use an automatic inductive
approach to recognize patterns in data; the learned
relationships are then applied to other similar data, or to
the same datasets in different domains, to generate
predictions for data-driven classification and regression
problems (Cracknell and Reading 2014). In situations
involving the prediction of spatially dispersed categories
in extremely complex processes, these algorithms have
proven to be immensely useful (Kanevski et al. 2009). The
Random Forest (RF) algorithm is widely used for predictive
Communicated by H. Babaie.
* Filip Arnaut
filip.arnaut@ipb.ac.rs
Dragana Đurić
dragana.djuric@rgf.bg.ac.rs
Uroš Đurić
udjuric@grf.bg.ac.rs
Mileva Samardžić-Petrović
mimas@grf.bg.ac.rs
Igor Peshevski
pesevski@gf.ukim.edu.mk
1 University of Belgrade, Institute of Physics Belgrade,
Pregrevica 118, 11080 Belgrade, Serbia
2 University of Belgrade, Faculty of Mining and Geology,
Đušina 7, 11000 Belgrade, Serbia
3 University of Belgrade, Faculty of Civil Engineering,
Bulevar Kralja Aleksandra 73/1, 11000 Belgrade, Serbia
4 Ss. Cyril and Methodius University in Skopje, Faculty
of Civil Engineering MK, Bulevar Partizanski Odredi 24,
1000 Skopje, North Macedonia