February 2025
The increasing availability of large geological datasets and modern methods of data analysis facilitate a data science approach to geology in which inferences are drawn from geological data using automated methods based on statistics and machine learning. Such methods offer the potential for faster and less subjective interpretations of geological data than are possible from a human interpreter, but translating the understanding of a trained geologist to an algorithm is not straightforward. In this paper, we present automated workflows for detecting geological folds from map data using both unsupervised and supervised machine learning. For the unsupervised case, we use regular expression matching to identify map patterns suggestive of folds along lines crossing the map. We then use the HDBSCAN clustering algorithm to cluster these possible fold identifications into a smaller number of distinct folds. This clustering algorithm is chosen because it does not require the number of clusters to be known a priori. For the supervised learning case, we use synthetic models of folds to train a convolutional neural network to identify folds using map and topographic data. We test both methods on synthetic and real datasets, where they both prove capable of identifying folds. We also find that distinguishing folds from similar map patterns produced by topography is a major issue that must be accounted for with both methods. The unsupervised method has advantages, including the explainability of its results, and provides clearly better results in one of the two real-world test datasets, while the supervised learning method is more fully automated and likely more easily extensible to other structures. Both methods demonstrate the ability of machine learning to interpret folds on geological maps and have potential for further development targeting a wider range of structures and datasets.
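The regular-expression step of the unsupervised workflow can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions not stated in the abstract: each map unit is encoded as one letter, letters are ordered from oldest ("A") to youngest, a transect is a string of unit codes sampled at regular intervals, and a fold candidate is any unit flanked on both sides by the same unit. The names `collapse` and `fold_candidates` are hypothetical.

```python
import re
from itertools import groupby

# Assumed pattern: a unit flanked on both sides by the same unit (X Y X).
# The lookahead makes the match zero-width, so overlapping candidates are found.
MIRROR = re.compile(r"(?=(.)(.)\1)")


def collapse(samples):
    """Collapse runs of identical unit codes along a transect.

    Returns the sequence of units crossed and the sample index at the
    centre of each run.
    """
    units, centres = [], []
    start = 0
    for unit, run in groupby(samples):
        length = len(list(run))
        units.append(unit)
        centres.append(start + (length - 1) / 2)
        start += length
    return "".join(units), centres


def fold_candidates(samples):
    """Return (kind, centre_position) for each mirrored pattern on a transect.

    Assumes letters sort from oldest to youngest, so an older core unit
    suggests an anticline and a younger core unit suggests a syncline.
    """
    units, centres = collapse(samples)
    found = []
    for match in MIRROR.finditer(units):
        flank, core = match.group(1), match.group(2)
        kind = "anticline" if core < flank else "syncline"
        found.append((kind, centres[match.start() + 1]))
    return found


print(fold_candidates("CCCBBBAAABBBCCC"))
print(fold_candidates("AABBBCCBBBAA"))
```

Under these assumptions, the first transect has its oldest unit in the core and the second its youngest. Candidates collected from many transects would then be passed to the clustering step described above. A pattern this simple cannot tell a fold from a mirrored outcrop pattern produced by topography, which is the ambiguity the abstract identifies as a major issue.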