This paper presents two algorithms for the large-scale automatic detection and instance segmentation of potential archaeological mounds on historical maps. Historical maps present a unique source of information for the reconstruction of ancient landscapes. The last 100 years have seen unprecedented landscape modifications with the introduction and large-scale implementation of mechanised agriculture, channel-based irrigation schemes, and urban expansion to name but a few. Historical maps offer a window onto disappearing landscapes where many historical and archaeological elements that no longer exist today are depicted. The algorithms focus on the detection and shape extraction of mound features with high probability of being archaeological settlements, mounds being one of the most commonly documented archaeological features to be found in the Survey of India historical map series, although not necessarily recognised as such at the time of surveying. Mound features with high archaeological potential are most commonly depicted through hachures or contour-equivalent form-lines, therefore, an algorithm has been designed to detect each of those features. Our proposed approach addresses two of the most common issues in archaeological automated survey, the low-density of archaeological features to be detected, and the small amount of training data available. It has been applied to all types of maps available of the historic 1″ to 1-mile series, thus increasing the complexity of the detection. Moreover, the inclusion of synthetic data, along with a Curriculum Learning strategy, has allowed the algorithm to better understand what the mound features look like. Likewise, a series of filters based on topographic setting, form, and size have been applied to improve the accuracy of the models. 
The resulting algorithms have a recall value of 52.61% and a precision of 82.31% for the hachure mounds, and a recall value of 70.80% and a precision of 70.29% for the form-line mounds, which allowed the detection of nearly 6000 mound features over an area of 470,500 km², the largest such approach to have ever been applied. If we restrict our focus to the maps most similar to those used in the algorithm training, we reach recall values greater than 60% and precision values greater than 90%. This approach has shown the potential to implement an adaptive algorithm that allows, after a small amount of retraining with data detected from a new map, a better general mound feature detection in the same map.
Scientic Reports | (2023) 13:11257 | https://doi.org/10.1038/s41598-023-38190-x
www.nature.com/scientificreports
Curriculum learning-based strategy for low-density archaeological mound detection from historical maps in India and Pakistan
Iban Berganzo-Besga¹, Hector A. Orengo¹,²*, Felipe Lumbreras³, Aftab Alam⁴, Rosie Campbell⁵, Petrus J. Gerrits⁵, Jonas Gregorio de Souza⁶, Afifa Khan⁵, María Suárez-Moreno⁵, Jack Tomaney⁵, Rebecca C. Roberts⁵ & Cameron A. Petrie⁵,⁷
e past 100years and, in particular, the second half of the twentieth century, have seen extensive urban growth
and the large-scale implementation of mechanised agriculture and irrigated systems in India and Pakistan, caus-
ing irreversible eects on the landscape. Among other lasting impacts, such as the implementation of large-scale
OPEN
1Landscape Archaeology Research Group (GIAP), Catalan Institute of Classical Archaeology (ICAC), Pl. Rovellat
s/n, 43003 Tarragona, Spain. 2Catalan Institution for Research and Advanced Studies (ICREA), Passeig Lluís
Companys 23, 08010 Barcelona, Spain. 3Computer Science Department, Computer Vision Center, Universitat
Autònoma de Barcelona, Edici O, Campus UAB, 08193 Bellaterra, Spain. 4Banaras Hindu University, Ajagara,
Varanasi, Uttar Pradesh 221005, India. 5McDonald Institute for Archaeological Research, University of Cambridge,
Downing St., Cambridge CB2 3ER, UK. 6Complexity and Socio-Ecological Dynamics (CaSEs) Research Group,
Universitat Pompeu Fabra, Barcelona, Spain. 7Department of Archaeology, University of Cambridge, Downing St.,
Cambridge CB2 3DZ, UK. *email: horengo@icac.cat
Content courtesy of Springer Nature, terms of use apply. Rights reserved
2
Vol:.(1234567890)
Scientic Reports | (2023) 13:11257 | https://doi.org/10.1038/s41598-023-38190-x
www.nature.com/scientificreports/
irrigation systems, river avulsion and ooding, there have been much systematic attening, for cultivation and
construction, of hundreds, if not thousands, of archaeological settlement mounds13. ese archaeological
mounds with their distinct elevation, colour and form are an indicative feature of past settlements and anthro-
pogenic modications of the landscape. Given their partial or total destruction, these are no longer detectable
by other types of sources such as LIDAR or satellite imagery4,5. Historical maps are therefore oen the only
source of information about the location and size of those lost sites. Available satellite images of the Indian
subcontinent date back to 1972 thanks to the Landsat satellite programme6, but detailed mapping of this region
through triangulation dates back to 1802 and the start of the Great Trigonometrical Survey. Later, during the
period ofBritish rule in India and Pakistan (1858–1947), the Survey of India (SoI) continued the systematic
mapping of the whole subcontinent.
The SoI maps were originally intended to be geographic maps and depicted different topographic features including mound features, many of which, as further research has shown³, are in fact archaeological sites (Fig. 1). It is impossible to calculate the percentage of mounded sites that were not drawn in the SoI maps, given the disappearance of sites during the last 100 years and the lack of reliable large-scale archaeological survey data. However, all sites listed as being protected at the time the map surveys took place are indicated on the historic maps, including sites like Harappa and Taxila⁷. Also, many major sites that were documented on the map sheets were not ‘discovered’ by archaeologists for many years if not decades, including the major Indus Civilisation city sites of Mohenjo-daro, Rakhigarhi and Dholavira. Furthermore, ground truthing has revealed there is a correlation between these mound features and proto-historical and historical sites dating to various periods from the period of the Indus Civilization onward³.
Deep Learning (DL) has been widely used in recent years to aid archaeological survey by using different resources such as lidar data⁴⁻⁶,⁸ and drone imagery⁹. This study continues the work carried out by several authors for the detection of archaeological sites using historical maps¹⁻³. Previous studies by Garcia-Molsosa et al. focused on the present district of Multan in the Pakistani province of Punjab. The series of maps used in that study had similar production standards¹⁰. Although this previous approach produced satisfactory results, it presented some drawbacks:
1. It employed a reduced series of maps of similar chronologies, depiction standards, scanning quality and preservation. This ideal situation, however, proved not to be the norm when a much larger collection of maps was assembled. The larger collection presented important variations in coloration, representation standards, scanning quality and preservation, which enormously complicated the large-scale application of these initial detectors¹⁰ and significantly reduced their detection capabilities.
2. The initial algorithms were designed in a proprietary web-based geospatial machine learning (ML) platform. The models were not available for download, analysis or free distribution and the processing was expensive, prohibitively so when considering large areas such as the one under investigation.
The study presented in this paper uses the historical maps produced in the late nineteenth and early twentieth century by the SoI with the aim of detecting two of the most common ways of drawing mound features (hachure and form-line, see Section "Deep learning model" for further details), which are similar to those depicted by the French in Syria and Lebanon¹⁰ (Fig. 2). Our research seeks to develop two DL segmentation algorithms for mound feature detection, one for each mound type, extending the detection to an area of 470,500 km² (most of which corresponds to the Indus River Basin), the largest area in which such an approach has ever been applied⁴, and to all types of maps, thus increasing the complexity of the analysis. We have employed a Region-based Convolutional Neural Network (R-CNN) segmentation algorithm as it collects information about not only the location of the mound feature, but also about its shape and extent.

Figure 1. Archaeological remains found where the historical maps indicated mounds. View from an elevated mound feature in northwest India (L742). Image from Green et al.³, Fig. 2. Reproduced here under the terms of the CC-BY 4.0 license in which it was originally published.
Automated detection processes require large amounts of data for their training (typically in the order of tens of thousands of individual examples), but this is not common in archaeology, where the number of known archaeological samples to train a ML algorithm is very low, as in this case study. Other studies with similar elements, such as burial mounds⁴, have shown that despite having limited training data, features of interest are detectable due to the characteristic circular shape of the tumuli, which presented few variations. The archaeological elements of this study, despite being mound features like those of previous studies where we encountered a similar problem, are much more diverse. Since they are symbols drawn by human hands and not images of their actual form, whether aerial or satellite, the features are noticeably divergent in style from each other. Consequently, a relatively small amount of training data was not enough to achieve meaningful results.
In computational archaeology, trained ML models have been shown to perform worse in areas with low-density of archaeological features than in high-density ones (e.g.¹¹,¹²). When performing large-scale detection with few sites, many False Positives (FPs) are introduced (typically many more than the True Positives (TPs)), which severely reduces the accuracy of the algorithm. However, real archaeological scenarios typically present low-densities of archaeological sites that need to be detected, at least compared to other typical objects in Computer Vision studies (such as cars, trees, buildings, ships, etc.). During a survey, the actual density of archaeological features is unknown, so to be a useful tool, the developed ML algorithm must also provide good results for low-density areas.
Therefore, the use of ML approaches in archaeology entails a series of idiosyncratic challenges, including the customary small amount of archaeological data for training and the usual low-density of archaeological features. In this article we implement a series of data augmentation (DA) techniques and learning strategies to resolve these two issues.
The main goal of this article, besides the successful detection of mound features within acceptable parameters of precision and recall, is to address these two issues by designing a workflow for the correct detection of archaeological features (1) in low-density areas and (2) with a small amount of training data.
Materials and methods
In this study, a total of 645 maps, provided by the Cambridge University Library and the British Library, have been used. These historical maps were produced and distributed by the SoI, and can be classified into different periods characterized by the then-current Surveyor General of the SoI, including C. Straham (1898–1899), G.C. Gore (1900–1902), F.B. Longe (1904–1907), and S.G. Burrard (1912–1913). Maps produced under A.R. Quraishi (1954) in his role of Surveyor General of the Survey of Pakistan have also been included.
Figure 2. The two types of mound feature depictions that need to be detected in historical maps: (a) hachure [8r], and (b) form-line [16r].

Map digitisation and georeferencing. Before proceeding with the training of the DL algorithm, all 645 maps used for this study had to be scanned and georeferenced (Fig. 3). The scanning was done by different institutions and individuals, in different periods and with different means and resolutions, reflecting the different histories and procedures of the institutions hosting and scanning the maps. After digitisation, the maps were georeferenced using a minimum of 12 Ground Control Points (GCPs) and an average of 25, spread across each map to achieve a good distribution and an accurate transformation. The GCPs were obtained from georeferenced high-resolution RGB satellite imagery available as Web Map Services layers in QGIS software (several versions were employed)¹³. Second-order polynomials were the preferred georeferencing method and were applied to most maps. On a few occasions, when the maps had suffered linear distortions due to folds in the map surface, the adjust transformation was used. These methods produced average Root Mean Square Error (RMSE) values of 0.00035° (ca. 33.7–38.8 m at this latitude) using a second-order polynomial and 0.00010236° (ca. 10.3 m with a maximum value of ca. 26.8 m) using the adjust transformation. Since the mounds under consideration are typically much larger than these values, the georeferenced mound feature locations largely overlap the real locations (for more details on the georeferencing process see¹).
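As a rough illustration of how an angular RMSE translates into ground distance, the sketch below converts degrees into metres north-south and east-west. It is not the authors' procedure: the constant of about 111,320 m per degree is a spherical-Earth approximation, and the latitude of 30° N is an illustrative value chosen for the example.

```python
import math

# Assumption: spherical-Earth approximation, about 111,320 m per degree of latitude.
METRES_PER_DEGREE = 111_320.0


def degrees_to_metres(error_deg, latitude_deg):
    """Return (north_south_m, east_west_m) for an angular error at a given latitude."""
    north_south = error_deg * METRES_PER_DEGREE
    # A degree of longitude shrinks with the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


# 0.00035 degrees is the average RMSE reported for the second-order polynomial;
# 30.0 is a hypothetical latitude used only for illustration.
ns, ew = degrees_to_metres(0.00035, 30.0)
print(f"{ns:.1f} m north-south, {ew:.1f} m east-west")
```

Under these assumptions the two values fall in the same range of a few tens of metres as the figures quoted in the text, well below the diameter of the mounds of interest.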
Deep learning model. In recent years, R-CNN models have become very common in archaeological survey, with segmentation algorithms such as Mask R-CNN⁹ and DeepLabV3+¹⁴ standing out. For this study, we developed two mound symbol detection DL algorithms using Mask R-CNN¹⁵, since we are looking for instance segmentation rather than semantic segmentation. Mask R-CNN detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance¹⁶. It extends Faster R-CNN¹⁷,¹⁸ by adding a branch for predicting segmentation masks, a small fully convolutional network (FCN)¹⁹, on each region of interest (RoI), in parallel with the existing branch for classification and bounding box regression. Mask R-CNN is simple to train and adds only a small overhead to Faster R-CNN¹⁶. VGG Image Annotator (VIA) from the University of Oxford has been used to label mound features²⁰.
The digitized and georeferenced historical maps are 3-channel RGB images and we have cropped them into 512 × 512 pixel images to save computing costs. Of the 645 maps used, only 43 contained known mound features, which have been used for training and validation: 286 hachure and 103 form-line mound features. Of those maps, 22 were used for training, including 168 hachure and 26 form-line mound features, and 21 were used for validation, including 118 hachure and 77 form-line mound features. In addition, given the small number of known mound features, another 21 maps, chosen randomly from the 645 original maps, were manually analysed. In this way, we have been able to create another dataset, the test dataset with 230 hachure and 137 form-line mound features, to evaluate the model obtained from training and validation for a second time.
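The cropping step can be sketched as a generator of pixel windows. This is a minimal illustration rather than the published script: the function name `tile_windows` is hypothetical, and clipping the border windows to the image size (instead of padding them) is an assumption.

```python
def tile_windows(width, height, tile=512):
    """Yield (left, top, right, bottom) pixel windows that cover an image.

    Assumption: windows on the right and bottom borders are clipped to the
    image size, so they may be smaller than `tile`.
    """
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            yield left, top, min(left + tile, width), min(top + tile, height)


# Hypothetical scan of 1300 x 1000 pixels, used only as an example.
windows = list(tile_windows(1300, 1000))
for window in windows:
    print(window)
```

Each window can then be read from the georeferenced raster and passed to the detector independently, which keeps memory use bounded regardless of the size of the scanned sheet.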
SoI map styles, colours and symbology depended on the date the maps were produced, the team drawing them, the region and the print quality of the map¹. Each map type also corresponds to a drawing style and, therefore, to a different mound colour, despite corresponding to the same type of mound feature. There are three typologies by which mound features are represented in the SoI maps, of which the most common ones are the hachure and the form-line mound feature. The hachure is depicted with many fragmented lines which show the orientation of the slope, whereas the form-line mound features are drawn to represent one elevation (Fig. 2). The third type of mound feature representation on SoI historical maps is shaded-relief. Although these are also present on the maps under study, they are not included in the automated detection given the low correspondence of this type of mound feature with archaeological sites: 86.36% of the examples visited on the ground were found not to be archaeological sites³. We have focused the form-line algorithm on detecting only its most common typology, as opposed to the hachure algorithm, which detects all types of hachure depiction. This is due to the fact that other form-line mound feature types (mound features with concentric lines, with a continuous line, and black ones) do not have a characteristic shape and are similar to other typologies that have no relation to archaeological features, such as road and slope lines (Fig. 4). Likewise, mound features cropped by the clipping of maps into 512 × 512 pixel images are not detected, because there are form-line and hachure-shaped features that are not closed in a circle and are not mound features.
ML algorithms like Mask R-CNN typically evaluate their models on images that contain labelled objects and do not evaluate those without labels. Since our goal is to demonstrate the good performance of the model in low-density areas, we have created artificial mound labels on all those images without real mounds to force the analysis in them. This way, the algorithm also evaluates the presence or absence of mounds in areas of the map where we know there are no mound features to better assess its precision. The 4 × 4 pixel artificial mound features are placed in the upper left corner of the images and will never be detected, as our algorithm discards any detection at the edges of the images (10 pixels from the edge) to avoid FPs derived from cropped symbols. These artificial mound features will never be detected, but these areas will be analysed, allowing our model to analyse both high-density and low-density areas.
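The interplay between the artificial label and the edge rule can be sketched with bounding boxes. This is an illustrative reading of the description, not the authors' code: the helper names are hypothetical, and treating a detection as "at the edge" when any part of its box lies within the 10-pixel margin is an assumption.

```python
TILE = 512        # tile side in pixels, from the text
EDGE_MARGIN = 10  # pixels from the edge, from the text
DUMMY_SIZE = 4    # side of the artificial label, from the text


def dummy_label():
    """Bounding box (left, top, right, bottom) of the artificial 4 x 4 label."""
    return (0, 0, DUMMY_SIZE, DUMMY_SIZE)


def touches_edge(box, tile=TILE, margin=EDGE_MARGIN):
    """True if any part of the box lies within `margin` pixels of the tile border."""
    left, top, right, bottom = box
    return (left < margin or top < margin
            or right > tile - margin or bottom > tile - margin)


def keep_detections(boxes):
    """Drop every detection that touches the edge margin."""
    return [box for box in boxes if not touches_edge(box)]


# Hypothetical detections: the dummy label, an interior box, and a cropped symbol.
detections = [dummy_label(), (100, 100, 160, 150), (480, 200, 512, 240)]
print(keep_detections(detections))
```

Because the artificial label always sits inside the margin, it can never survive the edge rule, yet its presence is enough to make the tile part of the evaluation.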
If our study had focused on areas with a high-density of mound features, our method and research could have ended here since we obtained good results after the first training for both hachure and form-line mound features. However, the majority of archaeological surveys are conducted in areas with a low-density of sites, or in places where the density of archaeological features is undetermined. Therefore, if we look at the reality of archaeological research and analyse the results of the first training for low-density areas, we observe that it is necessary to refine the model given the high number of FPs present in the results.

Figure 3. Scheme of the workflow for the detection of mounds in historical maps.
Model renement. e high number of FPs present in the rst training was due to the limited number of
training data available. erefore, with the idea of introducing new training data, both positive and negative,
various DA techniques have been applied. e rst DA methods developed were mound feature random transla-
tion (DA1), random rotation (DA2),and the so-called Doppelgänger technique (DA3).
For each type of mound feature and algorithm, 1500 new articial mound features were used, created ran-
domly from the original ones used for training, and they were placed, by an automated process, randomly on all
the maps used in training, implementing both DA1 and DA2. When pasting these articial mound features at
random on each of the training maps, they were emptied of any other feature than the actual mound depiction
as they contained various symbols unrelated to the mound feature itself, thus avoiding possible FPs derived from
the presence of these symbols, but also because the training maps had dierent background colours and the
inclusion of these features would have created articial colour-related features (Fig.5).
In order to avoid FPs due to common symbols on the maps such as roads, grass and trees where these new artificial mound features could have been placed randomly, DA3 was developed to copy the inside of each mound feature and to paste it to the outside of the mound feature so that it can be taken as negative training and just the mound feature as positive data (Fig. 6). In this study it has been decided not to implement other possible DA techniques such as resizing, because mound features of different sizes are drawn differently than the resized mound feature itself. The hachure and form-line shapes are different for each size, increasing or decreasing the number of strokes drawn. Therefore, noise would be introduced into the algorithm. The entire DA process has been done using our own script written in Python (see Data availability Section for further details).
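The random translation (DA1) and random rotation (DA2) steps can be illustrated on a vector outline. This is a simplified sketch and not the script referred to above: the authors paste image patches, whereas the hypothetical `rotate_and_place` below only moves polygon vertices, and the rectangular outline is invented for the example.

```python
import math
import random


def rotate_and_place(outline, tile=512, rng=None):
    """Rotate an outline by a random angle (DA2) and shift it to a random
    position (DA1) so that it stays inside a square tile.

    `outline` is a list of (x, y) vertices in pixels.
    """
    if rng is None:
        rng = random.Random()
    angle = rng.uniform(0.0, 2.0 * math.pi)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    cx = sum(x for x, _ in outline) / len(outline)
    cy = sum(y for _, y in outline) / len(outline)
    # Rotate about the centroid; coordinates are now relative to it.
    rotated = [((x - cx) * cos_a - (y - cy) * sin_a,
                (x - cx) * sin_a + (y - cy) * cos_a) for x, y in outline]
    xs = [x for x, _ in rotated]
    ys = [y for _, y in rotated]
    if max(xs) - min(xs) > tile or max(ys) - min(ys) > tile:
        raise ValueError("outline does not fit inside the tile")
    # Pick an offset that keeps the rotated bounding box inside the tile.
    dx = rng.uniform(-min(xs), tile - max(xs))
    dy = rng.uniform(-min(ys), tile - max(ys))
    return [(x + dx, y + dy) for x, y in rotated]


# Hypothetical 40 x 25 pixel outline, augmented 1500 times as in the text.
mound = [(0, 0), (40, 0), (40, 25), (0, 25)]
copies = [rotate_and_place(mound, rng=random.Random(seed)) for seed in range(1500)]
print(len(copies))
```

No scaling is applied, in line with the decision described above not to resize mound features.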
Aer increasing the positive and negative training, the number of FPs detected was considerably reduced,
but a series of specic FPs was still maintained. In order to further reduce these, a renement stage (DA4) was
Figure4. Dierent form-line typologies found in historical SoI maps: (a) dashed [23r] and solid line [34r]
mound feature, (b) mound feature with concentric lines [30r], and (c) road-like black line mound feature [16r].
Figure5. Some examples of hachure mound features containing dierent symbols inside, as well as two types
of map background colour (a and b).
Content courtesy of Springer Nature, terms of use apply. Rights reserved
6
Vol:.(1234567890)
Scientic Reports | (2023) 13:11257 | https://doi.org/10.1038/s41598-023-38190-x
www.nature.com/scientificreports/
included (Fig.7). In both cases the same correct mound features were used as continuous line circles for negative
training data, so the algorithm could decide that continuous lines are not mound features. e total number of
elements used as renement is 88 for the form-line and 127 for the hachure ones, which have been placed using
the DA1 and DA2 techniques up to a total of 8800 for the form-line algorithm and 12,700 for the hachure model.
Curriculum learning approach. Thanks to these DA methods we managed to reduce the number of FPs considerably, increasing the precision of the model. However, we stopped detecting some of the mound features that were initially detected, which also reduced the recall value. For this reason and with the aim of improving the accuracy metrics, it was decided to implement a Curriculum Learning (CL) strategy with synthetic data (DA5) (Fig. 8).
Firstly, CL is a way to gradually introduce complexity to the model through more training phases²¹. Secondly, the lack of data forced us to create synthetic data for each mound feature class (DA5), which we have used to make the algorithm learn through a CL strategy. In this way, the algorithm first learns the basics from the synthetic data and then more complex variations from the few known mound features in its second training, as a fine-tuning stage (Fig. 9). A total of 75 synthetic mound features were created for each of the two types.
Model ltering. Previous ground-truthing studies in India3, which included only a small number of well-
preserved archaeological mounds, showed that those mound features smaller than 200m in diameter were
mostly not archaeological sites, with hachure features adjacent to villages oen corresponding to ponds or upcast
from the creation of those ponds. Only 7.96% of the hachure and 25.83% of the form-line mound features of less
than 200m corresponded to archaeological sites3. Likewise, research on mound features in Pakistan showed that
many of the small mound features less than 100m in diameter were mostly dunes or modern spoil from pond
Figure6. First DA techniques used: (a) random translation (DA1), (b) random rotation (DA2), and (c) the
so-called Doppelgänger technique (DA3).
Figure7. Some FPs used as negative training data for renement (DA4): (a) hachure FPs and (b) form-line FPs.
Content courtesy of Springer Nature, terms of use apply. Rights reserved
7
Vol.:(0123456789)
Scientic Reports | (2023) 13:11257 | https://doi.org/10.1038/s41598-023-38190-x
www.nature.com/scientificreports/
excavation10. In contrast, 56.34% of the form-line and 40% of the hachure features greater than 200m in diam-
eter did correspond to sites3. For this reason, it has been decided to lter, throughout the study area, all those
mound features formed by areas of less than 500 pixels, a range of 60–150m in diameter depending on the pixel
resolution of each map, to avoid including mound features that are not likely to be archaeological sites (Filter1).
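The relation between the 500-pixel threshold and a ground diameter can be made explicit by treating each mask as a circle of equal area. The sketch below assumes that equivalence; the function names and the pixel sizes of 2.4 m and 6 m are hypothetical values chosen only to bracket the 60–150 m range quoted in the text.

```python
import math

MIN_AREA_PX = 500  # Filter1 threshold from the text


def equivalent_diameter_m(area_px, metres_per_pixel):
    """Diameter in metres of a circle with the same area as the mask."""
    return 2.0 * math.sqrt(area_px / math.pi) * metres_per_pixel


def passes_area_filter(area_px):
    """Masks of fewer than 500 pixels are discarded."""
    return area_px >= MIN_AREA_PX


# Hypothetical pixel sizes for a finer and a coarser scan.
for metres_per_pixel in (2.4, 6.0):
    diameter = equivalent_diameter_m(MIN_AREA_PX, metres_per_pixel)
    print(f"{metres_per_pixel} m/pixel: {diameter:.0f} m")
```

A 500-pixel mask is equivalent to a circle about 25 pixels across, so the ground size of the smallest retained feature depends directly on the scanning resolution of each sheet.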
A second lter, using blob analysis, was applied to remove those elongated mound features which are not
commonly archaeological sites and are mostly dunes. e ellipsoidal shape of each detected mound feature has
been evaluated and all those that presented an elongation, a ratio between the largest and smallest diameter of
the ellipse, greater than 3.5 were eliminated (Filter2).
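One common way to obtain such an elongation is from the second-order moments of the mask pixels, whose eigenvalues give the squared axis lengths of the best-fitting ellipse up to a constant factor. The sketch below follows that approach as an assumption; the text does not state which blob-analysis implementation was used, and the function names are hypothetical.

```python
import math

MAX_ELONGATION = 3.5  # Filter2 threshold from the text


def elongation(pixels):
    """Ratio of major to minor axis of the ellipse fitted to a set of (x, y) pixels."""
    n = len(pixels)
    mx = sum(x for x, _ in pixels) / n
    my = sum(y for _, y in pixels) / n
    sxx = sum((x - mx) ** 2 for x, _ in pixels) / n
    syy = sum((y - my) ** 2 for _, y in pixels) / n
    sxy = sum((x - mx) * (y - my) for x, y in pixels) / n
    # Eigenvalues of the 2 x 2 covariance matrix.
    half_trace = (sxx + syy) / 2.0
    spread = math.hypot((sxx - syy) / 2.0, sxy)
    major, minor = half_trace + spread, half_trace - spread
    if minor <= 0.0:
        return math.inf  # degenerate blob, e.g. a single row of pixels
    return math.sqrt(major / minor)


def passes_blob_filter(pixels):
    return elongation(pixels) <= MAX_ELONGATION


# Hypothetical blobs: a 40 x 10 dune-like strip and a 10 x 10 compact mound.
strip = [(x, y) for x in range(40) for y in range(10)]
compact = [(x, y) for x in range(10) for y in range(10)]
print(passes_blob_filter(strip), passes_blob_filter(compact))
```

Because the measure uses the covariance of all mask pixels, it is independent of the orientation of the feature on the map.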
Finally, in the post-processing stage, given the similarity of the mound features with the characteristic elevation shape of mountainous areas, a script was applied using Google Earth Engine and QGIS to filter all those mountainous regions (Filter3), areas with a slope greater than 5 degrees (of mean value within a 7 pixel radius, equivalent in this area to 210 m), and thus eliminate all mound features that, correctly identified by their drawn shape, do not correspond to possible archaeological mounds (Fig. 10).
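The slope criterion can be illustrated on a small grid of slope values. This plain-Python sketch is a stand-in for the Google Earth Engine and QGIS processing mentioned above, which is not reproduced here; the function names and the two uniform test grids are hypothetical.

```python
def mean_slope(slope, row, col, radius=7):
    """Mean of the slope values (degrees) within `radius` pixels of (row, col)."""
    total, count = 0.0, 0
    for r in range(max(0, row - radius), min(len(slope), row + radius + 1)):
        for c in range(max(0, col - radius), min(len(slope[0]), col + radius + 1)):
            if (r - row) ** 2 + (c - col) ** 2 <= radius ** 2:
                total += slope[r][c]
                count += 1
    return total / count


def is_mountainous(slope, row, col, threshold_deg=5.0, radius=7):
    """True where the neighbourhood mean slope exceeds the threshold (Filter3)."""
    return mean_slope(slope, row, col, radius) > threshold_deg


# Hypothetical 20 x 20 slope grids: a gentle plain and a steep hillside.
plain = [[1.0] * 20 for _ in range(20)]
hillside = [[12.0] * 20 for _ in range(20)]
print(is_mountainous(plain, 10, 10), is_mountainous(hillside, 10, 10))
```

Averaging over a neighbourhood rather than testing a single pixel keeps an isolated steep cell, such as the flank of a mound itself, from marking an otherwise flat area as mountainous.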
Model evaluation. Once the algorithm was trained, new mound features were detected in the remaining 581 maps for which we possessed no information on the presence of mound features. Given the diversity of the new maps compared to those used for training and validation (Fig. 11), this evaluation was carried out differentiating the maps based on their similarity with those used in training and validation following a probability density function (Fig. 12).
This detection can be replicated in Colab in order to facilitate its application by other users with the aim of making this algorithm reproducible and replicable. The resulting shapefile contains the masks of all detected mound features for easy viewing in standard GIS software such as QGIS.
Results
Below we present the results of the workflow followed for the detection of mound features in SoI historical maps. Both the initial (Tables 1 and 2) and the final results (Tables 7 and 8) of the detection of hachure and form-line mound features are presented, and only the intermediate results of the detection of hachure as an example of the evolution of the process (Tables 3, 4, 5, and 6), which was the same for both types of mound feature representations.
Finally, the trained model was applied to maps covering an area of 470,500 km² where a total of 2802 hachure and 3145 form-line mound features have been detected (5947 mound features), and perfectly georeferenced by
Figure8. Hachure and form-line mound feature datasets for CL: (a) examples of synthetic hachure mound
features (DA5), (b) examples of original hachure mound features, (c) examples of synthetic form-line mound
features (DA5), and (d) examples of original form-line mound features. e synthetic data (a and c) is for the
rst training of each of the two algorithms and the original data (b and d) for the second training also for both
algorithms.
Figure9. CL process scheme where stages with more complex aspects of the mound features are gradually
included: rst the synthetic dataset with DA and second the original with DA.
Content courtesy of Springer Nature, terms of use apply. Rights reserved
8
Vol:.(1234567890)
Scientic Reports | (2023) 13:11257 | https://doi.org/10.1038/s41598-023-38190-x
www.nature.com/scientificreports/
Figure10. Hachure mound-shaped mountain peaks on (a) historical map and (b) its satellite image.
Figure11. Similarity based on the RGB values of their backgrounds compared to the training and validation
maps: (a) sample map used for training, (b) sample map used for test for a standard deviation of 0.5, and (c)
sample map used for test for a standard deviation of 3.
Figure12. Percentage of maps in which new mounds are detected (blue) relative to the probability density
of the maps used both in training and in validation (brown), their similarity based on the RGB values of their
backgrounds.
Content courtesy of Springer Nature, terms of use apply. Rights reserved
9
Vol.:(0123456789)
Scientic Reports | (2023) 13:11257 | https://doi.org/10.1038/s41598-023-38190-x
www.nature.com/scientificreports/
Table 1. Evaluation of the Mask R-CNN model in high- and low-density validation datasets (density given as average mound features per image) before the entire DA workflow, for the detection of hachure mound features.
Dataset        Density (%)   TPs   FNs   FPs    Recall (%)   Precision (%)   F1 (%)
High-density   128.26        87    21    26     80.56        76.99           78.73
Low-density    2.67          87    21    737    80.56        10.56           18.67
Table 2. Evaluation of the Mask R-CNN model in high and low-density validation datasets, average mound
features per image, before the entire DA workflow for the detection of form-line mound features.
| Algorithm | Density (%) | TPs | FNs | FPs | Recall (%) | Precision (%) | F1 (%) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| High-density | 95.77 | 45 | 22 | 20 | 67.16 | 69.23 | 68.18 |
| Low-density | 1.47 | 45 | 22 | 1366 | 67.16 | 3.19 | 6.09 |
Table 3. Evaluation of the Mask R-CNN models in low-density validation dataset using different DA
techniques for the detection of hachure mound features: random translation (DA1), random rotation (DA2)
and the so-called Doppelgänger technique (DA3).
| Algorithm | TPs | FNs | FPs | Recall (%) | Precision (%) | F1 (%) |
| --- | --- | --- | --- | --- | --- | --- |
| None | 87 | 21 | 737 | 80.56 | 10.56 | 18.67 |
| DA1 | 71 | 39 | 37 | 64.55 | 65.74 | 65.14 |
| DA1 + DA2 | 68 | 45 | 53 | 60.18 | 56.20 | 58.12 |
| DA1 + DA2 + DA3 | 68 | 44 | 31 | 60.71 | 68.69 | 64.45 |
Table 4. Evaluation of the Mask R-CNN models in low-density validation dataset using a refinement step
(DA4) for the detection of hachure mound features.
| Algorithm | TPs | FNs | FPs | Recall (%) | Precision (%) | F1 (%) |
| --- | --- | --- | --- | --- | --- | --- |
| DA1 + DA2 + DA3 | 68 | 44 | 31 | 60.71 | 68.69 | 64.45 |
| DA1 + DA2 + DA3 + DA4 | 70 | 43 | 19 | 61.95 | 78.65 | 69.31 |
Table 5. Evaluation of the Mask R-CNN models in low-density validation dataset using CL-based approach
with synthetic data (DA5) for the detection of hachure mound features.
| Algorithm | TPs | FNs | FPs | Recall (%) | Precision (%) | F1 (%) |
| --- | --- | --- | --- | --- | --- | --- |
| DA1 + DA2 + DA3 + DA4 | 70 | 43 | 19 | 61.95 | 78.65 | 69.31 |
| DA1 + DA2 + DA3 + DA4 + DA5 | 77 | 38 | 11 | 66.96 | 87.50 | 75.86 |
Table 6. Evaluation of area (Filter1), blob (Filter2) and slope (Filter3) filters in low-density validation dataset
for the detection of hachure mound features.
| Algorithm | TPs | FNs | FPs | Recall (%) | Precision (%) | F1 (%) |
| --- | --- | --- | --- | --- | --- | --- |
| None | 87 | 225 | 15 | 27.88 | 85.29 | 42.03 |
| Filter1 | 78 | 43 | 13 | 64.46 | 85.71 | 73.58 |
| Filter1 + Filter2 | 77 | 38 | 11 | 66.96 | 87.50 | 75.86 |
| Filter1 + Filter2 + Filter3 | 77 | 38 | 10 | 66.96 | 88.51 | 76.24 |
our algorithm (Figs. 13 and 14). A manual evaluation was performed on a series of maps of this area, the
aforementioned test dataset (Tables 9 and 10).
Discussion
Low-density approach. In archaeology, it is common to find unsatisfactory results masked by differences
in the density of archaeological features. The density of the features must be taken into account11,12, since
good results in high-density areas may actually be hiding much worse results in low-density areas. The first
results showed a number of FPs of up to twenty times more than the mound features present in the area (Tables 1
and 2). This algorithm would be useless in a large-scale survey, as it would generate a large number of FPs and an
overly large dataset, which would not be of use in the planning of field validation or for archaeological analysis.
These results show clearly that archaeological studies should focus their validation on low-density areas in
order to avoid biased results.
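To make the density effect concrete, the following minimal Python sketch recomputes recall, precision and F1 from the hachure counts reported in Table 1. The function name `detection_metrics` is introduced here for illustration only; the formulas are the standard definitions based on TPs, FNs and FPs.

```python
def detection_metrics(tp: int, fn: int, fp: int) -> dict:
    """Return recall, precision and F1 (in %) from detection counts."""
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    scores = (("recall", recall), ("precision", precision), ("f1", f1))
    return {name: round(100 * value, 2) for name, value in scores}


# Hachure validation counts from Table 1 (before the DA workflow).
high_density = detection_metrics(tp=87, fn=21, fp=26)
low_density = detection_metrics(tp=87, fn=21, fp=737)

print("high-density:", high_density)
print("low-density: ", low_density)
```

The TPs and FNs are identical in both datasets, so recall is unchanged; only the FPs accumulated over the many empty low-density images pull precision and F1 down.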
During an archaeological survey, the true density of archaeological features is unknown, so algorithms must
be developed to show good metrics in areas of both high and low density of sites. Contrary to recently published
discussions12, poor results in low-density areas due to the sparse presence of archaeological features and
class imbalance are not inevitable; they are the product of insufficient model training. The foreground-to-background
imbalance, as an example of class imbalance22, is not the reason for poor results in the detection
stage. The imbalance problem for each category in the object detection training pipeline23 occurs when
one class heavily outnumbers the other in the training data24, not in the validation and
test datasets. The variation in results due to the different densities of archaeological features (Tables 1 and 2) can be
resolved by different DA and CL approaches (Tables 7 and 8).
Model renement and curriculum learning approach. e DA, with the introduction of 1500 new
mound features, signicantly improves the precision by increasing the training data, both positive and negative.
Both DA1 and DA2 show similar results that, despite the slight reduction in recall we have achieved a substantial
improvement in precision (Table3). anks to its negative training, the introduction of DA3 improves the preci-
sion of the model, which uses the DA4 to improve its accuracy.
The initial training data were not sufficient and resulted in a large number of FPs, indicating that the model had
not learned well what a mound feature looks like. Increasing the training data removed a large number of
FPs, but eliminating more specific FPs required DA3 and DA4 (Table 8). As shown in Fig. 7,
most of the FPs used in refinement were pointed circular and non-circular shapes for the hachure algorithm,
and both continuous and dashed circular shapes for the form-line model.
Likewise, as can be seen in Fig. 15, the use of DA5 has allowed the detection of hachure shapes not included
in the original training data. The inclusion of synthetic data, along with the CL strategy, has allowed the algorithm
to better understand what the mound features look like. The CL using synthetic data helped to develop
an algorithm from a small training dataset, which is common in archaeology. As seen in Table 5, both the recall
value (from 61.95% to 66.96%) and the precision value (from 78.65% to 87.50%) improved noticeably.
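The two-stage CL schedule (synthetic data first, original data second, as in Fig. 9) can be sketched as a loop that carries the weights from one stage into the next. This is only an illustrative outline: `curriculum_train`, `fake_train_stage`, the file names and the epoch counts are hypothetical, and the placeholder training function stands in for a real Mask R-CNN training call, which is not reproduced here.

```python
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[str, str]                # (image path, annotation path), hypothetical layout
Stage = Tuple[str, List[Sample], int]   # (stage name, samples, epochs)


def curriculum_train(
    weights: dict,
    stages: Sequence[Stage],
    train_stage: Callable[[dict, List[Sample], int], dict],
) -> dict:
    """Train stage by stage; each stage starts from the previous stage's weights."""
    for name, samples, epochs in stages:
        weights = train_stage(weights, samples, epochs)
        print(f"finished stage '{name}' on {len(samples)} samples")
    return weights


def fake_train_stage(weights: dict, samples: List[Sample], epochs: int) -> dict:
    """Placeholder for a real training call: it only records what each stage saw."""
    history = weights.get("history", []) + [(len(samples), epochs)]
    return {"history": history}


# Assumed toy datasets: easier synthetic examples first, original examples second.
synthetic = [(f"synthetic_{i}.png", f"synthetic_{i}.json") for i in range(4)]
original = [(f"original_{i}.png", f"original_{i}.json") for i in range(2)]

final_weights = curriculum_train(
    {},
    [("synthetic + DA", synthetic, 2), ("original + DA", original, 1)],
    fake_train_stage,
)
```

Passing the training function as an argument keeps the schedule independent of the detection framework, so the same ordering logic applies to both the hachure and the form-line models.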
Model ltering. Smaller objects, such as mound features less than 500 pixels in area, are the most dicult
for a CNN to detect, because such objects do not have enough pixels for the necessary feature extraction. at is
why the recall value is so low without Filter1 but high enough when we apply it (Table6). Both Filter2 and Fil-
ter3 remove many FPs, which results in an increase of the precision of the model, with fewer, but higher quality
results that are more likely to be of archaeological interest.
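An area filter of the kind described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: masks are represented as nested lists of 0/1 values, the helper `square_mask` is hypothetical, and the choice to keep masks of exactly 500 pixels is ours.

```python
from typing import List

Mask = List[List[int]]  # binary instance mask, 1 = mound pixel

MIN_AREA_PX = 500  # area threshold quoted in the text


def mask_area(mask: Mask) -> int:
    """Number of pixels set to 1 in the mask."""
    return sum(sum(row) for row in mask)


def filter_small_masks(masks: List[Mask], min_area: int = MIN_AREA_PX) -> List[Mask]:
    """Keep only the masks whose area is at least `min_area` pixels."""
    return [mask for mask in masks if mask_area(mask) >= min_area]


def square_mask(size: int, side: int) -> Mask:
    """Hypothetical helper: a size x size mask with a side x side square of ones."""
    return [[1 if r < side and c < side else 0 for c in range(size)] for r in range(size)]


masks = [square_mask(64, 30), square_mask(64, 10)]  # areas of 900 px and 100 px
kept = filter_small_masks(masks)
print([mask_area(m) for m in kept])  # only the 900 px mask remains
```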
In future work, new filters could be developed to eliminate mound features
that are correctly detected but not correctly classified by type. Some hachure mound features, in addition to
being detected by the hachure algorithm, have been detected by the form-line mound features algorithm. What
has been detected is not the complete mound feature but only its interior, which on many occasions resembles a
form-line mound feature. These misclassified mound features could easily be removed with a filter that discards
Table 7. Evaluation of the Mask R-CNN model in high and low-density validation datasets, average mound
features per image, after the entire DA workflow for the detection of hachure mound features.
| Algorithm | Density (%) | TPs | FNs | FPs | Recall (%) | Precision (%) | F1 (%) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| High-density | 128.26 | 77 | 38 | 3 | 66.96 | 96.25 | 78.97 |
| Low-density | 2.67 | 77 | 38 | 10 | 66.96 | 88.51 | 76.24 |
Table 8. Evaluation of the Mask R-CNN model in high and low-density validation datasets, average mound
features per image, after the entire DA workflow for the detection of form-line mound features.
| Algorithm | Density (%) | TPs | FNs | FPs | Recall (%) | Precision (%) | F1 (%) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| High-density | 95.77 | 48 | 20 | 0 | 70.59 | 100 | 82.76 |
| Low-density | 1.47 | 48 | 20 | 4 | 70.59 | 92.31 | 80.00 |
the smallest duplicate detected mound feature. This can also happen with the hachure and the shaded-relief
mound features. Some shaded-relief examples, such as the last image of Fig. 15, resemble a hachure mound feature.
Applying the same filter mentioned above would also resolve these double detections, as well as reduce the FPs
for shaded-like dunes.
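One possible form of such a duplicate filter is sketched below. It is a suggestion for future work rather than part of the published workflow: detections are simplified to bounding boxes, two detections are treated as duplicates when most of the smaller box lies inside the larger one, and the 0.8 threshold and all function names are assumptions made for the example.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


def area(box: Box) -> int:
    """Box area; degenerate or inverted boxes count as zero."""
    return max(0, box[2] - box[0]) * max(0, box[3] - box[1])


def overlap_of_smaller(a: Box, b: Box) -> float:
    """Intersection area divided by the area of the smaller of the two boxes."""
    inter = (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))
    smaller = min(area(a), area(b))
    return area(inter) / smaller if smaller else 0.0


def drop_smaller_duplicates(boxes: List[Box], threshold: float = 0.8) -> List[Box]:
    """Visit boxes from largest to smallest; keep a box only if no kept box covers it."""
    kept: List[Box] = []
    for box in sorted(boxes, key=area, reverse=True):
        if all(overlap_of_smaller(box, other) < threshold for other in kept):
            kept.append(box)
    return kept


hachure_mound = (100, 100, 200, 200)   # full mound found by the hachure model
interior_only = (120, 120, 180, 180)   # its interior, found again as a form-line
separate_mound = (400, 400, 450, 450)  # unrelated detection elsewhere on the map

result = drop_smaller_duplicates([interior_only, separate_mound, hachure_mound])
print(result)
```

Normalising by the smaller box rather than using intersection over union matters here, because an interior detection can be much smaller than the full mound and would otherwise score a low overlap.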
Model evaluation. Only 40.03% of the maps with unknown mound features, the ones used for testing, are
similar to 63.64% of the maps used for training and validation (Fig. 12), so most are substantially different. This
diversity, as well as its resulting metrics (Tables 9 and 10), indicates the need for an adaptive algorithm that allows,
after a small amount of retraining with data detected from a new map, a better general mound feature detection
in the same map. The more similar the maps are to those used in training and validation, the more similar the test
metrics are to the validation ones. An adaptive algorithm would improve both the recall value, by including different
ways of drawing the mound features, only some of which have been detected thanks to the synthetic data, and
the precision value, by including backgrounds not taken into account in the original training.
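The text does not spell out how the background similarity is computed, so the following Python sketch is only one plausible reading of the |0.5σ| to |3σ| bands used in Tables 9 and 10. It assumes that each map is summarised by a single background colour, that the three channels are averaged into one value, and that similarity is the distance from the training and validation mean in units of their standard deviation; all names and colours are hypothetical.

```python
from statistics import mean, stdev
from typing import Optional, Sequence, Tuple

RGB = Tuple[int, int, int]


def channel_mean(colour: RGB) -> float:
    """Collapse an RGB background colour into one value (an assumption of this sketch)."""
    return sum(colour) / 3


def sigma_distance(test_background: RGB, reference_backgrounds: Sequence[RGB]) -> float:
    """Distance of a test map from the reference mean, in reference standard deviations."""
    reference = [channel_mean(c) for c in reference_backgrounds]
    return abs(channel_mean(test_background) - mean(reference)) / stdev(reference)


def similarity_band(distance: float, bands: Sequence[float] = (0.5, 1, 2, 3)) -> Optional[float]:
    """Smallest band that contains the map, or None if it lies beyond the last band."""
    for band in bands:
        if distance <= band:
            return band
    return None


# Hypothetical background colours of the training and validation maps.
reference_maps = [(240, 230, 220), (230, 220, 210), (250, 240, 230)]

similar_map = similarity_band(sigma_distance((236, 226, 216), reference_maps))
darker_map = similarity_band(sigma_distance((218, 208, 198), reference_maps))
very_dark_map = similarity_band(sigma_distance((180, 170, 160), reference_maps))
print(similar_map, darker_map, very_dark_map)
```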
Figure13. Detection of mound features [21r] in an area where urban and agricultural development have
made those mapped mound features disappear: (a) satellite image of the area, (b) historical map of the area, (c)
detection of form-line mound features (blue) on the historical map, and (d) location of the detected potential
site mound features (blue) in the satellite image.
Likewise, new DA methods could be included in the training, such as random brightness jittering and random
Blur/Sharpen25. Some test maps, unlike those used in training and validation, contain darker and blurred
images (Fig. 16).
Comparison to manual digitisation of mound features. The VIA annotation software was used to
hand-digitise 756 mound features from 64 randomly selected historical maps, stored in JSON format. The
density of mound features is not uniform across maps. Instead, mound features frequently
cluster together, resulting in a high number of mound features on certain maps and a low number on others. This
type of pattern increases the amount of labour and time necessary for manual mound feature digitising using
GIS software. Based on the manually digitised mound features prepared as training data for the algorithm, we
estimated that manually digitising all mound features from the 645 historical maps used in this
research region would take an experienced professional more than 120 work hours. The detection time, running each algorithm
on a single NVIDIA A40 GPU, was more than 6 computing hours. While 120 h does not seem too long for
this project, creating an ML-based algorithm paves the way to scale this research to the additional 2200 historical
maps covering other parts of Pakistan and India that have been scanned and are ready for analysis.
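The scaling argument can be made explicit with a rough linear extrapolation. The sketch below assumes that manual effort grows in proportion to the number of maps, which ignores the uneven clustering of mound features noted above, and it treats the quoted 120 h as a lower bound, so the outputs are indicative only.

```python
MANUAL_HOURS = 120      # lower bound quoted in the text for the study region
MAPS_DIGITISED = 645    # historical maps used in the study region
ADDITIONAL_MAPS = 2200  # scanned maps awaiting analysis

hours_per_map = MANUAL_HOURS / MAPS_DIGITISED
minutes_per_map = round(60 * hours_per_map, 1)
extra_manual_hours = round(hours_per_map * ADDITIONAL_MAPS, 1)

print(f"about {minutes_per_map} min per map")
print(f"at least about {extra_manual_hours} h for the additional maps")
```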
Figure14. Distribution of detected mound features in the Indus River Basin: (a) hachure and (b) form-line
mound features. Figure created by the rst author using QGIS 3.28.4 13 and a WMS-connected Google Earth
satellite imagery layer as a background.
Table 9. Evaluation of the Mask R-CNN model in low-density test dataset based on its maps' RGB similarity
relative to training and validation ones for the detection of hachure mound features.
| Similarity | TPs | FNs | FPs | Recall (%) | Precision (%) | F1 (%) |
| --- | --- | --- | --- | --- | --- | --- |
| \|0.5σ\| | 92 | 61 | 9 | 60.13 | 91.09 | 72.44 |
| \|1σ\| | 111 | 89 | 14 | 55.50 | 88.80 | 68.31 |
| \|2σ\| | 116 | 104 | 19 | 52.73 | 85.93 | 65.35 |
| \|3σ\| | 121 | 109 | 26 | 52.61 | 82.31 | 64.19 |
Table 10. Evaluation of the Mask R-CNN model in low-density test dataset based on its maps' RGB similarity
relative to training and validation ones for the detection of form-line mound features. *Four of the detected
mound features were drawn in a way other than the one used for training, the continuous form-line. For this
reason, they have not been counted as either TPs or FPs.
| Similarity | TPs | FNs | FPs | Recall (%) | Precision (%) | F1 (%) |
| --- | --- | --- | --- | --- | --- | --- |
| \|0.5σ\| | 15 | 1 | 1 | 93.75 | 93.75 | 93.75 |
| \|1σ\| | 25 | 6 | 5* | 80.65 | 83.33 | 81.97 |
| \|2σ\| | 97 | 40 | 27 | 70.80 | 78.23 | 74.33 |
| \|3σ\| | 97 | 40 | 41 | 70.80 | 70.29 | 70.55 |
Conclusions
A workow has been designed with dierent techniques and strategies that has allowed not only the detection
of nearly 6000 mound features in India and Pakistan, which will allow for a better understanding of the settle-
ment distributions related to the Indus Civilization and later cultural periods, but has also provided solutions to
common problems in archaeology such as the low-density of archaeological features in large-scale surveys and
the few training data for ML models.
Historical maps constitute one of the basic sources available to both historians and archaeologists. The study
area analysed in this paper presents an excellent case. Much of the information provided by the maps cannot be
obtained using other survey methods, as the area has been systematically modified during the last century. This
is also the case for many other areas where systematic landscape modifications have been implemented and for
which historical map series exist26. These are housed in many archives and some series cover very large national
and colonial territories using very similar symbols and conventions. This study opens the door for the large-scale
automated extraction of relevant information from historical maps and, in doing so, provides a workflow and
open code that have the potential to contribute immensely to the historical sciences.
As with other large-scale site detection methods4, these DL algorithms will allow researchers to carry out
studies that could not be done before given the new amount of data obtained, facilitating the task of the archaeologist.
Furthermore, this model could be applied in other regions that have historical maps such as Syria and
Lebanon9, but particularly those areas that were also mapped by or followed the model established by the SoI. The
outputs of this study represent a powerful tool in the large-scale documentation and monitoring of archaeological
Figure15. Dierent types of hachure mound features detected aer applying the trained model. e last image
represents the third type of mound features on the maps, the shaded relief mound features, erroneously detected
as hachure but similar to them due to their characteristic pointed and circular shapes.
heritage, with much work ahead to validate the results through remote sensing, archival work, and ground survey
in collaboration with partners in India and Pakistan.
Data availability
The historical map datasets generated and/or analysed during the current study are scheduled to be made publicly
available via the British Library and Cambridge University Library digital data repositories. Until that occurs, they
are available from the corresponding author on reasonable request. The historical map mound feature dataset
generated and/or analysed during the current study is scheduled to be made publicly available via the Arches
instance hosted by the Mapping Archaeological Heritage in South Asia (MAHSA) project. Until that occurs,
it is available from the corresponding author on reasonable request. The supplementary code for the Data
Augmentation process can be found online at https://github.com/iberganzo/ArchaeolDA.
Received: 18 February 2023; Accepted: 4 July 2023
References
1. Petrie, C. A. et al. Mapping archaeology while mapping an empire: Using historical maps to reconstruct ancient settlement land-
scapes in modern India and Pakistan. Geosciences 9, 11 (2019).
2. Garcia-Molsosa, A., Orengo, H. A., Conesa, F. C., Green, A. S. & Petrie, C. A. Remote sensing and historical morphodynamics of
alluvial plains. The 1909 indus flood and the city of Dera Ghazi Khan (Province of Punjab, Pakistan). Geosciences 9, 21 (2019).
3. Green, A. S. et al. Re-discovering ancient landscapes: Archaeological survey of mound features from historical maps in northwest
India and implications for investigating the large-scale distribution of cultural heritage sites in south asia. Remote Sens. 11, 2089
(2019).
4. Berganzo-Besga, I. et al. Hybrid MSRM-based deep learning and multitemporal sentinel 2-based machine learning algorithm
detects near 10k archaeological tumuli in north-western Iberia. Remote Sens. 13, 4181 (2021).
5. Berganzo-Besga, I., Orengo, H. A., Canela, J. & Belarte, M. C. Potential of multitemporal lidar for the detection of subtle archaeo-
logical features under perennial dense Forest. Land 11, 1964 (2022).
6. Landsat Science. Landsat 1 https://landsat.gsfc.nasa.gov/satellites/landsat-1/ (2022).
7. Petrie, C.A., Abdul-Jabbar, J., Abhayan, G.S., Alam, A., Berganzo Besga, I., Campbell, R., Conesa, F., Green, A.S., Green, L.M.,
Garcia-Molsosa, A., Gerrits, P., Gregorio de Souza, J., Hameed, M., Khan, A.S., Madella, M., Orengo, H.A., Prabhakar, V.N., Rajesh,
S.V., Redhouse, D.I., Roberts, R., Samad, A., Singh, R.N., Singh, V.K., Suarez Moreno, M., Tomaney, J., & Vafadari, A. Hidden
in plain sight: The unrecognised contribution of the survey of India in the documentation of Indus civilisation settlements. Century
Celebration on Mohenjodaro (2022).
8. Davis, D. S., Gaspari, G., Lipo, C. P. & Sanger, M. C. Deep learning reveals extent of archaic Native American shell-ring building
practices. J. Archaeol. Sci. 132, 105433 (2021).
9. Orengo, H. A. et al. New developments in drone-based automated surface survey: Towards a functional and effective survey system.
Archaeol. Prospect. 28, 1–8 (2021).
10. Garcia-Molsosa, A. et al. Potential of deep learning segmentation for the extraction of archaeological features from historical map
series. Archaeol. Prospect. 28, 187–199 (2021).
11. Soroush, M., Mehrtash, A., Khazraee, E. & Ur, J. A. Deep learning in archaeological remote sensing: Automated Qanat detection
in Kurdistan region of Iraq. Remote Sens. 12, 500 (2020).
12. Verschoof van der Vaart, W., Bonhage, A., Schneider, A., Ouimet, W. & Raab, T. Automated large-scale mapping and analysis of
relict charcoal hearths in connecticut (USA) using a Deep Learning YOLOv4 framework. Archaeol. Prospect. 2022, 1–16 (2022).
13. QGIS Development Team. QGIS geographic information system. QGIS Association. http://www.qgis.org (2023).
14. Landauer, J., Hoppenstedt, B., Allgaier, J. Image segmentation to locate ancient maya architectures using deep learning. In Discover
the Mysteries of the Maya: Selected Contributions from the Machine Learning Challenge & The Discovery Challenge Workshop at
ECML PKDD 2021, (eds. Kocev, D., Simidjievski, N., Kostovska, A., Dimitrovski, I., Kokalj, Ž.) 7–12 (arXiv: Ithaca, NY, USA,
2022) arXiv:2208.03163.
Figure16. Map samples found in the test data with dierent characteristics than those used in training and
validation: (a) darker image background and (b) blurred image.
15. Waleed, A. Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow. GitHub repository.
https://github.com/matterport/Mask_RCNN (2017).
16. He, K., Gkioxari, G., Dollár, P., Girshick, R. Mask r-cnn. In Proceedings of the IEEE International Conference on Computer Vision,
2961–2969 (2017).
17. Girshick, R. Fast r-cnn. In 2015 Proceedings of the IEEE International Conference on Computer Vision, 1440–1448 (2015).
18. Ren, S., He, K., Girshick, R. & Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE
Trans. Pattern Anal. Mach. Intell. 39, 1137–1149 (2016).
19. Long, J., Shelhamer, E., Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference
on Computer Vision and Pattern Recognition, 3431–3440 (2015).
20. Dutta, A. & Zisserman, A. The VIA Annotation Software for Images, Audio and Video. In Proceedings of the 27th ACM International
Conference on Multimedia (MM ’19), Nice, France. ACM, New York, NY, USA, 4 (2019).
21. Soviany, P., Ionescu, R.T., Rota, P., Sebe, N. Curriculum learning: A survey. arXiv, arXiv:2101.10382 (2022).
22. Oksuz, K., Cam, B.C., Kalkan, S., Akbas, E. Imbalance problems in object detection: A review. arXiv, arXiv:1909.00169 (2022).
23. Luque, A., Carrasco, A., Martín, A. & de las Heras, A. The impact of class imbalance in classification performance metrics based
on the binary confusion matrix. Pattern Recognit. 91, 216–231 (2019).
24. Batista, G. E. A. P. A., Prati, R. C. & Monard, M. C. A study of the behavior of several methods for balancing machine learning
training data. ACM SIGKDD Explor. Newslett. 6(1), 20–29 (2004).
25. Berganzo-Besga, I., Orengo, H. A., Lumbreras, F., Aliende, P. & Ramsey, M. N. Automated detection and classification of multi-cell
Phytoliths using deep learning-based algorithms. J. Archaeol. Sci. 148, 105654 (2022).
26. Orengo, H. A., Krahtopoulou, A., Garcia-Molsosa, A., Palaiochoritis, K. & Stamati, A. Photogrammetric re-discovery of the hidden
long-term landscapes of western Thessaly, central Greece. J. Archaeol. Sci. 2015(64), 100–109 (2015).
Acknowledgements
e Mapping Archaeological Heritage in South Asia (MAHSA) project is funded by Arcadia, a charitable fund of
Lisbet Rausing and Peter Baldwin. is research was also partially supported by Grant PID2021-128945NB-I00,
awarded by MCIN/AEI/10.13039/501100011033, and by “ERDF A way of making Europe”. e authors acknowl-
edge the support of the Generalitat de Catalunya CERCA Program to CVC and ICAC.Finally, the authors would
like to thank Junaid Abdul Jabbar, Mou Sarmah, Ushni Dasgupta, Azadeh Vafadari, Kuili Suganya Chittiraibalan,
Arnau Garcia-Molsosa and Adam Green.
Author contributions
I.B.B. developed methods, executed research, wrote the initial draft of the paper, implemented corrections and
produced the figures; H.A.O. and F.L. planned research, developed methods and corrected the initial draft; A.A., P.J.G.,
J.G.S., A.K., R.C., M.S.M. and J.T. georeferenced maps; R.R. coordinated research; C.P. coordinated research,
planned research and acquired funding.
Competing interests
The authors declare no competing interests.
Additional information
Correspondence and requests for materials should be addressed to H.A.O.
Reprints and permissions information is available at www.nature.com/reprints.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International
License, which permits use, sharing, adaptation, distribution and reproduction in any medium or
format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons licence, and indicate if changes were made. The images or other third party material in this
article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the
material. If material is not included in the article’s Creative Commons licence and your intended use is not
permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from
the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
© The Author(s) 2023
Content courtesy of Springer Nature, terms of use apply. Rights reserved
1.
2.
3.
4.
5.
6.
Terms and Conditions
Springer Nature journal content, brought to you courtesy of Springer Nature Customer Service Center GmbH (“Springer Nature”).
Springer Nature supports a reasonable amount of sharing of research papers by authors, subscribers and authorised users (“Users”), for small-
scale personal, non-commercial use provided that all copyright, trade and service marks and other proprietary notices are maintained. By
accessing, sharing, receiving or otherwise using the Springer Nature journal content you agree to these terms of use (“Terms”). For these
purposes, Springer Nature considers academic use (by researchers and students) to be non-commercial.
These Terms are supplementary and will apply in addition to any applicable website terms and conditions, a relevant site licence or a personal
subscription. These Terms will prevail over any conflict or ambiguity with regards to the relevant terms, a site licence or a personal subscription
(to the extent of the conflict or ambiguity only). For Creative Commons-licensed articles, the terms of the Creative Commons license used will
apply.
We collect and use personal data to provide access to the Springer Nature journal content. We may also use these personal data internally within
ResearchGate and Springer Nature and as agreed share it, in an anonymised way, for purposes of tracking, analysis and reporting. We will not
otherwise disclose your personal data outside the ResearchGate or the Springer Nature group of companies unless we have your permission as
detailed in the Privacy Policy.
While Users may use the Springer Nature journal content for small scale, personal non-commercial use, it is important to note that Users may
not:
use such content for the purpose of providing other users with access on a regular or large scale basis or as a means to circumvent access
control;
use such content where to do so would be considered a criminal or statutory offence in any jurisdiction, or gives rise to civil liability, or is
otherwise unlawful;
falsely or misleadingly imply or suggest endorsement, approval , sponsorship, or association unless explicitly agreed to by Springer Nature in
writing;
use bots or other automated methods to access the content or redirect messages
override any security feature or exclusionary protocol; or
share the content in order to create substitute for Springer Nature products or services or a systematic database of Springer Nature journal
content.
In line with the restriction against commercial use, Springer Nature does not permit the creation of a product or service that creates revenue,
royalties, rent or income from our content or its inclusion as part of a paid for service or for other commercial gain. Springer Nature journal
content cannot be used for inter-library loans and librarians may not upload Springer Nature journal content on a large scale into their, or any
other, institutional repository.
These terms of use are reviewed regularly and may be amended at any time. Springer Nature is not obligated to publish any information or
content on this website and may remove it or features or functionality at our sole discretion, at any time with or without notice. Springer Nature
may revoke this licence to you at any time and remove access to any copies of the Springer Nature journal content which have been saved.
To the fullest extent permitted by law, Springer Nature makes no warranties, representations or guarantees to Users, either express or implied
with respect to the Springer nature journal content and all parties disclaim and waive any implied warranties or warranties imposed by law,
including merchantability or fitness for any particular purpose.
Please note that these rights do not automatically extend to content, data or other material published by Springer Nature that may be licensed
from third parties.
If you would like to use or distribute our Springer Nature journal content to a wider audience or on a regular basis or in any other manner not
expressly permitted by these Terms, please contact Springer Nature at
onlineservice@springernature.com
... Applications of CNN and Deep Learning are incredibly diverse. Current approaches in archaeology include the detection of archaeological features from LiDAR, satellite imagery, topographic maps, drone-based automated surveys and more (Soroush et al., 2020;Garcia-Molsosa et al., 2021;Berganzo-Besga et al., 2023;Orengo et al., 2021;Verschoof-van der Vaart et al., 2020;W. Li et al., 2023;Trier et al. 2019;Trier et al. 2021). ...
... The approach presented above is promising and can be expanded to other areas. The model can be used to create more training data and more advanced post-processing methods, such as using a mask that incorporates a slope/radius approach (Berganzo-Besga et al., 2023) or a pixel-based classification approach for trees . More diversified training data are necessary to correct the model's average precision and recall values. ...
Article
Full-text available
Qanats are a remarkable type of ancient hydraulic structure for sustainable water distribution in arid environments that use subterranean channels to transport water from highland or mountainous areas. The presence of the qanat system is marked by a line of regularly spaced shafts visible from the surface, which can be used to detect qanats using satellite imagery. Typically, qanats have been documented by field mapping or manual digitisation within a Geographic Information System (GIS) environment. This process is time-consuming due to the numerous shafts within each qanat line. However, several automated methods for detecting qanat structures have been explored, using techniques such as morphological filters, custom convolutional neural networks (CNN) and, more recently, YOLOv5 and Mask R-CNN. These approaches used high-resolution RGB images and CORONA images. However, the use of black and white CORONA in CNNs has been limited in its applicability due to a high rate of false positives. This paper explores the potential of YOLOv9 in processing the black and white HEXAGON (KH-9) high-resolution spy satellite system launched in 1971. Two areas in Afghanistan (Maiwand) and Iran (Gorgan Plain) were selected to train the system images extracted from HEXAGON imagery and artificial synthetic data. The training dataset was augmented using the Albumentation library, which increased the number of tiles used. The model was tested using two types of HEXAGON imagery for selected areas in Afghanistan (Maiwand), Iran (Gorgan Plain) and Morocco (Rissani), and CORONA imagery in Iran (Gorgan Plain). Our study provided a model capable of predicting the location of qanat shafts with a precision of over 0.881 and a recall of 0.627 for most of the case studies tested. This is the first case study aimed at detecting qanats in different landscapes using different types of satellite imagery. 
Using real, augmented, and artificial data allowed us to generalise the representation of qanats into lineal groups of circular features. Thanks to applying labelling for individual qanats and their pairs as separate classes, our approach eliminated most of the isolated and clustered false positives.
... In geoarchaeology for instance, machine learning has been used to more accurately source and classify samples of soils, minerals, and tephras [59][60][61][62][63] , improve precision in temperature estimations of heat-treated lithic items 64 , as well as to interpolate geochemical properties between samples to improve resolution of large-scale geological mapping 65 . These methods have also helped identify anthropogenic structures from aerial surveying 66 , like mounds [67][68][69][70][71][72][73][74][75] , structures [76][77][78] , desert kites 79 , irrigation systems 80 , and combustion features 81,82 . The identification of anthropogenic cut-marks on bones has also been improved with machine learning [83][84][85][86][87][88] . ...
Preprint
Full-text available
Reconciling the ever-increasing volume of new archaeological data with the abundant corpus of legacy data is fundamental to making robust archaeological interpretations. Yet, combining new and existing results is hampered by inconsistent standards in the recording and illustration of archaeological features and artefacts. Attempts at collating data from images in existing publications first involve scouring the substantial body of existing literature, followed by extracting images that require onerous manual preprocessing steps, like re-scaling, reorienting , and re-formatting. While the sample sizes of such manual analyses are curtailed by these problems, recent developments in AI and big data methods are poised to accelerate and automate large syntheses of existing data. This paper introduces an AI-assisted workflow capable of creating uniform archaeological datasets from heterogeneous published resources. The associated software (AutArch) takes large and unsorted PDF files as input, and uses neural networks to conduct image processing, object detection, and classification. Objects commonly found in archaeological catalogues-like graves, skeletons, ceramics, ornaments, stone tools, and maps-are reliably detected. Accompanying elements of the illustrations, like North arrows and scales, are automatically used for orientation and scaling. Outlines are then extracted with contour detection, allowing whole-outline morphometrics. Detected objects, contours, and other automatically retrieved data can be manually validated and adjusted via AutArch's graphical user interface. While we test this workflow on third millennium BCE Central European graves and Final Neolithic/Early Bronze Age arrowheads from Northwest Europe, this method can be applied to the vast number of artefacts and archaeological features for which shape, size, and orientation holds technological, functional, cultural, and/or temporal significance. 
This AI-assisted workflow has the potential to speed up, automate, and standardise data collection throughout the discipline, allowing more objective interpretations and freeing sample sizes from budget and time constraints.
... A recent study by Vinci et al. (2024) found that, out of 291 reviewed projects, only eleven had research areas exceeding 10,000 km², and only five of these utilized AI for automatic detection. These include Berganzo-Besga et al. (2023) in Galicia, Spain (approximately 30,000 km²); Carter, Blackadar and Conner (2021) in Pennsylvania, USA (37,000 km²); Cerrillo-Cuenca and Bueno-Ramírez (2019) on the Iberian Peninsula (30,000 km²); Stott et al. (2019) in Denmark (42,000 km²); and Verschoof-van der Vaart et al. (2023) in Connecticut, USA (12,500 km²). ...
... This workflow is based on well-established landscape archaeology approaches that can be replicated beyond this specific study case, totally or partially, in other areas of the Indus Basin, South Asia and worldwide. The workflow presented here is being further enhanced through the use of machine learning approaches [98], whose potential we have already explored both within the study area and in other areas that also have collections of historical maps that can be used for detecting archaeological sites [22]. With appropriate modifications to suit local data sources and different research agendas, this workflow also has the potential to be applied to other types of features beyond artificial mounds and river palaeochannels. ...
Article
Alluvial floodplains were one of the major venues of the development and long-term transformation of urban agrarian-based societies. The historical relationship between human societies and riverine environments created a rich archaeological record, but it is one that is not always easy to access due to the dynamism of alluvial floodplains and the geomorphological processes driven by their hydrological regimes. Alluvial floodplains are also targeted for urban and agricultural expansion, which both have the potential to pose threats to cultural heritage and the environment if not carefully managed. Analysis that combines Historical Cartography and Remote Sensing sources to identify potential archaeological sites and river palaeochannels is an important first step towards the reconstruction of settlement patterns in different historical periods and their relationship to the history of hydrological networks. We are able to use different computational methods to great effect, including algorithms to enhance the visualization of different features of the landscape, and to process large quantities of data using machine-learning-based methods. Here we integrate those methods for the first time in a single study case: a section of the Indus River basin. Using a combined approach, it has been possible to map the historical hydrological network in a detail never achieved before and identify hundreds of previously unknown potential archaeological sites. Discussing these datasets together, we address the interpretation of the archaeological record, and highlight how Remote Sensing approaches can inform future research, heritage documentation, management, and preservation.
The paper concludes with a targeted analysis of our datasets in the light of previous field-based research in order to provide preliminary insights on how long-term processes might have re-worked historical landscapes and their potential implications for the study of settlement patterns in different Historical periods in this region, thereby highlighting the potential for such integrated approaches.
Article
Full-text available
The northeastern region of Romania exhibits a notable concentration of burial mounds, many of which remain unknown. Previous research in this area has shown that most of these monuments were constructed by Early Bronze Age communities, known as the Yamnaya culture, and were later reused by subsequent civilizations up until the early medieval period. However, there has been a distinct lack of systematic efforts to document these sites, determine their chronology, or study their geomorphological characteristics. Furthermore, many of these mounds are under constant threat from natural forces and human activity, leading to irreversible damage. The present study aims to fill some of these gaps by applying an innovative approach based on high-resolution airborne sensing techniques, including oblique and vertical aerial photography, photogrammetry, and LiDAR (Light Detection and Ranging). Our main objective was to accurately identify all of the burial mounds in the Jijia River’s catchment, in order to attempt a restoration of the ancient barrow landscape of north-eastern Romania. In this sense, a preliminary review of the scientific literature revealed a discrepancy regarding the number of existing sites, their location, and their research. However, the availability of high-resolution digital elevation models from LiDAR measurements has allowed a significant increase in the number of identified sites (up to 1,660 burial mounds) and a reassessment of their spatial distribution within the workspace. Additionally, the research included an analysis of the old cartographic sources, improving the database with 131 lost sites that are no longer visible, even when using the LiDAR measurements.
Article
Material characteristics of casting moulds are crucial for understanding the evolution and diversification of bronze ritual vessel production in Bronze Age China. During relevant studies, a Back Scattered Electron (BSE) image detector is commonly employed to analyze mould microstructure, effectively revealing the volume ratios and shape features of the clay matrix, silt/sand particles, and voids. It is always challenging to analyze and cross-compare these BSE images quantitatively since they typically contain numerous phases with highly irregular shapes. Traditionally, time-consuming manual point counting or multi-step image processing was used to obtain semi-quantitative results. Addressing these challenges, we have proposed a deep learning method called BCMSegNet, an optimized Mask R-CNN-based algorithm for segmenting BSE images of bronze casting moulds and cores. Using the proposed method, key parameters, such as area, Feret diameter, roundness, and solidity of segmented particles, can be provided based on well-segmented results, even for images with complex backgrounds. Experimental outcomes show that the algorithm achieves a segmentation precision of 95% and an accuracy of around 91%, demonstrating its strong generalization capability. This study provides a significant foundation for micro-feature analysis of archaeological ceramic materials, classification of particles, and determination of technological processes in archaeological research.
Article
Purpose: This paper provides practical advice for archaeologists and heritage specialists wishing to use ML approaches to identify archaeological features in high-resolution satellite imagery (or other remotely sensed data sources). We seek to balance the disproportionately optimistic literature related to the application of ML to archaeological prospection through a discussion of limitations, challenges and other difficulties. We further seek to raise awareness among researchers of the time, effort, expertise and resources necessary to implement ML successfully, so that they can make an informed choice between ML and manual inspection approaches.
Design/methodology/approach: Automated object detection has been the holy grail of archaeological remote sensing for the last two decades. Machine learning (ML) models have proven able to detect uniform features across a consistent background, but more variegated imagery remains a challenge. We set out to detect burial mounds in satellite imagery from a diverse landscape in Central Bulgaria using a pre-trained Convolutional Neural Network (CNN) plus additional but low-touch training to improve performance. Training was accomplished using MOUND/NOT MOUND cutouts, and the model assessed arbitrary tiles of the same size from the image. Results were assessed using field data.
Findings: Validation of results against field data showed that self-reported success rates were misleadingly high, and that the model was misidentifying most features. Setting an identification threshold at 60% probability, and noting that we used an approach where the CNN assessed tiles of a fixed size, tile-based false negative rates were 95–96%, false positive rates were 87–95% of tagged tiles, while true positives were only 5–13%. Counterintuitively, the model provided with training data selected for highly visible mounds (rather than all mounds) performed worse. Development of the model, meanwhile, required approximately 135 person-hours of work.
Research limitations/implications: Our attempt to deploy a pre-trained CNN demonstrates the limitations of this approach when it is used to detect varied features of different sizes within a heterogeneous landscape that contains confounding natural and modern features, such as roads, forests and field boundaries. The model has detected incidental features rather than the mounds themselves, making external validation with field data an essential part of CNN workflows. Correcting the model would require refining the training data as well as adopting different approaches to model choice and execution, raising the computational requirements beyond the level of most cultural heritage practitioners.
Practical implications: Improving the pre-trained model’s performance would require considerable time and resources, on top of the time already invested. The degree of manual intervention required – particularly around the subsetting and annotation of training data – is so significant that it raises the question of whether it would be more efficient to identify all of the mounds manually, either through brute-force inspection by experts or by crowdsourcing the analysis to trained – or even untrained – volunteers. Researchers and heritage specialists seeking efficient methods for extracting features from remotely sensed data should weigh the costs and benefits of ML versus manual approaches carefully.
Social implications: Our literature review indicates that the use of artificial intelligence (AI) and ML approaches to archaeological prospection has grown exponentially in the past decade, approaching adoption levels associated with “crossing the chasm” from innovators and early adopters to the majority of researchers. The literature itself, however, is overwhelmingly positive, reflecting some combination of publication bias and a rhetoric of unconditional success. This paper presents the failure of a good-faith attempt to utilise these approaches as a counterbalance and cautionary tale to potential adopters of the technology. Early-majority adopters may find ML difficult to implement effectively in real-life scenarios.
Originality/value: Unlike many high-profile reports from well-funded projects, our paper represents a serious but modestly resourced attempt to apply an ML approach to archaeological remote sensing, using techniques like transfer learning that are promoted as solutions to time and cost problems associated with, e.g. annotating and manipulating training data. While the majority of articles uncritically promote ML, or only discuss how challenges were overcome, our paper investigates how – despite reasonable self-reported scores – the model failed to locate the target features when compared to field data. We also present time, expertise and resourcing requirements, a rarity in ML-for-archaeology publications.
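The tile-based validation described in this abstract can be illustrated with a minimal Python sketch. It assumes that each fixed-size tile carries a model probability and a field-verified label, that a tile is "tagged" when its probability reaches the 60% threshold, that true and false positives are expressed as shares of tagged tiles, and that the false negative rate is the share of field-verified mound tiles left untagged. The abstract does not spell out these denominators, and the function name and toy data are hypothetical.

```python
def tile_rates(probabilities, is_mound, threshold=0.6):
    """Compare thresholded tile predictions with field-verified labels.

    Assumption: shares are computed over tagged tiles (positives) and
    over field-verified mound tiles (false negatives).
    """
    tagged = [p >= threshold for p in probabilities]
    tp = sum(1 for t, m in zip(tagged, is_mound) if t and m)
    fp = sum(1 for t, m in zip(tagged, is_mound) if t and not m)
    fn = sum(1 for t, m in zip(tagged, is_mound) if not t and m)
    n_tagged = tp + fp   # tiles the model flagged as mounds
    n_mounds = tp + fn   # tiles that contain a mound in the field data
    return {
        "true_positive_share": tp / n_tagged if n_tagged else 0.0,
        "false_positive_share": fp / n_tagged if n_tagged else 0.0,
        "false_negative_rate": fn / n_mounds if n_mounds else 0.0,
    }


# Hypothetical toy data: five tiles, two of which contain a mound.
probabilities = [0.9, 0.7, 0.65, 0.3, 0.1]
is_mound = [True, False, False, True, False]
rates = tile_rates(probabilities, is_mound)
```

Computing the rates against field data rather than against a held-out portion of the training cutouts is the point of the exercise: the abstract reports that self-reported scores looked reasonable while field validation did not.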
Article
Surface archaeological survey has been widely established as the principal method for the regional study of Mediterranean diachronic landscapes. Before the introduction of GPS and digital, GIS‐based recordings in the late 1990s, survey projects employed analogue recording strategies (e.g. personal notebooks, printed forms and cartographic materials) resulting in low‐precision spatial datasets. These archives, termed here legacy survey data, can today be visualized and analysed using computational tools. The aim of the present work is to exemplify how legacy data can be reused and reproduced to explore unknown aspects of past survey projects. It showcases a multi‐source, GIS‐structured workflow to manage and re‐evaluate data from the region of Grevena, north‐western Greece, where a largely unpublished all‐period extensive survey titled the Grevena Project has pinpointed a rich cultural record that remains unavailable to the archaeological community. The publications lacked critical evaluation of the survey results and significance, such as accurate site locations, size and chronology as well as a description of the field collection strategies used. To recover and combine these data into a single geodataset, a three‐step workflow was created, including the systematic recording of collected artefacts, the deployment of archival and remote‐sensing resources (e.g. georeferenced cartographic and photographic materials and satellite imagery) and the development of a new extensive survey in selected areas for validation purposes. Results indicated heterogeneity in the techniques employed by the Grevena Project for site recognition. They also brought to light an important assemblage of previously unrecorded Palaeolithic finds. Furthermore, large‐scale geomorphological analysis using geomorphometric approaches demonstrated an irregularly high density of sites in elevated areas, which is considered a surveying bias.
Remote sensing sources, including archival aerial photographs, highlighted regional landscape changes (e.g. in forest coverage) and revealed previously unmapped architectural remains. Finally, the new survey around Ayios Georgios led to the discovery of several new sites, pointing to much more complex dynamics than originally considered during the Grevena Project.
Article
In the past decade, numerous studies have successfully mapped thousands of former charcoal production sites (also called relict charcoal hearths) manually using digital elevation model (DEM) data from various forested areas in Europe and the north‐eastern USA. The presence of these sites causes significant changes in the soil physical and chemical properties, referred to as legacy effects, due to high amounts of charcoal that remain in the soils. The overwhelming number of charcoal hearths found in landscapes necessitates the use of automated methods to map and analyse these landforms. We present a novel approach, based on open source data and software, to automatically detect relict charcoal hearths in large‐scale LiDAR datasets (visualized with Simple Local Relief Model). In addition, the approach simultaneously provides both general and domain‐specific information, which can be used to further study legacy effects. Different versions of the methodology were fine‐tuned on data from north‐western Connecticut and subsequently tested on two different areas in Connecticut. The results show that these perform adequately, with F1‐scores ranging between 0.21 and 0.76, although additional post‐processing was needed to deal with variations in LiDAR quality. After testing, the best performing version of the prediction model (with an average F1‐score of 0.56) was applied to the entire state of Connecticut. The results show a clear overlap with the known distribution of charcoal hearths in the state, while new concentrations were found as well. This shows the usability of the approach on large‐scale datasets, even when the terrain and LiDAR quality vary.
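Several abstracts in this list report precision, recall and F1-scores. As a reminder of how these standard metrics relate, here is a minimal Python sketch; the function name and the detection counts are hypothetical and are not taken from any of the studies listed here.

```python
def precision_recall_f1(true_pos, false_pos, false_neg):
    """Standard detection metrics from counts of matched and unmatched objects."""
    detected = true_pos + false_pos   # everything the model reported
    actual = true_pos + false_neg     # everything present in the reference data
    precision = true_pos / detected if detected else 0.0
    recall = true_pos / actual if actual else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 8 correct detections, 2 false alarms, 8 missed sites.
precision, recall, f1 = precision_recall_f1(8, 2, 8)
```

Because F1 is a harmonic mean, it stays low whenever either precision or recall is low, which is why it is often preferred to a single detection rate when targets are sparse.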
Article
This paper presents a method for the merging of lidar-derived point clouds of the same area taken at different moments, even when these are not co-registered. The workflow also incorporates the filtering of vegetation allowing the classification of unclassified point clouds using the ground points of reliable coverages. The objective is to produce a digital terrain model by joining all ground points to generate a higher resolution model than would have been possible using a single coverage. The workflow is supplemented by a multi-scale relief visualisation tool that allows for better detection of archaeological micro-reliefs of variable size even in areas of complex topography. The workflow is tested in six Iberian Iron Age sites, all of them located in mountain areas with dense Mediterranean perennial forests and shrub vegetation.
Article
This paper presents an algorithm for large-scale automatic detection of burial mounds, one of the most common types of archaeological sites globally, using LiDAR and multispectral satellite data. Although previous attempts were able to detect a good proportion of the known mounds in a given area, they still presented high numbers of false positives and low precision values. Our proposed approach combines random forest for soil classification using multitemporal multispectral Sentinel-2 data and a deep learning model using YOLOv3 on LiDAR data previously pre-processed using a multi-scale relief model. The resulting algorithm significantly improves previous attempts with a detection rate of 89.5%, an average precision of 66.75%, a recall value of 0.64 and a precision of 0.97, which allowed, with a small set of training data, the detection of 10,527 burial mounds over an area of nearly 30,000 km², the largest in which such an approach has ever been applied. The open code and platforms employed to develop the algorithm allow this method to be applied anywhere LiDAR data or high-resolution digital terrain models are available.
Conference Paper
Exploration of the Maya forest region remotely through machine learning has recently accelerated. Using experts to manually look at satellite data is time-consuming and expensive. The machine learning competition Discover the Mysteries of the Maya addresses this problem and calls for improvements in the performance of state-of-the-art models that automatically detect objects of interest using satellite images. With a given LiDAR image, the model should detect three classes of objects: aguadas, buildings and platforms. We have set up a pipeline that essentially consists of three steps. First, we generate synthetic data in three different ways to increase the training set. In the second step, we mix them with the real training data and then train an ensemble of DeepLabV3+ and HRNet networks. In the third step, we apply thresholds to improve the segmentation masks. We achieved an average intersection over union (IoU) of 0.8275 for all three classes and the best score of 0.7569 for the building class.
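The thresholding and scoring steps mentioned in this abstract can be sketched in a few lines of Python. This is only an illustration of the intersection-over-union metric applied to a thresholded probability mask: the 0.5 threshold, the function names and the toy masks are assumptions, and the abstract does not state which thresholds the authors used.

```python
def binarize(prob_mask, threshold=0.5):
    """Turn per-pixel probabilities into a 0/1 mask (threshold is an assumption)."""
    return [[1 if p >= threshold else 0 for p in row] for row in prob_mask]


def iou(pred_mask, true_mask):
    """Intersection over union of two binary masks of equal size."""
    pairs = [(p, t) for pr, tr in zip(pred_mask, true_mask) for p, t in zip(pr, tr)]
    intersection = sum(1 for p, t in pairs if p and t)
    union = sum(1 for p, t in pairs if p or t)
    # Two empty masks agree perfectly, so score them as 1.0.
    return intersection / union if union else 1.0


def mean_iou(scores):
    """Average IoU over classes (e.g. one score per object class)."""
    return sum(scores) / len(scores)


# Hypothetical 2x2 example for a single class.
probabilities = [[0.9, 0.2], [0.6, 0.4]]
ground_truth = [[1, 0], [0, 1]]
score = iou(binarize(probabilities), ground_truth)
```

Raising or lowering the threshold changes which pixels enter the predicted mask, so it trades false positives against missed pixels and shifts the IoU accordingly.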
Article
In the mid-Holocene (5000-3000 cal B.P.), Native American groups constructed shell rings, a type of circular midden, in coastal areas of the American Southeast. These deposits provide important insights into Native American socioeconomic organization but are also quite rare: only about 50 such rings have been documented to date. Recent work using automated LiDAR analysis demonstrates that many more shell rings likely exist than are currently recorded in state archaeological databases. Here, we use deep learning, a form of machine intelligence, to detect shell ring deposits and identify their geographic range in LiDAR data from South Carolina. We corroborate our results using synthetic aperture radar (SAR), multispectral data, and a random forest analysis. We conclude that a greater number of shell rings exist and that their distribution expanded further north than currently documented. Our evidence suggests that ring-construction was a more widespread and common practice during the mid-Holocene.
Article
This paper presents new developments on drone-based automated survey for the detection of individual items or fragments of material culture visible on the ground surface. Since the publication of our original proof of concept, awarded with the Journal of Archaeological Science and Society for Archaeological Sciences Emerging Investigator Award 2019, additional funding has allowed us to implement a series of improvements to the method. These aim to improve detection capabilities and the extraction of items' shapes and increase flight autonomy, control, area covered per flight and the type of environments in which the method can be applied while reducing computing needs, processing time and expertise necessary for its application. This paper provides an account of the methods followed to achieve these objectives, their preliminary results and the current development for their implementation into a free and open-source system that can be used by the archaeological community at large.
Article
Historical maps present a unique depiction of past landscapes, providing evidence for a wide range of information such as settlement distribution, past land use, natural resources, transport networks, toponymy and other natural and cultural data within an explicitly spatial context. Maps produced before the expansion of large-scale mechanized agriculture reflect a landscape that is lost today. Of particular interest to us is the great quantity of archaeologically relevant information that these maps recorded, both deliberately and incidentally. Despite the importance of the information they contain, researchers have only recently begun to automatically digitize and extract data from such maps as coherent information, rather than manually examine a raster image. However, these new approaches have focused on specific types of information that cannot be used directly for archaeological or heritage purposes. This paper provides a proof of concept of the application of deep learning techniques to extract archaeological information from historical maps in an automated manner. Early twentieth century colonial map series have been chosen, as they provide enough time depth to avoid many recent large-scale landscape modifications and cover very large areas (comprising several countries). The use of common symbology and conventions enhances the applicability of the method. The results show deep learning to be an efficient tool for the recovery of georeferenced, archaeologically relevant information that is represented as conventional signs, line-drawings and text in historical maps. The method can provide excellent results when an adequate training dataset has been gathered and is therefore at its best when applied to the large map series that can supply such information.
The deep learning approaches described here open up the possibility to map sites and features across entire map series much more quickly and coherently than other available methods, opening up the potential to reconstruct archaeological landscapes at continental scales.
Article
In this paper, we report the results of our work on the automated detection of qanat shafts in Cold War-era CORONA satellite imagery. The increasing quantity of air and space-borne imagery available to archaeologists and the advances in computational science have created an emerging interest in automated archaeological detection. Traditional pattern recognition methods proved to have limited applicability for archaeological prospection for a variety of reasons, including a high rate of false positives. Since 2012, however, a breakthrough has been made in the field of image recognition through deep learning. We have tested the application of deep convolutional neural networks (CNNs) for automated remote sensing detection of archaeological features. Our case study is the qanat systems of the Erbil Plain in the Kurdistan Region of Iraq. The signature of the underground qanat systems in the remote sensing data is the semi-circular openings of their vertical shafts. We chose to focus on qanat shafts because they are promising targets for pattern recognition and because the richness and the extent of the qanat landscapes cannot be properly captured across vast territories without automated techniques. Our project is the first effort to use automated techniques on historic satellite imagery that takes advantage of neither the spectral resolution of the imagery nor very high (sub-meter) spatial resolution.
Article
Incomplete datasets curtail the ability of archaeologists to investigate ancient landscapes, and there are archaeological sites whose locations remain unknown in many parts of the world. To address this problem, we need additional sources of site location data. While remote sensing data can often be used to address this challenge, it is enhanced when integrated with the spatial data found in old and sometimes forgotten sources. The Survey of India 1″ to 1-mile maps from the early twentieth century are one such dataset. These maps documented the location of many cultural heritage sites throughout South Asia, including the locations of numerous mound features. An initial study georeferenced a sample of these maps covering northwest India and extracted the location of many potential archaeological sites, referred to here as historical map mound features. Although numerous historical map mound features were recorded, it was unknown whether these locations corresponded to extant archaeological sites. This article presents the results of archaeological surveys that visited the locations of a sample of these historical map mound features. These surveys revealed which features are associated with extant archaeological sites, which were other kinds of landscape features, and which may represent archaeological mounds that have been destroyed since the maps were completed nearly a century ago. Their results suggest that there remain many unreported cultural heritage sites on the plains of northwest India and that the mound features recorded on these maps best correlate with older archaeological sites. They also highlight other possible changes in the large-scale and long-term distribution of settlements in the region. The article concludes that northwest India has witnessed profound changes in its ancient settlement landscapes, creating a long-term sequence of landscapes that link the past to the present and create a foundation for future research and preservation initiatives.
Article
This paper presents an algorithm for automated detection and classification of multi-cell phytoliths, one of the major components of many archaeological and paleoenvironmental deposits. This identification, based on phytolith wave pattern, is made using a pretrained VGG19 deep learning model. This approach has been tested on three key phytolith genera for the study of agricultural origins in Near East archaeology: Avena, Hordeum and Triticum. This classification has also been validated at species level using Triticum boeoticum and dicoccoides images. Due to the diversity of microscopes, cameras and chemical treatments that can influence images of phytolith slides, three types of data augmentation techniques have been implemented: rotation of the images at 45-degree angles, random colour and brightness jittering, and random blur/sharpen. The implemented workflow has resulted in an overall accuracy of 93.68% for phytolith genera, improving on previous attempts. The algorithm has also demonstrated its potential to automate the classification of phytolith species with an overall accuracy of 100%. The open code and platforms employed to develop the algorithm ensure the method's accessibility, reproducibility and reusability.
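The augmentation strategy listed in this abstract can be illustrated with a small, dependency-free Python sketch. It is not the authors' code: it works on a grayscale image stored as a list of rows, covers only rotation in 45-degree steps, brightness jittering and an optional blur (colour jittering and sharpening are omitted), and all function names, the jitter range of 30 and the 50% blur probability are assumptions made for illustration.

```python
import math
import random


def rotate(image, angle_deg, fill=0):
    """Rotate a 2D grayscale image about its centre (nearest-neighbour sampling)."""
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    cos_a = math.cos(math.radians(angle_deg))
    sin_a = math.sin(math.radians(angle_deg))
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Inverse mapping: find the source pixel for each output pixel.
            dx, dy = x - cx, y - cy
            sx = round(cx + cos_a * dx + sin_a * dy)
            sy = round(cy - sin_a * dx + cos_a * dy)
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = image[sy][sx]
    return out


def jitter_brightness(image, max_delta, rng):
    """Add one random offset to every pixel and clamp to the 0..255 range."""
    delta = rng.randint(-max_delta, max_delta)
    return [[min(255, max(0, v + delta)) for v in row] for row in image]


def box_blur(image):
    """Replace each pixel by the mean of its in-bounds 3x3 neighbourhood."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [image[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(round(sum(window) / len(window)))
        out.append(row)
    return out


def augment(image, rng):
    """Produce one jittered, optionally blurred variant per 45-degree rotation."""
    variants = []
    for angle in range(0, 360, 45):
        variant = jitter_brightness(rotate(image, angle), 30, rng)
        if rng.random() < 0.5:  # assumed blur probability
            variant = box_blur(variant)
        variants.append(variant)
    return variants


# Hypothetical 3x3 image and a seeded generator for repeatable output.
sample = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
variants = augment(sample, random.Random(0))
```

Rotations that are not multiples of 90 degrees leave the corners of the frame empty (filled with `fill` here); real pipelines usually crop or pad the image to deal with this.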