Convolutional Neural Networks for Archaeological Site Detection – Finding “Princely” Tombs
Gino Caspari*
Department of Archaeology, University of Sydney
and
Institute of Archaeological Sciences, University of Bern
Pablo Crespo
Ph.D. Program in Economics, The Graduate Center, City University of New York
ABSTRACT
Creating a quantitative overview of the early Iron Age heritage of the Eurasian steppes is a difficult task due to the vastness of the ecological zone and the often problematic access. Remote sensing based detection on open-source high-resolution satellite data, in combination with convolutional neural networks (CNNs), provides a potential solution to this problem. We create a CNN trained to detect early Iron Age burial mounds in freely available optical satellite data. Compared to other detection algorithms trained on the same dataset, the CNN provides a superior method for archaeological site detection: across all comparison metrics (precision, recall, and F1 score) the CNN performs best.
Keywords: CNN; object detection; archaeological remote sensing; convolutional neural networks; site detection
*Email: gino.caspari@sydney.edu.au.
Email: pcrespo@gradcenter.cuny.edu.
1 INTRODUCTION
The archaeology of the Early Iron Age in the Eurasian steppe deals with a vast and archaeologically unexplored space between Eastern Europe and Mongolia. Despite the amount of research which has been conducted by scholars of the former USSR and the recent wave of new research coming out of these areas, a quantifiable understanding of the wealth of cultural heritage the Eurasian steppe harbors has yet to be achieved. One of the problems which hinders researchers in gaining a wider understanding is the fact that the ancient cultural phenomena of the Early Iron Age did not neatly adhere to modern nation state borders (Figure 1). The current administrative, linguistic, and institutional fragmentation of this vast ecological zone, the steppe, makes research on the ground difficult. Remote sensing in combination with automatic or semi-automatic approaches for object detection has been established as a tool which largely circumvents these problems and is able to provide the basis for solutions (Caspari et al., 2014). Rooted in archaeological field research, we combine open source data with convolutional neural networks (CNNs) in order to harness the newest technological advances and use them to detect elite tombs of the Early Iron Age in the Eurasian steppe.
When it comes to restrictive access for foreign researchers, the Xinjiang Uyghur Autonomous Region is perhaps the most extreme example in the region. It is known for its political and ethnic issues (Clarke, 2008) and recently received international media attention due to its increasingly oppressive counter-terrorism campaigns (Roberts, 2018). Permits for archaeological fieldwork are notoriously hard to obtain in the first place, and sporadic eruptions of ethnic conflict between the Uyghur minority and the Han Chinese majority in southern Xinjiang can abort long-planned projects at the last minute. Militarized border zones geographically curtail the areas archaeologists can work in. Even receiving a permit is not necessarily a guarantee that a field campaign can be conducted as planned, since the security apparatus is suspicious of any research activity by foreigners. Remote sensing mitigates these problems of access, and the quality of publicly available high-resolution satellite data for Xinjiang has increased dramatically over the past years (Caspari, 2018).
Figure 1: The area of interest in the eastern Central Asian steppes
CNNs have become the standard tool in computer vision applications in recent years. Their particular use in pattern and shape recognition was popularized by the LeNet-5 architecture for recognizing handwritten digits (Lecun et al., 1999). Their usefulness is predicated on their ability to take inputs in the shape of multidimensional matrices (tensors), allowing them to work with patterns in multiple directions. Pixels adjacent to each other influence what is identified. Most other machine learning algorithms used in image recognition work with inputs that take the shape of single row vectors, eliminating the ability to harness the information given by adjacent pixels in an image that are not in the same row (the pixels right below, above, or set diagonally). Hence, CNNs are much more sensitive to subtle patterns in images.
CNNs are a versatile solution to a plethora of problems in archaeology and work well when plenty of data is available. This comes at the cost of not being able to fully and analytically understand the process of solving the problem. The outcomes, however, can be qualitatively assessed and the solution is reproducible. Consistent with their versatility, CNNs have been used in different archaeological subfields and for a diverse range of tasks, from sex determination of skeletal remains to solving mapping tasks and extracting pottery depictions from archaeological publications.
Unsurprisingly, as ceramics are one of the main categories of archaeological material, research on them has already seen a wide application of CNNs. From recognizing vessels to classifying ceramic form, to understanding and classifying the structure of ceramics, CNNs have been useful in solving complex problems. Benhabiles and Tabia (2016) build a CNN to design local descriptors for content-based retrieval of three-dimensional (3-D) vessel replicas. Pasquet et al. (2017) use a CNN to detect amphorae in an underwater setting, correctly mapping around 90% of the vessels. Hein et al. (2018) automatically extract and classify ceramics based on textures. Chetouani et al. (2018) enlist the help of a CNN in order to classify shards and understand the movement of potters. The ArchAIDE project experiments with CNNs to create an as-automated-as-possible tool for the classification and interpretation of shards (Gualandi et al., 2016). A similar application is envisioned by Tyukin et al. (2018) with the project Arch-I-Scan, which aims to automatically classify Roman pottery.
Interpreting other archaeological classes of information with CNNs is still in its infancy, but a number of examples can give the reader an idea of what might be possible if expertly human-labeled datasets are combined with CNNs. Byeon et al. (2019) automatically identify and classify cut marks on bones. The authors manage to demonstrate that CNNs recognize and classify marks with a much higher accuracy rate than human experts. CNNs also perform exceptionally well when tasked with determining the sex of skeletal remains based on CT scans, thereby eliminating human bias (Bewes et al., 2019). In the analysis and interpretation of ancient scripts, CNNs are also beginning to make an impact. First attempts have been made in indexing Mayan hieroglyphs (Roman-Rangel and Marchand-Maillet, 2016; Can et al., 2018) and creating a standardized corpus of graphemes for the Indus Valley script (Palaniappan and Adhikari, 2017). Further applications of CNNs in classifying, transcribing, and ultimately translating scripts such as cuneiform are to be expected.
CNNs have so far found their widest application in the area of archaeological remote sensing. This subfield of archaeology has the advantage of already working within a data-focused framework where classification and mapping tasks are common. The application of CNNs thus comes as an obvious extension of existing automated and semi-automated methods. Especially with LiDAR data collection, the data volume is becoming too large to be analyzed through a manual approach. CNNs help to mitigate this problem while simultaneously maintaining a consistent approach. Trier et al. (2019) present a case study mapping a number of archaeological object classes on an island in Scotland based on airborne laser scanning data. Guyot et al. (2018) detect Neolithic burial mounds in a LiDAR-derived digital elevation model. Kramer et al. (2017) combine aerial imagery and LiDAR data to detect archaeological structures using previously identified archaeological sites as training data. Other non-invasive methods like geophysical prospection, in particular ground penetrating radar (Travassos et al., 2018; Ishitsuka et al., 2018; Pham and Lefèvre, 2018), have also seen the application of CNNs. Our own case study in this paper belongs to the wide field of CNN applications which arose from image processing, conceptually close to well-known and widely applied tasks like the recognition of faces and vehicles in images. CNNs can be useful in any area where remote sensing data needs to be searched for archaeological structures. The efficient processing of image data even allows for real-time decision making: Rutledge et al. (2018) present a robot system for the autonomous surveying of underwater sites, including path planning and the acquisition of high-resolution sonar data.
Even art historical classifications and comparisons are supported by CNNs. With the appropriate amount of data, it becomes feasible to define stylistic affinity. First applications can be seen in the classification of wall paintings in Pompeii (Schoelz, 2018) and in the approach of Li et al. (2018) towards dating the Mogao Grottoes wall paintings based on drawing styles defined by a CNN. Wang et al. (2017) use CNNs for defining similarities of Bodhisattva head images at the Dazu Rock Carving site and thus contribute to the reconstruction of some of the damaged rock carvings. An application of CNNs in the restoration of damaged archaeology can also be seen in a paper by Hermoza and Sipiran (2017), where the authors try to predict the missing geometry of damaged archaeological objects, opening a promising avenue of research into computer-supported reconstruction and restoration of archaeological artifacts.
Wherever the exploration and analysis of large data sets is aided by recognizing complex patterns, CNNs can be helpfully employed. This leads to creative applications like a study by Graham (2018), which identifies sales of human remains on social media platforms using CNNs to detect patterns in a combination of images and text, ultimately aiding the reconstruction of sales networks.
2 THE FIELD ARCHAEOLOGICAL FOUNDATION
The Dzungaria Landscape Project, first established in 2014 (Caspari et al., 2017), relied on a large-scale automated survey by means of a trained Hough Forest algorithm (Caspari et al., 2014). Since then, machine learning has made enormous progress and the quality of the freely available satellite imagery has increased substantially. Through an intensive on-ground survey, the project was able to obtain a dataset of archaeological structures in the foothills of the Chinese Altai Mountains. Accumulations of very large Early Iron Age burial mounds caught the attention of the researchers early on (Figure 2 and Figure 3). It soon became clear that the southern Altai Mountains, in particular the area around Heiliutan, were a focus of intense funerary building activity, especially during the first millennium BCE (Caspari et al., 2017). A number of different Early Iron Age material cultures of the first millennium BCE can be identified (van Geel et al., 2004). Here, we focus specifically on the funerary architecture of the Saka culture due to its relative homogeneity. There is a plethora of architectural remains from the Early Iron Age present in the survey area, but many of them are too small to be reliably detected in open source optical satellite data (Caspari, 2018). By far the most dominant anthropogenic features of the landscape are large burial mounds with circular ditches around them.
Figure 2: Map generated during the 2015 survey of the Heiliutan Valley in northern Xinjiang. Large Saka burial mounds tend to cluster. Dark grey = mound. Light grey = ditch.
These monuments, of which 59 were mapped during the field surveys (Caspari et al., 2017; Caspari, Forthcoming), bear a striking resemblance to so-called Saka burials from the Semirechye (eastern Kazakhstan), the northern Tianshan, and the Ili Valley. The term “Saka” is a relatively unspecific ethnic term stemming from Persian sources, as P'iankov (1994) elaborates, and thus should only be used with the appropriate care. Over decades of archaeological research in what is now eastern Kazakhstan, the term has, however, come to denote a specific Early Iron Age material culture and is seen as a technical term among many researchers without implying the potentially problematic ethnic connotations. The Saka material culture in eastern Kazakhstan is dated between the 7th/6th cent. BCE and the 3rd cent. BCE (Parzinger, 2011). Saka burials have so far mainly been known from the Semirechye (Davis-Kimball, 1991; Gass, 2011; Nagler, 2009; Nagler et al., 2010) and have only recently been compiled in a large study by Gass (2016).
The connections of Saka-related material culture into northern Xinjiang have been analyzed (Davis-Kimball, 1991; Chen and Hiebert, 1995) but, due to the fragmentary nature of archaeological data in Xinjiang, have been assumed to be mainly confined to the westernmost stretches of Xinjiang, namely the Ili Valley and the northern Tianshan. Older Chinese research has looked at these connections from the eastern side (Wang, 1985), working on a number of sites which show clear relations to eastern Kazakhstan like Tiemulike (Institute of Archaeology of the Xinjiang Academy of Social Sciences, 1988), Dacaotan (Institute of Archaeology of the Xinjiang Academy of Social Sciences, 1985), and Zhongyangchang (Institute of Archaeology of the Xinjiang Academy of Social Sciences, 1986). The architectural features of the mounds in the Heiliutan Valley, however, suggest a strong cultural connection during the middle of the first millennium BCE all the way into the foothills of the Chinese Altai Mountains.
Figure 3: Architectural features of Saka burial mounds.
The large burial mounds of the Saka material culture usually were built from a mixture of pebbles, larger round stones, and earth from the alluvial terraces. Mounds are typically elevated and surrounded by circular rings of stones or circular ditches (Figure 3). Both ditch and mound are clearly visible in open source optical satellite data. The profile of the Saka burial mounds typically shows steep sides (sometimes three steep sides and one with a gentler slope) and a flat top. Maximum diameters in the Heiliutan area are typically between 15.5 m and 34.1 m (89.5%) and therefore well within the range of detectable objects in open-source satellite imagery (Figure 4). A group of outliers has diameters of over 40 m. The average diameter of a Saka mound in the Heiliutan Valley is 27.93 m (median 26.8 m). The heights of these burial mounds average 1.97 m (median 1.4 m).
Figure 4: Scatterplot of Saka burial mound diameters; note the cluster of extraordinarily large mounds which clearly set themselves apart from the smaller ones. These “princely” tombs are easily recognizable in open source remote sensing data.
The largest mounds have a height of up to 6.5 m. Both diameters and heights of Saka burial mounds in the Chinese Altai are comparable to Saka burial mounds from Issyk, Kegen, and other cemeteries with princely tombs (Gass, 2011; Samashev, 2007). The Saka burials of the Heiliutan area are all practically identical in their composition of building materials and the profile of the mound. The largest mound has a diameter of 53.5 m, a height of 6.0 m, and its circular ditch measures 91.5 m across. This type of burial usually has a 5:3 ratio between circular ditch diameter and mound diameter, which again matches Saka burials from the Semirechye (Gass, 2011). The large accumulation of Saka burials (Figure 2), with a length of almost 2 km, is visible from afar and is one of the dominant archaeological places within the landscape of the Heiliutan Valley. One of these monuments was excavated in 2016 by the Institute of Archaeology of the Xinjiang Academy of Social Sciences but has yet to be published; like many other burial mounds in the area of interest, the grave was unfortunately looted.
3 CONVOLUTIONAL NEURAL NETWORKS
CNNs are a specific type of neural network architecture, popularized by Lecun et al. (1999), which can take grid-like inputs. Our particular case is a two-dimensional grid of pixels, in which each pixel can be considered a source of information in the same way as a cell in a row of tabular data would be. Note that images can be interpreted as numerical grids if each pixel on each channel (RGB) is given a numerical value based on the intensity of the color from 0 to 255. In order to understand how CNNs work, we define them as the junction of three different operating components, or types of “layers”:
convolutional layers
pooling layers
fully connected layers
The convolutional and pooling layers are used to identify and summarize patterns in the data. The fully connected layers use these summaries as inputs to a classification problem, helping us determine whether the input (in this case an image) belongs to a specific class based on the model. An example diagram of these architectures is presented in Figure 5.
Figure 5: CNN architecture example diagram
3.1 CONVOLUTIONAL LAYERS
The network uses convolutional layers to detect simple features or patterns in the data. The patterns can be small and simple, but the combination of multiple simple patterns allows for the search of complex forms.
Each convolutional layer is composed of two stages: convolution and detection. In the first stage a set of convolution operations are run on the input grid. A kernel or filter is moved sequentially over the input, generating an output at each position it takes. These are defined by:

h_{i,j} = \sum_{k=1}^{m} \sum_{l=1}^{m} w_{k,l} \, x_{i+k-1,\, j+l-1}

where h_{i,j} is the output of the convolution at position (i,j), x_{i+k-1,j+l-1} is the portion of the input grid over which the filter is applied, w_{k,l} is the filter value at position (k,l), and m determines the height and width of the filter.
Hence, the filter is a square grid of weights, which is applied to the larger input grid to highlight specific patterns within it. The higher the value of a convolution operation, the higher the chance that the pattern the filter searches for is present. Figure 6 illustrates this process with an example. We can then use different filters to find different patterns. For example, a filter of the form:

0 1 0
0 1 0
0 1 0

would be used to identify vertical lines.
Filters can be, and often are, initialized at random to pick up on many varied, subtle patterns within the input grid. Each convolutional layer runs several filters on the input and outputs a grid for each.
Figure 6: Filter applied over a matrix
For the detection or activation stage, the results from the convolution stage are passed through an activation function. We use the ReLU (rectified linear unit), which is defined as:

\sigma(x) = \max(0, x)

This function keeps all non-negative values and sets negative values to zero. Since the filters can have negative values, this activation allows for extra salience of patterns. After activation, the outputs of the convolutional layer are used as inputs for the pooling layers.
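To make the convolution and detection stages concrete, the following short Python (NumPy) sketch, written here purely for illustration and not taken from the authors' implementation, applies the 3 × 3 vertical-line filter from the example above to a small input grid and then passes the result through a ReLU activation.

import numpy as np

def convolve2d(grid, kernel):
    # "Valid" 2-D convolution (no padding, stride 1), following the formula above.
    m = kernel.shape[0]
    out_h = grid.shape[0] - m + 1
    out_w = grid.shape[1] - m + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(kernel * grid[i:i + m, j:j + m])
    return out

def relu(x):
    # Detection/activation stage: keep non-negative values, zero out the rest.
    return np.maximum(0, x)

# Toy 5 x 5 input with a "vertical line" of high-intensity pixels in the middle column.
grid = np.array([[0, 0, 9, 0, 0],
                 [0, 0, 9, 0, 0],
                 [0, 0, 9, 0, 0],
                 [0, 0, 9, 0, 0],
                 [0, 0, 9, 0, 0]])

# The vertical-line filter discussed in the text.
kernel = np.array([[0, 1, 0],
                   [0, 1, 0],
                   [0, 1, 0]])

activated = relu(convolve2d(grid, kernel))
print(activated)  # strongest responses where the filter overlaps the vertical line

The same loop run with a randomly initialized kernel would respond to a different, possibly much subtler, pattern, which is why each convolutional layer applies many filters in parallel.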
3.2 POOLING LAYERS
A pooling layer summarizes the resulting activated grids; this work uses max pooling. A new grid is constructed from each activated grid by assigning each of its entries the maximum value of a 2 × 2 subgrid. An example is shown in Figure 7.
Figure 7: 2 × 2 max pooling
At this point, the practitioner has two choices: to summarize the results once more through a fully connected layer (see Section 3.3) or to repeat the process of passing the outputs through a convolutional layer and pooling layer once again. This is what is meant by making a network “deeper.” Passing the data through extra convolutional and pooling layers allows for further and more subtle evaluation of patterns. This is said to elevate the complexity of the model.
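A 2 × 2 max-pooling step can likewise be written in a few lines of NumPy. The sketch below is an illustration only; it assumes the activated grid has even dimensions and keeps the maximum of each non-overlapping 2 × 2 subgrid.

import numpy as np

def max_pool_2x2(grid):
    # Summarize an activated grid by taking the maximum of each 2 x 2 subgrid.
    h, w = grid.shape
    assert h % 2 == 0 and w % 2 == 0, "this sketch assumes even dimensions"
    return grid.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

activated = np.array([[1, 3, 2, 0],
                      [4, 6, 1, 1],
                      [0, 2, 9, 5],
                      [1, 1, 3, 7]])

print(max_pool_2x2(activated))
# [[6 2]
#  [2 9]]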
3.3 FULLY CONNECTED LAYERS
Fully connected layers have the basic structure of artificial neural networks or multilayer perceptrons. Their task is to take the outputs from the last pooling layer and classify them into specific categories. Before passing the grids resulting from the pooling layers to the fully connected layers, the grids are “flattened,” meaning the results from all the resulting grids are combined into a single row vector. The resulting elements of the vector produced after the flattening are then linearly combined. This means they are written as:

\beta_0 + \sum_i \beta_i \times \mathrm{element}_i

where \beta_0 is called the “bias” and the rest of the \beta_i are called the “weights.” Each of these linear combinations is passed through an activation function, again generating a single number as output. This particular structure of operations constitutes what is called a “neuron.” The set of these activated linear combinations is called a “hidden layer.” The practitioner can add extra complexity to the model by using the outputs of each hidden layer as the inputs for a new fully connected layer. The practitioner has to choose both the number of neurons and the depth of the model by choosing the number of hidden layers. Once it has been decided that the architecture is deep enough, in the case of binary classification such as ours, a final fully connected layer is created yielding a single linear combination, and the activation of this one is what we consider the “output layer” of the network, usually normalized between zero and one thanks to the activation function (a sigmoid function¹ is a common choice). The corresponding number in this output layer is mapped to a specific class according to a threshold. For example, binary classes code their “target” variable as having values of either zero or one. We can say that an output layer with a value larger than 0.5 predicts that the input belongs to class one, and to class zero otherwise.
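As an illustration of a single output “neuron,” the following sketch flattens a toy pooled grid, forms the linear combination \beta_0 + \sum_i \beta_i \times \mathrm{element}_i, squashes it with the sigmoid of footnote 1, and applies the 0.5 threshold. The weights and bias here are random placeholders, not trained values.

import numpy as np

rng = np.random.default_rng(0)

pooled = rng.normal(size=(2, 2))             # a toy pooled grid
flattened = pooled.reshape(-1)               # "flattening" into a single row vector

bias = 0.1                                   # beta_0 (placeholder)
weights = rng.normal(size=flattened.shape)   # beta_i (placeholders, not trained)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# One output neuron: linear combination followed by a sigmoid activation.
output = sigmoid(bias + np.dot(weights, flattened))

# Map the normalized output to a class using a 0.5 threshold.
predicted_class = 1 if output > 0.5 else 0
print(output, predicted_class)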
The question remains of how these networks are actually generalized for large samples of images. Consider that the labels of a sample of already identified and labeled images, which we call our “target,” are compiled in a vector y. Then, we would like to make sure that overall the values of the distinct weights and biases are chosen such that the resulting output layer is as close to the target as possible for representative samples. In this case, we would like to choose values such that the following distance is minimized via a process called “backpropagation” for n observations in a sample:

\sum_{i=1}^{n} \frac{1}{2} \left( y_i - \mathrm{output}_i \right)^2    (1)

¹ \sigma(x) = \frac{1}{1 + e^{-x}}
It needs to be noted that deeper networks, with more hidden layers and many neurons in each, are capable of making the distance in Equation (1) very small for a sample due to their added complexity. This, however, does not come without the risk of making the network attuned only to the images fed through the specific sample and incapable of generalizing to others from the same population of objects that were not present in the sampled data. This process is called “overfitting.” Hence, the practitioner needs to design their architecture with a fine balance in mind. The network must be capable of processing patterns complex enough for classification, but not be so overly attuned to the sample data that it fails to classify data from the same population outside the sample.
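To tie these pieces together, the brief sketch below, again for illustration only, evaluates the distance of Equation (1) for a toy sample of targets and network outputs; in practice this quantity is minimized by backpropagation rather than computed by hand.

import numpy as np

# Toy targets (y) and network outputs for a sample of n = 5 observations.
y = np.array([1, 0, 1, 1, 0])
outputs = np.array([0.92, 0.10, 0.65, 0.80, 0.30])

# The distance of Equation (1): the sum of 0.5 * (y_i - output_i)^2 over the sample.
distance = np.sum(0.5 * (y - outputs) ** 2)
print(distance)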
4 APPLICATION OF A CNN ON “PRINCELY” TOMB CLASSIFICATION
4.1 DATA PREPROCESSING
Using open-source optical satellite data from Google Earth (100 × 100 pixels) of tombs with known locations and of arbitrary patches of land around them, a labelled dataset was created with the following labelling scheme:

y = \begin{cases} 0 & \text{if tomb present} \\ 1 & \text{if tomb absent} \end{cases}

The dataset is composed of 1212 images, 169 of which include tombs. Typical observations of each case are presented in Figure 8. It is important to note that the distinctive shape of the tombs makes them easily distinguishable from other patches of land even in low-resolution data.
In order to verify that the model we fit is a good model, the data is split into two portions: one for fitting the model (training data) and one for assessing how well it generalizes (testing and validation data). The testing and validation data are simply datasets that do not undergo the fitting process. Since they belong to the same population as the training data, assessing the goodness of fit of the model on them can give us a good idea of how well the model generalizes, and it helps identify overfitting.
Figure 8: Top: Examples of images labelled as tomb absent. Bottom: Examples of images labelled as tomb present.
The data was split with 75% used for training and 25% used for testing and validation. Since the images containing tombs are heavily underrepresented in the dataset, augmentation is necessary for training appropriately with multiparameter methods such as convolutional neural networks. In this case, 655 new images were synthesized from the training data with tombs present. The new samples are created by modifying the existing ones through random zooming, shearing, and horizontal flips. Note that augmented images are only used during the training stage; using them for testing or validation would be inappropriate due to their high correlation with the images from which they originate.
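The augmentation described above (random zooming, shearing, and horizontal flips applied only to training images) can be reproduced with Keras' ImageDataGenerator. The sketch below is a hedged illustration: the array name x_train_tombs and the zoom and shear ranges are assumptions for demonstration, not the authors' exact settings.

import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation is applied to the training images only; the ranges below are
# illustrative assumptions, not the settings reported in the paper.
augmenter = ImageDataGenerator(zoom_range=0.2,
                               shear_range=0.2,
                               horizontal_flip=True)

# x_train_tombs: hypothetical array of shape (n_images, 100, 100, 3) holding the
# tomb-present training images, scaled to [0, 1].
x_train_tombs = np.random.rand(16, 100, 100, 3)

# Draw a batch of synthetic variants of the existing tomb images.
batch = next(augmenter.flow(x_train_tombs, batch_size=16, shuffle=False))
print(batch.shape)  # (16, 100, 100, 3)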
4.2 CNN ARCHITECTURE
The CNN utilized for our problem was trained on the augmented data mentioned at the beginning of this section. The full summary of the architecture is detailed in Figure 9. The CNN was trained in Keras, a Python module which in our case uses Google's TensorFlow as a backend.
Figure 9: Keras model summary
The architecture shown is relatively simple, consisting of three convolutional and pooling layers with ReLU activations and two fully connected layers before the final activation with a sigmoid. The diagram specifies the dimensions of each layer. For example, the first convolutional layer uses 32 filters and outputs a 98 × 98 grid. A natural question not necessarily explained in the prior sections is what the “dropout” row in the diagram means. Dropout is a regularization technique which randomly disables certain linear combinations during the optimization step. This technique helps “regularize,” or penalize, overfitting, thereby helping to ensure that the model is generalizable.
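For readers who want a concrete starting point, the following Keras sketch mirrors the architecture described above: three convolution/pooling blocks with ReLU activations, a flatten step, two fully connected layers with dropout, and a final sigmoid output. Filter counts beyond the first layer, kernel sizes, the dense-layer width, and the dropout rate are plausible assumptions consistent with Figure 9 (a 3 × 3 kernel with 32 filters turns a 100 × 100 input into a 98 × 98 grid), not a verbatim copy of the authors' model.

from tensorflow.keras import layers, models

# A sketch of the architecture described in the text; hyperparameters are
# assumptions consistent with Figure 9, not the authors' exact configuration.
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(100, 100, 3)),  # -> 98 x 98
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.5),                      # randomly drops units during training
    layers.Dense(1, activation="sigmoid"),    # single output for binary classification
])

model.compile(optimizer="adam",
              loss="mean_squared_error",      # mirrors the squared-error distance of
                                              # Equation (1); binary cross-entropy is a
                                              # common alternative
              metrics=["accuracy"])
model.summary()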
4.3 BENCHMARKS AND RESULTS
Judging the accuracy of the convolutional neural network specified in Section 4.2 requires plausible methods for benchmarking. Furthermore, the true metrics of accuracy we are interested in are those on the validation data, as these tell us how each model performs on observations not seen during training. As such, three benchmark models were chosen: a biased random guess, a support vector classifier with a linear kernel, and a support vector classifier with a radial basis function kernel.
Random guessing is useful as a comparative benchmark since it selects its output by simple random chance. In order to make the benchmark tougher, we biased the probability of classifying an image as containing a tomb to match the proportion of actual tombs in the validation set.
Since the shapes of the tombs are simple and easily distinguishable, it stands to reason that simpler and more tractable classification methods could work as long as they allow for flexible classification boundaries. Support vector machines with kernels, as proposed by Boser et al. (1992), are sensible and powerful alternatives to deep learning models. We use two types of kernels in this study, the linear kernel and the radial basis function kernel, which allow for different transformations of the data before classification. Each of these models has its hyperparameters adjusted via 5-fold cross-validation.
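A hedged sketch of the SVM benchmarks follows: flattened image vectors are fed to scikit-learn support vector classifiers with linear and RBF kernels, and hyperparameters are tuned by 5-fold cross-validation with GridSearchCV. The data arrays and parameter grids are illustrative placeholders, not the values used by the authors.

import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Hypothetical flattened training data: each 100 x 100 RGB image becomes one row vector.
x_train = np.random.rand(200, 100 * 100 * 3)
y_train = np.random.randint(0, 2, size=200)

# Illustrative hyperparameter grids; the authors' actual search space is not reported here.
benchmarks = {
    "SVM with linear kernel": GridSearchCV(SVC(kernel="linear"),
                                           {"C": [0.1, 1, 10]}, cv=5),
    "SVM with RBF kernel": GridSearchCV(SVC(kernel="rbf"),
                                        {"C": [0.1, 1, 10], "gamma": ["scale", 0.01]},
                                        cv=5),
}

for name, search in benchmarks.items():
    search.fit(x_train, y_train)
    print(name, search.best_params_)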
We use three measures to compare the accuracy of the predictions made by the classifiers: precision, recall, and F1 score, defined below:

\text{Precision} = \frac{\#\ \text{of True Positives}}{\#\ \text{of True Positives} + \#\ \text{of False Positives}}

\text{Recall} = \frac{\#\ \text{of True Positives}}{\#\ \text{of True Positives} + \#\ \text{of False Negatives}}

F_1\ \text{score} = \frac{2}{\frac{1}{\text{Recall}} + \frac{1}{\text{Precision}}}

Precision gives the rate of correctly classified objects among all objects classified with a given label. Recall gives the rate of correctly labeled objects among all actual objects with that label. The F1 score gives a balanced measure of both. All tables and figures comparing models in this paper use these measures.
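These quantities can be computed per class with scikit-learn, as in the short sketch below; the label vectors are made-up placeholders, with labels following the coding of Section 4.1 (0 = tomb present, 1 = tomb absent).

from sklearn.metrics import precision_score, recall_score, f1_score

# Placeholder ground-truth and predicted labels for a validation set.
y_true = [1, 0, 0, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 0, 1, 0, 1, 0]

for label in (0, 1):
    p = precision_score(y_true, y_pred, pos_label=label)
    r = recall_score(y_true, y_pred, pos_label=label)
    f = f1_score(y_true, y_pred, pos_label=label)
    print(f"class {label}: precision={p:.2f} recall={r:.2f} F1={f:.2f}")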
Table 1 and Table 2 encapsulate the results obtained from the trained models making predictions on the validation set. For both images which contained tombs and those which did not, the CNN performs best. Interestingly, despite the fact that the SVMs were likewise trained on the augmented dataset, their performance in identifying pictures with tombs was not comparable to that of the neural network. This is surprising since the tomb shapes are mostly simple to the naked eye, and hence nonlinear classification should work well. The reason lies in the SVM models producing many false positives (objects that are not tombs being identified as such). This occurs because other images that are simply circular in shape are likely to be picked up by the SVM models as tombs. This has been an issue with other detection algorithms before, e.g. Caspari et al. (2014). The big advantage of our architecture lies in the quantity of filters used, which are able to recognize subtler patterns in the training dataset that might identify a tomb, beyond just the circular shape. Figure 10 summarizes both tables and includes a bar for Average/Total, a weighted average over both classes under each measure, showing that overall the CNN is the better performing model.
Model Precision Recall F1 score
Random Guessing 0.64 0.65 0.65
SVM with linear kernel 0.9 0.96 0.94
SVM with RBF kernel 0.96 0.97 0.97
CNN 0.98 1 0.99
Table 1: Classification metrics for validation data set pictures without tombs.
Model Precision Recall F1 score
Random Guessing 0.59 0.58 0.59
SVM with linear kernel 0.29 0.15 0.20
SVM with RBF kernel 0.76 0.67 0.71
CNN 1 0.84 0.91
Table 2: Classification metrics for validation data set pictures with tombs.
Figure 10: Result Summaries
5 CONCLUSIONS
The distinctive shape of the early Iron Age Saka burial mounds and their relatively large size make them an ideal training set for machine learning algorithms which can be run on open source satellite imagery. Our CNN outperforms other methods and provides a valuable approach for the large-scale detection of elite burial mounds in the Eurasian steppes. In this way a macro-regional survey of northern Xinjiang and the adjacent areas could be conducted in order to assess the spatial distribution of this monument type and possibly revise the geographical extent to which Saka-related material culture spread through eastern Central Asia during the first millennium BCE. The method has the clear advantage that all analyses can be conducted without the access problems archaeological projects in the region usually have to deal with.
Preliminary satellite imagery analysis has come to play a major role in planning and implementing archaeological field research (Lasaponara and Masini, 2012; Caspari et al., 2019). But automatic feature detection has yet to become accessible to a wider range of researchers in order to be widely applied. A number of attempts have been made to connect archaeological surveys with automatic detection of features (Caspari et al., 2017; Trier et al., 2009; Trier and Pilø, 2012); however, it is not commonly used by practitioners. Both the complexity of the method, which often demands cooperation with computer science specialists, and a lack of awareness of the possibility play a role in the so far rare application of automatic detection algorithms by archaeological practitioners. The authors do not expect to see widespread application unless intuitive tools are developed for feature selection, algorithm training, and visualization of ready-to-use results.
REFERENCES
BENHABILES, H. and TABIA, H. (2016). Convolutional neural network for pottery retrieval. Journal of Electronic Imaging, 26(1).
BEWES, J., LOW, A., MORPHETT, A., PATE, F. D. and HENNEBERG, M. (2019). Artificial intelligence for sex determination of skeletal remains: Application of a deep learning artificial neural network to human skulls. Journal of Forensic and Legal Medicine, 62, 40–43.
BOSER, B. E., GUYON, I. M. and VAPNIK, V. N. (1992). A training algorithm for optimal margin classifiers. In Proceedings of the 5th Annual ACM Workshop on Computational Learning Theory, ACM Press, pp. 144–152.
BYEON, W., DOMÍNGUEZ-RODRIGO, M., ARAMPATZIS, G., BAQUEDANO, E., YRAVEDRA, J., MATÉ-GONZÁLEZ, M. A. and KOUMOUTSAKOS, P. (2019). Automated identification and deep classification of cut marks on bones and its paleoanthropological implications. Journal of Computational Science, 32, 36–43.
CAN, G., ODOBEZ, J.-M. and GATICA-PEREZ, D. (2018). How to tell ancient signs apart? Recognizing and visualizing Maya glyphs with CNNs. J. Comput. Cult. Herit., 11(4), 20:1–20:25.
CASPARI, G. (2018). Assessing looting from space: The destruction of Early Iron Age burials in northern Xinjiang. Heritage, 1(2), 320–327.
— (Forthcoming). Quantifying the funerary ritual activity of the late prehistoric southern Kanas region (Xinjiang, China). Asian Perspectives.
—, BALZ, T., GANG, L., WANG, X. and LIAO, M. (2014). Application of Hough forests for the detection of grave mounds in high-resolution satellite imagery.
—, PLETS, G., BALZ, T. and FU, B. (2017). Landscape archaeology in the Chinese Altai Mountains – survey of the Heiliutan Basin. Archaeological Research in Asia, 10, 48–53.
—, SADYKOV, T., BLOCHIN, J., BUESS, M., NIEBERLE, M. and BALZ, T. (2019). Integrating remote sensing and geophysics for exploring early nomadic funerary architecture in the Siberian Valley of the Kings. Sensors, 19(14).
CHEN, K. T. and HIEBERT, F. T. (1995). The late prehistory of Xinjiang in relation to its neighbors. Journal of World Prehistory, 9(2), 243–300.
CHETOUANI, A., DEBROUTELLE, T., TREUILLET, S., EXBRAYAT, M. and JESSET, S. (2018). Classification of ceramic shards based on convolutional neural network. 2018 25th IEEE International Conference on Image Processing (ICIP), pp. 1038–1042.
CLARKE, M. (2008). China's war on terror in Xinjiang: Human security and the causes of violent Uighur separatism. Terrorism and Political Violence, 20(2), 271–301.
DAVIS-KIMBALL, J. (1991). Kazakh/American research project: 1990 field work report. Middle East Studies Association Bulletin, 25(1), 33–35.
GASS, A. (2011). Early Iron Age burials in southeastern Zhetysu: The geoarchaeological evidence. Archaeology, Ethnology and Anthropology of Eurasia, 39(3), 57–69.
— (2016). Das Siebenstromland zwischen Bronze- und Früheisenzeit: Eine Regionalstudie. Berlin; Boston: De Gruyter.
GRAHAM, S. (2018). Fleshing out the bones: Studying the human remains trade with TensorFlow and Inception. Journal of Computer Applications in Archaeology, 1(1), 55–63.
GUALANDI, M. L., SCOPIGNO, R., WOLF, L., RICHARDS, J., GARRIGOS, J. B. I., HEINZELMANN, M., HERVAS, M. A., VILA, L. and ZALLOCCO, M. (2016). ArchAIDE - Archaeological Automatic Interpretation and Documentation of cEramics. In C. E. Catalano and L. D. Luca (eds.), Eurographics Workshop on Graphics and Cultural Heritage, The Eurographics Association.
GUYOT, A., HUBERT-MOY, L. and LORHO, T. (2018). Detecting Neolithic burial mounds from LiDAR-derived elevation data using a multi-scale approach and machine learning techniques. Remote Sensing, 10(2).
HEIN, I., ROJAS-DOMÍNGUEZ, A., ORNELAS, M., D'ERCOLE, G. and PELOSCHEK, L. (2018). Automated classification of archaeological ceramic materials by means of texture measures. Journal of Archaeological Science: Reports, 21, 921–928.
HERMOZA, R. and SIPIRAN, I. (2017). 3D reconstruction of incomplete archaeological objects using a generative adversarial network. CoRR, abs/1711.06363.
INSTITUTE OF ARCHAEOLOGY OF THE XINJIANG ACADEMY OF SOCIAL SCIENCES (1985). Xinjiang Xinyuan Gongnaisi Zhongyangchang shiguan mu (Stone cist tomb of Zhongyangchang site, Xinyuan, Xinjiang). Kaogu yu Wenwu, 2, 21–26.
INSTITUTE OF ARCHAEOLOGY OF THE XINJIANG ACADEMY OF SOCIAL SCIENCES (1986). Xinjiang Miquan Dacaotan faxian shiduimu (The rock-fill tombs discovered at Dacaotan, Miquan, Xinjiang). Kaogu yu Wenwu, 1, 36–38.
INSTITUTE OF ARCHAEOLOGY OF THE XINJIANG ACADEMY OF SOCIAL SCIENCES (1988). Xinjiang Xinyuan Tiemulike gumu qun (Tiemulike cemetery, Xinyuan, Xinjiang). Kaogu yu Wenwu, 8, 59–66.
ISHITSUKA, K., ISO, S., ONISHI, K. and MATSUOKA, T. (2018). Object detection in ground-penetrating radar images using a deep convolutional neural network and image set preparation by migration.
KRAMER, I. C., HARE, J. S., PRÜGEL-BENNETT, A. and SARGENT, I. (2017). Automated detection of archaeology in the New Forest using deep learning with remote sensor data.
LASAPONARA, R. and MASINI, N. (2012). Satellite Remote Sensing: A New Tool for Archaeology. Springer.
LECUN, Y., HAFFNER, P., BOTTOU, L. and BENGIO, Y. (1999). Object recognition with gradient-based learning. Springer.
LI, Q., ZOU, Q., MA, D., WANG, Q. and WANG, S. (2018). Dating ancient paintings of Mogao Grottoes using deeply learnt visual codes. CoRR, abs/1810.09168.
NAGLER, A. (2009). Grosskurgane im Siebenstromland (Kazachstan), vol. 1. Jahresbericht 2008 des DAI. Archäologischer Anzeiger 2009.
—, Z, S., H, P. and M, N. (2010). Südkasachstan: Kurgane Asy Zaga, Kegen und Zoan Tobe, Berlin, DAI, chap. Archäologische Forschungen in Kasachstan, Tadschikistan, Turkmenistan und Usbekistan, pp. 49–54.
PALANIAPPAN, S. and ADHIKARI, R. (2017). Deep learning the Indus script. ArXiv, abs/1702.00523.
PARZINGER, H. (2011). Die frühen Völker Eurasiens: Vom Neolithikum zum Mittelalter. C.H. Beck.
PASQUET, J., DEMESTICHA, S., SKARLATOS, D., MERAD, D. and DRAP, P. (2017). Amphora detection based on a gradient weighted error in a convolution neuronal network.
PHAM, M. and LEFÈVRE, S. (2018). Buried object detection from B-scan ground penetrating radar data using Faster-RCNN. CoRR, abs/1803.08414.
P'IANKOV, I. V. (1994). The ethnic history of the Sakas. Bulletin of the Asia Institute, 8, 37–46.
ROBERTS, S. R. (2018). The biopolitics of China's war on terror and the exclusion of the Uyghurs. Critical Asian Studies, 50(2), 232–258.
ROMAN-RANGEL, E. and MARCHAND-MAILLET, S. (2016). Indexing Mayan hieroglyphs with neural codes. In 2016 23rd International Conference on Pattern Recognition (ICPR), pp. 253–258.
RUTLEDGE, J., YUAN, W., WU, J. Y., FREED, S., LEWIS, A., WOOD, Z. J., GAMBIN, T. and CLARK, C. M. (2018). Intelligent shipwreck search using autonomous underwater vehicles. 2018 IEEE International Conference on Robotics and Automation (ICRA), pp. 1–8.
SAMASHEV, Z. (2007). Im Zeichen des goldenen Greifen – Königsgräber der Skythen, Berlin, Staatliche Museen zu Berlin, chap. Die Fürstengräber des Siebenstromlandes, pp. 162–170.
SCHOELZ, A. (2018). An embarrassment of riches: Data integration in VR Pompeii.
TRAVASSOS, X. L., AVILA, S. L. and IDA, N. (2018). Artificial neural networks and machine learning techniques applied to ground penetrating radar: A review. Applied Computing and Informatics.
TRIER, Ø. D., COWLEY, D. C. and WALDELAND, A. U. (2019). Using deep neural networks on airborne laser scanning data: Results from a case study of semi-automatic mapping of archaeological topography on Arran, Scotland. Archaeological Prospection, 26(2), 165–175.
—, LARSEN, S. Y. and SOLBERG, R. (2009). Automatic detection of circular structures in high-resolution satellite images of agricultural land. Archaeological Prospection, 16(1), 1–15.
— and PILØ, L. H. (2012). Automatic detection of pit structures in airborne laser scanning data. Archaeological Prospection, 19(2), 103–121.
TYUKIN, I., SOFEIKOV, K., LEVESLEY, J., GORBAN, A. N., ALLISON, P. and COOPER, N. J. (2018). Exploring automated pottery identification [Arch-I-Scan].
VAN GEEL, B., BOKOVENKO, N., BUROVA, N., CHUGUNOV, K., DERGACHEV, V., DIRKSEN, V., KULKOVA, M., NAGLER, A., PARZINGER, H., VAN DER PLICHT, J., VASILIEV, S. and ZAITSEVA, G. (2004). Climate change and the expansion of the Scythian culture after 850 BC: A hypothesis. Journal of Archaeological Science, 31(12), 1735–1742.
WANG, B. (1985). Gudai Xinjiang Sairen lishi gouchen (A preliminary research of the history of the ancient Saka in Xinjiang). Xinjiang Shehui Kexue, 1, 48–58, 64.
WANG, H., HE, Z., HUANG, Y., CHEN, D. and ZHOU, Z. (2017). Bodhisattva head images modeling style recognition of Dazu Rock Carvings based on deep convolutional network. Journal of Cultural Heritage, 27, 60–71.
... Deep learning has proven superior for automatically extracting features from complex visual data, making it an essential technology for cultural heritage work, including artifact recognition, heritage restoration, and multimedia cultural data classification [14][15][16][17]. Early applications of deep learning in cultural heritage focused primarily on identifying archeological sites [18][19][20]. These studies classified traces of tombs, legacy marks, and topographic visualizations using aerial photographs [14,15,21]. ...
Article
Full-text available
This study investigates the classification of pigment-manufacturing processes using deep learning to identify the optimal model for cultural property preservation science. Four convolutional neural networks (CNNs) (i.e., AlexNet, GoogLeNet, ResNet, and VGG) and one vision transformer (ViT) were compared on micrograph datasets of various pigments. Classification performance indicators, receiver-operating characteristic curves, precision–recall curves, and interpretability served as the primary evaluation measures. The CNNs achieved accuracies of 97–99%, while the ViT reached 100%, emerging as the best-performing model. These findings indicate that the ViT has potential for recognizing complex patterns and correctly processing data. However, interpretability using guided backpropagation approaches revealed limitations in the ViT ability to generate class activation maps, making it challenging to understand its internal behavior through this technique. Conversely, CNNs provided more detailed interpretations, offering valuable insights into the learned feature maps and hierarchical data processing. Despite its interpretability challenges, the ViT outperformed the CNNs across all evaluation metrics. This study underscores the potential of deep learning in classifying pigment manufacturing processes and contributes to cultural property conservation science by strengthening its scientific foundation for the conservation and restoration of historical artifacts.
... In geoarchaeology for instance, machine learning has been used to more accurately source and classify samples of soils, minerals, and tephras [59][60][61][62][63] , improve precision in temperature estimations of heat-treated lithic items 64 , as well as to interpolate geochemical properties between samples to improve resolution of large-scale geological mapping 65 . These methods have also helped identify anthropogenic structures from aerial surveying 66 , like mounds [67][68][69][70][71][72][73][74][75] , structures [76][77][78] , desert kites 79 , irrigation systems 80 , and combustion features 81,82 . The identification of anthropogenic cut-marks on bones has also been improved with machine learning [83][84][85][86][87][88] . ...
Preprint
Full-text available
Reconciling the ever-increasing volume of new archaeological data with the abundant corpus of legacy data is fundamental to making robust archaeological interpretations. Yet, combining new and existing results is hampered by inconsistent standards in the recording and illustration of archaeological features and artefacts. Attempts at collating data from images in existing publications first involve scouring the substantial body of existing literature, followed by extracting images that require onerous manual preprocessing steps, like re-scaling, reorienting , and re-formatting. While the sample sizes of such manual analyses are curtailed by these problems, recent developments in AI and big data methods are poised to accelerate and automate large syntheses of existing data. This paper introduces an AI-assisted workflow capable of creating uniform archaeological datasets from heterogeneous published resources. The associated software (AutArch) takes large and unsorted PDF files as input, and uses neural networks to conduct image processing, object detection, and classification. Objects commonly found in archaeological catalogues-like graves, skeletons, ceramics, ornaments, stone tools, and maps-are reliably detected. Accompanying elements of the illustrations, like North arrows and scales, are automatically used for orientation and scaling. Outlines are then extracted with contour detection, allowing whole-outline morphometrics. Detected objects, contours, and other automatically retrieved data can be manually validated and adjusted via AutArch's graphical user interface. While we test this workflow on third millennium BCE Central European graves and Final Neolithic/Early Bronze Age arrowheads from Northwest Europe, this method can be applied to the vast number of artefacts and archaeological features for which shape, size, and orientation holds technological, functional, cultural, and/or temporal significance. This AI-assisted workflow has the potential to speed-up, automate, and standardise data collection throughout the discipline, allowing more objective interpretations and freeing sample sizes from budget and time constraints.
... In recent years, the rise of artificial intelligence and machine learning has introduced new possibilities in the field of cultural heritage 13,14 . With the help of artificial intelligence and machine learning, many advancements have been made in the field of cultural heritage preservation [15][16][17] . ...
Article
Full-text available
Rock art is recognized globally as significant cultural heritage. Symbols in rock art capture scenes of daily life from ancient societies, revealing the cultural context of past civilizations and holding significant research value. In the research of rock art symbols, it is necessary to accurately and efficiently segment the symbols in the images to ensure subsequent research on rock art symbols and the construction of symbol system databases. Although existing methods for rock art symbol segmentation can effectively extract symbols from 2D images, they are often time-consuming, labor-intensive, and have low segmentation accuracy of the model. To address these challenges, this study proposes a rock art symbol segmentation method based on an improved YOLOv7-Seg model, which incorporates SE (Squeeze-and-Excitation Networks) and ODConv (Omni-Dimensional Dynamic Convolution) to enhance the model’s focus on rock art symbol features, enabling efficient and accurate segmentation in complex backgrounds. This model facilitates the recognition and segmentation of human and animal symbols in images. The study employs Cangyuan rock art as a case study, validating the model’s accuracy through ablation experiments and comparative analysis. The model achieves an overall AP score of 0.961, with specific AP score of 0.973 for animal segmentation and 0.948 for human segmentation. The results demonstrate that the improved model effectively segments rock art symbols in complex environments, achieving high-precision automated segmentation of rock art symbols. This research lays the foundation for the subsequent unified management of rock art symbols as well as the study of rock art protection and heritage preservation.
... Although traditional learning methods are able to extract mural features to a certain extent, simple feature extraction has greater limitations due to the subjectivity and unique cultural background of murals. Therefore, deep learning methods have been widely used.Kumar et al [3] used pre-trained AlexNet and VGGNet models to extract mural features and combined them with support vector machines for classi er fusion to successfully classify thangka images into eight categories of Indian art forms.Caspari et al [4] constructed a three-layer CNN network for detecting early Iron Age tombs of Google Earth open-source optical satellite data. Cao et al [5] designed an Inception-v3 network incorporating migration learning for ancient mural painting dynasty identi cation classi cation, and added color histograms with local binary patterns (LBP) to better extract the artistic features of the mural paintings. ...
Preprint
Full-text available
Dunhuang murals are a crucial historical and cultural heritage, presenting challenges in feature extraction and classification due to their number, age, and similarity. This study introduces SER-Net, a lightweight classification network for real-time mural analysis on mobile devices. A dataset covering nine dynasties—Early Tang, Northern Wei, Northern Zhou, Peak Tang, Sui, Late Tang, Middle Tang, Five Dynasties, and Western Wei—was created through manual collection and annotation. Data augmentation addressed uneven image distribution. SER-Net, based on RepVGG and ResNet18, features the SE D-Block module, which integrates SE attention and Channel-Shuffle mechanisms for enhanced feature fusion while using deep separable convolution to control model size. Experimental results show SER-Net reduces model size, boosts efficiency, and increases accuracy.
... [cs.CV] 16 Dec 2024 techniques has become increasingly popular in archaeology (Bickler 2021;Cacciari and Pocobelli 2022), following trends in other fields, including non-scientific ones, such as arts or industries (Le et al. 2020;Ramesh et al. 2021;Maerten and Soydaner 2024). However, many applications have focused on specific research areas like site detection (Sakai et al. 2024;Caspari and Crespo 2019;Buławka, Orengo, and Berganzo-Besga 2024), artifact classification (Emmitt et al. 2022;Gualandi, Gattiglia, and Anichini 2021;Anichini et al. 2021) or heritage preservation (Cui et al. 2024;D'Orazio et al. 2024), while the retrieval and processing of legacy data has remained relatively unexplored. This gap is particularly significant given that the efficient and rapid retrieval of archaeological data from printed publications or PDFs is essential for the advancement of archaeological research, as we can use this ready-to-use data for both traditional analysis and the development of even more complex ML techniques that require even larger training datasets. ...
Preprint
Full-text available
Archaeological pottery documentation and study represents a crucial but time-consuming aspect of archaeology. While recent years have seen advances in digital documentation methods, vast amounts of legacy data remain locked in traditional publications. This paper introduces PyPotteryLens, an open-source framework that leverages deep learning to automate the digitisation and processing of archaeological pottery drawings from published sources. The system combines state-of-the-art computer vision models (YOLO for instance segmentation and EfficientNetV2 for classification) with an intuitive user interface, making advanced digital methods accessible to archaeologists regardless of technical expertise. The framework achieves over 97\% precision and recall in pottery detection and classification tasks, while reducing processing time by up to 5x to 20x compared to manual methods. Testing across diverse archaeological contexts demonstrates robust generalisation capabilities. Also, the system's modular architecture facilitates extension to other archaeological materials, while its standardised output format ensures long-term preservation and reusability of digitised data as well as solid basis for training machine learning algorithms. The software, documentation, and examples are available on GitHub (https://github.com/lrncrd/PyPottery/tree/PyPotteryLens).
Book
Full-text available
Airborne Laser Scanning (sometimes referred to as lidar) has been described as revolutionary for the understanding and management of cultural landscapes. The ability to create highly accurate three dimensional (3D) models and visualise the topographic features that represent past human interaction with the land surface has undoubtedly changed our view of the world and our approach to heritage management. The use of airborne laser scanning (ALS) for cultural heritage management across Europe has increased greatly in the first two decades of the 21st century as data have become more widely available. While there is a growing expertise in the implementation of ALS derived visualisations for our discipline, it is clear from our survey of practitioners that this specialism is often represented by only a few experts scattered across a country. Specialist practitioners have facilitated knowledge exchange at training and networking via conferences and events such as the TRAIL (Training and Research in the Archaeological Interpretation of Lidar) meetings. However there have been few opportunities to capture the collective understanding of how to make the most of ALS for cultural heritage management that has developed over the last two decades. The aim of these guidelines, instigated by the European Archaeological Council (EAC), is to bring together for the first time a reference document that combines the experience of colleagues across Europe. The guidelines have been designed fully collaboratively by an extensive network of cultural heritage professionals. The foundation was an online survey conducted in 2022 to which more than 100 individuals and organisations from across 30 countries responded, detailing the status quo and their needs and aspirations with respect to integrating ALS data into their work. Using the results of this baseline survey to define the requirements of the wider community, 49 co-authors worked together to design the structure and content of the guidelines, in doing so ensuring their relevance and impact. The development of the guidelines was undertaken entirely online in Autumn 2022 and Spring 2023, an approach that has paid dividends in the quality and relevance of the end product by facilitating the input of so many experts from across the continent.
Preprint
Full-text available
Artificial intelligence and machine learning applications in archaeology have increased significantly in recent years, and these now span all subfields, geographical regions, and time periods. The prevalence and success of these applications have remained largely unexamined, as recent reviews on the use of machine learning in archaeology have only focused only on specific subfields of archaeology. Our review examined an exhaustive corpus of 135 articles published between 1997 and 2022. We observed a significant increase in the number of publications from 2019 onwards. Automatic structure detection and artefact classification were the most represented tasks in the articles reviewed, followed by taphonomy, and archaeological predictive modelling. From the review, clustering and unsupervised methods were underrepresented compared to supervised models. Artificial neural networks and ensemble learning account for two thirds of the total number of models used. However, if machine learning models are gaining in popularity they remain subject to misunderstanding. We observed, in some cases, poorly defined requirements and caveats of the machine learning methods used. Furthermore, the goals and the needs of machine learning applications for archaeological purposes are in some cases unclear or poorly expressed. To address this, we proposed a workflow guide for archaeologists to develop coherent and consistent methodologies adapted to their research questions, project scale and data. As in many other areas, machine learning is rapidly becoming an important tool in archaeological research and practice, useful for the analyses of large and multivariate data, although not without limitations. This review highlights the importance of well-defined and well-reported structured methodologies and collaborative practices to maximise the potential of applications of machine learning methods in archaeology.
Article
To save time and manpower, automatic and semi-automatic methods can be used to identify and analyse ancient artefacts. Such methods typically draw on neural networks and machine learning, are carried out on remote sensing data, and are based entirely on spatial information. In the present research, the aim is to detect archaeological phenomena in the landscape of the historical city of Zuzan using a convolutional neural network and object detection with the YOLO v8 algorithm, with aerial images from the 1960s and 1990s as input data. The most important steps of this method, namely model training, image pre-processing, feature extraction, and feature labelling, are implemented to provide an automatic pattern recognition system for recognising archaeological phenomena in an urban landscape. The training dataset consists of historical aerial images in which features such as the city wall (fence), citadel, and aqueduct (qanat) are labelled. The results of CNN training on the 1960s and 1990s aerial images and YOLO modelling show detection of the aqueduct with 69% accuracy, the city wall with 91% accuracy, and the citadel with 100% accuracy.
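As a rough illustration of this kind of pipeline (not the study's actual implementation), the ultralytics package exposes YOLOv8 training and inference in a few lines; the dataset configuration file, class names, and image paths below are hypothetical placeholders.

```python
# Minimal YOLOv8 sketch with hypothetical paths and classes, using the ultralytics package.
from ultralytics import YOLO

# Start from a small pretrained checkpoint and fine-tune on labelled aerial tiles.
model = YOLO("yolov8n.pt")
model.train(
    data="zuzan_features.yaml",  # hypothetical dataset config listing e.g. wall/citadel/qanat classes
    epochs=100,
    imgsz=640,
)

# Run detection on a scanned historical aerial image (path is a placeholder).
results = model.predict("aerial_1960s_tile_001.png", conf=0.25)
for box in results[0].boxes:
    cls_id = int(box.cls)                   # predicted class index
    score = float(box.conf)                 # detection confidence
    x1, y1, x2, y2 = box.xyxy[0].tolist()   # bounding box in pixel coordinates
    print(results[0].names[cls_id], round(score, 2), (x1, y1, x2, y2))
```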
Article
This article analyses the architecture of the Early Iron Age royal burial mound Tunnug 1 in the “Siberian Valley of the Kings” in Tuva Republic, Russia. This large monument is paramount for the archaeological exploration of the early Scythian period in the Eurasian steppes, but environmental parameters make research on site difficult and require the application of a diversity of methods. We thus integrate WorldView-2 and ALOS-2 remote sensing data, geoelectric resistivity and geomagnetic survey results, photogrammetry-based DEMs, and ortho-photographs, as well as excavation in order to explore different aspects of the funerary architecture of this early nomadic monument. We find that the large royal tomb comprises a complex internal structure of radial features and chambers, and a rich periphery of funerary and ritual structures. Geomagnetometry proved to be the most effective approach for a detailed evaluation of the funerary architecture in our case. The parallel application of several surveying methods is advisable since dataset comparison is indispensable for providing context.
Article
A deep learning artificial neural network was adapted to the task of sex determination of skeletal remains. The neural network was trained on images of 900 skulls virtually reconstructed from hospital CT scans. When tested on previously unseen images of skulls, the artificial neural network showed 95% accuracy at sex determination. Artificial intelligence methods require no significant expertise to implement once trained, are rapid to use, and have the potential to eliminate human bias from sex estimation of skeletal remains.
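The paper's exact architecture is not reproduced here; a common way to build such a binary image classifier is to fine-tune a pretrained CNN, as in the sketch below (PyTorch, with a ResNet-18 backbone chosen purely for illustration), which assumes CT-derived skull images organised into hypothetical "female"/"male" folders.

```python
# Generic transfer-learning sketch for binary image classification (illustrative only).
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# Hypothetical folder layout: skulls/train/female, skulls/train/male
train_set = datasets.ImageFolder("skulls/train", transform=transform)
loader = torch.utils.data.DataLoader(train_set, batch_size=16, shuffle=True)

# Pretrained ImageNet backbone; the final layer is replaced with a two-class head.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 2)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(5):  # short run for illustration
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```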
Article
Ground-penetrating radar allows the acquisition of many images for the investigation of pavement interiors and shallow geological structures. Accordingly, efficient methods for detecting objects such as pipes, reinforcing steel bars, and internal voids in ground-penetrating radar images are an emerging technology. In this paper, we propose using a deep convolutional neural network to detect characteristic hyperbolic signatures from embedded objects. As a first step, we developed a migration-based method to collect a large amount of training data and created 53,510 categorized images. We then examined the accuracy of the deep convolutional neural network in detecting the signatures. The classification accuracy was 94.5%–97.9% when using several thousand training images and was much better than that of the conventional neural network approach. Our results demonstrate the effectiveness of the deep convolutional neural network in detecting characteristic events in ground-penetrating radar images.
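The study's network is not reproduced here; as an illustrative stand-in, a small convolutional classifier over fixed-size GPR image patches (hyperbola vs. background) could look like the following, with the input size, channel counts, and class labels all assumed.

```python
# Illustrative two-class CNN for small GPR image patches (not the authors' network).
import torch
import torch.nn as nn

class HyperbolaNet(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, 64), nn.ReLU(),  # 64x64 input halved twice -> 16x16 maps
            nn.Linear(64, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

# A batch of 64x64 single-channel radargram patches (random data as a placeholder).
patches = torch.randn(8, 1, 64, 64)
logits = HyperbolaNet()(patches)
print(logits.shape)  # -> torch.Size([8, 2])
```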
Article
Burial mounds (kurgans) of the Early Iron Age in the steppe zones of Central Asia have long been the target of severe looting activities. Protection of these monuments in remote areas is difficult since accurate mapping is rarely available. We map an area in northern Xinjiang using a combination of high-resolution optical data and on-ground survey to establish a quantitative and qualitative assessment of looting. We find that at least 74.5% of burial mounds are looted or otherwise destroyed. Given the large number of visibly impacted burial mounds, it is clear that the bulk of Early Iron Age cultural heritage in this area is under threat. The looting, moreover, continues to the present day. Rescue excavation of potentially untouched burials in the area is advisable.
Article
Ground Penetrating Radar is a multidisciplinary non-destructive evaluation technique that requires knowledge of electromagnetic wave propagation, material properties, and antenna theory. Under some circumstances this tool may require auxiliary algorithms to improve the interpretation of the collected data. Detection, location, and definition of targets' geometrical and physical properties with a low false-alarm rate are the objectives of these signal post-processing methods. Basic approaches focus on the first two objectives, while more robust and complex techniques address all objectives at once. This work reviews the use of Artificial Neural Networks and Machine Learning for data interpretation of Ground Penetrating Radar surveys. We show that these computational techniques have moved GPR forward from locating and testing towards imaging and diagnosis approaches.
Article
Identifying peaks in anthropogenic activity in a landscape is an important starting point for understanding past social dynamics in the longue durée. Through intensive surveys and remote sensing surveys of the Heiliutan Basin (Heiliutan Dacaoyuan 黑流滩大草原) in the southern Kanas Region (Kanasi 喀纳斯), Xinjiang, China, a high-resolution dataset for over 4000 years of material culture is established. The complete coverage of the area of interest allows for the quantification of ritual funerary activity based on the number of constructed monuments per century. The data show that the intensity of ritual funerary activity was very low and only left marginal traces in the landscape from the Eneolithic Age to the Late Bronze Age. During the Early Iron Age (ca. 850-200 B.C.E.), the basin became a center for construction of burials for social elites of nomadic tribes and the area was rapidly transformed into a landscape of the dead. The Late Iron Age (starting ≈200 B.C.E.) saw a decline of ritual funerary activities in the basin as it became an unimportant side scene to the cultural developments of the wider region.
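The monuments-per-century quantification described above can be sketched in a few lines; the approach below (tallying each monument against every century its dating range overlaps) is an assumption about how such a count might be implemented, and the sample records are invented placeholders.

```python
# Hypothetical sketch: count monuments per century from dating ranges (BCE years as negative).
from collections import Counter

# Invented sample records: (monument_id, earliest_year, latest_year).
monuments = [
    ("M001", -850, -700),
    ("M002", -600, -400),
    ("M003", -300, -150),
]

def centuries(start: int, end: int):
    """Return the century start-years overlapped by the dating range [start, end]."""
    first = (start // 100) * 100
    last = (end // 100) * 100
    return range(first, last + 1, 100)

counts = Counter()
for _, start, end in monuments:
    for c in centuries(start, end):
        counts[c] += 1

for c in sorted(counts):
    label = f"{abs(c)}-{abs(c + 99)} BCE" if c < 0 else f"{c}-{c + 99} CE"
    print(label, counts[c], "monument(s)")
```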
Article
The identification of cut marks and other bone surface modifications (BSM) provides evidence for the emergence of meat-eating in human evolution. This crucial part of taphonomic analysis of the archaeological record has been controversial due to highly subjective interpretations of BSM. Here, we use a sample of 79 trampling and cut marks to compare the accuracy of mark identification on bones by human experts and computer-trained algorithms. We demonstrate that deep convolutional neural networks (DCNN) and support vector machines (SVM) can recognize marks with accuracy that far exceeds that of human experts. Automated recognition and analysis of BSM using DCNN can achieve 91% correct identification of cut and trampling marks, versus a much lower rate (63%) obtained by trained human experts. This success underscores the capability of machine learning algorithms to help resolve controversies in taphonomic research and, more specifically, in the study of bone surface modifications. We envision that the proposed methods can help resolve ongoing controversies on the earliest human meat-eating behaviors in Africa and other issues such as the earliest occupation of America.
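As an illustration of the kind of comparison the study reports (not its actual code or data), an SVM baseline over flattened mark images can be evaluated with scikit-learn as below, with random arrays standing in for the digitised bone-surface images; a CNN counterpart would follow the same train/evaluate pattern as the earlier PyTorch sketches.

```python
# Illustrative SVM baseline for mark classification (placeholder data, not the study's).
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Placeholder: 79 grayscale mark images of 64x64 pixels; labels 0 = trampling, 1 = cut mark.
images = rng.random((79, 64, 64))
labels = rng.integers(0, 2, size=79)

X = images.reshape(len(images), -1)  # flatten each image into a feature vector
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, random_state=0, stratify=labels
)

clf = SVC(kernel="rbf", C=1.0)
clf.fit(X_train, y_train)
print("SVM accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```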
Article
Thanks to the digital preservation of cultural heritage materials, multimedia tools (e.g., based on automatic visual processing) considerably ease the work of scholars in the humanities and help them to perform quantitative analysis of their data. In this context, this article assesses three different Convolutional Neural Network (CNN) architectures along with three learning approaches to train them for hieroglyph classification, which is a very challenging task due to the limited availability of segmented ancient Maya glyphs. More precisely, the first approach, the baseline, relies on pretrained networks as feature extractors. The second investigates a transfer learning method by fine-tuning a pretrained network for our glyph classification task. The third approach trains networks from scratch directly on our glyph data. The merits of three different network architectures are compared: a generic sequential model (i.e., LeNet), a sketch-specific sequential network (i.e., Sketch-a-Net), and the recent Residual Networks. The sketch-specific model trained from scratch outperforms the other models and training strategies. Even for a challenging 150-class classification task, this model achieves 70.3% average accuracy and proves promising given the small amount of cultural heritage shape data available. Furthermore, we visualize the discriminative parts of glyphs with the recent Grad-CAM method, and demonstrate that the discriminative parts learned by the model agree, in general, with the expert annotation of glyph specificity (diagnostic features). Finally, as a step toward systematic evaluation of these visualizations, we conduct a perceptual crowdsourcing study. Specifically, we analyze the interpretability of the representations from Sketch-a-Net and ResNet-50. Overall, our article takes two important steps toward providing tools to scholars in the digital humanities: increased performance for automation and improved interpretability of algorithms.
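The article's Grad-CAM visualisations rely on established implementations; as a rough, generic sketch of the underlying idea (gradient-weighted activation maps from the last convolutional block), the following PyTorch code uses a stock ResNet-50 with untrained weights and a random tensor in place of a preprocessed glyph image. In practice a trained classifier (e.g. the article's Sketch-a-Net or fine-tuned ResNet-50) would be used.

```python
# Minimal, generic Grad-CAM sketch (illustrative; not the article's implementation).
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet50(weights=None)  # untrained weights; a trained model would be used in practice
model.eval()

activations, gradients = {}, {}

def fwd_hook(module, inputs, output):
    activations["value"] = output.detach()      # feature maps of the hooked layer

def bwd_hook(module, grad_input, grad_output):
    gradients["value"] = grad_output[0].detach()  # gradients w.r.t. those feature maps

# Hook the last convolutional block of ResNet-50.
target_layer = model.layer4[-1]
target_layer.register_forward_hook(fwd_hook)
target_layer.register_full_backward_hook(bwd_hook)

# Placeholder input standing in for a preprocessed glyph image.
image = torch.randn(1, 3, 224, 224)
logits = model(image)
class_idx = logits.argmax(dim=1).item()
logits[0, class_idx].backward()

# Weight each activation map by its spatially averaged gradient, then ReLU and normalise.
weights = gradients["value"].mean(dim=(2, 3), keepdim=True)
cam = F.relu((weights * activations["value"]).sum(dim=1, keepdim=True))
cam = F.interpolate(cam, size=image.shape[2:], mode="bilinear", align_corners=False)
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
print(cam.shape)  # -> torch.Size([1, 1, 224, 224]), a heatmap over the input
```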