
Abstract

Archaeologists often use data and quantitative statistical methods to evaluate their ideas. Although there are various statistical frameworks for decision-making in archaeology and science in general, in this chapter, we provide a simple explanation of Bayesian statistics. To contextualize the Bayesian statistical framework, we briefly compare it to the more widespread null hypothesis significance testing (NHST) approach. We also provide a simple example to illustrate how archaeologists use data and the Bayesian framework to compare hypotheses and evaluate their uncertainty. We then review how archaeologists have applied Bayesian statistics to solve research problems related to radiocarbon dating and chronology, lithic, ceramic, zooarchaeological, bioarchaeological, and spatial analyses. Because recent work has reviewed Bayesian applications in archaeology from the 1990s up to 2017, this work considers the relevant literature published since 2017.
The Bayesian Inferential Paradigm in Archaeology
ERIK OTÁROLA-CASTILLO*, MELISSA G. TORQUATO
Department of Anthropology, Purdue University, West Lafayette, Indiana, USA
CAITLIN E. BUCK*
School of Mathematics and Statistics, University of Sheffield, Sheffield, UK
R Markdown version last compiled on Wednesday November 17 2021, 11:27:10 AM, EST
Manuscript accepted to the Handbook of Archaeological Sciences, 2nd ed. Forthcoming volume
under contract (2022). Edited by M. Pollard, R.A. Armitage, and C.M. Makarewicz. Wiley.
*Corresponding Authors
email: eoc@purdue.edu
email: c.e.buck@sheffield.ac.uk
INTRODUCTION
Archaeologists often use data and quantitative statistical methods to evaluate their ideas. Although there
are various statistical frameworks for decision-making in archaeology and science in general, in this chapter,
we provide a simple explanation of Bayesian statistics. To contextualize the Bayesian statistical framework,
we briefly compare it to the more widespread null hypothesis significance testing (NHST) approach. We
also provide a simple example to illustrate how archaeologists use data and the Bayesian framework to
compare hypotheses and evaluate their uncertainty. We then review how archaeologists have applied Bayesian
statistics to solve research problems related to radiocarbon dating and chronology, lithic, ceramic,
zooarchaeological, bioarchaeological, and spatial analyses. Because recent work has reviewed Bayesian applications in
archaeology from the 1990s up to 2017 (Caitlin E. Buck, Cavanagh, and Litton 1996; Caitlin E. Buck 2001;
Otárola-Castillo and Torquato 2018), this work considers the relevant literature published since 2017.
Null hypothesis significance testing
Archaeologists use NHST to assess the extent to which well-observed material culture recovered from
archaeological sites aligns with their hypotheses about past people. Statisticians pioneered the NHST inferential
structure in the early twentieth century and, thanks to its success in research practice, it became widely
available to scientists of the time (e.g., R. A. Fisher 1925; Neyman and Pearson 1933: 294). Beginning in the 1950s,
various science-oriented archaeological works introduced NHST methodology to the field (e.g., Myers 1950;
Spaulding 1953; Binford 1964; Clarke 1968). Today, numerous textbooks continue to teach archaeological
scientists introductory NHST statistical concepts such as confidence intervals and p-values (e.g., Fletcher
and Lock 2005; Carlson 2017; McCall 2018; Banning 2020).
Statistical methods that follow the NHST framework provide inference by estimating the parameters
of a probability model used to represent the salient features of a population (e.g., the mean and variance).
Scientists usually hypothesize the value of the population’s parameters—the so-called “null” hypothesis—and
design experiments or observational studies to generate quantifiable data that can be used to test it. After
observation, the data are compared to the null hypothesis’ assumptions using a probability measure known
as the p-value. This comparative procedure first assumes a probability model for the underlying population,
then evaluates whether the data collected are expected or probable outcomes of that population, and thus
whether the null hypothesis is (plausibly) true.
A large p-value, usually greater than 0.05, indicates that the data are not extreme, and the analyst “fails to reject”
the null hypothesis. By contrast, a small or “significant” p-value, usually less than 0.05, indicates that the
data are extreme and have a low probability with respect to the assumptions stated in the null hypothesis.
In this case, investigators may “reject the null hypothesis” in favour of an alternative hypothesis. In short, to
arbitrate between hypotheses, NHST uses the probability of observing data at least as extreme as those
collected, assuming that the null hypothesis is true.
Although this approach is one of the most widely used inferential frameworks across the sciences, it has
had its share of criticism (e.g., Gelman 2006, 2018; Vidgen and Yasseri 2016). For example, statisticians
have recently targeted p-values mainly for their arbitrariness and misuse (Wasserstein, Schirm, and Lazar
2019). Although some mistake statistical significance for practical significance (e.g., Kramer, Veile, and
Otárola-Castillo 2016), the interpretation of significant p-values, in terms of rejecting the null hypothesis,
is well understood. However, how to interpret non-significant p-values is less clear. Similarly, the NHST
toolkit does not include acceptance of a null hypothesis. Nevertheless, some misunderstand this point and
attempt to use NHST to verify their null hypotheses.
Language appears to be part of the problem here: failing to reject a null hypothesis is not synonymous
with accepting it. Instead, “failing to reject” means that there is not enough evidence to invalidate the null
hypothesis. Moreover, the relationship between probabilities and alternative hypotheses is not clear and
is often misunderstood (Benjamin and Berger 2019). In particular, it is challenging to evaluate multiple
alternative hypotheses within the NHST framework. Indeed, the ability to assign probabilities to multiple
hypotheses in light of the data is one of the many reasons researchers have turned to Bayesian statistics.
BAYESIAN STATISTICS
During the late twentieth century, scientists popularized Bayesian inference, a statistical approach based on
developments made in the eighteenth century by Reverend Thomas Bayes (1763). Bayes was an English
Presbyterian minister and mathematician who solved problems in probability involving conditional and
prior probabilities (Bellhouse 2004). Soon after the popularization of Bayesian inference in the sciences,
archaeologists also incorporated Bayesian methods into their toolkits to evaluate hypotheses (e.g., Caitlin
E. Buck, Cavanagh, and Litton 1996). Today, Bayesian methods have proliferated throughout the scientific
literature, including in anthropological and archaeological science (Gelman et al. 2020; Otárola-Castillo and
Torquato 2018; McElreath 2020). In the past, feasible execution of Bayesian methods was difficult because
some calculations are intractable and require intensive computation. Today’s powerful personal computers
and high-speed Markov Chain Monte Carlo (MCMC) algorithms, such as the Metropolis-Hastings, Gibbs,
and Hamiltonian procedures, have helped to overcome this obstacle and further popularize the approach
(e.g., Howson and Urbach 2006: xi; Robert and Casella 2011; Dunson and Johndrow 2020).
Another reason for Bayesian approaches’ increased popularity might be the simplicity of interpreting
probabilities compared to the p-values used in NHST (Otárola-Castillo and Torquato 2018). Scientists apply
Bayesian inference to compute the probability of a hypothesis directly and thus obtain clearer and more
direct interpretations than those available from NHST. Also, as with NHST, the probability of the data
under a given hypothesis is computed, usually via an explicit probability model, known as a likelihood.
We formally define these terms below, but in summary, the likelihood is a statistical function whose form is
determined by the specific probability model we are using. Crucially, Bayesian inference enables researchers
to incorporate their expert (or prior) knowledge about the hypothesis into the statistical analysis. Experts’
prior knowledge in a field can be quite valuable; however, it is not often operationalized. Practitioners of
Bayesian inference convert prior knowledge into prior probabilities and use them as part of statistical
analyses. Once the prior probability has been determined, as with NHST, new data are observed to test the
hypothesis. The likelihood is combined with (or weighted by) the prior to give the Bayesian posterior
distribution. From this, the probability of the hypothesis given the observed data and the prior knowledge
can be computed (Caitlin E. Buck, Cavanagh, and Litton 1996). These steps, including the formalization of
a simple prior probability, likelihood, and computation of the posterior will be exemplified below in a simple
archaeological example.
The primary advantage of Bayesian statistics over NHST is the clarity of the inferences drawn from the
analysis. Furthermore, by formally including previous experience or expert information, prior probabilities
offer practical improvements over NHST, typically reducing uncertainty in the conclusions reached (George
L. Cowgill 2001). Including prior knowledge produces a comprehensive understanding of the proposed
hypothesis’ relevance to a larger body of knowledge. Moreover, incorporating prior probabilities enables
Bayesian inferences to be “updated,” creating a cyclical effect as current knowledge becomes prior knowledge
for future studies. Perhaps Dennis Lindley (1972) best summarized the Bayesian learning process by writing
the aphorism “today’s posterior is tomorrow’s prior.” Helpfully, it is also possible to use what is known
as a flat, vague, or uninformative prior (as we do in our example below) in situations where little or no
expert prior knowledge is available, but one may wish to take advantage of the other features of the Bayesian
framework.
To further contextualize the application of Bayesian statistics, we provide an example that illustrates
how one can use Bayesian statistics to select a hypothesis and solve an archaeological research problem.
The example demonstrates how archaeologists can make probabilistic inferences using data and simple prior
information about a hypothesis, how to evaluate the uncertainty surrounding a hypothesis, why this approach
seems less ambiguous than NHST, and thus why it is becoming increasingly popular. We also formally define
the Bayesian framework and review recent Bayesian statistics applications in the archaeological literature.
A SIMPLE ARCHAEOLOGICAL EXAMPLE
Otárola-Castillo and Torquato (2018) introduced a simple example to contrast NHST and Bayesian
inference. They presented an artificial case study where an archaeologist proposed to infer projectile propelling
technology from its relationship to stone projectile morphology. In their example, the archaeologist used
the known relationship between projectile point propelling technology and point size, from an ethnographic
context. The archaeologist used this relationship as a frame of reference to infer the propelling technology
of a sample of stone projectile points recovered from a multi-component archaeological site. Using known
measurements of each technology type, Otárola-Castillo and Torquato (2018) demonstrated how the
archaeologist could use NHST and a Bayesian framework to infer the most likely propelling technology (Table 1).
Table 1 Summary statistics (mean and standard deviation) of artificial maximum projectile point
lengths recovered from the Early and Late Period archaeological contexts, along with equivalent
summaries of the maximum point lengths known to be associated with different propelling
technologies. The latter are used to define the hypotheses to be evaluated using the archaeological
data.

Archaeological Projectile Data
                    Early Period    Late Period
Mean Length (cm)    6.1             13
SD (cm)             2               3.2
N                   10              9

Propelling Hypotheses
                    Arrow    Dart    Spear
Mean Length (cm)    6.9      11      14
SD (cm)             2        2       2

The simulated data are maximum length measurements of projectile points from the Early (N=10) and
Late (N=9) components of an archaeological site (upper part of Table 1). The archaeologist also measured
the maximum lengths of a large sample of ethnographic projectile points with known propelling technology,
summarized in the lower part of Table 1 by their means and standard deviations. The hypotheses to be tested
are that the archaeological data from the Late and Early Period derive from each of the three ethnographically
observed propelling technologies: 1) bow and arrow, 2) atlatl and dart, and 3) hand-thrown spear.
Analysis using NHST
The archaeologist tested the hypotheses that the archaeological data were plausible given the ethnographic
data relating to each propelling technology. To do this, they used the means (μ) in Table 1 and formalized
the hypotheses shown in Table 2.
Table 2 Null and alternative hypotheses used for NHST.

H0 - the null hypotheses
Early Period                       Late Period
μ_{Early Period} = μ_{Arrow}       μ_{Late Period} = μ_{Arrow}
μ_{Early Period} = μ_{Dart}        μ_{Late Period} = μ_{Dart}
μ_{Early Period} = μ_{Spear}       μ_{Late Period} = μ_{Spear}

HA - the alternative hypotheses
Early Period                       Late Period
μ_{Early Period} ≠ μ_{Arrow}       μ_{Late Period} ≠ μ_{Arrow}
μ_{Early Period} ≠ μ_{Dart}        μ_{Late Period} ≠ μ_{Dart}
μ_{Early Period} ≠ μ_{Spear}       μ_{Late Period} ≠ μ_{Spear}

They then assumed that the summaries in Table 1 were for samples from populations distributed under
“Normal” probability models and applied the well-known z-test (Diez, Barr, and Cetinkaya-Rundel 2019:
134). The same Normal probability model assumptions will also be useful to generate the likelihood function
in the Bayesian analysis, later on. Knowing the means and standard deviations of the ethnographic and
archaeological data, the archaeologist in Otárola-Castillo and Torquato (2018) computed the z-scores and their
associated p-values (Table 3).
Table 3 Results of the z-score hypothesis tests described in the text, including the associated
p-values.

Early Period
              z-score    p-value
Arrow         -1.26      0.21
Dart Tips     -7.75      <0.001
Spear Tips    -12.49     <0.001

Late Period
              z-score    p-value
Arrow         5.71       <0.001
Dart Tips     1.87       0.06
Spear Tips    0.94       0.35

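The z-tests summarized in Table 3 can be sketched in a few lines of Python. This is a minimal illustration, not the original authors' code. It assumes a two-sided test and assumes that the standard error is computed from each period's sample SD and N in Table 1; the function name `z_test` and the dictionaries are hypothetical. Under these assumptions the magnitudes of the z-scores and the p-values should agree with Table 3 to within rounding, although the sign convention may differ.

```python
import math

def z_test(sample_mean, sample_sd, n, hypothesized_mean):
    """Two-sided one-sample z-test; returns (z, p)."""
    se = sample_sd / math.sqrt(n)                # standard error of the mean
    z = (sample_mean - hypothesized_mean) / se
    # Two-sided p-value under the standard Normal: 2 * (1 - Phi(|z|))
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Summary statistics taken from Table 1: (mean, SD, N)
periods = {"Early": (6.1, 2.0, 10), "Late": (13.0, 3.2, 9)}
# Hypothesized means from the ethnographic data (Table 1)
hypotheses = {"Arrow": 6.9, "Dart": 11.0, "Spear": 14.0}

results = {}
for period, (mean, sd, n) in periods.items():
    for name, mu0 in hypotheses.items():
        z, p = z_test(mean, sd, n, mu0)
        results[(period, name)] = (z, p)
        print(f"{period:5s} {name:5s} z = {z:7.2f}  p = {p:.3g}")
```

Note that every test here compares the sample against one hypothesis at a time, which is why two hypotheses can both survive for the Late Period.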
Using this method, because the p-values were less than 0.001, they rejected the null hypotheses that the
means of the Early Period projectile points resembled those of darts or spears (Tables 2 and 3). Instead,
the archaeologist determined that the points may have come from a population of arrow projectile points
because the associated p-value is greater than 0.05 (Table 3). Thus, there was not enough evidence to reject
this null hypothesis. NHST allows the archaeologist to infer that “the Early Period sample does not have a
low probability of resulting from a population of arrow tips,” and they do not reject this hypothesis.
Following this exact procedure for the Late Period, the archaeologist obtained a p-value less than 0.001
for the arrow hypothesis. However, the p-values for the spear tip and dart tip hypotheses are both greater
than 0.05. Therefore, although the archaeologist may reject the arrow hypothesis, they cannot
distinguish between the inferences “the sample does not have a low probability of resulting from a population of dart tips”
and “the sample does not have a low probability of resulting from a population of spear tips”.
Bayesian analysis
The archaeologist in Otárola-Castillo and Torquato (2018) then compared the NHST analysis to one using a
Bayesian framework. The authors did this to show how archaeologists might apply Bayesian statistics to
assign probabilities to the hypotheses that the archaeological projectile points were arrows, dart tips, or
spear tips, given the data in hand.
[Figure 1. Two panels: Early Period (top) and Late Period (bottom). x-axis: Length (0 to 25); y-axis: Likelihood of Length (0.00 to 0.20). Legend: Simulated length, Empirical length.]
Figure 1: Likelihoods of the maximum length data. The red dashed lines illustrate the “Normal”
likelihood of the maximum lengths data measurements obtained empirically from the archaeological Early
(top) and Late (bottom) periods. The likelihood values were obtained after estimating the parameters of
the Normal probability distribution that maximized the likelihood of the measurement values (i.e.,
Maximum Likelihood Estimation, MLE). Following this procedure, we used the MLE parameter estimates
to simulate the likelihood of hypothetical maximum length values greater than the range observed
archaeologically. The black solid line depicts these values. In both panels, we overlaid the empirical (gray
dotted) over the hypothetical (black solid) likelihood estimates for comparison.
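The maximum likelihood step described in the caption can be sketched as follows. This is an illustrative sketch only: the `lengths` values are hypothetical (they are not the data behind Figure 1), and the helper names `normal_mle` and `normal_pdf` are assumptions introduced here. For a Normal model, the MLE of the mean is the sample mean and the MLE of the variance divides by n rather than n - 1.

```python
import math

def normal_mle(values):
    """Maximum likelihood estimates (mean, sd) for a Normal model."""
    n = len(values)
    mean = sum(values) / n
    # The MLE of the variance divides by n (not n - 1).
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, math.sqrt(var)

def normal_pdf(x, mean, sd):
    """Density of a Normal(mean, sd) distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

# Hypothetical maximum lengths (cm); NOT the data behind Figure 1.
lengths = [4.2, 5.0, 5.5, 6.0, 6.1, 6.3, 6.8, 7.0, 8.1, 9.0]
mu_hat, sd_hat = normal_mle(lengths)

# Evaluate the fitted density on a grid from 0 to 25 cm, which extends
# beyond the observed range, as described in the caption.
grid = [i * 0.5 for i in range(51)]
curve = [(x, normal_pdf(x, mu_hat, sd_hat)) for x in grid]
```

Plotting `curve` would give a bell-shaped line of the kind shown in each panel of Figure 1.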
This approach is advantageous when a scientist uses multiple working hypotheses and is interested in
deciding which is most probably supported by the data. The Bayesian framework can achieve this goal by using
the same assumptions about the underlying probability distributions as those in the NHST approach. Using
Bayes’ theorem, one may then represent prior knowledge as a corresponding prior probability distribution
and calculate each hypothesis’s posterior probability.
To conduct this analysis, Otárola-Castillo and Torquato followed the procedures on likelihoods, prior
and posterior probabilities, and MCMC sampling we outlined in our Bayesian Statistics section above.
For additional technical detail, we refer the reader to the What is Bayes’ Theorem section below. The
authors modelled the likelihood of the maximum projectile length using the “Normal” probability model
(Figure 1). They also modelled prior knowledge using a uniform probability distribution to reflect no
previous information and demonstrate the probabilistic approach to hypothesis selection. Under the uniform
distribution, the prior probabilities of all maximum projectile lengths were identical. Together, the likelihood
and prior probability models are foundational components of Bayesian Inference.
The archaeologist in Otárola-Castillo and Torquato (2018) then used an MCMC procedure to calculate the
probability of each hypothesis: that the archaeological samples were either arrow, dart, or spear, applying
a two-step process. First, they generated samples from the posterior distribution via MCMC. Second,
they compared the sampled values to intervals defined by one standard deviation around the mean of the
ethnographically observed values for each of the point types (Table 4, Figure 2). In this way, sampled
posterior point lengths that lay between 4.9 cm and 8.9 cm were defined (a posteriori) as Arrows; those
in the range 9 to 13 cm as Darts; and those in the range 12 to 16 cm as Spears. Point lengths outside the range of 4.9 to 16
cm were outside of the evaluated hypotheses. Therefore, they were declared to belong to a group labelled
“Other.” There is some overlap between the summary probability distributions implied by the ethnographic
data. Thus, the hypotheses are not mutually exclusive. In other words, due to overlapping measurements,
some projectile points may be consistent with more than one hypothesis. We discuss the implications of this
overlap below.
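The two-step procedure described above (sample from the posterior, then compare the samples to the hypothesis intervals) can be sketched with a random-walk Metropolis sampler. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the quantity being sampled is the population mean length, a flat prior, a Normal likelihood with the standard error taken from Table 1, and hypothetical tuning values (`draws`, `burn_in`, `step`, `seed`). Because the original model details are not given here, the proportions it yields need not match Table 4.

```python
import math
import random

def log_posterior(mu, sample_mean, sample_sd, n):
    """Log posterior of the mean under a flat prior (up to a constant)."""
    se = sample_sd / math.sqrt(n)
    return -0.5 * ((sample_mean - mu) / se) ** 2

def metropolis(sample_mean, sample_sd, n, draws=20000, burn_in=2000,
               step=1.0, seed=1):
    """Random-walk Metropolis sampler for the posterior of the mean."""
    rng = random.Random(seed)
    mu = sample_mean                      # start the chain at the sample mean
    current = log_posterior(mu, sample_mean, sample_sd, n)
    samples = []
    for i in range(draws + burn_in):
        proposal = mu + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, sample_mean, sample_sd, n)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            mu, current = proposal, candidate
        if i >= burn_in:                  # discard the burn-in draws
            samples.append(mu)
    return samples

# One-standard-deviation intervals around the ethnographic means (Table 1).
intervals = {"Arrow": (4.9, 8.9), "Dart": (9.0, 13.0), "Spear": (12.0, 16.0)}

def classify(samples):
    """Proportion of sampled values falling inside each interval."""
    total = len(samples)
    return {name: sum(lo <= s <= hi for s in samples) / total
            for name, (lo, hi) in intervals.items()}

early = classify(metropolis(6.1, 2.0, 10))
late = classify(metropolis(13.0, 3.2, 9))
print("Early:", early)
print("Late: ", late)
```

Because the Dart and Spear intervals overlap between 12 and 13 cm, a single sampled value can count toward both, so the proportions within a period need not sum to 1.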
Next, the archaeologist calculated the posterior probabilities by dividing the number of sampled point lengths
consistent with each hypothesis by the total number of sampled point lengths for each period. The posterior
probability of point lengths for each hypothesis is reported in Table 4 and illustrated in Figure 2. Since the
hypotheses are non-mutually exclusive, the posterior probabilities corresponding to each hypothesis within
a period are not expected to sum to 1. Depending on one’s hypotheses, the interpretation of probabilities
relating to non-mutually exclusive outcomes may be problematic. For example, one might ask, what is the
probability that the projectile points are Arrows or Darts? In this case, because some Arrows may be similar
in length to Darts, these are not mutually exclusive outcomes, and one may not simply add their respective
probabilities together. The solution is to use the General Addition Rule of probability (see Diez, Barr, and
Cetinkaya-Rundel 2019: 83-88). We do not evaluate such a hypothesis in this example but note this rule
so that readers may adopt it if needed for their own work.
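As a brief illustration of the General Addition Rule, P(A or B) = P(A) + P(B) - P(A and B), the sketch below uses a small set of hypothetical sampled lengths (not values from the chapter) and the Dart and Spear intervals, which are the two intervals that overlap (between 12 and 13 cm). The variable and function names are assumptions made for this example.

```python
# Hypothetical sampled point lengths (cm), for illustration only.
samples = [8.5, 9.5, 10.2, 11.0, 12.4, 12.8, 13.5, 14.1, 15.0, 17.2]

dart = (9.0, 13.0)    # one SD around the Dart mean (Table 1)
spear = (12.0, 16.0)  # one SD around the Spear mean (Table 1)

def inside(x, interval):
    """True when x lies within the closed interval."""
    lo, hi = interval
    return lo <= x <= hi

n = len(samples)
p_dart = sum(inside(x, dart) for x in samples) / n
p_spear = sum(inside(x, spear) for x in samples) / n
p_both = sum(inside(x, dart) and inside(x, spear) for x in samples) / n

# General Addition Rule: subtract the overlap so it is not counted twice.
p_either = p_dart + p_spear - p_both

# Direct count of samples consistent with at least one hypothesis.
p_either_direct = sum(inside(x, dart) or inside(x, spear) for x in samples) / n
print(p_dart, p_spear, p_both, p_either, p_either_direct)
```

Simply adding `p_dart` and `p_spear` would count the samples in the overlap twice; subtracting `p_both` recovers the directly counted proportion.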
Using the resulting Bayesian posterior probability distribution to conduct inference lets scientists make
fully probabilistic statements about their hypotheses and thus make more explicit comparisons than those
provided by the NHST framework. The results highlighted by Figure 2 seem clear regarding the probability
of each hypothesis. We will discuss these further. After examining the resulting posterior probabilities,
the archaeologist determined that the Early period sample points were most probably used as arrows (with
probability 0.97) and likely propelled by a bow-like mechanism. This mode of stone point propelling changed
during the Late Period when the people living on this site began to use mainly hand-thrown spears (with
probability 0.89).
In this way, the Bayesian approach to testing hypotheses leads to results that are more readily interpreted
than those via the p-value based NHST. In particular, we are provided with measures of probability that
the data support the hypotheses, which have considerably more intuitive interpretation than those provided
by p-values.
Table 4 Posterior probabilities that the Early and Late Period maximum projectile point lengths
were associated with arrow, dart, and spear propelling technologies.

Function      Mean ± SD of max. point length (cm)    Early Period    Late Period
Arrow         6.9 ± 2                                0.97            0.0009
Dart Tips     11 ± 2                                 0.004           0.17
Spear Tips    14 ± 2                                 0.00002         0.89

[Figure 2. Posterior distributions for the Early Period and Late Period. x-axis: Point length (cm), 0 to 26; y-axis: Period. Legend (Hypothesis): Arrow, Dart, Spear, Other.]
Figure 2: Bayesian posterior probability distributions of each of three propelling technology hypotheses:
a) Arrow, b) Dart, c) Spear in the Early (bottom) and Late (top) periods. The amount of area under the
curve reflects the probability of each hypothesis expressed as percentages.
WHAT IS BAYES’ THEOREM
Bayes’ theorem is an algorithm for obtaining the value of a conditional probability statement, when one
knows its inverse. It is usually exemplified by considering two related events, A and B. Put simply, Bayes’
theorem states that (equation 1):
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)} \tag{1}$$
In this case, to obtain the conditional probability of A given B, P(A|B) (here P represents probability
and | is read as ‘given’), one needs to divide the joint probability of A and B, P(A and B), also written
P(A ∩ B), by the marginal probability of B, P(B). The product of P(B|A) and P(A) is the joint probability
P(A and B), so equation (1) can equivalently be written as equation (2):
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} \tag{2}$$
where the joint probability is divided by the marginal P(B). Statisticians call P(A|B) the posterior
probability of A given B, P(B|A) the inverse conditional (or likelihood) of B given A, and P(A) the prior
probability of A.
The link between Bayes’ theorem, inference, data and hypotheses
The simulated archaeological scenario above provided a tangible example of the different components of a
Bayesian analysis, including an event’s probability, the probability of one event given another, prior and
posterior probabilities. Although the procedure here is specific to archaeological data, Bayes’ theorem is a
very general algorithm that is useful for a wide variety of data and data-generating processes. This section
generalizes Bayes’ theorem to a variety of other scenarios.
We stated earlier that Bayesian statistics uses the data in hand (D) to assign probabilities to hypotheses
about a population (H). The statement P(H|D), i.e., the probability of the hypothesis given the data,
formalizes this relationship. To operationalize this statement in the context of data and hypotheses, Bayes’
theorem functions as
$$P(H \mid D) = \frac{P(D \mid H)\,P(H)}{P(D)} \tag{3}$$
where: P(H|D) is the posterior probability, i.e., the probability of the hypothesis given the data in hand;
P(D|H) is the probability of the data given the hypothesis, or the “likelihood” of the observed data; P(H) is
the prior probability of the hypothesis (before the data were observed); and P(D) is the probability of the
data in hand (out of all possible values of the data). Alternatively, using modern statistical vernacular, this
operation can then be expressed in a slightly different form as:
$$\text{Posterior} = \frac{\text{Likelihood} \times \text{Prior}}{P(\text{Data})} \tag{4}$$
In Otárola-Castillo and Torquato’s artificial example, the hypotheses represented the belief that the
observed Early and Late period projectile point data represented samples from populations derived from
particular propelling technologies. The data were modelled by the Normal probability distribution, and the
hypotheses were characterized by the values of the model’s parameters.
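To make equation (3) concrete, the sketch below applies it to a finite set of hypotheses. It is a simplified illustration rather than the procedure used in the example above: it assumes the three hypotheses are mutually exclusive and exhaustive, gives each an equal prior of 1/3, and takes the likelihood to be the Normal density of the observed sample mean, with a standard error built from the ethnographic SD in Table 1. The function names are hypothetical.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a Normal(mean, sd) distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def posterior_over_hypotheses(sample_mean, n, hypotheses, priors):
    """Apply P(H|D) = P(D|H) P(H) / P(D) to a finite set of hypotheses."""
    # P(D|H): likelihood of the sample mean; sd / sqrt(n) is its standard error.
    likelihood = {h: normal_pdf(sample_mean, mu, sd / math.sqrt(n))
                  for h, (mu, sd) in hypotheses.items()}
    # P(D): sum of likelihood x prior over all hypotheses.
    evidence = sum(likelihood[h] * priors[h] for h in hypotheses)
    return {h: likelihood[h] * priors[h] / evidence for h in hypotheses}

# Ethnographic (mean, SD) for each propelling technology, from Table 1.
hypotheses = {"Arrow": (6.9, 2.0), "Dart": (11.0, 2.0), "Spear": (14.0, 2.0)}
priors = {h: 1 / 3 for h in hypotheses}   # equal prior weight (an assumption)

early = posterior_over_hypotheses(6.1, 10, hypotheses, priors)
late = posterior_over_hypotheses(13.0, 9, hypotheses, priors)
print("Early:", early)
print("Late: ", late)
```

Because the hypotheses are treated as exclusive here, the posterior probabilities within each period sum to 1, unlike the interval-based probabilities in Table 4.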
We use the symbol x to represent the observed data and the symbol 𝜃 to represent the parameter(s) of
our model of the population that we are trying to learn about. Given x and a model with parameter(s) 𝜃,
we can more formally describe Bayes’ theorem and its three components: the likelihood, the prior, and the
posterior.
i. The likelihood is a statistical function. Its form is determined by the specific probability model we are
using but, in general terms, it is represented by P(x|𝜃). Consequently, the likelihood is the probability
of observing particular data values given some specific values of the unknown parameters. Thus, this
is a formal statement of the relationship between what we want to learn and the data we collect.
ii. The prior is also a function and can be represented by P(𝜃). In simple terms, we can think of this as
the probability we attach to observing specified values of the unknown parameters before (a priori) we
observe the data. In other words, this is a formal statement of what we knew before the latest data
were collected.
iii. The posterior is the probability distribution that we want to obtain (a combination of the information
contained in the data, the likelihood and the prior) and can be represented by P(𝜃|x). Put plainly, this
is the probability we attach to specified values of the unknown parameters after observing the data.
In this more technical context, we can express Bayes’ theorem as:
$$P(\theta \mid x) = \frac{P(x \mid \theta)\,P(\theta)}{P(x)} \tag{5}$$
In addition, the numerator, the product of the likelihood and the prior probability without the normalizing
denominator P(x), is proportional to (∝) the posterior and is often computed and expressed by
$$P(\theta \mid x) \propto P(x \mid \theta)\,P(\theta) \tag{6}$$
or,
$$\text{Posterior} \propto \text{Likelihood} \times \text{Prior} \tag{7}$$
In this manner, Bayesian statistics offers an alternative statistical framework for evaluating
hypotheses through a mechanism for obtaining a posteriori information about the parameter values of interest,
based upon the data, a model, and appropriately formulated prior information. In other words, given an
explicit statement of our a priori information, a clearly defined statistical model and a desire to obtain a
posteriori understanding, Bayes’ theorem provides us with a probabilistic framework within which to make
interpretations.
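The proportionality between the posterior and the product of likelihood and prior can be illustrated with a simple grid approximation. The sketch below is an assumption-laden toy: the data `x` are hypothetical, the standard deviation `sigma` is treated as known, the prior is flat over a grid from 0 to 25 cm, and 𝜃 is taken to be the population mean. Multiplying likelihood by prior at each grid point and then dividing by the total plays the role of the normalizing denominator P(x).

```python
import math

# Hypothetical observed maximum lengths (cm); illustrative only.
x = [5.1, 6.4, 7.0, 5.8, 6.2]
sigma = 2.0  # assumed known standard deviation

# Grid of candidate values for theta (the population mean), 0 to 25 cm.
grid = [i * 0.05 for i in range(501)]

def log_likelihood(theta):
    """Log of the Normal likelihood of x given theta (up to a constant)."""
    return sum(-0.5 * ((xi - theta) / sigma) ** 2 for xi in x)

prior = [1.0 for _ in grid]  # flat prior over the grid

# Likelihood x prior at each grid point (shifted by the max for stability).
log_lik = [log_likelihood(t) for t in grid]
shift = max(log_lik)
unnormalized = [math.exp(l - shift) * p for l, p in zip(log_lik, prior)]

# Normalizing: the sum over the grid stands in for P(x).
total = sum(unnormalized)
posterior = [u / total for u in unnormalized]

posterior_mean = sum(t * p for t, p in zip(grid, posterior))
print(round(posterior_mean, 2))
```

With a flat prior the posterior is centred on the sample mean of `x`; replacing `prior` with an informative set of weights would pull the posterior toward the prior.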
In addition to the coherent and explicit nature of the framework, there is another attractive feature of
adopting the Bayesian paradigm in that it allows us to learn from experience. Priors enable the explicit
contextualization of previous knowledge or beliefs about the topic under investigation (George L. Cowgill
1993; Caitlin E. Buck, Cavanagh, and Litton 1996). This should be a natural feature to archaeologists for
whom context is quite meaningful, or as Caitlin E. Buck, Cavanagh, and Litton (1996) discuss, archaeologists
interpret the discovery of new artifacts in conjunction with artifacts that have already been discovered.
Moreover, today’s posterior information (based on current data and prior information) is in a suitable
form to become the prior for further work if and when more data become available. Few other interpretative
frameworks offer a clear structure for updating one’s beliefs in the light of new information and yet it is such
an important part of most intuitive approaches to learning about the world in which we live.
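The idea that today's posterior becomes tomorrow's prior can be sketched with the standard conjugate update for a Normal mean with known standard deviation. The two "seasons" of data, the vague starting prior, and the function name `update_normal` are all hypothetical choices made for illustration. The sketch shows that updating in two stages gives the same posterior as analysing all the data at once.

```python
def update_normal(prior_mean, prior_var, data, sigma):
    """Conjugate update for a Normal mean with known sd sigma.

    Returns the posterior (mean, variance) given a Normal prior.
    """
    n = len(data)
    post_precision = 1.0 / prior_var + n / sigma ** 2
    post_mean = (prior_mean / prior_var + sum(data) / sigma ** 2) / post_precision
    return post_mean, 1.0 / post_precision

sigma = 2.0                           # assumed known measurement sd (cm)
season_1 = [5.1, 6.4, 7.0]            # hypothetical lengths, first season
season_2 = [5.8, 6.2, 6.9, 5.5]       # hypothetical lengths, second season

vague_prior = (10.0, 100.0)           # (mean, variance): weak prior knowledge

# Season 1 posterior becomes the prior for season 2.
after_1 = update_normal(*vague_prior, season_1, sigma)
after_2 = update_normal(*after_1, season_2, sigma)

# Analysing both seasons together from the original prior.
all_at_once = update_normal(*vague_prior, season_1 + season_2, sigma)
print(after_1, after_2, all_at_once)
```

The posterior variance shrinks after each season, reflecting the reduction in uncertainty as evidence accumulates.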
OTHER ARCHAEOLOGICAL APPLICATIONS
Chronological modelling
The reliable construction of chronologies is an integral part of all archaeological research. Consequently, an
abundance of research has been conducted to create robust chronologies and thus, assist archaeologists in
interpreting past events. Early work laid the foundation for the use of Bayesian methods in chronological
modelling to improve precision (Naylor and Smith 1988; Caitlin E. Buck et al. 1991; Caitlin E. Buck, Litton,
and Smith 1992). The advent and continued improvement of user-friendly modelling software, including
BCal (Caitlin E. Buck, Christen, and James 1999) and OxCal (Bronk Ramsey 1994, 2017), has enabled
many archaeologists to employ Bayesian chronological modelling in their research. In fact, the construction
of chronologies has been described as the one archaeological application of Bayesian methods that is now
routine (Caitlin E. Buck and Meson 2015).
There has been a documented increase in the use of Bayesian chronological modelling over the last decade
(Bayliss 2015; Hamilton and Krus 2018), as numerous studies have re-examined radiocarbon dates to refine
regional chronologies. Although these methods were initially used by archaeologists in the United Kingdom
(Hamilton and Krus 2018), Bayesian chronological studies have now been conducted in nearly every region
of archaeological interest, including:
i. Central America (Inomata et al. 2017; Mendelsohn 2018; Tsukamoto et al. 2020),
ii. South America (Erik J. Marsh et al. 2017; Wynveldt et al. 2017),
iii. Europe (Arvaniti and Maniatis 2018; Jiménez et al. 2018; Krajcarz et al. 2018; Manning et al. 2018;
Ricci et al. 2018; Paulsson 2019),
iv. Asia (Long, Wagner, and Tarasov 2017; Ricci et al. 2018; Birch-Chapman and Jenkins 2019; Yang et
al. 2019),
v. Africa (Kramer, Veile, and Otárola-Castillo 2016; Brandt et al. 2017; Sadr et al. 2017; Loftus, Mitchell,
and Bronk Ramsey 2019), and
vi. Oceania (Brockwell et al. 2017; Kirch and Swift 2017; Urwin and Arifeae 2018; David et al. 2019).
Indeed, even those archaeologists who simply report individual, calibrated radiocarbon dates are now
reliant on Bayesian methods since the most recent estimates of the radiocarbon calibration curves (IntCal20,
SHCal20, and Marine20) are grounded in Bayesian inference. They were constructed using a Bayesian spline
approach to combine data from tree rings, floating tree-ring chronologies, lacustrine and marine sediments,
speleothems, and corals (Reimer et al. 2020).
Archaeologists have also applied Bayesian methods to other absolute dating techniques. For example, recent
studies have constructed chronological models using optically-stimulated luminescence (OSL) dates (Clarkson
et al. 2017; Combès and Philippe 2017; Veth 2017; Jiménez et al. 2018; Demuro et al. 2019; Heydari et al.
2020), and dendrochronology (Millard 2002; Hassan, Jones, and Buck 2019; Lorentzen, Manning, and Cvikel
2020).
Perhaps most significantly, Bayesian chronological modelling enables archaeologists to include numerous
sources of archaeological dates in a single interpretive framework, including those drawn from relative dating
and absolute dating, to create chronologies of hard-to-date contexts. By combining relative dating and
absolute dating methods with Bayesian modelling, archaeologists can produce more precise and accurate
dates (George L. Cowgill 2015). For example, Croix et al. (2019) combined artifact chronologies, coin
dates, and radiocarbon dating in a Bayesian model to date earthworks in Denmark. Prior to this research,
dating these structures was difficult due to the limited survival of dateable artifacts and the reuse of building
materials in antiquity. By constructing a Bayesian chronological model using coin age and radiocarbon
dates, researchers improved the dating precision of the earthworks. Furthermore, DiNapoli et al. (2020)
used a Bayesian modelling approach to combine radiocarbon dates, stratigraphy, and ethnohistoric accounts
to examine the collapse and resilience of populations on Rapa Nui. Other examples of studies include those
combining absolute dating methods (e.g., Anyon et al. 2017; Fitzsimmons et al. 2017; Smith, Williams, and
Ross 2017) and those drawing on relative and absolute dating methods (e.g., Guérin et al. 2017; Douka et
al. 2019).
Other studies have used Bayesian modelling to clarify the complex relationship between humans and the
environment. For example, Banks and colleagues (2019) used Bayesian hierarchical modelling to date
archaeological cultures of Upper Palaeolithic France. These dates were then compared to palaeoecological
records to determine the palaeoclimatic variability during each period. Similarly, Kearney (2019) used
Bayesian methods to combine archaeological and palaeoecological chronologies in a study examining the
connection between vegetation changes and human activity near a megalithic tomb dating to the Neolithic
in Ireland. Using this method, he was able to determine whether significant palynological events occurred before,
after or during the construction and use of the tomb. Ultimately, he determined that the clearing of the
woodland occurred prior to the construction of the megalith.
Artifact analysis
Bayesian inference has been applied in numerous ways to study a broad array of artifacts, including ceramics,
bone and stone tools. Early applications examined the provenance of artifacts and ceramic seriation (e.g.,
Caitlin E. Buck and Litton 1990; Caitlin E. Buck, Cavanagh, and Litton 1996; Halekoh and Vach 1999;
Robertson 1999). Continued research examining ceramics has utilized Bayesian modelling of radiocarbon
dates to determine the chronologies of ceramic artifacts by combining absolute dating and studies of ceramic
typologies (e.g., Naylor and Smith 1988). Similar methods have been used to examine ceramic traditions
in Europe (Krol, Dee, and Nieuwhof 2020), Bolivia (Erik J. Marsh et al. 2019), Guatemala (Arroyo et al.
2020), and Papua New Guinea (Skelly et al. 2018). The combination of chronological modelling and ceramic
data has been used to examine the dispersal and spread of ceramic cultures (e.g., Méhault 2017; Binder et
al. 2018).
Recently, the application of Bayesian modelling to ceramic analysis has extended beyond seriation. For
example, Fernandes et al. (2018) used a Bayesian approach to identify the types of food that created
residues in prehistoric European pottery. By analysing carbon isotope measurements and comparing them
with measurements from known sources, the authors determined which foods had contributed to the residues
and thus how the pots had been used. Since pots are reused to prepare multiple types of foods, results can
be ambiguous when identifying the foods contributing to residues. The use of Bayesian methods addressed
this ambiguity by estimating the contribution of various food types to the residues.
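The idea of estimating source contributions can be illustrated with a deliberately simplified mixing model. The sketch below assumes a single isotope measurement, two food sources with known end-member values, normal measurement error, and a uniform prior on the mixing proportion. All numbers are invented, and this is not the model of Fernandes et al. (2018), which is considerably richer.

```python
import math


def mixing_posterior(observed, sd, source_a, source_b, steps=101):
    """Grid posterior for the proportion p of source A in a two-source mixture.

    Assumes observed ~ Normal(p * source_a + (1 - p) * source_b, sd)
    and a uniform prior on p over [0, 1].
    """
    grid = [i / (steps - 1) for i in range(steps)]
    likelihood = [
        math.exp(-0.5 * ((observed - (p * source_a + (1.0 - p) * source_b)) / sd) ** 2)
        for p in grid
    ]
    norm = sum(likelihood)
    return grid, [value / norm for value in likelihood]


# Hypothetical values: a residue at -26 per mil, sources at -30 and -22 per mil.
grid, post = mixing_posterior(observed=-26.0, sd=1.0, source_a=-30.0, source_b=-22.0)
post_mean = sum(p * w for p, w in zip(grid, post))
print(round(post_mean, 3))
```

The output of such a model is a full distribution over the proportion rather than a single "best" source, which is what allows a pot used for several foods to be described without forcing a single classification.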
Furthermore, Bayesian methods are becoming integral in the study of stone and bone tools. Researchers
have used Bayesian methods to test hypotheses about stone tool assemblages (e.g., Marwick et al. 2016)
and develop techniques for studying stone tools. These techniques allow researchers to assign probabilities
to the phenomenon being studied. For example, Murray et al. (2020) developed a novel method combining
3D microscopic analyses of surface roughness and a Bayesian probability model to evaluate if Middle Stone
Age silcrete tools from Pinnacle Point 13B (South Africa) had been heat treated. The model measured the
probability that a tool had been heat treated, allowed for continued updating from future heat treatment
experiments, and performed with high accuracy. Similarly, other researchers combined a taphonomic anal-
ysis of the surface of unworked bone and bone tools with multivariate Bayesian modelling to quantify the
taphonomic changes on the surfaces of the unworked and worked bones to accurately predict the original
surface of the bone tools (Martisius et al. 2018, 2020).
Zooarchaeology
Researchers have used Bayesian statistics to study zooarchaeological trends. Pioneering work by D. C. Fisher
(1987) used Bayesian inference to determine whether scavenging or hunting led to the creation of butchery
marks on proboscidean assemblages. Recent work has focused on studying seasonality and domestication.
For example, Parkington et al. (2020) used a Bayesian approach to study the seasonal use of Later Stone
Age archaeological sites in South Africa. By reanalysing their previous studies on the timing of death using
a Bayesian framework, they were able to determine, with greater accuracy, when hunter-gatherers would
have used the sites where seal remains were found. Additionally, scholars have used Bayesian methods to
construct phylogenies examining the domestication of animals, including swamp buffalo (Wang et al. 2017)
and pigs (Xiang et al. 2017). Other research has examined the foods consumed by domesticated animals.
Blanz et al. (2020) used Bayesian modelling to examine the diets of modern sheep, specifically the amount
of seaweed consumed, which can be used as a reference sample for identifying similar consumption patterns
in archaeological contexts.
Additionally, archaeologists have used Bayesian methods to study faunal assemblages and make inferences
about their use. For example, Osborn (2019) constructed a Bayesian network model using ethnographic,
ethnohistoric, and archaeological data to determine whether Andean faunal assemblages indicated feasting,
sacrice, or daily refuse. The primary benet of using a Bayesian approach in the study was the resulting
replicable analysis that eliminates the subjectivity present in interpreting faunal assemblages. Rather, this
method reports the probabilities of the faunal assemblage representing each type of behaviour. Furthermore,
Baumann et al. (2020) used Bayesian methods to estimate the abundance of foxes and hares in Palaeolithic
Europe to determine how their abundance changed over time as they were hunted by humans for their meat,
fur, and teeth. The use of Bayesian methods in this study allowed the researchers to overcome a small sample
size while modelling animal abundance.
Bayesian techniques have been used to develop and re-examine the methods used in zooarchaeological
research. Researchers have used Bayesian inference to develop a reliable and replicable probabilistic method
to distinguish between sheep and goat bones in archaeological contexts (Wolfhagen and Price 2017). Since
goats and sheep are very similar species that share many traits, it can be difficult to distinguish between them.
This method provides the probability that a specimen is a goat given the identified traits. Furthermore,
Wolfhagen (2020) has re-examined the “logarithm size index” (LSI), a method for comparing the body sizes of
animals between assemblages that is typically used in studies of animal domestication. He suggests adopting
Bayesian multilevel LSI models to examine hypotheses about faunal assemblages.
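The probabilistic identification of sheep and goat bones described above rests on Bayes' theorem. A minimal sketch follows, assuming that traits are conditionally independent given the species and that trait frequencies in each species are known. The trait names and frequencies are hypothetical, and the function is not the method of Wolfhagen and Price (2017).

```python
def posterior_goat(observed, p_given_goat, p_given_sheep, prior_goat=0.5):
    """Posterior probability that a specimen is a goat.

    observed maps a trait name to True (present) or False (absent).
    Traits are treated as conditionally independent given the species.
    """
    weight_goat = prior_goat
    weight_sheep = 1.0 - prior_goat
    for trait, present in observed.items():
        pg = p_given_goat[trait]
        ps = p_given_sheep[trait]
        weight_goat *= pg if present else 1.0 - pg
        weight_sheep *= ps if present else 1.0 - ps
    return weight_goat / (weight_goat + weight_sheep)


# Hypothetical trait frequencies, for illustration only.
p_given_goat = {"trait_a": 0.8, "trait_b": 0.6}
p_given_sheep = {"trait_a": 0.2, "trait_b": 0.3}
specimen = {"trait_a": True, "trait_b": False}
print(posterior_goat(specimen, p_given_goat, p_given_sheep))
```

Changing prior_goat shows how background knowledge, such as the known composition of regional herds, would shift the identification of an ambiguous specimen.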
Bioarchaeology
The use of Bayesian methods in bioarchaeological analyses was pioneered by Konigsberg and colleagues for
studying age-at-death and stature estimation (e.g., Lyle W. Konigsberg and Frankenberg 1992; Lyle W.
Konigsberg and Frankenberg 1994; Lucy et al. 1996; Lyle W. Konigsberg et al. 1998). Recent research
has continued to apply Bayesian statistics to the construction of biological profiles. For example, Anzellini
and Toyne (2019) proposed the use of Bayesian logistic regression to account for uncertainty in the sample
when estimating the sex of individuals found in commingled contexts in the Andes. Although the frequentist
and Bayesian approaches produced similar results, the authors demonstrated the validity of using Bayesian
methods to account for uncertainty and to produce usable demographic profiles in bioarchaeological studies.
Furthermore, Rosenstock et al. (2019) used Bayesian additive mixed modelling to examine the global spa-
tiotemporal trend in stature. This method enabled the researchers to account for spatiotemporally patchy
data as well as fragmentary skeletal samples.
Further studies have utilized Bayesian mixing models to reconstruct prehistoric diets. Typically, these
methods have used carbon and nitrogen stable isotope data to determine the types of foods people were
eating. One popular method is called the Food Reconstruction Using Isotopic Transferred Signals (FRUITS)
approach, which can account for multiple dietary sources and the uncertainty inherent in dietary infer-
ence. For example, Pezo-Lanfranco et al. (2018) used Bayesian mixing models to quantify the proportion
of three sources of food: plants, marine mammals, and terrestrial mammals. They determined that the
people of the Atlantic Forest of South America consumed a large amount of carbohydrates, suggesting a
unique diet compared to other populations in the area during the Middle Holocene. Using various Bayesian
mixing models, other studies have examined prehistoric dietary trends in Europe (Bownes et al. 2017;
sjogren_modelling_2017?; Boethius and Ahlström 2018; Cubas et al. 2019), South America (Gordón et
al. 2018), and Africa (Maurer et al. 2017). Recent studies have used similar Bayesian modelling to study
prehistoric weaning trends (King et al. 2017). Specifically, using the FRUITS method, De Angelis et al.
(2020) reconstructed the diet of those buried at the Quarto Cappello del Prete. From this reconstruction,
they determined that Roman children were weaned around three years of age.
Other researchers have used mixed/multilevel/hierarchical modelling approaches. For example, Perri et
al. (2019) examined the canine diet as a proxy for human diets in archaeological contexts in Nicaragua. To
infer the probability of the model’s parameters, the authors used a Bayesian approach including MCMC, which
approximates the posterior by sampling and so avoids evaluating the denominator of Bayes’ theorem directly.
Hierarchical models in this context are flexible and scalable (Gelman and Hill 2006). They can include
individual- and group-level data in a model. This flexibility provides improved inference on the parameters
in question, resulting in more accurate estimates of the model’s parameters (Katahira 2016).
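The way group-level information improves individual estimates can be seen in the simplest normal model. The sketch below assumes that the population mean mu, the between-group standard deviation tau, and the within-group standard deviation sigma are known, which a full hierarchical model would instead estimate from the data. The site values are invented and are not taken from Perri et al. (2019).

```python
def partial_pool(group_mean, n, sigma, mu, tau):
    """Posterior mean and sd of one group's true mean under a normal-normal model.

    group_mean: observed mean of the group; n: its sample size
    sigma: within-group sd; mu, tau: population mean and between-group sd
    (fixed here by assumption rather than estimated).
    """
    precision_data = n / sigma ** 2
    precision_prior = 1.0 / tau ** 2
    post_mean = (precision_data * group_mean + precision_prior * mu) / (
        precision_data + precision_prior
    )
    post_sd = (precision_data + precision_prior) ** -0.5
    return post_mean, post_sd


# Two hypothetical sites with the same observed mean but different sample sizes.
small_site = partial_pool(group_mean=12.0, n=2, sigma=1.5, mu=9.0, tau=1.0)
large_site = partial_pool(group_mean=12.0, n=50, sigma=1.5, mu=9.0, tau=1.0)
print(small_site, large_site)
```

The poorly sampled site is pulled strongly towards the population mean while the well sampled site is barely moved, which is the sense in which hierarchical models borrow strength across groups.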
Spatial archaeology
By incorporating prior knowledge about geographical data, archaeologists have been able to study spatial
trends (see Chapter xx). For example, researchers have used Bayesian methods to examine the placement
of archaeological sites on the landscape (Wright, MacEachern, and Lee 2014) and predict the locations
and settlement patterns of archaeological sites (Ortman, Varien, and Gripp 2007; Stewart et al. 2017).
Other research has incorporated Bayesian chronological modelling into spatial archaeological analyses. For
example, Snitker et al. (2018) combined prehistoric land use maps generated by surveys, chronological data,
and Bayesian methods to examine shifting occupation and land use patterns in Spain. The use of Bayesian
methods in this study was critical as it allowed the researchers to make probabilistic inferences regarding
the most likely occupation period at archaeological sites that may have been reused throughout history.
Similarly, Wright et al. (2020) used Bayesian chronological modelling of radiocarbon dates to construct a
summed probability distribution estimating occupation events in the Baekje Kingdom of Korea during the
Three Kingdoms Period (57 BCE to 688 CE). The researchers proceeded to use these data as part of a larger
model examining the spatial distribution and dynamics of human activity areas over time. These methods
allowed the researchers to make probabilistic statements about hypotheses concerning settlement patterns at a time
when occupation patterns were thought to be changing.
SOME PRACTICALITIES
Modelling
Although numerous probability models exist, many archaeological problems are statistically non-standard.
This has often meant that the close collaboration of a number of specialists, including statisticians, is required
to build useful models. Fortunately, statisticians have often found archaeological problems to be interesting
and challenging and so this kind of collaboration is not too unusual. Nonetheless, although applications of
Bayesian analysis to archaeology have been around for more than 30 years, they are by no means ubiquitous
and further collaboration is certainly needed.
Specifying the prior
One of the major stumbling blocks to the more widespread use of Bayesian techniques in archaeology is the
perceived diculty of specifying prior information. Some archaeologists do not acknowledge that reliable
prior information exists and others have philosophical objections to the use of subjective opinions in formal
inference. Both such groups typically prefer to continue using exploratory methods or traditional NHST-
based ones. Others have expert knowledge and would like to use it, but have diculty expressing their ideas
in a suitable form because of their lack of knowledge about the mathematics that underlie the models they
wish to use. Tackling this problem requires further collaboration, clear communication, and an acceptance
that dierent researchers will have varying views on which interpretive framework to use or which specic
model to adopt. Most importantly, there is no need for everyone to agree. Researchers who adopt the
Bayesian framework are forced to be explicit about what they believe. As a result, dierent workers can
compute posteriors based on their own prior information and compare them formally with the inferences of
others.
Evaluating the posteriors
Early applications of the Bayesian framework to archaeology (as with other disciplines) were restricted to
likelihoods and priors for which the necessary calculations could easily be undertaken. However, since the
mathematical integrations required for some models are not analytically soluble, a fair number of real ques-
tions simply could not be tackled. These problems have now largely been overcome by the widespread
adoption of numerical techniques that allow the posterior information to be sampled rather than obtained
exactly. Some of the earliest illustrations of the use of these techniques for evaluating Bayesian posteriors
were in Bayesian radiocarbon calibration (Caitlin E. Buck, Litton, and Smith 1992; Caitlin E. Buck, Chris-
ten, and James 1999; Litton and Buck 1996). Advances in algorithms to create and sample from Markov
chain Monte Carlo simulations (MCMC) such as Metropolis-Hastings, Gibbs sampling, and the Hamiltonian
procedures such as No U-Turn Sampling (NUTS) (e.g., Dunson and Johndrow 2020; Hoffman and Gelman
2014) implemented by popular software like BUGS, JAGS, and STAN (Gilks, Thomas, and Spiegelhalter
1994; Plummer 2003; Sturtz, Ligges, and Gelman 2005; Team 2019) have helped to alleviate this problem.
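To show what sampling a posterior means in practice, the following sketch implements a random-walk Metropolis sampler, the special case of Metropolis-Hastings with a symmetric proposal. It is not the algorithm used by BUGS, JAGS, or Stan. The target is the posterior for a proportion under a Beta prior, chosen because its exact answer is known and the sampler can be checked against it. The data (7 successes, 13 failures), the step size, and the burn-in length are all assumptions made for the example.

```python
import math
import random


def log_posterior(theta, successes, failures, prior_a=1.0, prior_b=1.0):
    """Unnormalized log posterior of a proportion with a Beta(prior_a, prior_b) prior."""
    if not 0.0 < theta < 1.0:
        return -math.inf
    return ((prior_a + successes - 1.0) * math.log(theta)
            + (prior_b + failures - 1.0) * math.log(1.0 - theta))


def metropolis(successes, failures, n_samples=20000, step=0.2, seed=1):
    """Random-walk Metropolis: only ratios of posterior densities are needed."""
    rng = random.Random(seed)
    theta = 0.5
    current = log_posterior(theta, successes, failures)
    samples = []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, successes, failures)
        # Accept with probability min(1, posterior ratio); the normalizing
        # constant cancels in the ratio and is never computed.
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        samples.append(theta)
    return samples


samples = metropolis(successes=7, failures=13)
kept = samples[2000:]  # discard an assumed burn-in period
print(sum(kept) / len(kept))  # compare with the exact Beta(8, 14) mean, 8/22
```

In real applications no exact answer is available for comparison, so convergence has to be assessed with diagnostics such as multiple chains and trace plots.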
Interpretation
Ultimately, the most important part of any statistical investigation is the interpretation of the results ob-
tained. The posterior distributions that arise from Bayesian analyses can be very complex and are sometimes
not directly interpretable in terms of the original problem. This means that exploratory methods of data
analysis may be needed to help investigate, interpret, and report upon the posterior distributions obtained.
When making such interpretations, the level of confidence in the posteriors is affected by their sensitivity to
changes in the data, priors, or model. Such sensitivity should be investigated as part of the interpretation
of all posterior information. It is always useful to relax some of the prior assumptions and re-compute the
posteriors to see what effect this has. All reports of Bayesian analyses should make reference to sensitivity
analyses of this type, since without them we cannot be sure how robust the results are and thus how reliable
they would be as prior information for future research.
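A prior sensitivity analysis of the kind recommended here can be as simple as recomputing the posterior under several candidate priors and reporting how much the conclusion moves. The sketch below uses a conjugate Beta-binomial model so that each posterior mean has a closed form. The data and the three priors are hypothetical.

```python
def posterior_mean(successes, failures, prior_a, prior_b):
    """Posterior mean of a proportion under a Beta(prior_a, prior_b) prior."""
    return (prior_a + successes) / (prior_a + prior_b + successes + failures)


def sensitivity(successes, failures, priors):
    """Posterior mean under each named prior, for side-by-side comparison."""
    return {
        name: posterior_mean(successes, failures, a, b)
        for name, (a, b) in priors.items()
    }


# Hypothetical priors: flat, weakly informative, and strongly informative.
priors = {"flat": (1.0, 1.0), "weak": (2.0, 8.0), "strong": (20.0, 80.0)}
results = sensitivity(successes=7, failures=13, priors=priors)
spread = max(results.values()) - min(results.values())
print(results, spread)
```

A large spread relative to the posterior uncertainty signals that the data alone do not settle the question and that the choice of prior needs to be justified and reported.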
HOPES FOR THE FUTURE
We have discussed the positive contributions of Bayesian inference to archaeological thinking. In addition to
providing a fully probabilistic framework, Bayesian statistics requires that one makes existing prior knowledge
explicit to use in statistical analyses. By doing so, scientists take advantage of a more comprehensive
set of information when evaluating hypotheses. This is a major advantage over NHST and the related
Maximum Likelihood and Information Theory approaches to model selection (Murtaugh 2014). Increases
in the popularity of Bayesian applications in archaeology are likely due to the recognition of these features.
To continue this trend, we outline an ambitious set of initiatives we hope to see in the future of Bayesian
applications in archaeology.
A framework for archaeological science
The Bayesian approach provides a systematic learning procedure, using evidence to update one’s beliefs or
hypotheses until reaching a confident and accurate level of knowledge. This evidence-based learning approach
inherently resembles the scientific process of hypothesis generation and evaluation. Data-laden
inference about the past is likewise inherent to archaeology as a science. New knowledge from archaeological data recovery
through excavation, survey, or analytical activities constantly updates archaeologists’ state of knowledge and
revises the degree of support for prior hypotheses (e.g., the initial colonization of the Americas and out of
Africa origins of Homo sapiens).
Increase diversity of Bayesian applications
Judging by the seemingly exponential increase in the number of Bayesian papers in archaeology from the 2000s
to the 2010s (Otárola-Castillo and Torquato 2018, Fig 1), the Bayesian inferential framework has
increased in popularity not only in the general sciences but also in archaeology. This jump in usage is also evidenced
by the number of Bayesian papers, posters, and symposia at conferences (e.g., C. Buck, Dye, and May 2020;
Krus and Barkwill Love 2020; Wolfhagen and Otárola-Castillo 2021).
The increase in applications is due in part to purpose-written software and libraries, tailored to the
needs of archaeologists (e.g., OxCal, BCal, and Bchron (Haslett and Parnell 2008)). Increasingly, however,
as archaeologists become more confident in writing their own code, simple-to-use and accessible software
like Stan, JAGS, and BUGS (Gilks, Thomas, and Spiegelhalter 1994; Plummer 2003; Sturtz, Ligges, and
Gelman 2005; Team 2019) is also being adopted. For R users, for example, the RStan package (Team 2020)
has simplified access to this software, as has the development of “higher level” R packages like
rstanarm and brms (Bürkner 2017; Goodrich et al. 2020).
Training in underlying theory
With accessibility, however, there is potential for technical sophistication and attention to detail to be missed.
Adopting easy-to-access software might hide some of the Bayesian approach’s complexity, comprehension of
which is necessary for users to take responsibility for the modelling choices inherent in adopting
it.
As such, one of our hopes for the future is an increase in training opportunities for archaeologists in both
the statistical and theoretical details underlying Bayesian inference, and the technical and practical details
associated with implementation. In our opinion, greater knowledge of these two steps will generate a deeper
understanding and more responsible adoption of the Bayesian framework for inference.
This leads to the type of student training we hope to see in the future. Training students to become aware
of and uent in the theory underlying NHST and Bayesian inference will need some remodelling to current
curricula. Integrating statistical and computational theory into archaeological study programs would be one
step towards providing students with the expertise to evaluate and develop reliable Bayesian solutions for
themselves. It would, of course, also allow them to evaluate more responsibly the modelling work of others,
thus leading to a better informed and more articulate body of reviewers for archaeological journals.
The power of algorithmic thinking
Training in probability theory and coding alone will not change a discipline, but together with an encourage-
ment to formalize thinking they might. Archaeologists are widely known for our meticulous record keeping.
We propose that archaeologists complement our reputation for high quality documentation, by adding greater
formalization to our thinking and hypothesizing. Coders do this out of necessity, but it is not routine practice
in most of archaeology.
There are, of course, widely used and highly regarded field manuals that encourage step-by-step
record-keeping (Center 2001; Hester, Shafer, and Feder 2016; White and King 2016) and many modern excavations
follow these closely. However, beyond fieldwork, careful data handling and modelling procedures have
traditionally not been given such emphasis, although there are, and have been, notable examples of good
practice (e.g., Carlson 2017; McCall 2018; Banning 2020). Processes such as phasing a site or interpreting
the archaeological record of an entire landscape require the handling of very large amounts of information,
typically held in many different computer files. The field would benefit if this processing were systematically
recorded and replicable. The consequence of not doing this might be an undocumented workflow that even
those involved struggle to fully recreate if needed.
Those with coding experience know that poorly documented workflows are not a sustainable approach
to information management. What’s needed instead is a step-by-step or flow-diagram approach to planning
and documenting the post-excavation workflow. Setting up such approaches is time-consuming, of course,
but the advantages for reproducibility are immeasurable. Fortunately, archaeologists are increasingly open
to adopting some of these processes (Marwick 2017). Moreover, there are now several well-established
environments that encourage researchers to take this approach. One such is Rmarkdown (Allaire et al.
2020) which allows users to embed R code and output within a text document. Those of us who use such
environments have found that we naturally document the data management and analysis process, as we
work, and can write up and archive our work much more quickly and accurately, too.
REFERENCES
Allaire, J, Yihui Xie, Jonathan McPherson, Javier Luraschi, Kevin Ushey, Aron Atkins, Hadley Wickham,
Joe Cheng, Winston Chang, and Richard Iannone. 2020. “Rmarkdown: Dynamic Documents for R.” R
Package. https://rmarkdown.rstudio.com.
Anyon, Roger, Darrell Creel, Patricia A Gilman, Steven A LeBlanc, Myles R Miller, Stephen E Nash,
Margaret C Nelson, Kathryn J Putsavage, Barbara J Roth, and Karen Gust Schollmeyer. 2017.
“Re-Evaluating the Mimbres Region Prehispanic Chronometric Record.” Kiva 83 (3): 316–43.
Anzellini, Armando, and J Marla Toyne. 2019. “Estimating Sex Using Isolated Appendicular Skeletal
Elements from Chachapoyas, Peru.” International Journal of Osteoarchaeology 29 (6): 961–73.
Arroyo, Bárbara, Takeshi Inomata, Gloria Ajú, Javier Estrada, Hiroo Nasu, and Kazuo Aoyama. 2020.
“Refining Kaminaljuyu Chronology: New Radiocarbon Dates, Bayesian Analysis, and Ceramics Studies.”
Latin American Antiquity 31 (3): 477–97.
Arvaniti, Theodora, and Yannis Maniatis. 2018. “Tracing the Absolute Time-Frame of the Early Bronze
Age in the Aegean.” Radiocarbon 60 (3): 751.
Banning, Edward B. 2020. The Archaeologist’s Laboratory: The Analysis of Archaeological Evidence. 2nd
ed. New York: Springer International Publishing. https://doi.org/10.1007/978-3-030-47992-3.
Baumann, Chris, Gillian L Wong, Britt M Starkovich, Susanne C Münzel, and Nicholas J Conard. 2020.
“The Role of Foxes in the Palaeolithic Economies of the Swabian Jura (Germany).” Archaeological and
Anthropological Sciences 12 (9): 1–17.
Bayes, Thomas. 1763. “An Essay Towards Solving a Problem in the Doctrine of Chances.” Philosophical
Transactions 53: 370–418.
Bayliss, Alex. 2015. “Quality in Bayesian Chronological Models in Archaeology.” World Archaeology 47 (4):
677–700.
Bellhouse, David R. 2004. “The Reverend Thomas Bayes, FRS: A Biography to Celebrate the Tercentenary
of His Birth.” Statistical Science 19 (1): 3–43.
Benjamin, Daniel J, and James O Berger. 2019. “Three Recommendations for Improving the Use of
p-Values.” The American Statistician 73 (sup1): 186–91.
Binder, Didier, Philippe Lanos, Lucia Angeli, Louise Gomart, Jean Guilaine, Claire Manen, Roberto Maggi,
Italo M Muntoni, Chiara Panelli, and Giovanna Radi. 2018. “Modelling the Earliest North-Western
Dispersal of Mediterranean Impressed Wares: New Dates and Bayesian Chronological Model.” Documenta
Praehistorica 44: 54–77.
Binford, Lewis R. 1964. “A Consideration of Archaeological Research Design.” American Antiquity, 425–41.
Birch-Chapman, Shannon, and Emma L Jenkins. 2019. “A Bayesian Approach to Calculating Pre-Pottery
Neolithic Structural Contemporaneity for Reconstructing Population Size.” Journal of Archaeological
Science 112 (December).
Blanz, Magdalena, Ingrid Mainland, Michael Richards, Marie Balasse, Philippa Ascough, Jesse Wolfhagen,
Mark A Taggart, and Jörg Feldmann. 2020. “Identifying Seaweed Consumption by Sheep Using Isotope
Analysis of Their Bones and Teeth: Modern Reference δ13C and δ15N Values and Their Archaeological
Implications.” Journal of Archaeological Science 118: 105140.
Boethius, Adam, and Torbjörn Ahlström. 2018. “Fish and Resilience Among Early Holocene Foragers
of Southern Scandinavia: A Fusion of Stable Isotopes and Zooarchaeology Through Bayesian Mixing
Modelling.” Journal of Archaeological Science 93: 196–210.
Bownes, Jessica M, Philippa L Ascough, Gordon T Cook, Iona Murray, and Clive Bonsall. 2017. “Using
Stable Isotopes and a Bayesian Mixing Model (FRUITS) to Investigate Diet at the Early Neolithic Site
of Carding Mill Bay, Scotland.” Radiocarbon 59 (5): 1275–94.
Brandt, Steven, Elisabeth Hildebrand, Ralf Vogelsang, Jesse Wolfhagen, and Hong Wang. 2017. “A New
MIS 3 Radiocarbon Chronology for Mochena Borago Rockshelter, SW Ethiopia: Implications for the
Interpretation of Late Pleistocene Chronostratigraphy and Human Behavior.” Journal of Archaeological
Science: Reports 11: 352–69.
Brockwell, Sally, Billy Ó Foghlú, Jack N Fenner, Janelle Stevenson, Ulrike Proske, and Justin Shiner.
2017. “New Dates for Earth Mounds at Weipa, North Queensland, Australia.” Archaeology in Oceania
52 (2): 127–34.
Bronk Ramsey, Christopher. 1994. “Analysis of Chronological Information and Radiocarbon Calibration:
The Program OxCal.” Archaeological Computing Newsletter 41 (11): e16.
———. 2017. “Methods for Summarizing Radiocarbon Datasets.” Radiocarbon 59 (6): 1809–33.
Buck, Caitlin E. 2001. Applications of the Bayesian Statistical Paradigm.
Buck, Caitlin E., and Bo Meson. 2015. “On Being a Good Bayesian.” World Archaeology 47 (4): 567–84.
https://doi.org/10.1080/00438243.2015.1053977.
Buck, Caitlin E, William G Cavanagh, and Cliff D Litton. 1996. Bayesian Approach to Interpreting
Archaeological Data. New York: Wiley.
Buck, Caitlin E, J Andrés Christen, and Gary N James. 1999. “BCal: An on-Line Bayesian Radiocarbon
Calibration Tool.” Internet Archaeology 7.
Buck, Caitlin E, James B Kenworthy, Cliff D Litton, and Adrian Frederick Melhuish Smith. 1991.
“Combining Archaeological and Radiocarbon Information: A Bayesian Approach to Calibration.” Antiquity 65
(249): 808–21.
Buck, Caitlin E, and Cliord D Litton. 1990. “A Computational Bayes Approach to Some Common
Archaeological Problems. In Computer Applications and Quantitative Methods in Archaeology, BAR
International Series, edited by K Lockyear and S Rahtz, 565:93–99. Oxford.
Buck, Caitlin E, Cliord D Litton, and Adrian FM Smith. 1992. “Calibration of Radiocarbon Results
Pertaining to Related Archaeological Events. Journal of Archaeological Science 19 (5): 497–512.
Buck, Caitlin, Thomas Dye, and Keith May. 2020. “Stratication and Correlation: Tools and Techniques for
Archaeological Chronology (Symposium). In Society for American Archaeology 85th Annual Meeting.
Bürkner, Paul-Christian. 2017. “Brms: An R Package for Bayesian Multilevel Models Using Stan. Journal
of Statistical Software 80 (1): 1–28.
Carlson, David L. 2017. Quantitative Methods in Archaeology Using R. Cambridge, UK/New York: Cam-
bridge University Press.
Center, Crow Canyon Archaeological. 2001. “The Crow Canyon Archaeological Center Field Manual.
http://www.crowcanyon.org/fieldmanual.
Clarke, David L. 1968. Analytical Archaeology. London: Methuen.
Clarkson, Chris, Zenobia Jacobs, Ben Marwick, Richard Fullagar, Lynley Wallis, Mike Smith, Richard
G Roberts, Elspeth Hayes, Kelsey Lowe, and Xavier Carah. 2017. “Human Occupation of Northern
Australia by 65,000 Years Ago.” Nature 547 (7663): 306–10.
Combès, Benoit, and Anne Philippe. 2017. “Bayesian Analysis of Individual and Systematic Multiplicative
Errors for Estimating Ages with Stratigraphic Constraints in Optically Stimulated Luminescence Dating.
Quaternary Geochronology 39: 24–34.
Cowgill, George L. 1993. “Distinguished Lecture in Archeology: Beyond Criticizing New Archeology. Amer-
ican Anthropologist 95 (3): 551–73. http://www.jstor.org/stable/679650.
Cowgill, George L. 2001. “Past, Present, and Future of Quantitative Methods in United States Archae-
ology. In Computing Archaeology for Understanding the Past. CAA 2000. Computer Applications
and Quantitative Methods in Archaeology, edited by Z Stančič and T Veljanovski, 35–40. Oxford, UK:
Archaeopress.
———. 2015. “We Need Better Chronologies: Progress in Getting Them. Latin American Antiquity 26
(1): 26–29.
Croix, Sarah, Olav Elias Gundersen, Søren M Kristiansen, Jesper Olsen, Søren M Sindbæk, and Morten
Søvsø. 2019. “Dating Earthwork Fortications: Integrating Five Dating Methods in Viking-Age Ribe,
Denmark. Journal of Archaeological Science: Reports 26: 101906.
Cubas, Miriam, Rita Peyroteo-Stjerna, Maria Fontanals-Coll, Laura Llorente-Rodr�guez, Alexandre Lucquin,
Oliver Edward Craig, and André Carlo Colonese. 2019. “Long-Term Dietary Change in Atlantic and
Mediterranean Iberia with the Introduction of Agriculture: A Stable Isotope Perspective. Archaeological
and Anthropological Sciences 11 (8): 3825–36. https://doi.org/10.1007/s12520-018-0752-1.
David, Bruno, Jean-Jacques Delannoy, Fiona Petchey, Robert Gunn, Jillian Huntley, Peter Veth, Kim
Genuite, Robert J Skelly, Jerome Mialanes, and Sam Harper. 2019. “Dating Painting Events Through
by-Products of Ochre Processing: Borologa 1 Rockshelter, Kimberley, Australia. Australian Archaeology
85 (1): 57–94.
De Angelis, Flavio, Virginia Veltre, Sara Varano, Marco Romboni, Sonia Renzi, Stefania Zingale, Paola Ricci, Carla Caldarini, Stefania Di Giannantonio, and Carmine Lubritto. 2020. “Dietary and Weaning Habits of the Roman Community of Quarto Cappello Del Prete (Rome, 1st-3rd Century CE).” Environmental Archaeology, 1–15.
Demuro, Martina, Lee J Arnold, Nigel A Spooner, Kane Ditchfield, and Peter Veth. 2019. “Corrigendum: Coastal Occupation Before the ‘Big Swamp’: Results from Excavations at John Wayne Country Rockshelter on Barrow Island.” Archaeology in Oceania 54 (1): 68–72.
Diez, David M, Christopher D Barr, and Mine Cetinkaya-Rundel. 2019. OpenIntro Statistics. OpenIntro.
DiNapoli, Robert J, Timothy M Rieth, Carl P Lipo, and Terry L Hunt. 2020. “A Model-Based Approach to the Tempo of ‘Collapse’: The Case of Rapa Nui (Easter Island).” Journal of Archaeological Science, 105094.
Douka, Katerina, Viviane Slon, Zenobia Jacobs, Christopher Bronk Ramsey, Michael V Shunkov, Anatoly P Derevianko, Fabrizio Mafessoni, Maxim B Kozlikin, Bo Li, and Rainer Grün. 2019. “Age Estimates for Hominin Fossils and the Onset of the Upper Palaeolithic at Denisova Cave.” Nature 565 (7741): 640–44.
Dunson, David B, and JE Johndrow. 2020. “The Hastings Algorithm at Fifty.” Biometrika 107 (1): 1–23.
Fernandes, Ricardo, Yvette Eley, Marek Brabec, Alexandre Lucquin, Andrew Millard, and Oliver E Craig. 2018. “Reconstruction of Prehistoric Pottery Use from Fatty Acid Carbon Isotope Signatures Using Bayesian Inference.” Organic Geochemistry 117: 31–42.
Fisher, Daniel C. 1987. “Mastodont Procurement by Paleoindians of the Great Lakes Region: Hunting or Scavenging?” In The Evolution of Human Hunting, 309–421. Springer.
Fisher, Ronald Aylmer. 1925. Statistical Methods for Research Workers. Edinburgh/London: Oliver & Boyd.
Fitzsimmons, Kathryn E, Radu Iovita, Tobias Sprafke, Michelle Glantz, Sahra Talamo, Katharine Horton, Tyler Beeton, Saya Alipova, Galymzhan Bekseitov, and Yerbolat Ospanov. 2017. “A Chronological Framework Connecting the Early Upper Palaeolithic Across the Central Asian Piedmont.” Journal of Human Evolution 113: 107–26.
Fletcher, Mike, and Gary R Lock. 2005. Digging Numbers: Elementary Statistics for Archaeologists. Oxford, UK: Oxford Press.
Gelman, Andrew. 2006. “Multilevel (Hierarchical) Modeling: What It Can and Cannot Do.” Technometrics 48 (3): 432–35. https://doi.org/10.1198/004017005000000661.
———. 2018. “The Failure of Null Hypothesis Significance Testing When Studying Incremental Changes, and What to Do about It.” Personality and Social Psychology Bulletin 44 (1): 16–23.
Gelman, Andrew, John B Carlin, Hal S Stern, David B Dunson, Aki Vehtari, and Donald B Rubin. 2020. Bayesian Data Analysis. Chapman & Hall/CRC Press.
Gelman, Andrew, and Jennifer Hill. 2006. Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press.
Gilks, Wally R, Andrew Thomas, and David J Spiegelhalter. 1994. “A Language and Program for Complex Bayesian Modelling.” Journal of the Royal Statistical Society: Series D (The Statistician) 43 (1): 169–77.
Goodrich, Ben, Jonah Gabry, Imad Ali, and Sam Brilleman. 2020. “Rstanarm: Bayesian Applied Regression Modeling via Stan.” R Package Version 2.21.1. https://mc-stan.org/rstanarm.
Gordón, Florencia, S Ivan Perez, Adam Hajduk, Maximiliano Lezcano, and Valeria Bernal. 2018. “Dietary Patterns in Human Populations from Northwest Patagonia During Holocene: An Approach Using Binford’s Frames of Reference and Bayesian Isotope Mixing Models.” Archaeological and Anthropological Sciences 10 (6): 1347–58.
Guérin, Gilles, Pierre Antoine, Esther Schmidt, Emilie Goval, David Hérisson, Guillaume Jamet, Jean-Louis Reyss, Qingfeng Shao, Anne Philippe, and Marie-Anne Vibet. 2017. “Chronology of the Upper Pleistocene Loess Sequence of Havrincourt (France) and Associated Palaeolithic Occupations: A Bayesian Approach from Pedostratigraphy, OSL, Radiocarbon, TL and ESR/U-Series Data.” Quaternary Geochronology 42: 15–30.
Halekoh, UU, and Werner Vach. 1999. “Bayesian Seriation as a Tool in Archaeology.” In Archaeology in the Age of the Internet, edited by L Dingwall, S Exon, V Gaffney, S Laflin, and M. van Leusen, 750:107–7.
Hamilton, W Derek, and Anthony M Krus. 2018. “The Myths and Realities of Bayesian Chronological Modeling Revealed.” American Antiquity 83 (2): 187–203.
Haslett, John, and Andrew Parnell. 2008. “A Simple Monotone Process with Application to Radiocarbon-Dated Depth Chronologies.” Journal of the Royal Statistical Society: Series C (Applied Statistics) 57 (4): 399–418. https://doi.org/10.1111/j.1467-9876.2008.00623.x.
Hassan, Masoud M, E Jones, and Caitlin E Buck. 2019. “A Simple Bayesian Approach to Tree-ring Dating.” Archaeometry 61 (4): 991–1010.
Hester, Thomas R, Harry J Shafer, and Kenneth L Feder. 2016. Field Methods in Archaeology. Routledge.
Heydari, Maryam, Guillaume Guérin, Sebastian Kreutzer, Guillaume Jamet, Mohammad Akhavan Kharazian, Milad Hashemi, Hamed Vahdati Nasab, and Gilles Berillon. 2020. “Do Bayesian Methods Lead to More Precise Chronologies? ‘BayLum’ and a First OSL-Based Chronology for the Palaeolithic Open-Air Site of Mirak (Iran).” Quaternary Geochronology, 101082.
Homan, Matthew D, and Andrew Gelman. 2014. “The No-U-Turn Sampler: Adaptively Setting Path
Lengths in Hamiltonian Monte Carlo. J. Mach. Learn. Res. 15 (1): 1593–623.
Howson, Colin, and Peter Urbach. 2006. Scientic Reasoning: The Bayesian Approach. Open Court
Publishing.
Inomata, Takeshi, Daniela Triadan, Jessica MacLellan, Melissa Burham, Kazuo Aoyama, Juan Manuel
Palomo, Hitoshi Yonenobu, Flory Pinzón, and Hiroo Nasu. 2017. “High-Precision Radiocarbon Dating
of Political Collapse and Dynastic Origins at the Maya Site of Ceibal, Guatemala. Proceedings of the
National Academy of Sciences 114 (6): 1293–98.
Jiménez, Gonzalo Aranda, Águeda Lozano Medina, Marta Díaz-Zorita Bonilla, Margarita Sánchez Romero,
and Javier Escudero Carrillo. 2018. “Cultural Continuity and Social Resistance: The Chronology of
Megalithic Funerary Practices in Southern Iberia. European Journal of Archaeology 21 (2): 192–216.
Katahira, Kentaro. 2016. “How Hierarchical Models Improve Point Estimates of Model Parameters at the
Individual Level. Journal of Mathematical Psychology 73: 37–58. https://doi.org/https://doi.org/10.
1016/j.jmp.2016.03.007.
Kearney, Kevin. 2019. “Vegetation Impacts and Early Neolithic Monumentality: A Palaeoenvironmental
Case Study from South-West Ireland. Journal of Archaeological Science: Reports 27: 101940.
King, Charlotte L, Andrew R Millard, Darren R Gröcke, Vivien G Standen, Bernardo T Arriaza, and Siân E Halcrow. 2017. “A Comparison of Using Bulk and Incremental Isotopic Analyses to Establish Weaning Practices in the Past.” STAR: Science & Technology of Archaeological Research 3 (1): 126–34.
Kirch, Patrick V, and Jillian A Swift. 2017. “New AMS Radiocarbon Dates and a Re-Evaluation of the Cultural Sequence of Tikopia Island, Southeast Solomon Islands.” The Journal of the Polynesian Society 126 (3): 313.
Konigsberg, Lyle W., and Susan R. Frankenberg. 1994. “Paleodemography: ‘Not Quite Dead’.” Evolutionary Anthropology: Issues, News, and Reviews 3 (3): 92–105. https://doi.org/10.1002/evan.1360030306.
Konigsberg, Lyle W., Samantha M. Hens, Lee Meadows Jantz, and William L. Jungers. 1998. “Stature Estimation and Calibration: Bayesian and Maximum Likelihood Perspectives in Physical Anthropology.” American Journal of Physical Anthropology 107 (S27): 65–92. https://doi.org/10.1002/(SICI)1096-8644(1998)107:27+%3C65::AID-AJPA4%3E3.0.CO;2-6.
Konigsberg, Lyle W, and Susan R Frankenberg. 1992. “Estimation of Age Structure in Anthropological Demography.” American Journal of Physical Anthropology 89 (2): 235–56.
Krajcarz, M. T., M. Krajcarz, B. Ginter, T. Goslar, and P. Wojtal. 2018. “Towards a Chronology of the Jerzmanowician—a New Series of Radiocarbon Dates from Nietoperzowa Cave (Poland).” Archaeometry 60 (2): 383–401. https://doi.org/10.1111/arcm.12311.
Kramer, Karen L, Amanda Veile, and Erik Otárola-Castillo. 2016. “Sibling Competition & Growth Tradeoffs. Biological Vs. Statistical Significance.” PloS One 11 (3): e0150126.
Krol, Tessa N, Michael Dee, and Annet Nieuwhof. 2020. “The Chronology of Anglo-Saxon Style Pottery in Radiocarbon Dates: Improving the Typo-chronology.” Oxford Journal of Archaeology 39 (4): 410–41.
Krus, Anthony M, and Lori Barkwill Love. 2020. “The Big Picture: Multiple Perspective Chronologies with Bayes and Beyond (Symposium).” In Society for American Archaeology 85th Annual Meeting.
Lindley, Dennis Victor. 1972. Bayesian Statistics: A Review. SIAM.
Litton, Clifford D, and Caitlin E Buck. 1996. “An Archaeological Example: Radiocarbon Dating.” Markov Chain Monte Carlo in Practice, 466–86.
Loftus, Emma, Peter J Mitchell, and Christopher Bronk Ramsey. 2019. “An Archaeological Radiocarbon Database for Southern Africa.” Antiquity 93 (370): 870–85.
Long, Tengwen, Mayke Wagner, and Pavel E. Tarasov. 2017. “A Bayesian Analysis of Radiocarbon Dates from Prehistoric Sites in the Haidai Region, East China, for Evaluation of the Archaeological Chronology.” Journal of Archaeological Science: Reports 12: 81–90. https://doi.org/10.1016/j.jasrep.2017.01.024.
Lorentzen, Brita, Sturt W. Manning, and Deborah Cvikel. 2020. “Shipbuilding and Maritime Activity on the Eve of Mechanization: Dendrochronological Analysis of the Akko Tower Shipwreck, Israel.” Journal of Archaeological Science: Reports 33: 102463. https://doi.org/10.1016/j.jasrep.2020.102463.
Lucy, Dave, RG Aykroyd, AM Pollard, and T Solheim. 1996. “A Bayesian Approach to Adult Human Age Estimation from Dental Observations by Johanson’s Age Changes.” Journal of Forensic Science 41 (2): 189–94.
Manning, Sturt W, Adam T Smith, Lori Khatchadourian, Ruben Badalyan, Ian Lindsay, Alan Greene, and Maureen Marshall. 2018. “A New Chronological Model for the Bronze and Iron Age South Caucasus: Radiocarbon Results from Project ArAGATS, Armenia.” Antiquity 92 (366): 1530–51.
Marsh, Erik J., Andrew P. Roddick, Maria C. Bruno, Scott C. Smith, John W. Janusek, and Christine A. Hastorf. 2019. “Temporal Inflection Points in Decorated Pottery: A Bayesian Refinement of the Late Formative Chronology in the Southern Lake Titicaca Basin, Bolivia.” Latin American Antiquity 30 (4): 798–817. https://doi.org/10.1017/laq.2019.73.
Marsh, Erik J, Ray Kidd, Dennis Ogburn, and Víctor Durán. 2017. “Dating the Expansion of the Inca Empire: Bayesian Models from Ecuador and Argentina.” Radiocarbon 59 (1): 117.
Martisius, Naomi L., Shannon P. McPherron, Ellen Schulz-Kornas, Marie Soressi, and Teresa E. Steele. 2020. “A Method for the Taphonomic Assessment of Bone Tools Using 3D Surface Texture Analysis of Bone Microtopography.” Archaeological and Anthropological Sciences 12 (10): 1–16.
Martisius, Naomi L., I Sidéra, MN Grote, Teresa E. Steele, Shannon P. McPherron, and Ellen Schulz-Kornas. 2018. “Time Wears on: Assessing How Bone Wears Using 3D Surface Texture Analysis.” PLoS ONE 13 (11): e0206078. https://doi.org/10.1371/journal.pone.0206078.
Marwick, Ben. 2017. “Computational Reproducibility in Archaeological Research: Basic Principles and a Case Study of Their Implementation.” Journal of Archaeological Method and Theory 24 (2): 424–50.
Marwick, Ben, Chris Clarkson, Sue O’Connor, and Sophie Collins. 2016. “Early Modern Human Lithic Technology from Jerimalai, East Timor.” Journal of Human Evolution 101: 45–64.
Maurer, A-F, Alain Person, Antoine Zazzo, Mathieu Sebilo, Vincent Balter, Florence Le Cornec, Valery Zeitoun, Elise Dufour, Annette Schmidt, and Marc de Rafelis. 2017. “Geochemical Identity of Pre-Dogon and Dogon Populations at Bandiagara (Mali, 11th–20th Cent. AD).” Journal of Archaeological Science: Reports 14: 289–301.
McCall, Grant S. 2018. Strategies for Quantitative Research: Archaeology by Numbers. Routledge.
McElreath, Richard. 2020. Statistical Rethinking: A Bayesian Course with Examples in R and Stan. CRC Press.
Mendelsohn, Rebecca R. 2018. “The Chronology of the Formative to Classic Period Transition at Izapa: A Reevaluation.” Latin American Antiquity 29 (2): 239–59.
Méhault, Ronan. 2017. “Applying a Bayesian Approach in the Northeastern North American Context: Reassessment of the Temporal Boundaries of the ‘Pseudo-Scallop Shell Interaction Sphere’.” Canadian Journal of Archaeology 41: 139–72.
Millard, Andrew. 2002. “Bayesian Approach to Sapwood Estimates and Felling Dates in Dendrochronology.” Archaeometry 44 (1): 137–43.
Murray, John K., Jacob A. Harris, Simen Oestmo, Miles Martin, and Curtis W. Marean. 2020. “A New Approach to Identify Heat Treated Silcrete Near Pinnacle Point, South Africa Using 3D Microscopy and Bayesian Modeling.” Journal of Archaeological Science: Reports 34: 102622.
Murtaugh, Paul A. 2014. “In Defense of P Values.” Ecology 95 (3): 611–17. https://doi.org/10.1890/13-0590.1.
Myers, OH. 1950. Some Applications of Statistics to Archaeology. Cairo: Serv. Antiq. Egypte.
Naylor, JC, and AFM Smith. 1988. “An Archaeological Inference Problem.” Journal of the American Statistical Association 83 (403): 588–95.
Neyman, Jerzy, and Egon Sharpe Pearson. 1933. “On the Problem of the Most Efficient Tests of Statistical Hypotheses.” Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character 231: 289–337.
Ortman, Scott G, Mark D Varien, and T Lee Gripp. 2007. “Empirical Bayesian Methods for Archaeological Survey Data: An Application from the Mesa Verde Region.” American Antiquity, 241–72.
Osborn, Jo. 2019. “A Bayesian Approach to Andean Faunal Assemblages.” Latin American Antiquity 30 (2): 354–72.
Otárola-Castillo, Erik, and Melissa G. Torquato. 2018. “Bayesian Statistics in Archaeology.” Annual Review of Anthropology 47 (1): 435–53. https://doi.org/10.1146/annurev-anthro-102317-045834.
Parkington, John, John W Fisher Jr, Simon Hoyte, Maria Lazarides, and Stephan Woodborne. 2020. “Contemporaneity and Entanglement: Archaeological Site Structure from a Bayesian Perspective.” Journal of Archaeological Science: Reports 31: 102349.
Paulsson, B. Schulz. 2019. “Radiocarbon Dates and Bayesian Modeling Support Maritime Diffusion Model for Megaliths in Europe.” Proceedings of the National Academy of Sciences 116 (9): 3460–65.
Perri, Angela R., Jeremy M. Koster, Erik Otárola-Castillo, Jessica L. Burns, and Catherine G. Cooper. 2019. “Dietary Variation Among Indigenous Nicaraguan Horticulturalists and Their Dogs: An Ethnoarchaeological Application of the Canine Surrogacy Approach.” Journal of Anthropological Archaeology 55: 101066. https://doi.org/10.1016/j.jaa.2019.05.002.
Pezo-Lanfranco, Luis, Sabine Eggers, Cecilia Petronilho, Alice Toso, Dione da Rocha Bandeira, Matthew Von Tersch, Adriana M. P. dos Santos, Beatriz Ramos da Costa, Roberta Meyer, and André Carlo Colonese. 2018. “Middle Holocene Plant Cultivation on the Atlantic Forest Coast of Brazil?” Royal Society Open Science 5 (9): 180432. https://doi.org/10.1098/rsos.180432.
Plummer, Martyn. 2003. “JAGS: A Program for Analysis of Bayesian Graphical Models Using Gibbs Sampling.” In Proceedings of the 3rd International Workshop on Distributed Statistical Computing, 124:1–10. Vienna, Austria.
Reimer, Paula J., William E. N. Austin, Edouard Bard, Alex Bayliss, Paul G. Blackwell, Christopher Bronk Ramsey, Martin Butzin, et al. 2020. “The IntCal20 Northern Hemisphere Radiocarbon Age Calibration Curve (0–55 Cal kBP).” Radiocarbon 62 (4): 725–57. https://doi.org/10.1017/RDC.2020.41.
Ricci, Paola, Maite Iris García-Collado, Josu Narbarte Hernández, Idoia Grau Sologestoa, Juan Antonio Quirós Castillo, and Carmine Lubritto. 2018. “Chronological Characterization of Medieval Villages in Northern Iberia: A Multi-Integrated Approach.” Eur. Phys. J. Plus 133 (9): 375. https://doi.org/10.1140/epjp/i2018-12233-5.
Robert, Christian, and George Casella. 2011. “A Short History of Markov Chain Monte Carlo: Subjective Recollections from Incomplete Data.” Statistical Science, 102–15.
Robertson, Ian G. 1999. “Spatial and Multivariate Analysis, Random Sampling Error, and Analytical Noise: Empirical Bayesian Methods at Teotihuacan, Mexico.” American Antiquity, 137–52.
Rosenstock, Eva, Julia Ebert, Robert Martin, Andreas Hicketier, Paul Walter, and Marcus Groß. 2019. “Human Stature in the Near East and Europe Ca. 10,000–1000 BC: Its Spatiotemporal Development in a Bayesian Errors-in-Variables Model.” Archaeological and Anthropological Sciences 11 (10): 5657–90. https://doi.org/10.1007/s12520-019-00850-3.
Sadr, Karim, C. Britt Bousman, Thomas A. Brown, Kamela G. Sekonya, Elias Sideras-Haddad, and Andrew B. Smith. 2017. “New Radiocarbon Dates and the Herder Occupation at Kasteelberg B, South Africa.” Antiquity 91 (359): 1299–1313. https://doi.org/10.15184/aqy.2017.102.
Skelly, Robert, Bruno David, Matthew Leavesley, Fiona Petchey, Alu Guise, Roxanne Tsang, Jerome Mialanes, and Thomas Richards. 2018. “Changing Ceramic Traditions at Agila Ancestral Village, Hood Bay, Papua New Guinea.” Australian Archaeology 84 (2): 181–95. https://doi.org/10.1080/03122417.2018.1515146.
Smith, Mike, Alan N. Williams, and June Ross. 2017. “Puntutjarpa Rockshelter Revisited: A Chronological and Stratigraphic Reappraisal of a Key Archaeological Sequence for the Western Desert, Australia.” Australian Archaeology 83 (1-2): 20–31. https://doi.org/10.1080/03122417.2017.1351673.
Snitker, Grant, Agustín Diez Castillo, C. Michael Barton, Joan Bernabeu Aubán, Oreto García Puchol, and Salvador Pardo-Gordó. 2018. “Patch-Based Survey Methods for Studying Prehistoric Human Land-Use in Agriculturally Modified Landscapes: A Case Study from the Canal de Navarrés, Eastern Spain.” Quaternary International 483: 5–22. https://doi.org/10.1016/j.quaint.2018.01.034.
Spaulding, Albert C. 1953. “Statistical Techniques for the Discovery of Artifact Types.” American Antiquity 18 (4): 305–13.
Stewart, S. T., P. M. N. Hitchings, P. Bikoulis, and E. B. Banning. 2017. “Novel Survey Methods Shed Light on Prehistoric Exploration in Cyprus.” Antiquity 91 (355): e3. https://doi.org/10.15184/aqy.2016.235.
Sturtz, Sibylle, Uwe Ligges, and Andrew E Gelman. 2005. “R2WinBUGS: A Package for Running WinBUGS from R.”
Stan Development Team. 2019. “Stan Modeling Language Users Guide and Reference Manual.” User’s Guide Version 2.25. http://mc-stan.org/.
———. 2020. “RStan: The R Interface to Stan.” R Package Version 2.21.2. http://mc-stan.org/.
Tsukamoto, K, F Tokanai, T Moriya, and H Nasu. 2020. “Building a High-Resolution Chronology at the Maya Archaeological Site of El Palmar, Mexico.” Archaeometry 62 (6).
Urwin, Chris, Quan Hua, and Henry Arifeae. 2018. “The Chronology of Popo, an Ancestral Village Site in Orokolo Bay, Gulf Province, Papua New Guinea.” Australian Archaeology 84 (1): 90–97.
Veth, Peter, and Ingrid Ward. 2017. “Early Human Occupation of a Maritime Desert, Barrow Island, North-West Australia.” Quaternary Science Reviews 168: 19–29.
Vidgen, Bertie, and Taha Yasseri. 2016. “P-Values: Misunderstood and Misused.” Frontiers in Physics 4: 6.
Wang, S., N. Chen, M. R. Capodiferro, T. Zhang, H. Lancioni, H. Zhang, Y. Miao, et al. 2017. “Whole Mitogenomes Reveal the History of Swamp Buffalo: Initially Shaped by Glacial Periods and Eventually Modelled by Domestication.” Scientific Reports 7 (1): 4708. https://doi.org/10.1038/s41598-017-04830-2.
Wasserstein, Ronald L, Allen L Schirm, and Nicole A Lazar. 2019. “Moving to a World Beyond ‘p < 0.05’.” The American Statistician 73 (Sup1).
White, Gregory G, and Thomas F King. 2016. The Archaeological Survey Manual. Routledge.
Wolfhagen, Jesse. 2020. “Re-Examining the Use of the LSI Technique in Zooarchaeology.” Journal of Archaeological Science 123: 105254. https://doi.org/10.1016/j.jas.2020.105254.
Wolfhagen, Jesse, and Erik Otárola-Castillo. 2021. “Bayesian Archaeology (Symposium).” In Society for American Archaeology 86th Annual Meeting.
Wolfhagen, Jesse, and Max D. Price. 2017. “A Probabilistic Model for Distinguishing Between Sheep and Goat Postcranial Remains.” Journal of Archaeological Science: Reports 12: 625–31.
Wright, David K., Junkyu Kim, Jiyoung Park, Jiwon Yang, and Jangsuk Kim. 2020. “Spatial Modeling of Archaeological Site Locations Based on Summed Probability Distributions and Hot-Spot Analyses: A Case Study from the Three Kingdoms Period, Korea.” Journal of Archaeological Science 113: 105036. https://doi.org/10.1016/j.jas.2019.105036.
Wright, David K., Scott MacEachern, and Jaeyong Lee. 2014. “Analysis of Feature Intervisibility and Cumulative Visibility Using GIS, Bayesian and Spatial Statistics: A Study from the Mandara Mountains, Northern Cameroon.” PLOS ONE 9 (11): e112191. https://doi.org/10.1371/journal.pone.0112191.
Wynveldt, Federico, Bárbara Balesta, María Emilia Iucci, Celeste Valencia, and Gabriela Soledad Lorenzo. 2017. “Late Chronology in Hualfín Valley (Catamarca, Argentina): A Revision from 14C Dating.” Radiocarbon 59.
Xiang, Hai, Jianqiang Gao, Dawei Cai, Yunbing Luo, Baoquan Yu, Langqing Liu, Ranran Liu, et al. 2017. “Origin and Dispersal of Early Domestic Pigs in Northern China.” Scientific Reports 7 (1): 5602. https://doi.org/10.1038/s41598-017-06056-8.
Yang, Yishi, Shanjia Zhang, Chris Oldknow, Menghan Qiu, Tingting Chen, Haiming Li, Yifu Cui, et al. 2019. “Refined Chronology of Prehistoric Cultures and Its Implication for Re-Evaluating Human-Environment Relations in the Hexi Corridor, Northwest China.” Science China Earth Sciences 62 (10): 1578–90. https://doi.org/10.1007/s11430-018-9375-4.
37
... Using a fictional zooarchaeological example, this article provides a straightforward explanation of Bayesian inference and compares it to the more conventional null hypothesis significance testing (NHST). Although some have previously described and reviewed the application of these concepts elsewhere (e.g., Buck and Meson 2015;Buck et al. 1996;Otárola-Castillo and Torquato 2018;Otárola-Castillo et al. 2022;Wolfhagen 2019Wolfhagen , 2020, this work is focused on presenting replicable step-by-step examples of the Bayesian framework for evaluating and discerning among competing hypotheses. R Markdown code to reproduce all materials presented here is available in an OpenScience Framework repository: https://osf.io/ ...
... Below, we provide an overview of the central concepts of the two major probability paradigms to evaluate hypotheses: NHST and Bayesian inference. Whereas most scientists widely use NHST, the Bayesian approach is considered a modern data-driven learning system that has enjoyed increasing application to archaeology (Buck and Meson 2015;Buck et al. 1996;Howson and Urbach 2006;Jaynes 2003;Otárola-Castillo and Torquato 2018;Otárola-Castillo et al. 2022). ...
Article
Full-text available
Archaeologists frequently use probability distributions and null hypothesis significance testing (NHST) to assess how well survey, excavation, or experimental data align with their hypotheses about the past. Bayesian inference is increasingly used as an alternative to NHST and, in archaeology, is most commonly applied to radiocarbon date estimation and chronology building. This article demonstrates that Bayesian statistics has broader applications. It begins by contrasting NHST and Bayesian statistical frameworks, before introducing and applying Bayes's theorem. In order to guide the reader through an elementary step-by-step Bayesian analysis, this article uses a fictional archaeological faunal assemblage from a single site. The fictional example is then expanded to demonstrate how Bayesian analyses can be applied to data with a range of properties, formally incorporating expert prior knowledge into the hypothesis evaluation process.
... Using a fictional zooarchaeological example, this paper provides a straightforward explanation of Bayesian inference and compares it to the more conventional null hypothesis significance testing (NHST). Although some have previously described and reviewed the application of these concepts elsewhere (Buck, Cavanagh, and Litton 1996;Buck 2001;Buck and Meson 2015;Otárola-Castillo, Torquato, and Buck 2022;Wolfhagen 2019Wolfhagen , 2020, this work is focused on presenting replicable step-bystep examples of the Bayesian framework for evaluating and discerning among competing hypotheses. In addition, a Spanish translation of this manuscript and associated materials is also available in an Open Science Framework repository: https://osf.io/23bdt/. ...
Preprint
Full-text available
Manuscript accepted for publication by Advances in Archaeological Science. Abstract: Archaeologists frequently use probability distributions and null hypothesis significance testing (NHST) to assess how well survey, excavation, or experimental data align with their hypotheses about the past. Bayesian inference is increasingly used as an alternative to NHST and, in archaeology, is most commonly applied to radiocarbon date estimation and chronology building. This paper demonstrates that Bayesian statistics has broader applications. It begins by contrasting NHST and Bayesian statistical frameworks, before introducing and applying Bayes' Theorem. In order to guide the reader through an elementary step-by-step Bayesian analysis, this paper uses a fictional archaeological faunal assemblage from a single site. The fictional example is then expanded to demonstrate how Bayesian analyses can be applied to data with a range of properties, formally incorporating expert prior knowledge into the hypothesis evaluation process.
Article
Full-text available
Following the near extinction of bison (Bison bison) from its historic range across North America in late 19th century, novel bison conservation efforts in the early 20th century catalyzed a popular widespread conservation movement to protect and restore bison among other species and places. Since Allen’s initial delineation (1876) of the historic distribution of North American bison, subsequent attempts have been hampered by knowledge gaps about bison distribution and abundance previous to and following colonial arrival and settlement. For the first time, we apply a multi‐disciplinary approach to assemble a comprehensive, integrated geographic database and meta‐analysis of bison occurrences over the last 200,000 years BCE, with particular emphasis over the last 450 years before present. We combined paleontology, archaeology, and historical ecology data for our database totaling 6,438 observations. We derived the observations from existing online databases, published literature, and first‐hand exploration journal entries. To illustrate the conservative maximum historical extent of occurrence of bison, we created a concave hull using observations occurring over the last 450 years (n = 3,379 observations) which is the broadly accepted historical benchmark at 1500 CE covering 59% of the North American continent. While this distribution represents a historic extent of occurrence — merely delineating the maximum margins of the near‐continental distribution — it does not replace a density‐based approach reconstructing potential historical range distributions which identifies core and marginal ranges. However, we envision the contained observations of this database will contribute to further research in the increasingly evidence‐based disciplines of bison ecology, evolution, rewilding, management, and conservation. There are no copyright or proprietary restrictions on this data, and this data paper should be cited when these data are reused.
Article
Full-text available
Increasingly researchers have employed confocal microscopy and 3D surface texture analysis to assess bone surface modifications in an effort to understand ancient behavior. However, quantitative comparisons between the surfaces of purported archaeological bone tools and experimentally manufactured and used bones are complicated by taphonomic processes affecting ancient bone. Nonetheless, it may be reasonable to assume that bones within the same deposits are altered similarly and thus these alterations are quantifiable. Here we show how unworked bones can be used to quantify the taphonomic effect on bone surfaces and how this effect can then be controlled for and incorporated into an analysis for evaluating the modified surfaces of purported bone tools. To assess the baseline taphonomy of Middle Paleolithic archaeological deposits associated with typologically identified bone artifacts, specifically lissoirs, we directly compare the surface textures of ancient and modern unworked ribs. We then compare the ancient unworked ribs and lissoirs to assess their differences and predict the ancient artifacts’ original surface state using a multilevel multivariate Bayesian model. Our findings demonstrate that three of five tested surface texture parameters (Sa, Spc, and IsT) are useful for distinguishing surface type. Our model predictions show that lissoirs tend to be less rough, have more rounded surface peaks, and exhibit more directionally oriented surfaces. These characteristics are likely due to anthropogenic modifications and would have been more pronounced at deposition. Quantifying taphonomic alterations moves us one step closer to accurately assessing how bone artifacts were made and used in the ancient past.
Article
Full-text available
In the fourth and fifth centuries AD, the Anglo‐Saxon style was introduced in north‐western Europe. To what extent immigrants contributed to this process for each region is still debated. How and when the Anglo‐Saxon style spread is essential in this debate. Handmade pottery is the most common find category, but so far it can only be dated globally. An earlier and a later style have been postulated and the introduction of this pottery is seemingly not simultaneous in every region. Hitherto this could not be supported by the radiocarbon dates. The present study shows that, with the help of Bayesian modelling, it is possible to substantiate these patterns, which is of utmost importance for understanding migration patterns, contacts and exchange along the southern North Sea coastal regions during this period.
Article
Full-text available
In this study, we examine the role of foxes in Palaeolithic economies, focusing on sites of the Middle Palaeolithic, Aurignacian, Gravettian and Magdalenian of the Swabian Jura. For this purpose, we used published faunal data from 26 assemblages from the region, including new information from the Magdalenian layers of Langmahdhalde. We explore how the abundance of foxes changes over time, how they were used by humans, and how they were deposited at the sites, with a special focus on fox hunting methods. To evaluate these hunting methods, we use the prey choice model of optimal foraging theory (OFT) and simulate possible hunting scenarios, which we test based on the published faunal assemblages. Our research indicates that foxes were hunted since the early Upper Palaeolithic for their meat, fur and teeth, possibly with traps. We find that the abundance of fox remains in the archaeological record of the region increased continuously starting in the Aurignacian, which cannot be explained by taphonomic factors. The trend of foxes to adapt to human-influenced environments with commensal behavior may also have contributed to them being hunted more often.
Article
In a 1970 Biometrika paper, W. K. Hastings developed a broad class of Markov chain algorithms for sampling from probability distributions that are difficult to sample from directly. The algorithm draws a candidate value from a proposal distribution and accepts the candidate with a probability that can be computed using only the unnormalized density of the target distribution, allowing one to sample from distributions known only up to a constant of proportionality. The stationary distribution of the corresponding Markov chain is the target distribution one is attempting to sample from. The Hastings algorithm generalizes the Metropolis algorithm to allow a much broader class of proposal distributions instead of just symmetric cases. An important class of applications for the Hastings algorithm corresponds to sampling from Bayesian posterior distributions, which have densities given by a prior density multiplied by a likelihood function and divided by a normalizing constant equal to the marginal likelihood. The marginal likelihood is typically intractable, presenting a fundamental barrier to implementation in Bayesian statistics. This barrier can be overcome by Markov chain Monte Carlo sampling algorithms. Amazingly, even after 50 years, the majority of algorithms used in practice today involve the Hastings algorithm. This article provides a brief celebration of the continuing impact of this ingenious algorithm on the 50th anniversary of its publication.
Article
The heat treatment of stone to enhance flaking attributes was an important advance in the adaptive toolkit of humans and a major step in pyrotechnology. The earliest evidence for this is the heat treatment of silcrete ~164 ka at the Middle Stone Age site of Pinnacle Point 13B in South Africa. Heating stone prior to knapping alters the physical and chemical composition of the stone. This study investigates whether surface roughness, as measured by a 3D microscope, can be used as a proxy to identify the presence of heat treatment in the archaeological record. We record values for multiple surface texture parameters on a sample of experimentally created stone tools from paired heat-treated and unheated silcrete nodules. A Bayesian probability model, trained on the experimental sample, was then used to evaluate the probability that individual samples have undergone heat treatment. Furthermore, we tested whether an industrial silicon compound can be used to record and preserve surface roughness for analysis. This research provides a novel, probabilistic, and non-invasive technique for identifying heat treatment from three sources near Pinnacle Point.
Article
This paper aims to provide an isotopic characterization of the diet consumed by people buried in a graveyard of the Imperial Rome Suburbium (1st–3rd centuries CE), where numerous children were buried. A sample of 50 human remains from Quarto Cappello del Prete was selected for carbon and nitrogen stable isotope analysis. Published data on coeval faunal remains set the baseline of the diet. The results for humans were integrated with previously analyzed data from Quarto Cappello del Prete. The resulting sample of 71 people was stratified by demographic group, with a focus on ascertaining the weaning process in children. The isotopic data are consistent with an overall diet based mainly on terrestrial resources, in which C3 plants played a pivotal role, though the δ13C range suggests that the foodstuffs were heterogeneous. The large number of children allows us to evaluate the weaning process. Infants seem to have been adequately weaned after 3 years, from which point they were treated as adults as far as dietary habits are concerned. These data are a valuable addition to our understanding of weaning practices in ancient Rome and help support hypotheses about lifestyle and health in the Roman Imperial period. KEYWORDS: Diet, weaning practices, Romans, stable isotope analysis, Molecular Bioarcheology, Roman Suburbium
Article
Biometric analysis of faunal remains is crucial for estimating the age/sex composition of assemblages and exploring large-scale processes that affected animal biology in the past. The LSI technique is a premier method for examining biometry in different zooarchaeological scenarios, particularly domestication research and regional-scale surveys. Despite the technique's popularity, several early arguments describing limitations or concerns about the LSI technique still impact interpretations and applications today. More generally, though, the LSI technique is treated as a method of increasing sample sizes as a last resort when unmodified measurements are too scarce to use. This paper reexamines the theoretical foundations of the LSI technique to update best practices in LSI analyses in zooarchaeology. Redefining the LSI technique as a pseudo-centering process shows why LSI values are preferable to unmodified measurements for biometric analyses. This new definition also highlights the arbitrary nature of standard animal and logarithm base choice, though certain decisions (smaller standard animals and base e logarithms) can aid interpretation by closely linking changes in LSI values to proportional changes of the original measurements relative to the standard. Of more consequence for LSI analyses, however, is the way LSI values from different measurement types are aggregated; this paper shows how multilevel modeling uses partial pooling to balance the trade-offs of bias and variance caused by aggregation. To showcase the benefits of the Bayesian multilevel LSI model, the biometric variation of ten simulated sites is modeled using a reference set of Shetland sheep measurements (Popkin et al., 2012). Modeling all ten sites within a single multilevel structure provides a clear way to evaluate biometric differences while accounting for potential allometries and variation in body part representation between different sites. These results clarify earlier arguments about the limitations of the LSI technique, summarized in a set of best practices for LSI applications.
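As a rough illustration of the index itself (not of the multilevel model), the log size index is the logarithm of a measurement divided by the corresponding measurement on a standard animal. The sketch below assumes that definition; the function name, the measurement codes, and all numeric values are hypothetical and are not taken from the paper or from the Popkin et al. reference set.

```python
import math


def lsi(measurement, standard, base=math.e):
    """Log size index: log of a measurement relative to the standard animal.

    Both values must be positive and in the same units (e.g. millimetres).
    """
    if measurement <= 0 or standard <= 0:
        raise ValueError("measurements must be positive")
    return math.log(measurement / standard) / math.log(base)


# Hypothetical standard-animal values, keyed by measurement type (made up).
standard_animal = {"Bd": 30.0, "GLP": 32.0}

# Hypothetical archaeological specimens: (measurement type, value in mm).
specimens = [("Bd", 31.5), ("Bd", 28.5), ("GLP", 33.6)]

for code, value in specimens:
    index = lsi(value, standard_animal[code])
    proportional = value / standard_animal[code] - 1.0
    print(f"{code}: LSI = {index:+.4f}, proportional difference = {proportional:+.4f}")
```

With natural logarithms, small LSI values are approximately equal to the proportional difference from the standard (an LSI of about 0.05 means roughly 5% larger), which is the interpretive advantage of base e that the abstract mentions. Pooling LSI values across measurement types, the step the paper handles with partial pooling, is not attempted in this sketch.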
Article
Since Kaminaljuyu was first systematically excavated in the 1930s, the chronology of the site has been fraught with confusion and scholarly disagreement. In recent years, scholars generally adopted the chronology presented by Shook and Popenoe de Hatch (1999) as the most authoritative account. In 2014, however, Inomata and colleagues proposed a revision of this chronology by shifting its Preclassic portion (including the Las Charcas and Providencia phases) roughly 300 years later in time. In this article, we analyze a total of 108 radiocarbon dates with Bayesian statistics, tying them to detailed ceramic analysis. These dates include previously reported dates, measured after the year 2000, as well as 68 new radiocarbon dates obtained from Kaminaljuyu and nearby sites. The results largely support Inomata and coauthors’ (2014) revised Preclassic chronology, placing the Las Charcas–Providencia transition around 350 BC and the Providencia–Verbena transition around 75 BC. In addition, we present new dates on the Early Classic period, although some ambiguity remains for the Esperanza phase, when Teotihuacan-related elements were introduced to Kaminaljuyu. The revised chronology, combined with environmental data, suggests an explosive increase in population and construction activity during the Verbena and Arenal phases.
Article
The 19th century was an era of increasing mechanization and globalization, which transformed maritime networks and shipbuilding in and beyond the Mediterranean. Shipwrecks offer valuable physical evidence of such maritime connectivity and evolving shipbuilding techniques but must be dated within a high-resolution timeframe to be synchronized with, and thereby enhance, historical records. We focus here on high-resolution dating of the Akko Tower Shipwreck, the remains of an Ottoman merchant brig found inside the harbor of Akko, Israel. We use dendrochronology, ¹⁴C wiggle-matching, and Bayesian chronological modeling to determine that the ship was likely constructed in the mid-1850s, and therefore called at Akko’s harbor after the town’s 1840 bombardment, a period of decline traditionally under-studied in Ottoman historical narratives. Using dendroprovenancing methods, we find that the ship’s hull used timbers from the Anatolian Black Sea region, although it was built in the French construction tradition, and used British metal rigging and fasteners, which reflect shifting Anglo-French influence and socioeconomic interconnections with the Ottoman Empire during the 19th century. The Akko Tower Shipwreck is the first shipwreck from Israel to be dendrochronologically dated and provenanced. Our results show how dendrochronology and Bayesian chronological modeling can be used successfully not only for high-precision dating, but also for untangling the shipbuilding processes and the socioeconomic networks that made ship construction possible. We also re-evaluate East Mediterranean oak sapwood datasets and develop an approximate new sapwood model that provides more robust estimates of felling dates for tree-ring analysis of this region’s oak wooden cultural heritage.