Radiocarbon, Vol 00, Nr 00, 2020, p 1–43 DOI:10.1017/RDC.2020.46
© 2020 by the Arizona Board of Regents on behalf of the University of Arizona. This is an Open Access
article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.
org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium,
provided the original work is properly cited.
THE INTCAL20 APPROACH TO RADIOCARBON CALIBRATION CURVE
CONSTRUCTION: A NEW METHODOLOGY USING BAYESIAN SPLINES
AND ERRORS-IN-VARIABLES
Timothy J Heaton1* · Maarten Blaauw2 · Paul G Blackwell1
Christopher Bronk Ramsey3 · Paula J Reimer2 · E Marian Scott4
1School of Mathematics and Statistics, University of Sheffield, Sheffield S3 7RH, UK
2The 14CHRONO Centre for Climate, the Environment and Chronology, School of Natural and Built Environment,
Queen's University Belfast, Belfast BT7 1NN, UK
3Research Laboratory for Archaeology and the History of Art, University of Oxford, 1 South Parks Road,
Oxford OX1 3TG, UK
4School of Mathematics and Statistics, University of Glasgow, Glasgow G12 8QS, UK
ABSTRACT. To create a reliable radiocarbon calibration curve, one needs not only high-quality data but also a robust
statistical methodology. The unique aspects of much of the calibration data provide considerable modeling challenges
and require a made-to-measure approach to curve construction that accurately represents and adapts to these
individualities, bringing the data together into a single curve. For IntCal20, the statistical methodology has
undergone a complete redesign, from the random walk used in IntCal04, IntCal09 and IntCal13, to an approach
based upon Bayesian splines with errors-in-variables. The new spline approach is still fitted using Markov Chain
Monte Carlo (MCMC) but offers considerable advantages over the previous random walk, including faster and
more reliable curve construction together with greatly increased flexibility and detail in modeling choices. This
paper describes the new methodology together with the tailored modifications required to integrate the various
datasets. For an end-user, the key changes include the recognition and estimation of potential over-dispersion in
14C determinations, and its consequences on calibration which we address through the provision of predictive
intervals on the curve; improvements to the modeling of rapid 14C excursions and reservoir ages/dead carbon
fractions; and modifications made to, hopefully, ensure better mixing of the MCMC which consequently increase
confidence in the estimated curve.
1 INTRODUCTION
One of the core challenges in creating a reliable radiocarbon (14C) calibration curve is
ensuring that the statistical approach used for the curve's production is able to
accurately synthesize the various datasets which make up the curve, specifically
recognizing and incorporating their diverse and unique aspects in doing so. This is
particularly critical in light of the recent increase in the availability of high-precision
radiocarbon determinations, and the consequent demand by users for ever more precise
and accurate calibration. For IntCal20, not only is the underlying data available for use
in curve construction considerably more numerous and detailed than in previous curves
but, due to the advances in our wider understanding of the Earth's systems, our insight
into the specific and individual characteristics of much of that data has also improved. If
we are able to harness and more accurately represent these specific features within our
curve construction then this will hopefully improve the resultant calibration. To achieve
this, for the new calibration curve we have completely revised the statistical
methodology from a random walk to a Bayesian spline based approach. This new
approach allows us to take better advantage of the underlying data and also to provide
output which is more useful and relevant for calibration users.
Over the years, the IntCal statistical methodology has developed and improved alongside our
understanding of the constituent data and as we gain more knowledge of the underlying Earth
*Corresponding author. Email: t.heaton@shef.ac.uk.
systems processes. For IntCal98 (Stuiver et al. 1998), the curve was produced via a
combination of linear interpolation of windowed tree-ring averages and frequentist splines.
In 2004, the approach was updated to a random walk model (Buck and Blackwell 2004)
which introduced an approximate Bayesian method, and began to incorporate some of the
important additional aspects of the constituent data such as potential uncertainty in the
calendar ages for the older determinations, and that many tree-ring observations related to
multiple years of metabolization. This random walk was developed further for IntCal09
(Blackwell and Buck 2008; Heaton et al. 2009) and IntCal13 (Niu et al. 2013) into a fully
Bayesian MCMC approach that enabled the inclusion of more of the unique structures in
the calibration datasets such as floating sequences of tree rings and more complex
covariances in the calendar age estimates provided by wiggle-matching or palaeoclimate
tie-pointing.
The random walk approach however had several disadvantages. While in principle, it allowed
for modeling flexibility, the details of the implementation required a very large number of
parameters that, despite being highly dependent, were predominantly updated individually.
This made it extremely slow to run; difficult to ensure it explored the full space of possible
curves; and also hard to assess whether the MCMC had reached its equilibrium
distribution. Furthermore, this restricted the ability to explore the impact of specific
modeling assumptions or individual data on the final curve. As the volume of the data used
to generate IntCal has increased, alongside the need for further bespoke modeling of key
data structures, this random walk approach to curve creation has consequently become
computationally impractical for further use.
For the updated IntCal20 curve, the IntCal working group therefore requested a new approach
to curve creation be developed. This new approach needed to be equally rigorous, from a
statistical perspective, as the previous methodology; to incorporate the advances in both
geoscientific understanding and radiocarbon precision that have occurred since 2013; and
yet overcome the implementational difficulties of the previous random walk approach.
Specifically, to increase confidence in the curve's robustness and reliability the new
approach needed to run at a speed which allowed investigation of the effect of key
modeling choices not possible with the previous methodology. The new approach is based
upon Bayesian splines with errors-in-variables as introduced by Berry et al. (2002) but
requires multiple bespoke innovations to accurately adapt to the specific data complexities.
We believe it offers a significant improvement over previous approaches not just in the
modeling but also the additional output it can provide for both calibration users and
geoscientific users more widely.
The paper is set out as follows. In Section 2 we list the main advances made in the statistical
methodology, data understanding and modeling since the last set of IntCal curves (Reimer
et al. 2013; Hogg et al. 2013). We then provide, in Section 3, a short non-technical
introduction to three key practical ideas for calibration users: an explanation of Bayesian
splines; the importance of recognizing potential calendar age uncertainty in curve
construction; and the modeling of potential over-dispersion in the data. Section 4 then
describes the main technical details of curve construction. The IntCal20 curve is created in
two sections with somewhat different statistical concerns. The more recent part of the
curve, back to approximately 14 cal kBP, is based entirely upon tree-ring determinations
that are predominantly dendrodated with exact, known calendar ages. Here the main
challenges are to incorporate the large amount of data, provide high resolution in the curve
with appropriate uncertainties, and accurately incorporate blocked determinations that
represent the measurement of multiple years. This is described in Section 4.3. Further
back in time, the curve is based upon a wider range of 14C material where, as well as
tree rings, we use direct and indirect records of atmospheric 14C in the form of corals,
macrofossils, forams and speleothems. We describe the modifications required to
incorporate these kinds of data in Section 4.4. This being a Bayesian approach, we are
able to incorporate prior information into our model and also provide posterior
information on particular outputs of independent interest. In Section 5 we provide an
indication of the kind of additional output the new methodology is able to provide. In
particular the new approach generates not only pointwise means and variances for the
calibration curve but, for the first time, sets of complete posterior realizations from 0–55
cal kBP which also allow access to covariance information. This covariance has the
potential to improve future calibration of multiple radiocarbon determinations, for
example in wiggle matches or more complex modeling. We also gain information on the
level of additional variation (beyond laboratory reported uncertainty) present in tree-ring
14C determinations arising from the same calendar year. Finally, we discuss potential
future work in Section 6 along with areas we have identified for further improvement for
the next IntCal iteration. Note that the statistical approaches to the creation of SHCal20
(Hogg et al. 2020 in this issue) and Marine20 (Heaton et al. 2020 in this issue) are
presented within those papers and not discussed in detail here.
Notation
All ages in this paper and the database are reported relative to mid-1950 AD (= 0 BP, before present). Conventional, pre-calibration, 14C ages are given in units "14C yrs BP". Calendar, or calibrated, ages are denoted as "cal BP" or "cal kBP" (thousands of calibrated years before present).
Data and Code
As in previous updates to the curve, the constituent data is available from the IntCal database
http://www.intcal.org/. Other inputs are available on request, for example covariance matrices
for the calendar ages of the tie-pointed records of the Cariaco Basin (Hughen and Heaton 2020
in this issue), Pakistan and Iberian margin (Bard et al. 2013); and Lake Suigetsu (Bronk
Ramsey et al. 2020 in this issue). Coding was performed in R (R Core Team 2019), using
the fda (Ramsay et al. 2018), mvtnorm (Genz et al. 2019), mvnfast (Fasiolo 2016) and
Matrix (Bates and Maechler 2019) packages to efficiently create and fit the spline bases;
and doParallel (Corporation and Weston 2019) to implement parallelization for the
MCMC tempering. This code is available on request from the first author.
2 DEVELOPMENTS IN THE INTCAL20 STATISTICAL METHODOLOGY
The main differences/improvements in the updated IntCal20 methodology over the previous
random walk can be split into three broad categories: those relating to improvements in the
statistical implementation itself; progress in the detailed modeling of unique data aspects
that are enabled by the updated methodology; and finally, advances in the curve output
that are both relevant to users of the calibration curve and of potential interest in their
own right.
Improvements in Statistical Implementation
1. Bayesian splines: the change from the random walk of Niu et al. (2013) to splines allows a
much more computationally feasible fitting algorithm. We retain the Bayesian aspect since
it allows us to incorporate the various unique aspects of the calibration data; provides useful
posterior information, e.g. the posterior calendar ages of the constituent data; and maintains
consistency with calibration itself, which is now universally implemented under that paradigm. This Bayesian spline approach is just as conceptually rigorous as the random walk but considerably more flexible, and it can be run much more quickly.
2. Change in modeling and fitting domain: function estimation via splines is based on a trade-off between creating a curve that passes close to the observed data yet is not too rough. Previously all aspects of the curve's construction occurred in the radiocarbon age domain. However, this is not the natural space in which to either model the curve roughness or decide on fidelity of the data to the curve. In the older section of the 14C record, the measurement uncertainties on the radiocarbon age scale become non-symmetric. This causes difficulties in fairly assessing the fit of a model to observed data if we judge it in this radiocarbon age domain. Instead, it is more natural to assess model fit in F14C where measurement uncertainty remains symmetric. Similarly, a more natural domain in which to model the curve roughness is in Δ14C space. We therefore change our modeling domain to Δ14C and our fitting domain to F14C when creating the curve. See Section 3.1.2 for definitions of the radiocarbon age, F14C and Δ14C domains.
3. Over-dispersion in dendrodated trees: as the volume of data entering IntCal increases, it is
key to make sure we do not produce a curve which is over-precise as this would give
inaccurate calibration for a user. While laboratories attempt to quantify all 14C
uncertainty in their measurements, including through intercomparison exercises such as
Scott et al. (2017), there remain some sources of additional 14C variation which are
difficult for any laboratory to capture. For tree rings, potential examples include
variation between local region, species, or growing season. Consequently, when we bring
together 14C measurements from the same calendar year, they may potentially be over-
dispersed, i.e. more widely spread than would be expected given their laboratory quoted
uncertainty. The new approach incorporates a term allowing for potential over-
dispersion within 14C determinations of the tree rings so that, if additional variability is
seen in the underlying IntCal data, the method will account for it and prevent excessive
confidence in the resultant curve.
4. Heavy-tailed errors: in the older portion of the curve, where data come from a range of
different 14C reservoirs, we aim to reduce the influence of potential outliers by
permitting each dataset to have heavier tailed errors. These tails are adaptively
estimated during curve construction.
5. Improved model mixing and parallel tempering: when using any MCMC method, it is
crucial to ensure that the chain has reached its equilibrium distribution and that it is not
stuck in one part of the model space. This was a significant concern with the previous
random walk approach since the curve was updated one calendar year at a time.
Conversely, Bayesian splines enable updates of the complete curve simultaneously via
Gibbs sampling. To address any additional concerns about mixing of the new MCMC
and to ensure we explore the space of possible curves more freely, we also implement
parallel tempering whereby we run multiple modified/tempered chains simultaneously in
such a way that some can move around the space more easily. By appropriate switching
between these tempered chains we can further improve model mixing.
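The tempering scheme described in item 5 can be illustrated with a toy example. The sketch below is not the IntCal20 implementation (which is written in R and available from the authors); it runs several random-walk Metropolis chains on tempered versions p(x)^β of an invented bimodal target, with occasional state swaps between adjacent temperatures, and shows how the cold (β = 1) chain can then cross a barrier between modes that would otherwise trap it.

```python
import math
import random

def log_p(x):
    # Toy bimodal target with modes near x = -2 and x = 2 and a deep barrier at 0
    return -(x * x - 4.0) ** 2

def parallel_tempering(n_iter=20000, seed=0):
    rng = random.Random(seed)
    betas = [1.0, 0.3, 0.1, 0.03]   # inverse temperatures, cold to hot
    steps = [0.5, 1.0, 2.0, 4.0]    # wider proposals for the flatter, hotter targets
    x = [2.0] * len(betas)          # all chains start in the right-hand mode
    cold_trace = []
    for _ in range(n_iter):
        # Within-chain random-walk Metropolis updates on each tempered target p^beta
        for j, beta in enumerate(betas):
            prop = x[j] + rng.gauss(0.0, steps[j])
            if rng.random() < math.exp(min(0.0, beta * (log_p(prop) - log_p(x[j])))):
                x[j] = prop
        # Propose swapping the states of one randomly chosen adjacent pair of chains
        j = rng.randrange(len(betas) - 1)
        log_acc = (betas[j] - betas[j + 1]) * (log_p(x[j + 1]) - log_p(x[j]))
        if rng.random() < math.exp(min(0.0, log_acc)):
            x[j], x[j + 1] = x[j + 1], x[j]
        cold_trace.append(x[0])
    return cold_trace

trace = parallel_tempering()
```

Although the cold chain alone would essentially never cross the barrier, swaps with the hotter chains let it visit both modes.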
Improvements in Data Modeling
These steps forward in the statistical implementation, while still maintaining computational
feasibility, also enable us both to incorporate new data structures and to advance our
modeling of the processes from which the data arise:
1. Large increase in volume of data throughout the curve: the new IntCal20 is based upon a
much greater number of 14C determinations. These include many annual tree-ring
determinations such as remeasurement within various laboratories of data from the
period of Thera (e.g. Pearson et al. 2018); new wiggle-matched and floating sequences of
late-glacial tree rings (e.g. Capano et al. 2020 in this issue); and new measurements of
Hulu Cave extending further back in time (Cheng et al. 2018). In total, the new
IntCal20 curve is based upon 12,904 raw 14C measurements compared to the 7019 used
for IntCal13 (Reimer et al. 2013). This rapid increase means a faster curve construction
method is essential.
2. Blocking in dendrodated trees: with the new radiocarbon revolution providing an
increased ability to measure annual tree rings, there has been a concern that the
inclusion within IntCal of determinations relating to decadal or bi-decadal averages
alongside these new annual measurements could mean we lose critical information on
short-term variation in atmospheric radiocarbon levels. Our approach fully recognizes
the number of years each determination represents, meaning there is no loss of
information as a result of such blocked determinations.
3. Variable marine reservoir ages: IntCal09 (Reimer et al. 2009) and IntCal13 (Reimer et al.
2013) modeled marine reservoir ages beyond the Holocene as constant over time. For
IntCal20, we incorporate time-varying marine reservoir ages using the LSG model of
Butzin et al. (2020 in this issue) and a separate adaptive spline for the Cariaco Basin.
4. Inclusion of new floating tree-ring sequences: the new curve includes several new late-
glacial trees which have calendar age estimates obtained via wiggle matching (Reinig
et al. 2018; Capano et al. 2020 in this issue), as well as three entirely floating tree
chronologies around the time of the Bølling–Allerød1 (Adolphi et al. 2017) and two
older Southern Hemisphere floating kauri tree-ring sequences (Turney et al. 2010, 2017)
on which we have no absolute dating information and which need to be placed
accurately amongst the other data.
5. Rapid 14C excursions: there are several specific times when the level of atmospheric 14C is
known to vary extremely rapidly. For IntCal20 we have identified three such events around
774–775 AD, 993–994 AD and 660 BC (Miyake et al. 2012, 2013; O'Hare et al. 2019). Such
rapid 14C changes will typically not be modeled well by standard regression which will tend
to smooth them out. However, by increasing the density of knots forming our spline basis in
the vicinity of these rapid excursions we can better represent these significant features.
1While Adolphi et al. (2017) do provide potential age estimates for their Bølling-Allerød trees obtained by comparison
to ice-core 10Be, to maintain independence between the ice-core and radiocarbon timescales these calendar age
estimates are not used for IntCal20.
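The effect of increasing knot density near a rapid excursion (item 5 above) can be sketched numerically. The example below is purely illustrative and not the IntCal20 spline machinery: it fits a regression spline, via a truncated-power cubic basis, to an invented signal containing a sharp spike, and compares a sparse uniform knot set with the same set augmented by extra knots near the spike.

```python
import numpy as np

def fit_spline(x, y, knots):
    # Design matrix: global cubic polynomial plus one truncated cubic term per knot
    B = np.column_stack([np.ones_like(x), x, x**2, x**3] +
                        [np.clip(x - t, 0.0, None) ** 3 for t in knots])
    beta, *_ = np.linalg.lstsq(B, y, rcond=None)
    return B @ beta

x = np.linspace(0.0, 1.0, 400)
# Smooth background plus a sharp "excursion" centred at x = 0.5
y = 0.2 * np.sin(2 * np.pi * x) + np.exp(-((x - 0.5) / 0.02) ** 2)

uniform_knots = np.linspace(0.1, 0.9, 5)
dense_knots = np.concatenate([uniform_knots,
                              0.5 + np.array([-0.04, -0.02, -0.01, 0.01, 0.02, 0.04])])

rmse_uniform = np.sqrt(np.mean((fit_spline(x, y, uniform_knots) - y) ** 2))
rmse_dense = np.sqrt(np.mean((fit_spline(x, y, dense_knots) - y) ** 2))
```

With only the uniform knots the spline smooths the spike away; clustering knots around it lets the fitted curve track the rapid change.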
Improvements in Output
The statistical innovations also mean that we can now provide several new facets to our output:
1. Annualized output: we are able to provide curve estimates on an annual grid enabling
more detailed calibration. This will be needed in light of the increased demand to
calibrate annual radiocarbon determinations. While we do not discuss the implications
for calibration itself in this paper, such a detailed annual calibration curve will likely
increase the potential for multimodal calendar age estimates which require significant
care in interpretation, especially during periods of plateaus in the calibration curve. See
the IntCal20 companion paper (van der Plicht et al. 2020 in this issue) for more details
and an illustrative example with the calibration of the Minoan Santorini/Thera eruption.
2. Predictive intervals: if the IntCal20 14C data contain additional sources of variation
beyond their laboratory quantified uncertainties (due to potential regional, species or
growing season differences), i.e. they appear more widely spread around the calibration
curve than can be explained by their quoted uncertainties, we will need to take this into
account for calibration of new determinations too. Specifically, any 14C determination a
user calibrates against the curve is likely to contain similar levels of unseen additional
variation. If we can assess the level of additional variation seen within the IntCal20
data, then we can incorporate this into calibration for a user by providing a predictive
interval on the curve. Intuitively this predictive interval aims to incorporate both
uncertainty in the value of the curve itself and potential additional uncertainty sources
such as regional, species or growing season effects beyond the laboratory quoted
uncertainty. These predictive intervals are therefore more relevant for calibration than
curve intervals which do not incorporate or adapt to potential additional sources of
variability.
3. Posterior information arising from curve construction: the Bayesian implementation
allows us to provide posterior estimates for many aspects of interest. For example, all of
the records with uncertainties in their calendar timescales (the various floating tree-ring
sequences, marine sediments, Lake Suigetsu and the speleothems) will be calibrated
during the curve's construction. We can provide these posterior calibrated age estimates
along with other information such as the level of over-dispersion seen in the data, and
posterior estimates of marine reservoir ages and dead carbon fractions.
4. Complete realizations of the curve from 0–55 cal kBP: historically, IntCal output has
consisted of pointwise posterior means and variances. However, the Bayesian approach
provides a set of underlying curve realizations which provide covariance information on
the value of the curve at any two calendar ages. When calibrating single determinations,
this covariance does not affect the calendar age estimates obtained, i.e. calibrating
against the pointwise posterior means and variances provides the same inference as
calibrating against the set of individual curve realizations. However, when calibrating
multiple determinations simultaneously, for example if we are seeking to determine the
length of time elapsed between two determinations or fit a more complex model, use of
these complete realizations in consequent calibration offers potential to improve insight.
Work is planned by the group to explore how this may be best incorporated into
existing calibration software.
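The role of the predictive interval (item 2 above) in calibration can be sketched with invented numbers. In the toy example below, a single F14C determination is calibrated against a hypothetical monotone curve twice: once using only the curve uncertainty, and once inflating the variance by an over-dispersion term tau, as a predictive interval would. All values (curve, uncertainties, tau) are made up for illustration; this is not the IntCal20 calibration code.

```python
import numpy as np

def calibrate(F_obs, sig_obs, curve_mean, curve_sd, tau=0.0):
    # Posterior over the calendar-age grid, assuming
    # F_obs ~ N(curve_mean(theta), curve_sd(theta)^2 + sig_obs^2 + tau^2)
    var = curve_sd**2 + sig_obs**2 + tau**2
    log_lik = -0.5 * (F_obs - curve_mean) ** 2 / var - 0.5 * np.log(var)
    w = np.exp(log_lik - log_lik.max())
    return w / w.sum()

theta = np.arange(0, 100)                # calendar age grid (cal BP)
curve_mean = 1.0 - 0.0005 * theta        # toy, monotonically decreasing F14C curve
curve_sd = np.full_like(curve_mean, 0.001)

F_obs, sig_obs = curve_mean[50], 0.002
post_curve = calibrate(F_obs, sig_obs, curve_mean, curve_sd, tau=0.0)
post_pred = calibrate(F_obs, sig_obs, curve_mean, curve_sd, tau=0.004)
```

Including tau spreads the calendar-age posterior, reflecting the additional unseen variation a new determination is likely to carry.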
We discuss some of these aspects in more detail, and present some of the output available, in
Section 5 and the Supplementary Information.
3 A BRIEF SUMMARY OF BAYESIAN SPLINES, ERRORS-IN-VARIABLES
AND PREDICTIVE INTERVALS
For a typical calibration user the key elements of the new methodology consist of the change to
Bayesian splines with the accompanying shift in the modeling and fitting domains; the
continued recognition that many of the data have uncertainties in their calendar ages and
the significant effect that has on curve estimation; and the use of predictive intervals on the
published curve for calibration. We therefore commence with an intuitive explanation of
these three elements. The technical details can be found later in Section 4. We note that, in
our intuitive descriptions, the observational model may be further complicated by 14C
determinations representing multiple, as opposed to single, calendar years and the inclusion
of a reservoir age or dead carbon fraction for those observations which are not directly
atmospheric. However, for clarity of exposition, we do not consider either of these factors
here, and refer to Section 4 for details on how these additions can be incorporated.
3.1 Bayesian Splines and Choice of Modeling Domain
3.1.1 Frequentist Ideas
Suppose that we observe a function f, subject to noise, at a series of times θ_i, i = 1, ..., n,

$$Y_i = f(\theta_i) + \varepsilon_i, \quad i = 1, \ldots, n,$$

where, for the time being, we assume that the times θ_i are known absolutely. To obtain a
spline estimate for the unknown function we seek to find a function that provides a
satisfactory compromise between going close to the data but yet does not overfit. This
can be done by choosing the estimate $\hat{f}$ that minimizes, over a suitable set of functions,

$$S_\lambda(f) = \mathrm{FIT}(f; Y) + \lambda \, \mathrm{PEN}(f),$$

where $\mathrm{FIT}(f; Y)$ measures the lack of agreement between a potential f and the observed Y; and
$\mathrm{PEN}(f)$ represents a penalty for functions that might be overfitting. Typically, $\mathrm{FIT}(f; Y)$
consists of the sum-of-squares difference between the $f(\theta_i)$ and the $Y_i$, while $\mathrm{PEN}(f)$ assesses the
roughness of a proposed f, ensuring that the more variable f is, the larger the penalty given to
it, with the aim of preventing the spline estimate from overfitting the data. The parameter λ
determines the relative trade-off between how one values fidelity to the observed data, i.e.
$\mathrm{FIT}(f; Y)$, compared with function roughness, $\mathrm{PEN}(f)$. A large λ will heavily penalize
roughness and typically results in a smooth curve that is less close to the data; while a
small λ will mean the spline goes closer to the data but at the expense of being more variable.
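The fidelity/roughness trade-off can be sketched with a discretized analogue of $S_\lambda(f)$: the fit term is the weighted sum of squares to the data and the penalty is the sum of squared second differences of f. This finite-difference version (sometimes called a Whittaker smoother) is not the spline basis used for IntCal20; it is a minimal illustration of the role of λ, on invented data.

```python
import numpy as np

def penalized_fit(y, sigma, lam):
    # Minimizer of sum_i (f_i - y_i)^2 / sigma_i^2 + lam * sum (second diffs of f)^2,
    # found by solving the normal equations (W + lam * D'D) f = W y
    n = len(y)
    D = np.diff(np.eye(n), n=2, axis=0)   # second-difference operator, shape (n-2, n)
    W = np.diag(1.0 / sigma**2)           # data weights 1/sigma_i^2
    return np.linalg.solve(W + lam * D.T @ D, W @ y)

rng = np.random.default_rng(42)
t = np.linspace(0.0, 1.0, 50)
sigma = np.full(50, 0.1)
y = np.sin(2 * np.pi * t) + rng.normal(0.0, 0.1, 50)   # noisy observations

f_rough = penalized_fit(y, sigma, lam=1e-6)   # tiny lambda: follows the data closely
f_smooth = penalized_fit(y, sigma, lam=1e3)   # large lambda: noise smoothed away
```

Increasing λ trades fidelity to the noisy observations for a smoother fitted function, exactly the balance the functional above formalizes.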
3.1.2 Selection of Appropriate Fitting and Modeling Domains for Radiocarbon
Within the radiocarbon community there are three commonly used domains2: Δ14C, F14C and
the radiocarbon age. Given g(θ), the historical level of Δ14C in year θ cal BP, we can freely
convert between the domains as follows:

F14C domain: $f(\theta) = \left(1 + g(\theta)/1000\right) e^{-\theta/8267}$. The value of F14C, or fraction modern (Reimer
et al. 2004b), denotes the relative proportion of radiocarbon to stable carbon in a sample
compared to that of a sample with a conventional radiocarbon age of 0 14C yrs BP
(i.e. mid-1950 AD) after accounting for isotopic fractionation. It is a largely linear
calculation from the instrumental measurements meaning its uncertainties are
approximately normal.
2We use the age-corrected Δ14C as described in Stuiver and Polach (1977).
Radiocarbon age domain: $h(\theta) = -8033 \ln f(\theta)$. The radiocarbon age, in 14C yrs BP, is
obtained from the fraction modern using Libby's original half-life and without any
calibration. Since this is a non-linear mapping of F14C, the uncertainties are no longer
even approximately normally distributed as we approach the limit of the technique.
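The two conversions above can be written directly as code. This short sketch simply restates the formulas from the text (function names are ours):

```python
import math

def f14c_from_delta(g, theta):
    # F14C domain: f(theta) = (1 + g(theta)/1000) * exp(-theta / 8267),
    # with g the Delta14C value (per mil) at calendar age theta (cal BP)
    return (1.0 + g / 1000.0) * math.exp(-theta / 8267.0)

def radiocarbon_age(f):
    # Radiocarbon age domain: h = -8033 * ln(f), in 14C yrs BP (Libby half-life)
    return -8033.0 * math.log(f)

# A sample with Delta14C = 0 at theta = 0 cal BP has F14C = 1, so age 0 14C yrs BP
age_now = radiocarbon_age(f14c_from_delta(0.0, 0.0))
```

As a check, a sample with F14C = 0.5 returns one Libby half-life, approximately 5568 14C yrs BP.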
When fitting a smoothing spline, we have some flexibility in the precise choice of $\mathrm{FIT}(f; Y)$, our
assessment of fit, and our roughness penalty $\mathrm{PEN}(f)$. For radiocarbon purposes, the natural
domain in which to assess the quality of fit of a proposed calibration curve to observations is the F14C
domain, the raw scale on which the determinations are obtained. In this F14C domain, our
measurement uncertainties are symmetric. Conversely, in the radiocarbon age domain these
uncertainties become asymmetric as we progress back in time. We therefore choose

$$\mathrm{FIT}(f; F) = \sum_{i=1}^{n} \frac{1}{\sigma_i^2} \left( f(\theta_i) - F_i \right)^2,$$

where the $F_i$ are the observed F14C values of the determinations and the $\sigma_i$ their associated
uncertainties in the F14C domain.
F14C is not, however, the most appropriate domain in which to assess roughness since it exhibits
exponential decay over time, making equitable penalization of potential calibration curves
more difficult. Instead a more natural choice is the Δ14C domain where, a priori, one might
expect a calibration curve to display approximately equal roughness over its length. For
our penalty function we therefore choose

$$\mathrm{PEN}(f) = \int g''(\theta)^2 \, d\theta,$$

where $g''(\theta)$ is the second derivative of the proposed level of Δ14C, a standard penalty for
function roughness.
We summarize this by saying that we perform modeling in the Δ14C domain as it is here we
penalize roughness; while we perform fitting in the F14C domain since it is here we assess
fidelity of the curve to the raw observations. Since, given θ, the transformation from F14C
to Δ14C is affine, we can utilize these different modeling and fitting domains while still
maintaining a practical spline estimation approach.
3.1.3 Bayesian Reframing
To reinterpret the above idea in a Bayesian framework we can split our functional

$$S_\lambda(f) = \sum_{i=1}^{n} \frac{1}{\sigma_i^2} \left( f(\theta_i) - F_i \right)^2 + \lambda \int g''(\theta)^2 \, d\theta$$

into two distinct parts. The penalty component $\lambda \int g''(\theta)^2 \, d\theta$ can be considered as specifying the
prior distribution on the space of calibration curves; more precisely, it gives the negative log
density of the prior. Intuitively this summarizes our prior beliefs about the form of the
calibration curve before we observe any actual data. We then wish to update this prior
belief, in light of the observed data, to obtain our posterior. This updating is achieved via
the fitting component $\sum_{i=1}^{n} \frac{1}{\sigma_i^2} \left( f(\theta_i) - F_i \right)^2$, which represents the negative log-likelihood of
the observed data under an assumption that each $F_i \sim N(f(\theta_i), \sigma_i^2)$. The value of $S_\lambda(f)$ then
represents the negative posterior log-density for a potential calibration curve f.
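This Bayesian reading can be made concrete on a discrete grid: the penalty defines a Gaussian prior with precision proportional to λD'D (D the second-difference operator), the fit term a Gaussian likelihood with precision W, so the posterior over curves is Gaussian with precision K = W + λD'D and realizations can be drawn exactly. This is only an illustrative sketch on invented data; the actual curve construction uses B-splines and MCMC to handle calendar age uncertainty and the other complications described later.

```python
import numpy as np

def posterior_curves(F, sigma, lam, n_draws, seed=0):
    n = len(F)
    D = np.diff(np.eye(n), n=2, axis=0)    # discrete roughness (second differences)
    W = np.diag(1.0 / sigma**2)            # likelihood precision from sigma_i
    K = W + lam * D.T @ D                  # posterior precision
    mean = np.linalg.solve(K, W @ F)       # posterior mean curve (the penalized fit)
    L = np.linalg.cholesky(K)
    z = np.random.default_rng(seed).standard_normal((n, n_draws))
    # If K = L L', then solve(L', z) has covariance K^{-1}, so these are exact draws
    return mean, mean[:, None] + np.linalg.solve(L.T, z)

F = np.sin(np.linspace(0.0, 3.0, 40))      # invented "observed" F14C values
sigma = np.full(40, 0.05)
mean, draws = posterior_curves(F, sigma, lam=10.0, n_draws=100)
```

Each column of `draws` is one plausible curve; summarizing many such realizations (pointwise means, variances and covariances) mirrors how the published curve and its realizations are produced.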
This idea is illustrated in Figure 1. In panel (a) we present three potential calibration curves in
the Δ14C domain. Using just our prior, we give each curve an initial weight corresponding to
their roughness. The black curve is the least rough and so would be given highest prior weight
as a plausible potential calibration curve. The green curve is the roughest and so has least
weight according to our prior. Each of these Δ14 C curves is then converted into the
symmetric F14C space and compared with our observations as shown in panel (b). The
plausibility of each curve is then updated to incorporate the fit to this data. Based upon
this, the red curve would have a higher posterior density as it fits relatively closely while
not being overly rough. Both the green and black curves would have a low posterior
density and so be considered highly implausible as they do not pass near the data. Use of
MCMC allows us to generate any number of plausible calibration curves drawn from the
posterior that provide a satisfactory trade-off between the roughness prior and fidelity to
the observations. Some such realizations are shown in panels (c) and (d) in both the Δ14C
and F14C domains respectively. These realizations are then summarized to produce the final
IntCal20 curve.
3.1.4 Variable Smoothing and Knot Selection
A further choice to be made when spline smoothing is the set of functions (or potential
calibration curves) to search over for our functional Sf. This set is called a basis. We use
cubic splines where this basis is determined by specifying what are known as knots at
distinct calendar times. The more knots one has in a particular time period, the more the
[Figure 1 here. Panels: (a) "Δ14C Space Prior"; (b) "Comparing curves to data in F14C Domain"; (c) "Δ14C Space Posterior"; (d) "F14C Space Posterior". Axes: Calendar Age (cal BP) against Δ14C (‰) or F14C.]

Figure 1 An illustration of Bayesian splines. Panel (a) shows some potential calibration curves in the Δ14C domain drawn from the prior. These are then compared with the observed data in the F14C domain as shown in panel (b) to form our Bayesian posterior. The bottom two panels (c) and (d) show posterior realizations of potential curves (shown in Δ14C and F14C space respectively) obtained via MCMC that provide a satisfactory trade-off between agreement with our prior penalizing over-roughness and the fit to our observed data.
spline can vary. One common option, known as a smoothing spline, is to place a knot at every calendar age $\theta_i$ for $i = 1, \ldots, n$. However, since we have a very large number of observations $n$, this is computationally impractical. Instead we use P-splines (see Berry et al. 2002, for details) where we select a smaller number of knots; this is equivalent to restricting the potential calibration curves to a somewhat smaller subspace of functions.
In implementing P-splines we need to make sure that the curves we consider are still able to
provide sufficient detail for a calibration user and identify fine scale features such as solar cycles
where we have the ability to do so. The data on which we base our curve have highly variable
density. In some periods, such as recent dendrodated trees, we have a great density of
determinations while in others the underlying data is more sparse. To adapt to this and
keep required detail, we choose a large number of knots and place them at calendar age
quantiles of the data to provide variable smoothing. This approach of locating the knots at
observed quantiles is standard in the regression literature (e.g. Harrell 2001). Where our
data is dense, we can pick out the required fine detail but where the data is more sparse we
smooth more significantly. An example can be seen in Figure 2. This figure also highlights
how we pack additional knots around known Miyake-type events, narrow spikes
[Figure 2 here. Axes: Calendar Age (cal BP) against Radiocarbon Age (14C yrs BP); legend: Posterior Mean, 95% Predictive Intervals.]

Figure 2 Variable smoothing and knot selection. Shown as a rug of tick marks along the bottom are the locations of the knots for the cubic spline. These are placed at quantiles of the observed calendar ages to provide variable smoothing. In dense regions, we can identify more detail in the calibration curve; while where the underlying data is less dense we perform more smoothing. In particular, note the additional knots placed around the two Miyake-type events (i.e. 957 and 1176 cal BP) which allow the curve to vary much more rapidly at these times. The points from different datasets within the IntCal database are shown in different colors to distinguish them.
(sub-annual) of increased 14C production, to enable us to better retain these events in the final
curve. For more details on choice of knots and placement see Sections 4.3.4 and 4.4.3.
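A minimal sketch of this quantile-based knot placement, with extra knots packed around known event ages, might look as follows. The knot counts, window widths, and event handling here are illustrative assumptions, not the IntCal settings described in Sections 4.3.4 and 4.4.3.

```python
import numpy as np

def choose_knots(cal_ages, n_knots, events=(957, 1176),
                 extra_per_event=5, half_width=5):
    """Place spline knots at quantiles of the observed calendar ages, then pack
    additional knots into a small window around each known Miyake-type event.
    (Illustrative sketch only; real knot counts and windows differ.)"""
    qs = np.linspace(0, 1, n_knots)
    knots = np.quantile(cal_ages, qs)        # dense data -> dense knots
    for ev in events:                        # extra resolution at sharp spikes
        knots = np.append(
            knots, np.linspace(ev - half_width, ev + half_width, extra_per_event))
    return np.unique(knots)                  # sorted, duplicates removed

rng = np.random.default_rng(0)
ages = rng.uniform(900, 1600, size=2000)     # toy calendar ages
knots = choose_knots(ages, n_knots=40)
```

Where determinations cluster, the quantiles automatically concentrate knots; sparse stretches receive fewer knots and hence heavier smoothing.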
3.2 Errors-in-Variables Regression
3.2.1 Calendar Age Uncertainty
Within the IntCal database (http://www.intcal.org/), many of the determinations have calendar
ages which are not known absolutely. This is particularly the case as we progress further back in time and calendar age estimates are constructed from uranium-thorium (U-Th) dating, e.g. speleothems (Southon et al. 2012; Cheng et al. 2018; Beck et al. 2001; Hoffmann et al. 2010) and corals (Bard et al. 1990, 1998, 2004; Cutler et al. 2004; Durand et al. 2013; Fairbanks et al. 2005; Edwards et al. 1993; Burr et al. 1998, 2004); varve counting, e.g. parts of Cariaco Basin (Hughen et al. 2004) and Lake Suigetsu (Bronk Ramsey et al. 2020 in this issue); and palaeoclimate tuning/tie-pointing, e.g. other parts of Cariaco Basin (Hughen and Heaton 2020 in this issue), and the Pakistan and Iberian Margins (Bard et al. 2013). Furthermore, in
the case of several of the floating tree-ring sequences, we have only a relative set of calendar
ages and no absolute age estimate. Regression in this situation is called errors-in-variables
since we have errors/uncertainties in both the calendar age and radiocarbon determination
variables. For these observations we therefore observe pairs Fi;Tifg
i2Iwhere:
Fifi"i;
Tii i:
Here $f(\cdot)$ is our calibration curve of interest in the F14C domain; $\varepsilon_i \sim N(0, \sigma_i^2)$ independently; and $\delta_i$ describes the uncertainty in our calendar age estimate. The form of this calendar age uncertainty varies between the datasets. While there is no restriction on the distribution of $\delta_i$ for our Bayesian spline approach, we model all these calendar age uncertainties as normally distributed but with appropriate covariances. For some sets they are considered independent, e.g. corals dated by U-Th; while for others, e.g. floating tree-ring sequences and those records dated by varve counting or palaeoclimate tie-pointing (Heaton et al. 2013), there is considerable dependence between the observations. For more detail on the various covariance structures in the calendar age uncertainties see Niu et al. (2013).
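As a simplified illustration of these two regimes, one can contrast a purely independent dating error with a covariance containing a shared, record-wide shift. All variances below are hypothetical, and real varve or tie-point covariances also accumulate error along the record (Niu et al. 2013).

```python
import numpy as np

def calendar_age_cov(n, sigma_ind, sigma_shared=0.0):
    """Covariance of the calendar-age errors (delta_1, ..., delta_n) of one record.

    sigma_ind    -- independent per-determination dating error (e.g. U-Th)
    sigma_shared -- a shift common to the whole record (e.g. a floating
                    timescale), which induces dependence between determinations
    (Simplified sketch: a single common shift plus independent noise.)"""
    return (sigma_ind ** 2) * np.eye(n) + (sigma_shared ** 2) * np.ones((n, n))

cov_uth = calendar_age_cov(4, sigma_ind=20.0)                      # diagonal
cov_float = calendar_age_cov(4, sigma_ind=5.0, sigma_shared=50.0)  # correlated
```

In the second case the large off-diagonal terms tell the sampler that the whole record may be moved jointly, rather than each age independently.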
3.2.2 Importance of Recognizing Calendar Age Uncertainty
Incorporation of calendar age uncertainty in the construction of the IntCal calibration curves
has occurred since IntCal04 (Reimer et al. 2004a) and is key for reliable curve construction. It was therefore crucial to retain this within the new methodology. A range of statistical methods have
been developed to deal with errors-in-variables regression. From the frequentist, non-Bayesian,
perspective Fan and Truong (1993) introduced a kernel-based approach which achieves global
consistency; Cook and Stefanski (1994) proposed a more general approach known as SIMEX.
These approaches were not however considered suitable for our application due to the genuine
prior information we have on certain aspects of the data; the wider, almost universal, use of
Bayesian methodology within radiocarbon calibration; and the independent interest in several
aspects of the calibration data which could be provided more simply through a Bayesian
method. We therefore began the development of our method using the Bayesian splines of
Berry et al. (2002).
The effect of not recognizing calendar age uncertainty, where it is present, in regression is quite
varied. In the case that the calendar age uncertainties are entirely independent of one another,
such as U-Th dating, not recognizing calendar age uncertainty will typically result in curve
estimates that are overly smooth and do not attain the peaks and troughs of the true
underlying function. See Samworth and Poore (2005) for an illustrative case study. However,
in the case of IntCal, the situation is made more complicated by the shared dependence of the
calendar age estimates within particular datasets. Specifically, the entire timescale for, e.g.
Hulu Cave, is not the same as the Suigetsu or Cariaco timescale. As a consequence, all of
these individual datasets may show the same overall features but at slightly different times.
We require our method to recognize that these different timescales may need to be aligned,
within their respective uncertainties, to keep these shared features. This requires us to shift
multiple ages jointly by stretching/squashing the timescales accordingly. A failure to adapt to
this joint calendar age uncertainty can give curve estimates which either introduce spurious
wiggles as the curve flips between the different datasets or lose major features entirely.
Illustration
We provide in Figure 3 an illustrative example of the need to incorporate calendar age uncertainty. For simplicity, in this example, we fit and model our spline in the same
[Figure 3 here. Panels: (a) "True function and calendar age θ of observations"; (b) "Observed noisy ages T of observations"; (c) "Estimated function if do not recognize calendar age uncertainty"; (d) "Estimated function if do recognize calendar age uncertainty". Axes: Calendar Age against y; legends distinguish Dataset 1, Dataset 2, the True Function, and the estimates with/without errors-in-variables.]

Figure 3 The importance of recognizing calendar age uncertainty (and correctly representing covariance within that uncertainty) when constructing a curve based on data arising from records with different observed timescales. Panel (a) shows the true, underlying, calendar ages of the data in two different records; while panel (b) shows a joint shift in the observed calendar ages within record 2. In such a case, where the observed timescales in two records are offset from one another, if we ignore this calendar age uncertainty then our spline estimate will introduce spurious variation as it flips between the records, as in panel (c). Conversely, if we incorporate such calendar age uncertainty and accurately represent it, then we can still recover the underlying function accurately, as shown in (d).
domain. We consider a straightforward underlying function which we wish to reconstruct from 100 noisy observations $\{Y_i, T_i\}_{i=1}^{100}$:

\[ f(\theta) = e^{\theta} \sin(4\theta) \cos(2\theta), \]

where our underlying $\theta_i \sim \mathrm{Beta}(1.1, 1.1)$ and observed $Y_i \sim N(f(\theta_i), 0.05^2)$ for $i = 1, \ldots, 100$. Further, let us assume that these observations arise from two different sediment cores, 50 from core 1 (shown as black triangles) and 50 from core 2 (shown as red dots) as seen in panel (a). Within core 1, the calendar ages are known absolutely. However, within core 2 the observed timescale is somewhat shifted/biased so that when we observe the calendar ages $T$ in this core they all share the same joint shift from their true values, i.e.

\[ T_i = \theta_i \quad \text{if } \theta_i \text{ is in core 1}, \]
\[ T_i = \theta_i + \delta \quad \text{if } \theta_i \text{ is in core 2}, \]

where $\delta \sim N(0, 0.1^2)$. The observed pairs $\{Y_i, T_i\}_{i=1}^{100}$ are presented in panel (b), showing the shift in timescale within core 2. If we attempt to reconstruct the function using splines based upon our observed $\{Y_i, T_i\}_{i=1}^{100}$ without recognizing the calendar age uncertainty, i.e. assuming both cores are on the same timescale, then we obtain the estimate in panel (c). The estimate is spuriously variable as the spline will try to pass near all of the data on their non-comparable timescales. Conversely, if we recognize that the timescale in core 2 may need to be shifted onto the timescale of core 1, we obtain the estimate shown in panel (d). If we inform our Bayesian method that the calendar age observations in core 2 are subject to uncertainty, it will estimate the size of the joint shift needed to register core 2 onto the true timescale of core 1 and simultaneously reconstruct the true underlying function.
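The data-generating step of this toy example can be reproduced in a few lines. This is simulation only (no spline fit is attempted), with the toy target function written out explicitly in the code.

```python
import numpy as np

rng = np.random.default_rng(1)

def f(theta):
    # Toy target function of the illustration (as written above)
    return np.exp(theta) * np.sin(4 * theta) * np.cos(2 * theta)

n = 100
theta = rng.beta(1.1, 1.1, size=n)            # true calendar ages on [0, 1]
y = f(theta) + rng.normal(0, 0.05, size=n)    # noisy observations Y_i
core = np.repeat([1, 2], n // 2)              # 50 determinations per core

delta = rng.normal(0, 0.1)                    # one joint shift for all of core 2
T = np.where(core == 1, theta, theta + delta) # observed (possibly biased) ages
```

Fitting a spline to `(T, y)` directly reproduces the spurious wiggles of panel (c); treating `delta` as an unknown parameter recovers panel (d).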
Implications for Visually Assessing Curve Fit
While it is important to incorporate errors-in-variables in curve construction, and accurately
represent any dependence in the calendar age uncertainties, this does cause some difficulty in
visually assessing the quality of fit between the raw constituent data and the final IntCal curve.
Since the proposed method can shift the calendar ages of the observed data left and right within their uncertainties as described, one cannot estimate the final curve's goodness-of-fit by eye using only the radiocarbon age axis if the raw data are only plotted at their initial, observed calendar ages. The method will have tried to align shared features even if they occur at somewhat different observed times within the different sets. This should be taken into consideration when viewing the final curve against the raw data. We provide the posterior calendar age estimates for the true calendar ages $\theta_i$ for all the data. The difficulty of assessing the fit of the curve by eye is further compounded by the offsets in 14C described in Section 4.4, in particular the marine reservoir ages which vary significantly over time and so will change as the MCMC updates the calendar ages of the data.
3.3 Predictive Intervals and Over-Dispersion
3.3.1 Background and Motivation
As well as creating a curve which has the correct posterior mean, it is key to make sure we provide
appropriate intervals on the curve to ensure that, when used for calibration, the calendar age
estimates produced are not over- or under-precise. This becomes particularly relevant for
IntCal20 due to the large increase in data available to create the curve, and also the increased
measurement precision provided by current laboratories. In addition to the laboratory
reported uncertainty, there is a wide range of possible further sources of variation in the recorded 14C within objects of the same age: for example, determinations come from different
locations; have different local environments; tree rings may be of different species; and have
different periods of growth due to local weather and so may have differing elements of wood
from late/early growth. Even when the same sample is measured in different laboratories, we
have evidence of a greater level of observation spread (i.e. over-dispersion) in the 14C
measurements within objects from the same calendar year than the laboratory reported
uncertainties would support (see e.g. Scott et al. 2017). Recognizing this potential over-
dispersion, whatever its cause, is important both for curve estimation and resultant calibration
for IntCal20. Specifically, if there is more variation in determinations from the same calendar
year than the uncertainties reported by the laboratory, we need to make sure we incorporate
that in our modeling, and also recognize its existence in the objects users will then calibrate
against the IntCal20 curve.
We achieve this through the inclusion of a term to quantify the level of over-dispersion which
we adaptively estimate, based on the high-quality and dense IntCal data, within our curve
construction. This ensures that our IntCal20 curve does not become over-precise as an
estimate of the Northern Hemispheric average. However, this alone is not sufficient for
calibration since any additional sources of variation seen within the IntCal data are also
likely to be present in uncalibrated determinations. We must therefore propagate this over-
dispersion through to the calibration process itself. This is achieved using predictive curve
intervals which incorporate not only uncertainty in the Northern Hemispheric average
atmospheric curve but also the potential additional sources of variation seen in individual
14C determinations. Importantly, if no such additional variability exists then our model will
estimate this appropriately, however if such additional sources of variability do seem
present in the calibration data, our method will also recognize this and adapt accordingly.
Without such an over-dispersion term then, as we get more data, the calibration curve will
have intervals that become narrower and narrower even if the underlying data suggests
considerably higher variability. Eventually, this would mean that the data entering the
curve itself would potentially not calibrate to their known ages.
Such an idea of irreducible uncertainty was introduced in Niu et al. (2013) but here we extend
the idea further and generate predictive curve intervals which are more relevant for calibration
users. Importantly, this does not however mean that we produce a calibration tool against which any measurement (no matter its quality) can be calibrated; we wish to maintain a calibration curve which has basic minima for data quality, both for what goes into its construction and also for what can be reliably calibrated using it.
3.3.2 The Over-Dispersion Model: Additive Errors Scaling with √F14C
The basic tree-ring model for an annual measurement assumes that a determination of calendar
age arises from a hemispherically uniform atmospheric level of 14 C shared by all
determinations of the same calendar age. Under this model, the only uncertainty is that
reported by the laboratory so that, in the F14C domain, the observed value is
\[ F_i = f(\theta_i) + \varepsilon_i, \]

where $\theta_i$ is the year the determination represents; and $\varepsilon_i \sim N(0, \sigma_i^2)$ with $\sigma_i$ the uncertainty reported by the laboratory. The function $f(\theta)$ is our unique estimate of the hemispheric level of F14C present in the atmosphere at calendar age $\theta$.
Currently it is not feasible to identify the specific sources that may potentially contribute extra
variability beyond that reported by the laboratory (e.g. regional, species, growing season
differences) and so we aim to cover them all in a unified approach. We therefore modify the
above model to allow any determination to have an additional and independent source of
potential variability $\eta_i$ beyond that reported by the laboratory, i.e.

\[ F_i = f(\theta_i) + \varepsilon_i + \eta_i \quad \text{for } i = 1, \ldots, n, \]

where $\eta_i \sim N(0, \tau_i^2)$ are independent random effects with unknown variance $\tau_i^2$. To create the curve we simultaneously estimate both $\tau_i$, the level of over-dispersion, and the calibration curve $f(\cdot)$.
After investigation of several options, see Section 5.1 and the Supplementary Information, we
model the level of over-dispersion (i.e. the level of any additional variability) to scale proportionally with the underlying value of $\sqrt{f(\theta)}$, i.e. $\tau_i \propto \sqrt{f(\theta_i)}$, so that $\eta_i \sim N(0, \tau^2 f(\theta_i))$. We choose a prior for the constant of proportionality $\tau$ based upon data taken from the SIRI inter-comparison project (Scott et al. 2017), and update this prior within curve construction based upon the IntCal data. The information provided by the very large volume of IntCal data dominates the SIRI prior and so the posterior estimate for the level of over-dispersion is primarily based upon the high-quality IntCal data.
3.3.3 Implications for Calibration
A user seeking to calibrate a high-quality 14C determination against the IntCal20 curve is likely to
have a measurement that has been subject to similar additional sources of potential variation to
whatever is seen in the IntCal data, and so have a similar level of over-dispersion. However, such a
user typically has no way of assessing this over-dispersion themselves. Consequently, they should
calibrate their determination using predictive curve estimates which additionally account for
potential over-dispersion in their own determination. These predictive estimates are based upon the posterior of

\[ f(\theta) + \eta(\theta), \]

where $\eta(\theta) \sim N(0, \tau^2 f(\theta))$ and $\tau$ is the posterior estimate of over-dispersion based upon the IntCal data. For IntCal20 we therefore report this predictive interval since it is more relevant for calibration. It is slightly wider than the corresponding credible interval for $f(\theta)$ but more likely to give accurate calibrated dates.
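To sketch how a user's determination would then be calibrated against such predictive intervals, the toy example below forms a posterior over a calendar-age grid using the predictive variance $\sigma^2 + \sigma_{\mathrm{curve}}(\theta)^2 + \tau^2 f(\theta)$ and a flat calendar-age prior. The curve values and uncertainties are hypothetical, not IntCal20 output.

```python
import numpy as np

def calibrate(F_obs, sig_obs, f_mean, f_sd, tau):
    """Posterior over calendar age for one F14C determination, using predictive
    curve uncertainty: observation variance + curve variance + tau^2 * f(theta).
    (Sketch with a flat calendar-age prior; hypothetical inputs.)"""
    pred_var = sig_obs ** 2 + f_sd ** 2 + tau ** 2 * f_mean
    loglik = -0.5 * (F_obs - f_mean) ** 2 / pred_var - 0.5 * np.log(pred_var)
    w = np.exp(loglik - loglik.max())
    return w / w.sum()                       # normalized posterior on the grid

theta = np.linspace(900, 1100, 201)          # calendar-age grid (cal BP)
f_mean = 0.88 - 1e-5 * (theta - 900)         # toy posterior-mean curve in F14C
f_sd = np.full_like(theta, 5e-4)             # toy pointwise curve sd
post = calibrate(F_obs=0.879, sig_obs=1e-3,
                 f_mean=f_mean, f_sd=f_sd, tau=0.0)
```

Setting `tau` above zero widens `pred_var`, and hence the calibrated interval, wherever the curve sits; with `tau = 0.0` the example reduces to an ordinary credible-interval calibration.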
Note that the level of over-dispersion, and hence the predictive intervals, we incorporate into
the IntCal20 curve is based upon the high-quality, and screened, IntCal database. If a new
uncalibrated 14C determination has additional sources of variability beyond those present in
the IntCal tree-ring database (e.g. tree-ring species or locations not represented in IntCal,
or a non-tree-ring sample) then this may mean that the level of over-dispersion for that
uncalibrated determination is higher than that incorporated within the IntCal prediction
intervals. This would result in a potentially over-precise calibrated age estimate. Users are
advised to be cautious in such circumstances.
4 CREATING THE CURVE
As in previous versions of IntCal, the IntCal20 curve itself is created in two linked sections.
Firstly we create the more recent part of the curve (extending from 0 cal kBP back to
approximately 14 cal kBP) which is predominantly based upon dendrodated tree-ring
determinations. Secondly we create the older part of the curve (from approximately 14 cal
kBP back to 55 cal kBP) which is based upon a wider range of material, e.g. corals, macrofossils, forams, and speleothems, as well as five floating tree-ring chronologies, that are of
uncertain calendar ages. Furthermore, these older 14C determinations are often not direct
atmospheric measurements and hence have marine reservoir ages or dead carbon fractions.
We therefore split our technical description of the approach accordingly. The two curve sections are stitched together to ensure smoothness and continuity through appropriate design of the spline basis and by conditioning the older part of the curve on the already estimated, more recent section. This means that we can still produce sets of complete curve realizations from 0–55 cal kBP. Future work will aim to adapt the approach into a single step, updating the entire 0–55 cal kBP range as one.
4.1 Notation
Calibration Curve by Domain

We can represent the atmospheric history of 14C, as a function of calendar age $\theta$, in three domains equivalently: Δ14C, F14C, and radiocarbon age. For any proposed history, i.e. calibration curve, we switch between these domains as appropriate within our statistical methodology. Let us denote:

- $g(\theta)$: the 14C calibration curve represented in Δ14C space, i.e. the level of Δ14C at $\theta$ cal BP. We model $g(\theta)$ in our spline basis and penalize roughness of the curve in this domain.
- $f(\theta)$: the 14C calibration curve represented in F14C space. Given $g(\theta)$ then
\[ f(\theta) = \left( 1 + \frac{g(\theta)}{1000} \right) e^{-\theta/8267}. \]
It is this F14C domain that is used to assess goodness-of-fit to the observed data.
- $h(\theta)$: the 14C calibration curve represented in radiocarbon age:
\[ h(\theta) = -8033 \ln f(\theta). \]
This is the domain in which the calibration curve is plotted in the main IntCal20 paper (Reimer et al. 2020 in this issue).
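These domain transformations are direct to implement; a minimal sketch of the three maps stated above:

```python
import numpy as np

def delta14c_to_f14c(g, theta):
    """f(theta) = (1 + g(theta)/1000) * exp(-theta/8267)."""
    return (1 + g / 1000) * np.exp(-theta / 8267)

def f14c_to_radiocarbon_age(f):
    """h(theta) = -8033 * ln f(theta)."""
    return -8033 * np.log(f)

def f14c_to_delta14c(f, theta):
    """Inverse of the first map: affine in f for fixed theta."""
    return 1000 * (f * np.exp(theta / 8267) - 1)

g = 20.0        # Delta14C of 20 per mil ...
theta = 1000.0  # ... at 1000 cal BP
f = delta14c_to_f14c(g, theta)
h = f14c_to_radiocarbon_age(f)
```

Note that for fixed $\theta$ the Δ14C↔F14C map is affine, which is what lets the method model in one domain while fitting in the other.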
Observed Data
We consider all of our observations in F14C since, as explained in Section 3.1, in this domain our 14C measurement uncertainties are symmetric. We define:

- $F_i$: the observed F14C value of the data used to create the IntCal20 curve.
- $\sigma_i$: the laboratory-reported uncertainty on the observed F14C.
- $m_i$: the number of annual years of metabolization that a determination represents. The determination is considered to represent the mean of this block.
- $T_i$: the observed calendar age of a determination, which could either be the true calendar age (in the case of absolutely dendrodated determinations) or an estimate with uncertainty (in the case of e.g. corals, varves, speleothems).
Model Parameters
Within the model we update:

- $\beta_j$: the spline coefficients which describe the calibration curve.
- $\lambda$: the smoothing parameter for the spline estimate.
- $\theta_i$: the true calendar age of a determination or, in the case of a block-average determination, the most recent year of metabolization included in the block.
- $\tau$: the level of over-dispersion in the observed F14C, under a model whereby the potential additional variability for determination $F_i$ is $\eta_i \sim N(0, \tau^2 f(\theta_i))$.
- $r_K(\theta)$: the offset, measured in terms of radiocarbon age, either due to dead carbon fraction (dcf) or marine reservoir age (MRA), between a determination from set $K$ and the atmosphere at time $\theta$ cal BP. These will be specified either by the mean dcf offset/coastal MRA shift for a particular dataset; or by further spline coefficients $C$ in the case of the Cariaco unvarved record. These are only needed when estimating the older part of the curve.
- Adaptive error multipliers, one per determination ($i$) and one per dataset ($K$), for data arising from the older time period, to enable heavier-tailed errors and the down-weighting of outliers. These are also only included when estimating the older part of the curve.
4.2 Basic Model for Observed Data and Curve
As described in Section 3.1, we fit our curve to the data in the F14C domain while the curve modeling is performed in the Δ14C domain. Specifically, for a single-year determination of the direct atmosphere with no offset and no over-dispersion, we observe pairs $\{F_i, T_i\}_{i=1}^{n}$ where

\[ F_i = f(\theta_i) + \varepsilon_i = \left( \frac{g(\theta_i)}{1000} + 1 \right) e^{-\theta_i/8267} + \varepsilon_i, \]
\[ T_i = \theta_i + \delta_i. \]

Here $g(\theta)$, the value of Δ14C over time, is modeled as

\[ g(\theta) = \sum_{j=1}^{K} \beta_j B_j(\theta) = B(\theta)^T \boldsymbol{\beta}, \]

where $B(\theta) = (B_1(\theta), \ldots, B_K(\theta))^T$ and the $B_j(\theta)$ are cubic B-splines (Green and Silverman 1993) at a chosen fixed set of knots. To maintain computational feasibility, we consider $K \ll n$ so that the number of splines in our basis is considerably smaller than the total number of observations in the IntCal database. Knot number and placement is discussed in Sections 4.3.4 and 4.4.3.
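Evaluating $g(\theta) = B(\theta)^T \boldsymbol{\beta}$ can be sketched with SciPy's cubic B-splines; the clamped knot vector below is a toy choice, not the IntCal knots.

```python
import numpy as np
from scipy.interpolate import BSpline

k = 3                                    # cubic B-splines
interior = np.linspace(0.0, 100.0, 11)   # toy knot positions
t = np.r_[[interior[0]] * k, interior, [interior[-1]] * k]  # clamped knot vector
K = len(t) - k - 1                       # number of basis functions B_1..B_K

def basis_matrix(theta):
    """Rows B(theta)^T: column j is the cubic B-spline B_j evaluated at theta."""
    return np.column_stack(
        [BSpline(t, np.eye(K)[j], k)(theta) for j in range(K)])

theta = np.linspace(0.0, 100.0, 50)
B = basis_matrix(theta)
beta = np.ones(K)                        # all-ones coefficients give g == 1
g = B @ beta                             # g(theta) = B(theta)^T beta
```

With the design matrix `B` precomputed, any candidate coefficient vector is turned into a curve by a single matrix product, which is what keeps the MCMC updates cheap when $K \ll n$.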
Prior on β

The Bayesian spline approach, equivalent to penalizing roughness in Δ14C by $\lambda \int_a^b g''(\theta)^2 \, d\theta$, is obtained by placing a partially improper Gaussian process on the spline coefficients $\boldsymbol{\beta}$:

\[ \pi(\boldsymbol{\beta} \mid \lambda) \propto \left( \frac{\lambda}{2} \right)^{\mathrm{rank}(D)/2} |D|_*^{1/2} \exp\left( -\frac{\lambda}{2} \boldsymbol{\beta}^T D \boldsymbol{\beta} \right), \]

where $D$ is a penalty matrix which penalizes the integrated squared second derivative and $|D|_*$ its generalized determinant (see Green and Silverman 1993, for details). For splines of degree $m$ (cubic splines have degree 3), $\mathrm{rank}(D) = K - m + 1$.
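As an illustration, a P-spline-style second-difference matrix is a common discrete stand-in for this integrated-squared-second-derivative penalty matrix $D$ (the exact matrix integrates the B-spline second derivatives); the prior log-density then follows the form above up to constants.

```python
import numpy as np

def second_difference_penalty(K):
    """D = Delta2^T Delta2: a discrete analogue of the integrated squared
    second derivative (Eilers-Marx style stand-in for the exact matrix)."""
    D2 = np.diff(np.eye(K), n=2, axis=0)   # (K-2) x K second-difference operator
    return D2.T @ D2

def log_prior_beta(beta, lam, D):
    """Partially improper Gaussian log-prior on beta, up to constant terms:
    (rank(D)/2) * log(lam/2) - (lam/2) * beta^T D beta."""
    rank = np.linalg.matrix_rank(D)
    return 0.5 * rank * np.log(lam / 2) - 0.5 * lam * beta @ D @ beta

K = 10
D = second_difference_penalty(K)
flat = log_prior_beta(np.ones(K), 1.0, D)            # constant beta: no roughness
rough = log_prior_beta(np.arange(K) ** 2.0, 1.0, D)  # quadratic: penalized
```

Constant (and linear) coefficient vectors lie in the null space of $D$, so the prior is improper exactly on those smooth directions while shrinking rough ones.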
Prior on λ

As standard within Bayesian splines (e.g. Berry et al. 2002), we place a hyperprior on the smoothing parameter $\lambda \sim \mathrm{Ga}(A, B)$:

\[ \pi(\lambda) = \frac{B^A}{\Gamma(A)} \lambda^{A-1} \exp(-B\lambda) \quad \text{for } \lambda > 0. \]

We select an uninformative prior on $\lambda$ with $A = 1$ and $B = 50000$.
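Combined with the Gaussian prior on $\boldsymbol{\beta}$, this Gamma hyperprior is conjugate, giving the standard full conditional $\lambda \mid \boldsymbol{\beta} \sim \mathrm{Ga}(A + \mathrm{rank}(D)/2,\; B + \boldsymbol{\beta}^T D \boldsymbol{\beta}/2)$. A sketch of the resulting Gibbs draw, with a toy penalty matrix:

```python
import numpy as np

def gibbs_lambda(beta, D, A=1.0, B=50000.0, rng=None):
    """Conjugate full-conditional draw of the smoothing parameter:
    lambda | beta ~ Ga(A + rank(D)/2, B + beta^T D beta / 2)
    (shape/rate parameterization, matching the Ga(A, B) hyperprior)."""
    if rng is None:
        rng = np.random.default_rng()
    rank = np.linalg.matrix_rank(D)
    shape = A + rank / 2
    rate = B + 0.5 * beta @ D @ beta
    return rng.gamma(shape, scale=1.0 / rate)   # numpy uses shape/scale

# Toy demo with a small second-difference penalty matrix
K = 6
D2 = np.diff(np.eye(K), n=2, axis=0)
D = D2.T @ D2
rng = np.random.default_rng(2)
draws = np.array([gibbs_lambda(np.ones(K), D, rng=rng) for _ in range(2000)])
```

Because the step is an exact conditional draw rather than a Metropolis proposal, $\lambda$ mixes essentially for free within the wider MCMC.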
4.3 Creating the Predominantly Dendrodated Part of the Curve Back to Approximately
14 cal kBP
Back to approximately 14 cal kBP, we have a sufficient density of tree-ring determinations that we can estimate the curve based solely upon these direct atmospheric observations.³ The main challenges in creating this more recent section of the curve are:

- High density and volume of data: there are 10,713 14C measurements in this, predominantly dendrodated, section. Such a large volume of data makes a computationally efficient algorithm essential.
- Consequent demand for a high level of detail in the calibration curve: this, more recent, period of the calibration curve is the most heavily interrogated by archaeologists, who require high precision not just for individual calibration but also for modeling of multiple dates. This increases the user demand for fine detail in the calibration curve, e.g. incorporating solar cycles.
- Blocking within the data: many of the 14C determinations do not relate to the measurement of the atmospheric 14C in a single year but rather are averages of multiple tree rings and so represent multiple years.
- Floating tree-ring sequences around 12.7 cal kBP: several late glacial trees have their chronologies estimated by wiggle matching and so, while their internal chronology is known absolutely, their absolute ages are not. Otherwise, the true calendar ages of all determinations are known absolutely for this section of curve.
4.3.1 Modifications to Basic Model
Blocking and Additive Errors

Our model for the tree-ring determinations, taking account of blocking, considers the observed $F_i$ as

\[ F_i = \frac{1}{m_i} \left\{ \sum_{j=0}^{m_i - 1} f(\theta_i + j) \right\} + \varepsilon_i + \eta_i \quad \text{for } i = 1, \ldots, n, \]

where $m_i$ is the number of annual rings in the block for that determination; $\theta_i$ is the most recent year (start ring) of the block; and $\varepsilon_i \sim N(0, \sigma_i^2)$ where $\sigma_i$ is the uncertainty reported by the laboratory.
This implicitly assumes that the tree rings over the section of mirings are approximately equal in
width, so that approximately equal annual amounts of wood were deposited and have ended up in
³ The IntCal database contains tree-ring determinations extending back to 14,189 cal BP, all of which are used to create a predominantly dendrodated tree-ring-only curve from 0–14,189 cal BP. However, beyond 13,913 cal BP the density of this data was not sufficient to estimate the curve precisely. Hence, we merge the predominantly dendrodated curve section with the older section at 13,913 cal BP.
the final measured sample. The $\eta_i \sim N(0, \tau_i^2)$ are independent random effects, with $\tau_i$ unknown, that represent potential over-dispersion in the 14C determinations.
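As an illustrative sketch of this blocked observation model (function names and the toy curve are our own, not IntCal code), each determination averages the curve over its block of rings and then adds the laboratory error and the over-dispersion random effect:

```python
import math
import random

def blocked_f14c_mean(f, theta_start, m):
    """Average the F14C curve f over the m annual rings in a block,
    starting from the most recent ring theta_start (cal BP)."""
    return sum(f(theta_start + j) for j in range(m)) / m

def simulate_determination(f, theta_start, m, sigma_lab, tau_i, rng=random):
    """F_i = blocked mean + lab error eps_i + over-dispersion effect eta_i."""
    return (blocked_f14c_mean(f, theta_start, m)
            + rng.gauss(0.0, sigma_lab) + rng.gauss(0.0, tau_i))

# A toy F14C "curve" for illustration only (pure radioactive decay):
toy_curve = lambda theta: math.exp(-theta / 8267.0)
```

A determination with a 10-ring block starting at 100 cal BP would then be simulated as `simulate_determination(toy_curve, 100, 10, 0.003, 0.001)`.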
Additive Errors Model
As described in Section 5.1 we model $\tau_i$, the standard deviation of the additive over-dispersion on the $i$th determination, as proportional to the square-root of the underlying F14C, i.e.
$$\tau_i^2 = \tau^2 \cdot \frac{1}{m_i}\sum_{j=0}^{m_i-1} f(\theta_i + j).$$
The individual values of $\tau_i$ are not of interest and so we integrate them out, updating only the overall constant of proportionality $\tau$. Formally, a structure where $\tau_i$ depends upon the underlying curve means this term should be incorporated in the update to β within our sampler: a change to the curve (via its spline coefficients β) would mean a change to the additive uncertainty $\tau_i$ on all observations. However, this would make the β update no longer a Gibbs step. Since, given the overall multiplier $\tau$, the change in $\tau_i$ between two potential curves is minimal, when updating β we consider $\tau_i$ as fixed; see Section 4.3.3 for more details.
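The over-dispersion scaling above can be sketched directly (a minimal illustration with a hypothetical helper name, not IntCal code):

```python
import math

def overdispersion_sd(f, theta_start, m, tau):
    """tau_i = tau * sqrt(blocked mean of f): over-dispersion standard
    deviation proportional to the square-root of the underlying F14C."""
    mean_f14c = sum(f(theta_start + j) for j in range(m)) / m
    return tau * math.sqrt(mean_f14c)
```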
Floating Tree-Ring Sequences
Several late-glacial tree sequences measured by Aix, ETH and Heidelberg (Reinig et al. 2018;
Capano et al. 2020 in this issue) were dated by wiggle-matching so, while their internal relative
chronologies were known precisely, their absolute calendar ages were somewhat uncertain. For
such a floating tree-ring sequence, the true calendar ages are:
$$\theta_j^{\mathrm{float}} = \xi + \delta_j \quad \text{for } j \text{ in the floating sequence},$$
where $\xi$ is the unknown start date (i.e. most recent year) of the floating sequence and $\delta_j$ is the precisely-known internal ring-count chronology. We place a discretized (integer) normal prior on $\xi \sim N(M, N^2)$. Here, the prior mean $M$ and variance $N^2$ for the start date of each such tree-ring sequence were provided. All of the other dendrodated determinations were assumed to have their calendar age $\theta_i$ known absolutely.
4.3.2 Efficient Incorporation of Blocking
To incorporate blocking exactly, at each iteration of our MCMC we need to evaluate the level of
F14C at every year represented within a block, and then average over them appropriately for each
of our n determinations. One might think this additional calculation will make the method much slower as it requires the evaluation of the curve at many more than n calendar years. However, in
the predominantly dendrodated part of the curve, it is possible to exactly incorporate blocking
without any compromise on the speed of the estimation procedure. Since the vast majority of
the data in this period have exactly known calendar ages (only the floating tree-ring sequences
have uncertain ages) we can calculate an initial matrix that relates, for each blocked
determination, the necessary averaged F14C to the underlying spline coefficients. This matrix
then remains fixed throughout estimation. Further it has the same dimensions as if blocking
had been ignored and hence does not significantly alter the speed of the MCMC updates.
Finding Values of F14C at Each Year Represented in a Block
We begin by describing how we can relate each blocked determination to the spline coefficients β. Let
$$\theta^A = (\theta_1, \theta_1 + 1, \ldots, \theta_1 + m_1 - 1, \theta_2, \ldots, \theta_2 + m_2 - 1, \ldots, \theta_n + m_n - 1)^T$$
be a vector concatenating every calendar year, including any potential blocking, represented in all $n$ determinations. Also, define vectors $c^A_\theta$ and $e^A_\theta$ so that
$$c^A_{\theta,l} = \frac{1}{1000} e^{-\theta^A_l / 8267} \quad \text{and} \quad e^A_{\theta,l} = e^{-\theta^A_l / 8267}.$$
Then, create the matrix $B^A_\theta$ containing the spline bases evaluated at each value in $\theta^A$:
$$B^A_\theta = \begin{pmatrix} B_1(\theta_1) & \ldots & B_K(\theta_1) \\ B_1(\theta_1 + 1) & \ldots & B_K(\theta_1 + 1) \\ \vdots & & \vdots \\ B_1(\theta_n + m_n - 1) & \ldots & B_K(\theta_n + m_n - 1) \end{pmatrix},$$
so that, at the calendar years $\theta^A$, the modeled value of $\Delta^{14}$C is $g^A = B^A_\theta \beta$, where β represents the spline coefficients. The modeled value of F14C at an individual calendar age $\theta^A_l$ then becomes $f^A_l = c^A_{\theta,l} g^A_l + e^A_{\theta,l}$ and, at the full $\theta^A$, $f^A = B^\dagger_\theta \beta + e^A_\theta$, where $B^\dagger_\theta$ is formed by multiplying each row in $B^A_\theta$ by the corresponding element in $c^A_\theta$.
A Blocking Matrix
For each blocked determination, we now wish to average the values of F14C according to the multiple years represented in that block. Consider an averaging matrix $M$ corresponding to averaging the individual values in $f^A$:
$$M = \begin{pmatrix} 1/m_1 & \ldots & 1/m_1 & 0 & \ldots & 0 & \ldots & \ldots \\ 0 & \ldots & 0 & 1/m_2 & \ldots & 1/m_2 & 0 & \ldots \\ & & & \vdots & & & & \end{pmatrix}.$$
Our observed vector of F14C measurements $F$ thus becomes
$$F = M(B^\dagger_\theta \beta + e^A_\theta) + \varepsilon + \eta = B^\star_\theta \beta + e^\star_\theta + \varepsilon + \eta.$$
Note that both the matrix $B^\star_\theta$ and $e^\star_\theta$ depend upon the true calendar ages. However, since for the vast majority of the tree rings the calendar ages are known exactly, one only has to calculate $B^\star_\theta$ and $e^\star_\theta$ once, as they are then fixed. The final $B^\star_\theta$ has dimension $n \times K$ and hence the Gibbs updating of β is the same speed as if one had no blocking. To incorporate the floating tree-ring sequences, for any potential sequence start date $\xi$ we only need to update a small submatrix of $B^\star_\theta$ and $e^\star_\theta$ corresponding to these observations. The required updates for all sequence start dates covering the prior range can be calculated and stored before commencing curve estimation.
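The averaging matrix $M$ can be sketched as follows (a minimal numpy illustration; the function name is our own):

```python
import numpy as np

def averaging_matrix(block_sizes):
    """n x sum(m_i) matrix M: row i holds 1/m_i over the columns of
    block i and zeros elsewhere, so M @ f_A gives the blocked means."""
    n, total = len(block_sizes), sum(block_sizes)
    M = np.zeros((n, total))
    col = 0
    for i, m in enumerate(block_sizes):
        M[i, col:col + m] = 1.0 / m
        col += m
    return M
```

Precomputing the product of $M$ with the per-year basis matrix once, giving an n x K matrix, is what keeps the later Gibbs update for β at the unblocked speed.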
4.3.3 Details of MCMC Algorithm
Posterior
Using the above hierarchical structure, the joint posterior is proportional to
$$\pi(\lambda, \beta, \xi, \tau \mid F, \theta) \propto \pi(F \mid \beta, \theta, \xi, \tau)\, \pi(\xi)\, \pi(\beta \mid \lambda)\, \pi(\lambda)$$
$$\propto \det(W)^{\frac{1}{2}} \exp\left\{-\frac{1}{2}\left(F - e^\star_\theta - B^\star_\theta \beta\right)^T W \left(F - e^\star_\theta - B^\star_\theta \beta\right) - \frac{(\xi - M)^2}{2N^2} - \frac{\lambda}{2} \beta^T D \beta\right\} \times \lambda^{A + \mathrm{rank}(D)/2 - 1} \exp\left\{-\frac{\lambda}{B}\right\},$$
where $W = \mathrm{diag}(\sigma_1^2 + \tau_1^2, \ldots, \sigma_n^2 + \tau_n^2)^{-1}$ includes the additive error representing the over-dispersion. We can sample from this using Metropolis-within-Gibbs, but where most of the updates can be performed directly via Gibbs and only a few require Metropolis-Hastings (MH).
Gibbs Updating β | F, λ, ξ, τ
$$\pi(\beta \mid F, \lambda, \xi, \tau) \propto \exp\left\{-\frac{1}{2}\left(F - e^\star_\theta - B^\star_\theta \beta\right)^T W \left(F - e^\star_\theta - B^\star_\theta \beta\right) - \frac{\lambda}{2} \beta^T D \beta\right\}.$$
For the purposes of maintaining a Gibbs update here, we treat $W = \mathrm{diag}(\sigma_1^2 + \tau_1^2, \ldots, \sigma_n^2 + \tau_n^2)^{-1}$ as fixed and independent of β, i.e. we ignore the dependence of the individual over-dispersion $\tau_i$ on the curve. With this minor approximation, algebra gives a direct complete conditional
$$\beta \mid F, \lambda, \xi, \tau \sim \mathrm{MVN}\left(Q^{-1} B^{\star T}_\theta W (F - e^\star_\theta),\; Q^{-1}\right),$$
where
$$Q = B^{\star T}_\theta W B^\star_\theta + \lambda D.$$
Here large computational savings can be made if we calculate $Q^{-1}\left(B^{\star T}_\theta W (F - e^\star_\theta)\right)$ rather than $\left(Q^{-1} B^{\star T}_\theta W\right)(F - e^\star_\theta)$, since $Q$ is $K \times K$ while $B^{\star T}_\theta$ is $K \times n$ (and $n \gg K$).
Gibbs Updating λ | β, A, B
Due to conjugacy of the prior:
$$\pi(\lambda \mid \beta) \propto \lambda^{A + \mathrm{rank}(D)/2 - 1} \exp\left\{-\lambda \left(\frac{\beta^T D \beta}{2} + \frac{1}{B}\right)\right\},$$
i.e. $\lambda \mid \beta \sim \mathrm{Ga}(A', B')$, where $A' = A + \mathrm{rank}(D)/2$ and $B' = \left(\frac{\beta^T D \beta}{2} + \frac{1}{B}\right)^{-1}$.
MH Updating ξ | F, λ, β, τ
To update the start of our floating tree-ring sequences, we use an MH step:
Propose $\xi' \sim N(\xi, \sigma^2_{\mathrm{prop}})$;
Calculate $B^\star_{\theta}$ and $B^\star_{\theta'}$, where $\theta'$ are the calendar ages of the floating sequence with the proposed new start date $\xi'$. Similarly calculate $e^\star_{\theta}$ and $e^\star_{\theta'}$. Accept $\xi'$ according to the Hastings Ratio:
$$\mathrm{HR} = \min\left(1, \frac{\pi(\xi' \mid F, \beta, \tau)}{\pi(\xi \mid F, \beta, \tau)}\right),$$
where
$$\pi(\xi \mid F, \beta, \tau) \propto \exp\left\{-\frac{1}{2}\left(F - e^\star_\theta - B^\star_\theta \beta\right)^T W \left(F - e^\star_\theta - B^\star_\theta \beta\right) - \frac{(\xi - M)^2}{2N^2}\right\}.$$
MH Updating τ | F, λ, β
This also requires an MH update step. Let the residuals between the observed $F$ and fitted values $\hat{F} = B^\star_\theta \beta + e^\star_\theta$ be
$$R = F - B^\star_\theta \beta - e^\star_\theta.$$
Then, we have independently $R_i \sim N(0, \sigma_i^2 + \tau_i^2)$, where $\tau_i = \tau \sqrt{\hat{F}_i}$. With our prior $\pi(\tau)$ we update:
Sample $\tau' \sim N(\tau, \sigma_\tau^2)$ truncated to be positive;
Accept $\tau'$ according to the Hastings Ratio:
$$\mathrm{HR} = \min\left(1, \frac{f(R; \tau')\, \pi(\tau')\, \Phi(\tau / \sigma_\tau)}{f(R; \tau)\, \pi(\tau)\, \Phi(\tau' / \sigma_\tau)}\right),$$
where $f(R; \tau) = \prod_{i=1}^{n} \phi\left(R_i; 0, \sigma_i^2 + \tau_i^2\right)$ and $\tau_i = \tau \sqrt{\hat{F}_i}$. Note the asymmetric proposal adjustment.
4.3.4 Additional Considerations
Choice of Splines
For the portion of the curve from 0 cal kBP back to approximately 14 cal kBP, we placed 2000 knots at the unique calendar age quantiles of the constituent determinations (each blocked determination was represented by its midpoint) to permit variable smoothing dependent upon data density. In regions where there were more data this allowed us to pick out finer scale variation. Such a selection enabled close to annual resolution in the detail of the final curve while still maintaining computational feasibility in curve construction. We also placed additional knots in the vicinity of the known Miyake-type events at 774–775 AD, 993–994 AD and 660 BC (Miyake et al. 2012, 2013; O'Hare et al. 2019) to enable us to capture more rapid variation at these times. Around the calendar age of each such event, we added an extra 4 jittered knots, i.e. after addition of small amounts of random noise; for the 660 BC event we used a slightly larger jitter due to its slightly less certain timing. If there is no event at these times then the curve will not introduce one but, if there is, then these additional knots will allow it to be picked up more clearly. All knot locations then remained fixed throughout the sampler.
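The knot-placement scheme above can be sketched as follows (a simplified illustration with hypothetical names; the true selection uses the unique quantiles of the full dataset):

```python
import random

def quantile_knots(cal_ages, n_knots):
    """Knots at the unique calendar-age quantiles of the data, so that
    knot density (and hence curve flexibility) follows data density."""
    ages = sorted(set(cal_ages))
    n = len(ages)
    idx = sorted({round(k * (n - 1) / (n_knots - 1)) for k in range(n_knots)})
    return [ages[i] for i in idx]

def jittered_event_knots(event_age, n_extra=4, jitter=2.0, rng=random):
    """Extra knots near a Miyake-type event, after adding small random noise."""
    return [event_age + rng.uniform(-jitter, jitter) for _ in range(n_extra)]
```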
Outlier Screening
In order to identify potential outliers, after fitting a preliminary curve, scaled residuals (incorporating potential blocking) were calculated for each datum, i.e. for an annual measurement
$$r_i = \frac{F_i - \hat{f}(\theta_i)}{\sqrt{\sigma_i^2 + \rho_{\theta_i}^2}},$$
where $\rho_{\theta_i}$ is the posterior standard deviation on the calibration curve at $\theta_i$ cal BP, the calendar age of that datum. These residuals were then combined for each dataset $K$ into a scaled deviance, $Z_K = \sum_{i \in K} r_i^2$, and compared with a $\chi^2_{n_K}$, where $n_K$ is the number of determinations in that set. The mean offset of each dataset from the preliminary curve was also calculated,
$$\delta_K = \frac{1}{n_K}\sum_{i \in K}\left(F_i - \hat{f}(\theta_i)\right).$$
Sets which had low p-values for their deviance and high mean offsets were then discussed with the dataset providers as to whether they should be included in the final IntCal or not (see Bayliss et al. 2020 in this issue, for more details).
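These screening statistics can be sketched as (illustrative helper names; the deviance is compared against a chi-squared on n_K degrees of freedom, whose mean is n_K):

```python
import math

def scaled_residuals(F, f_hat, sigma, rho):
    """r_i = (F_i - fhat_i) / sqrt(sigma_i^2 + rho_i^2), where rho_i is
    the posterior sd of the curve at the datum's calendar age."""
    return [(Fi - fi) / math.sqrt(s * s + r * r)
            for Fi, fi, s, r in zip(F, f_hat, sigma, rho)]

def dataset_screening(F, f_hat, sigma, rho):
    """Scaled deviance Z_K = sum r_i^2 and the mean offset of the
    dataset from the preliminary curve."""
    r = scaled_residuals(F, f_hat, sigma, rho)
    Z = sum(ri * ri for ri in r)
    offset = sum(Fi - fi for Fi, fi in zip(F, f_hat)) / len(F)
    return Z, offset
```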
Run Length
The MCMC was run for 50,000 iterations. The first 25,000 of these iterations were discarded as
burn-in. The remaining 25,000 were used to create the final curve (thinned to every 10th) and
passed to the older part of the curve to create a seamless merging between the curve sections as
discussed in Section 4.4.1. Since the main update steps (i.e. β and λ) are Gibbs, this was felt to be sufficient, although convergence was further assessed by initializing the sampler at different values and comparing the resultant curve estimates visually.
4.4 Creating the Older Part of the Curve
Beyond approximately 14 cal kBP, the number of tree-ring determinations decreases and they
are not sufficient to create a precise calibration curve. We therefore merge the predominantly
dendrodated curve with the older section of curve, which is estimated from a wider range of 14C
material, at 13,913 cal BP. In this older period, the underlying data used to construct the curve
incorporate corals, foraminifera, macrofossils and speleothems, in addition to a small number
of floating tree-ring sequences. These alternative sources of data are typically not direct
measurements of the atmosphere but instead are offset due to marine reservoir effects and
dead carbon fractions. Further, their true calendar ages are quite uncertain and can only be
estimated. They therefore present several further challenges for construction of an
atmospheric curve.
4.4.1 Modifications to Basic Model
Uncertain Calendar Ages
All our data in the older part of the curve have estimates (either in the form of noisy
observations or priors) for their calendar ages rather than absolutely known values. For the
corals and speleothems these estimates are obtained via U-Th dating and are considered
independent; for the Cariaco Basin they are either provided by varve counting or elastic
palaeoclimate tie-pointing (Heaton et al. 2013); similar elastic palaeoclimate tie-pointing
provides the ages for the Pakistan and Iberian margins; for Lake Suigetsu they are
provided by wiggle matching; while the floating tree-ring sequences have internally known
chronologies but no absolute age estimates. We assume all our calendar age estimates are
(potentially multivariate) normal. For a dataset Kon which we have noisy observations TK
of its calendar ages e.g. U-Th dated corals and the varve counted section of Cariaco, we
model:
model:
$$T_K \sim \mathrm{MVN}(\theta_K, \Psi_{K,T}), \quad \text{so that} \quad \pi(T_K \mid \theta_K) \propto \exp\left\{-\frac{1}{2}\left(T_K - \theta_K\right)^T \Psi_{K,T}^{-1}\left(T_K - \theta_K\right)\right\}.$$
Here $\Psi_{K,T}$ is the pre-specified and fixed covariance matrix that encodes the dependence within the calendar age observations for that dataset. For such datasets we place an uninformative prior on $\theta_K$. For Lake Suigetsu and the datasets where calendar age estimates are obtained via palaeoclimate tie-pointing, no actual calendar ages were observed but rather we have a prior on their true values:
$$\pi(\theta_K) \propto \exp\left\{-\frac{1}{2}\left(T_K - \theta_K\right)^T \Psi_{K,T}^{-1}\left(T_K - \theta_K\right)\right\},$$
and the $T_K$ and $\Psi_{K,T}$ now represent our prior mean and covariance for the calendar ages. As we
have no datasets on which we have both priors and noisy calendar age observations, these two
types of calendar age estimate become equivalent for the purposes of updating. Details of how
to construct the various covariance matrices for the case of varve counting and wiggle matching
can be found in Niu et al. (2013); for the datasets with priors obtained by tie-pointing in Heaton
et al. (2013); and for Lake Suigetsu in Bronk Ramsey et al. (2020 in this issue).
Offsets: Reservoir Ages and Dead Carbon Fractions
Several of our datasets do not provide direct measurements of the atmosphere but are instead
offset. The speleothem records contain carbon obtained from dripwater which has passed
through old limestone and so is a mixture of atmospheric CO2 and dissolved old (dead)
carbon that has no 14C activity. The offset in radiocarbon age this creates is called a dead
carbon fraction (dcf) and is specific to the speleothem. Similarly the marine records have an
atmospheric offset called a marine reservoir age (MRA) that arises due to both the limited gas
exchange with the atmosphere and ocean circulation drawing up deep old carbon. These
MRAs are location specific and vary over time. For both of these types of data we can
incorporate the offset (either dcf or MRA) in the radiocarbon age domain as:
hoffsethatmos rK;
where hoffsetis the radiocarbon age at time cal BP in the offset environment; hatmos our
atmospheric calibration curve in the radiocarbon age domain; and rKour offset. This alters
our observational model so that, in the F14 C domain, for determinations from dataset Kand
with F14C domain calibration curve f, the offset becomes a multiplier,
FierKi=8033fi"i:
These offsets must be estimated within curve construction to adaptively synchronize the
different records. For IntCal20, speleothem dcfs were considered to be approximately
constant over time but with an unknown level. MRAs were modeled as time-varying,
providing a step forward from IntCal09 and IntCal13. Initial MRA estimates for each of
our locations were obtained by creating a preliminary atmospheric calibration curve using
the same Bayesian spline methodology but based only on the Hulu Cave record (Southon
et al. 2012; Cheng et al. 2018). This Hulu-based curve was then used as a forcing for an
enhanced Hamburg Large Scale Geostrophic (LSG) ocean general circulation model to
provide estimates $\varphi_K(\theta)$ of MRA at each given location (see Butzin et al. 2020 in this issue,
for details). Due to coastal effects, these LSG estimates were considered to define the shape
of the MRA, i.e. relative changes over time, but subject to a constant potential coastal
shift. Ideally we might cycle the process of building the curve and re-estimating MRAs
several times but the LSG model had a run time of several weeks and so this was not
feasible. Notwithstanding the above attempts to accurately model dcf and MRA, we recognized that there was further variation in the offsets over time that we were not able to fully capture. This was incorporated through the addition of further independent variability in the offsets from one calendar age to the next. Consequently our model for the offsets is:
$$\text{Speleothems (dcfs):} \quad r_K(\theta_i) \sim N\left(\nu_K, \zeta_K^2\right),$$
$$\text{Marine records (MRAs):} \quad r_K(\theta_i) \sim N\left(\nu_K + \varphi_K(\theta_i), \zeta_K^2\right),$$
where $\zeta_K$ is the further independent variability in offsets from year-to-year around our LSG/constant dcf model and added to data before curve construction. We place priors on the constant dcf means/coastal shifts $\nu_K$:
$$\nu_K \sim N\left(\mu_K, \omega_K^2\right).$$
We do not aim to estimate the precise values of $r_K(\theta_i)$ during curve construction. However, the parameter ν, containing the values of $\nu_K$ which determine the mean dcf/MRA offset for each dataset, is updated within our MCMC sampler.
To fully specify our offset model, values of the prior mean dcf/coastal shift, $\mu_K$, for each dataset, and $\omega_K$, our uncertainty on the level of this shift, were estimated from the overlap between additional data available for each speleothem/marine location and the dendrodated curve from 0–14 cal kBP. Similarly, these overlaps provided an estimate for each $\zeta_K$. See
Section 5.2 for more details. Finally, for the two floating Southern Hemisphere kauri trees
we assumed an offset of 43 ± 23 14C yrs (1σ) based upon the North-South hemispheric
offset estimated in SHCal13 (Hogg et al. 2013). Being direct measurements of the NH atmosphere, both Suigetsu and the three Bølling-Allerød floating tree-ring sequences are not offset, i.e. $r_K(\theta) = 0$ for all $\theta$.
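The offset multiplier in the F14C domain can be sketched directly (illustrative helper names):

```python
import math

def offset_multiplier(r_14c_yrs):
    """F14C-domain multiplier exp(-r/8033) corresponding to an offset
    r (dcf or MRA) expressed in 14C yrs."""
    return math.exp(-r_14c_yrs / 8033.0)

def offset_f14c(f_atmos, r_14c_yrs):
    """Expected F14C in the offset environment given atmospheric F14C."""
    return offset_multiplier(r_14c_yrs) * f_atmos
```

A positive reservoir age or dcf thus always depresses the expected F14C relative to the atmosphere, e.g. a 500 14C-yr MRA gives a multiplier just below 1.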
Offsets: Cariaco Basin
The LSG model was not able to adequately resolve the MRA within the geographically unique
Cariaco Basin which has a shallow sill that potentially limits exchange with the wider ocean. A
further model for the MRAs in this location was therefore needed. As it covered a short time
period, the MRA for the Cariaco varved record (Hughen et al. 2004) was modeled as for the
speleothems, i.e. independently varying around a constant level. For the Cariaco unvarved
record (Hughen and Heaton 2020 in this issue), we modeled the F14C domain multiplier corresponding to any offset as
$$e^{-r_K(\theta)/8033} = \sum_{k=1}^{30} \beta_{C,k} B_{C,k}(\theta),$$
a further Bayesian spline with 30 knots placed at jittered quantiles of the Cariaco prior calendar ages. The value of $\beta_C = (\beta_{C,1}, \ldots, \beta_{C,30})^T$ determining this spline was also estimated during curve construction but with a fixed and large smoothing parameter $\lambda_C$. This approach meant that rapid changes in the 14C determinations within Cariaco were treated as atmospheric signal, while smoother, longer-term drifts away from the other data, which might otherwise introduce spurious features into the curve, were resolved as time-varying reservoir effects.
Heavier Tailed Errors
Due to the diversity of the datasets in this older time period, as well as the difficulty in
incorporating all their potential complexities, we wished to create a curve that was not
overly influenced by single determinations. We therefore permit the possibility of heavier
tailed errors in the F14C determinations for the data in this older section of curve extending
beyond 13,913 cal BP. This is incorporated by introducing an error multiplier where, for each observation, we maintain
$$F_i = e^{-r_K(\theta_i)/8033} f(\theta_i) + \varepsilon_i,$$
but with each determination's reported uncertainty $\sigma_i$ scaled according to a further parameter $\kappa_i$:
$$\varepsilon_i \mid \kappa_i \sim N\left(0, \frac{\sigma_i^2}{\kappa_i}\right), \quad \text{where} \quad \kappa_i \sim \mathrm{Ga}\left(\mathrm{shape} = \frac{\varrho_{K(i)}}{2},\; \mathrm{rate} = \frac{\varrho_{K(i)}}{2}\right),$$
and $K(i)$ is the dataset of observation $i$. All observations $i$ belonging to the same dataset $K$ therefore have the same shape and rate in the Gamma prior for their individual $\kappa_i$. This model, integrating out $\kappa_i$, equates to a Student's $t$-distribution for an individual observation $i$,
$$\varepsilon_i \sim t\left(\mathrm{location} = 0,\; \mathrm{scale} = \sigma_i,\; \mathrm{df} = \varrho_{K(i)}\right).$$
Further, for each dataset $K$, we place an independent hyperprior on the value of its particular $\varrho_K$ parameter so that the observations belonging to that dataset are grouped,
$$\varrho_K \sim \mathrm{Ga}\left(\mathrm{shape} = A_\varrho,\; \mathrm{rate} = B_\varrho\right),$$
to capture potential differences between the various records. Consequently, if a dataset is seen to contain several outliers, the other determinations in that same set will also be treated with more caution. We choose a subjective hyperprior encapsulating an expectation of a low level of heavy-tailed behavior by selecting, for each dataset, $A_\varrho = 100$ and $B_\varrho = \frac{1}{2}$. We update both $\kappa = (\kappa_1, \ldots, \kappa_n)^T$ and ϱ, the values of the various $\varrho_K$, within our sampler.
Parallel Tempering
A significant concern with the previous random walk approach was how well the MCMC
sampler mixed. With this random walk approach, we only updated the calibration curve
one calendar year at a time, conditional on the value at all other times, which restricted the
ability of the curve estimate to move significantly. Furthermore, this update was performed
by MH meaning we would frequently reject proposed updates. By switching to Bayesian
splines, we are instead able to update, conditional on the current calendar ages, the entire
curve through its spline coefficients simultaneously via Gibbs. These dual changes, to
update the entire curve at once and to do so via Gibbs sampling as opposed to MH,
hopefully begin to address mixing concerns. However, due to the uncertain calendar ages it
is likely the posterior for the curve (and the true calendar ages) will remain multi-modal. In
an attempt to overcome any remaining concerns over mixing, we also incorporate parallel
tempering. In tempering, we run multiple chains concurrently. One of these MCMC chains
has as its target our posterior of interest while the others have modified, higher
temperature, targets. These higher temperature targets are typically flattened versions of the
posterior of interest designed so that, as the temperature increases, the corresponding
MCMC chain mixes more easily. We then propose swaps between the states of the multiple
chains so that the chains that mix well (i.e. those that run at higher temperatures) improve
the mixing of the chains which do not (i.e. our posterior of interest). We are free to choose
the elements of the likelihood we wish to temper and by how much. We only temper the
likelihood of the observed data, i.e. $\pi(F \mid \theta, \beta, \nu, \beta_C, \kappa)$ and $\pi(T \mid \theta)$ (or the equivalent prior $\pi(\theta)$), since
these were thought to be the main elements restricting mixing and doing so keeps the
MCMC updates straightforward. Our coupled chains run with a pair of temperatures $\gamma_1, \gamma_2 \geq 1$, with each chain sampling from a modified posterior,
$$\pi_{\gamma_1, \gamma_2}(\theta, \beta, \lambda, \nu, \beta_C, \kappa, \varrho \mid F, T) \propto \pi(T \mid \theta)^{1/\gamma_1}\, \pi(F \mid \theta, \beta, \nu, \beta_C, \kappa)^{1/\gamma_2}\, \pi(\beta \mid \lambda)\, \pi(\lambda)\, \pi(\nu)\, \pi(\beta_C)\, \pi(\kappa \mid \varrho)\, \pi(\varrho)\, \pi(\theta)^{1/\gamma_1},$$
so higher temperatures give flatter posteriors. We show in Section 4.4.3 how these
modifications are simply equivalent to increasing our observational noise and so we can
maintain straightforward and fast updates for each chain. We run chains at four temperatures, including the unmodified posterior of actual interest where $\gamma_1 = \gamma_2 = 1$, in parallel. We discuss the effect of tempering, and how it aids mixing, in more detail within
the Supplementary Information.
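The swap move between a pair of tempered chains can be sketched as follows (a generic parallel-tempering illustration with hypothetical names, raising the tempered likelihood to an inverse temperature 1/γ, not the IntCal implementation):

```python
import math
import random

def swap_accept_logprob(loglik_i, loglik_j, inv_temp_i, inv_temp_j):
    """Log acceptance probability for exchanging the states of two
    tempered chains whose targets raise the likelihood to inv_temp."""
    return min(0.0, (inv_temp_i - inv_temp_j) * (loglik_j - loglik_i))

def maybe_swap(states, logliks, inv_temps, i, j, rng=random):
    """Propose swapping chains i and j; hot chains (small inv_temp)
    explore widely and pass good states down to the cold chain."""
    if rng.random() < math.exp(swap_accept_logprob(
            logliks[i], logliks[j], inv_temps[i], inv_temps[j])):
        states[i], states[j] = states[j], states[i]
        logliks[i], logliks[j] = logliks[j], logliks[i]
    return states, logliks
```

Note that a swap is always accepted when the hotter chain currently holds a state with higher likelihood, which is exactly how tempering helps the cold chain escape local modes.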
Merging with Already Created Tree-Ring-Only Section of Curve
We need to ensure we smoothly merge our older section of calibration curve with the
independently created tree-ring-only curve described in Section 4.3. This is achieved by
selection of an appropriate overlapping set of knots for the two sections. We first identify,
in the tree-ring-only curve, the calendar age at which to create the join. This is determined
to be the calendar age of the knot at which, due to the ending of the tree-ring
determinations, the curve uncertainty on the tree-ring-only curve begins to increase
significantly. For IntCal20, this was selected to be at 13,913 cal BP. This knot and the two
knots either side that were used to create the tree-ring-only curve (five tree-ring-only knots
in total) are then copied into the knot basis for the older portion of curve. On each update