Predicting historic forest composition using species lists in presettlement land survey records, western New York

Applied Vegetation Science (2015)
Chris P. S. Larsen, Stephen J. Tulowiecki, Yi-Chen Wang & Andrew B. Trgovac
Keywords
Bearing trees; Historical ecology;
Presettlement land survey records;
Rank abundance distributions; Species lists
Abbreviations
BTs = bearing trees; EW = equal weight;
PLSRs = presettlement land survey records;
RW = rank weight; SL = species list.
Nomenclature
Little (1979)
Received 24 July 2014
Accepted 8 January 2015
Co-ordinating Editor: Martin Hermy
Larsen, C.P.S. (corresponding author, larsen@buffalo.edu)1, Tulowiecki, S.J. (sjt7@buffalo.edu)1, Wang, Y.-C. (yi-chen.wang@nus.edu.sg)2 & Trgovac, A.B. (atrgovac@buffalo.edu)1
1 Department of Geography, University at Buffalo, The State University of New York, Buffalo, NY 14261, USA;
2 Department of Geography, National University of Singapore, Singapore 117570, Singapore
Abstract
Questions: Do the species lists (SLs) of presettlement land surveys, little used
though available for over two million km in the USA, provide useful informa-
tion on historic forest composition? Do the bearing trees (BTs) and SLs of
pre-settlement land surveys record similar species and relative abundances? Are
relative abundances more similar when species in an SL are given equally
weighted relative abundances or rank-weighted relative abundances?
Location: Temperate broadleaf forest, western New York, US.
Methods: BT and SL data were obtained for the 3 172 953 m of lines surveyed
in 1797–9 by the Holland Land Company. The presence and relative abundance
of 8394 BTs were compared with 11 192 unique taxon mentions within SLs.
Comparisons were made for individual taxa and whole communities, and at the
scales of individual line segments and whole town lines. The influence of equal
weighting and rank weighting of the SLs on the comparisons was assessed.
Results: The SLs on a line segment typically recorded more taxa than did the
BTs. The species present in BTs and SLs on individual line segments were indi-
cated by Cohen’s Kappa to have ‘substantial agreement.’ Correlations between
BTs and SLs of all taxa on the same town line increased with the number of BTs
and SLs on the town line, and when the relative abundances of SL taxa were
modelled using rank weighting rather than equal weighting. Correlations
between BTs and SLs for individual taxa were higher for taxa that had more BTs.
Correlations between BTs and SLs were also higher when the relative abun-
dances of SL taxa were modelled using rank weighting vs equal weighting.
Conclusions: SLs capture greater detail of presettlement forest composition
than do BTs. The relative abundance of a taxon mentioned in a presettlement
land survey SL can be meaningfully predicted using its rank in an SL and appli-
cation of a rank weighting model.
Introduction
Presettlement land survey records (PLSRs) provide insights
into forest attributes prior to Euro-American settlement,
such as species composition, disturbance frequency and
forest density (Whitney 1996; Wang 2005). PLSRs come
from metes and bounds surveys, later private surveys, and
the later still Public Land Surveys of the General Land
Office (Wang 2005). The species composition information
in PLSRs has been used for many purposes, including the
study of human impacts upon forest composition (Whit-
ney 1996; Hanberry et al. 2012; Larsen et al. 2012) and to
guide conservation and restoration (Ravenscroft et al.
2010). PLSRs record tree species information as bearing
(witness) trees (BTs) and species lists (SLs; Bourdo 1956).
BTs witnessed survey monuments (e.g. posts cut from
nearby trees) distributed at regular intervals (e.g. every
half mile) along survey lines, and at survey corners. BTs
were marked and their species, diameter and distance and
compass angle from the monument were noted. Depend-
ing on the survey, SLs consisted of tree species names
recorded along line segments of a regular length, or along
line segments that corresponded to different vegetation
conditions (Tulowiecki 2014). BTs and SLs are both
Doi: 10.1111/avsc.12165 © 2015 International Association for Vegetation Science
samples of the trees and species in a forest, though sampled
in different manners. Although PLSRs have mostly been
used in the USA, they are also available in Australia (Bick-
ford & Mackey 2004) and Canada (Terrail et al. 2014).
The viability of PLSRs for ecological applications has
long been studied, chiefly by examining survey instruc-
tions and by analyzing data within PLSRs. The collec-
tion of PLSRs was largely guided by the Land
Ordinance Act of 1785 (White 1983), which from its
inception required surveyors to mark BTs (White 1983,
p. 12). It was not until 1815 that instructions required
surveyors to note “the kinds of timber...with which the
land may be covered” (White 1983, p. 249), not until
1831 that they suggested preferentially choosing BTs
with smooth bark and large stems (White 1983, p.
270), and not until 1833 that they required notes on
“each kind of timber in the order in which it is most
prevalent” (White 1983, p. 299). The gathering of SLs
and the preferential choice of BT species did, however,
predate those rules: SLs are present in PLSRs from Vermont
1783–7 (Siccama 1971), and bias was detected in
BT selection from 18th century PLSRs. For example,
Kronenfeld & Wang (2007) found that BTs in the Hol-
land Land Company survey (also used in this study)
had selection bias, with the greatest underestimate
being Fagus grandifolia by 5.3%, and the greatest over-
estimate being Acer saccharum by 4.8%. Biases in BT
species selection are believed, however, to typically be
too minor to influence ecological analyses (Rhemtulla &
Mladenoff 2010). Biases in SLs have not been explored,
and indeed it was suggested by Batek et al. (1999) that
SLs should contain less bias, because unlike BTs their
choice was not influenced by how well they could be
marked.
The BTs have been employed more often than have SLs
to reconstruct presettlement vegetation. In a meta-analysis
of PLSR studies from eastern USA, C.P.S. Larsen, B.J.
Kronenfeld and Y.-C. Wang (in prep.) found that 145 studies
used BTs, 36 used SLs and 18 used both BTs and SLs.
Species abundances were quantified in all 145 of the BT
studies, but in only 20 of the 36 SL studies. The low usage
of SLs is not due to their use being unknown, because they
were used to quantify species abundance as early as Lutz
(1930). SLs were collected from over two million km
surveyed in the USA by the Public Land Survey System
(White 1983), and from many parts of Canada (e.g. Terrail
et al. 2014). Europe does not have analogous surveys, but
lists of vegetation made by naturalists (e.g. McCracken
1959) may have been unconsciously ranked.
Whereas the relative abundances of presettlement
species have been estimated using simple counts of BTs,
various weighting schemes have also been applied to esti-
mate relative abundances from SLs. An equal weight (EW)
approach, introduced by Lutz (1930), treats all species
mentions as the same, regardless of their ranking in an SL.
If a study area contained 20 SLs that each mentioned five
species, then each species mention would contribute 1%
of relative abundance to the overall measure for that study
area. The resulting measure was called ‘prevalence’ by Ter-
rail et al. (2014). A limitation of the EW approach is that it
does not employ information on the rank position of a spe-
cies, with species named first in a list presumably more
abundant than those named last. A rank position approach
was first applied by Lorimer (1977) by calculating the
number of times a species had a given rank; the resulting
measures for each rank position (first, second, third, etc.)
were called ‘dominance’ by Scull & Richardson (2007).
Two limitations of the rank position approach are that it
does not allow estimates of relative abundances to be made
at the scale of individual line segments, and that it provides
information for each rank position separately, resulting in
multiple rank positions for each species.
A rank weight (RW) approach, developed by Seischab
(1990), estimates the relative abundance of a species on
any individual line segment, using both the rank position
of a species in an SL and the total number of species in that
SL. The RW approach is based on the observation that
communities contain a few common species and many
rare ones, and that their relative abundances exhibit rank
abundance distributions that follow from a variety of
mathematical distributions (Wilson 1991; McGill et al.
2007). Of the 20 studies in the eastern USA that quantified
species abundances using SLs, 13 used an EW approach,
five used a rank position approach and two (Seischab
1990, 1992) used the RW approach (Larsen et al. in prep.).
Since Seischab’s (1990) introduction of the RW approach,
there have been over 190 publications that used PLSRs
from the eastern USA, but none employed RWs, although
two California studies did (e.g. Fritschle 2012).
That the RW approach to predicting species abundances
from SLs has not been more widely employed is surprising,
because SLs often contain more species mentions than do
BTs (e.g. Lutz 1930; Scull & Richardson 2007). The general
reluctance to use SLs, and to employ RWs to predict species
abundances from SLs in particular, likely stems from three
factors. First, many SLs were created during surveys con-
ducted prior to the 1833 instruction to surveyors to name
species in order of their prevalence. Second, although the
relative abundances of species in a community typically
follow rank abundance distributions (McGill et al. 2007),
it is unclear if the qualitative manner in which surveyors
created the lists would be sensitive enough to accurately
reflect those rankings (Finley 1951). Third, although there
have been a number of qualitative comparisons of SLs and
BTs (e.g. Lorimer 1977; Batek et al. 1999; Brister et al.
2011), no research has conducted a statistical comparison
of SLs and BTs, and thus the use of SLs does not have
quantitative support. A recent effort by Terrail et al.
(2014) did statistically relate first ranked species in SLs
with relative abundance in an adjacent, near-contempora-
neous forest survey, but did not consider species with
lower rank positions.
The goal of this paper is to explore quantitatively the
taxonomic information in SLs. Three specific questions are
addressed. First, how unique are the species present in the
BTs and SLs on the same line segments? Second, do EW or
RW approaches to SL data result in stronger community-
scale relations between BTs and SLs at the scale of individ-
ual town lines? Community-scale relations are considered
here as the relations between the relative abundances of
all taxa in the BTs and SLs in individual town lines. Third,
do EW or RW approaches to SL data result in stronger
taxon-scale relations between BTs and SLs at the scale of
individual town lines? Taxon-scale relations are consid-
ered here as the relations between the relative abundance
of an individual taxon in the BTs and the SLs across multi-
ple town lines.
Methods
Study area, taxonomy and association of BTs to line
segments
The study area is 14 400 km² in western New York
(Fig. 1). The Holland Land Company surveyed 162 township
perimeters in 1797–9 (Wyckoff 1988). Each township
typically had four edges of 9.6 km in length, referred to
hereafter as town lines. BTs were recorded around survey
posts typically placed every half-mile along the town lines.
SLs were recorded for line segments, of which there were
typically multiple per town line; line segments varied in
length, corresponding to the varying vegetation conditions
that the surveyors encountered. The posts, BTs and SLs
were transcribed from microfilms of the handwritten
‘Range Books’ in the Holland Land Company Archives of
SUNY Fredonia and the New York State Archives in
Albany, New York (Wang 2007). These private surveys
predate the Public Land Surveys begun by the General
Land Office in 1812 (White 1983). Transcription of BTs
and SLs in these survey records takes approximately equal
time.
Fig. 1. The Holland Land Company study area in western New York, USA. Township 8, Range 6 (labelled ‘T8R6’) is shown in Fig. 2.
Interpretation of the taxonomic equivalents for SLs gen-
erally followed Wang (2007), with the following excep-
tions. ‘Oak’ was assigned to Quercus velutina and not Q. alba
because many mentions of ‘oak’ occurred in SLs that also
recorded white oak; chestnut oak and rock oak were both
assigned to Q. montana as Q. prinus is considered synonymous
with Q. montana (Little 1979); and ‘gum’ was reassigned
from Liquidambar styraciflua to Nyssa sylvatica
because the geographic range of L. styraciflua is south of
the study area. The following surveyors’ names were
employed in the SLs, but not in the BTs, and were assigned
the following taxonomic equivalents: butterwood and but-
tonwood (Platanus occidentalis), dogwood (Cornus florida),
hardbeam (Carpinus caroliniana), hazelnut (Corylus cornuta),
pepperage (Nyssa sylvatica) and shinwood (Taxus canaden-
sis). Assignment of taxonomic equivalents caused some BT
and SL taxa to contain multiple species. We will thus use
the words taxon or taxa for the remainder of the paper,
though we will maintain the acronym SL.
Predicting relative abundances using BTs
Each town line created in the Holland Land Company sur-
vey typically contains multiple line segments of various
lengths, most with their own SL (Fig. 2a). BTs were
grouped with line segments by assigning them to the clos-
est line segment (Fig. 2b). Since line segments have irregu-
lar lengths, while the posts (around which the BTs are
arranged) are regularly distributed, not all line segments
would possess associated BTs. The BTs for a whole town
line were those that were assigned to the multiple line segments
that comprised that town line. The relative abundance
of a taxon in the BTs for a line segment, and for a
town line, was calculated as the number of BTs
of that taxon on that line divided by the total number of
all BTs of all taxa on that line.
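The BT calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `bt_relative_abundance` and the example taxa are hypothetical and do not come from the survey data.

```python
from collections import Counter

def bt_relative_abundance(bt_taxa):
    """Relative abundance of each taxon among the BTs assigned to one line.

    bt_taxa: list with one taxon name per bearing tree (hypothetical input format).
    """
    counts = Counter(bt_taxa)
    total = sum(counts.values())
    if total == 0:
        return {}  # a line segment may have no associated BTs
    return {taxon: n / total for taxon, n in counts.items()}

# Hypothetical bearing trees assigned to one line segment
segment_bts = ["beech", "beech", "sugar maple", "hemlock"]
abund = bt_relative_abundance(segment_bts)
print(abund)
```

The same function applies at the town line scale if the BTs of all line segments on that town line are pooled into one list.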
Predicting relative abundances using SLs
Relative taxon abundances were predicted using SLs at the
scales of line segments and town lines using different EW
and RW models. Three different EW models were
employed: equally-weighted unique taxon, equally-
weighted and distance-weighted unique taxon, and
equally-weighted unique mentions. At the line segment
scale, only the EW unique taxon model could be
employed; it was applied by determining the number of
taxa in an SL, dividing that number into 1, and then using
the quotient as the relative abundance for all listed taxa.
For example, if an SL mentioned five taxa, then they were
all given an EW unique taxon relative abundance of 0.2
(1/5). At the town line scale, the EW unique taxon model
was similarly applied. For example, if a town line con-
tained ten line segments that each had an SL with five
taxa, but the ten SLs together only mentioned eight differ-
ent taxa, then each taxon on that town line would be
given an EW unique taxon relative abundance of 0.125
(1/8).
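As a check on the two worked examples for the EW unique taxon model, the following Python sketch reproduces them. The function name `ew_unique_taxon` and the placeholder taxon labels are assumptions made for illustration only.

```python
def ew_unique_taxon(species_lists):
    """EW unique taxon model: each distinct taxon receives 1 / (number of distinct taxa).

    species_lists: list of SLs, each a list of taxon names (hypothetical input format).
    Pass a single SL for the line segment scale, or all SLs of a town line.
    """
    taxa = set()
    for sl in species_lists:
        taxa.update(sl)
    if not taxa:
        return {}
    return {taxon: 1 / len(taxa) for taxon in taxa}

# Line segment scale: one SL that mentions five taxa
segment = ew_unique_taxon([["A", "B", "C", "D", "E"]])

# Town line scale: ten SLs of five taxa each that together mention eight taxa
pool = ["A", "B", "C", "D", "E", "F", "G", "H"]
town_sls = [[pool[(k + j) % 8] for j in range(5)] for k in range(10)]
town = ew_unique_taxon(town_sls)
print(segment["A"], town["A"])
```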
Fig. 2. (a) 26 line segments (symbolized with two shades of grey) containing SLs, and posts (each symbolized with an ‘x’) with associated BTs, in Township 8, Range 6. As indicated by an arrow, one short line segment contains no BTs. (b) The four BTs at the northwest corner of Township 8, Range 6. To calculate species relative abundances in each line segment using BTs, the BTs were independently assigned to the nearest line segment (as indicated by faint dashed lines).
The EW model that involved distance weighting of
unique taxon was applied at the town line scale by first
applying the EW unique taxon model for each line seg-
ment, and then for each taxon weighting those values by
the proportion they comprised of the town line’s length.
For example, if a town line contained three line segments,
respectively making up 0.7, 0.2, and 0.1 of the town line,
and the EW unique taxon relative abundance for Fagus
grandifolia on those lines was 0.2, 0.3 and 0.4, respectively,
then the EW distance-weighted unique taxon relative
abundance for F. grandifolia on that town line would be:
(0.2 × 0.7) + (0.3 × 0.2) + (0.4 × 0.1) = 0.24.
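The distance weighting in this example can be expressed as a short Python sketch. The function name `distance_weighted` and the segment lengths in metres are hypothetical; only the proportions 0.7, 0.2 and 0.1 and the segment-scale values come from the example above.

```python
def distance_weighted(segment_abundances, segment_lengths):
    """Combine segment-scale relative abundances into a town line value.

    Each segment's value is weighted by the proportion of the town line's
    length that the segment makes up.
    """
    total_length = sum(segment_lengths)
    combined = {}
    for abundances, length in zip(segment_abundances, segment_lengths):
        for taxon, value in abundances.items():
            combined[taxon] = combined.get(taxon, 0.0) + value * length / total_length
    return combined

# Three segments making up 0.7, 0.2 and 0.1 of a town line (lengths are illustrative)
lengths_m = [700, 200, 100]
segment_values = [
    {"Fagus grandifolia": 0.2},
    {"Fagus grandifolia": 0.3},
    {"Fagus grandifolia": 0.4},
]
town_line = distance_weighted(segment_values, lengths_m)
print(round(town_line["Fagus grandifolia"], 2))
```

A taxon absent from a segment's SL simply contributes nothing for that segment, which is one reasonable reading of the procedure rather than a detail stated in the text.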
The EW unique mention model was applied at the town
line scale by determining how many times a taxon was
mentioned in all of the SLs on that town line, dividing that
number by the total number of unique mentions from all
SLs on that town line, and then using the quotient as the
relative abundance for that taxon. For example, if a town
line contained ten line segments and each had an SL with
five taxa, then there were 50 unique mentions. If a taxon
was named in nine of the SLs, then it would have an EW
unique mention relative abundance of 0.18 (9/50).
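A minimal Python sketch of the EW unique mention model follows, reproducing the 9/50 example. The function name `ew_unique_mention` and the placeholder taxa are assumptions for illustration.

```python
from collections import Counter

def ew_unique_mention(species_lists):
    """EW unique mention model for one town line.

    A taxon's relative abundance is the number of SLs that mention it,
    divided by the total number of unique mentions across all SLs.
    """
    mentions = Counter()
    for sl in species_lists:
        mentions.update(set(sl))  # a taxon counts at most once per SL
    total = sum(mentions.values())
    if total == 0:
        return {}
    return {taxon: n / total for taxon, n in mentions.items()}

# Ten SLs of five taxa each; "beech" is named in nine of them
town_sls = [["beech", "a", "b", "c", "d"] for _ in range(9)]
town_sls.append(["a", "b", "c", "d", "e"])
abund = ew_unique_mention(town_sls)
print(abund["beech"])
```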
Five different RW models, representing a variety of rank
abundance distributions (McGill et al. 2007), were
employed: broken stick (MacArthur 1957), pre-emption
(Motomura 1932), log normal (Preston 1948), Zipf (Zipf
1949) and Zipf-Mandelbrot (Frontier 1985). Relative
abundances were predicted using the ‘vegan’ package
within R (R Foundation for Statistical Computing, Vienna,
Austria) with the following equations (where $a_r$ is the predicted
abundance of a taxon with rank $r$) for fitting
observed taxon abundances to a particular distribution.
For the broken stick distribution, $a_r = (J/T) \sum_{x=r}^{T} (1/x)$,
where $J$ is the total number of individuals of all taxa and $T$
is the total number of taxa. For the pre-emption,
$a_r = J\alpha(1-\alpha)^{r-1}$, where $\alpha$ is the estimated rate of decay of
abundance per rank. For the log normal,
$a_r = \exp(\log\mu + \log\sigma N)$, with the assumption that logarithmic
abundances are distributed normally where $N$ is a normal
deviate. For the Zipf, $a_r = J p_1 r^{\gamma}$, where $p_1$ is the proportion
of the most abundant taxon and $\gamma$ is a decay constant. For
the Zipf-Mandelbrot, $a_r = c(r+\beta)^{\gamma}$, where $c$ is a scaling
constant and $\beta$ is a fitted parameter.
To obtain the rank abundance distribution weights, a
series of ‘observed’ taxon abundance vectors was constructed
with i number of taxa, and J set to 1000. In our
study, i ranged from 2 to 14, as that was the range of taxa
on the SLs. A J of 1000 individuals was employed to obtain
smooth abundance curves. Each observed abundance vector
was fitted to each distribution using the above equations.
The fitted abundance values for each vector were
then relativized to sum to 1 to construct a set of weights for
any line segment with i taxa in it. The rank abundance
distribution weights for the five RW models for SLs differ
slightly from each other (Fig. 3). The weights obtained
from applying the pre-emption model to line segments
with different numbers of species are in App. S1. When the
RW models were applied at the town line scale they were
distance-weighted in the same manner as described for the
EW models.
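To illustrate how rank weights relativized to sum to 1 can be attached to an SL, the sketch below computes them in Python for two of the five models. It is not the fitting procedure used in the study, which relied on the ‘vegan’ package in R. The broken stick weights follow directly from its equation because that model has no fitted parameter. The pre-emption weights require a decay rate, and the value 0.5 used here is purely illustrative (the study's fitted weights are in App. S1). All function names are hypothetical.

```python
def broken_stick_weights(n_taxa):
    """Broken stick weights for ranks 1..n_taxa, relativized to sum to 1."""
    raw = [sum(1 / x for x in range(r, n_taxa + 1)) for r in range(1, n_taxa + 1)]
    return [value / n_taxa for value in raw]

def preemption_weights(n_taxa, alpha):
    """Pre-emption weights alpha * (1 - alpha)**(r - 1), relativized to sum to 1.

    alpha is an assumed decay rate, not a value taken from the study.
    """
    raw = [alpha * (1 - alpha) ** (r - 1) for r in range(1, n_taxa + 1)]
    total = sum(raw)
    return [value / total for value in raw]

def rank_weighted_abundance(species_list, weights):
    """Assign one weight per taxon, in the order the surveyor listed them."""
    return dict(zip(species_list, weights))

# Hypothetical SL with three taxa, listed in order of prevalence
sl = ["beech", "sugar maple", "hemlock"]
bs = rank_weighted_abundance(sl, broken_stick_weights(len(sl)))
pe = rank_weighted_abundance(sl, preemption_weights(len(sl), alpha=0.5))
print(bs)
print(pe)
```

Town line values would then be obtained by distance-weighting these segment-scale abundances, as described for the EW models.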
Line segments
To assess the relative amount of information in the BTs
and SLs, they were first compared using five metrics: how
many taxa they contained, how many mentions they contained
(where each BT and each taxon in an SL was denoted as
one ‘mention’), the total length of line segments that contained
them, the median number of BT or SL taxa that
occurred on a line segment, and how the number of taxa
changed with length of line segments.
Next, the degree of similarity between BT and SL taxa
on the same line segment was assessed in six steps.
First, line segments that contained both BTs and SLs
were selected. Second, for each line segment, the pro-
portion of unique BT taxa was calculated as the number
of BT taxa that were not also in the SL taxa, divided by
the total number of BT taxa. Third, for each line seg-
ment the proportion of unique SL taxa was calculated
as the number of SL taxa that were not also in the BT
taxa divided by the total number of SL taxa. Fourth, for
all line segments with any given number of total BT
taxa, the mean was calculated of their proportions of
unique BT taxa as calculated in step two, and then of
their proportions of unique SL taxa as calculated in step
three. Fifth, for all line segments with any given num-
ber of SL taxa, the mean was calculated of their propor-
tions of unique BT taxa as calculated in step two,
and then of their proportions of unique SL taxa as
calculated in step three. Sixth, data from steps four and
five were graphed for groups of line segments with a
given number of BT or SL taxa if they contained more
than one line segment, and trends in the data were
assessed visually.
Fig. 3. Relative abundances assigned to taxon ranks derived from five different rank abundance distribution models for SLs with three and 14 taxa.
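Steps two to five of the uniqueness assessment can be sketched in Python as follows, here grouping line segments by their number of SL taxa. The function names and the two example segments are hypothetical.

```python
from collections import defaultdict

def unique_proportions(bt_taxa, sl_taxa):
    """Proportion of BT taxa absent from the SL, and of SL taxa absent from the BTs."""
    bt, sl = set(bt_taxa), set(sl_taxa)
    return len(bt - sl) / len(bt), len(sl - bt) / len(sl)

def mean_unique_by_sl_count(segments):
    """Mean proportions of unique BT and SL taxa for each number of SL taxa.

    segments: list of (bt_taxa, sl_taxa) pairs for segments with both BTs and SLs.
    """
    groups = defaultdict(list)
    for bt_taxa, sl_taxa in segments:
        groups[len(set(sl_taxa))].append(unique_proportions(bt_taxa, sl_taxa))
    means = {}
    for n_sl, props in groups.items():
        mean_bt = sum(p[0] for p in props) / len(props)
        mean_sl = sum(p[1] for p in props) / len(props)
        means[n_sl] = (mean_bt, mean_sl)
    return means

# Two hypothetical line segments, each with four SL taxa
segments = [
    (["beech", "sugar maple"], ["beech", "sugar maple", "hemlock", "basswood"]),
    (["beech", "white oak"], ["beech", "sugar maple", "hemlock", "elm"]),
]
means = mean_unique_by_sl_count(segments)
print(means)
```

Grouping by the number of BT taxa instead only requires changing the grouping key to `len(set(bt_taxa))`.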
The degree of similarity of BT and SL taxa on line seg-
ments was also assessed in three steps using Cohen’s Kappa
statistic. First, only line segments that contained the same
number of BT and SL taxa were selected, because the use
of line segments with unequal numbers would create
unreliable results (Gwet 2002). Second, Cohen’s Kappa
was calculated for each group of line segments with a given
number of BT and SL taxa. Third, results from step two were
graphed and trends were assessed using regression.
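The text does not spell out how the agreement table for Cohen's Kappa was built, so the following Python sketch rests on an assumption: BTs and SLs are treated as two ‘raters’ that each mark every taxon in a candidate pool as present (1) or absent (0) on a line segment. The function name, the pool of ten taxa and the example records are all hypothetical.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's Kappa for two equal-length lists of binary (0/1) ratings."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    p_a = sum(rater_a) / n  # proportion marked present by rater A
    p_b = sum(rater_b) / n  # proportion marked present by rater B
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)  # chance agreement
    if expected == 1:
        return 1.0  # both raters gave one identical constant rating
    return (observed - expected) / (1 - expected)

# Assumed pool of ten candidate taxa; three are present in each record
pool = ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J"]
bt_present = {"A", "B", "C"}
sl_present = {"A", "B", "D"}
bt_ratings = [int(t in bt_present) for t in pool]
sl_ratings = [int(t in sl_present) for t in pool]
kappa = cohens_kappa(bt_ratings, sl_ratings)
print(round(kappa, 3))
```

Under this reading the value of Kappa depends on the size of the candidate pool, so results from the sketch are not directly comparable with those reported in the paper.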
Town lines
Town line data were analyzed at the community scale
and the taxon scale, both of which had the town line
as the basic unit of analysis. Community-scale analyses
were conducted in three steps. First, for each town line
on its own, relative abundances calculated from its BT
taxa and SL taxa were compared together by calculat-
ing their Pearson’s correlation and root mean square
error (RMSE). For each town line this was done eight
times: once each for the three EW and five RW
distance-weighted models. Second, to assess whether
Pearson’s correlations and RMSEs improved with
increasing numbers of BTs and line segments with SLs
on a town line, data were plotted and polynomial
trend lines were fitted; graphs were visually assessed
for adequate numbers of BTs and line segments, as
indicated by levelling off of the trend lines. Third, to
assess differences in performance of the models, med-
ian values of all Pearson’s correlations and RMSEs for
all town lines were calculated, and differences in them
were assessed using Mann–Whitney tests. This was performed
first for all town lines and then for just those
town lines with adequate numbers of BTs and line seg-
ments with SLs, as identified using trend lines in the
second step.
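The first community-scale step, comparing BT and SL relative abundances on a single town line, can be sketched in Python as below. The function names and abundance values are hypothetical, and expressing abundances as percentages is an assumption.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation between two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def rmse(x, y):
    """Root mean square error between two equal-length numeric lists."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

# Hypothetical relative abundances (%) of five taxa on one town line
bt_abund = [40, 30, 20, 10, 0]
sl_abund = [35, 30, 20, 10, 5]
r_value = pearson_r(bt_abund, sl_abund)
rmse_value = rmse(bt_abund, sl_abund)
print(round(r_value, 3), round(rmse_value, 3))
```

Repeating this for every town line and every model gives the sets of correlations and RMSEs whose medians are compared in the third step; a two-sample rank test such as `scipy.stats.mannwhitneyu` could serve for that comparison, although the software used for the tests is not stated here.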
Taxon-scale analyses were conducted in three steps.
First, for all town lines together, relative abundances of a
taxon in the BTs were compared with those predicted from
the SLs using the three EW and five RW distance-weighted
models, using Pearson’s correlations and RMSE. This was
done for all town lines and then for all town lines with an
adequate number of BTs and line segments with SLs, as
determined in the second step of the community-scale
analyses. Second, regression analyses were used to assess
whether the Pearson’s correlations were higher and
RMSEs were lower for taxa present on a larger number of
town lines. Third, significant differences in the Pearson’s
correlations and RMSEs for each taxon were assessed
between the models using Mann–Whitney tests.
Results
Line segments
Individual line segments varied from 20 to 12 875 m in
length. The total length of line surveyed was 3 172 953 m,
of which 86.9% contained BTs and SLs, 11.6% contained
only SLs, 0.4% contained only BTs and 1.0% contained
neither. The BT data had 8394 BTs from 38 taxa located
around 3302 posts. The SL data had 11 192 unique taxon
mentions from 40 taxa located on 2859 line segments. The
median number of different taxa on a line segment was
three for the BTs and seven for the SLs. The number of taxa
recorded in the 1953 segments with BTs and SLs increased
linearly with the line segment length, with stronger relations
for BTs (R² = 0.41, P < 0.001) than for SLs
(R² = 0.18, P < 0.001); the trend lines predicted a line segment
of 20 m would have 3.9 SL and 1.5 BT taxa, and a
line of 12 875 m would have 9.8 SL and 9.0 BT taxa. The
highest taxon number recorded in the BTs was ten taxa on
a line segment of 7092 m, and in the SLs was 14 taxa on a
line segment of 573 m; for line segments of 100 m or less,
the largest number was four BT taxa over 100 m, and nine
SL taxa over 61 m.
As the number of SL taxa on a line segment increased
from one to five, the percentage of those SL taxa that were
unique (i.e. not also mentioned in BTs of that same line
segment) increased, while that of unique BT taxa from
those same line segments decreased (Fig. 4a). As the number
of SL taxa increased from 6 to 10, there was generally
no further change in uniqueness of BT or SL taxa. As the
number of BT taxa on a line segment increased from one
to nine, the percentage of the unique BT taxa (i.e. not also
mentioned in SLs of that same line segment) increased,
and that of unique SL taxa decreased (Fig. 4b). Cohen’s
Kappa, assessed using the 184 line segments with the same
number of BT and SL taxa, increased from a mean of 0.61
(n=42) when there was only one taxon in both the BTs
and SLs on a line segment, to a mean of 0.74 (n=2) when
the number of taxa was seven (R² = 0.68, P = 0.022), indicating
‘substantial agreement’ (Landis & Koch 1977).
Town lines
The set of all town lines (n=433) ranged in length from
119 to 12 873 m, with a median of 8805 m; short town
lines were adjacent to irregularly shaped Indian Reservations
(Fig. 1). The set of town lines that had ≥15 BTs and
≥7 line segments with SLs (see next paragraph) ranged in
length from 5130 to 12 873 m, with a median of 9539 m
(n=164). Community-scale and taxon-scale relations dif-
fered in two regards. The community scale compared the
relative abundances of all 40 taxa at once by comparing
their relative abundances in BTs and SLs on one town line
at a time, and then compiling those results for all town
lines. The taxon scale examined one taxon at a time by
comparing its relative abundance in BTs and SLs on all
town lines, and then compiling those results for the 28 taxa
that had non-zero values for both BTs and SLs.
An example of community-scale relations is shown for a
typical town line using the EW unique taxon and RW dis-
tance-weighted pre-emption models (Fig. 5). Community-
scale relations between BTs and SLs varied with number of
BTs on the town line (Fig. 6a, b) and number of line
segments with SLs on the town line (Fig. 6c, d). Visual
analyses of results for the EW unique taxon and RW
distance-weighted pre-emption models (Fig. 6) suggested
that relations became asymptotic when the number of BTs
on a town line was ≥15 and the number of line segments
with SLs was ≥7; of the total of 433 town lines, 164 met
those two criteria.
The median Pearson’s correlations in the community-
scale analyses ranged from 0.49 when relative abun-
dances in the SLs were predicted using the EW unique
taxon model with all 433 town lines, to 0.84 when
using any of the RW distance-weighted models (other
than the RW distance-weighted Zipf model) for the 164
town lines (Table 1). The median RMSE ranged from
7.54 when the relative abundances in the SLs were pre-
dicted using the EW unique taxon model with all 433
town lines, to 4.33 when using the RW distance-
weighted pre-emption model for the 164 town lines
(Table 1; note: lower RMSEs suggest better agreement).
The median Pearson’s correlations for all eight models
were all higher, and median RMSEs were all lower, for
relations determined using the 164 town lines than for
those using all 433 town lines. Mann–Whitney tests
indicated that Pearson’s correlations and RMSEs for both
the 433 town lines and the 164 town lines had the fol-
lowing patterns: the performance of the five RW dis-
tance-weighted models did not differ significantly from
each other, the five RW distance-weighted models all
performed better than the three EW models, and the
EW distance-weighted unique taxon and the EW
unique mention models did not differ significantly from
each other but performed significantly better than the
EW unique taxon model. Results that were significant
had a P<0.007, while non-significant results had a
P>0.480.
In taxon-scale analyses the Pearson’s correlations ran-
ged from 0.01 for Carpinus caroliniana using both models
and data sets, to 0.89 for Betula lenta using the EW unique
taxon model and the 164 town lines (Table 2). An exam-
ple of taxon-scale analyses is shown for Fagus grandifolia
using the 164 town lines with the EW unique taxon and
RW distance-weighted pre-emption models (Fig. 7).
RMSE ranged from 36.31 for Fagus grandifolia when using
the EW unique taxon model and the 164 town lines, to
(a)
(b)
Fig. 4. The mean proportions of the SL taxa ( ) and BT taxa ( )thatare
unique to a line segment with that number of (a) SL taxa, and of (b)BT
taxa, for the 1953 line segments whichhave BTs and SLs.
Fig. 5. Relations between the relative abundance of a taxon in the BTs
and the relative abundance of the taxon in the SLs predicted using the EW
unique taxon model and the RW distance-weighted pre-emption model
for a typical town line. This town line’s correlations were close to the
median values for all town lines. This town line is 9159 m long, has 13 line
segments all of which have SLs and which together have 50 unique
mentions from 13 taxa, and has 29 BTs from 11 taxa.
Applied Vegetation Science
Doi: 10.1111/avsc.12165 ©2015 International Association for Vegetation Science
C.P.S. Larsen et al. Predicting historic forest composition
0.04 for Abies balsamea when using the RW distance-
weighted pre-emption model and the 164 town lines. The
number of taxa with significant Pearson’s correlations
(P<0.001) ranged from 19 of 28 taxa for the EW unique
taxon model using the 164 town lines, to 24 of 28 taxa for
the EW unique taxon models and the RW distance-
weighted pre-emption models using all 433 town lines
(Table 2). The two other EW and four other RW distance-
weighted models provided similar results. For both the 433
and 164 town lines, the RW distance-weighted pre-emp-
tion model produced higher Pearson’s correlations than
those from the EW unique taxon model for 19 of 28 taxa.
The RW distance-weighted pre-emption model produced
RMSEs that were lower than those for the EW unique
taxon model for 23 of 28 taxa in the 433 town lines, and
21 of 28 taxa in the 164 town lines. Pearson’s correlations
and RMSE values both increased as the percentage of town
lines on which a taxon had BTs present increased, when
considered using the RW distance-weighted pre-emption
model and the 164 town lines (Fig. 8).
The median Pearson’s correlation for all 28 taxa was
highest for the RW distance-weighted pre-emption model
for the 164 town lines, and lowest for the EW unique
taxon model for the 433 town lines; the median RMSE
was highest for the EW unique taxon model for the 164
town lines, and lowest for the RW distance-weighted pre-
emption model for the 433 town lines. Mann–Whitney
tests conducted on all 28 taxa for all three EW models and
all five RW distance-weighted models showed that the EW
unique taxon model had significantly lower Pearson’s cor-
relations than did the seven other models (P<0.048), but
no other models differed significantly from each other.
Fig. 6. Relations between (a,b) the number of BTs on a town line and the Pearson’s correlation between the relative abundances of the BT and SL taxa on
a town line, and (c,d) the number of line segments with SLs on a town line and the Pearson’s correlation between the relative abundances of the BT and SL
taxa on a town line. The SL relative abundances were predicted using the EW unique taxon (a,c) and the RW distance-weighted pre-emption (b,d) models.
Table 1. Community-scale results. Median values of Pearson’s R and of RMSE calculated between town lines using the relative abundance of all taxa in
their BTs and all taxa in the SLs, with the values predicted for the SLs using eight different models. The subset of 164 town lines was selected following the
visual analyses in Fig. 6. The models are: EW unique taxon (EW-UT), EW unique mentions (EW-UM), EW distance-weighted unique taxon (EW-DWUT),
RW distance-weighted broken stick (RW-DWBS), RW distance-weighted log normal (RW-DWLN), RW distance-weighted pre-emption (RW-DWPE),
RW distance-weighted Zipf (RW-DWZ) and RW distance-weighted Zipf–Mandelbrot (RW-DWZM).
Statistic  EW-UT    EW-UM    EW-DWUT  RW-DWBS  RW-DWLN  RW-DWPE  RW-DWZ   RW-DWZM
All town lines (n = 433)
R          0.49     0.69     0.68     0.78     0.77     0.78     0.77     0.77
RMSE       7.54     6.17     6.17     5.35     5.40     5.34     5.47     5.36
Town lines with 15 BTs and 7 vegetated line segments (n = 164)
R          0.50     0.76     0.76     0.84     0.84     0.84     0.83     0.84
RMSE       6.82     5.04     5.31     4.36     4.40     4.33     4.44     4.35
Mann–Whitney tests did not disclose significant differ-
ences between RMSEs of any of the eight EW or RW
models (P>0.521).
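To make the model comparisons above concrete, here is a minimal Python sketch of a two-sample Mann–Whitney test applied to per-town-line Pearson’s correlations from two models. The correlation values are hypothetical, the function name `mann_whitney_u` is our own, and the P value uses a plain normal approximation without tie or continuity correction, so it will not exactly match the output of a statistics package.

```python
import math

def mann_whitney_u(x, y):
    """Return (U, approximate two-sided P) for two independent samples.

    Tied values receive midranks. The P value comes from a normal
    approximation with no tie or continuity correction.
    """
    pooled = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        i = j + 1
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    u = min(u_x, n1 * n2 - u_x)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u, p

# Hypothetical Pearson's correlations for six town lines under two models.
ew_r = [0.41, 0.48, 0.50, 0.52, 0.55, 0.47]  # an EW model
rw_r = [0.78, 0.81, 0.84, 0.86, 0.88, 0.83]  # an RW model

u_stat, p_value = mann_whitney_u(ew_r, rw_r)
print(f"U = {u_stat}, P = {p_value:.4f}")
```

Because the test works on ranks rather than raw values, it suits correlations and RMSEs, whose distributions across town lines need not be normal.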
Relations between BTs and SLs were stronger for com-
munity-scale analyses than for taxon-scale analyses. The
median Pearson’s correlations between BT and SL data for
the 164 town lines when using the EW unique taxon and
RW distance-weighted pre-emption models were, respec-
tively, 0.50 and 0.84 for community-scale analyses
(Table 1), and 0.29 and 0.45 for taxon-scale analyses
(Table 2). That median value of 0.84 for the community-
scale analyses using the RW distance-weighted pre-emp-
tion model was only exceeded by one taxon in taxon-scale
analyses: black birch (Betula lenta), with a correlation of 0.89 when
using the EW unique taxon model and the 164 town lines.
Discussion
Comparisons between BTs and SLs at community and
taxon scales indicate that the two data sets are related bet-
ter when using RW than EW models. This supports the
contention that SLs are ranked and correspond to real dif-
ferences in relative abundances. These results extend the
finding of Terrail et al. (2014) that first-ranked SL taxa are
the most abundant, to also include later-ranked SL taxa.
This was indicated by statistical analyses at the town line
scale, showing that BTs were more strongly related to
relative abundances predicted from SLs using RW
distance-weighted models than using EW models. While RW
distance-weighted models performed better than all EW
models in the community-scale analyses and better than
Table 2. Taxon-scale results. Relations between the relative abundance of BT and SL taxa for all town lines, and for selected town lines with 15 BTs and
7 vegetated line segments. The 11 taxa that were in the selected group but which had zero BTs or SL taxa were excluded here. Results are shown for the
EW unique taxon (EW-UT) and the RW distance-weighted pre-emption (RW-DWPE) models as they had, respectively, the lowest and highest median Pear-
son’s R and RMSE values among the eight models. Pearson’s R values with a P<0.01 are italicized, and those with a P<0.001 are bolded. RMSE values
are in percentages.
Measure                 Pearson’s R                             RMSE (%)
Town lines              All (n = 433)       Selected (n = 164)  All (n = 433)       Selected (n = 164)
Taxon                   EW-UT     RW-DWPE   EW-UT     RW-DWPE   EW-UT     RW-DWPE   EW-UT     RW-DWPE
Abies balsamea          0.42      0.56      0.41      0.55      1.46      0.60      0.65      0.04
Acer rubrum             0.25      0.21      0.36      0.47      9.34      9.45      13.80     14.58
Acer saccharum          0.18      0.34      0.36      0.50      20.80     19.25     24.97     25.94
Betula alleghaniensis   0.16      0.17      0.19      0.35      8.29      6.37      10.56     8.65
Betula lenta            0.04      0.05      0.89      0.83      3.38      2.97      5.14      4.76
Carpinus caroliniana    0.01      0.01      0.01      0.01      1.03      0.56      1.30      0.65
Carya spp.              0.25      0.29      0.41      0.46      5.44      4.09      6.63      5.73
Castanea dentata        0.38      0.44      0.49      0.75      7.78      6.05      11.35     8.78
Fagus grandifolia       0.23      0.32      0.20      0.52      33.53     26.74     36.31     33.94
Fraxinus americana      0.10      0.16      0.14      0.43      9.22      7.68      12.38     10.84
Fraxinus nigra          0.26      0.58      0.28      0.73      9.16      7.69      11.41     10.97
Juglans nigra           0.24      0.23      0.23      0.16      2.33      1.60      2.82      1.67
Juglans cinerea         0.29      0.43      0.34      0.68      4.43      2.35      5.44      3.18
Larix laricina          0.55      0.53      0.22      0.00      6.41      6.71      9.65      10.57
Magnolia acuminata      0.19      0.20      0.36      0.31      4.75      3.41      4.53      4.06
Ostrya virginiana       0.02      0.01      0.05      0.03      7.92      7.38      11.73     11.39
Pinus strobus           0.47      0.46      0.40      0.65      11.13     11.46     16.89     17.80
Platanus occidentalis   0.25      0.39      0.38      0.35      3.82      2.72      4.85      4.27
Populus spp.            0.19      0.10      0.29      0.22      3.61      2.65      3.68      2.74
Prunus serotina         0.30      0.21      0.24      0.16      4.54      2.00      3.11      1.60
Quercus alba            0.20      0.34      0.48      0.64      9.85      9.35      14.45     13.92
Quercus montana         0.56      0.48      0.57      0.57      1.52      1.50      2.35      2.37
Quercus rubra           0.25      0.25      0.18      0.19      2.22      1.69      1.48      1.75
Quercus velutina        0.25      0.33      0.36      0.45      11.25     12.33     17.31     19.20
Thuja occidentalis      0.34      0.77      0.36      0.50      1.65      0.92      2.28      1.28
Tilia americana         0.29      0.37      0.17      0.34      8.32      8.24      9.68      9.28
Tsuga canadensis        0.38      0.46      0.24      0.68      16.17     16.11     22.81     24.20
Ulmus spp.              0.22      0.33      0.29      0.45      9.66      8.80      12.49     11.64
Median                  0.25      0.33      0.29      0.45      6.41      6.05      9.68      8.78
the EW unique taxon model in the taxon-scale analyses,
none of the five RW distance-weighted models performed
better than any other. This may be because the five rank
abundance distribution models we used to create RW dis-
tance-weighted predictions provide fairly similar results
when low numbers of taxa are present (Fig. 3). Alterna-
tively, it could be that each of the five rank abundance dis-
tribution models fit certain forest community types better
than others (Ulrich et al. 2010), and that combining those
communities together degraded the results of each model.
Since the RW pre-emption model forwarded by Seischab
(1990, 1992) and used by Fritschle (2012) performs as well
as any other RW distance-weighted model, our results sup-
port continued usage of that model, especially as it is the
most intuitive RW model to understand and easiest RW
model to apply.
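The contrast between rank weighting and equal weighting can be sketched in a few lines of Python. The sketch below assumes the pre-emption model takes the usual geometric-series form, in which each successive taxon takes a fixed fraction k of what remains, rescaled so that the taxa in one SL sum to 100%. The value k = 0.5, the function names and the three-taxon species list are illustrative assumptions; the abundance values actually used in this study are those tabulated in Appendix S1, and distance weighting across line segments is not shown.

```python
def preemption_abundances(n_taxa, k=0.5):
    """Relative abundances (%) for ranks 1..n_taxa under a geometric
    pre-emption series, rescaled to sum to 100.

    k is the fraction pre-empted by each successive taxon; 0.5 is an
    illustrative assumption, not a value taken from the paper.
    """
    raw = [k * (1 - k) ** rank for rank in range(n_taxa)]
    total = sum(raw)
    return [100 * value / total for value in raw]

def equal_abundances(n_taxa):
    """Relative abundances (%) when every listed taxon is weighted equally."""
    return [100 / n_taxa] * n_taxa

# Hypothetical species list, in the order a surveyor might have written it.
species_list = ["Fagus grandifolia", "Acer saccharum", "Tsuga canadensis"]

rw = dict(zip(species_list, preemption_abundances(len(species_list))))
ew = dict(zip(species_list, equal_abundances(len(species_list))))

for taxon in species_list:
    print(f"{taxon}: RW = {rw[taxon]:.1f}%, EW = {ew[taxon]:.1f}%")
```

Under this assumption the first-listed taxon receives more than half of the predicted abundance, whereas equal weighting assigns one-third to each taxon regardless of its position in the list.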
The BTs and SLs also exhibit uniqueness and thus inde-
pendence, as indicated by the Cohen’s kappa of 0.61 for
line segments that had only one BT and one SL taxon. If
surveyors had simply been creating their SLs from the BTs,
that measure would have been higher. The independence
of BTs and SLs is furthered by the observation that some
short line segments contained many SL taxa, and that the
number of taxa in BTs and SLs was typically different. The
SLs exhibited more unique taxa than did the BTs, except
for line segments that contained only one SL taxon or eight
or more BT taxa (Fig. 4), because there were more unique
mentions in the SLs (11 192) than in the BTs (8394).
Although it is unknown whether the Holland Land
Company surveyors were required to rank the taxa in their
SLs, as the 1797–9 survey pre-dates the 1833 requirement
to rank them, it was either an informal practice of the Hol-
land Land Company surveyors to rank them, or the sur-
veyors were good at observing the natural manner in
which taxa in communities exhibit rank abundance distri-
butions.
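For readers unfamiliar with Cohen’s kappa, the Python sketch below shows how it could be computed for line segments that have exactly one BT taxon and one SL taxon. The eight segment records and the function name `cohens_kappa` are hypothetical; they were chosen only so that the result falls near the value of 0.61 discussed above.

```python
from collections import Counter

def cohens_kappa(pairs):
    """Cohen's kappa for a list of (label_a, label_b) pairs."""
    n = len(pairs)
    observed = sum(a == b for a, b in pairs) / n
    count_a = Counter(a for a, _ in pairs)
    count_b = Counter(b for _, b in pairs)
    # Agreement expected by chance from the two marginal distributions.
    expected = sum(count_a[label] * count_b[label] for label in count_a) / n ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical (BT taxon, SL taxon) pairs, one pair per line segment.
segments = [
    ("beech", "beech"), ("beech", "beech"), ("maple", "maple"),
    ("maple", "beech"), ("hemlock", "hemlock"), ("hemlock", "maple"),
    ("beech", "beech"), ("maple", "maple"),
]

kappa = cohens_kappa(segments)
print(f"kappa = {kappa:.2f}")
```

Kappa discounts the agreement that would arise by chance alone, so a value well below 1 on single-taxon segments is consistent with SLs not being simple copies of the BTs.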
Of the three EW models, EW unique mentions and EW
distance-weighted unique taxon models performed as well
as RW models for taxon-scale analyses of town lines and
second-best to RW models for community-scale analyses
of town lines, while EW unique taxon models consistently
performed poorest. The EW unique mention model has
been used in most other studies of SLs (e.g. Lutz 1930;
Lorimer 1977; Scull & Richardson 2007; Brister et al.
2011), while EW unique taxon and EW distance-weighted
unique taxon models have not been used in any other
PLSR studies. Although EW distance-weighted unique
taxon models employed distance weighting to account for
differences in lengths of line segments, they may not have
been better than EW unique mention models because
longer line segments tend to have more SL taxa. If future
studies want to employ EW models, we suggest the EW
unique mention model as it is easier to apply than is the
EW distance-weighted unique taxon model.
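A minimal Python sketch of the two simpler EW models follows. It rests on our reading of the model names, which is an assumption rather than a restatement of the methods: the EW unique mention model is taken to give each taxon its share of all SL mentions on a town line, and the EW unique taxon model is taken to give every taxon present on the town line the same share. The species lists and function names are hypothetical.

```python
from collections import Counter

def ew_unique_mention(species_lists):
    """Relative abundance (%) as a taxon's share of all SL mentions.

    A taxon is counted once per species list in which it appears.
    """
    mentions = Counter(taxon for sl in species_lists for taxon in set(sl))
    total = sum(mentions.values())
    return {taxon: 100 * count / total for taxon, count in mentions.items()}

def ew_unique_taxon(species_lists):
    """Relative abundance (%) as an equal share for each taxon present."""
    taxa = {taxon for sl in species_lists for taxon in sl}
    return {taxon: 100 / len(taxa) for taxon in taxa}

# Hypothetical SLs for three line segments on one town line.
town_line = [
    ["Fagus grandifolia", "Acer saccharum"],
    ["Fagus grandifolia", "Tsuga canadensis", "Acer saccharum"],
    ["Fagus grandifolia"],
]

um = ew_unique_mention(town_line)
ut = ew_unique_taxon(town_line)
for taxon in sorted(um):
    print(f"{taxon}: unique mention = {um[taxon]:.1f}%, unique taxon = {ut[taxon]:.1f}%")
```

On this reading, the unique mention model rewards taxa that recur across many line segments, which may help explain why it outperforms the unique taxon model.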
The larger abundance and spatial extent of SLs than BTs
that we found has also been reported in other studies (e.g.
Seischab 1990; Brister et al. 2011). For example, Scull &
Richardson (2007) had fewer than 50 BTs but more than
Fig. 7. Relations between the relative abundance of Fagus grandifolia in
the BTs on a town line and the relative abundance of Fagus grandifolia
predicted from the SLs on the town line. The predictions are modelled
using the EW unique taxon model and the RW distance-weighted pre-
emption model for the 164 town lines with 15 BTs and 7 line
segments. The trend line for the RW distance-weighted pre-emption model
is solid, and for the EW unique taxon model is dashed.
Fig. 8. Relations between the percentage of town lines on which a taxon
had BTs present on it, and the Pearson’s R and the RMSE between
the relative abundance of a taxon as determined from BTs, and as
predicted from SLs using the RW distance-weighted pre-emption model.
The trend line for Pearson’s R values is dashed, and for RMSE values is
solid. The abundances were modelled for the 164 town lines with 15 BTs
and 7 vegetated line segments. The results are shown for the 28 taxa that
have non-zero abundances for both BTs and SLs.
1300 species mentions in SLs from 485 line segments. Our
finding that some line segments contained just SLs or just
BTs was also found by Batek et al. (1999). We agree with
them that to obtain the most spatially comprehensive
reconstructions of vegetation, analytic methods should be
developed that combine BTs and SLs. If SLs contain less
species selection bias than occurs in BTs (e.g. Kronenfeld &
Wang 2007), then the use of SLs may provide more realis-
tic reconstructions of presettlement forest landscapes.
Indeed, Tulowiecki (2014) found better performance in
environmental models constructed with SLs than with
BTs. If a modelling approach could combine BTs and SLs, it
could be applied to over two million km of SLs in the USA,
parts of Canada and potentially Australia.
Higher correlations for community-scale analyses rela-
tive to those of taxon-scale analyses suggest that the posi-
tion of a taxon in an SL is better at indicating its
abundance relative to other taxa on that same line, than
it is at indicating its absolute abundance across multiple
lines. Three possible reasons for the lower taxon-scale
correlations are as follows. First, if a taxon is infrequent
on the landscape then it might be in the BTs or SLs on a
line segment but is unlikely to be in both. That this occurs
is suggested by the taxa with the lowest correlations
being found in the BTs on the fewest town lines (Fig. 8).
Although the occurrence of this would result in low cor-
relations, it does not mean either form of data is incor-
rect. Taxa with fewer BTs have also been found to
exhibit poorer performances in geostatistical models
(Wang & Larsen 2006) and species distribution models of
species with broad niches (Hanberry et al. 2012). Second,
the BT data do contain biases and thus their measures of
relative abundance may be biased. For example, while
the maximum predicted abundance of Fagus grandifolia in
the SLs was 42% (Fig. 7), its relative abundance in the
BTs for 61 of the 164 town lines was between 42% and
79%. Such high values suggest selection bias for Fagus
grandifolia in the BTs, resulting in overestimates of its
abundance. In contrast, analyses of the BT distances from
the survey posts by Kronenfeld & Wang (2007), using
the same BTs as this study, indicated that in this Holland
Land Company survey Fagus grandifolia was underesti-
mated in the BTs by 5.3%. Third, although the five RW
models we employed are believed to represent the major
families of rank abundance distribution models (McGill
et al. 2007), it may be that these forests have a unique
structure that is not reflected in those models. But if that
was the case, the community-scale analyses should not
have provided such strong results.
We conclude that in our study area, and presumably
others, SLs have four distinct advantages over BTs: over
the whole study area, SLs contain more taxa and more
unique mentions than do BTs; short line segments will
have few BT taxa but potentially many SL taxa; the ranked
nature of the SLs means that individual species mentions
in an SL indicate its relative abundance while an individual
BT simply indicates presence; and the SLs allow relative
abundances to be predicted for short line segments while
BTs require the integration of data over much larger areas.
The correlations between SLs and BTs indicate that their
information on relative abundances of tree taxa is similar
but unique. It is indeed possible that SLs may provide more
accurate information than do BTs on relative abundances
of species; to assess that we suggest that species distribution
and abundance models (Anadon et al. 2010) be applied in
a manner similar to Hanberry et al. (2012) and Tulowiecki
(2014). The prediction of species abundances using RW
models of SLs will thus provide much data, in some cases
with many taxa on short line segments, with which to con-
duct ecological studies of presettlement forests.
Acknowledgements
We thank Martin Hermy and the two anonymous
reviewers for providing detailed comments that helped
improve this paper. Y.-C.W. thanks NUS for funding sup-
port (R-109-000-060-112/113) and Canice Chua and
Yikang Feng for transcribing the species lists. S.J.T.
received funding support for his research from a University
at Buffalo Presidential Fellowship.
References
Anadon, J.D., Gimenez, A. & Ballestar, R. 2010. Linking local
ecological knowledge and habitat modelling to predict abso-
lute species abundance on large scales. Biodiversity and Conser-
vation 19: 1443–1454.
Batek, M.J., Rebertus, A.J., Schroeder, W.A., Haithcoat, T.L.,
Compas, E. & Guyette, R.P. 1999. Reconstruction of early
nineteenth-century vegetation and fire regimes in the Mis-
souri Ozarks. Journal of Biogeography 26: 397–412.
Bickford, S. & Mackey, B. 2004. Reconstructing pre-impact vege-
tation cover in modified landscapes using environmental
modelling, historical surveys and remnant vegetation data: a
case study in the Fleurieu Peninsula, South Australia. Jour-
nal of Biogeography 31: 787–805.
Bourdo, E.A. 1956. A review of the General Land Office survey
and of its use in quantitative studies of former forests. Ecology
37: 754–768.
Brister, E., Hane, E. & Korfmacher, K. 2011. Visualizing plant
community change using historical records. International
Journal of Applied Geospatial Research 2: 1–18.
Finley, R.W. 1951. The original vegetation cover of Wisconsin.
Dissertation. University of Wisconsin, WI, USA.
Fritschle, J.A. 2012. Identification of old-growth forest reference
ecosystems using historic land surveys, Redwood National
Park, California. Restoration Ecology 20: 679–687.
Frontier, S. 1985. Diversity and structure in aquatic ecosystems.
Oceanography and Marine Biology Annual Review 23: 253–312.
Gwet, K. 2002. Kappa statistic is not satisfactory for assessing the
extent of agreement between raters. Statistical Methods for
Inter-Reliability Assessment 1: 1–5.
Hanberry, B.B., He, H.S. & Palik, B.J. 2012. Comparing predicted
historical distributions of tree species using two tree-based
ensemble classification methods. The American Midland
Naturalist 168: 443–455.
Kronenfeld, B.J. & Wang, Y.-C. 2007. Accounting for surveyor
inconsistency and bias in estimation of tree density from pre-
settlement land survey records. Canadian Journal of Forest
Research 37: 2365–2379.
Landis, J.R. & Koch, G.G. 1977. The measurement of observer
agreement for categorical data. Biometrics 33: 159–174.
Larsen, C.P.S., Kronenfeld, B.J. & Wang, Y.-C. 2012. Forest com-
position: more altered by future climate change than by
Euro-American settlement in western New York and Penn-
sylvania? Physical Geography 33: 3–20.
Little, E.L. Jr 1979. Checklist of United States trees (native and
naturalized). Agricultural Handbook 541. U.S. Department of
Agriculture Forest Service, Washington, DC.
Lorimer, C.G. 1977. The presettlement forest and natural distur-
bance cycle of northeastern Maine. Ecology 58: 139–148.
Lutz, H.J. 1930. Original forest composition in northwestern
Pennsylvania as indicated by early land survey notes. Journal
of Forestry 28: 1098–1103.
MacArthur, R. 1957. On the relative abundance of bird species.
Proceedings of the National Academy of Sciences of the United States
of America 43: 293–295.
McCracken, E. 1959. The woodlands of Ireland circa 1600. Irish
Historical Studies 11: 271–296.
McGill, B.J., Etienne, R.S., Gray, J.S., Alonso, D., Anderson,
M.J., Benecha, H.K., Dornelas, M., Enquist, B.J., Green, J.
L., (...) & White, E.P. 2007. Species abundance distributions:
moving beyond single prediction theories to integration
within an ecological framework. Ecology Letters 10: 995–
1015.
Motomura, I. 1932. On the statistical treatment of communities.
Zoological Magazine, Tokyo 44: 379–383.
Preston, F.W. 1948. The commonness, and rarity, of species.
Ecology 29: 254–283.
Ravenscroft, C., Scheller, R.M., Mladenoff, D.J. & White, M.A.
2010. Forest restoration in a mixed-ownership landscape
under climate change. Ecological Applications 20: 327–346.
Rhemtulla, J. & Mladenoff, D.J. 2010. Relative consistency, not
absolute precision, is the strength of the Public Land Survey:
response to Bouldin. Ecological Applications 20: 1187–1189.
Scull, P.R. & Richardson, J.L. 2007. A method to use ranked tim-
ber observations to perform forest composition reconstruc-
tions from land survey data. The American Midland Naturalist
158: 446–460.
Seischab, F.K. 1990. Presettlement forests of the Phelps and Gor-
ham purchase in western New York. Bulletin of the Torrey
Botanical Club 117: 27–38.
Seischab, F.K. 1992. Forests of the Holland Land Company in
western New York, circa. 1798. New York State Museum Bulle-
tin 484: 36–53.
Siccama, T.G. 1971. Presettlement and present forest vegetation
in northern Vermont with special reference to Chittenden
County. The American Midland Naturalist 85: 153–172.
Terrail, R., Arsenault, D., Fortin, M.-J., Dupuis, S. & Boucher, Y.
2014. An early forest inventory indicates high accuracy of
forest composition data in pre-settlement land survey
records. Journal of Vegetation Science 25: 691–702.
Tulowiecki, S.J. 2014. Using vegetation data within presettle-
ment land survey records for species distribution modeling: a
tale of two datasets. Ecological Modelling 291: 109–120.
Ulrich, W., Ollink, M. & Ugland, K.I. 2010. A meta-analysis of
species-abundance distributions. Oikos 119: 1149–1155.
Wang, Y.-C. 2005. Presettlement land survey records of vegeta-
tion: geographic characteristics, quality and modes of analy-
sis. Progress in Physical Geography 29: 568–598.
Wang, Y.-C. 2007. Spatial patterns and vegetation-site relation-
ships of the presettlement forests in western New York, USA.
Journal of Biogeography 34: 500–513.
Wang, Y.-C. & Larsen, C.P.S. 2006. Do coarse resolution US pre-
settlement land survey records adequately represent the spa-
tial pattern of individual tree species? Landscape Ecology 21:
1003–1017.
White, C.A. 1983. A History of the Rectangular Survey System. U.S.
Dept. of the Interior, Bureau of Land Management, Wash-
ington, DC.
Whitney, G.G. 1996. From Coastal Wilderness to Fruited Plain: a His-
tory of Environmental Change in Temperate North America, 1500
to Present. Cambridge University Press, Cambridge, UK.
Wilson, J.B. 1991. Methods for fitting dominance diversity
curves. Journal of Vegetation Science 2: 35–46.
Wyckoff, W. 1988. The Developer’s Frontier: the Making of the Wes-
tern New York Landscape. Yale University Press, New Haven,
CT.
Zipf, G.K. 1949. Human Behavior and the Principle of Least Effort.
Addison-Wesley, Reading, MA, US.
Supporting Information
Additional Supporting Information may be found in the
online version of this article:
Appendix S1. Table of abundance values predicted using
the pre-emption model.
... Thompson et al. 2013); to reconstruct presettlement vegetation communities (e.g. Cogbill et al. 2002;Wang 2007;Dupuis et al. 2011;Larsen et al. 2015;Paciorek Communicated by E.C. Grimm. ...
... LSRs have also been used in recent years as sources of information on Native American land-use impacts on temperate forested ecosystems (e.g. Black and Abrams 2001;Foster et al. 2004;Black et al. 2006;Tulowiecki and Larsen 2015). Analysis of the interactions between prehistoric and historic indigenous societies and native vegetation communities has relevance for a variety of disciplines including archaeology (environmental contextualization of prehistoric culture systems; Gajewski et al. 2019), biogeography (reconstruction of Holocene species' dispersals; MacDougall 2003), climate change studies (estimation of biomass burning and prehistoric carbon storage; Power et al. 2012;Koch et al. 2019), fire ecology (reconstruction of fire regimes prior to Euro-American settlement; Thomas-Van Gundy and Nowacki 2013; Abrams and Nowacki 2019), conservation and restoration ecology (reintroduction of extirpated taxa and recovery of degraded/destroyed ecosystems; Alagona et al. 2012), and historical geography (indigenous land-use legacies on Euro-American settlement geography; Coughlan and Nelson 2018). ...
... Analysis of the interactions between prehistoric and historic indigenous societies and native vegetation communities has relevance for a variety of disciplines including archaeology (environmental contextualization of prehistoric culture systems; Gajewski et al. 2019), biogeography (reconstruction of Holocene species' dispersals; MacDougall 2003), climate change studies (estimation of biomass burning and prehistoric carbon storage; Power et al. 2012;Koch et al. 2019), fire ecology (reconstruction of fire regimes prior to Euro-American settlement; Thomas-Van Gundy and Nowacki 2013; Abrams and Nowacki 2019), conservation and restoration ecology (reintroduction of extirpated taxa and recovery of degraded/destroyed ecosystems; Alagona et al. 2012), and historical geography (indigenous land-use legacies on Euro-American settlement geography; Coughlan and Nelson 2018). Unfortunately, studies focused on Native American land-use impacts using high-resolution (spatially and taxonomically) LSR datasets have been relatively few in number and are frequently of limited spatial coverage, typically encompassing county-level-sized or slightly larger (~ 10 3 km 2 ; Black and Abrams 2001;Black et al. 2006;Tulowiecki and Larsen 2015) geographic extents. During this time, developments in prehistoric North American archaeology (e.g. ...
Article
Full-text available
Historic land survey records (LSRs) offer important details on local- and landscape-scale vegetation patterns related to Native American land-use practices prior to widespread Euro-American settlement. This study’s use of an expanded range of vegetation-related variables derived from LSR sources, combined with archaeological site distribution data, and analysed using complementary multivariate statistical methods, has provided new insights on the spatial and compositional dynamics of the vegetation of central New York State, USA, an area historically occupied by the Cayuga and Onondaga nations. The upland vegetation of the study area was modulated primarily by fire, followed by soil fertility, and canopy disturbance. Clear signals of Native American agriculture and silviculture were associated with a number of fire-tolerant vegetation communities that were geographically concentrated within an area most conducive to maize cultivation. Numerical classification partitioned the LSR vegetation data into distinct community types: mesophytic upland forest and xerophytic upland forest. This latter type was secondarily differentiated into an unequivocally anthropogenic landscape (Iroquoian agricultural mosaic) and a series of fire-tolerant forest and savanna communities with possible connections to silvicultural land-use practices. Distance analysis of ordination scores indicated statistically-significant spatial trends associated with the distribution of archaeological sites, with disturbance most heavily concentrated within 6 km of most sites. Given the success of this methodology, we recommend that this integrated approach become the standard for LSR-based research of Native American vegetation disturbance.
... relative metrics (i.e. relative ranks) of taxon abundance obtained with taxon lists are more reliable than absolute metrics (Terrail et al. 2014;Larsen, Tulowiecki, Wang, & Trgovac, 2015). In this study, we use a dataset comprising 22555 tree taxon lists over an area of 8910 km 2 to reconstruct changes in position of relative order of prevalence for the principal tree taxa as a consequence of maple and poplar expansion. ...
... reconstruction based on taxa basal area) and early land survey taxon lists (Terrail et al. 2014). This showed that taxon lists are highly accurate for reconstructing pre-settlement composition, and particularly when using relative metrics (Terrail et al. 2014;Larsen et al. 2015). Thus, we computed relative prevalence positions of the eight retained taxa in order to describe the vegetation for each period. ...
Article
Full-text available
How has European settlement of Eastern North America modified tree species assemblages?. The northern temperate forests of the Lower St. Lawrence region (Québec, Canada). Changes in relative prevalence of tree taxa were reconstructed with early land survey records (1821‐1900) and modern forest inventories (1980‐2010). Forest composition reconstructions were then used to analyse changes in tree taxa assemblages at the landscape scale and test for potential landscape homogenization. Our results show important maples (Acer saccharum and A. rubrum) and poplar (Populus tremuloides and P. balsamifera) encroachment, shifting from the 6th to the 2nd positions of relative prevalence and from the 7th to the 5th positions, respectively, resulting in a significant shift in tree assemblage. Maples have spread throughout the whole landscape and have tended to become the most abundant taxa in community where it was already present in pre‐settlement times. Poplars also widely spread throughout the landscape but rarely became the most abundant taxa. Accordingly, deciduous encroachment clearly engendered a spatial homogenization of composition at the landscape scale. Considering that both red maple and trembling aspen are opportunist early‐successional species, the increased relative prevalence of both species, as well as the consequent reorganization of tree taxa assemblages and landscape homogenization probably, resulted from the regional convergence toward an early successional state. Along with restoration of long‐lived shade‐tolerant conifer populations, land and forest managers should aim to increase heterogeneity of forest stand composition to improve forests resilience to future global changes. This article is protected by copyright. All rights reserved.
... Researchers like Fagin and Hoagland (2011), He et al. (2006), Larsen et al. (2015), Puric-Mladenovic (2003), and Tulowiecki (2014) and have used bearing tree and/or line description data from the GLO survey to create species distribution models. ...
Thesis
Full-text available
Restoration and management of ecologically important sites depend on an understanding of reference conditions and the ability of people to return the site to those historic conditions. Historical ecology research sifts through the data about a site to be able to offer restoration options to land managers. This project demonstrates transitions in natural communities of a protected area in East Central Florida: Split Oak Forest. Natural communities are defined based on the General Land Office (GLO) survey maps and notes and applied to historical black and white aerial photos, modern digital orthophotos, and high resolution satellite imagery. Because of the channelization of the Kissimmee River and the subsequent draining of the Everglades from 1883 onward, Split Oak, like other areas whose surroundings have been drained, cannot be returned to the conditions at the time of the GLO survey. Thus, a detailed time series of eight snapshots over 171 years will be valuable to land managers and restoration ecologists working in sites that share the hydrologically-modified Northern Everglades watershed with Split Oak. Natural community descriptions gleaned from the surveyors maps and notes and their application to current land cover are a potential backbone to future historical ecology in the southeast. Seasonally re-hydrating drained wetlands is a priority in this watershed, and is supported by cost-share funding from the State of Florida. This research affirms that most grassy wetlands on the site have transitioned to upland communities. Most of the remaining marshes have been invaded by woody plants and swamps extended their boundaries. Sandhill was used for orange (Citrus x sinensis) culture and, along with scrub and flat pine, transitioned to hammock.
Article
American chestnut (Castanea dentata) once held great economic and ecological importance in eastern US forests before its demise due to an invasive blight fungus. Backcross breeding and genetic engineering methods are currently developing a blight-resistant tree with mostly American chestnut traits. With the potential re-introduction of chestnut, research has sought to understand the geographic distribution of chestnut to locate potential restoration sites, but less research has compared ideal restoration sites to underlying land ownership. This research models the historical distribution of chestnut in western New York State (NYS, approximately 27,617 km²), containing a portion of the original range of chestnut, in order to determine suitable areas for chestnut reintroduction. This study models chestnut distribution using original land survey record (OLSR) data (ca. 1797-1799 CE) and species distribution models (SDMs), then compares model predictions to current protected lands. Results indicate that, depending upon modeling technique, predicted suitable habitat for chestnut ranges from 27.9 to 49.7% of the study area, and that 8.0-11.5% of suitable area is within protected land parcels. SDMs suggest that within the study area, the two predictors most important to chestnut distribution are soil pH and terrain slope, with chestnut favoring acidic soils and steeper slopes. By identifying sites for potential re-introduction of chestnut, this study highlights that reintroduction will depend upon cooperation of private landowners along with governments and non-governmental agencies. This study offers a revision to the historical distribution of chestnut in western NYS, and provides insight into land ownership and management issues facing its restoration.
Article
Predicting future ecosystem dynamics depends critically on an improved understanding of how disturbances and climate change have driven long-term ecological changes in the past. Here we assembled a dataset of >100,000 tree species lists from the 19th century across a broad region (>130,000 km²) in temperate eastern Canada, as well as recent forest inventories, to test the effects of changes in anthropogenic disturbance, temperature and moisture on forest dynamics. We evaluate changes in forest composition using four indices quantifying the affinities of co-occurring tree species with temperature, drought, light and disturbance. Land-use driven shifts favouring more disturbance-adapted tree species are far stronger than any effects ascribable to climate change, although the responses of species to disturbance are correlated with their expected responses to climate change. As such, anthropogenic and natural disturbances are expected to have large direct effects on forests and also indirect effects via altered responses to future climate change.
Article
Fine scale spatial mapping of historical tree records over large extents is important for determining historical species distributions. We compared the performance of two ensemble methods based on classification trees, random forests and boosted classification, for mapping continuous historical distributions of tree species. We used a combination of soil and terrain predictor variables to predict species distributions for 21 tree species, or species groups, from historical tree surveys in the Missouri Ozarks. Mean true positive rates and AUC values of all species combined for random forests and boosted classification, at a modeling prevalence and threshold of 0.5, were similar and ranged from 0.80 to 0.84. Although prediction probabilities were correlated (mean r = 0.93), predicted probabilities from random forests generated maps with more variation within subsections, whereas boosted classification was better able to differentiate the restricted range of shortleaf pine. Both random forests and boosted classification performed well at predicting species distributions over large extents. Comparison of species distributions from two or more statistical methods permits selection of the most appropriate models. Because ensemble classification trees incorporate environmental predictors, they should improve current methods used for mapping historical tree species distributions and increase the understanding of historical distributions of species.
Article
Researchers have long utilized the vegetation data within presettlement land survey records (PLSRs) to understand past forest composition in North America. PLSRs typically contain two datasets: bearing-tree (BT) data and line-description (LD) data. BT data are records of the trees that surveyors blazed adjacent to survey monuments, whereas LD data provide descriptions of tree species that surveyors observed along survey lines. Recently, studies have applied BT data to develop species distribution models (SDMs). SDMs create predictions of species distributions, based upon the modeled relationship between species presence and absence records, and environmental variables. Despite the applications of BT data in SDMs, the value of LD data for developing SDMs has not been explored. This study compares SDMs trained from LD data versus BT data, using PLSRs that were created ca. 1799–1814 CE in Chautauqua County, New York State. Using consensus modeling techniques, this study finds that despite positional uncertainty issues, LD data produce SDMs with better predictive performance than BT data, and more adequately generalize to independent datasets. Moreover, a comparable amount of data can be collected from LD data as from BT data, in order to develop models with greater predictive ability. This study challenges the use of BT data in SDMs, and suggests that modeling past species distributions can be accomplished more effectively using LD data.
Article
The amount of forest compositional change that occurred due to Euro-American settlement over the past two centuries is compared with changes simulated to occur in the future under 2X and 3.5X atmospheric CO2 scenarios. The comparison employs data from presettlement land survey records, modern forest inventory data, and future predictions from niche-based species distribution models. Comparisons are made in four independent study areas in western Pennsylvania and New York. Forest compositional changes in the recent past, attributed largely to anthropogenic factors other than climate change, are intermediate in size to changes predicted to occur as the result of climate change under the 2X CO2 and 3.5X CO2 scenarios. Results are similar across the four study areas, and are robust to variations in data collection and compilation methods. These results disagree with previous pollen-based estimates that suggested a greater relative influence of a 2X CO2 climate change, but do indicate that a 3.5X CO2 climate change may cause greater changes in forest composition than have already occurred due to anthropogenic impacts.
Article
Questions: Do early land survey records of the ‘line description’ type allow accurate reconstructions of pre-settlement forest composition? Did surveyors record all tree taxa in forest stands encountered along the surveyed lines? Were taxa ranked according to their relative importance in forest stands? What criteria did surveyors use to rank taxa in stands? Location: Northern range limit of northern hardwoods, Lower St. Lawrence region, eastern Québec, Canada. Methods: Validation of 1695 taxon lists recorded by surveyors in the 19th century through comparison of the number of stems by tree species and stem diameter classes recorded in 2790 old-growth plots over the same two regions during a 1930 forest inventory. Results: Taxon prevalence and dominance (i.e. proportion of observations for which each taxon is dominant) are highly correlated between the pre-settlement surveys and the 1930 forest inventory data sets. Surveyors ranked taxa in decreasing order of relative importance, using criteria directly equivalent to basal area of stems in modern forest inventory plots. Taxon prevalence is more accurately reconstructed using relative metrics (i.e. ranks of taxon prevalence in a region), whereas taxon dominance is more accurately reconstructed using absolute metrics (percentage of dominant stands across landscapes). The early land surveys allow spatial patterns of forest composition to be reconstructed by computing relative taxon prevalence in cells of 3 km × 3 km. Prevalence of balsam fir (Abies balsamea) and white birch (Betula papyrifera) is underestimated in survey data, probably reflecting their low economic value in the 19th century. Conclusions: Taxon lists of early surveyors can accurately reconstruct pre-settlement forest composition and spatial patterns using metrics of taxon prevalence and dominance across landscapes. Relative prevalence is a more comprehensive description of forest composition than dominance, but tends to underestimate some taxa. Absolute taxon dominance is a more robust metric than prevalence, but only reports on the abundance of the most dominant taxa.
Article
In the general descriptions of Ireland written in Elizabethan and early Stuart times there are constant, although casual, references to the woodlands. Moryson, Perrott, Bagenal, Speede and Boate all allude to areas which were wooded or carried woody scrub on bog. Their descriptions are too general to be of use in assessing the probable extent of the woodland that remained at the end of the sixteenth century, but they are pointers to the distribution. The same is true of contemporary maps, although they are rather more helpful in that, in spite of their distortion of distance and configuration, they may indicate the position of a wood relative to physical features such as hills, rivers or bays.
Article
This book traces the history of ecological changes in the North American landscape from 1500, the period of European settlement, up to modern times. Using land surveys and early travellers' records, the author reconstructs the "virgin" forests of the northeastern and central US during the presettlement period. It documents the clearance and fragmentation of the region's forests, the harvesting of woodland for timber and game, the ploughing of prairies, and the draining of wetlands. The degree to which human activity altered the soil, climate, flora, fauna, and water cycle to produce modern forest, farm, and urban ecosystems is a central theme of this book. The 16 chapters outline methods of historical reconstruction, give a background to the "virgin" environment, and describe how each type of human interference altered the landscape. -from Publisher
Article
Old-growth forests in the American West typically represent fragments of former, more extensive forests that were subjected to nineteenth and twentieth century land-clearing activities, such as logging. These present-day forest fragments are thought to be representative of the former landscape, and thus are capable of serving as living references for guiding restoration of logged forests. Yet how do we determine the extent to which existing old-growth stands represent the former forest, especially when little of the surrounding original vegetation remains? Historic land surveys conducted prior to significant logging can reconstruct the former forest at the stand level, thereby allowing an analysis of old-growth patches within the larger historic landscape. This study utilized original Public Land Surveys to assess the applicability of old-growth stands in Redwood National Park as reference ecosystems. A geographic information system (GIS) and statistical analysis of the nineteenth century forest found that vegetation communities, woody species composition, and ratios of dominant canopy species in unlogged patches were highly representative of the forests that were logged. Significance testing (H₀: μ₁ = μ₂) revealed p-values greater than 0.10000 in all measures of community and species composition, except for the higher abundance of oak in present-day old-growth (p-value = 0.0395). The results of this study suggest that the national park should increase efforts to protect old-growth reference ecosystems from further human impacts, and minimize ongoing degradation from edge effects by prioritizing restoration of adjoining second-growth forest.