Regional mixed-effects height–diameter models for loblolly pine (Pinus taeda L.) plantations

Article in European Journal of Forest Research 126(2):253–262 · April 2007
Impact Factor: 2.10 · DOI: 10.1007/s10342-006-0141-7
Abstract

A height–diameter mixed-effects model was developed for loblolly pine (Pinus taeda L.) plantations in the southeastern US. Data were obtained from a region-wide thinning study established by the Loblolly Pine Growth and Yield Research Cooperative at Virginia Tech. The height–diameter model was based on an allometric function, which was linearized to include both fixed- and random-effects parameters. A test of regional-specific fixed-effects parameters indicated that separate equations were needed to estimate total tree heights in the Piedmont and Coastal Plain physiographic regions. The effect of sample size on the ability to estimate random-effects parameters in a new plot was analyzed. For both regions, an increase in the number of sample trees decreased the bias when the equation was applied to independent data. This investigation showed that the use of a calibrated response using one sample tree per plot makes the inclusion of additional predictor variables (e.g., stand density) unnecessary. A numerical example demonstrates the methodology used to predict random effects parameters, and thus, to estimate plot specific height–diameter relationships.

ORIGINAL PAPER
Guillermo Trincado · Curtis L. VanderSchaaf · Harold E. Burkhart
Received: 12 November 2005 / Accepted: 28 April 2006 / Published online: 22 August 2006
© Springer-Verlag 2006
Keywords Height–diameter relationship · Forest inventory · Linear mixed-effects model · Pinus taeda
Introduction
An accurate estimate of total tree height is often required in forest management and research. For example, estimation of total tree height is an important component of forest inventories for predicting total and merchantable tree volume (Soares and Tomé 2002). Bucking systems based on taper equations require total tree height for an accurate estimation of multi-product recovery (Epstein et al. 1999), and growth and yield models often predict stand productivity using a measure of dominant height or site index (Avery and Burkhart 2002, p. 313). Furthermore, height–diameter equations are also used in the analysis of permanent plots and in growth and yield studies (Assman 1970, p. 146; Burkhart et al. 1972; Lynch and Murphy 1995).
Since measuring individual total tree heights is time consuming and thus costly, often only a sub-sample is measured. These trees are then used to fit a local stand height–diameter equation (Zhang et al. 2002) that predicts height for those trees that only had diameter measured in the field (Martin and Flewelling 1998; Huang et al. 1992). An alternative approach is to use a more generalized or region-wide equation. Rather than fitting stand-specific height–diameter equations, generalized models estimate height using dbh while accounting for between-stand variability by including stand-specific regressors. Several generalized and region-wide equations have been developed recently for many tree species (Schöeder and Álvarez González 2001; López Sánchez et al. 2003; Sharma and Zhang 2004; Temesgen and Gadow 2004; Castedo Dorado et al. 2005).
In forestry, mixed-effects models are increasingly being used to model stem form variation (Lappi 1986), dominant height growth (Lappi and Bailey 1988; Fang and Bailey 2001), volume (Lappi 1991; Eerikäinen 2001), cumulative bole volume using volume-ratio equations (Gregoire and Schabenberger 1996a, b), thinning effects on stem profiles (Tasissa and Burkhart 1998) and timber yield predictions (Hall and Clutter 2004). Mixed models have also been used to model the height–diameter relationship. For instance, linear mixed models were used by Lappi (1991) for Scots pine (Pinus sylvestris L.), Jayaraman and Zakrzewski (2001) for sugar maple (Acer saccharum Marsh), Mehtätalo (2004) for Norway spruce (Picea abies (L.) Karst), and by Lynch et al. (2005) for cherrybark oak (Quercus pagoda Raf). Calama and Montero (2004) used nonlinear mixed-effects models to predict stone pine (Pinus pinea L.) heights.

Communicated by Hans Pretzsch
G. Trincado (✉) · C. L. VanderSchaaf · H. E. Burkhart
Department of Forestry, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA
E-mail: gtrincad@vt.edu
Tel.: +1-540-2313596
Fax: +1-540-2313698
In contrast to traditional regression techniques, mixed-effects models allow for both population-specific and cluster-specific models (e.g., Verbeke and Molenberghs 1997). A population-specific model considers only fixed parameters and a cluster-specific (e.g., a particular plot) model considers both fixed and random parameters. This characteristic makes mixed-effects models more efficient when a prediction for a new individual is required and prior information is available. In forest inventory the use of a regional mixed-effects model can be highly beneficial, because a sample of trees measured for total height can be used to calibrate the height–diameter curve at plot level. The process of calibration can increase the predictive capability of a height–diameter model and possibly eliminate the need for additional predictor variables.
The objectives of this research were to (1) model re-
gional height–diameter relationships for loblolly pine
(Pinus taeda L.) using a linear mixed-effects model, (2)
determine if separate equations for the Piedmont and
Coastal Plain physiographic regions are needed, (3)
quantify how the number of sub-sample trees used for
plot-specific calibration affects the estimates of random-
effects parameters, and (4) determine if a variable rep-
resenting stand density is required in the model when
using a calibrated response.
Materials and methods
Data
Data for this research were obtained from a region-wide thinning study maintained by the Loblolly Pine Growth and Yield Research Cooperative at Virginia Polytechnic Institute and State University. This study comprises 186 sites located throughout the Piedmont and Gulf and Atlantic Coastal Plain physiographic regions of the southeastern US. Study installation began in the dormant seasons of 1980–1981 and 1981–1982 in stands of different ages, site indexes, and densities on cutover, site-prepared areas following protocols described by Burkhart et al. (1985). Three plots with similar spacing, basal area, and site index were established at each location. Each plot was randomly assigned to a thinning treatment: (1) unthinned or control, (2) light thinning (approximately 1/3 basal area removed) and (3) heavy thinning (around 1/2 basal area removed). After installation, plots were remeasured at 3-year intervals. After the fourth measurement, some of the thinned plots received a second thinning treatment.
All measurement ages, with the exception of those occurring after the second thinning treatment that was only conducted in some plots, were used to fit the presented model. Thus, for those plots receiving a second thinning treatment, only the first through fourth measurements were used in model fitting. The final data set consisted of 1,384 and 902 measurements for the Coastal Plain and Piedmont physiographic regions, respectively. A similar procedure for generating height–diameter data from the same plots was used by Zhang et al. (1997). Within a given plot and measurement age, only planted loblolly pines that were unforked and did not have a broken top were selected as sample trees. Within each physiographic region, 70% of the plot measurements were randomly assigned to a model fitting dataset while the remaining 30% were assigned to a model validation dataset (Table 1).
Model development
A mathematical expression used for modeling the height–diameter relationship is a power function, otherwise known as an allometric function (e.g., Arabatzis and Burkhart 1992; Eerikäinen 2003):

$$h = \beta_0 d^{\beta_1}, \tag{1}$$

where $h$ is individual total tree height and $d$ is the dbh. The parameter $\beta_0$ controls the rate of increase and parameter $\beta_1$ adjusts the shape of the curve. This equation can be conditioned to estimate total tree height as 1.37 m when $d = 0$, producing the following form:

$$h = 1.37 + \beta_0 d^{\beta_1}. \tag{2}$$
Table 1 Descriptive statistics of the data in the model fitting dataset (70%) and the model validation dataset (30%) by physiographic region and across both regions combined

| Dataset | Region | No. plots | No. trees | Age mean (years) | Age SD (years) | dbh mean (cm) | dbh SD (cm) | Total height mean (m) | Total height SD (m) |
|---|---|---|---|---|---|---|---|---|---|
| Fit | Combined | 1,600 | 84,500 | 22.9 | 6.3 | 19.4 | 5.5 | 16.0 | 4.1 |
| Fit | Coastal Plain | 969 | 50,782 | 22.5 | 6.4 | 19.5 | 5.7 | 16.1 | 4.3 |
| Fit | Piedmont | 631 | 33,718 | 23.5 | 6.2 | 19.4 | 5.3 | 15.9 | 5.3 |
| Validation | Combined | 686 | 36,813 | 23.4 | 6.5 | 19.4 | 5.5 | 16.1 | 4.3 |
| Validation | Coastal Plain | 415 | 21,877 | 23.2 | 6.7 | 19.6 | 5.7 | 16.3 | 4.4 |
| Validation | Piedmont | 271 | 14,936 | 23.6 | 6.1 | 19.1 | 5.2 | 15.7 | 4.0 |

Other studies have used the same constrained nonlinear equation for modeling the height–diameter relationship (e.g., Huang et al. 1992; Hui and Gadow 1993). This equation can easily be linearized by taking logarithms of both sides:

$$\ln(H) = \ln(\beta_0) + \beta_1 \ln(d), \tag{3}$$
where $H = h - 1.37$. A model containing both fixed and random effects and having the same structure as in Eq. (3) can be expressed as:

$$\ln(H_{ki}) = \beta_0 + \beta_1 \ln(d_{ki}) + b_{0k} + b_{1k} \ln(d_{ki}) + \varepsilon_{ki}, \tag{4}$$
where $H_{ki}$ is the height of tree $i$ above 1.37 m in plot $k$, $d_{ki}$ is the dbh of tree $i$ in plot $k$, $\beta_0$, $\beta_1$ are the fixed-effects parameters (or the population average parameters) and $b_{0k}$, $b_{1k}$ are the random-effects parameters (or the cluster-specific parameters) of plot $k$. The mixed-effects model can be expressed in the same general form as given in Eq. (14), where the matrices involved take the following forms:

$$Y_k' = \begin{bmatrix} \ln(H_{k1}), & \ldots, & \ln(H_{kn_k}) \end{bmatrix},$$

$$X_k' = Z_k' = \begin{bmatrix} 1 & \cdots & 1 \\ \ln(d_{k1}) & \cdots & \ln(d_{kn_k}) \end{bmatrix},$$

$$\beta' = \begin{bmatrix} \beta_0 & \beta_1 \end{bmatrix}, \qquad b_k' = \begin{bmatrix} b_{0k} & b_{1k} \end{bmatrix}.$$
In this analysis all parameters were considered as mixed, i.e., $X_k' = Z_k'$, and the following variance–covariance structures for the random parameters and the random errors were assumed (see Eq. 14):

$$D = \mathrm{Var}[b_k] = \begin{bmatrix} \mathrm{Var}(b_0) & \mathrm{Cov}(b_0, b_1) \\ \mathrm{Cov}(b_0, b_1) & \mathrm{Var}(b_1) \end{bmatrix}$$

and

$$R_k = \mathrm{Var}[\varepsilon_k] = \sigma^2 I_{n_k}.$$

Under this variance structure random errors are assumed to be uncorrelated and to have constant variance ($\sigma^2$). Parameter estimates were obtained using PROC MIXED from SAS Institute Inc. (Littell et al. 1996).
Testing for regional-specific parameters
To determine if separate equations are needed for the Piedmont and Coastal Plain physiographic regions, Eq. (4) was modified by including a dummy variable for the intercept and slope terms, producing the following form:

$$\ln(H_{ki}) = \beta_0 + \beta_1 \ln(d_{ki}) + \beta_2 Z + \beta_3 Z \ln(d_{ki}) + b_{0k} + b_{1k} \ln(d_{ki}) + \varepsilon_{ki}, \tag{5}$$

where $Z = 0$ if a plot is from the Piedmont physiographic region and $Z = 1$ if a plot is from the Coastal Plain physiographic region. If $\beta_2$ and $\beta_3$ are not significant, then observations from each physiographic region can be combined and Eq. (5) collapses to Eq. (4). The variance components of the combined model (Eq. 5) were obtained using restricted maximum likelihood (REML).
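The role of the dummy variable in Eq. (5) can be illustrated with a short Python sketch. The function name is hypothetical, and the numeric values are the fixed-effects estimates reported for the combined model; the second coefficient is entered with a negative sign, which is an assumption made here because it reproduces the Coastal Plain intercept reported for the separate regional fit.

```python
def regional_fixed_effects(beta, coastal_plain):
    """Return the (intercept, slope) implied by Eq. (5) for one region.

    beta          -- (beta0, beta1, beta2, beta3) from the combined model
    coastal_plain -- True for Z = 1 (Coastal Plain), False for Z = 0 (Piedmont)
    """
    b0, b1, b2, b3 = beta
    z = 1 if coastal_plain else 0
    return b0 + b2 * z, b1 + b3 * z


# Combined-model estimates; the sign of beta2 is an assumption (see lead-in).
beta_combined = (1.4684, 0.4099, -0.0659, 0.0287)

piedmont = regional_fixed_effects(beta_combined, coastal_plain=False)
coastal = regional_fixed_effects(beta_combined, coastal_plain=True)
print("Piedmont intercept, slope:     ", piedmont)
print("Coastal Plain intercept, slope:", coastal)
```

When both dummy coefficients are zero the two regions share one intercept and one slope, which is the sense in which Eq. (5) collapses to Eq. (4).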
Statistical validation
The developed model was evaluated using (1) only the fixed-effects response (mean response) and (2) a more complete model by estimating random parameters (calibrated response) for each plot of the validation dataset (Table 1). Estimation of random parameters (calibration) for a particular plot depends on the number of trees selected and the amount of inherent variability among trees within a plot. Therefore, for evaluating the calibration process, one to three trees were randomly selected from each plot. This procedure was repeated ten times in each plot to account for the variability among trees within a plot. The validation process followed the methodology proposed by Arabatzis and Burkhart (1992). The difference between the observed and predicted total tree height ($e_{ki} = h_{ki} - \hat{h}_{ki}$) for each individual tree ($i$) within a plot ($k$) was calculated. Then, for each plot, the mean residual ($\bar{e}_k$) and the sample variance ($v_k$) of residuals were computed. They were considered to be estimates of bias and precision, respectively. An estimate of mean square error ($MS_k$) was obtained by combining the bias and precision measures using the following formula:

$$MS_k = \bar{e}_k^2 + v_k. \tag{6}$$
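The per-plot validation statistics leading to Eq. (6) can be sketched in Python as follows. This is an illustration only: the function name and the example heights are hypothetical, and the sample variance is assumed to use the n - 1 divisor.

```python
import numpy as np


def plot_validation_stats(h_obs, h_pred):
    """Return (bias, precision, mean square error) for a single plot."""
    residuals = np.asarray(h_obs, dtype=float) - np.asarray(h_pred, dtype=float)
    bias = residuals.mean()            # mean residual of the plot
    precision = residuals.var(ddof=1)  # sample variance of residuals (n - 1 divisor assumed)
    ms = bias ** 2 + precision         # Eq. (6): squared bias plus variance
    return bias, precision, ms


# Hypothetical observed and predicted total heights (m) for one plot.
h_obs = [21.0, 17.7, 17.7, 19.2]
h_pred = [20.1, 17.0, 17.9, 18.4]

bias, precision, ms = plot_validation_stats(h_obs, h_pred)
print(f"bias = {bias:.3f} m, precision = {precision:.3f}, "
      f"MS = {ms:.3f}, root MS = {np.sqrt(ms):.3f} m")
```

Averaging these three quantities over plots (and, for the calibrated response, over the ten repetitions) gives region-level summaries of the kind described in the text.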
For those estimations carried out using only the fixed-effects part of the model, measurements of bias, precision, and error were calculated for each plot and a mean value over all the plots was computed. However, for those predictions using estimated random parameters, the same statistical measures were averaged across the ten repetitions by plot for each of the three sample sizes.
An analysis of residuals was performed to determine the effects of stand density (basal area) on both the mean and calibrated responses.
Results and discussion
Mixed-effects height–diameter model
For the combined model all parameter estimates were significant (Table 2). Therefore, separate equations for the two physiographic regions were needed (Table 3). An analysis of residuals showed that predictions based on the fixed-effects model were biased with respect to basal area (Fig. 1). However, the use of the random-effects parameters adjusted the predictions for local conditions, producing unbiased predictions with respect to basal area, even though this variable is not explicitly incorporated in the model (Eq. 4). The incorporation of additional predictor variables has a major effect on the ability of the fixed-effects model to explain between-individual variability, but certainly not on the mixed-effects model. In this application, we assumed that random-effects parameters will be estimated and a mixed-effects model used. Otherwise the use of a fixed-effects model would likely require including additional predictor variables. The use of a mixed-effects model in forest inventory through a sub-sample of trees for height measurement allows maintenance of a simple model structure without including additional predictor variables.
The validation analyses indicated that higher accuracy in predicting total tree height was observed when random parameters were estimated (Table 4). The root mean square error ($\sqrt{MS}$) decreased when the number of sample trees for calibration increased. However, the error reduction in both regions was mainly due to a reduction in bias since the variance remained relatively constant as sample size increased. The greatest change in accuracy occurred when a single tree was used for calibration, indicating only a marginal gain when more sample trees were used.
The predictions of total tree heights using a mean
response (fixed-effects parameters) were as expected
biased with respect to basal area (Fig. 2). However, the
selection of one to three trees for measurement of total
height permitted estimation of random-effects parame-
ters and unbiased predictions of tree heights with respect
to basal area. These results demonstrate that the use of a
calibrated response does not require incorporating into
the model additional predictor variables representing
stand density.
The utility of this technique in forest inventory is that
in each plot only one height measurement is required in
order to obtain unbiased predictions using a simple
height–diameter model.
Prediction of total tree heights—an example
In comparison to conventional regression techniques, the mixed-effects height–diameter model permits the estimation of a mean response (population-specific) and/or a calibrated response (cluster-specific) for a new plot. The prediction of total tree heights using both types of responses is explained and demonstrated using parameter estimates from the Coastal Plain physiographic region (Table 3).
Mean response (only fixed-effects parameters)
Table 2 Estimated parameters and fit statistics for the combined data set (Coastal Plain/Piedmont physiographic regions) using Eq. (5)

| Parameter | Estimate | SE | t-value | P-value |
|---|---|---|---|---|
| $\beta_0$ | 1.4684 | 0.01964 | 74.78 | < 0.0001 |
| $\beta_1$ | 0.4099 | 0.00502 | 81.74 | < 0.0001 |
| $\beta_2$ | −0.0659 | 0.02518 | −2.62 | 0.0088 |
| $\beta_3$ | 0.0287 | 0.00642 | 4.47 | < 0.0001 |
| Variance components (a) | | | | |
| Var($b_0$) | 0.2136 | 0.00842 | 25.36 | < 0.0001 |
| Var($b_1$) | 0.0126 | 0.00054 | 23.56 | < 0.0001 |
| Cov($b_0$, $b_1$) | −0.0486 | 0.00206 | −23.59 | < 0.0001 |
| $\sigma^2$ | 0.00568 | 0.00003 | 201.73 | < 0.0001 |
| Goodness-of-fit | | | | |
| −2LL (smaller better) | −185481 | | | |
| AIC (smaller better) | −185473 | | | |

(a) Asymptotic standard errors (SE) and test statistics based on Wald–Z tests

Table 3 Estimated parameters and fit statistics for the Coastal Plain and Piedmont physiographic regions using Eq. (4)

| Parameter | Coastal Plain estimate | SE | t-value | P-value | Piedmont estimate | SE | t-value | P-value |
|---|---|---|---|---|---|---|---|---|
| $\beta_0$ | 1.4027 | 0.01611 | 87.07 | < 0.0001 | 1.4683 | 0.01894 | 77.53 | < 0.0001 |
| $\beta_1$ | 0.4386 | 0.00402 | 109.18 | < 0.0001 | 0.4100 | 0.00499 | 82.20 | < 0.0001 |
| Variance components (a) | | | | | | | | |
| Var($b_0$) | 0.2243 | 0.01137 | 19.73 | < 0.0001 | 0.1970 | 0.01236 | 15.95 | < 0.0001 |
| Var($b_1$) | 0.0127 | 0.00070 | 18.21 | < 0.0001 | 0.0125 | 0.00083 | 14.93 | < 0.0001 |
| Cov($b_0$, $b_1$) | −0.0499 | 0.00273 | −18.28 | < 0.0001 | −0.0466 | 0.00312 | −14.92 | < 0.0001 |
| $\sigma^2$ | 0.00575 | 0.00004 | 156.32 | < 0.0001 | 0.00559 | 0.00004 | 127.5 | < 0.0001 |
| Goodness-of-fit | | | | | | | | |
| −2LL (smaller better) | −110773 | | | | −74730 | | | |
| AIC (smaller better) | −110765 | | | | −74722 | | | |

(a) Asymptotic standard errors (SE) and test statistics based on Wald–Z tests

A mean response for a new plot can be obtained using only the fixed-effects component of Eq. (4), i.e., $\beta_0$ and $\beta_1$. Thus, this equation can be expressed as:

$$\ln(H) = 1.4027 + 0.4386 \ln(d). \tag{7}$$

However, in order to get an estimate of height ($h$), the predicted value from Eq. (7) must first be transformed back to the original units (m). Additive error terms in log–log models become multiplicative when transformed back to the original scale and thus this bias must be accounted for. An unbiased estimate can be obtained by adding a correction factor as proposed by Baskerville (1972):

$$\ln(H) = 1.4027 + 0.4386 \ln(d) + 0.002875. \tag{8}$$

This correction factor was computed using the estimated error variance found in Table 3 for the Coastal Plain region, $\sigma^2/2$ or (0.00575/2). Following this, Eq. (8) can be converted to arithmetic units, isolating $h$ on the left-hand side:

$$h = 1.37 + 4.0779\, d^{0.4386}. \tag{9}$$

This equation allows for a mean response for total tree height. The curve obtained is presented in Fig. 3a for a 30-year-old plot located in the Coastal Plain region, where all trees have been measured for dbh and total tree height (36 observations). Clearly, a severely biased height–diameter curve that underestimates observed tree heights for this plot is obtained when only using a mean response.
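The back-transformation from Eq. (7) to Eq. (9) can be reproduced with a few lines of Python. This is a sketch under the stated Coastal Plain estimates; the function and variable names are illustrative and not part of the original analysis.

```python
import math

BETA0, BETA1 = 1.4027, 0.4386  # Coastal Plain fixed-effects estimates
SIGMA2 = 0.00575               # Coastal Plain residual variance


def mean_response_height(dbh_cm, beta0=BETA0, beta1=BETA1, sigma2=SIGMA2):
    """Predict total height (m) from dbh (cm) using the fixed effects only."""
    # Eq. (8): log-scale prediction plus the sigma^2 / 2 correction factor.
    ln_H = beta0 + beta1 * math.log(dbh_cm) + sigma2 / 2.0
    # Back-transform and add the 1.37 m removed when defining H = h - 1.37.
    return 1.37 + math.exp(ln_H)


# Multiplicative coefficient of Eq. (9): exp(beta0 + sigma^2 / 2).
coefficient = math.exp(BETA0 + SIGMA2 / 2.0)
print(f"coefficient = {coefficient:.4f}")
print(f"mean-response height at dbh 20 cm = {mean_response_height(20.0):.2f} m")
```

The printed coefficient should agree with the 4.0779 of Eq. (9) to rounding, which is a quick check that the correction factor was applied on the log scale before exponentiating.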
Calibrated response (fixed- and random-effects
parameters)
Mixed-effects models allow for the mean response to be calibrated for an individual plot by estimating random parameter components for Eq. (4), i.e., $b_{0k}$ and $b_{1k}$. The use of a calibrated response rather than a population mean response produced an increase in the accuracy of predicted tree heights for a given plot (Table 4). In order to estimate random parameters, the calibration process requires observed height–diameter data. From the 30-year-old plot located in the Coastal Plain region mentioned above, three sample trees were randomly selected for measuring dbh ($d$) in cm and total tree height ($h$) in m. The pairs of measurements ($d$, $h$) were: (25.9, 21.0), (16.5, 17.7) and (18.5, 17.7). These sample trees are represented by black dots in Fig. 3b. Prediction of random parameters is accomplished using formula (16). Estimated variances and the covariance of the parameters are given in Table 3. Thus, the variance–covariance matrix of random coefficients is:

$$\mathrm{Var}[b] = D = \begin{bmatrix} 0.2243 & -0.0499 \\ -0.0499 & 0.0127 \end{bmatrix}$$

Fig. 1 Dispersion of residuals against basal area for the fixed- and mixed-effects models for the Coastal Plain and Piedmont physiographic regions

Table 4 Mean prediction bias ($\bar{e}$), mean precision ($v$), and mean root mean square error ($\sqrt{MS}$) in meters across all plots for three sample sizes used to estimate random effects (n = 415 for Coastal Plain and n = 271 for the Piedmont)

| Region | Number of sample trees | Bias $\bar{e}$ | Precision $v$ | Error $\sqrt{MS}$ |
|---|---|---|---|---|
| Coastal Plain | 0 (a) | 0.507 | 1.096 | 2.912 |
| Coastal Plain | 1 | 0.099 | 1.095 | 1.359 |
| Coastal Plain | 2 | 0.051 | 1.095 | 1.228 |
| Coastal Plain | 3 | 0.024 | 1.096 | 1.170 |
| Piedmont | 0 (a) | 0.072 | 0.959 | 2.202 |
| Piedmont | 1 | 0.042 | 0.953 | 1.260 |
| Piedmont | 2 | 0.012 | 0.956 | 1.134 |
| Piedmont | 3 | 0.004 | 0.958 | 1.089 |

(a) Only the fixed-effects part of model (4) was used for total tree height prediction
The variance–covariance matrix for the random error term must also be determined. As specified earlier, all observations are assumed to have constant variance $\sigma^2$ and errors are assumed to be uncorrelated. Thus, the estimated variance–covariance matrix, $\mathrm{Var}[\varepsilon_k] = R_k$, can be expressed as:

$$R_k = 0.00575 \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \quad \text{or} \quad R_k = 0.00575\, I_3,$$

where $I_3$ is the identity matrix with dimension (3 × 3) equal to the number of sample trees used for calibration. Then, according to the definition for the dependent and independent variables as specified in Eq. (4), the matrices $Y_k$, $X_k$, and $Z_k$ are

$$Y_k = \begin{bmatrix} \ln(21.0 - 1.37) \\ \ln(17.7 - 1.37) \\ \ln(17.7 - 1.37) \end{bmatrix} \quad \text{and} \quad X_k = Z_k = \begin{bmatrix} 1 & \ln(25.9) \\ 1 & \ln(16.5) \\ 1 & \ln(18.5) \end{bmatrix}$$

The expression $Y_k - X_k\beta$ represents the difference between the observed values and the estimated mean responses using only the fixed-effects parameter estimates from Eq. (7). Thus, the vector of residuals is expressed as:

$$Y_k - X_k\beta = \begin{bmatrix} 2.98 - (1.4027 + 0.4386 \ln(25.9)) \\ 2.79 - (1.4027 + 0.4386 \ln(16.5)) \\ 2.79 - (1.4027 + 0.4386 \ln(18.5)) \end{bmatrix} = \begin{bmatrix} 0.15 \\ 0.16 \\ 0.11 \end{bmatrix}$$

Fig. 2 Dispersion of residuals against basal area for the mean response (MR) and calibrated response (CR) using different numbers of sample trees for the Coastal Plain and Piedmont physiographic regions
Replacing the matrices in formula (16) with their corresponding estimated matrices gives the following predictions for the random parameters of this specific plot: $b_{0k} = 0.2343$ and $b_{1k} = -0.0339$. Therefore, a model containing both fixed- and random-effects parameters for this specific plot is:

$$\ln(H) = 1.4027 + 0.4386 \ln(d) + 0.2343 - 0.0339 \ln(d). \tag{10}$$

As before, after applying the correction factor using the residual variance and transforming to arithmetic units, a final unbiased plot-specific equation can be obtained:

$$h = 1.37 + 5.1545\, d^{0.4047} \tag{11}$$

The calibrated height–diameter curve is presented in Fig. 3b. A visual inspection of this figure reveals that the inclusion of random parameters improved the predictive capability of the model over the mean response (Eq. 9). However, a comparison of the calibrated curve to observed data indicates that the model still tends to underestimate tree heights. This finding agrees with the overall analysis performed in terms of predictive capabilities, which indicated that on the average these models underestimate tree heights (Table 4). The calculation procedure for estimating random parameters was implemented in PROC IML and this program code has been included in Appendix 2.
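The calibration steps of the numerical example can also be sketched in Python with NumPy. This is not the PROC IML program referred to above; it is an independent illustration of formula (16) using the Coastal Plain estimates and the three sample trees. The variable names are hypothetical, and the covariance between the random intercept and slope is entered as negative, which is an assumption consistent with the reported predictions.

```python
import numpy as np

# Coastal Plain estimates: fixed effects, random-effects covariance matrix
# and residual variance. The negative covariance is an assumption (see lead-in).
beta = np.array([1.4027, 0.4386])
D = np.array([[0.2243, -0.0499],
              [-0.0499, 0.0127]])
sigma2 = 0.00575

# Three sample trees from the plot: dbh (cm) and total height (m).
dbh = np.array([25.9, 16.5, 18.5])
height = np.array([21.0, 17.7, 17.7])

Y = np.log(height - 1.37)                              # response on the log scale
Z = np.column_stack([np.ones_like(dbh), np.log(dbh)])  # design matrix, X_k = Z_k
X = Z
R = sigma2 * np.eye(len(dbh))                          # uncorrelated, constant-variance errors

# Formula (16): b_hat = D Z' (Z D Z' + R)^{-1} (Y - X beta)
V = Z @ D @ Z.T + R
b_hat = D @ Z.T @ np.linalg.solve(V, Y - X @ beta)

# Plot-specific curve h = 1.37 + coefficient * d ** exponent, as in Eq. (11).
coefficient = np.exp(beta[0] + b_hat[0] + sigma2 / 2.0)
exponent = beta[1] + b_hat[1]

print("predicted random effects:", np.round(b_hat, 4))
print(f"h = 1.37 + {coefficient:.4f} * d^{exponent:.4f}")
```

Using `np.linalg.solve` instead of forming the inverse of V explicitly is a numerical convenience; the matrix is only 3 × 3 here, so either approach would work.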
For comparison purposes and to determine the degree of underestimation of the calibrated model, Eq. (3) was fit to all height–diameter data contained in the plot. This equation was fitted using ordinary least squares (OLS) and it was used only as a reference of optimum curve estimation. The parameter estimates and error variance for the data were: $\beta_0 = 4.9913$, $\beta_1 = 0.4253$ and $\sigma^2 = 0.00187$. The calibrated curve is located beneath the OLS curve, indicating that the use of the calibrated curve will slightly underestimate observed tree heights (Fig. 3c). However, the estimated curve using a calibrated response can be considered an adequate estimate of unmeasured total tree heights. Moreover, the calibration procedure only requires three or fewer height–diameter observations from each plot.
Conclusions
A mixed-effects height–diameter model was developed for use in loblolly pine plantations in the southeastern US. A test for regional parameters indicated that separate equations were needed for the Piedmont and Coastal Plain physiographic regions. The use of separate equations for both regions was also reported by Tasissa et al. (1997) when constructing volume and taper equations for the same tree species. These two studies indicate a possible effect of geographical location on the growing pattern of loblolly pine trees. Measures of bias, precision, and mean square error using region-wide plots for validation indicated that the use of random parameters considerably increased the predictive capability of the model (Table 4). Therefore, a calibrated curve is recommended rather than a mean response curve. The most significant improvements in accuracy were observed when one sample tree was used for predicting random parameters. Only a marginal gain in accuracy was observed when two or three sample trees were used for calibration, and this was mainly because of a reduction in bias. Other studies have reported similar behavior using linear or nonlinear mixed-effects height–diameter equations for other tree species (Jayaraman and Zakrzewski 2001; Calama and Montero 2004).
The developed models can be implemented in forest inventories by measurement of one tree height per plot, resulting in unbiased predictions using a calibrated curve. The process of calibration apparently accounted indirectly for the effects of stand density on the height–diameter relationships (e.g., Zeide and VanderSchaaf 2002). Even though not evaluated here, the effects of site productivity can be similarly accounted for by a mixed-effects modeling approach (e.g., Huang et al. 1992). The estimate of random-effects parameters makes the inclusion of additional predictor variables in the model unnecessary for many forest inventory applications that consider measurement of a sub-sample of heights. The advantages are maintenance of a simple model structure that does not assume the height–diameter relationship is constant within a stand but rather allows for plot-specific curves.

Fig. 3 Estimated height–diameter curves (x-axis: DBH (cm); y-axis: Height (m)) using: a mean response (Eq. 9), b calibrated response (Eq. 11) and c a comparison between the calibrated response and a local height–diameter equation (dashed line) fitted by OLS
Acknowledgments Data for this study and financial support were
provided through the Loblolly Pine Growth and Yield Research
Cooperative, Department of Forestry, Virginia Polytechnic Insti-
tute and State University.
Appendix 1. Linear mixed-effects model theory
A general linear model can be expressed as:

$$E[Y_i] = X_i \beta, \tag{12}$$

where $Y_i$ is a vector of observations from cluster $i$ ($i = 1, \ldots, N$), $X_i$ is an ($n_i \times p$) regressor matrix of cluster $i$ and $\beta$ is a ($p \times 1$) vector of regression coefficients (vector of fixed-effects parameters applicable to all $N$ clusters). This fixed-effects model assumes the mean response of a specific set of regressor values is constant for all clusters. However, the fixed-effects model can be modified by including cluster-specific parameters (random effects), thus permitting the mean response to vary from cluster to cluster, taking the following form:

$$E[Y_i \mid b_i] = X_i(\beta + b_i) = X_i\beta + X_i b_i,$$

where $b_i$ provides for cluster-specific behavior. A more general expression for this model is given by:

$$E[Y_i \mid b_i] = X_i\beta + Z_i b_i, \tag{13}$$
where $Z_i$ is an ($n_i \times q$) regressor matrix (or design matrix) containing explanatory variables and $b_i$ is a ($q \times 1$) vector of random effects. The matrix $Z_i$ can have the same regressors as in $X_i$ or it may only contain those regressors in $X_i$ that vary among clusters. The expectation of Eq. (13) is conditioned on the vector of random parameters, assuming that $b_i$ is normally distributed with $E[b_i] = 0$ and $\mathrm{Var}[b_i] = D$. In terms of linear models, a linear mixed-effects model can be expressed as:

$$Y_i \mid b_i = X_i\beta + Z_i b_i + \varepsilon_i,$$

where $\varepsilon_i$ is a random error assumed to be normally distributed with $E[\varepsilon_i] = 0$ and $\mathrm{Var}[\varepsilon_i] = R_i$. $R_i$ is an ($n_i \times n_i$) variance–covariance matrix of cluster $i$. The random error is assumed to be independent of the random vector $b_i$ with $\mathrm{Cov}[\varepsilon_i, b_i] = 0$. Then, $E[Y_i] = X_i\beta$ with a covariance matrix of $\mathrm{Var}(Y_i) = V_i = Z_i D Z_i' + R_i$ (e.g., Verbeke and Molenberghs 1997, p. 71). Thus, a mixed-effects model can be expressed in general form as (Laird and Ware 1982):

$$Y_i = X_i\beta + Z_i b_i + \varepsilon_i, \tag{14}$$

where

$$Y_i \sim N(X_i\beta,\; Z_i D Z_i' + R_i)$$

and

$$\begin{pmatrix} b_i \\ \varepsilon_i \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} D & 0 \\ 0 & R_i \end{pmatrix} \right).$$
Estimation of the fixed vector $\beta$
An estimate of the fixed parameter vector $\beta$ under model (4) can be obtained from a generalized least squares (GLS) analysis using $V_i^{-1}$ as weights. Assuming all the variance and covariance parameters of $D$ and $R_i$ are known, i.e., $V_i$ is known, results in the following estimator $\hat{\beta}$, which is the best linear unbiased estimator (BLUE):

$$\hat{\beta} = \left( \sum_{i=1}^{N} X_i' V_i^{-1} X_i \right)^{-1} \sum_{i=1}^{N} X_i' V_i^{-1} Y_i, \tag{15}$$

where $V_i^{-1} = [Z_i D Z_i' + R_i]^{-1}$ and its variance–covariance matrix is:

$$\mathrm{Var}(\hat{\beta}) = \left( \sum_{i=1}^{N} X_i' V_i^{-1} X_i \right)^{-1}.$$
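The GLS estimator of Eq. (15) can be sketched in Python with NumPy as follows. The function name, the assumption that $R_i = \sigma^2 I$, and the simulated clusters are illustrative choices made here, not part of the original analysis; the example data are noise-free so that the estimator can be seen to recover the generating coefficients.

```python
import numpy as np


def gls_fixed_effects(Y_list, X_list, Z_list, D, sigma2):
    """Eq. (15) for known D and R_i = sigma2 * I; returns (beta_hat, Var(beta_hat))."""
    p = X_list[0].shape[1]
    XtVinvX = np.zeros((p, p))
    XtVinvY = np.zeros(p)
    for Y, X, Z in zip(Y_list, X_list, Z_list):
        V = Z @ D @ Z.T + sigma2 * np.eye(len(Y))  # marginal covariance V_i
        Vinv_X = np.linalg.solve(V, X)             # V_i^{-1} X_i
        XtVinvX += X.T @ Vinv_X                    # accumulate X_i' V_i^{-1} X_i
        XtVinvY += Vinv_X.T @ Y                    # X_i' V_i^{-1} Y_i (V_i is symmetric)
    beta_hat = np.linalg.solve(XtVinvX, XtVinvY)
    return beta_hat, np.linalg.inv(XtVinvX)


# Hypothetical noise-free clusters generated from known coefficients.
rng = np.random.default_rng(0)
true_beta = np.array([1.4, 0.44])
X_list = [np.column_stack([np.ones(5), np.log(rng.uniform(10.0, 30.0, size=5))])
          for _ in range(4)]
Y_list = [X @ true_beta for X in X_list]
D = np.array([[0.2, -0.04], [-0.04, 0.012]])

beta_hat, cov_beta = gls_fixed_effects(Y_list, X_list, X_list, D, sigma2=0.005)
print("beta_hat:", beta_hat)
```

In practice $D$ and $\sigma^2$ are replaced by their likelihood-based estimates, so the result is an estimated GLS solution rather than the exact BLUE.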
Prediction of the random vector $b_i$
Even though we are interested in the fixed-effects parameters, which provide us with a population average curve, the main purpose for using mixed-effects models is to estimate cluster-specific parameters. Knowing that $b_i$ and $Y_i$ are distributed jointly multivariate normal, the conditional expectation of $b_i$ is given by:

$$E[b_i \mid Y_i] = E(b_i) + \mathrm{Cov}[b_i, Y_i]\, \mathrm{Var}[Y_i]^{-1} (Y_i - E[Y_i]),$$

where $\mathrm{Cov}[b_i, Y_i] = \mathrm{Cov}[b_i, X_i\beta + Z_i b_i] = D Z_i'$ and $E(b_i) = 0$, and thus the best linear unbiased predictor (BLUP) of $b_i$ is given by:

$$\hat{b}_i = D Z_i' W_i (Y_i - X_i\hat{\beta}), \tag{16}$$

where $W_i = V_i^{-1} = [Z_i D Z_i' + R_i]^{-1}$ (see Rencher 2000, p. 431). As mentioned by Lappi (1991), this expression requires the inversion of a matrix with dimension equal to the number of observations. The variance–covariance matrix of the prediction errors, $\mathrm{Var}[b_i - \hat{b}_i]$, is given by

$$\mathrm{Var}[b_i - \hat{b}_i] = [Z_i' R_i^{-1} Z_i + D^{-1}]^{-1} \tag{17}$$
The expressions given for $\hat{\beta}$ and $\hat{b}_i$ in (15) and (16) assume that $V_i$ is known, i.e., $D$ and $R_i$ are known. However, in normal practice a consistent estimator given by $\hat{V}_i = Z_i \hat{D} Z_i' + \hat{R}_i$ must be used instead. Likelihood-based methods are used for estimating $D$ and $R_i$ based on the assumptions that $b_i$ and $\varepsilon_i$ are normally distributed (see Littell et al. 1996; Schabenberger and Pierce 2002).
Appendix 2

Fig. 4 SAS program for computing random parameters

/* Sample trees from the new plot: total height h and diameter d */
DATA Example ;
  INPUT h d ;
  one = 1 ;                                  /* intercept column of Z */
  lnd = LOG(d) ;
  /* residual from the fixed-effects (population-average) prediction */
  RES = LOG(h-1.37)-(1.4027+0.4386*lnd) ;
  CARDS ;
21.0 25.9
17.7 16.5
17.7 18.5
;
RUN ;
PROC IML ;
  USE Example ;
  READ ALL VAR {one lnd} INTO Z ;
  READ ALL VAR {RES} INTO RES ;
  D = { 0.2243 -0.0499, -0.0499 0.0127 } ;   /* covariance matrix of random effects */
  R = 0.00575 * I(3) ;                       /* within-plot error covariance */
  b = D*Z`*INV(Z * D * Z` + R)*RES ;         /* BLUP of random effects, Eq. (16) */
  PRINT b ;
QUIT ;

References

Arabatzis AA, Burkhart HE (1992) An evaluation of sampling methods and model forms for estimating height–diameter relationships in loblolly pine plantations. For Sci 38:192–198
Avery TE, Burkhart HE (2002) Forest measurements, 5th edn. McGraw-Hill, New York
Assmann E (1970) The principles of forest yield studies. Pergamon, Oxford
Baskerville GL (1972) Use of logarithmic regression in the estimation of plant biomass. Can J For Res 2:49–53
Burkhart HE, Parker RC, Strub MR, Oderwald RG (1972) Yield of old-field loblolly pine plantations. School of Forestry and Wildlife Resources, Va Polytech Institute and State University Publication FWS, pp 3–72
Burkhart HE, Cloeren DC, Amateis RL (1985) Yield relationships in unthinned loblolly pine plantations on cutover, site-prepared lands. South J Appl For 9:84–91
Calama R, Montero G (2004) Interregional nonlinear height–diameter model with random coefficients for stone pine in Spain. Can J For Res 34:150–163
Castedo Dorado F, Barrio Anta M, Parresol BR, Álvarez González JG (2005) Stochastic height–diameter model for maritime pine ecoregions in Galicia (northwestern Spain). Ann For Sci 62:455–465
Eerikäinen K (2001) Stem volume models with random coefficients for Pinus kesiya in Tanzania, Zambia, and Zimbabwe. Can J For Res 31:879–888
Eerikäinen K (2003) Predicting the height–diameter pattern of planted Pinus kesiya stands in Zambia and Zimbabwe. For Ecol Manage 175:355–366
Epstein R, Nieto E, Weintraub A, Chevalier P, Gabarró J (1999) A system for the design of short term harvesting strategy. Eur J Oper Res 119:427–439
Fang Z, Bailey RL (2001) Nonlinear mixed effects modeling for slash pine dominant height growth following intensive silvicultural treatments. For Sci 47:287–300
Gregoire TG, Schabenberger O (1996a) Non-linear mixed effects modeling of cumulative bole volume with spatially correlated within-tree data. J Agri Biol Environ Stat 1:107–109
Gregoire TG, Schabenberger O (1996b) A non-linear mixed-effects model to predict cumulative bole volume of standing trees. J Appl Stat 23:257–271
Hall DB, Clutter M (2004) Multivariate multilevel nonlinear mixed effects models for timber yield predictions. Biometrics 60:16–24
Huang S, Titus SJ, Wiens DD (1992) Comparison of nonlinear height–diameter functions for major Alberta tree species. Can J For Res 22:1297–1304
Hui G, Gadow Kv (1993) Zur Entwicklung von Einheitshöhenkurven am Beispiel der Baumart Cunninghamia lanceolata. Allg Forst Jagdztg 164:218–220
Jayaraman K, Zakrzewski WT (2001) Practical approaches to calibrating height–diameter relationships for natural maple stand in Ontario. For Ecol Manage 148:169–177
Laird NM, Ware JH (1982) Random-effects models for longitudinal data. Biometrics 38:963–974
Lappi J (1986) Mixed linear models for analyzing and predicting stem form variation of Scots pine. Commun Inst For Fenn 134:1–69
Lappi J (1991) Calibration of height and volume equations with random parameters. For Sci 37:781–801
Lappi J, Bailey RL (1988) A height prediction model with random stand and tree parameters: an alternative to traditional site index methods. For Sci 34:907–927
Littell RC, Milliken GA, Stroup WW, Wolfinger RD (1996) SAS System for mixed models. SAS Institute Inc., Cary
López Sánchez CA, Gorgoso Varela J, Castedo Dorado F, Rojo Alboreceda R, Rodriguez Soalleiro R, Alvarez Gonzalez JG, Sanchez Rodriguez F (2003) A height–diameter model for Pinus radiata D. Don in Galicia (Northwest Spain). Ann For Sci 60:237–345
Lynch T, Murphy P (1995) A compatible height prediction and projection system for individual trees in natural, even-aged shortleaf pine stands. For Sci 41:194–209
Lynch TB, Holley AG, Stevenson DJ (2005) A random-parameter height-dbh model for cherrybark oak. South J Appl For 29:22–26
Martin F, Flewelling J (1998) Evaluation of tree height prediction models for stand inventory. West J Appl For 13:109–119
Mehtätalo L (2004) A longitudinal height–diameter model for Norway spruce in Finland. Can J For Res 34:131–140
Rencher AC (2000) Linear models in statistics. John Wiley, New York
Schabenberger O, Pierce FJ (2002) Contemporary statistical models for the plant and soil sciences. CRC, Boca Raton
Schröder J, Álvarez González JG (2001) Comparing the performance of generalized diameter–height equations for maritime pine in Northwestern Spain. Forstw Cbl 120:18–23
Sharma M, Zhang SY (2004) Height–diameter models using stand characteristics for Pinus banksiana and Picea mariana. Scand J For Res 19:442–451
Soares P, Tomé M (2002) Height–diameter equation for first rotation eucalypt plantations in Portugal. For Ecol Manage 166:99–109
Tasissa G, Burkhart HE (1998) An application of mixed effects analysis to modeling thinning effects on stem profile of loblolly pine. For Ecol Manage 103:87–101
Tasissa G, Burkhart HE, Amateis RL (1997) Volume and taper equations for thinned and unthinned loblolly pine trees in cutover, site-prepared plantations. South J Appl For 21:146–152
Temesgen H, Gadow Kv (2004) Generalized height–diameter models: an application for major tree species in complex stands of interior British Columbia. Eur J For Res 123:45–51
Verbeke G, Molenberghs G (1997) Linear mixed models in practice: a SAS-oriented approach. Springer, Berlin Heidelberg New York
Zeide B, VanderSchaaf C (2002) The effect of density on the height–diameter relationship. In: Outcalt KW (ed) Proceedings of the 11th biennial southern silvicultural research conference. Gen Tech Rep SRS-48. Department of Agriculture, Forest Service, Southern Research Station, Asheville, NC, pp 463–466
Zhang S, Burkhart HE, Amateis RL (1997) The influence of thinning on tree height and diameter relationships in loblolly pine plantations. South J Appl For 21:199–205
Zhang L, Peng C, Huang S, Zhou X (2002) Development and evaluation of ecoregion-based jack pine height–diameter models for Ontario. For Chron 78:530–538