
ORIGINAL PAPER

Guillermo Trincado · Curtis L. VanderSchaaf · Harold E. Burkhart

Regional mixed-effects height–diameter models for loblolly pine

(Pinus taeda L.) plantations

Received: 12 November 2005 / Accepted: 28 April 2006 / Published online: 22 August 2006

© Springer-Verlag 2006

Abstract A height–diameter mixed-effects model was

developed for loblolly pine (Pinus taeda L.) plantations

in the southeastern US. Data were obtained from a re-

gion-wide thinning study established by the Loblolly

Pine Growth and Yield Research Cooperative at Vir-

ginia Tech. The height–diameter model was based on an

allometric function, which was linearized to include both

fixed- and random-effects parameters. A test of regional-

specific fixed-effects parameters indicated that separate

equations were needed to estimate total tree heights in

the Piedmont and Coastal Plain physiographic regions.

The effect of sample size on the ability to estimate ran-

dom-effects parameters in a new plot was analyzed. For

both regions, an increase in the number of sample trees

decreased the bias when the equation was applied to

independent data. This investigation showed that the use

of a calibrated response using one sample tree per plot

makes the inclusion of additional predictor variables

(e.g., stand density) unnecessary. A numerical example

demonstrates the methodology used to predict random-effects parameters and thus to estimate plot-specific height–diameter relationships.

Keywords Height–diameter relationship · Forest inventory · Linear mixed-effects model · Pinus taeda

Introduction

An accurate estimate of total tree height is often re-

quired in forest management and research. For example,

estimation of total tree height is an important component of forest inventories for predicting total and merchantable tree volume (Soares and Tomé 2002). Bucking

systems based on taper equations require total tree

height for an accurate estimation of multi-product

recovery (Epstein et al. 1999), and growth and yield

models often predict stand productivity using a measure

of dominant height or site index (Avery and Burkhart

2002, p. 313). Furthermore, height–diameter equations

are also used in the analysis of permanent plots and in

growth and yield studies (Assman 1970, p. 146; Burkhart

et al. 1972; Lynch and Murphy 1995).

Since measuring individual total tree heights is time

consuming and thus costly, often only a sub-sample is

measured. These trees are then used to fit a local stand

height–diameter equation (Zhang et al. 2002) that pre-

dicts height for those trees that only had diameter

measured in the field (Martin and Flewelling 1998;

Huang et al. 1992). An alternative approach is to use a

more generalized or region-wide equation. Rather than

fitting stand specific height–diameter equations, gener-

alized models estimate height using dbh while accounting

for between-stand variability by including stand-specific

regressors. Several generalized and region-wide equa-

tions have been developed recently for many tree species

(Schröder and Álvarez González 2001; López Sánchez

et al. 2003; Sharma and Zhang 2004; Temesgen and

Gadow 2004; Castedo Dorado et al. 2005).

In forestry, mixed-effects models are increasingly

being used to model stem form variation (Lappi 1986),

dominant height growth (Lappi and Bailey 1988; Fang

and Bailey 2001), volume (Lappi 1991; Eerikäinen 2001),

cumulative bole volume using volume-ratio equations

(Gregoire and Schabenberger 1996a, b), thinning effects

on stem profiles (Tasissa and Burkhart 1998) and timber

yield predictions (Hall and Clutter 2004). Mixed-models

have also been used to model the height–diameter rela-

tionship. For instance, linear mixed-models were used

by Lappi (1991) for Scots pine (Pinus sylvestris L.), Ja-

yaraman and Zakrzewski (2001) for sugar maple (Acer

saccharum Marsh), Mehtätalo (2004) for Norway spruce

(Picea abies (L.) Karst), and by Lynch et al. (2005) for

cherrybark oak (Quercus pagoda Raf). Calama and Montero (2004) used nonlinear mixed-effects models to predict stone pine (Pinus pinea L.) heights.

Communicated by Hans Pretzsch

G. Trincado (✉) · C. L. VanderSchaaf · H. E. Burkhart
Department of Forestry, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA
E-mail: gtrincad@vt.edu
Tel.: +1-540-2313596
Fax: +1-540-2313698

Eur J Forest Res (2007) 126: 253–262
DOI 10.1007/s10342-006-0141-7

In contrast to traditional regression techniques,

mixed-effects models allow for both population-specific

and cluster-specific models (e.g., Verbeke and Mole-

nberghs 1997). A population-specific model considers

only fixed-parameters and a cluster-specific (e.g., a par-

ticular plot) model considers both fixed- and random-

parameters. This characteristic makes mixed-effects

models more efficient when a prediction for a new

individual is required and prior information is available.

In forest inventory the use of a regional mixed-effects

model can be highly beneficial, because a sample of trees

measured for total height can be used to calibrate the

height–diameter curve at plot-level. The process of cal-

ibration can increase the predictive capability of a

height–diameter model and possibly eliminate the need

for additional predictor variables.

The objectives of this research were to (1) model re-

gional height–diameter relationships for loblolly pine

(Pinus taeda L.) using a linear mixed-effects model, (2)

determine if separate equations for the Piedmont and

Coastal Plain physiographic regions are needed, (3)

quantify how the number of sub-sample trees used for

plot-specific calibration affects the estimates of random-

effects parameters, and (4) determine if a variable rep-

resenting stand density is required in the model when

using a calibrated response.

Materials and methods

Data

Data for this research were obtained from a region-wide

thinning study maintained by the Loblolly Pine Growth

and Yield Research Cooperative at Virginia Polytechnic

Institute and State University. This study is comprised

of 186 sites located throughout the Piedmont and Gulf

and Atlantic Coastal Plain physiographic regions of the

southeastern US. Study installation began in the dor-

mant seasons of 1980–1981 and 1981–1982 in stands of

different ages, site indexes, and densities on cutover, site-

prepared areas following protocols described by Burk-

hart et al. (1985). Three plots with similar spacing, basal

area, and site index were established at each location.

Each plot was randomly assigned to a thinning treat-

ment: (1) unthinned or control, (2) light thinning

(approximately 1/3 basal area removed) and (3) heavy

thinning (around 1/2 basal area removed). After installation, plots were remeasured at 3-year intervals. After

the fourth measurement, some of the thinned plots re-

ceived a second thinning treatment.

All measurement ages, with the exception of those

occurring after the second thinning treatment that was

only conducted in some plots, were used to fit the pre-

sented model. Thus, for those plots receiving a second

thinning treatment, only the first through fourth mea-

surements were used in model fitting. The final data set

consisted of 1,384 and 902 measurements for the Coastal

Plain and Piedmont physiographic regions, respectively.

A similar procedure for generating height–diameter data

from the same plots was used by Zhang et al. (1997).

Within a given plot and measurement age, only planted

loblolly pines that were unforked and did not have a

broken top were selected as sample trees. Within each

physiographic region, seventy percent of the plot mea-

surements were randomly assigned to a model fitting

dataset while the remaining 30% were assigned to a

model validation dataset (Table 1).

Model development

A mathematical expression used for modeling the

height–diameter relationship is a power function other-

wise known as an allometric function (e.g., Arabatzis

and Burkhart 1992; Eerikäinen 2003):

$$h = b_0 d^{b_1}, \tag{1}$$

where h is individual total tree height and d is the dbh. The parameter $b_0$ controls the rate of increase and parameter $b_1$ adjusts the shape of the curve. This equation can be conditioned to estimate total tree height as 1.37 m when d = 0, producing the following form:

$$h = 1.37 + b_0 d^{b_1}. \tag{2}$$

Table 1 Descriptive statistics of the data in the model fitting dataset (70%) and the model validation dataset (30%) by physiographic region and across both regions combined

Region            No. plots   No. trees   Age (years)     dbh (cm)        Total height (m)
                                          Mean    SD      Mean    SD      Mean    SD
Fit
  Combined        1,600       84,500      22.9    6.3     19.4    5.5     16.0    4.1
  Coastal Plain     969       50,782      22.5    6.4     19.5    5.7     16.1    4.3
  Piedmont          631       33,718      23.5    6.2     19.4    5.3     15.9    5.3
Validation
  Combined          686       36,813      23.4    6.5     19.4    5.5     16.1    4.3
  Coastal Plain     415       21,877      23.2    6.7     19.6    5.7     16.3    4.4
  Piedmont          271       14,936      23.6    6.1     19.1    5.2     15.7    4.0

Other studies have used the same constrained nonlinear equation for modeling the height–diameter relationship (e.g., Huang et al. 1992; Hui and Gadow 1993).

This equation can easily be linearized by taking loga-

rithms of both sides:

$$\ln(H) = \ln(b_0) + b_1 \ln(d), \tag{3}$$

where H = h − 1.37. A model containing both fixed and random effects and having the same structure as in Eq. (3) can be expressed as:

$$\ln(H_{ki}) = b_0 + b_1 \ln(d_{ki}) + b_{0k} + b_{1k} \ln(d_{ki}) + e_{ki}, \tag{4}$$

where $H_{ki}$ is the height of tree i above 1.37 m in plot k, $d_{ki}$ is the dbh of tree i in plot k, $b_0, b_1$ are the fixed-effects parameters (or the population average parameters) and $b_{0k}, b_{1k}$ are the random-effects parameters (or the cluster-specific parameters) of plot k. The mixed-effects model can be expressed in the same general form as given in Eq. (14), where the matrices involved take the following forms:

$$Y_k' = \left[\ln(H_{k1}), \ldots, \ln(H_{kn_k})\right], \quad X_k' = Z_k' = \begin{bmatrix} 1 & \cdots & 1 \\ \ln(d_{k1}) & \cdots & \ln(d_{kn_k}) \end{bmatrix}, \quad \mathbf{b}' = [\,b_0 \;\; b_1\,] \quad \text{and} \quad \mathbf{b}_k' = [\,b_{0k} \;\; b_{1k}\,].$$

In this analysis all parameters were considered as mixed, i.e., $X_k' = Z_k'$, and the following variance–covariance structures for the random parameters and the random errors were assumed (see Eq. 14):

$$D = \mathrm{Var}[\mathbf{b}_k] = \begin{bmatrix} \mathrm{Var}(b_0) & \mathrm{Cov}(b_0, b_1) \\ \mathrm{Cov}(b_0, b_1) & \mathrm{Var}(b_1) \end{bmatrix} \quad \text{and} \quad R_k = \mathrm{Var}[\mathbf{e}_k] = \sigma^2 I_{n_k}.$$

Under this variance structure random errors are assumed to be uncorrelated and have constant variance ($\sigma^2$). Parameter estimates were obtained using PROC MIXED from SAS Institute Inc. (Littell et al. 1996).

Testing for regional-specific parameters

To determine if separate equations are needed for the

Piedmont and Coastal Plain physiographic regions,

Eq. (4) was modified by including a dummy variable for

the intercept and slope terms producing the following

form:

$$\ln(H_{ki}) = b_0 + b_1 \ln(d_{ki}) + b_2 Z + b_3 Z \ln(d_{ki}) + b_{0k} + b_{1k} \ln(d_{ki}) + e_{ki}, \tag{5}$$

where Z = 0 if a plot is from the Piedmont physiographic region and Z = 1 if a plot is from the Coastal Plain physiographic region. If $b_2$ and $b_3$ are not significant, then observations from each physiographic region can be combined and Eq. (5) collapses to Eq. (4). The variance components of the combined model (Eq. 5) were obtained using restricted maximum likelihood (REML).

Statistical validation

The developed model was evaluated using (1) only the

fixed-effects response (mean response) and (2) a more

complete model by estimating random parameters (cal-

ibrated response) for each plot of the validation dataset

(Table 1). Estimation of random parameters (calibra-

tion) for a particular plot depends on the number of

trees selected and the amount of inherent variability

among trees within a plot. Therefore, for evaluating the

calibration process, one to three trees were randomly

selected from each plot. This procedure was repeated ten

times in each plot to account for within tree variability.

The validation process followed the methodology proposed by Arabatzis and Burkhart (1992). The difference between the observed and predicted total tree height, $e_{ki} = h_{ki} - \hat{h}_{ki}$, for each individual tree (i) within a plot (k) was calculated. Then, for each plot, the mean residual ($\bar{e}_k$) and the sample variance ($v_k$) of residuals were computed. They were considered to be estimates of bias and precision, respectively. An estimate of mean square error ($MS_k$) was obtained combining the bias and precision measures using the following formula:

$$MS_k = \bar{e}_k^{\,2} + v_k. \tag{6}$$

For those estimations carried out using only the fixed-effects part of the model, measurements of bias, precision, and error were calculated for each plot and a mean value over all the plots was computed. However, for those predictions using estimated random parameters, the same statistical measures were averaged across the ten repetitions by plot for each of the three sample sizes.

An analysis of residuals was performed to determine the effects of stand density (basal area) on both the mean and calibrated responses.
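The per-plot validation statistics described in this section (mean residual as bias, sample variance of residuals as precision, and their combination in Eq. 6) can be sketched in a few lines of Python. This is only an illustration: the function name `plot_validation_stats` and the height values are hypothetical and are not taken from the study data.

```python
from statistics import mean, variance

def plot_validation_stats(observed, predicted):
    """Return (bias, precision, mean square error) for one plot."""
    # Residual for each tree: observed minus predicted total height (m)
    residuals = [h - h_hat for h, h_hat in zip(observed, predicted)]
    bias = mean(residuals)            # mean residual of the plot
    precision = variance(residuals)   # sample variance of the residuals
    ms = bias ** 2 + precision        # Eq. (6)
    return bias, precision, ms

# Hypothetical observed and predicted heights (m) for the trees of one plot
observed = [21.0, 17.7, 17.7, 19.2]
predicted = [20.1, 17.0, 17.9, 18.4]

bias, precision, ms = plot_validation_stats(observed, predicted)
print(f"bias={bias:.3f}, precision={precision:.3f}, root MS={ms ** 0.5:.3f}")
```

Averaging these three values over all plots (and, for the calibrated response, over the ten repetitions per plot) would give summary figures of the kind reported in Table 4.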

Results and discussion

Mixed-effects height–diameter model

For the combined model (Eq. 5), all parameter estimates, including the regional terms $b_2$ and $b_3$, were significant (Table 2). Therefore, separate equations for the two physiographic regions were needed (Table 3).

An analysis of residuals showed that predictions based


on the fixed-effects model were biased with respect to

basal area (Fig. 1). However, the use of the random-

effects parameters adjusted the predictions for local

conditions producing unbiased predictions with respect

to basal area, even though this variable is not explicitly

incorporated in the model (Eq. 4). The incorporation of

additional predictor variables has a major effect on the

ability of the fixed-effects model to explain between-

individual variability, but certainly not on the mixed-

effects model. In this application, we assumed that

random-effects parameters will be estimated and a

mixed-effects model used. Otherwise the use of a fixed-

effects model would likely require including additional

predictor variables. The use of a mixed-effects model in

forest inventory through a sub-sample of trees for height

measurement allows maintenance of a simple model

structure without including additional predictor vari-

ables.
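As a rough consistency check on the regional test, the intercept and slope implied by the combined model (Eq. 5, Table 2) for each region can be compared with the separately fitted values in Table 3. The sketch below only re-does that arithmetic; the function name `regional_coefficients` is hypothetical.

```python
# Fixed-effects estimates of the combined model (Table 2, Eq. 5)
B0, B1, B2, B3 = 1.4684, 0.4099, -0.0659, 0.0287

def regional_coefficients(z):
    """Intercept and slope implied by Eq. (5) for the region dummy z.

    z = 0 for the Piedmont, z = 1 for the Coastal Plain.
    """
    return B0 + B2 * z, B1 + B3 * z

piedmont = regional_coefficients(0)
coastal = regional_coefficients(1)

print("Piedmont:      intercept=%.4f slope=%.4f" % piedmont)
print("Coastal Plain: intercept=%.4f slope=%.4f" % coastal)
```

The implied values are close to, but not identical with, the Table 3 estimates (1.4683 and 0.4100 for the Piedmont, 1.4027 and 0.4386 for the Coastal Plain), which is expected because the variance components were estimated separately in each fit.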

The validation analyses indicated that higher accu-

racy in predicting total tree height was observed when

random parameters were estimated (Table 4). The root mean square error ($\sqrt{MS}$) decreased when the number of sample trees for calibration increased. However, the error reduction in both regions was mainly due to a reduction in bias since the variance remained relatively

constant as sample size increased. The greatest change in

accuracy occurred when a single tree was used for cali-

bration, indicating only a marginal gain when more

sample trees were used.

The predictions of total tree heights using a mean

response (fixed-effects parameters) were as expected

biased with respect to basal area (Fig. 2). However, the

selection of one to three trees for measurement of total

height permitted estimation of random-effects parame-

ters and unbiased predictions of tree heights with respect

to basal area. These results demonstrate that the use of a

calibrated response does not require incorporating into

the model additional predictor variables representing

stand density.

The utility of this technique in forest inventory is that

in each plot only one height measurement is required in

order to obtain unbiased predictions using a simple

height–diameter model.

Prediction of total tree heights—an example

In comparison to conventional regression techniques,

the mixed-effects height–diameter model permits the

estimation of a mean response (population-specific) and/

or a calibrated response (cluster-specific) for a new plot.

The prediction of total tree heights using both types of

responses is explained and demonstrated using param-

eter estimates from the Coastal Plain physiographic re-

gion (Table 3).

Table 2 Estimated parameters and fit statistics for the combined data set (Coastal Plain/Piedmont physiographic regions) using Eq. (5)

Parameters                Estimate    SE         t-value    P-value
b0                         1.4684     0.01964     74.78     <0.0001
b1                         0.4099     0.00502     81.74     <0.0001
b2                        −0.0659     0.02518     −2.62      0.0088
b3                         0.0287     0.00642      4.47     <0.0001
Variance components (a)
Var(b0)                    0.2136     0.00842     25.36     <0.0001
Var(b1)                    0.0126     0.00054     23.56     <0.0001
Cov(b0, b1)               −0.0486     0.00206    −23.59     <0.0001
σ²                         0.00568    0.00003    201.73     <0.0001
Goodness-of-fit
−2LL (smaller better)     −185481
AIC (smaller better)      −185473

(a) Asymptotic standard errors (SE) and test statistics based on Wald–Z tests

Table 3 Estimated parameters and fit statistics for the Coastal Plain and Piedmont physiographic regions using Eq. (4)

                          Coastal Plain                                  Piedmont
Parameters                Estimate    SE         t-value    P-value      Estimate    SE         t-value    P-value
b0                         1.4027     0.01611     87.07     <0.0001       1.4683     0.01894     77.53     <0.0001
b1                         0.4386     0.00402    109.18     <0.0001       0.4100     0.00499     82.20     <0.0001
Variance components (a)
Var(b0)                    0.2243     0.01137     19.73     <0.0001       0.1970     0.01236     15.95     <0.0001
Var(b1)                    0.0127     0.00070     18.21     <0.0001       0.0125     0.00083     14.93     <0.0001
Cov(b0, b1)               −0.0499     0.00273    −18.28     <0.0001      −0.0466     0.00312    −14.92     <0.0001
σ²                         0.00575    0.00004    156.32     <0.0001       0.00559    0.00004    127.5      <0.0001
Goodness-of-fit
−2LL (smaller better)     −110773                                        −74730
AIC (smaller better)      −110765                                        −74722

(a) Asymptotic standard errors (SE) and test statistics based on Wald–Z tests

Mean response (only fixed-effects parameters)

A mean response for a new plot can be obtained using only the fixed-effects component of Eq. (4), i.e., $b_0$ and $b_1$. Thus, this equation can be expressed as:

$$\ln(H) = 1.4027 + 0.4386 \ln(d). \tag{7}$$

However, in order to get an estimate of height (h), the predicted value from Eq. (7) must first be transformed back to the original units (m). Additive error terms in log–log models become multiplicative when transformed back to the original scale, which introduces a bias that must be accounted for. An unbiased estimate can be obtained by adding a correction factor as proposed by Baskerville (1972):

$$\ln(H) = 1.4027 + 0.4386 \ln(d) + 0.002875. \tag{8}$$

This correction factor was computed using the estimated error variance found in Table 3 for the Coastal Plain region, $\sigma^2/2 = 0.00575/2$. Following this, Eq. (8) can be converted to arithmetic units, isolating h on the left-hand side:

$$h = 1.37 + 4.0779\, d^{0.4386}. \tag{9}$$
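The back-transformation from Eq. (8) to Eq. (9) can be reproduced numerically. The following is a minimal sketch using the Coastal Plain estimates from Table 3; the function name `mean_response_height` is an assumption made for illustration, not part of the original analysis.

```python
import math

# Coastal Plain estimates from Table 3
B0, B1 = 1.4027, 0.4386   # fixed-effects intercept and slope on the log scale
SIGMA2 = 0.00575          # residual variance
BREAST_HEIGHT = 1.37      # m

def mean_response_height(dbh_cm, b0=B0, b1=B1, sigma2=SIGMA2):
    """Predict total height (m) from dbh (cm), including the sigma2/2 correction."""
    ln_H = b0 + b1 * math.log(dbh_cm) + sigma2 / 2.0   # Eq. (8)
    return BREAST_HEIGHT + math.exp(ln_H)              # back to metres, Eq. (9)

# Multiplier of d**b1 in Eq. (9): exp(b0 + sigma2/2)
multiplier = math.exp(B0 + SIGMA2 / 2.0)
height_25_9 = mean_response_height(25.9)

print(round(multiplier, 4))
print(round(height_25_9, 2))
```

The computed multiplier agrees with the coefficient 4.0779 in Eq. (9) to four decimal places.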

This equation allows for a mean response for total

tree height. The curve obtained is presented in Fig. 3a for a 30-year-old plot located in the Coastal Plain region, where all trees have been measured for dbh and

total tree height (36 observations). Clearly, a severely

biased height–diameter curve that underestimates ob-

served tree heights for this plot is obtained when only

using a mean response.

Calibrated response (fixed- and random-effects

parameters)

Mixed-effects models allow for the mean response to

be calibrated for an individual plot by estimating

random parameter components for Eq. (4), i.e., $b_{0k}$ and $b_{1k}$. The use of a calibrated response rather than a

population mean response produced an increase in the

accuracy of predicted tree heights for a given plot

(Table 4). In order to estimate random parameters, the

calibration process requires observed height–diameter

data. From the 30-year-old plot located in the Coastal

Plain region mentioned above, three sampled trees

were randomly selected for measuring dbh (d) in cm

and total tree height (h) in m. The pair of measure-

ments (d, h) corresponded to: (25.9, 21.0), (16.5, 17.7)

and (18.5, 17.7). These sample trees are represented by

black dots in Fig. 3b. Prediction of random parameters

is accomplished using formula (16). Estimated vari-

ances and the covariance of the parameters are given

in Table 3. Thus, the variance–covariance matrix of

random coefficients is:

$$\mathrm{Var}[\mathbf{b}_k] = D = \begin{bmatrix} 0.2243 & -0.0499 \\ -0.0499 & 0.0127 \end{bmatrix}.$$

Fig. 1 Dispersion of residuals against basal area for the fixed- and mixed-effects models for the Coastal Plain and Piedmont physiographic regions

Table 4 Mean prediction bias ($\bar{e}$), mean precision (v), and mean root square error ($\sqrt{MS}$) in meters across all plots for three sample sizes used to estimate random effects (n = 415 for Coastal Plain and n = 271 for the Piedmont)

Region           Number of sample trees    Bias (ē)    Precision (v)    Error (√MS)
Coastal Plain    0 (a)                      0.507       1.096            2.912
                 1                          0.099       1.095            1.359
                 2                          0.051       1.095            1.228
                 3                          0.024       1.096            1.170
Piedmont         0 (a)                      0.072       0.959            2.202
                 1                          0.042       0.953            1.260
                 2                          0.012       0.956            1.134
                 3                         −0.004       0.958            1.089

(a) Only the fixed-effects part of model (4) was used for total tree height prediction

The variance–covariance matrix for the random error

term must also be determined. As specified earlier, all

observations are assumed to have constant variance $\sigma^2$ and errors are assumed to be uncorrelated. Thus, the estimated variance–covariance matrix, $\mathrm{Var}[\mathbf{e}_k] = R_k$, can be expressed as:

$$R_k = 0.00575 \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \quad \text{or} \quad R_k = 0.00575\, I_3,$$

where $I_3$ is the identity matrix with dimension (3 × 3) equal to the number of sample trees used for calibration. Then, according to the definition for the dependent and independent variables as specified in Eq. (4), the matrices $Y_k$, $X_k$, and $Z_k$ are

$$Y_k = \begin{bmatrix} \ln(21.0 - 1.37) \\ \ln(17.7 - 1.37) \\ \ln(17.7 - 1.37) \end{bmatrix} \quad \text{and} \quad X_k = Z_k = \begin{bmatrix} 1 & \ln(25.9) \\ 1 & \ln(16.5) \\ 1 & \ln(18.5) \end{bmatrix}.$$

The expression $Y_k - X_k \mathbf{b}$ represents the difference between the observed values and the estimated mean responses using only the fixed-effects parameter estimates from Eq. (7). Thus, the vector of residuals is expressed as:

$$Y_k - X_k \mathbf{b} = \begin{bmatrix} 2.98 - (1.4027 + 0.4386 \cdot \ln(25.9)) \\ 2.79 - (1.4027 + 0.4386 \cdot \ln(16.5)) \\ 2.79 - (1.4027 + 0.4386 \cdot \ln(18.5)) \end{bmatrix} = \begin{bmatrix} 0.15 \\ 0.16 \\ 0.11 \end{bmatrix}.$$

Fig. 2 Dispersion of residuals against basal area for the mean response (MR) and calibrated response (CR) using different numbers of sample trees for the Coastal Plain and Piedmont physiographic regions

Replacing the matrices in formula (16) with their corresponding estimated matrices gives the following predictions for the random parameters of this specific plot: $b_{0k} = 0.2343$ and $b_{1k} = -0.0339$. Therefore, a model containing both fixed- and random-effects parameters for this specific plot is:

$$\ln(H) = 1.4027 + 0.4386 \ln(d) + 0.2343 - 0.0339 \ln(d). \tag{10}$$

As before, after applying the correction factor using the residual variance and transforming to arithmetic units, a final unbiased plot-specific equation can be obtained:

$$h = 1.37 + 5.1545\, d^{0.4047}. \tag{11}$$
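The calibration steps of this example can be retraced with a short script. The sketch below is not the authors' PROC IML program; it is a plain-Python illustration of formula (16) using the Coastal Plain estimates from Table 3 and the three sample trees given in the text. The helper names `solve` and `predict_random_effects` are hypothetical, and a small Gaussian-elimination routine is used only so that no external library is needed.

```python
import math

def solve(A, y):
    """Solve the linear system A x = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [list(A[i]) + [y[i]] for i in range(n)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                # back substitution
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def predict_random_effects(trees, fixed, D, sigma2):
    """Formula (16): D Z' (Z D Z' + R)^-1 (Y - X b), with R = sigma2 * I."""
    Z = [[1.0, math.log(d)] for d, _ in trees]
    # Residuals of ln(h - 1.37) about the mean response, Eq. (7)
    resid = [math.log(h - 1.37) - (fixed[0] + fixed[1] * math.log(d)) for d, h in trees]
    n = len(trees)
    V = [[sum(Z[i][p] * D[p][q] * Z[j][q] for p in range(2) for q in range(2))
          + (sigma2 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    w = solve(V, resid)                                              # V^-1 (Y - X b)
    Zt_w = [sum(Z[i][p] * w[i] for i in range(n)) for p in range(2)]  # Z' w
    return [sum(D[p][q] * Zt_w[q] for q in range(2)) for p in range(2)]

fixed = (1.4027, 0.4386)                          # b0, b1 (Table 3, Coastal Plain)
D = [[0.2243, -0.0499], [-0.0499, 0.0127]]        # Var/Cov of random effects
sigma2 = 0.00575                                  # residual variance
trees = [(25.9, 21.0), (16.5, 17.7), (18.5, 17.7)]  # (dbh cm, height m)

b0k, b1k = predict_random_effects(trees, fixed, D, sigma2)
multiplier = math.exp(fixed[0] + b0k + sigma2 / 2.0)   # coefficient in Eq. (11)
slope = fixed[1] + b1k                                 # exponent in Eq. (11)
print(round(b0k, 4), round(b1k, 4), round(multiplier, 3), round(slope, 4))
```

Because the inputs are the rounded values printed in Table 3, the result should match the reported predictions only to about three decimal places.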

The calibrated height–diameter curve is presented in

Fig. 3b. A visual inspection of this figure reveals that the

inclusion of random parameters improved the predictive

capability of the model over the mean response (Eq. 9).

However, a comparison of the calibrated curve to ob-

served data indicates that the model still tends to

underestimate tree heights. This finding agrees with the

overall analysis performed in terms of predictive capa-

bilities, which indicated that on the average these models

underestimate tree heights (Table 4). The calculation

procedure for estimating random parameters was

implemented in PROC IML and this program code has

been included in Appendix 2.

For comparison purposes and to determine the de-

gree of underestimation of the calibrated model, Eq. (3)

was fit to all height–diameter data contained in the plot.

This equation was fitted using ordinary least squares

(OLS) and it was used only as a reference of optimum

curve estimation. The parameter estimates and error variance for the data were: $b_0 = 4.9913$, $b_1 = 0.4253$ and $\sigma^2 = 0.00187$. The calibrated curve is located beneath the OLS curve, indicating that the use of the calibrated curve will slightly underestimate observed tree heights (Fig. 3c). However, the estimated curve using a

calibrated response can be considered an adequate esti-

mate of unmeasured total tree heights. Moreover, the

calibration procedure only requires three or fewer

height–diameter observations from each plot.

Conclusions

A mixed-effects height–diameter model was developed

for use in loblolly pine plantations in the Southeastern

US. A test for regional parameters indicated that sepa-

rate equations were needed for the Piedmont and

Coastal Plain physiographic regions. The use of separate

equations for both regions was also reported by Tasissa

et al. (1997) when constructing volume and taper equa-

tions for the same tree species. These two studies indi-

cate a possible effect of geographical location on the

growing pattern of loblolly pine trees. Measures of bias,

precision, and mean-square error using region-wide

plots for validation indicated that the use of random

parameters considerably increased the predictive capa-

bility of the model (Table 4). Therefore, a calibrated

curve is recommended rather than a mean response

curve. The most significant improvements in accuracy

were observed when one sample tree was used for pre-

dicting random parameters. Only a marginal gain in

accuracy was observed when two or three sample trees

were used for calibration, and this was mainly because of

a reduction in bias. Other studies have reported similar

behavior using linear or nonlinear mixed-effects height–

diameter equations for other tree species (Jayaraman

and Zakrzewski 2001; Calama and Montero 2004).

The developed models can be implemented in forest inventories by measurement of one tree height per plot, resulting in unbiased predictions using a calibrated curve. The process of calibration apparently accounted indirectly for the effects of stand density on the height–diameter relationships (e.g., Zeide and VanderSchaaf 2002). Even though not evaluated here, the effects of site productivity can be similarly accounted for by a mixed-effects modeling approach (e.g., Huang et al. 1992). The estimate of random-effects parameters makes the inclusion of additional predictor variables in the model unnecessary for many forest inventory applications that consider measurement of a sub-sample of heights. The advantages are maintenance of a simple model structure that does not assume the height–diameter relationship is constant within a stand but rather allows for plot-specific curves.

Fig. 3 Estimated height–diameter curves using: a mean response (Eq. 9), b calibrated response (Eq. 11) and c a comparison between the calibrated response and a local height–diameter equation (dashed line) fitted by OLS. Each panel plots height (m) against DBH (cm)

Acknowledgments Data for this study and financial support were

provided through the Loblolly Pine Growth and Yield Research

Cooperative, Department of Forestry, Virginia Polytechnic Insti-

tute and State University.

Appendix 1. Linear mixed-effects model theory

A general linear model can be expressed as:

$$E[Y_i] = X_i \mathbf{b}, \tag{12}$$

where $Y_i$ is a vector of observations from cluster i (i = 1, ..., N), $X_i$ is an ($n_i \times p$) regressor matrix of cluster i and $\mathbf{b}$ is a ($p \times 1$) vector of regression coefficients (vector of fixed-effects parameters applicable to all N clusters). This fixed-effects model assumes the mean response of a specific set of regressor values is constant for all clusters. However, the fixed-effect model can be modified by including cluster-specific parameters (random effects) thus permitting the mean response to vary from cluster to cluster, taking the following form:

$$E[Y_i \mid \mathbf{b}_i] = X_i(\mathbf{b} + \mathbf{b}_i) = X_i \mathbf{b} + X_i \mathbf{b}_i,$$

where $\mathbf{b}_i$ provides for cluster-specific behavior. A more general expression for this model is given by:

$$E[Y_i \mid \mathbf{b}_i] = X_i \mathbf{b} + Z_i \mathbf{b}_i, \tag{13}$$

where $Z_i$ is a ($n_i \times q$) regressor matrix (or design matrix) containing explanatory variables and $\mathbf{b}_i$ is a ($q \times 1$) vector of random effects. The matrix $Z_i$ can have the same regressors as in $X_i$ or it may only contain those regressors in $X_i$ that vary among clusters. The expectation of Eq. (13) is conditioned on the vector of random parameters, assuming that $\mathbf{b}_i$ is normally distributed with $E[\mathbf{b}_i] = 0$ and $\mathrm{Var}[\mathbf{b}_i] = D$. In terms of linear models, a linear mixed-effects model can be expressed as:

$$Y_i \mid \mathbf{b}_i = X_i \mathbf{b} + Z_i \mathbf{b}_i + \mathbf{e}_i,$$

where $\mathbf{e}_i$ is a random error assumed to be normally distributed with $E[\mathbf{e}_i] = 0$ and $\mathrm{Var}[\mathbf{e}_i] = R_i$. $R_i$ is an ($n_i \times n_i$) variance–covariance matrix of cluster i. The random error is assumed to be independent of the random vector $\mathbf{b}_i$ with $\mathrm{Cov}[\mathbf{e}_i, \mathbf{b}_i] = 0$. Then, $E[Y_i] = X_i \mathbf{b}$ with a covariance matrix of $\mathrm{Var}(Y_i) = V_i = Z_i D Z_i' + R_i$ (e.g., Verbeke and Molenberghs 1997, p. 71). Thus, a mixed-effects model can be expressed in general form as (Laird and Ware 1982):

$$Y_i = X_i \mathbf{b} + Z_i \mathbf{b}_i + \mathbf{e}_i, \tag{14}$$

where

$$Y_i \sim N(X_i \mathbf{b},\; Z_i D Z_i' + R_i) \quad \text{and} \quad \begin{pmatrix} \mathbf{b}_i \\ \mathbf{e}_i \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{bmatrix} D & 0 \\ 0 & R_i \end{bmatrix} \right).$$

Estimation of the fixed vector b

An estimate of the fixed parameter vector $\mathbf{b}$ under model (4) can be obtained from a generalized least squares (GLS) analysis using $V_i^{-1}$ as weights. Assuming that all the variance and covariance parameters of D and $R_i$ are known, which means $V_i$ is known, the following estimator $\hat{\mathbf{b}}$ is obtained, which is the best linear unbiased estimator (BLUE):

$$\hat{\mathbf{b}} = \left( \sum_{i=1}^{N} X_i' V_i^{-1} X_i \right)^{-1} \sum_{i=1}^{N} X_i' V_i^{-1} Y_i, \tag{15}$$

where $V_i^{-1} = [Z_i D Z_i' + R_i]^{-1}$ and its variance–covariance matrix is:

$$\mathrm{Var}(\hat{\mathbf{b}}) = \left( \sum_{i=1}^{N} X_i' V_i^{-1} X_i \right)^{-1}.$$

Prediction of the random vector bi

Even though we are interested in the fixed-effects parameters, providing us with a population average curve, the main purpose for using mixed-effects models is to estimate cluster-specific parameters. Knowing that $\mathbf{b}_i$ and $Y_i$ are distributed jointly multivariate normal, then the conditional expectation of $\mathbf{b}_i$ is given by:

$$E[\mathbf{b}_i \mid Y_i] = E(\mathbf{b}_i) + \mathrm{Cov}[\mathbf{b}_i, Y_i]\, \mathrm{Var}[Y_i]^{-1} (Y_i - E[Y_i]),$$

where $\mathrm{Cov}[\mathbf{b}_i, Y_i] = \mathrm{Cov}[\mathbf{b}_i, X_i \mathbf{b} + Z_i \mathbf{b}_i] = D Z_i'$ and $E(\mathbf{b}_i) = 0$, and thus the best linear unbiased predictor (BLUP) of $\mathbf{b}_i$ is given by:

$$\hat{\mathbf{b}}_i = D Z_i' W_i (Y_i - X_i \hat{\mathbf{b}}), \tag{16}$$

where $W_i = V_i^{-1} = [Z_i D Z_i' + R_i]^{-1}$ (see Rencher 2000, p. 431). As mentioned by Lappi (1991), this expression requires the inversion of a matrix with dimension equal to the number of observations. The variance–covariance matrix of the prediction errors, $\mathrm{Var}[\mathbf{b}_i - \hat{\mathbf{b}}_i]$, is given by

$$\mathrm{Var}[\mathbf{b}_i - \hat{\mathbf{b}}_i] = \left[ Z_i' R_i^{-1} Z_i + D^{-1} \right]^{-1}. \tag{17}$$

The expressions given for $\hat{\mathbf{b}}$ and $\hat{\mathbf{b}}_i$ in (15) and (16) assume that $V_i$ is known, i.e., D and $R_i$ are known. However, in normal practice a consistent estimator given by $\hat{V}_i = Z_i \hat{D} Z_i' + \hat{R}_i$ must be used instead. Likelihood-based methods are used for estimating D and $R_i$ based on the assumptions that $\mathbf{b}_i$ and $\mathbf{e}_i$ are normally distributed (see Littell et al. 1996; Schabenberger and Pierce 2002).

Appendix 2

References

Arabatzis AA, Burkhart HE (1992) An evaluation of sampling methods and model forms for estimating height–diameter relationships in loblolly pine plantations. For Sci 38:192–198

Avery TE, Burkhart HE (2002) Forest measurements. 5th edn. McGraw-Hill, New York

Assman E (1970) The principles of forest yield studies. Pergamon, Oxford

Baskerville GL (1972) Use of logarithmic regression in the estimation of plant biomass. Can J For Res 2:49–53

Burkhart HE, Parker RC, Strub MR, Oderwald RG (1972) Yield of old-field loblolly pine plantations. School of Forestry and Wildlife Resources, Va Polytech Institute and State University Publication FWS, pp 3–72

Burkhart HE, Cloren DC, Amateis RL (1985) Yield relationships in unthinned loblolly pine plantations on cutover, site-prepared lands. South J Appl For 9:84–91

Calama R, Montero G (2004) Interregional nonlinear height–diameter model with random coefficients for stone pine in Spain. Can J For Res 34:150–163

Castedo Dorado F, Barrio Anta M, Parresol BR, Álvarez González JG (2005) Stochastic height–diameter model for maritime pine ecoregions in Galicia (northwestern Spain). Ann For Sci 62:455–465

Eerikäinen K (2001) Stem volume models with random coefficients for Pinus kesiya in Tanzania, Zambia, and Zimbabwe. Can J For Res 31:879–888

Eerikäinen K (2003) Predicting the height–diameter pattern of planted Pinus kesiya stands in Zambia and Zimbabwe. For Ecol Manage 175:355–366

Epstein R, Nieto E, Weintraub A, Chevalier P, Gabarró J (1999) A system for the design of short term harvesting strategy. Eur J Oper Res 119:427–439

Fang Z, Bailey RL (2001) Nonlinear mixed effects modeling for slash pine dominant height growth following intensive silvicultural treatments. For Sci 47:287–300

Gregoire TG, Schabenberger O (1996a) Non-linear mixed effects modeling of cumulative bole volume with spatially correlated within-tree data. J Agri Biol Environ Stat 1:107–109

Gregoire TG, Schabenberger O (1996b) A non-linear mixed-effects model to predict cumulative bole volume of standing trees. J Appl Stat 23:257–271

Hall DB, Clutter M (2004) Multivariate multilevel nonlinear mixed effects models for timber yield predictions. Biometrics 60:16–24

Huang S, Titus SJ, Wiens DD (1992) Comparison of nonlinear height–diameter functions for major Alberta tree species. Can J For Res 22:1297–1304

Hui G, Gadow Kv (1993) Zur Entwicklung von Einheitshöhenkurven am Beispiel der Baumart Cunninghamia lanceolata. Allg Forst Jagdztg 164:218–220

Jayaraman K, Zakrzewski WT (2001) Practical approaches to calibrating height–diameter relationships for natural maple stand in Ontario. For Ecol Manage 148:169–177

Laird NM, Ware JH (1982) Random-effects models for longitudinal data. Biometrics 38:963–974

Lappi J (1986) Mixed linear models for analyzing and predicting stem form variation of Scots pine. Commun Inst For Fenn 134:1–69

Lappi J (1991) Calibration of height and volume equations with random parameters. For Sci 37:781–801

Lappi J, Bailey RL (1988) A height prediction model with random stand and tree parameters: an alternative to traditional site index methods. For Sci 34:907–927

Littell RC, Milliken GA, Stroup WW, Wolfinger RD (1996) SAS® System for mixed models. SAS Institute Inc., Cary

López Sánchez CA, Gorgoso Varela J, Castedo Dorado F, Rojo Alboreceda R, Rodriguez Soalleiro R, Alvarez Gonzalez JG, Sanchez Rodriguez F (2003) A height–diameter model for Pinus radiata D. Don in Galicia (Northwest Spain). Ann For Sci 60:237–345

Lynch T, Murphy P (1995) A compatible height prediction and projection system for individual trees in natural, even-aged shortleaf pine stands. For Sci 41:194–209

Lynch TB, Holley AG, Stevenson DJ (2005) A random-parameter height-dbh model for cherrybark oak. South J Appl For 29:22–26

Martin F, Flewelling J (1998) Evaluation of tree height prediction models for stand inventory. West J Appl For 13:109–119

Mehtätalo L (2004) A longitudinal height–diameter model for Norway spruce in Finland. Can J For Res 34:131–140

Rencher AC (2000) Linear models in statistics. John Wiley, New York

Schabenberger O, Pierce FJ (2002) Contemporary statistical models for the plant and soil sciences. CRC, Boca Raton

Schröder J, Álvarez González JG (2001) Comparing the performance of generalized diameter–height equations for maritime pine in Northwestern Spain. Forstw Cbl 120:18–23

Sharma M, Zhang SY (2004) Height–diameter models using stand characteristics for Pinus banksiana and Pinus mariana. Scand J For Res 19:442–451

Soares P, Tomé M (2002) Height–diameter equation for first rotation eucalypt plantations in Portugal. For Ecol Manage 166:99–109

Tasissa G, Burkhart HE (1998) An application of mixed effects analysis to modeling thinning effects on stem profile of loblolly pine. For Ecol Manage 103:87–101

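/* Total height h and diameter d of the three sample trees */
DATA Example ;
INPUT h d ;
one = 1 ;
lnd = LOG(d) ;
/* Residual from the fixed-effects prediction on the log scale */
RES = LOG(h-1.37)-(1.4027+0.4386*lnd) ;
CARDS ;
21.0 25.9
17.7 16.5
17.7 18.5
;
RUN ;
PROC IML ;
USE Example ;
/* Z: design matrix of the random effects (intercept and ln d) */
READ ALL VAR {one lnd} INTO Z ;
READ ALL VAR {RES} INTO RES ;
/* D: covariance matrix of the random effects; R: residual covariance */
D = { 0.2243 -0.0499, -0.0499 0.0127 } ;
R = 0.00575 * I(3) ;
/* Predicted random effects, b = D Z' (Z D Z' + R)^-1 RES */
b = D*Z`*INV(Z * D * Z` + R)*RES ;
PRINT b ;
QUIT ;

Fig. 4 SAS program for computing random parameters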
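For readers without SAS, the calculation in the Fig. 4 program can be sketched in plain Python. This is an illustrative translation under stated assumptions, not the authors' code: the helper functions (`transpose`, `matmul`, `add`, `inverse`) and all variable names are hypothetical, while the data and parameter values are copied from the program. The sketch also evaluates the algebraically equivalent form that inverts only 2 x 2 matrices, whose inverse term is the prediction-error matrix of Eq. (17).

```python
import math

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def inverse(a):
    """Gauss-Jordan inverse with partial pivoting (adequate for tiny matrices)."""
    n = len(a)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-14:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Values copied from the Fig. 4 program: (h, d) pairs, fixed effects, D and R.
trees = [(21.0, 25.9), (17.7, 16.5), (17.7, 18.5)]
BETA0, BETA1 = 1.4027, 0.4386
D = [[0.2243, -0.0499], [-0.0499, 0.0127]]
SIGMA2 = 0.00575

n = len(trees)
Z = [[1.0, math.log(d)] for _, d in trees]                       # n x 2
resid = [[math.log(h - 1.37) - (BETA0 + BETA1 * math.log(d))]    # n x 1
         for h, d in trees]
R = [[SIGMA2 if i == j else 0.0 for j in range(n)] for i in range(n)]
Zt = transpose(Z)

# Eq. (16): b = D Z' (Z D Z' + R)^-1 resid, which inverts an n x n matrix.
V = add(matmul(matmul(Z, D), Zt), R)
b_hat = matmul(matmul(matmul(D, Zt), inverse(V)), resid)

# Equivalent form: (Z' R^-1 Z + D^-1)^-1 Z' R^-1 resid; the leading inverse
# is the prediction-error matrix of Eq. (17).
R_inv = inverse(R)
pred_var = inverse(add(matmul(matmul(Zt, R_inv), Z), inverse(D)))
b_alt = matmul(matmul(matmul(pred_var, Zt), R_inv), resid)

print("Random effects via Eq. (16):   ", [round(v[0], 4) for v in b_hat])
print("Random effects via 2 x 2 form: ", [round(v[0], 4) for v in b_alt])
print("Prediction-error variances:    ", [round(pred_var[k][k], 5) for k in range(2)])
```

Both routes should print the same pair of predicted random effects up to rounding. The second route is only a convenience when a plot contributes many sample trees, since the size of the matrix being inverted then no longer grows with the number of observations.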


Tasissa G, Burkhart HE, Amateis RL (1997) Volume and taper equations for thinned and unthinned loblolly pine trees in cutover, site-prepared plantations. South J Appl For 21:146–152

Temesgen H, Gadow Kv (2004) Generalized height–diameter models: an application for major tree species in complex stands of interior British Columbia. Eur J For Res 123:45–51

Verbeke G, Molenberghs G (1997) Linear mixed models in practice: a SAS-oriented approach. Springer, Berlin Heidelberg New York

Zeide B, VanderSchaaf C (2002) The effect of density on the height–diameter relationship. In: Proceedings of the 11th biennial southern silvicultural research conference. Outcalt Kenneth W (ed) Gen Tech Rep SRS-48 Asheville, NC: Department of Agriculture, Forest Service, Southern Research Station, pp 463–466

Zhang S, Burkhart HE, Amateis RL (1997) The influence of thinning on tree height and diameter relationships in loblolly pine plantations. South J Appl For 21:199–205

Zhang L, Peng C, Huang S, Zhou X (2002) Development and evaluation of ecoregion-based jack pine height–diameter models for Ontario. For Chron 78:530–538
