Available online at www.isas.org.in/jisas
JOURNAL OF THE INDIAN SOCIETY OF
AGRICULTURAL STATISTICS 64(2) 2010 313-321
Modeling Unstructured Heterogeneity along with Spatially Correlated
Errors in Field Trials
M. Singh1*, Y.P. Chaubey1, A. Sarker2 and D. Sen1
1Department of Mathematics and Statistics,
Concordia University, Montreal, Quebec, Canada
2International Center for Agricultural Research in the
Dry Areas (ICARDA), P.O. Box 5466, Aleppo, Syria
Received 11 August 2009; Revised 21 July 2010; Accepted 23 July 2010
SUMMARY
In this paper we analyze two experimental data sets for evaluating lentil genotypes, one from an incomplete block design
and the other from a complete block design. Incomplete blocks contribute to reducing the experimental error, and
spatially correlated plot-errors can be modeled using an autoregressive scheme, which may further improve the
assessment of the genotypes. Such an approach has been applied in several other studies to model linear trends and
spatially correlated errors. However, the assumption of a constant error variance restricts the scope of the analysis in
many agricultural field trials, and in other situations in general, where heterogeneity of error variances is a reality. In
this study, we approach the problem by first fitting a model with constant error variance and generating the residuals.
We then apply K-means clustering to the squared residuals to group experimental units with similar squared residuals.
Next, we allow the error variance to vary with the group of experimental units; the groups need not satisfy any spatial
restriction. The number of heterogeneous error variances and the experimental units belonging to the heterogeneous
clusters are obtained using the AIC criterion, followed by a group merger scheme based on insignificant changes in the
residual maximum log-likelihood values. The final models with heterogeneous variances were used to evaluate the
precision of the comparisons of genotype means. We found a substantial improvement in the efficiency of pair-wise
comparisons over the other methods of analysis. We recommend the application of this procedure in any general
situation permitting unstructured heterogeneity.
Keywords: Heterogeneous error variances, Spatially correlated errors, Variogram, Clustering, Field trials.
1. INTRODUCTION
Control of field variability is normally done by
applying blocking methods where experience with the
obvious landscape configuration guides the formation
of the blocks for assigning the treatments such as
genotypes of a crop to the field-plots, i.e. the
experimental units. Furthermore, the design may
consist of complete blocks or incomplete blocks
allowing a certain degree of balance under a constant
error variance model. Such approaches are discussed
in standard texts on design and analysis of experiments
(see, for example, Fisher 1990, Cochran and Cox 1992,
Cox and Reid 2000 and Hinkelmann and Kempthorne
2007). In the context of field experiments, the
experimental units on a rectangular layout would
generally be correlated due to their fixed physical
proximity and, in addition, local fertility trends
might be present. Analysis approaches for these
situations have been developed to account for
blocking effects and correlated errors in space and time
(see Gilmour et al. 1997, Cullis and Gleeson 1991 and
Grondona et al. 1996). Various criteria, such as the Akaike
information criterion (AIC), have been used for selecting
appropriate covariance models in these areas (see
Wolfinger 1996 and Singh et al. 2003).
*Corresponding author: M. Singh
E-mail address: smurari@mathstat.concordia.ca
314 M. Singh et al. / Journal of the Indian Society of Agricultural Statistics 64(2) 2010 313-321
The approaches used to capture spatial variability in
field trials have been found useful in enhancing the
breeding efficiency of crop variety improvement
programs (Sarker et al. 2001, Malhotra et al. 2004).
The underlying models in most of these analyses
have assumed homogeneous error variances across all
the plots of the layout. We believe that, in reality,
experimental errors need not be homoscedastic even
after accounting for local fertility trends and/or
autocorrelations across various directions in the layout.
This may be due to a variety of reasons. In field trials,
lack of homogeneity may be attributed to ineffective
cover cropping in the preceding season, or to the farmers'
fields used for experimentation having received crop
management inputs only on the plots or sections of the
field where they were needed. In a well-designed
blocked experiment, the uniform application of
management practices over the whole of a block
might have been overlooked or ignored. It is also
possible that the prevalence or distribution of
underground parasites, such as Orobanche in legume
fields or Striga in sorghum fields, follows an
irregular pattern and makes the nearest-neighbor
adjustment unreliable (Wilkinson et al. 1983).
Therefore, it is essential to allow for heterogeneous
error variances in field trials in addition to
accounting for the other factors. The heterogeneous
error variances need not follow any spatial structure on
the field layout. The general objective of this study,
therefore, is to address unstructured heterogeneity
of error variances in the evaluation of variety trials and
to apply the approach to lentil data.
The identification of the sources and the structure
of heterogeneity is based on residuals from the fitted
model found most suitable when heterogeneity was
ignored. The squared residuals were used to form
clusters or groups of homoscedastic experimental units
and to identify the structure of homogeneity, if any,
using an empirical or non-parametric approach. The use
of squared residuals for studying the homogeneity of
variances has also been found robust to departures
from normality (Levene 1960). Since no clear structure
is expected in the residuals, a non-hierarchical approach
such as K-means clustering can be applied to obtain the
prevailing clusters of homoscedastic units.
Other methods of clustering could also be
used (Everitt et al. 2001). The most appropriate number
of clusters can be determined from the trend in the values
of a criterion and the change in the log-likelihood value for
the heterogeneous models. This study uses data from two
lentil trials with relatively high coefficients of variation,
described in Section 2. The statistical methods
for identifying the structure of heterogeneous errors are
given in Section 3, computational details appear in
Section 4 and results are summarized in Section 5.
2. EXPERIMENTAL DATA
Two trials consisting of genetic materials for a
preliminary yield trial (PYT) and an advanced yield trial
(AYT) were evaluated in block designs at an
experimental station of the International Center for
Agricultural Research in the Dry Areas (ICARDA) at
Breda in northern Syria. Data on seed yield were
examined. Trial 1, a PYT, had 25 genotypes and was
evaluated in a square lattice with 4 replications on a
4 × 25 rectangular layout in 2005. In field trials, the
coefficient of variation (CV) is normally used as an
indicator or a measure of experimental error variability.
The analysis using a randomized complete block design
resulted in a CV of 51% for seed yield. Trial 2, an
AYT, was conducted in randomized complete blocks
with 30 genotypes and 3 replications on a 3 × 30 layout
in 2003 and gave a CV of 41% for seed yield. In the
PYT, the plot size was 4 m × 1.5 m, and in the AYT it
was 4 m × 3 m, with a standard row-to-row distance of
30 cm for the lentil crop. However, at maturity, the actual
harvested area per plot was 4.5 m² and 9 m² for the PYT
and the AYT, respectively. Analysis was performed
based on net harvested area per plot.
3. MODELING HETEROGENEITY OF
SPATIALLY CORRELATED ERRORS
The two data sets were first analyzed by fitting the
spatial models described in Singh et al. (2003) and
screening for the AIC-best model among the models
generated by various combinations of complete or
incomplete blocks; fixed linear, random cubic spline or
no trend; and first-order autocorrelated errors along rows
and columns or independent errors. In the two trials, the best
model for seed yield was found to be randomized
complete blocks with first-order autoregressive errors
along rows. At this stage, each model was based on the
assumption of homogeneous error variances. To examine
any possible indication of heterogeneity, the
residuals obtained from the fitted models above can
be plotted and their variograms examined as well
(see Sarker et al. 2001 for details on obtaining the
variograms). Figs. 1 to 4 exhibit 3D plots of the residuals
and their variograms for the two trials. We noticed no
clear spatial patterns in the residuals (Figs. 1 and 2).
This can be expected since we screened various
models accounting for the presence of linear trends in
the field layout, and the residuals are computed from
the best models obtained following Singh et al. (2003).
Another way to explore the variability is in terms of
the variograms, which indicate the presence of different
levels of variability between the residuals over the
layouts. For instance, in Fig. 3, the variogram of the
residuals in the PYT (2005) indicates variation in the
variances of the plot residuals: 0.40 to 0.65 for plots
within 2 plot-units, values fluctuating within 0.4 to 0.6
for distances of 3 to 22 units, and
variation from 0.2 to 0.6 for plots separated by more
than 22 plot-units. There is no clear spatial pattern that
would allow modelling of the variogram, which takes
different values at nearly the same distances. In Fig. 4
(AYT, 2003), the variogram indicates different levels of
variance: less than 0.3 for distances within 5 plot-units,
fairly constant between 0.3 and 0.4 for distances of 5 to
23 plot-units, while a higher value of nearly 0.48 and low
values close to 0.2 are observed for distances exceeding
23 plot-units. Here also, there is no clear spatial pattern for
distances of more than 23 units. Thus, these cases support
the need to examine non-spatial or unstructured
heterogeneity in the plot error variances.
Fig. 1. 3D plot of the residuals from RCB-AR model analysis of
seed yields in the preliminary yield trials (2005) in 25
genotypes (RCB-AR model: The model incorporates
random replication effects and first-order autoregressive
plot-errors across columns)
Fig. 2. 3D plot of the residuals from RCB-AR model analysis of
seed yields in the advanced yield trials (2003) in 30
genotypes (RCB-AR model: The model incorporates
random replication effects and first-order autoregressive
plot-errors across columns)
Fig. 3. Variogram of the residuals from RCB-AR model analysis
of seed yields in the preliminary yield trials (2005) in 25
genotypes (RCB-AR model: The model incorporates
random replication effects and first-order autoregressive
plot-errors across columns)
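As an illustration of the variogram inspection discussed in this section, the following minimal sketch computes a standard empirical semivariogram of residuals along the columns of each row. It is not the procedure of Sarker et al. (2001), whose details are not reproduced here; the function name `empirical_variogram`, the restriction to within-row lags, and the toy residuals are all assumptions made for illustration.

```python
from collections import defaultdict


def empirical_variogram(residual_rows):
    """Empirical semivariogram of residuals by column lag.

    residual_rows: list of rows, each a list of plot residuals ordered
    by column. Only pairs of plots within the same row are compared
    (an assumption of this sketch).
    Returns {lag: semivariance}, where
    semivariance(lag) = sum (r_i - r_j)^2 / (2 * number of pairs).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in residual_rows:
        for i in range(len(row)):
            for j in range(i + 1, len(row)):
                lag = j - i
                sums[lag] += (row[i] - row[j]) ** 2
                counts[lag] += 1
    return {lag: sums[lag] / (2 * counts[lag]) for lag in sorted(sums)}


# Toy residuals (hypothetical, not the trial data).
toy = [[0.0, 1.0, 0.0, 1.0]]
gamma = empirical_variogram(toy)
print(gamma)
```

Semivariances that take clearly different levels at similar lags, as described for Figs. 3 and 4, would then suggest heterogeneity that a single spatial model does not capture.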
In addition to the visual approach of exploring
heterogeneity in the above data, we also applied the
method presented in Chaubey (1981) for detecting the
presence of heterogeneity of variances in the data. The
residuals were ordered based on their absolute values.
The variances were computed using these ordered
residuals from (a) two groups formed from the highest/
lowest 50% of the residuals, and (b) three groups from
lowest/highest 33% of the residuals. The F-test was
used with residual degrees of freedom equally allotted.
As can be seen in Table 1, there is an indication of
presence of the heterogeneous error variance in the
data.
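The grouping step described above can be sketched as follows. This is only a simplified reading of the description in this section, not the exact statistic of Chaubey (1981): residuals are ordered by absolute value, the lowest and highest groups are taken, and each group sum of squares is divided by an equal share of the residual degrees of freedom. The function name, its arguments and the toy residuals are hypothetical.

```python
def grouped_variance_ratio(residuals, resid_df, n_groups=2):
    """Variance ratio of the highest to the lowest group of residuals.

    residuals: residuals from the fitted homogeneous-variance model.
    resid_df:  residual degrees of freedom, split equally over groups.
    n_groups:  2 (halves) or 3 (thirds); only the lowest and highest
               groups enter the ratio.
    Returns (s2_low, s2_high, ratio).
    """
    ordered = sorted(residuals, key=abs)
    size = len(ordered) // n_groups
    low = ordered[:size]
    high = ordered[len(ordered) - size:]
    df = resid_df / n_groups
    s2_low = sum(r * r for r in low) / df
    s2_high = sum(r * r for r in high) / df
    return s2_low, s2_high, s2_high / s2_low


# Toy residuals (hypothetical, not the trial data).
toy_residuals = [0.1, -0.2, 0.1, -0.2, 1.0, -2.0, 1.0, -2.0]
s2_low, s2_high, ratio = grouped_variance_ratio(toy_residuals, resid_df=6)
print(s2_low, s2_high, ratio)
```

The ratio would then be referred to an F distribution with the allotted degrees of freedom in the numerator and denominator.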
In the presence of heterogeneity, the next question
is how to identify the groups of experimental units with
heterogeneous error variances. For this purpose, we
use the following two-step procedure.
Step-1: Formation of clusters: Based on the best model
selected following Singh et al. (2003), we first applied
K-means clustering to its squared residuals, using the
criterion which maximizes the between-group sum of
squares. The number of groups, set a priori, was varied
over K = 2, ..., 10. The change in the criterion value
was noted for successive values of K, the number of
groups or clusters of the experimental units. The value
of K at which the change was no longer substantial was
considered as the potential number of clusters. For each
set of clusters of the experimental units, we modeled
the data using the spatial errors as per the best model
and a random factor whose error variances were allowed
to vary with the cluster of units obtained for the chosen
value of K. For example, if K = 3, there were three error
variances, $\sigma_1^2$, $\sigma_2^2$ and $\sigma_3^2$. For such a fitted model, we
computed the likelihood in terms of the -2ln(REML) value
(REML: residual maximum likelihood) and the successive
change in its value with a unit increase in K. At this
stage, the clusterings for successive values of K are
unlikely to be nested; therefore, we cannot apply a test of
significance (such as chi-square) to the decrease in
-2ln(REML). However, we can use the Akaike information
criterion (AIC) to decide on the number of groups:
a smaller AIC is better. We used Genstat (Payne et al.
2009) for the computation, which produces a quantity
called deviance that is equal to -2ln(REML),
ignoring a constant which depends on the fixed effect
terms. We used the quantity AICD, which expresses AIC
in terms of the deviance, AICD = deviance + 2q,
where q is the number of covariance parameters (Singh
et al. 2003).
Fig. 4. Variogram of the residuals from RCB-AR model analysis
of seed yields in the advanced yield trials (2003) in 30
genotypes (RCB-AR model: The model incorporates
random replication effects and first-order autoregressive
plot-errors across columns)
Table 1. Preliminary indication of heterogeneity of error variances using approximate F-tests in the data on seed
yields of the two trials at Breda, Syria
(a) Trial 1: Preliminary yield trial, 2005
Two groups:    $s_1^2$ = 10.01   $s_2^2$ = 20.26   $F_{36,36} = s_2^2/s_1^2$ = 2.02   P-value = 0.0204
Three groups:  $s_1^2$ = 10.00   $s_3^2$ = 30.23   $F_{24,24} = s_3^2/s_1^2$ = 3.02   P-value = 0.0046
(b) Trial 2: Advanced yield trial, 2003
Two groups:    $s_1^2$ = 10.00   $s_2^2$ = 20.21   $F_{29,29} = s_2^2/s_1^2$ = 2.02   P-value = 0.0330
Three groups:  $s_1^2$ = 10.00   $s_3^2$ = 36.85   $F_{13,14} = s_3^2/s_1^2$ = 3.02   P-value = 0.0093
Note: F-test is based on Chaubey (1981) adapted to the fitted models.
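Step-1 can be sketched in code as below. This is a minimal illustration under stated assumptions, not the Genstat procedure used in the paper: a plain Lloyd-type K-means on the scalar squared residuals (minimizing the within-group sum of squares, which is equivalent to maximizing the between-group sum of squares), with a deterministic initialization chosen only for this sketch. The function names and the toy residuals are hypothetical; the AICD check uses the K = 2 row of Table 2 for Trial 1.

```python
def kmeans_1d(values, k, n_iter=100):
    """Lloyd-type K-means on scalars; returns (labels, centers).

    Centers start at evenly spaced order statistics, an arbitrary
    but deterministic choice made for this sketch.
    """
    srt = sorted(values)
    n = len(srt)
    centers = [srt[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(n_iter):
        # Assign each value to its nearest center.
        labels = [min(range(k), key=lambda c: (v - centers[c]) ** 2)
                  for v in values]
        # Recompute each center as the mean of its members.
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(sum(members) / len(members)
                               if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def within_ss(values, labels, centers):
    """Within-group sum of squares for a given clustering."""
    return sum((v - centers[lab]) ** 2 for v, lab in zip(values, labels))


def aicd(deviance, q):
    """AIC expressed in terms of the deviance: deviance + 2q."""
    return deviance + 2 * q


# Toy residuals (hypothetical, not the trial data).
toy_residuals = [0.1, -0.2, 0.15, 1.0, -1.1, 3.0, -2.9]
squared = [r * r for r in toy_residuals]
labels, centers = kmeans_1d(squared, k=3)
print(labels, within_ss(squared, labels, centers))
print(aicd(48.06, 5))  # Trial 1, K = 2 in Table 2
```

In practice the clustering would be repeated for K = 2, ..., 10, the heterogeneous-variance model refitted for each K, and the K with the smallest AICD retained as the starting point for Step-2.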
Step-2: Fusion of the clusters: Step-1 provides a
number of clusters, say K, with heterogeneous error
variances ($\sigma_j^2$, j = 1, ..., K). Let the deviance at this step
be $D_0$. The error variance estimates were arranged in
order, and the two clusters whose estimates were closest
were merged. The model was then fitted with the reduced
number of clusters (K - 1) and the deviance, say $D_1$,
was computed. Since the fusion of the clusters presents a
nested structure of the units, it is possible to test the
hypothesis of equality of the variance components of the two
merged clusters. Under equality of the variances,
the difference $D_1 - D_0$ follows a chi-square
distribution with 1 degree of freedom. If the observed
difference is greater than the chi-square value at the
chosen level of significance, the K clusters
available at Step-1 are taken as final, and the
estimation of the genotype effects proceeds with
the K error variances. If the observed difference is
smaller than the chi-square value, the K - 1 merged
clusters are considered for further analysis,
repeating the process of fusing the clusters with the closest
error variance estimates and evaluating the change in
the deviance against the chi-square value with 1
degree of freedom.
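The decision rule of Step-2 can be illustrated with the short sketch below. The helper names are hypothetical, and the chi-square tail probability for 1 degree of freedom is computed from the identity P(chi-square > x) = erfc(sqrt(x/2)), which holds because a chi-square variate with 1 degree of freedom is the square of a standard normal variate. The refitting of the model after each merger is not shown; the deviances are taken as given (here, the Trial 1 values reported in Table 3).

```python
import math


def closest_pair(variances):
    """Indices (i, j) of the two clusters with the closest variance estimates."""
    best = None
    for i in range(len(variances)):
        for j in range(i + 1, len(variances)):
            gap = abs(variances[i] - variances[j])
            if best is None or gap < best[0]:
                best = (gap, i, j)
    return best[1], best[2]


def chi2_1df_pvalue(x):
    """Upper-tail probability of a chi-square variate with 1 df."""
    return math.erfc(math.sqrt(x / 2.0)) if x > 0 else 1.0


def keep_separate(d0, d1, alpha=0.05):
    """True if merging raised the deviance significantly (clusters stay apart)."""
    return chi2_1df_pvalue(d1 - d0) < alpha


# Trial 1 estimates and deviances as reported in Table 3.
pair = closest_pair([0.15, 2.12, 0.85])   # zero-based indices
p_value = chi2_1df_pvalue(54.3 - 33.76)
print(pair, p_value, keep_separate(33.76, 54.3))
```

With these inputs the closest pair corresponds to groups 1 and 3, and the change in deviance of 20.54 gives a P-value well below 0.001, in line with the decision reported for Trial 1.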
4. ESTIMATION OF THE VARIANCE-
COVARIANCE PARAMETERS
We present here a general model and a
computational procedure for estimating the variance-
covariance parameters. Let $y = (y_{ijk})$ be the vector of
responses (yields), where $y_{ijk}$ is the response from the
plot receiving the i-th genotype (treatment) in the k-th
incomplete block of the j-th replication of the design used.
The vector y can equivalently be denoted by $y = (y_{RC})$, where
R and C denote the row and column coordinates of the plot
associated with the indices i, j, k. The model for $y_{ijk}$ is given
by:
$y_{ijk} = \mu + \pi_j + \beta_{jk} + \tau_i + e_{RC}$
where $\mu$ is the general mean, $\pi_j$ is the effect of
replication j, $\beta_{jk}$ is the effect of block k in replication
j, $\tau_i$ is the effect of treatment i, and the $e_{RC}$ are random
errors with an auto-covariance structure along/across
rows/columns. Let N be the number of experimental
units. The N errors, presented as the vector $e = (e_{RC})$,
may have heterogeneous variances $\sigma_l^2$ (l = 1, ...,
K), where K is the number of clusters of the N
experimental units. The diagonal matrix of variances for
the N errors can be written as $\sigma_d^2$, using the $\sigma_l^2$
associated with each plot. Further, suppose that the model
selection following Singh et al. (2003) resulted in auto-
correlated errors across columns with correlations
expressed as $\mathrm{corr}(e_{RC}, e_{R'C'}) = \phi^{|C - C'|}$; then the above
model can be written more compactly as:
$y = X\alpha + Z\beta + e$
where X is the design matrix associated with the factors
whose effects are assumed fixed, $\alpha$ say, consisting of the
genotype effects $\tau_i$ and $\mu$, and Z is the design matrix
for the factors whose effects are assumed random, $\beta$ say,
consisting of the replication effects $\pi_j$ etc. The variance-
covariance matrix of the plot-error vector e can be written as
$R = \sigma_d \, (\mathrm{corr}(e_{RC}, e_{R'C'})) \, \sigma_d$
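To make the structure of R concrete, the sketch below builds it for a single row of plots, assuming first-order autoregressive correlations across columns and a cluster-specific variance for each plot. Restricting attention to one row, and the names `ar1_corr` and `error_covariance`, are assumptions of this illustration; the cluster labels and parameter values are invented.

```python
import math


def ar1_corr(n_cols, phi):
    """AR(1) correlation matrix with entries phi ** |C - C'|."""
    return [[phi ** abs(c - c2) for c2 in range(n_cols)]
            for c in range(n_cols)]


def error_covariance(groups, group_var, phi):
    """R = S * Corr * S for one row of plots.

    groups:    cluster label of each plot, in column order.
    group_var: {label: error variance of that cluster}.
    phi:       autocorrelation between neighbouring columns.
    S is the diagonal matrix of plot standard deviations.
    """
    sd = [math.sqrt(group_var[g]) for g in groups]
    corr = ar1_corr(len(groups), phi)
    n = len(groups)
    return [[sd[a] * corr[a][b] * sd[b] for b in range(n)]
            for a in range(n)]


# Hypothetical example: three plots, two variance clusters.
R = error_covariance(groups=[1, 2, 1],
                     group_var={1: 0.25, 2: 1.0},
                     phi=0.5)
for row in R:
    print(row)
```

The diagonal of R carries the cluster variances, while the off-diagonal entries combine the spatial correlation with the standard deviations of the two plots involved.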
The computation of the estimates of the parameters
associated with the fixed effects $\alpha$, the variance
components of the factors in $\beta$ and the correlation
parameter $\phi$ is available in various computing software
such as GENSTAT and SAS. Generally, the matrix R has a
structure of correlations and variances. In the two
datasets, while the correlations between the plot errors
$e_{RC}$ have a spatial structure, the (plot) error variances
do not. For example, there is neither an assumed
structure for $\sigma_l^2$ over the positions of the units,
nor are the variances totally unstructured, as there are
only K < N distinct variances. Let REP, GENO, ROWS and
COLS stand for the replication, genotype (treatment),
row and column factors and YIELD for the response
variate. Let HGROUP stand for the factor with K
levels representing the units with heterogeneous variances.
The key Genstat directives to compute the variances,
autocorrelation and standard errors are:
Vcomponents [Fixed=GENO] REP+HGROUP.ROWS.COLS; constraints=positive
VStructure [Term=HGROUP.ROWS.COLS] diag, AR; Factor=HGROUP, COLS
Reml [prin=m,c,w,mean,d; workspace=50; maxcycle=150; pse=d] YIELD
The above code produces a common $\sigma_e^2$ (error
variance) and the other variances as ratios $d_l$, from which
$\sigma_l^2 = (d_l + 1)\sigma_e^2$ gives the error variance
corresponding to the l-th cluster, which varies with the
level of the grouping factor HGROUP, l = 1, ..., K.
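As a small worked example of this relation, the sketch below converts the common error variance and the ratios into cluster variances. The function name is hypothetical; the inputs are the Trial 1 heterogeneous-model estimates reported in Table 5.

```python
def cluster_variances(sigma2_e, ratios):
    """Cluster error variances from sigma_l^2 = (d_l + 1) * sigma_e^2."""
    return [(d + 1.0) * sigma2_e for d in ratios]


# Trial 1, heterogeneous model (Table 5): sigma_e^2 = 0.15,
# d_1 = 0.00, d_2 = 12.93, d_3 = 4.62.
variances = cluster_variances(0.15, [0.00, 12.93, 4.62])
print([round(v, 4) for v in variances])
```

The resulting values (about 0.15, 2.09 and 0.84) are close to the group estimates 0.15, 2.12 and 0.85 listed in Table 3; the small differences presumably reflect rounding of the reported estimates.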
5. RESULTS & DISCUSSION
Following the test of Chaubey (1981), Table 1
gives estimates of error variances based on ordered
absolute residuals for two and three assumed groups.
As can be noted from the computed F-values for both
data sets, there is an indication of
heterogeneity in the error variances. This supports our
effort to explore heterogeneous clusters of units.
Table 2 gives information on the distribution of
experimental units with homogeneous error variances
obtained using K-means clustering, together with the AICD
(AIC values expressed in terms of the deviance, see Singh
et al. 2003). It may be seen that the number of
heterogeneous groups inferred at this step is 3 for each of
the two trials.
Table 2. Clusters of experimental units with heterogeneous
error variances on seed yield data in the two trials
K    Cluster sizes    Criterion value   Change in criterion value   q   Deviance   AICD
(a) Trial 1: Preliminary yield trial, 2005
1    100              -                 -                           3   71         77
2    78, 22           10.23             -                           5   48.06      58.06
3*   60, 13, 27       5.75              4.48                        6   33.76      45.71
4    5, 49, 30, 16    2.85              2.90                        7   32.02      46.02
(b) Trial 2: Advanced yield trial, 2003
1    90               -                 -                           3   43.54      49.54
2    12, 78           7.64              -                           5   17.33      27.33
3*   10, 19, 61       4.19              3.45                        6   5.82       17.28
4    2, 8, 19, 61     1.28              -2.91                       7   NC         -
Note: Rows marked * (set in bold in the original) indicate that the
corresponding clusters were identified with heterogeneous error variances.
q = number of covariance parameters. AICD = AIC (Akaike
information criterion) expressed in terms of deviance (Singh
et al. 2003).
Table 3. Number of experimental units and estimates of
variance components for various groups when the
heterogeneous groups were selected using the AIC
criterion or fused using the change in the deviance,
and the corresponding deviance from the fitted model,
for seed yield data from the two trials conducted
at Breda, Syria.
(a) Trial 1: Preliminary yield trial, 2005
(i) Overall grouping: Deviance = 33.76, DF = 68
Group (l)   No. of units ($N_l$)   $\hat{\sigma}_l^2$
1           60                     0.15
2           13                     2.12
3           27                     0.85
(ii) Groups 1 and 3 merged: Deviance = 54.3, DF = 69
Group (l)   No. of units ($N_l$)   $\hat{\sigma}_l^2$
1           87                     0.371
2           13                     2.587
Change in deviance = 20.54, DF = 1, P-value < 0.001
(b) Trial 2: Advanced yield trial, 2003
(i) Overall grouping: Deviance = 5.28, DF = 54
Group (l)   No. of units ($N_l$)   $\hat{\sigma}_l^2$
1           10                     1.515
2           19                     0.973
3           61                     0.0998
(ii) Groups 1 and 2 merged: Deviance = 3.88, DF = 55
Group (l)   No. of units ($N_l$)   $\hat{\sigma}_l^2$
1           29                     1.0642
2           61                     0.0961
Change in deviance = 1.4, DF = 1, P-value = 1.00
(iii) All the groups merged: Deviance = 43.54, DF = 57
Group (l)   No. of units ($N_l$)   $\hat{\sigma}_l^2$
1           90                     0.501
Change in deviance = 39.66, DF = 2, P-value < 0.001
Note: DF = degrees of freedom associated with the deviance
(residuals).
Table 3 provides the estimates of the variance
components at Step-1 (i.e. when the groups were selected
using the AIC criterion) and at Step-2 (i.e. when the closest
groups were fused and the change in deviance was tested
against chi-square). For Trial 1, fusion of the two closest clusters
resulted in a significant increase in deviance (P < 0.001);
therefore, the three heterogeneous groups with 60, 13
and 27 units were retained in the models used for
the evaluation of the genotypes. For Trial 2, the three
clusters obtained from Step-1 were fused into two
clusters with an insignificant increase in the deviance.
When merged again (now into a single group), there
was a significant increase in the deviance, implying the
presence of only two heterogeneous groups of units.
Further, the spatial distribution of the experimental
plots is exhibited on the layout schema (Table 4) for
the heterogeneous groups resulting at Step-1
and/or at the final stage of the formation of
heterogeneous clusters. In these two trials, nearly 60%
of the units have the lowest level of error variability. The
positions of the units from the other clusters are
reasonably spread throughout the field layout.
Table 4. Position of experimental units grouped (1-3) according to heterogeneous error variances on the
rectangular layouts for the two trials conducted at Breda, Syria
Trial 1: Seed yield (Preliminary yield trial, 2005)
Using three heterogeneous groups selected on AIC criterion.
       Columns
Rows    1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
1       1  3  1  1  1  3  3  1  1  1  1  1  1  2  1
2       1  1  3  1  3  1  1  2  1  1  3  2  1  1  1
3       1  3  1  3  2  1  3  1  1  3  3  2  3  1  1
4       1  1  1  1  3  3  1  1  1  1  2  2  3  3  1
       Columns
Rows   16 17 18 19 20 21 22 23 24 25
1       3  1  1  3  2  1  3  1  3  3
2       1  1  1  2  1  1  2  1  1  1
3       1  3  3  1  3  1  1  1  3  1
4       3  1  2  3  2  1  1  1  1  2
Trial 2: Seed yield (Advanced yield trial, 2003)
(a) Using three heterogeneous groups selected on AIC criterion
       Columns
Rows    1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
1       2  1  2  3  3  3  2  3  3  2  2  2  1  3  3
2       3  3  3  3  3  2  1  1  3  3  2  3  3  2  3
3       3  3  3  1  3  3  3  2  3  3  3  3  3  3  2
       Columns
Rows   16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
1       2  3  3  3  3  1  1  2  1  1  2  3  3  3  3
2       2  3  3  3  3  2  3  3  3  3  3  2  3  2  3
3       3  3  3  3  3  3  3  3  3  3  1  2  3  3  3
(b) Merged to two heterogeneous groups
       Columns
Rows    1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
1       1  1  1  2  2  2  1  2  2  1  1  1  1  2  2
2       2  2  2  2  2  1  1  1  2  2  1  2  2  1  2
3       2  2  2  1  2  2  2  1  2  2  2  2  2  2  1
       Columns
Rows   16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
1       1  2  2  2  2  1  1  1  1  1  1  2  2  2  2
2       1  2  2  2  2  1  2  2  2  2  2  1  2  1  2
3       2  2  2  2  2  2  2  2  2  2  1  1  2  2  2
Using the chosen combination of autocorrelation (spatial errors)
and heterogeneous variances for the errors in the model,
the estimates of the various variances and autocorrelation
parameters are given in Table 5. Table 5 also exhibits
the P-value for equality of the genotype effects based
on the Wald statistic and the average variance of the
estimated difference of pair-wise genotype effects. The
efficiency (%) values are given in comparison with the
standard randomized complete block design model.
Table 5. Estimates of variance components, Wald test statistic and significance level, average estimated variance
of pair-wise genotype comparisons, and efficiency of the design-analysis models for seed yield data from the two trials
conducted at Breda, Syria
(a) Trial 1: Preliminary yield trial, 2005
Model              Variance component estimates      WStat    DF   P-value   Av. var.   Eff(%)
RCB, homogeneous   $\sigma_e^2$ = 0.630 ± 0.1036     36.07    24   0.094     0.3194     100
Homogeneous        $\sigma_e^2$ = 0.64 ± 0.111       40.36    24   0.055     0.2953     108
                   $\phi$ = 0.27 ± 0.128
Heterogeneous      $\sigma_e^2$ = 0.15 ± 0.035       74.88    24   0.001     0.1495     214
                   $d_1$ = 0.00 ± 0.00
                   $d_2$ = 12.93 ± 6.78
                   $d_3$ = 4.62 ± 2.29
                   $\phi$ = 0.56 ± 0.246
(b) Trial 2: Advanced yield trial, 2003
Model              Variance component estimates      WStat    DF   P-value   Av. var.   Eff(%)
RCB, homogeneous   $\sigma_e^2$ = 0.470 ± 0.0871     43.41    29   0.096     0.3132     100
Homogeneous        $\sigma_e^2$ = 0.501 ± 0.109      60.84    29   0.014     0.2346     134
                   $\phi$ = 0.46 ± 0.116
Heterogeneous      $\sigma_e^2$ = 0.0822 ± 0.0304    194.32   29   <0.001    0.0933     336
                   $d_1$ = 12.27 ± 6.87
                   $d_2$ = 0.198 ± 0.4151
                   $\phi$ = 0.80 ± 0.121
Note: WStat = Wald statistic for testing equality of genotype effects (assumed fixed). DF = Degrees of freedom of the genotype.
Av. var. = Average variance of difference of estimated effects between a pair of genotypes. AIC = Akaike information criterion.
P-value = P-value based on the Wald test. Eff(%) = Percent efficiency over RCB (randomized complete block design) model.
It may be noted that the best models without
heterogeneity components fail to detect statistically
significant differences in genotype effects in Trial 1
(P-value 0.055), while the P-value is 0.014 for
Trial 2. Introducing heterogeneous error
variances clearly enhances the significance level
(P-value ≤ 0.001) for genotype main effects in both
cases. For the spatial models, the reductions of 49% and
60% in the average variance of the difference of the
genotype effects for Trials 1 and 2, respectively, can be
considered substantial. While the spatial models for
Trials 1 and 2 are more efficient than the RCB model even
without heterogeneity of error variances, incorporating
heterogeneity of error variances in the model has
drastically improved the efficiency of the pairwise
comparisons of the genotypes. The efficiencies were
found to be 214% and 336% for Trials 1 and 2,
respectively.
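The efficiency and variance-reduction figures quoted above can be checked from the average variances in Table 5. The sketch below assumes that efficiency is the ratio of the average variance under the RCB model to that under the model considered, expressed as a percentage; this definition is inferred from the tabulated values rather than stated in the text, and the function names are hypothetical.

```python
def efficiency_percent(av_var_rcb, av_var_model):
    """Efficiency (%) of a model relative to the RCB model."""
    return 100.0 * av_var_rcb / av_var_model


def variance_reduction_percent(av_var_before, av_var_after):
    """Percent reduction in the average variance of pairwise differences."""
    return 100.0 * (1.0 - av_var_after / av_var_before)


# Average variances from Table 5: (RCB, spatial homogeneous, heterogeneous).
trial1 = (0.3194, 0.2953, 0.1495)
trial2 = (0.3132, 0.2346, 0.0933)

for name, (rcb, homog, heterog) in (("Trial 1", trial1), ("Trial 2", trial2)):
    print(name,
          round(efficiency_percent(rcb, homog)),
          round(efficiency_percent(rcb, heterog)),
          round(variance_reduction_percent(homog, heterog)))
```

Under this definition the sketch reproduces the tabulated efficiencies (108 and 214 for Trial 1, 134 and 336 for Trial 2) and the reductions of 49% and 60% obtained when heterogeneous variances are added to the spatial models.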
The evaluation of these trials supports the need to
examine the presence of heterogeneous errors in the
experimental units in field trials, and shows clearly that
considerable improvement can be made by identifying
and accounting for them at the analysis stage. Such
an approach can easily be incorporated in most
data analysis situations involving spatial, temporal
or even unstructured experimental units and would,
therefore, enhance the efficiency of the associated plant
breeding process.
ACKNOWLEDGEMENT
The authors are thankful to Professor Sudhir Gupta for
his efficient handling of the paper and to a referee for his
critical reading and constructive comments. The research of
the second author was partially supported by the author's
Discovery Grant from the Natural Sciences and Engineering
Research Council of Canada.
REFERENCES
Chaubey, Y.P. (1981). Testing the equality of variances of two
linear models. Canad. J. Statist., 9, 119-127.
Cochran, W.G. and Cox, G.M. (1992). Experimental Designs.
Wiley Classic Edition (2nd ed.), John Wiley and Sons,
New York.
Cox, D.R. and Reid, N. (2000). The Theory of the Design of
Experiments. Boca Raton, CRC Press, Florida.
Cullis, B.R. and Gleeson, A.C. (1991). Spatial analysis of
field experiments - an extension to two dimensions.
Biometrics, 47, 1449-1460.
Fisher, R.A. (1990). Statistical Methods, Experimental Design
and Scientific Inference. J.H. Bennett (Ed.), Oxford
University Press, Oxford.
Gilmour, A.R., Cullis, B.R. and Verbyla, A.P. (1997).
Accounting for natural and extraneous variation in the
analysis of field experiments. J. Ag. Biol. Environ.
Statist., 2, 269-293.
Gomez, K.A. and Gomez, A.A. (1984). Statistical Procedures
for Agricultural Research, (2nd ed). John Wiley and
Sons, New York.
Grondona, M.O., Crossa, J., Fox, P.N. and Pfeiffer, W.H.
(1996). Analysis of variety yield trials using two-
dimensional separable ARIMA processes. Biometrics,
52, 763-770.
Hinkelmann, K. and Kempthorne, O. (2007). Design and
analysis of experiments. In: Introduction to Experimental
Design. John Wiley and Sons, New York.
Malhotra, R.S., Singh, M. and Erskine, W. (2004).
Application of spatial variability models in enhancing
precision and efficiency of selection in chickpea trials.
J. Ind. Soc. Agril. Statist., 57, 71-83.
Payne, R.W., Harding, S.A., Murray, D.A., Soutar, D.M.,
Baird, D.B., Glaser, A.I., Channing, I.C., Welham, S.J.,
Gilmour, A.R., Thompson, R. and Webster, R. (2009).
The Guide to GenStat Release 12, Part 2: Statistics. VSN
International, Hemel Hempstead.
Singh, M., Malhotra, R.S., Ceccarelli, S., Sarker, A., Grando,
S. and Erskine, W. (2003). Spatial variability models to
improve dry land field trials. J. Exp. Agric., 39, 151-160.
Sarker, A., Singh, M. and Erskine, W. (2001). Efficiency of
spatial methods in yield trials in lentils (Lens culinaris ssp.
culinaris). J. Agric. Sci., 137, 427-438.
Wilkinson, G.N., Eckert, S.R., Hancock, T.W. and Mayo, O.
(1983). Nearest-neighbour (NN) analysis of field
experiments (with Discussion). J. Roy. Statist. Soc., B45,
151-211.
Levene, H. (1960). Robust tests for equality of variances. In:
Contributions to Probability and Statistics: Essays in
Honor of Harold Hotelling, I. Olkin et al. (eds.),
Stanford University Press, CA, pp. 278-292.
Everitt, B.S., Landau, S. and Leese, M. (2001). Cluster
Analysis. Edward Arnold, London.
Wolfinger, R.D. (1996). Heterogeneous variance-covariance
structures for repeated measures. J. Ag. Biol. Environ.
Statist., 1, 205-230.