
Journal of Statistical Planning and Inference 139 (2009) 3–15
Contents lists available at ScienceDirect
Journal of Statistical Planning and Inference
journal homepage: www.elsevier.com/locate/jspi
Robust designs for misspecified logistic models
Adeniyi J. Adewalea, Douglas P. Wiensb,∗
aMerck Research Laboratories, North Wales, Pennsylvania 19454, United States
bDepartment of Mathematical and Statistical Sciences, University of Alberta, Edmonton, Alberta, Canada T6G 2G1
A R T I C L E  I N F O        A B S T R A C T
Available online 24 May 2008
MSC:
primary 62K05; 62F35
secondary 62J05
Keywords:
Fisher information
Logistic regression
Linear predictor
Monte Carlo sample
Polynomial
Random walk
Simulated annealing
We develop criteria that generate robust designs and use such criteria for the construction of
designs that insure against possible misspecifications in logistic regression models. The design
criteria we propose are different from the classical in that we do not focus on sampling error
alone. Instead we use design criteria that account as well for error due to bias engendered
by the model misspecification. Our robust designs optimize the average of a function of the
sampling error and bias error over a specified misspecification neighbourhood. Examples of
robust designs for logistic models are presented, including a case study implementing the
methodologies using beetle mortality data.
© 2008 Elsevier B.V. All rights reserved.
1. Introduction
Experimental designs have been treated extensively in the statistical literature, starting with designs for linear models and
extending to nonlinear models. A large volume of literature is devoted to designs assuming the exact correctness of the relationship between the response variable and the design (explanatory) variables. Box and Draper (1959) added another dimension to the theory by investigating the impact of model misspecification. Following the work of Box and Draper, the literature has since been replete with regression designs which are robust against violations of various model assumptions: linearity of the response, independence and homoscedasticity of the errors, etc. Authors who have considered designs with an eye on the approximate nature of the assumed linear models include Marcus and Sacks (1976), Li and Notz (1982), Wiens (1992) and Wiens and Zhou (1999), to mention but a few.
For nonlinear designs, Fedorov (1972), Ford and Silvey (1980), Chaloner and Larntz (1989) and Chaudhuri and Mykland (1993)
have explored the construction of optimal designs while assuming that the nonlinear model of interest is correctly specified.
Still others have investigated designs for generalized linear models, a class of possibly nonlinear models in which the response
follows a distribution from the exponential family such as the normal, binomial, Poisson or gamma (McCullagh and Nelder,
1989). The expository article (Ford et al., 1989) hinted that in the context of nonlinear models, as in the case of linear models, the
misspecification of the model itself is of serious concern. They asserted that “indeed, if the model is seriously in doubt, the forms
of design that we have considered may be completely inappropriate.” Sinha and Wiens (2002) have explored some designs for
nonlinear models with due consideration for the approximate nature of the assumed model. In this work we consider designs for
misspecified logistic regression models.
For the logistic model, the mean response E(Y) = μ depends on the parameters, β, and the vector of explanatory variables, x, through the nonlinear function μ = e^η/(1 + e^η), where η = z^T(x)β. The function η is termed the linear predictor, with regressors z_1(x), ..., z_p(x) depending on the q-dimensional independent variable x. The variance of the response, written var(Y|x), is a
∗ Corresponding author. Tel.: +1 780 492 4406; fax: +1 780 492 6826.
E-mail addresses: adeniyiadewale@hotmail.com (A.J. Adewale), doug.wiens@ualberta.ca (D.P. Wiens).
0378-3758/$ - see front matter © 2008 Elsevier B.V. All rights reserved.
doi:10.1016/j.jspi.2008.05.022
nonlinear function of the linear predictor. Abdelbasit and Plackett (1983), Minkin (1987), Ford et al. (1992), Chaudhuri and
Mykland (1993), Burridge and Sebastiani (1994), Atkinson and Haines (1996) and King and Wong (2000) have investigated
designs for binary data, and in particular for logistic regression. As illustrated in these papers, the general approach to optimal
design is to seek a design that optimizes certain functions of the information matrix of the model parameters. The information
matrix for β from a design consisting of the points x_1, ..., x_n is given by

Σ_{i=1}^n w(x_i, β) z(x_i) z^T(x_i) = Z^T W Z,   (1)

where Z = (z(x_1), z(x_2), ..., z(x_n))^T and

W = diag(w(x_1, β), w(x_2, β), ..., w(x_n, β))

for weights w(x_i, β) = (dμ/dη_i)²/var(Y|x_i). Thus, as with nonlinear experiments, the information matrix depends on the unknown parameters β. Designing an experiment for the estimation of these parameters would then seem to require that these parameters be known. The following are some of the approaches that have been explored in the literature for handling the dependency of the information matrix on β.
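The dependence of (1) on the unknown parameters can be made concrete in a few lines of code. The sketch below, with invented function and variable names, evaluates Z^T W Z for the canonical logit link, where the weight reduces to w_i = μ_i(1 − μ_i):

```python
import numpy as np

def logistic_information(X, beta):
    """Fisher information Z^T W Z for a logistic model, as in (1).

    X    : n x p matrix whose rows are the regressors z^T(x_i).
    beta : p-vector of parameter values.
    Illustrative sketch; the names are ours, not from the paper.
    """
    eta = X @ beta                    # linear predictor eta_i = z^T(x_i) beta
    mu = 1.0 / (1.0 + np.exp(-eta))   # logistic mean mu = e^eta / (1 + e^eta)
    w = mu * (1.0 - mu)               # w_i = (dmu/deta)^2 / var(Y|x_i) = mu_i(1 - mu_i)
    return X.T @ (w[:, None] * X)     # Z^T W Z without forming diag(W) explicitly

# The information depends on the unknown beta, e.g. with beta = (1, 3)^T:
Z = np.column_stack([np.ones(5), np.linspace(-1.0, 1.0, 5)])
I0 = logistic_information(Z, np.array([1.0, 3.0]))
```

At β = 0 every weight equals 1/4, so the information reduces to (1/4) Z^T Z; away from zero the weights, and hence the optimal design, shift with β.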
(1) Locally optimal designs: A traditional approach in designing a nonlinear experiment is to aim for maximum efficiency at a
best guess (initial estimate) of the parameter values (Chernoff, 1953). Designs that are optimal for given parameter values
are dubbed locally optimal designs. These designs may be stable over a range of parameter values. However, if unstable, a
design which is optimal for a best guess may not be efficient for parameter values in even a small neighbourhood of this
guess.
(2) Bayesian optimal designs: A natural generalization of locally optimal designs is to use a prior distribution on the unknown
parameters rather than a single guess. The approach which assumes such a prior and incorporates this distribution into the
appropriate design criteria is termed Bayesian optimal design—see Chaloner and Larntz (1989) and Dette and Wong (1996).
(3) Minimax optimal designs: Rather than assume a prior distribution, this approach assumes a range of plausible values for the
parameters. The minimax optimal design is the design with the least loss when the parameters take the worst possible value
within their respective ranges. These least favourable parameter values are those that maximize the loss (King and Wong,
2000; Dette et al., 2003).
(4) Sequential designs: In sequential designs, the experiment is done in stages. Parameter estimates from a previous stage are
used as initial estimates in the current stage. The process continues until optimal designs are obtained (Abdelbasit and
Plackett, 1983; Sinha and Wiens, 2002).
Suppose an experimenter is faced with a set S = {x_i}_{i=1}^N of possible design points from which he is interested in choosing n, not necessarily distinct, points at which to observe a binary response Y. The experimenter makes n_i ≥ 0 observations at x_i such that Σ_{i=1}^N n_i = n. The design problem is to choose n_1, ..., n_N in an optimal manner. The objective then is to choose a probability distribution {p_i}_{i=1}^N, with p_i = n_i/n, on the design space S. A commonality in the work of the authors who have considered logistic design is the salient assumption that the assumed model form is exactly correct. In this work, we propose the construction of robust designs for logistic models with due consideration for possible misspecification in the assumed form of the systematic component, the linear predictor. The linear predictor could be said to be misspecified when it does not reflect the influence of the covariates correctly, possibly due to omitted covariates or to omission of some transformation of existing covariates in the model. In this section we formalize our notion of model misspecification.

We suppose that the experimenter fits a logistic model with the mean response

μ_i = μ(η_i),   i = 1, ..., n,   (2)

for η_i = z^T(x_i)β_0, when in fact the true mean response is represented by

μ_{T,i} = μ(η_i + f(x_i)).   (3)

The target parameter β_0 is defined by

β_0 = arg min_β (1/N) Σ_{i=1}^N {E(Y|x_i) − μ(z^T(x_i)β)}².

Thus the target parameter is that which guarantees the least sum of squares of discrepancies, over all points in the design space, between the assumed mean response and the true mean response. The contamination function f(x) may or may not be known. It would be known, for example, when an experimenter decides to fit the more parsimonious model (2) despite the knowledge of a more appropriate model (3) with a specified f(x). For instance, the simplified model might be required if the number of support
points is not sufficient to handle a more complicated but more appropriate model. Knowing that the parsimonious model might
result in an inferior analysis, the experimenter may seek a design that remedies the anticipated model inadequacy.
The contamination function would be unknown in a situation where the experimenter is aware of the possible uncertainties
in the assumed model form and might have clues about the properties of the possible misspecification, without knowing its exact
structure. When f(x) is unknown, some knowledge about its properties or conditions it satisfies would be required to construct
any appropriate design. This is so because no single design which takes a finite number of observations can protect against all
possible forms of bias. Thus, we must impose some conditions on the contamination function when its precise form is unknown.
To bound the bias of an estimator μ̂, we assume that

(1/N) Σ_{i=1}^N f²(x_i) ≤ τ²,   (4)

with τ² = O(n⁻¹). This latter requirement is analogous to the notion of contiguity in the asymptotic theory of hypothesis testing, and is justified in the same manner. The choice of τ is discussed following Theorem 3 in the next section. In order to ensure identifiability of the model parameters β and the contamination function f(x) we require that the vector of regressors and the contamination be orthogonal. That is,

(1/N) Σ_{i=1}^N z(x_i) f(x_i) = 0.   (5)

Let F denote the class of contamination functions f(x) satisfying (4) and (5).
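A member of F can be constructed numerically by projecting any candidate contamination onto the orthogonal complement of the regressor span, then rescaling to meet the bound. The following sketch (grid, candidate shape and bound all invented for illustration) does this and verifies (4) and (5):

```python
import numpy as np

# Build a contamination f on N design points satisfying the orthogonality
# condition (5), then check the bound (4). All names are illustrative.
N = 40
x = np.linspace(-1.0, 1.0, N)
Z = np.column_stack([np.ones(N), x])      # regressors z(x) = (1, x)^T

raw = x**2                                 # candidate contamination shape
# Project out the span of the regressors, so (1/N) sum_i z(x_i) f(x_i) = 0:
coef, *_ = np.linalg.lstsq(Z, raw, rcond=None)
f = raw - Z @ coef

tau = 0.1                                  # bound in condition (4)
f = f * tau / np.sqrt(np.mean(f**2))       # scale so that (1/N) sum f^2 = tau^2

orth = Z.T @ f / N                         # condition (5): should be ~ 0
bound = np.mean(f**2)                      # condition (4): equals tau^2
```

The projection step is exactly why (5) makes β_0 and f identifiable: any component of f lying in the span of the regressors would be absorbed into β_0.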
2. Loss functions: estimated and averaged mean squared errors of prediction
The basis for the construction of classical designs for logistic regression models has typically been the minimization of (a function of) the inverse of Fisher's information matrix (1); see Atkinson and Haines (1996). However, in the face of model misspecification the asymptotic covariance, cov(β̂), of the maximum likelihood estimator of the model parameters no longer equals the inverse of the Fisher information; see White (1982) and also Fahrmeir (1990), who discusses the asymptotic properties of MLEs under a misspecified likelihood.

Suppose that data {x_i, y_i} are given, where the x_i are the design points chosen from S with n_i observations at x_i such that Σ_{i=1}^N n_i = n, and y_i is the proportion of successes at location x_i. The asymptotic bias and covariance of the MLE β̂ are given in Theorem 1; see the Appendix for details of this and other proofs. The expressions for the asymptotic bias and covariance of the MLE β̂ are used in the derivation of the loss function in Corollary 2.
Theorem 1. Define

w_i = dμ_i/dη_i = μ_i(1 − μ_i) = (1/4) sech²(z^T(x_i)β_0/2),   (6)

and let Z be the N × p matrix with rows z^T(x_i). Recall (2) and (3); let c and c_T be the N × 1 vectors with elements μ_i and μ_{T,i}, respectively. Let P, W and W_T be the N × N diagonal matrices with diagonal elements n_i/n, w_i and w_{T,i} = μ_{T,i}(1 − μ_{T,i}), respectively. Finally, define b = Z^T P(c_T − c), H_n = Z^T P W Z, H̃_n = Z^T P W_T Z. The asymptotic bias and asymptotic covariance matrix of the maximum likelihood estimator β̂ of the model parameter vector β from the misspecified model are

bias(β̂) = E(β̂ − β_0) = H_n^{-1} b + o(n^{-1/2}),
cov(√n(β̂ − β_0)) = H_n^{-1} H̃_n H_n^{-1} + o(1),

respectively.
Since the typical focus of logistic designs is prediction, we take as loss function the normalized average mean squared error (AMSE) I of the response prediction μ(η̂_i), with η̂_i = z^T(x_i)β̂. This is given by

I = (n/N) Σ_{i=1}^N E[{μ(η̂_i) − μ(η_i + f(x_i))}²].
Corollary 2. The AMSE has the asymptotic approximation I = L_I(P, f) + o(1), where

L_I(P, f) = (1/N){tr[W Z H_n^{-1} H̃_n H_n^{-1} Z^T W] + n ‖W(Z H_n^{-1} b − f)‖²}   (7)

for f = (f(x_1), ..., f(x_N))^T.
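For a known contamination vector, (7) is directly computable from the quantities of Theorem 1. The sketch below assembles the variance and bias terms under our reading of the notation; the function and variable names are invented:

```python
import numpy as np

def amse_loss(Z, p, beta0, f, n):
    """Evaluate L_I(P, f) as in (7); an illustrative sketch, not the
    authors' code.

    Z     : N x p regressor matrix, rows z^T(x_i).
    p     : design weights p_i = n_i/n (length N, summing to 1).
    beta0 : target parameter vector.
    f     : contamination vector (f(x_1), ..., f(x_N))^T.
    n     : total number of observations.
    """
    N = Z.shape[0]
    eta = Z @ beta0
    mu = 1.0 / (1.0 + np.exp(-eta))            # assumed means, model (2)
    mu_T = 1.0 / (1.0 + np.exp(-(eta + f)))    # true means, model (3)
    w = mu * (1.0 - mu)                        # diagonal of W
    w_T = mu_T * (1.0 - mu_T)                  # diagonal of W_T
    P = np.diag(p)
    Hn = Z.T @ P @ np.diag(w) @ Z              # H_n  = Z^T P W Z
    Hn_t = Z.T @ P @ np.diag(w_T) @ Z          # H~_n = Z^T P W_T Z
    b = Z.T @ P @ (mu_T - mu)                  # b = Z^T P (c_T - c)
    Hi = np.linalg.inv(Hn)
    var_term = np.trace(np.diag(w) @ Z @ Hi @ Hn_t @ Hi @ Z.T @ np.diag(w))
    bias_term = n * np.sum((w * (Z @ Hi @ b - f))**2)
    return (var_term + bias_term) / N

# Uniform design, no contamination: the bias term vanishes since b = 0.
Zs = np.column_stack([np.ones(10), np.linspace(-1.0, 1.0, 10)])
loss_uniform = amse_loss(Zs, np.full(10, 0.1), np.array([1.0, 3.0]),
                         np.zeros(10), 200)
```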
By using the expressions for asymptotic bias and covariance given in Theorem 1, Corollary 2 expresses the AMSE as an explicit function of the design matrix Z and contamination vector f. The first term in the loss function L_I corresponds to the average variance of the predictions, and it depends on the contamination function f(x) through the matrix H̃_n. The second term in the expression for L_I is the average squared bias of the predictions, which depends on the contamination f(x) through the contamination vector f and implicitly through the vector b. Thus a design cannot minimize (7) directly without certain assumptions about the contamination f(x).

Fang and Wiens (2000) constructed integer-valued designs for linear models, in the case of an unknown f, using a minimax approach. Their minimax criterion minimizes the maximum value of the loss function over f. They solve the design problem by minimizing the loss when the misspecification is the worst possible in the neighbourhood of interest.
Here, we take one of two approaches, depending on whether or not there are initial data. If we have initial data we represent the discrepancy between the true response and the assumed response, at a sampled location x, by

d(x) = μ(z^T(x)β_0 + f(x)) − μ(z^T(x)β_0),

and estimate this by the residual d̂(x) = y(x) − μ(z^T(x)β̂). A first order approximation is d(x) ≈ (dμ/dη) f(x), leading to f̂(x) = d̂(x)/(dμ/dη|_{β=β̂}). We smooth this estimated contamination over the entire design space; see Example 3 of Section 5 for an illustration. The resulting estimate f̂, together with β̂, is then substituted into the terms in (7), and we compute a design minimizing L_I(P, f̂) using the techniques outlined in Section 3.
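The residual-based estimate f̂ can be sketched as follows. We use simulated proportions, a stand-in value for the fitted MLE, and a simple Gaussian-kernel smoother in place of the loess smooth of Example 3; every name and constant here is our own invention for illustration:

```python
import numpy as np

# Sketch of f^(x) = d^(x) / (dmu/deta at beta^), then smoothing over S.
rng = np.random.default_rng(1)
N, m = 20, 50                                  # design points, obs per point
x = np.linspace(-1.0, 1.0, N)
Z = np.column_stack([np.ones(N), x])
eta_true = 1.0 + 3.0*x + 1.5*(x**2 - np.mean(x**2))   # contaminated predictor
y = rng.binomial(m, 1.0/(1.0 + np.exp(-eta_true))) / m  # observed proportions

beta_hat = np.array([1.0, 3.0])                # stand-in for the fitted MLE
mu_hat = 1.0/(1.0 + np.exp(-(Z @ beta_hat)))
d_hat = y - mu_hat                             # residuals d^(x_i)
f_hat = d_hat / (mu_hat * (1.0 - mu_hat))      # divide by dmu/deta at beta^

def smooth(x0, h=0.3):                         # Nadaraya-Watson smoother
    k = np.exp(-0.5 * ((x - x0)/h)**2)
    return np.sum(k * f_hat) / np.sum(k)

f_smooth = np.array([smooth(xi) for xi in x])  # smoothed contamination on S
```

The smoothed values f_smooth would then be plugged into (7) in place of f.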
If there are no initial data we propose to instead average L_I over F defined by (4) and (5). Our optimal design minimizes this average value. This criterion is in the spirit of Läuter (1974, 1976). Läuter's criterion optimizes the weighted average of the loss of a finite set of plausible models. Here we are instead faced with an infinite set of models indexed by f ∈ F.

To carry out the averaging we begin as in Fang and Wiens (2000), with the singular value decomposition

Z = U_{N×p} K_{p×p} V^T_{p×p},   (8)

with U^T U = V^T V = I_p and K diagonal and invertible. We augment U by Ũ_{N×(N−p)} such that [U ⋮ Ũ]_{N×N} is orthogonal. Then by (4) and (5), we have that there is an (N − p) × 1 vector t, with ‖t‖ ≤ 1, satisfying

f(= f_t) = τ√N Ũt.   (9)

The average loss is taken to be the expected value of (7), as a function of t, with respect to the uniform measure on the unit sphere and its interior in R^{N−p}. This measure has density p(t) = (1/ω_{N,p}) I(‖t‖ ≤ 1), where ω_{N,p} = π^{(N−p)/2}/Γ((N − p)/2 + 1) is the volume of the unit sphere. Theorem 3 handles the averaging of L_I. The importance of this theorem is in its elimination of the dependency of our design criterion on the unknown contamination function.
Theorem 3. The average loss over the misspecification neighbourhood F is, apart from terms which are o(1), given by

L_{I,ave}(P, ρ) = ∫ L_I(P, f_t) p(t) dt
              = (1/N) tr[(U^T P W U)^{-1}(U^T W² U)] + (ρ/(N − p + 2)) tr[W(R − I_N)(R^T − I_N)W],   (10)

where ρ = nτ² and R_{N×N} = U(U^T P W U)^{-1} U^T P W. For numerical work it is more efficient to compute the second trace in (10) as

tr[W(R − I_N)(R^T − I_N)W] = Σ_{i=1}^N w_i² ‖r̃_i‖²,

where r̃_i^T is the i-th row of R − I_N.
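Criterion (10), including the economical row-sum form of the second trace, can be evaluated as in the following sketch (function and variable names are ours):

```python
import numpy as np

def averaged_loss(Z, p, beta0, rho):
    """Evaluate L_{I,ave}(P, rho) as in (10); an illustrative sketch."""
    N, q = Z.shape
    mu = 1.0 / (1.0 + np.exp(-(Z @ beta0)))
    w = mu * (1.0 - mu)                             # diagonal of W
    U, _, _ = np.linalg.svd(Z, full_matrices=False)  # Z = U K V^T, U is N x p
    pw = p * w                                       # diagonal of P W
    M = np.linalg.inv(U.T @ (pw[:, None] * U))       # (U^T P W U)^{-1}
    var_term = np.trace(M @ (U.T @ (w[:, None]**2 * U))) / N
    R = (U @ M @ U.T) * pw[None, :]                  # R = U (U^T P W U)^{-1} U^T P W
    Rt = R - np.eye(N)
    bias_term = rho / (N - q + 2) * np.sum(w**2 * np.sum(Rt**2, axis=1))
    return var_term + bias_term

# The bias component grows with rho, the experimenter's distrust of the model:
Zs = np.column_stack([np.ones(10), np.linspace(-1.0, 1.0, 10)])
ps = np.full(10, 0.1)
l0 = averaged_loss(Zs, ps, np.array([1.0, 3.0]), rho=0.0)
l10 = averaged_loss(Zs, ps, np.array([1.0, 3.0]), rho=10.0)
```

At rho = 0 only the variance term survives, matching the classical I-optimal criterion discussed next.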
The dependency of the design criterion on the unknown contamination has now been represented by a design parameter ρ, which can be chosen by the experimenter. This parameter can be interpreted as a measure of the departure of the true model from the fitted model. In other words, it is a measure of the experimenter's lack of confidence in the validity of the model that he fits. If he believes that this assumed model is exactly correct, he chooses ρ = 0, corresponding to the classical I-optimal design. On the other hand, if the experimenter believes that the assumed model is highly uncertain, he chooses a large value of ρ for his design. Designs corresponding to a large value of ρ are dominated by the bias component of the loss.

Our design criterion (10) remains dependent on the model parameter vector β_0, as is the case in general nonlinear design problems, through the weights, as at (6). In the examples of the next section we handle this dependency by either taking a guess (locally optimal designs) or assuming a prior distribution, say π(β_0), on β_0 (Bayesian designs). The loss function L_{I,ave} is modified to ∫ L_{I,ave}(β) π(β) dβ in the case of a Bayesian construction.
3. Designs—algorithm and examples
3.1. Simulated annealing
We consider problems with polynomial predictors, viz. η = z^T(x)β with z(x) = (1, x, x², ..., x^{p−1})^T. We take equally spaced design points {x_i}_{i=1}^N in the interval S. Our design minimizes the relevant loss function through the matrix P = diag(n_1/n, ..., n_N/n). This is a nonlinear integer optimization problem for which there is no analytic solution, and for which we employ simulated annealing to search for the optimal design.

The simulated annealing algorithm is a direct-search random walk optimization algorithm which has been quite successful at finding global extrema of nonsmooth functions and/or functions with many local extrema. The algorithm consists of three steps, each of which must be well adapted to the problem of interest for the algorithm to be successful. The first step is a specification of the initial state of the process. In this step an initial design has to be specified, say P_0. The second is a specification of a scheme by which a new design P_1 is chosen from the optimization space. The last step is a prescription of the basis of acceptance or rejection: acceptance with probability 1 if L_{I,ave}(P_1) < L_{I,ave}(P_0), otherwise acceptance with probability exp{−(L_{I,ave}(P_1) − L_{I,ave}(P_0))/T}, where T is a tuning parameter. The tuning parameter is usually decreased as the iterations proceed. After a large number of iterations alternating between the second and third steps, the loss function is expected to converge to its (near) minimum value. Simulated annealing has been used for design problems by, among others, Meyer and Nachtsheim (1988), Fang and Wiens (2000) and Adewale and Wiens (2006).

A very simple and general approach that we considered for choosing the initial design is to randomly select p points from {x_i}_{i=1}^N and randomly allocate the observations to these points such that the total number of observations is n. Fang and Wiens (2000) used a different approach which assumes that one of (n, N) is a multiple of the other. For any (n, N) combination they chose the initial design to be as uniform as possible. We applied this approach as well but found that the two approaches are equally efficient. For generating a new design we adopted the perturbation scheme of Fang and Wiens (2000). The tuning parameter in the third step was chosen initially such that the acceptance rate is between 70% and 95%. We decrease T by a factor of .95 after every 20 iterations. In the examples below we run the algorithm several times with varying tuning parameter specifications and reduction rates, in order to satisfy ourselves that the resulting design has the least loss possible under the relevant circumstances of each example. In Fig. 1 we present the simulated annealing trajectory for one of the cases presented in Example 1. It took 83 s for the algorithm to complete the preset maximum number of iterations (12,000 for this case), and the minimum loss was attained just before the 9000th iteration.
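The three annealing steps can be sketched as follows. The cooling schedule (T reduced by a factor of .95 every 20 iterations) follows the text; the perturbation move (shifting a single observation between design points), the stand-in D-type loss, and all other constants are our own simplifications, not the authors' scheme:

```python
import numpy as np

# Minimal annealing sketch for the integer design problem: move one
# observation between design points and accept by the Metropolis rule.
rng = np.random.default_rng(0)
N, n = 40, 200
x = np.linspace(-1.0, 1.0, N)
Z = np.column_stack([np.ones(N), x])
beta0 = np.array([1.0, 3.0])
mu = 1.0 / (1.0 + np.exp(-(Z @ beta0)))
w = mu * (1.0 - mu)

def loss(counts):
    info = Z.T @ ((counts / n * w)[:, None] * Z)   # Z^T P W Z
    sign, logdet = np.linalg.slogdet(info)
    return -logdet if sign > 0 else np.inf         # stand-in D-type criterion

counts = rng.multinomial(n, np.full(N, 1.0 / N))   # step 1: initial design
cur, T = loss(counts), 1.0
for it in range(2000):
    i, j = rng.integers(N, size=2)                 # step 2: perturb the design
    if counts[i] == 0 or i == j:
        continue
    cand = counts.copy()
    cand[i] -= 1; cand[j] += 1
    new = loss(cand)
    # step 3: Metropolis acceptance with tuning parameter T
    if new < cur or rng.random() < np.exp(-(new - cur) / T):
        counts, cur = cand, new
    if it % 20 == 19:
        T *= 0.95                                  # cooling schedule
```

Each move conserves the total sample size n, so the final counts are a valid integer design.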
3.2. Examples
Example 1 (No contamination). As a benchmark we first consider the logistic regression model with a single predictor: p = 2, z(x) = (1, x)^T, x ∈ S = [−1, 1], β = (1, 3)^T, and no contamination: ρ = 0. We initially took n = 20, N = 40 and considered designs minimizing L_I. The annealing algorithm converged to the design placing 10 of the 20 observations at each of the points −.744 and .128. This design is therefore the classical I-optimal design minimizing the integrated variance of the predictions over S. There is evidently no previous theory that applies to this case. However, using a model that is a reparameterization of ours, and a continuous design space [−1, 1], King and Wong (2000) showed the locally D-optimal design to be the design that is equally supported at −.848 and .181. For the sake of comparison, we sought an equivalent design using our finite design space and the algorithm described above. The resulting design places 10 of 20 observations at each of −.846 and .180. Thus, our algorithm attains the closest approximation to King and Wong's solution, in that the points −.846 and .180 are nearest, in our design space, to −.848 and .181. Unlike designs for linear models, the optimal designs in this case do not necessarily place observations at the extreme points of the design space. This phenomenon is due to the curvature introduced by the link function and the resulting nonlinear relationship between the mean response and x.
Fig. 1. Simulated annealing trajectory for logistic design with η = 1 + 3x, x ∈ S = [−1, 1], ρ = 0 and (N = 40, n = 200).
Fig. 2. Locally optimal designs minimizing L_{I,ave} when ρ = 0 (no contamination) with (a) N = 20, n = 20 (loss = 0.2496); (b) N = 20, n = 200 (loss = 0.2491); (c) N = 40, n = 20 (loss = 0.2527); (d) N = 40, n = 200 (loss = 0.2524).
Table 1
Comparing restricted designs with unrestricted designs; ρ = 0

(N, n)      Restricted design (two-point)          Unrestricted design^a
            Design points            Loss          Design points                                Loss
(20, 20)    −.789(9), .053(11)       .250          −.789(7), −.684(3), .0526(2), .158(8)        .250
(20, 200)   −.789(94), .053(106)     .250          −.789(95), .0526(63), .158(42)               .249
(40, 20)    −.744(10), .128(10)      .2527         −.744(10), .128(10)                          .2527
(40, 200)   −.744(97), .128(103)     .2525         −.795(49), −.744(47), .0769(39), .128(65)    .2524

^a Number of observations in parentheses.
Our numerical results further revealed that the designs depend on the number of points in the design space and the number of observations the experimenter is willing to take. For this "no-contamination" case, we investigated designs for various combinations of N and n. Some of these designs are presented in Fig. 2. The number of distinct design points varies from 2 to 4. We found this somewhat surprising, in light of the fact that all D-optimum designs for the two-parameter logistic model in the literature are two-point designs. Presumably this is explained through our use of a finite design space, and/or our use of average loss rather than that based on the determinant of the information matrix.

To check that this phenomenon was not merely an artefact due to a lack of convergence, we modified our algorithm to obtain "restricted" designs, restricted to two support points only. The results for the same values of N and n as in Fig. 2 are presented in Table 1. The loss for the unrestricted design is less than or equal to that for the corresponding restricted design in all cases considered.

In the examples that follow we limit discussion to the case N = 40, n = 200.
Example 2 (Example 1 continued). In this example, which we include largely for illustrative purposes, the form of the contamination is known. Suppose that the experimenter anticipates fitting a simple logistic model, while wishing protection against a range of logistic models with quadratic predictor: η(x) = z^T(x)β + f(x), where z^T(x) and β are as in Example 1, and

f(x) = θ_2(x² − m_2)/√(m_4 − m_2²),   for m_k = N^{-1} Σ_i x_i^k (= 0 if k is odd).

The contaminant f(x) is an omitted quadratic term, translated and scaled to ensure the orthogonality condition (5); (4) becomes θ_2² ≤ τ². We obtained optimal designs for various values of the quadratic coefficient θ_2. The resulting designs and the corresponding values of the loss function are presented in Table 2. In the range of values of θ_2 considered, we found that the number of distinct points varied from 3 to 6. The spread of the design over the design space tended to increase as the magnitude of the omitted quadratic term increases. We computed the premium paid for robustness and the gain due to robustness for each design presented as

Premium = (L_I(P_opt, f = 0)/L_I(P_classical, f = 0) − 1) × 100%   (11)

and

Gain = (1 − L_I(P_opt, f)/L_I(P_classical, f)) × 100%.   (12)
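Measures (11) and (12) are simple ratios of losses evaluated under the two designs. A minimal sketch, with invented loss values purely for illustration:

```python
# Premium (11) and gain (12) from the losses of the robust and classical
# designs; the numeric inputs below are invented for illustration.
def premium(loss_robust_f0, loss_classical_f0):
    # percentage increase in loss when the assumed model is in fact correct
    return (loss_robust_f0 / loss_classical_f0 - 1.0) * 100.0

def gain(loss_robust_f, loss_classical_f):
    # percentage reduction in loss when the contamination is present
    return (1.0 - loss_robust_f / loss_classical_f) * 100.0

p_ex = premium(0.26, 0.25)   # robust design costs 4% when f = 0
g_ex = gain(0.50, 0.625)     # and saves 20% under contamination
```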
Table 2
Designs for the simple logistic model when the true model has a quadratic term

θ_2    Design points (number of observations)                          L_I(P, f)   Premium   Gain
−10    −1(42), −.180(42), −.128(96), −.077(12), −.026(2), .077(6)      3.500       35.0%     34.8%
−3     −1(48), −.282(26), −.231(64), .282(62)                          .5020       10.9%     17.0%
−1     −.949(42), −.590(34), −.539(29), .128(95)                       .2756       2.1%      .5%
0      −.795(49), −.744(47), .077(39), .128(65)                        .2524       0         0
1      −.641(100), .128(22), .180(78)                                  .3080       1.9%      4.0%
3      −.692(57), −.641(29), −.077(39), −.026(44), .795(31)            .6073       11.9%     19.9%
10     −1(11), −.590(51), −.539(27), −.231(30), −.180(40), .949(41)    3.679       39.0%     42.9%
Table 3
Experimental design and response values

Design point           −1    −7/9   −5/9   −3/9   −1/9   1/9   3/9   5/9   7/9   1
No. of observations    20    20     20     20     20     20    20    20    20    20
No. of successes       6     7      13     17     20     20    18    18    19    20
The gain measure is the percentage reduction in loss due to the use of a robust design as opposed to a (non-robust) classical design which assumes the fitted model to be exactly correct. The premium measure is the percentage increase in loss as a result of not using the classical design if in actual fact the assumed model is correct. The application of the premium and/or gain measure depends on the amount of confidence the experimenter has in his knowledge of the true model. In this example, since the assumption is that the experimenter knows that the model with a linear predictor involving the quadratic term is the more appropriate model, the relevant measure would be the gain. Nevertheless, both measures are reported in Table 2. The value of a design from our robust procedure increases with increasing magnitude of the quadratic parameter. On the other hand, the experimenter has to be aware of the increasing premium when his knowledge of the true model is not accurate. The premium paid for robustness also increases with the magnitude of the quadratic parameter.

Of course this example is artificial, assuming as it does that the true form of the predictor is known to be quadratic, with parameter θ_2 = 3, say. If one did indeed possess this knowledge, then the classically optimal, i.e., variance minimizing, design would be −1(49), −.282(2), −.231(91), .436(9), .487(49). The premium figures in Table 2 would rise appreciably, to typical values of 100% or more, since the robust design would be protecting against bias known not to be present.
Example 3 (Designing when there are initial data to estimate contamination). Table 3 shows simulated data (“# of successes”) from
a logistic regression model with the predictor ?(x) = 1 + 3x + f(x), the model of the previous example; the quadratic parameter
was ?2=3. The data were simulated using a uniform design over equally spaced points in [−1,1]. Having simulated the data, we
suppose the contamination function f(x) to be unknown. We proceed using the procedure described in Section 2 for estimation
and eventual smoothing of the contamination. A plot of the estimated contamination with its loess smoothˆf(x) over the design
space is presented in Fig. 3.
We plugged the smoothed contamination values into the loss function (7), and used simulated annealing to obtain the design.
Our design places 34, 82, and 84 of the 200 observations at −.641,−.590, and .180, respectively. For this design the premium for
robustness is 5.0% and the gain is 60.0%. This example indicates that when there are initial data, it is expedient to incorporate the
information from the data into the design procedure. The resulting design can lead to substantial gain at a reduced premium.
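The paper's loss (7) is not reproduced in this excerpt, so the sketch below shows only the generic simulated-annealing search over integer-valued designs, driven for illustration by a toy D-optimality-style criterion (negative log-determinant of the logistic information matrix) rather than the authors' robust loss. The proposal move, cooling schedule, and all names are illustrative assumptions.

```python
import math, random

def annealed_design(loss, N, n, iters=20000, t0=1.0, seed=0):
    """Simulated-annealing search over integer-valued designs.

    A design is a vector of counts (n_1, ..., n_N) summing to n; a proposal
    moves one observation from a random occupied grid point to another."""
    rng = random.Random(seed)
    counts = [n // N] * N
    for i in range(n - N * (n // N)):   # distribute the remainder
        counts[i] += 1
    best = list(counts)
    cur_loss = best_loss = loss(counts)
    for it in range(iters):
        temp = t0 * (1.0 - it / iters) + 1e-9          # linear cooling
        i = rng.choice([j for j in range(N) if counts[j] > 0])
        j = rng.randrange(N)
        if i == j:
            continue
        counts[i] -= 1; counts[j] += 1
        new_loss = loss(counts)
        if new_loss <= cur_loss or rng.random() < math.exp((cur_loss - new_loss) / temp):
            cur_loss = new_loss                        # accept (possibly uphill) move
            if new_loss < best_loss:
                best_loss, best = new_loss, list(counts)
        else:
            counts[i] += 1; counts[j] -= 1             # reject: undo the move
    return best, best_loss

def neg_log_det_info(counts, grid, th0=1.0, th1=3.0):
    """Toy criterion: -log det of the 2x2 logistic information matrix."""
    m00 = m01 = m11 = 0.0
    for c, x in zip(counts, grid):
        p = 1.0 / (1.0 + math.exp(-(th0 + th1 * x)))
        w = c * p * (1.0 - p)
        m00 += w; m01 += w * x; m11 += w * x * x
    det = m00 * m11 - m01 * m01
    return -math.log(det) if det > 0 else float("inf")
```

The same search mechanics apply with any computable design loss substituted for `neg_log_det_info`.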
Example 4 (Unknown contamination). Consider the logistic model with predictor

η(x) = θ0 + θ1x + f(x).  (13)

In this example—as in Example 3—we assume that f is an unknown member of the class F defined by (4) and (5). In Fig. 4 we
exhibit designs minimizing the averaged loss (10) for various values of ρ, θ0 and θ1. We observed a progression of the dispersion
of the design points over the design space with increasing ρ. The pattern of the dispersion is, however, modified by the curvature
indexed by θ0 and θ1 through the nonlinear mean response. For small ρ our robust designs can be described as taking clusters
of observations at neighbouring locations rather than replications at only a few distinct sites; this was noticed for linear models
by Fang and Wiens (2000). However, here there is always a pattern to the clusters of observations to be taken, depending on the
values of the model parameters. Large values of ρ denote large departures from the assumed model, and an extremely large ρ
value corresponds to the all-bias design. Even though the all-bias design is spread over the entire design space, the frequencies
of observations are different, and these frequencies are prescribed by the curvature of the mean response as determined by the
A.J. Adewale, D.P. Wiens / Journal of Statistical Planning and Inference 139 (2009) 315
Fig. 3. Estimated contamination for Example 3, plotted against x over the design space, with its loess smooth superimposed. True (but unknown) form of contamination is quadratic.
Fig. 4. Locally optimal designs (numbers of observations vs. x ∈ [−1,1]) in Example 4: (a)–(d) (θ0, θ1) = (1,1), with losses 0.29, 0.68, 3.98 and 363.19 at ρ = 0, 10, 100 and 10 000, respectively; (e)–(h) (θ0, θ1) = (1,3), losses 0.25, 0.51, 2.72 and 244.75; (i)–(l) (θ0, θ1) = (3,1), losses 0.08, 0.11, 0.39 and 30.15; (m)–(p) (θ0, θ1) = (3,3), losses 0.14, 0.28, 1.42 and 125.88.
Table 4
Design for unknown contamination with θ0 = 1 and θ1 = 3

ρ          L_I,ave(P, ρ)    Premium     Gain
0               .2524          0           0
1               .2809        2.01%       .62%
10              .5090       14.33%      3.06%
100            2.7204       25.87%      7.44%
1000          24.7294       28.16%     11.29%
10 000       244.7545       28.43%     11.97%
Fig. 5. Robust Bayesian optimal designs in Example 5 (numbers of observations vs. x ∈ [−1,1]) with ρ = .25 and parameters θ0 and θ1 having independent uniform priors over (a) [.5,1.5] × [2.5,3.5] (loss = 0.2630), (b) [.5,1.5] × [1,5] (loss = 0.2836), (c) [−1,3] × [2.5,3.5] (loss = 0.2748), (d) [−1,3] × [1,5] (loss = 0.2840).
model parameters. In Table 4 we present the values of the premium paid and the gain in robustness for designs corresponding
to different values of ρ for the particular case of (θ0, θ1) = (1, 3). The gain in robustness, measured by (12), exceeds the premium
paid, as measured by (11), for each design. Increasing robustness, however, comes with an increasing premium; the experimenter
would thus have to choose his level of comfort.
Thus far, the examples we have presented have been locally optimal, hence have assumed good parameter guesses for
unknown model parameters. In the absence of a reliable best guess for model parameters, Sitter (1992) and King and Wong (2000)
considered minimax D-optimality, a procedure which assumes knowledge of a prior range for each of the parameters. We
consider a Bayesian paradigm to be in the same spirit as averaging the contamination function over the specified misspecification
neighbourhood, and take independent uniform prior distributions over the range of each model parameter. Our design criterion
then becomes the expected loss, E[L_I,ave(P, ρ)], with the expectation taken with respect to these priors. The dependence of our
design criterion on the model parameters is through the weights w_i, and we do not have analytic expressions for the resulting
integrals. In the examples that follow we employ number-theoretic methods for the numerical evaluation of multiple integrals, as
discussed in Fang and Wang (1994). This approach is based on generating quasi-random points in the domain of definition of the
integrand, and averaging the values of the loss over the sample of points.
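Fang and Wang's number-theoretic constructions are more refined than this, but the basic device of replacing the prior integral by an average over low-discrepancy points can be sketched with a Halton sequence. The integrand below is a placeholder, not the paper's loss (10); the function names are illustrative.

```python
def halton(i, base):
    """Radical-inverse (van der Corput) value of index i in the given base."""
    f, r = 1.0, 0.0
    while i > 0:
        f /= base
        r += f * (i % base)
        i //= base
    return r

def qmc_expected_loss(loss, lo, hi, n_pts=1024, bases=(2, 3)):
    """Average `loss` over quasi-random points in the box
    [lo[0], hi[0]] x [lo[1], hi[1]], i.e. under independent uniform priors."""
    total = 0.0
    for i in range(1, n_pts + 1):
        point = [l + (h - l) * halton(i, b) for l, h, b in zip(lo, hi, bases)]
        total += loss(point)
    return total / n_pts
```

For a smooth integrand this converges much faster in the number of points than plain Monte Carlo, which is the appeal of the quasi-random approach for repeated loss evaluations inside a design search.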
Example 5 (Robust Bayesian design). In this example we consider the following ranges of parameter values: (a) [.5,1.5] × [2.5,3.5],
(b) [.5,1.5] × [1,5], (c) [−1,3] × [2.5,3.5], (d) [−1,3] × [1,5], all with centre point (1,3) but with coverage areas 1, 4, 4, and 16,
respectively. As described above, the robust design for each range of parameter values is the design that minimizes the expected
average loss with respect to uniform distributions on the specified ranges of parameter values. For each of the designs—see
Fig. 5—we take ρ = .25. We observed an increasing spread over the design space with increasing uncertainty in the model parameters,
as measured by the coverage area of the priors. This is consistent with previous work in optimal Bayesian design—see, for
example, Chaloner and Larntz (1989)—which suggests an increasing number of distinct design points with increasing uncertainty
in the specified prior distributions. Comparing the design plots in panels (b) and (c) of Fig. 5, we see that there is more sensitivity
to uncertainty in the intercept parameter than in the slope parameter.
4. Case study: beetle mortality data
Bliss (1935) reported the numbers of beetles dead after 5 h exposure to gaseous carbon disulphide at various concentrations.
The doses are given in Table 5; to facilitate our discussion we have linearly transformed these to the range [0,1]. Note that the
original design is then very nearly uniform on the eight equally spaced points 0(1/7)1.
Table 5
Beetle mortality data

Dose, x_i (log10 CS2 mg l^−1)    Number of beetles, n_i    Number killed, n_i y_i
1.69                                     59                         6
1.72                                     60                        13
1.75                                     62                        18
1.78                                     56                        28
1.81                                     63                        52
1.84                                     59                        53
1.86                                     62                        61
1.88                                     60                        60
Fig. 6. Prediction designs (numbers of observations vs. x ∈ [0,1]): (a) design when the contamination is estimated from initial data; (b) robust Bayesian prediction design with multivariate normal prior and ρ = 5.
We first fitted the logistic model with the linear predictor η^(1) = θ0^(1) + θ1^(1)x, and obtained the estimates θ̂0^(1) = −2.777
and θ̂1^(1) = 6.621 with the estimated variance–covariance matrix

\Sigma^{(1)} = \begin{pmatrix} .082 & -.144 \\ -.144 & .317 \end{pmatrix}

and deviance = 11.232 (df = 6). The corresponding estimates for the logistic model with the linear predictor η^(2) = θ0^(2) +
θ1^(2)x + θ2^(2)x² are θ̂0^(2) = −2.00, θ̂1^(2) = 1.60, θ̂2^(2) = 5.84 and

\Sigma^{(2)} = \begin{pmatrix} .124 & -.522 & .489 \\ -.522 & 3.252 & -3.690 \\ .489 & -3.690 & 4.665 \end{pmatrix}

with deviance = 3.195 (df = 5). The deviances and a plot (not presented here) of proportions of beetles killed against dose levels
with the estimated proportions from each model superimposed suggest that the model with the quadratic term is a significantly
better fit for these data. Suppose the experimenter is inclined to use the simple logistic fit for future data, for ease of interpretation
and model simplicity, or that the adequacy of the model with the quadratic term is itself in doubt. We proceed by estimating
the contamination and then smoothing over the design space as discussed in Section 2. The resulting design, obtained using the
parameter estimates θ̂0^(1) and θ̂1^(1) as initial guesses, with total number of observations n = 481 over an equally spaced grid of
N = 40 points in [0,1], is presented in panel (a) of Fig. 6. This would be the design of choice if the experimenter were interested in
prediction but contemplated the superiority of the model with the quadratic term. However, the experimenter can ensure robustness
against a broader set of alternatives by taking the contamination to belong to the class F while assuming an initial multivariate
normal prior on the parameter, with mean vector (θ̂0^(1), θ̂1^(1))^T and variance–covariance matrix Σ^(1), and then using the Bayesian
paradigm as in Example 5. The loss function becomes the expected value of (10), with expectation taken with respect to the
multivariate normal prior. The numerical implementation of the expectation is done using a quasi-Monte Carlo sampling approach.
The design plot is given in Fig. 6(b).
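The simple logistic fit reported above can be reproduced, approximately, from Table 5 (approximately because the table lists rounded doses) with a short Fisher-scoring routine. This is a self-contained sketch, not the authors' code.

```python
import math

# Beetle mortality data (Table 5), doses linearly rescaled to [0, 1]
dose = [1.69, 1.72, 1.75, 1.78, 1.81, 1.84, 1.86, 1.88]
n    = [59, 60, 62, 56, 63, 59, 62, 60]
kill = [6, 13, 18, 28, 52, 53, 61, 60]
x = [(d - dose[0]) / (dose[-1] - dose[0]) for d in dose]

def irls_logistic(x, n, kill, iters=25):
    """Fisher scoring for the two-parameter grouped logistic model."""
    t0 = t1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, ni, ki in zip(x, n, kill):
            p = 1.0 / (1.0 + math.exp(-(t0 + t1 * xi)))
            w = ni * p * (1.0 - p)
            r = ki - ni * p                    # score contribution
            g0 += r;   g1 += r * xi
            h00 += w;  h01 += w * xi;  h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        t0 += ( h11 * g0 - h01 * g1) / det     # Newton step via 2x2 inverse
        t1 += (-h01 * g0 + h00 * g1) / det
    return t0, t1

def deviance(x, n, kill, t0, t1):
    """Binomial deviance of the fitted grouped logistic model."""
    dev = 0.0
    for xi, ni, ki in zip(x, n, kill):
        p = 1.0 / (1.0 + math.exp(-(t0 + t1 * xi)))
        if ki > 0:
            dev += 2 * ki * math.log(ki / (ni * p))
        if ki < ni:
            dev += 2 * (ni - ki) * math.log((ni - ki) / (ni * (1 - p)))
    return dev
```

With the rounded doses, the slope is strongly positive and the linear model's deviance on 6 df indicates lack of fit, consistent with the comparison to the quadratic model above.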
5. Conclusions
We have investigated integer-valued designs for logistic regression models, using polynomial predictors as specific examples.
Our designs are robust against misspecification in the predictor. We have addressed both known and unknown contamination.
Previous robustness work done for logistic models has concentrated on the uncertainty of model parameters; in this contribution
we have gone further to investigate specific violations in the form of the assumed linear (in the parameters) predictor.
Designs for a specific alternative, for example quadratic versus linear in the independent variable, are quite different from
those for broad classes of alternatives. The number of distinct design points is usually not as large in the former case as in the
latter. In fact, when the magnitude of the misspecification is minimal, the resulting robust design could have about the same
number of distinct observation points as its classical counterpart. Nevertheless, the gain in robustness often exceeds the premium
paid for robustness—see Table 1.
Designs for a very specific alternative may, however, suffer the same fate as designs assuming the correctness of the fitted
model when the alternative itself is not valid. Both take replicates of observations at only a few distinct points, especially when
the magnitude of the departure is small. However, when there is a higher degree of certainty in the alternative, these designs
could result in substantial gain in robustness. An example of this would be when the experimenter is aware of a more appropriate
model but seeks a design that allows for the fitting of a more parsimonious model. Also, designs when there are data to estimate
model contamination are quite similar to designs when the exact form of the contamination is known (single alternative). When
the information in the initial data is incorporated into the design procedure, as seen in Example 3 above, the robustness of the
resulting design could come at a very reduced premium.
In general, we have found there to be increasing numbers of distinct observation sites with increasing model uncertainty. The
overall message is consistent with that reported in the model robust design literature for linear models—robust designs can be
approximated by placing clusters of observations about the support points for classical designs. However, the nonlinearity of the
mean response in logistic design adds a slight twist to the overall message, in that the clusters of observations come with patterns
that are determined by the curvature prescribed by the model parameters. More striking is the fact that the all-bias design is
non-uniform in logistic regression models—even though the recommended design points are spread over the entire design space,
the frequencies of observations vary due to the curvature.
Overall, the design that protects against uncertainty in model parameters (via a Bayesian paradigm) and that which protects
against uncertainty in assumed model form could be described as taking observations in clusters. These clusters often come in
interesting patterns of curvature prescribed by the nonlinearity of the model—see examples in the previous section. Further work
would be required to obtain analytic descriptions of the effect of curvature in this robust approach, or even for the all-bias designs
for logistic models. While the focus of the model misspecification reported here is exclusively on linear predictor misspecification,
we are currently investigating other forms of misspecification in designing for the broader class of generalized linear models, of
which the logistic model is but a special case.
Acknowledgements
The research of both authors is supported by the Natural Sciences and Engineering Research Council of Canada. We appreciate
helpful comments from an anonymous referee.
Appendix A. Derivations
Proof of Theorem 1. Under conditions as in Fahrmeir (1990) the maximum likelihood estimate θ̂ exists and is consistent, and
∂l(θ̂)/∂θ is o_p(n^{−1/2}). The log-likelihood l, the score function and −1 times the second derivative according to the assumed model
are

l(\theta) = \sum_{i=1}^{N} \left[ n_i \left\{ y_i \log\left(\frac{\pi_i}{1-\pi_i}\right) + \log(1-\pi_i) \right\} + \log\binom{n_i}{n_i y_i} \right],

\frac{\partial l(\theta)}{\partial\theta} = \sum_{i=1}^{N} n_i (y_i - \pi_i)\, z(x_i),
\qquad
-\frac{\partial^2 l(\theta)}{\partial\theta\,\partial\theta^{\mathrm T}} = \sum_{i=1}^{N} n_i w_i\, z(x_i) z^{\mathrm T}(x_i).

An expansion of ∂l(θ)/∂θ_j around θ_0 gives

\frac{\partial l(\theta)}{\partial\theta_j} = \frac{\partial l(\theta_0)}{\partial\theta_j}
+ \sum_{k} (\theta_k - \theta_{0,k}) \frac{\partial^2 l(\theta_0)}{\partial\theta_j\,\partial\theta_k}
+ \frac{1}{2} \sum_{k}\sum_{l} (\theta_k - \theta_{0,k})(\theta_l - \theta_{0,l}) \frac{\partial^3 l(\theta^*)}{\partial\theta_j\,\partial\theta_k\,\partial\theta_l},

where θ_j and θ_{0,j} are the jth elements of the vectors θ and θ_0, respectively, and θ* is a point on the line segment connecting θ and
θ_0. If we replace θ by θ̂ in this expansion, we obtain

\sqrt{n} \sum_{k} (\hat\theta_k - \theta_{0,k}) \left[ \frac{1}{n} \frac{\partial^2 l(\theta_0)}{\partial\theta_j\,\partial\theta_k}
+ \frac{1}{2n} \sum_{l} (\hat\theta_l - \theta_{0,l}) \frac{\partial^3 l(\theta^*)}{\partial\theta_j\,\partial\theta_k\,\partial\theta_l} \right]
= -\frac{1}{\sqrt{n}} \frac{\partial l(\theta_0)}{\partial\theta_j}.

For the logistic likelihood the ∂³l(θ*)/∂θ_j ∂θ_k ∂θ_l are bounded, and so, using the consistency of θ̂, we have that

\frac{1}{n} \frac{\partial^2 l(\theta_0)}{\partial\theta_j\,\partial\theta_k}
+ \frac{1}{2n} \sum_{l} (\hat\theta_l - \theta_{0,l}) \frac{\partial^3 l(\theta^*)}{\partial\theta_j\,\partial\theta_k\,\partial\theta_l}
\xrightarrow{\;p\;} -H_{jk},

where H_{jk} is the (j,k)th element of the matrix H_n = −(1/n) ∂²l(θ_0)/∂θ ∂θ^T = Z^T PWZ. Thus the limit distribution of √n(θ̂ − θ_0) is
that of the solution of the equations Σ_k H_{jk} √n(θ̂_k − θ_{0,k}) = (1/√n) ∂l(θ_0)/∂θ_j, i.e., is the limit distribution of
H_n^{−1}(1/√n) ∂l(θ_0)/∂θ.

Using the central limit theorem for independent, not identically distributed random variables we have that (1/√n) ∂l(θ_0)/∂θ has a
multivariate normal limit distribution with asymptotic mean (1/√n) Σ_{i=1}^N n_i E[y_i − π_i(θ_0)] z(x_i) = √n b and asymptotic covariance
matrix H̃_n = Z^T P W_T Z. From this it follows that √n(θ̂ − θ_0) is AN(√n H_n^{−1} b, H_n^{−1} H̃_n H_n^{−1}), as required. □
Proof of Corollary 2. First write

I = \frac{1}{N} \sum_{i=1}^{N} \mathrm{var}[\sqrt{n}\,\pi(\hat\eta_i)]
+ \frac{1}{N} \sum_{i=1}^{N} \left\{ E[\sqrt{n}\,\pi(\hat\eta_i)] - \sqrt{n}\,\pi(\eta_i + f(x_i)) \right\}^2.

By the δ-method, the first sum is, up to terms which are o(1),

\frac{1}{N} \sum_{i=1}^{N} \left( \frac{d\pi_i}{d\eta_i} \right)^2 \mathrm{var}[\sqrt{n}\,\hat\eta_i]
= \frac{1}{N} \sum_{i=1}^{N} \left( \frac{d\pi_i}{d\eta_i} \right)^2 z^{\mathrm T}(x_i) H_n^{-1} \tilde H_n H_n^{-1} z(x_i)
= \frac{1}{N} \mathrm{tr}[W Z H_n^{-1} \tilde H_n H_n^{-1} Z^{\mathrm T} W].

Also, on expanding π(η̂_i) and π(η_i + f(x_i)) around η_i, we have

E[\sqrt{n}\,\pi(\hat\eta_i)] = \sqrt{n}\,\pi(\eta_i) + E\left[ \sqrt{n}\,\frac{d\pi_i}{d\eta_i} (\hat\eta_i - \eta_i) + o(\sqrt{n}(\hat\eta_i - \eta_i)) \right],

and

\sqrt{n}\,\pi(\eta_i + f(x_i)) = \sqrt{n}\,\pi(\eta_i) + \sqrt{n}\,\frac{d\pi_i}{d\eta_i} f(x_i) + o(\sqrt{n} f(x_i)).

Using an argument similar to that in the proof of Theorem 1, we have

E[\sqrt{n}\,\pi(\hat\eta_i)] = \sqrt{n}\,\pi(\eta_i) + \sqrt{n}\,\frac{d\pi_i}{d\eta_i} E(\hat\eta_i - \eta_i) + o(1).

Thus, the second sum in the expression for I is, up to terms which are o(1),

\frac{1}{N} \sum_{i=1}^{N} \left\{ E[\sqrt{n}\,\pi(\hat\eta_i)] - \sqrt{n}\,\pi(\eta_i + f(x_i)) \right\}^2
= \frac{1}{N} \sum_{i=1}^{N} \left( \frac{d\pi_i}{d\eta_i} \right)^2 \left\{ n\,\mathrm{bias}^{\mathrm T}(\hat\theta)\, z(x_i) z^{\mathrm T}(x_i)\,\mathrm{bias}(\hat\theta) - 2n f(x_i)\, z^{\mathrm T}(x_i)\,\mathrm{bias}(\hat\theta) + n f^2(x_i) \right\}
= \frac{1}{N} \left\{ n\, b^{\mathrm T} H_n^{-1} Z^{\mathrm T} W^2 Z H_n^{-1} b - 2n\, f^{\mathrm T} W^2 Z H_n^{-1} b + n\, f^{\mathrm T} W^2 f \right\},

reducing to (n/N) ‖W(Z H_n^{−1} b − f)‖². □
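The δ-method step in the proof of Corollary 2, which replaces var[√n π(η̂_i)] by (dπ_i/dη_i)² var[√n η̂_i], can be illustrated numerically: for η̂ normally distributed about η with small variance σ², the variance of π(η̂) is close to w²σ² with w = π(η)(1 − π(η)). The simulation below is illustrative only; the parameter values are arbitrary.

```python
import math, random

def delta_method_check(eta=1.0, sigma=0.05, n_samples=200_000, seed=4):
    """Compare the simulated variance of pi(eta_hat) with the delta-method
    approximation w^2 * sigma^2, where w = pi(eta) * (1 - pi(eta))."""
    rng = random.Random(seed)
    sig = lambda e: 1.0 / (1.0 + math.exp(-e))
    vals = [sig(eta + sigma * rng.gauss(0.0, 1.0)) for _ in range(n_samples)]
    m = sum(vals) / n_samples
    var_sim = sum((v - m) ** 2 for v in vals) / n_samples
    p = sig(eta)
    var_delta = (p * (1.0 - p)) ** 2 * sigma ** 2
    return var_sim, var_delta
```

The agreement degrades as σ grows, which is exactly the o(1) remainder absorbed in the proof.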
Proof of Theorem 3. Here and elsewhere, in the averaging we will use the identity ∫ t^T t p(t) dt = (N − p)/(N − p + 2), which
implies that

\int t t^{\mathrm T} p(t)\,dt = \frac{1}{N - p + 2}\, I_{N-p}.

First use (8) and (9) to write (7) explicitly in terms of t:

L_I(P, f) = \frac{1}{N} \left\{ \mathrm{tr}[(U^{\mathrm T} P W U)^{-1} (U^{\mathrm T} P W_T(t) U) (U^{\mathrm T} P W U)^{-1} U^{\mathrm T} W^2 U]
+ n \left\| W \left( U (U^{\mathrm T} P W U)^{-1} U^{\mathrm T} P (c_T(t) - c) - \tau \sqrt{N}\, \tilde U t \right) \right\|^2 \right\}. \quad (A.1)

Using (9) again we have W_T(t) = W + Ẇ τ√N Ũ t + O(τ²), where Ẇ = diag(w′(η_1), ..., w′(η_N)). Since τ² = O(n^{−1}) we obtain
∫ W_T(t) p(t) dt = W + O(n^{−1}), and so

\int \mathrm{tr}[(U^{\mathrm T} P W U)^{-1} (U^{\mathrm T} P W_T(t) U) (U^{\mathrm T} P W U)^{-1} U^{\mathrm T} W^2 U]\, p(t)\, dt
= \mathrm{tr}[(U^{\mathrm T} P W U)^{-1} (U^{\mathrm T} W^2 U)].

Similarly, we have c_T(t) − c = τ√N W Ũ t + O(τ²), and so

n \left\| W \left( U (U^{\mathrm T} P W U)^{-1} U^{\mathrm T} P (c_T(t) - c) - \tau \sqrt{N}\, \tilde U t \right) \right\|^2
= n \tau^2 N \left\| W (R - I) \tilde U t \right\|^2 + O(n^{-1/2}),

with

\int n \left\| W \left( U (U^{\mathrm T} P W U)^{-1} U^{\mathrm T} P (c_T(t) - c) - \tau \sqrt{N}\, \tilde U t \right) \right\|^2 p(t)\, dt
= \frac{n \tau^2 N \cdot \mathrm{tr}[W (R - I) \tilde U \tilde U^{\mathrm T} (R - I)^{\mathrm T} W]}{N - p + 2}.

The result follows upon noting that Ũ Ũ^T = I − U U^T and (R − I) U = 0, and then substituting these integrals into (A.1) and
simplifying. □
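The averaging identity used above is precisely the second-moment formula E‖t‖² = d/(d + 2) for t uniformly distributed on the unit ball in R^d with d = N − p. Reading p(t) that way (an interpretation, since p(t) is defined earlier in the paper than this excerpt), it can be checked by Monte Carlo:

```python
import math, random

def sample_unit_ball(d, rng):
    """Uniform point in the d-dimensional unit ball: Gaussian direction,
    radius drawn as U^(1/d)."""
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(v * v for v in g))
    r = rng.random() ** (1.0 / d)
    return [r * v / norm for v in g]

def mean_sq_norm(d, n_samples=60_000, seed=3):
    """Monte Carlo estimate of E||t||^2 under the uniform-ball density."""
    rng = random.Random(seed)
    return sum(sum(v * v for v in sample_unit_ball(d, rng))
               for _ in range(n_samples)) / n_samples
```

For d = 8 the identity gives 8/10 = 0.8, and the estimate should agree to two decimal places.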
References
Abdelbasit, K.M., Plackett, R.L., 1983. Experimental design for binary data. J. Amer. Statist. Assoc. 78, 90–98.
Adewale, A., Wiens, D.P., 2006. New criteria for robust integer-valued designs in linear models. Comput. Statist. Data Anal. 51, 723–736.
Atkinson, A.C., Haines, L.M., 1996. Designs for nonlinear and generalized linear models. In: Ghosh, S., Rao, C.R. (Eds.), Handbook of Statistics, vol. 13. pp. 437–475.
Bliss, C.I., 1935. The calculation of the dosemortality curve. Ann. Appl. Biol. 22, 134–167.
Box, G.E.P., Draper, N.R., 1959. A basis for the selection of a response surface design. J. Amer. Statist. Assoc. 54, 622–654.
Burridge, J., Sebastiani, P., 1994. D-optimal designs for generalized linear models with variance proportional to the square of the mean. Biometrika 81, 295–304.
Chaloner, K., Larntz, K., 1989. Optimal Bayesian design applied to logistic regression experiments. J. Statist. Plann. Inference 21, 191–208.
Chaudhuri, P., Mykland, P., 1993. Nonlinear experiments: optimal design and inference based on likelihood. J. Amer. Statist. Assoc. 88, 538–546.
Chernoff, H., 1953. Locally optimal designs for estimating parameters. Ann. Math. Statist. 24, 586–602.
Dette, H., Wong, W.K., 1996. Optimal Bayesian designs for models with partially specified heteroscedastic structure. Ann. Statist. 24, 2108–2127.
Dette, H., Haines, L., Imhof, L., 2005. Bayesian and maximin optimal designs for heteroscedastic regression models. Canad. J. Statist. 33, 221–241.
Fahrmeir, L., 1990. Maximum likelihood estimation in misspecified generalized linear models. Statistics 21, 487–502.
Fang, K.T., Wang, Y., 1994. Number-Theoretic Methods in Statistics. Chapman & Hall, London.
Fang, Z., Wiens, D.P., 2000. Integer-valued, minimax robust designs for estimation and extrapolation in heteroscedastic, approximately linear models. J. Amer.
Statist. Assoc. 95, 807–818.
Fedorov, V.V., 1972. Theory of Optimal Experiments. Academic Press, New York.
Ford, I., Silvey, S.D., 1980. A sequentially constructed design for estimating a nonlinear parametric function. Biometrika 67, 381–388.
Ford, I., Titterington, D.M., Kitsos, C.P., 1989. Recent advances in nonlinear experimental design. Technometrics 31, 49–60.
Ford, I., Torsney, B., Wu, C.F.J., 1992. The use of a canonical form in the construction of locally optimal designs for nonlinear problems. J. Roy. Statist. Soc. B 54,
569–583.
King, J., Wong, W.K., 2000. Minimax D-optimal designs for the logistic model. Biometrics 56, 1263–1267.
Läuter, E., 1974. Experimental design in a class of models. Math. Operationsforschung Statist. 5, 379–396.
Läuter, E., 1976. Optimal multipurpose designs for regression models. Math. Operationsforschung Statist. 7, 51–68.
Li, K.C., Notz, W., 1982. Robust designs for nearly linear regression. J. Statist. Plann. Inference 6, 135–151.
Marcus, M.B., Sacks, J., 1976. Robust designs for regression problems. In: Gupta, S.S., Moore, D.S. (Eds.), Statistical Theory and Related Topics II. Academic Press,
New York, pp. 245–268.
McCullagh, P., Nelder, J.A., 1989. Generalized Linear Models. Chapman & Hall, CRC, London, Boca Raton, FL.
Meyer, R.K., Nachtsheim, C.J., 1988. Constructing exact Doptimal experimental designs by simulated annealing. Amer. J. Math. Management Sci. 3 & 4, 329–359.
Minkin, S., 1987. Optimal design for binary data. J. Amer. Statist. Assoc. 82, 1098–1103.
Sinha, S., Wiens, D.P., 2002. Robust sequential designs for nonlinear regression. Canad. J. Statist. 30, 601–618.
Sitter, R.R., 1992. Robust designs for binary data. Biometrics 48, 1145–1155.
White, H., 1982. Maximum likelihood estimation of misspecified models. Econometrica 50, 1–25.
Wiens, D.P., 1992. Minimax designs for approximately linear regression. J. Statist. Plann. Inference 31, 353–371.
Wiens, D.P., Zhou, J., 1999. Minimax designs for approximately linear models with AR(1) errors. Canad. J. Statist. 27, 781–794.