Robust designs for misspecified logistic models

Merck Research Laboratories, North Wales, Pennsylvania 19454, United States; Department of Mathematical and Statistical Sciences, University of Alberta, Edmonton, Alberta, Canada T6G 2G1
Journal of Statistical Planning and Inference (Impact Factor: 0.71). 01/2009; DOI: 10.1016/j.jspi.2008.05.022
Source: OAI

ABSTRACT We develop criteria that generate robust designs and use them to construct designs that insure against possible misspecifications in logistic regression models. The proposed criteria differ from classical criteria in that they do not focus on sampling error alone; they also account for the bias error engendered by model misspecification. Our robust designs optimize the average of a function of the sampling error and bias error over a specified misspecification neighbourhood. Examples of robust designs for logistic models are presented, including a case study that applies the methodology to beetle mortality data.
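For orientation, the classical, sampling-error-only ingredient that the abstract contrasts with can be sketched as follows. This is a minimal illustration, not the authors' robust criterion: it assumes a two-parameter logistic model with linear predictor theta0 + theta1 * x, and the function names `logistic_information` and `log_det_information` are hypothetical. The bias term and the averaging over a misspecification neighbourhood described in the abstract are not modelled here.

```python
import numpy as np

def logistic_information(points, weights, theta):
    """Fisher information matrix of the (assumed) two-parameter logistic model
    P(y = 1 | x) = 1 / (1 + exp(-(theta0 + theta1 * x)))
    for a design that places mass weights[i] at points[i]."""
    x = np.asarray(points, dtype=float)
    w = np.asarray(weights, dtype=float)
    F = np.column_stack([np.ones_like(x), x])        # regressors (1, x)
    p = 1.0 / (1.0 + np.exp(-(F @ np.asarray(theta, dtype=float))))
    v = w * p * (1.0 - p)                            # weighted Bernoulli variances
    return F.T @ (v[:, None] * F)

def log_det_information(points, weights, theta):
    """Classical local D-criterion: log-determinant of the information matrix.
    Returns -inf when the design cannot identify both parameters."""
    sign, logdet = np.linalg.slogdet(logistic_information(points, weights, theta))
    return logdet if sign > 0 else -np.inf

# Compare two symmetric two-point designs at the assumed parameter value (0, 1).
wide = log_det_information([-1.5, 1.5], [0.5, 0.5], [0.0, 1.0])
narrow = log_det_information([-0.5, 0.5], [0.5, 0.5], [0.0, 1.0])
print("wide design preferred:", wide > narrow)
```

A criterion of this kind depends only on the variance of the estimates under the fitted model, which is why it offers no protection when the fitted model is itself wrong.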

  • Source
    ABSTRACT: Experimentation in scientific or medical studies is often carried out in order to model the ‘success’ probability of a binary random variable. Experimental designs for the testing of lack of fit and for estimation, for data with binary responses depending upon covariates which can be controlled by the experimenter, are constructed. It is supposed that the preferred model is one in which the probability of the occurrence of the target outcome depends on the covariates through a link function (logistic, probit, etc.) evaluated at a regression response, that is, a function of the covariates and of parameters to be estimated from the data once gathered. The fit of this model is to be tested within a broad class of alternatives over which the regression response varies. To this end, the problem is phrased as one of discriminating between the preferred model and the class of alternatives. This, in turn, is a hypothesis testing problem, for which the asymptotic power of the test statistic is directly related to the Kullback–Leibler divergence between the models, averaged over the design. ‘Maximin’ designs, which maximize (through the design) the minimum (among the class of alternative models) value of this power together with a measure of the efficiency of the parameter estimates, are also constructed. Several examples are presented in detail; two of these relate to a medical study of fluoxetine versus a placebo in depression patients. The method of design construction is computationally intensive, and involves a steepest descent minimization routine coupled with simulated annealing.
    Computational Statistics & Data Analysis. 01/2010; 54:3371-3378.
  • Source
    ABSTRACT: We construct experimental designs for dose–response studies. The designs are robust against possibly misspecified link functions; for this they minimize the maximum mean-squared error of the estimated dose required to attain a response in 100p% of the target population. Here p might be one particular value—p=0.5 corresponds to ED50-estimation—or it might range over an interval of values of interest. The maximum of the mean-squared error is evaluated over a Kolmogorov neighbourhood of the fitted link. Both the maximum and the minimum must be evaluated numerically; the former is carried out by quadratic programming and the latter by simulated annealing.
    Journal of the Royal Statistical Society 02/2011; 73(2):215-238.
  • Source
    ABSTRACT: We discuss robust designs for generalized linear models with protection for possible departures from the usual model assumptions. Besides possible inaccuracy in an assumed linear predictor, both problems of overdispersion and misspecification in link function are addressed. For logistic and Poisson models, as examples, we incorporate the variance function prescribed by a superior model similar to a generalized linear mixed model to address overdispersion, and adopt a parameterized generalized family of link functions to deal with the problem of link misspecification. The design criterion is the average mean squared prediction error (AMSPE). The exact optimal design, which minimizes the AMSPE, is also presented using examples on the toxicity of ethylene oxide to grain beetles, and on Ames Salmonella Assay.
    Computational Statistics & Data Analysis 01/2010; 54(4):875-890. · 1.30 Impact Factor
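The first abstract in the list above ties the power of a lack-of-fit test to the Kullback–Leibler divergence between models, averaged over the design. A minimal sketch of that averaged quantity for binary responses with a logistic link might look as follows. The function names, the direction of the divergence (alternative model relative to preferred model), and the quadratic alternative used in the usage lines are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

def bernoulli_kl(p, q):
    """Kullback-Leibler divergence KL(Bernoulli(p) || Bernoulli(q)), elementwise."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return p * np.log(p / q) + (1.0 - p) * np.log((1.0 - p) / (1.0 - q))

def design_averaged_kl(points, weights, eta_alternative, eta_preferred):
    """Average the divergence over a design with mass weights[i] at points[i].
    eta_alternative and eta_preferred map covariate values to regression
    responses; a logistic link is assumed for both models."""
    x = np.asarray(points, dtype=float)
    w = np.asarray(weights, dtype=float)
    p = 1.0 / (1.0 + np.exp(-eta_alternative(x)))    # success prob., alternative
    q = 1.0 / (1.0 + np.exp(-eta_preferred(x)))      # success prob., preferred
    return float(np.sum(w * bernoulli_kl(p, q)))

# Hypothetical example: linear preferred model, quadratic departure.
pts, wts = [-1.0, 0.0, 1.0], [1 / 3, 1 / 3, 1 / 3]
preferred = lambda x: 0.5 + x
alternative = lambda x: 0.5 + x + 0.3 * x**2
print(design_averaged_kl(pts, wts, alternative, preferred))
```

A design that puts all of its mass where the two regression responses agree (here, at x = 0) yields an averaged divergence of zero and so cannot detect the departure, which is the intuition behind choosing the design to make this quantity large.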

