Statistics and Computing (2021) 31:70
https://doi.org/10.1007/s11222-021-10046-2
Optimal design of multifactor experiments via grid exploration
Radoslav Harman · Lenka Filová · Samuel Rosa
Department of Applied Mathematics and Statistics, Faculty of Mathematics, Physics and Informatics, Comenius University in Bratislava, Bratislava, Slovakia
Corresponding author: Radoslav Harman, harman@fmph.uniba.sk
Received: 10 April 2021 / Accepted: 22 August 2021 / Published online: 13 September 2021
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2021
Abstract
We propose an algorithm for computing efficient approximate experimental designs that can be applied in the case of very large grid-like design spaces. Such a design space typically corresponds to the set of all combinations of multiple genuinely discrete factors or densely discretized continuous factors. The proposed algorithm alternates between two key steps: (1) the construction of exploration sets composed of star-shaped components and separate, highly informative design points and (2) the application of a conventional method for computing optimal approximate designs on medium-sized design spaces. For a given design, the star-shaped components are constructed by selecting all points that differ in at most one coordinate from some support point of the design. Because of the reliance on these star sets, we call our algorithm the galaxy exploration method (GEX). We demonstrate that GEX significantly outperforms several state-of-the-art algorithms when applied to D-optimal design problems for linear, generalized linear and nonlinear regression models with continuous and mixed factors. Importantly, we provide a free R code that permits direct verification of the numerical results and allows researchers to easily compute optimal or nearly optimal experimental designs for their own statistical models.
Keywords Optimal design · Multifactor experiments · Regression models · Generalized linear models · Algorithms
Mathematics Subject Classification 62K05 · 90C59
1 Introduction
The usual aim of the so-called "optimal" design of experiments is to perform experimental trials in a way that enables efficient estimation of the unknown parameters of an underlying statistical model (see, e.g., Fedorov 1972; Pázman 1986; Pukelsheim 2006; Atkinson et al. 2007; Goos and Jones 2011; Pronzato and Pázman 2013). The literature provides optimal designs in analytical forms for many specific situations; for a given practical problem at hand, however, analytical results are often unavailable. In such a case, it is usually possible to compute an optimal or nearly optimal design numerically (e.g., Chapter 4 in Fedorov 1972, Chapter 5 in Pázman 1986, Chapter 12 in Atkinson et al. 2007, and Chapter 9 in Pronzato and Pázman 2013).
In this paper, we propose a simple algorithm for solving one of the most common optimal design problems: computing efficient approximate designs for experiments with uncorrelated observations and several independent factors. The proposed algorithm employs a specific strategy to adaptively explore the grid of factor-level combinations without the need to enumerate all elements of the grid. The key idea of this algorithm is to form exploration sets composed of star-like subsets and other strategically selected points; therefore, we refer to this algorithm as the "galaxy" exploration method (GEX).
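To illustrate the star sets concretely (a minimal sketch in Python, not the authors' published R code; the grid, the example point and the helper name are ours), the following snippet collects all grid points that differ from a given support point in at most one coordinate:

```python
import numpy as np


def star_set(point, levels):
    """All grid points differing from `point` in at most one coordinate.

    `point`  -- tuple of factor levels, one entry per factor,
    `levels` -- list of 1-D arrays; levels[k] holds the admissible
                levels of factor k.
    """
    stars = {tuple(point)}
    for k, lv in enumerate(levels):
        for value in lv:
            neighbour = list(point)
            neighbour[k] = value      # change exactly one coordinate
            stars.add(tuple(neighbour))
    return stars


# Example: 3 factors, each with 5 equidistant levels in [-1, 1].
levels = [np.linspace(-1, 1, 5) for _ in range(3)]
support_point = (0.0, 1.0, -1.0)
print(len(star_set(support_point, levels)))   # 1 + 3*(5-1) = 13 points
```

In GEX, the union of such stars over the support points of the current design, together with a few additional highly informative points, forms the exploration set on which a conventional optimizer is then run.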
If the set of all combinations of factor levels is finite and not too large, it is possible to use many available efficient and provably convergent algorithms to compute an optimal design (e.g., those of Fedorov 1972; Atwood 1973; Silvey et al. 1978; Böhning 1986; Vandenberghe et al. 1998; Uciński and Patan 2007; Yu 2011; Sagnol 2011; Yang et al. 2013; Harman et al. 2020). However, in the case of multiple factors, each with many levels, the number of factor-level combinations is often much larger than the applicability limit of these methods.
The main advantage of GEX is that it can be used to solve problems with an extensive number of combinations of factor levels, e.g., 10^15 (5 factors, each with 1000 levels), and …
... In contrast, if X is an infinite set (or a continuum), a good finite representer of X, say X̃, must be constructed first. There are several approaches to attack such a problem, treating X̃ as a variable or a fixed parameter of the problem, which is chosen according to certain heuristics such as space-filling techniques, grid exploration (see, e.g., [8]), or minimal spanning trees. Recently, it has been shown that the use of polynomial admissible meshes gives precise quantitative estimates of the approximation introduced by the discretization of the problem, i.e., when passing from X to X̃; see [5]. ...
Preprint
Optimal experimental designs are probability measures with finite support enjoying an optimality property for the computation of least squares estimators. We present an algorithm for computing optimal designs on finite sets based on the long-time asymptotics of the gradient flow of the log-determinant of the so-called information matrix. We prove the convergence of the proposed algorithm and provide a sharp estimate on the rate of its convergence. Numerical experiments are performed on a few test cases using the new MATLAB package OptimalDesignComputation.
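For intuition only, a standard iteration related to the gradient flow of the log-determinant is the classical multiplicative weight update w_i <- w_i * f(x_i)^T M(w)^{-1} f(x_i) / m. The sketch below is a generic illustration of that idea (the quadratic model, grid and iteration count are our assumptions; this is not the preprint's algorithm, convergence analysis or MATLAB package):

```python
import numpy as np


def d_optimal_weights(F, iters=500):
    """Multiplicative updates for D-optimal weights on a fixed finite grid.

    F is an (n, m) matrix whose rows are the regression vectors f(x_i).
    Returns a probability vector w approximately maximizing
    log det M(w), where M(w) = sum_i w_i f(x_i) f(x_i)^T.
    """
    n, m = F.shape
    w = np.full(n, 1.0 / n)
    for _ in range(iters):
        M = F.T @ (w[:, None] * F)
        d = np.einsum("ij,jk,ik->i", F, np.linalg.inv(M), F)  # variance function
        w = w * d / m            # multiplicative (replicator-type) step
        w = w / w.sum()          # guard against rounding drift
    return w


# Quadratic model y = b0 + b1*x + b2*x^2 on 101 grid points in [-1, 1].
x = np.linspace(-1, 1, 101)
F = np.column_stack([np.ones_like(x), x, x**2])
w = d_optimal_weights(F)
print(x[w > 1e-3], np.round(w[w > 1e-3], 3))  # mass concentrates near -1, 0 and 1
```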
Article
This paper proposes a novel enhancement of the competitive swarm optimizer (CSO), namely the competitive swarm optimizer with mutated agents (CSO-MA), which mutates loser particles (agents) in the swarm to increase swarm diversity and improve the space exploration capability. The selection mechanism is carried out so that it does not retard the search when agents are exploring promising areas. Simulation results show that CSO-MA has a better exploration–exploitation balance than CSO, one of the state-of-the-art metaheuristic algorithms for optimization, and generally outperforms it. We additionally show that it also generally outperforms other swarm-based algorithms, as well as a popular non-swarm-based algorithm, Cuckoo search, without requiring much more CPU time. We apply CSO-MA to find a c-optimal approximate design for a high-dimensional optimal design problem where other swarm algorithms were not able to. As applications, we use CSO-MA to search for various optimal designs for a series of high-dimensional statistical models. The proposed CSO-MA algorithm is a general-purpose optimization tool and can be directly amended to find other types of optimal designs for nonlinear models, including optimal exact designs under a convex or non-convex criterion.
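The pairwise-competition update that CSO-type methods rely on can be sketched as follows (a rough illustration; the mutation rule, parameter values and test function below are generic placeholders, not the exact CSO-MA operators of the paper):

```python
import numpy as np

rng = np.random.default_rng(0)


def cso_ma_minimize(f, dim, pop=40, iters=200, phi=0.1, mut_prob=0.1,
                    lower=-5.0, upper=5.0):
    """Competitive swarm optimizer with a simple loser-mutation step."""
    X = rng.uniform(lower, upper, (pop, dim))
    V = np.zeros((pop, dim))
    for _ in range(iters):
        mean_pos = X.mean(axis=0)
        order = rng.permutation(pop)
        for a, b in zip(order[::2], order[1::2]):            # random pairing
            winner, loser = (a, b) if f(X[a]) <= f(X[b]) else (b, a)
            r1, r2, r3 = rng.random((3, dim))
            V[loser] = (r1 * V[loser]
                        + r2 * (X[winner] - X[loser])
                        + phi * r3 * (mean_pos - X[loser]))  # loser learns from winner
            X[loser] = np.clip(X[loser] + V[loser], lower, upper)
            if rng.random() < mut_prob:                      # mutate the loser
                k = rng.integers(dim)
                X[loser, k] = rng.uniform(lower, upper)
    best = min(range(pop), key=lambda i: f(X[i]))
    return X[best], f(X[best])


# Illustrative use: minimize the sphere function in 10 dimensions.
x_best, f_best = cso_ma_minimize(lambda x: float(np.sum(x**2)), dim=10)
print(f_best)
```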
Article
Several common general-purpose optimization algorithms are compared for finding A- and D-optimal designs for different types of statistical models of varying complexity, including high-dimensional models with five and more factors. The algorithms of interest include exact methods, such as the interior point method, the Nelder–Mead method, the active set method and sequential quadratic programming, and metaheuristic algorithms, such as particle swarm optimization, simulated annealing and genetic algorithms. Several simulations are performed, which provide general recommendations on the utility and performance of each method, including hybridized versions of metaheuristic algorithms, for finding optimal experimental designs. A key result is that general-purpose optimization algorithms, both exact methods and metaheuristic algorithms, perform well for finding optimal approximate experimental designs.
Article
In the area of statistical planning, there is a large body of theoretical knowledge and computational experience concerning so-called optimal approximate designs of experiments. However, for an approximate design to be executed in practice, it must be converted into an exact, i.e., integer, design, which is usually done via rounding procedures. Although rapid, rounding procedures have many drawbacks; in particular, they often yield worse exact designs than heuristics that do not require approximate designs at all. In this paper, we build on an alternative principle of utilizing optimal approximate designs for the computation of optimal, or nearly optimal, exact designs. The principle, which we call ascent with quadratic assistance (AQuA), is an integer programming method based on the quadratic approximation of the design criterion in the neighborhood of the optimal approximate information matrix. To this end, we present quadratic approximations of all Kiefer's criteria with an integer parameter, including D- and A-optimality and, by a model transformation, I-optimality. Importantly, we prove a low-rank property of the associated quadratic forms, which enables us to apply AQuA to large design spaces, for example via mixed-integer conic quadratic solvers. We numerically demonstrate the robustness and superior performance of the proposed method for models under various types of constraints. More precisely, we compute optimal size-constrained exact designs for the model of spring-balance weighing and optimal symmetric marginally restricted exact designs for the Scheffé mixture model. We also show how iterative application of AQuA can be used for stratified information-based subsampling of large datasets under a lower bound on the quality and an upper bound on the cost of the subsample.
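As a concrete illustration of the rounding step that the paragraph contrasts AQuA with, a generic largest-remainder rounding of approximate weights to N trials might look like this (a simple heuristic shown for illustration only, not the AQuA method):

```python
import numpy as np


def round_design(weights, N):
    """Round approximate design weights to integer trial counts summing to N."""
    weights = np.asarray(weights, dtype=float)
    ideal = N * weights / weights.sum()
    counts = np.floor(ideal).astype(int)
    remainder = ideal - counts
    # distribute the remaining trials to the largest fractional parts
    for i in np.argsort(-remainder)[: N - counts.sum()]:
        counts[i] += 1
    return counts


# Example: three support points with equal weights and N = 10 trials.
print(round_design([1 / 3, 1 / 3, 1 / 3], 10))   # e.g. [4, 3, 3]
```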
Article
We propose a class of subspace ascent methods for computing optimal approximate designs that covers both existing as well as new and more efficient algorithms. Within this class of methods, we construct a simple, randomized exchange algorithm (REX). Numerical comparisons suggest that the performance of REX is comparable or superior to the performance of state-of-the-art methods across a broad range of problem structures and sizes. We focus on the most commonly used criterion of D-optimality, which also has applications beyond experimental design, such as the construction of the minimum volume ellipsoid containing a given set of data points. For D-optimality, we prove that the proposed algorithm converges to the optimum. We also provide formulas for the optimal exchange of weights in the case of the criterion of A-optimality. These formulas enable one to use REX for computing A-optimal and I-optimal designs.
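A deliberately simplified illustration of the exchange idea moves weight between randomly chosen pairs of points whenever the log-determinant improves; the crude grid line search below stands in for the closed-form optimal exchange derived in the paper (the model, grid and step count are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)


def log_det(F, w):
    """log det of the information matrix M(w) = sum_i w_i f(x_i) f(x_i)^T."""
    M = F.T @ (w[:, None] * F)
    sign, val = np.linalg.slogdet(M)
    return val if sign > 0 else -np.inf


def random_exchange(F, w, steps=2000):
    """Shift weight between random pairs of points whenever log det improves."""
    n = F.shape[0]
    w = w.copy()
    for _ in range(steps):
        i, j = rng.choice(n, size=2, replace=False)
        base = log_det(F, w)
        best_delta, best_gain = 0.0, 0.0
        for delta in np.linspace(-w[j], w[i], 11):    # crude line search
            trial = w.copy()
            trial[i] -= delta
            trial[j] += delta
            gain = log_det(F, trial) - base
            if gain > best_gain:
                best_delta, best_gain = delta, gain
        w[i] -= best_delta
        w[j] += best_delta
    return w


# Quadratic model on 21 grid points in [-1, 1]; weights drift toward -1, 0, 1.
x = np.linspace(-1, 1, 21)
F = np.column_stack([np.ones_like(x), x, x**2])
print(np.round(random_exchange(F, np.full(len(x), 1.0 / len(x))), 3))
```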
Article
We find optimal designs for linear models using a novel algorithm that iteratively combines a semidefinite programming (SDP) approach with adaptive grid techniques. The proposed algorithm is also adapted to find locally optimal designs for nonlinear models. The search space is first discretized, and SDP is applied to find the optimal design based on the initial grid. The points in the next grid set are points that maximize the dispersion function of the SDP-generated optimal design using nonlinear programming. The procedure is repeated until a user-specified stopping rule is reached. The proposed algorithm is broadly applicable, and we demonstrate its flexibility using (i) models with one or more variables and (ii) differentiable design criteria, such as A- and D-optimality, and non-differentiable criteria, such as E-optimality, including the mathematically more challenging case in which the minimum eigenvalue of the information matrix of the optimal design has geometric multiplicity larger than 1. Our algorithm is computationally efficient because it is based on mathematical programming tools, so optimality is assured at each stage; it also exploits the convexity of the problems whenever possible. Using several linear and nonlinear models with one or more factors, we show that the proposed algorithm can efficiently find optimal designs.
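The dispersion (variance) function referred to above also provides the classical Kiefer–Wolfowitz optimality check for D-optimality: a design is D-optimal exactly when the function nowhere exceeds the number of model parameters. A small sketch of that check, for an illustrative quadratic model whose D-optimal design is known:

```python
import numpy as np


def dispersion(F_grid, F_supp, w):
    """d(x, xi) = f(x)^T M(xi)^{-1} f(x) evaluated on a grid of candidate points."""
    M = F_supp.T @ (w[:, None] * F_supp)
    return np.einsum("ij,jk,ik->i", F_grid, np.linalg.inv(M), F_grid)


f = lambda x: np.column_stack([np.ones_like(x), x, x**2])   # quadratic model

# Candidate design: equal weights at -1, 0, 1 (the D-optimal design for this model).
supp = np.array([-1.0, 0.0, 1.0])
w = np.array([1 / 3, 1 / 3, 1 / 3])

x_grid = np.linspace(-1, 1, 1001)
d = dispersion(f(x_grid), f(supp), w)
print(d.max())   # approximately 3 = number of parameters, confirming D-optimality
```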
Article
Differential Evolution (DE) has become one of the leading metaheuristics in the class of Evolutionary Algorithms, which consists of methods that operate off of survival-of-the-fittest principles. This general purpose optimization algorithm is viewed as an improvement over Genetic Algorithms, which are widely used to find solutions to chemometric problems. Using straightforward vector operations and random draws, DE can provide fast, efficient optimization of any real, vector-valued function. This article reviews the basic algorithm and a few of its modifications with various enhancements. We provide guidance for practitioners, discuss implementation issues and give illustrative applications of DE with the corresponding R codes to find different types of optimal designs for various statistical models in chemometrics that involve the Arrhenius equation, reaction rates, concentration measures and chemical mixtures.
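For readers unfamiliar with the basic algorithm reviewed here, a bare-bones DE/rand/1/bin loop looks roughly as follows (a textbook-style sketch, not the article's enhanced variants; the objective function and parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)


def de_rand_1_bin(f, dim, pop=30, iters=300, F=0.8, CR=0.9, lower=-5.0, upper=5.0):
    """Minimal DE/rand/1/bin: mutation by scaled vector differences,
    binomial crossover, and greedy selection."""
    X = rng.uniform(lower, upper, (pop, dim))
    fx = np.array([f(x) for x in X])
    for _ in range(iters):
        for i in range(pop):
            a, b, c = rng.choice([j for j in range(pop) if j != i], 3, replace=False)
            mutant = np.clip(X[a] + F * (X[b] - X[c]), lower, upper)
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True           # at least one mutated coordinate
            trial = np.where(cross, mutant, X[i])
            f_trial = f(trial)
            if f_trial <= fx[i]:                      # greedy selection
                X[i], fx[i] = trial, f_trial
    best = fx.argmin()
    return X[best], fx[best]


# Illustrative use: minimize the Rosenbrock function in 5 dimensions.
rosen = lambda x: float(np.sum(100 * (x[1:] - x[:-1]**2)**2 + (1 - x[:-1])**2))
print(de_rand_1_bin(rosen, dim=5)[1])
```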
Article
D-efficient saturated subsets are natural initial solutions of various algorithms applied in statistics and computational geometry. We propose two greedy heuristics for the construction of D-efficient saturated subsets: an improvement of the method suggested by Galil and Kiefer in the context of D-optimal experimental designs and a modification of the Kumar–Yildirim method for the initiation of the minimum-volume enclosing ellipsoid algorithms. We provide mathematical insights into the two methods and compare them to the commonly used random and regularized heuristics.
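A greedy construction in the spirit of such heuristics (a sketch, not an exact reproduction of either method discussed in the paper) selects, at each step, the regression vector with the largest component orthogonal to the span of the vectors already chosen:

```python
import numpy as np


def greedy_saturated_subset(F):
    """Greedily select m of the n rows of F (n x m) so that the chosen
    m x m submatrix has a large determinant in absolute value."""
    n, m = F.shape
    residual = F.astype(float).copy()
    chosen = []
    for _ in range(m):
        norms = np.linalg.norm(residual, axis=1)
        i = int(np.argmax(norms))                     # largest orthogonal component
        chosen.append(i)
        q = residual[i] / norms[i]
        residual -= np.outer(residual @ q, q)         # project out the new direction
    return chosen


# Illustrative example: quadratic model on 41 grid points in [-1, 1].
x = np.linspace(-1, 1, 41)
F = np.column_stack([np.ones_like(x), x, x**2])
idx = greedy_saturated_subset(F)
print(x[idx], abs(np.linalg.det(F[idx])))
```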
Article
Finding Bayesian optimal designs for nonlinear models is a difficult task because the optimality criterion typically requires us to evaluate complex integrals before we perform a constrained optimization. We propose a hybridized method where we combine an adaptive multidimensional integration algorithm and a metaheuristic algorithm called imperialist competitive algorithm (ICA) to find Bayesian optimal designs. We apply our numerical method to a few challenging design problems to demonstrate its efficiency. They include finding D-optimal designs for an item response model commonly used in education, Bayesian optimal designs for survival models and Bayesian optimal designs for a four-parameter sigmoid Emax dose response model. Supplementary materials for this article are available online and they contain an R package for implementing the proposed algorithm and codes for reproducing all the results in this paper.
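To make the "integrate, then optimize" structure concrete, the Bayesian D-criterion can be approximated by averaging the log-determinant of the local information matrix over draws from the prior. The sketch below does this for a one-factor logistic model (the prior, the candidate designs and plain Monte Carlo integration are our illustrative assumptions; the paper uses an adaptive deterministic integration rule and the ICA metaheuristic):

```python
import numpy as np

rng = np.random.default_rng(3)


def bayes_d_criterion(x_supp, w, theta_draws):
    """Monte Carlo approximation of E_theta[ log det M(xi, theta) ]
    for the logistic model with linear predictor theta0 + theta1 * x."""
    total = 0.0
    for t0, t1 in theta_draws:
        p = 1.0 / (1.0 + np.exp(-(t0 + t1 * x_supp)))
        weights = w * p * (1 - p)                      # GLM information weights
        F = np.column_stack([np.ones_like(x_supp), x_supp])
        M = F.T @ (weights[:, None] * F)
        total += np.linalg.slogdet(M)[1]
    return total / len(theta_draws)


# Compare two candidate 2-point designs under a normal prior on (theta0, theta1).
theta_draws = rng.normal([0.0, 1.0], [0.3, 0.3], size=(500, 2))
for supp in (np.array([-2.0, 2.0]), np.array([-0.5, 0.5])):
    print(supp, bayes_d_criterion(supp, np.array([0.5, 0.5]), theta_draws))
```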
Article
D-optimal designs are frequently used in controlled experiments to obtain the most accurate estimate of model parameters at minimal cost. Finding them can be a challenging task, especially when there are many factors in a nonlinear model. As the number of factors becomes large and the factors interact with one another, there are many more variables to optimize and the D-optimal design problem becomes high-dimensional and non-separable. Consequently, premature convergence issues arise: candidate solutions get trapped in local optima, and classical gradient-based optimization approaches for finding D-optimal designs rarely succeed. We propose a specially designed version of differential evolution (DE), a representative gradient-free optimization approach, to solve such high-dimensional optimization problems. The proposed DE uses a new novelty-based mutation strategy to explore the various regions of the search space. New regions are explored differently from previously explored regions, and the diversity of the population is preserved. The novelty-based mutation strategy is combined with two common DE mutation strategies to balance exploration and exploitation in the early and middle stages of the evolution. Additionally, we adapt the control parameters of DE as the evolution proceeds. Using logistic models with several factors on various design spaces as examples, our simulation results show that our algorithm can find D-optimal designs efficiently and that it outperforms its competitors. As an application, we re-design a 10-factor car refueling experiment with discrete and continuous factors and selected pairwise interactions. Our algorithm consistently outperformed the other algorithms and found a more efficient D-optimal design for the problem.
Article
Identifying optimal designs for generalized linear models with a binary response can be a challenging task, especially when there are both discrete and continuous independent factors in the model. Theoretical results rarely exist for such models, and the handful that do usually come with restrictive assumptions. In this paper we propose the d-QPSO algorithm, a modified version of quantum-behaved particle swarm optimization, to find a variety of D-optimal approximate and exact designs for experiments with discrete and continuous factors and a binary response. We show that the d-QPSO algorithm can efficiently find locally D-optimal designs even for experiments with a large number of factors, as well as robust pseudo-Bayesian designs when nominal values for the model parameters are not available. Additionally, we investigate the robustness of the designs generated by the d-QPSO algorithm to various model assumptions and provide real applications: designing a bio-plastics odor removal experiment, an electrostatic experiment, and a ten-factor car refueling experiment.
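Quantum-behaved PSO replaces the velocity update of standard PSO by resampling each particle around an attractor built from its personal best and the global best. A generic sketch of that update (not the d-QPSO modifications of the paper; the objective function and parameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)


def qpso_minimize(f, dim, pop=30, iters=300, lower=-5.0, upper=5.0):
    """Generic quantum-behaved PSO: each particle is resampled around a random
    convex combination of its personal best and the global best."""
    X = rng.uniform(lower, upper, (pop, dim))
    pbest = X.copy()
    pbest_f = np.array([f(x) for x in X])
    for t in range(iters):
        gbest = pbest[pbest_f.argmin()]
        mbest = pbest.mean(axis=0)                    # mean of the personal bests
        beta = 1.0 - 0.5 * t / iters                  # contraction-expansion factor
        for i in range(pop):
            phi = rng.random(dim)
            attractor = phi * pbest[i] + (1 - phi) * gbest
            u = rng.random(dim)
            sign = np.where(rng.random(dim) < 0.5, 1.0, -1.0)
            X[i] = attractor + sign * beta * np.abs(mbest - X[i]) * np.log(1.0 / u)
            X[i] = np.clip(X[i], lower, upper)
            fx = f(X[i])
            if fx < pbest_f[i]:
                pbest[i], pbest_f[i] = X[i].copy(), fx
    return pbest[pbest_f.argmin()], pbest_f.min()


# Illustrative use: minimize the sphere function in 8 dimensions.
print(qpso_minimize(lambda x: float(np.sum(x**2)), dim=8)[1])
```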