May 2025
SIAM Journal on Scientific Computing
November 2024 · 36 Reads
This paper introduces a modified Byrd-Omojokun (BO) trust region algorithm to address the challenges posed by noisy function and gradient evaluations. The original BO method was designed to solve equality-constrained problems, and it forms the backbone of some interior point methods for general large-scale constrained optimization. A key strength of the BO method is its robustness in handling problems with rank-deficient constraint Jacobians. The algorithm proposed in this paper introduces a new criterion for accepting a step and for updating the trust region that makes use of an estimate of the noise in the problem. The analysis presented here gives conditions under which the iterates converge to regions of stationary points of the problem whose size is determined by the level of noise. This analysis is more complex than for line search methods because the trust region carries (noisy) information from previous iterates. Numerical tests illustrate the practical performance of the algorithm.
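As a rough sketch of the machinery the paper builds on, the snippet below shows the classical Byrd-Omojokun step decomposition: a normal step that reduces the linearized constraint violation within a fraction of the trust region, followed by a tangential step that reduces the quadratic objective model in the null space of the constraint Jacobian. The plain least-squares and SVD computations, the regularization, and all parameter names are illustrative stand-ins for the dogleg and projected-CG subproblem solvers used in practice, and the noise-aware acceptance criterion proposed in the paper is not shown.

```python
import numpy as np

def byrd_omojokun_step(g, H, c, A, radius, zeta=0.8):
    """Schematic Byrd-Omojokun step decomposition (illustrative only).

    g, H   : gradient and Hessian (approximation) of the objective model
    c, A   : constraint values c(x) and constraint Jacobian A(x)
    radius : trust-region radius
    """
    # Normal step: minimum-norm least-squares solution of A v = -c,
    # truncated to a fraction zeta of the trust region.
    v = np.linalg.lstsq(A, -c, rcond=None)[0]
    nv = np.linalg.norm(v)
    if nv > zeta * radius:
        v *= zeta * radius / nv

    # Tangential step: reduce the quadratic model in the null space of A,
    # computed via the SVD so that a rank-deficient Jacobian is handled.
    _, s, Vt = np.linalg.svd(A)
    rank = int(np.sum(s > 1e-12 * max(s[0], 1.0))) if s.size else 0
    Z = Vt[rank:].T                      # basis for null(A)
    if Z.shape[1] > 0:
        gz = Z.T @ (g + H @ v)
        Hz = Z.T @ H @ Z
        u = np.linalg.solve(Hz + 1e-8 * np.eye(Hz.shape[0]), -gz)
        w = Z @ u
        nw = np.linalg.norm(w)
        budget = np.sqrt(max(radius**2 - np.dot(v, v), 0.0))
        if nw > budget and nw > 0:
            w *= budget / nw
    else:
        w = np.zeros_like(v)
    return v + w
```

Handling the null space through the SVD is one simple way to keep the decomposition well defined when the constraint Jacobian is rank deficient, the situation in which the BO approach is particularly valued.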
August 2023 · 14 Reads · 21 Citations
SIAM Journal on Optimization
May 2023 · 18 Reads · 12 Citations
IMA Journal of Numerical Analysis
The motivation for this paper stems from the desire to develop an adaptive sampling method for solving constrained optimization problems in which the objective function is stochastic and the constraints are deterministic. The method proposed in this paper is a proximal gradient method that can also be applied to the composite optimization problem min_x f(x) + h(x), where f is stochastic and h is convex (but not necessarily differentiable). Adaptive sampling methods employ a mechanism for gradually improving the quality of the gradient approximation so as to keep computational cost to a minimum. The mechanism commonly employed in unconstrained optimization is no longer reliable in the constrained or composite optimization settings, because it is based on pointwise decisions that cannot correctly predict the quality of the proximal gradient step. The method proposed in this paper measures the result of a complete step to determine if the gradient approximation is accurate enough; otherwise, a more accurate gradient is generated and a new step is computed. Convergence results are established for both strongly convex and general convex f. Numerical experiments are presented to illustrate the practical behavior of the method.
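The sketch below conveys the general shape of such a method: a proximal gradient step computed from a sampled gradient, followed by a test on the complete step that decides whether the sample size should be increased before the step is taken. The specific test (sampling error of the mean gradient versus the squared length of the proximal step), the soft-thresholding prox used for h, and all names and defaults are illustrative assumptions rather than the exact criterion analyzed in the paper.

```python
import numpy as np

def prox_l1(v, t):
    """Proximal operator of t*||.||_1 (soft-thresholding); used here only
    as an example of a nonsmooth convex h."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def adaptive_prox_gradient(grad_sample, x0, alpha, n0=8, n_max=4096,
                           theta=0.9, max_iter=100):
    """Illustrative adaptive-sampling proximal gradient loop.

    grad_sample(x, n) is assumed to return the mean gradient over n samples
    together with the per-component sample variance.
    """
    x, n = np.asarray(x0, dtype=float).copy(), n0
    for _ in range(max_iter):
        g, var = grad_sample(x, n)
        x_trial = prox_l1(x - alpha * g, alpha)   # complete proximal step
        step = x_trial - x
        # If the estimated variance of the mean gradient is large relative
        # to the step, draw more samples and recompute before moving.
        if np.sum(var) / n > theta * np.dot(step, step) and n < n_max:
            n *= 2
            continue
        x = x_trial
    return x
```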
March 2023 · 26 Reads · 24 Citations
Mathematical Programming
Classical trust region methods were designed to solve problems in which function and gradient information are exact. This paper considers the case when there are errors (or noise) in the above computations and proposes a simple modification of the trust region method to cope with these errors. The new algorithm only requires information about the size/standard deviation of the errors in the function evaluations and incurs no additional computational expense. It is shown that, when applied to a smooth (but not necessarily convex) objective function, the iterates of the algorithm visit a neighborhood of stationarity infinitely often, assuming errors in the function and gradient evaluations are bounded. It is also shown that, after visiting the above neighborhood for the first time, the iterates cannot stray too far from it, as measured by the objective value. Numerical results illustrate how the classical trust region algorithm may fail in the presence of noise, and how the proposed algorithm ensures steady progress towards stationarity in these cases.
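One way to picture the modification is as a relaxation of the classical ratio test by the known noise level. The snippet below is a minimal sketch of that idea, under the assumption that the noise estimate eps_f is simply added to the numerator and denominator of the ratio; the exact form used in the paper may differ.

```python
def noise_tolerant_ratio(f_old, f_new, pred_reduction, eps_f):
    """Relaxed trust-region ratio (illustrative sketch).

    f_old, f_new   : noisy objective values at the current and trial points
    pred_reduction : model decrease m(0) - m(s) > 0
    eps_f          : estimate of the size (or standard deviation) of the
                     errors in the function evaluations

    Adding eps_f to numerator and denominator keeps the ratio meaningful
    when the true reduction is at the scale of the noise, so steps are not
    rejected, and the radius is not shrunk, purely because of noise.
    """
    return (f_old - f_new + eps_f) / (pred_reduction + eps_f)
```

With exact evaluations (eps_f = 0) this reduces to the classical ratio, so the relaxation only becomes active once the predicted reduction approaches the noise level; in this reading the rest of the trust-region loop, including the thresholds for accepting steps and updating the radius, is left unchanged.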
September 2022 · 89 Reads · 42 Citations
Optimization Methods and Software
The goal of this paper is to investigate an approach for derivative-free optimization that has not received sufficient attention in the literature and yet is one of the simplest to implement and parallelize. In its simplest form, it consists of employing derivative-based methods for unconstrained or constrained optimization and replacing the gradient of the objective (and constraints) by finite-difference approximations. This approach is applicable to problems with or without noise in the functions. The differencing interval is determined by a bound on the second (or third) derivative and by the noise level, which is assumed to be known or to be accessible through difference tables or sampling. The use of finite-difference gradient approximations has been largely dismissed in the derivative-free optimization literature as too expensive in terms of function evaluations or as impractical in the presence of noise. However, the test results presented in this paper suggest that it has much to recommend it. The experiments compare NEWUOA, DFO-LS and COBYLA against finite-difference versions of L-BFGS, LMDER and KNITRO on three classes of problems: general unconstrained problems, nonlinear least squares problems and nonlinear programs with inequality constraints.
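For concreteness, the sketch below shows the standard noise-aware choice of the forward-difference interval: balancing the truncation error bound M*h/2 against the measurement error bound 2*eps_f/h gives h = 2*sqrt(eps_f/M). The routine and its parameter names are illustrative; in the paper such approximations are supplied to existing derivative-based solvers rather than used through a standalone helper like this.

```python
import numpy as np

def fd_gradient(f, x, eps_f, M=1.0):
    """Forward-difference gradient with a noise-aware interval (sketch).

    eps_f : bound on (or standard deviation of) the noise in f
    M     : bound on the second derivative along each coordinate
            (assumed known, or estimated via difference tables or sampling)
    """
    x = np.asarray(x, dtype=float)
    h = 2.0 * np.sqrt(eps_f / M)        # balances truncation and noise error
    fx = f(x)
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - fx) / h
    return g
```

A gradient built this way can, for example, be passed as the jac argument of scipy.optimize.minimize with method='L-BFGS-B', which is in the spirit of the finite-difference L-BFGS variant tested in the paper.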
August 2022 · 19 Reads · 17 Citations
SIAM Journal on Scientific Computing
March 2022 · 43 Reads · 29 Citations
SIAM Journal on Optimization
January 2022 · 73 Reads
Classical trust region methods were designed to solve problems in which function and gradient information are exact. This paper considers the case when there are bounded errors (or noise) in the above computations and proposes a simple modification of the trust region method to cope with these errors. The new algorithm only requires information about the size of the errors in the function evaluations and incurs no additional computational expense. It is shown that, when applied to a smooth (but not necessarily convex) objective function, the iterates of the algorithm visit a neighborhood of stationarity infinitely often, and that the rest of the sequence cannot stray too far away, as measured by function values. Numerical results illustrate how the classical trust region algorithm may fail in the presence of noise, and how the proposed algorithm ensures steady progress towards stationarity in these cases.
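A tiny numerical illustration of the failure mode described above, assuming bounded uniform noise of size eps_f: near a minimizer the predicted reduction falls to the noise level, so the classical ratio ared/pred is dominated by noise, whereas a ratio relaxed by eps_f (one simple reading of the proposed modification) is far less sensitive. All numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
eps_f = 1e-3                                   # bound on the noise
f = lambda x: 0.5 * x**2 + eps_f * rng.uniform(-1.0, 1.0)

x, s = 1e-2, -1e-2                             # step landing on the minimizer
pred = 0.5 * x**2                              # true model reduction: 5e-5
ared = f(x) - f(x + s)                         # noisy actual reduction

# The classical ratio can swing over tens of units in either sign, because
# the noise in ared can be as large as 2*eps_f = 40*pred; the relaxed ratio
# stays in a small interval around 1, so the step is far less likely to be
# rejected and the radius far less likely to be shrunk spuriously.
print("classical:", ared / pred)
print("relaxed:  ", (ared + eps_f) / (pred + eps_f))
```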
October 2021 · 15 Reads
A common approach for minimizing a smooth nonlinear function is to employ finite-difference approximations to the gradient. While this is straightforward when the function evaluations are exact, when the function is noisy the optimal choice of the differencing interval requires information about the noise level and higher-order derivatives of the function, which are often unavailable. Given the noise level of the function, we propose a bisection search for finding a finite-difference interval, for any finite-difference scheme, that balances the truncation error, which arises from the error in the Taylor series approximation, and the measurement error, which results from noise in the function evaluation. Our procedure produces near-optimal estimates of the finite-difference interval at low cost without knowledge of the higher-order derivatives. We show its numerical reliability and accuracy on a set of test problems. When combined with L-BFGS, we obtain a robust method for minimizing noisy black-box functions, as illustrated on a subset of synthetically noisy unconstrained CUTEst problems.
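The sketch below conveys the balancing idea with a simple bisection, assuming that the second-order difference D2(h) = f(x+2h) - 2f(x+h) + f(x), whose noise-free value is roughly f''(x)*h^2, can be compared against the noise level eps_f: if |D2| sits at noise scale the interval is too small, and if it is much larger the truncation error dominates. The thresholds, update rule, and names are illustrative; the test and the near-optimality guarantees in the paper are more refined.

```python
import numpy as np

def fd_interval_bisection(f, x, eps_f, h0=1e-2, max_iter=30,
                          r_lo=1.0, r_hi=10.0):
    """Bisection-style search for a forward-difference interval (sketch)."""
    lo, hi = 0.0, np.inf
    h = h0
    for _ in range(max_iter):
        d2 = abs(f(x + 2 * h) - 2 * f(x + h) + f(x))
        ratio = d2 / eps_f
        if ratio < r_lo:              # curvature signal lost in the noise
            lo = h
            h = 4 * h if np.isinf(hi) else 0.5 * (lo + hi)
        elif ratio > r_hi:            # truncation error dominates
            hi = h
            h = 0.5 * (lo + hi)
        else:                         # the two error sources are balanced
            break
    return h
```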
... Various algorithms have been designed to solve deterministic equality-constrained optimization problems (see [6,11] for further references), while recent research has focused on developing stochastic optimization algorithms. There has been a growing interest in adapting line search and trust region methods to the stochastic framework for unconstrained optimization problems [1-5, 9, 10, 12, 14, 20, 22, 24, 26-28], but significantly fewer algorithms have been proposed to solve stochastic equality-constrained optimization problems (see [6] for further references and [7,13,15,34,37]). ...
August 2023
SIAM Journal on Optimization
... A deterministic version of this condition has been used in [2] for the analysis of a proximal inexact trust-region algorithm. A stochastic version imposed in expectation was used in [4], and an alternative that is meant to be more practical is suggested in [60]. Further variants for general constrained optimization are proposed in [8]. ...
May 2023
IMA Journal of Numerical Analysis
... In this paper, we focus on noise-aware algorithms for solving such problems, i.e., algorithms that exploit information about the noise and that are adaptive. In the unconstrained and bounded noise setting, several noise-aware algorithms that leverage noise-level dependent constants (e.g., ε_f and ε_g) to evaluate the acceptability of steps within line search [5,6,29,48] or trust region [2,12,30,44] methods have been proposed. A natural extension of these algorithms to the constrained setting assumes bounded noise in the objective function and associated derivatives, and possibly in the constraint functions. ...
March 2023
Mathematical Programming
... Our findings indicate a general superiority of the newly developed methods over the basic version of the IGD (inexact gradient descent) method without momentum. As discussed in [20,45], IGD in general outperforms other well-developed methods in derivative-free optimization, including FMINSEARCH, i.e., the Nelder-Mead simplex-based method from [25], the implicit filtering algorithms [10], and the random gradient-free algorithm for smooth optimization proposed by Nesterov and Spokoiny [34]. As a consequence, IGDm can be recommended as a preferable optimizer for derivative-free smooth (convex and nonconvex) optimization problems. ...
September 2022
Optimization Methods and Software
... Finally, it would be interesting to compare our methods with recent results on adaptive finite-difference methods [35], which automatically adjust the finite-difference interval to balance truncation error and measurement error, making them suitable for noisy derivative-free optimization. We keep these questions for further research. ...
August 2022
SIAM Journal on Scientific Computing
... Byrd et al. [3] have proposed a stochastic quasi-Newton method in limited memory form through subsampled Hessian-vector products. Shi et al. [23] have proposed practical extensions of the BFGS and L-BFGS methods for nonlinear optimization that are capable of dealing with noise by employing a new line search technique. Xie et al. [24] have considered the convergence analysis of quasi-Newton methods when there are (bounded) errors in both function and gradient evaluations, and established conditions under which an Armijo-Wolfe line search on the noisy function yields sufficient decrease in the true objective function. ...
March 2022
SIAM Journal on Optimization
... In this paper, we focus on noise-aware algorithms for solving such problems, i.e., algorithms that exploit information about the noise and that are adaptive. In the unconstrained and bounded noise setting, several noise-aware algorithms that leverage noise-level dependent constants (e.g., ε_f and ε_g) to evaluate the acceptability of steps within line search [5,6,29,48] or trust region [2,12,30,44] methods have been proposed. A natural extension of these algorithms to the constrained setting assumes bounded noise in the objective function and associated derivatives, and possibly in the constraint functions. ...
January 2020
SIAM Journal on Optimization
... In this paper, we focus on noise-aware algorithms for solving such problems, i.e., algorithms that exploit information about the noise and that are adaptive. In the unconstrained and bounded noise setting, several noise-aware algorithms that leverage noise-level dependent constants (e.g., ε_f and ε_g) to evaluate the acceptability of steps within line search [5,6,29,48] or trust region [2,12,30,44] methods have been proposed. A natural extension of these algorithms to the constrained setting assumes bounded noise in the objective function and associated derivatives, and possibly in the constraint functions. ...
March 2018
SIAM Journal on Optimization
... Due to the importance of machine learning and deep learning, [29] and [30] analyze the performance of quasi-Newton methods in these fields. Also, [31] and [32] seek to determine a suitable batch selection method for training machine learning models. ...
February 2018
... represents the sensitivity matrix (also referred to as the functional or Jacobian matrix). The solution of problem (56) can be computed by various methods; see [141-145]. In the case of gradient-based algorithms, the derivatives (57) are required to compute *. ...
January 2006