Yurii Nesterov’s research while affiliated with Corvinus University of Budapest and other places


Publications (148)


Convex Quartic Problems: Homogenized Gradient Method and Preconditioning
  • Article

April 2025

SIAM Journal on Optimization

Radu-Alexandru Dragomir · Yurii Nesterov

Asymmetric Long-Step Primal-Dual Interior-Point Methods with Dual Centering
  • Preprint

March 2025 · 15 Reads

In this paper, we develop a new asymmetric framework for solving primal-dual problems of Conic Optimization by Interior-Point Methods (IPMs). It allows the development of efficient methods for problems where the dual formulation is simpler than the primal one. Problems of this type arise, in particular, in Semidefinite Optimization (SDO), for which we propose a new method with very attractive computational cost. Our long-step predictor-corrector scheme is based on centering in the dual space. It computes the affine-scaling predictor direction using the dual barrier function, controlling the tangent step size by a functional proximity measure. We show that for symmetric cones the search procedure at the predictor step is very cheap. In general, we do not need sophisticated Linear Algebra, restricting ourselves to Cholesky factorization. Nevertheless, our complexity bounds match the best known polynomial-time results. Moreover, for symmetric cones the bounds automatically depend on the smaller of the barrier parameters of the primal and dual feasible sets. We show by SDO examples that the corresponding gain can be very large. We argue that the dual framework is better suited for adjustment to the actual complexity of the problem. As an example, we discuss some classes of SDO problems where the number of iterations is proportional to the square root of the number of linear equality constraints, while the computational cost of one iteration is similar to that of Linear Optimization. We support our theoretical developments by preliminary but encouraging numerical results on randomly generated SDO problems of various sizes.
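To make the dual-centering idea concrete, the sketch below shows a generic damped Newton centering step on the dual log-barrier of a Linear Program, using one Cholesky factorization per step. It is a textbook illustration of centering in the dual space under an assumed formulation (the LP data A, b, c, the penalty parameter t, and the 1/(1+λ) step rule are standard choices), not the asymmetric predictor-corrector scheme proposed in the paper.

```python
import numpy as np

def dual_centering_step(A, b, c, y, t):
    """One damped Newton step on the dual log-barrier of a Linear Program.

    Dual LP:  max  b^T y   s.t.  s = c - A^T y >= 0.
    Centering objective (maximized):  phi_t(y) = t * b^T y + sum_i log(s_i).
    A textbook sketch of dual-space centering, not the paper's asymmetric
    predictor-corrector scheme.
    """
    s = c - A.T @ y                          # dual slacks, must stay positive
    grad = t * b - A @ (1.0 / s)             # gradient of phi_t
    H = (A * (1.0 / s ** 2)) @ A.T           # minus the Hessian of phi_t (positive definite)
    L = np.linalg.cholesky(H)                # the only linear algebra needed
    dy = np.linalg.solve(L.T, np.linalg.solve(L, grad))
    lam = np.sqrt(grad @ dy)                 # Newton decrement
    return y + dy / (1.0 + lam), lam         # damped step keeps the slacks positive

# Tiny usage example on a random strictly dual-feasible instance (c > 0 makes y = 0 feasible).
rng = np.random.default_rng(0)
A, b = rng.standard_normal((5, 20)), rng.standard_normal(5)
c = rng.random(20) + 1.0
y, t = np.zeros(5), 10.0
for _ in range(30):
    y, lam = dual_centering_step(A, b, c, y, t)
    if lam < 1e-8:                           # y approximates the point y*(t) on the dual central path
        break
```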


Quartic Regularity

March 2025 · 3 Reads

Vietnam Journal of Mathematics

In this paper, we propose new linearly convergent second-order methods for minimizing convex quartic polynomials. This framework is applied to designing optimization schemes which can solve general convex problems satisfying a new condition of quartic regularity. It assumes positive definiteness and boundedness of the fourth derivative of the objective function. For such problems, an appropriate quartic regularization of the Damped Newton Method has a global linear rate of convergence. We discuss several important consequences of this result. In particular, it can be used for constructing new second-order methods in the framework of high-order proximal-point schemes (Nesterov, Math. Program. 197, 1–26, 2023 and Nesterov, SIAM J. Optim. 31, 2807–2828, 2021). These methods have convergence rate $\tilde{O}(k^{-p})$, where k is the iteration counter, p is equal to 3, 4, or 5, and the tilde indicates the presence of logarithmic factors in the complexity bounds for the auxiliary problems solved at each iteration of the schemes.
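A natural computational primitive for such schemes is a Newton step with fourth-order regularization. The sketch below minimizes the generic model ⟨g, h⟩ + ½⟨Hh, h⟩ + (D/4)‖h‖⁴ via its optimality condition (H + D‖h‖²I)h = −g; the regularization constant D and the bisection tolerance are illustrative choices, and the sketch is not claimed to reproduce the paper's exact quartic regularization of the Damped Newton Method.

```python
import numpy as np

def quartic_newton_step(g, H, D):
    """Minimize the regularized model  m(h) = <g, h> + 0.5 <H h, h> + (D / 4) * ||h||^4
    for a positive semidefinite H.  The minimizer satisfies
        (H + tau * I) h = -g   with   tau = D * ||h||^2,
    and phi(tau) = D * ||h(tau)||^2 - tau is strictly decreasing, so the scalar
    equation phi(tau) = 0 can be solved by bisection."""
    I = np.eye(len(g))
    h_of = lambda tau: np.linalg.solve(H + tau * I, -g)
    lo, hi = 0.0, 1.0
    while D * np.dot(h_of(hi), h_of(hi)) > hi:       # grow the bracket until phi(hi) <= 0
        hi *= 2.0
    for _ in range(60):                              # bisection
        mid = 0.5 * (lo + hi)
        if D * np.dot(h_of(mid), h_of(mid)) > mid:
            lo = mid
        else:
            hi = mid
    return h_of(hi)

# Usage: one step on the convex quartic f(x) = 0.25 * (x^T B x)^2 + c^T x.
rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))
B = M @ M.T                                          # B is positive semidefinite
c, x = rng.standard_normal(4), rng.standard_normal(4)
g = (x @ B @ x) * (B @ x) + c                        # gradient of f at x
H = (x @ B @ x) * B + 2.0 * np.outer(B @ x, B @ x)   # Hessian of f at x
D = 6.0 * np.linalg.norm(B, 2) ** 2                  # valid bound on the fourth derivative of this f
x_next = x + quartic_newton_step(g, H, D)
```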


Local and Global Convergence of Greedy Parabolic Target-Following Methods for Linear Programming

December 2024 · 1 Read

In the first part of this paper, we prove that, under some natural non-degeneracy assumptions, the Greedy Parabolic Target-Following Method based on a universal tangent direction has favorable local behavior. In view of its global complexity bound of the order $O(\sqrt{n}\ln\frac{1}{\epsilon})$, this fact proves that the functional proximity measure used for controlling the closeness to the Greedy Central Path is large enough to ensure a local super-linear rate of convergence, provided that the proximity to the path is gradually reduced. This requirement is eliminated in our second algorithm, based on a new auto-correcting predictor direction. This method, besides the best-known polynomial-time complexity bound, ensures automatic switching to local quadratic convergence in a small neighborhood of the solution. Our third algorithm approximates the path by quadratic curves. On top of the best-known global complexity bound, this method benefits from an unusual local cubic rate of convergence. This amelioration requires no serious increase in the cost of one iteration. We compare the advantages of these local accelerations with the possibilities of finite termination. The conditions allowing detection of the optimal basis are sometimes even weaker than those required for local superlinear convergence. Hence, it is important to endow practical optimization schemes with both abilities. The proposed methods have a very interesting combination of favorable properties, which can hardly be found in most existing Interior-Point schemes. As with all other parabolic target-following schemes, the new methods can start from an arbitrary strictly feasible primal-dual pair and go directly towards the optimal solution of the problem in a single phase. Preliminary computational experiments confirm the advantage of the second-order prediction.


Fig. 1: Optimal choice of β and γ. The feasible set given by (44) and (45) is shown in red.
Improved global performance guarantees of second-order methods in convex minimization

August 2024 · 19 Reads

In this paper, we attempt to compare two distinct branches of research on second-order optimization methods. The first one studies self-concordant functions and barriers, the main assumption being that the third derivative of the objective is bounded by the second derivative. The second branch studies cubic regularized Newton methods (CRNMs), with the main assumption that the second derivative is Lipschitz continuous. We develop a new theoretical analysis of a path-following scheme (PFS) for general self-concordant functions, as opposed to the classical path-following scheme developed for self-concordant barriers. We show that the complexity bound for this scheme is better than that of the Damped Newton Method (DNM) and that our method has global superlinear convergence. We also propose a new predictor-corrector path-following scheme (PCPFS) that leads to further improvement of the constant factors in the complexity guarantees for minimizing general self-concordant functions. We also apply path-following schemes to different classes of constrained optimization problems and obtain the resulting complexity bounds. Finally, we analyze an important subclass of general self-concordant functions, namely the class of strongly convex functions with Lipschitz continuous second derivative, and show that for this subclass CRNMs give even better complexity bounds.
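For context, the two assumptions being compared, and the damped Newton step that serves as the baseline, are the standard ones recalled below (standard definitions, not results specific to this paper).

```latex
% Standard assumptions behind the two branches (included for context only):
\text{self-concordance:}\quad |D^3 f(x)[h,h,h]| \;\le\; 2\,\bigl(D^2 f(x)[h,h]\bigr)^{3/2},
\qquad
\text{Lipschitz Hessian:}\quad \|\nabla^2 f(x)-\nabla^2 f(y)\| \;\le\; L_2\,\|x-y\|.

% Damped Newton step with Newton decrement \lambda(x), the baseline whose
% global bound the path-following scheme improves upon:
x_{+} \;=\; x \;-\; \frac{1}{1+\lambda(x)}\,\bigl[\nabla^2 f(x)\bigr]^{-1}\nabla f(x),
\qquad
\lambda(x) \;=\; \Bigl(\nabla f(x)^{\top}\bigl[\nabla^2 f(x)\bigr]^{-1}\nabla f(x)\Bigr)^{1/2}.
```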


Primal Subgradient Methods with Predefined Step Sizes

May 2024 · 18 Reads · 2 Citations

Journal of Optimization Theory and Applications

In this paper, we suggest a new framework for analyzing primal subgradient methods for nonsmooth convex optimization problems. We show that the classical step-size rules, based on normalization of the subgradient or on knowledge of the optimal value of the objective function, need corrections when applied to optimization problems with constraints. Their proper modifications allow a significant acceleration of these schemes when the objective function has favorable properties (smoothness, strong convexity). We show how the new methods can be used for solving optimization problems with functional constraints, with a possibility to approximate the optimal Lagrange multipliers. One of our primal-dual methods also works for an unbounded feasible set.
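The classical rule in question can be sketched as a projected subgradient method with predefined steps R/√(k+1) along the normalized subgradient, where R bounds the distance to the minimizer. The code below is that textbook scheme with an illustrative ball constraint and ℓ1 objective; it is the starting point the paper corrects, not one of the modified methods it proposes.

```python
import numpy as np

def projected_subgradient(f, subgrad, project, x0, R, iters=2000):
    """Textbook projected subgradient method with the predefined step sizes
    h_k = R / sqrt(k + 1) along the normalized subgradient.  This is the
    classical rule discussed in the paper, not its corrected variants."""
    x = np.asarray(x0, dtype=float)
    best = x.copy()
    for k in range(iters):
        g = subgrad(x)
        norm_g = np.linalg.norm(g)
        if norm_g == 0.0:                  # x is already a minimizer
            return x
        x = project(x - (R / np.sqrt(k + 1.0)) * g / norm_g)
        if f(x) < f(best):                 # track the best iterate (f may be nonsmooth)
            best = x.copy()
    return best

# Usage: minimize f(x) = ||A x - b||_1 over the Euclidean ball of radius R.
rng = np.random.default_rng(2)
A, b = rng.standard_normal((30, 10)), rng.standard_normal(30)
f = lambda x: np.linalg.norm(A @ x - b, 1)
subgrad = lambda x: A.T @ np.sign(A @ x - b)    # a subgradient of f at x
R = 5.0
project = lambda x: x if np.linalg.norm(x) <= R else R * x / np.linalg.norm(x)
x_best = projected_subgradient(f, subgrad, project, np.zeros(10), R)
```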



Fig. 1: Subfigure (a) shows that for $\bar{x} < \sqrt[3]{1-\beta}$, only the exact solution of the auxiliary problem satisfies (2.5); Subfigure (b) illustrates the set of solutions satisfying $\bar{x} \ge \sqrt[3]{1+\beta}$ for $\bar{x} = 1.4$ and $\beta = 0.85$.
Fig. 2: Subfigure (a) shows the set of points y satisfying the inequality $|\Omega_{x,3,M}(y)| \le \frac{\gamma}{1+\gamma}|\Omega_{x,3}(y)|$ with $x = 0.8$, $\gamma = 8/19$, and $\beta = 0.9$; Subfigure (b) illustrates the set of acceptable solutions for $x = 0.8$ and $\beta = 0.9$.
High-order methods beyond the classical complexity bounds: inexact high-order proximal-point methods

January 2024 · 139 Reads · 16 Citations

Mathematical Programming

We introduce a Bi-level OPTimization (BiOPT) framework for minimizing the sum of two convex functions, where one of them is smooth enough. The BiOPT framework offers three degrees of freedom: (i) choosing the order p of the proximal term; (ii) designing an inexact pth-order proximal-point method at the upper level; (iii) solving the auxiliary problem with a lower-level non-Euclidean method. We here regularize the objective by a (p+1)th-order proximal term (for an arbitrary integer $p\ge 1$) and then develop the generic inexact high-order proximal-point scheme and its acceleration using the standard estimating-sequence technique at the upper level. At the lower level, the corresponding pth-order proximal auxiliary problem is solved inexactly, either by one iteration of the pth-order tensor method or by a lower-order non-Euclidean composite gradient scheme. Ultimately, it is shown that applying the accelerated inexact pth-order proximal-point method at the upper level and handling the auxiliary problem by the non-Euclidean composite gradient scheme lead to a 2q-order method with the convergence rate $\mathcal{O}(k^{-(p+1)})$ (for $q=\lfloor p/2\rfloor$ and iteration counter k), which can result in a superfast method for some specific classes of problems.
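Schematically, the upper level replaces the usual quadratic proximal-point step by a higher-order one; the display below states this subproblem in illustrative notation (F, λ_k, and the exact placement of constants are not taken from the paper).

```latex
% Schematic upper-level iteration of the BiOPT framework: an inexact
% (p+1)th-order proximal-point step on F = f + \psi (notation illustrative,
% \lambda_k is a step-size parameter):
x_{k+1} \;\approx\; \arg\min_{x}\Bigl\{\, F(x) \;+\; \tfrac{1}{(p+1)\lambda_k}\,\|x-x_k\|^{p+1} \Bigr\},
\qquad p \ge 1.
% The accelerated scheme wraps this step in the estimating-sequence technique,
% and the lower level solves the subproblem inexactly by a pth-order tensor
% step or a non-Euclidean composite gradient scheme.
```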



Convex quartic problems: homogenized gradient method and preconditioning

June 2023 · 34 Reads

We consider a convex minimization problem in which the objective is the sum of a homogeneous polynomial of degree four and a linear term. Such a task arises as a subproblem in algorithms for quadratic inverse problems with a difference-of-convex structure. We design a first-order method called Homogenized Gradient, along with an accelerated version, which enjoy fast convergence rates in relative accuracy of $\mathcal{O}(\kappa^2/K^2)$ and $\mathcal{O}(\kappa^2/K^4)$ respectively, where K is the iteration counter. The constant $\kappa$ is the quartic condition number of the problem. Then, we show that for a certain class of problems, it is possible to compute a preconditioner for which this condition number is $\sqrt{n}$, where n is the problem dimension. To establish this, we study the more general problem of finding the best quadratic approximation of an $\ell_p$ norm composed with a quadratic map. Our construction involves a generalization of the so-called Lewis weights.
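A minimal instance of this problem class, with a plain gradient-descent baseline, is sketched below. It only illustrates the structure "homogeneous quartic plus linear term" on the simple example f(x) = ¼(xᵀBx)² + cᵀx with B ⪰ 0; the baseline is not the Homogenized Gradient method or its accelerated version, and the random data are illustrative.

```python
import numpy as np

def quartic_plus_linear(B, c):
    """f(x) = 0.25 * (x^T B x)^2 + c^T x  with B >= 0: a homogeneous degree-four
    polynomial plus a linear term, the structure considered in the paper."""
    f = lambda x: 0.25 * (x @ B @ x) ** 2 + c @ x
    grad = lambda x: (x @ B @ x) * (B @ x) + c
    return f, grad

def gradient_descent(f, grad, x0, iters=300):
    """Plain gradient descent with Armijo backtracking; only a baseline,
    not the Homogenized Gradient method of the paper."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        g, t = grad(x), 1.0
        while f(x - t * g) > f(x) - 0.5 * t * (g @ g):   # Armijo condition
            t *= 0.5
        x = x - t * g
    return x

rng = np.random.default_rng(3)
M = rng.standard_normal((6, 6))
B = M @ M.T                                              # positive semidefinite
c = rng.standard_normal(6)
f, grad = quartic_plus_linear(B, c)
x_approx = gradient_descent(f, grad, np.zeros(6))
```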


Citations (54)


... To our knowledge, the QSC class has not been previously studied in the context of first-order methods. The only other first-order methods for which one can prove similar bounds are the nonadaptive variants of our scheme, namely the normalized gradient method (NGM) from [16, Section 5] and the recent improvement of this algorithm for constrained problems [17]. ...

Reference:

DADA: Dual Averaging with Distance Adaptation
Primal Subgradient Methods with Predefined Step Sizes

Journal of Optimization Theory and Applications

... In recent years, many efficient modifications of CNM have been developed, including adaptive and universal methods [9,10,15,16,21,22] that do not require knowledge of the actual Lipschitz constant of the Hessian and can automatically adapt to the best problem class among functions with Hölder continuous derivatives. Additionally, accelerated second-order schemes [8,22,26,27,29,30] have been introduced, which offer improved convergence rates for convex functions and match the lower complexity bounds [1,2,30]. ...

Super-Universal Regularized Newton Method
  • Citing Article
  • January 2024

SIAM Journal on Optimization

... Nonnegative matrix factorization (NMF) has been studied in machine learning [8,17,19,30,37] and signal processing [12,23,24], as well as in mathematical optimization [15,20,27]. Given an observed nonnegative matrix X ∈ R^{m×n}_+, NMF finds two nonnegative matrices W ∈ R^{m×r}_+ and H ∈ R^{r×n}_+ such that X = WH or X ≃ WH. The dimension r is often assumed to be smaller than n and m, so W and H are smaller than X. ...

Conic optimization-based algorithms for nonnegative matrix factorization

Optimization Methods and Software

... In particular, when the Hessian is ill-conditioned or singular, Newton's updates can be unstable, leading to divergence or poor progress toward a solution. One approach to mitigate these issues is the regularized Newton method, which improves stability by adding a regularization term to the Hessian to ensure well-posed updates and better global behavior [28,16]. Another approach, cubic regularization, addresses these instability issues by introducing a cubic term to control the step size adaptively [35]. ...

Gradient regularization of Newton method with Bregman distances

Mathematical Programming

... These global guarantees are essential for optimization algorithms, as the initial point may often be far from the solution's neighbourhood. In the past decade, the study of the global behaviour of second-order methods has become one of the driving forces in the field, including analysis for Self-Concordant functions [17,23,38]. ...

Set-Limited Functions and Polynomial-Time Interior-Point Methods

Journal of Optimization Theory and Applications

... To balance the per-iteration cost against better curvature exploration, we choose a preconditioner from the perspective of implicit scalarization, and the Barzilai-Borwein method is embedded in the preconditioning method. This paves the way for the development of efficient high-order [12] and high-order regularized methods [28] for MOPs. ...

High-Order Optimization Methods for Fully Composite Problems
  • Citing Article
  • September 2022

SIAM Journal on Optimization

... Using any of these methods, we can obtain optimal dual prices conforming to our access assumptions. Interestingly, there have been attempts to combine subgradient methods with ellipsoid methods [RN23], but achieving fast practical convergence in such hybrid schemes remains an open research question. ...

Subgradient ellipsoid method for nonsmooth convex problems

Mathematical Programming

... Still, during the next decade, the development of second-order methods was carried out within this framework. For completeness of the picture, we mention interesting results on different variants of Cubic Regularization [7,8], universal methods [13], accelerated second-order methods based on contractions [9,11], and many others. ...

Affine-invariant contracting-point methods for Convex Optimization

Mathematical Programming

... In this paper, we consider an approach based on the characterization of the market state by a potential function of total expected revenue, similar to the function of total excessive revenue from [13]. In [9], the authors use this approach to analyze the case in which consumers follow a discrete choice model with imperfect behavior, introduced by random noise in the assessment of utility, while suppliers seek to maximize their profit, taking into account quantity adjustment costs. That paper substantiates the convexity of the potential used, as a function of prices. ...

Dynamic pricing under nested logit demand
  • Citing Article
  • December 2021