Article

A Stochastic Gradient Method With Mesh Refinement for PDE-Constrained Optimization Under Uncertainty

... Mathematics Subject Classification: 49J20 · 49J55 · 60F17 · 65C05 · 90C15 · 35R60. 1 Introduction. PDE-constrained optimization under uncertainty is a rapidly growing field with a number of recent contributions in theory [9,31,32,34], numerical and computational methods [17,18,30,59], and applications [7,8,11,49]. Nevertheless, a number of open questions remain unanswered. (Dedication from T.M. Surowiec.) ...
... Broadly speaking, the numerical solution methods available for such problems derive in part from the standard paradigms found in the classical stochastic programming literature: first-order stochastic approximation/stochastic gradient (SG) approaches [18], versus methods that rely on sampling from the underlying probability measure [59]. The latter approaches are not optimization algorithms per se; rather, they use approximations of the expectations, replacing the underlying probability distribution with a discrete measure. ...
... A collection F of measurable functions on Ξ is called P-Donsker if the empirical process (18) converges in distribution to a tight random variable G in the space ℓ∞(F), where the limit process G = {G_f : f ∈ F} is a zero-mean Gaussian process with the covariance function ...
Article
Full-text available
Monte Carlo approximations for random linear elliptic PDE-constrained optimization problems are studied. We use empirical process theory to obtain best possible mean convergence rates O(n^{-1/2}) for optimal values and solutions, and a central limit theorem for optimal values. The latter allows one to determine asymptotically consistent confidence intervals by using resampling techniques. The theoretical results are illustrated with two sets of numerical experiments. The first demonstrates the theoretical convergence rates for optimal values and optimal solutions. This is complemented by a study illustrating the use of subsampling bootstrap methods for estimating the confidence intervals.
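The O(n^{-1/2}) mean convergence rate above can be illustrated on a toy problem. The following is a minimal sketch under simplifying assumptions (a scalar strongly convex integrand rather than the paper's PDE-constrained setting), chosen so the SAA minimizer has a closed form:

```python
import numpy as np

# Toy illustration of the O(n^{-1/2}) SAA rate (a scalar strongly convex
# integrand, not the paper's PDE-constrained problem): for
# min_u E[(u - xi)^2], xi ~ N(0,1), the SAA minimizer is the sample mean
# and the true minimizer is u* = 0.
rng = np.random.default_rng(0)

def saa_rmse(n, reps=2000):
    # Solve the SAA problem `reps` times and report the root-mean-square
    # error of the SAA solutions.
    samples = rng.normal(size=(reps, n))
    u_saa = samples.mean(axis=1)           # closed-form SAA solution
    return np.sqrt(np.mean(u_saa ** 2))    # distance to u* = 0

for n in (100, 400, 1600):
    print(n, saa_rmse(n))
# Quadrupling n roughly halves the RMSE, matching the O(n^{-1/2}) rate.
```

Quadrupling the sample size halves the observed error, which is exactly the behavior the central limit theorem quantifies for optimal values.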
... Therefore, the topic has received a lot of attention in the last years, see, e.g. [1][2][3][4][5][6][7][8][9]. ...
... Indeed, performing a Schur complement on u_i, the system ... x_{n_1} = Collective_Smoothing(x_0, S, n_1) (n_1 steps of collective smoothing) ...
... Figure 2 shows a sequence of optimal controls for different values of β with and without box-constraints. The optimal control for β = 0 and without box-constraints corresponds to the minimizer of the linear-quadratic OCP (5). We observe that L 1 penalization indeed induces sparsity, since the optimal controls are more and more localized as β increases. ...
Article
Full-text available
In this manuscript, we present a collective multigrid algorithm to solve efficiently the large saddle-point systems of equations that typically arise in PDE-constrained optimization under uncertainty, and develop a novel convergence analysis of collective smoothers and collective two-level methods. The multigrid algorithm is based on a collective smoother that at each iteration sweeps over the nodes of the computational mesh and solves a reduced saddle-point system whose size is proportional to the number N of samples used to discretize the probability space. We show that this reduced system can be solved with optimal O(N) complexity. The multigrid method is tested both as a stationary method and as a preconditioner for GMRES on three problems: a linear-quadratic problem, possibly with a local or a boundary control, for which the multigrid method is used to solve the linear optimality system directly; a nonsmooth problem with box constraints and L^1-norm penalization on the control, in which the multigrid scheme is used as an inner solver within a semismooth Newton iteration; and a risk-averse problem with the smoothed CVaR risk measure, where the multigrid method is called within a preconditioned Newton iteration. In all cases, the multigrid algorithm exhibits excellent performance and robustness with respect to the parameters of interest.
... This approach has also been pursued recently by other authors. In [14], the authors present a stochastic optimization strategy for PDEconstrained optimization. In the context of aerodynamic shape optimization, stochastic gradient descent has been applied for the robust design optimization of the NACA-0012 airfoil in [15], which considers uncertainties in operating conditions and model uncertainty. ...
... where the feasible set denotes the probability simplex. The inequality in the above formula is equivalent to (14), which is closely related to likelihood robust optimization [20]; that approach assumes the random variables take values in a finite set that is known in advance. ...
... Assuming that independent samples were observed, with recorded occurrences of each value, (14) rewrites as ...
Article
Full-text available
We formulate and solve data-driven aerodynamic shape design problems with distributionally robust optimization (DRO) approaches. DRO aims to minimize the worst-case expected performance in a set of distributions that is informed by observed data with uncertainties. Building on the findings of the work of Gotoh et al. (2018), we study the connections between a class of DRO and robust design optimization, which is classically based on the mean-variance (standard deviation) optimization formulation pioneered by Taguchi. Our results provide a new perspective on the understanding and formulation of robust design optimization problems. It enables data-driven and statistically principled approaches to quantify the trade-offs between robustness and performance, in contrast to the classical robust design formulation that captures uncertainty only qualitatively. Our preliminary computational experiments on aerodynamic shape optimization in transonic turbulent flow show promising design results.
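The DRO-to-mean-standard-deviation connection studied in that line of work can be checked numerically in a minimal sketch. The sampled data, divergence choice, and radius below are illustrative assumptions, not the paper's aerodynamic setup: maximizing the reweighted empirical mean over a chi-squared-divergence ball of radius delta yields exactly mean + sqrt(2*delta)*std whenever the worst-case weights stay nonnegative.

```python
import numpy as np

# Numerical check of the DRO / mean-standard-deviation connection:
# the worst case of sum_i q_i x_i over weights q in the simplex with
# (n/2) * sum_i (q_i - 1/n)^2 <= delta has, by Lagrangian stationarity,
# the closed form mean(x) + sqrt(2*delta) * std(x) while q stays >= 0.
# The data below are synthetic "performance" samples (an assumption).
rng = np.random.default_rng(5)
x = rng.normal(loc=1.0, scale=0.3, size=1_000)
n, delta = x.size, 0.005

t = np.sqrt(2 * delta / (n**2 * np.var(x)))   # Lagrange multiplier scale
q = 1.0 / n + t * (x - x.mean())              # worst-case reweighting
worst_case = q @ x

print(worst_case, x.mean() + np.sqrt(2 * delta) * np.std(x))
# The two printed values agree, exhibiting the mean + lambda*std structure.
```

This makes concrete how the DRO radius delta plays the role of the standard-deviation weight in the classical robust design formulation.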
... with the equality due to (29b). Recall the definition of m_τ from (14). Introducing ...
... For simulations, problem (1) is replaced by the penalised problem (19); i.e., the inequality constraint is penalised as in (15) using the smoothed max function defined in (14) with τ → τ^{1.1}. We replace (19) by its sample average approximation (SAA) with the finite set Ξ = {ξ_1, . . . ...
... , n}), although other choices are possible. While the convergence of Algorithm 1 has yet to be proven in the function space setting, the convergence of the stochastic gradient method (without variance reduction) has been established; see [13,14] for convergence results when the method is applied to nonconvex PDE-constrained optimisation problems. ...
Preprint
We consider a risk-averse optimal control problem governed by an elliptic variational inequality (VI) subject to random inputs. By deriving KKT-type optimality conditions for a penalised and smoothed problem and studying convergence of the stationary points with respect to the penalisation parameter, we obtain two forms of stationarity conditions. The lack of regularity with respect to the uncertain parameters and complexities induced by the presence of the risk measure give rise to new challenges unique to the stochastic setting. We also propose a path-following stochastic approximation algorithm using variance reduction techniques and demonstrate the algorithm on a modified benchmark problem.
... A solution method for (1) is (robust) stochastic approximation. It has been thoroughly analyzed in [38,44] for finite-dimensional and in [22,24,45] for infinite-dimensional optimization problems. For reliable ε-optimal solutions, the sample size estimates established in [44, Proposition 2.2] do not explicitly depend on the problem's dimension. ...
... Defining K(ξ) = −QA(ξ)^{-1}B(ξ) and h(ξ) = QA(ξ)^{-1}g(ξ) − y_d, the control problem (24) can be written as ... We discuss differentiability and the lack of strong convexity of the expectation function ...
... Many instances of the linear-quadratic control problem (24) frequently encountered in the literature are defined by the following data: ...
Article
Full-text available
We analyze the tail behavior of solutions to sample average approximations (SAAs) of stochastic programs posed in Hilbert spaces. We require that the integrand be strongly convex with the same convexity parameter for each realization. Combined with a standard condition from the literature on stochastic programming, we establish non-asymptotic exponential tail bounds for the distance between the SAA solutions and the stochastic program’s solution, without assuming compactness of the feasible set. Our assumptions are verified on a class of infinite-dimensional optimization problems governed by affine-linear partial differential equations with random inputs. We present numerical results illustrating our theoretical findings.
... A novel stochastic gradient method was formulated over infinite-dimensional shape spaces and convergence of the method was proven. The work was informed by its demonstrated success in the context of PDEconstrained optimization under uncertainty [23,26,46,25,24]. ...
... Remark 5 There are several consequences of Theorem 1. The first is computational: particularly for large-scale problems with many shapes, a decomposition approach can be used by solving (25) for an arbitrary admissible partition instead of the more expensive problem (24). Second, for smaller-scale problems, the theorem justifies solving (24) "all-at-once" to obtain descent directions with respect to each shape. ...
... Again for optimal control problems with PDEs, but for a larger class of problems, including nonsmooth and convex problems, the authors [25] propose a different approximation scheme without needing to take additional samples (meaning m k ≡ 1 is permissible). The proposed method uses averaging of the iterate u k instead of the stochastic gradient. ...
Preprint
Full-text available
Shape optimization models with one or more shapes are considered in this chapter. Of particular interest for applications are problems in which a so-called shape functional is constrained by a partial differential equation (PDE) describing the underlying physics. A connection can be made between a classical view of shape optimization and the differential-geometric structure of shape spaces. To handle problems where a shape functional depends on multiple shapes, a theoretical framework is presented, whereby the optimization variable can be represented as a vector of shapes belonging to a product shape space. The multi-shape gradient and multi-shape derivative are defined, which allows for a rigorous justification of a steepest descent method with Armijo backtracking. As long as the shapes, as subsets of a hold-all domain, do not intersect, solving a single deformation equation is enough to provide descent directions with respect to each shape. Additionally, a framework for handling uncertainties arising from inputs or parameters in the PDE is presented. To handle potentially high-dimensional stochastic spaces, a stochastic gradient method is proposed. A model problem is constructed, demonstrating how uncertainty can be introduced into the problem and how the objective can be transformed by use of the expectation. Finally, numerical experiments in the deterministic and stochastic cases are devised, which demonstrate the effectiveness of the presented algorithms.
... PDE-constrained optimization under uncertainty is a rapidly growing field with a number of recent contributions in theory [6,24,25,27], numerical and computational methods [16,17,23,46], and applications [4,5,8,39]. Nevertheless, a number of open questions remain unanswered, even for the ideal setting including a strongly convex objective function, a closed, bounded and convex feasible set, and a linear elliptic PDE with random inputs. ...
... Broadly speaking, the numerical solution methods available for such problems derive in part from the standard paradigms found in the classical stochastic programming literature: first-order stochastic approximation/stochastic gradient (SG) approaches [17], versus methods that rely on sampling from the underlying probability measure [46]. The latter approaches are not optimization algorithms per se; rather, they use approximations of the expectations, replacing the underlying probability distribution with a discrete measure. ...
... A collection F of measurable functions on Ξ is called P-Donsker if the empirical process (17) converges in distribution to a tight random variable G in the space ℓ∞(F), where the limit process G = {G_f : f ∈ F} is a zero-mean Gaussian process with the covariance function ...
Preprint
Full-text available
Monte Carlo approximations for random linear elliptic PDE-constrained optimization problems are studied. We use empirical process theory to obtain best possible mean convergence rates O(1/\sqrt{n}) for optimal values and solutions, and a central limit theorem for optimal values. The latter allows one to determine asymptotically consistent confidence intervals by using resampling techniques.
... to additive bias, which is not covered by existing results. This theory can be used to develop mesh refinement strategies in applications with PDEs [20]. ...
... This is done to project the stochastic gradient onto the L^2(D) space as in [20]. Hence, the last line of Algorithm 2 is given by the expression u_{n+1} := prox_{t_n h}(u_n − t_n P_h G(u_n, ξ_n)). ...
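The proximal update in that snippet can be sketched in a finite-dimensional surrogate. This is a minimal illustration assuming a quadratic tracking term and h = beta*||.||_1, whose prox is soft-thresholding; the PDE solve and the projection P_h of the actual algorithm are omitted.

```python
import numpy as np

# Finite-dimensional surrogate of the stochastic proximal gradient update
# u_{n+1} = prox_{t_n h}(u_n - t_n g_n), with h = beta*||.||_1 (prox =
# soft-thresholding) and g_n a noisy gradient of the stand-in objective
# 0.5*||u - u_d||^2. The target u_d is hypothetical.
rng = np.random.default_rng(1)
beta = 0.5
u_d = np.array([2.0, -0.2, 1.0, 0.1])

def soft_threshold(v, tau):
    # Proximal operator of tau*||.||_1.
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

u = np.zeros(4)
for n in range(1, 5001):
    t = 2.0 / n                                # Robbins-Monro step sizes
    g = (u - u_d) + 0.1 * rng.normal(size=4)   # unbiased stochastic gradient
    u = soft_threshold(u - t * g, t * beta)    # proximal step
print(u)
# The L1 prox drives the small components of u_d to (near) zero,
# illustrating the sparsity induced by the L1 penalty.
```

The limit is the soft-thresholded target (1.5, 0, 0.5, 0), so the small components of u_d are suppressed exactly as in the sparsity discussions above.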
... Additionally, there is no need to determine the Lipschitz constant for the gradient, which in the application depends on (among other things) the Poincaré constant and the lower bound on the random fields, and would thus lead to a prohibitively small constant step size. This phenomenon has been demonstrated in [20]. ...
Article
Full-text available
For finite-dimensional problems, stochastic approximation methods have long been used to solve stochastic optimization problems. Their application to infinite-dimensional problems is less understood, particularly for nonconvex objectives. This paper presents convergence results for the stochastic proximal gradient method applied to Hilbert spaces, motivated by optimization problems with partial differential equation (PDE) constraints with random inputs and coefficients. We study stochastic algorithms for nonconvex and nonsmooth problems, where the nonsmooth part is convex and the nonconvex part is the expectation, which is assumed to have a Lipschitz continuous gradient. The optimization variable is an element of a Hilbert space. We show almost sure convergence of strong limit points of the random sequence generated by the algorithm to stationary points. We demonstrate the stochastic proximal gradient algorithm on a tracking-type functional with an L^1-penalty term constrained by a semilinear PDE and box constraints, where input terms and coefficients are subject to uncertainty. We verify conditions for ensuring convergence of the algorithm and show a simulation.
... It is critical to incorporate this uncertainty in the optimization problem to make the optimal solution more reliable and robust. Optimization under uncertainty has become an important research area and has received increasing attention in recent years [74,10,45,41,70,76,49,26,78,19,52,63,20,51,48,8,3,5,84,85,42,69,53,50,55,33,59,47,79,80,81,27,18,86,82,35,54,61,6,38,39,31,37,36]. To account for the uncertainty in the optimization problem, different statistical measures of the objective function have been studied, e.g., mean, variance, conditional value-at-risk, worst-case scenario, etc. [70,48,85,3,53,50,35]. ...
... , N_f, which are part of the approximation of E[β(f)], all implicitly (possibly also explicitly) depend on the optimization variable z through the state equation (25), the adjoint equations (29) and (36), the generalized eigenvalue problems (19) and (22) with orthonormality conditions (20) and (23), the incremental state and adjoint equations (32) and (33) for the incremental state û_q = û_{q,n} and adjoint v̂_q = v̂_{q,n} at m_q = ψ_{q,n}, needed by the Hessian action ∇²_{m_q} ψ_{q,n} in (19) through (34), n = 1, . . . , N_q, as well as the incremental state and adjoint equations (38) and (39) for the incremental state û_f = û_{f,n} and adjoint v̂_f = v̂_{f,n} at m_f = ψ_{f,n}, needed by the Hessian action ∇²_{m_f} ψ_{f,n} in (22) through (40), n = 1, . . . , N_f. ...
... which, together with (60) and (38) with (û_f, m_f) = (û_{f,n}, ψ_{f,n}), leads to ...
Preprint
Full-text available
We propose a fast and scalable optimization method to solve chance or probabilistic constrained optimization problems governed by partial differential equations (PDEs) with high-dimensional random parameters. To address the critical computational challenges of expensive PDE solution and high-dimensional uncertainty, we construct surrogates of the constraint function by Taylor approximation, which relies on efficient computation of the derivatives, low rank approximation of the Hessian, and a randomized algorithm for eigenvalue decomposition. To tackle the difficulty of the non-differentiability of the inequality chance constraint, we use a smooth approximation of the discontinuous indicator function involved in the chance constraint, and apply a penalty method to transform the inequality constrained optimization problem to an unconstrained one. Moreover, we design a gradient-based optimization scheme that gradually increases smoothing and penalty parameters to achieve convergence, for which we present an efficient computation of the gradient of the approximate cost functional by the Taylor approximation. Based on numerical experiments for a problem in optimal groundwater management, we demonstrate the accuracy of the Taylor approximation, its ability to greatly accelerate constraint evaluations, the convergence of the continuation optimization scheme, and the scalability of the proposed method in terms of the number of PDE solves with increasing random parameter dimension from one thousand to hundreds of thousands.
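The smoothing step described in this abstract, replacing the discontinuous indicator inside the chance constraint by a differentiable surrogate, can be sketched in isolation. The logistic surrogate and the synthetic samples below are illustrative assumptions; the paper's specific smooth approximation and its groundwater model may differ.

```python
import numpy as np

# Sketch of the smoothing idea behind chance constraints: the probability
# P(f > 0) = E[ 1[f > 0] ] is not differentiable through the indicator,
# so the indicator is replaced by a smooth surrogate. Here we use a
# logistic function written in an overflow-safe tanh form (an assumption,
# not necessarily the paper's choice).
def smooth_indicator(f, eps):
    # 0.5*(1 + tanh(f/(2*eps))) equals the logistic sigmoid of f/eps
    # and tends to 1[f > 0] as eps -> 0.
    return 0.5 * (1.0 + np.tanh(f / (2.0 * eps)))

rng = np.random.default_rng(2)
f = rng.normal(loc=-1.0, scale=1.0, size=200_000)  # synthetic f(m, xi)

exact = np.mean(f > 0)                  # Monte Carlo chance estimate
for eps in (0.5, 0.1, 0.02):
    print(eps, np.mean(smooth_indicator(f, eps)))
# The smoothed estimate approaches the exact probability as eps shrinks,
# while remaining differentiable in the decision variable.
```

Decreasing eps along the optimization, together with an increasing penalty parameter, is what the continuation scheme in the abstract gradually does.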
... These ideas are used in, e.g., [21]. Stochastic gradient methods have been thoroughly investigated for elliptic problems in the work of Martin and Nobile [28] and others [14]. Finally, MG/OPT techniques [24,25] have been used in this context. ...
... This is observed in experiments in [35]. From Assumption 5, it follows that a single constant implicit in expression (14) for the cost in Theorem 1 exists for all gradient calculations in a neighborhood of the optimal point. The finest MG/OPT level K is fixed when Algorithm 2 starts. ...
... The first model problem consists of the Laplace equation where the right hand side (the source term) can be controlled at any point in the domain. This is the ubiquitous model problem analyzed in many papers, e.g., [1,10,14,18,29,28,32,35]. In the second problem, the flux at the boundary of the Laplace PDE is to be controlled by the Dirichlet condition at that boundary. ...
Preprint
An algorithm is proposed to solve robust control problems constrained by partial differential equations with uncertain coefficients, based on the so-called MG/OPT framework. The levels in this MG/OPT hierarchy correspond to discretization levels of the PDE, as usual. For stochastic problems, the relevant quantities (such as the gradient) contain expected value operators on each of these levels. They are estimated using a multilevel Monte Carlo method, the specifics of which depend on the MG/OPT level. Each of the optimization levels then contains multiple underlying multilevel Monte Carlo levels. The MG/OPT hierarchy allows the algorithm to exploit the structure inherent in the PDE, speeding up the convergence to the optimum. In contrast, the multilevel Monte Carlo hierarchy exists to exploit structure present in the stochastic dimensions of the problem. A statement about the rate of convergence of the algorithm is proven, and some additional properties are discussed. The performance of the algorithm is numerically investigated for three test cases. A reduction in the number of samples required on expensive levels and therefore in computational time can be observed.
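The telescoping structure that the multilevel Monte Carlo estimator in this abstract relies on can be sketched with a synthetic quantity of interest. The level function and sample counts below are illustrative assumptions standing in for actual PDE solves on a discretization hierarchy.

```python
import numpy as np

# Toy multilevel Monte Carlo estimator. P(xi, l) stands in for a quantity
# of interest computed on discretization level l; here it is a synthetic
# model with an O(2^{-l}) discretization term, not an actual PDE solve.
rng = np.random.default_rng(3)

def P(xi, l):
    return np.sin(xi) + 2.0 ** (-l) * np.cos(xi)   # level-l approximation

def mlmc(L, n0=100_000):
    # Telescoping sum E[P_L] = E[P_0] + sum_l E[P_l - P_{l-1}].
    # The correction terms shrink with l, so coarse (cheap) levels get
    # many samples and fine (expensive) levels get few.
    est = np.mean(P(rng.normal(size=n0), 0))
    for l in range(1, L + 1):
        xi = rng.normal(size=n0 // 2 ** l)          # fewer samples per level
        est += np.mean(P(xi, l) - P(xi, l - 1))     # coupled difference
    return est

print(mlmc(L=5))
# For xi ~ N(0,1) the exact value is E[sin(xi)] + 2^{-5} E[cos(xi)]
# = 2^{-5} * exp(-1/2).
```

The key point, mirrored in the abstract, is that the expensive fine levels need only a small number of samples because they estimate small corrections, which is where the reduction in computational time comes from.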
... Definition 2.10. Let x ∈ H and X be a subspace of H. ...
... In [10], the step sizes for an unbounded admissible set U ad can be found. ...
... 10: Optimal solution of (6.1) computed by the projected gradient method combined with the Armijo line search, and the suboptimal solution of (6.1) computed with the RB method and the projected stochastic gradient method. ...
... This is done to project the stochastic gradient onto the L^2(D) space as in [19]. Hence, the last line of Algorithm 3 is given by the expression u_{n+1} := prox_{t_n h}(u_n − t_n P_h G(u_n, ξ_n)). ...
... Additionally, there is no need to determine the Lipschitz constant for the gradient, which in the application depends on (among other things) the Poincaré constant and the lower bound on the random fields, and would thus lead to a prohibitively small constant step size. This phenomenon has been demonstrated in [19]. ...
... In lieu of efficiency estimates, it would be desirable to have better termination conditions that do not rely on increased sampling, as our heuristic did in the numerical experiments. Finally, it would be natural to investigate mesh refinement strategies as in [19]. For more involved choices of nonsmooth terms, the prox computation is also subject to numerical error and should be treated accordingly. ...
Preprint
Full-text available
For finite-dimensional problems, stochastic approximation methods have long been used to solve stochastic optimization problems. Their application to infinite-dimensional problems is less understood, particularly for nonconvex objectives. This paper presents convergence results for the stochastic proximal gradient method applied to Hilbert spaces, motivated by optimization problems with partial differential equation (PDE) constraints with random inputs and coefficients. We study stochastic algorithms for nonconvex and nonsmooth problems, where the nonsmooth part is convex and the nonconvex part is the expectation, which is assumed to have a Lipschitz continuous gradient. The optimization variable is an element of a Hilbert space. We show almost sure convergence of strong limit points of the random sequence generated by the algorithm to stationary points. We demonstrate the stochastic proximal gradient algorithm on a tracking-type functional with an L^1-penalty term constrained by a semilinear PDE and box constraints, where input terms and coefficients are subject to uncertainty. We verify conditions for ensuring convergence of the algorithm and show a simulation.
... This classical method dating back to Robbins and Monro [41] involves randomly sampling the otherwise intractable gradient and has been well-investigated in the literature. For the function space setting without this discrepancy, the stochastic gradient method and its variants have already been analyzed; see [5,14,33,45] and more recent contributions in the context of PDE-constrained optimization under uncertainty [28,29,30,31,38]. The setting we present in subsection 5.1 is adapted from [23], where convergence of a deterministic Ritz-Galerkin-type method was proven. ...
... Theorem 5.5 provides the argument for almost sure convergence of the projected stochastic gradient method. It is natural to ask whether convergence rates (in the mean square) can be derived as in [31]. Interestingly, because of the two-norm discrepancy, one cannot obtain the expected convergence rates for strongly convex problems. ...
Preprint
The present article is dedicated to proving convergence of the stochastic gradient method in case of random shape optimization problems. To that end, we consider Bernoulli's exterior free boundary problem with a random interior boundary. We recast this problem into a shape optimization problem by means of the minimization of the expected Dirichlet energy. By restricting ourselves to the class of convex, sufficiently smooth domains of bounded curvature, the shape optimization problem becomes strongly convex with respect to an appropriate norm. Since this norm is weaker than the differentiability norm, we are confronted with the so-called two-norm discrepancy, a well-known phenomenon from optimal control. We therefore need to adapt the convergence theory of the stochastic gradient method to this specific setting correspondingly. The theoretical findings are supported and validated by numerical experiments.
... Numerical approximation of optimal control problems governed by SPDEs has become a topic of interest only in recent years. Some efficient numerical methods have been discussed, such as [3,13,26,29,30,33,35,36,43,48,51,55]. Among them, Alexanderian et al. [3] dealt with the optimal control of systems governed by PDEs with uncertain parameter fields by using quadratic approximations; Geiersbach et al. [26] and Martin et al. [43] presented stochastic gradient methods for optimal control with random parameters; Hou et al. [30] applied the finite element method to approximate the stochastic optimal control problem; Ulbrich et al. [33] gave an approximate robust formulation that employed quadratic models of the involved functions for design problems with uncertain parameters; Kouri et al. [35,36] considered trust-region algorithms for PDE-constrained optimization under uncertainty; Rosseel et al. [48] used stochastic finite elements for the optimal control problem constrained by a PDE with uncertain controls. However, in many physical systems, small changes in uncertain parameters can lead to large jumps in the state variables. ...
... We therefore want to solve a stochastic optimization problem set in an infinite-dimensional space or manifold. Only a few theoretical and numerical results are at hand concerning convergence and its rate in that context (see [16,17,11,22]). The main challenge then is the computational cost. ...
... where v and w_i are defined by (9) and (17). Each ϕ_i is continuous by Proposition 2.4, and hence each ϕ_i^2 is also continuous. ...
Article
Full-text available
We study the inverse problem of reconstructing an obstacle in an elastic medium from boundary measurements. We assume that the data are noisy and that a statistical model for the data is available. We propose and study a reconstruction algorithm based on a weighted combination of the first two moments of the Kohn-Vogelius criterion. By numerical results in dimension two, we show that our approach is effective.
... The deterministic solvers employed in these (and related) papers are typically inexact Newton approaches, which allow for massive parallelization for the gradient and Hessian-vector products and avoid expensive matrix computations. Recently, stochastic approximation methods in the spirit of [29] have been adapted to stochastic PDE-constrained optimization problems in [12,13] and [28]. Variance reduction ideas, originating from machine learning, have witnessed applications in this field [27] as well. ...
... Assuming that we employ a conforming spatial discretization, that there is sufficient stability near the fully continuous solution, and that the numerical bias can be controlled as a function of N, the convergence statements should carry over to discretization refinements of the fully discrete problem. A deeper analysis of this, as in [13,27], will be the subject of future research. ...
Article
Full-text available
We consider a class of convex risk-neutral PDE-constrained optimization problems subject to pointwise control and state constraints. Due to the many challenges associated with almost sure constraints on pointwise evaluations of the state, we suggest a relaxation via a smooth functional bound with similar properties to well-known probability constraints. First, we introduce and analyze the relaxed problem, discuss its asymptotic properties, and derive formulae for the gradient using the adjoint calculus. We then build on the theoretical results by extending a recently published online convex optimization algorithm (OSA) to the infinite-dimensional setting. Similar to the regret-based analysis of time-varying stochastic optimization problems, we enhance the method further by allowing for periodic restarts at pre-defined epochs. Not only does this allow for larger step sizes, it also proves to be an essential factor in obtaining high-quality solutions in practice. The behavior of the algorithm is demonstrated in a numerical example involving a linear advection-diffusion equation with random inputs. In order to judge the quality of the solution, the results are compared to those arising from a sample average approximation (SAA). This is done first by comparing the resulting cumulative distributions of the objectives at the optimal solution as a function of step numbers and epoch lengths. In addition, we conduct statistical tests to further analyze the behavior of the online algorithm and the quality of its solutions. For a sufficiently large number of steps, the solutions from OSA and SAA lead to random integrands for the objective and penalty functions that appear to be drawn from similar distributions.
... In the present Hilbert space setting this is in some sense even expected to be the rule rather than the exception, since most operators are derived from complicated dynamical systems or the optimization method is applied to discretized formulations of the original problem. See the recent work [34,35] for an interesting illustration in the context of PDE-constrained optimization. Some of our results go beyond the standard unbiasedness assumption. ...
... In fact, this setting has been investigated in detail in the context of stochastic Nash games [72]. Further examples for stochastic approximation schemes in a Hilbert-space setting obeying Assumption 6 are [73,74] and [35]. We now discuss an example that further clarifies the requirements on the estimator. ...
Article
Full-text available
We consider monotone inclusions defined on a Hilbert space where the operator is given by the sum of a maximal monotone operator T and a single-valued monotone, Lipschitz continuous, and expectation-valued operator V. We draw motivation from the seminal work by Attouch and Cabot (Attouch in AMO 80:547–598, 2019, Attouch in MP 184: 243–287) on relaxed inertial methods for monotone inclusions and present a stochastic extension of the relaxed inertial forward–backward-forward method. Facilitated by an online variance reduction strategy via a mini-batch approach, we show that our method produces a sequence that weakly converges to the solution set. Moreover, it is possible to estimate the rate at which the discrete velocity of the stochastic process vanishes. Under strong monotonicity, we demonstrate strong convergence, and give a detailed assessment of the iteration and oracle complexity of the scheme. When the mini-batch is raised at a geometric (polynomial) rate, the rate statement can be strengthened to a linear (suitable polynomial) rate while the oracle complexity of computing an ϵ-solution improves to O(1/ϵ). Importantly, the latter claim allows for possibly biased oracles, a key theoretical advancement allowing for far broader applicability. By defining a restricted gap function based on the Fitzpatrick function, we prove that the expected gap of an averaged sequence diminishes at a sublinear rate of O(1/k) while the oracle complexity of computing a suitably defined ϵ-solution is O(1/ϵ1+a) where a>1. Numerical results on two-stage games and an overlapping group Lasso problem illustrate the advantages of our method compared to competitors.
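The role of the mini-batch in the online variance reduction strategy of this abstract can be seen in a minimal sketch. The synthetic gradient noise below is an assumption; nothing here is specific to the forward-backward-forward scheme. Averaging a batch of m unbiased oracle samples cuts the estimator's variance by a factor of m.

```python
import numpy as np

# Sketch of mini-batch variance reduction: the variance of the average of
# m i.i.d. unbiased oracle samples is Var/m. This 1/m decay is what lets
# geometrically growing batches buy stronger convergence rates at the
# price of more oracle calls per iteration.
rng = np.random.default_rng(4)

def batch_gradient_variance(m, reps=20_000):
    # Noisy gradient = true gradient + N(0,1) noise; estimate the variance
    # of the mini-batch average over many replications.
    noise = rng.normal(size=(reps, m))
    return np.var(noise.mean(axis=1))

for m in (1, 10, 100):
    print(m, batch_gradient_variance(m))
# The estimated variance drops from ~1 to ~0.1 to ~0.01 as m grows.
```

Raising m geometrically along the iterations therefore drives the oracle error to zero fast enough for the linear-rate statements quoted above.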
... The first one, called the Stochastic Approximation (SA) method [52, Chapter 5.9], includes iterative methods that at each iteration draw new realizations independent from the previous ones. Examples of such approaches are the stochastic gradient method and its variants, which have been recently studied for OCPUU in [35,36,15,2]. In this manuscript, we adopt a second approach, called the Sample Average Approximation (SAA) method [52, Chapter 5.1], in which the original objective functional is replaced by an accurate approximation obtained by discretizing the probability space once and for all, using Stochastic Collocation Methods (SCMs) with Monte Carlo, Quasi-Monte Carlo [19], or Gaussian quadrature formulae. ...
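The distinction between the two paradigms can be made concrete on a toy problem. The following sketch (our own illustration, not code from any of the cited works) minimizes J(u) = E[½(u − ξ)²] with ξ ~ N(1, 1), whose exact minimizer is u* = E[ξ] = 1: once with an SA-type stochastic gradient iteration drawing a fresh sample per step, and once with an SAA objective built from a sample drawn once and for all.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy risk-neutral objective: J(u) = E[0.5*(u - xi)^2], xi ~ N(1, 1).
# Exact minimizer: u* = E[xi] = 1.  (Illustrative stand-in for a
# PDE-constrained objective; the setup is ours, not from the papers.)

def grad_sample(u, xi):
    return u - xi  # stochastic gradient of 0.5*(u - xi)^2

# --- Stochastic Approximation (SA): fresh sample each iteration ---
u_sa = 0.0
for n in range(1, 2001):
    xi = rng.normal(1.0, 1.0)
    u_sa -= (1.0 / n) * grad_sample(u_sa, xi)   # Robbins-Monro step 1/n

# --- Sample Average Approximation (SAA): fix the sample, then optimize ---
xis = rng.normal(1.0, 1.0, size=2000)           # drawn once and for all
u_saa = xis.mean()                               # exact minimizer of the SAA objective

print(u_sa, u_saa)  # both close to 1.0
```

For this quadratic objective the SAA problem can be solved exactly; in the PDE-constrained setting one would instead run a deterministic solver (e.g. conjugate gradients) on the fixed-sample objective.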
... To get a lower bound on σ(S_{LR}^{-1} S), we rely on the following theorem. Theorem 5.9 (Theorem 1, [40]). Let K and K be generic invertible matrices satisfying ...
Preprint
The discretization of robust quadratic optimal control problems under uncertainty using the finite element method and the stochastic collocation method leads to large saddle-point systems, which are fully coupled across the random realizations. Despite its relevance for numerous engineering problems, the solution of such systems is notoriously challenging. In this manuscript, we study efficient preconditioners for all-at-once approaches using both an algebraic and an operator preconditioning framework. We show in particular that for values of the regularization parameter not too small, the saddle-point system can be efficiently solved by preconditioning in parallel all the state and adjoint equations. For small values of the regularization parameter, robustness can be recovered by the additional solution of a small linear system, which however couples all realizations. A mean approximation and a Chebyshev semi-iterative method are investigated to solve this reduced system. Our analysis considers a random elliptic partial differential equation whose diffusion coefficient κ(x,ω) is modeled as an almost surely continuous and positive random field, though not necessarily uniformly bounded and coercive. We further provide estimates on the dependence of the preconditioned system on the variance of the random field. Such estimates involve either the first or second moment of the random variables 1/min_{x∈D̄} κ(x,ω) and max_{x∈D̄} κ(x,ω), where D is the spatial domain. The theoretical results are confirmed by numerical experiments, and implementation details are further addressed.
... Moreover, we believe that the whole construction is more amenable to an adaptive version, which, in combination with an appropriate error estimator, allows for a self-controlling algorithm. We leave this for future work, but mention the related recent work [20] on mesh refinement approaches in the context of a stochastic gradient method for PDE-constrained OCPs subject to uncertainties. ...
... Choosing v = u − u in (20) and subtracting the two optimality conditions, we obtain: ...
Article
Full-text available
We consider the numerical approximation of an optimal control problem for an elliptic Partial Differential Equation (PDE) with random coefficients. Specifically, the control function is a deterministic, distributed forcing term that minimizes the expected squared L² misfit between the state (i.e. the solution to the PDE) and a target function, subject to a regularization for well-posedness. For the numerical treatment of this risk-averse Optimal Control Problem (OCP) we consider a Finite Element discretization of the underlying PDE, a Monte Carlo sampling method, and gradient-type iterations to obtain the approximate optimal control. We provide full error and complexity analyses of the proposed numerical schemes. In particular we investigate the complexity of a conjugate gradient method applied to the fully discretized OCP (the so-called Sample Average Approximation), in which the Finite Element discretization and Monte Carlo sample are chosen in advance and kept fixed over the iterations. This is compared with a Stochastic Gradient method on a fixed or varying Finite Element discretization, in which the expectation in the computation of the steepest descent direction is approximated by Monte Carlo estimators, independent across iterations, with small sample sizes. We show in particular that the second strategy results in an improved computational complexity. The theoretical error estimates and complexity results are confirmed by numerical experiments.
... In §4.2, we derive efficiency estimates for the expected function value taken either for the averaged sequence of iterates or for the last iterate. These efficiency estimates take into account the additive error on the subgradient, using the technique from [16]. To obtain convergence rates for the expected function value of the last iterate, we adapt the concept of modified Fejér monotonicity [22] to the framework of the stochastic APP algorithm. ...
... The efficiency estimate is obtained using a similar technique as in [24] but without requiring the boundedness of U ad . Moreover, we are able to take into account the bias on the gradient with the following assumption, inspired from [16]: ...
Preprint
The stochastic Auxiliary Problem Principle (APP) algorithm is a general Stochastic Approximation (SA) scheme that turns the resolution of an original optimization problem into the iterative resolution of a sequence of auxiliary problems. This framework has been introduced to design decomposition-coordination schemes but also encompasses many well-known SA algorithms such as stochastic gradient descent or stochastic mirror descent. We study the stochastic APP in the case where the iterates lie in a Banach space and we consider an additive error on the computation of the subgradient of the objective. In order to derive convergence results or efficiency estimates for a SA scheme, the iterates must be random variables. This is why we prove the measurability of the iterates of the stochastic APP algorithm. Then, we extend convergence results from the Hilbert space case to the Banach space case. Finally, we derive efficiency estimates for the function values taken at the averaged sequence of iterates or at the last iterate, the latter being obtained by adapting the concept of modified Fejér monotonicity to our framework.
... The performance measures typically involve high-dimensional integrals over the space of uncertain parameters, resulting in computationally challenging problems. Strategies to reduce the computational burden include, for instance, (multilevel) Monte Carlo methods [27,30,39], (multilevel) quasi-Monte Carlo methods [14,18,24], sparse grids [4,20], and variants of the stochastic gradient descent algorithm [11,25]. We point out that quasi-Monte Carlo methods are particularly well-suited, since they retain the convexity structure of the optimal control problem while achieving faster convergence rates as compared to Monte Carlo methods. ...
Preprint
Full-text available
A control in feedback form is derived for linear quadratic, time-invariant optimal control problems subject to parabolic partial differential equations with coefficients depending on a countably infinite number of uncertain parameters. It is shown that the Riccati-based feedback operator depends analytically on the parameters provided that the system operator depends analytically on the parameters, as is the case, for instance, in diffusion problems when the diffusion coefficient is parameterized by a Karhunen--Lo\`eve expansion. These novel parametric regularity results allow the application of quasi-Monte Carlo (QMC) methods to efficiently compute an a-priori chosen feedback law based on the expected value. Moreover, under moderate assumptions on the input random field, QMC methods achieve superior error rates compared to ordinary Monte Carlo methods, independently of the stochastic dimension of the problem. Indeed, our paper for the first time studies Banach-space-valued integration by higher-order QMC methods.
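The rate advantage of quasi-Monte Carlo over plain Monte Carlo on a smooth integrand can be sketched in a few lines. This is our own toy illustration (a hand-rolled Halton sequence standing in for the higher-order QMC rules studied in the paper), not code from the cited work.

```python
import numpy as np

rng = np.random.default_rng(8)

def radical_inverse(i, base):
    """Radical inverse of i in the given base (van der Corput)."""
    f, r = 1.0, 0.0
    while i > 0:
        f /= base
        r += f * (i % base)
        i //= base
    return r

def halton(n, bases=(2, 3)):
    """First n points of a 2-D Halton sequence (coprime prime bases)."""
    return np.array([[radical_inverse(i, b) for b in bases] for i in range(1, n + 1)])

def f(x):                      # smooth test integrand on [0,1]^2, exact integral 1
    return 4.0 * x[:, 0] * x[:, 1]

n = 1024
mc_err = abs(f(rng.uniform(size=(n, 2))).mean() - 1.0)   # pseudo-random points
qmc_err = abs(f(halton(n)).mean() - 1.0)                 # low-discrepancy points
print(mc_err, qmc_err)   # the QMC error is typically much smaller
```

With the same number of points, the low-discrepancy set fills the unit square far more evenly than the random one, which is what drives the faster error decay for smooth integrands.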
... Due to the high dimensionality of the KKT conditions, a common approach is employing a gradient-type method, in which simpler (uncoupled forward and adjoint) problems are solved until an accurate enough solution is found; see, among many examples, [20,23]. For time-dependent initial-value problems, gradient-type methods solve two instationary PDEs, one marching forward in time, the other backward, see, for instance, [23]. ...
Preprint
Full-text available
An automated framework is presented for the numerical solution of optimal control problems with PDEs as constraints, in both the stationary and instationary settings. The associated code can solve both linear and non-linear problems, and examples for incompressible flow equations are considered. The software, which is based on a Python interface to the Firedrake system, allows for a compact definition of the problem considered by providing a few lines of code in a high-level language. The software is provided with efficient iterative linear solvers for optimal control problems with PDEs as constraints. The use of advanced preconditioning techniques results in a significant speed-up of the solution process for large-scale problems. We present numerical examples of the applicability of the software on classical control problems with PDEs as constraints.
... These methods share similarities with batch-versions of stochastic gradient methods. The latter have been analyzed in the context of PDE-constrained optimization under uncertainty in, e.g., [9,8]. ...
Preprint
This manuscript presents a framework for using multilevel quadrature formulae to compute the solution of optimal control problems constrained by random partial differential equations. Our approach consists in solving a sequence of optimal control problems discretized with different levels of accuracy of the physical and probability discretizations. The final approximation of the control is then obtained in a postprocessing step, by suitably combining the adjoint variables computed on the different levels. We present a convergence analysis for an unconstrained linear quadratic problem, and detail our framework for the specific case of a Multilevel Monte Carlo quadrature formula. Numerical experiments confirm the better computational complexity of our MLMC approach compared to a standard Monte Carlo sample average approximation, even beyond the theoretical assumptions.
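The telescoping structure behind multilevel quadrature can be illustrated with a toy stand-in (our own sketch, not the paper's algorithm): here the hypothetical level-l "discretization" carries a bias decaying like 2^{-l}, and the key point is that each correction term uses the same realizations on both of its levels.

```python
import numpy as np

rng = np.random.default_rng(1)

def P(xi, level):
    # Hypothetical level-l approximation of a quantity of interest P(xi) = xi^2:
    # the discretization bias decays like 2^{-level}.
    return xi**2 + 2.0**(-level)

L = 6
N0 = 40000
est = 0.0
for l in range(L + 1):
    N = max(N0 // 4**l, 100)                     # geometrically fewer samples on finer levels
    xi = rng.normal(size=N)
    if l == 0:
        est += P(xi, 0).mean()                   # coarse-level estimator
    else:
        est += (P(xi, l) - P(xi, l - 1)).mean()  # correction: SAME xi on both levels

print(est)   # approx E[xi^2] + 2^{-L} = 1 + 1/64
```

Because the two levels in each correction are strongly correlated, the corrections have small variance and can be estimated with few samples, which is the source of the improved complexity over single-level Monte Carlo.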
... For problems in which the underlying state equation itself includes uncertainties, different techniques have been proposed in the past [8][9][10][11]. Moreover, we restrict our convergence analysis to the finite-dimensional case, and leave a generalization to infinite-dimensional spaces, see, e.g., [12,13], for future work. Especially in large scale settings, one usually does not consider deterministic approaches (see, e.g., [14,15]) for the solution of such problems, as they are generally too computationally expensive or even intractable. ...
Article
Full-text available
In this contribution, we present a full overview of the continuous stochastic gradient (CSG) method, including convergence results, step size rules and algorithmic insights. We consider optimization problems in which the objective function requires some form of integration, e.g., expected values. Since approximating the integration by a fixed quadrature rule can introduce artificial local solutions into the problem while simultaneously raising the computational effort, stochastic optimization schemes have become increasingly popular in such contexts. However, known stochastic gradient type methods are typically limited to expected risk functions and inherently require many iterations. The latter is particularly problematic, if the evaluation of the cost function involves solving multiple state equations, given, e.g., in form of partial differential equations. To overcome these drawbacks, a recent article introduced the CSG method, which reuses old gradient sample information via the calculation of design dependent integration weights to obtain a better approximation to the full gradient. While in the original CSG paper convergence of a subsequence was established for a diminishing step size, here, we provide a complete convergence analysis of CSG for constant step sizes and an Armijo-type line search. Moreover, new methods to obtain the integration weights are presented, extending the application range of CSG to problems involving higher dimensional integrals and distributed data.
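The core CSG idea, reusing old gradient samples with design-dependent integration weights, can be sketched on a 1-D toy problem. This is our own drastic simplification (Voronoi cell lengths of the stored samples in [0,1] as weights), not the authors' algorithm or weight computation.

```python
import numpy as np

rng = np.random.default_rng(2)

# CSG-flavored iteration (our simplification): minimize
# J(u) = E[0.5*(u - xi)^2], xi ~ U(0,1), minimizer u* = 0.5.
# Old gradient samples are reused, weighted by the lengths of the
# 1-D Voronoi cells of their xi's in [0,1].

u = 0.0
xis, grads = [], []
for n in range(200):
    xi = rng.uniform()
    xis.append(xi)
    grads.append(u - xi)               # gradient sample at the current design
    order = np.argsort(xis)
    x = np.array(xis)[order]
    g = np.array(grads)[order]
    mids = np.concatenate(([0.0], (x[:-1] + x[1:]) / 2, [1.0]))
    w = np.diff(mids)                  # Voronoi weights, sum to 1
    u -= 0.5 * float(w @ g)            # step with the weighted full-gradient approximation

print(u)  # approx 0.5
```

As the stored samples fill the parameter domain, the weighted sum approaches the full gradient, so the iteration tolerates a constant step size, in contrast to plain stochastic gradient descent.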
... This approach has also been pursued recently by other authors. In [11], the authors present a stochastic optimization strategy for PDE-constrained optimization. In the context of aerodynamic shape optimization, stochastic gradient descent has been applied for the robust design optimization of the NACA-0012 airfoil in [16], which considers uncertainties in operating conditions and model uncertainty. ...
... For the latter, this has been primarily concerned with deriving convergence results related to the functional (1.1) as well as the iterates. In particular, a relevant result is Theorem 4.7 from [15], which, under various convexity assumptions on the functional J and the optimal state u, shows ...
Preprint
Full-text available
Stochastic gradient methods have been a popular and powerful choice of optimization methods, aimed at minimizing functions. Their advantage lies in the fact that one approximates the gradient as opposed to using the full Jacobian matrix. One related research direction has been the application to infinite-dimensional problems, where one may naturally have a Hilbert space framework. However, there has been limited work on considering this in a more general setup, such as where the natural framework is that of a Banach space. This article aims to address this by introducing a novel stochastic method, the stochastic steepest descent method (SSD). The SSD follows the spirit of stochastic gradient descent, which utilizes Riesz representation to identify gradients and derivatives. Our reason for using such a method is that it naturally allows one to adopt a Banach space setting, the benefit of which recent applications, such as PDE-constrained shape optimization, have exploited. We provide a convergence theory for this under mild assumptions. Furthermore, we demonstrate the performance of this method on a couple of numerical applications, namely a p-Laplacian and an optimal control problem. Our assumptions are verified in these applications.
... Therefore, the topic has received a lot of attention in the last years, see, e.g. [19,20,28,16,15,1,30,13,2]. ...
Preprint
Full-text available
We present a multigrid algorithm to solve efficiently the large saddle-point systems of equations that typically arise in PDE-constrained optimization under uncertainty. The algorithm is based on a collective smoother that at each iteration sweeps over the nodes of the computational mesh, and solves a reduced saddle-point system whose size depends on the number N of samples used to discretize the probability space. We show that this reduced system can be solved with optimal O(N) complexity. We test the multigrid method on three problems: a linear-quadratic problem, for which the multigrid method is used to solve directly the linear optimality system; a nonsmooth problem with box constraints and L¹-norm penalization on the control, in which the multigrid scheme is used within a semismooth Newton iteration; and a risk-averse problem with the smoothed CVaR risk measure, where the multigrid method is called within a preconditioned Newton iteration. In all cases, the multigrid algorithm exhibits very good performance and robustness with respect to all parameters of interest.
... The second-stage variable is another vector corresponding to a redistribution of assets after new information is gathered. However, there are some problems in which the underlying spaces are infinite-dimensional, for instance, in optimization with partial differential equations (PDEs) subject to uncertainty; see e.g., [2,8,13,20,[26][27][28]44], and the references therein. As an example, consider the optimal control of a stationary heat source over a domain D ⊂ R d , where the conductivity a = a(s, ω) is uncertain, but known to follow a given probability distribution. ...
Article
Full-text available
We analyze a potentially risk-averse convex stochastic optimization problem, where the control is deterministic and the state is a Banach-valued essentially bounded random variable. We obtain strong forms of necessary and sufficient optimality conditions for problems subject to equality and conical constraints. We propose a Moreau--Yosida regularization for the conical constraint and show consistency of the optimality conditions for the regularized problem as the regularization parameter is taken to infinity.
... We comment further on the studies [19,23] below, which make use of probability constraints. In contrast, the abstract results in [22] can be used for state constraints as considered in this paper. However, these results require a different kind of constraint qualification that may be difficult to verify in general. ...
Article
Full-text available
A class of risk-neutral generalized Nash equilibrium problems is introduced in which the feasible strategy set of each player is subject to a common linear elliptic partial differential equation with random inputs. In addition, each player’s actions are taken from a bounded, closed, and convex set on the individual strategies and a bound constraint on the common state variable. Existence of Nash equilibria and first-order optimality conditions are derived by exploiting higher integrability and regularity of the random field state variables and a specially tailored constraint qualification for GNEPs with the assumed structure. A relaxation scheme based on the Moreau-Yosida approximation of the bound constraint is proposed, which ultimately leads to numerical algorithms for the individual player problems as well as the GNEP as a whole. The relaxation scheme is related to probability constraints, and the viability of the proposed numerical algorithms is demonstrated via several examples.
... Such problems are of interest for applications to optimization with partial differential equations (PDEs) under uncertainty, where the set to which x 2 (ω) belongs includes those states solving a PDE. This field is a rapidly developing one, with many developments in understanding the modeling, theory, and design of efficient algorithms; see, e.g., [1,12,14,18,21,26,29,42,52] and the references therein. So far, research has mostly been limited to the case where the control (in our notation, the first-stage variable x 1 ) has been subject to additional constraints. ...
... The analysis is a generalization from the finitedimensional setting from [3,Section 4.3]. While the proof follows many of the same arguments, we include an additive bias term r n , which is important for applications with PDEs, where solutions can only be approximated; see [8]. Additionally, we distinguish between possible topologies on the Hilbert space. ...
Preprint
The study of optimal control problems under uncertainty plays an important role in scientific numerical simulations. Nowadays this class of optimization problems is strongly utilized in engineering, biology and finance. In this paper, a stochastic gradient-based method is proposed for the numerical resolution of a nonconvex stochastic optimization problem on a Hilbert space. We show that, under suitable assumptions, strong or weak accumulation points of the iterates produced by the method converge almost surely to stationary points of the original optimization problem. The proof is based on classical results, such as the theorem by Robbins and Siegmund and the theory of stochastic approximation. The novelty of our contribution lies in the convergence analysis extended to some nonconvex infinite dimensional optimization problems. To conclude, the application to an optimal control problem for a class of elliptic semilinear partial differential equations (PDEs) under uncertainty will be addressed in detail.
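A minimal projected stochastic gradient iteration of the kind analyzed here can be sketched on a toy problem (our own illustration; the paper's setting is a nonconvex problem on a Hilbert space). The step sizes satisfy the Robbins-Monro conditions Σγ_n = ∞, Σγ_n² < ∞ that underlie the Robbins-Siegmund argument.

```python
import numpy as np

rng = np.random.default_rng(3)

# Projected stochastic gradient sketch: minimize J(u) = E[0.5*(u - xi)^2]
# over the admissible set U_ad = [-1, 1], with xi ~ N(1.5, 1).  The
# unconstrained minimizer 1.5 is infeasible, so the projected iterates
# should settle at the boundary point u* = 1.
u = 0.0
for n in range(1, 5001):
    gamma = 1.0 / n                  # Robbins-Monro steps: sum = inf, sum of squares < inf
    xi = rng.normal(1.5, 1.0)
    u = float(np.clip(u - gamma * (u - xi), -1.0, 1.0))   # gradient step, then projection

print(u)  # approx 1.0
```

In the PDE-constrained setting the stochastic gradient (u − ξ here) would instead be assembled from one state and one adjoint solve per drawn realization.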
... The second-stage variable is another vector corresponding to a redistribution of assets after new information is gathered. However, there are some problems in which the underlying spaces are infinite-dimensional, for instance, in optimization with partial differential equations (PDEs) subject to uncertainty; see e.g., [2,8,13,20,26,27,29,44], and the references therein. As an example, consider the optimal control of a stationary heat source over a domain D ⊂ R d , where the conductivity a = a(s, ω) is uncertain, but known to follow a given probability distribution. ...
Preprint
We analyze a potentially risk-averse convex stochastic optimization problem, where the control is deterministic and the state is a Banach-valued essentially bounded random variable. We obtain strong forms of necessary and sufficient optimality conditions for problems subject to equality and conical constraints. We propose a Moreau--Yosida regularization for the conical constraint and show consistency of the optimality conditions for the regularized problem as the regularization parameter is taken to infinity.
... In the present Hilbert space setting this is in some sense even expected to be the rule rather than the exception, since most operators are derived from complicated dynamical systems or the optimization method is applied to discretized formulations of the original problem. See the recent work [34,35] for an interesting illustration in the context of PDE-constrained optimization. Some of our results go beyond the standard unbiasedness assumption. ...
Preprint
Full-text available
We consider monotone inclusions defined on a Hilbert space where the operator is given by the sum of a maximal monotone operator T and a single-valued monotone, Lipschitz continuous, and expectation-valued operator V. We draw motivation from the seminal work by Attouch and Cabot on relaxed inertial methods for monotone inclusions and present a stochastic extension of the relaxed inertial forward-backward-forward (RISFBF) method. Facilitated by an online variance reduction strategy via a mini-batch approach, we show that (RISFBF) produces a sequence that weakly converges to the solution set. Moreover, it is possible to estimate the rate at which the discrete velocity of the stochastic process vanishes. Under strong monotonicity, we demonstrate strong convergence, and give a detailed assessment of the iteration and oracle complexity of the scheme. When the mini-batch is raised at a geometric (polynomial) rate, the rate statement can be strengthened to a linear (suitable polynomial) rate while the oracle complexity of computing an ϵ-solution improves to O(1/ϵ). Importantly, the latter claim allows for possibly biased oracles, a key theoretical advancement allowing for far broader applicability. By defining a restricted gap function based on the Fitzpatrick function, we prove that the expected gap of an averaged sequence diminishes at a sublinear rate of O(1/k) while the oracle complexity of computing a suitably defined ϵ-solution is O(1/ϵ^{1+a}) where a > 1. Numerical results on two-stage games and an overlapping group Lasso problem illustrate the advantages of our method compared to stochastic forward-backward-forward (SFBF) and SA schemes.
... We comment further on the studies [19,23] below, which make use of probability constraints. In contrast, the abstract results in [22] can be used for state constraints as considered in this paper. However, these results require a different kind of constraint qualification that may be difficult to verify in general. ...
Preprint
Full-text available
A class of risk-neutral PDE-constrained generalized Nash equilibrium problems is introduced in which the feasible strategy set of each player is subject to a common linear elliptic partial differential equation with random inputs. In addition, each player’s actions are taken from a bounded, closed, and convex set on the individual strategies and a bound constraint on the common state variable. Existence of Nash equilibria and first-order optimality conditions are derived by exploiting higher integrability and regularity of the random field state variables and a specially tailored constraint qualification for GNEPs with the assumed structure. A relaxation scheme based on the Moreau-Yosida approximation of the bound constraint is proposed, which ultimately leads to numerical algorithms for the individual player problems as well as the GNEP as a whole. The relaxation scheme is related to probability constraints, and the viability of the proposed numerical algorithms is demonstrated via several examples.
... Such problems are of interest for applications to optimization with partial differential equations (PDEs) under uncertainty, where the set to which x 2 (ω) belongs includes those states solving a PDE. This field is a rapidly developing one, with many developments in understanding the modeling, theory, and design of efficient algorithms; see, e.g., [9,21,27,18,1,33,14,8,12] and the references therein. So far, research has mostly been limited to the case where the control (in our notation, the first-stage variable x 1 ) has been subject to additional constraints. ...
Preprint
Full-text available
We analyze a convex stochastic optimization problem where the state is assumed to belong to the Bochner space of essentially bounded random variables with images in a reflexive and separable Banach space. For this problem, we obtain optimality conditions that are, with an appropriate model, necessary and sufficient. Additionally, the Lagrange multipliers associated with optimality conditions are integrable vector-valued functions and not only measures. A model problem is given demonstrating the application to PDE-constrained optimization under uncertainty.
... In each individual iteration, it is sufficient to obtain a gradient that indicates the right direction. The stochastic gradient approach also deals with approximated or inexact gradients, used during the optimization process; see [35], for example. However, our approach uses more sample points than usual in the stochastic gradient approach, but we also calculate the objective function with the reduced sample. ...
... In each individual iteration, it is sufficient to obtain a gradient that indicates the right direction. The stochastic gradient approach also deals with approximated or inexact gradients, used during the optimization process, see [34] for example. However, our approach uses more sample points than usual in the stochastic gradient approach, but we also calculate the objective function with the reduced sample. ...
Preprint
In this paper we present an efficient and fully error-controlled algorithm for yield estimation and yield optimization. Yield estimation is used to quantify the impact of uncertainty in a manufacturing process. Since computational efficiency is one main issue in uncertainty quantification, we propose a hybrid method, where a large part of a Monte Carlo (MC) sample is evaluated with a surrogate model, and only a small subset of the sample is re-evaluated with a high-fidelity finite element model. In order to determine this critical fraction of the sample, an adjoint error indicator is used for both the surrogate error and the finite element error. For yield optimization we propose an adaptive Newton-MC method. We reduce computational effort and control the MC error by adaptively increasing the sample size. The proposed method minimizes the impact of uncertainty by optimizing the yield. It allows one to control the finite element error, the surrogate error, and the MC error. At the same time it is much more efficient than standard MC approaches combined with standard Newton algorithms.
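The hybrid estimation step can be sketched on a toy model (our own stand-in: a closed-form "high-fidelity" function and a Taylor surrogate, with a fixed safety margin in place of the paper's adjoint error indicator). Only samples whose surrogate value lies close to the threshold are re-evaluated with the expensive model.

```python
import numpy as np

rng = np.random.default_rng(4)

def high_fidelity(omega):                    # hypothetical expensive model
    return np.sin(omega) + 0.1 * omega**2

def surrogate(omega):                        # hypothetical cheap approximation
    return omega - omega**3 / 6 + 0.1 * omega**2

threshold, margin = 0.5, 0.3
omega = rng.normal(0.0, 0.8, size=20000)     # MC sample of the uncertain parameter

s = surrogate(omega)
critical = np.abs(s - threshold) < margin    # only these samples are re-evaluated
values = s.copy()
values[critical] = high_fidelity(omega[critical])

yield_hybrid = np.mean(values <= threshold)                # hybrid estimate
yield_exact = np.mean(high_fidelity(omega) <= threshold)   # full re-evaluation, for comparison
print(yield_hybrid, yield_exact, critical.mean())
```

The hybrid estimate matches the fully re-evaluated one up to the rare samples misclassified outside the margin, while only a modest fraction of the sample touches the expensive model.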
Article
The stochastic Galerkin method for the propagation of probabilistically modeled uncertainties can be difficult to apply in practice due to its formulation and the challenge of creating a computational infrastructure to support it. To address these challenges, this work proposes a sampling-based stochastic Galerkin method that leverages existing deterministic analysis and adjoint-based derivative implementations. The proposed formulation is semi-intrusive since it is implemented using an existing deterministic framework, requiring only the numerical sampling of the deterministic residuals, Jacobians, boundary conditions, and adjoint implementations at nodes in the probabilistic domain. The software architectures to support stochastic generalizations of the deterministic finite element frameworks are presented. This proposed approach is demonstrated using a finite element framework for flexible multibody dynamics problems. Finally, the semi-intrusive implementation of the stochastic Galerkin method is used to demonstrate gradient-based optimizations of flexible multibody dynamics systems in the presence of probabilistically modeled uncertainties.
Chapter
We introduced the yield as the fraction of realizations in a manufacturing process fulfilling the performance feature specifications (PFS), see (4.4). Besides the estimation of the yield, e.g. with one of the methods presented in Chap. 4, the maximization of the yield is the next natural task. This chapter deals with different methods for yield optimization, depending on the specific problem and the information available.
Chapter
Shape optimization models with one or more shapes are considered in this chapter. Of particular interest for applications are problems in which a so-called shape functional is constrained by a partial differential equation (PDE) describing the underlying physics. A connection can be made between a classical view of shape optimization and the differential geometric structure of shape spaces. To handle problems where a shape functional depends on multiple shapes, a theoretical framework is presented, whereby the optimization variable can be represented as a vector of shapes belonging to a product shape space. The multi-shape gradient and multi-shape derivative are defined, which allows for a rigorous justification of a steepest descent method with Armijo backtracking. As long as the shapes as subsets of a hold-all domain do not intersect, solving a single deformation equation is enough to provide descent directions with respect to each shape. Additionally, a framework for handling uncertainties arising from inputs or parameters in the PDE is presented. To handle potentially high-dimensional stochastic spaces, a stochastic gradient method is proposed. A model problem is constructed, demonstrating how uncertainty can be introduced into the problem and the objective can be transformed by use of the expectation. Finally, numerical experiments in the deterministic and stochastic case are devised, which demonstrate the effectiveness of the presented algorithms.
Article
In this paper, we focus on a numerical investigation of a strongly convex and smooth optimization problem subject to a convection–diffusion equation with uncertain terms. Our approach is based on stochastic approximation, where the true gradient is replaced by a stochastic one with a suitable momentum term to minimize the objective functional containing random terms. A full error analysis including the Monte Carlo, finite element, and stochastic momentum gradient iteration errors is carried out. Numerical examples are presented to illustrate the performance of the proposed stochastic approximations in the PDE-constrained optimization setting.
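A stochastic gradient iteration with a heavy-ball momentum term can be sketched on a toy strongly convex problem (our own illustration with hand-picked step and momentum parameters, not the paper's scheme or tuning).

```python
import numpy as np

rng = np.random.default_rng(5)

# Stochastic heavy-ball sketch: minimize J(u) = E[0.5*(u - xi)^2]
# with xi ~ N(2, 1); the minimizer is u* = 2.
gamma, beta = 0.02, 0.9                     # step size and momentum (chosen by hand)
u, u_prev = 0.0, 0.0
for n in range(3000):
    xi = rng.normal(2.0, 1.0, size=10)      # small mini-batch per iteration
    g = u - xi.mean()                       # stochastic gradient estimate
    u, u_prev = u - gamma * g + beta * (u - u_prev), u

print(u)  # fluctuates around 2.0
```

With a constant step size the iterates hover in a noise ball around the minimizer; momentum speeds up the transient but also amplifies the gradient noise, which is why the mini-batch (or a decreasing step) is used to keep the stationary fluctuation small.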
Article
The discretization of robust quadratic optimal control problems under uncertainty using the finite element method and the stochastic collocation method leads to large saddle-point systems, which are fully coupled across the random realizations. Despite its relevance for numerous engineering problems, the solution of such systems is notoriously challenging. In this manuscript, we study efficient preconditioners for all-at-once approaches using both an algebraic and an operator preconditioning framework. We show in particular that for values of the regularization parameter not too small, the saddle-point system can be efficiently solved by preconditioning in parallel all the state and adjoint equations. For small values of the regularization parameter, robustness can be recovered by the additional solution of a small linear system, which however couples all realizations. A mean approximation and a Chebyshev semi-iterative method are proposed to solve this reduced system. We consider a random elliptic partial differential equation whose diffusion coefficient κ(x,ω) is modeled as an almost surely continuous and positive random field, though not necessarily uniformly bounded and coercive. We further provide estimates of the dependence of the spectrum of the preconditioned system matrix on the statistical properties of the random field and on the discretization of the probability space. Such estimates involve either the first or second moment of the random variables 1/min_{x∈D̄} κ(x,ω) and max_{x∈D̄} κ(x,ω), where D is the spatial domain. The theoretical results are confirmed by numerical experiments, and implementation details are further addressed.
Article
We develop a sampling-free approximation scheme for distributionally robust PDE-constrained optimization problems, which are min-max control problems. We define the ambiguity set through moment and entropic constraints. We use second-order Taylor expansions of the reduced objective function with respect to the uncertain parameters, allowing us to compute the expected value of the quadratic function explicitly. The objective function of the approximated min-max problem separates into a trust-region problem and a semidefinite program. We construct smoothing functions for the optimal value functions defined by these problems. We prove the existence of optimal solutions for the distributionally robust control problem, and for the approximated and smoothed problems, and show that a worst-case distribution exists. For the numerical solution of the approximated problem, we develop a homotopy method that computes a sequence of stationary points of smoothed problems while decreasing the smoothing parameters to zero. The adjoint approach is used to compute derivatives of the smoothing functions. Numerical results for two nonlinear optimization problems are presented.
Article
We propose a fast and scalable optimization method to solve chance- or probabilistically constrained optimization problems governed by partial differential equations (PDEs) with high-dimensional random parameters. To address the critical computational challenges of expensive PDE solves and high-dimensional uncertainty, we construct surrogates of the constraint function by Taylor approximation, which relies on efficient computation of the derivatives, low-rank approximation of the Hessian, and a randomized algorithm for eigenvalue decomposition. To tackle the difficulty of the nondifferentiability of the inequality chance constraint, we use a smooth approximation of the discontinuous indicator function involved in the chance constraint, and we apply a penalty method to transform the inequality-constrained optimization problem into an unconstrained one. Moreover, we design a gradient-based optimization scheme that gradually increases the smoothing and penalty parameters to achieve convergence, for which we present an efficient computation of the gradient of the approximate cost functional by the Taylor approximation. Based on numerical experiments for a problem in optimal groundwater management, we demonstrate the accuracy of the Taylor approximation, its ability to greatly accelerate constraint evaluations, the convergence of the continuation optimization scheme, and the scalability of the proposed method in terms of the number of PDE solves as the random parameter dimension increases from one thousand to hundreds of thousands.
Article
Full-text available
This work is motivated by the need to study the impact of data uncertainties and material imperfections on the solution to optimal control problems constrained by partial differential equations. We consider a pathwise optimal control problem constrained by a diffusion equation with random coefficient together with box constraints for the control. For each realization of the diffusion coefficient we solve an optimal control problem using the variational discretization [M. Hinze, Comput. Optim. Appl., 30 (2005), pp. 45-61]. Our framework allows for lognormal coefficients whose realizations are not uniformly bounded away from zero and infinity. We establish finite element error bounds for the pathwise optimal controls. This analysis is nontrivial due to the limited spatial regularity and the lack of uniform ellipticity and boundedness of the diffusion operator. We apply the error bounds to prove convergence of a multilevel Monte Carlo estimator for the expected value of the pathwise optimal controls. In addition we analyze the computational complexity of the multilevel estimator. We perform numerical experiments in 2D space to confirm the convergence result and the complexity bound.
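The multilevel Monte Carlo estimator analyzed above rests on a telescoping sum of coupled level corrections. As a rough sketch of that idea only, the "quantity of interest" below is a scalar stand-in with an artificial discretization bias, not a pathwise optimal control, and all names are illustrative:

```python
import random

def mlmc(sample_level, n_per_level, seed=0):
    """Multilevel Monte Carlo via the telescoping sum
        E[P_L] = E[P_0] + sum_{l=1}^{L} E[P_l - P_{l-1}],
    estimating each correction with its own sample average.
    sample_level(l, rng) must return (P_l, P_{l-1}) computed from the SAME
    random input (the coupling), with P_{-1} taken as 0 at level 0."""
    rng = random.Random(seed)
    total = 0.0
    for level, n in enumerate(n_per_level):
        acc = 0.0
        for _ in range(n):
            fine, coarse = sample_level(level, rng)
            acc += fine - coarse
        total += acc / n
    return total

# Toy stand-in for a discretized quantity of interest: P_l = xi^2 + 2^{-l}
# for xi ~ N(0, 1), i.e. a deterministic "discretization bias" 2^{-l}.
# The exact limit is E[xi^2] = 1; at finest level L = 2 the target is 1.25.
def sample_level(level, rng):
    xi = rng.gauss(0.0, 1.0)              # shared draw couples the two levels
    fine = xi * xi + 2.0 ** -level
    coarse = 0.0 if level == 0 else xi * xi + 2.0 ** -(level - 1)
    return fine, coarse

est = mlmc(sample_level, n_per_level=[50000, 2000, 200])  # few samples on fine levels
```

Because the coupled level differences have small variance, the fine (expensive) levels need far fewer samples than the coarse level, which is the source of the complexity gains the abstract analyzes.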
Article
Full-text available
Many parameter estimation problems involve parameter-dependent PDEs with multiple right-hand sides. The computational cost and memory requirements of such problems increase linearly with the number of right-hand sides. For many applications this is the main bottleneck of the computation. In this paper we show that problems with multiple right-hand sides can be reformulated as stochastic optimization problems that are much cheaper to solve. We discuss the solution methodology and use direct current resistivity and seismic tomography as model problems to show the effectiveness of our approach.
Article
Full-text available
We discuss the use of stochastic collocation for the solution of optimal control problems which are constrained by stochastic partial differential equations (SPDE). Thereby the constraining SPDE depends on data which is not deterministic but random. Assuming a deterministic control, randomness within the input data will propagate to the states of the system. For the solution of SPDEs there has recently been an increasing effort in the development of efficient numerical schemes based upon the mathematical concept of generalized polynomial chaos. Modal-based stochastic Galerkin and nodal-based stochastic collocation versions of this methodology exist, both of which rely on a certain level of smoothness of the solution in the random space to yield accelerated convergence rates. In this paper we apply the stochastic collocation method to develop a gradient descent method as well as a sequential quadratic program (SQP) for the minimization of objective functions constrained by an SPDE. The stochastic objective function involves several higher-order moments of the random states of the system as well as classical regularization of the control. In particular we discuss several objective functions of tracking type. Numerical examples are presented to demonstrate the performance of our new stochastic collocation minimization approach.
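As a minimal illustration of the nodal (collocation) idea in one random dimension: the expectation of a quantity depending on a Gaussian variable is computed by evaluating it at Gauss–Hermite nodes and combining with the quadrature weights. The scalar model `q` below stands in for a full PDE solve; all names are illustrative:

```python
import numpy as np

def collocation_mean(q, n_nodes):
    """Approximate E[q(xi)] for xi ~ N(0, 1) by probabilists' Gauss-Hermite
    collocation: evaluate q at the nodes, weight, and normalize by
    sqrt(2*pi), the total mass of the weight function e^{-x^2/2}."""
    nodes, weights = np.polynomial.hermite_e.hermegauss(n_nodes)
    return float(np.dot(weights, q(nodes)) / np.sqrt(2.0 * np.pi))

# Stand-in "solution functional": q(xi) = xi^2 + 3, so E[q] = 1 + 3 = 4 exactly
# (the 5-node rule is exact for polynomials up to degree 9).
mean_q = collocation_mean(lambda xi: xi**2 + 3.0, n_nodes=5)
```

In the control setting of the abstract, each node corresponds to one deterministic PDE solve, and the same weighted sums deliver the higher-order moments appearing in the objective.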
Article
Full-text available
We consider a finite element approximation of elliptic partial differential equations with random coefficients. Such equations arise, for example, in uncertainty quantification in subsurface flow modelling. Models for random coefficients frequently used in these applications, such as log-normal random fields with exponential covariance, have only very limited spatial regularity, and lead to variational problems that lack uniform coercivity and boundedness with respect to the random parameter. In our analysis we overcome these challenges by a careful treatment of the model problem almost surely in the random parameter, which then enables us to prove uniform bounds on the finite element error in standard Bochner spaces. These new bounds can then be used to perform a rigorous analysis of the multilevel Monte Carlo method for these elliptic problems that lack full regularity and uniform coercivity and boundedness. To conclude, we give some numerical results that confirm the new bounds.
Article
Full-text available
In this paper we consider optimization problems where the objective function is given in a form of the expectation. A basic difficulty of solving such stochastic optimization problems is that the involved multidimensional integrals (expectations) cannot be computed with high accuracy. The aim of this paper is to compare two computational approaches based on Monte Carlo sampling techniques, namely, the stochastic approximation (SA) and the sample average approximation (SAA) methods. Both approaches, the SA and SAA methods, have a long history. Current opinion is that the SAA method can efficiently use a specific (say, linear) structure of the considered problem, while the SA approach is a crude subgradient method, which often performs poorly in practice. We intend to demonstrate that a properly modified SA approach can be competitive and even significantly outperform the SAA method for a certain class of convex stochastic problems. We extend the analysis to the case of convex-concave stochastic saddle point problems and present (in our opinion highly encouraging) results of numerical experiments.
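The SA/SAA contrast drawn in this abstract can be sketched on a one-dimensional toy problem. The step-size choice below is an illustrative assumption; it is picked so that, on this particular quadratic, the SA iterate reproduces the SAA solution exactly, which will not happen in general:

```python
import random

rng = random.Random(1)
samples = [rng.gauss(1.0, 1.0) for _ in range(5000)]  # xi ~ N(1, 1); minimizer x* = 1

# SAA: replace the expectation E[(x - xi)^2] by its sample average and solve
# the resulting deterministic problem exactly (here: the sample mean).
x_saa = sum(samples) / len(samples)

# SA: one stochastic gradient step per sample. With step a_n = 1/(2n) the
# iterate is exactly the running sample mean, so on this toy problem SA and
# SAA coincide; in general they only agree asymptotically.
x_sa = 0.0
for n, xi in enumerate(samples, start=1):
    x_sa -= (1.0 / (2 * n)) * 2.0 * (x_sa - xi)  # gradient sample: 2(x - xi)
```

SAA spends all its samples up front and then optimizes a fixed deterministic surrogate; SA interleaves one cheap gradient sample per iteration, which is what the "properly modified SA" of the abstract builds on.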
Article
Full-text available
We investigate the convergence rate of approximations by finite sums of rank-1 tensors of solutions of multiparametric elliptic PDEs. Such PDEs arise, for example, in the parametric, deterministic reformulation of elliptic PDEs with random field inputs, based, for example, on the M-term truncated Karhunen–Loève expansion. Our approach could be regarded as either a class of compressed approximations of these solutions or as a new class of iterative elliptic problem solvers for high-dimensional, parametric, elliptic PDEs providing linear scaling complexity in the dimension M of the parameter space. It is based on rank-reduced, tensor-formatted separable approximations of the high-dimensional tensors and matrices involved in the iterative process, combined with the use of spectrally equivalent low-rank tensor-structured preconditioners to the parametric matrices resulting from a finite element discretization of the high-dimensional parametric, deterministic problems. Numerical illustrations for the M-dimensional parametric elliptic PDEs resulting from sPDEs on parameter spaces of dimensions M ≤ 100 indicate the advantages of employing low-rank tensor-structured matrix formats in the numerical solution of such problems.
Article
Full-text available
We describe and analyze two numerical methods for a linear elliptic problem with stochastic coefficients and homogeneous Dirichlet boundary conditions. Here the aim of the computations is to approximate statistical moments of the solution, and in particular we illustrate the case of the computation of the expected value of the solution. Since the approximation of the stochastic coefficients from the elliptic problem is in general not exact, we derive related a priori error estimates. The first method generates iid approximations of the solution by sampling the coefficients of the equation and using a standard Galerkin finite element variational formulation. The Monte Carlo method then uses these approximations to compute corresponding sample averages. The second method is based on a finite-dimensional approximation of the stochastic coefficients, turning the original stochastic problem into a deterministic parametric elliptic problem. A Galerkin finite element method, of either h or p version, then approximates the corresponding deterministic solution, yielding approximations of the desired statistics. We include a comparison of the computational work required by each method to achieve a given accuracy. This comparison suggests intuitive conditions for an optimal selection of these methods.
Article
This work combines results from operator and interpolation theory to show that elliptic systems in divergence form admit maximal elliptic regularity on the Bessel potential scale $H^{s-1}_D(\varOmega)$ for $s>0$ sufficiently small, if the coefficient in the main part satisfies a certain multiplier property on the spaces $H^{s}(\varOmega)$. Ellipticity is enforced by assuming a Gårding inequality, and the result is established for spaces incorporating mixed boundary conditions with very low regularity requirements for the underlying spatial set. To illustrate the applicability of our results, two examples are provided. Firstly, a phase-field damage model is given as a practical application where higher differentiability results are obtained as a consequence of our findings. These are necessary to show an improved numerical approximation rate. Secondly, it is shown how the maximal elliptic regularity result can be used in the context of quasilinear parabolic equations incorporating quadratic gradient terms.
Article
In this work we develop a scalable computational framework for the solution of PDE-constrained optimal control problems under high- or infinite-dimensional uncertainty. Specifically, we consider a mean-variance risk-averse formulation of the stochastic optimization problem and employ a Taylor expansion with respect to the uncertain parameter either to directly approximate the control objective or as a control variate for Monte Carlo variance reduction. The evaluation of the mean and variance of the Taylor approximation requires the efficient computation of the trace of the (preconditioned) Hessian of the control objective. We propose to estimate this trace by solving a generalized eigenvalue problem using a randomized algorithm that only requires the action of the Hessian in a small number of random directions. Then, the computational work does not depend on the nominal dimension of the uncertain parameter but only on the effective dimension (i.e. the rank of the preconditioned Hessian), thus significantly alleviating or breaking the curse of dimensionality. Moreover, when the use of the possibly biased Taylor approximation results in a large error of the optimal control function, we use this approximation as a control variate for variance reduction, which results in considerable computational savings (several orders of magnitude) compared to a simple Monte Carlo method. We demonstrate the accuracy, efficiency, and scalability of the proposed computational method for two problems with infinite-dimensional uncertain parameters: a subsurface flow in a porous medium modeled as an elliptic PDE, and a turbulent jet flow modeled as Reynolds-averaged Navier–Stokes equations coupled with a nonlinear advection-diffusion equation. In particular, for the latter (and more challenging) example we show scalability of our algorithm up to one million parameters after discretization.
Article
We present a method for optimal control of systems governed by partial differential equations (PDEs) with uncertain parameter fields. We consider an objective function that involves the mean and variance of the control objective, leading to a risk-averse optimal control problem. To make the optimal control problem tractable, we invoke a quadratic Taylor series approximation of the control objective with respect to the uncertain parameter field. This enables deriving explicit expressions for the mean and variance of the control objective in terms of its gradients and Hessians with respect to the uncertain parameter. The risk-averse optimal control problem is then formulated as a PDE-constrained optimization problem with constraints given by the forward and adjoint PDEs defining these gradients and Hessians. The expressions for the mean and variance of the control objective under the quadratic approximation involve the trace of the (preconditioned) Hessian, and are thus prohibitive to evaluate. To overcome this difficulty, we employ randomized trace estimators. We illustrate our approach with two specific problems: the control of a semilinear elliptic PDE with an uncertain boundary source term, and the control of a linear elliptic PDE with an uncertain coefficient field. For the latter problem, we derive adjoint-based expressions for efficient computation of the gradient of the risk-averse objective with respect to the controls. Our method ensures that the cost of computing the risk-averse objective and its gradient with respect to the control---measured in the number of PDE solves---is independent of the (discretized) parameter and control dimensions, and depends only on the number of random vectors employed in the trace estimation. Finally, we present a comprehensive numerical study of an optimal control problem for fluid flow in a porous medium with uncertain permeability field.
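The randomized trace estimators mentioned above can be illustrated with the standard Hutchinson estimator, which needs only matrix-vector products, matching the Hessian-action-only access pattern the abstract describes. The tiny symmetric "Hessian" and probe count below are illustrative assumptions:

```python
import random

def hutchinson_trace(matvec, dim, n_probe=200, seed=0):
    """Hutchinson estimator: tr(A) is approximated by the average of
    z^T A z over Rademacher probe vectors z (entries +/-1).
    Only the action z -> A z is required, never A itself."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(n_probe):
        z = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
        az = matvec(z)
        acc += sum(zi * ai for zi, ai in zip(z, az))
    return acc / n_probe

# Toy symmetric "Hessian" A = [[2, 1], [1, 3]], exact trace 5.
matvec = lambda z: [2.0 * z[0] + z[1], z[0] + 3.0 * z[1]]
trace_est = hutchinson_trace(matvec, dim=2)
```

Each probe costs one Hessian action (a forward/adjoint PDE-solve pair in the setting of the abstract), so the cost is governed by the number of probes, not the discretized parameter dimension.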
Article
This article is concerned with the derivation of a posteriori error estimates for optimization problems subject to an obstacle problem. To circumvent the nondifferentiability inherent to this type of problem, we introduce a sequence of penalized but differentiable problems. We show differentiability of the central path and derive separate a posteriori dual weighted residual estimates for the errors due to penalization, discretization, and iterative solution of the discrete problems. The effectivity of the derived estimates and of the adaptive algorithm is demonstrated on two numerical examples.
Article
The quantification of probabilistic uncertainties in the outputs of physical, biological, and social systems governed by partial differential equations with random inputs requires, in practice, the discretization of those equations. Stochastic finite element methods refer to an extensive class of algorithms for the approximate solution of partial differential equations having random input data, for which spatial discretization is effected by a finite element method. Fully discrete approximations require further discretization with respect to solution dependences on the random variables. For this purpose several approaches have been developed, including intrusive approaches such as stochastic Galerkin methods, for which the physical and probabilistic degrees of freedom are coupled, and non-intrusive approaches such as stochastic sampling and interpolatory-type stochastic collocation methods, for which the physical and probabilistic degrees of freedom are uncoupled. All these method classes are surveyed in this article, including some novel recent developments. Details about the construction of the various algorithms and about theoretical error estimates and complexity analyses of the algorithms are provided. Throughout, numerical examples are used to illustrate the theoretical results and to provide further insights into the methodologies.
Article
This paper improves the trust-region algorithm with adaptive sparse grids introduced in [SIAM J. Sci. Comput., 35 (2013), pp. A1847-A1879] for the solution of optimization problems governed by partial differential equations (PDEs) with uncertain coefficients. The previous algorithm used adaptive sparse-grid discretizations to generate models that are applied in a trust-region framework to generate a trial step. The decision whether to accept this trial step as the new iterate, however, required relatively high-fidelity adaptive discretizations of the objective function. In this paper, we extend the algorithm and convergence theory to allow the use of low-fidelity adaptive sparse-grid models in objective function evaluations. This is accomplished by extending conditions on inexact function evaluations used in previous trust-region frameworks. Our algorithm adaptively builds two separate sparse grids: one to generate optimization models for the step computation and one to approximate the objective function. These adapted sparse grids often contain significantly fewer points than the high-fidelity grids, which leads to a dramatic reduction in the computational cost. This is demonstrated numerically using two examples. Moreover, the numerical results indicate that the new algorithm rapidly identifies the stochastic variables that are relevant to obtaining an accurate optimal solution. When the number of such variables is independent of the dimension of the stochastic space, the algorithm exhibits near dimension-independent behavior.
Article
We consider the numerical solution of a steady-state diffusion problem where the diffusion coefficient is the exponent of a random field. The standard stochastic Galerkin formulation of this problem is computationally demanding because of the nonlinear structure of the uncertain component of it. We consider a reformulated version of this problem as a stochastic convection-diffusion problem with random convective velocity that depends linearly on a fixed number of independent truncated Gaussian random variables. The associated Galerkin matrix is nonsymmetric but sparse and allows for fast matrix-vector multiplications with optimal complexity. We construct and analyze two block-diagonal preconditioners for this Galerkin matrix for use with Krylov subspace methods such as the generalized minimal residual method. We test the efficiency of the proposed preconditioning approaches and compare the iterative solver performance for a model problem posed in both diffusion and convection-diffusion formulations.
Article
This article surveys recent developments in the adaptive numerical solution of optimal control problems governed by partial differential equations (PDE). By the Euler-Lagrange formalism the optimization problem is reformulated as a saddle-point problem (KKT system) that is discretized by a Galerkin finite element method (FEM). Following the Dual Weighted Residual (DWR) approach the accuracy of the approximation is controlled by residual-based a posteriori error estimates. This opens the way toward systematic complexity reduction in the solution of PDE-based optimal control problems occurring in science and engineering (© 2010 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)
Article
The numerical solution of optimization problems governed by partial differential equations (PDEs) with random coefficients is computationally challenging because of the large number of deterministic PDE solves required at each optimization iteration. This paper introduces an efficient algorithm for solving such problems based on a combination of adaptive sparse-grid collocation for the discretization of the PDE in the stochastic space and a trust-region framework for optimization and fidelity management of the stochastic discretization. The overall algorithm adapts the collocation points based on the progress of the optimization algorithm and the impact of the random variables on the solution of the optimization problem. It frequently uses few collocation points initially and increases the number of collocation points only as necessary, thereby keeping the number of deterministic PDE solves low while guaranteeing convergence. Currently an error indicator is used to estimate gradient errors due to adaptive stochastic collocation. The algorithm is applied to three examples, and the numerical results demonstrate a significant reduction in the total number of PDE solves required to obtain an optimal solution when compared with a Newton conjugate gradient algorithm applied to a fixed high-fidelity discretization of the optimization problem.
Article
In this paper we study mathematically and computationally optimal control problems for stochastic elliptic partial differential equations. The control objective is to minimize the expectation of a tracking cost functional, and the control is of the deterministic, distributed type. The main analytical tool is the Wiener–Itô chaos or the Karhunen–Loève expansion. Mathematically, we prove the existence of an optimal solution; we establish the validity of the Lagrange multiplier rule and obtain a stochastic optimality system of equations; we represent the input data in their Wiener–Itô chaos expansions and deduce the deterministic optimality system of equations. Computationally, we approximate the optimality system through the discretizations of the probability space and the spatial space by the finite element method; we also derive error estimates in terms of both types of discretizations.
Article
Let $M(x)$ denote the expected value at level $x$ of the response to a certain experiment. $M(x)$ is assumed to be a monotone function of $x$ but is unknown to the experimenter, and it is desired to find the solution $x = \theta$ of the equation $M(x) = \alpha$, where $\alpha$ is a given constant. We give a method for making successive experiments at levels $x_1, x_2, \cdots$ in such a way that $x_n$ will tend to $\theta$ in probability.
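The scheme of this classical abstract (Robbins–Monro stochastic approximation) can be sketched in a few lines; the response model, gain constant, and noise level below are illustrative assumptions, not prescribed by the original paper:

```python
import random

def robbins_monro(observe, alpha, x0, steps=5000, c=1.0, seed=0):
    """Robbins-Monro iteration for solving M(x) = alpha from noisy responses:
        x_{n+1} = x_n - (c / n) * (Y_n - alpha),
    where Y_n is a noisy observation of M(x_n) and M is monotone increasing."""
    rng = random.Random(seed)
    x = x0
    for n in range(1, steps + 1):
        y = observe(x, rng)            # noisy measurement of M at the current level
        x -= (c / n) * (y - alpha)     # decreasing gains force convergence
    return x

# Toy monotone response M(x) = 2x observed with Gaussian noise; the root of
# M(x) = 3 is theta = 1.5.
theta = robbins_monro(lambda x, rng: 2.0 * x + rng.gauss(0.0, 0.5), alpha=3.0, x0=0.0)
```

The $1/n$ gain sequence is the canonical choice satisfying the paper's conditions (gains sum to infinity while their squares are summable).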
Article
Partial differential equations (PDEs) with random input data, such as random loadings and coefficients, are reformulated as parametric, deterministic PDEs on parameter spaces of high, possibly infinite dimension. Tensorized operator equations for spatial and temporal k-point correlation functions of their random solutions are derived. Parametric, deterministic PDEs for the laws of the random solutions are derived. Representations of the random solutions’ laws on infinite-dimensional parameter spaces in terms of ‘generalized polynomial chaos’ (GPC) series are established. Recent results on the regularity of solutions of these parametric PDEs are presented. Convergence rates of best N-term approximations, for adaptive stochastic Galerkin and collocation discretizations of the parametric, deterministic PDEs, are established. Sparse tensor products of hierarchical (multi-level) discretizations in physical space (and time), and GPC expansions in parameter space, are shown to converge at rates which are independent of the dimension of the parameter space. A convergence analysis of multi-level Monte Carlo (MLMC) discretizations of PDEs with random coefficients is presented. Sufficient conditions on the random inputs for superiority of sparse tensor discretizations over MLMC discretizations are established for linear elliptic, parabolic and hyperbolic PDEs with random coefficients.
Article
Let $M(x)$ be a regression function which has a maximum at the unknown point $\theta$. $M(x)$ is itself unknown to the statistician who, however, can take observations at any level $x$. This paper gives a scheme whereby, starting from an arbitrary point $x_1$, one obtains successively $x_2, x_3, \cdots$ such that $x_n$ converges to $\theta$ in probability as $n \rightarrow \infty$.
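A minimal sketch of this scheme (Kiefer–Wolfowitz stochastic approximation), assuming a toy quadratic regression function and illustrative gain sequences; it climbs toward the maximizer using only noisy finite-difference slope estimates:

```python
import random

def kiefer_wolfowitz(observe, x0, steps=4000, a=1.0, c=1.0, seed=0):
    """Kiefer-Wolfowitz iteration for locating the maximizer theta of an
    unknown regression function M from noisy observations only:
        x_{n+1} = x_n + a_n * (Y(x_n + c_n) - Y(x_n - c_n)) / (2 c_n),
    with gains a_n = a/n and widening-then-shrinking c_n = c/n^{1/4}
    (illustrative choices satisfying the usual gain conditions)."""
    rng = random.Random(seed)
    x = x0
    for n in range(1, steps + 1):
        a_n, c_n = a / n, c / n ** 0.25
        slope = (observe(x + c_n, rng) - observe(x - c_n, rng)) / (2.0 * c_n)
        x += a_n * slope               # ascend the estimated slope
    return x

# Toy regression function M(x) = -(x - 2)^2 with additive noise; theta = 2.
theta = kiefer_wolfowitz(lambda x, rng: -(x - 2.0) ** 2 + rng.gauss(0.0, 0.2), x0=0.0)
```

Unlike Robbins–Monro, no direct gradient observation is assumed: each step spends two noisy function evaluations on a central difference whose width $c_n$ shrinks slowly enough to keep the noise in the slope estimate under control.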