
A new approach to improve ill-conditioned parabolic optimal control problem via time domain decomposition

Mohamed Kamel Riahi
To cite this version:
Mohamed Kamel Riahi. A NEW APPROACH TO IMPROVE ILL-CONDITIONED
PARABOLIC OPTIMAL CONTROL PROBLEM VIA TIME DOMAIN DECOMPOSITION.
2015. <hal-00974285v2>
HAL Id: hal-00974285
https://hal.inria.fr/hal-00974285v2
Submitted on 14 Jan 2015
Mohamed Kamel RIAHI1
1Department of mathematical science, New Jersey Institute of Technology, University Heights
Newark, New Jersey, USA.
January 14, 2015
Abstract. In this paper we present a new steepest-descent type algorithm for convex optimization problems. Our algorithm splits the unknown into sub-blocs of unknowns and performs a partial optimization over each sub-bloc. In quadratic optimization, our method uses a Newton technique to compute the step-lengths for the descent directions associated with the sub-blocs. Our optimization method is fully parallel and easily implementable. We first present it in a general linear algebra setting, then we highlight its applicability to a parabolic optimal control problem, where the blocs of unknowns are defined with respect to the time dependency of the control variable. In the latter problem, the parallel tasks turn the control "on" during a specific time-window and turn it "off" elsewhere. We show that our algorithm significantly improves the computational time compared with recognized methods. A convergence analysis of the new optimal control algorithm is provided for an arbitrary choice of partition. Numerical experiments are presented to illustrate the efficiency and the rapid convergence of the method.
Keywords: steepest descent method, Newton method, ill-conditioned optimal control, time domain decomposition.
1. Introduction
Typically, the improvement of iterative methods is based on an implicit transformation of the original linear system into a new system whose condition number is ideally close to one; see [11,13,25] and references therein. This technique is known as preconditioning. Modern preconditioning techniques such as algebraic multilevel methods, e.g. [20,24], and domain decomposition methods, e.g. [23,27,4,15], attempt to produce efficient tools to accelerate convergence. Other techniques have introduced a different definition of the descent directions, for example the CG method, GMRES, FGMRES, BFGS, or its limited-memory version l-BFGS; see for instance [25]. Other approaches (e.g. [5], [12] and [28], without being exhaustive) propose different formulas for the line-search in order to enhance the optimization procedure.
The central investigation of this paper is the enhancement of the iterations of the steepest
descent algorithm via an introduction of a new formulation for the line-search. Indeed, we show
how to achieve an optimal vectorized step-length for a given set of descent directions. Steepest
descent methods [7] are usually used for solving, for example, optimization problems, control
with partial differential equations (PDEs) constraints and inverse problems. Several approaches
have been developed in the cases of constrained and unconstrained optimization.
It is well known that the steepest descent algorithm converges slowly on ill-conditioned problems, because the number of iterations is proportional to the condition number of the problem. The method of J. Barzilai and J. Borwein [2], which uses a two-point step-length for the steepest-descent method to approximate the secant equation, avoids this handicap. Our method is very different: first, it is based on a decomposition of the unknown and proposes a set of bloc descent directions; second, it is general and can be coupled with any least-squares-like optimization procedure.
Mohamed Kamel RIAHI: riahi@njit.edu, http://web.njit.edu/~riahi
MULTI-STEEPEST DESCENT ALGORITHM
The theoretical basis of our approach is presented and applied to the optimization of a positive definite quadratic form. Then we apply it to a complex engineering problem involving the control of a system governed by PDEs. We consider the optimal heat control problem, which is known to be ill-posed in general (and well-posed under some assumptions) and presents particular theoretical and numerical challenges. We tune the degree of ill-posedness of the heat control problem by varying the regularization parameter, and apply our method to the resulting problems to show the efficiency of our algorithm. The distributed- and boundary-control cases are both considered.
This paper is organized as follows: In Section 2, we present our method in a linear algebra
framework to highlight its generality. Section 3 is devoted to the introduction of the optimal
control problem with constrained PDE on which we will apply our method. We present the
Euler-Lagrange-system associated to the optimization problem and give the explicit formulation
of the gradient in both cases of distributed- and boundary-control. Then, we present and explain
the parallel setting for our optimal control problem. In Section 4, we perform the convergence
analysis of our parallel algorithm. In Section 5, we present the numerical experiments that
demonstrate the efficiency and the robustness of our approach. We make concluding remarks in
Section 6. For completeness, we include calculus results in the Appendix.
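Let $\Omega$ be a bounded domain in $\mathbb{R}^3$ and $\Omega_c \subset \Omega$; the boundary of $\Omega$ is denoted by $\partial\Omega$. We denote by $\Gamma \subset \partial\Omega$ a part of this boundary. We denote by $\langle\cdot,\cdot\rangle_2$ (respectively $\langle\cdot,\cdot\rangle_c$ and $\langle\cdot,\cdot\rangle_\Gamma$) the standard $L^2(\Omega)$ (respectively $L^2(\Omega_c)$ and $L^2(\Gamma)$) inner product, which induces the $L^2(\Omega)$-norm $\|\cdot\|_2$ on the domain $\Omega$ (respectively $\|\cdot\|_c$ on $\Omega_c$ and $\|\cdot\|_\Gamma$ on $\Gamma$).
In the case of a finite-dimensional vector space $\mathbb{R}^m$, the scalar product $a^T b$ of $a$ and $b$ (where $a^T$ stands for the transpose of $a$) is also denoted by $\langle\cdot,\cdot\rangle_2$. The scalar product with respect to the matrix $A$, i.e. $\langle x, Ax\rangle_2$, is denoted by $\langle x, x\rangle_A$ and its induced norm by $\|x\|_A$. The transpose of the operator $A$ is denoted by $A^T$. The Hilbert space $L^2(0,T;L^2(\Omega_c))$ (respectively $L^2(0,T;L^2(\Gamma))$) is endowed with the scalar product $\langle\cdot,\cdot\rangle_{c,I}$ (respectively $\langle\cdot,\cdot\rangle_{\Gamma,I}$), which induces the norm $\|\cdot\|_{c,I}$ (respectively $\|\cdot\|_{\Gamma,I}$).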
2. Enhanced steepest descent iterations
The steepest descent algorithm minimizes at each iteration the quadratic function $q(x) = \tfrac{1}{2}\|x - x^\star\|_A^2$, where $A$ is assumed to be a symmetric positive definite (SPD) matrix and $x^\star$ is the minimizer of $q$. The vector $-\nabla q(x)$ is locally the descent direction that yields the fastest rate of decrease of the quadratic form $q$. Therefore all vectors of the form $x + \theta\nabla q(x)$, where $\theta$ is a suitable negative real value, decrease $q$. The value of $\theta$ is found by solving $\min_{s<0} q(x + s\nabla q(x))$ with a line-search technique. In the case where $q$ is a quadratic form, $\theta$ is given by $-\|\nabla q(x)\|_2^2 / \|\nabla q(x)\|_A^2$. We recall the steepest descent algorithm in Algorithm 1; Convergence is a boolean variable based on the estimate $\|r_k\|_2 < \epsilon$ of the residual vector, where $\epsilon$ is the stopping tolerance.
Our method proposes to modify step 5 of Algorithm 1. It considers the step-length $\theta \in \mathbb{R}\setminus\{0\}$ as a vector in $\mathbb{R}^{\hat n}\setminus\{0\}$, where $\hat n$ is an integer such that $1 \le \hat n \le \mathrm{size}(x)$; we shall denote this new vector by $\Theta_{\hat n}$.
In the following, it is assumed that for a given vector $x \in \mathbb{R}^m$, the integer $\hat n$ divides $m$ with zero remainder. In this context, let us introduce the identity operator $I_{\mathbb{R}^m}$, which is an $m$-by-$m$ matrix, and its partition (partition of unity) given by the projection operators $\{\pi_n\}_{n=1}^{\hat n}$: projectors from $\mathbb{R}^m$ onto the span of a subset of the canonical basis $\{e_i\}_i$. These operators are defined for $1 \le n \le \hat n$ by
\[ \pi_n : \mathbb{R}^m \to \mathbb{R}^{m/\hat n}, \qquad x \mapsto \pi_n(x) = \sum_{i=(n-1)\times\frac{m}{\hat n}+1}^{n\times\frac{m}{\hat n}} \langle e_i, x\rangle_2\, e_i. \]
Algorithm 1: Steepest descent.
Input: $x_0$;
1 $k = 0$;
2 while not Convergence do
3     $r_k = \nabla q_k := \nabla q(x_k)$;
4     Compute $A r_k$;
5     Compute $\theta_k = -\|r_k\|_2^2 / \|r_k\|_A^2$;
6     $x_{k+1} = x_k + \theta_k r_k$;
7     $k = k + 1$;
8 end
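As an illustration of Algorithm 1, the following minimal Python sketch may help. It assumes the quadratic form is written as $q(x) = \tfrac12 x^T A x - x^T b$ (so that $\nabla q(x) = Ax - b$), uses plain Python lists to stay dependency-free, and the names `matvec`, `dot` and `steepest_descent` as well as the small test matrix are hypothetical, chosen only for this example.

```python
# Minimal sketch of Algorithm 1 for q(x) = 0.5*x^T A x - x^T b, with A SPD.
# Helper names and the 2x2 data are illustrative assumptions.

def matvec(A, x):
    """Return the matrix-vector product A x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

def dot(u, v):
    """Return the Euclidean inner product <u, v>_2."""
    return sum(u_i * v_i for u_i, v_i in zip(u, v))

def steepest_descent(A, b, x0, eps=1e-10, max_iter=10_000):
    """Iterate x_{k+1} = x_k + theta_k r_k with theta_k = -||r_k||_2^2 / ||r_k||_A^2."""
    x = list(x0)
    for k in range(max_iter):
        r = [ax_i - b_i for ax_i, b_i in zip(matvec(A, x), b)]  # r_k = grad q(x_k)
        rr = dot(r, r)
        if rr ** 0.5 < eps:                  # Convergence: ||r_k||_2 < eps
            return x, k
        theta = -rr / dot(r, matvec(A, r))   # step 5 of Algorithm 1
        x = [x_i + theta * r_i for x_i, r_i in zip(x, r)]
    return x, max_iter

A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x, iterations = steepest_descent(A, b, [0.0, 0.0])
print(x, iterations)
```

The exact solution of this small system is $(1/11, 7/11)$, which gives a convenient check of the iterate returned by the sketch.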
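For reading convenience, we define $\tilde x_n$ as the vector in $\mathbb{R}^m$ such that $\tilde x_n := \pi_n(x)$. The concatenation of $\tilde x_n$ for all $1 \le n \le \hat n$ is denoted by
\[ \hat x_{\hat n} = \bigoplus_{n=1}^{\hat n} \pi_n(x) = \bigoplus_{n=1}^{\hat n} \tilde x_n \in \mathbb{R}^m. \]
We remark that the $\pi_n$ satisfy $\bigoplus_{n=1}^{\hat n} \pi_n = I_{\mathbb{R}^m}$.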
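The partition-of-unity property can be checked on a small vector. The sketch below is only illustrative: it assumes that $\pi_n(x)$ is stored as a vector of $\mathbb{R}^m$ extended by zero outside bloc $n$, and the function name `project` is hypothetical.

```python
def project(x, n, n_hat):
    """Return pi_n(x): keep bloc n (1-based) of x and extend it by zero to R^m."""
    m = len(x)
    if m % n_hat != 0:
        raise ValueError("n_hat must divide the size of x")
    l = m // n_hat                    # bloc length m / n_hat
    lo, hi = (n - 1) * l, n * l       # 0-based index range of bloc n
    return [x_i if lo <= i < hi else 0.0 for i, x_i in enumerate(x)]

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
blocs = [project(x, n, 3) for n in (1, 2, 3)]
# Partition of unity: summing the zero-extended blocs recovers x.
recombined = [sum(column) for column in zip(*blocs)]
print(blocs)
print(recombined == x)
```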
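Recall the gradient $\nabla_x = \left(\frac{\partial}{\partial x_1}, \ldots, \frac{\partial}{\partial x_m}\right)^T$, and define the bloc gradient $\nabla_{\hat x_{\hat n}} = \left(\nabla^T_{\tilde x_1}, \ldots, \nabla^T_{\tilde x_{\hat n}}\right)^T$, where obviously $\nabla_{\tilde x_n} = \left(\frac{\partial}{\partial x_{(n-1)\times\frac{m}{\hat n}+1}}, \ldots, \frac{\partial}{\partial x_{n\times\frac{m}{\hat n}}}\right)^T$. In the spirit of this decomposition we investigate, in the sequel, the local descent directions as the bloc partial derivatives with respect to the bloc-variables $(\tilde x_n)_{n=1}^{\hat n}$. We aim, therefore, at finding $\Theta_{\hat n} = (\theta_1, \ldots, \theta_{\hat n})^T \in \mathbb{R}^{\hat n}$ that realizes
\[ \min_{(\theta_n)_n < 0}\; q\Big(\hat x^k_{\hat n} + \bigoplus_{n=1}^{\hat n} \theta_n \nabla_{\tilde x_n} q(\hat x^k_{\hat n})\Big). \]
We state hereafter a motivating result, whose proof is straightforward because the spaces are embedded. Let us first denote by
\[ \Phi_{\hat n}(\Theta_{\hat n}) : \mathbb{R}^{\hat n} \to \mathbb{R}^+, \qquad \Theta_{\hat n} \mapsto q\Big(\hat x_{\hat n} + \bigoplus_{n=1}^{\hat n} \theta_n \nabla_{\tilde x_n} q(\hat x_{\hat n})\Big), \tag{1} \]
which is quadratic because $q$ is.
Theorem 2.1. According to the definition of $\Phi_{\hat n}(\Theta_{\hat n})$ (see Eq. (1)) we immediately have
\[ \min_{\mathbb{R}^p} \Phi_p(\Theta_p) \le \min_{\mathbb{R}^q} \Phi_q(\Theta_q) \qquad \forall\, q < p. \]
The new algorithm we discuss in this paper defines a sequence $(\hat x^k_{\hat n})_k$ of vectors that converges to $x^\star$, the unique minimizer of the quadratic form $q$. The update formula reads:
\[ \tilde x^{k+1}_n = \tilde x^k_n + \theta^k_n \nabla_{\tilde x_n} q(\hat x^k_{\hat n}), \]
where we recall that $\hat n$ is an arbitrarily chosen integer. Then $\hat x^{k+1}_{\hat n} = \bigoplus_{n=1}^{\hat n} \tilde x^{k+1}_n$.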
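We shall now explain how one can accurately compute the vector step-length $\Theta^k_{\hat n}$ at each iteration $k$. It is assumed that $q$ is a quadratic form. From Eq. (1), using the chain rule, we obtain the Jacobian vector $\Phi'_{\hat n}(\Theta_{\hat n}) \in \mathbb{R}^{\hat n}$ given by
\[ \big(\Phi'_{\hat n}(\Theta_{\hat n})\big)_j = \nabla_{\tilde x_j} q(\hat x^k_{\hat n})^T\, \nabla_{\tilde x_j} q\Big(\hat x^k_{\hat n} + \bigoplus_{n=1}^{\hat n} \theta_n \nabla_{\tilde x_n} q(\hat x^k_{\hat n})\Big) \in \mathbb{R}, \tag{2} \]
and the Hessian matrix $\Phi''_{\hat n}(\Theta_{\hat n}) \in \mathbb{R}^{\hat n\times\hat n}$ is given by
\[ \big(\Phi''_{\hat n}(\Theta_{\hat n})\big)_{i,j} = \nabla_{\tilde x_i} q(\hat x^k_{\hat n})^T\, \nabla_{\tilde x_i}\nabla_{\tilde x_j} q\Big(\hat x^k_{\hat n} + \bigoplus_{n=1}^{\hat n} \theta_n \nabla_{\tilde x_n} q(\hat x^k_{\hat n})\Big)\, \nabla_{\tilde x_j} q(\hat x^k_{\hat n}). \]
It is worth noticing that the matrix $\nabla_{\tilde x_i}\nabla_{\tilde x_j} q\big(\hat x^k_{\hat n} + \bigoplus_{n=1}^{\hat n} \theta_n \nabla_{\tilde x_n} q(\hat x^k_{\hat n})\big)$ is a bloc portion of the Hessian matrix $A$. However, if the gradient $\nabla_{\tilde x_n} q \in \mathbb{R}^{m/\hat n}$ is extended by zero to $\mathbb{R}^m$ (the extension being denoted by $\widetilde{\nabla}_{\tilde x_n} q$), the matrix $\Phi''_{\hat n}(\Theta_{\hat n})$ takes the simplest implementable form
\[ \big(\Phi''_{\hat n}(\Theta_{\hat n})\big)_{i,j} = \big(\widetilde{\nabla}_{\tilde x_i} q(\hat x^k_{\hat n})\big)^T A\, \widetilde{\nabla}_{\tilde x_j} q(\hat x^k_{\hat n}). \tag{3} \]
We thus have the expansion $\Phi_{\hat n}(\Theta^k_{\hat n}) = \Phi_{\hat n}(\mathbf{0}) + (\Theta^k_{\hat n})^T \Phi'_{\hat n}(\mathbf{0}) + \tfrac{1}{2} (\Theta^k_{\hat n})^T \Phi''_{\hat n}(\mathbf{0})\, \Theta^k_{\hat n}$, with $\mathbf{0} := (0, \ldots, 0)^T \in \mathbb{R}^{\hat n}$. Then the vector $\Theta^k_{\hat n}$ that annuls the gradient reads:
\[ \Theta^k_{\hat n} = -\Phi''_{\hat n}(\mathbf{0})^{-1}\, \Phi'_{\hat n}(\mathbf{0}). \tag{4} \]
Algorithm 1 therefore acquires a bloc structure that can be solved in parallel, because the partial derivatives can be computed independently. The new algorithm reads as follows (see Algorithm 2).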
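Algorithm 2: Enhanced steepest descent.
$k = 0$;
Input: $\hat x^0_{\hat n} \in \mathbb{R}^m$;
1 while not Convergence do
2     forall $1 \le n \le \hat n$ do
3         $\tilde x^k_n = \pi_n(\hat x^k_{\hat n})$;
4         $r_n = \nabla_{\tilde x^k_n} q(\hat x^k_{\hat n})$;
5         resize($r_n$) (i.e. extension by zero, which simply means projecting onto $\mathbb{R}^m$);
6     end
7     Assemble $\Phi'_{\hat n}(\mathbf{0})$ with elements $(\Phi'_{\hat n}(\mathbf{0}))_j = r_j^T r_j$ according to Eq. (2);
8     Assemble $\Phi''_{\hat n}(\mathbf{0})$ with elements $(\Phi''_{\hat n}(\mathbf{0}))_{i,j} = r_i^T A r_j$ according to Eq. (3);
9     Compute $\Theta^k_{\hat n}$, solution of Eq. (4);
10    Update $\hat x^{k+1}_{\hat n} = \hat x^k_{\hat n} + \bigoplus_n \theta_n \nabla_{\tilde x_n} q(\hat x^k_{\hat n})$;
11    $k = k + 1$;
12 end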
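The steps of Algorithm 2 can be sketched in plain Python as follows. This is a minimal illustration rather than the author's implementation: it assumes $q(x) = \tfrac12 x^T A x - x^T b$, solves the small $\hat n\times\hat n$ Newton system with a hand-written Gaussian elimination, and skips blocs whose gradient vanishes exactly (an added safeguard, so that $\Phi''_{\hat n}(\mathbf{0})$ stays invertible). All function names and the test matrix are hypothetical.

```python
# Sketch of Algorithm 2 for q(x) = 0.5*x^T A x - x^T b, with A SPD (plain Python lists).

def matvec(A, x):
    """Matrix-vector product A x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(u, v):
    """Euclidean inner product."""
    return sum(ui * vi for ui, vi in zip(u, v))

def solve(M, rhs):
    """Solve the small dense system M y = rhs (Gaussian elimination, partial pivoting)."""
    n = len(rhs)
    aug = [list(row) + [r] for row, r in zip(M, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(aug[i][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for i in range(c + 1, n):
            f = aug[i][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[i][j] -= f * aug[c][j]
    y = [0.0] * n
    for i in reversed(range(n)):
        y[i] = (aug[i][n] - sum(aug[i][j] * y[j] for j in range(i + 1, n))) / aug[i][i]
    return y

def enhanced_steepest_descent(A, b, x0, n_hat, eps=1e-10, max_iter=1000):
    """Bloc steepest descent: one step-length per bloc, all obtained from one Newton step."""
    m = len(b)
    if m % n_hat != 0:
        raise ValueError("n_hat must divide the size of x")
    l = m // n_hat
    x = list(x0)
    for k in range(max_iter):
        g = [ax - bi for ax, bi in zip(matvec(A, x), b)]   # gradient of q at x^k
        if dot(g, g) ** 0.5 < eps:                          # Convergence test
            return x, k
        # Steps 3-5: bloc gradients extended by zero to R^m
        r = [[gi if n * l <= i < (n + 1) * l else 0.0 for i, gi in enumerate(g)]
             for n in range(n_hat)]
        r = [rn for rn in r if any(rn)]                     # safeguard: drop zero blocs
        D = [dot(rn, rn) for rn in r]                       # step 7: Phi'(0)
        Ar = [matvec(A, rn) for rn in r]
        H = [[dot(ri, Arj) for Arj in Ar] for ri in r]      # step 8: Phi''(0)
        theta = [-t for t in solve(H, D)]                   # step 9: Eq. (4)
        for tn, rn in zip(theta, r):                        # step 10: update
            x = [xi + tn * ri for xi, ri in zip(x, rn)]
    return x, max_iter

A = [[4.0, 1.0, 0.0, 0.0],
     [1.0, 4.0, 1.0, 0.0],
     [0.0, 1.0, 4.0, 1.0],
     [0.0, 0.0, 1.0, 4.0]]
b = [1.0, 2.0, 3.0, 4.0]
results = {n_hat: enhanced_steepest_descent(A, b, [0.0] * 4, n_hat) for n_hat in (1, 2, 4)}
for n_hat, (x, iters) in results.items():
    print(n_hat, iters, x)
```

With $\hat n = 1$ the sketch reduces to the classical steepest descent. With $\hat n = m$ and no vanishing gradient component, the bloc directions span $\mathbb{R}^m$, so the Newton step on $\Theta_{\hat n}$ reaches the minimizer in a single iteration, at the price of solving an $m\times m$ system.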
3. Application to a parabolic optimal control problem
In this part we are interested in the application of Algorithm 2 to a finite element computational engineering problem involving optimization under a PDE constraint. In particular, we deal with the optimal control problem of a system governed by the heat equation. We shall present two types of control problems. The first concerns the distributed optimal control and the second concerns the Dirichlet boundary control. The main difference from the algorithm just presented in the linear algebra setting is that the decomposition is applied to the time domain of the control. This technique is not classical; we may refer to similar approaches that have been proposed for time domain decomposition in application to control problems, for instance [19,17,18], which basically use a variant of the parareal in time algorithm [15].
3.1. Distributed optimal control problem. Let us briefly present the steepest descent method applied to the following optimal control problem: find $v^\star$ such that
\[ J(v^\star) = \min_{v \in L^2(0,T;L^2(\Omega_c))} J(v), \tag{5} \]
where $J$ is a quadratic cost functional defined by
\[ J(v) = \frac{1}{2}\|y(T) - y^{target}\|_2^2 + \frac{\alpha}{2}\int_I \|v\|_c^2\, dt, \tag{6} \]
where $y^{target}$ is a given target state and $y(T)$ is the state variable at time $T > 0$ of the heat equation controlled by the variable $v$ over $I := [0,T]$. The Tikhonov regularization parameter $\alpha$ is introduced to penalize the control's $L^2$-norm over the time interval $I$. The optimality system of our problem reads:
\[ \begin{cases} \partial_t y - \sigma\Delta y = Bv, & \text{on } I\times\Omega,\\ y(t=0) = y_0, \end{cases} \tag{7} \]
\[ \begin{cases} \partial_t p + \sigma\Delta p = 0, & \text{on } I\times\Omega,\\ p(t=T) = y(T) - y^{target}, \end{cases} \tag{8} \]
\[ \nabla J(v) = \alpha v + B^T p = 0, \quad \text{on } I\times\Omega. \tag{9} \]
In the above equations, $B$ is a linear operator that distributes the control in $\Omega_c$; obviously $B$ stands for the indicator of $\Omega_c \subset \Omega$. The state variable $p$ stands for the Lagrange multiplier (adjoint state), solution of the backward heat equation Eq. (8); Eq. (7) is called the forward heat equation.
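3.2. Dirichlet boundary optimal control problem. In this subsection we are concerned with the PDE-constrained Dirichlet boundary optimal control problem, where we aim at minimizing the cost functional $J_\Gamma$ defined by
\[ J_\Gamma(v_\Gamma) = \frac{1}{2}\|y_\Gamma(T) - y^{target}\|_2^2 + \frac{\alpha}{2}\int_I \|v_\Gamma\|_\Gamma^2\, dt, \tag{10} \]
where the control variable $v_\Gamma$ acts only on the boundary $\Gamma \subset \partial\Omega$. Here too, $y^{target}$ is a given target state (not necessarily equal to the one defined in the previous subsection) and $y_\Gamma(T)$ is the state variable at time $T > 0$ of the heat equation controlled by the variable $v_\Gamma$ during the time interval $I := [0,T]$. As before, $\alpha$ is a regularization term. The optimality system involved reads
\[ \begin{cases} \partial_t y_\Gamma - \sigma\Delta y_\Gamma = f & \text{on } I\times\Omega,\\ y_\Gamma = v_\Gamma & \text{on } I\times\Gamma,\\ y_\Gamma = g & \text{on } I\times\{\partial\Omega\setminus\Gamma\},\\ y_\Gamma(0) = y_0, \end{cases} \tag{11} \]
\[ \begin{cases} \partial_t p_\Gamma + \sigma\Delta p_\Gamma = 0 & \text{on } I\times\Omega,\\ p_\Gamma = 0 & \text{on } I\times\partial\Omega,\\ p_\Gamma(T) = y_\Gamma(T) - y^{target}, \end{cases} \tag{12} \]
\[ \nabla J_\Gamma(v_\Gamma) = \alpha v_\Gamma - (\nabla p_\Gamma)^T \vec n = 0 \quad \text{on } I\times\Gamma, \tag{13} \]
where $f \in L^2(\Omega)$ is any source term, $g \in L^2(\Gamma)$ and $\vec n$ is the outward unit normal on $\Gamma$. The state variable $p_\Gamma$ stands for the Lagrange multiplier (adjoint state), solution of the backward heat equation Eq. (12). Both functions $f$ and $g$ will be given explicitly for each numerical test that we consider in the numerical experiment section.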
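3.3. Steepest descent algorithm for optimal control of constrained PDE. In the optimal control problem, the evaluation of the gradient, as is clear from Eq. (9) (respectively Eq. (13)), requires the evaluation of the time-dependent Lagrange multiplier $p$ (respectively $p_\Gamma$). This makes the steepest descent optimization algorithm differ slightly from Algorithm 1 presented above.
Let us denote by $k$ the current iteration superscript. We suppose that $v^0$ is known. The first-order steepest descent algorithm updates the control variable as follows:
\[ v^k = v^{k-1} + \theta^{k-1}\nabla J(v^{k-1}), \quad \text{for } k \ge 1, \text{ for the distributed control,} \tag{14} \]
respectively as
\[ v^k_\Gamma = v^{k-1}_\Gamma + \theta^{k-1}_\Gamma \nabla J_\Gamma(v^{k-1}_\Gamma), \quad \text{for } k \ge 1, \text{ for the Dirichlet control.} \tag{15} \]
The step-length $\theta^{k-1} \in \mathbb{R}\setminus\{0\}$ in the direction of the gradient $\nabla J(v^{k-1}) = \alpha v^{k-1} + B^T p^{k-1}$ (respectively $\nabla J_\Gamma(v^{k-1}_\Gamma) = \alpha v^{k-1}_\Gamma - (\nabla p^{k-1}_\Gamma)^T \vec n$) is computed as
\[ \theta^{k-1} = -\|\nabla J(v^{k-1})\|^2_{c,I} \,/\, \|\nabla J(v^{k-1})\|^2_{\nabla^2 J} \quad \text{for the distributed control,} \]
respectively as
\[ \theta^{k-1}_\Gamma = -\|\nabla J_\Gamma(v^{k-1}_\Gamma)\|^2_{\Gamma,I} \,/\, \|\nabla J_\Gamma(v^{k-1}_\Gamma)\|^2_{\nabla^2 J_\Gamma} \quad \text{for the Dirichlet control.} \]
The above step-length $\theta^{k-1}$ (respectively $\theta^{k-1}_\Gamma$) is optimal (see e.g. [8]) in the sense that it minimizes the functional $\theta \mapsto J(v^{k-1} + \theta\nabla J(v^{k-1}))$ (respectively $\theta \mapsto J_\Gamma(v^{k-1}_\Gamma + \theta\nabla J_\Gamma(v^{k-1}_\Gamma))$). The rate of convergence of this technique is $\left(\frac{\kappa-1}{\kappa+1}\right)^2$, where $\kappa$ is the condition number of the quadratic form, namely the Hessian of the cost functional $J$ (respectively $J_\Gamma$).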
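To give a feeling for what this rate means in practice, the short sketch below evaluates $\left(\frac{\kappa-1}{\kappa+1}\right)^2$ and a worst-case iteration count. It assumes the rate is read as a per-iteration contraction factor of the squared energy-norm error; the function names and the tolerance are hypothetical.

```python
import math

def steepest_descent_rate(kappa):
    """Contraction factor ((kappa - 1) / (kappa + 1))**2 for condition number kappa."""
    return ((kappa - 1.0) / (kappa + 1.0)) ** 2

def iterations_for(kappa, tol):
    """Smallest k such that rate**k <= tol (worst-case estimate)."""
    rate = steepest_descent_rate(kappa)
    if rate == 0.0:
        return 1
    return math.ceil(math.log(tol) / math.log(rate))

for kappa in (10.0, 1e2, 1e4):
    print(kappa, steepest_descent_rate(kappa), iterations_for(kappa, 1e-6))
```

For large $\kappa$ the factor behaves like $1 - 4/\kappa$, so the estimated iteration count grows roughly linearly with the condition number.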
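3.4. Time-domain decomposition algorithm. Consider $\hat n$ subdivisions of the time interval $I = \cup_{n=1}^{\hat n} I_n$, and consider also the following convex cost functionals:
\[ J(v_1, v_2, \ldots, v_{\hat n}) = \frac{1}{2}\|Y(T) - y^{target}\|_2^2 + \frac{\alpha}{2}\sum_{n=1}^{\hat n}\int_{I_n} \|v_n\|_c^2\, dt, \tag{16} \]
\[ J_\Gamma(v_{1,\Gamma}, v_{2,\Gamma}, \ldots, v_{\hat n,\Gamma}) = \frac{1}{2}\|Y_\Gamma(T) - y^{target}\|_2^2 + \frac{\alpha}{2}\sum_{n=1}^{\hat n}\int_{I_n} \|v_{n,\Gamma}\|_\Gamma^2\, dt, \tag{17} \]
where $v_n$, $n = 1, \ldots, \hat n$, are control variables with time support included in $I_n$, $n = 1, \ldots, \hat n$. The state $Y(T)$ (respectively $Y_\Gamma(T)$) stands for the sum of the state variables $Y_n$ (respectively $Y_{n,\Gamma}$), which are time-dependent state variables solving the heat equation controlled by the variable $v_n$ (respectively $v_{n,\Gamma}$). Obviously, because the control is linear, the state $Y$ depends on the concatenation of the controls $v_1, v_2, \ldots, v_{\hat n}$, namely $v = \sum_{n=1}^{\hat n} v_n$.
Let us define $\Theta_{\hat n} := (\theta_1, \theta_2, \ldots, \theta_{\hat n})^T$ where $\theta_n \in \mathbb{R}\setminus\{0\}$. For any admissible control $w = \sum_{n}^{\hat n} w_n$, we also define $\varphi_{\hat n}(\Theta_{\hat n}) := J(v + \sum_{n=1}^{\hat n}\theta_n w_n)$, which is quadratic. We have:
\[ \varphi_{\hat n}(\Theta_{\hat n}) = \varphi_{\hat n}(\mathbf{0}) + \Theta_{\hat n}^T \nabla\varphi_{\hat n}(\mathbf{0}) + \frac{1}{2}\Theta_{\hat n}^T \nabla^2\varphi_{\hat n}(\mathbf{0})\,\Theta_{\hat n}, \tag{18} \]
where $\mathbf{0} = (0, \ldots, 0)^T$. Therefore we can write $\nabla\varphi_{\hat n}(\Theta_{\hat n}) \in \mathbb{R}^{\hat n}$ as $\nabla\varphi_{\hat n}(\Theta_{\hat n}) = D(v,w) + H(v,w)\Theta_{\hat n}$, where the Jacobian vector and the Hessian matrix are given respectively by:
\[ D(v,w) := \big(\langle\nabla J(v), \pi_1(w)\rangle_c, \ldots, \langle\nabla J(v), \pi_{\hat n}(w)\rangle_c\big)^T \in \mathbb{R}^{\hat n}, \]
\[ H(v,w) := (H_{n,m})_{n,m}, \quad \text{for } H_{n,m} = \langle\pi_n(w), \pi_m(w)\rangle_{\nabla^2 J}. \]
Here, $(\pi_n)$ is the restriction over the time interval $I_n$; indeed $\pi_n(w)$ has support on $I_n$ and is extended by zero in $I$. The solution $\Theta^\star_{\hat n}$ of $\nabla\varphi_{\hat n}(\Theta_{\hat n}) = \mathbf{0}$ can be written in the form:
\[ \Theta^\star_{\hat n} = -H^{-1}(v,w)\, D(v,w). \tag{19} \]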
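In the parallel distributed control problem, we are concerned with the following optimality system:
\[ \begin{cases} \partial_t Y_n - \sigma\Delta Y_n = B v_n, & \text{on } I\times\Omega,\\ Y_n(t=0) = \delta^0_n y_0, \end{cases} \tag{20} \]
\[ Y(T) = \sum_{n=1}^{\hat n} Y_n(T), \tag{21} \]
\[ \begin{cases} \partial_t P + \sigma\Delta P = 0, & \text{on } I\times\Omega,\\ P(t=T) = Y(T) - y^{target}, \end{cases} \tag{22} \]
\[ \nabla J\Big(\sum_{n=1}^{\hat n} v_n\Big) = B^T P + \alpha\sum_{n=1}^{\hat n} v_n = 0, \quad \text{on } I\times\Omega, \tag{23} \]
where $\delta^0_n$ stands for the function taking the value 1 only if $n = 0$, and the value 0 otherwise. The Dirichlet control problem we are concerned with reads:
\[ \begin{cases} \partial_t Y_{n,\Gamma} - \sigma\Delta Y_{n,\Gamma} = f & \text{on } I\times\Omega,\\ Y_{n,\Gamma} = v_{n,\Gamma} & \text{on } I\times\Gamma,\\ Y_{n,\Gamma} = g & \text{on } I\times\{\partial\Omega\setminus\Gamma\},\\ Y_{n,\Gamma}(0) = \delta^0_n y_0, \end{cases} \tag{24} \]
\[ Y_\Gamma(T) = \sum_{n=1}^{\hat n} Y_{n,\Gamma}(T), \tag{25} \]
\[ \begin{cases} \partial_t P_\Gamma + \sigma\Delta P_\Gamma = 0 & \text{on } I\times\Omega,\\ P_\Gamma = 0 & \text{on } I\times\partial\Omega,\\ P_\Gamma(T) = Y_\Gamma(T) - y^{target}, \end{cases} \tag{26} \]
\[ \nabla J_\Gamma\Big(\sum_{n=1}^{\hat n} v_{n,\Gamma}\Big) = -(\nabla P_\Gamma)^T \vec n + \alpha\sum_{n=1}^{\hat n} v_{n,\Gamma} = 0 \quad \text{on } I\times\Gamma. \tag{27} \]
The resolution of Eqs. (20) and (24) with respect to $n$ is fully performed in parallel over the time interval $I$. We recall that the superscript $k$ denotes the iteration index. The update formula for the control variable $v^k$ is given by:
\[ v^k_n = v^{k-1}_n + \theta^{k-1}_n\Big(B^T P^{k-1} + \alpha\sum_{n=1}^{\hat n} v^{k-1}_n\Big), \]
respectively by
\[ v^k_{n,\Gamma} = v^{k-1}_{n,\Gamma} + \theta^{k-1}_{n,\Gamma}\Big(-(\nabla P^{k-1}_\Gamma)^T \vec n + \alpha\sum_{n=1}^{\hat n} v^{k-1}_{n,\Gamma}\Big). \]
In the following we drop the index $\Gamma$ of the cost functional $J$. It is used only where needed to specify which cost functional is under consideration; otherwise the derived formulation applies to the distributed as well as the boundary control.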
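We show hereafter how to assemble the vector step-length $\Theta^k_{\hat n}$ at each iteration. For notational purposes we denote by $H_k$ the $k$-th iterate of the Hessian matrix $H(\nabla J(v^k), \nabla J(v^k))$ and by $D_k$ the $k$-th iterate of the Jacobian vector $D(\nabla J(v^k), \nabla J(v^k))$. The line-search is performed with a quasi-Newton technique that uses at each iteration $k$ a Hessian matrix $H_k$ and a Jacobian vector $D_k$ defined respectively by:
\[ D_k := \big(\langle\nabla J(v^k), \pi_1(\nabla J(v^k))\rangle_c, \ldots, \langle\nabla J(v^k), \pi_{\hat n}(\nabla J(v^k))\rangle_c\big)^T, \tag{28} \]
\[ (H_k)_{n,m} := \langle\pi_n(\nabla J(v^k)), \pi_m(\nabla J(v^k))\rangle_{\nabla^2 J}. \tag{29} \]
The spectral condition number of the Hessian matrix $\nabla^2 J$ is denoted by $\kappa = \kappa(\nabla^2 J) := \lambda_{max}\lambda_{min}^{-1}$, with $\lambda_{max} := \lambda_{max}(\nabla^2 J)$ the largest eigenvalue of $\nabla^2 J$ and $\lambda_{min} := \lambda_{min}(\nabla^2 J)$ its smallest eigenvalue.
According to Eq. (19) we have
\[ \Theta^k_{\hat n} = -H_k^{-1} D_k. \tag{30} \]
From Eq. (18) we have:
\[ J(v^{k+1}) = J(v^k) + (\Theta^k_{\hat n})^T D_k + \frac{1}{2}(\Theta^k_{\hat n})^T H_k \Theta^k_{\hat n}. \tag{31} \]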
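As a worked step, and assuming $H_k$ is symmetric and invertible, substituting the Newton step $\Theta^k_{\hat n} = -H_k^{-1}D_k$ into the quadratic expansion of $J(v^{k+1})$ gives the decrease obtained at each iteration:

```latex
\begin{aligned}
J(v^{k+1}) &= J(v^k) + (\Theta^k_{\hat n})^T D_k + \tfrac{1}{2}(\Theta^k_{\hat n})^T H_k \Theta^k_{\hat n} \\
           &= J(v^k) - D_k^T H_k^{-1} D_k + \tfrac{1}{2} D_k^T H_k^{-1} H_k H_k^{-1} D_k \\
           &= J(v^k) - \tfrac{1}{2} D_k^T H_k^{-1} D_k .
\end{aligned}
```

Since $H_k^{-1}$ is positive definite whenever $H_k$ is, the cost strictly decreases as long as $D_k \neq \mathbf{0}$.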
Our parallel algorithm to minimize the cost functionals of Eqs. (16) and (17) is stated as follows (see Algorithm 3).
Algorithm 3: Enhanced steepest descent algorithm for the optimal control problem.
0 Input: $v^0$
1 while not Convergence do
2     forall $1 \le n \le \hat n$ do
3         Solve $Y_n(T)(v^k_n)$ of Eq. (20) (respectively Eq. (24)) in parallel for all $1 \le n \le \hat n$;
4     end
5     Compute $P(t)$ with the backward problem according to Eq. (22) (respectively Eq. (26));
6     forall $1 \le n \le \hat n$ do
7         Compute $(D_k)_n$ of Eq. (28) in parallel for all $1 \le n \le \hat n$;
8     end
9     Gather $(D_k)_n$ from processor $n$, $2 \le n \le \hat n$, to the master processor;
10    Assemble the Hessian matrix $H_k$ according to Eq. (29) on the master processor;
11    Compute the inverse of $H_k$ and calculate $\Theta^k_{\hat n}$ using Eq. (30);
12    Broadcast $\theta^k_n$ from the master processor to all slave processors;
13    Update the time-window control variable $v^{k+1}_n$ in parallel as:
          $v^{k+1}_n = v^k_n + \theta^k_n \pi_n(\nabla J(v^k))$ for all $1 \le n \le \hat n$,
      and go to step 2;
14    $k = k + 1$;
15 end
Since the $(v_n)_n$ have disjoint time-supports, thanks to linearity, the notation $\pi_n(\nabla J(v^k))$ is nothing but $\nabla J(v^k_n)$, where $v^k$ is the concatenation of $v^k_1, \ldots, v^k_{\hat n}$. In Algorithm 3, steps 9, 10, 11, 12 and 13 are trivial tasks with regard to computational effort.
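One iteration of Algorithm 3 can be illustrated on a toy problem. The sketch below is not the heat-equation solver of the paper: it swaps in a scalar model $y' = -\sigma y + v$, $y(0) = y_0$, discretized by implicit Euler, with cost $J(v) = \tfrac12 (y(T) - y^{target})^2 + \tfrac{\alpha}{2}\int_I v^2\,dt$. It assumes the first time window carries the initial condition, and all names and parameter values are hypothetical. The window solves are written as a sequential loop where a parallel code would distribute them.

```python
# Toy sketch of one iteration of Algorithm 3 on a scalar model (NOT the paper's heat solver).

def solve(M, rhs):
    """Solve the small dense system M y = rhs (Gaussian elimination, partial pivoting)."""
    n = len(rhs)
    aug = [list(row) + [r] for row, r in zip(M, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(aug[i][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for i in range(c + 1, n):
            f = aug[i][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[i][j] -= f * aug[c][j]
    y = [0.0] * n
    for i in reversed(range(n)):
        y[i] = (aug[i][n] - sum(aug[i][j] * y[j] for j in range(i + 1, n))) / aug[i][i]
    return y

def final_state(v, y0, sigma, tau):
    """Implicit Euler for the forward problem; returns y(T)."""
    y = y0
    for vj in v:
        y = (y + tau * vj) / (1.0 + sigma * tau)
    return y

def adjoint(terminal, n_steps, sigma, tau):
    """Backward implicit Euler for the adjoint state with p(T) = terminal."""
    p, nxt = [0.0] * n_steps, terminal
    for j in reversed(range(n_steps)):
        nxt /= 1.0 + sigma * tau
        p[j] = nxt
    return p

def window(w, n, n_hat):
    """pi_n(w): restriction of w to time window n (0-based), extended by zero."""
    l = len(w) // n_hat
    return [wj if n * l <= j < (n + 1) * l else 0.0 for j, wj in enumerate(w)]

def cost(v, y0, y_target, alpha, sigma, tau):
    """Discrete cost functional of the toy problem."""
    y_T = final_state(v, y0, sigma, tau)
    return 0.5 * (y_T - y_target) ** 2 + 0.5 * alpha * tau * sum(vj * vj for vj in v)

def enhanced_step(v, n_hat, y0, y_target, alpha, sigma, tau):
    """One iteration of the time-window algorithm; returns the updated control."""
    n_steps = len(v)
    # Eqs. (20)-(21): one forward solve per window (independent tasks), then sum
    y_T = sum(final_state(window(v, n, n_hat), y0 if n == 0 else 0.0, sigma, tau)
              for n in range(n_hat))
    p = adjoint(y_T - y_target, n_steps, sigma, tau)          # Eq. (22)
    g = [alpha * vj + pj for vj, pj in zip(v, p)]             # Eq. (23)
    gn = [window(g, n, n_hat) for n in range(n_hat)]
    gn = [gw for gw in gn if any(gw)]                         # skip windows with zero gradient
    if not gn:
        return list(v)
    D = [tau * sum(x * x for x in gw) for gw in gn]           # Eq. (28)
    Hg = []                                                   # Hessian applied to each pi_n(grad J)
    for gw in gn:
        q = adjoint(final_state(gw, 0.0, sigma, tau), n_steps, sigma, tau)
        Hg.append([alpha * wj + qj for wj, qj in zip(gw, q)])
    H = [[tau * sum(a * c for a, c in zip(gi, Hgj)) for Hgj in Hg] for gi in gn]  # Eq. (29)
    theta = [-t for t in solve(H, D)]                         # Eq. (30)
    new_v = list(v)
    for tn, gw in zip(theta, gn):
        new_v = [vj + tn * wj for vj, wj in zip(new_v, gw)]
    return new_v

T, n_steps, sigma, alpha = 1.0, 40, 0.5, 1e-2
tau = T / n_steps
y0, y_target = 1.0, 2.0
v0 = [float(j % 5) - 2.0 for j in range(n_steps)]
args = (y0, y_target, alpha, sigma, tau)
v_serial = enhanced_step(v0, 1, *args)
v_windows = enhanced_step(v0, 4, *args)
print(cost(v0, *args), cost(v_serial, *args), cost(v_windows, *args))
```

Starting from the same control, the step with four windows minimizes the cost over a larger set of directions than the serial step, so its cost cannot be larger; this is the mechanism behind Theorem 2.1.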
4. Convergence analysis of Algorithm 3
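This section provides the proof of convergence of Algorithm 3. In the sequel, we suppose that $\|\nabla J(v^k)\|_c$ does not vanish; otherwise the algorithm has already converged.
Proposition 4.1. The decrease in value of the cost functional $J$ between two successive controls $v^k$ and $v^{k+1}$ is bounded below by:
\[ J(v^k) - J(v^{k+1}) \ge \frac{1}{2\kappa(H_k)}\, \frac{\|\nabla J(v^k)\|_c^4}{\|\nabla J(v^k)\|^2_{\nabla^2 J}}. \tag{32} \]
Proof. Using Eq. (30) and Eq. (31), we can write:
\[ J(v^k) - J(v^{k+1}) = \frac{1}{2} D_k^T H_k^{-1} D_k. \tag{33} \]
Preliminaries: From the definition of the Jacobian vector $D_k$ we have
\[ \begin{aligned} \|D_k\|_2^2 &= \sum_{n=1}^{\hat n} \langle\nabla J(v^k), \pi_n(\nabla J(v^k))\rangle_c^2 \\ &= \sum_{n=1}^{\hat n} \langle\pi_n(\nabla J(v^k)), \pi_n(\nabla J(v^k))\rangle_c^2 \\ &= \sum_{n=1}^{\hat n} \|\pi_n(\nabla J(v^k))\|_c^4 \\ &= \|\nabla J(v^k)\|_c^4. \end{aligned} \]
Furthermore, since $H_k$ is an SPD matrix we have $\lambda_{min}(H_k^{-1}) = \frac{1}{\lambda_{max}(H_k)}$, from which we deduce $\frac{1}{\lambda_{min}(H_k)} \ge \frac{1}{\frac{1}{\hat n}\mathbf{1}_{\hat n}^T H_k \mathbf{1}_{\hat n}}$. Moreover, we have:
\[ \begin{aligned} D_k^T H_k^{-1} D_k &= \frac{D_k^T H_k^{-1} D_k}{\|D_k\|_2^2}\, \|D_k\|_2^2 \ge \lambda_{min}(H_k^{-1})\, \|D_k\|_2^2 \\ &= \lambda_{min}(H_k^{-1})\, \frac{\lambda_{min}(H_k)}{\lambda_{min}(H_k)}\, \|\nabla J(v^k)\|_c^4 \\ &\ge \frac{\lambda_{min}(H_k)}{\lambda_{max}(H_k)}\, \frac{\|\nabla J(v^k)\|_c^4}{\frac{1}{\hat n}\mathbf{1}_{\hat n}^T H_k \mathbf{1}_{\hat n}} \\ &= \frac{\hat n}{\kappa(H_k)}\, \|\nabla J(v^k)\|^{-2}_{\nabla^2 J}\, \|\nabla J(v^k)\|_c^4. \end{aligned} \]
Since the partition number $\hat n$ is greater than or equal to 1, we conclude that:
\[ D_k^T H_k^{-1} D_k \ge \frac{\|\nabla J(v^k)\|^{-2}_{\nabla^2 J}\, \|\nabla J(v^k)\|_c^4}{\kappa(H_k)}. \tag{34} \]
Hence, using Eq. (33) we get the stated result.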
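Theorem 4.2. For any partition into $\hat n$ sub-intervals, the control sequence $(v^k)_{k\ge1}$ of Algorithm 3 converges to the optimal control $v^\star$, the unique minimizer of the quadratic functional $J$. Furthermore we have:
\[ \|v^k - v^\star\|^2_{\nabla^2 J} \le r^k\, \|v^0 - v^\star\|^2_{\nabla^2 J}, \]
where the rate of convergence $r := 1 - \frac{4\kappa}{\kappa(H_k)(\kappa+1)^2}$ satisfies $0 \le r < 1$.
Proof. We denote by $v^\star$ the optimal control that minimizes $J$. The equality
\[ J(v) = J(v^\star) + \frac{1}{2}\langle v - v^\star, v - v^\star\rangle_{\nabla^2 J} = J(v^\star) + \frac{1}{2}\|v - v^\star\|^2_{\nabla^2 J} \]
holds for any control $v$; in particular we have:
\[ J(v^{k+1}) = J(v^\star) + \frac{1}{2}\|v^{k+1} - v^\star\|^2_{\nabla^2 J}, \qquad J(v^k) = J(v^\star) + \frac{1}{2}\|v^k - v^\star\|^2_{\nabla^2 J}. \]
Consequently, by subtracting the equations above, we obtain
\[ J(v^{k+1}) - J(v^k) = \frac{1}{2}\|v^{k+1} - v^\star\|^2_{\nabla^2 J} - \frac{1}{2}\|v^k - v^\star\|^2_{\nabla^2 J}. \tag{35} \]
Since $J$ is quadratic, we have $\nabla^2 J\,(v^k - v^\star) = \nabla J(v^k)$, that is $v^k - v^\star = (\nabla^2 J)^{-1}\nabla J(v^k)$. Therefore we deduce:
\[ \begin{aligned} \|v^k - v^\star\|^2_{\nabla^2 J} &= \langle v^k - v^\star, v^k - v^\star\rangle_{\nabla^2 J} \\ &= \langle v^k - v^\star, \nabla^2 J\,(v^k - v^\star)\rangle_c \\ &= \langle(\nabla^2 J)^{-1}\nabla J(v^k), \nabla^2 J\,(\nabla^2 J)^{-1}\nabla J(v^k)\rangle_c \\ &= \langle\nabla J(v^k), (\nabla^2 J)^{-1}\nabla J(v^k)\rangle_c \\ &= \|\nabla J(v^k)\|^2_{(\nabla^2 J)^{-1}}. \end{aligned} \tag{36} \]
Because of Eq. (33), we also have
\[ J(v^{k+1}) - J(v^k) = -\frac{1}{2} D_k^T H_k^{-1} D_k. \]
Using Eq. (35) and the above, we find that:
\[ \|v^{k+1} - v^\star\|^2_{\nabla^2 J} = \|v^k - v^\star\|^2_{\nabla^2 J} - D_k^T H_k^{-1} D_k. \]
Moreover, according to Eqs. (34)-(36), we obtain the following upper bound:
\[ \begin{aligned} \|v^{k+1} - v^\star\|^2_{\nabla^2 J} &\le \|v^k - v^\star\|^2_{\nabla^2 J} - \frac{1}{\kappa(H_k)}\, \frac{\|\nabla J(v^k)\|_c^4}{\|\nabla J(v^k)\|^2_{\nabla^2 J}} \\ &\le \|v^k - v^\star\|^2_{\nabla^2 J} \left(1 - \frac{1}{\kappa(H_k)}\, \frac{\|\nabla J(v^k)\|_c^4}{\|\nabla J(v^k)\|^2_{\nabla^2 J}\, \|\nabla J(v^k)\|^2_{(\nabla^2 J)^{-1}}}\right). \end{aligned} \tag{37} \]
Using the Kantorovich inequality [14,1] (see also the Appendix):
\[ \frac{\|\nabla J(v^k)\|_c^4}{\|\nabla J(v^k)\|^2_{\nabla^2 J}\, \|\nabla J(v^k)\|^2_{(\nabla^2 J)^{-1}}} \ge \frac{4\lambda_{max}\lambda_{min}}{(\lambda_{max} + \lambda_{min})^2}. \tag{38} \]
Then
\[ 1 - \frac{1}{\kappa(H_k)}\, \frac{\|\nabla J(v^k)\|_c^4}{\|\nabla J(v^k)\|^2_{\nabla^2 J}\, \|\nabla J(v^k)\|^2_{(\nabla^2 J)^{-1}}} \le 1 - \frac{4\kappa}{\kappa(H_k)(\kappa+1)^2}. \]
Finally we obtain the desired result for any partition into $\hat n$ subdivisions, namely
\[ \|v^k - v^\star\|^2_{\nabla^2 J} \le \left(1 - \frac{4\kappa}{\kappa(H_k)(\kappa+1)^2}\right)^k \|v^0 - v^\star\|^2_{\nabla^2 J}. \]
The proof is therefore complete.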
Remark 4.1. The proof remains valid for the boundary control; one only needs to replace the subscript "c", which indicates the distributed control region $\Omega_c$, by "$\Gamma$", which indicates the boundary control on $\Gamma$.
Remark 4.2. For $\hat n = 1$, we immediately get the condition number $\kappa(H_k) = 1$ and we recognize the serial steepest gradient method, which has convergence rate $\left(\frac{\kappa-1}{\kappa+1}\right)^2$.
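The reduction to the serial rate can be verified directly. For $\hat n = 1$ the matrix $H_k$ is a scalar, so $\kappa(H_k) = 1$, and the rate $r$ of Theorem 4.2 becomes:

```latex
r = 1 - \frac{4\kappa}{(\kappa+1)^2}
  = \frac{(\kappa+1)^2 - 4\kappa}{(\kappa+1)^2}
  = \frac{\kappa^2 - 2\kappa + 1}{(\kappa+1)^2}
  = \left(\frac{\kappa-1}{\kappa+1}\right)^2 .
```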
It is difficult to pre-estimate the spectral condition number $\kappa(H_k)(\hat n)$ (a function of $\hat n$), which plays an important role in the rate of convergence stated by our theoretical result. We present in what follows numerical results that demonstrate the efficiency of our algorithm. The tests consider examples of well-posed and ill-posed control problems.
5. Numerical experiments
We shall present the numerical validation of our method in two stages. In the first stage, we consider a linear algebra framework where we construct a random matrix-based quadratic cost function that we minimize using Algorithm 2. In the second stage, we consider the two optimal control problems presented in Sections 3.1 and 3.2 for the distributed and Dirichlet boundary control respectively. In both cases we minimize a quadratic cost function properly defined for each control problem considered.
5.1. Linear algebra program. This subsection deals with the implementation of Algorithm 2. The program was implemented using the scientific programming language Scilab [26]. We consider the minimization of a quadratic form $q$ where the SPD $m$-by-$m$ matrix $A$ and the real vector $b \in \mathbb{R}^m \cap \mathrm{range}(A)$ are generated by hand (see below for their construction). We aim at solving iteratively the linear system $Ax = b$ by minimizing
\[ q(x) = \frac{1}{2}x^T A x - x^T b. \tag{39} \]
Let us denote by $\hat n$ the partition number of the unknown $x \in \mathbb{R}^m$. The partition is supposed to be uniform and we assume that $\hat n$ divides $m$ with zero remainder.
We give in Table 1 a Scilab function that builds the vector step-length $\Theta^k_{\hat n}$ as stated in Eq. (4). In practice we randomly generate an SPD sparse matrix $A = (\alpha + \gamma m) I_{\mathbb{R}^m} + R$, where $0 < \alpha < 1$, $\gamma > 1$, $I_{\mathbb{R}^m}$ is the $m$-by-$m$ identity matrix and $R$ is a symmetric $m$-by-$m$ random matrix. This way the matrix $A$ is symmetric and diagonally dominant, hence SPD. It is worth noticing that $\alpha$ plays a regularizing role when rapidly vanishing eigenvalues of $A$ are generated randomly. This technique helps us to manipulate the coercivity of the problem at hand, and hence its spectral condition number.
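The construction of the test matrix can be sketched as follows. The distribution of $R$ is not specified in the text, so this example assumes a dense symmetric $R$ with entries drawn uniformly from $[0,1)$; under that assumption $\gamma m$ exceeds every off-diagonal row sum, which gives strict diagonal dominance. The function names and parameter values are hypothetical.

```python
import random

def random_spd_matrix(m, alpha, gamma, seed=0):
    """Build A = (alpha + gamma*m) I + R, with R symmetric and entries in [0, 1) (assumed)."""
    rng = random.Random(seed)
    R = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i, m):
            R[i][j] = R[j][i] = rng.random()
    shift = alpha + gamma * m
    return [[R[i][j] + (shift if i == j else 0.0) for j in range(m)] for i in range(m)]

def is_strictly_diagonally_dominant(A):
    """Check |a_ii| > sum_{j != i} |a_ij| for every row i."""
    return all(abs(row[i]) > sum(abs(a) for j, a in enumerate(row) if j != i)
               for i, row in enumerate(A))

A = random_spd_matrix(m=8, alpha=0.5, gamma=1.5)
print(is_strictly_diagonally_dominant(A))
```

A symmetric, strictly diagonally dominant matrix with positive diagonal has only positive eigenvalues (Gershgorin's theorem), which is why the matrix built this way is SPD.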
For such a matrix $A$ we proceed to minimize the quadratic form defined in Eq. (39) with several $\hat n$-subdivisions.
The improvement of the algorithm over the serial case $\hat n = 1$ in terms of iteration number is presented in Figure 1. The left-hand side of Figure 1 presents the minimization of the cost function versus the iteration number of the algorithm, for several choices of the partition number $\hat n$. On the right-hand side of Figure 1 we give the logarithmic representation of the relative error $\frac{\|x^k - x^\star\|_2}{\|x^\star\|_2}$, where $x^\star$ is the exact solution of the linear system at hand.
5.2. Heat optimal control program. We discuss in this subsection the implementation results of Algorithm 3 for the optimization problems presented in Section 3. Our tests deal with the 2D heat equation on the bounded domain $\Omega = [0,1]\times[0,1]$. We consider three types of test problems in both cases of distributed and Dirichlet controls. Tests vary according to the theoretical difficulty of the control problem [6,3,10]. Indeed, we vary the regularization parameter $\alpha$ and also change the initial and target solutions in order to handle more severe control problems, as has been tested for instance in [6].
Numerical tests concern the minimization of the quadratic cost functionals $J(v)$ and $J_\Gamma(v_\Gamma)$ using Algorithm 3. It is well known that when $\alpha$ vanishes the control problem becomes an "approximated" controllability problem. Therefore the control variable tries to produce a solution that reaches the target solution as closely as it can. With this strategy, we accentuate the ill-conditioning of the problem at hand. We also consider an ill-posed problem for the controllability approximation, where the target solution does not belong to the space of reachable solutions. Thus no solution exists for the optimization problem, i.e. no control exists that enables reaching the given target.

    function [P] = Build_Hk(n, A, b, xk, dJk)
        // Build the vector step-length theta of Eq. (4) and return it
        // as a bloc-diagonal matrix P (theta(i) on bloc i).
        m = size(A, 1); l = m/n; ii = modulo(m, n);
        if ii ~= 0 then
            printf("Please choose another n!");
            abort;
        end
        // Column i of dJkn holds bloc i of the gradient, extended by zero.
        dJkn = zeros(m, n); Dk = zeros(n, 1);
        for i = 1:n
            dJkn((i-1)*l+1 : i*l, i) = dJk((i-1)*l+1 : i*l);
            Dk(i) = dJkn(:, i)' * (A*xk - b);
        end
        // Symmetric Hessian: Hk(i,j) = dJkn(:,i)' * A * dJkn(:,j).
        Hk = zeros(n, n);
        for i = 1:n
            for j = i:n
                Hktmp = A * dJkn(:, j);
                Hk(i, j) = dJkn(:, i)' * Hktmp;
                Hk(j, i) = Hk(i, j);
            end
        end
        theta = -Hk\Dk;
        P = eye(m, m);
        for i = 1:n
            P((i-1)*l+1 : i*l, (i-1)*l+1 : i*l) = theta(i) .* eye(l, l);
        end
    endfunction

Table 1. Scilab function to build the vector step length, for the linear algebra program.
Figure 1. Performance in terms of iteration number for several decompositions $\hat n$. Results from the linear algebra Scilab program.
For reading convenience and in order to emphasize the role of the parameter $\alpha$ in the numerical tests, we tag the problems that we shall consider as $P^\alpha_i$, where the index $i$ refers to the problem among $\{1,2,3,4\}$. The table below summarizes all the numerical tests that we shall carry out.

                                      | Minimize $J(v)$, distributed control                        | Minimize $J_\Gamma(v_\Gamma)$, boundary control
Moderate $\alpha = 1\times10^{-2}$    | well-posed problem; data in ($P^\alpha_1$), ($P^\alpha_2$)  | ill-posed problem; data in ($P^\alpha_3$)
Vanishing $\alpha = 1\times10^{-8}$   | ill-posed problem; data in ($P^\alpha_1$), ($P^\alpha_2$)   | severely ill-posed problem; data in ($P^\alpha_3$)
Solution does not exist               | severely ill-posed problem; data in ($P^\alpha_4$)          | severely ill-posed problem; data in ($P^\alpha_4$)
We suppose from now on that the computational domain $\Omega$ is a polygonal domain of the plane $\mathbb{R}^2$. We then introduce a triangulation $\mathcal{T}_h$ of $\Omega$; the subscript $h$ stands for the largest edge length of the triangles that constitute $\mathcal{T}_h$. The solution of the heat equation at a given time $t$ belongs to $H^1(\Omega)$. The source terms and other variables are elements of $L^2(\Omega)$. These infinite-dimensional spaces are therefore approximated by the finite-dimensional space $V_h$, characterized by $P_1$, the space of polynomials of degree $\le 1$ in two variables $(x_1, x_2)$. We have $V_h := \{u_h \mid u_h \in C^0(\Omega),\ u_h|_K \in P_1 \text{ for all } K \in \mathcal{T}_h\}$. In addition, Dirichlet boundary conditions (where the solution is in $H^1_0(\Omega)$, i.e. vanishing on the boundary $\partial\Omega$) are taken into account via penalization of the vertices on the boundaries. The time dependence of the solution is approximated via the implicit Euler scheme. The matrix inversions are performed by the UMFPACK solver. We use the trapezoidal method in order to approximate integrals defined on the time interval.
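The time discretization described above can be illustrated with a deliberately simplified sketch. It swaps the paper's $P_1$ finite elements, FreeFem++ and UMFPACK for a 1D finite-difference Laplacian with homogeneous Dirichlet data and a tridiagonal (Thomas) solve; only the implicit Euler scheme and the trapezoidal rule correspond to the text. All names and the 1D test data are assumptions made for this example.

```python
import math

def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system; lower[0] and upper[-1] are unused."""
    n = len(rhs)
    c, d = [0.0] * n, [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def implicit_euler_heat(y0, sigma, h, tau, n_steps):
    """Advance y_t - sigma*y_xx = 0 on interior nodes: (I + tau*sigma*K) y^{j+1} = y^j."""
    n = len(y0)
    r = sigma * tau / h ** 2
    lower, diag, upper = [-r] * n, [1.0 + 2.0 * r] * n, [-r] * n
    y = list(y0)
    history = [y]
    for _ in range(n_steps):
        y = thomas(lower, diag, upper, y)
        history.append(y)
    return history

def trapezoid(values, tau):
    """Composite trapezoidal rule on a uniform time grid with step tau."""
    return tau * (0.5 * values[0] + sum(values[1:-1]) + 0.5 * values[-1])

n, h = 49, 1.0 / 50                      # interior nodes of (0, 1)
sigma, tau, T = 0.01, 0.01, 6.4          # values quoted in the experiments
y0 = [math.sin(math.pi * (i + 1) * h) for i in range(n)]
history = implicit_euler_heat(y0, sigma, h, tau, round(T / tau))
# Time integral of the squared discrete L2 norm of the state
energy = trapezoid([h * sum(yi * yi for yi in y) for y in history], tau)
print(history[-1][n // 2], energy)
```

For this sine initial condition the continuous solution decays like $e^{-\sigma\pi^2 t}$, which gives a simple reference value for the computed state at the midpoint.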
The numerical experiments were run on a parallel machine with 24 AMD CPUs at 800 MHz in a Linux environment. We coded two FreeFem++ [22] scripts, one for the distributed and one for the Dirichlet control. We use the MPI library in order to achieve parallelism.
Tests that concern the distributed control problem are produced with a control that acts on $\Omega_c \subset \Omega$, with $\Omega_c = [0,\tfrac{1}{3}]\times[0,\tfrac{1}{3}]$, whereas for the Dirichlet boundary control problem the control acts on $\Gamma \subset \partial\Omega$, with $\Gamma = \{(x_1,x_2) \in \partial\Omega \mid x_2 = 0\}$. The time horizon of the problem is fixed to $T = 6.4$ and the small time step is $\tau = 0.01$. In order to have a better control of the time evolution we set the diffusion coefficient to $\sigma = 0.01$.
5.2.1. First test problem: moderate Tikhonov regularization parameter $\alpha$. We consider an optimal control problem on the heat equation. The control is considered first to be distributed and then Dirichlet. For the distributed optimal control problem we first use the functions
\[
\begin{aligned}
y_0(x_1, x_2) &= \exp\left(-\gamma\, 2\pi \left((x_1 - 0.7)^2 + (x_2 - 0.7)^2\right)\right),\\
y_{target}(x_1, x_2) &= \exp\left(-\gamma\, 2\pi \left((x_1 - 0.3)^2 + (x_2 - 0.3)^2\right)\right),
\end{aligned}
\tag{$P^\alpha_1$}
\]
as initial condition and target solution respectively. The real-valued parameter $\gamma$ is introduced to force the Gaussian to have support strictly included in the domain and to satisfy the boundary conditions. The aim is to minimize the cost functional defined in Eq. (6). The decay of the cost function with respect to the iterations of our algorithm is presented in Figure 2 on the left side, and the same results are given with respect to the computational CPU time (in seconds) on the right side. The algorithm accelerates as the partition number $\hat n$ increases, while preserving the accuracy of the solution. Indeed, all tests converge to the same solution independently of $\hat n$. This is in agreement with Theorem 4.2, which proves the convergence of the algorithm to the optimal control (unique if it exists [16]) for an arbitrary partition choice $\hat n$.
Figure 2. First test problem, for $(P^\alpha_1)$: normalized and shifted cost functional values versus iteration number (left) and versus computational time (right) for several values of $\hat n$ (i.e. the number of processors used).
Figure 3. Snapshots for $\hat n = 1, 16$ of the distributed optimal control (left columns) and its corresponding controlled final state at time $T$, $y(T)$ (right columns). The test case corresponds to the control problem $(P^\alpha_1)$, where $\alpha = 1\times 10^{-2}$. The same results apply for other choices of $\hat n$.
We test a second problem with an a priori known solution of the heat equation. The considered problem has
\[
\begin{aligned}
y_0(x_1, x_2) &= \sin(\pi x_1)\sin(\pi x_2),\\
y_{target}(x_1, x_2) &= \exp(-2\pi^2 \sigma T)\,\sin(\pi x_1)\sin(\pi x_2),
\end{aligned}
\tag{$P^\alpha_2$}
\]
as initial condition and target solution respectively. Note that the target solution is taken as a solution of the heat equation at time $T$. The results of this test are presented in Figure 4, which shows the decay in values of the cost functional versus the iterations of the algorithm on the left side and versus the computational CPU time (in seconds) on the right side.
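The statement that the target is a solution of the heat equation at time $T$ can be checked by hand. The short derivation below is a sketch under two assumptions that are not spelled out in this section: the state equation is the homogeneous heat equation $\partial_t y - \sigma \Delta y = 0$ on the unit square with homogeneous Dirichlet conditions, and the exponent of the target carries a minus sign.

```latex
\begin{aligned}
y(x_1, x_2, t) &= e^{-2\pi^2 \sigma t}\,\sin(\pi x_1)\sin(\pi x_2),\\
\Delta y &= -\pi^2 y - \pi^2 y = -2\pi^2\, y,\\
\partial_t y &= -2\pi^2 \sigma\, y = \sigma\, \Delta y,\\
y(\cdot, 0) &= \sin(\pi x_1)\sin(\pi x_2) = y_0,
\qquad
y(\cdot, T) = e^{-2\pi^2 \sigma T}\,\sin(\pi x_1)\sin(\pi x_2) = y_{target}.
\end{aligned}
```

Under these assumptions the uncontrolled state already reaches the target at time $T$, so the zero control is admissible and the optimal cost is essentially limited by discretization error.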
Figure 4. First test problem, for $(P^\alpha_2)$: normalized cost functional values versus computational CPU time for several values of $\hat n$ (i.e. the number of processors used).
We give in Figure 3 and Figure 5 several rows of snapshots (one row per value of $\hat n$) of the control and its corresponding controlled final solution $y(T)$. Notice the stability and the accuracy of the method for any choice of $\hat n$. In particular, the shape of the resulting optimal control is unique, and the controlled solution $y(T)$ does not depend on $\hat n$.
For the Dirichlet boundary control problem we choose the following functions as source term, initial condition and target solution:
\[
\begin{aligned}
f(x_1, x_2, t) &= 3\pi^3 \sigma \exp(2\pi^2 \sigma t)\,(\sin(\pi x_1) + \sin(\pi x_2)),\\
y_0(x_1, x_2) &= \pi\,(\sin(\pi x_1) + \sin(\pi x_2)),\\
y_{target}(x_1, x_2) &= \pi \exp(2\pi^2 \sigma)\,(\sin(\pi x_1) + \sin(\pi x_2)),
\end{aligned}
\tag{$P^\alpha_3$}
\]
respectively. Because of the ill-posed character of this problem, its optimization leads to results with high contrast in scale. We therefore preferred to summarize the optimization results in Table 3 instead of figures.
Remark 5.1. Because of the linearity and the superposition property of the heat equation, it can be shown that problems $(P^\alpha_2)$ and $(P^\alpha_3)$ mentioned above are equivalent to a control problem which has a null target solution.
Figure 5. Snapshots for $\hat n = 1, 16$ of the distributed optimal control (left columns) and its corresponding controlled final state at time $T$, $y(T)$ (right columns). The test case corresponds to the control problem $(P^\alpha_2)$, where $\alpha = 1\times 10^{-2}$. The same results apply for other choices of $\hat n$.
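Test problem $(P^\alpha_1)$, $\alpha = 1\times 10^{-2}$:

| Quantity | $\hat n = 1$ | $\hat n = 2$ | $\hat n = 4$ | $\hat n = 8$ | $\hat n = 16$ |
|---|---|---|---|---|---|
| Number of iterations $k$ | 100 | 68 | 63 | 49 | 27 |
| Wall time (s) | 15311.6 | 15352.3 | 14308.7 | 10998.2 | 6354.56 |
| $\lVert Y_k(T) - y_{target}\rVert_2 / \lVert y_{target}\rVert_2$ | 0.472113 | 0.472117 | 0.472111 | 0.472104 | 0.472102 |
| $\int_{(0,T)} \lVert v_k\rVert_c^2\,dt$ | 0.0151685 | 0.0151509 | 0.0151727 | 0.0152016 | 0.015214 |

Test problem $(P^\alpha_2)$, $\alpha = 1\times 10^{-2}$:

| Quantity | $\hat n = 1$ | $\hat n = 2$ | $\hat n = 4$ | $\hat n = 8$ | $\hat n = 16$ |
|---|---|---|---|---|---|
| Number of iterations $k$ | 60 | 50 | 45 | 40 | 35 |
| Wall time (s) | 3855.21 | 3726.28 | 4220.92 | 3778.13 | 3222.78 |
| $\lVert Y_k(T) - y_{target}\rVert_2 / \lVert y_{target}\rVert_2$ | $8.26\times 10^{-8}$ | $8.26\times 10^{-8}$ | $8.15\times 10^{-8}$ | $8.15\times 10^{-8}$ | $8.14\times 10^{-8}$ |
| $\int_{(0,T)} \lVert v_k\rVert_c^2\,dt$ | $1.68\times 10^{-7}$ | $1.68\times 10^{-7}$ | $1.72\times 10^{-7}$ | $1.72\times 10^{-7}$ | $1.72\times 10^{-7}$ |

Test problem $(P^\alpha_2)$, $\alpha = 1\times 10^{-8}$:

| Quantity | $\hat n = 1$ | $\hat n = 2$ | $\hat n = 4$ | $\hat n = 8$ | $\hat n = 16$ |
|---|---|---|---|---|---|
| Number of iterations $k$ | 60 | 50 | 40 | 30 | 20 |
| Wall time (s) | 3846.23 | 4654.34 | 3759.98 | 2835.31 | 1948.4 |
| $\lVert Y_k(T) - y_{target}\rVert_2 / \lVert y_{target}\rVert_2$ | $3.93\times 10^{-8}$ | $1.14\times 10^{-8}$ | $5.87\times 10^{-9}$ | $2.04\times 10^{-9}$ | $1.76\times 10^{-9}$ |
| $\int_{(0,T)} \lVert v_k\rVert_c^2\,dt$ | $5.42\times 10^{-7}$ | $4.13\times 10^{-6}$ | $2.97\times 10^{-4}$ | $3.64\times 10^{-3}$ | $2.51\times 10^{-3}$ |

Table 2. Summary of the results of Algorithm 3 applied to the distributed control problems $(P^\alpha_1)$ and $(P^\alpha_2)$.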
5.2.2. Second test problem: vanishing Tikhonov regularization parameter $\alpha$. In this section, we are concerned with the "approximate" controllability of the heat equation, where the regularization parameter $\alpha$ vanishes; in practice we take $\alpha = 1\times 10^{-8}$. In this case, problems $(P^\alpha_2)$ and $(P^\alpha_3)$ are supposed to be well posed in the continuous setting (see for instance [9, 21]). However, this may not be the case in the discretized setting; we refer for instance to [10] (and references therein) for more details.
Figure 6. Several rows of snapshots (one row per value of $\hat n$) of the Dirichlet optimal control (left columns) and its corresponding controlled final state at time $T$, $y(T)$ (right columns). The test case corresponds to the control problem $(P^\alpha_3)$, where $\alpha = 1\times 10^{-2}$.
Figure 7. Normalized and shifted cost functional values versus computational CPU time for several values of $\hat n$ (i.e. the number of processors used). Distributed control problem $(P^\alpha_2)$ with $\alpha = 1\times 10^{-8}$.
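Test problem $(P^\alpha_3)$, $\alpha = 1\times 10^{-2}$:

| Quantity | $\hat n = 1$ | $\hat n = 2$ | $\hat n = 4$ | $\hat n = 8$ | $\hat n = 16$ |
|---|---|---|---|---|---|
| Number of iterations | 40 | 40 | 30 | 18 | 10 |
| Wall time (s) | 12453.9 | 12416.1 | 9184.28 | 5570.54 | 3158.97 |
| $\lVert Y_\Gamma(T) - y_{target}\rVert_2 / \lVert y_{target}\rVert_2$ | $8.54\times 10^{+6}$ | 0.472488 | 0.0538509 | 0.0533826 | 0.0534024 |
| $\int_{(0,T)} \lVert v\rVert_\Gamma^2\,dt$ | $2.79\times 10^{+8}$ | $1.96\times 10^{+7}$ | 31.4193 | 138.675 | 275.08 |

Test problem $(P^\alpha_3)$, $\alpha = 1\times 10^{-8}$:

| Quantity | $\hat n = 1$ | $\hat n = 2$ | $\hat n = 4$ | $\hat n = 8$ | $\hat n = 16$ |
|---|---|---|---|---|---|
| Number of iterations | 40 | 40 | 30 | 27 | 10 |
| Wall time (s) | 1248.85 | 1248.97 | 916.232 | 825.791 | 325.16 |
| $\lVert Y_\Gamma(T) - y_{target}\rVert_2 / \lVert y_{target}\rVert_2$ | $8.85\times 10^{+6}$ | 0.151086 | 0.0292072 | 0.0278316 | 0.0267375 |
| $\int_{(0,T)} \lVert v\rVert_\Gamma^2\,dt$ | $7.92\times 10^{+8}$ | $2.30\times 10^{+7}$ | $1.27\times 10^{+7}$ | $1.47\times 10^{+7}$ | $1.58\times 10^{+6}$ |

Table 3. Summary of the results of Algorithm 3 applied to the Dirichlet boundary control problem $(P^\alpha_3)$.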
Table 2 contains the summarized results for the convergence of the distributed control problem. On the one hand, we are interested in the error given by our algorithm for several choices of the partition number $\hat n$. On the other hand, we give the $L^2(0, T; L^2(\Omega_c))$ norm of the control. We notice the improvement in the quality of the algorithm in terms of both execution time and control energy consumption, namely the quantity $\int_{(0,T)} \lVert v_k\rVert_c^2\,dt$. In fact, for the optimal control framework ($(P^\alpha_1)$ and $(P^\alpha_2)$ with $\alpha = 1\times 10^{-2}$), we see that, for a fixed stopping criterion, the algorithm is faster and consumes the same energy independently of $\hat n$. In the approximate controllability framework ($(P^\alpha_2)$ with the vanishing value $\alpha = 1\times 10^{-8}$), we note first that the general accuracy of the controlled solution (see the error $\lVert Y_k(T) - y_{target}\rVert_2 / \lVert y_{target}\rVert_2$) is improved with $\alpha = 1\times 10^{-8}$ compared with $\alpha = 1\times 10^{-2}$. Second, we note that the error diminishes when $\hat n$ increases, whereas the energy consumption rises. The scalability in CPU time and number of iterations shows the enhancement brought by our method when it is applied (i.e. for $\hat n > 1$).
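The scalability claim can be quantified directly from the wall times reported in Table 2. The snippet below is a small illustrative sketch: the wall times for $(P^\alpha_1)$ with $\alpha = 1\times 10^{-2}$ are copied from the table, while the helper name and the definitions of speedup (serial time over parallel time) and efficiency (speedup divided by $\hat n$) are the usual conventions, assumed here rather than taken from the paper.

```python
# Wall times in seconds for (P1), alpha = 1e-2, keyed by the partition number n_hat.
walltime_p1 = {1: 15311.6, 2: 15352.3, 4: 14308.7, 8: 10998.2, 16: 6354.56}

def speedup_and_efficiency(walltimes):
    """Return {n_hat: (speedup, efficiency)} relative to the serial run n_hat = 1."""
    serial = walltimes[1]
    return {n: (serial / t, serial / (t * n)) for n, t in walltimes.items()}

results = speedup_and_efficiency(walltime_p1)
for n_hat, (speedup, efficiency) in results.items():
    print(f"n_hat={n_hat:2d}  speedup={speedup:5.2f}  efficiency={efficiency:5.2f}")
```

Read this way, the gain comes mostly from the reduced iteration count at larger $\hat n$, since each iteration still involves solves over the full time horizon.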
Table 3 contains the summarized results at convergence of the Dirichlet boundary control problem. This problem is known in the literature for its ill-posedness, where it may be singular in several cases; see [3] and references therein. In fact, it is very sensitive to noise in the data. We show in Table 3 that for a large value of the regularization parameter $\alpha$ our algorithm already behaves as in the distributed optimal control case with a vanishing $\alpha$, in the sense that it consumes more control energy to produce a more accurate solution with a smaller execution CPU time. It is worth noting that the serial case $\hat n = 1$ fails to reach an acceptable solution, whereas the algorithm behaves well as $\hat n$ rises.
Figure 8. Several rows of snapshots (one row per value of $\hat n$) of the distributed optimal control (left columns) and its corresponding controlled final state at time $T$, $Y(T)$ (right columns). The test case corresponds to the control problem $(P^\alpha_2)$, where $\alpha = 1\times 10^{-8}$.
Figure 9. Several rows of snapshots (one row per value of $\hat n$) of the Dirichlet optimal control (left columns) and its corresponding controlled final state at time $T$, $Y_\Gamma(T)$ (right columns). The test case corresponds to the control problem $(P^\alpha_3)$, where $\alpha = 1\times 10^{-8}$.
We give in Figure 6 and Figure 9 several rows of snapshots (one row per value of $\hat n$) of the Dirichlet control on $\Gamma$. We present in the first column its evolution during $[0, T]$ and in the second column its corresponding controlled final solution $y(T)$ at time $T$; we scaled the plots to the z-range of the target solution in both Figures 6 and 9.
In each row one sees the control and its solution for a specific partition $\hat n$. The serial case $\hat n = 1$ leads to a controlled solution which does not have the same scale as $y_{target}$, whereas the behavior of the algorithm improves as $\hat n$ rises.
It is worth noting that the control is generally active only around the final horizon time $T$. This is very clear in Figure 6 and Figure 9 (see the first row, i.e. the case $\hat n = 1$). The nature of our algorithm, which is based on time domain decomposition, obliges the control to act in subintervals. Hence, the control acts more often and earlier in time (before $T$) and leads to a better controlled solution $y(T)$.
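The way a time partition forces the control to act on subintervals can be sketched with indicator masks. This is only an illustration under simplifying assumptions: the control is reduced to one scalar value per time step, the windows are uniform, and the gradient is a random placeholder rather than the output of an adjoint solve. The names `window_masks` and `directions` are hypothetical.

```python
import numpy as np

def window_masks(n_steps, n_windows):
    """One boolean mask per time window over the step indices 0..n_steps-1."""
    edges = np.round(np.linspace(0, n_steps, n_windows + 1)).astype(int)
    masks = np.zeros((n_windows, n_steps), dtype=bool)
    for w in range(n_windows):
        masks[w, edges[w]:edges[w + 1]] = True
    return masks

# Values quoted in the text: T = 6.4, tau = 0.01; n_hat = 16 windows.
T, tau, n_hat = 6.4, 0.01, 16
n_steps = round(T / tau)
masks = window_masks(n_steps, n_hat)

# Placeholder for the time-dependent gradient of the cost functional.
gradient = np.random.default_rng(0).standard_normal(n_steps)

# Row w is the gradient switched "on" in window w and "off" elsewhere.
directions = masks * gradient
print(directions.shape)
```

Each row is one descent direction, and the rows sum back to the full gradient, so the serial steepest descent direction is the special case where all windows share the same step length.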
5.2.3. Third test problem: severely ill-posed problem (no solution). In this test case, we consider a severely ill-posed problem. In fact, the target solution is piecewise Lipschitz continuous, so that it is not regular enough compared with the solution of the heat equation. This implies that in our control problem, both the distributed and the Dirichlet boundary control have no solution. The initial condition and the target solution are given by
\[
\begin{aligned}
y_0(x_1, x_2) &= \pi\,(\sin(\pi x_1) + \sin(\pi x_2)),\\
y_{target}(x_1, x_2) &= \min\left(x_1,\ x_2,\ (1 - x_1),\ (1 - x_2)\right),
\end{aligned}
\tag{$P^\alpha_4$}
\]
respectively. Plots of the initial condition and the target solution are given in Figure 10.
Figure 10. Graph of the initial and target solutions for both the distributed and the Dirichlet boundary control problems.
In Figures 11 and 12 we plot the controlled solution at time $T$ for the distributed and Dirichlet control problems respectively. We remark that for the distributed control problem the controlled solution is smooth except in $\Omega_c$, where the control is able to fit the target solution.
Figure 11. Several snapshots, for different values of $\hat n$, of the final state at time $T$, $Y(T)$. The test case corresponds to the distributed control of the severely ill-posed problem $(P^\alpha_4)$.
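Distributed control, test problem $(P^\alpha_4)$, $\alpha = 1\times 10^{-8}$:

| Quantity | $\hat n = 1$ | $\hat n = 2$ | $\hat n = 4$ | $\hat n = 8$ | $\hat n = 16$ |
|---|---|---|---|---|---|
| Number of iterations | 100 | 68 | 60 | 50 | 40 |
| Wall time (s) | 6381.43 | 6303.67 | 5548.16 | 4676.83 | 3785.97 |
| $\lVert Y(T) - y_{target}\rVert_2 / \lVert y_{target}\rVert_2$ | $8.16\times 10^{-3}$ | $5.3\times 10^{-3}$ | $4.74\times 10^{-3}$ | $3.95\times 10^{-3}$ | $3.76\times 10^{-3}$ |
| $\int_{(0,T)} \lVert v\rVert_c^2\,dt$ | 0.34 | 3.01 | 52.87 | 52.77 | 2660.87 |

Dirichlet control, test problem $(P^\alpha_4)$, $\alpha = 1\times 10^{-8}$:

| Quantity | $\hat n = 1$ | $\hat n = 2$ | $\hat n = 4$ | $\hat n = 8$ | $\hat n = 16$ |
|---|---|---|---|---|---|
| Number of iterations | 25 | 25 | 20 | 4 | 1 |
| Wall time (s) | 848.58 | 655.40 | 655.40 | 146.19 | 62.87 |
| $\lVert Y_\Gamma(T) - y_{target}\rVert_2 / \lVert y_{target}\rVert_2$ | $2.85\times 10^{+10}$ | 3055 | 39.3 | 0.2 | 0.067 |
| $\int_{(0,T)} \lVert v\rVert_\Gamma^2\,dt$ | $6.73\times 10^{+8}$ | $2.17\times 10^{+7}$ | 141.62 | 17.84 | 26758.5 |

Table 4. Summary of the results of Algorithm 3 applied to both the distributed and the Dirichlet boundary control for the third test problem $(P^\alpha_4)$.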
Figure 12. Several snapshots, for different values of $\hat n$, of the final state at time $T$, $Y_\Gamma(T)$. The test case corresponds to the Dirichlet control of the severely ill-posed problem $(P^\alpha_4)$.
Remark 5.2. Out of curiosity, we tested the case where the control is distributed on the whole domain. We see that the control succeeds in fitting the controlled solution to the target even though the target is not $C^1(\Omega)$. This is impressive and shows the impact on the results of the regions where the control is distributed.
We note the stability of the method in the distributed test case. However, the Dirichlet test case presents hypersensitivity. In fact, in the case $\hat n = 1$ the algorithm succeeds in fitting an acceptable shape of the controlled solution, although it is still far off in scale. We note that the time domain decomposition leads to a control which gives a good scale of the controlled solution.
In this severely ill-posed problem, we see that some partitions may fail to produce a control that fits the controlled solution to the target. There is an exception for the case of $\hat n = 8$ partitions, where we have a good reconstruction of the target. The summarized results are given in Table 4.
5.2.4. Regularization based on the optimal choice of partition. The next discussion concerns the kind of situation where the partition leads to multiple solutions, which is common in ill-posed problems. In fact, we discuss a regularization procedure used as an exception-handling tool to choose the best partition, giving the best solution of the control problem at hand.
It is well known that ill-posed problems are very sensitive to noise, which may be due to numerical approximation or to physical phenomena. In that case, a numerical algorithm may blow up and fail. We present several numerical tests for the Dirichlet boundary control, which is a numerically nontrivial problem. The results show that, in general, time domain decomposition may improve the results in several cases, but scalability is not guaranteed as it is for the distributed control. We propose a regularization procedure in order to avoid the blow-up and also to guarantee the optimal choice of partition of the time domain. This procedure is based on a test of the monotonicity of the cost function. Suppose, for instance, that we have 64 processors to run the numerical problem. Once we have assembled the Hessian $H_k$ and the Jacobian $D_k$ for the partition $\hat n = 64$, we are able to get for free the Hessian and the Jacobian for all partitions $\hat n$ that divide 64. Hence, we can use the quadratic property of the cost functional in order to predict and test the value of the cost function for the next iteration without any additional computation. The formula is given by:
\[
J(v_{k+1}) = J(v_k) - \frac{1}{2}\, D_k^T H_k^{-1} D_k .
\]
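The prediction formula $J(v_{k+1}) = J(v_k) - \frac{1}{2} D_k^T H_k^{-1} D_k$ can be checked numerically on a toy quadratic. The sketch below rests on assumptions that are not restated in this section: the cost is $J(v) = \frac{1}{2} v^T A v - b^T v$ with $A$ symmetric positive definite, the descent directions $d_i$ are the gradient restricted to contiguous blocks, the small matrix has entries $d_i^T A d_j$ and the small vector has entries $d_i^T \nabla J$. The sizes and the random data are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n_hat = 8, 4                       # toy dimension and number of blocks

B = rng.standard_normal((m, m))
A = B @ B.T + m * np.eye(m)           # SPD Hessian of the toy quadratic
b = rng.standard_normal(m)

def J(v):
    """Toy quadratic cost J(v) = 0.5 v^T A v - b^T v."""
    return 0.5 * v @ A @ v - b @ v

v = rng.standard_normal(m)
g = A @ v - b                         # gradient of J at v

# Block descent directions: the gradient restricted to each block.
d = np.zeros((n_hat, m))
for i, idx in enumerate(np.array_split(np.arange(m), n_hat)):
    d[i, idx] = g[idx]

H = d @ A @ d.T                       # small Hessian, H[i, j] = d_i^T A d_j
D = d @ g                             # small Jacobian, D[i] = d_i^T g
theta = np.linalg.solve(H, D)         # vector step-length (Newton step in theta)

v_next = v - theta @ d                # v - sum_i theta_i d_i
predicted = J(v) - 0.5 * D @ theta    # J(v) - 0.5 D^T H^{-1} D
print(J(v_next), predicted)
```

For a quadratic cost the two printed values agree to rounding error, which is what makes the test of monotonicity free once the small Hessian and Jacobian are available.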
We present in Algorithm 4 the technique that enables us to reduce in rank and compute a series of Hessians and Jacobians for any partition $\hat n$ that divides the available number of processors. An example of the applicability of this technique, on a 4-by-4 SPD matrix, is given in the Appendix.
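Algorithm 4: Reduce in rank of the partition $\hat n$
Input: $\hat n$, $H^k_{\hat n}$, $D^k_{\hat n}$;
$n = \hat n$;
$J^{k+1}_{n/2} = J^{k+1}_{n}$;
while $J^{k+1}_{n/2} > J^{k}_{n}$ do
    for $i = 0$; $i < n$; $i \leftarrow i + 2$ do
        $(D^k_{n/2})_{i/2} = (D^k_n)_{i} + (D^k_n)_{i+1}$;
        for $j = 0$; $j < n$; $j \leftarrow j + 2$ do
            $(H^k_{n/2})_{i/2,\,j/2} = (H^k_n)_{i,j} + (H^k_n)_{i,j+1} + (H^k_n)_{i+1,j} + (H^k_n)_{i+1,j+1}$;
        end
    end
    Estimation of the cost $J^{k+1}_{n/2}$;
    $n = n/2$;
end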
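The merging step of Algorithm 4 can be sketched in a few lines of NumPy. This is an illustrative reading, not the author's code: it assumes that halving the partition amounts to summing each pair of adjacent entries of the Jacobian and each 2-by-2 block of the Hessian, which reproduces the 4-by-4 example of the Appendix. The Jacobian values `D4` are hypothetical and only serve the illustration; the function names are assumptions as well.

```python
import numpy as np

def reduce_partition(H, D):
    """Merge adjacent pairs of blocks: partition size n -> n/2 (n must be even)."""
    n = D.size
    # Pairing matrix, e.g. [[1, 1, 0, 0], [0, 0, 1, 1]] for n = 4.
    pair = np.kron(np.eye(n // 2), np.ones((1, 2)))
    return pair @ H @ pair.T, pair @ D

def predicted_decrease(H, D):
    """Predicted drop of the cost, 0.5 * D^T H^{-1} D, for a given partition."""
    return 0.5 * D @ np.linalg.solve(H, D)

# 4-by-4 SPD matrix taken from the example in the Appendix.
H4 = np.array([[6.0, 1.0, 2.0, 3.0],
               [1.0, 8.0, 2.0, 4.0],
               [2.0, 2.0, 12.0, 7.0],
               [3.0, 4.0, 7.0, 16.0]])
D4 = np.array([1.0, 2.0, 3.0, 4.0])   # hypothetical Jacobian values

H2, D2 = reduce_partition(H4, D4)      # 4 -> 2 blocks
H1, D1 = reduce_partition(H2, D2)      # 2 -> 1 block
print(H2, H1)
print(predicted_decrease(H4, D4), predicted_decrease(H2, D2), predicted_decrease(H1, D1))
```

In exact arithmetic the predicted decrease can only shrink when blocks are merged, because the coarser partition searches a smaller set of step-length vectors; the coarser partitions are therefore a fallback for when the fine one misbehaves numerically.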
6. Conclusion
We have presented in this article a new optimization technique that enhances the steepest descent algorithm via domain decomposition in general, and we applied our new method in particular to time-parallelizing the simulation of an optimal heat control problem. We presented its performance (in CPU time and number of iterations) versus the traditional steepest descent algorithm on several test problems. The key idea of our method is based on a quasi-Newton technique that computes an efficient real vector step-length for a set of descent directions associated with the domain decomposition. The originality of our approach consists in enabling parallel computation, where the vector step-length achieves the optimal descent direction in a high-dimensional space.
Convergence property of the presented method is provided. Those results are illustrated with
several numerical tests using parallel resources with MPI implementation.
Appendix A. Kantorovich matrix inequality
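For the sake of completeness, we give in this appendix the matrix Kantorovich inequality, which justifies the statement of our convergence proof. Assume that $\nabla^2 J$ is symmetric positive definite with smallest and largest eigenvalues $\lambda_{min}$ and $\lambda_{max}$ respectively. The matrix version of the famous Kantorovich inequality reads:
Theorem A.1 (see [14] for more details). Assume that $\sum_{n=1}^{\hat n} \alpha_n = 1$ where $\alpha_n \ge 0$ and $\lambda_n > 0$ for all $n$; we then have:
\[
\sum_{n=1}^{\hat n} \alpha_n \lambda_n \; \sum_{n=1}^{\hat n} \frac{\alpha_n}{\lambda_n} \;\le\; \frac{(\lambda_{max} + \lambda_{min})^2}{4 \lambda_{max} \lambda_{min}} .
\]
By diagonalizing the symmetric positive definite operator $H := \nabla^2 J$ we obtain $H = P^{-1} \Lambda P$, where $P$ is an orthonormal operator (i.e. $P^T = P^{-1}$). Recall Eq. (38), which we rewrite as:
\[
\frac{\lVert \nabla J(v_k) \rVert^2_{\nabla^2 J}\; \lVert \nabla J(v_k) \rVert^2_{(\nabla^2 J)^{-1}}}{\lVert \nabla J(v_k) \rVert^4_c} \;\le\; \frac{(\lambda_{max} + \lambda_{min})^2}{4 \lambda_{max} \lambda_{min}} .
\]
In order to simplify the expression, we shall use $d_k$ instead of $\nabla J(v_k)$, so that the left-hand side above reads:
\[
\frac{d_k^T (\nabla^2 J)\, d_k \;\; d_k^T (\nabla^2 J)^{-1} d_k}{(d_k^T d_k)^2}
= \frac{d_k^T P^T \Lambda P\, d_k}{d_k^T P^T P\, d_k} \; \frac{d_k^T P^T \Lambda^{-1} P\, d_k}{d_k^T P^T P\, d_k} .
\]
Let us define $\tilde d_k := P d_k$; consequently the above equality becomes:
\[
\frac{\tilde d_k^T \Lambda\, \tilde d_k}{\tilde d_k^T \tilde d_k} \; \frac{\tilde d_k^T \Lambda^{-1} \tilde d_k}{\tilde d_k^T \tilde d_k}
= \sum_{n=1}^{\hat n} \frac{(\tilde d_k)_n^2}{\tilde d_k^T \tilde d_k}\, \lambda_n \; \sum_{n=1}^{\hat n} \frac{(\tilde d_k)_n^2}{\tilde d_k^T \tilde d_k}\, \frac{1}{\lambda_n} .
\]
We then denote $\alpha_n = \dfrac{(\tilde d_k)_n^2}{\tilde d_k^T \tilde d_k}$, so that $\sum_{n=1}^{\hat n} \alpha_n = 1$, and finally:
\[
\frac{d_k^T (\nabla^2 J)\, d_k \;\; d_k^T (\nabla^2 J)^{-1} d_k}{(d_k^T d_k)^2}
= \sum_{n=1}^{\hat n} \alpha_n \lambda_n \; \sum_{n=1}^{\hat n} \frac{\alpha_n}{\lambda_n} .
\]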
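The inequality is easy to probe numerically. The following sketch draws a random symmetric positive definite matrix (an arbitrary stand-in for $\nabla^2 J$, not data from the paper) and compares the ratio above, for many random vectors, against the Kantorovich bound; the lower bound 1 follows from the Cauchy-Schwarz inequality.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
B = rng.standard_normal((n, n))
A = B @ B.T + np.eye(n)                    # SPD test matrix

lam = np.linalg.eigvalsh(A)                # eigenvalues in ascending order
bound = (lam[-1] + lam[0]) ** 2 / (4.0 * lam[-1] * lam[0])

def kantorovich_ratio(d):
    """(d^T A d)(d^T A^{-1} d) / (d^T d)^2 for a nonzero vector d."""
    return (d @ A @ d) * (d @ np.linalg.solve(A, d)) / (d @ d) ** 2

ratios = np.array([kantorovich_ratio(rng.standard_normal(n)) for _ in range(1000)])
print(ratios.min(), ratios.max(), bound)
```

The bound depends only on the condition number $\lambda_{max}/\lambda_{min}$, which is why it controls the convergence rate of steepest-descent type methods on ill-conditioned problems.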
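Example 1. Example of a 4-by-4 SPD matrix reduced in rank using the regularization procedure described in Algorithm 4. In order to illustrate the steps of Algorithm 4, we choose a simple example: a 4-by-4 matrix which we reduce recursively to 2-by-2 and then to 1-by-1 as follows:
\[
\begin{pmatrix} 6 & 1 & 2 & 3\\ 1 & 8 & 2 & 4\\ 2 & 2 & 12 & 7\\ 3 & 4 & 7 & 16 \end{pmatrix}
\mapsto
\begin{pmatrix} (6\ \ 1) & (2\ \ 3)\\ (1\ \ 8) & (2\ \ 4)\\ (2\ \ 2) & (12\ \ 7)\\ (3\ \ 4) & (7\ \ 16) \end{pmatrix}
\mapsto
\begin{pmatrix} 7 & 5\\ 9 & 6\\ 4 & 19\\ 7 & 23 \end{pmatrix}
\mapsto
\begin{pmatrix} 16 & 11\\ 11 & 42 \end{pmatrix},
\]
\[
\begin{pmatrix} 16 & 11\\ 11 & 42 \end{pmatrix}
\mapsto
\begin{pmatrix} 27\\ 53 \end{pmatrix}
\mapsto
(80).
\]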
References
[1] Jerzy K Baksalary and Simo Puntanen. Generalized matrix versions of the Cauchy-Schwarz and Kantorovich inequalities. Aequationes Mathematicae, 41(1):103–110, 1991.
[2] Jonathan Barzilai and Jonathan M Borwein. Two-point step size gradient methods. IMA Journal of Numerical Analysis, 8(1):141–148, 1988.
[3] Faker Ben Belgacem and Sidi Mahmoud Kaber. On the Dirichlet boundary controllability of the one-dimensional heat equation: semi-analytical calculations and ill-posedness degree. Inverse Problems, 27(5):055012, 19, 2011.
[4] Petter Bjorstad and William Gropp. Domain decomposition: parallel multilevel methods for elliptic partial
differential equations. Cambridge University Press, 2004.
[5] Richard H. Byrd, Gabriel Lopez-Calva, and Jorge Nocedal. A line search exact penalty method using steering
rules. Math. Program., 133(1-2, Ser. A):39–73, 2012.
[6] C. Carthel, R. Glowinski, and J.-L. Lions. On exact and approximate boundary controllabilities for the heat
equation: a numerical approach. J. Optim. Theory Appl., 82(3):429–484, 1994.
[7] Augustin-Louis Cauchy. Méthode générale pour la résolution des systèmes d'équations simultanées. Compte Rendu des Séances de l'Académie des Sciences XXV, Série A(25):536–538, October 1847.
[8] Philippe G. Ciarlet. Introduction à l'analyse numérique matricielle et à l'optimisation. Collection Mathématiques Appliquées pour la Maîtrise. [Collection of Applied Mathematics for the Master's Degree]. Masson, Paris, 1982.
[9] Jean-Michel Coron and Emmanuel Trélat. Global steady-state controllability of one-dimensional semilinear heat equations. SIAM J. Control Optim., 43(2):549–569, 2004.
[10] Sylvain Ervedoza and Enrique Zuazua. The wave equation: Control and numerics. In Control of Partial
Differential Equations, Lecture Notes in Mathematics, pages 245–339. Springer Berlin Heidelberg, 2012.
[11] David J Evans. Preconditioning Methods: Theory and Applications. Gordon and Breach Science Publishers,
Inc., 1983.
[12] Luigi Grippo, Francesco Lampariello, and Stephano Lucidi. A nonmonotone line search technique for Newton's method. SIAM Journal on Numerical Analysis, 23(4):707–716, 1986.
[13] Marcus J Grote and Thomas Huckle. Parallel preconditioning with sparse approximate inverses. SIAM Journal on Scientific Computing, 18(3):838–853, 1997.
[14] L. V. Kantorovich. Functional analysis and applied mathematics. NBS Rep. 1509. U. S. Department of
Commerce National Bureau of Standards, Los Angeles, Calif., 1952. Translated by C. D. Benster.
[15] J Lions, Yvon Maday, and Gabriel Turinici. A "parareal" in time discretization of PDE's. Comptes Rendus de l'Academie des Sciences Series I Mathematics, 332(7):661–668, 2001.
[16] J.-L. Lions. Optimal control of systems governed by partial differential equations. Translated from the French
by S. K. Mitter. Die Grundlehren der mathematischen Wissenschaften, Band 170. Springer-Verlag, New York,
1971.
[17] Y. Maday, J. Salomon, and G. Turinici. Monotonic parareal control for quantum systems. SIAM Journal on
Numerical Analysis, 45(6):2468–2482, 2007.
[18] Yvon Maday, Mohamed-Kamel Riahi, and Julien Salomon. Parareal in time intermediate targets methods
for optimal control problems. In Kristian Bredies, Christian Clason, Karl Kunisch, and Gregory von Winckel,
editors, Control and Optimization with PDE Constraints, volume 164 of International Series of Numerical
Mathematics, pages 79–92. Springer Basel, 2013.
[19] Yvon Maday and Gabriel Turinici. A parareal in time procedure for the control of partial differential equations. C. R. Math. Acad. Sci. Paris, 335(4):387–392, 2002.
[20] Tahir Malas and Levent Gürel. Incomplete LU preconditioning with the multilevel fast multipole algorithm for electromagnetic scattering. SIAM Journal on Scientific Computing, 29(4):1476–1494, 2007.
[21] Sorin Micu and Enrique Zuazua. Regularity issues for the null-controllability of the linear 1-d heat equation.
Systems Control Lett., 60(6):406–413, 2011.
[22] Olivier Pironneau, Frédéric Hecht, and Jacques Morice. FreeFem++, www.freefem.org/, 2013.
[23] Alfio Quarteroni and Alberto Valli. Domain decomposition methods for partial differential equations. Numerical Mathematics and Scientific Computation. The Clarendon Press, Oxford University Press, New York, 1999. Oxford Science Publications.
[24] Ulrich Rüde. Mathematical and computational techniques for multilevel adaptive methods. SIAM, 1993.
[25] Yousef Saad. Iterative methods for sparse linear systems. SIAM, 2003.
[26] Scilab Enterprises. Scilab: Le logiciel open source gratuit de calcul numérique. Scilab Enterprises, Orsay, France, 2012.
[27] Andrea Toselli and Olof Widlund. Domain decomposition methods: algorithms and theory, volume 3.
Springer, 2005.
[28] Gonglin Yuan and Zengxin Wei. The Barzilai and Borwein gradient method with nonmonotone line search
for nonsmooth convex optimization problems. Math. Model. Anal., 17(2):203–216, 2012.

Supplementary resource (1)

... Despite this undesirable behavior, gradient descent method is still exploited in a plenty of applications and has the advantages of being adaptive, where it can be coupled with many other global techniques in order to overcome the restriction of local convergence. In addition, gradient methods could benefits from some recently developed acceleration techniques such as [23] or [30] to overcome the slow convergence rate du to the nature of the problem. ...
... where σ ∈ (0, 1 2 ] and 0 ≤ ρ ≤ σ . The Goldstein requirement (30) is often regarded as a relaxed extension of the exact line search since it reduces to the later if σ vanishes. Equation (30) ensures, indeed, that the modulus of the slope is reduced by a factor of σ or less through the line search. ...
... The Goldstein requirement (30) is often regarded as a relaxed extension of the exact line search since it reduces to the later if σ vanishes. Equation (30) ensures, indeed, that the modulus of the slope is reduced by a factor of σ or less through the line search. ...
Preprint
Full-text available
This paper presents a general description of a parameter estimation inverse problem for systems governed by nonlinear differential equations. The inverse problem is presented using optimal control tools with state constraints, where the minimization process is based on a first-order optimization technique such as adaptive monotony-backtracking steepest descent technique and nonlinear conjugate gradient methods satisfying strong Wolfe conditions. Global convergence theory of both methods is rigorously established where new linear convergence rates have been reported. Indeed, for the nonlinear non-convex optimization we show that under the Lipschitz-continuous condition of the gradient of the objective function we have a linear convergence rate toward a stationary point. Furthermore, nonlinear conjugate gradient method has also been shown to be linearly convergent toward stationary points where the second derivative of the objective function is bounded. The convergence analysis in this work has been established in a general nonlinear non-convex optimization under constraints framework where the considered time-dependent model could whether be a system of coupled ordinary differential equations or partial differential equations. Numerical evidence on a selection of popular nonlinear models is presented to support the theoretical results. Nonlinear Conjugate gradient methods, Nonlinear Optimal control and Convergence analysis and Dynamical systems and Parameter estimation and Inverse problem
... because every minimizer y of J(t 0 , t E ; y 0 , y E ) also minimizes the integrals over the sub- and [t, t E ]. Divide-and-conquer strategies use this property by dividing one optimal control problem into several smaller ones [107]. ...
Preprint
This thesis presents new mathematical algorithms for the numerical solution of a mathematical problem class called \emph{dynamic optimization problems}. These are mathematical optimization problems, i.e., problems in which numbers are sought that minimize an expression subject to obeying equality and inequality constraints. Dynamic optimization problems are distinct from non-dynamic problems in that the sought numbers may vary over one independent variable. This independent variable can be thought of as, e.g., time. This thesis presents three methods, with emphasis on algorithms, convergence analysis, and computational demonstrations. The first method is a direct transcription method that is based on an integral quadratic penalty term. The purpose of this method is to avoid numerical artifacts such as ringing or erroneous/spurious solutions that may arise in direct collocation methods. The second method is a modified augmented Lagrangian method that leverages ideas from augmented Lagrangian methods for the solution of optimization problems with large quadratic penalty terms, such as they arise from the prior direct transcription method. Lastly, we present a direct transcription method with integral quadratic penalties and integral logarithmic barriers. All methods are motivated with applications and examples, analyzed with complete proofs for their convergence, and practically verified with numerical experiments.
... On the other hand, from inequality (2.5) we get Some numerical approaches regarding similar problems are given in [19,20]. ...
Article
Full-text available
In this paper, we study the long time decay of global solution to the 3D incompressible Navier-Stokes equations. We prove that if u ∈ C ( R + , X − 1 , σ ( R 3 ) ) u\in {\mathcal{C}}\left({{\mathbb{R}}}^{+},{{\mathcal{X}}}^{-1,\sigma }\left({{\mathbb{R}}}^{3})) is a global solution to the considered equation, where X i , σ ( R 3 ) {{\mathcal{X}}}^{i,\sigma }\left({{\mathbb{R}}}^{3}) is the Fourier-Lei-Lin space with parameters i = − 1 i=-1 and σ ≥ − 1 \sigma \ge -1 , then ‖ u ( t ) ‖ X − 1 , σ \Vert u\left(t){\Vert }_{{{\mathcal{X}}}^{-1,\sigma }} decays to zero as time goes to infinity. The used techniques are based on Fourier analysis.
Article
Full-text available
Over the last decades, many efforts have been made toward the understanding of the convergence rate of the gradient-based method for both constrained and unconstrained optimization. The cases of the strongly convex and weakly convex payoff function have been extensively studied and are nowadays fully understood. Despite the impressive advances made in the convex optimization context, the nonlinear non-convex optimization problems are still not fully exploited. In this paper, we are concerned with the nonlinear, non-convex optimization problem under system dynamic constraints. We apply our analysis to parameter identification of systems governed by general nonlinear differential equations. The considered inverse problem is presented using optimal control tools. We tackle the optimization through the use of Fletcher-Reeves nonlinear conjugate gradient method satisfying strong Wolfe conditions with inexact line search. We rigorously establish a convergence analysis of the method and report a new linear convergence rate which forms the main contribution of this work. The theoretical result reported in our analysis requires that the second derivative of the payoff functional be continuous and bounded. Numerical evidence on a selection of popular nonlinear models is presented as a direct application of parameter identification to support the theoretical findings. KEYWORDS convergence analysis, dynamical systems, inverse problem, nonlinear conjugate gradient methods, nonlinear optimal control, parameter identification MSC CLASSIFICATION 65K10; 47N40; 45Q05; 65L09; 90C26; 49J15
Article
An approach to developing active control strategies for separated flows is presented. The methodology proposed is applied to the incompressible unsteady wake flow behind a circular cylinder at a Reynold's number of 100. Control action is achieved via cylinder rotation. Low-order models which are amenable to control and which incorporate the full non-linear dynamics are developed by applying the proper orthogonal decomposition technique to data provided by numerical simulation. This process involves extensions to the usual POD approach and the results are therefore assessed for two ‘open-loop’ test cases. The predictions are found to be satisfactory for control purposes, assuming the model can be reset periodically. The use of these models for optimal control is discussed in a companion paper, Part II. Copyright © 1999 John Wiley & Sons, Ltd.
Chapter
In these Notes we make a self-contained presentation of the theory that has been developed recently for the numerical analysis of the controllability properties of wave propagation phenomena and, in particular, for the constant coefficient wave equation. We develop the so-called discrete approach. In other words, we analyze to which extent the semidiscrete or fully discrete dynamics arising when discretizing the wave equation by means of the most classical scheme of numerical analysis, shear the property of being controllable, uniformly with respect to the mesh-size parameters and if the corresponding controls converge to the continuous ones as the mesh-size tends to zero. We focus mainly on finite-difference approximation schemes for the one-dimensional constant coefficient wave equation. Using the well known equivalence of the control problem with the observation one, we analyze carefully the second one, which consists in determining the total energy of solutions out of partial measurements. We show how spectral analysis and the theory of non-harmonic Fourier series allows, first, to show that high frequency wave packets may behave in a pathological manner and, second, to design efficient filtering mechanisms. We also develop the multiplier approach that allows to provide energy identities relating the total energy of solutions and the energy concentrated on the boundary. These observability properties obtained after filtering, by duality, allow to build controls that, normally, do not control the full dynamics of the system but rather guarantee a relaxed controllability property. Despite of this they converge to the continuous ones. We also present a minor variant of the classical Hilbert Uniqueness Method allowing to build smooth controls for smooth data. This result plays a key role in the proof of the convergence rates of the discrete controls towards the continuous ones. These results are illustrated by means of several numerical experiments.
Conference Paper
In this paper we present a time-adaptive technique for the solution of optimal control problems where the dynamics are given by a semilinear evolution PDE. The method is based on a model reduction using a POD approximation coupled with a Hamilton-Jacobi equation characterizing the value function of the corresponding control problem for the reduced system. The POD basis is updated according to the value of a numerical indicator in order to guarantee a globally accurate solution. This is obtained by subdividing the time horizon into sub-intervals on which the residual is below a given threshold. Some numerical tests illustrate the main features of this approach.
Article
Optimal control theory is concerned with finding control functions that minimize cost functions for systems described by differential equations. The methods have found widespread applications in aeronautics, mechanical engineering, the life sciences, and many other disciplines. This book focuses on optimal control problems where the state equation is an elliptic or parabolic partial differential equation. Included are topics such as the existence of optimal solutions, necessary optimality conditions and adjoint equations, second-order sufficient conditions, and main principles of selected numerical techniques. It also contains a survey on the Karush-Kuhn-Tucker theory of nonlinear programming in Banach spaces. The exposition begins with control problems with linear equations, quadratic cost functions and control constraints. To make the book self-contained, basic facts on weak solutions of elliptic and parabolic equations are introduced. Principles of functional analysis are introduced and explained as they are needed. Many simple examples illustrate the theory and its hidden difficulties. This start to the book makes it fairly self-contained and suitable for advanced undergraduates or beginning graduate students. Advanced control problems for nonlinear partial differential equations are also discussed. As prerequisites, results on boundedness and continuity of solutions to semilinear elliptic and parabolic equations are addressed. These topics are not yet readily available in books on PDEs, making the exposition also interesting for researchers. Alongside the main theme of the analysis of problems of optimal control, Tröltzsch also discusses numerical techniques. The exposition is confined to brief introductions to the basic ideas in order to give the reader an impression of how the theory can be realized numerically. After reading this book, the reader will be familiar with the main principles of the numerical analysis of PDE-constrained optimization.
Article
Part I. Finite Dimensional Control Problems:
1. Calculus of variations and control theory
2. Optimal control problems without target conditions
3. Abstract minimization problems: the minimum principle for the time optimal problem
4. Abstract minimization problems: the minimum principle for general optimal control problems
Part II. Infinite Dimensional Control Problems:
5. Differential equations in Banach spaces and semigroup theory
6. Abstract minimization problems in Hilbert spaces: applications to hyperbolic control systems
7. Abstract minimization problems in Banach spaces: abstract parabolic linear and semilinear equations
8. Interpolation and domains of fractional powers
9. Linear control systems
10. Optimal control problems with state constraints
11. Optimal control problems with state constraints: The abstract parabolic case
Part III. Relaxed Controls:
12. Spaces of relaxed controls: topology and measure theory
13. Relaxed controls in finite dimensional systems: existence theory
14. Relaxed controls in infinite dimensional spaces: existence theory.
Article
We consider a time-dependent PDE-constrained optimization problem of the form
$$\min_{y\in Y,\,u\in U} J(y,u) \quad \text{subject to} \quad C(y,u)=0. \tag{7.1}$$
Here $u\in U$ is the control living in a Banach space $U$, and $y\in Y\subset C([0,T];B)$ is a time-dependent state with Banach spaces $B$ and $Y$, where $B\subset L^2(\Omega)$ with a domain $\Omega\subset\mathbb{R}^n$. The state equation $C(y,u)=0$ is the appropriate formulation of a time-dependent PDE (or a system of time-dependent PDEs)
$$y_t + A(t,x,y,u) = 0, \quad (t,x)\in(0,T)\times\Omega, \qquad y(0,x)=y_0(x), \quad x\in\Omega, \tag{7.2}$$
with initial data $y_0\in B$. For convenience we assume that boundary conditions are incorporated into the state space $Y$. This chapter is organized as follows. In Section 7.2 we recall the parareal time-domain decomposition technique and state known convergence results. In Section 7.3 we perform a time-domain decomposition of the optimal control problem (7.1) by applying the parareal technique to the state equation. Moreover, we derive optimality conditions for the decomposed problem and show that the parareal technique can also be applied to the adjoint system. In Section 7.4 we propose a generalized SQP method that allows the solution of (7.1) for an arbitrary user-provided state solver and adjoint solver; in the present chapter we use parareal solvers. We prove the global convergence of the method. In Section 7.5 we use the generalized SQP method with the specific choice of parareal solvers for the state and adjoint equations. We demonstrate the efficiency of the approach by numerical results for the optimal control of a semilinear parabolic equation in two dimensions.
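The parareal time-domain decomposition mentioned in this abstract can be illustrated on a toy problem. The following is only a minimal sketch, not the chapter's solver: it assumes the scalar test equation y' = lam*y in place of a PDE, implicit Euler for both propagators, and hypothetical names (`coarse`, `fine`, `parareal`). It shows the standard parareal update, in which a cheap sequential coarse sweep is corrected by fine propagations that are independent across time windows.

```python
import math


def coarse(y, t0, t1, lam):
    """Coarse propagator: one implicit Euler step for y' = lam*y (assumed model)."""
    return y / (1.0 - lam * (t1 - t0))


def fine(y, t0, t1, lam, substeps=100):
    """Fine propagator: many small implicit Euler steps on the same window."""
    h = (t1 - t0) / substeps
    for _ in range(substeps):
        y = y / (1.0 - lam * h)
    return y


def parareal(y0, T, n_windows, lam, n_iter):
    """Return window boundaries and the parareal iterate at those boundaries."""
    ts = [T * n / n_windows for n in range(n_windows + 1)]

    # Initial guess: sequential coarse sweep over all windows.
    U = [y0]
    for n in range(n_windows):
        U.append(coarse(U[n], ts[n], ts[n + 1], lam))

    for _ in range(n_iter):
        # These propagations start from the previous iterate only, so they
        # are independent per window (this is the parallelizable part).
        F = [fine(U[n], ts[n], ts[n + 1], lam) for n in range(n_windows)]
        G_old = [coarse(U[n], ts[n], ts[n + 1], lam) for n in range(n_windows)]

        # Sequential correction sweep: U_new[n+1] = G(U_new[n]) + F(U[n]) - G(U[n]).
        U_new = [y0]
        for n in range(n_windows):
            G_new = coarse(U_new[n], ts[n], ts[n + 1], lam)
            U_new.append(G_new + F[n] - G_old[n])
        U = U_new
    return ts, U


def serial_fine(y0, T, n_windows, lam):
    """Reference: the fine propagator applied window after window."""
    y = y0
    for n in range(n_windows):
        y = fine(y, T * n / n_windows, T * (n + 1) / n_windows, lam)
    return y
```

In this sketch, running as many iterations as there are windows reproduces the sequential fine solution up to rounding, so any practical gain comes from stopping after far fewer iterations.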
Article
We present an approach to developing active control strategies for separated flows, which is applied to the incompressible unsteady wake flow behind a circular cylinder at a Reynolds number of 100. Control action is achieved via cylinder rotation. Low-order models which are amenable to control and which incorporate the full nonlinear dynamics are developed by applying the proper orthogonal decomposition (POD) technique to data provided by numerical simulation. This process involves extensions to the usual POD approach, and the results are therefore assessed for two ‘open-loop’ test cases. The predictions are found to be satisfactory for control purposes, assuming the model can be reset periodically. In Part II, optimal control theory is used to implement the model-based control. It is found that the level of wake unsteadiness can be reduced, even when the low-order model is reset on the basis of limited flow field information. The degree of reduction depends on the accuracy of the low-order model, and ways of refining it in the light of control simulations are considered. However, results from two straightforward approaches to this problem suggest that it is easy to ‘over-tune’ the model, resulting in less successful control.