
SIAM J. CONTROL OPTIM. © 2019 Society for Industrial and Applied Mathematics
Vol. 57, No. 6, pp. 3666–3693

EXTENDED MEAN FIELD CONTROL PROBLEMS: STOCHASTIC MAXIMUM PRINCIPLE AND TRANSPORT PERSPECTIVE*

BEATRICE ACCIAIO†, JULIO BACKHOFF-VERAGUAS‡, AND RENÉ CARMONA§

Abstract. We study mean field stochastic control problems where the cost function and the state dynamics depend upon the joint distribution of the controlled state and the control process. We prove suitable versions of the Pontryagin stochastic maximum principle, both in necessary and in sufficient forms, which extend the known conditions to this general framework. We suggest a variational approach for a weak formulation of these control problems. We show a natural connection between this weak formulation and optimal transport on path space, which inspires a novel discretization scheme.

Key words. controlled McKean–Vlasov SDEs, Pontryagin principle, mean-field interaction, causal transport plans

AMS subject classifications. 93E20, 90C08, 60H30, 60K35

DOI. 10.1137/18M1196479

1. Introduction. The control of stochastic differential equations of mean field type, also known as McKean–Vlasov control, did not get much attention before the theory of mean field games became a popular subject of investigation. Indeed the two topics are intimately related through the asymptotic theory of mean field stochastic systems known as propagation of chaos. See, for example, [15] for an early discussion of the similarities and the differences of the two problems. Among the earliest works on this new form of control problem, relevant to the spirit of the analysis conducted in this paper, are [10, 9, 3, 28, 8, 13]. Here, we follow the approach introduced and developed in [13]. The reader is referred to [14, Chapters 3, 4, 6] for a general overview of these problems and an extensive historical perspective. Still, most of these contributions are limited to mean field interactions entering the models through the statistical distribution of the state of the system alone. The goal of the present article is to investigate the control of stochastic dynamics depending upon the joint distribution of the controlled state and the control process. We refer to such problems as extended mean field control problems; see [14, section 4.6].

Our first contribution is to prove an appropriate form of the Pontryagin stochastic maximum principle, in necessary and in sufficient forms, for extended mean field control problems. The main driver behind this search for an extension of existing tools is the importance of many practical applications, which naturally fit within the class of models for which the interactions are not only through the distribution of the state of the system, but also through the distribution of the controls. The analysis of extended mean field control problems had been restricted so far to the linear quadratic (LQ) case; see, e.g., [35, 24, 6, 33]. To the best of our knowledge, the recent work [33] is the only one where more general models are considered. In that article, however, the authors restrict the analysis to closed-loop feedback controls, leading to a deterministic reformulation of the problem, which is used in order to derive the Bellman equation associated with the problem; theirs is therefore a PDE approach. In the present paper, we study the extended mean field control problem without any restrictions, deriving a version of the Pontryagin maximum principle via a probabilistic approach.

*Received by the editors June 25, 2018; accepted for publication (in revised form) August 16, 2019; published electronically November 12, 2019. https://doi.org/10.1137/18M1196479
Funding: The third author was partially supported by National Science Foundation grant DMS-1716673 and by Army Research Office grant W911NF-17-1-0578.
†Department of Statistics, London School of Economics, London, WC2A 2AE, England (b.acciaio@lse.ac.uk).
‡Institute of Statistics and Mathematical Methods in Economics, Vienna University of Technology, Vienna, 1040, Austria (julio.backhoff@tuwien.ac.at).
§Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08544 (rcarmona@princeton.edu).

We apply our optimality conditions to particular classes of models, where our analysis can be pushed further. In the case of scalar interactions, in which the dynamics depend solely upon moments of the marginal distributions, we derive a more explicit form of the optimality condition. The advantage here is that the analysis can be conducted with a form of classical differential calculus, without the use of the notion of L-differentiability. The announced work [23] studies an application of such a class of models in electricity markets. As a special case of scalar interaction, we study an optimal liquidation model, which we are able to solve explicitly. Finally, we consider the case of LQ models, for which we easily derive explicit solutions which can be computed numerically. The results in the LQ setting are compatible with existing results in the literature.

Another contribution of the present article is the variational study of a weak formulation of the extended mean field control problem. Weak formulations have already been studied in the literature, without nonlinear dependence on the law of the control, as in [14, Chapter 6] and [25]. In this framework, we derive an analogue of the Pontryagin principle in the form of a martingale optimality condition. Similar statements have been derived in [18, 27] under the name of stochastic Euler–Lagrange condition for a different kind of problem. Next, we derive a natural connection between the extended mean field control problem and an optimal transport problem on path space. The theory of optimal transport is known to provide a set of tools and results crucial to the understanding of mean field control and mean field games. We illustrate the use of this connection by building a discretization scheme for extended mean field control based on transport-theoretic tools (as in [36, Chapter 3.6] for the case without mean field terms), and show that this scheme converges monotonically to the value of the original extended mean field control problem. The explosion in activity regarding numerical optimal transport gives us reason to believe that such discretization schemes might be efficiently implemented in the near future; see, e.g., [19, 7, 29] for the static setting and [30, 31, 32] for the dynamic one.

The paper is organized as follows. In section 2, we introduce the notations and basic underpinnings for extended mean field control. Section 3 provides a new form of the Pontryagin stochastic maximum principle. In section 4, we study classes of models for which our optimality conditions lead to explicit solutions. In section 5, we analyze the weak formulation of the problem in connection with optimal transport. Finally, in the appendix, we collect some technical proofs.

2. Extended mean field control problems. The goal of this short section is to set the stage for the statements and proofs of the stochastic maximum principle proven in section 3 below.

Let $f$, $b$, and $\sigma$ be measurable functions on $\mathbb{R}^d\times\mathbb{R}^k\times\mathcal{P}_2(\mathbb{R}^d\times\mathbb{R}^k)$ with values in $\mathbb{R}$, $\mathbb{R}^d$, and $\mathbb{R}^{d\times m}$, respectively, and let $g$ be a real-valued measurable function on $\mathbb{R}^d\times\mathcal{P}_2(\mathbb{R}^d)$. Here and elsewhere we denote by $\mathcal{P}(\cdot)$ (resp., $\mathcal{P}_2(\cdot)$) the set of probability measures (resp., with finite second moments) over an underlying metric space. Let $(\Omega,\mathcal{F},\mathbb{P})$ be a probability space, $\mathcal{F}_0\subset\mathcal{F}$ be a sub-sigma-algebra, and $\mathbb{F}=(\mathcal{F}_t)_{0\le t\le T}$ be the filtration generated by $\mathcal{F}_0$ and an $m$-dimensional Wiener process $W=(W_t)_{0\le t\le T}$. We denote by $\mathbb{A}$ the set of progressively measurable processes $\alpha=(\alpha_t)_{0\le t\le T}$ taking values in a given closed convex set $A\subset\mathbb{R}^k$ and satisfying the integrability condition $\mathbb{E}\int_0^T|\alpha_t|^2\,dt<\infty$.

We consider the problem of minimizing
$$
(2.1)\qquad J(\alpha)=\mathbb{E}\left[\int_0^T f\big(X_t,\alpha_t,\mathcal{L}(X_t,\alpha_t)\big)\,dt+g\big(X_T,\mathcal{L}(X_T)\big)\right]
$$
over the set $\mathbb{A}$ of admissible control processes, under the dynamic constraint
$$
(2.2)\qquad dX_t=b\big(X_t,\alpha_t,\mathcal{L}(X_t,\alpha_t)\big)\,dt+\sigma\big(X_t,\alpha_t,\mathcal{L}(X_t,\alpha_t)\big)\,dW_t
$$
with $X_0$ a fixed $\mathcal{F}_0$-measurable random variable.

The symbol $\mathcal{L}$ stands for the law of the given random element. We shall add mild regularity conditions for the coefficients $b$ and $\sigma$ so that a solution to (2.2) always exists when $\alpha\in\mathbb{A}$. For the sake of simplicity, we chose to use time-independent coefficients, but all the results would be the same should $f$, $b$, and $\sigma$ depend upon $t$, since time can be included as an extra state in the vector $X$.

The novelty of the above control problem lies in the fact that the cost functional and the controlled SDE depend on the joint distribution of state and control. For this reason, we call it the extended mean field control problem. In this generality, this problem has not been studied before. We mention the works [35, 24, 6, 33] for particular cases and different approaches.
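The propagation-of-chaos viewpoint mentioned in the introduction suggests a simple numerical sanity check for this formulation: replace the law $\mathcal{L}(X_t,\alpha_t)$ by the empirical measure of an interacting particle system and evaluate the cost with an Euler scheme. The sketch below is our own illustration, not part of the paper; it assumes $d=k=m=1$, $\sigma\equiv 1$, and a scalar interaction through empirical means, and the names `alpha_fn`, `b0`, `f0`, `g0` are hypothetical stand-ins for user-supplied coefficients.

```python
import numpy as np

def evaluate_cost(alpha_fn, b0, f0, g0, x0, T=1.0, n_steps=100, n_particles=2000, seed=0):
    """Euler/particle approximation of the cost J(alpha) in (2.1)-(2.2).

    Simplifying assumptions: d = k = m = 1, sigma = 1, and the joint law
    L(X_t, alpha_t) enters only through the empirical means of X_t and
    alpha_t (a scalar-interaction special case of the general problem).
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    X = np.full(n_particles, x0, dtype=float)
    cost = 0.0
    for i in range(n_steps):
        a = alpha_fn(i * dt, X)                 # control as a function of time and state
        mX, ma = X.mean(), a.mean()             # empirical moments standing in for L(X_t, alpha_t)
        cost += f0(X, a, mX, ma).mean() * dt    # running cost f(X_t, alpha_t, L(X_t, alpha_t))
        X = X + b0(X, a, mX, ma) * dt + np.sqrt(dt) * rng.standard_normal(n_particles)
    return cost + g0(X, X.mean()).mean()        # terminal cost g(X_T, L(X_T))
```

For instance, with $b_0=a$, $f_0=a^2$, $g_0=x^2$, $x_0=0$, and the zero control, the returned value approximates $\mathbb{E}[X_T^2]=T$.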

2.1. Partial L-differentiability of functions of measures. We introduce here the concept of L-differentiability for functions of joint probability laws (i.e., probability measures on product spaces). We refer the reader to [14, Chapter 5] for more details.

Let $u:\mathbb{R}^q\times\mathcal{P}_2(\mathbb{R}^d\times\mathbb{R}^k)\to\mathbb{R}$. We use the notation $\xi$ for a generic element of $\mathcal{P}_2(\mathbb{R}^d\times\mathbb{R}^k)$, and $\mu\in\mathcal{P}_2(\mathbb{R}^d)$ and $\nu\in\mathcal{P}_2(\mathbb{R}^k)$ for its marginals. We denote a generic element of $\mathbb{R}^q$ by $v$.

Let $(\tilde\Omega,\tilde{\mathcal{F}},\tilde{\mathbb{P}})$ be a probability space and let $\tilde u$ be a lifting of the function $u$. In other words,
$$
\tilde u:\mathbb{R}^q\times L^2(\tilde\Omega,\tilde{\mathcal{F}},\tilde{\mathbb{P}};\mathbb{R}^d\times\mathbb{R}^k)\ni(v,\tilde X,\tilde\alpha)\mapsto\tilde u(v,\tilde X,\tilde\alpha)=u(v,\mathcal{L}(\tilde X,\tilde\alpha)).
$$
We say that $u$ is L-differentiable at $(v,\xi)$ if there exists a pair
$$
(\tilde X,\tilde\alpha)\in L^2(\tilde\Omega,\tilde{\mathcal{F}},\tilde{\mathbb{P}};\mathbb{R}^d\times\mathbb{R}^k)\quad\text{with}\quad\mathcal{L}(\tilde X,\tilde\alpha)=\xi
$$
such that the lifted function $\tilde u$ is Fréchet differentiable at $(v,\tilde X,\tilde\alpha)$; cf. [20, Chapter II.5, p. 92]. When this is the case, it turns out that the Fréchet derivative depends only on the law $\xi$ and not on the specific pair $(\tilde X,\tilde\alpha)$ having distribution $\xi$; see [11] or [14, Chapter 6] for details. Thanks to the self-duality of $L^2$ spaces, the Fréchet derivative $[D\tilde u](v,\tilde X,\tilde\alpha)$ of the lifted function $\tilde u$ at $(v,\tilde X,\tilde\alpha)$ can be viewed as an element $D\tilde u(v,\tilde X,\tilde\alpha)$ of $\mathbb{R}^q\times L^2(\tilde\Omega,\tilde{\mathcal{F}},\tilde{\mathbb{P}};\mathbb{R}^d\times\mathbb{R}^k)$ in the sense that
$$
[D\tilde u](v,\tilde X,\tilde\alpha)(\tilde Y)=\tilde{\mathbb{E}}\big[D\tilde u(v,\tilde X,\tilde\alpha)\cdot\tilde Y\big]\quad\text{for all }\tilde Y\in\mathbb{R}^q\times L^2(\tilde\Omega,\tilde{\mathcal{F}},\tilde{\mathbb{P}};\mathbb{R}^d\times\mathbb{R}^k).
$$
Since $\mathbb{R}^q\times L^2(\tilde\Omega,\tilde{\mathcal{F}},\tilde{\mathbb{P}};\mathbb{R}^d\times\mathbb{R}^k)\cong\mathbb{R}^q\times L^2(\tilde\Omega,\tilde{\mathcal{F}},\tilde{\mathbb{P}};\mathbb{R}^d)\times L^2(\tilde\Omega,\tilde{\mathcal{F}},\tilde{\mathbb{P}};\mathbb{R}^k)$, as in [11], the random variable $D\tilde u(v,\tilde X,\tilde\alpha)$ can be represented a.s. via the random vector
$$
D\tilde u(v,\tilde X,\tilde\alpha)=\Big(\partial_v u(v,\mathcal{L}(\tilde X,\tilde\alpha))(\tilde X,\tilde\alpha),\ \partial_\mu u(v,\mathcal{L}(\tilde X,\tilde\alpha))(\tilde X,\tilde\alpha),\ \partial_\nu u(v,\mathcal{L}(\tilde X,\tilde\alpha))(\tilde X,\tilde\alpha)\Big)
$$
for measurable functions $\partial_v u(\cdot,\mathcal{L}(\tilde X,\tilde\alpha))(\cdot,\cdot)$, $\partial_\mu u(\cdot,\mathcal{L}(\tilde X,\tilde\alpha))(\cdot,\cdot)$, and $\partial_\nu u(\cdot,\mathcal{L}(\tilde X,\tilde\alpha))(\cdot,\cdot)$, all of them defined on $\mathbb{R}^q\times\mathbb{R}^d\times\mathbb{R}^k$ and valued, respectively, in $\mathbb{R}^q$, $\mathbb{R}^d$, and $\mathbb{R}^k$. We call these functions the partial L-derivatives of $u$ at $(v,\mathcal{L}(\tilde X,\tilde\alpha))$.

3. Stochastic maximum principle. Our goal is to prove a necessary and a sufficient condition for optimality in the extended class of problems considered in the paper. These are suitable extensions of the Pontryagin stochastic maximum principle conditions. We define the Hamiltonian $H$ by
$$
(3.1)\qquad H(x,\alpha,\xi,y,z)=b(x,\alpha,\xi)\cdot y+\sigma(x,\alpha,\xi)\cdot z+f(x,\alpha,\xi)
$$
for $(x,\alpha,\xi,y,z)\in\mathbb{R}^d\times\mathbb{R}^k\times\mathcal{P}_2(\mathbb{R}^d\times\mathbb{R}^k)\times\mathbb{R}^d\times\mathbb{R}^{d\times m}$. Naturally, the dot notation for matrices refers to the trace inner product. We let $\mathbb{H}^{0,n}$ stand for the collection of all $\mathbb{R}^n$-valued progressively measurable processes on $[0,T]$, and denote by $\mathbb{H}^{2,n}$ the collection of processes $Z$ in $\mathbb{H}^{0,n}$ such that $\mathbb{E}\int_0^T|Z_s|^2\,ds<\infty$. We shall also denote by $\mathbb{S}^{2,n}$ the space of all continuous processes $S=(S_t)_{0\le t\le T}$ in $\mathbb{H}^{0,n}$ such that $\mathbb{E}[\sup_{0\le t\le T}|S_t|^2]<+\infty$. Here and in what follows, regularity properties, such as continuity or Lipschitz character, of functions of measures are always understood in the sense of the 2-Wasserstein distance on the respective spaces of probability measures with finite second moments; cf. [34].

Throughout this section, we assume the following:

(I) The functions $b$, $\sigma$, and $f$ are differentiable with respect to $(x,\alpha)$, for $\xi\in\mathcal{P}_2(\mathbb{R}^d\times\mathbb{R}^k)$ fixed, and the functions
$$
(x,\alpha,\xi)\mapsto\big(\partial_x(b,\sigma,f)(x,\alpha,\xi),\ \partial_\alpha(b,\sigma,f)(x,\alpha,\xi)\big)
$$
are continuous. Moreover, the functions $b$, $\sigma$, and $f$ are L-differentiable with respect to the variable $\xi$, the mapping
$$
\mathbb{R}^d\times A\times L^2(\Omega,\mathcal{F},\mathbb{P};\mathbb{R}^d\times\mathbb{R}^k)\ni(x,\alpha,(X,\beta))\mapsto\partial_\mu(b,\sigma,f)(x,\alpha,\mathcal{L}(X,\beta))(X,\beta)
$$
being continuous. Similarly, the function $g$ is differentiable with respect to $x$, the mapping $(x,\mu)\mapsto\partial_x g(x,\mu)$ being continuous. The function $g$ is also L-differentiable with respect to the variable $\mu$, and the following map is continuous:
$$
\mathbb{R}^d\times L^2(\Omega,\mathcal{F},\mathbb{P};\mathbb{R}^d)\ni(x,X)\mapsto\partial_\mu g(x,\mathcal{L}(X))(X)\in L^2(\Omega,\mathcal{F},\mathbb{P};\mathbb{R}^d).
$$

(II) The derivatives $\partial_x(b,\sigma)$ and $\partial_\alpha(b,\sigma)$ are uniformly bounded, and the mapping
$$
(x,\alpha)\mapsto\partial_\mu(b,\sigma)(x,\alpha,\xi)(x,\alpha)\quad\big(\text{resp., }(x,\alpha)\mapsto\partial_\nu(b,\sigma)(x,\alpha,\xi)(x,\alpha)\big)
$$
has an $L^2(\mathbb{R}^d,\mu;\mathbb{R}^d\times\mathbb{R}^k)$-norm (resp., $L^2(\mathbb{R}^k,\nu;\mathbb{R}^d\times\mathbb{R}^k)$-norm) which is uniformly bounded in $(x,\alpha,\xi)$. There exists a constant $L$ such that, for any $R\ge 0$ and any $(x,\alpha,\xi)$ with $|x|,|\alpha|,\|\xi\|_{L^2}\le R$, it holds that
$$
|\partial_x f(x,\alpha,\xi)|\vee|\partial_x g(x,\mu)|\vee|\partial_\alpha f(x,\alpha,\xi)|\le L(1+R),
$$
and the norms in $L^2(\mathbb{R}^d\times\mathbb{R}^k,\xi;\mathbb{R}^d\times\mathbb{R}^k)$ and $L^2(\mathbb{R}^d,\mu;\mathbb{R}^d\times\mathbb{R}^k)$ of $(x,\alpha)\mapsto\partial_\mu f(x,\alpha,\xi)(x,\alpha)$, $(x,\alpha)\mapsto\partial_\nu f(x,\alpha,\xi)(x,\alpha)$, and $x\mapsto\partial_\mu g(x,\mu)(x)$ are bounded by $L(1+R)$.

Under these assumptions, for any admissible control $\alpha\in\mathbb{A}$, we denote by $X=X^\alpha$ the corresponding controlled state process satisfying (2.2). We call adjoint processes of $X$ (or of $\alpha$) the couple $(Y,Z)$ of stochastic processes $Y=(Y_t)_{0\le t\le T}$ and $Z=(Z_t)_{0\le t\le T}$ in $\mathbb{S}^{2,d}\times\mathbb{H}^{2,d\times m}$ that satisfy
$$
(3.2)\qquad
\begin{cases}
dY_t=-\Big(\partial_x H(\theta_t,Y_t,Z_t)+\tilde{\mathbb{E}}\big[\partial_\mu H(\tilde\theta_t,\tilde Y_t,\tilde Z_t)(X_t,\alpha_t)\big]\Big)\,dt+Z_t\,dW_t,\quad t\in[0,T],\\[4pt]
Y_T=\partial_x g\big(X_T,\mathcal{L}(X_T)\big)+\tilde{\mathbb{E}}\big[\partial_\mu g\big(\tilde X_T,\mathcal{L}(X_T)\big)(X_T)\big],
\end{cases}
$$
where $\theta_t:=(X_t,\alpha_t,\mathcal{L}(X_t,\alpha_t))$, and the tilde notation refers to an independent copy. Equation (3.2) is referred to as the adjoint equation. Formally, the adjoint variable $Y_t$ reads as the derivative of the value function of the control problem with respect to the state variable. In contrast with the deterministic case, in order for the solution to be adapted to the information flow, the extra term $Z_t\,dW_t$ is needed. This is a standard feature of the extension of the maximum principle from deterministic control to stochastic control. As expected, the equation is driven by the derivative of the Hamiltonian with respect to the state variable. In addition, since the controlled dynamics are of McKean–Vlasov type, the state variable with respect to which we differentiate the Hamiltonian needs to include the probability measure appearing in the state equation. This is now understood thanks to the early contributions [13] and [14, Chapter 6]. In the present case of extended mean field control problems, the above adjoint equation needs to account for the fact that the probability measure appearing in the state equation is in fact the joint distribution of the state $X_t$ and the control $\alpha_t$. This forces us to involve the derivative of the Hamiltonian with respect to the first marginal of this joint distribution.

Given $\alpha$, and as a result $X$, the process $\theta_t$ appears as a (random) input in the coefficients of this equation which, except for the presence of the process copies, is a backward stochastic differential equation of McKean–Vlasov type, which is well posed under the current assumptions. See, for example, the discussion in [14, Chapter 6, p. 532].

3.1. A necessary condition. The main result of this subsection is based on the following expression of the Gâteaux derivative of the cost function $J(\alpha)$.

Lemma 3.1. Let $\alpha\in\mathbb{A}$, let $X$ be the corresponding controlled state process, and $(Y,Z)$ its adjoint processes satisfying (3.2). For $\beta\in\mathbb{A}$, the Gâteaux derivative of $J$ at $\alpha$ in the direction $\beta-\alpha$ is
$$
\frac{d}{d\varepsilon}J\big(\alpha+\varepsilon(\beta-\alpha)\big)\Big|_{\varepsilon=0}=\mathbb{E}\int_0^T\Big(\partial_\alpha H(\theta_t,Y_t,Z_t)+\tilde{\mathbb{E}}\big[\partial_\nu H(\tilde\theta_t,\tilde Y_t,\tilde Z_t)(X_t,\alpha_t)\big]\Big)\cdot(\beta_t-\alpha_t)\,dt,
$$
where $(\tilde X,\tilde Y,\tilde Z,\tilde\alpha,\tilde\beta)$ is an independent copy of $(X,Y,Z,\alpha,\beta)$ on the space $(\tilde\Omega,\tilde{\mathcal{F}},\tilde{\mathbb{P}})$.

Proof. We follow the lines of the proof of the stochastic maximum principle for the control of McKean–Vlasov equations given in [14, section 6.3]. Given admissible controls $\alpha$ and $\beta$, for each $\varepsilon>0$ we define the admissible control $\alpha^\varepsilon=(\alpha^\varepsilon_t)_{0\le t\le T}$ by $\alpha^\varepsilon_t=\alpha_t+\varepsilon(\beta_t-\alpha_t)$, and we denote by $X^\varepsilon=(X^\varepsilon_t)_{0\le t\le T}$ the solution of the state equation (2.2) for $\alpha^\varepsilon$ in lieu of $\alpha$. We then consider the variation process $V=(V_t)_{0\le t\le T}$, defined as the solution of the linear stochastic differential equation
$$
(3.3)\qquad dV_t=\big(\gamma_t V_t+\rho_t+\eta_t\big)\,dt+\big(\hat\gamma_t V_t+\hat\rho_t+\hat\eta_t\big)\,dW_t
$$
with $V_0=0$. The coefficients $\gamma_t$, $\hat\gamma_t$, $\eta_t$, and $\hat\eta_t$ are defined by
$$
\gamma_t=\partial_x b(\theta_t),\quad\hat\gamma_t=\partial_x\sigma(\theta_t),\quad\eta_t=\partial_\alpha b(\theta_t)(\beta_t-\alpha_t),\quad\hat\eta_t=\partial_\alpha\sigma(\theta_t)(\beta_t-\alpha_t),
$$
which are progressively measurable bounded processes with values in the spaces $\mathbb{R}^{d\times d}$, $\mathbb{R}^{(d\times d)\times d}$, $\mathbb{R}^d$, and $\mathbb{R}^{d\times d}$, respectively (the parentheses around $d\times d$ indicate that $\hat\gamma_t\cdot u$ is seen as an element of $\mathbb{R}^{d\times d}$ whenever $u\in\mathbb{R}^d$). The coefficients $\rho_t$ and $\hat\rho_t$ are given by
$$
\rho_t=\tilde{\mathbb{E}}\big[\partial_\mu b(\theta_t)(\tilde X_t,\tilde\alpha_t)\tilde V_t\big]+\tilde{\mathbb{E}}\big[\partial_\nu b(\theta_t)(\tilde X_t,\tilde\alpha_t)(\tilde\beta_t-\tilde\alpha_t)\big],
$$
$$
\hat\rho_t=\tilde{\mathbb{E}}\big[\partial_\mu\sigma(\theta_t)(\tilde X_t,\tilde\alpha_t)\tilde V_t\big]+\tilde{\mathbb{E}}\big[\partial_\nu\sigma(\theta_t)(\tilde X_t,\tilde\alpha_t)(\tilde\beta_t-\tilde\alpha_t)\big],
$$
which are progressively measurable bounded processes with values in $\mathbb{R}^d$ and $\mathbb{R}^{d\times d}$, respectively, and where $(\tilde X_t,\tilde\alpha_t,\tilde V_t,\tilde\beta_t)$ is an independent copy of $(X_t,\alpha_t,V_t,\beta_t)$ defined on the separate probability structure $(\tilde\Omega,\tilde{\mathcal{F}},\tilde{\mathbb{P}})$.

We call $V=(V_t)_{0\le t\le T}$ the variation process because it is the Gâteaux derivative of the state in the direction $\beta-\alpha$, since, as detailed in [14, Lemma 6.10], it satisfies
$$
\lim_{\varepsilon\to 0}\ \mathbb{E}\Big[\sup_{0\le t\le T}\Big|\frac{X^\varepsilon_t-X_t}{\varepsilon}-V_t\Big|^2\Big]=0.
$$

For this reason, we have
$$
(3.4)\qquad
\begin{aligned}
\lim_{\varepsilon\to 0}\ \frac{1}{\varepsilon}\big[J(\alpha^\varepsilon)-J(\alpha)\big]
&=\mathbb{E}\int_0^T\Big(\partial_x f(\theta_t)V_t+\partial_\alpha f(\theta_t)(\beta_t-\alpha_t)\\
&\qquad\quad+\tilde{\mathbb{E}}[\partial_\mu f(\theta_t)(\tilde X_t,\tilde\alpha_t)\tilde V_t]+\tilde{\mathbb{E}}[\partial_\nu f(\theta_t)(\tilde X_t,\tilde\alpha_t)(\tilde\beta_t-\tilde\alpha_t)]\Big)\,dt\\
&\quad+\mathbb{E}\Big[\partial_x g(X_T,\mathcal{L}(X_T))V_T+\tilde{\mathbb{E}}[\partial_\mu g(X_T,\mathcal{L}(X_T))(\tilde X_T)\tilde V_T]\Big]\\
&=\mathbb{E}\int_0^T\Big(\partial_x f(\theta_t)V_t+\partial_\alpha f(\theta_t)(\beta_t-\alpha_t)\\
&\qquad\quad+\tilde{\mathbb{E}}[\partial_\mu f(\theta_t)(\tilde X_t,\tilde\alpha_t)\tilde V_t]+\tilde{\mathbb{E}}[\partial_\nu f(\theta_t)(\tilde X_t,\tilde\alpha_t)(\tilde\beta_t-\tilde\alpha_t)]\Big)\,dt\\
&\quad+\mathbb{E}\Big[\big(\partial_x g(X_T,\mathcal{L}(X_T))+\tilde{\mathbb{E}}[\partial_\mu g(\tilde X_T,\mathcal{L}(X_T))(X_T)]\big)V_T\Big],
\end{aligned}
$$
where we used Fubini's theorem to obtain the last equality. Notice that, if we introduce the adjoint processes $(Y,Z)$ of $\alpha\in\mathbb{A}$ and the corresponding state process $X$, by (3.2) we see that the last expectation above is exactly $\mathbb{E}[Y_TV_T]$. This can be computed by integration by parts, using the Itô differentials of $Y$ and $V$, which are given, respectively, by (3.2) and (3.3). In this way we obtain

$$
\begin{aligned}
Y_TV_T&=Y_0V_0+\int_0^T Y_t\,dV_t+\int_0^T V_t\,dY_t+\int_0^T d[Y,V]_t\\
&=M_T+\int_0^T\Big(Y_t\partial_x b(\theta_t)V_t+Y_t\partial_\alpha b(\theta_t)(\beta_t-\alpha_t)+Y_t\tilde{\mathbb{E}}\big[\partial_\mu b(\theta_t)(\tilde X_t,\tilde\alpha_t)\tilde V_t\big]\\
&\qquad\qquad+Y_t\tilde{\mathbb{E}}\big[\partial_\nu b(\theta_t)(\tilde X_t,\tilde\alpha_t)(\tilde\beta_t-\tilde\alpha_t)\big]\\
&\qquad\qquad-V_t\partial_x b(\theta_t)Y_t-V_t\partial_x\sigma(\theta_t)Z_t-V_t\partial_x f(\theta_t)\\
&\qquad\qquad-V_t\tilde{\mathbb{E}}\big[\partial_\mu b(\tilde\theta_t)(X_t,\alpha_t)\tilde Y_t\big]-V_t\tilde{\mathbb{E}}\big[\partial_\mu\sigma(\tilde\theta_t)(X_t,\alpha_t)\tilde Z_t\big]-V_t\tilde{\mathbb{E}}\big[\partial_\mu f(\tilde\theta_t)(X_t,\alpha_t)\big]\\
&\qquad\qquad+Z_t\partial_x\sigma(\theta_t)V_t+Z_t\partial_\alpha\sigma(\theta_t)(\beta_t-\alpha_t)+Z_t\tilde{\mathbb{E}}\big[\partial_\mu\sigma(\theta_t)(\tilde X_t,\tilde\alpha_t)\tilde V_t\big]\\
&\qquad\qquad+Z_t\tilde{\mathbb{E}}\big[\partial_\nu\sigma(\theta_t)(\tilde X_t,\tilde\alpha_t)(\tilde\beta_t-\tilde\alpha_t)\big]\Big)\,dt,
\end{aligned}
$$
where $(M_t)_{0\le t\le T}$ is a mean zero integrable martingale which disappears when we take expectations of both sides. Applying Fubini's theorem once more, we have

$$
\begin{aligned}
\mathbb{E}[Y_TV_T]
&=\mathbb{E}\int_0^T\Big(Y_t\partial_x b(\theta_t)V_t+Y_t\partial_\alpha b(\theta_t)(\beta_t-\alpha_t)+Y_t\tilde{\mathbb{E}}\big[\partial_\nu b(\theta_t)(\tilde X_t,\tilde\alpha_t)(\tilde\beta_t-\tilde\alpha_t)\big]\\
&\qquad\quad-V_t\partial_x b(\theta_t)Y_t-V_t\partial_x\sigma(\theta_t)Z_t-V_t\partial_x f(\theta_t)-V_t\tilde{\mathbb{E}}\big[\partial_\mu f(\tilde\theta_t)(X_t,\alpha_t)\big]\\
&\qquad\quad+Z_t\partial_x\sigma(\theta_t)V_t+Z_t\partial_\alpha\sigma(\theta_t)(\beta_t-\alpha_t)+Z_t\tilde{\mathbb{E}}\big[\partial_\nu\sigma(\theta_t)(\tilde X_t,\tilde\alpha_t)(\tilde\beta_t-\tilde\alpha_t)\big]\Big)\,dt.
\end{aligned}
$$

Plugging this expression into the second equality of (3.4), we get, again by Fubini's theorem,
$$
\begin{aligned}
\lim_{\varepsilon\to 0}\ \frac{1}{\varepsilon}\big[J(\alpha^\varepsilon)-J(\alpha)\big]
&=\mathbb{E}\int_0^T\Big(\partial_\alpha f(\theta_t)(\beta_t-\alpha_t)+\tilde{\mathbb{E}}[\partial_\nu f(\theta_t)(\tilde X_t,\tilde\alpha_t)(\tilde\beta_t-\tilde\alpha_t)]\\
&\qquad\quad+Y_t\partial_\alpha b(\theta_t)(\beta_t-\alpha_t)+Y_t\tilde{\mathbb{E}}\big[\partial_\nu b(\theta_t)(\tilde X_t,\tilde\alpha_t)(\tilde\beta_t-\tilde\alpha_t)\big]\\
&\qquad\quad+Z_t\partial_\alpha\sigma(\theta_t)(\beta_t-\alpha_t)+Z_t\tilde{\mathbb{E}}\big[\partial_\nu\sigma(\theta_t)(\tilde X_t,\tilde\alpha_t)(\tilde\beta_t-\tilde\alpha_t)\big]\Big)\,dt,
\end{aligned}
$$
which is the desired result, by (3.1).
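The extra $\partial_\nu H$ term in the formula of Lemma 3.1 can be checked numerically on a toy problem of our own making, with $b=0$, $g=0$, and $f(x,a,\xi)=a^2+a\bar\nu$, where $\bar\nu$ denotes the mean of the control marginal $\nu$. For deterministic controls $\mathbb{E}[\alpha_t]=\alpha_t$, so $J(\alpha)=\int_0^T 2\alpha_t^2\,dt$, and the lemma predicts the Gâteaux derivative $\int_0^T 4\alpha_t(\beta_t-\alpha_t)\,dt$, since $\partial_\alpha H=2\alpha_t+\mathbb{E}[\alpha_t]=3\alpha_t$ while $\tilde{\mathbb{E}}[\partial_\nu H]=\mathbb{E}[\alpha_t]=\alpha_t$; the classical formula without the $\partial_\nu$ correction would give $3\alpha_t$ instead:

```python
import numpy as np

# Finite-difference check of the Gateaux derivative in Lemma 3.1 on a toy
# example of our own (not from the paper): b = 0, g = 0, and
# f(x, a, xi) = a^2 + a * nu_bar with nu_bar the mean of the control marginal.
# For deterministic controls, J(alpha) = int_0^T 2 alpha_t^2 dt, and the
# lemma predicts the derivative int_0^T 4 alpha_t (beta_t - alpha_t) dt.

n, dt = 1000, 1.0 / 1000
t = np.arange(n) * dt
alpha, beta = np.sin(t), np.cos(t)

def J(a):
    return np.sum(a ** 2 + a * a) * dt           # E[alpha_t] = alpha_t (deterministic)

eps = 1e-6
fd = (J(alpha + eps * (beta - alpha)) - J(alpha)) / eps
lemma = np.sum(4 * alpha * (beta - alpha)) * dt  # prediction of Lemma 3.1
assert abs(fd - lemma) < 1e-4
```

In this toy case, dropping the $\partial_\nu$ term changes the predicted value by $\int_0^1\alpha_t(\beta_t-\alpha_t)\,dt\approx 0.08$, so the check above would fail, which illustrates why the extended framework needs the correction.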

We are now ready to prove the necessary part of the Pontryagin stochastic maximum principle. In the present framework of extended mean field control, we obtain (3.5) below. It is not possible to improve this condition into a pointwise minimization condition, as in more classical versions of the problem where there is no nonlinear dependence on the law of the control; see (6.58) in [14]. We give an example of this phenomenon in Remark 4.2.

Theorem 3.2. Under assumptions (I)–(II), if the admissible control $\alpha=(\alpha_t)_{0\le t\le T}\in\mathbb{A}$ is optimal, $X=(X_t)_{0\le t\le T}$ is the associated controlled state given by (2.2), and $(Y,Z)=(Y_t,Z_t)_{0\le t\le T}$ are the associated adjoint processes satisfying (3.2), then we have
$$
(3.5)\qquad\Big(\partial_\alpha H(\theta_t,Y_t,Z_t)+\tilde{\mathbb{E}}\big[\partial_\nu H(\tilde\theta_t,\tilde Y_t,\tilde Z_t)(X_t,\alpha_t)\big]\Big)\cdot(\alpha_t-a)\le 0\quad\forall a\in A,\ dt\otimes d\mathbb{P}\text{-a.s.},
$$
where $(\tilde X,\tilde Y,\tilde Z,\tilde\alpha)$ is an independent copy of $(X,Y,Z,\alpha)$ on the space $(\tilde\Omega,\tilde{\mathcal{F}},\tilde{\mathbb{P}})$.


Proof. Given any admissible control $\beta$, we use as before the perturbation $\alpha^\varepsilon_t=\alpha_t+\varepsilon(\beta_t-\alpha_t)$. Since $\alpha$ is optimal, we have the inequality
$$
\frac{d}{d\varepsilon}J\big(\alpha+\varepsilon(\beta-\alpha)\big)\Big|_{\varepsilon=0}\ge 0.
$$
Using the result of the previous lemma, we get
$$
\mathbb{E}\int_0^T\Big(\partial_\alpha H(\theta_t,Y_t,Z_t)+\tilde{\mathbb{E}}\big[\partial_\nu H(\tilde\theta_t,\tilde Y_t,\tilde Z_t)(X_t,\alpha_t)\big]\Big)\cdot(\beta_t-\alpha_t)\,dt\ge 0.
$$
We now use the same argument as in the classical case (see, e.g., [14, Theorem 6.14]). For every $t$ and $\beta\in L^2(\Omega,\mathcal{F}_t,\mathbb{P};A)$, we can take $\beta_t$ equal to $\alpha_t$ except on the interval $[t,t+\varepsilon]$, where it equals $\beta$, obtaining
$$
(3.6)\qquad\mathbb{E}\Big[\Big(\partial_\alpha H(\theta_t,Y_t,Z_t)+\tilde{\mathbb{E}}\big[\partial_\nu H(\tilde\theta_t,\tilde Y_t,\tilde Z_t)(X_t,\alpha_t)\big]\Big)\cdot(\beta-\alpha_t)\Big]\ge 0.
$$
Further, for any $a\in A$ we can take $\beta$ to be equal to $a$ on an arbitrary set in $\mathcal{F}_t$, and to coincide with $\alpha_t$ otherwise, establishing (3.5).

Remark 3.3. If the admissible optimal control $\alpha$ takes values in the interior of $A$, then we may replace (3.5) with the following condition (see, e.g., [14, Proposition 6.15]):
$$
(3.7)\qquad\partial_\alpha H(\theta_t,Y_t,Z_t)+\tilde{\mathbb{E}}\big[\partial_\nu H(\tilde\theta_t,\tilde Y_t,\tilde Z_t)(X_t,\alpha_t)\big]=0\quad dt\otimes d\mathbb{P}\text{-a.s.}
$$

Remark 3.4. A sharpening of (3.5) can be obtained under the convexity condition
$$
(3.8)\qquad H(x,a',\xi',y,z)\ge H(x,a,\xi,y,z)+\partial_\alpha H(x,a,\xi,y,z)\cdot(a'-a)+\tilde{\mathbb{E}}\big[\partial_\nu H(x,a,\xi,y,z)(\tilde X_t,\tilde\alpha_t)\cdot(\tilde\alpha'_t-\tilde\alpha_t)\big]
$$
for all $x\in\mathbb{R}^d$, $a,a'\in A$, and $\tilde\alpha'$ a copy on $(\tilde\Omega,\tilde{\mathcal{F}},\tilde{\mathbb{P}})$ of an admissible control $\alpha'$, and where $\xi,\xi'\in\mathcal{P}_2(\mathbb{R}^d\times A)$ with $\xi=\mathcal{L}(\tilde X_t,\tilde\alpha_t)$ and $\xi'=\mathcal{L}(\tilde X_t,\tilde\alpha'_t)$. Indeed, in the framework of Theorem 3.2, if (3.8) holds, we apply it for $x=X_t(\omega)$, $a'=\beta(\omega)$, $y=Y_t(\omega)$, $z=Z_t(\omega)$, $a=\alpha_t(\omega)$, and $\alpha'=\beta$, such that $(\tilde X,\tilde Y,\tilde Z,\tilde\alpha,\tilde\beta)$ is a copy of $(X,Y,Z,\alpha,\beta)$. Taking expectations and using (3.6), we get
$$
\mathbb{E}\big[H(X_t,\beta,\mathcal{L}(X_t,\beta),Y_t,Z_t)\big]\ge\mathbb{E}\big[H(X_t,\alpha_t,\mathcal{L}(X_t,\alpha_t),Y_t,Z_t)\big],
$$
so
$$
\alpha_t=\operatorname{argmin}\big\{\mathbb{E}[H(X_t,\beta,\mathcal{L}(X_t,\beta),Y_t,Z_t)]:\beta\in L^2(\Omega,\mathcal{F}_t,\mathbb{P};A)\big\}.
$$

3.2. A sufficient condition. Guided by the necessary condition proven above, we derive a sufficient condition for optimality in the same spirit, though under stronger convexity assumptions. For a given pair $(\tilde X,\tilde\alpha)$, these conditions read as
$$
(3.9)\qquad g(x',\mu')\ge g(x,\mu)+\partial_x g(x,\mu)\cdot(x'-x)+\tilde{\mathbb{E}}\big[\partial_\mu g(x,\mu)(\tilde X)\cdot(\tilde X'-\tilde X)\big]
$$
and
$$
(3.10)\qquad
\begin{aligned}
H(x',a',\xi',y,z)&\ge H(x,a,\xi,y,z)+\partial_x H(x,a,\xi,y,z)\cdot(x'-x)+\partial_\alpha H(x,a,\xi,y,z)\cdot(a'-a)\\
&\quad+\tilde{\mathbb{E}}\big[\partial_\mu H(x,a,\xi,y,z)(\tilde X,\tilde\alpha)\cdot(\tilde X'-\tilde X)+\partial_\nu H(x,a,\xi,y,z)(\tilde X,\tilde\alpha)\cdot(\tilde\alpha'-\tilde\alpha)\big]
\end{aligned}
$$
for all $x,x'\in\mathbb{R}^d$, $a,a'\in A$, $y\in\mathbb{R}^d$, $z\in\mathbb{R}^{d\times m}$, and any $\tilde X'$ (resp., $\tilde\alpha'$) copy of a process in $\mathbb{H}^{2,d}$ (resp., of an admissible control) on $(\tilde\Omega,\tilde{\mathcal{F}},\tilde{\mathbb{P}})$, and where $\mu=\mathcal{L}(\tilde X)$, $\mu'=\mathcal{L}(\tilde X')$, $\xi=\mathcal{L}(\tilde X,\tilde\alpha)$, and $\xi'=\mathcal{L}(\tilde X',\tilde\alpha')$; see [14, Chapter 6].

Theorem 3.5. Under assumptions (I)–(II), let $\alpha=(\alpha_t)_{0\le t\le T}\in\mathbb{A}$ be an admissible control, $X=(X_t)_{0\le t\le T}$ the corresponding controlled state process, and $(Y,Z)=(Y_t,Z_t)_{0\le t\le T}$ the corresponding adjoint processes satisfying (3.2). Let us assume that
(i) $g$ is convex in the sense of (3.9);
(ii) $H$ is convex in the sense of (3.10).
Then, if (3.5) holds, $\alpha$ is an optimal control, i.e., $J(\alpha)=\inf_{\alpha'\in\mathbb{A}}J(\alpha')$.

As before, we use the notation $\theta_t=(X_t,\alpha_t,\mathcal{L}(X_t,\alpha_t))$ throughout the proof.

Proof. We follow the steps of the classical proofs; see, for example, [14, Theorem 6.16] for the case of the control of standard McKean–Vlasov SDEs. Let $(\tilde X,\tilde\alpha)$ be a copy of $(X,\alpha)$ on $(\tilde\Omega,\tilde{\mathcal{F}},\tilde{\mathbb{P}})$, and let $\alpha'\in\mathbb{A}$ be any admissible control with $X'=X^{\alpha'}$ the corresponding controlled state. By definition of the objective function in (2.1) and of the Hamiltonian of the control problem in (3.1), we have
$$
(3.11)\qquad
\begin{aligned}
J(\alpha)-J(\alpha')
&=\mathbb{E}\big[g(X_T,\mathcal{L}(X_T))-g(X'_T,\mathcal{L}(X'_T))\big]+\mathbb{E}\int_0^T\big(f(\theta_t)-f(\theta'_t)\big)\,dt\\
&=\mathbb{E}\big[g(X_T,\mathcal{L}(X_T))-g(X'_T,\mathcal{L}(X'_T))\big]+\mathbb{E}\int_0^T\big(H(\theta_t,Y_t,Z_t)-H(\theta'_t,Y_t,Z_t)\big)\,dt\\
&\quad-\mathbb{E}\int_0^T\Big(\big(b(\theta_t)-b(\theta'_t)\big)\cdot Y_t+\big(\sigma(\theta_t)-\sigma(\theta'_t)\big)\cdot Z_t\Big)\,dt
\end{aligned}
$$
with $\theta'_t=(X'_t,\alpha'_t,\mathcal{L}(X'_t,\alpha'_t))$. Since $g$ is convex, we have

$$
(3.12)\qquad
\begin{aligned}
\mathbb{E}\big[g(X_T,\mathcal{L}(X_T))-g(X'_T,\mathcal{L}(X'_T))\big]
&\le\mathbb{E}\Big[\partial_x g(X_T,\mathcal{L}(X_T))\cdot(X_T-X'_T)+\tilde{\mathbb{E}}\big[\partial_\mu g(X_T,\mathcal{L}(X_T))(\tilde X_T)\cdot(\tilde X_T-\tilde X'_T)\big]\Big]\\
&=\mathbb{E}\Big[\big(\partial_x g(X_T,\mathcal{L}(X_T))+\tilde{\mathbb{E}}[\partial_\mu g(\tilde X_T,\mathcal{L}(X_T))(X_T)]\big)\cdot(X_T-X'_T)\Big]\\
&=\mathbb{E}\big[(X_T-X'_T)\cdot Y_T\big],
\end{aligned}
$$
where we used Fubini's theorem and the fact that the "tilde random variables" are independent copies of the "nontilde" ones. Using integration by parts and the fact that $Y=(Y_t)_{0\le t\le T}$ solves the adjoint equation (3.2), we get
$$
(3.13)\qquad
\begin{aligned}
\mathbb{E}\big[(X_T-X'_T)\cdot Y_T\big]
&=\mathbb{E}\Big[\int_0^T(X_t-X'_t)\cdot dY_t+\int_0^T Y_t\cdot d(X_t-X'_t)+\int_0^T\big(\sigma(\theta_t)-\sigma(\theta'_t)\big)\cdot Z_t\,dt\Big]\\
&=-\mathbb{E}\int_0^T\Big(\partial_x H(\theta_t,Y_t,Z_t)\cdot(X_t-X'_t)+\tilde{\mathbb{E}}\big[\partial_\mu H(\tilde\theta_t,\tilde Y_t,\tilde Z_t)(X_t,\alpha_t)\big]\cdot(X_t-X'_t)\Big)\,dt\\
&\quad+\mathbb{E}\int_0^T\Big(\big(b(\theta_t)-b(\theta'_t)\big)\cdot Y_t+\big(\sigma(\theta_t)-\sigma(\theta'_t)\big)\cdot Z_t\Big)\,dt.
\end{aligned}
$$

Again by Fubini's theorem, we get
$$
\mathbb{E}\int_0^T\tilde{\mathbb{E}}\big[\partial_\mu H(\tilde\theta_t,\tilde Y_t,\tilde Z_t)(X_t,\alpha_t)\big]\cdot(X_t-X'_t)\,dt
=\mathbb{E}\int_0^T\tilde{\mathbb{E}}\big[\partial_\mu H(\theta_t,Y_t,Z_t)(\tilde X_t,\tilde\alpha_t)\cdot(\tilde X_t-\tilde X'_t)\big]\,dt.
$$

Together with (3.11), (3.12), and (3.13), this gives
$$
\begin{aligned}
J(\alpha)-J(\alpha')
&\le\mathbb{E}\int_0^T\big[H(\theta_t,Y_t,Z_t)-H(\theta'_t,Y_t,Z_t)\big]\,dt\\
&\quad-\mathbb{E}\int_0^T\Big(\partial_x H(\theta_t,Y_t,Z_t)\cdot(X_t-X'_t)+\tilde{\mathbb{E}}\big[\partial_\mu H(\theta_t,Y_t,Z_t)(\tilde X_t,\tilde\alpha_t)\cdot(\tilde X_t-\tilde X'_t)\big]\Big)\,dt\\
&\le\mathbb{E}\int_0^T\Big(\partial_\alpha H(\theta_t,Y_t,Z_t)\cdot(\alpha_t-\alpha'_t)+\tilde{\mathbb{E}}\big[\partial_\nu H(\theta_t,Y_t,Z_t)(\tilde X_t,\tilde\alpha_t)\cdot(\tilde\alpha_t-\tilde\alpha'_t)\big]\Big)\,dt\\
&=\mathbb{E}\int_0^T\Big(\partial_\alpha H(\theta_t,Y_t,Z_t)+\tilde{\mathbb{E}}\big[\partial_\nu H(\tilde\theta_t,\tilde Y_t,\tilde Z_t)(X_t,\alpha_t)\big]\Big)\cdot(\alpha_t-\alpha'_t)\,dt\\
&\le 0
\end{aligned}
$$
because of the convexity of $H$, Fubini's theorem, and (3.5), showing that $\alpha$ is optimal.

4. Examples. In this section, we consider models for which the solution strategy suggested by the stochastic maximum principle proved in the previous section can be pushed further. In fact, in sections 4.2 and 4.3, we are able to obtain explicit solutions.

4.1. The case of scalar interactions. In this subsection, we state explicitly what the above forms of the Pontryagin stochastic maximum principle become in the case of scalar interactions. This is a case of particular interest because it does not need the full generality of the differential calculus on Wasserstein spaces, and can be dealt with by using standard calculus. An example of scalar interactions will be studied and explicitly solved in the next subsection; see also [23] for another application of scalar interactions.

Assume the drift and cost functions to be of the form
$$
b(x,\alpha,\xi)=b_0\Big(x,\alpha,\int\varphi\,d\xi\Big),\quad f(x,\alpha,\xi)=f_0\Big(x,\alpha,\int\psi\,d\xi\Big),\quad g(x,\mu)=g_0\Big(x,\int\phi\,d\mu\Big)
$$
for some functions $b_0,f_0$ on $\mathbb{R}^d\times A\times\mathbb{R}$, $g_0$ on $\mathbb{R}^d\times\mathbb{R}$, $\varphi,\psi$ on $\mathbb{R}^d\times A$, and $\phi$ on $\mathbb{R}^d$. In order to simplify the notation, we shall assume that the volatility is independent of the control and, actually, we take $\sigma\equiv I_d$. Under these circumstances, the adjoint equation becomes
$$
\begin{aligned}
dY_t=-\Big(&\partial_x b_0\big(X_t,\alpha_t,\mathbb{E}[\varphi(X_t,\alpha_t)]\big)Y_t+\partial_x f_0\big(X_t,\alpha_t,\mathbb{E}[\psi(X_t,\alpha_t)]\big)\\
&+\tilde{\mathbb{E}}\big[\tilde Y_t\cdot\partial_\zeta b_0\big(\tilde X_t,\tilde\alpha_t,\mathbb{E}[\varphi(X_t,\alpha_t)]\big)\big]\,\partial_x\varphi(X_t,\alpha_t)\\
&+\tilde{\mathbb{E}}\big[\partial_\zeta f_0\big(\tilde X_t,\tilde\alpha_t,\mathbb{E}[\psi(X_t,\alpha_t)]\big)\big]\,\partial_x\psi(X_t,\alpha_t)\Big)\,dt+Z_t\,dW_t
\end{aligned}
$$
with terminal condition $Y_T=\partial_x g_0\big(X_T,\mathbb{E}[\phi(X_T)]\big)+\tilde{\mathbb{E}}\big[\partial_\zeta g_0\big(\tilde X_T,\mathbb{E}[\phi(X_T)]\big)\big]\,\partial_x\phi(X_T)$.

Accordingly, the necessary condition (3.7) for optimality will be satisfied when
$$
(4.1)\qquad
\begin{aligned}
0=\ &\partial_\alpha b_0\big(X_t,\alpha_t,\mathbb{E}[\varphi(X_t,\alpha_t)]\big)\cdot Y_t+\partial_\alpha f_0\big(X_t,\alpha_t,\mathbb{E}[\psi(X_t,\alpha_t)]\big)\\
&+\tilde{\mathbb{E}}\big[\tilde Y_t\cdot\partial_\zeta b_0\big(\tilde X_t,\tilde\alpha_t,\mathbb{E}[\varphi(X_t,\alpha_t)]\big)\big]\,\partial_\alpha\varphi(X_t,\alpha_t)\\
&+\tilde{\mathbb{E}}\big[\partial_\zeta f_0\big(\tilde X_t,\tilde\alpha_t,\mathbb{E}[\psi(X_t,\alpha_t)]\big)\big]\,\partial_\alpha\psi(X_t,\alpha_t).
\end{aligned}
$$

4.2. Optimal liquidation with market impact. In this section we explicitly solve an example that lies outside the classical LQ framework, in the sense that convexity fails. It is inspired by an optimal liquidation problem with price impact, but here it is of more mathematical than financial interest.

Consider a market where a group of investors, indexed by $i$, has large positions $q^i_0$ in the same asset $S$. Each investor wants to liquidate her position by a fixed time $T>0$, and controls her trading speed $\alpha^i_t$ through time. Her state is then described by two variables: her inventory $Q^i_t$, which starts at $q^i_0$ and changes according to $\alpha^i_t$, and her wealth $X^i_t$, which is assumed to start at zero for all traders. Investors' speed of trading affects prices in two ways. On the one hand, it generates a permanent market impact, as the dynamics of $S$ are assumed to depend linearly on the average trading speed of all investors. On the other hand, it produces a temporary impact, which only affects the traders' own wealth processes (as fees or liquidation costs), and which is assumed to be linear in their respective rates of trading. The optimality criterion is the minimization of a cost composed of three factors: the wealth at time $T$, the final value of the inventory penalized by a terminal market impact, and a running penalty which is assumed quadratic in the inventory. The optimal trades will result from the trade-off between trading slowly to reduce the market impact (or execution/liquidity cost), and trading quickly to reduce the risk of future uncertainty in prices; see, e.g., [2, 16, 17, 12, 6].

Here we think of a continuum of investors. The initial inventories are distributed according to a measure $m_0$ on $\mathbb{R}$. We formulate the problem for a representative agent, in the case of cooperative equilibria. The inventory process then evolves as
$$
(4.2)\qquad dQ_t=\alpha_t\,dt,\quad Q_0\sim m_0,
$$
while the wealth process is given by
$$
dX_t=-\alpha_t(S_t+k\alpha_t)\,dt,\quad X_0=0,
$$
where $k\alpha_t$ measures the temporary market impact. The price process is modeled by
$$
dS_t=\lambda\,\mathbb{E}[\alpha_t]\,dt+\sigma\,dW_t,\quad S_0=s_0,
$$
where $\mathbb{E}[\alpha_t]$ represents the average trading speed, hence $\lambda\,\mathbb{E}[\alpha_t]$ stands for the permanent market impact to which all agents contribute (naturally $\lambda\ge 0$). The cost to be minimized is given by
$$
\mathbb{E}\Big[-X_T-Q_T(S_T-AQ_T)+\phi\int_0^T Q_t^2\,dt\Big],
$$
where $X_T$ is the terminal profit due to trading in $[0,T]$, $Q_T(S_T-AQ_T)$ is the liquidation value of the remaining quantity at terminal time (with a liquidation/execution penalization), and $\phi$ is an "urgency" parameter on the running cost (the higher $\phi$ is, the higher is the liquidation speed at the beginning of the trading period). Using the dynamics of $X$, this can be rewritten as
$$
\mathbb{E}\Big[\int_0^T\big(\alpha_t S_t+k\alpha_t^2+\phi Q_t^2\big)\,dt-Q_T(S_T-AQ_T)\Big].
$$
This example falls into the framework described in section 2. We have a 2-dimensional state process $(S,Q)$, a 1-dimensional Wiener process $W$, and the control process is the trading speed $\alpha$. The Hamiltonian of the system is
$$
H(x_1,x_2,a,\xi,y_1,y_2)=\lambda\bar\xi_2 y_1+ay_2+\phi x_2^2+ax_1+ka^2,
$$
where $\bar\xi_2=\int v\,\xi(du,dv)$, and the first order condition (4.1) reads as
$$
(4.3)\qquad Y^2_t+S_t+2k\alpha_t+\lambda\,\mathbb{E}[Y^1_t]=0
$$

with adjoint equations
$$
(4.4)\qquad dY^1_t=-\alpha_t\,dt+Z^1_t\,dW_t,\qquad Y^1_T=-Q_T,
$$
$$
(4.5)\qquad dY^2_t=-2\phi Q_t\,dt+Z^2_t\,dW_t,\qquad Y^2_T=-S_T+2AQ_T.
$$

Remark 4.1. Here the terminal cost function $g$ reads as
$$
g(x_1,x_2)=-x_1x_2+Ax_2^2,
$$
which does not satisfy the convexity condition (3.9). However, an inspection of the proof of Theorem 3.5 reveals that this assumption was only used in order to obtain the inequality in (3.12). We are now going to show that such an inequality holds in the present setting when $A\ge\lambda$ (which is satisfied for typical values of the parameters; see [17, 12]), thus guaranteeing that the first order condition (4.3) is not only necessary but also sufficient for the optimality of $\alpha$. For this purpose, let $\alpha'\in\mathbb{A}$ be any admissible control, and $(S',Q')$ the corresponding controlled state. Then
$$
\begin{aligned}
\mathbb{E}\big[g(S_T,Q_T)-g(S'_T,Q'_T)\big]-\mathbb{E}\big[(S_T-S'_T)Y^1_T+(Q_T-Q'_T)Y^2_T\big]
&=\lambda\Big(\mathbb{E}\Big[\int_0^T\alpha'_t\,dt-\int_0^T\alpha_t\,dt\Big]\Big)^2-A\,\mathbb{E}\Big(\int_0^T\alpha_t\,dt-\int_0^T\alpha'_t\,dt\Big)^2\\
&\le(\lambda-A)\,\mathbb{E}\Big(\int_0^T\alpha_t\,dt-\int_0^T\alpha'_t\,dt\Big)^2,
\end{aligned}
$$
which is nonpositive for $A\ge\lambda$.

An inspection of (4.4) suggests that we have $Z^1_t=0$ and $Y^1_t=-Q_0-\int_0^t\alpha_s\,ds=-Q_t$; $Y^2_t$ will be determined later. Substituting into (4.3), we have
$$
Y^2_0-2\phi\int_0^t Q_s\,ds+\int_0^t Z^2_s\,dW_s+s_0+\lambda\int_0^t\mathbb{E}[\alpha_s]\,ds+\sigma W_t+2k\alpha_t-\lambda\Big(\mathbb{E}[Q_0]+\int_0^t\mathbb{E}[\alpha_s]\,ds\Big)=0,
$$
that is,
$$
(4.6)\qquad\alpha_t=\frac{\lambda\,\mathbb{E}[Q_0]-Y^2_0-s_0}{2k}+\frac{\phi}{k}\int_0^t Q_s\,ds-\frac{1}{2k}\int_0^t(Z^2_s+\sigma)\,dW_s.
$$

We now show that $Q \equiv Q^0$ and $\alpha \equiv \alpha^0$, where
$$Q^0_t := \mathbb{E}[Q_t \mid Q_0], \qquad \alpha^0_t := \mathbb{E}[\alpha_t \mid Q_0].$$
By taking conditional expectation in (4.2) and (4.6), we get
$$Q^0_t = Q_0 + \int_0^t \alpha^0_s\,ds, \qquad \alpha^0_t = \alpha_0 + \frac{\phi}{k}\int_0^t Q^0_s\,ds. \tag{4.7}$$
Setting $F(t) := Q^0_t$, we note that $F'(t) = \alpha^0_t$ and $F''(t) = \frac{\phi}{k}F(t)$. Together with the initial conditions $F(0) = Q_0$ and $F'(0) = \alpha_0$, this gives
$$F(t) = \bigg(\frac{Q_0}{2} - \frac{\alpha_0}{2r}\bigg)e^{-rt} + \bigg(\frac{Q_0}{2} + \frac{\alpha_0}{2r}\bigg)e^{rt}, \tag{4.8}$$
where $r = \sqrt{\phi/k}$. Now, by taking conditional expectation in (4.5), and substituting


into (4.7), we obtain
$$\begin{aligned}
\alpha^0_T &= \frac{\lambda\mathbb{E}[Q_0] - 2AQ_0}{2k} + \frac{\lambda}{2k}\int_0^T \mathbb{E}[\alpha_t]\,dt - \frac{A}{k}\int_0^T \alpha^0_t\,dt \\
&= \frac{\lambda\mathbb{E}[Q_0] - 2AQ_0}{2k} + \frac{\lambda}{2k}\big(\mathbb{E}[Q_T] - \mathbb{E}[Q_0]\big) - \frac{A}{k}\big(Q^0_T - Q_0\big) \\
&= \frac{\lambda}{2k}\mathbb{E}[Q_T] - \frac{A}{k}Q^0_T,
\end{aligned} \tag{4.9}$$
that is, $F'(T) = \frac{\lambda}{2k}\mathbb{E}[F(T)] - \frac{A}{k}F(T)$. Imposing this condition, and using (4.8), we obtain
$$\alpha_0 = Q_0\,r\,\frac{d_1 e^{-rT} - d_2 e^{rT}}{d_1 e^{-rT} + d_2 e^{rT}} + \mathbb{E}[Q_0]\,\frac{4\lambda\phi}{(d_1 e^{-rT} + d_2 e^{rT})(c_1 e^{-rT} + c_2 e^{rT})}, \tag{4.10}$$
where $d_1 = \sqrt{\phi k} - A$, $d_2 = \sqrt{\phi k} + A$, $c_1 = 2d_1 + \lambda$, $c_2 = 2d_2 - \lambda$. From (4.6), we also have an explicit expression for $Y^2_0 = \lambda\mathbb{E}[Q_0] - s_0 - 2k\alpha_0$.
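This closed-form expression lends itself to a quick numerical sanity check. The sketch below (with arbitrary illustrative values for $\phi, k, A, \lambda, T$, a realization of $Q_0$, and a number standing in for $\mathbb{E}[Q_0]$ — none of these values come from the text) verifies that $\alpha_0$ from (4.10) enforces the boundary condition $F'(T) = \frac{\lambda}{2k}\mathbb{E}[F(T)] - \frac{A}{k}F(T)$ obtained from (4.9):

```python
import math

# Illustrative (made-up) parameters; q0 is one realization of Q_0 and
# q0_mean stands for E[Q_0].
phi, k, A, lam, T = 0.3, 1.5, 2.0, 0.7, 1.0
q0, q0_mean = 1.2, 0.8

r = math.sqrt(phi / k)
s = math.sqrt(phi * k)
d1, d2 = s - A, s + A
c1, c2 = 2 * d1 + lam, 2 * d2 - lam
D = d1 * math.exp(-r * T) + d2 * math.exp(r * T)
C = c1 * math.exp(-r * T) + c2 * math.exp(r * T)

def alpha0(x, x_mean):
    # formula (4.10), evaluated at Q_0 = x and E[Q_0] = x_mean
    return (x * r * (d1 * math.exp(-r * T) - d2 * math.exp(r * T)) / D
            + x_mean * 4 * lam * phi / (D * C))

def F(t, x, a0):
    # formula (4.8) with F(0) = x and F'(0) = a0
    return ((x / 2 - a0 / (2 * r)) * math.exp(-r * t)
            + (x / 2 + a0 / (2 * r)) * math.exp(r * t))

def Fdot(t, x, a0):
    return r * (-(x / 2 - a0 / (2 * r)) * math.exp(-r * t)
                + (x / 2 + a0 / (2 * r)) * math.exp(r * t))

a0 = alpha0(q0, q0_mean)            # alpha_0 for this realization of Q_0
a0_mean = alpha0(q0_mean, q0_mean)  # E[alpha_0], since (4.10) is affine in Q_0
EFT = F(T, q0_mean, a0_mean)        # E[F(T)]

# boundary condition F'(T) = (lam/2k) E[F(T)] - (A/k) F(T) from (4.9)
residual = Fdot(T, q0, a0) - (lam / (2 * k)) * EFT + (A / k) * F(T, q0, a0)
print(residual)
```

The residual vanishes up to floating-point error, for any choice of the parameters above.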

Now we use the ansatz $Z^2 \equiv -\sigma$, and show that the process
$$Y^2_t = Y^2_0 - 2\phi\int_0^t Q_s\,ds - \sigma W_t \tag{4.11}$$
does satisfy the equation and terminal condition in (4.5). Only the latter needs to be shown. First note that, with this ansatz, from (4.6) and (4.2) we have
$$\alpha_t = \alpha_0 + \frac{\phi}{k}\int_0^t Q_s\,ds, \qquad Q_t = Q_0 + \alpha_0 t + \frac{\phi}{k}\int_0^t\!\!\int_0^s Q_u\,du\,ds;$$
thus both processes $\alpha$ and $Q$ are $\sigma(Q_0)$-measurable, that is,
$$Q_t = \mathbb{E}[Q_t \mid Q_0] = Q^0_t = F(t) \quad\text{and}\quad \alpha_t = \mathbb{E}[\alpha_t \mid Q_0] = \alpha^0_t = F'(t). \tag{4.12}$$
We now check that $Y^2$ satisfies the terminal condition in (4.5). By (4.12), (4.11) implies
$$Y^2_T = \lambda\mathbb{E}[Q_0] - s_0 - 2k\alpha_0 - 2\phi\int_0^T Q^0_t\,dt - \sigma W_T.$$
On the other hand, by (4.12), (4.9), and (4.7),
$$\begin{aligned}
-S_T + 2AQ_T &= -s_0 - \lambda\big(\mathbb{E}[Q_T] - \mathbb{E}[Q_0]\big) - \sigma W_T + 2AQ^0_T \\
&= -s_0 + \lambda\mathbb{E}[Q_0] - 2k\alpha^0_T - \sigma W_T \\
&= -s_0 + \lambda\mathbb{E}[Q_0] - 2k\alpha_0 - 2\phi\int_0^T Q^0_t\,dt - \sigma W_T,
\end{aligned}$$
which yields $Y^2_T = -S_T + 2AQ_T$, as wanted. This shows that the process $Z^2$ in the ansatz, together with $Y^2$ defined above, does satisfy (4.5). We have seen that this gives $Q_t = F(t)$ and $\alpha_t = F'(t)$, by (4.12), thus from (4.8) we have
$$Q_t = \bigg(\frac{Q_0}{2} - \frac{\alpha_0}{2r}\bigg)e^{-rt} + \bigg(\frac{Q_0}{2} + \frac{\alpha_0}{2r}\bigg)e^{rt}, \qquad \alpha_t = \bigg({-}\frac{Q_0 r}{2} + \frac{\alpha_0}{2}\bigg)e^{-rt} + \bigg(\frac{Q_0 r}{2} + \frac{\alpha_0}{2}\bigg)e^{rt}.$$
By (4.10), this gives
$$Q_t = Q_0\,\frac{d_1 e^{-r(T-t)} + d_2 e^{r(T-t)}}{d_1 e^{-rT} + d_2 e^{rT}} + \mathbb{E}[Q_0]\,\frac{2\lambda\sqrt{\phi k}\,\big(e^{rt} - e^{-rt}\big)}{(d_1 e^{-rT} + d_2 e^{rT})(c_1 e^{-rT} + c_2 e^{rT})},$$
$$\alpha_t = Q_0\,r\,\frac{d_1 e^{-r(T-t)} - d_2 e^{r(T-t)}}{d_1 e^{-rT} + d_2 e^{rT}} + \mathbb{E}[Q_0]\,\frac{2\lambda\phi\,\big(e^{rt} + e^{-rt}\big)}{(d_1 e^{-rT} + d_2 e^{rT})(c_1 e^{-rT} + c_2 e^{rT})}.$$
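As a further consistency check, these final formulas should satisfy $\alpha_t = \dot Q_t$ (since $Q \equiv F$ and $\alpha \equiv F'$) as well as the initial condition $Q_0$. A short sketch, again with made-up parameter values:

```python
import math

# Illustrative (made-up) parameters; q0 is a realization of Q_0, q0_mean stands for E[Q_0].
phi, k, A, lam, T = 0.3, 1.5, 2.0, 0.7, 1.0
q0, q0_mean = 1.2, 0.8
r, s = math.sqrt(phi / k), math.sqrt(phi * k)
d1, d2 = s - A, s + A
c1, c2 = 2 * d1 + lam, 2 * d2 - lam
D = d1 * math.exp(-r * T) + d2 * math.exp(r * T)
C = c1 * math.exp(-r * T) + c2 * math.exp(r * T)

def Q(t):
    # closed-form optimal inventory
    return (q0 * (d1 * math.exp(-r * (T - t)) + d2 * math.exp(r * (T - t))) / D
            + q0_mean * 2 * lam * s * (math.exp(r * t) - math.exp(-r * t)) / (D * C))

def alpha(t):
    # closed-form optimal trading speed
    return (q0 * r * (d1 * math.exp(-r * (T - t)) - d2 * math.exp(r * (T - t))) / D
            + q0_mean * 2 * lam * phi * (math.exp(r * t) + math.exp(-r * t)) / (D * C))

# alpha should be the time derivative of Q; compare with central differences
h = 1e-6
max_err = max(abs((Q(t + h) - Q(t - h)) / (2 * h) - alpha(t))
              for t in [0.1 * i for i in range(1, 10)])
init_err = abs(Q(0.0) - q0)
print(max_err, init_err)
```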


4.3. The LQ case. In this subsection, we use the sufficient condition derived above to solve a simple LQ model. Such models have already been studied in the literature via different methods; see, e.g., [35, 24, 6, 33]. For the sake of simplicity, we give the details of the computations in the scalar case $m = d = k = 1$ and with $\mathbb{A} = \mathbb{R}$. Also, as before, we assume that the volatility is not controlled and, in fact, that it is identically equal to 1. In such an LQ model, the drift is of the form
$$b(x, \alpha, \xi) = b_1 x + b_2\alpha + \bar b_1\bar x + \bar b_2\bar\alpha$$
for some constants $b_1, b_2, \bar b_1, \bar b_2$, where we denote by $\bar x$ and $\bar\alpha$ the means of the state and the control, in the sense that $\bar x = \int x\,\xi(dx, d\alpha)$ and $\bar\alpha = \int \alpha\,\xi(dx, d\alpha)$. As for the cost functions, we assume that
$$f(x, \alpha, \xi) = \frac{1}{2}\Big[q x^2 + \bar q(x - s\bar x)^2 + r\alpha^2 + \bar r(\alpha - \bar s\bar\alpha)^2\Big], \qquad g(x, \mu) = \frac{\gamma}{2}x^2 + \frac{\bar\gamma}{2}(x - \rho\bar x)^2$$
for some constants $q, \bar q, r, \bar r, s, \bar s, \gamma, \bar\gamma, \rho$ satisfying $\bar q, \bar r, \bar\gamma \ge 0$ and $q, r, \gamma > 0$. Under these conditions, the Hamiltonian reads
$$H(x, \alpha, \xi, y) = (b_1 x + b_2\alpha + \bar b_1\bar x + \bar b_2\bar\alpha)\,y + \frac{1}{2}\Big[q x^2 + \bar q(x - s\bar x)^2 + r\alpha^2 + \bar r(\alpha - \bar s\bar\alpha)^2\Big]. \tag{4.13}$$
Accordingly, the adjoint equation reads as
$$dY_t = -\Big[b_1 Y_t + (q + \bar q)X_t + \bar b_1\mathbb{E}[Y_t] + s\bar q(s - 2)\mathbb{E}[X_t]\Big]\,dt + Z_t\,dW_t. \tag{4.14}$$

In the present situation, conditions (i) and (ii) of Theorem 3.5 hold, and condition (3.7) of the Pontryagin stochastic maximum principle holds if
$$b_2 Y_t + \bar b_2\mathbb{E}[Y_t] + (r + \bar r)\alpha_t + \bar r\bar s(\bar s - 2)\mathbb{E}[\alpha_t] = 0. \tag{4.15}$$
Taking expectations, we obtain
$$\mathbb{E}[\alpha_t] = -\frac{b_2 + \bar b_2}{r + \bar r(\bar s - 1)^2}\,\mathbb{E}[Y_t]. \tag{4.16}$$
Plugging this expression into (4.15), we get
$$\alpha_t = -\frac{1}{r + \bar r}\Bigg[b_2 Y_t + \Bigg(\bar b_2 - \frac{\bar r\bar s(\bar s - 2)(b_2 + \bar b_2)}{r + \bar r(\bar s - 1)^2}\Bigg)\bar Y_t\Bigg], \tag{4.17}$$
where $\bar Y_t = \mathbb{E}[Y_t]$. We can rewrite (4.17) and (4.16) as
$$\alpha_t = aY_t + b\,\mathbb{E}[Y_t] \quad\text{and}\quad \mathbb{E}[\alpha_t] = c\,\mathbb{E}[Y_t] \tag{4.18}$$
with
$$a = -\frac{b_2}{r + \bar r}, \qquad b = -\frac{1}{r + \bar r}\Bigg(\bar b_2 - \frac{\bar r\bar s(\bar s - 2)(b_2 + \bar b_2)}{r + \bar r(\bar s - 1)^2}\Bigg), \quad\text{and}\quad c = -\frac{b_2 + \bar b_2}{r + \bar r(\bar s - 1)^2}. \tag{4.19}$$
With this notation, the solution of the mean field optimal control of the McKean–Vlasov SDE (2.2) reduces to the solution of the following forward-backward SDE (FBSDE) of McKean–Vlasov type:
$$\begin{aligned}
dX_t &= \Big[b_1 X_t + \bar b_1\mathbb{E}[X_t] + ab_2 Y_t + (bb_2 + c\bar b_2)\mathbb{E}[Y_t]\Big]\,dt + dW_t, \\
dY_t &= -\Big[b_1 Y_t + (q + \bar q)X_t + \bar b_1\mathbb{E}[Y_t] + s\bar q(s - 2)\mathbb{E}[X_t]\Big]\,dt + Z_t\,dW_t
\end{aligned} \tag{4.20}$$


with terminal condition $Y_T = (\gamma + \bar\gamma)X_T + \bar\gamma\rho(\rho - 2)\mathbb{E}[X_T]$. We solve this system in the usual way. First, we compute the means $\bar x_t = \mathbb{E}[X_t]$ and $\bar y_t = \mathbb{E}[Y_t]$. Taking expectations in (4.20), we obtain
$$\begin{aligned}
d\bar x_t &= \Big[(b_1 + \bar b_1)\bar x_t + (ab_2 + bb_2 + c\bar b_2)\bar y_t\Big]\,dt, \\
d\bar y_t &= -\Big[(b_1 + \bar b_1)\bar y_t + \big(q + \bar q + s\bar q(s - 2)\big)\bar x_t\Big]\,dt
\end{aligned} \tag{4.21}$$
with terminal condition $\bar y_T = \big(\gamma + \bar\gamma + \bar\gamma\rho(\rho - 2)\big)\bar x_T$. The linear system (4.21) can be solved explicitly. For instance, if we denote
$$\Delta := \sqrt{(b_1 + \bar b_1)^2 - (ab_2 + bb_2 + c\bar b_2)\big(q + \bar q + s\bar q(s - 2)\big)},$$
and assume that the argument of the square root is strictly positive, one can solve (4.21) via the theory of linear ODE systems in the case of real eigenvalues. We then obtain that

$$\bar x_t = -\frac{(b_1 + \bar b_1)^2 - \Delta^2}{2\big(q + \bar q + s\bar q(s - 2)\big)\Delta}\Bigg\{ e^{-\Delta t}\Bigg(y_0 + \frac{\big(q + \bar q + s\bar q(s - 2)\big)x_0}{b_1 + \bar b_1 + \Delta}\Bigg) - e^{\Delta t}\Bigg(y_0 + \frac{\big(q + \bar q + s\bar q(s - 2)\big)x_0}{b_1 + \bar b_1 - \Delta}\Bigg) \Bigg\}$$
together with
$$\begin{aligned}
\bar y_t = -\frac{(b_1 + \bar b_1)^2 - \Delta^2}{2\big(q + \bar q + s\bar q(s - 2)\big)\Delta}\cdot\Bigg\{ &-\frac{\big(q + \bar q + s\bar q(s - 2)\big)e^{-\Delta t}}{b_1 + \bar b_1 - \Delta}\Bigg(y_0 + \frac{\big(q + \bar q + s\bar q(s - 2)\big)x_0}{b_1 + \bar b_1 + \Delta}\Bigg) \\
&+\frac{\big(q + \bar q + s\bar q(s - 2)\big)e^{\Delta t}}{b_1 + \bar b_1 + \Delta}\Bigg(y_0 + \frac{\big(q + \bar q + s\bar q(s - 2)\big)x_0}{b_1 + \bar b_1 - \Delta}\Bigg) \Bigg\}
\end{aligned}$$
solve (4.21) for any $y_0$, and choosing $y_0$ appropriately one can guarantee that $\bar y_T = \big(\gamma + \bar\gamma + \bar\gamma\rho(\rho - 2)\big)\bar x_T$. This expression for $(\bar x_t, \bar y_t)$ can be plugged into (4.20) in lieu of $(\mathbb{E}[X_t], \mathbb{E}[Y_t])$, reducing the latter to a standard affine FBSDE. We then make the ansatz $Y_t = \eta_t X_t + \chi_t$ for two deterministic functions $t \mapsto \eta_t$ and $t \mapsto \chi_t$, which is compatible with the terminal condition. Computing the Itô differentials of $Y_t$ from the ansatz and from the system (4.20), and identifying the terms in the drift multiplying the unknown $X_t$, we find that $\eta_t$ should be a solution of the scalar Riccati equation
$$\dot\eta_t = -\big(2b_1\eta_t + q + \bar q + ab_2\eta_t^2\big).$$
The latter is easily solved, and since necessarily $\bar y_t = \eta_t\bar x_t + \chi_t$, $\chi_t$ can also be explicitly obtained. By Theorem 3.5, the control $\alpha$ obtained in this way is optimal. Notice that it takes the form
$$\alpha_t = a\eta_t X_t + a\chi_t + b\bar y_t$$
with $a$ and $b$ given in (4.19).
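When the closed-form solution of the Riccati equation is inconvenient, $\eta$ can also be obtained by integrating backward from the terminal condition $\eta_T = \gamma + \bar\gamma$. A minimal sketch (all parameter values below are illustrative, not taken from the text):

```python
# Backward Runge-Kutta (RK4) integration of the scalar Riccati equation for eta.
# All parameter values are illustrative.
b1, b2 = 0.2, 1.0
q, qbar, r, rbar = 1.0, 0.5, 1.0, 0.5
gamma, gammabar = 1.0, 0.3
T, N = 1.0, 10000

a = -b2 / (r + rbar)  # coefficient a from (4.19)

def rhs(eta):
    # Riccati right-hand side: eta' = -(2 b1 eta + q + qbar + a b2 eta^2)
    return -(2 * b1 * eta + q + qbar + a * b2 * eta * eta)

dt = T / N
eta = gamma + gammabar          # terminal condition eta_T = gamma + gammabar
etas = [eta]
for _ in range(N):              # RK4 steps backward from T to 0
    k1 = rhs(eta)
    k2 = rhs(eta - 0.5 * dt * k1)
    k3 = rhs(eta - 0.5 * dt * k2)
    k4 = rhs(eta - dt * k3)
    eta = eta - dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    etas.append(eta)
eta0 = eta
print(eta0)
```

Once $\eta$ is known, $\chi_t = \bar y_t - \eta_t\bar x_t$ follows from the explicit formulas for the means.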

Remark 4.2. In classical control of mean field type, the pointwise minimization of the Hamiltonian with respect to the control is a necessary optimality condition. Let us illustrate with the LQ example how this need not be the case in our extended framework. If we impose pointwise minimization of (4.13) with respect to $\alpha$, we get $b_2 Y_t + r\alpha_t + \bar r(\alpha_t - \bar s\bar\alpha_t) = 0$. Integrating it, we obtain $b_2\mathbb{E}[Y_t] + (r + \bar r - \bar r\bar s)\bar\alpha_t = 0$. On the other hand, the necessary condition (3.5) implies (4.15); taking expectations there and subtracting the previous equation, we get $\bar b_2\mathbb{E}[Y_t] + \bar r\bar s(\bar s - 1)\bar\alpha_t = 0$. A suitable choice of parameters leads to a contradiction between this and the previous equation.

5. Variational perspective in the weak formulation. The goal of this sec-

tion is to analyze the extended mean ﬁeld control problem from a purely variational

perspective, that is, by considering its formulation on path space. Given the intrinsic

nature of mean ﬁeld problems, it is natural to express them in terms of laws rather

than controls. The main reason for exploring this point of view is that of creating

a bridge with optimal transport theory. This paves the way to the use of different sets of tools such as, for example, the numerical methods that are developing fast in transport theory. We start by introducing, in section 5.1, a weak formulation of the

extended mean ﬁeld control problem, especially well-suited for variational analysis.

In such a formulation, the probability space is not speciﬁed a priori. We remark that

a weak formulation of the mean ﬁeld control problem has been considered in [14,

section 6.6] and in [25], the latter rigorously proving convergence of large systems of

interacting control problems to the corresponding mean ﬁeld control problem. How-

ever, in these works there is no nonlinear dependence on the law of the control; cf.

our problem (5.1) below.

We proceed in section 5.2 to obtain what we call a martingale optimality condition.

Such a condition can serve as a veriﬁcation tool, in order to evaluate whether a given

control can be optimal. It is therefore the weak-formulation analogue of the necessary

Pontryagin maximum principle. This forms a bridge between the previous sections of

this work, and the ensuing ones. Whenever the Pontryagin maximum principle can be

used (or the martingale optimality condition in the weak formulation), it is a powerful

tool to identify optimal controls and the trajectories of the state at the optimum.

However, it does not say much about the optimal value of the problem. In fact, at

the optimum, the adjoint process gives formally the value of the gradient of the value

function when computed along the optimal trajectories. In order to study the value

function of the control problem (in a situation in which PDE techniques are highly

nontrivial) we recast in section 5.3 our weak formulation in transport-theoretic terms.

Numerical optimal transport has grown spectacularly over the last few years; see, e.g., [19, 7, 29] and the references therein. Our connection between transport and mean field control is meant to lay the ground for efficient numerical methods in the future. In section 5.4 we provide, at a theoretical level, a first discretization scheme of this kind. To be specific, the optimal transport problem we obtain in the discretization has an additional causality constraint (see, e.g., [26, 1, 4, 5]); the numerical analysis of such problems is also seeing a burst of activity (e.g., [30, 31, 32]).

5.1. The weak formulation. We present a weak formulation of the extended

mean ﬁeld control problem formulated in section 2, in the sense that the probability

space is not speciﬁed here. We restrict our attention to the case where the state

dynamics have uncontrolled volatility, actually assuming $\sigma \equiv \mathrm{Id}$, $m = d$, that the drift does not depend on the law of the control, and that the initial condition $X_0$ is a constant $x_0$. We thus consider the minimization problem
$$\begin{aligned}
&\inf_{\mathbb{P},\alpha}\ \mathbb{E}^{\mathbb{P}}\bigg[\int_0^T f\big(X_t, \alpha_t, \mathcal{L}^{\mathbb{P}}(X_t, \alpha_t)\big)\,dt + g\big(X_T, \mathcal{L}^{\mathbb{P}}(X_T)\big)\bigg] \\
&\text{subject to } dX_t = b\big(X_t, \alpha_t, \mathcal{L}^{\mathbb{P}}(X_t)\big)\,dt + dW_t, \quad X_0 = x_0,
\end{aligned} \tag{5.1}$$


where the infimum is taken over filtered probability spaces $(\Omega, \mathcal{F}, \mathbb{P})$ supporting some $d$-dimensional Wiener process $W$, and over control processes $\alpha$ which are progressively measurable on $(\Omega, \mathcal{F}, \mathbb{P})$ and $\mathbb{R}^k$-valued. We use $\mathcal{L}^{\mathbb{P}}$ to denote the law of the given random element under $\mathbb{P}$. Again, we choose time-independent coefficients for simplicity, but all the results would be the same should $f$ and $b$ depend upon $t$.

We say that $(\Omega, \mathcal{F}, \mathbb{P}, W, X, \alpha)$ is a feasible tuple if it participates in the above optimization problem yielding a finite cost.

5.2. Martingale optimality condition. In this section, we obtain a necessary Pontryagin principle for the weak formulation (5.1). We call this the martingale optimality condition. Since our aim is to illustrate the method, we assume only in this part that we are dealing with a drift-control problem
$$b(x, \alpha, \mu) = \alpha, \qquad m = d.$$
We start by expressing the objective function of (5.1) in canonical space, as a function of semimartingale laws. We denote by $\mathcal{C}_{x_0}$ the space of $\mathbb{R}^d$-valued continuous paths started at $x_0$, and by $S$ the canonical process on it. We consider the set of semimartingale laws
$$\tilde{\mathcal{P}} := \big\{\mu \in \mathcal{P}(\mathcal{C}_{x_0}) : dS_t = \alpha^\mu_t(S)\,dt + dW^\mu_t\ \ \mu\text{-a.s.}\big\}, \tag{5.2}$$
where $W^\mu$ is a $\mu$-Brownian motion and $\alpha^\mu$ is a progressively measurable process w.r.t. the canonical filtration, denoted by $\mathbb{F}$. It is then easy to see that (5.1) is equivalent to
$$\inf_{\mu \in \tilde{\mathcal{P}}}\ \mathbb{E}^\mu\bigg[\int_0^T f\big(S_t, \alpha^\mu_t, \mathcal{L}^\mu(S_t, \alpha^\mu_t)\big)\,dt + g(S_T, \mu_T)\bigg]. \tag{5.3}$$
In what follows we consider perturbations of measures in $\tilde{\mathcal{P}}$ via push-forwards along absolutely continuous shifts which preserve the filtration; see the work of Cruzeiro and Lassalle [18] and the references therein. Using push-forwards instead of perturbations directly on the SDE is the main difference between the weak and the strong perspective. The main idea is to find the first order conditions for problem (5.3) by considering perturbations of the form $\mu^{\epsilon,K} := (\mathrm{Id} + \epsilon K)_*\mu$ around a putative optimizer $\mu$. For this matter it is important to identify the Doob–Meyer decomposition of the canonical process under $\mu^{\epsilon,K}$, which forces an assumption on $K$ as we now explain.

Remark 5.1. Let $\mu \in \tilde{\mathcal{P}}$. We say that an adapted process $U : \mathcal{C}_{x_0} \to \mathcal{C}_{x_0}$ is $\mu$-invertible if there exists $V : \mathcal{C}_{x_0} \to \mathcal{C}_{x_0}$ adapted such that $U \circ V = \mathrm{Id}_{\mathcal{C}_{x_0}}$ holds $U(\mu)$-a.s., and $V \circ U = \mathrm{Id}_{\mathcal{C}_{x_0}}$ holds $\mu$-a.s. Now let $K_\cdot = \int_0^\cdot k_t\,dt$ be adapted. We say that $K$ preserves the filtration under $\mu$ if for every $U$ which is $\mu$-invertible we also have that $U + K$ is $\mu$-invertible. It follows that the set of those $K = \int_0^\cdot k_t\,dt$ that preserve the filtration under $\mu$ is a linear space. It also follows that for such $K$ we have $\mu^{\epsilon,K} := (\mathrm{Id} + \epsilon K)_*\mu \in \tilde{\mathcal{P}}$ with $\alpha^{\mu^{\epsilon,K}}_t(S + \epsilon K(S)) = \alpha^\mu_t(S) + \epsilon k_t(S)$; see [18, Proposition 2.1, Lemma 3.1]. A typical case when the filtration is preserved is when $K$ is a piecewise linear and adapted process, while an example when $K$ does not preserve the filtration is given by Tsirelson's drift; see, respectively, [18, Proposition 2.4, Remark 2.1.1].

In analogy to [18, Theorem 5.1], we then obtain the following necessary condition for an optimizer in (5.3). We use here the notation $\theta^\mu_t = \big(S_t, \alpha^\mu_t, \mathcal{L}^\mu(S_t, \alpha^\mu_t)\big)$.


Proposition 5.2. Let $\mu$ be an optimizer for (5.3). Then the process $N^\mu$ given by
$$N^\mu_t := \partial_a f(\theta^\mu_t) + \tilde{\mathbb{E}}\big[\partial_\nu f(\tilde\theta^\mu_t)(S_t, \alpha^\mu_t)\big] - \int_0^t \Big\{\partial_x f(\theta^\mu_s) + \tilde{\mathbb{E}}\big[\partial_\mu f(\tilde\theta^\mu_s)(S_s, \alpha^\mu_s)\big]\Big\}\,ds \tag{5.4}$$
is a $\mu$-martingale, with terminal value equal to
$$N^\mu_T = -\partial_x g(S_T, \mu_T) - \tilde{\mathbb{E}}\big[\partial_\mu g(\tilde S_T, \mu_T)(S_T)\big] - \int_0^T \Big\{\partial_x f(\theta^\mu_s) + \tilde{\mathbb{E}}\big[\partial_\mu f(\tilde\theta^\mu_s)(S_s, \alpha^\mu_s)\big]\Big\}\,ds. \tag{5.5}$$

Proof. We use the notation $\mu^{\epsilon,K}$ introduced in Remark 5.1, and call $C(\mu)$ the cost function appearing in problem (5.3). We have $\lim_{\epsilon\to 0}\frac{C(\mu^{\epsilon,K}) - C(\mu)}{\epsilon} \ge 0$ for all $K$. Now if $K$ preserves the filtration under $\mu$, then the same is true for $-K$. Therefore $\lim_{\epsilon\to 0}\frac{C(\mu^{\epsilon,K}) - C(\mu)}{\epsilon} = 0$. To conclude the proof, we use $\alpha^{\mu^{\epsilon,K}}_t(S + \epsilon K(S)) = \alpha^\mu_t(S) + \epsilon k_t(S)$ and similar arguments as in [18, Theorem 5.1].

When (5.4)–(5.5) hold, we say that $\mu$ satisfies the martingale optimality condition. The interest of this condition is that it is a clear stochastic counterpart to the classical Euler–Lagrange condition in the calculus of variations, except for the fact that "being equal to zero" is here replaced by "being a martingale"; see [18, 27].

Example 5.3. The martingale optimality condition is the analogue of the Pontryagin principle in the weak formulation. To wit, we verify this in a simple example. Suppose $f(X_t, \alpha_t, \mathcal{L}(X_t, \alpha_t)) = \frac{1}{2}\big(\alpha_t - \mathbb{E}[\alpha_t]\big)^2$ and $g(X_T, \mathcal{L}(X_T)) = \frac{1}{2}X_T^2$. The martingale optimality condition then asserts that for an optimizer $\mu$ the process $N^\mu_t := \alpha^\mu_t - \mathbb{E}[\alpha^\mu_t]$ is a martingale with $N^\mu_T = -S_T$. On the other hand the Pontryagin FBSDE states that
$$dY_t = Z_t\,dW_t, \qquad Y_T = X_T,$$
as well as $\alpha_t - \mathbb{E}[\alpha_t] + Y_t = 0$, by Remark 3.3. We see the compatibility of the two statements, as well as the equality in law $N^\mu_t = -Y_t$, in this particular case.

Remark 5.4. The above arguments can be adapted to the case when $b(x, \alpha, \mu) = b(x, \alpha)$. This is the case, for example, when $b$ is a $C^1$-diffeomorphism and $b(x, \mathbb{R}^k)$ is convex for each $x$. Indeed, in this case one may redefine the drift in the dynamics of $S$ via $\beta^\mu_t(S) := b(S_t, \alpha^\mu_t(S))$, which is associated with the cost
$$f\big(S_t,\ b^{-1}(S_t, \beta^\mu_t(S)),\ \mathcal{L}^\mu\big(S_t, b^{-1}(S_t, \beta^\mu_t(S))\big)\big),$$
where with some abuse of notation $b^{-1}(x, \cdot)$ denotes the inverse of $b(x, \cdot)$. Using this time the notation $\theta^\mu_t = \big(S_t, \beta^\mu_t, \mathcal{L}^\mu(S_t, \beta^\mu_t)\big)$, one then replaces the right-hand side (r.h.s.) of (5.4) with
$$\begin{aligned}
&\partial_a f(\theta^\mu_t)\,\partial_a(b^{-1})(S_t, \beta^\mu_t) + \tilde{\mathbb{E}}\big[\partial_\nu f(\theta^\mu_t)\,\partial_a(b^{-1})(\tilde S_t, \tilde\beta_t)\big] \\
&\quad - \int_0^t \Big\{\partial_x f(\theta^\mu_s) + \tilde{\mathbb{E}}\big[\partial_\mu f(\theta^\mu_s)(\tilde S_s, \tilde\beta_s) + \partial_\nu f(\theta^\mu_s)\,\partial_x(b^{-1})(\tilde S_s, \tilde\beta_s)\big]\Big\}\,ds,
\end{aligned} \tag{5.6}$$
and the r.h.s. of (5.5) with
$$\begin{aligned}
&-\partial_x g(S_T, \mu_T) - \tilde{\mathbb{E}}\big[\partial_\mu g(S_T, \mu_T)(\tilde S_T)\big] \\
&\quad - \int_0^T \Big\{\partial_x f(\theta^\mu_s) + \tilde{\mathbb{E}}\big[\partial_\mu f(\theta^\mu_s)(\tilde S_s, \tilde\beta_s) + \partial_\nu f(\theta^\mu_s)\,\partial_x(b^{-1})(\tilde S_s, \tilde\beta_s)\big]\Big\}\,ds.
\end{aligned} \tag{5.7}$$


5.3. Optimal transport reformulation. In this section we formulate a variational transport problem on $\mathcal{C} = C([0, T]; \mathbb{R}^d)$, the space of $\mathbb{R}^d$-valued continuous paths, which is equivalent to finding the weak solutions of the extended mean field problem (5.1). This variational formulation is a particular type of transport problem under the so-called causality constraint; see [26, 1, 4, 5]. Here we recall this concept with respect to the filtrations $\mathcal{F}^1$ and $\mathcal{F}^2$, generated by the first and by the second coordinate process on $\mathcal{C} \times \mathcal{C}$.

Definition 5.5. Given $\zeta^1, \zeta^2 \in \mathcal{P}(\mathcal{C})$, a probability measure $\pi \in \mathcal{P}(\mathcal{C} \times \mathcal{C})$ is called a causal transport plan between $\zeta^1$ and $\zeta^2$ if its marginals are $\zeta^1$ and $\zeta^2$ and, for any $t \in [0, T]$ and any set $A \in \mathcal{F}^2_t$, the map $\mathcal{C} \ni x \mapsto \pi_x(A)$ is $\tilde{\mathcal{F}}^1_t$-measurable, where $\pi_x(dy) := \pi(\{x\} \times dy)$ is a regular conditional kernel of $\pi$ w.r.t. the first coordinate, and $\tilde{\mathcal{F}}^1$ is the completion of $\mathcal{F}^1$ w.r.t. $\zeta^1$. The set of causal transport plans between $\zeta^1$ and $\zeta^2$ is denoted by $\Pi_c(\zeta^1, \zeta^2)$.

The only transport plans that contribute to the variational formulation of the problem are those under which the difference of the coordinate processes on the product space $\mathcal{C} \times \mathcal{C}$ is a.s. absolutely continuous with respect to Lebesgue measure. We denote by $(\omega, \bar\omega)$ the generic element on $\mathcal{C} \times \mathcal{C}$, and we use $(\bar\omega - \omega)'$ to indicate the density of the process $\bar\omega - \omega$ with respect to Lebesgue measure, when it exists, i.e.,
$$\bar\omega_t - \omega_t = \bar\omega_0 - \omega_0 + \int_0^t (\bar\omega - \omega)'_s\,ds, \qquad t \in [0, T].$$
In such a case, we write $\bar\omega - \omega \ll \mathrm{Leb}$. Moreover, we set
$$\gamma := \text{Wiener measure on } \mathcal{C} \text{ started at } 0$$
and
$$\overline\Pi_c(\gamma, \cdot) := \big\{\pi \in \mathcal{P}(\mathcal{C} \times \mathcal{C}) : \pi(d\omega \times \mathcal{C}) = \gamma(d\omega), \text{ and } \bar\omega - \omega \ll \mathrm{Leb},\ \pi\text{-a.s.}\big\}.$$

We present the connection between extended mean field control and causal transport.

Lemma 5.6. Assume that $b(x, \cdot, \mu)$ is injective, and set
$$u_t(\omega, \bar\omega, \mu) := b^{-1}(\bar\omega_t, \cdot, \mu)\big((\bar\omega - \omega)'_t\big).$$
Then problem (5.1) is equivalent to
$$\inf\ \mathbb{E}^\pi\bigg[\int_0^T f\big(\bar\omega_t,\ u_t(\omega, \bar\omega, \mu^\pi_t),\ \mathcal{L}^\pi\big(\bar\omega_t, u_t(\omega, \bar\omega, \mu^\pi_t)\big)\big)\,dt + g(\bar\omega_T, \mu^\pi_T)\bigg], \tag{5.8}$$
where the infimum is taken over transport plans $\pi \in \overline\Pi_c(\gamma, \cdot)$ such that $dt \otimes d\pi$-a.s. $(\bar\omega - \omega)'_t \in b(\bar\omega_t, \mathbb{R}^d, \mu^\pi_t)$, and $\mu^\pi$ denotes the second marginal of $\pi$.

Proof. Fix $(\Omega, \mathcal{F}, \mathbb{P}, W, X, \alpha)$ a feasible tuple for (5.1), if it exists, and note that $\alpha_t = u_t(W, X, \mathcal{L}^{\mathbb{P}}(X_t))$ is $\mathbb{F}^{X,W}$-adapted. Then $\pi := \mathcal{L}^{\mathbb{P}}(W, X)$ belongs to $\overline\Pi_c(\gamma, \mathcal{L}^{\mathbb{P}}(X))$ and generates the same cost in (5.8). Conversely, given a transport plan $\pi$ participating in (5.8), the following tuple $(\Omega, \mathcal{F}, \mathbb{P}, W, X, \alpha)$ is feasible for (5.1): $\Omega = \mathcal{C} \times \mathcal{C}$, $\mathcal{F}$ the canonical filtration on $\mathcal{C} \times \mathcal{C}$, $\mathbb{P} = \pi$, $W = \omega$, $X = \bar\omega$, and $\alpha_t = u_t(\omega, \bar\omega, \mu^\pi_t)$.

The connection presented in the above lemma will be used in the next proposition,

in order to reduce the optimization problem in (5.1) to a minimization over weak closed

loop tuples, in the following sense.


Definition 5.7. We say that a feasible tuple for (5.1) is a weak closed loop if the control is adapted to the state (i.e., $\alpha$ is $\mathbb{F}^X$-measurable).

We will further need the following concepts of monotonicity: a function $f : \mathcal{P}(\mathbb{R}^N) \to \mathbb{R}$ is called $\prec_{cm}$-monotone (resp., $\prec_c$-monotone) if $f(m_1) \le f(m_2)$ whenever $m_1 \prec_{cm} m_2$ (resp., $m_1 \prec_c m_2$). With the latter orders of measures, we mean $\int h\,dm_1 \le \int h\,dm_2$ for all functions $h$ which are convex and increasing w.r.t. the usual componentwise order in $\mathbb{R}^N$ (resp., all convex functions $h$) such that the integrals exist.

Proposition 5.8. Assume
(A1) $b(x, \cdot, \mu)$ is injective, $b(x, \mathbb{R}^k, \mu)$ is a convex set, and $b^{-1}(x, \cdot, \mu)$ is convex;
(A2) $f(x, b^{-1}(x, \cdot, \mu), \xi)$ is convex and grows at least like $\kappa_0 + \kappa_1|\cdot|^p$ with $\kappa_1 > 0$, $p \ge 1$;
(A3) $f(x, \alpha, \cdot)$ is $\prec_{cm}$-monotone.
Then the minimization in the extended mean field problem (5.1) can be taken over weak closed loop tuples. Moreover, if the infimum is attained, then the optimal control $\alpha$ is of weak closed loop form.

The proof follows the projection arguments used in [1], which require the above convexity assumptions. On the other hand, no regularity conditions are required here, unlike in the classical PDE or probabilistic approaches (see assumptions (I)–(II) in section 3). We refer to [25] for a similar statement, in a general framework, but under no nonlinear dependence on the control law. The proof is postponed to Appendix A.

Remark 5.9. If $b$ is linear with positive coefficient for $\alpha$, then assumption (A3) in Proposition 5.8 can be weakened to
(A3') $f(x, \alpha, \cdot)$ is $\prec_c$-monotone,
as can be seen from the proof. For example, conditions (A1), (A2), (A3') are satisfied if
$$b(x, \alpha, \mu) = c_1 x + c_2\alpha + c_3\bar\mu \quad\text{and}\quad f(x, \alpha, \xi) = d_1 x + d_2\alpha + d_3 x^2 + d_4\alpha^2 + J(\bar\xi_1, \bar\xi_2),$$
where $J$ is a measurable function,
$$\bar\mu = \int x\,\mu(dx), \qquad \bar\xi_1 = \int x\,\xi(dx, d\alpha), \qquad \bar\xi_2 = \int \alpha\,\xi(dx, d\alpha),$$
and $c_i, d_i$ are constants such that $c_2 \ne 0$, $d_4/c_2 > 0$.

5.4. A transport-theoretic discretization scheme. In this part we specialize the analysis to the following particular case of (5.1):
$$\inf_{\mathbb{P},\alpha}\bigg\{\int_0^1 f\big(\mathcal{L}^{\mathbb{P}}(\alpha_t)\big)\,dt + g\big(\mathcal{L}^{\mathbb{P}}(X_T)\big) : dX_t = \alpha_t\,dt + dW_t,\ X_0 = x_0\bigg\}, \tag{5.9}$$
where for simplicity we took $T = 1$. Throughout this section we assume
(i) $g$ is bounded from below and lower semicontinuous w.r.t. weak convergence;
(ii) $f$ is increasing with respect to the convex order, lower semicontinuous w.r.t. weak convergence, and such that for all $\lambda \in [0, 1]$ and $\mathbb{R}^k$-valued random variables $Z, \bar Z$,
$$f\big(\mathcal{L}(\lambda Z + (1 - \lambda)\bar Z)\big) \le \lambda f(\mathcal{L}(Z)) + (1 - \lambda)f(\mathcal{L}(\bar Z)); \tag{5.10}$$
(iii) $f$ satisfies the growth condition $f(\rho) \ge a + b\int |z|^p\,\rho(dz)$ for some $a \in \mathbb{R}$, $b > 0$, $p > 1$.


Lemma 5.6 shows the equivalence of (5.9) with the variational problem
$$\inf_{\pi \in \overline\Pi_c(\gamma, \cdot)}\bigg\{\int_0^1 f\big(\mathcal{L}^\pi\big((\bar\omega - \omega)'_t\big)\big)\,dt + g\big(\mathcal{L}^\pi(\bar\omega_1)\big)\bigg\}.$$
Under the convention that $\int_0^1 f(\mathcal{L}^\pi((\bar\omega - \omega)'_t))\,dt = +\infty$ if $\bar\omega - \omega \ll \mathrm{Leb}$ fails under $\pi$, the latter can be expressed in the equivalent form
$$\inf_{\mu \in \tilde{\mathcal{P}}}\ \inf_{\pi \in \Pi_c(\gamma, \mu)}\bigg\{\int_0^1 f\big(\mathcal{L}^\pi\big((\bar\omega - \omega)'_t\big)\big)\,dt + g\big(\mathcal{L}^\pi(\bar\omega_1)\big)\bigg\}, \tag{P}$$
where $\tilde{\mathcal{P}}$ was defined in (5.2). In the same spirit as [36, Chapter 3.6], we introduce a family of causal transport problems in finite dimension increasing to (P). For $n \in \mathbb{N}$, let $T_n := \{i2^{-n} : 0 \le i \le 2^n,\ i \in \mathbb{N}\}$ be the $n$th generation dyadic grid. For measures $m \in \mathcal{P}(\mathcal{C})$ and $\pi \in \mathcal{P}(\mathcal{C} \times \mathcal{C})$, we write
$$m^n := \mathcal{L}^m\big(\{\omega_t\}_{t \in T_n}\big) \in \mathcal{P}\big(\mathbb{R}^{(2^n+1)d}\big) \quad\text{and}\quad \pi^n := \mathcal{L}^\pi\big(\{(\omega_t, \bar\omega_t)\}_{t \in T_n}\big) \in \mathcal{P}\big(\mathbb{R}^{(2^n+1)d} \times \mathbb{R}^{(2^n+1)d}\big)$$
for the projections of $m$ and $\pi$ on the grid $T_n$. We denote by
$$(x^n_0, x^n_1, \ldots, x^n_{2^n},\ y^n_0, y^n_1, \ldots, y^n_{2^n})$$
a typical element of $\mathbb{R}^{(2^n+1)d} \times \mathbb{R}^{(2^n+1)d}$, and let $\Delta^n x_i := x^n_{i+1} - x^n_i$, and similarly for $\Delta^n y_i$.

We consider the auxiliary transport problems
$$\inf_{\mu \in \mathcal{P}(\mathbb{R}^{(2^n+1)d})}\ \inf_{\pi \in \Pi^n_c(\gamma^n, \mu)}\ 2^{-n}\sum_{i=0}^{2^n-1} f\bigg(\mathcal{L}^\pi\bigg(\frac{\Delta^n y_i - \Delta^n x_i}{2^{-n}}\bigg)\bigg) + g\big(\mathcal{L}^\pi(y^n_{2^n})\big), \tag{P(n)}$$
where, in analogy to Definition 5.5, we called
$$\Pi^n_c(\gamma^n, \mu) \subset \mathcal{P}\big(\mathbb{R}^{(2^n+1)d} \times \mathbb{R}^{(2^n+1)d}\big)$$
the set of causal couplings in $\mathcal{P}(\mathbb{R}^{(2^n+1)d} \times \mathbb{R}^{(2^n+1)d})$ with marginals $\gamma^n$ and $\mu$; see [5].

Theorem 5.10. Suppose problem (P) is finite, and that (i), (ii), (iii) hold. Then the value of the auxiliary problems (P(n)) increases to the value of the original problem (P), and the latter admits an optimizer.

Remark 5.11. An example of a function satisfying conditions (ii)–(iii) of Theorem 5.10 is $f(\rho) = R\big(\int h\,d\rho\big)$ for $R$ convex and increasing, and $h$ convex with $p$-power growth ($p > 1$). It also covers the case of functions of the form $f(\rho) = \int\!\!\int \varphi(w, z)\,d\rho(w)\,d\rho(z) + \int |x|^p\,d\rho(x)$, with $\varphi$ jointly convex and bounded from below, and $f(\rho) = \mathrm{Var}(\rho) + \int |x|^p\,d\rho(x)$, where in both cases $p > 1$. For $p = 2$ the latter falls into the LQ case of section 4.3.
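For instance, for $f(\rho) = \mathrm{Var}(\rho) + \int |x|^p\,d\rho(x)$ the convexity property (5.10) can be checked empirically on samples. The sketch below is an illustration rather than a proof; the two distributions and the choices $p = 3$, $\lambda = 0.4$ are arbitrary:

```python
import random

# Empirical check of the convexity property (5.10) for
# f(rho) = Var(rho) + int |x|^p rho(dx), with p = 3, on paired random samples.
random.seed(1)
p, lam_mix = 3, 0.4
Z = [random.gauss(1.0, 2.0) for _ in range(5000)]
Zbar = [random.gauss(-0.5, 1.0) for _ in range(5000)]

def f_emp(sample):
    # empirical version of f: variance plus p-th absolute moment
    m = sum(sample) / len(sample)
    var = sum((x - m) ** 2 for x in sample) / len(sample)
    pmom = sum(abs(x) ** p for x in sample) / len(sample)
    return var + pmom

mix = [lam_mix * z + (1 - lam_mix) * zb for z, zb in zip(Z, Zbar)]
lhs = f_emp(mix)
rhs = lam_mix * f_emp(Z) + (1 - lam_mix) * f_emp(Zbar)
print(lhs, rhs)
```

The inequality `lhs <= rhs` holds deterministically here: the variance bound follows from the triangle inequality in $L^2$ and convexity of the square, and the $p$-moment bound from pointwise convexity of $|x|^p$.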

Proof. Step 1 (lower bound): Let $\mu \in \tilde{\mathcal{P}}$ and $\pi \in \Pi_c(\gamma, \mu)$ with finite cost for problem (P). Fix $n \in \mathbb{N}$, and denote by $\pi^n$ the projection of $\pi$ onto the grid $T_n$. We first observe that
$$\int_0^1 f\big(\mathcal{L}^\pi\big((\bar\omega - \omega)'_t\big)\big)\,dt + g\big(\mathcal{L}^\pi(\bar\omega_1)\big) \ge 2^{-n}\sum_{i=0}^{2^n-1} f\bigg(\mathcal{L}^{\pi^n}\bigg(\frac{\Delta^n y_i - \Delta^n x_i}{2^{-n}}\bigg)\bigg) + g\big(\mathcal{L}^{\pi^n}(y^n_{2^n})\big). \tag{5.11}$$


Indeed, for $i \in \{0, \ldots, 2^n - 1\}$ we have
$$\begin{aligned}
\int_{i2^{-n}}^{(i+1)2^{-n}} f\big(\mathcal{L}^\pi\big((\bar\omega - \omega)'_t\big)\big)\,dt &\ge 2^{-n} f\bigg(\mathcal{L}^\pi\bigg(\int_{i2^{-n}}^{(i+1)2^{-n}} (\bar\omega - \omega)'_t\,\frac{dt}{2^{-n}}\bigg)\bigg) \\
&= 2^{-n} f\bigg(\mathcal{L}^\pi\bigg(\frac{\bar\omega_{(i+1)2^{-n}} - \bar\omega_{i2^{-n}} - \big(\omega_{(i+1)2^{-n}} - \omega_{i2^{-n}}\big)}{2^{-n}}\bigg)\bigg) \\
&= 2^{-n} f\bigg(\mathcal{L}^{\pi^n}\bigg(\frac{\Delta^n y_i - \Delta^n x_i}{2^{-n}}\bigg)\bigg),
\end{aligned}$$
where for the inequality we used the convexity condition (5.10). Noticing that the first marginal of $\pi^n$ is equal to $\gamma^n$, the r.h.s. of (5.11) is bounded from below by the value of (P(n)). Because $\mu, \pi$ have been chosen having finite cost for problem (P), but otherwise arbitrary, we conclude that
$$(\mathrm{P}) \ge (\mathrm{P}(n)) \qquad \forall n \in \mathbb{N}.$$

Step 2 (monotonicity): For $n \in \mathbb{N}$ and $i \in \{0, \ldots, 2^n - 1\}$, take $k$ such that
$$i2^{-n} = (k - 1)2^{-(n+1)} < k2^{-(n+1)} < (k + 1)2^{-(n+1)} = (i + 1)2^{-n}.$$
Let $\mu^{n+1} \in \mathcal{P}(\mathbb{R}^{(2^{n+1}+1)d})$ and $\pi^{n+1} \in \Pi^{n+1}_c(\gamma^{n+1}, \mu^{n+1})$. By (5.10) we get
$$\begin{aligned}
2^{-(n+1)}&\bigg\{ f\bigg(\mathcal{L}^{\pi^{n+1}}\bigg(\frac{\Delta^{n+1}y_{k-1} - \Delta^{n+1}x_{k-1}}{2^{-(n+1)}}\bigg)\bigg) + f\bigg(\mathcal{L}^{\pi^{n+1}}\bigg(\frac{\Delta^{n+1}y_k - \Delta^{n+1}x_k}{2^{-(n+1)}}\bigg)\bigg) \bigg\} \\
&\ge 2^{-n} f\bigg(\mathcal{L}^{\pi^{n+1}}\bigg(\frac{y^{n+1}_{k+1} - y^{n+1}_{k-1} - \big(x^{n+1}_{k+1} - x^{n+1}_{k-1}\big)}{2^{-n}}\bigg)\bigg) = 2^{-n} f\bigg(\mathcal{L}^{\pi^n}\bigg(\frac{\Delta^n y_i - \Delta^n x_i}{2^{-n}}\bigg)\bigg),
\end{aligned}$$
where $\pi^n$ is the projection of $\pi^{n+1}$ on the grid $T_n$. Analogously to the previous step, this gives
$$(\mathrm{P}(n+1)) \ge (\mathrm{P}(n)) \qquad \forall n \in \mathbb{N}.$$
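The inequality in Step 2 is just (5.10) with $\lambda = \tfrac12$ applied to rescaled increments; for $f = \mathrm{Var}$ it can be checked directly on simulated data (all distributions below are arbitrary illustrative choices):

```python
import random

# Empirical illustration of the Step 2 inequality with f = Var: merging two
# consecutive dyadic increments can only decrease the rescaled cost.
random.seed(2)
n = 3
mesh_fine, mesh_coarse = 2.0 ** -(n + 1), 2.0 ** -n

# paired fine-grid increments (Delta y - Delta x) over two consecutive cells
inc1 = [random.gauss(0.1, 0.5) * mesh_fine for _ in range(4000)]
inc2 = [0.5 * a + random.gauss(-0.2, 0.8) * mesh_fine for a in inc1]  # correlated

def var_emp(sample):
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample) / len(sample)

# cost contribution of the two fine cells vs. the merged coarse cell
fine_cost = mesh_fine * (var_emp([a / mesh_fine for a in inc1])
                         + var_emp([b / mesh_fine for b in inc2]))
coarse_cost = mesh_coarse * var_emp([(a + b) / mesh_coarse
                                     for a, b in zip(inc1, inc2)])
print(fine_cost, coarse_cost)
```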

Step 3 (discrete to continuous): We introduce auxiliary problems in path space:
$$\inf_{\mu \in \tilde{\mathcal{P}}}\ \inf_{\pi \in \Pi_c(\gamma, \mu)}\ 2^{-n}\sum_{i=0}^{2^n-1} f\bigg(\mathcal{L}^\pi\bigg(\frac{\Delta^n_i\bar\omega - \Delta^n_i\omega}{2^{-n}}\bigg)\bigg) + g\big(\mathcal{L}^\pi(\bar\omega_1)\big), \tag{P_{aux}(n)}$$
where $\Delta^n_i\omega := \omega_{(i+1)2^{-n}} - \omega_{i2^{-n}}$ and likewise for $\Delta^n_i\bar\omega$. We now prove that
$$(\mathrm{P_{aux}}(n)) = (\mathrm{P}(n)) \qquad \forall n \in \mathbb{N}. \tag{5.12}$$
First we observe that the left-hand side of (5.12) is larger than the r.h.s. Indeed, projecting a coupling from $\overline\Pi_c(\gamma, \cdot)$ onto a discretization grid gives again a causal coupling; see [36, Lemma 3.5.1]. For the converse inequality, note that Remark 5.12 implies that, for any $\nu \in \mathcal{P}(\mathbb{R}^{(2^n+1)d})$ and $\pi \in \Pi^n_c(\gamma^n, \nu)$ with finite cost in (P(n)), there exist $\mu \in \tilde{\mathcal{P}}$ and $\mathbb{P} \in \Pi_c(\gamma, \mu)$ that give the same cost in $(\mathrm{P_{aux}}(n))$.


Step 4 (convergence): Let us denote
$$c(\pi) := \int_0^1 f\big(\mathcal{L}^\pi\big((\bar\omega - \omega)'_t\big)\big)\,dt \quad\text{and}\quad c_n(\pi) := 2^{-n}\sum_{i=0}^{2^n-1} f\bigg(\mathcal{L}^\pi\bigg(\frac{\Delta^n_i\bar\omega - \Delta^n_i\omega}{2^{-n}}\bigg)\bigg),$$
the cost functionals defining the optimization problems (P) and $(\mathrm{P_{aux}}(n))$. Notice that Step 1 implies $c \ge c_n$, and Step 2 shows that $c_n$ is increasing. We now show that $c_n$ converges to $c$ whenever the latter is finite. For this it suffices to show that
$$\liminf_n c_n(\pi) \ge c(\pi). \tag{5.13}$$
We start by representing $c_n$ in an alternative manner, namely,
$$c_n(\pi) = \int_0^1 f\bigg(\mathcal{L}^\pi\bigg(\int_{\lfloor t2^n\rfloor 2^{-n}}^{(\lfloor t2^n\rfloor + 1)2^{-n}} (\bar\omega - \omega)'_s\,\frac{ds}{2^{-n}}\bigg)\bigg)\,dt.$$
By the Lebesgue differentiation theorem [21, Theorem 6, Appendix E.4], for each pair $(\omega, \bar\omega)$ such that $\bar\omega - \omega$ is absolutely continuous, there exists a $dt$-full set of times such that
$$A(t, n) := \int_{\lfloor t2^n\rfloor 2^{-n}}^{(\lfloor t2^n\rfloor + 1)2^{-n}} (\bar\omega - \omega)'_s\,\frac{ds}{2^{-n}} \longrightarrow (\bar\omega - \omega)'_t. \tag{5.14}$$
If $c(\pi) < \infty$, the set of such pairs $(\omega, \bar\omega)$ is $\pi$-full. This shows that (5.14) holds $\pi(d\omega, d\bar\omega) \otimes dt$-a.s. By Fubini's theorem, there is a $dt$-full set of times $I \subset [0, 1]$ such that, for $t \in I$, the limit (5.14) holds in the $\pi$-almost sure sense (the $\pi$-null set depends on $t$ a priori). By dominated convergence, this proves that
$$\forall t \in I : \quad \mathcal{L}^\pi(A(t, n)) \Longrightarrow \mathcal{L}^\pi\big((\bar\omega - \omega)'_t\big),$$
namely, in the sense of weak convergence of measures. By lower boundedness and lower semicontinuity of $f$, together with Fatou's lemma, we obtain
$$\liminf_n c_n(\pi) \ge \int_0^1 \liminf_n f\big(\mathcal{L}^\pi(A(t, n))\big)\,dt \ge \int_0^1 f\big(\mathcal{L}^\pi\big((\bar\omega - \omega)'_t\big)\big)\,dt,$$
establishing (5.13) and so that $c_n \uparrow c$.
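The monotone convergence $c_n \uparrow c$ is already visible on a single deterministic path pair: take $\omega = 0$ and $\bar\omega$ with density $\alpha_t = \sin(\pi t)$, and $f(\rho) = \int x^2\,d\rho(x)$, so that $c(\pi) = \int_0^1 \alpha_t^2\,dt = \tfrac12$ while $c_n$ becomes the corresponding dyadic cost with cell-averaged increments (this example is ours, not from the text):

```python
import math

# Monotone convergence c_n -> c on a deterministic path pair with density
# alpha_t = sin(pi t) and f = second moment.
def alpha(t):
    return math.sin(math.pi * t)

def c_n(n):
    # dyadic cost: per cell, square the cell average of alpha (fine quadrature)
    total, mesh = 0.0, 2.0 ** -n
    for i in range(2 ** n):
        m = 200
        avg = sum(alpha(i * mesh + (j + 0.5) * mesh / m) for j in range(m)) / m
        total += mesh * avg ** 2
    return total

c_exact = 0.5                      # integral of sin^2(pi t) over [0, 1]
vals = [c_n(n) for n in range(1, 7)]
print(vals, c_exact)
```

By Jensen's inequality each refinement of the grid can only increase the cost, and the values approach $c$ from below.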

By Steps 2 and 3, we know that the values of $(\mathrm{P_{aux}}(n))$ are increasing and bounded from above by the value of (P). We take $\pi^n$ which is $1/n$-optimal for $(\mathrm{P_{aux}}(n))$. It follows then by assumptions (i)–(iii) that $\int\!\!\int_0^1 \big|(\bar\omega - \omega)'_t\big|^p\,dt\,d\pi^n \le \bar a + \bar b\,(\mathrm{P})$, for some $\bar a, \bar b \in \mathbb{R}$. By [36, Lemma 3.6.2], we obtain the tightness of $\{\pi^n\}_n$. We may thus assume that $\pi^n \Rightarrow \pi$ weakly. By [1, Lemma 5.5], the measure $\pi$ is causal (and it obviously has first marginal $\gamma$). For $k \le n$ we have $c_k(\pi^n) \le c_n($