Markov Chain Approximation Methods on Generalized HJB
Equation
Xueping Li and Q. S. Song
Abstract— This work is concerned with numerical methods for a class of stochastic control optimizations and stochastic differential games. Numerical procedures based on Markov chain approximation techniques are developed in a framework of generalized Hamilton-Jacobi-Bellman equations. Convergence of the algorithms is derived by means of viscosity solution methods.
I. INTRODUCTION
Stochastic control has wide applications in manufacturing, communication theory, signal processing, and wireless networks; see, for example, [10], [7] and references therein. On the other hand, zero-sum stochastic differential games, as the theory of two competing controllers, extend control theory to more realistic problems. Many problems arising in, for example, pursuit-evasion games, queueing systems in heavy traffic, risk-sensitive control, and constrained optimization can be formulated as two-player stochastic differential games [2], [6].
It is well known that the value functions of stochastic optimal control problems for such systems satisfy Hamilton-Jacobi-Bellman (HJB) equations, while the value functions of stochastic differential games satisfy Hamilton-Jacobi-Isaacs (HJI) equations. Such HJB or HJI equations are usually nonlinear and difficult to solve in closed form, so numerical methods become a viable alternative. One of the most effective methods is the Markov chain approximation approach. For proofs of convergence by probabilistic methods, see [9], [10], [14] for stochastic control and [11], [15] for stochastic differential games. Viscosity solution methods provide another way to prove convergence; see [1] for stochastic control and [16] for stochastic differential games.
The idea of the generalized operator for the associated HJB equations in this work is motivated by [12] on the Q-learning problem, in which many applications of such a generalization are introduced in the framework of Markov decision processes (MDPs). In this paper, we aim to introduce generalized HJB equations, and it turns out that the HJI equation is a special case of the generalized HJB equation. For applications, we concentrate on a class of finite-horizon stochastic control problems and stochastic differential games associated with generalized HJB equations. An upwind finite difference scheme, together with its Markov chain approximation interpretation, is developed for generalized HJB equations. Convergence of the numerical solutions is established using viscosity solution techniques, which simultaneously yields convergence for both stochastic control and stochastic differential games.

Xueping Li is with the Department of Industrial and Information Engineering, University of Tennessee-Knoxville, TN 37996. Xueping.Li@utk.edu.
Q. S. Song is with the Department of Mathematics, University of Southern California, Los Angeles, CA 90089, qingshus@usc.edu. Research of this author was supported in part by the U.S. Army Research Office MURI grant W911NF-06-1-0094 at the University of Southern California.
The rest of the paper is arranged as follows. Section II begins with a description of the generalized HJB equation; the associated stochastic control problem and stochastic differential games are formulated as its applications. Section III presents an effective upwind finite difference scheme with its probabilistic interpretation. Section IV proves the convergence of the numerical scheme. Section V concludes the paper with further remarks.
II. GENERALIZED HJB EQUATIONS
Throughout the paper, we use the following notation. K is a generic constant. Q = [0, T] × R^d is the domain of the real-valued value function V(·, ·) : Q → R. U is a compact subset of a Euclidean space. σ(·, ·) : R^d × U → R^{d×d} is a matrix-valued function, and a(x, ν) = σ(x, ν)σ(x, ν)^T, where σ(x, ν)^T is the transpose of σ(x, ν). Let L : R^d × U → R be the running cost, and let ψ(·) : R^d → R be the terminal cost. For x, y ∈ R^d, xy abbreviates the inner product xy^T = Σ_{i=1}^d x_i y_i.
Consider the generalized nonlinear Hamilton-Jacobi-Bellman (HJB) equation: for all (t, x) ∈ Q,

V_t + ⊗^x_ν [f(x, ν) D_x V + (1/2) tr(a(x, ν) D²_x V) + L(x, ν)] = 0,   (1)

with boundary condition V(T, x) = ψ(x) for x ∈ R^d. Here ⊗^x_ν is an operator that summarizes values over actions as a function of the state, such that, for any real-valued functions φ_1, φ_2 and constant c, there exists some constant K such that:

(C1) ⊗^x_ν [cφ_1(x, ν) + φ_2(x)] = c ⊗^x_ν [φ_1(x, ν)] + φ_2(x).
(C2) ⊗^x_ν φ_1(x, ν) ≤ ⊗^x_ν φ_2(x, ν), whenever φ_1 ≤ φ_2.
Proceedings of the 46th IEEE Conference on Decision and Control, New Orleans, LA, USA, Dec. 12-14, 2007. ThC05.4. 1-4244-1498-9/07/$25.00 ©2007 IEEE. 4069
(C3) |⊗^x_ν φ_1(x, ν) − ⊗^x_ν φ_2(x, ν)| ≤ K max_ν |φ_1(x, ν) − φ_2(x, ν)|.
Many natural operators satisfy the above conditions, such as max_ν φ(x, ν), min_ν φ(x, ν), and ∫_U φ(x, ν) m(dν), where m(·) is a measure on B(U), the Borel σ-algebra on U. Moreover, if we consider a two-component control ν = (ν_1, ν_2), then min_{ν_1} max_{ν_2} φ(x, ν_1, ν_2) and max_{ν_2} min_{ν_1} φ(x, ν_1, ν_2) also satisfy all of the conditions above.
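To make conditions (C1)-(C3) concrete, the sketch below checks them numerically for the operators just listed on a discretized action set. All payoff values and grids are illustrative assumptions, not from the paper; for max and min, the scaling in (C1) is checked with a nonnegative constant c.

```python
import numpy as np

# Toy discrete action set and payoffs phi(x, nu) at one fixed state x.
rng = np.random.default_rng(0)
actions = np.linspace(-1.0, 1.0, 5)          # discretized compact U
phi1 = rng.standard_normal(len(actions))     # phi1(x, .) as a vector over actions
phi2_const = 0.7                             # phi2 depends on x only (constant in nu)

# Candidate summary operators over actions, each mapping a payoff vector to R.
ops = {
    "max":  lambda p: p.max(),
    "min":  lambda p: p.min(),
    "mean": lambda p: p.mean(),              # integral w.r.t. the uniform measure m
}

c = 2.5                                      # nonnegative scaling for (C1)
for name, op in ops.items():
    # (C1): op[c*phi1 + phi2] = c*op[phi1] + phi2 when phi2 is constant in nu
    assert abs(op(c * phi1 + phi2_const) - (c * op(phi1) + phi2_const)) < 1e-12
    # (C2): monotonicity
    assert op(phi1) <= op(phi1 + 0.1)
    # (C3): non-expansiveness with K = 1
    phi1b = phi1 + 0.1 * rng.standard_normal(len(actions))
    assert abs(op(phi1) - op(phi1b)) <= np.max(np.abs(phi1 - phi1b)) + 1e-12

# Two-component control: minimax summaries over a payoff matrix phi(x, nu1, nu2).
phi_mat = rng.standard_normal((4, 3))
upper = phi_mat.max(axis=1).min()            # min_{nu1} max_{nu2}
lower = phi_mat.min(axis=0).max()            # max_{nu2} min_{nu1}
assert lower <= upper                        # lower value never exceeds upper value
```

The minimax and maximin summaries at the end are exactly the operators used later for the upper and lower game values.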
To proceed, we need the following regularity assumption.
(A1) a, f, L, ψ are continuous and bounded. For φ = a, f, L, ψ, the function φ and its partial derivatives φ_{x_i}, φ_{x_i x_j} are continuous and bounded on R^d × U for i, j = 1, 2, . . . , d.
(Ω, F, F_t, P, W) is a given probability space driven by a Wiener process W_t with filtration F_t. In the following, we present two applications of the generalized HJB equation (1): a stochastic control problem and stochastic differential games.
A. Classical stochastic control problem
Suppose X_s satisfies the controlled stochastic differential equation (SDE)

dX_s = f(X_s, u_s) ds + σ(X_s, u_s) dW_s,  ∀s ∈ [t, T],   (2)

with initial condition X_t = x.
Definition 2.1: An admissible control process u on [t, T] is an F_t-progressively measurable process taking values in U. The set of all admissible controls is denoted by U(t).
The cost function for a given admissible control u(·) ∈ U(t) is defined as

J(t, x, u) = E[ ∫_t^T L(X_s, u_s) ds + ψ(X_T) ],   (3)

and the value function is defined as

V(t, x) = inf_{u∈U(t)} J(t, x, u).   (4)

It is well known that V(t, x) is the unique viscosity solution of the HJB equation (1) with ⊗^x_ν = min_ν. Similarly, if one takes the sup over all admissible controls in (4), then V(t, x) is the unique viscosity solution of (1) with ⊗^x_ν = max_ν. (See [4], [7].)
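For a fixed (non-optimized) control, the cost (3) can be estimated by Monte Carlo simulation of the SDE (2) using an Euler-Maruyama discretization. The sketch below is a minimal illustration; the coefficients f, σ, L, ψ are hypothetical choices for the demo, not from the paper.

```python
import numpy as np

def estimate_cost(x0=0.0, t=0.0, T=1.0, u=0.5, n_steps=200, n_paths=20_000, seed=1):
    """Monte Carlo estimate of J(t, x, u) in (3) for a constant control u."""
    f = lambda x, nu: -x + nu                 # illustrative drift
    sigma = lambda x, nu: 0.3                 # illustrative (constant) diffusion
    L = lambda x, nu: x**2 + 0.1 * nu**2      # illustrative running cost
    psi = lambda x: x**2                      # illustrative terminal cost
    rng = np.random.default_rng(seed)
    dt = (T - t) / n_steps
    X = np.full(n_paths, x0)
    cost = np.zeros(n_paths)
    for _ in range(n_steps):
        cost += L(X, u) * dt                  # accumulate running cost
        dW = rng.normal(scale=np.sqrt(dt), size=n_paths)
        X = X + f(X, u) * dt + sigma(X, u) * dW   # Euler-Maruyama step of (2)
    return (cost + psi(X)).mean()             # average of pathwise costs

J_hat = estimate_cost()
```

Taking the infimum of such estimates over controls is exactly what the Markov chain approximation of Section III accomplishes systematically.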
B. Stochastic differential games
Let U = U_1 × U_2, and let u = (u_1, u_2) ∈ U(t) be an admissible control, where X_s satisfies SDE (2). Here u_1 and u_2 are the controls offered by player 1 and player 2, respectively. The collections of admissible controls of player 1 and player 2 on [t, T] are denoted by U_1(t) and U_2(t). Player 1 (resp. player 2) wants to minimize (resp. maximize) the cost (3). In the following, we define the Elliott-Kalton type upper and lower values of the differential game.
Definition 2.2: An admissible strategy α (resp. β) for player 2 (resp. player 1) on [t, T] is a mapping α : U_1(t) → U_2(t) (resp. β : U_2(t) → U_1(t)) such that, for t < r < T, u_1(s) = ũ_1(s) for almost all s ∈ [t, r] implies α(u_1)(s) = α(ũ_1)(s) for almost all s ∈ [t, r] (and analogously for β). Let S_1(t) denote the class of all admissible strategies α, and S_2(t) the class of all admissible strategies β, on [t, T].
The upper value V^+(t, x) and lower value V^−(t, x) are defined as

V^+(t, x) = sup_{α∈S_1(t)} inf_{u_1∈U_1(t)} J(t, x, u_1, α(u_1)),   (5)

and

V^−(t, x) = inf_{β∈S_2(t)} sup_{u_2∈U_2(t)} J(t, x, β(u_2), u_2).   (6)
It is well known that V^+(t, x) (resp. V^−(t, x)) is the unique viscosity solution of the HJB equation (1) with ⊗^x_ν = min_{ν_1} max_{ν_2} (resp. max_{ν_2} min_{ν_1}); see [8]. If V^+(t, x) = V^−(t, x) holds for all (t, x) ∈ Q, then the differential game is said to have a saddle point, and its value is denoted by V(t, x).
III. NUMERICAL SOLUTIONS
Let e_i be the ith unit basis vector of R^d for i = 1, 2, . . . , d. For given positive discretization parameters δ, h, define discrete spaces in state and time by

Σ_δ = {x ∈ R^d : x = Σ_{i=1}^d k_i δ e_i, k_i ∈ Z},
[t, T]_h = [t, T] ∩ {t = kh + T : k ∈ Z}.   (7)
To proceed, the following assumptions will be needed.
(A2) The matrix a(x, ν) satisfies

|a_{ii}(x, ν)| − Σ_{j≠i} |a_{ij}(x, ν)| ≥ 0.

(A3) The discretization parameter δ = δ(h) is a function of h such that

h Σ_{i=1}^d [a_{ii}(x, ν) − (1/2) Σ_{j≠i} |a_{ij}(x, ν)| + δ|f_i(x, ν)|] ≤ δ².   (8)

Assumption (A2) requires that the diffusion matrix be diagonally dominant. If the given dynamical system does not satisfy (A2), one can adjust the coordinate system so that it does; see [10, page 110] and [7, page 329]. Assumption (A3) gives the relation between the two parameters δ and h used in the discretization.
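As a quick sketch, diagonal dominance (A2) can be checked directly, and (A3) can be read as an upper bound on the time step h for a given spatial step δ. The diffusion matrix and drift below are hypothetical examples, not from the paper.

```python
import numpy as np

def diagonally_dominant(a):
    """Check (A2): |a_ii| - sum_{j!=i} |a_ij| >= 0 for every row i."""
    a = np.asarray(a, dtype=float)
    off = np.sum(np.abs(a), axis=1) - np.abs(np.diag(a))
    return bool(np.all(np.abs(np.diag(a)) - off >= 0))

def max_time_step(a, f, delta):
    """Largest h allowed by (A3):
    h * sum_i [a_ii - 0.5*sum_{j!=i}|a_ij| + delta*|f_i|] <= delta**2."""
    a = np.asarray(a, dtype=float)
    f = np.asarray(f, dtype=float)
    off = np.sum(np.abs(a), axis=1) - np.abs(np.diag(a))
    denom = np.sum(np.diag(a) - 0.5 * off + delta * np.abs(f))
    return delta**2 / denom

a = np.array([[1.0, 0.2], [0.2, 0.8]])   # hypothetical a = sigma sigma^T
f = np.array([0.5, -1.0])                # hypothetical drift at one (x, nu)
assert diagonally_dominant(a)
h_max = max_time_step(a, f, delta=0.1)
```

In practice (A3) must hold uniformly over the (truncated) grid and action set, so h is chosen against the worst-case coefficients.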
By V_h(·, ·) on Σ_δ × [0, T]_h, we denote the numerical solution of (1) with parameters δ, h satisfying (8). Note that, for simplicity, we write V_h instead of V_{δ,h}.
The numerical solution V_h can be obtained by an upwind finite difference scheme; that is, for any
function φ(t, x),

Δ^{δ,±}_{x_i} φ = δ^{−1} [φ(t, x ± δe_i) − φ(t, x)],
Δ^{2,δ}_{x_i} φ = δ^{−2} [φ(t, x + δe_i) + φ(t, x − δe_i) − 2φ(t, x)] =: Δ^{δ,+}_{x_i x_i} φ = Δ^{δ,−}_{x_i x_i} φ,
Δ^{δ,+}_{x_i x_j} φ = (1/2) δ^{−2} [2φ(t, x) + φ(t, x + δe_i + δe_j) + φ(t, x − δe_i − δe_j)]
  − (1/2) δ^{−2} [φ(t, x + δe_i) + φ(t, x − δe_i) + φ(t, x + δe_j) + φ(t, x − δe_j)],
Δ^{δ,−}_{x_i x_j} φ = −(1/2) δ^{−2} [2φ(t, x) + φ(t, x + δe_i − δe_j) + φ(t, x − δe_i + δe_j)]
  + (1/2) δ^{−2} [φ(t, x + δe_i) + φ(t, x − δe_i) + φ(t, x + δe_j) + φ(t, x − δe_j)],
Δ^{h,−}_t φ = [φ(t, x) − φ(t − h, x)] / h,
Δ^{h,+}_t φ = [φ(t + h, x) − φ(t, x)] / h.   (9)
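In one dimension, the differences in (9) reduce to the familiar one-sided and central quotients; the sketch below confirms on a smooth test function that they approximate φ_x and φ_{xx} as δ → 0 (the mixed differences Δ^{δ,±}_{x_i x_j} are handled analogously in higher dimensions).

```python
import numpy as np

delta = 1e-3
x = 0.7
phi = np.sin                       # smooth test function; phi' = cos, phi'' = -sin

# One-sided first differences Delta^{delta,+} and Delta^{delta,-} from (9)
d_plus  = (phi(x + delta) - phi(x)) / delta
d_minus = (phi(x) - phi(x - delta)) / delta
# Central second difference Delta^{2,delta} from (9)
d_second = (phi(x + delta) + phi(x - delta) - 2 * phi(x)) / delta**2

assert abs(d_plus - np.cos(x)) < 1e-3      # first-order accurate
assert abs(d_minus - np.cos(x)) < 1e-3
assert abs(d_second + np.sin(x)) < 1e-3    # approximates phi'' = -sin(x)
```

The "upwind" aspect enters below: the forward difference is paired with the positive part of the drift and the backward difference with the negative part, which is what makes the scheme's coefficients usable as transition probabilities.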
Applying the above upwind finite difference scheme (9) to (1), one can write the explicit numerical scheme as

Δ^{h,−}_t V_h + ⊗^x_ν [f^+(x, ν) Δ^{δ,+}_x V_h + (1/2) tr(a^+(x, ν) Δ^{2,δ,+}_x V_h) − f^−(x, ν) Δ^{δ,−}_x V_h − (1/2) tr(a^−(x, ν) Δ^{2,δ,−}_x V_h) + L(x, ν)] = 0,   (10)
where

a^± = max{±a, 0},
Δ^{δ,±}_x φ = (Δ^{δ,±}_{x_1} φ, Δ^{δ,±}_{x_2} φ, . . . , Δ^{δ,±}_{x_d} φ)^T,
Δ^{2,δ,±}_x φ = (Δ^{δ,±}_{x_i x_j} φ)_{i,j=1,...,d}.   (11)
Note that Δ^{2,δ,±}_x is a symmetric matrix. In the following, we give an equivalent Markov chain approximation interpretation of the above upwind finite difference scheme. One can rewrite (10) with boundary conditions as

V_h(t − h, x) = ⊗^x_ν [ Σ_{y∈Σ_δ} p_h(x, y, ν) V_h(t, y) + hL(x, ν) ],  t ∈ [h, T]_h, x ∈ Σ_δ,
V_h(T, x) = ψ(x),  x ∈ Σ_δ,   (12)
where

p_h(x, x ± δe_i, ν) = h/(2δ²) [a_{ii}(x, ν) − Σ_{j≠i} |a_{ij}(x, ν)| + 2δ f^±_i(x, ν)],
p_h(x, x + δe_i + δe_j, ν) = p_h(x, x − δe_i − δe_j, ν) = h/(2δ²) a^+_{ij}(x, ν),  i ≠ j,
p_h(x, x + δe_i − δe_j, ν) = p_h(x, x − δe_i + δe_j, ν) = h/(2δ²) a^−_{ij}(x, ν),  i ≠ j,
p_h(x, x, ν) = 1 − (h/δ²) Σ_{i=1}^d [a_{ii}(x, ν) − (1/2) Σ_{j≠i} |a_{ij}(x, ν)| + δ|f_i(x, ν)|],
p_h(x, y, ν) = 0, otherwise.   (13)
Note that, under assumptions (A2) and (A3), we have

Σ_{y∈Σ_δ} p_h(x, y, ν) = 1;  p_h(x, y, ν) ≥ 0.   (14)
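In one dimension (no cross terms), (13) reduces to a birth-death chain that moves to x ± δ or stays put. The sketch below builds these probabilities for hypothetical coefficients and checks property (14); the (A3)-maximal time step makes the staying probability exactly zero.

```python
import numpy as np

def transitions_1d(a, f, delta, h):
    """One-dimensional instance of (13): a = sigma^2 (scalar), f = drift."""
    f_plus, f_minus = max(f, 0.0), max(-f, 0.0)
    p_up   = h / (2 * delta**2) * (a + 2 * delta * f_plus)    # to x + delta
    p_down = h / (2 * delta**2) * (a + 2 * delta * f_minus)   # to x - delta
    p_stay = 1.0 - h / delta**2 * (a + delta * abs(f))        # stay at x
    return p_up, p_down, p_stay

a, f, delta = 0.5, -1.0, 0.1          # hypothetical coefficients at one (x, nu)
h = delta**2 / (a + delta * abs(f))   # largest h allowed by (A3) here
p_up, p_down, p_stay = transitions_1d(a, f, delta, h)

assert abs(p_up + p_down + p_stay - 1.0) < 1e-12   # probabilities sum to one, (14)
assert min(p_up, p_down, p_stay) >= 0.0            # and are nonnegative
```

Note how the upwinding shows up: the negative drift f = −1 enlarges p_down relative to p_up, so the chain drifts in the same direction as the SDE.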
In view of (14), we can consider p_h(·) as the one-step transition probability of a Markov chain {x^h_n : n = 0, 1, 2, . . .} in the state space Σ_δ, with the cost function defined by

Ṽ_h(k, x) = E[ Σ_{n=k}^{T/h−1} h L(x^h_n, u^h_n) + ψ(x^h_{T/h}) ].   (15)

Then the dynamic programming equation of Ṽ_h is exactly the same as (12). Hence, by uniqueness, we have Ṽ_h = V_h.
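The backward recursion (12) is directly implementable on a truncated grid. The sketch below takes ⊗^x_ν = min over a discretized action set (the stochastic control case), with illustrative one-dimensional dynamics and costs and a crude reflecting condition at the edges of the truncated grid; every coefficient choice here is an assumption for the demo, not from the paper.

```python
import numpy as np

# Illustrative data: f, a = sigma^2, running cost L, terminal cost psi.
f = lambda x, nu: nu - x
sig2 = lambda x, nu: 0.25
L = lambda x, nu: x**2 + 0.1 * nu**2
psi = lambda x: x**2

delta, T = 0.1, 1.0
xs = np.arange(-2.0, 2.0 + delta / 2, delta)   # truncated state grid
actions = np.linspace(-1.0, 1.0, 5)            # discretized compact U
h = delta**2 / (0.25 + delta * 3.0)            # (A3) with worst-case |f| <= 3 here
n_steps = int(np.ceil(T / h))

V = psi(xs)                                    # terminal condition V(T, .) = psi
for _ in range(n_steps):                       # backward recursion (12)
    Vp = np.roll(V, -1); Vm = np.roll(V, 1)    # V(t, x + delta), V(t, x - delta)
    Vp[-1], Vm[0] = V[-1], V[0]                # crude reflecting boundary
    best = np.full_like(V, np.inf)
    for nu in actions:                         # take min over actions (the control case)
        fp, fm = np.maximum(f(xs, nu), 0), np.maximum(-f(xs, nu), 0)
        p_up   = h / (2 * delta**2) * (sig2(xs, nu) + 2 * delta * fp)
        p_down = h / (2 * delta**2) * (sig2(xs, nu) + 2 * delta * fm)
        p_stay = 1.0 - p_up - p_down
        cand = p_up * Vp + p_down * Vm + p_stay * V + h * L(xs, nu)
        best = np.minimum(best, cand)
    V = best

V0_at_origin = V[np.argmin(np.abs(xs))]        # approximation of V(0, 0)
```

Replacing the inner minimum with a max, or with a min-max over a two-component action grid, yields the game schemes of Corollary 4.7 with the same transition probabilities.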
Remark 3.1: An implicit numerical scheme can be obtained by replacing Δ^{h,−}_t φ with Δ^{h,+}_t φ in (10), that is,

Δ^{h,+}_t V_h + ⊗^x_ν [f^+(x, ν) Δ^{δ,+}_x V_h + (1/2) tr(a^+(x, ν) Δ^{2,δ,+}_x V_h) − f^−(x, ν) Δ^{δ,−}_x V_h − (1/2) tr(a^−(x, ν) Δ^{2,δ,−}_x V_h) + L(x, ν)] = 0.   (16)
The above implicit numerical scheme also has a probabilistic interpretation when discrete time is treated as another state variable; see [10, Chapter 12].
The next section finds sufficient conditions such that V_h of the explicit scheme (10) converges to the unique viscosity solution V of (1). For the convergence of the implicit scheme (16), an analogous argument applies.
IV. CONVERGENCE
To show the convergence of V_h of the explicit scheme (10) with boundary conditions, one can rewrite (10) as

V_h(t − h, x) = F_h[V_h(t, ·)](x),  t ∈ [h, T]_h, x ∈ Σ_δ,
V_h(T, x) = ψ(x),  x ∈ Σ_δ,   (17)
where Σ_δ = {x ∈ R^d : x = Σ_{i=1}^d k_i δ e_i, k_i ∈ Z}, [h, T]_h = {t : h ≤ t ≤ T, t = kh, k ∈ Z}, and F_h[φ](x) is an operator acting on functions φ : R^d → R, such that

F_h[φ](x) = φ(x) + h ⊗^x_ν [f^+(x, ν) Δ^{δ,+}_x φ(x) − f^−(x, ν) Δ^{δ,−}_x φ(x) + (1/2) tr(a^+(x, ν) Δ^{2,δ,+}_x φ(x)) − (1/2) tr(a^−(x, ν) Δ^{2,δ,−}_x φ(x)) + L(x, ν)].   (18)
Note that, by condition (C1), one can rewrite (18) as

F_h[φ](x) = ⊗^x_ν [ Σ_{y∈Σ_δ} p_h(x, y, ν) φ(y) + hL(x, ν) ].   (19)
Lemma 4.1: Assume (A1), (A2), and (A3). Then the following properties hold:

F_h[φ_1] ≤ F_h[φ_2],  for all φ_1 ≤ φ_2,   (20)
F_h[φ + c] = F_h[φ] + c,  ∀c ∈ R,   (21)
‖V_h‖_∞ ≤ K,  ∀ 0 < h < 1,   (22)

and, for all φ ∈ C^{1,2}(Q̄),

lim_{(s,y)→(t,x), h→0} ( F_h[φ(s, ·)](y) − φ(s − h, y) ) / h
 = φ_t + ⊗^x_ν [f(x, ν) D_x φ(t, x) + (1/2) tr(a(x, ν) D²_x φ(t, x)) + L(x, ν)].   (23)
Proof. Note that (20) and (21) follow from (A2), (A3), and (14). Rewrite (17) as

V_h(t − h, x) = F_h[V_h(t, ·)](x) = ⊗^x_ν [ Σ_{y∈Σ_δ} p_h(x, y, ν) V_h(t, y) + hL(x, ν) ].   (24)
Then, for any t ∈ [h, T],

V_h(t − h, x) ≤ ⊗^x_ν [max_y V_h(t, y) + hL(x, ν)]
 ≤ max_y V_h(t, y) + h ⊗^x_ν L(x, ν)
 ≤ max_y V_h(t, y) + Kh‖L‖_∞.   (25)
This leads to the stability of F_h, that is, for any 0 ≤ m ≤ T/h,

max_x V_h(T − mh, x) ≤ max_x V_h(T, x) + Kmh‖L‖_∞ ≤ ‖ψ‖_∞ + KT‖L‖_∞ < ∞.   (26)

Hence, (22) holds.
For any test function φ ∈ C^{1,2}(Q̄), one can verify the consistency property (23) as follows:

lim_{(s,y)→(t,x), h→0} ( F_h[φ(s, ·)](y) − φ(s − h, y) ) / h
 = lim_{(s,y)→(t,x), h→0} ( φ(s, y) − φ(s − h, y) ) / h
  + lim_{(s,y)→(t,x), h→0} ⊗^y_ν [ L(y, ν) + f^+(y, ν) Δ^{δ,+}_x φ(s, y) − f^−(y, ν) Δ^{δ,−}_x φ(s, y) + (1/2) tr(a^+(y, ν) Δ^{2,δ,+}_x φ(s, y)) − (1/2) tr(a^−(y, ν) Δ^{2,δ,−}_x φ(s, y)) ]
 = φ_t + ⊗^x_ν [ f(x, ν) D_x φ(t, x) + (1/2) tr(a(x, ν) D²_x φ(t, x)) + L(x, ν) ].   (27)

This completes the proof. □
Definition 4.2: We say that V is a viscosity solution of equation (1) if:
(a) V(t, x) is an upper semicontinuous function on Q and, for each φ ∈ C^∞(Q),

φ_t(t̄, x̄) + ⊗^x_ν [f(x̄, ν) D_x φ(t̄, x̄) + (1/2) tr(a(x̄, ν) D²_x φ(t̄, x̄)) + L(x̄, ν)] ≥ 0   (28)

at every (t̄, x̄) ∈ Q which is a strict maximizer of V − φ on Q̄;
(b) V(t, x) is a lower semicontinuous function on Q and, for each φ ∈ C^∞(Q),

φ_t(t̄, x̄) + ⊗^x_ν [f(x̄, ν) D_x φ(t̄, x̄) + (1/2) tr(a(x̄, ν) D²_x φ(t̄, x̄)) + L(x̄, ν)] ≤ 0   (29)

at every (t̄, x̄) ∈ Q which is a strict minimizer of V − φ on Q̄.
If (a) (respectively (b)) holds, then V is said to be a subsolution (respectively supersolution) of (1).
For (t, x) ∈ Q, define the upper and lower semicontinuous envelopes of the solutions V_h as

V^*(t, x) = lim sup_{(s,y)→(t,x), h→0} V_h(s, y),
V_*(t, x) = lim inf_{(s,y)→(t,x), h→0} V_h(s, y).   (30)
Lemma 4.3: Under assumptions (A1), (A2), and (A3), V^* (resp. V_*) defined in (30) is a viscosity subsolution (resp. supersolution) of equation (1).
Proof. Suppose that φ ∈ C^∞(Q) is a test function such that V^* − φ has a strict maximum at (t̄, x̄) ∈ Q. Then there is a sequence of h converging to zero such that V_h − φ has a maximum on [0, T]_h × Σ_δ at (s_h, y_h), with (s_h, y_h) → (t̄, x̄) as h → 0. Therefore, for all y ∈ Σ_δ,

φ(s_h + h, y) − φ(s_h, y_h) ≥ V_h(s_h + h, y) − V_h(s_h, y_h).   (31)

By virtue of (20) and (21),

F_h[φ(s_h + h, ·)](y_h) − φ(s_h, y_h) ≥ F_h[V_h(s_h + h, ·)](y_h) − V_h(s_h, y_h).   (32)

By (17), the right-hand side of (32) is zero. Dividing by h and letting h → 0, the left-hand side of (32) converges to (23). Thus, (28) holds. One can prove that V_* is a supersolution in a similar fashion. □
In the following lemma, by A ≥ B for symmetric matrices we mean that A − B is symmetric positive semidefinite.
Lemma 4.4: Suppose (A1), (A2), and (A3) hold. Let φ and φ̄ be bounded viscosity subsolution and supersolution of (1), respectively. Then

sup_{Q̄} (φ − φ̄) = sup_{y∈R^d} (φ(T, y) − φ̄(T, y)).   (33)
Proof. By virtue of [7, Theorem V.9.1], it is enough to show that there exists a constant K such that

⊗^y_ν [αf(y, ν)(x − y) + (1/2) tr(a(y, ν)B) + L(y, ν)] − ⊗^x_ν [αf(x, ν)(x − y) + (1/2) tr(a(x, ν)A) + L(x, ν)] ≤ K(α|x − y|² + |x − y|),   (34)

for every (t, x), (t, y) ∈ Q, α > 0, and symmetric matrices A, B satisfying

−3α [ I 0; 0 I ] ≤ [ B 0; 0 −A ] ≤ 3α [ I −I; −I I ],   (35)

where [ ·; · ] denotes a 2×2 block matrix.
By condition (C3) of the operator ⊗^x_ν, one can write

⊗^y_ν [αf(y, ν)(x − y) + (1/2) tr(a(y, ν)B) + L(y, ν)] − ⊗^x_ν [αf(x, ν)(x − y) + (1/2) tr(a(x, ν)A) + L(x, ν)]
 ≤ K max_ν |α(f(y, ν) − f(x, ν))(x − y)| + K max_ν |L(y, ν) − L(x, ν)| + K max_ν |tr(a(y, ν)B − a(x, ν)A)|.   (36)

Note that assumption (A1) implies Lipschitz continuity of the functions f and L. Hence

max_ν |α(f(y, ν) − f(x, ν))(x − y)| + max_ν |L(y, ν) − L(x, ν)| ≤ K(α|x − y|² + |x − y|).   (37)
For the last term,

tr(a(y, ν)B − a(x, ν)A)
 = tr( [ σ(y, ν)σ(y, ν)^T  σ(y, ν)σ(x, ν)^T; σ(x, ν)σ(y, ν)^T  σ(x, ν)σ(x, ν)^T ] · [ B 0; 0 −A ] )
 ≤ 3α tr( [ σ(y, ν)σ(y, ν)^T  σ(y, ν)σ(x, ν)^T; σ(x, ν)σ(y, ν)^T  σ(x, ν)σ(x, ν)^T ] · [ I −I; −I I ] )
 = 3α tr( σ(y, ν)σ(y, ν)^T − σ(y, ν)σ(x, ν)^T − σ(x, ν)σ(y, ν)^T + σ(x, ν)σ(x, ν)^T )
 = 3α tr( (σ(y, ν) − σ(x, ν))(σ(y, ν) − σ(x, ν))^T )
 = 3α ‖σ(y, ν) − σ(x, ν)‖² ≤ Kα|x − y|².   (38)

The above inequalities combined imply the result. □
Theorem 4.5: Suppose (A1), (A2), and (A3) hold. Then V^* = V_*, and V := V^* = V_* is the unique viscosity solution of (1) on Q.
Proof. By the definition (30), V^* ≥ V_*. Moreover, (17) and (30) imply that V^* and V_* have the same boundary condition on {T} × R^d. Applying Lemma 4.3 and Lemma 4.4, one obtains V^* ≤ V_*. Thus, V = V^* = V_* is a viscosity solution of (1). Uniqueness also follows from Lemma 4.4. □
The following corollaries are straightforward consequences of Theorem 4.5.
Corollary 4.6: Suppose (A1), (A2), and (A3) hold. Let V(t, x) be the value function defined by (4), and let V_h(t, x) be the approximate value function of (10) with ⊗^x_ν replaced by min_ν. Then V_h(t, x) converges to V(t, x) as h → 0.
Corollary 4.7: Suppose (A1), (A2), and (A3) hold. Let V^+(t, x) (resp. V^−(t, x)) be the value function defined by (5) (resp. (6)), and let V_h^+(t, x) (resp. V_h^−(t, x)) be the approximate value function of (10) with ⊗^x_ν replaced by min_{ν_1} max_{ν_2} (resp. max_{ν_2} min_{ν_1}). Then V_h^+(t, x) (resp. V_h^−(t, x)) converges to V^+(t, x) (resp. V^−(t, x)) as h → 0.
V. FURTHER REMARKS
In this work, the generalized HJB equation is proposed, which is associated with both stochastic control and stochastic differential games. The proof of convergence is given by the viscosity solution method; probabilistic methods analogous to [9], [15] can also be used to prove convergence. Another approachable direction is controlled stochastic hybrid systems, a formulation with extensive recent applications in risk theory, financial engineering, and insurance modeling; see [5], [13], [17], [18], [19]. It would also be interesting to find further applications in the manner of generalized HJB equations, such as risk-sensitive models, exploration-sensitive models, and non-zero-sum differential games.
REFERENCES
[1] G. Barles and P. E. Souganidis, Convergence of approximation schemes for fully nonlinear second order equations, Asymptotic Analysis, 4 (1991), 271-283.
[2] T. Basar and P. Bernhard, H∞-Optimal Control and Related Minimax Problems, Birkhäuser, Boston, 1991.
[3] D. P. Bertsekas, Dynamic Programming: Deterministic and Stochastic Models, Upper Saddle River, NJ, 1987.
[4] M. G. Crandall, H. Ishii, and P. L. Lions, User's guide to viscosity solutions of second order partial differential equations, Bulletin of the American Mathematical Society, 27.1 (1992), 1-67.
[5] G. B. Di Masi, Y. M. Kabanov, and W. J. Runggaldier, Mean variance hedging of options on stocks with Markov volatility, Theory of Probability and Applications, 39 (1994), 173-181.
[6] R. J. Elliott and N. J. Kalton, Existence of Value in Differential Games, Mem. Amer. Math. Soc., 126, Providence, RI, 1972.
[7] W. H. Fleming and H. M. Soner, Controlled Markov Processes and Viscosity Solutions, 2nd edition, Springer-Verlag, Berlin, New York, 2006.
[8] W. H. Fleming and P. E. Souganidis, On the existence of value functions of two-player, zero-sum stochastic differential games, Indiana Univ. Math. J., 38.2 (1989), 293-314.
[9] H. J. Kushner, Numerical methods for stochastic control problems in continuous time, SIAM J. Control Optim., 28 (1990), 999-1048.
[10] H. J. Kushner and P. Dupuis, Numerical Methods for Stochastic Control Problems in Continuous Time, 2nd edition, Springer-Verlag, New York, Berlin, 2001.
[11] H. J. Kushner, Numerical approximations for stochastic differential games, SIAM J. Control Optim., 41.2 (2002), 457-486.
[12] M. L. Littman and C. Szepesvari, A generalized reinforcement-learning model: convergence and applications, Proceedings of the 13th International Conference on Machine Learning (ICML-96), Bari, Italy (1996), 310-318.
[13] T. Rolski, H. Schmidli, V. Schmidt, and J. Teugels, Stochastic Processes for Insurance and Finance, Wiley and Sons, New York, 1999.
[14] Q. S. Song, G. Yin, and Z. Zhang, Numerical method for controlled regime-switching diffusions and regime-switching jump diffusions, Automatica, 42 (2006), 1147-1158.
[15] Q. S. Song and G. Yin, Existence of saddle points in discrete Markov games and its application in numerical methods for stochastic differential games, Proceedings of the 45th IEEE Conference on Decision & Control (2006), 6325-6330.
[16] P. E. Souganidis, Two-player, zero-sum differential games and viscosity solutions, in Stochastic and Differential Games, Ann. Internat. Soc. Dynam. Games, 4, Birkhäuser, Boston, MA (1999), 69-104.
[17] G. Yin and Q. Zhang, Continuous-Time Markov Chains and Applications: A Singular Perturbation Approach, Springer-Verlag, New York, 1998.
[18] G. Yin and Q. Zhang, Discrete-Time Markov Chains: Two-Time-Scale Methods and Applications, Springer, New York, 2005.
[19] Q. Zhang, Stock trading: an optimal selling rule, SIAM J. Control Optim., 40 (2001), 64-87.