Computing Dynamic User Equilibrium on Large-Scale Networks Without Knowing Global Parameters

Duong Viet Thong (1), Aviv Gibali (2,3), Mathias Staudigl (4), and Phan Tu Vuong (5)

(1) Division of Applied Mathematics, Thu Dau Mot University, Binh Duong Province, Vietnam (duongvietthong@tdmu.edu.vn)
(2) Department of Mathematics, ORT Braude College, P.O. Box 78, Karmiel 2161002, Israel
(3) The Center for Mathematics and Scientific Computation, University of Haifa, Mt. Carmel, Haifa, Israel (avivg@braude.ac.il)
(4) Department of Data Science and Knowledge Engineering, Maastricht University, P.O. Box 616, NL-6200 MD Maastricht, The Netherlands (m.staudigl@maastrichtuniversity.nl)
(5) Mathematical Sciences, University of Southampton, Highfield, Southampton SO17 1BJ, United Kingdom (T.V.Phan@soton.ac.uk)

June 15, 2021

Abstract

Dynamic user equilibrium (DUE) is a Nash-like solution concept describing an equilibrium

in dynamic traﬃc systems over a ﬁxed planning period. DUE is a challenging class of equilib-

rium problems, connecting network loading models and notions of system equilibrium in one

concise mathematical framework. Recently, Friesz and Han introduced an integrated frame-

work for DUE computation on large-scale networks, featuring a basic ﬁxed-point algorithm for

the eﬀective computation of DUE. In the same work, they present an open-source MATLAB

toolbox which allows researchers to test and validate new numerical solvers. This paper builds

on this seminal contribution, and extends it in several important ways. At a conceptual level,

we provide new strongly convergent algorithms designed to compute a DUE directly in the

inﬁnite-dimensional space of path ﬂows. An important feature of our algorithms is that they

give provable convergence guarantees without knowledge of global parameters. In fact, the
algorithms we propose are adaptive, in the sense that they do not need a priori knowledge
of global parameters of the delay operator, and they are provably convergent even for delay
operators which are non-monotone. We implement our numerical schemes on standard test

instances, and compare them with the numerical solution strategy employed by Friesz and Han.

Keywords Dynamic Traﬃc Assignment; Fixed Point Iteration; Strong Convergence

1 Introduction

This paper is concerned with a class of models known as Dynamic User Equilibrium (DUE). DUE

problems have been studied within the broader context of Dynamic Traﬃc Assignment (DTA),

which is concerned with modeling time-varying traﬃc ﬂows consistent with established traﬃc

ﬂow theory. DTA models are greatly inﬂuenced by Wardrop’s equilibrium principle [52], which is

seen as a Nash-like equilibrium condition in an aggregative game:

(a) Wardrop’s ﬁrst principle, also known as the user optimality principle, states that road seg-

ments used in an equilibrium should display the same travel costs (i.e. delay);

(b) Wardrop's second principle, known as the system optimality principle, assumes that drivers
behave cooperatively in making travel decisions, so that the overall system costs (aggregate
delays) are minimized.

Logically, the behavioral maxims (a) and (b) are disconnected, and a substantive literature in trans-

portation research is concerned with the design of computational architectures aligning these po-

tentially conﬂicting principles. Since the seminal work of [33,34], dynamic extensions of Wardrop’s

principles have paved the way to the introduction of notions like DUE and Dynamic System Opti-

mal (DSO) models. For comprehensive reviews of DTA models, we refer to [28,39,51].

In the last two decades there have been many efforts to develop a theoretically sound
formulation of DUE, acceptable to modelers and practitioners alike. Analytical DUE models tend

to be of two varieties: (1) Route Choice (RC) DUE [12,33,34,55], and (2) Simultaneous Route and

Departure Choice (SRDC) DUE [13–15,41]. Both types of DUE rest on two pillars:

1. A mathematical notion of equilibrium;

2. A model of network performance, based on some physical laws describing traﬃc ﬂows.

The second pillar is known in the literature as Dynamic Network Loading (DNL). Equilibrium

is usually expressed in terms of Wardrop’s ﬁrst principle. Mathematical approaches to describe

equilibrium contain variational inequalities (VI) [13,55], nonlinear complementarity problems

[25,38], diﬀerential variational inequalities [11,37] and ﬁxed point problems [15]. In this paper

we choose the VI formulation of DUE, and our aim is to advance computational techniques for the

practical solution of DUE. Our research builds on, and extends, recent advances in computational

approaches to DUE reported in [24]. As is well known, computing a user equilibrium is a challenging
task. Its main complication arises because it constitutes an interconnected computational procedure,
coupling equilibrium computation with DNL. The DNL, which could be understood as the first

layer of the problem, aims at describing the spatial and temporal evolution of traﬃc ﬂows on a

network that is consistent with established route and departure choices of travelers. This is done

by formulating appropriate dynamics to ﬂow propagation, ﬂow conservation, link delay, and path

delay on a network level. In general, DNL models have the following components:

1. Some form of link and/or path dynamics;

2. A computationally friendly relationship between flow/speed/density and link traversal
time;

3. Flow propagation constraints;

4. A model of junction dynamics (Riemann Solvers) and delays;

5. A model of path traversal times, and

6. Appropriate initial conditions.

DNL generates the path delay operator, which is the key input when computing an equilibrium

given the delays on user routes (travel costs). This is the second layer of the problem, and of main

interest in this paper. At this layer one has to use some equilibrium solver, whose performance

depends signiﬁcantly on the information we have about the structural properties of the delay

operator. However, since the delay operator is itself the result of a computational procedure, it is

not available in closed form, and thus one is confronted essentially with a black-box upon which

we can assume whatever we ﬁnd useful, but the empirical validation of these assumptions is very

hard. It is thus of utmost importance to have at our disposal eﬃciently implementable algorithms

which are:

Algorithm            DUE Model   Assumptions                          Convergence   References
-------------------  ----------  -----------------------------------  ------------  ----------
Projected gradient   SRDT        Lipschitz cont., strongly monotone   strong        [15]
Descent algorithm    SRDT        Co-coercive                          weak          [45]
Route-swapping       RC DUE      Monotone                             weak          [46]
Route-swapping       SRDT DUE    Continuous, monotone                 weak          [26]
Route-swapping       SRDT DUE    Continuous, monotone                 weak          [47]
Extragradient        RC DUE      Lipschitz cont., pseudo-monotone     weak          [31]
Self-adaptive        SRDT DUE    D-property                           weak          [21]
Proximal point       SRDT DUE    Dual solvable                        weak          [20]
FBF                  SRDT DUE    Lipschitz cont., pseudo-monotone     strong        [10]
Inertial-FBF         SRDT DUE    Lipschitz cont., pseudo-monotone     strong        This paper

Table 1: Computational algorithms for DUE (adapted from [24]). The algorithms are arranged in
increasing order of generality of the monotonicity assumption.

(i) Adaptive to arrival of new information about unknown global parameters;

(ii) Provably convergent under mild monotonicity assumptions.

We argue that, up to now, none of the existing DUE solvers meets both of these criteria. To
support this claim, we present Table 1, which summarizes the current state of the art in DUE
computation (see footnote 1).

We infer from Table 1 that known algorithmic strategies for solving the DUE problem require
knowledge of the global Lipschitz constant and some sort of monotonicity of the path delay
operator. Since the delay operator is not given to us in closed form, neither assumption can be
verified in practice. Algorithmic strategies which are provably convergent without explicit knowledge
of these global properties are thus a valuable contribution.

1.1 Our Contributions

This paper takes a significant step forward relative to the existing computational literature on
DUE, by describing two numerical algorithms acting directly in infinite-dimensional Hilbert spaces.

Our algorithms share the following features:

(i) Strong convergence to a single user equilibrium;
(ii) Adaptive step-size choices without the need to know global Lipschitz parameters of the delay operator;
(iii) Provable convergence under a plain pseudo-monotonicity assumption on the path delay operator;
(iv) Inertial and relaxation effects to potentially speed up the convergence.

Footnote 1: In this table we focus on algorithms acting directly on the infinite-dimensional Hilbert
space formulation of DUE. A much larger literature on this topic exists which is concerned with
finite-dimensional approximations. In the parlance of numerical mathematics, the latter would
correspond to a "first discretize, then optimize" strategy. As the two approaches are quite
different, a direct comparison would not be fair.

While items (ii) and (iii) don’t need much motivation, our emphasis on strongly convergent methods

seems to be somewhat pedantic at ﬁrst sight, so it deserves some words of explanation.

In infinite-dimensional settings strongly convergent iterative schemes are much more desirable
than weakly convergent ones, since strong convergence expresses the physically tangible property
that the energy $\|h_n - h^*\|^2$ of the error between the iterate $h_n$ and a solution $h^*$ eventually becomes
arbitrarily small. Of course, any numerical solution technique designed for solving a problem

in inﬁnite dimensions must be applied to a ﬁnite-dimensional approximation of the problem.

Exactly in such situations strongly convergent methods are extremely powerful, because they

guarantee stability with respect to numerical discretization. In fact, [17] demonstrated that strongly

convergent schemes might even exhibit faster convergence rates as compared to their weakly

convergent counterparts. It seems therefore fair to say that strong convergence is an extremely

desirable property of solution schemes, with clearly observable physical consequences on the

performance and stability of algorithms. As a matter of fact, [15] employs a projected gradient

iteration of Halpern type [3,18], which forces trajectories to converge strongly to some DUE.

Adaptivity in the step-size policy frees us from any unavailable information about the global

Lipschitz constant of the delay operator. It allows us to tune the step size on-the-ﬂy and guarantees

convergence for general pseudo-monotone operators with good performance properties.

Operator splitting methods with inertia and relaxation have received quite some attention in

recent years, see e.g. [1,27,32]. These schemes are motivated by Nesterov’s accelerated method

[35], and therefore the main motivation for inertial methods is to speed up the convergence rate.

To the best of our knowledge this is the ﬁrst time that inertial and relaxation eﬀects are investigated

in the context of DUE computation and under weak pseudo-monotonicity assumptions.

Remark 1.1. In previous work, [10] investigated the DUE problem with a strongly convergent FBF variant.
This paper replaces and significantly extends our previous work by the explicit consideration of
inertial effects.

1.2 Organization of the paper

Sections 2 and 3 describe user equilibrium and the DNL procedure we use in our numerical experiments.
In setting up these two layers we closely follow [24]. Section 4 describes the algorithms
we construct and investigate in this paper. Building on the MATLAB toolbox publicly available
at https://github.com/DrKeHan/DTA and documented in [24], we report the outcomes of our
experiments in Section 5. Technical facts and proofs are organized in Sections 6.1 and 6.2.

2 Dynamic User Equilibrium

We introduce a few notations and terminologies for the ease of presentation below.

• $\mathcal{P}$: set of paths in the network.
• $\mathcal{W}$: set of origin-destination (O-D) pairs in the network.
• $Q_w$: fixed O-D demand for the pair $w \in \mathcal{W}$.
• $\mathcal{P}_w$: subset of paths that connect O-D pair $w$.
• $t$: continuous time parameter in the fixed time horizon $[t_0, t_1]$.
• $h_p(t)$: departure rate along path $p$ at time $t$.
• $h(t)$: complete profile of departure rates, $h(t) = \{h_p(t);\ p \in \mathcal{P}\}$.
• $A_p(t,h)$: effective travel cost along path $p$ with departure time $t$ under the path profile $h$.
• $\nu_w(h)$: minimum travel cost between O-D pair $w \in \mathcal{W}$ for all paths and all departure times.

2.1 Formulation of DUE as a Variational inequality

Let $[t_0, t_1]$ be a fixed planning horizon. We are given a connected directed graph $G = (V, E)$ with
finite set of vertices $V$, representing traffic intersections (junctions), and arc set $E$, representing road
segments. A path $p$ in the graph $G$ is identified with a non-repeating finite sequence of links it
traverses, i.e. $p = \{I_1, I_2, \ldots, I_{m(p)}\}$, where $m(p)$ is the number of links in this path. We denote the set
of all paths by $\mathcal{P}$, and set $H := \mathbb{R}^{|\mathcal{P}|}$. We are interested in paths which connect a set of distinguished
vertices acting as the origin-destination (O-D) pairs in our graph. We are given $N$ distinct O-D pairs
denoted as $w_1, \ldots, w_N$, where each $w_i = (o_i, d_i) \in V \times V$. Call $\mathcal{W} := \{w_1, \ldots, w_N\}$ the collection of all
O-D pairs, and denote the set of paths connecting the O-D pair $w$ by $\mathcal{P}_w \subseteq \mathcal{P}$. For each O-D
pair $w \in \mathcal{W}$ we are given an exogenous demand $Q_w > 0$, which represents the number of drivers
who have to travel from the origin to the destination described by $w$. The list $Q = (Q_w)_{w \in \mathcal{W}}$ is
often called the trip table. In DUE modeling, the single most crucial ingredient is the path delay
operator, which maps a given vector of departure rates (path flows) $h$ to a vector of path travel
times. We stipulate that path flows are square integrable functions over the planning horizon, so
that $h_p \in L^2([t_0, t_1]; \mathbb{R}_+)$ and $h = (h_p;\ p \in \mathcal{P}) \in \mathcal{H} := L^2([t_0, t_1]; H)$. To measure the delay of drivers on
paths, we introduce the operator $D: \mathcal{H} \to \mathcal{H}$, $h \mapsto D(h)$, with the interpretation that $D_p(t,h)$ is the
path travel time of a driver departing at time $t$ from the origin of path $p$, and following this path
throughout. This operator is the result of some DNL procedure, which is an integrated subroutine
in the dynamic traffic assignment problem. See Section 3 for a description of the DNL used in our
computational experiments.

On top of path delays, we consider penalty terms of the form $\phi(t + D_p(t,h) - \tau)$, penalizing all
arrival times different from the target time $\tau > 0$ (i.e. the usual time of a trip on the O-D pair $w$).
The function $\phi: [-\infty, \infty) \to [0, \infty]$ should be monotonically increasing with $\phi(a) > 0$ for $a > 0$ and
$\phi(a) = 0$ for $a \le 0$. Define the effective delay operator as
\[ A_p(t,h) := D_p(t,h) + \phi(t + D_p(t,h) - \tau). \tag{2.1} \]
We thus obtain an operator $A: \mathcal{H} \to \mathcal{H}$, mapping each profile of path departure rates $h$ to effective
delays $A(h) = \{A_p(t,h);\ t \in [t_0, t_1]\} \in \mathcal{H}$.
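The effective delay in (2.1) can be illustrated with a minimal sketch. The linear lateness penalty and its weight are hypothetical choices made only for this example (they satisfy $\phi(a) = 0$ for $a \le 0$ and $\phi(a) > 0$ for $a > 0$); in practice the path delay would come from the DNL procedure.

```python
def late_penalty(lateness: float, weight: float = 0.5) -> float:
    """Hypothetical penalty phi: zero when on time, linear in the lateness."""
    return weight * max(lateness, 0.0)


def effective_delay(t: float, path_delay: float, target: float) -> float:
    """Evaluate (2.1): A_p(t, h) = D_p(t, h) + phi(t + D_p(t, h) - tau)."""
    arrival = t + path_delay  # arrival time of a driver departing at time t
    return path_delay + late_penalty(arrival - target)
```

For a departure at `t = 9.5` with path delay `1.0` and target time `10.0`, the driver arrives half a time unit late, and the penalty adds `0.25` to the raw delay.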

We follow the existing DUE literature, and stipulate that Wardrop's first principle holds: users
of the network aim to minimize their own travel time, given the departure rates in the system. Thus,
a user equilibrium is envisaged, where the delays (interpreted as costs) of all travelers in the same
O-D pair are equal, and no traveler can lower his/her costs by unilaterally switching to a different
route. To put this behavioral axiom into a mathematical framework, we first formulate the meaning
of "minimal costs" in the present Hilbert space setting. Recall the essential infimum of a measurable
function $g: [t_0, t_1] \to \mathbb{R}$,
\[ \operatorname{ess\,inf}\{g(t) : t \in [t_0, t_1]\} = \sup\{x \in \mathbb{R} : \operatorname{Leb}(\{s \in [t_0, t_1] : g(s) < x\}) = 0\}, \]
where $\operatorname{Leb}(\cdot)$ denotes the Lebesgue measure on the real line. Given a profile $h \in \mathcal{H}$, define
\[ \nu_p(h) := \operatorname{ess\,inf}\{A_p(t,h) : t \in [t_0, t_1]\} \quad \forall p \in \mathcal{P}, \tag{2.2} \]
and
\[ \nu_w(h) := \min_{p \in \mathcal{P}_w} \nu_p(h) \quad \forall w \in \mathcal{W}. \tag{2.3} \]

On top of minimal costs, we have to restrict the set of departure rates to functions satisfying a basic
flow conservation property. Specifically, insisting that all trips are realized, we naturally define the
set of feasible flows as
\[ X := \Big\{ f \in \mathcal{H} : \sum_{p \in \mathcal{P}_w} \int_{t_0}^{t_1} f_p(t)\,dt = Q_w \quad \forall w \in \mathcal{W} \Big\}. \tag{2.4} \]

The set of feasible flows $X$ is sequentially closed and convex, but not sequentially compact (i.e.
path departure rates are not a priori assumed to be bounded, as the above definition involves
Lebesgue-integrable functions). We are now ready to give our first definition of user equilibrium.

Definition 2.1. A profile of departure rates $h^* \in \mathcal{H}$ is a DUE if
(a) $h^* \in X$, and
(b) $h^*_p(t) > 0,\ p \in \mathcal{P}_w \ \Rightarrow\ A_p(t, h^*) = \nu_w(h^*)$.
We denote by $\Omega \subset X$ the (possibly empty) set of DUE.
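As a rough illustration of condition (b), the following sketch evaluates a discretized equilibrium gap on a uniform time grid. It is not part of the original framework: the essential infimum is replaced by a plain minimum over grid points, and the function names, the dictionary layout and the gap measure itself are assumptions made for this example. The gap is nonnegative and vanishes exactly when every path and departure time carrying positive flow attains the minimal cost of its O-D pair.

```python
from typing import Dict, List


def min_costs(costs: Dict[str, List[float]],
              od_paths: Dict[str, List[str]]) -> Dict[str, float]:
    """Grid analogue of (2.2)-(2.3): minimal cost per O-D pair."""
    return {w: min(min(costs[p]) for p in paths) for w, paths in od_paths.items()}


def due_gap(flows: Dict[str, List[float]], costs: Dict[str, List[float]],
            od_paths: Dict[str, List[str]], dt: float) -> float:
    """Sum over paths of the integral of h_p(t) * (A_p(t, h) - nu_w)."""
    nu = min_costs(costs, od_paths)
    gap = 0.0
    for w, paths in od_paths.items():
        for p in paths:
            gap += dt * sum(h * (a - nu[w]) for h, a in zip(flows[p], costs[p]))
    return gap
```

A profile that places all departures at the cheapest grid point of the cheapest path yields a gap of zero; shifting the same flow to a costlier departure time makes the gap strictly positive.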

In [13] it is observed that the definition of DUE can be formulated equivalently as a variational
inequality VI$(A, X)$: a flow $h^* \in X$ is a DUE if
\[ \langle A(h^*), h - h^* \rangle \ge 0 \quad \forall h \in X. \tag{2.5} \]

This notion of equilibrium is very useful, since it allows us to apply a large variety of algorithms

to solve VI(A,X), and in fact it can be seen as the basis of most of the computational approaches to

DUE. We now spell out suﬃcient conditions guaranteeing existence of DUE.

Assumption 1.
• The penalty function $\phi: [t_0, t_1] \to \mathbb{R}_+$ is continuous and there exists $\Delta > -1$ such that
\[ \phi(a) - \phi(b) \ge \Delta (a - b) \quad \text{for all } t_0 \le a < b \le t_1. \tag{2.6} \]
• The DNL satisfies the FIFO principle and each link has finite capacity.
• The effective delay operator is weak-to-weak continuous on bounded subsets of $X$.

Theorem 2.2. Under Assumption 1 the DUE problem (2.5) has a solution, i.e. $\Omega \neq \emptyset$.

Proof. See [19].

The construction of the delay operator requires a speciﬁcation of a DNL (i.e. traﬃc ﬂow

generation). We focus in this work on a macroscopic model of network loading based on ﬂuid

dynamic approximations of traﬃc ﬂow on networks, known as the Lighthill-Whitham-Richards

(LWR) model [30,42]. The LWR model is able to describe the physics of kinematic waves (e.g.

shock waves, rarefaction waves), and allows network extensions that capture the formation and

propagation of vehicle queues as well as vehicle spill-back. We will formulate the LWR-based

DNL as a system of partial diﬀerential algebraic equations (PDAE), which uses vehicle density and

queues as the unknown variables, and computes link dynamics, ﬂow propagation, and path delay

for any given vector of path departure rates.

2.2 The diﬀerential variational inequality formulation

It has been observed in [14] that DUE can be equivalently formulated as a diﬀerential variational

inequality [37]. From an algorithmic point-of-view this relation is interesting as it allows us to

use time-stepping methods to compute approximate user equilibria [11,15]. Independent of algorithmic
considerations, we regard this identification as an important conceptual insight, which
deserves some remarks here. The precise connection between DVI and DUE goes as follows:

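Define the vector-valued function $x: [t_0, t_1] \to \mathbb{R}^{|\mathcal{W}|}$, $t \mapsto x(t) = \{x_w(t);\ w \in \mathcal{W}\}$, as the state
trajectory of a controlled dynamical system, with the interpretation that $x_w(t)$ is the cumulative
traffic up to time $t$ on paths connecting the origin-destination pair $w \in \mathcal{W}$. The definition of this
state variable requires that its dynamic evolution is described by the linear differential equation
\[ \frac{d}{dt} x_w(t) = \sum_{p \in \mathcal{P}_w} h_p(t) \quad \text{a.e. } t \in [t_0, t_1]. \tag{2.7} \]
Additionally, it must satisfy the natural initial and boundary-value conditions
\[ (x_w(t_0), x_w(t_1)) = (0, d_w) \quad \forall w \in \mathcal{W}, \tag{2.8} \]
where $d_w$ plays the role of the demand $Q_w$ in (2.4), since integrating (2.7) over $[t_0, t_1]$ gives
$x_w(t_1) - x_w(t_0) = \sum_{p \in \mathcal{P}_w} \int_{t_0}^{t_1} h_p(t)\,dt$.
The differential variational inequality describing DUE then reads as follows: find $h \in \mathcal{H}$ such that
(2.7), (2.8) and the instantaneous optimality condition
\[ h(t) \in \mathrm{SOL}\big(\mathbb{R}^{|\mathcal{P}|}_+, A(t, \cdot)\big) \quad \text{a.e. } t \in [t_0, t_1] \tag{2.9} \]
hold. Note that this defines a time-dependent complementarity system
\[ 0 \le h(t) \perp A(t, h(t)) \ge 0 \quad \text{a.e. } t \in [t_0, t_1], \]
which has been used in a DUE model with a simplified bottleneck structure in [38]. See [15] for a
formal proof of the correctness of this interpretation.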

3 Dynamic Network Loading

The purpose of this section is to explain the dynamic network loading model used in our numerical

investigation. We are considering the LWR model on networks, adopting the description in terms

of a system of Diﬀerential Algebraic Equations (DAE). This formulation of the DNL procedure

has the advantage over its mathematically equivalent description in terms of a system of partial

diﬀerential algebraic equations that it avoids the use of partial diﬀerential operators, and thus is

much more amenable to numerical discretization strategies.

3.1 The Lighthill-Whitham-Richards link model

Network loading acts on the same oriented graph $G = (V, E)$ as in Section 2, where each link $I_i \in E$ has
a certain length, represented by the interval $[a_i, b_i]$. The within-link dynamics are captured by the
scalar conservation law
\[ \partial_t \rho_i(t,x) + \partial_x \big[ \rho_i(t,x)\, v_i(\rho_i(t,x)) \big] = 0, \quad (t,x) \in [t_0, t_1] \times [a_i, b_i]. \tag{3.1} \]
The fundamental diagram $f_i(\rho) = \rho \cdot v_i(\rho)$ is assumed to be continuous and concave, and to vanish
at $\rho \in \{0, \rho_i^{\mathrm{jam}}\}$, where $\rho_i^{\mathrm{jam}}$ is the jam density on link $I_i$. Moreover, there exists a unique global
maximum of $f_i$ at the value $\rho_i^c$. We focus on the triangular fundamental diagram
\[ f_i(\rho) = \begin{cases} v_i \rho & \text{if } \rho \in [0, \rho_i^c], \\ -w_i (\rho - \rho_i^{\mathrm{jam}}) & \text{if } \rho \in (\rho_i^c, \rho_i^{\mathrm{jam}}], \end{cases} \tag{3.2} \]
where $v_i, w_i > 0$ denote the forward and backward kinematic wave speeds, respectively.
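The triangular fundamental diagram (3.2) can be sketched in a few lines. The function name and the explicit formula for the critical density are assumptions of this example: the critical density is computed from the continuity requirement $v\rho^c = w(\rho^{\mathrm{jam}} - \rho^c)$, which is implied by, but not spelled out in, the text.

```python
def triangular_flow(rho: float, v: float, w: float, rho_jam: float) -> float:
    """Flow f(rho) of the triangular fundamental diagram (3.2)."""
    if not 0.0 <= rho <= rho_jam:
        raise ValueError("density must lie in [0, rho_jam]")
    # Critical density where the free-flow and congested branches meet.
    rho_c = w * rho_jam / (v + w)
    if rho <= rho_c:
        return v * rho               # free-flow branch, slope v
    return -w * (rho - rho_jam)      # congested branch, slope -w
```

With `v = 2`, `w = 1` and `rho_jam = 3`, the critical density is `1` and the maximal flow (the link capacity) equals `2`.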

At junctions we need to make sure that relevant boundary conditions are satisfied to respect
basic physical principles. Consider a junction with $m$ incoming and $n$ outgoing links. At each such
junction, the following conservation property must hold:
\[ \sum_{i=1}^{m} f_i(\rho_i(t, b_i)) = \sum_{j=1}^{n} f_j(\rho_j(t, a_j)) \quad \forall t \in [t_0, t_1]. \tag{3.3} \]
This condition simply means that inflow into the junction equals outflow. However, this condition
alone does not guarantee a unique flow profile at these $m + n$ links. Additional conditions, usually
formulated in terms of Riemann solvers and demand/supply conditions, must be imposed. We
refer to [5,16] for reviews.

3.2 The variational representation of link dynamics

While (3.1) captures within-link dynamics, the inter-link propagation of congestion requires a

careful treatment of junction dynamics. The overall system of PDEs leads to a complex system

of junction dynamics and conservation laws which is very hard to handle computationally. We

follow a diﬀerent approach here, which is more amenable to numerical computations. We brieﬂy

introduce a variational representation of the link dynamics, based on the generalized Lax-Hopf

formula, originally developed in [2,6,7], which leads to a DNL procedure in terms of a system of

differential algebraic equations (DAE). Compared to the PDE-based approach described in Section
3.1, the DAE-based formulation has the following main advantages: (1) the primary variable is flow
instead of density; (2) no partial differential operators are involved; (3) it introduces simplified
boundary conditions. We only give a high-level description of this approach, detailed enough so

that the reader is able to understand the mechanics of the numerical solver. A rigorous description

can be found in [16].

Consider the Moskowitz function $N_i(t,x)$, which measures the cumulative number of vehicles
that have passed location $x$ along link $I_i$ by time $t$. The following identities hold:
\[ \partial_t N_i(t,x) = f_i(\rho_i(t,x)), \qquad \partial_x N_i(t,x) = -\rho_i(t,x). \tag{3.4} \]
It follows immediately that $N_i(t,x)$ satisfies the Hamilton-Jacobi equation
\[ \partial_t N_i(t,x) - f_i(-\partial_x N_i(t,x)) = 0, \quad x \in [a_i, b_i],\ t \in [t_0, t_1]. \tag{3.5} \]
Denote by $f_i^{\mathrm{in}}(t)$ and $f_i^{\mathrm{out}}(t)$ the inflow and outflow of link $I_i$. The cumulative link entering and exiting
vehicle counts are defined by
\[ \frac{d}{dt} N_i^{\mathrm{up}}(t) = f_i^{\mathrm{in}}(t), \qquad \frac{d}{dt} N_i^{\mathrm{down}}(t) = f_i^{\mathrm{out}}(t), \]

where "up" and "down" represent the upstream and downstream boundaries of the link, respectively.
[23] derive explicit formulae for the link demand and supply based on a variational formulation
known as the Lax-Hopf formula [2,6,7], as follows:
\[ D_i(t) = \begin{cases} f_i^{\mathrm{in}}(t - L_i/v_i) & \text{if } N_i^{\mathrm{up}}(t - L_i/v_i) = N_i^{\mathrm{down}}(t), \\ C_i & \text{if } N_i^{\mathrm{up}}(t - L_i/v_i) > N_i^{\mathrm{down}}(t), \end{cases} \]
and
\[ S_i(t) = \begin{cases} f_i^{\mathrm{out}}(t - L_i/w_i) & \text{if } N_i^{\mathrm{up}}(t) = N_i^{\mathrm{down}}(t - L_i/w_i) + \rho_i^{\mathrm{jam}} L_i, \\ C_i & \text{if } N_i^{\mathrm{up}}(t) < N_i^{\mathrm{down}}(t - L_i/w_i) + \rho_i^{\mathrm{jam}} L_i, \end{cases} \]
where $L_i = b_i - a_i$ is the length of the link $I_i$, $C_i$ denotes its flow capacity, $v_i = f_i'(0+)$ and
$w_i = -f_i'(\rho_i^{\mathrm{jam}}-)$. These two relations
express the link demand and supply, which are inputs of the junction model, in terms of $N^{\mathrm{up}}$ and
$N^{\mathrm{down}}$. This means that one no longer has to compute the dynamics within the link, but can focus
instead on the cumulative counts at the two boundaries of the link. Note that, when discretizing
the DNL in time, we immediately obtain the link transmission model [54]. In general, the approach
just described gives rise to the link-based formulation of DNL [23].
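The demand and supply relations above translate into a short sketch that only queries the cumulative counts at the two link boundaries. All function and argument names are hypothetical, the cumulative curves and boundary flows are passed as plain callables, and the equality tests of the case distinction are replaced by a small tolerance `tol`, which is an assumption made for floating-point robustness rather than part of the model.

```python
from typing import Callable

Curve = Callable[[float], float]


def link_demand(t: float, n_up: Curve, n_down: Curve, f_in: Curve,
                length: float, v: float, capacity: float,
                tol: float = 1e-9) -> float:
    """Demand D_i(t): capacity if a queue is present, else delayed inflow."""
    t_shift = t - length / v  # entry time of a vehicle arriving now at free flow
    if n_up(t_shift) - n_down(t) > tol:
        return capacity
    return f_in(t_shift)


def link_supply(t: float, n_up: Curve, n_down: Curve, f_out: Curve,
                length: float, w: float, rho_jam: float, capacity: float,
                tol: float = 1e-9) -> float:
    """Supply S_i(t): capacity unless the link is filled up to its entrance."""
    t_shift = t - length / w  # time for a backward wave to traverse the link
    if n_down(t_shift) + rho_jam * length - n_up(t) > tol:
        return capacity
    return f_out(t_shift)
```

In free flow, where the downstream count is the upstream count delayed by `length / v`, the demand simply reproduces the delayed inflow; as soon as vehicles accumulate inside the link, the demand jumps to the capacity.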

Junction Dynamics. In a path-based DNL procedure one must incorporate established routing
information into the junction model. Such information is usually formulated by some behavioral
assumption on drivers' preferences. In the numerical scheme we consider, such information is
provided in terms of an endogenously given flow distribution matrix $W(t) = [w_{ij}(t)]$, where $w_{ij}(t)$ is
the proportion of flow arriving from link $i$ and continuing along link $j$ at a given junction.
Abstractly, if $\Theta$ represents some junction model, we have the functional relationship
\[ \big( f^{\mathrm{out}}(t), f^{\mathrm{in}}(t) \big) = \Theta(D(t), S(t), W(t)), \]
where $f^{\mathrm{out}}(t) = (f_i^{\mathrm{out}}(t))_{i=1,\ldots,m}$ and $f^{\mathrm{in}}(t) = (f_j^{\mathrm{in}}(t))_{j=1,\ldots,n}$ are the computed outflows of the
incoming links and inflows of the outgoing links, respectively.

Dynamics at the origin nodes. At the origin nodes, we employ a simple point-queue model, in
the spirit of Vickrey [49]. Let $o$ be a given origin node, and denote by $q_o(t)$ the volume of the point
queue. Let link $j$ be connected to the origin. We assume that
\[ \frac{d}{dt} q_o(t) = \sum_{p \in \mathcal{P}_o} h_p(t) - \min\{D_o(t), S_j(t)\}, \]
where $\mathcal{P}_o$ denotes the set of paths originating from $o$. The first term on the right represents the
inflow into the queue, while the second term represents flow leaving the queue, modeling the
demand at the origin as
\[ D_o(t) = \begin{cases} M & \text{if } q_o(t) > 0, \\ \sum_{p \in \mathcal{P}_o} h_p(t) & \text{else,} \end{cases} \]
taking $M$ to be a sufficiently large number, bigger than the flow capacity at link $j$.
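One way to discretize the point-queue dynamics at an origin is a forward Euler step, sketched below. The function name, the value of `big_m` and the clipping of the queue at zero are assumptions of this example; the clipping is a discretization safeguard and not part of the continuous-time model.

```python
from typing import Tuple


def origin_queue_step(queue: float, departures: float, supply: float,
                      dt: float, big_m: float = 1e6) -> Tuple[float, float]:
    """Advance the origin queue by one time step of length dt.

    departures: total departure rate sum_p h_p(t) at the origin
    supply:     supply S_j(t) of the link connected to the origin
    Returns the new queue volume and the flow released into the link.
    """
    demand = big_m if queue > 0.0 else departures  # origin demand D_o(t)
    outflow = min(demand, supply)
    new_queue = max(queue + dt * (departures - outflow), 0.0)
    return new_queue, outflow
```

Starting from an empty queue with departure rate `3` and link supply `2`, one unit of traffic is stored per unit of time; once departures stop, the queue discharges at the supply rate.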

Algorithm 1: FB for VI$(A, X)$.
Input: Effective delay operator $A: \mathcal{H} \to \mathcal{H}$, step sizes $\{\tau_n\}_{n \in \mathbb{N}}$, initial point $h_0 \in X$, stopping time $N \ge 1$.
for $n = 0, 1, \ldots, N$ do
    Evaluate $A(h_n)$ by running a DNL procedure;
    if stopping condition not satisfied then
        Update
            $h_{n+1} = P_X(h_n - \tau_n A(h_n))$.    (4.1)
    end if
end for

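Calculating path travel times. The DNL procedure calculates the path travel times for given
path departure rates. The path travel time is defined as the sum of the link travel times plus possible queuing at
the origin. We define the link exit time function $\lambda(t)$ implicitly by
\[ N^{\mathrm{up}}(t) = N^{\mathrm{down}}(\lambda(t)). \tag{3.6} \]
For a path enumerated as $p = \{1, 2, \ldots, K\}$ and starting at the origin $o$, the arrival time at the
destination is obtained by applying the exit time functions in the order in which the path is
traversed, so that the path travel time $D_p(t,h)$ is calculated as
\[ D_p(t,h) = (\lambda_K \circ \cdots \circ \lambda_1 \circ \lambda_o)(t) - t, \]
where $(f \circ g)(t) = f(g(t))$ denotes the composition of two functions, $\lambda_k$ is the exit time function
of link $k$, and $\lambda_o$ is the exit time function for the potential queuing at the origin $o$.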
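The exit time relation (3.6) and the chaining of exit times along a path can be sketched on a time grid as follows. The helper names, the piecewise-linear interpolation of the cumulative curves and the convention of returning the last grid point when a vehicle does not leave within the horizon are all assumptions of this example. The sketch chains the exit time functions in the order in which the path is traversed and subtracts the departure time to obtain a travel time.

```python
from bisect import bisect_left
from typing import Callable, Sequence


def interp(x: float, xs: Sequence[float], ys: Sequence[float]) -> float:
    """Piecewise-linear interpolation on an increasing grid, clamped at the ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    k = bisect_left(xs, x)
    frac = (x - xs[k - 1]) / (xs[k] - xs[k - 1])
    return ys[k - 1] + frac * (ys[k] - ys[k - 1])


def exit_time(t: float, grid: Sequence[float], n_up: Sequence[float],
              n_down: Sequence[float]) -> float:
    """Solve N_up(t) = N_down(lam) for lam, cf. (3.6); n_down is nondecreasing."""
    target = interp(t, grid, n_up)
    for k in range(len(grid)):
        if n_down[k] >= target:
            if k == 0:
                return max(grid[0], t)
            lo, hi = n_down[k - 1], n_down[k]   # lo < target <= hi
            frac = (target - lo) / (hi - lo)
            return max(grid[k - 1] + frac * (grid[k] - grid[k - 1]), t)
    return grid[-1]  # vehicle does not exit within the horizon


def path_travel_time(t: float, exit_functions: Sequence[Callable[[float], float]]) -> float:
    """Chain exit times (origin queue first, then links 1..K) and subtract t."""
    arrival = t
    for lam in exit_functions:
        arrival = lam(arrival)
    return arrival - t
```

For two links in series with free-flow traversal times of two and one time units, a driver departing at `t = 1` leaves the first link at time `3` and the second at time `4`, which gives a path travel time of `3`.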

4 Strongly convergent ﬁxed-point algorithms

4.1 Fixed Point formulation of DUE

Once a DNL procedure has been ﬁxed, the eﬀective delay operator A(h) can be evaluated. The

deﬁnition of DUE allows us to construct a suitable ﬁxed-point problem which is the basis for the

design of iterative numerical schemes for computing DUE. In fact, it is easy to see that $h^* \in \mathcal{H}$ is a
path departure rate profile corresponding to a DUE if and only if the residual
\[ r_\tau(h) = h - P_X(h - \tau A(h)) \]
vanishes at $h^*$, i.e. $r_\tau(h^*) = 0$ for $\tau > 0$. Here, $P_X$ denotes the orthogonal projection in $L^2$ onto the set
$X \subset \mathcal{H}$. A classical iterative scheme for finding the roots of a nonlinear function is the Picard
fixed-point iteration, which here aims to locate a fixed point of the map $h \mapsto P_X(h - \tau A(h))$. Under strong a-priori

continuity and monotonicity assumptions on the eﬀective delay operator A, the projected gradient

(a.k.a. forward-backward) method, Algorithm 1, generates a sequence {hn}n∈Nwhich will weakly

converge to some DUE. This iterative solver is used in the software package developed in [24], and

has also been employed in many studies before. Weak convergence (see Definition 6.1 in Section
6.1) of the sequence $\{h_n\}_{n \in \mathbb{N}}$ thus constructed is known when the operator $A$ is inverse strongly
monotone (co-coercive) with modulus $\mu > 0$, i.e.
\[ \langle A(x) - A(y), x - y \rangle \ge \mu \|A(x) - A(y)\|^2 \quad \forall x, y \in \mathcal{H}, \]
provided that the step sizes satisfy $\tau \in (0, 2\mu)$. Note that co-coercivity implies Lipschitz continuity
with Lipschitz constant $1/\mu$, by the Cauchy-Schwarz inequality. Thus, to make method (4.1) a provably
convergent algorithm, we need to know this constant in order to pin down an upper bound on the step sizes.
Strong convergence of $\{h_n\}_{n \in \mathbb{N}}$ requires an even stronger uniform monotonicity assumption on the
operator $A$ over the set $X$ (Theorem 25.8 in [3]; see footnote 2), or other modifications of the basic
template (4.1). [15] present a
strongly convergent variant of (4.1) using a Halpern-type modification of the basic scheme above.

Both assumptions, Lipschitz continuity and uniform monotonicity, are very restrictive in the context

of computing DUE. While continuity of the eﬀective delay operator has been established in the

context of the LWR network loading procedure [22], monotonicity estimates are hardly available

for realistic DNL procedures and not very likely to hold in practice. Therefore, strongly convergent
algorithms which provably converge to a solution under mild monotonicity assumptions are
highly desirable for modeling, optimization and simulation of traffic networks.
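To make the residual and the projected fixed-point iteration concrete, the following finite-dimensional sketch treats a single O-D pair on a uniform time grid. Several simplifications are assumptions of this example and not part of the original solver: departure rates are taken to be nonnegative, the demand constraint is written as `sum(x) = mass` (with `mass` standing for the demand divided by the grid step), a constant step size is used, and the operator `A` is an arbitrary callable standing in for the DNL-based effective delay. The projection uses the standard sort-based algorithm for a scaled simplex.

```python
from typing import Callable, List

Vector = List[float]


def project_mass(v: Vector, mass: float) -> Vector:
    """Euclidean projection onto {x >= 0, sum(x) = mass} via sorting."""
    u = sorted(v, reverse=True)
    running, theta = 0.0, 0.0
    for k, u_k in enumerate(u, start=1):
        running += u_k
        candidate = (running - mass) / k
        if u_k - candidate > 0.0:
            theta = candidate  # threshold from the largest admissible k
    return [max(x - theta, 0.0) for x in v]


def residual(h: Vector, A: Callable[[Vector], Vector], tau: float,
             project: Callable[[Vector], Vector]) -> Vector:
    """r_tau(h) = h - P_X(h - tau * A(h)); zero exactly at a solution."""
    y = project([x - tau * a for x, a in zip(h, A(h))])
    return [x - y_k for x, y_k in zip(h, y)]


def forward_backward(h: Vector, A: Callable[[Vector], Vector], tau: float,
                     project: Callable[[Vector], Vector], n_iter: int) -> Vector:
    """Iterate h <- P_X(h - tau * A(h)) as in (4.1), with a constant step."""
    for _ in range(n_iter):
        h = project([x - tau * a for x, a in zip(h, A(h))])
    return h
```

For the strongly monotone toy operator `A(h) = h - c` with `c` in the feasible set, the iteration contracts towards `c` and the residual vanishes there. For a DNL-based delay operator no such behavior can be taken for granted, which is the motivation for the schemes discussed next.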

4.2 Computing DUE under weak assumptions

Our aim is to design and study alternative numerical schemes for computing DUE, which require

signiﬁcantly less stringent a-priori assumptions on the delay operator, but still come with rigorous

convergence guarantees. We summarize our working assumptions below, while Section 6.1 gathers

precise mathematical deﬁnitions for the readers’ convenience.

Assumption 2. The delay operator $A: \mathcal{H} \to \mathcal{H}$ is sequentially weakly continuous and $L$-Lipschitz
continuous on $X$. However, we do not need to know $L$.

To cope with the unavailable information about the Lipschitz constant, we construct adaptive

algorithms, and thus do not need information of this hardly available global parameter. Instead a

simple and eﬃcient update procedure of the step size parameters is proposed which depends on

pointwise variations of the delay operator rather than global variations. This is a major advantage

of the methods we propose here, both from a conceptual and practical point of view, as it allows

us to decide step-sizes "online".

The next assumption is concerned with the structural properties we impose on the delay
operator.

Assumption 3. The delay operator $A$ is pseudo-monotone on $\mathcal{H}$: for all $h_1, h_2 \in \mathcal{H}$, we have
\[ \langle A(h_1), h_2 - h_1 \rangle \ge 0 \ \Rightarrow\ \langle A(h_2), h_2 - h_1 \rangle \ge 0. \tag{4.2} \]

Pseudo-monotonicity is a significant weakening of the (strict) monotonicity required when
applying the fixed point iteration scheme (4.1). Some intuition for this concept can be given by
considering the simpler case when the operator is integrable. Any smooth real-valued function
$f: \mathcal{H} \to \mathbb{R}$ induces an operator $A: \mathcal{H} \to \mathcal{H}$ via its gradient $A(h) = \nabla f(h)$ (unique thanks to the
Riesz representation theorem). Note that $f$ is (strictly) convex if and only if the gradient map is a
(strictly) monotone operator. If $f$ is merely pseudo-convex, the gradient operator is pseudo-monotone,
and vice versa. Assumptions 1-3 are the standing hypotheses for the rest of this paper. Building on
them, we now describe the numerical schemes we analyze.
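A one-dimensional toy example, which is our own illustration and not taken from the DUE setting, shows that pseudo-monotonicity in the sense of (4.2) is strictly weaker than monotonicity. Take the gradient of the non-convex function $f(h) = h + h^3$ on the real line:

```latex
\[
  A(h) = f'(h) = 1 + 3h^2 > 0 \quad \text{for all } h \in \mathbb{R}.
\]
% Pseudo-monotone: since A(h_1) > 0, the premise of (4.2) forces h_2 >= h_1.
\[
  A(h_1)(h_2 - h_1) \ge 0 \;\Rightarrow\; h_2 - h_1 \ge 0
  \;\Rightarrow\; A(h_2)(h_2 - h_1) \ge 0 .
\]
% Not monotone: test the monotonicity inequality at h = -1 and h' = 0.
\[
  \big(A(-1) - A(0)\big)\,(-1 - 0) = (4 - 1)\cdot(-1) = -3 < 0 .
\]
```

The operator therefore satisfies Assumption 3 although it violates the monotonicity inequality.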

Our basic algorithmic design principle follows the forward-backward-forward (FBF) splitting

scheme, originally due to [48]. In its original form, it ensures that path ﬂows will weakly converge

2An operator A:H→His called uniformly monotone if there exists an increasing function ω: (0,∞)→[0,∞),

vanishing at zero, such that

hA(h)−A(h0),h−h0i ≥ ω(kh−h0k)∀h,h0∈dom A.

11

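Algorithm 2: FBF for VI(A,X).

Input: Effective delay operator $A: H \to H$, step sizes $\{\tau_n\}_{n\in\mathbb{N}}$, sequences $\{\alpha_n\}_{n\in\mathbb{N}}, \{\beta_n\}_{n\in\mathbb{N}}$, initial point $h_0 \in X$.

for $n = 0, 1, \dots, N$ do

    Obtain $h_n$ by running a DNL procedure;

    if stopping condition not satisfied then

        Compute

        \[ y_n = P_X(h_n - \tau_n A(h_n)), \]

        \[ z_{n+1} = y_n + \tau_n (A(h_n) - A(y_n)), \]

        \[ h_{n+1} = (1 - \alpha_n - \beta_n) h_n + \beta_n z_{n+1}. \]

        Update the step size sequence

        \[ \tau_{n+1} = \begin{cases} \min\left\{ \tau_n, \dfrac{\mu \|y_n - h_n\|}{\|A(y_n) - A(h_n)\|} \right\} & \text{if } A(y_n) - A(h_n) \neq 0, \\ \tau_n & \text{else.} \end{cases} \tag{4.3} \]

    end if

end for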

to a DUE, provided that the delay operator is monotone and Lipschitz continuous in the $L^2$ norm. In the special case of variational inequalities, it has been shown that pseudo-monotonicity suffices for weak convergence [4]. In fact, weak convergence holds for a large class of non-monotone VIs satisfying an angle property at the solution set [9], and this is the main reason why FBF is an attractive numerical solution scheme for DUE. FBF updates a current path departure rate profile $h$ by first applying (4.1), in order to produce the extrapolated search point $y = P_X(h - \tau A(h))$ (first forward-backward step). It then performs another forward step in path space, by calling the DNL procedure at the just constructed extrapolation point $y$, and shifts density into directions where the difference in the "travel costs" between the current path flow $h$ and the new search point $y$ is large. Algebraically, this leads to the correction step $h^+ = y + \tau(A(h) - A(y))$. We would like to emphasize that this correction step does not involve an additional projection onto the feasible set $X$. This reduces the computational complexity of FBF relative to its close cousin, the extragradient algorithm due to [29], and speeds up the computations in practice whenever projections are expensive to implement (see [4] for extensive numerical evidence supporting this claim).

In order to force strong convergence of the sequence of path departure rates $\{h_n\}_{n\in\mathbb{N}}$, we augment the scheme by a Halpern-type relaxation procedure. The pseudo-code of the resulting DUE solver is displayed in Algorithm 2.

Algorithm 2 has been analyzed in detail in [10]. In particular, we demonstrated strong convergence of the numerical scheme by proving Theorem 4.1 below.

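Theorem 4.1 (Theorem 2, [10]). Suppose that Assumptions 1-3 are satisfied. Let $\{\alpha_n\}_{n\in\mathbb{N}}$ and $\{\beta_n\}_{n\in\mathbb{N}}$ be two real sequences in $(0,1)$, satisfying the conditions

\[ \{\beta_n\}_{n\in\mathbb{N}} \subseteq (b, 1 - \alpha_n) \quad \text{for some } b > 0, \tag{4.4} \]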


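Algorithm 3: IFBF for VI(A,X).

Input: Effective delay operator $A: H \to H$, step size $\tau_0 > 0$ and constants $\lambda, \mu \in (0,1)$. Sequences $\{\epsilon_n\}_{n\in\mathbb{N}}, \{\beta_n\}_{n\in\mathbb{N}}$. Initial points $h_{-1}, h_0 \in X$.

for $n = 0, 1, \dots, N$ do

    Obtain $h_n$ via a DNL procedure

    if stopping condition not satisfied then

        Compute

        \[ w_n = (1 - \beta_n)[h_n + \alpha_n (h_n - h_{n-1})], \]

        \[ y_n = P_X(w_n - \tau_n A(w_n)), \]

        \[ h_{n+1} = (1 - \lambda) w_n + \lambda (y_n + \tau_n (A(w_n) - A(y_n))). \]

        Update the step size

        \[ \tau_{n+1} = \begin{cases} \tau_n & \text{if } A(w_n) = A(y_n), \\ \min\left\{ \tau_n, \dfrac{\mu \|w_n - y_n\|}{\|A(w_n) - A(y_n)\|} \right\} & \text{else.} \end{cases} \tag{4.6} \]

        Update the inertia parameter

        \[ 0 \le \alpha_{n+1} \le \begin{cases} \min\left\{ \alpha, \dfrac{\epsilon_{n+1}}{\|h_{n+1} - h_n\|} \right\} & \text{if } h_{n+1} \neq h_n, \\ \alpha & \text{otherwise.} \end{cases} \tag{4.7} \]

    end if

end for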

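and

\[ \lim_{n\to\infty} \alpha_n = 0 \quad \text{and} \quad \sum_{n=1}^{\infty} \alpha_n = \infty. \tag{4.5} \]

Then the sequence $\{h_n\}$ generated by Algorithm 2 converges strongly to $h^* = \arg\min\{\|z\| : z \in \Omega\}$.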
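The updates of Algorithm 2 can be sketched in finite dimensions as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the DNL procedure is replaced by a hypothetical affine map A(h) = Mh + q on the box X = [0,1]^2 with a known solution x_star, and the sequences alpha_n = 1/(n+2), beta_n = (1 - alpha_n)/2 are one illustrative choice compatible with (4.4) and (4.5).

```python
import numpy as np

def fbf_halpern(A, project, h0, tau0=0.5, mu=0.5, n_iter=2000):
    """Sketch of FBF with Halpern-type relaxation and adaptive step size (4.3)."""
    h = np.asarray(h0, dtype=float)
    tau = tau0
    for n in range(n_iter):
        alpha = 1.0 / (n + 2)          # alpha_n -> 0, sum alpha_n = infinity
        beta = 0.5 * (1.0 - alpha)     # beta_n in (b, 1 - alpha_n) with b = 0.2
        Ah = A(h)
        y = project(h - tau * Ah)      # forward-backward step
        Ay = A(y)
        z = y + tau * (Ah - Ay)        # second forward step, no projection
        h_next = (1.0 - alpha - beta) * h + beta * z
        diff = np.linalg.norm(Ay - Ah)
        if diff > 0:                   # step-size rule (4.3)
            tau = min(tau, mu * np.linalg.norm(y - h) / diff)
        h = h_next
    return h, tau

# Hypothetical toy problem: strongly monotone affine operator on a box.
M = np.array([[1.0, 1.0], [-1.0, 1.0]])
x_star = np.array([0.5, 0.25])         # interior point with A(x_star) = 0
q = -M @ x_star
A = lambda h: M @ h + q
project = lambda v: np.clip(v, 0.0, 1.0)

h_final, tau_final = fbf_halpern(A, project, h0=[1.0, 0.0])
print(h_final, tau_final)
```

Since the toy problem has a unique solution, the minimum norm solution coincides with x_star. The relaxation weights decay slowly, so the final iterate is only accurate up to a term of order alpha_n.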

Departing from here, our aim in this paper is to significantly extend our previous work by designing a new FBF-based inertial algorithm, which meets all the desiderata spelled out in the introduction: i) adaptive step-sizes, ii) weak monotonicity, and iii) strong convergence.

To achieve a possible convergence acceleration and to meet conditions i)-iii), we include relaxation and inertial effects into our algorithm. To the best of our knowledge, this is the first available, provably convergent, relaxed-inertial splitting algorithm for computing DUE.

The basic idea behind inertial algorithms is to use information accumulated from past iterations in order to introduce momentum. This is achieved by computing the extrapolated point $z = h + \alpha(h - h')$ in the first step of each iteration, where $h'$ denotes the previous iterate. The introduction of momentum is classical, and can be traced back to the heavy-ball method of Polyak [40]. We adapt momentum by injecting relaxation steps in a disciplined way to force the trajectory to converge strongly to a DUE. The so-constructed new strongly convergent method, to be called the inertial forward-backward-forward (IFBF) algorithm, is displayed in Algorithm 3.

In the convergence analysis of Algorithm 3 it turns out that any positive sequences $\{\epsilon_n\}_{n\in\mathbb{N}}, \{\beta_n\}_{n\in\mathbb{N}} \subset (0,1)$, satisfying

\[ \lim_{n\to\infty} \beta_n = 0, \quad \sum_{n=1}^{\infty} \beta_n = \infty \quad \text{and} \quad \lim_{n\to\infty} \frac{\epsilon_n}{\beta_n} = 0 \tag{4.8} \]

are admissible for strong convergence of the sequence of path flows $\{h_n\}_{n\in\mathbb{N}}$ generated by this algorithm. The sequence $\{\tau_n\}_{n\in\mathbb{N}}$ has the same role as the step size $\lambda$ in the basic fixed point iteration (4.1). Hence, we have to choose it small enough to ensure convergence (theoretically smaller than the reciprocal of the Lipschitz constant of the delay operator). Nevertheless, we can realize IFBF without any a-priori knowledge of the Lipschitz constant by implementing the adaptive step-size policy (4.6). As we will see in Lemma 6.7, the step size sequence $\{\tau_n\}_{n\in\mathbb{N}}$ has a limit and

\[ \lim_{n\to\infty} \tau_n = \tau \ge \bar{\tau} := \min\left\{ \frac{\mu}{L}, \tau_0 \right\}. \]
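As a worked example of admissible sequences, write $\epsilon_n$ for the inertial sequence and $\beta_n$ for the relaxation sequence in (4.8). The following polynomial choice is only an illustration and is not one of the parameter settings used in the experiments.

```latex
\[
\beta_n = \frac{1}{n+1}, \qquad \epsilon_n = \frac{1}{(n+1)^2}, \qquad n \ge 1 .
\]
% Both sequences lie in (0,1) for n >= 1, and
\[
\lim_{n\to\infty} \beta_n = 0, \qquad
\sum_{n=1}^{\infty} \beta_n = \sum_{n=1}^{\infty} \frac{1}{n+1} = \infty, \qquad
\frac{\epsilon_n}{\beta_n} = \frac{1}{n+1} \longrightarrow 0 .
\]
```

The divergence of the sum is the divergence of the harmonic series, so all three requirements in (4.8) hold.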

The parameter $\alpha$ can be any constant in $(0,1)$. The main theoretical result of this paper reads as follows.

Theorem 4.2. Let Assumptions 1-3 be in place. Then the sequence $\{h_n\}_{n\ge 0}$ generated by Algorithm 3 converges strongly to the minimum norm solution $h^* = \arg\min\{\|z\| : z \in \Omega\}$.
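A finite-dimensional sketch of the IFBF updates may help to see how inertia, relaxation and the adaptive step size interact. It is an illustration under stated assumptions only: the DNL procedure is replaced by a hypothetical affine operator on the box [0,1]^2, the sequences beta_n = 1/(n+2) and eps_n = 1/(n+2)^2 are one illustrative choice satisfying (4.8), and the inertia parameter is set to the upper bound allowed by (4.7).

```python
import numpy as np

def ifbf(A, project, h0, tau0=0.5, mu=0.5, lam=0.5, alpha_bar=0.7, n_iter=2000):
    """Sketch of the inertial FBF iteration with rules (4.6) and (4.7)."""
    h_prev = np.asarray(h0, dtype=float)
    h = h_prev.copy()                  # h_{-1} = h_0, so no inertia at n = 0
    tau, alpha = tau0, alpha_bar
    for n in range(n_iter):
        beta = 1.0 / (n + 2)
        w = (1.0 - beta) * (h + alpha * (h - h_prev))   # inertia + relaxation
        Aw = A(w)
        y = project(w - tau * Aw)
        Ay = A(y)
        h_next = (1.0 - lam) * w + lam * (y + tau * (Aw - Ay))
        diff = np.linalg.norm(Aw - Ay)
        if diff > 0:                   # step-size rule (4.6)
            tau = min(tau, mu * np.linalg.norm(w - y) / diff)
        eps_next = 1.0 / (n + 3) ** 2  # eps_{n+1}
        step = np.linalg.norm(h_next - h)
        # Inertia rule (4.7), taking the largest admissible value.
        alpha = min(alpha_bar, eps_next / step) if step > 0 else alpha_bar
        h_prev, h = h, h_next
    return h, tau

# Hypothetical toy problem with a unique interior solution x_star.
M = np.array([[1.0, 1.0], [-1.0, 1.0]])
x_star = np.array([0.5, 0.25])
q = -M @ x_star
A = lambda h: M @ h + q
project = lambda v: np.clip(v, 0.0, 1.0)

h_final, tau_final = ifbf(A, project, h0=[1.0, 0.0])
print(h_final, tau_final)
```

No Lipschitz constant enters the call: the step size starts at tau0 and is only ever reduced by the local ratio in (4.6).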

5 Numerical Experiments

We present preliminary computational examples of the simultaneous route-and-departure-time dynamic user equilibrium on the Nguyen network [36] and the Sioux Falls network. Detailed network parameters, including coordinates of nodes and link attributes, are sourced and adapted from [24]. Given that our DUE and DNL formulations are path-based, enumeration of paths was applied to generate the path set using the Frank-Wolfe algorithm.

We apply Algorithm 2 and Algorithm 3 with the embedded DNL procedure based on a time-stepping scheme discretizing the PDAE formulation described in Section 3. We compare our method with the projected gradient algorithm (4.1), as implemented in the MATLAB toolbox documented in [24].3 As remarked in [24], projected gradient requires strong monotonicity to ensure norm convergence, whereas all our methods are provably strongly convergent by means of Theorem 4.1 and Theorem 4.2. All computations reported in this section were performed using MATLAB (R2018a) on a Lenovo x64 laptop with an Intel Core i5 processor with 1.6 GHz and 8 GB of RAM.

                      Nguyen Network    Sioux Falls
No. of links          19                76
No. of nodes          13                24
No. of O-D pairs      4                 528
No. of paths          24                6,180

Table 2: Key attributes of the test networks

3 The MATLAB code was retrieved from https://github.com/DrKeHan/DTA.


Figure 1: The Nguyen and Sioux Falls network.

5.1 Performance of the Algorithms

We run all three methods for a fixed number of iterations and report the last iterate of the algorithm ($h^{\mathrm{Final}}$), the corresponding effective delay operator ($A(h^{\mathrm{Final}})$), a numerical merit function (GAP), as well as a measure for the speed of convergence. The construction of our numerical merit function follows [24]. It is designed to measure the distance to equilibrium via the following version of a gap function

\[ \mathrm{GAP}_w = \max\{A_p(t, h^{\mathrm{Final}}) : t \in [t_0, t_1],\ p \in P_w \text{ s.t. } h^{\mathrm{Final}}_p(t) > 0\} - \min\{A_p(t, h^{\mathrm{Final}}) : t \in [t_0, t_1],\ p \in P_w \text{ s.t. } h^{\mathrm{Final}}_p(t) > 0\}, \tag{5.1} \]

for all $w \in W$. Hence, $\mathrm{GAP}_w$ represents the range of travel costs experienced by all drivers in the given O-D pair $w \in W$. In fact, it is clear that $\mathrm{GAP}_w \ge 0$, and in an exact DUE, the gap should be zero for all O-D pairs, justifying the interpretation of GAP as a numerical merit function.
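On a discrete time grid, the gap (5.1) for a single O-D pair reduces to a maximum minus a minimum over the used path-time cells. The sketch below assumes a hypothetical data layout (one row per path, one column per time step) and made-up numbers; returning 0 when no cell carries flow is also an assumption of this sketch.

```python
import numpy as np

def od_gap(costs, flows, tol=0.0):
    """Range of effective delays over (path, time) cells with positive flow.

    costs, flows: arrays of shape (n_paths, n_times) for one O-D pair.
    """
    used = flows > tol                 # cells with h_p(t) > 0
    if not used.any():
        return 0.0                     # assumption: no departures, no gap
    used_costs = costs[used]
    return float(used_costs.max() - used_costs.min())

# Hypothetical example: two paths, three departure-time steps.
costs = np.array([[5.0, 4.0, 6.0],
                  [7.0, 4.5, 9.0]])
flows = np.array([[0.0, 10.0, 2.0],
                  [0.0, 3.0, 0.0]])
gap = od_gap(costs, flows)
print(gap)
```

Only the cells with costs 4.0, 6.0 and 4.5 carry flow here, so the unused high-cost cells do not inflate the gap.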

Table 3 contains a list of the global parameters employed in our numerical experiments. Here dt regulates the mesh size of the time grid in the numerical solution of the DNL, max. Iterations is the total number of iterations we let all three algorithms run on each test instance, and $\lambda$ is the relaxation parameter in Algorithm 3. The local parameters were chosen in a simple way, without an extensive search over the parameter space, which would very likely improve the reported results. The FB Algorithm 1 has been implemented as in [24], using the constant step size $\tau = 10$ on the Nguyen network and $\tau = 2$ on the Sioux Falls test network. FBF and IFBF (Algorithms 2 and 3) are implemented with the adaptive step-size rules (4.3) and (4.6), respectively. The relaxation and inertial sequences employed in FBF and IFBF are reported in Table 4. Figure 2 shows the distribution of the values of our merit function (5.1) on the Nguyen network, and Figure 3 displays the same statistic for the Sioux Falls network.

                    Nguyen    Sioux Falls
dt                  70        100
max. Iterations     200       100
α                   0.7       0.7
µ                   0.5       0.5
λ                   0.5       0.2

Table 3: Global Parameters for Algorithms 1, 2 and 3.

            FBF                                              IFBF
            Nguyen                 Sioux Falls               Nguyen           Sioux Falls
$\alpha_n$      $(1+n)^{-0.9}$           $(1+n)^{-0.9}$              N.N.             N.N.
$\beta_n$       $0.7 - 0.7(1+n)^{-0.7}$  $0.5 - 0.5(1+n)^{-0.4}$     $(10+n)^{-2}$    $10/(10n+1)$
$\epsilon_n$    N.N.                   N.N.                      $2/(1+n^5)$      $(0.1+n)^{-1.1}$

Table 4: List of method-specific parameters. N.N. stands for "not needed".

Figure 2: Distributions of O-D gaps to the DUE solutions in the Nguyen network, calculated according to (5.1). Panels: O-D gap FB, O-D gap FBF, O-D gap IFBF (vertical axis: count).

We see that in all our experiments the distribution of the O-D gaps is concentrated around 0.2 across all networks for FB and IFBF. This suggests that these algorithms perform similarly in terms of producing approximate equilibrium solutions. The decisive advantage of IFBF, however, is that it is guaranteed to converge strongly to the minimum norm solution, without requiring strict monotonicity of the delay operator. Figure 4 displays the path departure rates as well as the corresponding effective path delays on randomly selected paths in the considered test networks. We see from these plots that the path departure rates peak around the minima of the effective delay, which reflects equilibrium behavior on the routes.

Figure 3: Distributions of O-D gaps to the DUE solutions in the Sioux Falls network, calculated according to (5.1). Panels: O-D gap FB, O-D gap FBF, O-D gap IFBF (vertical axis: count).

We finally display a figure which gives some indication of the relative speed of convergence of each of the tested algorithms. We compute for each method the "relative energy" sequence

\[ e_n = \frac{\|h_{n+1} - h_n\|}{\|h_n\|} \tag{5.2} \]

which measures the decay of energy of the path departure rates generated by the algorithms. The authors of [24] call this the relative gap, and we follow this terminology in the labeling of the figures. For all our methods this sequence must converge to 0, and we can consider one method faster than the other if the rate of convergence of its energy sequence dominates that of the other. Figure 5 shows the evolution of the relative energy sequences for each method. We see that already non-optimized step size parameters lead to some acceleration in the IFBF scheme when compared to the other solvers.
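The relative energy (5.2) is straightforward to evaluate from stored iterates. The helper below is a minimal sketch; the list of iterates is a hypothetical, geometrically converging sequence rather than output of any of the solvers.

```python
import numpy as np

def relative_energy(iterates):
    """e_n = ||h_{n+1} - h_n|| / ||h_n|| for consecutive iterates."""
    return [float(np.linalg.norm(h1 - h0) / np.linalg.norm(h0))
            for h0, h1 in zip(iterates, iterates[1:])]

# Hypothetical iterates converging geometrically to (1, 2).
iterates = [np.array([1.0 + 0.5 ** n, 2.0]) for n in range(6)]
energy = relative_energy(iterates)
print(energy)
```

Plotting such a list on a semi-log scale against the iteration number gives the kind of curve shown in Figure 5.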

6 Convergence Analysis

6.1 Preliminaries

The purpose of this section is to collect some standard concepts from real Hilbert spaces. Throughout this section we let $H$ be a real Hilbert space with scalar product $\langle \cdot, \cdot \rangle$ and associated norm $\|\cdot\|$.

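Definition 6.1 (Convergence in Hilbert spaces). A sequence of points $\{x_n\}_{n\in\mathbb{N}}$ in a Hilbert space $H$ converges weakly to a point $x \in H$, denoted by $x_n \rightharpoonup x$, if

\[ \lim_{n\to\infty} \langle x_n, y \rangle = \langle x, y \rangle \]

for all test vectors $y \in H$. The sequence $\{x_n\}$ converges strongly to $x$ if

\[ \lim_{n\to\infty} \|x_n - x\| = 0. \]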
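The two notions differ in infinite dimensions. A standard textbook example, not taken from the paper, is an orthonormal sequence in $\ell^2$:

```latex
% Let e_n denote the n-th unit vector in \ell^2. For every y = (y_1, y_2, \dots) \in \ell^2,
\[
\langle e_n, y \rangle = y_n \longrightarrow 0 = \langle 0, y \rangle
\qquad \text{since } \sum_{n} y_n^2 < \infty ,
\]
% so e_n converges weakly to 0. However,
\[
\| e_n - 0 \| = 1 \quad \text{for all } n ,
\]
% so e_n does not converge strongly to 0.
```

This is why weak convergence of path flows alone gives no control of the $L^2$ distance to equilibrium, and why the strong convergence statements of Theorems 4.1 and 4.2 are of practical interest.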

Figure 4: Path departure rates and corresponding effective path delays for selected paths in the DUE solutions on the Nguyen network. Panels: (a) FB Nguyen, (b) FB Sioux, (c) FBF Nguyen, (d) FBF Sioux, (e) IFBF Nguyen, (f) IFBF Sioux. Axes: time (hr), path flow (veh/hr), effective delay.

In order to prove our main convergence results, we need the following standard facts. For all $x, y \in H$ and $\alpha \in \mathbb{R}$, we have

\[ \|x + y\|^2 \le \|x\|^2 + 2\langle y, x + y \rangle. \tag{6.1} \]

Lemma 6.2. Let $C$ be a nonempty closed convex set in $H$ and $x \in H$ arbitrary. Then

(i) $\langle P_C(x) - x, y - P_C(x) \rangle \ge 0$ for all $y \in C$;

(ii) $\|P_C(x) - y\|^2 \le \|x - y\|^2 - \|x - P_C(x)\|^2$ for all $y \in C$.
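Both projection inequalities can be checked numerically on a simple convex set. The sketch below uses the box C = [0,1]^3, whose projection is a componentwise clip; the set, the sampling scheme and the sample size are illustrative assumptions.

```python
import numpy as np

def project_box(x, lo=0.0, hi=1.0):
    # Metric projection onto the box [lo, hi]^d.
    return np.clip(x, lo, hi)

rng = np.random.default_rng(1)
worst_i, worst_ii = np.inf, np.inf
for _ in range(1000):
    x = rng.normal(scale=2.0, size=3)      # arbitrary point in R^3
    y = rng.uniform(0.0, 1.0, size=3)      # point of C
    p = project_box(x)
    # (i): <P_C(x) - x, y - P_C(x)> >= 0
    worst_i = min(worst_i, np.dot(p - x, y - p))
    # (ii): ||x - y||^2 - ||x - P_C(x)||^2 - ||P_C(x) - y||^2 >= 0
    slack = np.sum((x - y) ** 2) - np.sum((x - p) ** 2) - np.sum((p - y) ** 2)
    worst_ii = min(worst_ii, slack)
print(worst_i, worst_ii)
```

Expanding the squares shows that the slack in (ii) equals twice the inner product in (i), so the two checks are equivalent up to rounding.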

Figure 5: The relative energy (5.2) on semi-log scale for both test networks. Panels: (a) Nguyen network, (b) Sioux Falls network. Horizontal axis: iteration number; vertical axis: relative gap; curves: FB, FBF, IFBF.

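Definition 6.3. Let $A: H \to H$ be an operator. The operator $A$ is called

1. $L$-Lipschitz continuous with $L > 0$ on $X \subseteq H$ if

\[ \|A(x) - A(y)\| \le L \|x - y\| \quad \forall x, y \in X. \]

2. pseudo-monotone on $X \subseteq H$ if

\[ \langle A(x), y - x \rangle \ge 0 \;\Rightarrow\; \langle A(y), y - x \rangle \ge 0 \quad \forall x, y \in X. \tag{6.2} \]

3. sequentially weakly continuous if $x_n \rightharpoonup x$ implies $A(x_n) \rightharpoonup A(x)$.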

The next classical fact shows that solutions of VI(X,A) defined in terms of pseudo-monotone operators can be determined via a "weak formulation".

Lemma 6.4. [8] If $A: H \to H$ is pseudo-monotone and continuous, then $x^*$ is a solution of VI(X,A) if and only if

\[ \langle A(x), x - x^* \rangle \ge 0 \quad \forall x \in X. \]

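The next technical results are basic convergence guarantees for real-valued sequences.

Lemma 6.5. [53] Let $\{a_n\}_{n\in\mathbb{N}}$ be a sequence of nonnegative real numbers, and $\{\beta_n\}_{n\in\mathbb{N}}$ be a sequence in $(0,1)$ such that $\sum_n \beta_n = \infty$. Suppose that $\{b_n\}_{n\in\mathbb{N}}$ is a sequence with $\limsup_n b_n \le 0$. If

\[ a_{n+1} \le (1 - \beta_n) a_n + \beta_n b_n \quad \forall n \in \mathbb{N}, \]

then $\lim_{n\to\infty} a_n = 0$.

Lemma 6.6. [43] Let $\{a_n\}_{n\in\mathbb{N}}$ be a sequence of nonnegative real numbers, $\{\beta_n\}_{n\in\mathbb{N}}$ be a sequence of real numbers in $(0,1)$ with $\sum_n \beta_n = \infty$ and $\{b_n\}_{n\in\mathbb{N}}$ be a sequence of real numbers. Assume that

\[ a_{n+1} \le (1 - \beta_n) a_n + \beta_n b_n \quad \text{for all } n \ge 1. \]

If $\limsup_{k\to\infty} b_{n_k} \le 0$ for every subsequence $\{a_{n_k}\}_{k=1}^{\infty}$ of $\{a_n\}_{n=1}^{\infty}$ satisfying $\liminf_{k\to\infty} (a_{n_k+1} - a_{n_k}) \ge 0$, then $\lim_{n\to\infty} a_n = 0$.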

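Lemma 6.7. Let $\tau_0 > 0$ and $\mu \in (0,1)$. Let $A: H \to H$ be an $L$-Lipschitz continuous operator. The sequence $\{\tau_n\}_{n\ge 0}$ generated by eq. (4.6) is non-increasing and satisfies

\[ \lim_{n\to\infty} \tau_n = \tau \ge \bar{\tau} := \min\left\{ \frac{\mu}{L}, \tau_0 \right\}. \tag{6.3} \]

Furthermore,

\[ \|A(w_n) - A(y_n)\| \le \frac{\mu}{\tau_{n+1}} \|w_n - y_n\| \quad \forall n \ge 1. \tag{6.4} \]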

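Proof. Since $\tau_{n+1} = \min\left\{ \frac{\mu \|w_n - y_n\|}{\|A(w_n) - A(y_n)\|}, \tau_n \right\}$ whenever $A(w_n) \neq A(y_n)$, and $\tau_{n+1} = \tau_n$ otherwise, it is clear that $\tau_{n+1} \le \tau_n$ for all $n \ge 0$. Moreover, using the $L$-Lipschitz continuity of the operator $A$ gives

\[ \frac{\mu \|w_n - y_n\|}{\|A(w_n) - A(y_n)\|} \ge \frac{\mu}{L} \quad \text{if } A(w_n) \neq A(y_n). \]

Hence, $\tau_{n+1} \ge \min\{\frac{\mu}{L}, \tau_n\}$ for all $n$. By induction, it follows that $\{\tau_n\}_n$ is bounded from below by $\min\{\frac{\mu}{L}, \tau_0\}$. Being non-increasing and bounded from below, the sequence converges, and therefore $\lim_{n\to\infty} \tau_n = \tau \ge \bar{\tau} := \min\left\{ \frac{\mu}{L}, \tau_0 \right\}$. Finally, (6.4) is trivial if $A(w_n) = A(y_n)$; otherwise it follows from $\tau_{n+1} \le \frac{\mu \|w_n - y_n\|}{\|A(w_n) - A(y_n)\|}$.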
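The two properties established in Lemma 6.7 can be observed numerically. In the sketch below the step-size rule (4.6) is driven by random pairs (w, y) instead of actual IFBF iterates, and the linear operator A(x) = Mx with a random matrix M is a hypothetical stand-in for the delay operator; its Lipschitz constant is the spectral norm of M.

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.normal(size=(4, 4))
L = np.linalg.norm(M, 2)               # Lipschitz constant of x -> M x
A = lambda x: M @ x

tau0, mu = 1.0, 0.5
taus = [tau0]
for _ in range(200):
    # Random test pairs stand in for the iterates w_n and y_n.
    w, y = rng.normal(size=4), rng.normal(size=4)
    diff = np.linalg.norm(A(w) - A(y))
    tau = taus[-1]
    if diff > 0:                        # rule (4.6)
        tau = min(tau, mu * np.linalg.norm(w - y) / diff)
    taus.append(tau)

tau_bar = min(mu / L, tau0)            # lower bound from (6.3)
print(taus[-1], tau_bar)
```

The recorded sequence never increases and never drops below tau_bar, although the constant L is only used here to evaluate the bound, not to run the rule.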

6.2 Proof of Theorem 4.2

We start with an auxiliary technical result, which guarantees that weak cluster points of the algorithm produce solutions of DUE. It is based on techniques from [50].

Lemma 6.8. Let $\{w_n\}$ be a sequence generated by Algorithm 3. If there exists a subsequence $\{w_{n_k}\}$ convergent weakly to $z \in H$ and $\lim_{k\to\infty} \|w_{n_k} - y_{n_k}\| = 0$, then, having Assumptions 1-3 in place, we know $z \in \Omega$.

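Proof. Recall that $y_n = P_X(w_n - \tau_n A(w_n))$. By Lemma 6.2(i), we have

\[ \langle w_{n_k} - \tau_{n_k} A(w_{n_k}) - y_{n_k}, x - y_{n_k} \rangle \le 0, \quad \forall x \in X, \]

or, equivalently,

\[ \frac{1}{\tau_{n_k}} \langle w_{n_k} - y_{n_k}, x - y_{n_k} \rangle \le \langle A(w_{n_k}), x - y_{n_k} \rangle, \quad \forall x \in X. \]

Consequently, we have

\[ \frac{1}{\tau_{n_k}} \langle w_{n_k} - y_{n_k}, x - y_{n_k} \rangle + \langle A(w_{n_k}), y_{n_k} - w_{n_k} \rangle \le \langle A(w_{n_k}), x - w_{n_k} \rangle, \quad \forall x \in X. \tag{6.5} \]

Since $\{w_{n_k}\}$ is weakly convergent, it is bounded. Then, by the Lipschitz continuity of $A$, $\{A(w_{n_k})\}$ is bounded. As $\|w_{n_k} - y_{n_k}\| \to 0$, $\{y_{n_k}\}$ is also bounded and $\tau_{n_k} \ge \min\{\tau_0, \frac{\mu}{L}\}$. Passing (6.5) to the limit as $k \to \infty$, we get

\[ \liminf_{k\to\infty} \langle A(w_{n_k}), x - w_{n_k} \rangle \ge 0, \quad \forall x \in X. \tag{6.6} \]

Moreover, we have

\[ \begin{aligned} \langle A(y_{n_k}), x - y_{n_k} \rangle &= \langle A(y_{n_k}) - A(w_{n_k}), x - w_{n_k} \rangle + \langle A(w_{n_k}), x - w_{n_k} \rangle + \langle A(y_{n_k}), w_{n_k} - y_{n_k} \rangle \\ &\ge -\|A(y_{n_k}) - A(w_{n_k})\| \cdot \|x - w_{n_k}\| + \langle A(w_{n_k}), x - w_{n_k} \rangle - \|A(y_{n_k})\| \cdot \|w_{n_k} - y_{n_k}\|. \end{aligned} \]

Since $\lim_{k\to\infty} \|w_{n_k} - y_{n_k}\| = 0$ and $A$ is $L$-Lipschitz continuous on $H$, we get from the above

\[ \lim_{k\to\infty} \|A(w_{n_k}) - A(y_{n_k})\| = 0. \]

Together with (6.6), we obtain

\[ \liminf_{k\to\infty} \langle A(y_{n_k}), x - y_{n_k} \rangle \ge 0 \quad \forall x \in X. \]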

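Next, we show that $z \in \Omega$. We choose a sequence $\{\epsilon_k\}$ of positive numbers decreasing and tending to 0. For each $k \ge 1$, we denote by $N_k$ the smallest positive integer such that

\[ \langle A(y_{n_j}), x - y_{n_j} \rangle + \epsilon_k \ge 0, \quad \forall j \ge N_k. \tag{6.7} \]

Since $\{\epsilon_k\}$ is decreasing, it is easy to see that the sequence $\{N_k\}$ is increasing. Furthermore, for each $k \ge 1$, since $\{y_{N_k}\} \subset X$, we can suppose $A(y_{N_k}) \neq 0$ (otherwise, $y_{N_k}$ is a solution) and, setting

\[ v_{N_k} = \frac{A(y_{N_k})}{\|A(y_{N_k})\|^2}, \]

we have $\langle A(y_{N_k}), v_{N_k} \rangle = 1$ for each $k \ge 1$. Now, we can deduce from (6.7) that, for each $k \ge 1$,

\[ \langle A(y_{N_k}), x + \epsilon_k v_{N_k} - y_{N_k} \rangle = \langle A(y_{N_k}), x - y_{N_k} \rangle + \epsilon_k \ge 0. \]

Since $A$ is pseudo-monotone (Assumption 3) on $H$, we get

\[ \langle A(x + \epsilon_k v_{N_k}), x + \epsilon_k v_{N_k} - y_{N_k} \rangle \ge 0. \]

This implies that

\[ \langle A(x), x - y_{N_k} \rangle \ge \langle A(x) - A(x + \epsilon_k v_{N_k}), x + \epsilon_k v_{N_k} - y_{N_k} \rangle - \epsilon_k \langle A(x), v_{N_k} \rangle. \tag{6.8} \]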

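Now, we show that $\lim_{k\to\infty} \epsilon_k v_{N_k} = 0$. Indeed, since $w_{n_k} \rightharpoonup z$ and $\lim_{k\to\infty} \|w_{n_k} - y_{n_k}\| = 0$, we obtain $y_{N_k} \rightharpoonup z$ as $k \to \infty$. Since $\{y_n\} \subset X$, we clearly have $z \in X$ as well. Since $A$ is sequentially weakly continuous on $X$, $\{A(y_{n_k})\}$ converges weakly to $A(z)$. We can suppose $A(z) \neq 0$ (otherwise, $z$ is a solution). Since the norm mapping is sequentially weakly lower semi-continuous, we have

\[ 0 < \|A(z)\| \le \liminf_{k\to\infty} \|A(y_{n_k})\|. \]

Together with $\{y_{N_k}\} \subset \{y_{n_k}\}$ and $\epsilon_k \to 0$ as $k \to \infty$, we readily conclude

\[ 0 \le \limsup_{k\to\infty} \|\epsilon_k v_{N_k}\| = \limsup_{k\to\infty} \left( \frac{\epsilon_k}{\|A(y_{n_k})\|} \right) \le \frac{\limsup_{k\to\infty} \epsilon_k}{\liminf_{k\to\infty} \|A(y_{n_k})\|} = 0. \]

Hence, $\lim_{k\to\infty} \epsilon_k v_{N_k} = 0$.

Now, letting $k \to \infty$, the right hand side of (6.8) tends to zero, because $A$ is uniformly continuous, $\{w_{N_k}\}, \{v_{N_k}\}$ are bounded and $\lim_{k\to\infty} \epsilon_k v_{N_k} = 0$. Thus, we get

\[ \liminf_{k\to\infty} \langle A(x), x - y_{N_k} \rangle \ge 0. \]

Hence, for all $x \in X$, we have

\[ \langle A(x), x - z \rangle = \lim_{k\to\infty} \langle A(x), x - y_{N_k} \rangle = \liminf_{k\to\infty} \langle A(x), x - y_{N_k} \rangle \ge 0. \]

By Lemma 6.4, $z \in \Omega$. This completes the proof.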

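The next result establishes the boundedness of the sequence of path flows.

Lemma 6.9. Let Assumptions 1-3 hold. The sequence of path flows $\{h_n\}_{n=1}^{\infty}$ generated by Algorithm 3 is bounded. In addition,

\[ \|h_{n+1} - h^*\|^2 \le \|w_n - h^*\|^2 - \lambda \left( 1 - \mu \frac{\tau_n}{\tau_{n+1}} \right) \left( 2 - \lambda + \lambda \mu \frac{\tau_n}{\tau_{n+1}} \right) \|y_n - w_n\|^2. \tag{6.9} \]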

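Proof. We have

\[ \begin{aligned} \|h_{n+1} - h^*\|^2 &= \|(1 - \lambda) w_n + \lambda (y_n - \tau_n (A(y_n) - A(w_n))) - h^*\|^2 \\ &= \|(1 - \lambda)(w_n - h^*) + \lambda (y_n - h^*) + \lambda \tau_n (A(w_n) - A(y_n))\|^2 \\ &= (1 - \lambda)^2 \|w_n - h^*\|^2 + \lambda^2 \|y_n - h^*\|^2 + \lambda^2 \tau_n^2 \|A(w_n) - A(y_n)\|^2 \\ &\quad + 2(1 - \lambda)\lambda \langle w_n - h^*, y_n - h^* \rangle + 2(1 - \lambda)\lambda \tau_n \langle w_n - h^*, A(w_n) - A(y_n) \rangle \\ &\quad + 2\lambda^2 \tau_n \langle y_n - h^*, A(w_n) - A(y_n) \rangle. \end{aligned} \tag{6.10} \]

We use the identity

\[ 2 \langle w_n - h^*, y_n - h^* \rangle = \|w_n - h^*\|^2 + \|y_n - h^*\|^2 - \|w_n - y_n\|^2. \tag{6.11} \]

Moreover, from the definition of $\{\tau_n\}$, it is easy to see that

\[ \|A(w_n) - A(y_n)\| \le \frac{\mu}{\tau_{n+1}} \|w_n - y_n\|, \quad \forall n \ge 0. \tag{6.12} \]

Substituting (6.11) and (6.12) into (6.10), we get

\[ \begin{aligned} \|h_{n+1} - h^*\|^2 &\le (1 - \lambda) \|w_n - h^*\|^2 + \lambda \|y_n - h^*\|^2 + \lambda^2 \frac{\tau_n^2}{\tau_{n+1}^2} \mu^2 \|w_n - y_n\|^2 - (1 - \lambda)\lambda \|w_n - y_n\|^2 \\ &\quad + 2(1 - \lambda)\lambda \tau_n \langle w_n - h^*, A(w_n) - A(y_n) \rangle + 2\lambda^2 \tau_n \langle y_n - h^*, A(w_n) - A(y_n) \rangle. \end{aligned} \tag{6.13} \]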

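Lemma 6.2(i) yields the estimate

\[ \begin{aligned} \|y_n - h^*\|^2 &= \langle y_n - h^*, y_n - h^* \rangle \\ &= \langle P_X(w_n - \tau_n A(w_n)) - P_X(h^*), P_X(w_n - \tau_n A(w_n)) - P_X(h^*) \rangle \\ &= \langle y_n - h^*, w_n - \tau_n A(w_n) - h^* \rangle + \langle P_X(w_n - \tau_n A(w_n)) - P_X(h^*), P_X(w_n - \tau_n A(w_n)) - w_n + \tau_n A(w_n) \rangle \\ &\le \langle y_n - h^*, w_n - \tau_n A(w_n) - h^* \rangle \\ &= \tfrac{1}{2} \|y_n - h^*\|^2 + \tfrac{1}{2} \|w_n - \tau_n A(w_n) - h^*\|^2 - \tfrac{1}{2} \|y_n - w_n + \tau_n A(w_n)\|^2 \\ &= \tfrac{1}{2} \|y_n - h^*\|^2 + \tfrac{1}{2} \|w_n - h^*\|^2 - \tfrac{1}{2} \|y_n - w_n\|^2 - \langle y_n - h^*, \tau_n A(w_n) \rangle, \end{aligned} \]

or equivalently

\[ \|y_n - h^*\|^2 \le \|w_n - h^*\|^2 - \|y_n - w_n\|^2 - 2 \langle y_n - h^*, \tau_n A(w_n) \rangle. \tag{6.14} \]

Since $h^* \in \Omega$, we have $\langle A(h^*), y_n - h^* \rangle \ge 0$. It follows from the pseudo-monotonicity of $A$ that

\[ 2 \langle \tau_n A(y_n), y_n - h^* \rangle \ge 0. \tag{6.15} \]

Adding (6.14) and (6.15), we obtain

\[ \|y_n - h^*\|^2 \le \|w_n - h^*\|^2 - \|y_n - w_n\|^2 - 2\tau_n \langle y_n - h^*, A(w_n) - A(y_n) \rangle. \tag{6.16} \]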

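Substituting (6.16) into (6.13), we get

\[ \begin{aligned} \|h_{n+1} - h^*\|^2 &\le (1 - \lambda) \|w_n - h^*\|^2 + \lambda \|w_n - h^*\|^2 - \lambda \|y_n - w_n\|^2 - 2\lambda \tau_n \langle y_n - h^*, A(w_n) - A(y_n) \rangle \\ &\quad + \lambda^2 \frac{\tau_n^2}{\tau_{n+1}^2} \mu^2 \|w_n - y_n\|^2 - (1 - \lambda)\lambda \|w_n - y_n\|^2 \\ &\quad + 2(1 - \lambda)\lambda \tau_n \langle w_n - h^*, A(w_n) - A(y_n) \rangle + 2\lambda^2 \tau_n \langle y_n - h^*, A(w_n) - A(y_n) \rangle \\ &= (1 - \lambda) \|w_n - h^*\|^2 + \lambda \|w_n - h^*\|^2 - \lambda \|y_n - w_n\|^2 - 2\lambda \tau_n \langle y_n - h^*, A(w_n) - A(y_n) \rangle \\ &\quad + \lambda^2 \frac{\tau_n^2}{\tau_{n+1}^2} \mu^2 \|w_n - y_n\|^2 - (1 - \lambda)\lambda \|w_n - y_n\|^2 \\ &\quad + 2(1 - \lambda)\lambda \tau_n \langle y_n - h^*, A(w_n) - A(y_n) \rangle \\ &\quad + 2(1 - \lambda)\lambda \tau_n \langle w_n - y_n, A(w_n) - A(y_n) \rangle + 2\lambda^2 \tau_n \langle y_n - h^*, A(w_n) - A(y_n) \rangle \\ &= \|w_n - h^*\|^2 + \lambda^2 \frac{\tau_n^2}{\tau_{n+1}^2} \mu^2 \|w_n - y_n\|^2 - (2 - \lambda)\lambda \|w_n - y_n\|^2 \\ &\quad + 2(1 - \lambda)\lambda \tau_n \langle w_n - y_n, A(w_n) - A(y_n) \rangle \\ &\le \|w_n - h^*\|^2 + \lambda^2 \frac{\tau_n^2}{\tau_{n+1}^2} \mu^2 \|w_n - y_n\|^2 - (2 - \lambda)\lambda \|w_n - y_n\|^2 + 2(1 - \lambda)\lambda \frac{\tau_n}{\tau_{n+1}} \mu \|w_n - y_n\|^2 \\ &= \|w_n - h^*\|^2 - \lambda \left[ 2 - \lambda - \lambda \frac{\tau_n^2}{\tau_{n+1}^2} \mu^2 - 2(1 - \lambda) \frac{\tau_n}{\tau_{n+1}} \mu \right] \|w_n - y_n\|^2 \\ &= \|w_n - h^*\|^2 - \lambda \left( 1 - \mu \frac{\tau_n}{\tau_{n+1}} \right) \left( 2 - \lambda + \lambda \mu \frac{\tau_n}{\tau_{n+1}} \right) \|y_n - w_n\|^2. \end{aligned} \tag{6.17} \]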
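The last equality in (6.17) is a factorization of a quadratic in the ratio $r := \mu \tau_n / \tau_{n+1}$, a shorthand introduced here only for this check. Expanding the product confirms it:

```latex
\[
\begin{aligned}
(1 - r)(2 - \lambda + \lambda r)
  &= 2 - \lambda + \lambda r - 2r + \lambda r - \lambda r^2 \\
  &= 2 - \lambda - 2(1 - \lambda) r - \lambda r^2 ,
\end{aligned}
\qquad r := \mu \frac{\tau_n}{\tau_{n+1}} .
\]
```

The right-hand side is exactly the bracketed coefficient in the preceding line of (6.17), with $r^2 = \mu^2 \tau_n^2 / \tau_{n+1}^2$.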

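Since

\[ \lim_{n\to\infty} \left( 1 - \mu \frac{\tau_n}{\tau_{n+1}} \right) \left( 2 - \lambda + \lambda \mu \frac{\tau_n}{\tau_{n+1}} \right) = (1 - \mu)(2 - \lambda + \lambda \mu) > 0, \]

there exists $n_0 \in \mathbb{N}$ such that

\[ \left( 1 - \mu \frac{\tau_n}{\tau_{n+1}} \right) \left( 2 - \lambda + \lambda \mu \frac{\tau_n}{\tau_{n+1}} \right) > 0 \quad \forall n \ge n_0. \]

Hence

\[ \|h_{n+1} - h^*\| \le \|w_n - h^*\| \quad \forall n \ge n_0. \tag{6.18} \]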

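On the one hand, using the definition of $w_n$, we obtain

\[ \begin{aligned} \|w_n - h^*\| &= \|(1 - \beta_n)(h_n + \alpha_n (h_n - h_{n-1})) - h^*\| \\ &= \|(1 - \beta_n)(h_n - h^*) + (1 - \beta_n)\alpha_n (h_n - h_{n-1}) - \beta_n h^*\| \\ &\le (1 - \beta_n) \|h_n - h^*\| + (1 - \beta_n)\alpha_n \|h_n - h_{n-1}\| + \beta_n \|h^*\| \\ &= (1 - \beta_n) \|h_n - h^*\| + \beta_n \left[ (1 - \beta_n) \frac{\alpha_n}{\beta_n} \|h_n - h_{n-1}\| + \|h^*\| \right]. \end{aligned} \tag{6.19} \]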