Solving travel demand model equilibrium with Barzilai-Borwein step sizes

Authors:
  • DKS Associates

Gibb, J. Page 1
Solving travel demand model equilibrium with Barzilai-Borwein step sizes
John Gibb
DKS Associates
8950 Cal Center Drive, Suite 340
Sacramento, CA 95826
jag@dksassociates.com
April 10, 2017
Working Paper
Copyright © 2014-2017 John Gibb
Abstract
Successive averaging algorithms are commonly used to solve equilibrium for model systems
combining travel demand and traffic assignment. Each iteration updates the solution estimate as
a weighted average of the previous estimate and a new iterate from the feedback cycle. The
chosen weights are crucial to whether iteration converges toward the solution, and how quickly.
With a well-known sequence of declining step size weights, convergence is assured, but is often
quite slow. Many models converge much quicker with a constant step size, but a good choice of
the constant is problem-specific, determined by trial-and-error. A repertoire of less-explored
step size schedules is compiled, and some self-adaptive methods are noted. These alternatives
also depend on parameters needing experimental tuning.
Instead of these, however, this report examines the step sizes Barzilai and Borwein identified for
iterative gradient descent and solution of systems of equations. With low computational cost and
no assumed parameters, each iteration’s step size is computed immediately before use from the
linearized trend of the two most recent iterates, according to clearer underlying assumptions than
previous adaptive methods.
Using these step sizes, the tested models converged as efficiently and finely as with the best-
performing constant step sizes, and the calculated step sizes were usually close to those best
constants before reaching limits of convergence. This method is readily transferrable and
potentially beneficial to many other models.
1 – Introduction
Equilibrium travel demand model systems typically have a cyclical dependency among the
constituent models: travel demand from travel times, traffic assignments from travel demand,
and travel times from traffic assignments. A vector x (represented in practice as some collection
of origin-destination matrices or network variables) is both input and output to the model
function with solution at a fixed point,
$$x = F(x). \tag{1}$$
This cycle of dependency leads naturally to a “feedback loop” of calculation, which, at the
solution, reproduces itself. It is well known, however, that direct cycling from any other point,
or “naïve feedback,” is not a reliable solution method (1-4). Instead, iterative approximation
normally uses a successive averaging scheme, whereby the current iteration’s result from one or
more models in the chain is averaged with the previous iteration’s respective average. This new
average is taken as the improved solution, and input to the next feedback loop:
$$x_{k+1} = x_k + \lambda_k \left[ F(x_k) - x_k \right] \tag{2}$$
where: $x_k$ is the input to iteration k of the model system,
$F(x_k)$ is the iterate resulting from iteration k (e.g. trips resulting from demand
model),
the residual, $r_k = F(x_k) - x_k$, is the direction vector,
$\lambda_k \in (0,1]$ is the chosen step size for iteration k; $\lambda_1 = 1$.
Various successfully applied models have chosen different variables in the model stream for this
role, such as the person- or vehicle-trip matrices, link flows, link speeds or times, and zone-to-
zone “skim” travel times (3,4). This notation assumes each iteration concludes with the creation
of the variable about to be successively averaged, whether or not this is the end of an actual
application’s procedure loop.
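The update in Equation 2 can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `successive_average`, `toy_model`, and the one-variable model `6 - 2x` are hypothetical names and values chosen only to show how the step size rule enters the loop.

```python
import numpy as np

def successive_average(F, x0, step_size, iterations):
    """Iterate x_{k+1} = x_k + step_size(k) * (F(x_k) - x_k)."""
    x = np.asarray(x0, dtype=float)
    for k in range(1, iterations + 1):
        r = F(x) - x                  # residual: the direction vector
        x = x + step_size(k) * r      # weighted average of x and F(x)
    return x

def toy_model(x):
    # Hypothetical one-variable "model" with strong negative feedback.
    # Its fixed point is x = 2, since 6 - 2*2 = 2.
    return 6.0 - 2.0 * x

naive = successive_average(toy_model, [0.0], lambda k: 1.0, iterations=10)
const = successive_average(toy_model, [0.0], lambda k: 0.4, iterations=50)
msa = successive_average(toy_model, [0.0], lambda k: 1.0 / k, iterations=50)
print(naive, const, msa)
```

In this toy case, naive feedback (step size 1 throughout) moves away from the fixed point, while the constant 0.4 and the 1/k schedule both settle at x = 2. Relative speeds here say nothing about real models.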
This paper addresses model applications in which $F(x)$ is a continuous and reasonably smooth
function of $x$, so their relationship is approximately linear across short differences, and the
residual length or norm $\|F(x) - x\|$ is zero at a solution. Each feedback cycle normally has
user-optimal traffic assignments. Here we do not consider solution methods after Evans (2,5,6,7)
whose iterates are the link flows from an all-or-nothing assignment. For them, $F(x)$ is an
abruptly varying function of $x$, usually with no fixed point of zero residual, and no definite
relation between equilibrium and residual length. Their approach to equilibrium relies on other
model properties.
Whether an iterated model succeeds at converging to equilibrium, and how quickly, depends on
the chosen step sizes. The step size schedule $\lambda_k = 1/k$, often called “the method of successive
averages” (MSA), is known to converge feedback applications to equilibrium. Its shortcoming
on feedback applications (having user-optimal assignments) is that convergence is often much
slower than with appropriately-chosen constant step sizes (3,8). However, the constant can only
be reliably chosen after trial runs at various values, and a good choice for one model is not
guaranteed to be convergent or efficient with other models, or even with variations of that model.
The step size choice problem, then, is how to choose step sizes for efficient and assured
convergence, with minimal dependence on ad-hoc experiments and tuned parameters.
Section 2 reviews the MSA and constant step size rules, and identifies several alternatives that
have been proposed and applied in other contexts. Some are predetermined schedules, while
others are adaptive, adjusting in response to the trend of some variable. Section 3 identifies two
formulas for calculating each iteration’s step size from the two most recent $x$ and $F(x)$. They
require no line search and no parameters to assume. Their precedents, properties, and practical
considerations are discussed. Section 4 compares their performance on some feedback models
against constant and MSA step size schedules. Section 5 gives a summary and directions for
further investigation.
2 – Step size rules
This section reviews existing step size rules, beginning with MSA and constant.
Predetermined step size schedules
Successive averaging originated with Robbins and Monro (9) and was extended by Blum (10) to
solve general stochastic equations in which particular $F(x)$ are not known exactly, but
measured with noise. They showed that if $\lambda_k > 0$, $\sum_k \lambda_k = \infty$, and $\sum_k \lambda_k^2 < \infty$, the successive
average converges to the true fixed point almost surely. Robbins and Monro suggested
$\lambda_k = 1/k$ as a prototypical schedule; it is optimal at minimizing the variance of the estimate if the
mean and variance of $F(x)$ are stationary, varying from iteration to iteration due to noise alone.
This work gave rise to the field of stochastic approximation, and is basic to solutions to problems
in numerous fields (11,12).
Sheffi and Powell (13) brought that prototypical step size schedule into transportation modeling
to solve stochastic user-equilibrium traffic assignment. When the Clean Air Act Amendments of
1990 demanded travel demand models in practice to solve full-system equilibrium, practitioners
adapted sequential models into “feedback loops.” Many soon found failure with direct, or naïve
feedback, but success using that same 1/k step size sequence (2,7). Transportation modeling
literature often calls this sequence “the Method of Successive Averages” or MSA (e.g. (2)). This
paper uses terms “successive averaging” for any step sizes, and “MSA” for 1/k in particular.
Bar-Gera and Boyce (8), and Boyce et al. (3) empirically found appropriately-chosen constant
step sizes to converge various models much quicker than MSA. The former also showed the
theoretical appropriateness of constant step sizes for models that are approximately linear in the
neighborhood of the solution, when no stochasticity is involved. However, the appropriate
choice of step size is problem-specific. It takes a series of experimental runs to determine a
given model’s best step size, or a suitable range; too large a step size may not converge at all.
Step size schedules besides 1/k and constant have been identified and explored in other fields,
and in other specialties such as stochastic and dynamic traffic assignment. Whereas MSA gives
equal weight to all iterates from the beginning, the alternatives give more weight to the most
recent iterates, which are presumably closer to the solution. After George and Powell’s work in
dynamic programming (14), plus contributions to dynamic and stochastic traffic assignment by
Hiele (15), Liu et al (16), and others cited, Table 1 compiles several step size schedules.
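Table 1
Alternative pre-determined step size schedules
Method | Parameters | Step size pattern or formula
MSA (Robbins & Monro, Blum) (9,10) | (none) | 1, 1/2, 1/3, 1/4, 1/5, 1/6, …
Reset MSA (Cascetta et al) (17) | Reset schedule | 1, 1/2, 1/3, 1/4, 1/5, 1, 1/2, 1/3, 1/4, 1/5, 1/6, 1, 1/2, 1/3, …
Nagurney & Zhang (18) | (none) | 1, 1/2, 1/2, 1/3, 1/3, 1/3, 1/4, 1/4, 1/4, 1/4, … ($k$ repetitions of $1/k$)
Polyak (19) | $\beta \in (\tfrac{1}{2}, 1]$ | $\lambda_k = 1/k^{\beta}$
Generalized harmonic (14) | $a > 0$ | $\lambda_k = a/(a + k - 1)$
Weighted-MSA (Liu et al) (16) | $d > 0$ | $\lambda_k = k^d/(1^d + 2^d + \cdots + k^d)$
Search-Then-Converge (Darken & Moody) (20) | $\lambda_0, c, \tau$ | $\lambda_k = \lambda_0 \dfrac{1 + \frac{c}{\lambda_0}\frac{k}{\tau}}{1 + \frac{c}{\lambda_0}\frac{k}{\tau} + \tau\frac{k^2}{\tau^2}}$
McClain (14) | $\bar{\lambda}$ | $\lambda_k = \dfrac{\lambda_{k-1}}{1 + \lambda_{k-1} - \bar{\lambda}}$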
The step size schedules in Table 1 satisfy Blum’s theorem, except Reset MSA with unending
resets, and McClain’s; McClain’s is designed to approach constant asymptotically. All can
reduce the rate at which the step size decreases, compared to MSA. All except one have
parameters to tune for best effect in a given application.
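Several of the predetermined schedules can be written as small generators, which makes their decay rates easy to compare side by side. The sketch below is illustrative only: the formulas are the commonly cited forms of each schedule, and the parameter names (`beta`, `a`, `d`, `target`) and default values are assumptions, not values taken from this paper.

```python
from itertools import islice

def msa():
    k = 1
    while True:
        yield 1.0 / k                      # 1, 1/2, 1/3, ...
        k += 1

def nagurney_zhang():
    n = 1
    while True:
        for _ in range(n):                 # 1/n is repeated n times
            yield 1.0 / n
        n += 1

def polyak(beta=2.0 / 3.0):                # assumed exponent in (1/2, 1]
    k = 1
    while True:
        yield k ** -beta
        k += 1

def generalized_harmonic(a=5.0):           # a = 1 reproduces MSA
    k = 1
    while True:
        yield a / (a + k - 1)
        k += 1

def weighted_msa(d=2.0):                   # weight k^d on the k-th iterate
    total, k = 0.0, 1
    while True:
        total += k ** d
        yield k ** d / total
        k += 1

def mcclain(target=0.1):                   # approaches `target` asymptotically
    lam = 1.0
    while True:
        yield lam
        lam = lam / (1.0 + lam - target)

def first(schedule, n):
    """Return the first n step sizes of a schedule generator."""
    return list(islice(schedule, n))

print(first(msa(), 6))
print(first(nagurney_zhang(), 6))
print(first(weighted_msa(1.0), 3))
```

Generators keep each rule independent of the iteration loop, so a feedback loop can draw its next step size with `next(schedule)` whichever rule is in use.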
At least one study of demand model convergence has examined such step size schedules: Rich et
al. (21) did so with MSA, Reset MSA, Polyak, and Weighted-MSA methods, achieving the most
efficient and effective convergence with the latter two for their test case.
Choosing or tuning a step size schedule from Table 1 is by trial-and-error, with controls for both
the early step sizes and the rate of reduction. If a particular model is known to converge
efficiently with a certain constant step size, then one can start with a schedule providing roughly
similar step sizes among the number of iterations normally needed.
Adaptive step size rules
A variety of adaptive step size schemes have been applied in other contexts. George and Powell
(14) reviewed several from the literatures of stochastic estimation, signal processing, and
dynamic programming. In many, the step size reacts to some measure that tries to distinguish
forward progress from fluctuation, the former calling for a step size increase, the latter for a decrease.
Schemes by Gaivoronski (22), and Trigg and Leach (23) compare measures of net change of the
estimate across multiple iterations to its gross distance of travel among them. Schemes by
Kesten (24), and Mirozahmedov (25), adjust the step size in response to the inner product of
successive residuals, negative indicating oscillation or overstep. Like the fixed step size
schedules, these schemes depend on a variety of user-adjusted parameters. The stochastic gradient
adaptive method (12,14) attempts to pick an optimal step size to minimize expected squared
residual lengths, but depends on a problem-specific scaling parameter and good settings of a trust
range for the step size.
The aforementioned Liu et al. (16) developed a “self-regulating adaptive” scheme for their
stochastic traffic assignment, which selects one or another increment to the denominator of a
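harmonic series depending on whether the residual length grew or shrank from the last iteration:
$$\lambda_k = \frac{1}{\beta_k}, \qquad \beta_k = \begin{cases} \beta_{k-1} + \Gamma,\ \Gamma > 1, & \text{if } \|F(x_k) - x_k\| \ge \|F(x_{k-1}) - x_{k-1}\| \\ \beta_{k-1} + \gamma,\ \gamma < 1, & \text{otherwise} \end{cases}$$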
Qiu et al. (26), also for stochastic assignment, extended this scheme into a four-way decision,
using the residual length ratio itself for some cases. Both Liu’s and Qiu’s satisfy Blum’s
theorem for convergence.
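The self-regulating rule of Liu et al. can be sketched as follows: the step size is the reciprocal of a denominator that grows by a large increment when the residual length fails to shrink, and by a small increment otherwise. The function name, the increments 1.5 and 0.5, and the starting denominator of 1 are assumptions for illustration only.

```python
def self_regulating_steps(residual_norms, big=1.5, small=0.5):
    """Step sizes 1/beta_k from a sequence of residual lengths.

    beta grows by `big` (> 1) when the residual length did not shrink
    from the previous iteration, and by `small` (< 1) when it did.
    """
    beta = 1.0                        # assumed start, so the first step is 1
    steps = [1.0 / beta]
    for prev, cur in zip(residual_norms, residual_norms[1:]):
        beta += big if cur >= prev else small
        steps.append(1.0 / beta)
    return steps

# Residual shrinks, then grows, then shrinks again.
print(self_regulating_steps([10.0, 5.0, 6.0, 3.0]))
```

Because the denominator only ever increases, the step sizes decline monotonically; the increments control how fast.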
In their discussion of constant step sizes for feedback models, Bar-Gera and Boyce (8) included a
provision to reduce the step size if the average residual ratio of the last $m$ iterations (for iteration
$k$) exceeds a specified target; this residual ratio is defined as:
$$\rho(k, m) = \left[ \frac{\|F(x_k) - x_k\|}{\|F(x_{k-m}) - x_{k-m}\|} \right]^{1/m}$$
All of the above adaptive step size schemes depend on problem-specific parameters to decide
whether and/or how much to adjust the step size. Also unsatisfying is the vagueness of the
underlying objectives and assumptions by which they use the information at hand to make their
adjustments.
Optimal step sizes
Some models combining traffic assignment with trip distribution, mode choice, and/or trip
generation possess objective functions, optimized at the solution (1,5,6,27,28). Each iteration
typically uses some form of line-search, computing the objective at various points on the
direction line to estimate the optimal step size.
Optimization approaches have not gained widespread use in practice, however. Impediments
include:
• the computational cost of the line search in addition to computing the direction line itself,
• the complexity of the objective function calculations, which must be coded in addition to,
  and in perfect coordination with, all components of the model itself, and
• limitations of the models’ functional forms for which an optimization exists or can be
  readily identified.
Horowitz (7) generalized and simplified an application of the Evans (5) computational procedure
using MSA instead of optimization. A recent combined model by Yao et al. (28) forewent
evaluation of “the complex objective function” in favor of Liu’s self-adaptive step size scheme.
Functional form limitations motivated Bar-Gera and Boyce in their study of feedback with
constant step sizes (8).
3 – The Barzilai-Borwein Step Sizes
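For an iterative solution of multidimensional systems of equations of form $g(x) = 0$ (where
$g: \mathbb{R}^n \to \mathbb{R}^n$), Barzilai and Borwein (30) proposed two alternative step sizes applicable to the
direction vector $-g(x_k)$. Often, but not necessarily, $g(x)$ is a gradient of some function
$f: \mathbb{R}^n \to \mathbb{R}$ to be minimized, in which the search is along the negative gradient. At iteration k
with $\Delta x_k = x_k - x_{k-1}$ and $\Delta g_k = g(x_k) - g(x_{k-1})$, the two Barzilai-Borwein (BB) step sizes are:
$$\lambda_k^{(BB1)} = \frac{\langle \Delta x_k, \Delta x_k \rangle}{\langle \Delta x_k, \Delta g_k \rangle} \tag{3}$$
$$\lambda_k^{(BB2)} = \frac{\langle \Delta x_k, \Delta g_k \rangle}{\langle \Delta g_k, \Delta g_k \rangle} \tag{4}$$
BB1 and BB2 are also called respectively the long and short BB step sizes. It can be readily
shown that the ratio $\lambda_k^{(BB2)} / \lambda_k^{(BB1)}$ is the squared cosine of the angle between $\Delta x_k$ and $\Delta g_k$.
For the subject fixed-point problem (Eq. 1), $g(x) = x - F(x)$. We will refer to the residual
$r(x) = F(x) - x = -g(x)$, which is the actual positive search direction. Consequently, with
$\Delta x_k = x_k - x_{k-1}$ and $\Delta r_k = r_k - r_{k-1}$,
$$\lambda_k^{(BB1)} = -\frac{\langle \Delta x_k, \Delta x_k \rangle}{\langle \Delta x_k, \Delta r_k \rangle} \tag{5}$$
$$\lambda_k^{(BB2)} = -\frac{\langle \Delta x_k, \Delta r_k \rangle}{\langle \Delta r_k, \Delta r_k \rangle} \tag{6}$$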
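The two step sizes in residual form reduce to a handful of inner products. The following sketch assumes the estimates and residuals are held as NumPy arrays of any shape (for example stacked trip matrices); the function name and the sample numbers are hypothetical.

```python
import numpy as np

def bb_step_sizes(x_prev, x_curr, r_prev, r_curr):
    """Return (BB1, BB2) from the two most recent estimates and residuals."""
    dx = (np.asarray(x_curr, float) - np.asarray(x_prev, float)).ravel()
    dr = (np.asarray(r_curr, float) - np.asarray(r_prev, float)).ravel()
    dx_dr = np.dot(dx, dr)            # negative under negative feedback
    bb1 = -np.dot(dx, dx) / dx_dr     # long step
    bb2 = -dx_dr / np.dot(dr, dr)     # short step
    return bb1, bb2

# Hypothetical two-variable example.
x_prev, r_prev = np.array([0.0, 0.0]), np.array([3.0, 4.0])
x_curr, r_curr = np.array([3.0, 4.0]), np.array([-1.5, -4.0])
bb1, bb2 = bb_step_sizes(x_prev, x_curr, r_prev, r_curr)

# The ratio BB2/BB1 equals the squared cosine of the angle between dx and dr.
dx, dr = x_curr - x_prev, r_curr - r_prev
cos2 = np.dot(dx, dr) ** 2 / (np.dot(dx, dx) * np.dot(dr, dr))
print(bb1, bb2, bb2 / bb1, cos2)
```

Flattening with `ravel()` lets the same function serve vectors and matrices alike, since only inner products over all elements are needed.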
To illustrate, Figure 1 shows a plane section of the solution space containing the two most recent
estimates $x_{k-1}$ and $x_k$, their corresponding function iterates $F(x_{k-1})$ and $F(x_k)$, and residuals
$r_{k-1}$ and $r_k$. Figure 1 also shows, in dashed lines, linear estimates of what $x_k$ and $F(x_k)$ would
have been had any other step size been chosen instead of $\lambda_{k-1}$.
Figure 1
Schematic of two iterations and two alternative successors
[Figure 1 point labels: $x_{k-1}$, $x_k$, $F(x_{k-1})$, $F(x_k)$, $F(F(x_{k-1}))$, $x_{MR}$, $F(x_{MR})$, $x_{OR}$, $F(x_{OR})$, $x_{k+1}^{(BB1)}$, $x_{k+1}^{(BB2)}$, $r_{k-1}$, $r_k$]
The linear estimate of the residual vector from any point along $r_{k-1}$, in terms of arbitrary step size
$\lambda$, is
$$\hat{r}(\lambda) = r_{k-1} + \frac{\lambda}{\lambda_{k-1}} \Delta r_k \tag{7}$$
We now seek such vectors with properties evidencing proximity to the solution. One is
orthogonal to $r_{k-1}$, seeking a balance between oscillation and creep that is better defined than in
the previous adaptive rules. Orthogonality means $\Delta x_k^{\mathsf{T}} \left( r_{k-1} + \frac{\lambda}{\lambda_{k-1}} \Delta r_k \right) = 0$, which solves as
$\lambda = \lambda_k^{(BB1)}$ as in Equation 5. Figure 1 shows the origin of this vector as $x_{OR}$. The choice of
$x_{OR}$ for $x_k$ would approximate Cauchy’s steepest-descent method, which follows a zig-zag
path of right-angles to convergence. But for the next $x_{k+1}$, rather than choose a second point
within $r_{k-1}$, Barzilai-Borwein instead moves forward along the current direction vector $r_k$, to
where it intersects $\hat{r}(\lambda_k^{(BB1)})$; this is simply the same step size along $r_k$.
Another suitable $\hat{r}(\lambda)$ is the one with minimal estimated length. Solving the zero of the
derivative of squared length (from Equation 7) yields the minimum-residual point $x_{MR}$, with step
size $\lambda = \lambda_k^{(BB2)}$ as in Equation 6. As before, Barzilai-Borwein moves forward along $r_k$ by
that step size. (The choice of $x_{MR}$ for $x_k$ is another monotonically convergent, but not very
efficient, method.)
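For readers who want the two solutions spelled out, the short derivation below works from the linear residual estimate, using the fact that the previous step was $\Delta x_k = \lambda_{k-1} r_{k-1}$. The symbol $\hat{r}(\lambda)$ for the estimated residual is notation assumed here.

```latex
% Linear estimate of the residual for an arbitrary step size \lambda,
% with \Delta x_k = \lambda_{k-1} r_{k-1} and \Delta r_k = r_k - r_{k-1}:
\hat{r}(\lambda) = r_{k-1} + \frac{\lambda}{\lambda_{k-1}}\,\Delta r_k

% (a) Orthogonality to the previous direction:
\langle \Delta x_k, \hat{r}(\lambda) \rangle = 0
\;\Rightarrow\;
\frac{\langle \Delta x_k, \Delta x_k \rangle}{\lambda_{k-1}}
  + \frac{\lambda}{\lambda_{k-1}} \langle \Delta x_k, \Delta r_k \rangle = 0
\;\Rightarrow\;
\lambda = -\frac{\langle \Delta x_k, \Delta x_k \rangle}{\langle \Delta x_k, \Delta r_k \rangle}
\quad \text{(BB1)}

% (b) Minimum estimated residual length:
\frac{d}{d\lambda} \|\hat{r}(\lambda)\|^2
  = \frac{2}{\lambda_{k-1}} \langle \Delta r_k, \hat{r}(\lambda) \rangle = 0
\;\Rightarrow\;
\frac{\langle \Delta x_k, \Delta r_k \rangle}{\lambda_{k-1}}
  + \frac{\lambda}{\lambda_{k-1}} \langle \Delta r_k, \Delta r_k \rangle = 0
\;\Rightarrow\;
\lambda = -\frac{\langle \Delta x_k, \Delta r_k \rangle}{\langle \Delta r_k, \Delta r_k \rangle}
\quad \text{(BB2)}
```

Both results use $r_{k-1} = \Delta x_k / \lambda_{k-1}$ to replace the previous residual with the observed change in $x$.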
The Barzilai-Borwein methods require less computational effort per iteration than their classical
counterparts, requiring no line-searches, yet they and others have found them to converge
numerous problems in far fewer iterations. They avoid the inefficient zig-zag paths of the
former methods. Raydan (31) proved convergence of BB1 and BB2 for quadratic optimization
problems, and Dai and Liao (32) proved BB1 convergence to be R-linear for such problems.
Fletcher (33) found BB1 quite competitive with applicable conjugate gradient-based methods for
non-quadratic problems. Convergence is not necessarily monotone, however, except under
restrictive conditions.
The BB1 step size is the reciprocal of the Rayleigh quotient for the line-average Jacobian of $g$,
i.e. $1/\lambda_k^{(BB1)} = \Delta x_k^{\mathsf{T}} \Delta g_k \,/\, \Delta x_k^{\mathsf{T}} \Delta x_k$. For BB step sizes to be positive, the Jacobian should be of
positive-definite symmetric part (PD), which requires the Jacobian of residuals be of negative-definite
symmetric part (ND). An iteration with a negative step size moves against the descent direction
toward what may be an antilimit or unstable point, to which continued iteration does not
generally converge.
Raydan, La Cruz et al (34,35,36) developed convergence safeguards for BB applications lacking
theoretical proof of convergence: they attempt a BB step size for each iteration, but whenever $x_{k+1}$
fails to make adequate progress against some number of recent iterations, they backtrack along
$r_k$ by a smaller step size. They also included various contingency schemes whenever negative
step sizes arose. Fletcher (33) cautioned against over-restrictive safeguarding, giving a
numerical experiment solved much faster without restriction, despite several large “spikes” of its
objective function. Varadhan and Gilbert (37) documented a software package implementing
those procedures, offering both of the BB step sizes as options (and a third). They reported that
the smaller BB2 step size usually outperformed the others, and made this the default option.
The Jacobian of a travel demand model iteration, $J_F$, is normally ND, representing negative
feedback due to congestion: an increase in a traffic flow movement causes an increase in its
travel time, which deters the next iterate’s demand for travel in that movement. If $J_F$ is ND, then
the real components of the eigenvalue spectrum of $I - J_F$ are all > 1, comfortably PD and
yielding BB step sizes between zero and one. Leeway remains for positive response of F as long
as its forward component of response is contractive with respect to x. Whenever this happens,
$0 < \Delta x_k^{\mathsf{T}} \Delta F_k \,/\, \Delta x_k^{\mathsf{T}} \Delta x_k < 1$ (with $\Delta F_k = F(x_k) - F(x_{k-1})$), resulting in $\lambda_k^{(BB1)} > 1$, which is infeasible in travel demand models with
non-negativity constraints; the step size for application must be truncated to not exceed 1. If a
model ever encounters excessive positive feedback, i.e. $\Delta x_k^{\mathsf{T}} \Delta F_k \,/\, \Delta x_k^{\mathsf{T}} \Delta x_k \ge 1$, then $\lambda_k^{(BB1)} < 0$ or fails in
division-by-zero, and instability of the model system is indicated; successive averaging may fail
with any step size selection (8).
An alternative convergence safeguard is proposed here, which avoids the risk of wasted line-search
iterations altogether: impose upper and lower bounds on the step size, such that both
satisfy the criteria of Blum’s theorem. Many of the step size formulas in Table 1 can be adapted
for this purpose. For example, a lower bound $\lambda_k^{\min} = \min(a/k,\ 0.2)$ and an upper bound
$\lambda_k^{\max} = \min(b/k,\ 0.9)$, with constants $a \le b$, provide an initial trust region of [0.2, 0.9], and eventual take-over by the
convergence safeguards. Overarching limits are $\lambda_k > 0$ and $\lambda_k \le 1$, to uphold the model’s
non-negativity constraints.
Practical application
Figure 2 shows the application algorithm, generically for any choice of feedback variable (trips,
skims, etc.) and either of the two step size formulas. Note that the second iteration uses an
assumed step size rather than one computed from the initial point $x_1$, because in many
experimental runs, this iteration’s computed step size was almost always very close to 1, due to
independence and vastly different magnitudes between $r_1$ and $r_2$. Furthermore, choosing
high values for this step size did not generally improve convergence.
Figure 2
Algorithm for travel demand model with feedback using BB step sizes and safeguards
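Initialize $x_1 \leftarrow 0$ or suitable initial value
Initialize $\varepsilon \in (0,1]$, as a residual length stopping criterion
Loop for k = 1 to Max_Iterations
    Compute the model chain direction vector: $r_k \leftarrow F(x_k) - x_k$
    If $\|r_k\| < \varepsilon$ then stop
    If k = 1, then
        $\lambda_k \leftarrow 1$
    Else if k = 2, then
        $\lambda_k \leftarrow$ assumed second-iteration step size
    Else
        $\Delta x_k \leftarrow x_k - x_{k-1}$
        $\Delta g_k \leftarrow r_{k-1} - r_k$ (i.e., $-\Delta r_k$)
        Update safeguard bounds $\lambda_k^{\min}$ and $\lambda_k^{\max}$
        If $\langle \Delta x_k, \Delta g_k \rangle \le 0$ then (degenerate case, give warning or stop)
            $\lambda_k \leftarrow \lambda_k^{\min}$ (or other contingency step size)
        Else
            Compute $\lambda_k$ by either the BB1 or BB2 option, Equation 3 or 4
            $\lambda_k \leftarrow \max(\lambda_k^{\min}, \min(\lambda_k^{\max}, \lambda_k))$
        End If
    End If
    $x_{k+1} \leftarrow x_k + \lambda_k r_k$
End Loop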
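The loop in Figure 2 can be sketched in Python as below. This is an illustration under stated assumptions rather than the paper's code: the bound form `min(a / k, cap)`, the constants in `lower` and `upper`, the use of the lower bound as the contingency step, the plain residual-norm stopping test, and the two-variable linear `toy_model` are all hypothetical choices.

```python
import numpy as np

def solve_feedback_bb(F, x1, option="BB2", second_step=0.5, eps=1e-8,
                      max_iterations=50, lower=(1.0, 0.2), upper=(5.0, 0.9)):
    """Successive averaging with safeguarded Barzilai-Borwein step sizes.

    lower and upper are (a, cap) pairs; each bound is min(a / k, cap).
    Returns the final estimate and the list of step sizes actually used.
    """
    x = np.asarray(x1, dtype=float)
    x_prev = r_prev = None
    steps = []
    for k in range(1, max_iterations + 1):
        r = F(x) - x                               # direction vector
        if np.linalg.norm(r) < eps:                # assumed stopping test
            break
        if k == 1:
            lam = 1.0
        elif k == 2:
            lam = second_step                      # assumed, not computed
        else:
            dx, dr = x - x_prev, r - r_prev
            lam_min = min(lower[0] / k, lower[1])
            lam_max = min(upper[0] / k, upper[1])
            dx_dr = np.vdot(dx, dr)
            if dx_dr >= 0.0:                       # degenerate case
                lam = lam_min                      # assumed contingency
            else:
                if option == "BB1":
                    lam = -np.vdot(dx, dx) / dx_dr
                else:
                    lam = -dx_dr / np.vdot(dr, dr)
                lam = max(lam_min, min(lam_max, lam))
        steps.append(lam)
        x_prev, r_prev = x, r
        x = x + lam * r
    return x, steps

# Hypothetical linear model with negative feedback.
M = np.array([[-0.6, -0.2], [-0.2, -0.9]])
c = np.array([3.0, 4.0])

def toy_model(x):
    return c + M @ x

x, steps = solve_feedback_bb(toy_model, np.zeros(2))
print(x, steps)
```

`np.vdot` flattens its arguments, so the same loop would accept matrix-valued feedback variables. A production version would also log the degenerate case rather than silently substituting the contingency step.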
A relation between successive residual lengths and step sizes
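Many adaptive step size techniques choose a smaller step size if the residual length grows across
one or a group of iterations, than if it contracts. The BB2 step size does likewise. Rearranging
Equation (6) with $\Delta x_k = \lambda_{k-1} r_{k-1}$,
$$\lambda_k^{(BB2)} = \lambda_{k-1} \frac{\langle r_{k-1}, r_{k-1} \rangle - \langle r_{k-1}, r_k \rangle}{\langle r_k, r_k \rangle - 2 \langle r_k, r_{k-1} \rangle + \langle r_{k-1}, r_{k-1} \rangle}$$
leads to
$$\frac{\|r_k\|^2 - \|r_{k-1}\|^2}{\langle \Delta x_k, \Delta x_k \rangle} = \left( \frac{1}{\lambda_k^{(BB2)}} - \frac{2}{\lambda_{k-1}} \right) \left( -\frac{\langle \Delta x_k, \Delta r_k \rangle}{\langle \Delta x_k, \Delta x_k \rangle} \right). \tag{8}$$
If the latest residual grew from the previous, i.e., $\|r_k\| > \|r_{k-1}\|$, and the right-hand factor of Equation
8 is positive (normally so, being $1/\lambda_k^{(BB1)}$), then the left side is positive and so is the first factor on the right,
requiring $\lambda_k^{(BB2)} < \lambda_{k-1}/2$, a reduction to less than half. If successive residuals have equal
length, then the step size reduces exactly by half.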
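The halving behavior can be checked numerically. The sketch below computes the BB2 step size from a previous step size and two residuals, assuming the previous move was `lam_prev * r_prev`; the function name and the sample vectors are hypothetical.

```python
import numpy as np

def bb2_from_residuals(lam_prev, r_prev, r_curr):
    """BB2 step size when the previous move was dx = lam_prev * r_prev."""
    dx = lam_prev * r_prev
    dr = r_curr - r_prev
    return -np.dot(dx, dr) / np.dot(dr, dr)

lam_prev = 0.8
r_prev = np.array([3.0, 4.0])       # length 5
r_same = np.array([-4.0, 3.0])      # length 5: no change in residual length
r_grew = np.array([-6.0, 4.5])      # length 7.5: residual grew
r_shrank = np.array([-2.0, 1.5])    # length 2.5: residual shrank

same = bb2_from_residuals(lam_prev, r_prev, r_same)
grew = bb2_from_residuals(lam_prev, r_prev, r_grew)
shrank = bb2_from_residuals(lam_prev, r_prev, r_shrank)
print(same, grew, shrank)
```

With equal residual lengths the result is half the previous step size; a grown residual gives less than half, and a shrunken one gives more.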
When the model reaches a convergence level in which the computed differences between x and
F(x) are dominated by noise, non-shrinking residuals lead BB2 to exponentially-shrinking
step sizes. A minimum step size sequence convergent by Robbins-Monro-Blum, such as 1/k,
would prevent exponentially shrinking step sizes from locking the estimated solution into
“apparent convergence.”
No similar relation occurs between consecutive residual lengths and BB1 step sizes. However, if
successive residuals are dominated by random, independent noise, they would be orthogonal, i.e.
$\langle r_k, r_{k-1} \rangle = 0$, causing BB1 to repeat and perpetuate the previous step size, rather than diminish
it. Containment is appropriate by both maximum and minimum bounds of a declining trust
range.
Taking these considerations together, if the computed BB2 step size repeatedly halves
(approximately) the preceding step size (no matter how chosen), and the BB1 repeatedly
duplicates it, this indicates that a model may have converged to a noise-dominated state.
4 – Experimental results
This section compares convergence of some practical travel demand models with MSA, fixed
step sizes, and the BB step sizes. The models are:
• Sacramento region, “Sacmet,” (38,39) originally developed for the Sacramento Regional
  Council of Governments. The network has nearly 1560 zones, 11,600 nodes and 27,800
  links; mode choice is among seven modes. It is applied in Cube Voyager software (40).
  Three scenarios are tested:
  o Base-year, population of nearly 2 million,
  o Future year with about 2.7 million population, with a planned network, and
  o Future year, except with BPR link delay functions instead of the model’s conical
    functions. Specifically, $t = t_0 \left( 1 + (v/c)^{\alpha} \right)$, where the parameters $\alpha$ are kept the
    same as in the replaced conical functions, providing the same steepness at the
    effective capacity at which time is doubled (Spiess (41)).
• Carson City, Nevada and parts of adjacent counties (42), with nearly 240 zones, 900
  nodes, 2300 links, 4 modes, in TransCAD (43), the base year with nearly 82,000
  population.
In all these models, the collection of vehicle-trip matrices acts as the successively-averaged
variable. Thus, the trip tables assigned in the feedback time periods are collectively the $x$
vector, and $F(x)$ is the vehicle trip tables resulting from the succeeding model chain.
Successive averaging of the two prepares the next vehicle trips to be assigned. To compute the
BB step sizes, a new stage is added immediately before the successive-averaging stage, which
computes the differences and inner products in Equations 3 and 4.
Figure 3 compares the convergence of the three Sacramento scenarios for various constant step
sizes, and combinations of computed step size formula (BB1 and BB2) and two chosen step sizes
for iteration 2. The iteration number is shown as the series-line (or depth dimension), rather than
horizontally, to show comparisons more clearly between the cases. The convergence measure,
relative displaced trips = $\sum |F(x) - x| \,/\, \sum x$, is similar to the Euclidean length of the residuals, but less
sensitive to outliers. All models were run through ten feedback iterations, enough for most to
reach their practical limits of convergence.
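A convergence measure of this kind is a one-line computation on the trip tables. In the sketch below, normalizing by the total of the input trips is an assumption, as are the function name and the sample matrices.

```python
import numpy as np

def relative_displaced_trips(x, fx):
    """Sum of absolute trip differences, relative to total input trips.

    The choice of denominator (total of x) is an assumption of this sketch.
    """
    x = np.asarray(x, dtype=float)
    fx = np.asarray(fx, dtype=float)
    return np.abs(fx - x).sum() / x.sum()

# Hypothetical 2x2 trip tables: input to the loop and resulting iterate.
x = np.array([[100.0, 50.0], [25.0, 25.0]])
fx = np.array([[110.0, 45.0], [25.0, 20.0]])
print(relative_displaced_trips(x, fx))
```

Because it sums absolute differences rather than squares, a single badly displaced cell influences this measure less than it would a Euclidean norm.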
As shown in part (a) of Figure 3, the Sacramento base-year model converged monotonically
before stagnation or stopping with each tested step size rule. Of the constant step sizes, 0.8
yielded fastest convergence, 0.7 came close, and convergence was slower the farther the step size
departed from that range. MSA was clearly inferior.
Computational cost of the BB calculations was negligible, at less than 0.4% of overall model
runtimes.
Following the second iteration with pre-chosen step sizes, the BB iterations thereafter
significantly dampened the impact of that choice. For all BB cases, iterations 4 and after
converged about as well as with the best constants.
In part (b) of Figure 3, results from the future-year model were similar, but with best constant
step size leaning toward 0.7. These results indicate that a good constant step size for one model
scenario might not perform so well for another with significantly different congestion levels.
The runs with BB step size formulas converged on par with the best constants, even from the
poorer iteration 2 step size assumption.
Figure 3 – Sacramento model convergence
(a) Base-year
(b) Future year, planned network
(c) Future year, BPR delay functions
[Each panel plots relative displaced trips (log scale) for iterations 2 through 10 under each step size rule: MSA, constant step sizes (0.4 to 0.9 in panels a and b; 0.5 to 0.9 in panel c), naïve feedback, and BB1 and BB2 with a second-iteration step size of 0.5 or 0.7.]
In the future Sacramento model with the BPR function, part (c) of Figure 3 shows a slower rate
of convergence and poorer ultimate convergence levels than the same model with conical
functions. This is not surprising, due to the BPR function’s sharper sensitivities and greater
delays when $v/c \gg 1$, compared to the conical function. The constant step sizes giving best
early convergence are 0.6 and 0.7, but the former continues to converge finer when the latter
stagnates. Higher constants converged slower and stagnated at worse levels. Convergence
speed and quality are both more sensitive to step size than in the previous conditions.
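The difference between the two delay functions at high volume-to-capacity ratios can be seen in a small comparison. The sketch assumes a BPR form with unit coefficient, so that time doubles at capacity, and the conical form as usually attributed to Spiess; the exponent value 4 and the function names are hypothetical.

```python
import math

def bpr_ratio(vc, alpha):
    """Congested-to-free-flow time ratio, BPR form with unit coefficient."""
    return 1.0 + vc ** alpha

def conical_ratio(vc, alpha):
    """Conical volume-delay function (alpha > 1), after Spiess."""
    beta = (2.0 * alpha - 1.0) / (2.0 * alpha - 2.0)
    return (2.0 + math.sqrt(alpha ** 2 * (1.0 - vc) ** 2 + beta ** 2)
            - alpha * (1.0 - vc) - beta)

alpha = 4.0  # hypothetical steepness parameter
for vc in (0.0, 0.5, 1.0, 1.5, 2.0):
    print(vc, bpr_ratio(vc, alpha), conical_ratio(vc, alpha))
```

Both functions double the travel time at a ratio of 1 and share the same slope there, but the BPR form grows polynomially beyond capacity while the conical form approaches a straight line.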
Concerning the runs of the BPR scenario with calculated BB step sizes:
• The choice of second-iteration step size yielded orderly trends only in the early iterations.
• The BB cases converged monotonically before nearing stagnation, each iteration roughly
  as well as with the best constant, but the improvement with each iteration was less
  consistent.
• All BB cases reached ultimate convergence comparable to or better than the best constant,
  each taking 7 or 8 iterations to reach stagnation at a similar level.
Figure 4 shows the actual step sizes calculated in the Sacramento models examined. Some
observations:
• For the first two models, BB1 step sizes tended to be larger than BB2; step sizes
  significantly increased for a few iterations before they decreased.
• For the BPR model, BB step sizes jumped between high and low without consistent
  pattern, evidencing a less consistent responsiveness of the model to feedback. Some
  computed BB step sizes exceeded 1, unexpected but not impossible for a model of
  primarily negative feedback. Trust range truncation was applied.
• Many of the calculated step sizes swung outside the range of the respective model’s most
  well-behaved constants, some into the range of very poor constants.
• Approaching convergence around iteration 7 or so, with noise limiting reduction of
  residuals, the BB2 step size contracted as predicted.
• The smaller late step sizes, especially with BB2, appear to have enabled a slightly finer
  ultimate convergence than the otherwise best constants.
Figure 4
BB Step sizes, Sacramento model scenarios
0.5 or 0.7 = Second iteration assumed step size
Figure 5 compares convergence for the Carson City area model, and Figure 6 shows the actual
computed BB2 step sizes (unconstrained). Its software supported high numerical precision
enabling continued convergence beyond practical necessity. The best constant appears to be near
0.9; BB2 chose step sizes near this value. For both of the second-iteration settings it converged
on par with the best constant before nearing convergence limits. Computed BB1 step sizes (not
shown) were similar.
[Figure 4 panels: Base-year; Future plan; Future, BPR function. Each plots step size (0 to 1) against iteration (0 to 10) for the series 0.5, BB1; 0.7, BB1; 0.5, BB2; 0.7, BB2.]
Figure 5
Carson City area model convergence
[Plot of relative displaced trips (log scale, 1E-11 to 0.1) against iteration (1 to 15). Series: constant step sizes 0.5, 0.6, 0.7, 0.8, 0.9, and 1; MSA; BB2(.8); BB2(.9).]
Figure 6
BB Step sizes, Carson City model
[Plot of step size (0 to 1) against iteration (0 to 15). Series: BB2(.8); BB2(.9); MSA (lower limit).]
5 – Summary and conclusions
This study demonstrated that successive averaging with step sizes calculated by the Barzilai and
Borwein (BB) formulas solves travel demand models with feedback successfully and efficiently.
For any number of iterations, the test models reached convergence levels similar to those with
their best constant step sizes, and far better than with MSA. The BB step sizes responded
appropriately, without intervention, to significant changes in congestion level and in
congestion-delay sensitivity.
The two BB step size options performed about equally, usually choosing step sizes in the
neighborhood of the respective model’s best constant. Those by BB2 tended to be slightly smaller
than those by BB1. Approaching convergence limits, where noise became significant, the BB2 step
sizes dropped as predicted, indicating the need to impose a lower limit such as 1/k.
Convergence is assured by the imposition of a trust range of upper and lower limits on the step
size, both with declining schedules chosen to satisfy Blum’s theorem.
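As an illustration of the trust range, the following minimal sketch clips a computed step size between declining lower and upper limits. The lower limit 1/k is the one suggested in the text. The upper schedule min(1, c / k^p) and its parameters c and p are assumptions made here for illustration only and are not taken from the study; with c >= 1 and 0.5 < p <= 1 the limits sum to infinity while their squares sum to a finite value, which is the usual form of the step size conditions in stochastic approximation.

```python
def trust_range(k, c=4.0, p=0.75):
    """Return (lower, upper) step size limits for iteration k >= 1.

    lower = 1/k follows the text; upper = min(1, c / k**p) is a
    hypothetical declining schedule chosen only for illustration.
    """
    if k < 1:
        raise ValueError("iteration number k must be >= 1")
    lower = 1.0 / k
    upper = min(1.0, c / k ** p)
    # Guard against parameter choices that would put upper below lower.
    return lower, max(lower, upper)


def clip_step(step, k, c=4.0, p=0.75):
    """Truncate a computed step size to the trust range of iteration k."""
    lower, upper = trust_range(k, c, p)
    return min(max(step, lower), upper)
```

Any sequence of step sizes kept inside such a range inherits the divergence of the lower limits' sum and the convergence of the upper limits' sum of squares, whatever the BB formulas return in between.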
In conclusion, the BB step sizes with trust ranges overcome the drawbacks of the prevalent MSA
and constant step sizes, and of other notable alternatives, by assuring convergence, converging as
quickly as or better than well-chosen constants in the examples, depending minimally on
experimentally-tuned parameters, and having wide applicability, simple implementation, and
trifling computational cost. This study indicates BB step sizes could benefit a wide range of
models, offering about the fastest convergence attainable through step size choice, and widening
the conditions to which a given model solution process is transferable.
A few points in favor of the BB2 step size over the BB1 step size are:
• BB2 chooses the point along the shortest member of a family of linearly inferred
input → output pairs, minimizing uncertainty of the estimate, while BB1 takes a right angle
without regard to the overall length of the step taken.
• BB2 is the shorter step size of the two for a given condition, favoring caution.
• BB2 step sizes shrink when residuals fail to shrink, whether due to normal calculation or
the limits of practical convergence.
• This study empirically found slightly better fine convergence with BB2.
• In numerous experiments with systems of linear equations, BB2 more frequently
improved each iteration’s solution.
• Others have found BB2 to be the better performer for solving various other problems (37).
Despite these considerations, this study found comparable performance of BB1, and no reason to
reject it for demand model feedback.
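For reference, the two step sizes compared above can be sketched as follows. This assumes the successive averaging update x_next = x + step * r with residual r = F(x) - x, and uses the standard Barzilai-Borwein two-point formulas with the sign adapted to that residual. The function and variable names are hypothetical, and the study's own notation may differ.

```python
import numpy as np


def bb_steps(x_prev, r_prev, x_curr, r_curr):
    """Return (BB1, BB2) step sizes from the two most recent iterations.

    x_* are solution estimates and r_* = F(x_*) - x_* their residuals.
    """
    dx = np.asarray(x_curr, dtype=float) - np.asarray(x_prev, dtype=float)
    dr = np.asarray(r_curr, dtype=float) - np.asarray(r_prev, dtype=float)
    dxdr = float(dx @ dr)
    drdr = float(dr @ dr)
    if dxdr == 0.0 or drdr == 0.0:
        raise ValueError("step size undefined: no usable trend in the iterates")
    bb1 = -float(dx @ dx) / dxdr   # longer of the two when both are positive
    bb2 = -dxdr / drdr             # shorter of the two (Cauchy-Schwarz)
    return bb1, bb2


# Scalar check with F(x) = 1 - x: both formulas recover the step size 0.5,
# which lands exactly on the fixed point x = 0.5 of this linear map.
print(bb_steps([0.0], [1.0], [1.0], [-1.0]))
```

When the inner product of dx and dr is negative, as expected under negative feedback, both step sizes are positive and the Cauchy-Schwarz inequality gives BB2 <= BB1, consistent with the second point in the list above.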
The second iteration’s step size should normally be chosen in advance. Experimentally, the
impact of this choice diminished and eventually vanished in the subsequent iterations, so for
most models, little if any experimental tuning is necessary; any reasonable value (normally
around 0.4 to 0.8) should suffice. If a BB step size from the third iteration of the same or a
similar model is available, this should make a good choice for the new model’s second. Further
study should examine the suitability of BB for iteration 2 from “warm” starting points (e.g. from
another model scenario).
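A sketch of how the pieces might fit together, and of the fading influence of the second iteration's step size. The assumptions here are illustrative and not the study's: the first iteration accepts the new iterate fully (step size 1), the second uses the preset value, later iterations use BB2 truncated to [1/k, 1], the stopping test is a plain relative residual norm, and the toy "model" F is a small linear map with negative feedback standing in for a demand and assignment cycle.

```python
import numpy as np


def solve_feedback(F, x0, second_step=0.5, max_iter=30, tol=1e-12):
    """Successive averaging with BB2 step sizes from iteration 3 onward."""
    x = np.asarray(x0, dtype=float)
    r = F(x) - x                      # residual: new iterate minus estimate
    x_prev = r_prev = None
    for k in range(1, max_iter + 1):
        if k == 1:
            step = 1.0                # assumed: take the first iterate whole
        elif k == 2:
            step = second_step        # preset, as the text recommends
        else:
            dx, dr = x - x_prev, r - r_prev
            drdr = float(dr @ dr)
            step = -float(dx @ dr) / drdr if drdr > 0.0 else 1.0 / k
            step = min(max(step, 1.0 / k), 1.0)   # hypothetical trust range
        x_prev, r_prev = x, r
        x = x + step * r              # weighted average of x and F(x)
        r = F(x) - x
        if np.linalg.norm(r) <= tol * max(1.0, np.linalg.norm(x)):
            break
    return x, k


# Toy feedback map (hypothetical numbers): F(x) = a + B x, negative feedback.
B = np.array([[-0.5, 0.1], [0.2, -0.8]])
a = np.array([1.0, 2.0])
F = lambda x: a + B @ x

x_05, iters_05 = solve_feedback(F, np.zeros(2), second_step=0.5)
x_07, iters_07 = solve_feedback(F, np.zeros(2), second_step=0.7)
print(x_05, iters_05)
print(x_07, iters_07)
```

Both runs should arrive at the same fixed point, with the preset value affecting only the early iterations; a real model would substitute its own feedback cycle for F and its own convergence measure for the residual norm.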
A priority of the author for further development is to account for random noise explicitly in step
size choice and stopping criteria, not only for conventional models near their convergence limits,
but also for activity-based demand models applied with stochastic sampling processes. Neither
BB1 nor BB2 conforms to Blum’s step size criteria when each iterate is dominated by random
independent noise.
The examined literature on stochastic and dynamic traffic assignment relies on MSA, though with
evident dissatisfaction. The BB step sizes may be applicable to some of these problems, and to
many other equilibrium and fixed-point problems.
References
1. Florian, M., S. Nguyen, J. Ferland. On the Combined Distribution-Assignment of
Traffic. Transportation Science, 9(1) pp. 43-53. 1975.
2. Loudon, W., J. Parameswaran, B. Gardner. Incorporating Feedback in Travel
Forecasting. Transportation Research Record: Journal of the Transportation
Research Board, No. 1607, TRB, National Research Council, Washington, D.C.,
1997, pp. 185–195.
3. Boyce, D., C. O’Neill, W. Scherr. Solving the Sequential Travel Forecasting
Procedure with Feedback, Transportation Research Record: Journal of the
Transportation Research Board, No. 2077, TRB, National Research Council,
Washington, D.C., 2008, pp. 129–135.
4. Slavin, H., J. Lam, K. Nanduri. Traffic Assignment and Feedback Research to
Support Improved Travel Forecasting, Federal Transit Administration, 2015.
5. Evans, S. Derivation and analysis of some models for combining trip distribution and
assignment. Transportation Research 10 (1), 37-57. 1976.
6. Boyce, D., H. Bar-Gera. Multiclass combined models for urban travel forecasting.
Networks and Spatial Economics, 4, pp. 115-124, Kluwer, 2004.
7. Horowitz, A., Convergence properties of some iterative traffic assignment algorithms,
Transportation Research Record No. 1220, TRB, National Research Council,
Washington, D.C., 1989, pp. 21–27.
8. Bar-Gera, H., and D. Boyce. Solving a Nonconvex Combined Travel Forecasting
Model by the Method of Successive Averages with Constant Step Sizes.
Transportation Research, Vol. 40B, 2006, pp. 351-367.
9. Robbins, H., S. Monro. A stochastic approximation method. Annals of Mathematical
Statistics, 22, pp. 400-407, 1951.
10. Blum, J. Multidimensional stochastic approximation methods. Annals of
Mathematical Statistics, 25, pp. 737-744, 1954.
11. Kushner, H., G. Yin, Stochastic Approximation and Recursive Algorithms and
Applications, second edition, Springer-Verlag New York, NY, 2003.
12. Powell, W. Approximate Dynamic Programming, Solving the Curses of
Dimensionality, second edition, John Wiley & Sons, Inc., Hoboken, NJ, 2011.
13. Sheffi, Y., W. Powell. An algorithm for the equilibrium assignment problem with
random link times. Networks 12(2), 191-207. 1982.
14. George, A., W. Powell. Adaptive stepsizes for recursive estimation with applications
in approximate dynamic programming. Machine Learning, 65(1): 167-198. 2006.
15. Hiele, R. Iterate averaging methods for solving non-linear programming problems.
Delft Univ. of Technology (master thesis), 2003.
16. Liu, H., X. He, B. He. Method of successive weighted averages (MSWA) and self-
regulated averaging schemes for solving stochastic user equilibrium problem.
Networks and Spatial Economics 9, pp. 485-503, 2009.
17. Cascetta, E., Postorino, M.N. Fixed Point Approaches to the Estimation of O/D
Matrices Using Traffic Counts on Congested Networks, Transportation Science,
35(2) pp.134-147, 2001.
18. Nagurney, A., D. Zhang. Projected dynamical systems and variational inequalities
with applications. Kluwer, Boston, MA, 1996.
19. Polyak, B. New method of stochastic approximation type. Avtomatika i
Telemekhanika, 7, pp. 98-107. Institute of Control Sciences, Moscow, 1990.
20. Darken, C., & Moody, J. Note on learning rate schedules for stochastic optimization.
In Lippmann, Moody and Touretzky, (Eds.), Advances in neural information
processing systems, 3, pp. 1009-1016, 1991.
21. Rich, J., O.A. Nielsen, G. Cantarella. System convergence in transport modeling.
Presented at 2010 European Transport Conference, Association for European
Transport.
22. Gaivoronski, A. Stochastic quasigradient methods and their implementation. In Y.
Ermoliev and R. Wets (eds.) Numerical techniques for stochastic optimization,
Berlin: Springer-Verlag, 1988.
23. Trigg, D., A. Leach. Exponential smoothing with an adaptive response rate.
Operations Research Quarterly, 18(1), 53-59, 1967.
24. Kesten, H. Accelerated stochastic approximation. Annals of Mathematical Statistics,
29(4), 41–59, 1958.
25. Mirozahmedov, F., Uryasev, S. P. Adaptive stepsize regulation for stochastic
optimization algorithm. Zurnal vicisl. mat. i. mat. fiz., 23(6), 1314-1325, 1983.
26. Qiu, S., L. Cheng, X. Xu, W. Feng. Incorporating an adaptive adjusting scheme into
the method of successive averages for solving stochastic user equilibrium problem.
Proceedings, 12th International Conference of Transportation Professionals (CICTP
2012), American Society of Civil Engineers, 2012.
27. Beckmann, M., C. McGuire, C. Winsten. Studies in the Economics of Transportation.
Yale University Press, New Haven, CT. 1956.
28. Yao, J., A. Chen, S. Ryu, F. Shi. A general unconstrained optimization formulation
for the combined distribution and assignment problem. Transportation Research
Part B, 59, pp. 137-160, Elsevier, 2014.
29. Saad, Y. Iterative methods for sparse linear systems. 2nd ed., Society for Industrial
and Applied Mathematics. 2003.
30. Barzilai, J., J. Borwein. Two-point step size gradient methods. IMA Journal of
Numerical Analysis, 8, pp. 141-148. 1988.
31. Raydan, M. On the Barzilai and Borwein choice of steplength for the gradient
method. IMA Journal of Numerical Analysis, 13, 321-326. 1993.
32. Dai, Y., L. Liao. R-linear convergence of the Barzilai and Borwein gradient method.
IMA Journal of Numerical Analysis, 22, 1-10. 2002.
33. Fletcher, R. On the Barzilai-Borwein method. Applied Optimization 96, pp 235-256.
2005.
34. Raydan, M. The Barzilai and Borwein gradient method for the large scale
unconstrained minimization problem. SIAM Journal on Optimization 7, pp 26-33.
1997.
35. La Cruz, W., M. Raydan. Nonmonotone spectral methods for large-scale nonlinear
systems. Optimization Methods & Software. 2003.
36. La Cruz, W., J. Martínez, M. Raydan. Spectral residual method without gradient
information for solving large-scale nonlinear systems of equations. Mathematics of
Computation 75(255), 2006.
37. Varadhan, R., P. Gilbert. BB: An R package for solving a large system of nonlinear
equations and for optimizing a high-dimensional nonlinear objective function.
Journal of Statistical Software 32(4). 2009.
38. DKS Associates. Model update report, Sacramento regional travel demand model
version 2001 (Sacmet 01). Sacramento Area Council of Governments, technical
report 02-003. 2002.
39. DKS Associates. Draft update of Sacmet model 2011. Memorandum to Sacramento
Regional Transit District, Jan 26, 2012.
40. Citilabs. Cube Voyager 6.1 software, 2013.
41. Spiess, H., Conical Volume Delay Functions. Transportation Science, Vol. 24, No. 2,
1990.
42. DKS Associates. CAMPO Travel Demand Model Update, for Carson Area
Metropolitan Planning Organization, Sep 2011.
43. Caliper Corp. TransCAD Transportation Planning Software, version 5.0, 2012.