Modelling Stochastic Transportation Networks with Markov Chains

Authors:
  • Klumpentown Consulting
Willem Klumpenhouwer¹
¹ Department of Civil and Mineral Engineering, University of Toronto, 35 St. George Street, Toronto, Ontario, M5S 1A4, Canada
Corresponding author:
Willem Klumpenhouwer, Department of Civil and Mineral Engineering, University of Toronto, 35 St.
George St., Toronto, Ontario M5S 1A4, Canada.
Email: willem.klumpenhouwer@utoronto.ca
Note: This is the authors’ version of a work that has not yet been published in a peer-reviewed journal.
Changes resulting from the publishing process, such as peer review, editing, corrections, structural
formatting, and other quality control mechanisms may not be reflected in this document.
Abstract
Representing the randomness that is inherent in transportation networks is of key importance for planners,
schedulers, analysts, and users. While many detailed stochastic-sensitive simulation models of
transportation networks exist, they are often costly, closed, and extremely data intensive, requiring
significant investment from researchers and agencies to develop and analyze. To provide a higher-order
understanding of the reliability impacts of new infrastructure or operational policies, new flexible data
analysis tools and modelling methods are needed. This research presents an adaptable mathematical
framework for modelling transportation networks using Markov chains, and presents a new open source
tool, MCRoute, which allows users to quickly prototype and analyze stochastic networks using data or
theoretical distributions. Applications of the model and the tool are demonstrated in three case studies: (i)
the potential impact of infrastructure improvements at a bottleneck on a railway corridor, (ii) the simulation
of bus schedule adherence and bunching occurrences under various holding control scenarios, and (iii) the
analysis of path-based reliability of users on a multi-stage transit journey across various modes.
1. Introduction
Any trip, by any mode, on any transportation network involves numerous encounters with random,
unpredictable situations. This randomness can affect travel times, transfers between legs of a trip, and
the overall experience of a user of a transportation network. Reliability is particularly important with
coordinated, scheduled transportation networks such as public transit, airports, and goods transportation,
and for trips in which a fixed arrival time is crucial. Compounding random factors can lead to missed
transfers, problems with resource allocation, and a general breakdown of the network, fostering distrust in
the system.
Detailed simulations are one way to model these complex systems, and many commercial software
systems have been developed to capture some stochasticity in the movement of people and vehicles in a
system. Simulation models are useful for studying specific scenarios with limited geography; however, they
suffer from a number of challenges: they are data hungry, often require multi-level calibration (demand,
geometric, component model), and require substantial computation power and time for mid-size and large
applications.
Another approach is to abstract the randomness in these networks into more general distributions,
and model the movement and interactions in these systems as stochastic processes. Instead of attempting
to describe or model each individual interaction and its associated random components, a generalized
stochastic process models a bigger picture view of the system. By sacrificing individual detail, it is possible
to generalize problems to a larger scale, and quickly prototype scenarios to model the impacts of various
changes in the system on reliability, without needing high levels of detail about the system.
Markov processes, which are rooted in the assumption that the evolution of a process in the future
is dependent only on the present state, are a useful way of modelling certain aspects of transportation
systems. In many cases, the assumption of independence between steps is a reasonable one, given that
the interactions that vehicles and people have with random aspects of transportation systems are highly
varied. For example, on a single transit journey from home to work a traveller’s trip may be affected by
weather, traffic signals, bus schedule adherence, operator scheduling adjustments, traffic congestion, other
passenger behaviour, route choice, the design of transit facilities, and a myriad of other minutiae. Instead of
modelling each individual random encounter, Markov processes consider a given time step or stage to be
well-described by a probability distribution. This probability distribution can then be compounded over
several steps to produce a larger-picture distribution of this traveller’s journey.
Markov chains, a discrete, finite subset of Markov processes, strike a balance between a theoretical
model of stochastic processes and a data-driven simulation model. Finite state spaces (described more in
Section 3.1) provide a realistic bound on the model, and a discrete time evolution offers modellers and
analysts the chance to change individual situations and time steps and view their effects. For example, a
modeller can learn about the downstream effects on reliability of introducing holding control into a transit
system (Klumpenhouwer and Wirasinghe, 2018), or see the compounding effects of freight railway delays at
successive terminals (Barta et al., 2012).
This paper describes a flexible approach to modelling transportation systems using Markov chains and
provides a number of theoretical tools to examine a constructed Markov network. These include examining
Brownian motion effects through mean and variance growth, the effects of finite state spaces, and the use of
steady-state transition probabilities to discuss the long-term behaviour of systems and detect potential data
quality issues. An accompanying open-source tool developed as part of this research allows users to rapidly
prototype stochastic systems and study high-level reliability impacts of changes on these systems.
To illustrate potential use cases, three examples are given using various data-driven and theoretical
sources of transition probabilities. They include (i) examining the potential impact of infrastructure
improvements at a bottleneck on a railway corridor, (ii) simulating effects of bus schedule adherence
and bunching occurrences under various holding control scenarios, and (iii) analyzing the path-based
reliability of users on a multi-stage transit journey under a variety of route options. The intent of these case
studies and of the proposed model is to introduce a systematic approach to the Markov chain modelling of
transportation systems and to provide inspiration for future use and adaptation of this approach.
The remainder of the paper is structured as follows: Section 2 describes some foundational and recent
studies using stochastic processes to model transportation systems, with a particular focus on Markov
processes. Section 3 outlines the mathematical framework for a generalized Markov chain model of stochastic
transportation networks and describes some key properties and effects of the modelling process. Section 4
introduces the accompanying open-source Python package, MCRoute, and discusses some approaches for
applying the theoretical framework to modelling situations. Section 5 demonstrates the framework and
package in action with three examples. Section 6 provides some concluding remarks on limitations and
potential future research directions.
2. Stochastic Processes in Transportation Analysis
Markov processes appear as analysis tools throughout many aspects of transportation systems. Watling
and Cantarella (2015) provides an excellent introduction to the formalism of a Markov chain system in the
Preprint Version 2021-07-07 2
context of transportation behaviour and discusses eight scenarios in which stochastic process models have
been applied, focusing on decision making by transportation users and the equilibrium of that decision making
over time. Notably, while there are many examples in traffic assignment, user habit modelling, and the
modelling of a learning process for trip choice, only a single paper within the context of their review (Friesz
et al., 2008) utilized wider themes in stochastic processes such as Brownian Motion.
Recently, Besenczi et al. (2021) combined a traditional network modelling approach with stochastic
processes, focusing on path choice through a large-scale network. They examined a large data set of taxi
trajectories on a square city grid divided into 3-meter segments, and simulated the evolution of traffic on
that cellular grid (so-called “Markovian traffic”). Combined with a second distribution that examines the
selection of paths at intersections (so-called “Markovian vehicles”), they were able to model large-scale
transportation networks as a Markov process to positive effect. The authors note that there are many
techniques for short-term traffic flow prediction in the literature, but that Markov models are seldom used
in the context of larger networks.
Very few studies directly invoke stochastic processes to model the movement of individual vehicles
throughout a transportation network, especially in a transit context. Daganzo (1994) devised a cell
transmission model for vehicular traffic on a highway, and connected it with existing hydrodynamic theories
of vehicular movement. Newell (1977) modelled the movement of buses on an infinitely long route under a
uniform holding control scheme at each stop. He derived a stochastic process in the form of a Fokker-Planck
diffusion equation to illustrate how buses “drift” from their schedule, analogous to a weak gravitational
or electrostatic field. Newell’s work, which considers the behaviour of individual buses collectively as an
evolving probability distribution, is the foundation of the model proposed in this paper.
Recent uses of Markov processes to model movement of individual ‘particles’ (vehicles, pedestrians) on
a network have focused on rail and public transport networks, which tend to be less dense or
corridor-specific and therefore more easily modelled by Markov chains. Klumpenhouwer and Wirasinghe (2018)
extended Newell’s work on holding control to a discrete Markov chain approach, using vehicle location
data from a bus route to build a model of a bus route and optimize the location of holding points on the
route to balance reliability and speed. Şahin (2017) used Markov chains to examine steady-state delay
distributions on a rail corridor and assess the schedule robustness of railway networks. Similarly, Khadilkar
(2016) modelled the propagation of delays on a railway network with Markov chains to study resource
conflicts and schedule robustness on a rail corridor. Barta et al. (2012) considered the evolution of delays on
a rail freight network to classify terminals in terms of their ability to absorb or amplify delays. Esmaeilnejad
et al. (2021) invoked a Markov chain model of a transit corridor to examine high-level reliability impacts of
infrastructure changes for on-route charging of electric vehicles.
Many of these approaches, while diverse in application, utilize the same elements of Markov processes
and invoke similar definitions. While there are promising results in the areas where they have been applied,
in general Markov chains are not widely used to model random movement within transportation networks.
This may be in part due to the relative complexity in conceptualizing, initializing, and programming a
Markov chain in a given network context. One of the goals of this paper is to provide a systematic approach
by which to construct a Markov chain model representing the desired situation, and to outline accessible
tools by which to turn the theoretical model into reality.
3. Framework
This section provides the mathematical framework with which to conceptualize and model a Markov chain
transportation network. Starting with some mathematical definitions, we discuss transition probability
matrices and how probability distributions can be evolved across multiple steps, and look at some specific
properties of the network that are important for analysis.
3.1 Mathematical Formulation and Definitions
Networks
We define a transportation network $G$ as a directed graph (digraph), consisting of sets of
nodes $N$, edges $E$, and states $S$:
$$G = G(N, E, S) \tag{1}$$
In this network, each edge $e \in E$ must connect two nodes ($n \in N$); individual nodes can have an arbitrary
number of edges attached to them. Collectively, nodes and edges together are referred to in this paper as
elements.
State Space
In this Markovian system, each network must have an associated state space $S$ which defines
the bounds of the Markov process. This discrete, finite set represents the possible states that an object
traversing the defined network can exist in. These states can represent varying degrees of abstraction. For
example, when studying reliability these states are typically a characterization of schedule adherence or
delay, such as “3 minutes late” or “extremely late”. States can also represent total travel time, or even more
subjective aspects like the stress level of a traveller as they make their way through an airport. Each state
$s \in S$ has an associated label (or name) and value. For example, a state representing a delay of three minutes
would have a value of 3, while a state with a label of “extremely late” might have a value of 30 to represent
a threshold value of 30 minutes. The label is used to identify states for interpretation, while the value is
used for statistical calculations and other analysis.
Time Evolution
The evolution of ‘time’, which is typically associated with the forward evolution of
probabilities through a Markov chain, is given a more abstract definition. Each ‘step’ in the evolution
of a given probability distribution through a Markov chain is represented by travel through a node or
an edge. These steps may not be evenly spaced in reality. For example, in measuring the delay of an
individual making their way through a busy airport, nodes may represent various stages of interaction with
the system (check-in, security), while edges may represent movement between these points. The time of
movement between these nodes may vary; however, each represents a discrete checkpoint for the modelled
user. Similarly, when modelling the movement of buses along a route, individual stops may represent
evolution in time.
Paths
To discuss and model traversal of the network, we also define a path $\mathcal{P}$ ($p_n \in N$, $p_e \in E$) which
connects two nodes via an alternating series of nodes and edges. More specifically, a path is an ordered set
of nodes and edges $\mathcal{P} = \{n_1, e_{1,2}, n_2, e_{2,3}, \ldots\}$ which constitute traversal of the network from node to node.
This pattern can be adopted without loss of generality, as ‘dummy’ nodes and edges can be inserted into
the network that have no effect on the stochastic process (this is done in the examples in Section 5).
3.1.1 Transition Probability Matrices
Each node or edge has an associated transition probability matrix $\mathbf{P}$, which is a square matrix of dimension
$|S| \times |S|$. We denote an individual entry in this matrix with $p_{i,j}$, indicating the state transition probability
from state $i$ to $j$. Since this is a matrix whose rows form individual probability distributions, each
$p_{i,j} \in [0, 1]$, and each row of the matrix must sum to 1. Each row in $\mathbf{P}$ holds a distribution of probabilities
describing how likely an object in state $i$ is to move to any of the states $j$. Methods to populate these
transition matrices are discussed in Section 4.2.
Finally, each path $\mathcal{P}$ has an initial probability distribution vector $\mathbf{p}$, which acts as an initial state for
the stochastic process, much like the initial condition in a differential equation. Similar to the transition
probability matrix, each entry $p_i \in [0, 1]$, and all entries in the vector must sum to 1. Figure 1 diagrams a
simple network described by a set of five nodes $n_i$, seven edges $e_{i,j}$, and a single path:
$$\mathcal{P}_a = \{n_1, e_{1,3}, n_3, e_{3,4}, n_4, e_{4,2}, n_2\} \tag{2}$$
Figure 1: A simple demonstration network for stochastic analysis. The defined path is
shown in bold. (Nodes: $n_1$ to $n_5$; edges: $e_{1,2}$, $e_{1,3}$, $e_{3,5}$, $e_{5,3}$, $e_{2,1}$, $e_{4,2}$, $e_{3,4}$.)
If we assign this path an initial probability distribution vector $\mathbf{p}_a$, we can calculate the distribution of
probabilities over the state space $S$ by multiplying this vector by a sequence of matrices, whose product is
referred to as an $n$-step transition probability matrix. For example, we can find the probability distribution
vector $\mathbf{p}_3$ after traversing through $n_3$ as
$$\mathbf{p}_3 = \mathbf{p}_a \mathbf{N}_1 \mathbf{E}_{1,3} \mathbf{N}_3 \tag{3}$$
where $\mathbf{N}_1$ and $\mathbf{N}_3$ are the transition probability matrices for nodes 1 and 3, and $\mathbf{E}_{1,3}$ is the transition
probability matrix for edge (1, 3). The resulting vector will be of dimension $1 \times |S|$. This process of traversing a
path produces a sequence of evolving probability distributions representing the stochastic process evolution
of an object moving through the network.
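As a minimal sketch of equation (3), the following NumPy snippet evolves an initial distribution along a path by successive right-multiplication. The three-state space, the matrix values, and the helper name `evolve` are invented for illustration; they are not taken from the paper or from MCRoute.

```python
import numpy as np

# Hypothetical three-state space (early, on time, late); all values are illustrative.
N1 = np.eye(3)                          # node 1 leaves the distribution unchanged
E13 = np.array([[0.7, 0.3, 0.0],
                [0.1, 0.7, 0.2],
                [0.0, 0.3, 0.7]])       # edge (1, 3) spreads probability to neighbours
N3 = np.array([[0.5, 0.5, 0.0],
               [0.0, 0.8, 0.2],
               [0.0, 0.0, 1.0]])        # node 3

def evolve(p0, matrices):
    """Return the distribution before the path and after each element of it."""
    history = [np.asarray(p0, dtype=float)]
    for matrix in matrices:
        history.append(history[-1] @ matrix)  # row vector times transition matrix
    return history

p_a = np.array([0.0, 1.0, 0.0])         # start "on time" with certainty
history = evolve(p_a, [N1, E13, N3])
p_3 = history[-1]                       # distribution after traversing node 3
```

Keeping the intermediate vectors, rather than only the final product, is what allows summary statistics to be tracked step by step along the path.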
3.2 Properties of the Markov Network
3.2.1 The Markov Property
For the purposes of modelling stochastic transportation systems, we consider this network to follow
the Markov Property, namely that each individual move from one element of the network to another is
dependent on previous transitions only insofar as they have determined the current state. In other words, the
transition probabilities from one state to another for a given element (edge or node) are the same regardless
of how we arrived at the initial state.
In many transportation applications, whether the modelled process is truly Markovian is questionable.
For example, a bus that is late at a given stop may be late due to the current weather conditions, something
that has affected the transitions from previous stops. In some cases, a process that does not strictly follow
the Markov Property can be approximated quite well by assuming it is Markovian. When that is not feasible,
it is often possible to separate out the non-Markovian portion of the problem and include it as another
state variable. In our bus example, we could create separate networks for different weather conditions and
analyze them individually. This process often requires more data to establish transition probabilities, or
theoretical estimations as discussed in Section 4.2.
3.2.2 Absorbing Classes
An absorbing class is a set of one or more states $s \in S$ on the network which, once reached, cannot be left. For
a generalized network with varying transition probability matrices $\mathbf{P}$ for nodes and edges, each element of
the network can have individual absorbing classes which can affect the behaviour of probability evolution
along paths that include these elements. In some cases, the presence of these absorbing classes can model
useful situations. For example, holding control points on a bus route, where buses are instructed to depart
no earlier than a specified time, can be modelled by constructing a transition probability matrix where state
transitions representing early departures have probabilities of zero, creating a set of absorbing classes. A
network with synchronized transfers, where transit vehicles wait for the arrival of connecting trips, can also
be represented with an absorbing class. Identity matrices $\mathbf{I}$, with unit probabilities on the diagonal, consist
entirely of single-state absorbing classes.
Absorbing classes can also indicate data integrity issues when constructing transition probability matrices
directly from sampled data. A matrix that is constructed using relatively few observations across a large
state space will be quite sparse, creating the potential for multiple absorbing classes. A matrix with a high
number of small absorbing classes (classes with one or a few states) can have a significant effect on the
evolution of a probability distribution if these classes are off-diagonal, or it may have little to no effect if the
classes are along the diagonal (approximating an identity matrix). Calculating absorbing classes for matrices
can help quickly diagnose problems and detect poorly constructed transition matrices in large networks.
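One way to calculate absorbing classes, sketched here under the assumption that a class is a strongly connected set of states with no transitions leaving it, is to treat the non-zero entries of a matrix as a directed graph and use NetworkX. The function name `absorbing_classes` and the example matrix are hypothetical; this is not MCRoute's own routine.

```python
import networkx as nx
import numpy as np

def absorbing_classes(P, tol=0.0):
    """Return sorted lists of state indices that form closed (absorbing) classes."""
    P = np.asarray(P, dtype=float)
    graph = nx.DiGraph()
    graph.add_nodes_from(range(P.shape[0]))
    rows, cols = np.nonzero(P > tol)    # one arc per transition with positive probability
    graph.add_edges_from(zip(rows.tolist(), cols.tolist()))
    closed = []
    for component in nx.strongly_connected_components(graph):
        # A component is absorbing when no arc leads out of it.
        if all(nxt in component for s in component for nxt in graph.successors(s)):
            closed.append(sorted(component))
    return sorted(closed)

# Illustrative matrix: states 0 and 2 trap probability, state 1 feeds both.
P = np.array([[1.0, 0.0, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0]])
classes = absorbing_classes(P)
```

A sparse, data-derived matrix that returns many small classes from such a check is a candidate for the data integrity problems described above.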
3.2.3 Mean and Variance Growth
For a defined path $\mathcal{P}$, we can measure the mean and variance of the probability distribution vector $\mathbf{p}_i$ at a
given step $i$. These measures provide some analytical insight into the behaviour of the system. In stochastic
processes such as Brownian Motion, the variance grows linearly over time, indicating a steady increase in
the randomness of the system caused by the increasingly large number of possible paths that can be taken
to reach a given state. Places where the variance grows faster than the average rate indicate potential
“reliability bottlenecks”, or places which contribute significantly to the overall randomness of the system
downstream from that point. Slower-than-average growth in variance indicates places of relative stochastic
stability.
Changes in the mean of the probability distribution vectors along the traversal of the path can give
insight into the “drift” of the system. For example, when measuring schedule adherence, one would expect
the mean value of “delay” for a given system to stay at zero, assuming the schedule was planned with the
mean behaviour in mind. Due to the late-skewed nature of travel time distributions, this mean value can
often steadily increase unless scheduling interventions are made. By analyzing the drift of a given path, we
can begin to understand the underlying trends in behaviour that may be the result of external forces such
as scheduling or chronic delays, as opposed to factors that increase the variance of the distribution but may
not affect the underlying mean such as traffic fluctuations or signal timing.
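The mean and variance at each step follow directly from the state values and the probability vector. The sketch below assumes a three-state space with values of -1, 0, and 1 minutes and a single illustrative matrix applied at every step; none of these numbers come from the paper.

```python
import numpy as np

def mean_and_variance(p, values):
    """Mean and variance of the state values under the distribution p."""
    p = np.asarray(p, dtype=float)
    values = np.asarray(values, dtype=float)
    mean = float(p @ values)
    variance = float(p @ (values - mean) ** 2)
    return mean, variance

values = np.array([-1.0, 0.0, 1.0])     # hypothetical state values (minutes of delay)
step = np.array([[0.50, 0.50, 0.00],
                 [0.25, 0.50, 0.25],
                 [0.00, 0.50, 0.50]])   # illustrative matrix applied at every step

p = np.array([0.0, 1.0, 0.0])
stats = [mean_and_variance(p, values)]
for _ in range(3):
    p = p @ step
    stats.append(mean_and_variance(p, values))

# Step-to-step change in variance; large values would flag "reliability bottlenecks".
variance_growth = np.diff([variance for _, variance in stats])
```

In this small example the variance jumps at the first step and then stops growing, because the three-state space is too small for the distribution to keep spreading. The mean stays at zero, so there is no drift.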
3.2.4 Finite State Spaces and Edge Effects
For computational feasibility and to provide a set of practical bounds on the situation that is being modelled,
the state space $S$ is constrained to be finite. In some applications, it may be possible to choose a state
space that is large enough to act as an effectively infinite state space, where states on the boundary of the
defined space are never reached, or reached with acceptably low probability. Individual transition matrices,
however, are likely to have edge effects in the corners of the matrices, where states cannot transition outside
of the bounds of the finite state space. Figure 2 demonstrates this effect. It is not inherently problematic to
have edge effects in individual transition matrices; however, strong edge effects can cause the probability of
accessing edge states to increase quickly, so the system should be designed carefully.

Figure 2: An example of an edge transition matrix populated using a normal distribution. Note the edge effects
at the top left and bottom right corners. (Axes: current state and next state, each from -10 to 9; scale:
transition probability from 0.0 to 1.0.)
3.2.5 Steady-State Transition Probabilities
One often-invoked analysis technique for Markov chains is to determine the steady-state transition probability
vector for a given transition probability matrix. These steady-state vectors $\boldsymbol{\pi}$, when multiplied by the
given transition probability matrix $\mathbf{A}$, produce the same resulting vector $\boldsymbol{\pi}$. That is,
$$\boldsymbol{\pi} = \boldsymbol{\pi}\mathbf{A} \tag{4}$$
Steady-state transition probabilities represent convergent behaviour over an infinite number of time steps,
and can be useful to model long-term average behaviour of a given node, edge, or path in a network. These
steady-state transitions can be used on a multi-step matrix such as the one given in (3), or on single node or
edge matrices to study a given node or edge’s behaviour in isolation. Where steady states exist, they can be
found by solving
$$\boldsymbol{\pi}(\mathbf{A} - \mathbf{I}) = 0 \tag{5}$$
and using the fact that $\sum_i \pi_i = 1$.
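Equation (5) and the normalization condition can be solved together as one linear system. The following is a minimal sketch using NumPy's least-squares solver; the function name `steady_state` and the two-state matrix are illustrative assumptions rather than part of MCRoute.

```python
import numpy as np

def steady_state(A):
    """Solve pi (A - I) = 0 together with sum(pi) = 1 in the least-squares sense."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    # Transpose so the unknown is a column vector, then append the normalization row.
    lhs = np.vstack([A.T - np.eye(n), np.ones((1, n))])
    rhs = np.zeros(n + 1)
    rhs[-1] = 1.0
    pi, *_ = np.linalg.lstsq(lhs, rhs, rcond=None)
    return pi

A = np.array([[0.9, 0.1],
              [0.5, 0.5]])              # illustrative two-state matrix
pi = steady_state(A)
```

For this matrix the result is $(5/6, 1/6)$. When a matrix has more than one absorbing class the steady state is not unique, and the solver returns only one of the valid vectors, so the absorbing-class check in Section 3.2.2 is worth running first.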
4. MCRoute
MCRoute is an open-source Python module which allows for convenient and rapid construction and analysis
of Markov-chain based transportation networks.¹ This section outlines the general process of constructing a
network for modelling use, and discusses some methods for building transition probability matrices, which
form the backbone of this model. Additional utility and analysis functions are available and are described
in the documentation.
The conceptual structure of MCRoute closely follows the mathematical structure outlined in Section 3.1.
Since the network is represented as a standard graph structure, MCRoute builds on the Python package
NetworkX (Hagberg et al., 2008), with an MCRoute network $G$ inheriting from NetworkX’s
DiGraph object. This allows users to access the full functionality of NetworkX, including determining
network properties and performing shortest path analyses, an example of which is demonstrated in Section
5.3. A major additional requirement is that each node and edge must be assigned a transition probability
matrix.
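To illustrate the idea of attaching a matrix to every node and edge, the sketch below uses a plain NetworkX `DiGraph` rather than MCRoute itself. The attribute name `matrix`, the helper `path_matrix`, and all numbers are assumptions made for this example and should not be read as MCRoute's API.

```python
import networkx as nx
import numpy as np

identity = np.eye(2)
slow = np.array([[0.6, 0.4],
                 [0.0, 1.0]])           # illustrative two-state matrix (on time, late)

G = nx.DiGraph()
for name in ["n1", "n2", "n3"]:
    G.add_node(name, matrix=identity)   # "matrix" is a hypothetical attribute name
G.add_edge("n1", "n3", matrix=slow)
G.add_edge("n3", "n2", matrix=slow)

def path_matrix(graph, node_path):
    """Multiply node and edge matrices in traversal order along node_path."""
    result = graph.nodes[node_path[0]]["matrix"]
    for u, v in zip(node_path, node_path[1:]):
        result = result @ graph.edges[u, v]["matrix"] @ graph.nodes[v]["matrix"]
    return result

node_path = nx.shortest_path(G, "n1", "n2")     # standard NetworkX routing
p_end = np.array([1.0, 0.0]) @ path_matrix(G, node_path)
```

Because the graph is an ordinary `DiGraph`, any NetworkX routine can choose the path before the matrices along it are multiplied.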
Constructing and analyzing models using an MCRoute network requires assembling data inputs,
determining and populating a state space, adding nodes and edges with their associated transition probability
matrices, and performing analyses on the network (Figure 3).
Figure 3: Process for creating a generalized network using MCRoute. (Flowchart steps: determine and
assemble data inputs; determine and populate state space; populate nodes by creating a transition matrix and
adding the node to the network; populate edges by creating a transition matrix, adding the edge to the
network, and attaching it to nodes; assemble a collection of nodes and edges into a path; sample trajectories
or evolve distributions.)
4.1 Inputs
MCRoute is data-flexible, allowing a modeller to construct networks using data and observations, or
theoretical approaches. Whether computationally or from static data, a user must provide a state space
definition and a set of nodes and edges. These nodes and edges must include a transition probability matrix
under the defined state space.
State spaces are defined by a list of labels and values. The former are used for contextual labelling,
while the latter are used for statistical calculations and for creating transition matrices using theoretical
distributions as discussed in Section 4.2.1. Once a state space is created, individual transition matrices can
be combined with node and edge names to construct the graph network $G$. These inputs form the base
requirements for the construction of the network.
4.2 Building Transition Probability Matrices
There are two general approaches to building transition probability matrices: using theoretical distributions,
or sampling directly from data.
¹ Package documentation available at http://mcroute.readthedocs.io/
4.2.1 Theoretical Distributions
To ensure a level of smoothing or to model more hypothetical scenarios with limited data availability, a
user can create transition probability matrices using theoretical distributions. It is possible to construct
matrices from any theoretical distribution and provide them directly to MCRoute in the form of text files
holding probability values, or to use a generating method to create matrices directly from the package.
MCRoute currently provides options to build matrices using uniform, normal, log-normal, exponential,
and beta distributions. Distributions with infinite domains are truncated such that each row’s probability
distribution is normalized over the interval of available transitions in that row. For example, the truncated
normal distribution used to generate state transition probabilities takes the form:
$$\mathrm{truncnorm}(x, a, b, \mu, \sigma) = \frac{F_\Phi(x, \mu, \sigma) - F_\Phi(a, \mu, \sigma)}{F_\Phi(b, \mu, \sigma) - F_\Phi(a, \mu, \sigma)} \tag{6}$$
where
$$F_\Phi(x, \mu, \sigma) = \Phi\left(\frac{x - \mu}{\sigma}\right) = \frac{1}{2}\left[1 + \frac{2}{\sqrt{\pi}} \int_0^{\frac{x - \mu}{\sigma\sqrt{2}}} e^{-t^2}\,dt\right] \tag{7}$$
is the cumulative Gaussian distribution with a mean of $\mu$ and standard deviation of $\sigma$, and $a$ and $b$ are
the lowest and highest state transitions possible in a given row (the most negative jump and the most
positive jump). This truncated form allows for a normal distribution with the same mean and standard
deviation to be used across all rows of a given transition matrix, and also minimizes edge effects, since
using distributions on the domain $(-\infty, \infty)$ can lead to large probabilities being assigned to the boundary
(see Section 3.2.4).
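Equations (6) and (7) give a cumulative distribution, so some rule is still needed to turn it into discrete row probabilities. The sketch below assumes unit-spaced, ascending state values and assigns each jump the probability mass of a bin of half-width 0.5 around it, with the row limits $a$ and $b$ extended by the same half-width. That binning rule, the function names, and the parameter values are assumptions for illustration; MCRoute's own generator may discretize differently.

```python
import math
import numpy as np

def normal_cdf(x, mu, sigma):
    """F_Phi in equation (7), written with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def truncnorm_cdf(x, a, b, mu, sigma):
    """Equation (6): the normal CDF renormalized to the interval [a, b]."""
    low, high = normal_cdf(a, mu, sigma), normal_cdf(b, mu, sigma)
    return (normal_cdf(x, mu, sigma) - low) / (high - low)

def truncnorm_matrix(values, mu, sigma, half_width=0.5):
    """Assumed discretization: jump x receives the mass of [x - half_width, x + half_width]."""
    values = np.asarray(values, dtype=float)
    matrix = np.zeros((len(values), len(values)))
    for i, origin in enumerate(values):
        jumps = values - origin                     # transitions available in this row
        a, b = jumps[0] - half_width, jumps[-1] + half_width
        for j, jump in enumerate(jumps):
            upper = truncnorm_cdf(jump + half_width, a, b, mu, sigma)
            lower = truncnorm_cdf(jump - half_width, a, b, mu, sigma)
            matrix[i, j] = upper - lower
    return matrix

values = np.arange(-10, 10)             # states -10..9, the axis range of Figure 2
P = truncnorm_matrix(values, mu=0.0, sigma=2.0)
```

Because the bins tile the interval from $a$ to $b$, each row sums to one without a separate normalization step, and no probability is piled onto the boundary states.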
For the beta distribution, which is defined on the unit interval ($x \in [0, 1]$), state transition
values are first scaled to match the range of possible state transitions across the entire matrix, and then a
truncated beta distribution is used to calculate transition probabilities. In this case, the distribution takes
the form:
$$\mathrm{truncbeta}(y, r, s, \alpha, \beta) = \frac{F_\beta(y, \alpha, \beta) - F_\beta(r, \alpha, \beta)}{F_\beta(s, \alpha, \beta) - F_\beta(r, \alpha, \beta)} \tag{8}$$
where
$$F_\beta(y, \alpha, \beta) = \frac{\int_0^y t^{\alpha - 1}(1 - t)^{\beta - 1}\,dt}{\int_0^1 t^{\alpha - 1}(1 - t)^{\beta - 1}\,dt} \tag{9}$$
is the cumulative distribution function of the beta distribution, with shape parameters $\alpha$ and $\beta$,
$$y = \frac{x - \delta_m}{\delta_M - \delta_m} \tag{10}$$
is the scaled version of the state transition value $x$, $\delta_m$ and $\delta_M$ are the minimum and maximum possible
state transitions for a given state space, and $r$ and $s$ are the lowest and highest possible state transitions for
a given row in the transition probability matrix.
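The scaling in equation (10) and the truncation in equation (8) can be sketched as follows. The code assumes that $r$ and $s$ are passed in already scaled onto the unit interval, and it approximates equation (9) with a simple midpoint rule instead of a library routine. The five-state space, the shape parameters, and all function names are illustrative assumptions.

```python
import numpy as np

def scale(x, delta_min, delta_max):
    """Equation (10): map a state transition x onto the unit interval."""
    return (x - delta_min) / (delta_max - delta_min)

def beta_cdf(y, alpha, beta, n=20000):
    """Midpoint-rule approximation of equation (9); adequate for alpha, beta >= 1."""
    t = (np.arange(n) + 0.5) / n
    weights = t ** (alpha - 1.0) * (1.0 - t) ** (beta - 1.0)
    return float(weights[t <= y].sum() / weights.sum())

def truncbeta_cdf(y, r, s, alpha, beta):
    """Equation (8), assuming r and s are already scaled onto the unit interval."""
    low, high = beta_cdf(r, alpha, beta), beta_cdf(s, alpha, beta)
    return (beta_cdf(y, alpha, beta) - low) / (high - low)

values = np.arange(-2, 3)                           # hypothetical states -2..2
delta_min = values[0] - values[-1]                  # most negative jump in the matrix: -4
delta_max = values[-1] - values[0]                  # most positive jump in the matrix: 4
jumps = values - 1                                  # row for the state with value 1: -3..1
r = scale(jumps[0], delta_min, delta_max)           # 0.125
s = scale(jumps[-1], delta_min, delta_max)          # 0.625
y_mid = scale(0, delta_min, delta_max)              # "no change" maps to 0.5
```

Scaling against the matrix-wide range, rather than the row range, keeps the beta distribution centred on the same jump in every row, while the truncation to $[r, s]$ handles the row-specific limits.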
For probability distributions with support on the positive reals, such as the log-normal and exponential
distributions, state transitions must take on positive values only. These distributions are useful for situations
in which transition matrices represent travel times along a link or through a node, and an MCRoute network
is used to determine variations in travel times through the network, as in Example 3 outlined in
Section 5.3.
4.2.2 Sampling Directly from Data
When constructing frequency or sample-based transition probability matrices, a large set of state transition
data is needed to ensure that each possible state transition pair for each node and edge is covered
Preprint Version 2021-07-07 9
W. Klumpenhouwer Modelling Stochastic Transportation Networks with Markov Chains
with an adequate sample size. MCRoute allows for transition matrices to be populated with a set of
observed state transition values, creating a frequency distribution of state transitions to be used across
multiple rows. For example, a transition of “-2” would indicate an observed shift backwards of two states
from the original state during the transition. If each state represents one minute of schedule deviation, this
observation would be counted for transitions from -3 to -5, +1 to -1, +3 to +1, etc. If the state space
is instead {Very Early, Early, On Time, Late, Very Late}, an observed transition of “-2” would be counted
for transitions from “On Time” to “Very Early”, “Late” to “Early”, and “Very Late” to “On Time”. The
advantage of this method, which is agnostic to the origin state, is that large transition matrices can be
populated with fewer observations, as it is not required to have all permutations of rows and columns
accounted for with multiple observations. Each row in the matrix will have the same distribution with its
central point shifted along the diagonal.
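The origin-agnostic counting described above can be sketched in a few lines of plain Python. This is an illustration rather than MCRoute's code: `offset_matrix` is a hypothetical name, and the handling of offsets that would leave the state space (they are dropped and the row is renormalized) and of rows with no usable observations (they default to staying in place) are assumptions.

```python
from collections import Counter


def offset_matrix(states, observed_offsets):
    """Build a row-stochastic matrix from origin-agnostic state offsets."""
    counts = Counter(observed_offsets)
    n = len(states)
    matrix = []
    for i in range(n):
        # An offset k observed anywhere is counted for the move i -> i + k,
        # provided the destination stays inside the state space.
        row = [counts[j - i] for j in range(n)]
        total = sum(row)
        if total == 0:
            # Assumption: with no usable observation, the state does not change.
            row = [1.0 if j == i else 0.0 for j in range(n)]
        else:
            row = [c / total for c in row]
        matrix.append(row)
    return matrix


states = ["Very Early", "Early", "On Time", "Late", "Very Late"]
P = offset_matrix(states, [-2, -1, 0, 0, 0, 1, 1, 2])
print(P[2])  # [0.125, 0.125, 0.375, 0.25, 0.125]
```

Every row reuses the same eight observations, so the “On Time” row receives the full frequency distribution while rows near the boundary receive only the offsets they can accommodate.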
It is also possible to provide a transition matrix directly, constructed using any technique desired. All
that is required is that the matrix be square with a dimension matching the number of states in the state
space, that each entry in the matrix fall in the range [0, 1] and that the sum of each row is exactly one. This
allows for a data-informed matrix to be adjusted to account for theoretical scenario changes as demonstrated
in Section 5.2.
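The three requirements on a user-supplied matrix can be checked before it is attached to a node or edge. The helper below is a hypothetical sketch, not part of MCRoute; in particular, the small floating-point tolerance on the row sums is an assumption, since exact equality to one is rarely achievable in floating-point arithmetic.

```python
import math


def validate_transition_matrix(matrix, n_states, tol=1e-9):
    """Raise ValueError unless `matrix` is a valid row-stochastic matrix."""
    if len(matrix) != n_states or any(len(row) != n_states for row in matrix):
        raise ValueError("matrix must be square and match the state space size")
    for i, row in enumerate(matrix):
        if any(p < 0.0 or p > 1.0 for p in row):
            raise ValueError(f"row {i} has an entry outside [0, 1]")
        # Assumption: rows are accepted if they sum to one within `tol`.
        if not math.isclose(math.fsum(row), 1.0, abs_tol=tol):
            raise ValueError(f"row {i} does not sum to one")
    return True


validate_transition_matrix([[0.5, 0.5], [0.0, 1.0]], n_states=2)
```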
4.3 Evolving States and Sampling Trajectories
Two common computations used in analyzing these Markov chain systems are evolving transition probability
matrices along a path while observing changes in summary statistics, and simulating individual movements
on the network through sample trajectories. MCRoute has functions for both of these situations, making
use of NumPy’s efficient matrix multiplication methods (Harris et al., 2020). Traversing a path produces
a set of transition vectors calculated as described in (3). These resulting vector probability distributions
can be summarized, visualized, or sampled to produce individual trajectories. MCRoute includes helper
methods for these scenarios which are used in the examples in Section 5 and are described further in the
documentation.
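To make the two computations concrete, the following sketch evolves a state distribution along a short path and draws one sample trajectory. It does not use MCRoute's helper methods: the function names are hypothetical, we assume the row-vector convention in which a distribution is multiplied on the right by each transition matrix, and plain Python lists are used to keep the sketch dependency-free (with NumPy the same product would be `dist @ matrix`).

```python
import random


def step(dist, matrix):
    """Advance a row-vector distribution by one transition (dist times matrix)."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def evolve(initial, path_matrices):
    """Return the state distribution after each matrix along a path."""
    dists = [initial]
    for matrix in path_matrices:
        dists.append(step(dists[-1], matrix))
    return dists


def sample_trajectory(start_state, path_matrices, rng):
    """Draw one trajectory of state indices by sampling each row in turn."""
    trajectory = [start_state]
    for matrix in path_matrices:
        row = matrix[trajectory[-1]]
        trajectory.append(rng.choices(range(len(row)), weights=row)[0])
    return trajectory


# Hypothetical three-state matrix used for both steps of the path.
P = [[0.8, 0.2, 0.0],
     [0.1, 0.8, 0.1],
     [0.0, 0.2, 0.8]]
dists = evolve([0.0, 1.0, 0.0], [P, P])
trajectory = sample_trajectory(1, [P, P], random.Random(0))
```

Starting with certainty in the middle state, the distribution after two steps is approximately [0.16, 0.68, 0.16], showing how the spread grows as the path is traversed.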
5. Application Examples
The following three examples demonstrate how the theoretical framework can be applied to various
modelling situations. The goal is to demonstrate the flexibility of the generalized model to adapt to
variations in data availability, network representation, and analysis approach. Each case study involves a
different network implementation, a different approach to populating transition probability matrices, and
various levels of abstraction in the situation modelled.
These examples are intended as demonstrations of the potential uses of the modelling approach. The
level of detail available to a user of this framework can be much larger than what is described here; these
case studies are intended as proofs of concept rather than definitive studies of the scenarios they represent.
Each example includes a description of the situation, the definition of the Markov chain network, some
data considerations and adjustments that can be made within the model, a short discussion of the results of
the application, and a concluding statement about how these situations could be expanded in future work.
5.1 Example 1: Reliability Improvements on an Intercity Rail Corridor
This example² demonstrates how a Markov chain model of reliability can be used to examine the downstream
impacts of infrastructure changes that affect the reliability of a system. In this case study, we examine
a key intersection of two rail corridors in the Greater Toronto Area, Canada, mapped in Figure 4. Here,
freight traffic crosses a corridor (the Kitchener Line) used predominantly by passenger rail and is a source
of conflict between movements travelling east-west and movements travelling northeast-southwest.
² Preliminary results from this application were presented at the 2021 annual meeting of the Transportation Research Board.
Figure 4: Freight and passenger rail conflict area between Georgetown and Brampton,
Ontario, Canada
To demonstrate a potential application of the developed model, we consider a hypothetical scenario
where infrastructure improvements on this crossover link between Brampton and Georgetown stations cut
the standard deviation of the change in schedule deviation along that link in half, with all other aspects
(mean travel time, mean schedule deviation) remaining constant. Infrastructure improvements of this kind
have been considered in the past (Bueckert, 2018), and it is known that substantial reliability improvements
are possible if the heterogeneity of rail traffic is reduced (Vromans et al., 2006). The goal is to compare,
using this theoretical model, the effects of improving a single link on the overall performance of trains on
the route.
Real-time and scheduled arrival and departure times from stations along the corridor are reported by VIA
Rail³ and were collected over the 2019 calendar year from VIA Rail’s website. VIA operates four trains
daily on the corridor, two eastbound (even-numbered trains) and two westbound (odd-numbered trains).
³ VIA is Canada’s national passenger rail carrier.
[Figure 5 plot: “Modelled Standard Deviation of Schedule Deviation Before and After Reliability Improvements”. Horizontal axis: Step (0 to 16). Vertical axis: Standard Deviation (min). Series: trains 84, 85, 87, and 88, each under the Before and After scenarios.]
Figure 5: Evolution of the standard deviation over multiple steps across four trains on
the Kitchener corridor.
For each train observation, the schedule deviation on departure from a given station and the arrival at the
following station were computed, and this was used to determine the change in schedule deviation on links
between two stops. For each link, the mean and standard deviation of this change in schedule deviation
was computed to populate data-informed theoretical transition probability matrices.
The constructed network $G_1$ consisted of nodes represented by stations, edges represented by links
between stations, and a state space $S_1$ that represented schedule deviations in single-minute increments
between -30 and +40 minutes, for a total of 71 states. To isolate dwell effects from station-to-station movement,
each station node was given an identity probability transition matrix, acting as “inert” probability transitions
that will have no overall effect on the resulting probability distributions. Station-to-station edges were
assigned probability transition matrices populated by truncated normal distributions with a mean of zero
and standard deviation equal to the observed standard deviation on that edge for that train, using the
method provided in Section 4.2.1. A mean of zero was used under the assumption that scheduled service
follows the mean value of link travel time.
We then constructed a second network $G_1'$ which was identical except for the transition matrices used
for Brampton-Georgetown (westbound) and Georgetown-Brampton (eastbound), which were assigned half
of the original standard deviation. The standard deviation of the schedule deviation along the route was
compared between the two scenarios to demonstrate the downstream impacts of reliability changes on a
single edge. Figure 5 summarizes the results comparing the two scenarios.
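The node and edge matrices used in the two scenarios could be built along the following lines. This is only a sketch: the exact truncated normal form is the one given in Section 4.2.1, whereas here we assume a simple discretization in which each destination state receives the normal probability mass of a one-minute bin and each row is renormalized over the states it can reach. The function names and the standard deviation of 3 minutes are hypothetical.

```python
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def truncated_normal_matrix(n_states, mean, std):
    """Rows share one normal shape, renormalized over each row's reachable states."""
    matrix = []
    for i in range(n_states):
        # Assumed discretization: mass of the one-unit bin centred on offset j - i.
        weights = [
            normal_cdf(j - i + 0.5, mean, std) - normal_cdf(j - i - 0.5, mean, std)
            for j in range(n_states)
        ]
        total = sum(weights)
        matrix.append([w / total for w in weights])
    return matrix


def identity_matrix(n_states):
    """An "inert" node matrix that leaves any distribution unchanged."""
    return [[1.0 if i == j else 0.0 for j in range(n_states)]
            for i in range(n_states)]


# 71 one-minute states from -30 to +40; index 30 corresponds to a deviation of 0.
n = 71
node = identity_matrix(n)
edge_before = truncated_normal_matrix(n, mean=0.0, std=3.0)  # hypothetical std
edge_after = truncated_normal_matrix(n, mean=0.0, std=1.5)   # half of the above
```

Halving the standard deviation concentrates each row around the diagonal, which is what limits the growth of the spread downstream of the improved link.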
Each ‘step’ in the evolution is an alternating node and edge transition; the step-like increases in standard
deviation in Figure 5 are due to the fact that nodes were given identity matrices which do not impact the
probability evolution. The effect of the improvement is clearly observable on both westbound
(odd-numbered) trains, and less visible on eastbound (even-numbered) trains. This is because there are
more downstream stops after the improved link for westbound trains. Eastbound trains encounter the
improved link later in their journey, meaning the impact of a single change is counteracted by the larger
spread of the evolved probability distribution.
This analysis can be expanded to include other theoretical distributions of running time, or the incorporation
of detailed simulation model results from other areas of a corridor to prototype effects of changes
elsewhere on a rail network. For example, detailed modelling may exist on certain portions of the corridor
closer to Toronto; the results of these models can be used to further improve the realism of the transition
probability matrices for those links, allowing observation of the downstream effects of these improvements.
5.2 Example 2: Simulating Bus Trajectories and Bus Bunching Incidence
This example⁴ examines how the impacts of holding control on bus bunching incidence can be
demonstrated or modelled by this generalized framework and MCRoute. In this example we consider a
pair of buses $b_1$ and $b_2$, travelling sequentially on the same bus route, spaced $H$ minutes apart. As the buses
move, they produce a sequence of schedule deviation values, or a trajectory, which we generate
using the Markov chain method. For example, $b_1$ might have a trajectory of $[0, 2, 5, 5, 6, \ldots]$ while $b_2$ might
have a trajectory of $[0, -1, -1, 2, 1, \ldots]$. With these two trajectories we can calculate a difference trajectory as
$$d = b_2 - b_1 \tag{11}$$
which in this case would be $[0, -3, -6, -3, -5, \ldots]$. Values greater than zero in the difference trajectory
indicate a larger spread between buses than the scheduled headway, while negative values indicate the
buses are closer together. The actual separation between buses can therefore be expressed as
$$\delta = H + d \tag{12}$$
For this example, we will consider a pair of buses to be “bunched” if their separation decreases below half
of a headway at any point along the route; that is, if $\delta < H/2$, or equivalently $d_i < -H/2$, for any $d_i \in d$.
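The difference trajectory, the separation, and the bunching criterion translate directly into code. The sketch below uses hypothetical function names and hypothetical trajectories, and also shows how the incidence over a set of sampled pairs could be computed as a simple fraction.

```python
def difference_trajectory(b1, b2):
    """Difference trajectory d = b2 - b1, computed stop by stop."""
    return [second - first for first, second in zip(b1, b2)]


def is_bunched(b1, b2, headway):
    """True if the separation H + d_i drops below H / 2 at any stop."""
    return any(headway + d < headway / 2
               for d in difference_trajectory(b1, b2))


def bunching_incidence(pairs, headway):
    """Fraction of trajectory pairs that meet the bunching criterion."""
    return sum(is_bunched(b1, b2, headway) for b1, b2 in pairs) / len(pairs)


# Hypothetical schedule deviations (minutes) for a leading and a following bus.
lead = [0, 3, 7, 12, 14]
follow = [0, 1, 0, 1, 2]
print(difference_trajectory(lead, follow))  # [0, -2, -7, -11, -12]
print(is_bunched(lead, follow, headway=20))  # True
```

With a 20-minute headway, the pair is flagged at the fourth stop, where the separation falls to 9 minutes, below the 10-minute threshold.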
We constructed an MCRoute model using a set of observed transitions on Calgary Transit’s Route 3,
with data collected in the fall months of 2015. Route 3 is a long route (97 stops) that runs north to south
across the city, serving the downtown central business district. Stop-level schedule deviation information
was inferred from automated passenger count data which recorded stop events handling passengers. The
defined network $G_2$ consisted of stop nodes and edges representing travel between bus stops on the route.
The state space $S_2$ was constructed to represent single-minute increments of schedule deviation between -15
minutes and +70 minutes, for a total of 86 states. Similar to Example 1, stops were initialized with identity
transition probability matrices to represent a route with no holding control implemented at stops.
For edge transition probabilities, instead of constructing theoretical distributions using calculated
standard deviations, state transition observations were populated directly into transition probability
matrices. Initial loading identified matrices with high numbers of absorbing classes (see Section 3.2.2) due
to low numbers of observations. For these stops, the previous stop’s transition probability matrix was
used instead. With this network initialized, 1000 pairs of trajectories were generated by sampling from the
evolving states as described in Section 4.3. As an additional smoothing measure, jumps from one stop to
the next were not allowed to exceed a change of ±5 minutes. The bunching incidence was calculated as
the total fraction of pairs of buses which met the bunching criteria at some point along their route. In this
example, the headway was set to $H = 20$ minutes.
To model the implementation of control on the route, the identity matrix used for stop transitions was
truncated at a certain slack time. Truncating matrices requires summing the probability values in each row
to the left of a specified column, and then adding that sum to the probabilities in that column. This creates
an effect of absorbing all earlier state transitions into that single state. Figure 6 visualizes the effect of this
truncation on a sample probability transition matrix. Matrices were truncated at state values ranging in
one-minute increments from -10 to +10 minutes, representing increasingly strict holding control policies. At
each value, trajectories were re-sampled and bunching incidence was measured.
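The truncation step can be sketched as follows. The function name is hypothetical, and we assume that the entries to the left of the chosen column are set to zero once their probability has been moved, which is what the absorbing effect described above implies.

```python
def truncate_matrix(matrix, column):
    """Move all probability left of `column` into `column`, row by row."""
    truncated = []
    for row in matrix:
        absorbed = sum(row[:column])
        # Entries left of the column are zeroed; entries to the right are kept.
        new_row = [0.0] * column + [row[column] + absorbed] + list(row[column + 1:])
        truncated.append(new_row)
    return truncated


# Holding control on a four-state identity (no-control) stop matrix:
# any bus arriving in states 0 or 1 leaves the stop in state 2.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
held = truncate_matrix(identity, 2)
print(held[0])  # [0.0, 0.0, 1.0, 0.0]
print(held[3])  # [0.0, 0.0, 0.0, 1.0]
```

Rows still sum to one after truncation, so the result remains a valid transition probability matrix.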
⁴ Preliminary results from this application were developed by and presented in Klumpenhouwer (2018).
[Figure 6 panels: (a) Before truncation; (b) After truncation at state 5. Each panel plots From State (0 to 9) against To State (0 to 9), shaded by Probability.]
Figure 6: Example of matrix truncation for a 10-value state space. Truncation occurs at
state ‘5’.
Bus bunching incidence under no control and under increasingly strict control is shown
in Figure 7. As slack times approach zero from the left hand side (that is, holding times are set below the
mean running times for buses), control becomes increasingly effective at reducing bunching incidence. After
reaching the scheduled time (state ‘0’), additional slack time appears less effective at controlling bunching
rates.
[Figure 7 plot. Horizontal axis: Slack Time at Time Point (min), from -10 to 10. Vertical axis: Bus Bunching Incidence (%), from 0 to 100.]
Figure 7: Incidence rate of bus bunching under various slack time situations. The
orange line denotes bunching rates with no control (97% bunching).
5.3 Example 3: Path-based Reliability of a Multi-stage Transit Journey
Consider a traveller making their way through a transportation network such as a transit system. When
planning their journey, this traveller may be presented with a number of different routing options. These
routes may have varying number of transfers, stops, and route types (express, local). When planning a
route, especially if it is a route used frequently over a period of time, the traveller will be interested in both
the speed to their destination and the reliability of the journey. If a strict arrival time is desired, a traveller
facing an unreliable trip will have to budget for a longer travel time than may be typically necessary.
Figure 8 shows a simplified, theoretical transit network, or a set of potential travel options a user faces
when travelling from node $A$ to node $B$. For each edge, travel time and reliability are given by a normal
distribution $N(\mu, \sigma)$ with a mean of $\mu$ and a standard deviation of $\sigma$ minutes. Links $DE$ and $FG$
represent transfers, where a user may have to travel a small distance between vehicles and can encounter
reliability issues during that transfer. We make the assumption that a scheduled travel time on an edge is
based on the mean value.
If a traveller is only concerned with mean travel time, the shortest path from $A$ to $B$ is along the route
$ACDEB$. If the traveller wishes to place a greater weight on the randomness of the system,
they may choose to use a metric such as the mean travel time plus two standard deviations ($\mu + 2\sigma$), in
which case the shortest path runs $AB$.
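A small worked example shows how such a metric can change the ranking of two paths. The edge values below are hypothetical and are not read from Figure 8, and we assume that edge travel times are independent, so that means and variances add along a path.

```python
import math


def path_summary(edges, k=2.0):
    """Mean, std, and mean + k * std of a sum of independent normal edge times."""
    mean = sum(mu for mu, _ in edges)
    std = math.sqrt(sum(sigma ** 2 for _, sigma in edges))
    return mean, std, mean + k * std


# Hypothetical (mean, std) pairs in minutes for the edges of two paths.
fast_but_variable = [(10.0, 3.0), (15.0, 4.0)]
slow_but_steady = [(27.0, 1.5)]

print(path_summary(fast_but_variable))  # (25.0, 5.0, 35.0)
print(path_summary(slow_but_steady))    # (27.0, 1.5, 30.0)
```

The first path is faster on average (25 against 27 minutes), but once two standard deviations are added the second path becomes preferable (30 against 35 minutes).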
[Figure 8 diagram: nodes A, B, C, D, E, F, and G, with edge labels N(5, 2), N(30, 4), N(10, 2), N(15, 2), N(2, 2), N(10, 2), N(7, 0.5), N(5, 1.5), N(8, 3), and N(3, 3).]
Figure 8: A simple network with mean and standard deviation of travel times through
edges.
In situations of similar travel times, it may be more desirable to have certainty in one’s arrival time at the
destination than it is to travel along the marginally faster route. A traveller with an important appointment
may value certainty of arrival much higher than they do travel time. To further demonstrate flexibility in
the framework definition, instead of centering the distribution and state space around a mean value of zero
for schedule adherence, we instead consider a state space of one-minute intervals in the range from -10 to
60 minutes to represent cumulative travel times. Each edge’s transition probability was constructed using a
truncated normal distribution (6) with a mean and standard deviation as shown in the graph.
The six possible paths from $A$ to $B$ were traversed, and the final distribution over the states at $B$ was
examined. Table 1 shows the mean, standard deviation, and the 95th-percentile travel time $p_{95}$ for the paths.
In this case, $AFGB$ provides the lowest value of standard deviation, as well as the lowest
95th-percentile arrival time.
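Summary measures of this kind can be read off a final state distribution as sketched below. The function name and the five-state distribution are hypothetical, and we assume that the 95th percentile is taken as the smallest state whose cumulative probability reaches 0.95.

```python
import math


def summarize(states, probs, q=0.95):
    """Mean, standard deviation, and q-th percentile of a discrete distribution."""
    mean = sum(s * p for s, p in zip(states, probs))
    variance = sum(p * (s - mean) ** 2 for s, p in zip(states, probs))
    cumulative = 0.0
    percentile = states[-1]
    for s, p in zip(states, probs):
        cumulative += p
        if cumulative >= q:
            # Assumption: percentile is the first state reaching probability q.
            percentile = s
            break
    return mean, math.sqrt(variance), percentile


# Hypothetical final distribution over cumulative travel times (minutes).
states = [25, 26, 27, 28, 29]
probs = [0.1, 0.2, 0.4, 0.2, 0.1]
mean, std, p95 = summarize(states, probs)
```

For this distribution the mean is 27 minutes, the standard deviation is roughly 1.1 minutes, and the 95th-percentile travel time is 29 minutes.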
This demonstration network can be expanded into larger situations for study, or a full transportation
network can be loaded in and analysed. Automated fare card data can be used to populate key paths and
to calculate entire-path reliability across the network.
Table 1: Measures of travel time and reliability along all possible paths.
           Total Travel Time (min)
Path       Mean    Std. Deviation    p95
ACDEB      27.0    3.98              35
AFGB       28.5    3.50              35
AFGDEB     30.5    5.35              40
AB         30.5    4.01              38
ACFGB      31.0    3.55              38
ACFGDEB    33.0    5.39              43
6. Conclusion
Incorporating reliability into large-scale or long-term transportation network and corridor modelling can
provide planners and analysts insights into potential reliability bottlenecks or places where improvements
can have larger impacts. Building detailed simulation models that capture the stochastic nature of the
movement of passengers and vehicles on a transportation network is resource intensive and requires a
significant amount of information about the network characteristics themselves. Using Markov chains as a
stand-in stochastic process can simplify the problem while still allowing for customization and data-driven
accuracy where possible.
To enable this type of analysis across a larger set of problems, we introduced a generalized approach to
modelling transportation networks with Markov chains. A formal definition of a framework for defining a
network and state space along with methods for creating probability transition matrices were presented.
An accompanying open-source Python package was also introduced, and was used for three examples of
potential applications to transit systems. These examples demonstrate the flexibility of the framework and
the ability of the approach to rapidly produce intuitive and interesting results.
The generalized model has some limitations, however, beyond the required Markov assumption of
independent transition probability matrices at a given node or edge. Markov chains are discrete, finite
processes that may not be appropriate for all applications, especially where closed-form analysis and
steady-state solutions are desired. Scaling the analysis to large networks and larger state spaces, while
possible, requires significant computing power. This may limit the overall complexity of a network that can
be used practically by analysts to smaller networks or corridors. Finally, the use of theoretical distributions
can limit the overall realism of the system and the direct measure of randomness. Future work includes
expanding the capabilities of the MCRoute package to allow for generation of other theoretical distributions,
data smoothing, and other features as required. There is also the possibility to expand the applications as
discussed in Sections 5.1, 5.2, and 5.3.
Given the emphasis that users place on reliability, modelling randomness in transportation networks is
important. Markov chain approaches have been used successfully in the past to capture stochastic aspects
of transportation systems, and this generalized framework can provide greater access to this important
mathematical tool.
Acknowledgements
This work is supported by the Natural Sciences and Engineering Research Council of Canada (NSERC)
under grant PDF-545762-2020.
References
Barta, J., A. E. Rizzoli, M. Salani, and L. M. Gambardella (2012). Statistical Modelling of Delays in a Rail
Freight Transportation Network. In Proceedings of the 2012 Winter Simulation Conference (WSC), Volume 53,
pp. 1–12.
Besenczi, R., N. Batfai, P. Jeszensky, R. Major, F. Monori, and M. Ispany (2021). Large-scale simulation of
traffic flow using Markov model. PLOS ONE 16(2), 1–31.
Bueckert, K. (2018, December). Province says they’ve developed alternative to bypassing 30-kilometre track owned
by CN.
Daganzo, C. F. (1994). The Cell Transmission Model: A Dynamic Representation of Highway Traffic
Consistent with the Hydrodynamic Theory. Transportation Research Part B: Methodological 28B(4), 269–287.
Esmaeilnejad, S., W. Klumpenhouwer, S. C. Wirasinghe, and L. Kattan (2021). Optimal En-Route Electric
Bus Charging Station Locations and their Impacts on Reliability. In 24th International Symposium on
Transportation and Traffic Theory.
Friesz, T. L., R. Mookherjee, and T. Yao (2008). Securitizing congestion: The congestion call option.
Transportation Research Part B: Methodological 42, 407–437.
Hagberg, A. A., D. A. Schult, and P. J. Swart (2008). Exploring network structure, dynamics, and function
using NetworkX. 7th Python in Science Conference (SciPy 2008) (SciPy), 11–15.
Harris, C. R., K. J. Millman, S. J. van der Walt, R. Gommers, P. Virtanen, D. Cournapeau, E. Wieser, J. Taylor,
S. Berg, N. J. Smith, R. Kern, M. Picus, S. Hoyer, M. H. van Kerkwijk, M. Brett, A. Haldane, J. F. del Río,
M. Wiebe, P. Peterson, P. Gérard-Marchant, K. Sheppard, T. Reddy, W. Weckesser, H. Abbasi, C. Gohlke,
and T. E. Oliphant (2020). Array programming with NumPy. Nature 585(7825), 357–362.
Khadilkar, H. (2016). Modelling the impact of control strategy on stochastic delay propagation in
transportation networks. 2016 European Control Conference, ECC 2016, 2471–2476.
Klumpenhouwer, W. (2018). Optimal Time Point Locations on a Markovian Bus Route. Ph.D. thesis, University
of Calgary, Calgary, Canada.
Klumpenhouwer, W. and S. C. Wirasinghe (2018). Optimal Time Point Configuration of a Bus Route – A
Markovian Approach. Transportation Research Part B: Methodological 117(A), 209–227.
Newell, G. F. (1977). Unstable Brownian Motion of a Bus Trip. In Statistical Mechanics and Statistical Methods
in Theory and Application, pp. 645–667. New York: Plenum Press.
Vromans, M. J., R. Dekker, and L. G. Kroon (2006). Reliability and heterogeneity of railway services. European
Journal of Operational Research 172(2), 647–665.
Watling, D. P. and G. E. Cantarella (2015). Model Representation & Decision-Making in an Ever-Changing
World: The Role of Stochastic Process Models of Transportation Systems. Networks and Spatial Economics 15,
843–882.
Şahin, I. (2017). Markov chain model for delay distribution in train schedules: Assessing the effectiveness of
time allowances. Journal of Rail Transport Planning and Management 7(3), 101–113.
Preprint Version 2021-07-07 17
ResearchGate has not been able to resolve any citations for this publication.
Article
Full-text available
Modeling and simulating movement of vehicles in established transportation infrastructures, especially in large urban road networks is an important task. It helps in understanding and handling traffic problems, optimizing traffic regulations and adapting the traffic management in real time for unexpected disaster events. A mathematically rigorous stochastic model that can be used for traffic analysis was proposed earlier by other researchers which is based on an interplay between graph and Markov chain theories. This model provides a transition probability matrix which describes the traffic’s dynamic with its unique stationary distribution of the vehicles on the road network. In this paper, a new parametrization is presented for this model by introducing the concept of two-dimensional stationary distribution which can handle the traffic’s dynamic together with the vehicles’ distribution. In addition, the weighted least squares estimation method is applied for estimating this new parameter matrix using trajectory data. In a case study, we apply our method on the Taxi Trajectory Prediction dataset and road network data from the OpenStreetMap project, both available publicly. To test our approach, we have implemented the proposed model in software. We have run simulations in medium and large scales and both the model and estimation procedure, based on artificial and real datasets, have been proved satisfactory and superior to the frequency based maximum likelihood method. In a real application, we have unfolded a stationary distribution on the map graph of Porto, based on the dataset. The approach described here combines techniques which, when used together to analyze traffic on large road networks, has not previously been reported.
Article
Full-text available
Array programming provides a powerful, compact and expressive syntax for accessing, manipulating and operating on data in vectors, matrices and higher-dimensional arrays. NumPy is the primary array programming library for the Python language. It has an essential role in research analysis pipelines in fields as diverse as physics, chemistry, astronomy, geoscience, biology, psychology, materials science, engineering, finance and economics. For example, in astronomy, NumPy was an important part of the software stack used in the discovery of gravitational waves1 and in the first imaging of a black hole2. Here we review how a few fundamental array concepts lead to a simple and powerful programming paradigm for organizing, exploring and analysing scientific data. NumPy is the foundation upon which the scientific Python ecosystem is constructed. It is so pervasive that several projects, targeting audiences with specialized needs, have developed their own NumPy-like interfaces and array objects. Owing to its central position in the ecosystem, NumPy increasingly acts as an interoperability layer between such array computation libraries and, together with its application programming interface (API), provides a flexible framework to support the next decade of scientific and industrial analysis.
Thesis
Full-text available
Transit agencies struggle daily with the randomness experienced by vehicles as they traverse their routes. Since reliability is one of the most important factors in determining the quality of a transit system, it is important that agencies use effective and data-informed strategies to combat this randomness as much as possible. One commonly used strategy is holding control, where early buses are held at specifically chosen stops known as time points. This strategy presents a trade-off between improved reliability and longer overall travel time, and leads to the central research question: for a route with varied travel times between stops and fluctuating passenger demand, what is the optimal number and location of time points along a route? This thesis develops a mathematical Markov chain model of bus behaviour on an urban bus route which can capture the effects of time point placement and is sensitive to individual inter-stop travel times. After formulating a cost function to account for the various impacts of unreliability on passengers and operators, two algorithms are presented to optimally locate time points, with an improvement on existing configurations used by Calgary Transit in Calgary, Canada. Finally, a simulation model shows that holding control drastically reduces the occurrence of the phenomenon of bus bunching.
Article
Full-text available
For a scheduled bus route adopting the holding control strategy, determining the optimal number and location of time points is considered a long-standing but elusive problem. In this paper, we take a new approach to the problem by developing a Markov Chain model to accurately capture the stochastic nature of a bus as it moves along a route in mixed traffic. Transition matrices are created using theoretical distributions of travel time calibrated with stop-to-stop travel time and dwell time data. The approach captures analytically the bus behavior while still allowing the model to be informed by the unique characteristics of the route, including travel time between stops and passenger demand. This stochastic process model mimics the physical phenomenon of Brownian motion, and it is found that the compounding nature of randomness leads to greater unreliability as the route progresses. Theoretical analysis of routes allows us to demonstrate where problem points may exist on the route and can point to locations where reliability improvements may be more effective. We develop a cost function to capture the values of time of passengers including waiting time due to early and late buses, and lost time at time points. We include operating cost capturing the increased cost of travel time caused by added control, and the improved overtime costs resulting from more consistent service. Using data from automated vehicle location (AVL) and automated passenger count (APC) systems, an operational route in Calgary, Canada is optimized using the developed model and cost function. A heuristic optimization algorithm is developed to consider high-cost stops iteratively which improves the cost function compared with existing configurations and with fewer time points.
Article
Full-text available
We review and advance the state-of-the-art in the modelling of transportation systems as a stochastic process. The conceptual and theoretical basis of the approach is explained in detail. A variety of examples are given to motivate its use in the field. While the examples cover a wide range of modelling philosophies, in order to provide focus they are restricted to modelling a special class of problems involving driver route choice in networks. Our overall objective is to establish the applicability of this approach as a ‘unifying framework’ for modelling approaches involving dynamic and stochastic elements, developing further the ideas put forward in Cantarella & Cascetta (Transportation Science 29, 305–329, 1995). Directions for further development and research are identified.
Conference Paper
Full-text available
NetworkX is a Python language package for exploration and analysis of networks and network algorithms. The core package provides data structures for representing many types of networks, or graphs, including simple graphs, directed graphs, and graphs with parallel edges and self loops. The nodes in NetworkX graphs can be any (hashable) Python object and edges can contain arbitrary data; this flexibility mades NetworkX ideal for representing networks found in many different scientific fields. In addition to the basic data structures many graph algorithms are implemented for calculating network properties and structure measures: shortest paths, betweenness centrality, clustering, and degree distribution and many more. NetworkX can read and write various graph formats for eash exchange with existing data, and provides generators for many classic graphs and popular graph models, such as the Erdoes-Renyi, Small World, and Barabasi-Albert models, are included. The ease-of-use and flexibility of the Python programming language together with connection to the SciPy tools make NetworkX a powerful tool for scientific computations. We discuss some of our recent work studying synchronization of coupled oscillators to demonstrate how NetworkX enables research in the field of computational networks.
Conference Paper
Full-text available
This study analyzes the transportation network of a major rail freight operator in order to obtain a model of delay propagation of trains connecting intermodal terminals. Operational management of a rail freight operator needs to take into account deviations due to unexpected events such as unplanned maintenance, strikes, railroad works, traffic congestion. The dispatcher makes train assignment decisions based on a number of performance indicators and also on the expectancy that a given train, currently delayed, could recover or limit the amount of delay in the future. We have developed a Markov-chain based model in order to evaluate the evolution of train delays as a train visits successive terminals. Our model is based on the examination of a large set of historical data and we show how we can classify different terminals according to their ability either to absorb or to amplify delays.
Article
Train movements are subject to disturbances and disruptions, which may cause late departures and/or late arrivals at particular locations (e.g., terminals, stations, crossings) with respect to their pre-determined times. To prevent possible deviations from the scheduled times, time allowances are added to the process times (i.e., running times) as time supplements and to the time interval between successive trains (i.e., minimum headway) as buffer times. The amount of time allowances is decided during the timetable planning process and their contribution to the service quality is monitored in actual operation. Adequate time allowances result in punctual train operations, which is one of the primary interests of service users and providers. Because the sequence of train departure and arrival times can be viewed as a stochastic process, the performance of train operations may be analyzed and modeled using Markov chains. We modeled train departure and arrival delays at stations as states and analyzed the successive state changes along train paths in a single-track railway. This allowed us to predict train states at certain event time steps and to estimate steady-state delay probabilities. The former may be used to reschedule train movements and the latter to measure timetable robustness.
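As a rough illustration of the two quantities this abstract mentions (state predictions after a number of event steps, and steady-state delay probabilities), the following is a minimal sketch in plain Python. The three delay states and the transition probabilities in `P` are hypothetical values chosen only for illustration, and the function names `step`, `predict`, and `steady_state` are assumptions; none of this is taken from the cited work.

```python
# Hypothetical delay states and transition matrix (illustrative values only).
# P[i][j] is the assumed probability of moving from state i to state j
# between two successive events (e.g., a departure and the next arrival).
STATES = ["on time", "slightly late", "very late"]
P = [
    [0.80, 0.15, 0.05],
    [0.50, 0.30, 0.20],
    [0.20, 0.30, 0.50],
]


def step(dist, P):
    """Advance a probability distribution over delay states by one event step."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def predict(dist, P, steps):
    """Distribution over delay states after a given number of event steps."""
    for _ in range(steps):
        dist = step(dist, P)
    return dist


def steady_state(P, tol=1e-12, max_iter=10_000):
    """Approximate the stationary distribution by repeated multiplication.

    Assumes the chain is irreducible and aperiodic, which holds for the
    illustrative matrix above because every entry is positive.
    """
    n = len(P)
    dist = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = step(dist, P)
        if max(abs(a - b) for a, b in zip(dist, nxt)) < tol:
            return nxt
        dist = nxt
    return dist


if __name__ == "__main__":
    start = [1.0, 0.0, 0.0]  # the train is currently on time
    print(dict(zip(STATES, predict(start, P, 3))))
    print(dict(zip(STATES, steady_state(P))))
```

In this sketch the multi-step prediction corresponds to the rescheduling use case, while the stationary distribution gives a single long-run summary that could serve as a robustness indicator.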
Chapter
As a bus travels along a route, its trip time between successive bus stops is subject to random fluctuations. If the bus should fall behind schedule because of this, then some excess passengers will have arrived at the bus stops during the delay and it will take the bus longer to load passengers. A bus, therefore, tends to travel more slowly and falls even further behind schedule. To compensate for this, the usual control strategy used by bus operators is to provide some slack time in the schedule so that, normally, a bus can gain some time. The bus then operates under a rule that it may not leave a bus stop ahead of schedule, but will leave immediately if it is late. Actually, this control is not completely stable; if a bus falls so far behind schedule that the extra loading time generated by the lateness exceeds the slack time, the bus will still fall progressively further behind schedule.
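The instability described in this abstract can be sketched with a simplified deterministic recursion. It is an assumption made for illustration and not the cited author's model: random fluctuations are left out, and the extra loading time at each stop is taken to be proportional to the current lateness through a hypothetical parameter `load_factor`. Under that assumption, lateness shrinks when `load_factor * lateness` is below the slack per stop and grows when it is above.

```python
def lateness_trajectory(initial_lateness, slack, load_factor, n_stops):
    """Lateness of a bus at successive stops under a no-early-departure rule.

    initial_lateness: lateness at the first stop (minutes)
    slack:            schedule slack available per stop (minutes)
    load_factor:      assumed extra loading time per minute of lateness
    n_stops:          number of subsequent stops to simulate
    """
    lateness = [float(initial_lateness)]
    for _ in range(n_stops):
        current = lateness[-1]
        extra_loading = load_factor * current  # more waiting passengers to board
        # The bus recovers `slack` minutes but can never run ahead of schedule.
        lateness.append(max(0.0, current + extra_loading - slack))
    return lateness


if __name__ == "__main__":
    # With slack = 1.0 and load_factor = 0.1 the tipping point is 10 minutes.
    recovering = lateness_trajectory(5.0, slack=1.0, load_factor=0.1, n_stops=20)
    runaway = lateness_trajectory(15.0, slack=1.0, load_factor=0.1, n_stops=20)
    print("starts 5 min late: ", [round(x, 2) for x in recovering[:5]])
    print("starts 15 min late:", [round(x, 2) for x in runaway[:5]])
```

The tipping point in this sketch is `slack / load_factor`: a bus that starts below it returns to schedule, while one that starts above it falls progressively further behind.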