
Global synchromodal shipment matching problem with dynamic and stochastic travel times: A reinforcement learning approach


Abstract

Global synchromodal transportation involves the movement of container shipments between inland terminals located in different continents using ships, barges, trains, trucks, or any combination of them through integrated planning at a network level. One of the challenges faced by global operators is the matching of accepted shipments with services in an integrated global synchromodal transport network with dynamic and stochastic travel times. The travel times of services are unknown and revealed dynamically during the execution of transport plans, but stochastic information about travel times is assumed to be available. Matching decisions can be updated before shipments arrive at their destination terminals. The objective of the problem is to maximize total profit, expressed as a combination of revenues, travel costs, transfer costs, storage costs, delay costs, and carbon tax over a given planning horizon. We propose a sequential decision process model to describe the problem. To address the curse of dimensionality, we develop a reinforcement learning approach that learns the value of matching a shipment with a service through simulations. Specifically, we adopt the Q-learning algorithm to update value function estimates and use the epsilon-greedy strategy to balance exploitation and exploration. Online decisions are made based on the estimated value functions. The performance of the reinforcement learning approach is evaluated against a myopic approach that does not consider uncertainties and a stochastic approach that sets chance constraints on feasible transshipment under a rolling horizon framework.
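The abstract names Q-learning with an epsilon-greedy strategy but gives no details of the state or action encoding. The following is only a minimal tabular sketch of that general technique, not the authors' model: the single-terminal state, the two services, and the simulated profit distributions are all hypothetical values chosen for illustration.

```python
import random
from collections import defaultdict

def epsilon_greedy(q, state, actions, epsilon, rng):
    """Explore with probability epsilon, otherwise pick the best-known action."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])

def q_update(q, state, action, reward, next_state, next_actions, alpha, gamma):
    """One Q-learning step: move Q(s, a) toward reward + gamma * max Q(s', a')."""
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])

# Hypothetical toy instance: a shipment waiting at terminal "A" picks one service.
rng = random.Random(0)
q = defaultdict(float)
services = ["barge", "train"]
# Assumed (mean, std) of the simulated profit per service; not taken from the paper.
profit = {"barge": (8.0, 2.0), "train": (5.0, 0.5)}

for _ in range(2000):
    a = epsilon_greedy(q, "A", services, epsilon=0.1, rng=rng)
    mean, std = profit[a]
    r = rng.gauss(mean, std)  # simulated profit of this match
    q_update(q, "A", a, r, "done", [], alpha=0.05, gamma=1.0)  # "done" is terminal

best = max(services, key=lambda a: q[("A", a)])
```

After training, `best` holds the service with the highest learned value, which is how an online decision could be read off the estimated value function in this toy setting.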
Annals of Operations Research
https://doi.org/10.1007/s10479-021-04489-z
ORIGINAL RESEARCH
W. Guo¹ · B. Atasoy² · R. R. Negenborn²
Accepted: 7 December 2021
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2021
Keywords: Global synchromodal shipment matching · Dynamic and stochastic travel times · Sequential decision process · Reinforcement learning · Q-learning
W. Guo
guo.wenjing@courrier.uqam.ca
¹ CIRRELT and Department of Analytics, Operations and Information Technologies, School of Management Sciences, University of Quebec at Montreal, Montreal, Canada
² Department of Maritime and Transport Technology, Delft University of Technology, Delft, The Netherlands
... In contrast, value-based RL learns values for each state or state-action pair, which can lead to insufficient convergence when dealing with vast state-action spaces. Although value-based algorithms traditionally found their application in control problems, such as inventory and supply chain management (see [67]), transport researchers have also applied Q-Learning in train scheduling and dispatching, as demonstrated by [68][69][70]. We note that the majority of the studies in the current State of the Art use hybrid approaches called Actor-Critic (e.g., Proximal Policy Optimisation (PPO), REINFORCE) where policy update (actor) and value-based learning (critic) are performed in a feedback loop. ...
... We explicitly exclude studies having main focus on topics such as signal controlling, supply chain, robotics, navigation, public transport, maritime logistics, and Multi-Agent bidding. This literature review also does not cover the classical RL methods such as simple Q-Learning from the control theory (see, e.g., [70,89]) as our research aims to explore more advanced NCO methods, which usually rely on Transformer architecture. Below, the surveyed studies are organised according to the problem classes outlined in Section 2.2. ...
... The State of the Art contains limited insights regarding the established ways to rebalance rewards from multiple agents in order to achieve reasonable interaction in a shared environment. Another interesting practical use-case is optimising transport plans for a service network modelled as a sequential decision process on graphs, where each node represents either a train station, trans-shipment terminal, or port [70]. A synchromodal matching platform for intermodal networks demonstrates the most practical use-case for NCO today, allowing freight forwarders or fourth party logistics (4PL) providers to create sustainable and reliable transport services. ...
Article
Intermodal freight transport (IFT) requires a large number of optimisation measures to ensure its attractiveness. This involves numerous control decisions on different time scales, making integrated optimisation with traditional methods almost unfeasible. Recently, a new trend in optimisation science has emerged: the application of Deep Learning (DL) to combinatorial problems. Neural combinatorial optimisation (NCO) enables real-time decision-making under uncertainties by considering rich context information—a crucial factor for seamless synchronisation, optimisation, and, consequently, for the competitiveness of IFT. The objective of this study is twofold. First, we systematically analyse and identify the key actors, operations, and optimisation problems in IFT and categorise them into six major classes. Second, we collect and structure the key methodological components of the NCO framework, including DL models, training algorithms, design strategies, and review the current State of the Art with a focus on NCO and hybrid DL models. Through this synthesis, we integrate the latest research efforts from three closely related fields: optimisation, transport planning, and NCO. Finally, we critically discuss and outline methodological design patterns and derive potential opportunities and obstacles for learning-based frameworks for integrated optimisation problems. Together, these efforts aim to enable a better integration of advanced DL techniques into transport logistics. We hope that this will help researchers and practitioners in related fields to expand their intuition and foster the development of intelligent decision-making systems and algorithms for tomorrow’s transport systems.
... Furthermore, travel time uncertainty is quite common in global transportation resulting from weather conditions and traffic congestion (Demir et al., 2016). As discussed in Guo et al. (2022a), ignoring travel time uncertainty in global synchromodal transport planning might result in suboptimal or even infeasible solutions. This paper designs a buffer strategy to guarantee a certain level of feasible transshipments at transshipment terminals. ...
... Guo et al. (2021) develop a chance-constrained programming (CCP) model to ensure a certain level of feasible transshipments at terminals, and shipments are rerouted under a rolling horizon framework when disturbances happen. Guo et al. (2022a) develop a reinforcement learning approach to learn the value of matching a shipment with a service through simulations. Shipment routes are updated instantly when infeasible transshipments happen based on evaluated value functions. ...
Article
Global synchromodal transportation is a promising strategy for providing efficient, reliable, flexible, and sustainable container shipping services across continents. It involves integrating multiple modes and routes owned by various operators to create a comprehensive transport plan. However, these operators often have their own local networks and are hesitant to cede control to a centralized platform. Instead, they prefer to share limited information in a coordinated manner to achieve a common goal without sacrificing their own benefits. This paper proposes a coordinated mechanism for global synchromodal transport planning, in which a global operator proposes incentives to local operators to select the most efficient modes and routes for shipping containers from one continent to another. An augmented Lagrangian relaxation approach is developed for the global operator to generate incentives, and a heuristic algorithm is designed to address the computational complexity of the optimization problems faced by local operators. We incorporate the proposed approaches with a rolling horizon framework to handle dynamic shipment requests received from spot markets and with a buffer strategy to address travel time uncertainties. The coordinated mechanism is tested on a real network between Asia and Europe, and results show that it can significantly increase total profits, reduce request rejections, and reduce infeasible transshipments compared to decentralized global transportation plans currently in use, particularly under scenarios with higher degrees of dynamism and uncertainty.
... Thus, optimality can be guaranteed for certain RL methods under certain conditions and assumptions. Application of RL to combinatorial optimization problems includes maintenance planning (Kosanoglu et al., 2022), strategic bidding (van Heeswijk, 2022), production scheduling (Arviv et al., 2016; Waschneck et al., 2018; Zhang et al., 2012), shipment matching (Guo et al., 2022), Traveling Salesman Problem (TSP) (Alipour et al., 2018; Hu et al., 2020; Liu & Zeng, 2009; Miki et al., 2018), Vehicle Routing Problem (VRP) (Hansuwa et al., 2022; Mao & Shen, 2018; Nazari et al., 2018; Śniezyński et al., 2010; Yu et al., 2019), and the Bin Packing Problem (BPP) (Laterre et al., 2018), including the 3D Bin Packing Problem variation (Jiang et al., 2021; Puche & Lee, 2022; Verma et al., 2020; Zhao et al., 2020, 2021). In recent years, deep Q-learning has enjoyed a great deal of success; such methods have also been adapted to combinatorial problems (Nazari et al., 2018; Zhang et al., 2020). ...
Article
In many large manufacturing companies, freight management is handled by a third-party logistics (3PL) provider, thus allowing manufacturers and their suppliers to focus on the production of goods rather than managing their delivery. Given their pivotal supply chain role, in this work we propose a general framework for what we term “the 3PL freight management problem” (3PLFMP). Our framework identifies three primary activities involved in 3PL freight management: the assignment of orders to a fleet of vehicles, efficient routing of the fleet, and packing the assigned orders in vehicles. Furthermore, we provide a specific instantiation of the 3PLFMP that considers direct vs. consolidated shipping strategies, one-dimensional packing constraints, and a fixed vehicle routing schedule. We solve this instantiated problem using several Reinforcement Learning (RL) methods, including Q-learning, Double Q-learning, SARSA, Deep Q-learning, and Double Deep Q-learning, comparing against two benchmark methods, a simulated annealing heuristic and a variable neighborhood descent algorithm. We evaluate the performance of these methods on two datasets. One is fully simulated and based on past work, while the other is semi-simulated using real-world automobile manufacturers and part supplier locations, and is of our own design. We find that RL methods vastly outperform the benchmark heuristic methods on both datasets, thus establishing the superiority of RL methods in solving this highly complicated and stochastic problem.
Article
This paper offers an empirical study to explore the relationship between transportation modalities and environmental concerns, promoting the adoption of synchromodality as a strategic pathway to achieving sustainable freight transport. The study uses a synchromodal freight transportation platform to analyze the impact of carbon tax policy on modal shift and environmental sustainability. The synchromodal platform is based on an optimization model using Mixed Integer Linear Programming (MILP), incorporating carbon tax as a surrogate measure for environmental costs. A sensitivity analysis is conducted across four distinct scenarios in a case study in the Great Lakes region, focusing on the Canada-US transborder trade. The results of this study illustrate the considerable potential for increasing the utilization of more environmentally sustainable transportation modes in this region. While the addition of carbon tax entails increased total transportation costs for each unit of cargo, the synchromodal-enabled modal shift promises to mitigate transportation’s negative externalities, including congestion, environmental impacts, and noise pollution. The results also highlight the role of synchromodality as a catalyst for sustainable freight transport decisions in the context of a carbon-conscious world.
Article
One of the stepping stones towards the physical internet is synchromodality, which enables a seamless, flexible, and resilient interconnected network for door-to-door service. As its name suggests, synchronization is the core of synchromodality, but its concept is not yet clearly understood. This paper presents a classification of the literature on synchromodality based on proposed types of synchronization and discusses synchronization methods. Within the complexity of synchromodality, challenges in synchronization are recognized, resulting in several emerging issues. We recommend future research directions to address under-studied issues, highlighting the importance of multi-agent systems with learning for real-time decision making. Additionally, an integrated framework is developed to provide a comprehensive understanding of how synchronization works among various elements in a synchromodal transportation system.
Article
With growing environmental concerns and the exploitation of ubiquitous big data, smart transportation is transforming logistics business and operations into a more sustainable approach. To answer questions in intelligent transportation planning, such as which data are feasible, which methods are applicable for intelligent prediction of such data, and what operations are available for prediction, this paper offers a new deep learning approach called bi-directional isometric-gated recurrent unit (BDIGRU). It is integrated into a neural-network deep learning framework for predictive analysis of travel time and business adoption for route planning. The proposed method directly learns high-level features from big traffic data and reconstructs them by its own attention mechanism drawn by temporal orders to complete the learning process recursively in an end-to-end manner. After deriving the computational algorithm with stochastic gradient descent, we use the proposed method to perform predictive analysis of stochastic travel time under various traffic conditions (especially under congestion) and then determine the optimal vehicle route with the shortest travel time under future uncertainty. Based on empirical results with big traffic data, we show that the proposed BDIGRU method can (1) significantly improve the predictive accuracy of one-step, 30-min-ahead travel time compared to several conventional (data-driven, model-driven, hybrid, and heuristic) methods measured with several performance criteria, and (2) efficiently determine the optimal vehicle route in relation to the predictive variability under uncertainty.
Article
This paper investigates a dynamic and stochastic shipment matching problem faced by network operators in hinterland synchromodal transportation. We consider a platform that receives contractual and spot shipment requests from shippers, and receives multimodal services from carriers. The platform aims to provide optimal matches between shipment requests and multimodal services within a finite horizon under spot request uncertainty. Due to the capacity limitation of multimodal services, the matching decisions made for current requests will affect the ability to make good matches for future requests. To solve the problem, this paper proposes an anticipatory approach which consists of a rolling horizon framework that handles dynamic events, a sample average approximation method that addresses uncertainties, and a progressive hedging algorithm that generates solutions at each decision epoch. Compared with the greedy approach which is commonly used in practice, the anticipatory approach has total cost savings up to 8.18% under realistic instances. The experimental results highlight the benefits of incorporating stochastic information in dynamic decision making processes of the synchromodal matching system.
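The sample average approximation idea mentioned above can be illustrated in a few lines: replace the expectation over uncertain spot requests with an average over sampled scenarios and pick the decision that performs best on that average. The sketch below is a generic illustration only; the slot-reservation decision, the cost function, and the demand range are hypothetical and are not taken from the paper.

```python
import random

def saa_best_decision(decisions, scenarios, cost):
    """Pick the decision with the lowest average cost over the sampled scenarios."""
    def average_cost(x):
        return sum(cost(x, s) for s in scenarios) / len(scenarios)
    return min(decisions, key=average_cost)

# Hypothetical cost: reserve x barge slots now at 2 per slot; any spot demand
# beyond the reserved slots is moved by truck at 5 per container.
def cost(x, demand):
    return 2.0 * x + 5.0 * max(0, demand - x)

rng = random.Random(42)
scenarios = [rng.randint(0, 10) for _ in range(500)]  # sampled spot demand
x_star = saa_best_decision(range(11), scenarios, cost)
```

In a rolling horizon setting, a decision such as `x_star` would be recomputed at each decision epoch with freshly sampled scenarios.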
Article
Synchromodal transportation planning is defined by the possibility to re-route shipments to alternative transportation modes at intermediate terminals based on real-time information about the shipment in transit. We present a synchromodal decision support model to determine the optimal modal choice for a single shipment in a multimodal network that is characterized by stochastic travel times. The model is formulated as a Markov decision process and allows adaptations to the modal choice based on real-time information on the travel time. Our formulation trades off transportation and late delivery penalty costs, and captures the value of synchromodal planning. We demonstrate the use of our model in a numerical case study, where we evaluate synchromodal against static intermodal transportation planning. The latter does not allow real-time adjustments to the modal choice. Compared to intermodality, synchromodal planning has most value when the penalty for late delivery is high and transportation services are more frequent.
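To make the Markov decision process formulation concrete, here is a small backward-recursion sketch for a single shipment. It is a simplification under stated assumptions, not the paper's model: the three-terminal network, the costs, the discrete travel time distributions, the due time, and the assumption that every service departs immediately are all hypothetical.

```python
from functools import lru_cache

# Hypothetical network: terminal -> [(service, next terminal, cost, {travel time: prob})]
SERVICES = {
    "origin": [("truck", "dest", 10.0, {2: 1.0}),
               ("train", "hub", 3.0, {1: 0.7, 3: 0.3})],
    "hub": [("barge", "dest", 2.0, {2: 0.5, 4: 0.5}),
            ("truck", "dest", 6.0, {1: 1.0})],
}
DUE = 4              # assumed due time at the destination
LATE_PENALTY = 5.0   # assumed penalty per time unit of delay

@lru_cache(maxsize=None)
def value(terminal, t):
    """Return (minimum expected cost-to-go, best service) from state (terminal, t)."""
    if terminal == "dest":
        return LATE_PENALTY * max(0, t - DUE), None
    best = None
    for name, nxt, cost, dist in SERVICES[terminal]:
        # Transport cost now plus the expected cost-to-go over realized travel times.
        expected = cost + sum(p * value(nxt, t + d)[0] for d, p in dist.items())
        if best is None or expected < best[0]:
            best = (expected, name)
    return best
```

In this toy instance the policy adapts to the realized travel time: `value("hub", 1)` selects the cheap barge, while `value("hub", 3)` switches to the faster truck to avoid the late penalty.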
Conference Paper
Global intermodal transportation involves the movement of shipments between inland terminals located in different continents by using ships, barges, trains, trucks, or any combination of them through integrated planning at a network level. One of the challenges faced by global operators is the matching of shipment requests with transport services in an integrated global network. The characteristics of the global intermodal shipment matching problem include acceptance and matching decisions, soft time windows, capacitated services, and transshipments between multimodal services. The objective of the problem is to maximize the total profits, which consist of revenues, travel costs, transfer costs, storage costs, delay costs, and carbon tax. Travel time uncertainty has significant effects on the feasibility and profitability of matching plans. However, travel time uncertainty has not yet been considered in global intermodal transport, leading to significant delays and infeasible transshipments. To fill this gap, this paper proposes a chance-constrained programming model in which travel times are assumed stochastic. We conduct numerical experiments to validate the performance of the stochastic model in comparison to a deterministic model and a robust model. The experimental results show that the stochastic model outperforms the benchmarks in total profits.
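A chance constraint on a transshipment can be read as: the inbound service, plus transfer time, must make the outbound departure with at least a given probability. The sketch below checks one such constraint. It assumes a normally distributed travel time purely for illustration; the abstract does not state which distribution the paper uses, and the function name and all parameter values are hypothetical.

```python
from statistics import NormalDist

def transshipment_is_reliable(depart_in, mean_travel, std_travel,
                              transfer_time, depart_out, alpha):
    """Check P(arrival + transfer_time <= depart_out) >= alpha.

    Assumes the inbound travel time is normal(mean_travel, std_travel),
    which is an illustrative assumption, not the paper's stated model.
    """
    arrival = NormalDist(mu=depart_in + mean_travel, sigma=std_travel)
    prob_on_time = arrival.cdf(depart_out - transfer_time)
    return prob_on_time >= alpha, prob_on_time

# Inbound leaves at t=0, takes 10 +/- 2 time units, needs 1 unit to transfer.
ok_late_connection, p1 = transshipment_is_reliable(0, 10, 2, 1, 15, alpha=0.95)
ok_tight_connection, p2 = transshipment_is_reliable(0, 10, 2, 1, 13, alpha=0.95)
```

With these numbers the connection departing at 15 leaves two standard deviations of slack and satisfies the 95% requirement, while the connection departing at 13 leaves only one and does not.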
Article
Hinterland intermodal transportation is the movement of containers between deep-sea ports and inland terminals by using trucks, trains, barges, or any combination of them. Synchromodal transportation, as an extension of intermodal transportation, refers to transport systems with dynamic updating of plans by incorporating real-time information. The trend towards spot markets and digitalization in hinterland intermodal transportation gives rise to online synchromodal transportation problems. This paper investigates a dynamic shipment matching problem in which a centralized platform provides online matches between shipment requests and transport services. We propose a rolling horizon approach to handle newly arrived shipment requests and develop a heuristic algorithm to generate timely solutions at each decision epoch. The experiment results demonstrate the solution accuracy and computational efficiency of the heuristic algorithm in comparison to an exact algorithm. The proposed rolling horizon approach outperforms a greedy approach from practice in total costs under various scenarios of the system.
Article
Travel time distribution (TTD) has been widely used to represent traffic conditions on freeways and to help analyze travel time reliability (TTR). The goal of this study is to develop a systematic approach to analyzing TTD on different types of roadway segments along a corridor. By examining the historical TTR pattern using the planning time index (PTI), four typical segments are first identified and selected. The distributions of travel time are then analyzed by time of day, day of week, segment location, and weather. Goodness-of-fit tests of different distributions are then conducted, and the results indicate that the Burr distribution provides the highest acceptance rate across different times of day (TOD) and days of week (DOW). The results also indicate that the Burr distribution provides the highest acceptance rate across different weather conditions. This study provides insight into TTD characteristics under different scenarios, and the results can help transportation planners make informed decisions. Keywords: Travel time distribution, Travel time reliability, Probe vehicle data
Article
This paper investigates a dynamic and stochastic shipment matching problem, in which a platform aims to provide online decisions on accepting or rejecting newly received shipment requests and decisions on shipment-to-service matches in global synchromodal transportation. The problem is considered dynamic since the platform receives requests and travel times continuously in real time. The problem is considered stochastic since the information of requests and travel times is not known with certainty. To solve the problem, we develop a rolling horizon framework to handle dynamic events, a hybrid stochastic approach to address uncertainties, and a preprocessing-based heuristic algorithm to generate timely solutions at each decision epoch. The experimental results indicate that for instances with above 50% degrees of dynamism, the hybrid stochastic approach that considers shipment request and travel time uncertainties simultaneously outperforms the approaches that do not consider any uncertainty or just consider one type of uncertainties in terms of total profits, the number of infeasible transshipments, and delay in deliveries.
Article
The purpose of revenue management (RM) is to maximize revenue growth for a company by optimizing product/service availability and prices based on micro-level forecasting of customer behavior. Seat/cargo capacity control and air ticket/cargo pricing are two primary RM research topics that have yielded fruitful models and solution methods for air transportation, which have been used by airlines for around 50 years. However, the RM studies for container liner shipping services and their application are scant although the operations of airlines and container shipping lines are quite similar. We therefore introduce the fundamental RM models developed for air transportation, namely, capacity control and pricing models. Based on these models, we proceed to critically review the RM studies for container liner shipping services. Finally, we identify valuable future research directions in container shipping RM.
Article
Uncertainty is inherent in many planning situations. One example is in maritime transportation, where weather conditions and port occupancy are typically characterized by high levels of uncertainty. This paper considers a maritime inventory routing problem where travel times are uncertain. Taking into account possible delays in the travel times is of main importance to avoid inventory surplus or shortages at the storages located at ports. Several techniques to deal with uncertainty, namely deterministic models with inventory buffers; robust optimization; stochastic programming and models incorporating conditional value-at-risk measures, are considered. The different techniques are tested for their ability to deal with uncertain travel times for a single product maritime inventory routing problem with constant production and consumption rates, a fleet of heterogeneous vessels and multiple ports. At the ports, the product is either produced or consumed and stored in storages with limited capacity. We assume two stages of decisions, where the routing, the visit order of the ports and the quantities to load/unload are first-stage decisions (fixed before the uncertainty is revealed), while the visit time and the inventory levels at ports are second-stage decisions (adjusted to the observed travel times). Several solution approaches resulting from the proposed techniques are considered. A computational comparison of the resulting solution approaches is performed to compare the routing costs, the amount of inventory bounds deviation, the total quantities loaded and unloaded, and the running times. This computational experiment is reported for a set of maritime instances having up to six ports and five ships.
Article
Most previous work on the adaptive routing problem in stochastic and time-dependent (STD) networks has focused on developing parametric models to reflect the network dynamics and designing efficient algorithms to solve these models. However, these models require strong assumptions, and some algorithms also suffer from the curse of dimensionality. In this paper, we examine the application of Reinforcement Learning as a non-parametric, model-free method to solve the problem. Both the online Q-learning method for discrete state spaces and the offline fitted Q iteration algorithm for continuous state spaces are discussed. With a small case study on a mid-sized network, we demonstrate the significant advantages of using Reinforcement Learning to solve for the optimal routing policy over the traditional stochastic dynamic programming method. The fitted Q iteration algorithm combined with tree-based function approximation is shown to outperform other methods, especially during peak demand periods.