Conference PaperPDF Available

Game Theory-Based Simultaneous Prediction and Planning for Autonomous Vehicle Navigation in Crowded Environments

Authors:

Abstract

Navigating crowded environments with substantial pedestrian interactions poses distinctive challenges for autonomous vehicles (AVs), primarily due to the interdependence of prediction and planning modules and the necessity for vehicles and pedestrians to cooperate to achieve mutual benefits. In this work, we introduce a novel game theory-based autonomous vehicle stack designed to address these challenges while ensuring safe and efficient navigation in such complex settings. Our primary contributions encompass the development of an efficient Nash equilibrium approximation, facilitating simultaneous prediction and planning outputs, and effectively resolving the "chicken and egg" problem inherent in traditional autonomous software stacks. Moreover, we formulate a payoff function for the game that encapsulates the interactions among agents in the environment and demonstrate that our proposed stack improves classic probabilistic predictor. Through evaluations conducted using real-world datasets, we show that the proposed stack not only enhances prediction accuracy but also improves the planning success rate while mitigating integration errors. Further results can be accessed at the following link: https://youtu.be/ctGg4-ifCYQ.
Game Theory-Based Simultaneous Prediction and Planning for
Autonomous Vehicle Navigation in Crowded Environments
Kunming Li1, Yijun Chen1, Mao Shan1, Jiachen Li2, Stewart Worrall1and Eduardo Nebot1
Abstract Navigating crowded environments with substan-
tial pedestrian interactions poses distinctive challenges for
autonomous vehicles (AVs), primarily due to the interdepen-
dence of prediction and planning modules and the necessity
for vehicles and pedestrians to cooperate to achieve mutual
benefits. In this work, we introduce a novel game theory-based
autonomous vehicle stack designed to address these challenges
while ensuring safe and efficient navigation in such complex
settings. Our primary contributions encompass the development
of an efficient Nash equilibrium approximation, facilitating
simultaneous prediction and planning outputs, and effectively
resolving the “chicken and egg” problem inherent in traditional
autonomous software stacks. Moreover, we formulate a payoff
function for the game that encapsulates the interactions among
agents in the environment and demonstrate that our proposed
stack improves classic probabilistic predictor. Through evalu-
ations conducted using real-world datasets, we show that the
proposed stack not only enhances prediction accuracy but also
improves the planning success rate while mitigating integration
errors. Further results can be accessed at the following link:
https://youtu.be/ctGg4-ifCYQ.
I. INTRODUCTION
Autonomous vehicles (AVs) are commonly designed with
modular architectures that consist of various components
responsible for tasks such as perception, prediction, and
planning. These modular structures are typically favored
due to their verifiability, interpretability, and generalization
capabilities [1] [2].
However, standard modular architectures also present chal-
lenges when integrating individual components into a co-
hesive system. Issues such as compounding errors, infor-
mation bottlenecks, and integration difficulties may emerge
[1]. Compounding errors involve the accumulation of in-
accuracy throughout multiple stages of a modular system,
potentially leading to significant discrepancies in the final
output. Information bottlenecks occur when the data flow
between modules is constrained, limiting the system’s ability
to fully utilize available information and possibly resulting in
suboptimal decision-making. Integration difficulties relate to
the challenges of combining various modules into a seamless
system, often requiring careful coordination and optimization
of interfaces between modules to ensure effective commu-
nication and overall system performance. Addressing these
challenges is crucial when developing a novel autonomous
vehicle stack for safe and efficient navigation in crowded
environments, including interactions with pedestrians.
1Australian Centre for Robotics, The University of Sydney, Sydney, NSW
2006, Australia. E-mails: {k.li, y.chen, m.shan, s.worrall,
e.nebot}@acfr.usyd.edu.au.
2Stanford University, USA. E-mail: jiachen li@stanford.edu.
Fig. 1. A sample result from a real-world dataset [3]. The green
lines represent the future trajectories of pedestrians, while the red line
demonstrates the path planned by the proposed stack.
Moreover, the relationship between prediction and plan-
ning in autonomous vehicles resembles a “chicken and egg”
problem due to their interdependence and cyclical nature [4]
[5]. Accurate predictions are vital for generating safe driving
trajectories, while planning module actions can influence pre-
dictions by altering the vehicle’s behavior and affecting sur-
rounding objects, such as pedestrians. This interdependence
complicates determining which module should be prioritized
and presents challenges in optimizing and coordinating the
two components effectively. Integrated approaches that si-
multaneously address prediction and planning often provide
solutions to this problem, allowing better communication and
coordination between modules, leading to improved safety
and efficiency in autonomous vehicle operation, especially
in crowded environments.
Autonomous navigation in crowded environments can be
viewed as a non-zero-sum game, where all agents, including
vehicles and pedestrians, cooperate for mutual benefit, focus-
ing on primary driving factors such as collision avoidance,
path following, and comfort [6] [7] [8]. However, solving this
game presents several challenges including but not limited to:
1) the reward functions of the agents are generally unknown,
making it difficult to assess the actions’ effectiveness; 2)
finding a Nash equilibrium is computationally demanding
due to the extensive search space and complex interactions
between agents; and 3) incomplete or imperfect information
on agents’ strategies and payoffs adds uncertainty to the
decision-making process.
In this work, we present a novel stack for autonomous
navigation in crowded environments. Our major contribu-
tions include: 1) the development of a game theory-based
autonomous vehicle stack with an efficient Nash equilib-
rium approximation, which enables simultaneous prediction
and planning outputs, effectively addressing the inherent
“chicken and egg” problem in traditional stacks; 2) the defini-
tion of a payoff function for the game, which captures the in-
teractions between the agents in the environment; and 3) the
demonstration that our proposed stack improves probabilistic
prediction. Through evaluations using real-world datasets, we
show that the proposed stack not only improves prediction
accuracy but also increases the planning success rate, while
reducing integration errors. This innovative approach offers a
more efficient and safer solution for autonomous navigation
in highly dynamic environments with a significant level of
vehicle-pedestrian interaction.
II. REL AT ED WORK
A. Probabilistic Models for Pedestrian Motion Prediction
Probabilistic models have been widely employed to quan-
tify prediction uncertainty in the context of pedestrian motion
prediction. Some approaches generate a distribution for each
pedestrian directly [9]–[11], while others sample from a
learned distribution [12], [13]. Previous studies [9], [14],
[15] have demonstrated promising performance in predicting
pedestrian future trajectories by modeling social interactions
across RNN hidden states using pooling layers. Recent
research [16], [17] has shown that Graph Attention Networks
(GATs) can effectively incorporate social interactions as
attention in pooling layers, resulting in superior performance.
Studies by [17], [18] have highlighted the effectiveness of
CNN-based approaches for pedestrian trajectory prediction,
which generally require fewer computational resources than
Recurrent Neural Networks for sequence prediction. Li et
al. [18] proposed a spatial-temporal CNN-based model with
a graph attention network for pedestrian trajectory predic-
tion, which improved prediction accuracy while maintaining
computational efficiency. These works have shown promising
performance in pedestrian motion prediction, but they have
not considered the response from the autonomous vehicle or
the mutual interaction between planning and prediction. In
addition, these probabilistic models also output significant
uncertainty, as the outputs are represented by Gaussian
parameters of the predicted distribution and evaluated by the
best sample.
B. Autonomous Vehicle Stack
Autonomous vehicle stacks can be broadly categorized
into two primary types: modular stacks and end-to-end
stacks. Modular autonomous vehicle stacks emphasize the
employment of neural networks in differentiable AV mod-
ules, such as perception and prediction [19]. Numerous
studies have investigated the tighter integration of prediction
within planning [20]–[26], but these works typically utilize
predictions in planning without enhancing the prediction
module using information from planning. In contrast, end-
to-end stacks learn a policy network directly from data,
benefiting from the elimination of information bottlenecks
and performance scaling with increasing dataset sizes [27]–
[30]. However, fully end-to-end approaches suffer from a
lack of interpretability, which is essential for debugging,
verification, and safety guarantees. Moreover, the general-
izability of purely data-driven methods to unseen or rare-
but-critical scenarios remains unclear [1] [2]. Notably, end-
to-end methods are designed to process raw sensor data
and form decisions based on that information, incorporat-
ing intermediary tasks such as object detection and agent
behavior prediction. However, a crucial obstacle is that the
entirety of this procedure is governed by neural networks,
which complicates the integration with conventional, system-
atic planning and control algorithms. These algorithms are
frequently employed in practical applications due to their
established efficiency and dependability.
C. Game-theoretic Modeling of Crowds Navigation
Game-theoretic modeling can enhance the decision-
making of ego-autonomous vehicles in traffic scenarios. It
models traffic as a game where the ego vehicle aims to reach
its destination safely and efficiently, while pedestrians aim to
walk smoothly and avoid collisions. The reasonable strategies
of the ego vehicle and pedestrians fall into the concept of
Nash equilibrium. For example, if the ego vehicle deviates
from its current planning trajectory, it may potentially result
in damage and prevent itself from reaching its destination.
There is extensive literature on game-theoretic road-user
interaction models in applications such as modeling lane
changes and merging onto motorways, route selection, and
socio-economic choices [31]–[36]. There are three streams
of research based on the number of road-users involved,
including two agents game [33], [34], small group game
[36], and crowd game [35]. Several research gaps exist in
the current literature. The most extensive effort of research
is spent on two-agents game models. Most studies focus on
a limited decision space for agents, involving actions such as
deceleration, maintaining speed, staying in the current lane,
changing lanes, and similar choices. Additionally, apart from
the study conducted by [33], the majority of research lacks
experimental validation using real-world datasets.
III. METHODOLOGY
A. Problem Formulation
Consider a scenario that encompasses an autonomous
vehicle as the ego vehicle indexed by 0and a set of
pedestrians denoted by P={1,2, . . . , P }. The ego vehicle
and pedestrians are predicted to move with varying positions
a[t]
i:= (x[t]
i, y[t]
i) A[t]
iR2, i N := {0}∪P over
prediction time steps t Tpred := {1,2, . . . , T }, where
those A[t]
i, i N , t Tpred represents the position space of
Fig. 2. This flowchart outlines our proposed game theory-based autonomous navigation stack, designed to guide vehicles efficiently and safely through
crowded environments. The multi-agent scene simplifies the interaction dynamics into a two-player game to reduce computational complexity. A predictor
module, grounded in deep learning, uses graph neural encoding to capture the social interactions among pedestrians based on their historical states,
subsequently generating a probabilistic distribution of potential future pedestrian paths. A subset of potential trajectories is then sampled, representing the
predicted pedestrians’ possible trajectories. The stack iterates over all possible strategy support combinations for both players. With each combination, the
stack computes the constraints implied by the strategy supports, distinguishing strategies with either positive or zero probabilities. Support enumeration aids
in identifying optimal response strategies under these constraints. Utilizing a payoff function, our stack enhances the prediction of the crowd’s response
to the ego vehicle’s strategy, and accordingly identifies an optimal navigation strategy that minimizes potential conflicts and ensures safe navigation in
densely populated areas.
the ego vehicle and pedestrians predicted by deep learning
models (See more detail in Section III-B.2).
We now present the normal-form game representation of
the aforementioned scenario. The ego vehicle and pedestrians
are considered to be players indexed in set N. Each player’s
strategy space is considered as any possible time-sequential
positions (x[t]
i, y[t]
i), t Tpred predicted by probabilistic
models or deep learning predictors. Then, each player i N
selects its strategy ai:= (x[t]
i, y[t]
i)t∈Tpred from its strategy
space Ai:= Πt∈Tpred A[t]
iR2Tto maximize its payoff
Ji. For player 0(the ego vehicle), it aims to achieve its
navigation goal b:= (x
0, y
0)while preventing collision
and thus ensuring safety. Hence, player 0s payoff function
is such a mapping that
J0(a0,a1,...,aP):Πj∈N AjR.
For player i P (pedestrians), it seeks to reduce the jerk of
its movement and maintain comfortable distances from other
players including the ego vehicle and other pedestrians. As a
result, the payoff function of player i P is such a mapping
that
Ji(a0,...,ai,...,aP):Πj∈N AjR.
In game theory, Nash equilibrium is a common solution
concept, describing a profile of strategies where no player has
an incentive to unilaterally deviate from their current strategy,
assuming that all other players are also maintaining their
current strategies. The goal of the aforementioned normal-
form game is to determine Nash equilibrium.
B. Efficient Nash Equilibrium Approximation
Solving multi-agent games for autonomous navigation
in crowded environments is challenging due to the vast
strategy space and complex interactions between diverse
agents. Incomplete or imperfect information and the compu-
tationally demanding nature of finding a Nash equilibrium
further complicate real-time decision-making. In this work,
we propose an efficient approach for approximating the
Nash equilibrium, which simplifies the game to a two-player
model, reducing the computational complexity. The strategy
set for the crowds is narrowed down, and the best response
pairs for each player are identified by enumerating pure
strategy supports, enabling a more efficient determination of
the Nash equilibrium.
1) Simplified Game Representation for Efficient Interac-
tion: In this section, we present a simplified version of the
game representation, as discussed in Section III-A. Here, we
consider a scenario involving two primary players. Player
0represents the ego vehicle, while player 1embodies the
surrounding crowds, encompassing all pedestrians in set P.
Player 0selects its strategy a0from its strategy space A0to
optimize its payoff function J0. As player 1signifies all sur-
rounding pedestrians, player 1chooses its strategy {ai}i∈P
from its strategy space Πi∈P AiR2P T to maximize its
payoff function 1
PPi∈P Ji.
In this game formulation, the ego vehicle’s strategy space
A0originates from alternative trajectories provided by the
planner, whereas pedestrians’ strategy space Πi∈PAiare
the output of the predictor. Although this simplified version
of the game encompasses only two players, it retains the
fundamental characteristics of crowd navigation involving
multiple pedestrians.
The proposed Nash equilibrium approximation algorithm,
presented in Section III-B.3, offers an optimal solution for
non-cooperative interactions between the ego vehicle and
multiple pedestrians. Under this framework, neither the ego
vehicle nor the pedestrians can unilaterally modify their
strategies to attain better outcomes, thereby ensuring a stable
and efficient navigation strategy.
2) Using Deep Learning Predictor to Narrow Strategy
Sets: To approximate a Nash equilibrium for multi-agent
interaction, it is essential to assess all possible scenarios.
Nonetheless, it is quite difficult to enumerate every interac-
tion between all agents and establish the optimal response
for each. To streamline this process, we utilize the Social-
STGCNN pedestrian motion predictor [17], which is trained
on real-world data and comprises two main components:
the Spatio-Temporal Graph Convolution Neural Network
(ST-GCNN) and the Time-Extrapolator Convolution Neural
Network (TXP-CNN).
For a given pedestrian trajectory sequence, we construct
a graph representation of the scene. At each time step, an
undirected graph Gt= (Vt,At)is defined, where each node
vi
tVtsignifies the displacement of agent ifrom the
previous timestep, accounting for Npedestrians. The graph
edges are denoted using a weighted adjacency matrix At. The
entire spatio-temporal graph G=G1,...,GTobs spanning
the observed time period Tobs is utilized as input to the ST-
GCNN.
We employ the graph spatial convolution operation, as
introduced in [17], on G. This generalizes convolution from
an array to graph representation, using a kernel to aggregate
features in a local graph neighborhood across spatial and
temporal dimensions, akin to grid-based CNNs [37] [17]:
ˆ
At=˜
D1
2
t(At+I)˜
D1
2
t,(1)
where ˜
Dtdenotes the diagonal degree matrix of node con-
nections for (At+I)at time step t,Irepresents the identity
matrix, and the normalized weight adjacency matrix at time
tis denoted by ˆ
At.
The input to each layer lof the ST-GCNN and TXP-CNN
are shown as Eq. (2a) and Eq. (2c), where Vand ˆ
Adenote
the concatenations of Vtand ˆ
Atalong the time dimension
respectively, for all tTobs :
H(l+1) =σ(ˆ
AH(l)W(l))(2a)
U0=σtxp(HW0
txp)(2b)
U(k+1) =σtxp((UkWk
txp) + Uk), k 0,(2c)
where σtxp and Wk
txp represent the activation function and
layer-specific trainable weight matrices, respectively.
This model outputs bi-variate Gaussian distribution ˆ
Yt
i
N(ˆµt
i,ˆσt
i,ˆρt
i)for pedestrian i P at each time step
t Tpred, from which discrete strategy space Πi∈P Ai
are sampled. Accounting for the joint distribution of x
and yat each time tduring the sampling process would
significantly increase the computational cost, particularly
when considering the interactions between all pedestrians.
To reduce computational complexity and enable real-time
operation of the algorithm, we disregard this correlation and
directly sample predicted trajectories for all individuals from
the distribution. This method enables us to concentrate on a
narrower set of pedestrian future trajectories, thereby making
it more manageable to pinpoint the best responses for each
agent.
3) Approximating Nash Equilibrium: We propose Algo-
rithm 1to efficiently compute pure strategy NE for simplified
game representation in Section III-B.1. It efficiently finds
pure strategy NE by eliminating strategies that lead to colli-
sions, computing payoff matrices, and iteratively identifying
the best response strategies for each player. Algorithm 1
returns a set of NEs, which represent stable strategies where
neither player has an incentive to unilaterally deviate.
Algorithm 1: ComputePureStrategyNE
Input: Player 0’s discrete strategy space A0and
Player 1’s discrete strategy space Πi∈PAi
Output: Approximated NE(s) a
0,a
1,· · · ,a
P
// Eliminate collision strategies:
1Aelim
0 ,Aelim
1 ;
2foreach strategy s0in player 0’s A0do
3foreach strategy s1in player 1’s Πi∈P Aido
4if s0and s1do not collide then
5Aelim
0 Aelim
0 {s0};
6Aelim
1 Aelim
1 {s1};
// Compute payoff matrices Q0and Q1
7foreach strategy s0in player 0’s Aelim
0do
8foreach strategy s1in player 1’s Aelim
1do
9jindex(s0), k index(s1);
10 Q0(j, k)J0(s0, s1);
11 Q1(j, k)1
PPi∈P Ji(s0, s1);
// Find NE by best response (BR) process
12 NE ;
13 foreach strategy s0in player 0’s Aelim
0do
// Compute player 1’s BR given s0
14 jindex(s0);
15 BR1= arg maxs1Q1(j, index(s1));
16 foreach strategy s1in BR1do
// Compute player 0’s BR given BR1
17 kindex(s1);
18 BR0= arg maxs0Q0(index(s0), k);
// Pair best response
19 if s0BR0then
20 NE NE (s0, s1)
21 Return NE
C. Payoff Functions for Optimal Prediction and Path Plan-
ning
We propose payoff functions for pedestrians and au-
tonomous vehicles to optimize safety, efficiency, and pedes-
trian motion prediction in autonomous navigation. These
functions guide each player’s behavior, encouraging strate-
gies for better prediction and path planning. This enhances
the performance and safety of the navigation system.
1) Payoff Function for Ego Vehicle: For autonomous ve-
hicles, the payoff function is formulated to reward strategies
that lead to optimal path planning, taking into considera-
tion several criteria. These include avoiding collisions with
pedestrians, maintaining a safe distance from pedestrians,
and reaching the destination within the shortest possible time.
The function assigns higher rewards to planning paths that
strike a balance between safety and efficiency, resulting in
a smoother navigation experience. The ego vehicle’s payoff
function takes the form
J0=w1a[T]
0b2
|{z }
navigation goal
w2d1
P T X
t∈Tpred
X
i∈P
1d1(a[t]
ia[t]
02)
|{z }
vehicle safety term
with indicator function 1d1(x)=1if x>dand 1d1(x)=0
otherwise. Here, d1represents the safe distance threshold
between the ego vehicle and pedestrians, serving as a de-
terminant for maintaining collision avoidance. The weight
factors w1and w2play an important role in determining
the relative importance of the navigation goal and the safety
requirement. To be more specific, w1represents the weight
assigned to the “vehicle navigation goal”, which is a measure
of how well the ego vehicle is able to reach its destination.
On the other hand, w2represents the weight assigned to the
“vehicle safety term”, which is a measure of the ego vehicle’s
proximity to pedestrians.
2) Payoff Functions for Pedestrians: For pedestrians, the
payoff function is designed to reward predicted behaviors
that maintain a consistent velocity, avoid collisions with other
pedestrians, and ensure a safe distance between pedestrians
and vehicles. The function takes into account multiple fac-
tors, such as the relative speed and distance between pedes-
trians, and assigns higher rewards to behaviors that adhere to
the desired safety and efficiency objectives. This contributes
to identifying the best motion prediction for pedestrians,
leading to enhanced safety and smoother navigation. The
payoff function of pedestrian i P is characterized by
Ji=w3
T3
T3
X
t=1
a[t+3]
i3a[t+2]
i+ 3a[t+1]
ia[t]
i1
|{z }
pedestrian i’s jerk term
w4d2
TX
t∈Tpred
1d2(a[t]
ia[t]
02)
| {z }
pedestrian i’s safety term with ego vehicle
w5d3
T(P1) X
t∈Tpred
X
j∈P,j =i
1d3(a[t]
ja[t]
i2)
| {z }
pedestrian i’s safety term with other pedestrians
,
where d2represents the safe distance threshold between
the pedestrian and the ego vehicle, while d3represents
the safe distance threshold between a pedestrian and other
pedestrians. Specifically, w3represents the weight assigned
to the “pedestrian jerk term”, which is a measure of the
smoothness of the pedestrian’s movement. On the other hand,
w4and w5represent the weight assigned to the “pedestrian
i’s safety term” related to ego vehicle and other pedestrians,
which are two measures of the pedestrian i’s proximity to
the ego vehicle and other pedestrians j P, j =i.
D. Implementation
The implemented predictor model comprises one ST-
GCNN layer and five TXP-CNN layers. Both temporal
convolutions and graph convolutions utilize a kernel size
of 3. The deterministic and probabilistic models employ
and are trained for 150 epochs with a batch size of 128.
Since the VCI dataset does not provide sufficient data for
training, the ETH/UCY dataset is used to train the predictor
model and test it on the VCI dataset. This work employs
the vehicle trajectory planner proposed by [38], which uses
Frenet coordinates to represent road positions more intu-
itively compared to traditional Cartesian coordinates. Frenet
coordinates employ variables sand dto describe a vehicle’s
position on the road or a reference path. These coordinates
offer a simpler mathematical representation of a reference
path, as its run length is described by the s-axis. During the
enumeration of supports, the best planning trajectory that
takes safety and comfort into account is identified.
IV. EXP ERI MEN T
In this experimental section, the main goal is to evaluate
the effectiveness of our proposed stack in terms of reducing
prediction uncertainty and improving integration error. A
series of meticulously designed experiments are conducted
to provide a comprehensive assessment of the stack’s perfor-
mance.
For this experiment, the Vehicle-Crowd Interaction (VCI)
dataset [3] is utilized, and an open-loop validation approach
is adopted, implying that no position feedback is provided
for the ego vehicle. Vehicle trajectories from the VCI dataset
are selected, and our proposed method is applied to perform
path planning. By comparing the resulting path planning to
the ground truth, we can effectively gauge the performance
of our proposed approach.
A. Dataset
In this work, the Vehicle-Crowd Interaction (VCI), ETH
[39] and UCY [40] are utilized. The VCI dataset, collected
from two bustling campus locations, captures interactions
between pedestrians and vehicles in situations where neither
party has priority. One location includes a pedestrian cross-
walk at an unsignalized intersection, while the other consists
of a spacious shared area where pedestrians and vehicles
can move freely. The ETH dataset comprises two top-down
view scenes converted to world coordinates, recorded at an
observation frequency of 2.5 Hz. The UCY dataset contains
single top-down view urban scenes, also observed at a 2.5
Hz frequency.
Owing to the limited size of the VCI dataset, the predic-
tive models require more comprehensive data for training
purposes. Consequently, the larger ETH and UCY datasets
are utilized for training, while the VCI dataset is reserved
for testing and evaluating the performance of the developed
models.
In the testing phase, certain vehicle trajectories within the
VCI dataset that are either very short or remain stationary
are identified and removed. This step ensures a more accurate
and reliable evaluation of the planning model’s performance,
as it focuses on meaningful interactions between pedestrians
and vehicles rather than insignificant or irrelevant data points.
B. Evaluation Metrics
Average Displacement Error (ADE): The mean Eu-
clidean distance between ground truth and predicted
trajectories for each predicted timestep is computed.
Final Displacement Error (FDE): The Euclidean dis-
tance between ground truth and predicted trajectories
at the last predicted time step is determined.
Prediction Collision Rate (COL): This metric represents
the percentage of collision occurrences in the predicted
trajectories for all pedestrians.
Success Rate (SR): The proportion of planning scenar-
ios in which the vehicle successfully avoids colliding
with pedestrians, among all tested scenarios, is assessed.
C. Baseline
The first baseline method comprises a standard au-
tonomous vehicle stack, which includes prediction and plan-
ning modules. This method predicts 20 possible future tra-
jectories for each pedestrian and subsequently generates path
planning that avoids these predicted trajectories, representing
the conventional approach in autonomous vehicle navigation.
Another baseline method we compare against is a planner
that utilizes ground truth pedestrian future trajectories as in-
put for path planning. This method offers an ideal scenario in
which the planner has access to the actual future trajectories
of pedestrians, allowing us to evaluate the performance of
our proposed stack relative to this idealized baseline. We also
conduct a comparison between our method and a traditional
probabilistic predictor with respect to prediction outcomes.
Our stack picks out one or several potential trajectories from
the predicted set. For comparison purposes, we generate a
single trajectory from both the classic probabilistic predictor
and our proposed stack.
To ensure a fair comparison across different stacks, we
employ the same predictor and implementation methodology
for all methods being evaluated. This approach allows us to
isolate the impact of our proposed method on the overall
performance of the stack in the context of autonomous
vehicle navigation and pedestrian interaction.
V. RE SULTS A ND DISCUSSION
In the experiment presented in this paper, we assess the
effectiveness of our proposed method across several key
aspects of autonomous vehicle navigation and pedestrian
interaction. The results of the experiment demonstrate the
following major findings:
Refined probabilistic prediction: Our proposed stack sig-
nificantly improves probabilistic predictions for future pedes-
trian trajectories. Previous probabilistic prediction models
usually output distributions of pedestrians’ future trajectories,
wherein each point on the predicted trajectory is considered
as an independent event. In the real world, this assumption
may not always be valid, as the movement of pedestrians
often depends on their previous positions and movements,
their current context, and their goals. Therefore, treating
each point as an independent event can lead to considerable
uncertainty in the trajectory prediction. To attenuate the
risk of collisions, it is beneficial for such probabilistic
prediction models to sample a vast array of trajectories
from the predicted distribution. However, this practice may
conversely cause a substantial reduction in the search space
for path planning. In the standard probabilistic predictor,
a distribution of potential future positions is produced, as
demonstrated by the contour in Fig. 3. Our method success-
fully refines the range of predicted trajectory sets, providing
planners with flexibility and accommodating large search
spaces. In standard probabilistic predictions, overlapping
predicted distributions imply that pedestrians are expected
to collide with each other. However, in real-world situations,
pedestrians typically avoid collisions. This discrepancy in-
dicates that the standard probabilistic prediction contains a
considerable amount of uncertainty, which could result in
suboptimal planning decisions. Our proposed method ad-
dresses this issue by incorporating more realistic assumptions
about pedestrian behavior, leading to a significant reduction
in prediction uncertainty. As demonstrated in the table, the
proposed stack reduces collision rates by 4%.
Improved path planning: The proposed stack not only
improves the prediction aspect of the navigation process but
also enhances the overall path planning. By utilizing our
method, the autonomous vehicle is able to generate safer
and more efficient plans for navigating through dynamic
environments involving pedestrian interactions. As shown
in Table I, the game theory-based stack achieves a higher
success rate compared to the standard probabilistic predic-
tion. This is because the standard probabilistic prediction
produces excessive uncertainty, which significantly reduces
the planner’s search space and results in failure to find
successful paths to the destination. It can be observed from
the table that the success rate is the lowest when the planner
uses ground truth for planning. This is because the planner
relies on short-term (5-second) predictions of pedestrian
trajectories to plan the path, which may lead to a scenario
where the vehicle avoids collisions with pedestrians in the
short term but ends up in the most congested area, ultimately
colliding with pedestrians. This local planning approach
leads to a low success rate for this stack. Incorporating a
TABLE I
COMPARISON OF STACK PERFORMANCE: P REDICTION AND PLANNING
IN RE AL-WORLD DATASE T.
Method ADE FDE COL SR
Ground truth - planner / / / 42%
Standard stack (predictor - planner) 1.18 2.08 10% 48%
Ours 0.99 1.71 6% 51%
(a) (b)
(c) (d)
Fig. 3. Figure (a)-(d): Experimental results using real-world data for autonomous vehicle path predictions in crowded environments. Figures (a) and (b)
present a comparison between the autonomous vehicle path predictions (red) and human driver trajectories (blue), illustrating the successful navigation of
the vehicle through the crowds. Figures (c) and (d) compare the pedestrian trajectory predictions from a deep learning model (blue) to those from the
proposed game theoretic stack (red), demonstrating the improvement in probabilistic prediction and enhanced performance of the game theoretic stack.
broader range of pedestrian movement possibilities into the
stack can improve the planner’s success rate. In Figure 1,
due to the reward function’s settings (higher rewards for
the vehicle closer to the destination), the vehicle chooses
a path closer to the endpoint. However, as the vehicle gets
closer to pedestrians, it immediately changes its direction
to maintain a safe distance, selecting an alternative path
with fewer pedestrian interactions to reach the destination.
Figure 2 demonstrates that, in highly congested situations,
the planner’s path is remarkably similar to that of a human
driver.
Enhanced prediction accuracy: Our proposed stack not
only reduces uncertainty but also improves the overall accu-
racy of the predictions. By incorporating the mutual effects
between prediction and planning, our model is capable of
generating more precise estimations of pedestrian trajecto-
ries, leading to more reliable navigation in complex environ-
ments. We compared the predicted trajectory from the pre-
dicted distribution to the prediction in the Nash equilibrium
state, and observed that both the average displacement error
and the final displacement error decreased, as illustrated in
Table I. This demonstrates the effectiveness of our proposed
method in capturing these nuanced pedestrian behaviors and
improving prediction accuracy.
VI. CONCLUSIONS AND FUTURE WORK
This work presents a novel game theory-based autonomous
vehicle stack for safe and efficient navigation in crowded
environments with significant pedestrian interaction. The
proposed stack effectively addresses the challenges of stan-
dard modular architectures, such as compounding errors,
information bottlenecks, and integration difficulties. Our ap-
proach employs an efficient Nash equilibrium approximation
that enables simultaneous prediction and planning outputs,
providing a solution to the “chicken and egg” problem.
The stack incorporates a payoff function capturing agent
interactions in the environment. Evaluations using real-
world datasets demonstrated reduced prediction uncertainty,
enhanced accuracy, and improved planning success rate
while mitigating integration errors. The proposed approach
contributes to autonomous navigation solutions in dynamic
environments and highlights the importance of effective
communication and coordination between prediction and
planning modules.
One limitation of our work is the adoption of open-loop
validation. This methodology implies that the behavior of
the pedestrians remains unaffected by the implementation of
the new game theory planner as the data is predetermined
and obtained from a static dataset. This limits the scope
of interaction and adaptability, thereby potentially affecting
the effectiveness of the autonomous navigation system in
dynamic real-world scenarios. Furthermore, there is ample
room for improvement in the individual modules utilized
in our work. The planning algorithms we employed are
more commonly used in less congested environments, such
as highways, as opposed to densely populated areas. A
comprehensive exploration and integration of diverse planner
modules specifically designed for crowded environments
could potentially augment the performance and robustness of
our proposed navigation stack. Future work should address
these limitations and further improve upon the described
system.
REFERENCES
[1] P. Karkus, B. Ivanovic, S. Mannor, and M. Pavone, “Diffstack: A
differentiable and modular control stack for autonomous vehicles, in
Conference on Robot Learning. PMLR, 2023, pp. 2170–2180.
[2] K. Leung, S. Veer, E. Schmerling, and M. Pavone, “Learning
autonomous vehicle safety concepts from demonstrations,” arXiv
preprint arXiv:2210.02761, 2022.
[3] D. Yang, L. Li, K. Redmill, and ¨
U. ¨
Ozg¨
uner, “Top-view trajectories:
A pedestrian dataset of vehicle-crowd interaction from controlled
experiments and crowded campus, in IV, 2019.
[4] S. Eiffert and S. Sukkarieh, “Predicting Responses to a Robot’s Future
Motion Using Generative Recurrent Neural Networks, in ACRA,
2019.
[5] S. Eiffert, N. Wallace, H. Kong, N. Pirmarzdashti, and S. Sukkarieh,
“Path Planning in Dynamic Environments using Generative RNNs and
Monte Carlo Tree Search, ICRA, 2020.
[6] S. Bansal, A. Cosgun, A. Nakhaei, and K. Fujimura, “Collaborative
planning for mixed-autonomy lane merging, in 2018 IEEE/RSJ inter-
national conference on intelligent robots and systems (IROS). IEEE,
2018, pp. 4449–4455.
[7] C. Kim and R. Langari, “Game theory based autonomous vehicles
operation,” International Journal of Vehicle Design, vol. 65, no. 4, pp.
360–383, 2014.
[8] Y. Rahmati, A. Talebpour, A. Mittal, and J. Fishelson, “Game theory-
based framework for modeling human–vehicle interactions on the
road,” Transportation research record, vol. 2674, no. 9, pp. 701–713,
2020.
[9] S. Eiffert, K. Li, M. Shan, S. Worrall, S. Sukkarieh, and E. Nebot,
“Probabilistic Crowd GAN: Multimodal Pedestrian Trajectory Pre-
diction using a Graph Vehicle-Pedestrian Attention Network,” IEEE
Robotics and Automation Letters, 2020.
[10] B. Ivanovic and M. Pavone, “The Trajectron: Probabilistic Multi-Agent
Trajectory Modeling With Dynamic Spatiotemporal Graphs, in ICCV,
2019.
[11] T. Salzmann, B. Ivanovic, P. Chakravarty, and M. Pavone, “Trajec-
tron++: Multi-agent generative trajectory forecasting with heteroge-
neous data for control,” arXiv:2001.03093, 2020.
[12] A. Gupta, J. Johnson, L. Fei-Fei, S. Savarese, and A. Alahi, “Social
GAN: Socially acceptable trajectories with generative adversarial
networks,” in CVPR, 2018.
[13] V. Kosaraju, A. Sadeghian, R. Mart´
ın-Mart´
ın, I. Reid, S. H.
Rezatofighi, and S. Savarese, “Social-BiGAT: Multimodal Trajectory
Forecasting using Bicycle-GAN and Graph Attention Networks, in
NeurIPS, 2019.
[14] A. Alahi, K. Goel, V. Ramanathan, A. Robicquet, L. Fei-Fei, and
S. Savarese, “Social LSTM: Human Trajectory Prediction in Crowded
spaces,” in CVPR, 2016.
[15] P. Zhang, W. Ouyang, P. Zhang, J. Xue, and N. Zheng, “SR-LSTM:
State Refinement for LSTM towards Pedestrian Trajectory Prediction,
in CVPR, 2019.
[16] P. Veliˇ
ckovi´
c, G. Cucurull, A. Casanova, A. Romero, P. Li `
o, and
Y. Bengio, “Graph Attention Networks,” in ICLR, 2018.
[17] A. Mohamed, K. Qian, M. Elhoseiny, and C. Claudel, “Social-
STGCNN: A social spatio-temporal graph convolutional neural net-
work for human trajectory prediction,” in CVPR, 2020.
[18] K. Li, S. Eiffert, M. Shan, F. Gomez-Donoso, S. Worrall, and E. Nebot,
“Attentional-GCNN: Adaptive pedestrian trajectory prediction towards
generic autonomous vehicle use cases,” in ICRA, 2021.
[19] X. Weng, B. Ivanovic, and M. Pavone, “Mtp: multi-hypothesis tracking
and prediction for reduced error propagation,” in 2022 IEEE Intelligent
Vehicles Symposium (IV). IEEE, 2022, pp. 1218–1225.
[20] J. Ziegler, P. Bender, T. Dang, and C. Stiller, “Trajectory planning for
bertha—a local, continuous method,” in 2014 IEEE intelligent vehicles
symposium proceedings. IEEE, 2014, pp. 450–457.
[21] C. Liu, S. Lee, S. Varnhagen, and H. E. Tseng, “Path planning for
autonomous vehicles using model predictive control, in 2017 IEEE
Intelligent Vehicles Symposium (IV). IEEE, 2017, pp. 174–179.
[22] B. Ivanovic, A. Elhafsi, G. Rosman, A. Gaidon, and M. Pavone, “Mats:
An interpretable trajectory forecasting representation for planning and
control,” arXiv preprint arXiv:2009.07517, 2020.
[23] Y. Chen, U. Rosolia, C. Fan, A. Ames, and R. Murray, “Reactive
motion planning with probabilisticsafety guarantees,” in Conference
on Robot Learning. PMLR, 2021, pp. 1958–1970.
[24] S. Schaefer, K. Leung, B. Ivanovic, and M. Pavone, “Leveraging
neural network gradients within trajectory optimization for proactive
human-robot interactions,” in 2021 IEEE International Conference on
Robotics and Automation (ICRA). IEEE, 2021, pp. 9673–9679.
[25] S. Casas, C. Gulino, S. Suo, and R. Urtasun, “The importance of
prior knowledge in precise multimodal prediction, in 2020 IEEE/RSJ
International Conference on Intelligent Robots and Systems (IROS).
IEEE, 2020, pp. 2295–2302.
[26] B. Ivanovic and M. Pavone, “Injecting planning-awareness into pre-
diction and detection evaluation, in 2022 IEEE Intelligent Vehicles
Symposium (IV). IEEE, 2022, pp. 821–828.
[27] M. Bojarski, D. Del Testa, D. Dworakowski, B. Firner, B. Flepp,
P. Goyal, L. D. Jackel, M. Monfort, U. Muller, J. Zhang et al., “End to
end learning for self-driving cars, arXiv preprint arXiv:1604.07316,
2016.
[28] A. E. Sallab, M. Abdou, E. Perot, and S. Yogamani, “Deep rein-
forcement learning framework for autonomous driving, arXiv preprint
arXiv:1704.02532, 2017.
[29] A. Jain, L. Del Pero, H. Grimmett, and P. Ondruska, “Autonomy
2.0: Why is self-driving always 5 years away?” arXiv preprint
arXiv:2107.08142, 2021.
[30] A. Kendall, J. Hawke, D. Janz, P. Mazur, D. Reda, J.-M. Allen, V.-D.
Lam, A. Bewley, and A. Shah, “Learning to drive in a day,” in 2019
International Conference on Robotics and Automation (ICRA). IEEE,
2019, pp. 8248–8254.
[31] R. Elvik, “A review of game-theoretic models of road user behaviour,”
Accident Analysis & Prevention, vol. 62, pp. 388–396, 2014.
[32] N. Radwan, W. Burgard, and A. Valada, “Multimodal interaction-
aware motion prediction for autonomous street crossing,” The Interna-
tional Journal of Robotics Research, vol. 39, no. 13, pp. 1567–1598,
2020.
[33] W.-C. Ma, D.-A. Huang, N. Lee, and K. M. Kitani, “Forecasting in-
teractive dynamics of pedestrians with fictitious play,” in Proceedings
of the IEEE Conference on Computer Vision and Pattern Recognition,
2017, pp. 774–782.
[34] A. Turnwald, W. Olszowy, D. Wollherr, and M. Buss, “Interactive
navigation of humans from a game theoretic perspective, in 2014
IEEE/RSJ International Conference on Intelligent Robots and Systems.
IEEE, 2014, pp. 703–708.
[35] B. L. Mesmer and C. L. Bloebaum, “Modeling decision and game
theory based pedestrian velocity vector decisions with interacting
individuals, Safety science, vol. 87, pp. 116–130, 2016.
[36] F. T. Johora and J. P. M¨
uller, “Modeling interactions of multimodal
road users in shared spaces,” in 2018 21st International Conference
on Intelligent Transportation Systems (ITSC). IEEE, 2018, pp. 3568–
3574.
[37] T. N. Kipf and M. Welling, “Semi-supervised classification with graph
convolutional networks, arXiv preprint arXiv:1609.02907, 2016.
[38] M. Werling, J. Ziegler, S. Kammel, and S. Thrun, “Optimal trajectory
generation for dynamic street scenarios in a frenet frame,” in 2010
IEEE International Conference on Robotics and Automation. IEEE,
2010, pp. 987–993.
[39] S. Pellegrini, A. Ess, K. Schindler, and L. Van Gool, “You’ll never
walk alone: Modeling social behavior for multi-target tracking, in
ICCV, 2009.
[40] A. Lerner, Y. Chrysanthou, and D. Lischinski, “Crowds by example,”
in Computer graphics forum, vol. 26, no. 3, 2007, pp. 655–664.
... Other approaches, use game-theoretic models [19] or Partially Observable Markov Decision Processes [10] to account for the interaction. In [20], the authors present a game-theoretic approach using a leader-follower game. ...
... The planning framework follows a scheme consisting of the modules Understand, Predict, Plan & Select and Act. Similar schemes are widely used in behavior planning for AVs [15], [19], [24]. Fig. 2 shows the overall structure of the complete system for cooperative communication. ...
... It is crucial to model the high non-linearity of human behaviors, such as human intentions and interactions between agents [16]- [20]. Deep reinforcement learning (DRL) offers a potentially viable solution to these challenges [21]- [23]. Previous works on RL-based methods for social robots include capturing agent-agent interactions [2], [23] and the intentions of human agents [6], incorporating these as predictions into RL policy networks. ...
Preprint
Full-text available
Reinforcement Learning (RL) has enabled social robots to generate trajectories without human-designed rules or interventions, which makes it more effective than hard-coded systems for generalizing to complex real-world scenarios. However, social navigation is a safety-critical task that requires robots to avoid collisions with pedestrians while previous RL-based solutions fall short in safety performance in complex environments. To enhance the safety of RL policies, to the best of our knowledge, we propose the first algorithm, SoNIC, that integrates adaptive conformal inference (ACI) with constrained reinforcement learning (CRL) to learn safe policies for social navigation. More specifically, our method augments RL observations with ACI-generated nonconformity scores and provides explicit guidance for agents to leverage the uncertainty metrics to avoid safety-critical areas by incorporating safety constraints with spatial relaxation. Our method outperforms state-of-the-art baselines in terms of both safety and adherence to social norms by a large margin and demonstrates much stronger robustness to out-of-distribution scenarios. Our code and video demos are available on our project website: https://sonic-social-nav.github.io/.
... Recent studies have also explored the use of social dynamics and negotiation strategies to enhance the interaction capabilities of autonomous agents. For instance, game theory can be used to model interactions among multiple agents allowing for the prediction and anticipation of other agents' actions, leading to more effective decision making processes [43]- [46]. Additionally, incorporating techniques such as imitation learning and behavior cloning has enabled agents to learn from human demonstrations, thereby improving their ability to interact seamlessly with human drivers and pedestrians [47], [48]. ...
Preprint
Training intelligent agents to navigate highly interactive environments presents significant challenges. While guided meta reinforcement learning (RL) approach that first trains a guiding policy to train the ego agent has proven effective in improving generalizability across various levels of interaction, the state-of-the-art method tends to be overly sensitive to extreme cases, impairing the agents' performance in the more common scenarios. This study introduces a novel training framework that integrates guided meta RL with importance sampling (IS) to optimize training distributions for navigating highly interactive driving scenarios, such as T-intersections. Unlike traditional methods that may underrepresent critical interactions or overemphasize extreme cases during training, our approach strategically adjusts the training distribution towards more challenging driving behaviors using IS proposal distributions and applies the importance ratio to de-bias the result. By estimating a naturalistic distribution from real-world datasets and employing a mixture model for iterative training refinements, the framework ensures a balanced focus across common and extreme driving scenarios. Experiments conducted with both synthetic dataset and T-intersection scenarios from the InD dataset demonstrate not only accelerated training but also improvement in agent performance under naturalistic conditions, showcasing the efficacy of combining IS with meta RL in training reliable autonomous agents for highly interactive navigation tasks.
Article
Training intelligent agents to navigate highly interactive environments presents significant challenges. While guided meta reinforcement learning (RL) approach that first trains a guiding policy to train the ego agent has proven effective in improving generalizability across scenarios with various levels of interaction, the state-of-the-art method tends to be overly sensitive to extreme cases, impairing the agents' performance in the more common scenarios. This study introduces a novel training framework that integrates guided meta RL with importance sampling (IS) to optimize training distributions iteratively for navigating highly interactive driving scenarios, such as T-intersections or roundabouts. Unlike traditional methods that may underrepresent critical interactions or overemphasize extreme cases during training, our approach strategically adjusts the training distribution towards more challenging driving behaviors using IS proposal distributions and applies the importance ratio to de-bias the result. By estimating a naturalistic distribution from real-world datasets and employing a mixture model for iterative training refinements, the framework ensures a balanced focus across common and extreme driving scenarios. Experiments conducted with both synthetic and naturalistic datasets demonstrate both accelerated training and performance improvements under highly interactive driving tasks.
Article
Deep reinforcement learning (DRL) provides a promising way for intelligent agents (e.g., autonomous vehicles) to learn to navigate complex scenarios. However, DRL with neural networks as function approximators is typically considered a black box with little explainability and often suffers from suboptimal performance, especially for autonomous navigation in highly interactive multi-agent environments. To address these issues, we propose three auxiliary tasks with spatio-temporal relational reasoning and integrate them into the standard DRL framework, which improves the decision making performance and provides explainable intermediate indicators. We propose to explicitly infer the internal states (i.e., traits and intentions) of surrounding agents (e.g., human drivers) as well as to predict their future trajectories in the situations with and without the ego agent through counterfactual reasoning. These auxiliary tasks provide additional supervision signals to infer the behavior patterns of other interactive agents. Multiple variants of framework integration strategies are compared. We also employ a spatio-temporal graph neural network to encode relations between dynamic entities, which enhances both internal state inference and decision making of the ego agent. Moreover, we propose an interactivity estimation mechanism based on the difference between predicted trajectories in these two situations, which indicates the degree of influence of the ego agent on other agents. To validate the proposed method, we design an intersection driving simulator based on the Intelligent Intersection Driver Model (IIDM) that simulates vehicles and pedestrians. Our approach achieves robust and state-of-the-art performance in terms of standard evaluation metrics and provides explainable intermediate indicators (i.e., internal states, and interactivity scores) for decision making.
Article
Full-text available
For mobile robots navigating on sidewalks, the ability to safely cross street intersections is essential. Most existing approaches rely on the recognition of the traffic light signal to make an informed crossing decision. Although these approaches have been crucial enablers for urban navigation, the capabilities of robots employing such approaches are still limited to navigating only on streets that contain signalized intersections. In this article, we address this challenge and propose a multimodal convolutional neural network framework to predict the safety of a street intersection for crossing. Our architecture consists of two subnetworks: an interaction-aware trajectory estimation stream ( interaction-aware temporal convolutional neural network (IA-TCNN)), that predicts the future states of all observed traffic participants in the scene; and a traffic light recognition stream AtteNet. Our IA-TCNN utilizes dilated causal convolutions to model the behavior of all the observable dynamic agents in the scene without explicitly assigning priorities to the interactions among them, whereas AtteNet utilizes squeeze-excitation blocks to learn a content-aware mechanism for selecting the relevant features from the data, thereby improving the noise robustness. Learned representations from the traffic light recognition stream are fused with the estimated trajectories from the motion prediction stream to learn the crossing decision. Incorporating the uncertainty information from both modules enables our architecture to learn a likelihood function that is robust to noise and mispredictions from either subnetworks. Simultaneously, by learning to estimate motion trajectories of the surrounding traffic participants and incorporating knowledge of the traffic light signal, our network learns a robust crossing procedure that is invariant to the type of street intersection. Furthermore, we extend our previously introduced Freiburg Street Crossing dataset with sequences captured at multiple intersections of varying types, demonstrating complex interactions among the traffic participants as well as various lighting and weather conditions. We perform comprehensive experimental evaluations on public datasets as well as our Freiburg Street Crossing dataset, which demonstrate that our network achieves state-of-the-art performance for each of the subtasks, as well as for the crossing safety prediction. Moreover, we deploy the proposed architectural framework on a robotic platform and conduct real-world experiments that demonstrate the suitability of the approach for real-time deployment and robustness to various environments.
Conference Paper
Autonomous vehicle navigation in shared pedestrian environments requires the ability to predict future crowd motion both accurately and with minimal delay. Understanding the uncertainty of the prediction is also crucial. Most existing approaches however can only estimate uncertainty through repeated sampling of generative models. Additionally, most current predictive models are trained on datasets that assume complete observability of the crowd using an aerial view. These are generally not representative of real-world usage from a vehicle perspective, and can lead to the underestimation of uncertainty bounds when the on-board sensors are occluded. Inspired by prior work in motion prediction using spatio-temporal graphs, we propose a novel Graph Convolutional Neural Network (GCNN)-based approach, Attentional-GCNN, which aggregates information of implicit interaction between pedestrians in a crowd by assigning attention weight in edges of the graph. Our model can be trained to either output a probabilistic distribution or faster deterministic prediction, demonstrating applicability to autonomous vehicle use cases where either speed or accuracy with uncertainty bounds are required. To further improve the training of predictive models, we propose an automatically labelled pedestrian dataset collected from an intelligent vehicle platform representative of real-world use. Through experiments on a number of datasets, we show our proposed method achieves an improvement over the state of art by 10% Average Displacement Error (ADE) and 12% Final Displacement Error (FDE) with fast inference speeds.
Conference Paper
Motion planning in environments with multiple agents is critical to many important autonomous applications such as autonomous vehicles and assistive robots. This paper considers the problem of motion planning, where the controlled agent shares the environment with multiple uncontrolled agents. First, a predictive model of the uncontrolled agents is trained to predict all possible trajectories within a short horizon based on the scenario. The prediction is then fed to a motion planning module based on model predictive control. We proved generalization bound for the predictive model using three different methods, post-bloating, support vector machine (SVM), and conformal analysis, all capable of generating stochastic guarantees of the correctness of the predictor. The proposed approach is demonstrated in simulation in a scenario emulating autonomous highway driving.