Preprint

Human Trajectory Prediction via Neural Social Physics

Abstract

Trajectory prediction has been widely pursued in many fields, and many model-based and model-free methods have been explored. The former include rule-based, geometric or optimization-based models, and the latter are mainly comprised of deep learning approaches. In this paper, we propose a new method combining both methodologies based on a new Neural Differential Equation model. Our new model (Neural Social Physics or NSP) is a deep neural network within which we use an explicit physics model with learnable parameters. The explicit physics model serves as a strong inductive bias in modeling pedestrian behaviors, while the rest of the network provides a strong data-fitting capability in terms of system parameter estimation and dynamics stochasticity modeling. We compare NSP with 15 recent deep learning methods on 6 datasets and improve the state-of-the-art performance by 5.56%-70%. Besides, we show that NSP has better generalizability in predicting plausible trajectories in drastically different scenarios where the density is 2-5 times as high as the testing data. Finally, we show that the physics model in NSP can provide plausible explanations for pedestrian behaviors, as opposed to black-box deep learning. Code is available:
Human Trajectory Prediction via Neural Social Physics
Jiangbei Yue, Dinesh Manocha, and He Wang
Existing approaches generally fall into model-based and model-free methods. Model-based methods tend to be explainable, but they are less effective at data fitting. Model-free methods based on deep learning excel at data fitting but lack explainability. Our paper proposes Neural Social Physics (NSP), which combines model-based and model-free methods so that it can explain pedestrian behaviors while retaining the data-fitting capability needed to predict human trajectories.
A new neural differential equation model for trajectory prediction and analysis.
A new mechanism to combine explicit models with neural networks for prediction.
The NSP model performs well in prediction, generalization and explainability.
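$$\frac{d\mathbf{q}}{dt}(t) = f_{\theta,\phi}\big(t, \mathbf{q}(t), \Omega(t), \mathbf{q}^{T}, E\big) + \alpha_{\phi}\big(t, \mathbf{q}^{t:t-M}\big)$$

$$\mathbf{q}(t+\Delta t) = \mathbf{q}(t) + \dot{\mathbf{q}}(t)\,\Delta t = \begin{pmatrix}\mathbf{p}(t)\\ \dot{\mathbf{p}}(t)\end{pmatrix} + \Delta t \begin{pmatrix}\dot{\mathbf{p}}(t+\Delta t)\\ \ddot{\mathbf{p}}(t)\end{pmatrix} + \alpha_{\phi}\big(t, \mathbf{q}^{t:t-M}\big)$$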
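The discrete update above can be read as a semi-implicit Euler step: the velocity is advanced with the current acceleration, and the position is then advanced with the updated velocity. The following is a minimal sketch of one such step, not the authors' implementation. The names `semi_implicit_step` and `relax_to_velocity` are hypothetical, the toy acceleration stands in for the learned physics term, and treating the residual term as a plain offset added to the state after the physics step is an assumption made for illustration.

```python
from typing import Callable, Tuple

Vec = Tuple[float, float]


def semi_implicit_step(
    p: Vec,
    v: Vec,
    accel_fn: Callable[[Vec, Vec], Vec],
    dt: float,
    alpha: Tuple[Vec, Vec] = ((0.0, 0.0), (0.0, 0.0)),
) -> Tuple[Vec, Vec]:
    """Advance position p and velocity v by one time step of size dt."""
    a = accel_fn(p, v)  # deterministic acceleration at time t
    # Velocity first: v(t + dt) = v(t) + dt * a(t)
    v_det = (v[0] + dt * a[0], v[1] + dt * a[1])
    # Position uses the updated velocity: p(t + dt) = p(t) + dt * v(t + dt)
    p_det = (p[0] + dt * v_det[0], p[1] + dt * v_det[1])
    # Residual correction, assumed here to be an offset on (position, velocity)
    alpha_p, alpha_v = alpha
    p_next = (p_det[0] + alpha_p[0], p_det[1] + alpha_p[1])
    v_next = (v_det[0] + alpha_v[0], v_det[1] + alpha_v[1])
    return p_next, v_next


def relax_to_velocity(p: Vec, v: Vec) -> Vec:
    """Toy acceleration (hypothetical): relax toward velocity (1, 0) in 0.5 s."""
    target, tau = (1.0, 0.0), 0.5
    return ((target[0] - v[0]) / tau, (target[1] - v[1]) / tau)


# One step from rest at the origin with a zero residual.
p1, v1 = semi_implicit_step((0.0, 0.0), (0.0, 0.0), relax_to_velocity, dt=0.4)
print(p1, v1)
```

Passing the acceleration as a callable keeps the integrator independent of the particular force model, so a learned or hand-written model can be swapped in without touching the update itself.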
Dataset  S-GAN      SoPhie     PECNet     Y-net      NSP
ETH      0.81/1.52  0.70/1.43  0.54/0.87  0.28/0.33  0.25/0.24
Hotel    0.72/1.61  0.76/1.67  0.18/0.24  0.10/0.14  0.09/0.13
Univ     0.60/1.26  0.54/1.24  0.35/0.60  0.24/0.41  0.21/0.38
Zara1    0.34/0.69  0.30/0.63  0.22/0.39  0.17/0.27  0.16/0.27
Zara2    0.42/0.84  0.38/0.78  0.17/0.30  0.13/0.22  0.12/0.20
AVG      0.58/1.18  0.54/1.15  0.29/0.48  0.18/0.27  0.17/0.24
SDD      27.2/41.4  16.3/29.4  10.0/15.9  7.9/11.9   6.5/10.6
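Prediction accuracy on public datasets, reported as ADE/FDE (lower is better). ETH, Hotel, Univ, Zara1 and Zara2 are in meters and AVG is their mean; SDD is in pixels.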
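For readers unfamiliar with the two metrics, ADE is the mean Euclidean distance between predicted and ground-truth positions over all predicted time steps, and FDE is that distance at the final step only. The sketch below computes both for a single trajectory. The function names and the toy trajectories are illustrative assumptions, and it does not reproduce any sampling protocol used to obtain the numbers in the table.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def ade_fde(pred: List[Point], truth: List[Point]) -> Tuple[float, float]:
    """Return (ADE, FDE) for one predicted trajectory against its ground truth."""
    if not pred or len(pred) != len(truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    # Per-step Euclidean displacement between prediction and ground truth
    dists = [math.dist(a, b) for a, b in zip(pred, truth)]
    return sum(dists) / len(dists), dists[-1]


def relative_improvement(baseline: float, ours: float) -> float:
    """Fractional error reduction relative to a baseline (0.1 means 10%)."""
    return (baseline - ours) / baseline


# Toy example: the prediction drifts away from the truth by 0, 1 and 2 units.
pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = ade_fde(pred, truth)
print(ade, fde)

# AVG ADE entries from the table: Y-net 0.18 versus NSP 0.17.
print(relative_improvement(0.18, 0.17))
```

Applied to the AVG row, the relative ADE reduction from 0.18 to 0.17 comes out at roughly 5.56%.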
Figure panels: "Interpretability of Prediction" and "Collisions in unseen scenarios". Red is the observed trajectory, green is our prediction, and black is the ground truth. Blue marks pedestrians. $F_{goal}$, $F_{col}$ and $F_{env}$ are shown as yellow, blue and black arrows.
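The arrows in the figure split a prediction into a goal term, a collision term and an environment term. As a rough illustration of how such a decomposition can be inspected, the sketch below uses textbook social-force-style terms: a relaxation toward a desired velocity pointing at the goal, and exponentially decaying repulsion from neighbors and obstacle points. These formulas, the function names and all constants (`desired_speed`, `tau`, `strength`, `scale`) are assumptions chosen for illustration; they are not the force definitions or learned parameters of NSP.

```python
import math
from typing import Dict, List, Tuple

Vec = Tuple[float, float]


def goal_force(p: Vec, v: Vec, goal: Vec,
               desired_speed: float = 1.3, tau: float = 0.5) -> Vec:
    """Relax the current velocity toward desired_speed in the goal direction."""
    dx, dy = goal[0] - p[0], goal[1] - p[1]
    d = math.hypot(dx, dy)
    if d == 0.0:  # already at the goal: only damp the velocity
        return (-v[0] / tau, -v[1] / tau)
    return ((desired_speed * dx / d - v[0]) / tau,
            (desired_speed * dy / d - v[1]) / tau)


def repulsion(p: Vec, source: Vec,
              strength: float = 1.0, scale: float = 1.0) -> Vec:
    """Push p away from source with magnitude strength * exp(-distance / scale)."""
    dx, dy = p[0] - source[0], p[1] - source[1]
    d = math.hypot(dx, dy)
    if d == 0.0:  # direction undefined when the points coincide
        return (0.0, 0.0)
    m = strength * math.exp(-d / scale)
    return (m * dx / d, m * dy / d)


def decompose(p: Vec, v: Vec, goal: Vec,
              neighbors: List[Vec], obstacles: List[Vec]) -> Dict[str, Vec]:
    """Return the three force components separately so each can be drawn."""
    col = [repulsion(p, n) for n in neighbors]
    env = [repulsion(p, o) for o in obstacles]
    return {
        "goal": goal_force(p, v, goal),
        "col": (sum(f[0] for f in col), sum(f[1] for f in col)),
        "env": (sum(f[0] for f in env), sum(f[1] for f in env)),
    }


# A pedestrian at the origin, goal to the right, one neighbor directly ahead
# and one obstacle point below.
forces = decompose((0.0, 0.0), (0.0, 0.0), (5.0, 0.0),
                   neighbors=[(1.0, 0.0)], obstacles=[(0.0, -2.0)])
for name, f in forces.items():
    print(name, f)
```

Keeping the components separate, rather than returning only their sum, is what makes this kind of visualization possible: each arrow can be drawn and compared on its own.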