Xianyuan Zhan

Tsinghua University | TH

Assistant Professor

About

50
Publications
19,739
Reads
1,090
Citations
Introduction
Offline reinforcement learning, data-driven methods for transportation applications
Additional affiliations
August 2011 - present
Purdue University
Position
  • Research Assistant


Publications (50)
Preprint
Full-text available
We study the problem of offline Imitation Learning (IL) where an agent aims to learn an optimal expert behavior policy without additional online environment interactions. Instead, the agent is provided with a supplementary offline dataset from suboptimal behaviors. Prior works that address this problem either require that expert data occupies the m...
Preprint
Full-text available
Contrastive learning (CL) has recently been applied to adversarial learning tasks. Such practice considers adversarial samples as additional positive views of an instance, and by maximizing their agreements with each other, yields better adversarial robustness. However, this mechanism can be potentially flawed, since adversarial perturbations may c...
Preprint
Full-text available
Offline imitation learning (IL) is a powerful method to solve decision-making problems from expert demonstrations without reward labels. Existing offline IL methods suffer from severe performance degeneration under limited expert data due to covariate shift. Including a learned dynamics model can potentially improve the state-action space coverage...
Conference Paper
Recent offline reinforcement learning (RL) studies have made considerable progress toward making RL usable in real-world systems by learning policies from pre-collected datasets without environment interaction. Unfortunately, existing offline RL methods still face many practical challenges in real-world system control tasks, such as computational restric...
Preprint
Full-text available
Learning effective reinforcement learning (RL) policies to solve real-world complex tasks can be quite challenging without a high-fidelity simulation environment. In most cases, we are only given imperfect simulators with simplified dynamics, which inevitably lead to severe sim-to-real gaps in RL policy learning. The recently emerged field of offli...
Article
Full-text available
Optimizing the combustion efficiency of a thermal power generating unit (TPGU) is a highly challenging and critical task in the energy industry. We develop a new data-driven AI system, namely DeepThermal, to optimize the combustion control strategy for TPGUs. At its core is a new model-based offline reinforcement learning (RL) framework, called MO...
Article
Full-text available
We study the problem of safe offline reinforcement learning (RL), where the goal is to learn a policy that maximizes long-term reward while satisfying safety constraints given only offline data, without further interaction with the environment. This problem is more appealing for real-world RL applications, in which data collection is costly or dangerous....
Conference Paper
Full-text available
We study the problem of offline Imitation Learning (IL) where an agent aims to learn an optimal expert behavior policy without additional online environment interactions. Instead, the agent is provided with a supplementary offline dataset from suboptimal behaviors. Prior works that address this problem either require that expert data occupies the m...
Preprint
Full-text available
In offline reinforcement learning (RL), one issue detrimental to policy learning is the error accumulation of the deep Q function in out-of-distribution (OOD) areas. Unfortunately, existing offline RL methods are often over-conservative, inevitably hurting generalization performance outside the data distribution. In our study, one interesting observation i...
Preprint
Full-text available
Heated debates continue over the best autonomous driving framework. The classic modular pipeline is widely adopted in the industry owing to its great interpretability and stability, whereas the fully end-to-end paradigm has demonstrated considerable simplicity and learnability along with the rise of deep learning. As a way of marrying the advantage...
Preprint
Full-text available
End-to-end learning of robotic manipulation with high data efficiency is one of the key challenges in robotics. The latest methods that utilize human demonstration data and unsupervised representation learning have proven to be a promising direction for improving RL learning efficiency. The use of demonstration data also allows "warming-up" the RL policie...
Preprint
Most prior approaches to offline reinforcement learning (RL) utilize behavior regularization, typically augmenting existing off-policy actor-critic algorithms with a penalty measuring divergence between the policy and the offline data. However, these approaches lack guaranteed performance improvement over the behavior policy. In this work,...
Preprint
Heated debates continue over the best autonomous driving framework. The classic modular pipeline is widely adopted in the industry owing to its great interpretability and stability, whereas the end-to-end paradigm has demonstrated considerable simplicity and learnability along with the rise of deep learning. We introduce a new modularized end-to-en...
Preprint
Full-text available
We study the problem of safe offline reinforcement learning (RL), where the goal is to learn a policy that maximizes long-term reward while satisfying safety constraints given only offline data, without further interaction with the environment. This problem is more appealing for real-world RL applications, in which data collection is costly or dangerous....
Conference Paper
Full-text available
Purchase prediction is an essential task in both the online and offline retail industries, especially during major shopping festivals, when strong promotions boost consumption dramatically. It is important for merchants to forecast such sales surges and prepare accordingly. This is a challenging problem, as the purchase patterns during shopping fe...
Conference Paper
Full-text available
Accurate network-wide traffic state estimation is vital to many transportation operations and urban applications. However, existing methods often suffer from scalability issues when performing real-time inference at the city level, or are not robust enough under limited data. Currently, GPS trajectory data from probe vehicles has become a popular da...
Preprint
Full-text available
Detecting anomalies in large complex systems is a critical and challenging task. The difficulties arise from several aspects. First, collecting ground truth labels or prior knowledge for anomalies is hard in real-world systems, which often leads to limited or no anomaly labels in the dataset. Second, anomalies in large systems usually occur in a col...
Preprint
Full-text available
Offline reinforcement learning (RL) enables learning policies using pre-collected datasets without environment interaction, which provides a promising direction for making RL usable in real-world systems. Although recent offline RL studies have achieved much progress, existing methods still face many practical challenges in real-world system control...
Preprint
Full-text available
Thermal power generation plays a dominant role in the world's electricity supply. It consumes large amounts of coal worldwide, and causes serious air pollution. Optimizing the combustion efficiency of a thermal power generating unit (TPGU) is a highly challenging and critical task in the energy industry. We develop a new data-driven AI system, name...
Article
Full-text available
A key issue in understanding urban systems is characterizing the activity dynamics in a city: when, where, what, and how activities happen. To better understand urban activity dynamics, city-wide and multiday activity participation sequence data, namely activity chains, as well as suitable spatiotemporal models, are needed. The commonly us...
Article
License-plate recognition (LPR) data are an emerging data source in urban transportation systems and contain rich information. Large-scale LPR systems have seen rapid development in many parts of the world. However, limited by privacy considerations, LPR data are seldom available to the research community, which leads to a huge research gap in data-dr...
Article
The spatial correlation between urban sprawl and the underlying road network has long been recognized in urban studies. Accessibility to road networks is often considered an approximation for the measurement of human mobility, which is a key factor in determining potential urban sprawl in the future. Despite the close relationship between urban dev...
Article
Full-text available
In this paper, we used complex network analysis approaches to investigate topological coevolution over a century for three different urban infrastructure networks. We applied network analyses to a unique time-stamped network data set of an Alpine case study, representing the historical development of the town and its infrastructure over the past 10...
Preprint
Full-text available
Effective evacuation of residents in hurricane-affected areas is essential in reducing the overall damage and ensuring public safety. However, traffic flow patterns in evacuation contexts are far more complex than those of normal traffic and are usually accompanied by severe congestion due to the presence of evacuees. In such scenarios, agent-based simulati...
Article
Full-text available
Just as natural river networks are known to be globally self-similar, recent research has shown that human-built urban networks, such as road networks, are also functionally self-similar, and have fractal topology with power-law node-degree distributions (p(k) = a k^{-γ}). Here we show, for the first time, that other urban infrastructure networks (sanit...
Article
Full-text available
We propose a new framework for modeling the evolution of functional failures and recoveries in complex networks, with traffic congestion on road networks as the case study. Unlike conventional approaches, we transform the evolution of functional states into an equivalent dynamic structural process: dual-vertex splitting and coalescing emb...
Article
Full-text available
This paper develops a complementarity formulation for a multi-user class, simultaneous route and departure time choice dynamic user equilibrium (DUE) model. A path-based multiclass cell transmission model (mCTM) is embedded to propagate the traffic flow on the network. Heterogeneous user classes are incorporated in the new formulation and heterogen...
Article
Full-text available
Hard shoulder running (HSR) and queue warning are two active traffic management (ATM) strategies commonly used to alleviate highway traffic congestion. This study proposes an optimisation model for HSR operation in coordination with queue warning service under non-recurring traffic accident conditions using an updated cell transmission mode...
Article
Full-text available
Personal mobility carbon allowance (PMCA) schemes are designed to reduce carbon consumption from transportation networks. PMCA schemes influence the travel decision process of users and accordingly impact the system metrics including travel time and greenhouse gas (GHG) emissions. We develop a multi-user class dynamic user equilibrium model to eval...
Conference Paper
Full-text available
In this paper, we investigate the historical development of complex network topologies in urban water distribution networks (WDNs) and urban drainage networks (UDNs). The analyses were performed on time-stamped network data of an Alpine case study, which represent the evolution of the town and its infrastructure over the past 106 years. We use the...
Article
Full-text available
Household behavior and dynamic traffic flows are the two most important aspects of hurricane evacuations. However, current evacuation models largely overlook the complexity of household behavior leading to oversimplified traffic assignments and, as a result, inaccurate evacuation clearance times in the network. In this paper, we present a high fide...
Article
Full-text available
We examine high-resolution urban infrastructure data using every pipe for the water distribution network (WDN) and sanitary sewer network (SSN) in a large Asian city (≈4 million residents) to explore the structure as well as the spatial and temporal evolution of these infrastructure networks. Network data were spatially disaggregated into multiple...
Article
Full-text available
Traffic volume estimation at the city scale is an important problem for many transportation operations and urban applications. This paper proposes a hybrid framework that integrates both state-of-the-art machine learning techniques and well-established traffic flow theory to estimate citywide traffic volume. In addition to typical urban context f...
Article
Full-text available
Taxi service systems in big cities are immensely complex due to the interaction and self-organization between taxi drivers and passengers. An inefficient taxi service system leads to more empty trips for drivers and longer waiting time for passengers and introduces unnecessary congestion on the road network. In this paper, we investigate the effici...
Article
Full-text available
Social media check-in services have enabled people to share their activity-related choices, providing a new source of human activity and social network data. Geo-location data from these services offer new information for understanding social influence on individual choices. In this paper, we investigate the extent of social influence on...
Article
Full-text available
This paper presents a sequential method to estimate the direct transportation economic impacts (DTEI) of disruptions in highway networks used by trucks and cars. The main input is the Freight Analysis Framework version 3, the best public data source for truck movements in the United States. The method considers multi-commodity...
Article
Full-text available
This study investigates the Multivariate Poisson-lognormal (MVPLN) model that jointly models crash frequency and severity accounting for correlations. The ordinary univariate count models analyze crashes of different severity levels separately, ignoring the correlations among severity levels. The MVPLN model is capable of incorporating the general corr...
Article
Accurate estimation and prediction of urban link travel times are important for urban traffic operations and management. This paper develops a Bayesian mixture model to estimate short-term average urban link travel times using large-scale trip-based data with partial information. Unlike typical GPS trajectory data, trip-based data from taxis or ot...
Chapter
Full-text available
Understanding urban dynamics is of fundamental importance for the efficient operation and sustainable development of large cities. In this paper, we present a comprehensive study on characterizing urban dynamics using large-scale taxi data from New York City. The pick-up and drop-off locations are first analyzed separately to reveal the general...
Article
Full-text available
Emerging location-based services in social media tools such as Foursquare and Twitter are providing an unprecedented amount of public-generated data on human movements and activities. This novel data source contains valuable information (e.g., geo-location, time and date, type of places) on human activities. While the data is tremendously beneficia...
Article
Full-text available
Microblogs posted to Twitter after the tornado in Moore, Oklahoma, on May 20, 2013, were analyzed in this study. The potential of social media data was explored for the extraction of relevant and useful information during natural disasters and as an additional data source for better understanding of individual behavior during a crisis. Data records...
Article
Full-text available
Location-based check-in services enable individuals to share their activity-related choices, providing a new source of human activity data for researchers. In this paper, urban human mobility and activity patterns are analyzed using location-based data collected from social media applications (e.g. Foursquare and Twitter). We first characterize aggre...
