Jianmin Ji

Jianmin Ji
University of Science and Technology of China | USTC · School of Computer Science and Technology

About

105
Publications
14,941
Reads
How we measure 'reads'
A 'read' is counted each time someone views a publication summary (such as the title, abstract, and list of authors), clicks on a figure, or views or downloads the full-text. Learn more
813
Citations

Publications

Publications (105)
Preprint
Constructing precise 3D maps is crucial for the development of future map-based systems such as self-driving and navigation. However, generating these maps in complex environments, such as multi-level parking garages or shopping malls, remains a formidable challenge. In this paper, we introduce a participatory sensing approach that delegates map-bu...
Article
In multimodal perception systems, achieving precise extrinsic calibration between LiDAR and camera is of critical importance. However, the pre-calibrated extrinsic parameters may gradually drift during operation, leading to a decrease in the accuracy of the perception system. It is challenging to address this issue using methods based on artificial...
Preprint
Full-text available
Combining accurate geometry with rich semantics has been proven to be highly effective for language-guided robotic manipulation. Existing methods for dynamic scenes either fail to update in real-time or rely on additional depth sensors for simple scene editing, limiting their applicability in real-world. In this paper, we introduce MSGField, a repr...
Preprint
SLAM is a fundamental capability of unmanned systems, with LiDAR-based SLAM gaining widespread adoption due to its high precision. Current SLAM systems can achieve centimeter-level accuracy within a short period. However, there are still several challenges when dealing with largescale mapping tasks including significant storage requirements and dif...
Preprint
The integration of large language models (LLMs) with robotics has significantly advanced robots' abilities in perception, cognition, and task planning. The use of natural language interfaces offers a unified approach for expressing the capability differences of heterogeneous robots, facilitating communication between them, and enabling seamless tas...
Preprint
Multi-modal systems enhance performance in autonomous driving but face inefficiencies due to indiscriminate processing within each modality. Additionally, the independent feature learning of each modality lacks interaction, which results in extracted features that do not possess the complementary characteristics. These issue increases the cost of f...
Preprint
When planning for autonomous driving, it is crucial to consider essential traffic elements such as lanes, intersections, traffic regulations, and dynamic agents. However, they are often overlooked by the traditional end-to-end planning methods, likely leading to inefficiencies and non-compliance with traffic regulations. In this work, we endeavor t...
Preprint
The conditional diffusion model has been demonstrated as an efficient tool for learning robot policies, owing to its advancement to accurately model the conditional distribution of policies. The intricate nature of real-world scenarios, characterized by dynamic obstacles and maze-like structures, underscores the complexity of robot local navigation...
Article
The increasing demand for automation has led to a rise in the use of low-speed Autonomous guided vehicles (AGVs). However, AGVs rely on batteries for their power source, which limits their operational time and affects their overall performance. To optimize their energy usage and enhance their battery life, it is crucial to understand the power cons...
Preprint
Full-text available
Large Language Models (LLMs) possess extensive foundational knowledge and moderate reasoning abilities, making them suitable for general task planning in open-world scenarios. However, it is challenging to ground a LLM-generated plan to be executable for the specified robot with certain restrictions. This paper introduces CLMASP, an approach that c...
Article
Far-range perception is essential for intelligent transportation systems. The main challenge of far-range perception is due to the difficulty of performing accurate object detection and tracking under far distances (e.g., $> \text{150}\,\text{m}$ ) at low cost. To cope with such challenges, deploying millimeter wave Radars and high-definition (HD...
Preprint
Formal representations of traffic scenarios can be used to generate test cases for the safety verification of autonomous driving. However, most existing methods are limited in highway or highly simplified intersection scenarios due to the intricacy and diversity of traffic scenarios. In response, we propose Traffic Scenario Logic (TSL), which is a...
Preprint
Autonomous vehicles necessitate a delicate balance between safety, efficiency, and user preferences in trajectory planning. Existing traditional or learning-based methods face challenges in adequately addressing all these aspects. In response, this paper proposes a novel component termed the Logical Guidance Layer (LGL), designed for seamless integ...
Article
Nowadays, closed-set perception methods for autonomous driving perform well on datasets containing normal scenes. However, they still struggle to handle anomalies in the real world, such as unknown objects that have never been seen while training. The lack of public datasets to evaluate the model performance on anomaly and corner cases has hindered...
Article
When planning for autonomous driving, it is crucial to consider essential traffic elements such as lanes, intersections, traffic regulations, and dynamic agents. However, they are often overlooked by the traditional end-to-end planning methods, likely leading to inefficiencies and non-compliance with traffic regulations. In this work, we endeavor t...
Article
Roadside camera-based perception methods are in high demand for developing efficient vehicle-infrastructure collaborative perception systems. By focusing on object-level depth prediction, we explore the potential benefits of integrating environmental priors into such systems and propose a geometry-based roadside per-object depth estimation algorith...
Article
Recently, deploying sensors such as LiDARs on the roadside to monitor the passing traffic and assist autonomous vehicle perception has become popular. However, unlike autonomous vehicle systems, roadside sensor systems involve sensors from different subsystems, resulting in a lack of synchronization in both time and space between the sensors. Calib...
Conference Paper
This paper addresses the challenge of generating safety-critical scenarios with multiple adversarial vehicles for testing autonomous vehicles. Such scenarios must be plausible and collision-avoidable while resulting in a collision with the vehicle-under-test. However, the tremendous number of scenarios and the low ratio of plausible scenarios makes...
Article
Autonomous mobile robots have become popular in various applications coexisting with humans, which requires robots to navigate efficiently and safely in crowd environments with diverse pedestrians. Pedestrians may cooperate with the robot by avoiding it actively or ignoring the robot during their walking, while some pedestrians, denoted as non-coop...
Preprint
LiDAR and Radar are two complementary sensing approaches in that LiDAR specializes in capturing an object's 3D shape while Radar provides longer detection ranges as well as velocity hints. Though seemingly natural, how to efficiently combine them for improved feature representation is still unclear. The main challenge arises from that Radar data ar...
Article
Full-text available
The past decade has witnessed the rapid development of autonomous driving systems. However, it remains a daunting task to achieve full autonomy, especially when it comes to understanding the ever-changing, complex driving scenes. To alleviate the difficulty of perception, self-driving vehicles are usually equipped with a suite of sensors (e.g., cam...
Preprint
Full-text available
In this paper, we present the USTC FLICAR Dataset, which is dedicated to the development of simultaneous localization and mapping and precise 3D reconstruction of the workspace for heavy-duty autonomous aerial work robots. In recent years, numerous public datasets have played significant roles in the advancement of autonomous cars and unmanned aeri...
Preprint
Full-text available
Reliable localization is crucial for autonomous robots to navigate efficiently and safely. Some navigation methods can plan paths with high localizability (which describes the capability of acquiring reliable localization). By following these paths, the robot can access the sensor streams that facilitate more accurate location estimation results by...
Preprint
It is important for deep reinforcement learning (DRL) algorithms to transfer their learned policies to new environments that have different visual inputs. In this paper, we introduce Prompt based Proximal Policy Optimization ($P^{3}O$), a three-stage DRL algorithm that transfers visual representations from a target to a source environment by applyi...
Preprint
Recently, it has become popular to deploy sensors such as LiDARs on the roadside to monitor the passing traffic and assist autonomous vehicle perception. Unlike autonomous vehicle systems, roadside sensors are usually affiliated with different subsystems and lack synchronization both in time and space. Calibration is a key technology which allows t...
Preprint
The recent trend for multi-camera 3D object detection is through the unified bird's-eye view (BEV) representation. However, directly transforming features extracted from the image-plane view to BEV inevitably results in feature distortion, especially around the objects of interest, making the objects blur into the background. To this end, we propos...
Article
Full-text available
The recent trend of fusing complementary data from LiDARs and cameras for more accurate perception has made the extrinsic calibration between the two sensors critically important. Indeed, to align the sensors spatially for proper data fusion, the calibration process usually involves estimating the extrinsic parameters between them. Traditional LiDA...
Preprint
Tensor program tuning is a non-convex objective optimization problem, to which search-based approaches have proven to be effective. At the core of the search-based approaches lies the design of the cost model. Though deep learning-based cost models perform significantly better than other methods, they still fall short and suffer from the following...
Preprint
Full-text available
In this letter, we propose MAROAM, a millimeter wave radar-based SLAM framework, which employs a two-step feature selection process to build the global consistent map. Specifically, we first extract feature points from raw data based on their local geometric properties to filter out those points that violate the principle of millimeter-wave radar i...
Article
A recent trend is to combine multiple sensors ( i.e. , cameras, LiDARs and millimeter-wave Radars) to achieve robust multi-modal perception for autonomous systems such as self-driving vehicles. Although quite a few sensor fusion algorithms have been proposed, some of which are top-ranked on various leaderboards, a systematic study on how to integr...
Preprint
Simultaneous localization and mapping (SLAM) based on laser sensors has been widely adopted by mobile robots and autonomous vehicles. These SLAM systems are required to support accurate localization with limited computational resources. In particular, point cloud registration, i.e., the process of matching and aligning multiple LiDAR scans collecte...
Preprint
Full-text available
The recent trend of fusing complementary data from LiDARs and cameras for more accurate perception has made the extrinsic calibration between the two sensors critically important. Indeed, to align the sensors spatially for proper data fusion, the calibration process usually involves estimating the extrinsic parameters between them.Traditional LiDAR...
Preprint
Full-text available
Existing navigation policies for autonomous robots tend to focus on collision avoidance while ignoring human-robot interactions in social life. For instance, robots can pass along the corridor safer and easier if pedestrians notice them. Sounds have been considered as an efficient way to attract the attention of pedestrians, which can alleviate the...
Article
It has been well recognized that fusing the complementary information from depth-aware LiDAR point clouds and semantic-rich stereo images would benefit 3D object detection. Nevertheless, it is non-trivial to explore the inherently unnatural interaction between sparse 3D points and dense 2D pixels. To ease this difficulty, the recent approaches gene...
Preprint
Full-text available
The dependency tree of a natural language sentence can capture the interactions between semantics and words. However, it is unclear whether those methods which exploit such dependency information for semantic parsing can be combined to achieve further improvement and the relationship of those methods when they combine. In this paper, we examine thr...
Preprint
It has been well recognized that fusing the complementary information from depth-aware LiDAR point clouds and semantic-rich stereo images would benefit 3D object detection. Nevertheless, it is not trivial to explore the inherently unnatural interaction between sparse 3D points and dense 2D pixels. To ease this difficulty, the recent proposals gener...
Article
Full-text available
Autonomous and safe navigation in complex environments without collisions is particularly important for mobile robots. In this paper, we propose an end-to-end deep reinforcement learning method for mobile robot navigation with map-based obstacle avoidance. Using the experience collected in the simulation environment, a convolutional neural network...
Chapter
The dependency tree of a natural language sentence can capture the interactions between semantics and words. However, it is unclear whether those methods which exploit such dependency information for semantic parsing can be combined to achieve further improvement and the relationship of those methods when they combine. In this paper, we examine thr...
Preprint
Full-text available
It is challenging for a mobile robot to navigate through human crowds. Existing approaches usually assume that pedestrians follow a predefined collision avoidance strategy, like social force model (SFM) or optimal reciprocal collision avoidance (ORCA). However, their performances commonly need to be further improved for practical applications, wher...
Preprint
Full-text available
Deep reinforcement learning (DRL) algorithms have proven effective in robot navigation, especially in unknown environments, through directly mapping perception inputs into robot control commands. Most existing methods adopt uniform execution duration with robots taking commands at fixed intervals. As such, the length of execution duration becomes a...
Preprint
Full-text available
In this paper, we propose a map-based end-to-end DRL approach for three-dimensional (3D) obstacle avoidance in a partially observed environment, which is applied to achieve autonomous navigation for an indoor mobile robot using a depth camera with a narrow field of view. We first train a neural network with LSTM units in a 3D simulator of mobile ro...
Preprint
Full-text available
As cameras are increasingly deployed in new application domains such as autonomous driving, performing 3D object detection on monocular images becomes an important task for visual scene understanding. Recent advances on monocular 3D object detection mainly rely on the ``pseudo-LiDAR'' generation, which performs monocular depth estimation and lifts...
Article
Full-text available
One of the main obstacles to 3D semantic segmentation is the significant amount of endeavor required to generate expensive point-wise annotations for fully supervised training. To alleviate manual efforts, we propose GIDSeg, a novel approach that can simultaneously learn segmentation from sparse annotations via reasoning global-regional structures...
Preprint
Full-text available
In the past few years, we have witnessed rapid development of autonomous driving. However, achieving full autonomy remains a daunting task due to the complex and dynamic driving environment. As a result, self-driving cars are equipped with a suite of sensors to conduct robust and accurate environment perception. As the number and type of sensors ke...
Preprint
Full-text available
One of the main obstacles to 3D semantic segmentation is the significant amount of endeavor required to generate expensive point-wise annotations for fully supervised training. To alleviate manual efforts, we propose GIDSeg, a novel approach that can simultaneously learn segmentation from sparse annotations via reasoning global-regional structures...
Preprint
The input space of a neural network with ReLU-like activations is partitioned into multiple linear regions, each corresponding to a specific activation pattern of the included ReLU-like activations. We demonstrate that this partition exhibits the following encoding properties across a variety of deep learning models: (1) {\it determinism}: almost e...
Conference Paper
Full-text available
This paper proposes an end-to-end deep reinforcement learning approach for mobile robot navigation with dynamic obstacles avoidance. Using experience collected in a simulation environment, a convolutional neural network (CNN) is trained to predict proper steering actions of a robot from its egocentric local occupancy maps, which accommodate various...
Article
Full-text available
It is challenging to avoid obstacles safely and efficiently for multiple robots of different shapes in distributed and communication-free scenarios, where robots do not communicate with each other and only sense other robots’ positions and obstacles around them. Most existing multi-robot collision avoidance systems either require communication betw...
Preprint
This paper proposes an end-to-end deep reinforcement learning approach for mobile robot navigation with dynamic obstacles avoidance. Using experience collected in a simulation environment, a convolutional neural network (CNN) is trained to predict proper steering actions of a robot from its egocentric local occupancy maps, which accommodate various...
Preprint
Full-text available
Place recognition and loop closure detection are challenging for long-term visual navigation tasks. SeqSLAM is considered to be one of the most successful approaches to achieving long-term localization under varying environmental conditions and changing viewpoints. It depends on a brute-force, time-consuming sequential matching method. We propose M...
Preprint
Full-text available
Visual Place Recognition (VPR) is an important component in both computer vision and robotics applications, thanks to its ability to determine whether a place has been visited and where specifically. A major challenge in VPR is to handle changes of environmental conditions including weather, season and illumination. Most VPR methods try to improve...
Article
As more and more open knowledge resources become available, it is interesting to explore opportunities of enhancing autonomous agents' capacities by utilizing the knowledge in these resources, instead of hand-coding knowledge for agents. AÂ major challenge towards this goal lies in the translation of the open knowledge organized in multiple modes,...
Conference Paper
In this paper, we apply Fangzhen Lin’s methodology of computer aided theorem discovery to discover classes of strongly equivalent logic programs with negation as failure in the head. Specifically, with the help of computers, we discover exact conditions that capture the strong equivalence between small sets of rules, which have potential applicatio...