Xiaohan Zhang's research while affiliated with State University of New York and other places

Publications (17)

Preprint
Task planning systems have been developed to help robots use human knowledge (about actions) to complete long-horizon tasks. Most of them have been developed for "closed worlds" while assuming the robot is provided with complete world knowledge. However, the real world is generally open, and the robots frequently encounter unforeseen situations tha...
Preprint
Large language models (LLMs) have demonstrated remarkable zero-shot generalization abilities: state-of-the-art chatbots can provide plausible answers to many common questions that arise in daily life. However, so far, LLMs cannot reliably solve long-horizon planning problems. By contrast, classical planners, once a problem is given in a formatted w...
Preprint
Classical planning systems have shown great advances in utilizing rule-based human knowledge to compute accurate plans for service robots, but they face challenges due to the strong assumptions of perfect perception and action executions. To tackle these challenges, one solution is to connect the symbolic states and actions generated by classical p...
Article
Robots frequently need to perceive object attributes, such as red, heavy, and empty, using multimodal exploratory behaviors, such as look, lift, and shake. One possible way for robots to do so is to learn a classifier for each perceivable attribute given an exploratory behavior. Once the attribute classifiers are learned, they can be used by robots...
Preprint
Multi-object rearrangement is a crucial skill for service robots, and commonsense reasoning is frequently needed in this process. However, achieving commonsense arrangements requires knowledge about objects, which is hard to transfer to robots. Large language models (LLMs) are one potential source of this knowledge, but they do not naively capture...
Preprint
Given the current point-to-point navigation capabilities of autonomous vehicles, researchers are looking into complex service requests that require the vehicles to visit multiple points of interest. In this paper, we develop a layered planning framework, called GLAD, for complex service requests in autonomous urban driving. There are three layers f...
Preprint
Automated task planning algorithms have been developed to help robots complete complex tasks that require multiple actions. Most of those algorithms have been developed for "closed worlds" assuming complete world knowledge is provided. However, the real world is generally open, and the robots frequently encounter unforeseen situations that can pote...
Article
Task and motion planning (TAMP) algorithms have been developed to help robots plan behaviors in discrete and continuous spaces. Robots face complex real-world scenarios, where it is hardly possible to model all objects or their physical properties for robot planning (e.g., in kitchens or shopping centers). In this letter, we define a new object-cen...
Preprint
Task and motion planning (TAMP) algorithms aim to help robots achieve task-level goals, while maintaining motion-level feasibility. This paper focuses on TAMP domains that involve robot behaviors that take extended periods of time (e.g., long-distance navigation). In this paper, we develop a visual grounding approach to help robots probabilisticall...
Preprint
Task and motion planning (TAMP) algorithms have been developed to help robots plan behaviors in discrete and continuous spaces. Robots face complex real-world scenarios, where it is hardly possible to model all objects or their physical properties for robot planning (e.g., in kitchens or shopping centers). In this paper, we define a new object-cent...
Preprint
Mobile telepresence robots (MTRs) allow people to navigate and interact with a remote environment that is in a place other than the person's true location. Thanks to the recent advances in 360-degree vision, many MTRs are now equipped with an all-degree visual perception capability. However, people's visual field horizontally spans only about 120 d...
Preprint
Robots frequently need to perceive object attributes, such as "red," "heavy," and "empty," using multimodal exploratory actions, such as "look," "lift," and "shake." Robot attribute learning algorithms aim to learn an observation model for each perceivable attribute given an exploratory action. Once the attribute models are learned, they can be use...
Preprint
Autonomous vehicles need to plan at the task level to compute a sequence of symbolic actions, such as merging left and turning right, to fulfill people's service requests, where efficiency is the main concern. At the same time, the vehicles must compute continuous trajectories to perform actions at the motion level, where safety is the most importa...

Citations

... In the case where the object attributes refer to the object's function, they are then referred to as 0-order affordances [59]. Task and motion planning methods have been applied to object-centric perception while leveraging physics simulation [68]. Those methods focused on learning to improve the robots' perception capabilities. ...
... Initial versions of the morc and morc-itrs algorithms were introduced in two separate conference papers [12,13]. Both papers aimed to enable a robot manipulator to identify object attributes using multiple exploratory behaviors and the produced multimodal sensory data. ...
... Recent classical planning systems designed for robotics frequently use the Planning Domain Definition Language (PDDL) or answer set programming (ASP) as the underlying action language for the planners [18,19,20,21]. For example, researchers have used classical planning algorithms for sequencing actions for a mobile robot working on delivery tasks [22], reasoning about safe and efficient urban driving behaviors for autonomous vehicles [23], and planning actions for a team of mobile robots [24]. Task and motion planning (TAMP) is a hierarchical planning framework that combines classical planning in discrete spaces and robot motion planning in continuous spaces [25,26]. ...