Article

Generative Models for Grid-Based and Image-Based Pathfinding

Authors:
  • Artificial Intelligence Research Institute (AIRI)
To read the full-text of this research, you can request a copy directly from the authors.


... Subsequent to the transformation, the Quality Assessment phase ensures the augmented images meet high-quality standards, crucial for maintaining the natural appearance and utility of the images in training scenarios, as noted by [102] in their work on breast cancer diagnosis. Metadata Generation follows, documenting the augmentation details, which is essential for reproducibility and further analytical work [103]. ...
Preprint
Full-text available
In the past five years, research has shifted from traditional Machine Learning (ML) and Deep Learning (DL) approaches to leveraging Large Language Models (LLMs), including multimodality, for data augmentation to enhance generalization and combat overfitting in training deep convolutional neural networks. However, while existing surveys predominantly focus on ML and DL techniques or limited modalities (text or images), a gap remains in addressing the latest advancements and multi-modal applications of LLM-based methods. This survey fills that gap by exploring recent literature utilizing multimodal LLMs to augment image, text, and audio data, offering a comprehensive understanding of these processes. We outline various methods employed in LLM-based image, text, and speech augmentation, and discuss the limitations identified in current approaches. Additionally, we identify potential solutions to these limitations from the literature to enhance the efficacy of data augmentation practices using multimodal LLMs. This survey serves as a foundation for future research, aiming to refine and expand the use of multimodal LLMs in enhancing dataset quality and diversity for deep learning applications. (Surveyed Paper GitHub Repo: https://github.com/WSUAgRobotics/data-aug-multi-modal-llm. Keywords: LLM data augmentation, LLM text data augmentation, LLM image data augmentation, LLM speech data augmentation, audio augmentation, voice augmentation, chatGPT for data augmentation, DeepSeek R1 text data augmentation, DeepSeek R1 image augmentation, Image Augmentation using LLM, Text Augmentation using LLM, LLM data augmentation for deep learning applications)
Article
Full-text available
Bounded suboptimal heuristic search is a family of search algorithms capable of solving hard combinatorial problems, returning suboptimal solutions within a given bound. Recent machine learning approaches have been shown to learn accurate heuristic functions. Learned heuristics, however, are slow to compute; concretely, given a single search state s and a learned heuristic h, evaluating h(s) is typically very slow relative to expansion time, since state-of-the-art learned heuristics are implemented as neural networks. However, by using a Graphics Processing Unit (GPU), it is possible to compute heuristics using batched computation. Existing approaches to batched heuristic computation are specific to satisficing search and have not studied the problem in the context of bounded-suboptimal search. In this paper, we present K-Focal Search, a bounded suboptimal search algorithm that in each iteration expands K states from the FOCAL list and computes the learned heuristic values of the successors using a GPU. We experiment over the 24-puzzle and Rubik’s Cube using DeepCubeA, a very effective and inadmissible learned heuristic. Our results show that K-Focal Search benefits both from batched computation and from the diversity in the search introduced by its expansion strategy. Over standard Focal Search, K-Focal Search improves runtime by a factor of 6, expansions by up to three orders of magnitude, and finds better quality solutions, keeping the theoretical guarantees of Focal Search.
Article
Full-text available
A total of 248 UAV RGB images were taken in the summer of 2021 over a representative pistachio orchard in Spain (X: 341450.3, Y: 4589731.8; ETRS89/UTM zone 30N). It is a 2.03 ha plot, planted in 2016 with Pistacia vera L. cv. Kerman grafted on UCB rootstock, with a NE–SW orientation and a 7 × 6 m triangular planting pattern. The ground was kept free of any weeds that could affect image processing. The photos (provided in JPG format) were taken using a UAV DJI Phantom Advance quadcopter in two flight missions: one planned to take nadir images (β = 0°), and another to take oblique images (β = 30°), both at 55 metres above the ground. The aerial platform incorporates a DJI FC6310 RGB camera with a 20 megapixel sensor, a horizontal field of view of 84° and a mechanical shutter. In addition, GCPs (ground control points) were collected. Finally, a high-quality 3D photogrammetric reconstruction process was carried out to generate a 3D point cloud (provided in LAS, LAZ, OBJ and PLY formats), a DEM (digital elevation model) and an orthomosaic (both in TIF format). The interest in using remote sensing in precision agriculture is growing, but the availability of reliable, ready-to-work, downloadable datasets is limited. Therefore, this dataset could be useful for precision agriculture researchers interested in photogrammetric reconstruction who want to evaluate models for orthomosaic and 3D point cloud generation from UAV missions with changing flight parameters, such as camera angle.
Article
Full-text available
In bounded-suboptimal heuristic search, one attempts to find a solution that costs no more than a prespecified factor of optimal as quickly as possible. This is an important setting, as it admits faster-than-optimal solving while retaining some control over solution cost. In this paper, we investigate several new algorithms for bounded-suboptimal search, including novel variants of EES and DPS, the two most prominent previous proposals, and methods inspired by recent work in bounded-cost search that leverages uncertainty estimates of the heuristic. We perform what is, to our knowledge, the most comprehensive empirical comparison of bounded-suboptimal search algorithms to date, including both search and planning benchmarks, and we find that one of the new algorithms, a simple alternating queue scheme, significantly outperforms previous work.
Article
Full-text available
In this paper, we study the problem of visual indoor navigation to an object that is defined by its semantic category. Recent works have shown significant achievements in the end-to-end reinforcement learning approach and modular systems. However, both approaches need a big step forward to be robust and practically applicable. To solve the problem of insufficient exploration of the scenes and make exploration more semantically meaningful, we extend the standard task formulation and give the agent easily accessible landmarks in the form of room locations and their types. The availability of landmarks allows the agent to build a hierarchical policy structure and achieve a success rate of 63% on validation scenes in a photo-realistic Habitat simulator. In the hierarchy, the low level consists of separately trained RL skills, and the high level is a deterministic policy that decides which skill is needed at the moment. Also, in this paper, we show the possibility of transferring a trained policy to a real robot. After a bit of training on the reconstructed real scene, the robot shows up to 79% SPL when solving the task of navigating to an arbitrary object.
Article
Full-text available
Many discoveries of active surface processes on Mars have been made due to the availability of repeat high-resolution images from the High Resolution Imaging Science Experiment (HiRISE) onboard the Mars Reconnaissance Orbiter. HiRISE stereo images are used to make digital terrain models (DTMs) and orthorectified images (orthoimages). HiRISE DTMs and orthoimage time series have been crucial for advancing the study of active processes such as recurring slope lineae, dune migration, gully activity, and polar processes. We describe the process of making HiRISE DTMs, orthoimage time series, DTM mosaics, and the difference of DTMs, specifically using the ISIS/SOCET Set workflow. HiRISE DTMs are produced at a 1 and 2 m ground sample distance, with a corresponding estimated vertical precision of tens of cm and ∼1 m, respectively. To date, more than 6000 stereo pairs have been acquired by HiRISE and, of these, more than 800 DTMs and 2700 orthoimages have been produced and made available to the public via the Planetary Data System. The intended audiences of this paper are producers, as well as users, of HiRISE DTMs and orthoimages. We discuss the factors that determine the effective resolution, as well as the quality, precision, and accuracy of HiRISE DTMs, and provide examples of their use in time series analyses of active surface processes on Mars.
Article
Full-text available
This paper presents a self-improving lifelong learning framework for a mobile robot navigating in different environments. Classical static navigation methods require environment-specific in-situ system adjustment, e.g. from human experts, or may repeat their mistakes regardless of how many times they have navigated in the same environment. Having the potential to improve with experience, learning-based navigation is highly dependent on access to training resources, e.g. sufficient memory and fast computation, and is prone to forgetting previously learned capability, especially when facing different environments. In this work, we propose Lifelong Learning for Navigation (LLfN) which (1) improves a mobile robot's navigation behavior purely based on its own experience, and (2) retains the robot's capability to navigate in previous environments after learning in new ones. LLfN is implemented and tested entirely onboard a physical robot with a limited memory and computation budget.
Article
Full-text available
In recent years, deep learning and reinforcement learning methods have significantly improved mobile robots in such fields as perception, navigation, and planning. But there are still gaps in applying these methods to real robots due to the low computational efficiency of recent neural network architectures and their poor adaptability to the realities of robotic experiments. In this article, we consider an important task in mobile robotics: navigation to an object using an RGB-D camera. We develop a new neural network framework for robot control that is fast and resistant to possible noise in sensors and actuators. We propose an original integration of semantic segmentation, mapping, localization, and reinforcement learning methods to improve the effectiveness of exploring the environment, finding the desired object, and quickly navigating to it. We created a new HISNav dataset based on the Habitat virtual environment, which allowed us to use simulation experiments to pre-train the model and then upload it to a real robot. Our architecture is adapted to work in a real-time environment and fully implements modern trends in this area.
Article
Full-text available
Long-range indoor navigation requires guiding robots with noisy sensors and controls through cluttered environments along paths that span a variety of buildings. We achieve this with PRM-RL, a hierarchical robot navigation method in which reinforcement learning (RL) agents that map noisy sensors to robot controls learn to solve short-range obstacle avoidance tasks, and then sampling-based planners map where these agents can reliably navigate in simulation; these roadmaps and agents are then deployed on robots, guiding them along the shortest path where the agents are likely to succeed. In this article, we use probabilistic roadmaps (PRMs) as the sampling-based planner, and AutoRL as the RL method in the indoor navigation context. We evaluate the method with a simulation for kinematic differential drive and kinodynamic car-like robots in several environments, and on differential-drive robots at three physical sites. Our results show that PRM-RL with AutoRL is more successful than several baselines, is robust to noise, and can guide robots over hundreds of meters in the face of noise and obstacles in both simulation and on robots, including over 5.8 km of physical robot navigation.
Conference Paper
Full-text available
Focal search (FS) is a bounded-suboptimal search (BSS) variant of A*. Like A*, it uses an open list whose states are sorted in increasing order of their f-values. Unlike A*, it also uses a focal list containing all states from the open list whose f-values are no larger than a suboptimality factor times the smallest f-value in the open list. In this paper, we develop an anytime version of FS, called anytime FS (AFS), that is useful when deliberation time is limited. AFS finds a "good" solution quickly and refines it to better and better solutions if time allows. It does this refinement efficiently by reusing previous search efforts. On the theoretical side, we show that AFS is bounded suboptimal and that anytime potential search (ATPS/ANA*), a state-of-the-art anytime bounded-cost search (BCS) variant of A*, is a special case of AFS. In doing so, we bridge the gap between anytime search algorithms based on BSS and BCS. We also identify different properties of priority functions, used to sort the focal list, that may allow for efficient reuse of previous search efforts. On the experimental side, we demonstrate the usefulness of AFS for solving hard combinatorial problems, such as the generalized covering traveling salesman problem and the multi-agent pathfinding problem.
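The open/focal list mechanics described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `focal_search`, `grid_neighbors`, `manhattan_to`, and `h_focal` are hypothetical, the open list is a plain set rescanned on every iteration (a practical version would use priority queues), and the suboptimality guarantee assumes `h` is admissible.

```python
def grid_neighbors(grid):
    """Return a successor function for a 4-connected grid ('.' = free cell)."""
    def neighbors(state):
        r, c = state
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and grid[nr][nc] == ".":
                yield (nr, nc), 1.0  # unit edge cost (assumption)
    return neighbors


def manhattan_to(goal):
    """Admissible heuristic on a 4-connected unit-cost grid."""
    return lambda s: abs(s[0] - goal[0]) + abs(s[1] - goal[1])


def focal_search(start, goal, neighbors, h, h_focal, w=1.5):
    """Return (path, cost) with cost <= w * optimal, or None if unreachable."""
    g = {start: 0.0}
    parent = {start: None}
    open_set = {start}
    while open_set:
        # Smallest f-value in the open list.
        f_min = min(g[s] + h(s) for s in open_set)
        # Focal list: open states whose f-value is within a factor w of f_min.
        focal = [s for s in open_set if g[s] + h(s) <= w * f_min]
        # Expand the focal state that looks best under the secondary priority.
        s = min(focal, key=lambda x: (h_focal(x), g[x] + h(x)))
        if s == goal:
            path = []
            while s is not None:
                path.append(s)
                s = parent[s]
            return path[::-1], g[goal]
        open_set.remove(s)
        for t, cost in neighbors(s):
            new_g = g[s] + cost
            if t not in g or new_g < g[t]:
                g[t] = new_g
                parent[t] = s
                open_set.add(t)  # (re)open the state with its improved g-value
    return None
```

With `w=1.0` the focal list contains only states with the minimal f-value, so the sketch behaves like A*; larger `w` lets the secondary priority `h_focal` steer the search more freely.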
Article
Full-text available
Single-shot grid-based path finding is an important problem with applications in robotics, video games, etc. Typically, the AI community solves it with heuristic search methods (based on A* and its variations). In this work we present the results of preliminary studies on how neural networks can be applied to path planning on square grids, e.g. how well they can cope with path finding tasks by themselves within the well-known reinforcement learning problem statement. The conducted experiments show that an agent using the neural Q-learning algorithm robustly learns to reach the goal on small maps and demonstrates promising results on maps it has never seen before.
Article
Full-text available
Localization is the problem of estimating the location of an autonomous agent from an observation and a map of the environment. Traditional methods of localization, which filter the belief based on the observations, are sub-optimal in the number of steps required, as they do not decide the actions taken by the agent. We propose "Active Neural Localizer", a fully differentiable neural network that learns to localize accurately and efficiently. The proposed model incorporates ideas of traditional filtering-based localization methods, by using a structured belief of the state with multiplicative interactions to propagate belief, and combines it with a policy model to localize accurately while minimizing the number of steps required for localization. Active Neural Localizer is trained end-to-end with reinforcement learning. We use a variety of simulation environments for our experiments which include random 2D mazes, random mazes in the Doom game engine and a photo-realistic environment in the Unreal game engine. The results on the 2D environments show the effectiveness of the learned policy in an idealistic setting while results on the 3D environments demonstrate the model's capability of learning the policy and perceptual model jointly from raw-pixel based RGB observations. We also show that a model trained on random textures in the Doom environment generalizes well to a photo-realistic office space environment in the Unreal engine.
Article
Full-text available
Path planning constitutes one of the most crucial abilities an autonomous robot should possess, apart from Simultaneous Localization and Mapping algorithms (SLAM) and navigation modules. Path planning is the capability to construct safe and collision free paths from a point of interest to another. Many different approaches exist, which are tightly dependent on the map representation method (metric or feature-based). In this work four path planning algorithmic families are described, that can be applied on metric Occupancy Grid Maps (OGMs): Probabilistic RoadMaps (PRMs), Visibility Graphs (VGs), Rapidly exploring Random Trees (RRTs) and Space Skeletonization. The contribution of this work includes the definition of metrics for path planning benchmarks, actual benchmarks of the most common global path planning algorithms and an educated algorithm parameterization based on a global obstacle density coefficient.
Conference Paper
Full-text available
The performance of heuristic search (such as A*) based planners depends heavily on the quality of the heuristic function used to focus the search. These algorithms work fast and generate high-quality solutions, even for high-dimensional problems, as long as they are given a well-designed heuristic function. Consequently, the research in developing an efficient planner for a specific domain becomes the design of a good heuristic function. However, for many domains, it is hard to design a single heuristic function that captures all the complexities of the problem. Furthermore, it is hard to ensure that heuristics are admissible and consistent, which is necessary for A* like searches to provide guarantees on completeness and bounds on suboptimality. In this paper, we develop a novel heuristic search, called Multi-Heuristic A* (MHA*), that takes in multiple, arbitrarily inadmissible heuristic functions in addition to a single consistent heuristic, and uses all of them simultaneously to search for a complete and bounded suboptimal solution. This simplifies the design of heuristics and enables the search to effectively combine the guiding powers of different heuristic functions. We support these claims with experimental analysis on several domains ranging from inherently continuous domains such as full-body manipulation and navigation to inherently discrete domains such as the sliding tile puzzle.
Article
Full-text available
Many important problems are too difficult to solve optimally. A traditional approach to such problems is bounded suboptimal search, which guarantees solution costs within a user-specified factor of optimal. Recently, a complementary approach has been proposed: bounded-cost search, where solution cost is required to be below a user-specified absolute bound. In this paper, we show how bounded-cost search can incorporate inadmissible estimates of solution cost and solution length. This information has previously been shown to improve bounded suboptimal search and, in an empirical evaluation over five benchmark domains, we find that our new algorithms surpass the state-of-the-art in bounded-cost search as well, particularly for domains where action costs differ.
Article
Full-text available
Grids with blocked and unblocked cells are often used to represent terrain in robotics and video games. However, paths formed by grid edges can be longer than true shortest paths in the terrain since their headings are artificially constrained. We present two new correct and complete any-angle path-planning algorithms that avoid this shortcoming. Basic Theta* and Angle-Propagation Theta* are both variants of A* that propagate information along grid edges without constraining paths to grid edges. Basic Theta* is simple to understand and implement, fast and finds short paths. However, it is not guaranteed to find true shortest paths. Angle-Propagation Theta* achieves a better worst-case complexity per vertex expansion than Basic Theta* by propagating angle ranges when it expands vertices, but is more complex, not as fast and finds slightly longer paths. We refer to Basic Theta* and Angle-Propagation Theta* collectively as Theta*. Theta* has unique properties, which we analyze in detail. We show experimentally that it finds shorter paths than both A* with post-smoothed paths and Field D* (the only other version of A* we know of that propagates information along grid edges without constraining paths to grid edges) with a runtime comparable to that of A* on grids. Finally, we extend Theta* to grids that contain unblocked cells with non-uniform traversal costs and introduce variants of Theta* which provide different tradeoffs between path length and runtime.
Conference Paper
Full-text available
Bounded suboptimal search algorithms offer shorter solving times by sacrificing optimality and instead guaranteeing solution costs within a desired factor of optimal. Typically these algorithms use a single admissible heuristic both for guiding search and bounding solution cost. In this paper, we present a new approach to bounded suboptimal search, Explicit Estimation Search, that separates these roles, consulting potentially inadmissible information to determine search order and using admissible information to guarantee the cost bound. Unlike previous proposals, it successfully combines estimates of solution length and solution cost to predict which node will lead most quickly to a solution within the suboptimality bound. An empirical evaluation across six diverse benchmark domains shows that Explicit Estimation Search is competitive with the previous state of the art in domains with unit-cost actions and substantially outperforms previously proposed techniques for domains in which solution cost and length can differ.
Article
Heuristic search algorithms, e.g. A*, are the commonly used tools for pathfinding on grids, i.e. graphs of regular structure that are widely employed to represent environments in robotics, video games, etc. Instance-independent heuristics for grid graphs, e.g. Manhattan distance, do not take the obstacles into account, and thus the search led by such heuristics performs poorly in obstacle-rich environments. To this end, we suggest learning instance-dependent heuristic proxies that are supposed to notably increase the efficiency of the search. The first heuristic proxy we suggest learning is the correction factor, i.e. the ratio between the instance-independent cost-to-go estimate and the perfect one (computed offline at the training phase). Unlike learning the absolute values of the cost-to-go heuristic function, which was known before, learning the correction factor utilizes the knowledge of the instance-independent heuristic. The second heuristic proxy is the path probability, which indicates how likely a grid cell is to lie on the shortest path. This heuristic can be employed in the Focal Search framework as the secondary heuristic, allowing us to preserve the guarantees on the bounded sub-optimality of the solution. We learn both suggested heuristics in a supervised fashion with state-of-the-art neural networks containing attention blocks (transformers). We conduct a thorough empirical evaluation on a comprehensive dataset of planning tasks, showing that the suggested techniques i) reduce the computational effort of A* by up to a factor of 4 while producing solutions whose costs exceed those of the optimal solutions by less than 0.3% on average; ii) outperform the competitors, which include conventional techniques from heuristic search, i.e. weighted A*, as well as state-of-the-art learnable planners. The project web-page is: https://airi-institute.github.io/TransPath/.
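To make the correction factor concrete, the sketch below computes it for every reachable cell of a small grid. It is an illustration under assumptions not taken from the abstract: a 4-connected unit-cost grid, Manhattan distance as the instance-independent heuristic, breadth-first search for the perfect cost-to-go, and a factor of 1.0 at the goal cell (where both quantities are zero). The function names are hypothetical.

```python
from collections import deque


def true_cost_to_go(grid, goal):
    """Perfect cost-to-go to `goal` for every reachable free cell ('.'), via BFS."""
    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < len(grid) and 0 <= nc < len(grid[0])
                    and grid[nr][nc] == "." and (nr, nc) not in dist):
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return dist


def correction_factors(grid, goal):
    """Ratio of the Manhattan estimate to the perfect cost-to-go, per reachable cell."""
    h_star = true_cost_to_go(grid, goal)
    cf = {}
    for (r, c), d in h_star.items():
        manhattan = abs(r - goal[0]) + abs(c - goal[1])
        # Convention (assumption): the goal cell gets factor 1.0 to avoid 0/0.
        cf[(r, c)] = 1.0 if d == 0 else manhattan / d
    return cf
```

Because Manhattan distance never overestimates on such a grid, every factor lies in (0, 1]; values well below 1 mark cells where obstacles force a detour. Such a map would serve as the supervised training target, and dividing the instance-independent estimate by a predicted factor gives back an instance-dependent cost-to-go estimate.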
Article
Language models (LMs) have demonstrated their capability in possessing commonsense knowledge of the physical world, a crucial aspect of performing tasks in everyday life. However, it remains unclear whether they have the capacity to generate grounded, executable plans for embodied tasks. This is a challenging task as LMs lack the ability to perceive the environment through vision and feedback from the physical environment. In this paper, we address this important research question and present the first investigation into the topic. Our novel problem formulation, named G-PlanET, inputs a high-level goal and a data table about objects in a specific environment, and then outputs a step-by-step actionable plan for a robotic agent to follow. To facilitate the study, we establish an evaluation protocol and design a dedicated metric, KAS, to assess the quality of the plans. Our experiments demonstrate that the use of tables for encoding the environment and an iterative decoding strategy can significantly enhance the LMs' ability in grounded planning. Our analysis also reveals interesting and non-trivial findings.
Article
Potential Search (PS) is an algorithm that is designed to solve bounded cost search problems. In this paper, we modify PS to work within the framework of bounded suboptimal search and introduce Dynamic Potential Search (DPS). DPS uses the idea of PS but modifies the bound to be the product of the minimal f-value in OPEN and the required suboptimal bound. We study DPS and its attributes. We then experimentally compare DPS to WA* and to EES on a variety of domains and study parameters that affect the behavior of these algorithms. In general, we show that in domains with unit edge costs (e.g., many standard benchmarks) DPS significantly outperforms WA* and EES, but there are exceptions.
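The dynamic bound described above can be illustrated with a small node-selection sketch. It assumes the potential of a node takes the form (budget − g) / h, as Potential Search is usually presented, with the budget set to the suboptimality bound times the minimal f-value in OPEN; the function names and the dictionary representation of OPEN are hypothetical, and the abstract itself does not spell out the formula.

```python
import math


def dps_priority(g, h, f_min, bound):
    """Potential of a node under the dynamic budget bound * f_min (assumed form)."""
    if h == 0:
        return math.inf  # a node with zero estimated cost-to-go is preferred outright
    return (bound * f_min - g) / h


def pick_next(open_nodes, bound):
    """open_nodes maps a node name to its (g, h) pair; return the name to expand next."""
    f_min = min(g + h for g, h in open_nodes.values())
    return max(open_nodes, key=lambda n: dps_priority(*open_nodes[n], f_min, bound))
```

For example, with OPEN containing `a` (g=2, h=4) and `b` (g=5, h=2) and a bound of 1.5, the budget is 1.5 × 6 = 9, the potentials are 1.75 and 2.0, and `b` is expanded first; with a bound of 1.0 the budget shrinks to 6 and `a` is chosen instead.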
Article
Work in machine learning has grown tremendously in the past years, but has had little to no impact on optimal search approaches. This paper looks at challenges in using deep learning as a part of optimal search, including what is feasible using current public frameworks, and what barriers exist for further adoption. The primary contribution of the paper is to show how to learn admissible heuristics through supervised learning from an existing heuristic. Several approaches are described, with the most successful approach being based on learning a heuristic as a classifier and then adjusting the quantile used with the classifier to ensure heuristic admissibility, which is required for optimal solutions. A secondary contribution is a description of the Batch A* algorithm, which can batch evaluations for more efficient use by the GPU. While ANNs can effectively learn heuristics that produce smaller search trees than alternate compression approaches, there still exists a time overhead when compared to efficient C++ implementations. This point of evaluation points out a challenge for future work.
Article
Learned heuristics, though inadmissible, can provide very good guidance for bounded-suboptimal search. Given a single search state s and a learned heuristic h, evaluating h(s) is typically very slow relative to expansion time, since state-of-the-art learned heuristics are implemented as neural networks. However, by using a Graphics Processing Unit (GPU), it is possible to compute heuristics using batched computation. Existing approaches to batched heuristic computation are specific to satisficing search and have not studied the problem in the context of bounded-suboptimal search. In this paper, we present K-Focal Search, a bounded suboptimal search algorithm that in each iteration expands K nodes from the FOCAL list and computes the learned heuristic values of the successors using a GPU. We experiment over the Rubik's Cube domain using DeepCubeA, a very effective inadmissible learned heuristic. Our results show that K-Focal Search benefits both from batched computation and from the diversity in the search introduced by its expansion strategy. Over standard FS, it improves runtime by a factor of 6, expansions by up to three orders of magnitude, and finds better solutions, keeping the theoretical guarantees of Focal Search.
Article
Resorting to certain heuristic functions to guide the search, the computational efficiency of prevailing path planning algorithms, such as A*, D*, and their variants, is solely determined by how well the heuristic function approximates the true path cost. In this study, we propose a novel approach to learning heuristic functions using a deep neural network (DNN) to improve the computational efficiency. Even though DNNs have been widely used for object segmentation, natural language processing, and perception, their role in helping to solve path planning problems has not been well investigated. This work shows how DNNs can be applied to path planning and what kind of loss functions are suitable for learning such a heuristic. Our preliminary results show that an appropriately designed and trained DNN can learn a heuristic that effectively guides prevailing path planning algorithms.
Article
A key challenge in satisficing planning is to use multiple heuristics within one heuristic search. An aggregation of multiple heuristic estimates, for example by taking the maximum, has the disadvantage that bad estimates of a single heuristic can negatively affect the whole search. Since the performance of a heuristic varies from instance to instance, approaches such as algorithm selection can be successfully applied. In addition, alternating between multiple heuristics during the search makes it possible to use all heuristics equally and improve performance. However, all these approaches ignore the internal search dynamics of a planning system, which can help to select the most useful heuristics for the current expansion step. We show that dynamic algorithm configuration can be used for dynamic heuristic selection which takes into account the internal search dynamics of a planning system. Furthermore, we prove that this approach generalizes over existing approaches and that it can exponentially improve the performance of the heuristic search. To learn dynamic heuristic selection, we propose an approach based on reinforcement learning and show empirically that domain-wise learned policies, which take the internal search dynamics of a planning system into account, can exceed existing approaches.
Article
This work grapples with the challenge of directing autonomous decision making by planetary rovers conducting science investigations. Most of the related work addresses obstacle avoidance and traversability, while less work seeks to directly improve science yield. This research develops a comprehensive approach for planetary rovers that accounts for both science investigation and mobility risk. We present a probabilistic framework that quantifies these two attributes of rover exploration and generates paths that constrain risk while increasing science return. Specifically, science productivity is measured and improved using formal principles from information theory and statistical learning for decision making. Risk is estimated using a probabilistic model that predicts rover wheel slippage based on geometric and semantic information. Our method is evaluated in a simulation study using real Mars surface data that is relevant for both science and terrain investigations. Experimental analysis verifies the effectiveness of our approach.
Article
Transformer, first applied to the field of natural language processing, is a type of deep neural network mainly based on the self-attention mechanism. Thanks to its strong representation capabilities, researchers are looking at ways to apply transformer to computer vision tasks. In a variety of visual benchmarks, transformer-based models perform similar to or better than other types of networks such as convolutional and recurrent neural networks. Given its high performance and less need for vision-specific inductive bias, transformer is receiving more and more attention from the computer vision community. In this paper, we review these vision transformer models by categorizing them in different tasks and analyzing their advantages and disadvantages. The main categories we explore include the backbone network, high/mid-level vision, low-level vision, and video processing. We also include efficient transformer methods for pushing transformer into real device-based applications. Furthermore, we also take a brief look at the self-attention mechanism in computer vision, as it is the base component in transformer. Toward the end of this paper, we discuss the challenges and provide several further research directions for vision transformers.
Article
Image-to-image translation (I2I) aims to transfer images from a source domain to a target domain while preserving the content representations. I2I has drawn increasing attention and made tremendous progress in recent years because of its wide range of applications in many computer vision and image processing problems, such as image synthesis, segmentation, style transfer, restoration, and pose estimation. In this paper, we provide an overview of the I2I works developed in recent years. We will analyze the key techniques of the existing I2I works and clarify the main progress the community has made. Additionally, we will elaborate on the effect of I2I on the research and industry community and point out remaining challenges in related fields.
Article
In video games and robotics, one often discretizes a continuous 2D environment into a regular grid with blocked and unblocked cells and then finds shortest paths for the agents on the resulting grid graph. Shortest grid paths, of course, are not necessarily true shortest paths in the continuous 2D environment. In this article, we therefore study how much longer a shortest grid path can be than a corresponding true shortest path on all regular grids with blocked and unblocked cells that tessellate continuous 2D environments. We study 5 different vertex connectivities that result from both different tessellations and different definitions of the neighbors of a vertex. Our path-length analysis yields either tight or asymptotically tight worst-case bounds in a unified framework. Our results show that the percentage by which a shortest grid path can be longer than a corresponding true shortest path decreases as the vertex connectivity increases. Our path-length analysis is topical because it determines the largest path-length reduction possible for any-angle path-planning algorithms (and thus their benefit), a class of path-planning algorithms in artificial intelligence and robotics that has become popular.
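For intuition about the kind of gap the abstract above describes, the following is a minimal sketch restricted to obstacle-free square grids, where the shortest 4-neighbor path length equals the Manhattan distance and the shortest 8-neighbor path length equals the octile distance. It is not the article's analysis, which covers blocked cells and other tessellations; the function names and the search range `n` are assumptions made for illustration.

```python
import math

def manhattan(dx, dy):
    # Shortest path length on an obstacle-free 4-neighbor square grid.
    return abs(dx) + abs(dy)

def octile(dx, dy):
    # Shortest path length on an obstacle-free 8-neighbor square grid:
    # diagonal moves cost sqrt(2), axis-aligned moves cost 1.
    dx, dy = abs(dx), abs(dy)
    return max(dx, dy) + (math.sqrt(2) - 1) * min(dx, dy)

def worst_ratio(grid_dist, n=200):
    # Largest grid-distance / straight-line-distance ratio over all
    # integer offsets (dx, dy) with 0 <= dx, dy <= n, excluding (0, 0).
    worst = 1.0
    for dx in range(n + 1):
        for dy in range(n + 1):
            if dx == 0 and dy == 0:
                continue
            worst = max(worst, grid_dist(dx, dy) / math.hypot(dx, dy))
    return worst

ratio4 = worst_ratio(manhattan)
ratio8 = worst_ratio(octile)
print(f"4-neighbor worst ratio: {ratio4:.4f}")
print(f"8-neighbor worst ratio: {ratio8:.4f}")
```

In this restricted setting the 4-neighbor ratio peaks along the diagonal at sqrt(2), while the 8-neighbor ratio stays below sqrt(4 - 2*sqrt(2)), roughly 1.08, which is consistent with the abstract's statement that the gap shrinks as vertex connectivity increases.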
Article
This work aims at developing an efficient path planning algorithm for the driving objective of a Martian day (sol) that can take into account terrain information for application to the proposed Mars Sample Return (MSR) mission. To prepare the planning process for one sol (i.e., with a limited time allocated to driving), a map of expected rover velocity over a chosen area is constructed, obtained by combining terrain classes, rock abundance and slope at that location. The planning phase starts offline by computing several paths that can be traversed in one sol (i.e., a few hours), which will later provide suitable options to the rover if replanning is necessary due to unexpected mobility difficulties. Online, the rover gains information about its environment as it drives (via slip monitoring and/or instrument deployment) and updates the map locally if major discrepancies are found. If an update is made, the remaining driving time along the different options is recalculated and the most efficient path is chosen. The online process is repeated until the rover has reached its daily goal. When simulated on different maps of expected rover speed at Gusev Crater, Mars, the algorithm correctly captured changes of terrain initially not mapped, and rerouted the rover to a more efficient path only when necessary, in which case it effectively complied with the time constraint to reach the goal.
Article
Grid path planning is an important problem in AI. Its understanding has been key for the development of autonomous navigation systems. An interesting and rather surprising fact about the vast literature on this problem is that only a few neighborhoods have been used when evaluating these algorithms. Indeed, only the 4- and 8-neighborhoods are usually considered, and rarely the 16-neighborhood. This paper describes three contributions that enable the construction of effective grid path planners for extended 2^k-neighborhoods; that is, neighborhoods that admit 2^k neighbors per state, where k is a parameter. First, we provide a simple recursive definition of the 2^k-neighborhood in terms of the 2^(k-1)-neighborhood. Second, we derive distance functions, for any k ≥ 2, which allow us to propose admissible heuristics that are perfect for obstacle-free grids, which generalize the well-known Manhattan and Octile distances. Third, we define the notion of canonical path for the 2^k-neighborhood; this allows us to incorporate our neighborhoods into two versions of A*, namely Canonical A* and Jump Point Search (JPS), whose performance, we show, scales well when increasing k. Our empirical evaluation shows that, when increasing k, the cost of the solution found improves substantially. Used with the 2^k-neighborhood, Canonical A* and JPS, in many configurations, are also superior to the any-angle path planner Theta* both in terms of solution quality and runtime. Our planner is competitive with one implementation of the any-angle path planner ANYA in some configurations. Our main practical conclusion is that standard, well-understood grid path planning technology may provide an effective approach to any-angle grid path planning.
Chapter
2D path planning in a static environment is a well-known problem, and one of the common ways to solve it is to (1) represent the environment as a grid and (2) perform a heuristic search for a path on it. At the same time, a 2D grid closely resembles a digital image, which suggests an appealing idea: treat the problem as an image generation task and solve it using recent advances in deep learning. In this work we attempt to apply a generative neural network as a path finder and report preliminary results that are convincing enough to claim that this direction of research is worth further exploration.
Article
We learn end-to-end point-to-point and path-following navigation behaviors that avoid moving obstacles. These policies receive noisy lidar observations and output robot linear and angular velocities. The policies are trained in small, static environments with AutoRL, a drop-in replacement for reinforcement learning (RL) that automates the search for deep RL reward and neural network architecture with large-scale hyper-parameter optimization. AutoRL first finds a reward that maximizes task completion, and then a neural network architecture that maximizes cumulative of the found reward. Empirical evaluations, both in simulation and on-robot, show that AutoRL policies do not suffer from the catastrophic forgetfulness that plagues many other deep reinforcement learning algorithms, generalize to new environments and moving obstacles, are robust to sensor, actuator, and localization noise, and can serve as robust building blocks for larger navigation tasks. Our path-following and point-to-point policies are respectively 60% and 20% more successful than comparison methods.
Conference Paper
Robotic motion planning problems are typically solved by constructing a search tree of valid maneuvers from a start to a goal configuration. Limited onboard computation and real-time planning constraints impose a limit on how large this search tree can grow. Heuristics play a crucial role in such situations by guiding the search towards potentially good directions and consequently minimizing search effort. Moreover, it must infer such directions in an efficient manner using only the information uncovered by the search up until that time. However, state of the art methods do not address the problem of computing a heuristic that explicitly minimizes search effort. In this paper, we do so by training a heuristic policy that maps the partial information from the search to decide which node of the search tree to expand. Unfortunately, naively training such policies leads to slow convergence and poor local minima. We present SaIL, an efficient algorithm that trains heuristic policies by imitating clairvoyant oracles - oracles that have full information about the world and demonstrate decisions that minimize search effort. We leverage the fact that such oracles can be efficiently computed using dynamic programming and derive performance guarantees for the learnt heuristic. We validate the approach on a spectrum of environments which show that SaIL consistently outperforms state of the art algorithms. Our approach paves the way forward for learning heuristics that demonstrate an anytime nature - finding feasible solutions quickly and incrementally refining it over time. Open-source code and details can be found here: https://goo.gl/YXkQAC
Conference Paper
We introduce the value iteration network (VIN): a fully differentiable neural network with a 'planning module' embedded within. VINs can learn to plan, and are suitable for predicting outcomes that involve planning-based reasoning, such as policies for reinforcement learning. Key to our approach is a novel differentiable approximation of the value-iteration algorithm, which can be represented as a convolutional neural network, and trained end-to-end using standard backpropagation. We evaluate VIN based policies on discrete and continuous path-planning domains, and on a natural-language based search task. We show that by learning an explicit planning computation, VIN policies generalize better to new, unseen domains. This paper is a significantly abridged and IJCAI audience targeted version of the original NIPS 2016 paper with the same title, available here: https://arxiv.org/abs/1602.02867
Article
The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.0 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.
Article
In this paper, we study the automatic ground map building and efficient path planning in unmanned aerial/ground vehicles (UAV/UGV) cooperative systems. Using the UAV, a ground image can be obtained from the aerial vision, which is then processed with image denoising, image correction, and obstacle recognition to construct the ground map automatically. Image correction is used to help the UGV improve the recognition accuracy of obstacles. Based on the constructed ground map, a hybrid path planning algorithm is proposed to optimize the planned path. A genetic algorithm is used for global path planning, and a local rolling optimization (LRO) is used to constantly optimize the results of the genetic algorithm. Experiments are performed to evaluate the performance of the proposed schemes. The evaluation results show that our proposed approach can obtain a much less costly path compared to traditional path planning algorithms such as the genetic algorithm and the A* algorithm, and can run in real time to support the UAV/UGV systems.
Article
Designing intelligent and robust autonomous navigation systems remains a great challenge in mobile robotics. Inverse reinforcement learning (IRL) offers an efficient learning technique from expert demonstrations to teach robots how to perform specific tasks without manually specifying the reward function. Most existing IRL algorithms assume the expert policy to be optimal and deterministic, and are applied to experiments with relatively small state spaces. However, in autonomous navigation tasks, the state spaces are frequently large and demonstrations can hardly visit all the states. Meanwhile, the expert policy may be non-optimal and stochastic. In this paper, we focus on IRL with large-scale and high-dimensional state spaces by introducing a neural network to generalize the expert's behaviors to unvisited regions of the state space; the network also provides an explicit policy representation, even for a stochastic expert policy. An efficient and convenient algorithm, Neural Inverse Reinforcement Learning (NIRL), is proposed. Experimental results on simulated autonomous navigation tasks show that a mobile robot using our approach can successfully navigate to the target position without colliding with unpredicted obstacles, largely reduces the learning time, and generalizes well to undemonstrated states. These results indicate that autonomous navigation ability can be transferred from limited demonstrations to completely unknown tasks.
Article
We introduce the value iteration network: a fully differentiable neural network with a 'planning module' embedded within. Value iteration networks are suitable for making predictions about outcomes that involve planning-based reasoning, such as predicting a desired trajectory from an observation of a map. Key to our approach is a novel differentiable approximation of the value-iteration algorithm, which can be represented as a convolutional neural network, and trained end-to-end using standard backpropagation. We evaluate our value iteration networks on the task of predicting optimal obstacle-avoiding trajectories from an image of a landscape, both on synthetic data, and on challenging raw images of the Mars terrain.
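As background for the abstract above, here is a minimal sketch of classical, tabular value iteration on a small deterministic grid. It is not a value iteration network: nothing is learned or differentiable. It only shows the local "look at the neighbors, take the max" update that the abstract describes as representable by a convolutional network. The grid encoding ('#' blocked, '.' free), the unit step cost, and the function name are assumptions made for illustration.

```python
def value_iteration(grid, goal, iterations=50, step_cost=1.0):
    """Tabular value iteration on a 4-connected grid.

    grid: list of equal-length strings, '#' = blocked, '.' = free.
    goal: (row, col) of the goal cell, whose value is fixed at 0.
    Returns a 2D list of values; each value is the negated number of
    steps to the goal, or -inf where the goal is unreachable or blocked.
    """
    rows, cols = len(grid), len(grid[0])
    neg_inf = float("-inf")
    value = [[neg_inf] * cols for _ in range(rows)]
    value[goal[0]][goal[1]] = 0.0
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]

    for _ in range(iterations):
        new_value = [row[:] for row in value]
        for r in range(rows):
            for c in range(cols):
                if grid[r][c] == "#" or (r, c) == goal:
                    continue
                # Bellman backup: best neighbor value minus the step cost.
                best = neg_inf
                for dr, dc in moves:
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != "#":
                        best = max(best, value[nr][nc] - step_cost)
                new_value[r][c] = best
        value = new_value
    return value

grid = [
    "....",
    ".##.",
    "....",
]
values = value_iteration(grid, goal=(2, 3))
for row in values:
    print(row)
```

A greedy policy can be read off the result by moving, from any free cell, to the neighbor with the highest value.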
Conference Paper
Automatic navigation of mobile systems first requires a path network for robot or vehicle motion, so path planning is an important task in autonomous vehicle systems. This paper presents a method for constructing the shortest path to support vehicle auto-navigation in outdoor environments. The method uses online road map images to estimate not only the shape of the road network but also the directed road network, which cannot be estimated from aerial or satellite images alone. The proposed method has three stages. First, a raw path network for motion is detected from the road map images. Second, the path network is converted to global coordinates, which is convenient for the online auto-navigation task. Third, the shortest path for motion is estimated using the A* algorithm. The experimental results demonstrate the robustness and effectiveness of the method for path network estimation in large outdoor scenes.
Conference Paper
Grid-based path finding is required in many video games and virtual worlds to move agents. With both map sizes and the number of agents increasing, it is important to develop path finding algorithms that are efficient in memory and time. In this work, we present an algorithm called DBA* that uses a database of pre-computed paths to reduce the time to solve search problems. When evaluated using benchmark maps from Dragon Age™, DBA* requires less memory and time for search, and performs less pre-computation than comparable real-time search algorithms. Further, its suboptimality is less than 3%, which is better than the PRA* implementation used in Dragon Age™.
Article
Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers, 8x deeper than VGG nets but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers. The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
Article
The study of algorithms on grids has been widespread in a number of research areas. Grids are easy to implement and offer fast memory access. Because of their simplicity, they are used even in commercial video games. But, the evaluation of work on grids has been inconsistent between different papers. Many research papers use different problem sets, making it difficult to compare results between papers. Furthermore, the performance characteristics of each test set are not necessarily obvious. This has motivated the creation of a standard test set of maps and problems on the maps that are open for all researchers to use. In addition to creating these sets, we use a variety of metrics to analyze the properties of the test sets. The goal is that these test sets will be useful to many researchers, making experimental results more comparable across papers, and improving the quality of research on grid-based domains.
Article
Real-time heuristic search algorithms satisfy a constant bound on the amount of planning per action, independent of the problem size. These algorithms are useful when the amount of time or memory resources are limited, or a rapid response time is required. An example of such a problem is pathfinding in video games where numerous units may be simultaneously required to react promptly to a player's commands. Classic real-time heuristic search algorithms cannot be deployed due to their obvious state revisitation (“scrubbing”). Recent algorithms have improved performance by using a database of precomputed subgoals. However, a common issue is that the precomputation time can be large, and there is no guarantee that the precomputed data adequately cover the search space. In this paper, we present a new approach that guarantees coverage by abstracting the search space, using the same algorithm that performs the real-time search. It reduces the precomputation time via the use of dynamic programming. The new approach eliminates the learning component and the resultant “scrubbing.” Experimental results on maps of tens of millions of grid cells from Counter-Strike: Source and benchmark maps from Dragon Age: Origins show significantly faster execution times and improved optimality results compared to previous real-time algorithms.
Article
The paper introduces three extensions of the A* search algorithm which improve the search efficiency by relaxing the admissibility condition. 1) A*_ε employs an admissible heuristic function but invokes quicker termination conditions while still guaranteeing that the cost of the solution found will not exceed the optimal cost by a factor greater than 1 + ε. 2) R*_δ may employ heuristic functions which occasionally violate the admissibility condition, but guarantees that at termination the risk of missing the opportunity for further cost reduction is at most δ. 3) R*_{δ,ε} is a speedup version of R*_δ, combining the termination condition of A*_ε with the risk-admissibility condition of R*_δ. The Traveling Salesman problem was used as a test vehicle to examine the performances of the algorithms A*_ε and R*_δ. The advantages of A*_ε are shown to be significant in difficult problems, i.e., problems requiring a large number of expansions due to the presence of many subtours of roughly equal costs. The use of R*_δ is shown to produce a 4:1 reduction in search time with only a minor increase in final solution cost.
Conference Paper
In real world planning problems, time for deliberation is often limited. Anytime planners are well suited for these problems: they find a feasible solution quickly and then continually work on improving it until time runs out. In this paper we propose an anytime heuristic search, ARA*, which tunes its performance bound based on available search time. It starts by finding a suboptimal solution quickly using a loose bound, then tightens the bound progressively as time allows. Given enough time it finds a provably optimal solution. While improving its bound, ARA* reuses previous search efforts and, as a result, is significantly more efficient than other anytime search methods. In addition to our theoretical analysis, we demonstrate the practical utility of ARA* with experiments on a simulated robot kinematic arm and a dynamic path planning problem for an outdoor rover.
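The "loose bound first, then tighten" idea in the abstract above can be sketched with a simpler stand-in: weighted A* rerun from scratch with a decreasing inflation factor. This is not ARA* itself, because it does not reuse search effort between iterations, which the abstract names as ARA*'s key efficiency gain. The grid encoding, the schedule of inflation factors, and all function names are assumptions made for illustration.

```python
import heapq

def weighted_astar(grid, start, goal, eps):
    """Weighted A* on a 4-connected unit-cost grid with f = g + eps * h.

    h is the Manhattan distance. Each cell is expanded at most once.
    Returns the cost of the path found, or None if the goal is unreachable.
    """
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    g = {start: 0}
    open_heap = [(eps * h(start), start)]
    closed = set()
    while open_heap:
        _, cell = heapq.heappop(open_heap)
        if cell == goal:
            return g[cell]
        if cell in closed:
            continue  # stale heap entry for an already expanded cell
        closed.add(cell)
        r, c = cell
        for nxt in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            nr, nc = nxt
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc] == "#":
                continue
            if nxt in closed:
                continue
            new_g = g[cell] + 1
            if new_g < g.get(nxt, float("inf")):
                g[nxt] = new_g
                heapq.heappush(open_heap, (new_g + eps * h(nxt), nxt))
    return None

def anytime_search(grid, start, goal, eps_schedule=(3.0, 2.0, 1.5, 1.0)):
    """Rerun weighted A* with a progressively tighter bound (no reuse)."""
    return [(eps, weighted_astar(grid, start, goal, eps)) for eps in eps_schedule]

grid = [
    "......",
    ".####.",
    "......",
]
results = anytime_search(grid, start=(0, 0), goal=(2, 5))
for eps, cost in results:
    print(f"eps={eps}: path cost {cost}")
```

With an inflation factor of 1.0 the search reduces to plain A* with an admissible heuristic, so the last entry of the schedule is optimal. A real anytime planner would also stop early when its time budget runs out, which this sketch omits.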