Experiments with Deliberative Planning on
Autonomous Underwater Vehicles
José Pinto and João Sousa
Faculdade de Engenharia da Universidade do Porto
Underwater Systems and Technology Lab
Rua Dr. Roberto Frias, 4200-465 Porto, Portugal
Email: {zepinto, jtasso}@fe.up.pt
Frédéric Py and Kanna Rajan
Monterey Bay Aquarium Research Institute
Moss Landing, California, United States
Email: {fpy, kanna.rajan}@mbari.org
Abstract—We describe the process of improving the onboard
autonomy of LAUV-Seacon AUVs with an in-situ planning agent.
Deliberative planning is achieved by extending the existing
control architecture with T-REX and a domain model descrip-
tion which is specific to the Seacon AUVs and typical mission
scenarios. We discuss the required architectural changes as well
as results and lessons learned. The field deployments showed that
operators were able to command the vehicles by requesting high-level
objectives, such as locations to be visited and areas to survey,
both before and during mission execution, instead of sending
low-level waypoint or actuator commands. The commanded plans,
generated onboard according to the domain model, are inherently
safe, abstracted and easier for a non-technical operator to deal
with.
I. INTRODUCTION
The undersea realm, where AUVs typically operate, is
largely unknown and dynamic. However, most AUVs operate
by strictly following a pre-defined plan, created beforehand
by a human operator at the surface. In order to counteract
events like strong currents, obstacles or any other failures, the
plan script must articulate a number of fallback mechanisms
together with specific mission objectives. Having humans
define such complex plans can lead to error-prone
definitions of the intended vehicle behavior, which can only be
verified through techniques like model checking or computer
simulations prior to mission execution.
Instead of creating the plans beforehand, researchers at the
Monterey Bay Aquarium Research Institute (MBARI) have
designed, tested and fielded the Teleo-Reactive EXecutive
(T-REX) [1], [2], [3], [4], [5], an Open Source plan execution
framework that uses temporal Constraint-based Planning [6]
with plans synthesized onboard the AUV, both before and
while executing the mission autonomously and according to
a set of high-level objectives. The generated plans are, by
definition, a valid sequence of actions according to a given
domain model, which enforces the safety of the resulting
behavior.
Leveraging that work, LSTS (the Underwater Systems and
Technology Lab, Univ. of Porto) has recently extended their
toolchain in order to optionally encompass an onboard plan-
ner/executive based on T-REX and NASA’s EUROPA planner.
T-REX is a mission executive that allows different reactors
(threads of execution) to interact with other reactors by providing
internal (controlled) timelines and using external timelines,
controlled by other reactors. Fig. 1 shows an instance of
interacting reactors within one T-REX agent. T-REX was
developed with the aim of integrating deliberative planning
reactors, allowing each reactor to have a different look-ahead
(how far ahead it can plan) and latency (maximum amount of time
to produce a plan). T-REX timelines hold tokens that have a
type, start time and duration (possibly indefinite) together
with other attributes. Reactors interact by adding observation
tokens to internal timelines and posting goals to external
timelines. Goals are similar to observations but their start time
is in the future. Once a goal is achieved, it is posted
as an observation in the timeline on which it was requested. Details
of T-REX are beyond the scope of this work.
Fig. 1. An example of a T-REX agent. (Labels: T-REX Agent, Iridium
interface, Executive, Mission Manager, Skipper, Navigator, Science.
Legend: interface reactor, planning reactor, goal flow, observation flow.)
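As a rough illustration of the timeline vocabulary used above, the following Python sketch models tokens, observations and goals. It is not the T-REX API: the names `Token`, `Timeline`, `post_goal` and `observe` are hypothetical, ticks are plain integers, and a goal is considered achieved when an observation with the same predicate and attributes arrives.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Token:
    """A timeline token: a predicate with a start tick and optional duration."""
    predicate: str
    start: int
    duration: Optional[int] = None  # None stands for "indefinite"
    attributes: Dict[str, float] = field(default_factory=dict)


@dataclass
class Timeline:
    """Hypothetical timeline: owned by one reactor, readable by others."""
    name: str
    observations: List[Token] = field(default_factory=list)
    goals: List[Token] = field(default_factory=list)

    def post_goal(self, token: Token, now: int) -> None:
        # Goals describe desired future state, so they must start after `now`.
        if token.start <= now:
            raise ValueError("a goal must start in the future")
        self.goals.append(token)

    def observe(self, token: Token) -> None:
        # The owning reactor reports what actually happened.
        self.observations.append(token)
        # A matching pending goal is now achieved and no longer pending.
        self.goals = [g for g in self.goals
                      if not (g.predicate == token.predicate
                              and g.attributes == token.attributes)]


# An external reactor requests "At"; the owner later observes it as done.
navigator = Timeline("navigator")
target = {"lat": 38.4, "lon": -9.1}
navigator.post_goal(Token("At", start=10, attributes=dict(target)), now=3)
navigator.observe(Token("At", start=12, attributes=dict(target)))
```

After the observation is posted, the goal list is empty and the timeline holds a single `At` observation starting at tick 12.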
In this paper we first provide the overall context and objectives
of the exercise in Section II. We then describe the steps required
to integrate T-REX into the existing software toolchain in Section
III, the field test experiments in Section IV, an evaluation of the
results in Section V and, finally, some conclusions in Section
VI.
II. THE REP-12 EXERCISE
The Rapid Environmental Picture (REP) exercise is an
annual event resulting from a collaboration between Porto
University and the Portuguese Navy. The main objective of
this exercise is to advance and test operational concepts and
underwater technologies with supporting assets from the Navy,
focusing on underwater mine warfare.
Fig. 2. Location of REP-12 off the coast of Portugal.
The REP-12 exercise took place from the 9th to the 20th of July
2012 off the coast of Sesimbra (Fig. 2 – A) and at an
underwater archeological site next to the Tróia peninsula (Fig.
2 – B), with numerous partners with differing goals. As a
result, there was a multitude of tests concerning ocean floor
mapping, acoustic communications, delay-tolerant networking,
UAV-AUV communications and deliberative planning. More
details about REP-12, including tests and results, are
available in [7].
III. IMPLEMENTATION
A. The LSTS Toolchain
LSTS has been developing different types of unmanned
vehicles including AUVs, UAVs, ASVs and ROVs. The objec-
tive of the lab is to create networked vehicle systems where
vehicles can be both endpoints of communication and routers
of information. To achieve this, LSTS has created a modular
software toolchain [8] that is agnostic of vehicle and com-
munication means allowing for mixed-initiative cooperative
behavior of vehicles and humans.
This toolchain is divided into three major components:
DUNE, IMC and Neptus, which are briefly described next.
1) DUNE: The DUNE Unified Navigation Environment
is POSIX-compliant onboard software that uses a blackboard
mechanism for inter-process communication. DUNE
tasks (processes) send IMC (described next) messages to a
shared bus from which they receive messages of subscribed
types. Different DUNE configurations (tasks and parameters)
are used to support different vehicles and execution profiles
like simulation, hardware and hardware-in-the-loop (HWIL) simulation.
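The subscription mechanism described above can be pictured with a toy in-process bus. This is only a sketch of the publish/subscribe idea in Python, not DUNE code: the class `MessageBus`, its methods and the message type strings used in the example are illustrative assumptions.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Message = Dict[str, object]
Handler = Callable[[Message], None]


class MessageBus:
    """Toy bus: tasks subscribe to message types and receive only those."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, msg_type: str, handler: Handler) -> None:
        self._subscribers[msg_type].append(handler)

    def dispatch(self, msg_type: str, payload: Message) -> int:
        """Deliver to every subscriber of this type; return how many got it."""
        handlers = self._subscribers.get(msg_type, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)


bus = MessageBus()
received: List[Message] = []
# A hypothetical navigation task only cares about position estimates.
bus.subscribe("EstimatedState", received.append)
delivered = bus.dispatch("EstimatedState", {"lat": 38.4, "lon": -9.1, "depth": 2.0})
ignored = bus.dispatch("Temperature", {"value": 17.5})  # nobody subscribed
```

In this sketch the sender does not know who consumes a message, which is the property that lets tasks be added or swapped per vehicle configuration.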
The onboard software tasks are divided into a set of control
layers, from the low-level sensor/actuation tasks up to the plan
and team supervisors that command maneuver execution. An
overview of these control layers can be seen in Fig. 5.
Fig. 3. A Neptus scripted plan example. (States: Goto, Area Survey,
Popup; transitions: maneuverDone?, limitsBreached?.)
Fig. 4. The LSTS toolchain.
2) IMC: The Inter-Module Communications protocol is a
message-oriented protocol that is independent of the underlying
communication means (UDP, HTTP, TCP, acoustic modem
or GSM). Its specification is documented in XML, from which
different language bindings are generated automatically [9].
The IMC communication protocol is used all across the
toolchain (Fig. 4), both between operator consoles and vehicles
(network nodes) and between DUNE tasks. DUNE tasks
communicate with each other locally via an IMC message bus
and, in a distributed system, using IMC messages transported
over the available communication means.
3) Neptus: The Neptus Command and Control software
runs on the operator console, serving as the graphical interface
between humans and a robotic network. This software supports
different mission phases: planning, simulation, execution, data
revision and dissemination [10], [11].
For Neptus, a plan is defined as a finite state machine.
Each state is tied to a maneuver with specific parameters and,
between states, different transition conditions can be specified.
In each state, the vehicle executes a maneuver that changes its
physical state (through motion) and, in the meantime, a set of
events can be flagged during execution (triggering transitions).
Examples of maneuvers are “Goto”, “Loiter”, “FollowTra-
jectory” and “Rows”. These take specific parameters like
destination, speed or depth. Transition conditions are currently
a limited set of events like “ManeuverDone” or “Limits-
Breached” as shown in Fig. 3.
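A scripted plan of this kind can be sketched as a small state machine. The Python below is an illustration only and does not reflect the Neptus plan format: `PlanState`, `run_plan`, the state names and the parameter values are hypothetical, and events with no matching transition are simply ignored.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PlanState:
    """One plan state: a maneuver, its parameters and event transitions."""
    maneuver: str
    params: Dict[str, float] = field(default_factory=dict)
    transitions: Dict[str, str] = field(default_factory=dict)  # event -> state


def run_plan(states: Dict[str, PlanState], initial: str, events: List[str]) -> List[str]:
    """Replay flagged events and return the names of the visited states."""
    current = initial
    visited = [initial]
    for event in events:
        nxt = states[current].transitions.get(event)
        if nxt is None:
            continue  # no transition for this event in the current state
        current = nxt
        visited.append(current)
    return visited


plan = {
    "goto": PlanState("Goto", {"speed": 1.0, "depth": 2.0},
                      {"ManeuverDone": "survey", "LimitsBreached": "popup"}),
    "survey": PlanState("Rows", {"speed": 1.0, "depth": 5.0},
                        {"ManeuverDone": "popup", "LimitsBreached": "popup"}),
    "popup": PlanState("Popup"),
}

nominal = run_plan(plan, "goto", ["ManeuverDone", "ManeuverDone"])
breached = run_plan(plan, "goto", ["LimitsBreached"])
```

Here every fallback has to be spelled out as an explicit transition, which is the burden on the plan author that onboard planning is meant to reduce.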
These 3 components have been used together numerous
times to control multiple unmanned vehicles having one or
more operators in the loop for different purposes [12], [13].
This is one of the most extensive toolchains we are aware of
in regular operational use.
Fig. 5. Control layers of the LSTS toolchain. (Interfaces: Plan,
Vehicle, Maneuver, Guidance/Navigation, Platform. Components: Team
Supervisor, Plan supervisor, Vehicle Supervisor, Loiter, Goto and
StationKeeping controllers, Navigation, Guidance, IMU, CTD, Thruster
and Fin servos drivers. Flows: plan, vehicle, maneuver, guidance and
actuator commands; plan, vehicle, maneuver, navigation and sensors state.)
In order to enforce safety of the vehicle’s behavior, the
toolchain provides the possibility of setting per-vehicle safety
limits like maximum depth, minimum bottom distance, speed
and area bounds among others. On board the vehicle, a
special supervisor task continually monitors the vehicle’s state
and validates it according to these limits, flagging the event
“LimitsBreached” in case the state is not valid. This approach
does not prevent erroneous plans from being sent but has
otherwise been an effective and reactive approach to vehicle safety,
since it usually triggers execution of fail-safe behavior.
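The monitoring step described above amounts to comparing each state sample against the configured limits. The following Python sketch shows that comparison under simplifying assumptions: the field names, units and numeric values are hypothetical, area bounds are omitted, and the real supervisor task is not reproduced here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SafetyLimits:
    max_depth_m: float
    min_bottom_distance_m: float
    max_speed_mps: float


@dataclass(frozen=True)
class VehicleState:
    depth_m: float
    bottom_distance_m: float
    speed_mps: float


def breached_limits(state: VehicleState, limits: SafetyLimits) -> List[str]:
    """Return the names of all limits violated by the given state."""
    violations = []
    if state.depth_m > limits.max_depth_m:
        violations.append("max_depth")
    if state.bottom_distance_m < limits.min_bottom_distance_m:
        violations.append("min_bottom_distance")
    if state.speed_mps > limits.max_speed_mps:
        violations.append("max_speed")
    return violations


def supervise(state: VehicleState, limits: SafetyLimits) -> Optional[str]:
    """Flag the event name used in the text when any limit is violated."""
    return "LimitsBreached" if breached_limits(state, limits) else None


limits = SafetyLimits(max_depth_m=20.0, min_bottom_distance_m=2.0, max_speed_mps=1.5)
ok_event = supervise(VehicleState(5.0, 10.0, 1.0), limits)
bad_event = supervise(VehicleState(25.0, 1.0, 1.0), limits)
```

Because the check runs on the observed state, it reacts only after a limit is crossed; it cannot tell in advance that a plan will cross one.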
B. Operational Requirements
In order to come up with good deliberative planning primitives,
we devised a set of operational requirements for the
resulting software architecture, which we now enumerate.
1) Operators should be able to follow vehicle execution (as
they do in scripted plans).
2) All generated plans should be validated for
safety before execution.
3) The safety limits should (still) be continuously verified
while the vehicle is executing any kind of autonomous
plan.
4) The operator should be able to pause deliberative plan-
ning and command execution of scripted plans, resuming
deliberative planning afterwards.
5) Pausing of deliberative planning should be possible with
the vehicle underwater by using an acoustic modem.
6) The high-level objectives should abstract vehicle-specific
parameters so that those are handled by the onboard
planner.
C. Integrated Architecture
Integration of deliberative planning into the LSTS toolchain
was undertaken by having T-REX run as a separate
process inside the vehicles. This T-REX instance contains a
set of deliberative reactors, based on the EUROPA planner, and
a platform-specific reactor.
The deliberative reactors use a domain model (described in
Section III-F) that allows them to interface with the
platform-specific reactor. This reactor connects to DUNE through IMC
and translates incoming messages into T-REX observations,
which are posted to an internal timeline. The Platform reactor
also accepts T-REX goals, which are eventually translated into
DUNE commands (sent through IMC).
Fig. 6. Components of the integrated system. (Labels: Neptus with
TREX plugin; IMC TrexGoal / TrexCommand over UDP transport; DUNE with
TREX task and message bus; IMC (localhost); TREX with Platform
reactor, deliberative reactors and TREX timelines.)
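The two directions of translation performed by the Platform reactor can be sketched as a pair of functions. This Python fragment is a hedged illustration: the message names `VehiclePosition` and `StartManeuver` are invented for the example and are not taken from the IMC specification, and the timeline and predicate names merely echo those used in the domain model.

```python
from typing import Dict, Optional, Tuple

# (timeline name, predicate name, attributes)
Observation = Tuple[str, str, Dict[str, float]]


def message_to_observation(msg: Dict[str, object]) -> Optional[Observation]:
    """Translate an incoming bus message into a timeline observation."""
    if msg.get("type") == "VehiclePosition":  # hypothetical message name
        attrs = {key: float(msg[key]) for key in ("lat", "lon", "depth")}
        return ("estimator", "Position", attrs)
    return None  # messages the reactor does not model are ignored


def goal_to_command(predicate: str, attrs: Dict[str, float]) -> Dict[str, object]:
    """Translate an accepted maneuver goal into an outgoing command message."""
    if predicate != "Maneuver":
        raise ValueError(f"unsupported goal predicate: {predicate}")
    return {"type": "StartManeuver", **attrs}  # hypothetical command name


obs = message_to_observation(
    {"type": "VehiclePosition", "lat": 38.4, "lon": -9.1, "depth": 2})
cmd = goal_to_command("Maneuver", {"lat": 38.41, "lon": -9.1, "depth": 3.0})
```

Keeping the translation in one reactor means the deliberative reactors only ever see timelines, never transport-level messages.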
In order to retain the ability to command
the vehicle via scripted plans, T-REX can be temporarily
deactivated on user request. While T-REX is deactivated, it
still performs state synchronization, but the domain model states
that no commands can be sent to the vehicle.
This allows deliberative reactors to receive any new goals and
plan future actions to take when T-REX is reactivated on user
request.
D. Safety Mechanisms
Several safety mechanisms are enforced by this new archi-
tecture. The supervisor that continually monitors breaching of
operational limits remains active. This is possible since the
interface from T-REX with the onboard software resides in
the planning control layer and, as such, all the underlying
mechanisms can still be active. In the event the vehicle goes
outside these limits, T-REX control is deactivated and the
previously existing fail-safe mechanism (or a contingency
maneuver) is executed.
When the limits are breached, a “Blocked” observation is
generated. Since the planner’s domain model requires an “Ac-
tive” observation in order to generate any kind of commands,
T-REX will only send new plans after being reactivated by
user request (sending of a specific IMC message). Moreover,
the same behavior is activated if the vehicle receives an
“Abort” command either via Wi-Fi or acoustic modem while
underwater.
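The gating behavior described in this paragraph can be summarized as a two-state machine. The Python sketch below is illustrative only: the class name echoes the `TrexSupervision` timeline of the domain model, but the event name `UserActivate` is a hypothetical stand-in for the specific IMC message mentioned above.

```python
class TrexSupervision:
    """Toy model of the Active/Blocked supervision state."""

    def __init__(self) -> None:
        self.state = "Active"

    def on_event(self, event: str) -> None:
        # A breached limit or an abort blocks T-REX; only an explicit
        # user request re-enables it. Other events leave the state as is.
        if event in ("LimitsBreached", "Abort"):
            self.state = "Blocked"
        elif event == "UserActivate":  # hypothetical event name
            self.state = "Active"

    def may_send_commands(self) -> bool:
        return self.state == "Active"


supervision = TrexSupervision()
history = []
for event in ["LimitsBreached", "ManeuverDone", "UserActivate", "Abort"]:
    supervision.on_event(event)
    history.append(supervision.may_send_commands())
```

Note that nothing in the vehicle's own state can move the machine back to "Active"; recovery always goes through the operator.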
At each tick, all T-REX reactors are required to perform a
synchronization step, which consists of integrating observations
from other reactors into their state. If a reactor is not able to
produce its internal state based on the received observations
(for instance, if the observations imply an impossible state), it
is terminated by T-REX. In our architecture we react to
this event by having an additional reactor (Safety Bug) that
simply listens to the timelines of all other reactors and, if any
of them is terminated, generates an “Abort” command, which
results in fail-safe behavior and T-REX being temporarily
deactivated.
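The watchdog role of the Safety Bug reactor can be reduced to a few lines. In this Python sketch the liveness map and the `send_abort` callback are assumptions made for illustration; how T-REX actually reports a terminated reactor is not shown in this paper.

```python
from typing import Callable, Dict, List


def safety_bug(reactor_alive: Dict[str, bool],
               send_abort: Callable[[str], None]) -> List[str]:
    """Send a single abort if any watched reactor has terminated.

    Returns the sorted names of the terminated reactors.
    """
    dead = sorted(name for name, alive in reactor_alive.items() if not alive)
    if dead:
        send_abort("Abort")
    return dead


sent: List[str] = []
healthy = safety_bug({"navigator": True, "surface": True}, sent.append)
failed = safety_bug({"navigator": True, "surface": False}, sent.append)
```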
On top of the mentioned safety mechanisms, the domain
model includes information about the operational limits of the
vehicle (as a posted observation) and the planner is required
to create plans which are safe, meaning that they will not drive
the vehicle outside of these limits. As a result, any
objectives which would require the vehicle to exit the operational
limits are rejected by the planner.
E. T-REX reactors
The T-REX implementation involved the creation of different
reactors to encapsulate functionality and thus improve
flexibility. This flexibility comes from the possibility of swapping
reactors with others that have similar interfaces in terms of
the timelines used. Fig. 7 depicts the resulting set of reactors and
their interactions through timelines. We next briefly describe
how they function.
1) Platform: This reactor receives IMC messages from
DUNE and generates corresponding observations in its con-
trolled timelines. Moreover, this reactor also accepts maneuver
goals which eventually result in the request and execution of
plans in the vehicle.
2) Goal Queue: This reactor receives a goal description
(XML) and posts it into the corresponding timeline. The
received goals can come from DUNE or any other process
capable of writing into a Unix pipe.
3) Safety Bug: This simple reactor listens to all timelines
and if it detects that the owner has terminated, sends an
“Abort” command to DUNE .
4) Navigator: This is a deliberative reactor that accepts
“At” goals. As a result it will post maneuver goals that will
drive the vehicle towards the desired locations.
5) Surface: This is a deliberative reactor that drives the
vehicle towards the surface periodically by posting objectives
to the Navigator reactor.
F. Domain Model
The deliberative reactors (Navigator and Surface) require a
domain model with a set of rules that describe how the world
evolves according to the observations. In order to control the
world, deliberative reactors post goals that will eventually become
observations. The domain model for the LAUV-Seacon
is specified in NDDL (New Domain Description Language)
and interpreted by the EUROPA planner. This is currently a
simple model focusing on maneuvers and their impact on the
state of the world.
Listing 1 shows a snippet of the domain model used by
the Navigator deliberative reactor during the exercise on the
Seacon. The first rule (on the estimator timeline) determines
that, for a vehicle to arrive at a location, the arrival must
be met by a maneuver whose destination matches
the desired location. The second rule (platform timeline) enforces
that the maneuver starts only if T-REX is currently able to send
commands (activated), that the maneuver is safe and that
it terminates near the desired location (otherwise a new plan
must be found).

    Estimator::At {
      met_by(Platform.Maneuver maneuver);
      maneuver.lat == lat;
      maneuver.lon == lon;
      maneuver.depth == depth;
      maneuver.secs == secs;
      maneuver.tolerance == tolerance;
    }

    Platform::Maneuver {
      starts_during(OpLimits.Limits limits);
      speed <= limits.maxSpeed;
      limits.minSpeed <= speed;
      inside(lat, lon, depth, limits);
      starts_during(TrexSupervision.Active active);
      met_by(Idle idle);
      meets(EstimatedState.Position p);
      dist == wgs84_dist(lat, lon, p.lat, p.lon);
      dist <= tolerance;
    }

Listing 1. NDDL fragment relating At and Maneuver predicates
Fig. 7. T-REX reactors in the periodic surfacing model. (Reactors:
Platform, Navigator, Surface, Goal Queue, Safety Bug; timelines:
estimator, command, state, oplimits, navigator.)
Safety of the maneuver is checked against the currently
known operational limits by limiting the speed of the vehicle
and checking that the position is safe. The entire domain model
is available [14].
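To make the constraints of the Maneuver rule in Listing 1 concrete, the Python sketch below evaluates them for a single candidate maneuver. Several simplifications are assumed: the area bounds are modelled as a latitude/longitude rectangle, the distance uses a spherical (haversine) approximation rather than a WGS84 ellipsoid computation, and all numeric values except the 50 m maximum depth of Table I are hypothetical. A planner solves such constraints over variables; this code only checks one assignment.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, spherical approximation


@dataclass(frozen=True)
class OpLimits:
    min_speed: float
    max_speed: float
    max_depth: float
    min_lat: float
    max_lat: float
    min_lon: float
    max_lon: float


def great_circle_dist(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Haversine distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def inside(lat: float, lon: float, depth: float, limits: OpLimits) -> bool:
    """Assumed meaning of inside(): rectangle in lat/lon plus a depth range."""
    return (limits.min_lat <= lat <= limits.max_lat
            and limits.min_lon <= lon <= limits.max_lon
            and 0.0 <= depth <= limits.max_depth)


def maneuver_is_valid(lat: float, lon: float, depth: float, speed: float,
                      tolerance: float, end_lat: float, end_lon: float,
                      limits: OpLimits, active: bool) -> bool:
    """Check the Listing 1 constraints for one fully specified maneuver."""
    return (active  # T-REX must currently be allowed to send commands
            and limits.min_speed <= speed <= limits.max_speed
            and inside(lat, lon, depth, limits)
            and great_circle_dist(lat, lon, end_lat, end_lon) <= tolerance)


LIMITS = OpLimits(min_speed=0.5, max_speed=2.0, max_depth=50.0,
                  min_lat=38.0, max_lat=39.0, min_lon=-10.0, max_lon=-9.0)
```

For example, a maneuver towards (38.4, -9.1) that ends about 11 m away passes with a 15 m tolerance and fails with a 5 m tolerance, in which case a new plan would have to be found.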
In order to improve the operator’s situational awareness and
reduce navigational uncertainty, a second deliberative reactor
was added that forces the vehicle to come to the surface and
acquire a GPS signal periodically. As a result, the amount
of time that the vehicle stays underwater is bounded by a surfacing
time parameter, part of a “PeriodicSurface” predicate.
The resulting set of reactors and timelines is depicted in Fig.
7.
IV. FIELD TESTS
In REP-12 the Seacon vehicle (see Table I) was used in
initial tests as an ASV and subsequently in the nominal AUV
configuration.
Fig. 8. LSTS Seacon vehicle in AUV mode (left) and in ASV mode
(right) used in the REP-12 exercise.
The ASV mode is achieved by mounting two floaters and
adding a long-range Wi-Fi access point and antenna. The idea
behind this modification is that it allows the vehicle to be used
as a mobile communications gateway between surface (Wi-Fi,
GSM) and underwater (acoustic modem). Both configurations
of this vehicle are shown in Fig. 8.
A. Operator Interface
For these tests, a simple operator interface was developed
as a Neptus map interaction plugin. From this interface, the
operator can add locations to be visited (goals) by clicking
on the map. The sent goals, visible on the map as circular
shapes, can be recalled by the user in a similar way or
(eventually) marked as done. Moreover, the interface also
allows deactivating and activating T-REX.
Since the interface is based on Neptus, it is possible to
monitor the vehicle’s execution while adding new goals or
creating scripted plans. Scripted plans can be interleaved with
T-REX execution by temporarily disabling T-REX and ordering
the execution of a previously sent plan.
B. Simple Model
The first experiment consisted of testing a simple domain
model that has no initial objectives and accepts locations to be
visited as goals. The Navigator reactor receives these goals
and generates goals in other reactors’ timelines that ultimately
result in one or more maneuvers to be executed by the vehicle.
The objective of this test was to measure CPU usage
while stress-testing the system. This was done by adding and
recalling several objectives, blocking T-REX with pending
objectives, interrupting maneuver execution manually, etc. All
of these require the planner to replan in order
to try to counteract unpredicted occurrences.
TABLE I
LAUV-SEACON SPECIFICATIONS
Length                    140 cm
Diameter                  16 cm
Maximum depth             50 m
Endurance                 8 hours at 3 knots
Communications            Wi-Fi, Acoustic Modem, HSDPA/GSM
Actuation                 4 directional fins and 1 brushless motor
Navigation                GPS (surface), LBL, AHRS
Optional payloads (AUV)   CTD, sidescan sonar, multibeam
Optional payloads (ASV)   Long-range Wi-Fi antenna, video camera
C. Periodic Visits
This test, made with Seacon in AUV mode, consisted
of a slight modification of the previous (simple) model. We
changed the model by forcing the AUV to visit two points
at the surface, alternately, whenever more than 2 minutes have
passed since the previous visit. In the meantime, the vehicle
is able to visit any other desired locations (received as goals
from the user).
D. Volume Survey
In this model, a deliberative reactor (VolumeSurvey) was
added to the simple model. This reactor takes goals that are
parametrized by a set of 4 points and the number of legs
(number of long transects). As a result, VolumeSurvey will
command a survey of the area (quad resulting from connecting
the 4 points) by generating visit-location goals which are
handled by the Navigator reactor.
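One plausible way to turn four corner points and a number of legs into visit-location goals is a back-and-forth (lawnmower) pattern, sketched below in Python. This is an assumption about the geometry, not the VolumeSurvey implementation: the corners are taken in order around the quad, each leg runs between the edge p0-p1 and the opposite edge p3-p2, interpolation is linear in the coordinates, and depth is ignored.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def lerp(a: Point, b: Point, t: float) -> Point:
    """Linear interpolation between two points, t in [0, 1]."""
    return (a[0] + (b[0] - a[0]) * t, a[1] + (b[1] - a[1]) * t)


def survey_waypoints(p0: Point, p1: Point, p2: Point, p3: Point,
                     legs: int) -> List[Point]:
    """Waypoints covering quad p0-p1-p2-p3 with `legs` parallel transects.

    Consecutive legs are traversed in opposite directions so that the end
    of one leg is close to the start of the next.
    """
    if legs < 1:
        raise ValueError("at least one leg is required")
    waypoints: List[Point] = []
    for i in range(legs):
        # Spread the legs evenly; a single leg goes through the middle.
        t = 0.5 if legs == 1 else i / (legs - 1)
        start, end = lerp(p0, p1, t), lerp(p3, p2, t)
        if i % 2 == 1:
            start, end = end, start
        waypoints.extend([start, end])
    return waypoints


# Unit square with three legs.
square = survey_waypoints((0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), legs=3)
```

Each returned point would then be posted as one visit-location goal, leaving the actual maneuvers to the Navigator reactor.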
E. Periodic Surfacing
In this test, we changed the first (simple) model by forcing
the vehicle to acquire a valid GPS fix strictly periodically. This
was modelled by stating (in NDDL) that the AUV may not
be underwater for more than 5 minutes and that, if an underwater
maneuver would extend past the time of the next surfacing,
it is interrupted for the vehicle to come to the
surface.
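The effect of this bound on a sequence of maneuvers can be sketched with a small scheduling function. The Python below is only an approximation of the behavior, not the NDDL model: maneuver durations are assumed to be known in advance, the time spent surfacing is ignored, and the function and label names are hypothetical.

```python
from typing import List, Tuple


def schedule_with_surfacing(durations_s: List[float],
                            max_underwater_s: float) -> List[Tuple[str, float]]:
    """Split maneuvers so that no dive lasts longer than the given bound."""
    if max_underwater_s <= 0:
        raise ValueError("max_underwater_s must be positive")
    schedule: List[Tuple[str, float]] = []
    underwater = 0.0  # time spent underwater since the last surfacing
    for index, duration in enumerate(durations_s):
        remaining = duration
        while remaining > 0:
            budget = max_underwater_s - underwater
            if budget <= 0:
                # Bound reached: interrupt, surface for a GPS fix, then resume.
                schedule.append(("surface", 0.0))
                underwater = 0.0
                continue
            segment = min(remaining, budget)
            schedule.append((f"maneuver{index}", segment))
            underwater += segment
            remaining -= segment
    return schedule


# A 2-minute maneuver followed by one of 400 s, with a 5-minute bound.
plan = schedule_with_surfacing([120.0, 400.0], max_underwater_s=300.0)
```

In this example the second maneuver is interrupted after 180 s, the vehicle surfaces, and the remaining 220 s are executed afterwards.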
V. EVALUATION
The integrated system was successfully tested with both AUVs and
ASVs, and it was used together with the standard
scripted planning (interleaved). The system was stress-tested
by blocking and reactivating T-REX several times, adding and
recalling goals and also by posting unsafe goals. All of
these require the planner to replan its actions.
For the surface model, the resulting behavior, depicted in
Fig. 9, shows that the vehicle performed one or more maneuvers
(at different depths) while underwater and actively came to
the surface whenever there were no objectives left underwater.
Moreover, around 10:52, the vehicle took more than 5 minutes
to complete a maneuver and, as a result, it popped up at
the surface to acquire GPS and then dived again to continue
the maneuver, as expected.
Overall, the system allowed the vehicle to be controlled in
a more intuitive fashion. Instead of directly controlling the
vehicle, the operator simply states what needs to be done.
The resulting execution, however, was unpredictable in the
sense that the vehicle may choose any sequence of actions that
fulfils the objectives and this sequence of actions, although safe,
is not known a priori. For instance, considering that surfacing
is a dangerous maneuver when there is boat traffic, with the
Periodic Surfacing model it was impossible to know exactly
when and where the vehicle was going to surface (only that
it would not stay underwater for more than 5 minutes).
Fig. 9. Depth and number of GPS satellites in the periodic surface
test.
VI. CONCLUSIONS AND FUTURE STEPS
T-REX was successfully integrated into the LSTS
toolchain and tested with AUVs and ASVs. The planning
interface was intuitive enough for non-trained operators to
command and recall goals using the plugin developed for the
Neptus graphical interface.
We will change the architecture in order to eliminate the
need for maneuver preemption (stops between maneuvers).
Instead, we aim to signal maneuver completion but
switch to another maneuver only when another maneuver is
ready to be executed.
The tested models, while still simple, showed that it is
possible to use this control architecture to command the LSTS
vehicles. We plan to improve the models by adding new
reactors, providing better context awareness and adding
more complex behaviors. We also plan to extend Neptus with
a shore-side planner that is able to command several vehicles
by interacting with a set of distributed T-REX timelines.
ACKNOWLEDGMENT
The authors would like to thank the Portuguese Navy for
providing excellent conditions in the testing of this work at
sea in Sesimbra, during the REP-12 exercise.
The research leading to these results has received funding
from the European Commission FP7-ICT Cognitive Systems,
Interaction, and Robotics under contract #270180 (NOPTILUS).
MBARI authors are funded by a block grant from
the Packard Foundation and in part by NSF grant #IIS-1127975
and NOAA grant #NA11NOS4780055.
REFERENCES
[1] C. McGann, F. Py, K. Rajan, H. Thomas, R. Henthorn, and R. McEwen,
“A Deliberative Architecture for AUV Control,” in Intnl. Conf. on
Robotics and Automation (ICRA), Pasadena, May 2008.
[2] C. McGann, F. Py, K. Rajan, J. P. Ryan, and R. Henthorn, “Adaptive
Control for Autonomous Underwater Vehicles,” in AAAI, Chicago, IL,
2008.
[3] C. McGann, F. Py, K. Rajan, J. P. Ryan, H. Thomas, R. Henthorn, and
R. McEwen, “Preliminary Results for Model-Based Adaptive Control of
an Autonomous Underwater Vehicle,” in Intnl. Symp. on Experimental
Robotics (ISER), Athens, 2008.
[4] F. Py, K. Rajan, and C. McGann, “A Systematic Agent Framework
for Situated Autonomous Systems,” in 9th International Conf. on Au-
tonomous Agents and Multiagent Systems, Toronto, Canada, May 2010.
[5] K. Rajan and F. Py, “T-REX: Partitioned Inference for AUV Mission
Control,” in Further Advances in Unmanned Marine Vehicles, G. N.
Roberts and R. Sutton, Eds. The Institution of Engineering and
Technology (IET), 2012.
[6] M. Ghallab, D. Nau, and P. Traverso, Automated Planning: Theory
and Practice. Elsevier Science, 2004.
[7] “Rep12 experiment web site.” [Online]. Available:
https://whale.fe.up.pt/rep12/
[8] J. Pinto, P. Calado, J. Braga, P. Dias, R. Martins, and E. Marques, “Im-
plementation of a control architecture for networked vehicle systems,”
in IFAC Workshop on Navigation, Guidance and Control of Underwater
Vehicles (NGCUV2012), 2012.
[9] R. Martins, P. Dias, E. Marques, J. Pinto, J. Sousa, and F. Pereira,
“IMC: A communication protocol for networked vehicles and sensors,”
in OCEANS 2009 - EUROPE, May 2009, pp. 1–6.
[10] P. Dias, G. Gonçalves, R. Gomes, J. Sousa, J. Pinto, and F. Pereira,
“Mission planning and specification in the Neptus framework,” in
Robotics and Automation, 2006. ICRA 2006. Proceedings 2006 IEEE
International Conference on, May 2006, pp. 3220–3225.
[11] J. Pinto, P. S. Dias, R. Gonçalves, E. Marques, G. M. Gonçalves, J. a. B.
Sousa, and F. L. Pereira, “NEPTUS – A Framework to Support the
Mission Life Cycle,” in 7th IFAC Conference on Manoeuvring and Control
of Marine Craft, Lisbon, Portugal, 2006.
[12] A. Tinka, S. Diemer, L. Madureira, E. Marques, J. B. Sousa, R. Martins,
J. Pinto, J. E. da Silva, P. Saint-Pierre, and A. M. Bayen, “Viability-based
computation of spatially constrained minimum time trajectories for an
autonomous underwater vehicle: implementation and experiments,” in
American Control Conference, St. Louis, Missouri, USA, 2009.
[13] R. Martins and J. Sousa, “Concepts and tools for coordination and
control of networked ocean-going vehicles,” in AUV 2010, Monterey,
California, USA, 2010.
[14] “Trex2-agent source code repository on google code.” [Online].
Available: https://code.google.com/p/trex2-agent/