Learning Inverse Dynamics: A Comparison
Duy Nguyen-Tuong, Jan Peters, Matthias Seeger, Bernhard Schölkopf
Max Planck Institute for Biological Cybernetics
Spemannstraße 38, 72076 Tübingen, Germany
Abstract. While it is well known that models can enhance control performance in terms of precision and energy efficiency, their practical application has often been limited by the complexity of obtaining sufficiently accurate models manually. In the past, learning has proven a viable alternative to using a combination of rigid-body dynamics and handcrafted approximations of nonlinearities. However, a major open question is which nonparametric learning method is best suited for learning dynamics. Traditionally, locally weighted projection regression (LWPR) has been the standard method, as it is capable of online, real-time learning for very complex robots. While LWPR has had a significant impact on learning in robotics, alternative nonparametric regression methods such as support vector regression (SVR) and Gaussian process regression (GPR) offer interesting alternatives with fewer open parameters and potentially higher accuracy. In this paper, we evaluate these three alternatives for model learning. Our comparison consists of an evaluation of the learning quality of each regression method on data from a SARCOS robot arm, as well as of the robot's tracking performance when employing the learned models. The results show that GPR and SVR achieve superior learning precision and can be applied in real-time control with higher accuracy. For online learning, however, LWPR is the better method due to its lower computational requirements.
1 Introduction
Model-based robot control, e.g., feedforward nonlinear control [1], exhibits many advantages over traditional PID control, such as potentially higher tracking accuracy, lower feedback gains, and lower energy consumption. Within the context of automatic robot control, this approach can be considered an inverse problem, where the plant model, e.g., the dynamics model of a robot described by the rigid-body formulation, is used to predict the joint torques given the desired trajectory (i.e., the joint positions, velocities, and accelerations), see, e.g., [1]. However, for many robot systems a sufficiently accurate plant model is hard to obtain from the pure rigid-body formulation due to unmodeled nonlinearities such as friction or actuator nonlinearities [2]. In such cases, the imprecise model can lead to large tracking errors, which can only be avoided using high-gain control or more accurate models. As high-gain control would make the robot a danger to its environment, the latter is the preferable option. One important alternative is therefore the inference of inverse models from measured data using regression techniques.
While this goal has been considered in the past [3, 4], given recent progress in regression techniques and increased computing power for online computation, it is time to reevaluate this issue using state-of-the-art methods. In this paper, we compare three different nonparametric regression methods for learning the dynamics model: locally weighted projection regression (LWPR) [5], full Gaussian process regression (GPR) [6], and ν-support vector regression (ν-SVR) [7]. The approximation quality is evaluated using (i) simulation data and (ii) real data taken from a 7 degree-of-freedom (DoF) SARCOS master robot arm, as shown in Figure 1. Furthermore, we examine the tracking performance of the robot using the learned models in the setting of feedforward nonlinear control [1].

Fig. 1: Anthropomorphic SARCOS master robot arm.

Our main focus during these evaluations is to answer two questions: (a) which of the presented methods is best suited for our problem domain, and (b) whether policies learned by support vector machines and Gaussian processes can work in a real-time control scenario.

In the following, we describe the role of inverse dynamics in nonlinear, feedforward robot control and, subsequently, the regression algorithms used for model approximation. Afterwards, we discuss the results of model learning and how these can be used for control. Finally, we show the performance during a real-time tracking task and explain our real-time robot control setup.
2 Inverse Dynamics Models in Feedforward Control
In model-based control, the controller command is computed using a priori knowledge about the system expressed in an inverse dynamics model [1, 8], which is traditionally given in the rigid-body formulation [1]:

    M(q)q̈ + F(q, q̇) = u,

where q, q̇, q̈ are the joint angles, velocities, and accelerations of the robot, M(q) denotes the inertia matrix, and F(q, q̇) collects all internal forces, including Coriolis and centripetal forces, gravity, and unmodelable nonlinearities.

The motor command u = u_FF + u_FB is the applied joint torque and consists of a feedforward component u_FF and a feedback component u_FB. The feedforward component predicts the torques required to follow a desired trajectory given by desired joint angles q_d, velocities q̇_d, and accelerations q̈_d. If we have a sufficiently accurate analytical model, we can compute the feedforward component as u_FF = M(q_d)q̈_d + F(q_d, q̇_d). The feedback component is required to ensure that the tracking error cannot accumulate and destabilize the system. Linear feedback controllers u_FB = K_p e + K_v ė, with e = q_d − q being the tracking error, are commonly used in the feedforward control setting, where the feedback gains K_p and K_v are chosen to remain low for compliance while being sufficiently high for stability [1].
However, for many robot systems the dynamics model given by the rigid-body equation above is not sufficiently accurate, especially in the presence of unmodeled nonlinearities such as complex friction and actuator dynamics [2]. The imprecise model leads to poor prediction of the joint torques u_FF, which can result in poor control performance or even damage to the system. Thus, learning more precise inverse dynamics models from measured data using regression methods is an interesting alternative. In this case, the feedforward component is generally considered a function of the desired trajectory, i.e., u_FF = f(q_d, q̇_d, q̈_d).
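The control law u = u_FF + u_FB described above can be sketched as follows. This is an illustrative sketch, not the authors' implementation: `inv_dyn` stands for any inverse dynamics model (analytical or learned), and all names are placeholders.

```python
import numpy as np

def control_command(q, dq, q_d, dq_d, ddq_d, inv_dyn, Kp, Kv):
    """Compute u = u_FF + u_FB for one control step.

    inv_dyn: a model mapping the desired trajectory (q_d, dq_d, ddq_d)
    to feedforward torques; Kp, Kv: feedback gain matrices.
    """
    u_ff = inv_dyn(q_d, dq_d, ddq_d)    # torques predicted for the desired trajectory
    e = q_d - q                          # tracking error e = q_d - q
    de = dq_d - dq                       # error derivative
    u_fb = Kp @ e + Kv @ de              # low-gain linear feedback
    return u_ff + u_fb
```

With an accurate inverse dynamics model, u_FF does most of the work and the gains Kp, Kv can stay low for compliance.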
3 Nonparametric Regression Methods for Model Learning
Learning the feedforward function is a straightforward regression problem, as we can observe the trajectories resulting from our motor commands u. Thus, we have to learn the mapping from inputs x = [q^T, q̇^T, q̈^T]^T ∈ R^(3n) to targets y = u ∈ R^n. With the learned function, the feedforward torque u_FF can be predicted for a query input point x_d = [q_d^T, q̇_d^T, q̈_d^T]^T. In the remainder of the section, we discuss three nonparametric regression techniques used for learning inverse dynamics models: the current standard method LWPR [5], ν-SVR [7], and GPR [6].
3.1 Locally Weighted Projection Regression (LWPR)
In LWPR, the predicted value ŷ is given by a combination of N individually weighted locally linear models, normalized by the sum of all weights [2, 5]:

    ŷ = ( Σ_{k=1}^{N} w_k ȳ_k ) / ( Σ_{k=1}^{N} w_k ),    (1)

with ȳ_k = x̄_k^T θ̂_k and x̄_k = [(x − c_k)^T, 1]^T, where w_k is the weight, θ̂_k the regression parameter, and c_k the center of the k-th linear model. For determining the weights, a Gaussian kernel is often used: w_k = exp(−0.5 (x − c_k)^T D_k (x − c_k)), where D_k is a positive definite distance matrix. During the learning process, the main purpose is to adjust D_k and θ̂_k such that the error between predicted values and targets is minimal [5].
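The prediction equation (1) can be sketched as below. This is only the prediction step under given parameters; the actual LWPR algorithm [5] additionally uses partial least squares projections and incremental updates of D_k and θ̂_k, which are omitted here. All names are illustrative.

```python
import numpy as np

def lwpr_predict(x, centers, thetas, Ds):
    """Evaluate Eq. (1): a weight-normalized sum of local linear models.

    centers: (N, d) array of local model centers c_k
    thetas:  (N, d+1) regression parameters (last entry is the offset)
    Ds:      (N, d, d) positive definite distance metrics D_k
    """
    num, den = 0.0, 0.0
    for c, theta, D in zip(centers, thetas, Ds):
        diff = x - c
        w = np.exp(-0.5 * diff @ D @ diff)   # Gaussian weight w_k
        xbar = np.append(diff, 1.0)          # x̄_k = [(x - c_k)^T, 1]^T
        num += w * (xbar @ theta)            # w_k * ȳ_k
        den += w
    return num / den
```

Because each local model only contributes near its center c_k, predictions and updates touch few models, which is the source of LWPR's speed.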
3.2 Gaussian Processes Regression (GPR)
GPR starts from a linear model y = f(x) + ε with f(x) = φ(x)^T w, where w is the weight vector [6]. The linear computation is done after transforming the input x with a basis function φ(·), for which the Gaussian kernel given in Section 3.1 can be taken. It is further assumed that the target value y is corrupted by noise ε with zero mean and variance σ_n².

To make a prediction for a new input x*, the outputs of all linear models are averaged, weighted by their posterior [6]. The predicted value f̄(x*) and the corresponding variance V(x*) are given by [6]

    f̄(x*) = k*^T (K + σ_n² I)^{−1} y,    (2)
    V(x*) = k(x*, x*) − k*^T (K + σ_n² I)^{−1} k*,

where k* = Φ^T Σ_p φ(x*), k(x*, x*) = φ(x*)^T Σ_p φ(x*), and K = Φ^T Σ_p Φ. The matrix Φ denotes the aggregation of the columns φ(x) for all cases in the training set, and Σ_p is the prior variance of the weights.
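A minimal sketch of the prediction (2) in kernel form, assuming a Gaussian kernel with a single length scale (the paper leaves hyperparameter details to [6]; `length` and all names are assumptions for illustration):

```python
import numpy as np

def gp_predict(X, y, x_star, sigma_n, length=1.0):
    """GP mean and variance for a new input x_star, Eq. (2)."""
    k = lambda a, b: np.exp(-0.5 * np.sum((a - b) ** 2) / length ** 2)
    K = np.array([[k(xi, xj) for xj in X] for xi in X])      # Gram matrix K
    k_star = np.array([k(xi, x_star) for xi in X])           # k* vector
    A = K + sigma_n ** 2 * np.eye(len(X))                    # K + sigma_n^2 I
    mean = k_star @ np.linalg.solve(A, y)                    # f̄(x*)
    var = k(x_star, x_star) - k_star @ np.linalg.solve(A, k_star)  # V(x*)
    return mean, var
```

The cubic cost of solving with the n×n matrix A at training time, and the O(n) / O(n²) cost of the mean and variance at prediction time, are what make full GPR expensive for online use.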
nMSE [%]   Joint 1   Joint 2   Joint 3   Joint 4   Joint 5   Joint 6   Joint 7
LWPR           3.9       1.6       2.1       3.1       1.7       2.1       3.1
GPR            0.7       0.2       0.1       0.5       0.1       0.4       0.6
ν-SVR          0.4       0.3       0.1       0.6       0.2       0.5       0.4

Table 1: Learning error in percent for each DoF using simulation data.

nMSE [%]   Joint 1   Joint 2   Joint 3   Joint 4   Joint 5   Joint 6   Joint 7
LWPR           1.7       2.1       2.0       0.5       2.5       2.4       0.7
GPR            0.5       0.3       0.1       0.1       1.5       1.2       0.2
ν-SVR          0.8       0.6       0.5       0.1       0.5       1.2       0.1
RBM            5.9     226.3     111.3       3.4       2.7       1.3       1.4

Table 2: Learning error in percent for each DoF using real SARCOS data.

Joint      GPR   ν-SVR   LWPR
1         0.78    1.17   1.45
2         1.05    1.01   1.63
3         0.24    0.19   0.19
4         2.42    2.34   3.24
5         0.23    0.14   0.23
6         0.31    0.21   0.29
7         0.23    0.24   0.26

Table 3: Tracking error as nMSE in percent for each DoF using test trajectories.
3.3 ν-Support Vector Regression (ν-SVR)
For ν-SVR, the predicted value f(x) for a query point x is given by [7]

    f(x) = Σ_{i=1}^{m} (α_i* − α_i) k(x_i, x) + b,    (3)

with k(x_i, x) = φ(x_i)^T φ(x), where m denotes the number of training points. The transformation φ(·) of the input vector can again be realized by an appropriate kernel function, as in the case of GPR. The quantities α_i*, α_i, and b are determined through an optimization procedure parameterized by C ≥ 0 and ν ≥ 0 [7]. The parameter ν controls the width of the tube around the regression function (3), and C denotes the regularization factor for training [7].
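The prediction (3) reduces to a kernel expansion over the support vectors. The sketch below assumes the dual coefficients (α_i* − α_i) and the offset b have already been obtained from the ν-SVR optimization (e.g., with an off-the-shelf QP solver); a Gaussian kernel and all names are assumptions for illustration.

```python
import numpy as np

def nu_svr_predict(x, X_sv, dual_coefs, b, length=1.0):
    """Evaluate f(x) = sum_i (alpha_i* - alpha_i) k(x_i, x) + b, Eq. (3).

    X_sv:       (m, d) support vectors x_i
    dual_coefs: (m,) differences (alpha_i* - alpha_i)
    b:          scalar offset
    """
    k = np.exp(-0.5 * np.sum((X_sv - x) ** 2, axis=1) / length ** 2)  # k(x_i, x)
    return dual_coefs @ k + b
```

Since only support vectors have nonzero dual coefficients, the prediction cost grows with the number of support vectors rather than with the full training set.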
4 Evaluations on Data Sets & Application in Control
In this section, we compare the learning performance of LWPR, GPR, and ν-SVR using (i) simulation data and (ii) real SARCOS robot data. To generate the simulation data, we use a model of the 7-DoF SARCOS master arm created with the SL software package [9].
4.1 Evaluation on Simulation Data
For the input data, a trajectory is generated such that it is sufficiently rich. Subsequently, we control the robot arm to track this trajectory in a closed-loop control setting, sampling the corresponding controller commands, i.e., the joint torques, as target data. In this way, a training set and a test set with 21 inputs and 7 targets are generated, consisting of 14094 examples for training and 5560 for testing. Training is done for each DoF separately, employing LWPR, GPR, and ν-SVR. Table 1 gives the normalized mean squared error (nMSE) in percent on the test set, where the nMSE is defined as the mean squared error divided by the variance of the targets.
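The nMSE used throughout the tables is a one-liner; normalizing by the target variance makes the errors comparable across joints with very different torque scales:

```python
import numpy as np

def nmse(y_true, y_pred):
    """Normalized mean squared error: MSE divided by the target variance."""
    return np.mean((y_true - y_pred) ** 2) / np.var(y_true)
```

Note that predicting the constant target mean gives nMSE = 1 (100%), so values well below 1 indicate that the model explains most of the target variance.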
It can be seen that GPR and ν-SVR yield better model approximations than LWPR, since GPR and ν-SVR are global methods. A further advantage of these methods is that only a few hyperparameters have to be determined, which makes the learning process more practical. However, their main drawback is the computational cost: in general, the training time for GPR and ν-SVR is about twice as long as for LWPR. The advantage of LWPR is its fast computation, since model updates are done locally. However, due to the many meta-parameters that have to be set manually for LWPR training, it is fairly tedious to find an optimal setting by trial and error.
4.2 Evaluation on Real Robot Data
The data is taken from the real anthropomorphic SARCOS master arm with 7 DoF, as shown in Figure 1. Here, we have 13622 examples for training and 5500 for testing. Table 2 shows the nMSE for each DoF after learning on the real robot data. Additionally, we determine the nMSE of a linear regression using the rigid-body model (RBM). The resulting error indicates how well the analytical model can explain the data.

Compared to LWPR, GPR and ν-SVR provide better results for every DoF. Considering the rigid-body model, the linear regression yields very large approximation errors for the 2nd and 3rd DoF. Apparently, for these DoF the nonlinearities (e.g., hydraulic cables, complex friction) cannot be approximated well using the rigid-body functions alone. This example illustrates the difficulty of using the analytical model for control in practice, where an imprecise dynamics model results in poor control performance on the real system, e.g., large tracking errors.
4.3 Application to Control
Using the offline-learned models from Section 4.1, the SL model of the SARCOS robot arm [9] is controlled to accomplish a tracking task. As desired trajectories, i.e., joint angles, velocities, and accelerations, we generate test trajectories similar to the training trajectories, in order to compare the generalization ability of each regression method. Table 3 gives the tracking error of each joint as nMSE for the test trajectories. Figure 2 shows the corresponding tracking performance for joints 1 and 2; the other joints behave similarly. We emphasize that the control task is performed in real time, with the system sampled at 480 Hz.

It can be seen that the tracking error of GPR and ν-SVR is only slightly smaller than that of LWPR, despite their better learning accuracy. The reason is that for GPR and ν-SVR, the controller command u can only be updated at every 4th sampling step, due to the more involved calculations for prediction, see Equations (2) and (3). In spite of this limitation, we are able to control the robot arm in real time with competitive performance. For LWPR, we are able to calculate the controller command at every sampling step, since evaluating the prediction (1) is very fast. Furthermore, the results show that the learned models generalize well in the presence of unknown trajectories similar to the training data.
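The decimated update scheme described above (refreshing the expensive feedforward prediction only every 4th sampling step, while the feedback term runs at the full 480 Hz) can be sketched as follows; the function and argument names are hypothetical, not the authors' code:

```python
def control_loop_step(t, state, desired, predict_ff, feedback, cache, update_every=4):
    """One control step: refresh the expensive feedforward prediction only
    every `update_every` steps; compute the cheap feedback term every step."""
    if t % update_every == 0:
        # costly GPR / nu-SVR model evaluation, Eqs. (2) and (3)
        cache["u_ff"] = predict_ff(*desired)
    # cheap linear feedback, evaluated at the full sampling rate
    return cache["u_ff"] + feedback(state, desired)
```

The stale feedforward torque between updates is what erodes part of the accuracy advantage of GPR and ν-SVR over LWPR in closed loop.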
Fig. 2: Tracking performance (amplitude [rad] over time [s]) for joints 1 and 2. Other joints are similar.
5 Conclusion
Our results indicate that GPR and ν-SVR can be made to work for real-time control applications, that they are easier to apply to learning problems, and that they achieve a higher learning accuracy than LWPR. However, their computational cost is prohibitively high for online learning. Our next step is to modify GPR and ν-SVR so that they can be used for online regression and thus become capable of real-time learning. Here, the problem of expensive computation has to be overcome using other techniques, such as sparse or local models [10].
References

[1] John J. Craig. Introduction to Robotics: Mechanics and Control. Prentice Hall, 3rd edition, 2004.
[2] J. Nakanishi, J. A. Farrell, and S. Schaal. Composite adaptive control with locally weighted statistical learning. Neural Networks, 2005.
[3] E. Burdet and A. Codourey. Evaluation of parametric and nonparametric nonlinear adaptive controllers. Robotica, 16(1):59–73, 1998.
[4] J. Kocijan, R. Murray-Smith, C. Rasmussen, and A. Girard. Gaussian process model based predictive control. Proceedings of the American Control Conference, 2004.
[5] S. Vijayakumar and S. Schaal. Locally weighted projection regression: An O(n) algorithm for incremental real time learning in high dimensional space. International Conference on Machine Learning, Proceedings of the Sixteenth Conference, 2000.
[6] Carl E. Rasmussen and Christopher K. Williams. Gaussian Processes for Machine Learning. MIT Press, 2006.
[7] Bernhard Schölkopf and Alex Smola. Learning with Kernels: Support Vector Machines, Regularization, Optimization and Beyond. MIT Press, Cambridge, MA, 2002.
[8] Mark W. Spong, Seth Hutchinson, and M. Vidyasagar. Robot Dynamics and Control. John Wiley and Sons, New York, 2006.
[9] S. Schaal. The SL simulation and real-time control software package. University of Southern California.
[10] D. Nguyen-Tuong. Machine learning for robot motor control. Thesis proposal (unpublished), Max Planck Institute for Biological Cybernetics, 2007.