In recent years, an extensive literature has appeared that uses Lie group theory to solve problems in computer vision. Lie groups are the natural representation of a space of transformations, and the Lie algebra is the tangent space of a Lie group at the identity. Passing from the Lie group to its Lie algebra establishes a mapping from the multiplicative group structure to an equivalent vector space representation, which makes correlation calculations rational and precise. Because the Lie algebra has a linear structure, many statistical learning methods can be readily applied. This survey briefly reviews the different approaches to using Lie group theory that researchers have developed. It introduces the mathematical background of Lie group theory relevant to computer vision and describes the main approaches in detail according to two categories.
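
As a minimal illustration of the group-to-algebra mapping this abstract describes, the sketch below uses the rotation group SO(3) as an assumed example (the abstract itself does not single out a group). The helper names `hat`, `so3_exp`, and `so3_log` are hypothetical, and the logarithm is only valid for rotation angles below pi.

```python
import math


def hat(w):
    """Map a 3-vector to the skew-symmetric matrix it represents in so(3)."""
    x, y, z = w
    return [[0.0, -z, y],
            [z, 0.0, -x],
            [-y, x, 0.0]]


def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def so3_exp(w):
    """Exponential map so(3) -> SO(3) via Rodrigues' formula."""
    theta = math.sqrt(sum(c * c for c in w))
    K = hat(w)
    K2 = matmul(K, K)
    if theta < 1e-8:
        a, b = 1.0, 0.5  # limits of sin(t)/t and (1 - cos(t))/t^2 as t -> 0
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / theta ** 2
    return [[(1.0 if i == j else 0.0) + a * K[i][j] + b * K2[i][j]
             for j in range(3)] for i in range(3)]


def so3_log(R):
    """Logarithm map SO(3) -> so(3), returned as a 3-vector (angle < pi)."""
    cos_theta = max(-1.0, min(1.0, (R[0][0] + R[1][1] + R[2][2] - 1.0) / 2.0))
    theta = math.acos(cos_theta)
    # The antisymmetric part of R equals sin(theta) times hat(axis).
    v = [(R[2][1] - R[1][2]) / 2.0,
         (R[0][2] - R[2][0]) / 2.0,
         (R[1][0] - R[0][1]) / 2.0]
    scale = 1.0 if theta < 1e-8 else theta / math.sin(theta)
    return [scale * c for c in v]


w = [0.1, -0.2, 0.3]          # a point in the Lie algebra (a plain vector)
R = so3_exp(w)                # the corresponding rotation matrix
w_back = so3_log(R)           # back to the vector space representation
```

Once rotations are expressed as vectors such as `w_back`, ordinary vector-space operations (averaging, distances, linear regression) can be applied to them, which is the property the abstract highlights.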

... We gave particular emphasis to problems of evolution in classical mechanics and problems of signal processing. This is by no means an exhaustive survey; other interesting areas of application are for example problems in vision and medical imaging, see for instance [40,73]. ...

... We gave particular emphasis to problems of evolution in classical mechanics and problems of signal processing. This is by no means an exhaustive survey; other interesting areas of application are for example problems in vision and medical imaging, see for instance [40,72]. ...

We give a short and elementary introduction to Lie group methods. A selection of applications of Lie group integrators are discussed. Finally, a family of symplectic integrators on cotangent bundles of Lie groups is presented and the notion of discrete gradient methods is generalised to Lie groups.

The paper deals with identifying linear dynamical systems from experimental data obtained by applying test signals to the system. Its objective is to determine both the form and the coefficients of the transfer function from hodograph samples obtained experimentally in bench tests. The order of the frequency transfer function of the system being identified is assumed to be unknown. Noise is expected when the frequency characteristics of a real system are measured, so the points of the experimentally obtained hodograph are randomly shifted. A certain transfer function is adopted as the model of the system, and the authors propose to solve the identification problem in the class of hodographs specified by that model. The unknown coefficients of the model transfer function are found by minimizing a proximity criterion (measure), described and published earlier by one of the authors, between the experimentally obtained system hodograph and the hodograph of the system model over the entire set of experimental points. Identifying a linear dynamic system from its frequency hodograph is thereby reduced to solving a system of equations, derived from the model frequency transfer function, that is linear in the unknown parameters. The proposed algorithm also determines the order of the frequency transfer function of the identified system from the experimentally obtained hodograph samples. For dynamic systems of at most fifth order, software has been developed that simulates the process, provides pseudo-experimental data with random errors, and determines the parameters of such systems. A computational experiment was carried out to evaluate the error with which the proposed algorithm determines the parameter values of the system being identified. This illustrative experiment showed that the error in the estimated coefficients of the frequency transfer function is comparable with the range of the measurement error in the experimental hodograph samples. According to the authors, the identification method described in this publication does not appear in known sources on the identification of linear dynamic systems. The method can find application in experimental testing, in situ verification tests, and iron bird tests for vehicles of various purposes.

In recent years, complex-, quaternion-, and Clifford-valued neural networks have been intensively studied. This paper introduces Lie algebra-valued bidirectional associative memories, an alternative generalization of the real-valued neural networks, for which the states, outputs, and thresholds are all from a Lie algebra. The definition of these networks is given, together with an expression for an energy function, that is indeed proven to be an energy function for the proposed network.

This paper introduces Lie algebra-valued Hopfield neural networks, for which the states, outputs, weights and thresholds are all from a Lie algebra. This type of networks represents an alternative generalization of the real-valued neural networks besides the complex-, hyperbolic-, quaternion-, and Clifford-valued neural networks that have been intensively studied over the last few years. The dynamics of these networks from the energy function point of view is studied by giving the expression of such a function and proving that it is indeed an energy function for the proposed network.

This paper introduces advances in the study of Lie group machine learning (LML) from three aspects. First, it presents the reasons for choosing Lie groups to describe features, aiming to clarify the differences between LML and traditional machine learning methods. By showing extensive applications in the area of artificial intelligence, it illustrates the universality of the Lie group representation. Second, it outlines the main LML algorithms proposed so far, with an emphasis on recent research progress. Finally, in light of current developments, it points out some future research directions for LML.

This paper introduces Lie algebra-valued feedforward neural networks, for which the inputs, outputs, weights and biases are all from a Lie algebra. This type of networks represents an alternative generalization of the real-valued neural networks besides the complex-, hyperbolic-, quaternion-, and Clifford-valued neural networks that have been intensively studied over the last few years. The full deduction of the gradient descent algorithm for training such networks is presented. The proposed networks are tested on two synthetic function approximation problems and on geometric transformations, the results being promising for the future of Lie algebra-valued neural networks.

The K-nearest-neighbor (KNN) algorithm is known for being simple and efficient, and it is widely used for classification problems or as a benchmark for them. For different data types, especially the complex-structured and high-dimensional data encountered in real life, choosing a distance metric between sample points is a relatively complex problem. The feature space of KNN is generally an n-dimensional real vector space. This article converts the samples in the vector space into elements consistent with the nature of a Lie group and then proposes the Li-KNN algorithm, based on Lie group theory, to solve the classification problem. Experiments on handwritten numerals show good results.

We propose a simple and elegant algorithm to track nonrigid objects using a covariance based object description and a Lie algebra based update mechanism. We represent an object window as the covariance matrix of features, therefore we manage to capture the spatial and statistical properties as well as their correlation within the same representation. The covariance matrix enables efficient fusion of different types of features and modalities, and its dimensionality is small. We incorporated a model update algorithm using the Lie group structure of the positive definite matrices. The update mechanism effectively adapts to the undergoing object deformations and appearance changes. The covariance tracking method does not make any assumption on the measurement noise and the motion of the tracked objects, and provides the global optimal solution. We show that it is capable of accurately detecting the nonrigid, moving objects in non-stationary camera sequences while achieving a promising detection rate of 97.4 percent.

This paper presents a novel learning based tracking model combined with object detection. The existing techniques proceed by linearizing the motion, which makes an implicit Euclidean space assumption. Most of the transformations used in computer vision have matrix Lie group structure. We learn the motion model on the Lie algebra and show that the formulation minimizes a first order approximation to the geodesic error. The learning model is extended to train a class specific tracking function, which is then integrated to an existing pose dependent object detector to build a pose invariant object detection algorithm. The proposed model can accurately detect objects in various poses, where the size of the search space is only a fraction compared to the existing object detection methods. The detection rate of the original detector is improved by more than 90% for large transformations.

Principal component analysis has proven to be useful for understanding geometric variability in populations of parameterized objects. The statistical framework is well understood when the parameters of the objects are elements of a Euclidean vector space. This is certainly the case when the objects are described via landmarks or as a dense collection of boundary points. We have been developing representations of geometry based on the medial axis description or m-rep. Although this description has proven to be effective, the medial parameters are not naturally elements of a Euclidean space. In this paper we show that medial descriptions are in fact elements of a Lie group. We develop methodology based on Lie groups for the statistical analysis of medially-defined anatomical objects.

The mean shift algorithm is widely applied for nonparametric clustering in Euclidean spaces. Recently, mean shift was generalized for clustering on matrix Lie groups. We further extend the algorithm to a more general class of nonlinear spaces, the set of analytic manifolds. As examples, two specific classes of frequently occurring parameter spaces, Grassmann manifolds and Lie groups, are considered. When the algorithm proposed here is restricted to matrix Lie groups the previously proposed method is obtained. The algorithm is applied to a variety of robust motion segmentation problems and multibody factorization. The motion segmentation method is robust to outliers, does not require any prior specification of the number of independent motions and simultaneously estimates all the motions present.

We present a new algorithm to detect pedestrians in still images utilizing covariance matrices as object descriptors. Since the descriptors do not form a vector space, well-known machine learning techniques are not well suited to learn the classifiers. The space of d-dimensional nonsingular covariance matrices can be represented as a connected Riemannian manifold. The main contribution of the paper is a novel approach for classifying points lying on a connected Riemannian manifold using the geometry of the space. The algorithm is tested on INRIA and DaimlerChrysler pedestrian datasets where superior detection rates are observed over the previous approaches.

The Gaussian distribution is the basis for many methods used in the statistical analysis of shape. One such method is principal component analysis, which has proven to be a powerful technique for describing the geometric variability of a population of objects. The Gaussian framework is well understood when the data being studied are elements of a Euclidean vector space. This is the case for geometric objects that are described by landmarks or dense collections of boundary points. We have been using medial representations, or m-reps, for modelling the geometry of anatomical objects. The medial parameters are not elements of a Euclidean space, and thus standard PCA is not applicable. In our previous work we have shown that the m-rep model parameters are instead elements of a Lie group. In this paper we develop the notion of a Gaussian distribution on this Lie group. We then derive the maximum likelihood estimates of the mean and the covariance of this distribution. Analogous to principal component analysis of covariance in Euclidean spaces, we define principal geodesic analysis on Lie groups for the study of anatomical variability in medially-defined objects. Results of applying this framework on a population of hippocampi in a schizophrenia study are presented.

For analyzing shapes of planar, closed curves, we propose differential geometric representations of curves using their direction functions and curvature functions. Shapes are represented as elements of infinite-dimensional spaces and their pairwise differences are quantified using the lengths of geodesics connecting them on these spaces. We use a Fourier basis to represent tangents to the shape spaces and then use a gradient-based shooting method to solve for the tangent that connects any two shapes via a geodesic. Using the Surrey fish database, we demonstrate some applications of this approach: 1) interpolation and extrapolations of shape changes, 2) clustering of objects according to their shapes, 3) statistics on shape spaces, and 4) Bayesian extraction of shapes in low-quality images.

We propose a nonlinear filter for estimating the trajectory of a random walk on a matrix Lie group with constant computational complexity. It is based on a finite-dimensional approximation of the conditional distribution of the state, given past measurements, via a set of fair samples, which are updated at each step and proven to be consistent with the updated conditional distribution. The algorithm proposed, like other Monte Carlo methods, can in principle track arbitrary distributions evolving on arbitrarily large state spaces. However, several issues concerning sample impoverishment need to be taken into account when designing practical working systems.

Deformable template representations of observed imagery model the variability of target pose via the actions of the matrix Lie groups on rigid templates. In this paper, we study the construction of minimum mean squared error estimators on the special orthogonal group, SO(n), for pose estimation. Due to the nonflat geometry of SO(n), the standard Bayesian formulation of optimal estimators and their characteristics requires modifications. By utilizing the Hilbert-Schmidt metric defined on GL(n), a larger group containing SO(n), a mean squared criterion is defined on SO(n). The Hilbert-Schmidt estimate (HSE) is defined to be a minimum mean squared error estimator, restricted to SO(n). The expected error associated with the HSE is shown to be a lower bound, called the Hilbert-Schmidt bound (HSB), on the error incurred by any other estimator. Analysis and algorithms are presented for evaluating the HSE and the HSB in cases of both ground-based and airborne targets.

In this paper we give precise definitions of different, properly invariant notions of mean or average rotation. Each mean is associated with a metric in SO(3). The metric induced from the Frobenius inner product gives rise to a mean rotation that is given by the closest special orthogonal matrix to the usual arithmetic mean of the given rotation matrices. The mean rotation associated with the intrinsic metric on SO(3) is the Riemannian center of mass of the given rotation matrices. We show that the Riemannian mean rotation shares many common features with the geometric mean of positive numbers and the geometric mean of positive Hermitian operators. We give some examples with closed-form solutions of both notions of mean.
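
The two notions of mean rotation named in this abstract can be written as minimization problems. The block below is a sketch in assumed notation: $R_1, \dots, R_N$ for the given rotations, $\bar{R}_{\mathrm{F}}$ and $\bar{R}_{\mathrm{R}}$ for the two means, and $\operatorname{Log}$ for the principal matrix logarithm are labels chosen here, not taken from the abstract.

```latex
% Mean induced by the Frobenius inner product:
% the rotation closest to the arithmetic mean of the R_i.
\bar{R}_{\mathrm{F}}
  = \arg\min_{R \in SO(3)} \sum_{i=1}^{N} \lVert R_i - R \rVert_F^{2}

% Mean induced by the intrinsic (Riemannian) metric:
% the Riemannian center of mass of the R_i.
\bar{R}_{\mathrm{R}}
  = \arg\min_{R \in SO(3)} \sum_{i=1}^{N}
    \lVert \operatorname{Log}\!\left(R_i^{\top} R\right) \rVert_F^{2}
```

Expanding the first objective shows that it depends on the data only through $\sum_i R_i$, which is why its minimizer is the projection of the arithmetic mean onto SO(3). Constant factors in either metric do not change the minimizers.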

This book is an introductory graduate-level textbook on the theory of smooth manifolds. Its goal is to familiarize students with the tools they will need in order to use manifolds in mathematical or scientific research: smooth structures, tangent vectors and covectors, vector bundles, immersed and embedded submanifolds, tensors, differential forms, de Rham cohomology, vector fields, flows, foliations, Lie derivatives, Lie groups, Lie algebras, and more. The approach is as concrete as possible, with pictures and intuitive discussions of how one should think geometrically about the abstract concepts, while making full use of the powerful tools that modern mathematics has to offer. Along the way, the book introduces students to some of the most important examples of geometric structures that manifolds can carry, such as Riemannian metrics, symplectic structures, and foliations. The book is aimed at students who already have a solid acquaintance with general topology, the fundamental group, and covering spaces, as well as basic undergraduate linear algebra and real analysis. John M. Lee is Professor of Mathematics at the University of Washington in Seattle, where he regularly teaches graduate courses on the topology and geometry of manifolds. He was the recipient of the American Mathematical Society's Centennial Research Fellowship and he is the author of two previous Springer books, Introduction to Topological Manifolds (2000) and Riemannian Manifolds: An Introduction to Curvature (1997).

A novel image projective registration algorithm based on a Riemannian manifold is presented. We use the SL(3) group as the projective parametric transformation to exploit the geometric structure of the underlying space and obtain the geodesics on SL(3) through a variational method. Then, we define a new Riemannian exponential mapping. Finally, we develop a new image projective registration algorithm based on Riemannian analysis on SL(3). In addition, we analyze the advantages of our method and give a direct proof of the local quadratic convergence of the algorithm. Comparative experiments demonstrate that this method improves efficiency and accuracy more significantly than the image projective registration algorithm based on vector spaces and outperforms the projective registration algorithm based on Lie groups.

We present a novel method for modeling dynamic visual phenomena, which consists of two key aspects. First, the integral motion of constituent elements in a dynamic scene is captured by a common underlying geometric transform process. Second, a Lie algebraic representation of the transform process is introduced, which maps the transformation group to a vector space, and thus overcomes the difficulties due to the group structure. Consequently, the statistical learning techniques based on vector spaces can be readily applied. Moreover, we discuss the intrinsic connections between the Lie algebra and linear dynamical processes, showing that our model induces spatially varying fields that can be estimated from local motions without continuous tracking. Following this, we further develop a statistical framework to robustly learn the flow models from noisy and partially corrupted observations. The proposed methodology is demonstrated on real-world phenomena, inferring common motion patterns from surveillance videos of crowded scenes and satellite data of weather evolution.

We define the Newton iteration for solving the equation f(y) = 0, where f is a map from a Lie group to its corresponding Lie algebra. Two versions are presented, which are formulated independently of any metric on the Lie group. Both formulations reduce to the standard method in the Euclidean case, and are related to existing algorithms on certain Riemannian manifolds. In particular, we show that, under classical assumptions on f, the proposed method converges quadratically. We illustrate the techniques by solving a fixed-point problem arising from the numerical integration of a Lie-type initial value problem via implicit Euler.

The Lie transformation group model of neuropsychology (LTG/NP) purports to represent and explain how the locally smooth processes observed in the visual field, and their integration into the global field of visual phenomena, are consequences of special properties of the underlying neuronal complex. These properties are modeled by a specific set of mathematical structures that account both for local (infinitesimal) operations and for their generation of the “integral curves” that are visual contours. The purpose of this tutorial paper is to expound, as nontechnically as possible, the mathematical basis for LTG/NP, and to evaluate that model against a reasonable set of criteria for a neuropsychological theory. It is shown that this approach to spatial vision is closer to the mainstream of current theoretical work than might be assumed; recent experimental support for LTG/NP is described.

This paper introduces a 3-d representation of vehicles as a space of scale and orientation transformations that define the shape of individual vehicle instances. This shape space forms a group, where the similarity of different vehicle observations can be evaluated using a distance measure defined by Lie group theory. A generic class of vehicles (e.g. SUV) is represented by a set of curves on the Lie group manifold, called geodesics. The classification of any given vehicle instance is achieved by finding the class with the smallest Lie distance between the geodesics and the vehicle shape. Vehicle recognition is carried out on 3-d LIDAR point clouds. The performance of the Lie classifier is evaluated against two other approaches and found to provide superior recognition performance, particularly with respect to the ability to generalize from a small number of labeled prototypes.

The main purpose of this paper is to estimate 2D and 3D transformation parameters. All the group transformations are represented in terms of their Lie algebra elements. The Lie algebra approach ensures that the estimate follows the shortest path, or geodesic, in the Lie group involved. For the estimation of the Lie algebra parameters, we take advantage of the theory of system identification. Two experiments are presented to show the potential of the method. First, we carry out the estimation of the affine or projective parameters related to the transformation involved in monocular region tracking. Second, we develop a monocular method to estimate 3D motion of an object in the visual space. In the latter, the six parameters of the rigid motion are estimated based on measurements of the six parameters of the affine transformation in the image.

In this paper, we propose distributed algorithms for estimating the average pose of an object viewed by a localized network of camera motes. To this effect, we propose distributed averaging consensus algorithms on the group of 3D rigid-body transformations, SE(3). We rigorously analyze the convergence of the proposed algorithms, and show that naive generalizations of Euclidean consensus algorithms fail to converge to the correct solution. We also provide synthetic experiments that confirm our analysis and validate our approach.

A novel approach to visual servoing is presented, which takes advantage of the structure of the Lie algebra of affine transformations. The aim of this project is to use feedback from a visual sensor to guide a robot arm to a target position. The target position is learned using the principle of ‘teaching by showing’ in which the supervisor places the robot in the correct target position and the system captures the necessary information to be able to return to that position. The sensor is placed in the end effector of the robot, the ‘camera-in-hand’ approach, and thus provides direct feedback of the robot motion relative to the target scene via observed transformations of the scene. These scene transformations are obtained by measuring the affine deformations of a target planar contour (under the weak perspective assumption), captured by use of an active contour, or snake. Deformations of the snake are constrained using the Lie groups of affine and projective transformations. Properties of the Lie algebra of affine transformations are exploited to provide a novel method for integrating observed deformations of the target contour. These can be compensated with appropriate robot motion using a non-linear control structure. The local differential representation of contour deformations is extended to allow accurate integration of an extended series of small perturbations. This differs from existing approaches by virtue of the properties of the Lie algebra representation which implicitly embeds knowledge of the three-dimensional world within a two-dimensional image-based system. These techniques have been implemented using a video camera to control a 5 DoF robot arm. Experiments with this implementation are presented, together with a discussion of the results.

The objective of this paper is to propose a new homography-based approach to image-based visual tracking and servoing. The visual tracking algorithm proposed in the paper is based on a new efficient second-order minimization method. Theoretical analysis and comparative experiments with other tracking approaches show that the proposed method has a higher convergence rate than standard first-order minimization techniques. Therefore, it is well adapted to real-time robotic applications. The output of the visual tracking is a homography linking the current and the reference image of a planar target. Using the homography, a task function isomorphic to the camera pose has been designed. A new image-based control law is proposed which does not need any measure of the 3D structure of the observed target (e.g. the normal to the plane). The theoretical proof of the existence of the isomorphism between the task function and the camera pose and the theoretical proof of the stability of the control law are provided. The experimental results, obtained with a 6 d.o.f. robot, show the advantages of the proposed method with respect to the existing approaches. KEY WORDS: visual tracking, visual servoing, efficient second-order minimization, homography-based control law

A fundamental problem in biological and machine vision is visual invariance: How are objects perceived to be the same despite transformations such as translations, rotations, and scaling? In this letter, we describe a new, unsupervised approach to learning invariances based on Lie group theory. Unlike traditional approaches that sacrifice information about transformations to achieve invariance, the Lie group approach explicitly models the effects of transformations in images. As a result, estimates of transformations are available for other purposes, such as pose estimation and visuomotor control. Previous approaches based on first-order Taylor series expansions of images can be regarded as special cases of the Lie group approach, which utilizes a matrix-exponential-based generative model of images and can handle arbitrarily large transformations. We present an unsupervised expectation-maximization algorithm for learning Lie transformation operators directly from image data containing examples of transformations. Our experimental results show that the Lie operators learned by the algorithm from an artificial data set containing six types of affine transformations closely match the analytically predicted affine operators. We then demonstrate that the algorithm can also recover novel transformation operators from natural image sequences. We conclude by showing that the learned operators can be used to both generate and estimate transformations in images, thereby providing a basis for achieving visual invariance.
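
The relationship this abstract draws between the matrix-exponential generative model and the earlier first-order Taylor approaches can be sketched as follows. The notation is assumed for illustration: $\mathbf{I}(x)$ is the image (as a vector) after a transformation of size $x$, $G$ is a Lie operator matrix, and $\mathbb{1}$ is the identity matrix; none of these symbols appear in the abstract.

```latex
% Matrix-exponential generative model (valid for large x):
\mathbf{I}(x) = e^{xG}\,\mathbf{I}(0)
             = \Bigl(\mathbb{1} + xG + \tfrac{x^{2}}{2!}G^{2} + \cdots\Bigr)\mathbf{I}(0)

% Truncating after the linear term gives the first-order special case
% (accurate only for small x):
\mathbf{I}(x) \approx \mathbf{I}(0) + x\,G\,\mathbf{I}(0)
```

Under this reading, learning the operator amounts to estimating $G$ from pairs of images related by a transformation, while estimating a transformation amounts to inferring $x$ for a known $G$.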

We propose a new method to estimate multiple rigid motions from noisy 3D point correspondences in the presence of outliers. The method does not require prior specification of the number of motion groups and estimates all the motion parameters simultaneously. We start by generating samples from the rigid motion distribution. The motion parameters are then estimated via mode finding operations on the sampled distribution. Since rigid motions do not lie on a vector space, classical statistical methods cannot be used for mode finding. We develop a mean shift algorithm which estimates modes of the sampled distribution using the Lie group structure of the rigid motions. We also show that the proposed mean shift algorithm is general and can be applied to any distribution having a matrix Lie group structure. Experimental results on synthetic and real image data demonstrate the superior performance of the algorithm.
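
One common way to write a mean shift step on a matrix Lie group is sketched below. The notation is assumed rather than quoted from the abstract: $X_1, \dots, X_n$ are the sampled motions, $Y$ is the current mode estimate, $h$ is a bandwidth, $g$ is the kernel profile derivative, and $d$ is a distance defined through the matrix logarithm.

```latex
% Distance between two group elements, measured in the Lie algebra:
d(Y, X) = \lVert \log\!\left(Y^{-1} X\right) \rVert_F

% Mean shift vector: a weighted average of tangent vectors at Y.
m_h(Y) =
  \frac{\sum_{i=1}^{n} g\!\left(d^{2}(Y, X_i)/h^{2}\right)\,\log\!\left(Y^{-1} X_i\right)}
       {\sum_{i=1}^{n} g\!\left(d^{2}(Y, X_i)/h^{2}\right)}

% Update: map the averaged tangent vector back to the group.
Y \leftarrow Y \exp\!\bigl(m_h(Y)\bigr)
```

In a Euclidean space, where $\log(Y^{-1}X_i)$ reduces to $X_i - Y$ and the update to $Y + m_h(Y)$, this becomes the standard mean shift iteration.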

Motivated by applications in fuzzy control, robotics and vision, this paper considers the problem of computing the centre of mass (precisely, the Karcher mean) of a set of points defined on a compact Lie group, such as the special orthogonal group consisting of all orthogonal matrices with unit determinant. An iterative algorithm, whose derivation is based on the geometry of the problem, is proposed. It is proved to be globally convergent. Interestingly, the proof starts by showing the algorithm is actually a Riemannian gradient descent algorithm with fixed step size.
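
The kind of iteration this abstract describes, a fixed-step Riemannian gradient descent toward the Karcher mean, can be sketched for SO(3) as follows. This is an illustrative sketch under assumptions, not the paper's exact algorithm: the unit step size, the initialization at the first sample, the stopping tolerance, and all function names are choices made here, and the logarithm assumes relative rotation angles below pi.

```python
import math


def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(A):
    """Transpose of a 3x3 matrix (the inverse, for a rotation)."""
    return [list(row) for row in zip(*A)]


def so3_exp(w):
    """Exponential map so(3) -> SO(3) via Rodrigues' formula."""
    x, y, z = w
    K = [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]
    K2 = matmul(K, K)
    theta = math.sqrt(x * x + y * y + z * z)
    if theta < 1e-8:
        a, b = 1.0, 0.5  # small-angle limits
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / theta ** 2
    return [[(1.0 if i == j else 0.0) + a * K[i][j] + b * K2[i][j]
             for j in range(3)] for i in range(3)]


def so3_log(R):
    """Logarithm map SO(3) -> so(3) as a 3-vector (angle < pi)."""
    cos_theta = max(-1.0, min(1.0, (R[0][0] + R[1][1] + R[2][2] - 1.0) / 2.0))
    theta = math.acos(cos_theta)
    v = [(R[2][1] - R[1][2]) / 2.0,
         (R[0][2] - R[2][0]) / 2.0,
         (R[1][0] - R[0][1]) / 2.0]
    scale = 1.0 if theta < 1e-8 else theta / math.sin(theta)
    return [scale * c for c in v]


def karcher_mean(rotations, iters=50, tol=1e-12):
    """Iteratively move the estimate along the averaged tangent vector."""
    mean = rotations[0]
    n = len(rotations)
    for _ in range(iters):
        # Express every sample in the tangent space at the current estimate.
        tangent = [0.0, 0.0, 0.0]
        for R in rotations:
            v = so3_log(matmul(transpose(mean), R))
            tangent = [t + c / n for t, c in zip(tangent, v)]
        # A vanishing average tangent means the estimate is a critical point.
        if math.sqrt(sum(t * t for t in tangent)) < tol:
            break
        mean = matmul(mean, so3_exp(tangent))
    return mean
```

For example, for three rotations about the z-axis by 0.1, 0.2, and 0.6 rad, `so3_log(karcher_mean(...))` is a rotation vector along z with angle 0.3 rad, matching the ordinary average of the angles because these rotations commute.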

Visual cues are often very difficult to track. We use an effective least squares estimation of the Lie algebra parameters to find the affine transformation involved in visual region tracking. These parameters represent the geodesics of the optimal transformation orbit. Our experiments validate the effectiveness of the method.

While motion estimation has been extensively studied in the computer vision literature, the inherent information redundancy in an image sequence has not been well utilised. In particular as many as N(N-1)/2 pairwise relative motions can be estimated efficiently from a sequence of N images. This highly redundant set of observations can be efficiently averaged resulting in fast motion estimation algorithms that are globally consistent. In this paper we demonstrate this using the underlying Lie-group structure of motion representations. The Lie-algebras of the Special Orthogonal and Special Euclidean groups are used to define averages on the Lie-group which in turn gives statistically meaningful, efficient and accurate algorithms for fusing motion information. Using multiple constraints also controls the drift in the solution due to accumulating error. The performance of the method in estimating camera motion is demonstrated on image sequences.