Chapter
To read the full-text of this research, you can request a copy directly from the authors.

Abstract

Barycentric Subspaces have been defined in the context of manifolds using the notion of exponential barycenters. In this work, we extend the definition to quotient spaces which are not necessarily manifolds. We define an alignment map and a horizontal logarithmic map to introduce Quotient Barycentric Subspaces (QBS). Due to the discrete group action and the quotient structure, the characterization of the subspaces and the estimation of the projection of a point onto the subspace are far from trivial. We propose two algorithms for estimating the QBS and we discuss the results, underlining the possible next steps towards a robust estimation and their application to different data types.

Keywords: Discrete Group, Quotient Space, Barycentric Subspaces Analysis, Graph Space, Object Oriented Data Analysis


Article
Full-text available
Complex analyses involving multiple, dependent random quantities often lead to graphical models—a set of nodes denoting variables of interest, and corresponding edges denoting statistical interactions between nodes. To develop statistical analyses for graphical data, especially towards generative modeling, one needs mathematical representations and metrics for matching and comparing graphs, and subsequent tools, such as geodesics, means, and covariances. This paper utilizes a quotient structure to develop efficient algorithms for computing these quantities, leading to useful statistical tools, including principal component analysis, statistical testing, and modeling. We demonstrate the efficacy of this framework using datasets taken from several problem areas, including letters, biochemical structures, and social networks.
Article
Full-text available
Network data are becoming increasingly available, and so there is a need to develop suitable methodology for statistical analysis. Networks can be represented as graph Laplacian matrices, which are a type of manifold‐valued data. Our main objective is to estimate a regression curve from a sample of graph Laplacian matrices conditional on a set of Euclidean covariates, for example in dynamic networks where the covariate is time. We develop an adapted Nadaraya‐Watson estimator which has uniform weak consistency for estimation using Euclidean and power Euclidean metrics. We apply the methodology to the Enron email corpus to model smooth trends in monthly networks and highlight anomalous networks. Another motivating application is given in corpus linguistics, which explores trends in an author’s writing style over time based on word co‐occurrence networks.
Article
Full-text available
This paper investigates the generalization of Principal Component Analysis (PCA) to Riemannian manifolds. We first propose a new and more general type of family of subspaces in manifolds that we call barycentric subspaces. They are implicitly defined as the locus of points which are weighted means of k + 1 reference points. As this definition relies on points and not on tangent vectors, it can also be extended to geodesic spaces which are not Riemannian. For instance, in stratified spaces, it naturally allows principal subspaces that span several strata, which is impossible in previous generalizations of PCA. We show that barycentric subspaces locally define a submanifold of dimension k which generalizes geodesic subspaces. Second, we rephrase PCA in Euclidean spaces as an optimization on flags of linear subspaces (a hierarchy of properly embedded linear subspaces of increasing dimension). We show that the Euclidean PCA minimizes the sum of the unexplained variance by all the subspaces of the flag, also called the Area-Under-the-Curve (AUC) criterion. Barycentric subspaces are naturally nested, allowing the construction of hierarchically nested subspaces. Optimizing the AUC criterion to optimally approximate data points with flags of affine spans in Riemannian manifolds leads to a particularly appealing generalization of PCA on manifolds, that we call Barycentric Subspaces Analysis (BSA).
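As a rough illustration of the "weighted means of k + 1 reference points" idea, the sketch below covers only the flat Euclidean case, where weighted means with weights summing to one fill out the affine span of the reference points. It is not the manifold construction described in the abstract; the function name `project_to_affine_span` and its arguments are hypothetical.

```python
import numpy as np

def project_to_affine_span(x, refs):
    """Project x onto the affine span of the reference points (Euclidean case only).

    Returns the projected point and the weights (summing to one) that express
    it as a weighted mean of the reference points. Weights may be negative.
    """
    refs = np.asarray(refs, dtype=float)   # shape (k + 1, d)
    x = np.asarray(x, dtype=float)         # shape (d,)
    base = refs[0]
    directions = (refs[1:] - base).T       # shape (d, k)
    # Least-squares coordinates of x - base in the spanning directions.
    mu, *_ = np.linalg.lstsq(directions, x - base, rcond=None)
    weights = np.concatenate(([1.0 - mu.sum()], mu))
    return weights @ refs, weights

refs = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
point, weights = project_to_affine_span([0.25, 0.5, 3.0], refs)
```

Here three reference points in the plane z = 0 span that plane, so the projection simply drops the third coordinate of the query point.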
Article
Full-text available
A general framework is laid out for principal component analysis (PCA) on quotient spaces that result from an isometric Lie group action on a complete Riemannian manifold. If the quotient is a manifold, geodesics on the quotient can be lifted to horizontal geodesics on the original manifold. Thus, PCA on a manifold quotient can be pulled back to the original manifold. In general, however, the quotient space may no longer carry a manifold structure. Still, horizontal geodesics can be well-defined in the general case. This allows for the concept of generalized geodesics and orthogonal projection on the quotient space as the key ingredients for PCA. Generalizing a result of Bhattacharya and Patrangenaru (2003), geodesic scores can be defined outside a null set. Building on that, an algorithmic method to perform PCA on quotient spaces based on generalized geodesics is developed. As a typical example where non-manifold quotients appear, this framework is applied to Kendall's shape spaces. In fact, this work has been motivated by an application occurring in forest biometry where the current method of Euclidean linear approximation is unsuitable for performing PCA. This is illustrated by a data example of individual tree stems whose Kendall shapes fall into regions of high curvature of shape space: PCs obtained by Euclidean approximation fail to reflect between-data distances and thus cannot correctly explain data variation. Similarly, for a classical archeological data set with a large spread in shape space, geodesic PCA allows new insights that have not been available under PCA by Euclidean approximation. We conclude by reporting challenges, outlooks, and possible perspectives of intrinsic shape analysis.
Conference Paper
Full-text available
Diffusion tensor magnetic resonance imaging (DT-MRI) is emerging as an important tool in medical image analysis of the brain. However, relatively little work has been done on producing statistics of diffusion tensors. A main difficulty is that the space of diffusion tensors, i.e., the space of symmetric, positive-definite matrices, does not form a vector space. Therefore, standard linear statistical techniques do not apply. We show that the space of diffusion tensors is a type of curved manifold known as a Riemannian symmetric space. We then develop methods for producing statistics, namely averages and modes of variation, in this space. In our previous work we introduced principal geodesic analysis, a generalization of principal component analysis, to compute the modes of variation of data in Lie groups. In this work we expand the method of principal geodesic analysis to symmetric spaces and apply it to the computation of the variability of diffusion tensor data. We expect that these methods will be useful in the registration of diffusion tensor images, the production of statistical atlases from diffusion tensor data, and the quantification of the anatomical variability caused by disease.
Article
Statistical analysis for populations of networks is widely applicable but challenging as networks have strongly non-Euclidean behaviour. Graph space is an exhaustive framework for studying populations of unlabelled networks which are weighted or unweighted, uni- or multi-layered, directed or undirected. Viewing graph space as the quotient of a Euclidean space with respect to a finite group action, we show that it is not a manifold, and that its curvature is unbounded from above. Within this geometrical framework we define generalized geodesic principal components, and we introduce the align all and compute algorithms, all of which allow for the computation of statistics on graph space. The statistics and algorithms are compared with existing methods and empirically validated on three real datasets, showcasing the framework's potential utility. The whole framework is implemented within the geomstats Python package.
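To make the quotient by a finite group action concrete, here is a minimal sketch that assumes graphs are stored as adjacency matrices with the same number of nodes and that the group is the set of node relabellings. It brute-forces all permutations, so it is only usable for very small graphs. It is neither the align all and compute algorithm nor the geomstats API, and the names `permute_nodes` and `quotient_distance` are hypothetical.

```python
from itertools import permutations

import numpy as np

def permute_nodes(adj, perm):
    """Relabel nodes: entry (i, j) of the result is adj[perm[i], perm[j]]."""
    idx = np.asarray(perm)
    return adj[np.ix_(idx, idx)]

def quotient_distance(a, b):
    """Smallest Frobenius distance between a and any node relabelling of b.

    Brute force over all n! permutations; illustration only.
    """
    n = a.shape[0]
    best_dist, best_perm = np.inf, None
    for perm in permutations(range(n)):
        dist = np.linalg.norm(a - permute_nodes(b, perm))
        if dist < best_dist:
            best_dist, best_perm = dist, perm
    return best_dist, best_perm

# A weighted path graph and the same graph with its nodes relabelled.
a = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 2.0],
              [0.0, 2.0, 0.0]])
b = permute_nodes(a, (2, 0, 1))
dist, perm = quotient_distance(a, b)
```

The two matrices differ entry by entry, yet their distance in the quotient is zero because one is a relabelling of the other; choosing the minimizing permutation plays the role of an alignment step.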
Article
Understanding how unlabeled graphs depend on input values or vectors is of extreme interest in a range of applications. In this paper, we propose a regression model taking values in graph space, representing unlabeled graphs which can be weighted or unweighted, single- or multi-layer, and have the same or different numbers of nodes, as a function of a real-valued regressor. As graph space is not a manifold, well-known manifold regression models are not applicable. We provide flexible parametrized regression models for graph space, along with precise and computationally efficient estimation procedures given by the introduced align all and compute regression algorithm. We show the potential of the proposed model on three real datasets: a set of time-dependent cryptocurrency correlation matrices, a set of bus mobility usage networks in Copenhagen (DK) during the pandemic, and a set of team players’ passing networks for all the matches in the FIFA World Championship 2018.
Article
If A and B are nonvoid subsets of a metric space (X, d), the set of points x ∊ X for which d(x, A) = d(x, B) is called the equidistant set determined by A and B. Among other results, it is shown that if A and B are connected and X is Euclidean n-space, then the equidistant set determined by A and B is connected.
Chapter
In this chapter, we begin by introducing the simplest type of manifolds, the topological manifolds, which are topological spaces with three special properties that encode what we mean when we say that they “locally look like $\mathbb{R}^n$.” We then prove some important topological properties of manifolds that we use throughout the book. In the second section we introduce an additional structure, called a smooth structure, that can be added to a topological manifold to enable us to do calculus. Following the basic definitions, we introduce a number of examples of manifolds, so you can have something concrete in mind as you read the general theory. At the end of the chapter we introduce the concept of a smooth manifold with boundary, an important generalization of smooth manifolds that will have numerous applications throughout the book.
Article
The shape-space $\Sigma_m^k$ whose points $\sigma$ represent the shapes of not totally degenerate $k$-ads in $\mathbb{R}^m$ is introduced as a quotient space carrying the quotient metric. When $m = 1$, we find that $\Sigma_1^k = S^{k-2}$; when $m \geq 3$, the shape-space contains singularities. This paper deals mainly with the case $m = 2$, when the shape-space $\Sigma_2^k$ can be identified with a version of $\mathbb{C}P^{k-2}$. Of special importance are the shape-measures induced on $\mathbb{C}P^{k-2}$ by any assigned diffuse law of distribution for the $k$ vertices. We determine several such shape-measures, we resolve some of the technical problems associated with the graphic presentation and statistical analysis of empirical shape distributions, and among applications we discuss the relevance of these ideas to testing for the presence of non-accidental multiple alignments in collections of (i) neolithic stone monuments and (ii) quasars. Finally the recently introduced Ambartzumian density is examined from the present point of view, its norming constant is found, and its connexion with random Crofton polygons is established.
Article
The active field of Functional Data Analysis (about understanding the variation in a set of curves) has been recently extended to Object Oriented Data Analysis, which considers populations of more general objects. A particularly challenging extension of this set of ideas is to populations of tree-structured objects. We develop an analog of Principal Component Analysis for trees, based on the notion of tree-lines, and propose numerically fast (linear time) algorithms to solve the resulting optimization problems. The solutions we obtain are used in the analysis of a data set of 73 individuals, where each data object is a tree of blood vessels in one person's brain.