Marta Casanellas

Universitat Politècnica de Catalunya | UPC · Departament de Matemàtiques

About

62
Publications
2,280
Reads
599
Citations
Citations since 2017
24 Research Items
254 Citations
Introduction
Marta Casanellas currently works at the Departament de Matemàtiques, Universitat Politècnica de Catalunya. Marta does research in Evolutionary Biology, Algebra, and Algebraic geometry.
Publications

Preprint
In recent years, algebraic tools have proven useful in phylogenetic reconstruction and model selection through the study of phylogenetic invariants. However, up to now, the models studied from an algebraic viewpoint are either too general or too restrictive (such as group-based models with a uniform stationary distribution) to be used in...
Article
Full-text available
Homogeneity across lineages is a general assumption in phylogenetics according to which nucleotide substitution rates are common to all lineages. Many phylogenetic methods relax this hypothesis but keep a simple enough model to make the process of sequence evolution more tractable. On the other hand, dealing successfully with the general case (hete...
Preprint
Full-text available
Homogeneity across lineages is a common assumption in phylogenetics according to which nucleotide substitution rates remain constant in time and do not depend on lineages. This is a simplifying hypothesis which is often adopted to make the process of sequence evolution more tractable. However, its validity has been explored and put into question in...
Article
We present the phylogenetic quartet reconstruction method SAQ (Semi-Algebraic Quartet reconstruction). SAQ is consistent with the most general Markov model of nucleotide substitution and, in particular, it allows for rate heterogeneity across lineages. Based on the algebraic and semi-algebraic description of distributions that arise from the genera...
Article
Modelling the substitution of nucleotides along a phylogenetic tree is usually done by a hidden Markov process. This allows one to define a distribution of characters at the leaves of the tree, and one might be able to obtain polynomial relationships among the probabilities of different characters. The study of these polynomials and the geometry of the...
Preprint
Consider the problem of learning undirected graphical models on trees from corrupted data. Recently Katiyar et al. showed that it is possible to recover trees from noisy binary data up to a small equivalence class of possible trees. Their other paper on the Gaussian case follows a similar pattern. By framing this as a special phylogenetic recovery...
Article
A Markov matrix is embeddable if it can represent a homogeneous continuous-time Markov process. It is well known that if a Markov matrix has real and pairwise-different eigenvalues, then the embeddability can be determined by checking whether its principal logarithm is a rate matrix or not. The same holds for Markov matrices that are close enough t...
Preprint
We present the phylogenetic quartet reconstruction method SAQ (Semi-algebraic quartet reconstruction). SAQ is consistent with the most general Markov model of nucleotide substitution and, in particular, it allows for rate heterogeneity across lineages. Based on the algebraic and semi-algebraic description of distributions that arise from the genera...
Preprint
Characterizing whether a Markov process of discrete random variables has a homogeneous continuous-time realization is a hard problem. In practice, this problem reduces to deciding when a given Markov matrix can be written as the exponential of some rate matrix (a Markov generator). This is an old question known in the literature as the embedding p...
Preprint
A Markov matrix is embeddable if it can represent a homogeneous continuous-time Markov process. It is well known that if a Markov matrix has real and pairwise-different eigenvalues, then the embeddability can be determined by checking whether its principal logarithm is a rate matrix or not. The same holds for Markov matrices close enough to the ide...
Preprint
Less rigid than phylogenetic trees, phylogenetic networks allow the description of a wider range of evolutionary events. In this note, we explain how to extend the rank invariants from phylogenetic trees to phylogenetic networks evolving under the general Markov model and the equivariant models.
Article
Full-text available
Deciding whether a substitution matrix is embeddable (i.e. the corresponding Markov process has a continuous-time realization) is an open problem even for \(4\times 4\) matrices. We study the embedding problem and rate identifiability for the K80 model of nucleotide substitution. For these \(4\times 4\) matrices, we fully characterize the set of em...
Preprint
Modelling the substitution of nucleotides along a phylogenetic tree is usually done by a hidden Markov process. This allows one to define a distribution of characters at the leaves of the tree, and one might be able to obtain polynomial relationships among the probabilities of different characters. The study of these polynomials and the geometry of the...
Preprint
Algebraic statistics uses tools from algebra (especially from multilinear algebra, commutative algebra and computational algebra), geometry and combinatorics to provide insight into knotty problems in mathematical statistics. In this survey we illustrate this on three problems related to networks, namely network models for relational data, causal s...
Preprint
Deciding whether a Markov matrix is embeddable (i.e. can be written as the exponential of a rate matrix) is an open problem even for $4\times 4$ matrices. We study the embedding problem and rate identifiability for the K80 model of nucleotide substitution. For these $4\times 4$ matrices, we fully characterize the set of embeddable K80 Markov matric...
Article
Full-text available
We present an algorithm for the unsupervised learning of latent variable models based on the method of moments. We give efficient estimates of the moments for two models that are well known, e.g., in text mining, the single-topic model and latent Dirichlet allocation, and we provide a tensor decomposition algorithm for the moments that proves to be...
Article
In many areas of applied linear algebra, it is necessary to work with matrix approximations. A common situation arises when a matrix obtained from experimental or simulated data needs to be approximated by a matrix that lies in a corresponding statistical model and satisfies some specific properties. In this short note, we focus on symmetric and...
Article
Phylogenetic varieties related to equivariant substitution models have been studied extensively in recent years. One of the main objectives has been finding a set of generators of the ideal of these varieties, but this has not yet been achieved in some cases (for example, for the general Markov model this involves the open “salmon conjecture”, see [2...
Article
Full-text available
The reconstruction of phylogenetic trees from molecular sequence data relies on modelling site substitutions by a Markov process, or a mixture of such processes. In general, allowing mixed processes can result in different tree topologies becoming indistinguishable from the data, even for infinitely long sequences. However, when the underlying Mark...
Article
Algebraic statistics uses tools from algebra (especially from multilinear algebra, commutative algebra, and computational algebra), geometry, and combinatorics to provide insight into knotty problems in mathematical statistics. In this review, we illustrate this on three problems related to networks: network models for relational data, causal struc...
Article
Full-text available
This paper presents an algorithm for the unsupervised learning of latent variable models from unlabeled sets of data. We base our technique on spectral decomposition, providing a technique that proves to be robust both in theory and in practice. We also describe how to use this algorithm to learn the parameters of two well known text mining models:...
Article
Full-text available
Phylogenetic varieties related to equivariant substitution models have been studied extensively in recent years. One of the main objectives has been finding a set of generators of the ideal of these varieties, but this has not yet been achieved in some cases (for example, for the general Markov model this involves the open "salmon conjecture") and it...
Article
Full-text available
One reason classical phylogenetic reconstruction methods fail to correctly infer the underlying topology is that they assume oversimplified models. In this paper we propose a quartet reconstruction method consistent with the most general Markov model of nucleotide substitution, which can also deal with data coming from mixtures on the same t...
Article
Full-text available
Motivated by phylogenetics, our aim is to obtain a system of equations that define a phylogenetic variety on an open set containing the biologically meaningful points. In this paper we consider phylogenetic varieties defined via group-based models. For any finite abelian group $G$, we provide an explicit construction of $\operatorname{codim} X$ phylogenetic invar...
Article
Full-text available
Background: The reconstruction of the phylogenetic tree topology of four taxa is still one of the main challenges in phylogenetics. Its difficulties lie in considering evolutionary models that are not too restrictive and in correctly dealing with the long-branch attraction problem. The correct reconstruction of 4-taxon trees is crucial for making qu...
Article
Full-text available
Background: The selection of an evolutionary model to best fit given molecular data is usually a heuristic choice. In his seminal book, J. Felsenstein suggested that certain linear equations satisfied by the expected probabilities of patterns observed at the leaves of a phylogenetic tree could be used for model selection. It remained an open questio...
Article
Full-text available
Background: A number of software packages are available to generate DNA multiple sequence alignments (MSAs) evolved under continuous-time Markov processes on phylogenetic trees. On the other hand, methods of simulating the DNA MSA directly from the transition matrices do not exist. Moreover, existing software restricts to the time-reversible models...
Data
A zipped (.zip) file containing the C++ implementation of GenNon-h.
Article
The goal of branch length estimation in phylogenetic inference is to estimate the divergence time between a set of sequences based on compositional differences between them. A number of software packages are currently available to facilitate branch length estimation for homogeneous and stationary evolutionary models. Homogeneity of the evolutionary process i...
Article
Full-text available
In phylogenetic inference, an evolutionary model describes the substitution processes along each edge of a phylogenetic tree. Misspecification of the model has important implications for the analysis of phylogenetic data. Conventionally, however, the selection of a suitable evolutionary model is based on heuristics or relies on the choice of an app...
Article
Under a Markovian evolutionary process, the expected number of substitutions per site (also called branch length) that have occurred when a sequence has evolved from another according to a transition matrix $P$ can be approximated by $-\frac{1}{4}\log \det P$. When the Markov process is assumed to be continuous in time, i.e. $P=\exp(Qt)$, it is easy to simulat...
Article
Recently there have been several attempts to provide a whole set of generators of the ideal of the algebraic variety associated to a phylogenetic tree evolving under an algebraic model. These algebraic varieties have been proven to be useful in phylogenetics. In this paper we prove that, for phylogenetic reconstruction purposes, it is enough to con...
Article
Full-text available
A new approach to phylogenetic reconstruction has emerged in recent years. Given an evolutionary model, the joint probability distribution of the nucleotides for these species satisfies some algebraic constraints called invariants. These invariants have theoretical and practical interest, since they can be used to infer phylogenies. In this p...
Article
Full-text available
In this paper we characterize non-connected Buchsbaum curves C in P^n and we give a sharp bound for the number of disjoint connected components of C.
Article
Full-text available
We prove that, for every $r \geq 2$, the moduli space $M^s_X(r; c_1, c_2)$ of rank $r$ stable vector bundles with Chern classes $c_1 = rH$ and $c_2 = \frac{1}{2}(3r^2 - r)$ on a nonsingular cubic surface $X \subset \mathbb{P}^3$ contains a nonempty smooth open subset formed by ACM bundles, i.e. vector bundles with no intermediate cohomology. The bundles we consider for this study are extremal for...
Article
The Kimura 3-parameter model on a tree of n leaves is one of the most used in phylogenetics. The affine algebraic variety W associated to it is a toric variety. We study its geometry and we prove that it is isomorphic to a geometric quotient of the affine space by a finite group, which is completely described. As a consequence, we are able to study...
Article
Full-text available
An attempt to use phylogenetic invariants for tree reconstruction was made at the end of the 80s and the beginning of the 90s by several authors (the initial idea is due to Lake and to Cavender and Felsenstein in 1987). However, the efficiency of methods based on invariants is still in doubt, probably because these methods only used few generators of the...
Article
Full-text available
In this paper we prove that the generalized version of the Minimal Resolution Conjecture stated by Mustata holds for certain general sets of points on a smooth cubic surface $X \subset \mathbb{P}^3$. The main tool used is Gorenstein liaison theory and, more precisely, the relationship between the free resolutions of two linked schemes.
Article
For a finite set of points X ⊆ P^n and for a given point P ∈ X, the notions of a separator of P in X (a hypersurface containing all the points in X except P) and of the degree of P in X (the minimum degree of these separators) have been studied extensively. In this paper we extend these notions to a set of points X on a projectively normal surface S ⊆ P^n, cons...
Article
This chapter is concerned with the description of the Small Trees website which can be found at the following web address: The goal of the website is to make available in a unified format various algebraic features of different phylogenetic models on small trees. By “small” we mean trees with at most 5 taxa. In the first two sections, we describe a...
Article
Full-text available
This chapter is devoted to the study of strand symmetric Markov models on trees from the standpoint of algebraic statistics. A strand symmetric Markov model is one whose mutation probabilities reflect the symmetry induced by the double-stranded structure of DNA (see Chapter 4). In particular, a strand symmetric model for DNA must have the following...
Article
Full-text available
In this article we give an introduction to the applications of algebraic geometry in phylogenetics. Because most of the evolutionary models used in phylogenetics correspond to algebraic varieties, the ideal associated with these varieties can be used to give a new approach to phylogenetic inference. Peer reviewed
Article
Let X be a normal arithmetically Gorenstein scheme in . We give a criterion for all codimension two ACM subschemes of X to be in the same Gorenstein biliaison class on X, in terms of the category of ACM sheaves on X. These are sheaves that correspond to the graded maximal Cohen–Macaulay modules on the homogeneous coordinate ring of X. Using known r...
Article
We prove that if $X \subset \mathbb{P}^N$ has dimension k and it is r-Buchsbaum with r > max (codim X - k, 0), then X is contained in at most one variety of minimal degree and dimension k + 1.
Article
In this paper we compute the Hilbert functions of irreducible (or smooth) and reduced arithmetically Gorenstein schemes that are twisted anti-canonical divisors on arithmetically Cohen–Macaulay schemes. We also prove some folklore results characterizing the Hilbert functions of irreducible standard determinantal schemes, and we use them to produce...
Article
Full-text available
We study Gorenstein liaison of codimension two subschemes of an arithmetically Gorenstein scheme X. Our main result is a criterion for two such subschemes to be in the same Gorenstein liaison class, in terms of the category of ACM sheaves on X. As a consequence we obtain a criterion for X to have the property that every codimension 2 arithmetically...
Article
The theory of Gorenstein liaison has been developed during the last 3 years to generalize liaison theory of codimension 2 schemes to schemes of codimension ≥ 3 in a projective space. One of the main open questions in Gorenstein liaison theory is whether any arithmetically Cohen-Macaulay subscheme of ℙ^n is in the Gorenstein liaison class of a comp...
Article
We answer a question proposed by Hartshorne about the Lazarsfeld–Rao property for even Gorenstein liaison classes.
Article
Liaison theory has been extensively studied during the past decades. In codimension 2, the theory has reached a very satisfactory state, but in higher codimensions there are still many open problems. In this paper we prove that two unions $V= \bigcup_{i=1}^k L_i$ and $V'= \bigcup_{i=1}^{k'} L'_i$ of independent linear varieties of dimension $d \geq...
Article
Let be an arithmetically Cohen–Macaulay subscheme. In terms of Gorenstein liaison it is natural to ask whether C is in the Gorenstein liaison class of a complete intersection. In this paper, we study the Gorenstein liaison classes of arithmetically Cohen–Macaulay divisors on standard determinantal schemes and on rational normal scrolls. As main res...
Article
We discuss the problem of whether arithmetically Gorenstein schemes are in the Gorenstein liaison class of a complete intersection. We present some examples of arithmetically Gorenstein schemes that are indeed in the Gorenstein liaison class of a complete intersection. In the recent research on Gorenstein liaison theory, the question whether any a...
Article
Full-text available
In this paper we characterize non-connected Buchsbaum curves C in P^n and we give a sharp bound for the number of disjoint connected components of C.
Article
Full-text available
An attempt to use phylogenetic invariants for tree reconstruction was made at the end of the 80s and the beginning of the 90s by several authors (the initial idea due to Lake [Lake, 1987] and Cavender and Felsenstein [Cavender and Felsenstein, 1987]). However, the efficiency of methods based on invariants is still in doubt ([Huelsenbeck, 1995], [...
Article
Full-text available
"... Algebraic varieties appear naturally when considering statistical models used in genomics and phylogenetics. We will explain the relationship between these statistical models and algebraic geometry. We will also see how to use these algebraic varieties to recover the ancestral relationships between species, that is, rec...
