Comparison of Tree-Child Phylogenetic Networks

Department of Mathematics and Computer Science, University of the Balearic Islands, E-07122 Palma de Mallorca, Spain.
IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM (Impact Factor: 1.44). 01/2010; 6(4):552-69. DOI: 10.1109/TCBB.2007.70270
Source: DBLP


Phylogenetic networks are a generalization of phylogenetic trees that allow for the representation of nontreelike evolutionary events, such as recombination, hybridization, or lateral gene transfer. While much progress has been made in finding practical algorithms for reconstructing a phylogenetic network from a set of sequences, all attempts to endow a class of phylogenetic networks (strictly extending the class of phylogenetic trees) with a well-founded distance measure have, to the best of our knowledge and with the sole exception of the bipartition distance on regular networks, failed so far. In this paper, we present and study a new meaningful class of phylogenetic networks, called tree-child phylogenetic networks, and we provide an injective representation of these networks as multisets of vectors of natural numbers, their path multiplicity vectors. We then use this representation to define a distance on this class that extends the well-known Robinson-Foulds distance for phylogenetic trees and to give an alignment method for pairs of networks in this class. Simple polynomial-time algorithms are provided for reconstructing a tree-child phylogenetic network from its path multiplicity vectors, for computing the distance between two tree-child phylogenetic networks, and for aligning a pair of tree-child phylogenetic networks. They have been implemented as a Perl package and a Java applet, which can be found at
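The path multiplicity representation described in the abstract can be illustrated with a minimal Python sketch. It assumes that a network is given as a dictionary mapping each internal node to its list of children, that the vector of a node counts the directed paths from that node to each leaf, and that the distance is the size of the symmetric difference of the two multisets of vectors. The function names `mu_vectors` and `mu_distance` and the input format are hypothetical and are not taken from the Perl package or the Java applet.

```python
from collections import Counter


def mu_vectors(children, leaves):
    """Return the multiset (Counter) of path multiplicity vectors of a DAG.

    children: dict mapping each internal node to a list of its children
              (hypothetical input format chosen for this sketch).
    leaves:   ordered list of leaf labels; entry i of each vector counts
              the directed paths from the node to leaves[i].
    """
    index = {leaf: i for i, leaf in enumerate(leaves)}
    memo = {}

    def mu(v):
        if v in memo:
            return memo[v]
        if v in index:
            # A leaf reaches only itself, by the trivial path.
            vec = tuple(1 if i == index[v] else 0 for i in range(len(leaves)))
        else:
            # Paths from v are the paths from its children, summed entrywise.
            vec = tuple(0 for _ in leaves)
            for c in children.get(v, []):
                vec = tuple(a + b for a, b in zip(vec, mu(c)))
        memo[v] = vec
        return vec

    nodes = set(children) | {c for cs in children.values() for c in cs}
    return Counter(mu(v) for v in nodes)


def mu_distance(a, b):
    """Size of the symmetric difference of two multisets of vectors."""
    return sum(((a - b) + (b - a)).values())


leaves = ["1", "2", "3"]
# Two trees on the same leaves: ((1,2),3) and (1,(2,3)).
tree_a = {"r": ["u", "3"], "u": ["1", "2"]}
tree_b = {"r": ["1", "w"], "w": ["2", "3"]}
# A network in which leaf "2" hangs below a hybrid node "h" with parents "a" and "b".
network = {"r": ["a", "b"], "a": ["1", "h"], "b": ["h", "3"], "h": ["2"]}

print(mu_distance(mu_vectors(tree_a, leaves), mu_vectors(tree_b, leaves)))
print(mu_distance(mu_vectors(tree_a, leaves), mu_vectors(network, leaves)))
```

In the network example the root has vector (1, 2, 1), because two distinct paths lead from it to leaf "2". Vectors with entries greater than 1 are what distinguish reticulate networks from trees in this representation.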



Available from: Francesc Rossello, Jan 14, 2014
    • "But there are binary, non-planar stable networks. In fact, tree-child networks, galled trees, and galled networks are all stable [2] [23] "
    ABSTRACT: In this work, we answer an open problem in the study of phylogenetic networks. Phylogenetic trees are rooted binary trees in which all edges are directed away from the root, whereas phylogenetic networks are rooted acyclic digraphs. For the purpose of evolutionary model validation, biologists often want to know whether or not a phylogenetic tree is contained in a phylogenetic network. The tree containment problem is NP-complete even for very restricted classes of networks such as tree-sibling phylogenetic networks. We prove that this problem is solvable in cubic time for stable phylogenetic networks. A linear time algorithm is also presented for the cluster containment problem.
    • "It is interesting that the matrix H includes the information utilized in [4] to produce a metric on tree-child networks. More precisely, if N is both DC and tree-child, then the vectors µ(v) utilized in [4] have the entries H_{v,x} for x ∈ X. Even DC networks can be very large. "
    ABSTRACT: Phylogenetic networks are rooted acyclic directed graphs in which the leaves are identified with members of a set X of species. The cluster of a vertex is the set of leaves that are descendants of the vertex. A network is "distinct-cluster" if distinct vertices have distinct clusters. This paper focuses on the set DC(X) of distinct-cluster networks whose leaves are identified with the members of X. For a fixed X, a metric on DC(X) is defined. There is a "cluster-preserving" simplification process by which vertices or certain arcs may be removed without changing the clusters of remaining vertices. Many of the resulting networks may be uniquely determined without regard to the order of the simplifying operations.
    • "The following lemma, which is Lemma 2 in [4], gives some alternative useful characterizations of TC networks. "
    ABSTRACT: In general, a phylogenetic network is a graphical representation of an evolutionary history that involves reticulate events like recombinations, hybridizations, or lateral gene transfers. Tree-child reticulate networks (TC networks) are a special class of phylogenetic networks that can represent evolutionary histories where, despite the existence of such reticulate events, every ancestral species has some descendant through mutations. In this paper we establish two equivalent characterizations of the families of clusters of TC networks. These characterizations yield a simple, polynomial-time algorithm that decides whether or not a given family of clusters on a set of taxa is the family of clusters of some TC network and, when the answer is positive, outputs a TC network that is a minimal reticulate network representing this family of clusters. This algorithm is based on the notion of cluster network introduced by Huson and Rupp, and it has been implemented in a Python package and a companion web tool, which are freely available on the web.
    Fundamenta Informaticae 01/2014; 134(1-1-2):1-15. DOI:10.3233/FI-2014-1087 · 0.72 Impact Factor