Thesis

Quartet weights in phylogenetic network reconstruction

Authors:

Abstract

Quartet weights encode the quantitative substructure of all quartets (four-taxon subsets) in the taxon set under analysis; by comparison, distances encode the substructure of all pairs. The quartet weights of a taxon set therefore contain more information than distances and can potentially be used to infer phylogenetic history more accurately. Traditionally, quartet weights have been used in tree reconstruction, but we found that they are more appropriately understood in the context of phylogenetic networks. The main part of this work builds a theory of quartet weights and discusses applications of quartet-weight-based methods to phylogenetic network reconstruction. A phylogenetic network is an alternative to a phylogenetic tree that allows reticulate events to be represented. The main part consists of two chapters. The first chapter aims to build a quartet-weight analog of the theory of metrics, which yields fruitful results, including a linear dependency theorem and analogs of the split decomposition and NeighborNet algorithms. Although most of the generalizations are not direct and fail to retain all the desirable properties of the metric theory, the resulting methods proved practically useful in many cases. We also show that the T-theory for quartet weights fails to explain the 2-very-weakly-compatible condition. When applying these methods to a real dataset in this chapter, we found that the existing methods for calculating quartet weights are unsatisfactory, so a more accurate method is needed. The second chapter introduces a novel method that calculates quartet weights using Hadamard conjugation. The method also incorporates rate variation, which significantly improves its performance. Compared with existing methods such as pattern counting and maximum-likelihood-based methods, Hadamard conjugation generates more accurate quartet weights and yields more accurate phylogenetic networks, as verified on both simulated and real datasets.
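As a rough illustration of the kind of transform involved, the sketch below runs Hadamard conjugation in its simplest textbook setting: four taxa and a two-state symmetric model, with no rate variation. The function names, the bitmask indexing of splits (taxon 4 as the reference taxon), and the edge lengths are assumptions made for this example; the procedure the thesis uses to turn a recovered spectrum into quartet weights is not described in this abstract.

```python
import math


def fwht(vec):
    """Return H @ vec, where H is the Sylvester Hadamard matrix.

    H[a][b] = (-1) ** popcount(a & b); len(vec) must be a power of two.
    """
    out = list(vec)
    h = 1
    while h < len(out):
        for start in range(0, len(out), 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


def spectrum_to_patterns(q):
    """Edge-length spectrum q -> site-pattern probabilities s = H^-1 exp(H q)."""
    r = [math.exp(x) for x in fwht(q)]
    return [x / len(q) for x in fwht(r)]


def patterns_to_spectrum(s):
    """Site-pattern probabilities s -> spectrum q = H^-1 ln(H s).

    Assumes every entry of H s is positive, which holds for model-generated s.
    """
    rho = [math.log(x) for x in fwht(s)]
    return [x / len(s) for x in fwht(rho)]


def make_spectrum(lengths):
    """Build a length-8 spectrum for 4 taxa from {split_index: edge_length}.

    A split is indexed by the bitmask of taxa 1..3 on the side not containing
    taxon 4: 1, 2, 4 are pendant edges of taxa 1, 2, 3; 7 is the pendant edge
    of taxon 4; 3, 5, 6 are the splits 12|34, 13|24, 14|23.
    """
    q = [0.0] * 8
    for index, length in lengths.items():
        q[index] = length
    q[0] = -sum(q[1:])  # convention: the entries of q sum to zero
    return q


# Hypothetical tree ((1,2),(3,4)) with illustrative edge lengths.
tree_edges = {1: 0.05, 2: 0.10, 4: 0.07, 7: 0.08, 3: 0.03}
q = make_spectrum(tree_edges)
s = spectrum_to_patterns(q)          # expected site-pattern frequencies
q_back = patterns_to_spectrum(s)     # conjugation inverts the forward map

for name, index in (("12|34", 3), ("13|24", 5), ("14|23", 6)):
    print(name, round(q_back[index], 6))
```

On data generated from a tree containing the split 12|34, the recovered entries for 13|24 and 14|23 come back as zero up to rounding. With observed pattern frequencies, all three two-versus-two entries are generally nonzero, and these are the kind of quantities a quartet-weight method could draw on.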
Finally, an epilogue establishes some results on more general types of split systems and clusters, especially on their maximal cardinalities. These systems are derived from methods that use such higher-order data. The order of the maximal cardinality of $(p,q)$-hierarchies is determined explicitly. Other important results are: the maximal cardinality of a $(-1,3)$-hierarchy is between $n^3/9+O(n^2)$ and $n^3/6+O(n^2)$; the maximal cardinality of a $2'$-weakly compatible split system is between $3n^2/4+O(n)$ and $n^2+O(n)$; and the maximal cardinality of a 2-weakly compatible split system is between $3n^2/2+O(n)$ and $O(n^{2.5})$.