Fast calculation of the quartet distance between trees of arbitrary degree.

Department of Computer Science, University of Aarhus, Aabogade 34, DK-8200 Århus N, Denmark.
Algorithms for Molecular Biology (Impact Factor: 1.46). 02/2006; 1(16):16. DOI: 10.1186/1748-7188-1-16
Source: DBLP


A number of algorithms have been developed for calculating the quartet distance between two evolutionary trees on the same set of species. The quartet distance is the number of quartets (sub-trees induced by four leaves) that differ between the trees. Most of these algorithms are restricted to binary trees, but recently we have developed algorithms that work on trees of arbitrary degree.
We present a fast algorithm for computing the quartet distance between trees of arbitrary degree. Given input trees T and T', the algorithm runs in time O(n + |V||V'| min{id, id'}) and space O(n + |V||V'|), where n is the number of leaves in the two trees, V and V' are the sets of non-leaf nodes in T and T', respectively, and id and id' are the maximal number of non-leaf nodes adjacent to a non-leaf node in T and T', respectively. The fastest algorithms previously published for arbitrary degree trees run in O(n^3) (independent of the degree of the tree) and O(|V||V'| id^2 id'^2), respectively. We experimentally compare the algorithm with existing algorithms for computing the quartet distance for general trees.
We present a new algorithm for computing the quartet distance between two trees of arbitrary degree. The new algorithm improves the asymptotic running time for computing the quartet distance, compared to previous methods, and experimental results indicate that the new method also performs significantly better in practice.
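To make the definition of the quartet distance concrete, the following is a minimal brute-force sketch in Python. It is not the algorithm presented in the paper: it simply enumerates all four-leaf subsets and compares their induced topologies, so it is only practical for very small trees. The helper names (`build_tree`, `edge_splits`, `topology`, `quartet_distance`) and the edge-list input format are assumptions made for this illustration. A quartet whose four leaves are not separated two-against-two by any edge is treated as unresolved (a star), which is what allows trees of arbitrary degree.

```python
from itertools import combinations


def build_tree(edges):
    """Build an undirected adjacency map from a list of (u, v) edges (hypothetical input format)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def leaf_set(adj):
    """Leaves are the nodes with exactly one neighbour."""
    return {v for v, nbrs in adj.items() if len(nbrs) == 1}


def edge_splits(adj):
    """For every edge, collect the set of leaves on one side of it."""
    leaves = leaf_set(adj)
    root = next(iter(adj))
    splits = []

    def below(node, parent):
        found = {node} if node in leaves else set()
        for nbr in adj[node]:
            if nbr != parent:
                found |= below(nbr, node)
        if parent is not None:
            # Leaves hanging below the edge (parent, node).
            splits.append(frozenset(found))
        return found

    below(root, None)
    return splits


def topology(quartet, splits):
    """Return the induced pairing {{w, x}, {y, z}}, or None for a star quartet."""
    a, b, c, d = quartet
    pairings = (((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c)))
    for (w, x), (y, z) in pairings:
        for side in splits:
            in_side = [leaf in side for leaf in (w, x, y, z)]
            if in_side in ([True, True, False, False], [False, False, True, True]):
                return frozenset({frozenset({w, x}), frozenset({y, z})})
    return None


def quartet_distance(adj1, adj2):
    """Count the four-leaf subsets whose induced topology differs between the trees."""
    leaves = leaf_set(adj1)
    if leaves != leaf_set(adj2):
        raise ValueError("both trees must have the same leaf set")
    s1, s2 = edge_splits(adj1), edge_splits(adj2)
    return sum(
        topology(q, s1) != topology(q, s2)
        for q in combinations(sorted(leaves), 4)
    )


# Illustrative trees on the leaves a..e (node names are made up for this example).
binary = build_tree([("u", "a"), ("u", "b"), ("u", "v"), ("v", "c"),
                     ("v", "w"), ("w", "d"), ("w", "e")])
star = build_tree([("s", leaf) for leaf in "abcde"])
print(quartet_distance(binary, star))
```

Because the sketch checks every quartet against every edge, its cost grows at least with the fourth power of the number of leaves, which is the kind of overhead the algorithms discussed above are designed to avoid.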

    • "The quartet distance can be computed in time O(n log n) for binary trees [5], in time O(d^9 n log n) for trees where all nodes have degree less than d [6], and in sub-cubic time for general trees [7]. See also Christiansen et al. [8] for a number of algorithms for general trees with different tradeoffs depending on the degree of inner nodes. For the triplet distance, O(n^2) time algorithms exist for both binary and general trees [2,9]."
    ABSTRACT: The triplet distance is a distance measure that compares two rooted trees on the same set of leaves by enumerating all sub-sets of three leaves and counting how often the induced topologies of the trees are equal or different. We present an algorithm that computes the triplet distance between two rooted binary trees in time O(n log^2 n). The algorithm is related to an algorithm for computing the quartet distance between two unrooted binary trees in time O(n log n). While the quartet distance algorithm has a very severe overhead in the asymptotic time complexity that makes it impractical compared to O(n^2) time algorithms, we show through experiments that the triplet distance algorithm can be implemented to give a competitive wall-clock running time.
    Full-text · Article · Jan 2013 · BMC Bioinformatics
    • "Distance measures have typically been constructed with fully resolved (i.e., binary) trees in mind. Algorithms for computing distances between non-binary trees are few [92], [93], [94] and little work has been done on deriving methods that normalize properly when comparing trees that differ in the level of resolution of inner nodes [93]. "
    ABSTRACT: Recent advances in automated assessment of basic vocabulary lists allow the construction of linguistic phylogenies useful for tracing dynamics of human population expansions, reconstructing ancestral cultures, and modeling transition rates of cultural traits over time. Here we investigate the Tupi expansion, a widely-dispersed language family in lowland South America, with a distance-based phylogeny based on 40-word vocabulary lists from 48 languages. We coded 11 cultural traits across the diverse Tupi family including traditional warfare patterns, post-marital residence, corporate structure, community size, paternity beliefs, sibling terminology, presence of canoes, tattooing, shamanism, men's houses, and lip plugs. The linguistic phylogeny supports a Tupi homeland in west-central Brazil with subsequent major expansions across much of lowland South America. Consistently, ancestral reconstructions of cultural traits over the linguistic phylogeny suggest that social complexity has tended to decline through time, most notably in the independent emergence of several nomadic hunter-gatherer societies. Estimated rates of cultural change across the Tupi expansion are on the order of only a few changes per 10,000 years, in accord with previous cultural phylogenetic results in other language families around the world, and indicate a conservative nature to much of human culture.
    Full-text · Article · Apr 2012 · PLoS ONE
    • "We used a standard tree-comparison metric, the quartets distance [27], [28], [58], to quantify how congruent the Gray et al. [3] tree was with the traditional linguistic subgroupings. The quartets distance measures the number of different combinations of four language subsets in both trees."
    ABSTRACT: We recently used computational phylogenetic methods on lexical data to test between two scenarios for the peopling of the Pacific. Our analyses of lexical data supported a pulse-pause scenario of Pacific settlement in which the Austronesian speakers originated in Taiwan around 5,200 years ago and rapidly spread through the Pacific in a series of expansion pulses and settlement pauses. We claimed that there was high congruence between traditional language subgroups and those observed in the language phylogenies, and that the estimated age of the Austronesian expansion at 5,200 years ago was consistent with the archaeological evidence. However, the congruence between the language phylogenies and the evidence from historical linguistics was not quantitatively assessed using tree comparison metrics. The robustness of the divergence time estimates to different calibration points was also not investigated exhaustively. Here we address these limitations by using a systematic tree comparison metric to calculate the similarity between the Bayesian phylogenetic trees and the subgroups proposed by historical linguistics, and by re-estimating the age of the Austronesian expansion using only the most robust calibrations. The results show that the Austronesian language phylogenies are highly congruent with the traditional subgroupings, and the date estimates are robust even when calculated using a restricted set of historical calibrations.
    Full-text · Article · Mar 2010 · PLoS ONE