On Two Measures of Distance for Fully-Labelled Trees

Abstract

Talk given at CPM 2020
On Two Measures of Distance for Fully-Labelled Trees
Giulia Bernardini¹, Paola Bonizzoni¹, Paweł Gawrychowski²
¹University of Milano - Bicocca, Italy
²University of Wrocław, Poland
Giulia Bernardini On Two Measures of Distance for Fully-Labelled Trees CPM 2020
Why?
Ingredients
A finite set of n labels L = { }
Two rooted trees fully labelled by L
Two operations: link&cut and permutation
[Figure: two example trees T and S]
[CPM’19] A rearrangement distance for fully-labelled trees
Link&cut
Link&cut operation: cut the edge between a node and its parent, then link that node to a new parent.
[Figure: trees T and S, illustrating a link&cut operation on T]
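The link&cut operation can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the slides: the tree is stored as a child-to-parent map (roots map to `None`), the names `link_and_cut` and `is_ancestor` are hypothetical, and the check that the new parent is not inside the moved subtree is an added safeguard.

```python
from typing import Dict, Hashable, Optional

# Assumed representation: child label -> parent label, roots map to None.
ParentMap = Dict[Hashable, Optional[Hashable]]


def is_ancestor(parent: ParentMap, a: Hashable, b: Hashable) -> bool:
    """Return True if a is b itself or an ancestor of b."""
    node: Optional[Hashable] = b
    while node is not None:
        if node == a:
            return True
        node = parent[node]
    return False


def link_and_cut(parent: ParentMap, v: Hashable, w: Hashable) -> ParentMap:
    """Cut the edge between v and its parent, then link v (with its subtree) under w."""
    if parent[v] is None:
        raise ValueError("v is a root: there is no edge to cut")
    if is_ancestor(parent, v, w):
        raise ValueError("w lies in the subtree of v: linking would create a cycle")
    updated = dict(parent)  # leave the input tree untouched
    updated[v] = w
    return updated


# Hypothetical example tree: r has children a and b, and c is a child of a.
T = {"r": None, "a": "r", "b": "r", "c": "a"}
S = link_and_cut(T, "c", "b")  # c now hangs under b
```

Returning a new map rather than mutating the input keeps T available for comparison with the result.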
Permutation
[Figure: a permutation applied to the labels of T, with S for comparison]
Operational distances
Permutation distance between two isomorphic trees S, T: the smallest size of a permutation that transforms T into S
Rearrangement distance between any two trees S, T with identical roots: the smallest size of any sequence of link&cut and permutation operations that transforms T into S without permuting the root
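The permutation distance can be made concrete with a brute-force sketch for very small trees. Several things here are assumptions rather than statements from the talk: trees are child-to-parent maps, a permutation relabels every node x as pi[x], and the size of a permutation is taken to be the number of labels it moves. The function names are hypothetical, and the exhaustive search is exponential, so it only illustrates the definition and is not the algorithm from the talk.

```python
from itertools import permutations
from typing import Dict, Hashable, Optional

# Assumed representation: child label -> parent label, the root maps to None.
ParentMap = Dict[Hashable, Optional[Hashable]]


def apply_permutation(parent: ParentMap, pi: Dict[Hashable, Hashable]) -> ParentMap:
    """Relabel every node x as pi[x]; the shape of the tree is unchanged."""
    return {pi[c]: (None if p is None else pi[p]) for c, p in parent.items()}


def permutation_size(pi: Dict[Hashable, Hashable]) -> int:
    """Assumed size measure: the number of labels that are not fixed points."""
    return sum(1 for x, y in pi.items() if x != y)


def permutation_distance_bruteforce(t: ParentMap, s: ParentMap) -> Optional[int]:
    """Try every permutation of the labels; return the smallest size, or None."""
    labels = list(t)
    best: Optional[int] = None
    for image in permutations(labels):
        pi = dict(zip(labels, image))
        if apply_permutation(t, pi) == s:
            size = permutation_size(pi)
            if best is None or size < best:
                best = size
    return best


# Hypothetical example: c hangs under a in T and under b in S.
T = {"r": None, "a": "r", "b": "r", "c": "a"}
S = {"r": None, "a": "r", "b": "r", "c": "b"}
distance = permutation_distance_bruteforce(T, S)  # 2: swap the labels a and b
```

The function returns `None` when no permutation works, which is the case when the two trees are not isomorphic.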
Contributions
Permutation distance
  CPM 2019: $O(n^3)$ time algorithm
  This work: equivalent to Bipartite Maximum Matching; $\tilde{O}(n^{4/3+o(1)})$ time algorithm
Rearrangement distance
  CPM 2019: NP-hard; constant-factor approximation algorithm for binary trees
  This work: constant-factor approximation algorithm for any two trees
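For readers unfamiliar with the problem named above, here is a minimal sketch of Bipartite Maximum Matching using the textbook augmenting-path method. It shows neither the reduction from the permutation distance nor the fast algorithm behind the stated running time. The function name and the example graph are hypothetical.

```python
from typing import Dict, Hashable, List, Set


def maximum_bipartite_matching(adj: Dict[Hashable, List[Hashable]]) -> Dict[Hashable, Hashable]:
    """Return a maximum matching as a map from right vertices to left vertices.

    adj maps each left vertex to the right vertices it is adjacent to.
    """
    match_of_right: Dict[Hashable, Hashable] = {}

    def try_augment(u: Hashable, visited: Set[Hashable]) -> bool:
        # Look for an augmenting path that starts at the left vertex u.
        for v in adj[u]:
            if v in visited:
                continue
            visited.add(v)
            # v is free, or the left vertex currently matched to v can move elsewhere.
            if v not in match_of_right or try_augment(match_of_right[v], visited):
                match_of_right[v] = u
                return True
        return False

    for u in adj:
        try_augment(u, set())
    return match_of_right


# Hypothetical example graph with left vertices x1..x3 and right vertices y1..y3.
graph = {"x1": ["y1", "y2"], "x2": ["y1"], "x3": ["y2", "y3"]}
matching = maximum_bipartite_matching(graph)
```

This simple method runs in time proportional to the number of vertices times the number of edges, which is slower than the bound quoted on the slide.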
Operational distances
Rearrangement distance revised: the smallest size of any sequence of cut and permutation operations that transforms T into a forest T’ ~ S without permuting the root
[Figure: a forest T’ and a tree S with T’ ~ S]
Rearrangement distance: what is difficult?
[Figure: two forests $F_1$ and $F_2$]
Constant-factor approximation algorithm
Four steps: step i transforms $F_1^{i-1}$ into $F_1^{i}$ with ALG(i) operations, where $\mathrm{ALG}(i) = O(d(F_1^{i-1}, F_2))$
Step 1: cut the grandparents
Goal: make the nodes with different children in $F_1$ and $F_2$ roots
$\mathrm{ALG}(1) \le 4\, d(F_1, F_2)$
[Figure: forests $F_1$ and $F_2$, and the forest $F_1^1$ obtained after step 1]
Step 2: let the children vote!
Goal: make sure that no two children of a node in $F_1^2$ have different parents in $F_2$
For each node of $F_1^1$, each child votes for its representative (its parent in $F_2$). The majority wins, the rest are cut.
$\mathrm{ALG}(2) \le 2\, d(F_1^1, F_2)$
[Figure: forests $F_1^1$ and $F_2$, and the forest $F_1^2$ obtained after step 2]
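The voting step can be sketched as follows. This is only an illustration under assumptions not stated in the slides: both forests are child-to-parent maps over the same label set, with roots mapping to `None`. "Majority" is read as the most frequent representative, with ties broken arbitrarily. A child that is a root in $F_2$ has no representative and is cut. The name `vote_step` is hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, Optional, Tuple

# Assumed representation: child label -> parent label, roots map to None.
ParentMap = Dict[Hashable, Optional[Hashable]]


def vote_step(f1: ParentMap, f2: ParentMap) -> Tuple[ParentMap, int]:
    """For every node of f1, keep the children whose parent in f2 wins the vote."""
    children = defaultdict(list)
    for child, par in f1.items():
        if par is not None:
            children[par].append(child)

    result = dict(f1)
    cuts = 0
    for kids in children.values():
        # Each child votes for its representative, i.e. its parent in f2.
        votes = Counter(f2[k] for k in kids if f2[k] is not None)
        winner = votes.most_common(1)[0][0] if votes else None
        for k in kids:
            # Assumption: children without a representative are cut as well.
            if winner is None or f2[k] != winner:
                result[k] = None  # cutting turns the child into a root
                cuts += 1
    return result, cuts


# Hypothetical example: in F1 the node u has children a, b, c;
# in F2, a and b are children of p, and c is a child of q.
F1 = {"u": None, "p": None, "q": None, "a": "u", "b": "u", "c": "u"}
F2 = {"u": None, "p": None, "q": None, "a": "p", "b": "p", "c": "q"}
F1_after, num_cuts = vote_step(F1, F2)  # c loses the vote and is cut
```

After the call, the children that remain under any node all share the same representative, which is the stated goal of the step.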
Step 3: let the parents fight!
Goal: make sure that no two nodes in $F_1^3$ have children with the same representative
Among the parents of the nodes in $F_1^2$ that have the same representative, the one with more children wins.
$\mathrm{ALG}(3) \le 2\, d(F_1^2, F_2)$
[Figure: forests $F_1^2$ and $F_2$, and the forest $F_1^3$ obtained after step 3]
Step 4: permute the rest
Goal: make sure that no two nodes in $F_1^4$ have the same representative
Permute each node of $F_1^3$ with the representative of its children.
$\mathrm{ALG}(4) \le 4\, d(F_1^3, F_2)$
[Figure: forests $F_1^3$ and $F_2$, and the forest $F_1^4$ obtained after step 4]
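One way to see why the four per-step bounds add up to a constant factor is to chain them. The sketch below assumes, without support from the slides, that each step can increase the remaining distance by at most the number of operations it performs, that is, $d(F_1^{i}, F_2) \le d(F_1^{i-1}, F_2) + \mathrm{ALG}(i)$. Under that assumption, the function `chained_bound` (a hypothetical name) computes the total cost in units of $d(F_1, F_2)$. The resulting number is a naive illustration, not the constant claimed in the talk, which may well be smaller.

```python
from typing import Iterable


def chained_bound(step_factors: Iterable[int]) -> int:
    """Total number of operations, in units of d(F1, F2), under the chaining assumption."""
    remaining = 1  # d(F_1^{i-1}, F_2) in units of the initial distance
    total = 0
    for factor in step_factors:
        cost = factor * remaining  # ALG(i) <= factor * d(F_1^{i-1}, F_2)
        total += cost
        remaining += cost  # assumed: the distance grows by at most ALG(i)
    return total


# Per-step factors read off the slides: 4, 2, 2 and 4.
naive_factor = chained_bound([4, 2, 2, 4])  # 4 + 10 + 30 + 180 = 224
```

The point is only that the total stays a constant multiple of $d(F_1, F_2)$ because there is a fixed number of steps, each with a constant factor.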
Future work
Lower the constant factor of the approximation algorithm
Is there any approximation scheme for the rearrangement distance?
Thank you for your attention