arXiv:2305.00033v1 [cs.DS] 28 Apr 2023
Finding agreement cherry-reduced subnetworks
in level-1 networks
Kaari Landry*1, Olivier Tremblay-Savard1, and Manuel Lafond2
1University of Manitoba, Winnipeg MB, Canada * landryk1@cs.umanitoba.ca
2Université de Sherbrooke, Sherbrooke QC, Canada
Abstract. Phylogenetic networks are increasingly being considered as
better suited to represent the complexity of the evolutionary relation-
ships between species. One class of phylogenetic networks that has re-
ceived a lot of attention recently is the class of orchard networks, which
is composed of networks that can be reduced to a single leaf using cherry
reductions. Cherry reductions, also called cherry-picking operations, re-
move either a leaf of a simple cherry (sibling leaves sharing a parent) or a
reticulate edge of a reticulate cherry (two leaves whose parents are con-
nected by a reticulate edge). In this paper, we present a fixed-parameter
tractable algorithm to solve the problem of finding a maximum agree-
ment cherry-reduced subnetwork (MACRS) between two rooted binary
level-1 networks. This is the first exact algorithm proposed to solve the
MACRS problem. As proven in earlier work, there is a direct relationship
between finding an MACRS and calculating a distance based on cherry
operations. As a result, the proposed algorithm also provides a distance
that can be used for the comparison of level-1 networks.
Keywords: Cherry operations · Graphs and networks · Trees · Network
problems · Algorithm design and analysis · Biology and genetics ·
Phylogenetic Networks
1 Introduction
Phylogenetic trees have been used extensively throughout the years to represent
simple evolutionary relationships between species. Because of this, many tools
and techniques are readily available to efficiently build, compare and evaluate
trees. Phylogenetic networks on the other hand are much better suited to repre-
sent more complex relationships, such as the ones resulting from hybridization,
recombination and lateral gene transfer events [11]. In the last 15 years or so,
bioinformatics research has focused increasingly on solving problems related to
phylogenetic networks, such as network construction [24,23,25,29,26,1,22], mini-
mum hybridization number [2,10,27,15,8,9,3], tree/network containment [16,17,28],
and distance calculation between networks [5,21,19].
One crucial concept that has been shown to be a very useful tool in solving
several of the important phylogenetic network problems mentioned above is the
one of cherry-picking sequences [10,20]. A cherry-picking sequence is made up
of operations that can reduce a network by either removing one leaf of a simple
(tree-like) cherry (i.e. two leaf siblings descending from the same parent vertex),
or removing one reticulate edge of a reticulated cherry (two leaves whose parent
vertices are connected by a reticulate edge). The concept of cherry-picking has
been so valuable that it led to the definition of orchard networks, also known
as cherry-picking networks, which are simply phylogenetic networks that can be
reduced to a single leaf by cherry-picking operations [6,17]. Recent work has been
focusing on further characterizing and classifying different subtypes of orchard
networks [14,18,13].
Lately, we have used a generalized definition of cherry operations to describe
both cherry reductions (i.e. cherry picking) and cherry expansions (the reverse
of a reduction, which adds a simple or reticulate cherry) [19]. We have then
defined four novel distances between orchard networks that are based on cherry
operations, with three of them being different formulations of an equivalent
distance (construction, deconstruction and tail distances) and the fourth one
(mixed distance) being a lower bound for the other three. In the process of
describing these distances, the concept of a maximum agreement cherry-reduced
subnetwork (MACRS – note that we replace cherry-picking used in [19] by cherry-
reduced here for clarity) was defined to represent a network contained in both
networks being compared that maximizes the number of vertices. We showed
that finding an MACRS of two orchard networks was NP-hard, and this was
analogous to the problem of calculating the three equivalent distances.
In this work, we present an exact fixed-parameter tractable (FPT) algorithm
to compute an MACRS of two rooted binary level-1 networks that is exponential
in the sum of reticulations present in both networks. More precisely, our
algorithm runs in O(3^r n^3) time, where r is the sum of reticulations and n is
the maximum number of vertices of the input networks. Our approach essentially
consists of enumerating a certain set of subnetworks of the input networks in
which all possible combinations of reticulation edges have been removed. Then,
it uses a dynamic programming algorithm that decides whether an MACRS exists
(and returns it if so) between two level-1 networks in which the remaining
reticulations cannot be removed (we call this problem
MACRS-Simple). We prove that the initial MACRS problem can be solved by
solving the MACRS-Simple problem on all combinations of enumerated
subnetworks.
It is worth noting another important difference between the previous defining
work on MACRS and this article is the definition of networks. Specifically, we
allow leaves of the network to have multiple labels. In fact, we force all leaf labels
to be conserved as the network is trimmed by cherry reductions by subsuming
labels of a removed leaf onto its cherry sibling that remains. In this way, we keep
a “memory” of reductions, and this compressed representation of networks allows
us to restore all possible alternative (bijective) leaf labelings of the network from it.
Finally, we conclude the paper by discussing how the enumeration step could
be optimized by considering the relationships between the reticulations of both
input networks. We also briefly present a preliminary idea of how the proposed
algorithm could be extended to higher level binary networks. Even though the
proposed approach applies to orchard networks and not to general networks,
the orchard network class actually contains network types that are of interest
to the research community, such as the tree-child networks [4] and tree-sibling
time-consistent networks [6]. The tree-child networks in particular, in addition to
having been studied extensively in the literature, are biologically relevant, since
all ancestral species (internal vertices) have a path that can go to a leaf using
only tree vertices. This reflects the idea that ancestral species have descendants
that will persist through mutation and speciation events, and that hybridization
events are not as common as speciation events [18].
2 Preliminaries
We first introduce the notions regarding networks, then proceed to defining
cherry operations and our problem of interest.
2.1 Networks
A phylogenetic network N, or a network for short, is an acyclic directed graph
without vertices of in-degree and out-degree 1, and whose vertices and edges are
denoted V(N) and E(N), respectively. We assume that all networks are binary.
For v ∈ V(N), we use v^− and v^+ to denote the in-degree and out-degree of v,
respectively. The set V(N) contains
– the root ρ(N), which is the unique node satisfying ρ(N)^− = 0 and ρ(N)^+ = 2.
  In the case that |V(N)| = 2, ρ(N)^+ = 1;
– the leaves L(N), which satisfy l^− = 1 and l^+ = 0 for all l ∈ L(N);
– the internal vertices V(N) \ (L(N) ∪ {ρ(N)}), which contain:
  • the tree vertices T(N), which satisfy v^− = 1 and v^+ = 2 for all v ∈ T(N);
  • the reticulation vertices R(N), or simply reticulations, which satisfy
    v^− = 2 and v^+ = 1 for all v ∈ R(N).
We use X to denote the set of all taxa. For our purposes, the leaves of a network
N are labeled by one or more taxa. For l ∈ L(N), we will use X(l) to denote
the set of taxa that label l. We require that X(l) ≠ ∅, and that for any distinct
leaves l1, l2 ∈ L(N), X(l1) ∩ X(l2) = ∅.
The edges directed into a reticulation vertex are called reticulation edges,
denoted E_R(N). For v ∈ V(N), the out-neighbors of v are called its children.
If v has a single in-neighbor, we denote it by p(v) and call it the parent of v
(if v ∈ {ρ(N)} ∪ R(N), then p(v) is undefined). Vertices u and v are siblings
if p(u), p(v) are defined and p(u) = p(v). When there is a directed path from
vertex v to vertex u, we call v an ancestor of u and we call u a descendant of v.
The descendants of v are denoted reach(v, N) while its ancestors are denoted
reach^−(v, N) (note that v itself is in both sets). The union of the labels in
reach(v, N) ∩ L(N) is denoted X(v). We denote by R(v) the set of reticulations
in reach(v, N).
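As an illustration of these definitions, the vertex classes can be recovered from in-degrees and out-degrees alone. The following is a minimal sketch, assuming a network is given as a list of directed edges; the function names and the example network are hypothetical and not part of the formalism above.

```python
from collections import defaultdict


def classify_vertices(edges):
    """Partition the vertices of a binary network by (in-degree, out-degree)."""
    indeg, outdeg = defaultdict(int), defaultdict(int)
    vertices = set()
    for u, v in edges:
        outdeg[u] += 1
        indeg[v] += 1
        vertices.update((u, v))

    classes = {"root": set(), "leaves": set(), "tree": set(), "reticulations": set()}
    for v in vertices:
        degrees = (indeg[v], outdeg[v])
        if degrees in ((0, 2), (0, 1)):  # (0, 1) only occurs when |V(N)| = 2
            classes["root"].add(v)
        elif degrees == (1, 0):
            classes["leaves"].add(v)
        elif degrees == (1, 2):
            classes["tree"].add(v)
        elif degrees == (2, 1):
            classes["reticulations"].add(v)
        else:
            raise ValueError(f"vertex {v!r} has degrees {degrees}; not binary")
    return classes


def reticulation_edges(edges):
    """Edges directed into a reticulation vertex, i.e. E_R(N)."""
    rets = classify_vertices(edges)["reticulations"]
    return [(u, v) for (u, v) in edges if v in rets]


# Hypothetical example: one reticulation r with parents u and v.
EXAMPLE_EDGES = [
    ("rho", "u"), ("rho", "v"),
    ("u", "a"), ("u", "r"),
    ("v", "r"), ("v", "b"),
    ("r", "c"),
]
```

In this example, rho is the root, u and v are tree vertices, r is the only reticulation, and (u, r) and (v, r) are the reticulation edges.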
Two networks N1, N2 are weakly isomorphic if there exists a bijection σ:
V(N1) → V(N2) such that (u, v) ∈ E(N1) if and only if (σ(u), σ(v)) ∈ E(N2),
and such that for each l ∈ L(N1), X(l) ∩ X(σ(l)) ≠ ∅. For this we use the
notation N1 ≃ N2. If, for each l ∈ L(N1), X(l) = X(σ(l)), then we say N1 and
N2 are strongly isomorphic, which we denote by N1 = N2.
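The difference between the two notions can be made concrete with a small check of a candidate map σ. This is only a sketch under assumed conventions: networks are edge sets, leaf labels are dictionaries from leaves to sets of taxa, σ is a dictionary defined on every vertex of the first network, and the function name is hypothetical. It verifies a given σ and does not search for one.

```python
def isomorphism_kind(edges1, labels1, edges2, labels2, sigma):
    """Return "strong", "weak" or None for a candidate vertex map sigma."""
    injective = len(set(sigma.values())) == len(sigma)
    mapped_edges = {(sigma[u], sigma[v]) for u, v in edges1}
    if not injective or mapped_edges != set(edges2):
        return None  # sigma does not preserve the edge relation
    # Compare the label set of each leaf with that of its image.
    pairs = [(labels1[leaf], labels2[sigma[leaf]]) for leaf in labels1]
    if all(a == b for a, b in pairs):
        return "strong"
    if all(a & b for a, b in pairs):
        return "weak"
    return None


# Hypothetical example: two cherries whose leaves carry label sets.
E1 = {("r", "x"), ("r", "y")}
L1 = {"x": {"a", "b"}, "y": {"c"}}
E2 = {("s", "m"), ("s", "n")}
L2 = {"m": {"b"}, "n": {"c"}}
SIGMA = {"r": "s", "x": "m", "y": "n"}
```

Here σ is a weak isomorphism because {a, b} and {b} intersect without being equal.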
A network N may have only one edge, whose endpoints are ρ(N) and a leaf.
Then N is a single-leaf network or singleton. We say ρ(N) roots N. If, for a
vertex v and for all vertices v′ ∈ reach(v, N), every path from ρ(N) to v′
goes through v, then we say v roots the subnetwork below it.
While a network N is directed, there is an undirected version of N on the
same vertex set and with an undirected edge {u, v} present for every (u, v) ∈
E(N), which we call the underlying graph. It is on this underlying graph that we
identify the set of biconnected components of N. Such a component is a maximal
subgraph B that cannot be disconnected by the removal of an edge therein. Note
that every individual leaf and some tree vertices alone constitute a biconnected
component; we refer to such single-vertex components as trivial, and all others
as non-trivial. For the set of biconnected components B1, ..., Bb of a network N, a
bridge is an edge (u, v) such that u ∈ Bi and v ∈ Bj for some 1 ≤ i ≠ j ≤ b.
The level of a network is the maximum number of reticulations across all
biconnected components of a network. A level-k network has no biconnected
component with more than k reticulations. A level-1 network has every
biconnected component with either 0 or 1 reticulations. Note that this does not limit
the number of reticulations over the whole network, just in each biconnected
component.
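The level can be computed directly from this definition. The sketch below takes the definition literally (components that cannot be disconnected by removing an edge), so it finds the bridges of the underlying graph with a depth-first search, deletes them, and counts reticulations in each remaining component. The edge-list representation and the function name are assumptions made for illustration.

```python
def network_level(edges):
    """Maximum number of reticulations in one bridge-free component."""
    adj, indeg = {}, {}
    for idx, (u, v) in enumerate(edges):
        adj.setdefault(u, []).append((v, idx))
        adj.setdefault(v, []).append((u, idx))
        indeg[v] = indeg.get(v, 0) + 1

    disc, low, bridges = {}, {}, set()
    counter = [0]

    def dfs(x, parent_edge):
        disc[x] = low[x] = counter[0]
        counter[0] += 1
        for y, idx in adj[x]:
            if idx == parent_edge:
                continue
            if y in disc:
                low[x] = min(low[x], disc[y])
            else:
                dfs(y, idx)
                low[x] = min(low[x], low[y])
                if low[y] > disc[x]:
                    bridges.add(idx)  # no back edge bypasses edge idx

    for x in adj:
        if x not in disc:
            dfs(x, None)

    seen, level = set(), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, reticulations = [start], 0
        while stack:
            x = stack.pop()
            if indeg.get(x, 0) == 2:
                reticulations += 1
            for y, idx in adj[x]:
                if idx not in bridges and y not in seen:
                    seen.add(y)
                    stack.append(y)
        level = max(level, reticulations)
    return level


# Hypothetical examples.
TREE = [("rho", "a"), ("rho", "b")]
LEVEL_1 = [("rho", "u"), ("rho", "v"), ("u", "a"), ("u", "r"),
           ("v", "r"), ("v", "b"), ("r", "c")]
LEVEL_2 = [("rho", "a1"), ("rho", "a2"), ("a1", "r1"), ("a1", "b1"),
           ("a2", "r1"), ("a2", "r2"), ("b1", "r2"), ("b1", "x"),
           ("r1", "y"), ("r2", "z")]
```

The recursive search is adequate for small examples; a large network would need an iterative traversal.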
2.2 Cherries and cherry reductions
A cherry is a pair of leaves that are siblings or that have a reticulation joining
their parents. More specifically, a pair (x, y) ∈ L(N) × L(N) is called a cherry if
either p(x) = p(y), in which case (x, y) is called a simple cherry, or p(x) ∈ R(N)
and (p(y), p(x)) ∈ E(N), in which case (x, y) is called a reticulated cherry.
Let N be a network and let (x, y) be a pair of vertices. Then applying the
cherry reduction (x, y) on N creates a new network as follows:
– If (x, y) is a simple cherry of N, then the (x, y) reduction consists of removing
  the leaf x and the edge (p(x), x), suppressing the resulting node of in-degree and
  out-degree 1 if any, and re-assigning X(y) = X(y) ∪ X(x). Note that the operation
  we introduce here differs from the cherry reduction operation described in
  previous work, where both the leaf x and the set X(x) are deleted. The
  purpose of our new definition is to preserve a reference to which label could
  have been assigned to y. This is to say that the labels on a given leaf are
  interchangeable [19, Lemma 3].
– If (x, y) is a reticulated cherry of N, then we remove the reticulation edge
  (p(y), p(x)) and the resulting vertices of in-degree and out-degree 1 are suppressed.
  In this case, we say that the reticulation edge (p(y), p(x)) is removed by the
  cherry reduction (x, y).
– If (x, y) is not a cherry of N, then N is unchanged.
The resulting graph is a network, and always has a cherry unless it is a
singleton (true of all orchard networks by definition [6,17]).
Cherry reductions often occur in batches, and a sequence S of pairs of leaves
is called a cherry sequence (CS). The number of elements in S is denoted |S|.
The cherry at position i of a CS S is referred to by S_i. We use N⟨S⟩ to denote
the network obtained from N by first applying cherry reduction S_1 on N, then
S_2 on the resulting network, and so on until S_{|S|} is applied. Note that we
allow S to contain pairs that do not modify the network (e.g. non-cherries). The
subsequence from (including) the first cherry to (excluding) the ith cherry in S
is S_(0:i). When a CS S reduces a network N to a singleton, then we say S is
complete for N. We assume networks are orchard networks hereafter.
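The two reductions and the application of a cherry sequence can be sketched as follows. This is an illustrative implementation only, assuming a network is a set of directed edges without parallel edges and that leaf labels are stored in a dictionary from leaves to sets of taxa; all function names are hypothetical.

```python
def parents(edges, x):
    return [u for (u, v) in edges if v == x]


def children(edges, x):
    return [v for (u, v) in edges if u == x]


def suppress(edges, x):
    """Bypass x in place if it has in-degree 1 and out-degree 1."""
    ps, cs = parents(edges, x), children(edges, x)
    if len(ps) == 1 and len(cs) == 1:
        edges.remove((ps[0], x))
        edges.remove((x, cs[0]))
        edges.add((ps[0], cs[0]))


def reduce_cherry(edges, labels, x, y):
    """Apply the cherry reduction (x, y) and return new (edges, labels).

    Pairs that are not cherries leave the network unchanged."""
    edges = set(edges)
    labels = {leaf: set(taxa) for leaf, taxa in labels.items()}
    if x == y or x not in labels or y not in labels:
        return edges, labels  # not a pair of distinct leaves
    px, py = parents(edges, x)[0], parents(edges, y)[0]
    if px == py:
        # Simple cherry: remove x and keep its labels on y.
        edges.remove((px, x))
        labels[y] |= labels.pop(x)
        suppress(edges, px)
    elif len(parents(edges, px)) == 2 and (py, px) in edges:
        # Reticulated cherry: remove the reticulation edge (p(y), p(x)).
        edges.remove((py, px))
        suppress(edges, px)
        suppress(edges, py)
    return edges, labels


def apply_sequence(edges, labels, sequence):
    """Compute N<S> by applying the pairs of S from left to right."""
    for x, y in sequence:
        edges, labels = reduce_cherry(edges, labels, x, y)
    return edges, labels


# Hypothetical example with one reticulation r above leaf c.
EDGES = {("rho", "u"), ("rho", "v"), ("u", "a"), ("u", "r"),
         ("v", "r"), ("v", "b"), ("r", "c")}
LABELS = {"a": {"a"}, "b": {"b"}, "c": {"c"}}
COMPLETE_CS = [("c", "a"), ("c", "b"), ("a", "b")]
```

In this example, (c, a) is a reticulated cherry, after which (c, b) and then (a, b) are simple cherries, so the sequence is complete and the surviving leaf b carries all three taxa.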
See Figure 1 for an illustration of the two cherry reduction operations, and
the concepts of isomorphism.
[Figure 1: drawings of the networks N1, N2, N3, N4 and N5. Leaf label sets recoverable from the drawing include e = {d, e} in N2 and N3 and e = {c, d, e, f} in N4.]
Fig. 1. In this figure, leaves are represented by open circles, tree vertices as filled circles,
reticulations as filled squares, and the root of the network as a filled, inverted
triangle. Network N1 is a level-1 network with |R(N1)| = 2. N1 is a reticulation-trimmed
subnetwork of N1 with respect to F = ∅. Network N2 = N1⟨(d, e)⟩, where (d, e)
is a simple cherry/reduction. Network N3 = N2⟨(e, f)⟩, where (e, f) is a reticulated
cherry/reduction. N3 is a reticulation-trimmed subnetwork of N1 and of N2 with
respect to F = {(x, r2)}. Network N4 = N3⟨(c, e) · (f, e) · (e, b)⟩ and is a
reticulation-trimmed subnetwork of N1 and of N2 with respect to F = {(x, r2), (v, r1)} or to
F = {(w, r2), (v, r1)}. Network N5 ≃ N4; in fact, there are CSs that may lead to
leaf e being any of leaves c, d, e, or f. Each of these networks would have the same
label set on that leaf, and all are weakly isomorphic with N5.
Cherries on a network can be reduced in any order. We restate a theorem
of [16] that we adapt to our formalism³.
Theorem 1. Let N be a network, let (x, y) be a cherry of N, and let S be a CS
that contains (x, y). Then there exists a CS S′ such that N⟨S⟩ = N⟨S′⟩, and
whose first element is (x, y).
2.3 Maximum agreement cherry-reduced subnetworks
For networks N and N′, when there exists a CS S such that N⟨S⟩ ≃ N′, we
say that N′ is a cherry-reduced subnetwork (CRS) of N, denoted by N′ ⊆cr N.
We can now define the main problem of focus.
The Maximum Agreement Cherry-Reduced Subnetwork (MACRS) problem.
Input: Two orchard networks N1 and N2.
Find: A network N∗ with the maximum number of vertices that satisfies N∗ ⊆cr
N1 and N∗ ⊆cr N2.
A solution N∗ to the above problem will be called an MACRS of N1 and
N2.
3 An MACRS algorithm on level-1 networks
We show that the MACRS problem can be solved in time O(3^r n^3) for n =
max(|V(N1)|, |V(N2)|) and r = |R(N1)| + |R(N2)| on level-1 networks. We
employ a two-step strategy. First, we enumerate a number of inputs that have been
specially reduced to a selected set of remaining reticulations. Second, these
inputs are provided to a cubic-time dynamic programming algorithm for an easier
version of MACRS that uses only simple reductions. Because the number of
special inputs is bounded by 3^r, we get an FPT algorithm. MACRS is thus split
into two subproblems. We first introduce them and show how they can be used
to solve MACRS. The later sections then focus on each problem separately.
Let N be a network and let F ⊆ E_R(N) be a subset of reticulation edges.
We wish to generate all the maximal cherry-reduced subnetworks of N under
the restriction that the reticulation edges removed by cherry operations coincide
with F. Thus, we say that a network N′ is a reticulation-trimmed subnetwork of
N with respect to F if there exists a CS S such that N⟨S⟩ = N′, such that
(u, v) ∈ F if and only if S contains a reticulated cherry reduction that removes
(u, v), and such that S is of minimum length, i.e. we require that there is no other CS S′
with |S′| < |S| that satisfies the same properties.
Furthermore, we say that N′ is a reticulation-trimmed subnetwork of N if
there exists a set F ⊆ E_R(N) such that N′ is a reticulation-trimmed subnetwork
of N with respect to F.
³ Note that the authors prove the statement under the assumption that S is complete,
and that leaves are single-labeled. However, the proof is easy to adapt to our context.
The Reticulation-Trimmed Enumeration problem:
Input: An orchard network N.
Find: the set of all reticulation-trimmed subnetworks of N.
Note that the size of the set of reticulation-trimmed subnetworks depends
heavily on the network structure. For instance, it is possible to show that it is
linear when all reticulations are arranged in a path, and exponential when all
reticulations are independent (none is an ancestor of another). It is possible to
calculate the size of this set exactly by algorithmic means through an abstraction
of the network structure. However, we reserve the analysis of the impact of this
parameter on our algorithm for future work.
Once the set of reticulation edges to remove has been guessed, it remains
to infer the set of non-reticulated cherry operations. A simple CS is a CS that
contains only simple cherries. In this way, R(N) = R(N⟨S⟩) for any simple CS S.
For networks N and N′, when there exists a simple CS S such that N⟨S⟩ ≃ N′,
we say that N′ is a CRS-SIMPLE of N. Note that owing to our definition of
weak isomorphism, N⟨S⟩ ≃ N′ does not mean that S transforms N into N′. A
better intuition is that, after applying S on N, we could choose
one label in the label set of each leaf of N⟨S⟩ and of N′ such that the resulting
networks would be isomorphic in the traditional sense.
The Simple Maximum Agreement Cherry-Reduced Subnetwork (MACRS-Simple)
problem.
Input: Two orchard networks N1and N2.
Find: a network N∗with a maximum number of vertices such that N∗is a
CRS-SIMPLE of N1and a CRS-SIMPLE of N2.
A solution N∗to the above problem will be called a MACRS-Simple of N1
and N2.
For the standard MACRS problem on networks N1 and N2, there is always
a solution as long as X(N1) ∩ X(N2) ≠ ∅. However, since reticulations cannot
be removed by a simple CS, the MACRS-Simple problem may not have a
solution (for instance when the two networks have different numbers of
reticulation vertices). We can now describe our main algorithm, where we assume that
the MACRS-Simple routine correctly returns an optimal solution to the above
problem.
Algorithm 1 MACRS Finder
Input Two networks N1 and N2
Output A MACRS of N1 and N2
Ñ ← empty network
for each reticulation-trimmed subnetwork N′₁ of N1 do
    for each reticulation-trimmed subnetwork N′₂ of N2 do
        Let N′ be a MACRS-Simple of N′₁ and N′₂
        if N′ exists and |V(N′)| > |V(Ñ)| then Ñ ← N′
    end for
end for
return Ñ
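The control flow of Algorithm 1 can be sketched in Python with the two subroutines passed in as parameters. The names `rt_subnetworks`, `macrs_simple` and `num_vertices` are placeholders introduced for this sketch, and the stub values at the end are toy stand-ins for networks, chosen only to exercise the loop.

```python
def macrs_finder(n1, n2, rt_subnetworks, macrs_simple, num_vertices):
    """Outer loop of Algorithm 1.

    rt_subnetworks(n) yields the reticulation-trimmed subnetworks of n,
    macrs_simple(a, b) returns a MACRS-Simple of a and b or None,
    num_vertices(n) returns |V(n)|. Returns None for the empty network."""
    best = None
    trimmed2 = list(rt_subnetworks(n2))  # enumerate once, reuse per t1
    for t1 in rt_subnetworks(n1):
        for t2 in trimmed2:
            candidate = macrs_simple(t1, t2)
            if candidate is None:
                continue
            if best is None or num_vertices(candidate) > num_vertices(best):
                best = candidate
    return best


# Toy stand-ins: a "network" is an integer, its trimmed subnetworks are
# the integers 1..n, and two of them agree only when they are equal.
def toy_subnetworks(n):
    return range(1, n + 1)


def toy_macrs_simple(a, b):
    return a if a == b else None
```

With these stubs, `macrs_finder(3, 5, toy_subnetworks, toy_macrs_simple, lambda n: n)` returns 3, the largest value on which both inputs agree.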
An optimization technique is evident here: as we mentioned, there is only
a solution to MACRS-Simple(N1, N2) when |R(N1)| = |R(N2)|, since only
simple reductions will be performed. Thus, we need only test such pairs. This
optimization is not currently formalized in the algorithm and complexity
analysis presented here, and is left for future work.
In the remainder of this section, we focus on proving that this algorithm works
correctly. We will deal with the complexity of the algorithm once we have dealt
with the Reticulation-Trimmed Enumeration and MACRS-Simple
subproblems. We begin by showing that one can always obtain a subnetwork by
first going through a reticulation-trimmed subnetwork, and then using only
simple cherry reductions.
Lemma 1. Let N be a network. Then for any N′ ⊆cr N, there exists a
reticulation-trimmed subnetwork N′′ of N and a simple CS S such that N′′⟨S⟩ = N′.
For proof of Lemma 1, see Appendix.
Theorem 2. Algorithm 1 correctly finds a MACRS of N1 and N2.
Proof. Let N∗ be a MACRS of N1 and N2. Let Ñ be the network returned
by Algorithm 1. We first claim that if Ñ is non-empty, it does satisfy
Ñ ⊆cr N1, N2. To see this, note that every pair N′₁, N′₂ of networks enumerated by
Algorithm 1 satisfies N′₁ ⊆cr N1 and N′₂ ⊆cr N2, by the definition of
reticulation-trimmed subnetworks. Moreover, if a MACRS-Simple N′ of N′₁, N′₂ exists, then
by transitivity, N′ ⊆cr N′₁ ⊆cr N1 and N′ ⊆cr N′₂ ⊆cr N2. Since Ñ is one of
those N′, this proves our claim.
Let us now focus on the optimality of Ñ. First note that |V(N∗)| ≥ |V(Ñ)|:
if Ñ is an empty network, this is obvious, and otherwise, by our above claim, Ñ
is a cherry-reduced subnetwork of N1 and N2 and thus cannot be larger than
N∗.
Let us now show that |V(N∗)| ≤ |V(Ñ)|. By Lemma 1, there exists a
reticulation-trimmed subnetwork N′₁ of N1 (resp. N′₂ of N2) such that N∗ can
be obtained from N′₁ (resp. N′₂) using only simple CSs. Thus, N∗ is a
CRS-SIMPLE of N′₁ and N′₂. Algorithm 1 will eventually enumerate N′₁ and N′₂ and
find a MACRS-Simple N′ of them, which is of maximum size and thus has at
least as many vertices as N∗. Since the returned Ñ is the N′ of maximum size
found by the algorithm, it follows that |V(N∗)| ≤ |V(Ñ)|.
4 Subroutines
4.1 Enumerating the set of reticulation-trimmed subnetworks
We now show how to enumerate the set of all reticulation-trimmed subnetworks
of a network N in time O(3^{|R(N)|} |V(N)|). The reticulation-trimmed subnetworks
are characterized by having no more reductions than what is needed to remove
the desired reticulation edges. Luckily, we will see that at most one such network
can exist; we must only remove the complete subnetwork under both endpoints
of the reduced reticulation edge. This is guaranteed to be possible by cherry
reductions, assuming all reticulations below these endpoints have also been specified
for removal. Algorithm 2 shows how to enumerate the relevant edges, and uses
Algorithm 3 as a subroutine, which finds the reticulation-trimmed subnetwork
with respect to a given edge set. We show that the reticulation-trimmed
subnetwork of N with respect to F ⊆ E_R(N) is uniquely defined in Lemma 4. We say
that a set of edges F is disjoint if, for any two distinct edges (u, v), (x, y) ∈ F,
{u, v} ∩ {x, y} = ∅.
Algorithm 2 REDUCED-SET-FINDER
Input A network N
Output The set of all reticulation-trimmed subnetworks of N
Result ← ∅
for each F ⊆ E_R(N) that does not contain both (a, b) and (c, b) for any a ≠ c do
    Result ← Result ∪ RT-SUBNET-MAKER(N, F), skipping NULL results
end for
return Result
Lemma 2. Let N be a network, F ⊆ E_R(N) be a set, and N′ be a network that
is a reticulation-trimmed subnetwork of N with respect to F. Then F is disjoint.
For proof of Lemma 2, see Appendix. Next, for F ⊆ E_R(N), a topological
sort of F is an ordering of its elements such that for distinct edges e1, e2 ∈ F, if
there is a path from a vertex of e1 to a vertex of e2 in N, then e1 comes later
than e2 in this ordering.
Lemma 3. Let N be a network and F ⊆ E_R(N) be a set such that there exists
a reticulation-trimmed subnetwork of N with respect to F. Then there exists a
topological sort of F.
For proof of Lemma 3, see Appendix. The next lemma is crucial, as it shows
that reticulation-trimmed subnetworks with respect to a given F are either
unique, or do not exist. This allows us to enumerate in reasonable time.
Lemma 4. Let N be a network and let F ⊆ E_R(N). Then there do not exist
two non-strongly isomorphic reticulation-trimmed subnetworks of N with respect
to F.
For proof of Lemma 4, see Appendix. As an example of a set F ⊆ E_R(N) that
does not admit a reticulation-trimmed subnetwork of N, consider a network N
with 2 reticulations r1, r2 such that r1 ∈ reach^−(r2). Choosing
F = {(p1, r1)}, for p1 chosen arbitrarily between r1’s parents, will not admit a
reticulation-trimmed subnetwork: reticulation r1 must have leaves below the
endpoints of its reticulation edge to be in a cherry, but this choice of F has no
corresponding reticulated reduction of r2, making it impossible to construct a
CS S that reduces only r1.
We next describe Algorithm 3, which produces the reticulation-trimmed
network with respect to some given F; see Figure 2 for an illustration.
[Figure 2: three panels labeled (1), (2) and (3), showing the vertices p(u), u, v, p(v) and the constructed leaves u′ and v′.]
Fig. 2. In this figure, leaves are represented by open circles, tree vertices as filled circles,
and reticulations as filled squares. A subnetwork without reticulations is represented by a
large open triangle; a subnetwork that may be reticulated is represented by a large
open blob. This figure shows an example of the operation of Algorithm 3; note that
R(u) ∪ R(v) \ {v} = ∅ in this example. The subnetwork under label (1) is an example
network at line 7: the dotted line represents the reticulation edge (u, v) removed by
line 5, and both leaves u′ and v′ have been constructed (leaf labels are not shown). The
network under label (2) shows the state of network (1) at line 8, when edges (p(u), u′)
and (p(v), v′) have been added. The network under label (3) shows the state of the
network under (1) at line 9, when the vertices in reach(u, N′) ∪ reach(v, N′) are removed.
Algorithm 3 RT-SUBNET-MAKER
Input A network N and a disjoint set F ⊆ E_R(N)
Output The reticulation-trimmed subnetwork of N with respect to F, or
NULL if it does not exist
1: N′ ← N
2: Find a topological sort F′ of F
3: for each (u, v) ∈ F′ in order do
4:     if R(u) ∪ R(v) \ {v} = ∅ then
5:         delete edge (u, v)
6:         construct leaf u′ such that X(u′) = X(u)
7:         construct leaf v′ such that X(v′) = X(v)
8:         add edges (p(u), u′) and (p(v), v′) to N′
9:         remove all vertices in reach(u, N′) ∪ reach(v, N′)
10:    else
11:        return NULL
12:    end if
13: end for
14: return N′
Lemma 5. Algorithm 3 on (N, F) returns the reticulation-trimmed subnetwork
N′ of N with respect to F if it exists, and NULL if not, and runs in time
O(|V(N)|).
For proof of Lemma 5, see Appendix.
Theorem 3. Algorithm 2 correctly enumerates all reticulation-trimmed
subnetworks of a network N, and runs in time O(3^{|R(N)|} |V(N)|).
Proof. It is already proved (Lemma 2) that a non-disjoint F does not admit a
reticulation-trimmed subnetwork, so it is correct to filter those. The remaining
correctness follows from the exhaustive nature of the construction of all F and
from the correctness of Algorithm 3.
As for the time complexity, filtering non-disjoint F implies a threefold choice
on each reticulation (we either include one, or none of its incoming edges, but
not both by disjointness). Thus the size of the set is O(3^{|R(N)|}). Recalling that
Algorithm 3 can be implemented in time O(|V(N)|), the total runtime for
Algorithm 2 is in O(3^{|R(N)|} |V(N)|).
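The threefold choice per reticulation translates directly into a Cartesian product. The sketch below enumerates the candidate sets F under the filter used in Algorithm 2 (at most one incoming edge per reticulation); it assumes the two parents of each reticulation are supplied in a dictionary, and the function name is hypothetical. Deciding which candidates actually admit a reticulation-trimmed subnetwork is left to Algorithm 3 and is not attempted here.

```python
from itertools import product


def candidate_edge_sets(reticulation_parents):
    """Yield every F containing at most one incoming edge per reticulation.

    reticulation_parents maps each reticulation r to its two parents (p, q)."""
    options = []
    for r in sorted(reticulation_parents):
        p, q = reticulation_parents[r]
        # Three choices for r: keep both edges, remove (p, r), or remove (q, r).
        options.append((None, (p, r), (q, r)))
    for choice in product(*options):
        yield frozenset(edge for edge in choice if edge is not None)


# Hypothetical network with two reticulations.
PARENTS = {"r1": ("u", "v"), "r2": ("w", "x")}
ALL_SETS = list(candidate_edge_sets(PARENTS))
```

For two reticulations this yields 3^2 = 9 distinct sets, including the empty set, which matches the bound used in the proof.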
4.2 An algorithm for MACRS-Simple
A dynamic programming algorithm that solves the MACRS-Simple problem
in cubic time is given and proved in this section.
Assume we have networks N1 and N2 as input to the MACRS-Simple
problem. We assume that we have computed the set of biconnected components of
N1 and N2 in a preprocessing step, along with the bridge edges. This can be
done in time O(|V(N)|), see [7]. Since the networks considered are level-1, each
biconnected component B contains exactly one vertex u that has no in-neighbor
in B, and exactly one vertex r that has no out-neighbor in B. If B is trivial, then
u = r, and otherwise r is a reticulation vertex and there are two edge-disjoint
paths from u to r in B [12]. We refer to these two paths as component paths. The
vertex u will be called the root of B and denoted ρ(B), and r will be called
the bottom of B. We let B1 be the set of biconnected components of N1 and B2
be the set of biconnected components of N2. Finally, for i ∈ {1, 2}, we denote
ρ(Bi) = {ρ(B) : B ∈ Bi}, i.e. the set of roots in Bi.
Using dynamic programming, we construct a table M whose rows are the
roots in ρ(B1) and whose columns are the roots in ρ(B2). For u ∈ ρ(B1), v ∈
ρ(B2), we define N_u as the subnetwork of N1 rooted at u, and N_v as the
subnetwork of N2 rooted at v. We then define M[u, v] as the number of leaves in a
MACRS-Simple of N_u and N_v. If u is a tree vertex, its children are denoted
u1 and u2.
In N1, we denote the two component paths on the same non-trivial
biconnected component by π^1_l = p^1_{l,1} ... and π^1_r = p^1_{r,1} ..., and in N2 these paths will
be denoted π^2_l = p^2_{l,1} ... and π^2_r = p^2_{r,1} .... For a vertex p_i on path π = p_1 ..., let
h(p_i) be the child vertex of p_i such that h(p_i) ≠ p_{i+1}. In other words, the edge
(p_i, h(p_i)) is a bridge pendant to π leading to a different biconnected component,
where h(p_i) roots a distinct subnetwork. See Figure 3 for an illustration of
the component paths and the described labelings for an example N1 network.
We use Algorithm 4 to compute M[u, v] for each u ∈ ρ(B1), v ∈ ρ(B2) in
postorder. We seek the result M[ρ(N1), ρ(N2)] + |R(N1)|, as M records only the
number of leaves in a MACRS-Simple of N1 and N2. From this information
we can calculate more about the general size of the network: because the networks are
binary, |V(N)| = 2|L(N)| + 2|R(N)| − 1. Luckily, the number of reticulations
in the solution is known ahead of time since it must have the same number of
reticulations as each of the inputs. Note that we can also reconstruct the network
that corresponds to the optimal size of the MACRS-Simple of N1 and N2 by
performing a traceback in the dynamic programming table.
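The vertex count used above can be checked by counting edges twice, once by in-degree and once by out-degree. The derivation below is a short verification under the degree constraints of Section 2.1, assuming the root has out-degree 2 (that is, |V(N)| > 2).

```latex
\begin{align*}
|E(N)| &= |L(N)| + |T(N)| + 2|R(N)| && \text{(sum of in-degrees)}\\
|E(N)| &= 2 + 2|T(N)| + |R(N)|      && \text{(sum of out-degrees)}\\
\Rightarrow\ |T(N)| &= |L(N)| + |R(N)| - 2\\
|V(N)| &= 1 + |L(N)| + |T(N)| + |R(N)| = 2|L(N)| + 2|R(N)| - 1
\end{align*}
```

For instance, a network with 3 leaves and 1 reticulation has 2·3 + 2·1 − 1 = 7 vertices.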
Theorem 4. Algorithm 4 runs in time O(|V(N1)| |V(N2)| (|V(N1)| + |V(N2)|)).
Proof. The algorithm fills a table M of maximum size |V(N1)| |V(N2)|;
thus, if we can show that each table entry is calculated in at most linear
(|V(N1)| + |V(N2)|) time, then the algorithm is cubic as claimed.
The preprocessing step to determine and label biconnected components is
linear, as it requires a modified depth-first search [7]. Then, the calculations
performed for lines 1 through 9 consist of finding and checking the labelled
components (linear), and checking up to 12 set intersections (linear) of a vertex’s
descendant leaves (linear to find). Lines 10 and on perform a linear number of
table lookups/calls. The paths themselves are also linear to find, as they are
simply the paths that leave each child of the rooting vertex of the biconnected
component and end on the next reticulation, the length of which can also be
calculated in a single pass. Thus the claim holds.
Algorithm 4
Input: Two multi-networks N1, N2, vertices u ∈ ρ(B1), v ∈ ρ(B2)
Output: M[u, v]
1: if both u and v are trivial components then
2:     if u or v is a leaf then
3:         M[u, v] = 1 if X(u) ∩ X(v) ≠ ∅ and R(u) ∪ R(v) = ∅
                     −∞ otherwise
4:     else
5:         for each i ∈ {1, 2}, j ∈ {1, 2}, define X_ij = X(u_i) ∩ X(v_j)
6:         M[u, v] = max(M1, M2) where
               M1 = 1 if X_11 ≠ ∅ and X_22 = ∅ and R(u) ∪ R(v) = ∅
                    1 if X_11 = ∅ and X_22 ≠ ∅ and R(u) ∪ R(v) = ∅
                    M[u1, v1] + M[u2, v2] if X_11 ≠ ∅ and X_22 ≠ ∅
                    −∞ otherwise
               M2 = 1 if X_12 ≠ ∅ and X_21 = ∅ and R(u) ∪ R(v) = ∅
                    1 if X_12 = ∅ and X_21 ≠ ∅ and R(u) ∪ R(v) = ∅
                    M[u1, v2] + M[u2, v1] if X_12 ≠ ∅ and X_21 ≠ ∅
                    −∞ otherwise
7:     end if
8: else if u is a trivial biconnected component and v is in a non-trivial
   biconnected component (or vice versa) then
9:     M[u, v] = −∞
10: else u and v are in non-trivial components with reticulations r1, r2
    respectively and component paths π^1_l, π^1_r, π^2_l, π^2_r
11:    M1 = −∞
12:    M2 = −∞
13:    if |π^1_l| = |π^2_l| and |π^1_r| = |π^2_r| then
14:        M1 = M[r1, r2] + Σ_{i=1}^{|π^1_l|} M[h(p^1_{l,i}), h(p^2_{l,i})]
                          + Σ_{i=1}^{|π^1_r|} M[h(p^1_{r,i}), h(p^2_{r,i})]
15:    end if
16:    if |π^1_l| = |π^2_r| and |π^1_r| = |π^2_l| then
17:        M2 = M[r1, r2] + Σ_{i=1}^{|π^1_l|} M[h(p^1_{l,i}), h(p^2_{r,i})]
                          + Σ_{i=1}^{|π^1_r|} M[h(p^1_{r,i}), h(p^2_{l,i})]
18:    end if
19:    M[u, v] = max(M1, M2)
20: end if
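To make the trivial-component case concrete, the following sketch implements lines 2 to 6 of Algorithm 4 for inputs without reticulations, so that the condition R(u) ∪ R(v) = ∅ always holds. It assumes a tree is written as nested pairs whose leaves are frozensets of taxa; this encoding and the function names are illustrative choices, and the reticulated case (lines 10 to 19) is not covered.

```python
import math

NEG_INF = -math.inf


def taxa(t):
    """X(t): the union of the taxa labelling the leaves below t."""
    if isinstance(t, frozenset):
        return t
    return taxa(t[0]) | taxa(t[1])


def m_tree(u, v):
    """Number of leaves in a MACRS-Simple of trees u and v, or -inf."""
    if isinstance(u, frozenset) or isinstance(v, frozenset):
        # Line 3: one side is a leaf, so the other collapses onto it.
        return 1 if taxa(u) & taxa(v) else NEG_INF
    (u1, u2), (v1, v2) = u, v

    def pairing(a1, b1, a2, b2):
        # Value of M1 (or M2) when a1 is matched with b1 and a2 with b2.
        x1, x2 = taxa(a1) & taxa(b1), taxa(a2) & taxa(b2)
        if x1 and x2:
            return m_tree(a1, b1) + m_tree(a2, b2)
        if x1 or x2:
            return 1
        return NEG_INF

    return max(pairing(u1, v1, u2, v2), pairing(u1, v2, u2, v1))


def leaf(*names):
    return frozenset(names)


# Hypothetical inputs: ((a, b), c) against itself and against ((a, c), b).
T1 = ((leaf("a"), leaf("b")), leaf("c"))
T2 = ((leaf("a"), leaf("c")), leaf("b"))
```

Here `m_tree(T1, T1)` evaluates to 3 and `m_tree(T1, T2)` to 2. The sketch recomputes X(·) on every call for brevity; the linear-time bound per entry claimed in Theorem 4 would require storing these sets and the table M.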
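[Figure 3: a non-trivial biconnected component with root ρ(B_i) and reticulation r1, path vertices p^1_{l,1}, p^1_{r,1}, p^1_{r,2}, their pendant subnetworks h(p^1_{l,1}), h(p^1_{r,1}), h(p^1_{r,2}), and a tree vertex v.]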
Fig. 3. In this figure, tree vertices are shown as filled circles and reticulations as filled squares.
A subnetwork is represented by a large open blob. Vertices in red are in the same
non-trivial biconnected component. Yellow edges are path π^1_l and green edges are path π^1_r.
Tree vertex v is itself a trivial biconnected component such that R(v) ≠ ∅.
Theorem 5. The entry M[ρ(N_1), ρ(N_2)] correctly contains |L(N*)| for N* = MACRS-Simple(N_1, N_2) if one exists, and −∞ otherwise.
See the Appendix for the proof of Theorem 5.
4.3 Complexity of Algorithm 1
Theorem 6. Let N_1, N_2 be two networks, let n = max(|V(N_1)|, |V(N_2)|), and r = |R(N_1)| + |R(N_2)|. Then the MACRS problem can be solved in time O(3^r n^3).
Proof. By Theorem 3, Algorithm 2 can enumerate all reticulation-trimmed subnetworks of N_1 and N_2 in total time O(3^{|R(N_1)|} n + 3^{|R(N_2)|} n) = O(3^r n). The number of pairs of such networks for which we compute a MACRS-Simple is O(3^r), each of which can be handled in time O(n^3) by Theorem 4. The total running time is thus O(3^r n + 3^r n^3) = O(3^r n^3).
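The arithmetic behind this bound can be spelled out as follows. The shorthand r_1 = |R(N_1)| and r_2 = |R(N_2)| is introduced here, and the derivation assumes that each network N_i has O(3^{r_i}) reticulation-trimmed subnetworks, which is consistent with the enumeration bound quoted from Theorem 3.

```latex
\begin{align*}
\text{number of pairs} &= O(3^{r_1}) \cdot O(3^{r_2}) = O(3^{r_1 + r_2}) = O(3^{r}),\\
T_{\text{enum}} &= O(3^{r_1} n + 3^{r_2} n) \subseteq O(3^{r} n),\\
T_{\text{total}} &= O(3^{r} n) + O(3^{r}) \cdot O(n^{3}) = O(3^{r} n^{3}).
\end{align*}
```

For instance, with r_1 = r_2 = 2 the number of pairs is on the order of 3^4 = 81, so the cubic dynamic program dominates the enumeration cost.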
5 Conclusion and discussion
In this paper, we presented the first exact algorithm to find an MACRS of two
rooted binary level-1 networks. The proposed approach starts by enumerating all
reticulation-trimmed subnetworks for both input networks, and then compares
all the possible pairs produced for each input network using a dynamic pro-
gramming algorithm for the MACRS-Simple problem. The enumeration step
presented here is currently exponential in the sum of reticulation numbers of
both input networks, and the MACRS-Simple algorithm takes cubic time in
the maximum number of vertices contained in the input networks.
In addition to extracting a common subnetwork structure of maximum size from two orchard networks, the proposed algorithm makes it possible to measure the amount of difference between them. As shown
in our previous work [19], there is a direct correspondence between finding an
MACRS (more specifically, its size) and calculating one of the three equivalent
distances presented in that work. As such, the algorithm presented here provides
a first method to calculate exactly these distances. This can be used in the future
to compare this distance with other distances (such as the mixed distance) or to
evaluate the accuracy of different heuristic approaches.
Future extensions
There is an obvious optimization that can be applied to the approach presented
in this work related to the enumeration of the reticulation-trimmed subnetworks.
Since the MACRS-Simple algorithm by definition does not remove reticula-
tions, comparing two input reticulation-trimmed subnetworks that do not share
the same reticulation number or topology (in the sense that no mapping of the
components containing reticulations can be made) will result in no solution. An
obvious improvement to the enumeration step is to compare the topological rela-
tionships of the reticulations in both input networks (which, in the case of level-1
networks, can be modelled by trees), find the largest common reticulation topol-
ogy between them, and start enumerating from there by gradually removing all
possible reticulations. While this strategy does not improve the worst-case bound, it may greatly reduce the number of reticulation-trimmed subnetwork pairs to consider on many real inputs (potentially bringing it down to a linear number of pairs).
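The simplest part of this idea, skipping pairs whose reticulation numbers differ, can be sketched as below. This is only an illustration under stated assumptions: `candidate_pairs` and the callable `reticulation_number` are hypothetical names, the subnetworks can be any objects, and the topology comparison described above is not attempted.

```python
from collections import defaultdict


def candidate_pairs(subnets1, subnets2, reticulation_number):
    """Yield only the pairs that share the same reticulation number.

    `reticulation_number` is a caller-supplied function (an assumption of
    this sketch) returning the number of reticulations of a subnetwork.
    """
    # Bucket the second list by reticulation number once.
    by_count = defaultdict(list)
    for n2 in subnets2:
        by_count[reticulation_number(n2)].append(n2)

    # Pair each subnetwork of the first list with its matching bucket only.
    for n1 in subnets1:
        for n2 in by_count.get(reticulation_number(n1), []):
            yield n1, n2
```

Pairs filtered out this way would have received −∞ from the MACRS-Simple computation anyway, so the filter changes the work done, not the result.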
Another interesting avenue of work is to generalize our algorithm to higher-level networks. A possible strategy would be to extend the MACRS-Simple dynamic programming to consider, for each pair of biconnected components, all possible isomorphisms, find the maximum value, and then add to it the values of the exterior nodes that are matched in the isomorphism.
Attaching leaves to a non-orchard network was used previously to extend
an approach to solve the minimum hybridization problem on any rooted phylo-
genetic network [20]. Exploring if and how a similar idea could be employed to
generalize our proposed algorithm to non-orchard networks should be considered.
Finally, as mentioned earlier, the complexity of our method is exponential in
the sum of the number of reticulations in both input networks because of the
enumeration step. Ideally, we could find an approach for which the complexity
would depend only on the level of the two input networks, which we leave as an
open problem.
References
1. Allen-Savietta, C.: Estimating Phylogenetic Networks from Concatenated Se-
quence Alignments. The University of Wisconsin-Madison (2020)
2. Baroni, M., Semple, C., Steel, M.: A framework for representing reticulate evolu-
tion. Annals of Combinatorics 8, 391–408 (2005)
3. Bernardini, G., van Iersel, L., Julien, E., Stougie, L.: Reconstructing phylogenetic
networks via cherry picking and machine learning. In: WABI 2022-2nd Interna-
tional Workshop on Algorithms in Bioinformatics (2022)
4. Bordewich, M., Semple, C.: Determining phylogenetic networks from inter-taxa
distances. Journal of mathematical biology 73(2), 283–303 (2016)
5. Cardona, G., Llabrés, M., Rosselló, F., Valiente, G.: Metrics for phylogenetic networks I: Generalizations of the Robinson-Foulds metric. IEEE/ACM Transactions on Computational Biology and Bioinformatics 6(1), 46–61 (2008)
6. Erdős, P.L., Semple, C., Steel, M.: A class of phylogenetic networks reconstructable from ancestral profiles. Mathematical biosciences 313, 33–40 (2019)
7. Hopcroft, J.E., Tarjan, R.E.: Dividing a graph into triconnected components. SIAM
Journal on computing 2(3), 135–158 (1973)
8. Huber, K.T., Linz, S., Moulton, V.: The rigid hybrid number for two phylogenetic
trees. Journal of Mathematical Biology 82, 1–29 (2021)
9. Huber, K.T., Linz, S., Moulton, V.: Cherry picking in forests: A new characteri-
zation for the unrooted hybrid number of two phylogenetic trees. arXiv preprint
arXiv:2212.08145 (2022)
10. Humphries, P.J., Linz, S., Semple, C.: Cherry picking: a characterization of the
temporal hybridization number for a set of phylogenies. Bulletin of mathematical
biology 75(10), 1879–1890 (2013)
11. Huson, D.H., Bryant, D.: Application of Phylogenetic Networks in Evolutionary Studies. Molecular Biology and Evolution 23(2), 254–267 (10 2005). https://doi.org/10.1093/molbev/msj030
12. Huson, D.H., Rupp, R., Scornavacca, C.: Phylogenetic networks: concepts, algo-
rithms and applications. Cambridge University Press (2010)
13. van Iersel, L., Janssen, R., Jones, M., Murakami, Y.: Orchard networks are trees
with additional horizontal arcs. Bulletin of Mathematical Biology 84(8), 76 (2022)
14. van Iersel, L., Janssen, R., Jones, M., Murakami, Y., Zeh, N.: A unifying character-
ization of tree-based networks and orchard networks using cherry covers. Advances
in Applied Mathematics 129, 102222 (2021)
15. Janssen, R., Jones, M., Murakami, Y.: Combining networks using cherry picking
sequences. In: Algorithms for Computational Biology: 7th International Confer-
ence, AlCoB 2020, Missoula, MT, USA, April 13–15, 2020, Proceedings. pp. 77–92.
Springer (2020)
16. Janssen, R., Murakami, Y.: Linear time algorithm for tree-child network contain-
ment. In: Algorithms for Computational Biology: 7th International Conference,
AlCoB 2020, Missoula, MT, USA, April 13–15, 2020, Proceedings 7. pp. 93–107.
Springer (2020)
17. Janssen, R., Murakami, Y.: On cherry-picking and network containment. Theoret-
ical Computer Science 856, 121–150 (2021)
18. Kong, S., Pons, J.C., Kubatko, L., Wicke, K.: Classes of explicit phylogenetic net-
works and their biological and mathematical significance. Journal of Mathematical
Biology 84(6), 47 (2022)
19. Landry, K., Teodocio, A., Lafond, M., Tremblay-Savard, O.: Defining phylogenetic
network distances using cherry operations. IEEE/ACM Transactions on Compu-
tational Biology and Bioinformatics (2022)
20. Linz, S., Semple, C.: Attaching leaves and picking cherries to characterise the
hybridisation number for a set of phylogenies. Advances in Applied Mathematics
105, 102–129 (2019)
21. Lu, B., Zhang, L., Leong, H.W.: A program to compute the soft robinson–foulds
distance between phylogenetic networks. BMC genomics 18, 1–10 (2017)
22. Lutteropp, S., Scornavacca, C., Kozlov, A.M., Morel, B., Stamatakis, A.: Netrax:
accurate and fast maximum likelihood phylogenetic network inference. Bioinfor-
matics 38(15), 3725–3733 (2022)
23. Nguyen, Q., Roos, T.: Likelihood-based inference of phylogenetic networks from
sequence data by phylodag. In: Algorithms for Computational Biology: Second
International Conference, AlCoB 2015, Mexico City, Mexico, August 4-5, 2015,
Proceedings 2. pp. 126–140. Springer (2015)
24. Park, H.J., Jin, G., Nakhleh, L.: Bootstrap-based support of hgt inferred by max-
imum parsimony. BMC Evolutionary Biology 10(1), 1–11 (2010)
25. Solís-Lemus, C., Bastide, P., Ané, C.: Phylonetworks: a package for phylogenetic networks. Molecular biology and evolution 34(12), 3292–3298 (2017)
26. Tan, M., Long, H., Liao, B., Cao, Z., Yuan, D., Tian, G., Zhuang, J., Yang, J.: Qs-
net: Reconstructing phylogenetic networks based on quartet and sextet. Frontiers
in Genetics 10, 607 (2019)
27. Van Iersel, L., Janssen, R., Jones, M., Murakami, Y., Zeh, N.: Polynomial-time
algorithms for phylogenetic inference problems involving duplication and reticula-
tion. IEEE/ACM transactions on computational biology and bioinformatics 17(1),
14–26 (2019)
28. Van Iersel, L., Jones, M., Weller, M.: Embedding phylogenetic trees in networks of
low treewidth. arXiv preprint arXiv:2207.00574 (2022)
29. Wen, D., Yu, Y., Zhu, J., Nakhleh, L.: Inferring phylogenetic networks using phy-
lonet. Systematic biology 67(4), 735–740 (2018)
Appendix
1 Proof of Lemma 1 (page 8)
Lemma 1. Let N be a network. Then for any N′ ⊆_cr N, there exists a reticulation-trimmed subnetwork N′′ of N and a simple CS S such that N′′⟨S⟩ = N′.
Proof. Let N and N′ be networks such that N′ ⊆_cr N. We use induction on |E(N)| to prove a slightly stronger statement. We show that for any F ⊆ E_R(N) such that there exists a CS S′ satisfying N⟨S′⟩ = N′ that removes the set of reticulation edges F, there exists a reticulation-trimmed subnetwork N′′ of N with respect to F and a CS S such that N′′⟨S⟩ = N′.
The base case is |E(N)| = 1. In this case we have a singleton network consisting of a root, a single leaf, and an edge between them. There are no cherries and only F = ∅ is possible, and so every N′ ⊆_cr N satisfies N′ = N, with S = ∅ giving N⟨S⟩ = N′. So the claim is trivially true.
For the induction step, assume that the claim holds for all networks whose
number of edges is strictly smaller than |E(N)|.
Let F ⊆ E_R(N), and suppose there is a CS S′ such that N⟨S′⟩ = N′, and that the set of reticulation edges removed by S′ is F. If every reduction in S′ is simple, then F = ∅ and N is itself a reticulation-trimmed subnetwork of N with respect to F = ∅, in which case the statement holds. Otherwise, let (u, v) ∈ F be the first reticulation edge of N removed by a reduction in S′, say (u, v) is removed by cherry S′_i. On network N⟨S′_(0:i)⟩, u and v must both have leaf children. This implies that there are no reticulations in reach(u, N). Thus, all cherries (x, y) of S′_(0:i) such that {x, y} ⊆ reach(u, N) are simple.
Assume for now that at least one such simple reduction exists, and let S′_h = (x, y) be the first of them. With this, we see that (x, y) is a cherry on N throughout every step of the reduction of N by S′_(0:h), including on N itself. We may therefore assume that (x, y) is the first reduction of S′ by Theorem 1. Thus N′ is obtained from N⟨(x, y)⟩ by applying the CS S′_(1:|S′|), which we know removes the set of reticulation edges F. By induction, there exists a reticulation-trimmed subnetwork N* of N⟨(x, y)⟩ with respect to F and a simple CS S* such that N*⟨S*⟩ = N′. Let T be a smallest CS that results in N* when applied to N⟨(x, y)⟩. Then N* can be obtained from N by applying the CS (x, y)·T. We note that reduction (x, y) is required in any CS that results in a reticulation-trimmed subnetwork of N with respect to F (since the cherry (x, y) is below u and (u, v) needs to be removed), and it follows by the minimality of T on N⟨(x, y)⟩ that (x, y)·T is minimum on N. Thus N* is a reticulation-trimmed subnetwork of N with respect to F and the claim holds.
It is also possible that (x, y) does not exist as a simple reduction, which occurs when u and v each already have a leaf child in N. In this case, (x, y) is a reticulated cherry on N such that p(x) = v and p(y) = u. As in the previous case, we may assume that (x, y) is the first reduction of S′. Let F′ = F \ {(u, v)}. By induction there is a reticulation-trimmed subnetwork N* of N⟨(x, y)⟩ with respect to F′ and there exists a CS S* such that N*⟨S*⟩ = N′. We also have a CS T
that is minimum and contains a reticulated reduction if and only if it removes an edge in F′. As before, the reduction (x, y) is required in any reticulation-trimmed subnetwork of N with respect to F, and it follows by the minimality of T that (x, y)·T yields such a network. Again, N* is a reticulation-trimmed subnetwork of N with respect to F and there is a simple CS S* such that N*⟨S*⟩ = N′.
⊓⊔
2 Proofs from Section 4.1: Lemma 2 (page 9), Lemma 3 (page 9), Lemma 4 (page 10), Lemma 5 (page 11)
Lemma 2. Let N be a network, F ⊆ E_R(N) be a set, and N′ be a network that is a reticulation-trimmed subnetwork of N with respect to F. Then F is disjoint.
Proof. Suppose F is not disjoint. There are two cases. First, suppose F contains edges (u, v) and (w, v). Any reduction of the network must first remove one of these edges, say (u, v). This reduction removes the vertex v, and thus the edge (w, v) can no longer be in the network. Hence no single CS can remove both reticulation edges leading into the same reticulation. Next, suppose F is not disjoint because it contains edges (u, v) and (u, w). This is not possible in a level-1 network, because the reticulations v and w would be contained in the same biconnected component.
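The property in Lemma 2 is easy to test on a candidate set F before any enumeration work is done. The sketch below assumes that "disjoint" means no two edges of F share an endpoint, which covers both cases of the proof; the function name `is_disjoint` and the encoding of edges as (tail, head) pairs are choices made here for illustration.

```python
def is_disjoint(F):
    """Return True if no vertex is an endpoint of two edges of F.

    F is an iterable of (tail, head) pairs. This reading of "disjoint"
    is an assumption of the sketch, not a definition from the paper.
    """
    seen = set()
    for u, v in F:
        if u in seen or v in seen:
            return False
        seen.update((u, v))
    return True
```

A set containing both (u, v) and (w, v), or both (u, v) and (u, w), is rejected, matching the two cases handled in the proof.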
Lemma 3. Let N be a network and F ⊆ E_R(N) be a set such that there exists a reticulation-trimmed subnetwork of N with respect to F. Then there exists a topological sort of F.
Proof. This claim is evidenced by the topological partial ordering on R(N), which exists on any network. Since F is disjoint by Lemma 2, each edge of F enters a distinct reticulation, so this ordering carries over to F.
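One way to obtain such an ordering in practice is to topologically sort the vertices of N and rank each edge of F by the position of the reticulation it enters. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later). The function name, the edge-list encoding, and the choice to list the lowest edges first are assumptions made for illustration.

```python
from graphlib import TopologicalSorter


def sort_reticulation_edges(edges, F):
    """Order the edges of F from lowest to highest in the network.

    `edges` lists every (parent, child) edge of the network, and F is a
    disjoint subset of its reticulation edges, given as (tail, head) pairs.
    """
    # TopologicalSorter expects a mapping from each node to its predecessors.
    preds = {}
    for parent, child in edges:
        preds.setdefault(child, set()).add(parent)
        preds.setdefault(parent, set())

    # static_order() lists parents before children, so the root comes first.
    order = TopologicalSorter(preds).static_order()
    position = {x: i for i, x in enumerate(order)}

    # A larger position means lower in the network; lowest edges go first.
    return sorted(F, key=lambda e: position[e[1]], reverse=True)
```

Because the networks are binary, the number of edges is linear in the number of vertices, so the sort of N itself takes linear time.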
Lemma 4. Let N be a network and let F ⊆ E_R(N). Then there do not exist two non-strongly isomorphic reticulation-trimmed subnetworks of N with respect to F.
Proof. Let N be a network and F ⊆ E_R(N) be a set. We claim there are not two non-strongly isomorphic reticulation-trimmed subnetworks with respect to F. We proceed by induction on |V(N)| + |E(N)|.
In the base case, |V(N)| + |E(N)| ≤ 3 and N is a singleton network on one edge, with a root and a leaf as endpoints. In this case, F = ∅ and the reticulation-trimmed subnetwork of N with respect to F is N itself. There are no cherries on N, so there is not more than one reticulation-trimmed subnetwork of N with respect to F.
Assume the claim holds for all networks with strictly fewer vertices and edges than N. If there is no reticulation-trimmed subnetwork of N with respect to F, then the claim holds. Assume such a network, N′, exists. If F = ∅ then N is the only reticulation-trimmed subnetwork of N with respect to F, as making any
cherry reductions would not be minimum (since N⟨∅⟩ = N is already sufficient to remove the empty set F). The claim holds in this case, so we now assume F ≠ ∅.
Let edge e ∈ F be (arbitrarily) one of the lowest edges in the topological sort on F (which exists by Lemma 3). Because N′ exists, there is at least one cherry below an endpoint of e (e may be in the cherry itself). Let one such cherry be called (x, y).
Next assume (x, y) is such that (p(y), p(x)) ≠ e. Then (x, y) is not reticulated, as otherwise e would not be one of the lowest reticulation edges in F; thus (x, y) is simple. Because e is in F, it has to be removed, and there must be leaves under its endpoints for this to happen; thus (x, y) needs to be reduced in any CS S such that N⟨S⟩ is a reticulation-trimmed subnetwork of N with respect to F. Moreover, by Theorem 1, we may assume that any such CS S starts with (x, y), as otherwise (x, y) can be applied first without affecting the resulting network. It follows that we may assume that, for every CS S such that N⟨S⟩ is a reticulation-trimmed subnetwork of N with respect to F, applying the first reduction results in N⟨(x, y)⟩. Then applying S_(1:|S|) on N⟨(x, y)⟩ must yield a reticulation-trimmed subnetwork of N⟨(x, y)⟩ with respect to F (in particular, |S_(1:|S|)| is minimum, as otherwise |S| would not be minimum). By the induction hypothesis, N⟨(x, y)⟩ does not have two non-strongly isomorphic reticulation-trimmed subnetworks with respect to F. Since we may assume that all CSs S applicable to N that result in such a trimmed subnetwork go through N⟨(x, y)⟩, it follows that N also does not have two non-strongly isomorphic reticulation-trimmed subnetworks with respect to F.
Finally, assume that (x, y) is reticulated and (p(y), p(x)) = e. As in the previous case, note that (x, y) must be present in any CS S such that N⟨S⟩ is a reticulation-trimmed subnetwork of N with respect to F, and we may further assume that any such CS starts with (x, y). Then, any such CS first goes through N⟨(x, y)⟩, and then by minimality, results in a reticulation-trimmed subnetwork of N⟨(x, y)⟩ with respect to F \ {(p(y), p(x))}. By induction, there is only one such network, and thus also only one reticulation-trimmed subnetwork of N with respect to F.
⊓⊔
Lemma 5. Algorithm 3 on (N, F) returns the reticulation-trimmed subnetwork N′ of N with respect to F if it exists, and NULL if not, and runs in time O(|V(N)|).
Proof. First, there is always a topological sort on F (Lemma 3) and F is disjoint (Lemma 2).
If a network is not returned, then it must be that (R(u) ∪ R(v)) \ {v} ≠ ∅ on one of the (possibly) partially reduced subnetworks for some edge (u, v) ∈ F. In this case, F does not admit a reticulation-trimmed subnetwork, since no edge in F corresponds to the reduction of all reticulations below u and v, which is a requirement for the reduction of reticulation v.
Assume now that a network is returned. We claim the algorithm is correct in this case.
We will show this claim by constructing a CS S such that N⟨S⟩ = N′, and by counting the exact number of reticulated cherries, which we show will remove the desired F. Finally, we will show S is minimum.
Say F′ has the order ⟨(u_1, v_1), (u_2, v_2), ...⟩. For each (u_i, v_i), in order, we construct a CS S_i on the partially reduced N, then let S = S_1 · S_2 · .... Note that this means we construct each S_i on N⟨S_1 · S_2 · ... · S_{i−1}⟩. There are distinct subnetworks rooted on u_i and v_i, so there are CSs S^u_i, S^v_i that are complete for each respective subnetwork. Since we assume a network was returned, (R(u_i) ∪ R(v_i)) \ {v_i} = ∅, so S^u_i and S^v_i are simple. Say they reduce the subnetworks to leaves l^u_i and l^v_i respectively (so that p(l^u_i) = u_i and p(l^v_i) = v_i). Let S_i = S^u_i · S^v_i · (l^v_i, l^u_i). The cherry (l^v_i, l^u_i) will reduce the reticulation edge (u_i, v_i) under these conditions.
In this way, there is a reticulated cherry in S if and only if it reduces an edge in F. Furthermore, S is minimum since we construct S_i on the network N⟨S_1 · S_2 · ... · S_{i−1}⟩ and we have selected only the cherry reductions that are necessary and sufficient for the reduction of the targeted reticulation edges.
The claimed running time is straightforward. We can obtain a topological sort on F using a standard topological sort of N obtained in time O(|V(N)| + |E(N)|) = O(|V(N)|) (since our networks are binary). Then the algorithm only iterates over F and replaces subnetworks by multi-labeled leaves, which can be handled in time O(|V(N)|).
3 Proof of Theorem 5 (page 14)
Lemma 6 and Corollary 1 are required to justify the arguments in the proof of Theorem 5.
Lemma 6. Let N be a network and S be any CS on N. If r ∈ R(N) is such that r ∈ N⟨S⟩, then every v ∈ reach^−(r, N) satisfies v ∈ N⟨S⟩.
Proof. Assume, for contradiction, that there exist r ∈ R(N) with r ∈ N⟨S⟩ and v ∈ reach^−(r, N) with v ∉ N⟨S⟩. First, v is not a leaf, as a leaf cannot be above any other vertex. There must be some cherry in S, say S_i = (x, y), such that v = p(x), v = p(y) or v = p(x) = p(y) in N⟨S_(0:i)⟩. Since r ∈ N⟨S⟩, we have that r ∈ N⟨S_(0:i)⟩, and since v ∈ reach^−(r, N) we have that v ∈ reach^−(r, N⟨S_(0:i)⟩). Thus the only orientation we can have is v = p(y), r = p(x) in the reticulated cherry (x, y) of N⟨S_(0:i)⟩. But x, y, p(x), p(y) are removed in N⟨S_(0:i]⟩, contradicting that r ∈ N⟨S⟩.
Note that Lemma 6 implies that, after any simple reduction by a CS S on a network N, since every r ∈ R(N) is in N⟨S⟩, every vertex of ⋃_{r ∈ R(N)} reach^−(r, N) is also in N⟨S⟩. Noting this, it follows that
Corollary 1. For all networks N_1, N_2 and every N* = MACRS-Simple(N_1, N_2) such that N* is non-null, there exists an edge-preserving bijective function
f : ⋃_{r ∈ R(N_1)} reach^−(r, N_1) → ⋃_{r ∈ R(N_2)} reach^−(r, N_2).
Theorem 5. The entry M[ρ(N_1), ρ(N_2)] correctly contains |L(N*)| for N* = MACRS-Simple(N_1, N_2) if one exists, and −∞ otherwise.
Proof. Given input networks N_1 and N_2, for any u ∈ ρ(B_1), v ∈ ρ(B_2), let the subnetwork of N_1 rooted on u be called N_u and let the subnetwork of N_2 rooted on v be called N_v. We claim that M[u, v] as we defined it always contains the number of leaves of a MACRS-Simple between N_u and N_v. In particular, this will show our desired result with u = ρ(N_1), v = ρ(N_2). The proof is by induction on |V(N_u)| + |V(N_v)|.
As a base case, suppose that |V(N_u)| = 1 and |V(N_v)| = 1, i.e. u and v are both leaves. If X(u) ∩ X(v) ≠ ∅, we put M[u, v] = 1, which is correct since N_u and N_v are weakly isomorphic. If X(u) ∩ X(v) = ∅, we put M[u, v] = −∞, which is correct since there is no MACRS-Simple between networks that do not share a leaf label.
Let us now consider the inductive step. For the rest of the proof, for any u′ ∈ V(N_1), v′ ∈ V(N_2), we denote by N*_{u′,v′} a MACRS-Simple between N_{u′} and N_{v′}, if one exists (otherwise, N*_{u′,v′} is undefined). As an inductive hypothesis, we assume that M[u′, v′] is the number of leaves in N*_{u′,v′}, for any u′, v′ such that |V(N_{u′})| + |V(N_{v′})| < |V(N_u)| + |V(N_v)|.
The proof is split into cases.
Case: u and v are trivial, one is a leaf. If X(u) ∩ X(v) ≠ ∅ and R(u) ∪ R(v) = ∅ then M[u, v] = 1. This is correct since a complete reduction can proceed on both networks without any reticulations and with at least one leaf in common, a requirement for any network to be isomorphic with a singleton network. Moreover, there cannot be more than one leaf since u or v is itself a leaf. For the same reason, when X(u) ∩ X(v) = ∅, M[u, v] = −∞ is correct. When R(u) ∪ R(v) ≠ ∅, M[u, v] = −∞ is correct because a MACRS-Simple of N_u and N_v can only be a leaf, but this cannot be achieved since one of the networks has an unremovable reticulation.
Case: u and v are trivial and R(u) ∪ R(v) = ∅, and either X(u_1) ∩ X(v_1) ≠ ∅ and X(u_2) ∩ X(v_2) = ∅, or X(u_1) ∩ X(v_1) = ∅ and X(u_2) ∩ X(v_2) ≠ ∅, or X(u_1) ∩ X(v_2) ≠ ∅ and X(u_2) ∩ X(v_1) = ∅, or X(u_1) ∩ X(v_2) = ∅ and X(u_2) ∩ X(v_1) ≠ ∅. In this case, M[u, v] = 1 by line 6. Neither u nor v has reticulations below it, thus any reduction may proceed. In fact, the reduction on N_u and N_v must be complete to obtain an isomorphic network: since there is no shared leaf below one child of u and one child of v, that child must be removed to reach any MACRS-Simple(N_u, N_v), requiring a cherry on u and v. Since there is a leaf shared below one child of u and one child of v, a singleton isomorphic network is possible, so this case is correct.
Case: u and v are trivial, R(u) ∪ R(v) ≠ ∅, and X(u_1) ∩ X(v_1) = ∅ or X(u_2) ∩ X(v_2) = ∅ or X(u_1) ∩ X(v_2) = ∅ or X(u_2) ∩ X(v_1) = ∅. In this case line 2 resolves to false, so we calculate M[u, v] on line 6. We find that M[u, v] = −∞ since R(u) ∪ R(v) ≠ ∅. A complete reduction is required to reach the required isomorphic singleton in this condition, but the presence of a reticulation prevents this.
Case: u is trivial and v is not trivial, or v is trivial and u is not trivial. In this case we always resolve M[u, v] = −∞ by lines 8 and 9. This is indeed correct: the presence of a reticulation in N_v and not in N_u (or in N_u and not N_v) makes an isomorphic network unreachable by simple reductions alone.
Case: u and v are trivial, neither is a leaf, and X(u_1) ∩ X(v_1) ≠ ∅ and X(u_2) ∩ X(v_2) ≠ ∅, or X(u_1) ∩ X(v_2) ≠ ∅ and X(u_2) ∩ X(v_1) ≠ ∅. In this case we calculate M[u, v] on line 6. Regardless of any reticulations that may be below u or v, we put M[u, v] as the maximum between M[u_1, v_1] + M[u_2, v_2] and M[u_1, v_2] + M[u_2, v_1]. We can assume, by the inductive hypothesis, that M[u_1, v_1] = |L(N*_{u_1,v_1})| and M[u_2, v_2] = |L(N*_{u_2,v_2})| (likewise M[u_1, v_2] = |L(N*_{u_1,v_2})| and M[u_2, v_1] = |L(N*_{u_2,v_1})|). It is not difficult to see that if N*_{u,v}, a MACRS-Simple of N_u and N_v, exists, then it can be obtained by joining a MACRS-Simple of N_{u_1}, N_{v_1} with a MACRS-Simple of N_{u_2}, N_{v_2} under a common parent, or by joining a MACRS-Simple of N_{u_1}, N_{v_2} with a MACRS-Simple of N_{u_2}, N_{v_1} under a common parent. In the current case, M_1 = M[u_1, v_1] + M[u_2, v_2] and M_2 = M[u_1, v_2] + M[u_2, v_1] correspond to constructing these two possible networks, and since they contain the correct values by induction, M[u, v] = max(M_1, M_2) is correct.
Case: u and v are both non-trivial and M[r_1, r_2] ≠ −∞ for the reticulations r_1, r_2 in the biconnected components of u and v respectively, and M[h(p^1_i), h(p^2_i)] ≠ −∞ for all i, with p^1 on π^1_l or π^1_r and p^2 on π^2_l or π^2_r. In this case we resolve
M[u, v] = M[r_1, r_2] + Σ_{i=1}^{|π^1_l|} M[h(p^1_{l,i}), h(p^2_{l,i})] + Σ_{i=1}^{|π^1_r|} M[h(p^1_{r,i}), h(p^2_{r,i})]
or
M[u, v] = M[r_1, r_2] + Σ_{i=1}^{|π^1_l|} M[h(p^1_{l,i}), h(p^2_{r,i})] + Σ_{i=1}^{|π^1_r|} M[h(p^1_{r,i}), h(p^2_{l,i})].
Each table reference in this summation returns a value that is correct for that subnetwork, by the induction hypothesis, as every subnetwork Ñ of N_u (Ñ ≠ N_u) and of N_v (Ñ ≠ N_v) is smaller. The summation itself is also correct. This is evident by noting that the biconnected components on u and v only contain vertices along π_l and π_r [12], so all vertices in the components are accounted for. Furthermore, all bridges must lead to disjoint networks. Finally, by Corollary 1 the only possible networks are constructed by joining up child networks that pair vertices in the order evident by the forbidden paths and their independent subnetwork children/siblings. Since we consider the maximum among all such possible networks, the solution is maximal. For this same reason, when u and v are non-trivial but the conditions are such that M[u, v] = −∞ by line 19, or by an operand being −∞ in line 14 or line 17, M[u, v] = −∞ is correctly found.