All content in this area was uploaded by Kaari Landry on Jun 20, 2023

arXiv:2305.00033v1 [cs.DS] 28 Apr 2023

Finding agreement cherry-reduced subnetworks in level-1 networks

Kaari Landry*¹, Olivier Tremblay-Savard¹, and Manuel Lafond²

¹ University of Manitoba, Winnipeg MB, Canada (* landryk1@cs.umanitoba.ca)

² Université de Sherbrooke, Sherbrooke QC, Canada

Abstract. Phylogenetic networks are increasingly being considered as better suited to represent the complexity of the evolutionary relationships between species. One class of phylogenetic networks that has received a lot of attention recently is the class of orchard networks, which is composed of networks that can be reduced to a single leaf using cherry reductions. Cherry reductions, also called cherry-picking operations, remove either a leaf of a simple cherry (sibling leaves sharing a parent) or a reticulate edge of a reticulate cherry (two leaves whose parents are connected by a reticulate edge). In this paper, we present a fixed-parameter tractable algorithm to solve the problem of finding a maximum agreement cherry-reduced subnetwork (MACRS) between two rooted binary level-1 networks. This is the first exact algorithm proposed to solve the MACRS problem. As proven in earlier work, there is a direct relationship between finding an MACRS and calculating a distance based on cherry operations. As a result, the proposed algorithm also provides a distance that can be used for the comparison of level-1 networks.

Keywords: Cherry operations · Graphs and networks · Trees · Network problems · Algorithm design and analysis · Biology and genetics · Phylogenetic networks

1 Introduction

Phylogenetic trees have been used extensively throughout the years to represent simple evolutionary relationships between species. Because of this, many tools and techniques are readily available to efficiently build, compare and evaluate trees. Phylogenetic networks, on the other hand, are much better suited to represent more complex relationships, such as the ones resulting from hybridization, recombination and lateral gene transfer events [11]. In the last 15 years or so, bioinformatics research has focused increasingly on solving problems related to phylogenetic networks, such as network construction [24,23,25,29,26,1,22], minimum hybridization number [2,10,27,15,8,9,3], tree/network containment [16,17,28], and distance calculation between networks [5,21,19].

One crucial concept that has been shown to be a very useful tool in solving several of the important phylogenetic network problems mentioned above is that of cherry-picking sequences [10,20]. A cherry-picking sequence is made up of operations that can reduce a network by either removing one leaf of a simple (tree-like) cherry (i.e. two leaf siblings descending from the same parent vertex), or removing one reticulate edge of a reticulated cherry (two leaves whose parent vertices are connected by a reticulate edge). The concept of cherry-picking has been so valuable that it led to the definition of orchard networks, also known as cherry-picking networks, which are simply phylogenetic networks that can be reduced to a single leaf by cherry-picking operations [6,17]. Recent work has focused on further characterizing and classifying different subtypes of orchard networks [14,18,13].

Lately, we have used a generalized definition of cherry operations to describe both cherry reductions (i.e. cherry picking) and cherry expansions (the reverse of a reduction, which adds a simple or reticulate cherry) [19]. We then defined four novel distances between orchard networks that are based on cherry operations, with three of them being different formulations of an equivalent distance (construction, deconstruction and tail distances) and the fourth one (mixed distance) being a lower bound for the other three. In the process of describing these distances, the concept of a maximum agreement cherry-reduced subnetwork (MACRS; note that we replace cherry-picking used in [19] by cherry-reduced here for clarity) was defined to represent a network contained in both networks being compared that maximizes the number of vertices. We showed that finding an MACRS of two orchard networks was NP-hard, and that this was analogous to the problem of calculating the three equivalent distances.

In this work, we present an exact fixed-parameter tractable (FPT) algorithm to compute an MACRS of two rooted binary level-1 networks that is exponential in the sum of reticulations present in both networks. More precisely, our algorithm runs in time $O(3^r n^3)$, where $r$ is the sum of reticulations and $n$ represents the maximum number of vertices of the input networks. Our approach essentially consists of enumerating a certain set of subnetworks of the input networks in which all possible combinations of reticulation edges have been removed. Then, it makes use of a dynamic programming algorithm that determines whether an MACRS exists (and what it is, if it exists) between two level-1 networks in which the remaining reticulations cannot be removed (we call this problem MACRS-Simple). We prove that the initial MACRS problem can be solved by solving the MACRS-Simple problem on all combinations of enumerated subnetworks.

It is worth noting that another important difference between the previous defining work on MACRS and this article is the definition of networks. Specifically, we allow leaves of the network to have multiple labels. In fact, we force all leaf labels to be conserved as the network is trimmed by cherry reductions, by subsuming the labels of a removed leaf onto its remaining cherry sibling. In this way, we keep a “memory” of reductions, and this compressed representation of networks allows us to restore all possible alternative (bijective) leaf labelings of the network from it.

Finally, we conclude the paper by discussing how the enumeration step could be optimized by considering the relationships between the reticulations of both input networks. We also briefly present a preliminary idea of how the proposed algorithm could be extended to higher-level binary networks. Even though the proposed approach applies to orchard networks and not to general networks, the orchard network class contains network types that are of interest to the research community, such as the tree-child networks [4] and tree-sibling time-consistent networks [6]. The tree-child networks in particular, in addition to having been studied extensively in the literature, are biologically relevant, since all ancestral species (internal vertices) have a path to a leaf that uses only tree vertices. This reflects the idea that ancestral species have descendants that will persist through mutation and speciation events, and that hybridization events are not as common as speciation events [18].

2 Preliminaries

We first introduce the notions regarding networks, then proceed to defining cherry operations and our problem of interest.

2.1 Networks

A phylogenetic network $N$, or a network for short, is an acyclic directed graph with no vertex having both in-degree 1 and out-degree 1, and whose vertices and edges are denoted $V(N)$ and $E(N)$, respectively. We assume that all networks are binary. For $v \in V(N)$, we use $v^-$ and $v^+$ to denote the in-degree and out-degree of $v$, respectively. The set $V(N)$ contains

– the root $\rho(N)$, which is the unique node satisfying $\rho(N)^- = 0$ and $\rho(N)^+ = 2$ (in the case that $|V(N)| = 2$, $\rho(N)^+ = 1$);

– the leaves $L(N)$, which satisfy $l^- = 1$ and $l^+ = 0$ for all $l \in L(N)$;

– the internal vertices $V(N) \setminus (L(N) \cup \{\rho(N)\})$, which comprise:

  • the tree vertices $T(N)$, which satisfy $v^- = 1$ and $v^+ = 2$ for all $v \in T(N)$;

  • the reticulation vertices $R(N)$, or simply reticulations, which satisfy $v^- = 2$ and $v^+ = 1$ for all $v \in R(N)$.

We use $X$ to denote the set of all taxa. For our purposes, the leaves of a network $N$ are labeled by one or more taxa. For $l \in L(N)$, we will use $X(l)$ to denote the set of taxa that label $l$. We require that $X(l) \neq \emptyset$, and that for any distinct leaves $l_1, l_2 \in L(N)$, $X(l_1) \cap X(l_2) = \emptyset$.

The edges directed into a reticulation vertex are called reticulation edges, denoted $E_R(N)$. For $v \in V(N)$, the out-neighbors of $v$ are called its children. If $v$ has a single in-neighbor, we denote it by $p(v)$ and call it the parent of $v$ (if $v \in \{\rho(N)\} \cup R(N)$, then $p(v)$ is undefined). Vertices $u$ and $v$ are siblings if $p(u)$, $p(v)$ are defined and $p(u) = p(v)$. When there is a directed path from vertex $v$ to vertex $u$, we call $v$ an ancestor of $u$ and we call $u$ a descendant of $v$. The descendants of $v$ are denoted $\mathrm{reach}(v, N)$ while its ancestors are denoted $\mathrm{reach}^-(v, N)$ (note that $v$ itself is in both sets). The union of the labels in $\mathrm{reach}(v, N) \cap L(N)$ is denoted $X(v)$. We denote by $R(v)$ the set of reticulations in $\mathrm{reach}(v, N)$.
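The definitions above map naturally onto adjacency lists. The following is a minimal Python sketch, not the authors' implementation; the `Network` class, its method names and the seven-vertex `example` are assumptions introduced purely for illustration.

```python
class Network:
    """Hypothetical container for a binary network (illustration only)."""

    def __init__(self, edges, labels):
        # edges: iterable of (parent, child); labels: leaf -> iterable of taxa
        self.children, self.parents = {}, {}
        for u, v in edges:
            self.children.setdefault(u, []).append(v)
            self.children.setdefault(v, [])
            self.parents.setdefault(v, []).append(u)
            self.parents.setdefault(u, [])
        self.labels = {leaf: set(taxa) for leaf, taxa in labels.items()}

    def kind(self, v):
        """Classify v from its (in-degree, out-degree) pair."""
        degrees = (len(self.parents[v]), len(self.children[v]))
        names = {(0, 2): "root", (0, 1): "root", (1, 0): "leaf",
                 (1, 2): "tree", (2, 1): "reticulation"}
        return names[degrees]

    def reach(self, v):
        """Descendants of v, including v itself."""
        seen, stack = set(), [v]
        while stack:
            w = stack.pop()
            if w not in seen:
                seen.add(w)
                stack.extend(self.children[w])
        return seen

    def taxa(self, v):
        """X(v): union of the label sets of the leaves below v."""
        return set().union(*(self.labels[w] for w in self.reach(v)
                             if w in self.labels))

    def reticulations_below(self, v):
        """R(v): reticulations in reach(v)."""
        return {w for w in self.reach(v) if self.kind(w) == "reticulation"}


# Small level-1 example: r is a reticulation with parents u and v.
example = Network(
    [("rho", "u"), ("rho", "v"), ("u", "a"), ("u", "r"),
     ("v", "r"), ("v", "b"), ("r", "c")],
    {"a": {"a"}, "b": {"b"}, "c": {"c"}},
)
```

In this example, `example.taxa("u")` collects the labels of leaves `a` and `c`, since `c` hangs below the reticulation `r`, which is reachable from `u`.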

Two networks $N_1$, $N_2$ are weakly isomorphic if there exists a bijection $\sigma : V(N_1) \to V(N_2)$ such that $(u, v) \in E(N_1)$ if and only if $(\sigma(u), \sigma(v)) \in E(N_2)$, and such that for each $l \in L(N_1)$, $X(l) \cap X(\sigma(l)) \neq \emptyset$. For this we use the notation $N_1 \simeq N_2$. If, for each $l \in L(N_1)$, $X(l) = X(\sigma(l))$, then we say $N_1$ and $N_2$ are strongly isomorphic, which we denote by $N_1 = N_2$.

A network $N$ may have only one edge, whose endpoints are $\rho(N)$ and a leaf. Then $N$ is a single-leaf network or singleton. We say $\rho(N)$ roots $N$. If, for a vertex $v$ and all vertices $v' \in \mathrm{reach}(v, N)$, every path from $\rho(N)$ to $v'$ goes through $v$, then we say $v$ roots the subnetwork below it.

While a network $N$ is directed, there is an undirected version of $N$ on the same vertex set, with an undirected edge $\{u, v\}$ present for every $(u, v) \in E(N)$, which we call the underlying graph. It is on this underlying graph that we identify the set of biconnected components of $N$. Such a component is a maximal subgraph $B$ that cannot be disconnected by the removal of an edge therein. Note that every individual leaf and some tree vertices alone constitute a biconnected component; we refer to such single-vertex components as trivial, and to all others as non-trivial. For a set of biconnected components $B_1, \ldots, B_b$ of a network $N$, a bridge is an edge $(u, v)$ such that $u \in B_i$, $v \in B_j$ for some $1 \leq i \neq j \leq b$.

The level of a network is the maximum number of reticulations across all biconnected components of a network. A level-$k$ network has no biconnected component with more than $k$ reticulations. A level-1 network has every biconnected component with either 0 or 1 reticulations. Note that this does not limit the number of reticulations over the whole network, just in each biconnected component.
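To make the notion of level concrete, here is a deliberately naive Python sketch. It follows the edge-removal wording above: it finds bridges by deleting each edge in turn, then counts reticulations in each component of the underlying graph once bridges are removed. The function names are assumptions for illustration, and this quadratic approach is not the linear-time method one would use in practice.

```python
def connected_components(vertices, edges):
    """Connected components of an undirected graph given as vertex pairs."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, comps = set(), []
    for start in vertices:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            w = stack.pop()
            if w not in comp:
                comp.add(w)
                stack.extend(adj[w] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def level(edges):
    """Maximum number of reticulations in any bridge-free component."""
    edges = list(edges)
    vertices = {v for e in edges for v in e}
    base = len(connected_components(vertices, edges))
    # An edge is a bridge if deleting it increases the component count.
    bridges = [e for i, e in enumerate(edges)
               if len(connected_components(vertices, edges[:i] + edges[i + 1:])) > base]
    kept = [e for e in edges if e not in bridges]
    in_degree = {v: 0 for v in vertices}
    for _, child in edges:
        in_degree[child] += 1
    reticulations = {v for v, d in in_degree.items() if d == 2}
    return max(len(comp & reticulations)
               for comp in connected_components(vertices, kept))


level1 = [("rho", "u"), ("rho", "v"), ("u", "a"), ("u", "r"),
          ("v", "r"), ("v", "b"), ("r", "c")]
level2 = [("rho", "u"), ("rho", "v"), ("u", "r1"), ("v", "r1"),
          ("u", "r2"), ("v", "r2"), ("r1", "a"), ("r2", "b")]
```

Here `level1` has a single cycle with one reticulation, while in `level2` both reticulations share the component containing `rho`, `u` and `v`.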

2.2 Cherries and cherry reductions

A cherry is a pair of leaves that are siblings or that have a reticulation joining their parents. More specifically, a pair $(x, y) \in L(N) \times L(N)$ is called a cherry if either $p(x) = p(y)$, in which case $(x, y)$ is called a simple cherry, or $p(x) \in R(N)$ and $(p(y), p(x)) \in E(N)$, in which case $(x, y)$ is called a reticulated cherry.

Let $N$ be a network and let $(x, y)$ be a pair of vertices. Then applying the cherry reduction $(x, y)$ on $N$ creates a new network as follows:

– If $(x, y)$ is a simple cherry of $N$, then the $(x, y)$ reduction consists of removing the leaf $x$ and the edge $(p(x), x)$, suppressing the resulting node of in- and out-degree 1 if any, and re-assigning $X(y) = X(y) \cup X(x)$. Note that the operation we introduce here differs from the cherry reduction operation described in previous work, where both the leaf $x$ and the set $X(x)$ are deleted. The purpose of our new definition is to preserve a reference to which label could have been assigned to $y$. This is to say that the labels on a given leaf are interchangeable [19, Lemma 3].

– If $(x, y)$ is a reticulated cherry of $N$, then we remove the reticulation edge $(p(y), p(x))$ and the resulting vertices of in- and out-degree 1 are suppressed. In this case, we say that the reticulation edge $(p(y), p(x))$ is removed by the cherry reduction $(x, y)$.

– If $(x, y)$ is not a cherry of $N$, then $N$ is unchanged.

The resulting graph is a network, and always has a cherry unless it is a singleton (true of all orchard networks by definition [6,17]).

Cherry reductions often occur in batches, and a sequence $S$ of pairs of leaves is called a cherry sequence (CS). The number of elements in $S$ is denoted $|S|$. The cherry at position $i$ of a CS $S$ is referred to by $S_i$. We use $N\langle S \rangle$ to denote the network obtained from $N$ by first applying cherry reduction $S_1$ on $N$, then $S_2$ on the resulting network, and so on until $S_{|S|}$ is applied. Note that we allow $S$ to contain pairs that do not modify the network (e.g. non-cherries). The subsequence from (including) the first cherry to (excluding) the $i$th cherry in $S$ is $S(0:i)$. When a CS $S$ reduces a network $N$ to a singleton, then we say $S$ is complete for $N$. We assume networks are orchard networks hereafter.
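The two reduction rules and the notion of a cherry sequence can be sketched in Python as follows. This is an illustrative sketch under assumptions: the dictionary-based representation and the helper names (`make_network`, `reduce_cherry`, `apply_sequence`) are hypothetical and do not come from the paper. Labels of a removed leaf are merged into its sibling, as in the definition above, and pairs that are not cherries leave the network unchanged.

```python
def make_network(edges, labels):
    """Build parent/child adjacency lists plus leaf label sets."""
    children, parents = {}, {}
    for u, v in edges:
        children.setdefault(u, []).append(v)
        children.setdefault(v, [])
        parents.setdefault(v, []).append(u)
        parents.setdefault(u, [])
    return {"children": children, "parents": parents,
            "labels": {leaf: set(x) for leaf, x in labels.items()}}


def remove_edge(net, u, v):
    net["children"][u].remove(v)
    net["parents"][v].remove(u)


def suppress(net, w):
    """Suppress w if it has in-degree 1 and out-degree 1."""
    ps, cs = net["parents"][w], net["children"][w]
    if len(ps) == 1 and len(cs) == 1:
        p, c = ps[0], cs[0]
        remove_edge(net, p, w)
        remove_edge(net, w, c)
        net["children"][p].append(c)
        net["parents"][c].append(p)
        del net["parents"][w], net["children"][w]


def is_leaf(net, v):
    return (v in net["children"] and not net["children"][v]
            and len(net["parents"][v]) == 1)


def reduce_cherry(net, x, y):
    """Apply cherry reduction (x, y) in place; return True if net changed."""
    if x == y or not (is_leaf(net, x) and is_leaf(net, y)):
        return False
    px, py = net["parents"][x][0], net["parents"][y][0]
    if px == py:  # simple cherry: remove x, merge its labels into y
        remove_edge(net, px, x)
        del net["parents"][x], net["children"][x]
        net["labels"][y] |= net["labels"].pop(x)
        suppress(net, px)
        return True
    if len(net["parents"][px]) == 2 and py in net["parents"][px]:
        # reticulated cherry: remove the reticulation edge (p(y), p(x))
        remove_edge(net, py, px)
        suppress(net, px)
        suppress(net, py)
        return True
    return False  # not a cherry: network unchanged


def apply_sequence(net, sequence):
    """Compute N<S> by applying each pair of S in order."""
    for x, y in sequence:
        reduce_cherry(net, x, y)
    return net
```

For instance, on the network with edges `rho→u`, `rho→v`, `u→a`, `u→r`, `v→r`, `v→b`, `r→c`, the sequence `[("c", "a"), ("b", "c"), ("a", "c")]` is complete: it first removes the reticulation edge `(u, r)`, then applies two simple reductions, leaving a singleton whose leaf carries all three labels.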

See Figure 1 for an illustration of the two cherry reduction operations, and the concepts of isomorphism.

[Figure 1: five networks $N_1$ to $N_5$ with internal vertices $u$, $v$, $w$, $x$ and reticulations $r_1$, $r_2$. $N_1$ has singly-labeled leaves $a$ to $f$; in $N_2$ and $N_3$ leaf $e$ carries labels $\{d, e\}$; $N_4$ has leaves $a$, $b$ and $e = \{c, d, e, f\}$; $N_5$ has leaves $a$, $b$ and $c = \{c\}$.]

Fig. 1. In this figure, leaves are represented by open circles, tree vertices as filled circles, reticulations as filled squares, and the root of the network as a filled, inverted triangle. Network $N_1$ is a level-1 network with $|R(N_1)| = 2$. $N_1$ is a reticulation-trimmed subnetwork of $N_1$ with respect to $F = \emptyset$. Network $N_2 = N_1\langle (d, e) \rangle$, where $(d, e)$ is a simple cherry/reduction. Network $N_3 = N_2\langle (e, f) \rangle$, where $(e, f)$ is a reticulated cherry/reduction. $N_3$ is a reticulation-trimmed subnetwork of $N_1$ and of $N_2$ with respect to $F = \{(x, r_2)\}$. Network $N_4 = N_3\langle (c, e) \cdot (f, e) \cdot (e, b) \rangle$ and is a reticulation-trimmed subnetwork of $N_1$ and of $N_2$ with respect to $F = \{(x, r_2), (v, r_1)\}$ or to $F = \{(w, r_2), (v, r_1)\}$. Network $N_5 \simeq N_4$; in fact, there are CSs that may lead to leaf $e$ being any of leaves $c$, $d$, $e$, or $f$. Each of these networks would have the same label set on that leaf, and all are weakly isomorphic with $N_5$.

Cherries on a network can be reduced in any order. We restate a theorem of [16] that we adapt to our formalism (see footnote 3).

Theorem 1. Let $N$ be a network, let $(x, y)$ be a cherry of $N$, and let $S$ be a CS that contains $(x, y)$. Then there exists a CS $S'$ such that $N\langle S \rangle = N\langle S' \rangle$, and whose first element is $(x, y)$.

2.3 Maximum agreement cherry-reduced subnetworks

For networks $N$ and $N'$, when there exists a CS $S$ such that $N\langle S \rangle \simeq N'$, we say that $N'$ is a cherry-reduced subnetwork (CRS) of $N$, denoted by $N' \subseteq_{cr} N$. We can now define the main problem of focus.

The Maximum Agreement Cherry-Reduced Subnetwork (MACRS) problem.

Input: Two orchard networks $N_1$ and $N_2$.

Find: A network $N^*$ with the maximum number of vertices that satisfies $N^* \subseteq_{cr} N_1$ and $N^* \subseteq_{cr} N_2$.

A solution $N^*$ to the above problem will be called an MACRS of $N_1$ and $N_2$.

3 An MACRS algorithm on level-1 networks

We show that the MACRS problem can be solved in time $O(3^r n^3)$ for $n = \max(|V(N_1)|, |V(N_2)|)$ and $r = |R(N_1)| + |R(N_2)|$ on level-1 networks. We employ a two-step strategy. First, we enumerate a number of inputs that have been specially reduced to a selected set of remaining reticulations. Second, these inputs are provided to a cubic-time dynamic programming algorithm for an easier version of MACRS that uses only simple reductions. Because the number of special inputs is bounded by $3^r$, we get an FPT algorithm. MACRS is thus split into two subproblems. We first introduce them and show how they can be used to solve MACRS. The later sections then focus on each problem separately.

Let $N$ be a network and let $F \subseteq E_R(N)$ be a subset of reticulation edges. We wish to generate all the maximal cherry-reduced subnetworks of $N$ under the restriction that the reticulation edges removed by cherry operations coincide with $F$. Thus, we say that a network $N'$ is a reticulation-trimmed subnetwork of $N$ with respect to $F$ if there exists a CS $S$ such that $N\langle S \rangle = N'$, such that $(u, v) \in F$ if and only if $S$ contains a reticulated cherry reduction that removes $(u, v)$, and such that $S$ is of minimum length, i.e. we require that there is no other CS $S'$ with $|S'| < |S|$ that satisfies the same properties.

Furthermore, we say that $N'$ is a reticulation-trimmed subnetwork of $N$ if there exists a set $F \subseteq E_R(N)$ such that $N'$ is a reticulation-trimmed subnetwork of $N$ with respect to $F$.

(Footnote 3) Note that the authors prove the statement under the assumption that $S$ is complete, and that leaves are single-labeled. However, the proof is easy to adapt to our context.

The Reticulation-Trimmed Enumeration problem:

Input: An orchard network N.

Find: the set of all reticulation-trimmed subnetworks of N.

Note that the size of the set of reticulation-trimmed subnetworks depends heavily on the network structure. For instance, it is possible to show that it is linear when all reticulations are arranged in a path, and exponential when all reticulations are independent (none is an ancestor of another). It is possible to calculate the size of this set exactly by algorithmic means through an abstraction of the network structure. However, we reserve the analysis of the impact of this parameter on our algorithm for future work.

Once the set of reticulation edges to remove has been guessed, it remains to infer the set of non-reticulated cherry operations. A simple CS is a CS that contains only simple cherries. In this way, $R(N) = R(N\langle S \rangle)$ for any simple CS $S$. For networks $N$ and $N'$, when there exists a simple CS $S$ such that $N\langle S \rangle \simeq N'$, we say that $N'$ is a CRS-SIMPLE of $N$. Note that owing to our definition of weak isomorphism, $N\langle S \rangle \simeq N'$ does not mean that $S$ transforms $N$ into $N'$. A better intuition would rather be that after applying $S$ on $N$, we could choose one label in the label set of each leaf of $N\langle S \rangle$ and of $N'$, such that the resulting networks would be isomorphic in the traditional sense.

The Simple Maximum Agreement Cherry-Reduced Subnetwork (MACRS-Simple) problem.

Input: Two orchard networks $N_1$ and $N_2$.

Find: a network $N^*$ with a maximum number of vertices such that $N^*$ is a CRS-SIMPLE of $N_1$ and a CRS-SIMPLE of $N_2$.

A solution $N^*$ to the above problem will be called a MACRS-Simple of $N_1$ and $N_2$.

For the standard MACRS problem on networks $N_1$ and $N_2$, there is always a solution as long as $X(N_1) \cap X(N_2) \neq \emptyset$; however, since reticulations cannot be removed by a simple CS, the MACRS-Simple problem may not have a solution (for instance when the two networks have different numbers of reticulation vertices). We can now describe our main algorithm, where we assume that the MACRS-Simple routine correctly returns an optimal solution to the above problem.

Algorithm 1 MACRS Finder

Input: Two networks $N_1$ and $N_2$
Output: A MACRS of $N_1$ and $N_2$

1: $\tilde{N} \leftarrow$ empty network
2: for each reticulation-trimmed subnetwork $N'_1$ of $N_1$ do
3:     for each reticulation-trimmed subnetwork $N'_2$ of $N_2$ do
4:         Let $N'$ be a MACRS-Simple of $N'_1$ and $N'_2$
5:         if $N'$ exists and $|V(N')| > |V(\tilde{N})|$ then $\tilde{N} \leftarrow N'$
6:     end for
7: end for
8: return $\tilde{N}$
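The control flow of Algorithm 1 can be sketched in Python as below. The sketch is intentionally agnostic about how networks are represented: the three callables `rt_subnetworks`, `macrs_simple` and `size` are hypothetical stand-ins for the Reticulation-Trimmed Enumeration routine, the MACRS-Simple routine and the vertex count $|V(\cdot)|$, and `None` plays the role of both the empty network and a missing MACRS-Simple.

```python
def macrs_finder(n1, n2, rt_subnetworks, macrs_simple, size):
    """Return the largest MACRS-Simple over all pairs of trimmed subnetworks.

    rt_subnetworks(n) -> iterable of reticulation-trimmed subnetworks of n
    macrs_simple(a, b) -> a MACRS-Simple of a and b, or None if none exists
    size(n)            -> number of vertices of n
    All three are assumed to be supplied by the caller.
    """
    best = None  # stands for the empty network of line 1
    trimmed_2 = list(rt_subnetworks(n2))  # reused for every subnetwork of n1
    for t1 in rt_subnetworks(n1):
        for t2 in trimmed_2:
            candidate = macrs_simple(t1, t2)
            if candidate is not None and (best is None or size(candidate) > size(best)):
                best = candidate
    return best
```

Passing the subroutines as parameters keeps the double loop separate from the two subproblems, which mirrors how the correctness argument treats them as black boxes.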

An optimization technique is evident here: as we mentioned, there is a solution to MACRS-Simple($N_1$, $N_2$) only when $|R(N_1)| = |R(N_2)|$, since only simple reductions will be performed. Thus, we need only test such pairs. This optimization is not currently formalized in the algorithm and complexity analysis presented here, and is left for future work.

In the remainder of this section, we focus on proving that this algorithm works correctly. We will deal with the complexity of the algorithm once we have dealt with the Reticulation-Trimmed Enumeration and MACRS-Simple subproblems. We begin by showing that one can always obtain a subnetwork by first going through a reticulation-trimmed subnetwork, and then using only simple cherry reductions.

Lemma 1. Let $N$ be a network. Then for any $N' \subseteq_{cr} N$, there exists a reticulation-trimmed subnetwork $N''$ of $N$ and a simple CS $S$ such that $N''\langle S \rangle = N'$.

For proof of Lemma 1, see Appendix.

Theorem 2. Algorithm 1 correctly finds a MACRS of $N_1$ and $N_2$.

Proof. Let $N^*$ be a MACRS of $N_1$ and $N_2$. Let $\tilde{N}$ be the network returned by Algorithm 1. We first claim that if $\tilde{N}$ is non-empty, it does satisfy $\tilde{N} \subseteq_{cr} N_1, N_2$. To see this, note that every pair $N'_1$, $N'_2$ of networks enumerated by Algorithm 1 satisfies $N'_1 \subseteq_{cr} N_1$ and $N'_2 \subseteq_{cr} N_2$, by the definition of reticulation-trimmed subnetworks. Moreover, if a MACRS-Simple $N'$ of $N'_1$, $N'_2$ exists, then by transitivity, $N' \subseteq_{cr} N'_1 \subseteq_{cr} N_1$ and $N' \subseteq_{cr} N'_2 \subseteq_{cr} N_2$. Since $\tilde{N}$ is one of those $N'$, this proves our claim.

Let us now focus on the optimality of $\tilde{N}$. First note that $|V(N^*)| \geq |V(\tilde{N})|$: if $\tilde{N}$ is an empty network, this is obvious, and otherwise, by our above claim, $\tilde{N}$ is a cherry-reduced subnetwork of $N_1$ and $N_2$ and thus cannot be larger than $N^*$.

Let us now show that $|V(N^*)| \leq |V(\tilde{N})|$. By Lemma 1, there exists a reticulation-trimmed subnetwork $N'_1$ of $N_1$ (resp. $N'_2$ of $N_2$) such that $N^*$ can be obtained from $N'_1$ (resp. $N'_2$) using only simple CSs. Thus, $N^*$ is a CRS-SIMPLE of $N'_1$ and $N'_2$. Algorithm 1 will eventually enumerate $N'_1$ and $N'_2$ and find a MACRS-Simple $N'$ of them, which is of maximum size and thus has at least as many vertices as $N^*$. Since the returned $\tilde{N}$ is the $N'$ of maximum size found by the algorithm, it follows that $|V(N^*)| \leq |V(\tilde{N})|$.

4 Subroutines

4.1 Enumerating the set of reticulation-trimmed subnetworks

We now show how to enumerate the set of all reticulation-trimmed subnetworks of a network $N$ in time $O(3^{|R(N)|} |V(N)|)$. The reticulation-trimmed subnetworks are characterized by using no more reductions than are needed to remove the desired reticulation edges. Luckily, we will see that at most one such network can exist; we need only remove the complete subnetwork under both endpoints of the reduced reticulation edge. This is guaranteed to be possible by cherry reductions, assuming all reticulations below these endpoints have also been specified for removal. Algorithm 2 shows how to enumerate the relevant edges, and uses Algorithm 3 as a subroutine, which finds the reticulation-trimmed subnetwork with respect to a given edge set. We show that the reticulation-trimmed subnetwork of $N$ with respect to $F \subseteq E_R(N)$ is uniquely defined in Lemma 4. We say that a set of edges $F$ is disjoint if, for any two distinct edges $(u, v), (x, y) \in F$, $\{u, v\} \cap \{x, y\} = \emptyset$.

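Algorithm 2 REDUCED-SET-FINDER

Input: A network $N$
Output: The set of all reticulation-trimmed subnetworks of $N$

1: $\mathbf{N} \leftarrow \emptyset$
2: for each $F \in \{P(E_R(N)) : (a, b), (c, b) \notin F \text{ for any } a, b, c\}$ do
3:     $N' \leftarrow$ RT-SUBNET-MAKER($N$, $F$)
4:     if $N' \neq$ NULL then $\mathbf{N} \leftarrow \mathbf{N} \cup \{N'\}$
5: end for
6: return $\mathbf{N}$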
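The filter in the loop header of Algorithm 2 keeps only those sets $F$ that contain at most one of the two edges entering each reticulation. A minimal Python sketch of that enumeration follows; the function name `candidate_edge_sets` and the input format (a dictionary mapping each reticulation to its two parents) are assumptions made for illustration. Each reticulation contributes three options (neither edge, the first, or the second), giving $3^{|R(N)|}$ candidate sets in total.

```python
from itertools import product


def candidate_edge_sets(reticulation_parents):
    """Yield every F with at most one incoming edge per reticulation.

    reticulation_parents maps each reticulation r to its two parents (p, q);
    a reticulation edge is represented as the pair (parent, r).
    """
    reticulations = sorted(reticulation_parents)
    options = [
        (None,) + tuple((parent, r) for parent in reticulation_parents[r])
        for r in reticulations
    ]
    for choice in product(*options):
        yield frozenset(edge for edge in choice if edge is not None)
```

With two reticulations, for example `{"r1": ("u", "v"), "r2": ("w", "x")}`, the generator yields nine sets, from the empty set up to sets holding one incoming edge of each reticulation. Each candidate would then be handed to the subnetwork construction, which may still reject it.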

Lemma 2. Let $N$ be a network, $F \subseteq E_R(N)$ be a set, and $N'$ be a network that is a reticulation-trimmed subnetwork of $N$ with respect to $F$. Then $F$ is disjoint.

For proof of Lemma 2, see Appendix. Next, for $F \subseteq E_R(N)$, a topological sort of $F$ is an ordering of its elements such that for distinct edges $e_1, e_2 \in F$, if there is a path from a vertex of $e_1$ to a vertex of $e_2$ in $N$, then $e_1$ comes later than $e_2$ in this ordering.

Lemma 3. Let $N$ be a network and $F \subseteq E_R(N)$ be a set such that there exists a reticulation-trimmed subnetwork of $N$ with respect to $F$. Then there exists a topological sort of $F$.

For proof of Lemma 3, see Appendix. The next lemma is crucial, as it shows that reticulation-trimmed subnetworks with respect to a given $F$ are either unique, or do not exist. This allows us to enumerate in reasonable time.

Lemma 4. Let $N$ be a network and let $F \subseteq E_R(N)$. Then there do not exist two non-strongly isomorphic reticulation-trimmed subnetworks of $N$ with respect to $F$.

For proof of Lemma 4, see Appendix. As an example of a network $N$ and a set $F \subseteq E_R(N)$ that does not admit a reticulation-trimmed subnetwork of $N$, consider $N$ with two reticulations $r_1$, $r_2$ such that $r_1 \in \mathrm{reach}^-(r_2)$. Choosing $F = \{(p_1, r_1)\}$, for $p_1$ chosen arbitrarily between $r_1$'s parents, will not admit a reticulation-trimmed subnetwork: reticulation $r_1$ must have leaves below its endpoints to be in a cherry, but this choice of $F$ has no corresponding reticulated reduction of $r_2$, making it impossible to construct a CS $S$ that reduces only $r_1$.

We next describe Algorithm 3, which produces the reticulation-trimmed subnetwork with respect to some given $F$; see Figure 2 for an illustration.

[Figure 2: three panels (1), (2), (3) showing vertices $p(u)$, $u$, $v$, $p(v)$ and the constructed leaves $u'$, $v'$; panel (3) retains only $u'$ and $v'$.]

Fig. 2. In this figure, leaves are represented by open circles, tree vertices as filled circles, reticulations as filled squares. A subnetwork without reticulations is represented by a large open triangle; a subnetwork that may be reticulated is represented by a large open blob. This figure shows an example of the operation of Algorithm 3; note how $(R(u) \cup R(v)) \setminus \{v\} = \emptyset$ in this example. The subnetwork under label (1) is an example network at line 7: the dotted line represents the reticulation edge $(u, v)$ removed by line 5, and both leaves $u'$ and $v'$ have been constructed (leaf labels are not shown). The network under label (2) shows the state of network (1) at line 8, when edges $(p(u), u')$ and $(p(v), v')$ have been added. The network under label (3) shows the state of the network under (1) at line 9, when vertices in $\mathrm{reach}(u, N') \cup \mathrm{reach}(v, N')$ are removed.

Algorithm 3 RT-SUBNET-MAKER

Input: A network $N$ and a disjoint set $F \subseteq E_R(N)$
Output: the reticulation-trimmed subnetwork of $N$ with respect to $F$, or NULL if it does not exist

1: $N' \leftarrow N$
2: Find a topological sort $F'$ of $F$
3: for each $(u, v) \in F'$ in order do
4:     if $(R(u) \cup R(v)) \setminus \{v\} = \emptyset$ then
5:         delete edge $(u, v)$
6:         construct leaf $u'$ such that $X(u') = X(u)$
7:         construct leaf $v'$ such that $X(v') = X(v)$
8:         add edges $(p(u), u')$ and $(p(v), v')$ to $N'$
9:         remove all vertices in $\mathrm{reach}(u, N') \cup \mathrm{reach}(v, N')$
10:     else
11:         return NULL
12:     end if
13: end for
14: return $N'$

Lemma 5. Algorithm 3 on $(N, F)$ returns the reticulation-trimmed subnetwork $N'$ of $N$ with respect to $F$ if it exists, and NULL if not, and runs in time $O(|V(N)|)$.

For proof of Lemma 5, see Appendix.

Theorem 3. Algorithm 2 correctly enumerates all reticulation-trimmed subnetworks of a network $N$, and runs in time $O(3^{|R(N)|} |V(N)|)$.

Proof. It is already proved (Lemma 2) that a non-disjoint $F$ does not admit a reticulation-trimmed subnetwork, so it is correct to filter those. The remaining correctness follows from the exhaustive nature of the construction of all $F$ and from the correctness of Algorithm 3.

As for the time complexity, filtering non-disjoint $F$ implies a threefold choice on each reticulation (we include either one or none of its incoming edges, but not both, by disjointness). Thus the size of the set is $O(3^{|R(N)|})$. Recalling that Algorithm 3 can be implemented in time $O(|V(N)|)$, the total runtime for Algorithm 2 is in $O(3^{|R(N)|} |V(N)|)$.

4.2 An algorithm for MACRS-Simple

A dynamic programming algorithm that solves the MACRS-Simple problem in cubic time is given and proved in this section.

Assume we have networks $N_1$ and $N_2$ as input to the MACRS-Simple problem. We assume that we have computed the set of biconnected components of $N_1$ and $N_2$ in a preprocessing step, along with the bridge edges. This can be done in time $O(|V(N)|)$, see [7]. Since the networks considered are level-1, each biconnected component $B$ contains exactly one vertex $u$ that has no in-neighbor in $B$, and exactly one vertex $r$ that has no out-neighbor in $B$. If $B$ is trivial, then $u = r$, and otherwise $r$ is a reticulation vertex and there are two edge-disjoint paths from $u$ to $r$ in $B$ [12]. We refer to these two paths as component paths. The vertex $u$ will be called the root of $B$ and denoted $\rho(B)$, and $r$ will be called the bottom of $B$. We let $\mathcal{B}_1$ be the set of biconnected components of $N_1$ and $\mathcal{B}_2$ be the set of biconnected components of $N_2$. Finally, for $i \in \{1, 2\}$, we denote $\rho(\mathcal{B}_i) = \{\rho(B) : B \in \mathcal{B}_i\}$, i.e. the set of roots in $\mathcal{B}_i$.

Using dynamic programming, we construct a table $M$ whose rows are the roots in $\rho(\mathcal{B}_1)$ and whose columns are the roots in $\rho(\mathcal{B}_2)$. For $u \in \rho(\mathcal{B}_1)$, $v \in \rho(\mathcal{B}_2)$, we define $N_u$ as the subnetwork of $N_1$ rooted at $u$, and $N_v$ as the subnetwork of $N_2$ rooted at $v$. We then define $M[u, v]$ as the number of leaves in a MACRS-Simple of $N_u$ and $N_v$. If $u$ is a tree vertex, its children are denoted $u_1$ and $u_2$.

In $N_1$, we denote the two component paths on the same non-trivial biconnected component by $\pi^1_l = p^1_{l,1} \ldots$ and $\pi^1_r = p^1_{r,1} \ldots$, and in $N_2$ these paths will be denoted $\pi^2_l = p^2_{l,1} \ldots$ and $\pi^2_r = p^2_{r,1} \ldots$. For a vertex $p_i$ on path $\pi = p_1 \ldots$, let $h(p_i)$ be the child vertex of $p_i$ such that $h(p_i) \neq p_{i+1}$. In other words, the edge $(p_i, h(p_i))$ is a bridge pendant to $\pi$, leading to a different biconnected component where $h(p_i)$ roots a distinct subnetwork. See Figure 3 for an illustration of the component paths and the described labelings for an example network $N_1$.

We use Algorithm 4 to compute $M[u, v]$ for each $u \in \rho(\mathcal{B}_1)$, $v \in \rho(\mathcal{B}_2)$ in postorder. We seek the result $M[\rho(N_1), \rho(N_2)] + |R(N_1)|$, as $M$ records only the number of leaves in an MACRS-Simple of $N_1$ and $N_2$. From this information we can calculate the overall size of the network because, the networks being binary, $|V(N)| = 2|L(N)| + 2|R(N)| - 1$. Luckily, the number of reticulations in the solution is known ahead of time, since it must have the same number of reticulations as each of the inputs. Note that we can also reconstruct the network that corresponds to the optimal size of the MACRS-Simple of $N_1$ and $N_2$ by performing a traceback in the dynamic programming table.
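The vertex-count identity used above can be checked by a short degree-counting argument. The derivation below is a sketch that assumes a binary network with $|V(N)| > 2$, so that the root has out-degree 2; the shorthand $t = |T(N)|$, $\ell = |L(N)|$ and $r = |R(N)|$ is introduced here only for brevity.

```latex
\begin{align*}
\sum_{v \in V(N)} v^{+} &= 2 + 2t + r
  && \text{(root, tree vertices, reticulations)} \\
\sum_{v \in V(N)} v^{-} &= t + 2r + \ell
  && \text{(tree vertices, reticulations, leaves)} \\
2 + 2t + r &= t + 2r + \ell
  && \text{(both sums equal } |E(N)|\text{)} \\
t &= \ell + r - 2 \\
|V(N)| &= 1 + t + r + \ell = 2\ell + 2r - 1
\end{align*}
```

For example, a network with 3 leaves and 1 reticulation has $2 \cdot 3 + 2 \cdot 1 - 1 = 7$ vertices: the root, two tree vertices, the reticulation and the three leaves.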

Theorem 4. Algorithm 4 runs in time $O(|V(N_1)| |V(N_2)| (|V(N_1)| + |V(N_2)|))$.

Proof. The algorithm fills a table $M$ of maximum size $|V(N_1)| |V(N_2)|$; thus, if we can show that each table entry is calculated in at most linear ($|V(N_1)| + |V(N_2)|$) time, then the algorithm is cubic as claimed.

The preprocessing step to determine and label biconnected components is linear, as it requires a modified depth-first search [7]. Then, the calculations performed for lines 1 through 9 consist of finding and checking the labelled components (linear), and checking up to 12 set intersections (linear) of a vertex's descendant leaves (linear to find). Lines 10 and on perform a linear number of table lookups/calls. The paths themselves are also linear to find, as they are simply the paths that leave each child of the rooting vertex of the biconnected component and end on the next reticulation, the length of which can also be calculated in a single pass. Thus the claim holds.

Finding agreement cherry-reduced subnetworks in level-1 networks 13

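Algorithm 4

Input: Two multi-networks $N_1$, $N_2$, vertices $u \in \rho(B_1)$, $v \in \rho(B_2)$

Output: $M[u, v]$

1: if both $u$ and $v$ are trivial components then
2:   if $u$ or $v$ is a leaf then
3:     $M[u, v] = \begin{cases} 1 & \text{if } X(u) \cap X(v) \neq \emptyset \text{ and } R(u) \cup R(v) = \emptyset \\ -\infty & \text{otherwise} \end{cases}$
4:   else
5:     for each $i \in \{1, 2\}$, $j \in \{1, 2\}$, define $X_{ij} = X(u_i) \cap X(v_j)$
6:     $M[u, v] = \max(M_1, M_2)$ where
       $M_1 = \begin{cases} 1 & \text{if } X_{11} \neq \emptyset \text{ and } X_{22} = \emptyset \text{ and } R(u) \cup R(v) = \emptyset \\ 1 & \text{if } X_{11} = \emptyset \text{ and } X_{22} \neq \emptyset \text{ and } R(u) \cup R(v) = \emptyset \\ M[u_1, v_1] + M[u_2, v_2] & \text{if } X_{11} \neq \emptyset \text{ and } X_{22} \neq \emptyset \\ -\infty & \text{otherwise} \end{cases}$
       $M_2 = \begin{cases} 1 & \text{if } X_{12} \neq \emptyset \text{ and } X_{21} = \emptyset \text{ and } R(u) \cup R(v) = \emptyset \\ 1 & \text{if } X_{12} = \emptyset \text{ and } X_{21} \neq \emptyset \text{ and } R(u) \cup R(v) = \emptyset \\ M[u_1, v_2] + M[u_2, v_1] & \text{if } X_{12} \neq \emptyset \text{ and } X_{21} \neq \emptyset \\ -\infty & \text{otherwise} \end{cases}$
7:   end if
8: else if $u$ is a trivial biconnected component and $v$ is in a non-trivial biconnected component (or vice versa) then
9:   $M[u, v] = -\infty$
10: else ($u$ and $v$ are in non-trivial components with reticulations $r_1$, $r_2$ respectively and complement paths $\pi^1_l$, $\pi^1_r$, $\pi^2_l$, $\pi^2_r$)
11:   $M_1 = -\infty$
12:   $M_2 = -\infty$
13:   if $|\pi^1_l| = |\pi^2_l|$ and $|\pi^1_r| = |\pi^2_r|$ then
14:     $M_1 = M[r_1, r_2] + \sum_{i=1}^{|\pi^1_l|} M[h(p^1_{l,i}), h(p^2_{l,i})] + \sum_{i=1}^{|\pi^1_r|} M[h(p^1_{r,i}), h(p^2_{r,i})]$
15:   end if
16:   if $|\pi^1_l| = |\pi^2_r|$ and $|\pi^1_r| = |\pi^2_l|$ then
17:     $M_2 = M[r_1, r_2] + \sum_{i=1}^{|\pi^1_l|} M[h(p^1_{l,i}), h(p^2_{r,i})] + \sum_{i=1}^{|\pi^1_r|} M[h(p^1_{r,i}), h(p^2_{l,i})]$
18:   end if
19:   $M[u, v] = \max(M_1, M_2)$
20: end if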
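
To make lines 1 to 7 concrete, the following sketch evaluates the recurrence in the special case where both inputs are trees, so that $R(u) \cup R(v) = \emptyset$ holds everywhere and only trivial components occur. It is an illustration under assumptions introduced here, not the authors' code: a leaf is encoded as a `frozenset` of labels, an internal vertex as a pair `(left, right)`, and the helper names `X`, `M` and `option` are hypothetical.

```python
from functools import lru_cache

NEG_INF = float("-inf")


def is_leaf(u) -> bool:
    # Assumed encoding: a leaf is a frozenset of labels, an internal
    # vertex is a 2-tuple of child vertices.
    return isinstance(u, frozenset)


@lru_cache(maxsize=None)
def X(u) -> frozenset:
    """Set of leaf labels below u."""
    return u if is_leaf(u) else X(u[0]) | X(u[1])


@lru_cache(maxsize=None)
def M(u, v):
    """Lines 1-7 of Algorithm 4, restricted to inputs without reticulations."""
    if is_leaf(u) or is_leaf(v):                      # lines 2-3
        return 1 if X(u) & X(v) else NEG_INF
    (u1, u2), (v1, v2) = u, v

    def option(a, b, c, d):
        # Pair child a with b and child c with d (one of M1 / M2 on line 6).
        ab, cd = bool(X(a) & X(b)), bool(X(c) & X(d))
        if ab and cd:
            return M(a, b) + M(c, d)
        if ab != cd:          # exactly one pairing shares a label
            return 1
        return NEG_INF

    return max(option(u1, v1, u2, v2), option(u1, v2, u2, v1))
```

For example, with `a, b, c = (frozenset({s}) for s in "abc")`, calling `M(((a, b), c), ((a, c), b))` pairs the two conflicting triples and keeps two leaves.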


Fig. 3. Tree vertices are shown as filled circles and reticulations as filled squares. A subnetwork is represented by a large open blob. Vertices in red are in the same non-trivial biconnected component. Yellow edges are path $\pi^1_l$ and green edges are path $\pi^1_r$. Tree vertex $v$ is itself a trivial biconnected component such that $R(v) \neq \emptyset$. (Vertex labels shown in the figure: $\rho(B_i)$, $r_1$, $p^1_{l,1}$, $p^1_{r,1}$, $p^1_{r,2}$, $h(p^1_{l,1})$, $h(p^1_{r,1})$, $h(p^1_{r,2})$, $v$.)

Theorem 5. The entry $M[\rho(N_1), \rho(N_2)]$ correctly contains $|L(N^*)|$ for $N^* = \text{MACRS-Simple}(N_1, N_2)$ if one exists, and $-\infty$ otherwise.

See Appendix for proof of Theorem 5.

4.3 Complexity of Algorithm 1

Theorem 6. Let $N_1$, $N_2$ be two networks, let $n = \max(|V(N_1)|, |V(N_2)|)$, and $r = |R(N_1)| + |R(N_2)|$. Then the MACRS problem can be solved in time $O(3^r n^3)$.

Proof. By Theorem 3, Algorithm 2 can enumerate all reticulation-trimmed subnetworks of $N_1$ and $N_2$ in total time $O(3^{|R(N_1)|} n + 3^{|R(N_2)|} n) = O(3^r n)$. The number of pairs of such networks for which we compute a MACRS-Simple is $O(3^r)$, each of which can be handled in time $O(n^3)$ by Theorem 4. The total running time is thus $O(3^r n + 3^r n^3) = O(3^r n^3)$.
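
The arithmetic in the proof can be checked on small numbers. The helper below is a hypothetical illustration only: it drops all hidden constants and simply adds the two terms of the bound, so its value is an order-of-magnitude guide rather than a measured operation count.

```python
def macrs_step_bound(ret1: int, ret2: int, n: int) -> int:
    """Evaluate 3^|R(N1)| n + 3^|R(N2)| n + 3^r n^3 with constants ignored.

    ret1 and ret2 stand for |R(N1)| and |R(N2)|; n is the larger vertex count.
    """
    enumeration = (3 ** ret1 + 3 ** ret2) * n   # enumeration step (Theorem 3)
    pairs = 3 ** ret1 * 3 ** ret2               # equals 3 ** (ret1 + ret2)
    return enumeration + pairs * n ** 3         # one cubic DP per pair (Theorem 4)
```

With one reticulation in each network and `n = 10`, the cubic term (9 pairs times 1000) already dominates the enumeration term (60).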

5 Conclusion and discussion

In this paper, we presented the ﬁrst exact algorithm to ﬁnd an MACRS of two

rooted binary level-1 networks. The proposed approach starts by enumerating all

reticulation-trimmed subnetworks for both input networks, and then compares


all the possible pairs produced for each input network using a dynamic pro-

gramming algorithm for the MACRS-Simple problem. The enumeration step

presented here is currently exponential in the sum of reticulation numbers of

both input networks, and the MACRS-Simple algorithm takes cubic time in

the maximum number of vertices contained in the input networks.

In addition to the beneﬁt of being able to extract a common subnetwork

structure of maximum size from two orchard networks, the proposed algorithm

provides a measure of how much they differ. As shown

in our previous work [19], there is a direct correspondence between ﬁnding an

MACRS (more speciﬁcally, its size) and calculating one of the three equivalent

distances presented in that work. As such, the algorithm presented here provides

a first method to calculate these distances exactly. This can be used in the future

to compare this distance with other distances (such as the mixed distance) or to

evaluate the accuracy of diﬀerent heuristic approaches.

Future extensions

There is an obvious optimization that can be applied to the approach presented

in this work related to the enumeration of the reticulation-trimmed subnetworks.

Since the MACRS-Simple algorithm by deﬁnition does not remove reticula-

tions, comparing two input reticulation-trimmed subnetworks that do not share

the same reticulation number or topology (in the sense that no mapping of the

components containing reticulations can be made) will result in no solution. An

obvious improvement to the enumeration step is to compare the topological rela-

tionships of the reticulations in both input networks (which, in the case of level-1

networks, can be modelled by trees), ﬁnd the largest common reticulation topol-

ogy between them, and start enumerating from there by gradually removing all

possible reticulations. While this strategy does not improve the formal worst-case bound, it may greatly reduce the number of reticulation-trimmed subnetwork pairs to consider on many real inputs (potentially bringing it down to a linear number of pairs).
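
The weakest form of this pruning, keeping only pairs with the same reticulation number, can be sketched as follows. The input encoding (a list of `(name, reticulation_count)` tuples per network) and the function name are assumptions made for illustration; comparing reticulation topologies, as suggested above, would prune further and is not shown.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def pairs_with_equal_reticulation_number(
    subnets1: Iterable[Tuple[str, int]],
    subnets2: Iterable[Tuple[str, int]],
) -> Iterator[Tuple[str, str]]:
    """Yield only the pairs that can possibly admit a MACRS-Simple.

    Equal reticulation number is a necessary condition, not a sufficient one.
    """
    by_count: Dict[int, List[str]] = defaultdict(list)
    for name, k in subnets2:
        by_count[k].append(name)
    for name1, k in subnets1:
        for name2 in by_count.get(k, []):
            yield name1, name2
```

Bucketing the second list by count avoids scanning all pairs when most counts differ.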

Another interesting avenue of work is to generalize our algorithm to higher level networks. One possible strategy would be to extend the MACRS-Simple dynamic programming to consider, for each pair of biconnected components, all possible isomorphisms, find the maximum value, and then add to it the values of the exterior nodes that are matched in the isomorphism.

Attaching leaves to a non-orchard network was used previously to extend

an approach to solve the minimum hybridization problem on any rooted phylo-

genetic network [20]. Exploring if and how a similar idea could be employed to

generalize our proposed algorithm to non-orchard networks should be considered.

Finally, as mentioned earlier, the complexity of our method is exponential in

the sum of the number of reticulations in both input networks because of the

enumeration step. Ideally, we could ﬁnd an approach for which the complexity

would depend only on the level of the two input networks, which we leave as an

open problem.


References

1. Allen-Savietta, C.: Estimating Phylogenetic Networks from Concatenated Se-

quence Alignments. The University of Wisconsin-Madison (2020)

2. Baroni, M., Semple, C., Steel, M.: A framework for representing reticulate evolu-

tion. Annals of Combinatorics 8, 391–408 (2005)

3. Bernardini, G., van Iersel, L., Julien, E., Stougie, L.: Reconstructing phylogenetic

networks via cherry picking and machine learning. In: WABI 2022-2nd Interna-

tional Workshop on Algorithms in Bioinformatics (2022)

4. Bordewich, M., Semple, C.: Determining phylogenetic networks from inter-taxa

distances. Journal of mathematical biology 73(2), 283–303 (2016)

5. Cardona, G., Llabrés, M., Rosselló, F., Valiente, G.: Metrics for phylogenetic networks I: Generalizations of the Robinson-Foulds metric. IEEE/ACM Transactions on Computational Biology and Bioinformatics 6(1), 46–61 (2008)

6. Erdős, P.L., Semple, C., Steel, M.: A class of phylogenetic networks reconstructable from ancestral profiles. Mathematical biosciences 313, 33–40 (2019)

7. Hopcroft, J.E., Tarjan, R.E.: Dividing a graph into triconnected components. SIAM

Journal on computing 2(3), 135–158 (1973)

8. Huber, K.T., Linz, S., Moulton, V.: The rigid hybrid number for two phylogenetic

trees. Journal of Mathematical Biology 82, 1–29 (2021)

9. Huber, K.T., Linz, S., Moulton, V.: Cherry picking in forests: A new characteri-

zation for the unrooted hybrid number of two phylogenetic trees. arXiv preprint

arXiv:2212.08145 (2022)

10. Humphries, P.J., Linz, S., Semple, C.: Cherry picking: a characterization of the

temporal hybridization number for a set of phylogenies. Bulletin of mathematical

biology 75(10), 1879–1890 (2013)

11. Huson, D.H., Bryant, D.: Application of Phylogenetic Networks in Evolutionary Studies. Molecular Biology and Evolution 23(2), 254–267 (2005). https://doi.org/10.1093/molbev/msj030

12. Huson, D.H., Rupp, R., Scornavacca, C.: Phylogenetic networks: concepts, algo-

rithms and applications. Cambridge University Press (2010)

13. van Iersel, L., Janssen, R., Jones, M., Murakami, Y.: Orchard networks are trees

with additional horizontal arcs. Bulletin of Mathematical Biology 84(8), 76 (2022)

14. van Iersel, L., Janssen, R., Jones, M., Murakami, Y., Zeh, N.: A unifying character-

ization of tree-based networks and orchard networks using cherry covers. Advances

in Applied Mathematics 129, 102222 (2021)

15. Janssen, R., Jones, M., Murakami, Y.: Combining networks using cherry picking

sequences. In: Algorithms for Computational Biology: 7th International Confer-

ence, AlCoB 2020, Missoula, MT, USA, April 13–15, 2020, Proceedings. pp. 77–92.

Springer (2020)

16. Janssen, R., Murakami, Y.: Linear time algorithm for tree-child network contain-

ment. In: Algorithms for Computational Biology: 7th International Conference,

AlCoB 2020, Missoula, MT, USA, April 13–15, 2020, Proceedings 7. pp. 93–107.

Springer (2020)

17. Janssen, R., Murakami, Y.: On cherry-picking and network containment. Theoret-

ical Computer Science 856, 121–150 (2021)

18. Kong, S., Pons, J.C., Kubatko, L., Wicke, K.: Classes of explicit phylogenetic net-

works and their biological and mathematical signiﬁcance. Journal of Mathematical

Biology 84(6), 47 (2022)


19. Landry, K., Teodocio, A., Lafond, M., Tremblay-Savard, O.: Deﬁning phylogenetic

network distances using cherry operations. IEEE/ACM Transactions on Compu-

tational Biology and Bioinformatics (2022)

20. Linz, S., Semple, C.: Attaching leaves and picking cherries to characterise the

hybridisation number for a set of phylogenies. Advances in Applied Mathematics

105, 102–129 (2019)

21. Lu, B., Zhang, L., Leong, H.W.: A program to compute the soft robinson–foulds

distance between phylogenetic networks. BMC genomics 18, 1–10 (2017)

22. Lutteropp, S., Scornavacca, C., Kozlov, A.M., Morel, B., Stamatakis, A.: Netrax:

accurate and fast maximum likelihood phylogenetic network inference. Bioinfor-

matics 38(15), 3725–3733 (2022)

23. Nguyen, Q., Roos, T.: Likelihood-based inference of phylogenetic networks from

sequence data by phylodag. In: Algorithms for Computational Biology: Second

International Conference, AlCoB 2015, Mexico City, Mexico, August 4-5, 2015,

Proceedings 2. pp. 126–140. Springer (2015)

24. Park, H.J., Jin, G., Nakhleh, L.: Bootstrap-based support of hgt inferred by max-

imum parsimony. BMC Evolutionary Biology 10(1), 1–11 (2010)

25. Solís-Lemus, C., Bastide, P., Ané, C.: Phylonetworks: a package for phylogenetic networks. Molecular biology and evolution 34(12), 3292–3298 (2017)

26. Tan, M., Long, H., Liao, B., Cao, Z., Yuan, D., Tian, G., Zhuang, J., Yang, J.: Qs-

net: Reconstructing phylogenetic networks based on quartet and sextet. Frontiers

in Genetics 10, 607 (2019)

27. Van Iersel, L., Janssen, R., Jones, M., Murakami, Y., Zeh, N.: Polynomial-time

algorithms for phylogenetic inference problems involving duplication and reticula-

tion. IEEE/ACM transactions on computational biology and bioinformatics 17(1),

14–26 (2019)

28. Van Iersel, L., Jones, M., Weller, M.: Embedding phylogenetic trees in networks of

low treewidth. arXiv preprint arXiv:2207.00574 (2022)

29. Wen, D., Yu, Y., Zhu, J., Nakhleh, L.: Inferring phylogenetic networks using phy-

lonet. Systematic biology 67(4), 735–740 (2018)


Appendix

1 Proof of Lemma 1 (page 8)

Lemma 1. Let $N$ be a network. Then for any $N' \subseteq_{cr} N$, there exists a reticulation-trimmed subnetwork $N''$ of $N$ and a simple CS $S$ such that $N''\langle S \rangle = N'$.

Proof. Let $N$ and $N'$ be networks such that $N' \subseteq_{cr} N$. We use induction on $|E(N)|$ to prove a slightly stronger statement. We show that for any $F \subseteq E_R(N)$ such that there exists a CS $S'$ satisfying $N\langle S' \rangle = N'$ that removes the set of reticulation edges $F$, there exists a reticulation-trimmed subnetwork $N''$ of $N$ with respect to $F$ and a CS $S$ such that $N''\langle S \rangle = N'$.

The base case is $|E(N)| = 1$. In this case we have a singleton network on a root, a single leaf, and an edge between them. There are no cherries and only $F = \emptyset$ is possible, and so all $N' \subseteq_{cr} N$ have $N' = N$ and $S = \emptyset$ for $N\langle S \rangle = N'$. So the claim is trivially true.

For the induction step, assume that the claim holds for all networks whose

number of edges is strictly smaller than |E(N)|.

Let $F \subseteq E_R(N)$, and suppose there is a CS $S'$ such that $N\langle S' \rangle = N'$, and that the set of reticulation edges removed by $S'$ is $F$. If every reduction in $S'$ is simple, then $F = \emptyset$ and $N$ is itself a reticulation-trimmed subnetwork of $N$ with respect to $F = \emptyset$, in which case the statement holds. Otherwise, let $(u, v) \in F$ be the first reticulation edge of $N$ removed by a reduction in $S'$, say $(u, v)$ is removed by cherry $S'_i$. On network $N\langle S'_{(0:i)} \rangle$, $u$ and $v$ must both have leaf children. This implies that there are no reticulations in $\mathrm{reach}(u, N)$. Thus, all cherries $(x, y)$ of $S'_{(0:i)}$ such that $\{x, y\} \subseteq \mathrm{reach}(u, N)$ are simple.

Assume for now that at least one such simple reduction exists, and let $S'_h = (x, y)$ be the first of them. With this, we see that $(x, y)$ is a cherry on $N$ throughout every step of the reduction of $N$ by $S'_{(0:h)}$, including on $N$. We may therefore assume that $(x, y)$ is the first reduction of $S'$ by Theorem 1. Thus $N'$ is obtained from $N\langle (x, y) \rangle$ by applying the CS $S'_{(1:|S'|)}$, which we know removes the set of reticulation edges $F$. By induction, there exists a reticulation-trimmed subnetwork $N^*$ of $N\langle (x, y) \rangle$ with respect to $F$ and a simple CS $S^*$ such that $N^*\langle S^* \rangle = N'$. Let $T$ be a smallest CS that results in $N^*$ when applied to $N\langle (x, y) \rangle$. Then $N^*$ can be obtained from $N$ by applying the CS $(x, y) \cdot T$. We note that reduction $(x, y)$ is required in any CS that results in a reticulation-trimmed subnetwork of $N$ with respect to $F$ (since the cherry $(x, y)$ is below $u$ and $(u, v)$ needs to be removed) and it follows by the minimality of $T$ on $N\langle (x, y) \rangle$ that $(x, y) \cdot T$ is minimum on $N$. Thus $N^*$ is a reticulation-trimmed subnetwork of $N$ with respect to $F$ and the claim holds.

It is also possible that $(x, y)$ does not exist as a simple reduction, which occurs when $u$ and $v$ each already have a leaf child in $N$. In this case, $(x, y)$ is a reticulated cherry on $N$ such that $p(x) = v$ and $p(y) = u$. As in the previous case, we may assume that $(x, y)$ is the first reduction of $S'$. Let $F' = F \setminus \{(u, v)\}$. By induction there is a reticulation-trimmed subnetwork $N^*$ of $N\langle (x, y) \rangle$ with respect to $F'$ and there exists $S^*$ such that $N^*\langle S^* \rangle = N'$. We also have a CS $T$ that is minimum and contains a reticulated reduction if and only if it removes an edge in $F'$. As before, the reduction $(x, y)$ is required in any reticulation-trimmed subnetwork of $N$ with respect to $F$, and it follows by the minimality of $T$ that $(x, y) \cdot T$ yields such a network. Again, $N^*$ is a reticulation-trimmed subnetwork of $N$ with respect to $F$ and there is the simple CS $S^*$ such that $N^*\langle S^* \rangle = N'$.

□

2 Proofs from Section 4.1: Lemma 2 (page 9), Lemma 3 (page 9), Lemma 4 (page 10), Lemma 5 (page 11)

Lemma 2. Let $N$ be a network, $F \subseteq E_R(N)$ be a set, and $N'$ be a network that is a reticulation-trimmed subnetwork of $N$ with respect to $F$. Then $F$ is disjoint.

Proof. Suppose $F$ is not disjoint; then there are two cases. First, say $F$ contains edges $(u, v)$ and $(w, v)$. The reduction on the network must first reduce one of these edges, say $(u, v)$. This reduction removes the vertex $v$, and thus the edge $(w, v)$ can no longer be in the network. In this way no single CS can reduce both the reticulation edges leading into the same reticulation. Next, assume $F$ is not disjoint because it contains edges $(u, v)$ and $(u, w)$. This is not possible in a level-1 network, because reticulations $v$, $w$ would be contained in the same biconnected component.
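
One way to read Lemma 2 is that a candidate set $F$ picks, for each reticulation, either none or exactly one of its two incoming edges. The sketch below enumerates these candidates; it is an illustration under that reading, not Algorithm 2 itself, and the input encoding (`ret_parents`, mapping each reticulation to its two parents) is an assumption. Not every candidate admits a reticulation-trimmed subnetwork, so each one would still have to be checked.

```python
from itertools import product
from typing import Dict, Iterator, List, Tuple

Edge = Tuple[str, str]


def candidate_edge_sets(ret_parents: Dict[str, Tuple[str, str]]) -> Iterator[List[Edge]]:
    """Yield every F with at most one incoming edge per reticulation."""
    rets = sorted(ret_parents)
    # Three choices per reticulation: keep it, or drop one of its two in-edges.
    choices = [(None,) + tuple(ret_parents[r]) for r in rets]
    for pick in product(*choices):
        yield [(p, r) for p, r in zip(pick, rets) if p is not None]
```

With $k$ reticulations this yields $3^k$ candidates, which is consistent with the $3^{|R(N)|}$ factor in the stated running times.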

Lemma 3. Let $N$ be a network and $F \subseteq E_R(N)$ be a set such that there exists a reticulation-trimmed subnetwork of $N$ with respect to $F$. Then there exists a topological sort of $F$.

Proof. This claim is evidenced by the topological partial ordering on $R(N)$, which exists on any network. Then, note that $F$ is disjoint by Lemma 2.

Lemma 4. Let $N$ be a network and let $F \subseteq E_R(N)$. Then there do not exist two non-strongly isomorphic reticulation-trimmed subnetworks of $N$ with respect to $F$.

Proof. Let $N$ be a network and $F \subseteq E_R(N)$ be a set. We claim there are not two non-strongly isomorphic reticulation-trimmed subnetworks with respect to $F$. We proceed by induction on $|V(N)| + |E(N)|$.

In the base case, $|V(N)| + |E(N)| \leq 3$ and $N$ is a singleton network on one edge, with a root and a leaf as endpoints. In this case, $F = \emptyset$ and the reticulation-trimmed subnetwork of $N$ with respect to $F$ is $N$ itself. There are no cherries on $N$ so there is not more than one reticulation-trimmed subnetwork of $N$ with respect to $F$.

Assume the claim holds for all networks with strictly fewer vertices and edges than $N$. If there is no reticulation-trimmed subnetwork of $N$ with respect to $F$, then the claim holds. Assume such a network, $N'$, exists. If $F = \emptyset$ then $N$ is the only reticulation-trimmed subnetwork of $N$ with respect to $F$, as making any cherry reductions would not be minimum (since $N\langle \emptyset \rangle = N$ is already sufficient to remove empty $F$). The claim holds in this case, so we now assume $F \neq \emptyset$.

Let edge $e \in F$ be (arbitrarily) one of the lowest edges in the topological sort on $F$ (which exists by Lemma 3). Because $N'$ exists, there is at least one cherry below an endpoint of $e$ ($e$ may be in the cherry itself). Let one such cherry be called $(x, y)$.

Next assume $(x, y)$ is such that $(p(y), p(x)) \neq e$. We know that $(x, y)$ is not reticulated, as otherwise $e$ would not be one of the lowest reticulation edges in $F$; thus $(x, y)$ is simple. Because $e$ is in $F$, it will have to be removed, and must have leaves under its endpoints to do so, thus $(x, y)$ needs to be reduced in any CS $S$ such that $N\langle S \rangle$ is a reticulation-trimmed subnetwork of $N$ with respect to $F$. Moreover, by Theorem 1, we may assume that any such CS $S$ starts with $(x, y)$ as otherwise $(x, y)$ can be removed first without affecting the resulting network. It follows that we may assume that, for every CS $S$ such that $N\langle S \rangle$ is a reticulation-trimmed subnetwork of $N$ with respect to $F$, applying the first reduction results in $N\langle (x, y) \rangle$. Then applying $S_{(1:|S|)}$ on $N\langle (x, y) \rangle$ must yield a reticulation-trimmed subnetwork of $N\langle (x, y) \rangle$ with respect to $F$ (in particular, $|S_{(1:|S|)}|$ is minimum as otherwise $|S|$ would not be minimum). By the induction hypothesis, $N\langle (x, y) \rangle$ does not have two non-strongly isomorphic reticulation-trimmed subnetworks with respect to $F$. Since we may assume that all CSs $S$ applicable to $N$ that result in such a trimmed subnetwork go through $N\langle (x, y) \rangle$, it follows that $N$ also does not have two non-strongly isomorphic reticulation-trimmed subnetworks with respect to $F$.

Finally, assume that $(x, y)$ is reticulated and $(p(y), p(x)) = e$. As in the previous case, note that $(x, y)$ must be present in any CS $S$ such that $N\langle S \rangle$ is a reticulation-trimmed subnetwork of $N$ with respect to $F$, and we may further assume that any such CS starts with $(x, y)$. Then, any such CS first goes through $N\langle (x, y) \rangle$, and then by minimality, results in a reticulation-trimmed subnetwork of $N\langle (x, y) \rangle$ with respect to $F \setminus \{(p(y), p(x))\}$. By induction, there is only one such network, and thus also only one reticulation-trimmed subnetwork of $N$ with respect to $F$.

□

Lemma 5. Algorithm 3 on $(N, F)$ returns the reticulation-trimmed subnetwork $N'$ of $N$ with respect to $F$ if it exists, and NULL if not, and runs in time $O(|V(N)|)$.

Proof. First, there is always a topological sort on $F$ (Lemma 3) and $F$ is disjoint (Lemma 2).

If a network is not returned, then it must be that $R(u) \cup R(v) \setminus \{v\} \neq \emptyset$ on one of the (possibly) partially reduced subnetworks for some edge $(u, v) \in F$. In this case, $F$ does not admit a reticulation-trimmed subnetwork since there is no edge in $F$ that corresponds to the reduction of all reticulations below $u$ and $v$, a requirement for the reduction of reticulation $v$.

Assume a network is returned. We claim the algorithm is correct in this case. We will show this claim by constructing a CS $S$ such that $N\langle S \rangle = N'$, and by counting the exact number of reticulation cherries which we show will remove the desired $F$. Finally, we will show $S$ is minimum.

Say $F'$ has the order $\langle (u_1, v_1), (u_2, v_2), \ldots \rangle$. For each $(u_i, v_i)$, in order, we construct a CS $S_i$ on the partially reduced $N$, then let $S = S_1 \cdot S_2 \cdots$. Note that this means we construct each $S_i$ on $N\langle S_1 \cdot S_2 \cdot \ldots \cdot S_{i-1} \rangle$. There are distinct subnetworks rooted on $u_i$ and $v_i$ so there are CSs $S^u_i$, $S^v_i$ that are complete for each respective subnetwork. Note how, since we assume a network was returned, $R(u_i) \cup R(v_i) \setminus \{v_i\} = \emptyset$, so $S^u_i$ and $S^v_i$ are simple. Say they reduce the subnetworks to leaves $l^u_i$ and $l^v_i$ respectively (so that $p(l^u_i) = u$ and $p(l^v_i) = v$). Let $S_i = S^u_i \cdot S^v_i \cdot (l^v_i, l^u_i)$. The cherry $(l^v_i, l^u_i)$ will reduce the reticulation edge $(u, v)$ under these conditions.

In this way, there is a reticulation cherry in $S$ if and only if it reduces an edge in $F$. Furthermore, $S$ is minimum since we construct $S_i$ on the network $N\langle S_1 \cdot S_2 \cdot \ldots \cdot S_{i-1} \rangle$ and we have selected only the cherry reductions that are necessary and sufficient for the reduction of the targeted reticulation edges.

The claimed running time is straightforward. We can obtain a topological sort on $F$ using a standard topological sort of $N$ obtained in time $O(|V(N)| + |E(N)|) = O(|V(N)|)$ (since our networks are binary). Then the algorithm only iterates over $F$ and replaces subnetworks by multi-labeled leaves, which can be handled in time $O(|V(N)|)$.
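
Deriving an order on $F$ from a topological sort of $N$ can be sketched with the Python standard library. The encoding (`children` as an adjacency dictionary, edges of $F$ as `(tail, head)` pairs) and the choice to list the lowest reticulation edges first are assumptions of this illustration; Algorithm 3 itself is not reproduced here.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def order_reticulation_edges(children: Dict[str, List[str]], F: List[Edge]) -> List[Edge]:
    """Sort the edges of F so that lower reticulations come first.

    TopologicalSorter expects node -> predecessors. Passing node -> children
    therefore makes static_order() list every vertex after its descendants.
    """
    order = list(TopologicalSorter(children).static_order())
    rank = {v: i for i, v in enumerate(order)}
    # An edge (u, v) of F is ranked by its head v, the reticulation it enters.
    return sorted(F, key=lambda e: rank[e[1]])
```

`graphlib` is available from Python 3.9 and raises `CycleError` if the input is not acyclic.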

3 Proof of Theorem 5, page 14

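Lemma 6 and Corollary 1 are required to justify the arguments in the proof for Theorem 5.

Lemma 6. Let $N$ be a network and $S$ be any CS on $N$. If, for $r \in R(N)$, $r \in N\langle S \rangle$, then every $v \in \mathrm{reach}^-(r, N)$ is in $N\langle S \rangle$.

Proof. Assume there exist a reticulation $r$ of $N$ with $r \in N\langle S \rangle$ and a vertex $v \in \mathrm{reach}^-(r, N)$ with $v \notin N\langle S \rangle$. First, $v$ is not a leaf as a leaf cannot be above any other vertex. There must be some cherry in $S$, say $S_i = (x, y)$, such that $v = p(x)$, $v = p(y)$ or $v = p(x) = p(y)$ in $N\langle S_{(0:i)} \rangle$. Since $r \in N\langle S \rangle$, we have that $r \in N\langle S_{(0:i)} \rangle$ and since $v \in \mathrm{reach}^-(r, N)$ we have that $v \in \mathrm{reach}^-(r, N\langle S_{(0:i)} \rangle)$, thus the only orientation we can have is $v = p(y)$, $r = p(x)$ in the reticulated cherry $(x, y) \in N\langle S_{(0:i)} \rangle$. But $x$, $y$, $p(x)$, $p(y)$ are removed in $N\langle S_{(0:i]} \rangle$, contradicting that $r \in N\langle S \rangle$.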

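Note that Lemma 6 implies that after any simple reduction by a CS $S$ on a network $N$, since every $r \in R(N)$ is in $N\langle S \rangle$, we have $\bigcup_{r \in R(N)} \mathrm{reach}^-(r, N) \subseteq N\langle S \rangle$. Noting this, it follows that

Corollary 1. For all networks $N_1$, $N_2$ and every $N^* = \text{MACRS-Simple}(N_1, N_2)$ such that $N^*$ is non-null, there exists an edge-preserving bijective function

$$f : \bigcup_{r \in R(N_1)} \mathrm{reach}^-(r, N_1) \to \bigcup_{r \in R(N_2)} \mathrm{reach}^-(r, N_2).$$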


Theorem 5. The entry $M[\rho(N_1), \rho(N_2)]$ correctly contains $|L(N^*)|$ for $N^* = \text{MACRS-Simple}(N_1, N_2)$ if one exists, and $-\infty$ otherwise.

Proof. Given input networks $N_1$ and $N_2$, for any $u \in \rho(B_1)$, $v \in \rho(B_2)$, let the subnetwork of $N_1$ rooted on $u$ be called $N_u$ and let the subnetwork of $N_2$ rooted on $v$ be called $N_v$. We claim that $M[u, v]$ as we defined it always contains the number of leaves of a MACRS-Simple between $N_u$ and $N_v$. In particular, this will show our desired result with $u = \rho(N_1)$, $v = \rho(N_2)$. The proof is by induction on $|V(N_u)| + |V(N_v)|$.

As a base case, suppose that $|V(N_u)| = 1$ and $|V(N_v)| = 1$, i.e. $u$ and $v$ are both leaves. If $X(u) \cap X(v) \neq \emptyset$, we put $M[u, v] = 1$, which is correct since $N_u$ and $N_v$ are weakly isomorphic. If $X(u) \cap X(v) = \emptyset$, we put $M[u, v] = -\infty$, which is correct since there is no MACRS-Simple between networks that do not share a leaf label.

Let us now consider the inductive step. For the rest of the proof, for any $u' \in V(N_1)$, $v' \in V(N_2)$, we denote by $N^*_{u',v'}$ a MACRS-Simple between $N_{u'}$ and $N_{v'}$, if one exists (otherwise, $N^*_{u',v'}$ is undefined). As an inductive hypothesis, we assume that $M[u', v']$ is the number of leaves in $N^*_{u',v'}$, for any $u'$, $v'$ such that $|V(N_{u'})| + |V(N_{v'})| < |V(N_u)| + |V(N_v)|$.

The proof is split into cases.

Case: $u$ and $v$ are trivial, one is a leaf. If $X(u) \cap X(v) \neq \emptyset$ and $R(u) \cup R(v) = \emptyset$ then $M[u, v] = 1$. This is correct since a complete reduction can proceed on both networks without any reticulations and with at least one leaf in common, a requirement for any network to be isomorphic with a singleton network. Moreover, there cannot be more than one leaf since $u$ or $v$ is itself a leaf. For the same reason, when $X(u) \cap X(v) = \emptyset$, $M[u, v] = -\infty$ is obviously correct. When $R(u) \cup R(v) \neq \emptyset$, $M[u, v]$ is correct because a MACRS-Simple of $u$ and $v$ can only be a leaf, but this cannot be achieved since one of the networks has an unremovable reticulation.

Case: $u$ and $v$ are trivial and $R(u) \cup R(v) = \emptyset$, $X(u_1) \cap X(v_1) \neq \emptyset$ and $X(u_2) \cap X(v_2) = \emptyset$, or $X(u_1) \cap X(v_1) = \emptyset$ and $X(u_2) \cap X(v_2) \neq \emptyset$, or $X(u_1) \cap X(v_2) \neq \emptyset$ and $X(u_2) \cap X(v_1) = \emptyset$, or $X(u_1) \cap X(v_2) = \emptyset$ and $X(u_2) \cap X(v_1) \neq \emptyset$. In this case, $M[u, v] = 1$ by line 6. Neither $u$ nor $v$ has reticulations below it, thus any reduction may proceed. In fact, the reduction on $N_u$ and $N_v$ must be complete to obtain an isomorphic network since there is no shared leaf below one child of $u$ and one child of $v$, thus that child must be removed to reach any $\text{MACRS-Simple}(N_u, N_v)$, requiring a cherry on $u$ and $v$. Luckily, there is a leaf shared below one child of $u$ and one child of $v$ and so a singleton isomorphic network is possible, so this case is correct.

Case: $u$ and $v$ are trivial, $R(u) \cup R(v) \neq \emptyset$, and $X(u_1) \cap X(v_1) = \emptyset$ or $X(u_2) \cap X(v_2) = \emptyset$ or $X(u_1) \cap X(v_2) = \emptyset$ or $X(u_2) \cap X(v_1) = \emptyset$. In this case line 2 resolves to false, so we calculate $M[u, v]$ on line 6. We find that $M[u, v] = -\infty$ since $R(u) \cup R(v) \neq \emptyset$. A complete reduction is required to reach the required isomorphic singleton in this condition, but the presence of a reticulation prevents this.


Case: $u$ is trivial and $v$ is not trivial, or $v$ is trivial and $u$ is not trivial. In this case we always resolve $M[u, v] = -\infty$ by lines 8 and 9. This is indeed correct: the presence of a reticulation in $N_v$ and not in $N_u$ (or in $N_u$ and not $N_v$) makes an isomorphic network unreachable by simple reductions alone.

Case: $u$ and $v$ are trivial, neither is a leaf, and $X(u_1) \cap X(v_1) \neq \emptyset$ and $X(u_2) \cap X(v_2) \neq \emptyset$, or $X(u_1) \cap X(v_2) \neq \emptyset$ and $X(u_2) \cap X(v_1) \neq \emptyset$. In this case we calculate $M[u, v]$ on line 6. Regardless of any reticulations that may be below $u$ or $v$, we put $M[u, v]$ as the maximum between $M[u_1, v_1] + M[u_2, v_2]$ and $M[u_1, v_2] + M[u_2, v_1]$. We can assume, by the inductive hypothesis, that $M[u_1, v_1] = |L(N^*_{u_1,v_1})|$ and $M[u_2, v_2] = |L(N^*_{u_2,v_2})|$ (likewise $M[u_1, v_2] = |L(N^*_{u_1,v_2})|$ and $M[u_2, v_1] = |L(N^*_{u_2,v_1})|$). It is not difficult to see that if $N^*_{u,v}$, a MACRS-Simple of $N_u$ and $N_v$, exists, then it can be obtained by joining a MACRS-Simple of $N_{u_1}$, $N_{v_1}$ with a MACRS-Simple of $N_{u_2}$, $N_{v_2}$ under a common parent, or by joining a MACRS-Simple of $N_{u_1}$, $N_{v_2}$ with a MACRS-Simple of $N_{u_2}$, $N_{v_1}$ under a common parent. In the current case, $M_1 = M[u_1, v_1] + M[u_2, v_2]$ and $M_2 = M[u_1, v_2] + M[u_2, v_1]$ correspond to constructing these two possible networks, and since they contain the correct values by induction, $M[u, v] = \max(M_1, M_2)$ is correct.

Case: $u$ and $v$ are both non-trivial and $M[r_1, r_2] \neq -\infty$ for reticulations $r_1$, $r_2$ in $u$'s, $v$'s biconnected components respectively, and $M[h(p^1_i), h(p^2_i)] \neq -\infty$ for all $i$ in any $p^1 \in \pi^1_l, \pi^1_r$ or $p^2 \in \pi^2_l, \pi^2_r$. In this case we resolve

$$M[u, v] = M[r_1, r_2] + \sum_{i=1}^{|\pi^1_l|} M[h(p^1_{l,i}), h(p^2_{l,i})] + \sum_{i=1}^{|\pi^1_r|} M[h(p^1_{r,i}), h(p^2_{r,i})]$$

or

$$M[u, v] = M[r_1, r_2] + \sum_{i=1}^{|\pi^1_l|} M[h(p^1_{l,i}), h(p^2_{r,i})] + \sum_{i=1}^{|\pi^1_r|} M[h(p^1_{r,i}), h(p^2_{l,i})].$$

Each table reference in this summation returns a value that is correct for that subnetwork, by the induction hypothesis, as every subnetwork $\tilde{N}$ of $N_u$ ($\neq N_u$) and $N_v$ ($\neq N_v$) is smaller. The summation itself is also correct. This is evident by noting that the biconnected components on $u$ and $v$ only contain vertices along $\pi_l$ and $\pi_r$ [12], so all vertices in the components are accounted for. Furthermore, all bridges must lead to disjoint networks. Finally, by Corollary 1 the only possible networks are constructed by joining up child networks that pair vertices in the order evident from the forbidden paths and their independent subnetwork children/siblings. Since we consider the maximum among all such

possible networks, the solution is maximal. For the same reason, when $u$ and $v$ are non-trivial but the conditions are such that $M[u, v] = -\infty$ by line 19, or by an operand being $-\infty$ in line 14 or line 17, $M[u, v] = -\infty$ is correctly found.
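
The non-trivial case (lines 10 to 19 of Algorithm 4) reduces to two guarded sums over table entries. The sketch below assumes the table is a dictionary keyed by vertex pairs and that each complement path is given as the ordered list of the roots $h(p)$ hanging off it; these encodings and the name `combine_nontrivial` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

NEG_INF = float("-inf")


def combine_nontrivial(
    M: Dict[Tuple[str, str], float],
    r1: str, r2: str,
    left1: List[str], right1: List[str],
    left2: List[str], right2: List[str],
) -> float:
    """Return max(M1, M2) for two non-trivial components.

    left*/right* list h(p) for each vertex p along the left/right complement
    path of the corresponding component, in path order.
    """
    def paired(side_a: List[str], side_b: List[str]) -> float:
        # The i-th hanging subnetwork on one path is matched with the i-th
        # on the other; a single -inf entry makes the whole sum -inf.
        return sum(M[(a, b)] for a, b in zip(side_a, side_b))

    m1 = m2 = NEG_INF
    if len(left1) == len(left2) and len(right1) == len(right2):
        m1 = M[(r1, r2)] + paired(left1, left2) + paired(right1, right2)
    if len(left1) == len(right2) and len(right1) == len(left2):
        m2 = M[(r1, r2)] + paired(left1, right2) + paired(right1, left2)
    return max(m1, m2)
```

Using `float("-inf")` lets an infeasible sub-entry propagate through the addition without special-casing.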