ALEA, Lat. Am. J. Probab. Math. Stat. 18, 325–347 (2021)

DOI: 10.30757/ALEA.v18-15

The partial duplication random graph with edge deletion

Felix Hermann and Peter Pfaffelhuber

Technische Universität Berlin,

Straße des 17. Juni 136,

10623 Berlin, Germany.

E-mail address: felix.hermann@tu-berlin.de

URL: http://page.math.tu-berlin.de/~hermann/

Albert-Ludwigs-Universität Freiburg,

Ernst-Zermelo-Straße 1,

79104 Freiburg, Germany.

E-mail address: peter.pfaffelhuber@stochastik.uni-freiburg.de

URL: https://www.stochastik.uni-freiburg.de/professoren/pfaffelhuber

Abstract. We study a random graph model in continuous time. Each vertex is

partially copied with the same rate, i.e. an existing vertex is copied and every edge

leading to the copied vertex is copied with independent probability p. In addition,

every edge is deleted at constant rate, a mechanism which extends previous partial

duplication models. In this model, we obtain results on the degree distribution,

which shows a phase transition such that either – if p is small enough – the frequency

of isolated vertices converges to 1, or there is a positive fraction of vertices with

unbounded degree. We derive results on the degrees of the initial vertices as well

as on the sub-graph of non-isolated vertices. In particular, we obtain expressions

for the number of star-like subgraphs and cliques.

1. Introduction

Various random graph models have been studied in the last decades. Frequently,

such models try to mimic the behavior of social networks (see e.g. Cooper and Frieze,

2003 and Barabási et al., 2002) or interactions within biological networks (see e.g.

Wagner, 2001, Albert, 2005 and Jeong et al., 2000). For a general introduction to

random graphs see the monographs Durrett (2007), van der Hofstad (2017) and

references therein.

Received by the editors July 8th, 2019; accepted December 10th, 2020.

2010 Mathematics Subject Classification. 05C80 (Primary), 60K35 (Secondary).

Key words and phrases. Random graph, Degree distribution, Cliques.

In this paper, we study and extend a duplication random graph model introduced and discussed in Bhan et al. (2002), Chung et al. (2003), Pastor-Satorras et al. (2003), Chung et al. (2003), Bebek et al. (2006), Bebek et al. (2006), Hermann and Pfaffelhuber (2016), Jordan (2018) and, more recently, in Jacquet et al. (2020) and Frieze et al. (2020). In most applications, a vertex models a protein and an edge denotes some form of interaction; see e.g. Pastor-Satorras et al. (2003).

Within the genome, the DNA encoding for a protein can be duplicated (which in fact is a long evolutionary process), such that the interactions of the copied protein are partially inherited by the copy; see Ohno (1970) for further biological explanations. Within the random graph model, a vertex is $p$-copied, i.e. a new vertex is introduced and every edge of the parent vertex is independently copied with the same probability $p$. The idea behind this is to model protein-protein interactions, assuming that the ability to interact can be inherited from a parent protein with a fixed probability $p$.

In Pastor-Satorras et al. (2003), an extension of this model was suggested (but not studied further) where edges can be randomly removed from the random graph. Aiming for a closer look at this model, we extend the duplication random graph from above by introducing a rate $\delta$ at which each edge in the random graph is deleted. In biological terms, this corresponds to loss of interactive abilities due to mutation or deterioration; see e.g. Figure 3 in Wagner, 2001.

While the previous literature contains no rigorous limit results for the model without edge deletion, which we will call the pure partial duplication model, Hermann and Pfaffelhuber (2016) have determined a critical parameter $p^* \approx 0.567143$, the unique solution of $p e^p = 1$, below which approximately all vertices are isolated. Moreover, almost sure asymptotics and limit results for the number of $k$-cliques and $k$-stars in the random graph as well as for the degree of a fixed vertex were obtained. Recently, Jordan (2018) has shown that for $p < e^{-1}$ the degree distribution of the connected component, i.e. of the subgraph of non-isolated vertices, has a limit with tail behavior close to a power law with exponent $\beta > 2$ solving $\beta - 3 + p^{\beta-2} = 0$ (cf. Jordan, 2018, Theorem 1(c)). This finding has been complemented by Jacquet et al. (2020) with some finer asymptotics. Extending the model by adding additional edges at random, Frieze et al. (2020) obtain results on the degree distribution and the degree of a fixed vertex. We also mention that Bienvenu et al. (2019) introduce a similar model for speciation. However, in their model, each birth of a new vertex is linked to removing another vertex, making the number of vertices in the network a constant.

In this paper, for the model with edge deletion, we derive results on the degree distribution $F_1, F_2, \dots$ of the full graph (see Theorem 2.4), the sub-graph of non-isolated vertices (see Proposition 2.6) as well as the number of star-like graphs, cliques, and degrees of initial vertices (see Theorem 2.9). The methods used to derive these results include branching processes with disasters $Z$ (see Section 3.1) and piecewise deterministic jump processes $X$ (see Section 3.2). For the former, note that the degree distribution is closely related via $P(Z_t = k) = E[F_k(t)]$ for all $k \ge 0$ (see (3.2)), i.e. the expected degree distribution of the graph process equals the distribution of $Z_t$. Such a connection to branching processes is as in Jordan (2018), but now $Z$ has additional deaths at rate $\delta$, the rate of edge deletion. (Note that links between random graphs and branching processes frequently appear in the literature, e.g. van der Hofstad, 2017, Section 4.2 and Bollobás and Riordan, 2009.) For the latter, such branching processes, and therefore the degree distribution, can be studied by using piecewise-deterministic Markov jump processes, a tool which we introduce in Lemma 2.3; see also Section 3.1 for the connection to branching processes. The new connection of $X$ and $Z$ is via a duality relation (see (2.2)), which was already used in Hermann and Pfaffelhuber (2016), and proved extendable in various directions. Generalizing the limit results from Hermann and Pfaffelhuber, 2016, Lemma 3.3 for $X$ to a broader class of piecewise-deterministic Markov processes, we obtained general limit results for branching processes with disasters in several settings in Hermann and Pfaffelhuber (2020). In the present paper, we transfer these results to the degree distribution, generalizing the results of Theorem 2.7 in Hermann and Pfaffelhuber (2016) to the model with edge deletion; see Section 3. In particular, we derive a phase transition such that if $p$ is small, the fraction of vertices with positive degree vanishes. In this case we find that the sub-graph of non-isolated vertices is exponentially small, with two possible rates depending on $p$ and $\delta$; see Theorem 2.4. For larger $p$, a positive fraction of vertices is non-isolated, and their degree is unbounded. In Section 4, we prove Theorem 2.9, which states limit results for binomial moments of the degree distribution, cliques and the degree of a fixed node, mainly by applying martingale theory, generalizing Theorems 2.9 and 2.14 in Hermann and Pfaffelhuber (2016).

2. Model and main results

After introducing the model (and its connection to a piecewise deterministic Markov process) in Section 2.1, we give our first main result, Theorem 2.4, on the number of non-isolated vertices, in Section 2.2. In Section 2.3, we discuss the (generating function of the) degree distribution of the sub-graph of non-isolated vertices. Theorem 2.9 on certain graph functionals is contained in Subsection 2.4. Finally, we put our results in perspective to previous results in Subsection 2.5.

2.1. The model and a piecewise deterministic Markov process.

Definition 2.1 (Partial duplication graph process with edge deletion). Let $p \in [0,1]$, $\delta \ge 0$ and $G_0 = (V_0, E_0)$ be a deterministic undirected graph without loops with vertex set $V_0 = \{v_1, \dots, v_{|V_0|}\}$ and non-empty edge set $E_0$. Let $PD(p,\delta) = (G_t)_{t \ge 0}$ be the continuous-time graph-valued Markov process starting in $G_0$ and evolving in the following way:

– Every node $v \in V_t$ is $p$-partially duplicated (or $p$-copied for short) at rate $\kappa_t := (|V_t|+1)/|V_t|$, i.e. a new node $v_{|V_t|+1}$ is added and for each $w \in V_t$ with $(v,w) \in E_t$, $v_{|V_t|+1}$ is connected to $w$ independently with probability $p$.

– Every edge in $E_t$ is removed at rate $\delta$.

Then, $PD(p,\delta)$ is a partial duplication graph process with edge deletion with initial graph $G_0$, edge-retaining probability $p$ and deletion rate $\delta$. Within $PD(p,\delta)$, we define the following quantities:

(1) Let $D_i(t) := \deg_{G_t}(v_i) \cdot 1_{\{i \le |V_t|\}}$ be the degree of $v_i$, i.e. the number of its neighbors, at time $t$.

(2) Let $F(t) := (F_k(t))_{k=0,1,2,\dots}$ with $F_k(t) := |\{1 \le i \le |V_t| : D_i(t) = k\}| / |V_t|$ for $k = 0,1,2,\dots$ be the degree distribution at time $t$. Furthermore, let $F_+(t) := 1 - F_0(t)$ be the proportion of vertices of positive degree.

(3) For $k = 1,2,\dots$ let $B_k(t) := \sum_{\ell \ge k} \binom{\ell}{k} F_\ell(t)$ be the $k$th binomial moment of the degree distribution.

(4) For $k = 1,2,\dots$ let $C_k(t)$ be the number of $k$-cliques at time $t$, i.e. the number of complete sub-graphs of size $k$.
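The dynamics of Definition 2.1 can be sketched as an event-driven simulation. The following is only an illustration under our own assumptions: the function name `simulate_pd`, the adjacency-dict representation and the integer vertex labels are hypothetical and not part of the paper. It uses that the total duplication rate is $|V_t| \kappa_t = |V_t| + 1$ and the total deletion rate is $\delta |E_t|$.

```python
import random


def simulate_pd(adj, p, delta, t_max, seed=0):
    """Event-driven simulation of PD(p, delta) up to time t_max.

    adj maps each integer vertex label to the set of its neighbours
    (undirected, no loops). A modified copy is returned.
    """
    rng = random.Random(seed)
    adj = {v: set(nb) for v, nb in adj.items()}
    t = 0.0
    while True:
        vertices = list(adj)
        edges = [(v, w) for v in vertices for w in adj[v] if v < w]
        dup_rate = len(vertices) + 1      # |V_t| * kappa_t = |V_t| + 1
        del_rate = delta * len(edges)     # every edge is removed at rate delta
        t += rng.expovariate(dup_rate + del_rate)
        if t > t_max:
            return adj
        if rng.random() * (dup_rate + del_rate) < dup_rate:
            # p-copy a uniformly chosen vertex (all vertices share the same rate)
            parent = rng.choice(vertices)
            new = max(vertices) + 1
            kept = {w for w in adj[parent] if rng.random() < p}
            adj[new] = kept
            for w in kept:
                adj[w].add(new)
        else:
            # delete a uniformly chosen edge
            v, w = rng.choice(edges)
            adj[v].discard(w)
            adj[w].discard(v)
```

For instance, `simulate_pd({1: {2, 3}, 2: {1, 3}, 3: {1, 2}}, p=0.6, delta=0.3, t_max=3.0)` starts from a triangle. The edge list is rebuilt in every step, which keeps the sketch short but is only suitable for small graphs.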

Remark 2.2. (1) It is often desirable to have a discrete-time graph-valued process where at each time $n$ the graph is of size $n$. For $PD(p,\delta)$, such a discretization is also possible, but quite elaborate, since the number of edge deletions between two node additions follows a generalized negative hypergeometric distribution heavily depending on the number of edges and thus depending on the number of edges added with the latest node. In what follows, we will only discuss the continuous-time version.

(2) We choose the duplication rate $\kappa_t := (|V_t|+1)/|V_t|$ in order to get a closed recurrence relation for the degree distribution; see Lemma 3.3. Alternatively, one could choose $\tilde\kappa_t := 1$, i.e. all vertices are copied at unit rate. Since $|V_t| \sim V_\infty e^t$ (see Lemma 4.4) and hence $\int_0^t (\kappa_s - \tilde\kappa_s)\,ds = \int_0^t \frac{1}{|V_s|}\,ds$ converges to a finite random variable, we conjecture that the random graph with rate $\tilde\kappa_t$ behaves qualitatively the same as with our choice of $\kappa_t$ for $t \to \infty$ (i.e. it exhibits the same phase transitions as in Theorems 2.4 and 2.9). However, some limits depend on the initial graph (cf. $F_0$ in Theorem 2.4(c)) and thus, since the choice of $\kappa_t$ influences the distribution of $G_t$ early on, quantitative differences are to be expected.

(3) For $k \in \mathbb{N}$ a $k$-star is a graph of $k+1$ nodes and $k$ edges, where one particular node, the center, is connected to each of the $k$ other nodes. Since each node of degree $\ell$ is the center of $\binom{\ell}{k}$ distinct $k$-stars, these deliver an alternative interpretation for the binomial moments: $|V_t| \cdot B_k(t)$ is equal to the total number of distinct $k$-stars contained as subgraphs in $G_t$. Hence, the $B_k(t)$ as well as the $C_k(t)$ can give an understanding of the topology of $G_t$. In fact, several functionals of interest can be expressed via $B_k$ and $C_k$, as the average degree in $G_t$ equals $B_1(t) = 2C_2(t)/|V_t|$ while the transitivity ratio is given by $\frac{C_3(t)}{|V_t| B_2(t)}$.

Note that Hermann and Pfaffelhuber (2016) used the notation $S_k(t)$ for the factorial moments of the degree distribution giving $S_k(t)/k! = B_k(t)$.
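To make the quantities $F_k$, $B_k$ and $C_k$ concrete, the following sketch computes them for a small static graph. The helper names (`degree_distribution`, `binomial_moment`, `count_cliques`) and the adjacency-dict input are our own illustrative choices; the clique count is a brute-force enumeration meant for toy examples only.

```python
from itertools import combinations
from math import comb


def degree_distribution(adj):
    """Return the list F with F[k] = fraction of vertices of degree k."""
    n = len(adj)
    max_deg = max(len(nb) for nb in adj.values())
    F = [0.0] * (max_deg + 1)
    for nb in adj.values():
        F[len(nb)] += 1.0 / n
    return F


def binomial_moment(F, k):
    """k-th binomial moment B_k = sum over l >= k of binom(l, k) * F[l]."""
    return sum(comb(l, k) * F[l] for l in range(k, len(F)))


def count_cliques(adj, k):
    """Number of complete sub-graphs on k vertices (brute force)."""
    return sum(
        1
        for subset in combinations(sorted(adj), k)
        if all(w in adj[v] for v, w in combinations(subset, 2))
    )


# Triangle {1, 2, 3} with a pendant vertex 4 attached to vertex 1.
example = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2}, 4: {1}}
F = degree_distribution(example)
stars_2 = len(example) * binomial_moment(F, 2)   # number of 2-stars
edges = count_cliques(example, 2)                # number of edges
```

In this example the average degree $B_1$ equals twice the number of edges divided by the number of vertices, and the 2-stars are the three centred at vertex 1 plus one each at vertices 2 and 3.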

In order to formulate our results, we need an auxiliary process, which is connected to $PD(p,\delta)$. It will appear below in Theorem 2.4(a) and in Proposition 2.6. The proof of the following lemma is found in Section 3.2.

Lemma 2.3 (Connection of $PD(p,\delta)$ and a piecewise-deterministic Markov process). Let $X = (X_t)_{t \ge 0}$ be a Markov process on $[0,1]$ jumping at rate 1 from $X_t = x$ to $px$, in between jumps evolving according to $\dot X_t = pX_t(1 - X_t) - \delta X_t$. Furthermore, let

$$H_x(t) := \sum_{k=0}^{\infty} (1-x)^k F_k(t), \tag{2.1}$$

i.e. the probability generating function at $1-x$ of the degree distribution at time $t$. Then, for all $t \ge 0$ and $x \in [0,1]$, writing $E_x[\,\cdot\,] := E[\,\cdot \mid X_0 = x]$,

$$E[H_x(t)] = E_x[H_{X_t}(0)] = \sum_{k=0}^{\infty} F_k(0) \cdot E_x[(1 - X_t)^k]. \tag{2.2}$$
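The process $X$ of Lemma 2.3 can be sampled exactly, because the flow between jumps is a logistic equation with a closed-form solution. The sketch below is an illustration only; the function names `flow` and `simulate_x` are hypothetical, and the closed form is our own elementary computation (substituting $y = 1/x$), not a formula taken from the paper.

```python
import math
import random


def flow(x0, s, p, delta):
    """Value at time s of x' = p*x*(1 - x) - delta*x started in x0."""
    a = p - delta
    if x0 == 0.0:
        return 0.0
    if abs(a) < 1e-12:
        # degenerate case p == delta: x' = -p * x**2
        return x0 / (1.0 + p * x0 * s)
    e = math.exp(a * s)
    return a * x0 * e / (a + p * x0 * (e - 1.0))


def simulate_x(x0, t_max, p, delta, rng):
    """One sample of X at time t_max: follow the flow and multiply by p
    at the jump times of a rate-1 Poisson process."""
    t, x = 0.0, x0
    while True:
        wait = rng.expovariate(1.0)
        if t + wait >= t_max:
            return flow(x, t_max - t, p, delta)
        x = p * flow(x, wait, p, delta)
        t += wait
```

Averaging $\sum_k F_k(0)(1 - X_t)^k$ over many samples of `simulate_x` gives a Monte Carlo estimate of the right hand side of (2.2), which could be compared with simulated degree distributions.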


2.2. Limits on the number of non-isolated vertices. Recall from the model with $\delta = 0$ that there are at least three regimes: 1) If $p < 1/e$ (or $0 > p - p\log\frac{1}{p}$), Theorem 2.7 of Hermann and Pfaffelhuber (2016) shows that the frequency of isolated vertices converges to 1, and Theorem 2.1 of Jordan (2018) shows that the connected component converges to a graph with a power law distribution. 2) If $1/e < p < p^*$, where $p^* \approx 0.567143$ is the unique solution of $p e^p = 1$ (i.e. $p - p\log\frac{1}{p} > 0 > p - \log\frac{1}{p}$), the techniques of Jordan (2018) break down (see his Proposition 3.7), but still Theorem 2.7 of Hermann and Pfaffelhuber (2016) shows that the frequency of non-isolated vertices becomes negligible. 3) If $p > p^*$ (or $p - \log\frac{1}{p} > 0$), the expected number of non-isolated vertices converges to a non-trivial fraction of the whole graph. Our first result, Theorem 2.4 below, extends this result to the case $\delta \ge 0$. The three cases (a), (b) and (c) of the following theorem have 1), 2) and 3) as special cases for $\delta = 0$. In Figure 2.1, we give an illustration of all cases.

For the formulation, we need some notation: For $X$ as in Lemma 2.3, we define

$$c = \exp\Big(-p \int_0^\infty \frac{E_1[X_s^2]}{E_1[X_s]}\,ds\Big). \tag{2.3}$$

Set $a_t \sim b_t$ if $a_t/b_t \xrightarrow{t\to\infty} 1$, as well as $\prod_\emptyset = 1$, and recall that $B_k(0)$ is the $k$th binomial moment of the degree distribution of the initial graph, and in particular, $B_1(0)|V_0|$ is the sum of the initial degrees, i.e. twice the initial number of edges.

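Theorem 2.4 (Limit of the degree distribution). Let $p \in (0,1)$ and $\delta \ge 0$.

(a) If $\delta \ge p - p\log\frac{1}{p}$, then $F(t) \xrightarrow{t\to\infty} (1,0,0,\dots)$ almost surely with

$$E[F_+(t)] \sim c\,e^{-t(1+\delta-2p)},$$

where $c \in (0, B_1(0))$ is given in (2.3).

(b) If $p \ge e^{-1}$ and $p - p\log\frac{1}{p} \ge \delta \ge p - \log\frac{1}{p}$, then $F(t) \xrightarrow{t\to\infty} (1,0,0,\dots)$ almost surely with

$$-\frac{1}{t}\log E[F_+(t)] \xrightarrow{t\to\infty} 1 - \frac{1}{\gamma}(1 + \log\gamma),$$

where $\gamma = \log\frac{1}{p} / (p-\delta)$.

(c) If $p > \log\frac{1}{p}$ and $\delta < p - \log\frac{1}{p}$, then $F(t) \xrightarrow{t\to\infty} (F_0, 0, 0, \dots)$ almost surely, where $F_0$ is non-deterministic and

$$E[F_0] = 1 - \Big(1 - \frac{\delta}{p} - \frac{1}{p}\log\frac{1}{p}\Big) \sum_{k=1}^{\infty} B_k(0)(-1)^{k-1} \prod_{\ell=1}^{k-1}\Big(1 - \frac{\delta}{p} - \frac{1-p^\ell}{p\ell}\Big).$$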
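The case distinction of Theorem 2.4 is easy to evaluate numerically. The sketch below is an illustration with hypothetical helper names (`critical_p`, `regime`, `expected_isolated_limit`); it assigns the boundary $\delta = p - p\log\frac1p$ to case (a), where the rates of (a) and (b) coincide, and it assumes a finite initial graph so that only finitely many $B_k(0)$ are non-zero.

```python
import math


def critical_p(tol=1e-12):
    """Bisection for the unique root p* of p * exp(p) = 1 in (0, 1)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid * math.exp(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def regime(p, delta):
    """Return (case, rate) with case in {'a', 'b', 'c'}; rate is the
    exponential decay rate of E[F_+(t)] and None in case (c)."""
    L = math.log(1.0 / p)
    if delta >= p - p * L:
        return "a", 1.0 + delta - 2.0 * p
    if delta >= p - L:
        gamma = L / (p - delta)
        return "b", 1.0 - (1.0 + math.log(gamma)) / gamma
    return "c", None


def expected_isolated_limit(p, delta, B0):
    """E[F_0] from Theorem 2.4(c); B0[k - 1] holds B_k(0) for k = 1..K,
    with B_k(0) = 0 beyond the maximal initial degree K."""
    L = math.log(1.0 / p)
    prefactor = 1.0 - delta / p - L / p
    total, prod = 0.0, 1.0          # prod = product over l = 1..k-1
    for k, Bk in enumerate(B0, start=1):
        total += Bk * (-1) ** (k - 1) * prod
        prod *= 1.0 - delta / p - (1.0 - p ** k) / (p * k)
    return 1.0 - prefactor * total
```

For example, `regime(0.5, 0.0)` falls into case (b), and for a single initial edge (`B0 = [1.0]`) the formula in (c) reduces to $(\delta + \log\frac1p)/p$.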

Remark 2.5 (Interpretations). (1) Clearly, the quantity $F_0$ is increasing in $\delta$ and decreasing in $p$ (and $F_+$ is decreasing in $\delta$ and increasing in $p$). See also the illustrations of Theorem 2.4 in Figure 2.1.

(2) For Theorem 2.4(c), we will see in the proof that the right hand side is the hitting probability of a stochastic process, and in particular is in $(0,1)$. This interpretation shows that $0 < E[F_0] < 1$ as long as the initial graph is not trivial (i.e. $F_0(0) < 1$).

(3) The asymptotics given in case $\delta \ge p - p\log\frac{1}{p}$ is more exact than the one given for $p - p\log\frac{1}{p} \ge \delta \ge p - \log\frac{1}{p}$ (in the sense that $-\frac{1}{t}\log E[F_+(t)] \sim 1 + \delta - 2p$ is a consequence of part (a) of the above theorem). The reason is that we can give a formula for $c$ in this case, which does not carry over to (b); see the proof of Proposition 2.6.

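Figure 2.1. Illustration of Theorem 2.4. In (A), the three cases are shown in the $p$-$\delta$-plane (with $1/e$, $p^*$ and $1$ marked on the $p$-axis); the regions are labelled $E[F_0(\infty)] < 1$, $E[F_+(\infty)] \sim e^{-t(1+\delta-2p)}$ and $E[F_+(\infty)] \sim e^{-t(1-\frac{1}{\gamma}(1+\log\gamma))}$. In (B), we draw the different exponential rates of decrease in $E[F_+]$, i.e. $-\frac{1}{t}\log E[F_+(t)]$ as a function of $p$, for $\delta = 0$, $\delta = 0.5$ and $\delta = 1$, together with a curve labelled $1 - p\log\frac{1}{p} - p$.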

2.3. On the degree distribution of non-isolated vertices. In the case $\delta \ge p - \log\frac{1}{p}$, the frequency of isolated vertices converges to 1. Hence, it is interesting to study the (degree distribution of the) sub-graph of non-isolated vertices. In order to do so, note that $F_+(t) = 1 - H_1(t)$, with $H$ from (2.1). Also note that, if at some time $t$ a duplication event is triggered, $1 - H_p(t)$ denotes the probability that the new node is not isolated. Here, we aim for results on the asymptotics of the generating function of the degree distribution of the sub-graph of non-isolated vertices,

$$h^+_x(t) := \frac{\sum_{k=1}^{\infty} (1-x)^k E[F_k(t)]}{E[F_+(t)]} = \frac{E[H_x(t) - H_1(t)]}{E[1 - H_1(t)]} = 1 - \frac{E[1 - H_x(t)]}{E[1 - H_1(t)]}.$$

Using (2.2) and Bernoulli's formula we compute

$$\begin{aligned} E[1 - H_x(t)] &= 1 - \sum_{k=0}^{\infty} F_k(0) E_x[(1-X_t)^k] = 1 - \sum_{\ell=0}^{\infty} E_x[X_t^\ell](-1)^\ell \sum_{k\ge\ell} F_k(0)\binom{k}{\ell} \\ &= 1 - \sum_{\ell=0}^{\infty} (-1)^\ell E_x[X_t^\ell]\cdot B_\ell(0) = \sum_{\ell=1}^{\infty} B_\ell(0)(-1)^{\ell+1}E_x[X_t^\ell]. \end{aligned} \tag{2.4}$$

As it turns out, we can control the right hand side as long as $\delta > p - p\log\frac{1}{p}$ (see Lemma 3.4). This immediately implies the following result:

Proposition 2.6 (Limit of degree distribution on the set of non-isolated vertices). Let $X$ be the process given in Lemma 2.3 and $\delta > p - p\log\frac{1}{p}$. Then, as $t \to \infty$,

$$E[1 - H_x(t)] \sim E_x[X_t]\cdot B_1(0), \quad\text{and therefore}\quad 1 - h^+_x(t) \sim \frac{E_x[X_t]}{E_1[X_t]}.$$

For the other case, $p - \log\frac{1}{p} < \delta \le p - p\log\frac{1}{p}$, the right hand side of (2.4) is not dominated by the $E_x[X_t]$-term. Here, we can rather show that (see Section 3.4)

$$\frac{1}{t}\int_0^t \frac{E_x[X_s^{k+1}]}{E_x[X_s^k]}\,ds \xrightarrow{t\to\infty} \frac{(p-\delta)k + p^k - (1+\log\gamma)/\gamma}{pk} = \frac{\log\frac{1}{\gamma p^k} + \gamma p^k - 1}{k\gamma p} =: c_k(p,\delta) \tag{2.5}$$

where $\gamma = \log\frac{1}{p}/(p-\delta)$. However, to obtain a limit result in analogy to Proposition 2.6, we need this convergence not only to hold in the Cesàro sense, but in the regular sense, i.e.

Conjecture 2.7. For $p \ge e^{-1}$, $p - \log\frac{1}{p} < \delta \le p - p\log\frac{1}{p}$ and $k \ge 1$ it holds

$$\frac{E_x[X_t^{k+1}]}{E_x[X_t^k]} \xrightarrow{t\to\infty} c_k(p,\delta). \tag{C}$$

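With this, inserting $E_x[X_t^\ell] = E_x[X_t]\prod_{k=1}^{\ell-1}\frac{E_x[X_t^{k+1}]}{E_x[X_t^k]}$, (2.4) immediately provides (a) and (b) of

Proposition 2.8. Assume that Conjecture 2.7 holds.

(a) If $p \ge e^{-1}$, $\delta = p - p\log\frac{1}{p}$, then $c_1(p,\delta) = 0$ and $E[1 - H_x(t)] \sim E_x[X_t]\cdot B_1(0)$.

(b) If $p > e^{-1}$, $p - \log\frac{1}{p} < \delta < p - p\log\frac{1}{p}$, then

$$E[1 - H_x(t)] \sim E_x[X_t]\cdot\sum_{\ell=1}^{\infty} B_\ell(0)(-1)^{\ell+1}c_1(p,\delta)\cdots c_{\ell-1}(p,\delta).$$

In both cases, as $t\to\infty$,

$$1 - h^+_x(t) \sim \frac{E_x[X_t]}{E_1[X_t]} \sim x\exp\Big(p\int_0^\infty \Big(\frac{E_1[X_s^2]}{E_1[X_s]} - \frac{E_x[X_s^2]}{E_x[X_s]}\Big)\,ds\Big). \tag{2.6}$$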

The last approximate equality is also shown in Section 3.4. Since the right hand side of (2.6) is non-trivial, this proposition implies, given that Conjecture 2.7 holds true, that the degree distribution of the sub-graph of non-isolated vertices converges to some non-trivial distribution. However, for a proof of Conjecture 2.7 or a closer analysis of the limits, more insight into the process $X$ is necessary.

2.4. Limits of some graph-functionals. We now investigate the limiting behavior of certain functionals of the graph.

Theorem 2.9 (Binomial moments, cliques and degrees). As $t\to\infty$, the following statements hold almost surely:

(a) For $k = 1,2,\dots$, $e^{t\beta_k}B_k(t) \to B_k(\infty)$, where $B_k(\infty) \in L^1$ and

$$\beta_k = \begin{cases} 1 + \delta - 2p, & \text{if } \delta \ge p - \frac{p(1-p^{k-1})}{k-1}, \\ 1 + \delta k - pk - p^k, & \text{otherwise.} \end{cases}$$

(b) For $k = 2,3,\dots$, $\exp\big(-t\big(kp^{k-1} - \delta\binom{k}{2}\big)\big)C_k(t) \to C_k(\infty)$, where $C_k(\infty)\in L^1$.

(i) If $C_k(0) > 0$ and $\delta < 2p^{k-1}/(k-1)$, the convergence also holds in $L^1$ and $P(C_k(\infty) > 0) > 0$.

(ii) Otherwise, if $\delta \ge 2p^{k-1}/(k-1)$, $C_k(t) = 0$ for all $t \ge T_{C_k}$ for some finite random variable $T_{C_k}$ and $P(C_k(\infty) = 0) = 1$.

(c) For $i \le |V_0|$, $e^{-t(p-\delta)}D_i(t) \to D_i(\infty)$, where $D_i(\infty)\in L^1$. Moreover,

(i) if $D_i(0) > 0$ and $\delta < p$, the convergence also holds in $L^r$ for all $r\ge1$ and

$$E[D_i(\infty)] = D_i(0)\Big(1 + \frac{p}{|V_0|}\Big). \tag{2.7}$$

(ii) if $\delta \ge p$, $D_i(t) = 0$ for all $t \ge T_i$ for some finite random variable $T_i$ and $P(D_i(\infty) = 0) = 1$.
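The exponents appearing in Theorem 2.9 can be tabulated directly. The following sketch uses hypothetical function names (`beta`, `clique_rate`) and simply transcribes the two formulas; it is meant as a convenience for experimenting with parameter values, not as part of the proof.

```python
from math import comb


def beta(k, p, delta):
    """Decay rate beta_k of the k-th binomial moment B_k(t), Theorem 2.9(a)."""
    if k == 1 or delta >= p - p * (1.0 - p ** (k - 1)) / (k - 1):
        return 1.0 + delta - 2.0 * p
    return 1.0 + delta * k - p * k - p ** k


def clique_rate(k, p, delta):
    """Exponent k*p^(k-1) - delta*binom(k, 2) from Theorem 2.9(b)."""
    return k * p ** (k - 1) - delta * comb(k, 2)


# Rates for k = 1..5 at p = 0.6, delta = 0.1 (arbitrary illustrative values).
rates = [beta(k, 0.6, 0.1) for k in range(1, 6)]
```

Since $|V_t|$ grows like $e^t$, the number of edges $|V_t| B_1(t)/2$ grows at rate $1 - \beta_1 = 2p - \delta$, which agrees with `clique_rate(2, p, delta)`.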

Remark 2.10 (Interpretations).

(1) For Theorem 2.9(a), we have $\beta_1 = 1+\delta-2p$ and for $\delta=0$, we have $\beta_k = 1 - pk - p^k$, $k=1,2,\dots$ In all cases, we can also write $\beta_k = (1+\delta-2p)\wedge(1+\delta k - pk - p^k)$, which immediately shows that $\beta_k$ is continuous in $p$ and $\delta$. In addition, for $k=2,3,\dots$, we find $p \ge \frac{p(1-p^{k-1})}{k-1}$, i.e. we can choose $\delta\ge0$ such that either of the two cases can in fact occur. Moreover, $\beta_k \le \beta_{k-1}$, which can be seen as follows: First, note that

$$\frac{(1-p^{k-2})/(k-2)}{(1-p^{k-1})/(k-1)} = \frac{(1+\cdots+p^{k-3})/(k-2)}{(1+\cdots+p^{k-2})/(k-1)} \ge 1.$$

So, if $\delta \ge p - \frac{p(1-p^{k-1})}{k-1}$, both $\beta_{k-1}$ and $\beta_k$ do not depend on $k$ anyway. Then, if $p - \frac{p(1-p^{k-1})}{k-1} \ge \delta \ge p - \frac{p(1-p^{k-2})}{k-2}$, we have that

$$\beta_{k-1} - \beta_k = 1+\delta-2p - \min(1+\delta-2p,\ 1+\delta k - pk - p^k) \ge 0.$$

Finally, for $p - \frac{p(1-p^{k-2})}{k-2} \ge \delta$, we have

$$\beta_{k-1}-\beta_k = p - \delta - p^{k-1} + p^k \ge \frac{p(1-p^{k-1})}{k-1} - p(1-p)p^{k-2} = p(1-p)\Big(\frac{1+\cdots+p^{k-2}}{k-1} - p^{k-2}\Big) \ge 0.$$

The fact that $\beta_{k-1} \ge \beta_k$ implies that there are far fewer star-like subgraphs with $k-1$ leaves than star-like subgraphs with $k$ leaves, $k = 2,3,\dots$ This can only be explained by nodes with high degree.

(2) Noting that $B_1(t) = \frac{2}{|V(t)|}\cdot C_2(t)$ and $\frac{1}{t}\log|V(t)| \xrightarrow{t\to\infty} 1$, we see that the results in (a) and (b) imply the same growth rate for the number of edges.

(3) Interestingly, we find that $\delta \ge p$ implies that all vertices of the initial graph will eventually be isolated (i.e. have degree 0). However, the total number of edges, denoted by $C_2$, only dies out for $\delta \ge 2p$. So, for $p < \delta < 2p$, all initial vertices become isolated, but are copied often enough such that the number of edges is positive for all times with positive probability.

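Figure 2.2. Illustration of Theorem 2.9(a). We display the rates of decay $\beta_k$ of $E[B_k(t)]$ as functions of $p$ for $k = 2$ (A) and $k = 3$ (B). Panel (A) shows curves for $\delta = 0$, $\delta = 1/2$, $\delta = 1$ and $\delta = p^2$; panel (B) shows curves for $\delta = 0$, $\delta = 1/2$, $\delta = 1$ and $\delta = p - \frac{p(1-p^2)}{2}$.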

2.5. Connection to previous work. Hermann and Pfaffelhuber (2016) analyzed the case $\delta = 0$. Note that Theorem 2.7.1 in that paper is extended here by considering $\delta > 0$ as well as giving precise exponential decay rates in Theorem 2.4(a) and (b). Moreover, recalling the connection $B_k(t) = S_k(t)/k!$ mentioned in Remark 2.2(3), Theorem 2.7.2 is also generalized here to the case $\delta > 0$ and further extended by the almost sure convergence of each component of the degree distribution in Theorem 2.4(c). The methods used in the proofs of Theorem 2.4 and Proposition 2.6 must be seen as extensions of tools used previously in Hermann and Pfaffelhuber (2016). In particular, we found that duplication graphs with edge deletion yield a similar connection to birth-death processes with disasters, defined in the next section. Such models can be studied using piecewise-deterministic Markov processes; see Hermann and Pfaffelhuber (2020). Theorem 2.4 and Proposition 2.6 now essentially follow by combining Hermann and Pfaffelhuber, 2020, Corollaries 2.4 and 3.7 in the next section.

Theorem 2.9 in Hermann and Pfaffelhuber (2016) deals with cliques and $k$-stars in the case $\delta = 0$ and is extended by Theorem 2.9. More precisely, since $|V_t| \sim e^t$, and Hermann and Pfaffelhuber (2016) treats the discrete-time model, we note that Theorem 2.9(1) of Hermann and Pfaffelhuber (2016) aligns with Theorem 2.9(b), but only gives $L^1$ (rather than $L^2$) convergence. In Theorem 2.9(2) of Hermann and Pfaffelhuber (2016), $S^\circ_k$, the number of $k$-stars in the network at time $t$ relative to the network size, was analyzed, which coincided with the factorial moments of the degree distribution. There, a $k$-star was not defined as a sub-graph of $G_t$, since it depended on the order of the nodes. $|V_t| \cdot B_k(t)$ now gives the number of star-like sub-graphs in the network at time $t$ consisting of $k+1$ nodes. Since the only difference between $S^\circ_k$ and $B_k$, as given in Theorem 2.9(a), is a factor of $k!$, the results of Hermann and Pfaffelhuber (2016) easily apply also for $B_k$ if $\delta = 0$. Theorem 2.14 of Hermann and Pfaffelhuber (2016) treats the degrees of initial vertices and thus can be compared to Theorem 2.9(c).

3. Proof of Theorem 2.4

Our analysis of the random graph $PD(p,\delta)$ is based on some main observations: First, the expected degree distribution can be represented by a birth-death process with binomial disasters $Z$, such that the distribution of $Z_t$ equals the expected degree distribution of $G_t$; see (3.2). Second, asymptotics for the survival probability of such processes were studied in Hermann and Pfaffelhuber (2020).

3.1. Birth-death processes with disasters and p-jump processes.

Definition 3.1. Let $b > 0$, $d \ge 0$ and $p \in [0,1]$. Let $Z(b,d,p) = (Z_t)_{t\ge0}$ be a continuous-time Markov process on $\mathbb{N}_0$ that evolves as follows: Given $Z_0 = z$, the process jumps

– to $z+1$ at rate $bz$;

– to $z-1$ at rate $dz$;

– to a binomially distributed random variable with parameters $z$ and $p$ at rate 1.

Then we call $Z(b,d,p)$ a birth-death process subject to binomial disasters with birth-rate $b$, death-rate $d$ and survival probability $p$.
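The three transition types of Definition 3.1 translate into a short event-driven simulation. The sketch below is an illustration only; the function name `simulate_z` is hypothetical, and the binomial disaster is sampled naively as a sum of independent Bernoulli trials to stay within the Python standard library.

```python
import random


def simulate_z(z0, b, d, p, t_max, rng):
    """Event-driven simulation of Z(b, d, p); returns the state at time t_max."""
    t, z = 0.0, z0
    while True:
        total = (b + d) * z + 1.0      # births, deaths and the rate-1 disaster clock
        t += rng.expovariate(total)
        if t > t_max:
            return z
        u = rng.random() * total
        if u < b * z:
            z += 1                     # birth
        elif u < (b + d) * z:
            z -= 1                     # death
        else:
            # disaster: each individual survives independently with probability p
            z = sum(1 for _ in range(z) if rng.random() < p)
```

Repeating `simulate_z(k, p, delta, p, t, rng)` many times with $b = p$ and $d = \delta$ yields an empirical estimate of $P_k(Z_t > 0)$, the quantity whose asymptotics are described in Remark 3.2.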

Remark 3.2. (1) A birth-death process with binomial disasters, $Z(b,d,p)$, models the size of a population where each individual duplicates with rate $b$ and dies with rate $d$, subjected to binomial disasters at rate 1. These disasters are global events that kill off each individual independently of each other with probability $1-p$, which generates the binomial distribution in the third part of Definition 3.1.

(2) Hermann and Pfaffelhuber (2020) provides several limit results for such branching processes with disasters. As reference for the following, let $Z = Z(b,d,p)$ be as above. Then, Corollary 2.7 of Hermann and Pfaffelhuber (2020) states:

(a) If $b - d \le p\log\frac{1}{p}$, $Z$ goes extinct almost surely and

$$\lim_{t\to\infty} -\frac{1}{t}\log P(Z_t > 0) = (1-p) - (b-d).$$

(b) If $p\log\frac{1}{p} < b - d \le \log\frac{1}{p}$, $Z$ goes extinct almost surely and

$$\lim_{t\to\infty} -\frac{1}{t}\log P(Z_t > 0) = 1 - \frac{b-d}{\log\frac{1}{p}}\Big(1 + \log\frac{\log\frac{1}{p}}{b-d}\Big).$$

(c) If $b - d > \log\frac{1}{p}$, then $P_k(\lim_{t\to\infty} Z_t = 0) + P_k(\lim_{t\to\infty} Z_t = \infty) = 1$ and

$$P_k\big(\lim_{t\to\infty} Z_t = \infty\big) = \Big(1 - \frac{d + \log\frac{1}{p}}{b}\Big)\sum_{\ell=1}^{k}\binom{k}{\ell}(-1)^{\ell-1}\prod_{m=1}^{\ell-1}\Big(1 - \frac{dm + (1-p^m)}{bm}\Big).$$

By constructing a relationship between $PD(p,\delta)$ and $Z(p,\delta,p)$ in Lemma 3.3, we are able to transfer these results to our duplication graph processes.

Lemma 3.3. Let $p \in (0,1)$, $\delta \ge 0$ and recall $F_k(t)$ from Definition 2.1. As $h \to 0$, the entries $F_k$ of the degree distribution yield

$$\begin{aligned} \frac{1}{h}E[F_k(t+h) - F_k(t) \mid G_t] = {} & -(1 + pk + \delta k)F_k(t) + p(k-1)F_{k-1}(t) + \delta(k+1)F_{k+1}(t) \\ & + \sum_{\ell\ge k}\binom{\ell}{k}p^k(1-p)^{\ell-k}F_\ell(t) + o(1). \end{aligned} \tag{3.1}$$

Moreover, recall $Z := Z(p,\delta,p)$ from Definition 3.1 (i.e. the binomial distribution of the disasters has the birth rate as a parameter) and let $P(Z_0 = k) = F_k(0)$ for all $k$ be its initial distribution. Then, for all $t \ge 0$ and $k$, it holds

$$P(Z_t = k) = E[F_k(t)], \tag{3.2}$$

i.e. the distribution of $Z_t$ equals the expected degree distribution of $G_t$.

Proof: Letting $\Phi_k(t) := |V_t|F_k(t)$, the absolute number of nodes with degree $k$ at time $t$, we obtain for $h \to 0$ that

$$\begin{aligned} \frac{1}{h}E[\Phi_k(t+h) - \Phi_k(t) \mid G_t] = {} & -(pk\kappa_t + \delta k)\Phi_k(t) + p(k-1)\kappa_t\Phi_{k-1}(t) + \delta(k+1)\Phi_{k+1}(t) \\ & + \sum_{\ell\ge k}\kappa_t\Phi_\ell(t)\binom{\ell}{k}p^k(1-p)^{\ell-k} + O(h), \end{aligned}$$

where the first term on the right hand side stands for the events at which a node can lose the degree $k$ by either obtaining a new neighbor (by one of its $k$ neighbors being copied, which happens with rate $k\kappa_t$, retaining at least the one relevant edge, which has probability $p$) or one of its $k$ edges being deleted, which happens at rate $\delta k$. The second and third terms describe the corresponding gain of a node with degree $k$ by analogous events. Finally, the sum equals the rate of a new node arising with degree $k$, which can only happen if a node of degree $\ell \ge k$ is copied (with rate $\kappa_t\Phi_\ell(t)$) and the copy retains exactly $k$ edges (which then has a binomial probability).

Now, since $|V_t|$ only increases if a new node is added, i.e. on an event related to $\kappa_t$, it follows

$$\begin{aligned} \frac{1}{h}E[F_k(t+h) - F_k(t) \mid G_t] &= \frac{1}{h}E\Big[\frac{\Phi_k(t+h)}{|V_{t+h}|} - \frac{\Phi_k(t)}{|V_{t+h}|}\,\Big|\,G_t\Big] + \frac{1}{h}E\Big[\frac{\Phi_k(t)}{|V_{t+h}|} - \frac{\Phi_k(t)}{|V_t|}\,\Big|\,G_t\Big] \\ &= \frac{\kappa_t}{|V_t|+1}\Big(-pk\Phi_k(t) + p(k-1)\Phi_{k-1}(t) + \sum_{\ell\ge k}\Phi_\ell(t)\binom{\ell}{k}p^k(1-p)^{\ell-k}\Big) \\ &\quad + \frac{-\delta k\Phi_k(t) + \delta(k+1)\Phi_{k+1}(t)}{|V_t|} + \kappa_t|V_t|\Phi_k(t)\Big(\frac{1}{|V_t|+1} - \frac{1}{|V_t|}\Big) + O(h) \\ &= -pkF_k(t) + p(k-1)F_{k-1}(t) + \sum_{\ell\ge k}F_\ell(t)\binom{\ell}{k}p^k(1-p)^{\ell-k} \\ &\quad - \delta kF_k(t) + \delta(k+1)F_{k+1}(t) - F_k(t) + O(h), \end{aligned}$$

and (3.1) holds. Computing the Kolmogorov forward equation for $Z$ shows that for $k = 0,1,2,\dots$

$$\frac{d}{dt}P(Z_t = k) = -(1 + pk + \delta k)P(Z_t = k) + p(k-1)P(Z_t = k-1) + \delta(k+1)P(Z_t = k+1) + \sum_{\ell\ge k}P(Z_t = \ell)\binom{\ell}{k}p^k(1-p)^{\ell-k},$$

which is the same relation as (3.1) after taking expectation and letting $h \to 0$. This shows (3.2).
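The right hand side of (3.1) can be evaluated for any finitely supported degree distribution, which gives a quick sanity check of the bookkeeping: the drift must sum to zero over $k$ (total mass is conserved), and its $k = 0$ entry must equal $\delta F_1 + \sum_{\ell\ge1}(1-p)^\ell F_\ell$. The function name `drift` and the list representation are our own illustrative assumptions.

```python
from math import comb


def drift(F, p, delta):
    """Right hand side of (3.1), without the o(1) term, for k = 0..len(F).

    F[k] is the fraction of vertices of degree k; F_k = 0 is assumed
    for k >= len(F). One extra entry is returned because vertices of
    maximal degree can gain a neighbour.
    """
    K = len(F)

    def get(k):
        return F[k] if 0 <= k < K else 0.0

    out = []
    for k in range(K + 1):
        copies = sum(
            comb(l, k) * p ** k * (1 - p) ** (l - k) * get(l) for l in range(k, K)
        )
        out.append(
            -(1 + p * k + delta * k) * get(k)
            + p * (k - 1) * get(k - 1)
            + delta * (k + 1) * get(k + 1)
            + copies
        )
    return out
```

The same function is also the right hand side of the Kolmogorov forward equation for $Z(p,\delta,p)$, so iterating small Euler steps with it would approximate $P(Z_t = k)$ on a truncated state space.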

3.2. Properties of the piecewise deterministic jump process X. We have seen the connection of $PD(p,\delta)$ to a branching process with disasters in Lemma 3.3. Such branching processes are in turn closely connected to piecewise deterministic jump processes as in Lemma 2.3 (Hermann and Pfaffelhuber, 2020). Hence, we can now prove Lemma 2.3.

Proof of Lemma 2.3: Lemma 3.3 implies that $E[H_x(t)] = E[(1-x)^{Z_t}]$. Recognizing $(Z_t) = Z(p,\delta,p)$ as a homogeneous branching process with disasters $Z^h_{\lambda,q,1,p}$ in the sense of Hermann and Pfaffelhuber, 2020, Definition 2.5, with death-rate $\lambda = p + \delta$ and offspring distribution $q = (q_0, 0, q_2, 0, \dots)$ holding $q_2 = \frac{p}{p+\delta} = 1 - q_0$, the result follows from Lemma 4.1 in Hermann and Pfaffelhuber (2020).

For the process $X$, we now obtain a property which is needed in the proofs of Theorem 2.4 and Proposition 2.6.

Lemma 3.4 (Moments of X). Let $X$ be as in Lemma 2.3. If $\delta > p - p\log\frac{1}{p}$, then $E_x[X_t^k] = o(E_x[X_t])$ for all $k = 2,3,\dots$ and $E_x[X_t] \sim c\,e^{-t(1+\delta-2p)}$, where

$$c := x\cdot\exp\Big(-p\int_0^\infty \frac{E_x[X_s^2]}{E_x[X_s]}\,ds\Big) \in (0,1).$$

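Proof: Recall $\gamma := \log\frac{1}{p}/(p-\delta)$. Indeed, for $p - p\log\frac{1}{p} < \delta \le p - p^2\log\frac{1}{p}$, such that $\gamma \in (p^{-1}, p^{-2}]$, it follows from Corollary 2.4 of Hermann and Pfaffelhuber (2020) that, independent of $x$,

$$-\frac{1}{t}\log\frac{E_x[X_t^2]}{E_x[X_t]} \xrightarrow{t\to\infty} 1 - \frac{1+\log\gamma}{\gamma} - (1+\delta-2p) = p\cdot c_1(p,\delta) > 0,$$

where $c_1(p,\delta)$ denotes the expression from (2.5) with $k = 1$. On the other hand, if $\delta \ge p - p^2\log\frac{1}{p}$, the same corollary gives

$$-\frac{1}{t}\log\frac{E_x[X_t^2]}{E_x[X_t]} \xrightarrow{t\to\infty} 1 + 2\delta - 2p - p^2 - (1+\delta-2p) = \delta - p^2 \ge p^2\Big(\frac{1}{p} - 1 - \log\frac{1}{p}\Big) > 0.$$

In either case, there is an $\varepsilon > 0$ such that $0 < r(s) := E_x[X_s^2]/E_x[X_s] = O(e^{-\varepsilon s})$ and it follows from (3.3)

$$E_x[X_t] = x\exp\Big(-t(1+\delta-2p) - p\int_0^t r(s)\,ds\Big),$$

concluding the proof.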

3.3. Proof of Theorem 2.4. By (3.1) in Lemma 3.3, we get that, as $h \to 0$,

$$\frac{1}{h}E[F_0(t+h) - F_0(t) \mid G_t] \to -F_0(t) + \delta F_1(t) + \sum_{\ell=0}^{\infty}(1-p)^\ell F_\ell(t) = \delta F_1(t) + \sum_{\ell=1}^{\infty}(1-p)^\ell F_\ell(t) \ge 0.$$

Hence, $(F_0(t))_t$ is a bounded sub-martingale and converges almost surely and in $L^1$. Consequently, the left hand side has to converge to 0 almost surely. Since $(1-p)^\ell$ is always positive, that can only be the case if $F_\ell(t) \to 0$ almost surely for all $\ell = 1,2,\dots$, which guarantees almost sure convergence of $F(t)$ to a vector of the form $F(\infty) = (F_0, 0, 0, \dots)$ in all cases.

Let $Z := (Z_t)_{t\ge0} := Z(p,\delta,p)$ be as in Definition 3.1. We note that $E[F_+(t)] = P(Z_t > 0)$ by (3.2). For (a), we see from Lemma 2.3 and Lemma 3.4 that

$$E[F_+(t)] = 1 - E[H_1(t)] = 1 - \sum_{k=0}^{\infty}F_k(0)E_1[(1-X_t)^k] = \sum_{k=1}^{\infty}kF_k(0)E_1[X_t] + o(E_1[X_t]) \sim B_1(0)\cdot c\,e^{-t(1+\delta-2p)}$$

with $c$ as in (2.3). Moreover, (b) follows directly from Corollary 2.7 in Hermann and Pfaffelhuber (2020) (see Remark 3.2(2)) by setting $b = p$ and $d = \delta$. For (c), we again use Corollary 2.7 in Hermann and Pfaffelhuber (2020), but use in addition that

$$E[F_0(t)] = P(Z_t = 0) = \sum_{k=0}^{\infty}P_k(Z_t = 0)\cdot P(Z_0 = k),$$

and $\sum_{k\ge\ell}F_k(0)\binom{k}{\ell} = B_\ell(0)$.

3.4. Proof of claims in Subsection 2.3. It remains to show (2.5) and the last equality in (2.6). Applying the generator of $X$ we see that its moments satisfy, for $k = 1,2,\dots$,

$$\begin{aligned} \frac{d}{dt}\log E_x[X_t^k] &= \frac{1}{E_x[X_t^k]}\Big(p^kE_x[X_t^k] - E_x[X_t^k] + (p-\delta)kE_x[X_t^k] - pkE_x[X_t^{k+1}]\Big) \\ &= p^k + (p-\delta)k - 1 - pk\frac{E_x[X_t^{k+1}]}{E_x[X_t^k]} \end{aligned} \tag{3.3}$$

and thus, integrating, dividing by $-t$, and using Corollary 2.4 of Hermann and Pfaffelhuber (2020),

$$\begin{aligned} \frac{1}{t}\int_0^t \frac{E_x[X_s^{k+1}]}{E_x[X_s^k]}\,ds &= -\frac{1}{pk}\Big(\frac{1}{t}\big(\log E_x[X_t^k] - \log(x^k)\big) - p^k - (p-\delta)k + 1\Big) \\ &\xrightarrow{t\to\infty} \frac{(p-\delta)k + p^k - (1+\log\gamma)/\gamma}{pk} = c_k(p,\delta), \end{aligned}$$

which shows (2.5). Moreover, the last equality in (2.6) also follows from (3.3).

4. Proof of Theorem 2.9

The proof of Theorem 2.9, which is carried out in Section 4.4, will be based on the

analysis of several martingales, which are derived in Proposition 4.5 in Section 4.3.

In Section 4.2, we will analyze the total size of Gt.

4.1. Two auxiliary functions. We will need two specific functions in the sequel, which we now analyze.

Lemma 4.1. Let $p \in (0,1)$, $\delta \ge 0$ and

$$g: [0,\infty) \to \mathbb{R}, \quad x \mapsto 1 + \delta x - px - p^x.$$

Then, $g$ is strictly concave and thus, $x \mapsto g(x)/x$ strictly decreases. Also, the following holds:

(1) If $\delta \ge p$, $g$ is strictly increasing and $g(x) \xrightarrow{x\to\infty} \infty$ if $p < \delta$, while $g(x) \xrightarrow{x\to\infty} 1$ if $p = \delta$.

(2) If $p - \log\frac{1}{p} < \delta < p$, $g$ is strictly increasing on $(0,\xi)$ and strictly decreasing on $(\xi,\infty)$ for $\xi := \log\gamma / \log\frac{1}{p}$ with $\gamma := \log\frac{1}{p}/(p-\delta)$. The global maximum is $g(\xi) = 1 - \frac{1}{\gamma}(1 + \log\gamma)$.

(3) If $\delta \le p - \log\frac{1}{p}$, $g$ strictly decreases and its maximum is $g(0) = 0$.

Proof: All results are straightforward to compute. First, $g'(x) = \delta - p + \log\frac{1}{p}\cdot p^x$ in all cases. Since the right-hand side strictly decreases in $x$, $g$ is strictly concave. Part 1 follows from the form of $g'$. For part 3, we have that $g'(x) \le \delta - p + \log\frac{1}{p} \le 0$, implying the result. Finally, for part 2, we have that $g'(x) = 0$ iff $p^{-x} = \log\frac{1}{p}/(p - \delta) = \gamma$ iff $x = \log\gamma / \log\frac{1}{p} = \xi = \log\gamma/(\gamma(p - \delta))$, and the rest follows.

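Lemma 4.2. Let $\Gamma$ denote the $\Gamma$-function, $r \in \mathbb{R}$, $g_r(n) := \frac{\Gamma(n+r)}{\Gamma(n)}$ and $n_0 \ge \max\{2, 1 - r\}$. Then, there are $0 < c_r \le 1 < C_r < \infty$, such that

$$
c_r n^r \le g_r(n) \le C_r n^r \quad \text{for all } n \ge n_0. \tag{4.1}
$$

Proof: First, we note that $g_r(n) \sim n^r$ as $n \to \infty$ (see e.g. 6.1.46. of Stein, 1970) and hence, the result follows.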
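
To illustrate (4.1), the ratio $g_r(n)/n^r$ can be tabulated over a finite range of $n$. This is a hedged sketch only: the helper names `g_r` and `ratio_range` and the cut-off `n_max` are assumptions introduced here, and a finite scan is of course no substitute for the asymptotic argument.

```python
import math

def g_r(n, r):
    """Gamma(n + r) / Gamma(n) via log-gamma; requires n > 0 and n + r > 0."""
    return math.exp(math.lgamma(n + r) - math.lgamma(n))

def ratio_range(r, n0, n_max=10_000):
    """Smallest and largest value of g_r(n) / n**r for n0 <= n <= n_max."""
    ratios = [g_r(n, r) / n ** r for n in range(n0, n_max + 1)]
    return min(ratios), max(ratios)

# Example: r = -1 gives g_r(n) = 1/(n - 1), so the ratio is n/(n - 1).
lo, hi = ratio_range(-1.0, n0=2)
```

For $r = -1$ the ratio equals $n/(n-1)$, which decreases from 2 at $n = 2$ towards 1, so empirical candidates for $c_r$ and $C_r$ on this range are `lo` and `hi`.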


4.2. The size of the graph. For the asymptotics of the functionals of the random

graph in Theorem 2.9 it will be helpful to understand the asymptotics of the process

(|Vt|). Here and below, we will frequently use the following well-known lemma.

Lemma 4.3. Let $X = (X_t)_{t\ge 0}$ be a Markov process with complete and separable state space $(E, r)$, and let $f : E \to \mathbb{R}$ be continuous and bounded and such that

$$
\lim_{h\to 0} \frac{1}{h}\,\mathbb{E}[f(X_{t+h}) - f(X_t) \mid X_t = x] = \lambda f(x), \qquad x \in E,
$$

for some $\lambda \in \mathbb{R}$. Then $(e^{-t\lambda} f(X_t))_{t\ge 0}$ is a martingale.

Proof: See Lemma 4.3.2 of Ethier and Kurtz (1986).

Lemma 4.4 (Graph size). Let $g_r(n) := \Gamma(n+r)/\Gamma(n)$. For all $r > -(|V_0| + 1)$, the process $(e^{-tr} g_r(|V_t| + 1))_{t\ge 0}$ is a non-negative martingale. Moreover, there is a random variable $V_\infty$ such that the following holds:

$e^{-t}|V_t| \xrightarrow{t\to\infty} V_\infty$ almost surely and in $L^r$ for all $r \ge 1$,

$e^{t}/|V_t| \xrightarrow{t\to\infty} 1/V_\infty$ almost surely and in $L^r$ for $1 \le r < |V_0| + 1$,

$V_\infty$ is $\Gamma(|V_0| + 1, 1)$-distributed.

Proof: Let $0 < c_r < 1 < C_r < \infty$ be as in Lemma 4.2. The process $V := (|V_t|)_{t\ge 0}$ is a Markov process which jumps from $v$ to $v + 1$ at rate $v + 1$. Setting $g_r(v) = \Gamma(v+r)/\Gamma(v)$, we see that the process $(g_r(|V_t| + 1))_{t\ge 0}$ is well-defined and non-negative if $|V_t| + 1 + r > 0$ for all $t$, i.e. if $r > -(|V_0| + 1)$. Then, as $h \to 0$,

$$
\begin{aligned}
\frac{1}{h}\,\mathbb{E}[g_r(V_{t+h} + 1) - g_r(V_t + 1) \mid V_t = v]
&= (v+1)\big(g_r(v+2) - g_r(v+1)\big) + o(1) \\
&= (v+1)\,g_r(v+1)\Big(\frac{v+1+r}{v+1} - 1\Big) + o(1) \\
&= r\,g_r(v+1) + o(1)
\end{aligned}
$$

and Lemma 4.3 implies that $(e^{-tr} g_r(|V_t| + 1))_{t\ge 0}$ is a (non-negative) martingale for all $r > -(|V_0| + 1)$, and therefore $L^1$-bounded. By the martingale convergence theorem, this martingale converges almost surely. Using (4.1), the martingale $(e^{-t} g_1(|V_t| + 1))_t = (e^{-t}(|V_t| + 1))_t$ is $L^r$-bounded for every $r \ge 1$ and therefore converges in $L^r$. Analogously, for $r = -1$, the martingale $(e^{t} g_{-1}(|V_t| + 1))_t = (e^{t}/|V_t|)_t$ is $L^r$-bounded for $1 \le r < |V_0| + 1$ and converges in $L^r$.

Noting that $(|V_t| + 1)_{t\ge 0}$ is a Yule process starting in $|V_0| + 1$, we have that $|V_t| + 1$ is distributed as the sum of $|V_0| + 1$ independent, geometrically distributed random variables with success probabilities $e^{-t}$ (see e.g. p. 109 of Athreya and Ney, 1972). Hence, as $t \to \infty$, we find that $e^{-t}|V_t|$ converges in distribution to the sum of $|V_0| + 1$ independent, exponentially distributed random variables with unit rate. This is a $\Gamma(|V_0| + 1, 1)$ distribution.
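
The key step of the proof is that $v \mapsto g_r(v+1)$ is an eigenfunction of the generator of the size process with eigenvalue $r$. The sketch below checks this relation numerically for a few values; the names `g_r`, `drift` and `worst`, and the chosen ranges of $v$ and $r$, are illustrative assumptions.

```python
import math

def g_r(n, r):
    """Gamma(n + r) / Gamma(n); requires n > 0 and n + r > 0."""
    return math.gamma(n + r) / math.gamma(n)

def drift(v, r):
    """Generator of the size process applied to v -> g_r(v + 1):
    the process jumps from v to v + 1 at rate v + 1."""
    return (v + 1) * (g_r(v + 2, r) - g_r(v + 1, r))

# Largest relative deviation from the relation drift(v, r) = r * g_r(v + 1, r).
worst = max(
    abs(drift(v, r) - r * g_r(v + 1, r)) / g_r(v + 1, r)
    for v in range(1, 21)
    for r in (-1.0, 0.5, 1.0, 2.5)
)
```

Up to floating-point error, `worst` vanishes, in line with the computation $(v+1)(g_r(v+2) - g_r(v+1)) = r\,g_r(v+1)$.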

4.3. Some martingales. Similarly to the discrete-time pure duplication graph in Hermann and Pfaffelhuber (2016) we obtain martingales for the functionals of $PD(p, \delta)$.

Proposition 4.5 (Martingales). Fix $k \ge 2$.

(1) Considering the function $g$ of Lemma 4.1, the following properties hold:

(a) If $g(1) \le g(k)$, $(e^{tg(1)} B_k(t))_{t\ge 0}$ is a martingale that almost surely converges to a limit $B_k(\infty) \in \mathcal{L}^1$.

(b) If $g(1) > g(k)$, there is a process $R_k(t)$ such that $(e^{tg(k)}(B_k(t) + R_k(t)))_{t\ge 0}$ is a positive martingale that almost surely converges to a limit $B_k(\infty) \in \mathcal{L}^1$ and $e^{tg(k)} R_k(t) \xrightarrow{t\to\infty} 0$. In particular, $e^{tg(k)} B_k(t) \to B_k(\infty)$ almost surely as $t \to \infty$.

Combining (a) and (b), we find $e^{t(g(1)\wedge g(k))} B_k(t) \xrightarrow{t\to\infty} B_k(\infty) \in \mathcal{L}^1$.

(2) $\big(e^{-t(kp^{k-1} - 1 - \delta\binom{k}{2})}\, C_k(t)/|V_t|\big)_{t\ge 0}$ is a martingale that converges almost surely to a limit $\tilde{C}_k(\infty)$. If additionally $C_k(0) > 0$ and $\delta < 2p^{k-1}/(k-1)$, the convergence also holds in $L^2$.

(3) Let $i \le |V_0|$. Then, $(e^{-t(p-\delta-1)} D_i(t)/|V_t|)_{t\ge 0}$ is a martingale that converges almost surely to a limit $\tilde{D}_i(\infty) \in \mathcal{L}^1$. Moreover,

$$
\mathbb{E}[e^{-t(p-\delta)} D_i(t)] = D_i(0)\Big(1 + (1 - e^{-t})\frac{p}{|V_0|}\Big) \tag{4.2}
$$

and for $r \ge 2$, there is $C > 0$, depending only on $r, p, \delta$, such that

$$
\mathbb{E}[(e^{-t(p-\delta)} D_i(t))^r] \le \mathbb{E}[(D_i(0))^r] + C\int_0^t e^{-s(p-\delta)}\,\mathbb{E}[(e^{-s(p-\delta)} D_i(s))^{r-1}]\,ds. \tag{4.3}
$$

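Proof: 1. Since the sum in $B_k(t)$ is almost surely finite for every $k$ and $t$, it follows for $h \to 0$, using equation (3.1), that

$$
\begin{aligned}
\frac{1}{h}\,&\mathbb{E}[B_k(t+h) - B_k(t) \mid \mathcal{G}_t] \\
&= \sum_{\ell\ge k}\binom{\ell}{k}\Big(-(1 + p\ell + \delta\ell)F_\ell(t) + p(\ell-1)F_{\ell-1}(t) + \delta(\ell+1)F_{\ell+1}(t) \\
&\qquad\qquad + \sum_{m\ge \ell}\binom{m}{\ell}p^\ell(1-p)^{m-\ell}F_m(t)\Big) + o(1) \\
&= -B_k(t) + p(k-1)F_{k-1}(t) + \sum_{m\ge k}F_m(t)\sum_{\ell=k}^{m}\binom{\ell}{k}\binom{m}{\ell}p^\ell(1-p)^{m-\ell} \\
&\qquad + \sum_{\ell\ge k}F_\ell(t)\underbrace{\Big(-(p+\delta)\ell\binom{\ell}{k} + p\ell\binom{\ell+1}{k} + \delta\ell\binom{\ell-1}{k}\Big)}_{=:a(\ell,k)} + o(1).
\end{aligned}
$$

Considering that $\binom{n+1}{m} - \binom{n}{m} = \binom{n}{m-1}$ and $\frac{n}{m}\cdot\binom{n-1}{m-1} = \binom{n}{m}$, we deduce

$$
\begin{aligned}
a(\ell, k) &= p\ell\binom{\ell}{k-1} - \delta\ell\binom{\ell-1}{k-1} = (p-\delta)\ell\binom{\ell-1}{k-1} + p\ell\binom{\ell-1}{k-2} \\
&= (p-\delta)k\binom{\ell}{k} + p(k-1)\binom{\ell}{k-1},
\end{aligned}
$$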

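which implies that

$$
\begin{aligned}
\frac{1}{h}\,&\mathbb{E}[B_k(t+h) - B_k(t) \mid \mathcal{G}_t] \\
&= -B_k(t) + p(k-1)B_{k-1}(t) + (p-\delta)kB_k(t) \\
&\qquad + \sum_{m\ge k}F_m(t)\underbrace{\sum_{\ell=0}^{m-k}\binom{m-k}{\ell}\binom{m}{k}p^{\ell+k}(1-p)^{m-k-\ell}}_{=\binom{m}{k}p^k} + o(1) \\
&= B_k(t)\big((p-\delta)k - (1 - p^k)\big) + p(k-1)B_{k-1}(t) + o(1) \\
&= -g(k)B_k(t) + p(k-1)B_{k-1}(t) + o(1),
\end{aligned}
$$

recalling the function $g : x \mapsto 1 + \delta x - px - p^x$ from Lemma 4.1. In any case we see that $(e^{tg(1)}B_1(t))_{t\ge 0}$ is a non-negative martingale converging almost surely to a limit $B_1(\infty) \in \mathcal{L}^1$, since the term $p(k-1)B_{k-1}(t)$ vanishes for $k = 1$. For $k = 2, 3, \dots$ let $g_{\min}(k) := \min_{1\le \ell\le k} g(\ell)$ be the running minimum of $g$. Then, there are two cases to consider: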

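Case 1, $g(1) \le g(k)$: It holds by strict concavity of $g$ (see Lemma 4.1) that in this case $g(1) = g_{\min}(k) < g(\ell)$ for all $1 < \ell < k$. Thus, letting

$$
\lambda^k_m := \frac{g(1)}{g(k)}\prod_{\ell=m}^{k-1}\frac{p\ell}{g(\ell) - g(1)}, \qquad m = 2, \dots, k,
$$

and $\lambda^k_1 := 1 + \frac{1}{g(1)}\,p\lambda^k_2$, these coefficients are well-defined and positive. Considering the linear combination $Q_k(t) := \sum_{m=1}^{k}\lambda^k_m B_m(t)$ we obtain

$$
\begin{aligned}
\frac{1}{h}\,&\mathbb{E}[Q_k(t+h) - Q_k(t) \mid \mathcal{G}_t] \\
&= -g(1)B_k(t) + \sum_{m=1}^{k-1}B_m(t)\big(-\lambda^k_m g(m) + \lambda^k_{m+1}pm\big) + o(1) \\
&= -g(1)B_k(t) + \sum_{m=2}^{k-1}B_m(t)\lambda^k_m\Big(-g(m) + pm\,\frac{g(m) - g(1)}{pm}\Big) \\
&\qquad + B_1(t)\big(-g(1)\lambda^k_1 + p\lambda^k_2\big) + o(1) \\
&= -g(1)Q_k(t) + o(1).
\end{aligned}
$$

So now, $(e^{tg(1)}Q_k(t))_{t\ge 0}$ is a non-negative martingale for every $k$. Since $B_k(t)$ can be represented as a linear combination of $(Q_\ell(t))_{1\le \ell\le k}$, also $(e^{tg(1)}B_k(t))_{t\ge 0}$ has to be a non-negative martingale and thus converges to some $B_k(\infty) \in \mathcal{L}^1$.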

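Case 2, $g(1) > g(k)$: Here it holds by strict concavity of $g$ that $g(\ell) > g(k) = g_{\min}(k)$ for all $\ell = 1, \dots, k-1$. Hence

$$
\lambda^k_m := \prod_{\ell=m}^{k-1}\frac{p\ell}{g(\ell) - g(k)}, \qquad m = 1, \dots, k,
$$

are well-defined and positive. We compute analogously to the first case that, as $h \to 0$, with $Q_k(t) := \sum_{m=1}^{k}\lambda^k_m B_m(t)$,

$$
\begin{aligned}
\frac{1}{h}\,\mathbb{E}[Q_k(t+h) - Q_k(t) \mid \mathcal{G}_t]
&= -g(k)B_k(t) + \sum_{m=1}^{k-1}B_m(t)\lambda^k_m\Big(-g(m) + pm\,\frac{g(m) - g(k)}{pm}\Big) + o(1) \\
&= -g(k)Q_k(t) + o(1).
\end{aligned}
$$

Thus, $(e^{tg(k)}Q_k(t))_{t\ge 0}$ is a non-negative martingale and has an almost sure limit $B_k(\infty) \in \mathcal{L}^1$. Moreover, for $\ell < k$, since $g(\ell) > g(k)$, we have that $e^{tg(k)}Q_\ell(t) \to 0$, so, writing $R_k(t) = Q_k(t) - B_k(t)$, we see that $R_k(t) = \sum_{\ell=1}^{k-1}\mu^k_\ell Q_\ell(t)$ for some $\mu^k_1, \dots, \mu^k_{k-1}$, and $e^{tg(k)}R_k(t) \to 0$ and $e^{tg(k)}B_k(t) \to B_k(\infty)$ follows.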
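
The two combinatorial identities used in part 1 of the proof, the closed form of $a(\ell, k)$ and the binomial sum equal to $\binom{m}{k}p^k$, can be confirmed with exact rational arithmetic. This is a sketch under stated assumptions: the function names `a`, `a_closed` and `thinning_sum` are hypothetical, and only a finite range of indices is tested.

```python
from fractions import Fraction
from math import comb

def a(l, k, p, delta):
    """a(l, k) as defined under the brace in the proof."""
    return (-(p + delta) * l * comb(l, k)
            + p * l * comb(l + 1, k)
            + delta * l * comb(l - 1, k))

def a_closed(l, k, p, delta):
    """Closed form (p - delta) k C(l, k) + p (k - 1) C(l, k - 1)."""
    return (p - delta) * k * comb(l, k) + p * (k - 1) * comb(l, k - 1)

def thinning_sum(m, k, p):
    """Sum over l = k..m of C(l, k) C(m, l) p^l (1 - p)^(m - l)."""
    return sum(comb(l, k) * comb(m, l) * p ** l * (1 - p) ** (m - l)
               for l in range(k, m + 1))

# Illustrative rational parameters (an assumption); Fractions keep the check exact.
p, delta = Fraction(1, 3), Fraction(1, 5)
identities_hold = all(
    a(l, k, p, delta) == a_closed(l, k, p, delta)
    and thinning_sum(l, k, p) == comb(l, k) * p ** k
    for k in range(2, 6)
    for l in range(k, 10)
)
```

Using `Fraction` avoids rounding, so equality here is exact rather than approximate.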

2. For the cliques, fix $t \ge 0$ and let $N_k(v)$ for every node $v \in V_t$ denote the number of $k$-cliques that node is part of. Then, $\sum_{v\in V_t} N_k(v) = kC_k(t)$. Analogously define $M_k(e)$ as the number of cliques that the edge $e \in E_t$ is contained in, such that $\sum_{e\in E_t} M_k(e) = \binom{k}{2}C_k(t)$. Also, let $\tilde{C}_k(t) := C_k(t)/|V_t|$. Note that for a new $k$-clique to arise, a node $v$ inside of such a clique has to be copied. Then, each of the $N_k(v)$ cliques $v$ is part of has a chance of $p^{k-1}$ that the copy obtains the $k-1$ edges it needs to form a new $k$-clique. Also, whenever an edge $e$ is deleted, all $M_k(e)$ $k$-cliques containing it are destroyed. We deduce

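$$
\begin{aligned}
\frac{1}{h}\,&\mathbb{E}[\tilde{C}_k(t+h) - \tilde{C}_k(t) \mid \mathcal{G}_t] \\
&= \frac{|V_t| + 1}{|V_t|}\sum_{v\in V_t}\Big(\frac{C_k(t) + p^{k-1}N_k(v)}{|V_t| + 1} - \tilde{C}_k(t)\Big) - \delta\sum_{e\in E_t}\frac{M_k(e)}{|V_t|} + o(1) \\
&= \sum_{v\in V_t}\Big(\tilde{C}_k(t) + \frac{p^{k-1}N_k(v)}{|V_t|} - \frac{|V_t| + 1}{|V_t|}\tilde{C}_k(t)\Big) - \frac{\delta}{|V_t|}\binom{k}{2}C_k(t) + o(1) \\
&= \tilde{C}_k(t)\underbrace{\Big(kp^{k-1} - 1 - \delta\binom{k}{2}\Big)}_{=:q_k} + o(1).
\end{aligned}
\tag{4.4}
$$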

This shows that $(e^{-tq_k}\cdot\tilde{C}_k(t))_{t\ge 0}$ is a non-negative martingale and hence converges almost surely to an integrable random variable $\tilde{C}_k(\infty)$.

It remains to show the $L^2$-convergence of the martingale $(e^{-tq_k}\tilde{C}_k(t))_{t\ge 0}$ for $q_k + 1 > 0$, i.e. $\delta < 2p^{k-1}/(k-1)$. This will be done by considering the number of (unordered) pairs of $k$-cliques, $CC_k(t) := \binom{C_k(t)}{2} = (C_k(t)^2 - C_k(t))/2$, and verifying that the process given by $\widetilde{CC}_k(t) := e^{-t\cdot 2q_k}\,\frac{CC_k(t)}{|V_t|(|V_t| - 1)}$ is $L^1$-bounded, which implies $L^2$-boundedness of the martingale $(e^{-tq_k}\cdot\tilde{C}_k(t))_{t\ge 0}$ and concludes the proof.

Let us denote by $C_{k,\ell}(t)$ the number of (unordered) pairs of $k$-cliques which have exactly $\ell$ shared vertices. Since the overlap of such a pair (i.e. the sub-graph both cliques have in common) is an $\ell$-clique with $\binom{\ell}{2}$ edges, the number of edges making up the pair equals $2\binom{k}{2} - \binom{\ell}{2}$. Hence, we argue as in the proof of Theorem 2.9 in Hermann and Pfaffelhuber (2016), considering the following:

(i) one new such pair arises if one of the $2(k-\ell)$ non-shared vertices is fully copied (probability $p^{k-1}$);

(ii) one new pair arises if one of the $\ell$ shared vertices is fully copied (probability $p^{2k-\ell-1}$), by taking the copied node instead of the original one; in addition, there are two new pairs of $k$-cliques, one original and one copied, which share $\ell - 1$ vertices;

(iii) if one of the $\ell$ shared vertices is chosen, but only one of the two cliques is fully copied (probability $2p^{k-1}(1 - p^{k-\ell})$), one new pair of $k$-cliques arises, which shares $\ell - 1$ vertices.

In addition, such a pair will be destroyed if one of its edges is deleted, hence we deduce for $\ell \le k - 2$

$$
\begin{aligned}
\frac{1}{h}\,&\mathbb{E}[C_{k,\ell}(t+h) - C_{k,\ell}(t) \mid \mathcal{G}_t] \\
&= (|V_t| + 1)\cdot\Big(\frac{2(k-\ell)p^{k-1} + \ell p^{2k-\ell-1}}{|V_t|}\,C_{k,\ell}(t) + \frac{2(\ell+1)p^{k-1}}{|V_t|}\,C_{k,\ell+1}(t)\Big) \\
&\qquad - \delta\cdot\Big(2\binom{k}{2} - \binom{\ell}{2}\Big)\cdot C_{k,\ell}(t) + o(1),
\end{aligned}
$$

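which implies for $\tilde{C}_{k,\ell}(t) := e^{-t\cdot 2q_k}\,\frac{C_{k,\ell}(t)}{|V_t|(|V_t| - 1)}$, that

$$
\begin{aligned}
\frac{1}{h}\,&\mathbb{E}[\tilde{C}_{k,\ell}(t+h) - \tilde{C}_{k,\ell}(t) \mid \mathcal{G}_t] \\
&= -2q_k\tilde{C}_{k,\ell}(t) + e^{-t\cdot 2q_k}(|V_t| + 1) \\
&\qquad \cdot\Big(\frac{(2(k-\ell)p^{k-1} + \ell p^{2k-\ell-1})C_{k,\ell}(t) + 2(\ell+1)p^{k-1}C_{k,\ell+1}(t)}{|V_t|\cdot(|V_t| + 1)|V_t|} \\
&\qquad\qquad + C_{k,\ell}(t)\cdot\Big(\frac{1}{(|V_t| + 1)|V_t|} - \frac{1}{|V_t|(|V_t| - 1)}\Big)\Big) \\
&\qquad - 2\delta\binom{k}{2}\tilde{C}_{k,\ell}(t) + \delta\binom{\ell}{2}\tilde{C}_{k,\ell}(t) + o(1) \\
&= \tilde{C}_{k,\ell}(t)\Big(-2q_k - 2\delta\binom{k}{2} + \delta\binom{\ell}{2} + (2(k-\ell)p^{k-1} + \ell p^{2k-\ell-1})\cdot\frac{|V_t| - 1}{|V_t|} - 2\Big) \\
&\qquad + \tilde{C}_{k,\ell+1}(t)\cdot 2(\ell+1)p^{k-1}\cdot\frac{|V_t| - 1}{|V_t|} + o(1) \\
&\le \tilde{C}_{k,\ell}(t)\Big(-2\ell p^{k-1} + \ell p^{2k-\ell-1} + \delta\binom{\ell}{2}\Big) + \tilde{C}_{k,\ell+1}(t)\cdot 2(\ell+1)p^{k-1} + o(1) \\
&= -\ell\,\tilde{C}_{k,\ell}(t)\Big(p^{k-1}(2 - p^{k-\ell}) - \frac{\delta}{2}(\ell - 1)\Big) + 2(\ell+1)p^{k-1}\tilde{C}_{k,\ell+1}(t) + o(1).
\end{aligned}
$$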

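Analogously, for $\ell = k - 1$, additional pairs arise if a clique with $k$ vertices is completely copied (probability $p^{k-1}$), so

$$
\begin{aligned}
\frac{1}{h}\,\mathbb{E}[\tilde{C}_{k,k-1}(t+h) - \tilde{C}_{k,k-1}(t) \mid \mathcal{G}_t]
&\le -(k-1)\,\tilde{C}_{k,k-1}(t)\Big(p^{k-1}(2 - p) - \frac{\delta}{2}(k - 2)\Big) \\
&\qquad + 2kp^{k-1}e^{-t\cdot 2q_k}\cdot\frac{C_k(t)}{|V_t|(|V_t| - 1)} + o(1).
\end{aligned}
$$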

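Also, letting $\hat{C}_k(t) := e^{-t\cdot 2q_k}C_k(t)/(|V_t|(|V_t| - 1))$ and combining the calculation above with the one in (4.4), it follows that

$$
\begin{aligned}
\frac{1}{h}\,&\mathbb{E}[\hat{C}_k(t+h) - \hat{C}_k(t) \mid \mathcal{G}_t] \\
&= -2q_k\hat{C}_k(t) + e^{-t\cdot 2q_k}\Big(\frac{kp^{k-1}C_k(t)}{|V_t|(|V_t| - 1)}\cdot\frac{|V_t| - 1}{|V_t|} + \frac{C_k(t)}{|V_t|} - C_k(t)\frac{|V_t| + 1}{|V_t|(|V_t| - 1)}\Big) \\
&\qquad - \delta\binom{k}{2}\hat{C}_k(t) + o(1) \\
&\le \hat{C}_k(t)\Big(-2q_k + kp^{k-1} - 2 - \delta\binom{k}{2}\Big) + o(1) \\
&= -\Big(kp^{k-1} - \delta\binom{k}{2}\Big)\hat{C}_k(t) + o(1) = -k\,\hat{C}_k(t)\Big(p^{k-1}(2 - p^0) - \frac{\delta}{2}(k - 1)\Big) + o(1).
\end{aligned}
$$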

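Now, since

$$
\delta < \frac{2p^{k-1}}{k-1} = \min_{2\le m\le k}\Big\{\frac{2p^{k-1}(2 - p^{k-m})}{m - 1}\Big\},
$$

the coefficients given by

$$
\lambda_\ell := \prod_{m=1}^{\ell}\frac{2p^{k-1}}{p^{k-1}(2 - p^{k-m}) - \frac{\delta}{2}(m - 1)}
$$

for $1 \le \ell \le k$ are well-defined and positive (with $\lambda_0 := 1$ as the empty product), and we obtain for the linear combination

$$
R_k(t) := \tilde{C}_{k,0}(t) + \sum_{\ell=1}^{k-1}\lambda_\ell\tilde{C}_{k,\ell}(t) + \lambda_k\hat{C}_k(t)
$$

that

$$
\begin{aligned}
\lim_{h\to 0}\frac{1}{h}\,&\mathbb{E}[R_k(t+h) - R_k(t) \mid \mathcal{G}_t] \\
&\le \sum_{\ell=1}^{k-1}\tilde{C}_{k,\ell}(t)\cdot\ell\Big(-\lambda_\ell\Big(p^{k-1}(2 - p^{k-\ell}) - \frac{\delta}{2}(\ell - 1)\Big) + \lambda_{\ell-1}2p^{k-1}\Big) \\
&\qquad + \hat{C}_k(t)\cdot k\Big(-\lambda_k\Big(p^{k-1}(2 - p^{k-k}) - \frac{\delta}{2}(k - 1)\Big) + \lambda_{k-1}2p^{k-1}\Big) = 0.
\end{aligned}
$$

Thus, $(R_k(t))$ is a non-negative super-martingale, $L^1$-bounded and, since $\lambda_{\min} := \min(\{1\}\cup\{\lambda_\ell;\ 1 \le \ell \le k\}) > 0$ and $\widetilde{CC}_k(t) \le R_k(t)/\lambda_{\min}$, the proof of 2. is complete.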

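3. For the degree $D_i(t)$, we set $\tilde{D}_i(t) = D_i(t)/|V_t|$ and compute, as $h \to 0$,

$$
\begin{aligned}
\frac{1}{h}\,&\mathbb{E}[\tilde{D}_i(t+h) - \tilde{D}_i(t) \mid \mathcal{G}_t] \\
&= (|V_t| + 1)\Big(\frac{pD_i(t)}{|V_t|}\Big(\frac{D_i(t) + 1}{|V_t| + 1} - \frac{D_i(t)}{|V_t|}\Big) + \Big(1 - \frac{pD_i(t)}{|V_t|}\Big)\Big(\frac{D_i(t)}{|V_t| + 1} - \frac{D_i(t)}{|V_t|}\Big)\Big) \\
&\qquad + \delta D_i(t)\Big(\frac{D_i(t) - 1}{|V_t|} - \frac{D_i(t)}{|V_t|}\Big) + o(1) \\
&= \frac{D_i(t)}{|V_t|}\Big(p\,\frac{|V_t| - D_i(t)}{|V_t|} - \Big(1 - \frac{pD_i(t)}{|V_t|}\Big) - \delta\Big) + o(1) \\
&= \tilde{D}_i(t)(p - \delta - 1) + o(1).
\end{aligned}
$$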

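Lemma 4.3 shows that $(e^{-t(p-\delta-1)}D_i(t)/|V_t|)_{t\ge 0}$ is a non-negative martingale, and hence converges almost surely. Furthermore, we write with $g_r(n) := \Gamma(n+r)/\Gamma(n)$

$$
\begin{aligned}
\frac{1}{h}\,&\mathbb{E}[g_r(D_i(t+h)) - g_r(D_i(t)) \mid \mathcal{G}_t] \\
&= (|V_t| + 1)\frac{pD_i(t)}{|V_t|}\big(g_r(D_i(t) + 1) - g_r(D_i(t))\big) + \delta D_i(t)\big(g_r(D_i(t) - 1) - g_r(D_i(t))\big) + o(1) \\
&= g_r(D_i(t))\Big(pr\,\frac{|V_t| + 1}{|V_t|} + \delta D_i(t)\Big(\frac{D_i(t) - 1}{D_i(t) + r - 1} - 1\Big)\Big) + o(1) \\
&= g_r(D_i(t))\,r(p - \delta) + g_r(D_i(t))\,r\Big(p\Big(\frac{|V_t| + 1}{|V_t|} - 1\Big) - \delta\Big(\frac{D_i(t)}{D_i(t) + r - 1} - 1\Big)\Big) + o(1) \\
&= g_r(D_i(t))\,r(p - \delta) + g_r(D_i(t))\,r\Big(p\,\frac{1}{|V_t|} + \delta(r - 1)\frac{1}{D_i(t) + r - 1}\Big) + o(1).
\end{aligned}
\tag{4.5}
$$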

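Letting $h \to 0$, this gives (4.2) for $r = 1$, since

$$
\begin{aligned}
\mathbb{E}[e^{-t(p-\delta)}D_i(t)] &= D_i(0) + \int_0^t p\,\mathbb{E}[e^{-s(p-\delta)}D_i(s)/|V_s|]\,ds \\
&= D_i(0) + \int_0^t p\,e^{-s}D_i(0)/|V_0|\,ds,
\end{aligned}
$$

where in the last step we used the martingale $(e^{-t(p-\delta-1)}D_i(t)/|V_t|)_{t\ge 0}$. Moreover, since $\frac{g_r(D_i(t))}{D_i(t) + r - 1} = g_{r-1}(D_i(t))$, (4.5) gives for some $C > 0$, depending on $r, p, \delta$,

$$
\frac{d}{dt}\,\mathbb{E}[e^{-tr(p-\delta)}g_r(D_i(t))] \le C\cdot\mathbb{E}[e^{-tr(p-\delta)}g_{r-1}(D_i(t))],
$$

and (4.3) follows with Lemma 4.2 and integration.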
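
As a consistency check of (4.2), the closed form can be compared with a numerical evaluation of the integral $D_i(0) + \int_0^t p\,e^{-s}D_i(0)/|V_0|\,ds$. The sketch below uses a plain midpoint rule; the function names and the parameter values are assumptions made for illustration only.

```python
import math

def mean_rescaled_degree(t, d0, v0, p):
    """Right-hand side of (4.2): D_i(0) * (1 + (1 - e^{-t}) * p / |V_0|)."""
    return d0 * (1.0 + (1.0 - math.exp(-t)) * p / v0)

def mean_via_integral(t, d0, v0, p, steps=20_000):
    """Midpoint rule for D_i(0) + int_0^t p * e^{-s} * D_i(0) / |V_0| ds."""
    h = t / steps
    integral = h * sum(p * math.exp(-(j + 0.5) * h) * d0 / v0
                       for j in range(steps))
    return d0 + integral

# Illustrative values (assumptions): D_i(0) = 2, |V_0| = 5, p = 0.4, t = 3.
closed_form = mean_rescaled_degree(3.0, 2, 5, 0.4)
numerical = mean_via_integral(3.0, 2, 5, 0.4)
```

The two values agree up to the discretisation error of the midpoint rule, and as $t \to \infty$ the closed form tends to $D_i(0)(1 + p/|V_0|)$.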

4.4. Proof of Theorem 2.9. (a) Recalling the function $g$ from Lemma 4.1, we note that (see also Remark 2.10.1 for the second equality)

$$
\beta_k = (1 + \delta - 2p)\wedge\big(1 + (\delta - p)k - p^k\big) = g(1)\wedge g(k).
$$

Hence, Proposition 4.5.1 shows that $e^{t\beta_k}B_k(t)$ is non-negative and converges to some $B_k(\infty) \in \mathcal{L}^1$. So, (a) follows.

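(b) We combine Proposition 4.5.2 (recall the random variable $\tilde{C}_k(\infty)$) with the almost sure convergence $e^{-t}|V_t| \xrightarrow{t\to\infty} V_\infty$ from Lemma 4.4. In all cases, we have that

$$
\begin{aligned}
\exp\Big(-t\Big(kp^{k-1} - \delta\binom{k}{2}\Big)\Big)C_k(t)
&= \exp\Big(-t\Big(kp^{k-1} - 1 - \delta\binom{k}{2}\Big)\Big)C_k(t)/|V_t|\cdot e^{-t}|V_t| \\
&\xrightarrow{t\to\infty} \tilde{C}_k(\infty)\cdot V_\infty =: C_k(\infty),
\end{aligned}
\tag{4.6}
$$

where $V_\infty > 0$ almost surely.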

If $\delta \ge 2p^{k-1}/(k-1)$, it is $kp^{k-1} - \delta\binom{k}{2} \le 0$ and the convergence can only hold if $C_k(t) \xrightarrow{t\to\infty} 0$ almost surely. Since $C_k(t) \in \mathbb{N}_0$, the first hitting time $T_{C_k}$ of 0 has to be finite. On the other hand, for $\delta < 2p^{k-1}/(k-1)$, combining the $L^2$-convergences in Proposition 4.5.2 and Lemma 4.4 we obtain that the convergence in (4.6) also holds in $L^1$. Since $\big(e^{-t(kp^{k-1} - 1 - \delta\binom{k}{2})}C_k(t)/|V_t|\big)$ is an $L^2$-convergent and thus uniformly integrable martingale, $\mathbb{P}(C_k(\infty) > 0) = \mathbb{P}(\tilde{C}_k(\infty) > 0) > 0$.

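(c) Finally, fix $i \in \{1, \dots, |V_0|\}$. Again, we combine Proposition 4.5.3 (recall the random variable $\tilde{D}_i(\infty)$) with the almost sure convergence $e^{-t}|V_t| \xrightarrow{t\to\infty} V_\infty$ from Lemma 4.4. In all cases, we have that

$$
e^{-t(p-\delta)}D_i(t) = e^{-t(p-\delta-1)}\frac{D_i(t)}{|V_t|}\,e^{-t}|V_t| \xrightarrow{t\to\infty} \tilde{D}_i(\infty)\cdot V_\infty =: D_i(\infty). \tag{4.7}
$$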

If $\delta < p$, we find by (4.2) that $(e^{-t(p-\delta)}D_i(t))_{t\ge 0}$ is $L^1$-bounded. Then, inductively using (4.3) shows that $(e^{-t(p-\delta)}D_i(t))_{t\ge 0}$ is $L^r$-bounded for all $r \ge 1$. In particular, this implies that the convergence in (4.7) also holds in $L^r$ for all $r \ge 1$. This gives convergence of first moments, and (2.7) follows by taking $t \to \infty$ in (4.2).

If $\delta \ge p$, the almost sure convergence in (4.7) implies, since $D_i(t) \in \mathbb{N}_0$, that $D_i(\infty) = 0$, so there must be a finite hitting time $T_i$ of 0.