All content in this area was uploaded by Haoran Zhang on Jul 20, 2022
Content may be subject to copyright.
Available via license: CC BY 4.0
Signed Network Embedding with Application to
Simultaneous Detection of Communities and Anomalies
Haoran Zhang and Junhui Wang
School of Data Science
City University of Hong Kong
Abstract
Signed networks are frequently observed in real life with additional sign information
associated with each edge, yet such information has been largely ignored in existing
network models. This paper develops a unified embedding model for signed networks
to disentangle the intertwined balance structure and anomaly effect, which can greatly
facilitate the downstream analysis, including community detection, anomaly detection,
and network inference. The proposed model captures both balance structure and
anomaly effect through a low rank plus sparse matrix decomposition, which are jointly
estimated via a regularized formulation. Its theoretical guarantees are established in
terms of asymptotic consistency and finite-sample probability bounds for network embedding,
community detection and anomaly detection. The advantage of the proposed
embedding model is also demonstrated through extensive numerical experiments on
both synthetic networks and an international relation network.
Keywords: Anomaly detection, balance theory, community detection, low rank plus sparse
matrix decomposition, network embedding, signed network
1 Introduction
Network structure has been widely employed to describe pairwise interactions among a variety
of objects. In the literature, a number of popular models have been developed to leverage network
structure for better modeling and prediction accuracy, including the Erdős-Rényi model
(Erdős and Rényi, 1960), the β-model (Chatterjee et al., 2011; Graham, 2017), the stochastic
block model (Holland et al., 1983; Zhao et al., 2012; Sengupta and Chen, 2018), and the network
embedding model (Hoff et al., 2002; Zhang et al., 2021). Although success has been widely
reported, most existing methods and theories are developed for unsigned networks, and signed
networks have been largely ignored in the literature.
One of the fundamental differences between signed and unsigned networks is
the sign information associated with each edge, reflecting the polarity of node interaction.
Examples of signed networks include international politics (Heider, 1946; Axelrod and Bennett,
1993; Moore, 1979), with consensus or conflict between different countries; and social
networks (Massa and Avesani, 2005; Leskovec et al., 2010; Kunegis et al., 2009), with friends
or enemies between users. The existence of negative edges leads to some unique structures
in signed networks, such as those described by the balance theory (Heider, 1946; Cartwright and Harary, 1956),
which make most existing methods for unsigned networks not directly applicable to signed
networks (Chiang et al., 2014).
In particular, the balance theory suggests that signed networks tend to conform to some
local balanced patterns. A signed network is said to have strong balance if all its cycles
have an even number of negative edges, following from the human intuition that “friend
of friend is a friend” and “enemy of enemy is also a friend” (Heider, 1946). While strong
balance implies clear-cut community structures (Harary, 1953), it can be too restrictive and
rarely satisfied in practice. Weak balance is proposed in Davis (1967), which only requires
the signed network to have no cycle with exactly one negative edge and implies multiple
communities in the signed network (Easley et al., 2010). For illustration, Figure 1 displays
four possible triads with signed edges, where triads A and C are strongly balanced, while
triads A, C and D are weakly balanced.
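The two balance notions can be checked mechanically on a triad. The sketch below is only illustrative: the function names are hypothetical, and the mapping of labels A to D onto the number of negative edges (0, 1, 2, 3) is inferred from the text, since Figure 1 itself is not reproduced here.

```python
def count_negative(signs):
    """Number of negative edges among the edge signs of a cycle."""
    return sum(1 for s in signs if s < 0)


def is_strongly_balanced(signs):
    """Strong balance: the cycle has an even number of negative edges."""
    return count_negative(signs) % 2 == 0


def is_weakly_balanced(signs):
    """Weak balance: the cycle does not have exactly one negative edge."""
    return count_negative(signs) != 1


# Assumed labelling (inferred from the text): A = +++, B = ++-, C = +--, D = ---
triads = {"A": (1, 1, 1), "B": (1, 1, -1), "C": (1, -1, -1), "D": (-1, -1, -1)}
for name, signs in triads.items():
    print(name, is_strongly_balanced(signs), is_weakly_balanced(signs))
```

Under this labelling, only B fails both checks, while D is weakly but not strongly balanced, which agrees with the description of Figure 1.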
Figure 1: Balanced and unbalanced triads.
The balance theory provides additional guidance for community detection in a signed
network. More specifically, besides sharing similar connectivity patterns, nodes in the same
community tend to be connected with positive edges whereas nodes in different communities
tend to be connected with negative edges (Tang et al., 2016). In the literature, a number of
community detection methods for signed networks have been developed (Doreian and Mrvar,
1996; Bansal et al., 2004; Li et al., 2014; Chen et al., 2014; Jiang, 2015; Yang et al., 2007).
However, most of them are algorithm oriented, and very little theoretical analysis has been
conducted on the interplay between the balance theory and connectivity patterns.
Another distinctive feature of signed network is that violations of the balance theory
are also prevalent (Cartwright and Gleason, 1966; Bansal et al., 2004; Zheng et al., 2015).
In other words, real-life signed networks can have triads like B in Figure 1, suggesting two
friends of the same node can be enemies themselves. For instance, Israel and Turkey are
two close allies of the United States in international politics, but the relationship between
themselves has not been very constructive. We refer to such violations as the anomaly effect,
which is often encountered in real-life signed networks. It may convey important information
that cannot be explained by the balance theory, yet it has been largely ignored in the existing
literature on signed network modeling.
The major contributions of this paper are three-fold. First, to the best of our knowledge,
this paper is one of the first attempts to incorporate both balance structure and anomaly
effect in signed networks. Particularly, estimation of the balance structure can benefit substantially
by taking the anomaly effect into account, and estimation of the anomaly effect per
se is also of interest in various real applications, such as the international relation network in
Section 5. Second, we propose a unified embedding model for signed networks to disentangle
the intertwined balance structure and anomaly effect. It is cast into a flexible probabilistic
model, where the balance structure and anomaly effect are modeled via a low rank plus
sparse matrix decomposition. Finally, a thorough theoretical analysis is conducted to quantify
the asymptotic estimation consistency of the proposed embedding model. We establish
some novel identifiability conditions, under which both balance structure and anomaly effect
can be consistently estimated with fast convergence rates. Its applications to community
detection and anomaly detection in signed networks are also considered, with sound theoretical
justification. Particularly, under the signed stochastic block model (SSBM) with $n$
nodes, the proposed model achieves a fast convergence rate of $O_p(n^{-1})$ in terms of community
detection, which matches the best existing results for unsigned networks (Lei
and Rinaldo, 2015). In addition, the false discovery proportion of the proposed model also
converges to 0 at a fast rate under some mild conditions.
The rest of the paper is organized as follows. Section 2 presents the proposed embedding
model for signed network as well as its estimation formulation and computational details.
Section 3 establishes the theoretical results on the asymptotic consistencies of the proposed
model, as well as the theoretical guarantees for its applications to community detection
and anomaly detection. Section 4 conducts extensive numerical experiments on synthetic
networks to examine the finite sample performance of the proposed model, and Section 5
applies it to analyze an international relation network. Section 6 concludes the paper, and
technical proofs and necessary lemmas are provided in the Appendix.
Before moving to Section 2, we define some notation. For a vector $\beta$, let $\|\beta\|$
denote its Euclidean norm. For a matrix $X = (x_{ij}) = (x_1, ..., x_n)^\top$, we denote
$\|X\|_0 = \sum_{i,j} 1\{x_{ij} \neq 0\}$, $\|X\|_F = \sqrt{\sum_{i,j} x_{ij}^2}$,
$\|X\|_{\max} = \max_{i,j} |x_{ij}|$, and $\|X\|_{2\to\infty} = \max_{1 \le i \le n} \|x_i\|$. We
also denote $\nu(X)$ and $\sigma_k(X)$ as the vectorization and the $k$-th largest singular value of $X$,
respectively. Let $I_n$ and $\mathbf{1}_n$ denote the identity matrix of size $n$ and the vector with $n$ ones,
respectively.
2 Signed Network Embedding
2.1 Embedding Model
Consider a signed network $G$ with $n$ nodes labeled by $[n] = \{1, ..., n\}$ and an adjacency matrix
$Y = (y_{ij})_{n \times n}$ with $y_{ij} \in \{-1, 0, 1\}$ and $y_{ij} = y_{ji}$. Here $y_{ij} = 1$ if there is a positive edge
between node $i$ and node $j$, $y_{ij} = -1$ if there is a negative edge, and $y_{ij} = 0$ if no edge is
observed at all. Suppose the distribution function of $y_{ij}$ is given by
$$
F(t \mid m_{ij}) := \Pr(y_{ij} \ge t \mid m_{ij}) =
\begin{cases}
1, & t \le -1, \\
f(d_t + m_{ij}), & t \in \{0, 1\}, \\
0, & t \ge 2,
\end{cases}
\tag{1}
$$
where $f$ is some pre-specified increasing link function such as the logistic function (logit link) or the
standard normal distribution function (probit link), $d = (d_0, d_1)$ are intercepts, and $M = (m_{ij})_{n \times n}$ is an underlying matrix. Here a large
value of $m_{ij}$ leads to a large probability of a positive edge between nodes $i$ and $j$, whereas
a small value of $m_{ij}$ implies a large probability of a negative edge between nodes $i$ and
$j$. It is also assumed that the $y_{ij}$'s are mutually independent conditional on $M$. Furthermore,
it is interesting to note that unsigned networks can also be accommodated in (1), as the
probability of negative edges becomes 0 if $d_0$ is set as $\infty$.
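To make model (1) concrete, the following sketch evaluates the three edge probabilities implied by $F$, namely $\Pr(y_{ij} = -1) = 1 - f(d_0 + m_{ij})$, $\Pr(y_{ij} = 0) = f(d_0 + m_{ij}) - f(d_1 + m_{ij})$ and $\Pr(y_{ij} = 1) = f(d_1 + m_{ij})$. The logistic choice of $f$, the function names, and the intercept values are assumptions made for illustration only.

```python
import numpy as np


def logistic(x):
    """Logistic function, used here as one possible increasing link f."""
    return 1.0 / (1.0 + np.exp(-x))


def edge_probabilities(m, d0, d1, link=logistic):
    """Return (Pr(y=-1), Pr(y=0), Pr(y=1)) under model (1).

    F(0 | m) = link(d0 + m) = Pr(y >= 0) and F(1 | m) = link(d1 + m) = Pr(y = 1),
    so d0 >= d1 is needed for the middle probability to be non-negative.
    """
    F0 = link(d0 + m)
    F1 = link(d1 + m)
    return 1.0 - F0, F0 - F1, F1


# Illustrative intercepts (not taken from the paper)
m = np.array([-2.0, 0.0, 2.0])
p_neg, p_zero, p_pos = edge_probabilities(m, d0=1.0, d1=-1.0)
print(p_neg, p_zero, p_pos)

# Setting d0 = infinity removes negative edges, recovering an unsigned network
print(edge_probabilities(0.0, d0=np.inf, d1=0.0))
```

A larger $m_{ij}$ raises the probability of a positive edge and lowers that of a negative edge, as described above.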
To fully exploit the balance structure and anomaly effect in $G$, we assume $M$ can be
decomposed as $M = L + S$, where $L = (l_{ij})_{n \times n}$ is a low rank matrix for the balance and
community structure, and $S = (s_{ij})_{n \times n}$ is a sparse matrix for the anomaly effect. Compared
with the balance and community structures in $L$, it is believed that the level of the anomaly
effect in $S$ is much weaker, and it is only observed on a relatively small number of edges
(Facchetti et al., 2011), leading to the sparsity in $S$.
The modeling strategy for $L$ is motivated by the fact that both weak balance and
community structure can be naturally accommodated via network embedding in a low dimensional
space. Specifically, we set $l_{ij} = -\|\beta_i - \beta_j\|^2$ with $\beta_i \in \mathbb{R}^{K_1}$ being the embedding
vector for node $i$, which makes $L$ a negative Euclidean distance matrix with $\mathrm{rank}(L) \le K_1 + 2$.
We refer readers to Chapter 5 of Dattorro (2010) for the rank of a Euclidean distance matrix.
With the embedding vectors, two nodes with a positive edge tend to have a small distance in
the embedding space, whereas two nodes with a negative edge have a relatively large distance.
As a direct consequence, triads A, C and D in Figure 1 are allowed under this embedding
framework, whereas triad B is forbidden due to the triangle inequality. Besides weak balance,
this Euclidean embedding framework also encourages nodes with similar connectivity
patterns to be situated in a close neighborhood in the embedding space. In the sequel, we
denote the corresponding parameter space for $L$ as
$$
\mathcal{L}_n = \{ L = (l_{ij}) \in \mathbb{R}^{n \times n} : l_{ij} = -\|\beta_i - \beta_j\|^2, \ B = (\beta_1, ..., \beta_n)^\top \in \mathbb{R}^{n \times K_1} \}. \tag{2}
$$
Modeling of $S$ requires additional structural assumptions, since it is generally difficult
to construct a consistent estimate of $S$ from a single observed $G$ with random noise (Chandrasekaran
et al., 2011; Candès et al., 2011). To facilitate its estimation, we assume that
$S \in \mathcal{S}_n$ with
$$
\mathcal{S}_n = \{ S \in \mathbb{R}^{n \times n} : S = A A^\top, \ A \in \mathbb{R}^{n \times K_2} \}, \tag{3}
$$
where $A = (\alpha_1, ..., \alpha_n)^\top$, and $s_{ij} = \alpha_i^\top \alpha_j$ with $\alpha_i$ being an additional embedding vector of
node $i$, determining whether it may have anomalous edges with other nodes.
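The decomposition $M = L + S$ in (2) and (3) can be assembled directly from the two embedding matrices. The sketch below uses arbitrary random embeddings (an assumption, purely for illustration) and checks numerically that $\mathrm{rank}(L) \le K_1 + 2$ and $\mathrm{rank}(S) \le K_2$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, K1, K2 = 30, 3, 2  # illustrative sizes, not taken from the paper

B = rng.normal(size=(n, K1))  # rows are the embedding vectors beta_i
A = rng.normal(size=(n, K2))  # rows are the embedding vectors alpha_i

# l_ij = -||beta_i - beta_j||^2, expanded as ||b_i||^2 + ||b_j||^2 - 2 b_i^T b_j
sq_norms = np.sum(B ** 2, axis=1)
L = -(sq_norms[:, None] + sq_norms[None, :] - 2.0 * B @ B.T)

S = A @ A.T  # s_ij = alpha_i^T alpha_j
M = L + S

print("rank(L) =", np.linalg.matrix_rank(L), "with K1 + 2 =", K1 + 2)
print("rank(S) =", np.linalg.matrix_rank(S), "with K2 =", K2)
```

In this toy example $A$ is dense; in the model, sparsity of $S$ arises when most rows $\alpha_i$ are zero.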
2.2 Estimation Formulation
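Let $\mathcal{T} = \{-1, 0, 1\}$ denote the support of $y_{ij}$. For each $t \in \mathcal{T}$, we further denote the
probability of $y_{ij} = t$ as
$$
p(t \mid m_{ij}) := F(t \mid m_{ij}) - F(t + 1 \mid m_{ij}).
$$
Then the log likelihood of the signed network $G$ takes the form
$$
\log L(M) = \sum_{i,j=1}^{n} \log p(y_{ij} \mid m_{ij}) = \sum_{i,j=1}^{n} \log \left[ F(y_{ij} \mid m_{ij}) - F(y_{ij} + 1 \mid m_{ij}) \right].
$$
Given the embedding framework in (2) and (3), we rewrite $\log L(M) = \log L(B, A)$ with
$m_{ij} = -\|\beta_i - \beta_j\|^2 + \alpha_i^\top \alpha_j$, and propose the estimation formulation as
$$
\begin{aligned}
(\hat{B}, \hat{A}) = \arg\min_{B \in \mathbb{R}^{n \times K_1},\, A \in \mathbb{R}^{n \times K_2}} \ & \{ -\log L(B, A) \} \\
\text{s.t.} \ & \|B\|_{2\to\infty} \le C, \ \|A\|_{2\to\infty} \le C, \ \mathbf{1}_n^\top B = \mathbf{0}, \ \mathbf{1}_n^\top A = \mathbf{0}, \\
& B^\top A = \mathbf{0}, \ \text{and} \ \|A\|_F \le \kappa \sqrt{a_n} \|B\|_F.
\end{aligned}
\tag{4}
$$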
Here, $C$ and $\kappa$ are some pre-specified constants, and $a_n$ is a small anomaly rate that controls
the relative scales of $\|B\|_F$ and $\|A\|_F$, which reflects the prior knowledge that the level
of anomaly effect is much weaker than that of the balance structure. The sum-to-zero and
orthogonality constraints on $B$ and $A$ are necessary for their identifiability in terms of parameter
estimation. Given $(\hat{B}, \hat{A})$, we define $\hat{L} = (\hat{l}_{ij})_{n \times n}$ with
$\hat{l}_{ij} = -\|\hat{\beta}_i - \hat{\beta}_j\|^2$, $\hat{S} = \hat{A} \hat{A}^\top$ and
$\hat{M} = \hat{L} + \hat{S}$.
Once $\hat{B}$ and $\hat{S}$ are obtained, we can further detect communities and anomalies in the
signed network. Particularly, we perform a $(1+\epsilon)$-approximation of the K-means algorithm
(Kumar et al., 2004) on the estimated $\{\hat{\beta}_i\}_{i=1}^n$ to detect communities, and then
$$
\hat{N}_l = \{ i \in [n] : \hat{\psi}_i = l \}, \quad \text{for } l = 1, ..., m,
$$
is the $l$th detected community, where $m$ denotes the number of communities, and $\hat{\psi}_i \in [m]$
denotes the community membership of node $i$. We also detect anomalies by performing hard
thresholding on $\hat{s}_{ij}$, and conclude the edge $y_{ij}$ to be anomalous if $|\hat{s}_{ij}| > \eta_n$, where $\eta_n$ is the
thresholding parameter.
2.3 Computing Algorithm
The optimization task in (4) can be efficiently solved by an alternating updating scheme,
which updates $B$ and $A$ iteratively via the projected gradient descent algorithm. Specifically,
for a matrix $X$ and positive constant $c$, we define some projection operators as follows:
$$
P_{F,c}(X) := 1\{\|X\|_F \le c\} X + 1\{\|X\|_F > c\}\, c \|X\|_F^{-1} X,
$$
$$
P_{2\to\infty,c}(X) := 1\{\|X\|_{2\to\infty} \le c\} X + 1\{\|X\|_{2\to\infty} > c\}\, c \|X\|_{2\to\infty}^{-1} X.
$$
Further, given $B \in \mathbb{R}^{n \times K_1}$, we define
$$
P_B^\perp(A) := \left( I_n - B (B^\top B)^{-1} B^\top \right) A,
$$
for any $A \in \mathbb{R}^{n \times K_2}$, which is the projection operator onto the orthogonal complement of $B$'s
column space.
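The three operators above translate almost line by line into NumPy. This is a minimal sketch with hypothetical function names; the orthogonal-complement projection is computed through a least-squares solve rather than the explicit inverse $(B^\top B)^{-1}$, which is an implementation choice and not something stated in the text.

```python
import numpy as np


def two_to_inf_norm(X):
    """||X||_{2->inf}: the largest Euclidean norm among the rows of X."""
    return np.max(np.linalg.norm(X, axis=1))


def proj_frobenius(X, c):
    """P_{F,c}: leave X unchanged if ||X||_F <= c, otherwise rescale it to norm c."""
    norm = np.linalg.norm(X)  # Frobenius norm for a 2-D array
    return X if norm <= c else (c / norm) * X


def proj_two_to_inf(X, c):
    """P_{2->inf,c}: rescale the whole matrix if its largest row norm exceeds c."""
    norm = two_to_inf_norm(X)
    return X if norm <= c else (c / norm) * X


def proj_orth_complement(B, A):
    """P_B^perp(A) = (I - B (B^T B)^{-1} B^T) A, via least squares."""
    coef, *_ = np.linalg.lstsq(B, A, rcond=None)
    return A - B @ coef


rng = np.random.default_rng(0)
B = rng.normal(size=(10, 3))
A = rng.normal(size=(10, 2))
A_perp = proj_orth_complement(B, A)
print(np.abs(B.T @ A_perp).max())  # numerically zero
```

Note that, following the definitions, both norm projections rescale the entire matrix by a single scalar, so they preserve any orthogonality or sum-to-zero property already imposed.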
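Then given $B^{(k)}$, $A^{(k)}$ with $\mathbf{1}_n^\top A^{(k)} = \mathbf{0}$, and $\{d_0, d_1, C, \kappa, a_n\}$, we implement the
following updating scheme:
$$
\begin{aligned}
B^{(k+1)} &= P_{2\to\infty, C}\left( J_n \left( B^{(k)} + \xi_1 \frac{\partial \log L(B^{(k)}, A^{(k)})}{\partial B} \right) \right), \\
A^{(k+1)} &= P_{2\to\infty, C}\left( P_{F, \kappa\sqrt{a_n}\|B^{(k+1)}\|_F}\left( P^\perp_{(B^{(k+1)}, \mathbf{1}_n)}\left( A^{(k)} + \xi_2 \frac{\partial \log L(B^{(k+1)}, A^{(k)})}{\partial A} \right) \right) \right),
\end{aligned}
\tag{5}
$$
where $\xi_1, \xi_2 > 0$ are step sizes and $J_n = I_n - \mathbf{1}_n \mathbf{1}_n^\top / n$. We repeat the above updating steps
until convergence to get $(\hat{B}, \hat{A})$.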
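A single pass of the updating scheme (5) can be sketched as follows. Several things here are assumptions rather than the authors' implementation: the logistic link, the closed-form score expressions that follow from it, the function names, and the use of a pseudo-inverse for the projection onto the complement of $(B^{(k+1)}, \mathbf{1}_n)$. The gradients use $\partial \log L / \partial B = -4(D - G)B$ and $\partial \log L / \partial A = 2GA$, where $G = (\partial \log p(y_{ij} \mid m_{ij}) / \partial m_{ij})$ is symmetric and $D$ is the diagonal matrix of its row sums; the diagonal pairs $i = j$ are included, as in the displayed log likelihood.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def mean_matrix(B, A):
    """m_ij = -||beta_i - beta_j||^2 + alpha_i^T alpha_j."""
    sq = np.sum(B ** 2, axis=1)
    return -(sq[:, None] + sq[None, :] - 2.0 * B @ B.T) + A @ A.T


def score_matrix(Y, M, d0, d1):
    """d/dm log p(y_ij | m_ij) under the logistic link (an assumption)."""
    s0, s1 = sigmoid(d0 + M), sigmoid(d1 + M)
    # y = 1: 1 - s1;  y = -1: -s0;  y = 0: 1 - s0 - s1
    return np.where(Y == 1, 1.0 - s1, np.where(Y == -1, -s0, 1.0 - s0 - s1))


def row_norm_max(X):
    return np.max(np.linalg.norm(X, axis=1))


def scale_to_ball(X, norm, c):
    """Shared form of P_{F,c} and P_{2->inf,c}: rescale X if its norm exceeds c."""
    return X if norm <= c else (c / norm) * X


def pga_step(Y, B, A, d0, d1, C, kappa, a_n, xi1, xi2):
    """One projected gradient ascent pass on log L, following (5)."""
    n = Y.shape[0]
    # Update B: gradient step, centering by J_n, then the 2->inf projection
    G = score_matrix(Y, mean_matrix(B, A), d0, d1)
    grad_B = -4.0 * (np.diag(G.sum(axis=1)) - G) @ B
    B_new = B + xi1 * grad_B
    B_new = B_new - B_new.mean(axis=0)
    B_new = scale_to_ball(B_new, row_norm_max(B_new), C)
    # Update A: gradient step, then the three nested projections
    G = score_matrix(Y, mean_matrix(B_new, A), d0, d1)
    A_new = A + xi2 * (2.0 * G @ A)
    X = np.hstack([B_new, np.ones((n, 1))])
    A_new = A_new - X @ (np.linalg.pinv(X) @ A_new)
    radius = kappa * np.sqrt(a_n) * np.linalg.norm(B_new)
    A_new = scale_to_ball(A_new, np.linalg.norm(A_new), radius)
    A_new = scale_to_ball(A_new, row_norm_max(A_new), C)
    return B_new, A_new


# Toy usage on a random symmetric signed network (illustrative values only)
rng = np.random.default_rng(0)
n = 20
Y = np.triu(rng.integers(-1, 2, size=(n, n)), 1)
Y = Y + Y.T
B0 = rng.normal(size=(n, 2))
B0 = B0 - B0.mean(axis=0)
A0 = 0.1 * rng.normal(size=(n, 2))
B1, A1 = pga_step(Y, B0, A0, d0=1.0, d1=-1.0, C=2.0, kappa=1.0,
                  a_n=0.1, xi1=1e-3, xi2=1e-3)
print(np.abs(B1.T @ A1).max(), np.abs(A1.sum(axis=0)).max())
```

Because the two norm projections only rescale by a scalar, the centering and orthogonality constraints imposed earlier in each update remain satisfied after them.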
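If the intercepts $d$ are unknown a priori, their estimates can be obtained in the iterative
updating algorithm as well. We denote $\log L(B, A, d)$ to emphasize the dependence of the
log likelihood on $d$, and define $P_{[a,b]}(x) = x 1\{a \le x \le b\} + a 1\{x < a\} + b 1\{x > b\}$, for any interval $[a, b]$
and scalar $x$. Given $d^{(k)} = (d_0^{(k)}, d_1^{(k)})$ satisfying $c_1 \le d_1^{(k)} \le d_0^{(k)} - \delta < d_0^{(k)} \le c_2$, where
$\delta > 0$, $c_1$, $c_2$ are pre-specified constants, in addition to the updating steps in (5), we also
update $d$ as
$$
\begin{aligned}
d_1^{(k+1)} &= P_{[c_1,\, d_0^{(k)} - \delta]}\left( d_1^{(k)} + \xi_3 \frac{\partial \log L(B^{(k+1)}, A^{(k+1)}, d^{(k)})}{\partial d_1} \right), \\
d_0^{(k+1)} &= P_{[d_1^{(k+1)} + \delta,\, c_2]}\left( d_0^{(k)} + \xi_4 \frac{\partial \log L(B^{(k+1)}, A^{(k+1)}, (d_0^{(k)}, d_1^{(k+1)}))}{\partial d_0} \right),
\end{aligned}
$$
where $\xi_3, \xi_4 > 0$ are step sizes.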
It is clear that the embedding performance of (4) relies on the choices of $K_1$ and $K_2$,
which can be determined by some data-adaptive tuning procedure, such as the network
cross-validation (Li et al., 2020). If community detection is of primary interest, we
suggest setting $K_1 = K_2 = m - 1$, where only the number of communities $m$ is determined
by some tuning procedure (Saldana et al., 2017).
3 Theory
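Denote $(L^*, S^*) \in \mathcal{L}_n \times \mathcal{S}_n$ as the true parameters, and $M^* = L^* + S^*$. Further denote
$B^* \in \mathbb{R}^{n \times K_1}$ and $A^* \in \mathbb{R}^{n \times K_2}$ as the embedding vectors for $L^*$ and $S^*$, respectively. For any
constant $\alpha > 0$, we define
$$
G_\alpha := \max_{t \in \mathcal{T}} \sup_{|x| \le \alpha} \frac{|p'(t \mid x)|}{p(t \mid x)}
\quad \text{and} \quad
H_\alpha := \min_{t \in \mathcal{T}} \inf_{|x| \le \alpha} \left\{ \frac{[p'(t \mid x)]^2}{[p(t \mid x)]^2} - \frac{p''(t \mid x)}{p(t \mid x)} \right\},
$$
where $p'(t \mid x)$ and $p''(t \mid x)$ denote, respectively, the first and second order derivatives of
$p(t \mid x)$ with respect to $x$.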
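As a reading aid, and not as part of the original argument, both quantities can be rewritten in terms of derivatives of $\log p(t \mid x)$, using only the chain and quotient rules:

```latex
\frac{\partial}{\partial x} \log p(t \mid x) = \frac{p'(t \mid x)}{p(t \mid x)},
\qquad
\frac{\partial^2}{\partial x^2} \log p(t \mid x)
  = \frac{p''(t \mid x)}{p(t \mid x)} - \frac{[p'(t \mid x)]^2}{[p(t \mid x)]^2},
```
```latex
\text{so that}\qquad
G_\alpha = \max_{t \in \mathcal{T}} \sup_{|x| \le \alpha}
  \left| \frac{\partial}{\partial x} \log p(t \mid x) \right|,
\qquad
H_\alpha = \min_{t \in \mathcal{T}} \inf_{|x| \le \alpha}
  \left\{ -\frac{\partial^2}{\partial x^2} \log p(t \mid x) \right\}.
```

In this form, $G_\alpha$ bounds the steepness of the log probability and $H_\alpha$ bounds its curvature from below on $[-\alpha, \alpha]$.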
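Recall the constants $C$, $\kappa$ and the sequence $a_n \to 0$ in (4); we impose the following technical
conditions.
Condition A1. $G_{5C^2 + d_0} < \infty$ and $H_{5C^2 + d_0} > 0$.
Condition A2. $\|B^*\|_{2\to\infty} \le C$, $\|A^*\|_{2\to\infty} \le C$, $\mathbf{1}_n^\top B^* = \mathbf{0}$, $\mathbf{1}_n^\top A^* = \mathbf{0}$, and $(B^*)^\top A^* = \mathbf{0}$.
Condition A3. There exist two positive definite matrices $\Sigma_1$ and $\Sigma_2$ such that $\mathrm{tr}(\Sigma_2) < \kappa\, \mathrm{tr}(\Sigma_1)$ and
$$
\frac{1}{n} (B^*)^\top B^* \to \Sigma_1, \quad \text{with } \lambda_1(\Sigma_1) > ... > \lambda_{K_1}(\Sigma_1) > 0,
$$
$$
\frac{1}{n a_n} (A^*)^\top A^* \to \Sigma_2, \quad \text{with } \lambda_1(\Sigma_2) > ... > \lambda_{K_2}(\Sigma_2) > 0.
$$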
Condition A1 assumes that the probability function $p(t \mid \cdot)$ is neither too steep nor too
flat in the feasible domain. It is satisfied by most common link functions, such as the logit
and probit functions. Similar conditions have also been assumed in Bhaskar (2016) for quantized
matrix completion. Condition A2 assumes that the embedding vectors are centralized,
and the two embedding matrices $B^*$ and $A^*$ are orthogonal to each other. Condition A3
quantifies the difference in the relative scales of the true balance structure and the anomaly
effect. It holds true with high probability if $\beta_i^*$ are independent copies from a mean zero
distribution in $\mathbb{R}^{K_1}$, whose covariance matrix $\Sigma_1$ has $K_1$ different positive eigenvalues; and
$\alpha_i^*$ are independent copies from a mixture distribution $(1 - a_n) f_0 + a_n f_1$, where $f_0$ is the
Dirac delta distribution with point mass at $\mathbf{0}$, and $f_1$ is a mean zero distribution in $\mathbb{R}^{K_2}$
whose covariance matrix $\Sigma_2$ has $K_2$ different positive eigenvalues.
We first establish the upper bounds on the estimation errors of $\hat{M}$ and $(\hat{L}, \hat{S})$.
Theorem 1. Suppose Conditions A1-A3 hold. Then, with probability at least $1 - C_1 \exp(-C_2 n)$,
it holds true that
$$
\frac{1}{n} \|\hat{M} - M^*\|_F \le r_n, \tag{6}
$$
where $r_n = 4.02 \sqrt{2(K+2) n^{-1}}\, G_{5C^2 + d_0} H^{-1}_{5C^2 + d_0}$, $C_1$ and $C_2$ are universal constants, and $K = K_1 + K_2$.
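Theorem 2. Suppose Conditions A1-A3 hold. Then, there exists a constant $c$ such that,
with probability at least $1 - C_1 \exp(-C_2 n)$,
$$
\frac{1}{n} \|\hat{L} - L^*\|_F \le \frac{c\, C \sqrt{2 K_1 (K+2)}\, G_{5C^2 + d_0} H^{-1}_{5C^2 + d_0}}{\sqrt{n}}, \tag{7}
$$
$$
\frac{1}{n} \|\hat{S} - S^*\|_F \le \frac{c\, C \sqrt{2 K_2 (K+2)}\, G_{5C^2 + d_0} H^{-1}_{5C^2 + d_0}}{\sqrt{n a_n}}. \tag{8}
$$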
Theorems 1 and 2 show that both $M^*$ and $(L^*, S^*)$ can be consistently estimated by $\hat{M}$
and $(\hat{L}, \hat{S})$, respectively. Note that the convergence rate of $\hat{S} - S^*$ depends on the anomaly
rate $a_n$, which reflects the scale of the anomaly effect.
We then turn to establish the asymptotic consistency in terms of community detection.
The following condition on the true community structure is required.
Condition A4. There exists a finite set $\mathcal{B} = \{b_1, ..., b_m\} \subset \mathbb{R}^{K_1}$ such that $\beta_i^* \in \mathcal{B}$, for
any $i = 1, ..., n$.
Condition A4 is equivalent to the signed stochastic block model (Jiang, 2015), which
assumes that the embedding vector of each node is fully determined by its community membership.
Let $\psi_i^* \in [m]$ be the community index for node $i$ such that $\beta_i^* = b_{\psi_i^*}$. Then the $l$-th
community is defined as
$$
N_l^* = \{ i \in [n] : \psi_i^* = l \}, \quad \text{for } l = 1, ..., m,
$$
which is invariant up to some permutation of the community index.
Further, define $\xi_n = \min_{l \in [m]} |N_l^*| / n$ as the proportion of nodes in the smallest community,
and $\zeta_n = \min_{1 \le l < k \le m} \|b_l - b_k\|$ as the minimal distance between the $b_l$'s. Both $\xi_n$ and $\zeta_n$
are allowed to converge to 0 as $n$ diverges.
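Theorem 3. Suppose Conditions A1-A4 hold. Further suppose
$$
K_1 r_n^2 = o(\xi_n \zeta_n^2), \tag{9}
$$
where $r_n$ is defined as in Theorem 1. Then, there exists a constant $c$ such that, with probability
at least $1 - C_1 \exp(-C_2 n)$,
$$
\min_{\{p_1, ..., p_m\}} \frac{1}{n} \sum_{l=1}^{m} |N_l^* \setminus \hat{N}_{p_l}| \le \frac{c K_1 (K+2)\, G^2_{5C^2 + d_0} H^{-2}_{5C^2 + d_0}}{n \zeta_n^2}, \tag{10}
$$
and
$$
\min_{\{p_1, ..., p_m\}} \max_{1 \le l \le m} \frac{|N_l^* \setminus \hat{N}_{p_l}|}{|N_l^*|} \le \frac{c K_1 (K+2)\, G^2_{5C^2 + d_0} H^{-2}_{5C^2 + d_0}}{n \xi_n \zeta_n^2}. \tag{11}
$$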
In Theorem 3, (10) gives a bound for the overall proportion of mis-clustered nodes, while
(11) gives a bound for the worst-case proportion of mis-clustered nodes in each community.
To guarantee detection consistency, (9) requires that the quantity $\xi_n \zeta_n^2$ converges to 0,
if at all, at a slower rate than $K_1 r_n^2$. If $\xi_n = \Omega(1)$, $\zeta_n = \Omega(1)$, and $K_1, K_2, C$ are fixed, then the
convergence rates in both (10) and (11) are of order $n^{-1}$, which matches the best
existing results for unsigned networks in the literature (Lei and Rinaldo, 2015).
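The left-hand side of (10) can be computed by brute force over label permutations, since $\sum_l |N_l^* \setminus \hat{N}_{p_l}|$ counts the nodes whose estimated label differs from the relabelled true label. The sketch below assumes labels coded as $0, ..., m-1$ and uses a hypothetical function name; the exhaustive search costs $m!$ evaluations, which is acceptable for the small $m$ considered in this paper.

```python
from itertools import permutations

import numpy as np


def community_detection_error(true_labels, est_labels, m):
    """Overall proportion of mis-clustered nodes, minimized over relabellings.

    Labels are assumed to take values in {0, ..., m - 1}.
    """
    true_labels = np.asarray(true_labels)
    est_labels = np.asarray(est_labels)
    best = 1.0
    for perm in permutations(range(m)):
        relabelled = np.array(perm)[true_labels]  # p_{psi*_i} for every node i
        best = min(best, float(np.mean(relabelled != est_labels)))
    return best


print(community_detection_error([0, 0, 1, 1], [1, 1, 0, 0], m=2))
print(community_detection_error([0, 0, 1, 1], [1, 1, 0, 1], m=2))
```

The first call returns zero because the two labelings agree up to a permutation; in the second, one of four nodes is mis-clustered.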
Finally, we establish the asymptotic consistency in terms of anomaly detection. The
following condition on the sparsity of S∗is required.
Condition A5. There exists a universal constant $c_0 > 0$ such that
$$
c_0^{-1} n^2 a_n^2 \le \|S^*\|_0 \le c_0 n^2 a_n^2, \tag{12}
$$
$$
C \sqrt{K_2}\, r_n a_n^{-3/2} = o(s_{\min}), \tag{13}
$$
where $r_n$ is defined as in Theorem 1, and $s_{\min} = \min_{s_{ij}^* \neq 0} |s_{ij}^*|$.
In Condition A5, (12) ensures that the number of nonzero entries in $S^*$ is of order $n^2 a_n^2$, and
(13) requires that the minimal absolute value of the nonzero entries in $S^*$ is not too close to
zero. Then, with a proper choice of $\eta_n$, Theorem 4 establishes an upper bound for the false
discovery proportion of $\hat{S}$.
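Theorem 4. Suppose Conditions A1-A3 and A5 hold and the thresholding parameter $\eta_n$ is
set so that
$$
\eta_n = o(s_{\min}) \quad \text{and} \quad C \sqrt{K_2}\, r_n a_n^{-3/2} = o(\eta_n), \tag{14}
$$
where $r_n$ is defined as in Theorem 1. Then, there exists a constant $c$ such that, with probability
at least $1 - C_1 \exp(-C_2 n)$,
$$
\frac{\#\{(i, j) : |\hat{s}_{ij}| > \eta_n, \ s_{ij}^* = 0\}}{\#\{(i, j) : |\hat{s}_{ij}| > \eta_n\} \vee 1} \le \frac{c\, C^2 K_2 (K+2)\, G^2_{5C^2 + d_0} H^{-2}_{5C^2 + d_0}}{n a_n^3 \eta_n^2}. \tag{15}
$$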
It is clear that the upper bound in (15) converges to 0 as $n$ diverges, whose convergence
rate is governed by $n$, $a_n$ and $\eta_n$. Particularly, when $K$ and $C$ are fixed, the false discovery
proportion of $\hat{S}$ converges to zero at a fast rate of $n^{-1/2} \log n$, provided that $a_n^3 \eta_n^2$ is set as
the same order of $(\sqrt{n} \log n)^{-1}$.
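The false discovery proportion on the left-hand side of (15) is straightforward to evaluate once the true $S^*$ is known, as in a simulation. A minimal sketch with a hypothetical function name and toy inputs:

```python
import numpy as np


def false_discovery_proportion(S_hat, S_true, eta):
    """Share of entries flagged by hard thresholding whose true value is zero.

    Implements #{|s_hat| > eta, s* = 0} / max(#{|s_hat| > eta}, 1), as in (15).
    """
    S_hat = np.asarray(S_hat, dtype=float)
    S_true = np.asarray(S_true, dtype=float)
    detected = np.abs(S_hat) > eta
    false_discoveries = detected & (S_true == 0)
    return false_discoveries.sum() / max(detected.sum(), 1)


S_true = np.array([[0.0, 1.0], [1.0, 0.0]])
S_hat = np.array([[0.5, 0.9], [0.9, 0.0]])
print(false_discovery_proportion(S_hat, S_true, eta=0.3))  # one false flag out of three
print(false_discovery_proportion(S_hat, S_true, eta=2.0))  # nothing flagged
```

The maximum with 1 in the denominator simply avoids division by zero when no entry exceeds the threshold.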
4 Numerical Experiments
We examine the finite-sample performance of the proposed method in terms of both community
detection and anomaly detection. For community detection, we compare the proposed
method, denoted as SNE, with two existing signed network community detection methods in
the literature (Chiang et al., 2012; Cucuringu et al., 2019), as well as a naive embedding method
which only considers the balance structure $L$ while ignoring the anomaly effect $S$; they are denoted as
BNC, SPONGE (abbreviated as SPO in the tables) and naive, respectively. Their community detection accuracy is measured
by the overall community detection error rate in (10). Furthermore, as existing methods
rarely consider anomaly detection, we just report the false discovery proportion (15) of SNE
in different scenarios to demonstrate its effectiveness in anomaly detection.
The following two synthetic networks are considered.
Example 1. The synthetic network is generated from the SSBM model with anomalies.
Specifically, we set $m = 4$, and generate $\psi_i^*$ from a multinomial distribution on $\{1, 2, 3, 4\}$
with probability $(0.1, 0.2, 0.3, 0.4)$. Let $\beta_i^* = b_{\psi_i^*}$ with $b_{\psi_i^*} \sim N(\mathbf{0}, I_3)$, and $\alpha_i^* \sim (1 - a_n) f_0 + a_n f_1$,
where $f_0$ is the Dirac delta distribution with point mass at $\mathbf{0}$ and $f_1 = 0.5 N(\mathbf{1}_3, \Omega_2) +
0.5 N(-\mathbf{1}_3, \Omega_2)$ is a Gaussian mixture distribution with $\Omega_2$ being a $3 \times 3$ diagonal matrix
with diagonal entries independently generated from a uniform distribution on $[0, 0.1]$.
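A possible simulation of Example 1 is sketched below. The text does not state the link function or the intercepts used to draw $Y$, so the logistic link and the values of `d0` and `d1` are assumptions, as are the function name, the reading of $\Omega_2$ as a covariance matrix, and the convention of setting the diagonal of $Y$ to zero.

```python
import numpy as np


def simulate_example1(n, a_n, d0=1.0, d1=-1.0, seed=0):
    """Draw (Y, psi, B, A) following the description of Example 1."""
    rng = np.random.default_rng(seed)
    m, K = 4, 3
    # Community labels (coded 0..3) and community centers b_l ~ N(0, I_3)
    psi = rng.choice(m, size=n, p=[0.1, 0.2, 0.3, 0.4])
    centers = rng.normal(size=(m, K))
    B = centers[psi]
    # alpha_i ~ (1 - a_n) * delta_0 + a_n * [0.5 N(1_3, Omega) + 0.5 N(-1_3, Omega)]
    omega_diag = rng.uniform(0.0, 0.1, size=K)  # diagonal of Omega_2, read as variances
    signs = rng.choice([-1.0, 1.0], size=n)
    A = signs[:, None] * np.ones(K) + rng.normal(size=(n, K)) * np.sqrt(omega_diag)
    A[rng.random(n) >= a_n] = 0.0  # non-anomalous nodes get alpha_i = 0
    # m_ij = -||beta_i - beta_j||^2 + alpha_i^T alpha_j
    sq = np.sum(B ** 2, axis=1)
    M = -(sq[:, None] + sq[None, :] - 2.0 * B @ B.T) + A @ A.T
    # Sample y_ij from model (1): Pr(y = 1) = F1, Pr(y >= 0) = F0
    F0 = 1.0 / (1.0 + np.exp(-(d0 + M)))
    F1 = 1.0 / (1.0 + np.exp(-(d1 + M)))
    U = rng.random((n, n))
    Y = np.where(U < F1, 1, np.where(U < F0, 0, -1))
    Y = np.triu(Y, 1)  # keep the upper triangle, then mirror for symmetry
    Y = Y + Y.T
    return Y, psi, B, A


Y, psi, B, A = simulate_example1(n=200, a_n=0.1)
print((Y == 1).sum() // 2, "positive edges;", (Y == -1).sum() // 2, "negative edges")
```

Example 2 differs only in the label probabilities and in adding $N(\mathbf{0}, 0.01 I_3)$ noise to each $\beta_i^*$ around its community center.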
Example 2. The synthetic network is generated from a mixture model with anomalies.
Specifically, for community structure, we set $m = 4$, and generate $\psi_i^*$ from a multinomial
distribution on $\{1, 2, 3, 4\}$ with probability $(0.25, 0.25, 0.25, 0.25)$. Let $\beta_i^* \sim N(b_{\psi_i^*}, 0.01 I_3)$
with $b_{\psi_i^*} \sim N(\mathbf{0}, I_3)$, and $\alpha_i^*$ is generated similarly as in Example 1.
Various scenarios in each example are considered, with $a_n \in \{0, 0.1, 0.2, 0.3\}$ and
$n \in \{200, 500, 1000\}$. For each scenario, the averaged community detection errors for all the
methods over 50 independent replications, together with their standard errors, are reported
in Tables 1 and 2.
Table 1: The averaged community detection errors of various methods over 50 independent
replications and their standard errors (in parentheses) in Example 1.
n      Method   a_n = 0           a_n = 0.1         a_n = 0.2         a_n = 0.3
200    SNE      0.0426 (0.0054)   0.0507 (0.0051)   0.0914 (0.0076)   0.0787 (0.0067)
200    naive    0.0426 (0.0054)   0.0443 (0.0043)   0.1051 (0.0079)   0.1649 (0.0043)
200    BNC      0.1333 (0.0034)   0.1302 (0.0042)   0.1253 (0.0027)   0.1348 (0.0038)
200    SPO      0.2213 (0.0032)   0.2143 (0.0034)   0.1381 (0.0066)   0.1926 (0.0032)
500    SNE      0.0062 (0.0018)   0.0120 (0.0025)   0.0074 (0.0018)   0.0844 (0.0042)
500    naive    0.0062 (0.0018)   0.0174 (0.0031)   0.0462 (0.0028)   0.1518 (0.0024)
500    BNC      0.1162 (0.0013)   0.1282 (0.0026)   0.1207 (0.0014)   0.1263 (0.0018)
500    SPO      0.2252 (0.0015)   0.1010 (0.0030)   0.1834 (0.0027)   0.1984 (0.0012)
1000   SNE      0.0058 (0.0013)   0.0115 (0.0018)   0.0083 (0.0013)   0.0640 (0.0024)
1000   naive    0.0058 (0.0013)   0.0173 (0.0022)   0.0461 (0.0014)   0.1619 (0.0013)
1000   BNC      0.1232 (0.0017)   0.1099 (0.0008)   0.1148 (0.0011)   0.1279 (0.0012)
1000   SPO      0.2144 (0.0020)   0.1461 (0.0007)   0.1942 (0.0010)   0.1546 (0.0017)
Table 2: The averaged community detection errors of various methods over 50 independent
replications and their standard errors (in parentheses) in Example 2.
n      Method   a_n = 0           a_n = 0.1         a_n = 0.2         a_n = 0.3
200    SNE      0.0326 (0.0047)   0.0601 (0.009)    0.0496 (0.0067)   0.0795 (0.0068)
200    naive    0.0326 (0.0047)   0.0519 (0.008)    0.0463 (0.0066)   0.0965 (0.0045)
200    BNC      0.2991 (0.0056)   0.3260 (0.007)    0.2923 (0.0051)   0.3023 (0.0049)
200    SPO      0.2031 (0.0113)   0.1694 (0.012)    0.0710 (0.0086)   0.1251 (0.0092)
500    SNE      0.0330 (0.0048)   0.0359 (0.0047)   0.0227 (0.0028)   0.0648 (0.0034)
500    naive    0.0330 (0.0048)   0.0249 (0.0027)   0.0484 (0.0036)   0.1606 (0.0043)
500    BNC      0.3274 (0.0057)   0.3160 (0.0056)   0.2438 (0.0032)   0.2623 (0.0035)
500    SPO      0.0828 (0.0065)   0.0306 (0.0026)   0.1425 (0.0057)   0.2283 (0.0052)
1000   SNE      0.0191 (0.0029)   0.0308 (0.0035)   0.0294 (0.0028)   0.0777 (0.0045)
1000   naive    0.0191 (0.0029)   0.0279 (0.0019)   0.0387 (0.0035)   0.0808 (0.0041)
1000   BNC      0.3282 (0.0040)   0.2921 (0.0035)   0.2496 (0.0053)   0.2628 (0.0058)
1000   SPO      0.0315 (0.0035)   0.0726 (0.0037)   0.1047 (0.0049)   0.0969 (0.0047)
It is clear from Tables 1 and 2 that SNE outperforms the other three competitors in
most scenarios. Particularly, SNE and naive yield the same numerical performance when
$a_n = 0$, but the advantage of SNE becomes more and more substantial when both $n$ and
$a_n$ increase, which confirms the benefit of incorporating the anomaly effects into the signed
network modeling. Furthermore, neither BNC nor SPO produces satisfactory and stable
numerical performance; for instance, SPO yields reasonable performance in Example 2, but much
worse performance than other methods in Example 1.
In addition, the averaged false discovery proportions of SNE over 50 independent replications
and their standard errors are reported in Table 3. It is evident that its false discovery
proportion decreases as $n$ grows, confirming the asymptotic convergence established in Theorem 4.
It is also interesting to remark that, for the largest networks ($n = 1000$), the performance of SNE in anomaly detection
deteriorates as $a_n$ increases, which is possibly due to the fact that the difference between the
anomaly effect and balance structure shrinks as $a_n$ grows, making anomaly detection more
challenging.
Table 3: The averaged false discovery proportions of SNE over 50 independent replications
and their standard errors (in parentheses).
Example     n      a_n = 0.1         a_n = 0.2         a_n = 0.3
Example 1   200    0.9736 (0.0017)   0.7253 (0.0081)   0.4320 (0.0143)
Example 1   500    0.2349 (0.0038)   0.1538 (0.0081)   0.2117 (0.0059)
Example 1   1000   0.0022 (0.0003)   0.0585 (0.0043)   0.1831 (0.0052)
Example 2   200    0.9651 (0.0030)   0.7859 (0.0125)   0.6496 (0.0216)
Example 2   500    0.2788 (0.0063)   0.1103 (0.0043)   0.2365 (0.0099)
Example 2   1000   0.0287 (0.0034)   0.0290 (0.0018)   0.1910 (0.0057)
5 International Relation Network
We now apply the proposed SNE method to analyze the international relation network,
which is constructed based on the Correlates of War dataset during 1993-2014 (Maoz et al.,
2019). In this network, we set $y_{ij} = -1$ if there was ever a conflict between countries $i$ and
$j$, $y_{ij} = 1$ if countries $i$ and $j$ were always in alliance and never had conflict during the whole
time period, and $y_{ij} = 0$ if countries $i$ and $j$ had neither alliance nor conflict. This leads to
a signed network with 152 nodes, 2026 positive edges and 694 negative edges.
We first set $K_1 = K_2 = m - 1$, and employ the Bayesian information criterion (Saldana
et al., 2017) to select $m = 6$, which is consistent with some existing studies (Traag and
Bruggeman, 2009; Jiang, 2015). We then apply SNE with $C = 2$, $\kappa = 1$, $a_n = 0.1$ on this
signed network to obtain the embedding vectors $\hat{B}$ and $\hat{S}$. We further perform a $(1 + \epsilon)$-approximation
of the K-means algorithm on $\{\hat{\beta}_i\}_{i=1}^n$, and obtain the estimated community
membership $\hat{N}_l$ for $l = 1, ..., 6$. In addition, we set $\eta_n$ as the median of the absolute values
of all $\hat{s}_{ij}$'s, and denote $\tilde{s}_{ij} = \hat{s}_{ij} 1\{|\hat{s}_{ij}| > \eta_n\}$.
Displayed in Panel (a) of Figure 2 is the heatmap of $\hat{L}$ rearranged according to $\hat{N}_l$,
showing a clear block diagonal structure produced by SNE. Here the darker the color is, the
larger $\hat{l}_{ij}$ is, which also indicates a smaller distance $\|\hat{\beta}_i - \hat{\beta}_j\|$. Displayed in Panel (b) of
Figure 2 is a side-by-side boxplot of $\tilde{s}_{ij}$, where the left boxplot shows the distribution of $\tilde{s}_{ij}$
with $i, j$ in the same community but $y_{ij} = -1$, and the right boxplot shows the distribution
of $\tilde{s}_{ij}$ with $i, j$ in different communities but $y_{ij} = 1$. It is evident that most $\tilde{s}_{ij}$'s in the left
boxplot are negative, while those $\tilde{s}_{ij}$'s in the right boxplot tend to be positive, confirming
the validity of SNE in anomaly detection.
Figure 2: The heatmap for the rearranged $\hat{L}$ according to the estimated community membership
is displayed in Panel (a), and a side-by-side boxplot for $\tilde{s}_{ij}$ is displayed in Panel (b).
We also color the world map according to the estimated community membership in
Figure 3, where countries colored in grey are not included in the dataset. The detailed
country list of each community can be found in Appendix C. It is clear from Figure 3 that
the first and largest community contains the United States and its political allies, including
most western European countries, Canada, Australia and some countries in the Middle East
such as Israel, Saudi Arabia and United Arab Emirates. The third community consists of
Russia and its political allies including Yugoslavia, Vietnam and some countries in central
Asia such as Turkmenistan and Kyrgyzstan. The fourth community consists of countries in
East Asia including China and Japan, and countries in central Africa including Tanzania
and Central African Republic. It is interesting to note that the community structure in Africa is
rather complicated, which is probably due to the frequent conflicts between African countries
during the time period.
It is also interesting to note that both China and Japan are in the fourth community, but
the estimated anomaly effect between them is negative with $\tilde{s}_{ij} = -0.055$, suggesting a non-constructive
relationship between these two countries. Similarly, although Israel and Turkey
are both allies of the United States and contained in the first community, the estimated
anomaly effect between them is also negative with $\tilde{s}_{ij} = -0.067$. As for positive anomalies,
although China and Pakistan, as well as China and Israel, are in different communities, the estimated
anomaly effect between China and Pakistan is $\tilde{s}_{ij} = 0.033$, and that between China and
Israel is $\tilde{s}_{ij} = 0.044$. All these estimated anomaly effects are well expected due to their
well-known historical conflicts or friendships.
Figure 3: World map with countries in different communities (horizontal axis: longitude; vertical axis: latitude; legend: community 1 to 6 and NA).
6 Discussion
In this article, we propose a unified embedding model for signed networks, which is one of the
first attempts to incorporate both balance structure and anomaly effect in signed network
modeling. Asymptotic analysis has been conducted to assure estimation consistency of the
proposed embedding model. Its applications to community detection and anomaly detection
in signed network are also considered, with sound theoretical justification. The advantage
of the proposed embedding model is supported by extensive numerical experiments on both
synthetic networks and an international relation network. It is also worth noting that the
proposed embedding model is flexible and can be extended to various signed networks, such
as the directed signed networks or multi-layer signed networks.
References
Axelrod, R. and Bennett, D. S. (1993). A landscape theory of aggregation. British journal
of political science, 23(2):211–233.
Bansal, N., Blum, A., and Chawla, S. (2004). Correlation clustering. Machine learning,
56(1):89–113.
Bhaskar, S. A. (2016). Probabilistic low-rank matrix completion from quantized measure-
ments. The Journal of Machine Learning Research, 17:2131–2164.
Candès, E. J., Li, X., Ma, Y., and Wright, J. (2011). Robust principal component analysis?
Journal of the ACM (JACM), 58(3):1–37.
Cartwright, D. and Gleason, T. C. (1966). The number of paths and cycles in a digraph.
Psychometrika, 31(2):179–199.
Cartwright, D. and Harary, F. (1956). Structural balance: a generalization of Heider's theory.
Psychological Review, 63(5):277.
Chandrasekaran, V., Sanghavi, S., Parrilo, P. A., and Willsky, A. S. (2011). Rank-sparsity
incoherence for matrix decomposition. SIAM Journal on Optimization, 21(2):572–596.
Chatterjee, S., Diaconis, P., and Sly, A. (2011). Random graphs with a given degree sequence.
The Annals of Applied Probability, 21(4):1400–1435.
Chen, Y., Wang, X., Yuan, B., and Tang, B. (2014). Overlapping community detection in
networks with positive and negative links. Journal of Statistical Mechanics: Theory and
Experiment, 2014(3):P03021.
Chiang, K.-Y., Hsieh, C.-J., Natarajan, N., Dhillon, I. S., and Tewari, A. (2014). Prediction
and clustering in signed networks: a local to global perspective. The Journal of Machine
Learning Research, 15:1177–1213.
Chiang, K.-Y., Whang, J. J., and Dhillon, I. S. (2012). Scalable clustering of signed networks
using balance normalized cut. In Proceedings of the 21st ACM International Conference
on Information and Knowledge Management, pages 615–624.
Cucuringu, M., Davies, P., Glielmo, A., and Tyagi, H. (2019). Sponge: A generalized
eigenproblem for clustering signed networks. In The 22nd International Conference on
Artificial Intelligence and Statistics, pages 1088–1098. PMLR.
Dattorro, J. (2010). Convex optimization & Euclidean distance geometry. Lulu.com.
Davis, J. A. (1967). Clustering and structural balance in graphs. Human Relations,
20(2):181–187.
Doreian, P. and Mrvar, A. (1996). A partitioning approach to structural balance. Social
Networks, 18(2):149–168.
Easley, D., Kleinberg, J., et al. (2010). Networks, crowds, and markets, volume 8. Cambridge
University Press, Cambridge.
Erdős, P. and Rényi, A. (1960). The evolution of random graphs. Magyar Tud. Akad. Mat.
Kutató Int. Közl, 5:17–61.
Facchetti, G., Iacono, G., and Altafini, C. (2011). Computing global structural balance
in large-scale signed social networks. Proceedings of the National Academy of Sciences,
108(52):20953–20958.
Graham, B. (2017). An econometric model of network formation with degree heterogeneity.
Econometrica, 85(4):1033–1063.
Harary, F. (1953). On the notion of balance of a signed graph. Michigan Mathematical
Journal, 2(2):143–146.
Heider, F. (1946). Attitudes and cognitive organization. The Journal of Psychology,
21(1):107–112.
Hoff, P. D., Raftery, A. E., and Handcock, M. S. (2002). Latent space approaches to social
network analysis. Journal of the American Statistical Association, 97(460):1090–1098.
Holland, P. W., Laskey, K. B., and Leinhardt, S. (1983). Stochastic blockmodels: First steps.
Social Networks, 5:109–137.
Jiang, J. Q. (2015). Stochastic block model and exploratory analysis in signed networks.
Physical Review E, 91(6):062805.
Kumar, A., Sabharwal, Y., and Sen, S. (2004). A simple linear time (1+ε)-approximation
algorithm for k-means clustering in any dimensions. In 45th Annual IEEE
Symposium on Foundations of Computer Science, pages 454–462. IEEE.
Kunegis, J., Lommatzsch, A., and Bauckhage, C. (2009). The slashdot zoo: mining a social
network with negative edges. In Proceedings of the 18th International Conference on World
Wide Web, pages 741–750.
Lei, J. and Rinaldo, A. (2015). Consistency of spectral clustering in stochastic block models.
The Annals of Statistics, 43:215–237.
Leskovec, J., Huttenlocher, D., and Kleinberg, J. (2010). Predicting positive and negative
links in online social networks. In Proceedings of the 19th International Conference on
World Wide Web, pages 641–650.
Li, T., Levina, E., and Zhu, J. (2020). Network cross-validation by edge sampling.
Biometrika, 107:257–276.
Li, Y., Liu, J., and Liu, C. (2014). A comparative analysis of evolutionary and memetic algo-
rithms for community detection from signed social networks. Soft Computing, 18(2):329–
348.
Maoz, Z., Johnson, P. L., Kaplan, J., Ogunkoya, F., and Shreve, A. P. (2019). The dyadic
militarized interstate disputes (MIDs) dataset version 3.0: Logic, characteristics, and
comparisons to alternative datasets. Journal of Conflict Resolution, 63:811–835.
Massa, P. and Avesani, P. (2005). Controversial users demand local trust metrics: An
experimental study on epinions.com community. In AAAI, volume 5, pages 121–126.
Moore, M. (1979). Structural balance and international relations. European Journal of Social
Psychology.
Saldana, D. F., Yu, Y., and Feng, Y. (2017). How many communities are there? Journal of
Computational and Graphical Statistics, 26:171–181.
Sengupta, S. and Chen, Y. (2018). A block model for node popularity in networks with com-
munity structure. Journal of the Royal Statistical Society: Series B (Statistical Methodol-
ogy), 80(2):365–386.
Tang, J., Chang, Y., Aggarwal, C., and Liu, H. (2016). A survey of signed network mining
in social media. ACM Computing Surveys (CSUR), 49(3):1–37.
Traag, V. A. and Bruggeman, J. (2009). Community detection in networks with positive
and negative links. Physical Review E, 80(3):036115.
Yang, B., Cheung, W., and Liu, J. (2007). Community mining from signed social networks.
IEEE Transactions on Knowledge and Data Engineering, 19(10):1333–1348.
Zhang, J., He, X., and Wang, J. (2021). Directed community detection with network em-
bedding. Journal of the American Statistical Association, pages 1–11.
Zhao, Y., Levina, E., and Zhu, J. (2012). Consistency of community detection in networks
under degree-corrected stochastic block models. The Annals of Statistics, 40(4):2266–2292.
Zheng, X., Zeng, D., and Wang, F.-Y. (2015). Social balance in signed networks. Information
Systems Frontiers, 17(5):1077–1095.