All content in this area was uploaded by Dung D. Le on Aug 30, 2018
Multiperspective Graph-Theoretic Similarity Measure
ABSTRACT
1 INTRODUCTION
similar
CIKM ’18, October 22–26, 2018, Torino, Italy
Problem. graph-theoretic
similarity
(i,j)similar
k i l j
similar
uniperspective
naive
Proposed Approach. multiperspective
inter-object
inter-perspective
Contributions.
First
Second
Third
Fourth
Fifth
Finally
2 RELATED WORK
Graph-Theoretic Similarity.
G(V,E)
S(a,b) a,b
\[
S(a,b) =
\begin{cases}
\dfrac{C}{|N(a)|\,|N(b)|} \displaystyle\sum_{i=1}^{|N(a)|} \sum_{j=1}^{|N(b)|} S\big(N_i(a), N_j(b)\big), & a \neq b, \\
1, & a = b
\end{cases}
\]
C N(a) N(b)
a b
a,b
a
S(a,b) = 0 b,a
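The recurrence above can be iterated directly in matrix form. The following is a minimal sketch, assuming an undirected graph given as a dense 0/1 adjacency matrix; the function name `simrank`, the default `C=0.8`, and the fixed iteration count are illustrative choices, not taken from the paper.

```python
import numpy as np

def simrank(adj, C=0.8, iters=10):
    """Iterate the SimRank recurrence on a graph given by a 0/1 adjacency matrix.

    N(a) is read off as the nonzero entries of column a.
    """
    n = adj.shape[0]
    deg = adj.sum(axis=0)
    # Column-normalise; a node without neighbours keeps an all-zero column,
    # so its similarity to every other node stays 0.
    W = np.divide(adj, deg, out=np.zeros((n, n)), where=deg > 0)
    S = np.eye(n)
    for _ in range(iters):
        S = C * (W.T @ S @ W)      # average similarity over neighbour pairs
        np.fill_diagonal(S, 1.0)   # S(a, a) = 1 by definition
    return S

# Path graph 0 - 1 - 2: nodes 0 and 2 share their only neighbour (node 1).
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
S = simrank(adj)
```

In this toy graph, nodes 0 and 2 receive similarity C because their single neighbours coincide, while nodes 0 and 1 stay at 0.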
Multiperspective Similarity.
3 OVERVIEW
3.1 Problem Formulation
O={o1,o2, . . . ,on}
m P={p1,p2, . . . ,pm}
O p∈ P
Gp(O,Ep) Ep⊆ O × O
p
G={G1,G2, . . . ,Gm}
Gp
O
m
Figure 1: Illustration of the Hypergraph Representation
perspective
object
G H=
(X,E) X=P ∪ O
E={(pk,oi,oj) : 1 ≤k≤m; 1 ≤i,j≤n}
(pk,oi,oj)∈ E oi oj
pk (oi,oj)∈Epk Gpk
P={p1,p2}
O={o1,o2,o3,o4} (p1,o1,o4)
(p1,o2,o3) p1 o1
o4 o2 o3
p2o1 o2 o3 o4
H(X,E)
oi,oj∈ O p∈ P
Sp(oi,oj) [0,1]
Sp(oi,oj) = 1 i=j
Problem 1 (Multiperspective Similarity). Given a multiperspective hypergraph $\mathcal{H}$, determine the similarity score $S_p(o_i, o_j)$ for each perspective $p \in \mathcal{P}$ and pair of objects $o_i, o_j \in \mathcal{O}$.
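One possible in-memory representation of the hypergraph is a list of (perspective, object, object) triples, from which the per-perspective neighbour sets can be derived. This is a sketch only: the two `p1` hyperedges follow the example in the text, the `p2` hyperedge is hypothetical, and treating each perspective graph as undirected is an assumption.

```python
from collections import defaultdict

# Hyperedges (p, o_i, o_j): o_i and o_j are linked under perspective p.
hyperedges = [
    ("p1", "o1", "o4"),
    ("p1", "o2", "o3"),
    ("p2", "o1", "o2"),  # hypothetical edge, for illustration only
]

def neighbours(hyperedges):
    """Return N[p][o]: the set of objects adjacent to o in perspective graph G_p."""
    N = defaultdict(lambda: defaultdict(set))
    for p, oi, oj in hyperedges:
        N[p][oi].add(oj)
        N[p][oj].add(oi)  # assumption: perspective graphs are undirected
    return N

N = neighbours(hyperedges)
```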
3.2 Framework for Multiperspective Solutions
Gp
Disjoint-SimRank
Gp
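Table 1: List of Notations
Symbols            Description
$\mathcal{P}$      set of perspectives $\{p_1, p_2, \dots, p_m\}$
$\mathcal{O}$      set of objects $\{o_1, o_2, \dots, o_n\}$
$\mathcal{X}$      set of hypergraph nodes $\mathcal{P} \cup \mathcal{O}$
$\mathcal{E}$      set of hyperedges $\{(p, o_i, o_j)\}$, where $o_i$ and $o_j$ are similar under $p$
$m$                number of perspectives
$n$                number of objects
$N_p(o_i)$         neighbors of $o_i$ under $p$: $\{o_j \in \mathcal{O} \mid (p, o_i, o_j) \in \mathcal{E}\}$
$S_p(o_i, o_j)$    similarity between $o_i$ and $o_j$ under perspective $p$
$S_p$              similarity matrix $[S_p(o_i, o_j)]_{n \times n}$
$W_p$              $n \times n$ column-normalized adjacency matrix of $G_p$, $p \in \mathcal{P}$
$sim(p, p')$       similarity between perspectives $p$ and $p'$
$C$                decay constant
$I_n$              $n \times n$ identity matrix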
sim(p,p′)∈[0,1]
p,p′∈ P sim(p,p′)=1
p=p′ p,p′
Sp(oi,oj) oi oj
p
sim(p,p′)
Sp(oi,oj)
p′
Np(oi) oi Gp
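\[
S_p(o_i, o_j) = \frac{C}{|\mathcal{P}|} \sum_{p' \in \mathcal{P}} sim(p, p') \, \frac{\sum_{o_k \in N_{p'}(o_i)} \sum_{o_l \in N_{p'}(o_j)} S_{p'}(o_k, o_l)}{|N_{p'}(o_i)|\,|N_{p'}(o_j)|},
\]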
Sp= [Sp(oi,oj)]n×n
Wp
p∈ P
\[
S_p = \frac{C}{|\mathcal{P}|} \sum_{p' \in \mathcal{P}} sim(p, p') \cdot W_{p'}^{T} S_{p'} W_{p'}
\]
sim(p,p′)
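To make the link between the element-wise and the matrix form of the framework concrete, the sketch below computes one update both ways and compares them. It assumes that $W_p$ is the column-normalised adjacency matrix of $G_p$ (the reading under which the two forms agree); the helper names, the random toy graphs, and the fixed `sim` values are all illustrative.

```python
import numpy as np

def normalise(adj):
    """Column-normalise: W[k, i] = 1/|N(o_i)| if o_k is a neighbour of o_i."""
    deg = adj.sum(axis=0)
    return np.divide(adj, deg, out=np.zeros(adj.shape), where=deg > 0)

def update_matrix(S, sim, adjs, C):
    """One matrix-form update of every S_p, with the diagonal reset to 1."""
    m = len(adjs)
    props = [normalise(A).T @ S[q] @ normalise(A) for q, A in enumerate(adjs)]
    new = [C / m * sum(sim[p, q] * props[q] for q in range(m)) for p in range(m)]
    for Sp in new:
        np.fill_diagonal(Sp, 1.0)
    return new

def update_entry(S, sim, adjs, C, p, i, j):
    """The same update for a single off-diagonal entry, as explicit sums."""
    m, total = len(adjs), 0.0
    for q, A in enumerate(adjs):
        Ni, Nj = np.flatnonzero(A[:, i]), np.flatnonzero(A[:, j])
        if len(Ni) and len(Nj):
            inner = sum(S[q][k, l] for k in Ni for l in Nj)
            total += sim[p, q] * inner / (len(Ni) * len(Nj))
    return C / m * total

# Two random undirected perspective graphs over four objects (toy data).
rng = np.random.default_rng(0)
adjs = []
for _ in range(2):
    A = np.triu(rng.integers(0, 2, size=(4, 4)), 1)
    adjs.append((A + A.T).astype(float))
S = [np.eye(4), np.eye(4)]
sim = np.array([[1.0, 0.5], [0.5, 1.0]])  # assumed perspective similarities
new = update_matrix(S, sim, adjs, C=0.8)
```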
4 STRAIGHTFORWARD SOLUTION: PIPELINED-SIMRANK
sim(p,p′) Sp(oi,oj)
sim(p,p′)
H
4.1 Inter-Perspective Similarity
p
Gp
p p′
Gp Gp′ p
p′
H= ({P ,O},E)
B
P O × O
p oij
B (p,oi,oj)∈ E H
Figure 2: Bipartite graph for computing similarity between perspective nodes
B
sim(p,p′)
Sp(oi,oj)
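The bipartite construction can be sketched as follows: perspectives form one side, object pairs the other, and a perspective is linked to a pair whenever the corresponding hyperedge exists. The similarity computation below uses the standard bipartite SimRank recurrence with a single decay constant; that choice, the function names, and the toy hyperedges are assumptions rather than details confirmed by the text.

```python
import numpy as np

def bipartite_transform(hyperedges, perspectives):
    """Biadjacency matrix of B: rows are perspectives, columns are object pairs."""
    pairs = sorted({tuple(sorted((oi, oj))) for _, oi, oj in hyperedges})
    col = {pair: c for c, pair in enumerate(pairs)}
    B = np.zeros((len(perspectives), len(pairs)))
    for p, oi, oj in hyperedges:
        B[perspectives.index(p), col[tuple(sorted((oi, oj)))]] = 1.0
    return B, pairs

def bipartite_simrank(B, C=0.8, iters=10):
    """Bipartite SimRank; returns the perspective-by-perspective similarities."""
    m, q = B.shape
    R = B / np.maximum(B.sum(axis=1, keepdims=True), 1)  # averages over a perspective's pairs
    K = B / np.maximum(B.sum(axis=0, keepdims=True), 1)  # averages over a pair's perspectives
    SP, SQ = np.eye(m), np.eye(q)
    for _ in range(iters):
        SP_next = C * (R @ SQ @ R.T)
        SQ_next = C * (K.T @ SP @ K)
        np.fill_diagonal(SP_next, 1.0)
        np.fill_diagonal(SQ_next, 1.0)
        SP, SQ = SP_next, SQ_next
    return SP

# Toy data: p1 and p2 contain the same edge, p3 shares nothing with them.
perspectives = ["p1", "p2", "p3"]
hyperedges = [("p1", "o1", "o2"), ("p2", "o1", "o2"), ("p3", "o3", "o4")]
B, pairs = bipartite_transform(hyperedges, perspectives)
SP = bipartite_simrank(B)
```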
4.2 Learning Algorithm
sim(p,p′) p,p′∈ P
sim(p,p′)
$S^{(0)}_p(*, *)\ \forall p \in \mathcal{P}$
$S^{(0)}_p(o_i, o_j) = 0$ for $i \neq j$, and $1$ for $i = j$
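Algorithm 1
Require: $\mathcal{H}$
  -- create bipartite graph from hypergraph --
  $\mathcal{B} \leftarrow \mathrm{bipartiteTransform}(\mathcal{H})$
  -- compute the similarity between perspectives --
  $\{sim^{(*)}(p, p')\}_{\forall p, p' \in \mathcal{P}} \leftarrow \mathrm{bipartiteSimRank}(\mathcal{B})$
  Initialize $S^{(0)}_p \leftarrow I_n,\ \forall p \in \mathcal{P}$
  while not converged do
    $S^{(t+1)}_p(o_i, o_j) = \frac{C}{|\mathcal{P}|} \sum_{p' \in \mathcal{P}} sim^{(*)}(p, p') \times \frac{\sum_{o_k \in N_{p'}(o_i)} \sum_{o_l \in N_{p'}(o_j)} S^{(t)}_{p'}(o_k, o_l)}{|N_{p'}(o_i)|\,|N_{p'}(o_j)|}$, (for $1 \le i, j \le n$)
    $S^{(t+1)}_p(o_i, o_i) = 1$ (for $1 \le i \le n$)
  end while
  Return $\{S^{converged}_p(o_i, o_j),\ \forall p \in \mathcal{P},\ o_i, o_j \in \mathcal{O}\}$ and $\{sim^{(*)}(p, p'),\ \forall p, p' \in \mathcal{P}\}$.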
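The second stage of Algorithm 1 (the iteration with the perspective similarities held fixed) might look like the sketch below. The perspective similarity matrix `sim` is passed in as if it had already been produced by the bipartite step; the stopping rule based on `tol` and `max_iter`, and the toy inputs, are assumptions, since the loop condition is not legible in the source.

```python
import numpy as np

def normalise(adj):
    """Column-normalised adjacency matrix W_p of one perspective graph."""
    deg = adj.sum(axis=0)
    return np.divide(adj, deg, out=np.zeros(adj.shape), where=deg > 0)

def pipelined_simrank(adjs, sim, C=0.8, tol=1e-6, max_iter=100):
    """Iterate every S_p with the perspective similarities `sim` kept fixed."""
    m, n = len(adjs), adjs[0].shape[0]
    Ws = [normalise(A) for A in adjs]
    S = [np.eye(n) for _ in range(m)]          # S_p^(0) = I_n
    for _ in range(max_iter):
        props = [W.T @ Sq @ W for W, Sq in zip(Ws, S)]
        new = []
        for p in range(m):
            Sp = C / m * sum(sim[p, q] * props[q] for q in range(m))
            np.fill_diagonal(Sp, 1.0)          # S_p(o_i, o_i) = 1
            new.append(Sp)
        delta = max(np.abs(a - b).max() for a, b in zip(new, S))
        S = new
        if delta < tol:                        # assumed convergence test
            break
    return S

# Toy input: the same path graph 0 - 1 - 2 under two perspectives.
path = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
adjs = [path, path.copy()]
sim = np.array([[1.0, 0.5], [0.5, 1.0]])       # assumed output of the first stage
S = pipelined_simrank(adjs, sim)
```

Here objects 0 and 2 share their only neighbour in both perspectives, so each perspective contributes its weight `sim[p, q]` to the score.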
4.3 Convergence Property
The sequence of perspective-specific similarity scores produced by Algorithm 1 is non-decreasing and bounded by [0,1], i.e., for $p \in \mathcal{P}$, $o_i, o_j \in \mathcal{O}$, $t \ge 0$,
\[
1 \ge S^{(t+1)}_p(o_i, o_j) \ge S^{(t)}_p(o_i, o_j) \ge 0.
\]
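Proof: From (5),
\[
S^{(1)}_p(o_i, o_j) \ge 0 = S^{(0)}_p(o_i, o_j), \quad \forall p \in \mathcal{P},\ o_i \neq o_j \in \mathcal{O},
\]
\[
S^{(1)}_p(o_i, o_i) = 1 = S^{(0)}_p(o_i, o_i), \quad \forall p \in \mathcal{P},\ o_i \in \mathcal{O},
\]
so the claim holds for $t = 0$, and by induction it holds $\forall t \ge 1$. Each sequence $\{S^{(t)}_p(o_i, o_j)\}_{t \ge 0}$ is therefore monotone and bounded, so $\{S^{(t)}_p(o_i, o_j)\}_{t \ge 0}$ converges to a limit $S_p(o_i, o_j) \in [0,1]$. Algorithm 1 returns $\{S_p(o_i, o_j)\}$ and $\{sim^{(*)}(p, p')\}$.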
5 JOINT SOLUTION: MP-SIMRANK
joint
sim(p,p′)
Sp(oi,oj)
5.1 Inter-Perspective Similarity
sim(p,p′)
Sp
sim(p,p′) Sp
\[
sim(p, p') = 1 - \frac{\lVert S_p - S_{p'} \rVert_F}{n},
\]
$p, p'$
$S_p$, $S_{p'}$
$\lVert S_p - S_{p'} \rVert_F / n$, $sim(p, p')$
$S_p$, $S_{p'}$
$\lVert S_p - S_{p'} \rVert_F / n$
$sim(p, p')$
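The Frobenius-based definition translates into a few lines of NumPy. This is a small sketch with a hypothetical helper name and toy matrices; because every entry of $S_p$ lies in [0,1], the normalised distance cannot exceed 1, so the result stays in [0,1].

```python
import numpy as np

def perspective_similarity(S):
    """sim(p, p') = 1 - ||S_p - S_p'||_F / n for every pair of perspectives."""
    m, n = len(S), S[0].shape[0]
    sim = np.empty((m, m))
    for p in range(m):
        for q in range(m):
            sim[p, q] = 1.0 - np.linalg.norm(S[p] - S[q], ord="fro") / n
    return sim

# Toy similarity matrices for two perspectives over two objects.
S = [np.eye(2), np.ones((2, 2))]
sim = perspective_similarity(S)
```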
5.2 Learning Algorithm
S(0)
p(∗,∗)∀p∈
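Algorithm 2
Require: $\mathcal{H}$
  $S^{(0)}_p \leftarrow I_n,\ \forall p \in \mathcal{P}$
  $sim^{(0)}(p, p') = 1$ if $p = p'$, $0$ if $p \neq p'$
  while not converged do
    $S^{(t+1)}_p(o_i, o_j) = \frac{C}{|\mathcal{P}|} \sum_{p' \in \mathcal{P}} sim^{(t)}(p, p') \times \frac{\sum_{o_k \in N_{p'}(o_i)} \sum_{o_l \in N_{p'}(o_j)} S^{(t)}_{p'}(o_k, o_l)}{|N_{p'}(o_i)|\,|N_{p'}(o_j)|}$, (for $1 \le i, j \le n$)
    $S^{(t+1)}_p(o_i, o_i) = 1$ (for $1 \le i \le n$)
    $sim^{(t+1)}(p, p') = 1 - \frac{\lVert S^{(t+1)}_p - S^{(t+1)}_{p'} \rVert_F}{n},\ \forall p, p' \in \mathcal{P}$
  end while
  Return $\{S^{converged}_p(o_i, o_j),\ \forall p \in \mathcal{P},\ o_i, o_j \in \mathcal{O}\}$ and $\{sim^{converged}(p, p'),\ \forall p, p' \in \mathcal{P}\}$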
P
$sim^{(0)}(p, p')\ \forall p, p' \in \mathcal{P}$
$sim^{(0)}(p, p') = 0$ for $p \neq p'$, and $1$ for $p = p'$
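The joint update can be sketched as a loop that alternates the inter-object step and the Frobenius-based inter-perspective step. The version below runs a fixed number of iterations (`iters`) and records every intermediate `sim` so its evolution can be inspected; both choices, as well as the function names and toy graphs, are assumptions made for illustration.

```python
import numpy as np

def normalise(adj):
    """Column-normalised adjacency matrix W_p of one perspective graph."""
    deg = adj.sum(axis=0)
    return np.divide(adj, deg, out=np.zeros(adj.shape), where=deg > 0)

def mp_simrank(adjs, C=0.8, iters=5):
    """Jointly update inter-object (S_p) and inter-perspective (sim) similarities."""
    m, n = len(adjs), adjs[0].shape[0]
    Ws = [normalise(A) for A in adjs]
    S = [np.eye(n) for _ in range(m)]   # S_p^(0) = I_n
    sim = np.eye(m)                     # sim^(0): 1 if p = p', else 0
    history = [sim]
    for _ in range(iters):
        props = [W.T @ Sq @ W for W, Sq in zip(Ws, S)]
        new_S = []
        for p in range(m):
            Sp = C / m * sum(sim[p, q] * props[q] for q in range(m))
            np.fill_diagonal(Sp, 1.0)
            new_S.append(Sp)
        # np.linalg.norm on a 2-D array defaults to the Frobenius norm.
        sim = np.array([[1.0 - np.linalg.norm(a - b) / n for b in new_S]
                        for a in new_S])
        S = new_S
        history.append(sim)
    return S, sim, history

# Toy data: perspectives 0 and 1 are the same star graph, perspective 2 is a matching.
star = np.zeros((4, 4))
star[0, 1:] = star[1:, 0] = 1.0
matching = np.zeros((4, 4))
matching[0, 1] = matching[1, 0] = matching[2, 3] = matching[3, 2] = 1.0
S, sim, history = mp_simrank([star, star.copy(), matching])
```

Two perspectives with identical graphs stay at similarity 1 throughout, and the recorded `history` can be used to check the monotonicity stated in Section 5.3.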
5.3 Convergence Property
The sequence of similarities between perspectives produced by Algorithm 2 is non-decreasing and bounded by [0,1], i.e., for $t \ge 1$,
\[
1 \ge sim^{(t+1)}(p, p') \ge sim^{(t)}(p, p') \ge 0, \quad \forall p, p' \in \mathcal{P}.
\]
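Proof: We show that, for $t \ge 0$,
\[
\lVert S^{(t+1)}_p - S^{(t+1)}_{p'} \rVert_F \le \lVert S^{(t)}_p - S^{(t)}_{p'} \rVert_F, \quad \forall p, p' \in \mathcal{P}.
\]
\[
\begin{aligned}
\lVert S^{(t+1)}_p - S^{(t+1)}_{p'} \rVert_F
&= \left\lVert \frac{C}{|\mathcal{P}|} \sum_{p'' \in \mathcal{P}} \big( sim^{(t)}(p, p'') - sim^{(t)}(p', p'') \big)\, W_{p''}^{T} \cdot S^{(t)}_{p''} \cdot W_{p''} \right\rVert_F \\
&\le \frac{C}{|\mathcal{P}|} \sum_{p'' \in \mathcal{P}} \big| sim^{(t)}(p, p'') - sim^{(t)}(p', p'') \big| \cdot \lVert W_{p''}^{T} \cdot S^{(t)}_{p''} \cdot W_{p''} \rVert_F \\
&= \frac{C}{|\mathcal{P}|} \sum_{p'' \in \mathcal{P}} \Big| \lVert S^{(t)}_p - S^{(t)}_{p''} \rVert_F - \lVert S^{(t)}_{p'} - S^{(t)}_{p''} \rVert_F \Big| \cdot \frac{\lVert W_{p''}^{T} \cdot S^{(t)}_{p''} \cdot W_{p''} \rVert_F}{n} \\
&\le \frac{C}{|\mathcal{P}|} \sum_{p'' \in \mathcal{P}} \lVert (S^{(t)}_p - S^{(t)}_{p''}) - (S^{(t)}_{p'} - S^{(t)}_{p''}) \rVert_F \cdot \frac{\lVert W_{p''}^{T} \cdot S^{(t)}_{p''} \cdot W_{p''} \rVert_F}{n} \\
&\le \frac{C}{|\mathcal{P}|} \sum_{p'' \in \mathcal{P}} \lVert S^{(t)}_p - S^{(t)}_{p'} \rVert_F < \lVert S^{(t)}_p - S^{(t)}_{p'} \rVert_F
\end{aligned}
\]
\[
\Rightarrow sim^{(t+1)}(p, p') \ge sim^{(t)}(p, p').
\]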
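Since $0 \le \lVert S^{(t)}_p - S^{(t)}_{p'} \rVert_F / n \le 1$, $\forall t \ge 1$, $p, p' \in \mathcal{P}$, we have $sim^{(t)}(p, p') \in [0,1]$, $\forall t \ge 0$, $p, p' \in \mathcal{P}$. Being monotone and bounded, $sim^{(t)}(p, p')$ converges to a limit $sim(p, p')$.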
$\{S^{(t)}_p(o_i, o_j)\}_{t \ge 0}$
$S_p(o_i, o_j) \in [0,1]$, $\{sim^{(t+1)}(p, p')\}_{t \ge 0}$
$sim(p, p')$, $S_p(o_i, o_j)$, $sim(p, p')$
6 EXPERIMENTS ON EFFECTIVENESS
6.1 Experimental Settings
Datasets.
Zoo
legs type
(p,oi,oj)
oi oj p
legs elephant giraffe
Congressional Voting Records (or HouseVote)
Zoo
Paris Attractions
i j
m∗n2
Task and Metrics.
Recall: p∈ P
E
p E
p Recall
E
p E
p
Recall =1
m
p∈P
|E
p∩ E
p|
|E
p|
PRES:
PRES
n
ri
Nmax
\[
\mathrm{PRES} = 1 - \frac{\dfrac{\sum r_i}{n} - \dfrac{n+1}{2}}{N_{max}}
\]
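As a quick arithmetic check of the PRES formula, the sketch below evaluates it for two extreme rankings. It assumes the usual reading of the symbols: $n$ is the number of relevant items, $r_i$ the rank at which the $i$-th relevant item appears, and $N_{max}$ the maximum number of results inspected; the function name `pres` is hypothetical.

```python
def pres(ranks, n_max):
    """PRES = 1 - (mean rank - (n + 1) / 2) / N_max, for n = len(ranks) relevant items."""
    n = len(ranks)
    return 1.0 - (sum(ranks) / n - (n + 1) / 2) / n_max

best = pres([1, 2, 3], n_max=10)      # all relevant items ranked first
worst = pres([11, 12, 13], n_max=10)  # all relevant items ranked just past the cut-off
```

With three relevant items at ranks 1 to 3 the mean rank equals $(n+1)/2$, so the score is 1; pushing them just beyond $N_{max}$ drives it to 0.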
Methods.
C
uniperspective Merged-SimRank
Average-SimRank
Disjoint-SimRank
Personalized Collaborative Clustering PCC
6.2 Comparison to Baselines
Recall
Disjoint-SimRank Recall Zoo
HouseVote Paris Attractions
Merged-SimRank Recall Disjoint-SimRank Zoo HouseVote
Paris Attractions
Average-SimRank
Recall Zoo
HouseVote Paris Attractions
Figure 3: Recall values of all models (panels: Zoo, HouseVote, Paris Attractions; axis: Recall; models: Merged-SimRank, Average-SimRank, Disjoint-SimRank, PCC, Pipelined-SimRank, MP-SimRank)
Figure 4: PRES values of all models (panels: Zoo, Paris Attractions, HouseVote; axis: PRES; models: Merged-SimRank, Average-SimRank, Disjoint-SimRank, PCC, Pipelined-SimRank, MP-SimRank)
Recall Zoo HouseVote
Paris Attractions Recall
PCC
Zoo HouseVote Paris Attractions
Disjoint-SimRank
Merged-SimRank Average-SimRank
PRES
Recall
Recall PRES
PRES
Paris Attractions
Recall PRES
6.3 Inter-Perspective Similarities
sim(p,p′)∀p,p′∈ P
p,p′
sim(p,p′)
sim(p,p′)
Zoo HouseVote
p∈ P
P Paris Attractions
Table 2: Correlation between NMI scores and inter-perspective similarities for Zoo (17 perspectives)
p1 p2 p3 p4 p5 p6 p7 p8 p9 p10 p11 p12 p13 p14 p15 p16 p17
6.4 Illustrative Case Study
Paris Attractions
| |
Paris Attractions
7 DISCUSSION ON EFFICIENCY
Table 3: Correlation between NMI scores and inter-perspective similarities for HouseVote (16 perspectives)
p1 p2 p3 p4 p5 p6 p7 p8 p9 p10 p11 p12 p13 p14 p15 p16
Table 4: Cluster data of four users from Paris Attractions
30 50 62 76 88
30 50 62 88
62 88 50
88 76
30
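Table 5: Complexity analysis (per iteration) of all SimRank-based methods
Methods              Storage                   Time
Merged-SimRank       $O(n^2)$                  $O(n^2 d_{max})$
Average-SimRank      $O(n^2)$                  $O(m n^2 d_{max})$
Disjoint-SimRank     $O(m n^2)$                $O(m n^2 d_{max})$
Pipelined-SimRank    $O(m^2 + n^4 + m n^2)$    $O((m^2 + n^4 + m n^2)\, d_{bi} + m^2 n^2 d_{max})$
MP-SimRank           $O(m^2 + m n^2)$          $O(m^2 n^2 d_{max})$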
7.1 Complexity Analysis
Merged-SimRank
n2
dp
|Np(oi)|.|Np(oj)| oi,oj∈ O dmax
∀p∈ P
Average-SimRank Disjoint-SimRank
m
Figure 5: Illustrative example of multiperspective similarity from Paris Attractions dataset.
mn2
m2 n4
dbi B
m2n2dmax
n4
convergence rate
\[
D_t = \frac{1}{m} \sum_{p \in \mathcal{P}} \frac{\lVert S^{(t+1)}_p - S^{(t)}_p \rVert_F}{n},
\]
Dt
t
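The quantity $D_t$ is straightforward to track during the iterations. A minimal sketch, with a hypothetical helper name and toy matrices standing in for two consecutive iterates:

```python
import numpy as np

def convergence_gap(S_next, S_prev):
    """D_t: mean over perspectives of ||S_p^(t+1) - S_p^(t)||_F / n."""
    n = S_prev[0].shape[0]
    gaps = [np.linalg.norm(a - b) / n for a, b in zip(S_next, S_prev)]
    return sum(gaps) / len(gaps)

# Two perspectives over two objects; only the first one changes between iterates.
S_prev = [np.eye(2), np.eye(2)]
S_next = [np.ones((2, 2)), np.eye(2)]
D = convergence_gap(S_next, S_prev)
```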
Zoo HouseVote
Paris Attractions
7.2 Heuristic for More Efficient MP-SimRank
Disjoint-SimRank
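$S^{disjoint}_p,\ \forall p \in \mathcal{P}$
$O(m n^2 d_{max})$
k-medoids
k
$O(m^2 + km)$
$\mathcal{H}_c$ cluster-specific inter-object $S_c$
$O(k^2 n^2 d_{max})$
$O(m n^2 d_{max} + m^2 + km + k^2 n^2 d_{max})$
$O(m^2 n^2 d_{max})$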
k≤m
k
Recall PRES
k
k=m k
8 CONCLUSION
Figure 6: PRES, Recall, and running time with different numbers of clusters k (panels: HouseVote, Zoo, Paris Attractions; axes: PRES, Recall, Running Time (second)).
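Algorithm 3
Require: $\mathcal{H}$, $k$
  -- Step 1: run Disjoint-SimRank on each perspective graph --
  $S^{disjoint}_p \leftarrow \text{Disjoint-SimRank}(G_p),\ \forall p \in \mathcal{P}$
  -- Step 2: compute Frobenius distances between perspectives --
  $F = [F(p, p')]_{p, p' \in \mathcal{P}}$, where $F(p, p') = \lVert S^{disjoint}_p - S^{disjoint}_{p'} \rVert_F$
  -- Step 3: cluster perspectives and merge graphs --
  $\mathcal{C} \leftarrow \text{K-Medoids}(F, k)$; $\mathcal{H}_c \leftarrow \text{merge-graph}(\mathcal{H}, \mathcal{C})$
  -- Step 4: run MP-SimRank on the new hypergraph $\mathcal{H}_c$ --
  $\{S_c\}_{c \in \mathcal{C}} \leftarrow \text{MP-SimRank}(\mathcal{H}_c)$
  -- Step 5: assign each perspective the inter-object similarity of the cluster it belongs to --
  $S_p \leftarrow S_c,\ \forall p \in \mathcal{P},\ c \in \mathcal{C},\ p \in c$
  Return $\{S_p,\ \forall p \in \mathcal{P}\}$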
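The glue steps of the heuristic (Steps 2, 3 and 5) can be sketched independently of the SimRank and clustering routines. In the sketch below the cluster labels are taken as given (in the algorithm they would come from k-medoids), and merging is implemented as the union of the edges of all perspectives in a cluster, which is an assumption about what merge-graph does; all function names are hypothetical.

```python
import numpy as np

def frobenius_distances(S_disjoint):
    """Step 2: pairwise Frobenius distances between per-perspective similarity matrices."""
    m = len(S_disjoint)
    F = np.zeros((m, m))
    for p in range(m):
        for q in range(p + 1, m):
            F[p, q] = F[q, p] = np.linalg.norm(S_disjoint[p] - S_disjoint[q])
    return F

def merge_graphs(adjs, labels, k):
    """Step 3 (second half): one graph per cluster, here the union of its members' edges."""
    merged = [np.zeros_like(adjs[0]) for _ in range(k)]
    for A, c in zip(adjs, labels):
        merged[c] = np.maximum(merged[c], A)
    return merged

def assign_back(S_cluster, labels):
    """Step 5: every perspective inherits the similarity matrix of its cluster."""
    return [S_cluster[c] for c in labels]

# Toy data: three perspectives over three objects, clustered as {0, 1} and {2}.
S_disjoint = [np.eye(2), np.eye(2), np.ones((2, 2))]
F = frobenius_distances(S_disjoint)
a01 = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 0]], dtype=float)
a12 = np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0]], dtype=float)
labels = [0, 0, 1]  # assumed output of k-medoids with k = 2
merged = merge_graphs([a01, a12, a01.copy()], labels, k=2)
S_p = assign_back(["S_c0", "S_c1"], labels)  # placeholders stand in for cluster matrices
```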
ACKNOWLEDGMENTS