International Conference on Computer Science and Artificial Intelligence (ICCSAI 2013)

ISBN: 978-1-60595-132-4

Random Walks on the Bipartite-Graph for Personalized Recommendation

Zhong-you Pei, Chun-heng Chiang, Wen-bin Lin

School of Mathematics, Southwest JiaoTong University

Chengdu, 610031, China

pzy20062141@my.swjtu.edu.cn; chiangchunheng@gmail.com; wl@swjtu.edu.cn

Abstract—With the link relations between users and items, we

develop a new graph-based top-N recommendation method

following the idea of topic-sensitive PageRank. An iterative

propagation procedure over the bipartite graph is adopted to

simulate the spreading of users' preference information. We prove that the random walks on the bipartite graph converge to a stationary probability distribution. By exploiting the critical point of the random walks, we design a novel similarity metric for measuring the similarity between nodes. The experimental results on MovieLens show that our method outperforms two other node-similarity methods in terms of both Precision and Recall for recommendation lists of different lengths.

Keywords-random walk; bipartite graph; topic-sensitive

PageRank; collaborative filtering

I. INTRODUCTION

Generally, traditional recommendation approaches are categorized into two main groups [1], namely Content-based Filtering (CBF) and Collaborative Filtering (CF). CBF recommends items that are similar to the ones a user previously liked. It relies on a similarity measure for items: the higher the similarity score, the more likely an item is to be recommended to a given user. The underlying idea of CF, on the other hand, can be described in three steps: (1) identify the preference neighborhood of the given user; (2) aggregate the opinions of that neighborhood to estimate the user's preference for unknown items; and (3) sort the items in order of importance and make top-N recommendations. CF has been acknowledged as one of the most successful recommendation techniques and has been used in applications such as recommending news (e.g., Yahoo!), products (e.g., Amazon), friends (e.g., Facebook, Twitter) and movies (e.g., Netflix, MovieLens, YouTube).

Many variations of CF have also been proposed, including nearest-neighborhood prediction algorithms [2, 3], approaches based on latent factor models [4, 5], and graph-based models [6, 7, 8, 9].

In graph-based approaches, the rating database is represented in the form of a graph, and the edge weights encode the similarities between users and items. To date, various graph-based recommendation methods have been proposed. To search and rank folksonomies, Hotho et al. [7] developed FolkRank. Following the idea of PageRank, they assume that resources annotated with important tags by influential users should themselves be relevant and important. Because a direct application of PageRank is not possible for practical reasons, they propose an alternative weight-spreading approach. Fouss et al. [9] compared several random-walk-based quantities for measuring the relatedness between users and movies on a bipartite graph. One of them relies on computing the pseudo-inverse of the Laplacian matrix of the graph. Assuming that user tastes are transitive, Huang et al. [6] proposed a graph-based model to deal with the cold start and data sparsity problems [11] on the augmented rating matrix. Similarly, Zhang and Pu [8] exploit the neighborhood relatedness in a recursive way.

In this work, we introduce the Topic-Sensitive PageRank

into the graph-based approaches with random walks on the

user-item bipartite graph.

The rest of this paper is organized as follows. Section II

gives notations. Our method is presented in Section III. The

experimental results are provided in Section IV and

conclusions are given in Section V.

II. NOTATIONS

Let $U$ be the set of users and $I$ be the set of items, with sizes $m$ and $n$, respectively. A bipartite graph $G = (V, E)$ can be constructed, where $V$ is the vertex set, which can be represented as the union of two sets, $V = U \cup I$, and $E$ is the set of edges that link the user nodes in $U$ and the item nodes in $I$. Without loss of generality, for any user $u \in U$ and any item $i \in I$, if there exists some activity (referred to as an event $A_{ui}$), e.g. $u$ has viewed and (or) rated the item $i$, then an edge is built to link them. Otherwise, they cannot be directly linked and the corresponding edge does not exist.

For ease of reference, we denote

$$e_{ui} = \begin{cases} 1, & A_{ui} \text{ is true} \\ 0, & \text{otherwise} \end{cases}$$

and since the bipartite graph $G$ is an undirected graph, $e_{ui} = e_{iu}$.
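The notation above can be made concrete with a small sketch. It assumes the events are available as (user, item) pairs; the function names `build_bipartite_graph` and `edge`, the node tags `"u"` and `"i"`, and the toy events are all hypothetical and not part of the original method description.

```python
from collections import defaultdict


def build_bipartite_graph(events):
    """Build the undirected user-item graph from (user, item) events.

    Nodes are tagged tuples such as ("u", "alice") and ("i", "m1") so that
    a user and an item sharing the same raw id never collide.
    """
    neighbours = defaultdict(set)
    for user, item in events:
        u, i = ("u", user), ("i", item)
        neighbours[u].add(i)  # e_ui = 1
        neighbours[i].add(u)  # e_iu = 1, because the graph is undirected
    return dict(neighbours)


def edge(neighbours, a, b):
    """Return e_ab: 1 if a and b are directly linked, 0 otherwise."""
    return 1 if b in neighbours.get(a, ()) else 0


# Hypothetical toy events: each pair means "user viewed or rated item".
events = [("alice", "m1"), ("alice", "m2"), ("bob", "m2"), ("bob", "m3")]
graph = build_bipartite_graph(events)
```

Storing neighbour sets rather than a dense matrix keeps the sketch close to the sparse structure of real rating data.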

III. THE PROPOSED METHOD

Inspired by the topic-sensitive PageRank [10], we

develop a bipartite-graph-based top-N recommendation

approach.

The relatedness between users and items can be formulated as a bipartite graph $G$. Each user has her/his unique preference, which is similar to Haveliwala's topic.

Specifically, we design a novel type of random walk on the

user-item bipartite graph to simulate users’ preference

propagation procedure.

Considering a user $u \in U$, we start the random walk from $u$ and denote the initial state as $X_0 = u$. Strictly speaking, the random walk is not completely irregular; rather, it is probability-constrained. At each step, the surfer has two choices: continuing the walk or going back to the start point (namely the node $u$).

Suppose the walk has advanced to the $k$-th step and the surfer is now located at the node $v$, which may be either a user or an item node. The surfer may act as follows:

With probability $\alpha$, the surfer continues the random walk. Specifically, she/he randomly selects one of the directly linked neighbours and moves to it.

With probability $1 - \alpha$, the surfer returns to the start node $u$.

Let $p(u, v, k)$ be the probability that the surfer arrives at the node $v$ after $k$ steps starting from the source user $u$. Therefore, the source user's initial probability must be 1, that is, $p(u, u, 0) = 1$. Using the state chain $X_0 \to X_1 \to \cdots \to X_{k-1} \to X_k$, the probability $p(u, v, k)$ can be equivalently written as a conditional probability given the initial state $X_0 = u$:

$$p(u, v, k) = p(X_k = v \mid X_0 = u) \tag{1}$$

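According to the law of total probability, we have

$$\begin{aligned} p(X_k = v \mid X_0 = u) &= \sum_{v' \in V} p(X_k = v \mid X_{k-1} = v', X_0 = u)\, p(X_{k-1} = v' \mid X_0 = u) \\ &= \sum_{v' \in V} p(X_k = v \mid X_{k-1} = v', X_0 = u)\, p(u, v', k-1) \end{aligned} \tag{2}$$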

Suppose the surfer is located at a node $v$ which is different from $u$. If there is no direct link between $v$ and $v'$, we have $e_{v'v} = 0$ and $p(X_k = v \mid X_{k-1} = v', X_0 = u) = 0$. This implies that no road can lead the surfer to the destination node $v$ at the next step except the edges linking node $v$ and its neighbours. If there is a direct link between $v$ and $v'$, we have $e_{v'v} = 1$, and the following equation holds

$$p(X_k = v \mid X_{k-1} = v', X_0 = u) = \frac{\alpha}{O_{v'}} \tag{3}$$

where $O_{v'}$ denotes the out-degree of the node $v'$.

Substituting Equation (3) into Equation (2), we obtain

$$\begin{aligned} p(X_k = v \mid X_0 = u) &= \sum_{v' : e_{v'v} = 1} p(X_k = v \mid X_{k-1} = v', X_0 = u)\, p(u, v', k-1) \\ &= \alpha \sum_{v' : e_{v'v} = 1} \frac{p(u, v', k-1)}{O_{v'}} \end{aligned} \tag{4}$$

On the other hand, suppose $v$ is the source user, i.e., $v = u$. There are two ways that can bring the surfer to $v$ from the previous node $v'$. One is the teleportation with probability $1 - \alpha$, and the other follows the direct edges, analogous to the above spreading process.

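We formulate the above procedure as follows:

$$p(u, v, k) = \begin{cases} \alpha \displaystyle\sum_{v' : e_{v'v} = 1} \frac{p(u, v', k-1)}{O_{v'}} + 1 - \alpha, & v = u \\ \alpha \displaystyle\sum_{v' : e_{v'v} = 1} \frac{p(u, v', k-1)}{O_{v'}}, & v \neq u \end{cases} \tag{5}$$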
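The recursion in Equation (5) can be sketched node by node as follows. This is only an illustration under stated assumptions: the graph is a dictionary of neighbour sets, every node has at least one neighbour, and the names `walk_step`, `walk`, the toy graph and the value 0.8 for the damping factor are hypothetical.

```python
def walk_step(neighbours, source, alpha, prev):
    """Apply Equation (5) once: map p(u, ., k-1) to p(u, ., k)."""
    nxt = {v: 0.0 for v in neighbours}
    for v_prime, mass in prev.items():
        out_degree = len(neighbours[v_prime])  # O_{v'}, assumed to be > 0
        for v in neighbours[v_prime]:
            nxt[v] += alpha * mass / out_degree
    # Teleportation: the total mass is 1, so the source receives 1 - alpha.
    nxt[source] += 1.0 - alpha
    return nxt


def walk(neighbours, source, alpha, steps):
    """Return p(u, ., k) for k = steps, starting from p(u, u, 0) = 1."""
    p = {v: 0.0 for v in neighbours}
    p[source] = 1.0
    for _ in range(steps):
        p = walk_step(neighbours, source, alpha, p)
    return p


# Hypothetical toy graph: users u1, u2 and items a, b, c.
neighbours = {
    "u1": {"a", "b"}, "u2": {"b", "c"},
    "a": {"u1"}, "b": {"u1", "u2"}, "c": {"u2"},
}
p1 = walk(neighbours, "u1", alpha=0.8, steps=1)
p3 = walk(neighbours, "u1", alpha=0.8, steps=3)
```

After one step the two items rated by `u1` each hold half of the propagated mass, and the source keeps the teleported remainder.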

A. Theoretical Analysis

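Let $M \in \mathbb{R}^{|V| \times |V|}$ be the transition probability matrix defined as

$$M^T = \begin{pmatrix} 0 & A \\ B & 0 \end{pmatrix} \tag{6}$$

where each entry is represented as $m_{ij} = e_{ij} / O_{v_i}$, the block entries are $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{n \times m}$, and $|V| = |U| + |I| = m + n$. Let $P_k \in \mathbb{R}^{|V| \times |V|}$ be the $k$-step transition probability distribution over the bipartite graph, and denote the initial distribution as

$$P_0 = \begin{pmatrix} Q \\ 0 \end{pmatrix} \tag{7}$$

where $Q$ is an $m$-dimensional block vector.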

As in PageRank, the random walk can, at any step $k$, either jump via the transition probability matrix $M$ to an adjacent node with probability $\alpha$, or teleport to the active user node with probability $1 - \alpha$. The probability distribution vector $p(u, v, k)$ for the source user $u$ after $k$ steps of the random walk can be expressed recursively in the following matrix form

$$P_k = (1 - \alpha) P_0 + \alpha M^T P_{k-1} \tag{8}$$
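As a rough illustration of the block structure in Equations (6) to (8), the sketch below builds the blocks A and B for a toy data set and applies one step of the recursion to a single source user. It assumes that the distribution is stored as separate user and item parts and that every item has at least one rater; `build_blocks`, `matvec`, `step` and the toy data are hypothetical names introduced here.

```python
def build_blocks(rated, users, items):
    """Return A (m x n) and B (n x m), the off-diagonal blocks of M^T."""
    item_degree = {i: sum(1 for u in users if i in rated[u]) for i in items}
    # A[u][i]: probability of moving from item i to user u (1 / O_i).
    A = [[1.0 / item_degree[i] if i in rated[u] else 0.0 for i in items]
         for u in users]
    # B[i][u]: probability of moving from user u to item i (1 / O_u).
    B = [[1.0 / len(rated[u]) if i in rated[u] else 0.0 for u in users]
         for i in items]
    return A, B


def matvec(mat, vec):
    """Plain matrix-vector product on nested lists."""
    return [sum(a * x for a, x in zip(row, vec)) for row in mat]


def step(A, B, alpha, q0, user_part, item_part):
    """Equation (8) split into its user block and its item block."""
    new_users = [(1 - alpha) * q + alpha * x
                 for q, x in zip(q0, matvec(A, item_part))]
    new_items = [alpha * x for x in matvec(B, user_part)]
    return new_users, new_items


# Hypothetical toy data: two users, three items.
users, items = ["u1", "u2"], ["a", "b", "c"]
rated = {"u1": {"a", "b"}, "u2": {"b", "c"}}
A, B = build_blocks(rated, users, items)
q0 = [1.0, 0.0]  # Q: all initial mass on the source user u1
users_1, items_1 = step(A, B, 0.8, q0, q0, [0.0, 0.0, 0.0])
```

Each column of A and of B sums to one, which reflects that $M^T$ is column-stochastic when $M$ is row-stochastic.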

Theorem 1. If $\alpha$ is nonnegative and less than one, the random walks on the bipartite graph $G$ must converge to a stationary probability distribution after sufficient steps.

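Proof. Applying Equation (8) repeatedly gives $P_k - P_{k-1} = \alpha^k [(M^T)^k - (M^T)^{k-1}] P_0$, so the distribution difference over the graph at two successive steps can be formulated as follows.

When $k$ is an odd number,

$$P_k - P_{k-1} = \alpha^k \begin{pmatrix} -(AB)^{\frac{k-1}{2}} \\ (BA)^{\frac{k-1}{2}} B \end{pmatrix} Q \tag{9}$$

When $k$ is an even number,

$$P_k - P_{k-1} = \alpha^k \begin{pmatrix} (AB)^{\frac{k-2}{2}} AB \\ -(BA)^{\frac{k-2}{2}} B \end{pmatrix} Q \tag{10}$$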

Since all the matrices on the right-hand side of Equations (9) and (10) are independent of $\alpha$, and $\lim_{k \to \infty} \alpha^k = 0$, we have

$$\lim_{k \to \infty} (P_k - P_{k-1}) = 0$$

Now we will show that, for every $\varepsilon > 0$, there exist a natural number $K$ and a stationary probability distribution $P^*$ such that $\| P_k - P^* \| < \varepsilon$ holds for every $k > K$.

Let $Z_k = P_k - P^*$. Since

$$P^* = \alpha M^T P^* + (1 - \alpha) P_0$$

we can obtain

$$Z_k = \alpha M^T (P_{k-1} - P^*) = \alpha M^T Z_{k-1}$$

and thus,

$$Z_k = (\alpha M^T)^k Z_0$$

In view of the fact that $M$ is a stochastic matrix, we have

$$\| \alpha M^T \|_1 = \alpha \| M^T \|_1 < 1$$

According to a standard result of matrix theory, $\alpha M^T$ is a convergent matrix. Therefore, the random walks on the bipartite graph $G$ converge to a stationary probability distribution after sufficient steps. □
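The convergence claimed by Theorem 1 can be checked numerically on a small example. The sketch below iterates the walk until the L1 distance between successive distributions drops below a tolerance; because both $(M^T)^k P_0$ and $(M^T)^{k-1} P_0$ are probability vectors, that distance is bounded by $2\alpha^k$. The names `step` and `iterate_until_stable`, the tolerance, and the toy graph are assumptions made for illustration only.

```python
def step(neighbours, source, alpha, prev):
    """One step of the restart walk, written node by node."""
    nxt = {v: 0.0 for v in neighbours}
    for v_prime, mass in prev.items():
        share = alpha * mass / len(neighbours[v_prime])
        for v in neighbours[v_prime]:
            nxt[v] += share
    nxt[source] += 1.0 - alpha
    return nxt


def iterate_until_stable(neighbours, source, alpha, tol=1e-10, max_steps=10_000):
    """Iterate until the L1 gap between successive distributions is below tol."""
    p = {v: 0.0 for v in neighbours}
    p[source] = 1.0
    gaps = []  # gaps[k - 1] holds the L1 norm of P_k - P_{k-1}
    for _ in range(max_steps):
        q = step(neighbours, source, alpha, p)
        gaps.append(sum(abs(q[v] - p[v]) for v in neighbours))
        p = q
        if gaps[-1] < tol:
            break
    return p, gaps


# Hypothetical toy graph: users u1, u2 and items a, b, c.
neighbours = {
    "u1": {"a", "b"}, "u2": {"b", "c"},
    "a": {"u1"}, "b": {"u1", "u2"}, "c": {"u2"},
}
alpha = 0.8
stationary, gaps = iterate_until_stable(neighbours, "u1", alpha)
```

The list `gaps` shrinks geometrically with ratio `alpha`, so a smaller damping factor needs fewer iterations to reach a given tolerance.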

B. Critical Point

There is an interesting property behind the random walks, which can be used to simplify our proposed method. When the random walks proceed to a length of three or four, the recommendation accuracy is independent of the factor $\alpha$. We now prove it briefly.

Proof. The basic function of a recommender system is to recommend unrated items to a user. Let $J_u$ be the set of items which have not been rated by the source user $u$, and $P_k(J_u)$ be the $k$-step probability distribution over the set of unrated items. Let $P_k(I)$ be the $k$-step probability distribution over all item nodes. According to Equations (9) and (10), by summing all the probability distribution differences between successive steps, we can obtain the 3-step probability distribution over the item nodes

$$P_3(I) = P_0(I) + \sum_{j=1}^{3} [P_j(I) - P_{j-1}(I)] = \alpha BQ - \alpha^2 BQ + \alpha^3 BABQ \tag{11}$$

and the 4-step probability distribution

$$P_4(I) = \alpha BQ - \alpha^2 BQ + \alpha^3 BABQ - \alpha^4 BABQ \tag{12}$$

For convenience, we use $\bullet$ to represent a probability distribution vector and use $(\bullet)_{J_u}$ to denote the vector comprising the elements of $\bullet$ corresponding to $J_u$. The spreading mechanism implies that the set $J_u$ is not activated until the walks proceed to the third stage. As a result, the term $BQ$ doesn't affect the distribution over $J_u$, i.e., $P_1(J_u) = P_2(J_u) = (BQ)_{J_u} = 0$. As for the third stage, the term $BABQ$ initializes the probability distribution over $J_u$, that is, $P_3(J_u) = \alpha^3 (BABQ)_{J_u}$. When it comes to the fourth stage, according to Equation (12), we have

$$P_4(J_u) = \alpha^3 (BABQ)_{J_u} - \alpha^4 (BABQ)_{J_u} = \alpha^3 (1 - \alpha) (BABQ)_{J_u}$$

For $0 < \alpha < 1$, $P_4(J_u)$ is a positive multiple of $P_3(J_u)$, so the ranking of the elements in $P_4(J_u)$ is the same as in $P_3(J_u)$, which indicates that the recommendation accuracy is independent of the factor $\alpha$. □
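The critical point property can be illustrated on a toy graph: for walks of length three and four, the ranking of the items not yet rated by the source user should not change with the damping factor. The sketch below is a minimal check under that reading; the names `walk` and `rank_unrated`, the toy ratings and the three tested damping values are hypothetical.

```python
def walk(neighbours, source, alpha, steps):
    """Run `steps` rounds of the restart walk from `source`."""
    p = {v: 0.0 for v in neighbours}
    p[source] = 1.0
    for _ in range(steps):
        nxt = {v: 0.0 for v in neighbours}
        for v_prime, mass in p.items():
            share = alpha * mass / len(neighbours[v_prime])
            for v in neighbours[v_prime]:
                nxt[v] += share
        nxt[source] += 1.0 - alpha
        p = nxt
    return p


def rank_unrated(neighbours, items, source, alpha, steps):
    """Rank the items not linked to `source` by their walk probability."""
    p = walk(neighbours, source, alpha, steps)
    unrated = [i for i in items if i not in neighbours[source]]
    return sorted(unrated, key=lambda i: (-p[i], i))


# Hypothetical toy data: three users, four items.
rated = {"u1": {"a", "b"}, "u2": {"a", "c"}, "u3": {"b", "c", "d"}}
items = sorted({i for s in rated.values() for i in s})
neighbours = {u: set(s) for u, s in rated.items()}
for u, s in rated.items():
    for i in s:
        neighbours.setdefault(i, set()).add(u)

rankings = {(alpha, k): rank_unrated(neighbours, items, "u1", alpha, k)
            for alpha in (0.1, 0.5, 0.9) for k in (3, 4)}
p3 = walk(neighbours, "u1", 0.5, 3)
p4 = walk(neighbours, "u1", 0.5, 4)
```

In this example the 4-step probability of each unrated item equals the 3-step probability scaled by one minus the damping factor, which matches the factorisation used in the proof.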

C. Another Similarity Measure

Based on the property of the critical point, we propose a new similarity measure. Suppose the random walks have advanced to the third stage, and consider the matrix $AB$ involved in the current transition probability matrix $(M^T)^3$. We utilize $AB \in \mathbb{R}^{m \times m}$ as the similarity matrix for user nodes. The blocks $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{n \times m}$ in $M^T$ represent the relatedness between users and items, and their product $AB$ bridges pairs of user nodes through the items they share. Specifically, the similarity between two users $u$ and $v$ can be formulated as

$$s_{uv} = \sum_{i \in I_u \cap I_v} \frac{1}{O_i O_u}$$

where $I_u$ denotes the set of items rated by user $u$. Compared with the Jaccard similarity index, this measure has an important property: it reduces the impact of items' popularity on the similarity score.
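To see how the measure discounts popular items, the sketch below compares it with the Jaccard index on a toy data set in which one item is rated by four users and another by only two. It assumes that the degree of an item is the number of users who rated it and that the degree of a user is the number of items she/he rated; `walk_similarity`, `jaccard` and the data are hypothetical.

```python
def walk_similarity(rated, u, v):
    """s_uv: sum over shared items i of 1 / (O_i * O_u)."""
    item_degree = {}
    for item_set in rated.values():
        for i in item_set:
            item_degree[i] = item_degree.get(i, 0) + 1
    shared = rated[u] & rated[v]
    return sum(1.0 / (item_degree[i] * len(rated[u])) for i in shared)


def jaccard(rated, u, v):
    """Classical Jaccard index between the item sets of u and v."""
    union = rated[u] | rated[v]
    return len(rated[u] & rated[v]) / len(union) if union else 0.0


# Hypothetical toy data: "hit" is popular, "niche" is rarely rated.
rated = {
    "u1": {"hit", "niche"},
    "u2": {"hit"},
    "u3": {"niche"},
    "u4": {"hit"},
    "u5": {"hit"},
}
via_hit = walk_similarity(rated, "u1", "u2")
via_niche = walk_similarity(rated, "u1", "u3")
```

Here the Jaccard index gives the same score to both pairs, while the walk-based measure ranks the pair sharing the niche item higher. Note that the measure is not symmetric in general, because it is normalised by the degree of the first user only.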

IV. EXPERIMENTATION

A. Accuracy Evaluation Metrics

Precision and Recall are two standard evaluation metrics

in information retrieval. Both have also been widely used in

the evaluation of recommendation accuracy.

Let $R_u$ be the list of items recommended to the user $u$ by the recommender system, and $T_u$ be her/his rating history in the test data set. Precision is defined as the percentage of relevant items in the recommended list

$$\mathrm{Precision} = \frac{\sum_{u \in U} |R_u \cap T_u|}{\sum_{u \in U} |R_u|} \tag{13}$$

which represents the probability that a recommended item is relevant. Recall is defined as the ratio of the suggested relevant items to the number of available relevant items

$$\mathrm{Recall} = \frac{\sum_{u \in U} |R_u \cap T_u|}{\sum_{u \in U} |T_u|} \tag{14}$$
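Equations (13) and (14) pool the hits over all users before dividing, rather than averaging per-user ratios. A minimal sketch of that computation follows; the function name `precision_recall` and the toy recommendation lists are hypothetical.

```python
def precision_recall(recommended, relevant):
    """Pooled Precision and Recall over all users, as in Eqs. (13) and (14).

    recommended: dict user -> list of recommended items (R_u)
    relevant:    dict user -> set of test-set items (T_u)
    """
    all_users = set(recommended) | set(relevant)
    hits = sum(len(set(recommended.get(u, ())) & set(relevant.get(u, ())))
               for u in all_users)
    n_recommended = sum(len(recommended.get(u, ())) for u in all_users)
    n_relevant = sum(len(relevant.get(u, ())) for u in all_users)
    precision = hits / n_recommended if n_recommended else 0.0
    recall = hits / n_relevant if n_relevant else 0.0
    return precision, recall


# Hypothetical toy example with two users.
recommended = {"u1": ["a", "b", "c"], "u2": ["a", "d"]}
relevant = {"u1": {"a", "c", "e"}, "u2": {"b"}}
precision, recall = precision_recall(recommended, relevant)
```

In this example two of the five recommended items are relevant and two of the four relevant items are retrieved.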

B. Data Set

To evaluate the performance of our method, we conducted experiments on MovieLens, provided by the GroupLens Research Center (http://www.grouplens.org/). MovieLens is both a recommender system and a virtual community website, where users are allowed to share movies using favored tags. The website has over 50,000 users who have provided their ratings on more than 3,000 movies. To achieve greater reliability, only the users who have rated 20 or more movies are included, which results in a data set with over 100,000 ratings from 943 users for 1,682 movies. Each opinion is represented by a tuple $t_{ui} = (u, i, r_{ui})$, where $u \in U$ denotes a user, $i \in I$ is a movie, and the rating for the movie $i$ by user $u$ is denoted by $r_{ui}$, which is an integer score at five levels, e.g., 1 implies the movie is very bad, and 5 indicates the movie is very good. In addition, the data set also provides users' profile information, such as age and gender, and features of movies, e.g. the type of the movie.

C. Baselines

In the experiments, the standard user-based and item-based collaborative filtering techniques are selected as the baselines for top-N recommendations.

The user-based approaches evaluate the interest of a user $u$ in an item $i$ using the ratings for that item by her/his neighborhood users, who give similar ratings to the items. The item-based methods [2], on the other hand, predict the rating of user $u$ for an item $i$ using her/his historical ratings for items that are similar to $i$.

Let $I_u$ be the set of items rated by the user $u \in U$, and $U_i$ be the set of users who have rated a given item $i \in I$. Given any two users $u$ and $v$, their similarity $s_{uv}$ is measured by the Jaccard index

$$s_{uv} = \frac{|I_u \cap I_v|}{|I_u \cup I_v|}$$

Similarly, given two items $i$ and $j$, we define the similarity $s_{ij}$ between them as

$$s_{ij} = \frac{|U_i \cap U_j|}{|U_j|}$$

Therefore, the $k$ neighbours for the user $u$ and the item $i$ will be established based on the computed similarities, written $N(u, k)$ and $N(i, k)$, respectively.

Let $r_{ui}$ be the rating for item $i$ by user $u$ and $p_{ui}$ be the satisfaction of this user with item $i$. In order to recommend items to the user $u$, the user-based approaches and the item-based approaches are formulated respectively as

$$p_{ui} = \sum_{v \in N(u, k) \cap U_i} s_{uv} r_{vi}$$

$$p_{ui} = \sum_{j \in N(i, k) \cap I_u} s_{ij} r_{uj}$$

Using the satisfaction scores, the system sorts the items and recommends those with the top scores to the given user $u$.
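The user-based baseline can be sketched as follows. This is an illustrative reading of the formulas above, assuming that ratings are stored as a dictionary of dictionaries and that ties are broken by identifier; the names `jaccard`, `user_based_scores`, `recommend` and the toy ratings are hypothetical.

```python
def jaccard(a, b):
    """Jaccard index between two item sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def user_based_scores(ratings, u, k):
    """p_ui = sum of s_uv * r_vi over the k nearest neighbours v who rated i."""
    seen = set(ratings[u])
    sims = {v: jaccard(seen, set(ratings[v])) for v in ratings if v != u}
    nearest = sorted(sims, key=lambda v: (-sims[v], v))[:k]  # N(u, k)
    scores = {}
    for v in nearest:
        for i, r_vi in ratings[v].items():
            if i not in seen:  # only score items the user has not rated
                scores[i] = scores.get(i, 0.0) + sims[v] * r_vi
    return scores


def recommend(ratings, u, k, n):
    """Return the top-n unrated items for user u."""
    scores = user_based_scores(ratings, u, k)
    return sorted(scores, key=lambda i: (-scores[i], i))[:n]


# Hypothetical toy ratings on a 1 to 5 scale.
ratings = {
    "u1": {"a": 5, "b": 3},
    "u2": {"a": 4, "b": 2, "c": 5},
    "u3": {"b": 4, "d": 2},
}
scores = user_based_scores(ratings, "u1", k=2)
top = recommend(ratings, "u1", k=2, n=2)
```

The item-based variant follows the same pattern with the roles of users and items exchanged and the item similarity used in place of the Jaccard index between users.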

D. Experimental Results

To reduce the variability in prediction, we conduct five-fold cross-validation experiments on the MovieLens data set. The data set is divided into five subsets, one of which is randomly selected for testing while the others are used for training. Throughout the experiments, the performance in terms of Precision and Recall is averaged over the five folds.
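One possible way to organise such an evaluation is sketched below. It assumes the standard rotation in which each of the five subsets serves once as the test set, which the text does not spell out, and it uses a fixed seed for reproducibility; the function name `k_fold_splits` and the synthetic rating tuples are hypothetical.

```python
import random


def k_fold_splits(ratings, n_folds=5, seed=0):
    """Yield (train, test) pairs; each fold is used once as the test set."""
    shuffled = list(ratings)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[f::n_folds] for f in range(n_folds)]
    for f in range(n_folds):
        test = folds[f]
        train = [t for g, fold in enumerate(folds) if g != f for t in fold]
        yield train, test


# Synthetic (user, item, rating) tuples, for illustration only.
ratings = [(f"u{u}", f"m{i}", 1 + (u + i) % 5)
           for u in range(4) for i in range(5)]
splits = list(k_fold_splits(ratings))
```

Precision and Recall would then be computed on each test fold and averaged over the five runs.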

There are three parameters in our method that need to be suitably selected: the damping factor $\alpha$, the maximum number of iterations (or length of the random walks) $K$, and the personalization initial probability distribution vector $P_0$ over the bipartite graph.

We select the simplest version of the personalization distribution, where the only nonzero element corresponds to the source user, as described in Section III.

To evaluate the impact of $\alpha$, we measure the performance of our method with nine different values of $\alpha$ chosen from 0.1 to 0.9, in terms of Precision and Recall. The results are plotted in Fig. 1. It can be observed that, for a given $P_0$, the smaller the value of $\alpha$, the better our method performs, indicating a similar trend as in [14].

Given a damping factor $\alpha$, the number of iterations $K$ can be adaptively determined by iterating until the probability distribution over the graph converges. On the other hand, Zhang et al. [14] have experimented on MovieLens and pointed out that a longer propagation path, i.e. a larger $K$, brings more redundant information into the prediction. To avoid the noise induced by too many random walk steps, we choose $K = 6$, which is slightly larger than the critical point.

According to the critical point property proved above, the damping factor $\alpha$ does not affect the recommendation quality at the critical point, where we have designed a novel similarity measure for user nodes. To show the advantage of this measure, we compare the user-based collaborative filtering approach that uses it with the user-based collaborative filtering approach that uses the classical Jaccard similarity measure, as well as with the traditional item-based approach. The comparison results on MovieLens are presented in Fig. 2. The experimental results show that the new similarity measure is more consistent with the recommendation goal and has good potential for future work.

V. CONCLUSIONS AND FUTURE WORKS

In this paper, we develop a graph-based recommendation method, following the idea of topic-sensitive PageRank. The main contributions of this study are to: (1) construct a user-item bipartite graph from the binary rating database, taking users' unique preferences into consideration; (2) design a special type of random walk on the bipartite graph with a critical point; and (3) propose a novel metric to measure the vicinity/similarity between user nodes. The experimental results on MovieLens show that the new metric derived from the critical point provides an alternative measure for user-based collaborative filtering approaches. On the other hand, the performance of the top-N recommendation method under different values of α and different maximum numbers of iterations suggests that too many propagation steps may deteriorate the overall recommendation accuracy, since the procedure introduces more noise.

In future work, we will focus on designing diverse similarity metrics to measure the relations between heterogeneous nodes, and on developing effective ensemble methods to boost the capacity of the measures.

ACKNOWLEDGMENT

This work was supported in part by the Program for New

Century Excellent Talents in University (Grant No. NCET-

10-0702).

REFERENCES

[1] Gediminas Adomavicius and Alexander Tuzhilin. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. Knowledge and Data Engineering, IEEE Transactions on, 17(6):734–749, 2005.

[2] Badrul Sarwar, George Karypis, Joseph Konstan, and John Riedl.

Item-based collaborative filtering recommendation algorithms. In

Proceedings of the 10th international conference on World Wide Web,

pages 285–295. ACM, 2001.

[3] Jonathan L Herlocker, Joseph A Konstan, Loren G Terveen, and John

T Riedl. Evaluating collaborative filtering recommender systems.

ACM Transactions on Information Systems (TOIS), 22(1):5–53,

2004.

[4] Yehuda Koren. Factorization meets the neighborhood: a multifaceted

collaborative filtering model. In Proceedings of the 14th ACM

SIGKDD international conference on Knowledge discovery and data

mining, pages 426–434. ACM, 2008.

[5] Linas Baltrunas and Xavier Amatriain. Towards time-dependant

recommendation based on implicit feedback. In Workshop on

context-aware recommender systems (CARS09), 2009.

[6] Zan Huang, Hsinchun Chen, and Daniel Zeng. Applying associative

retrieval techniques to alleviate the sparsity problem in collaborative

filtering. ACM Transactions on Information Systems (TOIS),

22(1):116–142, 2004.

[7] Andreas Hotho, Robert Jäschke, Christoph Schmitz, and Gerd

Stumme. Information retrieval in folksonomies: Search and ranking.

In The semantic web: research and applications, pages 411–426.

Springer, 2006.

[8] Jiyong Zhang and Pearl Pu. A recursive prediction algorithm for

collaborative filtering recommender systems. In Proceedings of the

2007 ACM conference on Recommender systems, pages 57–64.

ACM, 2007.

[9] Francois Fouss, Alain Pirotte, Jean-Michel Renders, and Marco

Saerens. Random-walk computation of similarities between nodes of

a graph with application to collaborative recommendation.

Knowledge and Data Engineering, IEEE Transactions on, 19(3):355–

369, 2007.

[10] Taher H Haveliwala. Topic-sensitive pagerank. In Proceedings of the

11th international conference on World Wide Web, pages 517–526.

ACM, 2002.

[11] Dietmar Jannach, Markus Zanker, Alexander Felfernig, and Gerhard

Friedrich. Recommender systems: an introduction. Cambridge

University Press, 2010.

[12] Sergey Brin and Lawrence Page. The anatomy of a large-scale

hypertextual web search engine. Computer networks and ISDN

systems, 30(1):107–117, 1998.

[13] Geoffrey Grimmett and David Stirzaker. Probability and random

processes. Oxford university press, 2001.

[14] Yin Zhang, Jiang-qin Wu, and Yue-ting Zhuang. Random walk

models for top-n recommendation task. Journal of Zhejiang

University SCIENCE A, 10(7):927–936, 2009.

Figure 1. (a) Precision and (b) Recall of our method at various choices of $\alpha$ on MovieLens data set

Figure 2. Comparison results for models in terms of (a) Precision and (b) Recall on MovieLens data set