
Semi-Supervised Discriminative Preference Elicitation for

Cold-Start Recommendation

Xi Zhang, Jian Cheng, Ting Yuan, Biao Niu, Hanqing Lu

National Laboratory of Pattern Recognition

Institute of Automation, Chinese Academy of Sciences

Beijing, China

{xi.zhang, jcheng, tyuan, bniu, luhq}@nlpr.ia.ac.cn

ABSTRACT

Recommendation for cold users is fairly challenging because

no prior rating can be used in preference prediction. To

tackle this cold-start scenario, rating elicitation is usually

employed through an initial interview in which users are

queried by some carefully selected items. In this paper, we

propose a novel framework to mine the most valuable items

to construct the query set using a semi-supervised discriminative selection (SSDS) model. To learn a low-dimensional representation for users in item space that reflects their tastes to a large extent, the model incorporates category labels as discriminative information. To ensure that the labels used are reliable while all users are still considered, the model adopts a semi-supervised scheme that couples expert guidance with graph

regularization. Experimental results on the real-world MovieLens dataset demonstrate that the proposed SSDS model outperforms traditional preference elicitation methods on top-N

measures for cold-start recommendation.

Categories and Subject Descriptors

H.3.3 [Information Search and Retrieval]: Information

ﬁltering; H.3.5 [On-line Information Services]: Web-

based services

General Terms

Algorithms, Experimentation

Keywords

Recommender Systems, Cold-Start, Preference Elicitation

1. INTRODUCTION

With the prevalence of massive Web 2.0 applications, online information about a large variety of items is growing rapidly. Recommender systems, as important tools for information filtering, have achieved success on many famous commercial websites such as Amazon, Netflix, and Last.fm. Most

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

CIKM'13, Oct. 27–Nov. 1, 2013, San Francisco, CA, USA.

Copyright 2013 ACM 978-1-4503-2263-8/13/10 ...$15.00.

of these websites are based on Collaborative Filtering (CF), a widely used recommendation approach. The basic hypothesis of CF is that similar users have similar responses to similar items. However, it is difficult to compute similarity for users without any observed ratings, so CF fails in this cold-start scenario. In general, recommendation for new users is quite challenging.

Existing studies on the cold-start problem mainly focus on three different strategies. The first strategy extracts latent features from auxiliary social relations instead of directly exploiting CF [5, 3]. The second maps additional attributes from user profiles into a latent feature space so that the missing rating records can be compensated for [1]. One limitation of these methods is that user profiles and social connections are usually personal and not always available. Hence, many recommender systems resort to an alternative strategy named Rating Elicitation, which solicits user preferences through an initial interview process on some items [4, 2, 6]. On the one hand, rating elicitation admits an overall tendency: the more ratings requested from users, the better the recommendations. On the other hand, different observed rating sets might lead to significantly different recommendation performance. Consequently, for the purpose of constructing the most valuable query set to interview cold users, the essential goal of preference elicitation is item selection that makes full use of the information available in the system.

In this paper, our item selection model is developed on the basis of the following two concerns. From the perspective of the elicitation mechanism, the qualified items in the query set should largely reflect user taste across diverse items of multiple genres. The reason is natural: if interests in categories are captured, it becomes much easier to understand user preferences comprehensively and thus return an appealing item list. Moreover, from the perspective of user modeling, since there is no extra information to describe a user except observed ratings, the rating vectors can be viewed as a form of user representation. The problem of item selection is thus converted into the problem of learning a low-dimensional representation that embodies user preferences as fully as possible.

Following these two considerations, we incorporate category labels as discriminative information to guide representative item selection. In particular, the user category labels are inferred from the rating matrix and item-category relations. Here we only utilize the category labels of a subset of users (experts) to make sure the labels are trustworthy. In addition, to take advantage of unlabeled user data, graph regularization is employed. Therefore, a Semi-Supervised Discriminative Selection (SSDS) framework is proposed for preference elicitation, integrating discriminative and semi-supervised components by combining expert guidance and a graph constraint.

As mentioned above, the output of the SSDS model is the set of selected representative items. Ratings on them are regarded as new descriptions of users in a low-dimensional space. Finally, ranking scores are generated for cold users using these learned user representations together with the original rating matrix.

2. SEMI-SUPERVISED DISCRIMINATIVE

SELECTION MODEL

2.1 Problem Statement

Let U = {u_1, ..., u_n} be the user collection and V = {v_1, ..., v_m} be the item collection. Normally, a traditional elicitation model has an observed rating matrix R = (r_1, ..., r_n) ∈ R^{m×n} as input. Due to the high data sparsity of real-world scenarios, each column r_i contains a large number of missing values, and most items are not popular. Thereby a candidate pool V_p = {v_1, ..., v_{m_p}} with m_p < m is constructed by filtering out the long-tailed items. Then X = (x_1, ..., x_n) ∈ R^{m_p×n} is the submatrix of R, where each column x_i corresponds to the ratings user u_i gave to the candidate pool V_p and is deemed the item representation of u_i.

Suppose there are k categories in the given dataset, and C = {c_1, ..., c_k} is the category label set. Let X_l = (x_1, ..., x_{n_l}) ∈ R^{m_p×n_l} with n_l < n denote the rating matrix for labeled users, and let Y_l ∈ R^{n_l×k} denote the user-category matrix, where Y_l(i, j) = 1 if user u_i is interested in category c_j, and 0 otherwise. A toy example of the data resources is depicted in Figure 1. Note that the labeled user set U_l is a small subset of U, which consists of only the active users in each category, regarded as informative experts.

Based on the above notation, the problem of item selection can be stated as follows: given the rating matrix X for the warm user set U, and the rating matrix X_l and label matrix Y_l for the expert set U_l, the goal is to select the most representative item set V_s from the candidate pool V_p by building an item selection model f as

f : {V_p; X_l, Y_l, X} → {V_s}

2.2 Expert-Guided Selection

In order to capture the preferences of cold users, an intuitive scheme is to use the most discriminative items as queries, since they can express users' attitudes towards items of different categories; thus user category labels are embedded into our selection model. Concretely, a user who has many ratings in a certain category is likely to be interested in that category. Yet most users do not have sufficient ratings to determine their preferred categories. Instead, the labels of experts are utilized as selection guidance, so that the correctness of the labels used can be guaranteed.

With the representation matrix X_l and category label matrix Y_l for experts predefined, the selection procedure can be formulated as the following ℓ2,1-norm regularized least squares regression problem:

min_P ∥X_l^T P − Y_l∥_F^2 + α∥P∥_{2,1}    (1)

Figure 1: A toy example of data resources. X and X_l are the representation matrices for all users and for experts respectively, Y_l is the expert label matrix, and x_{n+1} is the interview result of a cold user.

where P ∈ R^{m_p×k} is a mapping matrix that finds a low-dimensional subspace in which the disagreement between the predicted preferences over categories and their true labels is minimized, and α is the parameter of the ℓ2,1-norm regularization. ∥P∥_{2,1} is defined as

∥P∥_{2,1} = Σ_{i=1}^{m_p} √( Σ_{j=1}^{k} P(i, j)^2 ) = Σ_{i=1}^{m_p} ∥P(i, :)∥_2    (2)

The ℓ2,1-norm induces row sparsity along the item dimension and tends to select items by jointly considering the regression tasks over all k categories.
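As a rough illustration, the row-wise ℓ2,1-norm of Eq. (2) takes only a few lines of NumPy (a sketch of ours, not code from the paper):

```python
import numpy as np

def l21_norm(P):
    """l2,1-norm of P as in Eq. (2): sum over rows of the Euclidean
    norm of each row. Rows driven to zero correspond to items that
    are effectively discarded by the selection model."""
    return float(np.sum(np.linalg.norm(P, axis=1)))

P = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [1.0, 0.0]])
print(l21_norm(P))  # 5 + 0 + 1 = 6.0
```

Unlike the Frobenius norm, this penalty zeroes out whole rows of P at once, which is what makes it suitable for selecting items rather than individual entries.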

2.3 Graph-Regularized Selection

However, this supervised selection model may overfit with respect to unlabeled data, because it takes only the ratings of experts into account. To overcome this limitation, unlabeled users should also be considered. To this end, we propose a semi-supervised method that integrates graph regularization into the above selection model, presented as

min_P ∥X_l^T P − Y_l∥_F^2 + α∥P∥_{2,1} + β Σ_i Σ_{i′} S(i, i′) ∥P^T x_i − P^T x_{i′}∥_2^2    (3)

where β is a weight controlling the strength of the graph regularization. In particular, we construct an undirected graph G with adjacency matrix S over both labeled and unlabeled users. The nodes in G correspond to the user representations {x_1, ..., x_n}. The adjacency matrix S is computed by cosine similarity according to the following rule:

S(i, i′) = cos(x_i, x_{i′}), if x_{i′} ∈ N(x_i) or x_i ∈ N(x_{i′}); 0, otherwise.    (4)

where N(x_i) denotes the nearest-neighbor set of x_i. The assumption underlying the graph regularization in Eq. (3) is that if two users are similar in the representation space, their category labels should also be similar.
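A minimal NumPy sketch of the graph construction in Eq. (4) might look as follows; the function name and numerical details (neighbor count, zero-norm guard) are our own assumptions, since the paper does not specify them:

```python
import numpy as np

def knn_cosine_graph(X, n_neighbors=10):
    """Symmetric kNN adjacency matrix S per Eq. (4).
    X is m_p x n: each column x_i is one user's rating vector."""
    # Cosine similarity between all pairs of columns.
    norms = np.linalg.norm(X, axis=0, keepdims=True)
    Xn = X / np.maximum(norms, 1e-12)
    C = Xn.T @ Xn                          # n x n cosine similarities
    n = C.shape[0]
    S = np.zeros_like(C)
    for i in range(n):
        # nearest neighbors of x_i, excluding x_i itself
        order = np.argsort(-C[i])
        nbrs = [j for j in order if j != i][:n_neighbors]
        S[i, nbrs] = C[i, nbrs]
    # symmetrize: keep an edge if i is in N(j) OR j is in N(i)
    return np.maximum(S, S.T)
```

The final `np.maximum(S, S.T)` implements the "or" in Eq. (4), so the resulting graph is undirected.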

Using the Laplacian matrix L_S = D_S − S in Eq. (3), where D_S(i, i) = Σ_j S(i, j) is a diagonal matrix, the SSDS model combining expert guidance with graph regularization can finally be written as

min_P ∥X_l^T P − Y_l∥_F^2 + α∥P∥_{2,1} + β tr(P^T X L_S X^T P)    (5)

2.4 Optimization of SSDS Algorithm

To optimize the unified objective function in Eq. (5), we rewrite the first term using tr(·) and discard the constant term tr(Y_l^T Y_l). The minimization problem becomes

J(P) = tr(P^T G P − 2 H P) + α∥P∥_{2,1}    (6)

where G = X_l X_l^T + β X L_S X^T and H = Y_l^T X_l^T. To obtain the optimal solution for P, the derivative of the objective function in Eq. (6) is set to ∂J(P)/∂P = 0. Then P is given by

P = (G + α D_P)^{−1} H^T    (7)

where D_P is a diagonal matrix with D_P(i, i) = 1 / (2∥P(i, :)∥_2). It can be proved that G + αD_P is positive definite, so the inverse in Eq. (7) exists. Since D_P depends on P, P and D_P can be updated alternately until convergence. The detailed optimization procedure for SSDS is described in Algorithm 1.

Algorithm 1 The SSDS Algorithm

Input: {V_p, X_l, Y_l, X, α, β, K}
Output: V_s containing the K most representative items
1: Construct G = X_l X_l^T + β X L_S X^T;
2: Construct H = Y_l^T X_l^T;
3: Set t = 0 and initialize D_{P_t} as an identity matrix;
4: repeat
5:   Compute P_{t+1} = (G + α D_{P_t})^{−1} H^T;
6:   Update the diagonal matrix D_{P_{t+1}} with i-th diagonal element 1 / (2∥P_{t+1}(i, :)∥_2);
7:   t = t + 1;
8: until convergence
9: Sort the items by ∥P(i, :)∥_2 in descending order and select the top-K ranked ones.
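A hypothetical NumPy rendering of Algorithm 1 follows. This is our own sketch: the fixed iteration count and the numerical floor on row norms are assumptions, since the paper only says "until convergence".

```python
import numpy as np

def ssds_select(X, Xl, Yl, S, alpha=1.0, beta=0.1, K=10, n_iter=50):
    """Sketch of Algorithm 1: iteratively reweighted solution of Eq. (5).
    X:  m_p x n ratings for all users;  Xl: m_p x n_l expert ratings;
    Yl: n_l x k expert labels;          S:  n x n adjacency matrix, Eq. (4).
    Returns the indices of the K most representative items."""
    mp = X.shape[0]
    Ls = np.diag(S.sum(axis=1)) - S          # graph Laplacian L_S = D_S - S
    G = Xl @ Xl.T + beta * X @ Ls @ X.T      # m_p x m_p
    H = Yl.T @ Xl.T                          # k x m_p
    Dp = np.eye(mp)                          # step 3: D_P starts as identity
    for _ in range(n_iter):                  # steps 4-8 (fixed budget here)
        P = np.linalg.solve(G + alpha * Dp, H.T)                 # Eq. (7)
        row_norms = np.linalg.norm(P, axis=1)
        Dp = np.diag(1.0 / (2.0 * np.maximum(row_norms, 1e-12)))  # step 6
    # step 9: rank items by row norm of P, descending
    return np.argsort(-np.linalg.norm(P, axis=1))[:K]
```

Each iteration solves a ridge-like linear system; the reweighting of D_P is what progressively drives unimportant rows of P toward zero.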

3. COLD-START RECOMMENDATION

Once the representative items are selected, an interview is conducted for each new user by asking questions about the learned items, aiming to elicit preferences efficiently. Furthermore, to recommend items, a ranking estimation method is introduced to predict rankings over the whole item set V.

Without loss of generality, a new user is denoted u_{n+1}. After the interview process, one new column x̃_{n+1} is generated. Next, we place emphasis on utilizing the representative item set V_s, the observed rating matrix R, and the representation vector x̃_{n+1} for cold-start recommendation. Inspired by low-rank matrix factorization, a globally optimal formulation is developed that solves for a loading matrix W ∈ R^{K×m} by

min_W (1/2) ∥R − W^T X̃∥_F^2 + (λ/2) ∥W∥_F^2    (8)

where X̃ = X(V_s, :) is the new matrix description of the user set U based on the selected items V_s. Similarly, u_{n+1} also has x̃_{n+1}.

Accordingly, the solution for W is given by

W* = (X̃ X̃^T + λI)^{−1} X̃ R^T    (9)

where I ∈ R^{K×K} is the identity matrix. When a new user arrives, the personalized ranking scores are computed as r_{n+1} = (W*)^T x̃_{n+1}. Note that the loading matrix can be pre-computed, so our model is consistent with the online-updating principle of real-world systems.
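The closed-form solution in Eq. (9) and the scoring step are straightforward to sketch in NumPy (function names are our own; the paper gives only the formulas):

```python
import numpy as np

def fit_loading_matrix(R, X_tilde, lam=0.1):
    """Closed-form ridge solution of Eq. (8)/(9):
    W* = (X~ X~^T + lam I)^{-1} X~ R^T, pre-computable offline.
    R: m x n full rating matrix; X_tilde: K x n ratings on selected items."""
    K = X_tilde.shape[0]
    return np.linalg.solve(X_tilde @ X_tilde.T + lam * np.eye(K),
                           X_tilde @ R.T)    # K x m

def rank_cold_user(W, x_new):
    """Ranking scores r_{n+1} = (W*)^T x~_{n+1} for a new user's
    interview answers x_new (length K)."""
    return W.T @ x_new                       # length m, one score per item
```

Because W* depends only on the warm-user data, it can be computed once offline; scoring a cold user is then a single K×m matrix-vector product.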

Table 1: Comparison of Different Selection Strategies for Cold-Start Recommendation (n_l = 50, β = 0.1)

Query#  Method     MAP     NDCG    Prec@5
K=10    random     0.0669  0.0465  0.0445
        popular 1  0.3119  0.2027  0.1950
        popular 2  0.3397  0.2136  0.2023
        k-medoids  0.1971  0.1436  0.1369
        SSDS       0.3516  0.2272  0.2151
K=20    random     0.1834  0.1268  0.1204
        popular 1  0.3479  0.2208  0.2089
        popular 2  0.3352  0.2120  0.2023
        k-medoids  0.2692  0.1892  0.1803
        SSDS       0.3816  0.2469  0.2320
K=30    random     0.2109  0.1467  0.1381
        popular 1  0.3668  0.2302  0.2161
        popular 2  0.3414  0.2140  0.2022
        k-medoids  0.3454  0.2392  0.2290
        SSDS       0.4122  0.2701  0.2533

4. EXPERIMENTS

4.1 Experiment Design

In order to evaluate how SSDS behaves on user cold-start recommendation, we conduct experiments on the benchmark MovieLens dataset. The dataset includes one million observed ratings, ranging from 1 to 5 points, given by 6,040 users to 3,952 movies. In addition, category information about the movies is available. To simulate cold users, 20% of the users are randomly picked from the whole user set. Then, to test cold-start recommendation, an 80-20 split is used for each cold user: 80% of the ratings are randomly chosen as the response set for answering queries during the interview, while the remaining 20% form the test set. The experiments are set up as 5-fold cross-validation. The parameter λ in Eq. (8) and the size of the nearest-neighbor set are determined on the MovieLens data by cross-validation as 0.1 and 10, respectively. We also restrict the candidate pool V_p to items with more than 500 ratings.

4.2 Evaluation Metrics

After SSDS-based preference elicitation, ranking scores are predicted. The performance is measured by three classical top-N recommendation metrics:

Mean Average Precision (MAP). For each user, Average Precision (AP) is first defined as

AP(u) = [ Σ_{i=1}^{N} prec(i) × pref(i) ] / (# of preferred items)

where prec(i) is the precision and pref(i) is a binary preference indicator at rank position i. MAP is then computed from AP as

MAP = (1/|U|) Σ_{u∈U} AP(u)

Normalized Discounted Cumulative Gain (NDCG). For a ranked list of N items, NDCG is computed by

NDCG = (1/IDCG) × Σ_{i=1}^{N} (2^{pref(i)} − 1) / log_2(i + 1)

where IDCG is the value produced by a perfect ranking.

Figure 2: Performance variation of SSDS with respect to the expert number n_l and the query number K (two panels: MAP and NDCG vs. K ∈ {10, 20, 30}, for n_l ∈ {20, 50, 200, 500}).

Precision. If a recommended item is contained in the test set, we consider it correctly predicted. Prec@N evaluates the ratio of correctly predicted items in the top-N list. In our work, we report results for Prec@5.
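For concreteness, the binary-relevance versions of AP and NDCG described above can be sketched as follows (our own implementation, assuming the conventions stated in the text):

```python
import numpy as np

def average_precision(pref):
    """AP for one user from a binary relevance list `pref`,
    ordered by predicted rank (1 = preferred item)."""
    pref = np.asarray(pref, dtype=float)
    if pref.sum() == 0:
        return 0.0
    prec_at_i = np.cumsum(pref) / np.arange(1, len(pref) + 1)
    return float(np.sum(prec_at_i * pref) / pref.sum())

def ndcg(pref):
    """NDCG for a binary relevance list; the ideal ranking
    (all preferred items first) supplies the normalizer IDCG."""
    pref = np.asarray(pref, dtype=float)
    discounts = 1.0 / np.log2(np.arange(2, len(pref) + 2))
    dcg = np.sum((2.0 ** pref - 1.0) * discounts)
    ideal = np.sort(pref)[::-1]
    idcg = np.sum((2.0 ** ideal - 1.0) * discounts)
    return float(dcg / idcg) if idcg > 0 else 0.0

print(average_precision([1, 0, 1, 0]))  # (1/1 + 2/3) / 2 ≈ 0.8333
```

MAP over a user set is then simply the mean of `average_precision` across users, matching the equation above.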

4.3 Results and Analysis

To demonstrate the effectiveness of the proposed SSDS model, we compare it with several traditional query selection methods for the cold-start problem, including the random, popular, and k-medoids strategies. It is worth noting that the same cold-start recommendation approach introduced in Section 3 is combined with all of the above selection methods. The experimental results are shown in Table 1. To make a fair comparison, two different popular strategies are used here. The first one (popular 1) is a query list comprised of popular items extracted from multiple categories to ensure the diversity of the query list, while the other (popular 2) is a standard popular set that ignores category information.

Because rating elicitation aims to minimize user interaction cost while improving recommendation accuracy, the length of the interview is the pivotal parameter in our work. To avoid tedious interviews, we vary the query number K from 10 to 30 and observe the overall trend of recommendation accuracy over the course of the interview process.

For all compared methods, the performance increases as K grows. This phenomenon confirms a basic fact: the more ratings elicited from users, the more effective the recommendations. On the other hand, these methods are not equally efficient at learning user preferences. Specifically, two points can be inferred by comparing their performance: 1. The random strategy performs severely worse than the other strategies, which empirically proves the necessity of designing an appropriate selection model. 2. The SSDS model yields the best performance under all evaluation conditions, which verifies that our semi-supervised discriminative selection framework identifies more informative items than the other methods and thus boosts cold-start recommendation accuracy.

Additionally, to assess the influence of the two fundamental components of SSDS, expert guidance and graph regularization, we study the impact of the corresponding parameters: the number of experts n_l and the weight of the regularization term β. We analyze each of them while keeping the other fixed. First, fixing β = 0.1 and given the query number K ∈ {10, 20, 30}, the performance variation with respect to n_l is shown in Figure 2. As n_l increases, both MAP and NDCG first rise until they reach a peak and then drop. The main reason behind this observation is: at the

Figure 3: Performance variation of SSDS with respect to the graph weight β and the query number K (two panels: MAP and NDCG vs. K ∈ {10, 20, 30}, for β ∈ {0, 0.001, 0.01, 0.1, 1}).

beginning, adding more labeled users provides more discriminative information and helps find a better user representation in item space. Nevertheless, as the number of labeled users grows, untrustworthy labels are mixed into the discriminative information, eventually causing the accuracy to decline.

Second, fixing n_l = 50 and again given the query number K ∈ {10, 20, 30}, the performance variation with respect to β is illustrated in Figure 3. β is varied over the range {0, 0.001, 0.01, 0.1, 1}, where a larger β strengthens the effect of the graph constraint on users with similar rating behavior. From the figure we can see that, even though the best point is achieved at different values of β, the performance clearly improves over the model without graph regularization (β = 0). In particular, the improvement is significant when the interview is short (K = 10).

5. CONCLUSIONS

In this paper, we propose a novel query selection framework for preference elicitation. By using semi-supervised and discriminative information, our model selects a representative item set that comprehensively describes user preferences over diverse categories. Experimental results on a benchmark movie rating dataset show that the proposed query selection model produces more accurate recommendations for cold users than its competitors.

6. ACKNOWLEDGEMENT

This work was supported by the 973 Program under Project 2010CB327905, and by the National Natural Science Foundation of China under Grants No. 61170127 and 61070104.

7. REFERENCES

[1] Z. Gantner, L. Drumond, C. Freudenthaler, S. Rendle, and

L. Schmidt-Thieme. Learning attribute-to-feature mappings

for cold-start recommendations. In Proc. of ICDM, 2010.

[2] N. Golbandi, Y. Koren, and R. Lempel. On bootstrapping

recommender systems. In Proc. of CIKM, 2010.

[3] A. Krohn-Grimberghe, L. Drumond, C. Freudenthaler, and

L. Schmidt-Thieme. Multi-relational matrix factorization

using bayesian personalized ranking for social network data.

In Proc. of WSDM, 2012.

[4] N. N. Liu, X. Meng, C. Liu, and Q. Yang. Wisdom of the

better few: cold start recommendation via representative

based rating elicitation. In Proc. of RecSys, 2011.

[5] L. Tang and H. Liu. Relational learning via latent social

dimensions. In Proc. of KDD, 2009.

[6] K. Zhou, S.-H. Yang, and H. Zha. Functional matrix

factorizations for cold-start recommendation. In Proc. of

SIGIR, 2011.