Semi-Supervised Discriminative Preference Elicitation for Cold-Start Recommendation
Xi Zhang, Jian Cheng, Ting Yuan, Biao Niu, Hanqing Lu
National Laboratory of Pattern Recognition
Institute of Automation, Chinese Academy of Sciences
Beijing, China
{xi.zhang, jcheng, tyuan, bniu, luhq}@nlpr.ia.ac.cn
ABSTRACT
Recommendation for cold users is challenging because no prior ratings are available for preference prediction. To tackle this cold-start scenario, rating elicitation is usually employed through an initial interview in which users are queried on a set of carefully selected items. In this paper, we propose a novel framework that mines the most valuable items for the query set using a semi-supervised discriminative selection (SSDS) model. To learn a low-dimensional representation of users in item space that reflects their tastes to a large extent, the model incorporates category labels as discriminative information. To ensure that the labels used are reliable while all users are still considered, the model adopts a semi-supervised scheme that combines expert guidance with graph regularization. Experimental results on the real-world MovieLens dataset demonstrate that the proposed SSDS model outperforms traditional preference elicitation methods on top-N measures for cold-start recommendation.
Categories and Subject Descriptors
H.3.3 [Information Search and Retrieval]: Information filtering; H.3.5 [On-line Information Services]: Web-based services
General Terms
Algorithms, Experimentation
Keywords
Recommender Systems, Cold-Start, Preference Elicitation
1. INTRODUCTION
With the prevalence of massive Web 2.0 applications, online information about a large variety of items is growing rapidly. Recommender systems, as important tools for information filtering, have achieved success on many well-known commercial websites such as Amazon, Netflix and Last.fm. Most of these websites are based on Collaborative Filtering (CF), a widely used recommendation approach. The basic hypothesis of CF is that similar users have similar responses to similar items. However, it is difficult to compute similarity for users without any observed ratings, so CF fails in this cold-start scenario. In general, recommendation for new users is quite challenging.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
CIKM’13, Oct. 27–Nov. 1, 2013, San Francisco, CA, USA.
Copyright 2013 ACM 978-1-4503-2263-8/13/10 ...$15.00.
Existing studies on the cold-start problem mainly focus on three strategies. The first extracts latent features via auxiliary social relations instead of directly exploiting CF [5, 3]. The second maps additional attributes from user profiles into a latent feature space so that the missing rating records can be compensated for [1]. One limitation of these methods is that user profiles and social connections are usually personal and not always available. Hence many recommender systems resort to a third strategy, Rating Elicitation, which solicits user preferences through an initial interview on some items [4, 2, 6]. On the one hand, rating elicitation exhibits an overall tendency: the more ratings are requested from users, the better the recommendations are. On the other hand, different observed rating sets might lead to significantly different recommendation performance. Consequently, to construct the most valuable query set for interviewing cold users, the essential goal of preference elicitation is item selection that makes full use of the information available in the system.
In this paper, our item selection model is developed on the basis of two concerns. From the perspective of the elicitation mechanism, the items in the query set should largely reflect user taste across diverse items of multiple genres. The reason is natural: if a user's interests in categories are captured, it is much easier to understand the user's preferences comprehensively and to return an appealing item list. From the perspective of user modeling, since no information other than observed ratings is available to describe a user, the rating vectors can be viewed as a representation of users. The problem of item selection is thus converted into the problem of learning a low-dimensional representation that embodies user preference as much as possible.
Given these two considerations, we incorporate category labels as discriminative information to guide the selection of representative items. In particular, user category labels are inferred from the rating matrix and the item-category relations. We use the category labels of only a subset of users (experts) to make sure the labels are trustworthy. In addition, to take advantage of unlabeled user data, graph regularization is employed. We therefore propose a Semi-Supervised Discriminative Selection (SSDS) framework for preference elicitation, which integrates the discriminative and semi-supervised components by combining expert guidance with a graph constraint.
The output of the SSDS model is the set of selected representative items. Ratings on these items are regarded as new descriptions of users in a low-dimensional space. Finally, ranking scores are generated for cold users from these learned user representations together with the original rating matrix.
2. SEMI-SUPERVISED DISCRIMINATIVE SELECTION MODEL
2.1 Problem Statement
Let $U = \{u_1, \cdots, u_n\}$ be the user collection and $V = \{v_1, \cdots, v_m\}$ be the item collection. Traditional elicitation models normally take an observed rating matrix $R = (r_1, \cdots, r_n) \in \mathbb{R}^{m \times n}$ as input. Due to high data sparsity in real-world scenarios, each column $r_i$ contains a large number of missing values, and most of the items are not popular. A candidate pool $V_p = \{v_1, \cdots, v_{m_p}\}$ with $m_p < m$ is therefore constructed by filtering out the long-tailed items. Then $X = (x_1, \cdots, x_n) \in \mathbb{R}^{m_p \times n}$ is the submatrix of $R$ in which each column $x_i$ holds the ratings that user $u_i$ gave to the candidate pool $V_p$ and is regarded as the item-space representation of $u_i$.
Suppose there are $k$ categories in the given dataset, and let $C = \{c_1, \cdots, c_k\}$ be the category label set. Let $X_l = (x_1, \cdots, x_{n_l}) \in \mathbb{R}^{m_p \times n_l}$ with $n_l < n$ denote the rating matrix for labeled users, and let $Y_l \in \mathbb{R}^{n_l \times k}$ denote the user-category matrix, where $Y_l(i, j) = 1$ if user $u_i$ is interested in category $c_j$, and $0$ otherwise. A toy example of the data resources is depicted in Figure 1. Note that the labeled user set $U_l$ is a small subset of $U$; it consists of only the active users in each category, who are regarded as informative experts.
Based on the above notation, the problem of item selection can be stated as follows: given the rating matrix $X$ for the warm user set $U$, and the rating matrix $X_l$ and label matrix $Y_l$ for the expert set $U_l$, the goal is to select the most representative item set $V_s$ from the candidate pool $V_p$ by building an item selection model $f$:

$$f : \{V_p; X_l, Y_l, X\} \rightarrow \{V_s\}$$
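The construction of the candidate pool $V_p$ and the submatrix $X$ can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `build_candidate_pool`, the `min_ratings` threshold argument, and the convention that `0` marks a missing rating are choices of this sketch, not part of the paper.

```python
import numpy as np

def build_candidate_pool(R, min_ratings):
    """Return (pool_idx, X) for an m x n item-by-user rating matrix R.

    pool_idx holds the indices of items with more than `min_ratings`
    observed ratings; X is the corresponding m_p x n submatrix of R.
    Assumption of this sketch: 0 marks a missing rating.
    """
    counts = np.count_nonzero(R, axis=1)          # ratings received per item
    pool_idx = np.flatnonzero(counts > min_ratings)
    X = R[pool_idx, :]                            # rows of R for the pool items
    return pool_idx, X

# Toy data: 4 items (rows) x 5 users (columns), 0 = unrated.
R = np.array([
    [5, 0, 4, 3, 0],
    [0, 0, 1, 0, 0],
    [4, 5, 0, 2, 3],
    [0, 2, 0, 0, 0],
], dtype=float)

pool_idx, X = build_candidate_pool(R, min_ratings=2)
print(pool_idx)   # indices of the retained (non long-tailed) items
print(X.shape)    # (m_p, n)
```

Each column of `X` is then the item-space representation $x_i$ of one user.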
2.2 Expert-Guided Selection
To capture the preferences of cold users, an intuitive scheme is to use as queries the most discriminative items, that is, items that express users' attitudes towards different categories; user category labels are therefore embedded into our selection model. Concretely, a user who has many ratings in a certain category is likely to be interested in that category. However, most users do not have enough ratings to determine their preferred categories. Instead, only the labels of experts are used as selection guidance, so that the labels used can be trusted.
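One possible way to turn "many ratings in a certain category" into expert labels is sketched below. The paper does not spell out the exact rule, so everything here is an assumption made for illustration: the helper `select_experts`, the binary item-category matrix `item_cat`, and the choice of taking the `per_category` most active users in each category as experts.

```python
import numpy as np

def select_experts(X, item_cat, per_category):
    """Pick the most active users per category and build their label matrix.

    X:        m_p x n ratings, 0 = unrated (assumption of this sketch).
    item_cat: m_p x k binary item-category matrix (hypothetical input).
    Returns (expert_idx, Y_l) where Y_l has shape n_l x k.
    """
    rated = (X > 0).astype(float)              # m_p x n indicator of ratings
    activity = rated.T @ item_cat              # n x k: ratings per user per category
    k = item_cat.shape[1]
    chosen = {}                                # user index -> set of categories
    for j in range(k):
        # most active users in category j; ties are broken by user index
        top = np.argsort(-activity[:, j], kind="stable")[:per_category]
        for u in top:
            if activity[u, j] > 0:             # skip users with no rating in j
                chosen.setdefault(int(u), set()).add(j)
    expert_idx = np.array(sorted(chosen), dtype=int)
    Y_l = np.zeros((len(expert_idx), k))
    for row, u in enumerate(expert_idx):
        Y_l[row, sorted(chosen[u])] = 1.0      # Y_l(i, j) = 1 for expert categories
    return expert_idx, Y_l

# Toy data: 3 items x 4 users, 2 categories.
X_toy = np.array([[5, 0, 3, 0],
                  [4, 0, 0, 0],
                  [0, 2, 0, 1]], dtype=float)
item_cat = np.array([[1, 0],
                     [1, 0],
                     [0, 1]], dtype=float)
expert_idx, Y_l = select_experts(X_toy, item_cat, per_category=1)
print(expert_idx)
print(Y_l)
```

The expert rating matrix $X_l$ is then `X_toy[:, expert_idx]`.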
With the representation matrix $X_l$ and the category label matrix $Y_l$ for experts predefined, the selection procedure can be formulated as the following $\ell_{2,1}$-norm regularized least squares regression problem:

$$\min_{P} \; \|X_l^T P - Y_l\|_F^2 + \alpha \|P\|_{2,1} \tag{1}$$

where $P \in \mathbb{R}^{m_p \times k}$ is a mapping matrix that finds a low-dimensional subspace in which the disagreement between the predicted preferences over categories and their true labels is minimized, and $\alpha$ is the parameter of the $\ell_{2,1}$-norm regularization. $\|P\|_{2,1}$ is defined as

$$\|P\|_{2,1} = \sum_{i=1}^{m_p} \sqrt{\sum_{j=1}^{k} P^2(i, j)} = \sum_{i=1}^{m_p} \|P(i, :)\|_2 \tag{2}$$

where the $\ell_{2,1}$-norm enforces sparsity along the item dimension and tends to select items by jointly considering the regression tasks on all $k$ categories.

Figure 1: A toy example of data resources. $X$ and $X_l$ are the representation matrices for all users and for experts respectively, $Y_l$ is the expert label matrix, and $x_{n+1}$ is the interview result of a cold user.
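As a small numerical illustration of Eq. (2), the $\ell_{2,1}$-norm is the sum of the Euclidean norms of the rows of $P$, so a row that is entirely zero (an item that is not selected) contributes nothing. The function name `l21_norm` and the toy matrix are assumptions of this sketch.

```python
import numpy as np

def l21_norm(P):
    """Sum over rows of the Euclidean norm of each row of P (Eq. (2))."""
    return float(np.sum(np.linalg.norm(P, axis=1)))

# 3 items x 2 categories; the second item has an all-zero row.
P = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [1.0, 0.0]])
print(np.linalg.norm(P, axis=1))  # per-item row norms: 5, 0, 1
print(l21_norm(P))                # 5 + 0 + 1 = 6.0
```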
2.3 Graph-Regularized Selection
However, the supervised selection model might overfit with respect to unlabeled data, because it takes only the ratings of experts into account. To overcome this limitation, unlabeled users should also be considered. To this end, we propose a semi-supervised method that integrates graph regularization into the above selection model:

$$\min_{P} \; \|X_l^T P - Y_l\|_F^2 + \alpha \|P\|_{2,1} + \beta \sum_{i} \sum_{i'} S(i, i') \, \|P^T x_i - P^T x_{i'}\|_2^2 \tag{3}$$
where $\beta$ is a weight that controls the strength of the graph regularization. In particular, we construct an undirected graph $G$ with adjacency matrix $S$ over both labeled and unlabeled users. The nodes of $G$ correspond to the user representations $\{x_1, \cdots, x_n\}$. The adjacency matrix $S$ is computed from cosine similarity according to the following rule:

$$S(i, i') = \begin{cases} \cos(x_i, x_{i'}), & \text{if } x_i \in \mathcal{N}(x_{i'}) \text{ or } x_{i'} \in \mathcal{N}(x_i) \\ 0, & \text{otherwise} \end{cases} \tag{4}$$

where $\mathcal{N}(x_i)$ denotes the nearest neighbor set of $x_i$. The assumption behind the graph regularization in Eq. (3) is that if two users are similar in the representation space, their category labels should also be similar.
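The adjacency rule of Eq. (4) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `knn_cosine_graph`, the use of cosine similarity itself to define the nearest neighbors, and the handling of all-zero rating vectors are assumptions of this sketch.

```python
import numpy as np

def knn_cosine_graph(X, n_neighbors):
    """Build the symmetric n x n adjacency matrix S of Eq. (4).

    X holds one user per column (m_p x n). Neighbors are taken to be the
    n_neighbors users with the highest cosine similarity (an assumption).
    """
    n = X.shape[1]
    norms = np.linalg.norm(X, axis=0)
    norms[norms == 0] = 1.0                     # guard against empty rating vectors
    Xn = X / norms
    C = Xn.T @ Xn                               # pairwise cosine similarities
    np.fill_diagonal(C, -np.inf)                # a user is not its own neighbor
    nn = np.argsort(-C, axis=1, kind="stable")[:, :n_neighbors]
    mask = np.zeros((n, n), dtype=bool)
    mask[np.repeat(np.arange(n), nn.shape[1]), nn.ravel()] = True
    mask |= mask.T                              # the "or" condition in Eq. (4)
    np.fill_diagonal(C, 0.0)
    return np.where(mask, C, 0.0)

# Three users with rating vectors (1,0), (1,1) and (0,1).
X_toy = np.array([[1.0, 1.0, 0.0],
                  [0.0, 1.0, 1.0]])
S = knn_cosine_graph(X_toy, n_neighbors=1)
print(np.round(S, 3))
```

In this toy case the first and third users are orthogonal, so they are linked only through the second user.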
Using the Laplacian matrix $L_S = D_S - S$ in Eq. (3), where $D_S$ is a diagonal matrix with $D_S(i, i) = \sum_j S(i, j)$, the SSDS model combining expert guidance with graph regularization can finally be written as

$$\min_{P} \; \|X_l^T P - Y_l\|_F^2 + \alpha \|P\|_{2,1} + \beta \, \mathrm{tr}(P^T X L_S X^T P) \tag{5}$$
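The step from the pairwise sum in Eq. (3) to the trace in Eq. (5) is the standard graph Laplacian identity. A brief sketch, writing $f_i = P^T x_i$ and $F = P^T X$ (notation introduced here only for this derivation) and using the symmetry of $S$:

$$\begin{aligned}
\sum_{i} \sum_{i'} S(i, i') \, \|f_i - f_{i'}\|_2^2
&= \sum_{i} \sum_{i'} S(i, i') \left( \|f_i\|_2^2 + \|f_{i'}\|_2^2 - 2 f_i^T f_{i'} \right) \\
&= 2 \sum_{i} D_S(i, i) \, \|f_i\|_2^2 - 2 \sum_{i} \sum_{i'} S(i, i') \, f_i^T f_{i'} \\
&= 2 \, \mathrm{tr}(F D_S F^T) - 2 \, \mathrm{tr}(F S F^T) \\
&= 2 \, \mathrm{tr}(P^T X L_S X^T P)
\end{aligned}$$

The two expressions therefore differ only by the constant factor 2, which can be absorbed into the weight $\beta$; this reading of the relation between Eq. (3) and Eq. (5) is an interpretation, since the text does not state how the constant is handled.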
2.4 Optimization of SSDS Algorithm
To optimize the unified objective function in Eq. (5), we rewrite the first term using $\mathrm{tr}(\cdot)$ and discard the constant term $\mathrm{tr}(Y_l^T Y_l)$. The minimization problem becomes

$$J(P) = \mathrm{tr}(P^T G P - 2 H P) + \alpha \|P\|_{2,1} \tag{6}$$

where $G = X_l X_l^T + \beta X L_S X^T$ and $H = Y_l^T X_l^T$. To obtain the optimal solution for $P$, the derivative of the objective function in Eq. (6) is set to zero, $\frac{\partial J(P)}{\partial P} = 0$. Then $P$ is given by

$$P = (G + \alpha D_P)^{-1} H^T \tag{7}$$

where $D_P$ is a diagonal matrix with $D_P(i, i) = \frac{1}{2 \|P(i, :)\|_2}$. It can be proved that $G + \alpha D_P$ is a positive definite matrix, so the inverse in Eq. (7) exists. Since $D_P$ depends on $P$, $P$ and $D_P$ can be updated alternately until convergence. The detailed optimization algorithm for SSDS is described in Algorithm 1.
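For readers who want the intermediate step between Eq. (6) and Eq. (7), the gradient can be worked out term by term. The sketch below assumes that every row of $P$ is nonzero, so that the $\ell_{2,1}$-norm is differentiable, and uses the symmetry of $G$:

$$\begin{aligned}
\frac{\partial}{\partial P} \, \mathrm{tr}(P^T G P) &= 2 G P, \qquad
\frac{\partial}{\partial P} \, \mathrm{tr}(H P) = H^T, \qquad
\frac{\partial}{\partial P} \, \|P\|_{2,1} = 2 D_P P, \\
\frac{\partial J(P)}{\partial P} &= 2 G P - 2 H^T + 2 \alpha D_P P = 0
\;\; \Longrightarrow \;\; (G + \alpha D_P) \, P = H^T
\end{aligned}$$

The third derivative follows because row $i$ of $\partial \|P\|_{2,1} / \partial P$ is $P(i, :) / \|P(i, :)\|_2$, which equals $2 D_P(i, i) \, P(i, :)$ with $D_P(i, i) = \frac{1}{2 \|P(i, :)\|_2}$.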
Algorithm 1 The SSDS Algorithm
Input: $\{V_p, X_l, Y_l, X, \alpha, \beta, K\}$
Output: $V_s$ including the $K$ most representative items
1: Construct $G$ by $X_l X_l^T + \beta X L_S X^T$;
2: Construct $H$ by $Y_l^T X_l^T$;
3: Set $t = 0$ and initialize $D_{P_t}$ as an identity matrix;
4: loop
5:   Compute $P_{t+1} = (G + \alpha D_{P_t})^{-1} H^T$;
6:   Update the diagonal matrix $D_{P_{t+1}}$, whose $i$-th diagonal element is $\frac{1}{2 \|P_{t+1}(i, :)\|_2}$;
7:   $t = t + 1$;
8: end loop until convergence
9: Sort the items by $\|P(i, :)\|_2$ in descending order and select the top-$K$ ranked ones.
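The steps of Algorithm 1 can be sketched in NumPy as below. This is a minimal sketch under stated assumptions rather than the authors' code: the function name `ssds_select`, the iteration cap `n_iter`, the stopping tolerance `tol`, and the small constant `eps` that guards against division by zero for all-zero rows of $P$ are all additions of this sketch, since the paper only says "until convergence". The toy data is random and purely illustrative.

```python
import numpy as np

def ssds_select(X_l, Y_l, X, L_S, alpha, beta, K, n_iter=50, tol=1e-6, eps=1e-8):
    """Alternate between P and D_P as in Algorithm 1; return (selected, P)."""
    G = X_l @ X_l.T + beta * (X @ L_S @ X.T)      # step 1: m_p x m_p
    Ht = X_l @ Y_l                                # step 2: H^T = X_l Y_l, m_p x k
    D = np.eye(G.shape[0])                        # step 3: D_{P_0} = I
    P = np.zeros_like(Ht)
    for _ in range(n_iter):
        # step 5: solve (G + alpha D) P = H^T instead of forming the inverse
        P_new = np.linalg.solve(G + alpha * D, Ht)
        # step 6: D(i, i) = 1 / (2 ||P(i, :)||_2), with eps as a safeguard
        row_norms = np.maximum(np.linalg.norm(P_new, axis=1), eps)
        D = np.diag(1.0 / (2.0 * row_norms))
        done = np.linalg.norm(P_new - P) < tol    # assumed convergence test
        P = P_new
        if done:
            break
    scores = np.linalg.norm(P, axis=1)            # step 9: rank items by row norm
    selected = np.argsort(-scores, kind="stable")[:K]
    return selected, P

# Toy problem with random data (illustrative only).
rng = np.random.default_rng(0)
m_p, n, n_l, k = 6, 8, 4, 2
X = rng.integers(0, 6, size=(m_p, n)).astype(float)    # ratings, 0 = unrated
X_l = X[:, :n_l]                                        # first n_l users as experts
Y_l = rng.integers(0, 2, size=(n_l, k)).astype(float)   # random expert labels
A = rng.random((n, n))
S = (A + A.T) / 2.0                                     # symmetric toy similarities
np.fill_diagonal(S, 0.0)
L_S = np.diag(S.sum(axis=1)) - S                        # Laplacian L_S = D_S - S

selected, P = ssds_select(X_l, Y_l, X, L_S, alpha=1.0, beta=0.1, K=3)
print(selected)   # indices (within the candidate pool) of the K selected items
```

Solving the linear system with `np.linalg.solve` gives the same $P_{t+1}$ as the explicit inverse in step 5 while avoiding the cost and numerical error of forming the inverse.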
3. COLD-START RECOMMENDATION
Once the representative items are selected, an interview is conducted with each new user by asking about the learned items, aiming to elicit preferences efficiently. Then, to recommend items, a ranking estimation method is introduced to predict ranking scores over the whole item set $V$.
Without loss of generality, a new user can be denoted as $u_{n+1}$. After the interview, one new column $x_{n+1}$ is generated. Next, we focus on using the representative item set $V_s$, the observed rating matrix $R$ and the representation vector $x_{n+1}$ in cold-start recommendation. Inspired by low-rank matrix factorization, we develop a formulation with a global optimum that solves for a loading matrix $W \in \mathbb{R}^{K \times m}$ by

$$\min_{W} \; \frac{1}{2} \|R - W^T \tilde{X}\|_F^2 + \frac{\lambda}{2} \|W\|_F^2 \tag{8}$$

where $\tilde{X} = X(V_s, :)$ is the new matrix description of the user set $U$ based on the selected items $V_s$. Similarly, $u_{n+1}$ has the representation $\tilde{x}_{n+1}$.

Accordingly, the solution for $W$ is given by

$$W = (\tilde{X} \tilde{X}^T + \lambda I)^{-1} \tilde{X} R^T \tag{9}$$

where $I \in \mathbb{R}^{K \times K}$ is an identity matrix. When there is a new user, the personalized ranking score is computed by $r_{n+1} = W^T \tilde{x}_{n+1}$. Note that the loading matrix can be pre-computed, so our model is consistent with the online-updating principle of real-world systems.
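The closed-form solution of Eq. (9) and the scoring of a new user can be sketched as follows. The function names `fit_loading_matrix` and `score_new_user`, the random toy ratings, and the chosen item indices are assumptions of this sketch.

```python
import numpy as np

def fit_loading_matrix(R, X_sel, lam):
    """Eq. (9): W = (X~ X~^T + lam I)^{-1} X~ R^T, with X~ = X_sel (K x n)."""
    K = X_sel.shape[0]
    return np.linalg.solve(X_sel @ X_sel.T + lam * np.eye(K), X_sel @ R.T)

def score_new_user(W, x_new):
    """Ranking scores over all m items for interview answers x_new (length K)."""
    return W.T @ x_new

# Toy data: m = 5 items, n = 7 warm users, K = 2 selected items.
rng = np.random.default_rng(0)
R = rng.integers(0, 6, size=(5, 7)).astype(float)
selected = np.array([0, 3])                  # hypothetical output of item selection
X_sel = R[selected, :]                       # ratings of warm users on selected items

W = fit_loading_matrix(R, X_sel, lam=0.1)    # can be pre-computed offline
x_new = np.array([4.0, 2.0])                 # interview answers of a cold user
scores = score_new_user(W, x_new)
print(np.argsort(-scores))                   # items ranked by predicted score
```

Because `W` depends only on warm users, serving a cold user costs one $m \times K$ matrix-vector product.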
Table 1: Comparison of Different Selection Strategies for Cold-Start Recommendation ($n_l = 50$, $\beta = 0.1$)

Methods     Query#   MAP      NDCG     Prec@5
random      K=10     0.0669   0.0465   0.0445
popular 1   K=10     0.3119   0.2027   0.1950
popular 2   K=10     0.3397   0.2136   0.2023
k-medoids   K=10     0.1971   0.1436   0.1369
SSDS        K=10     0.3516   0.2272   0.2151
random      K=20     0.1834   0.1268   0.1204
popular 1   K=20     0.3479   0.2208   0.2089
popular 2   K=20     0.3352   0.2120   0.2023
k-medoids   K=20     0.2692   0.1892   0.1803
SSDS        K=20     0.3816   0.2469   0.2320
random      K=30     0.2109   0.1467   0.1381
popular 1   K=30     0.3668   0.2302   0.2161
popular 2   K=30     0.3414   0.2140   0.2022
k-medoids   K=30     0.3454   0.2392   0.2290
SSDS        K=30     0.4122   0.2701   0.2533
4. EXPERIMENTS
4.1 Experiment Design
To evaluate how SSDS behaves on user cold-start recommendation, we conduct experiments on the benchmark dataset MovieLens. The dataset includes one million observed ratings, ranging from 1 to 5 points, given by 6,040 users to 3,952 movies. Category information about the movies is also available. To simulate cold users, 20% of the users are randomly picked from the whole user set. Then, to test the cold-start recommendation results, an 80-20 split is used for each cold user: 80% of the ratings are randomly chosen as the response set used to answer queries during the interview, while the remaining 20% form the test set. The experiments are set up as 5-fold cross-validation. The parameter $\lambda$ in Eq. (8) and the size of the nearest neighbor set are determined on the MovieLens data by cross-validation as 0.1 and 10, respectively. We choose items with more than 500 ratings to construct the candidate pool $V_p$.
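The evaluation protocol can be sketched as below. Reading "5-fold cross-validation" as a partition of the users into five groups, each serving once as the 20% cold users, is an interpretation made for this sketch; the helper names `user_folds` and `split_cold_user` and the toy item ids are likewise hypothetical.

```python
import numpy as np

def user_folds(n_users, n_folds, rng):
    """Partition user indices into n_folds disjoint groups of cold users."""
    return np.array_split(rng.permutation(n_users), n_folds)

def split_cold_user(rated_items, rng, response_frac=0.8):
    """Randomly split one cold user's rated items into response and test sets."""
    items = rng.permutation(np.asarray(rated_items))
    cut = int(round(response_frac * len(items)))
    return items[:cut], items[cut:]

rng = np.random.default_rng(0)
folds = user_folds(n_users=10, n_folds=5, rng=rng)
response, test = split_cold_user([3, 8, 15, 16, 23, 42, 50, 51, 60, 77], rng)
print([len(f) for f in folds])     # 2 of 10 users (20%) are cold in each fold
print(len(response), len(test))    # 8 2
```

During a simulated interview, a query about a selected item is answered from the response set when the user has rated that item there.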
4.2 Evaluation Metrics
After SSDS-based preference elicitation, ranking scores are predicted. The performance is measured by three classical evaluation metrics for top-N recommendation:

Mean Average Precision (MAP). For each user, Average Precision (AP) is first defined as

$$\mathrm{AP}(u) = \frac{\sum_{i=1}^{N} \mathrm{prec}(i) \times \mathrm{pref}(i)}{\#\text{ of preferred items}}$$

where $\mathrm{prec}(i)$ is the precision and $\mathrm{pref}(i)$ is a binary preference indicator at ranked position $i$. MAP is computed from AP by

$$\mathrm{MAP} = \frac{1}{|U|} \sum_{u \in U} \mathrm{AP}(u)$$

Normalized Discounted Cumulative Gain (NDCG). For a ranked list of $N$ items, NDCG is computed by

$$\mathrm{NDCG} = \frac{1}{\mathrm{IDCG}} \times \sum_{i=1}^{N} \frac{2^{\mathrm{pref}(i)} - 1}{\log_2(i + 1)}$$

where IDCG is the value produced by a perfect ranking algorithm.
Figure 2: Performance variation of SSDS with respect to expert number $n_l$ and query number $K$. (Two panels plot MAP and NDCG against $K \in \{10, 20, 30\}$, with one curve for each of $n_l = 20, 50, 200, 500$.)
Precision. An item is considered correctly predicted if it is contained in the test set. Prec@N is the ratio of correctly predicted items in the top-N list. In our work, we report results for Prec@5.
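The three metrics above can be sketched in plain Python for a single user. The function names and the toy ranking are assumptions of this sketch. Two conventions are also assumed because the text leaves them open: the denominator of AP counts all preferred items of the user, and IDCG places as many preferred items as fit in the first $N$ positions.

```python
import math

def average_precision(ranked, preferred, n):
    """AP over the top-n list; `preferred` is the set of the user's test items."""
    hits, total = 0, 0.0
    for pos, item in enumerate(ranked[:n], start=1):
        if item in preferred:
            hits += 1
            total += hits / pos              # prec(i) * pref(i) with pref(i) = 1
    return total / len(preferred) if preferred else 0.0

def ndcg(ranked, preferred, n):
    """NDCG with binary preferences, so 2^pref(i) - 1 is either 0 or 1."""
    dcg = sum(1.0 / math.log2(pos + 1)
              for pos, item in enumerate(ranked[:n], start=1)
              if item in preferred)
    ideal_hits = min(len(preferred), n)      # perfect ranking puts hits first
    idcg = sum(1.0 / math.log2(pos + 1) for pos in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0

def precision_at(ranked, preferred, n):
    """Prec@n: fraction of the top-n items that are in the test set."""
    return sum(1 for item in ranked[:n] if item in preferred) / n

ranked = ["a", "b", "c", "d", "e"]           # predicted top-5 list
preferred = {"a", "c", "f"}                  # items in the user's test set
ap = average_precision(ranked, preferred, 5)
print(ap)                                    # (1/1 + 2/3) / 3 = 5/9
print(ndcg(ranked, preferred, 5))
print(precision_at(ranked, preferred, 5))    # 2 / 5 = 0.4
```

MAP is then the mean of `average_precision` over all cold users.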
4.3 Results and Analysis
To demonstrate the effectiveness of the proposed SSDS model, we compare it with several traditional query selection methods for the cold-start problem: the random, popular and k-medoids strategies. Note that all of these selection methods are combined with the same cold-start recommendation approach introduced in Section 3. The experimental results are shown in Table 1. To make a fair comparison, two different popular strategies are used. The first (popular 1) is a query list composed of popular items drawn from multiple categories to ensure the diversity of the query list, while the other (popular 2) is the plain popular set, which ignores category information.
Because rating elicitation aims to minimize user interaction costs while improving recommendation accuracy, the length of the interview is the pivotal parameter in our work. To avoid tedious interviews, we vary the query number $K$ from 10 to 30, so that the overall trend of recommendation accuracy over the course of the interview can be observed.
For most compared methods, the performance increases as $K$ grows; the exception in Table 1 is popular 2, whose scores stay nearly flat across $K$. This trend confirms a basic fact: the more ratings are elicited from users, the more effective the recommendations are. On the other hand, these methods are not equally efficient at learning user preferences. Specifically, two points can be inferred by comparing their performance: 1. The random strategy performs far worse than the other strategies, which empirically shows the necessity of designing an appropriate selection model. 2. The SSDS model yields the best performance under all evaluation conditions, which indicates that our semi-supervised discriminative selection framework identifies more informative items than the other methods and thereby boosts cold-start recommendation accuracy.
Additionally, to assess the influence of the two fundamental components of SSDS, expert guidance and graph regularization, we study the impact of the corresponding parameters: the number of experts $n_l$ and the weight of the regularization term $\beta$. We analyze each of them while keeping the other fixed. First, with $\beta$ fixed at $\beta = 0.1$ and the query number $K \in \{10, 20, 30\}$, the performance variation with respect to $n_l$ is shown in Figure 2. As $n_l$ increases, both MAP and NDCG first increase until they reach a peak and then drop. The main reason behind this observation is as follows. At the beginning, adding more labeled users provides more discriminative information, which helps to find a better user representation in item space. However, as the number of labeled users grows, some untrustworthy labels are mixed into the discriminative information and eventually cause the accuracy to decline.

Figure 3: Performance variation of SSDS with respect to graph weight $\beta$ and query number $K$. (Two panels plot MAP and NDCG against $K \in \{10, 20, 30\}$, with one curve for each of $\beta = 0, 0.001, 0.01, 0.1, 1$.)
Second, with $n_l$ fixed at $n_l = 50$ and again $K \in \{10, 20, 30\}$, the performance variation with respect to $\beta$ is illustrated in Figure 3. $\beta$ is varied over $\{0, 0.001, 0.01, 0.1, 1\}$, where a larger $\beta$ strengthens the effect of the graph constraint on users who are similar in rating behavior. The figure shows that, although the best result is achieved at different values of $\beta$ for different settings, the performance clearly improves over the model without graph regularization ($\beta = 0$). The improvement is especially significant when the interview is short ($K = 10$).
5. CONCLUSIONS
In this paper, we propose a novel query selection framework for preference elicitation. By using semi-supervised and discriminative information, our model selects a representative item set that comprehensively describes user preferences over diverse categories. Experimental results on a benchmark movie rating dataset show that the proposed query selection model produces more accurate recommendations for cold users than competing methods.
6. ACKNOWLEDGEMENT
This work was supported by 973 Program under Project
2010CB327905, by the National Natural Science Foundation
of China under Grant No. 61170127 and 61070104.
7. REFERENCES
[1] Z. Gantner, L. Drumond, C. Freudenthaler, S. Rendle, and
L. Schmidt-Thieme. Learning attribute-to-feature mappings
for cold-start recommendations. In Proc. of ICDM, 2010.
[2] N. Golbandi, Y. Koren, and R. Lempel. On bootstrapping
recommender systems. In Proc. of CIKM, 2010.
[3] A. Krohn-Grimberghe, L. Drumond, C. Freudenthaler, and
L. Schmidt-Thieme. Multi-relational matrix factorization
using Bayesian personalized ranking for social network data.
In Proc. of WSDM, 2012.
[4] N. N. Liu, X. Meng, C. Liu, and Q. Yang. Wisdom of the
better few: cold start recommendation via representative
based rating elicitation. In Proc. of RecSys, 2011.
[5] L. Tang and H. Liu. Relational learning via latent social
dimensions. In Proc. of KDD, 2009.
[6] K. Zhou, S.-H. Yang, and H. Zha. Functional matrix
factorizations for cold-start recommendation. In Proc. of
SIGIR, 2011.