An Approach for Multi-Relational Data Context in Recommender Systems

Nguyen Thai-Nghe, Mai Nhut-Tu, and Huu-Hoa Nguyen
College of Information and Communication Technology, Can Tho University,
3/2 Street, Can Tho City, Vietnam
{ntnghe,nhhoa}@ctu.edu.vn, mntu.it@gmail.com
Abstract. The matrix factorization technique has been successfully used in
recommender systems, and many variations have been developed from it, e.g.,
biased matrix factorization, non-negative matrix factorization, and
multi-relational matrix factorization. In the context of multi-relational data,
this paper proposes another multi-relational approach for recommender systems
that includes all of the information from the latent factor matrices in the
prediction functions, so that the models have more data to learn from. To validate
the proposed approach, experiments are conducted on standard recommender-system
datasets. Experimental results show that the proposed approach is promising.
Keywords: Multi-relational data · Matrix factorization · Recommender systems
1 Introduction
Recommender Systems (RS) have been used in many different areas of information
systems. An RS helps to address information overload and lets users choose items
quickly by presenting appropriate content to each individual user. To provide
recommendations efficiently, the RS needs a prediction model that predicts ratings
for unseen items based on past data and then provides suggestions to the user.
Therefore, using accurate algorithms in an RS is very important.
In the RS, many algorithms have been proposed. They can be grouped into three main
groups [1, 2]:
- Content-based filtering approach: Based on the description of the items and the
  profile of the users. These algorithms try to recommend items that are similar to
  those that the users liked/selected in the past, using attributes of the items or
  profiles of the users.
- Collaborative filtering approach: Algorithms in this group are either
  neighborhood-based models, which use historical data of similar users (user-based
  approach) or of similar items (item-based approach), or model-based, which build
  prediction models from data collected in the past.
- Hybrid approach: A combination of the two approaches above.
In the collaborative filtering approach, matrix factorization (MF) is one of the most
successful methods (the state of the art) in rating prediction [3, 4]. However, the MF
algorithm focuses on exploiting information in a single relationship between the user
and the item (e.g., the relation "Rates" between the "User" and "Movie" entities).
Therefore, it does not use the relevant information from the other relationships
between the users and the items. To utilize more information, the multi-relational
matrix factorization (MRMF) approach was proposed [5, 6]. However, in these works, the
prediction formula does not include all of the information from the latent factor
matrices.
In this work, for the context of multi-relational data, we propose a
multi-relational matrix factorization approach that utilizes information from
different relationships between the users and the items to build the prediction
models. To validate the proposed approach, we have experimented on standard
data sets in both recommender systems (entertainment area) and Intelligent Tutoring
Systems (education area). Experimental results show that the proposed approach can
improve the accuracy of the prediction models.
© Springer International Publishing AG 2017
N.T. Nguyen et al. (Eds.): ACIIDS 2017, Part I, LNAI 10191, pp. 709–720, 2017.
DOI: 10.1007/978-3-319-54472-4_66
2 Matrix Factorization Approaches
First, we briefly summarize matrix factorization (MF) [4] on a single relationship
and multi-relational matrix factorization (MRMF) [5–8]. Based on these approaches,
we then propose a new multi-relational matrix factorization approach for
incorporating all of the available information into the prediction models.
2.1 Matrix Factorization (MF)
Matrix factorization is the task of approximating a matrix $R$ by the product of two
smaller matrices $W_1$ and $W_2$, i.e. $R \approx W_1 W_2^T$, as illustrated in
Fig. 1 [4]. In this notation, $W_1 \in \mathbb{R}^{|U| \times K}$ is a matrix where
each row $u$ is a vector containing $K$ latent factors describing the user $u$, and
$W_2 \in \mathbb{R}^{|I| \times K}$ is a matrix where each row $i$ is a vector
containing the $K$ latent factors describing the item $i$. Let $w_{1uk}$ and
$w_{2ik}$ be the elements and $w_{1u}$ and $w_{2i}$ be the vectors of $W_1$ and
$W_2$ respectively, then the rating $r$ given by user $u$ to item $i$ is predicted by:
$$\hat{r}_{ui} = \sum_{k=1}^{K} w_{1uk} w_{2ik} = w_{1u} w_{2i}^T \qquad (1)$$
Fig. 1. An example of matrix factorization
$W_1$ and $W_2$ are the model parameters (the so-called latent factor matrices), which
can be learnt by optimizing the objective function (2) (e.g., using stochastic gradient
descent):
$$O^{MF} = \sum_{(u,i) \in R} \left( R_{ui} - w_{1u} w_{2i}^T \right)^2 + \lambda \left( \|W_1\|_F^2 + \|W_2\|_F^2 \right) \qquad (2)$$
where $\|\cdot\|_F^2$ is the squared Frobenius norm and $\lambda$ ($0 \le \lambda < 1$)
is a regularization term which is used to prevent over-fitting.
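As a minimal sketch of how Eqs. (1) and (2) could be put to work, the following NumPy code trains $W_1$ and $W_2$ with stochastic gradient descent. The function names (`train_mf`, `mf_objective`), the toy ratings, the initialisation scale and the hyperparameter values are illustrative assumptions, not settings taken from the paper. The regularization gradient is written as $\lambda w$, the same form used for the MRMF gradients in Sect. 2.2.

```python
import numpy as np

def mf_objective(ratings, W1, W2, lam):
    """Objective (2): squared error on observed ratings plus Frobenius regularization."""
    loss = sum((r - W1[u] @ W2[i]) ** 2 for u, i, r in ratings)
    return loss + lam * (np.sum(W1 ** 2) + np.sum(W2 ** 2))

def train_mf(ratings, n_users, n_items, K=2, beta=0.01, lam=0.02,
             n_iter=200, seed=0):
    """Learn W1 (users x K) and W2 (items x K) by stochastic gradient descent."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.1, size=(n_users, K))   # assumed initialisation scale
    W2 = rng.normal(0.0, 0.1, size=(n_items, K))
    for _ in range(n_iter):
        for u, i, r in ratings:
            err = r - W1[u] @ W2[i]        # observed rating minus prediction (1)
            w1u_old = W1[u].copy()         # keep the old user vector for the item update
            W1[u] -= beta * (-2 * err * W2[i] + lam * W1[u])
            W2[i] -= beta * (-2 * err * w1u_old + lam * W2[i])
    return W1, W2

# Toy data: (user, item, rating) triples, purely illustrative.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 2.0)]
W1_init, W2_init = train_mf(ratings, 3, 3, n_iter=0)   # untrained factors
W1, W2 = train_mf(ratings, 3, 3)
print(mf_objective(ratings, W1_init, W2_init, 0.02),
      mf_objective(ratings, W1, W2, 0.02))
print(W1[0] @ W2[0])   # predicted rating of user 0 for item 0
```

Calling `train_mf` with `n_iter=0` and the same seed returns the untrained factors, which makes it easy to compare the objective before and after training.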
2.2 Multi-Relational Matrix Factorization (MRMF)
In the previous section, we briefly described MF, which uses only one relation
type between two entity types (e.g., the relation "rates" between "user" and "movie" in
Fig. 2). In the MRMF [5, 6], we can include more than one relationship and more than
two entity types in the models (Figs. 2 and 3).
Let $\{E_1, E_2, \ldots, E_N\}$ be a set of $N$ entity types, $\{R_1, R_2, \ldots, R_M\}$
be a set of $M$ binary relation types and $R_r = \{(E_{1r}; E_{2r})\}$ ($r = 1..M$), then
the objective function of the MRMF is presented by [5, 6]:
$$O^{MRMF} = \sum_{r=1}^{M} \sum_{(u,i) \in R_r} \left( (R_r)_{ui} - w_{r1u} w_{r2i}^T \right)^2 + \lambda \left( \sum_{j=1}^{N} \|W_j\|_F^2 \right) \qquad (3)$$
where $M$ is the number of relation types and $\{W_j\}_{j=1 \ldots N}$ are the latent
factor matrices of the $N$ entity types. The objective function (3) is optimized by using
stochastic gradient descent. For the learning process, the MRMF updates its latent
factors using Eqs. (4) and (5):
Fig. 2. Example on ER-diagram of MovieLens dataset
Fig. 3. An example of matrix representations for the ERD in Fig. 2
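$$w_{r1u}^{new} = w_{r1u}^{old} - \beta \left( \frac{\partial O^{MRMF}}{\partial w_{r1u}^{old}} \right) \qquad (4)$$
$$w_{r2i}^{new} = w_{r2i}^{old} - \beta \left( \frac{\partial O^{MRMF}}{\partial w_{r2i}^{old}} \right) \qquad (5)$$
where $\beta$ ($0 < \beta < 1$) is a learning rate, and the gradients
$\frac{\partial O^{MRMF}}{\partial w_{r1u}}$ and $\frac{\partial O^{MRMF}}{\partial w_{r2i}}$
are determined by:
$$\frac{\partial O^{MRMF}}{\partial w_{r1u}} = -2 \left( (R_r)_{ui} - w_{r1u} w_{r2i}^T \right) w_{r2i} + \lambda w_{r1u} \qquad (6)$$
$$\frac{\partial O^{MRMF}}{\partial w_{r2i}} = -2 \left( (R_r)_{ui} - w_{r1u} w_{r2i}^T \right) w_{r1u} + \lambda w_{r2i} \qquad (7)$$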
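The update rules (4)-(7) can be sketched for a single observed cell as follows. This is an illustration under stated assumptions: the helper `mrmf_sgd_step`, the dictionary `W` keyed by entity type name, the "movie"-"genre" relation and all numeric values are hypothetical. The point it shows is that in MRMF each entity type owns one matrix, so a sample from any relation involving that entity type updates the same matrix.

```python
import numpy as np

def mrmf_sgd_step(W, relation, u, i, value, beta, lam):
    """One SGD update, Eqs. (4)-(7), for an observed cell (u, i) of one relation.

    W maps an entity type name to its latent factor matrix, so a matrix is
    shared by every relation in which that entity type takes part.
    Assumes the two entity types of the relation differ.
    """
    e1, e2 = relation                        # the two entity types of the relation
    w_u, w_i = W[e1][u].copy(), W[e2][i].copy()
    err = value - w_u @ w_i                  # (R_r)_ui - w_r1u w_r2i^T
    grad_u = -2 * err * w_i + lam * w_u      # Eq. (6)
    grad_i = -2 * err * w_u + lam * w_i      # Eq. (7)
    W[e1][u] = w_u - beta * grad_u           # Eq. (4)
    W[e2][i] = w_i - beta * grad_i           # Eq. (5)
    return err

W = {
    "user":  np.array([[0.1, 0.2]]),
    "movie": np.array([[0.3, 0.1]]),
    "genre": np.array([[0.2, 0.2]]),
}
err_before = mrmf_sgd_step(W, ("user", "movie"), 0, 0, 1.0, beta=0.1, lam=0.01)
err_after = 1.0 - W["user"][0] @ W["movie"][0]
print(err_before, err_after)   # the error on this cell shrinks after one step

# The same "movie" matrix is updated again by a sample of a second relation.
mrmf_sgd_step(W, ("movie", "genre"), 0, 0, 1.0, beta=0.1, lam=0.01)
```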
In addition, [6] applied MRMF to predict student performance. The authors used the
scalability of MRMF, which can utilize many relationships between many entities, to
take full advantage of the student information and of the information about the tasks
that the students must solve, and thereby obtained a prediction model with high
accuracy. Moreover, the authors also introduced a variant of MRMF called Weighted MRMF
(WMRMF). This technique is similar to MRMF, but it assigns a weight $\Theta_r$ to each
relation to set the importance level of that relation. With WMRMF, the objective
function in Eq. (3) becomes:
$$O^{WMRMF} = \sum_{r=1}^{M} \Theta_r \sum_{(u,i) \in R_r} \left( (R_r)_{ui} - w_{r1u} w_{r2i}^T \right)^2 + \lambda \left( \sum_{j=1}^{N} \|W_j\|_F^2 \right) \qquad (8)$$
Although previous works (MRMF/WMRMF) can utilize more information than the
single MF, their prediction functions still use formula (1) to generate the
prediction scores. Therefore, MRMF/WMRMF do not use all of the information from the
latent factor matrices in the prediction.
In this work, we propose a different approach that tries to bring all of the relevant
information into the model so that the model has more data to learn from and can thus
achieve higher prediction accuracy. This work extends [6, 9].
3 Proposed Method
We propose a multi-relational factorization approach that can integrate all of the
information from the latent factor matrices. Consequently, the number of model
parameters of the new approach differs from that of the MRMF. The proposed methods are
named MRMF++ (Multi-Relational Matrix Factorization Plus Plus) and WMRMF++ (Weighted
MRMF++).
3.1 Multi-Relational Matrix Factorization Plus Plus (MRMF++)
Figure 4 presents the difference in the number of model parameters between the
MRMF and the MRMF++. Let $\{E_1, E_2, \ldots, E_N\}$ be a set of $N$ entity types,
$\{R_1, R_2, \ldots, R_M\}$ be a set of $M$ binary relation types and
$R_r = \{(E_{1r}; E_{2r})\}$ ($r = 1..M$); the MRMF will have $N$ latent factor matrices
and the MRMF++ will have $2M$ latent factor matrices. Based on the idea of using all of
the information from the latent factor matrices for prediction, we present the
prediction formula of the MRMF++ as follows:
$$\hat{r}_{ui} = \left( \sum_{x=1}^{P} w_{1x} \right)_u \left( \sum_{y=1}^{Q} w_{2y} \right)_i^T = \sum_{k=1}^{K} \left( \left( \sum_{x=1}^{P} w_{1x} \right)_{uk} \left( \sum_{y=1}^{Q} w_{2y} \right)_{ik} \right) \qquad (9)$$
where $P$ and $Q$ are the numbers of latent factor matrices of $u$ and $i$ respectively;
$w_{1x}$ is the matrix at index $x$ among the $P$ matrices of $u$; $w_{2y}$ is the matrix
at index $y$ among the $Q$ matrices of $i$. To reduce the length of the formula, we set
$$X = \sum_{x=1}^{P} w_{1x} \quad \text{and} \quad Y = \sum_{y=1}^{Q} w_{2y}$$
Then formula (9) is rewritten as follows:
$$\hat{r}_{ui} = X_u Y_i^T \qquad (10)$$
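To make formulas (9) and (10) concrete, here is a small NumPy sketch that sums the $P$ user-side matrices and the $Q$ item-side matrices and then takes the dot product of the two rows. The function name `predict_mrmf_pp` and the toy matrices are assumptions made for illustration only.

```python
import numpy as np

def predict_mrmf_pp(user_mats, item_mats, u, i):
    """Prediction (10): rows u and i of the summed matrices X and Y, then a dot product."""
    X = np.sum(user_mats, axis=0)   # X = sum_x w_1x, shape (n_users, K)
    Y = np.sum(item_mats, axis=0)   # Y = sum_y w_2y, shape (n_items, K)
    return float(X[u] @ Y[i])

# P = 2 matrices for the user entity, Q = 2 for the item entity (toy values).
user_mats = [np.array([[1.0, 0.0], [0.0, 1.0]]),
             np.array([[0.5, 0.5], [1.0, 1.0]])]
item_mats = [np.array([[1.0, 2.0], [0.0, 1.0]]),
             np.array([[1.0, 0.0], [1.0, 1.0]])]

print(predict_mrmf_pp(user_mats, item_mats, 0, 1))   # 2.5
```

Here $X_0 = (1.5, 0.5)$ and $Y_1 = (1, 2)$, so the prediction is $1.5 \cdot 1 + 0.5 \cdot 2 = 2.5$.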
Similar to the MRMF, the model parameters of the MRMF++ can be learned by
optimizing the objective function (11) given a criterion, e.g., root mean squared error
(RMSE), using stochastic gradient descent:
$$O^{MRMF++} = \sum_{r=1}^{M} \sum_{(u,i) \in R_r} \left( (R_r)_{ui} - X_{ru} Y_{ri}^T \right)^2 + \lambda \left( \sum_{j=1}^{2M} \|W_j\|_F^2 \right) \qquad (11)$$
Fig. 4. Comparison of model parameters between MRMF and MRMF++
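where $M$ is the number of relation types, $\{W_j\}_{j=1 \ldots 2M}$ are the latent
factor matrices of the $M$ relation types, $(R_r)_{ui}$ is the real value of relation
$r$, $X_{ru} Y_{ri}^T$ is the predicted value of relation $r$, and
$\lambda \left( \sum_{j=1}^{2M} \|W_j\|_F^2 \right)$ is a regularization term. For the
learning process, the MRMF++ updates its latent factors for each relation at iteration
$n$ via Eqs. (12) and (13):
$$X_{ru}^{n} = X_{ru}^{n-1} - \beta \left( \frac{\partial O^{MRMF++}}{\partial X_{ru}^{n-1}} \right) \qquad (12)$$
$$Y_{ri}^{n} = Y_{ri}^{n-1} - \beta \left( \frac{\partial O^{MRMF++}}{\partial Y_{ri}^{n-1}} \right) \qquad (13)$$
where $\beta$ is a learning rate, and the gradients
$\frac{\partial O^{MRMF++}}{\partial X_{ru}}$ and $\frac{\partial O^{MRMF++}}{\partial Y_{ri}}$
are determined by:
$$\frac{\partial O^{MRMF++}}{\partial X_{ru}} = \lambda X_{ru} - 2 \left( (R_r)_{ui} - X_{ru} Y_{ri}^T \right) Y_{ri} \qquad (14)$$
$$\frac{\partial O^{MRMF++}}{\partial Y_{ri}} = \lambda Y_{ri} - 2 \left( (R_r)_{ui} - X_{ru} Y_{ri}^T \right) X_{ru} \qquad (15)$$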
The MRMF++'s learning process is summarized in a LearnMRMF++ procedure.
We initialize the latent factor matrices from the normal distribution
$N(\mu, \sigma^2)$, e.g., mean $\mu = 0$ and $\sigma^2 = 0.01$, and initialize the
weight value for each relation type. While the stopping condition is not met, e.g.,
reaching the maximum number of iterations or converging, the latent factors are
updated iteratively (converging:
$O^{MRMF++}_{Iter(n-1)} - O^{MRMF++}_{Iter(n)} < \epsilon$ for a small threshold
$\epsilon$).
After the learning process, the model parameters $\{W_j\}_{j=1 \ldots 2M}$ are
obtained, and we can then generate the prediction for any relation using Eq. (9).
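The learning procedure described above might look roughly like the following NumPy sketch. It is one possible reading, not the authors' LearnMRMF++ procedure: it assumes that each relation owns two matrices (giving $2M$ in total), that $X$ and $Y$ are the sums of all matrices belonging to the respective entity type, and that the gradients (14) and (15) are passed on to every matrix in the sum via the chain rule. The names `learn_mrmf_pp`, `relations`, `sizes`, the toy "genre" relation and all hyperparameter values are hypothetical.

```python
import numpy as np

def learn_mrmf_pp(relations, sizes, K=2, beta=0.01, lam=0.01,
                  sigma=0.1, max_iter=200, eps=1e-6, seed=0):
    """SGD on objective (11); assumes the two entity types of a relation differ."""
    rng = np.random.default_rng(seed)
    # Two latent factor matrices per relation, i.e. 2M matrices in total.
    W = [(rng.normal(0.0, sigma, (sizes[r["e1"]], K)),
          rng.normal(0.0, sigma, (sizes[r["e2"]], K))) for r in relations]
    # Group the matrices by the entity type they describe.
    by_entity = {}
    for r, (A, B) in zip(relations, W):
        by_entity.setdefault(r["e1"], []).append(A)
        by_entity.setdefault(r["e2"], []).append(B)

    def summed_row(entity, idx):
        # Row idx of X (or Y): the sum over all matrices of this entity type.
        return sum(m[idx] for m in by_entity[entity])

    def objective():
        loss = 0.0
        for r in relations:
            for u, i, v in r["triples"]:
                loss += (v - summed_row(r["e1"], u) @ summed_row(r["e2"], i)) ** 2
        return loss + lam * sum(np.sum(m ** 2) for pair in W for m in pair)

    history = [objective()]
    for _ in range(max_iter):
        for r in relations:
            for u, i, v in r["triples"]:
                x = summed_row(r["e1"], u)
                y = summed_row(r["e2"], i)
                err = v - x @ y
                for m in by_entity[r["e1"]]:      # Eq. (12) with gradient (14)
                    m[u] -= beta * (lam * m[u] - 2 * err * y)
                for m in by_entity[r["e2"]]:      # Eq. (13) with gradient (15)
                    m[i] -= beta * (lam * m[i] - 2 * err * x)
        history.append(objective())
        if abs(history[-2] - history[-1]) < eps:  # convergence test
            break
    return W, by_entity, history

relations = [
    {"e1": "user", "e2": "movie",
     "triples": [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (2, 2, 1.0)]},
    {"e1": "movie", "e2": "genre",
     "triples": [(0, 0, 1.0), (1, 1, 1.0), (2, 1, 1.0)]},
]
sizes = {"user": 3, "movie": 3, "genre": 2}
W, by_entity, history = learn_mrmf_pp(relations, sizes)
print(history[0], history[-1])   # objective before and after training
```

In this toy setup the "movie" entity takes part in both relations, so its summed representation combines two matrices, whereas "user" and "genre" each have one.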
3.2 Weighted Multi-Relational Matrix Factorization Plus Plus (WMRMF++)
Using the MRMF++, we can utilize many relationships between many entities for
rating prediction. However, this method treats all relations as equally important.
Clearly, the main relation, which contains the target variable (e.g.,
"User-Rates-Movie" in Fig. 2), is more important than the other, supplementary
relations, and thus it should have a higher weight. Based on [6], we propose the
WMRMF++ to take into account the importance of the main relation. The objective
function in Eq. (11) now becomes:
$$O^{WMRMF++} = \sum_{r=1}^{M} \Theta_r \sum_{(u,i) \in R_r} \left( (R_r)_{ui} - X_{ru} Y_{ri}^T \right)^2 + \lambda \left( \sum_{j=1}^{2M} \|W_j\|_F^2 \right) \qquad (16)$$
where $\Theta_r$ is a weight function; for example, it sets the weight to the maximum
for the main relation and reduces the weight for the rest, as in Eq. (17). However,
other choices could also be considered.
$$\Theta_r = \begin{cases} 1, & \text{if } r \text{ is the main relation} \\ \theta, & \text{else } (0 < \theta \le 1) \end{cases} \qquad (17)$$
where $\theta$ is a hyperparameter which can be determined from the training data.
Another important property of the WMRMF++ is that in an extreme case ($\theta = 1$),
the WMRMF++ is equivalent to the MRMF++.
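For the learning process, the WMRMF++ updates its latent factors for each relation
using Eqs. (18) and (19):
$$X_{ru}^{n} = X_{ru}^{n-1} - \beta \left( \frac{\partial O^{WMRMF++}}{\partial X_{ru}^{n-1}} \right) \qquad (18)$$
$$Y_{ri}^{n} = Y_{ri}^{n-1} - \beta \left( \frac{\partial O^{WMRMF++}}{\partial Y_{ri}^{n-1}} \right) \qquad (19)$$
where the gradients $\frac{\partial O^{WMRMF++}}{\partial X_{ru}}$ and
$\frac{\partial O^{WMRMF++}}{\partial Y_{ri}}$ are determined by:
$$\frac{\partial O^{WMRMF++}}{\partial X_{ru}} = \lambda X_{ru} - 2 \Theta_r \left( (R_r)_{ui} - X_{ru} Y_{ri}^T \right) Y_{ri}$$
$$\frac{\partial O^{WMRMF++}}{\partial Y_{ri}} = \lambda Y_{ri} - 2 \Theta_r \left( (R_r)_{ui} - X_{ru} Y_{ri}^T \right) X_{ru}$$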
The WMRMF++'s learning process is summarized in a LearnWMRMF++ procedure. We
initialize the latent factor matrices from the normal distribution $N(\mu, \sigma^2)$,
e.g., mean $\mu = 0$ and $\sigma^2 = 0.01$, and initialize the weight value for each
relation type. While the stopping condition is not met, e.g., reaching the maximum
number of iterations or converging (converging:
$O^{WMRMF++}_{Iter(n-1)} - O^{WMRMF++}_{Iter(n)} < \epsilon$ for a small threshold
$\epsilon$), the latent factors are updated iteratively.
After the learning process, the model parameters $\{W_j\}_{j=1 \ldots 2M}$ are obtained,
and we again generate the prediction for any relation using Eq. (9).
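The weight function of Eq. (17) and its effect on the gradients can be illustrated with a few lines of NumPy. The helper names `relation_weight` and `wmrmf_pp_gradients` and the toy vectors are assumptions for this sketch. It also checks the property stated above: with $\theta = 1$ the weighted gradient coincides with the unweighted MRMF++ gradient.

```python
import numpy as np

def relation_weight(is_main, theta):
    """Eq. (17): weight 1 for the main relation, theta (0 < theta <= 1) otherwise."""
    if not 0.0 < theta <= 1.0:
        raise ValueError("theta must lie in (0, 1]")
    return 1.0 if is_main else theta

def wmrmf_pp_gradients(x_u, y_i, value, weight, lam):
    """Gradients of the WMRMF++ objective for one observed cell of a relation."""
    err = value - x_u @ y_i
    grad_x = lam * x_u - 2 * weight * err * y_i
    grad_y = lam * y_i - 2 * weight * err * x_u
    return grad_x, grad_y

x_u, y_i = np.array([0.1, 0.2]), np.array([0.3, 0.1])
# A supplementary relation with theta = 0.5 gets a damped error signal,
gx_half, _ = wmrmf_pp_gradients(x_u, y_i, 1.0, relation_weight(False, 0.5), lam=0.01)
# while theta = 1 gives the same gradient as the main relation (weight 1),
# which is the unweighted MRMF++ gradient of Eq. (14).
gx_one, _ = wmrmf_pp_gradients(x_u, y_i, 1.0, relation_weight(False, 1.0), lam=0.01)
gx_main, _ = wmrmf_pp_gradients(x_u, y_i, 1.0, relation_weight(True, 0.5), lam=0.01)
print(gx_half, gx_one)
```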
4 Experiments
4.1 Datasets
For the experiments, we used datasets from two different fields: entertainment and
education. The Movielens 100k dataset was collected by GroupLens (www.grouplens.org).
This data was extracted from a movie recommender system. It has 100,000 ratings, 943
users and 1,682 movies. The dataset contains user information (e.g., age, gender,
occupation) and movie information (e.g., title, release date, genre). The second
dataset is Assistments-2009-2010 (Assistments), which was extracted from the
ASSISTments system (teacherwiki.assistment.org). This dataset represents the log files
of interactions between students and the tutoring system. While students solve the
problems in the tutoring system, their activities, success and progress indicators are
logged as individual rows in the dataset. This data can be mapped to the concepts of
recommender systems as student → user; task → item; and performance (CFA) → rating.
Clearly, in these datasets, there are several relationships between data attributes
that we can exploit. The numbers of users, items, and ratings in these datasets are
summarized in Table 1.
4.2 Entity Relationship Diagram (ERD)
To use the MRMF, MRMF++ and WMRMF++, we need to provide a list of entities
and relations as input parameters; therefore, the datasets need to be preprocessed.
Parts of the ERDs for the Movielens and Assistments datasets are presented in Fig. 5.
4.3 Experimental Setting
4.3.1 Baseline
The proposed methods are compared with several baseline methods: global average,
user average, item average, user-kNN, and item-kNN. Please refer to [8] for details
about these methods. Moreover, we also compare the proposed approach with matrix
factorization (MF) and multi-relational matrix factorization (MRMF). The prediction
functions of these methods are presented in the following:
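Global average: The rating $r$ of the user $u$ on the item $i$ is predicted by
$$\hat{r}_{ui} = \mu = \frac{\sum_{(u,i,r) \in D^{train}} r}{|D^{train}|}$$
User average: The rating of the user $u$ on the item $i$ is predicted by
$$\hat{r}_{ui} = \frac{\sum_{(u',i,r) \in D^{train} \mid u' = u} r}{|\{(u',i,r) \in D^{train} \mid u' = u\}|}$$
Item average: The rating of the user $u$ on the item $i$ is predicted by
$$\hat{r}_{ui} = \frac{\sum_{(u,i',r) \in D^{train} \mid i' = i} r}{|\{(u,i',r) \in D^{train} \mid i' = i\}|}$$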
Table 1. Information of datasets
Dataset           Users    Items     Ratings
Movielens 100k    943      1,682     100,000
Assistments       8,519    35,798    1,011,079

Fig. 5. ERD for Movielens and Assistments data sets (the relation filled in gray is the
main relation)

User-kNN: The rating of the user $u$ on the item $i$ is predicted by
$$\hat{r}_{ui} = \bar{r}_u + \frac{\sum_{u' \in K_u} sim(u,u') \left( r_{u'i} - \bar{r}_{u'} \right)}{\sum_{u' \in K_u} |sim(u,u')|}$$
where $K_u$ is the set of $K$ nearest neighbors of user $u$; $\bar{r}_u$ and
$\bar{r}_{u'}$ are the average ratings over all the items of user $u$ and $u'$
respectively; $sim(u,u')$ is the similarity between user $u$ and user $u'$, computed
by using cosine similarity:
$$sim_{cosine}(u,u') = \frac{\sum_{i \in I_{uu'}} r_{ui} \, r_{u'i}}{\sqrt{\sum_{i \in I_{uu'}} r_{ui}^2} \, \sqrt{\sum_{i \in I_{uu'}} r_{u'i}^2}}$$
where $I_{uu'}$ is the set of items rated by both user $u$ and user $u'$.
Item-kNN: The rating of the user $u$ on the item $i$ is predicted by
$$\hat{r}_{ui} = \bar{r}_i + \frac{\sum_{i' \in K_i} sim(i,i') \left( r_{ui'} - \bar{r}_{i'} \right)}{\sum_{i' \in K_i} |sim(i,i')|}$$
where $K_i$ is the set of $K$ nearest neighbors of item $i$; $\bar{r}_i$ and
$\bar{r}_{i'}$ are the average ratings over all users of item $i$ and $i'$
respectively; $sim(i,i')$ is the similarity between item $i$ and item $i'$.
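The baseline predictors above can be sketched in plain Python as follows. The function names and the toy training triples are illustrative. One detail is an assumption not stated in the text: the user-kNN neighbors are drawn only from users who rated the target item. The item-kNN variant is symmetric (swap the roles of users and items) and is omitted for brevity.

```python
import math

def global_average(train):
    return sum(r for _, _, r in train) / len(train)

def user_average(train, u):
    rs = [r for u2, _, r in train if u2 == u]
    return sum(rs) / len(rs)

def item_average(train, i):
    rs = [r for _, i2, r in train if i2 == i]
    return sum(rs) / len(rs)

def cosine_sim(train, u, v):
    """Cosine similarity over the items rated by both users."""
    ru = {i: r for u2, i, r in train if u2 == u}
    rv = {i: r for u2, i, r in train if u2 == v}
    common = ru.keys() & rv.keys()
    if not common:
        return 0.0
    num = sum(ru[i] * rv[i] for i in common)
    den = (math.sqrt(sum(ru[i] ** 2 for i in common))
           * math.sqrt(sum(rv[i] ** 2 for i in common)))
    return num / den if den else 0.0

def user_knn(train, u, i, k):
    """User-kNN prediction; neighbors are the k most similar users who rated item i."""
    cands = [(cosine_sim(train, u, v), v, r)
             for v, i2, r in train if i2 == i and v != u]
    cands.sort(key=lambda t: t[0], reverse=True)
    top = cands[:k]
    den = sum(abs(s) for s, _, _ in top)
    if den == 0:
        return user_average(train, u)   # fall back when no usable neighbor exists
    num = sum(s * (r - user_average(train, v)) for s, v, r in top)
    return user_average(train, u) + num / den

# Toy (user, item, rating) triples, purely illustrative.
train = [(0, 0, 4), (0, 1, 2), (1, 0, 5), (1, 1, 1), (1, 2, 3), (2, 0, 3), (2, 2, 5)]
print(global_average(train), user_average(train, 0), item_average(train, 0))
print(user_knn(train, 0, 2, k=2))
```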
4.3.2 Evaluation Measure
To compare the methods, we use the standard measure in recommender systems, which is
the root mean squared error (RMSE):
$$RMSE = \sqrt{\frac{1}{|D^{test}|} \sum_{(u,i,r) \in D^{test}} \left( r_{ui} - \hat{r}_{ui} \right)^2}$$
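A direct transcription of the RMSE formula might look like this. The `rmse` helper, the `predict` callable and the toy test set are hypothetical names used only for illustration.

```python
import math

def rmse(test, predict):
    """Root mean squared error of predict(u, i) over (user, item, rating) triples."""
    return math.sqrt(sum((r - predict(u, i)) ** 2 for u, i, r in test) / len(test))

# Toy example: a constant predictor that always answers 3.
test = [(0, 0, 4.0), (0, 1, 2.0)]
print(rmse(test, lambda u, i: 3.0))   # 1.0
```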
4.3.3 Hyperparameter Setting
A hyperparameter search is applied for the MF, MRMF, MRMF++ and WMRMF++
to find the best hyperparameters, such as the number of iterations (Iter), the number
of latent factors $K$, the learning rate $\beta$, and the regularization term $\lambda$.
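The text does not say which search strategy was used. Assuming a plain grid search, such a procedure could be sketched as below. The `grid_search` helper, the candidate values and the `toy_rmse` stand-in (which replaces "train a model and return its validation RMSE") are all hypothetical.

```python
from itertools import product

def grid_search(evaluate, grid):
    """Return (best_score, best_params) for the lowest score over the grid."""
    names = list(grid)
    best_score, best_params = None, None
    for values in product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        score = evaluate(**params)
        if best_score is None or score < best_score:
            best_score, best_params = score, params
    return best_score, best_params

# Hypothetical stand-in for training a model and measuring its validation RMSE.
def toy_rmse(n_iter, K, beta, lam):
    return abs(K - 8) * 0.01 + abs(beta - 0.01) + abs(lam - 0.015) + 1.0 / n_iter

grid = {
    "n_iter": [50, 100],
    "K": [4, 8, 16],
    "beta": [0.001, 0.01],
    "lam": [0.015, 0.05],
}
best_score, best_params = grid_search(toy_rmse, grid)
print(best_params)
```

In practice, `evaluate` would train one of the models with the given settings and return its RMSE on held-out data.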
4.4 Experimental Results
Figures 6 and 7 present the RMSE results on the Movielens and Assistments datasets,
respectively. The experimental results show that the proposed MRMF++ and
WMRMF++, which take into account multiple relationships between entities, improve on
the other methods. These results show that (W)MRMF++ is a feasible approach for
multi-relational data.
5 Conclusion
In many real systems, several relationships among the data can be exploited (e.g.,
product attributes, user attributes, etc.); we have therefore introduced a new
multi-relational approach for recommender systems (MRMF++ and WMRMF++) for the
context of multi-relational data. With this approach, the prediction model can use all
of the information from the latent factor matrices in prediction, so it improves on
the other methods. The experimental results show that the proposed methods work
well on both entertainment and education data. In future work, we will test the
approach on other datasets to validate it further.
Fig. 6. RMSE on Movielens data set. The lower the better
Fig. 7. RMSE on Assistments data set. The lower the better
References
1. Ricci, F., Rokach, L., Shapira, B., Kantor, P.B. (eds.): Recommender Systems Handbook.
Springer, US (2011)
2. Su, X., Khoshgoftaar, T.M.: A survey of collaborative filtering techniques. Adv. Artif.
Intell. 2009, 4:1–4:19 (2009)
3. Bell, R.M., Koren, Y.: Scalable collaborative filtering with jointly derived neighborhood
interpolation weights. In: Proceedings of the 7th IEEE International Conference on Data
Mining (ICDM 2007), Washington, USA, pp. 43–52. IEEE CS (2007)
4. Koren, Y., Bell, R., Volinsky, C.: Matrix factorization techniques for recommender systems.
J. Comput. 42(8), 30–37 (2009). IEEE Computer Society Press
5. Lippert, C., Weber, S.H., Huang, Y., Tresp, V., Schubert, M., Kriegel, H.P.: Relation
prediction in multi-relational domains using matrix factorization. In: Proceedings of the NIPS
2008 Workshop: Structured Input-Structured Output, Vancouver, Canada, December 2008
6. Thai-Nghe, N., Schmidt-Thieme, L.: Multi-relational factorization models for student
modeling in intelligent tutoring systems. In: 2015 Seventh International Conference on
Knowledge and Systems Engineering (KSE), pp. 61–66. IEEE (2015)
7. Singh, A.P., Gordon, G.J.: Relational learning via collective matrix factorization. In:
Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery
and Data Mining (KDD 2008), pp. 650–658. ACM, New York (2008)
8. Drumond, L., Diaz-Aviles, E., Schmidt-Thieme, L., Nejdl, W.: Optimizing multi-relational
factorization models for multiple target relations. In: Proceedings of the 23rd ACM
International Conference on Information and Knowledge Management (CIKM 2014) (2014)
9. Thai-Nghe, N.: Predicting Student Performance in an Intelligent Tutoring System. Ph.D.
thesis. University of Hildesheim, Germany (2012). hildok.bsz-bw.de