Article

Local Matrix Factorization with Network Embedding for Recommender Systems


Abstract

In recommender systems, the rating matrix is usually not globally low-rank but locally low-rank. Constructing low-rank sub-matrices for matrix factorization can improve the accuracy of rating prediction. This paper proposes a novel network embedding-based local matrix factorization model, which can build more meaningful sub-matrices. To alleviate the sparsity of the rating matrix, the social data and the rating data are integrated into a heterogeneous information network, which contains multiple types of objects and relations. A network embedding algorithm extracts the node representations of users and items from the heterogeneous information network. According to the correlation of the node representations, the rating matrix is divided into different sub-matrices. Finally, matrix factorization is performed on the sub-matrices for rating prediction. We test our network embedding-based method on two real-world public data sets (Yelp and Douban). Experimental results show that our method obtains more accurate rating predictions.
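The pipeline described in the abstract (embed users and items, split the rating matrix by embedding similarity, factorize each sub-matrix) can be illustrated with a minimal sketch. The abstract does not say how the sub-matrices are formed or which factorization is used, so everything below is an assumption made for illustration: k-means stands in for the partitioning step, plain gradient-descent matrix factorization stands in for the local model, and the names `kmeans`, `factorize`, and `local_mf_predict` are hypothetical. The user and item embeddings are taken as given inputs.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain k-means on float rows of X; returns one cluster label per row."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]  # copy of k rows
    for _ in range(iters):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels

def factorize(R, mask, rank=2, lr=0.01, reg=0.05, epochs=200, seed=0):
    """Gradient-descent MF that fits only the observed entries (mask True)."""
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((R.shape[0], rank))
    Q = 0.1 * rng.standard_normal((R.shape[1], rank))
    for _ in range(epochs):
        E = mask * (R - P @ Q.T)  # error on observed entries, zero elsewhere
        P, Q = P + lr * (E @ Q - reg * P), Q + lr * (E.T @ P - reg * Q)
    return P, Q

def local_mf_predict(R, mask, user_emb, item_emb, k=2):
    """Assumed partitioning: cluster users and items by embedding,
    then factorize every (user cluster, item cluster) block separately."""
    u_lab = kmeans(user_emb, k)
    i_lab = kmeans(item_emb, k)
    pred = np.full(R.shape, np.nan)
    for a in range(k):
        for b in range(k):
            rows = np.where(u_lab == a)[0]
            cols = np.where(i_lab == b)[0]
            if len(rows) == 0 or len(cols) == 0:
                continue
            block = np.ix_(rows, cols)
            P, Q = factorize(R[block], mask[block])
            pred[block] = P @ Q.T
    return pred
```

Calling `local_mf_predict(R, mask, user_emb, item_emb)` with a float rating matrix `R`, a boolean `mask` of observed entries, and float embedding matrices returns a dense matrix of predicted ratings. Because each user and each item belongs to exactly one cluster, every cell is covered by exactly one block in this simplified variant.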


Article
The initial-boundary value problem for the variable-coefficient telegraph partial differential equation is studied. A constant-coefficient time-space telegraph partial differential equation is obtained from the variable-coefficient equation using the Cauchy-Euler formula. First- and second-order difference schemes are constructed for both the constant-coefficient and the variable-coefficient time-space telegraph partial differential equations. The matrix stability method is used to prove the stability of the difference schemes for both equations. The numerical solutions of the variable-coefficient and constant-coefficient equations are compared with the exact solution. Finally, approximate solutions are found for both equations, and an error analysis table presents the numerical results.
Article
Collaborative filtering is the most popular approach to building recommender systems, but the large scale and sparsity of the user-item matrix seriously affect the recommendation results. Recent research shows that users' social relations can improve the quality of recommendation. However, most current social recommendation algorithms consider only the user's direct social relations, while ignoring users' potential interest preferences and group clustering information. Moreover, project attributes are also important in item rating. We propose a recommendation algorithm that uses matrix factorization to fuse user information and project information. We first detect the community structure using an overlapping community discovery algorithm, and mine the clustering information of user interest preferences with a fuzzy clustering algorithm based on the project category information. In addition, we use the project-category attribution matrix and the user-project score matrix to obtain the comprehensive project similarity and compute the project feature matrix based on Entity Relation Decomposition. Fusing the user clustering information and the project information, we obtain the Entity-Association-based Matrix Factorization (EAMF) model, which can be used to predict user ratings. The proposed algorithm is compared with other algorithms on the Yelp dataset, and the experiments show that it leads to a substantial increase in recommendation accuracy.
Conference Paper
The graph embedding paradigm projects nodes of a graph into a vector space, which can facilitate various downstream graph analysis tasks such as node classification and clustering. To efficiently learn node embeddings from a graph, graph embedding techniques usually preserve the proximity between node pairs sampled from the graph using random walks. In the context of a heterogeneous graph, which contains nodes from different domains, classical random walks are biased towards highly visible domains where nodes are associated with a dominant number of paths. To overcome this bias, existing heterogeneous graph embedding techniques typically rely on meta-paths (i.e., fixed sequences of node types) to guide random walks. However, using these meta-paths either requires prior knowledge from domain experts for optimal meta-path selection, or requires extended computations to combine all meta-paths shorter than a predefined length. In this paper, we propose an alternative solution that does not involve any meta-path. Specifically, we propose JUST, a heterogeneous graph embedding technique using random walks with JUmp and STay strategies to overcome the aforementioned bias in a more efficient manner. JUST not only gracefully balances between homogeneous and heterogeneous edges, but also balances the node distribution over different domains (i.e., node types). By conducting a thorough empirical evaluation of our method on three heterogeneous graph datasets, we show the superiority of our proposed technique. In particular, compared to the state-of-the-art heterogeneous graph embedding technique Hin2vec, which tries to optimally combine all meta-paths shorter than a predefined length, our technique yields better results in most experiments, with a dramatically reduced embedding learning time (about 3x speedup).
Conference Paper
The explicitly observed social relations from online social platforms have been widely incorporated into conventional recommender systems to mitigate the data sparsity issue. However, the direct usage of explicit social relations may lead to an inferior performance due to the unreliability (e.g., noise) of observed links. To this end, the discovery of reliable relations among users plays a central role in advancing social recommender systems. In this paper, we propose a novel approach to adaptively identify implicit friends toward the discovery of more credible user relations. In particular, implicit friends are those who share similar tastes but could be distant from each other on the network topology of social relations. Methodologically, to find the implicit friends for each user, we first model the whole system as a heterogeneous information network, and then capture the similarity of users through embedding representation learning. Finally, our approach adaptively incorporates different amounts of similar users as implicit friends for each user to alleviate the adverse consequences of unreliable social relations for a more effective recommendation. Experimental analysis on three real-world datasets demonstrates the superiority of our method and explains why the implicit friends are helpful in improving the performance of social recommendation.
Conference Paper
Matrix factorization is widely used in personalized recommender systems, text mining, and computer vision. A general assumption in constructing a matrix approximation is that the original matrix is of global low rank, whereas Joonseok Lee et al. observed that many real matrices may not be globally low rank and proposed a locally low-rank matrix approximation method [11]. However, this kind of matrix approximation method still leaves some important issues unsolved, for example, the random selection of anchor points. In this paper, we study the problem of selecting anchor points to enhance locally low-rank matrix approximation. We propose a new model for local low-rank matrix approximation which selects anchor points using a heuristic method. Our experiments indicate that the proposed method outperforms many state-of-the-art recommendation methods. Moreover, the proposed method can significantly improve algorithm efficiency, and it is easy to parallelize. These traits make it promising for large-scale real-world recommender systems.
Article
Due to their flexibility in modelling data heterogeneity, heterogeneous information networks (HINs) have been adopted to characterize complex and heterogeneous auxiliary data in recommender systems, an approach called HIN-based recommendation. It is challenging to develop effective methods for HIN-based recommendation in both the extraction and the exploitation of information from HINs. Most HIN-based recommendation methods rely on path-based similarity, which cannot fully mine the latent structure features of users and items. In this paper, we propose a novel heterogeneous network embedding based approach for HIN-based recommendation, called HERec. To embed HINs, we design a meta-path based random walk strategy to generate meaningful node sequences for network embedding. The learned node embeddings are first transformed by a set of fusion functions, and subsequently integrated into an extended matrix factorization (MF) model. The extended MF model and the fusion functions are jointly optimized for the rating prediction task. Extensive experiments on three real-world datasets demonstrate the effectiveness of the HERec model. Moreover, we show the capability of the HERec model for the cold-start problem, and reveal that the transformed embedding information from HINs can improve the recommendation performance.
Conference Paper
Although recommender systems have been comprehensively analyzed in the past decade, the study of social-based recommender systems has just started. In this paper, aiming at providing a general method for improving recommender systems by incorporating social network information, we propose a matrix factorization framework with social regularization. The contributions of this paper are four-fold: (1) We elaborate how social network information can benefit recommender systems; (2) We interpret the differences between social-based recommender systems and trust-aware recommender systems; (3) We coin the term Social Regularization to represent the social constraints on recommender systems, and we systematically illustrate how to design a matrix factorization objective function with social regularization; and (4) The proposed method is quite general and can be easily extended to incorporate other contextual information, such as social tags. The empirical analysis on two large datasets demonstrates that our approaches outperform other state-of-the-art methods.
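To make the idea of a matrix factorization objective with a social constraint concrete, one commonly used form is sketched below. The notation is an assumption and is not taken from this abstract: $U_i$ and $V_j$ are latent vectors of user $i$ and item $j$, $\Omega$ is the set of observed ratings, $\mathcal{F}(i)$ is the friend set of user $i$, $s_{if}$ is a similarity weight between user $i$ and friend $f$, and $\beta$, $\lambda$ are trade-off hyperparameters (a single $\lambda$ is used for both factor matrices for brevity).

```latex
\min_{U,V}\;
\frac{1}{2}\sum_{(i,j)\in\Omega}\bigl(R_{ij}-U_i^{\top}V_j\bigr)^{2}
+\frac{\beta}{2}\sum_{i}\sum_{f\in\mathcal{F}(i)} s_{if}\,\lVert U_i-U_f\rVert_2^{2}
+\frac{\lambda}{2}\bigl(\lVert U\rVert_F^{2}+\lVert V\rVert_F^{2}\bigr)
```

The first term fits the observed ratings, the second pulls each user's latent vector toward those of similar friends, and the third is the usual norm penalty.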
Conference Paper
Most heterogeneous information network (HIN) based recommendation models rely on user and item modeling with meta-paths. However, they always model users and items in isolation under each meta-path, which may mislead information extraction. In addition, they consider only the structural features of HINs when modeling users and items while exploring HINs, which may cause information useful for recommendation to be irreversibly lost. To address these problems, we propose a HIN-based unified embedding model for recommendation, called HueRec. We assume there exist some common characteristics under different meta-paths for each user or item, and use data from all meta-paths to learn unified user and item representations. The interrelation between meta-paths is thus utilized to alleviate the problems of data sparsity and noise on any one meta-path. Unlike existing models, which first explore HINs and then make recommendations, we combine these two parts into an end-to-end model to avoid losing useful information in the initial phases. In addition, we embed all users, items and meta-paths into related latent spaces. Therefore, we can measure users' preferences for meta-paths to improve the performance of personalized recommendation. Extensive experiments show that HueRec consistently outperforms state-of-the-art methods.
Article
Recommendation methods based on heterogeneous information networks (HINs) have been attracting increased attention recently. Meta paths in HINs represent different types of semantic relationships. Meta path-based recommendation methods aim to use meta paths in HINs to evaluate the similarity or relevancy between nodes to make recommendations. In previous work, the meta paths have usually been selected manually (based on experience), and the path weight optimization methods usually suffer from overfitting. To solve these problems, we propose to automatically select and combine the meta paths through weight optimization. Diversity is introduced into the objective function as a regularization term to avoid overfitting. Inspired by the ambiguity decomposition theory in ensemble learning, we present a new diversity measure and use it to encourage diversity among meta paths to improve recommendation performance. Experimental results on item recommendation and tag recommendation tasks confirm the effectiveness of the proposed method compared with traditional collaborative filtering and state-of-the-art HIN-based recommendation methods.
Conference Paper
Matrix Factorization (MF) is a very popular method for recommendation systems. It assumes that the underlying rating matrix is low-rank. However, this assumption can be too restrictive to capture complex relationships and interactions among users and items. Recently, Local LOw-Rank Matrix Approximation (LLORMA) has been shown to be very successful in addressing this issue. It assumes only that the rating matrix is composed of a number of low-rank submatrices constructed from subsets of similar users and items. Although LLORMA outperforms MF, how to construct such submatrices remains a big problem. Motivated by the availability of rich social connections in today's recommendation systems, we propose a novel framework, i.e., Social LOcal low-rank Matrix Approximation (SLOMA), to address this problem. To the best of our knowledge, SLOMA is the first work to incorporate social connections into the local low-rank framework. Furthermore, we enhance SLOMA by applying social regularization to the factorization of the submatrices, denoted as SLOMA++. Therefore, the proposed model can benefit from both social recommendation and the local low-rank assumption. Experimental results from two real-world datasets, Yelp and Douban, demonstrate the superiority of the proposed models over LLORMA and MF.
Conference Paper
Heterogeneous Information Network (HIN) is a natural and general representation of data in modern large commercial recommender systems which involve heterogeneous types of data. HIN-based recommenders face two problems: how to represent the high-level semantics of recommendations and how to fuse the heterogeneous information to make recommendations. In this paper, we solve the two problems by first introducing the concept of meta-graph to HIN-based recommendation, and then solving the information fusion problem with a "matrix factorization (MF) + factorization machine (FM)" approach. For the similarities generated by each meta-graph, we perform standard MF to generate latent features for both users and items. With different meta-graph based features, we propose a group lasso regularized FM to automatically learn from the observed ratings to effectively select useful meta-graph based features. Experimental results on two real-world datasets, Amazon and Yelp, show the effectiveness of our approach compared to state-of-the-art FM and other HIN-based recommendation algorithms.
Article
Matrix approximation is a common tool in recommendation systems, text mining, and computer vision. A prevalent assumption in constructing matrix approximations is that the partially observed matrix is low-rank. In this paper, we propose, analyze, and experiment with two procedures, one parallel and the other global, for constructing local matrix approximations. The two approaches approximate the observed matrix as a weighted sum of low-rank matrices. These matrices are limited to a local region of the observed matrix. We analyze the accuracy of the proposed local low-rank modeling. Our experiments show improvements in prediction accuracy over classical approaches for recommendation tasks. ©2016 Joonseok Lee, Seungyeon Kim, Guy Lebanon, Yoram Singer and Samy Bengio.
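The idea of approximating the observed matrix as a weighted sum of local low-rank matrices can be sketched as a normalized, per-entry weighted average of local predictions. This is an illustrative assumption and not the procedure from the paper: the function name `combine_local_models` is hypothetical, and the weights (which in a local model would typically come from a kernel that decays away from each model's local region) are simply passed in as arrays.

```python
import numpy as np

def combine_local_models(local_preds, weights):
    """Per-entry normalized weighted average of T local predictions.

    local_preds, weights: arrays of shape (T, n_users, n_items);
    weights are non-negative, and zero where a local model does not apply.
    Entries covered by no local model are returned as NaN.
    """
    local_preds = np.asarray(local_preds, dtype=float)
    weights = np.asarray(weights, dtype=float)
    num = (weights * local_preds).sum(axis=0)
    denom = weights.sum(axis=0)
    return np.divide(num, denom, out=np.full(denom.shape, np.nan), where=denom > 0)
```

Setting a model's weight to zero outside its region is what restricts each low-rank matrix to a local part of the observed matrix in this sketch.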
Conference Paper
Prediction tasks over nodes and edges in networks require careful effort in engineering features used by learning algorithms. Recent research in the broader field of representation learning has led to significant progress in automating prediction by learning the features themselves. However, present feature learning approaches are not expressive enough to capture the diversity of connectivity patterns observed in networks. Here we propose node2vec, an algorithmic framework for learning continuous feature representations for nodes in networks. In node2vec, we learn a mapping of nodes to a low-dimensional space of features that maximizes the likelihood of preserving network neighborhoods of nodes. We define a flexible notion of a node's network neighborhood and design a biased random walk procedure, which efficiently explores diverse neighborhoods. Our algorithm generalizes prior work which is based on rigid notions of network neighborhoods, and we argue that the added flexibility in exploring neighborhoods is the key to learning richer representations. We demonstrate the efficacy of node2vec over existing state-of-the-art techniques on multi-label classification and link prediction in several real-world networks from diverse domains. Taken together, our work represents a new way for efficiently learning state-of-the-art task-independent representations in complex networks.
Chapter
The collaborative filtering (CF) approach to recommenders has recently enjoyed much interest and progress. The fact that it played a central role within the recently completed Netflix competition has contributed to its popularity. This chapter surveys the recent progress in the field. Matrix factorization techniques, which became a first choice for implementing CF, are described together with recent innovations. We also describe several extensions that bring competitive accuracy into neighborhood methods, which used to dominate the field. The chapter demonstrates how to utilize temporal models and implicit feedback to improve model accuracy. In passing, we include detailed descriptions of some of the central methods developed for tackling the challenge of the Netflix Prize competition.
Article
We analyze skip-gram with negative-sampling (SGNS), a word embedding method introduced by Mikolov et al., and show that it is implicitly factorizing a word-context matrix, whose cells are the pointwise mutual information (PMI) of the respective word and context pairs, shifted by a global constant. We find that another embedding method, NCE, is implicitly factorizing a similar matrix, where each cell is the (shifted) log conditional probability of a word given its context. We show that using a sparse Shifted Positive PMI word-context matrix to represent words improves results on two word similarity tasks and one of two analogy tasks. When dense low-dimensional vectors are preferred, exact factorization with SVD can achieve solutions that are at least as good as SGNS's solutions for word similarity tasks. On analogy questions SGNS remains superior to SVD. We conjecture that this stems from the weighted nature of SGNS's factorization.
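The Shifted Positive PMI matrix mentioned in this abstract can be computed directly from a word-context co-occurrence count matrix. The sketch below assumes the usual definitions, PMI(w, c) = log(#(w, c) · |D| / (#(w) · #(c))) and SPPMI(w, c) = max(PMI(w, c) − log k, 0), where k is the shift constant (the number of negative samples in SGNS). The function name `sppmi` is hypothetical, and the output is dense for simplicity even though the abstract refers to a sparse representation.

```python
import numpy as np

def sppmi(counts, k=1):
    """Shifted Positive PMI from a (words x contexts) co-occurrence count matrix."""
    counts = np.asarray(counts, dtype=float)
    total = counts.sum()                        # |D|: total number of pairs
    row = counts.sum(axis=1, keepdims=True)     # #(w)
    col = counts.sum(axis=0, keepdims=True)     # #(c)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(counts * total / (row * col))
    # Unseen pairs (log 0) and empty rows/columns (0/0) are treated as -inf,
    # so they become 0 after the positive clipping below.
    pmi = np.where(np.isfinite(pmi), pmi, -np.inf)
    return np.maximum(pmi - np.log(k), 0.0)
```

With `k=1` this reduces to the ordinary positive PMI matrix; larger `k` zeroes out more weakly associated pairs.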
Conference Paper
Users of popular services like Twitter and Facebook are often simultaneously overwhelmed with the amount of information delivered via their social connections and miss out on much content that they might have liked to see, even though it was distributed outside of their social circle. Both issues serve as difficulties to the users and drawbacks to the services. Social media service providers can benefit from understanding user interests and how they interact with the service, potentially predicting their behaviors in the future. In this paper, we address the problem of simultaneously predicting user decisions and modeling users' interests in social media by analyzing rich information gathered from Twitter. The task differs from conventional recommender systems as the cold-start problem is ubiquitous, and rich features, including textual content, need to be considered. We build predictive models for user decisions in Twitter by proposing Co-Factorization Machines (CoFM), an extension of a state-of-the-art recommendation model, to handle multiple aspects of the dataset at the same time. Additionally, we discuss and compare ranking-based loss functions in the context of recommender systems, providing the first view of how they vary from each other and perform in real tasks. We explore an extensive set of features and conduct experiments on a real-world dataset, concluding that CoFM with ranking-based loss functions is superior to state-of-the-art methods and yields interpretable latent factors.
Article
We present DeepWalk, a novel approach for learning latent representations of vertices in a network. These latent representations encode social relations in a continuous vector space, which is easily exploited by statistical models. DeepWalk generalizes recent advancements in language modeling and unsupervised feature learning (or deep learning) from sequences of words to graphs. DeepWalk uses local information obtained from truncated random walks to learn latent representations by treating walks as the equivalent of sentences. We demonstrate DeepWalk's latent representations on several multi-label network classification tasks for social networks such as BlogCatalog, Flickr, and YouTube. Our results show that DeepWalk outperforms challenging baselines which are allowed a global view of the network, especially in the presence of missing information. DeepWalk's representations can provide F1 scores up to 10% higher than competing methods when labeled data is sparse. In some experiments, DeepWalk's representations are able to outperform all baseline methods while using 60% less training data. DeepWalk is also scalable. It is an online learning algorithm which builds useful incremental results, and is trivially parallelizable. These qualities make it suitable for a broad class of real world applications such as network classification, and anomaly detection.
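The walk-generation step described here, truncated random walks that are later treated like sentences, can be sketched as follows. This covers only the sampling of walks and assumes uniform neighbor selection on an unweighted graph given as an adjacency dictionary of lists; the subsequent representation learning on the walks is omitted. The function name `truncated_random_walks` and its parameters are hypothetical.

```python
import random

def truncated_random_walks(adj, walks_per_node=2, walk_length=5, seed=0):
    """Sample fixed-length uniform random walks starting from every node.

    adj: dict mapping each node to a list of its neighbors.
    Returns a list of walks, each a list of nodes (a "sentence").
    """
    rng = random.Random(seed)
    nodes = list(adj)
    walks = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)                 # visit start nodes in random order
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adj[walk[-1]]
                if not neighbors:          # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks
```

The resulting list of walks can be passed to any sequence-based embedding learner in place of a text corpus.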
Article
As the Netflix Prize competition has demonstrated, matrix factorization models are superior to classic nearest neighbor techniques for producing product recommendations, allowing the incorporation of additional information such as implicit feedback, temporal effects, and confidence levels.
Conference Paper
Recommender systems are becoming tools of choice to select the online information relevant to a given user. Collaborative filtering is the most popular approach to building recommender systems and has been successfully employed in many applications. With the advent of online social networks, the social network based approach to recommendation has emerged. This approach assumes a social network among users and makes recommendations for a user based on the ratings of the users that have direct or indirect social relations with the given user. As one of their major benefits, social network based approaches have been shown to reduce the problems with cold start users. In this paper, we explore a model-based approach for recommendation in social networks, employing matrix factorization techniques. Advancing previous work, we incorporate the mechanism of trust propagation into the model. Trust propagation has been shown to be a crucial phenomenon in the social sciences, in social network analysis and in trust-based recommendation. We have conducted experiments on two real life data sets, the public domain Epinions.com dataset and a much larger dataset that we have recently crawled from Flixster.com. Our experiments demonstrate that modeling trust propagation leads to a substantial increase in recommendation accuracy, in particular for cold start users.
Conference Paper
Recommender systems for automatically suggesting items of interest to users have become increasingly essential in fields where mass personalization is highly valued. The popular core techniques of such systems are collaborative filtering, content-based filtering and combinations of these. In this paper, we discuss hybrid approaches that use both collaborative and content data to address cold-start, that is, giving recommendations to novel users who have no preference on any items, or recommending items that no user of the community has seen yet. While there have been many studies on solving the item-side problems, solutions for the user-side problems have not been published. We therefore develop a hybrid model, based on the analysis of two probabilistic aspect models, that combines pure collaborative filtering with users' information. The experiments with MovieLens data indicate substantial and consistent improvements of this model in overcoming the user-side cold-start problem.
Mnih, A., Salakhutdinov, R. R.: Probabilistic Matrix Factorization. In: Platt, J., Koller, D., Singer, Y., Roweis, S. (Eds.): Advances in Neural Information Processing Systems 20 (NIPS 2007). Curran Associates, Inc., 2007, pp. 1257-1264, https://proceedings.neurips.cc/paper_files/paper/2007/file/d7322ed717dedf1eb4e6e52a37ea7bcd-Paper.pdf.
Paterek, A.: Improving Regularized Singular Value Decomposition for Collaborative Filtering. Proceedings of KDD Cup and Workshop (KDDCup.07), 2007, pp. 5-8.