Article

Web-Scale Media Recommendation Systems


Abstract

Modern consumers are inundated with choices: a wide variety of products is offered to them, giving unprecedented opportunities to select products that meet their needs. That same abundance, however, makes selection time-consuming. This has led to the development of recommender systems that direct consumers to products expected to satisfy them. One area in which such systems are particularly useful is that of media products, such as movies, books, television, and music. We study the details of media recommendation by focusing on a large-scale music recommender system. To this end, we introduce a music rating data set that is likely the largest of its kind in terms of number of users, number of items, and total number of raw ratings. The data were collected by Yahoo! Music over a decade. We formulate a detailed recommendation model specifically designed to account for the data set's properties, its temporal dynamics, and the provided taxonomy of items. The paper demonstrates a design process that we believe to be useful in many other recommendation setups. The process is based on gradual modeling of additive components of the model, each trying to reflect a unique characteristic of the data.
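The gradual additive-component design described in the abstract can be hedged into a small sketch: a toy stand-in for the rating data (not the Yahoo! Music set itself), in which each component, global mean, then user bias, then item bias, explains the residual left by its predecessors. Further components (temporal effects, taxonomy) would be layered on in the same fashion.

```python
import numpy as np

# Illustrative sketch only -- toy data, not the authors' actual model or data set.
ratings = [  # (user, item, rating) triples on a 0-100 scale
    (0, 0, 90), (0, 1, 70), (1, 0, 80), (1, 2, 30), (2, 1, 50), (2, 2, 40),
]
n_users, n_items = 3, 3

mu = np.mean([r for _, _, r in ratings])      # component 1: global mean

b_user = np.zeros(n_users)                    # component 2: user bias on the residual
for u in range(n_users):
    res = [r - mu for uu, _, r in ratings if uu == u]
    b_user[u] = np.mean(res) if res else 0.0

b_item = np.zeros(n_items)                    # component 3: item bias on what remains
for i in range(n_items):
    res = [r - mu - b_user[u] for u, ii, r in ratings if ii == i]
    b_item[i] = np.mean(res) if res else 0.0

def predict(u, i):
    # Each added component refines the prediction of the previous ones.
    return mu + b_user[u] + b_item[i]
```

Fitting each component against the residual of the previous ones is what makes the contribution of each characteristic of the data separately measurable.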


... The main task of a recommender system is to predict users' preferences for specific items based on historical ratings and feedback. Currently, the most popular approach is collaborative filtering (CF), which analyzes relationships between users and interdependencies among items to associate new items with users [7], relying on previous user ratings. CF methods were widely used in the Netflix contest [3] and achieved fairly good performance [11]. ...
... where I_ij indicates whether the rating from user i to item j is observed (I_ij = 1) or not (I_ij = 0). Various approaches have been proposed to obtain more accurate predicted ratings, including SVD++ and its variations [7,11], which approximate the ratings with a bias term (b_ij) and implicit feedback (f(i)). ...
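A rough sketch of the SVD++-style prediction rule this excerpt refers to, with random toy parameters and assumed shapes (the dimensions and the name `y` for the implicit-feedback factors are illustrative, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 4, 5, 3
mu = 3.5                                  # global mean rating (assumed)
b_u = rng.normal(0, 0.1, n_users)         # user biases
b_i = rng.normal(0, 0.1, n_items)         # item biases
p = rng.normal(0, 0.1, (n_users, k))      # explicit user factors
q = rng.normal(0, 0.1, (n_items, k))      # item factors
y = rng.normal(0, 0.1, (n_items, k))      # implicit-feedback item factors

def predict(u, i, implicit_items):
    """SVD++-style rule: biases plus item factors dotted with the user
    factor augmented by normalized implicit feedback."""
    fb = y[implicit_items].sum(axis=0) / np.sqrt(len(implicit_items))
    return mu + b_u[u] + b_i[i] + q[i] @ (p[u] + fb)
```

The implicit term lets the model use which items a user touched, not only how the user rated them.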
... Unlike the Netflix challenge [3] or the Yahoo! Music dataset [7], where the RMSE (root mean square error) is widely adopted, the situation in our scenario is different, because the user ratings in the FlickrUserFavor dataset are binary rather than 5-level scores. Thus, using the RMSE as the evaluation measure for the FlickrUserFavor dataset would be biased. ...
Article
With the incredibly growing amount of multimedia data shared on social media platforms, recommender systems have become an important necessity to ease users' burden of information overload. In such a scenario, an extensive amount of heterogeneous information, such as tags and image content in addition to user-to-item preferences, is extremely valuable for making effective recommendations. In this paper, we explore a novel hybrid algorithm, termed STM, for image recommendation. STM jointly considers the problem of image content analysis with the users' preferences on the basis of sparse representation. STM is able to tackle the challenges of highly sparse user feedback and cold-start problems in the social network scenario. In addition, our model is based on classical probabilistic matrix factorization and can easily be extended to incorporate other useful information, such as social relationships. We evaluate our approach on a newly collected 0.3-million-image social data set from Flickr. The experimental results demonstrate that sparse topic modeling of the image content leads to more effective recommendations, with a significant performance gain over state-of-the-art alternatives.
... It has been shown that in collaborative filtering problems, much of the signal lies in simple popularity biases [71]. For example, the winning model in the Netflix Prize competition [10] managed to explain 42.6% of the ratings' variance, i.e., R² = 42.6%, but the vast majority of the learned signal was attributed to popularity biases, which explained a whopping R² = 32.5% of the variance (without any personalization) [72]. ...
... Music dataset, the decrease in item importance is considerably more significant than in the other datasets. This implies that in this dataset the non-stationary temporal trends are of higher importance, in accordance with a deeper analysis performed on this dataset for KDD-Cup'11 [72]. In contrast, the non-stationary temporal effects on the Netflix dataset, while evidently significant, are of less importance than in other datasets. ...
Article
Full-text available
Collaborative filtering methods for recommender systems tend to represent users as a single static latent vector. However, user behavior and interests may dynamically change in the context of the recommended item being presented to the user. For example, in the case of movie recommendations, it is usually true that movies that the user watched more recently are more informative than movies that were watched a long time ago. However, it is possible that a particular movie from the past may become suddenly more relevant for prediction in the presence of a recommendation for its sequel movie. In response to this issue, we introduce the Attentive Item2Vec++ (AI2V++) model, a neural attentive collaborative filtering approach in which the user representation adapts dynamically in the presence of the recommended item. AI2V++ employs a novel context-target attention mechanism in order to learn and capture different characteristics of the user’s historical behavior with respect to a potential recommended item. Furthermore, analysis of the neural-attentive scores allows for improved interpretability and explainability of the model. We evaluate our proposed approach on five publicly available datasets and demonstrate its superior performance in comparison to state-of-the-art baselines across multiple accuracy metrics.
... Motivated by that, in order to learn the various parameters, for each given user u, we first sort the usage points of all her item-interaction histories R_ui chronologically and hide the last usage point; let i_p denote the item whose usage point was hidden. We then apply negative sampling proportional to item popularity [6,19] and pick an item i_n ∉ R_u. The objective of the model is to maximize the log-likelihood of the hidden items. ...
... Using each dataset, we evaluated our approach and the various baseline methods using an "All But (Last) One" protocol [3] (i.e., the hidden item belongs to the last usage point). Recommender systems usually present or display only a few items at once (e.g., 1-20, where the exact amount is system dependent). Hence the item a user may pick should be among the first few items on the list. ...
Conference Paper
Full-text available
Most collaborative filtering models assume that the interactions of users with items take a single form, e.g., only ratings, clicks, or views. In fact, in most real-life recommendation scenarios, users interact with items in diverse ways. This, in turn, generates complex usage data that contain multiple and diverse types of user feedback. In addition, within such a complex data setting, each user-item pair may occur more than once, implying repetitive preferential user behaviors. In this work we tackle the problem of building a collaborative filtering model that takes such complex datasets into account. We propose a novel factor model, CDMF, that is capable of incorporating arbitrary and diverse feedback types without any prior domain knowledge. Moreover, CDMF is inherently capable of considering user-item repetitions. We evaluate CDMF against state-of-the-art methods with highly favorable results.
... Context could be location, time, family members around, IoT products located nearby, and so on. Context-aware recommendation [39] can also be done by analysing similar users with similar smart environments. ...
Preprint
The Internet of Things (IoT) is a dynamic global information network consisting of Internet-connected objects, such as RFIDs, sensors, and actuators, as well as other instruments and smart appliances that are becoming an integral component of the Internet. Over the last few years, we have seen a plethora of IoT solutions making their way into the industry marketplace. Context-aware communication and computing has played a critical role throughout the last few years of ubiquitous computing and is expected to play a significant role in the IoT paradigm as well. In this article, we examine a variety of popular and innovative IoT solutions in terms of context-aware technology perspectives. More importantly, we evaluate these IoT solutions using a framework that we built around well-known context-aware computing theories. This survey is intended to serve as a guideline and a conceptual framework for context-aware product development and research in the IoT paradigm. It also provides a systematic exploration of existing IoT products in the marketplace and highlights a number of potentially significant research directions and trends.
... Context could be location, time, family members around, IoT products located nearby, and so on. Context-aware recommendation [49] can also be done by analysing similar users with similar smart environments. ...
Book
Full-text available
Please download the book from: https://leanpub.com/sensingasaservice
... Our work is related to this in the sense that we aim to recommend the top candidates who are most suitable for the vacancy. A popular method in recommendation (collaborative filtering) is the latent factor model [27,14,42]. The basic idea is to apply matrix factorization to user-item rating data to identify the latent factors. ...
Conference Paper
In this paper, we study the problem of Team Member Replacement: given a team of people embedded in a social network working on the same task, find a good candidate to best replace a team member who becomes unavailable to perform the task for some reason (e.g., conflicts of interest or resource capacity). Prior studies in teamwork have suggested that a good team member replacement should bring synergy to the team in terms of having both skill matching and structure matching. However, existing techniques either do not cover both aspects or consider the two aspects independently. In this work, we propose a novel problem formulation using the concept of graph kernels that takes into account the interaction of both skill and structure matching requirements. To tackle the computational challenges, we propose a family of fast algorithms by (a) designing effective pruning strategies, and (b) exploring the smoothness between the existing and the new team structures. We conduct extensive experimental evaluations and user studies on real-world datasets to demonstrate the effectiveness and efficiency. Our algorithms (a) perform significantly better than the alternative choices in terms of both precision and recall and (b) scale sub-linearly.
... In most cases search engines, as well as social and web feeds, are capable of providing answers, i.e., syntheses of the plethora of information that is available online, often ranked using user-generated evaluations and scores [3], [4]. Recommendation systems, in addition, regularly push suggestions and answers even when no search is being undertaken, hence stimulating users with information that is typically based on past preferences and choices [5], [6], [7]. ...
Conference Paper
Full-text available
Mobile devices and web pages increasingly set not only the direction, but also the pace of many everyday activities. In essence, the lives of many people today follow algorithmic paths, provided by navigation units and by social recommendation systems. Although this improves the efficiency and functionality of many tasks, it may also lead to a standardized and, perhaps, oversimplified approach to reality. In essence, the many likes on social pages (e.g., Facebook), star ratings on leading traveler websites (e.g., Tripadvisor), and reviews provided by the online crowd may lead the lion's share of users to visit only a limited number of locations. This means that in many cases, people with very different backgrounds, tastes, cultural awareness, and sensitivity may end up in the very same places while missing more appropriate ones, be they historical or commercial. The work presented in this paper aims at taking a first step toward unveiling this problem and at experimenting with possible working strategies which may better represent the significance of a location, while still conserving the simplicity of the most commonly utilized evaluation systems.
... The proposed method is expected to be applicable to mobile applications, such as e-mail, messenger, and image viewing services. Furthermore, we evaluated the proposed method by implementing a prototype of a music recommendation system. The intelligent music recommender uses contextual information, which is currently being studied by many researchers [22][23][24][25]. Our recommendation system may be further developed to incorporate more meaningful information through combining contextual information, such as the weather at the moment the user plays or skips such music. ...
Article
Full-text available
Data sorting is an important procedure with many applications in computer science. Thus, various methods that improve the performance of sorting algorithms have been published. Most existing algorithms mainly aim to achieve a faster processing time for sorting and require memory space to load the entire data set. Such designs can be a burden for small devices that have relatively insufficient resources. This paper proposes a new sorting method for mobile devices that finds the minimum and maximum values by reading only a portion of the data into memory, unlike existing methods that read all the data into memory. This phase is repeated as often as necessary, in accordance with the size of the data set. The min/max values that are determined are written at the start and end points of the data set in storage. As the process is repeated, sorting can take place even when only a small amount of memory is available, such as on a mobile device. In order to evaluate the proposed method, we implemented a prototype for a music recommendation system, and we conducted comparative experiments with representative sorting methods. The results indicate that the proposed method consumed significantly less memory, and we confirmed its effectiveness and potential.
... Here, contexts could be location, time, family members around, IoT products located nearby, and so on. Context-aware recommendations [39] can also be done by analyzing similar users with similar smart environments. ...
Article
Full-text available
The Internet of Things (IoT) is a dynamic global information network consisting of Internet-connected objects, such as radio frequency identifications, sensors, and actuators, as well as other instruments and smart appliances that are becoming an integral component of the Internet. Over the last few years, we have seen a plethora of IoT solutions making their way into the industry marketplace. Context-aware communications and computing have played a critical role throughout the last few years of ubiquitous computing and are expected to play a significant role in the IoT paradigm as well. In this paper, we examine a variety of popular and innovative IoT solutions in terms of context-aware technology perspectives. More importantly, we evaluate these IoT solutions using a framework that we built around well-known context-aware computing theories. This survey is intended to serve as a guideline and a conceptual framework for context-aware product development and research in the IoT paradigm. It also provides a systematic exploration of existing IoT products in the marketplace and highlights a number of potentially significant research directions and trends.
... Our work is related to this in the sense that we aim to recommend the top candidates who are most suitable for the vacancy. A popular method in recommendation (collaborative filtering) is the latent factor model [26,13,41]. The basic idea is to apply matrix factorization to user-item rating data to identify the latent factors. ...
Article
In this paper, we study the problem of Team Member Replacement: given a team of people embedded in a social network working on the same task, find a good candidate who can fit in the team after one team member becomes unavailable. We conjecture that a good team member replacement should have good skill matching as well as good structure matching. We formulate this problem using the concept of graph kernel. To tackle the computational challenges, we propose a family of fast algorithms by (a) designing effective pruning strategies, and (b) exploring the smoothness between the existing and the new team structures. We conduct extensive experimental evaluations on real world datasets to demonstrate the effectiveness and efficiency. Our algorithms (a) perform significantly better than the alternative choices in terms of both precision and recall; and (b) scale sub-linearly.
... (This corresponds to a Gaussian model of the data [28].) MF has been extended in many ways to implement modern recommendation systems [7,21,27,31]. ...
Article
We develop a Bayesian Poisson matrix factorization model for forming recommendations from sparse user behavior data. These data are large user/item matrices where each user has provided feedback on only a small subset of items, either explicitly (e.g., through star ratings) or implicitly (e.g., through views or purchases). In contrast to traditional matrix factorization approaches, Poisson factorization implicitly models each user's limited attention to consume items. Moreover, because of the mathematical form of the Poisson likelihood, the model needs only to explicitly consider the observed entries in the matrix, leading to both scalable computation and good predictive performance. We develop a variational inference algorithm for approximate posterior inference that scales up to massive data sets. This is an efficient algorithm that iterates over the observed entries and adjusts an approximate posterior over the user/item representations. We apply our method to large real-world user data containing users rating movies, users listening to songs, and users reading scientific papers. In all these settings, Bayesian Poisson factorization outperforms state-of-the-art matrix factorization methods.
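The scalability claim in this abstract, that the Poisson likelihood needs to touch only the observed entries, can be sketched as follows. The shapes, the Gamma draws, and the toy observations are illustrative assumptions, not the paper's inference code:

```python
import numpy as np

# With log-likelihood  sum_{u,i} [ y_ui * log(theta_u . beta_i) - theta_u . beta_i ],
# the first term involves only the nonzero (observed) entries, and the second
# collapses algebraically to a dot product of column sums -- so the full
# n_users x n_items rate matrix is never materialized.
rng = np.random.default_rng(1)
n_users, n_items, k = 100, 200, 5
theta = rng.gamma(1.0, 1.0, (n_users, k))   # nonnegative user preferences
beta = rng.gamma(1.0, 1.0, (n_items, k))    # nonnegative item attributes
obs = [(0, 3, 2), (5, 7, 1), (42, 199, 4)]  # sparse (user, item, count) data

# sum_{u,i} theta_u . beta_i  ==  sum_k (sum_u theta_uk)(sum_i beta_ik)
rate_term = theta.sum(axis=0) @ beta.sum(axis=0)
loglik = sum(y * np.log(theta[u] @ beta[i]) for u, i, y in obs) - rate_term
```

This is the mathematical form that makes the computation scale with the number of observations rather than with the size of the full matrix.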
... The ever-growing demand for automated recommendations at scale over the recent decade has promoted the development of a great variety of techniques to power recommender systems. Collaborative filtering (CF) algorithms are often the alternative of choice, among which the matrix factorization (MF) [14] technique is a very popular and successful one [3,5,1,21]. In its most basic form, MF associates each user and each item with a latent factor (vector). ...
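MF in its most basic form, as described in this excerpt, can be sketched with toy data as plain SGD on the squared error over observed ratings (an illustrative sketch, not any particular paper's implementation):

```python
import numpy as np

# Toy observed ratings: (user, item, rating); unobserved cells are never touched.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 4.0), (2, 2, 2.0)]
n_users, n_items, k = 3, 3, 2
rng = np.random.default_rng(0)
P = rng.normal(0, 0.1, (n_users, k))   # user latent factors
Q = rng.normal(0, 0.1, (n_items, k))   # item latent factors
lr, reg = 0.05, 0.02                   # learning rate, L2 regularization

for _ in range(500):                   # SGD epochs over the observed ratings
    for u, i, r in ratings:
        err = r - P[u] @ Q[i]
        P[u], Q[i] = (P[u] + lr * (err * Q[i] - reg * P[u]),
                      Q[i] + lr * (err * P[u] - reg * Q[i]))

def predict(u, i):
    return float(P[u] @ Q[i])          # a rating estimate for any (user, item) pair
```

Once trained, `predict` scores unobserved user-item pairs, which is the basis of the recommendation step.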
Article
Full-text available
One of the most challenging recommendation tasks is recommending to a new, previously unseen user. This is known as the user cold-start problem. Assuming certain features or attributes of users are known, one approach for handling new users is to initially model them based on their features. Motivated by an ad-targeting application, this paper describes an extreme online recommendation setting where the cold-start problem is perpetual. Every user is encountered by the system just once, receives a recommendation, and either consumes or ignores it, registering a binary reward. We introduce One-pass Factorization of Feature Sets ('OFF-Set'), a novel recommendation algorithm based on latent factor analysis, which models users by mapping their features to a latent space. OFF-Set is able to model non-linear interactions between pairs of features, and updates its model with each recommendation-reward observation in a purely online fashion. We evaluate OFF-Set against several state-of-the-art baselines and demonstrate its superiority on real ad-targeting data.
Article
The rapid development of the Internet provides many high-quality data sets for understanding human online behavior patterns. Many analysis tools from complexity science, such as complex networks and human dynamics, have been applied to study human online activities. However, in most existing works, the topological and temporal features of human online activities are analyzed independently. In this paper, we connect these two dimensions by investigating the relations between online users' inter-event times and the network distance between the items they select. We find that users choose items with longer network distance when the inter-event time is long, and vice versa. This finding is then applied to improve the recommendation process, where recommendation diversity should be given more weight when users return from a long inactive period. Finally, we use a trade-off between accuracy and complexity to further improve the recommendation algorithms.
Article
Full-text available
Recommender systems recommend items to users based on their interests and have seen tremendous growth due to the use of the internet and web services. Recommendation systems have seen an escalating growth rate since the late 1990s. A query on Google Scholar (a well-known research search engine) returns 175,000 articles for the query "recommender system". With such a large database of research and application articles, there arises a need to analyse the data so as to effectively understand the potential of the quantum of literature available so far. The study focuses on the topic of recommender systems with various soft computing techniques such as fuzzy logic, neural networks, and genetic algorithms. The major contribution of this work is the demonstration of progressive knowledge for domain visualization and analysis of recommender systems with soft computing techniques. The analysis is supported by various scientometric indicators such as Relative Growth Rate (RGR), Doubling Time (DT), Co-Authorship Index (CAI), Author Productivity, Degree of Collaboration, Research Priority Index (RPI), Half Life, Country-wise Productivity, Citation Analysis, Page Length Distribution, and Source Contributors. This research presents a first-of-its-kind scientometric analysis of "recommender systems with soft computing techniques". The present work provides useful parameters for establishing relationships between quantifiable data and intangible contributions in the field of recommender systems.
Conference Paper
Collaborative filtering for recommender systems seeks to learn and predict user preferences for a collection of items by identifying similarities between users on the basis of their past interest or interaction with the items in question. In this work, we present a conjugate prior regularized extension of Hofmann's Gaussian emission probabilistic latent semantic analysis model, able to overcome the over-fitting problem restricting the performance of the earlier formulation. Furthermore, in experiments using the EachMovie and MovieLens data sets, it is shown that the proposed regularized model achieves significantly improved prediction accuracy of user preferences as compared to the latent semantic analysis model without priors.
Article
In this paper, we study ways to enhance the composition of teams based on new requirements in a collaborative environment. We focus on recommending team members who can maintain the team’s performance by minimizing changes to the team’s skills and social structure. Our recommendations are based on computing team-level similarity, which includes skill similarity, structural similarity as well as the synergy between the two. Current heuristic approaches are one-dimensional and not comprehensive, as they consider the two aspects independently. To formalize team-level similarity, we adopt the notion of graph kernel of attributed graphs to encompass the two aspects and their interaction. To tackle the computational challenges, we propose a family of fast algorithms by (a) designing effective pruning strategies, and (b) exploring the smoothness between the existing and the new team structures. Extensive empirical evaluations on real world datasets validate the effectiveness and efficiency of our algorithms.
Article
Recommender systems are one of the most important technologies in e-commerce, helping users filter out the overload of information. However, current mainstream recommendation algorithms, such as the collaborative filtering (CF) family, suffer from problems such as scalability and sparseness. These problems hinder further development of recommender systems. We propose a new recommendation algorithm based on item quality and user rating preferences, which significantly decreases computational complexity. Moreover, it is interpretable and works better when the data are sparse. Through extensive experiments on three benchmark data sets, we show that our algorithm achieves higher accuracy in rating prediction than traditional approaches. Furthermore, the results also demonstrate that the problem of rating prediction depends strongly on item quality and user rating preferences, thus opening new paths for further study.
Article
Low-rank Matrix Factorization (MF) methods provide one of the simplest and most effective approaches to collaborative filtering. This paper is the first to investigate the problem of efficient retrieval of recommendations in an MF framework. We reduce retrieval in an MF model to the apparently simple task of finding the maximum dot-product of the user vector over the set of item vectors. However, to the best of our knowledge, the problem of efficiently finding the maximum dot-product in the general case has never been studied. To this end, we propose two techniques for efficient search: (i) we index the item vectors in a binary spatial-partitioning metric tree and use a simple branch-and-bound algorithm with a novel bounding scheme to efficiently obtain exact solutions; (ii) we use spherical clustering to index the users on the basis of their preferences and pre-compute recommendations only for the representative user of each cluster, obtaining extremely efficient approximate solutions. We obtain a theoretical error bound which determines the quality of any approximate result and use it to control the approximation. Both of these simple techniques are fairly independent of each other and hence are easily combined to further improve recommendation retrieval efficiency. We evaluate our algorithms on real-world collaborative filtering datasets, demonstrating a more than 7x speedup (with respect to naive linear search) for the exact solution and over 250x speedup for approximate solutions when combining both techniques.
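The two retrieval regimes this abstract contrasts, an exact linear-scan argmax of dot products versus a cluster-based shortcut that precomputes one answer per user cluster, can be roughly sketched as follows. The vectors are random toys, and the assignment step is a deliberate simplification of the spherical clustering the paper describes:

```python
import numpy as np

rng = np.random.default_rng(2)
items = rng.normal(size=(1000, 8))        # item factor vectors
users = rng.normal(size=(50, 8))          # user factor vectors

def exact_top1(u):
    # Naive linear scan: the baseline the paper's speedups are measured against.
    return int(np.argmax(items @ u))

# Approximate retrieval: group users, precompute only the representatives' answers.
n_clusters = 5
centers = users[rng.choice(len(users), n_clusters, replace=False)]  # crude "centers"
assign = np.argmax(users @ centers.T, axis=1)    # simplified cluster assignment
precomputed = [exact_top1(c) for c in centers]   # one retrieval per cluster

def approx_top1(user_idx):
    # Serve the representative's recommendation to every cluster member.
    return precomputed[assign[user_idx]]
```

The approximate path trades a per-user scan for one table lookup, which is where the large reported speedups come from; the paper's error bound controls how much dot-product quality that trade gives up.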
Article
Full-text available
We employ a bipartite network to describe an online commercial system. Instead of investigating accuracy and diversity in each recommendation, we focus on studying the influence of recommendation on the evolution of the online bipartite network. The analysis is based on two benchmark datasets and several well-known recommendation algorithms. The structure properties investigated include item degree heterogeneity, clustering coefficient and degree correlation. This work highlights the importance of studying the effects and performance of recommendation in long-term evolution.
Article
Full-text available
Content processing is a vast and growing field that integrates different approaches borrowed from the signal processing, information retrieval and machine learning disciplines. In this article we deal with a particular type of content processing: the so-called content-based transformations. We will not focus on any particular application but rather try to give an overview of different techniques and conceptual implications. We first describe the transformation process itself, including the main model schemes that are commonly used, which lead to the establishment of the formal basis for a definition of content-based transformations. Then we take a quick look at a general spectral based analysis/synthesis approach to process audio signals and how to extract features that can be used in the content-based transformation context. Using this analysis/synthesis approach we give some examples on how content-based transformations can be applied to modify the basic perceptual axis of a sound and how we can even combine different basic effects in order to perform more meaningful transformations. We finish by going a step further in the abstraction ladder and present transformations that are related to musical (and thus symbolic) properties rather than to those of the sound or the signal itself.
Chapter
Full-text available
Recommender Systems (RSs) are software tools and techniques providing suggestions for items to be of use to a user. In this introductory chapter we briefly discuss basic RS ideas and concepts. Our main goal is to delineate, in a coherent and structured way, the chapters included in this handbook and to help the reader navigate the extremely rich and detailed content that the handbook offers.
Conference Paper
Full-text available
Recommender systems have shown great potential to help users find interesting and relevant items from within a large information space. Most research up to this point has focused on improving the accuracy of recommender systems. We believe that not only has this narrow focus been misguided, but has even been detrimental to the field. The recommendations that are most accurate according to the standard metrics are sometimes not the recommendations that are most useful to users. In this paper, we propose informal arguments that the recommender community should move beyond the conventional accuracy metrics and their associated experimental methodologies. We propose new user-centric directions for evaluating recommender systems.
Conference Paper
Full-text available
We have developed a method for recommending items that combines content and collaborative data under a single probabilistic framework. We benchmark our algorithm against a naïve Bayes classifier on the cold-start problem, where we wish to recommend items that no one in the community has yet rated. We systematically explore three testing methodologies using a publicly available data set, and explain how these methods apply to specific real-world applications. We advocate heuristic recommenders when benchmarking to give competent baseline performance. We introduce a new performance metric, the CROC curve, and demonstrate empirically that the various components of our testing strategy combine to obtain deeper understanding of the performance characteristics of recommender systems. Though the emphasis of our testing is on cold-start recommending, our methods for recommending and evaluation are general.
Conference Paper
Full-text available
We examine the case of over-specialization in recommender systems, which results from returning items that are too similar to those previously rated by the user. We propose Outside-The-Box (otb) recommendation, which takes some risk to help users make fresh discoveries, while maintaining high relevance. The proposed formalization relies on item regions and attempts to identify regions that are under-exposed to the user. We develop a recommendation algorithm which achieves a compromise between relevance and risk to find otb items. We evaluate this approach on the MovieLens data set and compare our otb recommendations against conventional recommendation strategies.
Conference Paper
Full-text available
When we evaluate the quality of recommender systems (RS), most approaches only focus on the predictive accuracy of these systems. Recent works suggest that beyond accuracy there is a variety of other metrics that should be considered when evaluating a RS. In this paper we focus on two crucial metrics in RS evaluation: coverage and serendipity. Based on a literature review, we first discuss both measurement methods as well as the trade-off between good coverage and serendipity. We then analyze the role of coverage and serendipity as indicators of recommendation quality, present novel ways of how they can be measured and discuss how to interpret the obtained measurements. Overall, we argue that our new ways of measuring these concepts reflect the quality impression perceived by the user in a better way than previous metrics thus leading to enhanced user satisfaction.
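The coverage and serendipity metrics discussed above can be sketched in a few lines. This is an illustrative toy, not the paper's exact measures: catalog coverage as the fraction of the catalog reached by any top-N list, and a simple unexpectedness proxy (one of several serendipity definitions in the literature); all names and data are hypothetical.

```python
# Catalog coverage and a naive serendipity proxy; illustrative only.

def catalog_coverage(rec_lists, catalog):
    """Fraction of the catalog that appears in at least one top-N list."""
    recommended = set()
    for recs in rec_lists.values():
        recommended.update(recs)
    return len(recommended & set(catalog)) / len(catalog)

def unexpectedness(rec_lists, popular_items):
    """Fraction of recommended slots filled by non-blockbuster items."""
    total = hits = 0
    for recs in rec_lists.values():
        for item in recs:
            total += 1
            hits += item not in popular_items
    return hits / total if total else 0.0

recs = {"u1": ["a", "b"], "u2": ["b", "c"]}
print(catalog_coverage(recs, ["a", "b", "c", "d"]))  # 3 of 4 catalog items -> 0.75
print(unexpectedness(recs, {"b"}))                   # 2 of 4 slots avoid the hit -> 0.5
```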
Conference Paper
Full-text available
We describe a novel statistical model, the tied Boltzmann machine, for combining collaborative and content information for recommendations. In our model, pairwise interactions between items are captured through a Boltzmann machine, whose parameters are constrained according to the content associated with the items. This allows the model to use content information to recommend items that are not seen during training. We describe a tractable algorithm for training the model, and give experimental results evaluating the model in two cold start recommendation tasks on the MovieLens data set.
Conference Paper
Full-text available
Recommender systems apply knowledge discovery techniques to the problem of making personalized recom- mendations for information, products or services during a live interaction. These systems, especially the k-nearest neighbor collaborative filtering based ones, are achieving widespread success on the Web. The tremendous growth in the amount of available information and the number of visitors to Web sites in recent years poses some key challenges for recommender systems. These are: producing high quality recommendations, performing many recommendations per second for millions of users and items and achieving high coverage in the face of data sparsity. In traditional collaborative filtering systems the amount of work increases with the number of participants in the system. New recommender system technologies are needed that can quickly produce high quality recommendations, even for very large-scale problems. To address these issues we have explored item-based collaborative filtering techniques. Item- based techniques first analyze the user-item matrix to identify relationships between different items, and then use these relationships to indirectly compute recommendations for users. In this paper we analyze different item-based recommendation generation algorithms. We look into different techniques for computing item-item similarities (e.g., item-item correlation vs. cosine similarities between item vec- tors) and different techniques for obtaining recommendations from them (e.g., weighted sum vs. regression model). Finally, we experimentally evaluate our results and compare them to the basic k-nearest neighbor approach. Our experiments suggest that item-based algorithms provide dramatically better performance than user-based algorithms, while at the same time providing better quality than the best available user-based algorithms.
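The item-based scheme the abstract describes (cosine similarity between item vectors, then a weighted-sum prediction) can be sketched as follows. This is a minimal toy, assuming a dense matrix where 0 marks a missing rating; it is not the paper's full algorithm.

```python
import numpy as np

# Item-based CF sketch: cosine similarity between item columns,
# weighted-sum prediction. 0 marks a missing rating; toy data only.
R = np.array([[5., 3., 0.],
              [4., 0., 4.],
              [1., 1., 5.]])

def item_cosine(R):
    norms = np.linalg.norm(R, axis=0)
    sim = (R.T @ R) / np.outer(norms, norms)
    np.fill_diagonal(sim, 0.0)          # ignore self-similarity
    return sim

def predict(R, sim, u, i):
    rated = np.nonzero(R[u])[0]         # items user u has actually rated
    w = sim[i, rated]
    return float(w @ R[u, rated] / np.abs(w).sum())

sim = item_cosine(R)
print(predict(R, sim, 0, 2))            # user 0's predicted rating for item 2
```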
Article
Full-text available
Electronic Music Distribution (EMD) is in demand of robust, automatically extracted music descriptors. We introduce a timbral similarity measure for comparing music titles. This measure is based on a Gaussian model of cepstrum coefficients. We describe the timbre extractor and the corresponding timbral similarity relation. We describe experiments in assessing the quality of the similarity relation, and show that the measure is able to yield interesting similarity relations, in particular when used in conjunction with other similarity relations. We illustrate the use of the descriptor in several EMD applications developed in the context of the Cuidado European project.
Article
Full-text available
We examine in some detail Mel Frequency Cepstral Coefficients (MFCCs) - the dominant features used for speech recognition - and investigate their applicability to modeling music. In particular, we examine two of the main assumptions of the process of forming MFCCs: the use of the Mel frequency scale to model the spectra; and the use of the Discrete Cosine Transform (DCT) to decorrelate the Mel-spectral vectors.
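The two assumptions the paper examines can be isolated in a short sketch: the Mel frequency mapping and DCT-II decorrelation of log-Mel energies. This is not a full MFCC front end (no framing, windowing, or filterbank); the HTK-style Mel formula and the toy energies are illustrative.

```python
import numpy as np

# Mel mapping and DCT-II decorrelation, in isolation; sketch only.

def hz_to_mel(f_hz):
    """HTK-style Mel scale; pins ~1000 Hz to ~1000 mel."""
    return 2595.0 * np.log10(1.0 + f_hz / 700.0)

def dct2(x):
    """Unnormalized DCT-II: X_k = sum_n x_n cos(pi*(n+0.5)*k/N)."""
    N = len(x)
    n = np.arange(N)
    return np.array([np.sum(x * np.cos(np.pi * (n + 0.5) * k / N))
                     for k in range(N)])

print(round(hz_to_mel(1000.0)))                   # ~1000
log_mel = np.log(np.array([2.0, 1.5, 1.2, 1.1]))  # toy log-energies
print(dct2(log_mel))                              # low-order coeffs dominate
```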
Book
The explosive growth of e-commerce and online environments has made the issue of information search and selection increasingly serious; users are overloaded by options to consider and they may not have the time or knowledge to personally evaluate these options. Recommender systems have proven to be a valuable way for online users to cope with the information overload and have become one of the most powerful and popular tools in electronic commerce. Correspondingly, various techniques for recommendation generation have been proposed. During the last decade, many of them have also been successfully deployed in commercial environments. Recommender Systems Handbook, an edited volume, is a multi-disciplinary effort that involves world-wide experts from diverse fields, such as artificial intelligence, human computer interaction, information technology, data mining, statistics, adaptive user interfaces, decision support systems, marketing, and consumer behavior. Theoreticians and practitioners from these fields continually seek techniques for more efficient, cost-effective and accurate recommender systems. This handbook aims to impose a degree of order on this diversity, by presenting a coherent and unified repository of recommender systems major concepts, theories, methodologies, trends, challenges and applications. Extensive artificial applications, a variety of real-world applications, and detailed case studies are included. Recommender Systems Handbook illustrates how this technology can support the user in decision-making, planning and purchasing processes. It works for well known corporations such as Amazon, Google, Microsoft and AT&T. This handbook is suitable for researchers and advanced-level students in computer science as a reference.
Article
In October, 2006 Netflix released a dataset containing 100 million anonymous movie ratings and challenged the data mining, machine learning and computer science communities to develop systems that could beat the accuracy of its recommendation system, Cinematch. We briefly describe the challenge itself, review related work and efforts, and summarize visible progress to date. Other potential uses of the data are outlined, including its application to the KDD Cup 2007.
Article
Rating prediction is an important application, and a popular research topic in collaborative filtering. However, both the validity of learning algorithms, and the validity of standard testing procedures rest on the assumption that missing ratings are missing at random (MAR). In this paper we present the results of a user study in which we collect a random sample of ratings from current users of an online radio service. An analysis of the rating data collected in the study shows that the sample of random ratings has markedly different properties than ratings of user-selected songs. When asked to report on their own rating behaviour, a large number of users indicate they believe their opinion of a song does affect whether they choose to rate that song, a violation of the MAR condition. Finally, we present experimental results showing that incorporating an explicit model of the missing data mechanism can lead to significant improvements in prediction performance on the random sample of ratings.
Article
As the Netflix Prize competition has demonstrated, matrix factorization models are superior to classic nearest neighbor techniques for producing product recommendations, allowing the incorporation of additional information such as implicit feedback, temporal effects, and confidence levels.
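The factorization idea above can be sketched with plain SGD. This is a minimal toy that omits the biases, implicit feedback, and temporal terms of the full models; hyperparameters, the tiny rating set, and the random seed are all illustrative.

```python
import numpy as np

# Minimal SGD matrix factorization sketch: r_ui ~ p_u . q_i.
rng = np.random.default_rng(0)
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 2.0)]
n_users, n_items, k = 3, 3, 2
P = 0.1 * rng.standard_normal((n_users, k))   # user factors
Q = 0.1 * rng.standard_normal((n_items, k))   # item factors
lr, reg = 0.02, 0.01

for epoch in range(500):
    for u, i, r in ratings:
        e = r - P[u] @ Q[i]                   # prediction error
        P[u] += lr * (e * Q[i] - reg * P[u])
        Q[i] += lr * (e * P[u] - reg * Q[i])

rmse = np.sqrt(np.mean([(r - P[u] @ Q[i]) ** 2 for u, i, r in ratings]))
print(rmse)   # training RMSE after fitting
```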
Conference Paper
Recommender systems provide users with personalized suggestions for products or services. These systems often rely on Collaborative Filtering (CF), where past transactions are analyzed in order to establish connections between users and products. The two most successful approaches to CF are latent factor models, which directly profile both users and products, and neighborhood models, which analyze similarities between products or users. In this work we introduce some innovations to both approaches. The factor and neighborhood models can now be smoothly merged, thereby building a more accurate combined model. Further accuracy improvements are achieved by extending the models to exploit both explicit and implicit feedback by the users. The methods are tested on the Netflix data. Results are better than those previously published on that dataset. In addition, we suggest a new evaluation metric, which highlights the differences among methods, based on their performance at a top-K recommendation task.
Conference Paper
Customer preferences for products are drifting over time. Product perception and popularity are constantly changing as new selection emerges. Similarly, customer inclinations are evolving, leading them to ever redefine their taste. Thus, modeling temporal dynamics should be a key when designing recommender systems or general customer preference models. However, this raises unique challenges. Within the eco-system intersecting multiple products and customers, many different characteristics are shifting simultaneously, while many of them influence each other and often those shifts are delicate and associated with a few data instances. This distinguishes the problem from concept drift explorations, where mostly a single concept is tracked. Classical time-window or instance-decay approaches cannot work, as they lose too much signal when discarding data instances. A more sensitive approach is required, which can make better distinctions between transient effects and long term patterns. The paradigm we offer is creating a model tracking the time changing behavior throughout the life span of the data. This allows us to exploit the relevant components of all data instances, while discarding only what is modeled as being irrelevant. Accordingly, we revamp two leading collaborative filtering recommendation approaches. Evaluation is made on a large movie rating dataset by Netflix. Results are encouraging and better than those previously reported on this dataset.
Conference Paper
We propose a novel latent factor model to accurately predict response for large scale dyadic data in the presence of features. Our approach is based on a model that predicts response as a multiplicative function of row and column latent factors that are estimated through separate regressions on known row and column features. In fact, our model provides a single unified framework to address both cold and warm start scenarios that are commonplace in practical applications like recommender systems, online advertising, web search, etc. We provide scalable and accurate model fitting methods based on Iterated Conditional Mode and Monte Carlo EM algorithms. We show our model induces a stochastic process on the dyadic space with kernel (covariance) given by a polynomial function of features. Methods that generalize our procedure to estimate factors in an online fashion for dynamic applications are also considered. Our method is illustrated on benchmark datasets and a novel content recommendation application that arises in the context of Yahoo! Front Page. We report significant improvements over several commonly used methods on all datasets.
Conference Paper
A fundamental aspect of rating-based recommender systems is the observation process, the process by which users choose the items they rate. Nearly all research on collaborative filtering and recommender systems is founded on the assumption that missing ratings are missing at random. The statistical theory of missing data shows that incorrect assumptions about missing data can lead to biased parameter estimation and prediction. In a recent study, we demonstrated strong evidence for violations of the missing at random condition in a real recommender system. In this paper we present the first study of the effect of non-random missing data on collaborative ranking, and extend our previous results regarding the impact of non-random missing data on collaborative prediction.
Conference Paper
Cold-start scenarios in recommender systems are situations in which no prior events, like ratings or clicks, are known for certain users or items. To compute predictions in such cases, additional information about users (user attributes, e.g. gender, age, geographical location, occupation) and items (item attributes, e.g. genres, product categories, keywords) must be used. We describe a method that maps such entity (e.g. user or item) attributes to the latent features of a matrix (or higher-dimensional) factorization model. With such mappings, the factors of a MF model trained by standard techniques can be applied to the new-user and the new-item problem, while retaining its advantages, in particular speed and predictive accuracy. We use the mapping concept to construct an attribute-aware matrix factorization model for item recommendation from implicit, positive-only feedback. Experiments on the new-item problem show that this approach provides good predictive accuracy, while the prediction time only grows by a constant factor.
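The attribute-to-factor mapping idea described above can be sketched as a least-squares fit from item attributes to pretrained item factors, then applied to a brand-new item. This is an illustrative linear variant, not the paper's exact estimator; dimensions, the synthetic data, and all names are hypothetical.

```python
import numpy as np

# Sketch: learn a linear map W from item attributes A to pretrained
# item factors Q, then use a_new @ W as the factors of a new item.
rng = np.random.default_rng(1)
n_items, n_attrs, k = 50, 6, 3
A = rng.standard_normal((n_items, n_attrs))                 # known attributes
W_true = rng.standard_normal((n_attrs, k))
Q = A @ W_true + 0.01 * rng.standard_normal((n_items, k))   # "trained" factors

W, *_ = np.linalg.lstsq(A, Q, rcond=None)   # fit attribute -> factor map

a_new = rng.standard_normal(n_attrs)        # a cold-start item's attributes
q_new = a_new @ W                           # its imputed factor vector
print(q_new.shape)                          # (3,)
```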
Article
Customer preferences for products are drifting over time. Product perception and popularity are constantly changing as new selection emerges. Similarly, customer inclinations are evolving, leading them to ever redefine their taste. Thus, modeling temporal dynamics is essential for designing recommender systems or general customer preference models. However, this raises unique challenges. Within the ecosystem intersecting multiple products and customers, many different characteristics are shifting simultaneously, while many of them influence each other and often those shifts are delicate and associated with a few data instances. This distinguishes the problem from concept drift explorations, where mostly a single concept is tracked. Classical time-window or instance decay approaches cannot work, as they lose too many signals when discarding data instances. A more sensitive approach is required, which can make better distinctions between transient effects and long-term patterns. We show how to model the time changing behavior throughout the life span of the data. Such a model allows us to exploit the relevant components of all data instances, while discarding only what is modeled as being irrelevant. Accordingly, we revamp two leading collaborative filtering recommendation approaches. Evaluation is made on a large movie-rating dataset underlying the Netflix Prize contest. Results are encouraging and better than those previously reported on this dataset. In particular, methods described in this paper play a significant role in the solution that won the Netflix contest.
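One device from this family of time-aware models is a user bias that drifts smoothly around the user's mean rating date t_u: b_u(t) = b_u + alpha_u * dev_u(t) with dev_u(t) = sign(t - t_u) * |t - t_u|**beta. The sketch below is illustrative; beta and the parameter values are assumptions, not fitted quantities.

```python
import numpy as np

# Time-drifting user bias sketch; beta and parameters are illustrative.

def dev(t, t_u, beta=0.4):
    """Signed, sub-linear deviation from the user's mean rating date."""
    return np.sign(t - t_u) * np.abs(t - t_u) ** beta

def user_bias(t, b_u, alpha_u, t_u):
    return b_u + alpha_u * dev(t, t_u)

# A user whose ratings harshen over time: bias below its static value
# early on, above it later.
print(user_bias(t=10.0, b_u=0.2, alpha_u=0.05, t_u=100.0))
print(user_bias(t=190.0, b_u=0.2, alpha_u=0.05, t_u=100.0))
```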
Article
Recommendation algorithms are best known for their use on e-commerce Web sites, where they use input about a customer's interests to generate a list of recommended items. Many applications use only the items that customers purchase and explicitly rate to represent their interests, but they can also use other attributes, including items viewed, demographic data, subject interests, and favorite artists. At Amazon.com, we use recommendation algorithms to personalize the online store for each customer. The store radically changes based on customer interests, showing programming titles to a software engineer and baby toys to a new mother.
Article
A method is described for the minimization of a function of n variables, which depends on the comparison of function values at the (n + 1) vertices of a general simplex, followed by the replacement of the vertex with the highest value by another point. The simplex adapts itself to the local landscape, and contracts on to the final minimum. The method is shown to be effective and computationally compact. A procedure is given for the estimation of the Hessian matrix in the neighbourhood of the minimum, needed in statistical estimation problems.
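The simplex moves described above (reflect the worst vertex, expand, contract, or shrink toward the best) can be sketched as follows. This is a minimal didactic implementation with the standard coefficients, not a production optimizer; the convergence tolerance and iteration budget are arbitrary.

```python
import numpy as np

# Minimal Nelder-Mead sketch: reflection / expansion / contraction / shrink.
def nelder_mead(f, x0, iters=200, step=0.5):
    alpha, gamma, rho, sigma = 1.0, 2.0, 0.5, 0.5
    n = len(x0)
    simplex = [np.asarray(x0, float)]
    for i in range(n):                     # initial simplex: perturb each axis
        v = np.asarray(x0, float).copy()
        v[i] += step
        simplex.append(v)
    for _ in range(iters):
        simplex.sort(key=f)
        best, second, worst = simplex[0], simplex[-2], simplex[-1]
        centroid = np.mean(simplex[:-1], axis=0)
        xr = centroid + alpha * (centroid - worst)      # reflect worst vertex
        if f(xr) < f(best):
            xe = centroid + gamma * (centroid - worst)  # try expanding further
            simplex[-1] = xe if f(xe) < f(xr) else xr
        elif f(xr) < f(second):
            simplex[-1] = xr
        else:
            xc = centroid + rho * (worst - centroid)    # contract inward
            if f(xc) < f(worst):
                simplex[-1] = xc
            else:                                       # shrink toward best
                simplex = [best + sigma * (v - best) for v in simplex]
    return min(simplex, key=f)

x_min = nelder_mead(lambda x: (x[0] - 1) ** 2 + (x[1] - 2) ** 2, [0.0, 0.0])
print(x_min)   # close to [1, 2]
```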
Article
This paper generalizes the widely used Nelder and Mead (Comput J 7:308–313, 1965) simplex algorithm to parallel processors. Unlike most previous parallelization methods, which are based on parallelizing the tasks required to compute a specific objective function given a vector of parameters, our parallel simplex algorithm uses parallelization at the parameter level. Our parallel simplex algorithm assigns to each processor a separate vector of parameters corresponding to a point on a simplex. The processors then conduct the simplex search steps for an improved point, communicate the results, and a new simplex is formed. The advantage of this method is that our algorithm is generic and can be applied, without re-writing computer code, to any optimization problem which the non-parallel Nelder–Mead is applicable. The method is also easily scalable to any degree of parallelization up to the number of parameters. In a series of Monte Carlo experiments, we show that this parallel simplex method yields computational savings in some experiments up to three times the number of processors.
Article
Recommendation algorithms are best known for their use on e-commerce Web sites, where they use input about a customer's interests to generate a list of recommended items. Many applications use only the items that customers purchase and explicitly rate to represent their interests, but they can also use other attributes, including items viewed, demographic data, subject interests, and favorite artists. At Amazon.com, we use recommendation algorithms to personalize the online store for each customer. The store radically changes based on customer interests, showing programming titles to a software engineer and baby toys to a new mother. There are three common approaches to solving the recommendation problem: traditional collaborative filtering, cluster models, and search-based methods. Here, we compare these methods with our algorithm, which we call item-to-item collaborative filtering. Unlike traditional collaborative filtering, our algorithm's online computation scales independently of the number of customers and number of items in the product catalog. Our algorithm produces recommendations in real-time, scales to massive data sets, and generates high quality recommendations.
Article
The need to optimize a function whose derivatives are unknown or non-existent arises in many contexts, particularly in real-world applications. Various direct search methods, most notably the Nelder-Mead `simplex' method, were proposed in the early 1960s for such problems, and have been enormously popular with practitioners ever since. Nonetheless, for more than twenty years these methods were typically dismissed or ignored in the mainstream optimization literature, primarily because of the lack of rigorous convergence results. Since 1989, however, direct search methods have been rejuvenated and made respectable. This paper summarizes the history of direct search methods, with special emphasis on the Nelder-Mead method, and describes recent work in this area. This paper is based on a plenary talk given at the Biennial Dundee Conference on Numerical Analysis, Dundee, Scotland, 1995. 1. Introduction Unconstrained optimization---the problem of minimizing a nonlinear function f(x) for x ∈ ...
Y. Koren, R. M. Bell, and C. Volinsky, "Matrix factorization techniques for recommender systems," IEEE Computer, vol. 42, no. 8, pp. 30-37, Aug. 2009.
B. M. Marlin and R. S. Zemel, "Collaborative filtering and the missing at random assumption," in Proc. 23rd Conf. Uncertainty Artif. Intell., 2007, pp. 19-28.
Dror et al.: Web Scale Media Recommendation Systems. This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.
Yehuda Koren received the Ph.D. degree in computer science from The Weizmann Institute, Rehovot, Israel, in 2003. He joined Yahoo! Research, Haifa, Israel, in September 2008. Prior to this, he was a principal member of AT&T Labs-Research.
Proc. Extended Abstr. Human Factors Comput. Syst., 2006, pp. 1097-1101.
Tel-Aviv-Yaffo, Israel, where he also serves as the Head of the artificial intelligence program. Since 2010, he has been on academic leave, with Yahoo! Research, Haifa, Israel. His research interests include machine learning applications in community question answering services, recommender systems, activity recognition, marketing, medicine, psychology, natural language processing, computer vision, high-energy physics, and bioinformatics.