Yehuda Koren’s research while affiliated with Google Inc. and other places


Publications (88)


Figure 1: A snapshot of the most recent questions in Yahoo! Answers, showing the diversity of questions.
Table 1: Example of attributes for "Would the introduction of new technology ruin the tension in football? If so: why?"
Figure 2: Three families of question attributes
Table 2: Baseline accuracy using various weight preprocessing
Figure 3: Seven channels


I want to answer, Who has a question? Yahoo! answers recommender system
  • Conference Paper
  • Full-text available

August 2011 · 5,948 Reads · 85 Citations · Yehuda Koren

Yahoo! Answers is currently one of the most popular question answering systems. We claim, however, that its user experience could be significantly improved if it could route the "right question" to the "right user." Indeed, while some users would rush to answer a question such as "what should I wear at the prom?," others would be upset simply at being exposed to it. We argue here that Community Question Answering sites in general, and Yahoo! Answers in particular, need a mechanism that would expose users to questions they can relate to and possibly answer. We propose here to address this need via a multi-channel recommender system technology for associating questions with potential answerers on Yahoo! Answers. One novel aspect of our approach is exploiting a wide variety of content and social signals users regularly provide to the system and organizing them into channels. Content signals relate mostly to the text and categories of questions and associated answers, while social signals capture the various user interactions with questions, such as asking, answering, voting, etc. We fuse and generalize known recommendation approaches within a single symmetric framework, which incorporates and properly balances multiple types of signals according to channels. Tested on a large-scale dataset, our model exhibits good performance, clearly outperforming standard baselines.


Automatically tagging email by leveraging other users' folders

August 2011 · 261 Reads · 33 Citations

Most email applications devote a significant part of their real estate to organization mechanisms such as folders. Yet, we verified on the Yahoo! Mail service that 70% of email users have never defined a single folder. This implies that one of the best-known email features is underexploited. We propose here to revive the feature by providing a method for generating a lighter form of folders, or tags, benefiting even the most passive users. The method automatically associates, whenever possible, an appropriate semantic tag with a given email. This gives rise to an alternate mechanism for organizing and searching email. We advocate a novel modeling approach that exploits the overall population of users, thereby learning from the wisdom of crowds how to categorize messages. Given our massive user base, it is enough to learn from a minority of the users who label certain messages in order to label that kind of message for the general population. We design a novel cascade classification approach, which copes with the severe scalability and accuracy constraints we are facing. Significant efficiency gains are achieved by working within a low-dimensional latent space, and by using a novel hierarchical classifier. Precision level is controlled by separating the task into a two-phase classification process. We performed an extensive empirical study covering three different time periods, over 100 million messages, and thousands of candidate tags per message. The results are encouraging and compare favorably with alternative approaches. Our method successfully tags 72% of incoming email traffic. Performance-wise, the computational overhead, even under large traffic surges, is sufficiently low for our approach to be applicable in production on any large Web mail service.


Adaptive bootstrapping of recommender systems using decision trees

February 2011 · 676 Reads · 171 Citations

Recommender systems perform much better on users for whom they have more information. This gives rise to a problem of satisfying users new to a system. The problem is even more acute considering that some of these hard-to-profile new users judge the unfamiliar system by its ability to immediately provide them with satisfying recommendations, and may quickly abandon the system when disappointed. Rapid profiling of new users by a recommender system is often achieved through a bootstrapping process (a kind of initial interview) that asks users to provide their opinions on certain carefully chosen items or categories. The elicitation process becomes particularly effective when adapted to users' responses, making best use of users' time by dynamically modifying the questions to improve the evolving profile. In particular, we advocate a specialized version of decision trees as the most appropriate tool for this task. We detail an efficient tree learning algorithm, specifically tailored to the unique properties of the problem. Several extensions to the tree construction are also introduced, which enhance the efficiency and utility of the method. We implemented our methods within a movie recommendation service. The experimental study delivered encouraging results, with the tree-based bootstrapping process significantly outperforming previous approaches.


Recommender Systems Handbook

January 2011 · 374 Reads · 450 Citations

The collaborative filtering (CF) approach to recommenders has recently enjoyed much interest and progress. The fact that it played a central role within the recently completed Netflix competition has contributed to its popularity. This chapter surveys the recent progress in the field. Matrix factorization techniques, which became a first choice for implementing CF, are described together with recent innovations. We also describe several extensions that bring competitive accuracy into neighborhood methods, which used to dominate the field. The chapter demonstrates how to utilize temporal models and implicit feedback to improve model accuracy. In passing, we include detailed descriptions of some of the central methods developed for tackling the challenge of the Netflix Prize competition.
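
As a rough illustration of the matrix factorization idea mentioned in this abstract, the following is a minimal sketch and not the chapter's own implementation. It assumes a plain regularized factor model trained with stochastic gradient descent, with no bias terms, temporal effects, or implicit feedback. The function names, the toy ratings, and the hyperparameters (`lr`, `reg`, `n_factors`) are all hypothetical choices made for this example.

```python
import math
import random


def predict(P, Q, u, i):
    """Predicted rating: dot product of user factors P[u] and item factors Q[i]."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))


def train_rmse(ratings, P, Q):
    """Root mean squared error of the model on the given (user, item, rating) triples."""
    sq_err = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return math.sqrt(sq_err / len(ratings))


def train_mf(ratings, n_users, n_items, n_factors=2,
             lr=0.01, reg=0.02, epochs=500, seed=0):
    """Fit user and item factors by SGD on squared error with L2 regularization."""
    rng = random.Random(seed)
    # Small random initialization (assumed; any small values would do).
    P = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on both factor vectors, using the old values.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


# Hypothetical toy data: (user index, item index, rating on a 1-5 scale).
toy_ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1),
               (2, 1, 2), (2, 2, 5), (3, 0, 1), (3, 3, 4)]

P0, Q0 = train_mf(toy_ratings, 4, 4, epochs=0)   # untrained factors
P, Q = train_mf(toy_ratings, 4, 4)               # trained factors
print("RMSE before training:", train_rmse(toy_ratings, P0, Q0))
print("RMSE after training: ", train_rmse(toy_ratings, P, Q))
```

The sketch only measures error on the training ratings; a real evaluation would hold out ratings that the model never saw.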


Figure 1: The test error rate vs. number of train ratings per user on the Netflix data. Lower y-axis values represent more accurate predictions. The x-axis describes the exact number of ratings taken for each user. When the x value equals k, we are considering only users that gave at least k ratings. For each such user, we sort the ratings in chronological order and take the first k ratings into account. Results are computed by the factorized item-item model [5].
Figure 2: The test error rate vs. number of displayed items (=size of seed set), for various methods of selecting seed set items. Methods that disregard item popularity (Random and Entropy) significantly lag in performance. GreedyExtend delivers the best performing seed sets by guiding the set creation process with a suitable cost function. Note that the legend orders methods by their performance.  
On bootstrapping recommender systems

October 2010 · 1,237 Reads · 89 Citations

Recommender systems perform much better on users for whom they have more information. This gives rise to a problem of satisfying users new to a system. The problem is even more acute considering that some of these hard-to-profile new users judge the unfamiliar system by its ability to immediately provide them with satisfying recommendations, and may be the quickest to abandon the system when disappointed. Rapid profiling of new users is often achieved through a bootstrapping process (a kind of initial interview) that asks users to provide their opinions on certain carefully chosen items or categories. This work offers a new bootstrapping method, which is based on a concrete optimization goal, thereby handily outperforming known approaches in our tests.



Figure 1: Rating distribution for Netflix (solid line) and Movielens (dashed line). Items are ordered according to popularity (most popular at the bottom).
Performance of recommender algorithms on top-N recommendation tasks

September 2010 · 24,980 Reads · 1,462 Citations

In many commercial systems, the 'best bet' recommendations are shown, but the predicted rating values are not. This is usually referred to as a top-N recommendation task, where the goal of the recommender system is to find a few specific items which are supposed to be most appealing to the user. Common methodologies based on error metrics (such as RMSE) are not a natural fit for evaluating the top-N recommendation task. Rather, top-N performance can be directly measured by alternative methodologies based on accuracy metrics (such as precision/recall). An extensive evaluation of several state-of-the-art recommender algorithms suggests that algorithms optimized for minimizing RMSE do not necessarily perform as expected in terms of the top-N recommendation task. Results show that improvements in RMSE often do not translate into accuracy improvements. In particular, a naive non-personalized algorithm can outperform some common recommendation approaches and almost match the accuracy of sophisticated algorithms. Another finding is that the very few top popular items can skew the top-N performance. The analysis points out that when evaluating a recommender algorithm on the top-N recommendation task, the test set should be chosen carefully in order not to bias accuracy metrics towards non-personalized solutions. Finally, we offer practitioners new variants of two collaborative filtering algorithms that, regardless of their RMSE, significantly outperform other recommender algorithms in pursuing the top-N recommendation task, while offering additional practical advantages. This comes as a surprise given the simplicity of these two methods.
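
To make the distinction between error metrics and accuracy metrics concrete, here is a small sketch under stated assumptions. It is not the paper's evaluation protocol: the two "models", their predicted scores, and the single relevant item are invented toy values, and `rank`, `recall_at_n`, and `precision_at_n` are hypothetical helper names. It shows only that a model with lower RMSE can still produce a worse top-N list.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between predicted and actual ratings (dicts keyed by item)."""
    sq_err = sum((predicted[item] - actual[item]) ** 2 for item in actual)
    return math.sqrt(sq_err / len(actual))


def rank(predicted):
    """Items sorted from highest to lowest predicted score."""
    return sorted(predicted, key=predicted.get, reverse=True)


def recall_at_n(ranked_items, relevant_items, n):
    """Fraction of the relevant items that appear in the top n of the ranking."""
    hits = sum(1 for item in ranked_items[:n] if item in relevant_items)
    return hits / len(relevant_items)


def precision_at_n(ranked_items, relevant_items, n):
    """Fraction of the top n ranked items that are relevant."""
    hits = sum(1 for item in ranked_items[:n] if item in relevant_items)
    return hits / n


# Toy ground truth: one user's true ratings; only the 5-star item counts as relevant.
actual = {"a": 5.0, "b": 4.0, "c": 2.0, "d": 1.0}
relevant = {"a"}

# Model A is close on every rating but puts "b" ahead of "a".
model_a = {"a": 4.0, "b": 4.2, "c": 2.0, "d": 1.0}
# Model B underestimates everything but orders the items correctly.
model_b = {"a": 3.0, "b": 2.5, "c": 1.0, "d": 0.5}

rmse_a, rmse_b = rmse(model_a, actual), rmse(model_b, actual)
print("RMSE      A:", rmse_a, " B:", rmse_b)
print("recall@1  A:", recall_at_n(rank(model_a), relevant, 1),
      " B:", recall_at_n(rank(model_b), relevant, 1))
```

In this toy setup model A has the lower RMSE, yet model B is the one that places the relevant item first, which is the kind of mismatch the abstract describes.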


Workshop on information heterogeneity and fusion in recommender systems (HetRec 2010)

September 2010 · 43 Reads · 69 Citations · Yehuda Koren · [...] · Markus Weimer

In this tutorial we discuss the evaluation of recommender systems. We discuss the main reason for evaluating recommender systems, i.e., the selection task. We overview some general guidelines for conducting evaluation tests. We then discuss the evaluation ...


Collaborative Filtering with Temporal Dynamics

April 2010 · 551 Reads · 1,449 Citations

Communications of the ACM

Customer preferences for products drift over time. Product perception and popularity change constantly as new selections emerge. Similarly, customer inclinations evolve, leading customers to continually redefine their taste. Thus, modeling temporal dynamics is essential for designing recommender systems or general customer preference models. However, this raises unique challenges. Within an ecosystem of many interacting products and customers, many different characteristics shift simultaneously and influence each other, and those shifts are often delicate and supported by only a few data instances. This distinguishes the problem from concept drift explorations, where mostly a single concept is tracked. Classical time-window or instance decay approaches cannot work, as they lose too many signals when discarding data instances. A more sensitive approach is required, which can make better distinctions between transient effects and long-term patterns. We show how to model the time-changing behavior throughout the life span of the data. Such a model allows us to exploit the relevant components of all data instances, while discarding only what is modeled as being irrelevant. Accordingly, we revamp two leading collaborative filtering recommendation approaches. Evaluation is made on a large movie-rating dataset underlying the Netflix Prize contest. Results are encouraging and better than those previously reported on this dataset. In particular, methods described in this paper play a significant role in the solution that won the Netflix contest.



Citations (79)


... Collaborative filtering is the most common and widely used method for generating recommendations in music streaming services [22]. This algorithm relies on a set of songs that users preferred in the past to predict which song they would like to listen to. ...

Reference:

Content-Based Filtering Technique using Clustering Method for Music Recommender Systems
Advances in Collaborative Filtering
  • Citing Chapter
  • November 2021

... Reproducibility studies in [18,19,54] showed that sometimes even decade-old and conceptually quite simple methods, when properly tuned, almost consistently outperformed the most recent deep learning models of the time for top-n recommendation tasks. Later research then re-assessed the effectiveness of widely-used and more than fifteen years old matrix factorization models, finding that they are still competitive with models that would be considered the state-of-the-art today [48,55]. Similar findings were also reported for sequential recommendation models [43], models based on architectures like Graph Neural Networks (GNNs) [3,17,60], and even in other areas of applied machine learning like time-series forecasting [46]. ...

Revisiting the Performance of iALS on Item Recommendation Benchmarks
  • Citing Conference Paper
  • September 2022

... Recommender systems are widely used in various real-world domains, such as video streaming [7], e-commerce [44], and web search [18,36]. Numerous recommendation algorithms have been developed, including Matrix Factorisation (MF) [15], Factorisation Machine (FM) [27], Deep Factorisation Machine (DeepFM) [9], Deep Cross Network (DCN) [32], Deep Interest Network (DIN) [45], and LightGCN [10]. Nevertheless, these approaches often depend on statistical correlations, which can introduce various estimation biases, such as popularity bias [1], selection bias [21,22], and conformity bias [43]. ...

Matrix factorization techniques for recommender systems
  • Citing Article
  • August 2009

Computer

... The authors [13] proposed an approach that might fit the identification of library migrations by analyzing large datasets related to software development, specifically code change histories. This approach searches for migration process patterns and then filters them based on their frequency or associated code changes. ...

Factor in the neighbors
  • Citing Article
  • January 2010

ACM Transactions on Knowledge Discovery from Data

... In [4], semantic change between consecutive queries and the relationship between the changed query and the clicked document is used to infer query context. In addition, query clustering [3], geographical location [15], and association rules [1] are some of the methods used by researchers for better information retrieval. However, we argued that these context extraction methods are confined by the capacity of their employed representation, which is hardly generalizable and not optimal for retrieval tasks. ...

Expediting search trend detection via prediction of query counts
  • Citing Conference Paper
  • February 2013

... We quantitatively evaluate our model in the context of two large datasets containing both numerical and text reviews: the Amazon Review dataset [17] and the Yelp dataset [25]. To avoid the problems frequently highlighted with RMSE-based evaluation [12], we follow the approach of Koren and Sill [31]. The evaluation highlights that our proposed KNN model beats strong baselines for both memory-based and model-based systems. The result is that our model provides both explainability benefits, inherited from memory-based methods, enhanced by now enabling textual-review snippets to be used, as well as competitive performance. ...

Collaborative filtering on ordinal user feedback
  • Citing Conference Paper
  • August 2013

... These methods emphasize the importance of learning item-to-item semantics rather than user-to-item predictions. For example, [14] proposed learning item representations from implicit feedback in a Euclidean space. The I2V model [15] is a popular method for learning static item representations based on CF item cooccurrences [15]. ...

Towards scalable and accurate item-oriented recommendations
  • Citing Conference Paper
  • October 2013

... It has been shown that in collaborative filtering problems, much of the signal lies in simple popularity biases [71]. For example, the winning model in the Netflix Prize competition [10] managed to explain 42.6% of the ratings' variance, i.e., R² = 42.6%, but the vast majority of the learned signal was attributed to popularity biases which explained a whopping R² = 32.5% of the variance (without any personalization) [72]. ...

Web-Scale Media Recommendation Systems
  • Citing Article
  • September 2012

Proceedings of the IEEE

... They streamline access to relevant information by identifying resources aligned with user interests based on historical experiences, aiming to save users time and costs. Originating in e-commerce to combat information overload in the Web 2.0 era, recommender systems quickly expanded into e-learning [2], tourism [27], smart cities [5], music [3], research resources, and television programs. In modern times, platforms like Amazon.com, ...

Recommender Systems Handbook
  • Citing Book
  • October 2010