The Challenges of Algorithm Selection and Hyperparameter Optimization for Recommender Systems

Lukas Wegmeth, Tobias Vente, Joeran Beel
University of Siegen, Intelligent Systems Group
Corresponding Author: lukas.wegmeth@uni-siegen.de

CASH: Combined algorithm selection and hyperparameter optimization
RecSys: Recommender systems

Introduction & Previous Work (1)

We previously (COSEAL’22) showed that CASH is challenging in rating prediction tasks in RecSys.
Recently, RecSys research has focused more on ranking prediction tasks, whose evaluation differs substantially from rating prediction.
In this poster, we present a selection of our research on the CASH problem in both rating and ranking prediction tasks in RecSys.
Introduction & Previous Work (2)

Rating prediction tasks focus on explicit user ratings.
Ranking prediction tasks focus on the top-n most relevant items for each user.
The evaluation of ranking prediction tasks ignores all but the top-n items; limiting the model evaluation to the top-n items makes optimization more challenging.
Additionally, models that are optimized offline usually cannot retain their capabilities in an online system.
A minimal sketch of the two evaluation styles follows below.
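To make the evaluation difference concrete, here is a minimal Python sketch (our illustration, not code from the poster): rating prediction is scored over all rated items, e.g., with RMSE, while ranking prediction truncates the evaluation at the top-n positions, e.g., with NDCG@n.

import numpy as np

def rmse(true_ratings, predicted_ratings):
    # Rating prediction: the error over ALL rated items contributes.
    true_ratings = np.asarray(true_ratings, dtype=float)
    predicted_ratings = np.asarray(predicted_ratings, dtype=float)
    return np.sqrt(np.mean((true_ratings - predicted_ratings) ** 2))

def ndcg_at_n(relevances_in_ranked_order, n=10):
    # Ranking prediction: only the top-n positions contribute.
    rel = np.asarray(relevances_in_ranked_order, dtype=float)[:n]  # truncation at n
    discounts = 1.0 / np.log2(np.arange(2, rel.size + 2))
    dcg = np.sum(rel * discounts)
    ideal = np.sort(np.asarray(relevances_in_ranked_order, dtype=float))[::-1][:n]
    idcg = np.sum(ideal * discounts[:ideal.size])
    return dcg / idcg if idcg > 0 else 0.0

Everything below rank n is invisible to NDCG@n, which is exactly why optimizing against a truncated ranking metric is harder than optimizing RMSE over all ratings.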
Retrospective: CASH for Rating Prediction in RecSys (3)
Table: Meta-learned algorithm selection performance for rating prediction.

Method         RMSE    Accuracy
Virtual Best   1.178     100%
Meta-Learner   1.186    64.09%
Single Best    1.191    35.56%
•Last year, we presented CaMeLS: a tool for meta-learning algorithm selection for rating prediction (a generic sketch of this setup follows below).
•The evaluation of CaMeLS shows that it works fairly well for rating prediction.
•Open problem: it is unclear how CaMeLS is affected by data sets from different domains.
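The following is a minimal sketch of meta-learned algorithm selection, not the actual CaMeLS implementation: a meta-learner maps dataset meta-features to the algorithm that performed best on similar datasets, so that for a new dataset an algorithm can be selected without a full CASH search. The meta-features, labels, and model choice here are assumptions for illustration.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical meta-dataset: one row per RecSys dataset.
# Assumed meta-features: n_users, n_items, density, rating variance.
meta_features = np.array([
    [943, 1682, 0.063, 1.25],
    [6040, 3706, 0.045, 1.21],
    [2113, 10109, 0.008, 0.95],
])
# Label: index of the best-performing algorithm on each dataset
# (e.g., 0 = BiasedMF, 1 = ItemKNN), assumed values for illustration.
best_algorithm = np.array([0, 1, 0])

meta_learner = RandomForestClassifier(n_estimators=100, random_state=42)
meta_learner.fit(meta_features, best_algorithm)

# For a new, unseen dataset, predict which algorithm to deploy
# instead of running a full CASH search on it.
new_dataset = np.array([[1500, 2500, 0.02, 1.1]])
print(meta_learner.predict(new_dataset))  # predicted best algorithm index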
Table: Performance difference after CASH & Ensembling.

Method          RMSE
GES-SiloTop50   1.3571
GES-Top50       1.3615
Single Best     1.3626
•Last year, we investigated the influence of algorithm selection, HPO, and ensembling techniques on RecSys algorithms with LensKit-Auto.
•So far, we can only ensemble rating prediction algorithms (a sketch of greedy ensemble selection follows below).
•We observed that performance increases, but the improvement is small compared to the computational cost.
Is the computational overhead worth it?
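Assuming GES here refers to greedy ensemble selection in the style of Caruana et al., the following is a minimal sketch (our reconstruction, not LensKit-Auto's exact procedure): base models are added greedily, with replacement, choosing in each round the model whose inclusion most reduces the validation RMSE of the averaged ensemble prediction.

import numpy as np

def rmse(y_true, y_pred):
    return np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))

def greedy_ensemble_selection(base_predictions, y_val, n_rounds=50):
    # Indices of chosen models; duplicates act as implicit weights.
    selected = []
    ensemble_sum = np.zeros_like(y_val, dtype=float)
    for _ in range(n_rounds):
        scores = []
        for preds in base_predictions:
            # Validation RMSE of the averaged ensemble if this model is added.
            candidate = (ensemble_sum + preds) / (len(selected) + 1)
            scores.append(rmse(y_val, candidate))
        best = int(np.argmin(scores))
        selected.append(best)
        ensemble_sum += base_predictions[best]
    return selected

# Hypothetical validation ratings and three base models' prediction vectors.
y_val = np.array([4.0, 3.0, 5.0, 2.0])
base_predictions = [
    np.array([3.8, 3.2, 4.6, 2.4]),
    np.array([4.5, 2.5, 5.0, 1.5]),
    np.array([3.0, 3.0, 4.0, 3.0]),
]
print(greedy_ensemble_selection(base_predictions, y_val, n_rounds=10))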
Fundamental Problems with Ranking Prediction in RecSys (4)
Solving CASH for ranking prediction presents unsolved fundamental challenges.
Are the top-n elements always the best? We analyze what happens when we pick items in a wider range (up to the top-16). Here, top-n selection chooses the top-8 items, and bottom-n selection chooses items 9-16. The example also shows that it is not possible to generalize the selection from validation to test. A sketch of this analysis follows below.

Figure: Mean rank of a mask (y-axis) when it selects a specific item index from the top-16 (x-axis).
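The following is a minimal sketch of the masking experiment as we understand it (our reconstruction, with synthetic data): items are ranked on a validation split, the top-16 are kept, and the mean test rank of a top-8 mask is compared against a bottom-8 mask. When test ranks are noisy, the two masks move closer together, illustrating why the top-n items on validation are not always the best on test.

import numpy as np

rng = np.random.default_rng(0)
n_users, pool = 1000, 16

# Hypothetical test ranks of the 16 validation-ranked items per user:
# a noisy re-ordering, so validation order only partially predicts test order.
test_ranks = np.argsort(
    np.arange(pool) + rng.normal(scale=4.0, size=(n_users, pool)), axis=1
).argsort(axis=1) + 1  # rank 1 = best on the test split

top_mask = np.arange(8)         # mask selecting validation ranks 1-8
bottom_mask = np.arange(8, 16)  # mask selecting validation ranks 9-16

print("mean test rank, top-8 mask:   ", test_ranks[:, top_mask].mean())
print("mean test rank, bottom-8 mask:", test_ranks[:, bottom_mask].mean())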