Recommending Items in Social Tagging Systems Using Tag and Time Information

Emanuel Lacic, Knowledge Technology Institute, Graz University of Technology, Graz, Austria, elacic@know-center.at
Dominik Kowald, Know-Center, Graz University of Technology, Graz, Austria, dkowald@know-center.at
Paul Seitlinger, Knowledge Technology Institute, Graz University of Technology, Graz, Austria, paul.seitlinger@tugraz.at
Christoph Trattner, Know-Center, Graz University of Technology, Graz, Austria, ctrattner@know-center.at
Denis Parra, CS Department, Pontificia Universidad Católica de Chile, Santiago, Chile, dparra@ing.puc.cl
ABSTRACT
In this work we present a novel item recommendation ap-
proach that aims at improving Collaborative Filtering (CF)
in social tagging systems using the information about tags
and time. Our algorithm follows a two-step approach, where
in the first step a potentially interesting candidate item-set
is found using user-based CF and in the second step this can-
didate item-set is ranked using item-based CF. Within this
ranking step we integrate the information of tag usage and
time using the Base-Level Learning (BLL) equation com-
ing from human memory theory that is used to determine
the reuse-probability of words and tags using a power-law
forgetting function.
As the results of our extensive evaluation conducted on data-
sets gathered from three social tagging systems (BibSonomy,
CiteULike and MovieLens) show, the usage of tag-based and
time information via the BLL equation also helps to improve
the ranking and recommendation process of items and thus,
can be used to realize an effective item recommender that
outperforms two alternative algorithms which also exploit
time and tag-based information.
Categories and Subject Descriptors
H.2.8 [Database Management]: Database Applications—
Data mining; H.3.3 [Information Storage and Retrieval]:
Information Search and Retrieval—Information filtering
Keywords
recommender systems; social tagging; collaborative filtering;
item ranking; base-level learning equation
HT’14, September 1–4, 2014, Santiago, Chile.
1. INTRODUCTION
Over the past few years, social tagging has gained tremendously
in popularity, helping people, for instance, to categorize or
describe resources on the Web for better information retrieval
(e.g., in BibSonomy or CiteULike) [13, 23]. Although the process
of tagging, and in particular the task of recommending suitable
tags to a user in a personalized manner, has been well explored
[12, 20], studies on predictive models that recommend items to
users based on social tags are still rare. To contribute to this
sparse field of research, we present preliminary results of a
study that addresses this issue. In particular, we provide first
results of a novel attempt to improve item recommendations by
taking into account people's social tags and the time at which
these tags were applied. As shown in related work, recommending
items to users in a collaborative manner relying on social
tagging information is not an easy task in general (e.g., [24]
or [17]). However, other related work has also shown that time
information is an important factor for making such models more
accurate (e.g., [26] or [10]).
Contrary to the previous work mentioned above, we suggest
a less data-driven approach that is inspired by principles of
human memory theory about remembering things over time.
As shown in our previous work on tag recommender systems
[15], the base-level learning (BLL) equation introduced by
Anderson and Schooler [16] (see also Anderson et al. [1]),
which integrates tag frequency and recency (i.e., the time
since the last tag usage), can be used to implement an effective
tag recommendation and ranking algorithm. In particular, the BLL
equation models the time-dependent decay of memory for words and
tags using a power-law function in order to determine the
probability that a specific tag will be reused by a target user.
In this work, we apply this equation for ranking and recom-
mending items to users. To this end, we present a novel rec-
ommender approach called Collaborative Item Ranking Us-
ing Tag and Time Information (CIRTT) that firstly identi-
fies a potentially interesting candidate item set and secondly,
ranks this candidate set in a personalized manner (similar
to [10]). In this second step of personalization, we integrate
the BLL equation to include this information about tags
and time. To investigate the question as to whether tag and
time information can improve the ranking and recommen-
dation process, we conducted an extensive evaluation using
folksonomy datasets gathered from three social tagging sys-
tems (BibSonomy, CiteULike and MovieLens). Within this
study we compared our approach to two alternative tag and
time based recommender algorithms [26, 10] amongst others.
The results show that integrating tag and time information
using the BLL equation helps to improve item recommenda-
tions and to outperform state-of-the-art baselines in terms
of recommender accuracy.
The remainder of this paper is organized as follows. We be-
gin with explaining our tag and time based approach CIRTT
in Section 2. Then we describe the experimental setup of our
evaluation in Section 3 and summarize the results of this
study in Section 4. Finally, in Section 5, we close the paper
with a short conclusion and an outlook into the future.
2. APPROACH
In this section we provide a detailed description of our item
recommendation approach called Collaborative Item Rank-
ing Using Tag and Time Information (CIRTT). In general,
our CIRTT algorithm uses a similar strategy as the approach
proposed by Huang et al. [10] and thus, consists of two steps
relying on a combination of user- and item-based CF: in the
first step, a potentially interesting candidate item set for the
target user u is determined, and in the second step, this
candidate item set is ranked using item similarities and tag
and time information.
Step one (i.e., determining candidate items) is conducted
using a simple user-based CF approach. Hence, we first
find the most similar users for the target user u (i.e., the
neighborhood) based on the binary user-item matrix $B_{u,i}$
(see also [26]) and then use the bookmarked items of these
neighbors as our candidate item set. We use a neighborhood of
k = 20 users and the Cosine similarity measure [7]
(see also Section 3.3).
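
The candidate selection step described above can be sketched as follows. This is a minimal illustration, not the TagRec implementation: the dictionary layout of `bookmarks`, the function names and the choice to drop items the target user already owns are assumptions made for the example.

```python
import math

def cosine_binary(items_a, items_b):
    """Cosine similarity of two binary bookmark vectors, given as item sets."""
    if not items_a or not items_b:
        return 0.0
    return len(items_a & items_b) / math.sqrt(len(items_a) * len(items_b))

def candidate_items(target, bookmarks, k=20):
    """Collect the items of the k most similar users (hypothetical helper).

    bookmarks maps each user id to the set of item ids she bookmarked.
    """
    own = bookmarks[target]
    sims = [(cosine_binary(own, items), user)
            for user, items in bookmarks.items() if user != target]
    # Keep neighbours with positive similarity, most similar first.
    neighbours = sorted((s for s in sims if s[0] > 0), reverse=True)[:k]
    candidates = set()
    for _, user in neighbours:
        # Assumption: items the target user already has are not candidates.
        candidates |= bookmarks[user] - own
    return candidates

bookmarks = {
    "u1": {"a", "b"},
    "u2": {"a", "b", "c"},
    "u3": {"b", "d"},
    "u4": {"x"},
}
print(candidate_items("u1", bookmarks, k=2))
```

In this toy data, user u4 shares no item with u1 and therefore never enters the neighborhood, regardless of k.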
In the second step (i.e., ranking candidate items) we use an
item-based CF approach in order to determine the relevance
of each candidate item for the target user based on the items
she has bookmarked in the past. Hence, for each candidate
item i in the candidate item set we calculate this combined
similarity value sim(u, i) by the item-based CF formula:

$$sim(u, i) = \sum_{j \in items(u)} sim(i, j) \qquad (1)$$

where items(u) is the set of items the target user u has
bookmarked in the past. This item-based CF step helps us
to give a higher ranking to candidate items that are more
similar to the items the target user has bookmarked in the
past (see also [10]).
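
As a rough sketch of Equation 1, the following example sums the Cosine similarities between one candidate item and the items of the target user. Representing each item by the set of users who bookmarked it (a column of the binary user-item matrix) and the names `item_users` and `sim_user_item` are assumptions for illustration.

```python
import math

def item_cosine(users_i, users_j):
    """Cosine similarity of two items, each given as the set of its users."""
    if not users_i or not users_j:
        return 0.0
    return len(users_i & users_j) / math.sqrt(len(users_i) * len(users_j))

def sim_user_item(candidate, user_items, item_users):
    """Equation 1: sum of sim(candidate, j) over all items j of the user."""
    return sum(item_cosine(item_users[candidate], item_users[j])
               for j in user_items)

# Hypothetical toy data: item id -> users who bookmarked it.
item_users = {
    "a": {"u1", "u2"},
    "b": {"u1", "u2", "u3"},
    "c": {"u2"},
    "d": {"u3"},
}
score_c = sim_user_item("c", {"a", "b"}, item_users)
score_d = sim_user_item("d", {"a", "b"}, item_users)
print(score_c, score_d)
```

Candidate c co-occurs with both of the user's items and therefore receives a higher combined similarity than candidate d, which co-occurs with only one of them.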
To finally realize CIRTT in order to integrate tag and time
information we make use of the base-level learning (BLL)
equation proposed by Anderson et al. [1]. As described
in our previous work [15], the BLL equation can be used to
determine a relevance value for a tag t in the tag assignments
of a target user u based on tag frequency and recency:

$$BLL(u, t) = \ln\left(\sum_{i=1}^{n} t_i^{-d}\right) \qquad (2)$$

where n is the number of times t has been used by u and $t_i$
is the recency, i.e., the time since the i-th occurrence of t in
the tag assignments of u. The exponent d is used to model
the power law of forgetting memory items and is usually set
to .5 (see [1]). In order to map these BLL values on a range
of 0 - 1, we used the same normalization method as used in
our previous work [15].

Dataset     |B|      |U|     |R|      |T|      |TAS|
BibSonomy   82,539   2,437   28,000   30,919   339,337
CiteULike   36,471   3,202   15,400   20,937   99,635
MovieLens   53,607   3,983   5,724    14,883   92,387

Table 1: Properties of the datasets, where |B| is the
number of bookmarks, |U| the number of users, |R|
the number of resources, |T| the number of tags and
|TAS| the number of tag assignments.
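
Equation 2 can be illustrated with a small sketch. The time unit, the reference time `now` and the softmax-style normalization are assumptions for this example; the text only states that the normalization of [15] is reused, so the `normalize` helper below should be read as one plausible choice rather than the authors' exact method.

```python
import math

def bll(timestamps, now, d=0.5):
    """Equation 2: ln(sum_i t_i^(-d)), with recency t_i = now - timestamp.

    Every timestamp must lie strictly before `now` so that t_i > 0.
    """
    return math.log(sum((now - ts) ** (-d) for ts in timestamps))

def normalize(bll_values):
    """Map raw BLL values to [0, 1] (assumed softmax-style normalization)."""
    exp = {tag: math.exp(value) for tag, value in bll_values.items()}
    total = sum(exp.values())
    return {tag: value / total for tag, value in exp.items()}

# Hypothetical usage history of one user: tag -> times of use.
tag_uses = {"python": [1, 5, 9], "java": [2]}
raw = {tag: bll(times, now=10) for tag, times in tag_uses.items()}
scores = normalize(raw)
print(raw)
print(scores)
```

The tag that was used more often and more recently ("python") ends up with the larger value, which is the combined frequency and recency effect the equation is meant to capture.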
We adopt this equation for the ranking of items in social
tagging systems using a similar method as proposed in [26]
and [10]. Thus, a user is assumed to prefer an item if it
has been tagged with tags of high relevance for the user,
that is, with tags exhibiting a high BLL value. Given this
assumption, the BLL value of a given item i for the target
user u is determined using the following formula:

$$BLL(u, i) = \sum_{t \in tags(u, i)} BLL(u, t) \qquad (3)$$

where tags(u, i) is the set of tags u has used to tag i.
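
A minimal sketch of Equation 3 follows. It assumes that the normalized BLL values of the user's tags are already available in a dictionary and that the caller supplies the tag set tags(u, i); treating tags without a BLL value as contributing 0 is an assumption of this example.

```python
def bll_item(tags_ui, user_tag_bll):
    """Equation 3: sum of the user's BLL values over the tags in tags(u, i)."""
    return sum(user_tag_bll.get(tag, 0.0) for tag in tags_ui)

# Hypothetical normalized BLL values for one user.
user_tag_bll = {"python": 0.6, "ml": 0.3, "java": 0.1}
item_score = bll_item({"python", "ml"}, user_tag_bll)
print(item_score)
```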
Taken together, the prediction value pred(u, i) of a candidate
item i using our CIRTT approach is given by:

$$pred(u, i) = \underbrace{\sum_{j \in items(u)} sim(i, j)}_{sim(u, i)} \times BLL(u, i) \qquad (4)$$
This approach enables us to assign a higher weight to the items
within the candidate set that are more important to the target
user (i.e., items associated with tags exhibiting a high BLL
value that integrates tag frequency and recency). CIRTT and the
baseline algorithms presented in this work are implemented
in Java, are open-source software and can be downloaded from
our GitHub repository¹ [14].
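
The final ranking step of Equation 4 can be sketched as below. The example is written in Python for brevity and is not taken from the Java implementation; the input dictionaries and the function name `rank_candidates` are hypothetical.

```python
def rank_candidates(sim_scores, bll_scores, n=20):
    """Equation 4: pred(u, i) = sim(u, i) * BLL(u, i), returned as a top-n list."""
    preds = {item: sim_scores[item] * bll_scores.get(item, 0.0)
             for item in sim_scores}
    return sorted(preds.items(), key=lambda kv: kv[1], reverse=True)[:n]

# Hypothetical scores for three candidate items of one user.
sim_scores = {"c": 1.28, "d": 0.58, "e": 0.90}
bll_scores = {"c": 0.2, "d": 0.9, "e": 0.5}
ranked = rank_candidates(sim_scores, bll_scores)
print(ranked)
```

Because the two factors are multiplied, a candidate with a high item similarity can still be ranked low when its tags carry little BLL weight, as happens to item c in this toy example.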
3. EXPERIMENTAL SETUP
In this section we describe in detail the datasets, the evalu-
ation methodology and metrics as well as the baseline algo-
rithms used for our experiments.
3.1 Datasets
In order to evaluate our approach and for reasons of
reproducibility, we used freely available folksonomies gathered
from three well-known social tagging systems. We used datasets
of the social bookmark and publication sharing system
BibSonomy², the reference management system CiteULike³
and the movie recommendation site MovieLens⁴. As suggested
by related work in the field (e.g., [11, 9]), we excluded
all automatically imported and generated tags (e.g.,
bibtex-import). In the case of CiteULike, we randomly selected
10% of the user profiles for reasons of computational effort
(see also [7]).

¹ https://github.com/learning-layers/TagRec/
We did not use a full p-core pruning technique, since this
would negatively influence the recommender evaluation results
in social tagging systems, as shown by Doerfel and Jäschke
[6], but excluded all unique resources (i.e., resources that
have been bookmarked by only a single user). The final
dataset statistics can be found in Table 1.
3.2 Evaluation Methodology
To evaluate our item recommender approach we used a train-
ing and test-set split method as proposed by popular and
related work in this area [10, 26]. Hence, for each user
we sorted her bookmarks in chronological order and used
the 20% most recent bookmarks for testing and the rest for
training. We then examined whether a recommender approach
trained on the training set could predict the resources
bookmarked by a target user in the test set. This procedure
also simulates a real environment well, in which the future
bookmarking behavior of a user is predicted based on her
bookmarking behavior in the past [3].
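
The per-user split can be sketched as follows. The representation of a bookmark as a (timestamp, item) pair and the rounding down of the test-set size are assumptions of this example.

```python
def chronological_split(bookmarks, test_fraction=0.2):
    """Split one user's bookmarks into older (train) and newest (test) parts.

    bookmarks is a list of (timestamp, item) pairs in any order.
    """
    ordered = sorted(bookmarks)                  # oldest first
    n_test = int(len(ordered) * test_fraction)   # assumption: round down
    cut = len(ordered) - n_test
    return ordered[:cut], ordered[cut:]

# Hypothetical user with ten bookmarks at times 0..9.
history = [(t, f"item{t}") for t in range(10)]
train, test = chronological_split(history)
print(len(train), len(test))
```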
To finally quantify the recommendation accuracy of our ap-
proaches, we used a set of well-known information retrieval
metrics. In particular, we report Normalized Discounted
Cumulative Gain (nDCG@20), Mean Average Precision (MAP@20),
Recall (R@20), Diversity (D) and User Coverage (UC)
[21, 8]. All performance metrics are calculated and reported
based on the top-20 recommended items. Moreover we also
show the performance of the algorithms in the plots of all
three accuracy metrics (nDCG, MAP and Recall) for 1 - 20
recommended items (see also [4]).
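
For orientation, the three accuracy metrics can be sketched for a single user with binary relevance as shown below. Several normalization variants of average precision and nDCG exist in the literature; the variants chosen here are assumptions and may differ in detail from the ones used in the evaluation framework. MAP is the mean of the average precision over all users.

```python
import math

def recall_at_k(recommended, relevant, k=20):
    """Fraction of the relevant items that appear in the top-k list."""
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / len(relevant) if relevant else 0.0

def average_precision_at_k(recommended, relevant, k=20):
    """Average of the precision values at each rank that holds a hit."""
    hits, total = 0, 0.0
    for rank, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / min(len(relevant), k) if relevant else 0.0

def ndcg_at_k(recommended, relevant, k=20):
    """DCG with binary gains, divided by the DCG of an ideal ranking."""
    dcg = sum(1.0 / math.log2(rank + 1)
              for rank, item in enumerate(recommended[:k], start=1)
              if item in relevant)
    ideal = sum(1.0 / math.log2(rank + 1)
                for rank in range(1, min(len(relevant), k) + 1))
    return dcg / ideal if ideal else 0.0

recommended = ["a", "x", "b", "y"]   # hypothetical ranked list
relevant = {"a", "b", "c"}           # hypothetical test-set items
print(recall_at_k(recommended, relevant, k=4))
print(average_precision_at_k(recommended, relevant, k=4))
print(ndcg_at_k(recommended, relevant, k=4))
```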
3.3 Baseline Algorithms
In order to evaluate our tag and time based CIRTT ap-
proach, we compared it to several baseline algorithms in
terms of recommender accuracy. The algorithms have been
selected with respect to their popularity, performance and
novelty.
MostPopular (MP): The most basic approach we utilized
is the simple Most Popular (MP) approach that recommends
for any user the same set of items. These items are weighted
by their frequency in all bookmarks, meaning that the most
frequently bookmarked items are recommended.
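
A minimal sketch of the MP baseline, assuming bookmarks are given as a mapping from user ids to item sets (a hypothetical layout chosen for the example):

```python
from collections import Counter

def most_popular(bookmarks, n=20):
    """Return the n most frequently bookmarked items, identical for every user."""
    counts = Counter(item for items in bookmarks.values() for item in items)
    return [item for item, _ in counts.most_common(n)]

bookmarks = {
    "u1": {"a", "b"},
    "u2": {"a", "b", "c"},
    "u3": {"b", "d"},
}
top_items = most_popular(bookmarks, n=2)
print(top_items)
```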
User-based Collaborative Filtering (CF): Another approach
we benchmarked against is the well-known User-based
Collaborative Filtering (CF) recommendation algorithm
[19]. The main idea of CF is that users who are more similar
to each other (i.e., have similar taste) will probably also like
the same items. Thus, the CF approach first finds the k most
similar users for the target user and afterwards recommends
their items that are new to her (i.e., have not been
bookmarked before). We calculated the user similarities based on
both the binary user-item matrix as proposed in [26]
(hereinafter referred to as CF_B) and the tag-based user profiles as
proposed in [10] (hereinafter referred to as CF_T). Although
we also considered using item-based CF [18], we dismissed it
based on the tag-based recommender experiments of Bogers
et al. [2], which showed that user-based CF always beat
item-based CF. They attribute this result to the number of items
in the dataset being larger than the number of users, which
is also the case in our three datasets (Table 1).

² http://www.kde.cs.uni-kassel.de/bibsonomy/dumps
³ http://www.citeulike.org/faq/data.adp
⁴ http://grouplens.org/datasets/movielens/
Collaborative Filtering Using Tag and Time Infor-
mation (Z / H): We also compared our approach to two
alternative algorithms that focus on improving Collabora-
tive Filtering for social tagging systems using tag and time
information. The first one has been proposed by Zheng et
al. [26] (hereinafter referred to as Z) and improves the tradi-
tional CF approach based on the binary user-resource matrix
using tag and time information. As in our CIRTT approach
this is done using information about tag frequency and re-
cency but in contrast to our solution the authors model the
forgetting process using an exponential distribution rather
than a power-law distribution. Moreover, this information
is already used in the user similarity calculation step and
not in the item ranking step as it is done in our approach.
The second tag and time-based approach we tried to bench-
mark against was proposed by Huang et al. [10] (hereinafter
referred to as H). As in our approach, this algorithm uses
a 2-step recommendation process, where in the first step
a potentially interesting candidate item-set for the target
user is determined using user-based CF and in the second
step this candidate item-set is ranked using item-based CF.
In contrast to our approach, the authors calculate the user
and item similarities based on user tag-profiles rather than
based on the binary user-item matrix. Furthermore, in this
approach the forgetting process is modeled using a simple
linear function rather than a power-law distribution.
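
To make the difference between the three families of forgetting functions more tangible, the sketch below compares generic decay shapes. Only the power-law term follows Equation 2; the exponential and linear forms, the rate `lam` and the `horizon` are hypothetical stand-ins and are not the exact formulas of Zheng et al. [26] or Huang et al. [10].

```python
import math

def power_law(t, d=0.5):
    """Power-law decay t^(-d) as used inside the BLL sum; requires t > 0."""
    return t ** (-d)

def exponential(t, lam=0.1):
    """Generic exponential decay with a hypothetical rate parameter lam."""
    return math.exp(-lam * t)

def linear(t, horizon=50.0):
    """Generic linear decay that reaches zero at a hypothetical horizon."""
    return max(0.0, 1.0 - t / horizon)

# Weight of a single tag use after 1, 10 and 100 time units.
for t in (1, 10, 100):
    print(t, round(power_law(t), 4), round(exponential(t), 4), round(linear(t), 4))
```

With these illustrative parameters, the power-law weight keeps a noticeable value for old tag uses, whereas the exponential weight is close to zero and the linear weight is exactly zero after the horizon.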
All CF-based approaches mentioned in this section use a
neighborhood of 20 users and make use of the Cosine simi-
larity measure as it is also done in CIRTT (see also [7]).
4. RESULTS
In this section, we present the results of the evaluation com-
paring our CIRTT approach to the baseline algorithms de-
scribed in Section 3.3 with respect to recommender accuracy
on three different folksonomy datasets (BibSonomy, CiteU-
Like and MovieLens).
In an extensive empirical study, Cremonesi et al. [5] have
shown that standard Information Retrieval accuracy metrics
(e.g., Recall or nDCG) are well suited to evaluate recommender
systems, at least in the case of top-N recommendation
tasks. Therefore, Table 2 provides measures of accuracy
(nDCG@20, MAP@20, R@20) and - additionally - measures
of Diversity (D) and User Coverage (UC) for each approach
and for each of the three datasets.
Dataset     Metric    MP      CF_T    CF_B    Z       H       CIRTT
BibSonomy   nDCG@20   .0143   .0448   .0610   .0621   .0564   .0638
            MAP@20    .0057   .0319   .0440   .0447   .0394   .0464
            R@20      .0204   .0618   .0820   .0834   .0816   .0907
            D         .8307   .8275   .8852   .8528   .6209   .8811
            UC        100%    99.76%  99.52%  99.52%  99.76%  99.76%
CiteULike   nDCG@20   .0062   .0407   .0717   .0762   .0706   .0912
            MAP@20    .0036   .0241   .0453   .0484   .0459   .0629
            R@20      .0077   .0630   .1033   .1077   .0928   .1225
            D         .8936   .7969   .8642   .8145   .6318   .8640
            UC        100%    98.38%  96.44%  97.32%  98.38%  97.61%
MovieLens   nDCG@20   .0198   .0361   .0602   .0614   .0484   .0650
            MAP@20    .0075   .0201   .0347   .0367   .0263   .0413
            R@20      .0366   .0561   .1031   .1013   .0763   .1058
            D         .9326   .8861   .9267   .9119   .7789   .9176
            UC        100%    97.82%  95.90%  98.43%  97.82%  95.90%

Table 2: nDCG@20, MAP@20, R@20, D and UC values for BibSonomy, CiteULike and MovieLens, showing
that CIRTT, which integrates tag and time information using the BLL equation, outperforms state-of-the-art
baseline algorithms.
As expected, the MP baseline approach, which is not
personalized at all, resulted in the lowest accuracy estimates.
Regarding the two traditional CF approaches, the CF_B
approach, which constructs a binary user-item matrix based
on bookmarks, performs better than CF_T, which is based
solely on the user tag-profiles. Regarding the two alternative
tag- and time-based approaches, the same phenomenon
can be observed: the algorithm of Zheng et al. (Z) [26],
which is also based on the binary user-item matrix, performs
better than the approach of Huang et al. (H) [10], which is
based on the user tag-profiles.
With respect to all accuracy metrics (nDCG@20, MAP@20,
R@20), our CIRTT approach, which integrates tag and time
information using the BLL equation, performs best on all
three datasets (BibSonomy, CiteULike and MovieLens). This
may suggest that applying a power-law function, as is done
via the BLL equation, is more appropriate to account for
effects of recency than an exponential function (Zheng et al.
[26]) or a linear function (Huang et al. [10]). The same pattern
of results can be observed in Figure 1, which shows
estimates of the nDCG, MAP and Recall measures for
different sizes of the recommended item set. We also
tried to integrate the exponential recency function of Zheng
et al. into our approach, which resulted in lower accuracy
estimates than the BLL power-law forgetting function.
Regarding the two metrics that are not accuracy-based,
the approach of Huang et al. (H) interestingly always results
in the lowest Diversity (D) of recommended items. This
result might arise because this approach is based on the user
tag-profiles and the Diversity metric is calculated based on
tags. Finally, as all personalized approaches utilize a
user-based CF approach for finding similar users, the User
Coverage (UC) differs only slightly between the
different algorithms. We observed the maximum deviation
of 2.53 percentage points within the MovieLens dataset.
5. CONCLUSIONS & FUTURE WORK
In this work we have presented preliminary results of a novel
recommendation approach called Collaborative Item Rank-
ing Using Tag and Time Information (CIRTT) that aims at
improving Collaborative Filtering in social tagging systems.
Our algorithm follows a two-step approach as also done in
[10], where in the first step a potentially interesting can-
didate item set is found performing user-based CF and in
the second step this candidate item set is ranked perform-
ing item-based CF. Within this ranking step we integrate
the information of frequency and recency of tag use by
applying the Base-Level Learning (BLL) equation [1]. Thus, in
contrast to existing approaches that also consider information
about tags and time (e.g., [26, 10]), CIRTT draws on
an empirically well-established formalism that models the reuse
probability of memory items (tags) in the form of a power-law
forgetting function. In recent work, the same formalism has
turned out to substantially improve the ranking and
recommendation of tags [15].
The current evaluation conducted on datasets gathered from
three social tagging systems (BibSonomy, CiteULike and
MovieLens) reveals that applying the BLL equation also
helps to improve the ranking and recommendation process
of items. Most importantly, the results speak in favor of an
integrative research endeavor that places a data-driven
approach on a theoretical foundation provided by research on
human cognition and semiotics.
Our future work will aim at improving the approach
presented in this paper. For example, we will examine
whether the BLL equation can also help to improve the
calculation of user similarities and thus to find more suitable
user neighborhoods and candidate items. Additionally, we
will put more emphasis on semiotic dynamics that have been
found to play out in tagging systems (e.g., [22]) and on how
individual learning and forgetting processes are influenced
by other individuals' behavior in the system. Moreover, we
also plan to further improve the item ranking process
using insights from relevant research dealing with recommender
novelty and diversity (e.g., [25]) in order to increase user
acceptance.
[Figure 1 (plots not reproduced): panels (a)-(c) nDCG, (d)-(f) MAP and (g)-(i) Recall, each for BibSonomy, CiteULike and MovieLens.]
Figure 1: nDCG, MAP and Recall plots for BibSonomy, CiteULike and MovieLens showing the
recommendation accuracy of our tag and time based CIRTT approach along with state-of-the-art baseline algorithms
for 1 - 20 recommended items (k). It can be seen that CIRTT reaches the highest levels of recommender
accuracy over all three metrics and on all datasets.

Acknowledgments: This work is supported by the Know-Center,
the EU funded project Learning Layers (Grant Nr.
318209) and the Austrian Science Fund (FWF): P 25593-G22.
The Know-Center is funded within the Austrian COMET
Program - Competence Centers for Excellent Technologies
- under the auspices of the Austrian Ministry of Transport,
Innovation and Technology, the Austrian Ministry of
Economics and Labor and by the State of Styria. COMET
is managed by the Austrian Research Promotion Agency
(FFG).
6. REFERENCES
[1] J. R. Anderson, M. D. Byrne, S. Douglass, C. Lebiere,
and Y. Qin. An integrated theory of the mind.
Psychological Review, 111(4):1036–1050, 2004.
[2] T. Bogers and A. van den Bosch. Recommending
scientific articles using citeulike. In Proceedings of the
2008 ACM Conference on Recommender Systems,
RecSys ’08, pages 287–290, New York, NY, USA,
2008. ACM.
[3] P. G. Campos, F. Díez, and I. Cantador. Time-aware
recommender systems: a comprehensive survey and
analysis of existing evaluation protocols. User
Modeling and User-Adapted Interaction, pages 1–53,
2013.
[4] P. Cremonesi, P. Garza, E. Quintarelli, and R. Turrin.
Top-n recommendations on unpopular items with
contextual knowledge. In Workshop on Context-aware
Recommender Systems ’11.
[5] P. Cremonesi, Y. Koren, and R. Turrin. Performance
of recommender algorithms on top-n recommendation
tasks. In Proc., RecSys ’10, New York, NY, USA.
ACM.
[6] S. Doerfel and R. Jäschke. An analysis of
tag-recommender evaluation procedures. In
Proceedings of the 7th ACM conference on
Recommender systems, pages 343–346. ACM, 2013.
[7] J. Gemmell, T. Schimoler, M. Ramezani,
L. Christiansen, and B. Mobasher. Improving folkrank
with item-based collaborative filtering. Recommender
Systems & the Social Web, 2009.
[8] J. L. Herlocker, J. A. Konstan, L. G. Terveen, and
J. T. Riedl. Evaluating collaborative filtering
recommender systems. ACM Transactions on
Information Systems (TOIS), 22(1):5–53, 2004.
[9] A. Hotho, R. Jäschke, C. Schmitz, and G. Stumme.
Information retrieval in folksonomies: Search and
ranking. In The semantic web: research and
applications. Springer, 2006.
[10] C.-L. Huang, P.-H. Yeh, C.-W. Lin, and D.-C. Wu.
Utilizing user tag-based interests in recommender
systems for social resource sharing websites.
Knowledge-Based Systems, 2014.
[11] R. Jäschke, L. Marinho, A. Hotho,
L. Schmidt-Thieme, and G. Stumme. Tag
recommendations in folksonomies. In Knowledge
Discovery in Databases: PKDD 2007, pages 506–514.
Springer, 2007.
[12] R. Jäschke, L. Marinho, A. Hotho,
L. Schmidt-Thieme, and G. Stumme. Tag
recommendations in social bookmarking systems. AI
Communications, 21(4):231–247, 2008.
[13] C. Körner, D. Benz, A. Hotho, M. Strohmaier, and
G. Stumme. Stop thinking, start tagging: tag
semantics emerge from collaborative verbosity. In
Proceedings of the 19th international conference on
World wide web, pages 521–530. ACM, 2010.
[14] D. Kowald, E. Lacic, and C. Trattner. Tagrec:
Towards a standardized tag recommender
benchmarking framework. In Proceedings of the 25th
ACM Conference on Hypertext and Social Media, HT
’14, New York, NY, USA, 2014. ACM.
[15] D. Kowald, P. Seitlinger, C. Trattner, and T. Ley.
Long time no see: The probability of reusing tags as a
function of frequency and recency. In Proc. WWW
’14. ACM.
[16] J. R. Anderson and L. J. Schooler. Reflections of the
environment in memory. Psychological Science, 1991.
[17] D. Parra-Santander and P. Brusilovsky. Improving
collaborative filtering in social tagging systems for the
recommendation of scientific articles. In WI-IAT, 2010
IEEE/WIC/ACM.
[18] B. Sarwar, G. Karypis, J. Konstan, and J. Riedl.
Item-based collaborative filtering recommendation
algorithms. In Proceedings of the 10th International
Conference on World Wide Web, WWW ’01, pages
285–295, New York, NY, USA, 2001. ACM.
[19] J. B. Schafer, D. Frankowski, J. Herlocker, and S. Sen.
Collaborative filtering recommender systems. In The
adaptive web, pages 291–324. Springer, 2007.
[20] P. Seitlinger, D. Kowald, C. Trattner, and T. Ley.
Recommending tags with a model of human
categorization. In Proceedings of the 22nd ACM
international conference on Conference on
information and knowledge management, CIKM ’13,
pages 2381–2386, New York, NY, USA, 2013. ACM.
[21] B. Smyth and P. McClave. Similarity vs. diversity. In
D. Aha and I. Watson, editors, Case-Based Reasoning
Research and Development, LNCS. Springer, 2001.
[22] L. Steels. Semiotic dynamics for embodied agents.
Intelligent Systems, IEEE, 21(3):32–38, 2006.
[23] C. Trattner, Y.-l. Lin, D. Parra, Z. Yue, W. Real, and
P. Brusilovsky. Evaluating tag-based information
access in image collections. In Proceedings of the 23rd
ACM Conference on Hypertext and Social Media, HT
’12, pages 113–122, New York, NY, USA, 2012. ACM.
[24] K. H. Tso-Sutter, L. B. Marinho, and
L. Schmidt-Thieme. Tag-aware recommender systems
by fusion of collaborative filtering algorithms. In Proc.
of SAC ’08. ACM.
[25] S. Vargas and P. Castells. Rank and relevance in
novelty and diversity metrics for recommender
systems. In Proc., RecSys ’11. ACM.
[26] N. Zheng and Q. Li. A recommender system based on
tag and time information for social tagging systems.
Expert Syst. Appl., 2011.
... Research papers Model of human categorization [17,23,35] Activation processes in human memory [18,21,24,37] Informal learning se ings [5][6][7] Resource recommendations Research papers A ention-interpretation dynamics [15,34] Tag and time information [27,28] Recommendation evaluation Research papers Real-world folksonomies [20] Technology enhanced learning se ings [16] Hashtag recommendations ...
... Resource Recommendations using Tag and Time Information. In [27], the Collaborative Item Ranking Using Tag and Time Information (CIRTT) approach was presented. CIRTT uses Collaborative Filtering to identify a set of candidate resources and re-ranks these candidate resources by incorporating tag and time information. ...
Conference Paper
Full-text available
Recommender systems have become important tools to support users in identifying relevant content in an overloaded information space. To ease the development of recommender systems, a number of recommender frameworks have been proposed that serve a wide range of application domains. Our TagRec framework is one of the few examples of an open-source framework tailored towards developing and evaluating tag-based recommender systems. In this paper, we present the current, updated state of TagRec, and we summarize and reeect on four use cases that have been implemented with TagRec: (i) tag recommendations, (ii) resource recommendations, (iii) recommendation evaluation, and (iv) hashtag recommendations. To date, TagRec served the development and/or evaluation process of tag-based recommender systems in two large scale European research projects, which have been described in 17 research papers. us, we believe that this work is of interest for both researchers and practitioners of tag-based recommender systems.
... Collaborative filtering extensions. In [27] , the Collaborative Item Ranking Using Tag and Time Information (CIRTT) approach is introduced, which combines user-based and item-based CF with the information about tag frequency and recency through the base-level learning (BLL) equation from human memory theory. An extensive survey on CF was recently conducted by [43] . ...
... In order to evaluate our algorithm and to follow common practice in recommender systems research (e.g., [24, 19, 47]), we split our datasets into training and test sets. Therefore , we followed the method described in [27] to retain the chronological order of the posts. Specifically, we used the 20% most recent posts of each user for testing and the rest for training the algorithms. ...
Article
Full-text available
Classic resource recommenders like Collaborative Filtering treat users as just another entity, thereby neglecting non-linear user-resource dynamics that shape attention and interpretation. SUSTAIN, as an unsupervised human category learning model, captures these dynamics. It aims to mimic a learner's categorization behavior. In this paper, we use three social bookmarking datasets gathered from BibSonomy, CiteULike and Delicious to investigate SUSTAIN as a user modeling approach to re-rank and enrich Collaborative Filtering following a hybrid recommender strategy. Evaluations against baseline algorithms in terms of recommender accuracy and computational complexity reveal encouraging results. Our approach substantially improves Collaborative Filtering and, depending on the dataset, successfully competes with a computationally much more expensive Matrix Factorization variant. In a further step, we explore SUSTAIN's dynamics in our specific learning task and show that both memorization of a user's history and clustering, contribute to the algorithm's performance. Finally, we observe that the users' attentional foci determined by SUSTAIN correlate with the users' level of curiosity, identified by the SPEAR algorithm. Overall, the results of our study show that SUSTAIN can be used to efficiently model attention-interpretation dynamics of users and can help improve Collaborative Filtering for resource recommendations.
... Collaborative Filtering Extensions: One of our previous studies in this field [28] introduces the so-called Collaborative Item Ranking Using Tag and Time Information (CIRTT) approach, which extends CF in social tagging systems by incorporating tag and time information. This approach combines user-based and item-based CF with the information of tag frequency and recency by applying the base-level learning (BLL) equation coming from human memory theory. ...
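As a rough illustration of how tag frequency and recency can be combined, the BLL equation from the ACT-R memory literature is commonly written as B = ln(sum_j (t_ref - t_j)^(-d)), where the t_j are the times of a tag's past uses and d is a power-law decay exponent. The sketch below assumes this standard form. The function name, the time unit, and the default d = 0.5 are assumptions and are not taken from the excerpt.

```python
import math


def bll_activation(use_times, reference_time, d=0.5):
    """Base-level activation of a tag from the times at which it was used.

    Each past use contributes (reference_time - t) ** (-d), so frequent and
    recent uses both raise the activation. Uses at or after the reference
    time are ignored. The default d = 0.5 is an assumed, conventional value.
    """
    total = sum(
        (reference_time - t) ** (-d)
        for t in use_times
        if t < reference_time
    )
    # With no usable history there is no activation to speak of.
    return math.log(total) if total > 0 else float("-inf")
```

Under this form, a tag used once recently can outscore a tag used several times long ago, which is the frequency and recency trade-off the excerpt refers to.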
... In order to evaluate our algorithm and to follow common practice in recommender systems research (e.g., [20,41]), we split our datasets into training and test sets. To this end, we followed the method described in [28] to retain the chronological order of the posts. This also closely simulates a real-world environment, in which future interactions are predicted from interactions in the past [5]. ...
Conference Paper
Classic resource recommenders like Collaborative Filtering (CF) treat users as being just another entity, neglecting non-linear user-resource dynamics shaping attention and interpretation. In this paper, we propose a novel hybrid recommendation strategy that refines CF by capturing these dynamics. The evaluation results reveal that our approach substantially improves CF and, depending on the dataset, successfully competes with a computationally much more expensive Matrix Factorization variant.
... Here, we also want to study popularity bias in top-n settings using ranking-aware metrics such as nDCG (e.g., as used in [18]). Finally, we plan to work on further bias mitigation strategies based on cognitive-inspired user modeling and recommendation techniques (e.g., [21,17,14]). ...
Chapter
Multimedia recommender systems suggest media items, e.g., songs, (digital) books and movies, to users by utilizing concepts of traditional recommender systems such as collaborative filtering. In this paper, we investigate a potential issue of such collaborative filtering-based multimedia recommender systems, namely popularity bias that leads to the underrepresentation of unpopular items in the recommendation lists. Therefore, we study four multimedia datasets, i.e., Last.fm, MovieLens, BookCrossing and MyAnimeList, that we each split into three user groups differing in their inclination to popularity, i.e., LowPop, MedPop and HighPop. Using these user groups, we evaluate four collaborative filtering-based algorithms with respect to popularity bias on the item and the user level. Our findings are three-fold: firstly, we show that users with little interest in popular items tend to have large user profiles and thus are important data sources for multimedia recommender systems. Secondly, we find that popular items are recommended more frequently than unpopular ones. Thirdly, we find that users with little interest in popular items receive significantly worse recommendations than users with medium or high interest in popularity. Keywords: multimedia recommender systems, collaborative filtering, popularity bias, algorithmic fairness
... Similar to domains such as social networks or social tagging systems [14,17,21], the personalization of online content has become one of the key drivers for news portals to increase user engagement and convince readers to become paying subscribers [8,9,22]. A natural way for news portals to do this is to provide their users with articles that are fresh and popular. ...
Chapter
Personalized news recommender systems support readers in finding the right and relevant articles in online news platforms. In this paper, we discuss the introduction of personalized, content-based news recommendations on DiePresse, a popular Austrian online news platform, focusing on two specific aspects: (i) user interface type, and (ii) popularity bias mitigation. Therefore, we conducted a two-week online study that started in October 2020, in which we analyzed the impact of recommendations on two user groups, i.e., anonymous and subscribed users, and three user interface types, i.e., on a desktop, mobile and tablet device. With respect to user interface types, we find that the probability of a recommendation being seen is the highest for desktop devices, while the probability of interacting with recommendations is the highest for mobile devices. With respect to popularity bias mitigation, we find that personalized, content-based news recommendations can lead to a more balanced distribution of news articles’ readership popularity in the case of anonymous users. Apart from that, we find that significant events (e.g., the COVID-19 lockdown announcement in Austria and the Vienna terror attack) influence the general consumption behavior of popular articles for both anonymous and subscribed users. Keywords: News recommendation, User interface, Popularity bias
... Because of their essential role, tags are recognized as a suitable means to foster collaborative learning and students' knowledge development [8]. Despite these benefits of tagging and the high demand for tagging support, research on tag recommendation in the TEL domain is very limited [9]. The previous works focus particularly on tag recommendation for learning objects (e.g. ...
Conference Paper
Systems for Community Question Answering (CQA) are well-known on the open web (e.g. Stack Overflow or Quora). They have recently also been adopted in the educational domain (mostly in MOOCs) to mediate communication between students and teachers. As students are only novices in the topics they learn about, they may need various forms of scaffolding to achieve effective question answering. In this work, we focus specifically on the automatic recommendation of tags classifying students' questions. We propose a novel method that can automatically analyze the text of a question and suggest appropriate tags to an asker. The method takes the specifics of the educational domain into consideration through a two-step recommendation process in which tags reflecting the course structure are recommended first and subsequently supplemented with additional related tags. Evaluation of the method on data from the CS50 MOOC on the Stack Exchange platform showed that the proposed method achieved higher performance than a baseline method (tag recommendation without taking educational specifics into account).
... Zheng and Li [13] built a resource-recommendation model that combines tag and time information in collaborative filtering (CF). In [14], Lacic et al. proposed a two-step collaborative service ranking approach using tag and time information that integrates user- and service-based CF with the Base-Level Learning (BLL) equation. Although these models are able to model the shift of users' preferences over time, they do not consider the impact of friends. ...
Chapter
In this work, we tackle the problem of adapting a real-time recommender system to multiple application domains, and their underlying data models and customization requirements. To do that, we present Uptrendz, a multi-domain recommendation platform that can be customized to provide real-time recommendations in an API-centric way. We demonstrate (i) how to set up a real-time movie recommender using the popular MovieLens-100k dataset, and (ii) how to simultaneously support multiple application domains based on the use-case of recommendations in entrepreneurial start-up founding. For that, we differentiate between domains on the item- and system-level. We believe that our demonstration shows a convenient way to adapt, deploy and evaluate a recommender system in an API-centric way. The source-code and documentation that demonstrates how to utilize the configured Uptrendz API is available on GitHub. Keywords: Uptrendz, API-centric recommendations, Multi-domain recommendations, Real-time recommendations
Article
Tag-based resource recommendation is an interesting and important research topic and has been applied to a wide range of applications. A user’s tagging behavior usually reflects his/her interests in social tagging systems; however, most existing work cannot fully account for the features of a user’s tagging behavior, such as tag frequency, time and ordinal position in tag assignments. In this paper, we employ the combination of cluster analysis and data fitting for extracting the correlations between user interests and the three features, and then present a novel user interest model based on the features to compute the user interest degree. In addition, we propose a collaborative filtering based approach, in which top-k similar users are filtered by resource-interest-based profiles; resource similarities are obtained by tag-frequency-based profiles; the candidate resources are then ranked according to the user interest model, resource profile similarity and user profile similarity. Experimental results on two real-world datasets demonstrate that the proposed approach outperforms the traditional collaborative filtering baselines.
Article
Traditional recommender systems provide recommendations of items to users; recently, some of them also consider the context related to predictions. In this paper we propose a technique that relies on classical recommendation algorithms and post-filters recommendations on the basis of contextual information available for them. Association rules are exploited to identify the most significant correlations among context and item characteristics. The mined rules are used to filter the predictions performed by traditional recommender systems to provide contextualized recommendations. Our experimental results show that the proposed approach allows improving the output of classical algorithms proposed in the literature, especially in the case of unpopular items.
Conference Paper
In this paper, we introduce TagRec, a standardized tag recommender benchmarking framework implemented in Java. The purpose of TagRec is to provide researchers with a framework that supports all steps of the development process of a new tag recommendation algorithm in a reproducible way, including methods for data pre-processing, data modeling, data analysis and recommender evaluation against state-of-the-art baseline approaches. We demonstrate the performance of the algorithms implemented in TagRec in terms of prediction quality and runtime using an extensive evaluation of a real-world folksonomy dataset. Furthermore, TagRec contains two novel tag recommendation approaches based on models derived from human cognition and human memory theories.
Article
Recently, collaborative tagging, also known as “folksonomy” in Web 2.0, allows users to collaboratively create and manage tags to classify and categorize dynamic content for searching and sharing. A user’s interest in social resources usually changes with time in such a dynamic and information-rich environment. Additionally, a social network is one innovative characteristic in social resource sharing websites. The information from a social network provides an inference of a certain user’s interests based on the interests of this user’s network neighbors. To handle the problem of personalized interests changing gradually with time, and to utilize the benefit of the social network, this study models a personalized user interest, incorporating frequency, recency, and duration of tag-based information, and performs collaborative recommendations using the user’s social network in social resource sharing websites. The proposed method includes finding neighbors from the “social friends” network by using collaborative filtering and recommending similar resource items to the users by using content-based filtering. This study examines the proposed system’s performance using an experimental dataset collected from a social bookmarking website. The experimental results show that the hybridization of user’s preferences with frequency, recency, and duration plays an important role, and provides better performance than traditional collaborative recommendation systems. The experimental results also reveal that the friend network information can successfully collaborate, thus improving the collaborative recommendation process.
Article
In this paper, we introduce a tag recommendation algorithm that mimics the way humans draw on items in their long-term memory. This approach uses the frequency and recency of previous tag assignments to estimate the probability of reusing a particular tag. Using three real-world folksonomies gathered from bookmarks in BibSonomy, CiteULike and Flickr, we show how adding a time-dependent component outperforms conventional "most popular tags" approaches and another existing and very effective but less theory-driven, time-dependent recommendation mechanism. By combining our approach with a simple resource-specific frequency analysis, our algorithm outperforms other well-established algorithms, such as FolkRank, Pairwise Interaction Tensor Factorization and Collaborative Filtering. We conclude that our approach provides an accurate and computationally efficient model of a user's temporal tagging behavior. We show how effective principles for information retrieval can be designed and implemented if human memory processes are taken into account.
Article
Exploiting temporal context has been proved to be an effective approach to improve recommendation performance, as shown, e.g. in the Netflix Prize competition. Time-aware recommender systems (TARS) are indeed receiving increasing attention. A wide range of approaches dealing with the time dimension in user modeling and recommendation strategies have been proposed. In the literature, however, reported results and conclusions about how to incorporate and exploit time information within the recommendation processes seem to be contradictory in some cases. Aiming to clarify and address existing discrepancies, in this paper we present a comprehensive survey and analysis of the state of the art on TARS. The analysis shows that meaningful divergences appear in the evaluation protocols used (metrics and methodologies). We identify a number of key conditions on offline evaluation of TARS, and based on these conditions, we provide a comprehensive classification of evaluation protocols for TARS. Moreover, we propose a methodological description framework aimed to make the evaluation process fair and reproducible. We also present an empirical study on the impact of different evaluation protocols on measuring relative performances of well-known TARS. The results obtained show that different uses of the above evaluation conditions yield remarkably distinct performance and relative ranking values of the recommendation approaches. They reveal the need for clearly stating the evaluation conditions used to ensure comparability and reproducibility of reported results. From our analysis and experiments, we finally conclude with methodological issues a robust evaluation of TARS should take into consideration. Furthermore, we provide a number of general guidelines to select proper conditions for evaluating particular TARS.
Conference Paper
Social tagging involves complex processes of human categorization that have been the topic of much research in the cognitive sciences. In this paper we present a recommender approach for social tags whose principles are derived from some of the more prominent and empirically well-founded models from this research tradition. The basic architecture is a simple three-layer connectionist model. The input layer encodes patterns of semantic features of a user-specific resource, which are either latent topics elicited through Latent Dirichlet Allocation (LDA) or available external categories. The hidden layer categorizes the resource by matching the encoded pattern against already learned exemplar patterns. The latter are composed of unique feature patterns and associated tag distributions. Finally, the output layer samples tags from the associated tag distributions to verbalize the preceding categorization process. We have evaluated this approach on a real-world folksonomy gathered from Wikipedia bookmarks in Delicious. In the experiment our approach outperformed LDA, a well-established algorithm. We attribute this to the fact that our approach processes semantic information (either latent topics or external categories) across the three different layers, and this substantially enhances the recommendation performance. With this paper, we demonstrate that a theoretically guided design of algorithms not only holds potential for improving existing recommendation mechanisms, but it also allows us to derive more generalizable insights about how human information interaction on the Web is determined by both semantic and verbal processes.
Conference Paper
The availability of social tags has greatly enhanced access to information. Tag clouds have emerged as a new "social" way to find and visualize information, providing both one-click access to information and a snapshot of the "aboutness" of a tagged collection. A range of research projects explored and compared different tag artifacts for information access ranging from regular tag clouds to tag hierarchies. At the same time, there is a lack of user studies that compare the effectiveness of different types of tag-based browsing interfaces from the users' point of view. This paper contributes to the research on tag-based information access by presenting a controlled user study that compared three types of tag-based interfaces on two recognized types of search tasks: lookup and exploratory search. Our results demonstrate that tag-based browsing interfaces significantly outperform traditional search interfaces in both performance and user satisfaction. At the same time, the differences between the two types of tag-based browsing interfaces explored in our study are not as clear.
Article
Since the rise of collaborative tagging systems on the web, the tag recommendation task (suggesting suitable tags to users of such systems while they add resources to their collection) has been tackled. However, the (offline) evaluation of tag recommendation algorithms usually suffers from difficulties like the sparseness of the data or the cold start problem for new resources or users. Previous studies therefore often used so-called post-cores (specific subsets of the original datasets) for their experiments. In this paper, we conduct a large-scale experiment in which we analyze different tag recommendation algorithms on different cores of three real-world datasets. We show that a recommender's performance depends on the particular core and explore correlations between performances on different cores.
Article
Collaborative tagging applications allow users to annotate online resources. The result is a complex tapestry of interrelated users, resources and tags often called a folksonomy. Folksonomies present an attractive target for data mining applications such as tag recommenders. A challenge of tag recommendation remains the adaptation of traditional recommendation techniques originally designed to work with two-dimensional data. To date the most successful recommenders have been graph-based approaches that explicitly connect all three components of the folksonomy. In this paper we speculate that graph-based tag recommendation can be improved by coupling it with item-based collaborative filtering. We motivate this hypothesis with a discussion of informational channels in folksonomies and provide a theoretical explanation of the additive potential for item-based collaborative filtering. We then provide experimental results on hybrid tag recommenders built from graph models and other techniques based on popularity, user-based collaborative filtering and item-based collaborative filtering. We demonstrate that a hybrid recommender built from a graph-based model and item-based collaborative filtering outperforms its constituent recommenders. Furthermore, the inability of the other recommenders to improve upon the graph-based approach suggests that they offer information already included in the graph-based model. These results confirm our conjecture. We provide extensive evaluation of the hybrids using data collected from three real-world collaborative tagging applications.