Forgetting the Words but Remembering the
Meaning: Modeling Forgetting in a Verbal and
Semantic Tag Recommender
Dominik Kowald^{1,2}, Paul Seitlinger^2, Simone Kopeinik^2, Tobias Ley^3, and
Christoph Trattner^4
^1 Know-Center, Graz University of Technology, Graz, Austria
dkowald@know-center.at
^2 Knowledge Technologies Institute, Graz University of Technology, Graz, Austria
{paul.seitlinger,simone.kopeinik}@tugraz.at
^3 Institute of Informatics, Tallinn University, Tallinn, Estonia
tley@tlu.ee
^4 Norwegian University of Science and Technology, Trondheim, Norway
chritrat@idi.ntnu.no
Abstract. We assume that recommender systems are more successful when they
are based on a thorough understanding of how people process information. In
the current paper, we test this assumption in the context of social tagging
systems. Cognitive research on how people assign tags has shown that they
draw on two interconnected levels of knowledge in their memory: a conceptual
level of semantic fields or LDA topics, and a lexical level that turns
patterns on the semantic level into words. Another strand of tagging research
reveals a strong impact of time-dependent forgetting on users’ tag choices,
such that recently used tags have a higher probability of being reused than
“older” tags. In this paper, we align both strands by implementing a
computational theory of human memory that integrates the two-level conception
and the process of forgetting in the form of a tag recommender. Furthermore,
we test the approach in three large-scale social tagging datasets that are
drawn from BibSonomy, CiteULike and Flickr.
First, as expected, our results reveal a selective effect of time: forgetting
is much more pronounced on the lexical level of tags than on the semantic
level of topics. Second, an extensive evaluation based on this observation
shows that a tag recommender interconnecting the semantic and lexical level
based on a theory of human categorization and integrating time-dependent
forgetting on the lexical level results in high accuracy predictions and
outperforms other well-established algorithms, such as Collaborative
Filtering, Pairwise Interaction Tensor Factorization, FolkRank and two
alternative time-dependent approaches. We conclude that tag recommenders will
benefit from going beyond the manifest level of word co-occurrences, and from
including forgetting processes on the lexical level.
Keywords: personalized tag recommendations; time-dependent recommender
systems; Latent Dirichlet Allocation; LDA; human categorization; human memory
model; BibSonomy; CiteULike; Flickr
1 Introduction
Many interactive systems are designed to mimic human behavior and thinking.
A telling example of this is intelligent tutoring systems, which make
inferences similar to teachers when drawing on knowledge of learning domains,
knowledge about the learners and knowledge about effective teaching
strategies. When looking at recommender systems, Collaborative Filtering
approaches use information about socially related individuals to recommend
items, much in the same way as humans are influenced by related peers when
making choices. An implicit assumption behind this may be that interactive
systems will perform better the closer they correspond to human behavior.
This assumption seems to be reasonable, as it is humans that interact with
these systems, while these systems often also draw on data produced by humans
(e.g., in the case of Collaborative Filtering). Therefore, it can be assumed
that strategies that have evolved in humans over their individual or
collective development form good models for interactive systems. However, the
assumption that an interactive system will perform better the closer it
mimics human behavior has not often been tested directly.
In the current paper, we investigate this assumption in the context of a tag
recommender algorithm that borrows its basic architecture from MINERVA2
([1], see also [2]), a computational theory of human categorization. We draw
on research that has explored how human memory is used in a dynamic and
adaptive fashion to understand new information encountered in the environ-
ment. Sensemaking happens by dynamically forming ad-hoc categories that re-
late the new information with knowledge stored in the semantic memory (e.g.,
[3]). For instance, when reading an article about “personalized recommenda-
tions”, a novice has to figure out meaningful connections between previously
distinct topics such as “cognition” and “information retrieval” and hence, has
to start developing an ad-hoc category about common features of both of them.
When using a social tagging system in such a situation, people apply labels to
their own resources, which to some extent externalizes this process of
spontaneously generating ad-hoc categories [4]. Usually, a user describes a
particular bookmark by a combination of about three to five tags verbalizing
and associating aspects of different topics (e.g., “memory”, “retrieval”,
“recommendations”, “collaborative filtering”).
In previous work, we have shown that this behavior can be well described by
differentiating between two separate forms of information processing. In human
memory we find a semantic process that generates and retrieves topics or gist
traces, and a verbal process that generates verbatim word forms to describe
the topics [5]. In this paper, we improve this model by emphasizing another
fundamental principle of human cognition. According to Polyn et al. [6], memory
traces including recently activated features contribute more strongly to retrieval
than traces including features that have not been activated for a longer period of
time. This relationship provides a natural account of what is called the recency
effect in memory psychology (e.g., [7]). Obviously, things that happened a longer
time ago tend to be forgotten and influence our current behavior less than things
that have happened recently.
The purpose of this paper is twofold. First, we study the interaction between
the effect of recency and the level of knowledge representation in human memory
(semantic vs. verbal) within a social tagging system. In particular, we raise the
question whether the impact of recency interacts with the level of knowledge
representation, i.e., whether a time-dependent shift in the use of topics can
be dissociated from a time-dependent shift in the use of particular tags. The
second aim is to investigate to what extent our tag recommender based on
MINERVA2 can be improved by integrating a time-dependent forgetting process.
We also determine the performance of this recommender compared to other well-
established tag recommender algorithms (e.g., Collaborative Filtering, FolkRank
and Pairwise Interaction Tensor Factorization), as well as two alternative time-
dependent approaches called GIRPTM [8] and BLL+C [9] (based on the ACT-
R theory of human memory [10]). Hence, we raise the following two research
questions:
RQ1 : Is there a difference between the time-dependent shift in the use of
topics and the time-dependent shift in the use of particular tags?
RQ2 : Can a time-dependent forgetting process be integrated into a tag-
recommender to create an efficient algorithm in comparison to the state-of-
the-art?
The remainder of this paper tackles these two research questions and is
organized as follows: We begin by discussing related work in the field of tag
recommenders in Section 2. Next, we review some of the work concerning recency
in memory research and its current use in social tagging in Section 3 (first re-
search question). Then we describe our approach and the experimental setup of
our extensive evaluation in Sections 4 and 5. Section 6 presents the results of
this evaluation in terms of recommender quality (second research question). We
conclude the paper by discussing our findings and future work in Section 7.
2 Related Work
Tagging, an important feature of the Social Web, has been shown to improve
search considerably [11, 12] and has provided users with a simple tool to
collaboratively organize and annotate content [13]. However, despite the
potential advantages of tag usage, people do not tend to provide tags
thoroughly or regularly. Thus, from an applied perspective, one important
purpose of tag recommendations is to increase users’ motivation to provide
appropriate tags for their bookmarked resources.
In contrast to previously developed and typically data-driven tag recom-
mender approaches, our research explores the suitability of psychologically sound
memory processes to improve tag recommender approaches. Previously, in [5, 9]
we presented two simple methods (3L and BLL+C) that aim to explain memory
processes in social tagging systems. Based on our previous research and other
ideas from related work, we introduce in this work a novel time-based tag
recommender algorithm (3LTtag) based on the MINERVA2 theory of human
categorization [1, 2] that significantly outperforms popular state-of-the-art algo-
rithms as well as BLL+C [9], an alternative time-based approach based on the
ACT-R theory of human memory [10]. It models the activation of elements in
a person’s declarative memory by considering frequency and recency of a user’s
tagging history as well as semantic context.
To date, two types of tag-recommender approaches have been established:
graph-based and content-based tag recommender systems [14]. In this work, we
focus on graph-based approaches. Prominent algorithms in this respect can be
found, for instance, in the work of Hotho et al. [15], who introduced FolkRank
(FR), which has established itself as the most prominent benchmarking tag
recommender approach over the past few years. A further line of research has
investigated the recommendation of tags to users in a personalized manner.
Within this research strand, Jäschke et al. [16] and Hamouda & Wanas [17]
presented a set of Collaborative Filtering (CF) approaches. More recently,
Rendle et al. [18], Krestel et al. [19] and Rawashdeh et al. [20] presented a
factorization model (FM and PITF), a semantic model (based on LDA) and a link
prediction model to recommend tags to users, respectively (see also Section 5.3).
Compared with simple “most popular tags” approaches, these algorithms have
two notable disadvantages: their computational expense and their failure to
consider recent observations made in social tagging systems, such as the
variation of individual tagging behavior over time [21]. To that end, recent
research has made first promising steps towards more accurate graph-based
models that also account for the variable of time [22, 8].
However, although these time-dependent approaches have been shown to
outperform some of the current state-of-the-art tag recommender algorithms,
all of them ignore well-established and long-standing research from cognitive
psychology on how humans process information. Therefore, we try to fill this
gap by investigating tagging mechanisms that aim to mimic people’s tagging
behavior.
3 Recency in Memory and in the Use of Social Tagging
In previous work, we introduced 3Layers [5], a model for recommending tags
that is inspired by cognitive-psychological research on categorizing and
verbalizing objects (e.g., [4]) and is adapted in this work based on MINERVA2
in order to answer our two research questions. 3Layers consists of an input, a
hidden and an output layer, where the hidden layer is built up by a semantic
and an interconnected lexical matrix. The semantic matrix stores the topics of
all bookmarks in the user’s personomy^5, calculated with Latent Dirichlet
Allocation (LDA) [19], while the lexical matrix stores the tags of those
bookmarks. In a first step of calculation, the LDA topics of a new bookmark,
for which appropriate tags should be recommended, are represented in the input
layer and compared with the semantic matrix of the hidden layer. In the course
of this comparison, semantically relevant bookmarks of the user’s personomy
become activated. The resulting pattern of activation across the semantic
matrix is then applied to the lexical matrix to further activate and recommend
those tags that belong to relevant bookmarks. In a final step, the activation
pattern across the lexical matrix is summarized on the output layer in the
form of a vector. This vector represents a tag distribution that can be used
to predict a substantial amount of variance in the user’s tagging behavior
when creating a new bookmark.
^5 We define a bookmark (also known as “post”) as the set of tags a target
user has assigned to a target resource at a specific time, and the personomy
as the collection of all bookmarks of a user.
We draw on Fuzzy Trace Theory (FTT; e.g., [23]) to make a prediction with
respect to our first research question about a potentially differential impact of
recency on semantic and lexical representations, i.e., on the usage of topics and
tags, respectively. FTT differentiates between two distinct memory traces, a
gist trace and a verbatim trace, which represent general semantic information
of e.g., a read sentence and the sentence’s exact wording, respectively. These
two types of memory traces share properties with our distinction between a
semantic and a lexical matrix (see also Section 4). While vectors of the semantic
matrix provide a formal account of each bookmark’s gist (its general semantic
content), vectors of the lexical matrix correspond to a bookmark’s verbatim
trace (explicit verbal information in the form of assigned tags). This distinction is
also in line with Kintsch & Mangalath [24] who model gist traces of words by
means of LDA topic vectors and explicit traces of words by means of word co-
occurrence vectors. An empirically well-established assumption of FTT is that
verbatim traces are much more prone to time-dependent forgetting than gist
traces (e.g., [23]): while people tend to forget the exact wording, usually they
can remember the gist of a sentence (or a bookmark). Taken together, we derived
the hypothesis that a user’s verbatim traces (vectors in the lexical matrix that
encode the user’s tags) are more strongly affected by time-dependent forgetting
and therefore more variable over time than a user’s gist traces (vectors in the
semantic matrix that contain topics).
To test this hypothesis, we performed an empirical analysis in BibSonomy,
CiteULike and Flickr (see Section 5.1). The topics for the resources of these
datasets’ bookmarks were calculated using Latent Dirichlet Allocation (LDA)
[19] (see Section 4.2) based on 100, 500 and 1000 latent topics in order to cover
different levels of topic specialization (these numbers of latent topics are also
suggested by related work in the field [24,25]). For each user we selected the
most recent bookmark (i.e., the one from the test set with the most recent
timestamp, see also Section 5.2) and described the bookmark by means of two
vectors: one encoding the bookmark’s LDA topic pattern (gist vector) and one
encoding the tags assigned by the user (verbatim vector). Then, we searched
for all the remaining bookmarks of the same user, described each of them by
means of the two vectors and arranged them in a chronologically descending
order. Next, we compared the gist and the verbatim vector of the most recent
bookmark with the two corresponding vectors of all bookmarks in the user’s past
by means of the cosine similarity measure.
Fig. 1. Interaction between time-dependent forgetting and level of knowledge
representation for (a) BibSonomy, (b) CiteULike and (c) Flickr, showing a more
pronounced decline for tags than for topics (100, 500, 1000 LDA topics; first
research question).
The obtained results are represented in the three diagrams of Figure 1, plot-
ting the average cosine similarities over all users against the time lags (given in
number of past bookmarks). For all three datasets we show these results for the
last 100 bookmarks of tagging activity per user because in this range, there are
enough users available for each past bookmark to calculate mean values reliably.
The diagrams quite clearly reveal that – independent of the environment (Bib-
Sonomy, CiteULike or Flickr) – the similarity between the most recent bookmark
and all other bookmarks decreases monotonically as a function of time lag. More
importantly and as expected, the time-dependent decline is more strongly pro-
nounced for the verbatim vectors (encoding tag assignments) in contrast to the
gist vectors (encoding LDA topics). Furthermore, we can see that the more LDA
topics we use, the more similar is the time-dependent decline of the two vectors
(tags vs. topics) to each other.
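The analysis described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the study: the function names, the `(timestamp, vector)` input layout and the handling of all-zero vectors (similarity 0) are choices made here for the example.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if one is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: undefined similarity is treated as 0
    return dot / (norm_a * norm_b)

def similarity_by_lag(bookmarks):
    """Compare a user's most recent bookmark with all earlier ones.

    bookmarks: list of (timestamp, vector) pairs, where the vector is either
    a gist vector (LDA topics) or a verbatim vector (tags).
    Returns one similarity per time lag, starting with the previous bookmark.
    """
    ordered = sorted(bookmarks, key=lambda b: b[0], reverse=True)
    if not ordered:
        return []
    newest = ordered[0][1]
    return [cosine(newest, vector) for _, vector in ordered[1:]]

def mean_similarity_per_lag(per_user_similarities, max_lag=100):
    """Average the per-user curves lag by lag (only users that reach the lag)."""
    means = []
    for lag in range(max_lag):
        values = [s[lag] for s in per_user_similarities if len(s) > lag]
        if not values:
            break
        means.append(sum(values) / len(values))
    return means
```

Running `similarity_by_lag` once with gist vectors and once with verbatim vectors per user, and averaging both with `mean_similarity_per_lag`, yields two curves of the kind compared in Figure 1.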
4 Approach
In this section, we introduce two novel time-dependent tag recommender
algorithms which model the process of forgetting on a semantic and lexical
layer in a time-dependent manner. Moreover, we describe how we created the
semantic features (i.e., topics) for the bookmarks in our datasets using
Latent Dirichlet Allocation (LDA).
4.1 Tag-Recommender Algorithms
Based on the findings presented in the previous section, we assume that the
factor of time plays a more critical role on the lexical layer than on the semantic
layer. The approaches implemented in this section are based on a preliminary
recommender model called 3Layers (3L) that was introduced in our previous
work [5].
Figure 2 schematically shows how 3Layers (3L) represents a user’s personomy
within the hidden layer, which interconnects a semantic matrix, $M_S$ (an
$l$ bookmarks $\times$ $n$ LDA topics matrix), and a lexical matrix, $M_L$ (an
$l$ bookmarks $\times$ $m$ tags matrix). Thus, each bookmark of the user is
represented by two associated vectors: a vector of LDA topics $S_{i,k}$ stored
in $M_S$ and a vector of tags $L_{i,j}$ stored in $M_L$. Similar to [2], we
borrow a mechanism from MINERVA2, a computational theory of human
categorization [1], to process the network constituted by the input, hidden
and output layer. First, the LDA topics of the target resource to be tagged
are represented on the input layer in the form of a vector $P$ with $n$
features. Then, $P$ is used as a cue to activate each bookmark ($B_i$) in
$M_S$ depending on the similarity ($Sim_i$) between both vectors, i.e., $P$
and $B_i$. Similar to [2], we estimate $Sim_i$ by calculating the cosine
between the two vectors:

$$Sim_i = \frac{\sum_{k=1}^{n} (P_k \cdot S_{i,k})}{\sqrt{\sum_{k=1}^{n} P_k^2} \cdot \sqrt{\sum_{k=1}^{n} S_{i,k}^2}} \qquad (1)$$
[Figure 2: input layer (vector $P$ with the LDA topics of the new bookmark),
hidden layer (semantic matrix $M_S$ and lexical matrix $M_L$ over the
bookmarks $B_1, \dots, B_l$ with activations $A_1, \dots, A_l$) and output
layer (tag activations $c_1, \dots, c_m$).]
Fig. 2. Schematic illustration of 3L showing the connections between the
semantic matrix ($M_S$) encoding the LDA topics and the lexical matrix ($M_L$)
encoding the tags. Furthermore, $T_{topic}$ and $T_{tag}$ schematically
demonstrate how the time component is integrated in case of 3LTtopic and
3LTtag, respectively.
If no topics are available for the target resource (i.e., $n = 0$), we set
$Sim_i$ to 1 and thus activate each bookmark with the same value. To transform
the resulting similarity values into activation values ($A_i$) and to further
reduce the influence of bookmarks with low similarities, $Sim_i$ is raised to
the power of 3, i.e., $A_i = Sim_i^3$ (see also [1]). Next, these activation
values are propagated to $M_L$ to activate tags that are associated with
highly activated bookmarks on the semantic matrix $M_S$ (circled numbers 2 and
3 in Figure 2). This is computed by the following equation that yields an
activation value $c_j$ for each of the $m$ tags on the output layer:

$$c_j = \underbrace{\sum_{i=1}^{l} (L_{i,j} \cdot A_i)}_{3L} \qquad (2)$$
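Equations 1 and 2 can be illustrated with a short Python sketch. It is a minimal reading of the two formulas, not the TagRec implementation: the function names and the list-of-lists matrix layout are assumptions, as is treating an all-zero vector as similarity 0.

```python
import math

def cosine(p, s):
    """Eq. 1: cosine between the cue P and one row of the semantic matrix."""
    dot = sum(pk * sk for pk, sk in zip(p, s))
    norm_p = math.sqrt(sum(pk * pk for pk in p))
    norm_s = math.sqrt(sum(sk * sk for sk in s))
    if norm_p == 0.0 or norm_s == 0.0:
        return 0.0  # assumption: no overlap if one vector is all zeros
    return dot / (norm_p * norm_s)

def three_layers(topic_cue, semantic_matrix, lexical_matrix):
    """Eq. 2: tag activations c_j for one user.

    topic_cue:       list of n topic weights of the target resource (P)
    semantic_matrix: l rows of n topic weights (M_S)
    lexical_matrix:  l rows of m tag indicators (M_L)
    """
    m = len(lexical_matrix[0]) if lexical_matrix else 0
    c = [0.0] * m
    for s_i, l_i in zip(semantic_matrix, lexical_matrix):
        # n = 0 (no topics for the resource): every bookmark gets Sim_i = 1.
        sim_i = 1.0 if not topic_cue else cosine(topic_cue, s_i)
        a_i = sim_i ** 3  # A_i = Sim_i^3 dampens weakly matching bookmarks
        for j, l_ij in enumerate(l_i):
            c[j] += l_ij * a_i
    return c
```

For a cue that matches only the first of two bookmarks, only the tags of that bookmark become active; with an empty cue, every bookmark contributes equally, so the result reduces to the user's tag frequencies.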
To finally compute 3LTtopic and 3LTtag, we integrate a time component on the
level of topics (hereinafter called $T_{topic}$) and on the level of tags
($T_{tag}$), respectively. Both recency components are calculated by the
following equation that is based on the base-level learning (BLL) equation [7]:

$$BLL(t) = \ln\left((tmstp_{ref} - tmstp_t)^{-d}\right) \qquad (3)$$

where $tmstp_{ref}$ is the timestamp of the most recent bookmark of the user
and $tmstp_t$ is the timestamp of the last occurrence of $t$, encoded as the
topic in the case of $T_{topic}$ or as the tag in the case of $T_{tag}$, in
the user’s bookmarks. The exponent $d$ accounts for the power-law of
forgetting and was set to .5 as suggested by Anderson et al. [10]. While
3LTtopic can be computed by using equation 4, 3LTtag can be computed by using
equation 5:
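$$c_j = \underbrace{\sum_{i=1}^{l} \Big(L_{i,j} \cdot \sum_{k=1}^{n} (S_{i,k} \cdot BLL(k)) \cdot A_i\Big)}_{3LT_{topic}} \qquad (4)$$

$$c_j = \underbrace{\sum_{i=1}^{l} (L_{i,j} \cdot BLL(j) \cdot A_i)}_{3LT_{tag}} \qquad (5)$$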
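The three equations above can be read as the following Python sketch. It assumes that the exponent in the BLL equation is negative (power-law decay with d = .5) and that the activations A_i have already been computed; the function names are hypothetical. The text does not say how a time difference of zero is handled, so the sketch simply rejects it.

```python
import math

def bll(ref_timestamp, last_timestamp, d=0.5):
    """Eq. 3: ln((tmstp_ref - tmstp_t)^(-d)); needs a positive time difference."""
    delta = ref_timestamp - last_timestamp
    if delta <= 0:
        # Assumption: the zero-difference case is not defined in the text.
        raise ValueError("last occurrence must precede the reference timestamp")
    return math.log(delta ** -d)

def three_lt_tag(activations, lexical_matrix, tag_bll):
    """Eq. 5: weight every tag j by its own recency value BLL(j)."""
    return [
        sum(l_i[j] * tag_bll[j] * a_i
            for l_i, a_i in zip(lexical_matrix, activations))
        for j in range(len(tag_bll))
    ]

def three_lt_topic(activations, semantic_matrix, lexical_matrix, topic_bll):
    """Eq. 4: weight every bookmark by the recency of the topics it contains."""
    m = len(lexical_matrix[0]) if lexical_matrix else 0
    c = [0.0] * m
    for s_i, l_i, a_i in zip(semantic_matrix, lexical_matrix, activations):
        topic_weight = sum(s_ik * b_k for s_ik, b_k in zip(s_i, topic_bll))
        for j, l_ij in enumerate(l_i):
            c[j] += l_ij * topic_weight * a_i
    return c
```

Because a larger time difference gives a smaller (more negative) BLL value, tags or topics that were last used long ago contribute less to the final activation than recently used ones.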
As suggested in related work [26, 14, 9], we additionally consider tags that
have been applied to the target resource by other users. This allows the
recommendation of new tags, i.e., tags that have not been used by the target
user before. We implement this by taking into account the most popular tags in
the tag assignments of the resource $Y_r$ ($MP_r$, i.e.,
$\arg\max_{t \in T}(|Y_r|)$) [15]. We have chosen $MP_r$ over other methods
like CF, as previous work [27, 28] shows that users in social tagging systems
are more likely to imitate tags previously assigned by other users to a target
resource. In order to combine $c_j$ with $MP_r$, the following normalization
method was used:

$$\|c_j\| = \frac{\exp(c_j)}{\sum_{i=1}^{m} \exp(c_i)} \qquad (6)$$
Taken together, the list of recommended tags for a given user $u$ and resource
$r$ is then calculated as

$$\tilde{T}(u, r) = \arg\max_{j \in T} \left(\beta \|c_j\| + (1 - \beta) \||Y_r|\|\right) \qquad (7)$$

where $\beta$ is used to inversely weight the two components. The results
presented in Section 6 were calculated using $\beta = .5$, thus applying the
same weight to both components.
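A possible reading of Equations 6 and 7 is sketched below. Several details are assumptions made for the example: the resource counts are normalized with the same exponential normalization as the user activations, a tag missing from one component contributes 0 there, and ties are broken alphabetically. The function names are hypothetical.

```python
import math

def softmax(scores):
    """Eq. 6: exp(c_j) / sum_i exp(c_i) for a dict mapping tag -> raw score."""
    if not scores:
        return {}
    peak = max(scores.values())  # shifting by the maximum avoids overflow
    exps = {tag: math.exp(value - peak) for tag, value in scores.items()}
    total = sum(exps.values())
    return {tag: e / total for tag, e in exps.items()}

def recommend(user_activations, resource_tag_counts, beta=0.5, k=10):
    """Eq. 7: rank tags by the weighted sum of both normalized components.

    user_activations:    tag -> c_j from 3L, 3LTtopic or 3LTtag
    resource_tag_counts: tag -> how often the tag was assigned to the resource
    """
    user_part = softmax(user_activations)
    resource_part = softmax({t: float(n) for t, n in resource_tag_counts.items()})
    combined = {
        tag: beta * user_part.get(tag, 0.0)
             + (1.0 - beta) * resource_part.get(tag, 0.0)
        for tag in set(user_part) | set(resource_part)
    }
    # Highest score first; alphabetical order keeps ties deterministic.
    return sorted(combined, key=lambda tag: (-combined[tag], tag))[:k]
```

With `beta=0.5` a tag such as "recsys" that the user has never applied can still reach the top of the list when it dominates the resource's tag assignments, which is the purpose of the resource component.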
4.2 Topic Generation via LDA
As outlined in Section 3, we used LDA to calculate the semantic features (i.e.,
topics) of the resources of the full datasets. LDA is a probability model that
helps to find latent topics for documents where each topic is described by words
in these documents [19]. This can be formalized as follows:
$$P(t_i \mid d) = \sum_{j=1}^{Z} \left(P(t_i \mid z_i = j) \cdot P(z_i = j \mid d)\right) \qquad (8)$$
Table 1. Properties of the used dataset samples, where |B| is the number of
bookmarks, |U| the number of users, |R| the number of resources, |T| the
number of tags and |TAS| the number of tag assignments.

Dataset      |B|       |U|     |R|       |T|       |TAS|
BibSonomy    400,983   5,488   346,444   103,503   1,479,970
CiteULike    379,068   8,322   352,343   138,091   1,751,347
Flickr       864,679   9,590   864,679   127,599   3,552,540
Here $P(t_i \mid d)$ is the probability of the $i$th word for a document $d$
and $P(t_i \mid z_i = j)$ is the probability of $t_i$ within the topic $z_i$.
$P(z_i = j \mid d)$ is the probability of using a word from topic $z_i$ in the
document. The number of latent topics $Z$ is determined in advance and defines
the level of granularity. We calculated the semantic features for our datasets
with different numbers of LDA topics (100, 500 and 1000; see also [24, 25]).
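As a small worked example of Equation 8, the word probability for a document is the topic-weighted mixture of the per-topic word probabilities. The function name and the toy numbers below are purely illustrative and do not come from the datasets.

```python
def word_probability(word, topic_word_probs, doc_topic_probs):
    """Eq. 8: P(t|d) = sum_j P(t | z = j) * P(z = j | d).

    topic_word_probs: one dict per topic, mapping word -> P(word | topic)
    doc_topic_probs:  list with P(topic | document), same order of topics
    """
    return sum(
        topic_word_probs[j].get(word, 0.0) * p_topic
        for j, p_topic in enumerate(doc_topic_probs)
    )

# Toy model with Z = 2 topics (illustrative values only).
topics = [
    {"memory": 0.6, "recall": 0.4},
    {"retrieval": 0.7, "memory": 0.3},
]
document = [0.25, 0.75]  # the document draws mostly on the second topic
p_memory = word_probability("memory", topics, document)  # 0.6*0.25 + 0.3*0.75
```

In the tagging setting described next, the "words" are tags and the "documents" are resources.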
When using LDA in tagging environments, documents are resources which are
described by tags. This means that based on the tag vectors of the resources (i.e.,
all the tags the users have assigned to the resource), resources in the bookmarks
can also be represented with the topics identified by LDA. These topics were
then used as features in the semantic matrix $M_S$. We implemented LDA with
Gibbs sampling using the Java framework Mallet^6.
5 Experimental Setup
In this section we describe our experiment’s datasets, evaluation methodology
and the baseline algorithms in detail.
5.1 Datasets
To conduct our study, we used three well-known folksonomy datasets that are
freely available for scientific purposes and thus allow for reproducibility.
In this respect, we utilized datasets from the social bookmark and publication
sharing system BibSonomy^7 (2013-07-01), the reference management system
CiteULike^8 (2013-03-10) and the image sharing platform Flickr^9 (2010-01-07)
to evaluate our approach on both types of folksonomies, broad (BibSonomy and
CiteULike; all users are allowed to annotate a particular resource) and narrow
(Flickr; only the user who has uploaded a resource is allowed to tag it) ones
[29]. We furthermore excluded all automatically generated tags from the
datasets (e.g., no-tag, bibtex-import, etc.) and decapitalized all tags as
suggested in related work (e.g., [18]). To reduce computational effort, we
randomly selected 10% of CiteULike and 3% of Flickr user profiles (see also
[30])^10. We did not apply p-core pruning, in order to keep the original
bookmarks of the users and thus to prevent a biased evaluation [31]. The
statistics of our dataset samples can be found in Table 1.
^6 http://mallet.cs.umass.edu/topics.php
^7 http://www.kde.cs.uni-kassel.de/bibsonomy/dumps/
^8 http://www.citeulike.org/faq/data.adp
^9 http://www.tagora-project.eu/data/#flickrphotos
5.2 Evaluation Methodology
To evaluate our tag recommender approaches, we split the three datasets into
training and test sets based on a leave-one-out hold-out method as proposed
in related work (e.g., [16]). Hence, for each user we selected her most recent
bookmark (or post) in time and put it into the test set. The remaining bookmarks
were then used for the training of the algorithms. This procedure is a promising
simulation of a real-world environment, as it predicts a user’s future tagging
behavior based on tagging behavior in the past. Furthermore, it is a standard
practice for evaluation of time-based recommender systems [32].
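The split can be sketched as follows. The tuple layout `(user, resource, timestamp, tags)` and the function name are assumptions for the example; ties on the timestamp are resolved in favor of the bookmark that appears first, and a user with a single bookmark ends up in the test set only, a case the text does not discuss.

```python
def split_most_recent(bookmarks):
    """Put each user's most recent bookmark into the test set.

    bookmarks: list of (user, resource, timestamp, tags) tuples.
    Returns (train, test), both in the original order of the input.
    """
    newest_index = {}
    for index, (user, _resource, timestamp, _tags) in enumerate(bookmarks):
        current = newest_index.get(user)
        if current is None or timestamp > bookmarks[current][2]:
            newest_index[user] = index
    test_indices = set(newest_index.values())
    train = [bm for i, bm in enumerate(bookmarks) if i not in test_indices]
    test = [bookmarks[i] for i in sorted(test_indices)]
    return train, test
```

The algorithms are then fitted on `train` only and asked to predict the tags of the held-out bookmark of each user in `test`.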
In order to quantify the recommender quality and to benchmark our recommender
against other tag recommendation approaches, a set of well-known metrics in
information retrieval and recommender systems was used [16, 14]:
Recall (R) is defined as the number of recommended tags that are relevant
for the target user/resource divided by the total number of relevant tags [33]:

$$R@k = \frac{1}{|U|} \sum_{u \in U} \left( \frac{|t_u^k \cap T_u|}{|T_u|} \right) \qquad (9)$$

where $t_u^k$ denotes the top $k$ recommended tags and $T_u$ the list of
relevant tags of a bookmark of user $u \in U$.
Precision (P) is calculated as the number of correctly recommended tags
divided by the total number of recommended tags $|t_u^k|$ (= $k$) [33]:

$$P@k = \frac{1}{|U|} \sum_{u \in U} \left( \frac{|t_u^k \cap T_u|}{|t_u^k|} \right) \qquad (10)$$
F1-score (F1) is a combination of the recall and precision metrics and is
calculated using the following equation [33]:

$$F1@k = \frac{1}{|U|} \sum_{u \in U} \left( 2 \cdot \frac{P@k \cdot R@k}{P@k + R@k} \right) \qquad (11)$$
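The three set-based metrics can be computed per user and then averaged, as in this sketch. The function names and the dict-based input are assumptions; when fewer than k tags are recommended, the sketch divides by the number of tags actually returned, and undefined ratios are reported as 0.

```python
def precision_recall_f1(recommended, relevant, k):
    """Per-user terms of Eqs. 9-11 for the top-k recommended tags."""
    top_k = recommended[:k]
    hits = len(set(top_k) & set(relevant))
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1

def evaluate_at_k(recommendations, ground_truth, k):
    """Average the per-user values over all users in ground_truth.

    recommendations: user -> ranked list of recommended tags
    ground_truth:    user -> tags of the held-out bookmark
    """
    totals = [0.0, 0.0, 0.0]
    for user, relevant in ground_truth.items():
        scores = precision_recall_f1(recommendations.get(user, []), relevant, k)
        totals = [t + s for t, s in zip(totals, scores)]
    n = len(ground_truth)
    return tuple(t / n for t in totals) if n else (0.0, 0.0, 0.0)
```

For example, five recommended tags of which two match a bookmark with three relevant tags give a precision of 0.4, a recall of about 0.67 and an F1-score of 0.5.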
Mean reciprocal rank (MRR) is a rank-dependent evaluation metric that
is calculated as the sum of the reciprocal ranks (or positions) of all relevant
tags in the list of recommended tags [20]:

$$MRR = \frac{1}{|U|} \sum_{u=1}^{|U|} \left( \frac{1}{|T_u|} \sum_{t \in T_u} \frac{1}{rank(t)} \right) \qquad (12)$$
This way, a recommender achieves a higher MRR if relevant tags occur at early
positions in the list of recommended tags.
^10 Note: We used the same dataset samples as in our previous work [9], except
for CiteULike, where we used a smaller sample for reasons of computational
effort with respect to the calculation of the LDA topics.
Mean average precision (MAP) extends the precision metric and also
considers the order of the recommended tags. This is done by computing the
precision value at every position $k$ of the ranked list of tags and using the
average of these values [20]:

$$MAP = \frac{1}{|U|} \sum_{u=1}^{|U|} \left( \frac{1}{|T_u|} \sum_{k=1}^{|t_u^k|} (B_k \cdot P@k) \right) \qquad (13)$$

where $B_k$ is 1 if the tag at position $k$ of the list of recommended tags is
correct, and 0 otherwise.
In particular, we report R@k, P@k, MRR and MAP for $k = 10$ and F1-score
(F1@k) for $k = 5$ recommended tags.
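The two rank-dependent metrics can be sketched per user as follows; averaging the returned values over all users gives MRR and MAP. The function names are hypothetical, the recommended list is assumed to contain no duplicates, and a relevant tag that was not recommended at all is assumed to contribute 0, which the equations leave open.

```python
def reciprocal_rank_score(recommended, relevant):
    """Inner term of Eq. 12: mean reciprocal rank of the relevant tags."""
    relevant_set = set(relevant)
    if not relevant_set:
        return 0.0
    rank = {tag: position for position, tag in enumerate(recommended, start=1)}
    # Assumption: relevant tags missing from the list add nothing to the sum.
    return sum(1.0 / rank[t] for t in relevant_set if t in rank) / len(relevant_set)

def average_precision(recommended, relevant):
    """Inner term of Eq. 13: sum of B_k * P@k, divided by |T_u|."""
    relevant_set = set(relevant)
    if not relevant_set:
        return 0.0
    hits = 0
    total = 0.0
    for k, tag in enumerate(recommended, start=1):
        if tag in relevant_set:  # B_k = 1
            hits += 1
            total += hits / k    # P@k at this position
    return total / len(relevant_set)

def mean_over_users(metric, recommendations, ground_truth):
    """Average a per-user metric over all users in ground_truth."""
    if not ground_truth:
        return 0.0
    return sum(
        metric(recommendations.get(user, []), relevant)
        for user, relevant in ground_truth.items()
    ) / len(ground_truth)
```

For the ranked list `["a", "b", "c"]` with relevant tags `a` and `c`, the reciprocal rank score is (1 + 1/3) / 2 and the average precision is (1 + 2/3) / 2, so both reward the correct tag in first position.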
5.3 Baseline Algorithms
We compared the results of our approach to several “baseline” tag recommender
algorithms. The algorithms were selected with respect to their popularity in
the community, performance and novelty [34]. The most basic approach we
utilized is the unpersonalized MostPopular (MP) algorithm. MP recommends,
independent of user and resource, the same set of tags, weighted by their
frequency over all tag assignments [16]. A personalized extension of MP is the
MostPopular$_{u,r}$ ($MP_{u,r}$) algorithm that suggests the most frequent
tags in the tag assignments of the user ($MP_u$) and the resource ($MP_r$)
[16]. As done in our approaches, we weighted the user and the resource
components equally ($\beta = .5$).
Another well-known recommender approach is Collaborative Filtering (CF),
which was adapted for tag recommendations by Marinho et al. [34]. Here the
neighborhood of a user is formed based on the tag assignments in the user
profile, and the only variable parameter is the number of users $k$ in this
neighborhood. $k$ has been set to 20 as suggested by Gemmell et al. [30]. In
Section 4.2 we have described how we applied Latent Dirichlet Allocation (LDA)
for tag recommendations. The results presented in this work have been
calculated using $Z = 1000$ latent topics [19].
An additional approach we utilized is the well-known FolkRank (FR) algorithm,
which is an improvement of the Adapted PageRank (APR) approach [16]. FR
extends the PageRank algorithm in order to rank the nodes within the graph
structure of a folksonomy based on their importance in the network [16]. Our
implementation of APR and FR builds upon the code and the settings of the
open-source Java tag recommender framework provided by the University of
Kassel^11. In this implementation, the parameter $d$ is set to .7 and the
maximum number of iterations $l$ is set to 10.
A different popular and recent tag recommender mechanism is Pairwise
Interaction Tensor Factorization (PITF) proposed by Rendle & Schmidt-Thieme
[18]. It is an extension of Factorization Machines (FM) and explicitly models
pairwise interactions between users, resources and tags. The FM and PITF
results presented in this paper were calculated using the open-source C++ tag
recommender framework provided by the University of Konstanz^12. We set the
dimensions of factorization $k_U$, $k_R$ and $k_T$ to 256 and the number of
iterations $l$ to 50 as suggested in [18].
^11 http://www.kde.cs.uni-kassel.de/code
Finally, we benchmarked against two time-dependent approaches. The
first one is the GIRPTM algorithm presented by Zhang et al. [8] which is
based on the frequency and the temporal usage of a user’s tag assignments.
The approach models the temporal tag usage with an exponential distribution
based on the first- and last-time usage of the tags. The second time-dependent
tag-recommender approach is the Base-Level Learning Equation with Context
(BLL+C) algorithm introduced in our previous work [9]. BLL+C is based on
the ACT-R human memory theory by Anderson et al. [10] and uses a power-law
distribution based on all tag usages to mimic the time-dependent forgetting in
tag applications. In both approaches the resource component is modeled by a
simple most popular tags by resource mechanism, as it is also done in our 3Layers
approach. In previous work [9], we showed that BLL+C outperforms GIRPTM
and other well-established algorithms, such as FR, PITF and CF.
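The power-law forgetting used by BLL+C can be illustrated with the base-level learning equation of ACT-R, B = ln(Σ_j (t_ref − t_j)^(−d)), where t_j are the timestamps of a tag's past usages and t_ref is the time of the recommendation. The Python sketch below is a minimal illustration under stated assumptions: the exponent d = .5, the softmax-style normalization, the function names and the toy timestamps are chosen for demonstration and are not taken from the TagRec implementation.

```python
import math


def bll_activation(usage_timestamps, reference_time, d=0.5):
    """Base-level activation: log of the summed power-law decayed usages."""
    total = 0.0
    for ts in usage_timestamps:
        elapsed = reference_time - ts
        if elapsed <= 0:
            raise ValueError("usages must lie before the reference time")
        total += elapsed ** (-d)
    return math.log(total)


def bll_scores(tag_usages, reference_time, d=0.5):
    """Normalize activations so that the scores of all tags sum to one."""
    activations = {
        tag: bll_activation(timestamps, reference_time, d)
        for tag, timestamps in tag_usages.items()
    }
    norm = sum(math.exp(a) for a in activations.values())
    return {tag: math.exp(a) / norm for tag, a in activations.items()}


# Hypothetical usage history of one user (timestamps in arbitrary units).
tag_usages = {
    "recent": [990],
    "old": [10],
    "frequent_old": [10, 20, 30],
}
scores = bll_scores(tag_usages, reference_time=1000)
```

In this toy history a single recent usage outweighs three old ones, which is the recency effect the time-dependent baselines exploit; frequency still matters, since the tag used three times long ago scores higher than the tag used once long ago.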
The algorithms described in this section, along with our own approaches (see
Section 4), are implemented within our Java-based TagRec framework [35].
Published as open-source software, it can be downloaded from our GitHub
repository¹³ along with the test and training sets used herein (see Sections
5.1 and 5.2).
6 Results
In this section we present the evaluation of the two novel algorithms in line with
our research questions. In step 1, we compared the three 3Layers approaches (3L,
3LTtopic and 3LTtag) with one another in order to examine our first research
question of whether recency has a differential effect on topics and tags. Based
on the empirical analysis presented in Section 3, we expected 3LTtag to yield
more accurate predictions than 3LTtopic and 3L.
The results shown in Table 2 support this assumption: independent of the metric
(F1@5, MRR and MAP) and the number of LDA topics (100, 500 and 1000) applied,
the difference between 3LTtag and 3L is consistently larger than the one
between 3LTtopic and 3L. For Flickr with 100 topics, for example, MAP rises
from .560 (3L) to .571 with 3LTtopic but to .634 with 3LTtag. This allows us to
conclude that a user’s gist traces (LDA topics) associated with the user’s
bookmarks are less prone to “forgetting” than a user’s verbatim traces (tags
associated with the bookmarks). Interestingly, this effect is more strongly
pronounced under the narrow folksonomy condition (Flickr), where no tags of
other users are available for the target user’s resource, than under the broad
folksonomy condition (BibSonomy and CiteULike), where users could get inspired
by tags of other users.
Furthermore, Table 2 illustrates the performance of 3L, 3LTtopic and 3LTtag
for different numbers of LDA topics (100, 500 and 1000). It can be seen that
all three approaches provide good results for different levels of topic
specialization, with the best accuracy values reached, in most cases, for 1000
LDA topics¹⁴. The F1@5, MRR and MAP values calculated for 1000 topics are
further used within the second evaluation step, which is described in the next
paragraph.
¹² http://www.informatik.uni-konstanz.de/rendle/software/tag-recommender/
¹³ https://github.com/learning-layers/TagRec/
Table 2. F1@5, MRR and MAP values for BibSonomy, CiteULike and Flickr showing
the performance of 3L and its time-dependent extensions (3LTtopic and 3LTtag) for
100, 500 and 1000 LDA topics (first research question).
Dataset    # Topics  Measure  3L    3LTtopic  3LTtag
BibSonomy  100       F1@5     .197  .198      .204
                     MRR      .152  .154      .161
                     MAP      .201  .202      .212
           500       F1@5     .204  .205      .209
                     MRR      .156  .158      .163
                     MAP      .206  .208      .215
           1000      F1@5     .206  .207      .211
                     MRR      .157  .158      .162
                     MAP      .207  .208      .214
CiteULike  100       F1@5     .211  .212      .221
                     MRR      .192  .194      .211
                     MAP      .226  .228      .248
           500       F1@5     .218  .219      .225
                     MRR      .196  .198      .211
                     MAP      .232  .234      .250
           1000      F1@5     .232  .233      .238
                     MRR      .199  .200      .212
                     MAP      .235  .236      .250
Flickr     100       F1@5     .500  .507      .535
                     MRR      .421  .429      .476
                     MAP      .560  .571      .634
           500       F1@5     .564  .567      .582
                     MRR      .443  .448      .476
                     MAP      .591  .596      .635
           1000      F1@5     .568  .571      .585
                     MRR      .450  .454      .477
                     MAP      .599  .604      .636
In a second step, we compared the performance of our approaches, especially
3LTtag, with several state-of-the-art algorithms. By this means we address our
second research question of whether 3L and its two extensions can be
implemented in the form of effective and efficient tag recommendation
mechanisms. First, Table 3 reveals that all personalized recommendation
mechanisms clearly outperform the unpersonalized MP approach. This is not
surprising, as MP solely takes into account a tag’s usage frequency,
independent of information about a particular user or resource.
Second and more important, 3L and its two extensions (3LTtopic and 3LTtag)
reach significantly higher accuracy estimates than the well-established
mechanisms LDA, MPu,r, CF, APR, FR, FM and PITF. From this we conclude that
predicting tags based on psychologically plausible steps that turn a user’s gist
traces into words yields tag recommendations that correspond well to the
user’s tagging behavior.
¹⁴ NOTE: We also performed experiments with more than 1000 LDA topics (e.g.,
2000, 3000, ...). However, as also shown by related work (e.g., [19, 24, 25]),
this step did not help in increasing the performance of the LDA-based tag
recommenders.
Table 3. F1@5, MRR and MAP values for all the users in the datasets (BibSonomy,
CiteULike and Flickr) and for users with a minimum number of 20 bookmarks
(Bmin = 20), showing that our time-dependent 3LTtag approach outperforms current
state-of-the-art algorithms (second research question). The symbols ∗, ∗∗ and
∗∗∗ indicate statistically significant differences based on a Wilcoxon rank-sum
test between 3L, 3LTtopic, 3LTtag and BLL+C at α level .05, .01 and .001,
respectively; ◦, ◦◦ and ◦◦◦ indicate statistically significant differences
between our two time-dependent approaches 3LTtopic, 3LTtag and 3L at the same
α levels.
Bmin  Measure  MP    LDA   MPu   MPr   MPu,r  CF    APR   FR    FM    PITF  GIRPTM  BLL+C  3L       3LTtopic  3LTtag
BibSonomy
-     F1@5     .013  .097  .152  .074  .192   .166  .175  .171  .122  .139  .197    .201   .206     .207      .211
      MRR      .008  .083  .114  .054  .148   .133  .149  .148  .097  .120  .152    .158   .157     .158      .162
      MAP      .009  .101  .148  .070  .194   .173  .193  .194  .120  .150  .200    .207   .207     .208      .214
20    F1@5     .019  .142  .156  .078  .195   .204  .184  .197  .162  .163  .240    .249   .264     .269      .296∗∗
      MRR      .011  .129  .135  .059  .160   .175  .159  .171  .135  .137  .201    .216   .224     .227      .251∗∗
      MAP      .012  .152  .163  .074  .200   .219  .197  .214  .164  .166  .256    .275   .289     .291      .325∗∗
CiteULike
-     F1@5     .007  .068  .182  .033  .199   .157  .162  .160  .113  .130  .207    .215   .232     .233      .238∗∗
      MRR      .005  .065  .164  .024  .179   .168  .181  .181  .116  .149  .196    .205   .199     .200      .212
      MAP      .005  .073  .191  .029  .210   .196  .212  .212  .132  .169  .229    .241   .235     .236      .250
20    F1@5     .008  .145  .228  .031  .237   .228  .221  .225  .193  .196  .282    .298   .331     .334      .353∗∗∗
      MRR      .006  .144  .225  .022  .233   .271  .237  .239  .201  .210  .321    .335   .312     .316      .367∗∗◦◦◦
      MAP      .006  .162  .258  .028  .269   .308  .273  .276  .229  .237  .369    .389   .369     .373      .430∗∗◦◦◦
Flickr
-     F1@5     .023  .169  .435  -     .435   .417  .328  .334  .297  .316  .509    .523   .568∗∗∗  .571∗∗∗   .585∗∗∗
      MRR      .023  .171  .360  -     .360   .436  .352  .355  .300  .333  .445    .466   .450     .454      .477◦◦◦
      MAP      .023  .205  .468  -     .468   .581  .453  .459  .384  .426  .590    .619   .599     .604      .636◦◦◦
20    F1@5     .030  .190  .382  -     .382   .495  .322  .334  .309  .309  .534    .553   .610∗∗∗  .616∗∗∗   .643∗∗∗◦◦◦
      MRR      .028  .174  .322  -     .322   .473  .309  .317  .290  .289  .485    .508   .478     .485      .530∗∗◦◦◦
      MAP      .029  .215  .427  -     .427   .655  .405  .419  .378  .376  .664    .701   .661     .670      .732∗∗∗◦◦◦
Third, we can see that the two other time-dependent algorithms (GIRPTM and
BLL+C) also outperform the state-of-the-art approaches that do not take the
time component into account. BLL+C, based on ACT-R, even reaches slightly
higher estimates of accuracy than our 3L approach, based on MINERVA2, on the
ranking-based measures (MRR and MAP) in most settings. However, this relation
changes when we enhance 3L by the recency component at the level of tags. Then,
3LTtag clearly outperforms BLL+C with respect to all three metrics and across
all three datasets. Finally, as shown in Figure 3, a very similar pattern of
results becomes apparent when evaluating the different approaches by plotting
recall against precision for k = 1 to 10 recommended tags.
To further substantiate our assumption that memory processes play an important
role in social tagging systems, we also performed an experiment in which we
looked at users that have bookmarked a minimum of Bmin = 20 resources (see
also [36]). We conducted this experiment by applying a post-filtering method,
i.e., recommendations were still calculated on the whole folksonomy graph but
accuracy estimates were calculated only on the basis of the filtered user profiles
(780 users in the case of BibSonomy, 1,757 in the case of CiteULike and 4,420
for Flickr). The results of the experiment are also shown in Table 3. We can
observe that, in general, the accuracy estimates of all algorithms increase.
Furthermore, the results demonstrate that the difference between 3LTtag and the
other algorithms (including BLL+C) grows substantially larger the more user
“memory” (history) is used. These differences between 3LTtag and BLL+C as well
as between 3LTtag and 3L proved to be statistically significant based on a
Wilcoxon rank-sum test across all accuracy metrics (F1@5, MRR and MAP) and all
three datasets (see Table 3).
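The post-filtering step described above can be sketched as follows in Python: recommendations are computed for every user, and only the averaging of the accuracy estimates is restricted to users with at least Bmin bookmarks. This is a minimal sketch under stated assumptions. The F1@k helper uses a common textbook definition (precision over the returned list, recall over the relevant tags) and per-user averaging, which may differ in detail from the evaluation protocol of this paper; all names and the toy data are hypothetical.

```python
def f1_at_k(recommended, relevant, k=5):
    """Harmonic mean of precision and recall over the top-k recommended tags."""
    top = recommended[:k]
    hits = len(set(top) & set(relevant))
    if hits == 0:
        return 0.0
    precision = hits / len(top)
    recall = hits / len(set(relevant))
    return 2 * precision * recall / (precision + recall)


def post_filtered_f1(predictions, ground_truth, bookmarks_per_user, b_min=20, k=5):
    """Average F1@k over users with at least b_min bookmarks only."""
    eligible = [u for u in predictions if bookmarks_per_user.get(u, 0) >= b_min]
    if not eligible:
        return 0.0
    return sum(f1_at_k(predictions[u], ground_truth[u], k) for u in eligible) / len(eligible)


# Hypothetical predictions made on the full dataset for two users.
predictions = {"heavy": ["a", "b", "c", "d", "e"], "light": ["x"]}
ground_truth = {"heavy": ["a", "b"], "light": ["y"]}
bookmarks_per_user = {"heavy": 25, "light": 3}

f1_all = post_filtered_f1(predictions, ground_truth, bookmarks_per_user, b_min=0)
f1_filtered = post_filtered_f1(predictions, ground_truth, bookmarks_per_user, b_min=20)
```

Because the filter is applied after the recommendations have been calculated, the models still benefit from the data of all users; only the set of users over which accuracy is averaged changes.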
7 Discussion and Conclusion
In this study we have provided empirical evidence for an interaction between the
level of knowledge representation (semantic vs. lexical) and time-based forgetting
in the context of social tagging. Based on the analysis of three large-scale tagging
datasets (BibSonomy, CiteULike and Flickr), we conclude that, as expected,
the gist traces of a user’s personomy (the combination of LDA topics associated
with the bookmarks) are more stable over time than the verbatim traces (the
combination of associated tags). This pattern of results is well in accordance
with research on human memory (e.g., [23]) suggesting that while people tend
to forget surface details, they keep quite robust memory traces of the general
meaning underlying the experiences of the past (e.g., the meaning of read words).
The interaction effect suggests that it is worthwhile to consider both
time-based forgetting and the level of knowledge representation in social
tagging research. Moreover, the differential effect of forgetting on the two
levels of processing further substantiates the differences between tagging
behavior on a semantic level of gist traces and a lexical level of verbatim
traces [28]. This in turn is in line with cognitive research on social tagging
(e.g., [37]) that suggests considering a latent, semantic level (e.g., modeled
in the form of LDA topics) when trying to understand the variance in the
statistical patterns on the manifest level of users’ tagging behavior.
Fig. 3. Recall/Precision plots for all the users in the datasets (BibSonomy, CiteULike
and Flickr) and for users with a minimum number of 20 bookmarks (Bmin = 20),
showing the performance of the algorithms for 1 to 10 recommended tags (k).
Panels: (a) BibSonomy, (b) BibSonomy (Bmin = 20), (c) CiteULike, (d) CiteULike
(Bmin = 20), (e) Flickr, (f) Flickr (Bmin = 20).
Finally, we have gathered evidence for our assumption that interactive systems
can be improved by basing them on a thorough understanding of how humans process
information. We note in particular that integrating two fundamental principles
of human information processing, time-based forgetting and the differentiation
into semantic and lexical processing, enhances the accuracy of tag predictions
as compared to a situation in which only one of the principles is considered.
Our experiments showed that topics are more stable over time, which means that
they are, unlike tags, not as suitable to be modeled using the BLL equation but
can improve the results as an activation value on the basis of topic
similarities. Therefore, 3L, which is based on the MINERVA2 theory of human
categorization [1, 2], is enhanced by forgetting on the lexical level (3LTtag).
This approach significantly outperforms both the traditional 3L and other
well-established algorithms, such as CF, APR, FR, FM, PITF and the time-based
GIRPTM. Furthermore, 3LTtag also clearly reaches higher levels of accuracy
than BLL+C, to date the leading time-based tag recommender approach, which
is based on the ACT-R theory of human memory [10] and was introduced in our
previous work [9].
One limitation of this work is the calculation of semantic features (or topics)
of the resources using LDA, which is not only very time-consuming but could also
be biased because of the tag information it is based on. In this respect,
an interesting extension for future work would be to additionally conduct our
experiments using external topics of the resources (e.g., Wikipedia categories
as used in [5]). Looking at another aspect, our work has been inspired by the
human memory model ACT-R proposed by Anderson et al. [10], but so far only
investigates the first part of the equation, the recency component. Thus, it would
be very interesting to further extend our approach by additionally investigating
the associative component of the model. Also, as the computations were carried
out with fixed values of .5 for the exponent d (3) and the weight β (7), it would
be worth exploring alternative values.
Moreover, we plan to include our algorithms in an actual online social tagging
system (e.g., BibSonomy). Only in such a setting is it possible to test the
recommendation performance by looking at user acceptance. Because our approach is
theory-driven, it is rather straightforward to transfer it to recommendations in
other interactive systems and Web paradigms where semantic and lexical
processing plays a role (such as, for example, in Web curation). Thus, the
generalization to other paradigms is another important benefit of driving
recommender systems research by an understanding of human information processing
on the Web.
Acknowledgments This work is supported by the Know-Center, the EU-funded
projects Learning Layers (Grant Agreement 318209) and weSPOT (Grant
Agreement 318499) and the Austrian Science Fund (FWF): P 25593-G22. Moreover,
parts of this work were carried out during the tenure of an ERCIM “Alain
Bensoussan” fellowship programme.
References
1. Hintzman, D.L.: Minerva 2: A simulation model of human memory. Behavior
Research Methods, Instruments, & Computers 16 (1984) 96–101
2. Kwantes, P.J.: Using context to build semantics. Psychonomic Bulletin & Review
12 (2005) 703–710
3. Barsalou, L.: Situated simulation in the human conceptual system. Language and
Cognitive Processes 18 (2003) 513–562
4. Glushko, R.J., Maglio, P.P., Matlock, T., Barsalou, L.W.: Categorization in the
wild. Trends in Cognitive Sciences 12 (2008) 129–135
5. Seitlinger, P., Kowald, D., Trattner, C., Ley, T.: Recommending tags with a model
of human categorization. In: Proc. CIKM ’13, New York, NY, USA, ACM (2013)
2381–2386
6. Polyn, S.M., Norman, K.A., Kahana, M.J.: A context maintenance and retrieval
model of organizational processes in free recall. Psychological Review 116 (2009)
129
7. Anderson, J.R., Schooler, L.J.: Reflections of the environment in memory.
Psychological Science 2 (1991) 396–408
8. Zhang, L., Tang, J., Zhang, M.: Integrating temporal usage pattern into
personalized tag prediction. In: Web Technologies and Applications. Springer
(2012) 354–365
9. Kowald, D., Seitlinger, P., Trattner, C., Ley, T.: Long time no see: The probability
of reusing tags as a function of frequency and recency. In: Proc. WWW ’14, New
York, NY, USA, ACM (2014)
10. Anderson, J.R., Byrne, M.D., Douglass, S., Lebiere, C., Qin, Y.: An integrated
theory of the mind. Psychological Review 111 (2004) 1036–1050
11. Helic, D., Trattner, C., Strohmaier, M., Andrews, K.: Are tag clouds useful for
navigation? A network-theoretic analysis. International Journal of Social
Computing and Cyber-Physical Systems 1 (2011) 33–55
12. Trattner, C., Lin, Y.l., Parra, D., Yue, Z., Real, W., Brusilovsky, P.: Evaluating
tag-based information access in image collections. In: Proceedings of the 23rd ACM
conference on Hypertext and social media, ACM (2012) 113–122
13. Körner, C., Benz, D., Hotho, A., Strohmaier, M., Stumme, G.: Stop thinking,
start tagging: tag semantics emerge from collaborative verbosity. In: Proceedings
of the 19th international conference on World wide web. WWW ’10, New York,
NY, USA, ACM (2010) 521–530
14. Lipczak, M.: Hybrid Tag Recommendation in Collaborative Tagging Systems. PhD
thesis, Dalhousie University (2012)
15. Hotho, A., Jäschke, R., Schmitz, C., Stumme, G.: Information retrieval in
folksonomies: Search and ranking. In: The semantic web: research and
applications. Springer (2006) 411–426
16. Jäschke, R., Marinho, L., Hotho, A., Schmidt-Thieme, L., Stumme, G.: Tag
recommendations in folksonomies. In: Knowledge Discovery in Databases: PKDD
2007. Springer (2007) 506–514
17. Hamouda, S., Wanas, N.: Put-tag: personalized user-centric tag recommendation
for social bookmarking systems. Social Network Analysis and Mining 1 (2011)
377–385
18. Rendle, S., Schmidt-Thieme, L.: Pairwise interaction tensor factorization for
personalized tag recommendation. In: Proc. WSDM 2010, New York, NY, USA, ACM
(2010) 81–90
19. Krestel, R., Fankhauser, P., Nejdl, W.: Latent dirichlet allocation for tag
recommendation. In: Proc. RecSys 2009, ACM (2009) 61–68
20. Rawashdeh, M., Kim, H.N., Alja’am, J.M., El Saddik, A.: Folksonomy link
prediction based on a tripartite graph for tag recommendation. Journal of
Intelligent Information Systems (2012) 1–19
21. Yin, D., Hong, L., Xue, Z., Davison, B.D.: Temporal dynamics of user interests
in tagging systems. In: Twenty-Fifth AAAI conference on artificial intelligence.
(2011)
22. Yin, D., Hong, L., Davison, B.D.: Exploiting session-like behaviors in tag
prediction. In: Proc. WWW’2011, ACM (2011) 167–168
23. Brainerd, C., Reyna, V.: Recollective and nonrecollective recall. Journal of
Memory and Language 63 (2010) 425–445
24. Kintsch, W., Mangalath, P.: The construction of meaning. Topics in Cognitive
Science 3 (2011) 346–370
25. Krestel, R., Fankhauser, P.: Tag recommendation using probabilistic topic models.
ECML PKDD Discovery Challenge 2009 (DC09) (2009) 131
26. Lorince, J., Todd, P.M.: Can simple social copying heuristics explain tag popularity
in a collaborative tagging system? In: Proc. of WebSci ’13, New York, NY, USA,
ACM (2013) 215–224
27. Floeck, F., Putzke, J., Steinfels, S., Fischbach, K., Schoder, D.: Imitation and
quality of tags in social bookmarking systems–collective intelligence leading to
folksonomies. In: On collective intelligence. Springer (2011) 75–91
28. Seitlinger, P., Ley, T.: Implicit imitation in social tagging: familiarity and semantic
reconstruction. In: Proc. CHI ’12, New York, NY, USA, ACM (2012) 1631–1640
29. Helic, D., Körner, C., Granitzer, M., Strohmaier, M., Trattner, C.: Navigational
efficiency of broad vs. narrow folksonomies. In: Proc. HT ’12, New York, NY, USA,
ACM (2012) 63–72
30. Gemmell, J., Schimoler, T., Ramezani, M., Christiansen, L., Mobasher, B.:
Improving folkrank with item-based collaborative filtering. Recommender Systems
& the Social Web (2009)
31. Doerfel, S., Jäschke, R.: An analysis of tag-recommender evaluation procedures.
In: Proc. RecSys ’13, New York, NY, USA, ACM (2013) 343–346
32. Campos, P.G., Díez, F., Cantador, I.: Time-aware recommender systems: a
comprehensive survey and analysis of existing evaluation protocols. User Modeling
and User-Adapted Interaction (2013) 1–53
33. Van Rijsbergen, C.J.: Foundation of evaluation. Journal of Documentation 30
(1974) 365–373
34. Balby Marinho, L., Hotho, A., Jäschke, R., Nanopoulos, A., Rendle, S.,
Schmidt-Thieme, L., Stumme, G., Symeonidis, P.: Recommender Systems for Social
Tagging Systems. SpringerBriefs in Electrical and Computer Engineering. Springer
(2012)
35. Kowald, D., Lacic, E., Trattner, C.: Tagrec: Towards a standardized tag
recommender benchmarking framework. In: Proc. HT’14, New York, NY, USA, ACM
(2014)
36. Parra-Santander, D., Brusilovsky, P.: Improving collaborative filtering in social
tagging systems for the recommendation of scientific articles. In: Proc. WI-IAT
2010. Volume 1., IEEE (2010) 136–142
37. Fu, W.T., Dong, W.: Collaborative indexing and knowledge exploration: A social
learning model. IEEE Intelligent Systems 27 (2012) 39–46
... In this paper, the 3Layers tag recommendation algorithm is extended by incorporating the time-dependent decay of tag reuse. This is realized by integrating the BLL equation of the cognitive architecture ACT-R [Kowald et al., 2015b]. ...
... Apart from this strand of research, the author of this thesis has also contributed to the design of other cognitive-inspired tag recommendation methods presented in [Seitlinger et al., 2013] and [Kowald et al., 2015b]. These methods are based on a computational model of human categorization called MINERVA2 [Hintzman, 1984] in order to process a network constituted by a input, hidden and output layer. ...
... In [Seitlinger et al., 2013], the 3Layers (3L) algorithm was presented, which uses categories assigned to the current resource in order to recommend tags of semantically similar resources. This approach is extended in [Kowald et al., 2015b] to create 3LT, which enriches 3L in order to also incorporate temporal processes of tag usage (see Section 3.3). ...
... In this paper, the 3Layers tag recommendation algorithm is extended by incorporating the time-dependent decay of tag reuse. This is realized by integrating the BLL equation of the cognitive architecture ACT-R [Kowald et al., 2015b]. ...
... Apart from this strand of research, the author of this thesis has also contributed to the design of other cognitive-inspired tag recommendation methods presented in [Seitlinger et al., 2013] and [Kowald et al., 2015b]. These methods are based on a computational model of human categorization called MINERVA2 [Hintzman, 1984] in order to process a network constituted by a input, hidden and output layer. ...
... In [Seitlinger et al., 2013], the 3Layers (3L) algorithm was presented, which uses categories assigned to the current resource in order to recommend tags of semantically similar resources. This approach is extended in [Kowald et al., 2015b] to create 3LT, which enriches 3L in order to also incorporate temporal processes of tag usage (see Section 3.3). ...
Book
Full-text available
Modeling Activation Processes in Human Memory for Tag Recommendations: Using Models from Human Memory Theory to Implement Recommender Systems for Social Tagging and Microblogging Environments
... Sabater-mir et al. (2013) use the cognitive architecture Belief/Desire/Intention (BDI) as an intermediate between recommenders and their users. The cognitive architecture ACT-R (short for adaptive control of thought-rational) (Anderson et al., 1997) has been employed in the context of recommender systems in several works (Maanen and Marewski, 2009;Kowald et al., 2014;Trattner et al., 2016;Kowald et al., 2013;Kowald et al., 2017b;Kowald and Lex, 2016;Stanley and Byrne, 2016)). ACT-R describes central cognitive operations of the human mind. ...
... In this vein, Seitlinger et al. (2013) use the connectionist human memory simulation model ALCOVE (Kruschke, 1992) to implement a novel tag recommendation algorithm termed 3Layers. Kowald et al. (2013) enhance the 3Layers algorithm with recency effects by combining it with the BLL equation mentioned before. Another connectionist model is used by Kopeinik et al. (2017a), who apply SUSTAIN (Love et al., 2004), a connectionist model of human category learning and successor of ALCOVE, to recommend resources that fit to a user's current attentional focus. ...
Book
Full-text available
Personalized recommender systems have become indispensable in today’s online world. Most of today’s recommendation algorithms are data-driven and based on behavioral data. While such systems can produce useful recommendations, they are often uninterpretable, black-box models that do not incorporate the underlying cognitive reasons for user behavior in the algorithms’ design. This survey presents a thorough review of the state of the art of recommender systems that leverage psychological constructs and theories to model and predict user behavior and improve the recommendation process – so-called psychology-informed recommender systems. The survey identifies three categories of psychology-informed recommender systems: cognition-inspired, personality-aware, and affectaware recommender systems. For each category, the authors highlight domains in which psychological theory plays a key role. Further, they discuss selected decision-psychological phenomena that impact the interaction between a user and a recommender. They also focus on related work that investigates the evaluation of recommender systems from the user perspective and highlight user-centric evaluation frameworks, and potential research tasks for future work at the end of this survey.
... Sabater-mir et al. (2013) use the cognitive architecture Belief/Desire/Intention (BDI) as an intermediate between recommenders and their users. The cognitive architecture ACT-R (short for adaptive control of thought-rational) (Anderson et al., 1997) has been employed in the context of recommender systems in several works (Maanen and Marewski, 2009;Kowald et al., 2014;Trattner et al., 2016;Kowald et al., 2013;Kowald et al., 2017b;Kowald and Lex, 2016;Stanley and Byrne, 2016)). ACT-R describes central cognitive operations of the human mind. ...
... In this vein, Seitlinger et al. (2013) use the connectionist human memory simulation model ALCOVE (Kruschke, 1992) to implement a novel tag recommendation algorithm termed 3Layers. Kowald et al. (2013) enhance the 3Layers algorithm with recency effects by combining it with the BLL equation mentioned before. Another connectionist model is used by Kopeinik et al. (2017a), who apply SUSTAIN (Love et al., 2004), a connectionist model of human category learning and successor of ALCOVE, to recommend resources that fit to a user's current attentional focus. ...
... Then exponential distribution is applied to model the interval of tagging behaviors. BLL AC [15], BLL + MP m [16] and BLL ac + MP m [13] are proposed based on the theory of human memory. Essentially, these methods tend to recommend tags which are used recently and consider the frequency or association component to capture features of items. ...
... For TAPITF and ABNT, we initialize their embedding parameters using the pre-trained models, and we tune the learning rate of [0.001, 0.0005, 0.0001]. We also tune the length of l s u of [5,10,15,20,25,30] for ABNT. Another important hyper-parameter is α, which is the growth rate in Eqs. 13 and 14. ...
Chapter
Personalized tag recommender systems suggest tags to users when annotating specific items. Usually, recommender systems need to take both users’ preference and items’ features into account. Existing methods like latent factor models based on tensor factorization use low-dimensional dense vectors to represent latent features of users, items and tags. The problem with these models is using the static representation for the user, which neglects that users’ preference keeps evolving over time. Other methods based on base-level learning (BLL) only use a simple time-decay function to weight users’ preference. In this paper, we propose a personalized tag recommender system based on neural networks and attention mechanism. This approach utilizes the multi-layer perceptron to model the non-linearities of interactions among users, items and tags. Also, an attention network is introduced to capture the complex pattern of the user’s tagging sequence. Extensive experiments on two real-world datasets show that the proposed model outperforms the state-of-the-art tag recommendation method.
... Research papers Model of human categorization [17,23,35] Activation processes in human memory [18,21,24,37] Informal learning se ings [5][6][7] Resource recommendations Research papers A ention-interpretation dynamics [15,34] Tag and time information [27,28] Recommendation evaluation Research papers Real-world folksonomies [20] Technology enhanced learning se ings [16] Hashtag recommendations ...
... Tag Recommendations Using a Model of Human Categorization. In [17,23,35], the authors introduced a tag recommendation algorithm based on the human categorization models ALCOVE [26] and MINERVA2 [12]. is algorithm is called 3Layers and simulates categorization processes in human memory. erefore, the categories assigned to a given resource, which a user is going to annotate, are matched against already annotated resources of this user. ...
Preprint
Full-text available
Recommender systems have become important tools to support users in identifying relevant content in an overloaded information space. To ease the development of recommender systems, a number of recommender frameworks have been proposed that serve a wide range of application domains. Our TagRec framework is one of the few examples of an open-source framework tailored towards developing and evaluating tag-based recommender systems. In this paper, we present the current, updated state of TagRec, and we summarize and reflect on four use cases that have been implemented with TagRec: (i) tag recommendations, (ii) resource recommendations, (iii) recommendation evaluation, and (iv) hashtag recommendations. To date, TagRec served the development and/or evaluation process of tag-based recommender systems in two large scale European research projects, which have been described in 17 research papers. Thus, we believe that this work is of interest for both researchers and practitioners of tag-based recommender systems.
... However, offline data studies are limited to evaluating the prediction of user behaviour. In our previous work [23,21], we have intensively investigated the suitability of two tag recommendation approaches via offline studies [21]: the first called (Bi) is inspired by the Base Level Learning Equation (BLL) [1], which models the frequency and recency of past tag use. The second algorithm, called Minerva [17,34], incorporates tag use frequency as well as semantic context. ...
... The applied recommendation mechanisms have been extensively investigated in offline experiments [23,21] where they showed promising results when applied on social bookmarking and TEL data sets. Notably, the cognitive-inspired mechanisms consistantly outperformed state-of-the-art tag recommendation algorithms such as Collaborative Filtering, FolkRank and even graph based methods. ...
Conference Paper
Full-text available
In online social learning environments, tagging has demonstrated its potential to facilitate search, to improve recommendations and to foster reflection and learning. Studies have shown that as a prerequisite for learning, shared understanding needs to be established in the group. We hy-pothesise that this can be fostered through tag recommendation strategies that contribute to semantic stabilization. In this study, we investigate the application of two tag rec-ommenders that are inspired by models of human memory: (i) the base-level learning equation BLL and (ii) Minerva. BLL models the frequency and recency of tag use while Min-verva is based on frequency of tag use and semantic context. We test the impact of both tag recommenders on semantic stabilization in an online study with 51 students completing a group-based inquiry learning project in school. We find that displaying tags from other group members contributes significantly to semantic stabilization in the group, as compared to a strategy where tags from the students' individual vocabularies are used. Testing for the accuracy of the different recommenders revealed that algorithms using frequency counts such as BLL performed better when individual tags were recommended. When group tags were recommended, the Minerva algorithm performed better. We conclude that tag recommenders, exposing learners to each other's tag choices by simulating search processes on learn-ers' semantic memory structures, show potential to support semantic stabilization and thus, inquiry-based learning in groups.
... An algorithm that is especially designed to follow knowledge creation theory is the 3Layers tag recommendation approach Kowald, Seitlinger, Kopeinik, Ley, & Trattner, 2013). It learns how a learner, or group of learners, categorizes resources. ...
Article
In this paper, we propose the Social Semantic Server (SSS) as a service-based infrastructure for workplace and professional learning analytics (LA). The design and development of the SSS have evolved over eight years, starting with an analysis of workplace learning inspired by knowledge creation theories and their application in different contexts. The SSS collects data from workplace learning tools, integrates it into a common data model based on a semantically enriched artifact-actor network, and offers it back for LA applications to exploit the data. Further, the SSS design’s flexibility enables it to be adapted to different workplace learning situations. This paper contributes by systematically deriving requirements for the SSS according to knowledge creation theories, and by offering support across a number of different learning tools and LA applications integrated into the SSS. We also show evidence for the usefulness of the SSS extracted from 4 authentic workplace learning situations involving 57 participants. The evaluation results indicate that the SSS satisfactorily supports decision making in diverse workplace learning situations and allow us to reflect on the importance of knowledge creation theories for this analysis.
... Apart from that, in [12], we have shown that ACT-R can be generalized for related use cases such as hashtag recommendations in Twitter. For future work, we plan to build upon these results in order to propose cognitive-inspired recommender systems (e.g., for resource recommendation) as an alternative to data-driven ones [13, 15, 16, 17]. In this sense, our long-term goal is to design hybrid approaches, which combine the advantages of both worlds in order to adapt to the current setting (i.e., sparse vs. dense ones). ...
Preprint
In this paper, we study the imbalance between current state-of-the-art tag recommendation algorithms and the folksonomy structures of real-world social tagging systems. While algorithms such as FolkRank are designed for dense folksonomy structures, most social tagging systems exhibit a sparse nature. To overcome this imbalance, we show that cognitive-inspired algorithms, which model the tag vocabulary of a user in a cognitive-plausible way, can be helpful. Our present approach does this via implementing the activation equation of the cognitive architecture ACT-R, which determines the usefulness of units in human memory (e.g., tags). In this sense, our long-term research goal is to design hybrid recommendation approaches, which combine the advantages of both worlds in order to adapt to the current setting (i.e., sparse vs. dense ones).
Article
In recent years, various recommendation algorithms have been proposed to support learners in technology-enhanced learning environments. Such algorithms have proven to be quite effective in big-data learning settings (massive open online courses), yet successful applications in other informal and formal learning settings are rare. Common challenges include data sparsity, the lack of sufficiently flexible learner and domain models, and the difficulty of including pedagogical goals into recommendation strategies. Computational models of human cognition and learning are, in principle, well positioned to help meet these challenges, yet the effectiveness of cognitive models in educational recommender systems remains poorly understood to date. This thesis contributes to this strand of research by investigating i) two cognitive learner models (CbKST and SUSTAIN) for resource recommendations that qualify for sparse user data by following theory-driven top-down approaches, and ii) two tag recommendation strategies based on models of human cognition (BLL and MINERVA2) that support the creation of learning content meta-data. The results of four online and offline experiments in different learning contexts indicate that a recommendation approach based on the CbKST, a well-founded structural model of knowledge representation, can improve the users' perceived learning experience in formal learning settings. In informal settings, SUSTAIN, a human category learning model, is shown to succeed in representing dynamic, interest-based learning interactions and to improve Collaborative Filtering for resource recommendations. The investigation of the two proposed tag recommender strategies underlined their ability to generate accurate suggestions (BLL) and, in collaborative settings, their potential to promote the development of shared vocabulary (MINERVA2).
This thesis shows that the application of computational models of human cognition holds promise for the design of recommender mechanisms and, at the same time, for gaining a deeper understanding of interaction dynamics in virtual learning systems.
Conference Paper
In this paper, we introduce TagRec, a standardized tag recommender benchmarking framework implemented in Java. The purpose of TagRec is to provide researchers with a framework that supports all steps of the development process of a new tag recommendation algorithm in a reproducible way, including methods for data pre-processing, data modeling, data analysis and recommender evaluation against state-of-the-art baseline approaches. We demonstrate the performance of the algorithms implemented in TagRec in terms of prediction quality and runtime using an extensive evaluation of a real-world folksonomy dataset. Furthermore, TagRec contains two novel tag recommendation approaches based on models derived from human cognition and human memory theories.
Conference Paper
The emergence of social tagging systems enables users to organize and share resources they are interested in. In order to ease human-computer interaction with such systems, extensive research has been done on how to recommend personalized tags for resources. These studies mainly consider the user profile, the resource content, or the graph structure of users, resources and tags. Users' preferences towards different tags are usually regarded as invariant over time, neglecting shifts in users' short-term interests. In this paper, we examine the temporal factor in users' tagging behaviors by investigating the occurrence patterns of tags and then incorporate this into a novel method for ranking tags. To assess a tag for a user-resource pair, we first consider the user's general interest in it, then we calculate its recurrence probability based on the temporal usage pattern, and finally we consider its relevance to the content of the post. Experiments conducted on real datasets from Bibsonomy and Delicious demonstrate that our method outperforms other temporal models and state-of-the-art tag prediction methods.
Article
This report describes our study of different ways to improve existing collaborative filtering techniques in order to recommend scientific articles. Using data crawled from CiteULike, a collaborative tagging service for academic purposes, we compared the classical user-based collaborative filtering algorithm as described by Schafer et al. [2] with two enhanced variations: 1) using a tag-based similarity calculation, to avoid depending on ratings to find the neighborhood of a user, and 2) incorporating the number of raters in the final recommendation ranking to decrease the noise of items that have been rated by too few users. We provide a discussion of our results, describing the dataset and highlighting our findings about applying collaborative filtering on folksonomies instead of the classic bipartite user-item network, and providing guidelines for our future research.
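The two variations described in this abstract can be illustrated with a small sketch: neighbours are found by cosine similarity over users' tag-count profiles rather than ratings, and candidate items are boosted by how many neighbours have posted them. This is only one possible reading of the abstract, not the report's actual method. The function names, the data layout, and in particular the `log(1 + raters)` weighting are assumptions introduced for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two tag-count profiles (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(target, user_tags, user_items, k=2, top_n=3):
    """User-based CF with tag-based neighbourhoods (hypothetical sketch)."""
    # 1) Neighbourhood from tag profiles instead of ratings.
    sims = {u: cosine(user_tags[target], tags)
            for u, tags in user_tags.items() if u != target}
    neighbours = sorted(sims, key=sims.get, reverse=True)[:k]

    # 2) Accumulate similarity and count the raters for each unseen item.
    scores, raters = Counter(), Counter()
    for u in neighbours:
        for item in user_items[u] - user_items[target]:
            scores[item] += sims[u]
            raters[item] += 1

    # Assumed weighting: damp items that only few neighbours have posted.
    ranked = {i: scores[i] * math.log(1 + raters[i]) for i in scores}
    return sorted(ranked, key=ranked.get, reverse=True)[:top_n]


# Hypothetical toy data.
user_tags = {
    "ann": Counter(ml=3, python=1),
    "bob": Counter(ml=2, python=2),
    "cat": Counter(cooking=5),
    "dan": Counter(ml=1),
}
user_items = {
    "ann": {"p1"},
    "bob": {"p1", "p2"},
    "cat": {"p3"},
    "dan": {"p2", "p4"},
}
print(recommend("ann", user_tags, user_items))
```

In the toy data, item `p2` is posted by both of the target user's nearest neighbours and therefore ranks above `p4`, which only one neighbour has posted.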
Chapter
In this chapter we describe the state-of-the-art in social tagging recommender systems. Many of the algorithms presented here borrow ideas and techniques from other areas such as information retrieval, machine learning, and statistical relational learning. In Section 4.3 we also describe many approaches for exploiting additional sources of information such as the content of resources and the social relations of users.
Article
Since the rise of collaborative tagging systems on the web, the tag recommendation task -- suggesting suitable tags to users of such systems while they add resources to their collection -- has been tackled. However, the (offline) evaluation of tag recommendation algorithms usually suffers from difficulties like the sparseness of the data or the cold start problem for new resources or users. Previous studies therefore often used so-called post-cores (specific subsets of the original datasets) for their experiments. In this paper, we conduct a large-scale experiment in which we analyze different tag recommendation algorithms on different cores of three real-world datasets. We show that a recommender's performance depends on the particular core and explore correlations between performances on different cores.
Book
Social Tagging Systems are web applications in which users upload resources (e.g., bookmarks, videos, photos) and annotate them with a list of freely chosen keywords called tags. This is a grassroots approach to organize a site and help users to find the resources they are interested in. Social tagging systems are open and inherently social; features that have been proven to encourage participation. However, with the large popularity of these systems and the increasing amount of user-contributed content, information overload rapidly becomes an issue. Recommender Systems are well-known applications for increasing the level of relevant content over the noise that continuously grows as more and more content becomes available online. In social tagging systems, however, we face new challenges. While in classic recommender systems the mode of recommendation is basically the resource, in social tagging systems there are three possible modes of recommendation: users, resources, or tags. Therefore, suitable methods that properly exploit the different dimensions of social tagging systems data are needed. In this book, we survey the most recent and state-of-the-art work about a whole new generation of recommender systems built to serve social tagging systems. The book is divided into self-contained chapters ranging from background material on social tagging systems and recommender systems to more advanced techniques such as those based on tensor factorization and graph-based models.