Semantics-Aware Content-Based Recommender Systems:
Design and Architecture Guidelines
Ludovico Boratto, Salvatore Carta, Gianni Fenu, Roberto Saia
Dipartimento di Matematica e Informatica, Università di Cagliari
Via Ospedale 72 - 09124 Cagliari (Italy)
Abstract
Recommender systems suggest items by exploiting the interactions of the users
with the system (e.g., the choice of the movies to recommend to a user is based
on those s/he previously evaluated). In particular, content-based systems sug-
gest items whose content is similar to that of the items evaluated by a user.
An emerging application domain in content-based recommender systems is rep-
resented by the consideration of the semantics behind an item description, in
order to have a disambiguation of the words in the description and improve the
recommendation accuracy. However, different phenomena, such as changes in
the preferences of a user over time or the use of her/his account by third parties,
might affect the accuracy by considering items that do not reflect the actual user
preferences. Starting from an analysis of the literature and of an architecture
proposed in a recent survey, in this paper we first highlight the current lim-
its in this research area, then we propose design guidelines and an improved
architecture to build semantics-aware content-based recommendations.
Keywords: Semantics-aware Recommender Systems, Semantic Analysis,
Design, Architecture
This work is partially funded by Regione Sardegna under project NOMAD (Next
generation Open Mobile Apps Development), through PIA - Pacchetti Integrati di
Agevolazione “Industria Artigianato e Servizi” (annualità 2013), and by MIUR
PRIN 2010-11 under project “Security Horizons”.
Email addresses: ludovico.boratto@acm.org (Ludovico Boratto), salvatore@unica.it
(Salvatore Carta), fenu@unica.it (Gianni Fenu), roberto.saia@unica.it (Roberto Saia)
Preprint submitted to Elsevier March 10, 2017
1. Introduction
A recommender system is designed to provide suggestions for items that
are expected to interest a user [1]. One of the most employed approaches in
the literature and in real-world applications (e.g., e-commerce websites) is
that of the so-called content-based recommender systems [2]. These systems
analyze the content of the items a user has previously evaluated (e.g., their
textual description), in order to detect items that s/he has not considered yet and
are similar to those s/he likes. Emerging application domains in this area are
represented by those systems and services that involve the use of ontologies
and semantic analysis tools in content-based recommender systems, in order
to perform a disambiguation of the item descriptions and improve a system’s
accuracy [3, 4]. This leads to the generation of a class of systems known in the
literature as semantics-aware content-based recommender systems [2, 5], which
have recently emerged.
In their recent survey, de Gemmis et al. proposed a high-level architecture
of a semantics-aware content-based recommender system [2]. This architecture
processes all the items a user has evaluated, in order to recommend items with
similar content to that user. However, when designing an architecture that
performs recommendations purely based on the content of the items, it should
be considered that a user might change her/his preferences over time, or that
someone else might use her/his profile to make transactions with items s/he
would not consider. Identifying and removing these incoherent items is
therefore a problem of central relevance in this area, and it should be
tackled starting from the architectural definition of a system, so that this
feature can be implemented in a working content-based recommender system.
These open issues are now presented in detail.
Presence of incoherent items in a user profile. Most of the solutions
regarding the user-profiling task of a recommender system involve a filtering of
the whole set of items previously evaluated by a user, in order to measure their
similarity with those that s/he has not considered yet, and recommend the most
similar items [2]. Indeed, the recommendation process is usually based on the
principle that the users’ preferences remain unchanged over time. While this
can be true in many cases, it is not the norm, due to the existence of temporal
dynamics in the given preferences [6, 7, 8]. Therefore, a static approach to user
profiling can lead toward wrong results due to various factors, such as a simple
change of tastes over time or the temporary use of a user’s account by other
people.
Magic barrier problem. Some studies [9, 10] have shown that a subset of
user ratings might be considered as outliers, since the same user may rate
the same item differently at different moments in time. This problem, known
in the literature as the magic barrier [11, 12, 13], arises because, due to
the noise in the data, a recommender system reaches a point at which its
accuracy cannot be further improved. After the magic barrier has been
reached, any apparent improvement in accuracy might reflect overfitting
rather than a performance enhancement. Therefore, the magic barrier problem
is very relevant in recommendation research, but no approach has studied,
from a content-based point of view, how to filter out items whose content
represents an outlier.
Our contributions. In this paper, we first analyze the state-of-the-art
architecture of a content-based recommender system, then we explore in detail
the problems that might occur by employing it. We propose design guidelines
on how to enrich that architecture and present a novel architecture, which
allows a system to tackle the problems previously highlighted and improve the
effectiveness of the recommendation process. Even though we focus on the
emerging application domain we previously mentioned (i.e., semantics-aware
systems), we also show the usefulness of our proposal for classic
content-based approaches. In order to build effective recommendations, it is
essential to properly exploit the semantics behind the content of the items.
Therefore, this study is meant to provide both architectural and practical
tools for any researcher or developer involved in the design of real-world
semantics-aware content-based recommender systems. The scientific
contributions of this paper are summarized as follows:
• we analyze the state-of-the-art architecture of a semantics-aware
  content-based recommender system, to study what might happen in the
  recommendation process when the incoherent items are filtered by the system.
  None of the approaches in the literature shows, from an architectural and
  practical point of view, the impact of incoherent items in a user profile;
• we study, for the first time, the magic barrier problem in a content-based
  recommender system and from an architectural point of view. Indeed, the
  existing studies on the magic barrier problem tackle it from a collaborative
  filtering perspective;
• we present design guidelines and a novel architecture, in order to improve
  the state-of-the-art one, and overcome the aforementioned issues;
• we analyze the impact of the components we introduce in the proposed
  architecture from a computational cost point of view.
The rest of the paper is organized as follows: Section 2 presents related
work on content-based recommender systems and on the emerging problems
and application domains that affect the classic architecture of these systems;
in Section 3 we explore the state-of-the-art architecture of a semantics-aware
content-based recommender system; Section 4 highlights the limits that the
current architecture presents when considering incoherent items in a user profile,
and introduces design guidelines to improve it; Section 5 proposes an improved
architecture, by following the design guidelines; Section 6 presents conclusions
and future work.
2. Related Work
Content-based recommender systems suggest to users items that are similar
to those they previously evaluated [2, 14]. The early systems used relatively
simple retrieval approaches, such as the Vector Space Model, with the basic
TF-IDF weighting. The Vector Space Model is a spatial representation of text
documents, where each document is represented by a vector in an n-dimensional
space (known as bag of words), and each dimension is related to a term from the
overall vocabulary of a specific document collection. Examples of systems that
employ this type of content filtering are [15, 16, 17, 18]. Due to the fact that
this approach based on a simple bag of words is not able to perform a semantic
disambiguation of the terms in an item description, content-based recommender
systems evolved and started employing external sources of knowledge (e.g., on-
tologies), semantic analysis tools, and additional information to improve their
accuracy [3, 4, 5, 19].
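As a minimal sketch of the bag-of-words approach described above, the following Python code builds TF-IDF vectors for a toy collection and compares items by cosine similarity. The item descriptions, the function names, and the specific weighting variant (raw term frequency multiplied by log(N/df)) are assumptions made for illustration; the cited systems may use different weighting schemes.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical item descriptions.
items = [
    "space adventure with robots",
    "robots explore deep space",
    "romantic comedy in paris",
]
vecs = tfidf_vectors(items)
sim_related = cosine(vecs[0], vecs[1])    # the two items share "space" and "robots"
sim_unrelated = cosine(vecs[0], vecs[2])  # the two items share no term
```

Since terms are matched as plain strings, two descriptions that use different words for the same concept obtain no credit for it, which is the limitation that motivates the semantic approaches discussed in this section.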
Regarding the user profile considered by a recommender system, there is a
common problem that may affect the effectiveness of the obtained results, i.e.,
the capability of the information stored in a user profile to lead toward reliable
recommendations. Several ways to define user profiles have been presented in
the literature, from basic models created by exploiting explicit information [20],
to more complicated ones [21, 22], often optimized through sophisticated
mathematical criteria, such as those in [23, 24, 25]. In order to face the problem
of dealing with unreliable information in a user profile, the state of the art
proposes different strategies. Several approaches, such as [7], take advantage of
the Bayesian analysis of the user-provided relevance feedback, in order to detect
non-stationary user interests. Also exploiting the feedback information provided
by the users, [8] makes use of a tree-descriptor model to detect shifts in the user
interests. Other techniques exploit the knowledge captured in an ontology to
improve the effectiveness of the results [26, 27, 28, 29]. In [30, 31, 32], the
problem of modeling semantically correlated items was tackled, but the authors
consider a temporal correlation and not the one between the items and a user
profile.
Considering the item incoherence problem, it should be noted that there
is another common issue that afflicts the recommendation approaches. This
problem, known in the literature as magic barrier [11], defines the theoretical
boundary for the level of optimization that can be achieved by a recommendation
algorithm on transactional data [33]. The evaluation models assume as a ground
truth that the transactions made in the past by the users, and stored in their
profiles, are free of noise. This assumption has been examined in [9, 34], where
a study aimed at capturing the noise in a service that operates in a synthetic
environment was performed.
It should be observed that in the content-based recommendation literature
there are no approaches that take into account how the architecture and the flow
of computation might be affected by the item incoherence and magic barrier
issues.
3. A State-of-the-Art Architecture for Semantics-Aware Content-based
Recommender Systems
This section will present the high-level architecture of a semantics-aware
content-based recommender system proposed in [2] and presented in Fig. 1. In
order to highlight how this architecture might be improved in case of incoherent
items in a user profile and present our proposal, we will explore it by presenting
the flow of the computation of a system that employs it.
Fig. 1: Architecture of a semantics-aware content-based recommender system.
The description of the items usually has no structure (e.g., its textual de-
scription), so it is necessary to perform some preprocessing steps to extract
information from it. Given an Information source, represented by the Item
Descriptions (e.g., product descriptions, Web pages, news, etc.) that will be
processed during the filtering, the first component employed by a system is a
Content Analyzer. This component converts each item description into a
format processable by the following steps (e.g., keywords, n-grams, concepts)
thanks to the employment of feature extraction techniques. The output gener-
ated by this component is a Structured Item Representation, stored in a Repre-
sented Items repository.
Out of all the represented items, the system considers the ones evaluated
by each active user u_a to whom recommendations have to be provided (User u_a
training examples), in order to build a profile that contains the preferences
of the user. This task is accomplished by a Profile Learner component,
which employs Machine Learning algorithms to combine the structured item
representations into a single model. The output produced by the component is
a user profile, which is stored in a Profiles repository.
The recommendation task is performed by a Filtering Component, which
compares the output of the two previous components (i.e., the profile of the
active user and a set of items s/he has not evaluated yet). Given a new item
representation, the component predicts whether or not the item is suitable for
the active user u_a, usually with a value that indicates its relevance with respect
to the user profile. The filtered items are ranked by relevance and the top-n
items in the ranking represent the output produced by the component, i.e., a
List of recommendations.
The List of recommendations is proposed to the active user u_a, who either
accepts or rejects the recommended items (e.g., by watching a recommended
movie, or by buying a recommended item), thus providing feedback on them
(User u_a feedback), which is stored in a Feedback repository.
The feedback provided by the active user is then used by the system to
update her/his user profile.
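The flow of computation described in this section can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the Content Analyzer extracts plain keywords, the Profile Learner sums keyword counts instead of employing a Machine Learning algorithm, and relevance is a cosine similarity. All function names, item identifiers, and descriptions are hypothetical.

```python
import math
from collections import Counter


def content_analyzer(description):
    """Turn an unstructured description into a keyword-count representation."""
    return Counter(description.lower().split())


def profile_learner(training_examples):
    """Combine the representations of the evaluated items into one profile."""
    profile = Counter()
    for representation in training_examples:
        profile.update(representation)
    return profile


def relevance(profile, representation):
    """Cosine similarity between a user profile and an item representation."""
    dot = sum(profile[t] * w for t, w in representation.items())
    norm_p = math.sqrt(sum(w * w for w in profile.values()))
    norm_i = math.sqrt(sum(w * w for w in representation.values()))
    return dot / (norm_p * norm_i) if norm_p and norm_i else 0.0


def filtering_component(profile, candidates, top_n):
    """Rank the unseen items by relevance and return the top-n item ids."""
    ranked = sorted(
        candidates,
        key=lambda item_id: relevance(profile, candidates[item_id]),
        reverse=True,
    )
    return ranked[:top_n]


# Hypothetical Item Descriptions (the information source).
descriptions = {
    "i1": "space robots adventure",
    "i2": "robots in deep space",
    "i3": "space station robots mystery",
    "i4": "romantic comedy in paris",
}
represented_items = {i: content_analyzer(d) for i, d in descriptions.items()}

evaluated = ["i1", "i2"]  # training examples of the active user
profile = profile_learner(represented_items[i] for i in evaluated)
unseen = {i: r for i, r in represented_items.items() if i not in evaluated}
recommendations = filtering_component(profile, unseen, top_n=1)
```

In this sketch every evaluated item contributes to the profile, so an item evaluated by mistake or by another person would shift the recommendations just like any other item.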
4. Design Guidelines
In the previous section, we presented the state-of-the-art architecture of
a semantics-aware content-based recommender system. We now present the
problems that might occur by employing it, illustrated through three
scenarios, and provide design guidelines on how to improve it.
Scenario 1. The account of the active user is used by another person, who
evaluates items that the user would have never evaluated (e.g., s/he buys
items that the active user would have never bought). This would lead to
the presence of noise in a user profile, since the Structured Item
Representation of these items, which are incoherent with respect to the
user profile, would be considered by the Profile Learner component. The
component would make them part of the user u_a profile, which would be
stored as is in the Profiles repository and employed in the recommendation
process by the Filtering Component. This would generate bad
recommendations and affect the accuracy of the system.
Scenario 2. The preferences of the active user change over time, but the oldest
items that do not reflect the current preferences of the user, and that have
been positively evaluated by her/him, are still part of the user profile.
A form of aging of the items in a user profile would allow the system to
ignore these items after some time, but until that moment those items
would represent noise. Such noise might affect the system for a lot of
time, since the aging process is usually gradual and the items are removed
slowly. Again, this would affect the recommendation accuracy.
Scenario 3. If a mix of the two previous scenarios occurs and these types of
problems are iterated over time, the system would reach the so-called
magic barrier, i.e., a point where the noise affects the system so much
that it is impossible to improve the accuracy any further. As highlighted
in Section 2, the problem has been widely studied in the Collaborative
Filtering literature, in order to identify and remove the noisy items based
on their ratings, but no paper in the literature studied the magic barrier
from a content-based point of view.
The three scenarios presented above show that the architecture of a
semantics-aware content-based system should be able to deal with the
presence of incoherent items in the user profiling process, to avoid the
aforementioned problems. Therefore, we present design guidelines on how
to improve the state-of-the-art architecture of a system.
The first scenario highlighted the need for a system to detect how coherent
an item is with the rest of the items that have been evaluated by a user, in order
to detect the presence of noise. This could be done by comparing the content
of the item (i.e., its structured item representation) with that of the other items
evaluated by the user (user u_a training examples).
Scenario 2 confirms the need for a system to evaluate the temporal corre-
lation of an item with the rest of the items in the user profile. Indeed, if an
item is too old and, as previously said, it is also too different with respect to
the other items, it should be removed from the user profile.
Both the second and the third scenarios highlighted that the presence of
noisy/incoherent items in a user profile should be reduced to a very limited
amount of time. In particular, thanks to scenario 3 we know that these items
should not be discarded gradually, but the system should be able to do a one-off
removal. This would allow the filtering component to consider only items that
are coherent with each other and with the real preferences of the users.
The next section adopts these design guidelines to present an architecture
that overcomes these issues.
5. An Improved Architecture to Build Semantics-aware Content-based
Recommender Systems
In this section we propose our architecture. The updated high-level architec-
ture of the system is first proposed (Section 5.1), then in Section 5.2 we present
the details of the novel component that faces the problems highlighted in the
previous section. We close our presentation with a brief analysis that shows how
our proposal fits with the development of a real-world system (Section 5.3).
5.1. High-Level Architecture
Fig. 2 proposes an updated version of the state-of-the-art architecture illus-
trated in Section 3. This architecture integrates a novel component, which we
named Profile Cleaner, with the aim to analyze a profile and remove the
incoherent items, before storing it in the Profiles repository. In order to solve
the previous problems, the component should be able to remove an item if it
meets the following two conditions:
1. the coherence/content-based similarity of the item with the rest of the
profile is under a Minimum Coherence threshold value;
2. it is located in the first part of the user’s iteration history. Based on this
requirement, an item is considered far from the user’s preferences only
when it falls in the first part of the iterations (i.e., when the distance
from the last evaluated item is higher than a Maximum Temporal Distance
threshold).
Fig. 2: Architecture of a semantics-aware content-based recommender system.
By removing the incoherent old items, the Filtering Component would
consider only the real preferences of the users, and the previously mentioned
problems would be solved. Indeed, by checking that both conditions are met, the
system avoids removing from a profile the items that differ from those
the user previously considered, but that might be associated with a recent change in
the preferences of the user.
Regarding scenario 1, if among the user u_a training examples there is an
incoherent item evaluated by a third party, it would be detected by the component,
since the component receives it as an input. Regarding scenarios 2 and 3, by
checking the temporal correlation of an item with the others in the user profile,
the component would be able to remove an item as soon as it becomes old and
incoherent, avoiding the problems related to the aging strategies (which might
still be employed by the Profile Learner, but are not enough) and to the presence
of too many incoherent items that would lead to the magic barrier problem.
5.2. Low-level Representation of the Profile Cleaner
In Fig. 3 we inspect the component introduced in our architecture, to present
a low-level analysis of the subcomponents it should employ to accomplish its
task.
Fig. 3: Architectural organization of the profile cleaner task.
As Fig. 2 shows, the profile cleaner takes as input both an item i that a user
has evaluated (i.e., one of the training examples or of the feedback provided by
a user) and her/his user profile.
The Items Coherence Analyzer subcomponent compares the structured
representation of an item i with the rest of the user profile, in order to detect
the coherence/similarity of the item with the rest of the profile. If the
Structured Item Representation involves semantic structures (e.g., WordNet synsets),
as in modern content-based systems, several metrics can be employed to
evaluate the semantic similarity between two structured representations that
involve synsets. Some examples of state-of-the-art metrics are the following five:
Leacock and Chodorow [35], Jiang and Conrath [36], Resnik [37], Lin [38], and
Wu and Palmer [39]. However, any type of similarity/coherence measure might be
employed, even if no semantic information is available in the item representation
(e.g., TF-IDF). The output produced by the subcomponent is an Item i Coherence
value, which will be later employed by the Items Removal Analyzer
subcomponent to decide if the item should be removed or not.
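To make one of the listed metrics concrete, the sketch below computes the Wu and Palmer similarity, 2 · depth(lcs) / (depth(a) + depth(b)), where lcs is the least common subsumer of the two concepts. The tiny is-a taxonomy is hypothetical and stands in for WordNet, and depth is assumed to be counted in nodes (the root has depth 1). How the concept-level scores are aggregated into an Item i Coherence value is not covered here.

```python
# Hypothetical is-a taxonomy: child -> parent (the root has no parent).
PARENT = {
    "entity": None,
    "animal": "entity",
    "artifact": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "artifact",
}


def path_to_root(concept):
    """Return the list of concepts from `concept` up to the root."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def depth(concept):
    """Depth counted in nodes, so that the root has depth 1."""
    return len(path_to_root(concept))


def wu_palmer(a, b):
    """Wu and Palmer similarity between two concepts of the taxonomy."""
    ancestors_of_a = set(path_to_root(a))
    # Walking up from b, the first concept shared with a is the
    # least common subsumer (the deepest common ancestor).
    lcs = next(c for c in path_to_root(b) if c in ancestors_of_a)
    return 2 * depth(lcs) / (depth(a) + depth(b))


sim_dog_cat = wu_palmer("dog", "cat")  # least common subsumer: "animal"
sim_dog_car = wu_palmer("dog", "car")  # least common subsumer: "entity"
```

Concepts that share a deep ancestor score higher than concepts that only meet at the root, which is the behavior a coherence analyzer can exploit to spot items that are semantically far from the rest of the profile.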
In parallel, the Temporal Analyzer subcomponent considers how far the
evaluation of the considered item is from that of the other items in the
user profile (especially the last evaluated one). The distance threshold might
be defined as a fixed value, or by defining regions based on the chronology with
which the items have been evaluated (e.g., to remove an item if it is among the
oldest half of the evaluated items). The output is an Item i Temporal Distance,
which will also be employed by the Items Removal Analyzer subcomponent.
The output of the two previous subcomponents is then handled by the
Items Removal Analyzer, which also receives as input the Minimum Coherence
and Maximum Temporal Distance thresholds, and decides if the considered
item i should be removed from a user profile or not. The output produced by the
subcomponent (and by the Profile Cleaner main component) is a cleaned
user u_a profile, which does not contain the incoherent and oldest items.
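The interplay of the three subcomponents can be sketched as follows. The two analyzers are submitted to a thread pool to mirror their parallel arrangement, and the removal decision requires both conditions to hold. The profile layout (dictionaries with "id", "terms", and "time" keys), the Jaccard similarity used as coherence, and the threshold values are assumptions chosen only to keep the example small.

```python
from concurrent.futures import ThreadPoolExecutor


def items_coherence_analyzer(item, profile):
    """Average Jaccard similarity between `item` and the other profile items."""
    others = [other for other in profile if other is not item]
    if not others:
        return 1.0
    sims = [
        len(item["terms"] & o["terms"]) / len(item["terms"] | o["terms"])
        for o in others
    ]
    return sum(sims) / len(sims)


def temporal_analyzer(item, profile):
    """Distance, in time steps, from the most recently evaluated item."""
    latest = max(other["time"] for other in profile)
    return latest - item["time"]


def items_removal_analyzer(coherence, distance, min_coherence, max_distance):
    """An item is removed only if it is both incoherent and old."""
    return coherence < min_coherence and distance > max_distance


def profile_cleaner(item, profile, min_coherence, max_distance):
    """Return a copy of the profile, without `item` if both conditions hold."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        coherence_future = pool.submit(items_coherence_analyzer, item, profile)
        distance_future = pool.submit(temporal_analyzer, item, profile)
        coherence = coherence_future.result()
        distance = distance_future.result()
    if items_removal_analyzer(coherence, distance, min_coherence, max_distance):
        return [other for other in profile if other is not item]
    return list(profile)


# Hypothetical profile: "time" is the step at which the item was evaluated.
profile = [
    {"id": "i1", "terms": {"romantic", "comedy"}, "time": 1},
    {"id": "i2", "terms": {"space", "robots"}, "time": 8},
    {"id": "i3", "terms": {"space", "robots", "adventure"}, "time": 9},
    {"id": "i4", "terms": {"cooking", "show"}, "time": 10},
]
cleaned_old = profile_cleaner(profile[0], profile, min_coherence=0.2, max_distance=5)
cleaned_recent = profile_cleaner(profile[3], profile, min_coherence=0.2, max_distance=5)
```

Item i1 is incoherent and old, so it is removed, while i4 is equally incoherent but recent, so it is kept as a possible sign of a change in preferences. Threads are used here only to show the structure; a heavy semantic similarity computation would more plausibly be delegated to separate processes or to a distributed framework.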
5.3. Developing a System that Employs this Architecture
This section starts with an asymptotic time complexity analysis of
the proposed architecture, and concludes with a brief summary of its effectiveness
in a real-world system.
5.3.1. Asymptotic Time Complexity Analysis
It is natural to think that the introduction of a Profile Cleaner
component, even if useful, might lead to heavy tasks to be computed by the
system. Indeed, the component has to deal with a comparison between each
item and the rest of the user profile, and this similarity might involve
semantic elements and measures, which are usually very heavy to compute. Given
the widely-known big data problem that characterizes and affects current
systems, here we examine how to develop this component in real-world
scenarios.
The computation of the coherence of each of the new items with
the rest of the user profile might be distributed over different computers, by
employing large-scale distributed computing models like MapReduce. Moreover,
this process can be handled in background by the system, since when a user
evaluates a new item, it would hardly make any instant difference to the
computed recommendations. Therefore, if an incoherent item gets removed in a
reasonable time and with a distributed approach, the employment of the
Profile Cleaner component would be both effective and efficient.
Moreover, we studied the structure of the Profile Cleaner component to
let it run two subcomponents in parallel, so that even under this perspective
the process can be parallelized and efficient.
We believe that even if we are introducing a possibly heavy computational
process, the improvements in terms of accuracy and the structure of the
component would outweigh the complexity limits. Moreover, this complexity could
also be efficiently dealt with by the current technologies employed to face big
data problems (e.g., Hadoop’s MapReduce).
In any case, in order to evaluate the asymptotic time complexity of the
proposed architecture, here we formalize Algorithm 1, which performs the
profile cleaner task carried out by the Profile Cleaner component (Fig. 3).
It requires as input the user profile u_a and returns it (û_a) after all the
incoherent items in the first part of the user profile have been removed. It
should be noted that, in the algorithm, we define as τ the first part of the
user’s iteration history (i.e., the area where an incoherent item can be removed).
Changing this value, however, does not affect the complexity analysis.
In step 2 we calculate the minimum coherence value (m) to use as threshold
to mark an item ∈ u_a as coherent or incoherent. In steps 3-9 we process each
item in the user profile: we first get its coherence value c (step 4) and its position
p in the user’s iteration history (step 5), then we remove from the user
profile every item with a coherence value c under the threshold value m and
a value p within the first part of the user’s iteration history (steps 6-7). In
step 10, the algorithm returns the user profile û_a after all the incoherent items
in the first part of the user’s iteration history have been removed.
Algorithm 1 Profile Cleaner
Input: u_a = User profile
Output: û_a = Processed user profile
1: procedure ProfileCleaner(u_a)
2:     m = GetMinCoherence(u_a)
3:     for each item ∈ u_a do
4:         c = GetCoherence(item)
5:         p = GetPosition(item)
6:         if c < m AND p ∈ τ then
7:             Remove(item)
8:         end if
9:     end for
10:    Return û_a
11: end procedure
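A possible Python rendering of Algorithm 1 is sketched below. Several points are assumptions rather than part of the algorithm as stated: the profile is a list of term sets ordered from the oldest to the most recent evaluation, the similarity of an item with the rest of the profile is the average pairwise Jaccard similarity, and τ covers the first half of the profile (the tau_fraction parameter is hypothetical).

```python
def jaccard(a, b):
    """Stand-in similarity between two term sets (an assumed choice)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def coherence_values(profile, sim):
    """Average similarity of each item with the rest of the profile."""
    values = []
    for i, item in enumerate(profile):
        rest = profile[:i] + profile[i + 1:]
        if rest:
            values.append(sum(sim(item, other) for other in rest) / len(rest))
        else:
            values.append(1.0)  # a single item is trivially coherent
    return values


def profile_cleaner(profile, sim=jaccard, tau_fraction=0.5):
    """Remove the items that are below the minimum coherence m and that lie
    in the first part (tau) of the chronologically ordered profile."""
    if not profile:
        return []
    coherences = coherence_values(profile, sim)    # all pairs: O(N^2)
    m = sum(coherences) / len(profile)             # step 2
    tau = range(int(len(profile) * tau_fraction))  # positions of the first part
    cleaned = []
    for p, item in enumerate(profile):             # steps 3-9, p as in step 5
        c = coherences[p]                          # step 4: O(1) lookup
        if c < m and p in tau:                     # step 6
            continue                               # step 7: the item is removed
        cleaned.append(item)
    return cleaned                                 # step 10


# Hypothetical profile, oldest evaluation first.
profile = [
    {"romantic", "comedy"},
    {"space", "robots"},
    {"space", "robots", "adventure"},
    {"space", "adventure"},
    {"cooking", "show"},
]
cleaned = profile_cleaner(profile)
```

With these toy values only the first item is removed: the last item is just as incoherent, but it lies outside τ and is therefore kept.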
In the theoretical complexity analysis of Algorithm 1 we assume that the
dimension of the input is N, with N = |u_a|. The complexity is analyzed
for one profile, so the actual running time depends on the number of
users to process. The complexity (Big O notation) of step 4 and step 5 is
O(1), since this information is available at the end of step 2, which has a
complexity of O(N^2) (it compares the semantic meaning of all pairs of terms in an
item with those in the other items). Moreover, the complexity of the cycle in
steps 3-9 is O(N), and that of steps 6-7 is, even in the worst case,
lower than O(N). Therefore, the asymptotic complexity of the algorithm is
O(N^2).
The asymptotic complexity of the algorithm is then given by the step 2,
whose mathematical formalization is presented in Equation 1. It measures the
minimum coherence value by calculating the average of the semantic similarity
between each single item with the other ones (u_a \ item).
\[ m = \frac{1}{N} \cdot \sum_{item \in u_a} sim_{SEM}(item, u_a \setminus item) \qquad (1) \]
Regarding the computational load derived from this step, it should be ob-
served that it does not affect the recommendation process, because it happens
in another moment (i.e., when a user evaluates a new item). Therefore, even if
our approach has a high computational complexity, it is meant to run in back-
ground. Indeed, when a new item is evaluated by the user, it would not be
removed (even if incoherent) since it is recent, so its similarity with the rest of
the user profile can be processed in background.
5.3.2. Effectiveness of the Architecture in a Real-world System
Here, we give a brief summary of the results obtained by a real-world imple-
mentation of a semantics-aware recommender system that employs this archi-
tecture [40].
The experimental environment was based on the Java language, with the
support of the Java API implementation for WordNet Searching (JAWS)¹. In order
to test the proposed architecture, we used two real-world datasets widely
employed in the recommender systems literature, i.e., Yahoo! Webscope (R4)²
and Movielens 10M³.
The recommender systems used during the tests are SVD and a classic
User-Based Nearest Neighbors Collaborative Filtering approach. This allowed us to
evaluate our architecture on recommender systems in which the dimensionality
is reduced and in which the feature space is processed as it is. The Mahout⁴
framework was used to implement these recommender systems.
The results show that, when employed by a real-world system, our architec-
ture leads to statistically significant improvements in the accuracy, both when
compared to SVD, and to a classic User-Based Nearest Neighbors Collaborative
Filtering. Moreover, there are cases in which hundreds of items are removed
from a single user profile and get classified by our approach as noise, thus im-
proving the efficiency of the filtering during the recommendation process.
6. Conclusions and Future Work
In this paper, we dealt with the problems that might occur with the current
way in which content-based recommender systems are engineered and designed.
Given the high impact that emerging aspects are having in research and real-
world recommender systems, such as the introduction of the semantics in the fil-
tering process and the so-called magic barrier problem, we analyzed the current
architecture employed by a content-based recommender system and highlighted
possible improvements to deal with the presence of incoherent items. Indeed,
we showed that a form of cleaning of the user profiles is necessary in order to
overcome these limitations.
We then proposed an updated architecture, which was analyzed both from
a high-level point of view and by inspecting the component that allows a
system to clean a profile. By only adding one component to the state-of-the-art
architecture, our proposal is able to remove incoherent items from a user profile.
The strength of this component is that it is designed to run two subtasks (one
to detect the similarity of an item with the rest in the user profile and one to
detect the temporal correlation between the evaluation of the item and that
of the other ones in the user profile) in parallel, which would allow its
implementation in distributed systems. The weakness of
the proposed component is the high computational complexity, necessary to
detect the semantic similarity between an item and the rest of the user profile.
However, the component can run in background, so it can be integrated in
real-world systems.
Future work will move from the software engineering perspective of our study
to the development of efficient real-world implementations of this architecture
(e.g., on a grid), in order to study its efficiency and effectiveness in scenarios
characterized by big data (e.g., the recommendations performed by an e-commerce
website).
¹ http://lyle.smu.edu/~tspell/jaws/index.html
² http://webscope.sandbox.yahoo.com
³ http://grouplens.org/datasets/movielens/
⁴ http://mahout.apache.org/
References
[1] F. Ricci, L. Rokach, B. Shapira, Introduction to recommender systems
handbook, in: F. Ricci, L. Rokach, B. Shapira, P. B. Kantor (Eds.), Rec-
ommender Systems Handbook, Springer, 2011, pp. 1–35.
[2] M. de Gemmis, P. Lops, C. Musto, F. Narducci, G. Semeraro, Semantics-
aware content-based recommender systems, in: F. Ricci, L. Rokach,
B. Shapira (Eds.), Recommender Systems Handbook, Springer US, 2015,
pp. 119–159. doi:10.1007/978-1-4899-7637-6_4.
URL http://dx.doi.org/10.1007/978-1-4899-7637-6_4
[3] M. Capelle, F. Frasincar, M. Moerland, F. Hogenboom, Semantics-based
news recommendation, in: Proceedings of the 2Nd International Conference
on Web Intelligence, Mining and Semantics, WIMS ’12, ACM, New York,
NY, USA, 2012, pp. 27:1–27:9.
[4] M. Capelle, F. Hogenboom, A. Hogenboom, F. Frasincar, Semantic news
recommendation using wordnet and bing similarities, in: Proceedings of
the 28th Annual ACM Symposium on Applied Computing, SAC ’13, ACM,
New York, NY, USA, 2013, pp. 296–302.
[5] P. Basile, C. Musto, M. de Gemmis, P. Lops, F. Narducci, G. Semeraro,
Content-based recommender systems + dbpedia knowledge = semantics-
aware recommender systems, in: V. Presutti, M. Stankovic, E. Cambria,
I. Cantador, A. D. Iorio, T. D. Noia, C. Lange, D. R. Recupero, A. Tordai
(Eds.), Semantic Web Evaluation Challenge - SemWebEval 2014 at ESWC
2014, Anissaras, Crete, Greece, May 25-29, 2014, Revised Selected Pa-
pers, Vol. 475 of Communications in Computer and Information Science,
Springer, 2014, pp. 163–169.
13
[6] L. Li, Z. Yang, B. Wang, M. Kitsuregawa, Dynamic adaptation strate-
gies for long-term and short-term user profile to personalize search, in:
G. Dong, X. Lin, W. Wang, Y. Yang, J. X. Yu (Eds.), Advances in Data and
Web Management, Joint 9th Asia-Pacific Web Conference, APWeb 2007,
and 8th International Conference, on Web-Age Information Management,
WAIM 2007, Huang Shan, China, June 16-18, 2007, Proceedings, Vol. 4505
of Lecture Notes in Computer Science, Springer, 2007, pp. 228–240.
[7] W. Lam, S. Mukhopadhyay, J. Mostafa, M. J. Palakal, Detection of shifts
in user interests for personalized information filtering, in: SIGIR, 1996, pp.
317–325.
[8] D. H. Widyantoro, T. R. Ioerger, J. Yen, Learning user interest dynamics
with a three-descriptor representation, JASIST 52 (3) (2001) 212–225.
[9] X. Amatriain, J. M. Pujol, N. Oliver, I like it... I like it not: Evaluating
user ratings noise in recommender systems, in: G. Houben, G. I. McCalla,
F. Pianesi, M. Zancanaro (Eds.), User Modeling, Adaptation, and Personal-
ization, 17th International Conference, UMAP 2009, formerly UM and AH,
Trento, Italy, June 22-26, 2009. Proceedings, Vol. 5535 of Lecture Notes in
Computer Science, Springer, 2009, pp. 247–258.
[10] W. C. Hill, L. Stead, M. Rosenstein, G. W. Furnas, Recommending and
evaluating choices in a virtual community of use, in: I. R. Katz, R. L. Mack,
L. Marks, M. B. Rosson, J. Nielsen (Eds.), Human Factors in Computing
Systems, CHI ’95 Conference Proceedings, Denver, Colorado, USA, May
7-11, 1995., ACM/Addison-Wesley, 1995, pp. 194–201.
[11] J. L. Herlocker, J. A. Konstan, L. G. Terveen, J. Riedl, Evaluating collabo-
rative filtering recommender systems, ACM Trans. Inf. Syst. 22 (1) (2004)
5–53.
[12] A. Said, B. J. Jain, S. Narr, T. Plumbaum, Users and noise: The magic
barrier of recommender systems, in: J. Masthoff, B. Mobasher, M. C. Des-
marais, R. Nkambou (Eds.), User Modeling, Adaptation, and Personaliza-
tion - 20th International Conference, UMAP 2012, Montreal, Canada, July
16-20, 2012. Proceedings, Vol. 7379 of Lecture Notes in Computer Science,
Springer, 2012, pp. 237–248.
[13] A. Bellogín, A. Said, A. P. de Vries, The magic barrier of recommender sys-
tems - no magic, just ratings, in: V. Dimitrova, T. Kuflik, D. Chin, F. Ricci,
P. Dolog, G. Houben (Eds.), User Modeling, Adaptation, and Personaliza-
tion - 22nd International Conference, UMAP 2014, Aalborg, Denmark, July
7-11, 2014. Proceedings, Vol. 8538 of Lecture Notes in Computer Science,
Springer, 2014, pp. 25–36.
[14] M. J. Pazzani, D. Billsus, Content-based recommendation systems, in:
P. Brusilovsky, A. Kobsa, W. Nejdl (Eds.), The Adaptive Web, Springer-
Verlag, Berlin, Heidelberg, 2007, pp. 325–341.
URL http://dl.acm.org/citation.cfm?id=1768197.1768209
14
[15] M. Balabanović, Y. Shoham, Fab: Content-based, collaborative recommen-
dation, Commun. ACM 40 (3) (1997) 66–72.
[16] D. Billsus, M. J. Pazzani, A hybrid user model for news story classification,
in: Proceedings of the Seventh International Conference on User Modeling,
UM ’99, Springer-Verlag New York, Inc., Secaucus, NJ, USA, 1999, pp.
99–108.
URL http://dl.acm.org/citation.cfm?id=317328.317338
[17] H. Lieberman, Letizia: An agent that assists web browsing, in: Proceed-
ings of the 14th International Joint Conference on Artificial Intelligence
- Volume 1, IJCAI’95, Morgan Kaufmann Publishers Inc., San Francisco,
CA, USA, 1995, pp. 924–929.
URL http://dl.acm.org/citation.cfm?id=1625855.1625975
[18] M. Pazzani, J. Muramatsu, D. Billsus, Syskill &#38; webert: Identifying
interesting web sites, in: Proceedings of the Thirteenth National Conference
on Artificial Intelligence - Volume 1, AAAI’96, AAAI Press, 1996, pp. 54–
61.
URL http://dl.acm.org/citation.cfm?id=1892875.1892883
[19] I. Anagnostopoulos, G. Razis, P. Mylonas, C. Anagnostopoulos, Semantic
query suggestion using twitter entities, Neurocomputing 163 (2015) 137–
150. doi:10.1016/j.neucom.2014.12.090.
URL http://dx.doi.org/10.1016/j.neucom.2014.12.090
[20] P. Lops, M. de Gemmis, G. Semeraro, Content-based recommender sys-
tems: State of the art and trends, in: F. Ricci, L. Rokach, B. Shapira,
P. B. Kantor (Eds.), Recommender Systems Handbook, Springer, 2011,
pp. 73–105. doi:10.1007/978-0-387-85820-3_3.
URL http://dx.doi.org/10.1007/978-0-387-85820-3_3
[21] Q. Du, H. Xie, Y. Cai, H. Leung, Q. Li, H. Min, F. L. Wang, Folksonomy-
based personalized search by hybrid user profiles in multiple levels, Neuro-
computing 204 (2016) 142–152. doi:10.1016/j.neucom.2015.10.135.
URL http://dx.doi.org/10.1016/j.neucom.2015.10.135
[22] Y. Koren, R. M. Bell, C. Volinsky, Matrix factorization techniques for rec-
ommender systems, IEEE Computer 42 (8) (2009) 30–37. doi:10.1109/
MC.2009.263.
URL http://dx.doi.org/10.1109/MC.2009.263
[23] K. Ji, R. Sun, X. Li, W. Shu, Improving matrix approximation for recom-
mendation via a clustering-based reconstructive method, Neurocomputing
173 (2016) 912–920. doi:10.1016/j.neucom.2015.08.046.
URL http://dx.doi.org/10.1016/j.neucom.2015.08.046
[24] O. A. Arqub, A.-S. Mohammed, S. Momani, T. Hayat, Numerical solu-
tions of fuzzy differential equations using reproducing kernel hilbert space
method, Soft Computing (2015) 1–20.
15
[25] O. A. Arqub, Adaptation of reproducing kernel algorithm for solving fuzzy
fredholm–volterra integrodifferential equations, Neural Computing and Ap-
plications (2015) 1–20.
[26] V. Schickel-Zuber, B. Faltings, Inferring user’s preferences using ontologies,
in: Proceedings, The Twenty-First National Conference on Artificial Intel-
ligence and the Eighteenth Innovative Applications of Artificial Intelligence
Conference, July 16-20, 2006, Boston, Massachusetts, USA, AAAI Press,
2006, pp. 1413–1418.
[27] D. Laniado, D. Eynard, M. Colombetti, A semantic tool to support navi-
gation in a folksonomy, in: Proceedings of the Eighteenth Conference on
Hypertext and Hypermedia, HT ’07, ACM, New York, NY, USA, 2007, pp.
153–154. doi:10.1145/1286240.1286282.
URL http://doi.acm.org/10.1145/1286240.1286282
[28] M. N. Moreno, S. Segrera, V. F. L. Batista, M. D. M. Vicente, A. L.
Sánchez, Web mining based framework for solving usual problems in rec-
ommender systems. A case study for movies’ recommendation, Neurocom-
puting 176 (2016) 72–80. doi:10.1016/j.neucom.2014.10.097.
URL http://dx.doi.org/10.1016/j.neucom.2014.10.097
[29] G. Lv, C. Hu, S. Chen, Research on recommender system based on ontology
and genetic algorithm, Neurocomputing 187 (2016) 92–97. doi:10.1016/
j.neucom.2015.09.113.
URL http://dx.doi.org/10.1016/j.neucom.2015.09.113
[30] G. Stilo, P. Velardi, Time makes sense: Event discovery in twitter using
temporal similarity, in: Proceedings of the 2014 IEEE/WIC/ACM Inter-
national Joint Conferences on Web Intelligence (WI) and Intelligent Agent
Technologies (IAT) - Volume 02, WI-IAT ’14, IEEE Computer Society,
Washington, DC, USA, 2014, pp. 186–193.
[31] G. Stilo, P. Velardi, Temporal semantics: Time-varying hashtag sense clus-
tering, in: Knowledge Engineering and Knowledge Management, Vol. 8876
of Lecture Notes in Computer Science, Springer International Publishing,
2014, pp. 563–578.
[32] G. Stilo, P. Velardi, Efficient temporal mining of micro-blog texts and its
application to event discovery, Data Mining and Knowledge Discovery.
[33] A. Said, B. J. Jain, S. Narr, T. Plumbaum, S. Albayrak, C. Scheel, Esti-
mating the magic barrier of recommender systems: a user study, in: W. R.
Hersh, J. Callan, Y. Maarek, M. Sanderson (Eds.), The 35th International
ACM SIGIR conference on research and development in Information Re-
trieval, SIGIR ’12, Portland, OR, USA, August 12-16, 2012, ACM, 2012,
pp. 1061–1062.
16
[34] X. Amatriain, J. M. Pujol, N. Tintarev, N. Oliver, Rate it again: in-
creasing recommendation accuracy by user re-rating, in: L. D. Bergman,
A. Tuzhilin, R. D. Burke, A. Felfernig, L. Schmidt-Thieme (Eds.), Proceed-
ings of the 2009 ACM Conference on Recommender Systems, RecSys 2009,
New York, NY, USA, October 23-25, 2009, ACM, 2009, pp. 173–180.
[35] C. Leacock, M. Chodorow, Combining local context and wordnet similarity
for word sense identification, in: C. Fellbaum (Ed.), WordNet: An Elec-
tronic Lexical Database, MIT Press, 1998, pp. 305–332.
[36] J. J. Jiang, D. W. Conrath, Semantic similarity based on corpus statistics
and lexical taxonomy, arXiv preprint cmp-lg/9709008.
[37] P. Resnik, Using information content to evaluate semantic similarity in a
taxonomy, in: Proceedings of the 14th International Joint Conference on
Artificial Intelligence - Volume 1, IJCAI’95, Morgan Kaufmann Publishers
Inc., San Francisco, CA, USA, 1995, pp. 448–453.
URL http://dl.acm.org/citation.cfm?id=1625855.1625914
[38] D. Lin, An information-theoretic definition of similarity, in: J. W. Shavlik
(Ed.), Proceedings of the Fifteenth International Conference on Machine
Learning (ICML 1998), Madison, Wisconsin, USA, July 24-27, 1998, Mor-
gan Kaufmann, 1998, pp. 296–304.
[39] Z. Wu, M. Palmer, Verbs semantics and lexical selection, in: Proceedings
of the 32Nd Annual Meeting on Association for Computational Linguis-
tics, ACL ’94, Association for Computational Linguistics, Stroudsburg, PA,
USA, 1994, pp. 133–138.
[40] R. Saia, L. Boratto, S. Carta, A semantic approach to remove inco-
herent items from a user profile and improve the accuracy of a rec-
ommender system, Journal of Intelligent Information Systems (2016) 1–
24doi:10.1007/s10844-016-0406-7.
URL http://dx.doi.org/10.1007/s10844-016-0406-7
17