
Similarity Measures for Recommender Systems: Drawbacks and Neighbors Formation



Similarity measures are crucial for selecting neighbors for the users of recommender systems. However, the massive amount of data processed in such applications may hide the inner nature of the similarity measures used. This paper studies three standard similarity measures in depth on many synthetic and real datasets. The aim is to uncover the hidden nature of these measures and determine their suitability for recommender systems under different scenarios. Moreover, we propose a novel similarity measure called the normalized sum of multiplications (NSM) and two variants of it. We examine all measures at three levels: a toy example, synthetic datasets, and real-world datasets. The results show that the Pearson correlation coefficient and cosine similarity sometimes exclude similar neighbors in favor of less valuable ones. The former measures the direction of the correlation, while the latter measures the angle. However, direction and angle are not similarity itself but indications of it, and they can take the same values for two distant vectors. In contrast, the proposed similarity measure consistently reveals the exact similarity and tracks the closest neighbors. The results demonstrate its robustness and its very good predictive accuracy compared with the traditional measures.
Mohammad Al-Shamri
King Khalid University
Research Article
Keywords: Web-based services, Collaborative recommender system, Similarity measures, User profile, Web Personalization
Posted Date: September 23rd, 2022
License: This work is licensed under a Creative Commons Attribution 4.0 International License.
Mohammad Yahya H. Al-Shamri1,2
1Computer Engineering Department, College of Computer Science, King Khalid University, Abha, Saudi Arabia
2Electrical Engineering Department, Faculty of Engineering, Ibb University, Ibb, Yemen.
E-mail: mohamad.alshamri@
1. Introduction
The web has become an indispensable channel for delivering many services. One important service is the recommendation of online items and services, which is growing very fast because it shrinks the search space and avoids overwhelming users with a massive amount of irrelevant information [1]. In fact, a recommender system builds a bidirectional relationship between users and service providers. The success of such a relationship benefits both sides and pushes researchers toward developing effective strategies. The suggested items are likely to interest the active user, and a gratified user may become a loyal user of the system. To succeed, recommender systems need powerful and efficient matching tools that compare users and guide them toward close neighbors [2,3]. Since the matching process is essential for the success of such systems, much effort has gone into proposing many types of similarity measures. Some measures are inherited from the information retrieval domain, while others were introduced solely for recommender systems [4].
Typically, similarity functions generate a single value from a multitude of scores, representing the degree of resemblance between two users/items. This value is significant because it determines the degree of influence of this user/item on the predicted ratings. Hence, it helps active users find relevant and exciting information in the form of suggestions, and it therefore has a direct effect on the correctness of the generated list. A representative similarity value enhances the system's personalization level and achieves the primary goal of such systems [5].
This paper explores the most common similarity measures for collaborative filtering, namely the Pearson correlation coefficient (PCC), cosine similarity (COS), and mean square difference-based similarity (MSD), and studies their performance in the recommendation field. Because of the massive amount of information available for recommendation, many similarity measures may appear to perform well. However, a deeper analysis of their behavior may reveal conflicting cases and sometimes unacceptable performance. These problems may bias the system's performance and limit its diversity across different situations and moods. It is not easy to identify such issues by examining the similarity measures on a massive real-world dataset, as such a dataset is usually biased toward users' common sense. For example, almost all people like the film Titanic and score it highly, so suggesting it to anyone has a good chance of being a correct prediction. Once the user profile contains a reasonable ratio of such items, the role of the similarity measure diminishes, and any random value may produce good results. To uncover this matter, we need to dig into the user vector and consider many synthetic cases. A general similarity measure should consider all possible cases of users. Therefore, we test the examined similarity measures on several synthetic datasets with various characteristics. First, we assume uniformly distributed samples; then, we augment the same dataset with one exact copy, and then five identical copies, of each active user. This arrangement allows us to investigate the hidden performance of each measure in different scenarios. Finally, we assume a single-mood synthetic dataset, which is very important for users who prefer to rate only one type of item, possibly what they like, and give all such items the same value. Hence, we study the performance of the basic similarity measures on synthetic and real datasets for individual users and for the system as a whole. Based on the findings of this study, we go one step further and propose a novel similarity measure that alleviates the drawbacks of the standard similarity measures, handles all users fairly, and achieves good performance for recommender systems. The contributions of this work are:
1. Studying and analyzing many similarity measures to highlight the weaknesses of each one, if any, and identify the reasons for such performance.
2. Creating many synthetic datasets to determine the suitability of many similarity measures for recommendation applications.
3. Proposing a novel similarity measure that positively affects the recommendation
The remainder of this paper is organized as follows. Section 2 discusses related work. Section 3 discusses the most common similarity measures for recommender systems, the range of similarity scores, and their relationship to the number of stars in the system. It also goes through a toy example to give an initial view of the performance of each similarity function in different scenarios. Section 4 discusses the proposed similarity measure and its variants and highlights their effect on the system performance. For further verification, we devote two sections to experiments. Section 5 examines the similarity measures on synthetic data to observe their performance on data with known characteristics; the experimental setup, experiments, and evaluation procedure are discussed there. Section 6 examines the similarity measures on two real datasets. We conclude the paper in Section 7 with some directions for future work.
2. Related Work
Matching users or items is the cornerstone of a successful recommendation process. Without an efficient matching process, the system may follow unrelated users or items and hence provide misleading information as an output. Bag et al. [6] classified similarity measures into two types: traditional and heuristic. Based on this classification, all the similarity measures examined in this paper are traditional. Bagchi [2] used Apache Mahout to investigate the quality of many similarity measures used to match collaborative filtering users. He concluded that the system's performance is highly dependent on the metric used for similarity measurement and found that Euclidean-based similarity outperformed the others in error, recall, and precision.
The literature takes three major directions in dealing with similarity measures. The first direction tries to identify the limitations and drawbacks of the traditional similarity measures [7-14], as summarized in Table 1. This table lists the code, name, and a brief description of each problem, together with its references. The second direction merges more than one similarity measure to compensate for some drawbacks. The third direction prefers to propose new similarity measures.
Some drawbacks of the existing similarity measures, especially PCC and COS, are highlighted by Patra et al. [10]. These drawbacks relate to the common ratings and their number, local information, and the utilization of the user ratings. They concluded that these measures cannot generate trustworthy neighbors if the rating matrix is sparse. Jain et al. [15] noticed that PCC showed low similarity scores for similar patterns and high similarity scores for different patterns. In addition, they found that MSD degraded accuracy because it ignores the number of shared histories between users, and that Jaccard MSD cannot capture the actual preference difference between users. Many references that studied the limitations of traditional similarity measures are listed in [11,13], whose authors argued that PCC and COS focus on the direction and ignore the length of the user vectors. Guo et al. [11] highlighted four problems of traditional similarity measures, while Tan and He [13] added three more. This work identifies three further problems and adds them to the list. The identified problems concern COS, MSD, and NSM. COS gives full marks to two opposite ratings if they appear constantly in the profile, while MSD considers the difference without regard to its position on the rating scale.
Morozov and Zhong [8] listed many drawbacks of traditional similarity measures. They inferred that COS is not trustworthy because it gives a total similarity value for one common rating. Moreover, they illustrated that PCC is undefined for any uniform vector and that, apart from directions, MSD emphasizes distance with no rating adjustment. They concluded that similarity functions that rely on covariance give a value of 0 whenever there are only two common ratings, regardless of their ratings. Hence, they suggested combining PCC and MSD to overcome some drawbacks. He and Luo [7] argued that linear correlation between users caused many weaknesses in traditional similarity metrics. To solve that, they proposed a new similarity measure based on mutual information to exploit the nonlinear dependency between items. Similarly, Sheugh and Alizadeh [9] investigated PCC as a similarity measure, highlighted its limitations, and proposed a new measure to mitigate them.
Table 1: Identified drawbacks/problems for traditional similarity measures.
Some authors proposed and explored different approaches for calculating the degree of similarity between users. Karypis [16] discussed variants of similarity measures that use conditional probability to compute the similarity between items or users. Millan et al. [17] proposed a simple asymmetric similarity measure based on the cardinalities of both the common set and the sets of the users. Bobadilla et al. [18] used hidden attributes of the recommendation process to propose a singularity measure that improves predictions. Still, this measure is limited by the number of singularities in the shared history between users. Guo et al. [11] proposed a Bayesian similarity that considers both the direction and the length of the user vectors. Tan and He [13], in contrast, brought the principle of physical resonance into the similarity process to overcome the low number of shared histories between users. They proposed a parameterized similarity measure that considers the distance between users, common ratings, and uncommon ratings. Mu et al. [19] used the Hellinger distance to mitigate the effect of the sparsity problem on the similarity computation. This measure is used as a global similarity measure, and a weighted sum of local and global similarities is then used for predictions.
Ahn [20] proposed nonlinear similarity models with three semantic heuristics: proximity, impact, and popularity (PIP). Later, Liu et al. [21] argued that the PIP model might frequently degrade the ratings in some situations and therefore proposed another model that uses a different set of heuristics called proximity, significance, and singularity. These models show adaptive behavior, but they are over-dependent on the co-rated items. Al-Shamri and Bharadwaj [22] used the fuzzy concordance/discordance principle as a similarity measure for memory-based recommenders. It again relies on the common set of ratings between two users. Bag et al. [6] argued that considering only common ratings is inappropriate for generating trustworthy neighbors. So, they proposed variants of the Jaccard index that consider all user ratings irrespective of whether they are common. Ayub et al. [14] proposed a similarity measure based on the concept of the Jaccard index by introducing another argument to consider the average ratings of users. Schwarz et al. [23] inverted the Euclidean distance to obtain a similarity measure for finding the coincidence between users. Pirasteh et al. [24] proposed an asymmetric similarity function based on the common ratings and then normalized it by the cardinality of the active user. Gazdar and Hidri [3] introduced a new similarity function based on the common ratings and the number of ratings given by the two users. They used nonlinear systems, linear differential equations, and integrals to transform user preferences into a similarity value.
Finally, some authors explored the performance under other parameters such as sparsity or the cardinality of the common set. For example, Jain et al. [4] reviewed a variety of similarity measures and analyzed their performance for different sparsity percentages. They found that PCC and COS may give false indications about similarity, while the best results are obtained with measures relying on the Minkowski distance. Hassanieh et al. [25] studied the error performance of many similarity measures and inferred that the system performance differs across sparsity levels. Moreover, Stephen et al. [26] addressed sparse recommendation data and its effect on similarity calculations. They argued that pairs of vectors with common ratings are even rarer. Therefore, they studied the hierarchical categorization of items to obtain a clear overview of the user interests. Al-Shamri [12] studied the effect of the common set cardinality on different similarity measures using empirical examples and inferred some weaknesses of PCC and COS.
3. Similarity Measures for Recommendation Process
Similarity measures guide the system in identifying a set of neighbors for the active user at hand. PCC, the most common similarity measure for collaborative recommendation, measures how linearly two users are related to each other [4,6,10].

$$PCC(u_x, u_y) = \frac{\sum_{s_k \in S_{xy}} (r_{x,k} - m_x)(r_{y,k} - m_y)}{\sqrt{\sum_{s_k \in S_{xy}} (r_{x,k} - m_x)^2}\,\sqrt{\sum_{s_k \in S_{xy}} (r_{y,k} - m_y)^2}}$$

Here $r_{x,k}$ is the rating of the user $u_x$ for item $s_k$, $m_x$ is the mean rating of $u_x$, and $S_{xy}$ is the set of common ratings between $u_x$ and $u_y$. On the other hand, COS finds the normalized dot product between the vectors of two users [4,10]. It is used by many applications, including YouTube and Amazon.
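
$$COS(u_x, u_y) = \frac{\sum_{s_k \in S_{xy}} r_{x,k}\, r_{y,k}}{\sqrt{\sum_{s_k \in S_{xy}} r_{x,k}^2}\,\sqrt{\sum_{s_k \in S_{xy}} r_{y,k}^2}}$$
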
In fact, COS finds the alignment between two vectors, not the rating agreement [8]. Finally, the mean square distance measures the distance between two vectors [6,9,10,27]. This measure puts more emphasis on significant differences than on minor ones.

$$dis(u_x, u_y) = \frac{1}{|S_{xy}|} \sum_{s_k \in S_{xy}} \left( \frac{r_{x,k} - r_{y,k}}{RS} \right)^2$$

We divide each difference by the maximum rating scale value, $RS$, before squaring, to keep the value normalized. The distance can be transformed into a similarity value by subtracting it from one. Accordingly, the MSD similarity measure will be [6,10]:

$$MSD(u_x, u_y) = 1 - dis(u_x, u_y)$$

The following subsections discuss the range values of the similarity score and then
present a toy example to have an initial test for the performance of different similarity
measures on different profiles.
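The three measures above can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: profiles are assumed to be dictionaries mapping item identifiers to ratings, the mean $m_x$ is assumed to be taken over all ratings of the user, and the function names and the `rs_max` parameter are hypothetical.

```python
import math

def common_items(ux, uy):
    """Items rated by both users, i.e. the set S_xy."""
    return sorted(set(ux) & set(uy))

def pcc(ux, uy):
    """Pearson correlation over the common items; None when undefined."""
    items = common_items(ux, uy)
    if not items:
        return None
    # Assumption: m_x is the mean of all ratings of the user.
    mx = sum(ux.values()) / len(ux)
    my = sum(uy.values()) / len(uy)
    num = sum((ux[k] - mx) * (uy[k] - my) for k in items)
    den = (math.sqrt(sum((ux[k] - mx) ** 2 for k in items))
           * math.sqrt(sum((uy[k] - my) ** 2 for k in items)))
    return num / den if den else None

def cos(ux, uy):
    """Cosine similarity over the common items; None when undefined."""
    items = common_items(ux, uy)
    num = sum(ux[k] * uy[k] for k in items)
    den = (math.sqrt(sum(ux[k] ** 2 for k in items))
           * math.sqrt(sum(uy[k] ** 2 for k in items)))
    return num / den if den else None

def msd(ux, uy, rs_max=5):
    """MSD similarity: one minus the mean of the normalized squared differences."""
    items = common_items(ux, uy)
    if not items:
        return None
    dis = sum(((ux[k] - uy[k]) / rs_max) ** 2 for k in items) / len(items)
    return 1.0 - dis

# Hypothetical profiles on a 5-star scale; only i1..i3 are co-rated.
ux = {"i1": 5, "i2": 3, "i3": 1}
uy = {"i1": 4, "i2": 3, "i3": 2, "i4": 5}
print(pcc(ux, uy), cos(ux, uy), msd(ux, uy))
```

Returning `None` for a zero denominator mirrors the undefined (incomputable) case that the toy example marks with a sentinel value.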
3.1 Range Values of the Similarity Score
The range of the generated similarity score is significant for understanding the behavior of different similarity measures. Usually, similarity measures generate values that range from 0 to 1 for unipolar measures and from -1 to 1 for bipolar ones. However, recommender systems rarely use 0 as a rating scale value, and hence many direct similarity measures start from some value greater than 0, as with MSD. Logically, the maximum of the range is obtained when comparing identical cases. In contrast, the minimum is obtained between the two extreme cases, i.e., the maximum and the minimum of the rating scale.
To study this range for different measures, let us assume a single mood user whose ratings all equal one rating scale value $i$. Table 2 lists the similarity results between such a user and all other single mood users for three different rating scales. As is evident from the results, PCC cannot help in such cases, while COS generates the same value for all. On the other hand, MSD gives, to some extent, correct values for the maximum and the minimum. The minimum value changes with the number of stars on the rating scale and approaches zero as the number of stars increases.
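The range argument can be checked numerically with a small sketch. It assumes constant (single mood) profiles and the MSD form in which each difference is divided by the maximum rating before squaring; the helper names and the profile length `n_items` are illustrative choices, and the printed numbers are computed from the formulas rather than copied from Table 2.

```python
import math

def cos_constant(i, j, n_items=5):
    """COS between two constant profiles that always rate i and j."""
    num = n_items * i * j
    den = math.sqrt(n_items * i * i) * math.sqrt(n_items * j * j)
    return num / den

def msd_constant(i, j, n_stars):
    """MSD similarity between two constant profiles on an n_stars scale."""
    return 1.0 - ((i - j) / n_stars) ** 2

for n_stars in (5, 7, 10):
    msd_low = msd_constant(1, n_stars, n_stars)         # the two extreme users
    msd_high = msd_constant(n_stars, n_stars, n_stars)  # identical users
    cos_extreme = cos_constant(1, n_stars)              # COS for the extreme users
    print(n_stars, round(msd_low, 3), msd_high, round(cos_extreme, 3))
```

Under these assumptions the MSD minimum shrinks as the scale grows, while COS stays at its maximum even for the two extreme users. PCC is left out because every deviation from the mean is zero for a constant profile.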
3.2 Toy Example
This example generates a dataset of 15 users rating 15 items using a 5-star rating scale, as shown in Table 3. This small dataset has users with different rating histories. For clarity, we assume that pure positive ratings are 4 and 5 while pure negative ratings are 2 and 1. The rating 3 may be considered neutral, but it is usually added to the broad category of positive ratings. The toy example includes many opposite profiles to compare the performance of the similarity measures on opposite users. The inclusion of such users allows us to observe the behavior of the similarity measures with users having contrasting moods.
Table 2: Range values of different similarity measures for 5-, 7-, and 10-star rating scales.
In the toy example, user 1, $u_1$, gives uniformly distributed ratings while user 2, $u_2$, behaves oppositely. These two users represent the ideal case, where users utilize the rating scale evenly. For generating the opposite profile, we calculate the counter-value of each rating value as:

$$c(r_{x,k}) = N + 1 - r_{x,k}$$

Note that $N$ is the maximum star on the rating scale. User 3, $u_3$, has positive ratings (3,4,5) only, representing a dominant case of recommender systems where users navigate and then rate what they liked in advance. The opposite user for this case is user 4, $u_4$. At the extreme points, user 5, $u_5$, has pure negative ratings (1,2) while user 6, $u_6$, is the opposite with pure positive ratings (4,5). User 7, $u_7$, represents a collinear vector with $u_5$ by a factor of 2. A user with only one rated item is given by user 8, $u_8$. User 9, $u_9$, is designed to study the cross-value problem. Five more users are added to show single mood ratings for each rating scale value.
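The counter-value rule used to build opposite profiles can be sketched as follows. The profile below is hypothetical and is not taken from Table 3; the function names are illustrative.

```python
def counter_value(rating, n_stars=5):
    """Counter-value of a rating: N + 1 - r."""
    return n_stars + 1 - rating

def opposite_profile(profile, n_stars=5):
    """Opposite user: every rating replaced by its counter-value."""
    return {item: counter_value(r, n_stars) for item, r in profile.items()}

# Hypothetical pure-negative profile on a 5-star scale.
negative_user = {"i1": 1, "i2": 2, "i3": 1, "i4": 2}
positive_user = opposite_profile(negative_user)
print(positive_user)

# A collinear profile scales every rating by the same factor (here 2).
collinear_user = {item: 2 * r for item, r in negative_user.items()}
print(collinear_user)
```

Applying the rule twice returns the original profile, which is why each opposite pair in the example is symmetric.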
Results and Discussions
Table 3: Dataset of Example 1: 15 users rating 15 items using a 5-star rating scale.
Table 4: Similarity values for users of the Toy Example for different similarity measures.
The similarity values between different users of the toy example are listed in Table 4. Since all examined similarity measures are symmetric, we allocate the upper triangle of Table 4 to three similarity measures and the lower triangle to the other three. We put 2 for an undefined (incomputable) similarity value. The following points illustrate the principal observations and conclusions:
1. The bottom right-corner red subtable illustrates the similarity values for inter- and intra- single mood users, where:
- PCC cannot match these users because it calculates the rating deviation from the user mean, which is always zero (PB4). Hence PCC is not suitable for these users.
- COS always gives full marks, even for opposite moods. This behavior is due to the nature of COS, which measures the cosine of the angle between the two vectors. If both users have a single mood, whether positive or negative, their vectors are parallel and the angle between them is zero.
- MSD gives reasonable and gradual values based on the size of the difference because it measures the rating difference, not the angle or direction.
2. The results of the opposite cases $(u_1, u_2)$, $(u_3, u_4)$, and $(u_5, u_6)$ reveal that PCC readily identified them, while COS and MSD gave them high values even though they are opposite. COS assigned high values because the two vectors are almost parallel, so the angle between them is minimal.
3. The common ratings between $(u_2, u_6)$ are (5,4,3,5) and (4,2,4,4), indicating a positive relationship. Similarly, the common ratings between $(u_1, u_5)$ are (1,2,3,1) and (2,1,2,2), indicating a negative relationship. However, PCC gave both -0.493, treating them as opposite users, which is misleading. This value confirms the cross-value problem (PB9) of PCC.
4. The common ratings between $(u_1, u_3)$ are (4,4,5,5) and (5,5,4,4), which indicate a strong positive relationship. However, PCC degraded the similarity to 0.222, which again confirms the sensitivity of PCC to cross values. On the other hand, COS and MSD gave almost full marks.
5. MSD cannot grasp user preferences if the difference is the same (PB11). For example, the MSD values for users $(u_{12}, u_{15})$ and $(u_{11}, u_{14})$ are the same, 0.64. Similarly, the values for users $(u_{12}, u_{14})$, $(u_{11}, u_{13})$, and $(u_{13}, u_{15})$ are the same, 0.84. Some pairs belong to the same category (positive or negative) while others belong to different categories, and hence treating them the same is not logical.
6. PB3 is evident for $u_8$, where it got an undefined value from PCC and a total value from COS.
7. PCC and COS values for all users with the collinear users $u_5$ and $u_7$ are equal, confirming problem PB5.
8. MSD assigned a moderate value to the two extreme users $u_{11}$ and $u_{15}$, which is better than PCC and COS. However, it should be zero.
9. The common history between $u_2$ and $u_5$ is (2,1,2,2) and (5,4,3,5), showing opposite mood users. However, COS gave them a very high mark of 0.961 due to the free value problem (PB10). A similar situation occurs between $(u_6, u_{15})$ and all opposite users.
10. The single co-rated item problem (PB2) is very clear from the results of (1,
and (2,
9), where PCC gave +1 or -1, and COS always gave 1.
11. MSD results of $(u_6, u_9)$ and $(u_6, u_{14})$ confirmed the unequal-length problem (PB6), where a dense history user encounters unfair evaluation in favor of a shallow history user.
12. Two users, $u_1$ and $u_2$, have uniformly distributed ratings. However, they got different MSD values with $u_{10}$ because of the common snap of ratings (PB8).
From the results of this example, we can conclude the following main points:
- PCC shows different behavior due to its dependency on the ratings' means.
- COS calculates the cosine of the angle between the vectors of the two users. Therefore, it cannot always differentiate between the user categories.
- MSD treats the same differences equally. This is inappropriate for the recommendation process.
The listed pitfalls and the results obtained from the toy example encouraged us to go further and propose a new similarity measure in the following section.
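The single mood observations (PB4, PB10, and PB11) can be reproduced with a short sketch. It assumes fully co-rated constant profiles on a 5-star scale; the function names are illustrative, and `None` stands for the undefined PCC value.

```python
import math

N_STARS = 5
N_ITEMS = 15  # as in the toy example; any positive length gives the same values

def single_mood(value):
    """Constant profile: the same rating for every item."""
    return [value] * N_ITEMS

def pcc(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (math.sqrt(sum((a - mx) ** 2 for a in x))
           * math.sqrt(sum((b - my) ** 2 for b in y)))
    return num / den if den else None  # None marks the undefined case (PB4)

def cos(x, y):
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return num / den

def msd(x, y):
    return 1.0 - sum(((a - b) / N_STARS) ** 2 for a, b in zip(x, y)) / len(x)

for i, j in [(2, 5), (1, 4), (2, 4), (1, 3), (3, 5), (1, 5)]:
    x, y = single_mood(i), single_mood(j)
    print((i, j), pcc(x, y), round(cos(x, y), 3), round(msd(x, y), 2))
```

Pairs with the same rating gap receive the same MSD value regardless of where the gap sits on the scale, which is the behavior the example labels PB11.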
4. Normalized Sum of Multiplications Similarity Measures
PCC and COS mainly indicate the direction of the correlation between users, not their exact similarity. Both measures rely on different combinations of the sum of multiplications. PCC multiplies the rating deviations from the mean of each user, while COS multiplies the ratings directly. However, both of them normalize the result by the norms of their combinations. This normalization, for example, makes COS an indicator of the angle between the users, which may be the same for two distant users. To resolve this issue, we propose a normalized sum of multiplications that modifies the COS similarity.
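Definition 1: The normalized sum of multiplications (NSM) between two vectors (users' profiles), $u_x$ and $u_y$, having the common items $S_{xy}$ is defined by:

$$NSM(u_x, u_y) = \frac{\sum_{s_k \in S_{xy}} r_{x,k}\, r_{y,k}}{\sum_{s_k \in S_{xy}} \left( \max(r_{x,k}, r_{y,k}) \right)^2}$$
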
Here the denominator is the sum of the squares of the maximum of each pair of individual ratings, which preserves the exact similarity between different users. The range values of NSM are also listed in Table 2. The range starts from some value greater than 0 and gives the correct value for the maximum. NSM differs from MSD in the minimum it produces. The minimum value of both MSD and NSM changes with the number of stars on the rating scale and approaches zero as the number of stars increases. However, NSM always generates a smaller minimum value than MSD, making it better suited than MSD for capturing opposite cases.
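Definition 1 translates directly into code. The sketch below assumes fully co-rated rating lists with strictly positive ratings (so the denominator is never zero); the function name `nsm` is illustrative. It also compares the minimum of NSM with the minimum of MSD for the three rating scales, using the MSD form with the difference divided by the maximum rating before squaring.

```python
def nsm(x, y):
    """Normalized sum of multiplications over co-rated rating lists."""
    num = sum(a * b for a, b in zip(x, y))
    den = sum(max(a, b) ** 2 for a, b in zip(x, y))
    return num / den

# One-unit gaps at the top and at the bottom of a 5-star scale.
print(nsm([4], [5]), nsm([1], [2]))

# Minimum similarity (the two extreme ratings) for each scale.
for n_stars in (5, 7, 10):
    nsm_min = nsm([1], [n_stars])
    msd_min = 1.0 - ((n_stars - 1) / n_stars) ** 2
    print(n_stars, round(nsm_min, 3), round(msd_min, 3))
```

For a single pair of ratings the measure reduces to the smaller rating divided by the larger one, so the minimum on an N-star scale is 1/N under these assumptions.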
4.1 Normalized Sum of Multiplications Similarity Measure for the Users of the Toy Example
The results of NSM for the users of the toy example are also shown in Table 4. We will trace the same points of the previous analysis using NSM similarity for this example:
1. The NSM similarity values of the bottom right-corner red subtable of Table 4 are gradual for different users, starting from 0.5 for close users in the negative category to 0.8 for close users in the positive category.
2. The similarity values between opposite users are less than those of MSD. The similarity value between the two extreme users, $u_{11}$ and $u_{15}$, is 0.2, the minimum possible value for this rating scale using the NSM similarity measure (Table 2).
3. The similarity value between $(u_2, u_6)$ and $(u_1, u_5)$ is 0.571, representing the degree of positive/negative relationship between these users.
4. The similarity value between $u_1$ and $u_3$, which have a strong positive relationship, is 0.821, which indicates the degree of positiveness between them.
5. Unlike MSD, NSM grasps user preferences if the difference is the same, avoiding PB11. For example, the NSM values for $(u_{12}, u_{15})$ and $(u_{11}, u_{14})$ are 0.4 and 0.25. Similarly, the NSM values for $(u_{12}, u_{14})$, $(u_{11}, u_{13})$, and $(u_{13}, u_{15})$ are 0.5, 0.33 and 0.6. This performance keeps the category difference between users.
6. NSM and MSD gave the same full similarity for $u_8$ with $u_3$. However, they behave differently with $u_8$ and $u_4$ as they have opposite ratings. The value given by NSM is more accurate than that of MSD.
7. NSM and MSD values for users with the collinear users $u_5$ and $u_7$ are not equal. Again, the value given by NSM is more realistic than that of MSD.
8. NSM assigned a meager value to the two extreme users $u_{11}$ and $u_{15}$, which is better than MSD.
9. NSM does not suffer from the free-value problem (PB10). It gave logical values
for different cases. The similarities between opposite users are far smaller than
10. NSM and MSD do not suffer from PB2.
11. The unequal-length problem (PB6) is common between many similarity
measures due to the common history restrictions for calculating the similarity
12. The common snap of ratings between users makes NSM suffer from PB8.
From the results of this example on NSM, we can conclude that:
- All opposite cases have similarity values less than 0.5.
- The minimum value depends on the rating scale.
- NSM can differentiate easily between the user categories by treating the same differences based on their classes.
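The contrast between MSD and NSM for equal rating gaps can be tabulated with a short sketch. It assumes constant, fully co-rated profiles on a 5-star scale; the helper names are illustrative and the values are computed from the two formulas rather than read from the table of results.

```python
def nsm(x, y):
    """Normalized sum of multiplications over co-rated rating lists."""
    return (sum(a * b for a, b in zip(x, y))
            / sum(max(a, b) ** 2 for a, b in zip(x, y)))

def msd(x, y, n_stars=5):
    """MSD similarity with each difference divided by the maximum rating."""
    return 1.0 - sum(((a - b) / n_stars) ** 2 for a, b in zip(x, y)) / len(x)

def constant(value, n_items=15):
    """Single mood profile: the same rating for every item."""
    return [value] * n_items

# Gap of 3: (2,5), (1,4). Gap of 2: (2,4), (1,3), (3,5).
for i, j in [(2, 5), (1, 4), (2, 4), (1, 3), (3, 5)]:
    x, y = constant(i), constant(j)
    print((i, j), "MSD", round(msd(x, y), 2), "NSM", round(nsm(x, y), 2))
```

MSD depends only on the size of the gap, whereas NSM also depends on where the two ratings sit on the scale, so pairs with the same gap receive different NSM values.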
4.2 Upper-scale and Lower-scale Normalized Sum of Multiplications Similarity
NSM emphasizes the similarity between close positive users and downgrades the similarity between close negative users because of the multiplication and maximum operators applied to the top or bottom values of the rating scale. For example, 4×5=20 and the maximum is 5, while 1×2=2 and the maximum is 2. Both pairs are one unit apart; however, their NSM values differ, 0.8 and 0.5. Hence NSM encourages the recommendation of liked items by promoting the similarity of positive users and degrading the similarity between negative users. This behavior may be acceptable for some applications with a huge amount of data, but not for others. To resolve this problem, we have to consider the counter-values of the ratings when doing the multiplication process. In fact, the previous example involves counter-values, where 5 is the counter-value of 1 and 4 is the counter-value of 2 on the 5-star rating scale. This discriminatory behavior encourages us to propose two other versions of NSM. The first one shifts the results toward the top, while the second one shifts the results downward.
Definition 2: The counter-value normalized sum of multiplications between two vectors (users' profiles), $u_x$ and $u_y$, having the common items $S_{xy}$ is defined by:

$$\overline{NSM}(u_x, u_y) = NSM(c(u_x), c(u_y))$$

Definition 3: The upper-scale normalized sum of multiplications (UNSM) between two vectors (users' profiles), $u_x$ and $u_y$, having the common items $S_{xy}$ is defined by:

$$UNSM(u_x, u_y) = \max\left( NSM(u_x, u_y),\ \overline{NSM}(u_x, u_y) \right)$$

Definition 4: The lower-scale normalized sum of multiplications (LNSM) between two vectors (users' profiles), $u_x$ and $u_y$, having the common items $S_{xy}$ is defined by:

$$LNSM(u_x, u_y) = \min\left( NSM(u_x, u_y),\ \overline{NSM}(u_x, u_y) \right)$$

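A minimal sketch of Definitions 2 to 4 follows. It assumes fully co-rated rating lists on an N-star scale and applies the counter-value rule $N + 1 - r$ to every rating; the function names are illustrative rather than taken from the paper.

```python
def nsm(x, y):
    """Normalized sum of multiplications over co-rated rating lists."""
    return (sum(a * b for a, b in zip(x, y))
            / sum(max(a, b) ** 2 for a, b in zip(x, y)))

def counter(profile, n_stars=5):
    """Counter-value profile: every rating r becomes N + 1 - r."""
    return [n_stars + 1 - r for r in profile]

def counter_nsm(x, y, n_stars=5):
    """NSM computed on the counter-value profiles (Definition 2)."""
    return nsm(counter(x, n_stars), counter(y, n_stars))

def unsm(x, y, n_stars=5):
    """Upper-scale NSM: the larger of NSM and counter-value NSM."""
    return max(nsm(x, y), counter_nsm(x, y, n_stars))

def lnsm(x, y, n_stars=5):
    """Lower-scale NSM: the smaller of NSM and counter-value NSM."""
    return min(nsm(x, y), counter_nsm(x, y, n_stars))

# A one-unit gap gets the same pair of scores at both ends of the scale.
print(unsm([4], [5]), lnsm([4], [5]))
print(unsm([1], [2]), lnsm([1], [2]))

# If one user is the counter of the other, all three variants coincide.
x = [1, 2, 3]
y = counter(x)
print(nsm(x, y), unsm(x, y), lnsm(x, y))
```

Taking the maximum or the minimum removes the asymmetry between positive and negative pairs: the pairs (4,5) and (1,2) now receive identical UNSM values and identical LNSM values.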
4.3 UNSM and LNSM for the Users of the Toy Example
Table 4 lists the results of UNSM and LNSM for the users of the toy example, which lead to the following remarks:
1. All NSM variants give the same score if one user is the counter of the other, as in all opposite cases.
2. If the countering operation flips both values, the score may decrease or increase
much like that of (8,
15) and (9,
3. If the countering operation flips only one value, then the score may change
slightly like in the case of (12,
13), (8,
9), and (4,
4. The maximum difference between UNSM and LNSM values depends on the rating scale used; it is 0.3 for the 5-star rating scale of this example.
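The dependence of the UNSM and LNSM gap on the rating scale can be explored with a small enumeration. This sketch only considers a single co-rated rating pair (equivalently, two single mood users); longer mixed profiles are not covered here. The function names are illustrative.

```python
from itertools import combinations

def nsm_pair(a, b):
    """NSM for a single co-rated rating pair."""
    return (a * b) / max(a, b) ** 2

def gap(a, b, n_stars):
    """|NSM - counter-value NSM|, which equals UNSM - LNSM for this pair."""
    direct = nsm_pair(a, b)
    countered = nsm_pair(n_stars + 1 - a, n_stars + 1 - b)
    return abs(direct - countered)

def max_gap(n_stars):
    """Largest gap over all distinct rating pairs, with a pair that attains it."""
    pairs = combinations(range(1, n_stars + 1), 2)
    return max((gap(a, b, n_stars), (a, b)) for a, b in pairs)

for n_stars in (5, 7, 10):
    value, pair = max_gap(n_stars)
    print(n_stars, round(value, 3), pair)
```

For the 5-star scale the largest single-pair gap is attained by the one-unit gaps at the ends of the scale, (1,2) and (4,5), in line with the 0.8 versus 0.5 example.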
5. Examined Similarity Measures with Synthetic Datasets
Usually, different similarity measures are examined using real-world datasets like MovieLens and Jester. However, these datasets are biased due to the inherent nature of recommender systems, which mainly suggest liked items. Hence these items have a high chance of being rated. In the following, we discuss the generated synthetic datasets and then analyze the results for various similarity measures.
5.1 Generated Synthetic Dataset
To observe the performance of different similarity measures on datasets with predefined statistical characteristics, we randomly generated four different synthetic datasets, each with four splits for cross-validation, as discussed below. The numbers of ratings for all examined datasets are listed in Table 5. We omitted Dataset_2 and Dataset_3 from the table as they are different combinations of Dataset_1.
Dataset_1: (Uniformly Distributed Dataset)
This dataset consists of four splits of 125 users who randomly rate 1200 movies on a 9-star rating scale. Both the number of ratings for each user and the ratings themselves are uniformly distributed. Hence, each rating scale value has an equal chance of appearing in the user-item matrix. The aim is to verify the performance of the similarity measures on a properly uniform dataset, not a biased one. Moreover, we use a large and symmetric rating scale to give a clear picture of the effect of uniformly distributed ratings on the system performance. For each experiment, one split is taken as the set of active users, and the remaining splits are treated as the training set. Hence, there are 125 test users and 375 training users.
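A dataset of this kind could be generated along the following lines. This is only a sketch under stated assumptions, not the generator used for the reported experiments: the function names are hypothetical, and the range from which each user's number of ratings is drawn (1 to the number of items) is an assumption, since only its uniform distribution is specified.

```python
import random


def generate_uniform_split(n_users=125, n_items=1200, scale_max=9, rng=None):
    """Return one split as {user_index: {item_index: rating}}."""
    rng = rng or random.Random()
    split = {}
    for user in range(n_users):
        # Profile size drawn uniformly; the range [1, n_items] is an assumption.
        n_rated = rng.randint(1, n_items)
        rated_items = rng.sample(range(n_items), n_rated)
        # Each rating is drawn uniformly from 1..scale_max (9-star scale by default).
        split[user] = {item: rng.randint(1, scale_max) for item in rated_items}
    return split


def generate_uniform_dataset(n_splits=4, seed=None, **kwargs):
    """Return a list of independent splits drawn from one seeded generator."""
    rng = random.Random(seed)
    return [generate_uniform_split(rng=rng, **kwargs) for _ in range(n_splits)]
```

Passing a fixed seed makes the generated splits reproducible, which helps when the same folds must be reused across similarity measures.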
Dataset_2:
Like Dataset_1, this dataset has uniformly distributed random ratings. Here, we added the same set of testing users to the training users for each split; hence, the training users become 500 while the test users remain 125. The aim is to explore the effect of having one similar user in the training dataset for each active user. For each experiment, one split is taken as the set of active users, while all splits are treated as the training set, including the one used for testing.
Dataset_3:
To extend the notion of similar users in the training dataset, we duplicated each test user of Dataset_1 five times and added the copies to the training dataset. Hence, we have 1000 training users for each fold of this experiment. For each experiment, one split is taken as the set of active users, while the training set consists of all splits duplicated five times, including the one used for testing. Hence, there are 125 test users, while the training users become 2500.
Dataset_4: (Single Mood Dataset)
To study the effect of single mood users on the system’s performance, we randomly generated four splits of 125 users who randomly rate 1200 movies, with random single mood ratings for each user. For experimentation, one split is taken as the set of active users, and the remaining splits are treated as the training set. Hence, there are 125 test users and 375 training users.
5.2 Experimental Setup
Four folds are formed by reserving one split for testing the system and merging the remaining splits to form the fold's training dataset. Hence, there are 375 training users and 125 test users for Dataset_1 and Dataset_4. The ratings of each test user are divided randomly into training ratings (80%) and test ratings (20%). The performance of the similarity measures is studied for different neighborhood set sizes (NSS) to observe their behavior under many conditions. Specifically, we set NSS to 1, 2, 3, 4, 5, 10, 20, 30, 40, and 50. NSS increases first in steps of one, then by five, and finally by tens to cover very small, small, and normal neighborhood sizes.
Table 5: Number of ratings for each dataset.
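The fold formation and the 80/20 division of each test user's ratings described in this subsection can be sketched as follows. The helper names make_folds and split_ratings are hypothetical, and rounding the number of held-out ratings to the nearest integer is an assumption made for this sketch.

```python
import random

# Neighborhood set sizes examined in the experiments.
NSS_VALUES = [1, 2, 3, 4, 5, 10, 20, 30, 40, 50]


def make_folds(splits):
    """Yield (test_users, training_users), holding out each split once."""
    for i, test_split in enumerate(splits):
        training = [u for j, s in enumerate(splits) if j != i for u in s]
        yield list(test_split), training


def split_ratings(profile, test_fraction=0.2, rng=None):
    """Randomly divide one user's {item: rating} profile into known and hidden parts."""
    rng = rng or random.Random()
    items = list(profile)
    rng.shuffle(items)
    n_hidden = round(len(items) * test_fraction)  # rounding rule is an assumption
    hidden_items = set(items[:n_hidden])
    known = {i: r for i, r in profile.items() if i not in hidden_items}
    hidden = {i: profile[i] for i in hidden_items}
    return known, hidden
```

With four splits of 125 users, each fold produced by make_folds contains 125 test users and 375 training users, matching the counts given above.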
5.3 Evaluation Metrics
The experiments are evaluated using the percentage of correct predictions (PCP), mean absolute error (MAE), and user coverage (UC) [12,28]. Because we deal with neighbors and their effect on the system performance, UC is a fundamental metric that shows how many active users the system can successfully help. The results are averaged over all folds.
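For orientation, the three metrics can be computed as in the sketch below. The formal definitions are those of [12,28]; this sketch assumes that a prediction is correct when its rounded value equals the actual rating, that MAE is averaged over the ratings for which a prediction exists, and that a user is covered when at least one of his or her test ratings is predicted correctly. The function name evaluate and the input layout are hypothetical. Note that MAE is returned here in rating units, whereas the reported values are percentages whose normalization is not reproduced in this sketch.

```python
def evaluate(results):
    """Compute (PCP, MAE, UC) from {user: [(actual, predicted_or_None), ...]}."""
    correct = 0
    total = 0
    abs_errors = []
    covered_users = 0
    for pairs in results.values():
        user_correct = 0
        for actual, predicted in pairs:
            total += 1
            if predicted is None:  # no neighbor could supply a prediction
                continue
            abs_errors.append(abs(actual - predicted))
            if round(predicted) == actual:
                user_correct += 1
        correct += user_correct
        if user_correct > 0:
            covered_users += 1
    pcp = 100.0 * correct / total if total else 0.0
    mae = sum(abs_errors) / len(abs_errors) if abs_errors else float("nan")
    uc = 100.0 * covered_users / len(results) if results else 0.0
    return pcp, mae, uc
```

Averaging the returned triples over the four folds then gives one value per metric and per NSS.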
5.4 Analysis of the Results
This subsection discusses the results of the four conducted experiments.
The results of the first experiment (Dataset_1) for all NSS values are given in Fig. 1. All similarity measures show comparably low performance, as they deal with the ideal case of uniformly distributed ratings. PCP is shown in part (a); the results lie between about 11% and 12% for all similarity measures at high values of NSS, with a clear advantage for LNSM. In fact, NSM outperforms MSD in almost all cases, and LNSM surpasses both. UNSM is the worst among the NSM variants. PCP starts very low for all approaches at low NSS and increases as NSS grows, reaching a maximum of 12% at NSS = 50. The results improve as more neighbors contribute to the prediction process. This performance is expected, as we have highly random equiprobable ratings, which are rare in real datasets.
In terms of MAE, all measures reveal comparable results, with a slight advantage for LNSM. Note that the MAE values are high and improve somewhat as NSS increases, reaching 26% at NSS = 50. The performance is fairly stable at NSS = 30, 40, and 50. Fig. 1(c) shows the user coverage for the different similarity measures. UC increases with NSS. LNSM surpasses the other similarity measures by a small margin for low values of NSS up to NSS = 10. However, UC is very small for low values of NSS, at only around 10% for all measures. Thus, only 10% of the active users can benefit from the system with these values of NSS.
Fig. 1. Evaluation metrics for Dataset_1: (a) PCP, (b) MAE, (c) User Coverage.
Fig. 2. Evaluation metrics for Dataset_2: (a) PCP, (b) MAE, (c) User Coverage.
The pattern of the results of the second experiment (Dataset_2, Fig. 2) is almost opposite to that of the previous experiment because we now have one exact match in the training dataset for each test user. Due to this doping, the lowest PCP value is now around 16%, which exceeds the highest value of the previous experiment. This emphasizes the power of similarity measures to identify similar users once they exist in the training dataset. The system uses this information efficiently to enhance its predictions.
PCP starts very high even with only one neighbor, as shown in Fig. 2(a). This advantage shrinks as the number of neighbors increases, reaching the minimum at NSS = 50, as more less-valuable neighbors now contribute to the prediction process. The results indicate that many neighbors are not always desirable, especially for datasets having similar moods, although they may benefit users with odd attitudes. The results are comparable for all similarity measures for the lower part of NSS up to NSS = 5. Then PCC shows the best performance by a small margin. MSD is inferior to NSM and UNSM in terms of all evaluation metrics. UNSM is the best among the NSM variants, and LNSM is the worst for all metrics.
In terms of MAE, MSD and COS show the lowest performance, especially for high NSS. However, MAE is much improved compared to that of experiment 1. It is zero at NSS = 1 for all similarity measures and does not exceed 20% at NSS = 50. UC results are depicted in Fig. 2(c). All similarity measures reveal high UC with a minimum value of 91%, confirming the importance of finding close neighbors in the training dataset. UC values decrease after NSS = 10, again indicating that many neighbors are not always desirable.
The results of the third experiment (Dataset_3, Fig. 3) are enhanced further compared to experiment 2. Here, the training dataset includes five similar users for each active user. Therefore, the results for NSS = 1, 2, 3, 4, and 5 are identical for all similarity measures. Note that this dataset does not include conflicting cases like single mood users. PCP is very high and starts decaying after NSS = 5, where PCC keeps its advantage over the others while MSD keeps the lowest performance in this range. All measures reach their minimum at NSS = 50. The minimum value of PCP is now 44%. All NSM variants outperform MSD and COS over the same range. This conclusion is also apparent for MAE, which confirms the PCP findings. One critical point here is that MAE is very low and does not exceed 10% for all measures, with a clear advantage for PCC at NSS = 10 to 50. Again, having similar users in the dataset is very important for the system's success, and their number strongly affects the performance.
Here, UC shows stable performance for the lower part of NSS at 99.5%. It achieves 100% for PCC and the NSM variants at NSS = 10 and begins to decay slowly after that. After NSS = 5, UC changes because the set of neighbors changes as NSS increases. Note that the performance decreases monotonically after NSS = 5; selecting the best NSS value is therefore crucial for the system's success and depends on the similarity measure and the dataset at hand.
Fig. 3. Evaluation metrics for Dataset_3: (a) PCP, (b) MAE, (c) User Coverage.
Fig. 4. Evaluation metrics for Dataset_4: (a) PCP, (b) MAE, (c) User Coverage.
Dataset_4, used in the fourth experiment, is unique in representing users with single mood ratings. It allows us to assess the performance for such cases, which are always hidden in real-world datasets as their users have various moods. In this case, the means of the training users and the testing users are the same due to the single mood nature of the dataset, and hence PCC is undefined. The other similarity measures can handle this situation and reach the maximum possible value at NSS = 40 (Fig. 4). Increasing NSS beyond this value does not affect any of the evaluation metrics.
The PCP of this experiment shows that COS is the best for such a dataset, though it cannot differentiate between single mood users exactly, as discussed earlier. This advantage vanishes as NSS goes up, until all measures perform almost the same at NSS = 40. At this value, PCP reaches 100%; hence, increasing NSS further adds no value to the system and only consumes resources. Consistent findings appear for MAE. However, the error values are high for the lower part of NSS, reaching 50% for some similarity measures. The error decreases monotonically to zero for all similarity measures except PCC, where it stays fixed at about 56% because no neighbor is found.
UC is above 90% for all similarity measures except PCC, where it is zero. Except for PCC, UC is full for NSS greater than 3, as the measures can select close neighbors in this case. COS is better than the others for NSS = 1 and 2.
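The behavior of PCC and COS on single mood users can be checked with a small sketch. It assumes that a single mood user gives the same rating to every item, so the user's rating vector has zero variance. The functions pearson and cosine below are textbook formulations written for this illustration and return None when the measure is undefined.

```python
import math


def pearson(x, y):
    """Pearson correlation over co-rated items; None when the denominator is zero."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    denom = math.sqrt(var_x * var_y)
    return cov / denom if denom else None


def cosine(x, y):
    """Cosine of the angle between two rating vectors; None for a zero vector."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else None


# Two single mood users with different moods (assumed constant ratings).
always_four = [4, 4, 4]
always_two = [2, 2, 2]
pcc_score = pearson(always_four, always_two)  # undefined: zero variance
cos_score = cosine(always_four, always_two)   # same direction despite different moods
```

Under this assumption, PCC finds no usable neighbor, while COS assigns the same score to any pair of single mood users regardless of their rating values, which is consistent with the observations above.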
6. Examined Similarity Measures with Real Datasets
This section examines the similarity measures with two real datasets, MovieLens [29] and Jester [30]. For this purpose, we randomly selected four folds of 125 users reflecting the dataset distribution, as discussed in Al-Shamri [12]. The rating scale of Jester is digitized to 10 stars, and three mean groups are defined to create the dataset categories in analogy to those of MovieLens. The ratings of each test user are divided randomly into 80% training ratings and 20% test ratings. The training dataset includes the remaining folds for each split, and the results are averaged over all splits to ensure cross-validation of our results.
6.1 MovieLens Dataset
The results of this experiment are shown in Fig. 5. PCP is very low for the lower part of NSS and increases monotonically for the upper part, reaching 37% for LNSM. PCC, NSM, and UNSM alternate in superiority between the lower and upper parts of NSS. However, they are below LNSM in all cases. Moreover, MSD outperforms NSM and UNSM by a small margin for the upper part of NSS, whereas COS shows the lowest results. Note that the results do not reach a stable level; hence, they may increase further once NSS is increased beyond 50. This indicates that NSS should be selected carefully based on the dataset and its characteristics.
MAE shows a behavior similar to that of PCP, performing very poorly at the beginning with a 66% error. The randomness of this dataset is very high, and the results improve as NSS increases because more neighbors contribute to the process. The improvement percentage between the LNSM values at NSS = 1 and NSS = 50 is around 275%. UC results confirm the random nature of the dataset, as UC is only 40% at NSS = 1. In this case, the selected neighbors cannot help 60% of the active users to predict any item correctly. This value is much enhanced at NSS = 10, where it reaches 95%.
6.2 Jester Dataset
This dataset is denser than MovieLens, as each user has rated at least 35 jokes out of 101. PCP (Fig. 6(a)) starts high at NSS = 1 compared to the MovieLens dataset, which has a very high sparsity level. However, the big success at the start does not continue, and PCP achieves a maximum of only 20% at NSS = 20. The results decay slightly after that. LNSM outperforms the others for NSS = 1..5, 10, 20, and 40. PCC performs the worst for NSS = 1..10 and as well as NSM for the remaining values except NSS = 30, where it is the best.
Here, the system can assist users from the beginning as there is a low number of jokes. MovieLens results are very low at the beginning (2%) and gradually increase. In contrast, Jester results are high initially, stabilize early, and then decrease slightly from their maximum value, which is the opposite of MovieLens. Of course, a PCP of 10% is not numerically high, but it is high compared to MovieLens.
MSD is the best for the lower part of NSS in terms of MAE. The MAE values start high at NSS = 1 and go down as NSS goes up, becoming stable at 16% at NSS = 30. Increasing NSS further is not suitable for such a dataset. UC is high for all measures (between 75% and 90%). LNSM shows the best values for five NSS values, while PCC shows the best results for four high NSS values.
Fig. 5. Evaluation metrics for MovieLens dataset: (a) PCP, (b) MAE, (c) User Coverage.
Fig. 6. Evaluation metrics for Jester dataset: (a) PCP, (b) MAE, (c) User Coverage.
7. Conclusions
A similarity measure for a collaborative recommender system should reflect the actual similarity between users. Usually, RSs include many users in the prediction process, which hides the fine details of the similarity measures. However, the most used similarity measures may fail, as illustrated by the results of the toy example and the synthetic data. Hence, we propose NSM and its variants as efficient solutions for such systems. NSM inherits the power of COS and avoids its weakness of indicating the direction only. NSM directly manipulates each vector value and captures the exact differences between them. However, strict differentiation between users is sometimes not required for recommender systems. Therefore, we propose two variants of NSM that represent lenient and strict behavior based on the position of the rating value on the scale.
We examine the proposed similarity variants on synthetic and real datasets with different characteristics. The results of MSD and NSM are robust in many cases, with an advantage for NSM as it does not suffer from PB10. Moreover, the similarity measures show comparable results for a high number of neighbors as the dataset becomes denser.
Basic similarity measures for CRSs rely on the common set only. However, this is not enough, as more parameters should be considered. Future work will examine the effect of including such parameters, besides studying the performance of the proposed similarity measures with different neighbors.
Data Availability Statement
The datasets generated during and/or analyzed during the current study are available
from the corresponding author on reasonable request.
Acknowledgments
The author extends his appreciation to the Deanship of Scientific Research at King Khalid University for funding this work through the Research Group Project under grant number (RGP. 2/165/43).
References
1. Kluver D, Ekstrand MD, Konstan JA (2018) Rating-Based Collaborative Filtering: Algorithms and Evaluation. Social Information Access: Systems and Technologies, 344-390.
2. Bagchi S (2015) Performance and Quality Assessment of Similarity Measures in Collaborative Filtering Using Mahout. Procedia Computer Science 50:229-234.
3. Gazdar A, Hidri L (2020) A new similarity measure for collaborative filtering based recommender systems. Knowledge-Based Systems 188, 105058,
4. Jain G, Mahara T, Tripathi KN (2020a) A Survey of Similarity Measures for Collaborative Filtering-Based Recommender System. M. Pant et al. (eds.), Soft Computing: Theories and Applications, Advances in Intelligent Systems and Computing 1053, Springer.
5. Wischenbart M, Firmenich S, Rossi G et al (2021) Engaging end-user driven recommender systems: personalization through web augmentation. Multimed Tools Appl 80:6785-6809.
6. Bag S, Kumar SK, Tiwari MK (2019a) An efficient recommendation generation using relevant Jaccard similarity. Information Science 483.
7. He X, Luo Y (2010) Mutual Information Based Similarity Measure for Collaborative Filtering. Proc. Progress in Informatics and Computing (PIC), 2010 IEEE International Conference, pp.
8. Morozov S, Zhong X (2013) The evaluation of similarity metrics in collaborative filtering recommenders. Proc. of the 2013 Hawaii University International Conferences Education & Technology Math & Engineering Technology, Honolulu, HI, USA.
9. Sheugh L, Alizadeh SH (2015) A note on Pearson correlation coefficient as a metric of similarity in recommender system. Proc. 2015 AI & Robotics (IRANOPEN), pp. 1-6, doi:
10. Patra BKr, Launonen R, Ollikainen V, Nandi S (2015) A new similarity measure using Bhattacharyya coefficient for collaborative filtering in sparse data. Knowledge-Based Systems
11. Guo G, Zhang J, Yorke-Smith N (2016) A novel evidence-based Bayesian similarity measure for recommender systems. ACM Trans. Web 10(2):1-30. doi: 10.1145/2856037
12. Al-Shamri MYH (2016) Effect of Collaborative Recommender System Parameters: Common Set Cardinality and the Similarity Measure. Advances in Artificial Intelligence 2016:10 pages.
13. Tan Z, He L (2017) An Efficient Similarity Measure for User-Based Collaborative Filtering Recommender Systems Inspired by the Physical Resonance Principle. IEEE Access 5:27211-27228. doi: 10.1109/ACCESS.2017.2778424.
14. Ayub M, Ghazanfar MA, Maqsood M, Saleem A (2018) A Jaccard base similarity measure to improve performance of CF based recommender systems. Proc. 2018 International Conference on Information Networking (ICOIN), pp. 1-6, doi: 10.1109/ICOIN.2018.8343073.
15. Jain A, Nagar S, Singh PK, Dhar J (2020b) EMUCF: Enhanced multistage user-based collaborative filtering through nonlinear similarity for recommendation systems. Exp. Sys. with App.
16. Karypis G (2001) Evaluation of item-based top-N recommendation algorithms. Proc. the International Conference on Information and Knowledge Management (CIKM ’01), pp. 247-254, Atlanta, Ga, USA.
17. Millan M, Trujillo M, Ortiz E (2007) A Collaborative Recommender System Based on Asymmetric User Similarity. In: Yin H, Tino P, Corchado E, Byrne W, Yao X (eds) Intelligent Data Engineering and Automated Learning - IDEAL 2007. Lecture Notes in Computer Science, vol 4881. Springer, Berlin, Heidelberg.
18. Bobadilla J, Ortega F, Hernando A (2012) A collaborative filtering similarity measure based on singularities. Inf. Process. Manage 48(2):204-217. doi: 10.1016/j.ipm.2011.03.007.
19. Mu Y, Xiao N, Tang R, Luo L, Yin X (2019) An Efficient Similarity Measure for Collaborative Filtering. Procedia Computer Science, 147:416-421.
20. Ahn HJ (2008) A new similarity measure for collaborative filtering to alleviate the new user cold-starting problem. Inf. Sci. 178(1):37-51.
21. Liu H, Hu Z, Mian A, Tian H, Zhu X (2014) A new user similarity model to improve the accuracy of collaborative filtering. Knowledge-Based Systems 56:156-166.
22. Al-Shamri MYH, Bharadwaj KK (2007) A hybrid preference-based recommender system based on fuzzy concordance/discordance principle. Proc. 3rd Indian International Conference on Artificial Intelligence (IICAI’07), pp. 301-314, Pune, India.
23. Schwarz M, Lobur M, Stekh Y (2017) Analysis of the effectiveness of similarity measures for recommender systems. Proc. 14th International Conference: The Experience of Designing and Application of CAD Systems in Microelectronics (CADSM), pp. 275-277. doi:
24. Pirasteh P, Jung JJ, Hwang D (2015) An Asymmetric Weighting Schema for Collaborative Filtering. In: Camacho D, Kim SW, Trawiński B (eds) New Trends in Computational Collective Intelligence. Studies in Computational Intelligence, vol 572. Springer.
25. Hassanieh LA, Jaoudeh CA, Abdo JB, Demerjian J (2018) Similarity measures for collaborative filtering recommender systems. Proc. IEEE Middle East and North Africa Communications Conference (MENACOMM), pp. 1-5. doi: 10.1109/MENACOMM.2018.8371003.
26. Stephen SC, Xie H, Rai S (2017) Measures of Similarity in Memory-Based Collaborative Filtering Recommender System: A Comparison. Proc. 4th Multidisciplinary International Social Networks Conference (MISNC '17), Association for Computing Machinery, New York, NY, USA, pp. 1-8. DOI:
27. Ning X, Desrosiers C, Karypis G (2015) A Comprehensive Survey of Neighborhood-Based Recommendation Methods. In: Ricci F, Rokach L, Shapira B (eds) Recommender Systems Handbook. Springer, Boston, MA.
28. Bag S, Ghadge A, Tiwari MK (2019b) An integrated recommender system for improved accuracy and aggregate diversity. Computers & Industrial Engineering 130:187-197.