Article

Solving the Sparsity Problem in Recommender Systems Using Association Retrieval


Abstract

Recommender systems are widely applied in many fields, such as e-commerce, to provide products, services and information to potential customers. Collaborative filtering, the most successful approach, recommends content to the current customer mainly on the basis of the past transactions and feedback of similar customers. However, the sparsity problem, caused by an insufficient amount of transaction and feedback data, makes it difficult to identify customers with similar interests, which limits the usability of collaborative filtering. This paper proposes direct and indirect similarity between users and computes the similarity matrix from the relative distance between users' ratings. Using association retrieval technology to explore the transitive associations in the users' feedback data, it realizes a new collaborative filtering approach that alleviates the sparsity problem and improves recommendation quality. Finally, experiments on the MovieLens data set indicate that the proposed approach effectively alleviates the sparsity problem and achieves good coverage and recommendation quality.
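
The abstract does not give the paper's formulas, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that direct similarity is one minus the mean absolute rating difference on co-rated items, normalised by an assumed 1-5 rating scale, and that indirect similarity is the best product of direct similarities along a short chain of intermediate users. The names `direct_similarity`, `similarity_matrix`, `extend_with_indirect`, and the sample ratings are hypothetical.

```python
from itertools import combinations

RATING_RANGE = 4.0  # assumption: ratings on a 1-5 scale, as in MovieLens


def direct_similarity(ratings_a, ratings_b):
    """Similarity from the relative distance on co-rated items; 0.0 if none."""
    common = set(ratings_a) & set(ratings_b)
    if not common:
        return 0.0
    mean_distance = sum(abs(ratings_a[i] - ratings_b[i]) for i in common) / len(common)
    return 1.0 - mean_distance / RATING_RANGE


def similarity_matrix(ratings):
    """Direct similarity for every pair of users, stored as nested dicts."""
    users = sorted(ratings)
    sim = {u: {v: 0.0 for v in users if v != u} for u in users}
    for u, v in combinations(users, 2):
        sim[u][v] = sim[v][u] = direct_similarity(ratings[u], ratings[v])
    return sim


def extend_with_indirect(sim, max_hops=2):
    """Add transitive associations: keep the best product of direct
    similarities along any chain of at most max_hops links."""
    users = list(sim)
    best = {u: dict(sim[u]) for u in users}
    for _ in range(max_hops - 1):
        updated = {u: dict(best[u]) for u in users}
        for u in users:
            for v in users:
                if u == v:
                    continue
                for w in users:
                    if w == u or w == v:
                        continue
                    via_w = best[u][w] * sim[w][v]
                    if via_w > updated[u][v]:
                        updated[u][v] = via_w
        best = updated
    return best


# Hypothetical data: alice and carol share no rated item, but both overlap with bob.
ratings = {
    "alice": {"m1": 5, "m2": 4},
    "bob": {"m2": 3, "m3": 2},
    "carol": {"m3": 2, "m4": 5},
}
direct = similarity_matrix(ratings)
extended = extend_with_indirect(direct, max_hops=2)
```

In this toy data the direct similarity between alice and carol is zero because they have no co-rated item, while the extended matrix gives them a non-zero value through bob. That is the effect a transitive-association approach relies on when the rating matrix is sparse.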


... Many attempts have been made to alleviate the sparsity and cold-start problems. Transitive association-based methods [10,11], clustering-based methods [12,13], which reduce the dimensionality using latent semantic indexing (LSI) [14], binary preference-based methods [15], and correlation and cosine-based techniques [16], give good performance for recommendations while reducing a ...
... Section 6 concludes and discusses implications for future work. [10] leverages the experiences of similar users in the system to predict the target users' personalized preferences and make recommendations. It only requires past user ratings to predict the remaining unknown ratings; then, items can be recommended to users according to the predictions [22,23]. ...
... Chen et al. [10] managed the sparsity problem successfully by using association retrieval technology and proposed a new CF algorithm to improve recommendation performance. They examined the transitive associations based on the user's feedback data. ...
Article
The rapid development of the Internet in recent years has led to a vast increase in the numbers of Web services, which challenge users' capability to find their favorite services quickly and accurately. There is thus an urgent demand for service recommendations that help discover applicable services. Although the collaborative filtering technique is one of the most successful recommendation system technologies, it suffers from data sparsity and cold‐start problems, which in turn lead to inaccurate results. In this article, we deal with these issues by applying a novel ontology‐based clustering approach that uses domain specificity and service similarity for ontology generation. This clustering approach can easily and effectively increase the data density of the user‐service dataset by assuming blank user preferences according to the history of user‐favored domain(s). Then, user ratings are predicted by calculating the trust value between users. The experimental results indicate that the proposed approach can effectively alleviate the data sparsity and cold‐start problems with lower prediction error with the best recommendation performance, and our method performed better than baseline approaches in terms of accuracy and novelty. © 2019 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc.
... For available rating, the Eqs. (45), (46) and (47) are used to compute prediction values. In the case of unavailable ratings, the Eqs. ...
... The Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) are most commonly used evaluation criteria to validate the CF-based RS performance. [45,46] MAE is the mean of absolute deviation between the actual and the predicted rating. ...
... The predicted rating values P(u_j; I_i) for the proposed method are calculated by using Eqs. (45), (46) and (47). Finally, the predicted ratings are rounded off to the nearest integer for both the existing and proposed method. ...
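
The MAE and RMSE criteria mentioned in these excerpts can be sketched as follows. This is a minimal illustration under stated assumptions: the sample ratings are invented, and `round_half_up` is a hypothetical helper, used because the excerpt does not say how ties are rounded and Python's built-in `round` rounds halves to the nearest even integer.

```python
import math


def mae(actual, predicted):
    """Mean of the absolute deviations between actual and predicted ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Square root of the mean squared deviation."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def round_half_up(x):
    """Round to the nearest integer, with .5 rounded up (an assumption)."""
    return math.floor(x + 0.5)


# Hypothetical ratings on a 1-5 scale.
actual = [4, 3, 5, 2]
predicted = [3.6, 3.4, 4.2, 3.0]
rounded = [round_half_up(p) for p in predicted]

mae_raw = mae(actual, predicted)
mae_rounded = mae(actual, rounded)
rmse_rounded = rmse(actual, rounded)
```

Rounding before evaluation changes the reported error (here the MAE on rounded predictions differs from the MAE on raw predictions), so compared methods need to be rounded the same way.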
Article
The performance of Recommender Systems (RSs) based on Collaborative filtering (CF) depends on similarities among users or items obtained from a user-item rating matrix. The conventional measures such as the Pearson correlation coefficient (PCC), cosine (COS), and Jaccard (JACC) provide varied and dissimilar values when the ratings between the users lie on the positive and negative sides of the rating scale. These measures are also not very effective when the user-item rating matrix is sparse. These problems are addressed by the Proximity-Impact-Popularity (PIP) similarity measure. Even though the PIP method provides an improved solution for this problem, the range of values for each component in PIP is very high. To address this issue and to improve the performance of a CF-based RS, a modified proximity-impact-popularity (MPIP) similarity measure is introduced. The expression is designed to get PIP values within the range of 0 to 1. A modified prediction expression is proposed to predict the available and unavailable ratings by combining user- and item-related components. The proposed method is tested by using various benchmark datasets. The size of the user-item sparse matrix varies to compare the performance of the methods in terms of mean absolute error, root mean squared error, precision, recall, and F1-measure. The performance of the proposed method is statistically tested through the Friedman and McNemar tests. The results obtained by using the evaluation criteria indicate that the proposed method provides a better solution than the conventional methods. The statistical analysis reveals that the proposed method provides minimum MAE and RMSE values. Similarly, it also provides a maximum F1-measure for all the sub-problems.
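
For reference, the three conventional measures named in this abstract can be sketched as below. This is an illustrative sketch, not the cited paper's code. It assumes that PCC and cosine are computed over co-rated items only, with PCC using the mean over those items (one common variant), and that Jaccard compares the sets of rated items. The two sample users are hypothetical.

```python
import math


def co_rated(a, b):
    """Items rated by both users."""
    return sorted(set(a) & set(b))


def pearson(a, b):
    """PCC over co-rated items, using each user's mean on those items."""
    items = co_rated(a, b)
    if len(items) < 2:
        return 0.0
    mean_a = sum(a[i] for i in items) / len(items)
    mean_b = sum(b[i] for i in items) / len(items)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in items)
    den = math.sqrt(sum((a[i] - mean_a) ** 2 for i in items)) * math.sqrt(
        sum((b[i] - mean_b) ** 2 for i in items)
    )
    return num / den if den else 0.0


def cosine(a, b):
    """Cosine of the angle between the two co-rated rating vectors."""
    items = co_rated(a, b)
    if not items:
        return 0.0
    num = sum(a[i] * b[i] for i in items)
    den = math.sqrt(sum(a[i] ** 2 for i in items)) * math.sqrt(
        sum(b[i] ** 2 for i in items)
    )
    return num / den if den else 0.0


def jaccard(a, b):
    """Share of rated items the two users have in common; ignores rating values."""
    union = set(a) | set(b)
    return len(set(a) & set(b)) / len(union) if union else 0.0


# Hypothetical users with only two co-rated items.
u = {"i1": 5, "i2": 1, "i3": 4}
v = {"i1": 4, "i2": 2, "i4": 3}
pcc_uv = pearson(u, v)
cos_uv = cosine(u, v)
jac_uv = jaccard(u, v)
```

With so few co-rated items the three measures give quite different values for the same pair of users, which illustrates why such measures become unreliable on sparse data.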
... However the CBF fails to differentiate a good item from a bad item when both items use similar words (synonymy problem) and to retrieve multimedia item due to its perception of the content [C+11a]. CF on the other hand recommends items by considering users' ratings of an item to find the match of rating patterns of some items involving other users with similar interests [MLR03,C+11b]. Unlike CBF, CF deals with users ratings given to items not content of the items but ignore to provide recommendation to new user without ratings, which is referred to cold start problem [KAU16]. ...
... HF employs the concept of CBF and the CF to provide recommendations. Among these systems, CF is the most popular and successful that provides personalised recommendation to users because it can be used to recommend any type of items to users such as books, movies, news, music, web pages and so forth [C+11b,PM12]. Therefore, this research focuses on CF recommender. ...
... Specifically, some studies focus on how to developed effective similarity functions to determine similar users/items and predict correct item rating based on the users preferences/rating [Z+05,YAL13,G+15,N+10,O+11,ZNC13,YW16]. While other studies to reduce data sparsity [C+11b,RAP16,X+13]. Despite these studies, several shortcomings are left unresolved, which include: The existing approach fails to recommend popular items due to similarity metric used that highly penalise popular items; The approach also recommends items that are not of interest to users due to its failure to solicit for user opinion of the items; and in addition, it increases data sparsity problem because unwanted recommended items are ignored. ...
Article
ABSTRACT: Collaborative filtering recommender systems being the most successful and widely used plays an important role in providing suggestions or recommendations to users for the items of interest. However, many of these systems recommend items to individual users based on ratings which may not be possible if they are not sufficient due to the following problems: it may lead to the prediction of uninterested popular items already known to the users because of the penalty function employed to punish those items, the sparsity of the user-item rating matrix increases making it difficult to provide accurate recommendations and also it ignores the users general preferences on the recommended items whether they are of interest to users or not. Therefore, many times uninterested items can be found in the recommended lists of an individual user. This will make user to lose interest in the recommendations if these uninterested predicted items always appear in the lists. In this paper, we proposed a collaborative filtering recommendations refinement framework that combines the solutions to these three identified problems. The framework incorporates a popularise similarity function to reduce the influence of popular items during recommendations, an algorithm to fill up the missing ratings of unwanted recommendations in the user-item rating matrix thereby reducing the sparsity problem and finally an algorithm to solicit for user feedback on the recommended items to minimise uninterested recommendations.
... This prompted many research efforts on recommender systems [8,2,14,16,18,3]. Among these systems include Collaborative Filtering (CF) which is the most popular and successful system that provides recommendations to users because it recommends any type of items to users such as books, movies, news, music, web pages and so forth [5,19]. ...
... Numerous studies have been conducted on CF systems to deal with the data sparsity problem for predicting correct item rating in order to provide good recommendations [15,13,23,25,5,7,20,22,17,1,12,4]. Despite these studies, several challenges are left unresolved, which include: The existing method [4] fails to pre-process the missing ratings of the new items, which increases the sparsity of the rating matrix. ...
... Chen et al. [5] introduced an algorithm to solve the sparsity problem by using an associative retrieval approach. The algorithm tries to capture the user's interest by utilising both direct and indirect similarity between users, computing the similarity matrix from the relative distance between users' ratings. ...
Article
Collaborative filtering recommender system suffers from data sparsity problem due to its reliance on numerical ratings to provide recommendations to users. This problem makes it difficult for the system to compute accurate similar neighbours for the items and provide good quality recommendations. Existing methods fail to pre-process the missing ratings of the new items and to predict cold items to the active users which lead to poor quality recommendations. In this work, a sparsity reduction method is presented to improve the quality of recommendations. The method utilises Bi-Separated clustering algorithm to cluster the ratings matrix simultaneously into users and items bi-clusters based on ratings classification. It also employs Bi-Mean Imputation algorithm to fill the missing ratings in the bi-clusters using the estimated means. The method then performs the traditional collaborative filtering process on the new rating matrix for cold items prediction. The experimental results demonstrated that compared to the existing method, the proposed BiSCBiMI improves density of the rating matrix by 5.75%, 10.73% and 7.35% as well as Mean Absolute Error (MAE) of the new items prediction for all of the considered datasets. The results indicated that, the proposed approaches are effective in reducing the data sparsity problem as well as items prediction, which in turn returns good quality recommendations.
... It performs satisfactorily only when there is adequate rating information [23]. Depending on the ratings exposed CF approach to the following issues [21], [24]: a. Sparsity Problem: One of the major problems that complicate the personalized item ranking process is data sparsity because items cannot be reliably linked to users [25], causing a limitation in the recommendation's effectiveness and limited coverage of recommendation space [26]. This problem occurs due to the following issues [26], [27]: ...
... It can be considered as a particular case of the sparsity problem in which most of the cells of the item-user interaction matrix contain null values [29]. The CF approach is not able to generate accurate recommendations for new users or items without sufficient existing data on them [24]. c. ...
Article
In recent times, the recommender systems (RSs) have considerable importance in academia, commercial activities, and industry. They are widely used in various domains such as shopping (Amazon), music (Pandora), movies (Netflix), travel (TripAdvisor), restaurant (Yelp), people (Facebook), and articles (TED). Most of the RSs approaches rely on a single-criterion rating (overall rating) as a primary source for the recommendation process. However, the overall rating is not enough to gain high accuracy of recommendations because the overall rating cannot express fine-grained analysis behind the user’s behavior. To solve this problem, multi-criteria recommender systems (MCRSs) have been developed to improve the accuracy of the RS performance. Additionally, a new source of information represented by the user-generated reviews is incorporated in the recommendation process because of the rich and numerous information included (i.e. review elements) related to the whole item or to a certain feature of the item or the user’s preferences. The valuable review elements are extracted using either text mining or sentiment analysis. MCRSs benefit from the review elements of the user-generated reviews in building their criteria forming multi-criteria review based recommender systems. The review elements improve the accuracy of the RS performance and mitigate most of the RS’s problems such as the cold start and sparsity. In this review, we focused on the multi-criteria review-based recommender system and explained the user reviews elements in detail and how these can be integrated into the RSs to help develop their criteria to enhance the RSs performance. Finally, based on the survey, we presented four future trends based on this type of RSs to support researchers who wish to pursue studies in this area.
... Many researchers have incorporated the use of graphs and exploited graph properties in order to generate recommendations. These graphs can be of varied types ranging from normal graphs (Lee and Lee 2015; Huang et al. 2002) to bipartite (Chen et al. 2011, 2013; Huang et al. 2004; Sawant 2013; Lopes et al. 2016) and even tripartite graphs (Shams and Haratizadeh 2017). Chen et al. (2013) proposed a graph based approach of recommendation. ...
... Finally resource allocation for the target user can be determined by combining the similarity measure along with initial resource allocation of the target user. Huang et al. (2004) and Chen et al. (2011) created a bipartite graph and used association retrieval by exploiting the path length property to generate recommendations. Sawant (2013) in his work used the weighted bipartite graph projection that defined a new similarity function based on the network properties of the dataset which was then used to generate recommendations. ...
Conference Paper
Collaborative filtering technique is widely adopted by researchers to generate quality recommendations. Constant efforts are being made by the researchers to generate quality recommendations thus satisfying and retaining the user. This work is an effort to generate quality recommendations by proposing a collaborative filtering approach. The proposed work models the sparse rating data as a weighted bipartite graph which represents data flexibly and exploits the graph properties to generate recommendations. In the proposed work user similarity is formulated as measure of entropy and cosine similarity which takes into account the relative difference between the ratings. Performance of the proposed approach is compared with the traditional collaborative filtering technique using Precision, Recall and F-Measure. Experiments were conducted on public and private datasets namely MovieLens and News dataset respectively. Results indicate that the performance of the proposed approach outperforms the traditional collaborative filtering approach.
... Here pages pi, pj, pk and pm are termed as frequent itemsets. Association rules of type mentioned in Eq. (18) are mined using those frequent item-sets from set S. Support and Confidence value for all frequent item-sets ''x'' that constitutes to those association rules mined are computed to eliminate rules that are not suitable for recommendation process [27]. The support and confidence value of each mined rule is computed using the following Eq. ...
... This is computed using the Eq. (27). Figure 2 clearly states that in all the algorithms with various samples of dataset tested, the optimum value for ''k'' lies within 20-25 with increased F1-measure. ...
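
The support and confidence values mentioned in these excerpts can be illustrated with the standard, unweighted definitions. The cited work uses Weighted Association Rule Mining, and its Eq. (18) and weighting scheme are not shown here, so this is a plain sketch under that simplification. The page identifiers and sessions are hypothetical.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t)) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Support of antecedent and consequent together, divided by the
    support of the antecedent alone."""
    support_antecedent = support(antecedent, transactions)
    if support_antecedent == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support_antecedent


# Hypothetical browsing sessions, each a set of visited pages.
sessions = [
    {"p1", "p2", "p3"},
    {"p1", "p2"},
    {"p2", "p3"},
    {"p1", "p3"},
]
support_p1_p2 = support({"p1", "p2"}, sessions)
confidence_p1_to_p2 = confidence({"p1"}, {"p2"}, sessions)
```

Rules whose support or confidence falls below chosen thresholds would then be discarded before recommendation. The thresholds themselves are not given in the excerpt.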
Article
Recommendation system predicts and suggests those web pages that are likely to be visited by web users. The usage of recommendation system reduces delay in search and helps users to achieve the desired purpose in web search. Personalization in recommender system creates user profiles by analyzing the user’s interest through previous search history and patterns. The web pages that are recommended will be predicted based on the user profile. In this paper, the idea of Case-Based Reasoning has been adapted suitable for web page recommendation as an extension of Collaborative filtering. Users’ profile will be generated comprising of eight characteristic features and two content-based features generated using web access search logs. The collaboration among the k-NN user profile is identified based on Case-Based Reasoning. To enhance the accuracy Weighted Association Rule Mining is applied, which generates rules among the user profiles and optimally predicts the web pages suitable for the given search keyword by a user. To verify the effectiveness of the proposed idea, experiments were carried out with multiple datasets covering 2370 web pages accessed by 77 different users. Experiment result shows that the proposed algorithm outperforms existing methods with increased accuracy and minimum miss-out and fall-out rates.
... When more and more Web services are published online, the chance of selection possibility is decreased. Because of that, users may not get a chance to rate some items and if available data are insufficient for identifying similar users then the sparsity problem occurs [3]. The cold-start problem occurs [4] when a new user or new item has just entered the system and no information is available about them. ...
... The sparsity problem: Chen et al. [3] used association retrieval technology to manage the sparsity problem and proposed a new CF algorithm to increase the recommendation performance. They explored the transitive associations based on the user's feedback data using association retrieval technology. ...
... Rupasingha and Paik (2019) address the sparsity problem using specificity-aware ontology-based clustering, which improves recommendation performance. Chen (2011) addressed the sparsity problem using association retrieval technology and improved the recommendation result. Li et al. (2017) used a simplified method that improved the performance and reduced the sparsity of the recommendation. ...
... PCC was then used to improve the recommendation result. The association retrieval method (Chen, 2011) used the users' feedback data, calculated the relative distance between users' ratings and the similarity matrix, and then combined these to alleviate the sparsity problem. The binary method (Li et al., 2017) used a simplified similarity measure (SSM) to address the sparsity problem. ...
Conference Paper
With the development of the world wide web (WWW), the number of people who can deal with their work through the Internet, is increasing and it helps to do their tasks effectively and efficiently. In this case, a very important task is fulfilled by Web services. But the main problem is users struggling to select their favourite Web services quickly and accurately among available Web services. Web service recommendations help to solve this problem successfully. In this paper, we used collaborative filtering (CF)-based recommendation technique, but it suffers from the data sparsity and cold-start problem. Therefore, we applied an ontology-based clustering approach to overcome these problems. It effectively increased the data density by assuming the missing user preferences comparing the history of user favoured domains. Then, user ratings are predicted based on the model-based approach such as singular value decomposition (SVD). The result showed that the clustering approach can overcome the CF problems effectively and the SVD method can predict user ratings with lower prediction error compared with existing approaches.
... Similarity calculation in such scenario is a tough job for the recommender system. Few researchers suggested the usage of trust network (Chen, 2011) i.e. association values to explore the relation among the users present in the system. Dimensionality reduction techniques can also be used to trim the attributes which has high sparsity percentage (Huang, 2004). ...
Article
In the past few decades recommender system has reshaped the way of information filtering between websites and the users. It helps in identifying user interest and generates product suggestions for the active users. This paper presents an enlightening analysis of various recommender system such as content-based, collaborative-based and hybrid recommendation techniques along with few optimization models that has been applied to improvise the parameters being considered by the aforementioned techniques. We explored 125 articles published from 1992 to 2019 in order to discuss the problems associated with the existing models. Various advantages and disadvantages of each recommendation model including the input methods has been elaborated. Critical review on research problems based on the explored techniques and future directions has also been covered.
... Papagelis et al. (2005) proposed a method to mitigate the problem of sparsity by using trust inference that finds an association between users in the context of the social network. Chen et al. (2011) suggested the use of association retrieval technology to explore the transitive associations between the users. They proposed direct and indirect ...
Article
Recommender system plays a supporting role in the process of information filtering. It plays a remarkable role in large-scale online shopping and product suggestions. This paper discusses various trends of recommender system such as content-based, collaborative-based and hybrid personalization techniques proposed for recommendations. It provides better insight and future directions of recommender systems. We have reviewed 142 articles from several journals and conference papers which were published from 1992 to 2019. We have used statistical descriptions to show the progression and drawbacks of the various notions of recommendation approaches. We have also discussed growing research demand in the area of recommender systems as well as the pros and cons of the currently available classifications. We have created a classification of recommender techniques, including various user inputs, knowledge from the database, the ways in which the recommendation will be presented to the user and the technologies which are used to create the recommendations.
... Recommender systems are major applications that support digital marketing via recommending products reflecting user preference. Recommenders for fashion products have been sought after to improve customer satisfaction [9-11,15,16,53-58]. The fashion product recommenders usually use interaction logs between users and products and visual features [8,15,16]. ...
Article
Many companies operate e-commerce websites to sell fashion products. Some customers want to buy products with intention of sustainability and therefore the companies need to suggest appropriate fashion products to those customers. Recommender systems are key applications in these sustainable digital marketing strategies and high performance is the most necessary factor. This research aims to improve recommendation systems’ performance by considering item session and attribute session information. We suggest the Item Session-Based Recommender (ISBR) and the Attribute Session-Based Recommenders (ASBRs) that use item and attribute session data independently, and then we suggest the Feature-Weighted Session-Based Recommenders (FWSBRs) that combine multiple ASBRs with various feature weighting schemes. Our experimental results show that FWSBR with chi-square feature weighting scheme outperforms ISBR, ASBRs, and Collaborative Filtering Recommender (CFR). In addition, it is notable that FWSBRs overcome the cold-start item problem, one significant limitation of CFR and ISBR, without losing performance.
... In effect, sparsity is a major issue limiting the quality of recommendations. Of course, the similarity between two users is zero in collaborative recommendation system (Chen et al., 2011). Another drawback of the current recommendation systems is shilling attacks. ...
Article
The recent rapid growth of the Internet content has led to building recommendation systems that guide users to their needs through an information retrieving process. An expert recommendation system is an emerging area that attempts to detect the most knowledgeable people in some specific topics. This detection is based on both the extracted information from peoples’ activities and the content of the documents concerned with them. Moreover, an expert recommendation system takes a user topic or query and then provides a list of people sorted by the degree of their relevant expertise with the given topic or query. These systems can be modeled by information retrieval approaches, along with search engines or a combination of natural language processing systems. The following study provides a critical overview of existing expert recommendation systems and their advantages and disadvantages, considering as well different techniques employed by them.
... In [14], Y. Chen et al. used Association Retrieval (AR) technology to manage the sparsity problem: they compute the similarity matrix from the relative distance between users' ratings and use the AR technology to realize a new collaborative filtering approach. The work in [15] solves the sparsity problem in a movie recommendation system by using k-means clustering. ...
... Recall is the ratio of number of relevant web pages retrieved to the total number of web pages. It is expressed as a percentage and calculated as shown in (15). ...
... YiBo Chen et al. [76] proposed to compute the similarity matrix based on relative distance between user ratings to solve the sparsity problem in recommender systems. ...
Thesis
Web growth, especially in social networks, is continuously increasing every day. Multiplicity of products offered and web pages has made picking up relevant items a tedious job. On the other hand, different tastes and behaviors of users is creating the probability to find a similar user among a large group of users difficult. As a result, automated software systems have difficulty to discover what is interesting to users. We have proposed a new approach to adapt to this flow. We will exploit domain knowledge of training data set to create a summary matrix. The summary matrix consists of new and few columns according to the attribute values of the selected feature. We fill the summary matrix with the average ratings based on the number of times that the attribute values appear in the user's profile for rated items. We use the summary matrix in two hybrid recommender systems. In our approach, we use meta-level technique which is one of the pipelined hybridization techniques. The proposed approach will reduce the effects of sparsity, cold start, and scalability which are common problems with the collaborative recommender systems. Also, the proposed approach will improve the recommendation accuracy when there is comparison with the Collaborative Filtering Pearson Correlation approach and it will be faster.
... Sparsity problem is one of the major limitations in CF approaches, in which the user-item matrix is extremely sparse in that it contains a small percentage of non-zero elements (Nikolakopoulos et al., 2015). This problem will also lead to low predictive accuracy and low precision (Anand and Bharadwaj, 2011;Chen et al., 2011;Huang et al., 2004;Lika et al., 2014;Papagelis et al., 2005). Guo et al. (2014) incorporate trust, as a new source of information, to alleviate this problem. ...
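
The notion of sparsity described in this excerpt (a user-item matrix with only a small percentage of non-zero elements) can be quantified as below. This is a minimal sketch that assumes an unrated cell is stored as 0. The function names and the sample matrix are hypothetical.

```python
def density(matrix):
    """Fraction of cells in the user-item matrix that hold a rating (non-zero)."""
    total = sum(len(row) for row in matrix)
    rated = sum(1 for row in matrix for r in row if r != 0)
    return rated / total if total else 0.0


def sparsity(matrix):
    """Fraction of cells that are empty."""
    return 1.0 - density(matrix)


# Hypothetical 3 users x 4 items matrix; 0 means "not rated".
user_item = [
    [5, 0, 0, 0],
    [0, 3, 0, 0],
    [0, 0, 0, 4],
]
matrix_density = density(user_item)
matrix_sparsity = sparsity(user_item)
```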
Article
The importance of recommendation systems for business applications has led to extensive research efforts to improve the recommendations accuracy as well as to reduce the sparsity problem. Despite the success of both collaborative filtering and multi-criteria approaches, they still need to be further optimized to address the stated problems. In this paper, we propose a new hybrid method based on enhanced fuzzy multi-criteria collaborative filtering which incorporates demographic information and an item-based ontological semantic filtering approach for movie recommendation purposes. We use an adaptive neuro-fuzzy inference system to discover the relationship between each criterion and the overall rating. A fusion of fuzzy cosine and Jaccard similarities is further adopted to calculate the total similarity between users/movies with respect to the effect of co-rated item set cardinality on the reliability of similarity measures. To increase the robustness and reliability of the final similarity measure, especially in the case of cold start users, a convex combination of both user and movie based similarities is used, in which the convex weightings are determined through the gradient descent algorithm to ensure a minimum prediction error. Experimental results demonstrate the efficiency of the proposed method in reducing the sparsity problem and improving the prediction accuracy.
... CF has been used fairly successfully in various domains. However, it has main limitations such as the data sparsity [3][4][5] and cold-start [6][7][8][9] problems. Data sparsity arises when the number of ratings obtained from users is very small compared to the number of ratings that must be predicted, and so it becomes difficult to find a significant overlap between the items rated by two users [10]. ...
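The two quantities behind this description of data sparsity, the share of missing ratings and the overlap between two users' rated items, can be computed as in the sketch below. The dictionary layout and the function names `sparsity` and `co_rated` are illustrative assumptions.

```python
def sparsity(ratings, n_items):
    """Fraction of empty cells in the user-item matrix.

    ratings: {user: {item: rating}}; n_items: size of the item catalogue.
    """
    n_users = len(ratings)
    observed = sum(len(user_ratings) for user_ratings in ratings.values())
    return 1.0 - observed / (n_users * n_items)


def co_rated(ratings, u, v):
    """Items rated by both users; an empty set means no usable overlap."""
    return set(ratings[u]) & set(ratings[v])


ratings = {
    "a": {"i1": 5, "i2": 3},
    "b": {"i2": 4},
    "c": {"i3": 1},
}
print(sparsity(ratings, n_items=4))  # 4 ratings in 12 cells
print(co_rated(ratings, "a", "b"), co_rated(ratings, "a", "c"))
```

When the co-rated set of two users is empty, similarity measures computed on co-rated items are undefined for that pair, which is the practical consequence of sparsity for neighborhood-based CF.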
Article
Collaborative Filtering (CF) is the most popular recommendation technique; it uses the preferences of users in a community to make personal recommendations for other users. Despite its popularity and success, CF suffers from the data sparsity and cold-start problems. To alleviate these issues, in recent years, there has been an upsurge of interest in exploiting trust information to improve the performance of CF. In general, trust has a number of distinct properties such as asymmetry, transitivity, dynamicity and context-dependency. However, conventional trust-based CF systems do not consider all of these properties when computing trust. In particular, the context-dependency property has received less attention in existing approaches. Considering all these properties leads to more accurate recommendations, since the quality of the inferred trust is improved. In this paper, we propose a novel trust-based approach, called Semantic-enhanced Trust based Ant Recommender System (STARS), which satisfies all the properties mentioned above. Using ant colony optimization, the proposed system performs a depth-first search for the optimal trust paths in the trust network and selects the best neighbors of an active user to provide better recommendations. To consider the context-dependency property, trust inference in STARS depends on the semantic descriptions of items. Incorporating both global and local trust in CF-based recommender systems, in addition to computing trust based on the semantic features of items, allows STARS to alleviate the data sparsity, cold-start and “multiple-interests and multiple-content” problems of CF. Experimental results on real-world data sets show that STARS outperforms its counterparts in terms of prediction accuracy and recommendation quality and can overcome the above problems. [Full-text view-only version of this paper is available at: http://rdcu.be/nqq4]
... The user data available on a single online social network is not sufficient to understand users' interests and capture their changing preferences. This is when the new-user, cold-start and sparsity problems arise [7]. The YouTube video recommendation system suffers from several problems, such as recommendation to a new user, cold start and sparsity. ...
... So, the consumer-product interaction matrix can be extremely sparse. This problem is commonly referred to as the sparsity problem [19], [20]. ...
Article
In major e-commerce recommendation systems, the number of users and items is very large and the available data are insufficient for identifying similar users. As a result, recommender systems cannot use users' opinions to make suggestions to other users, and the quality of the recommendations may decline. The main objective of our research is to provide high-quality recommendations even when sufficient data are unavailable. In this article we present a model for this condition that combines recommendation methods (e.g., Collaborative Filtering (CF) and Content Based Filtering (CBF)) with other methods such as clustering and association rules. The model consists of four phases. In the first phase, tourists are clustered based on their location, and the target tourist's cluster is sent to the next phase. In the second phase, a two-level graph is built based on the similarity between the tourists' interests and the similarity of the tours. From this graph, transitive relations are discovered among the tourists, and the k items with the highest relationship weights are suggested to the target tourist. According to the experiments, the standard F-measure indicates that the quality of the recommendations of this model is higher than that of traditional approaches, which cannot discover transitive relationships.
... As shown by Chen et al. [8], the usability of collaborative filtering approaches is confined by the sparsity problem, since a user does not have enough co-rated TV programs with other users with similar preferences. New shows might also be problematic for collaborative filtering in TV recommender systems, because if there are no ratings for new TV content, it may not be recommended. ...
Conference Paper
This paper surveys the landscape of current personalized TV recommender systems and introduces challenges in context-awareness and viewer behavior prediction applied to social TV recommender systems. Real data related to viewers' behaviors and the social context were collected in real time through a social TV platform. We highlight the future benefits of analyzing viewer behavior and exploiting the social influence on viewers' preferences to improve recommendation with respect to changes in TV content.
... Place (location); F6: No joint problem; F7: Cardiac problem; F8: Muscle mass; F9: Over-weight; F10: Temperature; F11: Time of run; F12: Heart rate; F13: Pulse rate; F14: Heart rate ejection fraction; F15: Respiratory ... Tables 3 and 4, the performance of the athlete relies on gender selection, muscle mass, time to run, heart rate, pulse rate, speed, precision, and endurance. However, the other features are not among the most influential on an athlete's performance during the marathon race. ...
Article
Recent research has shown the robustness of recommendation systems for athletics that use an automated AI system, which enhances the longevity of the system. With this model, automated recommendation helps to improve the quality of athletes during training or other processes. Moreover, only domain experts can understand the rationale of the recommender system, where the analyzed data is stored in the cloud. This research proposes a machine learning–based solution for an athletic dataset that automatically predicts the state of the individual from features like age, gender, calories, temperature, pressure, heart rate, pulse rate, sugar level, respiratory conditions, and state of the body. This research concentrates on modeling a framework for implementing machine learning approaches with an optimization problem. Here, a novel extreme multi-gradient evolutionary computation (EMGEC) with improved grey wolf optimization (IGWO) is proposed to achieve exploration and exploitation during the selection of features. The dataset, recorded from athletes during marathon running, is collected from online resources, and the feature subsets are extracted from it. The features of these data are analyzed and encoded before being placed in the cloud environment. The performance of the proposed machine learning approach is compared with other approaches, and it provides better prediction accuracy, precision, recall, and F-measure. The accuracy of the proposed model is 83.13%, precision is 91.1%, and recall is 91.3%, which is substantially higher than that of other approaches. The proposed model shows a better trade-off than prevailing approaches like SVM, RF, k-NN, and logistic regression.
... When determining what items to present to a user, these systems necessarily pare down the complete set of possible items from the millions to a small handful. The recommendation system problem setting is a high sparsity problem where the recommending system has very little interaction data between all the available users and all the available items [6][7][8][9][10]. Recommendation systems can also suffer from the long-tail phenomenon-there is an outsized amount of user interaction data for a tiny subset of the available item set and an extremely large number of items which effectively have no interaction data [11]. ...
Preprint
We evaluate two popular local explainability techniques, LIME and SHAP, on a movie recommendation task. We discover that the two methods behave very differently depending on the sparsity of the data set. LIME does better than SHAP in dense segments of the data set and SHAP does better in sparse segments. We trace this difference to the differing bias-variance characteristics of the underlying estimators of LIME and SHAP. We find that SHAP exhibits lower variance in sparse segments of the data compared to LIME. We attribute this lower variance to the completeness constraint property inherent in SHAP and missing in LIME. This constraint acts as a regularizer and therefore increases the bias of the SHAP estimator but decreases its variance, leading to a favorable bias-variance trade-off, especially in high-sparsity data settings. With this insight, we introduce the same constraint into LIME and formulate a novel local explainability framework called Completeness-Constrained LIME (CLIMB) that is superior to LIME and much faster than SHAP.
... Wang et al. provide a novel approach based on trust, which helps solve the cold-start problem in large MSN [13]. A common problem with CF is the sparseness of the data set, and several approaches have been proposed to solve it [14]. ...
... To determine this user profile, it is necessary to collect certain information following two approaches (Bobadilla et al., 2013; Ricci et al., 2011): (Leblay, 2016; Lemdani, 2016; Tadlaoui et al., 2015; Haydar, 2014; Ricci et al., 2011). The main categories are: (Chen et al., 2011; Guo, 2012); ...
Thesis
The evolution of information technologies has affected the field of education by introducing digital tools into teaching processes, which makes it possible to support and, in particular, to personalize learning. Much research has been conducted on personalization and the use of recommender systems, drawing on approaches applied in e-commerce. Our research falls within this context and aims to test the impact of hybridizing recommendation approaches by combining content-based filtering and collaborative filtering. These two methods rely, respectively, on the individual and social characteristics of the learner. The broadly convincing results of our study and of the two experiments that accompanied it allowed us to propose several recommendations and an application scenario in the form of a procedure based on a blended learning mode combining face-to-face and distance learning. This procedure personalizes learning through a layered architecture: services, recommendations, and data. The various recommendations were contextualized for higher education in general, and for the Moroccan private and public education system in particular. The experiment on the value of hybrid recommender systems in online teaching took place in the health context of COVID-19.
... Although research questions about some fundamental challenges of collaborative filtering recommender systems, such as data sparsity, cold-start, scalability and so on, are increasing, a growing body of research aims to offer solutions to these problems (Chen et al., 2011). Broadly speaking, the collaborative filtering (CF) technique can be divided into two main branches: memory-based and model-based. ...
Conference Paper
The collaborative filtering (CF) algorithm is used to predict user preferences in item selection based on known user ratings of items. As one of the most valuable algorithms used in recommendation systems, CF has proven effective for solving the information excess problem. The typical similarity calculations are often inefficient and suffer from data sparsity and poor prediction quality in the learning process, leading to inadequate learning. To solve this problem, the collaborative filtering recommendation algorithm is proposed. Traditional algorithms focus only on user ratings and do not take into account changes in user interest or the credibility of the rating data, which seriously affects the quality of the system's recommendations. In this paper, the different types of CF recommender systems and their respective techniques are discussed extensively. Keywords - recommendation systems; collaborative filtering; memory-based; model-based; similarity measure
... Hence, older and popular items dominate the recommendation process, which is not desirable for news recommendations. Second, there is the sparsity problem (Chen et al., 2011), which occurs when there is insufficient overlap between the consumption patterns of users. As the relevance of news stories sharply decreases over time, it is not unreasonable to assume little overlap between new and old users. ...
Thesis
Digitalization and the emergence of large amounts of media content has pushed organizations towards the use of algorithms to (semi-)automatically determine how information should be filtered, ranked and sorted. Especially in the news environment, there is an evolution ongoing in which news organizations increasingly rely on recommendation algorithms to personalize the news offer and tailor it to the users’ preferences. Although there are several commercial benefits related to the use of recommendation algorithms, several parties such as scholars and policy makers are concerned about how these technologies are used and designed. They believe that recommendation algorithms are a risk to citizens because they are trained to focus on similarities, between articles and people, rather than on differences. As such, they may provide more of the same news and expose citizens to a lesser extent to the diversity that is present in the news supply. In addition, they could also reinforce the self-selection process of citizens, which in turn also poses a risk to citizens’ consumed diversity. However, the idea of diversity is in several normative theories such as the public sphere perceived as an essential prerequisite to inform citizens properly and ensure the functioning of strong democracies. Academics therefore recommend exploring alternative ideas that can mitigate these risks and promote the idea of news diversity. In this dissertation, we take steps in that direction by examining how news organizations can incorporate diversity as a criterion in the development of recommendation algorithms and, by doing so, stimulate users to consume a diverse range of news articles. To do so, we make use of three research themes that give structure and meaning to this dissertation. 
These research themes are (1) news diversity as an alternative recommendation value, (2) audiences’ perceptions towards diversity-based recommendation algorithms and (3) audiences’ consumption behavior when using diversity-based recommendation algorithms. The rationale for these research themes lies in the idea to approach news algorithms from a socio-technical perspective, taking into account both the technical-conceptual aspects and social aspects of algorithms. In the first theme, we focus on these technical-conceptual aspects by conducting a systematic literature review and an interdisciplinary study on the meaning of news diversity and the different building blocks of a diversity-based algorithm. In the second and third theme, we focus on the social aspects by conducting a survey study and experimental study in which we investigate how news consumers evaluate and interact with news algorithms. Based on these studies, we present in this dissertation for each of these research themes several interesting insights that may be relevant to different stakeholders such as scholars, policy makers, news media and even citizens. A first important insight that emerges from our systematic literature review is that there is much diversity in the conceptualizations of the concept news diversity. For example, in our study we found that communication scholars have used more than 43 diversity dimensions and 26 different conceptualizations to shape the concept news diversity. In addition, researchers typically focus on dimensions that are easier to measure, such as the location of the news topic or the length of an article. Dimensions that are harder to measure, such as objectivity or controversy, are generally less chosen as objects of study. Normative assumptions about news diversity are also often neglected, making it difficult to assess which ideal is dominant in the academic literature. 
These results are especially valuable for academia in which the concept is frequently used to assess the news landscape and where a detailed dissection of the concept was lacking. In addition, news organizations can also use these insights to reflect on their own activities and/or the development of a diversity-based algorithm. A second important insight was found in our interdisciplinary study in which it became clear that the development of a diversity-based algorithm raises pertinent questions for a broad range of disciplines. These questions were not only present in the field of computer science where most recommendation algorithms are developed, but also in fields such as law, computational linguistics, and communication sciences. For computational linguistics and computer science, these questions are primarily situated in the technical elaboration that determines the accuracy and relevance of recommendation algorithms. For example, relevant content dimensions must be translated into content extraction algorithms, which is not a solved issue. The design of the recommendation algorithm must also be carefully considered, as the right balance has to be struck between relevance and diversity. For law and communication sciences, in turn, the questions are more fundamental in nature. Questions such as 'which diversity dimensions are relevant to extract' or 'what is the optimal diversity outcome' are important questions, to which no unambiguous answers currently exist. Our study presents a concise overview of these discussions and also clarifies the challenges that arise with each of these topics. For academics, these challenges are particularly relevant in order to shape future research. At the same time, this study also shows that an interdisciplinary approach is required for the development of diversity-based algorithms and can even help make the development process more efficient and structured.
A third important insight comes from the survey study in which we shed light on the perceptions that users have towards the different news selection mechanisms that underlie news algorithms. The results of this study show that the audience has a greater preference for news selection principles belonging to the ‘content-based similarity’ news algorithm than for those belonging to the ‘collaborative similarity’ or ‘content-based diversity’ news algorithm. This result shows that when the audience has the choice to determine how they want to receive the news, they tend to prefer only news articles that interest them. To address the risks that are involved with this tendency, we forward a new approach, called ‘personalized diversity’. In this approach, the ultimate goal of the diversity algorithm remains the same, but it takes advantage of the personalization techniques that underlie ‘similarity-based’ news algorithms. This approach is particularly valuable for news organizations that want to implement the idea of diversity in existing or future recommendation activities. At the same time, it also shows that news selection principles are not mutually exclusive and are thus quite compatible with each other. Finally, in our experimental study, we found interesting insights about how diversity-based algorithms can affect people’s news exposure behavior and perceptions. In particular, our results show that diversity-based algorithms can steer users towards more diverse exposure behavior, with the personalized diversity-based news recommender being most effective. Moreover, we found that people using a diversity-based news recommender did not think they read more diversely, pointing towards a so-called diversity paradox. We forward several explanations for this paradox, but mainly point in the direction of transparency and the lack thereof in recommendation systems.
This result is especially valuable for policy makers, to advance discussions on the importance of transparency in recommendation systems and to take further policy actions on this issue.
... Collaborative filtering (CF) techniques [1,2] are widely used in e-commerce websites because of their efficiency. CF recommendation algorithms use only user-item ratings to mine users' preferences and make recommendations for the active user; as a result, CF algorithms encounter the data sparsity problem [3,4] and the cold start problem [5,6]. The former arises because the ratings collected by the recommender system are few, so it is difficult to model user preferences. ...
Article
Relationships between users in social networks have been widely used to improve recommender systems. However, actual social relationships are always sparse, which sometimes greatly harms the performance of recommender systems. In fact, a user may interact with others to whom he/she is not directly connected, and thus have an impact on these users. To mine abundant information for social recommendation and alleviate the problem of data sparsity, we study the process of trust propagation and propose a novel recommendation algorithm that incorporates multiple information sources into matrix factorization. We first explore heterogeneous influence strength for each pair of linked users and mine indirect trust between users by using a trust propagation and aggregation strategy in social networks. Then, explicit and implicit information of user trust and ratings are incorporated into matrix factorization, and the influence of indirect trust is considered in the recommendation process. Experimental results show that the proposed model achieves better performance than some state-of-the-art recommendation models in terms of accuracy and relieves the cold-start problem.
... Hence, older and popular items dominate the recommendation process, which is not desired for news recommendations. Second, there is the sparsity problem [15], which occurs when there is insufficient overlap between the consumption patterns of users. As the relevance of news stories sharply decreases over time, it is not unreasonable to assume little overlap between new and old users. ...
Chapter
Concerns about selective exposure and filter bubbles in the digital news environment trigger questions regarding how news recommender systems can become more citizen-oriented and facilitate – rather than limit – normative aims of journalism. Accordingly, this chapter presents building blocks for the construction of such a news algorithm as they are being developed by the Ghent University interdisciplinary research project #NewsDNA, of which the primary aim is to actually build, evaluate and test a diversity-enhancing news recommender. As such, the deployment of artificial intelligence could support the media in providing people with information and stimulating public debate, rather than undermine their role in that respect. To do so, it combines insights from computer sciences (news recommender systems), law (right to receive information), communication sciences (conceptualisations of news diversity), and computational linguistics (automated content extraction from text). To gather feedback from scholars of different backgrounds, this research has been presented and discussed during the 2019 IFIP summer school workshop on ‘co-designing a personalised news diversity algorithmic model based on news consumers’ agency and fine-grained content modelling’. This contribution also reflects the results of that dialogue.
... Therefore, collaborative filtering methods are not able to recommend products/services to new users due to the lack of user ratings. This problem is often known as the "cold start problem" in recommender systems [6], [7], [16]. ...
Article
With the dawn of the information age, the number of choices that each person has to make each day has seen explosive growth. Every minute, users are bombarded with so many different options for everything they do, from buying clothes to watching TV, that they live in a state of constant information overload, feeling stressed out by this imposed free will rather than liberated by it. One of the ways in which users have now become used to mitigating this constant barrage of information is the use of Recommender Systems. These systems are software components that use some form of rating provided by a user to generate recommendations for items that the user may be interested in. Unfortunately, recommender systems face a major problem called the “Cold – Start problem”, which essentially means that a recommender system in an online setting has no information about the user when the user first signs up. Many approaches have been proposed to solve this, but the one this paper takes is to use information from social networks, specifically Twitter, to get this initial information. A method that implicitly infers user interests with good accuracy is proposed here, and this understanding of interests can later be updated by observing user actions as they interact with the system. We leverage the power of the categorical information stored in the Wikipedia database to assign relative weights to entities that a user follows on Twitter. The proposed approach, which creates the rating vector for each user by mining the user's Twitter account, is evaluated using the RMSE and MAE metrics.
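For reference, the two evaluation metrics named at the end of this abstract have standard definitions, sketched below. The helper names `mae` and `rmse` and the sample values are illustrative, not taken from the paper.

```python
import math


def mae(predicted, actual):
    """Mean absolute error between paired predictions and true ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def rmse(predicted, actual):
    """Root mean squared error; penalizes large errors more than MAE."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(predicted))


predicted = [4, 3, 5]
actual = [5, 3, 3]
print(mae(predicted, actual), rmse(predicted, actual))
```

Lower values are better for both metrics; RMSE is never smaller than MAE on the same data.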
... This technique does not involve an item cold-start problem. Besides, due to the sparsity problem [11], the CF approach does not work very well. Many research papers combine these two techniques to build a hybrid filtering system. ...
Article
With the growth of information on the web and of online users' activities, finding the appropriate information at the right time and discovering items that customers are interested in among the available choices have become difficult challenges. Recommender Systems have been introduced to overcome this problem by offering potentially relevant elements and providing users with personalized recommendations. Although several social-based recommendation techniques have been proposed in the literature, applying community detection techniques based on users' previous transactions can improve the precision of the algorithms and better handle the data sparsity and cold-start problems. In this paper, we propose an improved association rule mining process based on Rule Power Factor to generate potent rules and enhance the accuracy of recommendation. The performance of the algorithm is analyzed on transactional datasets of MovieLens. The experimental results show that our method outperforms several state-of-the-art recommendation methods with increased precision, accuracy and minimum Mean Average Error values.
... In [37] the authors proposed a transfer learning approach that utilizes different dimensions of basic human values to reduce the effect of sparsity and increase the efficiency of the recommender algorithm. In [11] the authors introduced the concept of direct and indirect similarity and computed the similarity matrix from the relative distance between users' ratings, using association retrieval technology to find transitive associations based on the users' feedback data. The effectiveness of the proposed approach was measured by conducting experiments with the 'MovieLens' dataset, and the results confirm good coverage and recommendation quality. ...
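The idea of direct and indirect similarity summarized in this snippet can be sketched as follows. The snippet does not give the formulas, so both are assumptions: direct similarity is taken as one minus the mean absolute rating difference on co-rated items, scaled by the rating range, and indirect similarity as the strongest two-step path through an intermediate user. The function names and the 1 to 5 rating scale are hypothetical.

```python
def direct_similarity(ratings, u, v, r_min=1, r_max=5):
    """Similarity from the relative distance between co-rated items.

    Returns None when the two users share no rated item.
    """
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return None
    distance = sum(abs(ratings[u][i] - ratings[v][i]) for i in common) / len(common)
    return 1.0 - distance / (r_max - r_min)


def indirect_similarity(ratings, u, v):
    """Best transitive association u -> w -> v over intermediate users w."""
    best = 0.0
    for w in ratings:
        if w in (u, v):
            continue
        s_uw = direct_similarity(ratings, u, w)
        s_wv = direct_similarity(ratings, w, v)
        if s_uw is not None and s_wv is not None:
            best = max(best, s_uw * s_wv)
    return best


ratings = {
    "a": {"i1": 5},
    "b": {"i1": 4, "i2": 3},
    "c": {"i2": 3},
}
print(direct_similarity(ratings, "a", "c"))    # no co-rated item
print(indirect_similarity(ratings, "a", "c"))  # linked through user "b"
```

In this toy data, users "a" and "c" share no rated item, so no direct similarity exists, yet the path through "b" still yields a usable score. This is how transitive associations can fill cells of the similarity matrix that sparsity leaves empty.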
Article
Recommender systems have been successfully used in a wide variety of domains. They predict, rank and recommend items to users. The prediction is based on users' preferences in the form of ratings over a set of items stored in the 'user-item rating matrix'. However, this matrix often does not have sufficient ratings to make good-quality recommendations, as most users do not provide ratings for the items that they have consumed. This situation leads to 'sparsity', from which a majority of traditional recommender systems suffer. Although existing works on recommender systems have attempted to address 'sparsity', most of them are unable to take advantage of 'resource description framework' and 'Jena engine' to obtain additional preferences of learners implicitly from the Moodle server. Hence, in this paper, we propose an enhanced course recommendation framework to enrich the 'sparse rating matrix' for improving the accuracy of recommendations. Experimental results on the 'enriched item rating matrix' show the effectiveness of the proposed approach and the importance of these semantic tools.
... Fundamentally, the recommendation problem is to evaluate products on the user's behalf, including books, movies, CDs, web pages and so on; it is a process from the known to the unknown [2]. ...
... In that case, each consumer is exemplified by an integer feature matrix, with 2 million elements, and the value of each component corresponds to the rating given by the consumer to a specific book. This matrix is called the consumer-product interaction matrix [28]. In most large-scale applications, both the numbers of consumers and products are enormous. ...
Article
Recommender systems are widely used to provide users with recommendations based on their preferences. With the ever-growing volume of information online, recommender systems have been a useful tool to overcome information overload. The utilization of recommender systems cannot be overstated, given its potential influence to ameliorate many over-choice challenges. There are many types of recommendation systems with different methodologies and concepts. Various applications have adopted recommendation systems, including e-commerce, healthcare, transportation, agriculture, and media. This paper provides the current landscape of recommender systems research and identifies directions in the field in various applications. This article provides an overview of the current state of the art in recommendation systems, their types, challenges, limitations, and business adoptions. To assess the quality of a recommendation system, qualitative evaluation metrics are discussed in the paper.
Preprint
Recommender systems have proven to be crucial in many fields and are widely used in various domains. Most conventional recommender systems rely on the numeric rating given by a user to reflect his or her opinion about a consumed item; however, these ratings are not available in many domains. As a result, a new source of information, user-generated reviews, is incorporated in the recommendation process to compensate for the lack of these ratings. The reviews contain rich and abundant information related to the whole item or to a specific feature, which can be extracted using sentiment analysis. This paper gives a comprehensive overview to help researchers who aim to work with recommender systems and sentiment analysis. It includes a background of the recommender system concept, including the phases, approaches, and performance metrics used in recommender systems. Then, it discusses the sentiment analysis concept and highlights its main points, including levels and approaches, with a focus on aspect-based sentiment analysis.
Article
Recommender systems now play an important role in suggesting products and services that match users' varied requirements. The most popular method for building a recommender system is collaborative filtering, which searches for other users in the system who are interested in the same or similar items; the users need not know each other. The system then suggests to the current user items chosen by those other users. However, this technique does not work well with scarce data, which is known as the sparsity problem. We therefore propose to modify collaborative filtering by using frequent itemsets to impute the missing values. According to the experimental results, the proposed method fills in the missing values properly and improves recommendation accuracy, reaching an MAE of 0.55 with a neighborhood size of 30.
Conference Paper
The emergence of personalization in new-generation e-commerce and the shift of power towards consumers call for the incorporation of recommendation systems. Implementing one requires innovative models of user-web interaction and a novel recommendation strategy. This research work looks for an innovative approach to online mobile trade that integrates expert knowledge with product specifications to develop a hybrid personalized recommendation system. The uniqueness of this work lies in its advanced user-web interaction strategies, need-based mobile selection criteria, and the added gain of expert knowledge. The well-known fuzzy logic technique is applied to compute similarity matching with flexibility. Experimental results demonstrate the competence of the system through feedback on customer satisfaction with various aspects of system performance. The proposed work could also provide guidelines for the design challenges of personalized recommendation models in many other domains.
Article
Recommendation systems are being widely adopted in many areas which include social networking, e-commerce etc. Long years of research have led to the proposal of many algorithms in order to effectively capture the real tastes of users and deliver the recommendations accurately. Collaborative filtering is considered to be one of the popular and successful approaches to provide recommendations. In this paper, we conduct a performance evaluation of three popular collaborative filtering algorithms viz. User based, Item based and Slope-one recommender. We illustrate a brief overview on the different approaches of collaborative filtering, their method of working, advantages and limitations. We demonstrate the results based on the evaluation metrics precision, recall, f-measure, fallout and reach. Our experiments revealed that the Slope-one approach outperformed the other two approaches based on the evaluation metrics. We also explored different kinds of similarity metrics and highlighted the effect of size of the neighbourhood on the evaluation metrics. Keywords: Collaborative Filtering (CF), Recommendation systems, Apache Mahout, User based CF, Item based CF, Slope one.
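To make the Slope-one recommender named in this abstract concrete, here is a minimal sketch of the weighted Slope One prediction rule in plain Python. It is not the Apache Mahout implementation the abstract refers to; the function name `slope_one_predict` and the toy `ratings` dictionary are hypothetical and serve only as illustration.

```python
def slope_one_predict(ratings, user, target):
    """Predict `user`'s rating of `target` with weighted Slope One.

    `ratings` maps user -> {item: rating}. Returns None when no other
    item can be related to `target` through common raters.
    """
    numerator = 0.0
    support = 0
    for item, user_rating in ratings[user].items():
        if item == target:
            continue
        # Rating differences (target - item) over users who rated both items.
        diffs = [r[target] - r[item]
                 for r in ratings.values()
                 if target in r and item in r]
        if not diffs:
            continue
        average_deviation = sum(diffs) / len(diffs)
        # Weight each item's vote by the number of users backing it.
        numerator += (average_deviation + user_rating) * len(diffs)
        support += len(diffs)
    return numerator / support if support else None


# Hypothetical toy data: three users, three items.
ratings = {
    "alice": {"A": 5, "B": 3, "C": 2},
    "bob": {"A": 3, "B": 4},
    "carol": {"B": 2, "C": 5},
}
prediction = slope_one_predict(ratings, "bob", "C")
```

The only model is the average rating difference between item pairs, which is why Slope One is often used as a simple baseline next to user-based and item-based neighbourhood methods.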
Article
As the state-of-the-art method in recommender systems, centralized collaborative filtering recommender systems (CFRS) suffer from the sparse data problem and a lack of scalability. In this paper, we propose a distributed collaborative filtering recommender system (DCFRS) for mobile users, built on cloud computing as an alternative architecture because of its scalability. Experimental results demonstrate that the proposed algorithm improves on the accuracy of a centralized system containing the same ratings and prove the feasibility and advantages of the proposed system.
Article
Full-text available
Recommender systems have been used successfully to deal with information overload problems in a wide variety of domains, ranging from e-commerce and e-tourism to e-learning. They typically predict a user's ratings of unseen items and recommend the top N items based on the user's profile. Moreover, the profile can be enriched with additional information such as contextual data, domain knowledge, and tagging information to improve the quality of recommendations. Traditional approaches have not been effective in exploiting these additional data sources. Hence, new techniques need to be developed for extracting and integrating them into the recommendation process. In this article, the authors present a survey of state-of-the-art recommendation approaches, their algorithms, and their issues, and provide further research directions for developing smart and intelligent recommender systems.
Article
Full-text available
As news selection is increasingly controlled by algorithms, a growing number of scholars are exploring how news recommenders can serve public services. Despite aspirations towards public service algorithms, little is known about which type of news recommender people prefer, let alone about a news recommender that aims to promote societal values. This study aims to give insights into audiences’ perceptions to news recommenders and their underlying news selection mechanisms. To do so, we distinguish between three news selection mechanisms, namely between content-based similarity, collaborative similarity and content-based diversity. The first two strive for similarity, respectively between news content and news users, while the third one aims for diversity in the news content consumed. Results of a large-scale survey (n = 943) show that people prefer content-based similarity over collaborative similarity and content-based diversity. Audience characteristics, such as news information overload and concerns towards missing challenging viewpoints, explain how audiences evaluate the different news selection mechanisms. We discuss how these results align with concerns about selectivity and how news algorithms can be used to tackle these concerns. We therefore introduce the concept ‘personalized diversity’ and promote the idea of news recommenders as an individual filter for the growing abundance of online information.
Thesis
Just-in-time recommender systems are systems able to provide recommendations tailored to the preferences and needs of users in order to help them access useful and interesting resources within a large data space. The user does not need to formulate a query; the query is implicit and corresponds to the resources that match the user's interests at the right time. Our work falls within this framework and focuses on developing a proactive context-aware recommendation approach for mobile devices that covers many domains. It aims to recommend relevant items that match users' personal interests at the right time without waiting for users to initiate any interaction. Indeed, the development of mobile devices equipped with persistent data connections, geolocation, cameras, and wireless capabilities allows current context-aware recommender systems (CARS) to be highly contextualized and proactive. We also take into consideration the degree to which a recommendation might disturb the user, balancing the recommendation process against intrusive interruptions, since various factors and situations make the user less open to recommendations. Because we work in the context of mobile devices, we consider mobile application functionalities such as the camera, the keyboard, and the agenda to be good representatives of the user's interaction with the device, since they stand for most of the activities a user performs on a mobile device on a daily basis, such as texting, chatting, tweeting, browsing, or taking selfies and pictures.
Article
The web, and social networks in particular, grows every day. The multiplicity of products and web pages on offer has made picking relevant items a tedious job, while the differing tastes and behaviors of users make it difficult to find a similar user among a large group. As a result, automated software systems have difficulty discovering what is interesting to users. We propose a new approach to cope with this flow. We exploit domain knowledge of the training data set to create a summary matrix, which consists of a small number of new columns corresponding to the attribute values of the selected feature. We fill the summary matrix with average ratings based on the number of times the attribute values appear in the user's profile for rated items. We use the summary matrix in two hybrid recommender systems, applying the meta-level technique, one of the pipelined hybridization techniques. The proposed approach reduces the effects of sparsity, cold start, and scalability, which are common problems in collaborative recommender systems. Furthermore, it improves recommendation accuracy compared with the collaborative filtering Pearson correlation approach and is faster as well.
Article
Existing similarity functions use the user-item rating matrix to find similar neighbours that can be used to predict ratings for users. However, these functions heavily penalise highly popular items, which leads to predicting items that may not interest active users because of the punishment function employed. They also reduce the chances of selecting less popular items as similar neighbours because they rely on items with common ratings. In this article, a popularised similarity function (pop_sim) is proposed to provide effective recommendations to users. The pop_sim function introduces a modified punishment function to minimise the penalty on highly popular items. It also employs a popularity constraint that uses a ratings threshold to increase the chances of selecting less popular items as similar neighbours. The experimental studies indicate that the proposed pop_sim is effective in improving the accuracy of rating prediction, lowering both the MAE and the RMSE.
Article
Machine learning is a method for learning from data without human involvement. Recommendation systems are a machine learning technique that has become essential to day-to-day e-commerce interaction on the internet. Many algorithms have been proposed to capture the tastes of users effectively and to provide accurate recommendations, and collaborative filtering is one such successful method. Classification, which also falls under machine learning, comprises many algorithms that can classify text, numerical data, etc. In this paper, we demonstrate two collaborative filtering algorithms, viz. User based and Item based recommender systems, and three classification algorithms, viz. Naive-Bayes, Logistic Regression and Random Forest Classification. We analysed the results based on evaluation metrics. Our experiment suggests that in recommender systems, Item based scores over User based, and in classification, Naive-Bayes emerges superior.
Article
Full-text available
Collaborative filtering or recommender systems use a database about user preferences to predict additional topics or products a new user might like. In this paper we describe several algorithms designed for this task, including techniques based on correlation coefficients, vector-based similarity calculations, and statistical Bayesian methods. We compare the predictive accuracy of the various methods in a set of representative problem domains. We use two basic classes of evaluation metrics. The first characterizes accuracy over a set of individual predictions in terms of average absolute deviation. The second estimates the utility of a ranked list of suggested items. This metric uses an estimate of the probability that a user will see a recommendation in an ordered list. Experiments were run for datasets associated with 3 application areas, 4 experimental protocols, and the 2 evaluation metrics for the various algorithms. Results indicate that for a wide range of conditions, Bayesian networks with decision trees at each node and correlation methods outperform Bayesian-clustering and vector-similarity methods. Between correlation and Bayesian networks, the preferred method depends on the nature of the dataset, nature of the application (ranked versus one-by-one presentation), and the availability of votes with which to make predictions. Other considerations include the size of database, speed of predictions, and learning time.
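As an illustration of the correlation-based family and the average absolute deviation metric mentioned in this abstract, the following is a minimal sketch of a Pearson-weighted neighbourhood prediction. The choice of user means (taken over all of a user's votes), the normalisation by the sum of absolute weights, the function names, and the toy `votes` data are assumptions made for this example, not details taken from the abstract.

```python
from math import sqrt


def mean_vote(user_votes):
    """Average of all votes cast by one user."""
    return sum(user_votes.values()) / len(user_votes)


def pearson(a, b):
    """Pearson correlation of two users over the items both voted on."""
    common = set(a) & set(b)
    if len(common) < 2:
        return 0.0
    mean_a, mean_b = mean_vote(a), mean_vote(b)
    num = sum((a[j] - mean_a) * (b[j] - mean_b) for j in common)
    den = sqrt(sum((a[j] - mean_a) ** 2 for j in common)
               * sum((b[j] - mean_b) ** 2 for j in common))
    return num / den if den else 0.0


def predict(votes, active, item):
    """Active user's mean plus the weighted deviations of the neighbours."""
    a = votes[active]
    weighted, norm = 0.0, 0.0
    for user, v in votes.items():
        if user == active or item not in v:
            continue
        w = pearson(a, v)
        weighted += w * (v[item] - mean_vote(v))
        norm += abs(w)
    return mean_vote(a) + (weighted / norm if norm else 0.0)


def mean_absolute_deviation(predicted, actual):
    """Average absolute deviation between predicted and observed votes."""
    return sum(abs(p - t) for p, t in zip(predicted, actual)) / len(actual)


# Hypothetical toy data on a 1-5 scale.
votes = {
    "a": {"x": 5, "y": 1},
    "b": {"x": 4, "y": 1, "z": 4},
    "c": {"x": 1, "y": 5, "z": 3},
}
estimate = predict(votes, "a", "z")
```

Negatively correlated neighbours contribute with a negative weight here, so a user with opposite tastes pushes the estimate away from their own deviation.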
Article
Full-text available
Collaborative filtering is an important technique of information filtering, commonly used to predict the interest of a user for a new item. In collaborative filtering systems, this prediction is made based on user-item preference data involving similar users or items. When the data is sparse, however, direct similarity measures between users or items provide little information that can be used for the prediction. In this paper, we present a new collaborative filtering approach that computes global similarities between pairs of items and users, as the equilibrium point of a system relating user similarities to item similarities. We show how this approach extends the classical techniques based on direct similarity, and illustrate, by testing on various datasets, its advantages over such techniques.
Article
Full-text available
Eigentaste is a collaborative filtering algorithm that uses universal queries to elicit real-valued user ratings on a common set of items and applies principal component analysis (PCA) to the resulting dense subset of the ratings matrix. PCA facilitates dimensionality reduction for offline clustering of users and rapid computation of recommendations. For a database of n users, standard nearest-neighbor techniques require O(n) processing time to compute recommendations, whereas Eigentaste requires O(1) (constant) time. We compare Eigentaste to alternative algorithms using data from Jester, an online joke recommending system. Jester has collected approximately 2,500,000 ratings from 57,000 users. We use the Normalized Mean Absolute Error (NMAE) measure to compare performance of different algorithms. In the Appendix we use Uniform and Normal distribution models to derive analytic estimates of NMAE when predictions are random. On the Jester dataset, Eigentaste computes recommendations two orders of magnitude faster with no loss of accuracy. Jester is online at: http://eigentaste.berkeley.edu
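The Normalized Mean Absolute Error used for comparison in this abstract can be sketched as the mean absolute error divided by the width of the rating scale. The function name `nmae` and the sample numbers below are hypothetical; the bounds of -10 and +10 are the rating range usually associated with the Jester data and are passed in explicitly as an assumption.

```python
def nmae(predicted, actual, r_min, r_max):
    """Mean absolute error divided by the rating range (r_max - r_min)."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    mae = sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)
    return mae / (r_max - r_min)


# Hypothetical predictions and observed ratings on a -10..10 scale.
score = nmae([2.0, -4.0], [0.0, -2.0], r_min=-10.0, r_max=10.0)
```

Dividing by the range makes errors comparable across datasets whose rating scales differ.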
Conference Paper
Full-text available
We have developed a method for recommending items that combines content and collaborative data under a single probabilistic framework. We benchmark our algorithm against a naïve Bayes classifier on the cold-start problem, where we wish to recommend items that no one in the community has yet rated. We systematically explore three testing methodologies using a publicly available data set, and explain how these methods apply to specific real-world applications. We advocate heuristic recommenders when benchmarking to give competent baseline performance. We introduce a new performance metric, the CROC curve, and demonstrate empirically that the various components of our testing strategy combine to obtain deeper understanding of the performance characteristics of recommender systems. Though the emphasis of our testing is on cold-start recommending, our methods for recommending and evaluation are general.
Conference Paper
Full-text available
Research shows that recommendations comprise a valuable service for users of a digital library [11]. While most existing recommender systems rely either on a content-based approach or a collaborative approach to make recommendations, there is potential to improve recommendation quality by using a combination of both approaches (a hybrid approach). In this paper, we report how we tested the idea of using a graph-based recommender system that naturally combines the content-based and collaborative approaches. Due to the similarity between our problem and a concept retrieval task, a Hopfield net algorithm was used to exploit high-degree book-book, user-user and book-user associations. Sample hold-out testing and preliminary subject testing were conducted to evaluate the system, by which it was found that the system gained improvement with respect to both precision and recall by combining content-based and collaborative approaches. However, no significant improvement was observed by exploiting high-degree associations.
Article
Full-text available
... problem through a collaborative filtering approach. PHOAKS works by automatically recognizing, tallying, and redistributing recommendations of Web resources mined from Usenet news messages. (PHOAKS: A collaborative filtering system that recognizes and reuses recommendations. Communications of the ACM, March 1997, Vol. 40, No. 3.) ... the same types of benefits. In the case of ratings-based systems, for example, everyone rates objects of interest. Yet there is evidence that people naturally prefer to play distinct producer/consumer roles in the information ecology [2]; in particular, only a minority of people expend the effort of judging information and volunteering their opinions to others. Independently, we have observed such role specialization in Netnews; authors volunteer long lists of recommended Web resources at a stable, but low, rate. PHOAKS assumes the roles of recommendation provider and recommendati
Article
Full-text available
Recommender systems apply statistical and knowledge discovery techniques to the problem of making product recommendations during a live customer interaction and they are achieving widespread success in E-Commerce nowadays. In this paper, we investigate several techniques for analyzing large-scale purchase and preference data for the purpose of producing useful recommendations to customers. In particular, we apply a collection of algorithms such as traditional data mining, nearest-neighbor collaborative filtering, and dimensionality reduction on two different data set...
Article
Full-text available
Recommender systems apply knowledge discovery techniques to the problem of making personalized recommendations for information, products or services during a live interaction. These systems, especially the k-nearest neighbor collaborative filtering based ones, are achieving widespread success on the Web. The tremendous growth in the amount of available information and the number of visitors to Web sites in recent years poses some key challenges for recommender systems. These are: producing high quality recommendations, performing many recommendations per second for millions of users and items and achieving high coverage in the face of data sparsity. In traditional collaborative filtering systems the amount of work increases with the number of participants in the system. New recommender system technologies are needed that can quickly produce high quality recommendations, even for very large-scale problems. To address these issues we have explored item-based collaborative filtering techniques. Item-based techniques first analyze the user-item matrix to identify relationships between different items, and then use these relationships to indirectly compute recommendations for users. In this paper we analyze different item-based recommendation generation algorithms. We look into different techniques for computing item-item similarities (e.g., item-item correlation vs. cosine similarities between item vectors) and different techniques for obtaining recommendations from them (e.g., weighted sum vs. regression model). Finally, we experimentally evaluate our results and compare them to the basic k-nearest neighbor approach. Our experiments suggest that item-based algorithms provide dramatically better performance than user-based algorithms, while at the same time p...
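Two of the building blocks named in this abstract, cosine similarity between item vectors and the weighted-sum prediction, can be sketched as follows. Restricting the cosine to users who rated both items, using every item the user has rated as a neighbour, and the toy `ratings` data are assumptions of this sketch rather than details stated in the abstract.

```python
from math import sqrt


def item_vectors(ratings):
    """Invert user -> {item: rating} into item -> {user: rating}."""
    items = {}
    for user, user_ratings in ratings.items():
        for item, value in user_ratings.items():
            items.setdefault(item, {})[user] = value
    return items


def cosine(vec_i, vec_j):
    """Cosine similarity of two items over the users who rated both."""
    common = set(vec_i) & set(vec_j)
    if not common:
        return 0.0
    dot = sum(vec_i[u] * vec_j[u] for u in common)
    norm_i = sqrt(sum(vec_i[u] ** 2 for u in common))
    norm_j = sqrt(sum(vec_j[u] ** 2 for u in common))
    return dot / (norm_i * norm_j) if norm_i and norm_j else 0.0


def predict(ratings, user, target):
    """Weighted sum of the user's own ratings, weighted by item similarity."""
    items = item_vectors(ratings)
    if target not in items:
        return None
    num, den = 0.0, 0.0
    for item, value in ratings[user].items():
        if item == target:
            continue
        sim = cosine(items[target], items[item])
        num += sim * value
        den += abs(sim)
    return num / den if den else None


# Hypothetical toy data: u3 has not rated item C yet.
ratings = {
    "u1": {"A": 5, "B": 4, "C": 1},
    "u2": {"A": 4, "B": 5, "C": 2},
    "u3": {"A": 1, "B": 2},
}
estimate = predict(ratings, "u3", "C")
```

Because the item-item similarities depend only on the rating matrix, they can be precomputed offline, which is the usual argument for the scalability of item-based methods.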
Article
Full-text available
This article discusses the challenges involved in creating a collaborative filtering system for Usenet news. The public trial of GroupLens invited users from over a dozen newsgroups selected to represent a cross-section of Usenet (listed in Table 1) to apply our news reader software to enter ratings and receive predictions (we provided GroupLens-adapted versions of Gnus, xrn, and tin). Over a seven-week trial starting February 8, 1996, we registered 250 users who submitted a total of 47,569 ratings and received over 600,000 predictions for 22,862 different articles. These users were volunteers who saw our announcement postings or our Web page. They downloaded specially modified news browsers that accepted ratings and displayed predictions on a 1--5 scale where 1 was described as "this item is really bad! a waste of net.bandwidth" and 5 as "this article is great, I would like to see more like it." For privacy reasons, users were known to us only by pseudonyms. Qualitative results are therefore the compilation of feedback from the GroupLens mailing list and private email rather than a comprehensive survey. In [5] we present a more detailed summary of the trial results, along with comparisons with noncollaborative approaches to managing Usenet news.
Article
In recent years, Collaborative Filtering (CF) has proven to be one of the most successful techniques used in recommendation systems. Since current CF systems estimate the ratings of not-yet-rated items based on other items’ ratings, these CF systems fail to recommend products when users’ preferences are not expressed in numbers. In many practical situations, however, users’ preferences are represented by ranked lists rather than numbers, such as lists of movies ranked according to users’ preferences. Therefore, this study proposes a novel collaborative filtering methodology for product recommendation when the preference of each user is expressed by multiple ranked lists of items. Accordingly, a four-staged methodology is developed to predict the rankings of not-yet-ranked items for the active user. Finally, a series of experiments is performed, and the results indicate that the proposed methodology produces high-quality recommendations.
Article
This paper addresses the problems that must be considered if computers are going to treat their users as individuals with distinct personalities, goals, and so forth. It first outlines the issues, and then proposes stereotypes as a useful mechanism for building models of individual users on the basis of a small amount of information about them. In order to build user models quickly, a large amount of uncertain knowledge must be incorporated into the models. The issue of how to resolve the conflicts that will arise among such inferences is discussed. A system, Grundy, is described that builds models of its users, with the aid of stereotypes, and then exploits those models to guide it in its task, suggesting novels that people may find interesting. If stereotypes are to be useful to Grundy, they must accurately characterize the users of the system. Some techniques to modify stereotypes on the basis of experience are discussed. An analysis of Grundy's performance shows that its user models are effective in guiding its performance.
Conference Paper
In this paper, we describe a new model for collaborative filtering. The motivation of this work comes from the fact that two users with very similar preferences on items may have very different rating schemes. For example, one user may tend to assign a higher rating to all items than another user. Unlike previous models of collaborative filtering, which determine the similarity between two users only based on their rating performance, our model treats the user's preferences on items separately from the user's rating scheme. More specifically, for each user, we build two separate models: a preference model capturing which items are favored by the user and a rating model capturing how the user would rate an item given the preference information. The similarity of two users is computed based on the underlying preference model, instead of the surface ratings. We compare the new model with several representative previous approaches on two data sets. Experiment results show that the new model outperforms all the previous approaches that are tested consistently on both data sets.
Conference Paper
Current collaborative recommendation approaches mainly measure user similarity by comparing users' entire interests and do not consider the quality of a user's interests, especially interest span. We propose a new approach that provides inter-website recommendation on a proxy server based on partial similarity of interests, and we construct a corresponding user interest model to realize this method. According to the psychological characteristics of interests, this approach divides a user's interests into several interest-points, which are correlated with each other and further divided into long-term and short-term interests. We mine the interest quality and the correlation of interest-points from the proxy log to construct the user's interest model. The method adopts separate recommendation mechanisms for long-term and short-term interests: it serves the target user's long-term interests based on neighbors with partially similar interests and the short-term interests based on experienced users. Experimental results indicate that this method can recommend interesting and unexpected inter-website pages to target users and improve the precision of personalized recommendation service on a proxy server.
Article
predicated on the belief that information filtering can be more effective when humans are involved in the filtering process. Tapestry was designed to support both content-based filtering and collaborative filtering, which entails people collaborating to help each other perform filtering by recording their reactions to documents they read. The reactions are called annotations; they can be accessed by other people’s filters. Tapestry is intended to handle any incoming stream of electronic documents and serves both as a mail filter and repository; its components are the indexer, document store, annotation store, filterer, little box, remailer, appraiser and reader/browser. Tapestry’s client/server architecture, its various components, and the Tapestry query language are described.
Article
Recommendation algorithms are best known for their use on e-commerce Web sites, where they use input about a customer’s interests to generate a list of recommended items. Many applications use only the items that customers purchase and explicitly rate to represent their interests, but they can also use other attributes, including items viewed, demographic data, subject interests, and favorite artists. At Amazon.com, we use recommendation algorithms to personalize the online store for each customer. The store radically changes based on customer interests, showing programming titles to a software engineer and baby toys to a new mother. The click-through and conversion rates, two...
Article
Recommendation algorithms are best known for their use on e-commerce Web sites, where they use input about a customer's interests to generate a list of recommended items. Many applications use only the items that customers purchase and explicitly rate to represent their interests, but they can also use other attributes, including items viewed, demographic data, subject interests, and favorite artists. At Amazon.com, we use recommendation algorithms to personalize the online store for each customer. The store radically changes based on customer interests, showing programming titles to a software engineer and baby toys to a new mother. There are three common approaches to solving the recommendation problem: traditional collaborative filtering, cluster models, and search-based methods. Here, we compare these methods with our algorithm, which we call item-to-item collaborative filtering. Unlike traditional collaborative filtering, our algorithm's online computation scales independently of the number of customers and number of items in the product catalog. Our algorithm produces recommendations in real-time, scales to massive data sets, and generates high quality recommendations.
Article
Predicting items a user would like on the basis of other users' ratings for these items has become a well-established strategy adopted by many recommendation services on the Internet. Although this can be seen as a classification problem, algorithms proposed thus far do not draw on results from the machine learning literature. We propose a representation for collaborative filtering tasks that allows the application of virtually any machine learning algorithm. We identify the shortcomings of current collaborative filtering techniques and propose the use of learning algorithms paired with feature extraction techniques that specifically address the limitations of previous approaches. Our best-performing algorithm is based on the singular value decomposition of an initial matrix of user ratings, exploiting latent structure that essentially eliminates the need for users to rate common items in order to become predictors for one another's preferences. We evaluate the prop...
Article
Recent projects in collaborative filtering and information filtering address the task of inferring user preference relationships for products or information. The data on which these inferences are based typically consists of pairs of people and items. The items may be information sources (such as web pages or newspaper articles) or products (such as books, software, movies or CDs). We are interested in making recommendations or predictions. Traditional approaches to the problem derive from classical algorithms in statistical pattern recognition and machine learning. The majority of these approaches assume a "flat" data representation for each object, and focus on a single dyadic relationship between the objects. In this paper, we examine a richer model that allows us to reason about many different relations at the same time. We build on the recent work on probabilistic relational models (PRMs), and describe how PRMs can be applied to the task of collaborative filtering. PRMs allow us t...
Article
This paper describes a technique for making personalized recommendations from any type of database to a user based on similarities between the interest profile of that user and those of other users. In particular, we discuss the implementation of a networked system called Ringo, which makes personalized recommendations for music albums and artists. Ringo's database of users and artists grows dynamically as more people use the system and enter more information. Four different algorithms for making recommendations by using social information filtering were tested and compared. We present quantitative and qualitative results obtained from the use of Ringo by more than 2000 people. KEYWORDS: social information filtering, personalized recommendation systems, user modeling, information retrieval, intelligent systems, CSCW. INTRODUCTION Recent years have seen the explosive growth of the sheer volume of information. The number of books, movies, news, advertisements, and in particular on-lin...
Progress of the personalized recommendation systems
  • Liu Jianguo
  • Zhou Tao
Liu Jianguo, Zhou Tao, et al. Progress of the personalized recommendation systems. Progress of Nature and Science, 2009, 19(1):1-15
Chanle Wu, born in 1945, professor. Research interests include computer networks, e-Learning, grid computing, and the semantic web.
Yibo Chen, born in 1982, Ph.D. candidate. Research interests include personalized recommendation and the semantic web.
Ming Xie, born in 1978, Ph.D. candidate. Research interests include data mining and the semantic web.
GroupLens: Applying collaborative filtering to Usenet news
  • JA Konstan
  • BN Miller
Konstan JA, Miller BN, et al. GroupLens: Applying collaborative filtering to Usenet news. Comm ACM, 1997, 40(3):77-87