
An Analytical Study and Analysis of Filtering Techniques used for Classification of Recommender Systems


This research proposes a classification of recommender systems based upon the filtering techniques used for refining the recommendations. The aim is to explore different types of filtering techniques and ideas and use them to classify the recommender systems with justifications, their advantages, and disadvantages, and then compare these techniques on the basis of input collection, processing data, and output as a recommendation. The classification is on the basis of how a system treats the data, the collection process of the data, approaches of collecting and processing the data, and the area of application. In the research, we have classified the recommender systems using various filtering techniques, viz. collaborative filtering, content-based filtering, demographic filtering, and knowledge-based filtering techniques of data processing, approaches, and applications. These techniques have been diagrammatically represented throughout the chapters to best explain the scenarios where these techniques can be helpful. The research also discusses the shortfalls of these techniques in their respective region and where and how they overcome the limitations of other techniques described throughout the research.
An Analytical Study and Analysis of Filtering Techniques
used for Classification of Recommender Systems
DISSERTATION REPORT
Submitted by
Mohammad Muzammil Khan
2017 301 080
in partial fulfilment for the award of the degree of
Bachelor of Computer Applications
Under the supervision of
Prof. M. Afshar Alam
Department of Computer Science & Engineering
School of Engineering Sciences & Technology
JAMIA HAMDARD
(Deemed to be University)
New Delhi-110062
2020
Table of Contents
CERTIFICATE
DECLARATION
ACKNOWLEDGEMENT
List of Abbreviation(s)
List of Table(s)
List of Figure(s)
Title and Abstract
Chapter 1: Introduction
Chapter 2: Past researches
Chapter 3: Basis of Classification of Recommender Systems
3.1 Definition
3.2 Scope
3.3 Basis of Classification
Chapter 4: Collaborative Filtering
4.1 User-based and Item-based Collaborative Filtering techniques
4.2 Model-Based Collaborative Filtering techniques
Chapter 5: Content-based Filtering
5.1 Heuristic-Based
5.2 Model-Based
5.3 Web mining-Based
Chapter 6: Demographic Filtering
Chapter 7: Knowledge-based Filtering
7.1 Case-based recommendations
7.2 Constraint-based recommendations
Chapter 8: Comparison of Classified Recommender Systems
Chapter 9: Conclusion
References
Conference certificate and conference paper
Certificate
DECLARATION
I, Mohammad Muzammil Khan, a student of Bachelor of
Computer Applications (BCA) (Enrolment No: 2017-301-080) hereby
declare that the dissertation entitled “An Analytical Study and Analysis of
Filtering Techniques used for Classification of Recommender Systems”
which is being submitted by me to the Department of Computer Science &
Engineering, School of Engineering Sciences & Technology, Jamia
Hamdard, New Delhi in partial fulfilment of the requirement for the award
of the degree of Bachelor of Computer Applications (BCA), is my
original work and has not been submitted anywhere else for the award of
any Degree, Diploma, Associateship, Fellowship or other similar title or
recognition.
Mohammad Muzammil Khan
Date:
Place: New Delhi, India
ACKNOWLEDGEMENT
“And remember! Your Lord caused to be declared: If ye are grateful, I will add more unto you; …”
(The Quran, Ibrahim:7)
All thanks to Allah, the Almighty, who graced me with the opportunity and
stamina to complete this project with ease and well within time.
After the Almighty, I would like to extend my gratitude to my supervisor
and mentor Prof. M. Afshar Alam, Dean of School of Engineering Sciences
and Technology, Head of Department of Computer Science and
Engineering, Jamia Hamdard, who was extremely helpful and provided me
with much-needed support in all my highs and lows. I am in awe of his
dedication and deeply revere his great personality. His dynamism, vision,
sincerity, and motivation have greatly inspired me. I am greatly privileged
and honoured to study under his esteemed guidance.
I am extremely grateful to my family for their love, care, prayers, and
sacrifices in educating me and helping me overcome all the obstacles of my
life. I am especially grateful to my sister, Ms Darakhshan Ishrat, who helped
me in ways I can’t even begin to fathom.
I must thank all of my teachers for showing me, each in their unique way,
what it means to be dedicated to your work. All of them have given me their
time, energy, and expertise, and hence, made me richer for it. I shall always
be indebted to them. My completion of this work would not have been
possible without the kind support of my teachers who either directly or
indirectly shaped this work.
Mohammad Muzammil Khan
List of Abbreviation(s)
Abbreviation    Full form
RS/RecSys       Recommender System(s)
CF              Collaborative Filtering
k-NN            k-nearest neighbour
MBCF            Model-Based Collaborative Filtering
CBF             Content-based Filtering
DF              Demographic Filtering
KBF             Knowledge-based Filtering
HRS             Hybrid Recommender System
List of Table(s)
S. No.    Table No.    Caption
1         Table 4.1    Comparison on the basis of input, processing, and output.
List of Figure(s)
S. No.    Figure No.    Caption
1         Figure 4.1    Representation of Collaborative Filtering
2         Figure 4.2    A recommendation is made based on CF
3         Figure 5.1    Representation of Content-based Filtering
4         Figure 5.2    Item is recommended based on CBF
5         Figure 6.1    Representation of Demographic Filtering
6         Figure 6.2    Recommendation for demographic neighbourhood
7         Figure 7.1    A query to a recommender system
8         Figure 7.2    Recommendation based on KBF system
Title and Abstract
An Analytical Study and Analysis of Filtering Techniques Used
for Classification of Recommender Systems*
* Khan, Mohammad Muzammil, A Review on Filtering Techniques Used for Classification of Recommender
Systems (March 29, 2020). Available at
SSRN: https://ssrn.com/abstract=3563557 or http://dx.doi.org/10.2139/ssrn.3563557
Presented at the 3rd International Conference on Innovative Computing & Communications (ICICC) 2020,
held on 21-23 February 2020.
Abstract. This research proposes a classification of recommender
systems based upon the filtering techniques used for refining the
recommendations. The aim is to explore different types of filtering
techniques and ideas and use them to classify the recommender systems
with justifications, their advantages, and disadvantages, and then compare
these techniques on the basis of input collection, processing data, and output
as a recommendation. The classification is on the basis of how a system
treats the data, the collection process of the data, approaches of collecting
and processing the data, and the area of application.
In the research, we have classified the recommender systems using
various filtering techniques, viz. collaborative filtering, content-based
filtering, demographic filtering, and knowledge-based filtering techniques
of data processing, approaches, and applications. These techniques have
been diagrammatically represented throughout the chapters to best explain
the scenarios where these techniques can be helpful. The research also
discusses the shortfalls of these techniques in their respective region and
where and how they overcome the limitations of other techniques described
throughout the research.
Keywords. Recommender Systems, demographic filtering, collaborative
filtering, content-based filtering, knowledge-based filtering.
CHAPTER I
INTRODUCTION
Introduction
Over time, technological advancements have made it harder, not easier, for
buyers to find an equilibrium between their needs and wants and the
products available to them. Because of these advancements, consumers now
have a wider variety of choices, beneficial or not, than ever before. This is
where Recommender Systems play a major role: by understanding consumer
behaviour, analysing purchases, and predicting future purchases, they help
buyers get what they would prefer.
Today, we use Recommendation Systems (RecSys) everywhere, whether for
online shopping [1], audio and video streaming services, or
recommendations for doctors and hospitals [2, 3]. We are surrounded by
systems that determine what we would like and recommend it to us. We are
so accustomed to these recommendations that the online world we recognize
today is very much built by automated systems that determine our likes for
us. Without these systems, the online world would not be recognizable as
the one we are used to today.
Many studies have been reported, especially in the field of book
recommendation [4, 5], and several techniques have been used, including
novel fuzzy techniques [6], Ordered Weighted Averaging [2], and implicit
feedback for recommendations [7-9].
The research in recommendation technology has always longed for new
and innovative approaches. A system that is relevant today might not be able
to yield good recommendations in the next, say, five years. The technology
has been continuously improving but the true potential of recommender
systems remains somewhat untapped. There is still much yet to be
discovered which could help in improving the recommendations and user
experiences. With the help of this research, we aim to classify and justify
the classification of these recommender systems.
Recommender systems are crucial in some industries: an efficient system
can generate substantial profit and can help a company stand out
significantly from its competitors. As evidence of their importance, from
2006 to 2009 Netflix, a video streaming service, ran a competition called
the Netflix Prize, offering US$1 million for a recommender system that
performed better than its own algorithm. The BellKor's Pragmatic Chaos
team won the contest in 2009 [10] with a system whose recommendations
were 10% more accurate than those of Netflix's system.
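To make the idea of "10% more accurate" concrete: the Netflix Prize scored entries by root-mean-square error (RMSE) between predicted and actual ratings, so an improvement is a relative reduction in RMSE. The following is a minimal sketch of that calculation. The ratings and both sets of predictions are made-up illustrative values, not data from the contest, and the function names are our own.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between two equally long rating lists."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("rating lists must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(actual))


def relative_improvement(baseline_rmse, new_rmse):
    """Fraction by which new_rmse is lower than baseline_rmse."""
    return (baseline_rmse - new_rmse) / baseline_rmse


# Hypothetical ratings on a 1-5 scale (illustrative only).
actual = [4, 3, 5, 2]
baseline_predictions = [3, 3, 4, 3]           # assumed existing system
candidate_predictions = [3.5, 3, 4.5, 2.5]    # assumed new system

baseline_error = rmse(baseline_predictions, actual)
candidate_error = rmse(candidate_predictions, actual)
improvement = relative_improvement(baseline_error, candidate_error)

print(f"baseline RMSE:  {baseline_error:.4f}")
print(f"candidate RMSE: {candidate_error:.4f}")
print(f"improvement:    {improvement:.1%}")
```

In this toy data the candidate halves every error, so the improvement is far larger than the 10% threshold of the actual contest; the point is only how the comparison is computed.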
Different types of recommender systems face different difficulties in the
establishment phase as well as in later stages where the network gets too
big, as we shall discuss in detail in chapter four. Some systems overcome
other systems’ complications but introduce another problem in the process.
One of the main problems faced by every recommender system is the cold
start. A cold start occurs when the system doesn’t have enough
information, either about the user/consumer or about the product, to
accurately predict what the user/consumer would like to purchase or have.
A lack of information about the user/consumer leads to nonsensical
recommendations, while a lack of information about a product means the
system may fail to make the best possible recommendation. The cold start
problem can thus deter and dissuade new consumers and deprive newly added
products of much-needed exposure. We shall discuss how to deal with this
problem efficiently, along with other problems, and the solutions and
methods to use, throughout the research.
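The two sides of the cold start described above (too little information about a user, or about a product) can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the interaction log, the user and item names, and the threshold of two interactions below which an entity is treated as "cold". It only detects cold users and items; it does not implement any of the remedies discussed later in the report.

```python
from collections import Counter

# Hypothetical interaction log of (user, item) pairs; all names are illustrative.
interactions = [
    ("alice", "book_a"), ("alice", "book_b"), ("alice", "book_c"),
    ("bob", "book_a"), ("bob", "book_b"),
    ("carol", "book_a"),
]
users = ["alice", "bob", "carol", "dave"]            # dave has just signed up
catalogue = ["book_a", "book_b", "book_c", "book_d"]  # book_d is newly added

MIN_INTERACTIONS = 2  # assumed threshold below which an entity counts as "cold"


def is_cold(entity, counts, threshold=MIN_INTERACTIONS):
    """Return True when the system has too little data about the entity."""
    return counts.get(entity, 0) < threshold


# Count how many interactions the system has seen per user and per item.
user_counts = Counter(user for user, _ in interactions)
item_counts = Counter(item for _, item in interactions)

cold_users = [u for u in users if is_cold(u, user_counts)]
cold_items = [i for i in catalogue if is_cold(i, item_counts)]

print("cold-start users:", cold_users)
print("cold-start items:", cold_items)
```

Here the new user and the newly added book are both flagged, which mirrors the two failure cases in the paragraph: poor recommendations for the newcomer and no exposure for the new product.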
Therefore, it is imperative to understand the different types of
recommender systems, their usage, and their fields of application in order
to choose the best system for a given environment. This paper describes and
classifies the fundamental filtering techniques used in Recommender Systems
and classifies the systems based on these techniques. It examines how they
differ in the techniques used to collect data, process the data, and infer
recommendations from previously known attributes or newly learnt
information about either the user or the product. It also discusses in
detail possible solutions to the problems faced by a recommender system.
The report is structured as follows. Chapter Two reviews past research in
the field of recommendation technology and its innovative takes. Chapter
Three provides a window into the basis of classification and the scope of
this research. Chapters Four to Seven discuss the classified filtering
techniques of recommender systems, each offering a different perspective on
filtering for our classification, enriched with examples and different
scenarios to help visualize the recommendation process better. Chapter
Eight compares these techniques on the basis of data collection,
processing, and output as a recommendation. Finally, Chapter Nine concludes
the research with our findings and discusses our approach to
classification.
Article
Objectives: To propose the top books for universities students by using the proposed fuzzy based approach, Ordered Ranked weighted Aggregation method. Methods/Statistical Analysis: The recommendations of books by different universities differ significantly. A staunch aggregation of the differently recommended books by the top ranked universities may lead to vigorous recommendation. We apply Positional Aggregation based Scoring technique, a rank aggregation method for partial list. We have suggested Ordered Ranked Weighted Aggregation (ORWA) operator, which assigns weights to the ranker. Findings: By using proposed technique, the recommendation of top ranked university is preferred over lower ranked universities. The philosophy of ORWA is the fact that the recommendation of a book by a top ranked university will eventually increase the importance of the recommended books. The top 20 books on “Artificial Intelligence” are recommended using PAS and ORWA based techniques. The recommendation would help the users in finding the books of their requirement. Improvements: The relative comparisons between both the discussed techniques PAS and ORWA are discussed and shown graphically. The results indicate a clear improvement of ORWA over PAS.
Article
The customer's review plays an important role in deciding the purchasing behavior for online shopping as a customer prefers to get the opinion of other customers by observing their opinion through online products’ reviews, blogs and social networking sites etc. The customer's reviews reflect the customer's sentiments and have a substantial significance for the products being sold online including electronic gadgets, movies, house hold appliances and books. Hence, extracting the exact features of the products by analyzing the text of reviews requires a lot of efforts and human intelligence. In this paper we intend to analyze the online reviews available for books and extract book-features from the reviews using human intelligence. We have proposed a technique to categorize the features of books from the reviews of the customers. The extracted features may help in deciding the books to be recommended for readers. The ultimate goal of the work is to fulfill the requirement of the user and provide them their desired books. Thus, we have evaluated our categorization method by users themselves, and surveyed qualified persons for the concerned books. The survey results show high precision of the features categorized which clearly indicates that proposed method is very useful and appealing. The proposed technique may help in recommending the best books for concerned people and may also be generalized to recommend any product to the users.
Conference Paper
The text-to-speech and speech-to-text functionalities have become an integral part of our lives in this digital era. This paper proposes a system that provides a way to generate audio from text and listen to the user’s microphone and convert speech-to-text. This system shall be implemented with various user-friendly features. The target audience of this system is people with disabilities like dyslexia, reading challenges, or visual impairment which results in difficulty in reading or writing. This system can aid them and help them use technology. Its simplicity and ease of use make it stand apart from such existing systems.
Article
Evaluation strategies are essentials in assessing the degree of satisfaction that recommender systems can provide to users. The evaluation schemes rely heavily on user feedback, however these feedbacks may be casual, biased or spam which leads to an inappropriate evaluation. In this paper, a comprehensive approach for the evaluation of recommendation system is proposed. The implicit user feedbacks are taken for the different products on the basis of the reviews provided to them. A novel sincerity check mechanism is suggested to render the biasedness and casual among the users. Further, mathematical model is presented to classify the products preference criteria. The list of the preferred products yield different ranking. Rank aggregation algorithm is used to obtain a final ranking, which is compared with the base ranking to be evaluated. Hence, with the help of suggested methodology, an evaluation strategy is suggested that avoids the risk of fake and biased feedbacks. The comparison of the proposed approach with existing schemes shows the superiority of the aforementioned approach from various parameters. It is envisaged that the proposed evaluation scheme lays a platform for users to assess the recommender systems for their ease and reliable online shopping.
Article
The text book prescribed to the students at universities helps them a lot in acquiring knowledge and performing well in their courses. However, the recent research suggests there is a decline in the number of text book readers. Since recommender systems help in providing items of users’ need, good and precise recommendation of the books could enhance the users’ affinity toward reading the books. The primary objective of this paper is to present an opinion mining-based recommendation technique which can provide the university students with the promising books for their syllabus. The problem with the existing recommender technique is that these methods take only expert recommendation in consideration and the involvement of the opinion of the users, i.e., students/readers, has not been considered which allow us to understand how the readers perceive the recommended books and whether they are satisfied with the recommendation or not? To address the issue, we consider experts and readers both by employing experts’ recommendation for books at top-ranked universities and exploring users’ reviews on the concerned book at online retailers’ sites such as Amazon. To validate the efficacy of the proposed algorithms, eight different parameters have been used; up to 55% improvement in the result has been obtained through proposed method. It is envisaged that the adopted opinion mining approach can be very useful for the recommendation of products of other domain too.
Article
Generally the book recommendation approaches are personalized in nature, that is, they utilize the users’ purchasing behavior to recommend them the book similar to their preferences. The main problem with the personalized recommendation is its knowledge requirement about users’ past preferences. As a result, these techniques fail in producing appropriate recommendation for a new user whose preferences are not known. The personalized recommendation also needs extra space to store the users’ preferences. In this paper, a framework to recommend books to university students for their studies is presented. In order to answer which books are to be included in the syllabus, a specialized way of recommendation, where recommendations from experts of the subjects at different universities are considered, is presented. We have suggested a ranked recommendation approach for books, which employ Ordered Weighted Aggregation (OWA), a fuzzy-based aggregation, to aggregate the several ranking of the top universities. On the one hand, it does not need user prior preferences, and on the other hand, it eases the complexities of personalized recommendation to huge number of users and replaces it with a single ranked recommendation. The experimental results are compared with the existing positional aggregation algorithm that demonstrates significant improvement in the results with respect to various performance metrics.
Conference Paper
The modern tools and techniques have given a great opportunities to the researchers to involve these advancement in solving the daily life issues, and making an easy platform to fulfill the need of a common man. In recent days a great revolution in a common man's daily life has been observed by the use of wearable technology based devices, i.e. wearable devices. These devices get a great and positive response from the perspective of business markets which lead the researchers to incline towards this field of research. In this paper, we have discussed several wearable devices and studied a framework, which is based on opinion mining to enhance the wearable technologies. The enhancement may support manufacturers to produce an efficient wearable device for the consumers. The proposed model, “User feedback based Model for Enhancement of Wearable Technology” (UWM) in the concerned study presented a two-step procedure for enhancement. We have concluded by our study that the proposed model, UMW, may help in enhancing the quality of the wearable devices.
Conference Paper
The recommender systems are being used immensely to promote various services, products and facilities of daily life. Due to the success of this technology, the reliance of people on the recommendations of others is increasing with tremendous pace. One of the best and easiest ways to acquire the suggestions of the other like-minded and neighbor customers is to mine their opinions about the products and services. In this paper, we present a feature based opinion extraction and analysis from customers’ online reviews for books. Ordered Weighted Aggregation (OWA), a well-known fuzzy averaging operator, is used to quantify the scores of the features. The linguistic quantifiers are applied over extracted features to ensure that the recommended books have the maximum coverage of these features. The results of the three linguistic quantifiers, “at least half”, “most” and “as many as possible” are compared based on the evaluation metric - precision@5. It is evident from the results that quantifier “as many as possible” outperformed others in the aforementioned performance metric. The proposed approach will surely open a new chapter in designing the recommender systems to address the expectation of the users and their need of finding relevant books in a better way.
Conference Paper
A major challenge in collaborative filtering based recommender systems is how to provide recommendations when rating data is sparse or entirely missing for a subset of users or items, commonly known as the cold-start problem. In recent years, there has been considerable interest in developing new solutions that address the cold-start problem. These solutions are mainly based on the idea of exploiting other sources of information to compensate for the lack of rating data. In this paper, we propose a novel algorithmic framework based on matrix factorization that simultaneously exploits the similarity information among users and items to alleviate the cold-start problem. In contrast to existing methods, the proposed algorithm decouples the following two aspects of the cold-start problem: (a) the completion of a rating sub-matrix, which is generated by excluding cold-start users and items from the original rating matrix; and (b) the transduction of knowledge from existing ratings to cold-start items/users using side information. This crucial difference significantly boosts the performance when appropriate side information is incorporated. We provide theoretical guarantees on the estimation error of the proposed two-stage algorithm based on the richness of similarity information in capturing the rating data. To the best of our knowledge, this is the first algorithm that addresses the cold-start problem with provable guarantees. We also conduct thorough experiments on synthetic and real datasets that demonstrate the effectiveness of the proposed algorithm and highlights the usefulness of auxiliary information in dealing with both cold-start users and items.
Article
This paper presents an overview of the field of recommender systems and describes the current generation of recommendation methods that are usually classified into the following three main categories: content-based, collaborative, and hybrid recommendation approaches. This paper also describes various limitations of current recommendation methods and discusses possible extensions that can improve recommendation capabilities and make recommender systems applicable to an even broader range of applications. These extensions include, among others, an improvement of understanding of users and items, incorporation of the contextual information into the recommendation process, support for multicriteria ratings, and a provision of more flexible and less intrusive types of recommendations.