S. Yamamoto (Ed.): HIMI 2014, Part II, LNCS 8522, pp. 579–589, 2014.
© Springer International Publishing Switzerland 2014
Operations Research and Recommender Systems
Thomas Asikis and George Lekakos
Department of Management Science and Technology,
Athens University of Economics and Business, Athens, Greece
asikis.thomas@gmail.com, glekakos@aueb.gr
Abstract. Nowadays, Recommender Systems (RS) are widely and successfully used in online applications. A successful Recommender System can help increase the revenue of a website as well as help it maintain and grow its user base. Until now, research in recommendation algorithms has mainly been based on machine learning and AI techniques. In this article we aim to develop recommendation algorithms utilizing Operations Research (OR) methods, which provide the ability to move towards an optimized set of items to be recommended. We focus on expressing the Collaborative Filtering Algorithm (CF or CFA) as a Greedy Construction Algorithm, as well as on implementing and testing a Collaborative Metaheuristic Algorithm (CMA) for providing recommendations. The empirical findings suggest that the recommendation problem can indeed be defined as an optimization problem, which opens new opportunities for the application of powerful and effective OR algorithms to recommendation problems.
Keywords: Recommender Systems, Personalization algorithms, Operations Research, Metaheuristics.
1 Introduction
The number of possible choices on sites that offer products or services increases rapidly day by day. Indeed, a typical user may have to make decisions such as: “With whom should I connect in a social network?”, “What song should I listen to on SoundCloud?”, “Which product should I buy on Amazon?”. Websites offer a vast number of possible choices for questions such as the above, so users need support in their decision-making process in order to make the best possible selections [1].
Recommender Systems (RS) represent a type of information filtering system that aims at predicting a user’s items of interest in a large space of possible options, based on the user’s previous preferences [2]. Typical recommendation approaches include collaborative filtering, content-based filtering and hybrid methods. Collaborative Filtering RS rely mostly on the behavioral similarity between users. Content-Based Filtering RS focus on the similarities between item features that a user has favored in the past. In addition, several hybrid methods utilizing collaborative, content-based, demographic, and knowledge-based criteria have been developed in recent years [3]. In this article, the focus is on the collaborative approach due to its popularity, simplicity and intuitiveness.
Operations Research (OR) provides methods and techniques that support the decision-making process by evaluating every possible alternative and estimating the potential outcome [4]. The main idea underlying the work presented in this paper is to exploit OR methods to move towards an optimized set of items to be recommended to the user.
Combinatorial Optimization is a method used in OR for the identification of the optimal object or objects in a finite collection of objects [5]. It provides a suitable framework for the definition of a recommendation problem as an OR “Selection” problem. Recommender Systems aim at creating a selection of items to which a user is most likely to respond positively (and eventually purchase or use the proposed items). In OR Selection problems, the aim is to select each item for the recommendation based on a choice criterion and to evaluate the solution based on a specific evaluation criterion.
Along the above line of thinking, the objective of this paper is twofold: (a) to demonstrate that the recommendation problem can be defined and treated as an OR problem, and (b) to design a well-performing recommendation algorithm based on OR techniques.
In the following, we first present extant related research and then define the recommendation problem as an OR problem. In the next section we describe the implementation of a Collaborative-filtering Greedy-construction Algorithm (CGA) and a Collaborative Metaheuristic Algorithm (CMA) to determine whether the combinatorial approach is suitable for the Social Recommender System Problem (SRSP). Finally, an empirical evaluation of the above algorithms is presented and the results are discussed in the final section of the paper.
2 Related Work
In this section we present some relevant RS techniques and discuss whether they can be used to enhance the OR implementation in RS. In general, there are many criteria that can be used for optimizing a recommendation. This gives us the flexibility to construct more reliable and accurate OR optimization methods. Trust is a concept that in general reflects the probability of someone performing an action, based on the actions performed by another person [6]. People are naturally grouped by trust. People who trust one another can usually influence each other’s behavior, and a high level of trust usually benefits all the parties in a transaction by reducing the transaction cost between the seller and the buyer of an item. Moreover, trust can be aggregated and propagated through the members of a system. These features have made trust one of the key components of some RS [7].
The advantages of trust-based recommender systems can be found in a number of aspects: invulnerability to malicious attacks, greater control of the recommendation process, and the ability to provide users with an explanation for each item recommended to them. However, an important problem of trust-based algorithms is data sparseness. Trust-based algorithms tend to face difficulty with sparse datasets [7], and their ability to produce good recommendations is then limited.
On the other hand, features (mostly used in e-commerce recommenders where online purchases are enabled) such as the item’s price may be used as a selection criterion. Such criteria are particularly useful for optimization problems that aim at maximizing product suppliers’ profits. The optimization problem can then be rephrased as follows: how can we provide recommendations that match the user’s interests while ensuring the maximum profit for the provider? [8; 12].
Novelty and diversity of recommendations can also serve as selection criteria, as they may lead to quite effective recommendations [9]. Novel and diverse recommendations refer to items beyond the typical spectrum of items previously seen or consumed by the user, and are therefore perceived as unexpected options that the user may have never considered in the past [10].
3 Defining Recommendation as an OR Problem
3.1 Problem Definition
A RS can be defined by the following:
U = {u1, u2, …, uN} → the set of users.
A = {a1, a2, …, an} → the set of items.
Ci = {c1, c2, …, cm} → the set of items that user ui has already chosen.
Pt = {p1, p2, …, pk} → the set of items that are selected to be recommended to Ut (the recommendations).
From the above formalization we have:
Ut → the target user, who will receive the recommendations.
Ct = {t1, t2, …, tm} → the set of items that Ut has already used.
Ri = {r_{i,c1}, …, r_{i,cj}} → the set of ratings that user ui has given to items c1, …, cj.
In combinatorial terms, the above problem can be expressed as follows:
“Which is the optimal1 set of items that we can recommend to a user?”
In order to operationalize the above problem in Operations Research terms, the following elements are defined:
Form of Problem Solution: The solution to the above problem is a selection of items from the set A that represents the set of items to be recommended (Pt).
Element of Solution: The element of the solution is a single item pj that will be recommended to the user.
Criterion of Choice: The criterion for selecting an item from A and using it as a recommendation in Pt. It can be expressed as a function of multiple criteria f(c1, c2, …). Depending on the value of this function, it can be determined whether the item can be used as a recommendation.
Evaluation Criterion: The criterion that will be used in order to evaluate the solution/recommendations provided to Ut. One of the most difficult and crucial tasks towards the solution of the recommendation problem is to define an evaluation criterion that can increase the quality of recommendations without increasing the time
1 Optimal means the set of items that will best match the user interests.
consumed by the algorithm execution. In general, the evaluation criterion is used for evaluating the solutions produced by an algorithm. A metaheuristic can produce a vast number of possible solutions; the evaluation criterion is used by the metaheuristic to evaluate these solutions and choose the optimal one [15].
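The sets of the formalization above map naturally onto a few container types. A minimal Java sketch (class and field names are ours, for illustration, not the authors’ code):

```java
import java.util.*;

// Minimal data model for the recommendation-as-selection problem.
// U, A, C_u and the ratings follow the paper's formalization.
class RecProblem {
    Set<Integer> users = new HashSet<>();   // U: user ids
    Set<Integer> items = new HashSet<>();   // A: item ids
    // ratings.get(u).get(c) -> rating r_{u,c}; the key set of ratings.get(u) is C_u
    Map<Integer, Map<Integer, Double>> ratings = new HashMap<>();

    void addRating(int user, int item, double rating) {
        users.add(user);
        items.add(item);
        ratings.computeIfAbsent(user, k -> new HashMap<>()).put(item, rating);
    }

    // C_t: the items that user t has already chosen/rated
    Set<Integer> chosenBy(int user) {
        return ratings.getOrDefault(user, Map.of()).keySet();
    }
}
```

The recommendation set Pt is then simply another `Set<Integer>` built by an algorithm that applies a choice criterion per item and an evaluation criterion per candidate solution.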
3.2 Algorithms
Collaborative-Filtering as a Greedy-Construction Algorithm (CGA). The first algorithm implemented is a Greedy Construction algorithm based on Collaborative Filtering. The algorithm has the following characteristics:
Criterion of Choice: The criterion upon which an item c from A will be selected and used as a recommendation in Pt for the user Ut is the “collaborative rating”:

\hat{r}_{t,c} = \bar{r}_t + \frac{\sum_{U_i \in U_{similars}} correl(U_t, U_i)\,(r_{i,c} - \bar{r}_i)}{\sum_{U_i \in U_{similars}} |correl(U_t, U_i)|}   (1)

Where:
\hat{r}_{t,c}: the predicted rating of item c for user Ut.
\bar{r}_t: the average rating score of Ut.
\bar{r}_i: the average rating of a user Ui.
correl(U_t, U_i): the correlation between users Ut and Ui, based on their known ratings.
r_{i,c}: the rating of item c given by user Ui.
The algorithm is executed for a target user Ut from U as follows:
1. Pick the target user Ut from U.
2. Calculate all correlations between Ut and the other users of U.
3. Create a neighborhood Usimilars of the users whose correlation with Ut is higher than 0.5.
4. Based on those users, calculate the collaborative score for each item they have rated and Ut has not rated.
5. Create the set Arecs, consisting of the above (step 4) ratings.
6. For each item in Arecs: if it has a score greater than 4, put the item in the recommendation set Pt.
7. After putting all the appropriate items in Pt, recommend the items to the user.
8. Evaluate the solution: compare the average of the correlations the user initially had with all the users against the average he has now.
9. End.
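The steps above can be sketched in Java (the paper’s implementation language). The thresholds 0.5 (step 3) and 4 (step 6) come from the algorithm description; the Pearson correlation, data layout and all names are our assumptions:

```java
import java.util.*;

class CgaSketch {
    // Pearson correlation over the items both users have rated (step 2).
    static double pearson(Map<Integer, Double> a, Map<Integer, Double> b) {
        List<Integer> common = new ArrayList<>(a.keySet());
        common.retainAll(b.keySet());
        int n = common.size();
        if (n == 0) return 0.0;
        double ma = 0, mb = 0;
        for (int c : common) { ma += a.get(c); mb += b.get(c); }
        ma /= n; mb /= n;
        double num = 0, da = 0, db = 0;
        for (int c : common) {
            double xa = a.get(c) - ma, xb = b.get(c) - mb;
            num += xa * xb; da += xa * xa; db += xb * xb;
        }
        return (da == 0 || db == 0) ? 0.0 : num / Math.sqrt(da * db);
    }

    static double avg(Map<Integer, Double> r) {
        return r.values().stream().mapToDouble(Double::doubleValue).average().orElse(0);
    }

    // Steps 1-7: build the recommendation set Pt for target user t.
    static Set<Integer> recommend(Map<Integer, Map<Integer, Double>> ratings, int t) {
        Map<Integer, Double> rt = ratings.get(t);
        // Step 3: neighborhood of users with correlation > 0.5
        Map<Integer, Double> similars = new HashMap<>();
        for (var e : ratings.entrySet())
            if (e.getKey() != t) {
                double c = pearson(rt, e.getValue());
                if (c > 0.5) similars.put(e.getKey(), c);
            }
        // Steps 4-6: collaborative score (Eq. 1) for items Ut has not rated; keep score > 4
        Set<Integer> candidates = new HashSet<>();
        for (int u : similars.keySet()) candidates.addAll(ratings.get(u).keySet());
        candidates.removeAll(rt.keySet());
        Set<Integer> pt = new HashSet<>();
        for (int c : candidates) {
            double num = 0, den = 0;
            for (var e : similars.entrySet()) {
                Map<Integer, Double> ru = ratings.get(e.getKey());
                if (ru.containsKey(c)) {
                    num += e.getValue() * (ru.get(c) - avg(ru));
                    den += Math.abs(e.getValue());
                }
            }
            if (den > 0 && avg(rt) + num / den > 4) pt.add(c);
        }
        return pt;
    }
}
```

The greedy character is visible in steps 4-6: each element of the solution is chosen item by item, purely on the value of the choice criterion, with no backtracking.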
Collaborative Metaheuristic Algorithm (CMA). This is the second algorithm that was developed and tested. CF can be used both in a metaheuristic and in a constructive algorithm. Constructive (heuristic) algorithms build a solution from scratch, element by element; to use CF constructively, we created a solution from scratch, adding items based on their collaborative score. A metaheuristic algorithm, by contrast, starts from already existing solutions and modifies them in order to find better ones; the CMA implements the collaborative model in this way. The basic elements of the CMA are the following:
Criterion of Choice: The criterion upon which an item c is selected from A for inclusion in or exclusion from Pt is the following:
Inclusion criterion: import c into Pt if

random(0,1) > y   (2)

Exclusion criterion: remove c from Pt if

random(0,1) > v   (3)

where the current average correlation used by the evaluation is

avgcorrel_{current}(U_t) = \frac{\sum_{U_i \in U_{similars}} correl_{new}(U_t, U_i)}{|U_{similars}|}   (4)

Where:
d_t: the absolute difference between the initial average correlation the user had and the current average correlation he has now.
avgcorrel_{init}: the initial average correlation of Ut with the other users.
avgcorrel_{current}: the current average correlation of Ut with the other users.
correl_{new}(U_t, U_i): the new correlation between Ut and Ui, after the algorithm has removed and imported items.
correl_{init}(U_t, U_i): the initial correlation Ut had with the other users, before the start of the algorithm.
r_{i,c}: the rating of item c given by user Ui.
random(0,1): a randomly generated number between 0 and 1. This variable makes the CMA a probabilistic algorithm and is used to express the vagueness and randomness of human behavior.
y, v: the intensification/diversification factors. In this article we call them scope values, because they determine the scope of the solution area that the algorithm is going to check. The higher the values of y and v, the less often the algorithm will allow a change in Pt, be it an import or a removal.
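Under this reading of the scope values, deciding whether to change Pt reduces to comparing a uniform random draw against y (for imports) or v (for removals). A minimal sketch, with illustrative names:

```java
import java.util.Random;

// Probabilistic change decisions of the CMA: a uniform draw in [0,1)
// must exceed the scope value y (import) or v (removal) for the
// change to be allowed. Higher y/v => fewer changes.
class ScopeGate {
    final double y;   // import scope value
    final double v;   // removal scope value
    final Random rng;

    ScopeGate(double y, double v, long seed) {
        this.y = y; this.v = v; this.rng = new Random(seed);
    }

    boolean allowImport()  { return rng.nextDouble() > y; }
    boolean allowRemoval() { return rng.nextDouble() > v; }
}
```

With the tuning used later in the paper (y = 0.15, v = 0.9), imports pass roughly 85% of the time and removals roughly 10% of the time, matching the described behavior of adding easily and deleting rarely.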
Evaluation Criterion: The difference between the initial value and the present value of the average correlation of the same user. The purpose of this criterion is to describe the user based on the average correlation he has with the users that are similar to him (correl(U_t, U_i) > 0.5). Our aim is to reconstruct the target user’s ratings with new items and see whether he still remains similar to the users that used to be highly correlated with him. Every time the algorithm produces a solution, we evaluate it based on the following:

min d_t = min \left| avgcorrel_{init}(U_t) - avgcorrel_{current}(U_t) \right|   (5)
The algorithm is executed in the following way:
1. Calculate the initial average correlation avgcorrel_{init} for Ut.
2. Create U’t.²
3. Calculate³ a current average correlation value avgcorrel_{current}, either randomly or based on U’t.
4. Calculate dt and set dbest = dt.
5. Set Pbest = Pt.
6. For counter = 0; counter < x⁴; counter++
   a. Pick a user Ui from Usimilars. For each item c that Ui has already rated:
      i. Check whether c belongs to Ct or Pt.
      ii. If it does, check whether the exclusion criterion is fulfilled. If it is, remove the item from Pt; else do nothing.
      iii. If it does not, check whether the inclusion criterion is fulfilled. If it is, import the item into Pt; else do nothing.
   b. After all removals and imports are done for Pt, calculate the new avgcorrel_{current}.
   c. Calculate d’t.
   d. Compare d’t with dbest: if d’t < dbest, set dbest = d’t and Pbest = Pt.
   e. Set dt = d’t.
7. Propose Pbest as the new recommendation.
8. End.
The above metaheuristic algorithm takes an imported set of chosen items or a set of recommendations and sets it as Pt. Each time the metaheuristic executes a loop, it changes the contents of Pt and compares the changes that occur in the user’s average correlation value, avg(correl(Ut, Ui)). The random variable and the constants y and v are the tuning factors that decide whether the algorithm intensifies or diversifies the search. If y and v are low (around 0.4), the algorithm has a bigger scope in the solution area: it is less likely to get stuck in a local minimum, but it is also harder for it to find one.
The number of times the algorithm executes is another tuning factor for intensification. The more iterations the algorithm executes before proposing a solution, the more accurate the solution is; as the number of iterations increases, however, the program spends even more time finding the solution. The CMA is an algorithm that can use any set of choices as Pt, meaning that we can even feed a random solution into it and still expect it to produce better ones.

² U’t is the new user. This user can be “created” in various ways: it can be the input of another algorithm, or it can be created by randomly removing/importing solution elements from Pt.
³ This value (the average correlation) does not always need a U’t to be calculated; it can be set to a random value between 0 and 1. We will explain the significance of this later on.
⁴ x can be any positive number. It indicates the number of times we want the metaheuristic to execute; each time, the metaheuristic checks a new solution.
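The iteration scheme of steps 1-8 can be sketched as follows. The evaluation follows Eq. (5); the inclusion/exclusion tests follow our reading of Eqs. (2)-(3); the recomputation of the average correlation (rating imported items at Ut’s mean) and all names are assumptions, not the authors’ code:

```java
import java.util.*;

class CmaSketch {
    // One CMA run: start from an initial Pt, perturb it for x iterations,
    // and keep the Pt whose average correlation stays closest to the initial one.
    static Set<Integer> run(Map<Integer, Map<Integer, Double>> ratings,
                            Map<Integer, Double> similars,   // Ui -> correl(Ut, Ui)
                            int t, Set<Integer> pt,
                            double y, double v, int x, long seed) {
        Random rng = new Random(seed);
        double initAvg = avgCorrel(ratings, similars, t, Set.of());          // step 1
        Set<Integer> best = new HashSet<>(pt);                               // step 5
        double dBest = Math.abs(initAvg - avgCorrel(ratings, similars, t, pt)); // step 4
        List<Integer> neighbours = new ArrayList<>(similars.keySet());
        for (int counter = 0; counter < x; counter++) {                      // step 6
            int ui = neighbours.get(rng.nextInt(neighbours.size()));         // 6a
            for (int c : ratings.get(ui).keySet()) {
                boolean present = ratings.get(t).containsKey(c) || pt.contains(c);
                if (present && rng.nextDouble() > v) pt.remove(c);           // exclusion
                else if (!present && rng.nextDouble() > y) pt.add(c);        // inclusion
            }
            double d = Math.abs(initAvg - avgCorrel(ratings, similars, t, pt)); // 6b-c
            if (d < dBest) { dBest = d; best = new HashSet<>(pt); }          // 6d
        }
        return best;                                                         // step 7
    }

    // Average correlation of Ut with its neighbours, recomputed as if the
    // items in extra were part of Ut's ratings (rated at Ut's average).
    static double avgCorrel(Map<Integer, Map<Integer, Double>> ratings,
                            Map<Integer, Double> similars, int t, Set<Integer> extra) {
        Map<Integer, Double> rt = new HashMap<>(ratings.get(t));
        double mean = rt.values().stream().mapToDouble(Double::doubleValue).average().orElse(0);
        for (int c : extra) rt.putIfAbsent(c, mean);
        double sum = 0;
        for (int ui : similars.keySet()) sum += pearson(rt, ratings.get(ui));
        return similars.isEmpty() ? 0 : sum / similars.size();
    }

    // Pearson correlation over commonly rated items.
    static double pearson(Map<Integer, Double> a, Map<Integer, Double> b) {
        List<Integer> ks = new ArrayList<>(a.keySet());
        ks.retainAll(b.keySet());
        int n = ks.size();
        if (n == 0) return 0;
        double ma = 0, mb = 0;
        for (int k : ks) { ma += a.get(k); mb += b.get(k); }
        ma /= n; mb /= n;
        double num = 0, da = 0, db = 0;
        for (int k : ks) {
            double xa = a.get(k) - ma, xb = b.get(k) - mb;
            num += xa * xb; da += xa * xa; db += xb * xb;
        }
        return (da == 0 || db == 0) ? 0 : num / Math.sqrt(da * db);
    }
}
```

Note how the best-so-far pair (dbest, Pbest) decouples the accepted working solution from the proposed one, which is what lets the search escape a poor Pt without losing the best one found.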
4 Empirical Evaluation
4.1 Dataset
The empirical evaluation of the above algorithms is based upon the well-known dataset from epinions.com. This dataset consists of:
• A set U of 49,290 users.
• A set A of 139,738 items in total.
• A set R of 664,824 ratings in total.
• 487,181 trust statements.
This dataset has also been used to describe ways of setting up a recommender system for new users based on trust [11]. The dataset consists of two files. The first file provides the ratings data to be used by the collaborative filtering. The data is represented as:
{User_id Item_id Rating}
• User_id is a positive integer giving the id of the user who issued the rating.
• Item_id is also a positive integer, giving the id of the rated item.
• Rating is a positive integer ranging from 1 to 5, indicating how much the user liked the item: 5 means the user liked the item very much, 1 means the user did not like it.
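Reading the space-delimited ratings records into memory is straightforward; a sketch (the line-filtering rules are our choice):

```java
import java.util.*;

class RatingsLoader {
    // Parses lines of the form "User_id Item_id Rating" into a nested map:
    // user -> (item -> rating). Malformed or out-of-range lines are skipped.
    static Map<Integer, Map<Integer, Integer>> load(List<String> lines) {
        Map<Integer, Map<Integer, Integer>> ratings = new HashMap<>();
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length != 3) continue;
            try {
                int user = Integer.parseInt(parts[0]);
                int item = Integer.parseInt(parts[1]);
                int rating = Integer.parseInt(parts[2]);
                if (rating < 1 || rating > 5) continue;  // ratings range over 1..5
                ratings.computeIfAbsent(user, k -> new HashMap<>()).put(item, rating);
            } catch (NumberFormatException e) {
                // skip lines with non-numeric fields
            }
        }
        return ratings;
    }
}
```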
4.2 Experimental Environment
In order to test the algorithms’ performance, we set up the experimental environment as follows:
─ The Java programming language was used for constructing and executing the algorithms.
─ The datasets as well as the output data were stored in space-delimited text files.
─ Statistical processing was done partly by the Java applications and partly with Microsoft Office Excel®.
─ The hardware used for executing the above experiments consisted of two computers with the following specs: a desktop with an Intel® i7 960 quad-core processor at 3.2 GHz, 6 GB of RAM and Windows 8 Professional; and a laptop with an Intel® i7 2670QM quad-core processor at 2.2 GHz and 4 GB of RAM.
4.3 Algorithm Implementation and Tuning
The Collaborative-filtering Greedy-construction Algorithm (CGA) was implemented as described above. For the CMA implementation, on the other hand, some special tuning was performed. In order to decide the values of v and y, as well as the number of algorithm iterations, we tested the algorithm on some simulated datasets and some smaller samples of the Epinions dataset. After trying several scope values for this dataset, we determined that the following input values should be used for the CMA5:
• y = 0.15, which means that the algorithm will add new items to the recommendation set easily.
• v = 0.9, which means that the algorithm will avoid deleting items often.
• Number of iterations = 50, which means that the algorithm will try 50 times to make the solution more accurate by changing the set of recommendations.
4.4 Results
We executed both algorithms for each user of the dataset, considered as the target user. Each time an algorithm was fully executed, it provided a set of recommendations for the target user. We evaluated the precision and recall of this set using a test set consisting of 30% of the user’s ratings, which we removed and treated as unknown to the algorithm. The results of the evaluation are summarized in the table below:

                    CGA          CMA
Average Recall      0.004872652  0.00808811
Average Precision   0.441938921  0.572996974
Average F           0.008215866  0.012658391
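The metrics in the table can be computed per user from the recommended set and the held-out test set; a minimal sketch:

```java
import java.util.*;

class Metrics {
    // Precision: fraction of recommended items that appear in the test set.
    static double precision(Set<Integer> recs, Set<Integer> test) {
        if (recs.isEmpty()) return 0;
        long hits = recs.stream().filter(test::contains).count();
        return (double) hits / recs.size();
    }

    // Recall: fraction of test-set items that were recommended.
    static double recall(Set<Integer> recs, Set<Integer> test) {
        if (test.isEmpty()) return 0;
        long hits = recs.stream().filter(test::contains).count();
        return (double) hits / test.size();
    }

    // F-measure: harmonic mean of precision and recall.
    static double f1(Set<Integer> recs, Set<Integer> test) {
        double p = precision(recs, test), r = recall(recs, test);
        return (p + r == 0) ? 0 : 2 * p * r / (p + r);
    }
}
```

The per-user values are then averaged over all target users to obtain the table entries.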
Apart from the average values, we checked how both algorithms performed throughout the dataset. The following charts (Fig. 1, Fig. 2) demonstrate that the CMA performed better throughout the whole dataset, for all the evaluation metrics. Each chart represents the average score the algorithm achieved in the corresponding metric. The average total choices show how many items the user had rated when he was used as the target user. It must be noted that the collaborative approach cannot respond well to a cold-start situation [13; 14], where the target user has provided no ratings. For cold-start conditions, a content-based metaheuristic algorithm is a more efficient solution.
5 In a recommender system applied in real time, those values would be determined on a training set and then applied in production. These values could also be changed by the system during real-time execution.
Fig. 1. Algorithm Comparison in terms of Recall
Fig. 2. Algorithm Comparison in terms of Precision

5 Conclusions and Future Work
Recommendation algorithms typically aim at predicting a set of items that the target user is most likely to be interested in. Even considering a highly accurate algorithm, there is no guarantee that the set of recommended items (produced by that algorithm) represents the optimum solution for the given recommendation problem. On the other hand, OR methods aim at optimizing the solutions to selection problems. Thus, the main idea underlying this paper is to utilize OR methods to find the optimum combination of items among the various “good” recommendation sets that can be produced.
We managed to implement the Collaborative filtering algorithm as a greedy constructive algorithm, thus demonstrating that Recommender systems can be treated as
an OR problem. Furthermore, we created a metaheuristic algorithm for providing recommendations. Both algorithms were tested on a large, real user dataset. They both gave efficient and precise recommendations, with the metaheuristic being more efficient overall.
After analyzing the dataset and the recommendations that the algorithms provided, we realized that there is definitely a correlation between some of the user characteristics and the performance of each algorithm. The metaheuristic proved more promising in dealing with datasets and users that provide sparse information, compared to Collaborative Filtering.
Other OR algorithms and methods, using fuzzy sets and neural networks, were also considered and seem very promising. Combining this with the conclusions of this article, we can say that recommendation problems can definitely be treated as Operations Research problems.
The conclusions of this article can support research into new ways of approaching RS. With the use of OR, we can create new algorithms and systems that provide more accurate and efficient recommendations, such as:
• Design of constructive and metaheuristic algorithms for RS using trust, diversity, novelty and content-based metrics as recommendation criteria.
• Design of more complex metaheuristic algorithms, such as neural networks, swarm intelligence and genetic algorithms, for handling RS.
• Use of evaluation metrics other than recall and precision for the evaluation of the algorithms.
References
1. Hosein, J., Hiang, S.A.T., Robab, S.: A Naive Recommendation Model for Large Databases. International Journal of Information and Education Technology, 216–219 (2012)
2. Ricci, F., Rokach, L., Shapira, B.: Introduction to Recommender Systems Handbook. In: Recommender Systems Handbook. Springer (2011)
3. Brusilovsky, P., Kobsa, A., Nejdl, W. (eds.): The Adaptive Web. Springer (2007)
4. Sharma, S.C.: Introductory Operation Research. Discovery Publishing House (2006)
5. Schrijver, A.: Combinatorial Optimization. Springer (2003)
6. Gambetta, D.: Can We Trust Trust? In: Trust: Making and Breaking Cooperative Relations (2000)
7. Qiu, Q., Hinze, A.: Trust-Based Recommendations for Mobile Tourists in TIP. Hamilton (2008)
8. Wang, H.-F., Wu, C.-T.: A strategy-oriented operation module for recommender systems in e-commerce. Computers and Operations Research (2010)
9. Cremonesi, P., Garzotto, F., Turrin, R.: Investigating the Persuasion Potential of Recommender Systems from a Quality Perspective: An Empirical Study. ACM Transactions on Interactive Intelligent Systems 2 (2012)
10. Vargas, S., Castells, P.: Rank and Relevance in Novelty and Diversity Metrics for Recommender Systems
11. Massa, P., Avesani, P.: Trust-aware bootstrapping of recommender systems. In: Proceedings of the ECAI 2006 Workshop on Recommender Systems, pp. 29–32 (2006)
12. Chen, L.-S., Hsu, F.-H., Chen, M.-C., Hsu, Y.-C.: Developing recommender systems with the consideration of product profitability for sellers (2008)
13. Schein, A.I., Popescul, A., Ungar, L.H., Pennock, D.M.: Methods and Metrics for Cold-Start Recommendations (2002)
14. Lashkari, Y., Metral, M., Maes, P.: Collaborative Interface Agents (1994)
15. Gonzalez, T.F.: Handbook of Approximation Algorithms and Metaheuristics (2007)