S. Yamamoto (Ed.): HIMI 2014, Part II, LNCS 8522, pp. 579–589, 2014.

© Springer International Publishing Switzerland 2014

Operations Research and Recommender Systems

Thomas Asikis and George Lekakos

Department of Management Science and Technology,

Athens University of Economics and Business, Athens, Greece

asikis.thomas@gmail.com, glekakos@aueb.gr

Abstract. Recommender Systems (RS) are widely and successfully used in online applications. A successful Recommender System can help increase the revenue of a website as well as help it maintain and grow its user base. Until now, research on recommendation algorithms has been mainly based on machine learning and AI techniques. In this article we aim to develop recommendation algorithms utilizing Operations Research (OR) methods that provide the ability to move towards an optimized set of items to be recommended. We focus on expressing the Collaborative Filtering Algorithm (CF or CFA) as a Greedy Construction Algorithm, as well as implementing and testing a Collaborative Metaheuristic Algorithm (CMA) for providing recommendations. The empirical findings suggest that the recommendation problem can indeed be defined as an optimization problem, which provides new opportunities for the application of powerful and effective OR algorithms to recommendation problems.

Keywords: Recommender Systems, Personalization algorithms, Operational Research, Metaheuristic.

1 Introduction

The number of possible choices on sites that offer products or services increases rapidly day by day. A typical user may have to make decisions such as: “With whom should I connect in a social network?”, “What song should I hear on Soundcloud?”, “Which product should I buy on Amazon?”. Websites offer a vast number of possible answers to such questions, so users need support in their decision-making process in order to make the best possible selections [1].

Recommender Systems (RS) are a type of information filtering system that aims at predicting a user’s items of interest in a large space of possible options, based on the user’s previous preferences [2]. Typical recommendation approaches include collaborative filtering, content-based filtering and hybrid methods. Collaborative Filtering RS rely mostly on the behavioral similarity between users. Content-Based Filtering RS focus on the similarities between features of items that a user has favored in the past. In addition, several hybrid methods utilizing collaborative, content-based, demographic, and knowledge-based criteria have been developed in recent years [3]. In this article, the focus is on the collaborative approach due to its popularity, simplicity and intuitiveness.

Operations Research (OR) provides methods and techniques that support the decision-making process by evaluating every possible alternative and estimating the potential outcome [4]. The main idea underlying the work presented in this paper is to exploit OR methods to move towards an optimized set of items to be recommended to the user.

Combinatorial Optimization is a method used in OR for the identification of an optimal object, or set of objects, in a finite collection of objects [5]. It provides a suitable framework for the definition of a recommendation problem as an OR “Selection” problem. Recommender Systems aim at creating a selection of items to which a user is most likely to respond positively (and eventually purchase or use the proposed items). In OR Selection problems, the aim is to select each item for the recommendation based on a choice criterion and to evaluate the solution based on a specific evaluation criterion.

Following this line of thinking, the objective of this paper is twofold: (a) to demonstrate that the recommendation problem can be defined and treated as an OR problem and (b) to design a well-performing recommendation algorithm based on OR techniques.

In the following, we first present related research and then define the recommendation problem as an OR problem. In the next section we describe the implementation of a Collaborative-filtering Greedy-construction Algorithm (CGA) and a Collaborative Metaheuristic Algorithm (CMA) to determine whether the combinatorial approach is suitable for the Social Recommender System Problem (SRSP). Finally, an empirical evaluation of the above algorithms is presented and the results are discussed in the final section of the paper.

2 Related Work

In this section we present some relevant RS techniques and discuss whether they can be used to enhance the OR implementation in RS. In general, many criteria can be used for optimizing a recommendation. This gives us the flexibility to construct more reliable and accurate OR optimization methods. Trust is a concept that in general reflects the probability of someone performing an action, based on the actions performed by another person [6]. People are naturally grouped by trust. People who trust one another can usually influence each other’s behavior, and a high level of trust usually benefits all the parties in a transaction by reducing the transaction cost between the seller and the buyer of an item. Moreover, trust can be aggregated and propagated through the members of a system. These features have made trust one of the key components of some RS [7].

The advantages of trust-based recommender systems lie in a number of aspects: invulnerability to malicious attacks, greater control of the recommendation process, and the ability to provide users with an explanation for each item recommended to them. However, an important problem of trust-based algorithms is data sparseness. Trust-based algorithms tend to face difficulty with sparse datasets [7], which limits their ability to produce good recommendations.

On the other hand, features such as the item’s price (mostly used in e-commerce recommenders where online purchases are enabled) may be used as a selection criterion. Such criteria are particularly useful for optimization problems that aim at maximizing the product supplier’s profits. The optimization problem can therefore be rephrased as follows: how can we provide recommendations that match the user’s interests while ensuring the maximum profit for the provider? [8;12].

Novelty and diversity of recommendations can also serve as selection criteria, as they may lead to quite effective recommendations [9]. Novel and diverse recommendations refer to items beyond the typical spectrum of items previously seen or consumed by the user; they are therefore perceived as an unexpected option that the user may never have considered in the past [10].

3 Defining Recommendation as an OR Problem

3.1 Problem Definition

An RS can be defined by the following:

U = {u1, u2, …, uN}: a set of users.

A = {a1, a2, …, an}: a set of items.

Ci = {c1, c2, …, cm}: the set of items that user i has already chosen.

Pt = {p1, p2, …, pk}: the set of items that are selected to be recommended to Ut (the recommendations).

From the above formalization we have:

Ut: the user who will receive the recommendations (the target user).

Ct = {t1, t2, …, tm}: the set of items that Ut has already used.

Ri = {rck, …, rcj}: the set of ratings that user i has given, one for each rated item cj.

In combinatorial terms, the above problem can be expressed as follows:

“Which is the optimal¹ set of items that we can recommend to a user?”

In order to operationalize the above problem in Operations Research terms, the following elements are defined:

Form of Problem Solution: The solution to the above problem is a selection of items from set A that represents the set of items to be recommended (Pt).

Element of Solution: The element of solution is the single item pj that will be recommended to the user.

Criterion of Choice: The criterion for selecting an item from A and using it as a recommendation in Pt. It can be expressed as a function of multiple criteria, f(k1, k2, …, kn). Depending on the value of this function, it can be determined whether the item can be used as a recommendation.

Evaluation Criterion: The criterion that will be used to evaluate the solution (the recommendations) provided to Ut. One of the most difficult and crucial tasks in solving the recommendation problem is to define an evaluation criterion that can increase the quality of recommendations without increasing the execution time of the algorithm. In general, the evaluation criterion is used for evaluating the solutions produced by an algorithm. A metaheuristic can produce a vast number of possible solutions; it uses the evaluation criterion to evaluate them and choose the optimal one [15].

¹ Optimal means the set of items that will best match the user’s interests.

3.2 Algorithms

Collaborative-Filtering as a Greedy-Construction Algorithm (CGA). The first algorithm implemented is a Greedy Construction algorithm based on Collaborative Filtering. The algorithm has the following characteristics:

Criterion of Choice: The criterion upon which an item c from A will be selected and used as a recommendation in Pt for the user Ut is the “collaborative rating”:

$$\hat{r}_{t,c} = \bar{r}_t + \frac{\sum_{i} \mathrm{correl}(U_t, U_i)\,(r_{i,c} - \bar{r}_i)}{\sum_{i} \mathrm{correl}(U_t, U_i)} \qquad (1)$$

Where:

$\hat{r}_{t,c}$: the predicted rating of Ut for item c.

$\bar{r}_t$: the average rating score of Ut.

$\bar{r}_i$: the average rating of a user Ui.

$\mathrm{correl}(U_t, U_i)$: the correlation between users Ut and Ui, based on their known ratings.

$r_{i,c}$: the rating of item c from user Ui.
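As a minimal sketch of how the collaborative rating of equation (1) can be evaluated, the snippet below computes it for one item. The function name `predict_rating`, the data layout (dictionaries of item to rating) and the sample ratings are illustrative assumptions, as is the choice to sum only over neighbours who have rated the item and to fall back to the target user's average when none has.

```python
from statistics import mean


def predict_rating(target_ratings, neighbours, item):
    """Collaborative rating of `item` for the target user, following equation (1).

    target_ratings: {item_id: rating} of the target user Ut.
    neighbours: list of (correlation_with_target, {item_id: rating}) pairs.
    """
    target_mean = mean(target_ratings.values())
    weighted_deviation = 0.0
    correlation_total = 0.0
    for correlation, ratings in neighbours:
        if item not in ratings:
            continue  # a neighbour who has not rated the item contributes nothing
        weighted_deviation += correlation * (ratings[item] - mean(ratings.values()))
        correlation_total += correlation
    if correlation_total == 0:
        return target_mean  # assumption: fall back to the target user's own average
    return target_mean + weighted_deviation / correlation_total


# Hypothetical data: the target user averages 3; two neighbours have rated item "c".
target = {"a": 4, "b": 2}
neighbours = [
    (0.75, {"a": 5, "c": 3}),  # neighbour average 4, so "c" is 1 below average
    (0.50, {"b": 1, "c": 5}),  # neighbour average 3, so "c" is 2 above average
]
predicted = predict_rating(target, neighbours, "c")
# 3 + (0.75 * -1 + 0.50 * 2) / (0.75 + 0.50) = 3.2
```

Each neighbour's rating is expressed as a deviation from that neighbour's own average, so a generous and a strict rater contribute on the same scale.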

The algorithm is executed for a target user Ut from U, as follows:

1. Pick the target user Ut from U.
2. Calculate all correlations between Ut and the other users of U.
3. Create the neighborhood Usimilars: the set of users whose correlation with Ut is higher than 0.5.
4. Based on those users, calculate the collaborative score for each item that they have rated and Ut has not.
5. Create the set Arecs, consisting of the items scored in step 4.
6. For each item in Arecs: if its score is greater than 4, put the item in the recommendation set Pt.
7. After putting all the appropriate items in Pt, recommend the items to the user.
8. Evaluate the solution: compare the average correlation that the user initially had with all the users with the one he has now.
9. End.
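The steps above can be sketched in a few lines. This is an illustrative reading, not the authors' Java implementation: the names `pearson` and `cga_recommend`, the use of Pearson correlation over co-rated items as the correlation measure, and the toy ratings are assumptions. The thresholds 0.5 and 4 come from steps 3 and 6; step 8 (evaluation) is left out.

```python
from math import sqrt
from statistics import mean


def pearson(a, b):
    """Pearson correlation over the items both users rated (0.0 when undefined)."""
    common = sorted(set(a) & set(b))
    if len(common) < 2:
        return 0.0
    mean_a = mean(a[i] for i in common)
    mean_b = mean(b[i] for i in common)
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    var_a = sum((a[i] - mean_a) ** 2 for i in common)
    var_b = sum((b[i] - mean_b) ** 2 for i in common)
    return cov / sqrt(var_a * var_b) if var_a and var_b else 0.0


def cga_recommend(ratings, target, min_correlation=0.5, min_score=4.0):
    """Greedy construction: keep every unseen item whose score exceeds min_score."""
    target_ratings = ratings[target]
    target_mean = mean(target_ratings.values())
    # Steps 2-3: correlations with all other users, then the neighbourhood Usimilars.
    similars = {}
    for user, user_ratings in ratings.items():
        if user == target:
            continue
        correlation = pearson(target_ratings, user_ratings)
        if correlation > min_correlation:
            similars[user] = correlation
    # Steps 4-5: score each item rated by a neighbour but not by the target user.
    candidates = {i for u in similars for i in ratings[u]} - set(target_ratings)
    scores = {}
    for item in candidates:
        numerator = denominator = 0.0
        for user, correlation in similars.items():
            if item in ratings[user]:
                user_mean = mean(ratings[user].values())
                numerator += correlation * (ratings[user][item] - user_mean)
                denominator += correlation
        scores[item] = target_mean + numerator / denominator
    # Step 6: the recommendation set Pt.
    return {item: score for item, score in scores.items() if score > min_score}


# Hypothetical ratings: u1 agrees with the target user t, u2 disagrees.
ratings = {
    "t": {"a": 5, "b": 1, "c": 4},
    "u1": {"a": 5, "b": 2, "c": 4, "d": 5, "e": 1},
    "u2": {"a": 1, "b": 5, "c": 2, "d": 1},
}
recommendations = cga_recommend(ratings, "t")
```

In this toy example only u1 passes the 0.5 correlation threshold, so item "d" is recommended and item "e" is not.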

Collaborative Metaheuristic Algorithm (CMA). This is the second algorithm that was developed and tested. CF can be used both as a metaheuristic and as a constructive algorithm. Constructive (or heuristic) algorithms construct a solution from zero, building it element by element. To use a constructive CF, we created a solution from zero, adding items based on their collaborative score. A metaheuristic algorithm, in contrast, uses already existing solutions to find better ones, and this is how we implement the collaborative model in the CMA. The basic elements of the CMA are the following:

Criterion of Choice: The criterion upon which an item c is selected from A for inclusion in, or exclusion from, Pt is the following:

Inclusion criterion: [equation (2); formula not recoverable from the source text]

Exclusion criterion: [equation (3); formula not recoverable from the source text]

Where: [equation (4); formula not recoverable from the source text]

$d_t$: the absolute difference between the initial average correlation the user had and the current average correlation he has now.

$\mathrm{Avg}(\mathrm{correl}_{initial}(U_t, U_i))$: the initial average correlation of Ut with other users.

$\mathrm{Avg}(\mathrm{correl}_{new}(U_t, U_i))$: the current average correlation of Ut with other users.

$\mathrm{correl}_{new}(U_t, U_i)$: the new correlation between Ut and Ui, after the algorithm has removed and imported new items.

$\mathrm{correl}_{initial}(U_t, U_i)$: the initial correlation user Ut had with other users, before the start of the algorithm.

$r_{i,c}$: the rating of item c from user Ui.

random: a randomly generated number between 0 and 1. This variable makes the CMA a probabilistic algorithm and is used to express the vagueness and randomness in human behavior.

y, v: the intensification/diversification factors. In this article we call them scope values, because they determine the scope of the solution area that the algorithm is going to check. The higher the values of y and v, the less often the algorithm allows a change to Pt, be it an import or a removal.

Evaluation Criterion: The difference between the initial value and the present value of the average correlation of the same user. The purpose of this criterion is to describe the user based on the average correlation he has with the users that are similar to him (correl(Ut,Ui) > 0.5). Our aim is to reconstruct the target user’s ratings with new items and see whether he remains similar to the users that used to be highly correlated with him. Every time the algorithm produces a solution, we evaluate it based on the following:

$$\min d_t = \min \left| \mathrm{Avg}(\mathrm{correl}_{initial}(U_t, U_i)) - \mathrm{Avg}(\mathrm{correl}_{new}(U_t, U_i)) \right| \qquad (5)$$
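A small numeric illustration of this evaluation criterion, read as the absolute difference between the initial and the current average correlation. The function name `correlation_drift` and the correlation values are hypothetical.

```python
from statistics import mean


def correlation_drift(initial_correlations, current_correlations):
    """Absolute difference between the initial and the current average correlation."""
    return abs(mean(initial_correlations) - mean(current_correlations))


# Hypothetical correlations of the target user with two similar users.
initial = [0.9, 0.7]       # average 0.8 before any change to Pt
candidate_a = [0.8, 0.5]   # average 0.65 after one set of changes
candidate_b = [0.85, 0.75] # average 0.8 after another set of changes

drift_a = correlation_drift(initial, candidate_a)  # about 0.15
drift_b = correlation_drift(initial, candidate_b)  # about 0.0
```

Under this criterion candidate_b is the better solution, because the reconstructed user stays as similar to his neighbours as the original user was.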

The algorithm is executed in the following way:

1. Calculate the initial average correlation for Ut.
2. Create U’t².
3. Calculate³ a current average correlation value, either randomly or based on U’t.
4. Calculate dt. Set dbest = dt.
5. Set Pbest = Pt.
6. Repeat x times⁴ (for counter = 0; counter < x; counter++):
   a. Pick a user Ui from Usimilars.
      i. For each item c that Ui has already rated, check whether c belongs to Ct or Pt.
         1. If it does, check whether the removal criterion is fulfilled.
            a. If it is, remove the item from Pt.
            b. Else do nothing.
         2. If it does not, check whether the inclusion criterion is fulfilled.
            a. If it is, import the item into Pt.
            b. Else do nothing.
   b. After all removals and imports are done for Pt, calculate the new average correlation.
   c. Calculate d’t.
   d. Compare d’t with dbest.
      i. If d’t < dbest, set dbest = d’t and Pbest = Pt.
   e. Set dt = d’t.
7. Propose Pbest as the new recommendation.
8. End.
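The loop above can be sketched as follows. Several parts are assumptions made only so that the sketch runs, and they are not taken from the paper: the acceptance tests `random > y` (import) and `random > v` (removal) stand in for the inclusion and exclusion criteria, an imported item borrows the neighbour's rating, the reconstructed user U’t is described by the items of Pt alone, and the names `pearson`, `drift` and `cma` are hypothetical.

```python
import random
from math import sqrt
from statistics import mean


def pearson(a, b):
    """Pearson correlation over the items both users rated (0.0 when undefined)."""
    common = sorted(set(a) & set(b))
    if len(common) < 2:
        return 0.0
    mean_a = mean(a[i] for i in common)
    mean_b = mean(b[i] for i in common)
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    var_a = sum((a[i] - mean_a) ** 2 for i in common)
    var_b = sum((b[i] - mean_b) ** 2 for i in common)
    return cov / sqrt(var_a * var_b) if var_a and var_b else 0.0


def drift(recs, initial_avg, ratings, similars):
    """d_t: |initial average correlation - average correlation of the rebuilt user|."""
    current_avg = mean(pearson(recs, ratings[u]) for u in similars)
    return abs(initial_avg - current_avg)


def cma(ratings, target, similars, initial_recs, y=0.15, v=0.9, iterations=50, seed=0):
    rng = random.Random(seed)
    chosen = ratings[target]  # Ct
    initial_avg = mean(pearson(chosen, ratings[u]) for u in similars)  # step 1
    recs = dict(initial_recs)  # Pt as {item: rating borrowed from a neighbour}
    best_recs, best_d = dict(recs), drift(recs, initial_avg, ratings, similars)
    for _ in range(iterations):  # step 6
        user = rng.choice(similars)  # step 6a
        for item, rating in ratings[user].items():
            if item in recs:
                if rng.random() > v:  # assumed stand-in for the removal criterion
                    del recs[item]
            elif item not in chosen:
                if rng.random() > y:  # assumed stand-in for the inclusion criterion
                    recs[item] = rating
        d = drift(recs, initial_avg, ratings, similars)  # steps 6b-6c
        if d < best_d:  # step 6d
            best_d, best_recs = d, dict(recs)
    return best_recs, best_d  # step 7


# Hypothetical ratings: both u1 and u2 are highly correlated with the target user t.
ratings = {
    "t": {"a": 5, "b": 1, "c": 4},
    "u1": {"a": 5, "b": 2, "c": 4, "d": 5, "e": 1},
    "u2": {"a": 4, "b": 1, "c": 5, "d": 4, "f": 2},
}
similars = ["u1", "u2"]
initial_avg = mean(pearson(ratings["t"], ratings[u]) for u in similars)
start_d = drift({"d": 5}, initial_avg, ratings, similars)
best_recs, best_d = cma(ratings, "t", similars, {"d": 5})
```

Because the best solution is only replaced when d’t improves, the returned solution is never worse, under the evaluation criterion, than the imported starting set. Items already in Ct are skipped here rather than tested for removal, since they are not part of Pt.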

The above metaheuristic algorithm takes an imported set of chosen items or a set of recommendations, and sets it as Pt. Each time the metaheuristic executes a loop, it changes the contents of Pt and compares the resulting change in the user’s average correlation value, Avg(Correl(Ut,Ui)). The random variable and the constants y and v are the tuning factors that decide whether the algorithm is intensifying or diversifying the search. If y and v are low (around 0.4), the algorithm has a bigger scope in the solution area. This means that the algorithm is less likely to get stuck in a local minimum, but it is also harder for it to find one.

The number of iterations is another tuning factor for intensification. The more iterations the algorithm executes before proposing a solution, the more accurate the solution is, but the more time the program spends finding it. CMA can use any set of choices as Pt, meaning that we can even import a random solution into it and still expect it to produce better ones.

² U’t is the new user. This user can be “created” in various ways: it can be the input from another algorithm, or it can be created by randomly removing/importing solution elements from Pt.

³ This value (the average correlation) does not always need a U’t in order to be calculated. It can be set to a random value between 0 and 1. We will explain the significance of this later on.

⁴ x can be any positive number. It indicates the number of times we want the metaheuristic to execute. Each time, the metaheuristic checks a new solution.

4 Empirical Evaluation

4.1 Dataset

The empirical evaluation of the above algorithms is based on the well-known dataset from epinions.com. This dataset consists of:

• A set U of 49,290 users.
• A set A of 139,738 items.
• A total of 664,824 ratings, R = {r1, r2, …, r664824}.
• 487,181 trust statements.

This dataset has also been used to describe ways of setting up a recommender system for new users, based on trust [11]. The dataset consists of 2 files. The first file provides the ratings data, to be used by the collaborative filtering. The data is represented as:

{User_id Item_id Rating}

• User_id is a positive integer that identifies the user who gave the rating.
• Item_id is a positive integer that identifies the rated item.
• Rating is a positive integer ranging from 1 to 5. It expresses how much the user liked the item, in ascending order: 5 means the user liked the item very much, 1 means the user didn’t like the item.
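A minimal sketch of reading ratings in the {User_id Item_id Rating} layout described above into a per-user dictionary. The function name `parse_ratings` and the sample lines are illustrative assumptions; the whitespace separation follows the space-delimited text files mentioned in the experimental setup.

```python
from collections import defaultdict


def parse_ratings(lines):
    """Turn 'User_id Item_id Rating' lines into {user_id: {item_id: rating}}."""
    ratings = defaultdict(dict)
    for line_number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 3:
            raise ValueError(f"line {line_number}: expected 3 fields, got {len(fields)}")
        user_id, item_id, rating = (int(field) for field in fields)
        if not 1 <= rating <= 5:
            raise ValueError(f"line {line_number}: rating {rating} is outside 1..5")
        ratings[user_id][item_id] = rating
    return dict(ratings)


# Hypothetical sample in the same layout as the ratings file.
sample = ["1 100 5", "1 101 3", "2 100 1", ""]
parsed = parse_ratings(sample)
```

Since an open text file yields its lines when iterated, the same function can be passed a file handle directly.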

4.2 Experimental Environment

In order to test the algorithms’ performance, we set up the experimental environment as follows:

─ The Java programming language was used for constructing and executing the algorithms.
─ The datasets as well as the output data were stored in space-delimited text files.
─ Statistical processing was done partly by the Java applications and partly with Microsoft Office Excel®.
─ The experiments were executed on 2 computers: a desktop with an Intel® i7 960 quad-core processor at 3.2 GHz, 6 GB of RAM and Windows 8 Professional OS, and a laptop with an Intel® i7 2670QM quad-core processor at 2.2 GHz and 4 GB of RAM.

4.3 Algorithm Implementation and Tuning

The Collaborative-filtering Greedy-construction Algorithm (CGA) was implemented as described above. For the CMA implementation, on the other hand, some special tuning was performed. In order to decide the values of v and y, as well as the number of algorithm iterations, we tested the CMA on some simulated datasets and some smaller samples of the Epinions dataset. After trying several scope values for this dataset, we determined that the following values should be used as input to the CMA⁵:

• y = 0.15, which means that the algorithm is likely to add new items to the recommendation set easily.
• v = 0.9, which means that the algorithm will avoid deleting items often.
• Number of iterations = 50, which means that the algorithm will try 50 times to make the solution more accurate by changing the set of recommendations.

4.4 Results

We executed both algorithms for each user of the dataset, considered as the target user. Each time an algorithm was fully executed, it provided a set of recommendations for the target user. We evaluated the precision and recall of this set using a test set consisting of 30% of the user’s ratings, which we removed and treated as unknown to the algorithm. The results of the evaluation are summarized in the table below:

                    CGA           CMA
Average Recall      0.004872652   0.00808811
Average Precision   0.441938921   0.572996974
Average F           0.008215866   0.012658391
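The metrics in the table can be computed per target user along the following lines. This is a sketch under assumptions: a recommended item counts as a hit when it appears among the held-out 30% of the user's ratings, the split is random, and the names `split_ratings` and `precision_recall_f` are hypothetical. The paper does not state how the held-out ratings were selected or how a hit was defined.

```python
import random


def split_ratings(user_ratings, test_fraction=0.3, seed=0):
    """Hide a fraction of one user's ratings; return (train, test) dictionaries."""
    items = sorted(user_ratings)
    random.Random(seed).shuffle(items)
    test_items = set(items[:round(len(items) * test_fraction)])
    train = {i: r for i, r in user_ratings.items() if i not in test_items}
    test = {i: r for i, r in user_ratings.items() if i in test_items}
    return train, test


def precision_recall_f(recommended, held_out):
    """Precision, recall and F of a recommendation set against the held-out items."""
    hits = len(set(recommended) & set(held_out))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(held_out) if held_out else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


# Hypothetical user with 10 ratings: 3 are hidden, 7 stay visible to the algorithm.
user_ratings = {item: 4 for item in range(10)}
train, test = split_ratings(user_ratings)

# Hypothetical outcome: 4 recommended items, 2 of them among 8 held-out items.
precision, recall, f_score = precision_recall_f(
    {"d", "e", "f", "g"}, {"d", "f", "p", "q", "r", "s", "t", "u"}
)
# precision = 2/4, recall = 2/8, F = 2 * 0.5 * 0.25 / 0.75 = 1/3
```

With this reading, recall is bounded by how many held-out items an algorithm can reach with a short recommendation list.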

Apart from the average values, we checked how both algorithms performed throughout the dataset. The following charts (Fig. 1, Fig. 2) demonstrate that the CMA performed better throughout the whole dataset, for all the evaluation metrics. Each chart represents the average score the algorithm achieved in the corresponding metric. The average total choices show how many items the user had rated when he was used as the target user. It must be noted that the collaborative approach cannot respond well to a cold-start situation [13;14], where the target user has provided no ratings. For cold-start conditions, a content-based metaheuristic algorithm is a more efficient solution.

⁵ In a real-time applied recommender system, these values would be determined on a training set and then applied in production. These values could also be changed by the system during real-time execution.

Fig. 1. Algorithm Comparison in terms of Recall

Fig. 2. Algorithm Comparison in terms of Precision

5 Conclusions and Future Work

Recommendation algorithms typically aim at predicting a set of items that the target user is most likely to be interested in. Even considering a highly accurate algorithm, there is no guarantee that the set of recommended items (produced by that algorithm) represents the optimum solution for the given recommendation problem. On the other hand, OR methods aim at optimizing the solutions to selection problems. Thus, the main idea underlying this paper is to utilize OR methods to find the optimum combination of items among the various “good” recommendation sets that can be produced.

We managed to implement the Collaborative Filtering algorithm as a greedy constructive algorithm, thus demonstrating that Recommender Systems can be treated as an OR problem. Furthermore, we created a metaheuristic algorithm for providing recommendations. Both algorithms were tested on a large, real user dataset. Both gave efficient and precise recommendations, with the metaheuristic being the more efficient overall.

After analyzing the dataset and the recommendations that the algorithms provided, we found that there is a correlation between some of the user characteristics and the performance of each algorithm. The metaheuristic proved to be more promising than Collaborative Filtering in dealing with datasets and users that provide sparse information.

Other OR algorithms and methods using fuzzy sets and neural networks were also considered and seemed very promising. Combining this with the conclusions of this article, we can say that recommendation problems can be treated as Operational Research problems.

The conclusions of this article can help us research new methods for approaching RS. With the use of OR, we can create new algorithms and systems that help provide more accurate and efficient recommendations. Directions for future work include:

• Designing constructive and metaheuristic algorithms for RS that use trust, diversity, novelty and content-based metrics as recommendation criteria.
• Designing more complex metaheuristic algorithms, such as neural networks, swarm intelligence and genetic algorithms, for handling RS.
• Using evaluation metrics other than recall and precision for the evaluation of the algorithms.

References

1. Jafarkarimi, H., Sim, A.T.H., Saadatdoost, R.: A Naive Recommendation Model for Large Databases. International Journal of Information and Education Technology, 216–219 (2012)

2. Ricci, F., Rokach, L., Shapira, B.: Introduction to Recommender Systems Handbook. In: Recommender Systems Handbook. Springer (2011)

3. Brusilovsky, P., Kobsa, A., Nejdl, W.: The Adaptive Web (2007)

4. Sharma, S.C.: Introductory Operation Research. Discovery Publishing House (2006)

5. Schrijver, A.: Combinatorial Optimization. Springer (2003)

6. Gambetta, D.: Can We Trust Trust? In: Trust: Making and Breaking Cooperative Relations (2000)

7. Qiu, Q., Hinze, A.: Trust Based Recommendations for Mobile Tourists in TIP. Hamilton: [s.n.] (2008)

8. Wang, H.-F., Wu, C.-T.: A strategy-oriented operation module for recommender systems in E-commerce. Computers and Operations Research (2010)

9. Cremonesi, P., Garzotto, F., Turrin, R.: Investigating the Persuasion Potential of Recommender Systems from a Quality Perspective: An Empirical Study. ACM Transactions on Interactive Intelligent Systems 2 (2012)

10. Vargas, S., Castells, P.: Rank and Relevance in Novelty and Diversity Metrics for Recommender Systems

11. Massa, P., Avesani, P.: Trust-aware bootstrapping of recommender systems. In: Proceedings of ECAI 2006 Workshop on Recommender Systems, pp. 29–32 (2006)

12. Chen, L.-S., Hsu, F.-H., Chen, M.-C., Hsu, Y.-C.: Developing recommender systems with the consideration of product profitability for sellers (2008)

13. Schein, A.I., Popescul, A., Ungar, L.H., Pennock, D.M.: Methods and Metrics for Cold-Start Recommendations (2002)

14. Lashkari, Y., Metral, M., Maes, P.: Collaborative Interface Agents (1994)

15. Gonzalez, T.F.: Handbook of Approximation Algorithms and Metaheuristics (2007)