CBR-Recommendation System on Massive Contents Processing Using Optimized MFNN Algorithm

Abstract

Although recommendation systems are widely used by websites to generate new recommendations from the preferences of like-minded users, IEEE Internet Computing points out that current systems cannot meet the demands of real large-scale e-commerce and suffer from weaknesses such as low precision and slow response. Large volumes of personalized data are the key to making a successful recommendation, but they are difficult to handle because they are massive and high-dimensional. To address these problems, this paper proposes a multi-layer feed-forward neural network (MFNN) system based on case intelligence that partitions massive personalized data into the most similar groups. The subsequent experiment indicates that the system model is constructive and understandable, and that the algorithm reduces the complexity of the ANN algorithm, which helps to guarantee system performance.
Rui Li (1, a), Jianyang Li (2, b), Benkun Zhu (2, c)
(1) Department of Information Engineering, Anhui Communications Technical College,
Hefei, 230051, China
(2) School of Computer and Information, Hefei University of Technology,
Hefei, 230009, China
(a) email: Liruilary@gmail.com, (b) email: lijianyang@sina.com, (c) email: zbk@zjc.edu.cn
Keywords: CBR-Recommendation System; Optimized MFNN Algorithm;
Automatic Retrieval; Massive Contents
Introduction
E-commerce is growing quickly and reshaping world trade, and both sides of the
market, suppliers and customers, need ways to capture consumers’ needs and
improve their satisfaction. Making a successful recommendation is therefore the
essential task of a recommendation system, which provides consumers with
product information and advice and simulates a salesperson who helps them
through the purchase process. As is well known, acquiring personalized
knowledge is the key step for a recommender to succeed: it must identify the
specific needs of each consumer on the basis of that consumer’s preferences [1].
International Symposium on Computers & Informatics (ISCI 2015)
© 2015. The authors - Published by Atlantis Press
The system uses data mining and other artificial intelligence technologies [2]
to analyze the collected data, capture user behavior and infer interests; web
mining emerged in response to this e-commerce need and has gradually grown
into a complete technical system. Information retrieval (IR) and information
extraction (IE) are the two essential steps for exploring personalized data from
websites, and building such a system involves many technologies, such as
database technology, information technology, statistics, ANN and machine
learning, to uncover potentially useful information or patterns [3]. To
accomplish these multiple tasks in such complex environments, the paper [4]
described a case-intelligence recommendation system that uses a variety of data
mining technologies and can acquire effective personalized knowledge.
Plentiful personalized data is the key to meeting individual needs in a
recommendation system, because the recommender is data-driven: the more data
it accumulates, the higher the accuracy it can achieve. However, the real user
behavior records mined from websites can accumulate to millions or even
billions, and processing such massive user data is the greatest challenge: the
data are huge and high-dimensional, and they degrade system performance
sharply. That is why a statistics report from ACM points out the unsatisfactory
performance of current recommenders, as discussed in the paper [5]. This paper
proposes a new method that uses the MFNN algorithm to solve these problems.
MFNN Algorithm
CBR, which simulates human analogical learning, has achieved great success in
domains where knowledge is lacking, and it has become the foundation of
case-intelligent decision techniques, because a case is an integrated
representation of human sense, logic and creativity. ANN has a natural
relationship with CBR, and the two complement each other well.
Case Intelligence with MFNN
An MFNN consists of one input layer, one or more hidden layers and one output
layer, where each neuron in every layer is a processor that handles simple
information. It has been proved that a three-layer MFNN can approximate any
given function to a given accuracy, so it can be used to solve nonlinear
classification problems. The case library in a CBR system can be viewed as a
CSP; therefore CS-ANN models, such as the Schema model, the Hopfield model, the
Boltzmann model and Harmony theory, can be employed to construct the case
library [6,7].
Case retrieval is the key process of a CBR intelligent system. Currently the
main method for case matching is the k-nearest neighbor algorithm, but it can
reflect neither the relationship between cases and their attributes nor the
preferences of customers; for a large-scale case library in particular, the
retrieval time is unacceptable.
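For reference, k-nearest-neighbor case matching can be sketched as follows. This is a minimal illustration and not the retrieval code of any particular CBR system; the function name `knn_retrieve`, the layout of the case library and the toy cases are all hypothetical. It makes the cost visible: every query scans the whole case library.

```python
import math

def knn_retrieve(case_library, query, k=3):
    """Return the k (distance, solution) pairs closest to `query`.

    case_library is a list of (feature_vector, solution) pairs; this layout
    is a hypothetical choice made for the example.
    """
    scored = []
    for features, solution in case_library:
        # Euclidean distance between the stored case and the query.
        dist = math.sqrt(sum((f - q) ** 2 for f, q in zip(features, query)))
        scored.append((dist, solution))
    # Each retrieval scans and sorts the whole library.
    scored.sort(key=lambda pair: pair[0])
    return scored[:k]

cases = [([0.0, 0.0], "A"), ([1.0, 1.0], "B"), ([5.0, 5.0], "C")]
nearest = knn_retrieve(cases, [0.9, 1.2], k=2)
```

Because the distance treats every attribute equally, the sketch also shows why plain k-NN carries no information about attribute relevance or customer preference.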
To address these problems, several successful theories have been put forward to
integrate ANN into the CBR system and to cover all of its application aspects
with ANN components. Theoretically, in a CBR system based on a symbolic
description model, rules can be elicited by ANN methods; in a CBR system based
on a quantitative description model, the system’s flexibility allows many
mathematical approaches and optimization techniques to be employed in defining
and analyzing similarity measures or case adaptation criteria.
Classic MFNN Algorithms
An MFNN has a clear layered structure and can be drawn as a directed graph: an
input vector x passes through the network to produce an output vector y, so
the feed-forward network acts as a converter that maps x to y. MFNN has many
classic models and algorithms, such as the back-propagation network, the radial
basis function (RBF) network, the simulated annealing algorithm and their
improved variants.
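The mapping from x to y can be illustrated with a minimal forward pass. The sketch assumes fully connected layers with a tanh activation and hand-picked toy weights; the function name `forward` and all numeric values are hypothetical and are not taken from the paper.

```python
import math

def forward(x, layers):
    """Propagate input vector x through the layers and return output vector y.

    layers is a list of (weights, biases); weights[j] holds the incoming
    weights of neuron j in that layer.
    """
    activation = list(x)
    for weights, biases in layers:
        activation = [
            math.tanh(sum(w * a for w, a in zip(row, activation)) + b)
            for row, b in zip(weights, biases)
        ]
    return activation

# Two inputs -> three hidden neurons -> one output (toy weights).
hidden = ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1])
output = ([[1.0, -1.0, 0.5]], [0.0])
y = forward([1.0, 2.0], [hidden, output])
```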
Many improved MFNN algorithms have been proposed and have achieved
significant results. However, all of these changes make the network more
complex (either the performance function becomes more complex, for example
linear becomes nonlinear, or the structure becomes more complex); they aim to
improve the learning speed of the network in return for the added complexity
of its structure.
Our investigation of MFNN behavior in case retrieval shows that weaknesses
such as low speed and local extrema are inherent in those algorithms. For
example, RBF is a good similarity detector, but it can hardly deal with huge
user data directly. Many of these weaknesses persist in current algorithms and
cannot be overcome to a satisfactory level, especially for such complex data.
System Construction and Processing
Our domain algorithm is a constructive MFNN method, based on the geometrical
representation of the M-P neuron model, that builds a three-layer network from
the structure of the input data itself.
Domain Algorithm
Each n-dimensional input vector $x$ can be projected onto a hyper-sphere $S^n$
in an expanded (n+1)-dimensional space, and a “certain spherical domain”
corresponds to a neural weight and threshold function. The transformation
$T: D \to S^n$, $T(x) = (x, \sqrt{R^2 - |x|^2})$, is used to achieve the
projection, so that all points of the sample set $D$ are projected onto
$S^n$ [8].
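The projection can be sketched in a few lines, assuming the transformation has the form T(x) = (x, sqrt(R^2 - |x|^2)). The text does not say how R is chosen; the sketch assumes R is the largest sample norm, which is the smallest value that keeps the square root real. The function name `project_to_sphere` and the sample values are hypothetical.

```python
import math

def project_to_sphere(x, radius):
    """Map an n-dimensional x to the (n+1)-dimensional point (x, sqrt(R^2 - |x|^2))."""
    norm_sq = sum(v * v for v in x)
    if norm_sq > radius ** 2 + 1e-9:
        raise ValueError("radius must be at least |x|")
    # Clamp at zero to absorb floating-point rounding for the largest sample.
    extra = math.sqrt(max(0.0, radius ** 2 - norm_sq))
    return list(x) + [extra]

samples = [[3.0, 0.0], [0.0, 4.0], [1.0, 1.0]]
# Assumption: R is the maximum norm over the sample set.
R = max(math.sqrt(sum(v * v for v in s)) for s in samples)
projected = [project_to_sphere(s, R) for s in samples]
```

After the projection every sample has the same norm R, so comparing inner products with a center is equivalent to comparing distances to it.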
Assume the input sample set $K = \{x_1, x_2, \ldots, x_r\}$ can be classified
into subsets of r classes; then a group of sphere domains can be used to cover
the sample set K:
(1) Calculate the center of all the samples, and then start covering from the
sample point $a_i \in K_j$ that is nearest to the center;
(2) Calculate the domain $C(a_i)$, which is centered at $a_i$. Suppose
$C(a_i) \cap K_j = D_i$, $i = 1, 2, \ldots$, with $D_0 = \emptyset$,
$d_1(i) = \max_{x \notin K_j} \{\langle a_i, x \rangle\}$,
$d_2(i) = \min_{x \in K_j} \{\langle a_i, x \rangle : \langle a_i, x \rangle > d_1(i)\}$.
Then compute the radius of the sphere covering domain:
$d(i) = (d_1(i) + d_2(i)) / 2$;
(3) Calculate the barycenter $b$ of $D_i$. If $D_{i-1}$ is a subset of $D_i$,
set $a_{i+1} = b$, $i{+}{+}$, and return to step (2) until the number of
samples covered no longer increases. Delete all the points covered by $C_k$:
$K_{jr} = C_k \cap K_j$, $K_{jm} = K_j \setminus K_{jr}$, $K_j = K_{jm}$,
$r{+}{+}$. Then calculate another covering.
After executing all these steps, a group of sphere domains
$\{C_1, C_2, \ldots, C_p\}$ is obtained, and it can be proved that the samples
in the same covering domain have a high degree of similarity. In our
recommendation system, the personalized user data can thus be partitioned into
several “domains”, which form the most similar groups.
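The covering procedure can be illustrated with a simplified sketch. It assumes the samples are already projected onto a sphere (so all have equal norm), that a cover contains x when the inner product with its center exceeds a threshold halfway between the nearest other-class sample and the farthest same-class sample beyond it, and that no two identical points carry different labels. The barycenter refinement of step (3) is omitted for brevity. The names `build_covers` and `covering_labels` and the toy data are hypothetical, and this is not the authors' implementation.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def build_covers(points, labels):
    """Greedy sphere-domain covering; returns a list of (center, threshold, label)."""
    covers = []
    for label in sorted(set(labels)):
        same = [p for p, l in zip(points, labels) if l == label]
        other = [p for p, l in zip(points, labels) if l != label]
        uncovered = list(same)
        while uncovered:
            # Start from the uncovered sample nearest to the centroid
            # (largest inner product, since all points have equal norm).
            centroid = [sum(col) / len(uncovered) for col in zip(*uncovered)]
            center = max(uncovered, key=lambda p: dot(centroid, p))
            # d1: closest sample of another class; d2: farthest same-class
            # sample that is still closer than d1.
            d1 = max((dot(center, x) for x in other), default=float("-inf"))
            d2 = min(dot(center, x) for x in same if dot(center, x) > d1)
            threshold = (d1 + d2) / 2 if other else d2 - 1.0
            covers.append((center, threshold, label))
            # Delete the covered samples and build another covering.
            uncovered = [p for p in uncovered if dot(center, p) <= threshold]
    return covers

def covering_labels(covers, x):
    """Labels of all covers whose sphere domain contains x."""
    return {label for center, threshold, label in covers if dot(center, x) > threshold}

# Toy data on the unit circle: class 0 near angle 0, class 1 near angle pi.
points = [(1.0, 0.0), (math.cos(0.1), math.sin(0.1)),
          (-1.0, 0.0), (math.cos(3.0), math.sin(3.0))]
labels = [0, 0, 1, 1]
covers = build_covers(points, labels)
```

Each loop iteration removes at least the chosen center from the uncovered set, so the procedure terminates after at most one cover per sample.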
System Construction
Through this dimension expansion and space projection of the vectors, these
domains can be used as the input user vectors to the MFNN for case retrieval.
The recommendation system is then built in the following two steps.
Firstly, the construction of the MFNN can be described as follows:
The first layer: assume a total of p neurons $A_1, A_2, \ldots, A_p$, where
$A_i$ represents the neuron of covering $C(i)$:
$W_i(1) = a(i), \quad \theta_i(1) = \theta(i)$
The second layer: select the same number of neurons as in the first layer,
$B_1, B_2, \ldots, B_p$:
$W_{ij}(2) = \begin{cases} -1, & j < i \\ 1, & j = i \\ 0, & j > i \end{cases}, \quad \theta_i(2) = 1, \quad (i = 1, 2, \ldots, p),\ (j = 1, 2, \ldots, p)$
The last layer: select a total of T neurons $C_1, C_2, \ldots, C_T$, where T is
the total number of sample classes:
$W_{ji}(3) = \begin{cases} 1, & (j - i) \bmod T = 0 \\ 0, & \text{others} \end{cases}, \quad \theta_i(3) = 1, \quad (i = 1, 2, \ldots, T),\ (j = 1, 2, \ldots, p)$
Adding a hidden layer makes the number of neuron weights grow linearly and
decreases the complexity of the algorithm; in contrast, in the original
three-layer neural network the number of neuron weights grows exponentially.
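One way to read the layered construction is as a network of threshold units: the first layer tests membership in each cover, the second keeps only the first cover that fires, and the last combines the covers of each class. The sketch below follows that reading under stated assumptions: a cover contains x when the inner product with its center exceeds its threshold, and each cover carries an explicit class index instead of the modular indexing used in the weight definition. The name `classify` and the toy covers are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def step(value, threshold):
    """Threshold unit: outputs 1 when value reaches the threshold, else 0."""
    return 1 if value >= threshold else 0

def classify(x, covers, num_classes):
    """covers: list of (center, theta, class_index) in construction order."""
    # Layer 1: A_i fires when x lies inside cover i.
    a = [1 if dot(center, x) > theta else 0 for center, theta, _ in covers]
    # Layer 2: B_i = A_i minus all earlier A_j, thresholded at 1, so only
    # the first firing cover survives.
    b = [step(a[i] - sum(a[:i]), 1) for i in range(len(a))]
    # Layer 3: C_t fires when a surviving cover belongs to class t.
    return [
        step(sum(b[i] for i, (_, _, cls) in enumerate(covers) if cls == t), 1)
        for t in range(num_classes)
    ]

covers = [((1.0, 0.0), 0.5, 0), ((0.0, 1.0), 0.5, 1), ((1.0, 0.0), -0.5, 1)]
```

All weights in the second and third layers are fixed by the cover order and class membership, so nothing has to be trained by gradient descent once the covers are known.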
Secondly, the system framework that we propose based on our MFNN algorithm
(the figure is omitted because of the page limit; it is described in our paper
[5]) consists mainly of three parts: the input module, the recommendation
methods and the output module. Our MFNN algorithm is added as the retrieval
process so that it can be evaluated directly on huge data. A characteristic
advantage of our case recommender is its excellent flexibility, which leads to
a regenerative process for case retrieval.
Experiments and Analysis
The “forest cover type” data set used in our experiment was downloaded from
the UCI repository and is used to validate our MFNN algorithm. Its main
characteristics are as follows: number of instances (observations): 581,012;
number of attributes: 54; number of classes: 7.
Each record represents the personalized data of a user collected from websites
and is regarded as a user behavior vector with 54 attributes, so the user data
library accumulates 581,012 user sessions. Standard macro-averaging is then
used to calculate the mean F-score over all classes.
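Macro-averaging can be made concrete with a short sketch. The paper does not state which F-score variant is used; the sketch assumes the balanced F1 score, F1 = 2TP / (2TP + FP + FN), computed per class and then averaged with equal weight per class. The function name `macro_f_score` and the toy labels are hypothetical.

```python
def macro_f_score(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed variant: F1)."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted members contributes 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

y_true = [0, 0, 1, 1]
y_pred = [0, 1, 1, 1]
score = macro_f_score(y_true, y_pred)
```

Because every class counts equally, a rare class with poor recall lowers the macro average as much as a frequent one would.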
As Table 1 shows, although the data form a sparse, high-dimensional matrix
with a huge number of records, the system efficiency is significantly enhanced.
Our experimental results also indicate that the recommendation system runs in
two stages: the first is spent on training the users’ personalized cases, which
takes a long time but can run in the background; the second is spent on
recommending the proper case, which takes only a short time. Thus, our system
can handle massive personalized data effectively and improve the performance
of e-commerce recommendation.
Table 1. System performance
User Cases    F-score (%)    T-partition (s)    Time (ms)
10,000        79.1           14.207             14.81
15,000        81.7           33.225             34.339
20,000        82.9           49.209             50.391
30,000        81.3           68.316             69.577
40,000        82.4           86.05              87.349
50,000        81.6           103.06             104.4
100,000       83.2           272.31             273.75
Conclusion
Personalization involves gathering and storing information about site
visitors. Such personalized data are the key assets for analyzing current and
past user interaction behavior and for delivering the right content to each
visitor, but they are massive and high-dimensional, and can hardly be
manipulated directly. Our investigation indicates that our recommender has a
clear system structure, a feasible combination of components, and easy
integration and construction. Our experimental results suggest that our MFNN
algorithm is suitable for large-scale, high-dimensional data processing, which
can guarantee better performance for the CBR-recommendation system.
Acknowledgement
This research was sponsored by the Natural Science Foundation of Anhui
Province (Project No. KJ2014A050).
References
[1] Wei Chu, Seung-Taek Park. “Personalized Recommendation on Dynamic
Content Using Predictive Bilinear Models” [C]. WWW 2009, pp. 691-700
[2] Bach, K., Althoff, K.-D., Newo, R., Stahl, A. “A Case-Based Reasoning
Approach for Providing Machine Diagnosis from Service Reports” [C]. ICCBR
2011. LNCS, (6880), pp. 363-377
[3] Zurina Saaya, Markus Schaal, Maurice Coyle, Peter Briggs, and Barry Smyth.
“Exploiting Extended Search Sessions for Recommending Search Experiences in
the Social Web” [C]. ICCBR 2012. LNAI, (7466), pp. 369-383
[4] Jianyang Li, Xiaoping Liu. “Personalized Recommendation System on
Massive Content Processing Using Improved MFNN” [C]. Springer's LNCS
7529 (2012), pp. 183-190
[5] Jianyang Li, Xiaoping Liu, Rui Li. “Optimized RBF for
CBR-Recommendation System” [J]. AMM 214 (2012), pp. 568-572
[6] Debarun Kar, Sutanu Chakraborti, and Balaraman Ravindran. “Feature
Weighting and Confidence Based Prediction for Case Based Reasoning
Systems” [C]. LNAI, (7466), pp. 211-225
[7] Zhiwei Ni, Jianyang Li, Fenggang Li, Shanlin Yang. “Survey of Case
Decision Techniques and Case Decision Support System” [J]. Chinese Computer
Science, 2009, 36(11), pp. 18-24
[8] ZHANG Ling. “The relationship between Kernel Functions Based SVM and
Three-layer Feedforward Neural Networks” [J]. Chinese J. Computer, 25(7):
696-700, 2002.