Multi-instance genetic programming for web index recommendation
ABSTRACT: This article introduces the use of a multi-instance genetic programming algorithm for modelling user preferences in web index recommendation systems. The developed algorithm learns user interests by means of rules, which add comprehensibility and clarity to the discovered models and increase the quality of the recommendations. This new model, called the G3P-MI algorithm, is evaluated and compared with other available algorithms. Computational experiments show that our methodology achieves competitive results and provides high-quality user models which improve the accuracy of recommendations.
Available from: ieeexplore.ieee.org
ABSTRACT: Mining from ambiguous data is very important in data mining. This paper discusses one of the tasks for mining from ambiguous data, known as the multi-instance problem. In the multi-instance problem, each pattern is a labeled bag that consists of a number of unlabeled instances. A bag is negative if all instances in it are negative. A bag is positive if it has at least one positive instance. Because the instances in a positive bag are not labeled, each positive bag is ambiguous. The mining aim is to classify unseen bags. The main idea of existing multi-instance algorithms is to find the true positive instances in positive bags, convert the multi-instance problem to a supervised problem, and obtain the labels of test bags by predicting the labels of unknown instances. In this paper, we aim at mining the multi-instance data from another point of view, i.e., excluding the false positive instances in positive bags and predicting the label of an entire unknown bag. We propose an algorithm called Multi-Instance Covering kNN (MICkNN) for mining from multi-instance data. Briefly, a constructive covering algorithm is first used to restructure the original multi-instance data. Then, the kNN algorithm is applied to discriminate the false positive instances. In the test stage, we label the tested bag directly according to the similarity between the unseen bag and the sphere neighbors obtained from the last two steps. Experimental results demonstrate that the proposed algorithm is competitive with most of the state-of-the-art multi-instance methods in both classification accuracy and running time.
Tsinghua Science & Technology 08/2013; 18(4). DOI:10.1109/TST.2013.6574674
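The bag-labeling rule stated in this abstract (a bag is positive if at least one of its instances is positive, and negative otherwise) can be written as a minimal Python sketch. The function name `bag_label` and the threshold-based instance concept in the usage lines are hypothetical, chosen only for illustration; this is not the MICkNN algorithm itself, and in practice the instance-level labels are unknown and must be inferred.

```python
from typing import Callable, Iterable, TypeVar

Instance = TypeVar("Instance")


def bag_label(bag: Iterable[Instance],
              is_positive: Callable[[Instance], bool]) -> bool:
    """Standard multi-instance assumption.

    A bag is positive if at least one instance is positive,
    and negative if all of its instances are negative.
    `is_positive` is a hypothetical instance-level concept.
    """
    return any(is_positive(x) for x in bag)


# Illustrative instance-level concept (an assumption, not from the paper):
# an instance counts as positive when its value exceeds 0.5.
def above_half(x: float) -> bool:
    return x > 0.5


positive_bag = [0.1, 0.9, 0.2]   # contains one positive instance
negative_bag = [0.1, 0.2, 0.3]   # every instance is negative

print(bag_label(positive_bag, above_half))
print(bag_label(negative_bag, above_half))
```

The ambiguity mentioned in the abstract comes from the fact that a learner only observes the result of `bag_label`, never which instance inside a positive bag triggered it.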
ABSTRACT: Clickstreams in users' navigation logs contain various data related to users' web surfing, such as visit counts, stay times, and product types. By observing these data, we can divide clickstreams into sub-clickstreams so that the pages in a sub-clickstream share more context with each other than with the pages in other sub-clickstreams. In this paper, we propose a method, based on genetic programming and association rules, which extracts more informative rules from clickstreams for web page recommendation. First, we split clickstreams into sub-clickstreams by context in order to generate more informative rules. To split clickstreams with context taken into account, we extract six features from users' navigation logs. A set of split rules is generated by combining those features through genetic programming, and then informative rules for recommendation are extracted with the association rule mining algorithm. Through experiments, we verify that the proposed method is more effective than the other methods in various conditions.
IEICE Transactions on Communications 05/2012; E95.B(5):1558-1565. DOI:10.1587/transcom.E95.B.1558 · 0.33 Impact Factor
ABSTRACT: Web index recommendation systems are designed to help internet users with suggestions for finding relevant information. One way to develop such systems is using the multi-instance learning (MIL) approach: a generalization of traditional supervised learning where each example is a labeled bag that is composed of unlabeled instances, and the task is to predict the labels of unseen bags. This paper proposes a multi-instance learning wrapper method using the Rocchio classifier to recommend web index pages. The wrapper implements a new way to relate the instances with the class labels of the bags. The proposed method has low computational cost, and the experimental study on benchmark data sets shows that it performs better than the state-of-the-art methods for this problem.
Knowledge-Based Systems 03/2014; 59. DOI:10.1016/j.knosys.2014.01.008 · 3.06 Impact Factor
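To make the idea of a multi-instance wrapper around a centroid-based classifier concrete, here is a rough Python sketch. It is not the method proposed in the paper above, whose way of relating instances to bag labels is not described in the abstract. The sketch assumes the simplest possible wrapper: every instance inherits its bag's label, a Rocchio-style classifier is reduced to one Euclidean centroid per class, and a bag is predicted positive when any of its instances lies closer to the positive centroid. All function names and the toy data are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Bag = List[Vector]


def centroid(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def train(bags: List[Bag], labels: List[bool]) -> Tuple[List[float], List[float]]:
    """Naive wrapper (an assumption): each instance inherits its bag's label.

    Returns the (positive, negative) class centroids.
    """
    pos = [x for bag, y in zip(bags, labels) if y for x in bag]
    neg = [x for bag, y in zip(bags, labels) if not y for x in bag]
    return centroid(pos), centroid(neg)


def predict_bag(bag: Bag, model: Tuple[List[float], List[float]]) -> bool:
    """A bag is positive if any instance is nearer the positive centroid."""
    pos_c, neg_c = model
    return any(math.dist(x, pos_c) < math.dist(x, neg_c) for x in bag)


# Toy data: each bag stands for an index page, each instance for a
# feature vector of one linked page (illustrative values only).
train_bags = [
    [(1.0, 1.0), (0.0, 0.0)],   # positive bag
    [(0.0, 0.1), (0.1, 0.0)],   # negative bag
]
train_labels = [True, False]

model = train(train_bags, train_labels)
print(predict_bag([(0.9, 0.9)], model))
print(predict_bag([(0.0, 0.0)], model))
```

Training reduces to computing two means, which is consistent with the low computational cost the abstract attributes to a Rocchio-based approach. The weakness of this naive label inheritance is that negative instances inside positive bags pull the positive centroid toward the negative class.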