Conference Paper

# Feature weighting and instance selection for collaborative filtering

Institute of Computer Science, University of Munich

DOI: 10.1109/DEXA.2001.953076
Conference: Database and Expert Systems Applications, 2001. Proceedings of the 12th International Workshop
Source: IEEE Xplore


**ABSTRACT:** Collaborative filtering systems have achieved great success in both research and business applications. One of the key technologies in collaborative filtering is the similarity measure. Cosine-based and Pearson correlation-based methods are popular similarity measures, but have low accuracy. In this paper, we propose a novel method for similarity measurement, referred to as hierarchical pair-wise sequence (HPWS). In HPWS, we take into account both the sequence property of user behaviors and the hierarchical property of item categories. We design a collaborative filtering recommendation system to evaluate the performance of HPWS based on empirical data collected from a real P2P application, "byrBT" in CERNET. Experiment results show that HPWS outperforms traditional cosine and Pearson similarity measures under all scenarios. (01/2012)
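The cosine and Pearson baselines that this abstract compares against are standard. A minimal sketch of both, computed over the items two users have co-rated (rating dicts `alice` and `bob` are illustrative, not from the paper's dataset):

```python
import math

def cosine_sim(u, v):
    """Cosine similarity over the items both users rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    num = sum(u[i] * v[i] for i in common)
    den = (math.sqrt(sum(u[i] ** 2 for i in common))
           * math.sqrt(sum(v[i] ** 2 for i in common)))
    return num / den if den else 0.0

def pearson_sim(u, v):
    """Pearson correlation over the items both users rated."""
    common = set(u) & set(v)
    if len(common) < 2:
        return 0.0
    mu = sum(u[i] for i in common) / len(common)
    mv = sum(v[i] for i in common) / len(common)
    num = sum((u[i] - mu) * (v[i] - mv) for i in common)
    den = (math.sqrt(sum((u[i] - mu) ** 2 for i in common))
           * math.sqrt(sum((v[i] - mv) ** 2 for i in common)))
    return num / den if den else 0.0

alice = {"a": 5, "b": 3, "c": 4}
bob   = {"a": 4, "b": 2, "c": 5}
```

Pearson mean-centers each user's ratings before correlating, which is why it is usually preferred over raw cosine when users differ in how generously they rate.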

**ABSTRACT:** Research on the use of social trust relationships for collaborative filtering has shown that trust-based recommendations can outperform traditional methods in certain cases. This, in turn, led to insights that tie trust to certain more subtle types of similarity between users which are not captured in the overall similarity measures normally used for making recommendations. In this study, we investigate the use of these trust-inspired nuanced similarity measures directly for making recommendations. After describing previous research that identified these similarity statistics, we present an experiment run on two data sets: FilmTrust and MovieLens. Our results show that using a simple measure - the single largest difference between users - as a weight (01/2009)
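The "single largest difference between users" statistic the abstract mentions can be turned into a weight in several ways; the abstract does not give the exact formula, so the normalization below (dividing by the rating-scale maximum) is an assumption for illustration only:

```python
def largest_diff_weight(u, v, rating_max=5.0):
    """Hypothetical weight from the single largest rating difference
    on co-rated items: 1.0 when users never disagree, lower as their
    worst disagreement grows. The exact formulation in the paper may
    differ; this is a sketch of the idea only."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    max_diff = max(abs(u[i] - v[i]) for i in common)
    return 1.0 - max_diff / rating_max
```

Such a weight penalizes a neighbor for one sharp disagreement even when the overall correlation is high, which is the kind of nuance the abstract says overall similarity measures miss.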

**ABSTRACT:** The properties of a training data set, such as size, distribution, and number of attributes, contribute significantly to the generalization error of a learning machine. A poorly distributed data set is prone to producing a partially overfitted model. The two approaches proposed in this paper for binary classification enhance the useful data information by mining negative data. First, the error-driven compensating hypothesis approach is based on Support Vector Machines with "1+k times learning", where the base learning hypothesis is iteratively compensated k times. This approach produces a new hypothesis on a new data set in which each label is a transformation of a label from the negative data set, and further produces child positive and negative data subsets in subsequent iterations. This procedure refines the model created by the base learning algorithm, creating k hypotheses over k iterations. A predicting method is also proposed to trace the relationships between the negative subsets and the testing data set by a vector similarity technique. Second, a statistical negative-example learning approach based on theoretical analysis improves the performance of the base learning algorithm (the learner) by creating one or two additional hypotheses, an auditor and a booster, to mine the negative examples output by the learner. The learner employs a regular support vector
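The iterative "1+k times learning" idea - fit a base hypothesis, then repeatedly fit compensating hypotheses on the examples the previous round got wrong - can be sketched without the SVM machinery. The stump learner below stands in for the paper's SVM base learner, and the relabeling/prediction details are simplified assumptions; the paper's actual procedure is more involved:

```python
def fit_stump(data):
    """Pick the threshold and sign on a 1-D feature that minimize
    training errors (a stand-in for the paper's SVM base learner)."""
    best = None
    for x, _ in data:
        for sign in (1, -1):
            errs = sum(1 for xi, yi in data
                       if (sign if xi >= x else -sign) != yi)
            if best is None or errs < best[0]:
                best = (errs, x, sign)
    _, thr, sign = best
    return lambda xi: sign if xi >= thr else -sign

def fit_compensated(data, k=2):
    """Base hypothesis plus up to k compensating hypotheses, each
    trained only on the examples the previous one misclassified
    (the 'negative' subset of that iteration)."""
    hypotheses = [fit_stump(data)]
    rest = data
    for _ in range(k):
        rest = [(x, y) for x, y in rest if hypotheses[-1](x) != y]
        if not rest:  # nothing left to compensate for
            break
        hypotheses.append(fit_stump(rest))
    return hypotheses
```

On separable data the base stump already fits, so the loop exits after the first round; on harder data each round narrows in on the residual errors, which mirrors the refinement the abstract describes.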
