We propose a new support vector data description (SVDD) incorporating the local density of a training data set by introducing a local density degree for each data point. By using a density-induced distance measure based on the degree, we reformulate a conventional SVDD. Experiments with various real data sets show that the proposed method more accurately describes training data sets than the conventional SVDD in all tested cases. (c) 2005 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
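As a rough illustration of what a local density degree and a density-induced distance could look like, the sketch below scores each point by the distance to its k-th nearest neighbour relative to the data-set average, then uses that score to weight a Euclidean distance to a centre. The exponential form, the parameter names `k` and `omega`, and both function names are assumptions made for this example; the abstract does not give the exact definitions.

```python
import math


def euclidean(p, q):
    """Plain Euclidean distance between two equal-length point tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def local_density_degrees(points, k=2, omega=1.0):
    """Return one density degree per point (larger = denser neighbourhood).

    Assumed form: exp(omega * mean_kth_distance / kth_distance_of_point).
    This is an illustrative choice, not taken from the abstract.
    """
    if not 1 <= k < len(points):
        raise ValueError("k must satisfy 1 <= k < number of points")
    kth_distances = []
    for i, p in enumerate(points):
        # Distances to every other point, sorted ascending.
        others = sorted(euclidean(p, q) for j, q in enumerate(points) if j != i)
        kth_distances.append(others[k - 1])
    if min(kth_distances) == 0.0:
        raise ValueError("duplicate points give a zero k-th neighbour distance")
    mean_kth = sum(kth_distances) / len(kth_distances)
    return [math.exp(omega * mean_kth / d) for d in kth_distances]


def density_induced_distance(point, center, degree):
    """Distance to `center`, scaled by the point's density degree (assumed form)."""
    return degree * euclidean(point, center)


if __name__ == "__main__":
    # Three tightly clustered points and one isolated point.
    data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
    for p, rho in zip(data, local_density_degrees(data, k=2)):
        print(p, rho, density_induced_distance(p, (0.0, 0.0), rho))
```

Under this assumed weighting, points in dense regions count for more when they lie far from the centre, so a description fitted with such a distance is pulled toward dense regions rather than toward isolated points.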
ABSTRACT: Subcellular localization is one of the key functional characteristics of proteins. An automatic and efficient method for predicting protein subcellular localization is needed for large-scale genome analysis. From a machine learning point of view, a protein localization dataset has several difficult characteristics: it has many classes (there are more than 10 localizations in a cell), it is multi-label (a protein may occur in several different subcellular locations), and it is highly imbalanced (the number of proteins differs markedly between localizations). Although much previous work has addressed the prediction of protein subcellular localization, none of it effectively tackles all of these characteristics at the same time. A new computational method for protein localization is therefore needed for more reliable outcomes. To address this, we present a protein localization predictor based on D-SVDD (PLPD), which can find the likelihood of a specific localization of a protein more easily and more correctly. Moreover, we introduce three measurements for more precise evaluation of a protein localization predictor. Results on various datasets derived from the experiments of Huh et al. (2003) indicate that the proposed PLPD method represents a different approach that might play a complementary role to existing methods, such as the Nearest Neighbor method and the discriminate covariant method. Finally, after finding a good boundary for each localization using the 5184 classified proteins as training data, we predicted the localizations of 138 proteins whose subcellular localizations could not be clearly observed in the experiments of Huh et al. (2003).
Nucleic Acids Research 02/2006; 34(17):4655-66. DOI:10.1093/nar/gkl638 · 9.11 Impact Factor
ABSTRACT: Computer-vision-based automatic detection of fabric defects is a difficult real-world one-class classification task. To overcome the inability of a single fractal feature to handle this task, multiple fractal features are extracted in light of the theory and known problems of the box-counting method, as well as the inherent characteristics of woven fabrics. Support vector data description (SVDD), a recent method based on statistical learning theory, is well suited to one-class classification. This paper presents a robust new scheme for optimally selecting the parameters involved in training the SVDD model, especially the scale parameter of the Gaussian kernel function. Satisfactory experimental results are achieved by jointly applying the extracted multiple fractal features and SVDD to defect detection on several datasets of fabric samples with different texture backgrounds.