-
ABSTRACT: In this paper, we propose a general graph-based semi-supervised learning algorithm. The core idea of our algorithm is
not only to achieve the goal of semi-supervised learning, but also to discover a latent novel class in the data that the user
may have left unlabeled. Based on the normalized weights evaluated on the data graph, our algorithm outputs the probabilities of
data points belonging to the labeled classes or to the novel class. We also give theoretical interpretations of the algorithm
from three graph-based viewpoints, i.e., the regularization framework, label propagation, and Markov random walks. Experiments on
toy examples and several benchmark datasets illustrate the effectiveness of our algorithm.
Keywords: Pattern recognition; Semi-supervised learning; Novel class discovery; Normalized weights
Neural Computing and Applications 04/2012; 19(4):549-555. · 0.70 Impact Factor
-
Pattern Recognition Letters. 01/2012; 33:485-491.
-
ABSTRACT: The problem of semisupervised learning has attracted considerable research interest in the past few years. Most existing methods aim to learn from a partially labeled dataset, i.e., they assume that the exact labels of some data are already known. In this paper, we propose to use a novel type of supervision information to guide the process of semisupervised learning, which indicates that a point does not belong to a specific category. We call this kind of information a negative label (NL) and propose a novel approach called NL propagation (NLP) that makes efficient use of it to assist semisupervised learning. Specifically, NLP assumes that nearby points should have similar class indicators. The data labels are propagated under the guidance of the NL information and the geometric structure revealed by both labeled and unlabeled points, by employing specified initialization and parameter matrices. The convergence analysis, out-of-sample extension, parameter determination, computational complexity, and relations to other approaches are presented. We also interpret the proposed approach within the framework of regularization. Promising experimental results on image, digit, spoken letter, and text classification tasks are provided to show the effectiveness of our method.
IEEE Transactions on Neural Networks 04/2011; · 2.95 Impact Factor
-
IEEE Transactions on Neural Networks. 01/2011; 22:420-432.
-
ABSTRACT: Semi-supervised learning has received increasing attention and is widely used in fields such as data mining, information retrieval, and knowledge management, as it can utilize both labeled and unlabeled data. Laplacian SVM (LapSVM) is a classical method whose effectiveness has been validated by a large number of experiments. However, LapSVM is sensitive to the labeled data and has cubic computational complexity, which limits its application in large-scale scenarios. In this paper, we propose a multi-class method called Probabilistic labeled Semi-supervised SVM (PLSVM), in which the optimal decision surface is learned from probabilistic labels of all the training data, both labeled and unlabeled. We then propose a kernelized dual coordinate descent method to efficiently solve the dual problems of PLSVM and reduce its memory requirements. Experiments on synthetic data and several real-world benchmark datasets show that PLSVM is less sensitive to labeling and performs better than traditional methods such as SVM, LapSVM, and Transductive SVM (TSVM).
Data Mining Workshops, 2009. ICDMW '09. IEEE International Conference on; 01/2010
-
Pattern Recognition. 01/2010; 43:720-730.
-
ABSTRACT: Recent years have witnessed a surge of interest in subspace learning for image classification, which has attracted considerable research in the pattern recognition and signal processing fields. However, the accuracy of previous methods on image classification is limited, since they neglect some particular characteristics of image data. In this paper, we propose a new subspace learning method. It constrains the transformation basis to be orthonormal and the derived coefficients to be spatially smooth. Classification is then performed in the image subspace. The proposed method can not only represent the intrinsic structure of the image data, but also avoid over-fitting. More importantly, it can be considered a general framework within which the performance of other subspace learning methods can be improved in the same way. Some related analyses of the proposed approach are presented. Promising experimental results on different kinds of real images demonstrate the effectiveness of our algorithm for image classification.
IEEE Signal Processing Letters 05/2009; · 1.39 Impact Factor
-
Neural Processing Letters. 01/2009; 30:89-102.
-
Proceedings of the 18th ACM Conference on Information and Knowledge Management, CIKM 2009, Hong Kong, China, November 2-6, 2009; 01/2009
-
ABSTRACT: Dimensionality reduction is a major challenge in many areas. A large number of local approaches, stemming from statistics or geometry, have been developed. However, in practice these local approaches often lack robustness, since, in contrast to maximum variance unfolding (MVU), which explicitly unfolds the manifold, they merely characterize local geometric structure. Moreover, the eigenproblems that they encounter are hard to solve. We propose a unified framework that explicitly unfolds the manifold and reformulates local approaches as semidefinite programs instead of the above-mentioned eigenproblems. Three well-known algorithms, locally linear embedding (LLE), Laplacian eigenmaps (LE), and local tangent space alignment (LTSA), are reinterpreted and improved within this framework. Several experiments are presented to demonstrate the potential of our framework and the improvements to these local algorithms.
Pattern Recognition. 01/2009;