Rapid and brief communication: Improving support vector data description using local density degree

Pattern Recognition (Impact Factor: 2.58). 01/2005; 38(10):1768-1771. DOI: 10.1016/j.patcog.2005.03.020
Source: DBLP

ABSTRACT: We propose a new support vector data description (SVDD) incorporating the local density of a training data set by introducing a local density degree for each data point. By using a density-induced distance measure based on the degree, we reformulate a conventional SVDD. Experiments with various real data sets show that the proposed method more accurately describes training data sets than the conventional SVDD in all tested cases.
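The abstract does not give the paper's actual density degree or density-induced distance measure, so the sketch below is only an illustration of the general idea: assign each point a degree from its k-nearest-neighbor distance, then let the degrees rescale pairwise distances. The function names, the exponential normalization, and the product form of the rescaled distance are all assumptions, not the published formulation.

```python
import math

def local_density_degree(X, k=2):
    """Illustrative local density degree from k-NN distances.

    Points whose k-th nearest neighbor is close (dense regions) get a
    degree near 1; isolated points get a smaller degree. The exact
    normalization is an assumption, not the paper's formula.
    """
    knn_dists = []
    for i, xi in enumerate(X):
        dists = sorted(math.dist(xi, xj) for j, xj in enumerate(X) if j != i)
        knn_dists.append(dists[k - 1])
    d_max = max(knn_dists)
    # Degrees lie in (1/e, 1]; larger means denser neighborhood.
    return [math.exp(-d / d_max) for d in knn_dists]

def density_induced_sq_dist(xi, xj, rho_i, rho_j):
    """One plausible density-induced squared distance: scale the ordinary
    squared distance by both points' density degrees, so distances among
    dense-region points are emphasized relative to sparse regions."""
    return rho_i * rho_j * math.dist(xi, xj) ** 2
```

Plugging such a rescaled distance into the SVDD kernel changes which points end up as support vectors, which is the mechanism the abstract describes.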

  • ABSTRACT: One-class classification (OCC) has received a lot of attention because of its usefulness in the absence of statistically representative non-target data. In this situation, the objective of OCC is to find the optimal description of the target data in order to better identify outlier or non-target data. An example of OCC, support vector data description (SVDD) is widely used for its flexible description boundaries without the need to make assumptions regarding data distribution. By mapping the target dataset into high-dimensional space, SVDD finds the spherical description boundary for the target data. In this process, SVDD considers only the kernel-based distance between each data point and the spherical description, not the density distribution of the data. Therefore, it may happen that data points in high-density regions are not included in the description, decreasing classification performance. To solve this problem, we propose a new SVDD introducing the notion of density weight, which is the relative density of each data point based on the density distribution of the target data using the k-nearest neighbor (k-NN) approach. Incorporating the new weight into the search for an optimal description using SVDD, this new method prioritizes data points in high-density regions, and eventually the optimal description shifts to these regions. We demonstrate the improved performance of the new SVDD by using various datasets from the UCI repository.
    Expert Systems with Applications 01/2014; 41(7):3343–3350. · 1.85 Impact Factor
  • ABSTRACT: Support vector data description is a data description method that gives a target data set a spherically shaped description. However, when the target data set contains two classes of objects, the method treats the set as a whole and produces only a single description for the training data. This paper presents an improved support vector data description that, when the target data set contains two classes of objects, gives each class its own hyper-spherically shaped boundary.
    International Conference on Machine Learning and Cybernetics, ICMLC 2010, Qingdao, China, July 11-14, 2010, Proceedings; 01/2010
  • ABSTRACT: Support vector domain description is a robust data domain description method, but its performance is strongly influenced by the kernel parameter. In this paper, we present a novel parameter-optimizing algorithm based on the idea that the optimal parameter leads to a hypersphere-shaped distribution of the mapped data in the feature space. First, using an orthogonal basis of the subspace spanned by the mapped data, we give a way to capture the structure of the entire mapped data, which avoids the problem that the mapped data cannot be expressed in explicit form. Second, based on a maximum-entropy non-Gaussianity measure, we present a new criterion for estimating how close a distribution is to a hypersphere, and use it to select a suitable kernel parameter. Experiments on simulated and real-world data demonstrate the effectiveness of the proposed method.
    ACTA AUTOMATICA SINICA 01/2008; 34(9).
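The density-weight idea in the first cited paper above (Expert Systems with Applications, 2014) can be sketched as follows: compute a k-NN-based weight per point and use it to scale each point's upper bound in the SVDD dual, maximize Σᵢ αᵢK(xᵢ,xᵢ) − Σᵢⱼ αᵢαⱼK(xᵢ,xⱼ) subject to Σᵢ αᵢ = 1 and 0 ≤ αᵢ ≤ C·wᵢ, so that dense-region points can carry more dual weight. The Frank-Wolfe solver, the exponential weight formula, and all parameter choices below are assumptions for illustration, not the authors' procedure.

```python
import math

def rbf(a, b, sigma=1.0):
    """Gaussian (RBF) kernel; rbf(z, z) == 1."""
    return math.exp(-math.dist(a, b) ** 2 / (2 * sigma ** 2))

def knn_density_weight(X, k=2):
    """Relative density weight per point: points in dense regions
    (small k-NN distance) get weights near 1, sparse points less."""
    d = []
    for i, xi in enumerate(X):
        nn = sorted(math.dist(xi, xj) for j, xj in enumerate(X) if j != i)
        d.append(nn[k - 1])
    d_max = max(d)
    return [math.exp(-di / d_max) for di in d]

def svdd_fit(X, C=0.5, weights=None, sigma=1.0, iters=300):
    """Solve the SVDD dual with per-point caps C*w_i by Frank-Wolfe."""
    n = len(X)
    w = weights if weights is not None else [1.0] * n
    caps = [C * wi for wi in w]
    assert sum(caps) >= 1.0, "caps too tight: sum(alpha) == 1 is infeasible"
    K = [[rbf(X[i], X[j], sigma) for j in range(n)] for i in range(n)]
    # Feasible start: fill alpha greedily up to the caps.
    alpha, rem = [0.0] * n, 1.0
    for i in range(n):
        alpha[i] = min(caps[i], rem)
        rem -= alpha[i]
    for t in range(iters):
        Ka = [sum(K[i][j] * alpha[j] for j in range(n)) for i in range(n)]
        g = [K[i][i] - 2.0 * Ka[i] for i in range(n)]  # gradient of the dual
        # Linear subproblem over the capped simplex: mass where g is largest.
        s, rem = [0.0] * n, 1.0
        for i in sorted(range(n), key=lambda i: -g[i]):
            s[i] = min(caps[i], rem)
            rem -= s[i]
            if rem <= 0.0:
                break
        gamma = 2.0 / (t + 2.0)
        alpha = [(1 - gamma) * a + gamma * si for a, si in zip(alpha, s)]
    return alpha, K

def sq_dist_to_center(z, X, alpha, K, sigma=1.0):
    """Squared feature-space distance from z to the learned SVDD center."""
    n = len(X)
    cross = sum(alpha[i] * rbf(z, X[i], sigma) for i in range(n))
    aa = sum(alpha[i] * alpha[j] * K[i][j] for i in range(n) for j in range(n))
    return 1.0 - 2.0 * cross + aa
```

Because an isolated point gets a small weight and hence a small cap C·wᵢ, it cannot dominate the solution, and the description center stays near the high-density region, which is the behavior the cited abstract describes.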