Article

Support vector data description using privileged information

Wiley
Electronics Letters

Abstract

Support vector data description (SVDD) is a data description method that gives the target data set a hypersphere-shaped description and can be used for one-class classification or outlier detection. To further improve its performance, a novel variant called SVDD+ is proposed, which introduces privileged information into the traditional SVDD. This privileged information, which is ignored by classical SVDD but often exists in human learning, improves the training phase through a set of correcting functions. The performance of SVDD+ is demonstrated on data sets from the UCI machine learning repository and on a radar emitter recognition task. The experimental results indicate the validity and advantage of the method.
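The SVDD+ model summarized above follows the general LUPI pattern of replacing SVDD's slack variables with a correcting function evaluated on privileged features. A hedged sketch of the resulting primal problem (the notation φ, φ*, w*, b* is assumed here for illustration, not taken verbatim from the paper):

```latex
\min_{R,\,a,\,w^*,\,b^*} \; R^2 + C \sum_{i=1}^{n} \bigl( \langle w^*, \phi^*(x_i^*) \rangle + b^* \bigr)
\quad \text{s.t.} \quad
\|\phi(x_i) - a\|^2 \le R^2 + \langle w^*, \phi^*(x_i^*) \rangle + b^*,
\qquad
\langle w^*, \phi^*(x_i^*) \rangle + b^* \ge 0,
\quad i = 1, \dots, n.
```

Here the slack of each training point is no longer a free variable ξ_i but the value of a correcting function in the privileged space, so the privileged features shape the hypersphere during training while remaining unnecessary at test time.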


... Over the past decade, researchers have exploited additional information to improve the generalization ability of classifiers [1][2][3][4][5][6][7][8][9][10]. In the real world, additional information generally exists alongside training samples, but traditional classifiers cannot utilize it; this issue was addressed in [11] by proposing a novel framework, i.e., learning using privileged information (LUPI). ...
... In recent years, Wang et al. [1] and Meng et al. [15] developed LUPI framework-based methods for multi-label classification and pedestrian detection, respectively. Researchers have also explored the LUPI framework for the one-class classification (OCC) task [4,6,9]. This paper also focuses on the OCC task using privileged information. ...
... These methods have been further enabled to utilize privileged information during learning. Zhu and Zhong [9] combined the LUPI framework with OCSVM and named the result OCSVM+, and Zhang [6] combined the concept of SVDD with the LUPI framework, which was referred to as SVDD+. Further, Burnaev and Smolyakov [4] regularized the privileged-information term in the formulations of SVDD+ and OCSVM+. ...
Article
Full-text available
In recent years, non-iterative learning approaches for kernel methods have received considerable attention from researchers, and the kernel ridge regression (KRR) approach is one of them. Recently, a KRR-based auto-encoder was developed for the one-class classification (OCC) task and named AEKOC. OCC is generally used for outlier or novelty detection. The brain can detect outliers just by learning from normal samples; similarly, OCC uses only normal samples to train the model, and the trained model can then be used for outlier detection. In this paper, AEKOC is enabled to utilize privileged information, which is generally ignored by AEKOC or any traditional machine learning technique but is usually present in human learning. For this purpose, we have combined the learning using privileged information (LUPI) framework with AEKOC and proposed a classifier referred to as AEKOC+. Privileged information is available only during training, not during testing; therefore, AEKOC is unable to utilize this information for building the model. However, AEKOC+ can efficiently handle the privileged information due to the inclusion of the LUPI framework. Experiments conducted on the MNIST dataset and on various other datasets from the UCI machine learning repository demonstrate the superiority of AEKOC+ over AEKOC. Our formulation shows that AEKOC does not utilize the privileged features in learning; the formulation of AEKOC+, however, lets it learn from the privileged features differently from the other available features and improves the generalization performance of AEKOC. Moreover, AEKOC+ also outperformed two LUPI framework-based one-class classifiers (i.e., OCSVM+ and SSVDD+).
... In recent years, the learning using privileged information (LUPI) framework has become quite popular among researchers [3,5,11,12,16,23,26,28,29]. In the real world, humans learn not just by looking at an object but also by listening to extra information provided by someone. ...
... Further, Celik and McDaniel [4] employed privileged information via generalized distillation. Most recently, the LUPI framework has been employed for the one-class classification (OCC) task [3,28,29] using iterative learning approaches. In this paper, we explore the LUPI framework for the OCC task using a non-iterative learning approach. ...
... Zhu and Zhong [29] extended this concept to OCC using OCSVM, referring to the result as OCSVM+. Further, Zhang [28] developed it for SVDD and named it SVDD+. Most recently, Burnaev and Smolyakov [3] modified the formulations of OCSVM+ and SVDD+ by adding a regularization factor on the privileged feature space. ...
Article
A kernel-based one-class classifier is mainly used for outlier or novelty detection. Kernel ridge regression (KRR) based methods have received quite a lot of attention in recent years due to their non-iterative approach to learning. In this paper, the KRR-based one-class classifier (KOC) is extended for the learning using privileged information (LUPI) framework; the LUPI-based KOC method is referred to as KOC+. The privileged information is available as one or more features of the dataset, but only during training (not during testing). KOC+ utilizes the privileged features differently from the other features: it incorporates them through a so-called correction function, which helps KOC+ achieve better generalization performance. Existing and proposed classifiers are evaluated on datasets taken from the UCI machine learning repository and on the MNIST dataset. Moreover, the experimental results show that KOC+ outperforms KOC and other LUPI-based state-of-the-art one-class classifiers. The source code for this paper is provided on the corresponding author's GitHub homepage: https://github.com/Chandan-IITI/KOCPlus_or_OCKELMPlus_or_OCLSSVMPlus
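The base KOC model that the abstract extends is a kernel ridge regressor trained to map every target sample to the value 1; at test time, the deviation |f(z) - 1| serves as an outlier score. A minimal NumPy sketch of that base model (the RBF kernel, the hyperparameter values, and the toy data are illustrative assumptions; the paper's KOC+ correction function is not reproduced here):

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Gram matrix of the RBF kernel between row sets A and B
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def koc_fit(X, C=10.0, gamma=1.0):
    # KOC trains kernel ridge regression with all targets equal to 1:
    #   alpha = (K + I/C)^{-1} 1
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + np.eye(len(X)) / C, np.ones(len(X)))

def koc_score(X_train, alpha, Z, gamma=1.0):
    # deviation |f(z) - 1|; larger means more outlier-like
    return np.abs(rbf_kernel(Z, X_train, gamma) @ alpha - 1.0)

rng = np.random.default_rng(0)
X = rng.normal(0.0, 0.3, size=(60, 2))       # normal (target) class only
alpha = koc_fit(X)
s_in = koc_score(X, alpha, np.array([[0.0, 0.0]]))[0]   # near the target data
s_out = koc_score(X, alpha, np.array([[3.0, 3.0]]))[0]  # far from the target data
```

The non-iterative character the abstract emphasizes is visible here: training is a single linear solve, with no gradient loop.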
... In recent years, the learning using privileged information (LUPI) framework has become quite popular among researchers [1,2,3,4,5,6,7,8,9,10]. In the real world, humans learn not just by looking at an object but also by listening to extra information provided by someone. ...
... Further, this concept has been explored for various tasks, viz. face verification [12], multi-label classification [1], visual recognition [5], malware detection [4], etc. Most recently, the LUPI framework has been employed for the one-class classification (OCC) task [4,6,9]. ...
... The LUPI concept is inspired by student-teacher learning among human beings, where a student learns from the explanations and comments of his teacher. Further, this concept has been employed with OCSVM and SVDD, creating the one-class classifiers OCSVM+ [9] and SVDD+ [6], respectively. Most recently, Burnaev and Smolyakov [4] modified the formulations of OCSVM+ and SVDD+ by adding a regularization factor on the privileged feature space. ...
Preprint
Full-text available
Kernel method-based one-class classifiers are mainly used for outlier or novelty detection. In this letter, the kernel ridge regression (KRR) based one-class classifier (KOC) is extended for learning using privileged information (LUPI); the LUPI-based KOC method is referred to as KOC+. The privileged information is available as a feature of the dataset, but only for training (not for testing). KOC+ utilizes the privileged information differently from the normal feature information by using a so-called correction function. Privileged information helps KOC+ achieve better generalization performance, which is exhibited in this letter by testing the classifiers with and without privileged information. Existing and proposed classifiers are evaluated on datasets from the UCI machine learning repository and on the MNIST dataset. Moreover, the experimental results evince the advantage of KOC+ over KOC and support vector machine (SVM) based one-class classifiers.
... with the problem statement (5)) is that the parameters ν_r and γ influence the regularization in a dependent manner, i.e. their contributions to the regularization cannot be disentangled. A similar framework is used in [16], where, as in the previous paper, the authors use ordinal numbers of the corresponding subdomains as privileged information. The main difference from paper [15] is that the SVDD algorithm underlies their approach. ...
... The main difference from paper [15] is that the SVDD algorithm underlies their approach. In fact, the authors of [16] propose to solve the following optimization problem in order to find the decision rule: ...
... One more difference from the problem statement proposed in this paper is another approach to modelling the slack variables ξ_i. In our approach we use the linear model (4), but in [16] ...
... In [15] and [16] (we provide a review of the related methods in section IV) the authors describe results of experiments on real datasets. In particular, [15] reports results on four datasets, while [16] uses only two of the datasets considered in [15]. ...
Article
Full-text available
A number of important applied problems in engineering, finance and medicine can be formulated as anomaly detection problems. A classical approach is to describe the normal state using a one-class support vector machine; to detect anomalies, we then quantify the distance from a new observation to the constructed description of the normal class. In this paper we present a new approach to one-class classification. We formulate a new problem statement and a corresponding algorithm that allow privileged information to be taken into account during the training phase. We evaluate the performance of the proposed approach using a synthetic dataset, as well as the publicly available Microsoft Malware Classification Challenge dataset.
... The models for one-class SVM with privileged information closest to our OC-SVM+ were considered by Zhu and Zhong [57] and Zhang [81]. Burnaev and Smolyakov [20] compared these algorithms with the offline version of OC-SVM+ and found that they are comparable in anomaly detection accuracy. ...
... Their formulations are close to (8)-(9); however, there are no privileged features x*, and each group of data examples has its own set of parameters w*, ϕ*, and b*. A similar approach proposed by Zhang [81] incorporates privileged features based on the SVDD formulation (1)-(2). ...
Article
Full-text available
One of the powerful techniques in data modeling is accounting for features that are available at the training stage but not when the trained model is used to classify or predict test data: the Learning Using Privileged Information paradigm (LUPI; Vapnik and Vashist [1]). The Sequential Minimal Optimization (SMO) method has already been developed for supervised Support Vector Machines (SVM) by Platt [2] and Keerthi et al. [3], for unsupervised (one-class) SVM by Schölkopf et al. [4], and for SVM with privileged information (SVM+) by Pechyony and Vapnik [5]. The missing brick in this line of research has long been a one-class SVM with privileged information (OC-SVM+). In this paper, we propose an SMO algorithm for OC-SVM+ that significantly outperforms non-sequential algorithms for training the OC-SVM+ model, and we establish its finite-time convergence. The experiments show how privileged information affects the descriptive domain in the space of original features, and comparative benchmark tests demonstrate that our algorithm is superior to interior-point algorithms.
... Recently, a new learning paradigm named LUPI [18] has been extended to SVM [19] and applied to similarity control [20], emotion recognition [21] and data clustering [22]. Zhang [23] developed an incomplete SVDD+ model using privileged information, but some critical limitations exist. Firstly, it is not clear how the group information is obtained and what the boundary is. ...
... The privileged information x* often exists in real applications, so it can be used to improve classification performance. During the training phase, the training data set T can be expanded to triplets [23]. The optimization objective function of ESVDD is formulated as: ...
Article
Full-text available
A novel machine learning method named extended support vector data description with negative examples (ESVDD-neg) is developed to classify the fast Fourier transform magnitude feature of complex high-resolution range profiles (HRRPs), motivated by the problem of radar automatic target recognition. The proposed method not only inherits the close non-linear boundary advantage of the support vector data description with negative examples model but also incorporates a new learning paradigm, learning using privileged information, into the model. This leads to an appealing application that makes no assumptions regarding the distribution of the data and needs fewer training samples and less prior information. Besides, the second-order central moment is selected as privileged information for better recognition performance, weakening the effect of translation sensitivity, and normalisation contributes to eliminating amplitude sensitivity. Hence, there is a remarkable improvement in recognition accuracy, not only with small training datasets but also under low signal-to-noise ratio conditions. Numerical experiments based on two public UCI datasets and the HRRPs of four aircraft demonstrate the feasibility and superiority of the proposed method. The noise-robust ESVDD-neg is ideal for HRRP-based radar target recognition.
... The hyperellipsoid-based domain description method (one-hyperellipsoid model) proposed in section III (A) is compared with four other outlier detection methods: local outlier factor (LOF), OneClassSVM [15], support vector domain description (SVDD) [16] and isolation forest [17]. The idea of LOF is to find the average density of the location of a sample point within a specified range. ...
Article
Full-text available
Big data is usually massive, diverse, time-varying, and high-dimensional. The focus of this paper is the domain description of big data, which is the basis for solving the above problems. This paper makes three main contributions. Firstly, a one-hyperellipsoid model is proposed to analyze the domain description of big data. The parameters of the hyperellipsoid model can be adjusted adaptively according to the proposed objective function without relying on manual parameter selection, which expands the application range of the model. Secondly, an improved FDPC algorithm is proposed to generate multiple hyperellipsoid models that approximate the spatial distribution of big data, thus improving the accuracy of the domain description. Multiple hyperellipsoid models not only greatly eliminate the spatial redundancy of the one-hyperellipsoid description but also provide a feasible method for describing complex spatial distributions. Thirdly, an online domain description algorithm based on hyperellipsoid models is proposed, which improves the robustness of the hyperellipsoid models on time-varying data; its parallel processing flow is given. In the experiments, synthetic instances and real-world datasets were used to test the performance of the hyperellipsoid models. Compared with LOF, OneClassSVM, SVDD and isolation forest, the performance of the proposed method is competitive and promising.
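The simplest instance of a hyperellipsoid domain description is a mean/covariance (Mahalanobis) model: fit a centre and covariance to the target data, then accept points whose squared Mahalanobis distance falls under a radius. This is only a minimal sketch of the idea, not the paper's adaptive objective or its multi-ellipsoid FDPC extension; the 95th-percentile radius and the toy data are assumptions:

```python
import numpy as np

def fit_ellipsoid(X):
    # centre and inverse covariance define the ellipsoid's geometry
    mu = X.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
    return mu, cov_inv

def mahalanobis2(z, mu, cov_inv):
    # squared Mahalanobis distance of z from the ellipsoid centre
    d = z - mu
    return float(d @ cov_inv @ d)

rng = np.random.default_rng(1)
# anisotropic target cloud: stretched along x, compressed along y
X = rng.normal(0.0, 1.0, size=(500, 2)) @ np.array([[2.0, 0.0], [0.0, 0.5]])
mu, cov_inv = fit_ellipsoid(X)
# squared radius chosen as the 95th percentile of the training scores
r2 = np.percentile([mahalanobis2(x, mu, cov_inv) for x in X], 95)
inside = mahalanobis2(np.array([0.5, 0.1]), mu, cov_inv) <= r2
outside = mahalanobis2(np.array([10.0, 10.0]), mu, cov_inv) <= r2
```

Unlike a hypersphere, the ellipsoid adapts its extent per direction, which is why it wastes less space on anisotropic data.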
... the standard soft-margin one-class ν-SVM formulation or SVDD [24][25][26][27]. Also, privileged information can effectively help in object classification or age estimation [28][29][30]. ...
Article
Full-text available
Many classification models based on the support vector machine have been designed to improve classification performance in both supervised and semi-supervised learning. One line of work concerns the use of privileged information hidden in the training data; the challenge, however, is how to find this privileged information. In most research, experts have defined the privileged information, but in this paper we attempt to automatically select a feature as privileged information and to classify the training data into several groups. This grouping is used to correct the decision function of the classifier. Moreover, the proposed classifier is used in a one-against-all (OAA) approach for semi-supervised datasets. To overcome uncertain areas in OAA, belief function and active learning techniques are applied to extract the most informative samples. The experimental results indicate the superiority of the proposed method over other state-of-the-art methods in terms of classification accuracy.
... Tax and Duin proposed SVDD, inspired by the support vector machine (SVM) [6]. Various ideas for improving the performance of SVDD have since been proposed, such as using privileged information [7,8] and density weighting [9]. Besides, based on equality constraints instead of inequality ones, the least squares support vector machine (LSSVM) was proposed for classification and regression [10,11], respectively. ...
Article
A risk monitoring method based on normal region estimation (NRE) is systematically proposed for the practical situation in which fault data are lacking in the condition identification and monitoring of railway vehicle bearings. First, the basic concept of normal domain theory is expounded and the formal expression of the normal domain is given. Secondly, the academic ideas and implementation steps of risk monitoring based on NRE are summarized. Then, two algorithms, based on the convex hull and on support vector data description (SVDD) respectively, are proposed to solve the core problem of boundary estimation. Finally, rolling-bearing vibration acceleration data were used for the experiment, and the performance of the two algorithms was compared. The results show that both algorithms are effective: the convex hull algorithm is faster, while the SVDD algorithm is smoother and more flexible. In practical applications, the two algorithms can be selected according to different requirements for real-time performance and accuracy. © 2019 Taylor & Francis Group, LLC and The University of Tennessee.
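The convex hull variant of the boundary estimation described above reduces to a membership test: a new observation is inside the normal region iff it lies in the hull of the training points. A minimal sketch using SciPy's Delaunay triangulation (the uniform toy data stands in for the paper's vibration features, which are assumptions here):

```python
import numpy as np
from scipy.spatial import Delaunay

def hull_contains(points, z):
    # a query point lies inside the convex hull iff the Delaunay
    # triangulation places it in some simplex (find_simplex >= 0)
    return Delaunay(points).find_simplex(z) >= 0

rng = np.random.default_rng(2)
normal = rng.uniform(-1.0, 1.0, size=(200, 2))  # stand-in for normal-state features
inside = bool(hull_contains(normal, np.array([[0.0, 0.0]]))[0])
outside = bool(hull_contains(normal, np.array([[5.0, 5.0]]))[0])
```

This illustrates the trade-off the abstract reports: the hull test is fast and exact but piecewise-linear, whereas SVDD yields a smoother, more flexible boundary.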
Article
The pattern recognition of partial discharge (PD) is critical for evaluating the insulation condition of high-voltage electric equipment. However, much attention has been paid to recognising known PD types, while types that did not appear previously have been ignored. To solve this problem, a method to recognise unknown PD types based on an improved support vector data description (SVDD) algorithm is introduced in this study. A tri-training algorithm and double thresholds set by the Otsu algorithm are used to improve the traditional SVDD classifiers. PD samples collected from different artificial defect models are finally classified by an improved fuzzy c-means clustering algorithm. Experiments compared the improved SVDD with existing one-class classification methods such as SVDD, one-class support vector machine and probability density function estimation. The results show that the proposed method has much higher recognition accuracy, verifying that the improved SVDD is an efficient method that can be applied to the recognition of unknown PD types.
Article
Full-text available
Data domain description concerns the characterization of a data set. A good description covers all target data but includes no superfluous space. The boundary of a dataset can be used to detect novel data or outliers. We will present the Support Vector Data Description (SVDD) which is inspired by the Support Vector Classifier. It obtains a spherically shaped boundary around a dataset and analogous to the Support Vector Classifier it can be made flexible by using other kernel functions. The method is made robust against outliers in the training set and is capable of tightening the description by using negative examples. We show characteristics of the Support Vector Data Descriptions using artificial and real data.
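The spherical boundary described in this abstract comes from SVDD's dual problem, which for an RBF kernel (where K(x,x)=1) reduces to minimizing αᵀKα over the simplex. A hedged hard-margin sketch using Frank-Wolfe steps (the optimizer choice, kernel width, and toy data are assumptions, not Tax and Duin's formulation; the soft-margin version additionally caps each α_i at C):

```python
import numpy as np

def rbf(A, B, gamma=2.0):
    # RBF Gram matrix between row sets A and B
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def svdd_fit(X, gamma=2.0, iters=500):
    # Hard-margin SVDD dual with an RBF kernel:
    #   min_alpha  alpha^T K alpha   s.t.  alpha >= 0, sum(alpha) = 1,
    # solved here with Frank-Wolfe steps over the probability simplex.
    K = rbf(X, X, gamma)
    alpha = np.ones(len(X)) / len(X)
    for t in range(iters):
        g = 2.0 * K @ alpha            # gradient of the dual objective
        j = int(np.argmin(g))          # best simplex vertex to move toward
        step = 2.0 / (t + 2.0)         # standard Frank-Wolfe schedule
        alpha = (1.0 - step) * alpha
        alpha[j] += step
    return alpha

def svdd_dist2(X_train, alpha, Z, gamma=2.0):
    # squared kernel distance from phi(z) to the sphere centre a = sum_i alpha_i phi(x_i)
    return (1.0 - 2.0 * rbf(Z, X_train, gamma) @ alpha
            + alpha @ rbf(X_train, X_train, gamma) @ alpha)

rng = np.random.default_rng(3)
X = rng.normal(0.0, 0.25, size=(40, 2))                 # target data set
alpha = svdd_fit(X)
d_in = svdd_dist2(X, alpha, np.array([[0.0, 0.0]]))[0]   # near the centre
d_out = svdd_dist2(X, alpha, np.array([[2.0, 2.0]]))[0]  # far outside
```

The radius is set by the distance of the support vectors (points with α_i > 0); a test point is rejected when its distance to the centre exceeds it, which is the one-class decision rule the abstract describes.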
Article
In this paper, we derive a new one-class Support Vector Machine (SVM) based on hidden information. Taking into account the fact that in some applications, the training instances are rather limited, we attempt to utilize the additional information hidden in the training data. We demonstrate the performance of the new one-class SVM on several publicly available data sets from UCI machine learning repository and also present the comparison with the standard one-class SVM. The experimental results indicate the validity and advantage of the new one-class SVM.
Article
One-class classification (OCC) has received a lot of attention because of its usefulness in the absence of statistically-representative non-target data. In this situation, the objective of OCC is to find the optimal description of the target data in order to better identify outlier or non-target data. An example of OCC, support vector data description (SVDD) is widely used for its flexible description boundaries without the need to make assumptions regarding data distribution. By mapping the target dataset into high-dimensional space, SVDD finds the spherical description boundary for the target data. In this process, SVDD considers only the kernel-based distance between each data point and the spherical description, not the density distribution of the data. Therefore, it may happen that data points in high-density regions are not included in the description, decreasing classification performance. To solve this problem, we propose a new SVDD introducing the notion of density weight, which is the relative density of each data point based on the density distribution of the target data using the k-nearest neighbor (k-NN) approach. Incorporating the new weight into the search for an optimal description using SVDD, this new method prioritizes data points in high-density regions, and eventually the optimal description shifts to these regions. We demonstrate the improved performance of the new SVDD by using various datasets from the UCI repository.
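The k-NN density weight described above can be computed directly from pairwise distances: each point's weight is its relative density, estimated from the radius to its k-th nearest neighbour. A minimal sketch of that weighting step (the normalisation by the maximum density and the two-cluster toy data are assumptions; the full method then uses the weights inside the SVDD optimisation):

```python
import numpy as np

def knn_density_weights(X, k=5):
    # relative density of each point from its k-NN radius:
    # dense points (small radius) get weights near 1, sparse points lower
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)                 # exclude self-distances
    r_k = np.sqrt(np.sort(d2, axis=1)[:, k - 1]) # distance to k-th neighbour
    density = 1.0 / (r_k + 1e-12)
    return density / density.max()               # relative density in (0, 1]

rng = np.random.default_rng(4)
X = np.vstack([rng.normal(0.0, 0.1, size=(50, 2)),   # dense cluster
               rng.normal(3.0, 1.0, size=(10, 2))])  # sparse cluster
w = knn_density_weights(X)
# a per-sample penalty C_i = w_i * C would then replace the single C,
# prioritising high-density points when the SVDD description is optimised
```

Because dense-region points receive larger weights, excluding them becomes more expensive and the optimal description shifts toward high-density regions, as the abstract argues.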
Article
Support vector data description (SVDD) is a data description method that gives the target data set a spherically shaped description and can be used for outlier detection or classification. In real life the target data set often contains more than one class of objects, and each class needs to be described and distinguished simultaneously. In this case, traditional SVDD can only give a single description of the whole target data set, regardless of the differences between the target classes, or give a separate description for each class of objects. In this paper, an improved method named two-class support vector data description (TC-SVDD) is presented. If the target data set contains two classes of objects, the proposed method gives each class a hypersphere-shaped description simultaneously. The characteristics of the improved support vector data descriptions are discussed. The results of the proposed approach on artificial and real data show that it works quite well on the 3-class classification problem with one object class being severely undersampled.
Article
In the Afterword to the second edition of the book "Estimation of Dependences Based on Empirical Data" by V. Vapnik, an advanced learning paradigm called Learning Using Hidden Information (LUHI) was introduced. This Afterword also suggested an extension of the SVM method (the so called SVM(gamma)+ method) to implement algorithms which address the LUHI paradigm (Vapnik, 1982-2006, Sections 2.4.2 and 2.5.3 of the Afterword). See also (Vapnik, Vashist, & Pavlovitch, 2008, 2009) for further development of the algorithms. In contrast to the existing machine learning paradigm where a teacher does not play an important role, the advanced learning paradigm considers some elements of human teaching. In the new paradigm along with examples, a teacher can provide students with hidden information that exists in explanations, comments, comparisons, and so on. This paper discusses details of the new paradigm and corresponding algorithms, introduces some new algorithms, considers several specific forms of privileged information, demonstrates superiority of the new learning paradigm over the classical learning paradigm when solving practical problems, and discusses general questions related to the new ideas.
Article
A simple yet effective unsupervised classification rule to discriminate between normal and abnormal data is to accept test objects whose nearest-neighbor distances in a reference data set, assumed to model normal behavior, lie within a certain threshold. This work investigates the effect of using a subset of the original data set as the reference set of the classifier. To this end, the concept of a reference consistent subset is introduced, and it is shown that finding the minimum-cardinality reference consistent subset is intractable. The CNNDD algorithm is then described, which computes a reference consistent subset with only two reference-set passes. Experimental results revealed the advantages of condensing the data set and confirmed the effectiveness of the proposed approach. A thorough comparison with related methods was carried out, pointing out the strengths and weaknesses of one-class nearest-neighbor-based training-set consistent condensation.
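The acceptance rule this abstract starts from can be stated in a few lines: accept a test object iff its nearest-neighbour distance to the reference set is within a threshold. A minimal sketch of that rule (the uniform toy reference set and the threshold value are assumptions; the paper's contribution, condensing the reference set, is not reproduced here):

```python
import numpy as np

def nn_accept(reference, z, threshold):
    # accept z as normal iff its nearest-neighbour distance
    # to the reference set lies within the threshold
    d = np.sqrt(((reference - z) ** 2).sum(axis=1))
    return bool(d.min() <= threshold)

rng = np.random.default_rng(5)
ref = rng.uniform(0.0, 1.0, size=(300, 2))  # reference set modelling normal behaviour
accept_in = nn_accept(ref, np.array([0.5, 0.5]), threshold=0.2)
accept_out = nn_accept(ref, np.array([4.0, 4.0]), threshold=0.2)
```

Each query costs one pass over the reference set, which is exactly why the paper's condensation of that set into a smaller consistent subset pays off.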