Fig 3 - uploaded by Youwei Luo
Content may be subject to copyright.
Unrelated groups generated by NFS and PLRC when the test set label (Anka) is different to the training set label (Smith). Ideally, all Anka images should be grouped into the unrelated group to produce a smaller unrelated distance. In PLRC method, some Anka images may not be properly grouped into the unrelated group due to style, color or other possible reasons. Consequently, the unrelated distance derived from PLRC is larger than that from NFS.
Source publication
Image set recognition has been widely applied in many practical problems like real-time video retrieval and image caption tasks. Due to its superior performance, it has grown into a significant topic in recent years. However, images with complicated variations, e.g., postures and human ages, are difficult to address, as these variations are continu...
Contexts in source publication
Context 1
... Note that U k is independent to the test set Y l , so the step of calculating distance in PLRC is not required in NFS. Fig. 3 shows the unrelated groups generated by NFS and PLRC. Because of their style, color or other complicated factors, PLRC fails to take all Anka images into the unrelated group. Consequently, the McCartney and Hawking images are considered to be members of an unrelated group, resulting in a larger unrelated distance. However, NFS avoids ...
Context 2
... images are randomly divided into three parts: training set, validation set and test set. The number parameter previously mentioned is set as (n train , n valid , n test ) = (3, 3, 3), and the experiments are conducted randomly for 30 times. The average recognition rate (RR) and standard error (STE) are reported in Tab. ...
Citations
... So far, many research works have been made in computer vision [7][8][9][10][11], especially in the image set classification [1][2][3][4][12][13][14][15][16][17][18] field. In this subsection, as shown in Figure 2, we briefly review two types of image set classification methods: static and dynamic modeling methods. ...
... However, in DLRC, in each distance calculation, only two image sets were used: it cannot use the useful information from other gallery image sets. Based on DLRC, two new algorithms, Pairwise Linear Regression Classification (PLRC) [17] and Discriminative Residual Analysis (DRA) [18], were then developed by introducing different unrelated subspaces. Specifically, PLRC maximized the unrelated distance between two image sets and minimized the related distance to improve the classification results. ...
As an important research direction in image and video processing, set-based video recognition requires speed and accuracy. However, the existing static modeling methods focus on computational speed but ignore accuracy, whereas the dynamic modeling methods are higher-accuracy but ignore the computational speed. Combining these two types of methods to obtain fast and accurate recognition results remains a challenging problem. Motivated by this, in this study, a novel Manifolds-based Low-Rank Dictionary Pair Learning (MbLRDPL) method was developed for a set-based video recognition/image set classification task. Specifically, each video or image set was first modeled as a covariance matrix or linear subspace, which can be seen as a point on a Riemannian manifold. Second, the proposed MbLRDPL learned discriminative class-specific synthesis and analysis dictionaries by clearly imposing the nuclear norm on the synthesis dictionaries. The experimental results show that our method achieved the best classification accuracy (100%, 72.16%, 95%) on three datasets with the fastest computing time, reducing the errors of state-of-the-art methods (JMLC, DML, CEBSR) by 0.96–75.69%.
... The image set classi¯cation problem has been studied in many works especially in face recognition applications. 34,37 In this work, we used a consensus prediction method 30 for our game-wise classication due to its simplicity. The consensus prediction is equivalent to the voting/ averaging ensemble method but an individual result comes from di®erent inputs (images) in a set instead of di®erent prediction models. ...
Icons and screenshots are important media displayed in game distribution platforms for providing a brief understanding of the game content to the customers. In this study, we develop ensemble convolutional neural networks for icon and screenshot analysis as three applications: an automatic genre classification, a similar game searching, and a recognition quality assessment. First, the genre classifier is developed using 154358 images from 18 030 games in 17 genres. The proposed genre classifiers achieve 40.5% and 47.6% accuracies for classifying a single icon and a single screenshot, which outperform the average performance of the human testers. The accuracy can be boosted to 54.2% by aggregating results from every image of the game. The Grad-CAM is applied to analyze what models learned. Then, the feature extraction part trained by this task is transferred to the other two applications. For the similar game searching, a dissimilarity of two images is directly computed by the Euclidean distance in the feature space. We define a dissimilarity between two games which are sets of multiple images based on their image-pairwise dissimilarity. The results show that the features are successfully transferred, and the model seems to be able to cluster the games with a similar gameplay and differentiate them from the other gameplays even if they come from the same genre. For the third application, we develop a system for quality assessment of game images based on the correctness of viewers’ understanding of game content by combining multiple models from three different problem definitions. Our system can identify good-genre-representing game images which most of the human testers can recognize their genre correctly with 75.0% accuracy for icons and 76.2% accuracy for screenshots.
... Then an alternating optimization algorithm is developed to solve the model. Ren et al. [36] exploit the discriminant features and present a discriminative residual analysis approach to improve the image set classification performance. ...
In image set classification, dual linear regression classification (DLRC) has shown the excellent performance on face image data without the interference of the complex background. However, DLRC could not well identify the data set with the complex background. The complex background means that the background is cluttered and the viewpoint is unusual or the object is partially occluded. This paper proposes a new model, kernelized dual regression (KDR), based on DLRC and the kernel trick which is a useful technique in image classification. Different from DLRC, KDR adopts a block partitioning strategy to extract the local information, which is able to conquer the shortcoming of DLRC. To capture the nonlinear relationship between the training set and test set, KDR tactfully maps these image sets into a high-dimensional feature space by adopting the nonlinear mapping associated with the Gaussian kernel function. In the reproducing kernel Hilbert space (RKHS), KDR can find the joint coefficients by minimizing the distance between training set and test set, and has a closed-form solution. Extensive experiments on four datasets show that KDR could achieve better classification performance than that of DLRC and other existing methods.
... As a development of DLRC, pairwise linear regression model (PLRC) [21] considers a new unrelated subspace as well. On the basis of PLRC, the recently proposed discriminative residual analysis (DAR) [31] obtains discriminant features and then projects the gallery set and probe set into the discriminant subspace for improving the classification performance. Recently, deep learning (e.g., DRM-MV) [32] is also gradually applied to image set classification tasks. ...
... The latest experiment performs with the various resolutions and the deep feature. To observe the effect by using the deep features, we add several comparison methods, such as DLRC [30] and DAR [31]. All comparison methods are performed adopted the source codes, which are given by the authors' homepage. ...
... In LFW database, there are The best results are shown in bold Neural Computing and Applications faces images with various illumination, poses and partially obscured. We perform the alignment version LFW-a [31,42]. Similar to way [30], we resize all face images to 90 Â 78. ...
Since image set classification has strong power to overcome various variations in illumination, expression, pose, and so on, it has drawn extensive attention in recent years. Noteworthily, the point-to-point distance-based methods have achieved the promising performance, which aim to compute the similarity between each gallery set and the probe set for classification purpose.
Nevertheless, these existing methods have to face the following problems: (1) they do not take full advantage of the between-set discrimination information; (2) they ideally presume that the importance of different gallery sets is equal, whereas this always violates objective facts and may degenerate algorithm performance in practice; (3) they tend to have high computational cost and several parameters, though explicit sparsity can enhance discrimination. To address these problems, we propose a novel method for face image set classification, namely self-weighted latent sparse discriminative learning (SLSDL). Specifically, a novel self-weighted strategy guided discrimination term is proposed to largely boost the discrimination of different gallery sets, such that the effect of true sets can be boosted while the effect of false sets can be weakened or removed. Moreover, we propose a latent sparse normalization to reduce computational complexity as well as the number of trade-off parameters. In addition, we propose an efficient optimization algorithm to solve the final SLSDL. Comprehensive experiments on four public benchmark datasets demonstrate that SLSDL is superior to the state-of-the-art competitors.
... One of the most common assumptions in UDA based on a statistical distribution, the alignment based on covariance matrices, that lie on the Riemannian manifold, equips the domain with the manifold and statistical properties. Motivated by previous attempts [32], [40], our work aims to embed a graphbased discriminant criterion to the target domain, and align the source and target domain based on the manifold assumption. ...
Unsupervised domain adaptation is effective in leveraging rich information from a labeled source domain to an unlabeled target domain. Though deep learning and adversarial strategy made a significant breakthrough in the adaptability of features, there are two issues to be further studied. First, hard-assigned pseudo labels on the target domain are arbitrary and error-prone, and direct application of them may destroy the intrinsic data structure. Second, batch-wise training of deep learning limits the characterization of the global structure. In this paper, a Riemannian manifold learning framework is proposed to achieve transferability and discriminability simultaneously. For the first issue, this framework establishes a probabilistic discriminant criterion on the target domain via soft labels. Based on pre-built prototypes, this criterion is extended to a global approximation scheme for the second issue. Manifold metric alignment is adopted to be compatible with the embedding space. The theoretical error bounds of different alignment metrics are derived for constructive guidance. The proposed method can be used to tackle a series of variants of domain adaptation problems, including both vanilla and partial settings. Extensive experiments have been conducted to investigate the method and a comparative study shows the superiority of the discriminative manifold learning framework.
... One of the most common assumptions in UDA based on a statistical distribution, the alignment based on covariance matrices, that lie on the Riemannian manifold, equips the domain with the manifold and statistical properties. Motivated by previous attempts [32], [40], our work aims to embed a graphbased discriminant criterion to the target domain, and align the source and target domain based on the manifold assumption. ...
Unsupervised domain adaptation is effective in leveraging rich information from a labeled source domain to an unlabeled target domain. Though deep learning and adversarial strategy made a significant breakthrough in the adaptability of features, there are two issues to be further studied. First, hard-assigned pseudo labels on the target domain are arbitrary and error-prone, and direct application of them may destroy the intrinsic data structure. Second, batch-wise training of deep learning limits the characterization of the global structure. In this paper, a Riemannian manifold learning framework is proposed to achieve transferability and discriminability simultaneously. For the first issue, this framework establishes a probabilistic discriminant criterion on the target domain via soft labels. Based on pre-built prototypes, this criterion is extended to a global approximation scheme for the second issue. Manifold metric alignment is adopted to be compatible with the embedding space. The theoretical error bounds of different alignment metrics are derived for constructive guidance. The proposed method can be used to tackle a series of variants of domain adaptation problems, including both vanilla and partial settings. Extensive experiments have been conducted to investigate the method and a comparative study shows the superiority of the discriminative manifold learning framework.
At present, the development of the Internet of Things (IoT) has become a significant symbol of the information age. And the video monitoring system is an important basic work in the IoT system. However, many existing video monitoring system usually use feature embedding method to learn more discriminative feature representation, while this manner is very time‐consuming. And the learned features usually do not match the subsequent classifiers. To solve these issues, this paper proposes a new video‐based fast image set classification framework, which consists of fast feature learning part and representation learning based classifier part. Extensive experiments on several well‐known benchmark datasets demonstrate the effectiveness and efficiency of the proposed framework.