Conference Paper

Multi-scale Feature Imitation for Unsupervised Anomaly Localization


Abstract

The unsupervised anomaly localization task is challenging due to the absence of abnormal samples during the training phase, the need to handle multiple anomaly types for the same object, and the requirement to detect unseen anomalies. To address these problems, we propose a novel approach consisting of a separate teacher-student feature imitation network and a multi-scale processing strategy that combines an image pyramid with a feature pyramid. Additionally, we design a side task that optimizes the weight of each student network block through a gradient descent algorithm. Experimental results demonstrate that, compared with other anomaly localization methods based on feature modeling, our proposed method performs better on the MVTec dataset, a real-world industrial product inspection dataset. Furthermore, our multi-scale strategy effectively improves performance over the benchmark method.
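Because the full text is not available, the sketch below only illustrates the general teacher-student feature-imitation pattern with an image pyramid that the abstract describes; the backbone choice (ResNet-18), the cosine-distance scoring, and all hyperparameters are assumptions rather than the authors' implementation.

```python
# Hedged sketch: a frozen pretrained teacher and a from-scratch student; the student
# is trained on defect-free images to regress the teacher's intermediate features,
# and test-time anomaly maps average feature discrepancies over an image pyramid.
import torch
import torch.nn.functional as F
from torchvision.models import resnet18, ResNet18_Weights

def backbone_features(model, x):
    # Collect intermediate feature maps from the first three ResNet stages.
    feats = []
    x = model.maxpool(model.relu(model.bn1(model.conv1(x))))
    for stage in (model.layer1, model.layer2, model.layer3):
        x = stage(x)
        feats.append(x)
    return feats

teacher = resnet18(weights=ResNet18_Weights.DEFAULT).eval()
student = resnet18(weights=None)          # trained only on normal images
for p in teacher.parameters():
    p.requires_grad_(False)

def imitation_loss(image):
    # Training objective on normal images: the student mimics the teacher's features.
    return sum(F.mse_loss(fs, ft) for fs, ft in
               zip(backbone_features(student, image), backbone_features(teacher, image)))

def anomaly_map(image, scales=(1.0, 0.5)):
    # Average per-pixel feature discrepancies over image-pyramid scales and stages.
    maps = []
    for s in scales:
        x = F.interpolate(image, scale_factor=s, mode="bilinear", align_corners=False)
        for ft, fs in zip(backbone_features(teacher, x), backbone_features(student, x)):
            d = 1 - F.cosine_similarity(ft, fs, dim=1, eps=1e-6)        # (B, H', W')
            maps.append(F.interpolate(d.unsqueeze(1), size=image.shape[-2:],
                                      mode="bilinear", align_corners=False))
    return torch.stack(maps).mean(dim=0).squeeze(1)
```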

Article
Full-text available
Recent advances in deep neural networks have shown that reconstruction-based methods using autoencoders have potential for anomaly detection in visual inspection tasks. However, there are challenges when applying these methods to high-resolution images, such as the need for large network training and computation of anomaly scores. Autoencoder-based methods detect anomalies by comparing an input image to its reconstruction in pixel space, which can result in poor performance due to imperfect reconstruction. In this paper, we propose a method to address these challenges by using a conditional patch-based convolutional autoencoder and one-class deep feature classification. We train an autoencoder using only normal images and compute anomaly maps as the difference between the input and output of the autoencoder. We then embed these anomaly maps using a pretrained convolutional neural network feature extractor. Using the deep feature embeddings from the anomaly maps of training samples, we train a one-class classifier to compute an anomaly score for an unseen sample. A simple threshold-based criterion is used to determine if the unseen sample is anomalous or not. We compare our proposed algorithm to state-of-the-art methods on multiple challenging datasets, including a dataset of zipper cursors and eight datasets from the MVTec dataset collection. We find that our approach outperforms alternatives in all cases, achieving an average precision score of 94.77% for zipper cursors and 96.51% for MVTec datasets.
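A rough sketch of the two-stage idea described above, assuming a PyTorch autoencoder trained on normal images; a one-class SVM is used here as a stand-in for the unspecified one-class classifier, and the conditional, patch-based structure of the autoencoder is not reproduced.

```python
# Stage 1: pixel-space anomaly map from the autoencoder; Stage 2: deep embedding of
# that map with a fixed pretrained CNN, scored by a one-class classifier.
import torch
from torchvision.models import resnet18, ResNet18_Weights
from sklearn.svm import OneClassSVM

extractor = resnet18(weights=ResNet18_Weights.DEFAULT).eval()

def embed_anomaly_map(autoencoder, image):
    with torch.no_grad():
        amap = (image - autoencoder(image)).abs()           # pixel-space anomaly map
        x = extractor.maxpool(extractor.relu(extractor.bn1(extractor.conv1(amap))))
        for stage in (extractor.layer1, extractor.layer2, extractor.layer3, extractor.layer4):
            x = stage(x)
        return torch.flatten(extractor.avgpool(x), 1).numpy()   # (B, 512) embedding

def fit_one_class(train_embeddings):
    # Fit on embeddings of anomaly maps computed from *normal* training images only.
    return OneClassSVM(kernel="rbf", nu=0.1, gamma="scale").fit(train_embeddings)

# score = -clf.decision_function(embed_anomaly_map(ae, x)); compare against a threshold.
```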
Article
Full-text available
Anomaly detection is a hot and practical problem. Most existing research is based on generative models, which judge abnormalities by comparing the reconstruction error between original samples and their reconstructions. Among them, the Variational AutoEncoder (VAE) is widely used, but it suffers from over-generalization. In this paper, we design an unsupervised deep learning anomaly detection method named VESC and propose a recursive reconstruction strategy. VESC adopts the idea of data compression and adds three structures to the original VAE, namely a spatial constrained network, a reformer structure, and a re-encoder. The recursive reconstruction strategy improves the accuracy of the model by increasing the number and typicality of training samples, and it can be applied to most unsupervised learning methods. Experimental results on several benchmarks show that our model outperforms state-of-the-art anomaly detection methods, and the proposed strategy also improves the detection results of the original model.
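A minimal sketch of one way a recursive-reconstruction strategy can be realised, assuming a reconstruction model that maps images to images: reconstructions of normal training samples are fed back through the model and appended to the training pool as extra, more "typical" samples. The recursion depth and pooling scheme are assumptions; the VESC architecture itself is not reproduced.

```python
import torch

def recursive_reconstruction_pool(model, images, depth=2):
    """Return the original normal images plus their recursive reconstructions.
    `model` is assumed to return the reconstruction directly; for a VAE returning
    (recon, mu, logvar), take the first element instead."""
    pool = [images]
    x = images
    with torch.no_grad():
        for _ in range(depth):
            x = model(x)              # reconstruct the previous reconstruction
            pool.append(x)
    return torch.cat(pool, dim=0)

# Typical use: augment each mini-batch before the reconstruction-loss update, e.g.
# batch = recursive_reconstruction_pool(vae_decode, batch, depth=2)
```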
Article
Full-text available
Expert interpretation of anatomical images of the human brain is the central part of neuro-radiology. Several machine learning-based techniques have been proposed to assist in the analysis process. However, the ML models typically need to be trained to perform a specific task, e.g., brain tumour segmentation or classification. Not only do the corresponding training data require laborious manual annotations, but a wide variety of abnormalities can be present in a human brain MRI, sometimes more than one simultaneously, which makes representing all possible anomalies very challenging. Hence, a possible solution is an unsupervised anomaly detection (UAD) system that can learn a data distribution from an unlabelled dataset of healthy subjects and then be applied to detect out-of-distribution samples. Such a technique can then be used to detect anomalies such as lesions or abnormalities, for example brain tumours, without explicitly training the model for that specific pathology. Several Variational Autoencoder (VAE) based techniques have been proposed in the past for this task. Even though they perform very well on controlled, artificially simulated anomalies, many of them perform poorly when detecting anomalies in clinical data. This research proposes a compact version of the "context-encoding" VAE (ceVAE) model, combined with pre- and post-processing steps, creating a UAD pipeline (StRegA) that is more robust on clinical data and shows its applicability in detecting anomalies such as tumours in brain MRIs. The proposed pipeline achieved a Dice score of 0.642 ± 0.101 while detecting tumours in T2w images of the BraTS dataset and 0.859 ± 0.112 while detecting artificially induced anomalies, while the best performing baseline achieved 0.522 ± 0.135 and 0.783 ± 0.111, respectively.
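A hedged sketch of the reconstruction-residual scoring that such VAE-based UAD pipelines rely on: the model is trained on healthy scans only, and anomalies are localized from the residual between the input and its reconstruction. The (recon, mu, logvar) output format and the reduction of pre/post-processing to blurring plus thresholding are assumptions, not the actual StRegA pipeline.

```python
import torch
import torch.nn.functional as F

def vae_anomaly_mask(vae, image_slice, threshold=0.2):
    with torch.no_grad():
        recon, mu, logvar = vae(image_slice)        # assumed VAE output format
    residual = (image_slice - recon).abs().mean(dim=1, keepdim=True)
    # Light post-processing: smooth the residual map, then threshold to a binary mask.
    residual = F.avg_pool2d(residual, kernel_size=5, stride=1, padding=2)
    return (residual > threshold).float()
```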
Conference Paper
Full-text available
Deep learning methods for analysing medical images and detecting pathologies are typically trained in a supervised manner with annotated data. In the absence of manually annotated training data, unsupervised anomaly detection is one possible solution. This work proposes StRegA, an unsupervised anomaly detection pipeline based on a compact ceVAE, and shows its applicability in detecting anomalies such as tumours in brain MRIs. The proposed pipeline achieved a Dice score of 0.642 ± 0.101 while detecting tumours in T2w images of the BraTS dataset and 0.859 ± 0.112 while detecting artificially induced anomalies.
Article
Full-text available
In realizing unsupervised pixel-precise anomaly localization with a generative model, a reference image must be generated (for comparison with the input image) by transforming any abnormal patterns of the input image into normal patterns. In this study, a patch-level operation with adaptive patch control is proposed to improve anomaly localization by generating a better reference image. To exploit a generative model, we divide an image into non-overlapping patches of the same size, generate patch-level reference images, and stitch the patch-level reference images into a single reference image. We then conduct anomaly localization by comparing the input image with the stitched, reconstructed image. To effectively apply the patch-level operation, we propose adaptive patch control to determine the number of non-overlapping patches to be applied. For this, we synthesize defective images from normal images and examine how well candidate methods with different numbers of patches remove the synthesized defects. In the same way, we utilize adaptive patch control to select a promising model among candidate generative models. Based on experiments conducted on the MVTec Anomaly Detection dataset, we demonstrate that our method outperforms existing methods. Under a real-world scenario, our method achieves an ROC AUC of 0.926, in contrast to the best value of 0.893 from existing studies. Furthermore, we demonstrate the feasibility of adaptive patch control by showing that the removal of synthesized defects and the anomaly localization of real defective images are highly correlated.
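A minimal sketch of the reconstruct-and-stitch step described above, assuming a generative model `model` that reconstructs a patch tensor; the adaptive choice of the patch count and the model selection procedure are omitted.

```python
import torch

def stitched_reference(model, image, n_patches=4):
    """image: (B, C, H, W); n_patches patches per side, H and W divisible by n_patches."""
    B, C, H, W = image.shape
    ph, pw = H // n_patches, W // n_patches
    out = torch.empty_like(image)
    with torch.no_grad():
        for i in range(n_patches):
            for j in range(n_patches):
                patch = image[:, :, i*ph:(i+1)*ph, j*pw:(j+1)*pw]
                out[:, :, i*ph:(i+1)*ph, j*pw:(j+1)*pw] = model(patch)  # patch-level reference
    return out

def anomaly_map(model, image, n_patches=4):
    # Localize anomalies from the per-pixel difference to the stitched reference image.
    return (image - stitched_reference(model, image, n_patches)).abs().mean(dim=1)
```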
Article
Full-text available
The detection of anomalous structures in natural image data is of utmost importance for numerous tasks in the field of computer vision. The development of methods for unsupervised anomaly detection requires data on which to train and evaluate new approaches and ideas. We introduce the MVTec anomaly detection dataset containing 5354 high-resolution color images of different object and texture categories. It contains normal, i.e., defect-free images intended for training and images with anomalies intended for testing. The anomalies manifest themselves in the form of over 70 different types of defects such as scratches, dents, contaminations, and various structural changes. In addition, we provide pixel-precise ground truth annotations for all anomalies. We conduct a thorough evaluation of current state-of-the-art unsupervised anomaly detection methods based on deep architectures such as convolutional autoencoders, generative adversarial networks, and feature descriptors using pretrained convolutional neural networks, as well as classical computer vision methods. We highlight the advantages and disadvantages of multiple performance metrics as well as threshold estimation techniques. This benchmark indicates that methods that leverage descriptors of pretrained networks outperform all other approaches and deep-learning-based generative models show considerable room for improvement.
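For reference, a small loading sketch for one MVTec AD category, assuming the commonly used directory layout of the public release (train/good for defect-free images, test/<defect_type> for test images, and pixel-precise masks under ground_truth); the exact file naming is an assumption and should be checked against the downloaded dataset.

```python
from pathlib import Path

def load_category(root, category):
    base = Path(root) / category
    train = sorted((base / "train" / "good").glob("*.png"))      # defect-free training images
    test, masks = [], []
    for img_path in sorted((base / "test").glob("*/*.png")):
        defect = img_path.parent.name
        test.append(img_path)
        if defect == "good":
            masks.append(None)                                   # defect-free test image: no mask
        else:
            masks.append(base / "ground_truth" / defect / f"{img_path.stem}_mask.png")
    return train, test, masks
```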
Conference Paper
Full-text available
Obtaining models that capture imaging markers relevant for disease progression and treatment monitoring is challenging. Models are typically based on large amounts of data with annotated examples of known markers, aiming at automating detection. High annotation effort and the limitation to a vocabulary of known markers limit the power of such approaches. Here, we perform unsupervised learning to identify anomalies in imaging data as candidates for markers. We propose AnoGAN, a deep convolutional generative adversarial network that learns a manifold of normal anatomical variability, accompanied by a novel anomaly scoring scheme based on the mapping from image space to a latent space. Applied to new data, the model labels anomalies and scores image patches according to their fit into the learned distribution. Results on optical coherence tomography images of the retina demonstrate that the approach correctly identifies anomalous images, such as images containing retinal fluid or hyperreflective foci.
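A sketch of the AnoGAN-style scoring idea: search for the latent code whose generated image best matches the query, then score the query by a weighted sum of the image residual and a discriminator-feature residual. `G` and `D_features` are placeholders for a trained generator and a discriminator feature extractor; the latent dimension, step count, and weighting are illustrative.

```python
import torch

def anogan_score(G, D_features, x, z_dim=100, steps=300, lr=0.01, lam=0.1):
    z = torch.randn(x.size(0), z_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        x_hat = G(z)
        loss_r = (x - x_hat).abs().mean()                          # residual loss
        loss_d = (D_features(x) - D_features(x_hat)).abs().mean()  # feature matching loss
        loss = (1 - lam) * loss_r + lam * loss_d
        loss.backward()
        opt.step()
    with torch.no_grad():
        residual_map = (x - G(z)).abs()                            # per-pixel residual
    return loss.item(), residual_map
```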
Article
Full-text available
While depth tends to improve network performance, it also makes gradient-based training more difficult, since deeper networks tend to be more non-linear. The recently proposed knowledge distillation approach is aimed at obtaining small and fast-to-execute models, and it has shown that a student network can imitate the soft output of a larger teacher network or ensemble of networks. In this paper, we extend this idea to allow the training of a student that is deeper and thinner than the teacher, using not only the outputs but also the intermediate representations learned by the teacher as hints to improve the training process and final performance of the student. Because the student's intermediate hidden layer will generally be smaller than the teacher's intermediate hidden layer, additional parameters are introduced to map the student hidden layer to the prediction of the teacher hidden layer. This allows one to train deeper students that can generalize better or run faster, a trade-off controlled by the chosen student capacity. For example, on CIFAR-10, a deep student network with almost 10.4 times fewer parameters outperforms a larger, state-of-the-art teacher network.
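A sketch of the "hint" stage this abstract describes: a small regressor maps the thinner student's intermediate features to the teacher's hint layer, and the student plus regressor are trained with an L2 loss before the usual distillation stage. The 1x1-convolution regressor and the channel sizes are illustrative choices.

```python
import torch
import torch.nn as nn

class HintLoss(nn.Module):
    def __init__(self, student_channels, teacher_channels):
        super().__init__()
        # Maps the student's feature width to the teacher's feature width.
        self.regressor = nn.Conv2d(student_channels, teacher_channels, kernel_size=1)

    def forward(self, student_hidden, teacher_hint):
        return ((self.regressor(student_hidden) - teacher_hint) ** 2).mean()

# hint = HintLoss(student_channels=64, teacher_channels=256)
# loss = hint(student_feat, teacher_feat.detach())   # teacher features are frozen
```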
Article
Deep learning anomaly detection methods that take visual sensor data as the raw signal input suffer from the over-boundary problem caused by a lack of labeled datasets, which forces unsupervised training. This paper proposes an anomaly detection method based on weakly supervised learning. First, we propose a pseudo-label generation technique based on a multi-instance ranking algorithm, thereby transforming the weakly supervised learning problem into a fully supervised one. A C3D network extracts temporal-spatial video features as the initial data stream input for the pseudo-label generator and the anomaly detection model. Finally, attention and guidance augmentation modules are combined to make the network focus on the anomalous region, improving the model's spatial localization capability. Experimental results on two datasets with varied scales and scene complexity demonstrate that our method reaches a frame-level AUC of 81.48% on UCF-Crime and 94.01% on ShanghaiTech.
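A simplified sketch of a multi-instance ranking objective of the kind used to derive segment-level pseudo-labels: each video is a bag of segment scores, and the highest score in an anomalous bag is pushed above the highest score in a normal bag with a hinge loss. The smoothness and sparsity regularizers and their weights are common choices in this line of work, not necessarily those of this paper.

```python
import torch

def mil_ranking_loss(scores_anom, scores_norm, margin=1.0, l_smooth=8e-5, l_sparse=8e-5):
    """scores_anom, scores_norm: (T,) per-segment scores of one anomalous / one normal video."""
    hinge = torch.clamp(margin - scores_anom.max() + scores_norm.max(), min=0.0)
    smooth = ((scores_anom[1:] - scores_anom[:-1]) ** 2).sum()    # temporal smoothness
    sparse = scores_anom.sum()                                    # anomalies are rare in time
    return hinge + l_smooth * smooth + l_sparse * sparse
```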
Article
Generative Adversarial Networks (GANs) are commonly used as systems able to perform unsupervised learning. We propose and demonstrate the use of a GAN architecture, known as the fast Anomaly Generative Adversarial Network (f-AnoGAN), to solve the problem of anomaly detection in aerial images. This architecture was previously applied to medical images, and in this work we adapt it for use on satellite or aerial photographs. To test the effectiveness of this approach, we implemented anomaly detection schemes based on the Bi-directional Generative Adversarial Network (BiGAN), the image-z-image (izi) mapping, the z-image-z (ziz) mapping, and a deep convolutional autoencoder (AE). The results show that f-AnoGAN outperformed the others, achieving AUC (area under the curve) values of 0.99 and 0.92 for the urban and rural image sets, respectively.
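A sketch of f-AnoGAN-style scoring with a trained encoder: unlike the iterative AnoGAN search, the encoder maps an image directly to a latent code, and the score combines the image reconstruction residual with a discriminator-feature residual. `E`, `G`, and `D_features` stand in for trained networks; the weighting `kappa` is illustrative.

```python
import torch

def f_anogan_score(E, G, D_features, x, kappa=1.0):
    with torch.no_grad():
        x_hat = G(E(x))                                           # izi-style reconstruction
        img_res = ((x - x_hat) ** 2).mean(dim=(1, 2, 3))          # image residual
        feat_res = ((D_features(x) - D_features(x_hat)) ** 2).mean(dim=1)  # feature residual
    return img_res + kappa * feat_res                             # per-image anomaly score
```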
Conference Paper
Anomaly or outlier detection is one of the challenging subjects in unsupervised learning. This paper introduces a student-teacher framework for anomaly detection in which the teacher network is enhanced to achieve higher performance metrics. For this purpose, we first pretrain a ResNet-18 network on ImageNet and then finetune it on the MVTec AD dataset. Experimental results at the image and pixel levels demonstrate that this idea achieves better metrics than previous methods. Our model, Enhanced Teacher for Student-Teacher Feature Pyramid (ET-STPM), achieved a mean accuracy of 0.971 at the image level and 0.977 at the pixel level for anomaly detection.
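A hedged sketch of the "enhanced teacher" idea: start from an ImageNet-pretrained ResNet-18 and finetune it on the inspection images before freezing it as the teacher of an STPM-style student-teacher pyramid. The finetuning objective used here (classification over product categories) is an assumption, since the abstract does not state it.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18, ResNet18_Weights

def build_enhanced_teacher(num_categories, loader, epochs=5, lr=1e-4):
    model = resnet18(weights=ResNet18_Weights.DEFAULT)            # ImageNet pretraining
    model.fc = nn.Linear(model.fc.in_features, num_categories)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for images, labels in loader:                             # MVTec AD images + labels
            opt.zero_grad()
            nn.functional.cross_entropy(model(images), labels).backward()
            opt.step()
    model.fc = nn.Identity()          # keep only the feature extractor as the frozen teacher
    return model.eval()
```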
Article
We present a novel model called One Class Minimum Spanning Tree (OCmst) for the novelty detection problem, which uses a Convolutional Neural Network (CNN) as a deep feature extractor together with a graph-based model built on the Minimum Spanning Tree (MST). In a novelty detection scenario, the training data are not polluted by outliers (the abnormal class), and the goal is to recognize whether a test instance belongs to the normal class or the abnormal class. Our approach uses the deep features from the CNN to feed a pair of MSTs built starting from each test instance. To cut down the computational time, we use a parameter γ to specify the number of neighbours of the test instance from which the MSTs are built. To prove the effectiveness of the proposed approach, we conducted experiments on two publicly available datasets that are well known in the literature, and we achieved state-of-the-art results on the CIFAR10 dataset.
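A simplified sketch of the graph-based step: deep features of normal training images are extracted with a CNN, a minimum spanning tree is built over the test feature and its γ nearest normal neighbours, and the test point is flagged as novel if its connecting edge is unusually long. This decision rule is a simplification, not the full OCmst procedure.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import cdist

def mst_novelty_score(test_feat, normal_feats, gamma=20):
    """test_feat: (D,) CNN feature of the test image; normal_feats: (N, D) normal features."""
    d = cdist(test_feat[None], normal_feats)[0]
    nn_idx = np.argsort(d)[:gamma]
    points = np.vstack([test_feat[None], normal_feats[nn_idx]])   # test point is node 0
    mst = minimum_spanning_tree(cdist(points, points)).toarray()
    mst = mst + mst.T                                             # symmetrize edge matrix
    edges = mst[mst > 0]
    test_edge = mst[0][mst[0] > 0].min()                          # edge linking the test point
    return test_edge / (edges.mean() + 1e-12)                     # ratio >> 1 suggests novelty
```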
Chapter
We present a new framework for Patch Distribution Modeling, PaDiM, to concurrently detect and localize anomalies in images in a one-class learning setting. PaDiM makes use of a pretrained convolutional neural network (CNN) for patch embedding and of multivariate Gaussian distributions to obtain a probabilistic representation of the normal class. It also exploits correlations between the different semantic levels of the CNN to better localize anomalies. PaDiM outperforms current state-of-the-art approaches for both anomaly detection and localization on the MVTec AD and STC datasets. To match real-world visual industrial inspection, we extend the evaluation protocol to assess the performance of anomaly localization algorithms on a non-aligned dataset. The state-of-the-art performance and low complexity of PaDiM make it a good candidate for many industrial applications.
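A sketch of the statistical core of this approach: at each spatial position of the (multi-level) patch embeddings, a multivariate Gaussian is fitted on normal images, and test patches are scored by Mahalanobis distance. The embedding extraction and dimensionality reduction are omitted; the regularization constant is illustrative.

```python
import numpy as np

def fit_gaussians(embeddings, eps=0.01):
    """embeddings: (N, D, P) patch embeddings of N normal images at P spatial positions."""
    N, D, P = embeddings.shape
    mean = embeddings.mean(axis=0)                                # (D, P)
    cov_inv = np.empty((P, D, D))
    for p in range(P):
        cov = np.cov(embeddings[:, :, p], rowvar=False) + eps * np.eye(D)
        cov_inv[p] = np.linalg.inv(cov)
    return mean, cov_inv

def mahalanobis_map(embedding, mean, cov_inv):
    """embedding: (D, P) for one test image; returns per-position distances of shape (P,)."""
    diff = embedding - mean
    return np.sqrt(np.einsum('dp,pde,ep->p', diff, cov_inv, diff))
```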
Chapter
In this paper, we address the problem of image anomaly detection and segmentation. Anomaly detection involves making a binary decision as to whether an input image contains an anomaly, and anomaly segmentation aims to locate the anomaly at the pixel level. Support vector data description (SVDD) is a long-standing algorithm used for anomaly detection, and we extend its deep learning variant to a patch-based method using self-supervised learning. This extension enables anomaly segmentation and improves detection performance. As a result, anomaly detection and segmentation performances measured in AUROC on the MVTec AD dataset increased by 9.8% and 7.0%, respectively, compared to the previous state-of-the-art methods. Our results indicate the efficacy of the proposed method and its potential for industrial application. Detailed analysis of the proposed method offers insights regarding its behavior, and the code is available online (https://github.com/nuclearboy95/Anomaly-Detection-PatchSVDD-PyTorch).
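A sketch of the patch-level scoring step common to this family of methods: an encoder trained on normal patches maps every patch to an embedding, and a test patch is scored by the distance to its nearest normal-patch embedding; patch scores are arranged back into a coarse anomaly map. The encoder's self-supervised training and the hierarchical variant are not shown, and the patch size and stride are assumptions.

```python
import torch
import torch.nn.functional as F

def patch_anomaly_map(encoder, image, normal_bank, patch=32, stride=16):
    """normal_bank: (M, D) embeddings of patches extracted from normal training images."""
    B, C, H, W = image.shape
    patches = F.unfold(image, kernel_size=patch, stride=stride)       # (B, C*patch*patch, L)
    L = patches.shape[-1]
    patches = patches.transpose(1, 2).reshape(B * L, C, patch, patch)
    with torch.no_grad():
        emb = encoder(patches)                                        # (B*L, D) patch embeddings
        dists = torch.cdist(emb, normal_bank).min(dim=1).values       # nearest normal patch
    h = (H - patch) // stride + 1
    w = (W - patch) // stride + 1
    return dists.reshape(B, h, w)                                     # coarse anomaly map
```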
Conference Paper
We introduce a powerful student-teacher framework for the challenging problem of unsupervised anomaly detection and pixel-precise anomaly segmentation in high-resolution images. Student networks are trained to regress the output of a descriptive teacher network that was pretrained on a large dataset of patches from natural images. This circumvents the need for prior data annotation. Anomalies are detected when the outputs of the student networks differ from that of the teacher network. This happens when they fail to generalize outside the manifold of anomaly-free training data. The intrinsic uncertainty in the student networks is used as an additional scoring function that indicates anomalies. We compare our method to a large number of existing deep learning based methods for unsupervised anomaly detection. Our experiments demonstrate improvements over state-of-the-art methods on a number of real-world datasets, including the recently introduced MVTec Anomaly Detection dataset that was specifically designed to benchmark anomaly segmentation algorithms.
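A sketch of the scoring used in this student-teacher line of work: several students regress the frozen teacher's per-pixel descriptors on anomaly-free data, and at test time the regression error and the inter-student variance are combined into an anomaly map. The networks themselves and any score normalization are omitted.

```python
import torch

def student_teacher_map(teacher, students, image):
    with torch.no_grad():
        t = teacher(image)                                   # (B, D, H, W) dense descriptors
        s = torch.stack([net(image) for net in students])    # (S, B, D, H, W)
    err = ((s.mean(dim=0) - t) ** 2).sum(dim=1)              # regression error, (B, H, W)
    var = s.var(dim=0).sum(dim=1)                            # predictive uncertainty
    return err + var                                         # per-pixel anomaly score
```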
Conference Paper
We study a new high-dimensional data problem in this paper. In pattern classification, if many dimensions of two groups share a similar distribution, the classification error rate will approach 50%. We propose a new clustering algorithm to deal with this problem. Its basic idea is to confine the support of the optimization equation so that the data points in one group can make only a small contribution to the estimated cluster centre of another group. Experiments show that the proposed method yields good results on eight real-world data sets and that its performance is better than that of 10 existing methods.