Article

Abstract

Defect detection is one of the most essential processes for industrial quality inspection. However, in Continuous Defect Detection (CDD), where defect categories and samples continually increase, the challenge of incremental few-shot defect detection remains unexplored. Current defect detection models fail to generalize to novel categories and suffer from catastrophic forgetting. To address these problems, this paper proposes an Incremental Knowledge Learning Framework (IKLF) for continuous defect detection. The proposed framework follows the pretrain-finetuning paradigm. To realize end-to-end fine-tuning for novel categories, an Incremental RCNN module is proposed to calculate cosine-similarity features of defects and decouple class-wise representations. Furthermore, two incremental knowledge alignment losses are proposed to deal with catastrophic forgetting. The Feature Knowledge Align (FKA) loss is designed for class-agnostic feature maps, while the Logit Knowledge Align (LKA) loss is proposed for class-specific output logits. The combination of the two alignment losses mitigates the catastrophic forgetting problem effectively. Experiments have been conducted on two real-world industrial inspection datasets (NEU-DET and DeepPCB). Results show that IKLF outperforms other methods in various incremental few-shot scenarios, which demonstrates the effectiveness of the proposed method.
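
The abstract does not give the exact formulations of the FKA and LKA losses; the following is only a minimal PyTorch sketch of how such alignment terms and a cosine-similarity head are often realized (the L2/KL choices, function names, and hyperparameters are assumptions, not the authors' code).

```python
# Hypothetical sketch of the two knowledge-alignment losses described above.
# The exact formulations are not given in the abstract; this assumes an
# L2 feature distillation (FKA) and a temperature-scaled KL distillation (LKA).
import torch
import torch.nn.functional as F

def feature_knowledge_align(feat_new, feat_old):
    """FKA: align class-agnostic feature maps of the new model and the frozen old model."""
    return F.mse_loss(feat_new, feat_old.detach())

def logit_knowledge_align(logits_new, logits_old, tau=2.0):
    """LKA: align class-specific output logits for the old (base) classes."""
    p_old = F.softmax(logits_old.detach() / tau, dim=-1)
    log_p_new = F.log_softmax(logits_new / tau, dim=-1)
    return F.kl_div(log_p_new, p_old, reduction="batchmean") * tau ** 2

def cosine_logits(features, class_weights, scale=20.0):
    """Cosine-similarity classifier head used for class-wise decoupling."""
    f = F.normalize(features, dim=-1)        # (N, D)
    w = F.normalize(class_weights, dim=-1)   # (C, D)
    return scale * f @ w.t()                 # (N, C)
```
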


... A RetinaNet-like detector, which utilizes focal loss, has been effectively applied to the detection of small objects. It employs PVTv2B2-Hi [16] as the backbone network, which helps mitigate the issue of depth feature redundancy often encountered in models using residual networks. Sun et al. [17] proposed a Feature Knowledge Alignment Loss and a Logit Knowledge Alignment Loss to address the issue of catastrophic forgetting. Despite these advancements, the detection speed of the model still requires further improvement. ...
Article
Full-text available
Detecting micro-defects in densely populated printed circuit boards (PCBs) with complex backgrounds is a critical challenge. To address this problem, DHNet, a small-object detection network based on YOLOv8 that employs multi-scale convolutional kernels for feature extraction and fusion, is proposed. The lightweight VOVGSHet module is designed for feature fusion with a pyramid structure to efficiently leverage feature-map relationships while minimizing model complexity and parameters. Furthermore, to optimize the original extraction structure and enhance multi-scale defect detection, convolutional kernels of varying sizes process the same input channels. Additionally, the incorporation of the Wise-IoU loss function improves small-defect detection accuracy and efficiency. Extensive experiments on a custom PCB dataset demonstrate DHNet's effectiveness, achieving an outstanding mean Average Precision (mAP) of 84.5%, surpassing the original YOLOv8 network by 4.0% with only 2.85 M parameters. The model has a latency of 3.6 ms on an NVIDIA 4090, compared with 4.4 ms for YOLOv8n. Validation on the public DeepPCB and NEU datasets further confirms DHNet's superiority, reaching 99.1% and 79.9% mAP, respectively. Finally, successful deployment on the NVIDIA Jetson Nano platform validates DHNet's suitability for real-time defect detection in industrial applications.
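
As an illustration of the multi-scale convolutional kernel idea mentioned above, a minimal PyTorch sketch is shown below; it is not DHNet's actual module, and the channel counts and kernel sizes are assumptions.

```python
# Minimal sketch of a multi-scale kernel block: different kernel sizes applied to
# the same input channels, then fused by a 1x1 convolution.
import torch
import torch.nn as nn

class MultiKernelBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, k, padding=k // 2) for k in (1, 3, 5)
        )
        self.fuse = nn.Conv2d(3 * channels, channels, 1)

    def forward(self, x):
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))
```
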
Article
Unsupervised domain adaptation (UDA) has recently gained attention in fault diagnosis due to its ability to address domain shift problems arising from changes in working conditions. However, when faced with the continual domain shift problem inherent in real-world industries with dynamic working conditions, UDA often suffers from catastrophic forgetting. To address this challenge, we propose a novel replay-free continual UDA framework, CoUDA, for fault diagnosis under dynamic working conditions. In CoUDA, prototype contrastive learning is employed in source domain pre-training in order to improve the model generalization ability in preparation for the adaptation to the subsequent target domains. Then, source discriminator constraint is employed to ensure that the acquired source domain knowledge serves as an anchor, and source feature knowledge distillation is applied to prevent catastrophic forgetting without replay in sequential target domain adaptation. In addition, for better domain adaptation, local domain alignment and information entropy minimization are utilized to achieve fine-grained domain alignment. Experimental results demonstrate the superiority of the proposed CoUDA in achieving robust fault diagnosis under dynamic working conditions.
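
A rough sketch of a prototype contrastive loss of the kind referred to above might look as follows; an InfoNCE-style loss against class prototypes is assumed here, and this is not the CoUDA implementation.

```python
# Sketch: contrast each sample against class prototypes (assumed formulation).
import torch
import torch.nn.functional as F

def prototype_contrastive_loss(features, labels, prototypes, tau=0.1):
    # features: (N, D), labels: (N,) long class indices, prototypes: (C, D)
    f = F.normalize(features, dim=-1)
    p = F.normalize(prototypes, dim=-1)
    logits = f @ p.t() / tau            # cosine similarity to each class prototype
    return F.cross_entropy(logits, labels)
```
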
Article
Full-text available
In spite of the advent of Machine Learning (ML) and its successful deployment in measurement systems, little information can be found in the literature about uncertainty quantification in these systems [1]. Uncertainty is crucial for the adoption of ML in commercial products and services. Designers are now being encouraged to be upfront about the uncertainty in their ML systems, because products that specify their uncertainty can have a significant competitive advantage and can unlock new value, reduce risk, and improve usability [2]. In this article, we will describe uncertainty quantification in ML. Because there isn't enough room in one article to explain all ML methods, we concentrate on Deep Learning (DL), which is one of the most popular and effective ML methods in I&M [3]. Please note that this article follows and uses concepts from Part 1 [4], so readers are highly encouraged to first read that part. In addition, we assume the reader has a basic understanding of both DL and uncertainty. Readers for whom this assumption is false are encouraged to first read the brief introduction to DL and its applications in I&M presented in [3] as well as the uncertainty tutorial in [5].
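
As one concrete illustration of uncertainty quantification in deep learning, a Monte Carlo dropout sketch is shown below; this is a common technique and not necessarily the one the article focuses on.

```python
# Illustrative only: Monte Carlo dropout gives a predictive mean plus a spread
# that can serve as an uncertainty estimate; other techniques exist.
import torch
import torch.nn as nn

def enable_dropout(model):
    """Keep only dropout layers in train mode so they stay active at inference."""
    model.eval()
    for m in model.modules():
        if isinstance(m, nn.Dropout):
            m.train()

def mc_dropout_predict(model, x, n_samples=30):
    enable_dropout(model)
    with torch.no_grad():
        preds = torch.stack([model(x).softmax(dim=-1) for _ in range(n_samples)])
    return preds.mean(dim=0), preds.std(dim=0)  # predictive mean and spread
```
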
Article
Full-text available
Like any science and engineering field, Instrumentation and Measurement (I&M) is currently experiencing the impact of the recent rise of Applied AI and in particular Machine Learning (ML) [1]. But I&M and ML use terminology that sometimes sounds or looks similar, though the terms might only have a marginal relationship or even be false friends. Therefore, understanding the terminology used by both communities and how they do and do not relate to one another is of crucial importance to understand the influence of ML in an I&M system. In addition, while I&M experts are well aware of the importance of measurement uncertainty, the concept has been understudied in the ML context. In this article, we will give an overview of ML's contribution to measurement error, and how to avoid confusion with the said terminology, to better understand the application of ML in measurement. Then, in Part 2 [2], we use that understanding and terminology to show how to quantify the uncertainty introduced by ML in a measurement system. This is of particular importance for measurement in the age of big data because we need to evaluate the trustworthiness of the available data and their impact on the derived conclusions and decision-making [3].
Article
Full-text available
Although great progress has been made in generic object detection by advanced deep learning techniques, detecting small objects from images is still a difficult and challenging problem in the field of computer vision due to their limited size, weak appearance and geometry cues, and the lack of large-scale datasets of small targets. Improving the performance of small object detection has wide significance in many real-world applications, such as self-driving cars, unmanned aerial vehicles, and robotics. In this article, the first-ever survey of recent studies in deep learning-based small object detection is presented. Our review begins with a brief introduction of the four pillars for small object detection, including multiscale representation, contextual information, super-resolution, and region proposal. Then, the collection of state-of-the-art datasets for small object detection is listed, and the performance of different methods on these datasets is reported. Moreover, the state-of-the-art small object detection networks are investigated, with a special focus on the differences and modifications that improve detection performance compared to generic object detection architectures. Finally, several promising directions and tasks for future work in small object detection are provided. Researchers can track up-to-date studies on this webpage available at: https://github.com/tjtum-chenlab/SmallObjectDetectionList.
Article
Full-text available
Due to continuing and rapid advances of both hardware and software technologies in camera and computing systems, we continue to have access to cheaper, faster, higher quality, and smaller cameras and computing units. As a result, vision based methods consisting of image processing and computational intelligence can be implemented more easily and affordably than ever using a camera and its associated operations units. Among their various applications, such systems are also being used more and more by researchers and practitioners as generic instruments to measure and monitor physical phenomena. In this article, we take a look at this rising trend and how cameras and vision are being used for instrumentation and measurement, and we also cast a glance at the metrological gauntlet thrown down by vision-based instruments.
Conference Paper
Full-text available
In this paper, we present a novel scale- and rotation-invariant interest point detector and descriptor, coined SURF (Speeded Up Robust Features). It approximates or even outperforms previously proposed schemes with respect to repeatability, distinctiveness, and robustness, yet can be computed and compared much faster. This is achieved by relying on integral images for image convolutions; by building on the strengths of the leading existing detectors and descriptors (in casu, using a Hessian matrix-based measure for the detector, and a distribution-based descriptor); and by simplifying these methods to the essential. This leads to a combination of novel detection, description, and matching steps. The paper presents experimental results on a standard evaluation set, as well as on imagery obtained in the context of a real-life object recognition application. Both show SURF’s strong performance.
Article
Surface defect detection is of great significance for ensuring the quality of steel plate. The surface defects of steel plate are characterized by multiple types, complex and irregular shapes, a large scale range, and high similarity with normal regions, resulting in low accuracy of widely used vision-based defect detection methods. To overcome these issues, this paper proposes a method for detecting steel plate surface defects based on deformable convolution and background suppression. First, an improved Faster R-CNN method with deformable convolution and Region-of-Interest Align is proposed to enhance the detection performance for large-scale defects with complex and irregular shapes. Second, a background suppression method is proposed to enhance the discrimination ability between the normal region and the defect region. Experimental results show that, compared with the state-of-the-art methods, the proposed method can significantly improve the defect detection performance on steel plate.
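
The two building blocks named above, deformable convolution and Region-of-Interest Align, are available as library operators; a minimal torchvision-based sketch (not the authors' network) is given here.

```python
# Sketch: a deformable convolution block plus RoIAlign pooling via torchvision ops.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d, roi_align

class DeformBlock(nn.Module):
    def __init__(self, c_in, c_out, k=3):
        super().__init__()
        self.offset = nn.Conv2d(c_in, 2 * k * k, k, padding=k // 2)  # predicts sampling offsets
        self.deform = DeformConv2d(c_in, c_out, k, padding=k // 2)

    def forward(self, x):
        return self.deform(x, self.offset(x))

feat = torch.randn(1, 64, 56, 56)
out = DeformBlock(64, 128)(feat)
boxes = [torch.tensor([[4.0, 4.0, 40.0, 40.0]])]            # per-image RoIs (x1, y1, x2, y2)
pooled = roi_align(out, boxes, output_size=7, spatial_scale=1.0)
```
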
Article
Deep learning (DL)-based fault diagnosis models need to collect sufficient fault information for each fault type to ensure high-precision diagnosis. Some unexpected and new fault types will inevitably appear in actual conditions, which are called incremental fault types or class increment. Traditional DL models require the costly collection of all known data for retraining, while the use of only new fault data may lead to catastrophic forgetting of old tasks. To solve the problem of bearing diagnosis with incremental fault types, a lifelong learning method based on generative feature replay (LLMGFR) is proposed in this study. A feature distillation method is put forward to avoid forgetting in the feature extractor. The generator is trained to produce old-task features, which are mixed with real features of the current task to effectively solve the imbalance problem and the catastrophic forgetting of the classifier. In incremental fault diagnosis cases, LLMGFR can learn constantly and adaptively in dynamic environments with incremental fault types.
Article
Most unsupervised image anomaly localization methods suffer from overgeneralization because of the high generalization abilities of convolutional neural networks, leading to unreliable predictions. To mitigate the overgeneralization, this study proposes to collaboratively optimize normal and abnormal feature distributions with the assistance of synthetic anomalies, namely collaborative discrepancy optimization (CDO). CDO introduces a margin optimization module and an overlap optimization module to optimize the two key factors determining the localization performance, i.e., the margin and the overlap between the discrepancy distributions (DDs) of normal and abnormal samples. With CDO, a large margin and a small overlap between normal and abnormal DDs are obtained, and the prediction reliability is boosted. Experiments on MVTec2D and MVTec3D show that CDO effectively mitigates the overgeneralization and achieves great anomaly localization performance with real-time computation efficiency. A real-world automotive plastic parts inspection application further demonstrates the capability of the proposed CDO.
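
A loose sketch of margin-style optimization between normal and synthetic-anomaly discrepancies, in the spirit of the description above, could look as follows; the hinge form and thresholds are assumptions, not the CDO formulation.

```python
# Sketch: push normal discrepancy scores below a small margin and
# synthetic-anomaly scores above a large margin (assumed hinge losses).
import torch
import torch.nn.functional as F

def margin_loss(d_normal, d_anomaly, m_n=0.5, m_a=2.0):
    # d_normal / d_anomaly: discrepancy scores for normal and synthetic-anomaly samples
    return F.relu(d_normal - m_n).mean() + F.relu(m_a - d_anomaly).mean()
```
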
Article
Defect detection is a task to locate and classify the possible defects in an image. However, unlike common object detection tasks, defect detection often needs to deal with images with relatively complex backgrounds, for example, in industrial product quality inspection scenarios. The complex background can greatly interfere with the features of the target objects in the multiscale feature fusion process and therefore poses a great challenge to the defect detector. In this work, a channel-space adaptive enhancement feature pyramid network (CA-FPN) is proposed to eliminate this interference from the complex background. By extracting the inner relationship of different scale features, CA-FPN realizes adaptive fusion of multiscale features to enhance the semantic information of the defect while avoiding background interference as much as possible. In particular, CA-FPN is very lightweight. Moreover, considering that defects are often of varying sizes and can be extremely tiny or slender, a flexible anchor-free detector CA-AutoAssign is proposed by combining CA-FPN and an anchor-free detection strategy, AutoAssign. Based on the Alibaba Cloud Tianchi Fabric dataset and NEU-DET, CA-AutoAssign is compared with state-of-the-art (SOTA) detectors. The experimental results show that CA-AutoAssign has the best detection performance, with AP50 [mean average precision (mAP) with an intersection over union (IOU) threshold of 50%] reaching 89.1 and 82.7, respectively. Despite the improvement in accuracy, the processing time has barely increased. Furthermore, CA-FPN is applied to other classical detectors, and the experimental results demonstrate the competitiveness and generalization ability of CA-FPN. The code is available at https://github.com/EasonLuht/CA-AutoAssign.git.
Article
Surface defect recognition (SDC) is essential in intelligent manufacturing. Deep Learning (DL) is a research hotspot in SDC. Limited defective samples are available in most real-world cases, which poses challenges for DL methods. Given such circumstances, generating defective samples with generative adversarial networks (GANs) has been applied. However, insufficient samples and high-frequency texture details in defects make GANs very hard to train, yielding mode collapse and poor image quality, which can further impact SDC. To solve these problems, this paper proposes a new generative adversarial network called Contrastive GAN, which can be trained to generate diverse defects with only extremely limited samples. Specifically, a Shared Data Augmentation (SDA) module is proposed to avoid overfitting. Then, a Feature Attention Matching (FAM) module is proposed to align features and improve the quality of generated images. Finally, a Contrastive Loss based on a Hyper Sphere is employed to constrain the GAN to generate images that differ from traditional transformations. Experiments show that the proposed GAN generates defective images with higher quality and lower variance relative to real defects compared to other GANs. Synthetic images contribute to pre-trained DL networks with accuracies of up to 95.00%-99.56% for NEU datasets of different sizes and 91.84% for PCB cases, which proves the effectiveness of the proposed method.
Article
In this paper, we focus on the challenging few-shot class incremental learning (FSCIL) problem, which requires transferring knowledge from old tasks to new ones while overcoming catastrophic forgetting. We propose the exemplar relation distillation incremental learning framework to balance the tasks of old-knowledge preservation and new-knowledge adaptation. First, we construct an exemplar relation graph to represent the knowledge learned by the original network and update it gradually for learning new tasks. Then an exemplar relation loss function for discovering the relation knowledge between different classes is introduced to learn and transfer the structural information in the relation graph. Extensive experiments demonstrate that relation knowledge does exist in the exemplars and that our approach outperforms other state-of-the-art class-incremental learning methods on the CIFAR100, miniImageNet, and CUB200 datasets.
Article
Image anomaly detection is a significant stage for visual quality inspection in intelligent manufacturing systems. Under the assumption that only normal images are available during the training stage, unsupervised methods have recently been studied for image anomaly detection. However, a small number of anomalous images can be collected for training in many real-world industrial scenarios, and the unsupervised methods make no use of them to improve the detection accuracy. This leads to a semi-supervised image anomaly detection task with an unbalanced detection challenge. In this paper, a Logit Inducing with Abnormality Capturing (LIAC) method is proposed to address semi-supervised image anomaly detection. First, a Logit Inducing Loss is proposed to train a classifier that deals with unbalanced detection. Second, an Abnormality Capturing Module is proposed to address anomaly detection. With only 40 labeled anomalous images for training, the proposed LIAC method achieves a 98.8% F1-score on printed circuit board image anomaly detection, in comparison with the state-of-the-art methods. Moreover, the proposed LIAC method is experimentally compared with the state-of-the-art methods on three open-source datasets, MTD, ROCT, and ELPV, achieving F1-scores of 85.2%, 96.8%, and 66.6%, respectively, with 40 anomalous images given for training.
Article
Prereflow automatic optical inspection (AOI) has been widely used to ensure product quality in surface mount technology (SMT). When confronted with a complex industrial environment, traditional hand-designed visual inspection algorithms may lack robustness and generalizability. In this article, PCBNet, a convolutional neural network (CNN) method that combines data preprocessing, detection network, and visualization, is proposed to localize electronic components and recognize defects. In the data preprocessing stage, raw images are segmented into several regions of interest (ROIs). The ROI patches are inspected by a CNN-based detection system, which is capable of classifying defects and positioning components. After inspection, the reporting system visualizes the results via the human–computer interface. In comparative studies, the effectiveness of the proposed PCBNet was validated on a large-scale PCB component defect dataset. The PCBNet backbone outperforms other well-known lightweight CNN backbones in terms of accuracy and latency on a 4× ARM Cortex A72 CPU @ 1.5 GHz. Compared to other learning-based methods on the small-scale benchmark dataset, the PCBNet also achieves the best balance between inference speed and accuracy. In addition, extensive experiments demonstrate the superior efficiency of PCBNet in comparison to some famous traditional object detectors and novel oriented object detection algorithms.
Article
Deep learning (DL)-based fault diagnosis models have to collect the most comprehensive data of mechanical fault types to ensure reliability. In real scenarios, due to complex, variable operating conditions, machines often generate unexpected faults that lead to an increment of fault types, causing the diagnosis model to be invalid. Therefore, the data of new fault types are needed to retrain the model. However, DL models suffer from catastrophic forgetting when incrementally learning new classes. To solve the problem of the diagnosis of increasing fault types, a lifelong learning method for fault diagnosis (LLMFD) is proposed in this article under the lifelong learning paradigm. The key of LLMFD is a proposed dual-branch aggregation networks (DBANets) framework that is combined with reserved exemplars to learn the new fault types without forgetting the old ones. In DBANets, each residual block layer has a dynamic block and a steady block to solve the stability–plasticity dilemma in lifelong learning. The aggregation weights are adopted to balance stability and plasticity. LLMFD is applied to a diagnosis case of incremental fault types. Results verify that LLMFD is superior to other lifelong learning methods and has satisfactory robustness.
Article
Detecting objects and estimating their viewpoints in images are key tasks of 3D scene understanding. Recent approaches have achieved excellent results on very large benchmarks for object detection and viewpoint estimation. However, performances are still lagging behind for novel object categories with few samples. In this paper, we tackle the problems of few-shot object detection and few-shot viewpoint estimation. We demonstrate on both tasks the benefits of guiding the network prediction with class-representative features extracted from data in different modalities: image patches for object detection, and aligned 3D models for viewpoint estimation. Despite its simplicity, our method outperforms state-of-the-art methods by a large margin on a range of datasets, including PASCAL and COCO for few-shot object detection, and Pascal3D+ and ObjectNet3D for few-shot viewpoint estimation. Furthermore, when the 3D model is not available, we introduce a simple category-agnostic viewpoint estimation method by exploiting geometrical similarities and consistent pose labelling across different classes. While it moderately reduces performance, this approach still obtains better results than previous methods in this setting. Last, for the first time, we tackle the combination of both few-shot tasks, on three challenging benchmarks for viewpoint estimation in the wild, ObjectNet3D, Pascal3D+ and Pix3D, showing very promising results.
Article
Object detection is a well-known task in the field of computer vision, especially the small target detection problem that has aroused great academic attention. In order to improve the detection performance of small objects, in this article, a novel enhanced multiscale feature fusion method is proposed, namely, the atrous spatial pyramid pooling-balanced-feature pyramid network (ABFPN). In particular, the atrous convolution operators with different dilation rates are employed to make full use of context information, where the skip connection is applied to achieve sufficient feature fusions. In addition, there is a balanced module to integrate and enhance features at different levels. The performance of the proposed ABFPN is evaluated on three public benchmark datasets, and experimental results demonstrate that it is a reliable and efficient feature fusion method. Furthermore, in order to validate the applicational potential in small objects, the developed ABFPN is utilized to detect surface tiny defects of the printed circuit board (PCB), which acts as the neck part of an improved PCB defect detection (IPDD) framework. While designing the IPDD, several powerful strategies are also employed to further improve the overall performance, which is evaluated via extensive ablation studies. Experiments on a public PCB defect detection database have demonstrated the superiority of the designed IPDD framework against the other seven state-of-the-art methods, which further validates the practicality of the proposed ABFPN.
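
A minimal sketch of an atrous spatial pyramid block with different dilation rates, illustrating the context-aggregation idea behind ABFPN, is shown below; it is not the reference implementation, and the rates and channel counts are assumptions.

```python
# Sketch: parallel 3x3 convolutions with different dilation rates, fused by 1x1 conv.
import torch
import torch.nn as nn

class ASPPBlock(nn.Module):
    def __init__(self, c_in, c_out, rates=(1, 3, 6)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(c_in, c_out, 3, padding=r, dilation=r) for r in rates
        )
        self.project = nn.Conv2d(len(rates) * c_out, c_out, 1)

    def forward(self, x):
        return self.project(torch.cat([b(x) for b in self.branches], dim=1))
```
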
Article
Deep learning based algorithms have been widely employed to build reliable steel surface defect detection systems, which are important for manufacturing. The performance of deep learning models relies heavily on abundant annotated data. Nevertheless, the labeled image volume in industrial datasets is often limited. The scarcity of training data would lead to poor detection precision. To tackle this issue, we propose the first few-shot defect detection framework. Through pre-training models using data relevant to the target task, the proposed framework can produce well-trained networks with a few labeled images. Meanwhile, we release the first publicly available few-shot defect detection dataset, namely few-shot NEU-DET (FS-ND). This dataset will serve as a fair benchmark for contrasting various methods. Afterwards, we analyze the characteristics of steel surface defect detection. It is observed that the limited amount of training data can hardly cover the data distributions in practical applications. Given this observation, we develop two domain generalization strategies that enhance the appearance and scale diversity of extracted features. Furthermore, it is found that noise existing in industrial images could result in the collapse of models. To address this problem, we devise a noise regularization strategy that improves the robustness of trained models significantly. We have conducted extensive experiments to evaluate the effectiveness of our framework. The results indicate that our framework outperforms the contrasted baseline by around 15 mAP and achieves comparable performance with models trained using abundant data.
Article
In the integrated circuit (IC) packaging, the surface defect detection of flexible printed circuit boards (FPCBs) is important to control the quality of IC. Although various computer vision (CV)-based object detection frameworks have been widely used in industrial surface defect detection scenarios, FPCB surface defect detection is still challenging due to non-salient defects and the similarities between diverse defects on FPCBs. To solve this problem, a decoupled two-stage object detection framework based on convolutional neural networks (CNNs) is proposed, wherein the localization task and the classification task are decoupled through two specific modules. Specifically, to effectively locate non-salient defects, a multi-hierarchical aggregation (MHA) block is proposed as a location feature (LF) enhancement module in the defect localization task. Meanwhile, to accurately classify similar defects, a locally non-local (LNL) block is presented as a SEF enhancement module in the defect classification task. What is more, an FPCB surface defect detection dataset (FPCB-DET) is built with corresponding defect category and defect location annotations. Evaluated on FPCB-DET, the proposed framework achieves state-of-the-art (SOTA) accuracy of 94.15% mean average precision (mAP) compared with the existing surface defect detection networks. The source code and dataset will soon be available at https://github.com/SCUTyzy/decoupled-two-stage-framework.
Article
Artificial neural networks thrive in solving the classification problem for a particular rigid task, acquiring knowledge through generalized learning behaviour from a distinct training phase. The resulting network resembles a static entity of knowledge, with endeavours to extend this knowledge without targeting the original task resulting in catastrophic forgetting. Continual learning shifts this paradigm towards networks that can continually accumulate knowledge over different tasks without the need to retrain from scratch. We focus on task incremental classification, where tasks arrive sequentially and are delineated by clear boundaries. Our main contributions concern 1) a taxonomy and extensive overview of the state-of-the-art, 2) a novel framework to continually determine the stability-plasticity trade-off of the continual learner, and 3) a comprehensive experimental comparison of 11 state-of-the-art continual learning methods and 4 baselines. We empirically scrutinize method strengths and weaknesses on three benchmarks, considering Tiny Imagenet, large-scale unbalanced iNaturalist, and a sequence of recognition datasets. We study the influence of model capacity, weight decay and dropout regularization, and the order in which the tasks are presented, and qualitatively compare methods in terms of required memory, computation time and storage.
Article
When a detection model that has been well-trained on a set of classes faces new classes, incremental learning is always necessary to adapt the model to detect the new classes. In most scenarios, it is required to preserve the learned knowledge of the old classes during incremental learning rather than reusing the training data from the old classes. Since the objects in remote sensing images often appear in various sizes, arbitrary directions, and dense distribution, incremental learning-based object detection becomes even more difficult. In this article, a new architecture for incremental object detection is proposed based on feature pyramid and knowledge distillation. Especially, by means of a feature pyramid network (FPN), the objects with various scales are detected in the different layers of the feature pyramid. Motivated by Learning without Forgetting (LwF), a new branch is expanded in the last layer of FPN, and knowledge distillation is applied to the outputs of the old branch to maintain the old learning capability for the old classes. Multitask learning is adopted to jointly optimize the losses from the two branches. Experiments on two widely used remote sensing data sets show our promising performance compared with state-of-the-art incremental object detection methods.
Chapter
Few-shot object detection (FSOD) helps detectors adapt to unseen classes with few training instances, and is useful when manual annotation is time-consuming or data acquisition is limited. Unlike previous attempts that exploit few-shot classification techniques to facilitate FSOD, this work highlights the necessity of handling the problem of scale variations, which is challenging due to the unique sample distribution. To this end, we propose a Multi-scale Positive Sample Refinement (MPSR) approach to enrich object scales in FSOD. It generates multi-scale positive samples as object pyramids and refines the prediction at various scales. We demonstrate its advantage by integrating it as an auxiliary branch to the popular architecture of Faster R-CNN with FPN, delivering a strong FSOD solution. Several experiments are conducted on PASCAL VOC and MS COCO, and the proposed approach achieves state of the art results and significantly outperforms other counterparts, which shows its effectiveness. Code is available at https://github.com/jiaxi-wu/MPSR.
Article
Few-shot learning aims to learn a well-performing model from a few labeled examples. Recently, quite a few works propose to learn a predictor to directly generate model parameter weights with the episodic training strategy of meta-learning and achieve fairly promising performance. However, the predictor in these works is task-agnostic, which means that the predictor cannot adjust to novel tasks in the testing phase. In this article, we propose a novel meta-learning method to learn how to learn a task-adaptive classifier-predictor to generate classifier weights for few-shot classification. Specifically, a meta classifier-predictor module (MPM) is introduced to learn how to adaptively update a task-agnostic classifier-predictor to a task-specialized one on a novel task with a newly proposed center-uniqueness loss function. Compared with previous works, our task-adaptive classifier-predictor can better capture characteristics of each category in a novel task and thus generate a more accurate and effective classifier. Our method is evaluated on two commonly used benchmarks for few-shot classification, i.e., miniImageNet and tieredImageNet. Ablation study verifies the necessity of learning a task-adaptive classifier-predictor and the effectiveness of our newly proposed center-uniqueness loss. Moreover, our method achieves the state-of-the-art performance on both benchmarks, thus demonstrating its superiority.
Article
Automated computer-vision-based defect detection has received much attention with the increasing surface quality assurance demands for the industrial manufacturing of flat steels. This paper attempts to present a comprehensive survey on surface defect detection technologies by reviewing about 120 publications over the last two decades for three typical flat steel products of con-casting slabs, hot- and cold-rolled steel strips. According to the nature of algorithms as well as image features, the existing methodologies are categorized into four groups: Statistical, spectral, model-based and machine learning. These literatures are summarized in this review to enable easy referral to suitable methods for diverse application scenarios in steel mills. Realization recommendations and future research trends are also addressed at an abstract level.
Chapter
We propose CornerNet, a new approach to object detection where we detect an object bounding box as a pair of keypoints, the top-left corner and the bottom-right corner, using a single convolutional neural network. By detecting objects as paired keypoints, we eliminate the need for designing a set of anchor boxes commonly used in prior single-stage detectors. In addition to our novel formulation, we introduce corner pooling, a new type of pooling layer that helps the network better localize corners. Experiments show that CornerNet achieves a 42.1% AP on MS COCO, outperforming all existing one-stage detectors.
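
Corner pooling as described above can be sketched compactly; for the top-left corner, each position takes the maximum over everything to its right and everything below it, then sums the two maps. The sketch below is an illustration rather than the CornerNet reference code.

```python
# Top-left corner pooling via reversed cumulative maxima.
import torch

def top_left_corner_pool(x):
    # x: (N, C, H, W)
    right_max = x.flip(-1).cummax(dim=-1).values.flip(-1)   # max over columns to the right
    bottom_max = x.flip(-2).cummax(dim=-2).values.flip(-2)  # max over rows below
    return right_max + bottom_max
```
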
Article
The highest accuracy object detectors to date are based on a two-stage approach popularized by R-CNN, where a classifier is applied to a sparse set of candidate object locations. In contrast, one-stage detectors that are applied over a regular, dense sampling of possible object locations have the potential to be faster and simpler, but have trailed the accuracy of two-stage detectors thus far. In this paper, we investigate why this is the case. We discover that the extreme foreground-background class imbalance encountered during training of dense detectors is the central cause. We propose to address this class imbalance by reshaping the standard cross entropy loss such that it down-weights the loss assigned to well-classified examples. Our novel Focal Loss focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training. To evaluate the effectiveness of our loss, we design and train a simple dense detector we call RetinaNet. Our results show that when trained with the focal loss, RetinaNet is able to match the speed of previous one-stage detectors while surpassing the accuracy of all existing state-of-the-art two-stage detectors. Code is at: https://github.com/facebookresearch/Detectron.
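
The alpha-balanced focal loss described above has a standard compact form; a reference-style sketch (not the Detectron code) follows.

```python
# Binary focal loss: down-weight the loss of well-classified examples.
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    # targets: float tensor of 0/1 values with the same shape as logits
    p = torch.sigmoid(logits)
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = p * targets + (1 - p) * (1 - targets)              # prob. of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)  # class-balancing weight
    return (alpha_t * (1 - p_t) ** gamma * ce).mean()
```
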
Article
We present some updates to YOLO! We made a bunch of little design changes to make it better. We also trained this new network that's pretty swell. It's a little bigger than last time but more accurate. It's still fast though, don't worry. At 320x320 YOLOv3 runs in 22 ms at 28.2 mAP, as accurate as SSD but three times faster. When we look at the old .5 IOU mAP detection metric YOLOv3 is quite good. It achieves 57.9 mAP@50 in 51 ms on a Titan X, compared to 57.5 mAP@50 in 198 ms by RetinaNet, similar performance but 3.8x faster. As always, all the code is online at https://pjreddie.com/yolo/
Article
State-of-the-art object detection networks depend on region proposal algorithms to hypothesize object locations. Advances like SPPnet and Fast R-CNN have reduced the running time of these detection networks, exposing region proposal computation as a bottleneck. In this work, we introduce a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals. An RPN is a fully-convolutional network that simultaneously predicts object bounds and objectness scores at each position. RPNs are trained end-to-end to generate high-quality region proposals, which are used by Fast R-CNN for detection. With a simple alternating optimization, RPN and Fast R-CNN can be trained to share convolutional features. For the very deep VGG-16 model, our detection system has a frame rate of 5fps (including all steps) on a GPU, while achieving state-of-the-art object detection accuracy on PASCAL VOC 2007 (73.2% mAP) and 2012 (70.4% mAP) using 300 proposals per image. The code will be released.
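
Faster R-CNN with an RPN is available directly in torchvision; the short usage sketch below illustrates the library API and is not the original VGG-16 implementation.

```python
# Run a pretrained Faster R-CNN (ResNet-50 FPN backbone) on a dummy image.
# weights="DEFAULT" requires a recent torchvision; older versions use pretrained=True.
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn

model = fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()
image = torch.rand(3, 480, 640)            # dummy RGB image with values in [0, 1]
with torch.no_grad():
    detections = model([image])[0]         # dict with boxes, labels, scores
print(detections["boxes"].shape, detections["scores"][:5])
```
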
Article
In this paper, a novel fabric defect detection scheme based on HOG and SVM is proposed. Firstly, each block-based feature of the image is encoded using histograms of oriented gradients (HOG), which are insensitive to various lightings and noises. Then, a powerful feature selection algorithm, AdaBoost, is performed to automatically select a small set of discriminative HOG features in order to achieve robust detection results. Finally, a support vector machine (SVM) is used to classify the fabric defects. Experimental results demonstrate the efficiency of the proposed algorithm.
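
A HOG-plus-SVM pipeline in the spirit of the method above can be sketched with scikit-image and scikit-learn; the AdaBoost feature-selection step is omitted, and the data names are placeholders.

```python
# Sketch: HOG descriptors fed to an SVM classifier for defect vs. normal patches.
import numpy as np
from skimage.feature import hog
from sklearn.svm import SVC

def hog_features(images):
    # images: iterable of 2-D grayscale patches
    return np.array([
        hog(im, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
        for im in images
    ])

# X_train / X_test: grayscale patches, y_train: defect vs. normal labels (assumed data)
# clf = SVC(kernel="rbf").fit(hog_features(X_train), y_train)
# pred = clf.predict(hog_features(X_test))
```
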
Article
This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.
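
SIFT extraction and nearest-neighbor matching as described above are exposed by OpenCV; the short illustration below uses placeholder file names.

```python
# SIFT keypoints, descriptors, and ratio-test matching with OpenCV (>= 4.4).
import cv2

img1 = cv2.imread("object.png", cv2.IMREAD_GRAYSCALE)  # placeholder paths
img2 = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]  # Lowe's ratio test
```
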
X. Wang, T. Huang, J. Gonzalez, T. Darrell, and F. Yu, "Frustratingly Simple Few-Shot Object Detection," in Proc. Int. Conf. Mach. Learn., PMLR, Nov. 2020, pp. 9919-9928.
H. Al Osman and S. Shirmohammadi, "Machine Learning in Measurement Part 2: Uncertainty Quantification," IEEE Instrum. Meas. Mag., vol. 24, no. 3, pp. 23-27, May 2021, doi: 10.1109/MIM.2021.9436102.
F. He, S. Tang, S. Mehrkanoon, X. Huang, and J. Yang, "A Real-time PCB Defect Detector Based on Supervised and Semi-supervised Learning," Appl. Intell., 2020.
K. Chen et al., "MMDetection: Open MMLab Detection Toolbox and Benchmark," arXiv:1906.07155 [cs, eess], Jun. 2019.