Preprint

Concealed Object Detection for Passive Millimeter-Wave Security Imaging Based on Task-Aligned Detection Transformer


Abstract

Passive millimeter-wave (PMMW) imaging is a promising technique for human security screening. Several popular object detection networks have been applied to PMMW images. However, because PMMW images have low resolution and high noise, deep-learning-based detection of hidden objects in them usually suffers from low accuracy and low classification confidence. To tackle these problems, this paper proposes a Task-Aligned Detection Transformer network, named PMMW-DETR. In the first stage, a Denoising Coarse-to-Fine Transformer (DCFT) backbone is designed to extract long- and short-range features at different scales. In the second stage, we propose the Query Selection module to introduce learned spatial features into the network as prior knowledge, which enhances the semantic perception capability of the network. In the third stage, to improve classification performance, we employ a Task-Aligned Dual-Head block to decouple the classification and regression tasks. On our self-developed PMMW security screening dataset, experimental results, including a comparison with state-of-the-art (SOTA) methods and an ablation study, demonstrate that PMMW-DETR obtains higher accuracy and classification confidence than previous works and is robust to low-quality PMMW images.
Article
Hidden object detection has attracted a lot of attention recently due to its importance in security surveillance and other real-world applications. It is considered one of the most challenging tasks in computer vision. Deep learning has played a significant role in the rapid technical evolution of this field over the past decade. This article presents a roadmap of hidden object detection, starting from its beginnings in 1984, and extensively reviews the technical evolution and shifts in detection approaches. To the best of our knowledge, this is the first review carried out in this field. Various aspects of hidden object detection are discussed, including the basic building blocks of the detection system, historical milestone detectors, detection datasets, challenges, pre-processing techniques, modern state-of-the-art detection frameworks, and the evaluation metrics used to assess detection performance. Towards the end, the paper highlights some unanswered research questions and possible future prospects in the field of hidden object detection. This review aims to serve as a valuable resource for researchers, practitioners, and enthusiasts seeking a thorough understanding of the concepts, advancements, and challenges in this dynamic area of computer vision, as hidden object detection continues to have an impact on a variety of interdisciplinary fields of research.
Article
In public spaces, conducting security checks to detect concealed objects carried on the human body is crucial for enhancing global anti-terrorist measures. Terahertz imaging has recently played a pivotal role in concealed object detection. However, previous studies have faced significant challenges in achieving high accuracy and performance. To address these issues, we propose a YOLOv5m model for detecting hidden objects beneath human clothing. We employ the CSPDarknet53 block to reduce noise and enhance discriminative power. Object location and size are identified using a PANet and the prediction head. To reduce computational complexity and obtain highly relevant features, we utilize multiple convolutional layers. The non-maximum suppression (NMS) block eliminates duplicate boxes and retains high-quality bounding boxes. Hyperparameter tuning is performed using the Mutation Enabled Salp Swarm Algorithm, resulting in improved detection accuracy and reduced processing time. Our proposed model achieves a precision of 98.99%, recall of 97.80%, F1 score of 98.05%, detection rate of 96.50% and execution time of 135 s. Our method outperforms existing approaches such as CNN, YOLO3, AC-SDBSCAN, YOLO-v2, RaadNet and SPFAN. We train and test our proposed method using a terahertz video dataset, demonstrating excellent results with high precision.
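The duplicate-box elimination step mentioned above can be illustrated with a minimal sketch of greedy non-maximum suppression. This is a generic NumPy version, not the authors' implementation; the function names, the `[x1, y1, x2, y2]` box layout, and the 0.5 IoU threshold are assumptions made for illustration.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, all given as [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    # Clip at zero so non-overlapping boxes get zero intersection area.
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping it too much, repeat."""
    order = np.argsort(scores)[::-1]  # indices sorted by descending score
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        if rest.size == 0:
            break
        overlaps = iou(boxes[best], boxes[rest])
        order = rest[overlaps <= iou_threshold]
    return keep

# Hypothetical detections: the first two overlap heavily, the third is separate.
boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
kept = nms(boxes, scores)
```

Here the second box overlaps the first with an IoU of about 0.68, so it is suppressed and only the first and third boxes survive.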
Article
A Ka-band 1024-channel passive millimeter wave (PMMW) imager (BHU-1024) based on the synthetic aperture interferometric radiometer (SAIR) technique has been developed by Beihang University for security screening. BHU-1024 uses a linear phased array to obtain resolution in the horizontal direction and aperture synthesis to obtain resolution in the vertical direction. The imager is designed for detecting concealed weapons on the human body and operates under the near-field condition of the antenna array. Thus, the conventional direct Fourier imaging theory, which is based on the far-field approximation, no longer applies. In this paper, a novel near-field image reconstruction method based on the beamforming technique is proposed. In this method, we derive the near-field imaging formula from beamforming theory and demonstrate the feasibility of obtaining the spatial brightness temperature distribution by beam focusing. In the design of the beamformer, we further optimize the weight vector to suppress the sidelobe levels of the synthesized beam. Simulation and measurement results demonstrate that the proposed method is an effective imaging method for near-field passive millimeter wave imaging. The proposed method has been applied to the actual passive millimeter wave imaging system BHU-1024.
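The idea of near-field beam focusing can be sketched with a toy one-dimensional example. This is not the BHU-1024 reconstruction method: it assumes a single unit-amplitude point source, a 32-element half-wavelength linear array, an 8.6 mm wavelength (roughly 35 GHz), and a Hamming taper as a stand-in for the optimized weight vector. All names below are hypothetical.

```python
import numpy as np

WAVELENGTH = 8.6e-3  # metres; roughly 35 GHz (Ka-band), an assumed value

def steering_vector(x, z, elements, wavelength=WAVELENGTH):
    """Near-field steering vector: spherical-wave phase from point (x, z) to each element."""
    k = 2 * np.pi / wavelength
    dist = np.sqrt((x - elements) ** 2 + z ** 2)  # exact distance, no far-field approximation
    return np.exp(-1j * k * dist)

def focus_image(cov, x_grid, z, elements, weights=None):
    """Beamformer output power w^H R w at each candidate point, divided by (sum of weights)^2."""
    if weights is None:
        weights = np.ones(len(elements))
    power = np.empty(len(x_grid))
    for i, x in enumerate(x_grid):
        w = weights * steering_vector(x, z, elements)
        power[i] = np.real(np.conj(w) @ cov @ w)
    return power / weights.sum() ** 2

# Assumed geometry: 32 elements at half-wavelength spacing, target plane 2 m away.
n = 32
elements = (np.arange(n) - (n - 1) / 2) * WAVELENGTH / 2
z = 2.0
source_x = 0.1

# Covariance (visibility) matrix of one noise-free point source.
s = steering_vector(source_x, z, elements)
cov = np.outer(s, np.conj(s))

x_grid = np.linspace(-0.5, 0.5, 201)
image = focus_image(cov, x_grid, z, elements)
tapered = focus_image(cov, x_grid, z, elements, weights=np.hamming(n))
```

Both `image` and `tapered` peak at the source position. The taper trades a wider main lobe for lower sidelobes, which is the usual motivation for tuning the weight vector.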
Article
This paper presents a feasibility study of using a passive millimeter-wave imaging (PMMWI) system to assess burn wounds and the potential for monitoring the healing process under dressing materials, without their painful removal. Experimental images obtained from ex vivo porcine skin samples indicate that a ThruVision passive imager operating over the band 232–268 GHz can be used for diagnosing burns and for potentially monitoring the healing under dressing materials. Experimental images show that single and multiple burns can be observed through dressing materials. As the interaction of millimeter-wave (MMW) radiation with the human body is almost exclusively with the skin, the major outcomes of the research are that PMMWI is capable of discriminating burn-damaged skin from unburned skin, and that these measurements can be made through bandages without the imager making any physical contact with the skin or the bandage. This highlights the opportunity to assess and monitor the healing of burn wounds without removing dressing materials. The key innovation in this work is detecting single and multiple burns under dressing materials without contact with the skin and without exposing the skin to any type of manmade radiation (i.e., passive sensing technology). These images represent the first demonstration of imaging burn wounds under dressing materials using a passive sensing imager.
Chapter
Detection transformers have achieved competitive performance on the sample-rich COCO dataset. However, we show most of them suffer from significant performance drops on small-size datasets, like Cityscapes. In other words, the detection transformers are generally data-hungry. To tackle this problem, we empirically analyze the factors that affect data efficiency, through a step-by-step transition from a data-efficient RCNN variant to the representative DETR. The empirical results suggest that sparse feature sampling from local image areas holds the key. Based on this observation, we alleviate the data-hungry issue of existing detection transformers by simply altering how key and value sequences are constructed in the cross-attention layer, with minimal modifications to the original models. Besides, we introduce a simple yet effective label augmentation method to provide richer supervision and improve data efficiency. Experiments show that our method can be readily applied to different detection transformers and improve their performance on both small-size and sample-rich datasets. Code will be made publicly available at https://github.com/encounter1997/DE-DETRs.
Keywords: Data efficiency, Detection transformer, Sparse feature, Rich supervision, Label augmentation
Article
Atmospheric data collected through spaceborne millimeter (mm) and submillimeter-wave (sub-mm-wave) radiometry combined with complex numerical weather prediction models allow meteorological offices around the world to deliver accurate weather forecasting services of great socioeconomic value. Additionally, the same radiometric technology is pivotal to other applications that include but are not limited to the study of the upper atmosphere of Earth, the understanding of the effects of climate change, and the study of the atmosphere and geological surface of other planets in the solar system. Table 1 shows some examples of remote sensing instruments where there is a demand for mm-wave, sub-mm-wave, and supra-THz radiometers and whose scientific success is underpinned, in great part, by the performance of the fundamental building blocks that form the instruments.
Article
Passive millimeter-wave (PMMW) imaging emits no radiation and penetrates most clothing well, hence it has been a reliable security technique for the detection of concealed objects. At present, deep learning-based approaches have shown a great advantage in automatic detection. However, the low resolution and high background noise of PMMW images make the task difficult, especially for small objects. In this article, we propose a transformer-based anchor-free detector that integrates local/global information and adaptive label assignment to address these issues. Compared with the existing anchor-based methods adopted for PMMW image detection, our detector can further improve efficiency and removes the handcrafted anchor boxes. Specifically, we first employ a hierarchical transformer architecture as the backbone, which can model long-range dependencies of the features at different scales. We propose a new strategy that calculates self-attention within the local region and the global region in turn, providing detailed and global features of small objects. Second, we design a learnable position encoding module to obtain positional information between pixels. We propose an attention weighting module that enables the network to adaptively refine the features and distinguish positive and negative samples. Finally, we propose an adaptive label assignment strategy to dynamically optimize the number of positive samples used for detection. The proposed method is validated on our self-developed PMMW imager. The experimental results show that our method achieves better accuracy and competitive speed compared with the state-of-the-art methods.
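The strategy of computing self-attention within local regions and then globally can be illustrated with a small NumPy sketch. It is a simplification under stated assumptions, not the paper's backbone: tokens form a 1-D sequence, "local regions" are non-overlapping windows, and the query, key and value projections are the identity. All function names are hypothetical.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(x):
    """Single-head self-attention with identity projections (Q = K = V = x)."""
    d = x.shape[-1]
    scores = x @ np.swapaxes(x, -1, -2) / np.sqrt(d)
    return softmax(scores) @ x

def local_attention(x, window):
    """Attend only within non-overlapping windows of `window` tokens."""
    n, d = x.shape
    assert n % window == 0, "sequence length must be a multiple of the window size"
    blocks = x.reshape(n // window, window, d)  # one batch entry per window
    return attention(blocks).reshape(n, d)

def local_then_global(x, window):
    """Local attention for fine detail, followed by global attention over all tokens."""
    return attention(local_attention(x, window))

# Hypothetical input: 8 tokens with 4 features each, windows of 4 tokens.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(8, 4))
local_out = local_attention(tokens, window=4)
mixed_out = local_then_global(tokens, window=4)
```

In the local step each token sees only its own window, so the cost grows with the window size rather than the full sequence length. The global step then mixes information across windows.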
Article
The forthcoming spaceborne ice cloud imager (ICI) millimeter/submillimeter-wave radiometer is designed to support climate monitoring and the representation of ice clouds in weather and climate models. Assessing the correct pointing of each ICI channel is essential for delivering high-quality products. Nevertheless, the ICI channels have a limited chance to sample surface features due to strong atmospheric gas absorption. Only for channels within 183–325 GHz do a few locations worldwide offer sufficiently dry environmental conditions to allow occasional sampling of surface landmark targets. In this work, for the first time, we investigate the possibility of exploiting distinctive atmospheric signatures, namely, those generated by water vapor masses and deep convective clouds, for absolute and relative geolocation validation purposes. The main idea behind the proposed approach is: 1) to georeference a pivotal channel at 183 GHz, exploiting the synergy of infrared and microwave collocated observations (absolute geolocation) and 2) to test the relative pointing accuracy of all the other ICI channels with respect to the pivotal one (relative geolocation). Observations of the Special Sensor Microwave Imager/Sounder (SSMIS), the Spinning Enhanced Visible and Infrared Imager (SEVIRI), and radiative transfer simulations are used to pursue these goals. Results show that water vapor mass (WVM) atmospheric targets can achieve an absolute pointing accuracy for the lower ICI channels of the order of 5.1 km (i.e., 32% of the 16-km footprint size). Conversely, when dealing with the relative pointing accuracy of higher ICI channels, the expected pointing accuracy is smaller than 4.1 km (i.e., 25% of the footprint size).