Figure 6 - available via license: Creative Commons Attribution 4.0 International
Visual qualitative comparison of average feature maps and average heat maps, via Local Multi-scale Learning.
Source publication
In this paper, we propose an end-to-end single-image super-resolution neural network that leverages hybrid multi-scale image features. Unlike most existing convolutional neural network (CNN) based solutions, our network builds on the observation that the image features extracted by a CNN are hybrid multi-scale features: both mult...
Context in source publication
Context 1
... The ablation experiment shows that, compared with Convblock, the efficient and lightweight feature-extraction module EFblock requires far fewer parameters and calculations while performing comparably well. Figure 6 shows the feature maps and heat maps extracted by the RF series of multi-scale modules with different receptive-field sizes. As the receptive field increases, the extracted texture features become progressively more abstract, shifting from fine details to image contours. ...
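The trend described above, where larger receptive fields yield coarser, contour-level features, follows from how stacked small kernels compound. The short NumPy sketch below is a hypothetical illustration (not the paper's EFblock or RF modules): it feeds a unit impulse through stacked width-3 filters and counts how many input positions influence the output.

```python
import numpy as np

def receptive_field(num_layers, kernel_size=3):
    """Feed a unit impulse through `num_layers` stacked 1-D box filters
    and count how many input positions influence the output."""
    kernel = np.ones(kernel_size)
    response = np.array([1.0])                    # unit impulse
    for _ in range(num_layers):
        response = np.convolve(response, kernel)  # 'full' mode keeps the growth visible
    return np.count_nonzero(response)             # support width = effective receptive field

# Stacking small kernels grows the receptive field linearly:
# 1 layer -> 3, 2 layers -> 5, 3 layers -> 7 input positions.
```

Each extra layer widens the support by kernel_size − 1, which is why modules with larger receptive fields respond to progressively coarser structure.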
Similar publications
NuText is a novel music-encoding technology based on numbered musical notation. This paper outlines the notation principles of numbered musical notation and delineates the conversion relationship and encoding protocol between NuText and numbered musical notation. Furthermore, this study demonstrates NuText's playback software and its practical appl...
Citations
... The purpose was to enhance the sensed images. Huang et al. [37] propose a single-image super-resolution neural network that exploits hybrid multi-scale image features; it extracts both local texture features and global structural features and achieves higher performance with fewer parameters. ...
The differential count of white blood cells (WBCs) can effectively provide disease information for patients. Existing stained microscopic WBC classification usually requires complex sample-preparation steps and is easily affected by external conditions such as illumination. Meanwhile, the inconspicuous nuclei of stain-free WBCs pose great challenges for WBC classification. As such, image enhancement, as a preprocessing step for image classification, is essential for improving the image quality of stain-free WBCs. However, traditional and existing convolutional neural network (CNN)-based image enhancement techniques are typically designed as standalone modules aimed at improving perceptual quality for humans, without considering their impact on downstream computer vision tasks such as classification. Therefore, this work proposes a novel model, UR-Net, which consists of an image enhancement network built on ResUNet with an attention mechanism and a ResNet classification network. The enhancement model is integrated into the classification model for joint training to improve classification performance on stain-free WBCs. The experimental results demonstrate that, compared to models without image enhancement and to previous enhancement-plus-classification models, our proposed model achieved the best classification accuracy of 83.34% on our stain-free WBC dataset.
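The joint-training idea can be sketched abstractly: the enhancer's output feeds the classifier directly, so gradients from the classification loss also update the enhancer. The function names and the balancing weight below are hypothetical illustrations, not UR-Net's actual components or loss weighting.

```python
def ur_net_forward(image, enhance, classify):
    """Joint pipeline: the enhancement output feeds the classifier directly,
    so one combined objective can update both networks end to end."""
    enhanced = enhance(image)
    logits = classify(enhanced)
    return enhanced, logits

def joint_loss(enh_loss, cls_loss, weight=0.5):
    # Hypothetical balancing weight; the paper's actual weighting is not given here.
    return weight * enh_loss + (1.0 - weight) * cls_loss
```

Training the two stages jointly, rather than as standalone modules, is what lets the enhancement be tuned for classification accuracy instead of only human-perceived quality.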
... For evaluation metrics, the Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity (SSIM) index are adopted, as is common practice in the literature for this research domain. We compare the performance of the proposed model against state-of-the-art methods including Bicubic [49], SRCNN [50], FSRCNN [31], GLADSR [51], HMSF [52], and CDLSR [53]. Before demonstrating the experimental performance of our proposed approach, we describe the adopted dataset and the implementation details used in the current analysis. ...
... We compare our proposed model with state-of-the-art methods for multi-modal image super-resolution, namely Bicubic [49], SRCNN [50], FSRCNN [31], GLADSR [51], HMSF [52], and CDLSR [53]. The results for these approaches were obtained by running their publicly available code. ...
... Alternatively, for the test image 'jelly bean', the second-best approach GLADSR [51] performed better than our proposed scheme by 0.03 dB PSNR and 0.0003 SSIM. Additionally, HMSF [52] attains a PSNR of 42.67 dB on the test image 'Egyptian', versus 42.65 dB for the proposed approach. The comparative visual performance of the proposed scheme and the competing methods at 4× upscaling is shown in Fig. 4. Ultimately, the results show that the proposed method achieves improved performance, with gains of up to 1.11 dB PSNR and 0.0118 SSIM at 4× upscaling over the state-of-the-art competitive approaches. ...
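The PSNR gaps quoted above (0.03 dB, 1.11 dB) come from the standard definition, which a few lines of NumPy reproduce. This is a generic sketch, with the peak value assumed to be 255 for 8-bit images:

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak Signal-to-Noise Ratio in dB between two same-sized images."""
    mse = np.mean((reference.astype(np.float64) - estimate.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")            # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

Because PSNR is a log of the mean squared error, a 0.03 dB gap corresponds to a very small change in MSE, while a 1.11 dB gap is a substantial one. SSIM, by contrast, is a unitless index in [−1, 1], which is why it is not reported in dB.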
This paper proposes a wavelet domain-based method for multispectral image super-resolution. The stationary wavelet transform (SWT) is used to decompose the multispectral image into directional wavelet components, and a joint dictionary learning algorithm is proposed for each wavelet component. Using sparse and redundant representations, the proposed approach captures intrinsic multispectral features through wavelet-domain learning, exploiting the up-sampling property of the SWT. The proposed method can learn and recover these image features more accurately. To validate the proposed method, we conducted comprehensive experiments and present a comparison with state-of-the-art algorithms on the PSNR and SSIM evaluation metrics. The results of the experiments indicate that the proposed method outperforms state-of-the-art methods.
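The "up-sampling property" refers to the SWT keeping every subband at the input size (no decimation), which is what makes per-subband processing straightforward. A simplified 1-D Haar sketch, not the paper's 2-D multispectral pipeline or its dictionary learning, illustrates this:

```python
import numpy as np

def haar_swt_level1(x):
    """One level of an undecimated (stationary) Haar transform.
    Unlike the decimated DWT, both outputs keep the input length."""
    shifted = np.roll(x, -1)          # circular boundary handling
    approx = (x + shifted) / 2.0      # low-pass: local average
    detail = (x - shifted) / 2.0      # high-pass: local difference
    return approx, detail

signal = np.array([4.0, 2.0, 6.0, 8.0])
a, d = haar_swt_level1(signal)
# Perfect reconstruction: approx + detail recovers the signal exactly,
# and both subbands have the same length as the input (no decimation).
```

Because every subband stays full-size, a dictionary learned per subband maps directly back onto the image grid without an explicit interpolation step.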
... The Faster R-CNN architecture is a two-stage object detector that has proven to be highly accurate and end-to-end trainable. Our work fine-tunes this architecture by integrating the MobileNetV3 [27] backbone and a feature pyramid network for extracting multi-scale features [28]. MobileNetV3 is a lightweight neural network suitable for devices with a limited computational budget. ...
Ski goggles help protect the eyes and enhance eyesight. The most important part of ski goggles is their lenses. The quality of the lenses has leaped with technological advances, but there are still defects on their surface during manufacturing. This study develops a deep learning-based defect detection system for ski goggles lenses. The first step is to design the image acquisition model that combines cameras and light sources. This step aims to capture clear and high-resolution images on the entire surface of the lenses. Next, defect categories are identified, including scratches, watermarks, spotlight, stains, dust-line, and dust-spot. They are labeled to create the ski goggles lenses defect dataset. Finally, the defects are automatically detected by fine-tuning the mobile-friendly object detection model. The mentioned defect detection model is the MobileNetV3 backbone used in a feature pyramid network (FPN) along with the Faster-RCNN detector. The fine-tuning includes: replacing the default ResNet50 backbone with a combination of MobileNetV3 and FPN; adjusting the hyper-parameter of the region proposal network (RPN) to suit the tiny defects; and reducing the number of the output channel in FPN to increase computational performance. Our experiments demonstrate the effectiveness of defect detection; additionally, the inference speed is fast. The defect detection accuracy achieves a mean average precision (mAP) of 55%. The work automatically integrates all steps, from capturing images to defect detection. Furthermore, the lens defect dataset is publicly available to the research community on GitHub. The repository address can be found in the Data Availability Statement section.
... At present, most super-resolution reconstruction networks still use single-scale convolution kernels to extract the low-level feature information of images, which ignores many details in low-resolution images, whereas multi-scale feature extraction can effectively capture feature information at different levels (Huang et al., 2022; Meng et al., 2022). Figure 1 shows the proposed generative network. ...
Image super-resolution reconstruction improves resolution by learning the inherent features and attributes of images. However, existing super-resolution models suffer from problems such as missing details, distorted natural textures, and blurred or over-smoothed reconstructions. To address these problems, this paper proposes a Multi-scale Dual-Attention based Residual Dense Generative Adversarial Network (MARDGAN), which uses multi-branch paths to extract image features and obtain multi-scale feature information. This paper also designs a channel and spatial attention block (CSAB), which is combined with an enhanced residual dense block (ERDB) to extract multi-level depth feature information and enhance feature reuse. In addition, the multi-scale feature information extracted along the three branch paths is fused with global features, and sub-pixel convolution is used to restore the high-resolution image. The experimental results show that MARDGAN scores higher on objective evaluation metrics than other methods on multiple benchmark datasets, and its subjective visual quality is better. The model can effectively use the original image information to restore super-resolution images with clearer details and stronger authenticity.
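The final sub-pixel convolution step mentioned above rearranges an r²-times-deeper feature tensor into an r-times-larger image. The NumPy sketch below shows only that rearrangement (the preceding convolution is omitted, and the toy feature tensor is an illustration):

```python
import numpy as np

def pixel_shuffle(x, r):
    """Sub-pixel rearrangement: (C*r^2, H, W) -> (C, H*r, W*r).
    Each group of r^2 channels fills one r-by-r block of output pixels."""
    c2, h, w = x.shape
    c = c2 // (r * r)
    x = x.reshape(c, r, r, h, w)
    x = x.transpose(0, 3, 1, 4, 2)    # (C, H, r, W, r)
    return x.reshape(c, h * r, w * r)

# Toy input: 4 channels of 2x2 features, upscale factor r=2 -> one 4x4 image.
features = np.arange(4 * 2 * 2).reshape(4, 2, 2).astype(float)
hr = pixel_shuffle(features, 2)
```

Learning the upscaling in channel space and only rearranging at the end keeps all convolutions at low resolution, which is cheaper than convolving on an interpolated high-resolution grid.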
In recent years, the convergence of Artificial Intelligence (AI) and microfluidic technologies has given rise to unprecedented advancements in various fields, ranging from healthcare to environmental monitoring. Among these, AI-assisted microfluidic biological imaging has become one of the most widely applied examples, thanks to its high-throughput and high-content imaging capability and its outstanding ability to analyze and mine the massive data generated by microfluidic bio-imaging systems. AI exhibits significant potential in assisting microfluidic bio-imaging by enhancing imaging resolution and improving classification and detection performance. Therefore, in this review, we focus on key technologies and recent advancements in AI-assisted microfluidic bio-imaging sensors, presenting discussions from three aspects: sensing devices, AI, and corresponding applications. Regarding sensing devices, we offer a detailed introduction to the structure and design of commonly used imaging sensors, covering both frame-based and event-based image sensors, and present two types of frame-based sensors: charge-coupled device (CCD) and Complementary Metal-Oxide-Semiconductor (CMOS) image sensors. Regarding AI, we trace its development and summarize the machine learning and deep learning algorithms commonly used in bio-imaging, such as super-resolution, classification, and detection. Regarding applications, we list recent practical systems that integrate various AI techniques with diverse imaging sensors. Finally, we conclude with discussions of the current challenges in the field and potential future directions.
Laparoscopic surgery offers minimally invasive procedures with better patient outcomes, but the presence of smoke impairs visibility and safety. Existing learning-based desmoking methods demand large datasets and high computational resources. We propose the Progressive Frequency-Aware Network (PFAN), a lightweight GAN framework for laparoscopic image desmoking that combines the strengths of CNNs and Transformers for progressive information extraction in the frequency domain. PFAN features CNN-based Multi-scale Bottleneck-Inverting (MBI) blocks for capturing local high-frequency information and Locally-Enhanced Axial Attention Transformers (LAT) for efficiently handling global low-frequency information. PFAN desmokes laparoscopic images effectively even with limited training data. Our method outperforms state-of-the-art approaches in PSNR, SSIM, CIEDE2000, and visual quality on the Cholec80 dataset while retaining only 629K parameters. Our code and models are publicly available at: https://github.com/jlzcode/PFAN.
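PFAN's division of labor, CNN blocks for local high-frequency content and axial-attention Transformers for global low-frequency content, can be illustrated with a plain FFT band split. The circular mask and cutoff below are illustrative only, not PFAN's actual mechanism:

```python
import numpy as np

def split_frequencies(image, cutoff):
    """Split a grayscale image into low- and high-frequency parts with an FFT
    disc mask; the two bands sum back to the original image exactly."""
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    h, w = image.shape
    yy, xx = np.ogrid[:h, :w]
    dist = np.hypot(yy - h // 2, xx - w // 2)
    mask = dist <= cutoff                                # low-frequency disc
    low = np.fft.ifft2(np.fft.ifftshift(spectrum * mask)).real
    high = np.fft.ifft2(np.fft.ifftshift(spectrum * ~mask)).real
    return low, high

img = np.random.default_rng(0).random((16, 16))
low, high = split_frequencies(img, cutoff=4)
# The two bands are complementary: low + high reconstructs the image.
```

Because the bands are complementary, a module specialized for each band can process them independently and the results can be recombined without losing information.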