Jinan Gu’s research while affiliated with Jiangsu University and other places


Publications (47)


YOLOv8-TEA: Recognition Method of Tender Shoots of Tea Based on Instance Segmentation Algorithm
Article · May 2025 · 1 Read · Yidan Xi · Jinan Gu · [...] · Man Zhou

With the continuous development of artificial intelligence technology, the transformation of traditional agriculture into intelligent agriculture is quickly accelerating. However, due to the diverse growth postures of tender shoots and complex growth environments in tea plants, traditional tea picking machines are unable to precisely select the tender shoots, and the picking of high-end and premium tea still relies on manual labor, resulting in low efficiency and high costs. To address these issues, an instance segmentation algorithm named YOLOv8-TEA is proposed. Firstly, this algorithm is based on the single-stage instance segmentation algorithm YOLOv8-seg, replacing some C2f modules in the original feature extraction network with MVB, combining the advantages of convolutional neural networks (CNN) and Transformers, and adding a C2PSA module following spatial pyramid pooling (SPPF) to integrate convolution and attention mechanisms. Secondly, a learnable dynamic upsampling method is used to replace the traditional upsampling, and the CoTAttention module is added, along with the fusion of dilated convolutions in the segmentation head to enhance the learning ability of the feature fusion network. Finally, through ablation experiments and comparative experiments, the improved algorithm significantly improves the segmentation accuracy while effectively reducing the model parameters, with mAP (Box) and mAP (Mask) reaching 86.9% and 86.8%, respectively, and GFLOPs reduced to 52.7.
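The mAP (Mask) figure quoted above is conventionally built on the overlap between predicted and ground-truth instance masks, measured as intersection over union (IoU). The abstract does not state which IoU thresholds were used, so the following is only a minimal sketch of the mask IoU step itself. The function name `mask_iou` and the toy masks are illustrative assumptions, not part of the paper.

```python
import numpy as np


def mask_iou(pred: np.ndarray, target: np.ndarray) -> float:
    """Intersection over union of two binary masks with identical shape."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Both masks are empty; returning 0.0 is a convention chosen for this sketch.
        return 0.0
    intersection = np.logical_and(pred, target).sum()
    return float(intersection) / float(union)


# Toy 4x4 masks (hypothetical data): two 2x2 squares that share one column.
pred = np.zeros((4, 4), dtype=bool)
pred[0:2, 0:2] = True
target = np.zeros((4, 4), dtype=bool)
target[0:2, 1:3] = True

iou = mask_iou(pred, target)  # 2 shared pixels over 6 covered pixels
```

A predicted shoot mask would typically count as a true positive only when its IoU with a ground-truth mask exceeds a chosen threshold; average precision is then computed from the ranked predictions.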


Enhancing industrial machinery maintenance through advanced fault and novelty detection using variational autoencoder and hybrid transformer model

March 2025 · 14 Reads

In the context of Industry 4.0, new sensing and communication technologies have unlocked vast amounts of process data, offering significant potential for its transformation into actionable insights to support manufacturing decisions. The reliable detection and diagnosis of faults in rolling element bearings pose a significant challenge for condition-based maintenance and fault detection and diagnosis (FDD), which are critical strategies for enhancing equipment reliability and reducing operational costs. Deep learning methods, such as convolutional neural networks (CNNs), can extract features from vibration signals more effectively than traditional signal processing. However, these methods in isolation are insufficient to reliably detect novel fault conditions and faults in variable working environments. Also, existing novelty and anomaly detection criteria are not accurate enough to correctly distinguish novel or unseen faults. This study introduces a multi-fault detection framework leveraging a variational autoencoder with Mahalanobis distance (MD) novelty scores for unknown condition detection and a hybrid CNN-Swin transformer (Swin-T) model for incremental learning and fault classification. Using frequency-domain transformation and image-based representation of vibration signals, a hybrid model with a CNN-based feature extractor after projecting to the patch embedding layer of a simplified Swin-T model is trained incrementally with novel conditions to allow continuous learning and adaptation. Extensive validation with three separate datasets from fault simulation test rigs demonstrates the superior performance of the method over traditional and cutting-edge models in FDD and novelty detection (ND), achieving near-perfect accuracy (99.7%), precision (99.8%), recall (99.6%), and F1 score (99.7%). ND outperformed traditional approaches with an MD novelty score threshold yielding a true-positive rate of 98.9% and a false-positive rate of 1.2%. Additionally, incremental learning improved classification accuracy by up to 5.4% for newly introduced fault types, highlighting its adaptability. These results demonstrate the framework’s ability to enhance reliability and efficiency in industrial machinery maintenance by identifying both known and novel fault conditions with high precision.
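The Mahalanobis distance (MD) novelty score mentioned in this abstract can be illustrated with a short sketch. The abstract does not give the encoder, the latent dimension, or the threshold rule, so everything below is an assumption for illustration: random vectors stand in for variational autoencoder latent codes, the threshold is taken as the 99th percentile of scores on known conditions, and the names `fit_reference` and `mahalanobis_scores` are hypothetical.

```python
import numpy as np


def fit_reference(latents: np.ndarray):
    """Estimate the mean and inverse covariance of latent codes from known conditions."""
    mean = latents.mean(axis=0)
    cov = np.cov(latents, rowvar=False)
    cov = cov + 1e-6 * np.eye(cov.shape[0])  # small ridge keeps the inverse stable
    return mean, np.linalg.inv(cov)


def mahalanobis_scores(x: np.ndarray, mean: np.ndarray, cov_inv: np.ndarray) -> np.ndarray:
    """Mahalanobis distance of each row of x from the reference distribution."""
    diff = x - mean
    return np.sqrt(np.einsum("ij,jk,ik->i", diff, cov_inv, diff))


# Stand-in data (assumption): in the paper these would be VAE latent codes.
rng = np.random.default_rng(0)
known = rng.normal(loc=0.0, scale=1.0, size=(500, 4))
novel = rng.normal(loc=6.0, scale=1.0, size=(50, 4))

mean, cov_inv = fit_reference(known)
known_scores = mahalanobis_scores(known, mean, cov_inv)

# Assumed threshold rule: flag anything beyond the 99th percentile of known scores.
threshold = np.quantile(known_scores, 0.99)
is_novel = mahalanobis_scores(novel, mean, cov_inv) > threshold

true_positive_rate = is_novel.mean()
false_positive_rate = (known_scores > threshold).mean()
```

Moving the threshold trades the true-positive rate against the false-positive rate, which is the trade-off behind figures such as those reported above.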


Figure previews: images of different types of defects, (a) defects similar to background features, (b) large inter‐class feature differences of defects, (c) small intra‐class feature differences of defects; structure of the SE module; structure of the CBAM module; coupled prediction module in the YOLO series algorithm; decoupled prediction module in the RetinaNet algorithm.

Surface Defect Detector Based on Deformable Convolution and Lightweight Multi‐Scale Attention
Article · February 2025 · 3 Reads

The detection of defects on industrial surfaces is essential for guaranteeing the quality and safety of products. Deep learning‐based object detection methods have demonstrated impressive efficacy in industrial applications in recent years. However, accurate defect detection remains a great challenge: the complex and variable shapes of defects, the similarity between defects and background, large intra‐class differences, and small inter‐class differences all lead to low classification accuracy. To overcome these challenges, this research proposes a novel network specifically designed for defect detection. First, a feature extraction network, ResDCA‐Net, is constructed based on deformable convolution and lightweight multi‐scale attention, where deformable convolution adaptively adjusts to extract features of defects with complex and variable shapes. Second, the lightweight multi‐scale attention module is constructed, which uses multi‐branch and cross‐space fusion to obtain the complete feature space attention map, thereby increasing attention to defect features and reducing attention to background features. Third, to enhance the classification and localization accuracy, an attention‐based decoupled prediction module is proposed to ensure that the classification and regression branches of the model can focus on their required features. Finally, extensive comparative experiments indicate that the proposed approach performs best, achieving 83.7% and 83.4% mean Average Precision (mAP) on the GC10‐DET and NEU‐DET datasets, respectively. Ablation experiments further validate the effectiveness of the individual proposed modules and demonstrate the approach's excellent performance and potential in defect detection tasks.




High-precision apple recognition and localization method based on RGB-D and improved SOLOv2 instance segmentation

June 2024 · 71 Reads · 9 Citations

Intelligent apple-picking robots can significantly improve the efficiency of apple picking, and the realization of fast and accurate recognition and localization of apples is the prerequisite and foundation for the operation of picking robots. Existing apple recognition and localization methods primarily focus on object detection and semantic segmentation techniques. However, these methods often suffer from localization errors when facing occlusion and overlapping issues. Furthermore, the few instance segmentation methods are also inefficient and heavily dependent on detection results. Therefore, this paper proposes an apple recognition and localization method based on RGB-D and an improved SOLOv2 instance segmentation approach. To improve the efficiency of the instance segmentation network, the EfficientNetV2 is employed as the feature extraction network, known for its high parameter efficiency. To enhance segmentation accuracy when apples are occluded or overlapping, a lightweight spatial attention module is proposed. This module improves the model position sensitivity so that positional features can differentiate between overlapping objects when their semantic features are similar. To accurately determine the apple-picking points, an RGB-D-based apple localization method is introduced. Through comparative experimental analysis, the improved SOLOv2 instance segmentation method has demonstrated remarkable performance. Compared to SOLOv2, the F1 score, mAP, and mIoU on the apple instance segmentation dataset have increased by 2.4, 3.6, and 3.8%, respectively. Additionally, the model’s Params and FLOPs have decreased by 1.94M and 31 GFLOPs, respectively. A total of 60 samples were gathered for the analysis of localization errors. The findings indicate that the proposed method achieves high precision in localization, with errors in the X, Y, and Z axes ranging from 0 to 3.95 mm, 0 to 5.16 mm, and 0 to 1 mm, respectively.
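RGB-D localization of the kind described above generally turns a pixel in a segmented mask, together with its depth reading, into a 3D point in the camera frame. The abstract does not describe the camera model or calibration, so the sketch below assumes a standard pinhole model. The intrinsic values, the `Intrinsics` class, and the `backproject` function are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intrinsics:
    """Pinhole camera intrinsics in pixels (hypothetical values are used below)."""
    fx: float
    fy: float
    cx: float
    cy: float


def backproject(u: float, v: float, depth_mm: float, k: Intrinsics) -> tuple:
    """Map pixel (u, v) with depth in millimetres to camera-frame (X, Y, Z) in millimetres."""
    x = (u - k.cx) * depth_mm / k.fx
    y = (v - k.cy) * depth_mm / k.fy
    return (x, y, depth_mm)


# Assumed intrinsics for a 640x480 sensor; a real system would use calibrated values.
camera = Intrinsics(fx=600.0, fy=600.0, cx=320.0, cy=240.0)

# Example: a mask centroid 60 pixels right of the principal point, at 500 mm depth.
picking_point = backproject(380.0, 240.0, 500.0, camera)  # (50.0, 0.0, 500.0)
```

Per-axis localization errors like those reported above would then be the differences between such back-projected coordinates and measured ground-truth positions.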






Citations (34)


... The combination of CNNs and Transformers leverages the strengths of both architectures (Song et al. 2024b;Jiang et al. 2024;Pan et al. 2024;Zhang et al. 2024a). CNNs excel in local feature extraction, effectively capturing fine-grained details, while Transformers specialize in modeling global context and long-range dependencies. ...

Reference:

CASEMark: A hybrid model for robust anatomical landmark detection in multi-structure X-rays
Picking point identification and localization method based on swin-transformer for high-quality tea
  • Citing Article
  • December 2024

Journal of King Saud University - Computer and Information Sciences

... Visual sensors can capture visible light information such as color and texture, making them suitable for object recognition and localization [123], but they struggle to reflect the physiological state of crops. Multispectral sensors, on the other hand, can perceive information related to vegetation health, moisture, and pest/disease presence across multiple bands [124], but they have lower spatial resolution and less clear structural features. ...

High-precision apple recognition and localization method based on RGB-D and improved SOLOv2 instance segmentation

... In current research on autonomous navigation of orchard robots, filtering methods are used to remove noise points, outliers, and abnormal points, and then sampling methods are used to reduce the data volume. Subsequently, depending on the requirements, clustering, segmentation, feature extraction, and plane fitting methods are chosen to deal with different scenarios (Malavazi et al., 2018;Liu et al., 2021;Xie et al., 2023;Zhang et al., 2024). However, this process is very cumbersome. ...

An image segmentation and point cloud registration combined scheme for sensing of obscured tree branches
  • Citing Article
  • June 2024

Computers and Electronics in Agriculture

... Li et al [11] proposed an improved Faster R-CNN algorithm based on transfer learning to enhance detection performance in challenging visual scenes. Wang et al [12] enhanced RetinaNet's accuracy in complex image detection by introducing an attention mechanism and optimizing its pyramid structure. To achieve precise localization in complex visual environments and improve small target detection, Lai et al [13] incorporated K-means++ and multi-scale feature learning modules to enhance traffic sign detection in driving scenarios, demonstrating notable effectiveness. ...

Multiscale Maize Tassel Identification Based on Improved RetinaNet Model and UAV Images

... The deep neural models still struggled to maintain high accuracy under significant noise conditions. Several methods, such as filtering techniques [24,25], statistical methods [26], signal domain transformation [27,28], autoencoders [29], and adaptive filtering [30], can be applied to reduce the effect of negative impacts of the signal noise. It is possible to leverage the power of these denoising techniques to preprocess real-world data before fault identification methods are applied. ...

Improved signal processing for bearing fault diagnosis in noisy environments using signal denoising, time–frequency transform, and deep learning
  • Citing Article
  • October 2023

Journal of the Brazilian Society of Mechanical Sciences and Engineering

... In recent years, to tackle the challenges of intelligent recognition and detection in complex environments, researchers across the globe have increasingly turned to deep learning methods [21][22][23]. Xu et al. [24] enhanced YOLOv5 by integrating the Mish activation function, employing DIoU_Loss to accelerate bounding box regression, and incorporating the Squeeze Excitation module. These modifications resulted in a grading precision of 90.6% and a real-time processing speed of 59.63 FPS, significantly boosting both the precision and detection efficiency for apple grading tasks. ...

Research on Apple Object Detection and Localization Method Based on Improved YOLOX and RGB-D Images

... Although progress has been made in evaluating driving behavior and correlations, existing methods face several limitations. In particular, machine learning models are sensitive to data partitioning and struggle with complexity when handling large-scale, high-dimensional driving data [30][31][32]. This leads to longer computation times and higher demands on computer performance. ...

A review on the application of computer vision and machine learning in the tea industry

... Additionally, as mentioned, Si et al. [13] achieved a binocular positioning method with a traditional image processing algorithm for apples with a distance estimation error of 20 mm in the range of 400-1500 mm. Hu et al. [39] designed an apple object detection and localization method based on improved YOLOX and RGB-D images, which achieved a depth error of less than 5 mm. These methods achieved satisfactory performance in binocular positioning, but they are still not robust enough for various occulted situations, illuminations, and distances. ...

Research on Apple Object Detection and Localization Method Based on Improved Yolox and Rgb-D Images
  • Citing Article
  • January 2023

SSRN Electronic Journal

... Therefore, meeting the system constraints is a problem that must be solved in visual servo control. Model predictive control can naturally build system constraints into the optimization problem to ensure constraint satisfaction by constructing optimization problems to solve controller actions [15]. Thus, model predictive control methods are often used to execute constrained visual servo control [16][17][18][19][20][21]. ...

Hierarchical multiloop MPC scheme for robot manipulators with nonlinear disturbance observer

Mathematical Biosciences & Engineering

... Section 3 shows the discussion, scientific reflections, and future directions. Other related developments are described in Section 4. Finally, the conclusion is illustrated in section 5. Robot manipulators face several obstacles in achieving the accurate control objective of positioning tracking such as load variation, disturbances, uncertainties, friction and coupling, nonlinearity, hysteresis, high-order, and underactuated property [5], [20]- [22]. Therefore, several advanced control strategies have been proposed to overcome these impediments thanks to the prosperous growth in integrated circuits, embedded systems, sensing techniques, and computer technologies in recent years [23]. ...

Optimal design of model predictive controller based on transient search optimization applied to robotic manipulators

Mathematical Biosciences & Engineering