Longfeng Shen’s research while affiliated with Huaibei Normal University and other places

Publications (19)


An Enhanced Cross‐Attention Based Multimodal Model for Depression Detection
  • Article

January 2025 · 1 Read · Computational Intelligence

Yifan Kou · Fangzhen Ge · [...] · Huaiyu Liu

Depression, a prevalent mental disorder in modern society, significantly impacts people's daily lives. Recently, there have been advancements in developing automated diagnosis models for detecting depression. However, data scarcity, primarily due to privacy concerns, has posed a challenge. Traditional speech features have limitations in representing knowledge for depression diagnosis, and the complexity of deep learning algorithms necessitates substantial data support. Furthermore, existing multimodal methods based on neural networks overlook the heterogeneity gap between different modalities, potentially resulting in redundant information. To address these issues, we propose a multimodal depression detection model based on the Enhanced Cross‐Attention (ECA) Mechanism. This model effectively explores text‐speech interactions while considering modality heterogeneity. Data scarcity has been mitigated by fine‐tuning pre‐trained models. Additionally, we design a modal fusion module based on ECA, which emphasizes similarity responses and updates the weight of each modal feature based on the similarity information between modal features. Furthermore, for speech feature extraction, we have reduced the computational complexity of the model by integrating a multi‐window self‐attention mechanism with the Fourier transform. The proposed model is evaluated on the public DAIC‐WOZ dataset, achieving an accuracy of 80.0% and an average F1 value improvement of 4.3% compared with relevant methods.
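For readers unfamiliar with cross-attention, the following is a minimal sketch of standard scaled dot-product cross-attention between two modalities. It is not the paper's ECA mechanism, whose details the abstract does not give. Text token vectors act as queries and speech frame vectors act as keys and values. The function names and toy feature values are illustrative assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Plain scaled dot-product cross-attention (assumed baseline, not ECA).

    queries: vectors from one modality (e.g. text tokens)
    keys, values: vectors from the other modality (e.g. speech frames)
    Returns one fused vector per query and the attention weights.
    """
    dim = len(keys[0])
    fused, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dim).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        fused.append(out)
        all_weights.append(weights)
    return fused, all_weights


# Hypothetical toy features: 2 text tokens, 3 speech frames, dimension 2.
text_feats = [[1.0, 0.0], [0.0, 1.0]]
speech_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused, weights = cross_attention(text_feats, speech_feats, speech_feats)
```

Each row of `weights` sums to one, so every text token receives a convex combination of speech features. A similarity-emphasizing fusion such as the one described above would presumably modify how these weights are computed.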


Figures: flowchart of PDP&CGIT; schematic of PDP (yellow boxes indicate dominant solutions, green circles represent nondominant solutions, and the blue line denotes the classifier).

A dynamic multi-objective optimization algorithm based on probability-driven prediction and correlation-guided individual transfer
  • Article
  • Publisher preview available

December 2024 · 9 Reads · The Journal of Supercomputing

The primary challenge in addressing dynamic multi-objective optimization problems (DMOPs) is the rapid tracking of optimal solutions. Although methods based on transfer learning have shown remarkable performance in tackling DMOPs, most existing methods overlook the potential relationships between individuals within the population and those from historical environments. Consequently, they fail to adequately exploit historical information. To this end, this study proposes a dynamic multi-objective optimization algorithm based on probability-driven prediction and correlation-guided individual transfer (PDP&CGIT), which consists of two strategies: probability-driven prediction (PDP) and correlation-guided individual transfer (CGIT). Specifically, the PDP strategy analyzes the distribution of population characteristics and constructs a discriminative predictor based on a probability-annotation matrix to classify high-quality solutions from numerous randomly generated solutions within the decision space. Moreover, from the perspective of individual evolution, the CGIT strategy analyzes the correlation between current elite individuals and those from the previous moment. It learns the dynamic change pattern of the individuals and transfers this pattern to new environments. This is to maintain the diversity and distribution of the population. By integrating the advantages of these two strategies, PDP&CGIT can efficiently respond to environmental changes. Extensive experiments were performed to compare the proposed PDP&CGIT with five state-of-the-art algorithms across the FDA, F, and DF test suites. The results demonstrated the superiority of PDP&CGIT.
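As background, "optimal solutions" in multi-objective optimization are usually understood as nondominated (Pareto-optimal) solutions. The sketch below shows only that standard dominance test for minimization problems. It is not the PDP or CGIT strategy, and the objective values are hypothetical.

```python
def dominates(a, b):
    """Return True if objective vector a Pareto-dominates b (minimization)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def nondominated(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical two-objective values for four candidate solutions.
population = [(1.0, 4.0), (2.0, 2.0), (3.0, 3.0), (4.0, 1.0)]
front = nondominated(population)
```

Here `(3.0, 3.0)` is dropped because `(2.0, 2.0)` is better in both objectives. In a dynamic problem the objective functions change over time, so this front has to be tracked again after each environmental change.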

MBDRes-U-Net: Multi-Scale Lightweight Brain Tumor Segmentation Network

November 2024 · 1 Read

Accurate segmentation of brain tumors plays a key role in the diagnosis and treatment of brain tumor diseases. It serves as a critical technology for quantifying tumors and extracting their features. With the increasing application of deep learning methods, the computational burden has become progressively heavier. To achieve a lightweight model with good segmentation performance, this study proposes the MBDRes-U-Net model using the three-dimensional (3D) U-Net encoder-decoder framework, which integrates multibranch residual blocks and fused attention into the model. The computational burden of the model is reduced by the branch strategy, which effectively uses the rich local features in multimodal images and enhances the segmentation performance of subtumor regions. Additionally, during encoding, an adaptive weighted expansion convolution layer is introduced into the multibranch residual block, which enriches the feature expression and improves the segmentation accuracy of the model. Experiments on the Brain Tumor Segmentation (BraTS) Challenge 2018 and 2019 datasets show that the architecture could maintain a high precision of brain tumor segmentation while considerably reducing the calculation overhead. Our code is released at https://github.com/Huaibei-normal-university-cv-laboratory/mbdresunet



TransFGVC: transformer-based fine-grained visual classification

June 2024 · 178 Reads · 3 Citations · The Visual Computer

Fine-grained visual classification (FGVC) aims to identify subcategories of objects within the same superclass. This task is challenging owing to high intra-class variance and low inter-class variance. The most recent methods focus on locating discriminative areas and then training the classification network to further capture the subtle differences among them. On the one hand, the detection network often obtains an entire part of the object, and positioning errors occur. On the other hand, these methods ignore the correlations between the extracted regions. We propose a novel highly scalable approach, called TransFGVC, that cleverly combines Swin Transformers with long short-term memory (LSTM) networks to address the above problems. The Swin Transformer is used to obtain remarkable visual tokens through self-attention layer stacking, and LSTM is used to model them globally, which not only accurately locates the discriminative region but also further introduces global information that is important for FGVC. The proposed method achieves competitive performance with accuracy rates of 92.7%, 91.4% and 91.5% using the public CUB-200-2011 and NABirds datasets and our Birds-267-2022 dataset, and the Params and FLOPs of our method are 25% and 27% lower, respectively, than the current SotA method HERBS. To effectively promote the development of FGVC, we developed the Birds-267-2022 dataset, which has 267 categories and 12,233 images.




Consistent Weighted Correlation-Based Attention for Transformer Tracking

November 2023 · 20 Reads · Electronics

The attention mechanism plays a crucial role among the key technologies in transformer-based visual tracking. However, current methods for computing attention neglect the correlation between the query and the key, which results in erroneous correlations. To address this issue, a CWCTrack framework is proposed in this study for transformer visual tracking. To balance the weights of the attention module and enhance the feature extraction of the search region and template region, a consistent weighted correlation (CWC) module is introduced into the cross-attention block. The CWC module computes the correlation score between each query and all keys. Then, the correlation is multiplied by the consistent weights of the other query–key pairs to obtain the final attention weights. The consistency weights are computed from the relevance of the query–key pairs. The correlation is enhanced for relevant query–key pairs and suppressed for irrelevant ones. Experimental results on four prevalent benchmarks demonstrate that the proposed CWCTrack achieves favorable performance.



Figures and tables: Figure 1, overall network architecture of DSKCA-UNet (dynamic selective kernel channel attention UNet); Figure 3, details of the selective kernel attention (SKA) module; Figure 4, details of the efficient channel attention (ECA) module; segmentation accuracy of different methods on the Synapse dataset; ablation study on the impact of the single attention module.
DSKCA-UNet: Dynamic selective kernel channel attention for medical image segmentation

September 2023 · 22 Reads · 3 Citations · Medicine

U-Net has attained immense popularity owing to its performance in medical image segmentation. However, it cannot explicitly model long-range dependencies. By contrast, the transformer can effectively capture long-range dependencies by leveraging the self-attention (SA) of the encoder. Although SA, an important characteristic of the transformer, can find correlations among elements of the original data, its quadratic computational complexity might slow the processing of high-dimensional data (such as medical images). Furthermore, SA is limited because the correlation between samples is overlooked; thus, there is considerable scope for improvement. To this end, based on Swin-UNet, we introduce a dynamic selective attention mechanism for the convolution kernels. The weight of each convolution kernel is calculated to fuse the results dynamically. This attention mechanism permits each neuron to adaptively modify its receptive field size in response to multiscale input information. A local cross-channel interaction strategy without dimensionality reduction is introduced, which effectively eliminates the influence of dimensionality reduction on learning channel attention. Through suitable cross-channel interactions, model complexity can be significantly reduced while maintaining performance. Subsequently, the global interaction between the encoder features is used to extract more fine-grained features. Simultaneously, a mixed loss function combining the weighted cross-entropy loss and the Dice loss is used to alleviate category imbalance and achieve better results when the sample numbers are unbalanced. We evaluated our proposed method on abdominal multiorgan segmentation and cardiac segmentation datasets, achieving Dice similarity coefficient and 95% Hausdorff distance metrics of 80.30 and 14.55%, respectively, on the Synapse dataset and Dice similarity coefficient metrics of 90.80 on the ACDC dataset. The experimental results show that our proposed method has good generalization ability and robustness, and it is a powerful tool for medical image segmentation.
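To illustrate the mixed loss and the Dice similarity coefficient mentioned above, here is a minimal sketch for the binary (foreground versus background) case. The paper's setting is multi-class, and its exact weighting is not stated in the abstract, so the mixing weight `alpha`, the foreground weight `pos_weight`, and the toy masks are all assumptions.

```python
import math


def dice_coefficient(pred, target, eps=1e-6):
    """Soft Dice coefficient between predicted probabilities and a binary mask."""
    intersection = sum(p * t for p, t in zip(pred, target))
    return (2.0 * intersection + eps) / (sum(pred) + sum(target) + eps)


def weighted_bce(pred, target, pos_weight=1.0, eps=1e-7):
    """Binary cross-entropy with extra weight on the foreground class."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= pos_weight * t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(pred)


def mixed_loss(pred, target, alpha=0.5, pos_weight=1.0):
    """alpha * weighted CE + (1 - alpha) * (1 - Dice); alpha is an assumed weight."""
    dice_loss = 1.0 - dice_coefficient(pred, target)
    return alpha * weighted_bce(pred, target, pos_weight) + (1.0 - alpha) * dice_loss


# Hypothetical flattened mask with one foreground pixel out of four.
target = [0, 0, 0, 1]
good = [0.1, 0.1, 0.1, 0.9]  # confident on the foreground pixel
bad = [0.1, 0.1, 0.1, 0.2]   # misses the foreground pixel
loss_good = mixed_loss(good, target)
loss_bad = mixed_loss(bad, target)
```

The Dice term depends on overlap rather than on per-pixel counts, so it stays informative when foreground pixels are rare. Raising `pos_weight` penalizes missed foreground pixels more strongly in the cross-entropy term.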


Citations (7)


... Thanks to the Transformer's powerful MHSA mechanism, it achieves competitive results compared to the top-performing CNN results. Similarly, TransFGVC [11] ingeniously combines the Swin Transformer with LSTM methods. This hybrid approach accurately locates discriminative regions and thoroughly extracts global information, achieving superior performance in FGVC tasks. ...

Reference:

Enhanced Fine-Grained Visual Classification through Lightweight Transformer Integration and Auxiliary Information Fusion
TransFGVC: transformer-based fine-grained visual classification

The Visual Computer

... In recent years, researchers have proposed methods to dynamically adjust niches, effectively addressing the parameter sensitivity problem in traditional niche strategies [37,40,41]. Therefore, this paper introduces a dynamic adjustment of niche scale by using a cluster pool. ...

A dynamic multi-objective evolutionary algorithm based on Mahalanobis distance and intra-cluster individual correlation rectification
  • Citing Article
  • June 2024

Information Sciences

... The CNN and U-Net architectures have been enhanced through various adaptations aimed at optimizing their performance. Innovations have included the introduction of attention mechanisms [29], the incorporation of dynamic modeling [6,25,35], modifying skip connections [18], and the replacement of backbone modules within segmentation networks [21]. These enhancements have been critical in advancing the capabilities of these architectures to handle the complexities of medical image segmentation. ...

DSKCA-UNet: Dynamic selective kernel channel attention for medical image segmentation

Medicine

... Zhang et al. [29] proposed a transfer learning-based surrogate-assisted evolutionary algorithm that utilizes historical optimal knee solutions to augment the training data for building Gaussian process models, effectively improving the quality of solutions. Hou et al. [30] create a time series and use a GRU neural network to maximize distribution features and minimize losses; the trained network model ultimately forms the initial population for the next moment. Yao et al. [31] combine a clustering difference strategy with transfer learning to improve solution quality and convergence speed. ...

Temporal distribution-based prediction strategy for dynamic multi-objective optimization assisted by GRU neural network
  • Citing Article
  • September 2023

Information Sciences

... In the domain of deep learning, various models, including CNNs and U-Net, have achieved success in brain tumor segmentation (Ding et al., 2020; Shen et al., 2023; Zhang et al., 2023). CNNs (Convolutional Neural Networks) excel in image processing and analysis but are characterized by high computational complexity, susceptibility to overfitting on small datasets, and sensitivity to hyperparameters. ...

Feature interaction network based on hierarchical decoupled convolution for 3D medical image segmentation

... To address these limitations, multimodal super-resolution approaches have emerged, where high-resolution images from complementary modalities serve as guides to enhance low-resolution MSI data [13][14][15]. In these methods, the high-resolution images provide structural or phenotypic context, which guides the reconstruction of super-resolved MSI images, resulting in more accurate metabolite maps that preserve both spatial and chemical integrity. ...

A super-resolution strategy for mass spectrometry imaging via transfer learning

Nature Machine Intelligence

... And at the second stage, the patch weight computation is based on the result of the first stage. Shen et al. [72] propose a cooperative low-rank graph model to suppress background clutter. It decomposes input dual-modal features into low-rank components and sparse, noisy components and dynamically updates them by the collaborative graph learning algorithm. ...

RGBT Tracking based on Cooperative Low-Rank Graph Model
  • Citing Article
  • April 2022

Neurocomputing