Yizhou Li’s research while affiliated with Sichuan University and other places

Publications (7)


FedRFC: Federated Learning with Recursive Fuzzy Clustering for improved non-IID data training
  • Article

June 2024 · 22 Reads · 2 Citations

Future Generation Computer Systems

Yuxiao Deng · Anqi Wang · Lei Zhang · [...] · Yizhou Li

The behavioral activities of a social network user from the past to the present. When timestamp is t1, the user was posting. When timestamp is t3, the user liked a posted piece of content. When timestamp is t5, the user reposted posts. When timestamp is t7, the user was posting another piece of content.
The architecture of MRLBot. DDTCN is the behavioral representation learning model. IB2V is the relationship representation learning model. ũ_b is the generated behavior representation, ũ_g is the generated relationship representation, and ũ is the multi-dimensional representation.
The architecture of DDTCN.
The architecture of IB2V.
Intra- and outer-community-oriented random walks.

MRLBot: Multi-Dimensional Representation Learning for Social Media Bot Detection
  • Article
  • Full-text available

May 2023 · 1,116 Reads · 5 Citations

Social media bots pose potential threats to the online environment, and continuously evolving anti-detection technologies require bot detection methods to be more reliable and general. Current detection methods face several challenges, including limited generalization ability, susceptibility to evasion in traditional feature engineering, and insufficient exploration of user relationships. To tackle these challenges, this paper proposes MRLBot, a social media bot detection framework based on unsupervised representation learning. We design a behavior representation learning model that uses a Transformer and a CNN encoder–decoder to simultaneously extract global and local features from behavioral information. Furthermore, a network representation learning model is proposed that introduces intra- and outer-community-oriented random walks to learn structural features and community connections from the relationship graph. Finally, the behavioral representation and relationship representation learning models are combined to generate fused representations for bot detection. Experimental results on four publicly available social network datasets demonstrate that the proposed method has certain advantages over state-of-the-art detection methods in this field.
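
The intra- and outer-community-oriented random walks mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `community_walk`, the parameter `p_intra`, and the toy graph are assumptions made for illustration, and the paper's actual transition rule may differ. The sketch picks a same-community neighbor with probability `p_intra` and a neighbor from another community otherwise.

```python
import random


def community_walk(adj, community, start, length, p_intra=0.8, rng=None):
    """Hypothetical community-biased random walk (illustration only).

    adj:       dict mapping node -> list of neighbor nodes
    community: dict mapping node -> community label
    p_intra:   probability of stepping to a same-community neighbor
               when both kinds of neighbors are available (assumed rule)
    """
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        node = walk[-1]
        neighbors = adj[node]
        if not neighbors:
            break  # dead end: stop the walk early
        intra = [n for n in neighbors if community[n] == community[node]]
        outer = [n for n in neighbors if community[n] != community[node]]
        if intra and outer:
            pool = intra if rng.random() < p_intra else outer
        else:
            pool = intra or outer  # only one kind of neighbor exists
        walk.append(rng.choice(pool))
    return walk


# Toy graph: two triangles (communities "a" and "b") joined by the edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
community = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}

# With p_intra=1.0 the walk never leaves the starting community.
walk = community_walk(adj, community, start=0, length=10,
                      p_intra=1.0, rng=random.Random(0))
# With p_intra=0.0 the walk crosses the bridge whenever it can.
bridge = community_walk(adj, community, start=2, length=2,
                        p_intra=0.0, rng=random.Random(1))
```

Walks generated this way could serve as input to a skip-gram style embedding model. Under this assumed rule, values of `p_intra` close to 1 emphasize structure within a community, while lower values expose connections between communities.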

A Hybrid Attention Network for Malware Detection Based on Multi-Feature Aligned and Fusion

February 2023 · 74 Reads · 11 Citations

With the widespread use of computers, the amount of malware has increased exponentially. Since dynamic detection is costly in both time and resources, most existing malware detection methods are based on static features. However, existing static methods mainly rely on a single feature type, and few pay attention to multi-feature fusion. This paper presents a novel multi-feature extraction and fusion method that detects malware variants by combining binary and opcode features. We propose a stacked convolutional network to capture the temporal and discontinuity information in the function calls of the malware binary file. Additionally, we adopt the triangular attention algorithm to extract code-level features from assembly code. These two extracted features are then aligned and fused by cross-attention, which can provide a stable feature representation. We evaluate our method on two different datasets. It achieves an accuracy of 0.9954 on the Kaggle Malware Classification dataset and an accuracy of 0.9544 on a large real-world dataset. To optimize our detection model, we conduct in-depth discussions of different feature extractors and multi-feature fusion strategies. Moreover, a visualized attention module in our model is provided to explain its superiority in opcode feature extraction. An experimental analysis against five baseline deep learning models and five state-of-the-art malware detection models reveals that our strategy outperforms competing approaches in all evaluation settings.


Fig 2. Performance of three attention mechanisms for multi-omics feature fusion. A. Boxplot showing the distribution of SE attention scores for the three omics data types. B. Barplot (upper-left) of the cAttn attention score of each omics type and ternary entity diagram of relationships between omics pairs in five datasets. M: mRNA, green. C: CNV, yellow. D: DNA methylation, red. C. Heatmap of the confusion matrix obtained from cAGCN.
Fig 4. GO and KEGG enrichment analysis of the top 100 genes. A. Gene overlap analysis expanded via shared enriched ontologies. On the outside, each arc represents the identity of each gene list: LumA (green arc), LumB (purple arc), Normal (orange arc), Basal (red arc) and Her2 (blue arc). On the inside, each arc represents a gene list, where each gene member of that list is assigned a spot on the arc. Dark orange represents genes that are shared by multiple lists, and light orange represents genes that are unique to that gene list. Purple lines link the same gene shared by multiple gene lists (note that a gene that appears in two gene lists is mapped once onto each gene list, so the two positions are linked in purple). Blue lines link genes that, although different, fall under the same ontology term (the term has to be statistically significantly enriched and have a size no larger than 100). B. For the four input gene lists, GO and KEGG enrichment terms were hierarchically clustered into a tree based on Kappa-statistical similarities among their gene memberships. The heatmap cells are colored by their P values; white cells indicate the lack of enrichment for that term in the corresponding gene list. C. Network layout of a subset of representative terms from the full cluster. Each term is represented by a circle node, whose size is proportional to the number of input genes that fall under that term and whose color represents its cluster identity. Terms with a similarity score > 0.3 are linked by an edge (the thickness of the
Fig 5. Implementation of three attention mechanisms. A. Squeeze and Excitation (SE)
Effects of adding reference genes on model performance.
Attention-based GCN Integrates Multi-omics Data for Breast Cancer Subtype Classification and Patient-specific Gene Marker Identification

September 2022 · 129 Reads · 1 Citation

Breast cancer is a heterogeneous disease and can be divided into several subtypes with unique prognostic and molecular characteristics. Classification of breast cancer subtypes plays an important role in the precision treatment and prognosis of breast cancer. Benefiting from the relation-aware ability of the graph convolution network (GCN), we present a multi-omics integrative method, attention-based GCN, for breast cancer molecular subtype classification using mRNA expression, copy number variation, and DNA methylation multi-omics data. Several attention mechanisms were applied, and all proved effective in integrating heterogeneous data. Column-wise attention-based GCN outperformed the other baseline methods, achieving an AUC of 0.9816, an ACC of 0.8743 and an MCC of 0.8151 in 5-fold cross-validation. In addition, the Layer-wise Relevance Propagation (LRP) algorithm was used to interpret model decisions and could identify patient-specific important biomarkers that have been reported to be related to the occurrence and development of breast cancer. Our results highlight the effectiveness of GCN and attention mechanisms in multi-omics integrative analysis, and the implementation of the LRP algorithm could provide biologically reasonable insights into model decisions.

Author Summary

Identification of molecular subtype is essential to understanding the pathogenesis of cancer and advancing the development of cancer precision treatment. We have developed a graph convolution network architecture to predict the molecular subtype of breast cancer patients based on multi-omics data. The major difference between our work and other methods is that we offer a new view of multi-omics data integration with the assistance of the graph convolution network. In addition, we adopted an interpretability method to explain the model decision and prioritize genes for biomarker discovery.

Citations (3)


... Deep learning has significantly advanced app usage prediction, with LSTM-based and GRU-based models demonstrating strong performance by capturing sequential dependencies [28,29,30]. To further enhance predictive accuracy, studies have incorporated contextual features, including time and location [31,32,33], and explored multi-task learning to jointly model app usage and related behaviors [17,34]. ...

Reference:

Atten-Transformer: A Deep Learning Framework for User App Usage Prediction
DDHCN: Dual Decoder Hyperformer Convolutional Network for Downstream-Adaptable User Representation Learning on App Usage
  • Citing Article
  • September 2023

Expert Systems with Applications

... , that implemented a novel GNN combined with Random Forest, where they generate subgraphs to train GNN classifiers augmented by a Fully Connected Network (FCN), enhancing both accuracy and robustness. Or Zeng et al. (2023) that presented a multidimensional learning approach integrating behavioral and relational analytics. ...

MRLBot: Multi-Dimensional Representation Learning for Social Media Bot Detection

... Yang et al. introduced a cross-attention method to combine binary sequence and opcode features with a stacked neural network model [22]. Although this approach uses sophisticated attention mechanisms, it is still constrained by reliance on traditional opcode and binary features. ...

A Hybrid Attention Network for Malware Detection Based on Multi-Feature Aligned and Fusion