Xue Yuan’s research while affiliated with Yunnan Normal University and other places


Publications (3)


Software vulnerable functions discovery based on code composite feature
  • Article

March 2024 · 11 Reads · 4 Citations

Journal of Information Security and Applications

Xue Yuan · Huan Mei · [...] · Jun Zhang

The workflow consists of three stages. In stage 1, we compare the performance of four models. In stage 2, a synthetic dataset derived from SARD projects is merged into the real-world dataset to fine-tune the parameters of CodeBERT. In stage 3, we examine the parameters of CodeBERT that matter for code feature extraction.
Example C functions. (a) A vulnerable function. (b) The revised function.
Results of two comparative experiments. (a) and (c) Precision and recall of several embedding methods, respectively; (b) and (d) precision and recall of three models.
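The precision and recall reported in the comparative experiments follow the standard definitions for binary classification. As a minimal sketch, assuming each function is labeled 1 (vulnerable) or 0 (non-vulnerable), the two metrics could be computed as follows. The function name and the sample labels are hypothetical and are not taken from the paper.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels where 1 means vulnerable."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # vulnerable, flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # clean, but flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # vulnerable, missed
    # Guard against division by zero when nothing is flagged or nothing is vulnerable.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical ground truth and detector output for eight functions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision penalizes false alarms, while recall penalizes missed vulnerable functions, which is why the two are reported side by side.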


Deep Neural Embedding for Software Vulnerability Discovery: Comparison and Optimization
  • Article
  • Full-text available

January 2022 · 260 Reads · 42 Citations

Because sophisticated software programs contain numerous vulnerabilities, the detection performance of existing approaches requires further improvement. Multiple vulnerability detection approaches have been proposed to aid code inspection. Among them, a line of approaches applies deep learning (DL) techniques and achieves promising results. This paper uses CodeBERT, a deep contextualized model, as an embedding solution to facilitate the detection of vulnerabilities in open-source C projects. Applying CodeBERT to code analysis reveals rich, latent patterns within software code, which can facilitate downstream tasks such as software vulnerability detection. CodeBERT inherits the architecture of BERT, providing a stacked transformer encoder with a bidirectional structure. This facilitates learning vulnerable code patterns, which requires long-range dependency analysis. Additionally, the multi-head attention mechanism of the transformer lets the model attend to multiple key variables of a data flow, which is crucial for analyzing and tracing potentially vulnerable data flows and ultimately improves detection performance. To evaluate the effectiveness of the proposed CodeBERT-based embedding solution, four embedding methods are compared for generating software code embeddings: CodeBERT and three mainstream methods, Word2Vec, GloVe, and FastText. Experimental results show that the CodeBERT-based embedding outperforms the other embedding models on the downstream vulnerability detection tasks. To further boost performance, we propose to include synthetic vulnerable functions and to fine-tune on both synthetic and real-world data so that the model learns C-related vulnerable code patterns. We also explore suitable configurations of CodeBERT. The evaluation results show that the model with the new parameters outperforms some state-of-the-art detection methods on our dataset.
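The multi-head attention mentioned in the abstract can be illustrated with a small, self-contained sketch. This is not CodeBERT's implementation: a real transformer applies learned query, key, and value projections per head, which are omitted here for brevity, and the token vectors below are hypothetical. The sketch only shows how each head computes its own attention weights over all tokens, so different heads can focus on different positions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention; returns (outputs, weights)."""
    scale = math.sqrt(len(keys[0]))
    outputs, weights = [], []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        w = softmax(scores)  # one weight per token, summing to 1
        out = [sum(wj * v[c] for wj, v in zip(w, values))
               for c in range(len(values[0]))]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


def multi_head_attention(tokens, num_heads):
    """Self-attention per head on equal slices of each token vector.

    Assumption: no learned projections; each head simply sees its own slice.
    """
    d_model = len(tokens[0])
    if d_model % num_heads != 0:
        raise ValueError("d_model must be divisible by num_heads")
    d_head = d_model // num_heads
    head_outputs, head_weights = [], []
    for h in range(num_heads):
        part = [t[h * d_head:(h + 1) * d_head] for t in tokens]
        out, w = attention(part, part, part)
        head_outputs.append(out)
        head_weights.append(w)
    # Concatenate the per-head outputs back into d_model-sized vectors.
    merged = [[x for h in range(num_heads) for x in head_outputs[h][i]]
              for i in range(len(tokens))]
    return merged, head_weights


# Hypothetical 4-dimensional embeddings for three code tokens.
tokens = [[1.0, 0.0, 0.5, 0.5],
          [0.0, 1.0, 0.5, -0.5],
          [1.0, 1.0, 0.0, 0.0]]
merged, weights = multi_head_attention(tokens, num_heads=2)
for h, head in enumerate(weights):
    print(f"head {h} weights for token 0:", [round(w, 3) for w in head[0]])
```

Because each head works on a different slice of the vector, the printed weight rows differ between heads, which is the property the abstract relies on when it describes attending to several key variables at once.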


Citations (2)


... Yuan et al. [4] propose a method combining GRU and GGRN models to extract textual and structural features from source code functions for vulnerability identification. The approach outperforms existing methods in identifying vulnerabilities by leveraging both serialized and structural features. ...

Reference:

Editorial: Special issue on software protection and attacks
Software vulnerable functions discovery based on code composite feature
  • Citing Article
  • March 2024

Journal of Information Security and Applications

... Their findings suggested that all these three techniques are suitable for representing source code for the task of VP, but the BERT-based embedding method seemed to be the most promising one. Yuan et al. [42] compared a CodeBERT-based embedding method with word2vec, fastText, and GloVe [43] showing that the former outperforms the latter in the task of vulnerability prediction. ...

Deep Neural Embedding for Software Vulnerability Discovery: Comparison and Optimization