May 2023 · 28 Reads · 118 Citations
December 2022 · 56 Reads · 6 Citations
Deep learning (DL) models of code have recently shown substantial progress in vulnerability detection. In some cases, DL-based models have outperformed static analysis tools. Although many strong models have been proposed, we do not yet have a good understanding of them. This limits further progress on model robustness, debugging, and deployment for vulnerability detection. In this paper, we surveyed and reproduced 9 state-of-the-art (SOTA) deep learning models on 2 widely used vulnerability detection datasets: Devign and MSR. We investigated 6 research questions in three areas: model capabilities, training data, and model interpretation. We experimentally demonstrated the variability between different runs of a model and the low agreement among different models' outputs. We compared models trained for specific types of vulnerabilities against a model trained on all vulnerabilities at once. We explored the types of programs that DL models may find "hard" to handle. We investigated how training data size and training data composition relate to model performance. Finally, we studied model interpretations and analyzed the important features that the models used to make predictions. We believe that our findings can help researchers better understand model results, provide guidance on preparing training data, and improve the robustness of the models. All of our datasets, code, and results are available at https://figshare.com/s/284abfba67dba448fdc2.
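To make the notion of "agreement among different models' outputs" concrete, the following is a minimal sketch of two simple agreement measures over binary predictions. The model names and prediction vectors are hypothetical, and these measures are an assumption for illustration; the abstract does not state which agreement metric the authors used.

```python
from itertools import combinations


def pairwise_agreement(preds_a, preds_b):
    """Fraction of examples on which two prediction lists give the same label."""
    if len(preds_a) != len(preds_b) or not preds_a:
        raise ValueError("prediction lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(preds_a, preds_b))
    return matches / len(preds_a)


def unanimous_fraction(all_preds):
    """Fraction of examples on which every model gives the same label."""
    columns = list(zip(*all_preds))  # one tuple of labels per example
    return sum(len(set(col)) == 1 for col in columns) / len(columns)


# Hypothetical predictions (1 = vulnerable, 0 = not vulnerable) from three
# made-up models on the same six test functions.
predictions = {
    "model_a": [1, 0, 1, 1, 0, 0],
    "model_b": [1, 0, 0, 1, 0, 1],
    "model_c": [0, 0, 1, 1, 0, 1],
}

for name_a, name_b in combinations(predictions, 2):
    score = pairwise_agreement(predictions[name_a], predictions[name_b])
    print(f"{name_a} vs {name_b}: {score:.2f}")

print(f"all models agree on: {unanimous_fraction(list(predictions.values())):.2f}")
```

The same functions can be applied to several runs of one model (different random seeds) to get a rough view of run-to-run variability.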
... In contrast to traditional static analysis, which is based on identifying violations of specific rules and best practices, VP is able to detect more complex vulnerability patterns, owing to its use of advanced AI algorithms. More specifically, with the emergence of the Transformer architecture [19] and Large Language Models (LLMs), enhanced VP solutions have been proposed [20], [21]. However, the vast majority of the proposed Vulnerability Prediction Models (VPMs) perform predictions at the class or function level of granularity [2], [20], and cannot provide specific information about the actual location and type of the detected vulnerability. ...
May 2023
... IV. DEEP LEARNING-BASED SOFTWARE VULNERABILITY DETECTION. Deep learning-based vulnerability detection methods involve training neural networks to identify potential issues in code [22]. These techniques typically involve four steps: data collection, data preparation, model building, and evaluation/testing [23]. Data collection gathers labeled vulnerable and non-vulnerable data for training the neural model. ...
December 2022
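The four steps named in the excerpt above (data collection, data preparation, model building, evaluation) can be sketched end to end as follows. This is a toy illustration under stated assumptions: the code snippets and labels are invented, and a single-layer perceptron over token counts stands in for the deep neural networks the excerpt refers to. It is not the method of any cited paper.

```python
import re
from collections import Counter


def collect_data():
    """Step 1: gather labeled samples (1 = vulnerable, 0 = not). Hypothetical data."""
    train = [
        ("strcpy(buf, input);", 1),
        ("gets(line);", 1),
        ("strncpy(buf, input, sizeof(buf) - 1);", 0),
        ("fgets(line, sizeof(line), stdin);", 0),
    ]
    test = [
        ("strcpy(dst, src);", 1),
        ("strncpy(dst, src, n);", 0),
    ]
    return train, test


def prepare(samples):
    """Step 2: turn each snippet into a bag of identifier tokens."""
    return [
        (Counter(re.findall(r"[A-Za-z_]\w*", code)), label)
        for code, label in samples
    ]


def predict(model, features):
    """Return 1 if the linear score is positive, else 0."""
    weights, bias = model
    score = bias + sum(weights[tok] * cnt for tok, cnt in features.items())
    return 1 if score > 0 else 0


def train_model(prepared, epochs=10):
    """Step 3: fit a perceptron (a stand-in for a real neural network)."""
    weights, bias = Counter(), 0
    for _ in range(epochs):
        for features, label in prepared:
            delta = label - predict((weights, bias), features)
            if delta:  # update only on a misclassified sample
                for tok, cnt in features.items():
                    weights[tok] += delta * cnt
                bias += delta
    return weights, bias


def evaluate(model, prepared):
    """Step 4: accuracy on held-out samples."""
    correct = sum(predict(model, feats) == label for feats, label in prepared)
    return correct / len(prepared)


train_raw, test_raw = collect_data()
model = train_model(prepare(train_raw))
print("held-out accuracy:", evaluate(model, prepare(test_raw)))
```

A real pipeline would replace each function with its full-scale counterpart, for example a dataset such as Devign or MSR for collection and a trained neural model for step 3, while keeping the same four-stage shape.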