W. Eric Wong’s research while affiliated with University of Texas at Dallas and other places


Publications (262)


Software Fault Localization Based on Multi-objective Feature Fusion and Deep Learning
  • Preprint

November 2024 · 1 Read

Xiaolei Hu · W. Eric Wong · Ya Zou

Software fault localization remains challenging due to limited feature diversity and low precision in traditional methods. This paper proposes a novel approach that integrates multi-objective optimization with deep learning models to improve both accuracy and efficiency in fault localization (FL). By framing feature selection as a multi-objective optimization problem (MOP), we extract and fuse three critical fault-related feature sets (spectrum-based, mutation-based, and text-based features) into a comprehensive feature fusion model. These features are then embedded within a deep learning architecture, comprising a multilayer perceptron (MLP) and gated recurrent network (GRN), which together enhance localization accuracy and generalizability. Experiments on the Defects4J benchmark dataset with 434 faults show that the proposed algorithm reduces processing time by 78.2% compared to single-objective methods. Additionally, our MLP and GRN models achieve a 94.2% improvement in localization accuracy compared to traditional FL methods, outperforming the state-of-the-art deep learning-based FL method by 7.67%. Further validation using the PROMISE dataset demonstrates the generalizability of the proposed model, showing a 4.6% accuracy improvement in cross-project tests over the state-of-the-art deep learning-based FL method.


Many-Objective Search-Based Coverage-Guided Automatic Test Generation for Deep Neural Networks

November 2024 · 3 Reads

To ensure the reliability of DNN systems and address the test generation problem for neural networks, this paper proposes a fuzzing test generation technique based on many-objective optimization algorithms. Traditional fuzz testing employs random search, which lowers testing efficiency and tends to generate numerous invalid test cases. By utilizing many-objective optimization techniques, effective test cases can be generated. To achieve high test coverage, this paper proposes several improvement strategies. The frequency-based fuzz sampling strategy assigns priorities based on how often each item of initial data has been selected, avoiding repeated selection of the same data and improving the quality of the initial data relative to random sampling strategies. To address the issue that global search may yield tests that do not satisfy semantic constraints, a local search strategy based on Monte Carlo tree search is proposed to enhance the algorithm's local search capabilities. Furthermore, we improve the diversity of the population and the algorithm's global search capability by updating SPEA2's external archive based on a decomposition-based archiving strategy. To validate the effectiveness of the proposed approach, experiments were conducted on several public datasets and various neural network models. The results reveal that, compared to random and clustering-based sampling, the frequency-based fuzz sampling strategy provides a greater improvement in coverage rate in the later stages of iterations. On complex networks like VGG16, the improved SPEA2 algorithm increased the coverage rate by about 12% across several coverage metrics, and by approximately 40% on LeNet series networks. The experimental results also indicate that the newly generated test cases not only exhibit higher coverage rates but also generate adversarial samples that reveal model errors.



A Hybrid Sampling and Multi-Objective Optimization Approach for Enhanced Software Defect Prediction

October 2024 · 7 Reads

Accurate early prediction of software defects is essential to maintain software quality and reduce maintenance costs. However, the field of software defect prediction (SDP) faces challenges such as class imbalances, high-dimensional feature spaces, and suboptimal prediction accuracy. To mitigate these challenges, this paper introduces a novel SDP framework that integrates hybrid sampling techniques, specifically Borderline SMOTE and Tomek Links, with a suite of multi-objective optimization algorithms, including NSGA-II, MOPSO, and MODE. The proposed model applies feature fusion through multi-objective optimization, enhancing both the generalization capability and stability of the predictions. Furthermore, the integration of parallel processing for these optimization algorithms significantly boosts the computational efficiency of the model. Comprehensive experiments conducted on datasets from NASA and PROMISE repositories demonstrate that the proposed hybrid sampling and multi-objective optimization approach improves data balance, eliminates redundant features, and enhances prediction accuracy. The experimental results also highlight the robustness of the feature fusion approach, confirming its superiority over existing state-of-the-art techniques in terms of predictive performance and applicability across diverse datasets.



Smart Contract Vulnerability Detection based on Static Analysis and Multi-Objective Search

September 2024 · 5 Reads

This paper introduces a method for detecting vulnerabilities in smart contracts using static analysis and a multi-objective optimization algorithm. We focus on four types of vulnerabilities: reentrancy, call stack overflow, integer overflow, and timestamp dependencies. Initially, smart contracts are compiled into an abstract syntax tree to analyze relationships between contracts and functions, including calls, inheritance, and data flow. These analyses are transformed into static evaluations and intermediate representations that reveal internal relations. Based on these representations, we examine each contract's functions, variables, and data dependencies to detect the specified vulnerabilities. To enhance detection accuracy and coverage, we apply a multi-objective optimization algorithm to the static analysis process. This involves assigning initial numeric values to input data and monitoring changes in statement coverage and detection accuracy. Using coverage and accuracy as fitness values, we calculate the Pareto front and crowding distance values to select the best individuals for the new parent population, iterating until optimization criteria are met. We validate our approach using an open-source dataset collected from Etherscan, containing 6,693 smart contracts. Experimental results show that our method outperforms state-of-the-art tools in terms of coverage, accuracy, efficiency, and effectiveness in detecting the targeted vulnerabilities.
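The selection step named in this abstract (Pareto front plus crowding distance over coverage and accuracy) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the two-tuple fitness layout, and the sample values are assumptions made for illustration, and both objectives are assumed to be maximized.

```python
from typing import List, Tuple

# Hypothetical fitness layout: (statement coverage, detection accuracy), both maximized.
Fitness = Tuple[float, float]


def dominates(a: Fitness, b: Fitness) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: List[Fitness]) -> List[int]:
    """Indices of the points that no other point dominates."""
    return [
        i for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


def crowding_distance(points: List[Fitness]) -> List[float]:
    """Crowding distance per point; boundary points get infinity."""
    n = len(points)
    dist = [0.0] * n
    if n == 0:
        return dist
    for m in range(len(points[0])):
        order = sorted(range(n), key=lambda i: points[i][m])
        lo, hi = points[order[0]][m], points[order[-1]][m]
        dist[order[0]] = dist[order[-1]] = float("inf")
        if hi == lo:
            continue  # all values equal on this objective, nothing to add
        for k in range(1, n - 1):
            gap = points[order[k + 1]][m] - points[order[k - 1]][m]
            dist[order[k]] += gap / (hi - lo)
    return dist


def select_parents(points: List[Fitness], k: int) -> List[int]:
    """Pick up to k non-dominated individuals, preferring less crowded ones."""
    front = pareto_front(points)
    dist = crowding_distance([points[i] for i in front])
    ranked = sorted(zip(front, dist), key=lambda t: -t[1])
    return [i for i, _ in ranked[:k]]


# Made-up sample population, for illustration only.
population = [(0.9, 0.5), (0.5, 0.9), (0.7, 0.7), (0.4, 0.4), (0.6, 0.6)]
parents = select_parents(population, 2)
```

A full NSGA-II-style loop would also rank the later fronts and fill the parent population from them when the first front is too small; the sketch covers only the first front.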



Multi-Objective Software Defect Prediction via Multi-Source Uncertain Information Fusion and Multi-Task Multi-View Learning

August 2024 · 7 Reads

IEEE Transactions on Software Engineering

Effective software defect prediction (SDP) is important for software quality assurance. Numerous advanced SDP methods have been proposed recently. However, how to account for task correlations and achieve multi-objective SDP accurately and efficiently remains to be further explored. In this paper, we propose a novel multi-objective SDP method via multi-source uncertain information fusion and multi-task multi-view learning (MTMV) to accurately and efficiently predict the proneness, location, and type of defects. Firstly, multi-view features are extracted from multi-source static analysis results, reflecting uncertain defect location distribution and semantic information. Then, a novel MTMV model is proposed to fully fuse the uncertain defect information in multi-view features and realize effective multi-objective SDP. Specifically, the convolutional GRU encoders capture the consistency and complementarity of multi-source defect information to automatically filter the noise of false and missed alarms, and reduce the location and type uncertainty of static analysis results. A global attention mechanism combined with the hard parameter sharing in MTMV fuses features according to their global importance across all tasks for balanced learning. Then, considering the latent task and feature correlations, multiple task-specific decoders jointly optimize all SDP tasks by sharing the learning experience. In extensive experiments on 14 datasets, the proposed method significantly improves the prediction performance over 12 baseline methods for all SDP objectives. The average improvements are 30.7%, 31.2%, and 32.4% for defect proneness, location, and type prediction, respectively. Therefore, the proposed multi-objective SDP method can provide more sufficient and precise insights for developers to significantly improve the efficiency of software analysis and testing.




Citations (58)


... Generative AI refers to advanced computational techniques that create new, meaningful content such as text, images, and audio from existing data [14]. These capabilities of GenAI can address significant challenges that teachers face in designing culturally responsive assessments, such as the time-intensive nature of creating materials that are both culturally relevant and pedagogically sound [47,48]. Furthermore, the interactive nature of GenAI-based assessments allows for real-time feedback and adaptation, providing teachers and students with immediate opportunities to learn and correct misunderstandings [49,50]. ...

Reference:

Generative AI for Culturally Responsive Assessment in Science: A Conceptual Framework
Exploring the Capability of ChatGPT in Test Generation
  • Citing Conference Paper
  • October 2023

... On the contrary, Bohrbugs [2] are a category of bugs that always manifest themselves when the relevant code is executed, being thus significantly easier to detect and isolate using standard debugging tools and techniques. While intuitively Mandelbugs can be deemed to be infrequent, in relation to the total number of faults in a software system, Grottke et al. [3] present an analysis of the software faults that were identified in the on-board software used for JPL/NASA space missions, and determined that Mandelbugs account for the 36.5% of the total number of software faults. Chillarege [4] correlated the types of bugs with the software quality dimensions, and concluded that Mandelbugs mainly have repercussions on the non-functional aspects of software (e.g. ...

Slicing‐Based Techniques for Software Fault Localization
  • Citing Chapter
  • April 2023

... The first is Code Search (CS) [Sun et al. 2024] during the implementation phase, which aims to assist developers in reusing specific code snippets from open source repositories, rather than "reinventing the wheel". The second is Fault Localization (FL) [Wong et al. 2023] during the debugging phase, which aims to identify the buggy program elements within the entire software system, and thus speed up the process of combating the bugs. ...

Software Fault Localization: an Overview of Research, Techniques, and Tools
  • Citing Chapter
  • April 2023

... Program elements can be taken into account in three levels including 1) statement (line of code), 2) method, and 3) file. Fault localization techniques have been categorized into two main groups, based on the methodology used for the analysis [30,110]. ...

Handbook of Software Fault Localization: Foundations and Advances
  • Citing Book
  • Full-text available
  • January 2023

... Tang, Chenghua, and colleagues (Tang, Guan, Yang, & Qiang, 2023) developed TaintSE, which combines symbolic execution with dynamic taint analysis, improving path coverage by 24%-35%. Li, Dongcheng (Li, Wong, Li, & Chau, 2022) and colleagues integrated heuristic search algorithms with symbolic execution, proposing a local search algorithm based on adaptive simulated annealing and symbolic path constraints to generate high-coverage test cases for multiple criteria within a limited time budget. These studies inspired the development of MCTCG. ...

Improving Search-based Test Case Generation with Local Search using Adaptive Simulated Annealing and Dynamic Symbolic Execution
  • Citing Conference Paper
  • August 2022

... The performance of MOO methods used to separate mixed signals is often vital. Especially in biomedical signals, the accuracy of the signals separated using MOO methods is confirmed by at least two objective functions [8,9]. Many optimization algorithms have been proposed to separate the mixed signals with the BSS method. ...

Automatic Test Case Generation Using Many-Objective Search and Principal Component Analysis

IEEE Access

... APR techniques aim to generate patches for buggy programs to pass given test suites. These techniques can be categorized into search-based [20,31], semantics-based [16,17,33], and pattern/learning-based approaches [21,22,57]. Search-based APR techniques like GenProg [18] use predefined code mutation operators to generate patches, while semantics-based APR techniques generate patches by solving repair constraints based on test suite specifications. ...

Improving Search-Based Automatic Program Repair With Neural Machine Translation

IEEE Access

... Then, the key contribution of this work lies in going beyond the denoising capacities of wavelet decomposition [2] and its applications in anomaly management, which has been primarily detection-centric [3], thus contributing to the current literature related to network issue detection and classification. Besides, the present work proposes a new dual automatic labeling, both visual and linguistic, providing insight about the duration and intensity of the detected anomalies. ...

Anomaly Detection Via KPIs for Software Performance Failures
  • Citing Article
  • January 2022

SSRN Electronic Journal

... To address the needs of drone path planning applications, researchers have proposed numerous path planning algorithms over the past few years [9]. These algorithms can be categorized into three types: machine learning-based algorithms [10][11][12], sampling-based algorithms [13,14], and heuristic algorithms [15,16]. Machine learning-based algorithms leverage data-driven approaches to optimize paths, while sampling-based algorithms, including Rapidly-exploring Random Tree (RRT) [17], Voronoi Diagram [18], Probabilistic Roadmaps (PRMs) [19], A* algorithm [20], generate random samples in the search space to find feasible paths. ...

Quality-Oriented Hybrid Path Planning Based on A* and Q-Learning for Unmanned Aerial Vehicle

IEEE Access