Publications (83)
April 2025
January 2025
Semi-Supervised Learning (SSL) can leverage abundant unlabeled data to boost model performance. However, the class-imbalanced data distribution in real-world scenarios poses great challenges to SSL, resulting in performance degradation. Existing class-imbalanced semi-supervised learning (CISSL) methods mainly focus on rebalancing datasets but ignore the potential of using hard examples to enhance performance, making it difficult to fully harness the power of unlabeled data even with sophisticated algorithms. To address this issue, we propose a method that enhances the performance of Imbalanced Semi-Supervised Learning by Mining Hard Examples (SeMi). This method distinguishes the entropy differences among logits of hard and easy examples, thereby identifying hard examples and increasing the utility of unlabeled data, better addressing the imbalance problem in CISSL. In addition, we maintain a class-balanced memory bank with confidence decay for storing high-confidence embeddings to enhance the pseudo-labels' reliability. Although our method is simple, it is effective and seamlessly integrates with existing approaches. We perform comprehensive experiments on standard CISSL benchmarks and experimentally demonstrate that our proposed SeMi outperforms existing state-of-the-art methods on multiple benchmarks, especially in reversed scenarios, where our best result shows approximately a 54.8% improvement over the baseline methods.
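The entropy-based hard-example mining described in the abstract can be sketched as follows; the entropy cutoff `tau` and the function names are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np

def prediction_entropy(logits):
    """Shannon entropy of the softmax distribution for each example.

    High entropy suggests the model is uncertain about an example,
    which SeMi uses as a signal that an unlabeled example is 'hard'.
    """
    z = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    return -(p * np.log(p + 1e-12)).sum(axis=1)

def select_hard_examples(logits, tau=1.0):
    """Indices of examples whose entropy exceeds tau (hypothetical cutoff)."""
    return np.where(prediction_entropy(logits) > tau)[0]
```

For a confident prediction the entropy is near zero, while a near-uniform distribution over C classes approaches log C, so a fixed threshold separates the two regimes.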
January 2025 · 2 Citations · IEEE Internet of Things Journal
With the rapid advancement of flexible manufacturing in the Industrial Internet of Things (IIoT), there has been a significant increase in the number of IIoT devices and application software aimed at meeting various needs. Software defects may lead to delays or crashes in flexible manufacturing systems, thereby affecting the production schedule. Automated software defect localization based on code changes can significantly reduce development and maintenance time costs, thereby maintaining the competitive edge of flexible manufacturing in the IIoT. Current efforts in software defect localization are primarily based on deep learning models or information retrieval models. This paper investigates the performance of Large Language Models (LLMs) in software defect localization and improves localization accuracy by combining them with an information retrieval model. Our empirical study reveals that GPT, given a software defect description, is unable to determine whether specific code changes are relevant and cannot provide accurate answers, which aligns with the generative nature of LLMs, where responses are generated according to probability distributions. However, the combined framework of LLMs and information retrieval models proposed in this paper outperforms the current state-of-the-art models on public datasets. We conclude that LLMs can enhance localization performance when used as side information in conjunction with existing information retrieval models. The effectiveness of the framework has been validated through experiments on publicly available datasets and in practical applications within IIoT projects. This offers valuable insights into the application and development of LLMs for defect localization in software development and maintenance processes in IIoT flexible manufacturing.
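One simple way to use an LLM as "side information" alongside an information retrieval model, as the abstract describes, is to blend the two relevance scores before ranking. The linear blend, the `alpha` weight, and all names below are illustrative assumptions, not the paper's actual formulation:

```python
def combine_scores(ir_scores, llm_scores, alpha=0.7):
    """Blend IR relevance scores with an LLM-derived relevance signal.

    ir_scores / llm_scores: dicts mapping a code-change id to a score
    in [0, 1]. alpha weighs the IR model; code changes the LLM did not
    score default to 0.0.
    """
    return {cid: alpha * ir_scores[cid] + (1 - alpha) * llm_scores.get(cid, 0.0)
            for cid in ir_scores}

def rank_changes(ir_scores, llm_scores, alpha=0.7):
    """Return code-change ids ordered by the blended score, best first."""
    combined = combine_scores(ir_scores, llm_scores, alpha)
    return sorted(combined, key=combined.get, reverse=True)
```

With this shape the IR model remains the primary signal and the LLM only reorders near-ties, which matches the abstract's conclusion that the LLM helps as auxiliary rather than standalone evidence.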
October 2024 · 1 Citation
July 2024
Generative retrieval (GR) has emerged as a transformative paradigm in search and recommender systems, leveraging numeric-based identifier representations to enhance efficiency and generalization. Notably, methods like TIGER, which employ Residual Quantization-based Semantic Identifiers (RQ-SID), have shown significant promise in e-commerce scenarios by effectively managing item IDs. However, a critical issue, termed the "Hourglass" phenomenon, occurs in RQ-SID: intermediate codebook tokens become overly concentrated, hindering the full utilization of generative retrieval methods. This paper analyzes and addresses this problem by identifying data sparsity and long-tailed distribution as the primary causes. Through comprehensive experiments and detailed ablation studies, we analyze the impact of these factors on codebook utilization and data distribution. Our findings reveal that the "Hourglass" phenomenon substantially impacts the performance of RQ-SID in generative retrieval. We propose effective solutions to mitigate this issue, thereby significantly enhancing the effectiveness of generative retrieval in real-world e-commerce applications.
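A minimal way to detect the codebook collapse the abstract describes is to measure how uniformly each residual level uses its codebook, e.g. via normalized token-usage entropy. The function names and this diagnostic are illustrative, not the paper's method:

```python
from collections import Counter
import math

def codebook_entropy(token_ids, codebook_size):
    """Normalized entropy of token usage at one quantization level.

    Values near 1.0 mean the codebook is used uniformly; values near
    0.0 mean usage collapses onto a few tokens, as reported for the
    intermediate levels of RQ-SID ('hourglass' pattern).
    """
    counts = Counter(token_ids)
    n = len(token_ids)
    ent = -sum((c / n) * math.log(c / n) for c in counts.values())
    return ent / math.log(codebook_size)

def utilization_profile(levels, codebook_size):
    """Per-level entropy; a dip in the middle levels signals an hourglass."""
    return [codebook_entropy(level, codebook_size) for level in levels]
```

Plotting this profile across levels makes the hourglass shape directly visible: outer levels near 1.0, intermediate levels far below.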
July 2024
July 2024 · 1 Citation
July 2024 · 1 Citation
June 2024 · 1 Citation
Citations (52)
... Additionally, the analysis of diverse data dimensions (e.g. parameter configurations and product quality inspection data) allows the optimization of production parameters, the prediction of equipment failures, and the dynamic adjustment of supply chains, leading to improved manufacturing flexibility and adaptability [90], [91]. ...
- Citing Article
January 2025
IEEE Internet of Things Journal
... Based on the pretrained multi-modal representation, existing generative recommendation frameworks [25,36] use RQ-VAE [49] to encode the embedding into semantic tokens. However, such a method is suboptimal due to the unbalanced code distribution, which is known as the hourglass phenomenon [23]. We apply a multi-level balanced quantization mechanism to transform the embedding e with a residual K-Means quantization algorithm [27]. ...
- Citing Conference Paper
January 2024
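The residual quantization step mentioned in the snippet above can be sketched as follows, assuming the per-level codebooks are already given (in the cited work each level's codebook would be learned, e.g. with K-Means; the function name is illustrative):

```python
import numpy as np

def residual_quantize(x, codebooks):
    """Multi-level residual quantization of vectors x.

    x: (n, d) array of embeddings.
    codebooks: list of (K, d) arrays, one codebook per level.
    Returns per-level token ids and the final reconstruction: each level
    assigns the current residual to its nearest codeword, then subtracts
    that codeword so the next level quantizes what remains.
    """
    residual = x.copy()
    ids = []
    recon = np.zeros_like(x)
    for cb in codebooks:
        # squared distance from each residual to each codeword: (n, K)
        d2 = ((residual[:, None, :] - cb[None, :, :]) ** 2).sum(-1)
        idx = d2.argmin(axis=1)
        ids.append(idx)
        recon += cb[idx]
        residual = residual - cb[idx]
    return ids, recon
```

The sequence of per-level ids is what serves as the item's semantic identifier in generative recommendation.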
... We first evaluate the HR estimation on all datasets under intra-dataset setting. We compare our method with 32 methods, including traditional methods [2], [4], [5], [25], [56], non-end-to-end methods [9]- [11], [17], [19], [57]- [59], and end-to-end methods [1], [12], [14], [15], [28], [49], [60]- [72]. Note that our CodePhys adopts a hybrid approach that incorporates extensive pre-training in Stage I, followed by an end-to-end process in Stage II. ...
- Citing Conference Paper
October 2024
... Data privacy and security are critical concerns, as the use of sensitive patient information requires robust safeguards and compliance with regulatory standards [43]. Additionally, the interpretability of AI algorithms remains a significant issue, as clinicians must understand how AI-driven decisions are made to trust and adopt these technologies in clinical practice [44]. ...
- Citing Conference Paper
July 2024
... Drawing inspiration from RegGPT (Wang et al., 2024b), we observe that regulations are typically composed of six core elements: metadata (entity/property), constraint, condition, measure, scope, and external references. To facilitate more effective translation of regulatory rules into executable code by LLMs, we propose several refinements to this schema. ...
- Citing Conference Paper
July 2024
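The six-element schema in the snippet above might be represented as a simple data structure; the field names are an illustrative mapping onto the listed elements, not the authors' exact schema:

```python
from dataclasses import dataclass, field

@dataclass
class RegulatoryRule:
    """Six core elements of a regulation, per the cited observation."""
    entity: str                              # metadata: regulated entity
    property: str                            # metadata: property of the entity
    constraint: str                          # what must hold, e.g. '>= 1.1 m'
    condition: str = ""                      # when the constraint applies
    measure: str = ""                        # how compliance is measured
    scope: str = ""                          # where the rule applies
    external_refs: list = field(default_factory=list)  # cross-references
```

A structured record like this is what an LLM would be asked to fill from regulation text before any translation into executable checks.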
... Zhang et al. [405] construct line-level code graphs for code changes and test whether the homophily assumption holds on the constructed code graphs with nodes labeled as "defective" or "non-defective". Results reveal a nonuniform distribution with most code graphs showing strong homophily but some showing significant heterophily. ...
- Citing Conference Paper
June 2024
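The homophily test described in the snippet above reduces to an edge-homophily ratio over the labeled code graph; the sketch below uses the standard definition, which is not necessarily the exact metric of [405]:

```python
def edge_homophily(edges, labels):
    """Fraction of edges joining same-label nodes.

    edges: iterable of (u, v) node pairs; labels: dict mapping each node
    to its label, e.g. 'defective' or 'non-defective'. A ratio near 1.0
    indicates strong homophily; near 0.0, strong heterophily.
    """
    edges = list(edges)
    if not edges:
        return 0.0
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)
```

Computing this per code graph and inspecting the distribution is enough to reproduce the reported finding of mostly-homophilous graphs with a heterophilous tail.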
... The learning rate is 0.0001, the batch size is 32, and the hidden layer feature dimension is 64. (7) ODCRN: OD Convolutional Recurrent Network [40]. ODCRN integrates recurrent and 2D graph convolutional neural networks to address the highly complex spatiotemporal dependencies in sequential OD matrices. ...
- Citing Conference Paper
October 2023
... A prompt is then generated by combining the context and the retrieved results and fed to the LLM to return a predicted statement. Another approach that also uses graph representation is presented in [30]. It integrates a retrieval model that searches for similar code graphs to generate graph nodes, and a completion model based on a Multi-field Graph Attention Block. ...
- Citing Conference Paper
April 2024
... Similarly, BugTranslator [156] used an attention-based RNN encoder-decoder model to translate natural language (i.e., bug reports) into code tokens. TROBO [191], CGMBL [27] and BL-GAN [189] used adversarial learning to bridge the semantic gap between code and bug reports, which includes a generator and a discriminator. The generator is designed to generate new synthetic instances from the task domain that can fool the discriminator, whereas the discriminator is trained to classify instances as either real (i.e., from the real task domain) or fake (i.e., generated by the generator) [189]. ...
- Citing Conference Paper
December 2022
... The structure of the system is adaptable and can be altered to accommodate changes at the intersection. In Kuang et al. (2021), an ITSC method that utilizes RL with a state reduction was developed. The model utilizes historical traffic flow and a dual-objective reward function to minimize vehicle delays and enhance the signal timing. ...
- Citing Article
July 2021
ACM Transactions on Internet Technology
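The dual-objective reward mentioned in the snippet above could be sketched as a weighted combination of the two objectives; the weights, inputs, and function name are illustrative assumptions, not the cited method's actual reward:

```python
def signal_reward(avg_delay, timing_imbalance, w_delay=0.8, w_balance=0.2):
    """Dual-objective reward for traffic-signal control.

    Penalizes average vehicle delay and a signal-timing imbalance term;
    the RL agent maximizes this, so lower delay and better-balanced
    timing yield a higher (less negative) reward.
    """
    return -(w_delay * avg_delay + w_balance * timing_imbalance)
```

Tuning the two weights trades off throughput against fairness across approaches, which is the point of making the reward dual-objective.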