Zenglin Xu

Zenglin Xu
University of Electronic Science and Technology of China | UESTC · School of Computer Science & Engineering

PhD

About

280
Publications
47,143
Reads
How we measure 'reads'
A 'read' is counted each time someone views a publication summary (such as the title, abstract, and list of authors), clicks on a figure, or views or downloads the full-text. Learn more
8,412
Citations
Introduction
Additional affiliations
August 2010 - April 2014
Purdue University West Lafayette
Position
  • Researcher

Publications

Publications (280)
Article
Transformer-based models have demonstrated state-of-the-art performance in various intelligent coding tasks such as code comment generation and code completion. Previous studies show that deep learning models are sensitive to input variations, but few have systematically studied the robustness of Transformer under perturbed input code. In this work...
Preprint
The problem of forecasting spatiotemporal events such as crimes and accidents is crucial to public safety and city management. Besides accuracy, interpretability is also a key requirement for spatiotemporal forecasting models to justify the decisions. Interpretation of the spatiotemporal forecasting mechanism is, however, challenging due to the com...
Preprint
As pre-trained models, like Transformers, are increasingly deployed on cloud platforms for inference services, the privacy concerns surrounding model parameters and inference data are becoming more acute. Current Privacy-Preserving Transformer Inference (PPTI) frameworks struggle with the "impossible trinity" of privacy, efficiency, and performance...
Conference Paper
Full-text available
Recent advancements in pre-trained large language models (LLMs) have significantly influenced various domains. Adapting these models for specific tasks often involves fine-tuning (FT) with private, domain-specific data. However, privacy concerns keep this data undisclosed, and the computational demands for deploying LLMs pose challenges for resourc...
Article
Graph-based multi-view learning, which has hitherto been used to discover the intrinsic patterns of graph data giving the credit to its convenience of implementation and effectiveness. Note that even though these approaches have been increasingly adopted in multi-view clustering and have generated promising outcomes, they are still faced with the s...
Article
Full-text available
Sindhi word segmentation is a challenging task due to space omission and insertion issues. The Sindhi language itself adds to this complexity. It’s cursive and consists of characters with inherent joining and non-joining properties, independent of word boundaries. Existing Sindhi word segmentation methods rely on designing and combining hand-crafte...
Preprint
Full-text available
Federated Learning (FL) is a distributed learning approach that trains neural networks across multiple devices while keeping their local data private. However, FL often faces challenges due to data heterogeneity, leading to inconsistent local optima among clients. These inconsistencies can cause unfavorable convergence behavior and generalization p...
Preprint
Cardiovascular diseases (CVDs) are currently the leading cause of death worldwide, highlighting the critical need for early diagnosis and treatment. Machine learning (ML) methods can help diagnose CVDs early, but their performance relies on access to substantial data with high quality. However, the sensitive nature of healthcare data often restrict...
Preprint
Time series forecasting is extensively applied across diverse domains. Transformer-based models demonstrate significant potential in modeling cross-time and cross-variable interaction. However, we notice that the cross-variable correlation of multivariate time series demonstrates multifaceted (positive and negative correlations) and dynamic progres...
Preprint
Multimodal Large Language Models (MLLMs) demonstrate remarkable performance across a wide range of domains, with increasing emphasis on enhancing their zero-shot generalization capabilities for unseen tasks across various modalities. Instruction tuning has emerged as an effective strategy for achieving zero-shot generalization by finetuning pretrai...
Preprint
Full-text available
Recent advancements in pre-trained large language models (LLMs) have significantly influenced various domains. Adapting these models for specific tasks often involves fine-tuning (FT) with private, domain-specific data. However, privacy concerns keep this data undisclosed, and the computational demands for deploying LLMs pose challenges for resourc...
Conference Paper
Federated learning (FL) revolutionizes distributed machine learning by enabling devices to collaboratively learn a model while maintaining data privacy. However, FL usually faces a critical challenge with limited labeled data, making semi-supervised learning (SSL) crucial for utilizing abundant unlabeled data. The integration of SSL within the fede...
Article
Trustworthy Artificial Intelligence (TAI) has proven invaluable in curbing potential negative repercussions tied to AI applications. Within the TAI spectrum, Federated Learning (FL) emerges as a promising solution to safeguard personal information in distributed settings across a multitude of practical contexts. However, the realm of FL is not with...
Preprint
Product attribute value extraction involves identifying the specific values associated with various attributes from a product profile. While existing methods often prioritize the development of effective models to improve extraction performance, there has been limited emphasis on extraction efficiency. However, in real-world scenarios, products are...
Preprint
Graph Neural Networks (GNNs) have gained significant attention as a powerful modeling and inference method, especially for homophilic graph-structured data. To empower GNNs in heterophilic graphs, where adjacent nodes exhibit dissimilar labels or features, Signed Message Passing (SMP) has been widely adopted. However, there is a lack of theoretical...
Article
Self-supervised learning aims to learn representation that can be effectively generalized to downstream tasks. Many self-supervised approaches regard two views of an image as both the input and the self-supervised signals, assuming that either view contains the same task-relevant information and the shared information is (approximately) sufficient...
Article
Few-shot classification aims to adapt classifiers trained on base classes to novel classes with a few shots. However, the limited amount of training data is often inadequate to represent the intraclass variations in novel classes. This can result in biased estimation of the feature distribution, which in turn results in inaccurate decision boundari...
Article
Most previous sequential labeling models are taskspecific, while recent years have witnessed the rise of generative models due to the advantage of unifying all named entity recognition (NER) tasks into the encoder-decoder framework. Although achieving promising performance, our pilot studies demonstrate that existing generative models are ineffecti...
Article
Federated learning (FL) has surged in prominence due to its capability of collaborative model training without direct data sharing. However, the vast disparity in local data distributions among clients, often termed the nonindependent identically distributed (Non-IID) challenge, poses a significant hurdle to FL’s generalization efficacy. The scenar...
Chapter
Generalized category discovery (GCD) is an important open-world task to automatically cluster the unlabeled samples using information transferred from the labeled dataset, given a dataset consisting of labeled and unlabeled instances. A major challenge in GCD is that unlabeled novel class samples and unlabeled known class samples are mixed together...
Chapter
As a class of nonlinear subspace clustering methods, kernel subspace clustering has shown promising performance in many applications. This paper focuses on the kernel selection problem in the kernel subspace clustering model. Currently, the kernel function is typically chosen by the single kernel or multiple kernel methods. The former relies on a g...
Conference Paper
The success of Transformers in long time series forecasting (LTSF) can be attributed to their attention mechanisms and non-autoregressive (NAR) decoder structures, which capture long-range de- pendencies. However, time series data also contain abundant local temporal dependencies, which are often overlooked in the literature and significantly hinde...
Conference Paper
Fine-tuning large vision-language models is a challenging task. Prompt tuning approaches have been introduced to learn fixed textual or visual prompts while freezing the pre-trained model in downstream tasks. Despite the effectiveness of prompt tuning, what do those learnable prompts learn remains unexplained. In this work, we explore whether promp...
Article
Technical Q&A sites, such as Stack Overflow and Ask Ubuntu, have been widely utilized by software engineers to seek support for development challenges. However, not all the raised questions get instant feedback and the retrieved answers can vary in quality. The users can hardly avoid spending much time before solving their problems. Prior studies p...
Article
As one of the most important research topics in the unsupervised learning field, Multi-View Clustering (MVC) has been widely studied in the past decade and numerous MVC methods have been developed. Among these methods, the recently emerged Graph Neural Networks (GNN) shine a light on modeling both topological structure and node attributes in the fo...
Preprint
Full-text available
Large language models (LLMs) encode a vast amount of world knowledge acquired from massive text datasets. Recent studies have demonstrated that LLMs can assist an embodied agent in solving complex sequential decision making tasks by providing high-level instructions. However, interactions with LLMs can be time-consuming. In many practical scenarios...
Preprint
With the development of deep learning (DL), DL-based code search models have achieved state-of-the-art performance and have been widely used by developers during software development. However, the security issue, e.g., recommending vulnerable code, has not received sufficient attention, which will bring potential harm to software development. Poiso...
Preprint
Full-text available
Learning with imbalanced data is a challenging problem in deep learning. Over-sampling is a widely used technique to re-balance the sampling distribution of training data. However, most existing over-sampling methods only use intra-class information of minority classes to augment the data but ignore the inter-class relationships with the majority o...
Preprint
Multivariate Time Series forecasting has been an increasingly popular topic in various applications and scenarios. Recently, contrastive learning and Transformer-based models have achieved good performance in many long-term series forecasting tasks. However, there are still several issues in existing methods. First, the training paradigm of contras...
Article
Graph Neural Networks (GNNs) have attracted much attention due to their superior learning capability. Despite the successful applications of GNNs in many areas, their performance suffers heavily from the long-tailed node degree distribution. Most prior studies tackle this issue by devising sophisticated model architectures. In this paper, we aim to...
Article
FedLab is a lightweight open-source framework for the simulation of federated learning. The design of FedLab focuses on federated learning algorithm effectiveness and communication efficiency. It allows customization on server optimization, client optimization, communication agreement, and communication compression. Also, FedLab is scalable in diff...
Preprint
With increasing privacy concerns on data, recent studies have made significant progress using federated learning (FL) on privacy-sensitive natural language processing (NLP) tasks. Much literature suggests fully fine-tuning pre-trained language models (PLMs) in the FL paradigm can mitigate the data heterogeneity problem and close the performance gap...
Preprint
Code completion is a valuable topic in both academia and industry. Recently, large-scale mono-programming-lingual (MonoPL) pre-training models have been proposed to boost the performance of code completion. However, the code completion on low-resource programming languages (PL) is difficult for the data-driven paradigm, while there are plenty of de...
Article
Full-text available
Bayesian inference is an important research area in cognitive computation due to its ability to reason under uncertainty in machine learning. As a representative algorithm, Stein variational gradient descent (SVGD) and its variants have shown promising successes in approximate inference for complex distributions. In practice, we notice that the ker...
Conference Paper
Multi-view subspace clustering aims to exploit a common affinity representation by means of self-expression. Plenty of works have been presented to boost the clustering performance, yet seldom considering the topological structure in data, which is crucial for clustering data on manifold. Orthogonal to existing works, in this paper, we argue that i...
Article
Imperceptible adversarial examples are capable of deceiving the deep neural networks with high confidence. Recent studies show that it is particularly effective to control the attack space to low-frequency components of the image on the basis of adversarial training. However, those methods face the problem of losing valuable knowledge, especially s...
Preprint
Few-shot learning (FSL) targets at generalization of vision models towards unseen tasks without sufficient annotations. Despite the emergence of a number of few-shot learning methods, the sample selection bias problem, i.e., the sensitivity to the limited amount of support data, has not been well understood. In this paper, we find that this problem...
Article
Real-world data usually present long-tailed distributions. Training on imbalanced data tends to render neural networks perform well on head classes while much worse on tail classes. The severe sparseness of training instances for the tail classes is the main challenge, which results in biased distribution estimation during training. Plenty of effor...
Preprint
Full-text available
We introduce ViLPAct, a novel vision-language benchmark for human activity planning. It is designed for a task where embodied AI agents can reason and forecast future actions of humans based on video clips about their initial activities and intents in text. The dataset consists of 2.9k videos from \charades extended with intents via crowdsourcing,...
Preprint
Transformer-based models have achieved great success on sentence pair modeling tasks, such as answer selection and natural language inference (NLI). These models generally perform cross-attention over input pairs, leading to prohibitive computational costs. Recent studies propose dual-encoder and late interaction architectures for faster computatio...
Article
Multiview graph clustering has emerged as an important yet challenging technique due to the difficulty of exploiting the similarity relationships among multiple views. Typically, the similarity graph for each view learned by these methods is easily corrupted because of the unavoidable noise or diversity among views. To recover a clean graph, existi...
Article
Convolutional Neural Networks (CNNs) have achieved tremendous success in a number of learning tasks including image classification. Residual-like networks, such as ResNets, mainly focus on the skip connection to avoid gradient vanishing. However, the skip connection mechanism limits the utilization of intermediate features due to simple iterative u...
Preprint
Transformer-based models have demonstrated state-of-the-art performance in many intelligent coding tasks such as code comment generation and code completion. Previous studies show that deep learning models are sensitive to the input variations, but few studies have systematically studied the robustness of Transformer under perturbed input code. In...
Conference Paper
Hierarchical clustering recursively partitions data at an increasingly finer granularity. In real-world applications, multi-view data have become increasingly important. This raises a less investigated problem, i.e., multi-view hierarchical clustering, to better understand the hierarchical structure of multi-view data. To this end, we propose a nov...
Article
Considerable attention to kernel subspace clustering has emerged in recent years due to its excellent performance in grouping data points from multiple manifolds. In particular, multiple kernel learning (MKL) is commonly used in kernel subspace clustering models to alleviate the difficulty of choosing a suitable kernel function, which brings out nu...
Article
Multi-view clustering has received a lot of attentions in data mining recently. Though plenty of works have been investigated on this topic, it is still a severe challenge due to the complex nature of the multiple heterogeneous features. Particularly, existing multi-view clustering algorithms fail to consider the topological structure in the data,...
Preprint
Graph Neural Networks (GNNs) have attracted much attention due to their ability in learning representations from graph-structured data. Despite the successful applications of GNNs in many domains, the optimization of GNNs is less well studied, and the performance on node classification heavily suffers from the long-tailed node degree distribution....
Article
Full-text available
We study the model robustness against adversarial examples, referred to as small perturbed input data that may however fool many state-of-the-art deep learning models. Unlike previous research, we establish a novel theory addressing the robustness issue from the perspective of stability of the loss function in the small neighborhood of natural exam...
Article
Solving a kernel regression problem usually suffers from expensive computation and storage costs due to the large kernel size. To tackle this problem, the Nyström method is proposed and widely applied to large-scale kernel methods as an approximate solution. The key idea of this method is to select a subset of columns of the kernel matrix and rebui...
Preprint
Full-text available
Tensorial Convolutional Neural Networks (TCNNs) have attracted much research attention for their power in reducing model parameters or enhancing the generalization ability. However, exploration of TCNNs is hindered even from weight initialization methods. To be specific, general initialization methods, such as Xavier or Kaiming initialization, usua...
Preprint
Full-text available
Hierarchical clustering recursively partitions data at an increasingly finer granularity. In real-world applications, multi-view data have become increasingly important. This raises a less investigated problem, i.e., multi-view hierarchical clustering, to better understand the hierarchical structure of multi-view data. To this end, we propose a nov...
Article
Multi-view learning aims to explore a global common structure shared by different views collected from multiple individual sources. The nascent field of federated learning tries to learn a global model over distributed networks of devices. This paper shows that multi-view learning is naturally suited to address the feature heterogeneity of the fede...
Article
The ResNet and its variants have achieved remarkable successes in various computer vision tasks. Despite its success in making gradient flow through building blocks, the information communication of intermediate layers of blocks is ignored. To address this issue, in this brief, we propose to introduce a regulator module as a memory mechanism to ext...
Preprint
Source code summarization aims at generating concise and clear natural language descriptions for programming languages. Well-written code summaries are beneficial for programmers to participate in the software development and maintenance process. To learn the semantic representations of source code, recent efforts focus on incorporating the syntax...
Article
Full-text available
Multi-view spectral clustering has drawn much attention due to the effectiveness of exploiting the similarity relationships among data points. These methods typically reveal the intrinsic structure using a predefined graph for each view. The predefined graphs are fused to a consensus one, on which the final clustering results are obtained. However,...
Article
Multi-view clustering aims to reveal the correlation between different input modalities in an unsupervised way. Similarity between data samples can be described by a similarity graph, which governs the quality of multi-view clustering. However, existing multi-view graph learning methods mainly construct similarity graph based on raw features, which...
Article
In multi-view subspace clustering, it is significant to find a common latent space in which the multi-view datasets are located. A number of multi-view subspace clustering methods have been proposed to explore the common latent subspace and achieved promising performance. However, previous multi-view subspace clustering algorithms seldom consider t...
Article
Deep semi-supervised learning is a fast-growing field with a range of practical applications. This paper provides a comprehensive survey on both fundamentals and recent advances in deep semi-supervised learning methods from perspectives of model design and unsupervised loss functions. We first present a taxonomy for deep semi-supervised learning th...
Preprint
Full-text available
Few-shot classification aims to adapt classifiers to novel classes with a few training samples. However, the insufficiency of training data may cause a biased estimation of feature distribution in a certain class. To alleviate this problem, we present a simple yet effective feature rectification method by exploring the category correlation between...
Preprint
Full-text available
Deep discriminative models (DDMs), such as deep regression forests, deep neural decision forests, have been extensively studied recently to solve problems like facial age estimation, head pose estimation, gaze estimation and so forth. Such problems are challenging in part because a large amount of effective training data without noise and bias is o...
Conference Paper
Full-text available
The category gap between training and evaluation has been characterised as one of the main obstacles to the success of Few-Shot Learning (FSL). In this paper, we for the first time empirically identify image background, common in realistic images, as a shortcut knowledge helpful for in-class classification but ungeneralizable beyond training catego...