About
395
Publications
137,473
Reads
18,645
Citations
Introduction
Xinwang Liu received his PhD degree from National University of Defense Technology (NUDT), China. He is now Professor at School of Computer, NUDT. Dr. Liu has published 70+ peer-reviewed papers, including those in highly regarded journals and conferences such as IEEE T-PAMI, IEEE T-KDE, IEEE T-IP, IEEE T-NNLS, IEEE T-MM, IEEE T-IFS, NeurIPS, CVPR, ICCV, etc. More information can be found at https://xinwangliu.github.io/
Current institution
Additional affiliations
January 2020 - December 2020
August 2014 - February 2015
December 2019 - present
Education
February 2009 - December 2013
September 2006 - December 2008
Publications (395)
Knowledge graphs (KGs) represent known entities and their relationships using triplets, but this method cannot represent relationships between facts, limiting their expressiveness. Recently, the Bi-level Knowledge Graph (Bi-level KG) has addressed this issue by modeling facts as nodes and establishing relationships between these facts, introducing...
Federated learning is a promising bridge that connects machine learning methods and multi-central medical data. It trains models using the local data, and protects the privacy of data. There are many methods for federated learning to aggregate models, especially personalized methods, which show relatively excellent performance. However, most of the...
Leveraging the powerful representation learning capabilities, deep multi-view clustering methods have demonstrated reliable performance by effectively integrating multi-source information from diverse views in recent years. Most existing methods rely on the assumption of clean views. However, noise is pervasive in real-world scenarios, leading to a...
Chain-of-thought (CoT) distillation allows a large language model (LLM) to guide a small language model (SLM) in reasoning tasks. Existing methods train the SLM to learn the long rationale in one iteration, resulting in two issues: 1) Long rationales lead to a large token-level batch size during training, making gradients of core reasoning tokens (...
Hyperspectral video (HSV) provides rich spectral-spatial-temporal information, enabling the capture of complex object dynamics beyond the limitations of conventional single- and multi-modal tracking. However, current HSV tracking methods face challenges such as data scarcity, band gaps, spectral fragmentation, temporal underutilization, and high com...
In incomplete multi-view clustering (IMVC), missing data induce prototype shifts within views and semantic inconsistencies across views. A feasible solution is to explore cross-view consistency in paired complete observations, further imputing and aligning the similarity relationships inherently shared across views. Nevertheless, existing methods a...
Reasoning paths are reliable information in knowledge graph completion (KGC) in which algorithms can find strong clues of the actual relation between entities. However, in real-world applications, it is difficult to guarantee that computationally affordable paths exist toward all candidate entities. According to our observation, the prediction accu...
Clustering has attracted more and more attention as one of the most fundamental techniques in the field of unsupervised learning. To deal with nonlinear problems, clustering methods have been extended to the kernel version. As a traditional kernel clustering algorithm, multiple kernel k-means (MKKM) aims to learn clustering results from a consensu...
Multi-view clustering (MVC) for remote sensing data is a critical and challenging task in Earth observation. Although recent advances in graph neural network (GNN)-based MVC have shown remarkable success, the most prevalent approaches have two major limitations: 1) heavily relying on a predefined yet fixed graph, which limits the performance of clu...
Existing Multiple Kernel Clustering (MKC) algorithms commonly utilize the Nyström method to handle large-scale datasets. However, most of them employ uniform sampling for kernel matrix approximation, hence failing to accurately capture the underlying data structure, leading to large approximation errors. Additionally, they often use the same landma...
Text-based knowledge graph completion methods take advantage of pre-trained language models (PLM) to enhance intrinsic semantic connections of raw triplets with detailed text descriptions. Typical methods in this branch map an input query (textual descriptions associated with an entity and a relation) and its candidate entities into feature vectors...
Anchor selection or learning has become a critical component in large-scale multi-view clustering. Existing anchor-based methods, which either select-then-fix or initialize-then-optimize with orthogonality, yield promising performance. However, these methods still suffer from instability of initialization or insufficient depiction of data distribut...
Zhibin Dong Pei Li Yi Jiang- [...]
K.L. He
Chronic noncommunicable diseases (NCDs) are often characterized by gradual onset and slow progression, but the difficulty in early prediction remains a substantial health challenge worldwide. This study aims to explore the interconnectedness of disease occurrence through multi‐omics studies and validate it in large‐scale electronic health records....
Attribute-missing graph learning, a common yet challenging problem, has recently attracted considerable attention. Existing efforts have at least one of the following limitations: 1) lack a noise filtering and information enhancing scheme, resulting in less comprehensive data completion; 2) isolate the node attribute and graph structure encoding pr...
Single Domain Generalization (SDG) aims to train models with consistent performance across diverse scenarios using data from a single source. While latent diffusion models (LDMs) show promise in augmenting limited source data, we demonstrate that directly using synthetic data can be detrimental due to significant feature distribution discrepa...
Multi-view clustering leverages complementary representations from diverse sources to enhance performance. However, real-world data often suffer incomplete cases due to factors like privacy concerns and device malfunctions. A key challenge is effectively utilizing available instances to recover missing views. Existing methods frequently overlook th...
Multimodal named entity recognition (MNER) is an emerging field that aims to automatically detect named entities and classify their categories, utilizing input text and auxiliary resources such as images. While previous studies have leveraged object detectors to preprocess images and fuse textual semantics with corresponding image features, these m...
Multi-view clustering (MvC) aims to integrate information from different views to enhance the capability of the model in capturing the underlying data structures. The widely used joint training paradigm in MvC may not fully leverage the multi-view information, since the imbalanced and under-optimized view-specific features caused by the...
Dynamic ensemble has significantly greater potential space to improve the classification of imbalanced data compared to static ensemble. However, dynamic ensemble schemes are far less successful than static ensemble methods in the imbalanced learning field. Through an in-depth analysis on the behavior characteristics of dynamic ensemble, we find th...
Multi-view graph clustering (MVGC) explores pairwise correlations of entire instances and comprehensively aggregates diverse source information with optimal graph structure. One major issue of practical MVGC is its high time and space complexity, which prevents it from being applied to large-scale applications. As a promising solution of addressing large-sca...
Deep learning has been widely applied in recommender systems, which have recently achieved revolutionary progress. However, most existing learning-based methods assume that the user and item distributions remain unchanged between the training phase and the test phase. Yet the distribution of user and item features can naturally shift in real-wo...
Object detection is a critical task in computer vision, with applications in various domains such as autonomous driving and urban scene monitoring. However, deep learning-based approaches often demand large volumes of annotated data, which are costly and difficult to acquire, particularly in complex and unpredictable real-world environments. This d...
Multiview clustering thrives in applications where views are collected in advance by extracting consistent and complementary information among views. However, it overlooks scenarios where data views are collected sequentially, i.e., real-time data. Due to privacy issues or memory burden, previous views are not available with time in these situation...
Spatial transcriptomics technology fully leverages spatial location and gene expression information for spatial clustering tasks. However, existing spatial clustering methods primarily concentrate on utilizing the complementary features between spatial and gene expression information, while overlooking the discriminative features during the integra...
Federated knowledge graph reasoning (FedKGR) aims to perform reasoning over different clients while protecting data privacy, drawing increasing attention to its high practical value. Previous works primarily focus on data heterogeneity, ignoring challenges from limited data scale and primitive negative sample strategies, i.e., random entity replace...
Clustering ensemble has been widely studied in data mining and machine learning. However, the existing clustering ensemble methods do not pay attention to fairness, which is important in real-world applications, especially in applications involving humans. To address this issue, this paper proposes a novel fair clustering ensemble method, which tak...
This study is devoted to the design of gradient boosted fuzzy rule-based models for regression problems. Fuzzy rule-based models are built on the basis of information granules formed in the input and output spaces whose structure involves a family of conditional ‘if-then’ statements. The architecture of fuzzy rule-based models contributes to the re...
Multiple kernel clustering (MKC) enhances clustering performance by deriving a consensus partition or graph from a predefined set of kernels. Despite many advanced MKC methods proposed in recent years, the prevalent approaches involve incorporating all kernels by default to capture diverse information within the data. However, learning from all ker...
Video instance segmentation (VIS) is a challenging task, requiring handling object classification, segmentation, and tracking in videos. Existing Transformer-based VIS approaches have shown remarkable success, combining encoded features and instance queries as decoder inputs. However, their decoder inputs are low-resolution due to computational cos...
This paper investigates Gradient Normalization Stochastic Gradient Descent without Clipping (NSGDC) and its variance reduction variant (NSGDC-VR) for nonconvex optimization under heavy-tailed noise. We present significant improvements in the theoretical results for both algorithms, including the removal of logarithmic factors from the convergence r...
In recent years, there has been significant advancement in the field of single-cell data analysis, particularly in the development of clustering methods. Despite these advancements, most algorithms continue to focus primarily on analyzing the provided single-cell matrix data. However, within medical contexts, single-cell data often encompasses a we...
Although data-driven methods usually perform well on disease diagnosis and treatment, they raise privacy-leakage concerns because they collect data for model training. Recently, federated learning has provided a secure and trustworthy alternative for collaboratively training models without any exchange of medical data among multiple institute...
Few-shot learning presents a substantial challenge in developing robust models due to the inherent scarcity of samples within each category. To overcome this challenge, metric-based methods have been introduced, classifying images based on the relationships among samples within a given embedding space. While these methods are effective, the limited...
Vertical federated learning is a natural and elegant approach to integrate multi-view data vertically partitioned across devices (clients) while preserving their privacy. Apart from the model training, existing methods require the collaboration of all clients in the model inference. However, the model inference is probably maintained for service...
Multi-view subspace clustering (MVSC) is a popular area of research that concentrates on partitioning data points from multiple views. It has gained wide attention in recent years due to the ability to handle complex data with diverse features across different views. However, the success of MVSC largely relies on the quality of the learned similari...
Clustering is a popular research pipeline in unsupervised learning to find potential groupings. As a representative paradigm in multiple kernel clustering (MKC), late fusion-based models learn a consistent partition across multiple base kernels. Despite their promising performance, a common concern is the limited representation capacity caused by t...
Incomplete multiview clustering (IMVC) generally requires the number of anchors to be the same in all views. Also, this number needs to be tuned with extra manual efforts. This not only degenerates the diversity of multiview data but also limits the model’s scalability. For generating differentiated numbers of anchors without tuning, in this articl...
Anchor graphs have recently been proposed to accelerate multi-view graph clustering and are widely applied in various large-scale applications. Rather than capturing full instance relationships, these methods choose a small portion of anchors in each view, construct single-view anchor graphs, and combine them into a unified graph. Despite its efficienc...
Benefiting from strong reasoning capabilities, large language models (LLMs) have demonstrated remarkable performance in recommender systems. Various efforts have been made to distill knowledge from LLMs to enhance collaborative models, employing techniques like contrastive learning for representation alignment. In this work, we prove that direc...
Multiview clustering has become a prominent research topic in data analysis, with wide-ranging applications across various fields. However, the existing late fusion multiview clustering (LFMVC) methods still exhibit some limitations, including variable importance and contributions and a heightened sensitivity to noise and outliers during the alignm...
Unsupervised multi-view bipartite graph clustering (MVBGC) is a fast-growing research area, owing to its promising scalability in large-scale tasks. Although many variants have been proposed using various strategies, a common design is to construct the bipartite graph directly from the input data, i.e., to consider only the unidirectional “encoding” process. However, “enc...
Anchor-based multi-view graph clustering has recently gained popularity as an effective approach for clustering data with multiple views. However, existing methods have limitations in terms of handling inconsistent information and noise across views, resulting in an unreliable consensus representation. Additionally, post-processing is needed to obt...
Single-cell multi-view clustering is essential for analyzing the different cell subtypes of the same cell from different views. Some attempts have been made, but most of these models still struggle to handle single-cell sequencing data, primarily due to their non-specific design for cellular data. We observe that such data distinctively exhibits: (...
Deep learning has been widely applied in recommender systems, which have achieved revolutionary progress recently. However, most existing learning-based methods assume that the user and item distributions remain unchanged between the training phase and the test phase. Yet the distribution of user and item features can naturally shift in real-wo...
Deep graph clustering, which aims to reveal the underlying graph structure and divide the nodes into different clusters without human annotations, is a fundamental yet challenging task. However, we observe that the existing methods suffer from the representation collapse problem and tend to encode samples with different classes into the same latent...
Knowledge graph reasoning (KGR), aiming to deduce new facts from existing facts based on mined logic rules underlying knowledge graphs (KGs), has become a fast-growing research direction. It has been proven to significantly benefit the usage of KGs in many AI applications, such as question answering, recommendation systems, etc. According to th...
While humans can excel at image classification tasks by comparing a few images, existing metric-based few-shot classification methods are still not well adapted to novel tasks. Performance declines rapidly when encountering new patterns, as feature embeddings cannot effectively encode discriminative properties. Moreover, existing matching methods i...
Temporal graph learning aims to generate high-quality representations for graph-based tasks with dynamic information, which has recently garnered increasing attention. In contrast to static graphs, temporal graphs are typically organized as node interaction sequences over continuous time rather than an adjacency matrix. Most temporal graph learning...
Multi-view clustering (MVC) has attracted broad attention due to its capacity to exploit consistent and complementary information across views. This paper focuses on a challenging issue in MVC called the incomplete continual data problem (ICDP). Specifically, most existing algorithms assume that views are available in advance and overlook the scena...
Existing multiple kernel clustering (MKC) algorithms have two ubiquitous problems. From the theoretical perspective, most MKC algorithms lack sufficient theoretical analysis, especially the consistency of learned parameters, such as the kernel weights. From the practical perspective, the high complexity makes MKC unable to handle large-scale datase...
Deep clustering with attribute-missing graphs, where only a subset of nodes possesses complete attributes while those of others are missing, is an important yet challenging topic in various practical applications. It has become a prevalent learning paradigm in existing studies to perform data imputation first and subsequently conduct clustering usi...
GraIL and its variants have shown their promising capacities for inductive relation reasoning on knowledge graphs. However, the uni-directional message-passing mechanism hinders such models from exploiting hidden mutual relations between entities in directed graphs. Besides, the enclosing subgraph extraction in most GraIL-based models restricts the...
Incomplete multi-view clustering has attracted much attention due to its ability to handle partial multi-view data. Recently, similarity-based methods have been developed to explore the complete relationship among incomplete multi-view data. Although widely applied to partial scenarios, most of the existing approaches are still faced with two limit...
Crime prediction is a crucial yet challenging task within urban computing, which benefits public safety and resource optimization. Over the years, various models have been proposed, and spatial-temporal hypergraph learning models have recently shown outstanding performances. However, three correlations underlying crime are ignored, thus hindering t...
In numerous real-world applications, it is quite common that sample information is partially available for some views due to machine breakdown or sensor failure, causing the problem of incomplete multi-view clustering (IMVC). While several IMVC approaches using view-shared anchors have successfully achieved pleasing performance improvement, (1) the...
Few-shot relation reasoning on knowledge graphs (FS-KGR) is an important and practical problem that aims to infer long-tail relations and has drawn increasing attention these years. Among all the proposed methods, self-supervised learning (SSL) methods, which effectively extract the hidden essential inductive patterns relying only on the support se...
Graph Neural Networks (GNNs) have achieved promising performance in semi-supervised node classification in recent years. However, the problem of insufficient supervision, together with representation collapse, largely limits the performance of the GNNs in this field. To alleviate the collapse of node representations in semi-supervised scenario, we...
With the development of various applications, such as recommendation systems and social network analysis, graph data have been ubiquitous in the real world. However, graphs usually suffer from being absent during data collection due to copyright restrictions or privacy-protecting policies. The graph absence could be roughly grouped into attribute-i...
Fully supervised change detection (CD) methods have achieved significant advancements in performance, yet they depend severely on acquiring costly pixel-level labels. Considering that the patch-level annotations also contain abundant information corresponding to both changed and unchanged objects in bi-temporal images, an intuitive solution is to s...
Hyperspectral image (HSI) clustering is a fundamental yet challenging task that groups image pixels with similar features into distinct clusters. Among various approaches, contrastive learning methods, which employ the concept of encouraging semantically similar samples to move closer together while pushing semantically inconsistent samples apart,...
Hao Yu Ke Liang Dayu Hu- [...]
Xinwang Liu
The ubiquity of Graph Neural Networks (GNNs) emphasizes the imperative to assess their resilience against node injection attacks, a type of evasion attacks that impact victim models by injecting nodes with fabricated attributes and structures. However, prevailing attacks face two primary limitations: (1) Sequential construction of attributes and st...
Site selection aims to select optimal locations for new stores, which is crucial in business management and urban computing. The early data-driven models heavily relied on feature engineering, which could not effectively model the complex relationships and diverse influences among different data. To alleviate such issues, the knowledge-driven parad...
Unsupervised bipartite graph learning has been a hotspot in multi-view clustering, as it tackles the restricted scalability of traditional full graph clustering in large-scale applications. However, the existing bipartite graph clustering paradigm pays little attention to the adverse impact of noisy features on the learning process. To further facilit...
Anchor technology is popularly employed in multi-view subspace clustering (MVSC) to reduce the complexity cost. However, due to the sampling operation being performed on each individual view independently and not considering the distribution of samples in all views, the produced anchors are usually slightly distinguishable, failing to characterize...
Jingtao Hu Bin Xiao Hu Jin- [...]
En Zhu
Graph anomaly detection (GAD) has gained increasing attention in various attribute graph applications, e.g., social communication and financial fraud transaction networks. Recently, graph contrastive learning (GCL)-based methods have been widely adopted as the mainstream for GAD with remarkable success. However, existing GCL strategies in GAD mainl...
Sign-based stochastic methods have gained attention due to their ability to achieve robust performance despite using only the sign information for parameter updates. However, the current convergence analysis of sign-based methods relies on the strong assumptions of first-order gradient Lipschitz and second-order gradient Lipschitz, which may not ho...
In order to solve the time-varying quadratic programming (TVQP) problem more effectively, a new self-adaptive zeroing neural network (ZNN) is designed and analyzed in this article by using the Takagi-Sugeno fuzzy logic system (TSFLS) and thus called the Takagi-Sugeno (T-S) fuzzy ZNN (TSFZNN). Specifically, a multiple-input-single-output TSFLS is de...
Yu Hu Endai Guo Zhi Xie- [...]
Hongmin Cai
Multi-view clustering aims at integrating information from different views to improve clustering performance. Recent methods integrate multiple view-specific partition matrices to seek a consensus one and have demonstrated promising clustering performance in various applications. However, the clustering performance of such methods heavily relies on...
Recently, graph anomaly detection on attributed networks has attracted growing attention in data mining and machine learning communities. Apart from attribute anomalies, graph anomaly detection also aims at suspicious topological-abnormal nodes that exhibit collective anomalous behavior. Closely connected uncorrelated node groups form uncommonly de...
Multiview clustering has attracted increasing attention for automatically dividing instances into various groups without manual annotations. Traditional shallow methods discover the internal structure of data, while deep multiview clustering (DMVC) utilizes neural networks with clustering-friendly data embeddings. Although both of them achieve impressi...
Graph clustering, which aims to divide nodes in the graph into several distinct clusters, is a fundamental yet challenging task. Benefiting from the powerful representation capability of deep learning, deep graph clustering methods have achieved great success in recent years. However, the corresponding survey paper is relatively scarce, and it is i...
Recently, metric-based meta-learning methods have been effectively applied to few-shot image classification. These methods classify images based on the relationship between samples in an embedding space, avoiding over-fitting that can occur when training classifiers with limited samples. However, finding an embedding space with good generalization...
Anchor-based multi-view graph clustering (AMVGC) has received abundant attention owing to its high efficiency and the capability to capture complementary structural information across multiple views. Intuitively, a high-quality anchor graph plays an essential role in the success of AMVGC. However, the existing AMVGC methods only consider single-str...