Yijie Lin’s research while affiliated with Sichuan University and other places


Publications (17)


LLaVA-ReID: Selective Multi-image Questioner for Interactive Person Re-Identification
  • Preprint

April 2025 · 1 Read

Yiding Lu · Mouxing Yang · [...] · Xi Peng

Traditional text-based person ReID assumes that person descriptions from witnesses are complete and provided at once. However, in real-world scenarios, such descriptions are often partial or vague. To address this limitation, we introduce a new task called interactive person re-identification (Inter-ReID). Inter-ReID is a dialogue-based retrieval task that iteratively refines initial descriptions through ongoing interactions with the witnesses. To facilitate the study of this new task, we construct a dialogue dataset that incorporates multiple types of questions by decomposing fine-grained attributes of individuals. We further propose LLaVA-ReID, a question model that generates targeted questions based on visual and textual contexts to elicit additional details about the target person. Leveraging a looking-forward strategy, we prioritize the most informative questions as supervision during training. Experimental results on both Inter-ReID and text-based ReID benchmarks demonstrate that LLaVA-ReID significantly outperforms baselines.
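To make the interactive retrieval loop described above concrete, the sketch below shows one plausible way such a dialogue-based ReID system could be wired together: retrieve with the current description, ask a targeted question about the top candidates, fold the witness's answer back in, and re-rank. This is an illustrative outline only, not the authors' implementation; `rank`, `ask`, and `answer` are hypothetical callables standing in for the retriever, the LLaVA-ReID questioner, and the witness.

```python
from typing import Callable, List, Sequence

def interactive_reid(
    initial_description: str,
    gallery: Sequence[object],                              # candidate person images
    rank: Callable[[str, Sequence[object]], List[int]],     # text-to-image retriever: gallery indices by relevance
    ask: Callable[[str, Sequence[object]], str],            # question generator: uses text + top-ranked images
    answer: Callable[[str], str],                           # witness (or simulator) answering each question
    rounds: int = 3,
    top_k: int = 8,
) -> List[int]:
    """Iteratively refine a partial witness description through dialogue.

    A minimal sketch of an Inter-ReID loop under assumed interfaces; the
    actual LLaVA-ReID model and looking-forward training are not shown.
    """
    description = initial_description
    ranking = rank(description, gallery)
    for _ in range(rounds):
        # Generate a targeted question conditioned on the current description
        # and the currently most plausible candidates (visual + textual context).
        question = ask(description, [gallery[i] for i in ranking[:top_k]])
        reply = answer(question)
        # Fold the new detail back into the description and re-rank.
        description = f"{description} {reply}".strip()
        ranking = rank(description, gallery)
    return ranking
```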


Incomplete Multi-view Clustering via Diffusion Contrastive Generation

April 2025 · 1 Read

Proceedings of the AAAI Conference on Artificial Intelligence

Incomplete multi-view clustering (IMVC) has garnered increasing attention in recent years due to the common issue of missing data in multi-view datasets. The primary approach to address this challenge involves recovering the missing views before applying conventional multi-view clustering methods. Although imputation-based IMVC methods have achieved significant improvements, they still encounter notable limitations: 1) heavy reliance on paired data for training the data recovery module, which is impractical in real scenarios with high missing data rates; 2) the generated data often lacks diversity and discriminability, resulting in suboptimal clustering results. To address these shortcomings, we propose a novel IMVC method called Diffusion Contrastive Generation (DCG). Motivated by the consistency between the diffusion and clustering processes, DCG learns the distribution characteristics to enhance clustering by applying forward diffusion and reverse denoising processes to intra-view data. By performing contrastive learning on a limited set of paired multi-view samples, DCG can align the generated views with the real views, facilitating accurate recovery of views across arbitrary missing view scenarios. Additionally, DCG integrates instance-level and category-level interactive learning to exploit the consistent and complementary information available in multi-view data, achieving robust and end-to-end clustering. Extensive experiments demonstrate that our method outperforms state-of-the-art approaches.
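As a rough illustration of the contrastive-generation idea, the snippet below aligns embeddings of generated (denoised) views with their real counterparts on the limited paired subset, using an InfoNCE-style loss. This is a generic alignment objective written as a sketch, not the exact DCG loss; the diffusion/denoising networks that would produce `generated` are assumed to exist elsewhere.

```python
import torch
import torch.nn.functional as F

def diffusion_contrastive_loss(generated: torch.Tensor,
                               real: torch.Tensor,
                               temperature: float = 0.5) -> torch.Tensor:
    """Align generated and real view embeddings of the paired samples.

    generated, real: (n_paired, d) embeddings of the same instances.
    """
    g = F.normalize(generated, dim=1)
    r = F.normalize(real, dim=1)
    logits = g @ r.t() / temperature            # (n, n) cross-view similarities
    targets = torch.arange(g.size(0), device=g.device)
    # Each generated view should match its own real counterpart (positives on
    # the diagonal); all other pairs act as negatives.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```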


Incomplete Multi-view Clustering via Diffusion Contrastive Generation

March 2025 · 1 Read

Incomplete multi-view clustering (IMVC) has garnered increasing attention in recent years due to the common issue of missing data in multi-view datasets. The primary approach to address this challenge involves recovering the missing views before applying conventional multi-view clustering methods. Although imputation-based IMVC methods have achieved significant improvements, they still encounter notable limitations: 1) heavy reliance on paired data for training the data recovery module, which is impractical in real scenarios with high missing data rates; 2) the generated data often lacks diversity and discriminability, resulting in suboptimal clustering results. To address these shortcomings, we propose a novel IMVC method called Diffusion Contrastive Generation (DCG). Motivated by the consistency between the diffusion and clustering processes, DCG learns the distribution characteristics to enhance clustering by applying forward diffusion and reverse denoising processes to intra-view data. By performing contrastive learning on a limited set of paired multi-view samples, DCG can align the generated views with the real views, facilitating accurate recovery of views across arbitrary missing view scenarios. Additionally, DCG integrates instance-level and category-level interactive learning to exploit the consistent and complementary information available in multi-view data, achieving robust and end-to-end clustering. Extensive experiments demonstrate that our method outperforms state-of-the-art approaches.


Figure captions from the MetaQ article listed below:

Overview of the MetaQ algorithm
a MetaQ accepts uni-omics or paired multi-omics raw count matrices as inputs and learns cell embeddings through the encoder network. b In the embedding space, MetaQ introduces a discrete codebook with learnable entries, where each entry corresponds to the embedding of a metacell. Cells are then quantized by assigning each to its nearest entry in the codebook. To prevent degenerated quantization, MetaQ records the entry usage and adjusts the codebook to eliminate over-large or over-small entries. c MetaQ encourages each codebook entry to reconstruct the input omics data of all cells it quantizes with the decoder network. For better reconstruction, the model tends to assign biologically similar cells into the same codebook entry, which intrinsically achieves cell grouping for metacell inference. d After training, MetaQ computes metacells by averaging the original uni- or multi-omics count data of cells within each codebook entry. e The inferred metacells serve as a compressed representation of the original data, which could be seamlessly used for single-cell downstream analysis (DE: Differential Expression).
MetaQ effectively and efficiently infers prototypical metacells
a UMAP visualization of the original 433,495 cells from the human fetal atlas. The red box highlights retina cells. b Classification accuracy of cell classifiers trained with [500, 1000, 2000, 4000] metacells inferred by MetaQ, SEACell, MetaCell V2, SuperCell, and random sub-sampling on five random experiments. Each boxplot ranges from the upper and lower quartiles with the median as the horizontal line and whiskers extend to 1.5 times the interquartile range. Two-sided T-test results: *0.01 < p ≤ 0.05, **0.001 < p ≤ 0.01, ***0.0001 < p ≤ 0.001, ****p ≤ 0.0001. c UMAP visualization of 4000 metacells inferred by the four methods, with cell type colors matching those in b. Retina cells are marked with red boxes. d Agreement between the ground-truth annotations and the labels predicted by classification models trained with 500 metacells. Matrices with a clearer diagonal structure indicate better classification performance. e Compactness of [500, 1000, 2000, 4000] metacells inferred by different methods on five random experiments. Each boxplot ranges from the upper and lower quartiles with the median as the horizontal line and whiskers extend to 1.5 times the interquartile range. Two-sided T-test results: ****p ≤ 0.0001. f Separation of [500, 1000, 2000, 4000] metacells inferred by different methods on five random experiments. Each boxplot ranges from the upper and lower quartiles with the median as the horizontal line and whiskers extend to 1.5 times the interquartile range. g Running times (logged) and memory cost for inferring 1000 metacells from different numbers of original cells. h Running times (logged) and memory cost for inferring different numbers of metacells from 100,000 cells. Source data are provided as a Source Data file.
MetaQ supports multi-omics metacell inference and preserves developmental trajectory
a UMAP visualization of the original 30,672 cells from human bone marrow in RNA and ADT modalities (CD4 Memory Memory CD4+ T cell, CD4 Naive Naive CD4+ T cell, CD8 Effector Effector CD8+ T cell, CD8 Memory Memory CD8+ T cell, CD8 Naive Naive CD8+ T cell, CD14 Mono CD14 Monocytes, CD16 Mono CD16 Monocytes, CD56 bright NK CD56 bright natural killer, GMP Granulocyte-macrophage progenitors, HSC Hematopoietic stem cell, LMPP Lymphoid-primed multipotent progenitors, MAIT Mucosal-associated invariant T cell, NK Natural killer, Prog_B Progenitors of B cell, Prog_DC Progenitors of dendritic cell lineages, Prog_Mk Progenitor of megakaryocyte, Prog_RBC Progenitors of erythroid, Treg Regulatory T cell, cDC2 Type 2 conventional dendritic cell, gdT Gamma delta T cell, pDC Plasmacytoid dendritic cell). b UMAP visualization of WNN results on original cells and 613 metacells (a 50-fold reduction) inferred by MetaQ, SEACell, MetaCell V2, and SuperCell. c Compactness and separation of 613 metacells inferred by different methods, calculated separately for RNA and ADT modalities. Each boxplot ranges from the upper and lower quartiles with the median as the horizontal line and whiskers extend to 1.5 times the interquartile range. d Purity of 613 metacells inferred by different methods across RNA-informative (top panel) and ADT-informative (bottom panel) cell types. Each boxplot ranges from the upper and lower quartiles with the median as the horizontal line and whiskers extend to 1.5 times the interquartile range. Two-sided T-test results: *0.01 < p ≤ 0.05, **0.001 < p ≤ 0.01, ***0.0001 < p ≤ 0.001, ****p ≤ 0.0001. e PAGA cell embedding of metacells along the plasmablast developmental path inferred by MetaQ, with metacells colored by type and pseudotime, respectively. f Annotation and marker gene expression changes along the plasmablast developmental path. Source data are provided as a Source Data file.
MetaQ facilitates batch integration
a UMAP visualization of the original 14,767 cells from the human pancreas dataset, with cells colored by types and batches, respectively. b UMAP visualization of integrated cell embeddings obtained by performing Harmony on the original data (MHC class II: Major histocompatibility complex Class II). c UMAP visualization of 590 MetaQ metacells (a 25-fold reduction) integrated by Harmony. d UMAP visualization of the integrated embeddings of original cells recovered by MetaQ. e Sankey plots showing Louvain cluster assignments on cell embeddings integrated by Harmony, recovered by SuperCell, and recovered by MetaQ. f Normalized expression of the marker gene TM4SF4 projected on the UMAP plot of alpha cell embeddings by MetaQ and Harmony, respectively. The subplot in the left plot shows the results for the Baron batch. g AMI, ARI, and Homogeneity scores of Louvain clustering with three different resolutions of [0.5, 1.0, 2.0] on cell embeddings obtained by Harmony and recovered by MetaQ, SEACell, MetaCell V2, and SuperCell. h AMI, ARI, and Homogeneity scores of Louvain clustering with three different resolutions of [1.0, 2.0, 5.0] on 590 metacells inferred by different metacell algorithms. Source data are provided as a Source Data file.
MetaQ preserves differential expressions with respect to cell types and perturbations
a UMAP visualization of the original 240,090 cells from human PBMC perturbation data, with cells colored by types, donors, and perturbations, respectively. b Rank of top differentially expressed genes concerning cell types, computed on the original cells (Top) and MetaQ metacells (Bottom) with a reduction rate of 10. c Consistency between the ranking of top 2000 differentially expressed genes computed on original cells and 24,009 metacells using different methods. Each boxplot ranges from the upper and lower quartiles with the median as the horizontal line and whiskers extend to 1.5 times the interquartile range. Two-sided T-test results: *0.01 < p ≤ 0.05. d Differential expression values relative to perturbations on CD8+ T cells, calculated on original cells and 24,009 metacells inferred by MetaQ, SEACell, and SuperCell. Perturbations and genes that differ the most between different methods are shown for clearer comparisons. e Pearson correlation between perturbation differential expression values computed on original cells and 24,009 metacells using different methods. Each boxplot ranges from the upper and lower quartiles with the median as the horizontal line and whiskers extend to 1.5 times the interquartile range. Two-sided T-test results: *0.01 < p ≤ 0.05, **0.001 < p ≤ 0.01, ***0.0001 < p ≤ 0.001, ****p ≤ 0.0001. Source data are provided as a Source Data file.


MetaQ: fast, scalable and accurate metacell inference via single-cell quantization
  • Article
  • Full-text available

January 2025 · 23 Reads

To overcome the computational barriers of analyzing large-scale single-cell sequencing data, we introduce MetaQ, a metacell algorithm that scales to arbitrarily large datasets with linear runtime and constant memory usage. Inspired by cellular development, MetaQ conceptualizes each metacell as a collective ancestor of biologically similar cells. By quantizing cells into a discrete codebook, where each entry represents a metacell capable of reconstructing the original cells it quantizes, MetaQ identifies homogeneous cell subsets for efficient and accurate metacell inference. This approach reduces computational complexity from exponential to linear while maintaining or surpassing the performance of existing metacell algorithms. Extensive experiments demonstrate that MetaQ excels in downstream tasks such as cell type annotation, developmental trajectory inference, batch integration, and differential expression analysis. Thanks to its superior efficiency and effectiveness, MetaQ makes analyzing datasets with millions of cells practical, offering a powerful solution for single-cell studies in the era of high-throughput profiling.
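The quantization step described in the abstract (cells assigned to nearest codebook entries, metacells formed by averaging raw counts within each entry) can be sketched in a few lines. The code below is a minimal illustration of that step only; the real MetaQ model learns the codebook jointly with an encoder/decoder and adjusts entry usage during training, which is not shown, and the brute-force distance computation is written for clarity rather than scalability.

```python
import numpy as np

def quantize_to_metacells(embeddings: np.ndarray,
                          counts: np.ndarray,
                          codebook: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Nearest-entry quantization followed by per-entry averaging of raw counts.

    embeddings: (n_cells, d) learned cell embeddings
    counts:     (n_cells, n_genes) raw count matrix
    codebook:   (n_metacells, d) codebook entries
    """
    # Squared Euclidean distance from every cell to every codebook entry
    # (chunk this for large datasets; kept dense here for readability).
    d2 = ((embeddings[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    assignment = d2.argmin(axis=1)                      # nearest entry per cell
    n_meta = codebook.shape[0]
    metacells = np.zeros((n_meta, counts.shape[1]))
    for k in range(n_meta):
        members = counts[assignment == k]
        if len(members):                                # skip unused entries
            metacells[k] = members.mean(axis=0)
    return metacells, assignment
```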


Fig. 1 Six categories of prior knowledge for deep clustering. a Structure Prior: data structure could reflect the semantic relation between instances. b Distribution Prior: instances from different clusters follow distinct data distributions. c Augmentation Invariance: samples augmented by the same instance have similar features. d Neighborhood Consistency: neighboring samples have consistent cluster assignments. e Pseudo Label: cluster assignments with high confidence are likely to be correct. f External Knowledge: abundant knowledge favorable to clustering exists in open-world data and models
Table: Commonly used mathematical notations
Table: A summary of datasets commonly used for deep clustering
A survey on deep clustering: from the prior perspective

December 2024 · 35 Reads · 6 Citations

Vicinagearth

Facilitated by the powerful feature extraction ability of neural networks, deep clustering has achieved great success in analyzing high-dimensional and complex real-world data. The performance of deep clustering methods is affected by various factors such as network structures and learning objectives. However, as pointed out in this survey, the essence of deep clustering lies in the incorporation and utilization of prior knowledge, which is largely ignored by existing works. From pioneering deep clustering methods based on data structure assumptions to recent contrastive clustering methods based on data augmentation invariances, the development of deep clustering intrinsically corresponds to the evolution of prior knowledge. In this survey, we provide a comprehensive review of deep clustering methods by categorizing them into six types of prior knowledge. We find that in general the prior innovation follows two trends, namely, i) from mining to constructing, and ii) from internal to external. Besides, we provide a benchmark on five widely-used datasets and analyze the performance of methods with diverse priors. By providing a novel prior knowledge perspective, we hope this survey could provide some novel insights and inspire future research in the deep clustering community.
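One of the priors the survey catalogues, augmentation invariance, can be illustrated with a small cluster-level contrastive loss: two augmented views of the same batch should receive similar soft cluster assignments. The snippet below is a generic toy example of that prior, not any specific method reviewed in the survey.

```python
import torch
import torch.nn.functional as F

def augmentation_invariance_loss(assign_a: torch.Tensor,
                                 assign_b: torch.Tensor,
                                 temperature: float = 1.0) -> torch.Tensor:
    """Cluster-level contrastive loss between two augmentations.

    assign_a, assign_b: (batch, n_clusters) soft cluster assignments
    (e.g., softmax outputs of a clustering head) for two augmented views
    of the same images.
    """
    # Treat each cluster's assignment vector over the batch as a feature and
    # pull the two views' versions of the same cluster together.
    ca = F.normalize(assign_a.t(), dim=1)           # (n_clusters, batch)
    cb = F.normalize(assign_b.t(), dim=1)
    logits = ca @ cb.t() / temperature              # (n_clusters, n_clusters)
    targets = torch.arange(ca.size(0), device=ca.device)
    return F.cross_entropy(logits, targets)
```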


UNITE: Multitask Learning With Sufficient Feature for Dense Prediction

August 2024 · 4 Reads · 1 Citation

IEEE Transactions on Systems, Man, and Cybernetics: Systems

Existing multitask dense prediction methods typically rely on either a globally shared neural architecture or a cross-task fusion strategy. However, these approaches tend to overlook either potential cross-task complementary or consistent information, resulting in suboptimal results. Motivated by this observation, we propose a novel plug-and-play module to concurrently leverage cross-task consistent and complementary information, thereby capturing a sufficient feature. Specifically, for a given pair of tasks, we compute a cross-task similarity matrix that extracts cross-task consistent features bidirectionally. To integrate the complementary signals from different tasks, we fuse the cross-task consistent features with the corresponding task-specific features using a 1×1 convolution. Extensive experimental results demonstrate the remarkable performance gain of our method on two challenging datasets w.r.t. different task sets, compared with seven approaches. Under the two-task setting, our method achieves 1.63% and 8.32% improvements on NYUD-v2 and PASCAL-Context, respectively. Under the three-task setting, we obtain an additional 7.7% multitask performance gain.
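A possible shape for such a module is sketched below: a cross-task similarity matrix attends from one task's feature map to the other's in both directions, and the attended (cross-task consistent) features are fused with the task-specific ones through a 1×1 convolution. The layer names, normalization, and shapes here are assumptions for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

class CrossTaskFusion(nn.Module):
    """Illustrative cross-task fusion for a pair of dense-prediction tasks."""

    def __init__(self, channels: int):
        super().__init__()
        self.fuse_a = nn.Conv2d(2 * channels, channels, kernel_size=1)
        self.fuse_b = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, feat_a: torch.Tensor, feat_b: torch.Tensor):
        b, c, h, w = feat_a.shape
        qa = feat_a.flatten(2).transpose(1, 2)                 # (B, HW, C)
        qb = feat_b.flatten(2).transpose(1, 2)                 # (B, HW, C)
        # Cross-task similarity matrix between all spatial locations.
        sim = torch.bmm(qa, qb.transpose(1, 2)) / c ** 0.5     # (B, HW, HW)
        # Consistent features extracted bidirectionally (a<-b and b<-a).
        a_from_b = torch.bmm(sim.softmax(-1), qb).transpose(1, 2).reshape(b, c, h, w)
        b_from_a = torch.bmm(sim.transpose(1, 2).softmax(-1), qa).transpose(1, 2).reshape(b, c, h, w)
        # Fuse task-specific and cross-task consistent features with 1x1 convs.
        out_a = self.fuse_a(torch.cat([feat_a, a_from_b], dim=1))
        out_b = self.fuse_b(torch.cat([feat_b, b_from_a], dim=1))
        return out_a, out_b
```

The 1×1 convolution keeps the module plug-and-play: it only mixes channels, so it can be dropped between any two task heads that share a spatial resolution.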


A Survey on Deep Clustering: From the Prior Perspective

June 2024 · 21 Reads

Facilitated by the powerful feature extraction ability of neural networks, deep clustering has achieved great success in analyzing high-dimensional and complex real-world data. The performance of deep clustering methods is affected by various factors such as network structures and learning objectives. However, as pointed out in this survey, the essence of deep clustering lies in the incorporation and utilization of prior knowledge, which is largely ignored by existing works. From pioneering deep clustering methods based on data structure assumptions to recent contrastive clustering methods based on data augmentation invariances, the development of deep clustering intrinsically corresponds to the evolution of prior knowledge. In this survey, we provide a comprehensive review of deep clustering methods by categorizing them into six types of prior knowledge. We find that in general the prior innovation follows two trends, namely, i) from mining to constructing, and ii) from internal to external. Besides, we provide a benchmark on five widely-used datasets and analyze the performance of methods with diverse priors. By providing a novel prior knowledge perspective, we hope this survey could provide some novel insights and inspire future research in the deep clustering community.


Decoupled Contrastive Multi-View Clustering with High-Order Random Walks

March 2024 · 7 Reads · 36 Citations

Proceedings of the AAAI Conference on Artificial Intelligence

Recently, some robust contrastive multi-view clustering (MvC) methods have been proposed, which construct data pairs from neighborhoods to alleviate the false negative issue, i.e., some intra-cluster samples are wrongly treated as negative pairs. Although promising performance has been achieved by these methods, the false negative issue is still far from addressed and the false positive issue emerges because all in- and out-of-neighborhood samples are simply treated as positive and negative, respectively. To address these issues, we propose a novel robust method, dubbed decoupled contrastive multi-view clustering with high-order random walks (DIVIDE). In brief, DIVIDE leverages random walks to progressively identify data pairs in a global instead of local manner. As a result, DIVIDE could identify in-neighborhood negatives and out-of-neighborhood positives. Moreover, DIVIDE embraces a novel MvC architecture to perform inter- and intra-view contrastive learning in different embedding spaces, thus boosting clustering performance and embracing the robustness against missing views. To verify the efficacy of DIVIDE, we carry out extensive experiments on four benchmark datasets, comparing with nine state-of-the-art MvC methods in both complete and incomplete MvC settings. The code is released at https://github.com/XLearning-SCU/2024-AAAI-DIVIDE.
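The high-order random-walk idea can be illustrated with a small affinity computation: build a kNN transition matrix from cosine similarities and accumulate multi-step walk probabilities, so that samples connected through several hops score highly even outside the direct neighborhood. This is a rough sketch in the spirit of DIVIDE's pair identification, not the method itself; the walk length, decay `alpha`, and aggregation are assumptions.

```python
import numpy as np

def random_walk_affinity(features: np.ndarray,
                         n_neighbors: int = 10,
                         steps: int = 3,
                         alpha: float = 0.5) -> np.ndarray:
    """High-order affinity via random walks on a kNN graph.

    Returns an (n, n) matrix whose large entries suggest positive pairs even
    outside the direct neighborhood, and whose small entries suggest negatives
    even inside it.
    """
    x = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = x @ x.T
    n = sim.shape[0]
    # Sparse kNN transition matrix: keep each row's top-k similarities.
    p = np.zeros_like(sim)
    topk = np.argsort(-sim, axis=1)[:, 1:n_neighbors + 1]   # exclude self
    rows = np.repeat(np.arange(n), n_neighbors)
    p[rows, topk.ravel()] = sim[rows, topk.ravel()].clip(min=0)
    p /= p.sum(axis=1, keepdims=True) + 1e-12
    # Accumulate decayed multi-step (high-order) transition probabilities.
    affinity, walk = np.zeros_like(p), np.eye(n)
    for t in range(1, steps + 1):
        walk = walk @ p
        affinity += (alpha ** t) * walk
    return affinity
```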



Decoupled Contrastive Multi-View Clustering with High-Order Random Walks

August 2023 · 9 Reads

Recently, some robust contrastive multi-view clustering (MvC) methods have been proposed, which construct data pairs from neighborhoods to alleviate the false negative issue, i.e., some intra-cluster samples are wrongly treated as negative pairs. Although promising performance has been achieved by these methods, the false negative issue is still far from addressed and the false positive issue emerges because all in- and out-of-neighborhood samples are simply treated as positive and negative, respectively. To address these issues, we propose a novel robust method, dubbed decoupled contrastive multi-view clustering with high-order random walks (DIVIDE). In brief, DIVIDE leverages random walks to progressively identify data pairs in a global instead of local manner. As a result, DIVIDE could identify in-neighborhood negatives and out-of-neighborhood positives. Moreover, DIVIDE embraces a novel MvC architecture to perform inter- and intra-view contrastive learning in different embedding spaces, thus boosting clustering performance and embracing the robustness against missing views. To verify the efficacy of DIVIDE, we carry out extensive experiments on four benchmark datasets, comparing with nine state-of-the-art MvC methods in both complete and incomplete MvC settings.


Citations (8)


... To extract and synthesize the full picture from these disparate sources, one of the pivotal techniques in multi-view learning is multi-view clustering (MVC), which aims to partition multi-view data into distinct clusters (Cai et al. 2024a; Yan et al. 2024; Cai et al. 2024b; Lu et al. 2024) by leveraging the inherent consistency and complementarity of the information across different views. The success of existing multi-view clustering methods heavily relies on the completeness of multi-view data, i.e., all views are consistently available for each sample. ...

Reference: Incomplete Multi-view Clustering via Diffusion Contrastive Generation

A survey on deep clustering: from the prior perspective

Vicinagearth

... The classical clustering algorithms are K-Means (KM), Normalized Cuts (Ncuts), and all-view versions of them. Other methods include MvSCN (Huang et al. 2019), EAMC (Zhou and Shen 2020), MVC-VAE (Yin, Huang, and Gao 2020), DEMVC (Xu et al. 2021), SiMVC (Trosten et al. 2021), CoMVC (Trosten et al. 2021), MFLVC (Xu et al. 2022), SPDMC (Chen et al. 2023), ICMVC (Chao, Jiang, and Chu 2024), and DIVIDE (Lu et al. 2024b). ...

Decoupled Contrastive Multi-View Clustering with High-Order Random Walks
  • Citing Article
  • March 2024

Proceedings of the AAAI Conference on Artificial Intelligence

... To generate accurate missing view representations, inspired by the diffusion model (Ho, Jain, and Abbeel 2020) and contrastive learning (Chen et al. 2020;Lin et al. 2023), we designed a Diffusion Contrastive Generation module. Specifically, the module first applies forward diffusion and reverse denoising on the intra-view data to learn the distribution characteristics for clustering. ...

Graph Matching with Bi-level Noisy Correspondence
  • Citing Conference Paper
  • October 2023

... The fundamental concept behind it is to maximize the similarity between the cluster assignment of original and augmented data. This objective can be likened to conducting contrastive learning [59] at the cluster level. Motivated by PICA, CC [51] and DRC [121] conduct contrastive learning on both instance level and cluster level. ...

Improve Interpretability of Neural Networks via Sparse Contrastive Coding
  • Citing Conference Paper
  • January 2022

... MNN-based methods, including MNN (Haghverdi et al., 2018), Fast-MNN, BBKNN (Polański et al., 2020), Harmony (Korsunsky et al., 2019), scDML (Yu et al., 2023) and Scanorama (Hie et al., 2019), correct batch effects by identifying and aligning similar data across batches. While effective for alignment, these methods face scalability issues with large-scale datasets (Tran et al., 2020; Li et al., 2023; Yu et al., 2023) and lack control over the degree of correction, making it challenging to achieve a balance between batch effect removal and biological conservation. Moreover, MNN-based methods rely on similarity calculations, which are difficult to standardise across omics data with distinct distributions, limiting their generalisability. ...

Single-Cell RNA-Seq Debiased Clustering via Batch Effect Disentanglement
  • Citing Article
  • March 2023

IEEE Transactions on Neural Networks and Learning Systems

... Factors such as device failures, interrupted data transmission, or storage media corruption render data incompleteness a norm. This incompleteness poses significant challenges to traditional MVC methods, prompting researchers to focus on incomplete multi-view clustering (IMVC) as a frontier topic [8][9][10][11][12][13][14][15][16]. ...

Dual Contrastive Prediction for Incomplete Multi-View Representation Learning
  • Citing Article
  • August 2022

IEEE Transactions on Pattern Analysis and Machine Intelligence

... Some studies adopt semi-supervised (Shao et al. 2020; Chen et al. 2021) or unsupervised (Yang et al. 2022; Wang et al. 2024b) approaches to reduce reliance on paired data. Others attempt to generate more realistic paired data (Gong et al. 2021; Li et al. 2022; Wu et al. 2023; Fan et al. 2024) to bridge the domain gap. Despite these advancements, there remains considerable potential for further improvement, particularly in terms of the realism of the synthetic data. ...

Unsupervised Neural Rendering for Image Hazing
  • Citing Article
  • June 2022

IEEE Transactions on Image Processing

... while Wen et al. [39] proposed a view-specific graph convolutional autoencoder network to explore high-level features and geometric structures of multi-view data. Lin et al. [40] introduced COMPLETER, which not only cleverly integrates autoencoder techniques into a unified framework for data recovery and consistency learning, but also uses autoencoders to achieve view-specific data recovery and cross-view consistency learning. However, it only supports a two-view scenario and is difficult to extend to multi-view scenarios. ...

COMPLETER: Incomplete Multi-view Clustering via Contrastive Prediction
  • Citing Conference Paper
  • June 2021