Chen Li’s research while affiliated with Xi’an Jiaotong-Liverpool University and other places


Publications (202)


From Patches to WSIs: A Systematic Review of Deep Multiple Instance Learning in Computational Pathology
  • Preprint

January 2025

Yuchen Zhang · Zeyu Gao · Kai He · [...] · Rui Mao

How Does Distribution Matching Help Domain Generalization: An Information-theoretic Analysis

January 2025

IEEE Transactions on Information Theory

Domain generalization aims to learn invariance across multiple source domains, thereby enhancing generalization against out-of-distribution data. While gradient or representation matching algorithms have achieved remarkable success in domain generalization, these methods generally lack generalization guarantees or depend on strong assumptions, leaving a gap in understanding the underlying mechanism of distribution matching. In this work, we formulate domain generalization from a novel probabilistic perspective, ensuring robustness while avoiding overly conservative solutions. Through comprehensive information-theoretic analysis, we provide key insights into the roles of gradient and representation matching in promoting generalization. Our results reveal the complementary relationship between these two components, indicating that existing works focusing solely on either gradient or representation alignment are insufficient to solve the domain generalization problem. In light of these theoretical findings, we introduce IDM to simultaneously align the inter-domain gradients and representations. Integrated with the proposed PDM method for complex distribution matching, IDM achieves superior performance over various baseline methods.
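The abstract above does not spell out the objective used by IDM; as a rough illustration of jointly aligning inter-domain gradients and representations, here is a minimal PyTorch sketch in which both quantities are pulled toward their cross-domain means. The model split, the simple mean-matching penalties, and all hyper-parameters are illustrative assumptions, not the paper's IDM/PDM algorithms.

```python
# Minimal sketch of jointly penalizing inter-domain gradient and representation
# mismatch. The architecture and penalties are assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

featurizer = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 16))
classifier = nn.Linear(16, 2)
optimizer = torch.optim.Adam(list(featurizer.parameters()) + list(classifier.parameters()), lr=1e-3)

def domain_stats(x, y):
    """Return (task loss, classifier-head gradient, mean representation) for one domain."""
    z = featurizer(x)
    loss = F.cross_entropy(classifier(z), y)
    grad = torch.cat([g.flatten() for g in
                      torch.autograd.grad(loss, classifier.parameters(), create_graph=True)])
    return loss, grad, z.mean(dim=0)

def alignment_step(domains, lam_grad=0.1, lam_rep=0.1):
    """One update on a list of (x, y) batches, one batch per source domain."""
    losses, grads, reps = zip(*(domain_stats(x, y) for x, y in domains))
    grads, reps = torch.stack(grads), torch.stack(reps)
    # Penalize each domain's deviation from the cross-domain mean gradient / representation.
    grad_pen = ((grads - grads.mean(dim=0)) ** 2).mean()
    rep_pen = ((reps - reps.mean(dim=0)) ** 2).mean()
    total = torch.stack(losses).mean() + lam_grad * grad_pen + lam_rep * rep_pen
    optimizer.zero_grad()
    total.backward()
    optimizer.step()
    return total.item()

# Toy usage: three source domains with 8 samples of 32 features each.
batches = [(torch.randn(8, 32), torch.randint(0, 2, (8,))) for _ in range(3)]
print(alignment_step(batches))
```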


We used a divide-and-conquer approach to classify the five thymoma subtypes. First, similar subtypes were grouped into coarse categories so that the categories with large differences could be distinguished from one another. These coarse categories were then further subdivided, ultimately yielding the detailed five-subtype classification. On this basis, we plotted patch-level heatmaps and segmented and classified the cells. Finally, we analyzed the characteristics of cells in different patch categories to verify the correctness and interpretability of the heatmap analysis.
(A) illustrates the divide-and-conquer idea behind the five-class algorithm: using multiple instance learning, the five classes are first split into two coarse groups and then further subdivided in a second step. (B) shows the concrete schematic of the multiple instance learning algorithm, which consists of three steps: whole-slide image preprocessing, tile feature extraction, and attention-based slide classification; beyond a simple iterative process, pseudo-label processing and continuous optimization of the feature extractor are also carried out. (C) is a flowchart for analyzing cell characteristics: the HoverNet model is used to segment and classify cells, and the characteristics of cells in different slide categories are then analyzed.
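As a rough illustration of the attention-based slide classification step in (B), the following minimal PyTorch sketch pools pre-extracted tile features with learned attention weights that can double as a patch heatmap. The layer sizes and the plain (non-gated) attention form are assumptions, not the study's exact architecture.

```python
# Minimal sketch of attention-based MIL slide classification: patch features are
# pooled with learned attention weights, which can also be rendered as a heatmap.
import torch
import torch.nn as nn

class AttentionMIL(nn.Module):
    def __init__(self, feat_dim=512, hidden_dim=128, n_classes=2):
        super().__init__()
        self.attention = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim), nn.Tanh(), nn.Linear(hidden_dim, 1))
        self.classifier = nn.Linear(feat_dim, n_classes)

    def forward(self, patch_feats):             # (n_patches, feat_dim)
        scores = self.attention(patch_feats)    # (n_patches, 1)
        weights = torch.softmax(scores, dim=0)  # attention per patch -> heatmap values
        slide_feat = (weights * patch_feats).sum(dim=0)  # attention-weighted slide embedding
        return self.classifier(slide_feat), weights.squeeze(-1)

# Toy usage: 1,000 patch embeddings from one slide.
model = AttentionMIL()
logits, heatmap_weights = model(torch.randn(1000, 512))
print(logits.shape, heatmap_weights.shape)   # torch.Size([2]) torch.Size([1000])
```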
Results of the five-class classification: (A) the confusion matrix and (B) the ROC curves.
Visual heatmap of five thymoma subtypes. The top panel presents hematoxylin and eosin (H&E) stained overview images of each thymoma subtype. The middle panel provides corresponding visualization heatmaps, highlighting the distinctive features of each subtype. In the bottom panel, an example of a B1-type thymoma is shown, with the heatmap indicating a B2-type signal in the lower-left corner (yellow triangle). Further review of H&E sections in this area confirmed a small focus of B2-type thymoma. The yellow arrow in the H&E image marks clustered tumor cells, characteristic of B2 subtype, in contrast to the scattered, single-cell distribution typical of the B1 subtype (indicated by the arrowhead).
Three inverted violin plots show the distribution of thymic tumor cell data across three characteristics: (A) the proportion of the two cell types, (B) cell area, and (C) cell eccentricity. The white dot represents the median of the data, while the width of each violin indicates the density of the data points: wider sections correspond to a higher density of observations, and the length reflects the overall spread of the data within each group. The red violin plots represent tumor cells, while the blue ones represent inflammatory cells. We analyzed these cellular characteristics across the five subtypes of thymic tumors, with each plot containing data from ten groups. We assessed intergroup differences among tumor cells, using an asterisk (*) to denote significant differences; if two groups are connected by an asterisk, there is a significant difference in the distribution of tumor cells between those tissue types.
Weakly supervised learning in thymoma histopathology classification: an interpretable approach
  • Article
  • Full-text available

December 2024 · 10 Reads

Introduction: Thymoma classification is challenging due to its diverse morphology. Accurate classification is crucial for diagnosis, but current methods often struggle with complex tumor subtypes. This study presents an AI-assisted diagnostic model that combines weakly supervised learning with a divide-and-conquer multi-instance learning (MIL) approach to improve classification accuracy and interpretability.
Methods: We applied the model to 222 thymoma slides, simplifying the five-class classification into binary and ternary steps. The model features an attention-based mechanism that generates heatmaps, enabling visual interpretation of decisions. These heatmaps align with clinically validated morphological differences between thymoma subtypes. Additionally, we embedded domain-specific pathological knowledge into the interpretability framework.
Results: The model achieved a classification AUC of 0.9172. The generated heatmaps accurately reflected the morphological distinctions among thymoma subtypes, as confirmed by pathologists. The model's transparency allows pathologists to visually verify AI decisions, enhancing diagnostic reliability.
Discussion: This model offers a significant advancement in thymoma classification, combining high accuracy with interpretability. By integrating weakly supervised learning, MIL, and attention mechanisms, it provides an interpretable AI framework that is applicable in clinical settings. The model reduces the diagnostic burden on pathologists and has the potential to improve patient outcomes by making AI tools more transparent and clinically relevant.
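As a hedged sketch of the divide-and-conquer routing described above (a coarse split followed by binary and ternary refinement), the snippet below cascades a coarse slide-level head with group-specific heads. The subtype grouping and all layer definitions are illustrative assumptions rather than the study's actual design.

```python
# Rough sketch of cascading slide-level classifiers: a first head separates two
# coarse groups, then a group-specific head refines to the final subtype.
import torch
import torch.nn as nn

coarse_head = nn.Linear(512, 2)                 # coarse split, e.g. {A, AB} vs {B1, B2, B3} (assumed grouping)
fine_heads = nn.ModuleList([nn.Linear(512, 2),  # binary refinement within the first group
                            nn.Linear(512, 3)]) # ternary refinement within the second group

def cascade_predict(slide_embedding):
    group = coarse_head(slide_embedding).argmax().item()
    subtype = fine_heads[group](slide_embedding).argmax().item()
    return group, subtype

print(cascade_predict(torch.randn(512)))
```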




CoD-MIL: Chain-of-Diagnosis Prompting Multiple Instance Learning for Whole Slide Image Classification

October 2024 · 26 Reads

IEEE Transactions on Medical Imaging

Multiple instance learning (MIL) has emerged as a prominent paradigm for processing whole slide images, which have a pyramid structure and gigapixel size, in digital pathology. However, existing attention-based MIL methods are primarily trained on the image modality and a pre-defined label set, leading to limited generalization and interpretability. Recently, vision-language models (VLMs) have achieved promising performance and transferability, offering potential solutions to the limitations of MIL-based methods. Pathological diagnosis is an intricate process that requires pathologists to examine the WSI step by step. In the field of natural language processing, chain-of-thought (CoT) prompting is widely used to imitate the human reasoning process. Inspired by the CoT prompt and pathologists’ clinical knowledge, we propose a chain-of-diagnosis prompting multiple instance learning (CoD-MIL) framework for whole slide image classification. Specifically, the chain-of-diagnosis text prompt decomposes the complex diagnostic process in a WSI into progressive sub-processes from low to high magnification. Additionally, we propose a text-guided contrastive masking module to accurately localize the tumor region by masking the most discriminative instances and introducing the guidance of normal-tissue texts in a contrastive way. Extensive experiments conducted on three real-world subtyping datasets demonstrate the effectiveness and superiority of CoD-MIL.
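As a rough sketch of the chain-of-diagnosis idea, the snippet below builds magnification-ordered diagnostic text prompts and scores patch features against them by cosine similarity. The prompt wording and the placeholder text encoder are assumptions standing in for a pathology vision-language model; this is not the CoD-MIL implementation.

```python
# Minimal sketch of chain-of-diagnosis style prompting: magnification-ordered text
# prompts are encoded into the same space as patch features and used to score instances.
import torch
import torch.nn as nn
import torch.nn.functional as F

chain_of_diagnosis_prompts = [
    "low magnification: overall tissue architecture of the slide",
    "medium magnification: regions with abnormal glandular structures",
    "high magnification: nuclear atypia and mitotic figures of tumor cells",
]

text_encoder = nn.EmbeddingBag(5000, 512)   # placeholder for a real VLM text encoder (assumption)
vocab = {w: i % 5000 for i, w in enumerate(set(" ".join(chain_of_diagnosis_prompts).split()))}

def encode_prompt(prompt):
    ids = torch.tensor([vocab[w] for w in prompt.split()])
    return text_encoder(ids.unsqueeze(0)).squeeze(0)

def score_instances(patch_feats):
    """Cosine similarity of each patch feature to each diagnostic step's text embedding."""
    prompt_embs = torch.stack([encode_prompt(p) for p in chain_of_diagnosis_prompts])
    return F.normalize(patch_feats, dim=-1) @ F.normalize(prompt_embs, dim=-1).T

print(score_instances(torch.randn(100, 512)).shape)   # (100 patches, 3 diagnostic steps)
```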



Integrating K+ Entities Into Coreference Resolution on Biomedical Texts

August 2024 · 1 Read

IEEE/ACM Transactions on Computational Biology and Bioinformatics

Biomedical coreference resolution focuses on identifying coreferences in biomedical texts and normally consists of two parts: (i) mention detection, which identifies the textual representations of biological entities, and (ii) finding the coreference links between them. Recently, a popular approach to enhancing the task is to embed a knowledge base into deep neural networks. However, the way in which these methods integrate knowledge leads to the shortcoming that such knowledge may play a larger role in mention detection than in coreference resolution. Specifically, they tend to integrate knowledge prior to mention detection, as part of the embeddings. Moreover, they primarily focus on mention-dependent knowledge (KBase), i.e., knowledge entities directly related to mentions, while ignoring the correlated knowledge (K+) between the mentions in a mention pair. For mentions with significant differences in word form, this may limit their ability to extract potential correlations between those mentions. This paper therefore develops a novel model that integrates both KBase and K+ entities and achieves state-of-the-art performance on the BioNLP and CRAFT-CR datasets. Empirical studies on mention detection with mentions of different lengths reveal the effectiveness of the KBase entities. The evaluation on cross-sentence and match/mismatch coreference further demonstrates the superiority of the K+ entities in extracting potential background correlations between mentions.
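A minimal sketch of how mention-dependent (KBase) and correlated (K+) knowledge embeddings could enter a mention-pair scorer is given below; all dimensions and the simple MLP scorer are assumptions made for illustration, not the paper's architecture.

```python
# Minimal sketch of scoring a coreference mention pair with both mention-dependent
# (KBase) and correlated (K+) knowledge embeddings concatenated into the pair features.
import torch
import torch.nn as nn

class KnowledgePairScorer(nn.Module):
    def __init__(self, mention_dim=256, kb_dim=64, kplus_dim=64):
        super().__init__()
        pair_dim = 2 * mention_dim + 2 * kb_dim + kplus_dim
        self.scorer = nn.Sequential(nn.Linear(pair_dim, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, m1, m2, kb1, kb2, kplus):
        # m1/m2: mention embeddings; kb1/kb2: knowledge tied to each mention;
        # kplus: knowledge correlating the two mentions (e.g. a shared concept).
        pair = torch.cat([m1, m2, kb1, kb2, kplus], dim=-1)
        return self.scorer(pair)                 # higher score -> more likely coreferent

scorer = KnowledgePairScorer()
score = scorer(torch.randn(256), torch.randn(256),
               torch.randn(64), torch.randn(64), torch.randn(64))
print(score.item())
```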




Citations (53)


... RNNs are adept at processing sequential data and are extensively applied in natural language processing and time-series predictions. GANs consist of a generator and a discriminator, which compete against each other to generate high-quality data, with applications ranging from image to text generation [149]. Autoencoders are unsupervised learning algorithms that compress input data into low-dimensional representations and reconstruct them, and are commonly used for tasks such as dimensionality reduction and denoising [150]. ...

Reference:

Overview of Deep Learning and Nondestructive Detection Technology for Quality Assessment of Tomatoes
Generative adversarial network for semi-supervised image captioning
  • Citing Article
  • December 2024

Computer Vision and Image Understanding

... Nevertheless, WSIs are massive images, often containing billions of pixels [13,57], making detailed annotations and analysis difficult and expensive. To tackle these challenges, machine learning techniques incorporating few-shot and weakly supervised learning have been developed [36,27,31,50,53]. Among these, multiple instance learning (MIL) and vision-language models (VLMs) have gained particular attention for their ability to effectively manage limited annotations and interpret complex whole-slide pathology images. ...

ViLa-MIL: Dual-scale Vision-Language Multiple Instance Learning for Whole Slide Image Classification
  • Citing Conference Paper
  • June 2024

... The research involved collecting smear images and corresponding cytological reports from 161 patients who underwent serous cavity drainage. The dataset consisted of 4836 annotated patches, identifying regions with and without malignant cells [76]. ...

Multiple serous cavity effusion screening based on smear images using vision transformer

... Recently, researchers have proposed a number of video privacy protection schemes, such as video encryption [8][9][10][11], video watermarking [12][13][14][15][16], and video steganography [17][18][19][20]. Among these schemes, video encryption is one of the more significant and secure approaches. ...

A Multi-embedding Domain Video Steganography Algorithm Based on TU Partitioning and Intra Prediction Mode
  • Citing Article
  • March 2024

Neurocomputing

... Models trained through various ML algorithms have demonstrated an excellent diagnostic performance in predicting postoperative mortality, complications, and prognosis 12,13 . Some recent studies have used ML models to predict peripartum maternal bleeding and transfusion 10,[14][15][16][17] . Using ML to predict maternal bleeding and the need for a transfusion, correlations not found in conventional linear statistical analysis can be observed. ...

Construction and effect evaluation of prediction model for red blood cell transfusion requirement in cesarean section based on artificial intelligence

BMC Medical Informatics and Decision Making

... In our research, we leverage these LLM-based techniques to identify entities in the question, subsequently categorizing them as nodes and edges in a question graph. Building on previous findings [28][29][30][31][32][33], we designed a succinct prompt for entity identification, provided in Appendix A.1. ...

Template-Free Prompting for Few-Shot Named Entity Recognition via Semantic-Enhanced Contrastive Learning
  • Citing Article
  • September 2023

IEEE Transactions on Neural Networks and Learning Systems

... They [17,21,27,33,38] have achieved significant success in various pathological diagnostic tasks like cancer subtyping [2,17,35], staging [10,36], and tissue segmentation [19,42]. However, training these MIL-based models still heavily relies on a large number of slides with bag-level labels, which are often unreachable for rare diseases. ...

MG-Trans: Multi-Scale Graph Transformer With Information Bottleneck for Whole Slide Image Classification
  • Citing Article
  • September 2023

IEEE Transactions on Medical Imaging

... One of the obvious advantages of the ML- and DL-based attack detection methods is their ability to identify complex and multi-stage attacks, as well as zero-day attacks [3]. The latter is possible due to the generalization ability of ML models, especially DL ones [4]. This fact has been proved by numerous research papers devoted to the efficiency evaluation of the ML- and DL-based techniques proposed to detect anomalies and attacks [3,5,6]. ...

Understanding the Generalization Ability of Deep Learning Algorithms: A Kernelized Rényi's Entropy Perspective
  • Citing Conference Paper
  • August 2023

... Named Entity Recognition (NER) is a fundamental information extraction task that aims to identify and classify text spans into predefined entity classes. Recently, tremendous advances in deep learning have propelled NER towards SOTA performance (Ge et al. 2023;Huang et al. 2023). However, the success of these deep learning-based methods depends on large-scale, manually annotated data. ...

PRAM: An End-to-end Prototype-based Representation Alignment Model for Zero-resource Cross-lingual Named Entity Recognition
  • Citing Conference Paper
  • January 2023

... works initialize concept embeddings from natural language descriptions (Wu et al. 2023b; Bornet et al. 2023), or enrich patient representations with external disease ontologies (Cheong et al. 2023). However, a significant gap persists between the primary knowledge modality, i.e. natural language, and the model's hidden representation. ...

MEGACare: Knowledge-guided multi-view hypergraph predictive framework for healthcare
  • Citing Article
  • July 2023

Information Fusion