Publications (202)
January 2025
IEEE Transactions on Information Theory
Domain generalization aims to learn invariance across multiple source domains, thereby enhancing generalization against out-of-distribution data. While gradient or representation matching algorithms have achieved remarkable success in domain generalization, these methods generally lack generalization guarantees or depend on strong assumptions, leaving a gap in understanding the underlying mechanism of distribution matching. In this work, we formulate domain generalization from a novel probabilistic perspective, ensuring robustness while avoiding overly conservative solutions. Through comprehensive information-theoretic analysis, we provide key insights into the roles of gradient and representation matching in promoting generalization. Our results reveal the complementary relationship between these two components, indicating that existing works focusing solely on either gradient or representation alignment are insufficient to solve the domain generalization problem. In light of these theoretical findings, we introduce IDM to simultaneously align the inter-domain gradients and representations. Integrated with the proposed PDM method for complex distribution matching, IDM achieves superior performance over various baseline methods.
December 2024 · 10 Reads
Introduction: Thymoma classification is challenging due to its diverse morphology. Accurate classification is crucial for diagnosis, but current methods often struggle with complex tumor subtypes. This study presents an AI-assisted diagnostic model that combines weakly supervised learning with a divide-and-conquer multi-instance learning (MIL) approach to improve classification accuracy and interpretability. Methods: We applied the model to 222 thymoma slides, simplifying the five-class classification into binary and ternary steps. The model features an attention-based mechanism that generates heatmaps, enabling visual interpretation of decisions. These heatmaps align with clinically validated morphological differences between thymoma subtypes. Additionally, we embedded domain-specific pathological knowledge into the interpretability framework. Results: The model achieved a classification AUC of 0.9172. The generated heatmaps accurately reflected the morphological distinctions among thymoma subtypes, as confirmed by pathologists. The model's transparency allows pathologists to visually verify AI decisions, enhancing diagnostic reliability. Discussion: This model offers a significant advancement in thymoma classification, combining high accuracy with interpretability. By integrating weakly supervised learning, MIL, and attention mechanisms, it provides an interpretable AI framework that is applicable in clinical settings. The model reduces the diagnostic burden on pathologists and has the potential to improve patient outcomes by making AI tools more transparent and clinically relevant.
December 2024 · 11 Reads
December 2024 · 1 Read · 1 Citation
Computer Vision and Image Understanding
October 2024 · 26 Reads
IEEE Transactions on Medical Imaging
Multiple instance learning (MIL) has emerged as a prominent paradigm in digital pathology for processing whole slide images (WSIs), which have a pyramid structure and gigapixel size. However, existing attention-based MIL methods are primarily trained on the image modality and a pre-defined label set, leading to limited generalization and interpretability. Recently, vision-language models (VLMs) have achieved promising performance and transferability, offering potential solutions to the limitations of MIL-based methods. Pathological diagnosis is an intricate process that requires pathologists to examine the WSI step by step. In the field of natural language processing, the chain-of-thought (CoT) prompting method is widely used to imitate the human reasoning process. Inspired by CoT prompting and pathologists’ clinical knowledge, we propose a chain-of-diagnosis prompting multiple instance learning (CoD-MIL) framework for whole slide image classification. Specifically, the chain-of-diagnosis text prompt decomposes the complex diagnostic process in WSI into progressive sub-processes from low to high magnification. Additionally, we propose a text-guided contrastive masking module to accurately localize the tumor region by masking the most discriminative instances and introducing the guidance of normal tissue texts in a contrastive way. Extensive experiments conducted on three real-world subtyping datasets demonstrate the effectiveness and superiority of CoD-MIL.
October 2024 · 3 Reads · 1 Citation
August 2024 · 1 Read
IEEE/ACM Transactions on Computational Biology and Bioinformatics
Biomedical coreference resolution focuses on identifying coreferences in biomedical texts and normally consists of two parts: (i) mention detection, which identifies textual representations of biological entities, and (ii) finding their coreference links. Recently, a popular approach to enhancing the task has been to embed a knowledge base into deep neural networks. However, the way in which these methods integrate knowledge has a shortcoming: the knowledge may play a larger role in mention detection than in coreference resolution. Specifically, they tend to integrate knowledge prior to mention detection, as part of the embeddings. Moreover, they primarily focus on mention-dependent knowledge (KBase), i.e., knowledge entities directly related to mentions, while ignoring the correlated knowledge (K+) between the mentions in a mention pair. For mentions with significant differences in word form, this may limit their ability to extract potential correlations between those mentions. Thus, this paper develops a novel model that integrates both KBase and K+ entities and achieves state-of-the-art performance on the BioNLP and CRAFT-CR datasets. Empirical studies on mention detection with different lengths reveal the effectiveness of the KBase entities. The evaluation on cross-sentence and match/mismatch coreference further demonstrates the superiority of the K+ entities in extracting potential background correlations between mentions.
August 2024 · 19 Reads · 1 Citation
Medical Image Analysis
June 2024 · 3 Reads
Citations (53)
... RNNs are adept at processing sequential data and are extensively applied in natural language processing and time-series predictions. GANs consist of a generator and a discriminator, both of which are competing against each other to generate high-quality data, with applications ranging from image to text generation [149]. Autoencoders are unsupervised learning algorithms that compress input data into low-dimensional representations and reconstruct them, which are commonly used for tasks such as dimensionality reduction and denoising [150]. ...
- Citing Article
December 2024
Computer Vision and Image Understanding
... Nevertheless, WSIs are massive images, often containing billions of pixels [13,57], making detailed annotations and analysis difficult and expensive. To tackle these challenges, machine learning techniques incorporating few-shot and weakly supervised learning have been developed [36,27,31,50,53]. Among these, multiple instance learning (MIL) and vision-language models (VLMs) have gained particular attention for their ability to effectively manage limited annotations and interpret complex whole-slide pathology images. ...
- Citing Conference Paper
June 2024
... The research involved collecting smear images and corresponding cytological reports from 161 patients who underwent serous cavity drainage. The dataset consisted of 4836 annotated patches, identifying regions with and without malignant cells [76]. ...
- Citing Article
March 2024
... Recently, researchers have proposed a number of video privacy protection schemes, such as video encryption [8][9][10][11], video watermarking [12][13][14][15][16], and video steganography [17][18][19][20]. Among these schemes, video encryption is one of the more significant and secure approaches. ...
- Citing Article
March 2024
Neurocomputing
... Models trained through various ML algorithms have demonstrated an excellent diagnostic performance in predicting postoperative mortality, complications, and prognosis [12,13]. Some recent studies have used ML models to predict peripartum maternal bleeding and transfusion [10,14-17]. Using ML to predict maternal bleeding and the need for a transfusion, correlations not found in conventional linear statistical analysis can be observed. ...
- Citing Article
October 2023
BMC Medical Informatics and Decision Making
... In our research, we leverage these LLM-based techniques to identify entities in the question, subsequently categorizing them as nodes and edges in a question graph. Building on previous findings [28][29][30][31][32][33], we designed a succinct prompt for entity identification, provided in Appendix A.1. ...
- Citing Article
September 2023
IEEE Transactions on Neural Networks and Learning Systems
... They [17,21,27,33,38] have achieved significant success in various pathological diagnostic tasks like cancer subtyping [2,17,35], staging [10,36], and tissue segmentation [19,42]. However, training these MIL-based models still heavily relies on a large number of slides with bag-level labels which are often unreachable for rare diseases. ...
- Citing Article
September 2023
IEEE Transactions on Medical Imaging
... One of the obvious advantages of the ML- and DL-based attack detection methods is their ability to identify complex and multi-stage attacks, as well as zero-day attacks [3]. The latter is possible due to the generalization ability of ML models, especially DL ones [4]. This fact has been proved by numerous research papers devoted to the efficiency evaluation of the ML- and DL-based techniques proposed to detect anomalies and attacks [3,5,6]. ...
- Citing Conference Paper
August 2023
... Named Entity Recognition (NER) is a fundamental information extraction task that aims to identify and classify text spans into predefined entity classes. Recently, tremendous advances in deep learning have propelled NER towards SOTA performance (Ge et al. 2023; Huang et al. 2023). However, the success of these deep learning-based methods depends on large-scale, manually annotated data. ...
- Citing Conference Paper
January 2023
... works initialize concept embeddings from natural language descriptions (Wu et al. 2023b; Bornet et al. 2023), or enrich patient representation with external disease ontologies (Cheong et al. 2023). However, a significant gap persists between the primary knowledge modality, i.e. natural language, and the model's hidden representation. ...
- Citing Article
July 2023
Information Fusion