April 2025
Cancer Research
Background Spatial transcriptomics (ST) enables the integration of gene expression data with spatial context in tissue samples, providing high-resolution insights without requiring single-cell dissociation. With subcellular resolution, ST offers a precise method for annotating cell types, surpassing traditional approaches based on manual pathologic annotation. Conventional AI models for pathology images typically rely on annotations from pathologists to label cellular compositions in hematoxylin and eosin (H&E)-stained slides. In contrast, our AI model is trained on data generated from subcellular-resolution ST, which offers higher accuracy and objectivity in identifying specific cell types. In this study, we focus on lymphocyte identification in non-small cell lung cancer (NSCLC) pathology images to demonstrate the capabilities of this approach.

Methods The model analyzes H&E slide images obtained from surgical specimens of NSCLC patients. The training dataset consisted of ST data (Xenium, 10x Genomics) and H&E images from 90 NSCLC samples collected at a single institution. After aligning the ST data with the H&E images, cell-type masks were generated for lymphocyte labeling. For performance evaluation, 456 patches (256 × 256 pixels, 0.45 µm/pixel) were randomly selected from 30 NSCLC H&E slide images. Two pathologists independently annotated lymphocytes in these patches. Of the 456 patches, the 355 with consistent annotations between the two pathologists were selected for model evaluation. The consensus annotations served as the reference standard for assessing the model's performance in lymphocyte identification.

Results The model achieved an AUROC of 0.8411, indicating high diagnostic accuracy. The optimal threshold was determined to be 0.3870, with specificity and sensitivity of 72.76% and 78.92%, respectively.
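The abstract does not state how the AUROC was computed or how the threshold was chosen. As a minimal sketch only, the following assumes per-cell model scores compared against binary reference labels, computes AUROC by pairwise comparison, and picks the threshold that maximizes Youden's J (sensitivity + specificity − 1). The choice of Youden's J, the function names, and the toy data are all assumptions for illustration and are not taken from the study.

```python
import numpy as np

def auroc(scores, labels):
    """AUROC as the probability that a positive outscores a negative (ties count 0.5)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    diff = pos[:, None] - neg[None, :]  # every positive/negative pair
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return wins / (len(pos) * len(neg))

def confusion_metrics(scores, labels, threshold):
    """Sensitivity, specificity and accuracy when score >= threshold is called positive."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pred = scores >= threshold
    tp = np.sum(pred & labels)
    tn = np.sum(~pred & ~labels)
    fp = np.sum(pred & ~labels)
    fn = np.sum(~pred & labels)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(labels),
    }

def youden_threshold(scores, labels):
    """Assumed selection rule: the observed score that maximizes Youden's J."""
    def j_stat(t):
        m = confusion_metrics(scores, labels, t)
        return m["sensitivity"] + m["specificity"] - 1.0
    return max(np.unique(scores), key=j_stat)

# Toy data (hypothetical, not from the study): 1 = lymphocyte, 0 = other cell.
toy_scores = [0.1, 0.2, 0.6, 0.3, 0.7, 0.9, 0.8]
toy_labels = [0, 0, 0, 1, 1, 1, 1]

toy_auc = auroc(toy_scores, toy_labels)
toy_threshold = youden_threshold(toy_scores, toy_labels)
toy_metrics = confusion_metrics(toy_scores, toy_labels, toy_threshold)
```

Other selection rules, such as fixing a minimum sensitivity, would generally give a different threshold and therefore a different sensitivity and specificity pair.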
Compared to the two pathologists' consensus annotations, the model showed a sensitivity of 81.66%, a specificity of 69.70%, and an accuracy of 77.29% for lymphocyte identification.

Conclusion This AI model, trained on subcellular-resolution ST data, provides robust performance for analyzing the distribution of lymphocytes in NSCLC H&E images. The model offers an innovative approach to analyzing cellular composition from H&E images, demonstrating its potential contribution to tumor microenvironment (TME) research and the development of precision medicine.

Citation Format Seo Hye Park, Haenara Shin, Hosub Park, Jaemoon Koh, Kwon Joong Na, Hongyoon Choi. Development and validation of an AI-based model for lymphocyte identification in NSCLC H&E image using spatial transcriptomics [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2025; Part 1 (Regular Abstracts); 2025 Apr 25-30; Chicago, IL. Philadelphia (PA): AACR; Cancer Res 2025;85(8_Suppl_1):Abstract nr 2421.