March 2025
Cancer Research
Background: Digital high-resolution H&E images provide valuable insights into tumor heterogeneity within the tumor microenvironment (TME). However, the ability to perform detailed cell typing based solely on H&E images remains limited. Recent advances in high-resolution spatial transcriptomics (ST) enable precise characterization of cell types and their spatial relationships within the TME, addressing challenges in understanding tumor heterogeneity. In this study, we developed AI models to predict cell types in breast cancer, including lymphocyte subtypes that are challenging to differentiate visually, using large-scale image-based ST data aligned with H&E images.

Methods: We established a breast cancer image-based ST database comprising 190 samples from 113 breast cancer patients, obtained from surgically resected primary tumors. Image-based ST data were generated using the Xenium platform with a 500-gene panel and matched with high-resolution H&E images. Cell type maps were constructed by transferring labels from reference single-cell RNA-seq data to the ST data. The ST data were spatially registered to the corresponding H&E images, and cell type masks were generated at matching resolutions. AI models were trained to segment cell type masks for epithelial cells, cancer cells, myeloid cells, fibroblasts, endothelial cells, T cells, and B cells. Additionally, refined cell type predictions were performed for dendritic cells, NK cells, CD4+ T cells, and CD8+ T cells. Model performance was validated using external whole-slide image datasets containing ST data from four independent cohorts.

Results: Model performance was assessed using the area under the receiver operating characteristic curve (AUROC) for each cell type mask. Internal validation on four samples yielded AUROC values ranging from 0.90 to 0.95. External validation across independent whole-slide datasets demonstrated AUROC values between 0.88 and 0.97 for all cell type masks. The AI model successfully mapped TME cell types at high resolution using only H&E images.

Conclusion: By employing a self-supervised approach that integrates high-resolution H&E images with image-based ST data, we developed AI-driven tools for TME analysis. These models enable accurate identification of detailed cell types and their spatial relationships within the TME. This approach facilitates large-scale analysis of the breast cancer TME and holds potential for advancing our understanding of tumor biology and therapeutic strategies.

Citation Format: Haenara Shin, Dongjoo Lee, Yooeun Kim, Daeseung Lee, Kwon Joong Na, Chihwan David Cha, Hosub Park, Hongyoon Choi. A self-supervised AI model leveraging spatial omics for analyzing tumor microenvironment heterogeneity in breast cancer only with H&E [abstract]. In: Proceedings of the AACR Special Conference in Cancer Research: Functional and Genomic Precision Medicine in Cancer: Different Perspectives, Common Goals; 2025 Mar 11-13; Boston, MA. Philadelphia (PA): AACR; Cancer Res 2025;85(5 Suppl):Abstract nr B038.
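Illustrative note on the evaluation metric: the abstract reports an AUROC for each cell type mask but does not describe how it was computed. The following is a minimal sketch of one common way to compute it, assuming each ground-truth mask is a binary pixel grid and each prediction is a grid of per-pixel probabilities. The function names, the dictionary keys, and the toy 2x2 masks are hypothetical and are not taken from the authors' pipeline.

```python
from typing import Dict, List, Sequence

Grid = List[List[float]]


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC via the Mann-Whitney U statistic, using average ranks for ties."""
    n = len(labels)
    if n != len(scores):
        raise ValueError("labels and scores must have the same length")
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC is undefined when only one class is present")

    # Sort indices by score, then assign 1-based ranks, averaging over ties.
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


def flatten(grid: Grid) -> List[float]:
    """Flatten a 2D pixel grid into a 1D list, row by row."""
    return [value for row in grid for value in row]


def per_cell_type_auroc(
    true_masks: Dict[str, Grid], pred_probs: Dict[str, Grid]
) -> Dict[str, float]:
    """Compute one AUROC per cell type from pixel-level masks."""
    return {
        cell_type: auroc(
            [int(v) for v in flatten(true_masks[cell_type])],
            flatten(pred_probs[cell_type]),
        )
        for cell_type in true_masks
    }


# Toy data (hypothetical): two cell types on a 2x2 image.
toy_true = {
    "T cell": [[1, 0], [0, 1]],
    "B cell": [[0, 0], [1, 1]],
}
toy_pred = {
    "T cell": [[0.9, 0.2], [0.1, 0.7]],
    "B cell": [[0.1, 0.4], [0.35, 0.8]],
}
toy_result = per_cell_type_auroc(toy_true, toy_pred)
```

In this toy case the T cell predictions rank every positive pixel above every negative pixel, so the AUROC is 1.0, while the B cell predictions misrank one positive-negative pair out of four, giving 0.75. A real evaluation would run over whole-slide masks and would likely use an optimized library routine rather than this pure-Python version.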