Figure - available via license: CC BY
The number of ST spots from breast cancer tissue samples obtained by (A) manual annotation by pathologists and (B) automated annotation by PCA. (A) Number of manually selected breast cancer ST spots. (B) Number of automatically identified breast cancer ST spots.

Source publication
Article
Background: Distinguishing ductal carcinoma in situ (DCIS) from invasive ductal carcinoma (IDC) regions in clinical biopsies constitutes a diagnostic challenge. Spatial transcriptomics (ST) is an in situ capturing method, which allows quantification and visualization of transcriptomes in individual tissue sections. In the past, studies have shown...

Contexts in source publication

Context 1
... human genome hg38 and its corresponding annotation file were used for mapping and for assigning sequence reads to genes (annotation) [25]. The general statistics for the breast cancer datasets are shown in Additional file 1: Table S1. ...
Context 2
... the manually selected ST spots of the four breast cancer datasets (Table 1A), we used 798 differentially expressed breast cancer signature ST-TCs to train the model. We selected 133 ST spots from the three breast cancer datasets 2, 3, and 4 (48 non-malignant spots, 45 DCIS spots, 40 IDC spots) to train the model. ...
Context 3
... applied principal component analysis (PCA) on the data matrix consisting of 25,179 ST-TCs and 979 ST spots to place ST spots with similar expression close to each other in a 2-dimensional representation (the first two principal components) of the 25,179-dimensional expression space (Fig. 4a). To identify groups of ST spots in the PCA possibly corresponding to the three classes, we performed hierarchical clustering analysis (HCA) on the first two principal components and were able to identify three distinct groups among the 979 ST spots (Table 1B, Fig. 4a) (see "Methods"). Given a number of expected clusters (n = 3 classes), HCA groups the ST spots on the 2-dimensional PCA plot such that each ST spot belongs to one cluster. ...
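The procedure in Context 3 (PCA to two dimensions, then hierarchical clustering cut at three groups) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the data are synthetic, the function names `pca_scores` and `cluster_spots` are hypothetical, and Ward linkage is an assumption, since the excerpt does not state which linkage or preprocessing was used.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def pca_scores(expr, n_components=2):
    """Project spots (rows) onto the leading principal components via SVD."""
    centered = expr - expr.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def cluster_spots(expr, n_clusters=3):
    """Reduce to 2 principal components, then cut a hierarchical tree."""
    scores = pca_scores(expr, n_components=2)
    # Ward linkage is an assumption; the source only says "HCA".
    tree = linkage(scores, method="ward")
    # "maxclust" cuts the tree so that at most n_clusters groups remain.
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")
    return scores, labels


# Synthetic stand-in for a spots x transcripts matrix:
# 90 spots, 50 features, three well-separated groups of 30 spots each.
rng = np.random.default_rng(0)
centers = np.zeros((3, 50))
centers[1, :25] = 6.0
centers[2, 25:] = 6.0
expr = np.vstack([c + rng.normal(size=(30, 50)) for c in centers])

scores, labels = cluster_spots(expr)
```

Each spot receives exactly one cluster label, matching the description that every ST spot belongs to one cluster. On real data the matrix would be 979 spots by 25,179 ST-TCs, and normalization choices would likely affect the result.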

Similar publications

Article
Highly multiplexed immunohistochemistry (mIHC) enables the staining and quantification of dozens of antigens in a tissue section with single-cell resolution. However, annotating cell populations that differ little in the profiled antigens or for which the antibody panel does not include specific markers is challenging. To overcome this obstacle, we...

Citations

... Spatial transcriptomics (ST) performs high-throughput measurement of transcriptomes in complex biological tissues at single-cell or subcellular resolution, preserving spatial information [1][2][3][4][5][6][7][8][9]. In the past decade, the rapid development of ST technologies has facilitated exciting discoveries in different domains, including neuroscience [10][11][12] and cancer research [13][14][15]. The popular ST technologies and corresponding platforms differ in terms of the procedure used to record spatial profiles, such as region of interest (ROI) selection [16,17], next-generation sequencing (NGS) with spatial barcoding [18][19][20], and single-molecule fluorescence in situ hybridization (smFISH) [21][22][23]. ...
Article
In high-throughput spatial transcriptomics (ST) studies, it is of great interest to identify the genes whose level of expression in a tissue covaries with the spatial location of cells/spots. Such genes, also known as spatially variable genes (SVGs), can be crucial to the biological understanding of both structural and functional characteristics of complex tissues. Existing methods for detecting SVGs either suffer from huge computational demand or significantly lack statistical power. We propose a non-parametric method termed SMASH that achieves a balance between the above two problems. We compare SMASH with other existing methods in varying simulation scenarios demonstrating its superior statistical power and robustness. We apply the method to four ST datasets from different platforms uncovering interesting biological insights.
... Jon et al. conducted research on non-small cell lung cancer (NSCLC) patients to investigate the biomarkers linked to advantageous PD-1 checkpoint blockade using the GeoMx DSP, and they showed the potential of DSP in identifying spatially informative biomarkers of the PD-1 checkpoint blockade response in NSCLC and confirmed alternative immune predictors with spatial context that deserve validation in larger independent cohorts [38]. By applying ST to detect the composition of the TME, they revealed how distinct components of the TME determine the outcome of PD-1 checkpoint blockade. ...
Article
Since its first application in 2016, spatial transcriptomics has evolved rapidly. Spatial transcriptomics enables transcriptomic data to be acquired from intact tissue sections, provides spatial distribution information, and remedies the disadvantage of single-cell RNA sequencing (scRNA-seq), whose data lack spatially resolved information. Presently, spatial transcriptomics has been widely applied to various tissue types, especially for the study of tumor heterogeneity. In this review, we provide a summary of the research progress in utilizing spatial transcriptomics to investigate tumor heterogeneity and the microenvironment with a focus on solid tumors. We summarize the research breakthroughs in various fields and perspectives due to the application of spatial transcriptomics, including cell clustering and interaction, cellular metabolism, gene expression, immune cell programs and combination with other techniques. As a combination of multiple transcriptomics, single-cell multiomics shows its superiority and validity in single-cell analysis. We also discuss the application prospect of single-cell multiomics, and we believe that with the progress of data integration from various transcriptomics, a multilayered subcellular landscape will be revealed.
... In contrast, SRM and SRT can be modeled for unbiased predictive analysis and can be used to simultaneously obtain different types of information, such as metabolomes or transcriptomes, according to recent breast cancer research. The prediction models developed by applying a machine learning algorithm to four SRT breast cancer datasets clearly distinguished two subtypes of ductal carcinoma in situ and invasive ductal carcinoma [115]. Santoro et al. [116] used DESI-MSI combined with conventional pathology for the metabolomic analysis of different breast cancer molecular subtypes and found different lipid compositions among invasive breast cancer, ductal carcinoma in situ, and adjacent benign tissue, wherein highly saturated fatty acids and antioxidant molecules were able to differentiate invasive breast cancer from adjacent benign tissue, and fatty acids and glycerophospholipids could differentiate between ductal carcinoma. ...
Article
Tumors are spatially heterogeneous tissues that comprise numerous cell types with intricate structures. By interacting with the microenvironment, tumor cells undergo dynamic changes in gene expression and metabolism, resulting in spatiotemporal variations in their capacity for proliferation and metastasis. In recent years, the rapid development of histological techniques has enabled efficient and high-throughput biomolecule analysis. By preserving location information while obtaining a large number of gene and molecular data, spatially resolved metabolomics (SRM) and spatially resolved transcriptomics (SRT) approaches can offer new ideas and reliable tools for the in-depth study of tumors. This review provides a comprehensive introduction and summary of the fundamental principles and research methods used for SRM and SRT techniques, as well as a review of their applications in cancer-related fields.
... Transcriptomics analyses have become one of the most popular platforms for the identification of BC-causing key genes and PPI that might play key roles in BC diagnoses, prognoses, and therapy [2]. Additionally, spatial transcriptomics (ST) is an in situ capturing method that allows for the quantification and visualization of transcriptomes in individual histological tissue sections, distinguishing non-malignant, ductal carcinoma in situ (DCIS) and invasive ductal carcinoma (IDC) regions in clinical biopsies of the breast using an automatic selection of cell types via their transcriptome profiles [167]. Epitranscriptomics focuses on the understanding of the epitranscriptome, which plays a key role in the alternative splicing, nuclear export, transcript stability, and translation of RNAs [143]. ...
Article
Breast cancer (BC) is characterized by an extensive genotypic and phenotypic heterogeneity. In-depth investigations into the molecular bases of BC phenotypes, carcinogenesis, progression, and metastasis are necessary for accurate diagnoses, prognoses, and therapy assessments in predictive, precision, and personalized oncology. This review discusses both classic and several novel omics fields that are involved or should be used in modern BC investigations, which may be integrated as a holistic term, onco-breastomics. Rapid and recent advances in molecular profiling strategies and analytical techniques based on high-throughput sequencing and mass spectrometry (MS) development have generated large-scale multi-omics datasets, mainly emerging from the three “big omics”, based on the central dogma of molecular biology: genomics, transcriptomics, and proteomics. Metabolomics-based approaches also reflect the dynamic response of BC cells to genetic modifications. Interactomics promotes a holistic view in BC research by constructing and characterizing protein–protein interaction (PPI) networks that provide a novel hypothesis for the pathophysiological processes involved in BC progression and subtyping. The emergence of new omics- and epiomics-based multidimensional approaches provides opportunities to gain insights into BC heterogeneity and its underlying mechanisms. The three main epiomics fields (epigenomics, epitranscriptomics, and epiproteomics) are focused on the epigenetic DNA changes, RNA modifications, and posttranslational modifications (PTMs) affecting protein functions for an in-depth understanding of cancer cell proliferation, migration, and invasion. Novel omics fields, such as epichaperomics or epimetabolomics, could investigate the modifications in the interactome induced by stressors and provide PPI changes, as well as in metabolites, as drivers of BC-causing phenotypes. Over recent years, several proteomics-derived omics, such as matrisomics, exosomics, secretomics, kinomics, phosphoproteomics, or immunomics, provided valuable data for a deep understanding of dysregulated pathways in BC cells and their tumor microenvironment (TME) or tumor immune microenvironment (TIME). Most of these omics datasets are still assessed individually using distinct approaches and do not generate the desired and expected global-integrative knowledge with applications in clinical diagnostics. However, several hyphenated omics approaches, such as proteo-genomics, proteo-transcriptomics, and phosphoproteomics-exosomics are useful for the identification of putative BC biomarkers and therapeutic targets. To develop non-invasive diagnostic tests and to discover new biomarkers for BC, classic and novel omics-based strategies allow for significant advances in blood/plasma-based omics. Salivaomics, urinomics, and milkomics appear as integrative omics that may develop a high potential for early and non-invasive diagnoses in BC. Thus, the analysis of the tumor circulome is considered a novel frontier in liquid biopsy. Omics-based investigations have applications in BC modeling, as well as accurate BC classification and subtype characterization. The future in omics-based investigations of BC may also focus on multi-omics single-cell analyses.
... Distinguishing between the two types of breast cancer is critical for determining the best treatment from all the options like surgery, radiation therapy, and chemotherapy [34,35]. In literature, partial annotation is available for DCIS, IDC, and non-malignant regions, see Figure 5 (c) [36]. K-means clustering based on the estimated cell proportions of FAST can recover the annotated spots and extend the annotations to those areas that were previously unclear, see Figure 5 (b) and (d). ...
Preprint
Motivation: Spatial transcriptomics is a state-of-the-art technique that allows researchers to study gene expression patterns in tissues over the spatial domain. As a result of technical limitations, the majority of spatial transcriptomics techniques provide bulk data for each sequencing spot. Consequently, in order to obtain high-resolution spatial transcriptomics data, performing deconvolution becomes essential. Deconvolution enables the determination of the proportions of different cell types along with the corresponding gene expression levels for each cell type within each spot. Most existing deconvolution methods rely on reference data (e.g., single-cell data), which may not be available in real applications. Current reference-free methods encounter limitations due to their dependence on distribution assumptions, reliance on marker genes, or the absence of leveraging histology and spatial information. Consequently, there is a critical demand for the development of highly adaptable, robust, and user-friendly reference-free deconvolution methods capable of unifying or leveraging case-specific information in the analysis of spatial transcriptomics data. Results: We propose a novel reference-free method based on regularized non-negative matrix factorization (NMF), named Flexible Analysis of Spatial Transcriptomics (FAST), that can effectively incorporate gene expression data, spatial coordinates, and histology information into a unified deconvolution framework. Compared to existing methods, FAST imposes fewer distribution assumptions, utilizes the spatial structure information of tissues, and encourages interpretable factorization results. These features enable greater flexibility and accuracy, making FAST an effective tool for deciphering the complex cell-type composition of tissues and advancing our understanding of various biological processes and diseases. Extensive simulation studies have shown that FAST outperforms other existing reference-free methods. In real data applications, FAST is able to uncover the underlying tissue structures and identify the corresponding marker genes.
... For breast cancer, there are few studies that support the use of lncRNAs, alone or in combination with other biotypes, as molecular predictive or prognostic biomarkers in clinical practice, and none of them has been approved for commercial distribution, as PROGENSA has been in prostate cancer, although there is already evidence in the scientific literature about their potential as biomarkers in decision-making for the management of breast cancer patients [61][62][63]. The best example to describe the potential clinical utility of a lncRNA in patients with breast cancer is the study performed by Berger et al., in which the existence of lncRNA-coding gene regulation networks, such as NEAT1, TERC, and TUG1, together with other mRNAs, such as ESR1, AR, and SOX2, makes it possible to classify patients with gynecological cancers and breast cancer into 6 clusters, which are related directly to their phenotypes and mainly to the immune response, as well as to the expression of hormone receptors in patients particularly associated with the estrogen receptor signaling pathway. ...
... The ISH-RNA assay has also allowed the detection of lncRNA SNHG3 as a potential diagnostic biomarker, distinguishing between normal breast tissue and cancerous breast tissues [84]. Furthermore, there are novel molecular approaches, such as spatial transcriptomics, which allow for the identification of a signature based on 798 transcripts, including the lncRNA LINC00657, that could be implemented in machine learning methods to distinguish invasive breast cancer [62]. In summary, the implementation of molecular biology techniques for lncRNA-based biomarkers detection in clinical practice could improve the reliability of the results of laboratory tests and the accuracy of oncological diagnosis. ...
Article
Given their tumor-specific and stage-specific gene expression, long non-coding RNAs (lncRNAs) have been shown to be potential molecular biomarkers for diagnosis, prognosis, and treatment response. Particularly, the lncRNAs DSCAM-AS1 and GATA3-AS1 serve as examples of this because of their high subtype-specific expression profile in luminal B-like breast cancer. This makes them candidates to use as molecular biomarkers in clinical practice. However, lncRNA studies in breast cancer are limited in sample size and are restricted to the determination of their biological function, which represents an obstacle to their inclusion as molecular biomarkers of clinical utility. Nevertheless, due to their expression specificity among diseases, such as cancer, and their stability in body fluids, lncRNAs are promising molecular biomarkers that could improve the reliability, sensitivity, and specificity of molecular techniques used in clinical diagnosis. The development of lncRNA-based diagnostics and lncRNA-based therapeutics will be useful in routine medical practice to improve patient clinical management and quality of life.
... Spatial transcriptomics (ST) performs high-throughput measurement of transcriptomes in complex biological tissues at single-cell or subcellular resolution, preserving spatial information [1,2,3,4,5,6,7,8,9,10]. In the past decade, the rapid development of ST technologies has facilitated exciting discoveries in different domains, including neuroscience [11,12,13] and cancer research [14,15,16]. Early ST technologies, such as smFISH [17], seqFISH [18] and MERFISH [19], operate at relatively low spatial resolution, whereas recent technologies, such as Slide-seq [20], Slide-seq V2 [21], 10X Visium [22] and HDST [23], have enabled transcriptome-wide profiling at a much finer spatial resolution on multiple thousands of locations. ...
Preprint
In high-throughput spatial transcriptomics (ST) studies, it is of great interest to identify the genes whose level of expression in a tissue covaries with the spatial location of cells/spots. Such genes, also known as spatially variable genes (SVGs), can provide crucial biological insights into both structural and functional characteristics of complex tissues. Existing methods for detecting SVGs either suffer from huge computational demand or significantly lack statistical power. We propose a non-parametric method termed SMASH that achieves a balance between the above two problems. We compare SMASH with other existing methods in varying simulation scenarios demonstrating its superior statistical power and robustness. We apply the method to four ST datasets from different platforms revealing interesting biological insights.
... It enables high-throughput quantitative assessment of the location and abundance of gene activity within a tissue, which can elucidate molecular mechanisms at an unprecedented level of spatial detail. Traditional molecular profiling technologies (e.g., single-cell or single-nuclei RNA sequencing) dissociate tissues, lose the spatial context of gene expression [1][2][3], or offer limited gene-level assessment via tissue slides. In contrast, SRT technologies empower the comprehensive characterization of molecular abundance within individual cells while preserving spatial information. ...
Preprint
The emerging field of spatially resolved transcriptomics (SRT) has revolutionized biomedical research. SRT quantifies expression levels at different spatial locations, providing a new and powerful tool to interrogate novel biological insights. An essential question in the analysis of SRT data is to identify spatially variable (SV) genes; the expression levels of such genes have spatial variation across different tissues. SV genes usually play an important role in underlying biological mechanisms and tissue heterogeneity. Currently, several computational methods have been developed to detect such genes; however, there is a lack of unbiased assessment of these approaches to guide researchers in selecting the appropriate methods for their specific biomedical applications. In addition, it is difficult for researchers to implement different existing methods for either biological study or methodology development. Furthermore, currently available public SRT datasets are scattered across different websites and preprocessed in different ways, posing additional obstacles for quantitative researchers developing computational methods for SRT data analysis. To address these challenges, we designed Spatial Transcriptomics Arena (STAr), an open platform comprising 193 curated datasets from seven technologies, seven statistical methods, and analysis results. This resource allows users to retrieve high-quality datasets, apply or develop spatial gene detection methods, as well as browse and compare spatial gene analysis results. It also enables researchers to comprehensively evaluate SRT methodology research in both simulated and real datasets. Altogether, STAr is an integrated research resource intended to promote reproducible research and accelerate rigorous methodology development, which can eventually lead to an improved understanding of biological processes and diseases. STAr can be accessed at https://lce.biohpc.swmed.edu/star/ .
... For example, we may be studying a biopsy sample from a tumor tissue, and we would like to identify the border between the tumor and healthy tissue using as few slices as possible. This localization problem is a common goal in cancer pathology (Yoosuf et al., 2020). For simplicity, assume that we can label each spatial location as being on the interior of the region of interest (y = 1) or the exterior of the region of interest (y = 0) after we have collected data for that location. ...
Preprint
Spatially-resolved genomic technologies have shown promise for studying the relationship between the structural arrangement of cells and their functional behavior. While numerous sequencing and imaging platforms exist for performing spatial transcriptomics and spatial proteomics profiling, these experiments remain expensive and labor-intensive. Thus, when performing spatial genomics experiments using multiple tissue slices, there is a need to select the tissue cross sections that will be maximally informative for the purposes of the experiment. In this work, we formalize the problem of experimental design for spatial genomics experiments, which we generalize into a problem class that we call structured batch experimental design . We propose approaches for optimizing these designs in two types of spatial genomics studies: one in which the goal is to construct a spatially-resolved genomic atlas of a tissue and another in which the goal is to localize a region of interest in a tissue, such as a tumor. We demonstrate the utility of these optimal designs, where each slice is a two-dimensional plane, on several spatial genomics datasets.
... In a spatial transcriptomics study of HER2-positive breast cancer [46], the researchers correlated pathologists' annotations with RNA expression-based clusters and found a high degree of consistency, and likewise found that data-driven expression-based clustering captured signals that were missed by visual inspection. They have also used spatial transcriptomic data to automatically generate pathological annotations of HER2-positive breast cancer; the same approach was also applied to the annotation of invasive ductal carcinoma pathology [47]. ... 2020) study: generate tensor molecular thermograms from pathological sections, subsequently generate inhomogeneous distribution maps of cells, and generate heterogeneity indices using the formula ...
... These "high-risk areas" over the portion may determine the scope of the surgery. The investigators also used the spatial transcriptome to automate the pathological annotation of HER-2-positive breast cancer [46] as well as invasive ductal carcinoma [47]. Although this method cannot replace pathological sections in the short term because of cost, it may support pathologists' clinical decisions in the future. ...
Article
A major feature of cancer is heterogeneity, both intratumoral and intertumoral. Traditional single-cell techniques have given us a comprehensive understanding of the biological characteristics of individual tumor cells, but the lack of spatial context of the transcriptome has limited the study of cell-to-cell interaction patterns and hindered further exploration of tumor heterogeneity. In recent years, the advent of spatially resolved transcriptomics (SRT) technology has made possible the multidimensional analysis of the tumor microenvironment in the context of intact tissues. Different SRT methods are applicable to different working ranges due to different working principles. In this paper, we review the advantages and disadvantages of various current SRT methods and the overall idea of applying these techniques to oncology studies, hoping to help researchers find breakthroughs. Finally, we discuss the future direction of SRT technology and deeper investigation into the complex mechanisms of tumor development from different perspectives through multi-omics fusion, paving the way for precisely targeted tumor therapy.