Figure - available from: Frontiers in Genetics
Schematic representation of the proposed cluster-driven batch alignment (CBA) method for aligning single-cell RNA-seq data measured in two different batches. The figure shows the unsupervised clustering of cells from both batches and the network architecture of CBA; the node types are explained in the top-left corner. Cells A and C are from batch one, and cells B and D are from batch two. At its core, the alignment is performed by an autoencoder: cells A and B are aligned and embedded in a lower-dimensional representation M and, simultaneously, cells C and D are aligned and embedded in N. M and N are subsequently used to represent the aligned cells, e.g., to produce a UMAP visualization. Details on the autoencoder and the classification layer can be found in the section that describes CBA.
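To make the architecture in the caption more concrete, the following is a minimal sketch of a single forward pass through an autoencoder with an auxiliary classification head, written with NumPy. It is not the authors' implementation: the layer sizes, the single-layer encoder and decoder, the use of cluster labels as the classification target, the unweighted sum of the two losses, and all variable names are assumptions made for illustration. No training is performed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only (assumptions, not taken from the paper).
n_genes, n_latent, n_clusters = 50, 8, 3

# Toy expression matrices (cells x genes) standing in for two batches.
batch1 = rng.normal(size=(20, n_genes))
batch2 = rng.normal(loc=0.5, size=(25, n_genes))  # shifted mean mimics a batch effect
x = np.vstack([batch1, batch2])

# Stand-in for cluster labels produced by unsupervised clustering.
cluster_ids = rng.integers(0, n_clusters, size=x.shape[0])

# Randomly initialised weights for a one-layer encoder, decoder, and classifier.
W_enc = rng.normal(scale=0.1, size=(n_genes, n_latent))
W_dec = rng.normal(scale=0.1, size=(n_latent, n_genes))
W_cls = rng.normal(scale=0.1, size=(n_latent, n_clusters))


def forward(cells):
    """Return the embedding, the reconstruction, and cluster probabilities."""
    z = np.tanh(cells @ W_enc)                 # shared low-dimensional embedding
    x_hat = z @ W_dec                          # reconstruction of the input
    logits = z @ W_cls                         # classification head on the embedding
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    probs = exp / exp.sum(axis=1, keepdims=True)
    return z, x_hat, probs


z, x_hat, probs = forward(x)

# Reconstruction loss (mean squared error) plus cross-entropy on cluster labels.
recon_loss = np.mean((x - x_hat) ** 2)
cls_loss = -np.mean(np.log(probs[np.arange(x.shape[0]), cluster_ids]))
total_loss = recon_loss + cls_loss  # equal weighting is an assumption
```

In a trained model of this kind, the rows of `z` would play the role of the embeddings M and N in the figure and could be passed to a UMAP implementation for visualization.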
Source publication
The power of single-cell RNA sequencing (scRNA-seq) for detecting cell heterogeneity and developmental processes is becoming increasingly evident. The granularity of this knowledge is further improved by combining two batches of scRNA-seq data into a single large dataset. This strategy is, however, hampered by technical differences between these...
Similar publications
Bilateral renal cell carcinoma (RCC) is a rare disease that can be classified as either familial or sporadic. Studying the cellular and molecular characteristics of sporadic bilateral RCC is important for guiding clinical treatment. These characteristics can be observed at the RNA level, especially at single-cell resolution. S...
Citations
... Due to the stochastic nature of single-cell sequencing, experiments done at different times, in different locations, with different reagents, with different technologies, or by different technicians may carry experiment-specific biases that influence sequencing results. To combat this, deep learning models [95][96][97][98][99][100][101][102][103][104][105][106][107][108] have been developed to learn a shared latent representation for these different experiments, one that removes technical noise but keeps biological variation. ...
Single-cell RNA sequencing (scRNA-seq) has become a routinely used technique to quantify the gene expression profile of thousands of single cells simultaneously. Analysis of scRNA-seq data plays an important role in the study of cell states and phenotypes, and has helped elucidate biological processes, such as those occurring during development of complex organisms, and improved our understanding of disease states, such as cancer, diabetes, and coronavirus disease 2019 (COVID-19), among others. Deep learning, a recent advance of artificial intelligence that has been used to address many problems involving large datasets, has also emerged as a promising tool for scRNA-seq data analysis, as it has a capacity to extract informative and compact features from noisy, heterogeneous, and high-dimensional scRNA-seq data to improve downstream analysis. The present review aims at surveying recently developed deep learning techniques in scRNA-seq data analysis, identifying key steps within the scRNA-seq data analysis pipeline that have been advanced by deep learning, and explaining the benefits of deep learning over more conventional analytic tools. Finally, we summarize the challenges in current deep learning approaches faced within scRNA-seq data and discuss potential directions for improvements in deep learning algorithms for scRNA-seq data analysis.
... Due to the stochastic nature of single-cell sequencing, experiments done at different times, in different locations, with different reagents, with different technologies, or by different technicians may carry experiment-specific biases that influence sequencing results. To combat this, deep learning models [96][97][98][99][100][101][102][103][104][105][106][107][108][109] have been developed to learn a shared latent representation for these different experiments, one that removes technical noise but keeps biological variation. ...
Single-cell RNA-sequencing (scRNA-seq) has become a routinely used technique to quantify the gene expression profile of thousands of single cells simultaneously. Analysis of scRNA-seq data plays an important role in the study of cell states and phenotypes, and has helped elucidate biological processes, such as those occurring during development of complex organisms, and improved our understanding of disease states, such as cancer, diabetes, and COVID, among others. Deep learning, a recent advance of artificial intelligence that has been used to address many problems involving large datasets, has also emerged as a promising tool for scRNA-seq data analysis, as it has a capacity to extract informative, compact features from noisy, heterogeneous, and high-dimensional scRNA-seq data to improve downstream analysis. The present review aims at surveying recently developed deep learning techniques in scRNA-seq data analysis, identifying key steps within the scRNA-seq data analysis pipeline that have been advanced by deep learning, and explaining the benefits of deep learning over more conventional analysis tools. Finally, we summarize the challenges that current deep learning approaches face with scRNA-seq data and discuss potential directions for improvements in deep learning algorithms for scRNA-seq data analysis.
At present, most approaches rely on manual annotation of cell types after clustering single-cell RNA sequencing (scRNA-seq) data. These approaches require considerable manual effort and depend heavily on the user's skill, which can lead to uneven and inconsistent results. In this paper, we provide a computer-assisted interpretation of every single cell in a tissue sample, along with an in-depth exploration of each cell's molecular, phenotypic, and functional attributes. We perform k-means clustering, followed by silhouette validation, based on similar phenotypic and functional attributes. We then annotate cell types by matching each cell's gene profile against a known database under certain statistical conditions. Finally, all genes are mapped spatially onto the tissue sample. This paper is intended as an aid to medicine, showing which cells are expressed or not expressed in a tissue sample and where they are located on it.
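The clustering and validation step mentioned in this abstract can be sketched as follows. This is a minimal, dependency-free illustration of k-means followed by silhouette scoring on toy 2-D points; it is not the paper's pipeline. The toy data, the function names, and the idea of choosing the number of clusters by the highest mean silhouette are assumptions made for illustration.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialise from k distinct data points
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda c: euclidean(p, centers[c]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points."""
    clusters = set(labels)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two non-empty clusters")
    scores = []
    for i, p in enumerate(points):
        own = [euclidean(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:  # singleton cluster: silhouette is defined as 0
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance within the point's own cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(euclidean(p, q) for q in group) / len(group)
            for group in (
                [q for q, lab in zip(points, labels) if lab == c]
                for c in clusters if c != labels[i]
            )
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy data: two well-separated groups standing in for cell feature vectors.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
          (5.0, 5.0), (5.1, 5.2), (5.2, 5.1)]

# Score several candidate cluster counts and keep the best one.
results = {}
for k in (2, 3, 4):
    labels_k, _ = kmeans(points, k)
    results[k] = silhouette_score(points, labels_k)
best_k = max(results, key=results.get)
labels, centers = kmeans(points, best_k)
```

On real scRNA-seq data the points would typically be cells in a reduced feature space, and a library implementation of k-means and the silhouette coefficient would usually be preferable to this hand-written version.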
In multimodal data fusion and land-cover interpretation tasks, the fusion interpretability between hyperspectral image (HSI) and light detection and ranging (LiDAR) data is always nontrivial to be clarified. Furthermore, the heterogeneous sample and distribution variances of these two remote sensing (RS) modalities impede the joint classification performance. In this article, a hyperspectral intrinsic image decomposition guided data fusion network (HI2D2FNet) is proposed. Generally, classic hyperspectral intrinsic image decomposition (HIID) performs well in image enhancement and shadow removal. It decomposes one HSI into one reflectance component and one shading component. Inspired by the core mechanism of HIID, our motivation is to preliminarily exploit its potential for multimodal RS data fusion and explore the inherent modality connection between HSI and LiDAR data from the intrinsic perspective. Specifically, compared with existing techniques, HI2D2FNet is capable of fusing the horizontal geometry information in the shading component with the vertical geometry information in the LiDAR data from both sample and distribution perspectives in the spatial domain. The generated cross-modal geometry feature contributes to guiding the reflectance stream optimization. This unique fusion framework connects both the modalities with respect to geometry information and enhances the specific fusion interpretability. The decomposition and fusion modules in HI2D2FNet are optimized simultaneously in a novel alternative optimization pattern. Furthermore, several unique cross-modal constraints in terms of prior RS properties are presented. Experiments conducted on three widely available datasets prove the superiority of HI2D2FNet over state-of-the-art techniques. The source codes will be available at https://github.com/GEOywb/HI2D2FNet.