Spatial structure of RNA expression
RNA-seq and similar methods can record gene expression within and among cells. Current methods typically lose positional information and many require arduous single-cell isolation and sequencing. Ståhl et al. have developed a way of measuring the spatial distribution of transcripts by annealing fixed brain or cancer tissue samples directly to bar-coded reverse transcriptase primers, performing reverse transcription followed by sequencing and computational reconstruction, and they can do so for multiple genes. Science, this issue p. 78
Visualization and analysis of gene
expression in tissue sections by
spatial transcriptomics
Patrik L. Ståhl,*
Fredrik Salmén,*
Sanja Vickovic,
Anna Lundmark,
José Fernández Navarro,
Jens Magnusson,
Stefania Giacomello,
Michaela Asp,
Jakub O. Westholm,
Mikael Huss,
Annelie Mollbrink,
Sten Linnarsson,
Simone Codeluppi,
Åke Borg,
Fredrik Pontén,
Paul Igor Costea,
Pelin Sahlén,
Jan Mulder,
Olaf Bergmann,
Joakim Lundeberg,
Jonas Frisén
Analysis of the pattern of proteins or messenger RNAs (mRNAs) in histological tissue sections
is a cornerstone in biomedical research and diagnostics. This typically involves the visualization
of a few proteins or expressed genes at a time. We have devised a strategy, which we call “spatial
transcriptomics,” that allows visualization and quantitative analysis of the transcriptome with
spatial resolution in individual tissue sections. By positioning histological sections on arrayed
reverse transcription primers with unique positional barcodes, we demonstrate high-quality
RNA-sequencing data with maintained two-dimensional positional information from the mouse
brain and human breast cancer. Spatial transcriptomics provides quantitative gene expression data
and visualization of the distribution of mRNAs within tissue sections and enables novel types of
bioinformatics analyses, valuable in research and diagnostics.
Tissue transcriptomes are typically studied
by RNA-sequencing (RNA-seq) (1) of ho-
mogenized biopsies, which results in an
averaged transcriptome and loss of spatial
information. The positional context of gene
expression is of key importance to understand-
ing tissue functionality and pathological changes.
Several strategies have recently been developed
with this aim (2–5), but they have limitations in
the number of transcripts that can be analyzed,
rely on rich preexisting data sets, and/or are costly
and labor-intensive, and none of them are opera-
tional in the standard research and diagnostic
setting of regular histological tissue sections.
We devised a strategy to introduce positional molecular barcodes in the
complementary DNA (cDNA) synthesis reaction
within the context of an intact tissue section
before RNA-seq. We first assessed whether it was
feasible to generate cDNA from messenger RNA
(mRNA) in tissue sections on a surface. We im-
mobilized reverse-transcription oligo(dT) primers
on glass slides and placed on the slides sections
of adult mouse olfactory bulb, a brain region
with clear histological landmarks and ample gene-
expression reference data. The tissue was fixed,
stained, and imaged (Fig. 1A) (6).
After permeabilization, we performed reverse transcription with
fluorescently labeled nucleotides to visualize the
synthesized cDNA (Fig. 1A and fig. S1). The tissue
was then enzymatically removed, which left the cDNA on the slide
(6). The fluorescent cDNA showed a pattern in detail
corresponding to the tissue structure revealed by the
general histology (Fig. 1, B and C), and the cDNA was
strictly localized directly under individual cells (Fig. 1,
D to G). By comparing the hematoxylin-and-eosin
and fluorescent signals, we could measure the average
diffusion distance from the cell to 1.7 ± 2 µm (mean ± SD) (fig. S1, E to H).
The realization that it is possible to capture
mRNA in tissue sections with minimal diffusion
and maintained positional representation moti-
vated us to array oligonucleotides with positional
barcodes (Fig. 2A), and we denoted this strategy
“spatial transcriptomics.” We deposited ~200 million
oligonucleotides in each of 1007 features, with a
diameter of 100 µm and a center-to-center distance
of 200 µm, over an area of 6.2 mm by 6.6 mm (fig. S2).
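The array geometry quoted above can be turned into a quick back-of-the-envelope calculation. The sketch below is only illustrative: it assumes the features are circular and that the 6.2 mm by 6.6 mm rectangle is the full printed area, neither of which is stated explicitly in the text. The function names are hypothetical.

```python
import math

# Values quoted in the text.
N_FEATURES = 1007
FEATURE_DIAMETER_UM = 100.0
PITCH_UM = 200.0                      # center-to-center distance
ARRAY_W_MM, ARRAY_H_MM = 6.2, 6.6
OLIGOS_PER_FEATURE = 200e6            # "~200 million", approximate


def feature_area_um2(diameter_um: float) -> float:
    """Area of one feature, assuming it is a circle."""
    return math.pi * (diameter_um / 2.0) ** 2


def capture_fraction(n_features: int, diameter_um: float,
                     width_mm: float, height_mm: float) -> float:
    """Fraction of the array rectangle that is covered by features."""
    array_area_um2 = (width_mm * 1000.0) * (height_mm * 1000.0)
    return n_features * feature_area_um2(diameter_um) / array_area_um2


def edge_gap_um(pitch_um: float, diameter_um: float) -> float:
    """Empty distance between the edges of two neighboring features."""
    return pitch_um - diameter_um


def oligo_density_per_um2(oligos: float, diameter_um: float) -> float:
    """Approximate number of capture oligonucleotides per square micrometer."""
    return oligos / feature_area_um2(diameter_um)


if __name__ == "__main__":
    frac = capture_fraction(N_FEATURES, FEATURE_DIAMETER_UM, ARRAY_W_MM, ARRAY_H_MM)
    print(f"covered fraction of array: {frac:.3f}")
    print(f"gap between feature edges: {edge_gap_um(PITCH_UM, FEATURE_DIAMETER_UM):.0f} um")
    print(f"oligos per um^2: {oligo_density_per_um2(OLIGOS_PER_FEATURE, FEATURE_DIAMETER_UM):.0f}")
```

Under these assumptions roughly one-fifth of the array rectangle lies under a feature, which is worth keeping in mind when comparing per-feature counts with whole-section measurements.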
After capturing and reverse-transcribing mRNA,
we generated sequencing libraries based on
amplification by in vitro transcription (fig. S3, A
and B) (7, 8). Comparison with data from RNA
extracted and fragmented in solution revealed
that ~95% of the genes found with one of the
methods were also found with the other (fig. S3C).
The correlation between the surface and in-solution
libraries was r = 0.94, with even representation
of genes having high or low expression (fig. S3D).
Replicates of surface-based experiments of adjacent
tissue sections showed a correlation of r =
0.97 (fig. S3E). Thus, cDNA synthesis from tissue
with arrayed oligonucleotides on a surface is ef-
ficient and does not introduce bias compared
with in-solution protocols (fig. S3F and table S1).
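A comparison of this kind (gene overlap plus a correlation coefficient between two libraries) could be sketched as follows. This is an assumption-laden illustration rather than the authors' pipeline: the text does not state how r was computed, so the sketch uses Pearson's r on log2(count + 1) values, defines overlap as shared genes divided by all detected genes, and uses invented toy counts.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def compare_libraries(counts_a, counts_b):
    """Return (fraction of detected genes shared, Pearson r on log2(count + 1)).

    counts_a / counts_b map gene name -> read or transcript count.
    A gene is treated as detected when its count is above zero (assumption).
    """
    detected_a = {g for g, c in counts_a.items() if c > 0}
    detected_b = {g for g, c in counts_b.items() if c > 0}
    union = sorted(detected_a | detected_b)
    shared = detected_a & detected_b
    log_a = [math.log2(counts_a.get(g, 0) + 1) for g in union]
    log_b = [math.log2(counts_b.get(g, 0) + 1) for g in union]
    return len(shared) / len(union), pearson_r(log_a, log_b)


# Toy counts, made up for illustration only.
surface = {"Penk": 120, "Doc2g": 45, "Kctd12": 80, "Th": 0}
solution = {"Penk": 100, "Doc2g": 50, "Kctd12": 90, "Th": 3}

if __name__ == "__main__":
    overlap, r = compare_libraries(surface, solution)
    print(f"shared genes: {overlap:.2f}, r = {r:.2f}")
```

The log transform is a common choice for expression data because it keeps a handful of highly expressed genes from dominating the coefficient; other normalizations would give different values.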
We sorted the RNA-seq data to its correspond-
ing array features by using the spatial barcodes
and aligned the tissue image with the features of
the array, which enabled visualization and analy-
ses. Examples of gene-expression patterns revealed
by spatial transcriptomics and validation by in situ
hybridization are shown in Fig. 2B and fig. S4, A to
C. Transcripts expressed at very low levels, such as
olfactory receptor mRNAs (9), were also detected
with spatial transcriptomics (fig. S4D).
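The sorting step described above, assigning each read to an array feature through its spatial barcode, can be sketched in a few lines. The layout below (spatial barcode followed directly by the unique molecular identifier) follows the order listed for the capture oligonucleotide, but the lengths, the exact-match lookup without error correction, and all names are hypothetical choices for illustration. Gene assignment is assumed to have been done upstream by alignment.

```python
from collections import defaultdict

BARCODE_LEN = 6   # hypothetical length, chosen for the toy example
UMI_LEN = 4       # hypothetical length, chosen for the toy example


def count_transcripts(reads, barcode_to_xy):
    """Count unique transcripts per array feature and gene.

    reads: iterable of (tag, gene), where tag starts with the spatial
           barcode followed by the UMI.
    barcode_to_xy: dict mapping spatial barcode -> (x, y) array position.
    Reads with an unknown barcode or a truncated UMI are skipped.
    """
    umis_seen = defaultdict(set)  # (xy, gene) -> set of UMIs
    for tag, gene in reads:
        barcode = tag[:BARCODE_LEN]
        umi = tag[BARCODE_LEN:BARCODE_LEN + UMI_LEN]
        xy = barcode_to_xy.get(barcode)
        if xy is None or len(umi) < UMI_LEN:
            continue
        umis_seen[(xy, gene)].add(umi)  # duplicates of one UMI collapse here

    counts = defaultdict(dict)
    for (xy, gene), umis in umis_seen.items():
        counts[xy][gene] = len(umis)
    return dict(counts)


# Toy input, made up for illustration only.
barcode_to_xy = {"AAACCC": (0, 0), "GGGTTT": (0, 1)}
reads = [
    ("AAACCCACGT", "Penk"),
    ("AAACCCACGT", "Penk"),   # amplification duplicate, same UMI
    ("AAACCCTTTT", "Penk"),
    ("GGGTTTACGT", "Doc2g"),
    ("CCCCCCACGT", "Penk"),   # barcode not on the array, dropped
]

if __name__ == "__main__":
    print(count_transcripts(reads, barcode_to_xy))
```

Collapsing reads that share a feature, gene, and UMI is what turns read counts into counts of unique transcripts; a production pipeline would usually also tolerate sequencing errors in the barcode.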
The number of genes (10) (Fig. 2C) and unique
transcripts (fig. S5A) per individual feature varied
between cell layers with different cell density (Fig.
2D and table S2). For the vast majority of genes,
the coefficient of variation decreased as the aver-
age expression increased (fig. S5B). The number of
78 1 JULY 2016 VOL 353 ISSUE 6294 SCIENCE
Department of Cell and Molecular Biology, Karolinska Institute, SE-171 77 Stockholm, Sweden.
Science for Life Laboratory, Division of Gene Technology, KTH Royal Institute of Technology, SE-106 91 Stockholm, Sweden.
Department of Dental Medicine, Division of Periodontology, Karolinska Institute, SE-141 04 Huddinge, Sweden.
Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, Box 1031, SE-171 21 Solna, Sweden.
Division of Molecular Neuroscience, Department of Medical Biochemistry and Biophysics, Karolinska Institute, SE-171 77 Stockholm, Sweden.
Department of Physiology and Pharmacology, Karolinska Institute, SE-171 77 Stockholm, Sweden.
Division of Oncology and Pathology, Department of Clinical Sciences Lund, Lund University, SE-223 81 Lund, Sweden.
Department of Immunology, Genetics and Pathology, Uppsala University, SE-751 85 Uppsala, Sweden.
Science for Life Laboratory, Department of Neuroscience, Karolinska Institute, SE-171 77 Stockholm, Sweden.
*These authors contributed equally to this work. These authors
contributed equally to this work. Corresponding author. Email:
as high as when using laser capture microdis-
section (11), and spatial transcriptomics detected
almost twice as many genes as examination by in
situ hybridization in the Allen Brain Atlas (fig.
S5, C and D). Furthermore, we compared spatial
transcriptomics with the near-100% sensitivity of
single-molecule fluorescent in situ hybridization
in adjacent tissue sections. The sensitivity of spa-
tial transcriptomics was 6.9 ± 1.5% of single-
molecule fluorescent in situ hybridization (fig.
S6). By comparison, single-cell RNA sequencing
has been reported to have about 5 to 40% sen-
sitivity (12).
To further assess the potential lateral diffusion
of transcripts, we investigated the distribution of
the expression of 10 different genes with highly
enriched expression in the mitral cell layer (MCL),
Fig. 1. Spatially localized cDNA synthesis. (A) The tissue is sectioned, placed onto oligo(dT) primers, stained, and imaged. cDNA synthesis with Cy3-labeled
nucleotides reveals fluorescent cDNA after tissue removal. (B) Hematoxylin-and-eosin staining of olfactory bulbs and (C) fluorescent cDNA after tissue removal.
Scale bar, 500 µm. (D and E) Magnification of boxes in (B) and (C). Cell layers: GL; OPL, outer plexiform layer; MCL; and GCL. Arrowheads and boxes indicate
individual cells and corresponding cDNA with overlapping positions. Scale bar, 40 µm. (F to G) Cells in (D) and (E) magnified showing cytoplasm and
corresponding cDNA.
Fig. 2. Spatially resolved gene expression. (A) Each array feature contains
sequencing handle, a spatial barcode, a unique molecular identifier (UMI), and
an oligo(dT) VN-capture region, where V is anything but T and where N is any
nucleotide. cDNA (red) is generated from captured mRNA by reverse tran-
scription. (B) Visualization of the expression of three genes by spatial tran-
scriptomics (top) and in situ hybridization (bottom). Penk and Kctd12 in situ
images are from the Allen Institute. Cutoff normalized counts, Penk, 8; Doc2g,
13; and Kctd12, 19. (C) Distribution of unique genes per feature under the
tissue. (D) Number of genes detected for different layers and entire tissue over
sequencing depth. (E) Lateral diffusion of transcripts from genes enriched in
MCL. The genes are expressed in MCL features but are not separable from the
background in features adjacent to the MCL. (F) Spatial expression and in situ
hybridization of four genes in (E). The leftmost feature overlaps the MCL, and
the three rightmost features are situated in the GCL. The colored bar depicts
the distances from feature center in (E).
the adjacent granular cell layer (GCL). All these
genes were confirmed to be highly expressed in
the MCL by spatial transcriptomics, but they were
undetectable or detected at very low levels within
the GCL, even with the border of the feature 0 to
5 µm and the center of the feature 50 to 55 µm
from the MCL (Fig. 2, E and F, and fig. S7A).
Furthermore, we compared the distribution of
transcripts between areas obtained with laser cap-
ture microdissection (6) where there is no diffusion
of transcripts and with spatial transcriptomics
features, and we did not find evidence for a
difference between these methods in terms of
mRNA diffusion (fig. S7, B and C).
A common goal of gene expression analysis of
tissues is to define the transcriptome of specific
areas. Analysis between homologous regions re-
vealed very similar expression profiles (Fig. 3, A
and B, and fig. S8), with no differentially expressed
genes. In contrast, comparison of different domains
revealed different gene expression profiles (Fig.
3, A and C, and fig. S8). This included genes with
previously known restricted expression, such as
Doc2g in the glomerular layer (GL) and Penk in
the GCL (13), as well as novel layer-specific gene
expression profiles (Fig. 3C).
It is valuable to explore the gene expression
pattern of populations of cells or tissue domains
that can be defined by a combination of markers.
Spatial transcriptomics offers an alternative ap-
proach that circumvents multiplex labeling and
cell isolation. Any combination of presence or
absence of expression for a set of genes can be
used to define a marker profile of interest for
further analysis. Features were selected on the
basis of the presence and/or absence of the three
interneuron-marker genes Camk4, Th, and Vip.
The distribution of features, where one of the
genes is expressed alone, is shown in Fig. 3D.
Comparing gene expression revealed specific
transcriptomes defined by these interneuron-
marker profiles (Fig. 3, E and F, and fig. S8).
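Selecting features by a presence/absence marker profile, as described above, amounts to a simple filter over the per-feature count table. The following is a minimal sketch under stated assumptions: a gene counts as "expressed" when its count reaches a threshold of 1 (the text does not give the cutoff used), and the count table, function names, and values are all hypothetical.

```python
def is_expressed(feature_counts, gene, min_count=1):
    """True when the gene's count in this feature reaches the threshold."""
    return feature_counts.get(gene, 0) >= min_count


def select_features(counts, present, absent, min_count=1):
    """Return the positions of features matching a marker profile.

    counts:  dict mapping (x, y) -> {gene: count}
    present: genes that must be expressed in the feature
    absent:  genes that must not be expressed in the feature
    """
    return sorted(
        xy
        for xy, feature_counts in counts.items()
        if all(is_expressed(feature_counts, g, min_count) for g in present)
        and not any(is_expressed(feature_counts, g, min_count) for g in absent)
    )


# Toy count table, made up for illustration only.
toy_counts = {
    (0, 0): {"Camk4": 5, "Th": 0, "Vip": 0},
    (0, 1): {"Camk4": 2, "Th": 3},
    (1, 0): {"Vip": 4},
    (1, 1): {"Th": 7},
}

if __name__ == "__main__":
    print(select_features(toy_counts, present=["Camk4"], absent=["Th", "Vip"]))
    print(select_features(toy_counts, present=["Vip"], absent=["Camk4", "Th"]))
```

The features returned for each profile could then be passed to any differential expression routine; raising `min_count` makes the profile stricter at the cost of fewer selected features.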
To further explore gene expression profiles in
spatially defined domains within the olfactory bulb,
we used principal component analysis (fig. S9) or
the t-distributed stochastic neighbor embedding
(t-SNE) (14, 15) machine-learning algorithm for
dimensionality reduction, followed by hierarchi-
cal clustering (Fig. 4A). When placing back the
clustered features on the tissue images, it was apparent
that each cluster of features largely corresponded
to well-defined morphological layers (Fig. 4B).
The clusters were then compared with each
other, which allowed the identification and visualization
of cluster-specific marker genes (fig. S10,
A and B). This proved to be an efficient, unbiased
way to identify genes with expression enriched in
the cell layers of interest. Furthermore, we in-
vestigated the gene expression pattern in 10 sec-
tions from a total of five animals, as well as the
feature-to-feature correlation at the same location
in two adjacent sections (fig. S10, C to E).
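The clustering step that follows dimensionality reduction can be illustrated with a small, dependency-free example. This sketch does not reproduce the t-SNE embedding; it assumes two-dimensional coordinates per feature are already available and applies agglomerative clustering with average linkage and Euclidean distance. The linkage rule, the distance, and the toy coordinates are assumptions, since the text only says "hierarchical clustering".

```python
import math


def euclidean(u, v):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def average_linkage(points, n_clusters):
    """Agglomerative clustering with average linkage.

    points:     list of coordinate tuples (for example, embedded features)
    n_clusters: number of clusters to stop at (must be >= 1)
    Returns a sorted list of clusters, each a sorted list of point indices.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[i] for i in range(len(points))]

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        total = sum(euclidean(points[i], points[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        best = None  # (distance, index_a, index_b) of the closest pair
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage(clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Toy 2D embedding, made up for illustration only.
embedding = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9), (10.0, 0.0)]

if __name__ == "__main__":
    print(average_linkage(embedding, n_clusters=3))
```

Mapping each cluster label back to the (x, y) position of its features is then enough to draw the clusters on the tissue image. This naive implementation is cubic in the number of features, so a library routine would be preferable for real data.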
Analysis of the histology and a set of markers
are routine in cancer diagnostics, although anal-
ysis of the expression of panels of genes has
started to enter the clinic. We asked whether
adding a spatial dimension to gene expression
analysis may add information in cancer diag-
nostics and applied spatial transcriptomics to
breast cancer biopsies. In Fig. 4, C and D (see also
fig. S11, A and B), an area with invasive ductal
Fig. 3. Visualization and bioinformatics analyses of tissue domains defined by morphology or gene expression profile. (A) Ten selected features
in areas α (GCL), β (GCL), or γ (GL) are indicated. (B) Scatterplot of gene
expression in areas α and β shows similar expression of layer-specific genes.
Examples of genes are indicated with purple and brown dots. Housekeeping
genes are orange. (C) Scatterplot of gene expression in areas α and γ shows a
difference in gene expression. Examples from the 170 differentially expressed
genes are labeled. (D) The spatial expression of three interneuron-marker
gene profiles. Ten features with the different expression profiles were randomly
selected for differential expression analysis. (E) Comparing the 10 Camk4
features with the 10 Vip features. Examples, out of the 196
differentially expressed genes, are labeled. (F) Comparing the 10 Camk4
features with the 10 Th features. Examples from the 328
differentially expressed genes are labeled.
cancer, as well as six separate areas of ductal can-
cer in situ, were identified on the basis of mor-
phological criteria. Spatial transcriptomics analysis
of the invasive component revealed high expression
of extracellular matrix-associated genes (Fig.
4E). Analysis of the ductal cancer in situ areas
revealed a surprisingly high degree of heteroge-
neity in gene expression between these regions,
probably reflecting different subclones, with vary-
ing expression of several genes implicated in can-
cer progression (Fig. 4E and fig. S11C). For example,
expression of KRT17 and GAS6, implicated in
epithelial-to-mesenchymal transition (16, 17), was
high only in areas 1 and 5 (Fig. 4, C to E, and fig.
S11). Thus, spatial transcriptomics revealed unexpected
heterogeneity within a biopsy, which
would not be possible to detect with regular tran-
scriptome analysis and which may give more de-
tailed prognostic information.
Spatial transcriptomics calls for only a few ex-
tra steps compared with RNA-seq analysis of
homogenized tissue, with the benefit of providing
spatial information enabling additional levels of
analysis. In contrast to standard methods, different
domains of the tissue are processed in the same
reaction in spatial transcriptomics, which removes
technical variation between samples. A unique fea-
ture of spatial transcriptomics is that any gene ex-
pression profile can be selected to specify a
molecularly defined domain for further analysis.
Finally, in contrast to when different regions of a
tissue are dissected for analysis, the information for
the whole section is maintained; hence, the analysis
is not limited to the initially selected regions. An
individual spatial transcriptomics experiment thus
serves as a permanent resource to investigate gene
expression patterns for future research questions.
1. Z. Wang, M. Gerstein, M. Snyder, Nat. Rev. Genet. 10, 57–63 (2009).
2. N. Crosetto, M. Bienko, A. van Oudenaarden, Nat. Rev. Genet. 16, 57–66 (2015).
3. R. Satija, J. A. Farrell, D. Gennert, A. F. Schier, A. Regev, Nat. Biotechnol. 33, 495–502 (2015).
4. P. A. Combs, M. B. Eisen, PLOS ONE 8, e71820 (2013).
5. K. Achim et al., Nat. Biotechnol. 33, 503–509 (2015).
6. Materials and methods are available as supplementary materials on Science Online.
7. T. Hashimshony, F. Wagner, N. Sher, I. Yanai, Cell Reports 2, 666–673 (2012).
8. R. N. Van Gelder et al., Proc. Natl. Acad. Sci. U.S.A. 87, 1663–1667 (1990).
9. R. Vassar et al., Cell 79, 981–991 (1994).
10. K. D. Pruitt et al., Nucleic Acids Res. 42 (D1), D756–D763 (2014).
11. S. Zechel, P. Zajac, P. Lönnerberg, C. F. Ibáñez, S. Linnarsson, Genome Biol. 15, 486 (2014).
12. D. Grün, A. van Oudenaarden, Cell 163, 799–810 (2015).
13. E. S. Lein et al., Nature 445, 168–176 (2007).
14. A. Mahfouz et al., Methods 73, 79–89 (2015).
15. L. J. P. van der Maaten, G. E. Hinton, J. Mach. Learn. Res. 9, 2579–2605 (2008).
16. M. Kittaneh, A. J. Montero, S. Glück, Biomarkers Cancer 5, 61–70 (2013).
17. C. Gjerdrum et al., Proc. Natl. Acad. Sci. U.S.A. 107, 1124–1129 (2010).
We thank K. Meletis and M. Nilsson for discussions. This study
was supported by Knut och Alice Wallenberg Foundation, the
Swedish Foundation for Strategic Research, the Swedish Research
Council, the Swedish Cancer Society, the Karolinska Institute,
Tobias Stiftelsen, Torsten Söderbergs Stiftelse, Ragnar Söderbergs
Stiftelse, StratRegen, Åke Wiberg Foundation, and the Jeansson
Foundations. P.L.S. was supported by a postdoctoral fellowship
from the Swedish Research Council. We thank the Swedish
National Genomics Infrastructure hosted at SciLifeLab, as well as
the Swedish National Infrastructure for Computing–Uppsala
Multidisciplinary Center for Advanced Computational Science and
Bioinformatics Long-Term Support for providing sequencing and
computational assistance and infrastructure. The sequencing data
are deposited at the National Center for Biotechnology Information,
NIH, with BioProject ID PRJNA316587. Gene counts and scripts can
be downloaded from
P.L.S., F.S., J.L., and J.F. are authors on patents applied for by
Spatial Transcriptomics AB covering the technology.
Materials and Methods
Figs. S1 to S11
Tables S1 and S2
References (18–25)
12 January 2016; accepted 31 May 2016
Fig. 4. Comparative analyses of tissue domains. (A) t-SNE analysis and hierarchical clustering of 551 features from two replicates creates five distinct clusters.
(B) The features placed back onto the two tissue images. (C and D) Histological section of a breast cancer biopsy (C) containing invasive ductal cancer (INV) and
six separate areas of ductal cancer in situ (1 to 6), with analyzed spatial transcriptomics features in (D). INV areas without, or with minimal, stromal infiltration were
selected. (E) Gene expression heat map over the different areas in four adjacent sections (D) and (fig. S11).
Visualization and analysis of gene expression in tissue sections by spatial transcriptomics
Patrik L. Ståhl, Fredrik Salmén, Sanja Vickovic, Anna Lundmark, José Fernández Navarro, Jens Magnusson, Stefania Giacomello, Michaela Asp, Jakub O. Westholm, Mikael Huss, Annelie Mollbrink, Sten Linnarsson, Simone Codeluppi, Åke Borg, Fredrik Pontén, Paul Igor Costea, Pelin Sahlén, Jan Mulder, Olaf Bergmann, Joakim Lundeberg and Jonas Frisén (June 30, 2016)
Science 353 (6294), 78-82. [doi: 10.1126/science.aaf2403]
... is a technique that uncovers transcriptional signatures within the spatial context of intact tissue by integrating histology with RNA-seq thereby enabling the mapping of transcriptional changes during AKI and repair 17,18 . We therefore performed and integrated spatial and single-cell transcriptomics to localize kidney T lymphocytes in murine normal and ischemic kidneys with a focus on DN T cells. ...
Full-text available
T cells are important in the pathogenesis of acute kidney injury (AKI), and TCR⁺CD4⁻CD8⁻ (double negative-DN) are T cells that have regulatory properties. However, there is limited information on DN T cells compared to traditional CD4⁺ and CD8⁺ cells. To elucidate the molecular signature and spatial dynamics of DN T cells during AKI, we performed single-cell RNA sequencing (scRNA-seq) on sorted murine DN, CD4⁺, and CD8⁺ cells combined with spatial transcriptomic profiling of normal and post AKI mouse kidneys. scRNA-seq revealed distinct transcriptional profiles for DN, CD4⁺, and CD8⁺ T cells of mouse kidneys with enrichment of Kcnq5, Klrb1c, Fcer1g, and Klre1 expression in DN T cells compared to CD4⁺ and CD8⁺ T cells in normal kidney tissue. We validated the expression of these four genes in mouse kidney DN, CD4⁺ and CD8⁺ T cells using RT-PCR and Kcnq5, Klrb1, and Fcer1g genes with the NIH human kidney precision medicine project (KPMP). Spatial transcriptomics in normal and ischemic mouse kidney tissue showed a localized cluster of T cells in the outer medulla expressing DN T cell genes including Fcer1g. These results provide a template for future studies in DN T as well as CD4⁺ and CD8⁺ cells in normal and diseased kidneys.
... Single cell transcriptomics (sc-RNASeq) has become commonplace over recent years, with increasing evidence that this approach can bring new insights to normal physiology and disease pathology [26]. Here we utilised a spatial transcriptomic approach [27][28][29] to derive data about gene expression in the arterial valves of the human embryo, during the sculpting phases of their development. We have been able to validate these gene sets by comparison with sc-RNAseq and in some cases spatial transcriptomic datasets from other studies of fetal and postnatal valves, as well as by confirmation through RNA in situ expression experiments in mouse and human embryos. ...
Full-text available
Abnormalities of the arterial valves, including bicuspid aortic valve (BAV) are amongst the most common congenital defects and are a significant cause of morbidity as well as predisposition to disease in later life. Despite this, and compounded by their small size and relative inaccessibility, there is still much to understand about how the arterial valves form and remodel during embryogenesis, both at the morphological and genetic level. Here we set out to address this in human embryos, using Spatial Transcriptomics (ST). We show that ST can be used to investigate the transcriptome of the developing arterial valves, circumventing the problems of accurately dissecting out these tiny structures from the developing embryo. We show that the transcriptome of CS16 and CS19 arterial valves overlap considerably, despite being several days apart in terms of human gestation, and that expression data confirm that the great majority of the most differentially expressed genes are valve-specific. Moreover, we show that the transcriptome of the human arterial valves overlaps with that of mouse atrioventricular valves from a range of gestations, validating our dataset but also highlighting novel genes, including four that are not found in the mouse genome and have not previously been linked to valve development. Importantly, our data suggests that valve transcriptomes are under-represented when using commonly used databases to filter for genes important in cardiac development; this means that causative variants in valve-related genes may be excluded during filtering for genomic data analyses for, for example, BAV. Finally, we highlight “novel” pathways that likely play important roles in arterial valve development, showing that mouse knockouts of RBP1 have arterial valve defects. Thus, this study has confirmed the utility of ST for studies of the developing heart valves and broadens our knowledge of the genes and signalling pathways important in human valve development.
Full-text available
Various animal and cell culture models of diabetes mellitus (DM) have been established and utilized to study diabetic peripheral neuropathy (DPN). The divergence of metabolic abnormalities among these models makes their etiology complicated despite some similarities regarding the pathological and neurological features of DPN. Thus, this study aimed to review the omics approaches toward DPN, especially on the metabolic states in diabetic rats and mice induced by chemicals (streptozotocin and alloxan) as type 1 DM models and by genetic mutations (MKR, db/db and ob/ob) and high-fat diet as type 2 DM models. Omics approaches revealed that the pathways associated with lipid metabolism and inflammation in dorsal root ganglia and sciatic nerves were enriched and controlled in the levels of gene expression among these animal models. Additionally, these pathways were conserved in human DPN, indicating the pivotal pathogeneses of DPN. Omics approaches are beneficial tools to better understand the association of metabolic changes with morphological and functional abnormalities in DPN.
Full-text available
Cells collectively determine biological functions by communicating with each other—both through direct physical contact and secreted factors. Consequently, the local microenvironment of a cell influences its behavior, gene expression, and cellular crosstalk. Disruption of this microenvironment causes reciprocal changes in those features, which can lead to the development and progression of diseases. Hence, assessing the cellular transcriptome while simultaneously capturing the spatial relationships of cells within a tissue provides highly valuable insights into how cells communicate in health and disease. Yet, methods to probe the transcriptome often fail to preserve native spatial relationships, lack single-cell resolution, or are highly limited in throughput, i.e. lack the capacity to assess multiple environments simultaneously. Here, we introduce fragment-sequencing (fragment-seq), a method that enables the characterization of single-cell transcriptomes within multiple spatially distinct tissue microenvironments. We apply fragment-seq to a murine model of the metastatic liver to study liver zonation and the metastatic niche. This analysis reveals zonated genes and ligand-receptor interactions enriched in specific hepatic microenvironments. Finally, we apply fragment-seq to other tissues and species, demonstrating the adaptability of our method.
Full-text available
Research into the potential benefits of artificial intelligence for comprehending the intricate biology of cancer has grown as a result of the widespread use of deep learning and machine learning in the healthcare sector and the availability of highly specialized cancer datasets. Here, we review new artificial intelligence approaches and how they are being used in oncology. We describe how artificial intelligence might be used in the detection, prognosis, and administration of cancer treatments and introduce the use of the latest large language models such as ChatGPT in oncology clinics. We highlight artificial intelligence applications for omics data types, and we offer perspectives on how the various data types might be combined to create decision-support tools. We also evaluate the present constraints and challenges to applying artificial intelligence in precision oncology. Finally, we discuss how current challenges may be surmounted to make artificial intelligence useful in clinical settings in the future.
Full-text available
In comparative high-throughput sequencing assays, a fundamental task is the analysis of count data, such as read counts per gene in RNA-seq, for evidence of systematic changes across experimental conditions. Small replicate numbers, discreteness, large dynamic range and the presence of outliers require a suitable statistical approach. We present DESeq2, a method for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates. This enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression. The DESeq2 package is available at Electronic supplementary material The online version of this article (doi:10.1186/s13059-014-0550-8) contains supplementary material, which is available to authorized users.
Full-text available
Considerable progress in sequencing technologies makes it now possible to study the genomic and transcriptomic landscape of single cells. However, to better understand the complexity of multicellular organisms, we must devise ways to perform high-throughput measurements while preserving spatial information about the tissue context or subcellular localization of analysed nucleic acids. In this Innovation article, we summarize pioneering technologies that enable spatially resolved transcriptomics and discuss how these methods have the potential to extend beyond transcriptomics to encompass spatially resolved genomics, proteomics and possibly other omic disciplines.
Full-text available
Background Cortical interneurons originating from the medial ganglionic eminence, MGE, are among the most diverse cells within the CNS. Different pools of proliferating progenitor cells are thought to exist in the ventricular zone of the MGE, but whether the underlying subventricular and mantle regions of the MGE are spatially patterned has not yet been addressed. Here, we combined laser-capture microdissection and multiplex RNA-sequencing to map the transcriptome of MGE cells at a spatial resolution of 50 ¿m.ResultsDistinct groups of progenitor cells showing different stages of interneuron maturation are identified and topographically mapped based on their genome-wide transcriptional pattern. Although proliferating potential decreased rather abruptly outside the ventricular zone, a ventro-lateral gradient of increasing migratory capacity was identified, revealing heterogeneous cell populations within this neurogenic structure.Conclusions We demonstrate that spatially resolved RNA-seq is ideally suited for high resolution topographical mapping of genome-wide gene expression in heterogeneous anatomical structures such as the mammalian central nervous system.
The Allen Brain Atlases enable the study of spatially resolved, genome-wide gene expression patterns across the mammalian brain. Several explorative studies have applied linear dimensionality reduction methods such as Principal Component Analysis (PCA) and classical Multi-Dimensional Scaling (cMDS) to gain insight into the spatial organization of these expression patterns. In this paper, we describe a non-linear embedding technique called Barnes-Hut Stochastic Neighbor Embedding (BH-SNE) that emphasizes the local similarity structure of high-dimensional data points. By applying BH-SNE to the gene expression data from the Allen Brain Atlases, we demonstrate the consistency of the 2D, non-linear embedding of the sagittal and coronal mouse brain atlases, and across 6 human brains. In addition, we quantitatively show that BH-SNE maps are superior in their separation of neuroanatomical regions in comparison to PCA and cMDS. Finally, we assess the effect of higher-order principal components on the global structure of the BH-SNE similarity maps. Based on our observations, we conclude that BH-SNE maps with or without prior dimensionality reduction (based on PCA) provide comprehensive and intuitive insights in both the local and global spatial transcriptome structure of the human and mouse Allen Brain Atlases. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.
Motivation: A large choice of tools exists for many standard tasks in the analysis of high-throughput sequencing (HTS) data. However, once a project deviates from standard workflows, custom scripts are needed. Results: We present HTSeq, a Python library to facilitate the rapid development of such scripts. HTSeq offers parsers for many common data formats in HTS projects, as well as classes to represent data, such as genomic coordinates, sequences, sequencing reads, alignments, gene model information and variant calls, and provides data structures that allow for querying via genomic coordinates. We also present htseq-count, a tool developed with HTSeq that preprocesses RNA-Seq data for differential expression analysis by counting the overlap of reads with genes. Availability and implementation: HTSeq is released as an open-source software under the GNU General Public Licence and available from or from the Python Package Index at
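The counting step described in this abstract, tallying how many reads overlap each gene, can be sketched in a few lines. This is a minimal pure-Python illustration under stated assumptions, not the HTSeq API or the htseq-count implementation: the gene intervals, the reads, the half-open 0-based coordinates and the `__no_feature` / `__ambiguous` labels are all chosen here for illustration.

```python
from collections import Counter

# Hypothetical gene model: name -> (chromosome, start, end), 0-based, half-open.
GENES = {
    "geneA": ("chr1", 100, 200),
    "geneB": ("chr1", 150, 300),
    "geneC": ("chr2", 0, 50),
}

def overlapping_genes(chrom, start, end, genes=GENES):
    """Return the set of genes whose interval overlaps the read [start, end)."""
    return {
        name
        for name, (g_chrom, g_start, g_end) in genes.items()
        if g_chrom == chrom and start < g_end and g_start < end
    }

def count_reads(reads, genes=GENES):
    """Count reads per gene; reads hitting no gene or several are tallied apart."""
    counts = Counter({name: 0 for name in genes})
    for chrom, start, end in reads:
        hits = overlapping_genes(chrom, start, end, genes)
        if len(hits) == 1:
            counts[next(iter(hits))] += 1
        elif not hits:
            counts["__no_feature"] += 1
        else:
            counts["__ambiguous"] += 1
    return counts

# Invented aligned reads: (chromosome, start, end).
READS = [
    ("chr1", 110, 140),  # overlaps geneA only
    ("chr1", 160, 190),  # overlaps geneA and geneB
    ("chr1", 250, 280),  # overlaps geneB only
    ("chr2", 60, 90),    # overlaps no gene
    ("chr2", 40, 70),    # overlaps geneC
]
COUNTS = count_reads(READS)
```

A linear scan over all genes per read is fine for a toy example. Real data would call for an interval index, which is the kind of coordinate-based query structure the abstract says the library provides.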
In recent years advances in molecular biology have launched disruptive innovations in breast cancer diagnostics and therapeutics. The advent of genomics has revolutionized our understanding of breast cancer as several different biologically and molecularly distinct diseases. This research has led to commercially available polymerase chain reaction (PCR) and microarray tests that have begun to fundamentally change the way medical oncologists quantify recurrence risk in early stage breast cancer patients. The Genomics era has altered the clinicopathologic paradigm of selecting patients for adjuvant cytotoxic chemotherapy. Sufficiently powered prospective studies are underway that may establish these molecular assays as elements of standard clinical practice in breast cancer treatment. In this article, we review the strengths and limitations of currently available breast cancer-specific molecular tests.
Recent advances in single-cell sequencing hold great potential for exploring biological systems with unprecedented resolution. Sequencing the genome of individual cells can reveal somatic mutations and allows the investigation of clonal dynamics. Single-cell transcriptome sequencing can elucidate the cell type composition of a sample. However, single-cell sequencing comes with major technical challenges and yields complex data output. In this Primer, we provide an overview of available methods and discuss experimental design and single-cell data analysis. We hope that these guidelines will enable a growing number of researchers to leverage the power of single-cell sequencing.
Understanding cell type identity in a multicellular organism requires the integration of gene expression profiles from individual cells with their spatial location in a particular tissue. Current technologies allow whole-transcriptome sequencing of spatially identified cells but lack the throughput needed to characterize complex tissues. Here we present a high-throughput method to identify the spatial origin of cells assayed by single-cell RNA-sequencing within a tissue of interest. Our approach is based on comparing complete, specificity-weighted mRNA profiles of a cell with positional gene expression profiles derived from a gene expression atlas. We show that this method allocates cells to precise locations in the brain of the marine annelid Platynereis dumerilii with a success rate of 81%. Our method is applicable to any system that has a reference gene expression database of sufficiently high resolution.
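The general idea of matching a cell's expression profile against positional reference profiles can be sketched as follows. The abstract does not give the scoring rule, so everything here is an assumption made for illustration: the atlas is binary (each position lists the genes expressed there), a gene's specificity weight is one minus the fraction of positions expressing it, and a position scores the summed weights of agreeing genes minus those of disagreeing genes. The position and gene names are invented.

```python
# Hypothetical binary atlas: position -> set of genes expressed there.
ATLAS = {
    "pos1": {"g1", "g2"},
    "pos2": {"g2", "g3"},
    "pos3": {"g2", "g4"},
}

def specificity_weights(atlas):
    """Weight each gene by how few positions express it (assumed scheme)."""
    n_positions = len(atlas)
    genes = set().union(*atlas.values())
    return {
        gene: 1.0 - sum(gene in expressed for expressed in atlas.values()) / n_positions
        for gene in genes
    }

def position_score(cell_genes, expressed, weights):
    """Reward genes shared by cell and position; penalise genes found in only one."""
    agree = cell_genes & expressed
    disagree = cell_genes ^ expressed
    # Genes absent from the atlas carry no weight and are effectively ignored.
    return (
        sum(weights.get(g, 0.0) for g in agree)
        - sum(weights.get(g, 0.0) for g in disagree)
    )

def assign_position(cell_genes, atlas=ATLAS):
    """Return the atlas position with the highest score for this cell."""
    weights = specificity_weights(atlas)
    return max(atlas, key=lambda pos: position_score(cell_genes, atlas[pos], weights))

WEIGHTS = specificity_weights(ATLAS)
BEST = assign_position({"g2", "g3"})
```

In this toy atlas `g2` is expressed everywhere and therefore gets weight zero, so it cannot distinguish positions. Only the position-specific genes drive the assignment, which is the intuition behind weighting by specificity.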
Spatial localization is a key determinant of cellular fate and behavior, but methods for spatially resolved, transcriptome-wide gene expression profiling across complex tissues are lacking. RNA staining methods assay only a small number of transcripts, whereas single-cell RNA-seq, which measures global gene expression, separates cells from their native spatial context. Here we present Seurat, a computational strategy to infer cellular localization by integrating single-cell RNA-seq data with in situ RNA patterns. We applied Seurat to spatially map 851 single cells from dissociated zebrafish (Danio rerio) embryos and generated a transcriptome-wide map of spatial patterning. We confirmed Seurat's accuracy using several experimental approaches, then used the strategy to identify a set of archetypal expression patterns and spatial markers. Seurat correctly localizes rare subpopulations, accurately mapping both spatially restricted and scattered groups. Seurat will be applicable to mapping cellular localization within complex patterned tissues in diverse systems.
The National Center for Biotechnology Information (NCBI) Reference Sequence (RefSeq) database is a collection of annotated genomic, transcript and protein sequence records derived from data in public sequence archives and from computation, curation and collaboration. We report here on growth of the mammalian and human subsets, changes to NCBI’s eukaryotic annotation pipeline and modifications affecting transcript and protein records. Recent changes to NCBI’s eukaryotic genome annotation pipeline provide higher throughput, and the addition of RNAseq data to the pipeline results in a significant expansion of the number of transcripts and novel exons annotated on mammalian RefSeq genomes. Recent annotation changes include reporting supporting evidence for transcript records, modification of exon feature annotation and the addition of a structured report of gene and sequence attributes of biological interest. We also describe a revised protein annotation policy for alternatively spliced transcripts with more divergent predicted proteins and we summarize the current status of the RefSeqGene project.