Computational analysis and experimental validation of tumor-associated alternative RNA splicing in human cancer.
ABSTRACT A genome-wide computational screen was performed to identify tumor-associated alternative RNA splicing isoforms. A BLAST algorithm was used to compare 11,014 genes from RefSeq with 3,471,822 human expressed sequence tag sequences. The screen identified 26,258 alternative splicing isoforms of which 845 were significantly associated with human cancer, and 54 were specifically associated with liver cancer. Furthermore, canonical GT-AG splice junctions were used significantly less frequently in the alternative splicing isoforms in tumors. Reverse transcription-PCR experiments confirmed association of the alternative splicing isoforms with tumors. These results suggest that alternative splicing may have potential as a diagnostic marker for cancer.
- SourceAvailable from: PubMed Central[Show abstract] [Hide abstract]
ABSTRACT: RRP1B (ribosomal RNA processing 1 homolog B) was first identified as a metastasis susceptibility gene in breast cancer through its ability to modulate gene expression in a manner that can be used to accurately predict prognosis in breast cancer. However, the mechanism(s) by which RRP1B modulates gene expression is currently unclear. Many RRP1B binding candidates are involved in alternative splicing, a mechanism of gene expression regulation that is increasingly recognized to be involved in cancer progression and metastasis. One such target is SRSF1 (serine/arginine-rich splicing factor 1) (SF2/ASF, splicing factor 2/alternative splicing factor), an essential splicing regulator that also functions as an oncoprotein. Earlier studies demonstrated that splicing and transcription occur concurrently and are coupled processes. Given that RRP1B regulates transcriptional activity, we hypothesized that RRP1B also regulates the expression of alternative mRNA isoforms through its interaction with SRSF1. Interaction between RRP1B and SRSF1 was verified by coimmunoprecipitation and coimmunofluorescence. Treatment of cells with transcriptional inhibitors significantly increased this interaction, demonstrating that the association of these two proteins is transcriptionally regulated. To assess the role of RRP1B in the regulation of alternative isoform expression, RNA-sequencing data were generated from control and Rrp1b-knockdown cells. Knockdown of Rrp1b induced a significant change in isoform expression in over 600 genes compared with control cell lines. This was verified by quantitative reverse-transcription PCR using isoform-specific primers. Pathway enrichment analyses identified cell cycle and checkpoint regulation to be those most affected by Rrp1b knockdown. These data suggest that RRP1B suppresses metastatic progression by altering the transcriptome through its interaction with splicing regulators such as SRSF1.Oncogene advance online publication, 22 April 2013; doi:10.1038/onc.2013.133.Oncogene 04/2013; · 8.56 Impact Factor
- [Show abstract] [Hide abstract]
ABSTRACT: Background Researches have been conducted for the identification of differentially expressed genes (DEGs) by generating and mining of cDNA expressed sequence tags (ESTs) for more than a decade. Although the availability of public databases make possible the comprehensive mining of DEGs among the ESTs from multiple tissue types, existing studies usually employed statistics suitable only for two categories. Multi-class test has been developed to enable the finding of tissue specific genes, but subsequent search for cancer genes involves separate two-category test only on the ESTs of the tissue of interest. This constricts the amount of data used. On the other hand, simple pooling of cancer and normal genes from multiple tissue types runs the risk of Simpson's paradox. Here we presented a different approach which searched for multi-cancer DEG candidates by analyzing all pertinent ESTs in all categories and narrowing down the cancer biomarker candidates via integrative analysis with microarray data and selection of secretory and membrane protein genes as well as incorporation of network analysis. Finally, the differential expression patterns of three selected cancer biomarker candidates were confirmed by real-time qPCR analysis. Results Seven hundred and twenty three primary DEG candidates (p-value < 0.05 and lower bound of confidence interval of odds ratio ≧ 1.65) were selected from a curated EST database with the application of Cochran-Mantel-Haenszel statistic (CMH). GeneGO analysis results indicated this set as neoplasm enriched. Cross-examination with microarray data further narrowed the list down to 235 genes, among which 96 had membrane or secretory annotations. After examined the candidates in protein interaction network, public tissue expression databases, and literatures, we selected three genes for further evaluation by real-time qPCR with eight major normal and cancer tissues. The higher-than-normal tissue expression of COL3A1, DLG3, and RNF43 in some of the cancer tissues is in agreement with our in silico predictions. Conclusions Searching digitized transcriptome using CMH enabled us to identify multi-cancer differentially expressed gene candidates. Our methodology demonstrated simultaneously analysis for cancer biomarkers of multiple tissue types with the EST data. With the revived interest in digitizing the transcriptomes by NGS, cancer biomarkers could be more precisely detected from the ESTs. The three candidates identified in this study, COL3A1, DLG3, and RNF43, are valuable targets for further evaluation with a larger sample size of normal and cancer tissue or serum samples.BMC Genomics 01/2012; 13(7). · 4.04 Impact Factor
- [Show abstract] [Hide abstract]
ABSTRACT: The control of gene expression in the course of chronic kidney disease (CKD) is not well addressed. Alternative splicing is a common way to increase complexity of proteins. More than 90% of human transcripts are alternatively spliced. We hypothesised that CKD can induce modification of the alternative splicing machinery. During mutation screening in autosomal dominant polycystic kidney disease, we identified in mononuclear cells (PBMC), an alternative splicing event on the exon 30 of PKD1 gene, the gene implicated in this disease. This alternative splice variant was not correlated with the cystic disease but with CKD. To confirm the association between this variant and CKD, a monocentric clinical study was performed with 3 different groups according to their kidney function (CKD5D, CKD3-5 and normal kidney function). An exon microarray approach was used to highlight splicing events in whole human genome in a normal cell model (fibroblasts) incubated with uremic serum. Alternative splicing variants identified were confirmed by RT-PCR. The splicing variant of the exon 30 of PKD1 was more frequent in PBMCs from patients with CKD compared to control. With the microarray approach, despite the analysis of more than 230 000 probes, we identified 36 genes with an abnormal splicing index evocating splicing event in fibroblasts exposed to uremic serum. Only one abnormal splicing event in one gene, ADH1B, was confirmed by RT-PCR. We observed two alternative spliced genes in two different cell types associated with CKD. Alternative splicing could play a role in the control of gene expression during CKD but it does not seem to be a major mechanism.PLoS ONE 12/2013; 8(12):e82702. · 3.53 Impact Factor
[CANCER RESEARCH 63, 655–657, February 1, 2003]
Computational Analysis and Experimental Validation of Tumor-associated
Alternative RNA Splicing in Human Cancer
Zhining Wang, H. Shuen Lo, Howard Yang, Sheryl Gere, Ying Hu, Kenneth H. Buetow, and Maxwell P. Lee1
Laboratory of Population Genetics, National Cancer Institute, Gaithersburg, Maryland 20877
A genome-wide computational screen was performed to identify tumor-
associated alternative RNA splicing isoforms. A BLAST algorithm was
used to compare 11,014 genes from RefSeq with 3,471,822 human ex-
pressed sequence tag sequences. The screen identified 26,258 alternative
splicing isoforms of which 845 were significantly associated with human
cancer, and 54 were specifically associated with liver cancer. Further-
more, canonical GT-AG splice junctions were used significantly less fre-
quently in the alternative splicing isoforms in tumors. Reverse transcrip-
tion-PCR experiments confirmed association of the alternative splicing
isoforms with tumors. These results suggest that alternative splicing may
have potential as a diagnostic marker for cancer.
Most mammalian genes consist of multiple exons interspersed with
introns. Introns are removed from a primary transcript by RNA
splicing, which generates a mature translatable mRNA (1). In some
cases, more than one mRNA is generated from a single primary
transcript via an alternative splicing mechanism. It is well documented
that alternative RNA splicing plays a biologically important function.
For example, membrane-bound IgM and secreted IgD are produced by
alternative splicing of the same primary transcript (2), and sex deter-
mination is regulated by alternative RNA splicing of sxl, tra, and dsx
(3) in Drosophila. Several studies have analyzed alternative RNA
splicing on a genome-wide basis (4–8). These studies showed that
35–59% of human genes have at least one alternative splicing isoform.
Several studies show that alternative RNA splicing occurs frequently
in human cancer cells (9–11). This study examines alternative splic-
ing associated with cancer on a genome-wide basis and provides
experimental validation for the tumor-associated alternative splicing.
The results suggest that alternative splicing may have potential as a
diagnostic marker for cancer.
MATERIALS AND METHODS
Computational Analysis of Alternative RNA Splicing. Each human Ref-
Seq (NM?xxxxxx) sequence was compared with the human EST2database to
identify alternative splicing isoforms using a BLAST parser. The algorithm
defined an alternative splicing event as when two perfectly aligned regions
between the EST and RefSeq were separated by a gap. An E-value of less than
10?10was required for the flanking aligned sequences, and the gap was
required to be ?10 bp. The following z statistic was used to estimate the
probability that an alternative splicing isoform is associated with tumor cells:
z ? (pt? pn)/?p(1 ? p)(1/Ct? 1/Cn). For a given alternative splicing
isoform, ptand pnare the frequency of that alternative splicing event in tumor
and normal libraries, respectively, and p is the average frequency of the
alternative splicing event. Ctand Cnare the number of ESTs in the tumor and
normal libraries, respectively. We used one-side z test for each alternative
splicing site using the z-statistics defined above. Because the two proportions
ptand pnare approximately normally distributed by the central limit theorem,
the difference pt? pnis also approximately normally distributed. Under the
null hypothesis pt? pn? p, the variance is p(1 ? p)(1/Ct? 1/Cn). We defined
the statistics z as the difference of the proportions divided by SD. So z is an
approximately standard normal variable. An alternative splicing isoform is
considered tumor-associated if it has a P ? 0.05. Tumor-associated alternative
splicing isoforms for the tissue-specific type of cancer were identified by
selecting ESTs from cancer and normal tissues.
Experimental Validation. Primers were designed with Primer3 software
(12) to detect regular and alternative splicing products. Primer sequences are
available online.3Total RNA was isolated using RNAzol B (Tel-Test, Inc.,
Friendswood, TX) according to the manufacturer’s protocol. cDNA synthesis
was carried out using avian myeloblastosis virus reverse transcriptase (Invitro-
gen Corp., Carlsbad, CA) and oligo(dT)12–18primers (Invitrogen Corp.). A
Packard MultiPROBE II EX robotic liquid handling system (Packard Instru-
ment Company, Meriden, CT) was used for PCR. PCR was carried out in a
15-?l reaction in 1? buffer [1.5 mM Mg2?, 0.2 mM dNTP, 0.5 ?M primers, 1
unit of Taq DNA polymerase (Applied Biosystems)] and 2 ?l of cDNA.
Amplification parameters were as follows: 95°C for 10 min; 40 cycles of 95°C
for 45 s; 60°C for 30 s; and 72°C for 60 s, followed by extension at 72°C for
10 min. Reaction products were analyzed by agarose gel electrophoresis.
Reactions containing multiple PCR products were purified using the QIAquick
purification kit (Qiagen, Inc., Valencia, CA).
To confirm the identity of PCR products, 18 PCR fragments were se-
quenced using ABI Prism BigDye Terminator Cycle Sequencing Ready Re-
action Kit (Applied Biosystems) and an ABI Prism 3100 or 3700 Genetic
Analyzer (Applied Biosystems). Sequencing traces were analyzed using Se-
quencher software (Gene Code Corporation, Ann Arbor, MI) and Phred/Phrap
(13). The results confirmed the expected DNA sequence in all cases.
RESULTS AND DISCUSSION
Genome-wide Screening for Tumor-associated Alternative
Splicing Isoforms. Alternative RNA splicing isoforms were identi-
fied for 11,014 RefSeq sequences using a BLAST search algorithm to
compare each RefSeq pairwise with 3,471,822 sequences in the hu-
man EST database. The algorithm identified 26,258 alternative splic-
ing isoforms, one-third of which had two or more ESTs. The majority
of ESTs aligned perfectly with a cognate RefSeq sequence (i.e.,
regular splicing event); however, in many comparisons, two perfectly
aligned regions were separated by a gap (i.e., alternative splicing
event). An EST that does not match its cognate RefSeq perfectly is an
alternative splicing isoform. Each alternative RNA splicing isoform
has unique parameters, including the location of the splice junctions
and the length of the inserted or deleted sequence.
If alternative RNA splicing isoforms are associated with specific
cancer cell types, then they could potentially serve as diagnostic
markers for cancer. This idea was tested as follows. Alternative
splicing isoforms were classified as tumor or normal based on the
source of the mRNA used to construct the relevant cDNA library and
P that the isoform was present at higher frequency in tumor cDNA
libraries than in normal cDNA libraries (see “Materials and Methods”
for details). This analysis showed that 845 alternative RNA splicing
isoforms (3.2%) were significantly (P ? 0.05) associated with tumor
libraries. A complete list of 845 tumor-associated alternative splicing
isoforms is available online.3The alternative splicing isoforms that
Received 9/12/02; accepted 12/2/02.
The costs of publication of this article were defrayed in part by the payment of page
charges. This article must therefore be hereby marked advertisement in accordance with
18 U.S.C. Section 1734 solely to indicate this fact.
1To whom requests for reprints should be addressed, at Phone: (301) 435-1536; Fax:
(301) 402-9325; E-mail: firstname.lastname@example.org.
2The abbreviations used are: EST, expressed sequence tag; RT-PCR, reverse tran-
3Internet address: leelab.nci.nih.gov/ASCA.
are associated with liver, brain, placenta, lung, kidney, and prostate
cancers were also identified (Table 1). Some examples of tumor
associated alternative splicing isoforms in various tissues were listed
in Table 1.
Splice junctions in the alternative RNA splicing isoforms were
analyzed and categorized into eight subgroups. The frequency was
calculated for each splicing subgroup (Fig. 1). Interestingly, GT-AG
splice junctions were used significantly less frequently in alternative
splicing isoforms than in regular splice events. GC-AG splice junc-
tions were used in 16.8% of type I insertions in alternative splicing
isoforms but in ?0.43% of regular splicing events (39-fold increase).
GC-AG usage increased specifically in tumors (P ? 0.002 in ?2test).
Validation of Tumor-associated Alternative Splicing Isoforms
by RT-PCR. To validate the computation-predicted alternative RNA
splicing isoforms, 12 pairs of matched tumor and normal RNA sam-
ples from lung, breast, liver, and prostate tumors were analyzed by
RT-PCR. Primers were designed to specifically amplify regular or
alternative splicing products (Fig. 2). Seventy-six alternative RNA
splicing isoforms were selected for analysis based on a P ? 0.05, or
because they were known to be involved in cancer. Fifty-five (72%)
of the expected products were detected by PCR. Forty-five of these
alternative RNA splicing isoforms were expressed in tumor but not
matched normal samples. For example, three lung cancer samples
expressed alternative splicing products, but none of their matched
normal samples expressed the same mRNA isoform (Fig. 2, Lanes
1–6, the alternative splicing marked with deletion). However, this
pattern was not observed for all samples; alternative splicing products
were detected in tumor and matched normal samples from liver (Fig.
2, Lanes 7–8).
Alternative splicing is differentially associated with specific tissues
or types of cancer. Some tumor-specific splicing isoforms were de-
tected in several tumor tissues, whereas others were less prevalent
(Fig. 3). For example, the NME1 alternative RNA splicing isoform
was detected in all five lung cancer samples but not in any of the
breast or liver cancer samples. However, the RAB1A alternative RNA
splicing isoform was present in all four types of cancer tissues.
Interestingly, three genes, RAB1A, RBBP8, and AXL, expressed
alternative but not regular splicing variants in tumors. Conversely, the
matched normal tissues expressed only regular splicing products. The
absence of regular splicing products in tumors suggests that loss of
regular splicing isoforms may be associated with tumorigenesis.
Our validation results show that 45 of 76 (59.2%) selected alterna-
tive splicing isoforms were tumor specific. This observation is con-
sistent with previous reports that alternative splicing often increases in
Fig. 1. Splicing junctions in eight alternative splicing subgroups. The splicing junction
sequences were analyzed as described in “Materials and Methods” and “Results and
Discussion.” The percentage of GT-AG splice junctions are indicated for the junction
unique to the alternative transcript. NA indicates that splice junction sequences are not
Fig. 2. Experimental validation of alternative splicing using RT-PCR. N and T denote
normal and tumor, respectively. Regular and deleted splicing products are indicated.
Arrows represent primers and open boxes designate exons. The horizontal lines between
exons represent introns. Diagonal lines indicate alternative splicing events.
Fig. 3. Summary of tumor-associated alternative splicing validation experiments. Four
breast, one liver, five lung, and two prostate cancer samples and their matched normal
samples were used in the validation experiment. The Y axis represents the number of
sample pairs that specifically expressed alternative splicing isoforms in the tumor sample
and expressed regular splicing products in the normal sample. Regular splicing products
may or may not have been detected in the tumors. Data are shown from 10 representative
genes. Additional data are available online.3
Table 1 Identification of tumor-associated alternative splicing isoforms in various tissues
No. of ESTs
No. of significant
CYP2C8, KNG, AMBP, AHSG, ANG, TDO2, BHMT2, ADH4, HFL1
MAD2L1, RAD1, MAPK8IP1, PCMT1, MCK, POLR2I, PKIA, DLEU2
LDHC, ADM, ATP6S1, ENG, IFNGR2, GPS1, DSPG3, HHLA2, UROD
LRP1, SFTPC, TCEB1L, SKP1A, RCL, HCG1V.9
CD9, PCK2, HIBCH, TINAG, TSPAN-1
TGM4, KLK2, PPP2R5A, HT012
RAB1A, NME1, NEU2, TOP1, MRE11A, NCOA1, RNPEP, IRAK1
aThis is the number of alternative splicing isoforms that are significantly associated with tumors in the given tissue; see “Materials and Methods” for details.
ALTERNATIVE RNA SPLICING IN CANCER
human cancer (9–11), although the cause of the increase is not known.
One possibility is that splicing fidelity is compromised in tumors
because of mutations or altered gene expression affecting the splicing
machinery. The mechanisms stimulating tumor-associated alternative
splicing may be related to microsatellite instability in cancer cells,
which is caused by defects in mismatch repair (14). Future experi-
ments are required to test this hypothesis.
We thank Drs. David Kaufman, Kent Hunter, and Tzu-ling Tseng for
helpful comments on the manuscript.
1. Hastings, M. L., and Krainer, A. R. Pre-mRNA splicing in the new millennium. Curr.
Opin. Cell Biol., 13: 302–309, 2001.
2. Early, P., Rogers, J., Davis, M., Calame, K., Bond, M., Wall, R., and Hood, L. Two
mRNAs can be produced from a single immunoglobulin mu gene by alternative RNA
processing pathways. Cell, 20: 313–319, 1980.
3. Nagoshi, R. N., McKeown, M., Burtis, K. C., Belote, J. M., and Baker, B. S. The
control of alternative splicing at genes regulating sexual differentiation in D, mela-
nogaster. Cell, 53: 229–236, 1988.
4. Huang, Y. H., Chen, Y. T., Lai, J. J., Yang, S. T., and Yang, U. C. PALS db: putative
alternative splicing database. Nucleic Acids Res., 30: 186–190, 2002.
5. Modrek, B., Resch, A., Grasso, C., and Lee, C. Genome-wide detection of alternative
splicing in expressed sequences of human genes. Nucleic Acids Res., 29: 2850–2859,
6. Brett, D., Hanke, J., Lehmann, G., Haase, S., Delbruck, S., Krueger, S., Reich, J., and
Bork, P. EST comparison indicates 38% of human mRNAs contain possible alterna-
tive splice forms. FEBS Lett., 474: 83–86, 2000.
7. Mironov, A. A., Fickett, J. W., and Gelfand, M. S. Frequent alternative splicing of
human genes. Genome Res., 9: 1288–1293, 1999.
8. Xie, H., Zhu, W., Wasserman, A., Grebinskiy, V., Olson, A., and Mintz, L. Compu-
tational analysis of alternative splicing using EST tissue information. Genomics, 80:
9. Stimpfl, M., Tong, D., Fasching, B., Schuster, E., Obermair, A., Leodolter, S., and
Zeillinger, R. Vascular endothelial growth factor splice variants and their prognostic
value in breast and ovarian cancer. Clin. Cancer Res., 8: 2253–2259, 2002.
10. Adams, M., Jones, J. L., Walker, R. A., Pringle, J. H., and Bell, S. C. Changes in
tenascin-C isoform expression in invasive and preinvasive breast disease. Cancer
Res., 62: 3289–3297, 2002.
11. Lee, M. P., and Feinberg, A. P. Aberrant splicing but not mutations of TSG101 in
human breast cancer. Cancer Res., 57: 3131–3134, 1997.
12. Rozen, S., and Skaletsky, H. Primer3 on the WWW for general users and for biologist
programmers. Methods Mol. Biol., 132: 365–386, 2000.
13. Ewing, B., Hillier, L., Wendl, M. C., and Green, P. Base-calling of automated
sequencer traces using phred. I. Accuracy assessment. Genome Res., 8: 175–185,
14. Kinzler, K. W., and Vogelstein, B. Lessons from hereditary colorectal cancer. Cell,
87: 159–170, 1996.
ALTERNATIVE RNA SPLICING IN CANCER