Identification and experimental validation of splicing
regulatory elements in Drosophila melanogaster reveals
functionally conserved splicing enhancers in metazoans
ANGELA N. BROOKS,1,4 JULIE L. ASPDEN,1,2,4 ANNA I. PODGORNAIA,1,5 DONALD C. RIO,1,2
and STEVEN E. BRENNER1,3,6
1Department of Molecular and Cell Biology, University of California, Berkeley, California 94720, USA
2Center for Integrative Genomics, University of California, Berkeley, California 94720, USA
3Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
RNA sequence elements involved in the regulation of pre-mRNA splicing have previously been identified in vertebrate genomes
by computational methods. Here, we apply such approaches to predict splicing regulatory elements in Drosophila melanogaster
and compare them with elements previously found in the human, mouse, and pufferfish genomes. We identified 99 putative
exonic splicing enhancers (ESEs) and 231 putative intronic splicing enhancers (ISEs) enriched near weak 5′ and 3′ splice sites of
constitutively spliced introns, distinguishing between those found near short and long introns. We found that a significant
proportion (58%) of fly enhancer sequences were previously reported in at least one of the vertebrates. Furthermore, 20% of
putative fly ESEs were previously identified as ESEs in human, mouse, and pufferfish; while only two fly ISEs, CTCTCT and
TTATAA, were identified as ISEs in all three vertebrate species. Several putative enhancer sequences are similar to characterized
binding-site motifs for Drosophila and mammalian splicing regulators. To provide additional evidence for the function of
putative ISEs, we separately identified 298 intronic hexamers significantly enriched within sequences phylogenetically
conserved among 15 insect species. We found that 73 putative ISEs were among those enriched in conserved regions of the
D. melanogaster genome. The functions of nine enhancer sequences were verified in a heterologous splicing reporter,
demonstrating that these sequences are sufficient to enhance splicing in vivo. Taken together, these data identify a set of
predicted positive-acting splicing regulatory motifs in the Drosophila genome and reveal regulatory sequences that are present
in distant metazoan genomes.
Keywords: Drosophila; splicing; splicing regulatory elements; ESE; ISE
The splicing of pre-mRNAs is an important level in the
regulation of metazoan gene expression. The precise excision
of introns and the joining of flanking exons are essential
for accurate protein synthesis. Introns contain several sequence
elements required for pre-mRNA splicing: 5′ and 3′
splice sites (5′ss, 3′ss), the branch point, and the polypyrimidine
tract. Splice sites can be classified as "weak" or "strong"
according to their similarity to consensus motifs. The degree
to which a splice site is used is thought to increase as
its strength increases (Lim and Burge 2001; Roca et al. 2005),
exemplified by the fact that constitutively spliced introns have
stronger splice sites than alternatively spliced introns (Koren
et al. 2007). There are also splicing regulatory elements (SREs)
within the pre-mRNA, which influence splicing efficiency
(Lim and Burge 2001). SREs are named according to their
function and location: exonic splicing enhancers (ESEs), ex-
onic splicing silencers (ESSs), intronic splicing enhancers
(ISEs), or intronic splicing silencers (ISSs). ESEs are thought
to be recognized most often by serine–arginine-rich (SR) proteins,
whereas ESSs and ISSs are most often recognized by heterogeneous
nuclear ribonucleoproteins (hnRNPs) (Chen and
Manley 2009). Some hnRNPs and other RNA-binding proteins,
such as Nova and Fox, have been shown to recognize
ISEs (Chen and Manley 2009). The specific combination of
SREs and their distances from splice junctions contributes
4These two authors contributed equally to this work.
5Present address: Computational and Systems Biology Program, Mas-
sachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA.
Article published online ahead of print. Article and publication date are
RNA (2011), 17:1884–1894. Published by Cold Spring Harbor Laboratory Press. Copyright © 2011 RNA Society.
to splicing outcome (Zhang et al. 2009). In addition, cur-
rent models for splice-site selection suggest that splice sites
are recognized through interactions of the spliceosome across
exons, termed "exon definition," when exons are flanked by
long introns and across introns, termed "intron definition,"
when introns are short (Robberson et al. 1990; Romfo et al.
2000; Lim and Burge 2001; Yeo et al. 2004). Therefore, the
relative size of introns is important for splice-site selection.
Putative SREs have previously been identified with in
vitro SELEX experiments, as well as in vivo functional se-
lection of minigene reporter libraries (Shi et al. 1997; Liu
et al. 1998; Amarasinghe et al. 2001; Wang et al. 2004;
Smith et al. 2006; Blanchette et al. 2009). Previous com-
putational approaches have also been successful at identi-
fying SREs. Since RNA-binding domains typically bind to
six to eight nucleotides, computational searches focus on
finding enriched RNA elements of this size in functionally
relevant locations (Fedorov et al. 2001). One such approach
is the Relative Enhancer and Silencer Classification by Unan-
imous Enrichment (RESCUE) method (Fairbrother et al.
2002), which has been applied to numerous genomes in-
cluding mammals, fish, and plants (Fairbrother et al. 2002;
Yeo et al. 2004; Zhang and Chasin 2004; Pertea et al. 2007).
The RESCUE method detects motifs enriched near weak
splice sites of constitutively spliced introns based on the
principle that neighboring sequences act as enhancers to
compensate for poor splice-site recognition. Other approaches
use the premise that functional splicing regulatory elements
are under stringent evolutionary constraint (Goren et al. 2006;
Kabat et al. 2006; Voelker and Berglund 2007; Yeo et al. 2007;
Churbanov et al. 2009).
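The enrichment idea behind RESCUE-type searches can be illustrated with a short Python sketch that counts hexamers in sequences flanking weak and strong splice sites and scores each hexamer by the difference in its relative frequency. This is a simplified stand-in and not the published RESCUE procedure, which applies its own statistics; the function names, the frequency-difference score, and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter

def count_kmers(seqs, k=6):
    """Count every overlapping k-mer across a list of sequences."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set("ACGT"):  # skip k-mers with ambiguous bases
                counts[kmer] += 1
    return counts

def enrichment(weak_seqs, strong_seqs, k=6):
    """Return {kmer: freq_weak - freq_strong}, each frequency normalized per set.

    Hypothetical scoring scheme: a positive value means the k-mer is
    relatively more common near weak splice sites.
    """
    weak = count_kmers(weak_seqs, k)
    strong = count_kmers(strong_seqs, k)
    n_weak = sum(weak.values()) or 1      # avoid division by zero on empty input
    n_strong = sum(strong.values()) or 1
    return {
        kmer: weak[kmer] / n_weak - strong[kmer] / n_strong
        for kmer in set(weak) | set(strong)
    }

# Toy input (hypothetical sequences, not data from this study)
weak_flanks = ["AACTCTCTGG", "TTCTCTCTAA"]
strong_flanks = ["AAGGGTGGCC", "TTGACGTGAA"]
scores = enrichment(weak_flanks, strong_flanks)
top = max(scores, key=scores.get)  # hexamer most enriched near weak sites
```

In a real analysis the raw frequency difference would be replaced by a statistical test with a significance cutoff, and the flanking windows would be defined relative to annotated splice junctions.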
Here we used a combination of these two methods, the
RESCUE approach and a statistical model to define genomic
regions under evolutionary sequence constraint, to predict
SREs in Drosophila melanogaster. To define constrained se-
quences, we used 15 highly diverged insect species to identify
phylogenetically conserved intronic elements. By compar-
ing our set of fly-splicing regulatory sequences with those
found in vertebrates, we have identified sequence elements
whose function is conserved across distant animal species.
Interestingly, 58% of the putative enhancer elements iden-
tified here in Drosophila have also been identified in
vertebrates. Several of the motifs are predicted binding sites
of both Drosophila and mammalian RNA-binding proteins.
Compared with vertebrate genomes with characterized splic-
ing regulatory elements, the D. melanogaster genome has
the unique feature of a large proportion of short introns.
We have taken advantage of this feature to ask whether there
are different regulatory sequences present near short and
long introns. A selection of putative SREs was tested for
functionality in vivo in a minigene reporter. The majority of
sequences examined had significant effects on the level
of splicing, indicating the robustness of the computational
approach used in this study.
Long and short introns have different distributions
of splice-site strengths
The number and type of regulatory elements near an intron
is dependent upon intron length and splice-site strength
(Lim and Burge 2001; Yeo et al. 2004; Xiao et al. 2007);
therefore, we looked for potential biases in SREs arising
from intron length. The length distribution of constitu-
tively spliced introns in D. melanogaster consists of a peak
with a mode at 69 nt and a long tail (Fig. 1A). This length
distribution is different from that of human introns, with
a mode of 1500 nt (Lim and Burge 2001; Yeo et al. 2004).
Given the intron-length distribution, we divided constitutively
spliced introns into two categories: short (≤80 nt; 22,329
introns) and long (>80 nt; 15,474 introns).
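The partition described above can be sketched in a few lines of Python. The 80-nt threshold comes from the text; the function names and the example lengths are hypothetical and serve only to show the mechanics.

```python
from collections import Counter

SHORT_MAX = 80  # nt; short introns are <= 80 nt, long introns are > 80 nt

def split_by_length(intron_lengths, short_max=SHORT_MAX):
    """Partition intron lengths into (short, long) lists at the threshold."""
    short = [n for n in intron_lengths if n <= short_max]
    long_ = [n for n in intron_lengths if n > short_max]
    return short, long_

def mode_length(intron_lengths):
    """Return the most frequent intron length."""
    return Counter(intron_lengths).most_common(1)[0][0]

# Hypothetical lengths, not data from this study
lengths = [61, 65, 65, 69, 69, 69, 72, 80, 81, 150, 1200, 5400]
short, long_ = split_by_length(lengths)
```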
Using MaxEntScan (Yeo and Burge 2004) to score splice-site
strengths, we found that longer introns have significantly
stronger 5′ and 3′ splice sites than shorter
introns (P < 2.26e-16 for 5′ and 3′ splice sites, Wilcoxon rank sum test).
FIGURE 1. Splice sites of short constitutively spliced introns are weaker than those of long constitutively spliced introns in Drosophila. (A) Length
distribution of constitutively spliced introns in Drosophila. (B) A significant difference in the distribution of MaxEntScan splice-site scores (Yeo
and Burge 2004) of short and long introns (P < 2.26e-16, Wilcoxon rank sum test). The first quartile, median, and third quartile of splice-site
scores are given in the table. Higher MaxEntScan scores correspond to a stronger splice-site sequence.
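A comparison of this kind, between splice-site scores of short and long introns, can be sketched with a two-sided Wilcoxon rank sum test. The version below is a minimal pure-Python sketch that uses the normal approximation with a tie correction and no continuity correction; it is not necessarily the implementation used in this study, and the function name and the example scores are hypothetical.

```python
import math

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test (normal approximation).

    Returns (U statistic for x, z score, two-sided p-value). Tied values
    receive average ranks and the variance includes the tie correction.
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n1, n2 = len(x), len(y)
    n = n1 + n2
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1                      # pooled[i:j] is one group of tied values
        avg_rank = (i + 1 + j) / 2.0    # mean of the 1-based ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_x += avg_rank * sum(1 for _, g in pooled[i:j] if g == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:                      # every value tied: no evidence of a shift
        return u, 0.0, 1.0
    z = (u - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, z, p

# Hypothetical splice-site scores, not data from this study
short_scores = [6.1, 7.0, 7.4, 8.2, 8.8]
long_scores = [8.5, 9.1, 9.6, 10.0, 10.4]
u, z, p = rank_sum_test(short_scores, long_scores)
```

A negative z score here indicates that the first sample (short introns) tends to have lower scores than the second. For the sample sizes in the text, an established statistics package would be the more practical choice.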
Robida M, Sridharan V, Morgan S, Rao T, Singh R. 2010. Drosophila
polypyrimidine tract-binding protein is necessary for spermatid
individualization. Proc Natl Acad Sci 107: 12570–12575.
Roca X, Sachidanandam R, Krainer AR. 2005. Determinants of the
inherent strength of human 5′ splice sites. RNA 11: 683–698.
Romfo CM, Alvarez CJ, van Heeckeren WJ, Webb CJ, Wise JA. 2000.
Evidence for splice site pairing via intron definition in Schizo-
saccharomyces pombe. Mol Cell Biol 20: 7955–7970.
Sanford JR, Coutinho P, Hackett JA, Wang X, Ranahan W, Caceres JF.
2008. Identification of nuclear and cytoplasmic mRNA targets for
the shuttling protein SF2/ASF. PLoS ONE 3: e3369. doi: 10.1371/
Sanford JR, Wang X, Mort M, Vanduyn N, Cooper DN, Mooney SD,
Edenberg HJ, Liu Y. 2009. Splicing factor SFRS1 recognizes
a functionally diverse landscape of RNA transcripts. Genome Res
Schneider TD, Stephens RM. 1990. Sequence logos: a new way to
display consensus sequences. Nucleic Acids Res 18: 6097–6100.
Schwartz S, Ast G. 2010. Chromatin density and splicing destiny: on
the cross-talk between chromatin structure and splicing. EMBO J
Schwartz SH, Silva J, Burstein D, Pupko T, Eyras E, Ast G. 2008.
Large-scale comparative analysis of splicing signals and their
corresponding splicing factors in eukaryotes. Genome Res 18:
Shi H, Hoffman BE, Lis JT. 1997. A specific RNA hairpin loop
structure binds the RNA recognition motifs of the Drosophila SR
protein B52. Mol Cell Biol 17: 2649–2657.
Siepel A, Bejerano G, Pedersen JS, Hinrichs AS, Hou M, Rosenbloom
K, Clawson H, Spieth J, Hillier LW, Richards S, et al. 2005.
Evolutionarily conserved elements in vertebrate, insect, worm, and
yeast genomes. Genome Res 15: 1034–1050.
Smith PJ, Zhang C, Wang J, Chew SL, Zhang MQ, Krainer AR. 2006.
An increased specificity score matrix for the prediction of SF2/ASF-
specific exonic splicing enhancers. Hum Mol Genet 15: 2490–2508.
Tweedie S, Ashburner M, Falls K, Leyland P, McQuilton P, Marygold
S, Millburn G, Osumi-Sutherland D, Schroeder A, Seal R, et al.
2009. FlyBase: enhancing Drosophila Gene Ontology annotations.
Nucleic Acids Res 37: D555–D559.
Voelker RB, Berglund JA. 2007. A comprehensive computational
characterization of conserved mammalian intronic sequences re-
veals conserved motifs associated with constitutive and alternative
splicing. Genome Res 17: 1023–1033.
Wang Z, Rolish ME, Yeo G, Tung V, Mawson M, Burge CB. 2004.
Systematic identification and analysis of exonic splicing silencers.
Cell 119: 831–845.
Xiao X, Wang Z, Jang M, Burge CB. 2007. Coevolutionary networks of
splicing cis-regulatory elements. Proc Natl Acad Sci 104: 18583–
Yeo G, Burge CB. 2004. Maximum entropy modeling of short
sequence motifs with applications to RNA splicing signals.
J Comput Biol 11: 377–394.
Yeo G, Hoon S, Venkatesh B, Burge CB. 2004. Variation in sequence
and organization of splicing regulatory elements in vertebrate
genes. Proc Natl Acad Sci 101: 15700–15705.
Yeo GW, Van Nostrand EL, Liang TY. 2007. Discovery
and analysis of evolutionarily conserved intronic splicing regulatory
elements. PLoS Genet 3: e85. doi: 10.1371/journal.pgen.0030085.
Zhang XH, Chasin LA. 2004. Computational definition of sequence
motifs governing constitutive exon splicing. Genes Dev 18: 1241–
Zhang XH, Arias MA, Ke S, Chasin LA. 2009. Splicing of designer
exons reveals unexpected complexity in pre-mRNA splicing. RNA
RNA, Vol. 17, No. 10