The linked units of 5S rDNA and U1 snDNA of razor shells (Mollusca: Bivalvia: Pharidae).
ABSTRACT The linkage between 5S ribosomal DNA and other multigene families has been detected in many eukaryote lineages, but whether it provides any selective advantage remains unclear. In this work, we report the occurrence of linked units of 5S ribosomal DNA (5S rDNA) and U1 small nuclear DNA (U1 snDNA) in 10 razor shell species (Mollusca: Bivalvia: Pharidae) from four different genera. We obtained several clones containing partial or complete repeats of both multigene families in which both types of genes displayed the same orientation. We provide a comprehensive collection of razor shell 5S rDNA clones, both with linked and nonlinked organisation, and the first bivalve U1 snDNA sequences. We predicted the secondary structures and characterised the upstream and downstream conserved elements, including a region at -25 nucleotides from both 5S rDNA and U1 snDNA transcription start sites. The analysis of 5S rDNA showed that some nontranscribed spacers (NTSs) are more closely related to NTSs from other species (and genera) than to NTSs from the species they were retrieved from, suggesting birth-and-death evolution and ancestral polymorphism. Nucleotide conservation within the functional regions suggests the involvement of purifying selection, unequal crossing-overs and gene conversions. Taking into account this and other studies, we discuss the possible mechanisms by which both multigene families could have become linked in the Pharidae lineage. The reason why 5S rDNA is often found linked to other multigene families seems to be the result of stochastic processes within genomes in which its high copy number is determinant.
ORIGINAL ARTICLE
J Vierna1, KT Jensen2, A Martínez-Lage1 and AM González-Tizón1
1Department of Molecular and Cell Biology, Evolutionary Biology Group (GIBE), Universidade da Coruña, La Coruña, Spain
and 2Marine Ecology, Department of Biological Sciences, Aarhus University, Ole Worms Allé 1, Aarhus C, Denmark
Heredity advance online publication, 2 March 2011; doi:10.1038/hdy.2010.174
Keywords: birth-and-death evolution; regulatory regions; 5S ribosomal RNA; U1 small nuclear RNA; linkage; Ensis
Introduction
The 5S ribosomal RNA molecule (5S rRNA) is a
component of the large subunit of ribosomes, encoded
by the 5S ribosomal DNA (5S rDNA) and transcribed by
RNA polymerase III. The eukaryote 5S rDNA is a
multigene family, typically composed of hundreds of
repeats of an approximately 120 nucleotides (nts) RNA
coding region (hereafter, 5S) and an intergenic spacer
(IGS) usually referred to as nontranscribed spacer (NTS).
The first nts downstream the 5S are transcribed as part of
the primary RNA and deleted during RNA maturation
(Sharp et al., 1984; Sharp and Garcia, 1988), but they are
considered as part of the NTS.
The 5S rDNA is characterised by a flexible organisa-
tion, as it has been found in clusters composed of similar
or divergent tandemly arranged repeats (differences
mainly occur within the NTS; for example, Shippen-
Lentz and Vezza, 1988), and in clusters of 5S rDNA
repeats tandemly linked to other multigene families (for
example, Cross and Rebordinos, 2005; Freire et al., 2010;
Cabral-de-Mello et al., 2010). A dispersed organisation of
5S rDNA has also been reported (Morzycka-Wroblewska
et al., 1985 and references therein), and some species
were found to have more than one type of organisation
within the genome (Little and Braaten, 1989).
The 5S rDNA multigene family was thought to be
characterised by low levels of intragenomic divergence
in virtually all species because of the concerted evolution
of ribosomal multigene families (see Eickbush and
Eickbush, 2007 for a review). Nevertheless, the occurrence of divergent variants of 5S rDNA within a genome has been described in animals, plants and fungi (for example, Fernandez et al., 2005; Rooney and Ward, 2005; Caradonna et al., 2007), and in some cases, differences in the RNA coding regions were found to correspond to tissue-specific variants (Peterson et al., 1980). Therefore, recent studies have pointed to a more complex evolutionary scenario in which birth-and-death processes generate new 5S rDNA variants that may be homogenised by unequal crossing-overs and gene conversions. For instance, in Ensis razor shells (Schumacher, 1817), the long-term evolution of 5S rDNA was found to be driven by birth-and-death processes and selection, and it was suggested that homogenising mechanisms were also taking part within each variant in each species (Vierna et al., 2009). Later on, it was proposed that the levels of intragenomic divergence, much higher within the 5S rDNA than within the major ribosomal genes, were due to the more flexible organisation of 5S rDNA, meaning that homogenisation processes were more efficient within the array(s) of major ribosomal genes,
Received 12 July 2010; revised 18 October 2010; accepted
8 November 2010
Correspondence: J Vierna or AM González-Tizón, Department of
Molecular and Cell Biology, Evolutionary Biology Group (GIBE),
Universidade da Coruña, A Fraga, 10, La Coruña E-15008, Spain.
E-mails: jvierna@udc.es; joaquinvierna@gmail.com or hakuna@udc.es
Heredity (2011), 1–16
& 2011 Macmillan Publishers Limited All rights reserved 0018-067X/11
www.nature.com/hdy
as they may occur in a smaller number. The long-term evolution of both rDNA regions was then proposed to be driven by a mixed process of concerted evolution, birth-and-death evolution and purifying selection, as described by Nei and Rooney (2005) (Vierna et al., 2010).
Most eukaryotic genes are transcribed into precursor messenger RNAs that must undergo splicing, an essential step of gene expression. During precursor messenger RNA splicing, introns are removed from the precursor messenger RNA and exons are ligated together to form mRNA (Will and Lührmann, 2005). Splicing is performed by the spliceosomes, ribonucleoprotein complexes consisting of small nuclear RNAs and several proteins. The U1 small nuclear RNA molecule is a component of the major spliceosome, essential for the interaction with the 5′ splice site of introns (Zhuang and Weiner, 1986). This molecule is encoded by the U1 small nuclear DNA (U1 snDNA), which consists of an RNA coding region (hereafter, U1) and an IGS (when it is organised in tandem repeats). U1 snDNA, transcribed by RNA polymerase II, is a multigene family with a variable number of repeats in each genome (around tens of repeats in the metazoan species studied by Marz et al., 2008). Although not much information is available about the organisation of U1 snDNA, it was found to be linked to other multigene families, such as 5S rDNA (Pelliccia et al., 2001), other spliceosomal snDNA families (Marz et al., 2008) and organised in the same array together with 5S rDNA repeats and other spliceosomal snDNA (Manchado et al., 2006). In general, however, clustered copies of distinct or the same small nuclear RNA coding genes are not common in metazoan genomes (Marz et al., 2008).
The evolution of spliceosomal snDNA has been recently studied in two different surveys, covering insect species (Mount et al., 2007) and several other metazoan groups (Marz et al., 2008), and appears not to be a simple issue. In insects, it is governed by several concurrent forces, namely purifying selection, unequal crossing-overs, gene conversions and birth-and-death processes (Mount et al., 2007). Distinguishable U1 snDNA paralogs differentially expressed throughout development have been described in some species (for example, Lo and Mount, 1990), but the snDNA paralog groups seem not to be stable over a long evolutionary time, although they appear independently in several clades (Marz et al., 2008).
The linkage between 5S rDNA and U1 snDNA has only been reported in one crustacean (Pelliccia et al., 2001) and in one fish (Manchado et al., 2006). In this survey, we report linked units of 5S rDNA and U1 snDNA in 10 razor shell species (Mollusca: Bivalvia: Pharidae) from four different genera. We obtained new data about the genomic organisation of both multigene families in these animals, and studied the genesis and evolution of the 5S rDNA–U1 snDNA linked units. Using the Ensis sequences available from DDBJ/EMBL/GenBank and the new sequences obtained, we provide a comprehensive collection of razor shell 5S rDNA variants, including their secondary structures and the characterisation of putative pseudogenes. We also report the first Bivalvia U1 snDNA sequences, including their predicted secondary structures. Finally, several putative regulatory regions of both multigene families were studied in detail.
Materials and methods
Animals
We selected 11 species belonging to family Pharidae
(Adams and Adams, 1858; Mollusca: Bivalvia, Table 1).
Though a greater sampling effort was made on genus
Ensis, we tried to represent the whole family by selecting
species from its different subtaxa. Thus, from subfamily Cultellinae (Davies, 1935), we studied eight Ensis
species and one Ensiculus (Adams, 1860). The species
Siliqua patula (Dixon, 1789) was also included in the
analysis, as genus Siliqua (Mühlfeld, 1811) may represent
a separate subfamily from the Cultellinae (see Cosel,
1993). From the other subfamily, Pharinae (Adams and
Adams, 1858), we took into consideration the species
Pharus legumen (Linné, 1758). Two homonymous species,
Ensis minor (Chenu, 1843) and E. minor (Dall, 1899) were
studied in this survey, and hereafter they will be referred
to as E. minor (Chenu) and E. minor (Dall). All taxon
names follow Cosel (1993) and Cosel (2009), when
applicable. Razor shells were provided by several
colleagues and preserved in 100% ethanol until species
identification, except the Ensiculus cultellus (Linné, 1758)
sample that consisted of an ethanol-preserved piece of
muscle tissue, and Ensis goreensis (Clessin, 1888) from
which only dried tissue was available.
DNA extraction, PCR, cloning and sequencing
DNA extractions were done from muscle tissue using the
NucleoSpin Tissue kit (Macherey-Nagel, North Rhine-
Westphalia, Germany). Using the primers 5S-Univ-F and
5S-Univ-R (Vierna et al., 2009), we serendipitously
amplified complete U1 snDNA sequences flanked by
two partial 5S rDNA repeats in the species Ensis magnus
(Schumacher, 1817) and P. legumen. From these se-
quences, different primer pairs annealing at the 5S and
U1 regions of razor shells were designed using Gene-
Fisher (Giegerich et al., 1996) (Table 2). PCR reactions
were conducted in a final volume of 20 µl using the 2×
Taq Master Mix RED (VWR/Ampliqon, Skovlunde,
Denmark), applying the following conditions: an initial
denaturation step at 94 °C for 3 min followed by 40 cycles
of denaturation at 94 °C for 20 s, annealing at the
temperatures indicated in Table 2 for 20 s, extension at
72 °C for 1 min, and a final extension at 72 °C for 5 min.
Amplification products were run on 1% agarose gels,
stained with a 0.5 µg/ml solution of ethidium bromide,
and imaged under UV light. They were cloned using the
TOPO TA cloning kit (Invitrogen, Carlsbad, CA, USA). A
subset of transformant colonies from each cloning
reaction was analysed by PCR in order to check the
insert size. From each PCR, we selected one clone per
species when only one band was retrieved (that is, in all
cases except in one of the PCRs of Ensis macha (Molina,
1782) and E. cultellus individuals, in which case we
obtained two slightly different bands, so two clones were
sequenced). Sequencing was performed at Macrogen
(Seoul, South Korea) using both T3 and T7 primers
(forward and reverse) included in the cloning kit.
Bioinformatic analyses
Electropherograms were inspected in BioEdit 7.0.9.0
(Hall, 1999). The Blast 2 sequences tool (available at
www.ncbi.nlm.nih.gov/blast/bl2seq/wblast2.cgi) was
used to compare the ends of both the forward and
reverse sequences obtained from each clone, which were
subsequently overlapped by hand. Sequences obtained
were subjected to a sequence-similarity search against
the DDBJ/EMBL/GenBank nucleotide collection data-
bases using the blastn algorithm. Sequences similar to
other 5S, U1 and their intergenic spacers were deposited
in the DDBJ/EMBL/GenBank databases under the
accession numbers specified in Table 1. The pair-wise
comparisons were also performed in the Blast 2
sequences tool and multiple sequence alignments were
carried out in ClustalW 2.0 (Larkin et al., 2007), and
manually adjusted for local optimisation in MEGA 4.0.2
(Tamura et al., 2007). The number of polymorphic sites
was retrieved from DnaSP 5.10.0 (Librado and Rozas,
2009). Lengths and p-distances were obtained from
MEGA 4.0.2 (Tamura et al., 2007). In p-distance calcula-
tion, gaps were not considered, and 1000 bootstrap
replicates were performed for the estimation of standard
errors.
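As an illustration of the p-distance calculation described above, the following minimal Python sketch computes the proportion of differing sites between two aligned sequences while skipping gapped positions, and estimates a standard error by resampling alignment columns. It is not the MEGA implementation: the function names, the pairwise treatment of gaps and the fixed random seed are assumptions made for this example only.

```python
import random

GAP = "-"  # assumption: only '-' marks a gap in the alignment


def comparable_columns(seq_a, seq_b):
    """Return the aligned site pairs in which neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
            if a != GAP and b != GAP]


def p_distance(seq_a, seq_b):
    """Proportion of differing sites among the gap-free aligned sites."""
    columns = comparable_columns(seq_a, seq_b)
    if not columns:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in columns) / len(columns)


def bootstrap_se(seq_a, seq_b, replicates=1000, seed=0):
    """Standard error of the p-distance from column resampling with replacement."""
    rng = random.Random(seed)
    columns = comparable_columns(seq_a, seq_b)
    estimates = []
    for _ in range(replicates):
        sample = [rng.choice(columns) for _ in columns]
        estimates.append(sum(a != b for a, b in sample) / len(sample))
    mean = sum(estimates) / replicates
    variance = sum((x - mean) ** 2 for x in estimates) / (replicates - 1)
    return variance ** 0.5
```

A group mean such as those reported in Table 3 would correspond to averaging `p_distance` over all sequence pairs within a group; that step is left out here for brevity.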
In order to search for putative regulatory conserved
elements, sequences upstream and downstream the 5S
and U1 regions were analysed. Searches were performed
considering the first 100 nt upstream and downstream
the RNA coding regions. In the case of U1 upstream
analyses, two sequences from the gastropod molluscs
Aplysia californica and Lottia gigantea (provided by Manja
Marz, Philipps-Universität, Marburg, Germany) were
selected and included in the analyses. Conserved motifs
were identified by MEME (Bailey and Elkan, 1994) and
they were manually compared with published regula-
tory elements.
5S and U1 sequences were folded in RNAstructure
5.02 (Reuter and Mathews, 2010) at 15 °C, and we used
the efn2 function (Mathews et al., 1999) to recalculate the
ΔG values. The consensus secondary structures were
obtained from the RNAalifold webserver (Hofacker,
2003).
We used PALM (Chen et al., 2009) to select nucleotide
substitution models and to infer maximum likelihood
phylogenies. The best-fit model of nucleotide substitu-
tion was directly selected using Modeltest 3.7 (Posada
and Crandall, 1998), applying the Akaike information
criterion. Phylogenies were constructed by PALM using
PhyML (Guindon and Gascuel, 2003). Starting trees were
obtained by the BioNJ algorithm (Gascuel, 1997) and
Table 1 DNA sequences studied and specimen details
Species | sbf. | Identification | Museum code | Sampling site | Accession numbers (primer pairs: 5S-Univ, 5S-U1, U1-5S, U1-U1)
Ensis magnus Schumacher, 1817 | Cul. | Cosel, 2009 | MNHN 40042 | Bonden, Sweden | FN908876a, FN908883a, FN908894a, FN908904a
E. magnus Schumacher, 1817 | Cul. | Cosel, 2009 | - | Ortigueira, Spain | FM201454-56b
E. siliqua (Linné, 1758) | Cul. | Cosel, 2009 | MNHN 40047 | Vigo, Spain | FM201457-62b, FM211689b, FN908890a, FN908900a, FN908908a
E. ensis (Linné, 1758) | Cul. | Cosel, 2009 | MNHN 40044 | La Capte, France | FM211690-91b, FN908885a, FN908896a, FN908905a
E. goreensis (Clessin, 1888) | Cul. | Cosel, 2009 | MNHN 17948 | Gorée, Senegal | FM211692b
E. minor (Chenu, 1843) | Cul. | Cosel, 2009 | MNHN 40045 | La Capte, France | FN908886a, FN908897a, FN908906a
E. directus (Conrad, 1843) | Cul. | Cosel, 2009 | MNHN 40049 | Long Pond, Canada | FN908884a, FN908895a
E. directus (Conrad, 1843) | Cul. | Cosel, 2009 | - | Various localities, Denmark | AM904878-933b
E. macha (Molina, 1782) | Cul. | Cosel, 2009 | MNHN IM-2009-8446 | Puerto Lobos, Argentina | FN908887a, FN908888a, FN908898a, FN908907a
E. macha (Molina, 1782) | Cul. | Cosel, 2009 | MNHN 40048 | Playa Dichato, Chile | FM201452b
E. macha (Molina, 1782) | Cul. | - | - | Concepción, Chile | AM940998-1009b/c, AM906171-80b/c, AM906203-8b/c
E. minor Dall, 1899 | Cul. | Cosel, 2009 | MNHN IM-2009-8447 | Christmas Bay, USA | FN908889a, FN908899a
Ensiculus cultellus (Linné, 1758) | Cul. | John Taylor | BMNH 20070223 | Moreton Bay, Australia | FN908881a, FN908882a, FN908893a, FN908903a
Siliqua patula (Dixon, 1789) | Cul. | Dan Ayres | MNHN IM-2009-8448 | Ocean City, USA | FN908892a, FN908902a, FN908910a
Pharus legumen Linné, 1758 | Phar. | Cosel, 2009 | MNHN 40051 | Bandol, France | FN908877-80a, FN908891a, FN908901a, FN908909a
Abbreviations: a, new sequences; b, sequences previously studied by Vierna et al. (2009); c, sequences previously studied by Fernández-Tajes and Méndez (2009); Cul., Cultellinae; Phar., Pharinae; sbf., subfamily.
Table 2 Primer pairs used in this survey
Primer | Sequence/reference | T | a.r. | a.p.
5S-Univ-F | Vierna et al. (2009) | 50 °C | 5S | 13–32
5S-Univ-R | Vierna et al. (2009) | 50 °C | 5S | 36–55
5S-U1-F | 5′-GTCTACGGCCATATCACGTT | 61 °C | 5S | 1–20
5S-U1-R | 5′-GTTAGCGCGAACGCAGVC | 61 °C | U1 | 142–159
U1-5S-F | 5′-VCTGCGTTCGCGCTAVCC | 65 °C | U1 | 143–160
U1-5S-R | 5′-GGTATTCCCAGGCGGTCAC | 65 °C | 5S | 87–105
U1-U1-F | 5′-GCAATGGAAGGGCCTCCTCCT | 61 °C | U1 | 49–69
U1-U1-R | 5′-TTCGGTTGGGCTGATGCCTG | 61 °C | U1 | 72–91
Abbreviations: a.p., annealing position within each RNA coding region; a.r., annealing region; T, annealing temperature; U1, U1 small nuclear RNA coding region; 5S, 5S ribosomal RNA coding region.
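Some primers in Table 2 contain the degenerate base V, which in the IUPAC nucleotide code stands for A, C or G. The short Python sketch below expands a degenerate primer into the concrete sequences it represents and checks that each primer length agrees with its annealing position range. The pairing of primers with annealing positions follows our reading of Table 2 and should be treated as an assumption; the function names are illustrative only.

```python
from itertools import product

# IUPAC codes needed for the primers listed in Table 2 (V = A, C or G)
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "V": "ACG"}

# name: (sequence 5'->3', first annealing position, last annealing position)
# Assumption: positions are paired with primers as read from Table 2.
PRIMERS = {
    "5S-U1-F": ("GTCTACGGCCATATCACGTT", 1, 20),
    "5S-U1-R": ("GTTAGCGCGAACGCAGVC", 142, 159),
    "U1-5S-F": ("VCTGCGTTCGCGCTAVCC", 143, 160),
    "U1-5S-R": ("GGTATTCCCAGGCGGTCAC", 87, 105),
    "U1-U1-F": ("GCAATGGAAGGGCCTCCTCCT", 49, 69),
    "U1-U1-R": ("TTCGGTTGGGCTGATGCCTG", 72, 91),
}


def expand(primer):
    """List every non-degenerate sequence encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


def length_matches(name):
    """True if the primer length equals the span of its annealing positions."""
    seq, start, end = PRIMERS[name]
    return len(seq) == end - start + 1


if __name__ == "__main__":
    for name, (seq, start, end) in PRIMERS.items():
        print(name, len(expand(seq)), "variant(s)", length_matches(name))
```

A primer with one V expands to three sequences and a primer with two V positions to nine, which is why degenerate positions are usually kept to a minimum.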
gaps were treated as unknown characters. The number of
substitution rate categories employed was eight, and the
bootstrap test (Felsenstein, 1985) was used to estimate
node support (1000 replicates). Maximum parsimony
phylogenies were obtained from PAUP*4.0b10 (Swof-
ford, 2002) as detailed in Vierna et al. (2010). Following
Marz et al. (2008), we calculated phylogenetic networks
in addition to phylogenetic trees, using the neighbour-
net algorithm (Bryant and Moulton, 2004), implemented
as part of the SplitsTree4 package (Huson and Bryant,
2006).
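Model selection under the Akaike information criterion, as applied above, amounts to computing AIC = 2k − 2 lnL for each candidate model and keeping the model with the lowest value. The Python sketch below shows that arithmetic only. The log-likelihoods and parameter counts are hypothetical numbers chosen for illustration, not results from this study, and branch lengths are left out of k on the assumption that they add the same constant to every model.

```python
def aic(log_likelihood, n_free_params):
    """Akaike information criterion: 2k - 2 lnL (lower is better)."""
    return 2 * n_free_params - 2 * log_likelihood


def best_model(fits):
    """Return the model name with the lowest AIC.

    fits maps a model name to a (lnL, k) tuple.
    """
    return min(fits, key=lambda name: aic(*fits[name]))


# Hypothetical fits: (log-likelihood, number of free substitution parameters)
EXAMPLE_FITS = {
    "JC": (-1250.0, 0),
    "HKY": (-1230.5, 4),
    "GTR+G": (-1228.0, 9),
}

if __name__ == "__main__":
    for name, (lnl, k) in EXAMPLE_FITS.items():
        print(name, aic(lnl, k))
    print("best:", best_model(EXAMPLE_FITS))
```

In this made-up example the richer GTR+G model has the highest likelihood, but its extra parameters are penalised enough that HKY is preferred.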
Different gene tandem arrangements were drawn using pDRAW32 (AcaClone software, http://www.acaclone.com/) and we edited all phylogenetic trees in FigTree 1.2.2 (Andrew Rambaut, http://tree.bio.ed.ac.uk/software/figtree/).
Results
Sequence characterisation
The identification of 5S, U1 and spacer sequences was
performed by comparing them against the DDBJ/
EMBL/GenBank nucleotide collection databases, as
explained above. For the sake of clarity, all spacer sequences downstream of a 5S will be referred to as NTS, and all spacers downstream of a U1, as IGS. All complete 5S sequences were 120 nt long and NTSs ranged between 283 and 986 nt. All complete U1 sequences were 164 nt long, except the ones obtained from S. patula, which had a nucleotide insertion at position 37. IGSs ranged between 222 and 422 nt. The DDBJ/EMBL/GenBank accession numbers
of the sequences studied are listed in Table 1.
Average GC contents were 55.1% for the 5S region,
54.8% for the U1 region, 38.8% for the NTS and 41.9% for
the IGS. The number of polymorphic sites in the RNA
coding regions was S = 32 for the 5S region and S = 20 for
the U1 region.
Hereafter, clones containing partial or complete repeats of both multigene families will be referred to as mixed clones.
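The two summary statistics reported above, GC content and the number of polymorphic sites (S), can be sketched in a few lines of Python. This is an illustrative calculation, not the DnaSP procedure: the function names are ours, and excluding alignment columns that contain a gap is an assumption of this sketch.

```python
def gc_content(seq):
    """Percentage of G and C among the unambiguous bases of a sequence."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence has no A, C, G or T")
    return 100.0 * sum(b in "GC" for b in bases) / len(bases)


def polymorphic_sites(alignment):
    """Count alignment columns showing more than one nucleotide.

    Columns containing a gap ('-') are skipped (assumption of this sketch).
    """
    count = 0
    for column in zip(*alignment):
        states = {b.upper() for b in column}
        if "-" in states:
            continue
        if len(states) > 1:
            count += 1
    return count


if __name__ == "__main__":
    toy_alignment = ["ACGT", "ACGA", "AC-T"]  # hypothetical sequences
    print(gc_content("GGCCAATT"))             # 50.0
    print(polymorphic_sites(toy_alignment))   # 1
```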
Alignments
An initial alignment of the NTS region showed that the
NTSs of razor shells were highly divergent, so sequences
had to be grouped separately, according to their
similarity. After performing several combinations, we
divided the NTSs into seven supergroups and 17 groups.
Each supergroup was named using a Roman numeral
and each group was denoted by a Greek letter following
Vierna et al. (2009). Supergroups and groups contained
sequences belonging to one or more species. Similarly,
IGS sequences were divided into two groups, one
containing all Ensis and Ensiculus and the other one
containing Pharus and Siliqua IGSs. The species composi-
tion, lengths and mean P-distances for each spacer group
and supergroup were recorded in Table 3.
Considering only the spacer sequences from mixed clones, we were able to align all IGSs from Ensis, Ensiculus, Pharus and Siliqua individuals, but the divergence among them was evident; however, the last part of the alignment (containing the upstream region of the next 5S repeat) revealed a more conserved region. In contrast, the analysis of the NTSs from mixed clones
(upstream U1 sequences) revealed that these spacers
were less conserved than the IGSs and could not be
aligned at once. In this case, we were able to align all
Ensis sequences (from supergroup II), except an NTS
from the species E. macha (from supergroup V). The
NTSs from the species P. legumen and S. patula, belonging
to supergroup IV, could also be aligned together.
However, E. cultellus NTSs could not be aligned to Ensis,
P. legumen or S. patula sequences.
In the alignment of Ensis U1–U1 clones (Supplemen-
tary File S1), all Ensis IGSs displayed a region of
similarity with δ- and γ-NTSs from the species E. directus
(Conrad, 1843). This region was located at the end of the
IGS (just upstream of the 5S region) and resembled the last
portion of δ- and γ-NTSs. Downstream of this 5S region, in
the NTS, we found another region of similarity with
δ- and γ-NTSs, and downstream of it there was a
fragment resembling a 5S (probably an old pseudogen-
ised copy). Even though this pattern was only found in
Ensis species, the first portion of the alignment that
corresponded to the U1–IGS–5S sequence (positions 1
to 427, Supplementary file S1), could be aligned to
E. cultellus clones, and with more difficulties, to
P. legumen and S. patula ones (as explained above).
Upstream elements
A conserved region was identified at −25 nt from both
the 5S rDNA and U1 snDNA transcription start sites
(Supplementary file S2) and named the −25 region. It was a
TATA-like motif in the 5S upstream sequences (Supple-
mentary file S3a), and upstream the U1 region (Supple-
mentary file S3b), it was an A/G-rich motif: AAAAG in
Ensis and E. cultellus, GGGGA in gastropods, AAATG in
P. legumen and GTAAG upstream S. patula putative-
pseudogenised U1 sequences (see U1 predicted second-
ary structures). Another motif (AAAGC, Supplementary
file S2) was identified just upstream the U1 snDNA
transcription start site, identical to the one found in
Drosophila melanogaster (Lo and Mount, 1990) and in
other organisms (see Discussion), but it only occurred in
some of the razor shell sequences. Finally, a less
conserved region was found upstream of the −25 region
in U1 snDNA upstream sequences (Supplementary file
S2), centred at −44 nt.
Although it was not possible to align all Ensis NTSs at
once, we were able to align the 100nt upstream the
transcription start site of 5S rDNA of Ensis species. These
stretches were the last part of either NTS or IGS
sequences. We failed to include the other Pharidae
species in this alignment, as sequences were not
conserved among genera.
Internal regulatory regions
5S internal control regions (ICR I to IV) were compared
with those described in D. melanogaster (Sharp and
Garcia, 1988). As some ICRs coincided with the primer-
annealing regions, some sequences were excluded from
the comparisons, and sequences amplified with the 5S-
Univ primers (Table 2) were only included in the ICR IV
analysis. Results were as follows: 12/16 matches within
ICR I (positions 3–18); 7/8 matches within ICR II
(positions 37–44); 11/14 matches within ICR III (positions
48–61); and 14/21 matches within the ICR IV region
(positions 78–98). The degree of conservation of these
elements within razor shells was of 14/16, 8/8, 13/14
and 15/21 matches, respectively. Similarly, positions
50–61 (Box A), 80–89 (Box C) and 62–79 (intermediate
sequence) were compared with those described by Pieler
et al. (1987) in Xenopus laevis, obtaining 6/12, 6/10 and
12/18 matches. Within razor shells, the matches obtained
were 9/12, 7/10 and 14/18.
Six U1 internal regions that appear to be conserved
across metazoa (Zhuang and Weiner, 1986; Marz et al.,
2008) were analysed in all razor shell sequences. They
were compared with the two gastropod sequences (see
above), the ones from the insect D. melanogaster (Lo and
Mount, 1990), and those from crustaceans Asellus
aquaticus and Proasellus coxalis (Barzotti et al., 2003).
Considering as a reference the E. magnus U1 sequence
(see U1 predicted secondary structures), they correspond
to the following positions: the 5′ end (includes the 5′
splice site, Zhuang and Weiner, 1986 and references
therein); 28–33 (within the U1–70K protein binding site,
Query et al., 1989); the stem-loop II positions 53–55, 65–72
and 84–86 (U1-A protein binding region, Scherly et al.,
1989); and positions 124–132 (include the Sm protein
binding region, named ‘domain A’ by Branlant et al.,
1982). The most conserved region was the 5′ end (11 nt)
that was identical in all sequences. Positions 28–33 were
identical in all sequences, but L. gigantea had an
additional G inserted between the first and the second
nt. Positions 65–72 were also identical, except in the last
nt. Finally, the 124–132 region was also conserved with
the exception of the sixth and last nt. The remaining two
regions were conserved at positions 54–55 and 84–85 in
all molluscs and arthropods.
Termination signals
One or more TTTT stretches (required for 5S rDNA
transcription termination, Bogenhagen and Brown, 1981;
Huang and Maraia, 2001; Richard and Manley, 2009)
occurred within the first 20nt of all NTSs, except for
Table 3 Intergenic spacer groups and supergroups
NTS group | Species | Clade | n | Length | Mean P-distance
Supergroup I | - | - | 72 | 286–329 | 0.135±0.010
α | Ensis directus | A | 41 | 321–329 | 0.011±0.003
β | Ensis macha | A | 18 | 314–318 | 0.019±0.004
ζ | Ensis magnus, E. siliqua, E. ensis, E. goreensis | E | 13 | 286–315 | 0.042±0.006
Supergroup II | - | - | 28 | 407–965 | 0.240±0.012
γ | Ensis directus | A | 11 | 444–654 | 0.023±0.004
δ | Ensis directus | A | 4 | 407 | 0.002±0.002
η* | Ensis magnus, E. siliqua, E. ensis, E. minor (Chenu) | E | 9 | 893–965 | 0.046±0.004
θ* | Ensis directus, E. macha, E. minor (Dall) | A | 4 | 926–960 | 0.127±0.008
Supergroup III | - | - | 14 | 405–620 | 0.141±0.009
ε1 | Ensis macha | A | 6 | 603 | 0.011±0.003
ε2 | Ensis macha | A | 5 | 618–620 | 0.004±0.002
ξ* | Ensiculus cultellus | - | 3 | 405 | 0.010±0.004
Supergroup IV | - | - | 8 | 355–550 | 0.241±0.013
μ | Pharus legumen | - | 3 | 548–550 | 0.005±0.002
ο* | Siliqua patula | - | 2 | 355 | 0
λ* | Pharus legumen | - | 3 | 419–420 | 0
Supergroup V | - | - | - | - | -
ι* | Ensis macha | A | 1 | 776 | -
Supergroup VI | - | - | 2 | 209–369 | 0.077±0.018
π* | Siliqua patula | - | 1 | 369 | -
ρ* | Siliqua patula | - | 1 | 209 | -
Supergroup VII | - | - | 2 | 283–332 | 0.366±0.029
κ | Pharus legumen | - | 1 | 332 | -
ν* | Ensiculus cultellus | - | 1 | 283 | -

IGS group | Species | n | Length | Mean P-distance
Supergroup Ensis–Ensiculus | - | 15 | 222–422 | 0.203±0.015
Ensis spp. | Ensis directus, E. macha, E. minor (Dall, 1899), Ensis magnus, E. siliqua, E. ensis, E. minor (Chenu, 1843) | 13 | 225–231 | 0.177±0.014
Ensiculus cultellus | Ensiculus cultellus | 2 | 421–422 | 0.002±0.002
Supergroup Pharus–Siliqua | - | 5 | 236–342 | 0.193±0.017
Siliqua patula | Siliqua patula | 2 | 236 | 0.064±0.015
Pharus legumen | Pharus legumen | 3 | 342 | 0.007±0.004
Abbreviations: A, American clade; E, European clade (Ensis phylogenetic clades according to Vierna et al. (unpublished data)); IGS, intergenic spacer (downstream of a U1 small nuclear RNA coding region); n, sample size; NTS, nontranscribed spacer (intergenic spacer downstream of a 5S ribosomal RNA coding region).
Asterisks (*) indicate nontranscribed spacers linked to U1 small nuclear DNA.