In silico mining and characterization of simple sequence repeats from gilthead sea bream (Sparus aurata) expressed sequence tags (EST-SSRs); PCR amplification, polymorphism evaluation and multiplexing and cross-species assays.
ABSTRACT We screened for simple sequence repeats (SSRs) found in ESTs derived from an EST-database development project ('Marine Genomics Europe' Network of Excellence). Different motifs of di-, tri-, tetra-, penta- and hexanucleotide SSRs were evaluated for variation in length and position in the expressed sequences, relative abundance and distribution in gilthead sea bream (Sparus aurata). We found 899 ESTs that harbor 997 SSRs (4.94%). On average, one SSR was found per 2.95 kb of EST sequence and the dinucleotide SSRs are the most abundant accounting for 47.6% of the total number. EST-SSRs were used as template for primer design. 664 primer pairs could be successfully identified and a subset of 206 pairs of primers was synthesized, PCR-tested and visualized on ethidium bromide stained agarose gels. The main objective was to further assess the potential of EST-SSRs as informative markers and investigate their cross-species amplification in sixteen teleost fish species: seven sparid species and nine other species from different families. Approximately 78% of the primer pairs gave PCR products of expected size in gilthead sea bream, and as expected, the rate of successful amplification of sea bream EST-SSRs was higher in sparids, lower in other perciforms and even lower in species of the Clupeiform and Gadiform orders. We finally determined the polymorphism and the heterozygosity of 63 markers in a wild gilthead sea bream population; fifty-eight loci were found to be polymorphic with the expected heterozygosity and the number of alleles ranging from 0.089 to 0.946 and from 2 to 27, respectively. These tools and markers are expected to enhance the available genetic linkage map in gilthead sea bream, to assist comparative mapping and genome analyses for this species and further with other model fish species and finally to help advance genetic analysis for cultivated and wild populations and accelerate breeding programs.
Emmanouella Vogiatzi a,b,1, Jacques Lagnel a,1, Victoria Pakaki a,c, Bruno Louro d, Adelino V.M. Canario d,
Richard Reinhardt e, Georgios Kotoulas a, Antonios Magoulas a, Costas S. Tsigenopoulos a,⁎
a Institute of Marine Biology and Genetics (IMBG), Hellenic Centre for Marine Research (HCMR), Heraklion Crete, Greece
b Department of Genetics and Molecular Biology, Democritian University of Thrace, Alexandroupolis, Greece
c Department of Biology, University of Crete, Heraklion Crete, Greece
d Centre of Marine Sciences (CCMAR), University of Algarve, Gambelas, Faro, Portugal
e Max Planck Institute for Molecular Genetics (MPIMG), Berlin-Dahlem, Germany
Article info
Received 14 September 2010
Received in revised form 9 January 2011
Accepted 12 January 2011
© 2011 Elsevier B.V. All rights reserved.
1. Introduction
Since first described in the 1980s, microsatellites (also known as
simple sequence repeats—SSRs, or short tandem repeats—STRs) have
become one of the most important molecular genetic markers
currently in use (Ellegren, 2004); they have been widely accepted
as a common tool employed in population genetics, molecular
ecology, systematics, biodiversity and conservation studies and
more recently in linkage mapping, traits association studies and
comparative genome analysis (Avise, 2004; Hirschhorn and Daly,
2005; Selkoe and Toonen, 2006). Microsatellites refer to specific DNA
sequences consisting usually of one to six base pair motifs tandemly
repeated and which are abundant, codominant, hypervariable (extra-
polymorphic and multi-allelic) and highly reproducible (Schlotterer,
2000). They are present not only within most eukaryotic genomes,
mainly in non-coding (intergenic and intronic) regions, but also
within coding (exonic) regions (Tóth et al., 2000).
Marine Genomics 4 (2011) 83–91
⁎ Corresponding author at: Institute of Marine Biology and Genetics (IMBG), Hellenic
Centre for Marine Research (HCMR), P.O. Box 2214, 715 00 Heraklion Crete, Greece.
Tel.: +30 2810 337854; fax: +30 2810 337820.
E-mail address: firstname.lastname@example.org (C.S. Tsigenopoulos).
1 Equal contribution by these authors.

In the last decade, the plethora of expressed sequence tag (EST)
databases already available, as well as most importantly, those that
may be created for a given species or genus, have proved to be a
valuable source to rapidly obtain microsatellite loci (hereafter EST-
simple sequence repeats or EST-SSRs), thereby reducing financial
constraints and time-consuming library preparation and screening.
Indeed, the development of microsatellites from genomic libraries is
limited to those motifs for which the initial hybridization or
enrichment was performed and in most cases the PCR primers used
to amplify SSRs are species-specific, which implies that markers
developed in one taxon cannot be easily and readily applied to others.
On the other hand, because EST-SSRs are exonic, their flanking regions
are expected to be more conserved across closely related species
(Slate et al., 2007) and represent a potential source of Type I (coding
and functionally-important) markers. Nevertheless, there is a bias of
this in silico detection and validation of exonic microsatellites in these
databases in favor of economically important plants and to a lesser
extent in aquaculture fish and mollusks (Vasemägi et al., 2005;
Guyomard et al., 2006; Wang et al., 2007, 2008; Vidal et al., 2009; Yu
and Li, 2007; Bouza et al., 2008; Qiu et al., 2009).
The great potential for the use of EST-SSRs not only in population
genetic analyses, but also for their transferability to phylogenetically-
close species, is highlighted by the fact that in general a small number
of markers, less than 20, are genotyped. Although the estimated ratio
of EST-SSRs in any in silico analysis is highly dependent on the search
parameters (type of the motif, length of the microsatellite, and
flanking regions), 1.5–7.3% of all ESTs are thought to harbor SSRs (Ju
et al., 2005). Taking into account that many of these EST-SSRs (at least
25% and up to 80–90%) are typically found to be polymorphic, it
therefore seems likely that EST databases containing around 1000
sequences could provide enough microsatellite markers to be used in
population genetic analyses (Slate et al., 2007). When we compare
these rates to the typical SSR isolation and amplification rates, it is
expected that even modest EST collections could prove to be of great
value to evolutionary biologists (Ellis and Burke, 2007).
The gilthead sea bream Sparus aurata, together with the European
sea bass (Dicentrarchus labrax), is one of the leading marine species in
European and Mediterranean waters from an aquaculture viewpoint;
more than 125,000 t were produced by the aquaculture industry in
2007 whereas commercial fishery catches were no higher than 7500 t
(FAO, 2009). Genomic resources have grown exponentially over the
last few years, from the development of 24 dinucleotide microsatellite
loci (Batargias et al., 1999; Launey et al., 2003; Brown et al., 2005) and
their use in several population genetics studies (Palma et al., 2001;
to the construction of a first-generation genetic linkage map (Franch
et al., 2006) and an RH map (Senger et al., 2006). Comparison between
the two maps showed that there is good consistency, with the
majority of markers in a single linkage group (LG) also located in the
same RH group (Sarropoulou et al., 2007). Moreover, there are several
ongoing parentage and pilot QTL analyses aiming to identify the
genetic loci involved in the determination of economically important
traits (Castro et al., 2007, 2008; Navarro et al., 2009). Therefore,
although a large set of microsatellite markers already exists for
will greatly benefit from the addition of markers found in expressed
regions of the genome.
In this paper, we present the results of collaborative research
carried out in the “Fish & Shellfish” node of the Marine Genomics
Europe (MGE) Network of Excellence (GOCE CT-2004-505403,
http://www.marine-genomics-europe.org). New cDNA libraries
and EST collections were produced for gilthead sea bream (S. aurata)
to increase the genomic information in this species. More specifically,
an EST dataset was produced for gilthead sea bream that currently
accounts for a little less than half of those reported in NCBI entries (Entrez
taxonomy browser) with 29,895 sea bream ESTs (Louro, 2010). In
order to develop tools and polymorphic markers for population
genetics studies, there was a first screening for simple sequence
repeats (SSRs) found in the ESTs of the MGE database using
bioinformatic pipeline analyses, after which their use in gilthead
sea bream (S. aurata) population studies was evaluated and also their
potential for cross-species amplification in fifteen teleost fish
currently being studied in IMBG was investigated. Gilthead sea bream is
progressing fast within the group of fish which have advanced
species has benefited from existing information from other model
fish, the increasing number of EST data available for this species will
facilitate maximization of the potential for the development of SSR
and SNP-based EST maps, and hasten the implementation of more
SSRs in genetic studies.
2. Materials and methods
2.1. Detection of SSRs and primer design
A total of 18,196 unigenes (5268 contigs and 12,928 singletons)
from the MGE EST-dataset was used (Louro et al., 2010) and we
searched for the presence of perfect SSR motifs based on the MIcro
ipk-gatersleben.de/misa/misa.html. The search was restricted to
motifs of at least 18 bp length, except for the hexanucleotides;
therefore, sequences with a minimum number of 9 repeats for
dinucleotide, 6 for trinucleotide, 5 for tetranucleotide and 4 for
penta- and hexanucleotide repeats were taken into account. EST-
SSRs were analyzed for redundancy with NOBLAST (Lagnel et al.,
2009). The ESTScan2 program (Iseli et al., 1999; Lottaz et al., 2003)
is identified, then the SSR position in coding regions, the 5′ or the 3′
untranslated regions (UTRs) is depicted. The program uses a Markov
model to take into account the bias in hexanucleotide usage found in
coding regions relative to non-coding regions and is able to detect
and correct sequencing errors (allowing insertions and deletions
when these improve the coding region statistics) that lead to frame
shifts even in low-quality sequences.
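The repeat-number thresholds given above (9 for di-, 6 for tri-, 5 for tetra- and 4 for penta- and hexanucleotides) can be illustrated with a short regular-expression search. This is only a minimal sketch of a perfect-SSR scan under those thresholds, not the MISA tool or the authors' pipeline; the function names and the example sequence are hypothetical.

```python
import re

# Minimum number of repeats per motif length, as stated in the text.
MIN_REPEATS = {2: 9, 3: 6, 4: 5, 5: 4, 6: 4}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ACAC')."""
    k = len(motif)
    return not any(k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k))


def find_perfect_ssrs(seq):
    """Return sorted (start, end, motif, repeats) tuples; end is exclusive."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # Capture one motif, then require at least (min_rep - 1) further exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):  # skips e.g. 'AA' or 'ACAC' reported as longer motifs
                hits.append((m.start(), m.end(), motif, (m.end() - m.start()) // motif_len))
    return sorted(hits)


# Hypothetical EST fragment with one dinucleotide and one trinucleotide repeat.
est = "GGATCG" + "AC" * 10 + "TTGCT" + "AAG" * 6 + "CGT"
for hit in find_perfect_ssrs(est):
    print(hit)
```

The scan reports only uninterrupted repeats, which matches the restriction to perfect SSR motifs described in the text; imperfect or interrupted repeats would need a different approach.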
The non-redundant EST-SSRs were used as template and were
analyzed with the Primer3 software (Rozen and Skaletsky, 2000)
with the following criteria: a) primer length from 18 to 27 bases,
b) annealing temperature (Tm) from 55 to 63 °C, and c) product size
from 100 to 250 bp. Compound microsatellites were considered those
present in the same EST and distant by a maximum of 25 bp. The
output file from Primer3 was further analyzed in order to lessen the
chance of tandem repeats occurring in the primer sequences and to limit
self- and pair-complementarity. An in-house relational EST/EST-SSR
database was developed with the outputs of MISA, EST collection, EST
annotation, BLAST results, Primer3 and metadata (like coordinate
system). All the above procedures were automated using a custom
bioinformatics pipeline (available upon request).
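The design windows and the 25 bp rule for compound microsatellites can be expressed as two small checks. The sketch below is illustrative only: it does not call Primer3, the melting temperatures are assumed to be supplied by whatever tool computed them, and the function names and coordinate convention (0-based start, exclusive end) are assumptions.

```python
def passes_primer_criteria(fwd, rev, tm_fwd, tm_rev, product_size):
    """Check one primer pair against the length, Tm and product-size windows in the text."""
    length_ok = all(18 <= len(p) <= 27 for p in (fwd, rev))
    tm_ok = all(55.0 <= tm <= 63.0 for tm in (tm_fwd, tm_rev))
    size_ok = 100 <= product_size <= 250
    return length_ok and tm_ok and size_ok


def group_compound(ssrs, max_gap=25):
    """Group SSRs of one EST, given as (start, end) pairs, when at most max_gap bp apart."""
    groups = []
    for start, end in sorted(ssrs):
        # Distance between this SSR and the end of the previous one in the current group.
        if groups and start - groups[-1][-1][1] <= max_gap:
            groups[-1].append((start, end))
        else:
            groups.append([(start, end)])
    return groups


# Hypothetical coordinates: the first two SSRs are 10 bp apart, the third is 60 bp away.
print(group_compound([(10, 30), (40, 60), (120, 140)]))
print(passes_primer_criteria("ACGTACGTACGTACGTAC", "TGCATGCATGCATGCATGCA", 58.0, 60.5, 180))
```

Groups with more than one member correspond to compound microsatellites in the sense used here; single-member groups are simple SSRs.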
2.2. Cross-species amplifications
EST-SSRs was investigated in fifteen marine teleost fish (Table 1)
currently under study in IMBG. DNA extraction was performed using a
simple salt procedure (Miller et al., 1988) from ethanol preserved
muscle tissue. Two individuals per species were used in each PCR
reaction. PCRs were carried out in 20 μl reaction volumes containing 1.2
units of Taq polymerase, 1× Taq buffer, 1.5 mM MgCl2, 30 pmol of
dNTPs, 15 pmol of each primer, and approximately 20 ng of DNA.
Amplifications were performed on a PTC-200 (MJ Research) and PCR
conditions were as follows: preliminary denaturation at 95 °C (3 min),
followed by 35 cycles of strand denaturation at 94 °C (1 min), annealing
at 55 °C (1 min) and primer extension at 72 °C (1 min), and a final
extension at 72 °C (10 min).
PCR products were electrophoretically separated on 2% agarose
gels with ethidium bromide in 1× TAE (Tris-Acetate–EDTA) buffer,
visualized under UV light, photographically documented and scored.
PCR performance of homologous microsatellites in different species
was scored following a one digit code: (0) for reactions that did not
give any product or resulted in a smear which does not allow product
size determination, (1) for reactions that resulted in PCR products of
very different size from the expected or multiple sharp PCR bands, and
(2) for successful reactions, i.e., when clear bands in the expected size
range (+/−50 bp) were detected.
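The one-digit scoring code can be written as a small function. This is a sketch of one possible reading of the rule, in which a single band within 50 bp of the expected size scores 2 and any other visible banding pattern scores 1; the function name and the representation of a gel lane as a list of band sizes are assumptions.

```python
def score_pcr(band_sizes, expected_size, tolerance=50):
    """Return 0, 1 or 2 for one PCR, following the one-digit code in the text.

    band_sizes: sizes (bp) of the sharp bands read from the gel; an empty list
    stands for no product or a smear that does not allow size determination.
    """
    if not band_sizes:
        return 0
    if len(band_sizes) == 1 and abs(band_sizes[0] - expected_size) <= tolerance:
        return 2
    # Product of very different size, or several sharp bands.
    return 1


# Hypothetical lanes for a locus with an expected product of 180 bp.
for lane in ([], [400], [170, 320], [200]):
    print(lane, score_pcr(lane, 180))
```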
2.3. EST-SSR polymorphism screening in gilthead sea bream
For gilthead sea bream's loci that seemed to amplify well in
agarose gels, the respective reverse primers were re-ordered labeled
with FAM, NED, VIC or PET fluorescent dyes. Microsatellite amplifi-
cation reactions were performed in a 10 μl volume containing 20 ng of
genomic DNA, 10 pmol of each locus-specific primer, 1× Taq buffer
and 1 U of Super Taq polymerase (Enzyme Technologies, Ltd).
Gradient polymerase chain reaction for magnesium chloride concen-
trations and annealing temperatures was used to optimize conditions
for each locus (Table 2). Reaction conditions included an initial
denaturation at 94 °C for 5 min, 30–35 cycles of 30 s at 94 °C, 30 s at Ta
(see Table 2) and 30 s at 72 °C, and the last step at 72 °C for 5 min. PCR
products were diluted 1/20, mixed with Hi-Di™ formamide (Applied
Biosystems) and the GeneScan™ 500 LIZ™ Size Standard (Applied
Biosystems) as internal size standard and loaded on an ABI 3700 DNA
Analyzer (Applied Biosystems).
Population analysis was performed with a wild-origin population
of 32 individuals from Messolongi, W. Greece; the number of alleles
per locus, the allele size range and the observed and expected
heterozygosities were calculated using GENETIX v4.04 software
(Belkhir et al., 1998). Deviations from the Hardy–Weinberg (HW)
equilibrium and linkage disequilibrium between pairs of loci were
estimated with FSTAT v2.9.3.2 (Goudet, 1995) in which P values from
multiple comparisons are corrected using the sequential Bonferroni
method (Rice, 1989). Finally, the presence of null alleles was
investigated according to the repeat motif of each SSR using MICRO-
CHECKER (Van Oosterhout et al., 2004) at the 95% confidence interval
for the Monte Carlo simulations with 10,000 randomizations of data.
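The per-locus statistics named above (number of alleles, observed and expected heterozygosity) follow directly from genotype counts. The sketch below is not GENETIX; it computes both the plain expected heterozygosity and Nei's small-sample corrected estimate, since the text does not say which of the two is tabulated. The genotype data are hypothetical.

```python
from collections import Counter


def heterozygosities(genotypes):
    """Return (n_alleles, H_obs, H_exp, H_exp_unbiased) for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    """
    n = len(genotypes)
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    freqs = [c / (2 * n) for c in counts.values()]
    h_obs = sum(a != b for a, b in genotypes) / n
    h_exp = 1.0 - sum(p * p for p in freqs)
    # Nei's correction for small samples: multiply by 2n / (2n - 1).
    h_exp_unbiased = h_exp * 2 * n / (2 * n - 1)
    return len(counts), h_obs, h_exp, h_exp_unbiased


# Hypothetical allele sizes (bp) for four individuals at one locus.
sample = [(100, 100), (100, 102), (102, 104), (100, 104)]
print(heterozygosities(sample))
```

A marked shortfall of observed relative to expected heterozygosity at a locus is the kind of pattern that motivates the null-allele check with MICRO-CHECKER.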
2.4. Development of multiplex PCRs
Following PCR optimization and polymorphism analysis of each
primer pair on the wild sea bream population, EST-SSRs were sorted
according to PCR product size and number of alleles. Initially, 0.3 μM
of each primer was used in all multiplex reactions and primer
concentrations were modified subsequently to obtain peak heights
between 600 and 3000 relative fluorescent units (RFU) for each
microsatellite marker (Navarro et al., 1843). Besides the primer
concentration modifications, we also performed tests using an
annealing temperature gradient (50 to 60 °C) in four different
MgCl2 concentrations (1.5, 2.0, 2.5 and 3.0 mM). Each multiplex set
was tested on four wild sea bream DNA samples and genotyping
results were compared against the initial results of each locus
PCR conditions comprised an initial denaturation at 95 °C for
3 min, followed by 35 cycles of 95 °C for 45 s, the annealing
temperature for 45 s and 72 °C for 45 s, with a final extension at 72 °C
for 10 min. Reactions were carried out in a final volume of 12.0 μl with
the following component concentrations: 1× PCR Buffer (100 mM Tris–HCl
pH 8.3, 500 mM KCl and 0.1% gelatin), 2 pmol of each dNTP, 1 U Taq
polymerase (5 U/μl, Genaxxon Bioscience), 20–50 ng of DNA template,
3 pmol of each primer and MgCl2 concentrations from 2.0 to 3.0 mM.
3. Results and discussion
3.1. SSR types, distribution and position in the expressed sequences
We finally detected 899 ESTs that harbor 997 SSRs in the 5268
contigs and 12,928 singletons (18,196 unigenes) analyzed. The EST-SSR
frequency for gilthead sea bream was 4.94%, well within the range
reported for other fish species, which ranged from 1.5% in Xiphophorus
to 7.3% in zebrafish (2.2% in Fundulus and 2.6% in medaka, Ju et al.,
2005), 2.0 to 2.7% in Atlantic salmon Salmo salar (Vasemägi et al.,
2005; Ng et al., 2005), 2.1% in turbot (Bouza et al., 2008), 2.2% in
European eel (Pujolar et al., 2009) and the turbot (Chen et al., 2007),
3.98% for European sea bass (Louro, 2010), 4.0% in red sea bream
Chrysophrys major (Chen et al., 2005), 5.3% in half-smooth tongue sole
Cynoglossus semilaevis (Sha et al., 2010), 5.5% in common carp
Cyprinus carpio (Wang et al., 2007), 11.2% in channel catfish Ictalurus
punctatus (Serapion et al., 2004). It is noticeable that the frequency
and distribution of SSR motifs are related to the type of tandem
repeats explored and the choice of search criteria (Gupta and Prasad,
2009). On average, one SSR was found in every 2.95 kb of EST
sequence and the total length of the regions containing repeats is
1.68% of the total ESTs size. From the gilthead sea bream unigene
annotation (Louro et al., 2010), one third of these EST-SSR (31.6%, 284
EST-SSR) had a positive hit against annotated sequences, a percentage
close to that reported for channel catfish (Serapion et al., 2004).

Table 1. Species studied in cross-species assays: common name, abbreviation, family and order (partial).
Common name          Abbreviation   Family         Order
Gilthead sea bream
White sea bream
Striped sea bream    Lm             Sparidae       Perciformes
Black sea bream      Sc             Sparidae       Perciformes
European sea bass
Sardine              Sp             Clupeidae      Clupeiformes
European hake        Mm             Merlucciidae   Gadiformes

Table 2. Motif type and position of SSRs in gilthead sea bream's EST sequences. Di-, tri-, tetra-,
penta- and hexa- are dinucleotide, trinucleotide, tetranucleotide, pentanucleotide and
hexanucleotide SSRs, respectively. 5′ UTR is the part of the EST from the 5′ end to the
position of the first codon identified for translation, 3′ UTR is the part from the last
codon to the 3′ end of the EST and CDS is the coding sequence identified.

Dinucleotide repeats were the most abundant in gilthead sea
bream, accounting for 47.6% of the SSRs, followed by the trinucleotides
(32.6%), the tetranucleotides (11.5%), the pentanucleotides (6.6%) and
the hexanucleotides (1.7%) (Table 2). Among the dinucleotide
motifs, the AC/TG was nearly five times more abundant (81.85%) than
the AG/TC type (17.5%); only three AT/TA repeat sequences were
counted (0.65%) whereas no CG/GC motif was found (also in all fish
species examined by Ju et al., 2005, except Fundulus in which AT/TA
was the most abundant motif). For the trinucleotides, only ten motifs
were observed in gilthead sea bream; the AAG motif was the most
abundant (26.5%), followed by ATC (20.6%), AGG (16.6%), AGC (14.5%)
and AAC (8.6%) while the other five motifs were at lower frequencies
(see Louro, 2010). Twenty-three types of tetranucleotides were
reported with AAAC, AAAG, ACAG, and AGAT showing a two-digit
percentage of occurrence. Finally, thirty different pentanucleotide and
twelve hexanucleotide motifs were found with AAGCT, AAAAC,
AAAAG and AATGCT at higher frequencies. Examining the distribution
of SSR motifs can help to gain insights into genome composition (Ju et
al., 2005; Serapion et al., 2004) and it seems that generally AT-rich
type of repeats were predominant in all SSRs accounting for
approximately 67%, except for the dinucleotides.
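The headline proportions in this section can be re-derived from the counts and percentages quoted in the text. The snippet below is a simple arithmetic check using only those figures; the variable names are illustrative.

```python
# Counts taken from the text.
unigenes = 5268 + 12928            # contigs + singletons
ests_with_ssr = 899
annotated_est_ssrs = 284

ssr_est_pct = 100 * ests_with_ssr / unigenes            # unigenes harbouring an SSR
annotated_pct = 100 * annotated_est_ssrs / ests_with_ssr

# Percentages by repeat class and by dinucleotide motif, as reported.
class_pct = {"di": 47.6, "tri": 32.6, "tetra": 11.5, "penta": 6.6, "hexa": 1.7}
di_pct = {"AC/TG": 81.85, "AG/TC": 17.5, "AT/TA": 0.65}
ac_to_ag = di_pct["AC/TG"] / di_pct["AG/TC"]

print(unigenes, round(ssr_est_pct, 2), round(annotated_pct, 1))
print(round(sum(class_pct.values()), 1), round(sum(di_pct.values()), 2), round(ac_to_ag, 2))
```

The AC/TG to AG/TC ratio computed this way is a little under five, consistent with the wording "nearly five times more abundant".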
In Table 2, the type and position of the SSRs in the EST sequences
are also presented. Almost 60% of all SSRs identified in silico are found
in the untranslated regions (UTRs). Di-, tetra- and pentanucleotide
repeat types are found in higher frequencies (77.8–90.1%) in the UTRs,
whereas tri- and hexanucleotide repeats are mainly found in the
coding regions (72.2 and 90.6%, respectively) most probably because
these SSRs do not trigger frame shift mutations. For all repeat types,
SSRs in the 3′ UTR are more frequent than in the 5′ UTR. This might be
an indication of the real SSR distribution in the gilthead sea bream
genome, since the first-strand synthesis aimed to generate high yields
of full-length, double-stranded (ds) cDNA which was sequenced from
the 5′ end (Wang et al., 2008).
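The frame-shift argument for the excess of tri- and hexanucleotide repeats in coding regions comes down to divisibility by three: gaining or losing one repeat unit changes the sequence length by the motif length, and only multiples of three keep the downstream reading frame. A minimal illustration (the function name is hypothetical):

```python
def shifts_reading_frame(motif_length, repeat_change):
    """True if gaining or losing `repeat_change` repeat units alters the reading frame."""
    return (motif_length * repeat_change) % 3 != 0


# Effect of a single-repeat expansion for each repeat class.
for name, k in [("di", 2), ("tri", 3), ("tetra", 4), ("penta", 5), ("hexa", 6)]:
    print(name, shifts_reading_frame(k, 1))
```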
3.2. Primer design and cross-species amplification
We used as template for primer design the non-redundant EST-
SSRs; out of the 899 sequences, seventy-eight (i.e. 8.68%) had more
than 90% similarity with others in the initial batch and were removed
from the analysis. In total, there were 664 unigenes for which we
could design primer pairs in gilthead sea bream, i.e., 73.85% of the
initial 899 EST-SSRs and 3.64% of the initial EST dataset (Supplement
1); for the remaining ESTs, primers could not be designed due to short
or inappropriate flanking regions. Primer pairs were ordered for 206
of those sequences (129 singletons and 77 contigs, of which 54 were
annotated), a number which roughly corresponds to most unigenes
containing at least 15, 7, 6, 5 and 5 repeats for di-, tri-,
tetra-, penta- and hexanucleotides, respectively. More than half of the
primer pairs were synthesized for trinucleotides (110), and fewer for
dinucleotides (48), tetranucleotides (34), pentanucleotides (9)
and hexanucleotides (5) (Supplement 2).
When these primer pairs were PCR checked on wild gilthead sea
bream individuals, 63.2% resulted in PCR products of expected size
and 20.3% led to much bigger or multiple PCR products. Although
primers were designed from gilthead sea bream ESTs, there was a
significant percentage (16.5%, i.e. 34 out of 206 pairs of primers) that
did not result in any PCR product in this species (type-‘0’ PCRs).
Taking into account only the type-‘2’ PCR results, i.e., when clear
bands in the expected size range (+/−50 bp) are detected and are
conventionally considered successful, there was no locus that was
amplified in more than 11 out of the 16 species and overall forty-nine
EST-SSR loci (23.8%) were not acceptably amplified in the 16 species
studied (Fig. 1). Amplification failure in ESTs is due mainly to primers
annealing onto neighboring exonic regions separated by intron(s) or
to primer synthesis errors and the incomplete PCR optimization for
different annealing temperatures (Tann) and MgCl2 concentrations.
The former may, to some extent, be confirmed in the near future by
comparative mapping analysis with the model teleosts and the
genome of D. labrax which is currently in the process of a low-
coverage shotgun sequencing (Kuhl et al., 2010).
The rate of cross-species amplification within sparids, with the
exception of Diplodus puntazzo, was high and ranged from 25.7% in
Pagellus bogaraveo to 39.3% in Lithognathus mormyrus (Fig. 2 and
Supplement 2). This rate was lower in the other perciforms, and very
low in sprat (2.9%) and other Clupeiformes. Higher rates in non-sparids
consider only the 130 EST-SSRs successfully amplified (type-‘2’ PCRs)
ranges from 15.4% in D. puntazzo to 54.6% in L. mormyrus. Gilthead sea
bream's EST-SSRs seem to have a success rate in cross-species
amplifications among closely related species similar to that of
genomic (type II) SSRs from genomic libraries enriched for tandem
repeats. Brown et al. (2005) amplified four of the six loci in two or more
of the sparid species tested, and only one locus (SaI19) was successfully
amplified in five of the six species tested, with D. puntazzo the only
species not represented at any of the loci. Additionally, Pinera et al.
(2007) cross-amplified 15 microsatellite loci isolated from P. bogaraveo
in four sparids and succeeded with eight loci in S. aurata, seven in
D. puntazzo, two in Dentex dentex and one in P. erythrinus. This
amplification transferability, however, for EST-SSRs in sparids is lower
than that reported for salmonids and ranged from 83.1% (Rexroad et al.,
(Chen et al., 2005) and cyprinids (Wang et al., 2007; Yue et al., 2004).
Therefore, the transferability of EST-SSR markers in sparids, and
generally in teleosts, displays promising and practical future potential.
In this way, genetic information can be transferred from marker-rich
species to other related but less studied species, in a cost and time-
other sparid species, 27 EST-SSR markers amplified in all other sparids.
Because the coding regions of genes are more conserved than other
genomic regions, the EST-SSR flanking sequences are more useful for
designing primers that work across two or more species. An exciting
application of expressed sequence data is comparative genome analysis
among phylogenetically related species to study genome structure,
evolution, and gene function (Ju et al., 2005). Therefore, EST-SSRs have
great potential for comparative genome mapping analyses that require
the same sets of genes (i.e. cross-reference genes) to be used as cross-
species anchors and mapped to chromosomes in the species compared.
Comparative genome analysis in gilthead sea bream will enhance high-
throughput comparative mapping in sparids with the assistance of
cross-species markers, and further facilitate gene cloning by identifying
cross-reference genes. Results in the present study, however, underes-
timate the number of cross-species markers, since high stringency was
used in choosing only markers with the best performance (type-‘2’
PCRs). A greater percentage of cross-species markers may further be
obtained by optimizing primer pairs (Tm and MgCl2 concentration) for
which results were moderate (type-‘1’ PCRs).

Fig. 1. Total number of EST-SSR loci that resulted in successful PCR reactions with
products of expected size in the 16 different fish species tested. Annotated and not-
annotated loci are indicated in light and dark bars, respectively. Axes: number of species successfully amplified; number of loci.

3.3. Type I marker development and evaluation
From each SSR type, we randomly selected a subset of 63 primer
pairs for 28 dinucleotides, 22 trinucleotides, 9 tetranucleotides, 2
pentanucleotides and 2 hexanucleotides. All but five EST-SSRs were
found to be polymorphic; four trinucleotides and one tetranucleotide
SSR were the only monomorphic loci in the wild gilthead sea bream
population examined whereas all dinucleotide SSRs were polymorphic
observed and expected heterozygosities from 0.094 to 1 and from 0.089
to 0.946, respectively. The mean number of alleles per locus and the
expected heterozygosity (Nei, 1987) were the highest in dinucleotides
(10.85 and 0.779) followed by the tetranucleotides (8.0 and 0.687), the
trinucleotides (4.88 and 0.498), the pentanucleotides (4.5 and 0.574)
and finally the hexanucleotides (3.0 and 0.393). Of the 63 loci, 10 SSRs
22 in the 3′ UTR (Table 3). Conclusions can be drawn mainly for
trinucleotide repeats for which there were seven SSRs in coding
sequences and seven in the 3′ UTR; polymorphism level as estimated
by the average number of alleles and the expected heterozygosity are
0.464 vs 0.629, respectively).
EST-SSRs in gilthead sea bream seem generally to be less
polymorphic than anonymous SSRs and this comparison of marker
polymorphism with already published SSRs can only be done for
dinucleotide EST-SSRs. Genetic diversity revealed by dinucleotide
markers, like the mean number of alleles per locus and expected
heterozygosity, is close to values found by Launey et al. (2003) (8.0 and
0.752) and lower than those reported by Batargias et al. (1999) (16.5
and 0.856) and Brown et al. (2005) (16.3 and 0.885). However, the
by their relatively easy detection and the high numbers available in EST
collections, as shown in the present study.
None of the possible pairwise comparisons between loci showed
significant linkage disequilibrium (adjusted P-value for 5% nominal
level: 0.000026). Nearly all loci conformed to the Hardy–Weinberg
(HW) expectations after sequential Bonferroni correction using the
exact test implemented in FSTAT v2.9.3.2 (Goudet, 1995), with the
exception of cDN05P0005B16 (tri-) and cDN12P0001H03 (tetra-)
which showed a significant departure from the HW equilibrium
(Table 3) most probably because of heterozygote deficit. We therefore
investigated whether this may be due to the presence of null
alleles, i.e., alleles that fail to amplify during PCR and, when
segregating with another allele, result in an apparent homozygote.
For SSRs, null alleles appear mainly when mutations occur in the
a result they provoke severe biases in genotype-based statistics (e.g. Fis)
and parentage analysis (Pemberton et al., 1995). Using Micro-Checker
(Van Oosterhout et al., 2004), six loci showing signs of null alleles were
found; two dinucleotides (cDN04P0003A07 and CL2750Contig1), two
(cDN12P0001H03) and one pentanucleotide (cDN11P0004B17). For all
of these six loci, the indication for the presence of null alleles is the
general excess of homozygotes and the heterozygote deficit; for the
tetranucleotide locus, in particular, this homozygote excess is associated
with the fact that more than 50% of the alleles are of one size class.
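The adjusted P-value quoted for the linkage disequilibrium tests can be reproduced by dividing the 5% nominal level by the number of pairwise comparisons, on the assumption that all 63 loci entered the test. The second function sketches a step-down (sequential) Bonferroni procedure of the kind described by Rice (1989); it is an illustration rather than the FSTAT implementation, and the example P values are hypothetical.

```python
from math import comb

n_loci = 63
n_pairs = comb(n_loci, 2)            # number of pairwise locus comparisons
adjusted_alpha = 0.05 / n_pairs      # threshold applied to the smallest P value
print(n_pairs, f"{adjusted_alpha:.6f}")


def sequential_bonferroni(p_values, alpha=0.05):
    """Return one boolean per test: True where the test stays significant."""
    k = len(p_values)
    order = sorted(range(k), key=lambda i: p_values[i])
    significant = [False] * k
    for rank, i in enumerate(order):
        # The smallest P is compared with alpha/k, the next with alpha/(k-1), and so on.
        if p_values[i] <= alpha / (k - rank):
            significant[i] = True
        else:
            break  # once one test fails, all larger P values are non-significant
    return significant


print(sequential_bonferroni([0.001, 0.04, 0.03, 0.5]))
```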
3.4. Multiplex PCR assays and amplification conditions
Five highly confident multiplex PCRs were finally developed
comprising 32 of the EST-SSRs evaluated in the current study. The
final primer sets for five multiplex reactions, their annealing
temperatures and MgCl2 concentrations, are reported in Table 4.
Nine microsatellite markers amplified adequately in Saur-multi-1,
eight in Saur-multi-2, seven in Saur-multi-3 and four each in Saur-multi-4
and Saur-multi-5. These multiplex PCRs will allow an increase in efficiency by the
reduction of the reaction costs and labor time for population genetics,
genealogy reconstruction and parentage assignment in gilthead sea
of the newly-developed Marine Genomics Europe gilthead sea bream's
EST database has proved to be efficient leading to 58 polymorphic
microsatellite markers ready to be used in population genetics and
genomic studies in this species. EST-SSRs in gilthead sea bream are
expected to increase genome information and type II marker density on
the available genetic map (Franch et al., 2006) and to enable the
development of comparative genetic maps across fish with special focus
Fig. 2. Performance of cross-species transferability of 206 EST-SSRs in silico described in gilthead sea bream (S. aurata) to different marine teleosts, expressed as percentage of loci that give
product of expected PCR size in each species (light bars), those that resulted in much bigger or non-unique products (solid bars) and those that failed to amplify (diagonally filled bars).
E. Vogiatzi et al. / Marine Genomics 4 (2011) 83–91
(number of alleles per locus and Fis) in the 63 microsatellite loci developed from ESTs in gilthead sea bream (S. aurata). In the case of contigs, multiple accession numbers from their
respective ESTs are reported; Hexp and Hobs are expected and observed heterozygosity, respectively. Loci are sorted from di- to hexanucleotides with ascending clone name.
cDN02P0001C17 AM952795 (CA)36 di 3′
1.5 60 236 27 0.946 1
2.5 58 212 4 0.55 0.645
cDN04P0003A07 AM957727 (AC)4(AC)5(AC)5 di 2.5 58 211 6 0.771 0.613 0.221
cDN11P0002A06 AM972254 (GT)19 di 1.5 60 227 11 0.818 0.778 0.068
cDN11P0003K18 AM972840 (CA)18(CA)10 di 5′
cDN12P0003I09 AM974926 (TG)19 di 2.5 54 249 10 0.85 0.724 0.165
CL1014Contig1 AM961650/FM144655 (CA)16 di 3′
CL1275Contig1 AM966862/AM978468 (TG)9(AG)18 di 1.5 58 197 14 0.859 0.871 0.002
CL2598Contig1 AM966431 (AC)18 di 1.5 54 136 14 0.878 1
2.5 58 219 7 0.685 0.613 0.121
CL3295Contig1 (TC)12(AC)10 di 3′
CL3420Contig1 AM962158 (GT)9(GT)15 di 1.5 58 197 5 0.675 0.9
CL4643Contig1 AM969293 (TC)17 di 1.5 56 246 5 0.714 0.71 0.022
2.5 58 162 8 0.79 0.936
1.5 58 206 6 0.792 0.233 0.714
cDN05P0006H15 AM961064 (TGA)7 tri cds 2.5 54 249 1 0 0 NA
cDN07P0003D01 AM964239 (AGC)8(CAG)7 tri cds 1.5 58 181 2 0.365 0.48
domain of the genome and are consequently more conserved than other
genomic regions, they are far more useful in designing primers that
with aquatic model organisms like zebrafish, fugu, medaka and
stickleback (see also in Chistiakov et al., 2008) and for gene and QTL
mapping in sparids which are of high commercial importance in the
aquaculture industry. Finally, these loci are also expected to be the basis
for the detection of selection signatures in natural populations of the
species which are currently distributed from the Eastern Mediterranean
Sea to the Atlantic Ocean and northwards to the Irish Sea.
Supplementary materials related to this article can be found online
at doi: 10.1016/j.margen.2011.01.003.
This work was partly financed by the ‘Marine Genomics Europe’
Network of Excellence (COGE-CT-2004-505403) and the European
Union research project “AQUAFIRST” (SSP8-CT-513692). We also
thank Erika Souche and Filip Volckaert for fruitful discussions and M.
Eleftheriou for linguistic assistance.
Table 3 (continued)
cDN09P0003E23 AM968486 (CTT)7 tri cds 2.5 60 232 2 0.098 0.103
cDN09P0006J06 AM969579 (ATG)7 tri 2.5 56 229 2 0.441 0.656
cDN12P0001K14 AM974232 (CTT)7 tri 3′
cDN13P0003B17 AM976937 (ATT)7 tri 2.5 58 243 4 0.452 0.429 0.069
cDN13P0003J21 AM977121 (TGA)10 tri cds 2.5 60 239 4 0.363 0.323 0.127
cDN13P0005H11 AM977777 (GCA)10 tri cds 2.5 58 178 3 0.537 0.8
CL1017Contig1 AM976212/AM962554 (AGG)9 tri 1.5 60 210 1 0 0 NA
CL1330Contig1 (GGA)10 tri cds 2.5 58 198 5 0.595 0.8
(ATG)7 tri 2.5 58 222 4 0.202 0.156 0.24
1.5 60 197 1 0 0 NA
cDN01P0004C20 AM951700 (CTGT)8 tetra 2.5 54 124 11 0.824 0.793 0.055
cDN11P0002G23 AM972406 (AAAG)8 tetra 1.5 60 138 5 0.792 0.875
2.5 60 219 5 0.568 0.125 0.786
cDN06P0001M14 AM961537 (CGAGGA)5 hexa cds 2.5 56 236 4 0.586 0.807
Alarcón, J.A., Magoulas, A., Georgakopoulos, T., Zouros, E., Alvarez, M.C., 2004. Genetic
comparison of wild and cultivated European populations of the gilthead sea bream
(Sparus aurata). Aquaculture 230, 65–80.
Avise, J., 2004. Molecular Markers, Natural History and Evolution, 2nd ed. Sinauer
Associates, Sunderland, MA, USA.
Batargias, C., Dermitzakis, E., Magoulas, A., Zouros, E., 1999. Characterization of six
polymorphic microsatellite markers in gilthead seabream, Sparus aurata (Linnaeus
1758). Mol. Ecol. 8, 897–898.
Belkhir, K., Borsa, P., Goudet, J., Chikhi, L., Bonhomme, F., 1998. GENETIX, logiciel sous
Windows TM pour la génétique des populations. Laboratoire Génome et Populations,
CNRS UPR 9060, Université de Montpellier II, Montpellier (France).
Bouza, C., Hermida, M., Millán, A., Vilas, R., Vera, M., Fernández, C., Calaza, M., Pardo, B.G.,
Martínez, P., 2008. Characterization of EST-derived microsatellites for gene mapping
and evolutionary genomics in turbot. Anim. Genet. 39, 666–670.
Brown, R.C., Tsalavouta, M., Terzoglou, V., Magoulas, A., McAndrew, B.J., 2005. Additional
microsatellites for Sparus aurata and cross-species amplification within the Sparidae
family. Mol. Ecol. Notes 5, 605–607.
2007. A microsatellite marker tool for parentage assessment in gilthead seabream
(Sparus aurata). Aquaculture 272, S210–S216.
Castro, J., Pino-Querido, A., Hermida, M., Chavarrías, D., Romero, R., García-Cortés, L.A.,
Toro, M.A., Martínez, P., 2008. Heritability of skeleton abnormalities (lordosis, lack of
operculum) in gilthead seabream (Sparus aurata) supported by microsatellite family
data. Aquaculture 279, 18–22.
Chen, S.L., Liu, Y.G., Xu, M.Y., Li, J., 2005. Isolation and characterization of polymorphic
species amplification. Mol. Ecol. Notes 5, 215–217.
Chen, S.L., Ma, H.Y., Jiang, Y., Liao, X.L., Meng, L., 2007. Isolation and characterization of
and cross-species amplification. Mol. Ecol. Notes 7, 848–850.
Chistiakov, D.A., Tsigenopoulos, C.S., Lagnel, J., Guo, Y.M., Hellemans, B., Haley, C.S.,
Volckaert, F.A.M., Kotoulas, G., 2008. A combined AFLP and microsatellite linkage map
and pilot comparative genomic analysis of European sea bass Dicentrarchus labrax L.
Anim. Genet. 39, 623–634.
De Innocentiis, S., Miggiano, E., Ungaro, A., Livi, S., Sola, L., Crosetti, D., 2005a. Geographical
origin of individual breeders from gilthead sea bream (Sparus auratus) hatchery
broodstocks inferred by microsatellite profiles. Aquaculture 247, 227–232.
De Innocentiis, S., Miggiano, E., Ungaro, A., Crosetti, D., Sola, L., 2005b. Tracing the
geographical origin of individual breeders from gilthead sea bream (Sparus
aurata) hatchery broodstocks by multilocus microsatellite profiles. Aquaculture
Ellegren, H., 2004. Microsatellites: simple sequences with complex evolution. Nat. Rev.
Genet. 5, 435–445.
FAO 2009. FishStat Plus - Universal software for fishery statistical time series, http://
Franch, R., Louro, B., Tsalavouta, M., Chatziplis, D., Tsigenopoulos, C.S., Sarropoulou, E.,
Antonello, J., Magoulas, A., Mylonas, C.C., Babbucci, M., Patarnello, T., Power, D.M.,
Kotoulas, G., Bargelloni, L., 2006. A genetic linkage map of the hermaphrodite teleost
fish Sparus aurata L. Genetics 174, 851–861.
Goudet, J., 1995. Fstat version 1.2: a computer program to calculate F statistics. J. Hered. 86,
Gupta, S., Prasad, M., 2009. Development and characterization of genic SSR markers in
Medicago truncatula and their transferability in leguminous and non-leguminous
species. Genome 52, 761–771.
Guyomard, R., Mauger, S., Tabet-Canale, K., Martineau, S., Genet, C., Krieg, F., Quillet, E.,
2006. A type I and type II microsatellite linkage map of rainbow trout (Oncorhynchus
mykiss) with presumptive coverage of all chromosome arms. BMC Genomics 7, 302.
Hirschhorn, J.N., Daly, M.J., 2005. Genome-wide association studies for common
diseases and complex traits. Nat. Rev. Genet. 6, 95–108.
Iseli, C., Jongeneel, C.V., Bucher, P., 1999. ESTScan: a program for detecting, evaluating,
and reconstructing potential coding regions in EST sequences. Proc Int Conf Intell
Syst Mol Biol 138–148.
Ju, Z., Wells, M.C., Martinez, A., Hazlewood, L., Walter, R.B., 2005. An in silico mining for
simple sequence repeats from expressed sequence tags of zebrafish, medaka,
Fundulus, and Xiphophorus. In Silico Biol. 5, 439–463.
Kuhl, H., Beck, A., Wozniak, G., Canario, A.V.M., Volckaert, F.A.M., Reinhardt, R., 2010.
The European sea bass Dicentrarchus labrax genome puzzle: comparative BAC-
mapping and low coverage shotgun sequencing. BMC Genomics 11.
Lagnel, J., Tsigenopoulos, C.S., Iliopoulos, I., 2009. NOBLAST and JAMBLAST: new options
for BLAST and a Java application manager for BLAST results. Bioinformatics 25,
Launey, S., Krieg, F., Haffray, P., Bruant, J.S., Vanniers, A., Guyomard, R., 2003. Twelve
new microsatellite markers for gilted seabream (Sparus aurata L.): characterization,
polymorphism and linkage. Mol. Ecol. Notes 3, 457–459.
Lottaz, C., Iseli, C., Jongeneel, C.V., Bucher, P., 2003. Modeling sequencing errors by
combining Hidden Markov models. Bioinformatics 19, ii103–ii112.
Louro, B., Passos, A.L.S., Souche, E.L., Tsigenopoulos, C., Beck, A., Lagnel, J., Bonhomme, F.,
Reinhardt, R., Canario, A.V.M., 2010. Gilthead sea bream (Sparus auratus) and
European sea bass (Dicentrarchus labrax) expressed sequence tags: Characterization,
tissue-specific expression and gene markers. Marine Genomics 3, 179–191.
Miggiano, E., De Innocentiis, S., Ungaro, A., Sola, L., Crosetti, D., 2005. AFLP and
microsatellites as genetic tags to identify cultured gilthead seabream escapees:
data from a simulated floating cage breaking event. Aquacult. Int. 13, 137–146.
Miller, S.A., Dykes, D.D., Polesky, H.F., 1988. A simple salting out procedure for
extracting DNA from human nucleated cells. Nucl. Acids Res. 16, 1215.
Navarro, A., Badilla, R., Zamorano, M.J., Pasamontes, V., Hildebrandt, S., Sánchez, J.J.,
Afonso, J.M., 2008. Development of two new microsatellite multiplex PCRs for three
sparid species: gilthead seabream (Sparus auratus L.), red porgy (Pagrus pagrus L.)
and redbanded seabream (P. auriga, Valenciennes, 1843) and their application to
paternity studies. Aquaculture 285, 30–37.
of heritabilities and genetic correlations for growth and carcass traits in gilthead
seabream (Sparus auratus L.), under industrial conditions. Aquaculture 289, 225–230.
Nei, M., 1987. Molecular Evolutionary Genetics. Columbia University Press, New York.
Ng, S.H.S., Chang, A., Brown, G.D., Koop, B.F., Davidson, W.S., 2005. Type I microsatellite
markers from Atlantic salmon (Salmo salar) expressed sequence tags. Mol. Ecol.
Notes 5, 762–766.
Palma, J., Alarcon, J.A., Alvarez, C., Zouros, E., Magoulas, A., Andrade, J.P., 2001.
Developmental stability and genetic heterozygosity in wild and cultured stocks of
gilthead sea bream (Sparus aurata). J. Mar. Biol. Assoc. U. K. 81, 283–288.
Pemberton, J.M., Slate, J., Bancroft, D.R., Barrett, J.A., 1995. Nonamplifying alleles at
microsatellite loci—a caution for parentage and population studies. Mol. Ecol. 4,
Pinera, J.A., Bernardo, D., Blanco, G., Vazquez, E., Sanchez, J.A., 2007. Usefulness of
microsatellite markers developed from Pagellus bogaraveo to genetically study five
different species of Sparidae. Mar. Ecol. 28, 184–187.
Pujolar, J.M., Maes, G.E., Van Houdt, J.K.J., Zane, L., 2009. Isolation and characterization
of expressed sequence tag-linked microsatellite loci for the European eel (Anguilla
anguilla). Mol. Ecol. Resour. 9, 233–235.
Qiu, X., Xu, L., Liu, S., Wang, X., Meng, X., 2009. Eleven polymorphic simple sequence
repeat markers from expressed sequence tags of Pacific oyster Crassostrea gigas EST
database. Conserv. Genet. 10, 1773–1775.
Rexroad III, C.E., Rodriguez, M.F., Coulibaly, I., Gharbi, K., Danzmann, R.G., Dekoning, J.,
Phillips, R., Palti, Y., 2005. Comparative mapping of expressed sequence tags
containing microsatellites in rainbow trout (Oncorhynchus mykiss). BMC Genomics
Rice, W.R., 1989. Analyzing tables of statistical tests. Evolution 43, 223–225.
Rozen, S., Skaletsky, H., 2000. Primer3 on the WWW for general users and for biologist
programmers. Meth. Mol. Biol. 132, 365–386.
Sarropoulou, E., Franch, R., Louro, B., Power, D.M., Bargelloni, L., Magoulas, A., Senger, F.,
Tsalavouta, M., Patarnello, T., Galibert, F., Kotoulas, G., Geisler, R., 2007. A gene-
based radiation hybrid map of the gilthead sea bream Sparus aurata refines and
exploits conserved synteny with Tetraodon nigroviridis. BMC Genomics 8, 44.
Schlotterer, C., 2000. Evolutionary dynamics of microsatellite DNA. Chromosoma 109,
Selkoe, K.A., Toonen, R.J., 2006. Microsatellites for ecologists: a practical guide to using
and evaluating microsatellite markers. Ecol. Lett. 9, 615–629.
Description of the five multiplex reactions in gilthead seabream (S. aurata) with information on the type of the EST-SSRs (di- to hexanucleotides) and the labeling dyes, and their
amplification conditions (optimized annealing temperature, Tann, and optimized MgCl2 concentration). EST-SSRs with a known annotation are shown in bold.
Saur-multi-1 Dye Saur-multi-2 Dye Saur-multi-3 Dye Saur-multi-4 Dye Saur-multi-5 Dye
Tann: 58 °C, MgCl2:
Tann: 58 °C, MgCl2: 2.0 mM
Tann: 56 °C, MgCl2: 3.0 mM
Tann: 58 °C, MgCl2: 2.5 mM
Tann: 54 °C, MgCl2: 2.5 mM
Senger, F., Priat, C., Hitte, C., Sarropoulou, E., Franch, R., Geisler, R., Bargelloni, L., Power,
D., Galibert, F., 2006. The first radiation hybrid map of a perch-like fish: the gilthead
seabream (Sparus aurata L). Genomics 87, 793–800.
Serapion, J., Kucuktas, H., Feng, J., Liu, Z., 2004. Bioinformatic mining of type I
microsatellites from expressed sequence tags of channel catfish (Ictalurus
punctatus). Mar Biotechnol NY 6, 364–377.
Sha, Z., Wang, S., Zhuang, Z., Wang, Q., Wang, Q., Li, P., Ding, H., Wang, N., Liu, Z., Chen, S.,
2010. Generation and analysis of 10 000 ESTs from the half-smooth tongue sole
Cynoglossus semilaevis and identification of microsatellite and SNP markers. J. Fish
Biol. 76, 1190–1204.
Slate, J., Hale, M., Birkhead, T., 2007. Simple sequence repeats in zebra finch
(Taeniopygia guttata) expressed sequence tags: a new resource for evolutionary
genetic studies of passerines. BMC Genomics 8, 52.
Tóth, G., Gáspári, Z., Jurka, J., 2000. Microsatellites in different eukaryotic genomes:
survey and analysis. Genome Res. 10, 967–981.
Van Oosterhout, C., Hutchinson, W.F., Wills, D.P.M., Shipley, P., 2004. micro-checker:
software for identifying and correcting genotyping errors in microsatellite data.
Mol. Ecol. Notes 4, 535–538.
Vasemägi, A., Nilsson, J., Primmer, C.R., 2005. Seventy-five EST-linked Atlantic salmon
(Salmo salar L.) microsatellite markers and their cross-amplification in five
salmonid species. Mol. Ecol. Notes 5, 282–288.
Vidal, R., Peñaloza, C., Urzúa, R., Toro, J.E., 2009. Screening of ESTs from Mytilus for
the detection of SSR markers in Mytilus californianus. Mol. Ecol. Resour. 9,
Wang, D., Liao, X., Cheng, L., Yu, X., Tong, J., 2007. Development of novel EST-SSR markers in
common carp by data mining from public EST sequences. Aquaculture 271, 558–574.
Wang, Y., Ren, R., Yu, Z., 2008. Bioinformatic mining of EST-SSR loci in the Pacific oyster,
Crassostrea gigas. Anim. Genet. 39, 287–289.
Yu, H., Li, Q., 2007. Development of EST-SSRs in the Mediterranean blue mussel, Mytilus
galloprovincialis. Mol. Ecol. Notes 7, 1308–1310.
Yue, G.H., Ho, M.Y., Orban, L., Komen, J., 2004. Microsatellites within genes and ESTs
of common carp and their applicability in silver crucian carp. Aquaculture 234,