SNP Discovery and Linkage Map Construction in Cultivated Tomato
KENTA Shirasawa1,*, SACHIKO Isobe1, HIDEKI Hirakawa1, ERIKA Asamizu1,†, HIROYUKI Fukuoka2, DANIEL Just3,
CHRISTOPHE Rothan3, SHIGEMI Sasamoto1, TSUNAKAZU Fujishiro1, YOSHIE Kishida1, MITSUYO Kohara1,
HISANO Tsuruoka1, TSUYUKO Wada1, YASUKAZU Nakamura1,‡, SHUSEI Sato1, and SATOSHI Tabata1
Kazusa DNA Research Institute, 2-6-7 Kazusa-Kamatari, Kisarazu, Chiba 292-0818, Japan1; National Institute of
Vegetable and Tea Science (NIVTS), National Agriculture and Food Research Organization, 360 Kusawa, Ano, Tsu,
Mie 514-2392, Japan2 and Unité Mixte de Recherche 619, Biologie du Fruit, Institut National de la Recherche
Agronomique (INRA Bordeaux), Universités de Bordeaux, 33883 Villenave d’Ornon cedex, France3
*To whom correspondence should be addressed. Tel. +81 438-52-3935. Fax. +81 438-52-3934.
Email: shirasaw@kazusa.or.jp
Edited by Doil Choi
(Received 14 July 2010; accepted 21 September 2010; published online 2 November 2010)
Abstract
Few intraspecific genetic linkage maps have been reported for cultivated tomato, mainly because genetic
diversity within Solanum lycopersicum is much less than that between tomato species. Single nucleotide
polymorphisms (SNPs), the most abundant source of genomic variation, are the most promising source
of polymorphisms for the construction of linkage maps for closely related intraspecific lines. In this
study, we developed SNP markers based on expressed sequence tags for the construction of intraspecific
linkage maps in tomato. Out of the 5607 SNP positions detected through in silico analysis, 1536 were
selected for high-throughput genotyping of two mapping populations derived from crosses between
‘Micro-Tom’ and either ‘Ailsa Craig’ or ‘M82’. A total of 1137 markers, including 793 of the 1338
successfully genotyped SNPs, along with 344 simple sequence repeat and intronic polymorphism markers,
were mapped onto two linkage maps, which covered 1467.8 and 1422.7 cM, respectively. The SNP
markers developed were then screened against cultivated tomato lines in order to estimate the
transferability of these SNPs to other breeding materials. The molecular markers and linkage maps represent a
milestone in tomato genomics and genetics and are the first step toward molecular breeding of cultivated
tomato. Information on the DNA markers, linkage maps, and SNP genotypes for these tomato lines is
available at http://www.kazusa.or.jp/tomato/.
Keywords: DNA marker; linkage map; single nucleotide polymorphism; Solanum lycopersicum; tomato
1. Introduction
Genetic research on tomato (Solanum lycopersicum) and its
wild relatives, including S. chilense, S. habrochaites,
S. pimpinellifolium, and S. pennellii, has advanced
greatly since molecular markers became
available.1 During the past two decades, several
genetic maps of tomato have been reported, with a
total of more than 2000 loci detected by restriction
fragment length polymorphism (RFLP), amplified
fragment length polymorphism (AFLP), cleaved amplified
polymorphic sequence (CAPS), and simple
sequence repeat (SSR) markers, based on
mapping populations derived from crosses
between tomato and related wild species.2–6
Recently, 1282 novel SSR markers and 151 intronic
polymorphic markers were mapped onto an
interspecific map, ‘Tomato-EXPEN 2000’, derived from
a cross between S. lycopersicum and S. pennellii.7 Such
efforts have resulted in the identification of a number
of quantitative trait loci (QTLs) and genes for fruit
morphology,8–11 disease resistance,12–15 and other
agronomical traits.16 The identified genes, e.g. Cf-4,
Tm-2, and Sw-5, have already been used for tomato
breeding through advanced-backcross and
introgression-line strategies using molecular markers.1
† Present address: Gene Research Center, University of Tsukuba,
1-1-1 Tennodai, Tsukuba, Ibaraki 305-8572, Japan.
‡ Present address: Center for Information Biology and DNA Data
Bank of Japan, National Institute of Genetics, Research
Organization for Information and Systems, Yata, Mishima,
Shizuoka 411-8510, Japan.
© The Author 2010. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License
(http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and
reproduction in any medium, provided the original work is properly cited.
DNA RESEARCH 17, 381–391, (2010) doi:10.1093/dnares/dsq024
Advance Access publication on November 2, 2010
Though significant advances in molecular genetics
and breeding have been reported in tomato, most of
them were based on interspecific crosses because
genetic diversity in the cultivated tomato is lower
than in its wild relatives.17 Meanwhile, intraspecific
maps are required to identify QTLs for agronomically
important traits, which are the targets of practical
breeding programs. However, only one intraspecific
map, based on AFLP, RFLP, and random amplified poly-
morphic DNA (RAPD) markers, has been reported for
S. lycopersicum.18
Single nucleotide polymorphisms (SNPs) are the
most abundant source of variation in the genome
for both intragenic and intergenic regions. They
therefore represent a valuable basis for the develop-
ment of molecular markers for identification of poly-
morphisms among closely related lines. Previous
studies have suggested that DNA markers developed
from intergenic regions tend to cluster in heterochro-
matic portions of chromosomes, while those derived
from genic regions disperse along entire chromo-
somes.7,19–22 Therefore, SNPs, especially those
located in intragenic regions, are expected to distri-
bute randomly along the whole genome. In addition,
novel techniques based on the DNA microarray
method allow high-throughput SNP genotyping.23
For these reasons, SNP markers derived from intra-
genic regions are the most informative markers for
genome-wide genetic analysis in intraspecific tomato
populations. By comparing expressed sequence tags
(ESTs) in tomato and related wild species, approxi-
mately 40 000 candidate SNPs have been ident-
ified.24–27 Since then, the number of ESTs derived
from several tomato cultivars has increased to app-
roximately 300 000, all of which are available in the
public DNA databases, e.g. DNA Data Bank of Japan
(DDBJ: http://www.ddbj.nig.ac.jp/), Sol Genomics
Network (SGN: http://solgenomics.net/), and MiBASE
(http://www.pgb.kazusa.or.jp/mibase/).
The tomato is regarded as a model plant not only
for the Solanaceae but also for other fruiting
plants.28 A miniature dwarf cultivar, ‘Micro-Tom’, orig-
inally bred for home gardening purposes,29 has drawn
attention as a model tomato line because of its small
plant size, short life cycle, easy transformation, and
availability of transposon-tagging systems for use in
reverse genetics.30 Various genomic and genetic
resources have been developed for ‘Micro-Tom’.
These include mutagenized lines,31,32 effective trans-
formation systems,33,34 metabolite annotations,35
full-length cDNAs,36 and BAC-end sequences
(Asamizu et al., released in the public DNA database
with accession numbers: FT227487–FT321168).
‘Micro-Tom’ seeds are available through two seed
stock centers: the Tomato Genetics Resource Center
at the University of California, Davis (USA, accession
no. LA3911) and the National Bio-Resource Project
at the University of Tsukuba (Japan, accession no.
TOMJPF00001).
In this study, we developed SNP markers using
publicly available ESTs from several tomato cultivars and
designed an SNP-genotyping platform using the
GoldenGate® assay (Illumina, San Diego, CA, USA) in
order to accelerate genetic studies and molecular
breeding in tomato. SNP markers, along with SSR
markers and intronic polymorphic markers, which
were developed and mapped onto the interspecific
map Tomato-EXPEN 2000 by Shirasawa et al.,7 were
applied to create linkage maps using two mapping
populations derived from crosses between ‘Micro-
Tom’ and ‘Ailsa Craig’, a greenhouse-type tomato,
and between ‘Micro-Tom’ and ‘M82’, a processing
tomato. In addition, the polymorphism of the SNP
markers was investigated in cultivated tomato lines
in order to estimate the transferability of the SNPs
to breeding materials.
2. Materials and methods
2.1. Plant materials
Two F2 mapping populations, AMF2 and MMF2,
each derived by crossing two S. lycopersicum lines,
were used for the construction of the linkage maps.
AMF2 (n = 120) was derived from a cross between
the ‘Ailsa Craig’ and ‘Micro-Tom’ lines, while MMF2
(n = 135) was derived from a cross between the
‘M82’ and ‘Micro-Tom’ lines. AMF2 and MMF2 were
generated in the National Institute of Vegetable
and Tea Science, Japan, and in the Institut National
de la Recherche Agronomique, France, respectively
(Table 1). Because the parental ‘Micro-Tom’ lines used
to create AMF2 and MMF2 may differ owing to residual
heterozygosity, they are distinguished in this study by the
designations ‘Micro-Tom_AM’ and ‘Micro-Tom_MM’,
respectively. Along with the four parental lines of
the mapping populations, 22 lines, including 16
inbred and 6 hybrid tomato lines, and an S. pennellii
line (‘LA716’) were used for polymorphic analysis of
SNPs (Table 1). Total DNA for each line was extracted
using the DNeasy Plant Mini kit (Qiagen, Hilden,
Germany).
382 Tomato Linkage Map of SNP Markers [Vol. 17,
Page 3
2.2. Development of SNP markers and polymorphic
analysis
A total of 229 086 EST sequences from S. lycopersicum,
retrieved from two public databases, SGN
(http://solgenomics.net/) and MiBASE
(http://www.pgb.kazusa.or.jp/mibase/), were used for
identification of eSNPs, i.e. SNPs discovered in silico. The
ESTs registered in MiBASE were derived only from
‘Micro-Tom’, while those registered in SGN were
developed from 19 tomato lines including ‘Micro-Tom’.
The retrieved EST sequences were assembled
using the MIRA program.37 The eSNPs were then
selected according to the following three criteria:
(i) only nucleotides with Phred scores of 15 or
more were considered candidates for eSNPs, (ii) a
nucleotide at an eSNP site should be identical
among multiple sequences within a given line, and
(iii) no other SNP candidates should be detected
on the flanking sequences 10 bp upstream and
downstream of a given candidate.
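The three criteria above can be sketched in code. This is a minimal illustration only, not the authors' pipeline: the input layout (one tuple per aligned base), the function name `call_esnps`, and the reading of criterion (iii) as "no other candidate that passed (i) and (ii) within 10 bp" are all assumptions made for the example.

```python
from collections import defaultdict

MIN_PHRED = 15   # criterion (i): minimum Phred score of a base
FLANK_BP = 10    # criterion (iii): flanking window on each side


def call_esnps(reads):
    """Return contig positions that pass the three eSNP criteria.

    reads: iterable of (line_name, position, base, phred) tuples for one
    contig. This layout is hypothetical, chosen for the example.
    """
    # (i) keep only bases with Phred >= 15, grouped by position and line
    bases = defaultdict(lambda: defaultdict(set))
    for line, pos, base, phred in reads:
        if phred >= MIN_PHRED:
            bases[pos][line].add(base)

    candidates = []
    for pos, per_line in bases.items():
        # (ii) each line must show one consistent base at this site
        if any(len(alleles) > 1 for alleles in per_line.values()):
            continue
        # a site is a candidate only if at least two lines differ
        if len({next(iter(a)) for a in per_line.values()}) > 1:
            candidates.append(pos)

    candidates.sort()
    # (iii) drop candidates that have another candidate within 10 bp
    return [p for p in candidates
            if not any(q != p and abs(q - p) <= FLANK_BP for q in candidates)]
```

In this sketch, two candidates that lie within 10 bp of each other are both discarded, which is one plausible reading of criterion (iii).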
Table 1. Description of plant materials
Line name | Note | Source (a) | Accession number | SNP validation (b)
Parental lines of mapping populations
Micro-Tom_AM | Inbred line | NIVTS | | Tested
Ailsa Craig | Inbred line | NBRP | TOMJPF00004 | Tested
Micro-Tom_MM | Inbred line | INRA | |
M82 | Inbred line | INRA | | Tested
Tomato lines for SNP typing
Aichi First | Inbred line | NBRP | TOMJPF00003 |
Best of All | Inbred line | NIVTS | LS3908 |
Earliana | Inbred line | TGRC | LA3238 | Tested
Fruit | Inbred line | NIVTS | LS1100 |
Furikoma | Inbred line | NIVTS | LS3903 |
Heinz 1706-BG | Inbred line | NIVTS | LS461 | Tested
LA925 | Inbred line | Cornell University | | Tested
Marglobe | Inbred line | TGRC | LA0502 | Tested
Money Maker | Inbred line | TGRC | LA2706 | Tested
Ponderosa | Inbred line | NIVTS | LS1728 |
Rio Grande | Inbred line | TGRC | LA3343 | Tested
Rutgers | Inbred line | TGRC | LA1090 | Tested
San Marzano | Inbred line | NIVTS | LS4956 | Tested
Tomato Chuukanbohon Nou 9 | Inbred line | NIVTS | |
Tomato Chuukanbohon Nou 11 | Inbred line | NIVTS | |
Geronimo | F1 hybrid | De Ruiter Seeds Co. | | Tested
Labell | F1 hybrid | De Ruiter Seeds Co. | | Tested
Matrix | F1 hybrid | De Ruiter Seeds Co. | | Tested
Momotaro 8 | F1 hybrid | Takii Seeds Co. | | Tested
Reika | F1 hybrid | Sakata Seeds Co. | | Tested
Regina | Inbred line, cherry type | Sakata Seeds Co. | |
Sweet100 | F1 hybrid, cherry type | Vilmorin Seeds Co. | | Tested
LA716 | Inbred line, S. pennellii | Cornell University | |
(a) NBRP: University of Tsukuba in National Bio-Resource Project of MEXT, Japan; INRA: National Institute for Agricultural
Research, France; NIVTS: National Institute of Vegetable and Tea Science, Japan; TGRC: Tomato Genetics Resource Center,
University of California, Davis, USA.
(b) Lines that were used for validation of the 82 eSNPs prior to designing the SNP genotyping platform based on the Illumina GoldenGate® assay.
In order to validate the identified eSNPs, nucleotide
sequences of PCR products containing
the eSNP regions were determined by direct
sequencing using a DNA sequencer (ABI-3730xl,
Applied Biosystems, Foster City, CA, USA). A total of
82 primer pairs were designed in flanking regions of
the randomly selected target eSNPs using the
Primer3 program.38 PCR was performed for the 17
tomato lines marked ‘Tested’ in Table 1 in a 5-µl reaction
mixture containing 0.5 ng genomic DNA, 1× PCR
buffer (Bioline, London, UK), 3 mM MgCl2, 0.04 U
BIOTAQ™ DNA polymerase (Bioline), 0.2 mM dNTPs,
and 0.8 µM of each of the primers. The modified
‘touchdown PCR’ protocol was used as described
previously.39
After validation of the 82 eSNPs, a total of 1536
eSNPs were subjected to polymorphic analysis for
the two mapping populations and the 23 tomato
lines described above using the GoldenGate® assay
system (Illumina). Allele- and locus-specific
oligonucleotides were designed from the flanking sequences
of the 1536 SNP sites using the iCom website
(https://icom.illumina.com/). Polymorphic analysis of the
SNPs was performed according to the standard
protocol of the GoldenGate® assay, and the data analysis
was performed using GenomeStudio Data Analysis
software (Illumina).
SNPs in DWARF (D) and SELF-PRUNING (SP) were
analyzed using the dCAPS and CAPS methods,
respectively. PCR was performed under the same conditions
as described above. The primer sequences are shown
in Supplementary Table S1. The PCR products from
the D and SP genes were digested with PstI and
MvaI, respectively, and were subjected to
electrophoresis on native 10% polyacrylamide gels in 1× TBE
buffer. The resulting DNA bands were then stained
with ethidium bromide.
2.3. Mapping of SSR and intronic polymorphic markers
on AMF2
A total of 3510 tomato genomic SSR (TGS), 2047
tomato EST-SSR (TES), and 166 tomato EST-derived
intronic polymorphic (TEI) markers, developed by
Shirasawa et al.,7 were used for segregation analysis
of the AMF2 population (Supplementary Table S1).
The polymorphic analyses of the markers were
performed as described previously.7 Primer information
for the tested markers is available at
http://www.kazusa.or.jp/tomato/.
2.4. Linkage analysis
Linkage analysis was performed using the JoinMap®
program (version 4).40 The segregation data were
classified into 12 linkage groups, which corresponded to
the Tomato-EXPEN 2000 map,7 using the grouping
module of JoinMap® with LOD scores of 4.0–10.0.
The marker order and relative genetic distances
were calculated by the regression-mapping algorithm
with the following parameters: Haldane’s mapping
function, recombination frequency ≤0.35, and LOD
score ≥2.0.
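For readers unfamiliar with Haldane's mapping function, the following sketch shows how a recombination frequency r is converted into a map distance in centimorgans, d = -50 ln(1 - 2r), and back. It is a standalone illustration of the standard formula and is not code from JoinMap®; the function names are chosen for this example.

```python
import math


def haldane_cm(r):
    """Map distance in cM for recombination frequency r (0 <= r < 0.5)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination frequency must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


def haldane_r(d_cm):
    """Inverse: recombination frequency for a map distance in cM."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_cm / 100.0))
```

Under this function, the recombination-frequency threshold of 0.35 used for regression mapping corresponds to a distance of roughly 60 cM.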
3. Results
3.1. In silico SNP mining and validation
A total of 170 586 and 58 500 EST sequences
available in SGN and MiBASE, respectively, along
with data on their quality, were used for in silico SNP
mining. The name of the original tomato line for
each EST was obtained from the DDBJ database
(http://www.ddbj.nig.ac.jp/). In total, 229 086 ESTs
derived from 20 tomato cultivars, the average length
of which was 497 bp, were used for assembly
(Table 2).
Assembly was performed using nucleotides with
Phred scores ≥15. As a result, a total of 20 274
contigs, the average length of which was 775 bp,
and 29 698 singletons were generated. From initial
alignment data from all 20 274 contigs, a total of
5607 eSNP sites were identified in 2634 of these
contigs (Supplementary Tables S2 and S3). We gave
an SNP code to each eSNP according to the following
rule: contig name and position of the eSNP on the
contig, linked with an underscore, e.g. the 112th
position on contig 2758 was given the following SNP
code: 2758_112.
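The naming rule can be expressed as a pair of small helper functions. These are illustrative only; the names `snp_code` and `parse_snp_code` are not from the paper, and the parser assumes the position is the part after the last underscore.

```python
def snp_code(contig, position):
    """Build an SNP code from a contig name and a position on the contig."""
    return f"{contig}_{position}"


def parse_snp_code(code):
    """Split an SNP code back into (contig name, position)."""
    contig, position = code.rsplit("_", 1)
    return contig, int(position)
```

For the example given in the text, `snp_code(2758, 112)` yields `2758_112`.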
Table 2. Number of ESTs and their original sources used for assembling
Line name (a) | No. of ESTs
TA496 | 106 142
Micro-Tom | 101 157
Rio Grande PtoR | 8803
R11-13 | 5031
R11-12 | 4925
TA492 | 2120
West Virginia 106 | 861
Money Maker | 11
Ailsa Craig | 7
VF36 | 7
Momotaro | 4
Zhongshu 4 | 4
Betterboy | 3
Vendor | 3
House Odoriko | 2
Rutgers | 2
M82 | 1
Pera | 1
Rio Grande | 1
UC82B | 1
Total | 229 086
(a) Names of tomato lines used for EST generation.
Before designing the SNP genotyping platform
(using the Illumina GoldenGate® assay), 82
randomly selected eSNPs were tested in 17 tomato
lines (Table 1) by direct sequencing of fragments
amplified by PCR. As a result, 55 (67%) out of the
82 examined eSNP candidates were experimentally
confirmed as SNPs at the predicted positions,
indicating that approximately 67% of the 5607 eSNPs
detected in silico represent true SNPs in the tomato
lines used in the present study. In addition, 40
(49%) and 50 (61%) of the 82 eSNPs segregated
between the two mapping parents for AMF2 and
MMF2, respectively.
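Because the 67% figure is extrapolated from a sample of 82 eSNPs, it carries sampling uncertainty. The sketch below computes the validation rate, the implied number of true SNPs among the 5607 eSNPs, and a Wilson score interval for the rate. The interval is an addition for illustration; the paper itself reports only the point estimate.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = successes / trials
    denom = 1.0 + z * z / trials
    centre = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials
                         + z * z / (4 * trials * trials)) / denom
    return centre - half, centre + half


validated, tested, total_esnps = 55, 82, 5607
rate = validated / tested                 # fraction of eSNPs confirmed
expected_true = rate * total_esnps        # extrapolated number of true SNPs
low, high = wilson_interval(validated, tested)
```

With only 82 tested sites the interval is fairly wide, so the extrapolated count should be read as a rough estimate.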
For SNP genotyping, a total of 1536 SNPs were
selected from the 5607 eSNPs, as follows: (i) one
eSNP was selected from each contig and the
Selected-BAC-Mixture contig released from the Kazusa
Tomato SBM & Marker Database
(http://www.kazusa.or.jp/tomato/); (ii) an SNP score of more than
0.6, as determined by the iCom website of Illumina
(https://icom.illumina.com/), was required for each of
these eSNPs. As reported by the GoldenGate® assay,
1338 (87%) out of the 1536 SNPs could be properly
genotyped in the 279 plants. These included the two
mapping populations (AMF2 and MMF2) and 23
other tomato lines. The remaining 198 (13%) eSNPs
failed to be genotyped because fluorescent signals for
these eSNPs did not form clusters pursuant to the
criteria required by the GenomeStudio Data Analysis
software (Illumina).
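The two selection rules can be sketched as a simple filter. The text does not say how a single eSNP was chosen when a contig carried several, so this example assumes the highest-scoring one is kept; the input layout, the function name `select_for_assay`, and the use of 1536 as a cap are likewise assumptions for illustration.

```python
def select_for_assay(esnps, min_score=0.6, limit=1536):
    """Pick at most one eSNP per contig with a design score above min_score.

    esnps: iterable of (contig, position, score) tuples (hypothetical layout).
    Returns SNP codes ordered by decreasing score, truncated to `limit`.
    """
    best = {}
    for contig, position, score in esnps:
        # rule (ii): the design score must exceed the threshold
        if score <= min_score:
            continue
        # rule (i): keep one eSNP per contig (assumed: the best-scoring one)
        if contig not in best or score > best[contig][1]:
            best[contig] = (position, score)
    ranked = sorted(best.items(), key=lambda item: item[1][1], reverse=True)
    return [f"{contig}_{pos}" for contig, (pos, _score) in ranked[:limit]]
```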
3.2. Mapping of SNP, SSR, and intronic markers
In the AMF2 population, 648 of the 1338 available
SNPs (48.4%) generated segregation data, a similar
ratio to that determined in the validation of the 82
eSNPs. Two SNP markers designed in the D and SP
genes, for which ‘Micro-Tom’ has mutant alleles,41
showed polymorphism between ‘Ailsa Craig’ and
‘Micro-Tom’. Along with the SNP markers, a total of
5723 previously reported markers, including 2047
EST-SSR (TES), 3510 genomic-SSR (TGS), and 166
intronic (TEI) markers, were used for the poly-
morphic analysis. As a result, 96 TES (4.7%), 223
TGS (6.3%), and 28 TEI (16.8%) markers exhibited
polymorphism between the parental lines. In total,
997 markers were used to construct the AMF2
linkage map.
In the MMF2 population, 640 of the 1338 available
SNPs (47.9%) segregated. This ratio was more than
10 percentage points lower than that determined in
the validation of the 82 eSNPs (61%), suggesting that
the eSNP validation overestimated the proportion of
SNPs segregating in this population. The SNP in the
D gene showed polymorphism in the MMF2
mapping population, while both parental lines
carried the mutated sp allele of the SP gene. In
total, 641 segregating markers were used to
construct the MMF2 map.
3.3. Construction of linkage maps
For AMF2, a total of 989 of the 997 segregated loci
(99.2%) formed 12 linkage groups (LGs), while 637
of the 641 segregated loci (99.4%) formed 13
linkage groups for MMF2. The total sizes of the LGs
of the AMF2 and MMF2 maps were 1467.8 and
1422.7 cM, respectively (Table 3, Fig. 1,
Supplementary Table S1). Combining the two maps
yielded a total of 1137 markers, including 793 SNP,
221 TGS, 93 TES, 28 TEI, and 2 gene markers,
located on the intraspecific maps. Among these, 488
SNP markers were located on both
linkage maps, while 157 and 148 SNP loci were
specific to the AMF2 map and the MMF2 map,
respectively. Chromosome 7 (Chr07) of MMF2
divided into two linkage groups, Chr07p and
Chr07q, which were located at the upper and the
lower portions, respectively, of Chr07 of Tomato-
EXPEN 2000. The average lengths of the intervals
between two loci on the AMF2 and the MMF2 maps
were calculated to be 1.5 and 2.2 cM, respectively.
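The reported averages and marker totals can be checked with a few lines of arithmetic. The sketch assumes the average interval is the total map length divided by the number of mapped loci, which reproduces the reported values; the variable names are illustrative.

```python
# Total map length (cM) and number of mapped loci, as reported in the text
maps = {
    "AMF2": {"length_cm": 1467.8, "loci": 989},
    "MMF2": {"length_cm": 1422.7, "loci": 637},
}


def mean_interval(length_cm, loci):
    """Average distance between adjacent loci, taken as length / loci."""
    return length_cm / loci


intervals = {name: round(mean_interval(m["length_cm"], m["loci"]), 1)
             for name, m in maps.items()}

# Marker classes on the combined map, and how the SNPs split between maps
combined = {"SNP": 793, "TGS": 221, "TES": 93, "TEI": 28, "gene": 2}
snp_shared, snp_amf2_only, snp_mmf2_only = 488, 157, 148
```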
Segregation distortions were observed in the two
maps. In the AMF2 map, 9.8% of the marker loci
showed segregation distortions, ranging from 0.0%
for Chr01, Chr08, and Chr12, to 55.4% for Chr11
(Table 3). In the MMF2 map, 5.3% of the marker
loci were distorted, ranging from 0.0% for Chr05
and Chr08, to 17.0% for Chr09 (Table 3). The
linkage groups harboring severe segregation distor-
tions were different between the two mapping popu-
lations, especially between Chr11 of AMF2 (55.4%)
and that of MMF2 (2.3%), suggesting Chr11 of ‘Ailsa
Craig’ might have transmission ratio distorters.
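Segregation distortion at a codominant marker in an F2 population is commonly assessed with a chi-square goodness-of-fit test against the expected 1:2:1 genotype ratio. The text does not state which test or threshold was applied here, so the sketch below is a generic illustration with an assumed significance level of 0.05. For 2 degrees of freedom the chi-square tail probability is exactly exp(-x/2), so no statistics library is needed.

```python
import math


def f2_distortion(n_a, n_h, n_b, alpha=0.05):
    """Chi-square test of F2 genotype counts against a 1:2:1 ratio.

    n_a, n_h, n_b: counts of the two homozygotes and the heterozygote.
    Returns (chi2, p_value, is_distorted).
    """
    total = n_a + n_h + n_b
    if total == 0:
        raise ValueError("no genotyped individuals")
    expected = (0.25 * total, 0.5 * total, 0.25 * total)
    chi2 = sum((obs - exp) ** 2 / exp
               for obs, exp in zip((n_a, n_h, n_b), expected))
    p_value = math.exp(-chi2 / 2.0)   # exact tail probability for 2 df
    return chi2, p_value, p_value < alpha
```

For example, counts of 50, 60, and 10 in a population of 120 would be flagged as distorted, whereas 30, 60, and 30 fit the expected ratio exactly.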
3.4. Polymorphic analysis of the SNP markers in
tomato cultivars and S. pennellii
A total of 916 (68.5%) out of the 1338 SNP
markers showed polymorphisms in at least one line
among the 27 tomato lines listed in Table 1
(Supplementary Table S4). The polymorphic ratio
was similar to the ratio determined during the
PCR-based validation of the 82 eSNPs. In ‘LA716’
(S. pennellii) and ‘Sweet100’, no data were obtained for
229 (17.1%) SNP markers and one SNP marker, respectively. The
polymorphic ratios differed according to the
combination of tomato lines (Fig. 2), and the number of
polymorphic SNPs between any two lines among the 27
lines was 255.0 (19.1%) on average. A total of
608.2 SNPs (45.5%) were identified between
‘Micro-Tom’ and the other inbred lines, on average, while
only 80.8 SNPs (6.0%) were identified among the
17 inbred tomato lines. Within the 17 inbred
tomato lines, ‘M82’ showed the highest number of
polymorphisms: 176.3 SNPs (13.2%) on average,
which was about twice as high as that of the other lines.
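The pairwise figures above are averages of SNP counts over pairs of lines. A minimal sketch of that calculation is shown below, assuming genotypes are stored as one call per SNP per line with `None` for missing data; the data layout and function names are hypothetical, and sites with a missing call in either line are skipped.

```python
from itertools import combinations


def count_polymorphic(calls_1, calls_2):
    """Number of SNP sites at which two lines have different genotype calls."""
    return sum(1 for a, b in zip(calls_1, calls_2)
               if a is not None and b is not None and a != b)


def mean_pairwise(genotypes):
    """Average number of polymorphic SNPs over all pairs of lines.

    genotypes: dict mapping line name to a list of calls, one per SNP.
    """
    pairs = list(combinations(genotypes, 2))
    total = sum(count_polymorphic(genotypes[x], genotypes[y])
                for x, y in pairs)
    return total / len(pairs)
```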
No. 6] K. Shirasawa et al. 385