All content in this area was uploaded by Justin D Faris on May 19, 2015
Funct Integr Genomics (2006) 6: 90–103
DOI 10.1007/s10142-005-0020-1
ORIGINAL PAPER
Huangjun Lu · Justin D. Faris
Macro- and microcolinearity between the genomic region of wheat chromosome 5B containing the Tsn1 gene and the rice genome
Received: 13 September 2005 / Revised: 21 November 2005 / Accepted: 22 November 2005 / Published online: 22 December 2005
© Springer-Verlag 2005
Abstract The Tsn1 gene in wheat confers sensitivity to a
proteinaceous host-selective toxin (Ptr ToxA) produced
by the tan spot fungus (Pyrenophora tritici-repentis) and
lies within a gene-rich region of chromosome 5B. To use
the rice genome sequence information for the map-based
cloning of Tsn1, colinearity between the wheat genomic
region containing Tsn1 and the rice genome was de-
termined at the macro- and microlevels. Macrocolinearity
was determined by testing 28 expressed sequence mark-
ers (ESMs) spanning a 25.5-cM segment and encom-
passing Tsn1 for similarity to rice sequences. Twelve
ESMs had no similarity to rice sequences, and 16 had
similarity to sequences on seven different rice chromo-
somes. Segments of colinearity with rice chromosomes 3
and 9 were identified, but frequent rearrangements and
disruptions occurred. Microcolinearity was determined by
testing the sequences of 26 putative genes identified from
BAC contigs of 205 and 548 kb in length and flanking
Tsn1 for similarity to rice genomic sequences. Fourteen
of the predicted genes detected orthologous sequences on
six different rice chromosomes, whereas the remaining
12 had no similarity with rice sequences. Four genes
were colinear on rice chromosome 9, but multiple dis-
ruptions, rearrangements, and duplications were observed
in wheat relative to rice. The data reported provide a
detailed analysis of a region of wheat chromosome 5B
that is highly rearranged relative to rice.
Keywords Wheat · Comparative genomics · Colinearity · Rice
Introduction
Bread wheat (Triticum aestivum L., 2n=6x=42, AABBDD
genomes) is one of the most important food crops in the
world. Genomics and gene discovery in hexaploid wheat are
confounded by a genome size of approximately 17,300 Mb
(Bennett and Leitch 1995) and an abundance (80%) of re-
petitive sequences (Wicker et al. 2001; SanMiguel et al.
2002). Despite these obstacles, considerable progress toward
understanding genome structure and organization has been
made. The polyploid nature of wheat allows it to tolerate
aneuploidy, including the loss of large chromosome seg-
ments, chromosome arms, and even whole chromosomes.
This feature has led to the development of a large collection
of aneuploid stocks for cytogenetic and genomic analyses
(Sears 1954; 1966; Sears and Sears 1978; Endo and Gill
1996). Endo and Gill (1996) isolated more than 400 terminal
chromosome deletion lines, which have been used to
develop physical maps of wheat chromosomes (Werner et
al. 1992; Gill et al. 1993; Hohmann et al. 1994; Delaney et al.
1995a,b; Mickelson-Young et al. 1995). Comparisons of the
physical maps with recombination-based maps led to the
discovery that recombination frequencies were highest near
the distal regions of the chromosomes and that genes existed
in clusters with the highest gene density occurring near the
telomeric regions (Gill et al. 1996a,b; Akhunov et al. 2003;
Erayman et al. 2004).
More than 600,000 expressed sequence tags (ESTs) have
been generated from wheat and closely related species by
the National Science Foundation (NSF)-funded wheat EST
project and other public and private entities
(http://www.ncbi.nlm.nih.gov/dbEST/dbEST_summary.html). A panel
of the chromosome deletion lines was used to locate more
than 16,000 EST loci to specific chromosome deletion bins
by the wheat NSF EST project (Qi et al. 2004). The EST
sequence and mapping data provide a valuable resource for
genome analysis, identification of candidate genes for traits
of interest, predicting biological function of genes, and
comparative genomics.
H. Lu
Department of Plant Sciences,
North Dakota State University,
Fargo, ND 58105, USA
J. D. Faris (*)
USDA-ARS Cereal Crops Research Unit,
Red River Valley Agricultural Research Center,
Fargo, ND 58105, USA
e-mail: farisj@fargo.ars.usda.gov
Comparative analysis of plant genomes has increased
our knowledge of evolutionary relationships among species,
and considerable effort has been put forth in comparing
the genomic relationships among grasses (Devos
and Gale 2000). Comparative mapping experiments among
wheat and other members of the Poaceae including rice,
barley, rye, oats, and maize have revealed remarkable
similarities in gene content and marker colinearity at the
chromosome (macro) level. The genomes of distantly re-
lated cereals like oat, rice, and maize can be divided into
linkage blocks that have homology to corresponding seg-
ments of the wheat genome (Ahn et al. 1993; Van Deynze
et al. 1995a,b).
The degree of genomic similarity observed at the macro-
level among grass genomes led to the notion that infor-
mation from the small genome of rice could be directly
applied to the much larger genome of wheat, and the recent
completion of rice genome sequencing (Goff et al. 2002;
Yu et al. 2002; International Rice Genome Sequencing
Project 2005) makes it a valuable reference for comparative
genomics of the grass family. However, while some studies
of colinearity between wheat and rice at the sequence
(micro) level indicated good levels of conservation (Yan et
al. 2003; Chantret et al. 2004; Distelfeld et al. 2004), many
have reported the occurrence of multiple rearrangements in
gene order and content (Bennetzen 2000; Feuillet and
Keller 2002; Li and Gill 2002; Sorrells et al. 2003; Francki
et al. 2004). Nevertheless, the rice genome sequence is a
potentially valuable tool for map-based cloning of genes in
wheat. The positional cloning of the wheat vernalization
gene VRN1 (Yan et al. 2003) is a good example of using
information from the colinear regions in rice and sorghum
to facilitate the cloning of a wheat gene. Others have shown
that colinear regions of rice can be a useful source of
markers for saturation and high-resolution mapping of
target genes in wheat (Liu and Anderson 2003; Distelfeld
et al. 2004).
Our goal is to isolate the Tsn1 gene in wheat using a
positional cloning approach. Tsn1 confers sensitivity to the
host-selective proteinaceous toxin Ptr ToxA, which is produced
by the tan spot fungus (Pyrenophora tritici-repentis)
(Faris et al. 1996). In previous work, we located Tsn1 to the
long arm of chromosome 5B by genetic linkage mapping
(Faris et al. 1996), and later, we located the gene on the
physical map of 5B within the deletion bin 5BL 0.75–0.76
(Faris et al. 2000). Haen et al. (2004) conducted saturation
and high-resolution mapping of the Tsn1 locus and iden-
tified markers delineating the gene to a 0.8-cM interval. In
addition, we conducted genetic linkage mapping of 23
markers derived from bin-mapped ESTs to further saturate
the Tsn1 locus. In this study, we explored the level of
macrocolinearity between a 25.5-cM segment containing
Tsn1 and rice using EST and cDNA marker sequences. We
also evaluated the level of microcolinearity of the Tsn1
region and rice using putative gene sequences derived from
more than 750 kb of BAC sequence at the Tsn1 locus.
Materials and methods
Genetic mapping of expressed sequence markers
A mapping population consisting of 117 recombinant sub-
stitution lines (RSLs) derived from a cross between Chinese
Spring (CS) and a CS–Triticum dicoccoides chromosome
5B disomic substitution line (CS-DIC 5B, i.e., the CS 5B
chromosome pair was replaced by a pair of T. dicoccoides
5B chromosomes) was used for mapping expressed se-
quence markers (ESMs), which consisted of EST-derived
markers and cDNA-derived RFLP markers, within and near
the 5BL 0.75–0.76 interval (Faris et al. 2000; Lu et al.,
unpublished data). The genetic mapping of cDNA-derived
RFLP markers XksuQ63, XksuQ11, Xcdo400, Xbcd183,
Xbcd1030, Xksu919(Lpx), Xrz575, and Xcdo465 was
described in Faris et al. (2000). Marker Xfcc1 was described
in Haen et al. (2004), and details of EST-derived markers
and mapping are described elsewhere (Lu et al., unpub-
lished data). The resulting map, which spanned 25.5 cM
encompassing the Tsn1 locus, was used to determine
macrocolinearity with rice.
Chromosome walking and high-resolution mapping
The Langdon (LDN) BAC library (Cenci et al. 2003) was
used to assemble BAC contigs of 205 and 548 kb (hereafter
referred to as ctg205 and ctg548, respectively) flanking the
Tsn1 gene. The library was initially screened with probes
FCG17 (CC249715) and FCG9 (CC249705), which are
AFLP-derived RFLP probes that flank Tsn1 (Haen et al.
2004). Positive clones identified with FCG17 were used to
assemble ctg205 on the proximal side of Tsn1, which con-
sists of the overlapping BACs 378P21 (AY914085) and
1154L7 (AY914086) (Lu et al., unpublished data). BACs
404J6 (DQ157837) and 533E21 (DQ157838) were iden-
tified with probe FCG9 and used to assemble a 228-kb
contig on the distal side of Tsn1. Three subsequent chromosome
walking steps, which included BACs 50F12 (DQ157835),
239E17 (DQ157836), and 1089I4 (DQ157839),
were performed from the proximal end of the 228-kb contig
resulting in ctg548. For each walking step, BACs were
completely sequenced (Lu and Faris, unpublished data),
and putative genic and low-copy regions were exploited for
the development of RFLP markers as described in Faris et
al. (2003). The markers were anchored to the genetic map
by conducting high-resolution mapping in a durum wheat
(Triticum turgidum L., 2n=4x=28, AABB genomes) pop-
ulation derived from crossing LDN with LDN-DIC 5B as
described (Haen et al. 2004). This mapping population
consists of 2,719 Ptr ToxA-insensitive F2 plants, which are
all homozygous for the tsn1 allele.
BLAST similarity searches
Sequences of ESMs were tested for similarity to sequences
in The Institute for Genomic Research (TIGR) wheat gene
index database TaGI release 10.0 (http://tigrblast.tigr.org/tgi/)
using BLASTn (Altschul et al. 1997) to identify the
tentative consensus (TC) sequences containing the corre-
sponding ESM. TC sequences were downloaded and sub-
jected to BLASTx searches against the NCBI nonredundant
(nr) database. A significant match was declared on the basis
of a minimum 40% amino acid identity for at least half the
length of the TC or EST sequence and an e value of less than
e−11. The hit with the lowest e value was considered the
corresponding protein of the ESM when several matches
were found.
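The BLASTx criteria described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the `Hit` record and its field names are hypothetical, the threshold "e−11" is read as 1e−11, and the alignment length and query length are assumed to be expressed in the same units.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Hit:
    """One BLASTx match (hypothetical record; field names are illustrative)."""
    subject: str             # identifier of the matched protein
    percent_identity: float  # amino acid identity over the alignment, in percent
    align_len: int           # aligned length, same units as the query length
    evalue: float


def is_significant_blastx(hit: Hit, query_len: int,
                          min_identity: float = 40.0,
                          max_evalue: float = 1e-11) -> bool:
    """Apply the stated criteria: >=40% identity over at least half the
    query length and an e value below the cutoff (assumed to be 1e-11)."""
    covers_half = hit.align_len >= query_len / 2
    return (hit.percent_identity >= min_identity
            and covers_half
            and hit.evalue < max_evalue)


def best_hit(hits: Sequence[Hit], query_len: int) -> Optional[Hit]:
    """Return the significant hit with the lowest e value, or None."""
    significant = [h for h in hits if is_significant_blastx(h, query_len)]
    return min(significant, key=lambda h: h.evalue, default=None)
```

Under these assumptions, a query with several significant matches is assigned the match with the lowest e value, and a query with none is reported as having no significant similarity (NS).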
Fig. 1 Macrocolinearity between a 25.5-cM region of wheat chromosome
5BL encompassing Tsn1 and the rice genome based on expressed
sequence markers (ESMs). A physical map of wheat chromosome 5B is
shown to the left for reference to the genomic region evaluated in
this study, and the 5BL-14 and 5BL-9 deletion breakpoints, which
define the 5BL 0.75–0.76 deletion interval, are indicated on the
wheat genetic map (center). Rice chromosomes harboring sequences
with significant similarity to ESMs are indicated to the right, and
the megabase positions of the significant hits along rice
chromosomes are indicated in parentheses
TC sequences (or ESM sequences when no TC was
available) were subjected to BLASTn and tBLASTx
searches against rice genomic sequences using Gramene
(Ware et al. 2002), genomic sequences of genes in TIGR
rice pseudomolecules
(http://tigrblast.tigr.org/euk-blast/index.cgi?project=osa1),
and all rice BAC and PAC sequences in GenBank
(http://tigrblast.tigr.org/euk-blast/index.cgi?project=osa1).
For BLASTn searches, the threshold limits for
significant matches were at least 65% nucleotide identity
for at least half of the ESM or TC sequence, but not less
than 150 bases, and an e value of less than e−20. For
tBLASTx searches, significance was declared when there
was at least 40% amino acid identity over at least half of the
ESM or TC sequence, but no less than 200 amino acids, and
an e value of less than e−11. When there were several
significant matches for a single wheat ESM or TC
sequence, only the best match is reported.
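The BLASTn and tBLASTx thresholds used for the rice comparisons differ only in their numeric cutoffs, so they can be written as one parameterized check. The sketch below is illustrative only: the `Criteria` record and `passes` function are hypothetical names, "e−20" and "e−11" are read as 1e−20 and 1e−11, and the alignment and query lengths are assumed to be in the same units (bases for BLASTn, amino acids for tBLASTx).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criteria:
    """Significance thresholds for one search type (illustrative record)."""
    min_identity: float  # minimum percent identity
    min_length: int      # absolute floor on alignment length
    max_evalue: float    # e value must be strictly below this


# Thresholds as stated in the text; the numeric e value reading is an assumption.
BLASTN = Criteria(min_identity=65.0, min_length=150, max_evalue=1e-20)
TBLASTX = Criteria(min_identity=40.0, min_length=200, max_evalue=1e-11)


def passes(criteria: Criteria, percent_identity: float,
           align_len: int, query_len: int, evalue: float) -> bool:
    """True when the hit covers at least half the query, but never less than
    the absolute length floor, and meets the identity and e value cutoffs."""
    required_len = max(query_len / 2, criteria.min_length)
    return (percent_identity >= criteria.min_identity
            and align_len >= required_len
            and evalue < criteria.max_evalue)
```

For a short query the absolute floor dominates: a 200-base ESM needs an alignment of at least 150 bases, not 100, to count as a BLASTn match under this reading.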
BAC sequence analysis and gene prediction
Sequences of BACs were first subjected to BLASTn
searches of the Triticeae Repeat Sequence (TREP) database
(http://wheat.pw.usda.gov/ITMI/Repeats/index.shtml), the
TIGR Triticum Repeat Database
(http://www.tigr.org/tdb/e2k1/plant.repeats/), and BLASTn and
BLASTx searches of the NCBI nr database
(http://www.ncbi.nlm.nih.gov/BLAST/) to identify repetitive
elements. Sequences not
containing putative repetitive elements were then subjected
to BLASTn and BLASTx searches of the NCBI nr database
and BLASTn searches of the wheat EST database
(http://wheat.pw.usda.gov/wEST/blast/), the NCBI EST database
(http://www.ncbi.nlm.nih.gov/dbEST/), and the TIGR wheat
gene indices database (http://tigrblast.tigr.org/tgi/) to identify
putative genes. Whole BAC sequences were also submitted
to the Rice Genome Automated Annotation System (Rice
GAAS; http://ricegaas.dna.affrc.go.jp/), where sequences
were analyzed with the various integrated coding prediction
programs (Autopredgenset, GENSCAN, RiceHMM,
FGENESH, MZEF), homology search analysis programs
(BLAST, HMMER, ProfileScan, MOTIF), and repetitive
DNA analysis programs (RepeatMasker, Printrepeats,
AutoPredLTR, BLASTn against NCBI LTRdb, and BLASTx
against NCBI transposon subjects). To validate or correct
the predicted structure of putative genes, cDNA sequences
with greater than 80% identity were aligned with the ge-
nomic sequence of the corresponding predicted gene, and
the predicted coding sequence was then manually edited if
necessary. Predicted coding sequences of putative genes
were analyzed against rice genomic sequences using the
same criteria as for ESM and TC sequences described
above.
Results
Analysis of ESMs and corresponding sequences
The map for the genomic region under investigation
consisted of 29 ESM loci including seven previously
anonymous cDNA-RFLP markers (Faris et al. 2000), one
cDNA-derived AFLP marker (Xfcc1) (Haen et al. 2004),
and 21 EST-derived markers (Lu et al., unpublished data).
One EST-derived marker (XBM138151) detected two loci
(Fig. 1). Faris et al. (2000) showed that Tsn1 mapped near
the distal end of the 5BL 0.75–0.76 deletion interval.
Therefore, the 25.5-cM genomic region considered for this
study includes all of the 5BL 0.75–0.76 deletion interval,
which accounts for approximately 16.8 cM, and extends
about 8.7 cM into the 5BL 0.76–0.79 deletion interval
(Fig. 1).
The wheat NSF-EST project conducted BLASTn/x
searches of the public databases for all mapped ESTs, and
the best hits were reported at
http://wheat.pw.usda.gov/cgi-bin/westsql/map_locus.cgi. For this study, ESM sequences
were subjected to BLASTn searches of the TIGR gene
indices, which led to the identification of longer TC se-
quences for all but six of the ESM sequences. BLASTx
searches of the NCBI nr database revealed that 17 (61%) of
the 28 sequences had significant similarities to known or
putative proteins, whereas the remaining 11 sequences had
no similarities to any sequences in the databases (Table 1).
Macrocolinearity of the genomic region containing Tsn1 and rice
To evaluate the level of colinearity between the 25.5 cM of
wheat chromosome 5B spanning Tsn1 and the rice genome,
we subjected ESM sequences to BLAST searches of the
rice genomic sequences. Of the 28 ESM sequences, 12
(43%) had no significant similarity to rice genomic se-
quences (Table 1). Eleven of these 12 were the same ESM
sequences that had no similarities to known or putative
proteins. XBE590499 was the only ESM that detected no
similar sequences in the rice genome but was assigned a
putative protein function (Gamma-2 purothionin). The re-
maining 16 (57%) ESMs had significant similarity to se-
quences on seven different rice chromosomes (Table 1;
Fig. 1).
With the exception of Xfcc1, ESMs in the distal 13.9 cM
of the genetic map that detected similar rice sequences had
colinearity with rice chromosome 3 (Fig. 1). Five ESMs
within this region of wheat chromosome 5B were colinear
with a 7.8-Mb segment of the distal region of the long arm
of rice chromosome 3, but colinearity was interrupted by
Xfcc1 (rice chromosome 9). Five ESMs within the same
region had no similarity to any rice genomic sequences.
The proximal 11.6 cM of the wheat genetic map was a
mosaic of similarities to different rice chromosomes (Fig. 1).
However, three ESMs within this segment of the map had
significant similarity to sequences on rice chromosome 9.
These three ESMs together with Xfcc1 were colinear with a
1.2-Mb segment of rice chromosome 9.
Three ESM sequences had similarity to rice chromo-
some 2 sequences, but whereas two of them (XBE403968
and XBE500658) detected rice sequences near each other,
the third (Xcdo400) detected a rice sequence approxi-
mately 14 Mb away. Of the four remaining ESMs, three
detected sequences on different rice chromosomes (4, 7,
and 12), and the fourth (XBF484437) detected sequences
on rice chromosomes 5 and 11 with nearly equal signif-
icance (Table 1; Fig. 1).
At the macrolevel, the Tsn1 locus was flanked by one
ESM on the proximal side and two ESMs on the distal side
that detected no similar sequences in rice. Adjacent to these
ESMs, but farther from Tsn1, were ESMs with similarity to
different rice chromosomes (Fig. 1). Therefore, colinearity
between the wheat Tsn1 region and rice was not conserved
at the macrolevel.
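The counts quoted in this section can be reproduced by tallying the rice chromosome assignments listed in Table 1. The short sketch below does this; the dictionary is transcribed from Table 1, the variable names are illustrative, and the dual assignment of XBF484437 is kept as a single "5/11" category, which is an assumption about how the seven chromosomes were counted.

```python
from collections import Counter

# Rice chromosome of the best hit for each ESM with a significant match,
# transcribed from Table 1 (XBF484437 hit chromosomes 5 and 11).
RICE_CHROMOSOME = {
    "XBF203136": "2", "XBE488792": "9", "XBE426161": "9", "XBE403217": "4",
    "XBF484437": "5/11", "XBE500658": "2", "XBE591798": "9", "Xcdo400": "2",
    "Xbcd183": "7", "XBE403702": "12", "XBE423505": "3", "Xfcc1": "9",
    "XBE443610": "3", "Xksu919(Lpx)": "3", "Xrz575": "3", "Xcdo465": "3",
}
TOTAL_ESMS = 28  # number of ESM sequences tested

counts = Counter(RICE_CHROMOSOME.values())
with_hit = len(RICE_CHROMOSOME)
without_hit = TOTAL_ESMS - with_hit

# Report each chromosome category with its number of ESMs, most frequent first.
for chromosome, n in counts.most_common():
    print(f"rice chromosome {chromosome}: {n} ESM(s)")
print(f"{with_hit} of {TOTAL_ESMS} ESMs ({100 * with_hit / TOTAL_ESMS:.0f}%) matched rice")
```

The tally gives five ESMs on rice chromosome 3, four on chromosome 9, and three on chromosome 2, with the remaining four scattered, which is the mosaic pattern described in the text.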
Table 1 Predicted proteins of expressed sequence markers based on BLASTx searches of the NCBI database and the chromosome assignments of corresponding rice orthologues based on the best BLASTn and tBLASTx hits to rice genomic sequences
GenBank | Marker | TC | Putative protein (NCBI BLASTx) | BLASTx e value | Rice chromosome, BLASTn (a) | BLASTn e value | Rice chromosome, tBLASTx | tBLASTx e value
BF483510 | XBF483510 | TC262304 | NS | - | NS | - | NS | -
BE403968 | XBE403968 | TC273054 | NS | - | NS | - | NS | -
BF203136 | XBF203136 | TC248883 | Unknown protein [Oryza sativa] | 0 | 2 | 0 | 2 | 6.9e−232
BE590499 | XBE590499 | TC234385 | Gamma-2 purothionin | 1.0e−22 | NS | - | NS | -
BE488792 | XBE488792 | TC236559 | Putative phospholipase D beta 1 [O. sativa] | 0 | 9 | 0 | 9 | 2.3e−292
BE426161 | XBE426161 | TC271278 | Putative anthranilate n-benzoyltransferase [O. sativa] | 2.0e−60 | 9 | 9.3e−70 | 9 | 5.0e−56
BE403217 | XBE403217 | TC247678 | Multidrug resistance protein (MRP)-like ABC transporter [O. sativa] | 4.0e−144 | 4 | 0 | 4 | 2.4e−233
DT319102 | XksuQ63 | N/A | NS | - | NS | - | NS | -
BF484437 | XBF484437 | N/A | Hypothetical protein [O. sativa] | 2.0e−73 | 5 | 2.3e−92 | 11 | 3.6e−69
BE442978 | XBE442978 | TC240270 | NS | - | NS | - | NS | -
BM140357 | XBM140357 | TC262273 | NS | - | NS | - | NS | -
DT319103 | XksuQ11 | TC271737 | NS | - | NS | - | NS | -
BE500658 | XBE500658 | N/A | Hypothetical protein [O. sativa] | 3.0e−12 | 2 | 1.4e−31 | 2 | 7.3e−15
BE591798 | XBE591798 | TC255372 | Putative nicotinate-nucleotide pyrophosphorylase [Arabidopsis thaliana] | 9.0e−126 | 9 | 1.5e−119 | 9 | 5.8e−81
BE445619 | XBE445619 | TC242559 | NS | - | NS | - | NS | -
BE439060 | Xcdo400 | TC251537 | Putative ubiquitin-conjugating enzyme E2 [O. sativa] | 9.0e−80 | 2 | 3.7e−73 | 2 | 3.5e−53
BE438984 | Xbcd183 | TC268348 | Putative transducin WD-40 repeat protein [O. sativa] | 2.0e−78 | 7 | 4.1e−93 | 7 | 3.7e−63
BE403702 | XBE403702 | TC263127 | PHD finger [O. sativa] | 1.0e−60 | 12 | 1.7e−160 | 12 | 5.3e−110
BE423505 | XBE423505 | TC250760 | Serpin [Triticum aestivum] | 0 | 3 | 5.0e−171 | 3 | 6.1e−219
BF483506 | XBF483506 | N/A | NS | - | NS | - | NS | -
BM138151 | XBM138151.1, XBM138151.2 | N/A | NS | - | NS | - | NS | -
BE425878 | XBE425878 | TC254873 | NS | - | NS | - | NS | -
CC249712 | Xfcc1 | TC235883 | Ps16 protein | 4.0e−96 | 9 | 9.8e−93 | 9 | 4.9e−68
BE443610 | XBE443610 | TC238236 | Mannosyltransferase (PIG-M) domain containing protein [O. sativa] | 2.0e−91 | 3 | 4.6e−113 | 3 | 1.8e−94
BE604920 | Xksu919(Lpx) | TC248004 | Lipoxygenase 2 [Hordeum vulgare] | 8.0e−89 | 3 | 2.4e−97 | 3 | 2.8e−67
AI978442 | Xrz575 | TC275921 | Putative somatic embryogenesis related protein [O. sativa] | 0 | 3 | 0 | 3 | 3.9e−287
BG608197 | XBG608197 | N/A | NS | - | NS | - | NS | -
BE439135 | Xcdo465 | TC250804 | Cell wall beta-glucosidase [Secale cereale] | 2.0e−95 | 3 | 2.4e−119 | 3 | 1.6e−74
TC Tentative consensus, NS No significant similarity, N/A TC sequences not available
(a) Positions of hits to rice chromosomes are presented in Fig. 1
Fig. 2 Microcolinearity of putative genes within wheat BAC
contigs flanking the Tsn1 locus and the rice genome. BAC contigs
(center) are indicated in black, and a kilobase scale is indicated to
the left of each contig. Gene designations are indicated to the left of
the BAC contigs and shown in their relative positions along the
contigs as yellow boxes. The LDN × LDN-DIC 5B genetic map is
shown to the left in gray. Positions of markers used to anchor the
contigs to the genetic map are shown along the contigs as red lines.
Markers used to initiate the chromosome walk are shown in red
along the genetic map. Rice chromosomes harboring sequences with
significant similarity to genes in the BAC contigs are shown to the
right of the BAC contigs, and the megabase positions of the
significant hits along rice chromosomes (except for rice chromosome
9) are indicated in parentheses. The 75-kb region of rice chromosome
9 having colinearity with the Tsn1 region of wheat is shown to
the right, and the relative positions of genes are indicated as yellow
boxes. The transcriptional orientations of genes within the wheat
BAC contigs and putative orthologues on rice chromosome 9 are
indicated with purple arrows
Anchoring the BAC contig to the genetic map
We sequenced contigs of 205 and 548 kb flanking Tsn1
(Lu and Faris, unpublished data). The correct orientation
of ctg205 is currently unknown because no
recombinants were identified among 5,438 gametes using
markers Xfcg21 and Xfcg22 near the nonoverlapping ends
of the contig (Lu et al., unpublished data) (Fig. 2).
Probe FCG23 was developed from a low-copy region
near the distal end of ctg548 and mapped 0.28 cM distal to
Xfcg9 (Fig. 2), allowing the orientation of the initial 228-kb
contig to be determined (Lu et al., unpublished data). Sub-
sequent chromosome walking steps led to the development
of probes FCG25 and FCG26. Probe FCG25, which is a
fragment of a DHHC-type zinc finger (DZF)-like gene
(see below), detected two loci (Xfcg25.1 and Xfcg25.2)
that mapped 0.09 and 0.25 cM from Tsn1, respectively
(Fig. 2). The probe FCG26, which is a fragment of a po-
tassium transporter (PT)-like gene, cosegregated with Tsn1
in the 5,438 gametes. Therefore, ctg548 spans 0.74 cM, and
the genetic distance between ctg205 and ctg548 is 0.3 cM
(Fig. 2). The cumulative physical to genetic distance ratio
across both contigs is approximately 1 Mb/cM.
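The arithmetic behind these figures can be laid out explicitly. The sketch below is a hedged reconstruction, not the authors' calculation: it assumes that ctg205 contributes no measurable genetic distance (no recombinants were found within it), that the 0.3-cM gap between the contigs is excluded, and that one recombinant among N gametes corresponds to roughly 100/N cM. All variable names are illustrative.

```python
# Physical sizes (kb) and genetic spans (cM) of the two contigs.
# The 0.0 cM for ctg205 is an assumption based on the absence of recombinants.
CTG_KB = {"ctg205": 205, "ctg548": 548}
CTG_CM = {"ctg205": 0.0, "ctg548": 0.74}

total_kb = sum(CTG_KB.values())
total_cm = sum(CTG_CM.values())

# Cumulative physical-to-genetic ratio across both contigs, in Mb per cM.
mb_per_cm = (total_kb / 1000) / total_cm

# Each of the 2,719 F2 plants represents two gametes.
gametes = 2 * 2719

# Approximate genetic distance represented by a single recombinant gamete.
cm_per_recombinant = 100 / gametes

print(f"{total_kb} kb over {total_cm} cM = {mb_per_cm:.2f} Mb/cM")
print(f"{gametes} gametes, about {cm_per_recombinant:.3f} cM per recombinant")
```

Under these assumptions the ratio comes out slightly above 1 Mb/cM, in line with the approximate value given in the text.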
Gene prediction
A total of 26 putative genes were identified in the BAC
contigs for an average gene density of one gene/29 kb. Ten
putative genes were identified on ctg205 (Table 2; Fig. 2).
Two predicted open reading frames (ORFs) in the same
orientation at positions 336–2,312 and 4,389–7,025 of
ctg205 had significant similarity to subtilisin-like serine
protease (SSP) genes from rice as indicated by BLASTx
alignments. SSP1 and SSP2 were 1,977 and 2,637 bp in
size, respectively, and were separated by a 2,077-bp down-
stream region of SSP1, which was also duplicated down-
stream of SSP2. The two genes shared 96% identity at the
nucleotide level including the duplicated downstream re-
gions, and SSP2 had seven exons whereas SSP1 had six, but
the 660-bp difference in size between the two genes was
primarily due to a much larger intron 3 in SSP2. A wheat
cDNA (CB307830) was identified that had 96% similarity
to both SSP1 (less than e−155) and SSP2 (less than e−0),
indicating that at least one of the genes is functional. The
cDNA sequence was used to validate and/or correct predicted
exons 3–5 of SSP1 and exons 4–6 of SSP2. SSP1 and
SSP2 coded for proteins of 441 and 458 amino acids,
respectively, and they shared 76% identity at that level.
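The gene sizes and the average gene density reported here follow directly from the contig coordinates. A minimal check, assuming positions are 1-based and inclusive (the helper name `span_bp` is illustrative):

```python
def span_bp(start: int, end: int) -> int:
    """Length in bp of a feature given 1-based inclusive coordinates."""
    return end - start + 1


# Coordinates of the two subtilisin-like serine protease genes on ctg205.
ssp1_bp = span_bp(336, 2312)
ssp2_bp = span_bp(4389, 7025)
size_difference_bp = ssp2_bp - ssp1_bp

# Average gene density: total sequenced contig length divided by gene count.
total_contig_kb = 205 + 548
predicted_genes = 26
kb_per_gene = total_contig_kb / predicted_genes

print(ssp1_bp, ssp2_bp, size_difference_bp)
print(f"about one gene per {kb_per_gene:.0f} kb")
```

The same inclusive-coordinate convention reproduces the other gene lengths listed in Table 2.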
Two putative S-receptor kinase (SRK)-like genes in the
same orientation were identified on ctg205 (Table 2; Fig. 2).
SRK1 occupied 2,556 bp at position 25,047–27,602. This
gene was predicted as having two exons, but alignments
with multiple wheat cDNAs with significant similarity
(less than e−88) and a rice homolog from chromosome 7
with greater than 80% identity at the nucleotide level
indicated that SRK1 contained a single exon. The pre-
dicted protein of SRK1 contained 851 amino acids. The
SRK2 gene was 1,077 bp long and resided at position
107,511–108,587 of ctg205. The gene had one predicted
exon, and it had significant similarity (e−47) with a wheat
cDNA clone (CD877820). However, the degree of sim-
ilarity and alignment length between SRK2 and homolo-
gous cDNA sequences from wheat and other species were
not sufficient to validate the structure of SRK2. SRK1 and
SRK2 were nearly 80 kb apart, and the predicted ORFs
differed in size by 1,479 bp, but when we aligned the full-
length SRK1 gene to the predicted ORF of SRK2 plus the
1,479 bp 3′ flanking region, we found that the two se-
quences were 94% identical, and the predicted ORF of
SRK2 was truncated due to an apparent mutation at position
1,075 of the gene that induced a premature stop codon.
Therefore, SRK1 and SRK2 are likely paralogs, but it is
unknown whether SRK2 is functional in its putatively truncated
form. Comparisons of the 5′ upstream regions of the
two genes indicated a conserved duplication residing at
position −316 to −744 from the start codon of SRK1 and at
position −475 to −888 from the start codon of SRK2, which
retained 93% identity. Residing at position −1,390 to
−1,478 from the start codon of SRK2 was a microsatellite
consisting of GT(44) repeats, which was used to develop
the marker Xfcp1 for marker-assisted selection as pre-
viously described (Lu et al., unpublished data). No sim-
ilarity was retained in the 3′ flanking regions of the two
genes. The predicted protein of SRK2 was 358 amino acids
long, and it shared 65% identity with SRK1 at the amino
acid level.
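For single-exon genes such as SRK1 and SRK2, the predicted protein size can be checked against the ORF length, since each codon is three bases and the stop codon encodes no residue. The sketch below assumes that the reported gene length includes the stop codon; the function name is illustrative. Table 2 lists 851 and 358 amino acids for SRK1 and SRK2, respectively.

```python
def protein_length(orf_bp: int) -> int:
    """Amino acids encoded by an intron-free ORF whose length includes the stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return orf_bp // 3 - 1  # subtract the stop codon


# ORF lengths from the 1-based inclusive coordinates on ctg205.
srk1_bp = 27602 - 25047 + 1
srk2_bp = 108587 - 107511 + 1

print(srk1_bp, protein_length(srk1_bp))
print(srk2_bp, protein_length(srk2_bp))
print("size difference:", srk1_bp - srk2_bp)
```

The difference between the two ORFs matches the 1,479-bp 3′ flanking region that was added to SRK2 for the full-length alignment with SRK1.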
Three wall-associated kinase (WAK)-like genes were
identified on ctg205 based on BLASTx alignments and
gene prediction (Table 2; Fig. 2). WAK1, WAK2, and WAK3
were all in the same orientation and resided at positions
71,407–73,322, 99,317–100,291, and 113,862–114,725,
respectively. The predicted ORF of WAK1 contained four
exons and was 1,916 bp in size. WAK2, which contains a
fragment of the AFLP-derived RFLP marker Xfcg17 (Haen
et al. 2004) used for initial screening of the BAC library,
contained a single exon and was 975 bp. The predicted
ORF of WAK3 included 864 bp of sequence with sig-
nificant similarity to other WAK genes as indicated by
BLASTx alignments, but the 3′ end of the predicted ORF
extended 2,691 bp into the adjacent non-LTR LINE ele-
ment Karin. Therefore, WAK3 may be a pseudogene that
resulted from interruption by the retrotransposon. The ge-
nomic sequences of the three WAK genes had significant
similarities to multiple cDNAs from various grass species,
but alignments did not allow confirmation of structure or
function because significance values were not sufficiently
high to make inferences regarding orthology. The predicted
proteins for WAK1 and WAK2 were 463 and 324 amino
acids, respectively. No significant similarity was found among the
three WAK genes at the nucleotide level, but WAK2 and the
pseudogene WAK3 shared 53% amino acid identity.
Three ORFs were predicted in regions of ctg205 that did
not appear to harbor repetitive elements, and were therefore
considered putative genes. The hypothetical genes desig-
nated HG1, HG2, and HG3 resided at positions 10,088–
11,067, 47,437–48,063, and 125,413–126,204 of ctg205
(Table 2; Fig. 2). HG2 and HG3 were in the same orien-
96
Table 2 Characteristics of the 26 predicted genes in Langdon BAC contigs flanking Tsn1
Name Description BAC ctg Strand Position Gene length (bp) No. of exons Protein
size
Best BLASTn hit to
expressed sequences
SSP1 Subtilisin-like serine protease 378P21 205 + 336–2,312 1,977 6 441 Wheat cDNA clone CB307830 (e−155)
SSP2 Subtilisin-like serine protease 378P21 205 + 4,389–7,025 2,637 6 458 Wheat cDNA clone CB307830 (e−0)
HG1 Hypothetical gene 378P21 205 − 10,088–11,067 980 4 286 No significant hits
SRK1 S-receptor kinase 378P21 205 − 25,047–27,602 2,556 1 851 Wheat cDNA clone DN829563 (e−148)
HG2 Hypothetical gene 378P21 205 + 47,437–48,063 627 2 192 No significant hits
WAK1 Wall-associated kinase 378P21, 1154L7 205 − 71,407–73,322 1,916 4 463 Wheat cDNA clone CK207836 (e−51)
WAK2 Wall-associated kinase 378P21, 1154L7 205 − 99,317–100,291 975 1 324 Rice cDNA clone AK102702 (e−26)
SRK2 S-receptor kinase 1154L7 205 − 107,511–108,587 1,077 1 358 Wheat cDNA clone CD877820 (e−47)
WAK3
a
Probable pseudogene;
wall-associated kinase
1154L7 205 − 113,862–114,725 NA NA NA Rice cDNA clone AK102702 (e−40)
HG3 Hypothetical gene 1154L7 205 + 125,413–126,204 792 1 263 Wheat cDNA clone CD373786 (e−103)
RNP U2 snRNP auxiliary factor 1089I4 548 − 8,023–12,273 4,251 11 539 Durum wheat cDNA BE428242 (e−0.0)
HG4 Hypothetical gene 1089I4 548 − 15,138–16,997 1,860 5 259 Wheat cDNA clone CD900543 (e−95)
PT Potassium transporter 1089I4 548 − 17,592–22,619 5,028 9 785 Rice cDNA clone AK065464 (e−0.0)
DZF1 DHHC-type zinc finger 1089I4 548 + 33,967–38,702 4,736 6 547 Wheat cDNA clone CK199365 (e−0.0)
HG5 Hypothetical gene 1089I4 548 − 40,542–41,716 1,175 3 337 Wheat cDNA clone CK205827 (e−121)
HG6 Hypothetical gene 1089I4 548 − 56,012–56,722 711 2 236 Wheat cDNA clone BJ309122 (e−0.0)
DZF2 DHHC-type zinc finger 1089I4 548 − 69,162–69,830 669 1 222 Wheat cDNA clone CK199365 (e−0.0)
DZF3
a
Probable pseudogene;
DHHC-type zinc finger
1089I4 548 + 83,042–83,973 NA NA NA Rice cDNA clone AU093319 (e−105)
HG7 Hypothetical gene 1089I4 548 + 96,323–98,134 1,812 6 132 Wheat cDNA clone BE401011 (e−154)
DZF4 DHHC-type zinc finger 239E17 548 + 145,662–149,420 3,759 7 471 Wheat cDNA clone BE500355 (e−116)
HG8 Hypothetical gene 50F12, 533E21 548 − 311,888–312,388 501 1 166 No significant hits
BM138151
b
Possible pseudogene:
unknown gene
50F12, 533E21 548 + 320,025–320,777 NA NA NA Wheat cDNA clone BM138151 (e−0)
FRIR
a
Probable pseudogene; far-red
impaired response protein
50F12, 533E21 548 − 328,073–328,672 NA NA NA No significant hits
CS
a
Probable pseudogene;
callose synthase
533E21 548 + 369,803–371,154 NA NA NA Wheat cDNA clone BJ220565 (e−0.0)
HG9 Hypothetical gene 533E21 548 + 377,414–382,379 4,966 7 432 Barley cDNA clone BJ467479 (e−102)
HG10 Hypothetical gene 533E21, 404J6 548 − 401,442–404,054 2,613 6 553 Barley cDNA clone AV834198 (e−102)
ᵃ Probable pseudogenes identified based on significant BLASTx alignments to known proteins. No corresponding ORFs were predicted
ᵇ Possible pseudogene identified based on alignment of expressed sequence tag (EST) sequence with BAC sequence, but no corresponding ORF was predicted. BM138151 detects a second locus on chromosome 5BL (Fig. 1), which may represent the functional copy
tation (plus strand), whereas HG1 was the opposite. HG1
had four predicted exons and was 980 bp, with a predicted
protein consisting of 286 amino acids. BLASTx and
BLASTn alignments indicated that HG1 shared 37% (e−17)
and 81% (e−36) identity at the amino acid and nucleotide
levels, respectively, with a putative protein (AY491681)
detected at the hardness locus in Triticum monococcum
(Chantret et al. 2004). However, no homologous cDNA
sequences from wheat or other grasses were identified,
suggesting that either HG1 is not a functional gene or it
is expressed at very low levels. HG2 had two predicted
exons and was 627 bp long, with a predicted protein se-
quence consisting of 192 amino acids. No significant
BLASTn or BLASTx hits were obtained, suggesting that
either it is a novel gene that is transcribed at low levels or it
is not a real gene. HG3 had a single predicted exon, was
792 bp, and had 92% identity (e−103) with a wheat cDNA
(CD373786), indicating that it is a functional gene. It is
interesting to note that all three HG genes on ctg205 had GC
contents greater than 65%.
Sixteen putative genes were identified on ctg548.
Analysis of BLASTx alignments indicated a gene with
significant similarity (less than e−100) to U2 snRNP aux-
iliary factor (RNP)-like genes. The RNP resided at position
8,023–12,273 and was 4,251 bp long (Table 2; Fig. 2). A
barley cDNA consensus sequence (TC135043) with 91%
identity (e−185) and a durum wheat cDNA (BE428242)
with 100% identity (e−173) were used to correct the pre-
dicted intron/exon structure of the gene and validate that it
was functional. RNP contained 11 exons and coded for a
protein consisting of 539 amino acids. The gene contained
two RNA recognition motifs (RRMs), which are diagnostic
for RNA-binding proteins.
About 5.3 kb from the RNP gene at position 17,592–22,619
was a gene with significant similarity to PT-like genes
(Table 2; Fig. 2). The PT gene was 5,028 bp long and had
nine exons based on the predicted structure. The structure
and function of the PT gene were validated by alignments
with a full-length rice cDNA (AK065464) with 85% iden-
tity (e−0) and a durum wheat cDNA (CK206884) with 96%
identity (e−0), respectively. The PT gene coded for a pro-
tein consisting of 785 amino acids.
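The gene lengths reported in Table 2 follow from the contig coordinates as end − start + 1, and for an intron-free ORF the protein length is the codon count minus the stop codon. The following is a minimal Python sketch of that consistency check, using coordinates from Table 2. The function names are illustrative only, and the single-exon rule assumes that the reported ORF length includes the stop codon.

```python
def span_bp(start, end):
    """Length in bp of an inclusive, 1-based coordinate range."""
    return end - start + 1


def single_exon_protein_len(orf_bp):
    """Amino acids encoded by an intron-free ORF that includes its stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return orf_bp // 3 - 1


# (start, end) positions on the contig, transcribed from Table 2
coords = {
    "RNP": (8023, 12273),
    "PT": (17592, 22619),
    "DZF2": (69162, 69830),    # single predicted exon
    "HG8": (311888, 312388),   # single predicted exon
}

lengths = {gene: span_bp(*pos) for gene, pos in coords.items()}
for gene, bp in lengths.items():
    print(gene, bp)

# Protein lengths for the two single-exon genes
print("DZF2", single_exon_protein_len(lengths["DZF2"]))
print("HG8", single_exon_protein_len(lengths["HG8"]))
```

For multi-exon genes such as RNP and PT, the span includes introns, so the protein length cannot be derived from the span alone.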
Four putative DZF-like genes within 115 kb were identified
on ctg548 (Table 2; Fig. 2) based on BLASTx alignments
with e values ranging from e−73 to e−168. The
predicted ORF of DZF1 occupied position 33,967–38,702
and was 4,736 bp long with six exons. Attempts to identify
corresponding full-length cDNAs were not successful, but
the existence of the last exon was validated by alignment
to the wheat cDNA CK199365 (99% identity; e−0). The
predicted protein of DZF1 contained 547 amino acids.
Approximately 30.5 kb downstream from DZF1 and in
the opposite orientation was DZF2. The predicted ORF of
DZF2 contained a single exon of 669 bp and resided at
position 69,162–69,830. This gene shared 95% identity
(e−0) with wheat cDNA CK199365 across 428 bp.
Therefore, the wheat cDNA CK199365 most likely cor-
responds to the DZF1 gene because they shared a higher
percent identity over a longer segment. This suggests that
DZF1 is functional, but whether or not DZF2 is functional
is inconclusive.
DZF3 was identified based on BLASTx alignments at
position 83,042–83,973, but no corresponding ORF was
predicted with any of the gene prediction programs. There-
fore, DZF3 may be a pseudogene lacking a start codon.
DZF4 occupied position 145,662–149,420 of ctg548
and had a predicted ORF consisting of seven exons. This
gene was 3,759 bp long and in the same orientation as
DZF1. Alignments with the wheat cDNA BE500355,
which had 92% identity (e−0), allowed the determination
of function and the validation of exons 4–6, but cDNAs
corresponding to the rest of the gene could not be iden-
tified. The predicted protein of DZF4 consisted of 471
amino acids.
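The intergenic distances and the extent of the DZF cluster quoted above can be recomputed from the Table 2 coordinates. The sketch below is only an illustration of that arithmetic; the helper names `gap_kb` and `cluster_span_kb` are hypothetical and not part of the original analysis.

```python
# Inclusive start/end coordinates on ctg548, transcribed from Table 2
genes = {
    "RNP": (8023, 12273),
    "PT": (17592, 22619),
    "DZF1": (33967, 38702),
    "DZF2": (69162, 69830),
    "DZF3": (83042, 83973),
    "DZF4": (145662, 149420),
}


def gap_kb(upstream, downstream):
    """Distance in kb from the end of one gene to the start of the next."""
    return (genes[downstream][0] - genes[upstream][1]) / 1000


def cluster_span_kb(names):
    """Extent in kb from the first start to the last end of the named genes."""
    starts = [genes[n][0] for n in names]
    ends = [genes[n][1] for n in names]
    return (max(ends) - min(starts) + 1) / 1000


print(gap_kb("RNP", "PT"))       # distance between RNP and PT
print(gap_kb("DZF1", "DZF2"))    # distance between DZF1 and DZF2
print(cluster_span_kb(["DZF1", "DZF2", "DZF3", "DZF4"]))
```

Rounded to one decimal place, the two gaps agree with the approximate values of 5.3 and 30.5 kb given in the text, and the four DZF genes fall within about 115 kb.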
Comparisons of the genomic sequences of the DZF
genes with each other indicated that DZF1, DZF2, and the
pseudogene DZF3 were more similar to each other, where-
as DZF4 was slightly more divergent. Similarity among the
genes was mainly confined to the 3′ ends of DZF1 and
DZF4, which corresponded to the entire DZF3 genomic
region and DZF2 ORF. Within this segment, DZF1, DZF2,
and DZF3 had identities ranging from 93 to 96% with
each other, whereas comparisons of these three genes
with DZF4 yielded identities ranging from 86 to 91%.
Comparisons at the amino acid level yielded similar results.
DZF1, DZF2, and DZF3 shared greater than 77% identity
with each other, but DZF4 had less than 57% identity with
the other three. No similarity was identified in 1.5 kb of the
5′ regions of the four DZF genes, but conservation was
observed in the 3′ regions where similarities ranged from
38% between the 3′ regions of DZF2 and DZF3 to 65%
between the same regions of DZF1 and DZF3.
Seven hypothetical genes were predicted on ctg548 and
designated HG4–HG10. HG4 was predicted as a 1,860-bp
gene containing five exons at position 15,138–16,997.
Several wheat cDNAs with 89% identity (e−95) to HG4
were identified, but they aligned within the second predicted
intron. Therefore, the predicted structure of HG4 could not
be validated, and its functionality, including production of
its 259-amino-acid protein, is questionable.
HG5 was predicted as being 1,175 bp long with three
exons at position 40,542–41,716. A wheat cDNA clone
(CK205827) with 87% identity was identified, but align-
ment with HG5 extended from the first predicted exon
through the first intron and into the second exon. Due to the
relatively low identity and poor structural alignment, this
cDNA may not represent the corresponding coding se-
quence for HG5, and whether HG5 is a real functional gene
is left to be determined.
About 14 kb downstream from HG5 at position 56,012–
56,722 was HG6, which was 711 bp long. A wheat cDNA
clone (BJ309122) was 91% identical (e−0) and was used to
validate the structure of the ORF as having a single exon.
The high degree of similarity between HG6 and the wheat
cDNA suggests that the gene, which encodes a protein
consisting of 236 amino acids, is functional.
HG7 was located at position 96,323–98,134 and pre-
dicted as being 1,812 bp long with six exons. The wheat
cDNA clone BE401011 was 98% identical (e−154) to
HG7, suggesting the gene is functional. However, only the
fourth and fifth exons could be confirmed by alignments to
the cDNA. HG7 was predicted to encode a protein con-
sisting of 132 amino acids.
A 501-bp gene (HG8) was predicted as having a single
exon and coding for a 166-amino-acid protein at position
311,888–312,388. No cDNA clones with significant sim-
ilarity to HG8 were identified, indicating that either it is not
a real gene or it is expressed at very low levels.
HG9 was predicted as a 4,966-bp gene with seven exons
at position 377,414–382,379. BLASTn alignments with
sequences in the EST databases indicated a large number of
barley and wheat ESTs with significant similarity (greater
than 90% identity) to HG9, but alignments were confined
to the last predicted exon, which was 256 bp in size. This
suggests that at least the last exon probably represents a
repetitive sequence, and because the gene had no signif-
icant similarity to known repetitive elements, it may rep-
resent an unclassified element.
Occupying position 401,442–404,054 on ctg548 was
HG10. The predicted ORF of this gene was 2,613 bp
long; it had six exons and a predicted protein consisting
of 553 amino acids. A barley cDNA (AV834198) shared
90% identity (e−102) over a 305-bp segment, which cor-
responded to the last exon. This suggests that HG10 may
be functional, but we were unable to validate its structure.
The wheat cDNA clone BM138151 was mapped as an
RFLP marker in the CS × CS-DIC 5B population (Lu et al.,
unpublished data) and used to assess macrocolinearity as
described above. BLASTn alignments of BM138151 against
ctg548 revealed its position on the contig at 320,025–
320,777. No corresponding ORF for BM138151 was pre-
dicted, but it should be noted that in addition to this locus,
BM138151 detects a second more distal locus on 5BL
(Fig. 1). Therefore, it is possible that the more distal locus
harbors a functional gene, while the locus detected in
ctg548 is actually a pseudogene.
In addition to WAK3, DZF3, and BM138151, we iden-
tified two additional putative pseudogenes. Genes with
similarity to a far-red impaired response (FRIR)-like pro-
tein (e−21) (position 328,073–328,672) and a callose
synthase (CS)-like protein (e−26) (369,803–371,154) were
identified based on BLASTx alignments, but no corre-
sponding ORFs were predicted. No cDNAs correspond-
ing to the FRIR gene were identified, but a wheat cDNA
(BJ220565) with 93% identity (e−0) to the CS gene was
identified, suggesting that this gene is functional at least in
some genotypes. However, it is interesting to note that the
CS gene in LDN is flanked by gag–pol polyprotein-like
sequences, which may have disrupted its function.
Table 3 Predicted genes within wheat BAC contigs and their best BLASTn and tBLASTx hits to rice genomic sequences
Predicted gene  Rice BLASTn                   Rice tBLASTx
                Rice chromosomeᵃ  e value     Rice chromosome  e value
SSP1            2                 5.6e−125    2                8.8e−65
SSP2            2                 8.3e−120    2                9.0e−120
HG1             NSᵇ                           NS
SRK1            4                 3.0e−191    4                4.7e−251
                7                 2.6e−251    7                3.7e−253
HG2             NS                            NS
WAK1            9                 4.4e−85     9                1.1e−112
WAK2            NS                            NS
SRK2            4                 4.5e−91     4                6.9e−82
                7                 1.7e−95     7                3.8e−81
WAK3            NS                            9                8.2e−20
HG3             NS                            NS
RNP             11                4.0e−179    11               1.8e−138
HG4             NS                            NS
PT              9                 4.2e−261    9                5.0e−298
DZF1            9                 2.0e−195    9                9.1e−192
HG5             NS                            NS
HG6             NS                            NS
DZF2            9                 2.0e−74     9                2.3e−72
DZF3            9                 3.5e−140    9                5.7e−94
HG7             NS                            NS
DZF4            9                 1.5e−68     9                1.4e−93
HG8             NS                            NS
BM138151 (EST)  NS                            NS
FRIR            NS                            12               4.4e−27
CS              NS                            6                5.4e−28
HG9             NS                            NS
HG10            NS                            NS
NS No significant similarity
ᵃ Positions of hits to rice chromosomes are presented in Fig. 2
Microcolinearity of the Tsn1 locus with rice
BLAST analysis of the 26 putative genes within the BAC
contigs showed that 14 (54%) had homologous sequences
distributed among six different rice chromosomes. These
14 gene sequences detected only nine unique loci in rice
because the two wheat SSP genes had significant similarity
to a single SSP gene on rice chromosome 2, the four DZF
genes had similarity to a single DZF gene on rice chro-
mosome 9, and the two SRK genes detected single loci on
rice chromosomes 4 and 7 with nearly equal significance
(Table 3; Fig. 2).
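The counts in this paragraph can be reproduced by tallying the entries of Table 3. The sketch below is a minimal Python illustration that records, for each wheat gene, the rice chromosomes with significant hits; the dictionary layout and variable names are assumptions made for the example, with an empty tuple standing for NS.

```python
# Rice chromosomes with significant BLASTn hits per wheat gene (Table 3).
# An empty tuple stands for "NS" (no significant similarity).
blastn = {
    "SSP1": (2,), "SSP2": (2,), "HG1": (), "SRK1": (4, 7), "HG2": (),
    "WAK1": (9,), "WAK2": (), "SRK2": (4, 7), "WAK3": (), "HG3": (),
    "RNP": (11,), "HG4": (), "PT": (9,), "DZF1": (9,), "HG5": (),
    "HG6": (), "DZF2": (9,), "DZF3": (9,), "HG7": (), "DZF4": (9,),
    "HG8": (), "BM138151": (), "FRIR": (), "CS": (), "HG9": (), "HG10": (),
}

# In Table 3, tBLASTx reports the same chromosomes as BLASTn except for
# three genes that were detected only by the translated search.
tblastx = dict(blastn, WAK3=(9,), FRIR=(12,), CS=(6,))

with_hit = [g for g in blastn if blastn[g] or tblastx[g]]
tblastx_only = [g for g in blastn if tblastx[g] and not blastn[g]]
percent = round(100 * len(with_hit) / len(blastn))

print(len(blastn), len(with_hit), percent)
print(tblastx_only)
```

The tally makes explicit that three of the 14 genes with rice homologues (WAK3, FRIR, and CS, all putative pseudogenes) were detected only by tBLASTx.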
Four of the ten putative genes on ctg205 had no sim-
ilarity to any sequences in the rice genome, whereas the
remaining six detected orthologous loci on at least three
different rice chromosomes (Fig. 2). Only five of the 13
unique gene sequences (DZF2, DZF3, and DZF4 excluded)
on ctg548 detected orthologous loci in rice, and they were
distributed on four different rice chromosomes.
The WAK1, WAK3, PT, and DZF genes all had significant
similarity to sequences spanning a segment of about 65 kb
on rice chromosome 9 (Fig. 2). These four genes appear to
be colinear between wheat and rice; however, as mentioned
above, the correct orientation of ctg205 containing WAK1
and WAK3 is unknown. If the correct orientation is as shown
in Fig. 2, then WAK1 and WAK3 are colinear, but the genes
are in opposite transcriptional orientation compared to the
rice orthologues. If the correct orientation is for the contig to
be inverted, then WAK1 and WAK3 would be in the same
transcriptional orientation as the rice orthologues, but their
order would be inverted and they would not be colinear.
WAK1 and WAK3 are separated by about 40 kb in wheat,
whereas the orthologous rice sequences are separated by
about 30 kb. Between the WAK1 and WAK3 orthologues in
rice are four hypothetical proteins and a putative retro-
element (not shown), whereas WAK2 and SRK2 separate the
two genes in wheat. On ctg548, the PT and DZF genes in
wheat are in the same order and orientation as the rice
orthologues. The two genes are separated by about 10 kb in
wheat, which included a DNA transposon (not shown),
whereas they are separated by only 3.5 kb in rice. None of
the rice hypothetical proteins between the WAK1 and WAK3
orthologues on rice chromosome 9 had similarity to any
sequences on the wheat BAC contigs.
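The effect of the unknown orientation of ctg205 can be illustrated by inverting a contig computationally: gene order reverses and every transcriptional orientation flips. The sketch below uses the ctg205 genes listed in Table 2 from WAK2 onward. The function name is hypothetical, and the contig length of 205,000 bp is an assumed round figure for the 205-kb contig, not a value reported in the paper.

```python
def invert_contig(genes, contig_len):
    """Reverse-complement a contig: mirror coordinates and flip strands."""
    flipped = []
    for name, start, end, strand in genes:
        new_start = contig_len - end + 1
        new_end = contig_len - start + 1
        new_strand = "+" if strand == "-" else "-"
        flipped.append((name, new_start, new_end, new_strand))
    # Return the genes in their new left-to-right order
    return sorted(flipped, key=lambda g: g[1])


# (name, start, end, strand) from Table 2; ASCII "-" is used for the minus strand
ctg205 = [
    ("WAK2", 99317, 100291, "-"),
    ("SRK2", 107511, 108587, "-"),
    ("WAK3", 113862, 114725, "-"),
    ("HG3", 125413, 126204, "+"),
]

# 205_000 is an assumed contig length (see the note above)
inverted = invert_contig(ctg205, 205_000)
order = [g[0] for g in inverted]
strands = [g[3] for g in inverted]
print(order)
print(strands)
```

Inversion therefore cannot restore both gene order and transcriptional orientation relative to rice at the same time, which is the trade-off described for WAK1 and WAK3.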
In addition to breaks in colinearity between WAK1 and
WAK3, breaks in colinearity with rice chromosome 9 oc-
curred between WAK1 (or WAK3, depending on the correct
orientation of ctg205) and the PT gene. This included the
RNP gene, which was highly significant in similarity to an
RNP orthologue on rice chromosome 11 (Table 3).
Discussion
Macrocolinearity of a 25-cM segment spanning Tsn1 and the rice genome
Comparative analysis of the wheat and rice genomes has
been conducted by investigating syntenic relationships of
bin-mapped ESTs with rice genomic sequences without
prior knowledge of the genetic order of EST loci within
bins (Sorrells et al. 2003; Conley et al. 2004; Francki et al.
2004; Hossain et al. 2004; La Rota and Sorrells 2004;
Linkiewicz et al. 2004; Munkvold et al. 2004; Peng et al.
2004). However, use of genetically resolved EST loci
within chromosome bins can reveal more regarding the
length and organization of syntenic units between wheat
and rice. In this study, genetic resolution of 29 ESM loci
spanning 25.5 cM and encompassing the Tsn1 gene al-
lowed us to determine colinearity between wheat and rice
at the macrolevel.
The wheat homoeologous group 5 chromosomes have
been shown to be the least conserved of the homoeologous
groups compared to rice chromosomes (Sorrells et al. 2003;
La Rota and Sorrells 2004). Our results agree with this
notion because while detectable levels of macrocolinearity
were observed with rice chromosomes 3 and 9, eight ESMs
detected sequences on other rice chromosomes. Of these,
XBE403968 and XBE500658 detected loci approximately
100 kb apart on rice chromosome 2 and could have been
the result of a single evolutionary event. The other six
ESMs were not adjacent in the rice genome and probably
do not represent single translocation events.
Genes in BAC contigs
Fourteen of the 26 putative genes identified on the BAC
contigs had significant similarity to known proteins, al-
lowing us to infer putative functions. However, whereas
corresponding ORFs were predicted for ten of these, the
gene prediction programs used in this study did not indicate
the presence of ORFs for WAK3, DZF3, FRIR, and CS.
Analysis of flanking sequences provided evidence that
WAK3 and CS were possibly disrupted by transposons, but
no such events were evident for DZF3 and FRIR. It is
possible that these two genes, especially DZF3 given the
presence of three other DZF genes with potential function,
were superfluous and degenerated over time.
Of the ten predicted genes in the BAC contigs with no
similarity to known proteins, seven were strongly sup-
ported by ESTs and possibly represent real genes. The
other three hypothetical genes with no similarity to ex-
pressed sequences could be degenerated genes from un-
known repetitive elements. Therefore, further analysis of
these putative genes is necessary to determine if they are
biologically valid.
Inclusion of all 26 putative genes in gene density cal-
culations results in an average density of one gene/29 kb.
Exclusion of the hypothetical genes and the putative pseu-
dogenes leads to a more conservative estimate of one
gene/68 kb. However, it should also be noted that only six of
the 11 genes used in the latter calculation could be con-
fidently validated.
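Both density estimates follow from dividing the combined contig length by the number of genes counted. A minimal Python sketch of the arithmetic, assuming the two contigs are exactly 205 and 548 kb long as stated in the text (the variable and function names are illustrative):

```python
contig_kb = [205, 548]        # BAC contig lengths given in the text
total_kb = sum(contig_kb)     # combined sequence length in kb

all_putative = 26             # every putative gene, pseudogenes included
conservative = 11             # hypothetical genes and pseudogenes excluded


def kb_per_gene(n_genes):
    """Average amount of sequence (kb) per gene."""
    return total_kb / n_genes


print(total_kb)
print(round(kb_per_gene(all_putative)))
print(round(kb_per_gene(conservative)))
```

Counting only the six confidently validated genes would lower the estimated density further.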
The discovery of four DZF genes within a 115-kb seg-
ment of ctg548 is a good example of gene duplication in
wheat that did not occur in rice. Higher frequencies of gene
duplication in wheat compared to rice have been reported in
other macrocolinearity (Anderson et al. 1992; Dubcovsky
et al. 1996; Akhunov et al. 2003) and microcolinearity
(Yan et al. 2003; Chantret et al. 2004) studies. It has also
been suggested that a higher frequency of duplications
occurs in distal regions of wheat chromosomes (Akhunov
et al. 2003). The duplications of DZF, SSP, and SRK genes
observed in this study support that trend because Tsn1 and
the region under study lie in the distal 25% of chromosome
5BL. It is interesting to note that hybridization experiments
of DZF gene sequences indicated that only one copy resides
on wheat chromosome 5A (data not shown). Therefore,
multiple duplications of this gene likely occurred after the
divergence of a common Triticeae ancestor.
Microcolinearity between the Tsn1 locus and rice
Variable results have been reported in studies of colinearity
between wheat and rice at the microlevel. Most wheat–rice
microcolinearity studies have shown better conservation in
proximal regions of wheat chromosomes (Roberts et al.
1999; SanMiguel et al. 2002; Yan et al. 2003; Distelfeld et
al. 2004) compared to distal regions (Feuillet and Keller
1999; Li and Gill 2002; Yan et al. 2004; Guyot et al. 2004).
For example, Yan et al. (2003) found a high level of microcolinearity
between a proximal region of wheat chromosome 5AᵐL and rice
chromosome 3 with the exception of
two tandem gene duplications in wheat, and this informa-
tion was successfully used to clone Vrn1. On the other hand,
a region containing Sh2/X1/X2/A1 genes in rice chromo-
some 1 was conserved in sorghum and maize (Bennetzen
and Ramakrishna 2002), but only X2/A1 were detected in
the colinear region of the distal region of homoeologous
group 3 in wheat whereas Sh2/X1 were found in a non-
colinear region of the long arm of homoeologous group 1
(Li and Gill 2002).
An exception to the idea that distal regions are less con-
served than proximal regions was reported by Chantret
et al. (2004). They sequenced a 101-kb BAC from the
distal region of chromosome 5AᵐS, which included three
genes associated with grain texture, and found that all three
were in the same order and orientation as in rice chro-
mosome 12. In addition, the genes were separated by sim-
ilar physical distances.
Our study supports the trend of less conservation in
distal regions. Within the BAC contigs at the Tsn1 locus, 14
genes detected orthologues on six different rice chromo-
somes, and multiple instances of adjacent wheat genes de-
tecting orthologues on different rice chromosomes were
observed. For example, along ctg205, WAK1 and WAK3
were apparently colinear with rice chromosome 9, but this
colinearity was interrupted by SRK2, which is a likely
paralog of SRK1. Also, little physical distance separated PT
from RNP, which detected orthologues on rice chromo-
somes 9 and 11, respectively. Most of these observations
probably represent multiple translocation or transposition
events that led to the shuffling and rearranging of genetic
material in this region. Alternatively, they may reflect multiple
instances of single gene transposition events.
Previous research has indicated that Brachypodium is
more closely related to wheat than rice (Draper et al. 2001).
We screened a Brachypodium sylvaticum BAC library
(Foote et al. 2004) with a highly conserved portion of the
wheat PT gene but found no positive clones (Lu and Faris,
unpublished data). We then screened the library with a
fragment of the DZF gene and identified a single positive
clone. This BAC was hybridized with various fragments of
WAK1, WAK3, RNP, and PT genes, but none of them
hybridized, indicating that, relative to wheat, this region in
Brachypodium may be even less conserved than rice.
The RNP and PT genes on ctg548 cosegregated with
Tsn1 in 5,438 gametes, but they do not appear to be can-
didate genes (Lu and Faris, unpublished data). Therefore, it
is likely that we have not yet reached Tsn1 by chromosome
walking, and it most likely lies within the gap between the
contigs. If markers derived from the three putative rice genes
between the WAK3 and PT orthologues on chromosome 9
detect loci within the gap between the contigs, then it may be
concluded that rice genomic information is useful for the
map-based cloning of Tsn1. However, it is important to note
that the colinearity observed with rice chromosome 9 was
undetectable until we had conducted multiple chromosome
walking steps that traversed nearly 400 kb in wheat.
Acknowledgement The project was supported by the National
Research Initiative of the USDA Cooperative State Research, Education
and Extension Service, grant number 2003-35300-13109 to J.D.F.
References
Ahn S, Anderson JA, Sorrells ME, Tanksley SD (1993) Homoeo-
logous relationships of rice, wheat and maize chromosomes.
Mol Gen Genet 241:483–490
Akhunov ED, Goodyear AW, Geng S, Qi LL, Echalier B, Gill BS,
Miftahudin, Gustafson JP, Lazo G, Chao S, Anderson OD,
Linkiewicz AM, Dubcovsky J, La Rota M, Sorrells ME, Zhang
D, Nguyen HT, Kalavacharla V, Hossain K, Kianian SF, Peng J,
Lapitan NL, Gonzalez-Hernandez JL, Anderson JA, Choi DW,
Close TJ, Dilbirligi M, Gill KS, Walker-Simmons MK, Steber C,
McGuire PE, Qualset CO, Dvorak J (2003) The organization and
rate of evolution of wheat genomes are correlated with re-
combination rates along chromosome arms. Genome Res 13:
753–763
Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W,
Lipman DJ (1997) A new generation of protein database search
programs. Nucleic Acids Res 25:3389–3402
Anderson JA, Ogihara Y, Sorrells ME, Tanksley SD (1992) Devel-
opment of a chromosomal arm map for wheat based on RFLP
markers. Theor Appl Genet 83:1035–1043
Bennett MD, Leitch IJ (1995) Nuclear DNA amounts in angio-
sperms. Ann Bot 76:113–176
Bennetzen JL (2000) Comparative sequence analysis of plant nu-
clear genomes: microcolinearity and its many exceptions. Plant
Cell 12:1021–1029
Bennetzen JL, Ramakrishna W (2002) Numerous small rearrange-
ments of gene content, order and orientation differentiate grass
genomes. Plant Mol Biol 48:821–827
Chantret N, Cenci A, Sabot F, Anderson O, Dubcovsky J (2004)
Sequencing of the Triticum monococcum hardness locus re-
veals good microcolinearity with rice. Mol Genet Genomics
271:377–386
Cenci A, Chantret N, Kong X, Gu Y, Anderson OD, Fahima T,
Distelfeld A, Dubcovsky J (2003) Construction and character-
ization of a half million clone BAC library of durum wheat
(Triticum turgidum ssp. durum). Theor Appl Genet 107:931–939
Conley EJ, Nduati V, Gonzalez-Hernandez JL, Mesfin A, Trudeau-
Spanjers M, Chao S, Lazo GR, Hummel DD, Anderson OD, Qi
LL, Gill BS, Echalier B, Linkiewicz AM, Dubcovsky J, Akhunov
ED, Dvorak J, Peng JH, Lapitan NL, Pathan MS, Nguyen HT,
Ma XF, Miftahudin, Gustafson JP, Greene RA, Sorrells ME,
Hossain KG, Kalavacharla V, Kianian SF, Sidhu D, Dilbirligi M,
Gill KS, Choi DW, Fenton RD, Close TJ, McGuire PE, Qualset
CO, Anderson JA (2004) A 2600-locus chromosome bin map of
wheat homoeologous group 2 reveals interstitial gene-rich islands
and colinearity with rice. Genetics 168:625–637
Delaney D, Nasuda S, Endo TR, Gill BS, Hulbert SH (1995a)
Cytogenetically based physical maps of the group-2 chromo-
somes of wheat. Theor Appl Genet 91:568–573
Delaney D, Nasuda S, Endo TR, Gill BS, Hulbert SH (1995b)
Cytogenetically based physical maps of the group-3 chromo-
somes of wheat. Theor Appl Genet 91:780–782
Devos KM, Gale MD (2000) Genome relationships: the grass model
in current research. Plant Cell 12:637–646
Distelfeld A, Uauy C, Olmos S, Schlatter AR, Dubcovsky J, Fahima
T (2004) Microcolinearity between a 2-cM region encompass-
ing the grain protein content locus Gpc-6B1 on wheat chro-
mosome 6B and a 350-kb region on rice chromosome 2. Funct
Integr Genomics 4:59–66
Draper J, Mur LAJ, Jenkins G, Ghosh-Biswas GC, Bablak P,
Hasterok R, Routledge APM (2001) Brachypodium distachyon:
a new model system for functional genomics in
grasses. Plant Physiol 127:1539–1555
Dubcovsky J, Luo M-C, Zhong G-Y, Bransteiter R, Desai A, Kilian
A, Kleinhofs A, Dvorak J (1996) Genetic map of diploid wheat,
Triticum monococcum L., and its comparison with maps of
Hordeum vulgare L. Genetics 143:983–999
Endo TR, Gill BS (1996) The deletion stocks of common wheat.
J Hered 87:295–307
Erayman M, Sandhu D, Sidhu D, Dilbirligi M, Baenziger PS, Gill
KS (2004) Demarcating the gene-rich regions of the wheat ge-
nome. Nucleic Acids Res 32:3546–3565
Faris JD, Anderson JA, Francl LJ, Jordahl JG (1996) Chromosomal
location of a gene conditioning insensitivity in wheat to a necrosis-inducing
culture filtrate from Pyrenophora tritici-repentis. Phytopathology
86:459–463
Faris JD, Haen KM, Gill BS (2000) Saturation mapping of a gene-rich
recombination hot spot region in wheat. Genetics 154:823–835
Faris JD, Fellers JP, Brooks SA, Gill BS (2003) A bacterial
artificial chromosome contig spanning the major domestication
locus Q in wheat and identification of a candidate gene.
Genetics 164:311–321
Feuillet C, Keller B (1999) High gene density is conserved at
syntenic loci of small and large grass genomes. Proc Natl Acad
Sci U S A 96:8265–8270
Feuillet C, Keller B (2002) Comparative genomics in the grass
family: molecular characterization of grass genome structure
and evolution. Ann Bot 89:3–10
Foote T, Griffiths S, Allouis S, Moore G (2004) Construction and
analysis of a BAC library in the grass Brachypodium sylvat-
icum: its use as a tool to bridge the gap between rice and wheat
in elucidating gene content. Funct Integr Genomics 4:26–33
Francki M, Carter M, Byan K, Hunter A, Bellgard M, Appels R
(2004) Comparative organization of wheat homoeologous group
3S and 7L using wheat–rice synteny and identification of po-
tential markers for genes controlling xanthophylls content in
wheat. Funct Integr Genomics 4:118–130
Gill KS, Gill BS, Endo TR (1993) A chromosome region-specific
mapping strategy reveals gene-rich telomeric ends in wheat.
Chromosoma 102:374–381
Gill KS, Gill BS, Endo T, Boyko EV (1996a) Identification and
high-density mapping of gene-rich regions in chromosome
group 5 of wheat. Genetics 143:1001–1012
Gill KS, Gill BS, Endo TR, Taylor T (1996b) Identification and
high-density mapping of gene-rich regions in chromosome
group 1 of wheat. Genetics 144:1883–1891
Goff SA, Ricke D, Lan TH et al (2002) A draft sequence of the rice
genome (Oryza sativa L. ssp. japonica). Science 296:92–100
Guyot R, Yahiaoui N, Feuillet C, Keller B (2004) In silico com-
parative analysis reveals mosaic conservation of genes within a
novel collinear region in wheat chromosome 1AS and rice
chromosome 5S. Funct Integr Genomics 4:47–58
Haen KM, Lu H, Friesen TL, Faris JD (2004) Genomic targeting
and high-resolution mapping of the Tsn1 gene in wheat. Crop
Sci 44:951–962
Hohmann U, Endo TR, Gill KS, Gill BS (1994) Comparison of
genetic and physical maps of group 7 chromosomes from
Triticum aestivum L. Mol Gen Genet 245:644–653
Hossain KG, Kalavacharla V, Lazo GR, Hegstad J, Wentz MJ,
Kianian PM, Simons K, Gehlhar S, Rust JL, Syamala RR,
Obeori K, Bhamidimarri S, Karunadharma P, Chao S, Anderson
OD, Qi LL, Echalier B, Gill BS, Linkiewicz AM, Ratnasiri A,
Dubcovsky J, Akhunov ED, Dvorak J, Miftahudin, Ross K,
Gustafson JP, Radhawa HS, Dilbirligi M, Gill KS, Peng JH,
Lapitan NL, Greene RA, Bermudez-Kandianis CE, Sorrells
ME, Feril O, Pathan MS, Nguyen HT, Gonzalez-Hernandez JL,
Conley EJ, Anderson JA, Choi DW, Fenton D, Close TJ,
McGuire PE, Qualset CO, Kianian SF (2004) A chromosome
bin map of 2148 expressed sequence tag loci of wheat homoeo-
logous group 7. Genetics 168:687–699
International Rice Genome Sequencing Project (2005) The map-
based sequence of the rice genome. Nature 436:793–800
La Rota M, Sorrells ME (2004) Comparative DNA sequence
analysis of mapped wheat ESTs reveals the complexity of
genome relationships between rice and wheat. Funct Integr
Genomics 4:34–36
Li W, Gill BS (2002) The colinearity of the Sh2/A1 orthologous
region in rice, sorghum and maize is interrupted and accompanied
by genome expansion in the Triticeae. Genetics 160:1153–1162
Linkiewicz AM, Qi LL, Gill BS, Ratnasiri A, Echalier B, Chao S,
Lazo GR, Hummel DD, Anderson OD, Akhunov ED, Dvorak
J, Pathan MS, Nguyen HT, Peng JH, Lapitan NL, Miftahudin,
Gustafson JP, La Rota CM, Sorrells ME, Hossain KG,
Kalavacharla V, Kianian SF, Sandhu D, Bondareva SN, Gill
KS, Conley EJ, Anderson JA, Fenton RD, Close TJ, McGuire
PE, Qualset CO, Dubcovsky J (2004) A 2500-locus bin map of
wheat homoeologous group 5 provides insights on gene dis-
tribution and colinearity with rice. Genetics 168:665–676
Liu S, Anderson JA (2003) Targeted molecular mapping of a major
wheat QTL for Fusarium head blight resistance using wheat
ESTs and synteny with rice. Genome 46:817–823
Mickelson-Young L, Endo TR, Gill BS (1995) A cytogenetic ladder-
map of wheat homoeologous group-4 chromosomes. Theor
Appl Genet 90:1007–1011
Munkvold JD, Greene RA, Bermudez-Kandianis CE, La Rota CM,
Edwards H, Sorrells SF, Dake T, Benscher D, Kantety R,
Linkiewicz AM, Dubcovsky J, Akhunov ED, Dvorak J,
Miftahudin, Gustafson JP, Pathan MS, Nguyen HT, Matthews
DE, Chao S, Lazo GR, Hummel DD, Anderson OD, Anderson
JA, Gonzalez-Hernandez JL, Peng JH, Lapitan N, Qi LL,
Echalier B, Gill BS, Hossain KG, Kalavacharla V, Kianian SF,
Sandhu D, Erayman M, Gill KS, McGuire PE, Qualset CO,
Sorrells ME (2004) Group 3 chromosome bin maps of wheat
and their relationship to rice chromosome 1. Genetics 168:
639–650
Peng JH, Zadeh H, Lazo GR, Gustafson JP, Chao S, Anderson OD,
Qi LL, Echalier B, Gill BS, Dilbirligi M, Sandhu D, Gill KS,
Greene RA, Sorrells ME, Akhunov ED, Dvorak J, Linkiewicz
AM, Dubcovsky J, Hossain KG, Kalavacharla V, Kianian SF,
Mahmoud AA, Miftahudin, Conley EJ, Anderson JA, Pathan
MS, Nguyen HT, McGuire PE, Qualset CO, Lapitan NL (2004)
Chromosome bin map of expressed sequence tags in homoeo-
logous group 1 of hexaploid wheat and homoeology with rice
and Arabidopsis. Genetics 168:609–623
Qi LL, Echalier B, Chao S, Lazo GR, Butler GE et al (2004) A
chromosome bin map of 16,000 expressed sequence tag loci
and distribution of genes among the three genomes of polyploid
wheat. Genetics 168:701–712
Roberts MA, Reader SM, Dalgliesh C, Miller TE, Foote TN, Fish
LJ, Snape JW, Moore G (1999) Induction and characterization
of Ph1 wheat mutants. Genetics 153:1909–1918
SanMiguel P, Ramakrishna W, Bennetzen JL, Buss CS, Dubcovsky
J (2002) Transposable elements, genes and recombination in a
215-kb contig from wheat chromosome 5Aᵐ. Funct Integr
Genomics 2:70–80
Sears ER (1954) The aneuploids of common wheat. MO Agr Exp
Sta Res Bull 572:1–59
Sears ER (1966) Nullisomic–tetrasomic combinations in hexaploid
wheat. In: Riley R, Lewis KR (eds) Chromosome manipula-
tions and plant genetics. Oliver & Boyd, Edinburgh, pp 29–45
Sears ER, Sears LMS (1978) The telocentric chromosomes of
common wheat. In: Ramanujan S (ed) Proceedings of the fifth
international wheat genetics symposium. Indian Society of
Genetics and Plant Breeding, New Delhi, India, pp 389–407
Sorrells ME, La Rota M, Bermudez-Kandianis CE, Greene RA,
Kantety R, Munkvold JD, Miftahudin, Mahmoud A, Ma X,
Gustafson PJ, Qi LL, Echalier B, Gill BS, Matthews DE, Lazo
GR, Chao S, Anderson OD, Edwards H, Linkiewicz AM,
Dubcovsky J, Akhunov ED, Dvorak J, Zhang D, Nguyen HT,
Peng J, Lapitan NLV, Gonzalez-Hernandez JL, Anderson JA,
Hossain K, Kalavacharla V, Kianian SK, Choi DW, Close TJ,
Dilbirligi M, Gill KS, Steber C, Walker-Simmons MK, McGuire
PE, Qualset CO (2003) Comparative DNA sequence analysis of
wheat and rice genomes. Genome Res 13:1818–1827
Van Deynze AE, Dubcovsky J, Gill KS, Nelson JC, Sorrells ME,
Dvorak J, Gill BS, Lagudah ES, McCouch SR, Appels R
(1995a) Molecular-genetics maps for group 1 chromosomes of
Triticeae species and their relation to chromosomes in rice and
oat. Genome 38:45–59
Van Deynze AE, Nelson JC, Yglesis ES, Harrington SE, Braga DP,
McCouch SR, Sorrells ME (1995b) Comparative mapping in
grasses. Wheat relationships. Mol Gen Genet 248:744–754
Ware DH, Jaiswal P, Ni J, Yap IV, Pan X, Clark KY, Teytelman L,
Schmidt SC, Zhao W, Chang K, Cartinhour S, Stein LD,
McCouch SR (2002) Gramene, a tool for grass genomics. Plant
Physiol 130:1606–1613
Werner JE, Endo TR, Gill BS (1992) Towards a cytogenetically
based physical map of the wheat genome. Proc Natl Acad Sci U
S A 89:11307–11311
Wicker T, Stein N, Albar L, Feuillet C, Schlagenhauf E, Keller B
(2001) Analysis of a contiguous 211 kb sequence in diploid
wheat (Triticum monococcum L.) reveals multiple mechanisms
of genome evolution. Plant J 26:307–316
Yan L, Loukoianov A, Tranquilli G, Helguera M, Fahima T,
Dubcovsky J (2003) Positional cloning of wheat vernalization
gene VRN1. Proc Natl Acad Sci U S A 100:6263–6268
Yan L, Loukoianov A, Blechl A, Tranquilli G, Ramakrishna W,
SanMiguel P, Bennetzen JL, Echenique V, Dubcovsky J (2004)
The wheat VRN2 gene is a flowering repressor down regulated
by vernalization. Science 303:1640–1644
Yu J, Hu SN, Wang J et al (2002) A draft sequence of the rice
genome (Oryza sativa L. ssp. indica). Science 296:79–91