The genome projects of the plant pathogenic fungi Botrytis cinerea and Sclerotinia
sclerotiorum
Sabine Fillinger1, Joëlle Amselem1,2, François Artiguenave2, Alain Billault3, Mathias
Choquer1, Arnaud Couloux2, Christina Cuomo4, Martin Dickman5, Elisabeth Fournier1,
Anastassia Gioti1, Corinne Giraud1, Chinnappa Kodira4, Linda Kohn6, Fabrice Legeai2,
Caroline Levis1, Evan Mauceli4, Cyril Pommier2, Jean-Marc Pradier1, Emmanuel
Quevillon2,7, Jeffrey Rollins8, Béatrice Ségurens3, Adeline Simon1, Muriel Viaud1, Jean
Weissenbach3, Patrick Wincker3 & Marc-Henri Lebrun7
1 INRA, Rte de St-Cyr, 78026 Versailles, France
2 URGI, INRA, 523 place des Terrasses, 91000 Evry, France
3 Génoscope - CNS, 2 rue Gaston Crémieux, 91000 Evry, France
4 Broad Institute, 320 Charles St., Cambridge MA 02141, USA
5 Inst for Plant Genomics and Biotech, 2123 TAMU, College Station, TX, USA
6 University of Toronto, 3359 Mississauga Rd North, Mississauga ON L5L 1C6, Canada
7 Plant and Fungal Physiology, UMR 2847, CNRS-BayerCropScience, 69009 Lyon, France
8 Department of Plant Pathology, University of Florida, PO Box 110680, Gainesville, FL, USA
Introduction
Filamentous fungi are the most severe pathogens of crop plants worldwide, including the related
Ascomycetes Botrytis cinerea (teleomorph Botryotinia fuckeliana) and Sclerotinia sclerotiorum from Leotiales.
These two fungi are necrotrophic and polyphagous plant pathogens responsible for important crop diseases
causing hundreds of millions of dollars in losses as a result of reduced crop production and quality. As there is
no genetic resistance to these fungi among commercial host plant cultivars, fungicide application is the
principal method to control these diseases, under the threat of the possible emergence of resistant strains
[1, 2]. S. sclerotiorum, the causal agent of white mould, has the broadest host range known for a fungal plant
pathogen (over 400 plant species, mainly dicots, including crops such as soybeans, dry beans, sunflowers,
canola and potatoes). S. sclerotiorum produces sclerotia, which are melanized resting structures. Sclerotia can
germinate into an asexual form such as hyphae or start a sexual cycle leading to the production of apothecia from
which ascospores are liberated (Fig. 1A). The fungus is homothallic (self-fertile) and does not produce
propagative asexual spores. Ascospores are the primary inoculum in infection of most crop species, but direct
infection from germinating sclerotia can be important. Sclerotia can germinate in crop debris saprophytically, but
they are also known to persist in soil in a relatively dormant state, for several years, depending on soil type [3,
4]. B. cinerea, the causal agent of grey mould, is responsible for 15-40% of pre- and post-harvest crop losses on
tomatoes, strawberries, ornamental flowers and more than 200 other dicotyledonous plants. On grapevine, B.
cinerea strongly affects the quantity and quality of the harvest. B. cinerea is also involved in noble rot, a process
used for the production of high-quality sweet wines such as Sauternes. Although the biological differences
between grey and noble rot remain unknown, both are caused by isolates from similar B. cinerea populations [5, 6]. The
infection cycle (Fig. 1B) generally starts with the germination of an asexual spore (macro-conidium) on the host
plant surface, followed by the penetration into host tissues, their rapid destruction and the production of
abundant conidia from diseased necrotic lesions. These conidia are dispersed by wind/rain and responsible for
primary/secondary infections on leaves, flowers and stems. During winter, B. cinerea produces resting sclerotia
that germinate into mycelium or conidiophores in the following spring [7]. Although the sexual cycle of this
heterothallic fungus (sexual crosses require partners of two opposite mating types) can be obtained under
laboratory conditions, apothecia are rarely, if ever, found in nature [8]. However, the genetic structure of B.
cinerea populations strongly suggests that genetic recombination occurs among natural isolates as a result of
infrequent crosses [9]. Overall, S. sclerotiorum (Ss) and B. cinerea (Bc) have the same infection
strategy, but they differ strongly in their life cycles with regard to the presence (Bc) or absence (Ss) of asexual
sporulation, the predominance of ascospores (Ss) or macroconidia (Bc) as inoculum, and homo- (Ss) or
heterothallism (Bc).
Fig. 1: Life cycles of the white mould agent S. sclerotiorum (A) and the grey mould agent B. cinerea
(B). Life cycles and interactions with the host plant are simplified and not drawn to scale. For further
explanations see text.
The genome of baker’s yeast Saccharomyces cerevisiae was the first fungal genome to be published, in 1996 [10]. In
2000, the Fungal Genome Initiative (FGI; http://www.broad.mit.edu/annotation/fgi) was launched with the
support of NSF/USDA and the Whitehead Institute Center for Genome Research, now “The Broad Institute”,
with the aim of sequencing genomes from important fungal species including model systems, saprobes, symbionts
and pathogens from different orders of the fungal kingdom. This initiative led to the release in 2003 of the first
filamentous fungus genome sequence (Neurospora crassa, a model ascomycete with a 38 Mb genome [11]).
Several other public institutions, including JGI and Genoscope, are now participating in this genomic effort, and
more than 50 fungal genome sequences were either completed, in progress or planned as of 2006 [12].
Between 2001 and 2003, two publications [13, 14] reported whole-genome analyses of B. cinerea and two
other fungal plant pathogens by Syngenta, a private agrochemical company. These genome sequences remained
undisclosed until 2005, when Syngenta released them publicly through the Broad Institute. The International
Botrytis sequencing project, involving 20 research teams worldwide, was launched in 2004 with the support of
Genoscope (Evry, France). Independently, an NSF-USDA project in partnership with the Broad Institute was funded
in 2004 to sequence the genome of the related species S. sclerotiorum. Between fall 2005 and spring 2006, the
genome sequences of both species were completed, allowing the launch of a comparative genome annotation of
these two closely related fungal species and the analysis of the differences accumulated at the genome level
between two different isolates of B. cinerea. Overall, these three genome projects should deliver the first set of
sequences from Leotiomycota, a fungal order located at the base of the ascomycete phylogenetic tree.
The sequencing projects
The genome sequences of B. cinerea isolate B05.10 (Syngenta), B. cinerea isolate T4 (Genoscope) and
S. sclerotiorum isolate 1980 (Broad Institute) were obtained using a whole genome shotgun (WGS) strategy. In
summary, purified high molecular weight genomic DNA is mechanically sheared to obtain inserts of selected
sizes (3-4 kb and 9-10 kb), which are cloned in plasmid vectors. The resulting genomic libraries are end-sequenced
using universal primers (with a ratio of approximately 4:1 between reads from the small- and large-insert plasmid
libraries). In addition, for the B. cinerea T4 strain and S. sclerotiorum, 50,000 sequences of larger fragments were
obtained from end sequencing of Fosmids (40 kb) or BACs (60-100 kb). These genomic sequences were assembled into
contigs using the Arachne software [15], which takes advantage of read pairing and size information in addition to
sequence overlaps and has been extremely effective in assembling genomes from 0.8 Mb to 2.5 Gb in size. Contigs are
ordered and oriented into larger fragments (scaffolds) using paired sequences from Fosmids and BACs. The size of
each gap can be estimated from the average insert size of the clones that link across the gap. Finally, scaffolds may
be aligned to genetic or physical maps to further validate the assembly and to estimate the genome size. A summary of
sequence and assembly data for both organisms is listed in Table 1. An optical physical map [16] was produced
for S. sclerotiorum, highlighting 16 linkage groups likely corresponding to the 16 chromosomes. Alignment of
the optical map to S. sclerotiorum supercontigs validated the assembly and suggested only a single false join
between two contigs. For B. cinerea, a genetic map of approximately 150 polymorphic microsatellite markers
(one from each supercontig) is in progress. In addition to genomic reads, 50,000-70,000 ESTs (expressed
sequence tags) are planned for each organism and their sequencing is underway. These ESTs are critical for the
annotation process. 22,500 B. cinerea ESTs (corresponding to 5’- and 3’-reads for 5,000 clones and 10,000
5’-reads) were generated from young mycelium (T4). Their clustering with 6,500 existing T4 mycelium ESTs [17]
led to 5,536 unisequences (about one third of the estimated number of genes). The remaining B. cinerea libraries
include cDNAs from mycelium grown at low and high pH, with different carbon sources or treated with various stress
conditions, as well as from different developmental stages (conidia, appressoria). Different developmental stages
(mycelium, apothecia, sclerotia) and infection structures (appressoria) were also used for the construction of S.
sclerotiorum cDNA libraries, and 58,000 ESTs are now available (http://www.broad.mit.edu/annotation/fgi).
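The gap estimate mentioned above can be illustrated with a minimal Python sketch. It is not Arachne's actual procedure: the function name, the input layout and the Fosmid coordinates below are hypothetical and chosen only to show the arithmetic. For each clone spanning a gap, the insert size equals the part of the clone lying on the left contig, plus the gap, plus the part lying on the right contig.

```python
from statistics import mean

def estimate_gap(links, insert_size):
    """Estimate the gap between two adjacent contigs from clones that span it.

    Each link is a pair (left_overhang, right_overhang):
      left_overhang  = bases from the start of the forward read to the end of the left contig
      right_overhang = bases from the start of the right contig to the end of the reverse read
    For a single clone: insert_size = left_overhang + gap + right_overhang.
    The estimate is the mean over all linking clones.
    """
    if not links:
        raise ValueError("at least one linking clone is required")
    return mean(insert_size - left - right for left, right in links)

# Hypothetical Fosmid links (40 kb average insert size) across one gap
fosmid_links = [(18_000, 20_500), (25_000, 13_200), (9_500, 29_300)]
gap = estimate_gap(fosmid_links, insert_size=40_000)
print(gap)
```

A negative estimate would indicate that the two contigs probably overlap rather than being separated by unsequenced DNA.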
Table 1: Genome project data

                        S. sclerotiorum        B. cinerea B05.10      B. cinerea T4
coverage                7-8 x                  4-5 x                  10 x
reads                   476,000                292,000                600,000
Total size              38 Mb                  42.7 Mb                39.5 Mb
Mitochondrial genome    129 kb                 81 kb                  n.a.
                        contigs   scaffolds    contigs   scaffolds    contigs   scaffolds
number                  680       36           4500      600          2281      118
N50                     123 kb    1.6 Mb       16 kb     257 kb       35 kb     562 kb
ESTs                    58,000                 10,000*                40,000*
S. sclerotiorum sequencing and assembly and B. cinerea B05.10 assembly were performed at the Broad
Institute; B. cinerea T4 sequencing and assembly were performed at Genoscope.
N50: 50% of the genome bp are in contigs/scaffolds of at least the indicated size. EST: expressed sequence
tag. n.a.: not analyzed. *Expected (22,500 B. cinerea T4 ESTs are available).
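The N50 definition given in the table notes can be made concrete with a short Python sketch. The contig lengths used here are toy values, not data from either genome project.

```python
def n50(lengths):
    """Return the largest length L such that sequences of length >= L
    together contain at least half of the total number of bases."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating bases
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")

# Toy contig lengths (arbitrary units), total = 300
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

With these toy values, the two longest contigs (80 + 70 = 150) already hold half of the 300 bases, so the N50 is 70.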
Annotating the genomes
The identification of genomic features and functional elements in fungal genomes is an essential
component of genome projects. The major difficulty is the accurate prediction of genes with automatic
procedures. ESTs are particularly helpful as they pinpoint the precise location of gene/exon boundaries, and
can suggest differentially spliced transcripts. Polymorphisms between these two species are more likely to occur
in non-coding regions than in coding regions. This characteristic is used to detect exons as conserved regions
within whole genome sequence alignments using software such as Exofish [18]. Such a strategy has proven
useful for related yeast genomes [19], genomes from two related serotypes of the basidiomycete Cryptococcus
neoformans [20], and for partial genome sequences from related ascomycetes such as Neurospora crassa and
Sordaria macrospora [21].
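The idea of detecting exons as conserved regions can be sketched in a few lines of Python. This is only an illustration of the principle and not the Exofish method itself: it assumes a pairwise nucleotide alignment is already available as two gapped strings, and the window size and identity threshold are arbitrary, hypothetical parameters.

```python
def conserved_windows(aln_a, aln_b, window=30, min_identity=0.8):
    """Return (start, end) alignment-column intervals covered by windows
    whose fraction of identical, non-gap columns is >= min_identity.

    aln_a and aln_b are the two rows of a pairwise alignment
    (equal length, '-' for gaps). Overlapping windows are merged.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    match = [a == b and a != "-" for a, b in zip(aln_a, aln_b)]
    intervals = []
    for start in range(len(match) - window + 1):
        if sum(match[start:start + window]) / window >= min_identity:
            end = start + window
            if intervals and start <= intervals[-1][1]:
                # Extend the previous interval instead of opening a new one
                intervals[-1] = (intervals[-1][0], end)
            else:
                intervals.append((start, end))
    return intervals

# Toy alignment: two conserved blocks separated by a divergent stretch
row_a = "ACGT" * 5 + "A" * 20 + "ACGT" * 5
row_b = "ACGT" * 5 + "T" * 20 + "ACGT" * 5
print(conserved_windows(row_a, row_b, window=10, min_identity=1.0))
```

In practice, conserved intervals found this way would only be candidate exons, to be confirmed with ESTs or gene models.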
Fig. 2: Flow chart of the automatic gene prediction for B. cinerea genes.
Bioinformatic tools are indicated in italics.
Figure 2 summarizes the bioinformatics procedures that are used to predict a maximum number of
genes with the highest achievable precision. The Botrytis T4 structural annotation, performed at URGI (INRA,
Evry, France), combines the ab initio gene finder Fgenesh (http://www.softberry.com) with
similarity methods such as blast, sim4, and GeneWise [22-24]. For the B. cinerea T4 genome sequence,
information from ab initio and similarity methods as well as splice site predictions (SpliceMachine [25]) is
integrated by EuGene (http://www.inra.fr/bia/T/EuGene/) for the automatic decision on the best gene model.
These programs require training with data sets of full-length cDNAs and their corresponding
genomic fragments in order to develop organism-specific parameters. Manual validation of the automatically
generated gene models will be performed by scientists of the consortium. Preliminary data obtained by this
automatic process are listed in Table 2.
For S. sclerotiorum and B. cinerea B05.10 genome sequences, gene prediction was performed at the
Broad Institute using Fgenesh and GeneID [26] trained for S. sclerotiorum. The accuracy of these predicted gene
structures was assessed with preliminary available ESTs and manually annotated genes, as summarized in
Table 2.
Table 2: Statistical data for B. cinerea and S. sclerotiorum genomes.
From (http://www.broad.mit.edu/annotation/fgi/) and URGI/Genoscope project (unpublished)

                                          B. cinerea T4   B. cinerea B05.10   S. sclerotiorum
genome size (Mb)                          39.5            42.7                38.3
Sequence coverage                         10 x            4 x                 8 x
%GC                                       43.2%           43.1%               41.8%
number of genes (automatic prediction)    14,219*         16,448              14,522
manually curated genes                    400             756                 1,141
median gene length (bp)                   1,140*          968                 1,067
avg gene density (bp per gene)            2,780*          2,594               2,643
median exon length (bp)                   188*            190                 182
avg exon number per gene                  2.9*            3.2                 3.3
median intron length (bp)                 68*             74                  78
avg introns per gene                      1.9*            2.2                 2.3
% genes with introns                      91%*            73.8%               77.4%
median intergenic length (bp)             800*            937                 974

*preliminary results after ab initio prediction (fgenesh / botrytis parameters)
Once the whole set of genes from a fungal genome has been predicted, the functions of their
corresponding proteins have to be determined. This functional annotation process should attribute a potential
biochemical/cellular function to most predicted proteins. The corresponding automatic annotation process
relies mainly on similarity searches for orthologs with known functions in Swiss-Prot and fungal databases
and on systematic searches for known protein domains using InterProScan [27]. At the same time, Gene Ontology
(GO) annotation will be performed in order to assign terms describing molecular function, cellular component
or biological process. This information, integrated in a database, will be analyzed by experts to validate the
functions assigned automatically and to attribute a functional name to each protein according to pre-defined terms.
Another level of annotation is the identification of orthologous and paralogous genes. Genes with high sequence
similarity belonging to different species are orthologous if their phylogeny follows the species phylogeny.
Orthologs will typically have the same or similar cellular function. Detection of orthologs in two different
genomes generally involves whole-genome bidirectional best blast hit (BDBH) searches, although alternative
methods relying on micro-synteny (conservation of the order of neighbouring genes) may be used.
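The bidirectional best hit criterion can be sketched as follows in Python. This is a simplified illustration, assuming blast results have already been reduced to (query, subject, score) tuples; the protein identifiers and scores are hypothetical, and real pipelines usually add e-value and alignment-coverage filters.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    hits is an iterable of (query, subject, score) tuples.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}

def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs where b is the best hit of a and a is the best hit of b."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)

# Hypothetical blast results between two proteomes
ss_vs_bc = [("Ss1", "Bc1", 900), ("Ss1", "Bc2", 300),
            ("Ss2", "Bc2", 500), ("Ss3", "Bc2", 450)]
bc_vs_ss = [("Bc1", "Ss1", 880), ("Bc2", "Ss2", 510), ("Bc2", "Ss3", 440)]
print(bidirectional_best_hits(ss_vs_bc, bc_vs_ss))
```

In this toy example Ss3 finds Bc2 as its best hit, but Bc2 prefers Ss2, so the pair (Ss3, Bc2) is not reported as a probable ortholog pair.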
Comparative genomics – first results
Available B. cinerea B05.10, B. cinerea T4 and S. sclerotiorum 1980 genome features are listed in Table
2. The automatic gene-prediction process (GeneWise, FGenesH, FGenesH+, and GeneID) was applied to B.
cinerea B05.10 and S. sclerotiorum 1980, leading to the detection of a higher number of genes (about 2,000 more)
in B. cinerea than in S. sclerotiorum. However, the gene count in the B. cinerea B05.10 sequence is affected by the
discontinuous nature of this low-coverage assembly, which could artificially increase the number of genes as some
might be split. Indeed, preliminary automatic gene prediction (FGenesH) applied to B. cinerea T4 revealed
about 14,200 genes, a number similar to that obtained for S. sclerotiorum. Thorough structural annotation is
nevertheless needed and might reveal particularly interesting differences in gene content between the genome
sequences of the two B. cinerea strains. This preliminary analysis also highlights the high proportion of genes with
introns (74%-91%) among automatically predicted gene models (some of which have been manually
corrected). This proportion is similar to that of other filamentous fungal genomes [12] and underlines the
importance of a large number of ESTs and robust gene prediction algorithms for annotating fungal genomes.
High-identity orthologs between S. sclerotiorum and B. cinerea can be found for most proteins.
Comparing the two protein data sets revealed potential orthologs in B. cinerea for as many as 70% of S.
sclerotiorum proteins (10,100) with an average identity of 75%. Most of these protein pairs (8,651) are detected as
best bidirectional blast matches (probable orthologs). The best protein matches to other fungal proteins are found
for a smaller subset at lower percent identity, as shown for the 55% of Sclerotinia proteins that have an average of
47% identity with homologs from the most closely related ascomycetes. A recent comparison of orthologous proteins
between yeast species [28] highlighted a 65% protein sequence identity between Saccharomyces cerevisiae and Candida
glabrata. An average amino acid identity of 68% was also found in a comparison between three Aspergillus
species [29], similar to the conservation between human and fish proteins (~70%). The closer relationship
between B. cinerea and S. sclerotiorum appears ideal to highlight conserved exon sequences. The two genomes have
also preserved a high degree of gene order conservation (synteny), which is important for establishing orthologous
relationships between genes and for studying genome evolution (Cuomo et al., unpublished).
Table 3: Similarity between S. sclerotiorum predicted proteins and other fungal proteins

                  B. cinerea B05.10   A. nidulans   F. graminearum   M. grisea   N. crassa
hits              10,087*             7,710         7,982            7,888       7,651
avg % identity    75.6%               46.6%         48.4%            47.5%       48.2%

*results from BlastP, first non redundant best blast hit
Future prospects
The comparison between B. cinerea and S. sclerotiorum offers the first opportunity to compare the
genomes of two closely related plant pathogens. Indeed, the preliminary comparison between the B. cinerea and S.
sclerotiorum genomes highlights a high degree of sequence conservation (Table 3), suggesting a relatively recent
speciation event. This comparative analysis will allow the examination of sequence polymorphisms, genome
structures, and synteny between these two species but also between the two B. cinerea strains. The results of this
study will help in understanding the evolutionary trends that have shaped the genomes of these closely related
species. Comparing gene content and organization between B. cinerea and S. sclerotiorum will also provide a
foundation for an accurate identification of the minimal common set of genes within ascomycetes and highlight the
differences that could be associated with their different life cycles. This data set will be very useful for
genome-wide comparisons with other pathogenic or saprobic fungi. These genome resources will also certainly stimulate
the development of high-throughput functional analyses. In particular, the development of B. cinerea and S.
sclerotiorum DNA chips using these genome-wide gene data sets will be very helpful for the identification of
differentially expressed genes during plant-pathogen interactions. Proteomic studies and functional
high-throughput reverse genomics will also benefit from these genomic resources, leading to a better understanding
of the infectious processes and to the deciphering of the mechanisms involved in the large host range of these species.
References
1. Bardin, S.D. and H.C. Huang, Research on biology and control of Sclerotinia diseases in Canada. Can.
J. Plant Pathol., 2001. 23: p. 88-98.
2. Leroux, P., Chemical control of Botrytis cinerea and its resistance to chemical fungicides, in Botrytis:
Biology, Pathology and Control, Y. Elad, et al., Editors. 2004, Kluwer Academic Publishers:
Dordrecht. p. 195-222.
3. Hegedus, D.D. and S.R. Rimmer, Sclerotinia sclerotiorum: when "to be or not to be" a pathogen?
FEMS Microbiol Lett, 2005. 251(2): p. 177-84.
4. Bolton, M.D., B. Thomma, and B.D. Nelson, Sclerotinia sclerotiorum (Lib.) de Bary: biology and
molecular traits of a cosmopolitan pathogen. Molecular Plant Pathology, 2006. 7(1): p. 1-16.
5. Geny, L., A. Darrieumerlou, and B. Doneche, Conjugated polyamines and hydroxycinnamic acids in
grape berries during Botrytis cinerea disease development: differences between 'noble rot' and 'grey
mould'. Australian Journal of Grape and Wine Research, 2003. 9(2): p. 102-106.
6. Fournier, E., et al., Partition of the Botrytis cinerea complex in France using multiple gene genealogies.
Mycologia, 2005. 97(6): p. 1251-1267.
7. Holz, G., S. Coertze, and B. Williamson, The ecology of Botrytis on plant surfaces, in Botrytis: Biology,
Pathology and Control, Y. Elad, et al., Editors. 2004, Kluwer Academic Publishers: Dordrecht. p. 9-28.
8. Beever, R.P. and P.L. Weeds, Taxonomy and genetic variation of Botrytis and Botryotinia, in Botrytis:
Biology, Pathology and Control, Y. Elad, et al., Editors. 2004, Kluwer Academic Publishers:
Dordrecht. p. 29-52.
9. Giraud, T., et al., RFLP markers show genetic recombination in Botryotinia fuckeliana (Botrytis
cinerea) and transposable elements reveal two sympatric species. Molecular Biology and Evolution,
1997. 14(11): p. 1177-1185.
10. Goffeau, A., et al., Life with 6000 genes. Science, 1996. 274(5287): p. 546, 563-7.
11. Galagan, J.E., et al., The genome sequence of the filamentous fungus Neurospora crassa. Nature, 2003.
422(6934): p. 859-68.
12. Galagan, J.E., et al., Genomics of the fungal kingdom: insights into eukaryotic biology. Genome Res,
2005. 15(12): p. 1620-31.
13. Yoder, O.C. and B.G. Turgeon, Fungal genomics and pathogenicity. Curr Opin Plant Biol, 2001. 4(4):
p. 315-21.
14. Catlett, N.L., O.C. Yoder, and B.G. Turgeon, Whole-genome analysis of two-component signal
transduction genes in fungal pathogens. Eukaryot Cell, 2003. 2(6): p. 1151-61.
15. Jaffe, D.B., et al., Whole-genome sequence assembly for mammalian genomes: Arachne 2. Genome
Res, 2003. 13(1): p. 91-6.
16. Schwartz, D.C., et al., Ordered restriction maps of Saccharomyces cerevisiae chromosomes constructed
by optical mapping. Science, 1993. 262(5130): p. 110-4.
17. Viaud, M., et al., Expressed sequence tags from the phytopathogenic fungus Botrytis cinerea. European
Journal of Plant Pathology, 2005. 111(2): p. 139-146.
18. Roest Crollius, H., et al., Estimate of human gene number provided by genome-wide analysis using
Tetraodon nigroviridis DNA sequence. Nat Genet, 2000. 25: p. 235-238.
19. Kellis, M., et al., Sequencing and comparison of yeast species to identify genes and regulatory
elements. Nature, 2003. 423(6937): p. 241-54.
20. Tenney, A.E., et al., Gene prediction and verification in a compact genome with numerous small
introns. Genome Res, 2004. 14(11): p. 2330-5.
21. Nowrousian, M., et al., Comparative sequence analysis of Sordaria macrospora and Neurospora crassa
as a means to improve genome annotation. Fungal Genet Biol, 2004. 41(3): p. 285-92.
22. Birney, E., M. Clamp, and R. Durbin, GeneWise and Genomewise. Genome Res, 2004. 14(5): p. 988-
95.
23. Florea, L., et al., A computer program for aligning a cDNA sequence with a genomic DNA sequence.
Genome Res, 1998. 8(9): p. 967-74.
24. Altschul, S.F., et al., Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs. Nucleic Acids Res, 1997. 25: p. 3389-3402.
25. Degroeve, S., et al., SpliceMachine: predicting splice sites from high-dimensional local context
representations. Bioinformatics, 2005. 21(8): p. 1332-38.
26. Parra, G., E. Blanco, and R. Guigo, GeneID in Drosophila. Genome Res, 2000. 10(4): p. 511-5.
27. Quevillon, E., et al., InterProScan: protein domains identifier. Nucleic Acids Res, 2005. 33: p. W116-
20.
28. Dujon, B., et al., Genome evolution in yeasts. Nature, 2004. 430(6995): p. 35-44.
29. Galagan, J.E., et al., Sequencing of Aspergillus nidulans and comparative analysis with A. fumigatus
and A. oryzae. Nature, 2005. 438(7071): p. 1105-15.