ABSTRACT: The soil fungus Rhizoctonia solani is an economically important pathogen of agricultural and forestry crops. Here we present the complete sequence and analysis of the mitochondrial genome of R. solani, field isolate Rhs1AP. The genome (235,849 bp) is the largest mitochondrial genome of a filamentous fungus sequenced to date and exhibits a rich accumulation of introns, novel repeat sequences, homing endonuclease genes, and hypothetical genes. Stable secondary structures exhibited by repeat sequences suggest that they comprise functional, possibly catalytic RNA elements. RNA-Seq expression profiling confirmed that the majority of homing endonuclease genes and hypothetical genes are transcriptionally active. Comparative analysis suggests that the mitochondrial genome of R. solani is an example of a dynamic history of expansion in filamentous fungi.
ABSTRACT: Background
The genera Aspergillus and Penicillium include some of the most beneficial as well as the most harmful fungal species such as the penicillin-producer Penicillium chrysogenum and the human pathogen Aspergillus fumigatus, respectively. Their mitochondrial genomic sequences may hold vital clues into the mechanisms of their evolution, population genetics, and biology, yet only a handful of these genomes have been fully sequenced and annotated.
Here we report the complete sequence and annotation of the mitochondrial genomes of six Aspergillus and three Penicillium species: A. fumigatus, A. clavatus, A. oryzae, A. flavus, Neosartorya fischeri (A. fischerianus), A. terreus, P. chrysogenum, P. marneffei, and Talaromyces stipitatus (P. stipitatum). The accompanying comparative analysis of these and related publicly available mitochondrial genomes reveals wide variation in size (25–36 kb) among these closely related fungi. The sources of genome expansion include group I introns and accessory genes encoding putative homing endonucleases, DNA and RNA polymerases (presumed to be of plasmid origin) and hypothetical proteins. The two smallest sequenced genomes (A. terreus and P. chrysogenum) do not contain introns in protein-coding genes, whereas the largest genome (T. stipitatus) contains a total of eleven introns. All of the sequenced genomes have a group I intron in the large ribosomal subunit RNA gene, suggesting that this intron is fixed in these species. Subsequent analysis of several A. fumigatus strains showed low intraspecies variation. This study also includes a phylogenetic analysis based on 14 concatenated core mitochondrial proteins. The phylogenetic tree has a different topology from published multilocus trees, highlighting the challenges still facing Aspergillus systematics.
The study expands the genomic resources available to fungal biologists by providing mitochondrial genomes with consistent annotations for future genetic, evolutionary and population studies. Despite the conservation of the core genes, the mitochondrial genomes of Aspergillus and Penicillium species examined here exhibit a significant amount of interspecies variation. Most of this variation can be attributed to accessory genes and mobile introns, presumably acquired by horizontal gene transfer of mitochondrial plasmids and intron homing.
ABSTRACT: Burkholderia multivorans is a Gram-negative bacterium and a member of the Burkholderia cepacia complex, which is frequently associated with respiratory infections in people with cystic fibrosis (CF) and chronic granulomatous disease (CGD). We are reporting the genome sequences of 4 B. multivorans strains, 2 from CF patients and 2 from CGD patients.
Journal of bacteriology 11/2012; 194(22):6356-7. DOI:10.1128/JB.01306-12 · 2.81 Impact Factor
ABSTRACT: Neisseria meningitidis serogroup B has been predominant in Brazil, but no broadly effective vaccine is available to prevent endemic meningococcal disease. To understand genetic diversity among serogroup B strains in Brazil, we selected a nationally representative sample of clinical disease isolates from 2004, and a temporally representative sample for the state of São Paulo (1988-2006) for study (n = 372).
We performed multi-locus sequence typing (MLST) and sequence analysis of five outer membrane protein (OMP) genes, including novel vaccine targets fHbp and nadA.
In 2004, strain B:4:P1.15,19 clonal complex ST-32/ET-5 (cc32) predominated throughout Brazil; regional variation in MLST sequence type (ST), fetA, and porB was significant but diversity was limited for nadA and fHbp. Between 1988 and 1996, the São Paulo isolates shifted from clonal complex ST-41/44/Lineage 3 (cc41/44) to cc32. OMP variation was associated with but not predicted by cc or ST. Overall, fHbp variant 1/subfamily B was present in 80% of isolates and showed little diversity. The majority of nadA were similar to reference allele 1.
A predominant serogroup B lineage has circulated in Brazil for over a decade with significant regional and temporal diversity in ST, fetA, and porB, but not in nadA and fHbp.
PLoS ONE 07/2012; 7(3):e33016. DOI:10.1371/journal.pone.0033016 · 3.23 Impact Factor
ABSTRACT: We provide here a comparative genome analysis of ten strains within the Pseudomonas fluorescens group including seven new genomic sequences. These strains exhibit a diverse spectrum of traits involved in biological control and other multitrophic interactions with plants, microbes, and insects. Multilocus sequence analysis placed the strains in three sub-clades, which was reinforced by high levels of synteny, size of core genomes, and relatedness of orthologous genes between strains within a sub-clade. The heterogeneity of the P. fluorescens group was reflected in the large size of its pan-genome, which makes up approximately 54% of the pan-genome of the genus as a whole, and a core genome representing only 45-52% of the genome of any individual strain. We discovered genes for traits that were not known previously in the strains, including genes for the biosynthesis of the siderophores achromobactin and pseudomonine and the antibiotic 2-hexyl-5-propyl-alkylresorcinol; novel bacteriocins; type II, III, and VI secretion systems; and insect toxins. Certain gene clusters, such as those for two type III secretion systems, are present only in specific sub-clades, suggesting vertical inheritance. Almost all of the genes associated with multitrophic interactions map to genomic regions present in only a subset of the strains or unique to a specific strain. To explore the evolutionary origin of these genes, we mapped their distributions relative to the locations of mobile genetic elements and repetitive extragenic palindromic (REP) elements in each genome. The mobile genetic elements and many strain-specific genes fall into regions devoid of REP elements (i.e., REP deserts) and regions displaying atypical tri-nucleotide composition, possibly indicating relatively recent acquisition of these loci. Collectively, the results of this study highlight the enormous heterogeneity of the P. fluorescens group and the importance of the variable genome in tailoring individual strains to their specific lifestyles and functional repertoire.
ABSTRACT: Staphylococci are increasingly aggressive human pathogens suggesting that active evolution is spreading novel virulence and resistance phenotypes. Large staphylococcal plasmids commonly carry antibiotic resistances and virulence loci, but relatively few have been completely sequenced. We determined the plasmid content of 280 staphylococci isolated in diverse geographical regions from the 1940s to the 2000s and found that 79% of strains carried at least one large plasmid >20 kb and that 75% of these large plasmids were 20-30 kb. Using restriction fragment length polymorphism (RFLP) analysis, we grouped 43% of all large plasmids into three major families, showing remarkably conserved intercontinental spread of multiresistant staphylococcal plasmids over seven decades. In total, we sequenced 93 complete and 57 partial staphylococcal plasmids ranging in size from 1.3 kb to 64.9 kb, tripling the number of complete sequences for staphylococcal plasmids >20 kb in the NCBI RefSeq database. These plasmids typically carried multiple antimicrobial and metal resistances and virulence genes, transposases and recombinases. Remarkably, plasmids within each of the three main families were >98% identical, apart from insertions and deletions, despite being isolated from strains decades apart and on different continents. This suggests enormous selective pressure has optimized the content of certain plasmids despite their large size and complex organization.
ABSTRACT: We present the draft genome for the Rickettsia endosymbiont of Ixodes scapularis (REIS), a symbiont of the deer tick vector of Lyme disease in North America. Among Rickettsia species (Alphaproteobacteria: Rickettsiales), REIS has the largest genome sequenced to date (>2 Mb) and contains 2,309 genes across the chromosome and four plasmids (pREIS1 to pREIS4). The most remarkable finding within the REIS genome is the extraordinary proliferation of mobile genetic elements (MGEs), which contributes to a limited synteny with other Rickettsia genomes. In particular, an integrative conjugative element named RAGE (for Rickettsiales amplified genetic element), previously identified in scrub typhus rickettsiae (Orientia tsutsugamushi) genomes, is present on both the REIS chromosome and plasmids. Unlike the pseudogene-laden RAGEs of O. tsutsugamushi, REIS encodes nine conserved RAGEs that include F-like type IV secretion systems similar to that of the tra genes encoded in the Rickettsia bellii and R. massiliae genomes. An unparalleled abundance of encoded transposases (>650) relative to genome size, together with the RAGEs and other MGEs, comprise ∼35% of the total genome, making REIS one of the most plastic and repetitive bacterial genomes sequenced to date. We present evidence that conserved rickettsial genes associated with an intracellular lifestyle were acquired via MGEs, especially the RAGE, through a continuum of genomic invasions. Robust phylogeny estimation suggests REIS is ancestral to the virulent spotted fever group of rickettsiae. As REIS is not known to invade vertebrate cells and has no known pathogenic effects on I. scapularis, its genome sequence provides insight on the origin of mechanisms of rickettsial pathogenicity.
Journal of bacteriology 11/2011; 194(2):376-94. DOI:10.1128/JB.06244-11 · 2.81 Impact Factor
ABSTRACT: Ichthyophthirius multifiliis, commonly known as Ich, is a highly pathogenic ciliate responsible for 'white spot', a disease causing significant economic losses to the global aquaculture industry. Options for disease control are extremely limited, and Ich's obligate parasitic lifestyle makes experimental studies challenging. Unlike most well-studied protozoan parasites, Ich belongs to a phylum composed primarily of free-living members. Indeed, it is closely related to the model organism Tetrahymena thermophila. Genomic studies represent a promising strategy to reduce the impact of this disease and to understand the evolutionary transition to parasitism.
We report the sequencing, assembly and annotation of the Ich macronuclear genome. Compared with its free-living relative T. thermophila, the Ich genome is reduced approximately two-fold in length and gene density and three-fold in gene content. We analyzed in detail several gene classes with diverse functions in behavior, cellular function and host immunogenicity, including protein kinases, membrane transporters, proteases, surface antigens and cytoskeletal components and regulators. We also mapped by orthology Ich's metabolic pathways in comparison with other ciliates and a potential host organism, the zebrafish Danio rerio.
Knowledge of the complete protein-coding and metabolic potential of Ich opens avenues for rational testing of therapeutic drugs that target functions essential to this parasite but not to its fish hosts. Also, a catalog of surface protein-encoding genes will facilitate development of more effective vaccines. The potential to use T. thermophila as a surrogate model offers promise toward controlling 'white spot' disease and understanding the adaptation to a parasitic lifestyle.
ABSTRACT: Yersinia pestis is the causative agent of the plague. Y. pestis KIM 10+ strain was passaged and selected for loss of the 102 kb pgm locus, resulting in an attenuated strain, KIM D27. In this study, whole genome sequencing was performed on KIM D27 in order to identify any additional differences. Initial assemblies of 454 data were highly fragmented, and various bioinformatic tools detected between 15 and 465 SNPs and INDELs when comparing both strains, the vast majority associated with A or T homopolymer sequences. Consequently, Illumina sequencing was performed to improve the quality of the assembly. Hybrid sequence assemblies were performed and a total of 56 validated SNP/INDELs and 5 repeat differences were identified in the D27 strain relative to the published KIM 10+ sequence. However, further analysis showed that 55 of these SNP/INDELs and 3 repeats were errors in the KIM 10+ reference sequence. We conclude that both 454 and Illumina sequencing were required to obtain the most accurate and rapid sequence results for Y. pestis KIM D27. SNP and INDEL calls were most accurate when both Newbler and CLC Genomics Workbench were employed. For purposes of obtaining high quality genome sequence differences between strains, any identified differences should be verified in both the new and reference genomes.
PLoS ONE 04/2011; 6(4):e19054. DOI:10.1371/journal.pone.0019054 · 3.23 Impact Factor
ABSTRACT: Mycoplasma alligatoris and Mycoplasma crocodyli are closely related siblings, one being highly virulent and the other relatively attenuated. We compared their genomes to better understand the mechanisms and origins of M. alligatoris' remarkable virulence amid a clade of harmless or much less virulent species. Although its chromosome was refractory to closure, M. alligatoris differed most notably by its complement of sialidases and other genes of the N-acetylneuraminate scavenging and catabolism pathway.
Journal of bacteriology 04/2011; 193(11):2892-3. DOI:10.1128/JB.00309-11 · 2.81 Impact Factor
ABSTRACT: Pythium ultimum is a ubiquitous oomycete plant pathogen responsible for a variety of diseases on a broad range of crop and ornamental species.
The P. ultimum genome (42.8 Mb) encodes 15,290 genes and has extensive sequence similarity and synteny with related Phytophthora species, including the potato blight pathogen Phytophthora infestans. Whole transcriptome sequencing revealed expression of 86% of genes, with detectable differential expression of suites of genes under abiotic stress and in the presence of a host. The predicted proteome includes a large repertoire of proteins involved in plant pathogen interactions, although, surprisingly, the P. ultimum genome does not encode any classical RXLR effectors and relatively few Crinkler genes in comparison to related phytopathogenic oomycetes. A lower number of enzymes involved in carbohydrate metabolism were present compared to Phytophthora species, with the notable absence of cutinases, suggesting a significant difference in virulence mechanisms between P. ultimum and more host-specific oomycete species. Although we observed a high degree of orthology with Phytophthora genomes, there were novel features of the P. ultimum proteome, including an expansion of genes involved in proteolysis and genes unique to Pythium. We identified a small gene family of cadherins, proteins involved in cell adhesion, the first report of these in a genome outside the metazoans.
Access to the P. ultimum genome has revealed not only core pathogenic mechanisms within the oomycetes but also lineage-specific genes associated with the alternative virulence and lifestyles found within the pythiaceous lineages compared to the Peronosporaceae.
ABSTRACT: The human microbiome refers to the community of microorganisms, including prokaryotes, viruses, and microbial eukaryotes, that populate the human body. The National Institutes of Health launched an initiative that focuses on describing the diversity of microbial species that are associated with health and disease. The first phase of this initiative includes the sequencing of hundreds of microbial reference genomes, coupled to metagenomic sequencing from multiple body sites. Here we present results from an initial reference genome sequencing of 178 microbial genomes. From 547,968 predicted polypeptides that correspond to the gene complement of these strains, previously unidentified ("novel") polypeptides that had both unmasked sequence length greater than 100 amino acids and no BLASTP match to any nonreference entry in the nonredundant subset were defined. This analysis resulted in a set of 30,867 polypeptides, of which 29,987 (approximately 97%) were unique. In addition, this set of microbial genomes allows for approximately 40% of random sequences from the microbiome of the gastrointestinal tract to be associated with organisms based on the match criteria used. Insights into pan-genome analysis suggest that we are still far from saturating microbial species genetic data sets. In addition, the associated metrics and standards used by our group for quality assurance are presented.
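The novelty criterion stated in the abstract above (unmasked sequence length greater than 100 amino acids and no BLASTP match to a nonreference entry) can be read as a two-condition filter. The following is only a minimal sketch of that reading, not the authors' pipeline: the `Polypeptide` record, its field names, and the sample data are hypothetical, and the BLASTP search is assumed to have been run beforehand so that its outcome is available as a boolean.

```python
from dataclasses import dataclass

# Threshold taken from the abstract: strictly greater than 100 amino acids.
MIN_UNMASKED_LENGTH = 100


@dataclass(frozen=True)
class Polypeptide:
    """Hypothetical record for one predicted polypeptide (illustrative only)."""
    identifier: str
    unmasked_length: int        # unmasked sequence length in amino acids
    has_nonreference_hit: bool  # assumed precomputed BLASTP result


def is_novel(p: Polypeptide) -> bool:
    """Apply both conditions from the abstract: long enough and no nonreference match."""
    return p.unmasked_length > MIN_UNMASKED_LENGTH and not p.has_nonreference_hit


def select_novel(polypeptides):
    """Return the polypeptides that satisfy the novelty criterion, in input order."""
    return [p for p in polypeptides if is_novel(p)]


def percent(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole


if __name__ == "__main__":
    # Made-up sample data, purely to exercise the filter.
    sample = [
        Polypeptide("pp1", 150, False),  # passes both conditions
        Polypeptide("pp2", 100, False),  # fails: length is not greater than 100
        Polypeptide("pp3", 300, True),   # fails: has a nonreference BLASTP match
    ]
    print([p.identifier for p in select_novel(sample)])

    # Arithmetic check of the figures quoted in the abstract.
    print(round(percent(29_987, 30_867)))
```

The last line only re-derives the quoted proportion: 29,987 of 30,867 is about 97%, consistent with the abstract.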
ABSTRACT: For over a decade, genome sequences have adhered to only two standards that are relied on for purposes of sequence analysis by interested third parties (1, 2). However, ongoing developments in revolutionary sequencing technologies have resulted in a redefinition of traditional whole genome sequencing that requires a careful reevaluation of such standards. With commercially available 454 pyrosequencing (followed by Illumina, SOLiD, and now Helicos), there has been an explosion of genomes sequenced under the moniker 'draft'; however, these can be very poor quality genomes (due to inherent errors in the sequencing technologies, and the inability of assembly programs to fully address these errors). Further, one can only infer that such draft genomes may be of poor quality by navigating through the databases to find the number and type of reads deposited in sequence trace repositories (and not all genomes have this available), or to identify the number of contigs or genome fragments deposited to the database. The difficulty in assessing the quality of such deposited genomes has created some havoc for genome analysis pipelines and contributed to many wasted hours of (mis)interpretation. These same novel sequencing technologies have also brought an exponential leap in raw sequencing capability, and at greatly reduced prices that have further skewed the time- and cost-ratios of draft data generation versus the painstaking process of improving and finishing a genome. The resulting effect is an ever-widening gap between drafted and finished genomes that only promises to continue (Figure 1); hence there is an urgent need to distinguish good and poor datasets. The sequencing institutes in the authorship, along with the NIH's Human Microbiome Project Jumpstart Consortium (3), strongly believe that a new set of standards is required for genome sequences.
The following represents a set of six community-defined categories of genome sequence standards that better reflect the quality of the genome sequence, based on our collective understanding of the different technologies, available assemblers, and the varied efforts to improve upon drafted genomes. Due to the increasingly rapid pace of genomics we avoided the use of rigid numerical thresholds in our definitions to take into account the types of products achieved by any combination of technology, chemistry, assembler, or improvement/finishing process.
ABSTRACT: Here we report the complete genome sequence of Teredinibacter turnerae T7901. T. turnerae is a marine gamma proteobacterium that occurs as an intracellular endosymbiont in the gills of wood-boring marine bivalves of the family Teredinidae (shipworms). This species is the sole cultivated member of an endosymbiotic consortium thought to provide the host with enzymes, including cellulases and nitrogenase, critical for digestion of wood and supplementation of the host's nitrogen-deficient diet. T. turnerae is closely related to the free-living marine polysaccharide degrading bacterium Saccharophagus degradans str. 2-40 and to as yet uncultivated endosymbionts with which it coexists in shipworm cells. Like S. degradans, the T. turnerae genome encodes a large number of enzymes predicted to be involved in complex polysaccharide degradation (>100). However, unlike S. degradans, which degrades a broad spectrum (>10 classes) of complex plant, fungal and algal polysaccharides, T. turnerae primarily encodes enzymes associated with deconstruction of terrestrial woody plant material. Also unlike S. degradans and many other eubacteria, T. turnerae dedicates a large proportion of its genome to genes predicted to function in secondary metabolism. Despite its intracellular niche, the T. turnerae genome lacks many features associated with obligate intracellular existence (e.g. reduced genome size, reduced %G+C, loss of genes of core metabolism) and displays evidence of adaptations common to free-living bacteria (e.g. defense against bacteriophage infection). These results suggest that T. turnerae is likely a facultative intracellular endosymbiont whose niche presently includes, or recently included, free-living existence. As such, the T. turnerae genome provides insights into the range of genomic adaptations associated with intracellular endosymbiosis as well as enzymatic mechanisms relevant to the recycling of plant materials in marine environments and the production of cellulose-derived biofuels.
PLoS ONE 02/2009; 4(7):e6085. DOI:10.1371/journal.pone.0006085 · 3.23 Impact Factor
ABSTRACT: Polydnaviruses, double-stranded DNA viruses with segmented genomes, have evolved as obligate endosymbionts of parasitoid wasps. Virus particles are replication deficient and produced by female wasps from proviral sequences integrated into the wasp genome. These particles are co-injected with eggs into caterpillar hosts, where viral gene expression facilitates parasitoid survival and, thereby, survival of proviral DNA. Here we characterize and compare the encapsidated viral genome sequences of bracoviruses in the family Polydnaviridae associated with Glyptapanteles gypsy moth parasitoids, along with near complete proviral sequences from which both viral genomes are derived.
The encapsidated Glyptapanteles indiensis and Glyptapanteles flavicoxis bracoviral genomes, each composed of 29 different size segments, total approximately 517 and 594 kbp, respectively. They are generated from a minimum of seven distinct loci in the wasp genome. Annotation of these sequences revealed numerous novel features for polydnaviruses, including insect-like sugar transporter genes and transposable elements. Evolutionary analyses suggest that positive selection is widespread among bracoviral genes.
The structure and organization of G. indiensis and G. flavicoxis bracovirus proviral segments as multiple loci containing one to many viral segments, flanked and separated by wasp gene-encoding DNA, is confirmed. Rapid evolution of bracovirus genes supports the hypothesis that bracovirus genes are engaged in an 'arms race' between bracovirus and caterpillar. Phylogenetic analyses of the bracoviral genes encoding sugar transporters provide the first robust evidence of a wasp origin for some polydnavirus genes. We hypothesize that transposable elements, such as those described here, could facilitate transfer of genes between proviral segments and host DNA.