Article

PanOCT: automated clustering of orthologs using conserved gene neighborhood for pan-genomic analysis of bacterial strains and closely related species.

J. Craig Venter Institute, 9704 Medical Center Drive, Rockville, MD 20850, USA.
Nucleic Acids Research (Impact Factor: 8.81). 08/2012; 40(22). DOI: 10.1093/nar/gks757
Source: PubMed

ABSTRACT The pan-genome ortholog clustering tool (PanOCT) performs pan-genomic analysis of closely related prokaryotic species or strains. PanOCT uses conserved gene neighborhood information to separate recently diverged paralogs into orthologous clusters where homology-only clustering methods cannot. Results from PanOCT and three commonly used graph-based ortholog-finding programs were compared using a set of four publicly available strains of the same bacterial species. All four methods agreed on ∼70% of the clusters and ∼86% of the proteins. The clusters on which the methods disagreed were inspected for evidence of correctness, yielding 85 high-confidence, manually curated clusters that were used to compare all four methods.
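
The abstract does not spell out how neighborhood information is scored, so the following is only a toy illustration of the general idea, not PanOCT's actual algorithm. It assumes each genome is an ordered list of homology-family labels (a hypothetical representation), and it picks, among paralogs with the same label, the one whose flanking genes overlap most with those of the query gene. The names `neighborhood_score`, `best_match`, and the `window` parameter are invented for this sketch.

```python
def neighborhood_score(genome_a, i, genome_b, j, window=2):
    """Count gene families shared by the flanks of gene i in A and gene j in B.

    Toy measure only: a flank is the set of labels within `window`
    positions on either side, excluding the gene itself.
    """
    def flank(genome, k):
        lo, hi = max(0, k - window), min(len(genome), k + window + 1)
        return {genome[p] for p in range(lo, hi) if p != k}

    return len(flank(genome_a, i) & flank(genome_b, j))


def best_match(genome_a, i, genome_b, window=2):
    """Among genes in B homologous to gene i of A, return the index
    of the one with the most conserved neighborhood."""
    candidates = [j for j, fam in enumerate(genome_b) if fam == genome_a[i]]
    return max(
        candidates,
        key=lambda j: neighborhood_score(genome_a, i, genome_b, j, window),
    )


# Two hypothetical strains, each carrying two paralogs of family "P".
# Homology alone cannot tell the two "P" copies apart; context can.
strain1 = ["a", "b", "P", "c", "d", "x", "P", "y", "z"]
strain2 = ["x", "P", "y", "z", "a", "b", "P", "c", "d"]

print(best_match(strain1, 2, strain2))  # copy of P flanked by a, b, c, d
print(best_match(strain1, 6, strain2))  # copy of P flanked by x, y, z
```

In this made-up example the two copies of "P" are indistinguishable by label alone, but each is paired with the copy that sits in the matching gene context in the other strain.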

  • ABSTRACT: Leptospirosis, caused by pathogenic spirochetes belonging to the genus Leptospira, is a zoonosis with important impacts on human and animal health worldwide. Research on the mechanisms of Leptospira pathogenesis has been hindered by slow growth of infectious strains, poor transformability, and a paucity of genetic tools. As a result of second-generation sequencing technologies, leptospiral genome sequencing efforts have accelerated in the past decade, enabling a concomitant increase in functional genomics analyses of Leptospira pathogenesis. A pathogenomics approach, coupling pan-genomic analysis of multiple isolates with sequencing of experimentally attenuated highly pathogenic Leptospira, has resulted in the functional inference of virulence factors. The global Leptospira Genome Project, supported by the U.S. National Institute of Allergy and Infectious Diseases and to which the international leptospirosis research community has made key scientific contributions, has provided a new roadmap for comprehensive studies of Leptospira and leptospirosis well into the future. This review describes functional genomics approaches that apply the data generated by the Leptospira Genome Project towards deepening our knowledge of virulence factors of Leptospira using the emerging discipline of pathogenomics.
    Pathogens (Basel, Switzerland). 01/2014; 3(2):280-308.
  • ABSTRACT: Typhoid fever poses a significant burden on healthcare systems in Southeast Asia and other endemic countries. Several epidemiological and genomic studies have identified pseudogenisation as the major driving force for the evolution of Salmonella Typhi, although its real potential remains elusive. In the present study, we analyzed genomes of S. Typhi from different parts of Southeast Asia and Oceania, comprising isolates from outbreak, sporadic and carrier cases. The genomes showed high genetic relatedness with limited opportunity for gene acquisition, as evident from the pan-genome structure. Given that pseudogenisation is an active process in S. Typhi, we further investigated core and pan-genome profiles of functional genes and pseudogenes separately. We observed a decline in core functional gene content and a significant increase in accessory pseudogene content. Upon functional classification, genes encoding metabolic functions formed a major constituent of pseudogenes as well as of core functional gene clusters with SNPs. Further, an in-depth analysis of accessory pseudogene content revealed the existence of heterogeneous complements of functional genes and pseudogenes among the strains. In addition, these polymorphic genes were also enriched in metabolism-related functions. Thus, the study highlights the existence of heterogeneous strains in a population with varying metabolic potential and suggests that S. Typhi possibly resorts to metabolic fine-tuning for its adaptation.
    Scientific reports. 01/2014; 4:7457.
  • ABSTRACT: Background: The genus Legionella comprises over 60 species. However, L. pneumophila and L. longbeachae alone cause over 95% of Legionnaires' disease. To identify the genetic bases underlying the different capacities to cause disease, we sequenced and compared the genomes of L. micdadei, L. hackeliae and L. fallonii (LLAP10), which are all rarely isolated from humans. Results: We show that these Legionella species possess different virulence capacities in amoeba and macrophages, correlating with their occurrence in humans. Our comparative analysis of 11 Legionella genomes belonging to five species reveals highly heterogeneous genome content with over 60% representing species-specific genes; these comprise a complete prophage in L. micdadei, the first ever identified in a Legionella genome. Mobile elements are abundant in Legionella genomes; many encode type IV secretion systems for conjugative transfer, pointing to their importance for adaptation of the genus. The Dot/Icm secretion system is conserved; however, the core set of substrates is small, as only 24 out of over 300 described Dot/Icm effector genes are present in all Legionella species. We also identified new eukaryotic motifs including thaumatin, synaptobrevin or clathrin/coatomer adaptine-like domains. Conclusions: Legionella genomes are highly dynamic due to a large mobilome mainly comprising T4SS, while a minority of core substrates is shared among the diverse species. Eukaryotic-like proteins and motifs remain a hallmark of the genus Legionella. Key factors such as proteins involved in oxygen binding, iron storage, host membrane transport and certain Dot/Icm substrates are specific features of disease-related strains.
    Genome Biology 11/2014; 15(11):505. · 10.47 Impact Factor
