A Comparison of Computational Methods for Identifying Virulence Factors

Hubei Bioinformatics and Molecular Imaging Key Laboratory, Huazhong University of Science and Technology, Wuhan, Hubei, China.
PLoS ONE (Impact Factor: 3.23). 08/2012; 7(8):e42517. DOI: 10.1371/journal.pone.0042517
Source: PubMed


Bacterial pathogens continue to threaten public health worldwide. Identifying bacterial virulence factors can help to find novel drug and vaccine targets against pathogenicity, and can also help to reveal the mechanisms of the related diseases at the molecular level. With the explosive growth in protein sequences generated in the postgenomic age, it is highly desirable to develop computational methods that rapidly and effectively identify virulence factors from sequence information alone. In this study, a novel network-based method built on the protein-protein interaction networks in the STRING database was proposed for identifying the virulence factors in the proteomes of UPEC 536, UPEC CFT073, P. aeruginosa PAO1, L. pneumophila Philadelphia 1, C. jejuni NCTC 11168 and M. tuberculosis H37Rv. Evaluated on the same benchmark datasets derived from these species, the network-based method achieved identification accuracies of around 0.9, significantly higher than those of sequence-based methods such as BLAST, feature selection and VirulentPred. Further analysis showed that functional associations such as gene neighborhood and co-occurrence were the primary associations between these virulence factors in the STRING database. The high success rates indicate that the network-based method is quite promising. The approach could readily be extended to identify virulence factors in many other bacterial species, as long as the relevant statistical data are available for them.
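
As a rough illustration of the network-based idea described above, the following is a minimal sketch, not the authors' published procedure. It assumes that a query protein is scored by the share of its total interaction confidence that connects it to already-known virulence factors. The protein identifiers, the edge scores, the `virulence_score` and `predict` helpers, and the 0.5 threshold are all hypothetical and chosen only for the example.

```python
from collections import defaultdict

# Hypothetical weighted interactions: (protein_a, protein_b, confidence score).
# The identifiers and scores are invented for illustration only.
EDGES = [
    ("p1", "p2", 900),
    ("p1", "p3", 400),
    ("p1", "p4", 700),
    ("p5", "p4", 800),
]

# Hypothetical set of proteins already annotated as virulence factors.
KNOWN_VF = {"p2", "p4"}


def build_network(edges):
    """Build an undirected, weighted adjacency map from (a, b, score) triples."""
    network = defaultdict(dict)
    for a, b, score in edges:
        network[a][b] = score
        network[b][a] = score
    return network


def virulence_score(network, protein, known_vf):
    """Fraction of a protein's total edge confidence that goes to known virulence factors."""
    neighbors = network.get(protein, {})
    total = sum(neighbors.values())
    if total == 0:
        return 0.0  # no interaction partners, so no network evidence
    vf_weight = sum(score for name, score in neighbors.items() if name in known_vf)
    return vf_weight / total


def predict(network, protein, known_vf, threshold=0.5):
    """Label a protein as a putative virulence factor if its score reaches the (assumed) threshold."""
    return virulence_score(network, protein, known_vf) >= threshold


network = build_network(EDGES)
for name in ("p1", "p3", "p5"):
    print(name, virulence_score(network, name, KNOWN_VF), predict(network, name, KNOWN_VF))
```

In this toy network, `p1` is predicted to be a virulence factor because most of its interaction weight points to `p2` and `p4`, while `p3` is not because its only partner is unlabeled. A real analysis would need actual STRING interaction data and a scoring rule validated on benchmark datasets.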



Available from: Kuo-Chen Chou, Oct 01, 2015
  • Source
    • "Using the interaction network, they identified virulence factors based on number of neighbors and strength of interactions and compared this to a feature selection method and BLAST approaches. Their results were benchmarked against a database of validated virulence factors and the network-based method was found to out-perform the other two methods (Zheng et al., 2012). "
    ABSTRACT: Latent tuberculosis is a clinical syndrome that occurs after an individual has been exposed to the Mycobacterium tuberculosis (Mtb) bacillus, the infection has been established, and an immune response has been generated to control the pathogen and force it into a quiescent state. Mtb can exit this quiescent state, in which it is unresponsive to treatment and elusive to the immune response, and enter a rapidly replicating state, causing reactivation of the infection. How the pathogen causes a persistent infection remains a gray area, and it is unclear whether the organism is in a slow-replicating state or a dormant non-replicating state. The ability of the pathogen to adapt to changing host immune response mechanisms, in which it is exposed to hypoxia, low pH, nitric oxide (NO), nutrient starvation, and several other anti-microbial effectors, is associated with a high metabolic plasticity that enables it to metabolize under these different conditions. Adaptive gene regulatory mechanisms are thought to coordinate how the pathogen changes its metabolic pathways by sensing changes in oxygen tension and other stress factors, stimulating the pathogen to make the adjustments necessary for survival. Here, we review studies that give insights into latency/dormancy regulatory mechanisms that enable infection persistence and pathogen adaptation to different stress conditions. We highlight what mathematical and computational models can do and what they should do to enhance our current understanding of TB latency.
    Frontiers in Bioengineering and Biotechnology 08/2013; 1. DOI:10.3389/fbioe.2013.00004
  • Source
    ABSTRACT: Despite the tremendous progress in the field of drug design, discovering a new drug molecule is still a challenging task. Drug discovery and development is a costly, time-consuming and complex process that requires millions of dollars and 10-15 years to bring new drug molecules to market. This huge investment and long-term process are attributed to the high failure rate, the complexity of the problem and strict regulatory rules, in addition to other factors. Given the availability of 'big' data and ever-improving computing power, it is now possible to model systems, which is expected to make the drug discovery process more time- and cost-effective. Computer Aided Drug Designing (CADD) has emerged as a fast alternative method to bring down the cost involved in discovering a new drug. In the past, numerous computer programs have been developed across the globe to assist researchers working in the field of drug discovery. Broadly, these programs can be classified into three categories: freeware, shareware and commercial software. In this review, we describe freeware or open-source software that is commonly used for designing therapeutic molecules. The major emphasis is on software and web services in the field of chemo- or pharmaco-informatics, including in silico tools for computing molecular descriptors, designing inhibitors against drug targets, building QSAR models, and predicting ADMET properties.
    Current Topics in Medicinal Chemistry 05/2013; In Press(10). DOI:10.2174/1568026611313100005 · 3.40 Impact Factor
  • Source
    ABSTRACT: Protein-protein interaction networks are useful for studying human diseases and for seeking possible health care through a holistic approach. Networks are playing an increasingly important role in the understanding of physiological processes such as homeostasis, signaling, and spatial and temporal organization, as well as pathological conditions. In this article we show the complex system of interactions determined by human Sirtuins (Sirt), which are largely involved in many metabolic processes as well as in different diseases. The Sirtuin family consists of seven homologous Sirt-s that have structurally similar cores but different terminal segments, which are rather variable in length and/or intrinsically disordered. Many studies have determined their cellular location as well as their biological functions, although the molecular mechanisms through which they act are little known. Therefore, the aim of this work was to define, explore and understand the Sirtuin-related human interactome. As a first step, we integrated the experimentally determined protein-protein interactions of the Sirtuin family, as well as their first and second neighbors, into a Sirtuin-related sub-interactome. Our data showed that the second-neighbor network of Sirtuins encompasses 25% of the entire human interactome and exhibits a scale-free degree distribution and interconnectedness among top degree nodes. Moreover, the Sirtuin sub-interactome showed a modular structure around the core comprising mixed functions. Finally, we extracted from the Sirtuin sub-interactome subnets related to cancer, aging and post-translational modifications for information on key nodes and topological space of the subnets in the Sirt family network.
    Biochimica et Biophysica Acta 06/2013; 1834(10). DOI:10.1016/j.bbapap.2013.06.012 · 4.66 Impact Factor