A survey of across-target bioactivity results of small molecules in PubChem

National Center for Biotechnology Information, National Institutes of Health, Bethesda, MD 20894, USA.
Bioinformatics (Impact Factor: 4.62). 07/2009; 25(17):2251-5. DOI: 10.1093/bioinformatics/btp380
Source: PubMed

ABSTRACT This work provides an analysis of across-target bioactivity results in the screening data deposited in PubChem. Two alternative approaches for grouping related targets are used to examine a compound's across-target bioactivity. This analysis identifies compounds that are selectively active against groups of protein targets that are identical or similar in sequence. It also identifies compounds that are bioactive across unrelated targets. Statistical distributions of compounds' across-target selectivity provide a survey for evaluating the target specificity of compounds, by deriving and analyzing bioactivity profiles across a wide range of biological targets for the small molecules tested in PubChem. This work enables one to select target-specific inhibitors, identify promiscuous compounds and better understand the biological mechanisms of target-small molecule interactions.
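
As a rough illustration of the kind of across-target profile the abstract describes, the sketch below counts, for each compound, how many target groups it was tested in and how many it was active in. Everything here is an assumption made for illustration: the input layout (compound, target, active flag), the `target_group` mapping (which stands in for either of the paper's two grouping approaches), the function names, the label strings and the `min_groups_tested` threshold are all hypothetical and are not taken from the paper or from any PubChem API.

```python
from collections import defaultdict


def across_target_profile(results, target_group):
    """Summarize each compound's activity across target groups.

    results: iterable of (compound_id, target_id, is_active) tuples
             (hypothetical layout, not a PubChem format).
    target_group: dict mapping target_id -> group label, e.g. a cluster
                  of identical or sequence-similar proteins (assumed).
    Returns a dict: compound_id -> (groups_tested, groups_active).
    """
    tested = defaultdict(set)
    active = defaultdict(set)
    for compound, target, is_active in results:
        group = target_group[target]
        tested[compound].add(group)
        if is_active:
            active[compound].add(group)
    return {c: (len(tested[c]), len(active[c])) for c in tested}


def label_compounds(profile, min_groups_tested=2):
    """Attach a coarse label to each compound (illustrative thresholds)."""
    labels = {}
    for compound, (n_tested, n_active) in profile.items():
        if n_tested < min_groups_tested:
            labels[compound] = "insufficient data"
        elif n_active == 0:
            labels[compound] = "inactive"
        elif n_active == 1:
            labels[compound] = "selective"
        else:
            labels[compound] = "multi-group active"
    return labels


if __name__ == "__main__":
    # Toy data: tA1 and tA2 belong to the same (made-up) target group.
    groups = {"tA1": "A", "tA2": "A", "tB": "B"}
    toy_results = [
        ("c1", "tA1", True), ("c1", "tA2", True), ("c1", "tB", False),
        ("c2", "tA1", True), ("c2", "tB", True),
        ("c3", "tA1", False),
    ]
    print(label_compounds(across_target_profile(toy_results, groups)))
```

Counting at the level of target groups rather than individual targets means that activity against several closely related proteins is treated as one hit, so only activity spread over unrelated groups is flagged as potentially promiscuous.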

  • ABSTRACT: Imbalanced datasets are commonly generated from high-throughput screening (HTS). When the imbalanced nature of a dataset is not taken into account, most classification methods tend to produce high predictive accuracy for the majority class but significantly poorer performance for the minority class. In this work, an efficient algorithm, GLMBoost, coupled with the Synthetic Minority Over-sampling TEchnique (SMOTE), is developed and used to overcome this problem for several imbalanced datasets from PubChem BioAssay. With the proposed combined method, the rare samples (active compounds), for which poor results are usually generated, can be detected with high balanced accuracy (Gmean). For comparison with GLMBoost, Random Forest (RF) combined with SMOTE is also used to classify the same datasets. Our results show that the former (GLMBoost+SMOTE) not only exhibits higher performance, as measured by the percentage of correct classification for the rare samples (Sensitivity) and by Gmean, but also demonstrates greater computational efficiency than the latter (RF+SMOTE). Therefore, we hope that the proposed combined algorithm based on GLMBoost and SMOTE can be used extensively to tackle the imbalanced classification problem.
    Analytica chimica acta 01/2014; 806C:117-127. DOI:10.1016/j.aca.2013.10.050 · 4.31 Impact Factor
  • ABSTRACT: Background: Near universal administration of vaccines mandates intense pharmacovigilance for vaccine safety and a stringently low tolerance for adverse events. Reports of autoimmune diseases (AID) following vaccination have been challenging to evaluate given the high rates of vaccination, background incidence of autoimmunity, and low incidence and variable times for onset of AID after vaccinations. In order to identify biologically plausible pathways to adverse autoimmune events of vaccine-related AID, we used a systems biology approach to create a matrix of innate and adaptive immune mechanisms active in specific diseases, responses to vaccine antigens, adjuvants, preservatives and stabilizers, for the most common vaccine-associated AID found in the Vaccine Adverse Event Reporting System. Results: This report focuses on Guillain-Barre Syndrome (GBS), Rheumatoid Arthritis (RA), Systemic Lupus Erythematosus (SLE), and Idiopathic (or immune) Thrombocytopenic Purpura (ITP). Multiple curated databases and automated text mining of PubMed literature identified 667 genes associated with RA, 448 with SLE, 49 with ITP and 73 with GBS. While all data sources provided valuable and unique gene associations, text mining using natural language processing (NLP) algorithms provided the most information but required curation to remove incorrect associations. Six genes were associated with all four AIDs. Thirty-three pathways were shared by the four AIDs. Classification of genes into twelve immune system related categories identified more "Th17 T-cell subtype" genes in RA than the other AIDs, and more "Chemokine plus Receptors" genes associated with RA than SLE. Gene networks were visualized and clustered into interconnected modules with specific gene clusters for each AID, including one in RA with ten C-X-C motif chemokines. The intersection of genes associated with GBS, GBS peptide auto-antigens, influenza A infection, and influenza vaccination created a subnetwork of genes that inferred a possible role for the MAPK signaling pathway in influenza vaccine related GBS. Conclusions: Results showing unique and common gene sets, pathways, immune system categories and functional clusters of genes in four autoimmune diseases suggest it is possible to develop molecular classifications of autoimmune and inflammatory events. Combining this information with cellular and other disease responses should greatly aid in the assessment of potential immune-mediated adverse events following vaccination.
    BMC Immunology 12/2014; 15(1):61. DOI:10.1186/s12865-014-0061-0 · 2.25 Impact Factor
  • ABSTRACT: Significant resources in early drug discovery are spent unknowingly pursuing artifacts and promiscuous bioactive compounds, while understanding the chemical basis for these adverse behaviors often goes unexplored in pursuit of lead compounds. Nearly all the hits from our recent sulfhydryl-scavenging high-throughput screen (HTS) targeting the histone acetyltransferase Rtt109 were such compounds. Herein, we characterize the chemical basis for assay interference and promiscuous enzymatic inhibition for several prominent chemotypes identified by this HTS, including some pan-assay interference compounds (PAINS). Protein mass spectrometry and ALARM NMR confirmed these compounds react covalently with cysteines on multiple proteins. Unfortunately, compounds containing these chemotypes have been published as screening actives in reputable journals, and even touted as chemical probes or pre-clinical candidates. Our detailed characterization and identification of such thiol-reactive chemotypes should accelerate triage of nuisance compounds, guide screening library design, and prevent follow-up on undesirable chemical matter.
    Journal of Medicinal Chemistry 01/2015; DOI:10.1021/jm5019093 · 5.48 Impact Factor
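
The first citing abstract above reports performance on imbalanced screening data as Sensitivity and Gmean. As a minimal sketch of why those metrics are preferred over plain accuracy in that setting, the code below computes sensitivity, specificity and their geometric mean from binary labels. The function name and the toy labels are illustrative assumptions, and this is the standard definition of the G-mean, not code from the cited paper.

```python
import math


def sensitivity_specificity_gmean(y_true, y_pred):
    """Return (sensitivity, specificity, gmean) for binary labels.

    Labels are assumed to be 1 for the rare class (active compounds)
    and 0 for the majority class (inactive compounds).
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)

    # Guard against empty classes instead of dividing by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    gmean = math.sqrt(sensitivity * specificity)
    return sensitivity, specificity, gmean


if __name__ == "__main__":
    # Toy example: 2 actives among 6 compounds.
    y_true = [1, 1, 0, 0, 0, 0]
    always_inactive = [0, 0, 0, 0, 0, 0]
    print(sensitivity_specificity_gmean(y_true, always_inactive))
```

A classifier that labels every compound inactive is correct on four of the six toy compounds, yet its G-mean is zero because it finds none of the actives. That is the failure mode the abstract attributes to methods that ignore class imbalance.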
