
PubChem3D: Conformer generation

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, 8600 Rockville Pike, Bethesda, MD 20894, USA.
Journal of Cheminformatics (Impact Factor: 4.54). 01/2011; 3(1):4. DOI: 10.1186/1758-2946-3-4
Source: PubMed

ABSTRACT PubChem, an open archive for the biological activities of small molecules, provides search and analysis tools to assist users in locating desired information. Many of these tools focus on the notion of chemical structure similarity at some level. PubChem3D enables similarity of chemical structure 3-D conformers to augment the existing similarity of 2-D chemical structure graphs. It is also desirable to relate theoretical 3-D descriptions of chemical structures to experimental biological activity. As such, it is important to be assured that the theoretical conformer models can reproduce experimentally determined bioactive conformations. In the present study, we investigated the effects of three primary conformer generation parameters (the fragment sampling rate, the energy window size, and the force field variant) upon the accuracy of theoretical conformer models, and determined optimal settings for PubChem3D conformer model generation and conformer sampling.
Using the software package OMEGA from OpenEye Scientific Software, Inc., theoretical 3-D conformer models were generated for 25,972 small-molecule ligands whose 3-D structures were experimentally determined. Different values for primary conformer generation parameters were systematically tested to find optimal settings. Employing a greater fragment sampling rate than the default did not improve the accuracy of the theoretical conformer model ensembles. An ever-increasing energy window did increase the overall average accuracy, with rapid convergence observed at 10 kcal/mol and 15 kcal/mol for model building and torsion search, respectively; however, subsequent study showed that an energy threshold of 25 kcal/mol for torsion search resulted in slightly improved results for larger and more flexible structures. Exclusion of Coulomb terms from the 94s variant of the Merck molecular force field (MMFF94s) in the torsion search stage gave more accurate conformer models at lower energy windows. Overall average accuracy of reproduction of bioactive conformations was remarkably linear with respect to both non-hydrogen atom count ("size") and effective rotor count ("flexibility"). Using these as independent variables, a regression equation was developed to predict the RMSD accuracy of a theoretical ensemble in reproducing bioactive conformations. The equation was modified to give a minimum RMSD conformer sampling value to help ensure that 90% of the sampled theoretical models should contain at least one conformer within the RMSD sampling value to a "bioactive" conformation.
Optimal parameters for conformer generation using OMEGA were explored and determined. An equation was developed that provides an RMSD sampling value based on the relative accuracy with which bioactive conformations are reproduced. The optimal conformer generation parameters and RMSD sampling values determined are used by the PubChem3D project to generate theoretical conformer models.
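
The abstract describes a regression of RMSD accuracy on two independent variables, non-hydrogen atom count and effective rotor count, but does not give the fitted coefficients. The following is a minimal sketch of how such a two-variable linear fit could be set up with NumPy. The function names, the inclusion of an intercept, the coefficient values, and the data are all hypothetical placeholders chosen for illustration; none of them come from the study.

```python
import numpy as np


def fit_rmsd_model(heavy_atoms, rotors, rmsd):
    """Least-squares fit of rmsd ~ a*heavy_atoms + b*rotors + c.

    Returns the coefficients (a, b, c). The linear form with an
    intercept is an assumption made for this sketch.
    """
    heavy_atoms = np.asarray(heavy_atoms, dtype=float)
    rotors = np.asarray(rotors, dtype=float)
    rmsd = np.asarray(rmsd, dtype=float)
    # Design matrix: one column per independent variable plus a constant.
    design = np.column_stack([heavy_atoms, rotors, np.ones_like(rmsd)])
    coeffs, *_ = np.linalg.lstsq(design, rmsd, rcond=None)
    return coeffs


def predict_rmsd(coeffs, heavy_atoms, rotors):
    """Evaluate the fitted linear model for given size and flexibility."""
    a, b, c = coeffs
    return a * np.asarray(heavy_atoms, dtype=float) + b * np.asarray(rotors, dtype=float) + c


# Synthetic, noise-free placeholder data (NOT values from the paper).
HYPOTHETICAL_COEFFS = (0.02, 0.05, 0.10)
heavy_atoms = np.array([10, 15, 20, 25, 30, 35], dtype=float)
rotors = np.array([1, 3, 2, 6, 5, 9], dtype=float)
rmsd = (HYPOTHETICAL_COEFFS[0] * heavy_atoms
        + HYPOTHETICAL_COEFFS[1] * rotors
        + HYPOTHETICAL_COEFFS[2])

coeffs = fit_rmsd_model(heavy_atoms, rotors, rmsd)
predicted = predict_rmsd(coeffs, 22, 4)
```

Because the placeholder data are generated from a known linear relationship without noise, the fit recovers those coefficients, which makes the sketch easy to check. How the published equation was adjusted to meet the 90% coverage target is not stated in the abstract, so that step is not modeled here.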

Full-text

Available from: Evan Bolton, Jun 02, 2015
  • Source
    ABSTRACT: Plant-derived non-essential fatty acids are important dietary nutrients, and some are purported to have chemopreventive properties against various cancers, including that of the prostate. In this study, we determined the ability of seven dietary C-18 fatty acids to cause cytotoxicity and induce apoptosis in various types of human prostate cancer cells. These fatty acids included jacaric and punicic acid found in jacaranda and pomegranate seed oil, respectively, three octadecatrienoic geometric isomers (alpha- and beta-calendic and catalpic acid) and two mono-unsaturated C-18 fatty acids (trans- and cis-vaccenic acid). Jacaric acid and four of its octadecatrienoic geoisomers selectively induced apoptosis in hormone-dependent (LNCaP) and -independent (PC-3) human prostate cancer cells, whilst not affecting the viability of normal human prostate epithelial cells (RWPE-1). Jacaric acid induced concentration- and time-dependent LNCaP cell death through activation of intrinsic and extrinsic apoptotic pathways resulting in cleavage of PARP-1, modulation of pro- and antiapoptotic Bcl-2 family of proteins and increased cleavage of caspase-3, -8 and -9. Moreover, activation of a cell death-inducing signalling cascade involving death receptor 5 was observed. Jacaric acid induced apoptosis in PC-3 cells by activation of the intrinsic pathway only. The spatial conformation cis, trans, cis of jacaric and punicic acid was shown to play a key role in the increased potency and efficacy of these two fatty acids in comparison to the five other C-18 fatty acids tested. Three-dimensional conformational analysis using the PubChem Database (http://pubchem.ncbi.nlm.nih.gov) showed that the cytotoxic potency of the C-18 fatty acids was related to their degree of conformational similarity to our cytotoxic reference compound, punicic acid, based on optimized shape (ST) and feature (CT) similarity scores, with jacaric acid being most 'biosimilar' (ST^{ST-opt} = 0.81; CT^{CT-opt} = 0.45). This 3-D analysis of structural similarity enabled us to rank geoisomeric fatty acids according to cytotoxic potency, whereas a 2-D positional assessment of cis/trans structure did not. Our findings provide mechanistic evidence that nutrition-derived non-essential fatty acids have chemopreventive biological activities and exhibit 3-D structure-activity relationships that could be exploited to develop new strategies for the prevention or treatment of prostate cancer regardless of hormone dependency.
    Phytomedicine: international journal of phytotherapy and phytopharmacology 02/2013; DOI:10.1016/j.phymed.2013.01.012 · 2.88 Impact Factor
  • Source
    ABSTRACT: Background: PubChem is a free and publicly available resource containing substance descriptions and their associated biological activity information. PubChem3D is an extension to PubChem containing computationally-derived three-dimensional (3-D) structures of small molecules. All the tools and services that are a part of PubChem3D rely upon the quality of the 3-D conformer models. Construction of the conformer models currently available in PubChem3D involves a clustering stage to sample the conformational space spanned by the molecule. While this stage allows one to downsize the conformer models to a more manageable size, it may result in a loss of the ability to reproduce experimentally determined “bioactive” conformations, for example, those found for PDB ligands. This study examines the extent of this accuracy loss and considers its effect on the 3-D similarity analysis of molecules. Results: Conformer models consisting of up to 100,000 conformers per compound were generated for 47,123 small molecules whose structures were experimentally determined, and the conformers in each conformer model were clustered to reduce the size of the conformer model to a maximum of 500 conformers per molecule. The accuracy of the conformer models before and after clustering was evaluated using five different measures: root-mean-square distance (RMSD), shape-optimized shape-Tanimoto (ST^{ST-opt}) and combo-Tanimoto (ComboT^{ST-opt}), and color-optimized color-Tanimoto (CT^{CT-opt}) and combo-Tanimoto (ComboT^{CT-opt}). On average, clustering decreased the conformer model accuracy, increasing the conformer ensemble’s RMSD to the bioactive conformer (by 0.18 ± 0.12 Å), and decreasing the ST^{ST-opt}, ComboT^{ST-opt}, CT^{CT-opt}, and ComboT^{CT-opt} scores (by 0.04 ± 0.03, 0.16 ± 0.09, 0.09 ± 0.05, and 0.15 ± 0.09, respectively). Conclusion: This study shows the RMSD accuracy performance of the PubChem3D conformer models is operating as designed. In addition, the effect of PubChem3D sampling on 3-D similarity measures shows that there is a linear degradation of average accuracy with respect to molecular size and flexibility. Generally speaking, one can likely expect the worst-case minimum accuracy of 90% or more of the PubChem3D ensembles to be 0.75, 1.09, 0.43, and 1.13, in terms of ST^{ST-opt}, ComboT^{ST-opt}, CT^{CT-opt}, and ComboT^{CT-opt}, respectively. This expected accuracy improves linearly as the molecule becomes smaller or less flexible.
    Journal of Cheminformatics 01/2013; 5(1):1. DOI:10.1186/1758-2946-5-1 · 4.54 Impact Factor
  • Source
    ABSTRACT: The description of molecular systems using multipolar electrostatics calls for automated methods to fit the necessary parameters. In this paper, we describe an open-source software package that allows fitting atomic multipoles (MTPs) from the ab initio electrostatic potential by adequate atom typing and judicious assignment of the local axis system. By enabling the simultaneous fit of several molecules and/or conformations, the package addresses issues of parameter transferability and lack of sampling for buried atoms. We illustrate the method by studying a series of small alcohol molecules, as well as various conformations of protonated butylamine.
    Journal of Chemical Information and Modeling 12/2013; 53(12). DOI:10.1021/ci400548w · 4.07 Impact Factor